Research
Academic research, publications, and experimental projects
PolySpeech-HS: Multilingual Non-Autoregressive Text-to-Speec...
Speech Synthesis & Multilingual AI
Abstract
PolySpeech-HS is a multilingual, non-autoregressive text-to-speech (TTS) framework designed to address the linguistic diversity and real-time deployment challenges of Indian languages. It pairs a unified encoder-decoder architecture with lightweight hidden-state adapters, enabling efficient cross-lingual generalization while preserving language-specific prosodic nuances. It achieves state-of-the-art results across six Indian languages: a mean opinion score (MOS) of 4.30, a mel-cepstral distortion (MCD) of 4.7 dB, and a real-time factor (RTF) of 0.13.
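As a rough illustration of how the two objective metrics quoted in this abstract are commonly computed, the Python sketch below shows a real-time factor and a frame-averaged mel-cepstral distortion. The function names, the assumption that frames are already time-aligned, and the choice of which cepstral coefficients are passed in are illustrative assumptions, not details taken from PolySpeech-HS.

```python
import math


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def mel_cepstral_distortion(ref_frames, syn_frames) -> float:
    """Mean MCD in dB over frames that are assumed to be time-aligned.

    Each frame is a sequence of mel-cepstral coefficients. Which coefficients
    to include (for example, whether to drop the 0th energy term) is left to
    the caller; that choice is an assumption, not the paper's protocol.
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("need two non-empty, equally long frame sequences")
    scale = (10.0 / math.log(10.0)) * math.sqrt(2.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        squared_error = sum((r - s) ** 2 for r, s in zip(ref, syn))
        total += scale * math.sqrt(squared_error)
    return total / len(ref_frames)
```

Under this definition, an RTF of 0.13 means that 10 seconds of audio takes about 1.3 seconds to synthesize, and lower MCD values indicate synthesized spectra closer to the reference.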
A Novel Data-Centric Transformer Fine-Tuning: A Modular Fram...
Large Language Models & Domain Adaptation
Abstract
A data-centric, hardware-light workflow for fine-tuning transformers that sidesteps costly LLM APIs. The pipeline automatically scrapes high-signal web content and converts it into Q&A pairs, then fine-tunes a GPT-2-Medium model (355M parameters) in about 7 minutes on a single RTX 3060. The fine-tuned model achieves 67.3% accuracy (+34% over the base model) with a 1.4 s median latency and zero inference cost.
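The step that turns Q&A pairs into training text for a causal language model can be sketched as follows. This is a minimal illustration only: the "Question:/Answer:" prompt template, the deduplication rule, and the `min_answer_chars` threshold are hypothetical choices, not the format used in the work above. The one fixed detail is `<|endoftext|>`, which is GPT-2's end-of-text token.

```python
EOS = "<|endoftext|>"  # GPT-2's end-of-text token


def format_example(question: str, answer: str) -> str:
    """Render one Q&A pair as a single training string (hypothetical template)."""
    return f"Question: {question.strip()}\nAnswer: {answer.strip()}{EOS}"


def build_corpus(pairs, min_answer_chars: int = 20):
    """Drop duplicate questions and very short answers, then format the rest.

    `pairs` is an iterable of (question, answer) tuples. The length threshold
    is an illustrative quality filter, not a value from the original pipeline.
    """
    seen = set()
    examples = []
    for question, answer in pairs:
        key = question.strip().lower()
        if key in seen or len(answer.strip()) < min_answer_chars:
            continue
        seen.add(key)
        examples.append(format_example(question, answer))
    return examples
```

Each resulting string could then be tokenized and fed to a standard causal language modeling objective; the end-of-text token marks where one example stops so the model learns to end its answer.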
Fine-Tuning Mistral 22B: The First Large Language Model for ...
Low-Resource Language Processing
Abstract
The first fine-tuned large language model engineered specifically for Assamese, a low-resource Indo-Aryan language spoken by approximately 15 million people. The work introduces the AssamText-750K dataset and a custom Unicode mapping system built exclusively for Assamese, making this the first Assamese LLM backed by language-specific Unicode infrastructure. It achieves a 20% average improvement across text generation fluency, sentiment analysis accuracy, and Assamese-to-English translation quality.
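For context on why Assamese may call for its own Unicode handling: Assamese is written with code points from the Unicode Bengali block (U+0980 to U+09FF), and two letters in that block, U+09F0 and U+09F1, are used in Assamese but not in standard Bengali. The sketch below is not the custom mapping system described above, whose design is not given here; it is only a hypothetical helper showing how text could be normalized and profiled by script using Python's standard library.

```python
import unicodedata

# Unicode Bengali block, shared by Bengali and Assamese.
BENGALI_BLOCK = range(0x0980, 0x0A00)

# Letters used in Assamese but not in standard Bengali:
# U+09F0 (RA WITH MIDDLE DIAGONAL) and U+09F1 (RA WITH LOWER DIAGONAL).
ASSAMESE_SPECIFIC = {"\u09f0", "\u09f1"}


def script_profile(text: str) -> dict:
    """Count Bengali-block and Assamese-specific characters after NFC normalization."""
    normalized = unicodedata.normalize("NFC", text)
    in_block = sum(1 for ch in normalized if ord(ch) in BENGALI_BLOCK)
    specific = sum(1 for ch in normalized if ch in ASSAMESE_SPECIFIC)
    return {"bengali_block": in_block, "assamese_specific": specific}


def looks_assamese(text: str) -> bool:
    """Heuristic (an assumption): the text contains an Assamese-specific letter."""
    return script_profile(text)["assamese_specific"] > 0
```

A filter like this could help separate Assamese from Bengali text when assembling a corpus, although short Assamese passages that happen to contain neither letter would be missed by so simple a heuristic.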