Sinhala Female TTS V2 (VITS, fine-tuned for 44000 steps)
This is a fine-tuned Coqui TTS model for Sinhala.
The base checkpoint was a Sinhala male VITS model released by Pathnirvana.
I fine-tuned it on the same dataset (speaker: oshadi), adapting the model for a female voice.
✨ Features
- Model architecture: VITS
- Language: Sinhala (සිංහල)
- Voice: Female
- Sampling rate: 22050 Hz
- Dataset size used: 3300 clips
- Training hardware: A4500 GPU (24GB VRAM)
- Training duration: ~16 hours
- Framework: Coqui TTS
- Optimizer: AdamW
- Learning rate: 1e-4 for the first 20000 steps, then reduced to 1e-5 for the remaining 24000 steps
- Epochs: ~300
- Losses: mel loss ~17 at convergence, with stable duration + KL losses
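The learning-rate entry above can be read as a two-stage schedule: 1e-4 for the first 20000 fine-tuning steps, then 1e-5 for the remaining 24000. The sketch below only illustrates that reading. The function name `learning_rate` and the exact switch point are assumptions, and it is not taken from the actual training configuration, which may have used a scheduler or a manual restart instead.

```python
def learning_rate(step: int,
                  switch_step: int = 20_000,
                  initial_lr: float = 1e-4,
                  reduced_lr: float = 1e-5) -> float:
    """Return the learning rate for a 0-based fine-tuning step.

    Hypothetical helper: assumes a single hard switch at `switch_step`.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    return initial_lr if step < switch_step else reduced_lr


if __name__ == "__main__":
    # Print the rate at the start, around the assumed switch, and at the end.
    for step in (0, 19_999, 20_000, 43_999):
        print(step, learning_rate(step))
```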
🔧 Installation
To run synthesis only, first install Coqui TTS:
pip install TTS
For GPU inference, also install a CUDA-enabled build of PyTorch (the command below targets CUDA 12.1):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
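Before requesting GPU inference, it can help to confirm that PyTorch actually sees a CUDA device. The following is a minimal sketch; `cuda_available` is a hypothetical helper name, and it simply reports `False` when PyTorch is not installed.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a CUDA device."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```

If this prints `False`, running on the CPU (for example `gpu=False` in the Python API) is the safer choice.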
🚀 Usage
Python API
from TTS.api import TTS
tts = TTS(
    model_path="tts-si-female-vits-v2_124000.pth",
    config_path="config.json",
    gpu=True
)
tts.tts_to_file(
    text="ඔබ වැස්සට පෙම් බඳින බව කියයි. එහෙත් වැස්ස වහින විට ඔබ කුඩයක් සොයයි",
    file_path="output.wav"
)
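To sanity-check the synthesized file, the WAV header can be read with Python's standard `wave` module and compared with the 22050 Hz sampling rate listed under Features. This is a sketch that assumes the output is an uncompressed PCM WAV file; `wav_info` is a hypothetical helper name.

```python
import os
import wave


def wav_info(path: str) -> dict:
    """Read the sample rate, channel count, and duration from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


if __name__ == "__main__":
    # "output.wav" is the file name used in the usage example above.
    if os.path.exists("output.wav"):
        print(wav_info("output.wav"))
```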
Command line
tts --model_path tts-si-female-vits-v2_124000.pth \
    --config_path config.json \
    --out_path output.wav \
    --text "ඔබ වැස්සට පෙම් බඳින බව කියයි. එහෙත් වැස්ස වහින විට ඔබ කුඩයක් සොයයි" \
    --use_cuda true
GUI
git clone https://github.com/coqui-ai/TTS
cd TTS
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Start the server as shown below, then open http://localhost:5002/ in a browser:
python TTS/server/server.py --config_path models/config.json --model_path models/tts-si-female-vits-v2_124000.pth
Docker Image
You can also try TTS without installing it by running the Docker image:
docker run --rm -it -p 5002:5002 --entrypoint /bin/bash ghcr.io/coqui-ai/tts-cpu
Enter the input text in Sinhala Unicode. To convert text typed in English letters to Sinhala Unicode, you can use the "sinhala unicode text" tool.
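Because the model expects Sinhala Unicode input, a quick check can catch romanized text before synthesis. The sketch below counts characters in the Unicode Sinhala block (U+0D80 to U+0DFF). The helper names and the 0.5 threshold are assumptions chosen for illustration; punctuation and digits count as non-Sinhala.

```python
# Unicode Sinhala block boundaries.
SINHALA_START, SINHALA_END = 0x0D80, 0x0DFF


def sinhala_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters in the Sinhala block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    sinhala = sum(1 for ch in chars if SINHALA_START <= ord(ch) <= SINHALA_END)
    return sinhala / len(chars)


def looks_like_sinhala(text: str, threshold: float = 0.5) -> bool:
    """Heuristic: treat text as Sinhala Unicode if enough characters match."""
    return sinhala_ratio(text) >= threshold


if __name__ == "__main__":
    print(looks_like_sinhala("ඔබ වැස්සට පෙම් බඳින බව කියයි."))
    print(looks_like_sinhala("oba wessata pem bandina bawa kiyayi."))
```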
🙏 Acknowledgements
- Coqui TTS training and inference framework
- Pathnirvana Sinhala male VITS checkpoint, dataset, and guidance (this work directly builds on their contribution)
- Sinhala dataset contributors
License
This model is released under the MPL-2.0 license, the same as the original Sinhala TTS checkpoint by Pathnirvana.