Sinhala Female TTS V2 (VITS, fine-tuned for 44,000 steps)

This is a fine-tuned Coqui TTS model for Sinhala.
The base checkpoint was a Sinhala male VITS model released by Pathnirvana.
I fine-tuned it on the same dataset (speaker: oshadi), adapting the model for a female voice.

✨ Features

Model architecture: VITS
Language: Sinhala (සිංහල)
Voice: Female
Sampling rate: 22050 Hz
Dataset size used: 3300 clips
Training hardware: A4500 GPU (24GB VRAM)
Training duration: ~16 hours
Framework: Coqui TTS
Optimizer: AdamW
Learning rate: 1e-4 for the first 20,000 steps, then reduced to 1e-5 for the remaining 24,000 steps
Epochs: ~300
Losses: mel loss ~17 at convergence, with stable duration + KL losses
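
The two-stage learning rate listed above can be written as a small step function. This is only an illustrative sketch: the function name `lr_at_step` is hypothetical, steps are assumed to be counted from the start of fine-tuning, and the card does not say how the switch was actually made (for example, by restarting training with an edited config).

```python
# Illustrative sketch of the two-stage learning rate described in this card.
# Assumption: `step` counts fine-tuning steps only, starting at 0.
STAGE_ONE_STEPS = 20_000   # steps trained at the higher rate
STAGE_ONE_LR = 1e-4
STAGE_TWO_LR = 1e-5        # used for the remaining 24,000 steps


def lr_at_step(step: int) -> float:
    """Return the learning rate assumed to be in effect at a fine-tuning step."""
    if step < 0:
        raise ValueError("step must be non-negative")
    return STAGE_ONE_LR if step < STAGE_ONE_STEPS else STAGE_TWO_LR


if __name__ == "__main__":
    for s in (0, 19_999, 20_000, 43_999):
        print(s, lr_at_step(s))
```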

🔧 Installation

If you only want to synthesize speech, first install Coqui TTS:

pip install TTS

For GPU inference, also install a CUDA-enabled PyTorch build:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

🚀 Usage

Python API

from TTS.api import TTS

tts = TTS(
    model_path="tts-si-female-vits-v2_124000.pth", 
    config_path="config.json",
    gpu=True
)

tts.tts_to_file(
    text="ඔබ වැස්සට පෙම් බඳින බව කියයි. එහෙත් වැස්ස වහින විට ඔබ කුඩයක් සොයයි",
    file_path="output.wav"
)
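
If the same script has to run on machines with and without a GPU, one option is to decide the `gpu` argument at runtime. The sketch below is a suggestion rather than part of the original card: `cuda_available` and `load_tts` are hypothetical helper names, and the file names are the ones used in the example above.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def load_tts(model_path="tts-si-female-vits-v2_124000.pth",
             config_path="config.json"):
    """Load the checkpoint, using the GPU only when one is available."""
    from TTS.api import TTS  # imported lazily so the helper above works without TTS
    return TTS(model_path=model_path, config_path=config_path,
               gpu=cuda_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```

Calling `load_tts().tts_to_file(text=..., file_path="output.wav")` then works the same way as the example above on either kind of machine.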

Command line

tts --model_path tts-si-female-vits-v2_124000.pth \
    --config_path config.json \
    --out_path output.wav \
    --text "ඔබ වැස්සට පෙම් බඳින බව කියයි. එහෙත් වැස්ස වහින විට ඔබ කුඩයක් සොයයි" \
    --use_cuda true

GUI

git clone https://github.com/coqui-ai/TTS
cd TTS
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Start the server with a command like the following, then open http://localhost:5002/ in your browser:

python TTS/server/server.py --config_path models/config.json --model_path models/tts-si-female-vits-v2_124000.pth
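
Besides the browser page, the running server can be called from a script. The sketch below assumes the Coqui demo server exposes a `/api/tts` endpoint that accepts a `text` query parameter and returns WAV audio; check `TTS/server/server.py` in your checkout if the request fails. The helper names `build_tts_url` and `synthesize_via_server` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_tts_url(text: str, base_url: str = "http://localhost:5002") -> str:
    """Build the request URL; urlencode percent-encodes the Sinhala text as UTF-8."""
    return f"{base_url}/api/tts?{urlencode({'text': text})}"


def synthesize_via_server(text: str, out_path: str = "output.wav",
                          base_url: str = "http://localhost:5002") -> str:
    """Request audio from the local server and save the response bytes to a file."""
    with urlopen(build_tts_url(text, base_url), timeout=60) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


if __name__ == "__main__":
    # Requires the server from the command above to be running.
    print(synthesize_via_server("ඔබ වැස්සට පෙම් බඳින බව කියයි."))
```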

Docker image

You can also try TTS without installing it locally by using the Docker image. The following command starts a shell inside the CPU image and publishes port 5002 for the server:

docker run --rm -it -p 5002:5002 --entrypoint /bin/bash ghcr.io/coqui-ai/tts-cpu

Enter the input text in Sinhala Unicode. If you type Sinhala using English letters, convert it to Sinhala Unicode first with a Sinhala Unicode converter tool.
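
Before sending text to the model, a quick check can catch input that is still romanized. The sketch below relies on the Sinhala Unicode block (U+0D80 to U+0DFF); the function name `looks_like_sinhala_unicode` and the rule "at least one Sinhala character and no ASCII letters" are illustrative assumptions, not something the model itself requires.

```python
import string

SINHALA_START, SINHALA_END = 0x0D80, 0x0DFF  # Sinhala Unicode block


def looks_like_sinhala_unicode(text: str) -> bool:
    """Return True if the text has Sinhala characters and no ASCII letters."""
    has_sinhala = any(SINHALA_START <= ord(ch) <= SINHALA_END for ch in text)
    has_latin = any(ch in string.ascii_letters for ch in text)
    return has_sinhala and not has_latin


if __name__ == "__main__":
    print(looks_like_sinhala_unicode("ඔබ වැස්සට පෙම් බඳින බව කියයි."))
    print(looks_like_sinhala_unicode("oba wassata pem bandina bawa kiyayi."))
```

Text that mixes Sinhala with English words or abbreviations fails this check on purpose, so relax the rule if your input legitimately contains both.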

🙏 Acknowledgements

Coqui TTS training and inference framework
Pathnirvana Sinhala male VITS checkpoint, dataset, and guidance (this work directly builds on their contribution)
Sinhala dataset contributors

License

This model is released under the MPL-2.0 license, the same as the original Sinhala TTS checkpoint by Pathnirvana.
