Finetune Guide

by rastegar - opened 8 days ago

Discussion

rastegar

8 days ago

Hi there
Is there a fine-tuning guide, or example code with an example dataset?

cmoney113

6 days ago

Since this is based on StyleTTS, it is quite straightforward to fine-tune. I assume you want to do something like expand language coverage. You happen to be talking to the right guy: I fully reverse-engineered Kokoro, which is also based on StyleTTS, and expanded it to speak 72 languages. The only issue is that you would need additional training runs with this model to get more phonemes into it, so that the vocoder can actually produce the phonemes you want. After that you can do things like cross-phoneme mapping and post-hoc tuning and training. Let me know if you have any questions and I can guide you.
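
To make the cross-phoneme mapping idea concrete, here is a rough sketch of one way it could look: phonemes the pretrained model has never seen are replaced by the closest symbols it does know. This is only an illustration, not Kokoro's or this model's actual code. The `SUPPORTED` inventory, the `FALLBACK` table and the `map_phonemes` helper are all made up for the example.

```python
# Hypothetical inventory of phoneme symbols the pretrained model already knows.
SUPPORTED = {"a", "e", "i", "o", "u", "k", "g", "s", "z", "t", "d", "r", "h"}

# Hypothetical fallback table: unseen phoneme -> closest supported phoneme(s).
FALLBACK = {
    "x": ["h"],       # voiceless velar fricative -> glottal fricative
    "q": ["k"],       # uvular stop -> velar stop
    "ɣ": ["g"],       # voiced velar fricative -> voiced velar stop
    "ʦ": ["t", "s"],  # affricate -> stop + fricative sequence
}


def map_phonemes(phonemes):
    """Return (mapped, unknown): mapped uses only supported symbols,
    unknown lists input phonemes that have no fallback entry."""
    mapped, unknown = [], []
    for p in phonemes:
        if p in SUPPORTED:
            mapped.append(p)
        elif p in FALLBACK:
            mapped.extend(FALLBACK[p])
        else:
            unknown.append(p)
    return mapped, unknown


if __name__ == "__main__":
    mapped, unknown = map_phonemes(["q", "a", "x", "ʔ"])
    print(mapped, unknown)
```

Anything that ends up in `unknown` is a phoneme with no reasonable stand-in, which is where the extra training runs mentioned above would come in.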

Anilosan15

5 days ago

@cmoney113 wow, this is the first time I’ve heard of something like this. How were you able to reverse-engineer Kokoro and train it on other languages? That’s a big achievement, could you share more details? I have around four to five hundred thousand hours of data and I want to try training experiments with smaller models.

stellon-admin

Kitten ML org 3 days ago

@rastegar we will likely release the ability to get custom models with custom voices after our next launch. rn the highest priority is to launch the next model next month. we don't have the bandwidth to release training code rn, unfortunately :( maybe in May or so this will change
