The important question
So, how about releasing the full dataset? Or have you just illegally ripped off stolen voices from the web?
surely this is the best way to ask for anything
most companies/developers wouldn't release a training dataset, even when the model is open source. this is not unusual.
- A bunch of projects like VITS, Tacotron, etc., have released theirs! (And usually they use LJSpeech)
- If you don't even say where the data is coming from, it's definitely 100% stolen and they MUST be banned from HF!
right, hf should ban 95% of models, including gpt, llama, gemma as well. none of them have released datasets lol
btw, maya actually notes training data in the metadata
such an aggressive post...
We're building voice intelligence for everyone and releasing it freely. That mission stays the same.
The internet is a shared resource. We'll use every open audio source we can find to train models that talk naturally and push the frontier forward, available to all at no cost.
Thieves
Thank you