The important question

#4
by yukiarimo - opened

So, how about releasing the full dataset? Or have you just illegally ripped off stolen voices from the web?

surely this is the best way to ask for anything

most companies/developers wouldn't release a training dataset, even when the model is open source. this is not unusual.

  1. A bunch of projects like VITS, Tacotron, etc., have released theirs! (And they usually use LJSpeech.)
  2. If you don't even say where the data is coming from, it's definitely 100% stolen, and they MUST be banned from HF!

right, hf should ban 95% of models, including gpt, llama, gemma as well. none of them have released datasets lol

btw, maya actually notes training data in the metadata

such an aggressive post...

Maya Research org

We’re building voice intelligence for everyone and releasing it freely. That mission stays the same.

The internet is a shared resource. We’ll use every open audio source we can find to train models that talk naturally and push the frontier forward, available to all at no cost.

Thieves

bharathkumarK changed discussion status to closed
Maya Research org

Thank you
