Good Audio Generation space, model, dataset
-
Audio-to-Audio • Updated • 100k • 100 -
KittenML/kitten-tts-nano-0.1
Updated • 30.4k • 499 -
FunAudioLLM/ThinkSound
Video-to-Video • Updated • 50 -
ThinkSound
🔊 318 • Generate audio for a video from a caption or description
-
Higgs Audio Demo
🎤 397 • Higgs Audio Demo
-
bosonai/higgs-audio-v2-generation-3B-base
Text-to-Speech • Updated • 194k • 658 -
Song Generation
🎵 639 • Generate a song from lyrics and prompts
-
Vui
🏢 185 • NotebookLM conversational speech model
-
Hibiki Samples
🤗 52 • Translate speech in real-time with high fidelity
-
kyutai/moshiko-pytorch-bf16
Updated • 153k • 229 -
kyutai/mimi
Feature Extraction • 96.2M • Updated • 470k • 289 -
maya-research/Veena
Text-to-Speech • Updated • 11.4k • 227 -
MiniMax Speech Tech Report
🎙 104 • Generate high-quality speech from text with voice cloning
-
google/magenta-realtime
Updated • 251 • 539 -
PlayDiffusion
🎨 120 • Generate modified audio from text and voice
-
Qwen2.5 Omni 7B Demo
🏆 366 • Chat with AI using text, audio, images, and video
-
Open ASR Leaderboard
🏆 1.22k • Explore and compare speech-recognition model benchmarks
-
Open NotebookLM
🎙 143 • Generate a podcast to discuss the topic of your choice!
-
Voila Demo
💻 43 • Chat with a voice-clone AI
-
Voice Clone
🗣 2.58k • Clone a voice and generate speech from your text
-
moonshotai/Kimi-Audio-7B-Instruct
Text-to-Speech • Updated • 1.55k • 386 -
moonshotai/Kimi-Audio-7B
Text-to-Speech • 10B • Updated • 183 • 77 -
Dia 1.6B
👯 1.75k • Generate realistic dialogue from a script, using Dia!
-
nari-labs/Dia-1.6B
Text-to-Speech • Updated • 76.2k • 2.83k -
ByteDance/MegaTTS3
Text-to-Speech • Updated • 80 • 414 -
Di♪♪Rhythm
🎶 677 • Blazingly Fast and Embarrassingly Simple Song Generation
-
Gemini Audio Video
♊ 35 • Gemini understands audio and video!
-
nvidia/diar_sortformer_4spk-v1
Automatic Speech Recognition • 0.1B • Updated • 3.55k • 131 -
ACE Step
😻 645 • A Step Towards Music Generation Foundation Model
-
ACE-Step/ACE-Step-v1-3.5B
Text-to-Audio • Updated • 715 -
stepfun-ai/Step-Audio-2-mini
Any-to-Any • Updated • 1.85k • 250 -
neuphonic/neutts-air
Text-to-Speech • 0.7B • Updated • 10.5k • 852 -
NeuTTS-Air
☁ 311 • Generate speech in a chosen voice from text
-
KaniTTS
😻 112 • Generate expressive speech from your text in seconds
-
microsoft/UserLM-8b
Text Generation • Updated • 423 • 362 -
pipecat-ai/smart-turn-v3
Voice Activity Detection • Updated • 126 -
meituan-longcat/LongCat-Audio-Codec
Updated • 41 -
Qwen3 TTS Voice Design
📈 105 • Generate custom voice audio from text and description
-
Qwen TTS Clone Demo
👀 60 • Create a custom voice clone and synthesize speech
-
ResembleAI/chatterbox-turbo
Text-to-Speech • Updated • 608 -
Chatterbox Turbo Demo
⚡ 478 • Chatterbox Turbo Demo
-
zai-org/GLM-TTS
Text-to-Speech • Updated • 236 • 321 -
Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
Text-to-Speech • Updated • 1.01M • 1.16k -
Qwen3-TTS Demo
🎙 1.51k • Generate custom speech from text, voice descriptions, or samples
-
Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice
Text-to-Speech • Updated • 223k • 111 -
FlashLabs/Chroma-4B
Any-to-Any • Updated • 7.26k • 336 -
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning
Paper • 2601.11141 • Published • 23 -
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization
Paper • 2601.01554 • Published • 57 -
FunAudioLLM/Fun-Audio-Chat-8B
Any-to-Any • 9B • Updated • 2.83k • 175