SpeechColab

non-profit

AI & ML interests

Machine Learning for Audio/Speech

Recent Activity

yfyeung updated a collection 5 days ago

GigaSpeech Series

yfyeung

updated a collection 5 days ago

GigaSpeech Series

Collection

Evolving, Large-Scale, and Multi-domain ASR Corpus • 4 items • Updated 5 days ago

yfyeung

authored a paper 5 days ago

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

Paper • 2601.09385 • Published 11 days ago

yfyeung

authored a paper 16 days ago

Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training

Paper • 2601.03065 • Published 18 days ago

yfyeung

authored 2 papers 3 months ago

SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations

Paper • 2510.25955 • Published Oct 29, 2025

SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation

Paper • 2510.14664 • Published Oct 16, 2025 • 1

yfyeung

authored 2 papers 4 months ago

Towards Responsible Evaluation for Text-to-Speech

Paper • 2510.06927 • Published Oct 8, 2025

Measuring Prosody Diversity in Zero-Shot TTS: A New Metric, Benchmark, and Exploration

Paper • 2509.19928 • Published Sep 24, 2025 • 1

sanchit-gandhi

authored 2 papers 6 months ago

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 66

Voxtral

Paper • 2507.13264 • Published Jul 17, 2025 • 32

yfyeung

in speechcolab/gigaspeech2 7 months ago

can i use this dataset to finetune tts model?

#5 opened 12 months ago by adityaakmalazhari

yfyeung

authored 2 papers 7 months ago

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling

Paper • 2506.12570 • Published Jun 14, 2025 • 1

Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR

Paper • 2409.08797 • Published Sep 13, 2024

yfyeung

authored 2 papers 8 months ago

Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

Paper • 2505.19669 • Published May 26, 2025

VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining

Paper • 2505.21527 • Published May 23, 2025

yfyeung

in speechcolab/gigaspeech2 9 months ago

Question on the dataset license

#6 opened 10 months ago by Chalermdej

yfyeung

authored 2 papers 9 months ago

EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting

Paper • 2504.12867 • Published Apr 17, 2025

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

Paper • 2504.10352 • Published Apr 14, 2025

Team members 11
