Pocket TTS - MLX Weights

MLX-converted weights for Kyutai's Pocket TTS model. Optimized for Apple Silicon inference via MLX (Python) and MLX-Swift.
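
To fetch the weights programmatically, a minimal sketch using huggingface_hub is shown below; the repo id is a placeholder for this repository, and the allow_patterns filter simply selects one weight variant plus the voice files.

```python
# Minimal download sketch (the repo id is a placeholder -- replace it with
# this repository's actual id on the Hub).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="<org>/pocket-tts-mlx",        # hypothetical repo id
    allow_patterns=["int8/*", "voice/*"],  # one weight variant + voice embeddings
)
print("Downloaded to:", local_dir)
```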

Weight Variants

| Directory | Size  | Description                       | RTF (M2 Air)  |
|-----------|-------|-----------------------------------|---------------|
| bf16/     | 224MB | bfloat16 baseline                 | ~3x realtime  |
| int8/     | 148MB | 8-bit quantized FlowLM, bf16 Mimi | ~5x realtime  |
| int4/     | 107MB | 4-bit quantized FlowLM, bf16 Mimi | ~6x realtime  |

All variants include FlowLM + Mimi decoder in a single unified mlx_model.safetensors file.
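
As a quick sanity check, the unified file can be opened directly with mlx.core: mx.load reads a .safetensors file into a dict of arrays, so you can confirm the FlowLM and Mimi tensors are present. The path below assumes the directory layout from the table above.

```python
# Sketch: inspect the unified checkpoint with MLX.
import mlx.core as mx

weights = mx.load("bf16/mlx_model.safetensors")  # or int8/, int4/
print(f"{len(weights)} tensors")
for name, tensor in list(weights.items())[:5]:   # peek at the first few entries
    print(name, tensor.shape, tensor.dtype)
```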

Voice Embeddings

8 pre-extracted voice embeddings from the Kyutai release (see the loading sketch after this list):

  • voice/alba.safetensors
  • voice/azelma.safetensors
  • voice/cosette.safetensors
  • voice/eponine.safetensors
  • voice/fantine.safetensors
  • voice/javert.safetensors
  • voice/jean.safetensors
  • voice/marius.safetensors
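
Each voice file is a small safetensors archive and can be loaded the same way as the main checkpoint; the tensor names inside are whatever the conversion wrote, so the sketch below just iterates over them rather than assuming a specific key.

```python
# Sketch: load a pre-extracted voice embedding.
import mlx.core as mx

voice = mx.load("voice/alba.safetensors")
for name, emb in voice.items():
    print(name, emb.shape, emb.dtype)
```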

Full source code will be published in a GitHub repo (coming soon).

Source

Converted from the official kyutai/pocket-tts weights.
