mlx-community/kitten-tts-micro-0.8-8bit

This is an 8-bit (INT8) MLX conversion of KittenML/kitten-tts-micro-0.8.

Usage

pip install -U mlx-audio
python -m mlx_audio.tts.generate --model mlx-community/kitten-tts-micro-0.8-8bit --text "This is a local MLX test voice." --voice "expr-voice-5-m"

Inference Notes

The MLX implementation applies a short end-of-utterance fade to prevent abrupt cutoffs. You can disable it by passing fade_out_ms=0 and tail_silence_ms=0 to Model.generate().
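As an illustration of what that smoothing does, here is a minimal NumPy sketch. The helper name and the linear fade shape are assumptions for illustration; only the fade_out_ms and tail_silence_ms parameter names come from the model's API.

```python
import numpy as np

def smooth_tail(audio: np.ndarray, sr: int, fade_out_ms: float = 50.0,
                tail_silence_ms: float = 25.0) -> np.ndarray:
    """Linearly fade the last fade_out_ms of audio, then append silence."""
    fade = int(sr * fade_out_ms / 1000)
    if fade > 0:
        n = min(fade, len(audio))
        audio = audio.copy()
        audio[-n:] *= np.linspace(1.0, 0.0, n)  # ramp gain down to zero
    pad = int(sr * tail_silence_ms / 1000)
    return np.concatenate([audio, np.zeros(pad, dtype=audio.dtype)])

wav = np.ones(24000, dtype=np.float32)  # 1 s of dummy audio at 24 kHz
out = smooth_tail(wav, 24000)
print(len(out))   # 24600: original samples plus 600 samples of tail silence
print(out[-1])    # 0.0
```

Setting both parameters to 0 makes smooth_tail a no-op, which is what the override in Model.generate() achieves.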

Conversion Notes / Fixes

  • AdaIN fc.weight orientation was corrected: ONNX stores the weight as (in, out), a mismatch that raises no shape error when the matrix is square.
  • AdaIN Snake alpha parameters are loaded and used in the generator resblocks.
  • ConvTranspose output padding matches the original (right-side pad for output_padding=1).
  • Phase slice is passed through sin before ISTFT, matching the ONNX graph.
  • ISTFT uses normalized windowing without phase unwrap (to match original behavior).
  • Tail trimming, a dynamic fade-out, and tail silence are applied at inference time to avoid a trailing audio artifact.
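The fc.weight fix above amounts to a transpose at load time. A minimal NumPy sketch (the helper name is made up for illustration):

```python
import numpy as np

def fix_fc_weight(w: np.ndarray) -> np.ndarray:
    # ONNX exports this linear layer's weight as (in_features, out_features),
    # while MLX's nn.Linear expects (out_features, in_features). A square
    # matrix produces no shape error, so the transpose must be unconditional.
    return w.T

w_onnx = np.arange(9, dtype=np.float32).reshape(3, 3)  # square: shapes alone can't catch it
w_mlx = fix_fc_weight(w_onnx)
x = np.ones(3, dtype=np.float32)
# y = x @ W_onnx equals W_mlx @ x once the orientation is corrected
print(np.allclose(x @ w_onnx, w_mlx @ x))  # True
```

Without the transpose, a square AdaIN layer silently computes with a transposed weight, which is why the bug surfaced as degraded audio rather than a load error.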
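The ConvTranspose note can likewise be sketched as a manual right-side pad. This is a hypothetical helper assuming the conversion appends output_padding zeros after the transposed convolution, matching PyTorch's ConvTranspose1d(output_padding=1) semantics; it is not the actual conversion code.

```python
import numpy as np

def apply_output_padding(x: np.ndarray, output_padding: int = 1) -> np.ndarray:
    # PyTorch's ConvTranspose1d adds output_padding extra samples on the
    # right side of the time axis; here that is reproduced by appending
    # zeros after the transposed convolution.
    if output_padding:
        pad = np.zeros(x.shape[:-1] + (output_padding,), dtype=x.dtype)
        x = np.concatenate([x, pad], axis=-1)
    return x

y = apply_output_padding(np.ones((1, 8), dtype=np.float32))
print(y.shape)   # (1, 9): one extra sample appended on the right
print(y[0, -1])  # 0.0
```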

Original Model

Refer to the original model card for details: https://huggingface.co/KittenML/kitten-tts-micro-0.8

Model size: 28.3M params (Safetensors, MLX; tensor types F32, U32, F16)