Whisper Small Canto - Chengyi Li

This model is a fine-tuned version of openai/whisper-small on the Common Voice 24.0 - Cantonese dataset. The best checkpoint achieves the following results on the evaluation set:

  • WER: 62.24
  • CER: 12.46
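
For readers unfamiliar with the two metrics, here is a minimal sketch of their standard definitions: edit distance between reference and hypothesis, divided by the reference length, computed over characters (CER) or whitespace-separated tokens (WER). It is not the evaluation code used for this model, and the example sentences are hypothetical. One plausible reading of the large gap between WER and CER above, assuming WER was computed on whitespace-split text (the card does not say), is that Cantonese text has few spaces, so a single wrong character can count as a whole wrong "word".

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalised by reference length, as a percentage."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference, hypothesis):
    # Characters are the tokens; dropping spaces is an assumption of this sketch.
    return error_rate(list(reference.replace(" ", "")),
                      list(hypothesis.replace(" ", "")))


def wer(reference, hypothesis):
    # Whitespace-separated chunks are the tokens.
    return error_rate(reference.split(), hypothesis.split())


# Hypothetical sentences, not taken from the evaluation set.
reference = "我哋今日去飲茶"
hypothesis = "我地今日去飲茶"
print(f"CER: {cer(reference, hypothesis):.2f}")
print(f"WER: {wer(reference, hypothesis):.2f}")
```

In this toy case one wrong character out of seven gives a modest CER, while the unsegmented sentence counts as a single wrong token for WER.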

Model description

It's my first time fine-tuning an ASR model.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training was run on Google Colab Pro using an L4 GPU.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 10000
  • mixed_precision_training: Native AMP
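
To make the learning-rate settings above concrete, the following sketch computes the rate at a given step for a linear schedule with warmup: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final step. This is the shape that a linear scheduler with warmup usually has in Transformers; the function name `linear_lr` is hypothetical and the code is an illustration, not the training script used here.

```python
BASE_LR = 1e-05       # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 10000   # training_steps


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / warmup
    # Decay from base_lr at the end of warmup to 0 at the final step.
    return base_lr * max(0.0, (total - step) / (total - warmup))


for step in (0, 250, 500, 5250, 10000):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 1e-05 at step 500 and has fallen to half of that by step 5250.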

Training results

Step    Epoch     Training Loss   Validation Loss   CER
1000    2.1552    0.081           0.2857            13.8249
2000    4.3103    0.0173          0.3094            12.7975
3000    6.4655    0.0039          0.3496            12.7571
4000    8.6207    0.0008          0.3721            12.5457
5000    10.7759   0.0006          0.3784            12.5347
6000    12.9310   0.0043          0.3907            13.0640
7000    15.0862   0.0004          0.4053            12.6560
8000    17.2414   0.0008          0.4123            12.4648
9000    19.3966   0.0002          0.4196            12.4648
10000   21.5517   0.0001          0.4238            12.5071
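
A few things can be read off these numbers with simple arithmetic, sketched below. The step-to-epoch ratio gives the number of optimizer steps per epoch; multiplying by the train batch size gives a rough training-set size, assuming a single device and no gradient accumulation (the card does not state either, and the last batch of an epoch may be partial). The sketch also picks out the checkpoints with the lowest CER and the lowest validation loss. The variable names are illustrative only.

```python
# (step, epoch, training loss, validation loss, CER), copied from the table.
rows = [
    (1000, 2.1552, 0.0810, 0.2857, 13.8249),
    (2000, 4.3103, 0.0173, 0.3094, 12.7975),
    (3000, 6.4655, 0.0039, 0.3496, 12.7571),
    (4000, 8.6207, 0.0008, 0.3721, 12.5457),
    (5000, 10.7759, 0.0006, 0.3784, 12.5347),
    (6000, 12.9310, 0.0043, 0.3907, 13.0640),
    (7000, 15.0862, 0.0004, 0.4053, 12.6560),
    (8000, 17.2414, 0.0008, 0.4123, 12.4648),
    (9000, 19.3966, 0.0002, 0.4196, 12.4648),
    (10000, 21.5517, 0.0001, 0.4238, 12.5071),
]

TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list

# Steps per epoch, averaged over rows to smooth out rounding in the epoch column.
ratios = [step / epoch for step, epoch, *_ in rows]
steps_per_epoch = round(sum(ratios) / len(ratios))

# Rough upper bound on training samples (assumes no gradient accumulation).
approx_train_samples = steps_per_epoch * TRAIN_BATCH_SIZE

# Checkpoints with the lowest CER and lowest validation loss (first one on ties).
best_by_cer = min(rows, key=lambda r: r[4])
best_by_val_loss = min(rows, key=lambda r: r[3])

print("steps per epoch:", steps_per_epoch)
print("approx. training samples:", approx_train_samples)
print("lowest CER at step:", best_by_cer[0], "CER:", best_by_cer[4])
print("lowest validation loss at step:", best_by_val_loss[0])
```

The lowest CER in the table (12.4648, at steps 8000 and 9000) is consistent with the headline CER of 12.46. Validation loss is lowest at the first logged step and rises afterwards while training loss approaches zero, a pattern often associated with overfitting, although CER keeps improving slightly over the same range.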

Framework versions

  • Transformers 4.52.0
  • Pytorch 2.9.0+cu126
  • Datasets 4.4.2
  • Tokenizers 0.21.4