multilingual_speech_to__intent_lg_xlsr
This model is a fine-tuned version of KasuleTrevor/wav2vec2-large-xls-r-300m-lg-cv-130hr-v1; the training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 0.1706
- Accuracy: 0.9726
- Precision: 0.9729
- Recall: 0.9726
- F1: 0.9726
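Recall equals Accuracy to four decimal places here and in every row of the training results, which is what support-weighted averaging produces for single-label classification. The card does not state the averaging mode, so the following is only a sketch under that assumption; the function name and the intent labels are hypothetical and are not taken from the model's label set.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) using support-weighted averaging."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # each class is weighted by its share of true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return accuracy, precision, recall, f1


# Hypothetical intent labels, for illustration only.
y_true = ["lights_on", "lights_on", "lights_off", "play_music", "play_music", "play_music"]
y_pred = ["lights_on", "lights_off", "lights_off", "play_music", "play_music", "lights_on"]

acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

With weighted averaging, the per-class recall terms sum to the fraction of correctly classified examples, so recall and accuracy coincide while precision and F1 can differ.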
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 500
- num_epochs: 80
- mixed_precision_training: Native AMP
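As a rough illustration of how these settings interact, the sketch below recomputes the total batch size and evaluates a cosine schedule with linear warmup in plain Python. It follows the standard warmup-then-half-cosine formula rather than calling any library, and it rests on two assumptions not stated in the card: training ran on a single device, and the schedule was planned for 219 optimizer steps per epoch (the step increment in the results table) over all 80 epochs. The function names are hypothetical.

```python
import math

LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500
# Assumption: 219 optimizer steps per epoch for the configured 80 epochs.
TOTAL_STEPS = 219 * 80


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def cosine_lr_with_warmup(step, base_lr=LEARNING_RATE,
                          warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Linear warmup to base_lr, then half-cosine decay towards zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 250, 500, 2190, TOTAL_STEPS):
    print(f"step {step:>5}: lr = {cosine_lr_with_warmup(step):.2e}")
```

Under these assumptions the learning rate peaks at step 500, partway through the third epoch, and has decayed only slightly by step 2190.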
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| 1.7737 | 1.0 | 219 | 0.5232 | 0.9462 | 0.9530 | 0.9462 | 0.9445 |
| 0.21 | 2.0 | 438 | 0.1100 | 0.9823 | 0.9827 | 0.9823 | 0.9823 |
| 0.168 | 3.0 | 657 | 0.1395 | 0.9696 | 0.9713 | 0.9696 | 0.9695 |
| 0.1578 | 4.0 | 876 | 0.1974 | 0.9469 | 0.9536 | 0.9469 | 0.9466 |
| 0.1389 | 5.0 | 1095 | 0.3522 | 0.9200 | 0.9271 | 0.9200 | 0.9203 |
| 0.1011 | 6.0 | 1314 | 0.1652 | 0.9632 | 0.9666 | 0.9632 | 0.9629 |
| 0.1023 | 7.0 | 1533 | 0.2251 | 0.9519 | 0.9566 | 0.9519 | 0.9511 |
| 0.0766 | 8.0 | 1752 | 0.1642 | 0.9611 | 0.9641 | 0.9611 | 0.9610 |
| 0.0637 | 9.0 | 1971 | 0.1688 | 0.9625 | 0.9649 | 0.9625 | 0.9625 |
| 0.0595 | 10.0 | 2190 | 0.1758 | 0.9660 | 0.9673 | 0.9660 | 0.9657 |
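Validation loss does not fall monotonically across these epochs, so it can help to check which checkpoint scores best. The sketch below copies four columns from the table above and picks the best row by validation loss and by accuracy. The variable names are illustrative, and nothing here indicates which checkpoint the author actually published.

```python
# (epoch, step, validation_loss, accuracy) copied from the results table.
ROWS = [
    (1, 219, 0.5232, 0.9462),
    (2, 438, 0.1100, 0.9823),
    (3, 657, 0.1395, 0.9696),
    (4, 876, 0.1974, 0.9469),
    (5, 1095, 0.3522, 0.9200),
    (6, 1314, 0.1652, 0.9632),
    (7, 1533, 0.2251, 0.9519),
    (8, 1752, 0.1642, 0.9611),
    (9, 1971, 0.1688, 0.9625),
    (10, 2190, 0.1758, 0.9660),
]

# Best checkpoint by each criterion.
best_by_loss = min(ROWS, key=lambda row: row[2])
best_by_accuracy = max(ROWS, key=lambda row: row[3])

# Optimizer steps between consecutive evaluations.
step_increments = {ROWS[i][1] - ROWS[i - 1][1] for i in range(1, len(ROWS))}

print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_accuracy)
print("steps per epoch:", step_increments)
```

Among the listed rows, epoch 2 has both the lowest validation loss (0.1100) and the highest accuracy (0.9823), and each epoch spans 219 optimizer steps.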
Framework versions
- Transformers 4.51.3
- Pytorch 2.1.0+cu118
- Datasets 3.6.0
- Tokenizers 0.21.2
Model tree for KasuleTrevor/multilingual_speech_to__intent_lg_xlsr
Base model
facebook/wav2vec2-xls-r-300m