---
license: mit
base_model:
- Rohan432/Augmented_on_normal
- bangla-speech-processing/BanglaASR
pipeline_tag: automatic-speech-recognition
---
## Description
This model is a fine-tuned version of Rohan432/Augmented_on_normal, which is itself a fine-tuned version of bangla-speech-processing/BanglaASR, fine-tuned on Bangla speech data.

## Environment
- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4

## Training Parameters
- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4 
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 200
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50
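
With gradient accumulation, the batch size seen by each optimizer update is larger than `BATCH_SIZE`. The sketch below works through that arithmetic for the values listed above. It assumes single-device training, and the `TrainingConfig` class, its method names, and the dataset size of 1600 examples are hypothetical, introduced only for illustration; the exact rounding of steps per epoch depends on the training loop actually used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container mirroring the parameters listed above."""
    batch_size: int = 4
    gradient_accumulation_steps: int = 4
    learning_rate: float = 2e-5
    warmup_steps: int = 200
    num_train_epochs: int = 8
    logging_steps: int = 50

    @property
    def effective_batch_size(self) -> int:
        # Examples contributing to one optimizer update (single device assumed).
        return self.batch_size * self.gradient_accumulation_steps

    def optimizer_steps_per_epoch(self, num_examples: int) -> int:
        # Micro-batches per epoch, then grouped into accumulated updates.
        micro_batches = math.ceil(num_examples / self.batch_size)
        return math.ceil(micro_batches / self.gradient_accumulation_steps)


config = TrainingConfig()
print(config.effective_batch_size)             # 4 * 4 = 16
print(config.optimizer_steps_per_epoch(1600))  # 1600 is an assumed dataset size
```

Under these assumptions each optimizer update averages gradients over 16 examples, so the 200 warmup steps cover 3200 training examples.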

## Validation Set Evaluation
| Epoch | Training Loss | Validation Loss | WER (%)    | Normalized Levenshtein Similarity (%) |
|-------|---------------|------------------|------------|---------------------------------------|
| 0     | 1.447600      | 1.466727         | 13.093923  | 90.565657                          |
| 2     | 1.430200      | 1.469819         | 13.425414  | 90.040404                          |
| 4     | 1.423800      | 1.461309         | 11.657459  | 91.272727                          |
| 6     | 1.424000      | 1.458325         | 11.215470  | 91.545455                          |
| 7     | 1.426100      | 1.457540         | 10.939227  | 91.848485                          |
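
The card does not state how the two metrics were computed, so the following is only a minimal sketch of one common definition of each, not necessarily the evaluation code used here. It assumes WER is the word-level edit distance divided by the number of reference words, and that the similarity is one minus the character-level edit distance divided by the length of the longer string, both scaled to percent. All function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_percent(reference, hypothesis):
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


def normalized_levenshtein_similarity_percent(reference, hypothesis):
    """Character-level similarity in percent (compares Unicode code points)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1.0 - edit_distance(reference, hypothesis) / longest)


# One substituted word out of four reference words.
print(wer_percent("a b c d", "a x c d"))
# One substituted character out of four.
print(normalized_levenshtein_similarity_percent("abcd", "abxd"))
```

Because the comparison is made on code points, a Bangla conjunct or a vowel sign counts as its own character in this sketch; an implementation that normalizes text or segments by grapheme cluster would give somewhat different numbers.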