---
license: mit
base_model:
- Rohan432/Augmented_on_normal
- bangla-speech-processing/BanglaASR
pipeline_tag: automatic-speech-recognition
---

## Description

This model is a fine-tuned version of Rohan432/Augmented_on_normal, which is itself a fine-tuned version of bangla-speech-processing/BanglaASR on Bangla speech data.

## Environment:

- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4

## Training Parameters:

- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 200
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50

## Validation Set Evaluation:

| Epoch | Training Loss | Validation Loss | WER       | Normalized Levenshtein Similarity |
|-------|---------------|-----------------|-----------|-----------------------------------|
| 0     | 1.447600      | 1.466727        | 13.093923 | 90.565657                         |
| 2     | 1.430200      | 1.469819        | 13.425414 | 90.040404                         |
| 4     | 1.423800      | 1.461309        | 11.657459 | 91.272727                         |
| 6     | 1.424000      | 1.458325        | 11.215470 | 91.545455                         |
| 7     | 1.426100      | 1.457540        | 10.939227 | 91.848485                         |
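
The evaluation table reports WER and Normalized Levenshtein Similarity without giving their formulas. The sketch below shows one common way such metrics are computed, for readers who want to score their own transcripts. It is not the evaluation script used for this model. It assumes both metrics are on a 0 to 100 scale, that WER is the word-level edit distance divided by the number of reference words, and that the similarity is one minus the character-level edit distance divided by the longer string's length. The function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for one reference/hypothesis pair."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


def normalized_levenshtein_similarity(reference, hypothesis):
    """Character-level similarity in percent (assumed definition, see above)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1.0 - edit_distance(reference, hypothesis) / longest)


if __name__ == "__main__":
    ref = "আমি ভাত খাই"
    hyp = "আমি ভাত খাব"
    print(f"WER: {wer(ref, hyp):.2f}")
    print(f"Similarity: {normalized_levenshtein_similarity(ref, hyp):.2f}")
```

These functions score a single utterance. A validation-set figure would normally aggregate over all utterances, and how this model's numbers were aggregated is not stated here. Note also that Python's `len` counts Unicode code points, so Bangla conjuncts and vowel signs count as several characters each.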