Parakeet TDT Fine-tuned on ATC Dataset
This model is a fine-tuned version of qenneth/parakeet-tdt-0.6b-v3-finetuned-for-ATC on the ATC-ASR-Dataset.
Model Details
- Base Model: Parakeet TDT 0.6B v3
- Training Dataset: ATC (Air Traffic Control) ASR Dataset
- Model Type: Automatic Speech Recognition (ASR)
- Framework: NVIDIA NeMo
- Experiment: atc-finetune-f28e5c62
- Best Validation WER: N/A
Usage
import nemo.collections.asr as nemo_asr
# Load the fine-tuned checkpoint from a local .nemo file
asr_model = nemo_asr.models.EncDecRNNTBPEModel.restore_from("Speech_To_Text_Finetuning.nemo")
# Transcribe one or more audio files; the result holds one entry per input path
transcription = asr_model.transcribe(["path/to/audio.wav"])
print(transcription)
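Since no validation WER is reported for this model, it may help to measure word error rate on your own recordings. The following is a minimal sketch in plain Python, not the evaluation code used for this model: the function name `word_error_rate` and the example phrases are hypothetical, and the only normalization applied is lowercasing and whitespace splitting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference and model output, for illustration only.
reference = "cleared for takeoff runway two seven"
hypothesis = "cleared for takeoff runway two nine"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

ATC transcripts often mix spelled-out digits and call signs, so a real evaluation would likely need a text normalization step agreed on for both references and model output before scoring.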
Training
This model was trained using Modal's serverless GPU infrastructure with automatic checkpointing and resumption.
Training configuration:
- Optimizer: AdamW
- Learning Rate: 1e-4
- Batch Size: 16
- Precision: 16-bit mixed precision
- Epochs: 30
- GPU: A100
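The settings above can be collected in one place for reproducibility. The sketch below is an illustration only, not the training script used for this model: `FinetuneConfig` and `to_overrides` are hypothetical names, the dotted keys follow the layout commonly seen in NeMo ASR YAML configs (an assumption about how this run was configured), and `"16-mixed"` is assumed to be how 16-bit mixed precision would be spelled for the trainer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneConfig:
    """Hyperparameters listed in this model card (hypothetical container)."""
    optimizer: str = "adamw"
    learning_rate: float = 1e-4
    batch_size: int = 16
    precision: str = "16-mixed"  # assumed spelling of 16-bit mixed precision
    max_epochs: int = 30
    gpu: str = "A100"


def to_overrides(cfg: FinetuneConfig) -> dict:
    """Map the settings to dotted config keys (key names are assumptions)."""
    return {
        "model.optim.name": cfg.optimizer,
        "model.optim.lr": cfg.learning_rate,
        "model.train_ds.batch_size": cfg.batch_size,
        "trainer.precision": cfg.precision,
        "trainer.max_epochs": cfg.max_epochs,
    }


# Print the overrides in key=value form, one per line.
for key, value in to_overrides(FinetuneConfig()).items():
    print(f"{key}={value}")
```

Keeping the values in a frozen dataclass makes accidental changes between resumed runs harder, which fits a setup that relies on automatic checkpointing and resumption.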
Citation
If you use this model, please cite the original Parakeet paper and the ATC dataset.