Parakeet TDT Fine-tuned on ATC Dataset

This model is a fine-tuned version of qenneth/parakeet-tdt-0.6b-v3-finetuned-for-ATC on the ATC-ASR-Dataset.

Model Details

  • Base Model: Parakeet TDT 0.6B v3
  • Training Dataset: ATC (Air Traffic Control) ASR Dataset
  • Model Type: Automatic Speech Recognition (ASR)
  • Framework: NVIDIA NeMo
  • Experiment: atc-finetune-f28e5c62
  • Best Validation WER: N/A
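
No validation WER is reported above. For readers who want to measure one on their own held-out data, the following is a minimal sketch of the conventional word error rate: word-level edit distance divided by the number of reference words. The function name, the lowercase-and-split normalization, and the example transcripts are all assumptions for illustration; NeMo ships its own WER metric, which is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example transcripts, not taken from the dataset
reference = "delta four five six descend flight level two four zero"
hypothesis = "delta four five six descend flight level two five zero"
print(word_error_rate(reference, hypothesis))
```

A corpus-level WER is usually computed by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.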

Usage

import nemo.collections.asr as nemo_asr

# Load the fine-tuned checkpoint from a local .nemo file
asr_model = nemo_asr.models.EncDecRNNTBPEModel.restore_from("Speech_To_Text_Finetuning.nemo")

# Transcribe one or more audio files (Parakeet models expect 16 kHz mono audio)
transcriptions = asr_model.transcribe(["path/to/audio.wav"])
print(transcriptions)
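
The shape of the value returned by transcribe() has varied across NeMo releases: depending on the version, it may be a list of strings, a list of hypothesis objects exposing a .text attribute, or a tuple whose first element is one of those lists. The helper below is a hedged sketch for normalizing these cases to plain strings. The name extract_texts and the FakeHypothesis stand-in class are hypothetical and are not part of NeMo.

```python
from dataclasses import dataclass


def extract_texts(result):
    """Return plain strings from a transcribe() result.

    Assumed input shapes: a list of strings, a list of objects with a
    `.text` attribute, or a tuple whose first element is such a list.
    """
    if isinstance(result, tuple):
        result = result[0]  # keep only the best hypotheses
    return [item if isinstance(item, str) else item.text for item in result]


@dataclass
class FakeHypothesis:
    """Hypothetical stand-in for a NeMo hypothesis object (demo only)."""
    text: str


print(extract_texts(["cleared to land"]))
print(extract_texts([FakeHypothesis("cleared to land")]))
print(extract_texts(([FakeHypothesis("cleared to land")], None)))
```

With a loaded model, this would be called as extract_texts(asr_model.transcribe(["path/to/audio.wav"])).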

Training

This model was trained on Modal's serverless GPU infrastructure, with automatic checkpointing so that interrupted runs can resume from a saved checkpoint.

Training configuration:

  • Optimizer: AdamW
  • Learning Rate: 1e-4
  • Batch Size: 16
  • Precision: 16-bit mixed precision
  • Epochs: 30
  • GPU: A100
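
The settings above can be sketched as Hydra-style command-line overrides of the kind NeMo's fine-tuning scripts accept. This is an illustration under stated assumptions, not the author's actual launch command: the config key paths, the "16-mixed" precision string (the card does not say whether fp16 or bf16 was used), and the helper name to_hydra_overrides are all assumed, and the exact keys depend on the NeMo and Lightning versions in use.

```python
# Hyperparameters listed on this card, keyed by ASSUMED NeMo/Hydra config paths.
# The GPU type (A100) is a hardware choice and has no entry here.
TRAINING_CONFIG = {
    "model.optim.name": "adamw",        # Optimizer: AdamW
    "model.optim.lr": 1e-4,             # Learning Rate: 1e-4
    "model.train_ds.batch_size": 16,    # Batch Size: 16
    "trainer.precision": "16-mixed",    # 16-bit mixed precision (assumed spelling)
    "trainer.max_epochs": 30,           # Epochs: 30
}


def to_hydra_overrides(config):
    """Render a flat config dict as `key=value` override strings."""
    return [f"{key}={value}" for key, value in config.items()]


overrides = to_hydra_overrides(TRAINING_CONFIG)
print(" ".join(overrides))
```

Keeping the hyperparameters in one dictionary makes it straightforward to log them alongside each checkpoint, which helps when a run is resumed after interruption.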

Citation

If you use this model, please cite the original Parakeet paper and the ATC dataset.

