MrEzzat/parakeet-atc-finetuned

Parakeet TDT Fine-tuned on ATC Dataset

This model is a fine-tuned version of qenneth/parakeet-tdt-0.6b-v3-finetuned-for-ATC on the ATC-ASR-Dataset.

Model Details

Base Model: Parakeet TDT 0.6B v3 (starting checkpoint: qenneth/parakeet-tdt-0.6b-v3-finetuned-for-ATC)
Training Dataset: ATC (Air Traffic Control) ASR Dataset
Model Type: Automatic Speech Recognition (ASR)
Framework: NVIDIA NeMo
Experiment: atc-finetune-f28e5c62
Best Validation WER: not reported
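
No validation WER is listed for this run. For readers who want to measure one on their own ATC recordings, the following is a minimal sketch of word error rate as word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the lowercase/whitespace normalization are assumptions for illustration; this is not the metric code used during training.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

ATC transcripts often differ in how numbers and callsigns are written ("two seven" versus "27"), so the normalization step is likely to need adjusting before the scores are comparable across systems.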

Usage

import nemo.collections.asr as nemo_asr

# Load the model from a local .nemo checkpoint file
asr_model = nemo_asr.models.EncDecRNNTBPEModel.restore_from("Speech_To_Text_Finetuning.nemo")

# Transcribe audio
transcription = asr_model.transcribe(["path/to/audio.wav"])
print(transcription)
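
Depending on the installed NeMo version, `transcribe` may return plain strings, hypothesis objects with a `text` attribute, or (in some older RNNT releases) a tuple of best and all hypotheses. The helper below is a sketch for pulling out plain text in each case. `extract_texts` is a hypothetical name, and the set of return shapes it handles is an assumption to check against your NeMo version.

```python
from typing import Any, List


def extract_texts(results: Any) -> List[str]:
    """Return plain-text transcripts from the output of asr_model.transcribe()."""
    # Assumed older RNNT shape: (best_hypotheses, all_hypotheses); keep the best ones.
    if isinstance(results, tuple):
        results = results[0]

    texts = []
    for item in results:
        if isinstance(item, str):
            texts.append(item)
        else:
            # Assumed: a hypothesis-like object exposing a .text attribute.
            texts.append(item.text)
    return texts
```

With the usage snippet above, `print(extract_texts(transcription))` would then print a list of strings regardless of which of these shapes was returned.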

Training

This model was trained using Modal's serverless GPU infrastructure with automatic checkpointing and resumption.
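
Resumption on time-limited or preemptible workers usually comes down to finding the most recent checkpoint on persistent storage before training starts. The sketch below shows that step in plain Python. The directory layout, the `.ckpt` suffix, and the name `find_latest_checkpoint` are assumptions; it does not reproduce the Modal or NeMo code used for this model.

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(checkpoint_dir: str, suffix: str = ".ckpt") -> Optional[Path]:
    """Return the most recently modified checkpoint under checkpoint_dir, or None."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        # Nothing has been saved yet, so the caller should start a fresh run.
        return None

    # Search recursively, since training frameworks often nest checkpoints in subfolders.
    candidates = [p for p in root.rglob(f"*{suffix}") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

A training entry point could pass the returned path to its trainer's resume option, or fall back to the starting checkpoint when the function returns `None`.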

Training configuration:

Optimizer: AdamW
Learning Rate: 1e-4
Batch Size: 16
Precision: 16-bit mixed precision
Epochs: 30
GPU: A100
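
The hyperparameters above can be collected into one structure for reproducibility. The sketch below stores them in a dictionary and flattens it into `key=value` strings of the kind Hydra-style launchers accept. The key names (`model.optim.lr`, `trainer.max_epochs`, and so on) follow a layout commonly seen in NeMo fine-tuning configs but are assumptions here, as is the `16-mixed` precision string; the actual config file for this run is not part of the card.

```python
from typing import Any, Dict, List

# Values come from the list above; key names are assumed, not taken from the run's config.
# The GPU type (A100) is selected by the infrastructure, so it is not represented here.
TRAINING_CONFIG: Dict[str, Any] = {
    "model": {
        "optim": {"name": "adamw", "lr": 1e-4},
        "train_ds": {"batch_size": 16},
    },
    "trainer": {"precision": "16-mixed", "max_epochs": 30},
}


def to_overrides(config: Dict[str, Any], prefix: str = "") -> List[str]:
    """Flatten a nested dict into dotted key=value override strings."""
    overrides = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            overrides.extend(to_overrides(value, path))
        else:
            overrides.append(f"{path}={value}")
    return overrides
```

Calling `to_overrides(TRAINING_CONFIG)` yields one string per hyperparameter, which makes it easy to log the exact settings alongside each checkpoint.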

Citation

If you use this model, please cite the original Parakeet paper and the ATC dataset.
