
mimba/whisper-ngiemboon

This repository hosts a fine-tuned version of openai/whisper-medium adapted for Automatic Speech Recognition (ASR) of Ngiemboon (ISO 639-3: nnh).

🧠 Model Details

  • Model name: mimba/whisper-ngiemboon
  • Architecture: Transformer encoder–decoder (fine-tuned)
  • Language: Ngiemboon (Bantu language spoken in Cameroon)
  • Task: Automatic Speech Recognition (ASR)
  • Base model: openai/whisper-medium
  • Author: Mimba

🎯 Intended Use

  • Use case: Transcribe spoken Ngiemboon into text.
  • Audience: Linguists, researchers, developers working on low-resource ASR.
  • Input: 16kHz mono audio waveform in Ngiemboon.
  • Output: Transcribed text in Ngiemboon.
  • Not suitable for: Noisy environments, dialects not represented in training data.

📚 Training Data

  • Source: Community-collected Ngiemboon speech corpus.
  • Size: Approximately 24 hours of transcribed audio.
  • Preprocessing:
    • Audio resampled to 16kHz mono.
    • Normalized and tokenized using a custom vocabulary.
  • Split: Train / Test
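The resampling step above (16 kHz mono is what Whisper models expect) can be sketched as follows. This is an illustrative, numpy-only version using linear interpolation; the function name `to_mono_16k` is my own, and in practice `librosa.resample` or `torchaudio.transforms.Resample` is the better choice:

```python
import numpy as np

TARGET_SR = 16_000  # Whisper models expect 16 kHz input

def to_mono_16k(waveform: np.ndarray, sr: int) -> np.ndarray:
    """Downmix to mono and resample to 16 kHz via linear interpolation."""
    # Downmix: average channels if the signal is stereo/multichannel
    if waveform.ndim == 2:
        waveform = waveform.mean(axis=1)
    if sr == TARGET_SR:
        return waveform.astype(np.float32)
    # Resample: map new sample positions back onto the old time axis
    duration = waveform.shape[0] / sr
    n_out = int(round(duration * TARGET_SR))
    old_t = np.linspace(0.0, duration, num=waveform.shape[0], endpoint=False)
    new_t = np.linspace(0.0, duration, num=n_out, endpoint=False)
    return np.interp(new_t, old_t, waveform).astype(np.float32)
```

For example, one second of 44.1 kHz stereo audio comes out as 16,000 mono samples.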

📈 Evaluation

  • Metric: Word Error Rate (WER)
  • Test set: Held-out Ngiemboon recordings
  • Results:
    | Training Loss | Epoch | Step | Validation Loss | WER    |
    |--------------:|------:|-----:|----------------:|-------:|
    | 0.7846        | 1.0   | 589  | 0.7358          | 0.6419 |
    | 0.5542        | 2.0   | 1178 | 0.5998          | 0.6358 |
    | 0.4704        | 3.0   | 1767 | 0.5379          | 0.5331 |
    | 0.4088        | 4.0   | 2356 | 0.5138          | 0.5010 |
    | 0.3807        | 5.0   | 2945 | 0.4872          | 0.5061 |
    | 0.3395        | 6.0   | 3534 | 0.4809          | 0.4807 |
    | 0.3426        | 7.0   | 4123 | 0.4710          | 0.4997 |
    | 0.3215        | 8.0   | 4712 | 0.4676          | 0.4730 |
    | 0.3045        | 9.0   | 5301 | 0.4636          | 0.4844 |
    | 0.2959        | 10.0  | 5890 | 0.4636          | 0.4744 |
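WER is the word-level edit distance between hypothesis and reference, normalized by the number of reference words. Evaluation toolkits such as jiwer or Hugging Face `evaluate` are typically used in practice; the standalone sketch below is only to make the metric concrete:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For instance, `wer("a b c d", "a x c")` is 0.5: one substitution plus one deletion over four reference words. The final WER of 0.4744 above means roughly 47% of reference words required an edit.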

Framework versions

  • PEFT 0.18.0
  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 2.18.0
  • Tokenizers 0.22.2
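Since PEFT appears in the framework versions, the repository may ship a LoRA/PEFT adapter rather than full fine-tuned weights. If that is the case, the adapter would be loaded on top of the base model roughly as follows (a sketch, assuming the repo contains a PEFT adapter config; skip this entirely if full weights are hosted):

```python
from peft import PeftModel
from transformers import WhisperForConditionalGeneration

# Assumption: mimba/whisper-ngiemboon hosts a PEFT/LoRA adapter for whisper-medium
base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-medium")
model = PeftModel.from_pretrained(base, "mimba/whisper-ngiemboon")

# Optionally merge the adapter into the base weights for faster inference
model = model.merge_and_unload()
```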

🔭 Future Work

  • Expand training corpus with more speakers.
  • Improve robustness to noise and real-world conditions.
  • Release open-source Ngiemboon dataset for community use.
  • Explore multilingual fine-tuning with other Bantu languages.

⚠️ Limitations and Risks

  • May perform poorly on dialects or accents not seen during training.
  • Not robust to background noise or overlapping speech.
  • Limited training data may affect generalization.

💻 Usage Example

from transformers import AutoProcessor, WhisperForConditionalGeneration
import torch
import soundfile as sf

# Load the processor and model (from the Hub repo or a local directory)
processor = AutoProcessor.from_pretrained("mimba/whisper-ngiemboon")
model = WhisperForConditionalGeneration.from_pretrained("mimba/whisper-ngiemboon")

# Load audio (expects 16 kHz mono; resample beforehand if needed)
speech, rate = sf.read("example_ngiemboon.wav")

# Prepare the input features
inputs = processor(speech, sampling_rate=rate, return_tensors="pt")

# Predict
with torch.no_grad():
    predicted_ids = model.generate(inputs["input_features"])

# Decode the transcription
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)

📬 BibTeX entry and citation info

@misc{mimba2026whisperngiemboon,
      title={Afrilang: Small Out-of-domain Resource for Various African Languages},
      author={Mimba Ngouana Fofou},
      year={2026},
      howpublished={\url{https://huggingface.co/mimba/whisper-ngiemboon}}
}
Contact: for all questions, contact @Mimba.
Model size: 0.8B parameters (F32, Safetensors)