Swahili-English Translation Model for Child Helpline Services

Model Description

This model is a fine-tuned version of Helsinki-NLP/opus-mt-mul-en for Swahili-to-English translation, specifically optimized for child helpline call transcriptions in East Africa.

Developed by: BITZ IT Consulting Ltd
Project: OpenCHS (Open Child Helpline System)
Funded by: UNICEF Venture Fund
License: Apache 2.0

Performance

Test Set (General Translation)

  • BLEU: 0.2272
  • chrF: 42.25
  • Improvement over baseline: +0.0%

Domain Evaluation (Call Transcriptions)

  • Domain BLEU: 0.0000
  • Domain chrF: 2.90
  • Domain COMET-QE: 0.0000
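
For readers unfamiliar with chrF, the following is a simplified sketch of how a character n-gram F-score of this kind is computed. It is not the evaluation script used for this model, which the card does not describe. The function name simple_chrf and its defaults (n-gram orders 1 to 6, beta = 2, whitespace ignored) are assumptions chosen to mirror common chrF settings, and standard tooling may differ in detail.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of order n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF on a 0-100 scale (illustrative only)."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One side is too short to contain n-grams of this order.
            continue
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta = 2 weights recall more heavily than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    print(simple_chrf("Good morning. My name is Amina.",
                      "Good morning. I am called Amina."))
```

Because the score rewards partial character overlap, it is usually less harsh than BLEU on short or morphologically rich sentences.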

Intended Use

Primary Use Case: Translating Swahili helpline call transcriptions to English for case documentation, quality assurance, and cross-border referrals.

Languages: Swahili (source) → English (target)

Usage

from transformers import MarianTokenizer, MarianMTModel

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = "brendaogutu/sw-en-translation-v1"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Tokenize the Swahili input and return PyTorch tensors
swahili_text = "Habari za asubuhi. Ninaitwa Amina na nina miaka 14."
inputs = tokenizer(swahili_text, return_tensors="pt", padding=True)

# Translate with beam search, then decode the best hypothesis to text
outputs = model.generate(**inputs, num_beams=5, max_length=256)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
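
Call transcriptions usually contain many utterances rather than one sentence. The sketch below shows one possible way to translate a list of turns in batches with the same generation settings as the example above. The helper names batched and translate_turns, the batch size of 8, and the use of truncation are illustrative assumptions, not part of the published model card.

```python
from typing import Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate_turns(turns: List[str],
                    model_name: str = "brendaogutu/sw-en-translation-v1",
                    batch_size: int = 8) -> List[str]:
    """Translate Swahili utterances to English, one batch at a time."""
    # Imported lazily so the batching helper can be used without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    translations: List[str] = []
    for batch in batched(turns, batch_size):
        # Pad to the longest turn in the batch; truncate very long turns.
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=256)
        outputs = model.generate(**inputs, num_beams=5, max_length=256)
        translations.extend(
            tokenizer.batch_decode(outputs, skip_special_tokens=True)
        )
    return translations


if __name__ == "__main__":
    turns = [
        "Habari za asubuhi.",
        "Ninaitwa Amina na nina miaka 14.",
    ]
    for source, target in zip(turns, translate_turns(turns)):
        print(f"{source} -> {target}")
```

Splitting a transcript into short turns before translation keeps each input well under the 256-token limit used above.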

Training Details

Base Model: Helsinki-NLP/opus-mt-mul-en
Training Epochs: 8
Batch Size: 128
Learning Rate: 3e-05
Hardware: NVIDIA GPU with FP16 mixed precision
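
As a rough illustration, the hyperparameters above could be expressed with the Hugging Face Seq2SeqTrainingArguments class as follows. This is a sketch under stated assumptions, not the project's training script: the card does not say whether the batch size of 128 is per device or total, and the output directory name and the predict_with_generate setting are hypothetical.

```python
# Hyperparameters taken from the Training Details list above.
TRAINING_CONFIG = {
    "num_train_epochs": 8,
    "per_device_train_batch_size": 128,  # assumption: 128 is per device
    "learning_rate": 3e-05,
    "fp16": True,  # FP16 mixed precision; requires a supported GPU
}


def build_training_args(output_dir: str = "sw-en-finetune"):
    """Build Seq2SeqTrainingArguments from TRAINING_CONFIG.

    output_dir is a hypothetical path chosen for illustration.
    """
    # Imported lazily so the config can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: generate during evaluation
        **TRAINING_CONFIG,
    )
```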


This model is part of the OpenCHS project supporting child helpline services across East Africa.

Model Size: 77.1M params (Safetensors)
Tensor Type: F32