opus-mt-sw-en-finetuned
This model translates Swahili (sw) to English (en). It was fine-tuned from Helsinki-NLP/opus-mt-mul-en, with training tracked in MLflow, and is deployed on Hugging Face.
Model Details
- Model Name: opus-mt-sw-en-finetuned
- Version: 3
- Task: Translation
- Languages: sw, en
- Framework: pytorch
- License: apache-2.0
Intended Uses & Limitations
Intended Uses
- Translation tasks
- Research and development
- Child helpline services support
Limitations
- Performance may vary on out-of-distribution data
- Should be evaluated on your specific use case before production deployment (see the evaluation sketch below)
- Designed for child helpline contexts; it may need adaptation for other domains
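A minimal sketch of one way to run that evaluation with the `sacrebleu` package is shown below; the sentence pairs are placeholders and should be replaced with data from your own domain. Note that sacrebleu reports BLEU and chrF on a 0-100 scale.

```python
# Minimal sketch: score the model on your own Swahili-English pairs before deployment.
# Requires: pip install transformers torch sacrebleu
import sacrebleu
from transformers import pipeline

translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

# Placeholder pairs; replace with sentences from your own use case.
sources = ["Habari yako?", "Ninahitaji msaada."]
references = [["How are you?", "I need help."]]  # one reference stream

hypotheses = [out["translation_text"] for out in translator(sources)]

# Corpus-level BLEU and chrF (sacrebleu reports both on a 0-100 scale).
print("BLEU:", sacrebleu.corpus_bleu(hypotheses, references).score)
print("chrF:", sacrebleu.corpus_chrf(hypotheses, references).score)
```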
Training Data
- Dataset: sw_en_dataset_v1.jsonl (a loading sketch is shown below)
- Size: Not specified in the dataset config; 4,250 training and 1,063 validation samples are logged under Performance Metrics
- Languages: sw, en
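The JSONL record schema is not documented in this card, so the snippet below is only a sketch of loading a JSON Lines parallel corpus with the `datasets` library; the `sw` and `en` field names and the split seed are assumptions, not the published format.

```python
# Sketch only: the "sw"/"en" field names and the seed are assumptions, not the published schema.
from datasets import load_dataset

dataset = load_dataset("json", data_files="sw_en_dataset_v1.jsonl", split="train")

# Reproduce the 80/20 split from the Dataset Config (Validation Split = 0.2).
splits = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, val_ds = splits["train"], splits["test"]

print(len(train_ds), len(val_ds))
print(train_ds[0])  # e.g. {"sw": "...", "en": "..."} if those fields exist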
Training Configuration
| Parameter | Value |
|---|---|
| Comments Dataset Notes | Using synthetic helpline conversation data for initial training. Segmentation enabled for long conversations without truncation. |
| Comments Evaluation Notes | Inference validation runs during training to catch edge cases. Baseline evaluation measures improvement from pre-trained model. |
| Comments Next Steps | After initial training, expand dataset with filtered OPUS data and back-translations for improved generalization. |
| Comments Training Notes | Configured to prevent hallucinations with no_repeat_ngram_size=3 and repetition_penalty=1.2. Early stopping monitors BLEU score. |
| Dataset Config Auto Segment Long Sequences | True |
| Dataset Config Max Length Ratio | 3.5 |
| Dataset Config Max Samples | None |
| Dataset Config Min Length | 3 |
| Dataset Config Primary Dataset | custom |
| Dataset Config Segment Max Tokens | 450 |
| Dataset Config Segment Overlap Tokens | 50 |
| Dataset Config Validation Split | 0.2 |
| Evaluation Config Inference Test Frequency | every_eval |
| Evaluation Config Run Inference Validation | True |
| Evaluation Config Test Size | 500 |
| Language Name | Swahili |
| Language Pair | sw-en |
| Max Length | 512 |
| Model Name | Helsinki-NLP/opus-mt-mul-en |
| Total Parameters | 77518848 |
| Trainable Parameters | 76994560 |
| Training Config Batch Size | 16 |
| Training Config Dataloader Num Workers | 4 |
| Training Config Early Stopping Patience | 3 |
| Training Config Early Stopping Threshold | 0.001 |
| Training Config Eval Steps | 500 |
| Training Config Eval Strategy | steps |
| Training Config Generation Length Penalty | 0.6 |
| Training Config Generation Max Length | 512 |
| Training Config Generation No Repeat Ngram Size | 3 |
| Training Config Generation Num Beams | 4 |
| Training Config Generation Repetition Penalty | 1.2 |
| Training Config Gradient Accumulation Steps | 2 |
| Training Config Learning Rate | 2e-05 |
| Training Config Logging Steps | 50 |
| Training Config Lr Scheduler | cosine_with_restarts |
| Training Config Max Length | 512 |
| Training Config Mixed Precision | fp16 |
| Training Config Num Epochs | 10 |
| Training Config Pin Memory | True |
| Training Config Save Strategy | epoch |
| Training Config Warmup Steps | 500 |
| Training Config Weight Decay | 0.01 |
| Vocab Size | 64172 |
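The decoding settings logged above (beam search with 4 beams, length penalty 0.6, no-repeat n-gram size 3, repetition penalty 1.2, max length 512) were used during evaluation. The sketch below shows one way to pass the same settings at inference time; the input sentence is a placeholder.

```python
# Sketch: reuse the decoding settings from the training configuration at inference time.
from transformers import pipeline

translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

result = translator(
    "Ninahitaji msaada haraka.",  # placeholder Swahili input
    max_length=512,
    num_beams=4,
    length_penalty=0.6,
    no_repeat_ngram_size=3,
    repetition_penalty=1.2,
)
print(result[0]["translation_text"])
```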
Performance Metrics
Evaluation Results
| Metric | Value |
|---|---|
| Baseline Bleu | 0.0159 |
| Baseline Chrf | 16.5958 |
| Baseline Hallucination Rate | 0.0000 |
| Baseline Keyword Preservation | 0.0950 |
| Baseline Urgency Preservation | 0.3492 |
| Bleu Improvement | 0.6975 |
| Bleu Improvement Percent | 4373.3956 |
| Chrf Improvement | 64.2811 |
| Chrf Improvement Percent | 387.3345 |
| Epoch | 10.0000 |
| Eval Bertscore F1 | 0.9757 |
| Eval Bleu | 0.7134 |
| Eval Chrf | 80.8769 |
| Eval Loss | 0.2769 |
| Eval Meteor | 0.8597 |
| Eval Runtime | 212.9135 |
| Eval Samples Per Second | 4.9930 |
| Eval Steps Per Second | 0.3150 |
| Final Epoch | 10.0000 |
| Final Eval Bertscore F1 | 0.9757 |
| Final Eval Bleu | 0.7134 |
| Final Eval Chrf | 80.8769 |
| Final Eval Loss | 0.2769 |
| Final Eval Meteor | 0.8597 |
| Final Eval Runtime | 212.9135 |
| Final Eval Samples Per Second | 4.9930 |
| Final Eval Steps Per Second | 0.3150 |
| Grad Norm | 0.9883 |
| Inference Test/Abuse Reporting Success | 1.0000 |
| Inference Test/Code Switching Success | 1.0000 |
| Inference Test/Emergency Success | 1.0000 |
| Inference Test/Emotional Distress Success | 1.0000 |
| Inference Test/Empty Input Success | 1.0000 |
| Inference Test/Fragmented Trauma Success | 1.0000 |
| Inference Test/Help Request Success | 1.0000 |
| Inference Test/Incomplete Success | 1.0000 |
| Inference Test/Location Info Success | 1.0000 |
| Inference Test/Numbers Preservation Success | 0.0000 |
| Inference Test/Simple Greeting Success | 1.0000 |
| Inference Test/Whitespace Only Success | 1.0000 |
| Inference Validation Pass Rate | 0.9167 |
| Learning Rate | 0.0000 |
| Loss | 0.3021 |
| Total Flos | 4940618361864192.0000 |
| Total Samples | 5313.0000 |
| Train Loss | 0.9104 |
| Train Runtime | 948.3640 |
| Train Samples | 4250.0000 |
| Train Samples Per Second | 44.8140 |
| Train Steps Per Second | 1.4020 |
| Validation Samples | 1063.0000 |
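The only inference check that did not pass is Numbers Preservation. Below is an illustrative post-processing guard, not part of this model's pipeline, that flags translations in which digit sequences from the source go missing; the helper name, regex, and sample input are assumptions for the sketch.

```python
# Illustrative guard (not part of this model's pipeline): flag translations that
# drop or alter digit sequences, since the Numbers Preservation check did not pass.
import re
from transformers import pipeline

translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

def digits_preserved(source: str, translation: str) -> bool:
    """Return True if every digit sequence in the source also appears in the translation."""
    return all(num in translation for num in re.findall(r"\d+", source))

text = "Nina miaka 12 na ninaishi nyumba namba 45."  # placeholder input containing numbers
translation = translator(text)[0]["translation_text"]

if not digits_preserved(text, translation):
    print("Warning: numbers may have been dropped or altered:", translation)
```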
Usage
Installation
```bash
pip install transformers torch
```
Translation Example
```python
from transformers import pipeline

# Load the fine-tuned Swahili-to-English translation pipeline.
translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

result = translator("Your text here")
print(result[0]["translation_text"])
```
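For batch translation, or when you want explicit control over tokenization, the model can also be loaded directly. The sketch below uses the generic `AutoTokenizer`/`AutoModelForSeq2SeqLM` pattern (the Marian-based tokenizer typically also requires the `sentencepiece` package); the sample sentences are placeholders.

```python
# Sketch: batch translation with the tokenizer and model loaded directly.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "openchs/sw-en-opus-mt-mul-en-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

sentences = ["Habari yako?", "Ninahitaji msaada."]  # placeholder inputs
batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True, max_length=512)

with torch.no_grad():
    generated = model.generate(**batch, num_beams=4, max_length=512)

print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```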
MLflow Tracking
- Experiment: translation-sw-en
- Run ID: adb407b0ac25465ba5b56960a6b59d10 (see the retrieval sketch below)
- Training Date: 2025-10-08 10:21:45
- Tracking URI: http://192.168.10.6:5000
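The tracking server is on a private network, so the run details are only reachable from within that environment. As a sketch, the standard `MlflowClient` API can pull the run's logged metrics and parameters; the metric and parameter key names below are guesses based on this card's tables, not confirmed keys.

```python
# Sketch: fetch the logged run from the (private) MLflow tracking server.
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://192.168.10.6:5000")
client = MlflowClient()

run = client.get_run("adb407b0ac25465ba5b56960a6b59d10")

# Key names are assumptions; inspect run.data.metrics / run.data.params for the actual keys.
print(run.data.metrics.get("eval_bleu"))
print(run.data.params.get("training_config.learning_rate"))
```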
Training Metrics Visualization
View detailed training metrics and TensorBoard logs in the Training metrics tab.
Citation
```bibtex
@misc{opus_mt_sw_en_finetuned,
  title={opus-mt-sw-en-finetuned},
  author={OpenCHS Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/openchs/sw-en-opus-mt-mul-en-v1}
}
```
Contact
Model card auto-generated from MLflow