opus-mt-sw-en-finetuned
This model translates Swahili (sw) to English (en). It was fine-tuned from Helsinki-NLP/opus-mt-mul-en, with training tracked in MLflow, and is deployed on Hugging Face.
Model Details
- Model Name: opus-mt-sw-en-finetuned
- Version: 3
- Task: Translation
- Languages: sw, en
- Framework: pytorch
- License: apache-2.0
Intended Uses & Limitations
Intended Uses
- Translation tasks
- Research and development
- Child helpline services support
Limitations
- Performance may vary on out-of-distribution data
- Should be evaluated on your specific use case before production deployment (see the evaluation sketch below)
- Designed for child helpline contexts; it may need adaptation for other domains
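A minimal sketch of one way to run that evaluation with the `sacrebleu` package is shown below; the sentence pairs are placeholders and should be replaced with data from your own domain. Note that sacrebleu reports BLEU and chrF on a 0-100 scale.

```python
# Minimal sketch: score the model on your own Swahili-English pairs before deployment.
# Requires: pip install transformers torch sacrebleu
import sacrebleu
from transformers import pipeline

translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

# Placeholder pairs; replace with sentences from your own use case.
sources = ["Habari yako?", "Ninahitaji msaada."]
references = [["How are you?", "I need help."]]  # one reference stream

hypotheses = [out["translation_text"] for out in translator(sources)]

# Corpus-level BLEU and chrF (sacrebleu reports both on a 0-100 scale).
print("BLEU:", sacrebleu.corpus_bleu(hypotheses, references).score)
print("chrF:", sacrebleu.corpus_chrf(hypotheses, references).score)
```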
Training Data
- Dataset: sw_en_dataset_v1.jsonl (a loading sketch is shown below)
- Size: Not specified in the dataset config; 4,250 training and 1,063 validation samples are logged under Performance Metrics
- Languages: sw, en
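The JSONL record schema is not documented in this card, so the snippet below is only a sketch of loading a JSON Lines parallel corpus with the `datasets` library; the `sw` and `en` field names and the split seed are assumptions, not the published format.

```python
# Sketch only: the "sw"/"en" field names and the seed are assumptions, not the published schema.
from datasets import load_dataset

dataset = load_dataset("json", data_files="sw_en_dataset_v1.jsonl", split="train")

# Reproduce the 80/20 split from the Dataset Config (Validation Split = 0.2).
splits = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, val_ds = splits["train"], splits["test"]

print(len(train_ds), len(val_ds))
print(train_ds[0])  # e.g. {"sw": "...", "en": "..."} if those fields exist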
Training Configuration
| Parameter | Value |
|---|---|
| Comments Dataset Notes | Using synthetic helpline conversation data for initial training. Segmentation enabled for long conversations without truncation. |
| Comments Evaluation Notes | Inference validation runs during training to catch edge cases. Baseline evaluation measures improvement from pre-trained model. |
| Comments Next Steps | After initial training, expand dataset with filtered OPUS data and back-translations for improved generalization. |
| Comments Training Notes | Configured to prevent hallucinations with no_repeat_ngram_size=3 and repetition_penalty=1.2. Early stopping monitors BLEU score. |
| Dataset Config Auto Segment Long Sequences | True |
| Dataset Config Max Length Ratio | 3.5 |
| Dataset Config Max Samples | None |
| Dataset Config Min Length | 3 |
| Dataset Config Primary Dataset | custom |
| Dataset Config Segment Max Tokens | 450 |
| Dataset Config Segment Overlap Tokens | 50 |
| Dataset Config Validation Split | 0.2 |
| Evaluation Config Inference Test Frequency | every_eval |
| Evaluation Config Run Inference Validation | True |
| Evaluation Config Test Size | 500 |
| Language Name | Swahili |
| Language Pair | sw-en |
| Max Length | 512 |
| Model Name | Helsinki-NLP/opus-mt-mul-en |
| Total Parameters | 77518848 |
| Trainable Parameters | 76994560 |
| Training Config Batch Size | 16 |
| Training Config Dataloader Num Workers | 4 |
| Training Config Early Stopping Patience | 3 |
| Training Config Early Stopping Threshold | 0.001 |
| Training Config Eval Steps | 500 |
| Training Config Eval Strategy | steps |
| Training Config Generation Length Penalty | 0.6 |
| Training Config Generation Max Length | 512 |
| Training Config Generation No Repeat Ngram Size | 3 |
| Training Config Generation Num Beams | 4 |
| Training Config Generation Repetition Penalty | 1.2 |
| Training Config Gradient Accumulation Steps | 2 |
| Training Config Learning Rate | 2e-05 |
| Training Config Logging Steps | 50 |
| Training Config Lr Scheduler | cosine_with_restarts |
| Training Config Max Length | 512 |
| Training Config Mixed Precision | fp16 |
| Training Config Num Epochs | 10 |
| Training Config Pin Memory | True |
| Training Config Save Strategy | epoch |
| Training Config Warmup Steps | 500 |
| Training Config Weight Decay | 0.01 |
| Vocab Size | 64172 |
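The decoding settings logged above (beam search with 4 beams, length penalty 0.6, no-repeat n-gram size 3, repetition penalty 1.2, max length 512) were used during evaluation. The sketch below shows one way to pass the same settings at inference time; the input sentence is a placeholder.

```python
# Sketch: reuse the decoding settings from the training configuration at inference time.
from transformers import pipeline

translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

result = translator(
    "Ninahitaji msaada haraka.",  # placeholder Swahili input
    max_length=512,
    num_beams=4,
    length_penalty=0.6,
    no_repeat_ngram_size=3,
    repetition_penalty=1.2,
)
print(result[0]["translation_text"])
```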
Performance Metrics
Evaluation Results
| Metric | Value |
|---|---|
| Baseline Bleu | 0.0159 |
| Baseline Chrf | 16.5958 |
| Baseline Hallucination Rate | 0.0000 |
| Baseline Keyword Preservation | 0.0950 |
| Baseline Urgency Preservation | 0.3492 |
| Bleu Improvement | 0.6975 |
| Bleu Improvement Percent | 4373.3956 |
| Chrf Improvement | 64.2811 |
| Chrf Improvement Percent | 387.3345 |
| Epoch | 10.0000 |
| Eval Bertscore F1 | 0.9757 |
| Eval Bleu | 0.7134 |
| Eval Chrf | 80.8769 |
| Eval Loss | 0.2769 |
| Eval Meteor | 0.8597 |
| Eval Runtime | 212.9135 |
| Eval Samples Per Second | 4.9930 |
| Eval Steps Per Second | 0.3150 |
| Final Epoch | 10.0000 |
| Final Eval Bertscore F1 | 0.9757 |
| Final Eval Bleu | 0.7134 |
| Final Eval Chrf | 80.8769 |
| Final Eval Loss | 0.2769 |
| Final Eval Meteor | 0.8597 |
| Final Eval Runtime | 212.9135 |
| Final Eval Samples Per Second | 4.9930 |
| Final Eval Steps Per Second | 0.3150 |
| Grad Norm | 0.9883 |
| Inference Test/Abuse Reporting Success | 1.0000 |
| Inference Test/Code Switching Success | 1.0000 |
| Inference Test/Emergency Success | 1.0000 |
| Inference Test/Emotional Distress Success | 1.0000 |
| Inference Test/Empty Input Success | 1.0000 |
| Inference Test/Fragmented Trauma Success | 1.0000 |
| Inference Test/Help Request Success | 1.0000 |
| Inference Test/Incomplete Success | 1.0000 |
| Inference Test/Location Info Success | 1.0000 |
| Inference Test/Numbers Preservation Success | 0.0000 |
| Inference Test/Simple Greeting Success | 1.0000 |
| Inference Test/Whitespace Only Success | 1.0000 |
| Inference Validation Pass Rate | 0.9167 |
| Learning Rate | 0.0000 |
| Loss | 0.3021 |
| Total Flos | 4940618361864192.0000 |
| Total Samples | 5313.0000 |
| Train Loss | 0.9104 |
| Train Runtime | 948.3640 |
| Train Samples | 4250.0000 |
| Train Samples Per Second | 44.8140 |
| Train Steps Per Second | 1.4020 |
| Validation Samples | 1063.0000 |
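The only inference check that did not pass is Numbers Preservation. Below is an illustrative post-processing guard, not part of this model's pipeline, that flags translations in which digit sequences from the source go missing; the helper name, regex, and sample input are assumptions for the sketch.

```python
# Illustrative guard (not part of this model's pipeline): flag translations that
# drop or alter digit sequences, since the Numbers Preservation check did not pass.
import re
from transformers import pipeline

translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

def digits_preserved(source: str, translation: str) -> bool:
    """Return True if every digit sequence in the source also appears in the translation."""
    return all(num in translation for num in re.findall(r"\d+", source))

text = "Nina miaka 12 na ninaishi nyumba namba 45."  # placeholder input containing numbers
translation = translator(text)[0]["translation_text"]

if not digits_preserved(text, translation):
    print("Warning: numbers may have been dropped or altered:", translation)
```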
Usage
Installation
```bash
pip install transformers torch
```
Translation Example
```python
from transformers import pipeline

# Load the fine-tuned Swahili-to-English translation pipeline.
translator = pipeline("translation", model="openchs/sw-en-opus-mt-mul-en-v1")

result = translator("Your text here")
print(result[0]["translation_text"])
```
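For batch translation, or when you want explicit control over tokenization, the model can also be loaded directly. The sketch below uses the generic `AutoTokenizer`/`AutoModelForSeq2SeqLM` pattern (the Marian-based tokenizer typically also requires the `sentencepiece` package); the sample sentences are placeholders.

```python
# Sketch: batch translation with the tokenizer and model loaded directly.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "openchs/sw-en-opus-mt-mul-en-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

sentences = ["Habari yako?", "Ninahitaji msaada."]  # placeholder inputs
batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True, max_length=512)

with torch.no_grad():
    generated = model.generate(**batch, num_beams=4, max_length=512)

print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```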
MLflow Tracking
- Experiment: translation-sw-en
- Run ID: adb407b0ac25465ba5b56960a6b59d10 (see the retrieval sketch below)
- Training Date: 2025-10-08 10:21:45
- Tracking URI: http://192.168.10.6:5000
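The tracking server is on a private network, so the run details are only reachable from within that environment. As a sketch, the standard `MlflowClient` API can pull the run's logged metrics and parameters; the metric and parameter key names below are guesses based on this card's tables, not confirmed keys.

```python
# Sketch: fetch the logged run from the (private) MLflow tracking server.
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://192.168.10.6:5000")
client = MlflowClient()

run = client.get_run("adb407b0ac25465ba5b56960a6b59d10")

# Key names are assumptions; inspect run.data.metrics / run.data.params for the actual keys.
print(run.data.metrics.get("eval_bleu"))
print(run.data.params.get("training_config.learning_rate"))
```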
Training Metrics Visualization
View detailed training metrics and TensorBoard logs in the Training metrics tab.
Citation
```bibtex
@misc{opus_mt_sw_en_finetuned,
  title={opus-mt-sw-en-finetuned},
  author={OpenCHS Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/openchs/sw-en-opus-mt-mul-en-v1}
}
```
Contact
Model card auto-generated from MLflow