---
language:
- fr
- mul
language_bcp47:
- fr
- yemba
license: apache-2.0
base_model: facebook/mbart-large-50
tags:
- translation
- mbart
- yemba
- african-languages
- low-resource-languages
pipeline_tag: translation
---
# mBART Yemba Translation Model

This model is a fine-tuned version of mBART (`facebook/mbart-large-50`) for translation between French and Yemba.

## Usage

The snippet below loads the model and tokenizer and defines a small translation helper:

    from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

    # Load the fine-tuned model and its tokenizer
    model = MBartForConditionalGeneration.from_pretrained("Dims002/mbart-yemba-translator")
    tokenizer = MBart50TokenizerFast.from_pretrained("Dims002/mbart-yemba-translator")

    # Translation example
    def translate_text(text, src_lang="fr_XX", tgt_lang="yemba"):
        tokenizer.src_lang = src_lang
        encoded = tokenizer(text, return_tensors="pt")
        # Assumption: if tgt_lang is a token in the fine-tuned vocabulary, force it
        # as the first generated token (the usual mBART-50 convention). Otherwise
        # fall back to the checkpoint's default generation settings.
        tgt_id = tokenizer.convert_tokens_to_ids(tgt_lang)
        forced = {} if tgt_id == tokenizer.unk_token_id else {"forced_bos_token_id": tgt_id}
        generated_tokens = model.generate(**encoded, **forced)
        return tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]

    print(translate_text("Bonjour, comment allez-vous ?"))

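
To translate several sentences at once, a batched variant of the helper above could look like the following sketch. The names `translate_batch` and `chunked`, the default `batch_size`, and the `max_new_tokens` limit are illustrative assumptions and not part of the published model. The function takes the already loaded `model` and `tokenizer` as arguments and forwards any extra keyword arguments to `model.generate`.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def translate_batch(
    texts: Sequence[str],
    model: Any,
    tokenizer: Any,
    src_lang: str = "fr_XX",
    batch_size: int = 8,        # hypothetical default, tune to available memory
    max_new_tokens: int = 128,  # hypothetical cap on output length
    **generate_kwargs: Any,
) -> List[str]:
    """Translate `texts` in batches and return the outputs in input order."""
    tokenizer.src_lang = src_lang
    translations: List[str] = []
    for batch in chunked(texts, batch_size):
        # Pad so sentences of different lengths fit in one tensor.
        encoded = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(
            **encoded, max_new_tokens=max_new_tokens, **generate_kwargs
        )
        translations.extend(
            tokenizer.batch_decode(generated, skip_special_tokens=True)
        )
    return translations
```

With the model and tokenizer loaded as shown earlier, a call such as `translate_batch(["Bonjour", "Merci beaucoup"], model, tokenizer)` returns one translation per input sentence. Passing the model and tokenizer explicitly keeps the helper free of global state, which makes it easier to reuse with a different checkpoint.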
## Model details

- **Base model**: facebook/mbart-large-50
- **Languages**: French ↔ Yemba
- **Checkpoint**: 20000

## Supported languages

- **French** (fr): primary source language
- **Yemba**: target language (a Grassfields Bantu language of Cameroon)