# m2m100_418M-pruned-fra-bre-32768
A pruned version of facebook/m2m100_418M whose vocabulary is restricted to the French→Breton direction:
- fra_src: 16384 tokens
- bre_tgt: 16384 tokens
- Total vocabulary: 32768 tokens
## Usage
```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("bourdoiscatie/m2m100_418M-pruned-fra-bre-32768")
tokenizer = M2M100Tokenizer.from_pretrained("bourdoiscatie/m2m100_418M-pruned-fra-bre-32768")

# Example: French -> Breton (the only pair this pruned vocabulary supports)
text = "Bonjour, comment allez-vous ?"
tokenizer.src_lang = "fr"
encoded = tokenizer(text, return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("br"))
translation = tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
print(translation)
```
## Model Details
- Base: facebook/m2m100_418M
- Size: 386.3M params (79.8% of original)
- Pruning: Vocabulary trimming
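Vocabulary trimming shrinks a model by keeping only the embedding rows for the token ids actually needed for the retained language pair. A minimal sketch of the idea, using a toy embedding matrix rather than the real M2M100 weights (the names `trim_embeddings` and `kept_ids` are illustrative, not part of this model card):

```python
import numpy as np

def trim_embeddings(embeddings: np.ndarray, kept_ids: list[int]) -> np.ndarray:
    """Keep only the embedding rows for the retained token ids.

    In a real pipeline the same id mapping must also be applied to the
    tokenizer and to the tied output-projection weights.
    """
    return embeddings[kept_ids]

# Toy setup: a vocabulary of 10 tokens with embedding dimension 4.
rng = np.random.default_rng(0)
emb = rng.standard_normal((10, 4))

# Suppose corpus statistics show only these 6 tokens occur in fra/bre text.
kept_ids = [0, 1, 2, 5, 7, 9]
trimmed = trim_embeddings(emb, kept_ids)

print(trimmed.shape)  # (6, 4)
```

The parameter savings come entirely from the embedding (and tied output) matrices, which is why the pruned model keeps the full transformer body and only drops to 79.8% of the original size.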