de1dffef4207f6fec938d98242b37c7a

This model is a fine-tuned version of facebook/mbart-large-50 on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6406
  • Data Size: 1.0
  • Epoch Runtime: 794.3154
  • Bleu: 15.1494
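
A minimal inference sketch for an en→fr mBART-50 checkpoint like this one. The repo id below and the `en_XX`/`fr_XX` language codes are assumptions based on the base model (`facebook/mbart-large-50`), not confirmed by the card:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id for this fine-tuned checkpoint.
model_id = "contemmcm/de1dffef4207f6fec938d98242b37c7a"

# mBART-50 tokenizers take explicit source/target language codes.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="en_XX", tgt_lang="fr_XX")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
generated = model.generate(
    **inputs,
    # Force the decoder to start generating in French.
    forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"],
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```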

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
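
The total batch sizes listed above follow from the per-device batch size and the device count; a quick sanity check (gradient accumulation is not listed, so it is assumed to be 1):

```python
# Effective batch size implied by the hyperparameters above.
per_device_batch_size = 8   # train_batch_size (and eval_batch_size)
num_devices = 4             # distributed_type: multi-GPU
gradient_accumulation = 1   # not listed in the card; assumed 1

total_train_batch_size = per_device_batch_size * num_devices * gradient_accumulation
print(total_train_batch_size)  # → 32, matching total_train_batch_size above
```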

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 6.0084          | 0         | 65.8213       | 1.2509  |
| No log        | 1     | 3177  | 3.2378          | 0.0078    | 72.3778       | 9.6568  |
| 0.0542        | 2     | 6354  | 2.2685          | 0.0156    | 77.8192       | 11.8766 |
| 9.9744        | 3     | 9531  | 7.5505          | 0.0312    | 89.6887       | 0.0016  |
| 2.1183        | 4     | 12708 | 1.8953          | 0.0625    | 113.7882      | 19.2824 |
| 1.7258        | 5     | 15885 | 1.6635          | 0.125     | 159.3793      | 21.9236 |
| 1.531         | 6     | 19062 | 1.4965          | 0.25      | 247.8980      | 17.6415 |
| 1.3825        | 7     | 22239 | 1.4083          | 0.5       | 428.8253      | 14.0973 |
| 1.2144        | 8     | 25416 | 1.3038          | 1.0       | 790.6624      | 15.7468 |
| 1.0109        | 9     | 28593 | 1.3003          | 1.0       | 794.5531      | 16.7103 |
| 0.8794        | 10    | 31770 | 1.3401          | 1.0       | 795.7410      | 17.3701 |
| 0.7221        | 11    | 34947 | 1.4224          | 1.0       | 791.2622      | 14.7897 |
| 0.5718        | 12    | 38124 | 1.5171          | 1.0       | 797.5035      | 15.8861 |
| 0.4526        | 13    | 41301 | 1.6406          | 1.0       | 794.3154      | 15.1494 |
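
Validation loss bottoms out at epoch 9 and rises afterwards, so checkpoint selection matters for this run. A quick scan of the rows above (values copied verbatim from the table):

```python
# (epoch, validation_loss, bleu) rows from the training results table.
results = [
    (0, 6.0084, 1.2509), (1, 3.2378, 9.6568), (2, 2.2685, 11.8766),
    (3, 7.5505, 0.0016), (4, 1.8953, 19.2824), (5, 1.6635, 21.9236),
    (6, 1.4965, 17.6415), (7, 1.4083, 14.0973), (8, 1.3038, 15.7468),
    (9, 1.3003, 16.7103), (10, 1.3401, 17.3701), (11, 1.4224, 14.7897),
    (12, 1.5171, 15.8861), (13, 1.6406, 15.1494),
]

# Pick the checkpoint with the lowest validation loss.
best_epoch, best_loss, best_bleu = min(results, key=lambda r: r[1])
print(best_epoch, best_loss, best_bleu)  # → 9 1.3003 16.7103
```

Note that the headline numbers at the top of this card (Loss 1.6406, Bleu 15.1494) describe the final epoch, not the best one.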

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1