2fddfdad987dc50fd6cfaa79a290b942

This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [en-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2575
  • Data Size: 1.0
  • Epoch Runtime: 538.8969
  • Bleu: 7.3004
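
A minimal sketch of how the checkpoint might be loaded for English-to-Dutch translation with the Transformers library. The repository id is taken from the model page. The helper name `translate`, the `max_new_tokens` default, and the absence of any task prefix on the input are assumptions; the card does not document how inputs were formatted during fine-tuning.

```python
# Repository id as shown on the model page.
MODEL_ID = "contemmcm/2fddfdad987dc50fd6cfaa79a290b942"


def translate(text, max_new_tokens=128):
    """Translate English text to Dutch (hypothetical helper, not from the card).

    Assumes the raw sentence is a valid input; the card does not say
    whether a task prefix was used during fine-tuning.
    """
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the input and generate the translation.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("The old house stood at the end of the lane.")` downloads the weights on first use. With a BLEU of about 7.3 on the evaluation set, the output should be treated as a rough draft.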

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50 (configured; the logged training results end at epoch 17)
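
The total batch sizes listed above follow from the per-device values and the device count. A small sketch of that arithmetic, assuming no gradient accumulation (the card lists none); the names `HPARAMS` and `effective_batch_size` are illustrative only.

```python
# Values copied from the hyperparameter list above.
HPARAMS = {
    "train_batch_size": 8,  # per device
    "eval_batch_size": 8,   # per device
    "num_devices": 4,
}


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples processed per step across all devices.

    grad_accum_steps=1 is an assumption; the card does not list
    a gradient accumulation setting.
    """
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(HPARAMS["train_batch_size"], HPARAMS["num_devices"])
total_eval = effective_batch_size(HPARAMS["eval_batch_size"], HPARAMS["num_devices"])
print(total_train, total_eval)
```

Both values come out to 32, matching total_train_batch_size and total_eval_batch_size.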

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|---------------|-------|-------|-----------------|-----------|---------------|--------|
| No log        | 0     | 0     | 3.2062          | 0         | 39.4263       | 0.4185 |
| No log        | 1     | 966   | 2.5134          | 0.0078    | 43.2044       | 0.9992 |
| No log        | 2     | 1932  | 2.3464          | 0.0156    | 51.3754       | 1.3163 |
| 0.0644        | 3     | 2898  | 2.1822          | 0.0312    | 62.9556       | 1.6390 |
| 2.5109        | 4     | 3864  | 2.0399          | 0.0625    | 78.6052       | 2.0907 |
| 2.3015        | 5     | 4830  | 1.9081          | 0.125     | 108.1160      | 2.6167 |
| 2.0713        | 6     | 5796  | 1.7503          | 0.25      | 168.7752      | 3.3277 |
| 1.8148        | 7     | 6762  | 1.5766          | 0.5       | 294.1465      | 4.1326 |
| 1.558         | 8.0   | 7728  | 1.4059          | 1.0       | 543.4787      | 5.1411 |
| 1.416         | 9.0   | 8694  | 1.3204          | 1.0       | 539.0476      | 5.7053 |
| 1.2935        | 10.0  | 9660  | 1.2631          | 1.0       | 539.2658      | 6.2669 |
| 1.1465        | 11.0  | 10626 | 1.2288          | 1.0       | 541.6563      | 6.5476 |
| 1.0686        | 12.0  | 11592 | 1.2230          | 1.0       | 542.1311      | 6.7940 |
| 0.9716        | 13.0  | 12558 | 1.2131          | 1.0       | 541.3803      | 6.9580 |
| 0.9213        | 14.0  | 13524 | 1.2145          | 1.0       | 541.7585      | 7.1673 |
| 0.8286        | 15.0  | 14490 | 1.2235          | 1.0       | 543.1369      | 7.2664 |
| 0.7748        | 16.0  | 15456 | 1.2399          | 1.0       | 538.1777      | 7.3070 |
| 0.7158        | 17.0  | 16422 | 1.2575          | 1.0       | 538.8969      | 7.3004 |
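
The training results can be scanned programmatically to see where validation loss and BLEU peak. A short sketch using values copied from the results above; the names `RESULTS` and `best_epochs` are illustrative only.

```python
# (epoch, validation_loss, bleu) copied from the training results.
RESULTS = [
    (0, 3.2062, 0.4185), (1, 2.5134, 0.9992), (2, 2.3464, 1.3163),
    (3, 2.1822, 1.6390), (4, 2.0399, 2.0907), (5, 1.9081, 2.6167),
    (6, 1.7503, 3.3277), (7, 1.5766, 4.1326), (8, 1.4059, 5.1411),
    (9, 1.3204, 5.7053), (10, 1.2631, 6.2669), (11, 1.2288, 6.5476),
    (12, 1.2230, 6.7940), (13, 1.2131, 6.9580), (14, 1.2145, 7.1673),
    (15, 1.2235, 7.2664), (16, 1.2399, 7.3070), (17, 1.2575, 7.3004),
]


def best_epochs(rows):
    """Return the row with the lowest validation loss and the row with the highest BLEU."""
    lowest_loss = min(rows, key=lambda r: r[1])
    highest_bleu = max(rows, key=lambda r: r[2])
    return lowest_loss, highest_bleu


lowest_loss, highest_bleu = best_epochs(RESULTS)
print(f"lowest validation loss: {lowest_loss[1]:.4f} at epoch {lowest_loss[0]}")
print(f"highest BLEU: {highest_bleu[2]:.4f} at epoch {highest_bleu[0]}")
```

Validation loss is lowest at epoch 13 (1.2131) and BLEU is highest at epoch 16 (7.3070), while the reported evaluation results correspond to epoch 17. Training loss keeps falling after epoch 13 as validation loss rises, a pattern consistent with overfitting.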

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model tree for contemmcm/2fddfdad987dc50fd6cfaa79a290b942: fine-tuned from google/long-t5-tglobal-xl.