19b9a99d360126bde69d42d263b160bc

This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [en-es] dataset. At the end of training (epoch 16), it achieves the following results on the evaluation set:

  • Loss: 1.1429
  • Data Size: 1.0
  • Epoch Runtime: 1236.7036
  • Bleu: 12.7860

Model description

More information needed

Intended uses & limitations

More information needed
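
Until this section is filled in, the following is a minimal sketch of how the checkpoint might be loaded for translation with the Transformers library. Several things here are assumptions rather than facts from this card: the translation direction (English to Spanish), the absence of a task prefix on the input, and the `max_new_tokens` value. The `translate` helper is a hypothetical name introduced only for illustration.

```python
# Minimal inference sketch. Assumptions (not stated in this card):
# English -> Spanish direction, no task prefix, max_new_tokens=128.
MODEL_ID = "contemmcm/19b9a99d360126bde69d42d263b160bc"


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of strings with the fine-tuned checkpoint.

    `translate` is an illustrative helper, not part of any library.
    """
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize as a padded batch of PyTorch tensors.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# translate(["The old man looked at the sea."])
```

If the model was trained with a prompt prefix or in the opposite direction, the input strings would need to be adjusted accordingly.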

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
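
The total batch sizes listed above follow from the per-device sizes and the device count. A small sketch of that arithmetic is shown here; the `total_batch_size` helper is a hypothetical name, and a gradient accumulation factor of 1 is an assumption (the card lists none, and 8 × 4 = 32 is consistent with it).

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4


def total_batch_size(per_device, devices, grad_accum_steps=1):
    """Effective batch size per optimizer step.

    grad_accum_steps=1 is an assumption; the card does not list it.
    """
    return per_device * devices * grad_accum_steps


total_train = total_batch_size(per_device_train_batch_size, num_devices)
total_eval = total_batch_size(per_device_eval_batch_size, num_devices)
print(total_train, total_eval)  # 32 32
```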

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.6671 | 0 | 91.8338 | 1.6195 |
| No log | 1 | 2336 | 1.8218 | 0.0078 | 103.7495 | 8.9274 |
| 0.035 | 2 | 4672 | 1.7277 | 0.0156 | 115.6876 | 9.5350 |
| 0.0481 | 3 | 7008 | 1.6551 | 0.0312 | 136.6385 | 8.6207 |
| 1.9382 | 4 | 9344 | 1.5801 | 0.0625 | 171.9000 | 8.5264 |
| 1.8127 | 5 | 11680 | 1.5049 | 0.125 | 247.4238 | 8.6256 |
| 1.6469 | 6 | 14016 | 1.4036 | 0.25 | 390.1462 | 9.5051 |
| 1.5114 | 7 | 16352 | 1.3035 | 0.5 | 673.6065 | 10.3408 |
| 1.3464 | 8.0 | 18688 | 1.2034 | 1.0 | 1248.7078 | 11.2180 |
| 1.1972 | 9.0 | 21024 | 1.1447 | 1.0 | 1252.9235 | 11.8164 |
| 1.0886 | 10.0 | 23360 | 1.1153 | 1.0 | 1248.8658 | 12.3940 |
| 1.0169 | 11.0 | 25696 | 1.1019 | 1.0 | 1245.3960 | 12.6838 |
| 0.923 | 12.0 | 28032 | 1.0903 | 1.0 | 1236.4594 | 13.1384 |
| 0.8471 | 13.0 | 30368 | 1.0974 | 1.0 | 1232.4599 | 12.8015 |
| 0.8011 | 14.0 | 32704 | 1.1090 | 1.0 | 1231.2274 | 13.0162 |
| 0.7224 | 15.0 | 35040 | 1.1248 | 1.0 | 1236.4871 | 12.8928 |
| 0.6932 | 16.0 | 37376 | 1.1429 | 1.0 | 1236.7036 | 12.7860 |
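
In the results above, validation loss reaches its minimum at epoch 12 and then rises while training loss keeps falling, a pattern usually read as the onset of overfitting. The sketch below, using only the epoch, validation loss, and Bleu values copied from the table, shows one way to locate the best epoch and compare it with the final one. The variable names are illustrative, and whether the published weights correspond to the final or the best epoch is not stated in this card.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
RESULTS = [
    (0, 2.6671, 1.6195),
    (1, 1.8218, 8.9274),
    (2, 1.7277, 9.5350),
    (3, 1.6551, 8.6207),
    (4, 1.5801, 8.5264),
    (5, 1.5049, 8.6256),
    (6, 1.4036, 9.5051),
    (7, 1.3035, 10.3408),
    (8, 1.2034, 11.2180),
    (9, 1.1447, 11.8164),
    (10, 1.1153, 12.3940),
    (11, 1.1019, 12.6838),
    (12, 1.0903, 13.1384),
    (13, 1.0974, 12.8015),
    (14, 1.1090, 13.0162),
    (15, 1.1248, 12.8928),
    (16, 1.1429, 12.7860),
]

# Lowest validation loss and highest Bleu over all logged epochs.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_bleu = max(RESULTS, key=lambda row: row[2])
final = RESULTS[-1]

print("best epoch by validation loss:", best_by_loss[0])  # 12
print("best epoch by Bleu:", best_by_bleu[0])              # 12
print("final epoch:", final[0])                            # 16
print("loss gap (final - best):", round(final[1] - best_by_loss[1], 4))
print("Bleu gap (best - final):", round(best_by_bleu[2] - final[2], 4))
```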

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1