train_rte_101112_1760638015

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the rte dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2353 (the lowest validation loss in the training results table below, reached at epoch 11)
  • Num Input Tokens Seen: 6980984
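
The framework versions below list PEFT, so this checkpoint is presumably a PEFT adapter rather than a full model. A minimal loading and inference sketch, assuming the adapter is published under the Hub id rbelanec/train_rte_101112_1760638015 (the id of this repository) and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model. The RTE prompt shown is illustrative only; the prompt format used in training is not documented here:

```python
# Sketch: load the PEFT adapter on top of the base model.
# Assumes the adapter repo also ships the tokenizer; if it does not,
# load the tokenizer from meta-llama/Meta-Llama-3-8B-Instruct instead.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_rte_101112_1760638015"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

# RTE is a two-sentence textual-entailment task; this template is
# only a guess at a reasonable prompt, not the training format.
prompt = (
    "Premise: The cat sat on the mat.\n"
    "Hypothesis: There is a cat on the mat.\n"
    "Does the premise entail the hypothesis? Answer yes or no."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```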

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing them as TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
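
For reference, the settings above can be expressed as Hugging Face TrainingArguments. This is a reconstruction, not the original training script; output_dir is a placeholder, and options not listed above (gradient accumulation, precision, save/eval strategy) are unknown:

```python
# Sketch: the hyperparameters above mapped onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_rte_101112_1760638015",  # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```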

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.3623        | 1.0   | 561   | 0.2576          | 350480            |
| 0.2824        | 2.0   | 1122  | 0.2509          | 700992            |
| 0.2631        | 3.0   | 1683  | 0.2457          | 1050848           |
| 0.2906        | 4.0   | 2244  | 0.2417          | 1400856           |
| 0.357         | 5.0   | 2805  | 0.2410          | 1749544           |
| 0.0055        | 6.0   | 3366  | 0.2391          | 2099368           |
| 0.3289        | 7.0   | 3927  | 0.2367          | 2447504           |
| 0.2394        | 8.0   | 4488  | 0.2359          | 2794592           |
| 0.1399        | 9.0   | 5049  | 0.2363          | 3145760           |
| 0.1701        | 10.0  | 5610  | 0.2368          | 3495600           |
| 0.1945        | 11.0  | 6171  | 0.2353          | 3844488           |
| 0.1955        | 12.0  | 6732  | 0.2387          | 4191800           |
| 0.2157        | 13.0  | 7293  | 0.2358          | 4538416           |
| 0.1572        | 14.0  | 7854  | 0.2385          | 4888904           |
| 0.1489        | 15.0  | 8415  | 0.2358          | 5236560           |
| 0.2896        | 16.0  | 8976  | 0.2365          | 5587768           |
| 0.2738        | 17.0  | 9537  | 0.2368          | 5935088           |
| 0.322         | 18.0  | 10098 | 0.2382          | 6283144           |
| 0.2788        | 19.0  | 10659 | 0.2382          | 6632504           |
| 0.6244        | 20.0  | 11220 | 0.2382          | 6980984           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
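
A small sketch to check that a local environment matches these versions before attempting to reproduce the run (the +cu128 suffix indicates a CUDA 12.8 build of PyTorch, which depends on your platform):

```python
# Sketch: compare installed package versions against the ones listed above.
expected = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0",       # listed as 2.9.0+cu128; suffix is platform-specific
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}
for module, want in expected.items():
    have = __import__(module).__version__
    status = "OK" if have.startswith(want) else f"mismatch (have {have})"
    print(f"{module}: expected {want} -> {status}")
```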