train_openbookqa_101112_1760638025

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5690
  • Num Input Tokens Seen: 8474968
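This repository contains a PEFT adapter rather than full model weights, so it must be loaded on top of the base model. A minimal inference sketch, assuming access to the gated Meta-Llama-3-8B-Instruct weights; the dtype/device choices and the example prompt are illustrative, since the prompt format used during training is not documented in this card:

```python
# Hedged sketch: load the adapter on top of the base model for inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_openbookqa_101112_1760638025"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# Illustrative prompt only; the training prompt template is not specified here.
prompt = "Which material conducts electricity best? A) wood B) copper C) glass D) rubber"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```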

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
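As a rough guide to reproducing this setup, the hyperparameters above map onto transformers TrainingArguments as sketched below. The original training script is not part of this card; output_dir and eval_strategy are illustrative assumptions, as are the dataset preprocessing and PEFT configuration, which are not shown.

```python
# Hedged sketch of TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_openbookqa_101112_1760638025",  # assumption
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",  # assumption; validation loss is reported once per epoch
)
```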

Training results

Training Loss   Epoch   Step    Validation Loss   Input Tokens Seen
0.6942           1.0     1116   0.6966              424568
0.6806           2.0     2232   0.7041              848552
0.6910           3.0     3348   0.6953             1271776
0.6917           4.0     4464   0.6958             1694896
0.7066           5.0     5580   0.6869             2118456
0.6931           6.0     6696   0.6998             2542752
0.6141           7.0     7812   0.6466             2966512
0.4988           8.0     8928   0.6209             3390048
0.5554           9.0    10044   0.5843             3814704
0.4766          10.0    11160   0.5740             4238440
0.4587          11.0    12276   0.5826             4662136
0.6224          12.0    13392   0.5690             5086336
0.3930          13.0    14508   0.5838             5510768
0.4478          14.0    15624   0.5707             5933936
0.3398          15.0    16740   0.5972             6357536
0.3969          16.0    17856   0.5900             6779872
0.1803          17.0    18972   0.6234             7203216
0.2085          18.0    20088   0.6647             7626944
0.2846          19.0    21204   0.6692             8051216
0.3797          20.0    22320   0.6721             8474968
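Validation loss reaches its minimum of 0.5690 at epoch 12, matching the evaluation loss reported above, and climbs afterwards even as training loss keeps falling. A minimal sketch to visualize this from the table, assuming matplotlib is available:

```python
# Plot the per-epoch validation loss from the training results table.
import matplotlib.pyplot as plt

epochs = list(range(1, 21))
val_loss = [0.6966, 0.7041, 0.6953, 0.6958, 0.6869, 0.6998, 0.6466, 0.6209,
            0.5843, 0.5740, 0.5826, 0.5690, 0.5838, 0.5707, 0.5972, 0.5900,
            0.6234, 0.6647, 0.6692, 0.6721]

plt.plot(epochs, val_loss, marker="o")
plt.axvline(12, linestyle="--", label="best epoch (0.5690)")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.legend()
plt.show()
```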

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4