train_boolq_123_1762598659

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4221
  • Num Input Tokens Seen: 42678144

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
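
To make the learning-rate settings above concrete, the following is a minimal sketch of a cosine schedule with warmup. It assumes linear warmup from zero followed by a single half-cosine decay to zero, which is a common form of this schedule; the card does not confirm the exact variant used. The total step count of 42420 is taken from the last row of the training results table and is assumed to be the scheduler horizon. The names `lr_at`, `BASE_LR`, `WARMUP_STEPS`, and `TOTAL_STEPS` are illustrative only.

```python
import math

BASE_LR = 0.001       # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 42420   # assumption: last step in the training results table

# Assumption: warmup length is the warmup ratio applied to the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Return the learning rate at a given optimizer step (illustrative sketch)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 2121, WARMUP_STEPS, 23331, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {lr_at(step):.6f}")
```

Under these assumptions, warmup would span the first two epochs (4242 steps), and the learning rate would decay for the remaining eighteen.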

Training results

Training Loss  Epoch  Step   Validation Loss  Input Tokens Seen
0.3124         1.0    2121   0.3276           2131904
0.308          2.0    4242   0.3278           4264768
0.3357         3.0    6363   0.3351           6404896
0.3423         4.0    8484   0.3320           8537088
0.3169         5.0    10605  0.3238           10677088
0.2902         6.0    12726  0.3203           12814080
0.2245         7.0    14847  0.3191           14950432
0.3277         8.0    16968  0.3129           17082336
0.2917         9.0    19089  0.3303           19211360
0.2396         10.0   21210  0.3094           21342336
0.3376         11.0   23331  0.3155           23472352
0.2367         12.0   25452  0.3056           25602144
0.2787         13.0   27573  0.3165           27739072
0.2327         14.0   29694  0.3215           29880544
0.188          15.0   31815  0.3245           32013760
0.2431         16.0   33936  0.3384           34138272
0.2415         17.0   36057  0.3464           36269152
0.1752         18.0   38178  0.3552           38408800
0.2795         19.0   40299  0.3594           40541312
0.1345         20.0   42420  0.3606           42678144
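
As a small aid for reading the table above, this sketch locates the epoch with the lowest validation loss. The loss values are copied from the table; the helper name `best_epoch` is hypothetical and not part of any training script referenced by this card.

```python
# Validation loss per epoch (epochs 1 to 20), copied from the training results table.
VAL_LOSS = [
    0.3276, 0.3278, 0.3351, 0.3320, 0.3238,
    0.3203, 0.3191, 0.3129, 0.3303, 0.3094,
    0.3155, 0.3056, 0.3165, 0.3215, 0.3245,
    0.3384, 0.3464, 0.3552, 0.3594, 0.3606,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"lowest validation loss {loss:.4f} at epoch {epoch}")
    print(f"final validation loss  {VAL_LOSS[-1]:.4f} at epoch {len(VAL_LOSS)}")
```

Validation loss is lowest at epoch 12 and rises in later epochs while training loss keeps falling, which may indicate overfitting in the second half of training. The card does not say which checkpoint was kept.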

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
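
Since the framework list includes PEFT, the weights are presumably published as an adapter on top of the base model. The following is a hedged sketch of how such an adapter is commonly loaded; it is not a documented usage recipe for this model. The repository ID is taken from the hosting page, the function name `load_model` is hypothetical, and the base model is gated, so access to it is assumed.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"   # base model named in this card
ADAPTER_ID = "rbelanec/train_boolq_123_1762598659"      # assumption: adapter repository ID


def load_model():
    """Load the base model and attach the adapter (illustrative sketch)."""
    # Imports are local so this file can be read or imported without the
    # heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    # Attach the fine-tuned adapter weights to the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model
```

The prompt format used for boolq during fine-tuning is not stated in this card, so any inference prompt would need to be checked against the training setup.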