train_copa_789_1757596138

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6012
  • Num Input Tokens Seen: 548240
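
The adapter can be loaded on top of the base model with PEFT. Below is a minimal sketch, assuming the adapter is published as rbelanec/train_copa_789_1757596138 and that you have access to the gated Meta-Llama-3 base weights; the COPA prompt format shown is illustrative, since the exact training template is not documented here:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_copa_789_1757596138"

# AutoPeftModelForCausalLM reads the adapter config, downloads the base
# meta-llama/Meta-Llama-3-8B-Instruct weights, and attaches the adapter.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# COPA-style prompt: choose the more plausible alternative (illustrative
# format only; the training prompt template is not part of this card).
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```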

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
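
For reference, here is a hedged sketch of how these values map onto transformers TrainingArguments. This is illustrative only: the original training script is not included in this card, and names such as output_dir are placeholders.

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters; "train_copa_789" is a
# placeholder output directory, not the author's actual path.
args = TrainingArguments(
    output_dir="train_copa_789",
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=789,
    optim="adamw_torch",       # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,          # 10% of total steps spent warming up
    num_train_epochs=20,
)
```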

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.5261        | 1.0   | 180  | 0.2666          | 27424             |
| 0.4265        | 2.0   | 360  | 0.2517          | 54832             |
| 0.2294        | 3.0   | 540  | 0.2400          | 82160             |
| 0.2376        | 4.0   | 720  | 0.2362          | 109632            |
| 0.2273        | 5.0   | 900  | 0.2374          | 137120            |
| 0.2282        | 6.0   | 1080 | 0.2412          | 164592            |
| 0.2299        | 7.0   | 1260 | 0.2372          | 191920            |
| 0.2302        | 8.0   | 1440 | 0.2416          | 219344            |
| 0.264         | 9.0   | 1620 | 0.2483          | 246736            |
| 0.2165        | 10.0  | 1800 | 0.2446          | 274208            |
| 0.254         | 11.0  | 1980 | 0.2517          | 301600            |
| 0.2522        | 12.0  | 2160 | 0.2489          | 328976            |
| 0.2228        | 13.0  | 2340 | 0.2545          | 356400            |
| 0.1836        | 14.0  | 2520 | 0.2654          | 383808            |
| 0.1791        | 15.0  | 2700 | 0.2790          | 411216            |
| 0.1126        | 16.0  | 2880 | 0.3588          | 438592            |
| 0.021         | 17.0  | 3060 | 0.4801          | 465984            |
| 0.0091        | 18.0  | 3240 | 0.5633          | 493488            |
| 0.0818        | 19.0  | 3420 | 0.5928          | 520816            |
| 0.0025        | 20.0  | 3600 | 0.6012          | 548240            |
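
Validation loss bottoms out at 0.2362 around epoch 4 and then climbs steadily while training loss keeps falling, a typical overfitting pattern under the 20-epoch schedule. If re-running this recipe, the best checkpoint could be kept automatically. A minimal sketch using transformers' EarlyStoppingCallback follows; this is an assumed addition, not part of the original run:

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Assumed setup: evaluate and save each epoch, track eval_loss, and reload
# the best checkpoint at the end. Eval and save must share a schedule for
# load_best_model_at_end to work.
args = TrainingArguments(
    output_dir="train_copa_789",   # placeholder path
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=20,
)

# Passed to Trainer via callbacks=[early_stop]; patience counts evaluation
# rounds without improvement, so training would halt near epoch 7 here.
early_stop = EarlyStoppingCallback(early_stopping_patience=3)
```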

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1