train_copa_1757340276

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9463
  • Num Input Tokens Seen: 547440

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
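
The card only lists the raw hyperparameter values above. The following is a minimal sketch of how they would map onto `transformers.TrainingArguments` together with a PEFT adapter; the LoRA settings (`r`, `lora_alpha`, `lora_dropout`) are illustrative placeholders, since the adapter configuration itself is not documented in this card.

```python
# Hedged sketch: the reported hyperparameters expressed as a
# transformers/PEFT setup. Only the TrainingArguments values come from
# this card; the LoRA config below is an assumption.
from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="train_copa_1757340276",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)

# Placeholder adapter config (rank, alpha, dropout are NOT reported here).
peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,               # assumption
    lora_alpha=32,      # assumption
    lora_dropout=0.05,  # assumption
)
```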

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.2208        | 1.0   | 180  | 0.2563          | 27344             |
| 0.2677        | 2.0   | 360  | 0.2335          | 54736             |
| 0.2249        | 3.0   | 540  | 0.2334          | 82064             |
| 0.2551        | 4.0   | 720  | 0.2424          | 109456            |
| 0.2229        | 5.0   | 900  | 0.2327          | 136784            |
| 0.2276        | 6.0   | 1080 | 0.2340          | 164192            |
| 0.2361        | 7.0   | 1260 | 0.2310          | 191552            |
| 0.2147        | 8.0   | 1440 | 0.2424          | 218944            |
| 0.2244        | 9.0   | 1620 | 0.2365          | 246352            |
| 0.2334        | 10.0  | 1800 | 0.2399          | 273744            |
| 0.2356        | 11.0  | 1980 | 0.2416          | 301072            |
| 0.223         | 12.0  | 2160 | 0.2418          | 328464            |
| 0.2351        | 13.0  | 2340 | 0.2705          | 355840            |
| 0.1368        | 14.0  | 2520 | 0.3143          | 383168            |
| 0.0239        | 15.0  | 2700 | 0.5442          | 410512            |
| 0.1856        | 16.0  | 2880 | 0.7039          | 437952            |
| 0.029         | 17.0  | 3060 | 0.8290          | 465264            |
| 0.0011        | 18.0  | 3240 | 0.9045          | 492672            |
| 0.0005        | 19.0  | 3420 | 0.9412          | 520048            |
| 0.0008        | 20.0  | 3600 | 0.9463          | 547440            |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
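
A minimal inference sketch follows, assuming the adapter is published as rbelanec/train_copa_1757340276 and sits on top of meta-llama/Meta-Llama-3-8B-Instruct as stated above. The COPA-style prompt is an illustrative guess; the prompt template used during training is not documented in this card.

```python
# Hedged sketch: load the PEFT adapter for inference. AutoPeftModelForCausalLM
# resolves and loads the base model named in the adapter's config.
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_copa_1757340276"
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Illustrative COPA-style prompt (assumption: not the training template).
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "(a) He got a hole in his sock.\n"
    "(b) He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```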