train_copa_42_1757596065

This model is a PEFT fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the COPA (Choice of Plausible Alternatives) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8684
  • Num Input Tokens Seen: 548544
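
Since this is a PEFT adapter on meta-llama/Meta-Llama-3-8B-Instruct, it can be loaded by attaching the adapter to the base model. The sketch below is a minimal, illustrative example: the repository ids come from this card, but the COPA-style prompt is an assumed format, not necessarily the template used during fine-tuning.

```python
# Minimal loading sketch (assumes access to the gated base model and that
# `accelerate` is installed so device_map="auto" works).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_42_1757596065"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative COPA-style prompt; the actual training template is not documented here.
prompt = (
    "The man broke his toe. What was the CAUSE of this?\n"
    "choice1: He got a hole in his sock.\n"
    "choice2: He dropped a hammer on his foot.\n"
    "answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```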

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
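
These settings map directly onto the Hugging Face Trainer API. The following is a minimal sketch, assuming the run used transformers.TrainingArguments; the output_dir and anything not listed above are placeholders, not values from the original run.

```python
# The hyperparameters reported on this card, expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_42_1757596065",  # placeholder path, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```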

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.2414        | 1.0   | 180  | 0.2286          | 27424             |
| 0.2373        | 2.0   | 360  | 0.2293          | 54832             |
| 0.2262        | 3.0   | 540  | 0.2261          | 82256             |
| 0.2317        | 4.0   | 720  | 0.2303          | 109728            |
| 0.2219        | 5.0   | 900  | 0.2318          | 137120            |
| 0.2254        | 6.0   | 1080 | 0.2332          | 164560            |
| 0.2401        | 7.0   | 1260 | 0.2287          | 192032            |
| 0.204         | 8.0   | 1440 | 0.2253          | 219456            |
| 0.2245        | 9.0   | 1620 | 0.2337          | 246880            |
| 0.2302        | 10.0  | 1800 | 0.2340          | 274272            |
| 0.2177        | 11.0  | 1980 | 0.2352          | 301696            |
| 0.2195        | 12.0  | 2160 | 0.2311          | 329168            |
| 0.2572        | 13.0  | 2340 | 0.2248          | 356608            |
| 0.244         | 14.0  | 2520 | 0.2874          | 384096            |
| 0.1153        | 15.0  | 2700 | 0.3360          | 411552            |
| 0.142         | 16.0  | 2880 | 0.4949          | 438896            |
| 0.0976        | 17.0  | 3060 | 0.7797          | 466320            |
| 0.0193        | 18.0  | 3240 | 0.8170          | 493664            |
| 0.0312        | 19.0  | 3420 | 0.8626          | 521088            |
| 0.0107        | 20.0  | 3600 | 0.8684          | 548544            |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
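
To check a local environment against these releases, a quick version comparison might look like the sketch below; the import names are the standard modules for these libraries.

```python
# Print installed versions next to the versions reported on this card.
import peft, transformers, torch, datasets, tokenizers

expected = {
    peft: "0.15.2",
    transformers: "4.51.3",
    torch: "2.8.0+cu128",
    datasets: "3.6.0",
    tokenizers: "0.21.1",
}
for module, version in expected.items():
    print(f"{module.__name__}: installed {module.__version__}, card reports {version}")
```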