train_copa_1756729609

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2389
  • Num Input Tokens Seen: 273712
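
The framework versions below list PEFT, so this repository presumably hosts a parameter-efficient adapter rather than full model weights. A minimal, hedged loading sketch, assuming the adapter id rbelanec/train_copa_1756729609 on top of the base model named above (the prompt format is illustrative, not necessarily the training template):

```python
# Hedged sketch: load the base model and attach this repo as a PEFT adapter.
# Assumptions: the repo is a PEFT/LoRA adapter; the prompt layout below is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_1756729609"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# COPA-style query: pick the more plausible alternative.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```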

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
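
As a rough illustration only, the values above map onto a transformers TrainingArguments configuration as sketched below (the output directory is an assumption, and any dataset handling or PEFT/LoRA settings are omitted; this is not the author's actual training script):

```python
# Hedged sketch: TrainingArguments mirroring the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_1756729609",  # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```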

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.2146        | 0.5   | 90   | 0.2563          | 13664             |
| 0.2358        | 1.0   | 180  | 0.2523          | 27408             |
| 0.2177        | 1.5   | 270  | 0.2437          | 41120             |
| 0.2349        | 2.0   | 360  | 0.2347          | 54752             |
| 0.2026        | 2.5   | 450  | 0.2439          | 68432             |
| 0.2456        | 3.0   | 540  | 0.2322          | 82176             |
| 0.2229        | 3.5   | 630  | 0.2402          | 95936             |
| 0.2258        | 4.0   | 720  | 0.2348          | 109584            |
| 0.2307        | 4.5   | 810  | 0.2455          | 123232            |
| 0.2319        | 5.0   | 900  | 0.2316          | 137008            |
| 0.2225        | 5.5   | 990  | 0.2376          | 150672            |
| 0.2297        | 6.0   | 1080 | 0.2333          | 164336            |
| 0.2299        | 6.5   | 1170 | 0.2325          | 178032            |
| 0.2122        | 7.0   | 1260 | 0.2332          | 191712            |
| 0.2274        | 7.5   | 1350 | 0.2341          | 205312            |
| 0.2397        | 8.0   | 1440 | 0.2398          | 219072            |
| 0.2326        | 8.5   | 1530 | 0.2392          | 232768            |
| 0.2314        | 9.0   | 1620 | 0.2372          | 246416            |
| 0.2125        | 9.5   | 1710 | 0.2374          | 260112            |
| 0.2223        | 10.0  | 1800 | 0.2389          | 273712            |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1