train_multirc_1756735871

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the multirc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3394
  • Num Input Tokens Seen: 117044976
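The card does not include a usage example. Below is a minimal inference sketch, assuming this repository contains a standard PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct; the repository id and the MultiRC-style prompt are illustrative, and the exact prompt format used during training is not documented here.

```python
# Minimal sketch (assumption): attach the PEFT adapter to the base model and run generation.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_multirc_1756735871"  # this adapter repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto", torch_dtype="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)  # load the fine-tuned adapter weights

# Illustrative MultiRC-style prompt (paragraph + question + candidate answer).
messages = [{
    "role": "user",
    "content": "Paragraph: ...\nQuestion: ...\nCandidate answer: ...\nIs the candidate answer correct?",
}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```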

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
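
The sketch below shows how these settings map onto Hugging Face `TrainingArguments`. It is an illustration of the listed values only, not the actual training script; details such as gradient accumulation, the PEFT configuration, and data preprocessing are not documented in this card.

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments
# (assumption: the run used the standard Hugging Face Trainer).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_multirc_1756735871",  # output directory name is an assumption
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```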

Training results

Training Loss | Epoch  | Step   | Validation Loss | Input Tokens Seen
0.3178        | 0.5000 |   6130 | 0.3802          |   5867968
0.5557        | 1.0001 |  12260 | 0.3493          |  11717296
0.0109        | 1.5001 |  18390 | 0.1670          |  17588368
0.2324        | 2.0002 |  24520 | 0.3016          |  23419920
0.277         | 2.5002 |  30650 | 0.3568          |  29254768
0.3393        | 3.0002 |  36780 | 0.3402          |  35127456
0.3145        | 3.5003 |  42910 | 0.3391          |  40997712
0.3363        | 4.0003 |  49040 | 0.3400          |  46843184
0.3059        | 4.5004 |  55170 | 0.3385          |  52692880
0.4232        | 5.0004 |  61300 | 0.3394          |  58550736
0.2604        | 5.5004 |  67430 | 0.3361          |  64403536
0.2416        | 6.0005 |  73560 | 0.3536          |  70253968
0.4941        | 6.5005 |  79690 | 0.3594          |  76098208
0.4356        | 7.0006 |  85820 | 0.3684          |  81942784
0.5113        | 7.5006 |  91950 | 0.3346          |  87796256
0.2519        | 8.0007 |  98080 | 0.3266          |  93652576
0.388         | 8.5007 | 104210 | 0.3368          |  99519344
0.4701        | 9.0007 | 110340 | 0.3482          | 105351168
0.4091        | 9.5008 | 116470 | 0.3413          | 111213008

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1