train_cb_1754652159

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 3.8483
  • Num Input Tokens Seen: 367864
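
This repository ships a PEFT adapter rather than full model weights (see Framework versions below), so it must be loaded on top of the base model. Below is a minimal loading/inference sketch, assuming the adapter id matches this repo and that you have access to the gated Llama 3 base weights; the prompt is an illustrative placeholder, not necessarily the format used during training:

```python
# Minimal sketch: load the PEFT adapter on top of the gated Llama 3 base model.
# The prompt format below is an assumption for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_cb_1754652159"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # applies the adapter weights

prompt = "premise: It was raining. hypothesis: The ground was wet. answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```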

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
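
For orientation, here is a sketch of how these values map onto transformers TrainingArguments. The output_dir is an assumption, and the actual training script (including any PEFT-specific configuration) may differ:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_cb_1754652159",  # assumed; not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```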

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|--------------:|-------:|-----:|----------------:|------------------:|
| 12.6939       | 0.5088 |   29 | 12.1533         |             20064 |
| 11.0368       | 1.0175 |   58 | 10.8909         |             37832 |
| 10.4486       | 1.5263 |   87 | 9.3964          |             57288 |
| 7.9533        | 2.0351 |  116 | 7.9230          |             74520 |
| 7.4909        | 2.5439 |  145 | 6.9202          |             93080 |
| 6.0778        | 3.0526 |  174 | 6.1027          |            111928 |
| 6.4246        | 3.5614 |  203 | 5.5888          |            131160 |
| 5.7896        | 4.0702 |  232 | 5.2264          |            150056 |
| 4.6737        | 4.5789 |  261 | 4.9209          |            167208 |
| 4.9442        | 5.0877 |  290 | 4.6369          |            186160 |
| 4.2959        | 5.5965 |  319 | 4.4270          |            206000 |
| 4.0552        | 6.1053 |  348 | 4.2730          |            224064 |
| 4.3442        | 6.6140 |  377 | 4.1417          |            243840 |
| 3.8654        | 7.1228 |  406 | 4.0376          |            261504 |
| 4.2193        | 7.6316 |  435 | 3.9544          |            280352 |
| 3.6867        | 8.1404 |  464 | 3.9009          |            299344 |
| 3.9532        | 8.6491 |  493 | 3.8730          |            318672 |
| 4.1488        | 9.1579 |  522 | 3.8590          |            337480 |
| 4.0229        | 9.6667 |  551 | 3.8483          |            356456 |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1