train_cb_1757340215

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb (CommitmentBank) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0186
  • Num Input Tokens Seen: 620240
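Since the card itself gives no usage snippet, here is a minimal loading sketch, assuming the checkpoint is a PEFT adapter (per the framework versions below) on top of meta-llama/Meta-Llama-3-8B-Instruct. The premise/hypothesis prompt string is a hypothetical placeholder; the prompt format actually used in training is not documented here.

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Loads the adapter from the Hub; PEFT resolves the base model
# (meta-llama/Meta-Llama-3-8B-Instruct) from the adapter config.
model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_cb_1757340215",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Hypothetical CB-style input; the actual training prompt template
# is not published with this card.
prompt = "Premise: ...\nHypothesis: ...\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```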

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
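Expressed as transformers.TrainingArguments, these settings correspond roughly to the sketch below. This is a reconstruction from the list above, not the actual training script: output_dir is a placeholder, and any PEFT-specific configuration is omitted.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cb_1757340215",  # placeholder; actual path unknown
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```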

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---------------|-------|------|-----------------|-------------------|
| 0.1978        | 1.0   | 113  | 0.4647          | 31064             |
| 0.3015        | 2.0   | 226  | 0.4196          | 62304             |
| 0.49          | 3.0   | 339  | 0.2182          | 93232             |
| 0.0664        | 4.0   | 452  | 0.1010          | 124680            |
| 0.2766        | 5.0   | 565  | 0.1023          | 155672            |
| 0.0905        | 6.0   | 678  | 0.1096          | 186688            |
| 0.0007        | 7.0   | 791  | 0.2587          | 217736            |
| 0.0003        | 8.0   | 904  | 0.0398          | 248784            |
| 0.0001        | 9.0   | 1017 | 0.0999          | 279688            |
| 0.0           | 10.0  | 1130 | 0.0227          | 310504            |
| 0.0           | 11.0  | 1243 | 0.0211          | 341152            |
| 0.0           | 12.0  | 1356 | 0.0198          | 371768            |
| 0.0           | 13.0  | 1469 | 0.0196          | 402896            |
| 0.0           | 14.0  | 1582 | 0.0200          | 433768            |
| 0.0           | 15.0  | 1695 | 0.0186          | 464816            |
| 0.0           | 16.0  | 1808 | 0.0197          | 496216            |
| 0.0           | 17.0  | 1921 | 0.0192          | 527360            |
| 0.0           | 18.0  | 2034 | 0.0182          | 558088            |
| 0.0           | 19.0  | 2147 | 0.0190          | 589072            |
| 0.0           | 20.0  | 2260 | 0.0186          | 620240            |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
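To reproduce the environment, the versions above can be pinned, for example in a requirements.txt. The +cu128 suffix marks the CUDA 12.8 build of PyTorch, which comes from the PyTorch wheel index rather than the version pin itself:

```
peft==0.15.2
transformers==4.51.3
torch==2.8.0
datasets==3.6.0
tokenizers==0.21.1
```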