train_record_789_1769896664

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3071
  • Num Input Tokens Seen: 928969632
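
The framework versions below include PEFT, so this checkpoint is an adapter that is loaded on top of the base model rather than a full set of weights. The following is a minimal loading sketch, assuming the adapter is hosted at rbelanec/train_record_789_1769896664 and that you have access to the gated Llama 3 base weights; the prompt is purely illustrative, since the training task format is not documented here:

```python
# Sketch: load the base model, then attach this PEFT adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_record_789_1769896664"  # assumed full repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter
model.eval()

# Illustrative prompt only; the expected input format is not documented above.
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```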

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto transformers.TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
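
These settings map onto transformers.TrainingArguments roughly as follows. This is a sketch rather than the original training script, which is not published here; output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="train_record_789_1769896664",  # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```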

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.4391        | 1.0   | 31242  | 0.4958          | 46450496          |
| 0.3081        | 2.0   | 62484  | 0.3964          | 92891936          |
| 0.3691        | 3.0   | 93726  | 0.3563          | 139349440         |
| 0.3423        | 4.0   | 124968 | 0.3371          | 185796864         |
| 0.4379        | 5.0   | 156210 | 0.3265          | 232235744         |
| 0.235         | 6.0   | 187452 | 0.3195          | 278704192         |
| 0.3793        | 7.0   | 218694 | 0.3139          | 325156032         |
| 0.3391        | 8.0   | 249936 | 0.3110          | 371599168         |
| 0.1754        | 9.0   | 281178 | 0.3108          | 418050784         |
| 0.1995        | 10.0  | 312420 | 0.3071          | 464504128         |
| 0.3682        | 11.0  | 343662 | 0.3080          | 510961472         |
| 0.3631        | 12.0  | 374904 | 0.3086          | 557400608         |
| 0.2445        | 13.0  | 406146 | 0.3092          | 603828768         |
| 0.2535        | 14.0  | 437388 | 0.3078          | 650269472         |
| 0.2114        | 15.0  | 468630 | 0.3087          | 696703648         |
| 0.3696        | 16.0  | 499872 | 0.3085          | 743153504         |
| 0.3658        | 17.0  | 531114 | 0.3085          | 789592640         |
| 0.2149        | 18.0  | 562356 | 0.3087          | 836057504         |
| 0.2168        | 19.0  | 593598 | 0.3087          | 882513984         |
| 0.2888        | 20.0  | 624840 | 0.3087          | 928969632         |
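
The reported evaluation loss of 0.3071 matches the epoch-10 row, the minimum validation loss in the table, which suggests the best checkpoint by validation loss was retained at the end of training. One way to confirm the best epoch from a Trainer run is to inspect trainer_state.json in the checkpoint directory; this is a sketch where the file path is an assumption, and log_history containing eval_loss entries is the default Trainer behavior when evaluation is enabled:

```python
# Sketch: find the epoch with the lowest validation loss from trainer_state.json.
import json

with open("trainer_state.json") as f:  # path is an assumption
    state = json.load(f)

evals = [e for e in state["log_history"] if "eval_loss" in e]
best = min(evals, key=lambda e: e["eval_loss"])
print(f"best eval_loss {best['eval_loss']:.4f} at epoch {best['epoch']}")
```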

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1