train_record_42_1761523804

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.2716
  • Num Input Tokens Seen: 929295680
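
Since this is a PEFT adapter rather than a full set of model weights, it has to be loaded on top of the base model. The sketch below is illustrative, not taken from this card: the repo id rbelanec/train_record_42_1761523804 is inferred from the card's name, and the prompt and generation settings are placeholders.

```python
# Illustrative loading sketch: apply this adapter to the base model with peft.
# The adapter repo id is inferred from the card name, not documented here.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, "rbelanec/train_record_42_1761523804")
model.eval()

# Placeholder prompt; the card does not document the prompt template used.
inputs = tokenizer("Example ReCoRD-style cloze prompt goes here.", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```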

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
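
The list above maps directly onto transformers TrainingArguments. The sketch below is a minimal, assumption-laden reconstruction: the LoRA configuration, the dataset source (SuperGLUE ReCoRD is assumed for "record"), the prompt format, and the sequence length are all guesses, since the card does not document them.

```python
# Hedged reconstruction of the training setup from the hyperparameters above.
# LoRA config, dataset source, prompt format, and max_length are assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id)
model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM"))  # rank/targets: peft defaults, assumed

def tokenize(example):
    # Naive passage+query prompt; the actual template is not documented.
    return tokenizer(example["passage"] + "\n" + example["query"],
                     truncation=True, max_length=1024)

raw = load_dataset("super_glue", "record")  # assumed source of the "record" dataset
train_ds = raw["train"].map(tokenize, remove_columns=raw["train"].column_names)
eval_ds = raw["validation"].map(tokenize, remove_columns=raw["validation"].column_names)

args = TrainingArguments(
    output_dir="train_record_42_1761523804",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```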

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|--------------:|------:|-------:|----------------:|------------------:|
| 0.2075        | 1.0   | 31242  | 0.3079          | 46467680          |
| 0.3533        | 2.0   | 62484  | 0.2716          | 92922208          |
| 0.1586        | 3.0   | 93726  | 0.2872          | 139391296         |
| 0.2084        | 4.0   | 124968 | 0.3042          | 185843712         |
| 0.2229        | 5.0   | 156210 | 0.3172          | 232305472         |
| 0.183         | 6.0   | 187452 | 0.3756          | 278763136         |
| 0.1189        | 7.0   | 218694 | 0.3927          | 325229440         |
| 0.2179        | 8.0   | 249936 | 0.3841          | 371696128         |
| 0.139         | 9.0   | 281178 | 0.4512          | 418155040         |
| 0.0842        | 10.0  | 312420 | 0.4691          | 464625760         |
| 0.1327        | 11.0  | 343662 | 0.5245          | 511102336         |
| 0.1141        | 12.0  | 374904 | 0.5519          | 557571936         |
| 0.2159        | 13.0  | 406146 | 0.6113          | 604043392         |
| 0.11          | 14.0  | 437388 | 0.6416          | 650494080         |
| 0.1411        | 15.0  | 468630 | 0.7649          | 696961792         |
| 0.0878        | 16.0  | 499872 | 0.7965          | 743437824         |
| 0.1648        | 17.0  | 531114 | 0.8925          | 789906944         |
| 0.1131        | 18.0  | 562356 | 0.9994          | 836372000         |
| 0.1143        | 19.0  | 593598 | 1.1541          | 882831072         |
| 0.1219        | 20.0  | 624840 | 1.1925          | 929295680         |
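
Validation loss bottoms out at epoch 2 (0.2716, the value reported at the top of this card) and then trends steadily upward through epoch 20 while training loss keeps falling, the usual overfitting signature. If re-running this recipe, keeping the best checkpoint and stopping early is one option. A sketch, reusing model, train_ds, eval_ds, and tokenizer from the configuration sketch above; the patience value is illustrative:

```python
# Sketch: keep the checkpoint with the lowest validation loss and stop once
# it has not improved for a few epochs. Patience value is illustrative.
from transformers import (DataCollatorForLanguageModeling, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

args = TrainingArguments(
    output_dir="train_record_42_1761523804",
    eval_strategy="epoch",
    save_strategy="epoch",             # must match eval_strategy for best-model loading
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=20,
)
trainer = Trainer(
    model=model,                       # from the training sketch above
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```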

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1