train_record_42_1761957644

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3110
  • Num Input Tokens Seen: 929295680

Model description

More information needed

Intended uses & limitations

More information needed
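
Usage is not yet documented on this card. As a minimal, untested sketch: since this is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, it can presumably be loaded with the peft library roughly as follows (the adapter repo id is taken from this card; generation settings are illustrative):

```python
# Sketch only: load the PEFT adapter on top of its base model.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_record_42_1761957644"  # adapter repo from this card

# AutoPeftModelForCausalLM reads the adapter config and fetches the base model
# (meta-llama/Meta-Llama-3-8B-Instruct) automatically.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

prompt = "Hello"  # placeholder; the expected prompt format is not documented here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```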

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
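
For reference, a hedged sketch of how the values above map onto transformers TrainingArguments; the actual training script is not part of this card, and output_dir is a placeholder:

```python
# Sketch only: the listed hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_record_42_1761957644",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```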

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.5052        | 1.0   | 31242  | 0.4960          | 46467680          |
| 0.5026        | 2.0   | 62484  | 0.3959          | 92922208          |
| 0.3241        | 3.0   | 93726  | 0.3586          | 139391296         |
| 0.2875        | 4.0   | 124968 | 0.3401          | 185843712         |
| 0.2671        | 5.0   | 156210 | 0.3300          | 232305472         |
| 0.3153        | 6.0   | 187452 | 0.3231          | 278763136         |
| 0.2318        | 7.0   | 218694 | 0.3194          | 325229440         |
| 0.3274        | 8.0   | 249936 | 0.3154          | 371696128         |
| 0.3717        | 9.0   | 281178 | 0.3121          | 418155040         |
| 0.2968        | 10.0  | 312420 | 0.3120          | 464625760         |
| 0.2307        | 11.0  | 343662 | 0.3128          | 511102336         |
| 0.2210        | 12.0  | 374904 | 0.3116          | 557571936         |
| 0.2548        | 13.0  | 406146 | 0.3128          | 604043392         |
| 0.2354        | 14.0  | 437388 | 0.3110          | 650494080         |
| 0.2499        | 15.0  | 468630 | 0.3119          | 696961792         |
| 0.2935        | 16.0  | 499872 | 0.3120          | 743437824         |
| 0.2881        | 17.0  | 531114 | 0.3129          | 789906944         |
| 0.2174        | 18.0  | 562356 | 0.3125          | 836372000         |
| 0.2468        | 19.0  | 593598 | 0.3125          | 882831072         |
| 0.1973        | 20.0  | 624840 | 0.3128          | 929295680         |
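
The lowest validation loss, 0.3110, occurs at epoch 14 (step 437388) and matches the evaluation loss reported at the top of this card; validation loss plateaus around 0.312 over the remaining epochs.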

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1