train_record_123_1763024605

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 6.4921
  • Num Input Tokens Seen: 928969984
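
Since this checkpoint is a PEFT adapter rather than a full model, it is loaded on top of the frozen base model. Below is a minimal sketch using the peft and transformers libraries; the adapter repo id rbelanec/train_record_123_1763024605 is taken from this card, while the prompt and generation settings are illustrative assumptions (the base model is gated and requires Hub access).

```python
# Minimal sketch: attach the PEFT adapter to the base model for inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_record_123_1763024605"  # adapter repo id from this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # loads adapter weights on top
model.eval()

# Illustrative prompt only; the task/prompt format used in training
# is not documented on this card.
messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```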

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch in code follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
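
For reference, here is a hedged sketch of the hyperparameters above expressed as transformers.TrainingArguments. The output directory and any setting not listed on this card are placeholders, not values from the original run.

```python
# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_record_123_1763024605",  # placeholder, not from the card
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```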

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 0.3045        | 1.0   | 31242  | 0.3144          | 46454112          |
| 0.2037        | 2.0   | 62484  | 0.2961          | 92908288          |
| 0.2339        | 3.0   | 93726  | 0.2803          | 139351808         |
| 0.2753        | 4.0   | 124968 | 0.2709          | 185790304         |
| 0.2165        | 5.0   | 156210 | 0.2694          | 232243968         |
| 0.2011        | 6.0   | 187452 | 0.2638          | 278686752         |
| 0.1549        | 7.0   | 218694 | 0.2635          | 325137568         |
| 0.2504        | 8.0   | 249936 | 0.2607          | 371592704         |
| 0.1917        | 9.0   | 281178 | 0.2620          | 418033696         |
| 0.3011        | 10.0  | 312420 | 0.2620          | 464483424         |
| 0.2058        | 11.0  | 343662 | 0.2632          | 510926720         |
| 0.1434        | 12.0  | 374904 | 0.2629          | 557369088         |
| 0.2686        | 13.0  | 406146 | 0.2662          | 603816992         |
| 0.204         | 14.0  | 437388 | 0.2736          | 650269248         |
| 0.1561        | 15.0  | 468630 | 0.2768          | 696727936         |
| 0.2222        | 16.0  | 499872 | 0.2874          | 743174112         |
| 0.2384        | 17.0  | 531114 | 0.2887          | 789614720         |
| 0.1332        | 18.0  | 562356 | 0.2930          | 836057280         |
| 0.171         | 19.0  | 593598 | 0.2942          | 882504192         |
| 0.191         | 20.0  | 624840 | 0.2959          | 928969984         |
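
Validation loss bottoms out at 0.2607 around epoch 8 and drifts upward afterward, so an earlier checkpoint may generalize better than the final epoch-20 weights.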

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1