---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - prefix-tuning
  - generated_from_trainer
model-index:
  - name: train_record_1756914931
    results: []
---

train_record_1756914931

This model is a prefix-tuning adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the record (ReCoRD) dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.4367
  • Num Input Tokens Seen: 437687728
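
To run the adapter, load the base model and attach the prefix-tuning weights with PEFT. The snippet below is a minimal sketch, not the authors' exact setup; the adapter repo id rbelanec/train_record_1756914931 is an assumption inferred from the model name, so substitute the published id or a local checkpoint path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_record_1756914931"  # assumed repo id; replace if it differs

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the trained prefix on top of the frozen base model
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

messages = [{"role": "user", "content": "Summarize the passage in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```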

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
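
These settings map directly onto the standard PEFT and Transformers APIs. The sketch below is one possible equivalent configuration, not the exact LLaMA-Factory invocation that produced this run; in particular, num_virtual_tokens (the prefix length) is a hypothetical placeholder, since the card does not record it:

```python
from peft import PrefixTuningConfig, TaskType
from transformers import TrainingArguments

# Prefix tuning: prepend trainable virtual tokens to the attention layers
peft_config = PrefixTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,  # hypothetical: the actual prefix length is not recorded here
)

# Mirrors the hyperparameters listed above
training_args = TrainingArguments(
    output_dir="train_record_1756914931",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```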

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 0.2393        | 0.5   | 31242  | 0.3863          | 21882176          |
| 0.2851        | 1.0   | 62484  | 0.3722          | 43764848          |
| 0.3349        | 1.5   | 93726  | 0.3156          | 65654448          |
| 0.3611        | 2.0   | 124968 | 0.3249          | 87540080          |
| 0.2412        | 2.5   | 156210 | 0.3086          | 109399264         |
| 0.2926        | 3.0   | 187452 | 0.3157          | 131310112         |
| 0.2121        | 3.5   | 218694 | 0.3256          | 153180192         |
| 0.1926        | 4.0   | 249936 | 0.3087          | 175076672         |
| 0.1418        | 4.5   | 281178 | 0.3052          | 196954256         |
| 0.1742        | 5.0   | 312420 | 0.3157          | 218847792         |
| 0.1728        | 5.5   | 343662 | 0.3363          | 240725504         |
| 0.2249        | 6.0   | 374904 | 0.3064          | 262607120         |
| 0.2989        | 6.5   | 406146 | 0.3348          | 284498880         |
| 0.1548        | 7.0   | 437388 | 0.3352          | 306371872         |
| 0.1321        | 7.5   | 468630 | 0.3689          | 328260832         |
| 0.1422        | 8.0   | 499872 | 0.3563          | 350148752         |
| 0.1586        | 8.5   | 531114 | 0.4022          | 372040912         |
| 0.1379        | 9.0   | 562356 | 0.3845          | 393915104         |
| 0.1341        | 9.5   | 593598 | 0.4348          | 415811824         |
| 0.1225        | 10.0  | 624840 | 0.4367          | 437687728         |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1