metadata
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - lntuning
  - generated_from_trainer
model-index:
  - name: train_codealpacapy_456_1765320135
    results: []

train_codealpacapy_456_1765320135

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4469
  • Num Input Tokens Seen: 24973864
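
A minimal usage sketch with Transformers and PEFT, assuming the adapter weights are available on the Hub or locally; the adapter repo id below is an assumption, substitute the actual Hub id or a local adapter directory:

```python
# Hypothetical usage sketch: load the adapter on top of the base model with PEFT.
# The adapter repo id is an assumption; replace it with the actual Hub id or a local path.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_456_1765320135"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```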

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them to transformers TrainingArguments follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
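
As referenced above, here is a hedged sketch of how these hyperparameters could be expressed with transformers TrainingArguments together with a PEFT LN-Tuning config (inferred from the "lntuning" tag). The actual run was launched through LLaMA-Factory, so this is an approximation rather than the exact configuration used:

```python
# Approximate reconstruction of the listed hyperparameters; not the exact LLaMA-Factory config.
from transformers import TrainingArguments
from peft import LNTuningConfig, TaskType

training_args = TrainingArguments(
    output_dir="train_codealpacapy_456_1765320135",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)

# LN-Tuning trains only the LayerNorm parameters of the base model,
# which is consistent with the "lntuning" tag in the card metadata.
peft_config = LNTuningConfig(task_type=TaskType.CAUSAL_LM)
```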

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6003        | 1.0   | 1908  | 0.5420          | 1246832           |
| 0.562         | 2.0   | 3816  | 0.4901          | 2497936           |
| 0.4684        | 3.0   | 5724  | 0.4709          | 3743760           |
| 0.4606        | 4.0   | 7632  | 0.4625          | 4991472           |
| 1.1977        | 5.0   | 9540  | 0.4581          | 6239608           |
| 0.4325        | 6.0   | 11448 | 0.4543          | 7485248           |
| 0.6624        | 7.0   | 13356 | 0.4522          | 8733024           |
| 0.6294        | 8.0   | 15264 | 0.4506          | 9983720           |
| 0.3541        | 9.0   | 17172 | 0.4490          | 11229792          |
| 0.3995        | 10.0  | 19080 | 0.4484          | 12476552          |
| 0.4189        | 11.0  | 20988 | 0.4486          | 13725560          |
| 0.4387        | 12.0  | 22896 | 0.4477          | 14977976          |
| 0.4162        | 13.0  | 24804 | 0.4474          | 16225896          |
| 0.4816        | 14.0  | 26712 | 0.4473          | 17477224          |
| 0.4314        | 15.0  | 28620 | 0.4472          | 18726216          |
| 0.316         | 16.0  | 30528 | 0.4469          | 19973408          |
| 0.3909        | 17.0  | 32436 | 0.4470          | 21226656          |
| 0.4489        | 18.0  | 34344 | 0.4469          | 22472696          |
| 0.4458        | 19.0  | 36252 | 0.4472          | 23722376          |
| 0.7972        | 20.0  | 38160 | 0.4471          | 24973864          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1