---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - prompt-tuning
  - generated_from_trainer
model-index:
  - name: train_boolq_456_1765348231
    results: []
---

# train_boolq_456_1765348231

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the boolq dataset. It achieves the following results on the evaluation set:

- Loss: 0.3320
- Num input tokens seen: 42758400
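Because this is a PEFT prompt-tuning adapter rather than a full checkpoint, it can be loaded on top of the base model with `AutoPeftModelForCausalLM`. A minimal sketch follows; the repo id `rbelanec/train_boolq_456_1765348231` and the BoolQ-style prompt format are assumptions, since neither the hosting path nor the training template is documented in this card.

```python
# Minimal loading/inference sketch. Assumptions: the adapter is hosted at
# "rbelanec/train_boolq_456_1765348231", and a simple passage/question/answer
# prompt approximates the (undocumented) training template.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_boolq_456_1765348231"  # assumed repo id

# Loads Meta-Llama-3-8B-Instruct and attaches the prompt-tuning adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# BoolQ items pair a passage with a yes/no question.
prompt = (
    "Passage: The Amazon is the largest rainforest in the world.\n"
    "Question: is the amazon the largest rainforest in the world\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```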

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.03
- train_batch_size: 4
- eval_batch_size: 4
- seed: 456
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
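For reference, here is a sketch of how this configuration maps onto PEFT and `TrainingArguments`. The number of virtual tokens is not reported in this card, so `num_virtual_tokens=20` is a placeholder assumption; the training arguments mirror the list above.

```python
# Sketch of a comparable prompt-tuning setup. num_virtual_tokens is NOT
# reported in this card; 20 is a placeholder assumption.
from peft import PromptTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,  # assumption: not documented above
)
model = get_peft_model(base, peft_config)
model.print_trainable_parameters()  # only the soft-prompt embeddings are trainable

# The hyperparameters listed above, expressed as TrainingArguments.
args = TrainingArguments(
    output_dir="train_boolq_456_1765348231",
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```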

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.3421        | 1.0   | 2121  | 0.3622          | 2134176           |
| 0.3701        | 2.0   | 4242  | 0.3352          | 4262848           |
| 0.3564        | 3.0   | 6363  | 0.3581          | 6393376           |
| 0.3406        | 4.0   | 8484  | 0.3343          | 8527168           |
| 0.3276        | 5.0   | 10605 | 0.3372          | 10663616          |
| 0.382         | 6.0   | 12726 | 0.3354          | 12804064          |
| 0.3345        | 7.0   | 14847 | 0.3352          | 14945088          |
| 0.3754        | 8.0   | 16968 | 0.3446          | 17092000          |
| 0.276         | 9.0   | 19089 | 0.3377          | 19221568          |
| 0.3576        | 10.0  | 21210 | 0.3368          | 21363392          |
| 0.332         | 11.0  | 23331 | 0.3349          | 23510432          |
| 0.3691        | 12.0  | 25452 | 0.3381          | 25647872          |
| 0.3393        | 13.0  | 27573 | 0.3332          | 27792544          |
| 0.3103        | 14.0  | 29694 | 0.3336          | 29926112          |
| 0.3476        | 15.0  | 31815 | 0.3338          | 32059584          |
| 0.362         | 16.0  | 33936 | 0.3329          | 34199392          |
| 0.3542        | 17.0  | 36057 | 0.3321          | 36344896          |
| 0.328         | 18.0  | 38178 | 0.3320          | 38482112          |
| 0.335         | 19.0  | 40299 | 0.3325          | 40624896          |
| 0.3011        | 20.0  | 42420 | 0.3322          | 42758400          |

### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1