train_conala_1754507515

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7582
  • Num Input Tokens Seen: 1524216
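The PEFT entry under framework versions indicates this checkpoint is an adapter rather than a full set of model weights, so it must be loaded on top of the base model. A minimal inference sketch, assuming the adapter is published on the Hub as rbelanec/train_conala_1754507515 and that you have access to the gated Llama 3 base model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_1754507515"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned CoNaLa adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, adapter_id)

# CoNaLa pairs natural-language intents with short Python snippets,
# so brief how-to prompts play to the fine-tune's strengths.
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```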

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
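These values map one-to-one onto transformers.TrainingArguments, so the run's configuration can be sketched as below. This is a hypothetical reconstruction: output_dir is a placeholder, and any settings not listed above (gradient accumulation, precision, logging cadence) are unknown and omitted.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction from the hyperparameters listed above.
args = TrainingArguments(
    output_dir="train_conala_1754507515",  # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```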

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 2.7084        | 0.5   | 268  | 2.6180          | 75936             |
| 2.115         | 1.0   | 536  | 1.6151          | 152672            |
| 1.1504        | 1.5   | 804  | 1.0901          | 229344            |
| 1.1388        | 2.0   | 1072 | 0.9712          | 305288            |
| 0.9143        | 2.5   | 1340 | 0.9083          | 382120            |
| 0.7339        | 3.0   | 1608 | 0.8660          | 457952            |
| 0.7818        | 3.5   | 1876 | 0.8380          | 534688            |
| 0.5496        | 4.0   | 2144 | 0.8174          | 610944            |
| 0.6749        | 4.5   | 2412 | 0.8038          | 687328            |
| 1.1392        | 5.0   | 2680 | 0.7910          | 762440            |
| 0.8316        | 5.5   | 2948 | 0.7825          | 839656            |
| 0.8026        | 6.0   | 3216 | 0.7750          | 914920            |
| 0.8252        | 6.5   | 3484 | 0.7701          | 992104            |
| 0.6084        | 7.0   | 3752 | 0.7664          | 1067520           |
| 0.9433        | 7.5   | 4020 | 0.7625          | 1142912           |
| 0.8192        | 8.0   | 4288 | 0.7612          | 1220200           |
| 0.9766        | 8.5   | 4556 | 0.7601          | 1295720           |
| 0.9787        | 9.0   | 4824 | 0.7596          | 1372560           |
| 0.8351        | 9.5   | 5092 | 0.7582          | 1447376           |
| 0.5224        | 10.0  | 5360 | 0.7590          | 1524216           |

Note that the evaluation loss of 0.7582 reported above corresponds to the best checkpoint (epoch 9.5), while the input-token count of 1524216 reflects the full 10-epoch run, after which validation loss had ticked up slightly to 0.7590.

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
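To reproduce this environment, the versions above can be pinned at install time. A hedged sketch (the cu128 wheel index URL is an assumption about how that PyTorch build is obtained; adjust for your CUDA setup):

```
pip install peft==0.15.2 transformers==4.51.3 datasets==3.6.0 tokenizers==0.21.1
pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128
```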