train_svamp_42_1757596059

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1135
  • Num Input Tokens Seen: 1349904
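
Since this is a PEFT adapter rather than a standalone checkpoint, it must be loaded on top of the base model. Below is a minimal loading sketch (not part of the original card), assuming the adapter is published as rbelanec/train_svamp_42_1757596059 and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights; the prompt is an illustrative SVAMP-style word problem, not one taken from the evaluation set.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_svamp_42_1757596059"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# An illustrative SVAMP-style arithmetic word problem (not from the eval set).
messages = [
    {"role": "user", "content": "Jack had 8 apples and gave 3 to Jill. How many apples does Jack have now?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```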

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
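
For orientation only: SVAMP is a small benchmark of elementary-level arithmetic word problems. The card does not say which Hub copy or split was used for this run, so the dataset id in the sketch below is an assumption.

```python
from datasets import load_dataset

# Assumption: "ChilleD/SVAMP" is one public Hub mirror of SVAMP; the card does
# not state the exact dataset id or split used for this training run.
svamp = load_dataset("ChilleD/SVAMP")
print(svamp)
print(svamp["train"][0])
```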

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
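
For reference, the settings above map roughly onto a transformers TrainingArguments configuration as sketched below. The actual training script (including the LoRA/PEFT configuration and Trainer wiring) is not published with this card, so anything not in the list above is an assumption.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed in this card;
# adamw_torch defaults already use betas=(0.9, 0.999) and epsilon=1e-08.
args = TrainingArguments(
    output_dir="train_svamp_42_1757596059",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```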

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.6709        | 1.0   | 315  | 0.5841          | 67312             |
| 0.0811        | 2.0   | 630  | 0.1495          | 134816            |
| 0.0254        | 3.0   | 945  | 0.0757          | 202256            |
| 0.0748        | 4.0   | 1260 | 0.0671          | 269648            |
| 0.0008        | 5.0   | 1575 | 0.0548          | 337152            |
| 0.0424        | 6.0   | 1890 | 0.0977          | 404752            |
| 0.0629        | 7.0   | 2205 | 0.0875          | 472288            |
| 0.0151        | 8.0   | 2520 | 0.0815          | 539760            |
| 0.0002        | 9.0   | 2835 | 0.0707          | 607296            |
| 0.0044        | 10.0  | 3150 | 0.1061          | 674736            |
| 0.0           | 11.0  | 3465 | 0.0970          | 742288            |
| 0.0002        | 12.0  | 3780 | 0.1058          | 809680            |
| 0.0           | 13.0  | 4095 | 0.1059          | 877264            |
| 0.0           | 14.0  | 4410 | 0.1101          | 944768            |
| 0.0           | 15.0  | 4725 | 0.1124          | 1012288           |
| 0.0           | 16.0  | 5040 | 0.1126          | 1079760           |
| 0.0           | 17.0  | 5355 | 0.1130          | 1147376           |
| 0.0           | 18.0  | 5670 | 0.1134          | 1214944           |
| 0.0           | 19.0  | 5985 | 0.1158          | 1282448           |
| 0.0           | 20.0  | 6300 | 0.1135          | 1349904           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1