train_cola_42_1763630698

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoLA dataset. It achieves the following results on the evaluation set (a loading example follows the summary):

  • Loss: 0.2143
  • Num Input Tokens Seen: 3463336
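
Since this repository hosts a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct (PEFT is listed under framework versions), a minimal loading sketch might look like the following. The dtype choice and the CoLA-style prompt are illustrative assumptions, not taken from the training setup; the exact prompt format used during fine-tuning is not documented in this card.

```python
# Minimal loading sketch; assumes access to the gated base model on the Hub.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_cola_42_1763630698",
    torch_dtype=torch.bfloat16,  # dtype is an assumption, not from the card
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# CoLA is a binary acceptability-judgment task; this prompt is a placeholder.
prompt = "Is the following sentence grammatically acceptable? The boy quickly ran."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```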

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
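
A minimal sketch of how these settings map onto transformers TrainingArguments, assuming a standard Trainer-based setup; the actual training script is not part of this card, and output_dir is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cola_42_1763630698",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```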

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.5901        | 0.5   | 1924  | 0.2601          | 173168            |
| 0.1216        | 1.0   | 3848  | 0.2143          | 346040            |
| 0.3276        | 1.5   | 5772  | 0.2487          | 518984            |
| 0.0274        | 2.0   | 7696  | 0.2422          | 692368            |
| 0.0056        | 2.5   | 9620  | 0.2262          | 866112            |
| 0.0069        | 3.0   | 11544 | 0.2263          | 1039080           |
| 0.0018        | 3.5   | 13468 | 0.2481          | 1212120           |
| 0.0046        | 4.0   | 15392 | 0.2439          | 1385192           |
| 0.242         | 4.5   | 17316 | 0.2580          | 1558248           |
| 0.002         | 5.0   | 19240 | 0.2663          | 1731824           |
| 0.2218        | 5.5   | 21164 | 0.2889          | 1904960           |
| 0.3473        | 6.0   | 23088 | 0.2948          | 2078408           |
| 0.003         | 6.5   | 25012 | 0.3017          | 2251848           |
| 0.1628        | 7.0   | 26936 | 0.2941          | 2424592           |
| 0.4409        | 7.5   | 28860 | 0.3241          | 2597104           |
| 0.1974        | 8.0   | 30784 | 0.3211          | 2770768           |
| 0.1135        | 8.5   | 32708 | 0.3176          | 2944224           |
| 0.0007        | 9.0   | 34632 | 0.3309          | 3117120           |
| 0.0013        | 9.5   | 36556 | 0.3303          | 3290224           |
| 0.0005        | 10.0  | 38480 | 0.3334          | 3463336           |
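
The reported evaluation loss of 0.2143 matches the epoch-1.0 checkpoint, the lowest validation loss in the run; validation loss rises over later epochs, which suggests the best early checkpoint was retained. With 38480 total steps over 10 epochs, the warmup ratio of 0.1 corresponds to about 3848 warmup steps, i.e. one epoch.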

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4