train_cola_42_1757596047

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoLA dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.2412
  • Num Input Tokens Seen: 6927000
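The card does not include a usage snippet, so here is a minimal inference sketch, assuming the adapter lives at rbelanec/train_cola_42_1757596047 (the repo id of this card) and that you have accepted the gated license for meta-llama/Meta-Llama-3-8B-Instruct. The prompt template used during training is not documented here, so the prompt below is an assumption.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"   # gated: requires accepted license + HF login
adapter_id = "rbelanec/train_cola_42_1757596047"  # this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs `accelerate`
)
model = PeftModel.from_pretrained(base, adapter_id)  # load the PEFT adapter on top of the base
model.eval()

# CoLA is a binary acceptability-judgment task; this prompt format is an assumption,
# not the template used during training.
prompt = (
    "Is the following sentence grammatically acceptable? Answer yes or no.\n"
    "Sentence: The cat sat on the mat."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```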

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
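For reference, the list above maps onto transformers.TrainingArguments roughly as follows. This is a reconstruction from the card, not the original training script; the output_dir and any PEFT/adapter configuration are assumptions and are not documented here.

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="train_cola_42_1757596047",  # assumed; not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```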

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.2546        | 1.0   | 3848  | 0.2480          | 346040            |
| 0.1205        | 2.0   | 7696  | 0.2484          | 692368            |
| 0.2615        | 3.0   | 11544 | 0.2438          | 1039080           |
| 0.2572        | 4.0   | 15392 | 0.2436          | 1385192           |
| 0.2552        | 5.0   | 19240 | 0.2432          | 1731824           |
| 0.3358        | 6.0   | 23088 | 0.2496          | 2078408           |
| 0.2235        | 7.0   | 26936 | 0.2438          | 2424592           |
| 0.2903        | 8.0   | 30784 | 0.2476          | 2770768           |
| 0.2715        | 9.0   | 34632 | 0.2459          | 3117120           |
| 0.2141        | 10.0  | 38480 | 0.2748          | 3463336           |
| 0.2359        | 11.0  | 42328 | 0.2426          | 3809536           |
| 0.316         | 12.0  | 46176 | 0.2439          | 4155688           |
| 0.3199        | 13.0  | 50024 | 0.2455          | 4502336           |
| 0.2547        | 14.0  | 53872 | 0.2459          | 4848864           |
| 0.2146        | 15.0  | 57720 | 0.2422          | 5194640           |
| 0.3529        | 16.0  | 61568 | 0.2419          | 5541160           |
| 0.2237        | 17.0  | 65416 | 0.2437          | 5887864           |
| 0.3058        | 18.0  | 69264 | 0.2429          | 6234216           |
| 0.2963        | 19.0  | 73112 | 0.2419          | 6580528           |
| 0.3099        | 20.0  | 76960 | 0.2412          | 6927000           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1