train_cola_101112_1760638044

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoLA (Corpus of Linguistic Acceptability) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1636
  • Num Input Tokens Seen: 7325256
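To try the adapter, a minimal loading sketch is below. It assumes the adapter is published as rbelanec/train_cola_101112_1760638044 and that you have access to the gated meta-llama base model; the dtype and device placement are illustrative choices, not part of this card.

```python
# Minimal sketch: attach the PEFT adapter to the base model for inference.
# Repo ids follow this card; dtype/device settings are assumptions.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = PeftModel.from_pretrained(base, "rbelanec/train_cola_101112_1760638044")
model.eval()
```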

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
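For reference, these settings map onto Transformers TrainingArguments roughly as sketched below. This is a reconstruction from the list above, not the actual training script; output_dir and the evaluation cadence are assumptions (the per-epoch results table below suggests evaluation every epoch).

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# output_dir and eval_strategy are assumptions; everything else is from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cola_101112_1760638044",  # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",  # assumed from the per-epoch table below
)
```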

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.2357 | 1.0 | 1924 | 0.2709 | 366136 |
| 0.2299 | 2.0 | 3848 | 0.1864 | 732880 |
| 0.111 | 3.0 | 5772 | 0.1455 | 1099816 |
| 0.1013 | 4.0 | 7696 | 0.1425 | 1465464 |
| 0.1245 | 5.0 | 9620 | 0.1338 | 1831728 |
| 0.1636 | 6.0 | 11544 | 0.1381 | 2198176 |
| 0.1655 | 7.0 | 13468 | 0.1348 | 2564208 |
| 0.035 | 8.0 | 15392 | 0.1460 | 2930240 |
| 0.0486 | 9.0 | 17316 | 0.1341 | 3297136 |
| 0.0396 | 10.0 | 19240 | 0.1487 | 3663392 |
| 0.0693 | 11.0 | 21164 | 0.1524 | 4028760 |
| 0.0187 | 12.0 | 23088 | 0.1645 | 4394320 |
| 0.0341 | 13.0 | 25012 | 0.1672 | 4761000 |
| 0.0522 | 14.0 | 26936 | 0.1779 | 5127440 |
| 0.1024 | 15.0 | 28860 | 0.1908 | 5494368 |
| 0.0535 | 16.0 | 30784 | 0.2188 | 5860888 |
| 0.0826 | 17.0 | 32708 | 0.2318 | 6226952 |
| 0.0033 | 18.0 | 34632 | 0.2388 | 6593400 |
| 0.0023 | 19.0 | 36556 | 0.2486 | 6959600 |
| 0.0222 | 20.0 | 38480 | 0.2490 | 7325256 |
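Validation loss bottoms out at 0.1338 at epoch 5 and rises steadily from around epoch 10 onward, so the later epochs overfit the training set. A quick plot of the curve (values copied verbatim from the table) makes the trend visible:

```python
# Visualize the validation-loss curve from the table above;
# the values are copied verbatim from the training log.
import matplotlib.pyplot as plt

val_loss = [0.2709, 0.1864, 0.1455, 0.1425, 0.1338, 0.1381, 0.1348,
            0.1460, 0.1341, 0.1487, 0.1524, 0.1645, 0.1672, 0.1779,
            0.1908, 0.2188, 0.2318, 0.2388, 0.2486, 0.2490]
plt.plot(range(1, 21), val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("Validation loss (minimum 0.1338 at epoch 5)")
plt.show()
```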

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4