train_wic_101112_1760638034

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wic dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4611
  • Num Input Tokens Seen: 8443576
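
This is a PEFT adapter on top of the base model, so it is loaded by first instantiating the base model and then attaching the adapter. Below is a minimal sketch, assuming the adapter repository id rbelanec/train_wic_101112_1760638034, that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights, and a generic prompt (the exact WiC prompt template used in training is not documented here).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_wic_101112_1760638034"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative prompt only; not the training template.
prompt = "Does the word 'bank' have the same meaning in both sentences? ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```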

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
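
As a rough guide, the listed values map onto transformers.TrainingArguments as in the sketch below; argument names follow the Transformers 4.51 API, and anything not listed above (such as output_dir) is an assumption.

```python
from transformers import TrainingArguments

# A minimal sketch of the training configuration implied by the list above.
training_args = TrainingArguments(
    output_dir="train_wic_101112_1760638034",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```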

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.4026        | 1.0   | 1222  | 0.4925          | 421736            |
| 0.4093        | 2.0   | 2444  | 0.4784          | 844280            |
| 0.4079        | 3.0   | 3666  | 0.4698          | 1266928           |
| 0.73          | 4.0   | 4888  | 0.4636          | 1689112           |
| 0.4841        | 5.0   | 6110  | 0.4682          | 2111392           |
| 0.4261        | 6.0   | 7332  | 0.4700          | 2533592           |
| 0.4682        | 7.0   | 8554  | 0.4644          | 2955304           |
| 0.6027        | 8.0   | 9776  | 0.4641          | 3377216           |
| 0.4048        | 9.0   | 10998 | 0.4659          | 3799208           |
| 0.3773        | 10.0  | 12220 | 0.4688          | 4221160           |
| 0.2559        | 11.0  | 13442 | 0.4663          | 4643512           |
| 0.3705        | 12.0  | 14664 | 0.4640          | 5066080           |
| 0.4083        | 13.0  | 15886 | 0.4644          | 5487840           |
| 0.4473        | 14.0  | 17108 | 0.4627          | 5910224           |
| 0.6863        | 15.0  | 18330 | 0.4650          | 6332064           |
| 0.5297        | 16.0  | 19552 | 0.4670          | 6754408           |
| 0.436         | 17.0  | 20774 | 0.4611          | 7176696           |
| 0.5006        | 18.0  | 21996 | 0.4628          | 7598912           |
| 0.4396        | 19.0  | 23218 | 0.4628          | 8021160           |
| 0.4385        | 20.0  | 24440 | 0.4628          | 8443576           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
