train_multirc_1755694495

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the multirc dataset. It achieves the following results on the evaluation set:

Loss: 0.4210
Num Input Tokens Seen: 117044976
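
Because the framework list includes PEFT, this checkpoint is presumably a parameter-efficient adapter rather than a full set of weights. A minimal loading sketch follows, assuming the adapter is published under the repository id rbelanec/train_multirc_1755694495 (taken from the model page, not verified here) and that you have access to the base model. The helper name load_adapter_model is hypothetical.

```python
# Sketch only: both ids are assumptions drawn from this card, not verified here.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_multirc_1755694495"


def load_adapter_model():
    """Load the base model and attach the fine-tuned adapter on top of it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    # PeftModel.from_pretrained wraps the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model


# Usage (downloads several GB of weights):
# tokenizer, model = load_adapter_model()
```

The base model is gated on the Hugging Face Hub, so downloading it may require accepting its license and authenticating first.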

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 123
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
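
To make the cosine schedule and warmup ratio concrete, here is a small sketch of how the learning rate would evolve under linear warmup followed by cosine decay to zero. The total step count is an assumption: the training results report roughly 12,260 steps per epoch, so 10 epochs gives about 122,600 steps. The function lr_at is illustrative and is not the training code itself.

```python
import math

BASE_LR = 5e-05        # learning_rate from the list above
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 122_600  # ASSUMPTION: ~12,260 steps per epoch x 10 epochs
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to BASE_LR over the warmup phase.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

Under these assumptions the peak learning rate is reached at the end of the first epoch, and the rate decays smoothly over the remaining nine.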

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.3106	0.5000	6130	0.3370	5867968
0.2456	1.0001	12260	0.2171	11717296
0.0183	1.5001	18390	0.1600	17588368
0.0191	2.0002	24520	0.1445	23419920
0.0116	2.5002	30650	0.1618	29254768
0.0151	3.0002	36780	0.1502	35127456
0.0071	3.5003	42910	0.1629	40997712
0.0549	4.0003	49040	0.1801	46843184
0.0045	4.5004	55170	0.1625	52692880
0.0028	5.0004	61300	0.1730	58550736
0.0007	5.5004	67430	0.2433	64403536
0.0019	6.0005	73560	0.2333	70253968
0.1469	6.5005	79690	0.2315	76098208
0.004	7.0006	85820	0.2299	81942784
0.0005	7.5006	91950	0.2834	87796256
0.0006	8.0007	98080	0.2994	93652576
0.0	8.5007	104210	0.3707	99519344
0.0	9.0007	110340	0.3787	105351168
0.0	9.5008	116470	0.4142	111213008
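
The table above can be scanned for the evaluation point with the lowest validation loss. The sketch below copies the epoch, step, and validation loss columns by hand; the names EvalRow and best_by_validation_loss are hypothetical helpers, not part of any library.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float


# Epoch, step, and validation loss copied from the training results table.
ROWS: List[EvalRow] = [
    EvalRow(0.5000, 6130, 0.3370), EvalRow(1.0001, 12260, 0.2171),
    EvalRow(1.5001, 18390, 0.1600), EvalRow(2.0002, 24520, 0.1445),
    EvalRow(2.5002, 30650, 0.1618), EvalRow(3.0002, 36780, 0.1502),
    EvalRow(3.5003, 42910, 0.1629), EvalRow(4.0003, 49040, 0.1801),
    EvalRow(4.5004, 55170, 0.1625), EvalRow(5.0004, 61300, 0.1730),
    EvalRow(5.5004, 67430, 0.2433), EvalRow(6.0005, 73560, 0.2333),
    EvalRow(6.5005, 79690, 0.2315), EvalRow(7.0006, 85820, 0.2299),
    EvalRow(7.5006, 91950, 0.2834), EvalRow(8.0007, 98080, 0.2994),
    EvalRow(8.5007, 104210, 0.3707), EvalRow(9.0007, 110340, 0.3787),
    EvalRow(9.5008, 116470, 0.4142),
]


def best_by_validation_loss(rows: List[EvalRow]) -> EvalRow:
    """Return the evaluation row with the smallest validation loss."""
    return min(rows, key=lambda row: row.val_loss)


best = best_by_validation_loss(ROWS)
print(f"lowest validation loss {best.val_loss} at step {best.step} (epoch {best.epoch})")
```

In this table the minimum falls near epoch 2, after which validation loss trends upward while training loss approaches zero. That pattern is consistent with overfitting, so an earlier checkpoint may generalize better than the final one, if such checkpoints were saved.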

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Model tree for rbelanec/train_multirc_1755694495

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model