train_siqa_1756735776

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the siqa dataset. It achieves the following results on the evaluation set:

Loss: 0.5492
Num Input Tokens Seen: 28646680

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 123
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
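
To make the learning-rate settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the listed learning_rate and lr_scheduler_warmup_ratio. It is an illustration of the general technique, not the exact code used for this run. TOTAL_STEPS is an assumption estimated from the training results table (about 7518 steps per half epoch over 10 epochs), and the names lr_at, TOTAL_STEPS, and WARMUP_STEPS are hypothetical.

```python
import math

BASE_LR = 5e-05        # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 150_360  # ASSUMPTION: estimated as 7518 steps per half epoch x 20

# Number of warmup steps implied by the ratio (rounded up).
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under warmup + cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0.
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate peaks at the end of warmup, roughly one epoch into training, and decays toward zero by the final step.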

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5663	0.5000	7518	0.5535	1434368
0.5459	1.0001	15036	0.5499	2864912
0.6257	1.5001	22554	0.5551	4297616
0.5411	2.0001	30072	0.5522	5729232
0.5326	2.5002	37590	0.5547	7161424
0.5501	3.0002	45108	0.5531	8594304
0.5861	3.5002	52626	0.5572	10025776
0.5616	4.0003	60144	0.5524	11458320
0.572	4.5003	67662	0.5502	12890928
0.5497	5.0003	75180	0.5521	14322784
0.552	5.5004	82698	0.5531	15756256
0.5462	6.0004	90216	0.5507	17189144
0.5767	6.5004	97734	0.5510	18622072
0.559	7.0005	105252	0.5498	20053992
0.5565	7.5005	112770	0.5491	21486200
0.5579	8.0005	120288	0.5505	22918992
0.5769	8.5006	127806	0.5495	24351152
0.528	9.0006	135324	0.5497	25783688
0.5405	9.5006	142842	0.5493	27216936
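
The table can be summarised with a short script. The sketch below copies the epoch, step, and validation loss columns from the table above, then reports the evaluation with the lowest validation loss and the spread between the highest and lowest values. The helper names best_checkpoint and loss_spread are hypothetical.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
ROWS = [
    (0.5000, 7518, 0.5535),
    (1.0001, 15036, 0.5499),
    (1.5001, 22554, 0.5551),
    (2.0001, 30072, 0.5522),
    (2.5002, 37590, 0.5547),
    (3.0002, 45108, 0.5531),
    (3.5002, 52626, 0.5572),
    (4.0003, 60144, 0.5524),
    (4.5003, 67662, 0.5502),
    (5.0003, 75180, 0.5521),
    (5.5004, 82698, 0.5531),
    (6.0004, 90216, 0.5507),
    (6.5004, 97734, 0.5510),
    (7.0005, 105252, 0.5498),
    (7.5005, 112770, 0.5491),
    (8.0005, 120288, 0.5505),
    (8.5006, 127806, 0.5495),
    (9.0006, 135324, 0.5497),
    (9.5006, 142842, 0.5493),
]


def best_checkpoint(rows):
    """Return the (epoch, step, loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def loss_spread(rows):
    """Return the difference between the highest and lowest validation loss."""
    losses = [row[2] for row in rows]
    return max(losses) - min(losses)


best = best_checkpoint(ROWS)
spread = loss_spread(ROWS)

if __name__ == "__main__":
    print(f"lowest validation loss: {best[2]} at step {best[1]} (epoch {best[0]})")
    print(f"spread across all evaluations: {spread:.4f}")
```

On these rows the lowest validation loss is 0.5491 at step 112770, and all evaluations fall within about 0.008 of each other, so the validation loss is close to flat over the ten epochs.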

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
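
Since the card lists PEFT among the framework versions, the weights are presumably published as an adapter on top of the base model. A minimal loading sketch under that assumption follows. The adapter repository ID is taken from this page, the function name load_model_with_adapter is hypothetical, and the card does not confirm that this is the intended loading path. Access to the base model may also need to be requested on the Hub before the download succeeds.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_siqa_1756735776"  # repository ID as shown on this page


def load_model_with_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the fine-tuned adapter (assumed PEFT format)."""
    # Imported lazily so the module can be inspected without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling load_model_with_adapter() downloads both repositories, so it needs network access and enough memory for an 8B-parameter model.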


Model tree for rbelanec/train_siqa_1756735776

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
