# train_siqa_1755694504
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the siqa dataset. It achieves the following results on the evaluation set:
- Loss: 0.5506
- Num Input Tokens Seen: 28646680
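The card does not ship usage code, so the snippet below is only a minimal sketch of loading this PEFT adapter on top of the listed base model with Transformers and PEFT. The prompt shown is illustrative; the exact prompt format the adapter was trained with is not documented here.

```python
# Minimal sketch (not from the model card): load the PEFT adapter on top of the
# base model. The chat-template prompt below is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_siqa_1755694504"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Example social-commonsense question goes here."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```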
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
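The training script itself is not included in this card; the sketch below only maps the hyperparameters above onto Hugging Face `TrainingArguments`. The `output_dir` value is illustrative, and the PEFT/LoRA configuration and dataset preprocessing are not reconstructed here.

```python
# Sketch only: the listed hyperparameters expressed as TrainingArguments.
# The actual training script and PEFT/LoRA settings are not documented.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_siqa_1755694504",  # illustrative, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```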
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.5636 | 0.5000 | 7518 | 0.5540 | 1434368 |
| 0.542 | 1.0001 | 15036 | 0.5497 | 2864912 |
| 0.6198 | 1.5001 | 22554 | 0.5551 | 4297616 |
| 0.5434 | 2.0001 | 30072 | 0.5521 | 5729232 |
| 0.529 | 2.5002 | 37590 | 0.5554 | 7161424 |
| 0.5494 | 3.0002 | 45108 | 0.5522 | 8594304 |
| 0.59 | 3.5002 | 52626 | 0.5577 | 10025776 |
| 0.5675 | 4.0003 | 60144 | 0.5522 | 11458320 |
| 0.5674 | 4.5003 | 67662 | 0.5504 | 12890928 |
| 0.5455 | 5.0003 | 75180 | 0.5519 | 14322784 |
| 0.5454 | 5.5004 | 82698 | 0.5526 | 15756256 |
| 0.5474 | 6.0004 | 90216 | 0.5505 | 17189144 |
| 0.5716 | 6.5004 | 97734 | 0.5511 | 18622072 |
| 0.5484 | 7.0005 | 105252 | 0.5505 | 20053992 |
| 0.5424 | 7.5005 | 112770 | 0.5502 | 21486200 |
| 0.5734 | 8.0005 | 120288 | 0.5509 | 22918992 |
| 0.5587 | 8.5006 | 127806 | 0.5504 | 24351152 |
| 0.5356 | 9.0006 | 135324 | 0.5506 | 25783688 |
| 0.5627 | 9.5006 | 142842 | 0.5502 | 27216936 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1