train_svamp_789_1757596132

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

Loss: 0.1505
Num Input Tokens Seen: 1349424

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 789
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.6716	1.0	315	0.6802	67504
0.1795	2.0	630	0.2406	135040
0.1411	3.0	945	0.1102	202528
0.1363	4.0	1260	0.1078	269840
0.0399	5.0	1575	0.1041	337408
0.0005	6.0	1890	0.1423	404880
0.0472	7.0	2205	0.1482	472240
0.0222	8.0	2520	0.1383	539744
0.0004	9.0	2835	0.1402	607456
0.0	10.0	3150	0.1834	674784
0.0001	11.0	3465	0.1627	742336
0.0001	12.0	3780	0.1488	809792
0.0	13.0	4095	0.1551	877248
0.0	14.0	4410	0.1505	944752
0.0	15.0	4725	0.1477	1012272
0.0	16.0	5040	0.1495	1079744
0.0	17.0	5355	0.1502	1147088
0.0	18.0	5670	0.1516	1214432
0.0	19.0	5985	0.1515	1282000
0.0	20.0	6300	0.1505	1349424

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Downloads last month: 3

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_svamp_789_1757596132

Base model

meta-llama/Meta-Llama-3-8B-Instruct

Adapter

(2099)

this model

rbelanec
/

train_svamp_789_1757596132

train_svamp_789_1757596132

Model description

Intended uses & limitations

Training and evaluation data

Training procedure

Training hyperparameters

Training results

Framework versions

Model tree for rbelanec/train_svamp_789_1757596132

Evaluation results