train_sst2_101112_1760638076

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

Loss: 0.1172
Num Input Tokens Seen: 67752208

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 4
eval_batch_size: 4
seed: 101112
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
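
To make the schedule settings concrete, the sketch below computes the learning rate implied by a cosine scheduler with a 0.1 warmup ratio and a peak rate of 0.001. It is an illustration under assumptions, not the training code: the total step count (303080) is the final step in the training results table, the warmup length is assumed to be the warmup ratio times that total, and the linear-warmup, half-cosine-decay formula is the commonly used one. The names `lr_at`, `PEAK_LR`, `WARMUP_STEPS`, and `TOTAL_STEPS` are hypothetical.

```python
import math

# Values taken from the hyperparameter list and the training results table.
PEAK_LR = 0.001
WARMUP_RATIO = 0.1
TOTAL_STEPS = 303080  # final step reported for epoch 20

# Assumption: warmup length is the warmup ratio times the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the peak learning rate down to 0.
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {lr_at(step):.6f}")
```

Under these assumptions the warmup covers the first 30308 steps, which corresponds to the first two epochs at 15154 steps per epoch.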

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.1291	1.0	15154	0.0678	3387376
0.0439	2.0	30308	0.0576	6776272
0.0327	3.0	45462	0.0582	10162816
0.0116	4.0	60616	0.0543	13553408
0.0188	5.0	75770	0.0597	16938528
0.0051	6.0	90924	0.0538	20326208
0.1168	7.0	106078	0.0544	23713072
0.0266	8.0	121232	0.0561	27100096
0.039	9.0	136386	0.0560	30487024
0.1492	10.0	151540	0.0623	33871648
0.0169	11.0	166694	0.0603	37260608
0.0079	12.0	181848	0.0692	40646320
0.003	13.0	197002	0.0675	44035072
0.0642	14.0	212156	0.0748	47423936
0.0285	15.0	227310	0.0781	50810512
0.0018	16.0	242464	0.0851	54201824
0.0027	17.0	257618	0.0945	57588768
0.0014	18.0	272772	0.1026	60979040
0.0012	19.0	287926	0.1049	64364064
0.0026	20.0	303080	0.1052	67752208
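
The table shows validation loss reaching its minimum well before the final epoch and then rising while training loss keeps falling, which is the usual sign of overfitting. The sketch below locates the epoch with the lowest validation loss, using values copied from the table. Whether a checkpoint from that epoch was saved or published is not stated in this card, and the helper name `best_epoch` is hypothetical.

```python
# (epoch, validation loss) pairs copied from the training results table.
VALIDATION_LOSS = [
    (1, 0.0678), (2, 0.0576), (3, 0.0582), (4, 0.0543), (5, 0.0597),
    (6, 0.0538), (7, 0.0544), (8, 0.0561), (9, 0.0560), (10, 0.0623),
    (11, 0.0603), (12, 0.0692), (13, 0.0675), (14, 0.0748), (15, 0.0781),
    (16, 0.0851), (17, 0.0945), (18, 0.1026), (19, 0.1049), (20, 0.1052),
]


def best_epoch(history):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(history, key=lambda row: row[1])


if __name__ == "__main__":
    epoch, loss = best_epoch(VALIDATION_LOSS)
    final_epoch, final_loss = VALIDATION_LOSS[-1]
    print(f"lowest validation loss: {loss:.4f} at epoch {epoch}")
    print(f"final validation loss:  {final_loss:.4f} at epoch {final_epoch}")
```

By this measure, epoch 6 (validation loss 0.0538) is the best point in the run, and the final epoch ends at roughly twice that loss (0.1052).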

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
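
Since the model was trained with PEFT, it is presumably distributed as an adapter that is applied on top of the base model. The following is a minimal loading sketch under that assumption, using the repository id rbelanec/train_sst2_101112_1760638076 and the base model named above. The function name `load_adapter` is hypothetical, and the prompt format used during fine-tuning is not documented in this card, so no inference prompt is shown.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_sst2_101112_1760638076"


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (a sketch, not verified
    against this repository)."""
    # Imported lazily so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter()` downloads the 8B base model, which requires accepting the base model's license on the Hub and enough memory to hold the weights.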
