train_cola_456_1757596096

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set at the end of training (epoch 20):

Loss: 1.1780
Num Input Tokens Seen: 6925896

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
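
To make the scheduler settings above concrete, here is a minimal sketch of a linear-warmup, cosine-decay learning rate curve. It assumes the warmup length is the warmup ratio applied to the total number of optimizer steps (76960, the final Step value in the training results table) and that the decay runs to zero over half a cosine period. The names `lr_at`, `WARMUP_STEPS`, and `TOTAL_STEPS` are illustrative and do not come from the training code.

```python
import math

BASE_LR = 5e-05        # learning_rate from the card
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 76960    # final Step value in the training results table

# Assumption: warmup length is the warmup ratio applied to the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup and cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the base learning rate down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 3848, 7696, 42328, 76960):
        print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions the peak learning rate is reached at step 7696, which coincides with the end of epoch 2 in the results table.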

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.273	1.0	3848	0.2536	346216
0.2121	2.0	7696	0.2523	692896
0.2276	3.0	11544	0.2543	1039432
0.1208	4.0	15392	0.2731	1385744
0.1824	5.0	19240	0.2530	1732008
0.2493	6.0	23088	0.2525	2078472
0.1641	7.0	26936	0.2586	2425080
0.227	8.0	30784	0.2558	2771400
0.2494	9.0	34632	0.2570	3117888
0.2	10.0	38480	0.2596	3463936
0.4975	11.0	42328	0.2685	3809992
0.3292	12.0	46176	0.3268	4156000
0.3055	13.0	50024	0.3519	4502216
0.3318	14.0	53872	0.3920	4848576
0.0367	15.0	57720	0.4937	5194888
0.1322	16.0	61568	0.5821	5541144
0.153	17.0	65416	0.8123	5887264
0.0056	18.0	69264	1.0105	6233416
0.1486	19.0	73112	1.1386	6579848
0.2454	20.0	76960	1.1780	6925896
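
The table shows validation loss staying near 0.25 for the first ten epochs and then rising through epoch 20, so the headline loss of 1.1780 is the final-epoch value rather than the lowest one. The sketch below finds the epoch with the lowest validation loss and lists the epochs that exceed it by a chosen margin. The values are copied from the table; the helper names and the 10% margin are illustrative assumptions.

```python
# (epoch, validation_loss) pairs copied from the training results table.
VAL_LOSS = [
    (1, 0.2536), (2, 0.2523), (3, 0.2543), (4, 0.2731), (5, 0.2530),
    (6, 0.2525), (7, 0.2586), (8, 0.2558), (9, 0.2570), (10, 0.2596),
    (11, 0.2685), (12, 0.3268), (13, 0.3519), (14, 0.3920), (15, 0.4937),
    (16, 0.5821), (17, 0.8123), (18, 1.0105), (19, 1.1386), (20, 1.1780),
]


def best_epoch(history):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(history, key=lambda row: row[1])


def epochs_above(history, factor):
    """Return epochs whose validation loss exceeds `factor` times the best loss."""
    best_loss = best_epoch(history)[1]
    return [epoch for epoch, loss in history if loss > factor * best_loss]


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"lowest validation loss: {loss:.4f} at epoch {epoch}")
    # The 10% margin is an arbitrary illustrative threshold.
    print("epochs more than 10% above the best:", epochs_above(VAL_LOSS, 1.1))
```

A check like this is one way to decide whether an earlier checkpoint would be preferable to the final one, if intermediate checkpoints are available.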

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
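
Since the card lists PEFT among the framework versions, the weights are most likely published as an adapter on top of the base model. The following is a hedged sketch of how such an adapter is typically attached using `peft` and `transformers`. It assumes the repository `rbelanec/train_cola_456_1757596096` holds a PEFT adapter loadable this way and that you have access to the gated base model. The function name `load_adapter` is illustrative. The prompt format used during fine-tuning is not documented in the card, so no generation example is given.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # base model named in the card
ADAPTER_ID = "rbelanec/train_cola_456_1757596096"      # repository id of this model


def load_adapter(base_model_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Return (tokenizer, model) with the PEFT adapter attached to the base model."""
    # Imported inside the function so the heavy dependencies are only
    # required when the model is actually loaded.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Assumption: the repository contains adapter weights in PEFT format.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter()` downloads the full 8B base model, so it needs substantial disk space and memory.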

Model tree for rbelanec/train_cola_456_1757596096

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
