train_cola_1757340233

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set; both values match the final row (epoch 20) of the training results table:

Loss: 0.8409
Num Input Tokens Seen: 6920344

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 789
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
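
To make the scheduler settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the card's learning_rate, lr_scheduler_warmup_ratio, and the final step count (76960) from the training results table. The function name `cosine_lr`, the rounding rule for the warmup length, and the assumption that the scheduler's total equals 76960 steps are illustrative choices, not details confirmed by the card. The shape follows the usual linear-warmup cosine decay (as in, for example, Transformers' `get_cosine_schedule_with_warmup`), but whether the run used exactly that implementation is an assumption.

```python
import math

PEAK_LR = 5e-05       # learning_rate from the card
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 76960   # last step in the training results table (assumed scheduler total)


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    # Assumption: warmup length is the ratio times the total, rounded to an integer.
    warmup_steps = round(warmup_ratio * total_steps)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from the peak down to zero.
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Steps at the end of epochs 1, 2, 11, and 20, taken from the results table.
for step in (0, 3848, 7696, 42328, 76960):
    print(step, cosine_lr(step))
```

Under these assumptions the warmup lasts 7696 steps, which is the step count the results table reports at epoch 2.0, so the peak learning rate would be reached at the end of the second epoch.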

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.3085	1.0	3848	0.2561	345704
0.2631	2.0	7696	0.2533	691408
0.2945	3.0	11544	0.2753	1037864
0.1402	4.0	15392	0.2886	1383872
0.187	5.0	19240	0.2485	1729688
0.2234	6.0	23088	0.2514	2075456
0.2122	7.0	26936	0.2496	2421448
0.3578	8.0	30784	0.2499	2767560
0.1831	9.0	34632	0.2535	3113600
0.1986	10.0	38480	0.2760	3459744
0.2155	11.0	42328	0.2652	3805704
0.3169	12.0	46176	0.3165	4151712
0.2616	13.0	50024	0.3440	4497816
0.0289	14.0	53872	0.3798	4843560
0.1456	15.0	57720	0.4600	5189648
0.1306	16.0	61568	0.5392	5535664
0.0079	17.0	65416	0.6758	5882024
0.2242	18.0	69264	0.7435	6228384
0.0231	19.0	73112	0.8285	6574288
0.0348	20.0	76960	0.8409	6920344
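
The table shows validation loss rising steadily in the later epochs while training loss keeps falling. As a small sketch of how to read it programmatically, the snippet below parses the rows above and picks the epoch with the lowest validation loss. The variable names (`RESULTS`, `records`, `best`) are hypothetical, and the card does not state which checkpoint was actually published; the headline loss of 0.8409 corresponds to the epoch 20 row.

```python
# Rows copied from the training results table:
# training loss, epoch, step, validation loss, input tokens seen.
RESULTS = """\
0.3085 1.0 3848 0.2561 345704
0.2631 2.0 7696 0.2533 691408
0.2945 3.0 11544 0.2753 1037864
0.1402 4.0 15392 0.2886 1383872
0.187 5.0 19240 0.2485 1729688
0.2234 6.0 23088 0.2514 2075456
0.2122 7.0 26936 0.2496 2421448
0.3578 8.0 30784 0.2499 2767560
0.1831 9.0 34632 0.2535 3113600
0.1986 10.0 38480 0.2760 3459744
0.2155 11.0 42328 0.2652 3805704
0.3169 12.0 46176 0.3165 4151712
0.2616 13.0 50024 0.3440 4497816
0.0289 14.0 53872 0.3798 4843560
0.1456 15.0 57720 0.4600 5189648
0.1306 16.0 61568 0.5392 5535664
0.0079 17.0 65416 0.6758 5882024
0.2242 18.0 69264 0.7435 6228384
0.0231 19.0 73112 0.8285 6574288
0.0348 20.0 76960 0.8409 6920344
"""

records = []
for line in RESULTS.splitlines():
    train_loss, epoch, step, val_loss, tokens = line.split()
    records.append({
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "tokens": int(tokens),
    })

# Checkpoint with the lowest validation loss, and the final one for comparison.
best = min(records, key=lambda r: r["val_loss"])
final = records[-1]

print("best epoch:", best["epoch"], "step:", best["step"], "val_loss:", best["val_loss"])
print("final epoch:", final["epoch"], "val_loss:", final["val_loss"])
```

On these numbers the minimum falls at epoch 5 (step 19240, validation loss 0.2485), which suggests that an earlier checkpoint or early stopping may generalize better than the final one.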

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Model tree for rbelanec/train_cola_1757340233

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
