train_copa_456_1757596114

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set at the end of training (epoch 20, step 3600):

Loss: 1.1002
Num Input Tokens Seen: 547792

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
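
As a rough illustration of how the cosine schedule and the warmup ratio interact, the sketch below recomputes the learning rate at a given optimizer step in plain Python. It assumes 3600 total optimizer steps (the last step reported in the training results), a linear warmup over the first 10% of them, and a half-cosine decay to zero afterwards, which mirrors the usual Transformers behaviour. The name cosine_lr is hypothetical and not part of any library.

```python
import math

BASE_LR = 5e-05       # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 3600    # assumption: final step reported in the training results


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              base_lr: float = BASE_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for s in (0, 180, 360, 1980, 3600):
        print(f"step {s:>4}: lr = {cosine_lr(s):.3e}")
```

Under these assumptions the peak learning rate is reached at step 360, i.e. at the end of epoch 2, and decays to zero by the final step.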

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2953	1.0	180	0.2306	27376
0.2406	2.0	360	0.2338	54800
0.254	3.0	540	0.2441	82256
0.2126	4.0	720	0.2456	109680
0.2399	5.0	900	0.2420	137040
0.2275	6.0	1080	0.2374	164336
0.2029	7.0	1260	0.2474	191760
0.2183	8.0	1440	0.2454	219200
0.2397	9.0	1620	0.2568	246592
0.2287	10.0	1800	0.2484	273936
0.2819	11.0	1980	0.2490	301328
0.1011	12.0	2160	0.3574	328704
0.1685	13.0	2340	0.3697	356080
0.3555	14.0	2520	0.9776	383488
0.004	15.0	2700	0.9718	410832
0.0001	16.0	2880	1.0573	438176
0.0001	17.0	3060	1.0890	465632
0.0	18.0	3240	1.0990	492976
0.0	19.0	3420	1.0898	520384
0.0	20.0	3600	1.1002	547792
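
The table shows validation loss at its lowest after the first epoch and rising sharply from epoch 12 onward while training loss falls to zero, a pattern that usually indicates overfitting. The sketch below copies the epoch, training loss, and validation loss columns from the table and locates the best epoch. The 1.5x degradation threshold and the helper names are arbitrary illustrative choices, not part of the original training setup.

```python
# (epoch, training_loss, validation_loss), copied from the training results table
RESULTS = [
    (1, 0.2953, 0.2306), (2, 0.2406, 0.2338), (3, 0.254, 0.2441),
    (4, 0.2126, 0.2456), (5, 0.2399, 0.2420), (6, 0.2275, 0.2374),
    (7, 0.2029, 0.2474), (8, 0.2183, 0.2454), (9, 0.2397, 0.2568),
    (10, 0.2287, 0.2484), (11, 0.2819, 0.2490), (12, 0.1011, 0.3574),
    (13, 0.1685, 0.3697), (14, 0.3555, 0.9776), (15, 0.004, 0.9718),
    (16, 0.0001, 1.0573), (17, 0.0001, 1.0890), (18, 0.0, 1.0990),
    (19, 0.0, 1.0898), (20, 0.0, 1.1002),
]


def best_epoch(results):
    """Return the (epoch, train_loss, val_loss) row with the lowest validation loss."""
    return min(results, key=lambda row: row[2])


def first_degraded_epoch(results, factor=1.5):
    """Return the first epoch whose validation loss exceeds factor * best, or None."""
    threshold = factor * best_epoch(results)[2]
    for epoch, _train_loss, val_loss in results:
        if val_loss > threshold:
            return epoch
    return None


if __name__ == "__main__":
    epoch, _, val = best_epoch(RESULTS)
    final_val = RESULTS[-1][2]
    print(f"best validation loss {val} at epoch {epoch}")
    print(f"final validation loss {final_val} ({final_val / val:.2f}x the best)")
    print(f"first epoch above 1.5x the best: {first_degraded_epoch(RESULTS)}")
```

If intermediate checkpoints were saved, one from the early epochs would likely generalize better than the final weights. Whether such checkpoints exist is not stated here.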

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
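
Since PEFT is among the listed frameworks, this repository presumably holds adapter weights rather than a full copy of the model. A minimal loading sketch under that assumption follows. The repository id rbelanec/train_copa_456_1757596114 is taken from the model page, the bfloat16 dtype is an illustrative choice, and load_adapter_model is a hypothetical helper name. The base model is gated on the Hugging Face Hub, so access must be granted before the download will succeed.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_copa_456_1757596114"  # assumption: adapter repo id from the model page


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the fine-tuned adapter on top of it."""
    # Imported lazily so the constants above can be used without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,  # assumption: half precision to reduce memory use
    )
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_adapter_model()
    print(type(model).__name__)
```

The prompt format used during fine-tuning is not documented here, so any inference prompt would have to be reconstructed from the training code.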

Model tree for rbelanec/train_copa_456_1757596114

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
