# train_mrpc_101112_1760638022
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the MRPC (Microsoft Research Paraphrase Corpus) dataset. It achieves the following results on the evaluation set:
- Loss: 0.2019
- Num Input Tokens Seen: 6767120
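Since this is a PEFT adapter on top of Meta-Llama-3-8B-Instruct, it can be used by attaching the adapter to the base model. Below is a minimal loading sketch, assuming the adapter weights live in this repo (`rbelanec/train_mrpc_101112_1760638022`); the MRPC prompt template used during fine-tuning is not documented on this card, so the example prompt is only a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_101112_1760638022"

# Load the base model, then attach the fine-tuned PEFT adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Placeholder prompt: the exact MRPC formatting used in training is not documented.
prompt = "Sentence 1: ...\nSentence 2: ...\nAre these two sentences paraphrases?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```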
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
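These settings map directly onto 🤗 Transformers `TrainingArguments`. A minimal sketch of how they could be reproduced (the `output_dir` name and the `Trainer` wiring around it are assumptions, not taken from this card):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above. With optim="adamw_torch",
# betas=(0.9, 0.999) and epsilon=1e-08 are already the optimizer defaults.
training_args = TrainingArguments(
    output_dir="train_mrpc_101112_1760638022",  # assumption: placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```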
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.2905 | 1.0 | 826 | 0.2125 | 339144 |
| 0.1383 | 2.0 | 1652 | 0.2078 | 677000 |
| 0.1752 | 3.0 | 2478 | 0.2053 | 1015816 |
| 0.2262 | 4.0 | 3304 | 0.2042 | 1354072 |
| 0.2431 | 5.0 | 4130 | 0.2020 | 1692608 |
| 0.2919 | 6.0 | 4956 | 0.2020 | 2030416 |
| 0.1903 | 7.0 | 5782 | 0.2044 | 2369136 |
| 0.2176 | 8.0 | 6608 | 0.2024 | 2707920 |
| 0.2401 | 9.0 | 7434 | 0.2032 | 3046392 |
| 0.3443 | 10.0 | 8260 | 0.2047 | 3383592 |
| 0.2325 | 11.0 | 9086 | 0.2026 | 3722064 |
| 0.2625 | 12.0 | 9912 | 0.2027 | 4060512 |
| 0.2929 | 13.0 | 10738 | 0.2042 | 4398224 |
| 0.2417 | 14.0 | 11564 | 0.2022 | 4737184 |
| 0.1231 | 15.0 | 12390 | 0.2019 | 5075136 |
| 0.2862 | 16.0 | 13216 | 0.2028 | 5413384 |
| 0.3427 | 17.0 | 14042 | 0.2023 | 5751600 |
| 0.1236 | 18.0 | 14868 | 0.2055 | 6090384 |
| 0.2249 | 19.0 | 15694 | 0.2054 | 6429416 |
| 0.1636 | 20.0 | 16520 | 0.2053 | 6767120 |
### Framework versions

- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4