train_stsb_101112_1760638040

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 4.7234
  • Num Input Tokens Seen: 8712528
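
Because this checkpoint is a PEFT adapter rather than a full model, it has to be loaded on top of the base model. A minimal loading sketch follows, assuming the peft and transformers versions listed under "Framework versions" below; the dtype and device placement are illustrative assumptions, not settings recorded in this card.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the (gated) base model; bfloat16 and device_map="auto" are
# illustrative assumptions, not settings taken from this card.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the fine-tuned adapter weights from this repository.
model = PeftModel.from_pretrained(base_model, "rbelanec/train_stsb_101112_1760638040")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```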

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
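
As a reproducibility aid, the listed values map onto transformers.TrainingArguments roughly as below. This is a hedged reconstruction, not the actual training script: the output_dir and any argument not named in the list above are assumptions.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the listed hyperparameters; output_dir and any
# setting not named in the list above are assumptions.
training_args = TrainingArguments(
    output_dir="train_stsb_101112_1760638040",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```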

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 5.1474        | 1.0   | 1294  | 4.9125          | 434624            |
| 4.4252        | 2.0   | 2588  | 4.7650          | 869056            |
| 4.6605        | 3.0   | 3882  | 4.7327          | 1304160           |
| 4.9587        | 4.0   | 5176  | 4.7326          | 1740416           |
| 4.2984        | 5.0   | 6470  | 4.7394          | 2175568           |
| 4.6637        | 6.0   | 7764  | 4.7338          | 2611168           |
| 4.7358        | 7.0   | 9058  | 4.7271          | 3047200           |
| 4.8885        | 8.0   | 10352 | 4.7332          | 3482720           |
| 4.8576        | 9.0   | 11646 | 4.7345          | 3918416           |
| 4.3534        | 10.0  | 12940 | 4.7457          | 4355072           |
| 4.7233        | 11.0  | 14234 | 4.7340          | 4790336           |
| 4.5093        | 12.0  | 15528 | 4.7355          | 5227040           |
| 4.7466        | 13.0  | 16822 | 4.7388          | 5662848           |
| 4.8084        | 14.0  | 18116 | 4.7329          | 6099600           |
| 4.7322        | 15.0  | 19410 | 4.7352          | 6534256           |
| 4.9552        | 16.0  | 20704 | 4.7373          | 6968992           |
| 4.5146        | 17.0  | 21998 | 4.7239          | 7405040           |
| 4.9070        | 18.0  | 23292 | 4.7365          | 7840784           |
| 4.6263        | 19.0  | 24586 | 4.7234          | 8276160           |
| 4.6279        | 20.0  | 25880 | 4.7234          | 8712528           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4