train_wsc_101112_1760446104

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wsc (Winograd Schema Challenge) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4362
  • Num Input Tokens Seen: 1471184

Model description

More information needed

Intended uses & limitations

More information needed
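
The snippet below is a minimal sketch of loading this adapter for inference with the PEFT and Transformers libraries. It assumes the adapter is hosted at rbelanec/train_wsc_101112_1760446104 and that access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model has been granted; the example prompt is purely illustrative, since the prompt format used during fine-tuning is not documented in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_wsc_101112_1760446104"  # adapter repo for this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative Winograd-style query; the training prompt format is an assumption.
prompt = "The trophy does not fit in the suitcase because it is too big. What is too big?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```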

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 30
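
As a rough illustration, the hyperparameters above might map onto a Hugging Face TrainingArguments configuration along these lines. The actual training script is not included in the card, so anything beyond the listed values (for example the output directory and the exact argument mapping) is an assumption.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above; other arguments are left at defaults.
training_args = TrainingArguments(
    output_dir="train_wsc_101112_1760446104",  # assumed output location
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=30,
)
```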

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.3629        | 1.504  | 188  | 0.4066          | 74288             |
| 0.3344        | 3.008  | 376  | 0.4253          | 147040            |
| 0.3523        | 4.512  | 564  | 0.3512          | 221408            |
| 0.3645        | 6.016  | 752  | 0.3525          | 294736            |
| 0.3869        | 7.52   | 940  | 0.3689          | 368400            |
| 0.3533        | 9.024  | 1128 | 0.3554          | 441968            |
| 0.3493        | 10.528 | 1316 | 0.3552          | 514960            |
| 0.3368        | 12.032 | 1504 | 0.3549          | 588032            |
| 0.3421        | 13.536 | 1692 | 0.3556          | 662784            |
| 0.3505        | 15.04  | 1880 | 0.3548          | 735760            |
| 0.3536        | 16.544 | 2068 | 0.3628          | 809088            |
| 0.3419        | 18.048 | 2256 | 0.3575          | 883568            |
| 0.3534        | 19.552 | 2444 | 0.3595          | 958720            |
| 0.3313        | 21.056 | 2632 | 0.3684          | 1031776           |
| 0.3376        | 22.56  | 2820 | 0.3696          | 1105632           |
| 0.3543        | 24.064 | 3008 | 0.3914          | 1179856           |
| 0.3551        | 25.568 | 3196 | 0.4112          | 1253280           |
| 0.328         | 27.072 | 3384 | 0.4254          | 1327824           |
| 0.2946        | 28.576 | 3572 | 0.4310          | 1400944           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
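
A small convenience sketch for checking that a local environment matches the versions listed above; exact patch-level matches are likely not required for inference.

```python
import datasets
import peft
import tokenizers
import torch
import transformers

# Print installed versions to compare against the framework versions listed above.
for name, module in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```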