roberta-nepali-sequence-ged

This model is a fine-tuned version of IRIIS-RESEARCH/RoBERTa_Nepali_125M on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.1973
Model Preparation Time: 0.002
Accuracy: 0.9231
Precision: 0.9222
Recall: 0.9326
F1: 0.9274
Precision Correct: 0.9242
Recall Correct: 0.9127
F1 Correct: 0.9184
Precision Incorrect: 0.9222
Recall Incorrect: 0.9326
F1 Incorrect: 0.9274
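
The overall Precision, Recall, and F1 values above coincide with the per-class figures for Incorrect, which suggests (an assumption, the card does not say so) that Incorrect is treated as the positive class. As a quick sanity check, the sketch below recomputes each per-class F1 from the reported precision and recall using F1 = 2PR / (P + R). The helper name `f1_score` is illustrative and not part of the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results above (rounded to 4 decimals).
reported = {
    "Incorrect": {"precision": 0.9222, "recall": 0.9326, "f1": 0.9274},
    "Correct": {"precision": 0.9242, "recall": 0.9127, "f1": 0.9184},
}

for label, m in reported.items():
    recomputed = f1_score(m["precision"], m["recall"])
    # The inputs are already rounded, so small differences in the last digit are expected.
    print(f"{label}: reported F1={m['f1']:.4f}, recomputed F1={recomputed:.4f}")
```

Both recomputed values agree with the reported F1 scores to within rounding, so the per-class numbers are internally consistent.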

Model description

More information needed

Intended uses & limitations

More information needed
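
Although the intended use is not documented, the metric names (Correct / Incorrect) and the model name suggest sentence-level classification of Nepali text, possibly grammatical error detection; that reading is an assumption. A minimal sketch of loading the checkpoint with the Transformers `pipeline` API might look like the following. The helpers `load_classifier`, `top_labels`, and `demo` are hypothetical names, and the label strings returned depend on the checkpoint's configuration, which this card does not list.

```python
from typing import Dict, List, Tuple

# Repository id as listed on this model card.
MODEL_ID = "DipeshChaudhary/roberta-nepali-sequence-ged"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline; weights are downloaded on first use."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_labels(predictions: List[Dict]) -> List[Tuple[str, float]]:
    """Reduce pipeline output to (label, score) pairs, one per input sentence."""
    return [(p["label"], float(p["score"])) for p in predictions]


def demo() -> None:
    """Hypothetical usage; needs network access and the transformers package."""
    classifier = load_classifier()
    sentences = ["म विद्यालय जान्छु।"]  # example Nepali input ("I go to school.")
    # Label names come from the checkpoint's config (id2label), not from this card.
    for sentence, (label, score) in zip(sentences, top_labels(classifier(sentences))):
        print(f"{sentence} -> {label} ({score:.3f})")
```

Calling `demo()` downloads the model and prints one predicted label per sentence. Inspect `classifier.model.config.id2label` to see which label corresponds to Correct and which to Incorrect.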

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 512
eval_batch_size: 1024
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 1024
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 2
mixed_precision_training: Native AMP
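
For readers who want to reproduce a similar run, the hyperparameters above map roughly onto `transformers.TrainingArguments` as sketched below. Several points are assumptions rather than facts from this card: that training ran on a single device (so the listed batch sizes are per-device values), that "Native AMP" means `fp16` rather than `bf16`, and the `output_dir` name, which is purely illustrative.

```python
# Hyperparameters copied from the list above; keys follow transformers.TrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 512,  # assumes a single device
    "per_device_eval_batch_size": 1024,  # assumes a single device
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 1000,
    "num_train_epochs": 2,
    "fp16": True,  # assumption: "Native AMP" could also refer to bf16
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir: str = "ged-output"):
    """Create TrainingArguments; fp16=True requires a supported accelerator."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))  # 512 * 2 * 1 = 1024
```

The computed effective batch size matches the reported total_train_batch_size of 1024, which is consistent with the single-device assumption.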

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	Accuracy	Precision	Recall	F1	Precision Correct	Recall Correct	F1 Correct	Precision Incorrect	Recall Incorrect	F1 Incorrect
0.2734	0.1016	1000	0.2748	0.002	0.8894	0.8951	0.8946	0.8949	0.8831	0.8836	0.8833	0.8951	0.8946	0.8949
0.2302	0.2031	2000	0.2455	0.002	0.9026	0.9049	0.9106	0.9078	0.9001	0.8937	0.8969	0.9049	0.9106	0.9078
0.2169	0.3047	3000	0.2462	0.002	0.9016	0.8918	0.9252	0.9082	0.9134	0.8753	0.8939	0.8918	0.9252	0.9082
0.2101	0.4062	4000	0.2315	0.002	0.9086	0.9047	0.9236	0.9140	0.9131	0.8920	0.9024	0.9047	0.9236	0.9140
0.2052	0.5078	5000	0.2234	0.002	0.9124	0.9131	0.9212	0.9171	0.9117	0.9026	0.9071	0.9131	0.9212	0.9171
0.2003	0.6094	6000	0.2248	0.002	0.9100	0.9024	0.9294	0.9157	0.9189	0.8885	0.9034	0.9024	0.9294	0.9157
0.1987	0.7109	7000	0.2187	0.002	0.9131	0.9074	0.9298	0.9184	0.9199	0.8946	0.9071	0.9074	0.9298	0.9184
0.1965	0.8125	8000	0.2105	0.002	0.9180	0.9189	0.9260	0.9224	0.9171	0.9092	0.9131	0.9189	0.9260	0.9224
0.1939	0.9140	9000	0.2129	0.002	0.9166	0.9126	0.9306	0.9215	0.9212	0.9010	0.9110	0.9126	0.9306	0.9215
0.1896	1.0155	10000	0.2055	0.002	0.9198	0.9206	0.9277	0.9241	0.9190	0.9111	0.9150	0.9206	0.9277	0.9241
0.1796	1.1171	11000	0.2065	0.002	0.9188	0.9169	0.9301	0.9234	0.9211	0.9064	0.9137	0.9169	0.9301	0.9234
0.1788	1.2187	12000	0.2058	0.002	0.9192	0.9164	0.9314	0.9238	0.9224	0.9056	0.9139	0.9164	0.9314	0.9238
0.1787	1.3202	13000	0.2018	0.002	0.9212	0.9204	0.9307	0.9255	0.9221	0.9106	0.9163	0.9204	0.9307	0.9255
0.1774	1.4218	14000	0.2038	0.002	0.9206	0.9177	0.9328	0.9252	0.9240	0.9072	0.9155	0.9177	0.9328	0.9252
0.1767	1.5233	15000	0.1940	0.002	0.9251	0.9309	0.9263	0.9286	0.9186	0.9237	0.9211	0.9309	0.9263	0.9286
0.1785	1.6249	16000	0.1943	0.002	0.9245	0.9283	0.9282	0.9283	0.9203	0.9204	0.9204	0.9283	0.9282	0.9283
0.1761	1.7265	17000	0.1957	0.002	0.9237	0.9253	0.9301	0.9277	0.9220	0.9166	0.9193	0.9253	0.9301	0.9277
0.1760	1.8280	18000	0.1960	0.002	0.9240	0.9253	0.9307	0.9280	0.9225	0.9165	0.9195	0.9253	0.9307	0.9280
0.1761	1.9296	19000	0.1973	0.002	0.9231	0.9222	0.9326	0.9274	0.9242	0.9127	0.9184	0.9222	0.9326	0.9274
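
The table can be read programmatically to locate the best checkpoint and to estimate the size of the training run. The sketch below copies the Epoch, Step, Validation Loss, and Accuracy columns. The steps-per-epoch and examples-per-epoch figures are estimates that assume the Epoch column tracks the fraction of the training set consumed and that every optimizer step uses the reported total batch size of 1024.

```python
# (epoch, step, validation_loss, accuracy) copied from the training results table.
ROWS = [
    (0.1016, 1000, 0.2748, 0.8894),
    (0.2031, 2000, 0.2455, 0.9026),
    (0.3047, 3000, 0.2462, 0.9016),
    (0.4062, 4000, 0.2315, 0.9086),
    (0.5078, 5000, 0.2234, 0.9124),
    (0.6094, 6000, 0.2248, 0.9100),
    (0.7109, 7000, 0.2187, 0.9131),
    (0.8125, 8000, 0.2105, 0.9180),
    (0.9140, 9000, 0.2129, 0.9166),
    (1.0155, 10000, 0.2055, 0.9198),
    (1.1171, 11000, 0.2065, 0.9188),
    (1.2187, 12000, 0.2058, 0.9192),
    (1.3202, 13000, 0.2018, 0.9212),
    (1.4218, 14000, 0.2038, 0.9206),
    (1.5233, 15000, 0.1940, 0.9251),
    (1.6249, 16000, 0.1943, 0.9245),
    (1.7265, 17000, 0.1957, 0.9237),
    (1.8280, 18000, 0.1960, 0.9240),
    (1.9296, 19000, 0.1973, 0.9231),
]

TOTAL_TRAIN_BATCH_SIZE = 1024  # from the hyperparameter list

# Checkpoint with the lowest validation loss.
best = min(ROWS, key=lambda row: row[2])
last = ROWS[-1]

# Estimate optimizer steps per epoch from the final logged row.
steps_per_epoch = last[1] / last[0]
examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print(f"best step: {best[1]} (val loss {best[2]:.4f}, accuracy {best[3]:.4f})")
print(f"last step: {last[1]} (val loss {last[2]:.4f}, accuracy {last[3]:.4f})")
print(f"estimated steps per epoch: {steps_per_epoch:.0f}")
print(f"estimated examples per epoch: {examples_per_epoch:,.0f}")
```

The lowest validation loss (0.1940) and the highest accuracy (0.9251) both occur at step 15000, while the headline metrics at the top of this card correspond to the final evaluation at step 19000.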

Framework versions

Transformers 4.57.1
Pytorch 2.8.0+cu128
Datasets 4.4.1
Tokenizers 0.22.1

Safetensors

Model size: 0.1B params
Tensor type: F32

Model tree for DipeshChaudhary/roberta-nepali-sequence-ged

Base model: IRIIS-RESEARCH/RoBERTa_Nepali_125M
