roberta-nepali-sequence-ged

This model is a fine-tuned version of IRIIS-RESEARCH/RoBERTa_Nepali_125M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1973
  • Model Preparation Time: 0.002
  • Accuracy: 0.9231
  • Precision: 0.9222
  • Recall: 0.9326
  • F1: 0.9274
  • Precision Correct: 0.9242
  • Recall Correct: 0.9127
  • F1 Correct: 0.9184
  • Precision Incorrect: 0.9222
  • Recall Incorrect: 0.9326
  • F1 Incorrect: 0.9274
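
The per-class "Correct"/"Incorrect" scores above indicate a sequence-level binary classifier for Nepali grammatical error detection (GED): given a sentence, the model predicts whether it is grammatically correct or incorrect. Below is a minimal inference sketch; it assumes the checkpoint is published as DipeshChaudhary/roberta-nepali-sequence-ged and that the label names can be read from the model config (check config.id2label before relying on the "correct"/"incorrect" mapping).

```python
# Minimal inference sketch (not the author's reference code).
# Assumptions: checkpoint id "DipeshChaudhary/roberta-nepali-sequence-ged",
# binary labels exposed via model.config.id2label.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "DipeshChaudhary/roberta-nepali-sequence-ged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

sentence = "<Nepali sentence to check>"  # replace with an actual Nepali sentence
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze(0)
pred_id = int(probs.argmax())
print(model.config.id2label[pred_id], float(probs[pred_id]))
```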

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 512
  • eval_batch_size: 1024
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 1024
  • optimizer: AdamW (ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 2
  • mixed_precision_training: Native AMP
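
For reference, a sketch of how these hyperparameters could be expressed as transformers.TrainingArguments is shown below. The output directory, evaluation cadence, and the fp16 flag are assumptions (the card only states "Native AMP"); the rest mirrors the list above.

```python
# Hyperparameter sketch only; not the author's training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-nepali-sequence-ged",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=1024,
    gradient_accumulation_steps=2,             # 512 x 2 = total train batch size 1024
    num_train_epochs=2,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    seed=42,
    optim="adamw_torch_fused",                 # AdamW, betas=(0.9, 0.999), eps=1e-08
    fp16=True,                                 # assumed form of the "Native AMP" setting
    eval_strategy="steps",                     # assumed; the results below log every 1000 steps
    eval_steps=1000,
)
```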

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Accuracy | Precision | Recall | F1 | Precision Correct | Recall Correct | F1 Correct | Precision Incorrect | Recall Incorrect | F1 Incorrect |
|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:--------:|:---------:|:------:|:--:|:-----------------:|:--------------:|:----------:|:-------------------:|:----------------:|:------------:|
| 0.2734 | 0.1016 | 1000 | 0.2748 | 0.002 | 0.8894 | 0.8951 | 0.8946 | 0.8949 | 0.8831 | 0.8836 | 0.8833 | 0.8951 | 0.8946 | 0.8949 |
| 0.2302 | 0.2031 | 2000 | 0.2455 | 0.002 | 0.9026 | 0.9049 | 0.9106 | 0.9078 | 0.9001 | 0.8937 | 0.8969 | 0.9049 | 0.9106 | 0.9078 |
| 0.2169 | 0.3047 | 3000 | 0.2462 | 0.002 | 0.9016 | 0.8918 | 0.9252 | 0.9082 | 0.9134 | 0.8753 | 0.8939 | 0.8918 | 0.9252 | 0.9082 |
| 0.2101 | 0.4062 | 4000 | 0.2315 | 0.002 | 0.9086 | 0.9047 | 0.9236 | 0.9140 | 0.9131 | 0.8920 | 0.9024 | 0.9047 | 0.9236 | 0.9140 |
| 0.2052 | 0.5078 | 5000 | 0.2234 | 0.002 | 0.9124 | 0.9131 | 0.9212 | 0.9171 | 0.9117 | 0.9026 | 0.9071 | 0.9131 | 0.9212 | 0.9171 |
| 0.2003 | 0.6094 | 6000 | 0.2248 | 0.002 | 0.9100 | 0.9024 | 0.9294 | 0.9157 | 0.9189 | 0.8885 | 0.9034 | 0.9024 | 0.9294 | 0.9157 |
| 0.1987 | 0.7109 | 7000 | 0.2187 | 0.002 | 0.9131 | 0.9074 | 0.9298 | 0.9184 | 0.9199 | 0.8946 | 0.9071 | 0.9074 | 0.9298 | 0.9184 |
| 0.1965 | 0.8125 | 8000 | 0.2105 | 0.002 | 0.9180 | 0.9189 | 0.9260 | 0.9224 | 0.9171 | 0.9092 | 0.9131 | 0.9189 | 0.9260 | 0.9224 |
| 0.1939 | 0.9140 | 9000 | 0.2129 | 0.002 | 0.9166 | 0.9126 | 0.9306 | 0.9215 | 0.9212 | 0.9010 | 0.9110 | 0.9126 | 0.9306 | 0.9215 |
| 0.1896 | 1.0155 | 10000 | 0.2055 | 0.002 | 0.9198 | 0.9206 | 0.9277 | 0.9241 | 0.9190 | 0.9111 | 0.9150 | 0.9206 | 0.9277 | 0.9241 |
| 0.1796 | 1.1171 | 11000 | 0.2065 | 0.002 | 0.9188 | 0.9169 | 0.9301 | 0.9234 | 0.9211 | 0.9064 | 0.9137 | 0.9169 | 0.9301 | 0.9234 |
| 0.1788 | 1.2187 | 12000 | 0.2058 | 0.002 | 0.9192 | 0.9164 | 0.9314 | 0.9238 | 0.9224 | 0.9056 | 0.9139 | 0.9164 | 0.9314 | 0.9238 |
| 0.1787 | 1.3202 | 13000 | 0.2018 | 0.002 | 0.9212 | 0.9204 | 0.9307 | 0.9255 | 0.9221 | 0.9106 | 0.9163 | 0.9204 | 0.9307 | 0.9255 |
| 0.1774 | 1.4218 | 14000 | 0.2038 | 0.002 | 0.9206 | 0.9177 | 0.9328 | 0.9252 | 0.9240 | 0.9072 | 0.9155 | 0.9177 | 0.9328 | 0.9252 |
| 0.1767 | 1.5233 | 15000 | 0.1940 | 0.002 | 0.9251 | 0.9309 | 0.9263 | 0.9286 | 0.9186 | 0.9237 | 0.9211 | 0.9309 | 0.9263 | 0.9286 |
| 0.1785 | 1.6249 | 16000 | 0.1943 | 0.002 | 0.9245 | 0.9283 | 0.9282 | 0.9283 | 0.9203 | 0.9204 | 0.9204 | 0.9283 | 0.9282 | 0.9283 |
| 0.1761 | 1.7265 | 17000 | 0.1957 | 0.002 | 0.9237 | 0.9253 | 0.9301 | 0.9277 | 0.9220 | 0.9166 | 0.9193 | 0.9253 | 0.9301 | 0.9277 |
| 0.176 | 1.8280 | 18000 | 0.1960 | 0.002 | 0.9240 | 0.9253 | 0.9307 | 0.9280 | 0.9225 | 0.9165 | 0.9195 | 0.9253 | 0.9307 | 0.9280 |
| 0.1761 | 1.9296 | 19000 | 0.1973 | 0.002 | 0.9231 | 0.9222 | 0.9326 | 0.9274 | 0.9242 | 0.9127 | 0.9184 | 0.9222 | 0.9326 | 0.9274 |
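
The headline Precision/Recall/F1 columns match the Incorrect-class columns, so the positive class appears to be "incorrect". A hedged sketch of a compute_metrics function that would produce these columns, assuming integer labels with 0 = correct and 1 = incorrect, is:

```python
# Metric sketch (assumed label mapping: 0 = correct, 1 = incorrect).
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # Per-class scores: index 0 -> "correct", index 1 -> "incorrect"
    p, r, f1, _ = precision_recall_fscore_support(
        labels, preds, labels=[0, 1], zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": p[1], "recall": r[1], "f1": f1[1],  # "incorrect" treated as positive class
        "precision_correct": p[0], "recall_correct": r[0], "f1_correct": f1[0],
        "precision_incorrect": p[1], "recall_incorrect": r[1], "f1_incorrect": f1[1],
    }
```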

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu128
  • Datasets 4.4.1
  • Tokenizers 0.22.1