stlenc-mod-temp0.3

This model is a fine-tuned version of saracandu/stlenc-mod-temp0.3 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7258

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 512
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 15
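
The arithmetic behind these values can be sketched as follows. This is a minimal illustration, not the training script: the single-device setting, the absence of warmup, and the figure of 687 optimizer steps per epoch (inferred from step 250 falling at epoch 0.3639 in the results table) are assumptions, and `linear_lr` is a hypothetical helper name.

```python
# Hyperparameters copied from the list above.
train_batch_size = 128
gradient_accumulation_steps = 4
learning_rate = 5e-06
num_epochs = 15

# Assumption: a single device, so there is no extra multiplier.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Assumption: steps per epoch inferred from the results table,
# where optimizer step 250 corresponds to epoch 0.3639.
steps_per_epoch = round(250 / 0.3639)
total_steps = steps_per_epoch * num_epochs


def linear_lr(step, warmup_steps=0):
    """Linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption: no warmup is listed above.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("steps per epoch:", steps_per_epoch, "total steps:", total_steps)
print("lr at halfway point:", linear_lr(total_steps // 2))
```

Under these assumptions the effective batch size matches the reported 512, and the learning rate falls linearly from 5e-06 to zero over the run.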

Training results

Training Loss   Epoch     Step    Validation Loss
1.0152          0.3639    250     1.5929
1.0018          0.7278    500     1.3667
0.9798          1.0917    750     1.2744
0.9726          1.4556    1000    1.2397
0.9678          1.8195    1250    1.1935
0.9656          2.1834    1500    1.1832
0.9629          2.5473    1750    1.1704
0.9565          2.9112    2000    1.1772
0.9543          3.2751    2250    1.1584
0.9519          3.6390    2500    1.1552
0.9387          4.0029    2750    1.1178
0.8481          4.3668    3000    1.0497
0.7371          4.7307    3250    0.9688
0.7112          5.0946    3500    0.9343
0.6876          5.4585    3750    0.9121
0.6665          5.8224    4000    0.8882
0.6646          6.1863    4250    0.8854
0.6566          6.5502    4500    0.8672
0.6448          6.9141    4750    0.8607
0.6345          7.2780    5000    0.8554
0.6295          7.6419    5250    0.8440
0.6174          8.0058    5500    0.8480
0.6192          8.3697    5750    0.8369
0.6155          8.7336    6000    0.8363
0.6154          9.0975    6250    0.8290
0.5997          9.4614    6500    0.8132
0.6013          9.8253    6750    0.8206
0.5894          10.1892   7000    0.7885
0.5889          10.5531   7250    0.7830
0.5883          10.9170   7500    0.7855
0.5823          11.2809   7750    0.7717
0.5742          11.6448   8000    0.7672
0.5718          12.0087   8250    0.7515
0.567           12.3726   8500    0.7506
0.5707          12.7365   8750    0.7438
0.57            13.1004   9000    0.7426
0.5609          13.4643   9250    0.7388
0.5664          13.8282   9500    0.7330
0.5595          14.1921   9750    0.7284
0.5606          14.5560   10000   0.7209
0.5598          14.9199   10250   0.7258
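
The reported evaluation loss (0.7258) matches the last logged row, not the lowest validation loss in the table. The sketch below checks this on the final five rows, copied by hand from the table. The card does not say which checkpoint was published, so this is only a way to read the log.

```python
# Last five rows of the results table:
# (training loss, epoch, step, validation loss).
rows = [
    (0.5609, 13.4643, 9250, 0.7388),
    (0.5664, 13.8282, 9500, 0.7330),
    (0.5595, 14.1921, 9750, 0.7284),
    (0.5606, 14.5560, 10000, 0.7209),
    (0.5598, 14.9199, 10250, 0.7258),
]

# Row with the lowest validation loss, and the last logged row.
best = min(rows, key=lambda row: row[3])
final = rows[-1]

print(f"lowest validation loss: {best[3]} at step {best[2]}")
print(f"final validation loss:  {final[3]} at step {final[2]}")
```

The earlier rows all have higher validation loss than these five, so the minimum found here is also the minimum of the whole table.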

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.1

Model size: 0.2B params (Safetensors, tensor type F32)