train_stsb_1745333593

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4850
  • Num Input Tokens Seen: 54490336

The reported loss is the best validation loss in the training log below (0.4850 at step 6600, epoch ≈ 20.4); validation loss climbs steadily afterward and reaches 0.9949 by step 40000, which suggests the headline figure comes from the best checkpoint rather than the final one.

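How to use

A minimal inference sketch, assuming the repository holds a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct; the prompt format below is illustrative, since the template used during training is not documented here:

```python
# Load the base model, attach the adapter, and score a sentence pair.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_stsb_1745333593"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Illustrative STS-B-style prompt: rate the similarity of two sentences on a 0-5 scale.
prompt = (
    "Rate the semantic similarity of the two sentences on a scale from 0 to 5.\n"
    "Sentence 1: A man is playing a guitar.\n"
    "Sentence 2: A person is playing an instrument.\n"
    "Score:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
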
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000

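As a reproduction aid, here is a sketch of how the listed values map onto transformers.TrainingArguments. Only the values in the list above come from this card; the output directory, save/eval cadence, and checkpoint-selection flags are assumptions:

```python
# Sketch: the hyperparameters above expressed as transformers.TrainingArguments.
# Values taken from the card are marked; everything else is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_stsb_1745333593",  # assumption: run name reused as output dir
    learning_rate=5e-05,                 # from the card
    per_device_train_batch_size=4,       # train_batch_size
    per_device_eval_batch_size=4,        # eval_batch_size
    seed=123,
    gradient_accumulation_steps=4,       # 4 * 4 = 16 = total_train_batch_size (single device)
    optim="adamw_torch",                 # betas=(0.9, 0.999) and epsilon=1e-08 are its defaults
    lr_scheduler_type="cosine",
    max_steps=40000,                     # training_steps
    eval_strategy="steps",
    eval_steps=200,                      # matches the 200-step cadence of the results table
    save_steps=200,                      # assumption: required to align saving with evaluation
    load_best_model_at_end=True,         # assumption: would explain the 0.4850 headline loss
    metric_for_best_model="loss",
)
```
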
Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.6638 0.6182 200 0.7643 272576
0.5347 1.2349 400 0.6415 544096
0.5704 1.8532 600 0.6039 818048
0.4631 2.4699 800 0.5765 1089600
0.4388 3.0866 1000 0.5634 1361504
0.693 3.7048 1200 0.5462 1636960
0.4535 4.3215 1400 0.5389 1909696
0.4323 4.9397 1600 0.5290 2182656
0.3798 5.5564 1800 0.5207 2453904
0.4876 6.1731 2000 0.5214 2727984
0.4517 6.7913 2200 0.5114 2999760
0.5385 7.4080 2400 0.5062 3274528
0.4074 8.0247 2600 0.5102 3546880
0.4173 8.6430 2800 0.5032 3821184
0.489 9.2597 3000 0.5034 4090704
0.4075 9.8779 3200 0.5005 4363696
0.4296 10.4946 3400 0.4953 4636656
0.3792 11.1113 3600 0.4945 4908928
0.3687 11.7295 3800 0.4972 5179040
0.3789 12.3462 4000 0.4960 5452192
0.4376 12.9645 4200 0.4948 5724448
0.4317 13.5811 4400 0.4928 5998032
0.4002 14.1978 4600 0.4929 6269792
0.476 14.8161 4800 0.4920 6541248
0.3445 15.4328 5000 0.4919 6815200
0.3755 16.0495 5200 0.4891 7086224
0.3578 16.6677 5400 0.4908 7360560
0.3744 17.2844 5600 0.4911 7632240
0.3955 17.9026 5800 0.4870 7904432
0.3865 18.5193 6000 0.4902 8177168
0.3805 19.1360 6200 0.4934 8449968
0.4311 19.7543 6400 0.4869 8722992
0.3465 20.3709 6600 0.4850 8996224
0.3858 20.9892 6800 0.4895 9269504
0.3605 21.6059 7000 0.4894 9542432
0.4013 22.2226 7200 0.4904 9812704
0.3508 22.8408 7400 0.4930 10086272
0.3407 23.4575 7600 0.4894 10358832
0.3689 24.0742 7800 0.4912 10630000
0.3787 24.6924 8000 0.4932 10904880
0.3438 25.3091 8200 0.4945 11176208
0.3577 25.9274 8400 0.4950 11451344
0.3586 26.5440 8600 0.4966 11723328
0.3423 27.1607 8800 0.5096 11996224
0.392 27.7790 9000 0.4955 12267520
0.3147 28.3957 9200 0.5021 12542064
0.3367 29.0124 9400 0.5015 12812048
0.3615 29.6306 9600 0.5037 13085264
0.3649 30.2473 9800 0.4999 13356384
0.3335 30.8655 10000 0.5066 13629216
0.3558 31.4822 10200 0.5121 13902736
0.3038 32.0989 10400 0.5135 14174192
0.3651 32.7172 10600 0.5102 14448176
0.2963 33.3338 10800 0.5215 14718096
0.3443 33.9521 11000 0.5185 14992048
0.2942 34.5688 11200 0.5177 15265072
0.3032 35.1855 11400 0.5257 15538960
0.3385 35.8037 11600 0.5200 15812880
0.3497 36.4204 11800 0.5251 16082608
0.3119 37.0371 12000 0.5319 16357888
0.3515 37.6553 12200 0.5314 16627872
0.2948 38.2720 12400 0.5314 16900336
0.3536 38.8903 12600 0.5321 17175024
0.3559 39.5070 12800 0.5341 17446864
0.2944 40.1236 13000 0.5415 17716560
0.267 40.7419 13200 0.5411 17991792
0.2937 41.3586 13400 0.5442 18262992
0.2906 41.9768 13600 0.5475 18536880
0.2847 42.5935 13800 0.5486 18806784
0.2651 43.2102 14000 0.5650 19080608
0.3004 43.8284 14200 0.5563 19352320
0.2882 44.4451 14400 0.5577 19624544
0.3009 45.0618 14600 0.5654 19896064
0.2607 45.6801 14800 0.5691 20168064
0.2597 46.2968 15000 0.5689 20440208
0.2702 46.9150 15200 0.5700 20713296
0.2881 47.5317 15400 0.5809 20985744
0.2614 48.1484 15600 0.5856 21257920
0.2963 48.7666 15800 0.5834 21529248
0.2637 49.3833 16000 0.5892 21800992
0.2584 50.0 16200 0.5929 22073392
0.273 50.6182 16400 0.5978 22345648
0.2521 51.2349 16600 0.6023 22617984
0.2449 51.8532 16800 0.6144 22892544
0.2814 52.4699 17000 0.6152 23163488
0.2463 53.0866 17200 0.6220 23438320
0.2816 53.7048 17400 0.6195 23708720
0.2481 54.3215 17600 0.6271 23984304
0.2311 54.9397 17800 0.6256 24256368
0.258 55.5564 18000 0.6307 24527040
0.2913 56.1731 18200 0.6434 24799312
0.2648 56.7913 18400 0.6451 25072848
0.2268 57.4080 18600 0.6544 25347056
0.2213 58.0247 18800 0.6534 25618400
0.2426 58.6430 19000 0.6535 25892960
0.1879 59.2597 19200 0.6707 26164688
0.2532 59.8779 19400 0.6717 26437392
0.2191 60.4946 19600 0.6703 26710176
0.2675 61.1113 19800 0.6764 26981728
0.2558 61.7295 20000 0.6822 27253632
0.2305 62.3462 20200 0.6964 27524928
0.2387 62.9645 20400 0.6923 27799712
0.1951 63.5811 20600 0.7081 28071024
0.2427 64.1978 20800 0.7150 28342880
0.2567 64.8161 21000 0.7228 28617696
0.2557 65.4328 21200 0.7182 28888112
0.2216 66.0495 21400 0.7119 29162944
0.2295 66.6677 21600 0.7291 29434784
0.1668 67.2844 21800 0.7314 29706800
0.2651 67.9026 22000 0.7252 29980240
0.21 68.5193 22200 0.7525 30250192
0.2106 69.1360 22400 0.7529 30522672
0.172 69.7543 22600 0.7453 30795024
0.1719 70.3709 22800 0.7751 31066544
0.1707 70.9892 23000 0.7745 31338128
0.1801 71.6059 23200 0.7800 31609104
0.2754 72.2226 23400 0.7910 31881424
0.1745 72.8408 23600 0.7791 32155024
0.1767 73.4575 23800 0.7874 32425312
0.199 74.0742 24000 0.7898 32698784
0.2037 74.6924 24200 0.8119 32974144
0.2096 75.3091 24400 0.8035 33245216
0.1574 75.9274 24600 0.8163 33517088
0.1923 76.5440 24800 0.8089 33788432
0.1803 77.1607 25000 0.8337 34060416
0.1691 77.7790 25200 0.8249 34333408
0.1939 78.3957 25400 0.8377 34605392
0.1815 79.0124 25600 0.8315 34879536
0.1638 79.6306 25800 0.8423 35153488
0.1984 80.2473 26000 0.8502 35424912
0.1305 80.8655 26200 0.8434 35698064
0.1657 81.4822 26400 0.8636 35968160
0.1752 82.0989 26600 0.8575 36240928
0.2094 82.7172 26800 0.8681 36514208
0.1408 83.3338 27000 0.8701 36785136
0.1945 83.9521 27200 0.8848 37061648
0.1189 84.5688 27400 0.8792 37333648
0.1421 85.1855 27600 0.8936 37605184
0.176 85.8037 27800 0.8887 37875360
0.177 86.4204 28000 0.8847 38150208
0.1926 87.0371 28200 0.8942 38422048
0.2273 87.6553 28400 0.8948 38692224
0.1891 88.2720 28600 0.8979 38964176
0.1589 88.8903 28800 0.9041 39235184
0.1536 89.5070 29000 0.9115 39507520
0.1432 90.1236 29200 0.9220 39779328
0.1498 90.7419 29400 0.9144 40051520
0.1378 91.3586 29600 0.9267 40322576
0.1903 91.9768 29800 0.9267 40596016
0.1529 92.5935 30000 0.9272 40867568
0.189 93.2102 30200 0.9355 41140848
0.1464 93.8284 30400 0.9404 41412848
0.1292 94.4451 30600 0.9393 41683920
0.1466 95.0618 30800 0.9418 41959008
0.1593 95.6801 31000 0.9377 42231520
0.1176 96.2968 31200 0.9493 42502416
0.1401 96.9150 31400 0.9456 42776304
0.1378 97.5317 31600 0.9464 43048176
0.1371 98.1484 31800 0.9534 43320144
0.1492 98.7666 32000 0.9617 43591728
0.1374 99.3833 32200 0.9644 43866048
0.1248 100.0 32400 0.9672 44137040
0.1303 100.6182 32600 0.9570 44408848
0.1603 101.2349 32800 0.9661 44682912
0.1306 101.8532 33000 0.9627 44956000
0.1348 102.4699 33200 0.9677 45227824
0.1561 103.0866 33400 0.9671 45498320
0.1498 103.7048 33600 0.9746 45773648
0.1882 104.3215 33800 0.9667 46044128
0.1162 104.9397 34000 0.9666 46317504
0.1419 105.5564 34200 0.9691 46589024
0.1153 106.1731 34400 0.9769 46863680
0.1434 106.7913 34600 0.9752 47135520
0.1279 107.4080 34800 0.9835 47407056
0.1606 108.0247 35000 0.9815 47680112
0.1347 108.6430 35200 0.9813 47951632
0.1386 109.2597 35400 0.9760 48224016
0.1182 109.8779 35600 0.9836 48497072
0.1284 110.4946 35800 0.9799 48768624
0.144 111.1113 36000 0.9807 49041488
0.1322 111.7295 36200 0.9841 49314352
0.1475 112.3462 36400 0.9811 49584848
0.1575 112.9645 36600 0.9861 49858864
0.1715 113.5811 36800 0.9859 50130000
0.1008 114.1978 37000 0.9796 50404128
0.1341 114.8161 37200 0.9906 50678112
0.1298 115.4328 37400 0.9836 50946800
0.1605 116.0495 37600 0.9842 51219680
0.1178 116.6677 37800 0.9880 51492544
0.1211 117.2844 38000 0.9920 51764160
0.1631 117.9026 38200 0.9970 52039488
0.1161 118.5193 38400 0.9933 52311648
0.1555 119.1360 38600 0.9975 52584960
0.1456 119.7543 38800 0.9864 52855712
0.13 120.3709 39000 0.9861 53128480
0.1392 120.9892 39200 0.9897 53401056
0.1638 121.6059 39400 0.9863 53673600
0.1318 122.2226 39600 0.9917 53943712
0.1471 122.8408 39800 0.9935 54217344
0.1613 123.4575 40000 0.9949 54490336

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
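
To reproduce this environment, the versions above can be pinned directly; a sketch (the CUDA 12.4 wheel index URL is an assumption about how this PyTorch build was installed):

```
pip install peft==0.15.1 transformers==4.51.3 datasets==3.5.0 tokenizers==0.21.1
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
```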