rbelanec/train_wsc_1754652157

+---
+library_name: peft
+license: llama3
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+tags:
+- llama-factory
+- generated_from_trainer
+model-index:
+- name: train_wsc_1754652157
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# train_wsc_1754652157
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3862
+- Num Input Tokens Seen: 469104
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 123
+- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 10.0
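To make the `lr_scheduler_type: cosine` and `lr_scheduler_warmup_ratio: 0.1` settings concrete, here is a minimal sketch of a linear-warmup, cosine-decay schedule in plain Python. It is not the Trainer's own code. The total of 1250 optimizer steps is an assumption inferred from the results table (63 steps at epoch 0.504 implies about 125 steps per epoch, times 10 epochs), and the function name `cosine_lr` is hypothetical.

```python
import math

BASE_LR = 5e-05      # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 1250   # assumption: ~125 steps/epoch x 10 epochs, inferred from the results table


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 63, 125, 630, 1250):
        print(f"step {step:>4}: lr = {cosine_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at step 125 and decays towards zero by the final step, which is consistent with the validation loss flattening out in the later epochs.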
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
+|:-------------:|:------:|:----:|:---------------:|:-----------------:|
+| 13.9838       | 0.504  | 63   | 13.8711         | 25504             |
+| 9.9251        | 1.008  | 126  | 9.7372          | 49696             |
+| 6.0466        | 1.512  | 189  | 5.3090          | 74112             |
+| 1.6788        | 2.016  | 252  | 1.9802          | 99136             |
+| 1.0818        | 2.52   | 315  | 0.7968          | 123904            |
+| 0.8394        | 3.024  | 378  | 0.5601          | 148736            |
+| 0.5184        | 3.528  | 441  | 0.4782          | 174432            |
+| 0.3853        | 4.032  | 504  | 0.4613          | 198656            |
+| 0.4549        | 4.536  | 567  | 0.4388          | 224032            |
+| 0.4193        | 5.04   | 630  | 0.4215          | 247424            |
+| 0.3691        | 5.544  | 693  | 0.4073          | 271232            |
+| 0.3746        | 6.048  | 756  | 0.4005          | 295728            |
+| 0.4270        | 6.552  | 819  | 0.4005          | 320464            |
+| 0.3347        | 7.056  | 882  | 0.4107          | 345856            |
+| 0.3310        | 7.56   | 945  | 0.4089          | 371040            |
+| 0.4144        | 8.064  | 1008 | 0.3849          | 395216            |
+| 0.3779        | 8.568  | 1071 | 0.3869          | 419184            |
+| 0.3714        | 9.072  | 1134 | 0.3899          | 444560            |
+| 0.3858        | 9.576  | 1197 | 0.3862          | 469104            |
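The table can be read programmatically to recover two facts the card does not state outright: the approximate number of optimizer steps per epoch, and which evaluation step had the lowest validation loss. The sketch below copies the (epoch, step, validation loss) columns from the table; the variable names are illustrative only.

```python
# (epoch, step, validation_loss) copied from the training results table.
ROWS = [
    (0.504, 63, 13.8711), (1.008, 126, 9.7372), (1.512, 189, 5.3090),
    (2.016, 252, 1.9802), (2.52, 315, 0.7968), (3.024, 378, 0.5601),
    (3.528, 441, 0.4782), (4.032, 504, 0.4613), (4.536, 567, 0.4388),
    (5.04, 630, 0.4215), (5.544, 693, 0.4073), (6.048, 756, 0.4005),
    (6.552, 819, 0.4005), (7.056, 882, 0.4107), (7.56, 945, 0.4089),
    (8.064, 1008, 0.3849), (8.568, 1071, 0.3869), (9.072, 1134, 0.3899),
    (9.576, 1197, 0.3862),
]

# Steps per epoch, estimated from the first logged row.
steps_per_epoch = round(ROWS[0][1] / ROWS[0][0])

# Evaluation point with the lowest validation loss, and the last logged one.
best_epoch, best_step, best_loss = min(ROWS, key=lambda row: row[2])
final_epoch, final_step, final_loss = ROWS[-1]

if __name__ == "__main__":
    print(f"steps per epoch (approx.): {steps_per_epoch}")
    print(f"best:  step {best_step} (epoch {best_epoch}), val loss {best_loss}")
    print(f"final: step {final_step} (epoch {final_epoch}), val loss {final_loss}")
```

On these rows the lowest validation loss is 0.3849 at step 1008, slightly below the 0.3862 at step 1197 that the card reports as its evaluation result.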
+### Framework versions
+- PEFT 0.15.2
+- Transformers 4.51.3
+- Pytorch 2.8.0+cu128
+- Datasets 3.6.0
+- Tokenizers 0.21.1
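With the PEFT and Transformers versions listed above, an adapter like this one is typically attached to its base model as sketched below. This is a hedged sketch, not the author's documented usage: the adapter repo id is assumed from the repository name shown on this page, the helper name `load_adapter` is hypothetical, and access to the gated base model is required.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # base_model from the card metadata
ADAPTER_ID = "rbelanec/train_wsc_1754652157"           # assumption: Hub repo id of this adapter


def load_adapter(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the PEFT adapter weights on top of it."""
    # Imported lazily so this module can be imported without the heavy
    # dependencies (or the model weights) being available.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # PeftModel.from_pretrained reads the adapter config and weights from the repo.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_adapter()
    print(type(model).__name__)
```

Calling `load_adapter()` downloads the 8B base model, so it needs substantial disk space and memory.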

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3cedac709ac777437baffbe905dd3ed98f9669d6534f7b96c937c6043736f35e
+oid sha256:18369e3b16869dd76b9dfba8977ba696dab260dcc4c0c210be6037c3dc5e4968
 size 26214528
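A Git LFS pointer records the SHA-256 digest and byte size of the real file, so a downloaded `adapter_model.safetensors` can be checked against the new pointer. The sketch below uses only the Python standard library; the function names and the local file path are hypothetical.

```python
import hashlib
import os

# New pointer contents from this commit.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:18369e3b16869dd76b9dfba8977ba696dab260dcc4c0c210be6037c3dc5e4968
size 26214528
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    # Cheap check first: a size mismatch means the file cannot match.
    if os.path.getsize(path) != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


if __name__ == "__main__":
    local_path = "adapter_model.safetensors"  # hypothetical local download location
    if os.path.exists(local_path):
        print(matches_pointer(local_path, NEW_POINTER))
```

The size is unchanged between the two pointers (26214528 bytes), so only the digest distinguishes the old weights from the new ones.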