---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
- base_model:adapter:meta-llama/Meta-Llama-3-8B-Instruct
- llama-factory
- transformers
pipeline_tag: text-generation
model-index:
- name: train_winogrande_101112_1760638073
  results: []
---


# train_winogrande_101112_1760638073

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the winogrande dataset.
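It achieves the following results on the evaluation set:
- Loss: 0.0604 (the lowest validation loss in the training results table, recorded at epoch 10; the validation loss after the final epoch 20 is 0.0779)
- Num Input Tokens Seen: 38366624 (cumulative total after all 20 epochs)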
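The card's metadata lists `peft` as the library and `meta-llama/Meta-Llama-3-8B-Instruct` as the base model, so the weights are presumably an adapter loaded on top of that base. A minimal loading sketch follows, assuming the standard `transformers` and `peft` loading calls. The adapter repository path is not stated in this card, so `ADAPTER_ID` below is a hypothetical placeholder to replace with the real one.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # taken from this card's metadata
# Hypothetical placeholder: the card does not state where the adapter is hosted.
ADAPTER_ID = "your-namespace/train_winogrande_101112_1760638073"


def load_adapter(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and tokenizer, then attach the PEFT adapter weights."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the adapter weights stored under adapter_id.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter()` downloads the base model, which may require accepting its license on the Hub first. The prompt format used during fine-tuning is not documented in this card, so any generation call built on top of this is a further assumption.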

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
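
To make the `cosine` scheduler with `lr_scheduler_warmup_ratio: 0.1` concrete, the sketch below computes the learning rate at a given step using the usual linear-warmup-then-cosine-decay shape. It is an illustration rather than the exact training code: it assumes the total step count equals the last value in the Step column of the training results table (181800), that those are optimizer steps, and that the warmup length is the ceiling of ratio times total steps.

```python
import math

PEAK_LR = 5e-5          # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 181_800   # assumption: final Step value in the training results table

# Assumption: warmup length is the ceiling of ratio * total steps.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half cosine from the peak down to 0.
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers roughly the first two epochs, after which the rate decays smoothly toward zero by the end of epoch 20.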

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.1899        | 1.0   | 9090   | 0.1525          | 1917952           |
| 0.0784        | 2.0   | 18180  | 0.0934          | 3835840           |
| 0.0395        | 3.0   | 27270  | 0.0797          | 5753152           |
| 0.2799        | 4.0   | 36360  | 0.0697          | 7672000           |
| 0.0141        | 5.0   | 45450  | 0.0657          | 9590080           |
| 0.1489        | 6.0   | 54540  | 0.0621          | 11509088          |
| 0.0257        | 7.0   | 63630  | 0.0613          | 13427712          |
| 0.008         | 8.0   | 72720  | 0.0637          | 15346672          |
| 0.0722        | 9.0   | 81810  | 0.0616          | 17265344          |
| 0.0662        | 10.0  | 90900  | 0.0604          | 19184224          |
| 0.0033        | 11.0  | 99990  | 0.0685          | 21102912          |
| 0.0008        | 12.0  | 109080 | 0.0705          | 23021312          |
| 0.002         | 13.0  | 118170 | 0.0719          | 24938688          |
| 0.0553        | 14.0  | 127260 | 0.0716          | 26857088          |
| 0.0197        | 15.0  | 136350 | 0.0733          | 28775840          |
| 0.0331        | 16.0  | 145440 | 0.0760          | 30693088          |
| 0.1504        | 17.0  | 154530 | 0.0764          | 32612480          |
| 0.0958        | 18.0  | 163620 | 0.0781          | 34530176          |
| 0.0019        | 19.0  | 172710 | 0.0784          | 36447600          |
| 0.0013        | 20.0  | 181800 | 0.0779          | 38366624          |
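
The table is easier to read with the best epoch picked out programmatically. The short sketch below copies the Validation Loss column by hand into a dictionary (the name `VAL_LOSS_BY_EPOCH` is introduced here for illustration) and compares the lowest value with the final one.

```python
# Validation loss per epoch, copied from the training results table.
VAL_LOSS_BY_EPOCH = {
    1: 0.1525, 2: 0.0934, 3: 0.0797, 4: 0.0697, 5: 0.0657,
    6: 0.0621, 7: 0.0613, 8: 0.0637, 9: 0.0616, 10: 0.0604,
    11: 0.0685, 12: 0.0705, 13: 0.0719, 14: 0.0716, 15: 0.0733,
    16: 0.0760, 17: 0.0764, 18: 0.0781, 19: 0.0784, 20: 0.0779,
}

# Epoch with the lowest validation loss.
best_epoch = min(VAL_LOSS_BY_EPOCH, key=VAL_LOSS_BY_EPOCH.get)
best_loss = VAL_LOSS_BY_EPOCH[best_epoch]

# Last epoch that was trained, for comparison.
final_epoch = max(VAL_LOSS_BY_EPOCH)
final_loss = VAL_LOSS_BY_EPOCH[final_epoch]

print(f"best:  epoch {best_epoch}, validation loss {best_loss:.4f}")
print(f"final: epoch {final_epoch}, validation loss {final_loss:.4f}")
print(f"gap:   {final_loss - best_loss:.4f}")
```

The lowest value matches the loss reported at the top of this card. Validation loss drifts upward after that epoch, which may indicate overfitting in the second half of training, although the card does not say which checkpoint was kept.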


### Framework versions

- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4