---
base_model:
- Qwen/Qwen3-4B-Thinking-2507
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
license: apache-2.0
language:
- en
datasets:
- Agent-Ark/Toucan-1.5M
---

# Uploaded model

- **Developed by:** amityco
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Qwen3-4B-Thinking-2507

This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

![image](https://cdn-uploads.huggingface.co/production/uploads/64739bc371f07ae738d2d61d/7qqHWD4GBL9-caV6Lx8rd.png)

Training used 100 samples with the following `SFTTrainer` configuration:

```python
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = ds_train['train'],
    args = SFTConfig(
        dataset_text_field = "text",            # column holding the formatted prompts
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 1,
        warmup_steps = 10,
        num_train_epochs = 8,
        learning_rate = 1e-5,
        logging_steps = 1,
        optim = "adamw_8bit",                   # 8-bit AdamW to reduce optimizer memory
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        report_to = "wandb",
        output_dir = "outputs",
        save_strategy = "steps",
        save_steps = 50,
        save_total_limit = 2,                   # keep only the two most recent checkpoints
    ),
)
```
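For reference, the schedule implied by these values can be estimated with a quick sketch. This assumes a single device and no sample packing (neither is stated above, so treat it as an approximation rather than the exact step count):

```python
import math

# Values taken from the SFTConfig above; num_samples from the stated 100 samples.
num_samples = 100
per_device_train_batch_size = 8
gradient_accumulation_steps = 1
num_train_epochs = 8

effective_batch = per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, total_steps)  # 13 steps per epoch, 104 total
```

At roughly 104 optimizer steps, the 10 warmup steps cover about 10% of training, and `save_steps = 50` yields checkpoints at steps 50 and 100 plus the final state.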