Built with Axolotl

See axolotl config

axolotl version: 0.13.0.dev0

adapter: lora
base_model: Qwen/Qwen3-32B
bf16: true
flash_attention: true
gradient_checkpointing: true

datasets:
- path: /workspace/data/wangchan_fixed
  type: alpaca
  split: train

val_set_size: 0
sequence_len: 2048
train_on_inputs: false

micro_batch_size: 4
gradient_accumulation_steps: 8

optimizer: adamw_torch
learning_rate: 1.0e-4
lr_scheduler: cosine
warmup_ratio: 0.03
weight_decay: 0.01
max_grad_norm: 1.0
num_epochs: 2

lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- gate_proj
- down_proj
- up_proj

output_dir: ./outputs/qwen32b-thai
logging_steps: 10
save_steps: 300
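
For readers more familiar with PEFT than with Axolotl, the lora_* settings above map roughly onto the following LoraConfig (a sketch for reference only; Axolotl constructs the adapter configuration internally):

```python
from peft import LoraConfig

# Rough PEFT equivalent of the lora_* block in the Axolotl config above.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "down_proj", "up_proj",
    ],
    task_type="CAUSAL_LM",
)
```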

Qwen3-32B Thai LoRA

This is a LoRA adapter for Qwen/Qwen3-32B, fine-tuned on the WangchanThaiInstruct dataset to improve Thai-language instruction following.

Model Description

This LoRA adapter enhances Qwen3-32B's ability to understand and respond to Thai language instructions across various domains including finance, general knowledge, creative writing, and classification tasks.

  • Base Model: Qwen/Qwen3-32B
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Language: Thai (th)
  • Training Loss: 0.85 → ~0.50 (see Training Results below)
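
A minimal inference sketch with Transformers and PEFT is shown below (it assumes the adapter is published as devrf/qwen32b-thai-lora and that enough GPU memory is available for the bf16 base model; the Thai prompt is a hypothetical financial-domain example):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the bf16 base model and attach the LoRA adapter
# (repo id "devrf/qwen32b-thai-lora" assumed; adjust device_map/dtype to your hardware).
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-32B", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
model = PeftModel.from_pretrained(base, "devrf/qwen32b-thai-lora")

# Hypothetical Thai prompt: "Briefly explain compound interest."
messages = [{"role": "user", "content": "อธิบายดอกเบี้ยทบต้นแบบสั้น ๆ"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```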

Intended Uses & Limitations

Intended Uses

  • Thai language question answering
  • Thai instruction following
  • Thai content generation
  • Financial domain queries in Thai

Limitations

  • Performance may vary on domains not covered in the training data
  • Inherits limitations of the base Qwen3-32B model
  • Primarily optimized for Thai; performance in other languages may differ from the base model

Training and Evaluation Data

Dataset

  • Name: WangchanThaiInstruct
  • Training Samples: ~29,000 (after filtering sequences > 2048 tokens)
  • Format: Alpaca-style (instruction, input, output); an illustrative record is sketched below
  • Domains: Finance, General Knowledge, Creative Writing, Classification, Open QA, Closed QA
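
A hypothetical record in this schema (illustrative only, not taken from the dataset), written as a Python dict for brevity:

```python
# Illustrative Alpaca-style record (hypothetical, not from WangchanThaiInstruct).
record = {
    # "Summarize the following news item in one sentence."
    "instruction": "สรุปข่าวต่อไปนี้ให้เหลือหนึ่งประโยค",
    # Source text the instruction refers to (may be empty for open-ended tasks).
    "input": "ธนาคารแห่งประเทศไทยประกาศคงอัตราดอกเบี้ยนโยบายไว้ที่ระดับเดิมในการประชุมล่าสุด",
    # Target response the model is trained to produce; loss is computed on this
    # part only, since train_on_inputs is false in the config above.
    "output": "ธปท. มีมติคงอัตราดอกเบี้ยนโยบายไว้ที่ระดับเดิม",
}
```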

Training Procedure

Hardware

  • GPU: 1x NVIDIA H200 SXM (141GB VRAM)
  • Training Time: ~10 hours

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 43
  • training_steps: 1444
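
As a quick sanity check, the derived values above follow directly from the config (a sketch; assumes the single-GPU setup from the Hardware section):

```python
# Reproduce the derived hyperparameter values reported above.
micro_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1  # single H200, per the Hardware section

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
assert total_train_batch_size == 32

training_steps = 1444
warmup_ratio = 0.03
warmup_steps = round(warmup_ratio * training_steps)  # 43.32 -> 43
assert warmup_steps == 43
```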

Training Results

| Step | Training Loss |
|------|---------------|
| 10 | 0.85 |
| 20 | 0.78 |
| 1068 | 0.55 |
| 1444 (final) | ~0.50 |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.57.3
  • PyTorch 2.7.1+cu126
  • Datasets 4.3.0
  • Tokenizers 0.22.1

Citation

If you use this model, please cite the original dataset and base model:

@misc{wangchanthaiinstruct,
  title={WangchanThaiInstruct},
  author={AIResearch.in.th},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/airesearch/WangchanThaiInstruct}
}

@misc{qwen3,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv}
}