# qwen2.5-1.5B-india-finetuned

## Overview
This repository contains Qwen2.5-1.5B fine-tuned with LoRA on small Indic instruction-following datasets.
The LoRA adapters were merged into the base weights, producing a standalone checkpoint that can be used directly with `mlx_lm`.
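For context, merging LoRA adapters into the base weights with `mlx_lm` is typically done through its `fuse` entry point. The command below is only a sketch, not necessarily the exact invocation used for this checkpoint; the adapter and output paths are placeholders:

```bash
# Sketch only: fuse trained LoRA adapters into the base model so the result
# loads as a standalone checkpoint. Paths are illustrative placeholders.
python -m mlx_lm.fuse \
  --model Qwen/Qwen2.5-1.5B \
  --adapter-path adapters \
  --save-path qwen2.5-1.5B-india-finetuned
```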
## License
- The base model Qwen/Qwen2.5-1.5B is released under the Qwen License.
- This fine-tuned checkpoint is subject to the same license. Please review the terms before use, especially for commercial scenarios.
- Marked here as `license: other` to follow Hugging Face conventions.
## Training Configuration
- Method: LoRA-SFT (attention + MLP)
- LoRA hyperparameters: `r=16`, `alpha=32`, `dropout=0.05`
- Max sequence length: 1024
- Steps: 1500
- Batch size: 1
- Optimizer: AdamW (default in `mlx_lm`)
- Hardware: Apple Silicon (MacBook Pro M4)
- Framework: `mlx_lm`
The training configuration YAML used for this run can be found at `configs/qwen2.5-3b_lora.yaml`.
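A run with these settings would typically be launched through `mlx_lm`'s LoRA entry point. The command below is a sketch that assumes the preprocessed JSONL files live in a local `data/` directory; it is not necessarily the exact invocation used here:

```bash
# Sketch only: launch LoRA fine-tuning from the YAML config.
# --data points at a directory containing train.jsonl and valid.jsonl (assumed).
python -m mlx_lm.lora \
  --config configs/qwen2.5-3b_lora.yaml \
  --train \
  --data data
```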
## Data
- Subsets from `ai4bharat/indic-align` were used: Dolly_T and Anudesh.
- Converted into completion-style prompts (`prompt`/`completion` pairs).
- The focus is on Indic languages (Kannada, Hindi, Tamil, Telugu, Marathi, Gujarati), with some English instructions.
- Preprocessed into `train.jsonl`/`valid.jsonl` (not included here); an illustrative record is shown below.
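Each line of `train.jsonl`/`valid.jsonl` is a single JSON object. The record below is purely illustrative (not drawn from the dataset) and assumes `mlx_lm`'s prompt/completion data format:

```json
{"prompt": "Question: What is the capital of Karnataka?\n\nAnswer:", "completion": " Bengaluru is the capital of Karnataka."}
```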
## Usage

Run with `mlx_lm`:

```bash
python -m mlx_lm generate \
  --model 5ivatej/qwen2.5-1.5B-india-finetuned \
  --max-tokens 200 \
  --prompt "Reply ONLY in Kannada written in English letters. Question: kannada dalli mathadoo?\n\nAnswer:"
```