qwen2.5-1.5B-india-finetuned

Overview

This repository contains Qwen2.5-1.5B fine-tuned with LoRA on small Indic instruction-following datasets.
The LoRA adapters were merged into the base weights, producing a standalone checkpoint that can be used directly with mlx_lm.
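For reference, a merged checkpoint like this one is typically produced with mlx_lm's fuse command. The sketch below assumes the LoRA adapters live in a local adapters/ directory; both paths are placeholders, not the exact ones used:

python -m mlx_lm fuse \
  --model Qwen/Qwen2.5-1.5B \
  --adapter-path adapters \
  --save-path qwen2.5-1.5B-india-finetuned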


License

  • The base model Qwen/Qwen2.5-1.5B is released under the Qwen License.
  • This fine-tuned checkpoint is subject to the same license. Please review the terms before use, especially for commercial scenarios.
  • The model card metadata is marked license: other, following Hugging Face conventions.

Training Configuration

  • Method: LoRA supervised fine-tuning (SFT) on attention and MLP projections
  • LoRA hyperparameters: rank r=16, alpha=32, dropout=0.05
  • Max sequence length: 1024
  • Steps: 1500
  • Batch size: 1
  • Optimizer: AdamW (default in mlx_lm)
  • Hardware: Apple Silicon (MacBook Pro M4)
  • Framework: mlx_lm

The training configuration YAML is available at configs/qwen2.5-3b_lora.yaml.
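As a rough guide to reproducing the run, training can be launched with mlx_lm's LoRA entry point, pointing it at the config above. This is a sketch, not the exact command used; data is a placeholder directory containing train.jsonl and valid.jsonl:

python -m mlx_lm lora \
  --model Qwen/Qwen2.5-1.5B \
  --train \
  --data data \
  --config configs/qwen2.5-3b_lora.yaml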


Data

  • Subsets from ai4bharat/indic-align were used: Dolly_T and Anudesh.
  • Converted into completion-style prompt/completion pairs (see the sample record after this list).
  • The focus is on Indic languages (Kannada, Hindi, Tamil, Telugu, Marathi, Gujarati) with some English instructions.
  • Preprocessed into train.jsonl / valid.jsonl (not included here).
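Each line of train.jsonl / valid.jsonl is a JSON object in mlx_lm's prompt/completion format. The record below is a made-up illustration of the shape, not an actual row from indic-align:

{"prompt": "Translate to Hindi: Where is the railway station?", "completion": "रेलवे स्टेशन कहाँ है?"}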

Usage

Run with mlx_lm:

python -m mlx_lm generate \
  --model 5ivatej/qwen2.5-1.5B-india-finetuned \
  --max-tokens 200 \
  --prompt "Reply ONLY in Kannada written in English letters. Question: kannada dalli mathadoo?\n\nAnswer:"