Persian/Arabic OCR - Qwen3-VL-2B-Instruct - v1.0

This is a 16-bit version of Qwen/Qwen3-VL-2B-Instruct fine-tuned specifically for Persian text recognition (OCR) on individual text lines. The model has been trained exclusively on cropped single-line text images and is not designed for full-page OCR.

Training Details

  • Dataset:
    • 56,000 real Persian text line images
    • 100,000 synthetic images (47 fonts in 3 diffrent sizes) with clean and noisy/colored backgrounds
  • Total Examples: 156k text line images
  • Epochs: 1
  • LoRA Rank: 512
  • Batch Size: 100
  • Learning Rate: 2e-4
  • Trainable Parameters: 759M Params

Training Performance

Training loss decreased steadily, ending at approximately 0.071.

Step Training Loss
0 2.2102
100 0.1577
200 0.1384
300 0.1330
400 0.1208
500 0.1078
600 0.1080
700 0.1065
800 0.0980
900 0.0871
1000 0.0827
1100 0.0802
1200 0.0898
1300 0.0866
1400 0.0811
1500 0.0774
1550 0.0714
Downloads last month
136
Safetensors
Model size
2B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for mohajesmaeili/Qwen3-VL-2B-Persian-Arabic-Ocr-v1.0

Finetuned
(80)
this model

Dataset used to train mohajesmaeili/Qwen3-VL-2B-Persian-Arabic-Ocr-v1.0