Persian/Arabic OCR - Qwen3-VL-2B-Instruct - v1.0
This is a 16-bit version of Qwen/Qwen3-VL-2B-Instruct fine-tuned specifically for Persian text recognition (OCR) on individual text lines. The model has been trained exclusively on cropped single-line text images and is not designed for full-page OCR.
Training Details
- Dataset:
- 56,000 real Persian text line images
- 100,000 synthetic images (47 fonts in 3 diffrent sizes) with clean and noisy/colored backgrounds
- Total Examples: 156k text line images
- Epochs: 1
- LoRA Rank: 512
- Batch Size: 100
- Learning Rate: 2e-4
- Trainable Parameters: 759M Params
Training Performance
Training loss decreased steadily, ending at approximately 0.071.
| Step | Training Loss |
|---|---|
| 0 | 2.2102 |
| 100 | 0.1577 |
| 200 | 0.1384 |
| 300 | 0.1330 |
| 400 | 0.1208 |
| 500 | 0.1078 |
| 600 | 0.1080 |
| 700 | 0.1065 |
| 800 | 0.0980 |
| 900 | 0.0871 |
| 1000 | 0.0827 |
| 1100 | 0.0802 |
| 1200 | 0.0898 |
| 1300 | 0.0866 |
| 1400 | 0.0811 |
| 1500 | 0.0774 |
| 1550 | 0.0714 |
- Downloads last month
- 136
Model tree for mohajesmaeili/Qwen3-VL-2B-Persian-Arabic-Ocr-v1.0
Base model
Qwen/Qwen3-VL-2B-Instruct