A large-scale synthetic Arabic OCR dataset comprising 843,622 book-style document images across 10 fonts, designed to advance vision-language models (VLMs) for Arabic text
Robotics and Internet-of-Things
riotu-lab
AI & ML interests
None yet
Recent Activity
upvoted a paper 2 days ago
MURAD: A Large-Scale Multi-Domain Unified Reverse Arabic Dictionary Dataset
upvoted a paper 2 days ago
SARD: A Large-Scale Synthetic Arabic OCR Dataset for Book-Style Text Recognition
upvoted a paper 2 days ago
ARCADE: A City-Scale Corpus for Fine-Grained Arabic Dialect Tagging
Organizations
None yet