PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16 • 94
Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs Paper • 2510.01954 • Published Oct 2 • 12
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution Paper • 2410.16256 • Published Oct 21, 2024 • 60