MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19, 2025 • 54
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? Paper • 2505.23359 • Published May 29, 2025 • 39
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15, 2025 • 120
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset Paper • 2503.07091 • Published Mar 10, 2025 • 3
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values Paper • 2504.05535 • Published Apr 7, 2025 • 44
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks Paper • 2503.06885 • Published Mar 10, 2025 • 4
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20, 2025 • 104
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published Jan 31, 2025 • 39
Are Human-generated Demonstrations Necessary for In-context Learning? Paper • 2309.14681 • Published Sep 26, 2023 • 1
Towards Building the Federated GPT: Federated Instruction Tuning Paper • 2305.05644 • Published May 9, 2023 • 5
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks Paper • 2401.05507 • Published Jan 10, 2024 • 1