On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published 20 days ago • 36
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28 • 71
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published Oct 27 • 53
Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences Paper • 2510.23451 • Published Oct 27 • 26
Training Language Models to Generate Quality Code with Program Analysis Feedback Paper • 2505.22704 • Published May 28 • 14
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 111
MiroThinker-v0.2 Collection Better performance in multi-hop search and multilingual tasks. • 8 items • Updated Nov 9 • 7
The Role of Summarization in Generative Agents: A Preliminary Perspective Paper • 2305.01253 • Published May 2, 2023 • 1
ARIA: Training Language Agents with Intention-Driven Reward Aggregation Paper • 2506.00539 • Published May 31 • 30
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26 • 104
TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs Paper • 2410.10479 • Published Oct 14, 2024 • 1