BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions Paper • 2510.10666 • Published Oct 12 • 27
UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published Oct 9 • 70
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26 • 24
Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning Paper • 2509.22824 • Published Sep 26 • 20
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9 • 99
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2 • 123
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1 • 73
ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability Paper • 2508.07050 • Published Aug 9 • 116
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8 • 40
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published Jul 11 • 79
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks Paper • 2504.12764 • Published Apr 17 • 41