Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations Paper • 2602.05885 • Published 10 days ago • 28
Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics Paper • 2601.14027 • Published 26 days ago • 12
How Far Are We from Genuinely Useful Deep Research Agents? Paper • 2512.01948 • Published Dec 1, 2025 • 56
How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity Paper • 2511.08487 • Published Nov 11, 2025 • 3
ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning Paper • 2511.14366 • Published Nov 18, 2025 • 17