FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents Paper • 2602.01566 • Published 2 days ago • 38
Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles Paper • 2602.01590 • Published 2 days ago • 30
WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora Paper • 2602.02053 • Published 1 day ago • 39
AIonopedia: an LLM agent orchestrating multimodal learning for ionic liquid discovery Paper • 2511.11257 • Published Nov 14, 2025 • 25
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts Paper • 2510.19363 • Published Oct 22, 2025 • 62
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents Paper • 2506.11763 • Published Jun 13, 2025 • 74
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset Paper • 2505.21297 • Published May 27, 2025 • 29