ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition • Paper • 2503.21248 • Published Mar 27 • 21
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models • Paper • 2503.21380 • Published Mar 27 • 38
Large Language Model Agent: A Survey on Methodology, Applications and Challenges • Paper • 2503.21460 • Published Mar 27 • 83