-
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition
Paper • 2503.21248 • Published • 21 -
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
Paper • 2503.21380 • Published • 38 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 83
suke
teppeki
AI & ML interests
None yet
Organizations
None yet
later
-
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition
Paper • 2503.21248 • Published • 21 -
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
Paper • 2503.21380 • Published • 38 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 83
models
0
None public yet
datasets
0
None public yet