ScaleAI/act_meadow_bowl50
Agentic Rubrics as Contextual Verifiers for SWE Agents
ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents