MedCalc-Bench Collection Evaluating Large Language Models for Medical Calculations • 4 items • Updated 29 days ago • 2
GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information Paper • 2304.09667 • Published Apr 19, 2023 • 1
RAG-Gym: Optimizing Reasoning and Search Agents with Process Supervision Paper • 2502.13957 • Published Feb 19, 2025 • 1
Benchmarking Retrieval-Augmented Generation for Chemistry Paper • 2505.07671 • Published May 12, 2025 • 1