Reward Models 10-2025 Collection A collection of great reward models for research and production • 6 items • Updated 3 days ago • 4
Article Can Your LLM Think Like a Professional? Introducing ProfBench By nvidia and 7 others • 11 days ago • 15
ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge Paper • 2510.18941 • Published 18 days ago • 7
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards Paper • 2509.21319 • Published Sep 25 • 4