Reward Models 10-2025 Collection • A collection of great reward models for research and production • 6 items • Updated 3 days ago • 4
Article Can Your LLM Think Like a Professional? Introducing ProfBench By nvidia and 7 others • 11 days ago • 15
ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge Paper • 2510.18941 • Published 18 days ago • 7
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards Paper • 2509.21319 • Published Sep 25 • 4
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages Paper • 2505.11475 • Published May 16 • 3
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Paper • 2405.01481 • Published May 2, 2024 • 31