Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning • Paper • 2512.06533 • Published 3 days ago • 6
P1: Mastering Physics Olympiads with Reinforcement Learning • Paper • 2511.13612 • Published 23 days ago • 132