Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning Paper • 2512.06533 • Published 4 days ago • 6
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published 23 days ago • 132
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation Paper • 2509.25849 • Published Sep 30 • 47
Performance Prediction for Large Systems via Text-to-Text Regression Paper • 2506.21718 • Published Jun 26 • 6