TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning Paper • 2509.25760 • Published Sep 30 • 54
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code Paper • 2508.18106 • Published Aug 25 • 342
RLVR-Decomposed Collection The collection for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" • 9 items • Updated Jun 1 • 3
AdaDecode Collection [ICML 2025] AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism. • 18 items • Updated Jun 4 • 3
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning Paper • 2506.01347 • Published Jun 2 • 3
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning Paper • 2505.16421 • Published May 22 • 19
Tulu 3 Models Collection All models released with Tulu 3 -- state-of-the-art open post-training recipes. • 11 items • Updated Apr 30 • 103
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Paper • 2406.09961 • Published Jun 14, 2024 • 55