Submitted by Deren Lei 14 Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models AI at Meta 1
Submitted by Jack Zhang 40 The Alignment Waltz: Jointly Training Agents to Collaborate for Safety AI at Meta 2
Submitted by Lin Long 30 Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense AI at Meta 4
Submitted by Niels Rogge 14 OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows AI at Meta 4
Submitted by Jacob Kahn 7 CWM: An Open-Weights LLM for Research on Code Generation with World Models AI at Meta 704 2
Submitted by Zhepei Wei 54 TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning AI at Meta 3
Submitted by Chuanyang Jin 18 The Era of Real-World Human Interaction: RL from User Conversations AI at Meta 3
Submitted by Niels Rogge 5 Cluster and Predict Latents Patches for Improved Masked Image Modeling AI at Meta 123 2