Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents • Paper • 2506.01344 • Published Jun 2 • 6
THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning • Paper • 2509.13761 • Published Sep 17 • 16
Visual-TableQA: Open-Domain Benchmark for Reasoning over Table Images • Paper • 2509.07966 • Published Sep 9 • 4
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents • Paper • 2509.09265 • Published Sep 11 • 45
A Survey of Reinforcement Learning for Large Reasoning Models • Paper • 2509.08827 • Published Sep 10 • 187
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training • Paper • 2509.03403 • Published Sep 3 • 21
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR • Paper • 2509.02522 • Published Sep 2 • 25
Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models • Paper • 2508.21365 • Published Aug 29 • 29
Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models • Paper • 2508.15202 • Published Aug 21 • 4
Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models • Paper • 2508.09138 • Published Aug 12 • 36
🔍 Interpretability & Analysis of LMs • Collection • Outstanding research in LM interpretability and evaluation, summarized • 134 items • Updated 21 days ago • 116
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology • Paper • 2507.07999 • Published Jul 10 • 49
StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs • Paper • 2506.03077 • Published Jun 3 • 17
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics • Paper • 2506.04308 • Published Jun 4 • 43
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning • Paper • 2506.04207 • Published Jun 4 • 48