Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective Paper • 2509.22613 • Published Sep 26 • 9
DocReward: A Document Reward Model for Structuring and Stylizing Paper • 2510.11391 • Published 28 days ago • 26
Information-Preserving Reformulation of Reasoning Traces for Antidistillation Paper • 2510.11545 • Published 27 days ago • 1
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs Paper • 2510.24514 • Published 13 days ago • 20
The Era of Agentic Organization: Learning to Organize with Language Models Paper • 2510.26658 • Published 10 days ago • 23
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published 11 days ago • 44
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published 19 days ago • 110
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing Paper • 2510.19808 • Published 18 days ago • 28
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published 18 days ago • 58
QueST: Incentivizing LLMs to Generate Difficult Problems Paper • 2510.17715 • Published 20 days ago • 32
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels Paper • 2510.06499 • Published Oct 7 • 31