Vibe Checker: Aligning Code Evaluation with Human Preference • Paper • 2510.07315 • Published Oct 8 • 32
Kimi Linear: An Expressive, Efficient Attention Architecture • Paper • 2510.26692 • Published 16 days ago • 105
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation • Paper • 2511.02778 • Published 11 days ago • 99
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains • Paper • 2511.04962 • Published 9 days ago • 50
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds • Paper • 2511.08892 • Published 4 days ago • 137
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution • Paper • 2510.25726 • Published 17 days ago • 44
The End of Manual Decoding: Towards Truly End-to-End Language Models • Paper • 2510.26697 • Published 16 days ago • 113
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain • Paper • 2509.26507 • Published Sep 30 • 531
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing • Paper • 2509.08721 • Published Sep 10 • 673
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing • Paper • 2508.10881 • Published Aug 14 • 52
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale • Paper • 2508.10711 • Published Aug 14 • 142
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper • 2508.05629 • Published Aug 7 • 178
Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing • Paper • 2504.21356 • Published Apr 30 • 2