Balancing Understanding and Generation in Discrete Diffusion Models Paper • 2602.01362 • Published 4 days ago • 13
WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models Paper • 2602.02537 • Published 8 days ago • 5
MARS: Modular Agent with Reflective Search for Automated AI Research Paper • 2602.02660 • Published 3 days ago • 53
No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs Paper • 2602.02103 • Published 3 days ago • 62
Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation Paper • 2602.01756 • Published 3 days ago • 22
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System Paper • 2602.02488 • Published 3 days ago • 29
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 5 days ago • 199
Toward Cognitive Supersensing in Multimodal Large Language Model Paper • 2602.01541 • Published 4 days ago • 16
TTCS: Test-Time Curriculum Synthesis for Self-Evolving Paper • 2601.22628 • Published 6 days ago • 32
ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought Paper • 2601.23184 • Published 6 days ago • 32
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas Paper • 2601.21558 • Published 7 days ago • 56
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning Paper • 2601.18631 • Published 10 days ago • 47
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory Paper • 2601.16296 • Published 14 days ago • 28
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers Paper • 2601.14133 • Published 16 days ago • 60
VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents Paper • 2601.16973 • Published 13 days ago • 40