MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives Paper • 2512.14699 • Published 5 days ago • 26
Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published 3 days ago • 23
GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation Paper • 2512.12751 • Published 7 days ago • 6
DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning Paper • 2512.12799 • Published 7 days ago • 9
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search Paper • 2509.07969 • Published Sep 9 • 58
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems Paper • 2503.06669 • Published Mar 9 • 2
Is Diversity All You Need for Scalable Robotic Manipulation? Paper • 2507.06219 • Published Jul 8 • 20
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning Paper • 2410.14633 • Published Oct 18, 2024 • 1