TriVLA: A Triple-System-Based Unified Vision-Language-Action Model for General Robot Control Paper • 2507.01424 • Published Jul 2, 2025 • 1
A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding Paper • 2507.06719 • Published Jul 9, 2025
ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning Paper • 2503.23297 • Published Mar 30, 2025
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control Paper • 2601.05138 • Published 13 days ago • 16
Crafter Series Collection Crafter series models for 3D reconstruction and generation • 7 items • Updated 12 days ago • 1