4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models Paper • 2503.10437 • Published Mar 13 • 32
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k Paper • 2503.09642 • Published Mar 12 • 19
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering Paper • 2503.16422 • Published Mar 20 • 14
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse Paper • 2503.18470 • Published Mar 24 • 3
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement Paper • 2503.04919 • Published Mar 6 • 8
FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis Paper • 2503.13265 • Published Mar 17 • 15
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Paper • 2503.20776 • Published Mar 26 • 10
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency Paper • 2503.20785 • Published Mar 26 • 22
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step Paper • 2504.01956 • Published Apr 2 • 41
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence Paper • 2506.10600 • Published Jun 12 • 8
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams Paper • 2506.08862 • Published Jun 10 • 5
π^3: Scalable Permutation-Equivariant Visual Geometry Learning Paper • 2507.13347 • Published Jul 17 • 64
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels Paper • 2507.21809 • Published Jul 29 • 131
DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation Paper • 2507.13985 • Published Jul 18 • 6
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation Paper • 2508.01126 • Published Aug 2 • 5
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration Paper • 2508.11379 • Published Aug 15 • 12
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer Paper • 2508.10893 • Published Aug 14 • 31
MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting Paper • 2508.17811 • Published Aug 25 • 6
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels Paper • 2508.17437 • Published Aug 20 • 37
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents Paper • 2510.23691 • Published Oct 2025 • 51