scene4D - a PandaQQ Collection

scene4D

updated 13 days ago

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 32
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 33
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14
SynCity: Training-Free Generation of 3D Worlds

Paper • 2503.16420 • Published Mar 20 • 27
M3: 3D-Spatial MultiModal Memory

Paper • 2503.16413 • Published Mar 20 • 15
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse

Paper • 2503.18470 • Published Mar 24 • 3
Any6D: Model-free 6D Pose Estimation of Novel Objects

Paper • 2503.18673 • Published Mar 24 • 3
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement

Paper • 2503.04919 • Published Mar 6 • 8
FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

Paper • 2503.13265 • Published Mar 17 • 15
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields

Paper • 2503.20776 • Published Mar 26 • 10
Segment Any Motion in Videos

Paper • 2503.22268 • Published Mar 28 • 19
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Paper • 2503.20785 • Published Mar 26 • 22
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Paper • 2504.01956 • Published Apr 2 • 41
TAPIP3D: Tracking Any Point in Persistent 3D Geometry

Paper • 2504.14717 • Published Apr 20 • 8
Towards Understanding Camera Motions in Any Video

Paper • 2504.15376 • Published Apr 21 • 158
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence

Paper • 2506.10600 • Published Jun 12 • 8
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams

Paper • 2506.08862 • Published Jun 10 • 5
PlayerOne: Egocentric World Simulator

Paper • 2506.09995 • Published Jun 11 • 34
π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17 • 64
SpatialTrackerV2: 3D Point Tracking Made Easy

Paper • 2507.12462 • Published Jul 16 • 18
PhysX: Physical-Grounded 3D Asset Generation

Paper • 2507.12465 • Published Jul 16 • 43
Streaming 4D Visual Geometry Transformer

Paper • 2507.11539 • Published Jul 15 • 14
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 85
Reconstructing 4D Spatial Intelligence: A Survey

Paper • 2507.21045 • Published Jul 28 • 35
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 131
NeRF Is a Valuable Assistant for 3D Gaussian Splatting

Paper • 2507.23374 • Published Jul 31 • 11
DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation

Paper • 2507.13985 • Published Jul 18 • 6
Matrix-3D: Omnidirectional Explorable 3D World Generation

Paper • 2508.08086 • Published Aug 11 • 75
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation

Paper • 2508.01126 • Published Aug 2 • 5
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration

Paper • 2508.11379 • Published Aug 15 • 12
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14 • 31
MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting

Paper • 2508.17811 • Published Aug 25 • 6
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
DA^2: Depth Anything in Any Direction

Paper • 2509.26618 • Published Sep 30 • 25
TTT3R: 3D Reconstruction as Test-Time Training

Paper • 2509.26645 • Published Sep 30 • 14
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents

Paper • 2510.23691 • Published 16 days ago • 51