Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image Paper • 2512.05044 • Published 10 days ago • 15
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning Paper • 2512.02835 • Published 12 days ago • 9
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control Paper • 2508.21112 • Published Aug 28 • 77
Exploring the Potential of Encoder-free Architectures in 3D LMMs Paper • 2502.09620 • Published Feb 13 • 26
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model Paper • 2501.15830 • Published Jan 27 • 13