Zeyu Zhang

SteveZeyuZhang

https://steve-zeyu-zhang.github.io/

steve-zeyu-zhang

AI & ML interests

Geometric Learning, Generative AI, Computer Vision, Robotics, AI for Health

Recent Activity

authored 2 papers 11 days ago

VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery

Paper • 2509.17191 • Published Sep 21, 2025 • 1

3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence

Paper • 2601.06496 • Published 14 days ago • 1

submitted a paper to Daily Papers 11 days ago

3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence

Paper • 2601.06496 • Published 14 days ago • 1

authored 2 papers 12 days ago

CoV: Chain-of-View Prompting for Spatial Reasoning

Paper • 2601.05172 • Published 15 days ago • 10

AnyDepth: Depth Estimation Made Easy

Paper • 2601.02760 • Published 18 days ago • 10

submitted a paper to Daily Papers 12 days ago

AnyDepth: Depth Estimation Made Easy

Paper • 2601.02760 • Published 18 days ago • 10

authored a paper 25 days ago

DriveGen3D: Boosting Feed-Forward Driving Scene Generation with Efficient Video Diffusion

Paper • 2510.15264 • Published Oct 17, 2025 • 4

submitted a paper to Daily Papers about 1 month ago

DragMesh: Interactive 3D Generation Made Easy

Paper • 2512.06424 • Published Dec 6, 2025 • 1

authored 5 papers about 2 months ago

EgoLCD: Egocentric Video Generation with Long Context Diffusion

Paper • 2512.04515 • Published Dec 4, 2025 • 6

BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation

Paper • 2511.22973 • Published Nov 28, 2025 • 5

MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots

Paper • 2511.17889 • Published Nov 22, 2025 • 5

Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation

Paper • 2511.20714 • Published Nov 25, 2025 • 48

EvoVLA: Self-Evolving Vision-Language-Action Model

Paper • 2511.16166 • Published Nov 20, 2025 • 6

authored 6 papers 4 months ago

VLA-R1: Enhancing Reasoning in Vision-Language-Action Models

Paper • 2510.01623 • Published Oct 2, 2025 • 11

UniVid: The Open-Source Unified Video Model

Paper • 2509.24200 • Published Sep 29, 2025 • 5

VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction

Paper • 2509.19297 • Published Sep 23, 2025 • 25

FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion

Paper • 2506.04648 • Published Jun 5, 2025 • 1

StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes

Paper • 2509.16415 • Published Sep 19, 2025 • 3

Nav-R1: Reasoning and Navigation in Embodied Scenes

Paper • 2509.10884 • Published Sep 13, 2025 • 9

authored a paper 6 months ago

ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS

Paper • 2505.23734 • Published May 29, 2025 • 4
