Wang Chengyao

wcy1122

https://wcy1122.github.io/

AI & ML interests

Multimodal Intelligence

Recent Activity

upvoted a paper 19 days ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

liked a model 28 days ago

deepseek-ai/DeepSeek-V3.2-Speciale

authored a paper 2 months ago

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Paper • 2510.23607 • Published Oct 27 • 177

authored 3 papers 3 months ago

DreamOmni2: Multimodal Instruction-based Editing and Generation

Paper • 2510.06679 • Published Oct 8 • 73

DreamOmni: Unified Image Generation and Editing

Paper • 2412.17098 • Published Dec 22, 2024 • 2

MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech

Paper • 2509.25131 • Published Sep 29 • 15

authored a paper about 1 year ago

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Paper • 2412.09501 • Published Dec 12, 2024 • 48

authored a paper almost 2 years ago

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Paper • 2403.18814 • Published Mar 27, 2024 • 47

authored a paper about 2 years ago

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

Paper • 2311.17043 • Published Nov 28, 2023