UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions Paper • 2511.03334 • Published 6 days ago • 48
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models Paper • 2509.17627 • Published Sep 22 • 66
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10 • 127
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published 19 days ago • 54
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks Paper • 2510.15019 • Published 26 days ago • 63
UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published Oct 9 • 70
MotionStream: Real-Time Video Generation with Interactive Motion Controls Paper • 2511.01266 • Published 8 days ago • 25
The Smol Training Playbook: The Secrets to Building World-Class LLMs • 2.01k • 📝 Display loss curves for training LLMs