Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising Paper • 2511.08633 • Published 6 days ago • 49
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published 18 days ago • 39
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling Paper • 2510.16751 • Published 28 days ago • 19
VISTA: A Test-Time Self-Improving Video Generation Agent Paper • 2510.15831 • Published 29 days ago • 20
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 30 days ago • 48
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published Oct 2 • 92
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder Paper • 2509.25182 • Published Sep 29 • 36
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24 • 77
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis Paper • 2509.10441 • Published Sep 12 • 30
Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing Paper • 2508.09192 • Published Aug 8 • 30
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning Paper • 2507.14137 • Published Jul 18 • 34
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation Paper • 2507.08441 • Published Jul 11 • 61
Cosmos-Predict2 Collection World Foundation Model for Future Prediction • 13 items • Updated about 19 hours ago • 29