Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 111
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 71
VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published Feb 7 • 65
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing Paper • 2503.16153 • Published Mar 20 • 2
NeuralSVG: An Implicit Representation for Text-to-Vector Generation Paper • 2501.03992 • Published Jan 7 • 2