Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 111
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 70
VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published Feb 7 • 65
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing Paper • 2503.16153 • Published Mar 20 • 2
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1 • 62
NeuralSVG: An Implicit Representation for Text-to-Vector Generation Paper • 2501.03992 • Published Jan 7 • 2
Rendering-Aware Reinforcement Learning for Vector Graphics Generation Paper • 2505.20793 • Published May 27 • 12
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
Article Everything You Need to Know about Knowledge Distillation By Kseniase and 1 other • Mar 6 • 52
BioMed Collection A suite of open-source biomedical foundation models. https://research.ibm.com/projects/biomedical-foundation-models • 28 items • Updated Jul 11 • 9
Continuous Speech Synthesis using per-token Latent Diffusion Paper • 2410.16048 • Published Oct 21, 2024 • 29
Granite 3.0 Language Models Collection A series of language models trained by IBM and licensed under Apache 2.0. We release both the base pretrained and instruct models. • 8 items • Updated 10 days ago • 100
DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models Paper • 2210.08933 • Published Oct 17, 2022 • 6