OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 28 days ago • 86
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Paper • 2504.02542 • Published Apr 3 • 51
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 141
Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields Paper • 2505.02005 • Published May 4 • 3
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17 • 93
svjack/video-dataset-genshin-impact-landscape-organized Viewer • Updated Mar 14 • 60 • 40 • 3