Optimized Table Tokenization for Table Structure Recognition Paper • 2305.03393 • Published May 5, 2023 • 1
PubTables-1M: Towards comprehensive table extraction from unstructured documents Paper • 2110.00061 • Published Sep 30, 2021 • 3
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning Paper • 2505.20161 • Published May 26 • 1
Image-GS: Content-Adaptive Image Representation via 2D Gaussians Paper • 2407.01866 • Published Jul 2, 2024 • 1
2D Gaussian Splatting with Semantic Alignment for Image Inpainting Paper • 2509.01964 • Published Sep 2 • 6
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation Paper • 2509.00428 • Published Aug 30 • 17
Representing Speech Through Autoregressive Prediction of Cochlear Tokens Paper • 2508.11598 • Published Aug 15 • 17
StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation Paper • 2508.11203 • Published Aug 15 • 10
FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers Paper • 2507.12956 • Published Jul 17 • 24
FLEXITOKENS: Flexible Tokenization for Evolving Language Models Paper • 2507.12720 • Published Jul 17 • 9
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation Paper • 2507.09862 • Published Jul 14 • 49
Radial Attention: O(nlog n) Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24 • 41
Peccavi: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking Technique for AI-Generated Images Paper • 2506.22960 • Published Jun 28 • 6
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling Paper • 2506.20452 • Published Jun 25 • 19
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model Paper • 2506.15682 • Published Jun 18 • 5