START: Spatial and Textual Learning for Chart Understanding Paper • 2512.07186 • Published 8 days ago • 2
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14 • 112
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning Paper • 2412.03248 • Published Dec 4, 2024 • 27