view article Article A Survey of Small Language Models in the Era of LLMs: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness Jul 16 • 4
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 238
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 470
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 Jul 5, 2024 • 301
Vision Language Models: 2025 Update Collection This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update • 67 items • Updated May 12 • 5
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 169
MinerU: An Open-Source Solution for Precise Document Content Extraction Paper • 2409.18839 • Published Sep 27, 2024 • 32
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published Sep 26 • 132
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15, 2024 • 190