Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 269
Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation Paper • 2512.20908 • Published Dec 24, 2025 • 29
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning Paper • 2601.09088 • Published Jan 14, 2026 • 63
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 48
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 133
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published Oct 29, 2025 • 228
SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity Paper • 2510.23541 • Published Oct 27, 2025 • 15
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published Oct 3, 2025 • 98
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8, 2025 • 31
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP Paper • 2510.05644 • Published Oct 7, 2025 • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 189
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025 • 300
Granite 3.0 Language Models Collection A series of language models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated Nov 17, 2025 • 101
Granite 2.0 Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated Nov 17, 2025 • 202
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 627