Running on CPU Upgrade • 1.75k • The Smol Training Playbook: The Secrets to Building World-Class LLMs • Explore loss curves for training LLMs
Article • On the Shifting Global Compute Landscape • By huggingface and 1 other • 11 days ago • 45
Less is More: Recursive Reasoning with Tiny Networks • Paper • 2510.04871 • Published Oct 6 • 468
Running • 201 • FineVision: Open Data is All You Need • A new open-source dataset for training VLMs
Apertus LLM • Collection • Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1 • 296
gpt-oss • Collection • Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases • 2 items • Updated Aug 7 • 376
Article • OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models • By nvidia and 3 others • Jul 18 • 50
Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity • Paper • 2506.09250 • Published Jun 10 • 27