Smol, multilingual, long-context reasoner
AI & ML interests
Exploring smol models (for text, vision and video) and high-quality web and synthetic datasets
Organization Card
Hugging Face Smol Models Research
This is the home for smol models (SmolLM & SmolVLM) and high-quality pre-training datasets. We have released:
- FineWeb-Edu: a version of the FineWeb dataset filtered for educational content. Paper available here.
- Cosmopedia: the largest open synthetic dataset, with 25B tokens and 30M samples. It contains synthetic textbooks, blog posts, and stories generated by Mixtral. Blog post available here.
- Smollm-Corpus: the pre-training corpus of SmolLM, made up of Cosmopedia v0.2, FineWeb-Edu dedup and Python-Edu. Blog post available here.
- FineMath: the best public math pre-training dataset, with 50B tokens of mathematical and problem-solving data.
- Stack-Edu: the best open code pre-training dataset, with educational code in 15 programming languages.
- SmolLM2 models: a series of strong small models in three sizes: 135M, 360M and 1.7B.
- SmolVLM2: a family of small video and vision models in three sizes: 2.2B, 500M and 256M. Blog post available here.
News 🗞️
- SmolLM3: a state-of-the-art 3B model with dual reasoning, support for 6 languages, long context and strong function calling: HuggingFaceTB/SmolLM3-3B
- SmolLM3 Engineering Blueprint available here.
Spaces (17)
- The Smol Training Playbook: The Secrets to Building World-Class LLMs 📝 • Explore loss curves for training LLMs • Running on CPU Upgrade • 1.74k
- Smol Training Playbook - Table of Contents 📚 • Running • 18
- SmolLM WebGPU 🤏 • A powerful AI chatbot that runs locally in your browser • Running • 39
- SmolLM3 WebGPU 🚀 • A dual reasoning model that runs locally in your browser. • Running • 52
- SmolVLM 📊 • Generate answers by combining text and images • Build error • 81
Models (80)
- HuggingFaceTB/SmolLM2-360M-Instruct • Text Generation • 0.4B • 273k • 152
- HuggingFaceTB/SmolLM2-135M-Instruct • Text Generation • 0.1B • 157k • 264
- HuggingFaceTB/SmolLM3-3B • Text Generation • 3B • 55.7k • 788
- HuggingFaceTB/SmolLM3-3B-checkpoints • 4.48k • 22
- HuggingFaceTB/SmolLM3-3B-Base • Text Generation • 3B • 13.1k • 134
- HuggingFaceTB/smollm2-135M-SFT-Only • 0.1B • 142 • 1
- HuggingFaceTB/SmolLM3-3B-ONNX • Text Generation • 342 • 19
- HuggingFaceTB/SmolLM2-1.7B-Instruct • Text Generation • 2B • 32k • 680
- HuggingFaceTB/SmolVLM2-2.2B-Base • Image-Text-to-Text • 2B • 110 • 9
- HuggingFaceTB/SmolVLM-256M-Instruct • Image-Text-to-Text • 0.3B • 184k • 297
Datasets (60)
- HuggingFaceTB/smoltalk2_everyday_convs_think • 2.06k • 134
- HuggingFaceTB/smoltalk2 • 8.61M • 9.03k • 117
- HuggingFaceTB/Countdown-Task-GOLD • 149k • 162
- HuggingFaceTB/training-guide-nanotron-configs • 2 • 74 • 4
- HuggingFaceTB/post-training-benchmarks-viewer • 45 • 51 • 1
- HuggingFaceTB/llm-benchmarks-viewer • 36 • 82
- HuggingFaceTB/OpenR1-Math-220k-default-verified • 105k • 351
- HuggingFaceTB/smollm3-configs • 71 • 3
- HuggingFaceTB/instruct-data-basics-smollm-H4 • 767 • 56 • 1
- HuggingFaceTB/smollm3-blueprint • 1 • 334 • 7