0.7 TFLOPS
Matricardi Fabio
FM-1976
18 followers · 99 following
https://medium.com/@fabio.matricardi
ThePoorGpuGuy
fabiomatricardi
AI & ML interests
Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.
Recent Activity
liked a model, about 20 hours ago: Tiiny/SmallThinker-3B-Preview
liked a model, 5 days ago: LiquidAI/LFM2-2.6B-Exp
reacted to codelion's post with 🚀, 5 days ago:
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!
Key findings from our research on optimal architectures for small language models:
→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning
We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.
Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m
Organizations
None yet
FM-1976's activity
liked a model, about 20 hours ago
Tiiny/SmallThinker-3B-Preview • Text Generation • 3B • Updated Jan 16 • 22.7k • 415
liked 2 models, 5 days ago
LiquidAI/LFM2-2.6B-Exp • Text Generation • 3B • Updated 5 days ago • 4k • 260
codelion/dhara-70m • Text Generation • 71.3M • Updated about 22 hours ago • 3.23k • 24
liked 2 models, 8 days ago
google/t5gemma-2-1b-1b • Image-Text-to-Text • 2B • Updated 13 days ago • 4.17k • 59
facebook/sam-audio-small • Updated about 12 hours ago • 6.78k • 61
liked 2 models, 18 days ago
hitonet/hito-1.7b • Text Generation • 2B • Updated 19 days ago • 1.05k • 7
ByteDance-Seed/Seed-X-PPO-7B • Translation • Updated Jul 28 • 13k • 285
liked 3 models, 19 days ago
NeuML/bert-hash-pico • Updated Oct 9 • 25 • 3
liu-nlp/hyperllama-180m-multilingual-1x • Text Generation • 0.2B • Updated 19 days ago • 51 • 1
TitleOS/Lightning-1.7B • Text Generation • 2B • Updated 20 days ago • 56 • 3
liked 2 models, 22 days ago
jhu-clsp/ettin-decoder-68m • Fill-Mask • Updated Jul 16 • 125 • 1
jhu-clsp/ettin-encoder-17m • Fill-Mask • Updated Jul 16 • 4.23k • 11
liked 5 models, 23 days ago
nvidia/parakeet-tdt-0.6b-v3 • Automatic Speech Recognition • Updated Nov 27 • 69.9k • 491
UsefulSensors/moonshine • Automatic Speech Recognition • Updated Nov 30 • 1 • 85
shoumenchougou/RWKV7-G1a-0.1B-GGUF • 0.2B • Updated Oct 16 • 220 • 3
shoumenchougou/RWKV7-G1b-1.5B-GGUF • 2B • Updated 27 days ago • 178 • 1
onnx-community/ettin-encoder-32m-ONNX • Fill-Mask • Updated 23 days ago • 21 • 1
liked 2 models, 24 days ago
LucidityAI/Astral-0.6B-Flash-Coder • 0.6B • Updated Oct 5 • 14 • 1
keras/moonshine_tiny_en • Updated Jun 17 • 10 • 1
liked a model, 25 days ago
mradermacher/aquif-3.5-Nano-1B-GGUF • 2B • Updated 29 days ago • 638 • 1