Matricardi Fabio

FM-1976

https://medium.com/@fabio.matricardi

AI & ML interests

Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.

Recent Activity

reacted to codelion's post with 🚀 5 days ago

Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m

Organizations

None yet

liked a model about 20 hours ago

Tiiny/SmallThinker-3B-Preview

Text Generation • 3B • Updated Jan 16 • 22.7k • 415

liked 2 models 5 days ago

LiquidAI/LFM2-2.6B-Exp

Text Generation • 3B • Updated 5 days ago • 4k • 260

codelion/dhara-70m

Text Generation • 71.3M • Updated about 22 hours ago • 3.23k • 24

liked 2 models 8 days ago

google/t5gemma-2-1b-1b

Image-Text-to-Text • 2B • Updated 13 days ago • 4.17k • 59

facebook/sam-audio-small

Updated about 12 hours ago • 6.78k • 61

liked 2 models 18 days ago

hitonet/hito-1.7b

Text Generation • 2B • Updated 19 days ago • 1.05k • 7

ByteDance-Seed/Seed-X-PPO-7B

Translation • Updated Jul 28 • 13k • 285

liked 3 models 19 days ago

NeuML/bert-hash-pico

Updated Oct 9 • 25 • 3

liu-nlp/hyperllama-180m-multilingual-1x

Text Generation • 0.2B • Updated 19 days ago • 51 • 1

TitleOS/Lightning-1.7B

Text Generation • 2B • Updated 20 days ago • 56 • 3

liked 2 models 22 days ago

jhu-clsp/ettin-decoder-68m

Fill-Mask • Updated Jul 16 • 125 • 1

jhu-clsp/ettin-encoder-17m

Fill-Mask • Updated Jul 16 • 4.23k • 11

liked 5 models 23 days ago

nvidia/parakeet-tdt-0.6b-v3

Automatic Speech Recognition • Updated Nov 27 • 69.9k • 491

UsefulSensors/moonshine

Automatic Speech Recognition • Updated Nov 30 • 1 • 85

shoumenchougou/RWKV7-G1a-0.1B-GGUF

0.2B • Updated Oct 16 • 220 • 3

shoumenchougou/RWKV7-G1b-1.5B-GGUF

2B • Updated 27 days ago • 178 • 1

onnx-community/ettin-encoder-32m-ONNX

Fill-Mask • Updated 23 days ago • 21 • 1

liked 2 models 24 days ago

LucidityAI/Astral-0.6B-Flash-Coder

0.6B • Updated Oct 5 • 14 • 1

keras/moonshine_tiny_en

Updated Jun 17 • 10 • 1

liked a model 25 days ago

mradermacher/aquif-3.5-Nano-1B-GGUF

2B • Updated 29 days ago • 638 • 1