SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations Paper • 2512.14080 • Published 12 days ago • 5
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 5 days ago • 105
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 5 days ago • 81
Motif-Technologies/Motif-2-12.7B-Reasoning Text Generation • 13B • Updated 16 days ago • 639 • 33
Changelog Team & Enterprise Articles Now Featured on the Hugging Face Blog 20 days ago • 71
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 27 days ago • 93
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23 • 276
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 28 days ago • 256
INTELLECT-3 Collection INTELLECT-3: A 100B+ MoE trained with large-scale RL • 4 items • Updated 30 days ago • 11