moonshotai's Collections
Moonlight-A3B
Updated 6 days ago
Moonshot's compute-efficient MoE LLM, the first scaled-up application of the Muon optimizer
Upvotes: 8
moonshotai/Moonlight-16B-A3B-Instruct • Text Generation • 16B • Updated Mar 3 • 17.8k downloads • 184 likes
moonshotai/Moonlight-16B-A3B • Text Generation • 16B • Updated Feb 26 • 13k downloads • 96 likes
Muon is Scalable for LLM Training • Paper • 2502.16982 • Published Feb 24 • 7 upvotes