moonshotai's Collections
Moonlight-A3B
Updated 19 days ago
Moonshot's compute-efficient MoE LLM and the first scaled-up use of the Muon optimizer
Upvotes: 8
moonshotai/Moonlight-16B-A3B-Instruct
Text Generation • 16B • Updated Mar 3 • 18.2k downloads • 184 likes
moonshotai/Moonlight-16B-A3B
Text Generation • 16B • Updated Feb 26 • 11.9k downloads • 97 likes
Muon is Scalable for LLM Training
Paper • 2502.16982 • Published Feb 24 • 8 upvotes