sumitdotml/moe-emergence
Tags: Text Generation, Transformers, Safetensors, English, mixture-of-experts, gpt2, research, expert-specialization
Datasets: codeparrot/codeparrot-clean, allenai/ai2_arc, allenai/c4
License: mit
main
2 contributors
History: 18 commits
Latest commit: sumit, "updated model card with ablation results and all 4 runs" (4049aa7), 9 days ago
dense-baseline: add dense and moe checkpoints (10 days ago)
moe-main: add dense and moe checkpoints (10 days ago)
no-lb-ablation: Upload no-lb-ablation/ckpt-step-500.pt with huggingface_hub (9 days ago)
top2-main-10k: Upload top2-main-10k/ckpt-step-9999.pt with huggingface_hub (9 days ago)
.gitattributes (1.52 kB, Safe): add dense and moe checkpoints (10 days ago)
README.md (7.09 kB): updated model card with ablation results and all 4 runs (9 days ago)