glm4-moe-tiny
A small (~543M parameter) GLM-4 MoE model for testing prime-rl integration with HuggingFace Transformers and vLLM.
Purpose
This model is not for production use. It exists to:
- Validate MoE weight conversion between HuggingFace and prime-rl formats (see the sketch after this list)
- Test the full RL training pipeline (inference server + trainer) at small scale
- Catch architecture-specific bugs without needing 100B+ parameter models
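As a loose illustration of the HuggingFace side of such a conversion check, the sketch below lists per-expert tensors and their shapes straight from the checkpoint's safetensors shards. The repo id `PrimeIntellect/glm4-moe-tiny` and the assumption that expert parameters contain `"expert"` in their names are assumptions here, not guarantees from this card.

```python
# Sketch: enumerate per-expert tensors and shapes from the HF checkpoint,
# e.g. to compare against the converted prime-rl weights.
# Repo id and the "expert" name substring are assumptions.
import glob
import os

from huggingface_hub import snapshot_download
from safetensors import safe_open

local_dir = snapshot_download(
    "PrimeIntellect/glm4-moe-tiny", allow_patterns=["*.safetensors"]
)
for shard in sorted(glob.glob(os.path.join(local_dir, "*.safetensors"))):
    with safe_open(shard, framework="pt") as f:
        for key in f.keys():
            if "expert" in key:
                # get_slice avoids materializing the tensor just to read its shape
                print(key, tuple(f.get_slice(key).get_shape()))
```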
The model has been fine-tuned on PrimeIntellect/Reverse-Text-SFT so that it provides a non-trivial reference distribution for the KL-divergence penalty during RL.
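To see why a fine-tuned reference matters, here is a minimal, framework-agnostic sketch of a per-token KL estimate against this model as the frozen reference. It assumes the repo id `PrimeIntellect/glm4-moe-tiny` and a transformers version with GLM-4 MoE support, and it uses a second copy of the model as a stand-in for the trained policy; it is not the prime-rl implementation.

```python
# Sketch: per-token KL(policy || reference) with this model as the reference.
# Repo id and GLM-4 MoE support in transformers are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrimeIntellect/glm4-moe-tiny"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
reference = AutoModelForCausalLM.from_pretrained(model_id).eval()
policy = AutoModelForCausalLM.from_pretrained(model_id)  # stand-in for the RL policy

inputs = tokenizer("Reverse the text: hello world", return_tensors="pt")
with torch.no_grad():
    ref_logprobs = torch.log_softmax(reference(**inputs).logits, dim=-1)
    policy_logprobs = torch.log_softmax(policy(**inputs).logits, dim=-1)

# Exact KL over the vocabulary at each position
kl_per_token = (policy_logprobs.exp() * (policy_logprobs - ref_logprobs)).sum(-1)
print(kl_per_token.mean())  # ~0 here, since policy and reference are identical copies
```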
Quick Start
```bash
# Run RL with reverse-text environment
uv run rl @ configs/ci/integration/rl_moe/glm4_moe.toml
```
See the Testing MoE at Small Scale guide for full instructions.
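Before running the full pipeline, it can be useful to confirm that the inference side loads the checkpoint at all. A minimal sketch, assuming the repo id `PrimeIntellect/glm4-moe-tiny` and a vLLM build with GLM-4 MoE support (this is not taken from the guide above):

```python
# Sketch: spot-check that vLLM can load and run the tiny checkpoint
# outside the RL pipeline. Repo id is an assumption.
from vllm import LLM, SamplingParams

llm = LLM(model="PrimeIntellect/glm4-moe-tiny")
outputs = llm.generate(
    ["Reverse the text: hello world"],
    SamplingParams(temperature=0.0, max_tokens=32),
)
print(outputs[0].outputs[0].text)
```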
Model Details
| Parameter | Value |
|---|---|
| Hidden size | 1024 |
| Layers | 24 |
| Experts | 8 |
| Active experts per token | 4 |
| Parameters | ~543M |
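The values above can be cross-checked against the checkpoint's config. The attribute names below (`n_routed_experts`, `num_experts_per_tok`) follow the usual GLM-4 MoE config convention in transformers and are assumptions rather than something this card guarantees:

```python
# Sketch: verify the table against the checkpoint's config.
# Attribute names for the expert counts are assumptions (getattr guards them).
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("PrimeIntellect/glm4-moe-tiny")
print(cfg.hidden_size)                            # expected 1024
print(cfg.num_hidden_layers)                      # expected 24
print(getattr(cfg, "n_routed_experts", None))     # expected 8
print(getattr(cfg, "num_experts_per_tok", None))  # expected 4
```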
Links
- prime-rl - RL training framework
- PrimeIntellect - Building infrastructure for decentralized AI