Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 14 items • Updated 6 days ago • 39
✨SimpleChat Collection The SimpleChat series represents our new exploration into Non-Chain-of-Thought (Non-CoT) models, designed to be concise, rational, and empathetic. • 5 items • Updated Sep 3 • 3
Article Memory-efficient Diffusion Transformers with Quanto and Diffusers Jul 30, 2024 • 68