Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs Paper • 2508.14896 • Published Aug 20 • 22
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22 • 63
view article Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Jul 29 • 197
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intellegence • 5 items • Updated about 14 hours ago • 143
view article Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Feb 11 • 83
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 56
view article Article Illustrating Reinforcement Learning from Human Feedback (RLHF) Dec 9, 2022 • 371
view article Article Multi-Label Classification Model From Scratch: Step-by-Step Tutorial Jan 8, 2024 • 47