Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation (article, Sep 16)
The Smol Training Playbook 📚: The secrets to building world-class LLMs