ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases By QuentinJG and 4 others • 6 days ago • 36
⚡ Power, Heat, and Intelligence ☁️ - AI Data Centers Explained 🏭 By sasha and 1 other • 6 days ago • 12
Llasa Goes RL: Training LLaSA with GRPO for Improved Prosody and Expressiveness By Steveeeeeeen • 6 days ago • 10
Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation By exploding-gradients • Sep 16 • 12
Budget Alignment: Making Models Reason in the User’s Language By shanchen and 2 others • 7 days ago • 6
Who Routes LLM Routers? RouterArena: Building the Evaluation Foundation for LLM Routing By JerryPotter and 6 others • about 3 hours ago • 5