GVE • Collection • Towards General Video Embeddings: Models and Benchmarks • 4 items • Updated 6 days ago • 16
Reasoning Language Model Inference Serving Unveiled: An Empirical Study • Paper • 2510.18672 • Published 18 days ago • 7
LightOnOCR • Collection • The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR • 6 items • Updated 13 days ago • 13
Kimi Linear: An Expressive, Efficient Attention Architecture • Paper • 2510.26692 • Published 9 days ago • 99
Reasoning with Sampling: Your Base Model is Smarter Than You Think • Paper • 2510.14901 • Published 23 days ago • 45
AION-1: Omnimodal Foundation Model for Astronomical Sciences • Paper • 2510.17960 • Published 19 days ago • 27
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling • Paper • 2510.15346 • Published 23 days ago • 32
Robust Layerwise Scaling Rules by Proper Weight Decay Tuning • Paper • 2510.15262 • Published 23 days ago • 5
Large Language Models Do NOT Really Know What They Don't Know • Paper • 2510.09033 • Published 30 days ago • 16
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling • Paper • 2510.11602 • Published 26 days ago • 14