MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion Paper • 2502.04235 • Published Feb 6, 2025 • 23
Memory Retrieval and Consolidation in Large Language Models through Function Tokens Paper • 2510.08203 • Published Oct 9, 2025 • 10
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8, 2025 • 31
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published Oct 2, 2025 • 96
Generative Universal Verifier as Multimodal Meta-Reasoner Paper • 2510.13804 • Published Oct 15, 2025 • 27
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures Paper • 2510.14616 • Published Oct 16, 2025 • 13
Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published Oct 28, 2025 • 17
MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity Paper • 2511.03146 • Published Nov 5, 2025 • 7
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 59
Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space Paper • 2510.04476 • Published Oct 6, 2025 • 16
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 464
Deep Ignorance Collection This collection contains the model and data artifacts from O'Brien et al. (2025). https://deepignorance.ai • 44 items • Updated Dec 17, 2025 • 10
H-Net Collection The family of hierarchical networks (H-Nets) from https://arxiv.org/abs/2507.07955 • 8 items • Updated Jul 11, 2025 • 20
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5, 2025 • 36