Predicting the Order of Upcoming Tokens Improves Language Modeling • Paper • 2508.19228 • Published Aug 26 • 22
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability • Paper • 2506.01789 • Published Jun 2 • 14
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax • Paper • 2504.20966 • Published Apr 29 • 32
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines • Paper • 2410.12705 • Published Oct 16, 2024 • 32