What Language Model to Train if You Have One Million GPU Hours? Paper • 2210.15424 • Published Oct 27, 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022
Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies Paper • 2305.12586 • Published May 21, 2023
TESS 2: A Large-Scale Generalist Diffusion Language Model Paper • 2502.13917 • Published Feb 19, 2025
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research Paper • 2505.11855 • Published May 17, 2025
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning Paper • 2502.17407 • Published Feb 24, 2025
KMMLU: Measuring Massive Multitask Language Understanding in Korean Paper • 2402.11548 • Published Feb 18, 2024
HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models Paper • 2309.02706 • Published Sep 6, 2023
Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance Paper • 2301.03136 • Published Jan 9, 2023
TESS: Text-to-Text Self-Conditioned Simplex Diffusion Paper • 2305.08379 • Published May 15, 2023