PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs Paper • 2503.09543 • Published Mar 12
CLIMB: Curriculum Learning for Infant-inspired Model Building Paper • 2311.08886 • Published Nov 15, 2023
Mitigating Frequency Bias and Anisotropy in Language Model Pre-Training with Syntactic Smoothing Paper • 2410.11462 • Published Oct 15, 2024
Attention-based Contextual Language Model Adaptation for Speech Recognition Paper • 2106.01451 • Published Jun 2, 2021
From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes Paper • 2410.22906 • Published Oct 30, 2024
Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies Paper • 2410.22886 • Published Oct 30, 2024
Tending Towards Stability: Convergence Challenges in Small Language Models Paper • 2410.11451 • Published Oct 15, 2024
Self-Training Large Language Models for Tool-Use Without Demonstrations Paper • 2502.05867 • Published Feb 9
AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets Paper • 2404.05623 • Published Apr 8, 2024
Diable: Efficient Dialogue State Tracking as Operations on Tables Paper • 2305.17020 • Published May 26, 2023