Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models Paper • 2508.01908 • Published Aug 3 • 3
BFM v0 Collection General-Purpose Brain Foundation Models for Time-Series Neuroimaging Data • 4 items • Updated Aug 6
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
Continual Pre-Training of Large Language Models: How to (re)warm your model? Paper • 2308.04014 • Published Aug 8, 2023 • 2
$μ$LO: Compute-Efficient Meta-Generalization of Learned Optimizers Paper • 2406.00153 • Published May 31, 2024 • 13