Estimating Knowledge in Large Language Models Without Generating a Single Token Paper • 2406.12673 • Published Jun 18, 2024 • 9
ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations Paper • 2505.02819 • Published May 5 • 26
Share Your Attention: Transformer Weight Sharing via Matrix-based Dictionary Learning Paper • 2508.04581 • Published Aug 6 • 5
Article Sparse Mixture of Experts Language Model from Scratch: Extending makeMoE with Expert Capacity Mar 18, 2024 • 13
DyVo: Dynamic Vocabularies for Learned Sparse Retrieval with Entities Paper • 2410.07722 • Published Oct 10, 2024 • 15
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 Jul 1 • 132
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 222
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards Paper • 2505.24760 • Published May 30 • 74
Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26 • 177
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published May 22 • 24
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model Paper • 1902.04094 • Published Feb 11, 2019 • 1
Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 38