Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning Paper • 2601.20829 • Published 5 days ago • 5
guactastesgood/DeepSeek-R1-Distill-Qwen-1.5B-failure-prefix-conditioning-iteration1 Updated about 10 hours ago
Failure-Prefix Conditioning Collection Collection for the paper: Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning • 1 item • Updated about 10 hours ago
Reinforcement Learning vs. Distillation: Understanding Accuracy and Capability in LLM Reasoning Paper • 2505.14216 • Published May 20, 2025 • 2
Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings Paper • 2505.13718 • Published May 19, 2025 • 7
Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges Paper • 2502.08680 • Published Feb 12, 2025 • 11