-
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 41 -
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Paper • 2408.08152 • Published • 59 -
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 425 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 131
Collections
Collections including paper arxiv:2402.03300
-
Reasoning Language Models: A Blueprint
Paper • 2501.11223 • Published • 33 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 131 -
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 425
-
deepseek-ai/DeepSeek-V3-Base
685B • Updated • 4.85k • 1.68k -
TransMLA: Multi-head Latent Attention Is All You Need
Paper • 2502.07864 • Published • 58 -
2
Qwen2.5 Bakeneko 32b Instruct Awq
⚡Generate detailed responses to text prompts
-
3
Deepseek R1 Distill Qwen2.5 Bakeneko 32b Awq
⚡Generate text responses to user messages in a chat interface
-
Atla Selene Mini: A General Purpose Evaluation Model
Paper • 2501.17195 • Published • 35 -
DeepSeek-V3 Technical Report
Paper • 2412.19437 • Published • 71 -
Optimizing Large Language Model Training Using FP4 Quantization
Paper • 2501.17116 • Published • 37 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 131
-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 115 -
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper • 2412.06559 • Published • 84 -
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
Paper • 2412.15084 • Published • 13 -
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99
-
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 131 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 123 -
3.49k
The Ultra-Scale Playbook
🌌The ultimate guide to training LLM on large GPU Clusters
-
239
The Ultimate Guide to LLM Training | The Ultra-Scale Playbook
🔥Learn every aspect of LLM training
-
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper • 2405.04434 • Published • 22 -
Titans: Learning to Memorize at Test Time
Paper • 2501.00663 • Published • 26 -
Transformer^2: Self-adaptive LLMs
Paper • 2501.06252 • Published • 54 -
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published • 165