DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published 2 days ago • 16
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models Paper • 2601.07372 • Published 16 days ago • 38
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 90
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 254
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 75
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition Paper • 2504.21801 • Published Apr 30, 2025 • 4
Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published Apr 3, 2025 • 58
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling Paper • 2501.17811 • Published Jan 29, 2025 • 8
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 167
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 437
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation Paper • 2410.13848 • Published Oct 17, 2024 • 35
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Paper • 2412.10302 • Published Dec 13, 2024 • 22
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15, 2024 • 61
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Paper • 2407.01906 • Published Jul 2, 2024 • 46
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23, 2024 • 43
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17, 2024 • 68