+---
+base_model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
+pipeline_tag: text-generation
+inference: true
+language:
+- en
+license: mit
+model_creator: deepseek-ai
+model_name: DeepSeek-R1-0528-Qwen3-8B
+model_type: qwen3
+quantized_by: brittlewis12
+tags:
+- reasoning
+- deepseek
+- qwen3
+---
+# DeepSeek R1 0528 Qwen3 8B GGUF
+**Original model**: [DeepSeek-R1-0528-Qwen3-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B)
+**Model creator**: [DeepSeek AI](https://huggingface.co/deepseek-ai)
+> We distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking. We believe that the chain-of-thought from DeepSeek-R1-0528 will hold significant importance for both academic research on reasoning models and industrial development focused on small-scale models.
+This repo contains GGUF format model files for DeepSeek AI's _DeepSeek R1 0528 Qwen3 8B_.
+### What is GGUF?
+GGUF is a file format for representing AI models. It is the third version of the format, introduced by the llama.cpp team on August 21st 2023.
+Converted with llama.cpp build b5536 (revision [2b13162](https://github.com/ggml-org/llama.cpp/commits/2b131621e60d8ec2cc961201beb6773ab37b6b69)), using [autogguf-rs](https://github.com/brittlewis12/autogguf-rs).
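
To make the format description above more concrete, here is a minimal sketch that reads the fixed-size header at the start of a GGUF file. It assumes the little-endian layout used by GGUF version 2 and later (a 4-byte `GGUF` magic, a `uint32` version, then `uint64` tensor and metadata counts). The names `GGUFHeader` and `read_gguf_header` are hypothetical helpers for illustration, not part of llama.cpp or autogguf-rs.

```python
import struct
from typing import BinaryIO, NamedTuple

GGUF_MAGIC = b"GGUF"


class GGUFHeader(NamedTuple):
    """Fixed-size fields at the start of a GGUF file (hypothetical helper type)."""
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Parse the magic, version, and counts from a binary stream.

    Assumes a little-endian file in the GGUF v2+ layout.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    fixed = stream.read(20)
    if len(fixed) != 20:
        raise ValueError("truncated GGUF header")
    version, tensor_count, kv_count = struct.unpack("<IQQ", fixed)
    return GGUFHeader(version, tensor_count, kv_count)
```

Opening a downloaded file in binary mode and passing the handle to `read_gguf_header` should be enough to confirm that the file is GGUF and to see which format version it declares.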
+### Prompt template: [DeepSeek R1](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B/blob/main/tokenizer_config.json#L34)
+```
+{{system_message}}
+<｜User｜>{{prompt}}<｜Assistant｜>
+```
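
As a rough illustration of the template above, the sketch below fills it in by hand. It mirrors the template exactly as written here, including the line break after the system message, and it assumes that an empty system message is simply omitted. `build_prompt` is a hypothetical helper; runtimes that read the chat template embedded in the GGUF metadata typically apply it for you, so this is only relevant when constructing raw prompts manually.

```python
# Role markers copied verbatim from the template above (note the full-width bars).
USER_TAG = "<｜User｜>"
ASSISTANT_TAG = "<｜Assistant｜>"


def build_prompt(prompt: str, system_message: str = "") -> str:
    """Render a single-turn prompt in the DeepSeek R1 layout shown above."""
    # Assumption: when no system message is given, the first line is dropped.
    prefix = f"{system_message}\n" if system_message else ""
    return f"{prefix}{USER_TAG}{prompt}{ASSISTANT_TAG}"
```

For example, `build_prompt("What is 2 + 2?", "You are a careful assistant.")` yields the system message on the first line, followed by the user turn and the assistant marker that cues the model to respond.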
+### Notes from DeepSeek on Running Locally
+> Compared to previous versions of DeepSeek-R1, the usage recommendations for DeepSeek-R1-0528 have the following changes:
+>
+> - System prompt is supported now.
+> - It is not required to add `<think>\n` at the beginning of the output to force the model into thinking pattern.
+>
+> The model architecture of DeepSeek-R1-0528-Qwen3-8B is identical to that of Qwen3-8B, but it shares the same tokenizer configuration as DeepSeek-R1-0528.
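
Since the model enters its thinking pattern on its own, client code usually only needs to separate the reasoning from the final answer afterwards. The sketch below assumes the reasoning ends with a closing `</think>` tag and that the opening `<think>` tag may or may not appear in the returned text, depending on the runtime. `split_reasoning` is a hypothetical helper, not an API of llama.cpp or cnvrs.

```python
from typing import Tuple

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def split_reasoning(output: str) -> Tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumes the reasoning section is terminated by a closing think tag.
    If no closing tag is found, the whole output is treated as the answer.
    """
    head, sep, tail = output.partition(THINK_CLOSE)
    if not sep:
        return "", output.strip()
    reasoning = head.strip()
    # The opening tag may be absent if the runtime already consumed it.
    if reasoning.startswith(THINK_OPEN):
        reasoning = reasoning[len(THINK_OPEN):].strip()
    return reasoning, tail.strip()
```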
+---
+## Download & run with [cnvrs](https://twitter.com/cnvrsai) on iPhone, iPad, and Mac!
+![cnvrs.ai](https://pbs.twimg.com/profile_images/1744049151241797632/0mIP-P9e_400x400.jpg)
+[cnvrs](https://testflight.apple.com/join/sFWReS7K) is the best app for private, local AI on your device:
+- create & save **Characters** with custom system prompts & temperature settings
+- download and experiment with any **GGUF model** you can [find on HuggingFace](https://huggingface.co/models?library=gguf)!
+    * or, use an API key with the chat completions-compatible model provider of your choice -- ChatGPT, Claude, Gemini, DeepSeek, & more!
+- make it your own with custom **Theme colors**
+- powered by Metal ⚡️ & [llama.cpp](https://github.com/ggml-org/llama.cpp), with **haptics** during response streaming!
+- **try it out** yourself today, on [TestFlight](https://testflight.apple.com/join/sFWReS7K)!
+    * if you **already have the app**, download DeepSeek R1 0528 Qwen3 8B now!
+    * <cnvrsai:///models/search/hf?id=brittlewis12/DeepSeek-R1-0528-Qwen3-8B-GGUF>
+- follow [cnvrs on Twitter](https://twitter.com/cnvrsai) to stay up to date
+---
+## Original Model Evaluation
+> We distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
+| Model                          | AIME 24  | AIME 25  | HMMT Feb 25 | GPQA Diamond | LiveCodeBench (2408-2505) |
+|--------------------------------|----------|----------|-------------|--------------|---------------------------|
+| Qwen3-235B-A22B                | 85.7     | 81.5     | 62.5        | 71.1         | 66.5                      |
+| Qwen3-32B                      | 81.4     | 72.9     | -           | 68.4         | -                         |
+| Qwen3-8B                       | 76.0     | 67.3     | -           | 62.0         | -                         |
+| Phi-4-Reasoning-Plus-14B       | 81.3     | 78.0     | 53.6        | 69.3         | -                         |
+| Gemini-2.5-Flash-Thinking-0520 | 82.3     | 72.0     | 64.2        | 82.8         | 62.3                      |
+| o3-mini (medium)               | 79.6     | 76.7     | 53.3        | 76.8         | 65.9                      |
+| **DeepSeek-R1-0528-Qwen3-8B**  | **86.0** | **76.3** | **61.5**    | **61.1**     | **60.5**                  |
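
The "+10.0%" figure quoted above can be checked against the AIME 24 column, where it reads most naturally as a gap of 10.0 percentage points rather than a relative improvement. The short sketch below recomputes that gap from the table values; `AIME_24` and `margin` are hypothetical names used only for this illustration.

```python
# AIME 24 scores copied from the table above.
AIME_24 = {
    "Qwen3-235B-A22B": 85.7,
    "Qwen3-8B": 76.0,
    "DeepSeek-R1-0528-Qwen3-8B": 86.0,
}


def margin(scores: dict, model: str, baseline: str) -> float:
    """Return the score difference in percentage points, rounded to one decimal."""
    return round(scores[model] - scores[baseline], 1)


# Gap to the Qwen3-8B baseline, and to the much larger Qwen3-235B-A22B.
vs_8b = margin(AIME_24, "DeepSeek-R1-0528-Qwen3-8B", "Qwen3-8B")
vs_235b = margin(AIME_24, "DeepSeek-R1-0528-Qwen3-8B", "Qwen3-235B-A22B")
print(f"vs Qwen3-8B: +{vs_8b} points, vs Qwen3-235B-A22B: +{vs_235b} points")
```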
+---
+## DeepSeek R1 0528 Qwen3 8B in cnvrs on iOS
+![deepseek-r1-qwen3-8b in cnvrs pt1](https://cdn-uploads.huggingface.co/production/uploads/63b64d7a889aa6707f155cdb/nsXnOaK6Sb-0PGvdY8ayy.png)
+![deepseek-r1-qwen3-8b in cnvrs pt2](https://cdn-uploads.huggingface.co/production/uploads/63b64d7a889aa6707f155cdb/4AnhMFL41EuIwhKuVaCGi.png)
+---