DeepSeek-V3.2-Speciale-GGUF-preview

GGUF quants for ik_llama.cpp (llama.cpp fork):
https://github.com/ikawrakow/ik_llama.cpp/tree/main

An experiment created by stripping out the DeepseekV32 lightning indexer and treating the model as the DeepseekV3 architecture.
Initial tests suggest minimal performance loss.
Note: tool calling is somewhat broken, as support for the new tool-call format is not provided here.

No immediate plans to provide preview quants for mainline llama.cpp, given current limitations and incoming DeepseekV32 compatibility.

Optimized quants are provided based loosely on my own experiments and on work done by ubergarm.

Note: the weights are split into multiple files and will need to be concatenated after download:

```shell
huggingface-cli download --local-dir DeepSeek-V3.2-Speciale-GGUF-preview --include "q4_k_m/*" gallantpigeon/DeepSeek-V3.2-Speciale-GGUF-preview
cat q4_k_m/* > deepseek-v3.2-speciale-q4_k_m.gguf
```
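If a POSIX shell is not available, the same concatenation can be sketched in Python. This assumes the shard file names sort lexicographically, matching the shell glob's expansion order; `concat_shards` is a hypothetical helper for illustration, not part of any tool mentioned here.

```python
import shutil
from pathlib import Path

def concat_shards(shard_dir, out_path):
    """Concatenate split GGUF shards into a single file.

    Shards are joined in sorted (lexicographic) name order, which is
    the same order a shell glob like `q4_k_m/*` expands to.
    """
    shards = sorted(p for p in Path(shard_dir).iterdir() if p.is_file())
    with open(out_path, "wb") as out:
        for shard in shards:
            with open(shard, "rb") as f:
                shutil.copyfileobj(f, out)
```

Write the merged file outside the shard directory so it is not picked up by a later re-run.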