DeepSeek-V3.2-Speciale-GGUF-preview

GGUF quants for ik_llama.cpp (llama.cpp fork):
https://github.com/ikawrakow/ik_llama.cpp/tree/main

An experiment created by stripping out the DeepseekV32 lightning indexer and treating the model as the DeepseekV3 architecture.
Initial tests suggest minimal performance loss.
Note: tool calling is somewhat broken, as support for the new tool-call format is not provided here.

No immediate plans to provide preview quants for mainline llama.cpp, given current limitations and incoming DeepseekV32 compatibility.

Optimized quants are provided based loosely on my own experiments and on work done by ubergarm.

Note: the weights are split into multiple files and will need to be concatenated after download:

```shell
huggingface-cli download --local-dir DeepSeek-V3.2-Speciale-GGUF-preview --include "q4_k_m/*" gallantpigeon/DeepSeek-V3.2-Speciale-GGUF-preview
cat q4_k_m/* > deepseek-v3.2-speciale-q4_k_m.gguf
```
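If a POSIX shell is not available, the same concatenation can be sketched in Python. This assumes the shard file names sort lexicographically, matching the shell glob's expansion order; `concat_shards` is a hypothetical helper for illustration, not part of any tool mentioned here.

```python
import shutil
from pathlib import Path

def concat_shards(shard_dir, out_path):
    """Concatenate split GGUF shards into a single file.

    Shards are joined in sorted (lexicographic) name order, which is
    the same order a shell glob like `q4_k_m/*` expands to.
    """
    shards = sorted(p for p in Path(shard_dir).iterdir() if p.is_file())
    with open(out_path, "wb") as out:
        for shard in shards:
            with open(shard, "rb") as f:
                shutil.copyfileobj(f, out)
```

Write the merged file outside the shard directory so it is not picked up by a later re-run.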