DeepSeek-V3.2-Speciale-GGUF-preview
GGUF quants for ik_llama.cpp (llama.cpp fork):
https://github.com/ikawrakow/ik_llama.cpp/tree/main
An experiment created by stripping out the DeepseekV32 lightning indexer and treating the model as the DeepseekV3 architecture. Initial tests suggest minimal performance loss.
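The stripping step described above can be sketched roughly as a tensor-name filter over the checkpoint's state dict. This is a hypothetical illustration only: the substring `"indexer"` and the helper name `strip_indexer` are assumptions, not the real DeepseekV32 tensor names or the actual conversion code.

```python
def strip_indexer(state_dict):
    """Drop lightning-indexer tensors so the remaining weights match the
    DeepseekV3 layout. The key pattern "indexer" is an assumption, not the
    real tensor naming used by the DeepseekV32 checkpoint."""
    return {k: v for k, v in state_dict.items() if "indexer" not in k}
```

In practice the real conversion would also need to rewrite the architecture metadata (e.g. the GGUF `general.architecture` field) so loaders treat the result as DeepseekV3.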
Note: tool calling is partially broken, as support for the new tool call format is not provided here.
No immediate plans to provide preview quants for mainline llama.cpp, given the current limitations and incoming DeepseekV32 compatibility.
Optimized quants are provided, based loosely on my own experiments and on work done by ubergarm.
Note: the weights are split across multiple files and will need to be concatenated after downloading:
```shell
huggingface-cli download --local-dir DeepSeek-V3.2-Speciale-GGUF-preview --include "q4_k_m/*" gallantpigeon/DeepSeek-V3.2-Speciale-GGUF-preview
cd DeepSeek-V3.2-Speciale-GGUF-preview
cat q4_k_m/* > deepseek-v3.2-speciale-q4_k_m.gguf
```
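The `cat` step above relies on shell glob expansion sorting the shards lexicographically. For platforms without `cat` (e.g. Windows), the same join can be sketched in Python; the function name `concat_shards` is an illustration, not part of any tooling shipped with this repo.

```python
import glob
import shutil

def concat_shards(pattern, out_path):
    """Concatenate split GGUF shards into one file, in lexicographic
    order (the same order the shell glob in `cat q4_k_m/*` produces)."""
    parts = sorted(glob.glob(pattern))
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                shutil.copyfileobj(f, out)
    return parts  # list of shard paths, in the order they were joined
```

Usage would mirror the shell command: `concat_shards("q4_k_m/*", "deepseek-v3.2-speciale-q4_k_m.gguf")`.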
Model tree for gallantpigeon/DeepSeek-V3.2-Speciale-GGUF-preview
- Base model: deepseek-ai/DeepSeek-V3.2-Exp-Base
- Finetuned: deepseek-ai/DeepSeek-V3.2-Speciale