# olmo-3-DISTILL-glm-4.7-think-GGUF

GGUF-quantized versions of **olmo-3-DISTILL-glm-4.7-think**.

## Available Formats

| Filename | Size | Quant Type |
|----------|------|------------|
| olmo-3-DISTILL-glm-4.7-think-f16.gguf | 13.60 GB | F16 |
| olmo-3-DISTILL-glm-4.7-think-q2_k.gguf | 2.66 GB | Q2_K |
| olmo-3-DISTILL-glm-4.7-think-q3_k_l.gguf | 3.68 GB | Q3_K_L |
| olmo-3-DISTILL-glm-4.7-think-q3_k_m.gguf | 3.40 GB | Q3_K_M |
| olmo-3-DISTILL-glm-4.7-think-q3_k_s.gguf | 3.08 GB | Q3_K_S |
| olmo-3-DISTILL-glm-4.7-think-q4_0.gguf | 3.93 GB | Q4_0 |
| olmo-3-DISTILL-glm-4.7-think-q4_1.gguf | 4.33 GB | Q4_1 |
| olmo-3-DISTILL-glm-4.7-think-q4_k_m.gguf | 4.16 GB | Q4_K_M |
| olmo-3-DISTILL-glm-4.7-think-q4_k_s.gguf | 3.96 GB | Q4_K_S |
| olmo-3-DISTILL-glm-4.7-think-q5_0.gguf | 4.73 GB | Q5_0 |
| olmo-3-DISTILL-glm-4.7-think-q5_1.gguf | 5.13 GB | Q5_1 |
| olmo-3-DISTILL-glm-4.7-think-q5_k_m.gguf | 4.85 GB | Q5_K_M |
| olmo-3-DISTILL-glm-4.7-think-q5_k_s.gguf | 4.73 GB | Q5_K_S |
| olmo-3-DISTILL-glm-4.7-think-q6_k.gguf | 5.58 GB | Q6_K |
| olmo-3-DISTILL-glm-4.7-think-q8_0.gguf | 7.23 GB | Q8_0 |

## Quick Start

### Ollama

```bash
# Use Q4_K_M (recommended)
ollama run hf.co/glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF:Q4_K_M

# Or other quantizations
ollama run hf.co/glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF:Q8_0
ollama run hf.co/glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF:Q2_K
```
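
Once pulled, the model can also be queried through Ollama's local REST API (it listens on port 11434 by default); a minimal sketch using the same model tag as above:

```bash
# Single (non-streaming) completion via Ollama's REST API.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF:Q4_K_M",
  "prompt": "Explain GGUF quantization in one sentence.",
  "stream": false
}'
```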

### llama.cpp

```bash
# Download and run
llama-cli --hf-repo glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF --hf-file olmo-3-distill-glm-4.7-think-q4_k_m.gguf -p "Hello, how are you?"

# With server
llama-server --hf-repo glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF --hf-file olmo-3-distill-glm-4.7-think-q4_k_m.gguf -c 2048
```
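
`llama-server` exposes an OpenAI-compatible HTTP API (default `http://localhost:8080`), so once it is running you can send chat requests with plain `curl`; a minimal sketch:

```bash
# Chat completion against llama-server's OpenAI-compatible endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "max_tokens": 128
  }'
```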

### LM Studio / GPT4All

Download the `.gguf` file of your choice and load it in your application.
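
To fetch a single file from the command line instead, the `huggingface-cli` tool (from the `huggingface_hub` Python package) can download one quantization directly; a sketch for the recommended Q4_K_M file:

```bash
# Download one quantization into the current directory.
huggingface-cli download glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF \
  olmo-3-distill-glm-4.7-think-q4_k_m.gguf \
  --local-dir .
```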

## Quantization Details

| Type | Bits | Use Case |
|------|------|----------|
| Q2_K | 2 | Extreme compression, low quality |
| Q3_K_M | 3 | Very compressed |
| Q4_K_M | 4 | **Recommended** - best size/quality balance |
| Q5_K_M | 5 | High quality |
| Q6_K | 6 | Very high quality |
| Q8_0 | 8 | Near lossless |
| F16 | 16 | Original precision |
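
To check these trade-offs on your own data, llama.cpp ships a perplexity tool (named `llama-perplexity` in recent builds); a rough sketch, assuming you have a plain-text evaluation file such as `wiki.test.raw` on hand (lower perplexity is better):

```bash
# Compare quantizations on the same evaluation text; expect the
# smaller quant to report a somewhat higher (worse) perplexity.
llama-perplexity -m olmo-3-distill-glm-4.7-think-q4_k_m.gguf -f wiki.test.raw
llama-perplexity -m olmo-3-distill-glm-4.7-think-q8_0.gguf -f wiki.test.raw
```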

## Original Model

This is a quantized version of **olmo-3-DISTILL-glm-4.7-think**.

- **Base Model:** unsloth/Olmo-3-7B-Think
- **Fine-tuning Dataset:** TeichAI/glm-4.7-2000x
- **Model Size:** 7B parameters (olmo2 architecture)
- **Special Feature:** thinking/reasoning with `<think>` tags (see the sketch below for stripping them from output)
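
The model emits its reasoning between `<think>` and `</think>` before the final answer. If you only want the answer, the reasoning block can be stripped in a shell pipeline; a minimal sketch using `perl`, assuming the tags appear literally in the output as described above:

```bash
# Run the model and strip the <think>...</think> reasoning block,
# keeping only the final answer. perl -0777 slurps the whole input
# so the regex can match across newlines.
llama-cli --hf-repo glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF \
  --hf-file olmo-3-distill-glm-4.7-think-q4_k_m.gguf \
  -p "What is 17 * 23?" 2>/dev/null \
  | perl -0777 -pe 's/<think>.*?<\/think>\s*//s'
```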