# Granite 4.0 1b (GGUF)

This repository contains GGUF-format conversions of an IBM Granite base model at various quantization levels.

Please reference the base model's full model card here: https://huggingface.co/ibm-granite/granite-4.0-1b
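
As a quick sketch, one way to fetch a specific GGUF file from this repository with `huggingface_hub` (the filename below is an assumption; check the repository's file listing for the exact name):

```python
from huggingface_hub import hf_hub_download

# Download one quantized file; the filename is hypothetical --
# substitute the exact GGUF name listed in this repository.
path = hf_hub_download(
    repo_id="ibm-granite/granite-4.0-1b-GGUF",
    filename="granite-4.0-1b-Q4_K_M.gguf",
)
print(path)  # local cache path to the downloaded model file
```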

## Known Issues

This model often uses the full numerical range of a 32-bit float (f32), so variants with smaller numerical ranges may run into precision errors (overflow to `inf`/`NaN`) at inference. In particular, the F16 variant is known to fail on many hardware combinations: f16's 5-bit exponent limits its representable range to roughly ±65504, whereas bf16 keeps f32's 8-bit exponent and therefore its full dynamic range.

The recommended full-precision variant is bf16.
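
A small illustration of the range difference behind this issue (not from the model card itself, just a demonstration of f16 vs. bf16 behavior using PyTorch):

```python
import torch

# f16 has a 5-bit exponent (max ~65504); bf16 keeps f32's 8-bit exponent.
print(torch.finfo(torch.float16).max)   # 65504.0
print(torch.finfo(torch.bfloat16).max)  # ~3.39e38, same order as f32

x = torch.tensor(70000.0)               # fits easily in f32/bf16 range
print(x.to(torch.float16))              # inf -> NaNs propagate downstream
print(x.to(torch.bfloat16))             # finite, just coarsely rounded
```

Any activation that exceeds f16's maximum overflows to infinity, which is why quantizations derived from the bf16 variant are safer on this model.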

## Model details

- Format: GGUF
- Model size: 2B params
- Architecture: granite
- Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit
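
A minimal sketch of loading one of the quantized files with `llama-cpp-python`; the filename glob and parameters are assumptions, so verify them against the repository's file listing:

```python
from llama_cpp import Llama

# Hypothetical filename pattern -- assumes a Q4_K_M variant exists.
llm = Llama.from_pretrained(
    repo_id="ibm-granite/granite-4.0-1b-GGUF",
    filename="*Q4_K_M.gguf",  # glob matched against the repo's files
    n_ctx=4096,               # context window for this session
)
out = llm("Explain the GGUF format in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```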
