# Granite 4.0 1b (GGUF)

This repository contains GGUF-format conversions of an IBM Granite base model at various quantization levels.

Please reference the base model's full model card here: https://huggingface.co/ibm-granite/granite-4.0-1b
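
As a quick sketch, one way to fetch a specific GGUF file from this repository with `huggingface_hub` (the filename below is an assumption; check the repository's file listing for the exact name):

```python
from huggingface_hub import hf_hub_download

# Download one quantized file; the filename is hypothetical --
# substitute the exact GGUF name listed in this repository.
path = hf_hub_download(
    repo_id="ibm-granite/granite-4.0-1b-GGUF",
    filename="granite-4.0-1b-Q4_K_M.gguf",
)
print(path)  # local cache path to the downloaded model file
```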

## Known Issues

This model often uses the full numerical range of a 32-bit float (f32), so variants with smaller numerical ranges may run into precision errors (overflow to `inf`/`NaN`) at inference. In particular, the F16 variant is known to fail on many hardware combinations: f16's 5-bit exponent limits its representable range to roughly ±65504, whereas bf16 keeps f32's 8-bit exponent and therefore its full dynamic range.

The recommended full-precision variant is bf16.
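
A small illustration of the range difference behind this issue (not from the model card itself, just a demonstration of f16 vs. bf16 behavior using PyTorch):

```python
import torch

# f16 has a 5-bit exponent (max ~65504); bf16 keeps f32's 8-bit exponent.
print(torch.finfo(torch.float16).max)   # 65504.0
print(torch.finfo(torch.bfloat16).max)  # ~3.39e38, same order as f32

x = torch.tensor(70000.0)               # fits easily in f32/bf16 range
print(x.to(torch.float16))              # inf -> NaNs propagate downstream
print(x.to(torch.bfloat16))             # finite, just coarsely rounded
```

Any activation that exceeds f16's maximum overflows to infinity, which is why quantizations derived from the bf16 variant are safer on this model.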

## Model details

- Format: GGUF
- Model size: 2B params
- Architecture: granite
- Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit
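
A minimal sketch of loading one of the quantized files with `llama-cpp-python`; the filename glob and parameters are assumptions, so verify them against the repository's file listing:

```python
from llama_cpp import Llama

# Hypothetical filename pattern -- assumes a Q4_K_M variant exists.
llm = Llama.from_pretrained(
    repo_id="ibm-granite/granite-4.0-1b-GGUF",
    filename="*Q4_K_M.gguf",  # glob matched against the repo's files
    n_ctx=4096,               # context window for this session
)
out = llm("Explain the GGUF format in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```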
