4-bit (UINT4 with SVD rank 32) quantization of Tongyi-MAI/Z-Image-Turbo using SDNQ.
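
To give a feel for what "UINT4 with SVD rank 32" can mean, here is a minimal NumPy sketch of one common scheme: keep a rank-32 SVD approximation of the weight in full precision and quantize only the residual to 4-bit unsigned integers. This is an illustration under assumptions, not SDNQ's actual implementation. The function names (`quantize_uint4`, `dequantize`, `svd_uint4`), the per-row asymmetric scaling, and the random test matrix are all hypothetical choices made for the example.

```python
import numpy as np

def quantize_uint4(x):
    """Asymmetric per-row quantization to integers in [0, 15] (illustrative)."""
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    scale = np.maximum(hi - lo, 1e-8) / 15.0  # 16 levels -> 15 steps
    q = np.clip(np.round((x - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Map 4-bit codes back to floating point."""
    return q.astype(np.float32) * scale + lo

def svd_uint4(w, rank=32):
    """Split w into a rank-`rank` float part plus a UINT4-quantized residual."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    low_a = u[:, :rank] * s[:rank]   # shape (rows, rank)
    low_b = vt[:rank]                # shape (rank, cols)
    q, scale, lo = quantize_uint4(w - low_a @ low_b)
    return low_a, low_b, q, scale, lo

# Hypothetical stand-in for a weight matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

low_a, low_b, q, scale, lo = svd_uint4(w, rank=32)
w_hat = low_a @ low_b + dequantize(q, scale, lo)
print("mean abs reconstruction error:", float(np.abs(w - w_hat).mean()))
```

In this sketch the low-rank factors stay in floating point, so only the residual pays the 4-bit rounding cost. Real implementations typically also pack two 4-bit codes per byte and use group-wise scales, which are omitted here.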

Usage:

Install SDNQ from GitHub:

pip install git+https://github.com/Disty0/sdnq

Then run the pipeline in Python:

import torch
import diffusers
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers

pipe = diffusers.ZImagePipeline.from_pretrained("Disty0/Z-Image-Turbo-SDNQ-uint4-svd-r32", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,
    guidance_scale=0.0,
    generator=torch.manual_seed(42),
).images[0]
image.save("z-image-turbo-sdnq-uint4-svd-r32.png")

Original BF16 vs SDNQ quantization comparison:

| Quantization | Model Size | Visualization |
|---|---|---|
| Original BF16 | 12.3 GB | (image: Original BF16) |
| SDNQ UINT4 | 3.5 GB | (image: SDNQ UINT4) |
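
As a quick sanity check on the sizes listed above, the following sketch computes the compression ratio and the relative size reduction. It uses only the two figures from the comparison (12.3 GB and 3.5 GB); the variable names are illustrative.

```python
# Model sizes taken from the comparison above, in GB.
bf16_gb = 12.3
uint4_gb = 3.5

ratio = bf16_gb / uint4_gb            # how many times smaller the quantized model is
reduction = 1.0 - uint4_gb / bf16_gb  # fraction of the original size saved

print(f"compression ratio: {ratio:.2f}x")
print(f"size reduction: {reduction:.1%}")
```

The ratio comes out below the 4x that a pure 16-bit to 4-bit conversion would suggest. A plausible reason, stated here as an assumption, is that quantization scales, the SVD factors, and any components left unquantized add to the total.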