# Hunyuan Image 3.0 - INT8 Quantized
This is an INT8 quantized version of Tencent's HunyuanImage-3.0 model, optimized for high-end GPU workflows without CPU offloading.
## Model Description
INT8 quantization of the Hunyuan Image 3.0 text-to-image diffusion transformer, providing a balance between the full BF16 precision and more aggressive NF4 quantization. This version maintains excellent image quality while reducing memory requirements.
**Key Features:**
- 🎯 High-quality output comparable to BF16
- 💾 ~80 GB VRAM required for the weights alone
- ⚡ ~3.5-minute generation time at base resolution
- 🔧 Designed for ComfyUI workflows
## VRAM Requirements
| Phase | VRAM Usage |
|---|---|
| Weight Loading | ~80 GB |
| Inference (additional) | ~12-20 GB |
| Total | ~92-100 GB |
**Recommended Hardware:**
- NVIDIA RTX 6000 Ada (48 GB) - requires model splitting/offloading
- NVIDIA RTX 6000 Blackwell (96 GB) - fits entirely in VRAM; see the workflows on the GitHub page
- Multi-GPU setups with 80 GB+ combined VRAM
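The numbers above follow from simple arithmetic. A quick sanity check, assuming a total parameter count on the order of 80B (an assumption for illustration; check the original model card for the exact figure):

```python
# Back-of-the-envelope VRAM estimate for INT8 weights.
# The ~80B parameter count is an assumption, not a figure from this card.
params = 80e9            # assumed total parameter count
bytes_per_weight = 1     # INT8 stores one byte per weight
weights_gb = params * bytes_per_weight / 1e9

inference_overhead_gb = (12, 20)  # additional inference range from the table
total_gb = (weights_gb + inference_overhead_gb[0],
            weights_gb + inference_overhead_gb[1])

print(f"weights: ~{weights_gb:.0f} GB")                      # weights: ~80 GB
print(f"total:   ~{total_gb[0]:.0f}-{total_gb[1]:.0f} GB")   # total: ~92-100 GB
```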
## Usage

### ComfyUI (Recommended)

This model is designed to work with the [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) custom nodes:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/EricRollei/Comfy_HunyuanImage3
```
Install the nodes and download this model to your ComfyUI models directory. The nodes handle INT8 loading automatically.
### Direct Usage

```python
# INT8 weights can be loaded with standard torch quantization utilities
# See the ComfyUI nodes for a reference implementation
```
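To illustrate what loading per-channel INT8 weights involves, the sketch below dequantizes an INT8 tensor back to BF16. The function name and tensor layout are assumptions for illustration; the actual checkpoint format is defined by the ComfyUI nodes.

```python
# Minimal sketch of per-channel INT8 dequantization with PyTorch.
# Tensor shapes here are toy stand-ins for one transformer linear layer.
import torch

def dequantize_int8(q_weight: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover BF16 weights from INT8 values and per-output-channel scales."""
    # scale has shape [out_channels, 1], so it broadcasts over each row
    return (q_weight.to(torch.float32) * scale).to(torch.bfloat16)

q = torch.randint(-128, 127, (4, 8), dtype=torch.int8)
s = torch.rand(4, 1) * 0.01
w = dequantize_int8(q, s)
print(w.dtype, w.shape)   # torch.bfloat16 torch.Size([4, 8])
```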
## Performance
- Generation Time: ~3.5 minutes for base resolution (1024x1024)
- Weight Loading: ~60 seconds (one-time per session)
- Quality: Excellent - minimal degradation from BF16
- Speed: Faster inference than BF16 due to reduced memory bandwidth
## Quantization Details
- Method: INT8 per-channel quantization
- Target: Hunyuan Image 3.0 transformer backbone
- Precision Loss: Minimal - image quality remains high
- Trade-off: Middle ground between NF4 (lower quality) and BF16 (highest VRAM)
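Per-channel quantization keeps one scale per output channel rather than one per tensor, which is what limits the precision loss. A minimal sketch of the symmetric variant (illustrative only; the actual quantization script used for this model is not published here):

```python
# Symmetric per-channel INT8 quantization: one scale per output row.
# This is a generic sketch of the technique, not this model's exact script.
import torch

def quantize_per_channel_int8(w: torch.Tensor):
    scale = w.abs().amax(dim=1, keepdim=True) / 127.0
    scale = scale.clamp_min(1e-8)  # guard against all-zero rows
    q = torch.round(w / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

w = torch.randn(4, 8)                    # stand-in for one linear layer
q, scale = quantize_per_channel_int8(w)
w_hat = q.to(torch.float32) * scale      # dequantize for comparison
max_err = (w - w_hat).abs().max().item()
print(f"max round-trip error: {max_err:.5f}")
```

The round-trip error is bounded by half a quantization step per channel, which is why image quality stays close to BF16.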
## Original Model

This is a quantized derivative of Tencent's HunyuanImage-3.0.

**Original Model Details:**
- Architecture: Diffusion Transformer
- Resolution: Up to 2048x2048
- Language Support: English and Chinese prompts
- License: Tencent Hunyuan Community License
Please review the original model card and license for full details on capabilities and restrictions.
## Limitations

- Requires a high-end professional GPU (80 GB+ VRAM)
- Not suitable for consumer GPUs (e.g., RTX 4090/5090) without further optimization
- INT8 quantization may introduce minor quality differences in edge cases
- Loading adds roughly one minute of overhead before the first generation
## Credits

- **Original Model:** Tencent Hunyuan Team
- **Quantization:** Eric Rollei
- **ComfyUI Integration:** [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3)
## License
This model inherits the license from the original Hunyuan Image 3.0 model:
- License: Tencent Hunyuan Community License
- Please review the original license for commercial use restrictions and requirements
## Citation

```bibtex
@misc{hunyuan-image-3-int8,
  author       = {Rollei, Eric},
  title        = {Hunyuan Image 3.0 INT8 Quantized},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/[YOUR_USERNAME]/[MODEL_NAME]}}
}
```
Original model citation:

```bibtex
@misc{tencent2024hunyuan,
  title        = {Hunyuan Image 3.0},
  author       = {Tencent Hunyuan Team},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/tencent/HunyuanImage-3.0}}
}
```