# Hunyuan Image 3.0 - INT8 Quantized
This is an INT8 quantized version of Tencent's HunyuanImage-3.0 model, optimized for high-end GPU workflows without CPU offloading.
## Model Description
INT8 quantization of the Hunyuan Image 3.0 text-to-image diffusion transformer, providing a balance between the full BF16 precision and more aggressive NF4 quantization. This version maintains excellent image quality while reducing memory requirements.
**Key Features:**
- 🎯 High-quality output comparable to BF16
- 💾 ~80 GB VRAM required for the weights alone
- ⚡ ~3.5-minute generation time at base resolution
- 🔧 Designed for ComfyUI workflows
## VRAM Requirements
| Phase | VRAM Usage |
|---|---|
| Weight Loading | ~80 GB |
| Inference (additional) | ~12-20 GB |
| Total | ~92-100 GB |
**Recommended Hardware:**
- NVIDIA RTX 6000 Ada (48 GB) - requires model splitting/offloading
- NVIDIA RTX 6000 Blackwell (96 GB) - fits entirely in VRAM; see the workflows on the GitHub page
- Multi-GPU setups with 80 GB+ combined VRAM
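The numbers above follow from simple arithmetic. A quick sanity check, assuming a total parameter count on the order of 80B (an assumption for illustration; check the original model card for the exact figure):

```python
# Back-of-the-envelope VRAM estimate for INT8 weights.
# The ~80B parameter count is an assumption, not a figure from this card.
params = 80e9            # assumed total parameter count
bytes_per_weight = 1     # INT8 stores one byte per weight
weights_gb = params * bytes_per_weight / 1e9

inference_overhead_gb = (12, 20)  # additional inference range from the table
total_gb = (weights_gb + inference_overhead_gb[0],
            weights_gb + inference_overhead_gb[1])

print(f"weights: ~{weights_gb:.0f} GB")                      # weights: ~80 GB
print(f"total:   ~{total_gb[0]:.0f}-{total_gb[1]:.0f} GB")   # total: ~92-100 GB
```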
## Usage

### ComfyUI (Recommended)

This model is designed to work with the [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) custom nodes:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/EricRollei/Comfy_HunyuanImage3
```
Install the nodes and download this model to your ComfyUI models directory. The nodes handle INT8 loading automatically.
### Direct Usage

```python
# INT8 weights can be loaded with standard torch quantization utilities
# See the ComfyUI nodes for a reference implementation
```
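To illustrate what loading per-channel INT8 weights involves, the sketch below dequantizes an INT8 tensor back to BF16. The function name and tensor layout are assumptions for illustration; the actual checkpoint format is defined by the ComfyUI nodes.

```python
# Minimal sketch of per-channel INT8 dequantization with PyTorch.
# Tensor shapes here are toy stand-ins for one transformer linear layer.
import torch

def dequantize_int8(q_weight: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover BF16 weights from INT8 values and per-output-channel scales."""
    # scale has shape [out_channels, 1], so it broadcasts over each row
    return (q_weight.to(torch.float32) * scale).to(torch.bfloat16)

q = torch.randint(-128, 127, (4, 8), dtype=torch.int8)
s = torch.rand(4, 1) * 0.01
w = dequantize_int8(q, s)
print(w.dtype, w.shape)   # torch.bfloat16 torch.Size([4, 8])
```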
## Performance
- Generation Time: ~3.5 minutes for base resolution (1024x1024)
- Weight Loading: ~60 seconds (one-time per session)
- Quality: Excellent - minimal degradation from BF16
- Speed: Faster inference than BF16 due to reduced memory bandwidth
## Quantization Details
- Method: INT8 per-channel quantization
- Target: Hunyuan Image 3.0 transformer backbone
- Precision Loss: Minimal - image quality remains high
- Trade-off: Middle ground between NF4 (lower quality) and BF16 (highest VRAM)
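Per-channel quantization keeps one scale per output channel rather than one per tensor, which is what limits the precision loss. A minimal sketch of the symmetric variant (illustrative only; the actual quantization script used for this model is not published here):

```python
# Symmetric per-channel INT8 quantization: one scale per output row.
# This is a generic sketch of the technique, not this model's exact script.
import torch

def quantize_per_channel_int8(w: torch.Tensor):
    scale = w.abs().amax(dim=1, keepdim=True) / 127.0
    scale = scale.clamp_min(1e-8)  # guard against all-zero rows
    q = torch.round(w / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

w = torch.randn(4, 8)                    # stand-in for one linear layer
q, scale = quantize_per_channel_int8(w)
w_hat = q.to(torch.float32) * scale      # dequantize for comparison
max_err = (w - w_hat).abs().max().item()
print(f"max round-trip error: {max_err:.5f}")
```

The round-trip error is bounded by half a quantization step per channel, which is why image quality stays close to BF16.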
## Original Model

This is a quantized derivative of Tencent's HunyuanImage-3.0.

**Original Model Details:**
- Architecture: Diffusion Transformer
- Resolution: Up to 2048x2048
- Language Support: English and Chinese prompts
- License: Tencent Hunyuan Community License
Please review the original model card and license for full details on capabilities and restrictions.
## Limitations

- Requires a high-end professional GPU (80 GB+ VRAM)
- Not suitable for consumer GPUs (e.g., RTX 4090/5090) without further optimization
- INT8 quantization may introduce minor quality differences in edge cases
- Loading adds roughly one minute of overhead before the first generation
## Credits

- **Original Model:** Tencent Hunyuan Team
- **Quantization:** Eric Rollei
- **ComfyUI Integration:** [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3)
## License
This model inherits the license from the original Hunyuan Image 3.0 model:
- License: Tencent Hunyuan Community License
- Please review the original license for commercial use restrictions and requirements
## Citation

```bibtex
@misc{hunyuan-image-3-int8,
  author       = {Rollei, Eric},
  title        = {Hunyuan Image 3.0 INT8 Quantized},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/[YOUR_USERNAME]/[MODEL_NAME]}}
}
```
Original model citation:

```bibtex
@misc{tencent2024hunyuan,
  title        = {Hunyuan Image 3.0},
  author       = {Tencent Hunyuan Team},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/tencent/HunyuanImage-3.0}}
}
```