---
license: apache-2.0
tags:
- llmcompressor
- qwen
- quantized
- aworld
- moe
- qwen3
- 8-bit
- 8bit
- int8
- gptq
- awq
- fp8
- a8w8
model-index:
- name: Qwen3-32B-AWorld-W8A16
  results: []
language:
- en
- zh
base_model:
- inclusionAI/Qwen3-32B-AWorld
---

# Qwen3-32B-AWorld-W8A16

This is a W8A16 (8-bit weight, 16-bit activation) quantized version of the [inclusionAI/Qwen3-32B-AWorld](https://huggingface.co/inclusionAI/Qwen3-32B-AWorld) model, created using [LLM Compressor](https://github.com/vllm/llm-compressor).

## Model Details

- **Model Type**: Mixture-of-Experts (MoE) Language Model
- **Base Model**: [inclusionAI/Qwen3-32B-AWorld](https://huggingface.co/inclusionAI/Qwen3-32B-AWorld)
- **Quantization Method**: GPTQ with SmoothQuant preprocessing
- **Weight Precision**: 8-bit
- **Activation Precision**: 16-bit
- **Compression Ratio**: ~2x model size reduction

## Quantization Process

This model was quantized using the LLM Compressor library with the following key parameters:

- Algorithm: GPTQ with SmoothQuant preprocessing
- Protection: MoE gate layers kept at full precision
- Calibration: 512 samples from ultrachat_200k dataset
- Sequence Length: 2048 tokens

The quantization process preserves the quality of the original model while reducing its size by approximately 2x, making it more suitable for deployment on resource-constrained environments.

## Usage

The model can be loaded using the standard Hugging Face transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("groxaxo/Qwen3-32B-AWorld-W8A16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("groxaxo/Qwen3-32B-AWorld-W8A16")
```

## Original Model

This quantized model is derived from [inclusionAI/Qwen3-32B-AWorld](https://huggingface.co/inclusionAI/Qwen3-32B-AWorld). Please refer to the original model card for detailed information about the base model's capabilities, training process, and intended use.

## License

This model is licensed under the Apache 2.0 license, inheriting the license from the original model.