---
library_name: mlx
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- mlx
base_model: Qwen/Qwen3-30B-A3B-Thinking-2507
---
|
|
|
|
|
# Qwen3-30B-A3B-Thinking-2507-512k-qx6-mlx |
|
|
|
|
|
This model uses an experimental quantization combination.

Code name: Deckard

Purpose: evaluating replicants
|
|
|
|
|
## Analysis of qx6 Performance

### Best Suited Tasks for qx6

1. **OpenBookQA (0.432)**
   - Highest score among all models in this dataset
   - +0.002 improvement over bf16 (0.430)
   - Strongest performance for knowledge-based reasoning tasks
|
|
|
|
|
2. **BoolQ (0.881)**
   - Highest among all quantized models for boolean reasoning
   - 0.002 above the bf16 baseline (0.879)
   - Excellent for logical reasoning and question answering
|
|
|
|
|
3. **Arc_Challenge (0.422)**
   - Exact match with the baseline (0.422)
   - Maintains full performance on the most challenging questions
|
|
|
|
|
### Secondary Strengths

4. **PIQA (0.724)**
   - Above the bf16 baseline (0.720)
   - Strong physical-interaction reasoning
|
|
|
|
|
5. **HellaSwag (0.546)**
   - Very close to the baseline (0.550)
   - Good commonsense reasoning
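The per-task deltas quoted above can be reproduced directly from the reported scores. A minimal sketch, using the score pairs from the summary above with bf16 as the baseline:

```python
# qx6 vs. bf16 baseline scores, as listed in the benchmark summary above.
scores = {
    "OpenBookQA":    (0.432, 0.430),
    "BoolQ":         (0.881, 0.879),
    "Arc_Challenge": (0.422, 0.422),
    "PIQA":          (0.724, 0.720),
    "HellaSwag":     (0.546, 0.550),
}

for task, (qx6, bf16) in scores.items():
    # Positive delta means the quantized model beats the bf16 baseline.
    delta = qx6 - bf16
    print(f"{task:14} qx6={qx6:.3f}  bf16={bf16:.3f}  delta={delta:+.3f}")
```

Only HellaSwag shows a (small) regression; the other four tasks match or exceed the baseline.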
|
|
|
|
|
### Key Advantages

- Best overall performance in OpenBookQA (0.432)
- Perfect retention of Arc_Challenge performance
- Exceptional BoolQ scores
- Strong knowledge-reasoning capabilities
|
|
|
|
|
|
|
|
### Recommendation

qx6 is best suited for OpenBookQA and BoolQ tasks.
|
|
|
|
|
The model's top OpenBookQA score (highest among all models tested), combined with full retention of Arc_Challenge performance and a strong BoolQ score, makes it a good fit for:

- Knowledge-intensive question-answering systems
- Educational assessment applications
- Logical reasoning tasks requiring factual accuracy
- Research and academic question answering
|
|
|
|
|
The model strikes a strong balance between knowledge retention and logical processing, making it particularly valuable for applications where both factual recall and reasoning are critical.
|
|
|
|
|
|
|
|
This model [Qwen3-30B-A3B-Thinking-2507-512k-qx6-mlx](https://huggingface.co/Qwen3-30B-A3B-Thinking-2507-512k-qx6-mlx) was converted to MLX format from [Qwen/Qwen3-30B-A3B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507) using mlx-lm version **0.26.3**.
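For reference, a plain mlx-lm conversion from the base model looks like the sketch below. Note this is a standard uniform quantization: the exact mixed-precision "qx6" recipe used for this repository is not a stock `mlx_lm.convert` flag, so the quantization arguments and output path here are illustrative assumptions.

```shell
# Install mlx-lm, then quantize the base model to 6-bit MLX weights.
# The output path and bit width are illustrative, not the exact qx6 recipe.
pip install mlx-lm
mlx_lm.convert \
    --hf-path Qwen/Qwen3-30B-A3B-Thinking-2507 \
    --mlx-path Qwen3-30B-A3B-Thinking-2507-6bit \
    -q --q-bits 6
```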
|
|
|
|
|
## Use with mlx |
|
|
|
|
|
```bash
pip install mlx-lm
```
|
|
|
|
|
```python
from mlx_lm import load, generate

model, tokenizer = load("Qwen3-30B-A3B-Thinking-2507-512k-qx6-mlx")

prompt = "hello"

# Apply the model's chat template when one is available.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
|
|
|