Update README.md
README.md
CHANGED
@@ -49,7 +49,7 @@ Let’s go deep: how does merging two distinct cognitive styles affect reasoning
 📊 Benchmark Comparison (All 42B MoE qx86x-hi variants)
 ```bash
 Model arc_challenge arc_easy boolq hellaswag openbookqa piqa winogrande
-
+Baseline 0.533 0.690 0.882 0.684 0.428 0.781 0.646
 ST-TNG-IV 0.537 0.689 0.882 0.689 0.432 0.780 0.654
 PKDick-V 0.531 0.695 0.882 0.689 0.432 0.784 0.657
 TNG-IV-PKDick-V 0.532 0.693 0.881 0.686 0.428 0.782 0.649