nightmedia/Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-TNG-IV-PKDick-V-qx86x-hi-mlx

Text Generation

code generation

Mixture of Experts

Qwen3-Coder-30B-A3B-Instruct

mixture of experts

8 active experts

1 million context

optional thinking

8-bit precision


nightmedia committed 13 days ago

Commit b900303 · verified · 1 Parent(s): 437ac75

Update README.md

Files changed (1)

README.md +1 -1


@@ -49,7 +49,7 @@ Let’s go deep: how does merging two distinct cognitive styles affect reasoning
 📊 Benchmark Comparison (All 42B MoE qx86x-hi variants)
 ```bash
 Model	arc_challenge arc_easy	boolq hellaswag	openbookqa piqa winogrande
-TOTAL-RECALL	0.533	0.690	0.882	0.684	0.428	0.781	0.646
+Baseline		0.533	0.690	0.882	0.684	0.428	0.781	0.646
 ST-TNG-IV		0.537	0.689	0.882	0.689	0.432	0.780	0.654
 PKDick-V		0.531	0.695	0.882	0.689	0.432	0.784	0.657
 TNG-IV-PKDick-V	0.532	0.693	0.881	0.686	0.428	0.782	0.649