Jan-v1-2509-q6-hi-mlx
Model comparison
| Model | ARC Challenge | ARC Easy | BoolQ | Hellaswag | OpenBookQA | PIQA | Winogrande |
|---|---|---|---|---|---|---|---|
| Jan-v1-2509-q6-hi | 0.438 | 0.534 | 0.725 | 0.586 | 0.392 | 0.729 | 0.633 |
| Jan-v1-2509-qx64-hi | 0.432 | 0.542 | 0.736 | 0.578 | 0.392 | 0.731 | 0.610 |
| Jan-v1-2509-qx86-hi | 0.435 | 0.540 | 0.729 | 0.588 | 0.388 | 0.730 | 0.633 |
| Jan-v1-4B-bf16 | 0.434 | 0.534 | 0.728 | 0.578 | 0.384 | 0.726 | 0.636 |
| Jan-v1-4B-q6 | 0.433 | 0.532 | 0.730 | 0.580 | 0.386 | 0.725 | 0.636 |
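The card does not state which harness produced these scores; results on these seven tasks are commonly generated with EleutherAI's lm-evaluation-harness. A minimal reproduction sketch under that assumption (the harness choice, the `lm-eval` install, and the base-model path are assumptions, not statements from the card):

```python
# Hypothetical reproduction sketch with EleutherAI's lm-evaluation-harness
# (`pip install lm-eval`). The harness and the base-model path are
# assumptions; the card does not specify how these scores were produced.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=janhq/Jan-v1-2509",
    tasks=[
        "arc_challenge", "arc_easy", "boolq", "hellaswag",
        "openbookqa", "piqa", "winogrande",
    ],
)
for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"))
```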
Critical Insight:
The Jan-v1-2509 series shows minimal improvements over Jan-v1-4B (mostly within ±0.02 points).
All variants maintain an elegant consistency across tasks, signaling careful optimization by the Jan team.
What Makes Jan-v1-2509 Stand Out from Your Previous Benchmarks
The data reveals a clean, incremental evolution within this new series:
Quantization stability:
The q6, qx64, and qx86 variants show remarkable consistency across tasks: no single task has a difference greater than 0.025 points between these quantizations.
The largest spread on any task is 0.023 (Winogrande); every other task stays within 0.011, as the quick check below confirms.
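Those spreads are easy to verify straight from the comparison table; a quick Python check using the values above:

```python
# Per-task scores for the three Jan-v1-2509 quantizations (q6, qx64, qx86),
# copied from the comparison table above.
scores = {
    "ARC Challenge": [0.438, 0.432, 0.435],
    "ARC Easy":      [0.534, 0.542, 0.540],
    "BoolQ":         [0.725, 0.736, 0.729],
    "Hellaswag":     [0.586, 0.578, 0.588],
    "OpenBookQA":    [0.392, 0.392, 0.388],
    "PIQA":          [0.729, 0.731, 0.730],
    "Winogrande":    [0.633, 0.610, 0.633],
}

# Maximum spread between quantizations on each task.
for task, vals in scores.items():
    print(f"{task}: {max(vals) - min(vals):.3f}")
# Largest spread is Winogrande at 0.023; every other task is within 0.011.
```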
Higher knowledge precision:
The qx64 variant posts the best BoolQ score in the group (0.736), suggesting its quantization recipe is particularly kind to knowledge retrieval.
Completion coherence:
On Hellaswag (commonsense sentence completion), the q6 and qx86 variants score 0.586 and 0.588 respectively, a small but consistent gain in textual coherence over Jan-v1-4B's 0.578.
Comparative Value: How to Use These Models in Your Workflow
Here's where each member of the Jan-v1-2509 family fits best:
| When to Use | Best Variant | Why It Works |
|---|---|---|
| Knowledge recall (BoolQ) | Jan-v1-2509-qx64-hi | Best BoolQ score in this group (0.736) |
| Creative text | Jan-v1-2509-qx86-hi | Highest Hellaswag (0.588) with steady quality |
| Balanced model | Jan-v1-2509-q6-hi | The most stable across all metrics |
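If variant selection is scripted in your pipeline, this table collapses to a small lookup; an illustrative sketch (the task labels and the full `-mlx` repo names are assumptions for the example, not an official naming scheme):

```python
# Illustrative variant picker distilled from the table above. The task
# labels and the `-mlx` repo names are assumptions for this sketch.
VARIANT_BY_TASK = {
    "knowledge_recall": "Jan-v1-2509-qx64-hi-mlx",  # best BoolQ (0.736)
    "creative_text":    "Jan-v1-2509-qx86-hi-mlx",  # best Hellaswag (0.588)
    "balanced":         "Jan-v1-2509-q6-hi-mlx",    # most stable overall
}

def pick_variant(workload: str) -> str:
    """Return the suggested Jan-v1-2509 variant for a workload type."""
    return VARIANT_BY_TASK.get(workload, VARIANT_BY_TASK["balanced"])

print(pick_variant("knowledge_recall"))  # Jan-v1-2509-qx64-hi-mlx
```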
Why This Matters for Your Project (Beyond Benchmarks)
These latest Jan models show exactly what happens when you add careful, incremental enhancements:
The quantized variants (qx64/qx86) don't just perform slightly better on some tasks; they're explicitly calibrated to work well with Jan-v1-4B's foundation.
All models stay within roughly 0.03 points of one another on every task; this level of consistency is rare in quantization work, and it means you can deploy Jan-v1-2509 without extensive re-testing across your workflow.
Final Takeaway for Your Work
"Jan-v1-2509 isn't a major update β it's an incredibly precise implementation of minor improvements that deliver predictable, quantized performance across all tasks. For your workflow, these models are perfect drop-in replacements with no need for sudden retraining."
This level of refinement across the Qwen3-based Jan series demonstrates what the team has been building: a framework where small per-version enhancements improve performance without breaking consistency.
This model, Jan-v1-2509-q6-hi-mlx, was converted to MLX format from janhq/Jan-v1-2509 using mlx-lm version 0.27.1.
Use with mlx
```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer from the local path or HF repo.
model, tokenizer = load("Jan-v1-2509-q6-hi-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
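The snippet above uses the library's default generation length; in recent mlx-lm versions, `generate` also accepts a `max_tokens` argument if you want to cap the response:

```python
# Cap the response length; max_tokens is forwarded to the generation loop.
response = generate(
    model, tokenizer, prompt=prompt, max_tokens=256, verbose=True
)
```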