OGA_DML_8_6_2025 • Collection • Models are quantized using quark-0.9, transformers-4.50.0, OGA-0.7.1, and ORT-1.21.1, followed by OGA-DML export. • 10 items • Updated Aug 21 • 1
OGA_CPU_8_8_2025 • Collection • Models are quantized using quark-0.9, transformers-4.50.0, OGA-0.7.1, and ORT-1.21.1, followed by OGA-CPU export. • 10 items • Updated Aug 21 • 1
RyzenAI-1.3_LLM_NPU_Models • Collection • Models quantized by Quark and prepared for the OGA-based NPU-only execution flow (Ryzen AI 1.3). • 14 items • Updated Jun 16 • 5
Unsloth Dynamic 2.0 Quants • Collection • New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 54 items • Updated 8 days ago • 246
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters • Paper • 2504.08791 • Published Apr 7 • 137