A collection of my FP8 quants for models that are missing them.
Markus PRO
AI & ML interests
NLP
Recent Activity
liked a model about 6 hours ago
Qwen/Qwen3-Reranker-0.6B
replied to danielhanchen's post about 12 hours ago
You can now run Kimi K2.5 locally! 🔥
We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit.
Get >40 tok/s on 242GB or 622GB VRAM/RAM for near full precision.
GGUF: https://huggingface.co/unsloth/Kimi-K2.5-GGUF
Guide: https://unsloth.ai/docs/models/kimi-k2.5
liked a model about 14 hours ago
nvidia/Nemotron-Orchestrator-8B