Post: You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs: unsloth/Kimi-K2-Thinking-GGUF
We shrank the 1T model to 245GB (-62%) and retained ~85% of accuracy on Aider Polyglot. Run on >247GB RAM for fast inference.
We also collaborated with the Moonshot AI Kimi team on a system prompt fix!
Guide + fix details: https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally
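The ">247GB RAM" figure above follows the usual rule of thumb: the quantized weights must fit in memory with a little headroom for runtime buffers. A minimal sketch of that check, where the 2GB overhead figure and the helper names are assumptions for illustration, not Unsloth-published numbers:

```python
# Rough feasibility check for running a GGUF fully in RAM, following the
# post's guidance that the 245GB Kimi K2 Thinking quant wants >247GB total.
# The overhead_gb default (KV cache, compute buffers) is an assumption.

def min_ram_gb(quant_size_gb: float, overhead_gb: float = 2.0) -> float:
    """Smallest RAM (in GB) holding the model weights plus runtime overhead."""
    return quant_size_gb + overhead_gb

def fits_in_ram(quant_size_gb: float, available_ram_gb: float,
                overhead_gb: float = 2.0) -> bool:
    """True if the quantized model plus overhead fits in available RAM."""
    return available_ram_gb >= min_ram_gb(quant_size_gb, overhead_gb)

print(min_ram_gb(245))        # 247.0 -> matches the ">247GB RAM" guidance
print(fits_in_ram(245, 256))  # True: a 256GB workstation qualifies
print(fits_in_ram(245, 192))  # False: a 192GB machine needs offloading
```

A machine below the threshold can still run the model by offloading layers to disk or GPU, just more slowly than fully in-RAM inference.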
Model: unsloth/Llama-3.2-3B-Instruct-GGUF (Text Generation, 3B)
Collection: Unsloth Dynamic 2.0 Quants (54 items). New 2.0 version of our Dynamic GGUF + quants. Dynamic 2.0 achieves superior accuracy and SOTA quantization performance.
Post: Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs!
GGUFs: unsloth/DeepSeek-V3.1-GGUF
The 715GB model is reduced to 170GB (a 76% size reduction) by smartly quantizing layers. The 1-bit GGUF passes all our code tests, and we fixed the chat template for llama.cpp-supported backends.
Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
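"1-bit" dynamic quants keep sensitive layers at higher precision, so the average bits per weight implied by the file size lands well above 1. A quick worked check, where the ~671B parameter count for DeepSeek-V3.1 is an assumption (it is not stated in the post):

```python
# Average bits per weight implied by a GGUF's file size. The 671B
# parameter count for DeepSeek-V3.1 is an assumption for illustration.

def avg_bits_per_weight(file_size_gb: float, n_params_billions: float) -> float:
    bits = file_size_gb * 1e9 * 8              # file size in bits (decimal GB)
    return bits / (n_params_billions * 1e9)    # bits per parameter

print(round(avg_bits_per_weight(170, 671), 2))  # ~2.03 bits on average
print(round(avg_bits_per_weight(715, 671), 2))  # ~8.52 bits for the original
```

The ~2-bit average is consistent with a mixed-precision scheme where only a subset of layers is pushed down toward 1 bit.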
Post: Run OpenAI's new gpt-oss models locally with Unsloth GGUFs!
20b GGUF: unsloth/gpt-oss-20b-GGUF
120b GGUF: unsloth/gpt-oss-120b-GGUF
The 20b model runs on 14GB RAM, and the 120b on 66GB.
Post: It's Qwen3 week! We uploaded Dynamic 2-bit GGUFs for:
Qwen3-Coder: unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF
Qwen3-2507: unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF
So you can run them both locally! Guides are in the model cards.
Post: Made some 245GB (80% size reduction) 1.8-bit quants for Kimi K2! unsloth/Kimi-K2-Instruct-GGUF
Post: We fixed more issues! Use --jinja for all!
* Fixed Nanonets OCR-s: unsloth/Nanonets-OCR-s-GGUF
* Fixed THUDM GLM-4: unsloth/GLM-4-32B-0414-GGUF
* DeepSeek Chimera v2 is uploading: unsloth/DeepSeek-TNG-R1T2-Chimera-GGUF
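The --jinja flag tells llama.cpp's llama-cli to apply the chat template embedded in the GGUF, which is what the fixes above target. A minimal sketch of building that invocation; the GGUF file name below is a placeholder, not a real upload:

```python
# Build a llama.cpp llama-cli command line with --jinja enabled, so the
# model's embedded Jinja chat template is applied. The .gguf file name
# here is a placeholder for illustration.
import shlex

def llama_cli_cmd(gguf_path: str, prompt: str, use_jinja: bool = True) -> list:
    cmd = ["llama-cli", "-m", gguf_path, "-p", prompt]
    if use_jinja:
        cmd.append("--jinja")  # use the GGUF's embedded chat template
    return cmd

cmd = llama_cli_cmd("Nanonets-OCR-s.Q4_K_M.gguf", "Hello")
print(shlex.join(cmd))
# llama-cli -m Nanonets-OCR-s.Q4_K_M.gguf -p Hello --jinja
```

Without --jinja, llama.cpp may fall back to a generic template and the model's responses can degrade, which is why the post recommends it across the board.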
Post: Gemma 3n finetuning is now 1.5x faster and uses 50% less VRAM in Unsloth!
Click "Use this model", then "Google Colab":
unsloth/gemma-3n-E4B-it
unsloth/gemma-3n-E2B-it
https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb
Post: We updated lots of our GGUFs and uploaded many new ones!
* unsloth/dots.llm1.inst-GGUF
* unsloth/Jan-nano-GGUF
* unsloth/Nanonets-OCR-s-GGUF
* Updated and fixed the Q8_0 upload for unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
* Added Q2_K_XL for unsloth/DeepSeek-R1-0528-GGUF
* Updated and fixed vision support for unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
Post: Mistral releases Magistral, their new reasoning models!
GGUFs to run: unsloth/Magistral-Small-2506-GGUF
Magistral-Small-2506 excels at mathematics and coding. You can run the 24B model locally with just 32GB RAM by using our Dynamic GGUFs.
Post: New DeepSeek-R1-0528 1.65-bit Dynamic GGUF!
Run the model locally even more easily! It will fit on a 192GB MacBook and run at 7 tokens/s.
DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-GGUF
Qwen3-8B DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
Read our guide: https://docs.unsloth.ai/basics/deepseek-r1-0528
Post: Qwen3 128K Context Length: We've released Dynamic 2.0 GGUFs + 4-bit safetensors!
Fixed: now works on any inference engine, and we fixed issues with the chat template.
Qwen3 GGUFs:
30B-A3B: unsloth/Qwen3-30B-A3B-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-GGUF
32B: unsloth/Qwen3-32B-GGUF
Read our guide on running Qwen3 here: https://docs.unsloth.ai/basics/qwen3-how-to-run-and-finetune
128K Context Length:
30B-A3B: unsloth/Qwen3-30B-A3B-128K-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-128K-GGUF
32B: unsloth/Qwen3-32B-128K-GGUF
All Qwen3 uploads: unsloth/qwen3-680edabfb790c8c34a242f95
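Long contexts cost memory beyond the weights: the KV cache grows linearly with context length. A back-of-the-envelope estimator for the 128K variants above; the Qwen3-32B architecture numbers used in the example (64 layers, 8 KV heads via GQA, head dimension 128) are assumptions, not figures from the post:

```python
# Estimate KV-cache memory on top of model weights at a given context
# length. Per token: 2 (K and V) * layers * kv_heads * head_dim * bytes.
# The Qwen3-32B architecture numbers below are assumptions.

def kv_cache_gb(n_tokens: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_value: int = 2) -> float:
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return n_tokens * per_token / 1e9

# Full 128K context with an fp16 KV cache:
print(round(kv_cache_gb(131072, 64, 8, 128), 1))  # ~34.4 GB
```

This is why a model whose weights fit comfortably in RAM can still run out of memory at the full 128K window; quantizing the KV cache (bytes_per_value=1) roughly halves the figure.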