Models: 4,824

Active filters: quantized

Luckybalabala/AutoGLM-Phone-9B-GGUF

Image-Text-to-Text • 9B • Updated Dec 20, 2025 • 3.09k • 7

cybermotaz/nemotron3-nano-nvfp4-w4a16

Text Generation • 18B • Updated Dec 18, 2025 • 26k • 11

unsloth/Qwen-Image-GGUF

Text-to-Image • 20B • Updated Dec 22, 2025 • 2.84k • 21

Wwayu/GLM-4.7-PRISM-mlx-2Bit

Text Generation • 353B • Updated 29 days ago • 2.42k • 3

marksverdhai/vibevoice-7b-bnb-4bit

Text-to-Speech • 10B • Updated 22 days ago • 367 • 2

unsloth/Qwen-Image-2512-FP8

Text-to-Image • Updated 21 days ago • 27

GadflyII/MiniMax-M2.1-NVFP4

Text Generation • Updated 20 days ago • 192 • 3

animaslabs/parakeet-tdt-0.6b-v3-mlx-8bit

Automatic Speech Recognition • Updated 11 days ago • 82 • 2

xhxlb/IQuest-Coder-V1-40B-Instruct-int4

Text Generation • 6B • Updated 14 days ago • 2

thebajajra/RexReranker-0.6B-FP8

Text Ranking • 0.8B • Updated 12 days ago • 49 • 1

thebajajra/RexReranker-0.6B-MXFP4

Text Ranking • 0.5B • Updated 10 days ago • 13 • 1

GadflyII/GLM-4.6V-NVFP4

Image-Text-to-Text • 62B • Updated 10 days ago • 3.95k • 2

ipsilondev/chatterbox-multilingual-ONNX-q4

Text-to-Speech • Updated 10 days ago • 1

steampunque/Deepseek-R1-Distill-Qwen-14B-Hybrid-GGUF

15B • Updated 9 days ago • 269 • 1

dipsht9999/GLM-4.7-PRISM-mlx-4Bit

Text Generation • 353B • Updated 9 days ago • 144 • 1

Wwayu/GLM-4.7-PRISM-mlx-4Bit

Text Generation • 353B • Updated 8 days ago • 304 • 1

lxcorp/Link-270M-GGUF

Text Generation • 0.3B • Updated 3 days ago • 72 • 1

solarkyle/GLM-4.7-Flash-GGUF

Text Generation • 30B • Updated 3 days ago • 251 • 1

Hadidiz9/UI-S1-7B-Hybrid-W4-Quanto

Image-Text-to-Text • Updated 3 days ago • 15 • 1

Firworks/mox-tiny-1-nvfp4

5B • Updated 2 days ago • 20 • 1

marksverdhei/Qwen3-Omni-30B-A3B-FP8

Any-to-Any • 35B • Updated 2 days ago • 144 • 1

ruv/ruvltra-claude-code

Text Generation • 0.5B • Updated 2 days ago • 43 • 1

Lewdiculous/InfinityRP-v1-7B-GGUF-IQ-Imatrix

7B • Updated May 4, 2024 • 139 • 44

PP12546/Heartsync_NSFW-Uncensored-BF16

Text-to-Image • Updated Sep 10, 2025 • 4

ravenscroftj/CodeGen-350M-multi-ggml-quant

Text Generation • Updated Apr 24, 2023 • 2

ravenscroftj/CodeGen-2B-multi-ggml-quant

Text Generation • Updated Aug 5, 2023 • 2

ravenscroftj/CodeGen-6B-multi-ggml-quant

Text Generation • Updated Apr 24, 2023 • 9

ethzanalytics/dolly-v2-12b-sharded-8bit

Text Generation • Updated Apr 29, 2023 • 4 • 4

ethzanalytics/dolly-v2-7b-sharded-8bit

Text Generation • Updated Jun 28, 2023 • 2 • 1

pszemraj/long-t5-tglobal-xl-16384-book-summary-8bit

Summarization • 3B • Updated 25 days ago • 6