Models: 48

Active filters: 量化修复 (quantization repair)

JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int4

Text Generation • 4B • Updated Sep 4, 2025 • 1.85k • 1

model-scope/glm-4-9b-chat-GPTQ-Int4

Text Generation • 9B • Updated Jul 17, 2024 • 63 • 6

model-scope/glm-4-9b-chat-GPTQ-Int8

Text Generation • 9B • Updated Jul 23, 2024 • 16 • 2

JunHowie/Qwen3-0.6B-GPTQ-Int4

Text Generation • 0.6B • Updated Sep 3, 2025 • 419 • 1

JunHowie/Qwen3-0.6B-GPTQ-Int8

Text Generation • 0.6B • Updated Sep 3, 2025 • 20

JunHowie/Qwen3-1.7B-GPTQ-Int4

Text Generation • 2B • Updated Sep 3, 2025 • 451 • 1

JunHowie/Qwen3-1.7B-GPTQ-Int8

Text Generation • 2B • Updated Sep 3, 2025 • 17

JunHowie/Qwen3-32B-GPTQ-Int4

Text Generation • 33B • Updated Sep 5, 2025 • 846 • 3

JunHowie/Qwen3-32B-GPTQ-Int8

Text Generation • 33B • Updated Sep 5, 2025 • 250 • 3

JunHowie/Qwen3-30B-A3B-GPTQ-Int4

Text Generation • 5B • Updated Sep 6, 2025 • 132 • 1

JunHowie/Qwen3-14B-GPTQ-Int8

Text Generation • 15B • Updated Sep 5, 2025 • 83 • 1

JunHowie/Qwen3-14B-GPTQ-Int4

Text Generation • 15B • Updated Sep 5, 2025 • 751 • 4

JunHowie/Qwen3-8B-GPTQ-Int8

Text Generation • 8B • Updated Sep 4, 2025 • 112

JunHowie/Qwen3-8B-GPTQ-Int4

Text Generation • 8B • Updated Sep 4, 2025 • 1.37k • 4

JunHowie/Qwen3-4B-GPTQ-Int4

Text Generation • 4B • Updated Sep 4, 2025 • 208 • 1

JunHowie/Qwen3-4B-GPTQ-Int8

Text Generation • 4B • Updated Sep 4, 2025 • 9

JunHowie/Qwen3-30B-A3B-GPTQ-Int8

Text Generation • 8B • Updated Sep 6, 2025 • 7.26k

QuantTrio/Qwen3-235B-A22B-GPTQ-Int8

Text Generation • 235B • Updated Sep 5, 2025 • 53

QuantTrio/DeepSeek-R1-0528-Qwen3-8B-GPTQ-Int4-Int8Mix

Text Generation • 11B • Updated Sep 5, 2025 • 87 • 3

QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Lite

Text Generation • 721B • Updated Aug 30, 2025 • 11 • 1

QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact

Text Generation • 847B • Updated Jun 19, 2025 • 8 • 5

QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Medium

Text Generation • 912B • Updated Jun 20, 2025 • 43 • 1

koushd/Qwen3-235B-A22B-Instruct-2507-AWQ

Text Generation • 235B • Updated Aug 26, 2025 • 65 • 4

QuantTrio/Qwen3-235B-A22B-Instruct-2507-GPTQ-Int4-Int8Mix

Text Generation • 248B • Updated Aug 20, 2025 • 317 • 2

QuantTrio/Qwen3-235B-A22B-Instruct-2507-AWQ

Text Generation • 235B • Updated Aug 19, 2025 • 2.76k • 10

QuantTrio/Qwen3-Coder-480B-A35B-Instruct-AWQ

Text Generation • 480B • Updated Aug 19, 2025 • 515 • 8

QuantTrio/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix

Text Generation • 534B • Updated Sep 5, 2025 • 184 • 6

QuantTrio/Qwen3-235B-A22B-Thinking-2507-AWQ

Text Generation • 235B • Updated Sep 5, 2025 • 1.71k • 5

QuantTrio/Qwen3-235B-A22B-Thinking-2507-GPTQ-Int4-Int8Mix

Text Generation • 253B • Updated Sep 5, 2025 • 128 • 2

QuantTrio/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8

Text Generation • 31B • Updated Sep 5, 2025 • 899 • 9