- #3 "How about int8 quantization?" — opened 6 months ago by traphix
- #2 "INT 8" — opened 6 months ago by freegheist
- #1 "Slow inference on vLLM" (3 replies) — opened 7 months ago by hp1337