Q4-Q6 chance?
#3
by
cchance27
- opened
Any chance you could upload a Q4 to Q6 version? Normally Q5-ish is the best option for low VRAM while staying close in coherence to the Q8/FP16 weights.
It doesn't have support in llama.cpp, so I'd first need to implement it. Since there is no inference support yet, I don't really see a reason to.