Question about quanting the bf16
#3
by TPH441 - opened
If I were to download your bf16 and quantize the experts to Q4_0 and the rest to Q8_0, would that be lossless for the experts, since the original model used INT4 for them?
Hmm, hard to say - llama.cpp's Q4_0 is a different format from the original INT4, since I think Q4_0 uses float16 scales whilst the INT4 format uses bfloat16 scales, so the round-trip won't be bit-exact.
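A rough sketch of why the scale formats matter: the snippet below simulates a Q4_0-style round-trip (block of 32 weights, one float16 scale) against a hypothetical INT4 scheme with a bfloat16 scale. The exact scaling rules here (absmax-based scales, block size 32) are assumptions for illustration, not the precise llama.cpp or original-checkpoint kernels, but they show the two round-trips generally disagree.

```python
import numpy as np

def to_bf16(x):
    # Simulate bfloat16 by zeroing the low 16 bits of a float32.
    x32 = np.asarray(x, dtype=np.float32)
    bits = x32.view(np.uint32) & np.uint32(0xFFFF0000)
    return bits.view(np.float32)

def q4_0_roundtrip(block):
    # Q4_0-style: one fp16 scale per 32-weight block,
    # integers clipped to [-8, 7] (stored as 4-bit values).
    d = np.float16(block[np.argmax(np.abs(block))] / -8.0)
    if d == 0:
        return np.zeros_like(block)
    q = np.clip(np.round(block / np.float32(d)), -8, 7)
    return (q * np.float32(d)).astype(np.float32)

def int4_bf16_roundtrip(block):
    # Hypothetical INT4 scheme with a bfloat16 absmax scale,
    # standing in for the original checkpoint's expert quantization.
    s = to_bf16(np.max(np.abs(block)) / 7.0)
    if s == 0:
        return np.zeros_like(block)
    q = np.clip(np.round(block / s), -8, 7)
    return (q * s).astype(np.float32)

rng = np.random.default_rng(0)
block = rng.standard_normal(32).astype(np.float32)

a = q4_0_roundtrip(block)
b = int4_bf16_roundtrip(block)
# fp16 has 10 mantissa bits, bf16 only 7, and the grids differ,
# so re-quantizing INT4 experts to Q4_0 is not a lossless round-trip.
print(np.max(np.abs(a - b)))
```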