NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models
Paper • 2602.06694 • Published • 12
RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs