z-lab's Collections

ParoQuant

Pairwise Rotation Quantization for Efficient Reasoning LLM Inference