NPU - QNN
Collection: leading models optimized for NPU deployment on Qualcomm Snapdragon • 7 items
phi-3.5-onnx-qnn is an ONNX QNN int4-quantized version of Microsoft's Phi-3.5-mini-instruct. It provides a small, fast NPU inference implementation, optimized for deployment on Windows ARM64 AI PCs with Snapdragon X Elite NPUs.
Base model
microsoft/Phi-3.5-mini-instruct
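As a minimal sketch of how such a model might be loaded on an NPU, the snippet below selects ONNX Runtime's QNN execution provider when it is available and falls back to CPU otherwise. The local model path (`phi-3.5-onnx-qnn/model.onnx`) and the `backend_path` value are assumptions for illustration; consult the model card and the ONNX Runtime QNN EP documentation for the exact files and options on your device.

```python
def pick_providers(available):
    """Prefer the QNN (NPU) execution provider, falling back to CPU.

    `available` is the list returned by ort.get_available_providers().
    """
    if "QNNExecutionProvider" in available:
        # backend_path "QnnHtp.dll" targets the Hexagon Tensor Processor
        # backend on Windows ARM64 (an assumption; verify for your setup).
        return [
            ("QNNExecutionProvider", {"backend_path": "QnnHtp.dll"}),
            "CPUExecutionProvider",
        ]
    return ["CPUExecutionProvider"]


try:
    import onnxruntime as ort
except ImportError:
    ort = None  # onnxruntime(-qnn) not installed; provider logic still runs

if ort is not None:
    providers = pick_providers(ort.get_available_providers())
    # Hypothetical local path to the downloaded int4 ONNX model.
    session = ort.InferenceSession("phi-3.5-onnx-qnn/model.onnx",
                                   providers=providers)
```

On a machine without Snapdragon NPU support, the same code simply runs the model on the CPU provider.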