Model Card: Qwen2.5-1.5B-Instruct-ADLA (INT8 Quantized)
This model is a hardware-accelerated build of the Qwen2.5-1.5B-Instruct large language model, quantized and compiled specifically for the Amlogic NPU (ADLA) using the adla-toolkit.
Model Details
- Model Name: Qwen2.5-1.5B-Instruct
- Original Source: Qwen/Qwen2.5-1.5B-Instruct
- Developer: Alibaba Qwen Team
- Parameters: 1.5B
- Format: `.adla` (Amlogic Deep Learning Accelerator binary)
- Quantization: INT8 (Weight & Activation)
Deployment & Compilation
This model is optimized for edge deployment on Amlogic SoCs (e.g., A311D2, S905X5).
| Feature | Specification |
|---|---|
| Compiler | Amlogic ADLA-toolkit |
| Quantization | INT8 |
| Input Shape | Configurable (Default based on toolkit settings) |
| Target Hardware | Amlogic NPU (ADLA) |
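The INT8 scheme named in the table can be illustrated with a minimal, toolkit-independent sketch. This is plain NumPy and an assumption about the general technique (symmetric per-tensor quantization), not the adla-toolkit's actual API or calibration pipeline:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization (illustrative only)."""
    # Scale maps the largest absolute value onto the INT8 range [-127, 127].
    scale = float(np.max(np.abs(x))) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from INT8 values and a scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# The round-trip error is bounded by the quantization step (the scale),
# which is the source of the accuracy loss noted under Limitations.
print("max abs error:", np.max(np.abs(w - w_hat)))
```

The same idea, applied to both weights and activations with toolkit-chosen scales, is what "INT8 (Weight & Activation)" refers to; the real compiler additionally handles calibration and per-layer details.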
License
The base model is licensed under Apache 2.0. You are free to use this model for commercial purposes as per the license terms.
Limitations
- Quantization Loss: INT8 quantization may cause a slight accuracy drop compared to the original FP16 model.
- Hardware Lock-in: This binary will only run on Amlogic hardware with ADLA support.
- Safety: Content filtering is recommended for production use.