Model Card: Qwen2.5-1.5B-Instruct-ADLA (INT8 Quantized)
This model is a hardware-accelerated build of the Qwen2.5-1.5B-Instruct large language model, quantized and compiled specifically for the Amlogic NPU (ADLA) using the adla-toolkit.
Model Details
- Model Name: Qwen2.5-1.5B-Instruct
- Original Source: Qwen/Qwen2.5-1.5B-Instruct
- Developer: Alibaba Qwen Team
- Parameters: 1.5B
- Format: `.adla` (Amlogic Deep Learning Accelerator binary)
- Quantization: INT8 (Weight & Activation)
Deployment & Compilation
This model is optimized for edge deployment on Amlogic SoCs (e.g., A311D2, S905X5).
| Feature | Specification |
|---|---|
| Compiler | Amlogic ADLA-toolkit |
| Quantization | INT8 |
| Input Shape | Configurable (Default based on toolkit settings) |
| Target Hardware | Amlogic NPU (ADLA) |
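The INT8 scheme named in the table can be illustrated with a minimal, toolkit-independent sketch. This is plain NumPy and an assumption about the general technique (symmetric per-tensor quantization), not the adla-toolkit's actual API or calibration pipeline:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization (illustrative only)."""
    # Scale maps the largest absolute value onto the INT8 range [-127, 127].
    scale = float(np.max(np.abs(x))) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from INT8 values and a scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# The round-trip error is bounded by the quantization step (the scale),
# which is the source of the accuracy loss noted under Limitations.
print("max abs error:", np.max(np.abs(w - w_hat)))
```

The same idea, applied to both weights and activations with toolkit-chosen scales, is what "INT8 (Weight & Activation)" refers to; the real compiler additionally handles calibration and per-layer details.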
License
The base model is licensed under Apache 2.0. You are free to use this model for commercial purposes as per the license terms.
Limitations
- Quantization Loss: INT8 quantization may cause a slight accuracy drop compared to the original FP16 model.
- Hardware Lock-in: This binary will only run on Amlogic hardware with ADLA support.
- Safety: Content filtering is recommended for production use.