Model Card: Qwen2.5-1.5B-Instruct-ADLA (INT8 Quantized)

This model is a hardware-accelerated version of the Qwen2.5-1.5B-Instruct large language model. It has been quantized and compiled specifically for the Amlogic NPU (ADLA) using the adla-toolkit.

🚀 Model Details

  • Model Name: Qwen2.5-1.5B-Instruct
  • Original Source: Qwen/Qwen2.5-1.5B-Instruct
  • Developer: Alibaba Qwen Team
  • Parameters: 1.5B
  • Format: .adla (Amlogic Deep Learning Accelerator Binary)
  • Quantization: INT8 (Weight & Activation)
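"INT8 (Weight & Activation)" means both the weight tensors and the runtime activations are stored as 8-bit integers plus a floating-point scale. The sketch below illustrates the basic round-trip with symmetric per-tensor quantization in plain Python; the adla-toolkit's actual scheme (per-channel scales, zero-points, calibration) may differ, so treat this as conceptual only.

```python
# Minimal sketch of symmetric per-tensor INT8 quantization.
# NOT the adla-toolkit's exact scheme — illustration only.

def quantize_int8(values):
    """Map floats to int8 codes with a single per-tensor scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.42, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(q)        # int8 codes: [42, -127, 0, 90]
print(max_err)  # rounding error, bounded by half a quantization step
```

Values smaller than half a quantization step (here `scale / 2` = 0.005) are rounded away entirely, which is where the accuracy loss noted in the Limitations section comes from.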

🛠 Deployment & Compilation

This model is optimized for edge deployment on Amlogic SoCs (e.g., A311D2, S905X5).

| Feature | Specification |
| --- | --- |
| Compiler | Amlogic adla-toolkit |
| Quantization | INT8 |
| Input Shape | Configurable (default based on toolkit settings) |
| Target Hardware | Amlogic NPU (ADLA) |
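Conceptually, the compilation flow takes the exported source model, applies INT8 quantization (activation ranges are typically derived from calibration data), and emits a single `.adla` binary for the target SoC. The actual adla-toolkit interface is not documented here; the pseudocode below is purely illustrative, and every tool name, flag, and file name in it is an assumption, not the real CLI.

```
# Hypothetical flow — NOT the actual adla-toolkit command line.
adla_compile \
    --model  qwen2.5-1.5b-instruct/           # exported source model
    --quant  int8                             # weight & activation quantization
    --calib  calibration_samples.jsonl        # data for activation ranges
    --target a311d2                           # target Amlogic SoC
    --output qwen2.5-1.5b-instruct_i8.adla    # compiled NPU binary
```

Consult the adla-toolkit documentation shipped with your Amlogic SDK for the real invocation and supported options.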

βš–οΈ License

The base model is licensed under Apache 2.0. You are free to use this model for commercial purposes as per the license terms.

⚠️ Limitations

  • Quantization Loss: Users may experience a slight drop in accuracy compared to the FP16 model due to INT8 quantization.
  • Hardware Lock-in: This binary will only run on Amlogic hardware with ADLA support.
  • Safety: Content filtering is recommended for production use.
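The quantization loss mentioned above can be estimated offline by running the same inputs through the full-precision model and the quantized one and comparing outputs. A toolkit-independent sketch of the effect, using a single dot product with symmetric per-tensor INT8 quantization (the real evaluation would compare FP16 and `.adla` model logits; all values here are illustrative):

```python
# Sketch: relative error when both weights and activations are
# quantized to INT8 before a dot product. Illustration only —
# not the adla-toolkit's actual quantization scheme.

def scale_of(values):
    return max(abs(v) for v in values) / 127.0

def quant(values, scale):
    return [max(-128, min(127, round(v / scale))) for v in values]

weights = [0.8, -0.31, 0.05, 1.2]
acts    = [1.5, 0.2, -0.7, 0.33]

# Full-precision reference output.
fp_out = sum(w * a for w, a in zip(weights, acts))

# Quantize both operands, accumulate in integers, rescale once —
# the usual INT8 inference pattern.
w_scale, a_scale = scale_of(weights), scale_of(acts)
qw, qa = quant(weights, w_scale), quant(acts, a_scale)
int_out = sum(w * a for w, a in zip(qw, qa)) * w_scale * a_scale

rel_err = abs(int_out - fp_out) / abs(fp_out)
print(rel_err)  # small but nonzero: the "slight drop in accuracy"
```

The per-operation error is typically well under a percent, but it compounds across the many layers of a transformer, which is why end-to-end accuracy should be validated on your own task before deployment.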

🌳 Model Tree

Amlogic-NN/Qwen2.5-1.5B-Instruct_quant_i8_adla

  • Base model: Qwen/Qwen2.5-1.5B
  • Instruction-tuned: Qwen/Qwen2.5-1.5B-Instruct
  • This model: INT8 quantization of the instruction-tuned model, compiled to .adla