---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model:
- PowerInfer/SmallThinker-21BA3B-Instruct
---

## SmallThinker-21BA3B-Instruct-GGUF

- GGUF models with the `.gguf` suffix can be used with the [*llama.cpp* framework](https://github.com/ggml-org/llama.cpp).
- GGUF models with the `.powerinfer.gguf` suffix are integrated with fused sparse FFN operators and sparse LM head operators. These models are only compatible with the [*PowerInfer* framework](https://github.com/SJTU-IPADS/PowerInfer/tree/main/smallthinker).
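The suffix convention above can be sketched as a small helper (hypothetical, for illustration only; the file names are made up, and the more specific `.powerinfer.gguf` suffix must be checked first because it also ends in `.gguf`):

```python
from pathlib import Path

def pick_runtime(model_path: str) -> str:
    """Route a GGUF file to its target runtime: .powerinfer.gguf files need
    the PowerInfer framework, plain .gguf files run under llama.cpp."""
    name = Path(model_path).name
    if name.endswith(".powerinfer.gguf"):  # check the more specific suffix first
        return "PowerInfer"
    if name.endswith(".gguf"):
        return "llama.cpp"
    raise ValueError(f"not a GGUF file: {model_path}")

# Hypothetical file names, for illustration.
print(pick_runtime("SmallThinker-21BA3B-Instruct.powerinfer.gguf"))  # PowerInfer
print(pick_runtime("SmallThinker-21BA3B-Instruct.Q4_K_M.gguf"))      # llama.cpp
```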
|
## Introduction

<p align="center">
  🤗 <a href="https://huggingface.co/PowerInfer">Hugging Face</a>&nbsp;&nbsp;|&nbsp;&nbsp;🤖 <a href="https://modelscope.cn/organization/PowerInfer">ModelScope</a>&nbsp;&nbsp;|&nbsp;&nbsp;📑 <a href="https://github.com/SJTU-IPADS/SmallThinker/blob/main/smallthinker-technical-report.pdf">Technical Report</a>
</p>

SmallThinker is a family of **on-device native** Mixture-of-Experts (MoE) language models designed specifically for local deployment, co-developed by **IPADS and the School of AI at Shanghai Jiao Tong University** and **Zenergize AI**. Built from the ground up for resource-constrained environments, SmallThinker brings powerful, private, and low-latency AI directly to your personal devices, without relying on the cloud.

## Performance

Note: the model is trained mainly on English.

| Model | MMLU | GPQA-Diamond | MATH-500 | IFEval | LiveBench | HumanEval | Average |
|------------------------------|-------|--------------|----------|--------|-----------|-----------|---------|
| **SmallThinker-21BA3B-Instruct** | 84.43 | <u>55.05</u> | 82.4 | **85.77** | **60.3** | <u>89.63</u> | **76.26** |
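As a quick sanity check, the Average column above is the arithmetic mean of the six benchmark scores:

```python
# Benchmark scores for SmallThinker-21BA3B-Instruct from the table above.
scores = {
    "MMLU": 84.43, "GPQA-Diamond": 55.05, "MATH-500": 82.4,
    "IFEval": 85.77, "LiveBench": 60.3, "HumanEval": 89.63,
}

# Arithmetic mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 76.26
```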
You can deploy SmallThinker with offloading support using [PowerInfer](https://github.com/SJTU-IPADS/PowerInfer/tree/main/smallthinker).

### Transformers

`transformers==4.53.3` is required; we are actively working to support the latest version.
The following code snippet illustrates how to use the model to generate content from given inputs (a minimal sketch; generation settings can be adjusted as needed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PowerInfer/SmallThinker-21BA3B-Instruct"

# Load the tokenizer and model; dtype and device placement are chosen automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Build a chat-formatted prompt and generate a response.
messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
## Statement

- Due to the constraints of its model size and the limitations of its training data, its responses may contain factual inaccuracies, biases, or outdated information.
- Users bear full responsibility for independently evaluating and verifying the accuracy and appropriateness of all generated content.
- SmallThinker does not possess genuine comprehension or consciousness and cannot express personal opinions or value judgments.