[](LICENSE)
PCMind-2.1-Kaiyuan-2B is a cutting-edge, **fully open-source language model** (i.e., its training dataset is open) trained on an Ascend 910A cluster.
With 1.4B non-embedding parameters and training on 2.2 trillion tokens,
it achieves performance competitive with current state-of-the-art fully open models and even rivals some leading open-weight models of similar scale.

<center>

</center>
We will publish the datasets used to train Kaiyuan-2B soon.
## Introduction
Our data preprocessing and pre-training pipeline is designed for enhanced training efficiency and model quality,
achieved through several key innovations:
The model architecture is similar to [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B),
and can be easily loaded by libraries like `transformers`.
Please use [`demo.py`](demo.py) as an example.
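`demo.py` in this repository is the authoritative usage example. As a rough sketch only, loading a Qwen3-style checkpoint with the standard `transformers` auto classes typically looks like the following; the Hub repo id used here is an assumption, not confirmed by this model card:

```python
# Sketch: loading a Qwen3-style base model with the transformers auto classes.
# The repo id below is a hypothetical placeholder; defer to demo.py for the
# official example.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "PCMind/PCMind-2.1-Kaiyuan-2B"  # assumed Hub id


def generate_completion(prompt: str, model_id: str = MODEL_ID,
                        max_new_tokens: int = 64) -> str:
    """Load the checkpoint and continue the prompt as plain text."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # This is a base model: feed raw text, no chat template.
    print(generate_completion("Open-source language models are"))
```

Because this is a pretrained base model, prompts should be plain-text continuations rather than chat-formatted messages.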
*Note: This is a pretrained base model only and has not undergone fine-tuning,
reinforcement learning (RL), or any other post-training procedures.*