openhonor committed
Commit b24ad92 · verified · 1 Parent(s): efd99d4

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -11,7 +11,9 @@ library_name: transformers
 
 [![License](https://img.shields.io/badge/License-Apache-f5de53?&color=f5de53)](LICENSE)
 
-PCMind-2.1-Kaiyuan-2B is a fully-open model.
+PCMind-2.1-Kaiyuan-2B is a cutting-edge, **fully open-source language model** trained using Ascend 910A clusters.
+With 1.4B non-embedding parameters and training on 2.2 trillion tokens,
+it achieves performance competitive with current state-of-the-art fully open models and even rivals some leading open-weight models of similar scale.
 
 ## Introduction
 
@@ -33,7 +35,7 @@ achieved through several key innovations:
 Spark-based framework optimized with [Chukonu](https://pacman.cs.tsinghua.edu.cn/~cwg/publication/chukonu-2021/),
 delivering exceptional efficiency for large-scale deduplication and sorting tasks.
 
-5. **Architecture for Training Stability:** Optimized for training on 910A clusters (FP16 precision, similar to V100),
+5. **Architecture for Training Stability:** Optimized for training on Ascend 910A clusters (FP16 precision, similar to V100),
 the Kaiyuan-2B architecture integrates QK norm, sandwich norm, and soft-capping techniques to ensure stable and robust pre-training.
 
 ## Usage
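
Two of the stability techniques the diff names, QK norm and soft-capping, can be sketched in a few lines of NumPy. This is a minimal illustration under common formulations (RMS-style normalization, tanh-based capping), not PCMind-2.1-Kaiyuan-2B's actual implementation; the cap value and shapes here are arbitrary:

```python
import numpy as np

def soft_cap(logits, cap=50.0):
    # Smoothly bound logits to (-cap, cap); large values saturate
    # instead of overflowing at FP16 precision.
    return cap * np.tanh(logits / cap)

def qk_norm(x, eps=1e-6):
    # RMS-normalize query/key vectors before the dot product so that
    # attention scores stay in a predictable range.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 64)) * 10.0  # deliberately large activations
k = rng.standard_normal((4, 64)) * 10.0
scores = soft_cap(qk_norm(q) @ qk_norm(k).T)
print(float(np.abs(scores).max()))  # stays below the cap of 50
```

Both tricks attack the same failure mode: in FP16 (no bfloat16 on 910A-class hardware), unbounded attention logits can overflow, so normalizing q/k and capping the scores keeps the forward pass numerically safe.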