Commit 5af9ddb · verified · committed by harryleafchen · 1 Parent(s): 20483c1

Add arxiv citation in README

Files changed (1): README.md (+16, −1)
README.md CHANGED

@@ -12,6 +12,7 @@ datasets:
 # PCMind-2.1-Kaiyuan-2B (脑海-2.1-开元-2B)
 
 [![License](https://img.shields.io/badge/License-Apache-f5de53?&color=f5de53)](LICENSE)
+[![arXiv-2512.07612](https://img.shields.io/badge/arXiv-2512.07612-b31b1b.svg?style=flat)](https://arxiv.org/abs/2512.07612)
 
 PCMind-2.1-Kaiyuan-2B is a cutting-edge, **fully open-source language model** (i.e., open dataset) trained on an Ascend 910A cluster.
 With 1.4B non-embedding parameters and training on 2.2 trillion tokens,
@@ -23,6 +24,8 @@ it achieves performance competitive with current state-of-the-art fully open models
 
 The dataset used to train Kaiyuan-2B is published at <https://huggingface.co/datasets/thu-pacman/PCMind-2.1-Kaiyuan-2B>.
 
+The _PCMind-2.1-Kaiyuan-2B Technical Report_ is published at <https://arxiv.org/abs/2512.07612>.
+
 ## Introduction
 
 Our data preprocessing and pre-training pipeline is designed for enhanced training efficiency and model quality,
@@ -59,7 +62,19 @@ or to fine-tune the model for specific downstream applications.*
 
 ## Citation
 
-Our technical report is coming soon!
+Please cite [our technical report](https://arxiv.org/abs/2512.07612) if you use our model, dataset, or code.
+
+```bib
+@misc{luo2025pcmind21kaiyuan2btechnicalreport,
+  title={PCMind-2.1-Kaiyuan-2B Technical Report},
+  author={Kairong Luo and Zhenbo Sun and Xinyu Shi and Shengqi Chen and Bowen Yu and Yunyi Chen and Chenyi Dang and Hengtao Tao and Hui Wang and Fangming Liu and Kaifeng Lyu and Wenguang Chen},
+  year={2025},
+  eprint={2512.07612},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2512.07612},
+}
+```
 
 ## License
 
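As a usage note to accompany the new citation, here is a minimal sketch of loading the checkpoint and streaming its training corpus. It rests on assumptions this commit does not state: that the model is hosted at `thu-pacman/PCMind-2.1-Kaiyuan-2B` (mirroring the dataset namespace linked in the README), that it loads through the standard `transformers` Auto classes, and that the dataset exposes a `train` split.

```python
# Minimal sketch, not part of this commit; assumptions are marked inline.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "thu-pacman/PCMind-2.1-Kaiyuan-2B"  # assumed model repo id (mirrors the dataset namespace)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Sanity-check the checkpoint with a short continuation.
inputs = tokenizer("PCMind-2.1-Kaiyuan-2B is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# The 2.2T-token corpus is far too large to download eagerly, so stream it.
corpus = load_dataset(
    "thu-pacman/PCMind-2.1-Kaiyuan-2B",  # dataset repo linked in the README
    split="train",                        # assumed split name
    streaming=True,
)
print(next(iter(corpus)))  # inspect one record's schema
```

If the checkpoint ships custom modeling code, both `from_pretrained` calls may additionally need `trust_remote_code=True`.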