jianzhnie committed on
Commit
c088958
·
1 Parent(s): dde5d8e

Update pclreasoner model

README.md CHANGED
@@ -1,3 +1,129 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ # **PCL-Reasoner-V1.5**
2
+
3
+ ## Model Overview
4
+ We release **PCL-Reasoner-V1.5**, a reasoning model built upon **PCL-Reasoner-V1** and further trained with **offline reinforcement learning** using the **vllm-ascend** and **MindSpeed-LLM** frameworks with **Ascend hardware acceleration**. It improves on PCL-Reasoner-V1 in complex mathematical reasoning with long chains of thought (CoT) and posts the highest AIME 2024 and AIME 2025 scores among the 32B-scale models in our evaluation.
5
+
6
+ PCL-Reasoner-V1.5 attains **90.9% on AIME 2024** and **85.7% on AIME 2025**, ahead of every other 32B-class model in our comparison and close to much larger systems such as DeepSeek-R1-0528 (91.4% and 87.5%). We attribute this improvement to refined data curation, improved contamination filtering, and training dynamics tuned for deep reasoning tasks.
7
+
8
+ ![Evaluation Results](images/benchmark.png)
9
+
10
+ We have fully open-sourced the **model weights**, **dataset**, and **training code** to foster transparency, reproducibility, and community innovation. See the repositories linked below to deploy, evaluate, or extend PCL-Reasoner-V1.5 in your own research!
11
+
12
+ ## Code
13
+
14
+ [GitHub Repository](https://github.com/PCL-Reasoner/V1.5)
15
+
16
+ [OpenI Project Page](https://openi.pcl.ac.cn/PCL-Reasoner/V1.5)
17
+
18
+ ## Evaluation
19
+ All results are reported using the **Avg@32 metric**: each problem is sampled 32 times independently, and the score is the average accuracy over those attempts. Averaging over many samples reduces run-to-run variance and makes the comparison more robust.
20
+
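As a rough illustration of how an Avg@k score such as Avg@32 can be computed, the sketch below averages per-problem accuracy over k sampled answers and then averages across problems. The function name `avg_at_k` and the toy correctness flags are hypothetical and are not taken from the released evaluation code; the toy data uses k = 4 only to keep the example short.

```python
def avg_at_k(correct_flags_per_problem):
    """Return Avg@k as a percentage.

    correct_flags_per_problem: one list per problem, holding k entries
    (1 if the sampled answer was correct, 0 otherwise).
    """
    # Accuracy of each problem over its k independent samples.
    per_problem = [sum(flags) / len(flags) for flags in correct_flags_per_problem]
    # Mean of the per-problem accuracies, expressed in percent.
    return 100.0 * sum(per_problem) / len(per_problem)


# Hypothetical toy data: 2 problems, k = 4 samples each (the README uses k = 32).
toy_flags = [
    [1, 1, 0, 1],  # problem 1: 3 of 4 samples correct
    [0, 0, 1, 1],  # problem 2: 2 of 4 samples correct
]
score = avg_at_k(toy_flags)
print(f"Avg@4 = {score:.1f}")
```

When every problem has the same number of samples, this equals the overall fraction of correct samples across the whole benchmark.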
21
+ <!-- Table base styling (optional) -->
22
+
23
+ <style>
24
+ table { border-collapse: collapse; width: 100%; margin-left: auto;margin-right: auto;}
25
+ th, td { border: 1px solid #ddd; padding: 8px; text-align: center; }
26
+ </style>
27
+
28
+ <!-- Table content -->
29
+
30
+ <table>
31
+ <tr>
32
+ <th>Model Scale</th>
33
+ <th>Model</th>
34
+ <th>AIME 24</th>
35
+ <th>AIME 25</th>
36
+ </tr>
37
+ <!-- Merged row header >100B -->
38
+ <tr>
39
+ <th rowspan="6">&gt;100B</th>
40
+ </tr>
41
+ <!-- >100B data rows -->
42
+ <tr>
43
+ <td>DeepSeek-R1</td>
44
+ <td><span style="color:grey">79.8</span></td>
45
+ <td><span style="color:grey">70</span></td>
46
+ </tr>
47
+ <tr>
48
+ <td>DeepSeek-R1-0528</td>
49
+ <td><span style="color:red">91.4</span></td>
50
+ <td><span style="color:red">87.5</span></td>
51
+ </tr>
52
+ <tr>
53
+ <td>Qwen3-235B-A22B</td>
54
+ <td><span style="color:grey">85.7</span></td>
55
+ <td><span style="color:grey">81.5</span></td>
56
+ </tr>
57
+ <tr>
58
+ <td>OpenAI-o3</td>
59
+ <td><span style="color:red">91.6</span></td>
60
+ <td><span style="color:red">88.9</span></td>
61
+ </tr>
62
+ <tr>
63
+ <td>Gemini-2.5-Pro-0506</td>
64
+ <td><span style="color:red">90.8</span></td>
65
+ <td><span style="color:grey">83</span></td>
66
+ </tr>
67
+ <!-- Separator row -->
68
+ <tr>
69
+ <td colspan="4"></td>
70
+ </tr>
71
+ <!-- Merged row header 32B -->
72
+ <tr>
73
+ <th rowspan="9">32B</th>
74
+ </tr>
75
+ <!-- 32B data rows -->
76
+ <tr>
77
+ <td>Qwen3-32B</td>
78
+ <td><span style="color:grey">81.4</span></td>
79
+ <td><span style="color:grey">72.9</span></td>
80
+ </tr>
81
+ <tr>
82
+ <td>QwQ-32B</td>
83
+ <td><span style="color:grey">79.5</span></td>
84
+ <td><span style="color:grey">69.5</span></td>
85
+ </tr>
86
+ <tr>
87
+ <td>DeepSeek-R1-Distill-Qwen-32B</td>
88
+ <td><span style="color:grey">72.6</span></td>
89
+ <td><span style="color:grey">49.6</span></td>
90
+ </tr>
91
+ <tr>
92
+ <td>Skywork-OR1-32B</td>
93
+ <td><span style="color:grey">82.2</span></td>
94
+ <td><span style="color:grey">73.3</span></td>
95
+ </tr>
96
+ <tr>
97
+ <td>AM-Thinking-v1</td>
98
+ <td><span style="color:grey">85.3</span></td>
99
+ <td><span style="color:grey">74.4</span></td>
100
+ </tr>
101
+ <tr>
102
+ <td>OpenReasoning-Nemotron-32B</td>
103
+ <td><span style="color:grey">89.2</span></td>
104
+ <td><span style="color:grey">84.2</span></td>
105
+ </tr>
106
+ <tr>
107
+ <td>PCL-Reasoner-V1</td>
108
+ <td><p style="color:grey;">85.7</p></td>
109
+ <td><p style="color:grey;">84.2</p></td>
110
+ </tr>
111
+ <tr>
112
+ <td>PCL-Reasoner-V1.5</td>
113
+ <td><p style="font-weight: bold;">90.9</p></td>
114
+ <td><p style="font-weight: bold;">85.7</p></td>
115
+ </tr>
116
+ </table>
117
+
118
+ > **Note**: Model outputs on AIME24/25 are included in the repository under `eval_result/` for verification and analysis.
119
+
120
+ ## Citation
121
+
122
+ ```bibtex
123
+ @article{PCL-Reasoner-v1.5,
124
+ title={PCL-Reasoner-v1.5: A Math Problem Solver with Chain of Thought Reasoning},
125
+ author={Yao Lu and Deng Dong Fan and Jianzheng Nie and others},
126
+ journal={arXiv preprint arXiv:2405.14524},
127
+ year={2026}
128
+ }
129
+ ```
images/._pcl_reasoner_v15.png ADDED
images/benchmark.png ADDED
images/pcl_reasoner_v15.png ADDED