jianzhnie committed on
Commit ·
c088958
1
Parent(s): dde5d8e
Update pclreasoner model
- README.md +129 -3
- images/._pcl_reasoner_v15.png +0 -0
- images/benchmark.png +0 -0
- images/pcl_reasoner_v15.png +0 -0
README.md
CHANGED
|
@@ -1,3 +1,129 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 1 |
+
# **PCL-Reasoner-V1.5**
|
| 2 |
+
|
| 3 |
+
## Model Overview
|
| 4 |
+
We release **PCL-Reasoner-V1.5**, a next-generation reasoning model built upon **PCL-Reasoner-V1** and further trained with an **offline reinforcement learning** method, using the **vllm-ascend** and **MindSpeed-LLM** frameworks with **Ascend hardware acceleration**. Building on the strong foundation of PCL-Reasoner-V1, PCL-Reasoner-V1.5 further improves complex mathematical reasoning with long chains of thought (CoT) and achieves state-of-the-art performance among 32B-scale models.
|
| 5 |
+
|
| 6 |
+
PCL-Reasoner-V1.5 attains **90.9% on AIME 2024** and **85.7% on AIME 2025**, outperforming the other 32B-class models listed in the Evaluation table and narrowing the gap with much larger systems. This advancement stems from refined data curation, improved contamination filtering, and optimized training dynamics tailored for deep reasoning tasks.
|
| 7 |
+
|
| 8 |
+

|
| 9 |
+
|
| 10 |
+
We have fully open-sourced the **model weights**, **dataset**, and **training code** to foster transparency, reproducibility, and community innovation. Follow the tutorial below to deploy, evaluate, or extend PCL-Reasoner-V1.5 in your own research!
|
| 11 |
+
|
| 12 |
+
## Code
|
| 13 |
+
|
| 14 |
+
[GitHub Repository](https://github.com/PCL-Reasoner/V1.5)
|
| 15 |
+
|
| 16 |
+
[OpenI Project Page](https://openi.pcl.ac.cn/PCL-Reasoner/V1.5)
|
| 17 |
+
|
| 18 |
+
## Evaluation
|
| 19 |
+
All results are reported using the **Avg@32 metric** (average accuracy over 32 independent sampling attempts per problem), ensuring robust and fair comparison.
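As a minimal sketch of how an Avg@k score can be computed from per-sample correctness flags (the function name `avg_at_k` and the input layout are illustrative assumptions, not the repository's evaluation code):

```python
from typing import Sequence


def avg_at_k(correct: Sequence[Sequence[bool]]) -> float:
    """Return Avg@k in percent.

    `correct[i][j]` is True when the j-th sampled answer to problem i is right.
    Each problem's accuracy is averaged over its samples, then the
    per-problem accuracies are averaged over all problems.
    (Hypothetical helper for illustration only.)
    """
    if not correct:
        raise ValueError("need at least one problem")
    per_problem = []
    for samples in correct:
        if not samples:
            raise ValueError("every problem needs at least one sample")
        # Booleans sum as 0/1, so this is the fraction of correct samples.
        per_problem.append(sum(samples) / len(samples))
    return 100.0 * sum(per_problem) / len(per_problem)


# Toy example with k = 2 samples for each of two problems.
score = avg_at_k([[True, False], [True, True]])
print(f"Avg@2 = {score:.1f}%")
```

For Avg@32, each inner sequence would hold 32 flags, one per independent sample of that problem.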
|
| 20 |
+
|
| 21 |
+
<!-- Table base styling (optional) -->
|
| 22 |
+
|
| 23 |
+
<style>
|
| 24 |
+
table { border-collapse: collapse; width: 100%; margin-left: auto;margin-right: auto;}
|
| 25 |
+
th, td { border: 1px solid #ddd; padding: 8px; text-align: center; }
|
| 26 |
+
</style>
|
| 27 |
+
|
| 28 |
+
<!-- Table content -->
|
| 29 |
+
|
| 30 |
+
<table>
|
| 31 |
+
<tr>
|
| 32 |
+
<th>Model Scale</th>
|
| 33 |
+
<th>Model</th>
|
| 34 |
+
<th>AIME 24</th>
|
| 35 |
+
<th>AIME 25</th>
|
| 36 |
+
</tr>
|
| 37 |
+
<!-- Merged row header >100B -->
|
| 38 |
+
<tr>
|
| 39 |
+
<th rowspan="6">&gt;100B</th>
|
| 40 |
+
</tr>
|
| 41 |
+
<!-- >100B data rows -->
|
| 42 |
+
<tr>
|
| 43 |
+
<td>DeepSeek-R1</td>
|
| 44 |
+
<td><span style="color:grey">79.8</span></td>
|
| 45 |
+
<td><span style="color:grey">70</span></td>
|
| 46 |
+
</tr>
|
| 47 |
+
<tr>
|
| 48 |
+
<td>DeepSeek-R1-0528</td>
|
| 49 |
+
<td><span style="color:red">91.4</span></td>
|
| 50 |
+
<td><span style="color:red">87.5</span></td>
|
| 51 |
+
</tr>
|
| 52 |
+
<tr>
|
| 53 |
+
<td>Qwen3-235B-A22B</td>
|
| 54 |
+
<td><span style="color:grey">85.7</span></td>
|
| 55 |
+
<td><span style="color:grey">81.5</span></td>
|
| 56 |
+
</tr>
|
| 57 |
+
<tr>
|
| 58 |
+
<td>OpenAI-o3</td>
|
| 59 |
+
<td><span style="color:red">91.6</span></td>
|
| 60 |
+
<td><span style="color:red">88.9</span></td>
|
| 61 |
+
</tr>
|
| 62 |
+
<tr>
|
| 63 |
+
<td>Gemini-2.5-Pro-0506</td>
|
| 64 |
+
<td><span style="color:red">90.8</span></td>
|
| 65 |
+
<td><span style="color:grey">83</span></td>
|
| 66 |
+
</tr>
|
| 67 |
+
<!-- Separator row -->
|
| 68 |
+
<tr>
|
| 69 |
+
<td colspan="4"></td>
|
| 70 |
+
</tr>
|
| 71 |
+
<!-- Merged row header 32B -->
|
| 72 |
+
<tr>
|
| 73 |
+
<th rowspan="9">32B</th>
|
| 74 |
+
</tr>
|
| 75 |
+
<!-- 32B data rows -->
|
| 76 |
+
<tr>
|
| 77 |
+
<td>Qwen3-32B</td>
|
| 78 |
+
<td><span style="color:grey">81.4</span></td>
|
| 79 |
+
<td><span style="color:grey">72.9</span></td>
|
| 80 |
+
</tr>
|
| 81 |
+
<tr>
|
| 82 |
+
<td>QwQ-32B</td>
|
| 83 |
+
<td><span style="color:grey">79.5</span></td>
|
| 84 |
+
<td><span style="color:grey">69.5</span></td>
|
| 85 |
+
</tr>
|
| 86 |
+
<tr>
|
| 87 |
+
<td>DeepSeek-R1-Distill-Qwen-32B</td>
|
| 88 |
+
<td><span style="color:grey">72.6</span></td>
|
| 89 |
+
<td><span style="color:grey">49.6</span></td>
|
| 90 |
+
</tr>
|
| 91 |
+
<tr>
|
| 92 |
+
<td>Skywork-OR1-32B</td>
|
| 93 |
+
<td><span style="color:grey">82.2</span></td>
|
| 94 |
+
<td><span style="color:grey">73.3</span></td>
|
| 95 |
+
</tr>
|
| 96 |
+
<tr>
|
| 97 |
+
<td>AM-Thinking-v1</td>
|
| 98 |
+
<td><span style="color:grey">85.3</span></td>
|
| 99 |
+
<td><span style="color:grey">74.4</span></td>
|
| 100 |
+
</tr>
|
| 101 |
+
<tr>
|
| 102 |
+
<td>OpenReasoning-Nemotron-32B</td>
|
| 103 |
+
<td><span style="color:grey">89.2</span></td>
|
| 104 |
+
<td><span style="color:grey">84.2</span></td>
|
| 105 |
+
</tr>
|
| 106 |
+
<tr>
|
| 107 |
+
<td>PCL-Reasoner-V1</td>
|
| 108 |
+
<td><span style="color:grey">85.7</span></td>
|
| 109 |
+
<td><span style="color:grey">84.2</span></td>
|
| 110 |
+
</tr>
|
| 111 |
+
<tr>
|
| 112 |
+
<td>PCL-Reasoner-V1.5</td>
|
| 113 |
+
<td><p style="font-weight: bold;">90.9</p></td>
|
| 114 |
+
<td><p style="font-weight: bold;">85.7</p></td>
|
| 115 |
+
</tr>
|
| 116 |
+
</table>
|
| 117 |
+
|
| 118 |
+
> **Note**: Model outputs on AIME24/25 are included in the repository under `eval_result/` for verification and analysis.
|
| 119 |
+
|
| 120 |
+
## Citation
|
| 121 |
+
|
| 122 |
+
```bibtex
|
| 123 |
+
@article{PCL-Reasoner-v1.5,
|
| 124 |
+
title={PCL-Reasoner-v1.5: A Math Problem Solver with Chain of Thought Reasoning},
|
| 125 |
+
author={Yao Lu and Deng Dong Fan and Jianzheng Nie and others},
|
| 126 |
+
journal={arXiv preprint arXiv:2405.14524},
|
| 127 |
+
year={2026}
|
| 128 |
+
}
|
| 129 |
+
```
|
images/._pcl_reasoner_v15.png
ADDED
|
images/benchmark.png
ADDED
|
images/pcl_reasoner_v15.png
ADDED
|