Update README.md
README.md
CHANGED
@@ -48,18 +48,6 @@ Use of this model is governed by the [NVIDIA Open Model License](https://www.nvi
 
 ---
 
-## RewardBench LeaderBoard
-
-As of 29 May 2025, Qwen-2.5-Nemotron-32B-Reward is the highest Qwen-based reward model on RewardBench.
-**Note on licensing:** Base model Apache-2.0; HelpSteer2/3 CC-BY-4.0; this wrapper under NVIDIA license.
-
-| Model | Overall | Chat | Chat-Hard | Safety | Reasoning |
-|:--------------------------|---------|:----------|:-------|:----------|:----------|
-| **Qwen-2.5-Nemotron-32B-Reward** | **92.8** | 96.4 | **89.0** | 89.5 | 96.3 |
-| [R-I-S-E/RISE-Judge-Qwen2.5-32B](https://huggingface.co/R-I-S-E/RISE-Judge-Qwen2.5-32B) | 92.7 | 96.6 | 83.3 | 91.9 | **98.8** |
-
----
-
 ## RM-Bench LeaderBoard
 
 As of 29 May 2025, Qwen-2.5-Nemotron-32B-Reward is slightly lower on [RM-Bench](https://arxiv.org/abs/2410.16184) and [JudgeBench](https://huggingface.co/spaces/ScalerLab/JudgeBench) compared to Llama-3.3-Nemotron-70B-Reward.