nvhshin committed · Commit 9f7b650 · verified · 1 Parent(s): 9895319

Update README.md

Files changed (1)
  1. README.md +0 -12
README.md CHANGED
@@ -48,18 +48,6 @@ Use of this model is governed by the [NVIDIA Open Model License](https://www.nvi
 
  ---
 
- ## RewardBench LeaderBoard
-
- As of 29 May 2025, Qwen-2.5-Nemotron-32B-Reward is the highest Qwen-based reward model on RewardBench.
- **Note on licensing:** Base model Apache-2.0; HelpSteer2/3 CC-BY-4.0; this wrapper under NVIDIA license.
-
- | Model | Overall | Chat | Chat-Hard | Safety | Reasoning |
- |:--------------------------|---------|:----------|:-------|:----------|:----------|
- | **Qwen-2.5-Nemotron-32B-Reward** | **92.8** | 96.4 | **89.0** | 89.5 | 96.3 |
- | [R-I-S-E/RISE-Judge-Qwen2.5-32B](https://huggingface.co/R-I-S-E/RISE-Judge-Qwen2.5-32B) | 92.7 | 96.6 | 83.3 | 91.9 | **98.8** |
-
- ---
-
  ## RM-Bench LeaderBoard
 
  As of 29 May 2025, Qwen-2.5-Nemotron-32B-Reward is slightly lower on [RM-Bench](https://arxiv.org/abs/2410.16184) and [JudgeBench](https://huggingface.co/spaces/ScalerLab/JudgeBench) compared to Llama-3.3-Nemotron-70B-Reward.
 