suhara committed
Commit fdb60b2 · verified · 1 Parent(s): 96133d5

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -148,7 +148,8 @@ We evaluated our model on the following benchmarks:
  | MMLU-ProX (avg over langs) | 59.5 | **77.6\*** | 69.1\* |
  | WMT24++ (en-\>xx) | **86.2** | 85.6 | 83.2 |
 
- All evaluation results were collected via [Nemo Evaluator SDK](https://github.com/NVIDIA-NeMo/Evaluator) and [Nemo Skills](https://github.com/NVIDIA-NeMo/Skills). The open source container on Nemo Skills packaged via NVIDIA's Nemo Evaluator SDK used for evaluations can be found [here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/eval-factory/containers/nemo_skills?version=25.11). In addition to Nemo Skills, the evaluations also used dedicated packaged containers for Tau-2 Bench, ArenaHard v2, AA\_LCR. A reproducibility tutorial along with all configs can be found in [Nemo Evaluator SDK examples](https://github.com/NVIDIA-NeMo/Evaluator/tree/main/packages/nemo-evaluator-launcher/examples/nemotron/nano-v3-reproducibility.md). \* denotes the accuracy numbers are measured by us.
+ All evaluation results were collected via [Nemo Evaluator SDK](https://github.com/NVIDIA-NeMo/Evaluator) and [Nemo Skills](https://github.com/NVIDIA-NeMo/Skills). The open source container on Nemo Skills packaged via NVIDIA's Nemo Evaluator SDK used for evaluations can be found [here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/eval-factory/containers/nemo_skills?version=25.11). In addition to Nemo Skills, the evaluations also used dedicated packaged containers for Tau-2 Bench, ArenaHard v2, AA_LCR. A reproducibility tutorial along with all configs can be found in [Nemo Evaluator SDK examples](https://github.com/NVIDIA-NeMo/Evaluator/tree/main/packages/nemo-evaluator-launcher/examples/nemotron/nano-v3-reproducibility.md). The configs are also available in this HF repo [here](./nemo-evaluator-launcher-configs/local_nvidia_nemotron_3_nano_30b_a3b.yaml). \* denotes the accuracy numbers are measured by us.
+
 
  ### Deployment Geography: Global