Update README.md
README.md
@@ -148,7 +148,8 @@ We evaluated our model on the following benchmarks:
 | MMLU-ProX (avg over langs) | 59.5 | **77.6\*** | 69.1\* |
 | WMT24++ (en-\>xx) | **86.2** | 85.6 | 83.2 |
 
-All evaluation results were collected via [Nemo Evaluator SDK](https://github.com/NVIDIA-NeMo/Evaluator) and [Nemo Skills](https://github.com/NVIDIA-NeMo/Skills). The open source container on Nemo Skills packaged via NVIDIA
+All evaluation results were collected via [Nemo Evaluator SDK](https://github.com/NVIDIA-NeMo/Evaluator) and [Nemo Skills](https://github.com/NVIDIA-NeMo/Skills). The open source container on Nemo Skills packaged via NVIDIA’s Nemo Evaluator SDK used for evaluations can be found [here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/eval-factory/containers/nemo_skills?version=25.11). In addition to Nemo Skills, the evaluations also used dedicated packaged containers for Tau-2 Bench, ArenaHard v2, AA_LCR. A reproducibility tutorial along with all configs can be found in [Nemo Evaluator SDK examples](https://github.com/NVIDIA-NeMo/Evaluator/tree/main/packages/nemo-evaluator-launcher/examples/nemotron/nano-v3-reproducibility.md). The configs are also available in this HF repo [here](./nemo-evaluator-launcher-configs/local_nvidia_nemotron_3_nano_30b_a3b.yaml). \* denotes the accuracy numbers are measured by us.
+
 
 ### Deployment Geography: Global
 