Update README.md (#19)
- Update README.md (ef04363cc33702de8651f794142d8b155b9ec989)
README.md CHANGED

````diff
@@ -303,6 +303,7 @@ print(tokenizer.decode(outputs[0]))
 ### Use it with vLLM
 
 For more detailed information on how to use the model with vLLM, please see [this cookbook](https://github.com/NVIDIA-NeMo/Nemotron/blob/main/usage-cookbook/Nemotron-3-Nano/vllm_cookbook.ipynb).
+If you are on Jetson Thor, please use this vllm container: `ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor`.
 
 ```
 pip install -U "vllm>=0.12.0"
@@ -725,7 +726,7 @@ The following table depicts our sample distribution for the 6 languages and 5 tr
 ## Inference
 
 - Engines: HF, vLLM, TRT-LLM, SGLang, Llama.cpp
-- Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB
+- Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB, Jetson Thor
 
 
 ## Ethical Considerations
````
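The change points Jetson Thor users at a prebuilt vLLM container rather than the `pip install` path. As a minimal, illustrative sketch of using that image (the image name comes from the diff above; the docker flags are assumptions about a typical Jetson setup, not part of the change):

```shell
# Pull the Jetson Thor vLLM image named in the README change
docker pull ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor

# Start an interactive shell in the container. --runtime nvidia is the usual
# way to expose the GPU on Jetson devices; adjust to your Docker/JetPack setup.
docker run --rm -it --runtime nvidia \
  ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor
```

On non-Jetson hardware, the existing `pip install -U "vllm>=0.12.0"` path from the README remains the documented route.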