suhara committed on
Commit d3ad469 · verified · 1 Parent(s): d011723

Update README.md (#19)

- Update README.md (ef04363cc33702de8651f794142d8b155b9ec989)

Files changed (1): README.md +2 -1
README.md CHANGED

@@ -303,6 +303,7 @@ print(tokenizer.decode(outputs[0]))
 ### Use it with vLLM
 
 For more detailed information on how to use the model with vLLM, please see [this cookbook](https://github.com/NVIDIA-NeMo/Nemotron/blob/main/usage-cookbook/Nemotron-3-Nano/vllm_cookbook.ipynb).
+If you are on Jetson Thor, please use this vllm container: `ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor`.
 
 ```
 pip install -U "vllm>=0.12.0"
@@ -725,7 +726,7 @@ The following table depicts our sample distribution for the 6 languages and 5 tr
 ## Inference
 
 - Engines: HF, vLLM, TRT-LLM, SGLang, Llama.cpp
-- Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB
+- Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB, Jetson Thor
 
 
 ## Ethical Considerations