suhara committed on
Commit d3ad469 · verified · 1 Parent(s): d011723

Update README.md (#19)

- Update README.md (ef04363cc33702de8651f794142d8b155b9ec989)

Files changed (1): README.md +2 -1
README.md CHANGED

@@ -303,6 +303,7 @@ print(tokenizer.decode(outputs[0]))
 ### Use it with vLLM
 
 For more detailed information on how to use the model with vLLM, please see [this cookbook](https://github.com/NVIDIA-NeMo/Nemotron/blob/main/usage-cookbook/Nemotron-3-Nano/vllm_cookbook.ipynb).
+If you are on Jetson Thor, please use this vllm container: `ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor`.
 
 ```
 pip install -U "vllm>=0.12.0"
@@ -725,7 +726,7 @@ The following table depicts our sample distribution for the 6 languages and 5 tr
 ## Inference
 
 - Engines: HF, vLLM, TRT-LLM, SGLang, Llama.cpp
-- Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB
+- Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB, Jetson Thor
 
 
 ## Ethical Considerations