Update README.md

The paper is coming soon.
</div>

## **Recommended system prompt**

More controllable | less reasoning capability
```
...
```

Highest reasoning capability | least controllable
```
# No system prompt
```
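
For a concrete picture of the two modes, here is a minimal sketch that builds the chat messages with and without a system message and renders both through the tokenizer's chat template. The system prompt string is only a placeholder, not the card's recommended text, and the sample question is made up for illustration.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("scb10x/llama3.1-typhoon2-deepseek-r1-70b")

# More controllable: prepend a system message (placeholder text; substitute the recommended prompt).
with_system = [
    {"role": "system", "content": "<recommended system prompt goes here>"},
    {"role": "user", "content": "What is 15% of 240?"},
]

# Highest reasoning capability: send the user message only, with no system message.
without_system = [
    {"role": "user", "content": "What is 15% of 240?"},
]

# Render both to prompt strings to see how the template differs between the two modes.
print(tokenizer.apply_chat_template(with_system, tokenize=False, add_generation_prompt=True))
print(tokenizer.apply_chat_template(without_system, tokenize=False, add_generation_prompt=True))
```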

## **Usage Example**

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# ...

response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True)) # <think> Okay, .... </think> ดังนั้น จำนวนเต็มบวกที่น้อยที่สุดที่เป็นผลคูณของ 30 และเขียนได้ด้วยตัวเลข 0 และ 2 เท่านั้นคือ 2220 boxed{2220}
```
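
The usage example above is abbreviated, so the following is a minimal end-to-end sketch of the usual transformers generation flow for this model; it is an approximation rather than the card's original code, and the dtype, device settings, and generation length are illustrative. The question is chosen to match the sample output, whose Thai answer translates to "Therefore, the smallest positive integer that is a multiple of 30 and can be written using only the digits 0 and 2 is 2220."

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "scb10x/llama3.1-typhoon2-deepseek-r1-70b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; choose a dtype/device setup that fits your hardware
    device_map="auto",
)

# No system message: the "highest reasoning capability" configuration described above.
messages = [
    {"role": "user", "content": "What is the smallest positive integer that is a multiple of 30 and can be written using only the digits 0 and 2?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=2048)  # illustrative budget for long <think> traces
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```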

## **Inference Server Hosting Example**
```bash
pip install vllm
vllm serve scb10x/llama3.1-typhoon2-deepseek-r1-70b --tensor-parallel-size 2 --gpu-memory-utilization 0.95 --max-model-len 16384 --enforce-eager
# ...
# see more information at https://docs.vllm.ai/
```
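
Once the server is running, vLLM exposes an OpenAI-compatible API (on port 8000 by default), so it can be queried with any OpenAI-style client. The sketch below uses the openai Python package; the base URL, API key, and sampling settings are assumptions for illustration, not values taken from this card.

```python
from openai import OpenAI  # pip install openai

# vLLM's OpenAI-compatible server listens on http://localhost:8000/v1 by default; no real key is needed.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="scb10x/llama3.1-typhoon2-deepseek-r1-70b",
    messages=[
        {"role": "user", "content": "What is the smallest positive multiple of 30 that uses only the digits 0 and 2?"},
    ],
    max_tokens=2048,   # illustrative; leave room for the reasoning trace
    temperature=0.6,   # illustrative sampling setting
)
print(response.choices[0].message.content)
```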

## **Tool use**

We do not recommend tool use (function calling) with this model.

## **Intended Uses & Limitations**

This model is an instruction-following reasoning model. However, it is still under development. It incorporates some level of guardrails, but it may still produce answers that are inaccurate, biased, or otherwise objectionable in response to user prompts. We recommend that developers assess these risks in the context of their use case.