---
base_model:
- scb10x/llama3.1-typhoon2-70b-instruct
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
library_name: transformers
license: llama3.1
tags:
- mergekit
- merge
pipeline_tag: text-generation
---
**Typhoon2-DeepSeek-R1-70B**: Thai reasoning large language model. (research preview)
**Typhoon2-DeepSeek-R1-70B** is a Thai 🇹🇭 reasoning large language model with 70 billion parameters. It is built by merging DeepSeek R1 70B Distill with Typhoon2 70B Instruct, combined with supervised fine-tuning (SFT).
For more details, please see our [blog](https://blog.opentyphoon.ai/introducing-typhoon2-r1-70b-enhanced-reasoning-with-deepseek-r1-abcedaa9b7ad). Demo on [opentyphoon.ai](https://playground.opentyphoon.ai/playground).
Paper: [https://arxiv.org/abs/2502.09056](https://arxiv.org/abs/2502.09056)
*To acknowledge Meta's effort in creating the foundation model and to comply with the license, we explicitly include "llama-3.1" in the model name.
## **Key Highlights**
- **Advanced Reasoning:** Comparable to DeepSeek R1 70B Distill’s reasoning capabilities.
- **Enhanced Math & Coding Performance**: Up to 6 times more accurate than Typhoon2 70B Instruct.
- **Strong Thai Language Proficiency**: Performs similarly to Typhoon2 70B Instruct.
- **Cross-Domain Reasoning**: Adapts to various domains, going beyond math and coding.
## **Model Description**
- **Model type**: A 70B instruct decoder-only model based on Llama architecture.
- **Requirement**: transformers 4.45.0 or newer.
- **Primary Language(s)**: Thai 🇹🇭 and English 🇬🇧
- **License**: [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)
- **Demo**: [opentyphoon.ai](https://playground.opentyphoon.ai/playground).
## **Performance**
**Reasoning Performance**
<div align="center">
<img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/r1_reasoning.jpg" alt="Typhoon2 DeepSeek R1 70B Reasoning Performance" width="100%" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
</div>
**General Instruction-Following Performance**
<div align="center">
<img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/r1_general.jpg" alt="Typhoon2 DeepSeek R1 70B General Performance" width="100%" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
</div>
## **Recommended system prompt**
More controllable | less reasoning capability
```
You are a helpful assistant named Typhoon that always reasons before answering. First, provide reasoning in English, starting with "Alright!" and ending with the `</think>` token. Then, always respond to the user in the language they use or request. Avoid unnecessary affirmations or filler phrases such as "Alright", "Okay", etc., in the response.
```
More reasoning capability | less controllable
```
You are a helpful assistant named Typhoon that always reasons before answering. First, provide reasoning in English start with Alright!, then respond to the user in the language they use or request.
```
Highest reasoning capability | least controllable
```
# No system prompt
```
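The three options above can be wrapped in a small helper so that the trade-off is selected by name. This is only a sketch: `SYSTEM_PROMPTS`, `build_messages`, and the mode names are hypothetical and not part of the model's API. The prompt strings are copied verbatim from the blocks above.

```python
# Hypothetical helper: the mode names and function are illustrative only.
SYSTEM_PROMPTS = {
    # More controllable, less reasoning capability
    "controllable": (
        'You are a helpful assistant named Typhoon that always reasons before answering. '
        'First, provide reasoning in English, starting with "Alright!" and ending with the '
        '`</think>` token. Then, always respond to the user in the language they use or '
        'request. Avoid unnecessary affirmations or filler phrases such as "Alright", '
        '"Okay", etc., in the response.'
    ),
    # More reasoning capability, less controllable
    "balanced": (
        'You are a helpful assistant named Typhoon that always reasons before answering. '
        'First, provide reasoning in English start with Alright!, then respond to the user '
        'in the language they use or request.'
    ),
    # Highest reasoning capability, least controllable: no system prompt at all
    "max_reasoning": None,
}


def build_messages(user_text, mode="controllable"):
    """Return a chat message list for the chosen prompt mode."""
    system_prompt = SYSTEM_PROMPTS[mode]
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return messages
```

The returned list can be passed to `tokenizer.apply_chat_template` in the same way as the `messages` list in the usage example.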
## **Usage Example**
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "scb10x/llama3.1-typhoon2-deepseek-r1-70b-preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The user message is in Thai: "What is the smallest positive integer that is a
# multiple of 30 and can be written using only the digits 0 and 2?"
messages = [
    {"role": "system", "content": 'You are a helpful assistant named Typhoon that always reasons before answering. First, provide reasoning in English, starting with "Alright!" and ending with the `</think>` token. Then, always respond to the user in the language they use or request. Avoid unnecessary affirmations or filler phrases such as "Alright", "Okay", etc., in the response'},
    {"role": "user", "content": "จำนวนเต็มบวกที่น้อยที่สุดที่เป็นผลคูณของ 30 ซึ่งสามารถเขียนได้ด้วยตัวเลข 0 และ 2 เท่านั้นคืออะไร"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
]

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)

# Keep only the newly generated tokens (drop the prompt).
response = outputs[0][input_ids.shape[-1]:]

# Prints the thinking trace followed by the answer, for example:
# <think> Okay, .... </think> ดังนั้น จำนวนเต็มบวกที่น้อยที่สุดที่เป็นผลคูณของ 30 และเขียนได้ด้วยตัวเลข 0 และ 2 เท่านั้นคือ 2220 boxed{2220}
print(tokenizer.decode(response, skip_special_tokens=True))

# To get only the response without the thinking trace, use:
# tokenizer.decode(response, skip_special_tokens=True).split('</think>')[-1]
```
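The `split('</think>')` trick mentioned in the example comment can be packaged as a small function that returns both the thinking trace and the final answer. This is a minimal sketch: `split_reasoning` is a hypothetical name, and it assumes the `<think>` and `</think>` markers survive decoding as plain text, as they do in the sample output above.

```python
def split_reasoning(decoded, end_tag="</think>"):
    """Split decoded model output into (reasoning, answer).

    If the closing tag is missing, the whole text is treated as the answer.
    """
    # rpartition splits on the LAST occurrence, matching `.split(end_tag)[-1]`.
    reasoning, sep, answer = decoded.rpartition(end_tag)
    if not sep:
        return "", decoded.strip()
    # Drop the opening marker, if present, from the reasoning part.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()
```

For example, `split_reasoning(tokenizer.decode(response, skip_special_tokens=True))` would give the trace and the answer as two separate strings.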
## **Inference Server Hosting Example**
```bash
pip install vllm
vllm serve scb10x/llama3.1-typhoon2-deepseek-r1-70b-preview --tensor-parallel-size 2 --gpu-memory-utilization 0.95 --max-model-len 16384 --enforce-eager
# Hosting the 70B model requires at least two 80 GB GPUs (e.g. A100, H100).
# Serving a longer context (90k) requires 4 GPUs (and you can omit --enforce-eager to improve throughput).
# See https://docs.vllm.ai/ for more information.
```
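Once the server is running, it can be queried through vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library. It assumes the server listens on vLLM's default `http://localhost:8000`; `build_payload` and `chat` are hypothetical helper names, and the sampling values are copied from the usage example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: vLLM default host and port
MODEL_ID = "scb10x/llama3.1-typhoon2-deepseek-r1-70b-preview"


def build_payload(user_text, system_prompt=None, max_tokens=512):
    """Build the JSON body for a chat completion request."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": MODEL_ID,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": 0.7,
        "top_p": 0.95,
    }


def chat(user_text, system_prompt=None):
    """Send one chat request to the local server and return the reply text."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(user_text, system_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("...")` with a question returns the generated text. Whether the thinking trace is included in that text depends on the server configuration.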
## **Chat template**
This model uses the DeepSeek R1 70B Distill chat template, not the Llama 3 template. Take care to format prompts accordingly.
```
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}{%- for message in messages %}{%- if message['role'] == 'system' %}{% set ns.system_prompt = message['content'] %}{%- endif %}{%- endfor %}{{bos_token}}{{ns.system_prompt}}{%- for message in messages %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{{'<|User|>' + message['content']}}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is none %}{%- set ns.is_tool = false -%}{%- for tool in message['tool_calls']%}{%- if not ns.is_first %}{{'<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\\n' + '```json' + '\\n' + tool['function']['arguments'] + '\\n' + '```' + '<|tool▁call▁end|>'}}{%- set ns.is_first = true -%}{%- else %}{{'\\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\\n' + '```json' + '\\n' + tool['function']['arguments'] + '\\n' + '```' + '<|tool▁call▁end|>'}}{{'<|tool▁calls▁end|><|end▁of▁sentence|>'}}{%- endif %}{%- endfor %}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is not none %}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>' + message['content'] + '<|end▁of▁sentence|>'}}{%- set ns.is_tool = false -%}{%- else %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<|Assistant|>' + content + '<|end▁of▁sentence|>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool 
%}{{'<|tool▁outputs▁end|>'}}{% endif %}{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|>'}}{% endif %}
```
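For plain text conversations, the template above reduces to a simple concatenation. The sketch below mirrors only the system, user, and assistant branches and omits all tool-call handling. `render_prompt` is a hypothetical name, and the default `bos_token` value is an assumption; `tokenizer.apply_chat_template` remains the authoritative way to build prompts.

```python
def render_prompt(messages, bos_token="<|begin▁of▁sentence|>", add_generation_prompt=True):
    """Approximate the chat template for system/user/assistant text turns.

    The default bos_token is an assumption; check tokenizer.bos_token.
    """
    # As in the template, the last system message (if any) is used,
    # and it is emitted once, right after the BOS token.
    system_prompt = ""
    for message in messages:
        if message["role"] == "system":
            system_prompt = message["content"]

    parts = [bos_token, system_prompt]
    for message in messages:
        if message["role"] == "user":
            parts.append("<|User|>" + message["content"])
        elif message["role"] == "assistant":
            # Earlier assistant turns keep only the text after </think>.
            content = message["content"].split("</think>")[-1]
            parts.append("<|Assistant|>" + content + "<|end▁of▁sentence|>")

    if add_generation_prompt:
        parts.append("<|Assistant|>")
    return "".join(parts)
```

Note that the thinking trace of previous assistant turns is dropped from the history, so only final answers are fed back to the model in multi-turn conversations.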
## **Tool use**
We do not recommend tool use with this model.
## **Intended Uses & Limitations**
This model is a reasoning instructional model that is still under development. It incorporates some level of guardrails, but it may still produce answers that are inaccurate, biased, or otherwise objectionable in response to user prompts. We recommend that developers assess these risks in the context of their use case.
## **Follow us**
**https://twitter.com/opentyphoon**
## **Support**
**https://discord.gg/us5gAYmrxw**
## **Citation**
- If you find Typhoon 2 / Typhoon 2 R1 useful for your work, please cite it using:
```
@misc{typhoon2,
title={Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models},
author={Kunat Pipatanakul and Potsawee Manakul and Natapong Nitarach and Warit Sirichotedumrong and Surapon Nonesung and Teetouch Jaknamon and Parinthapat Pengpun and Pittawat Taveekitworachai and Adisai Na-Thalang and Sittipong Sripaisarnmongkol and Krisanapong Jirayoot and Kasima Tharnpipitchai},
year={2024},
eprint={2412.13702},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2412.13702},
}
```
```
@misc{pipatanakul2025adaptinglanguagespecificllmsreasoning,
title={Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging - An Open Recipe},
author={Kunat Pipatanakul and Pittawat Taveekitworachai and Potsawee Manakul and Kasima Tharnpipitchai},
year={2025},
eprint={2502.09056},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.09056},
}
```