kunato committed 65fc012 (verified) · Parent(s): c0f2358

Create README.md

Files changed (1): README.md (+349, -0)

---
pipeline_tag: text-generation
license: apache-2.0
---

**Typhoon2-Qwen2.5-7B-Instruct**: Thai Large Language Model (Instruct)

**Typhoon2-Qwen2.5-7B-Instruct** is an instruct Thai 🇹🇭 large language model with 7 billion parameters, based on Qwen2.5-7B.

[TODO add one image and table result]

For the release post, please see our [blog](...).

## **Model Description**

- **Model type**: A 7B instruct decoder-only model based on the Qwen2 architecture.
- **Requirement**: transformers 4.45.0 or newer (a quick runtime version check is sketched right after this list).
- **Context length**: 128k. Please refer to the [Processing Long Texts](#processing-long-texts) section below for detailed instructions on how to deploy Typhoon2-Qwen2.5 for handling long texts.
- **Primary Language(s)**: Thai 🇹🇭 and English 🇬🇧
- **License**: Apache-2.0
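
If you want to verify the transformers requirement programmatically, a minimal sketch (purely illustrative, not part of the original card) is:

```python
import transformers
from packaging import version  # packaging ships as a transformers dependency

# Fail early with a clear message if the installed transformers is too old.
assert version.parse(transformers.__version__) >= version.parse("4.45.0"), (
    f"transformers {transformers.__version__} is too old; please upgrade to 4.45.0 or newer."
)
```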

## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "scb10x/typhoon2-qwen2.5-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Typhoon, an AI assistant created by SCB 10X, designed to be helpful, harmless, and honest. Typhoon assists with analysis, answering questions, math, coding, creative writing, teaching, role-play, discussions, and more. Typhoon responds directly without affirmations or filler phrases (e.g., “Certainly,” “Of course”). Responses do not start with “Certainly” in any form. Typhoon adheres to these rules in all languages and always replies in the user's language or as requested. Communicate in fluid, conversational prose, showing genuine interest, empathy, and presenting information clearly and visually."},
    {"role": "user", "content": "ขอสูตรไก่ย่าง"},  # "Please give me a grilled chicken recipe."
]

# Apply the chat template and move the token IDs to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Strip the prompt tokens and decode only the newly generated part.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
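
For interactive use, you may prefer to stream tokens as they are generated rather than waiting for the full completion. A minimal sketch using `TextStreamer` from transformers, reusing the `model`, `tokenizer`, and `input_ids` defined above:

```python
from transformers import TextStreamer

# Prints tokens to stdout as they are generated; the prompt itself is skipped.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    streamer=streamer,
)
```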

## Inference Server Hosting Example
```bash
pip install vllm
vllm serve scb10x/typhoon2-qwen2.5-7b-instruct
# see more information at https://docs.vllm.ai/
```
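
`vllm serve` exposes an OpenAI-compatible API (by default on port 8000), so the hosted model can be queried with the standard `openai` Python client. A minimal sketch, assuming the server started above is running locally:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server does not require a real API key by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="scb10x/typhoon2-qwen2.5-7b-instruct",
    messages=[
        {"role": "user", "content": "ขอสูตรไก่ย่าง"},  # "Please give me a grilled chicken recipe."
    ],
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
)
print(response.choices[0].message.content)
```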

## Function-Call Example
```python
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import os
import ast

model_name = "scb10x/typhoon2-qwen2.5-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
)

get_weather_api = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, New York",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The unit of temperature to return",
            },
        },
        "required": ["location"],
    },
}


search_api = {
    "name": "search",
    "description": "Search for information on the internet",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query, e.g. 'latest news on AI'",
            }
        },
        "required": ["query"],
    },
}

get_stock = {
    "name": "get_stock_price",
    "description": "Get the stock price",
    "parameters": {
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "The stock symbol, e.g. AAPL, GOOG",
            }
        },
        "required": ["symbol"],
    },
}
# Tool definitions use the same format as OpenAI tools
openai_format_tools = [get_weather_api, search_api, get_stock]

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "ขอราคาหุ้น Tasla (TLS) และ Amazon (AMZ) ?"},  # "What are the stock prices of Tasla (TLS) and Amazon (AMZ)?"
]

# Rendered prompt string (handy for inspecting how the tools are injected into the chat template)
final_prompt = tokenizer.apply_chat_template(
    messages, tools=openai_format_tools, add_generation_prompt=True, tokenize=False
)

inputs = tokenizer.apply_chat_template(
    messages, tools=openai_format_tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    num_return_sequences=1,
)
# Strip the prompt tokens and keep only the newly generated part.
response = outputs[0][inputs.shape[-1]:]

print("Output:", tokenizer.decode(response, skip_special_tokens=True))

# Decoding function utility
def resolve_ast_by_type(value):
    if isinstance(value, ast.Constant):
        if value.value is Ellipsis:
            output = "..."
        else:
            output = value.value
    elif isinstance(value, ast.UnaryOp):
        output = -value.operand.value
    elif isinstance(value, ast.List):
        output = [resolve_ast_by_type(v) for v in value.elts]
    elif isinstance(value, ast.Dict):
        output = {
            resolve_ast_by_type(k): resolve_ast_by_type(v)
            for k, v in zip(value.keys, value.values)
        }
    elif isinstance(
        value, ast.NameConstant
    ):  # Added this condition to handle boolean values
        output = value.value
    elif isinstance(
        value, ast.BinOp
    ):  # Added this condition to handle function calls as arguments
        output = eval(ast.unparse(value))
    elif isinstance(value, ast.Name):
        output = value.id
    elif isinstance(value, ast.Call):
        if len(value.keywords) == 0:
            output = ast.unparse(value)
        else:
            output = resolve_ast_call(value)
    elif isinstance(value, ast.Tuple):
        output = tuple(resolve_ast_by_type(v) for v in value.elts)
    elif isinstance(value, ast.Lambda):
        output = eval(ast.unparse(value.body[0].value))
    elif isinstance(value, ast.Ellipsis):
        output = "..."
    elif isinstance(value, ast.Subscript):
        try:
            output = ast.unparse(value.body[0].value)
        except:
            output = ast.unparse(value.value) + "[" + ast.unparse(value.slice) + "]"
    else:
        raise Exception(f"Unsupported AST type: {type(value)}")
    return output


def resolve_ast_call(elem):
    func_parts = []
    func_part = elem.func
    while isinstance(func_part, ast.Attribute):
        func_parts.append(func_part.attr)
        func_part = func_part.value
    if isinstance(func_part, ast.Name):
        func_parts.append(func_part.id)
    func_name = ".".join(reversed(func_parts))
    args_dict = {}
    for arg in elem.keywords:
        output = resolve_ast_by_type(arg.value)
        args_dict[arg.arg] = output
    return {func_name: args_dict}


def ast_parse(input_str, language="Python"):
    if language == "Python":
        cleaned_input = input_str.strip("[]'")
        parsed = ast.parse(cleaned_input, mode="eval")
        extracted = []
        if isinstance(parsed.body, ast.Call):
            extracted.append(resolve_ast_call(parsed.body))
        else:
            for elem in parsed.body.elts:
                assert isinstance(elem, ast.Call)
                extracted.append(resolve_ast_call(elem))
        return extracted
    else:
        raise NotImplementedError(f"Unsupported language: {language}")


def parse_nested_value(value):
    """
    Parse a potentially nested value from the AST output.

    Args:
        value: The value to parse, which could be a nested dictionary, which includes another function call, or a simple value.

    Returns:
        str: A string representation of the value, handling nested function calls and nested dictionary function arguments.
    """
    if isinstance(value, dict):
        # Check if the dictionary represents a function call (i.e., the value is another dictionary or complex structure)
        if all(isinstance(v, dict) for v in value.values()):
            func_name = list(value.keys())[0]
            args = value[func_name]
            args_str = ", ".join(
                f"{k}={parse_nested_value(v)}" for k, v in args.items()
            )
            return f"{func_name}({args_str})"
        else:
            # If it's a simple dictionary, treat it as key-value pairs
            return (
                "{"
                + ", ".join(f"'{k}': {parse_nested_value(v)}" for k, v in value.items())
                + "}"
            )
    return repr(value)


def decoded_output_to_execution_list(decoded_output):
    """
    Convert decoded output to a list of executable function calls.

    Args:
        decoded_output (list): A list of dictionaries representing function calls.

    Returns:
        list: A list of strings, each representing an executable function call.
    """
    execution_list = []
    for function_call in decoded_output:
        for key, value in function_call.items():
            args_str = ", ".join(
                f"{k}={parse_nested_value(v)}" for k, v in value.items()
            )
            execution_list.append(f"{key}({args_str})")
    return execution_list


def default_decode_ast_prompting(result, language="Python"):
    result = result.strip("`\n ")
    if not result.startswith("["):
        result = "[" + result
    if not result.endswith("]"):
        result = result + "]"
    decoded_output = ast_parse(result, language)
    return decoded_output


fc_result = default_decode_ast_prompting(tokenizer.decode(response, skip_special_tokens=True))
print(fc_result)  # [{'Function': {'arguments': '{"symbol": "TLS"}', 'name': 'get_stock_price'}}, {'Function': {'arguments': '{"symbol": "AMZ"}', 'name': 'get_stock_price'}}]
```
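
To act on the parsed tool calls, you can map tool names back to your own Python implementations. The sketch below is only illustrative and not part of the original example: `get_stock_price` is a hypothetical placeholder, and the loop assumes the output shape shown in the comment above (`{'Function': {'arguments': ..., 'name': ...}}`); depending on how the model formats its calls, items may instead look like `{'get_stock_price': {'symbol': 'TLS'}}`, which the fallback branch handles.

```python
import json

# Hypothetical local implementation of the tool used in this example (placeholder only).
def get_stock_price(symbol):
    return {"symbol": symbol, "price": 0.0}  # replace with a real data source

available_functions = {"get_stock_price": get_stock_price}

for call in fc_result:
    spec = call.get("Function")
    if spec is not None:
        # Shape: {'Function': {'arguments': '<json string>', 'name': '<tool name>'}}
        name, kwargs = spec["name"], json.loads(spec["arguments"])
    else:
        # Shape: {'<tool name>': {<keyword arguments>}}
        name, kwargs = next(iter(call.items()))
    fn = available_functions.get(name)
    if fn is None:
        print(f"No local implementation for tool: {name}")
        continue
    print(name, "->", fn(**kwargs))
```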

## Processing Long Texts

The current `config.json` is set for a context length of up to 32,768 tokens.
To handle extensive inputs exceeding 32,768 tokens, we utilize [YaRN](https://arxiv.org/abs/2309.00071), a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts.

For supported frameworks, you could add the following to `config.json` to enable YaRN:
```json
{
  ...,
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```
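
If you prefer not to edit `config.json` on disk, the same override can typically be applied at load time. A minimal sketch for transformers, mirroring the JSON block above (vLLM users should instead follow the deployment notes below):

```python
from transformers import AutoConfig, AutoModelForCausalLM
import torch

model_id = "scb10x/typhoon2-qwen2.5-7b-instruct"

# Mirror the rope_scaling block above to enable YaRN for long inputs.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```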

For deployment, we recommend using vLLM.
If you are not familiar with vLLM, please refer to the Qwen [documentation on vLLM deployment](https://qwen.readthedocs.io/en/latest/deployment/vllm.html) for usage.
Presently, vLLM only supports static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts**.
We advise adding the `rope_scaling` configuration only when processing long contexts is required.
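
As a rough illustration of long-context serving with the vLLM Python API (an assumption-laden sketch: it presumes your vLLM version exposes the `rope_scaling` engine argument, the Python counterpart of the `--rope-scaling` CLI flag):

```python
from vllm import LLM, SamplingParams

# Enable static YaRN only when long inputs are actually needed.
llm = LLM(
    model="scb10x/typhoon2-qwen2.5-7b-instruct",
    rope_scaling={"factor": 4.0, "original_max_position_embeddings": 32768, "type": "yarn"},
    max_model_len=131072,
)

params = SamplingParams(max_tokens=512, temperature=0.7, top_p=0.9)
outputs = llm.generate(["<very long document here> Please summarize this document."], params)
print(outputs[0].outputs[0].text)
```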

## **Intended Uses & Limitations**

This model is an instruct model. However, it is still under development. It incorporates some level of guardrails, but it may still produce answers that are inaccurate, biased, or otherwise objectionable in response to user prompts. We recommend that developers assess these risks in the context of their use case.

## **Follow us**

**https://twitter.com/opentyphoon**

## **Support**

**https://discord.gg/CqyBscMFpg**

## **Citation**

- If you find Typhoon2 useful for your work, please cite it using:
```
@article{pipatanakul2023typhoon,
    title={Typhoon: Thai Large Language Models},
    author={Kunat Pipatanakul and Phatrasek Jirabovonvisut and Potsawee Manakul and Sittipong Sripaisarnmongkol and Ruangsak Patomwong and Pathomporn Chokchainant and Kasima Tharnpipitchai},
    year={2023},
    journal={arXiv preprint arXiv:2312.13951},
    url={https://arxiv.org/abs/2312.13951}
}
```