Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag so that the model is indexed and discovered more easily. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.
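For reference, the 13 three-letter codes added here are ISO 639-3 identifiers, whereas the previous tag used the two-letter ISO 639-1 form (`en`). The sketch below shows how the two forms correspond; the `ISO_639_3_TO_1` table and the `to_two_letter` helper are hypothetical names for illustration and are not part of this repository.

```python
# Hypothetical lookup table: ISO 639-3 codes from this PR's front matter
# mapped to their two-letter ISO 639-1 equivalents.
ISO_639_3_TO_1 = {
    "zho": "zh",
    "eng": "en",
    "fra": "fr",
    "spa": "es",
    "por": "pt",
    "deu": "de",
    "ita": "it",
    "rus": "ru",
    "jpn": "ja",
    "kor": "ko",
    "vie": "vi",
    "tha": "th",
    "ara": "ar",
}


def to_two_letter(codes):
    """Return ISO 639-1 codes, raising on anything outside the table."""
    unknown = [code for code in codes if code not in ISO_639_3_TO_1]
    if unknown:
        raise ValueError(f"unmapped language codes: {unknown}")
    return [ISO_639_3_TO_1[code] for code in codes]


# Same order as the language list in the new front matter.
pr_languages = list(ISO_639_3_TO_1)
two_letter = to_two_letter(pr_languages)
print(two_letter)
```

Either form identifies the same languages; the table is only meant to make the new tags easy to compare with the old `en` entry.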

Files changed (1)

README.md +97 -85

@@ -1,86 +1,98 @@
----
-datasets:
-- id4thomas/emotion-prediction-comet-atomic-2020
-language:
-- en
-base_model:
-- Qwen/Qwen2.5-3B-Instruct
----
-# emotion-predictor-Qwen2.5-3B-Instruct
-LLM trained to predict a character's emotional response in the given situation
-* Trained to predict in a structured output format.
-Prediction Performance:
-| Setting | Performance by Emotion|
-| --- | --- |
-| Pretrained | <img src="./assets/qwen2_5-3b-baseline_perf.png" alt="baseline_perf" width="100%" /> |
-| Tuned | <img src="./assets/finetuned_perf.png" alt="trained_perf" width="100%" /> |
-## Quickstart
-The model is trained to predict in the following schema
-```
-from enum import Enum
-from pydantic import BaseModel
-class RelationshipStatus(str, Enum):
-    na = "na"
-    low = "low"
-    medium = "medium"
-    high = "high"
-class EmotionLabel(BaseModel):
-    joy: RelationshipStatus
-    trust: RelationshipStatus
-    fear: RelationshipStatus
-    surprise: RelationshipStatus
-    sadness: RelationshipStatus
-    disgust: RelationshipStatus
-    anger: RelationshipStatus
-    anticipation: RelationshipStatus
-class EntryResult(BaseModel):
-    emotion: EmotionLabel
-    reason: str
-```
-Using `outlines` package to generate structured predictions
-* system prompt & user template is provided [here](./assets/inference_prompt.yaml)
-```
-import outlines
-from outlines import models
-from transformers import AutoTokenizer, AutoModelForCausalLM
-model = AutoModelForCausalLM.from_pretrained("id4thomas/emotion-predictor-Qwen2.5-3B-Instruct")
-tokenizer = AutoTokenizer.from_pretrained("id4thomas/emotion-predictor-Qwen2.5-3B-Instruct")
-# Initalize outlines generator
-outlines_model = models.Transformers(model, tokenizer)
-generator = outlines.generate.json(outlines_model, EntryResult)
-# Generate
-messages = [
-    {"role": "system", "content": system_message},
-    {"role": "user", "content": user_message}
-]
-input_text = tokenizer.apply_chat_template(
-    messages,
-    tokenize=False,
-    add_generation_prompt=True,
-)
-prediction = generator(input_text)
->>> EntryResult(emotion=EmotionLabel(joy=<RelationshipStatus.na: 'na'>, ...)
-```
-Using endpoint loaded with vllm & OpenAI client package
-* example of using vllm container is provided [here](./assets/run_vllm.sh)
-```
-client = OpenAI(...)
-json_schema = EntryResult.model_json_schema()
-completion = client.chat.completions.create(
-    model="id4thomas/emotion-predictor-Qwen2.5-3B-Instruct",
-    messages=messages,
-    extra_body={"guided_json": json_schema},
-)
-print(completion.choices[0].message.content)
 ```

+---
+datasets:
+- id4thomas/emotion-prediction-comet-atomic-2020
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+base_model:
+- Qwen/Qwen2.5-3B-Instruct
+---
+# emotion-predictor-Qwen2.5-3B-Instruct
+LLM trained to predict a character's emotional response in the given situation
+* Trained to predict in a structured output format.
+Prediction Performance:
+| Setting | Performance by Emotion|
+| --- | --- |
+| Pretrained | <img src="./assets/qwen2_5-3b-baseline_perf.png" alt="baseline_perf" width="100%" /> |
+| Tuned | <img src="./assets/finetuned_perf.png" alt="trained_perf" width="100%" /> |
+## Quickstart
+The model is trained to predict in the following schema
+```
+from enum import Enum
+from pydantic import BaseModel
+class RelationshipStatus(str, Enum):
+    na = "na"
+    low = "low"
+    medium = "medium"
+    high = "high"
+class EmotionLabel(BaseModel):
+    joy: RelationshipStatus
+    trust: RelationshipStatus
+    fear: RelationshipStatus
+    surprise: RelationshipStatus
+    sadness: RelationshipStatus
+    disgust: RelationshipStatus
+    anger: RelationshipStatus
+    anticipation: RelationshipStatus
+class EntryResult(BaseModel):
+    emotion: EmotionLabel
+    reason: str
+```
+Using `outlines` package to generate structured predictions
+* system prompt & user template is provided [here](./assets/inference_prompt.yaml)
+```
+import outlines
+from outlines import models
+from transformers import AutoTokenizer, AutoModelForCausalLM
+model = AutoModelForCausalLM.from_pretrained("id4thomas/emotion-predictor-Qwen2.5-3B-Instruct")
+tokenizer = AutoTokenizer.from_pretrained("id4thomas/emotion-predictor-Qwen2.5-3B-Instruct")
+# Initalize outlines generator
+outlines_model = models.Transformers(model, tokenizer)
+generator = outlines.generate.json(outlines_model, EntryResult)
+# Generate
+messages = [
+    {"role": "system", "content": system_message},
+    {"role": "user", "content": user_message}
+]
+input_text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+)
+prediction = generator(input_text)
+>>> EntryResult(emotion=EmotionLabel(joy=<RelationshipStatus.na: 'na'>, ...)
+```
+Using endpoint loaded with vllm & OpenAI client package
+* example of using vllm container is provided [here](./assets/run_vllm.sh)
+```
+client = OpenAI(...)
+json_schema = EntryResult.model_json_schema()
+completion = client.chat.completions.create(
+    model="id4thomas/emotion-predictor-Qwen2.5-3B-Instruct",
+    messages=messages,
+    extra_body={"guided_json": json_schema},
+)
+print(completion.choices[0].message.content)
 ```
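The endpoint example in the README prints the raw JSON string returned by the server. As a minimal, standard-library-only sketch of checking such a string against the schema shown in the Quickstart: the `validate_entry` helper and the sample payload below are made up for illustration (the payload is not real model output). With pydantic installed, `EntryResult.model_validate_json` would perform much the same check.

```python
import json

# Field names and allowed levels copied from the EmotionLabel /
# RelationshipStatus definitions in the README.
EMOTIONS = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]
LEVELS = {"na", "low", "medium", "high"}


def validate_entry(raw):
    """Parse a JSON string and check it has the EntryResult shape."""
    data = json.loads(raw)
    emotion = data.get("emotion")
    if not isinstance(emotion, dict):
        raise ValueError("missing 'emotion' object")
    missing = [name for name in EMOTIONS if name not in emotion]
    if missing:
        raise ValueError(f"missing emotions: {missing}")
    bad = {name: emotion[name] for name in EMOTIONS
           if emotion[name] not in LEVELS}
    if bad:
        raise ValueError(f"invalid levels: {bad}")
    if not isinstance(data.get("reason"), str):
        raise ValueError("'reason' must be a string")
    return data


# Illustrative payload only, built by hand to match the schema.
sample_emotion = {name: "na" for name in EMOTIONS}
sample_emotion["joy"] = "high"
sample = json.dumps({"emotion": sample_emotion, "reason": "example reason"})

result = validate_entry(sample)
print(result["emotion"]["joy"])
```

In practice the string passed to `validate_entry` would be `completion.choices[0].message.content` from the endpoint call.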