Commit 8eb73f7 (verified) by ertghiu256 · Parent(s): 29a4f9f

Update README.md

Files changed (1): README.md (+150, −81)
---
base_model:
- ertghiu256/qwen3-4b-code-reasoning
- Qwen/Qwen3-4B
- Tesslate/UIGEN-T3-4B-Preview-MAX
- ertghiu256/qwen-3-4b-mixture-of-thought
- POLARIS-Project/Polaris-4B-Preview
- ertghiu256/qwen3-math-reasoner
- ertghiu256/qwen3-multi-reasoner
- ValiantLabs/Qwen3-4B-Esper3
- ValiantLabs/Qwen3-4B-ShiningValiant3
- prithivMLmods/Crux-Qwen3_OpenThinking-4B
library_name: transformers
tags:
- mergekit
- merge

---
# Ties merged COde MAth aNd Reasoning model (tcomanr)
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
This model aims to combine the code, math, and reasoning capabilities of multiple Qwen3 fine-tunes by merging them.

# How to run
You can run this model with any of the following interfaces:

## transformers
Following the usage recommended by the Qwen team:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ertghiu256/Qwen3-4b-tcomanr-merge"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # switches between thinking and non-thinking modes; default is True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parse the thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```
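The `</think>`-splitting step above can be checked in isolation. A minimal sketch, assuming 151668 is the `</think>` token id as in the snippet above (the helper name `split_thinking` is ours, not part of the transformers API):

```python
def split_thinking(output_ids, think_end_id=151668):
    """Split generated token ids into (thinking, answer) parts.

    Mirrors the rindex trick above: find the LAST occurrence of the
    </think> token and cut just after it; if it is absent, everything
    is treated as the answer."""
    try:
        index = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        index = 0  # no </think> found
    return output_ids[:index], output_ids[index:]

# synthetic ids: [thinking tokens..., 151668, answer tokens...]
thinking, answer = split_thinking([10, 11, 151668, 20, 21])
print(thinking, answer)  # → [10, 11, 151668] [20, 21]
```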

## vLLM
Run this command:
`vllm serve ertghiu256/Qwen3-4b-tcomanr-merge --enable-reasoning --reasoning-parser deepseek_r1`

## SGLang
Run this command:
`python -m sglang.launch_server --model-path ertghiu256/Qwen3-4b-tcomanr-merge --reasoning-parser deepseek-r1`

## llama.cpp
Run this command (pulls the model from the Hugging Face repo; requires a GGUF build to be available):
`llama-server -hf ertghiu256/Qwen3-4b-tcomanr-merge`

## Ollama
Run this command:
`ollama run hf.co/ertghiu256/Qwen3-4b-tcomanr-merge`

## LM Studio
Search for `ertghiu256/Qwen3-4b-tcomanr-merge` in the LM Studio model search, then download it.

### Merge Method
This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) as the base.

#### Models Merged
The following models were included in the merge:
* [ertghiu256/qwen3-4b-code-reasoning](https://huggingface.co/ertghiu256/qwen3-4b-code-reasoning)
* [Tesslate/UIGEN-T3-4B-Preview-MAX](https://huggingface.co/Tesslate/UIGEN-T3-4B-Preview-MAX)
* [ertghiu256/qwen-3-4b-mixture-of-thought](https://huggingface.co/ertghiu256/qwen-3-4b-mixture-of-thought)
* [POLARIS-Project/Polaris-4B-Preview](https://huggingface.co/POLARIS-Project/Polaris-4B-Preview)
* [ertghiu256/qwen3-math-reasoner](https://huggingface.co/ertghiu256/qwen3-math-reasoner)
* [ertghiu256/qwen3-multi-reasoner](https://huggingface.co/ertghiu256/qwen3-multi-reasoner)
* [ValiantLabs/Qwen3-4B-Esper3](https://huggingface.co/ValiantLabs/Qwen3-4B-Esper3)
* [ValiantLabs/Qwen3-4B-ShiningValiant3](https://huggingface.co/ValiantLabs/Qwen3-4B-ShiningValiant3)
* [prithivMLmods/Crux-Qwen3_OpenThinking-4B](https://huggingface.co/prithivMLmods/Crux-Qwen3_OpenThinking-4B)

#### Configuration
The following YAML configuration was used to produce this model:

```yaml
models:
  - model: ertghiu256/qwen3-math-reasoner
    parameters:
      weight: 0.7
  - model: ertghiu256/qwen3-4b-code-reasoning
    parameters:
      weight: 0.8
  - model: ertghiu256/qwen-3-4b-mixture-of-thought
    parameters:
      weight: 0.9
  - model: POLARIS-Project/Polaris-4B-Preview
    parameters:
      weight: 0.7
  - model: ertghiu256/qwen3-multi-reasoner
    parameters:
      weight: 0.8
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      weight: 0.8
  - model: Tesslate/UIGEN-T3-4B-Preview-MAX
    parameters:
      weight: 0.8
  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
    parameters:
      weight: 0.8
  - model: prithivMLmods/Crux-Qwen3_OpenThinking-4B
    parameters:
      weight: 0.9
merge_method: ties
base_model: Qwen/Qwen3-4B
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
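For intuition, the TIES procedure the configuration invokes (elect a sign per parameter, then merge only the agreeing task vectors) can be sketched per parameter in plain Python. This is a toy illustration under simplifying assumptions, not mergekit's implementation: trimming and the `int8_mask` option are omitted, and the function name is ours.

```python
def ties_merge(base, tuned, weights):
    """Toy per-parameter sketch of TIES merging.

    1) task vectors = (tuned - base), scaled by each model's weight
    2) elect a sign per parameter from the sum of the task vectors
    3) average only the deltas that agree with the elected sign
    4) add the merged delta back onto the base weights
    """
    n = len(base)
    deltas = [[w * (t[i] - base[i]) for i in range(n)]
              for t, w in zip(tuned, weights)]
    merged = list(base)
    for i in range(n):
        col = [d[i] for d in deltas]
        s = sum(col)
        elected = (s > 0) - (s < 0)  # sign vote: +1, -1, or 0
        kept = [v for v in col
                if v != 0 and ((v > 0) - (v < 0)) == elected]
        if kept:
            merged[i] += sum(kept) / len(kept)  # disjoint mean of agreeing deltas
    return merged

# two models pull parameter 0 the same way, but disagree on parameter 1,
# so the conflicting updates cancel and parameter 1 keeps its base value
print(ties_merge([0.0, 0.0], [[1.0, -1.0], [1.0, 1.0]], [1.0, 1.0]))  # → [1.0, 0.0]
```

Per-model `weight` scales each task vector before the vote, and `normalize`/`int8_mask` in the real configuration further rescale and sparsify the merged deltas.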