Updating model files
README.md CHANGED
@@ -6,6 +6,17 @@ tags:
 - llama
 inference: false
 ---
+<div style="width: 100%;">
+    <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+</div>
+<div style="display: flex; justify-content: space-between; width: 100%;">
+    <div style="display: flex; flex-direction: column; align-items: flex-start;">
+        <p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
+    </div>
+    <div style="display: flex; flex-direction: column; align-items: flex-end;">
+        <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? Patreon coming soon!</a></p>
+    </div>
+</div>
 
 # Wizard-Vicuna-13B-GPTQ
 
@@ -18,7 +29,7 @@ It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com
 * [4bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/wizard-vicuna-13B-GPTQ).
 * [4bit and 5bit GGML models for CPU inference](https://huggingface.co/TheBloke/wizard-vicuna-13B-GGML).
 * [float16 HF format model for GPU inference](https://huggingface.co/TheBloke/wizard-vicuna-13B-HF).
-
+
 ## How to easily download and use this model in text-generation-webui
 
 Open the text-generation-webui UI as normal.
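For anyone who prefers to script the download instead of using the webui flow in the hunk above, here is a minimal sketch, assuming the `huggingface_hub` library is installed — the card itself only documents the webui route, and the repo ID comes from the links above:

```python
# Minimal sketch: fetch the quantised repo with huggingface_hub.
# Assumption: `pip install huggingface_hub`; the card itself only
# documents downloading through the text-generation-webui UI.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="TheBloke/wizard-vicuna-13B-GPTQ",
    local_dir="models/wizard-vicuna-13B-GPTQ",  # assumed webui models folder; adjust to your install
)
print(f"Downloaded to {local_dir}")
```

Pointing `local_dir` at text-generation-webui's `models/` directory is an assumption about the default layout; any local path works.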
@@ -53,7 +64,18 @@ It was created without the `--act-order` parameter. It may have slightly lower i
 ```
 CUDA_VISIBLE_DEVICES=0 python3 llama.py wizard-vicuna-13B-HF c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors wizard-vicuna-13B-GPTQ-4bit.compat.no-act-order.safetensors
 ```
-
+
+## Want to support my work?
+
+I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.
+
+So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and work on various AI projects.
+
+Donaters will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.
+
+* Patreon: coming soon! (just awaiting approval)
+* Ko-Fi: https://ko-fi.com/TheBlokeAI
+* Discord: https://discord.gg/UBgz4VXf
 # Original WizardVicuna-13B model card
 
 Github page: https://github.com/melodysdreamj/WizardVicunaLM
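To sanity-check the `.safetensors` file produced by the `llama.py` command in the hunk above, here is a minimal loading sketch, assuming the auto-gptq and transformers libraries and a single CUDA GPU — neither library is referenced in this card, which targets GPTQ-for-LLaMa and text-generation-webui. The `USER:`/`ASSISTANT:` prompt format is an assumption based on Vicuna-style models:

```python
# Minimal sketch: load the 4bit GPTQ checkpoint saved by --save_safetensors.
# Assumptions: auto-gptq and transformers are installed and a CUDA GPU is
# available; the card itself uses GPTQ-for-LLaMa / text-generation-webui instead.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/wizard-vicuna-13B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    model_basename="wizard-vicuna-13B-GPTQ-4bit.compat.no-act-order",  # file named in the command above
    use_safetensors=True,
    device="cuda:0",
)

# Assumed Vicuna-style prompt format.
prompt = "USER: Write a short poem about llamas.\nASSISTANT:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output_ids = model.generate(input_ids=input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The webui route documented above remains the card's recommended path; this is only a quick programmatic check.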
@@ -68,7 +90,7 @@ I am a big fan of the ideas behind WizardLM and VicunaLM. I particularly like th
 
 
-### Detail
+### Detail
 
 The questions presented here are not from rigorous tests, but rather, I asked a few questions and requested GPT-4 to score them. The models compared were ChatGPT 3.5, WizardVicunaLM, VicunaLM, and WizardLM, in that order.
 