|
# Wizard-Vicuna-13B-GPTQ

This repo contains 4bit GPTQ format quantised models of [junelee's wizard-vicuna 13B](https://huggingface.co/junelee/wizard-vicuna-13b).

It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).

* [4bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/wizard-vicuna-13B-GPTQ).
* [4bit and 5bit GGML models for CPU inference](https://huggingface.co/TheBloke/wizard-vicuna-13B-GGML).
* [float16 HF format model for GPU inference](https://huggingface.co/TheBloke/wizard-vicuna-13B-HF).

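
The point of the 4bit GPTQ file is that it shrinks the weights enough to run a 13B model on a single consumer GPU. As a rough back-of-the-envelope sketch (my own illustrative arithmetic, not from this repo: it assumes one fp16 scale and one fp16 zero-point per group, and ignores tensors kept in higher precision):

```python
def gptq_weight_bytes(n_params: float, wbits: int = 4, group_size: int = 128) -> float:
    """Rough size of GPTQ-quantised weights: wbits packed per weight,
    plus one fp16 scale and one fp16 zero-point per group of weights.
    Illustrative arithmetic only; real files also store metadata and
    keep some tensors (e.g. embeddings) unquantised."""
    packed = n_params * wbits / 8          # packed integer weights
    per_group = 2 + 2                      # fp16 scale + fp16 zero-point
    overhead = n_params / group_size * per_group
    return packed + overhead

# A 13B-parameter model at 4 bits, groupsize 128:
print(gptq_weight_bytes(13e9) / 1e9)       # roughly 6.9 (GB)
```

Compare that with the float16 HF format, where the same weights take about 26 GB before any activations or KV cache.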
## How to easily download and use this model in text-generation-webui
Open the text-generation-webui UI as normal.

This GPTQ file was created without the `--act-order` parameter, using the following command:

```
CUDA_VISIBLE_DEVICES=0 python3 llama.py wizard-vicuna-13B-HF c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors wizard-vicuna-13B-GPTQ-4bit.compat.no-act-order.safetensors
```
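
The `--wbits 4 --groupsize 128` flags mean each weight is stored as a 4-bit integer, with one scale shared per group of 128 weights. GPTQ itself chooses the rounded values with an error-correcting procedure, but the storage scheme can be sketched with plain round-to-nearest (illustrative code, not GPTQ-for-LLaMa's implementation):

```python
import numpy as np

def quantize_groupwise(w, wbits=4, group_size=128):
    """Round-to-nearest group-wise quantisation: one scale per group,
    signed integers in [-2**(wbits-1), 2**(wbits-1) - 1].
    A sketch of the storage scheme only; GPTQ picks the values more cleverly."""
    w = np.asarray(w, dtype=np.float32)
    qmax = 2 ** (wbits - 1) - 1                      # 7 for 4-bit
    q = np.empty(len(w), dtype=np.int8)
    scales = []
    for start in range(0, len(w), group_size):
        g = w[start:start + group_size]
        scale = max(float(np.abs(g).max()) / qmax, 1e-12)
        q[start:start + group_size] = np.clip(np.round(g / scale), -qmax - 1, qmax)
        scales.append(scale)
    return q, np.array(scales, dtype=np.float32)

def dequantize_groupwise(q, scales, group_size=128):
    """Reconstruct approximate float weights from the codes and scales."""
    out = np.empty(len(q), dtype=np.float32)
    for i, s in enumerate(scales):
        sl = slice(i * group_size, (i + 1) * group_size)
        out[sl] = q[sl] * s
    return out
```

With a per-group scale the reconstruction error of each weight is bounded by half a quantisation step of its own group, which is why grouped scales give noticeably better quality than a single per-tensor scale.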
# Original WizardVicuna-13B model card
Github page: https://github.com/melodysdreamj/WizardVicunaLM
# WizardVicunaLM
### Wizard's dataset + ChatGPT's conversation extension + Vicuna's tuning method
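
Vicuna-tuned checkpoints are conventionally prompted with alternating `USER:`/`ASSISTANT:` turns. The exact template this checkpoint was trained on is not stated here, so treat this helper as an assumption based on common Vicuna conventions rather than this repo's documented format:

```python
def build_vicuna_prompt(user_message, history=None):
    """Assemble a Vicuna-style prompt.
    The USER:/ASSISTANT: layout is an assumed convention, not taken from
    this repo's documentation -- verify against the model's own card."""
    turns = []
    for user, assistant in (history or []):
        turns.append(f"USER: {user}\nASSISTANT: {assistant}")
    # Final turn ends with a bare "ASSISTANT:" for the model to complete.
    turns.append(f"USER: {user_message}\nASSISTANT:")
    return "\n".join(turns)

# Example: a second-turn prompt carrying one prior exchange.
prompt = build_vicuna_prompt("And in French?", history=[("Say hello.", "Hello!")])
```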