Dracones/WizardLM-2-8x22B_exl2_6.0bpw

@@ -19,6 +19,62 @@ These quants were made with exllamav2 version 0.0.18. Quants made on this versio
 If you have problems loading these models, please update Text Generation WebUI to the latest version.
+## Perplexity Scoring
+Below are the perplexity scores for the EXL2 quants at each bits-per-weight (bpw) level. Lower is better.
+| Quant Level | Perplexity Score |
+|-------------|------------------|
+| 7.0 | 4.5859 |
+| 6.0 | 4.6252 |
+| 5.5 | 4.6493 |
+| 5.0 | 4.6937 |
+| 4.5 | 4.8029 |
+| 4.0 | 4.9372 |
+| 3.5 | 5.1336 |
+| 3.25 | 5.3636 |
+| 3.0 | 5.5468 |
+| 2.75 | 5.8255 |
+| 2.5 | 6.3362 |
+| 2.25 | 7.7763 |
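To put the table in perspective, the relative increase in perplexity over the 7.0 bpw row can be worked out directly from the listed scores. The snippet below is a minimal sketch and was not part of the original test run; the function name `ppl_increase` is hypothetical, and the three sample rows are copied from the table above.

```shell
#!/bin/bash
# Hypothetical helper (not from the original run): percentage increase of a
# perplexity score relative to a baseline score, rounded to two decimals.
ppl_increase() {
  awk -v base="$1" -v score="$2" \
    'BEGIN { printf "%.2f\n", (score - base) / base * 100 }'
}

# Baseline: the 7.0 bpw row of the table
BASELINE=4.5859

# Quant level and score pairs copied from the table
while read -r bpw score; do
  echo "${bpw} bpw: +$(ppl_increase "$BASELINE" "$score")%"
done <<'EOF'
6.0 4.6252
4.0 4.9372
2.25 7.7763
EOF
```

For these rows the increase comes out at roughly 0.86%, 7.66%, and 69.57%, which shows how sharply quality falls off at the lowest bit rates.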
+### Perplexity Script
+The following script was used to run the perplexity tests. It evaluates each quant with exllamav2's `test_inference.py`.
+```bash
+#!/bin/bash
+# Activate the conda environment
+source ~/miniconda3/etc/profile.d/conda.sh
+conda activate exllamav2
+# Evaluation dataset (wikitext-2 in parquet format)
+DATA_SET=/root/wikitext/wikitext-2-v1.parquet
+# Set the model name and the bit precisions (bpw) to test
+MODEL_NAME="WizardLM-2-8x22B"
+BIT_PRECISIONS=(6.0 5.5 5.0 4.5 4.0 3.5 3.25 3.0 2.75 2.5 2.25)
+# Print the markdown table header
+echo "| Quant Level | Perplexity Score |"
+echo "|-------------|------------------|"
+for BIT_PRECISION in "${BIT_PRECISIONS[@]}"
+do
+  LOCAL_FOLDER="/root/models/${MODEL_NAME}_exl2_${BIT_PRECISION}bpw"
+  REMOTE_FOLDER="Dracones/${MODEL_NAME}_exl2_${BIT_PRECISION}bpw"
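+  # Download the quant from Hugging Face if it is not already on disk
+  if [ ! -d "$LOCAL_FOLDER" ]; then
+    huggingface-cli download --local-dir-use-symlinks=False --local-dir "${LOCAL_FOLDER}" "${REMOTE_FOLDER}" >> /root/download.log 2>&1
+  fi
+  # Run the evaluation (-ed: evaluation dataset, -gs: per-GPU VRAM split in GB)
+  output=$(python test_inference.py -m "$LOCAL_FOLDER" -gs 40,40,40,40 -ed "$DATA_SET")
+  # Extract the numeric score from the "Evaluation perplexity: ..." line
+  score=$(echo "$output" | grep -oP 'Evaluation perplexity: \K[\d.]+')
+  echo "| $BIT_PRECISION | $score |"
+  # Uncomment to delete the local copy after testing
+  # rm -rf "${LOCAL_FOLDER}"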
+done
+```
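The score extraction step in the script above relies on GNU grep's `-P` mode, where `\K` discards the matched prefix so that only the number is printed. The sketch below isolates that step so it can be tried without loading a model. The function name `extract_score` is hypothetical, and the sample output line is an assumption shaped only to match the script's pattern, not a real log.

```shell
#!/bin/bash
# Hypothetical helper: pull the perplexity value out of captured output,
# using the same pattern as the script above (requires GNU grep for -P).
extract_score() {
  grep -oP 'Evaluation perplexity: \K[\d.]+' <<< "$1"
}

# Assumed sample output, shaped only to match the pattern; not a real log.
sample_output=" -- Evaluation perplexity: 4.6252"

score=$(extract_score "$sample_output")
if [ -z "$score" ]; then
  # An empty score would otherwise produce a silent blank table cell
  echo "no perplexity found in output" >&2
else
  echo "| 6.0 | $score |"
fi
```

Checking for an empty `score` is an addition to the original logic. It may be worth having because a failed download or an out-of-memory error would otherwise produce a table row with no value.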
 ## Quant Details