Update README.md
README.md
CHANGED
@@ -9,18 +9,27 @@ inference: False
 license: apache-2.0
 ---
 
-# ethzanalytics/gpt-j-8bit-
+# ethzanalytics/gpt-j-8bit-KILT_WoW_10k_steps
+
+
+<a href="https://colab.research.google.com/gist/pszemraj/e49c60aafe04acc52fcfdd1baefe12e4/-ai-msgbot-gpt-j-6b-8bit-with-hub.ipynb">
+<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+</a>
+
+This is a version of `hivemind/gpt-j-6B-8bit` fine-tuned on the [Wizard of Wikipedia](https://arxiv.org/abs/1811.01241) dataset for 10k steps (_just under an epoch_) on an A100. it can be used as a chatbot. It is designed to be used with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to take advantage of the prompt engineering.
 
-This is a version of `hivemind/gpt-j-6B-8bit` fine-tuned on the Wizard of Wikipedia dataset for 10k steps on an A100. it can be used as a chatbot.
 
 _NOTE: this needs to be loaded via the special patching technique outlined in the hivemind model card (as with all 8bit models)_
 
+## Training
 
-
+For details, please see [this wandb report](https://wandb.ai/pszemraj/conversational-6B-train-vanilla/reports/Training-6B-GPT-J-8bit-for-Dialogue--VmlldzoyNTg3MzE0) for both the daily-dialogues version and the WoW version.
 
 
 ---
 
 
-
+TODO: rest of README
+
 
+---