Update README.md
README.md CHANGED

```diff
@@ -20,10 +20,6 @@ This model is llama-3-8b-instruct from Meta (uploaded by unsloth) trained on the
 
 The Qalore method uses Qlora training along with the methods from Galore for additional reductions in VRAM allowing for llama-3-8b to be loaded on 14.5 GB of VRAM. This allowed this training to be completed on an RTX A4000 16GB in 130 hours for less than $20.
 
-Dataset used for training this model:
-
-- https://huggingface.co/datasets/Replete-AI/OpenCodeInterpreterData
-
 Qalore notebook for training:
 
 - https://colab.research.google.com/drive/1bX4BsjLcdNJnoAf7lGXmWOgaY8yekg8p?usp=sharing
```
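As a rough illustration of the Qalore recipe the README excerpt describes (QLoRA-style 4-bit loading combined with a GaLore optimizer), here is a minimal sketch using the Hugging Face transformers/peft stack. It is not the linked notebook: the base checkpoint name, LoRA rank and target modules, the GaLore target-module pattern, the hyperparameters, and the placeholder `train.jsonl` dataset are all illustrative assumptions, and the GaLore optimizer choices additionally require the `galore-torch` package.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "unsloth/llama-3-8b-Instruct"  # assumed base checkpoint name

# QLoRA half: load the frozen base weights in 4-bit NF4 so the 8B model
# fits in roughly 14.5 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# Train only small LoRA adapters on the attention projections
# (rank and target modules are illustrative).
model = get_peft_model(
    model,
    LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    ),
)

# GaLore half: have the Trainer build a GaLore (gradient low-rank projection)
# optimizer for the trainable LoRA linears, shrinking optimizer-state VRAM further.
# The target-module pattern is an assumption; the actual notebook may differ.
args = TrainingArguments(
    output_dir="qalore-llama-3-8b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
    optim="galore_adamw_8bit",
    optim_target_modules=[r".*lora_[AB].*"],
)

# Placeholder dataset: any JSONL file with a "text" field holding formatted chat examples.
dataset = load_dataset("json", data_files="train.jsonl", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The VRAM saving comes from two sides: the 4-bit NF4 base weights (the QLoRA part) and the low-rank projection applied in the optimizer (the GaLore part), which together are what the README credits for fitting an 8B model onto a 16 GB RTX A4000.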