This model is a fine-tuned version of TheBloke/Llama-2-13B-GPTQ on the Open-Orca/OpenOrca dataset.
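To get a feel for the training data, the dataset can be streamed from the Hub with the datasets library; a minimal sketch, assuming datasets is installed (the field names printed are whatever the Hub dataset provides):

from datasets import load_dataset

# Stream the OpenOrca dataset so nothing large is downloaded up front.
ds = load_dataset("Open-Orca/OpenOrca", split="train", streaming=True)

# Print the first example to see the available fields.
print(next(iter(ds)))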
The prompt template is:
### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:
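For example, the template can be filled in like this (the system message and instruction below are illustrative placeholders, not taken from the training set):

# Hypothetical system message and instruction, used only to show the formatting.
system = "You are a helpful assistant that explains its reasoning step by step."
instruction = "Why is the sky blue?"

prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:"
print(prompt)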
The model was trained with the 16 system messages from the Orca paper that were used to generate the training examples.
First make sure you have AutoGPTQ installed:
GITHUB_ACTIONS=true pip install auto-gptq
To use this adapter, download the base model from TheBloke/Llama-2-13B-GPTQ and load the adapter from this repo on top of it. Then try the following example code:
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig, get_gptq_peft_model

MODEL_PATH_GPTQ = "Llama-2-13B-GPTQ"   # local path to the downloaded base model
ADAPTER_DIR = "Llama-2-13B-GPTQ-Orca"  # local path to the adapter from this repo
DEV = "cuda:0"

# Load the tokenizer shipped with the base GPTQ model.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH_GPTQ, use_fast=True)

# Load the 4-bit quantized base model.
model = AutoGPTQForCausalLM.from_quantized(
    MODEL_PATH_GPTQ,
    use_safetensors=True,
    trust_remote_code=False,
    use_triton=True,
    device=DEV,
    warmup_triton=False,
    trainable=True,
    inject_fused_attention=True,
    inject_fused_mlp=False,
)

# Attach the Orca LoRA adapter on top of the quantized model.
model = get_gptq_peft_model(
    model,
    model_id=ADAPTER_DIR,
    train_mode=False,
)
model.eval()
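Once the adapter is loaded, generation works as with any causal LM. Here is a minimal sketch using the prompt template above; the system message, instruction, and sampling settings are illustrative choices, not values prescribed by this repo:

# Build a prompt with the template above; these messages are placeholders.
system = "You are a helpful assistant that explains its reasoning step by step."
instruction = "Explain why the sky is blue."
prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:"

# Tokenize, move to the GPU the model was loaded on, and generate a completion.
inputs = tokenizer(prompt, return_tensors="pt").to(DEV)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))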
The files provided will work with AutoGPTQ (CUDA and Triton modes), GPTQ-for-LLaMa (only CUDA has been tested), and Occ4m's GPTQ-for-LLaMa fork.
ExLlama works with Llama models in 4-bit. Please see the Provided Files table above for per-file compatibility.
@software{OpenOrca_Preview1,
title = {OpenOrca_Preview1: A LLaMA-13B Model Fine-tuned on Small Portion of OpenOrcaV1 Dataset},
author = {Wing Lian and Bleys Goodson and Eugene Pentland and Austin Cook and Chanvichet Vong and Teknium},
year = {2023},
publisher = {HuggingFace},
journal = {HuggingFace repository},
howpublished = {\url{https://huggingface.co/Open-Orca/OpenOrca-Preview1-13B}},
}
@misc{mukherjee2023orca,
title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
year={2023},
eprint={2306.02707},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{longpre2023flan,
title={The Flan Collection: Designing Data and Methods for Effective Instruction Tuning},
author={Shayne Longpre and Le Hou and Tu Vu and Albert Webson and Hyung Won Chung and Yi Tay and Denny Zhou and Quoc V. Le and Barret Zoph and Jason Wei and Adam Roberts},
year={2023},
eprint={2301.13688},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
@article{touvron2023llama,
title={LLaMA: Open and Efficient Foundation Language Models},
author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
journal={arXiv preprint arXiv:2302.13971},
year={2023}
}