inference

#1
by CemalSahin - opened

Thanks for quantizing this model!

Made a simple script to use these GGUF models easily: https://github.com/cmlshn/PromptEnhancer-GGUF

Just run `python inference/prompt_enhancer_gguf.py`. It works great on an H100, getting ~54 tok/s.
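For anyone new to GGUF inference, the full setup might look like the sketch below. This is an assumption, not taken from the repo: the requirements file name and the llama-cpp-python dependency are guesses, since the thread does not say which backend the script uses.

```shell
# Hypothetical setup sketch -- dependency and file names are assumptions
git clone https://github.com/cmlshn/PromptEnhancer-GGUF
cd PromptEnhancer-GGUF
pip install llama-cpp-python   # common GGUF inference backend (assumed, not confirmed)
python inference/prompt_enhancer_gguf.py
```

If the repo ships its own requirements file, installing from that instead would be safer than guessing individual packages.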

mradermacher pinned discussion
