inference
#1 pinned · opened by CemalSahin
Thanks for quantizing this model!
I made a simple script to use these GGUF models easily: https://github.com/cmlshn/PromptEnhancer-GGUF
Just run python inference/prompt_enhancer_gguf.py. It works great on an H100, getting ~54 tok/s.
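For anyone who wants the full setup, a minimal sketch of the steps implied above; the pip install line is an assumption on my part, so check the repo's README for the actual dependency instructions:

```shell
# Clone the helper repo linked in the post
git clone https://github.com/cmlshn/PromptEnhancer-GGUF
cd PromptEnhancer-GGUF

# Install dependencies (assumed step; see the repo's README for specifics)
pip install -r requirements.txt

# Run the inference script mentioned in the post
python inference/prompt_enhancer_gguf.py
```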
mradermacher pinned discussion