FP8 non-GGUF?
Hi, @huihui-ai - do you maybe have a non-GGUF version of this that could be converted into 8-bit MLX format, please?
Can you convert GGUF to Transformers, and then to MLX format?
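For what it's worth, here is a rough sketch of that two-step path in Python. It assumes an FP16/BF16 GGUF is available, that the Transformers GGUF loader supports this architecture (I haven't verified it handles Glm4MoeForCausalLM), and that mlx-lm's `convert` API accepts these arguments; the repo and file names are placeholders:

```python
# Rough sketch, not verified for this model: dequantize the GGUF into a
# Transformers checkpoint, then quantize that checkpoint to 8-bit MLX.
from transformers import AutoModelForCausalLM, AutoTokenizer

gguf_repo = "huihui-ai/the-model"   # placeholder repo id
gguf_file = "model-BF16.gguf"       # placeholder file; needs an FP16/BF16 GGUF

# Transformers can dequantize a GGUF on load for supported architectures
# (Glm4MoeForCausalLM support is an assumption here).
model = AutoModelForCausalLM.from_pretrained(gguf_repo, gguf_file=gguf_file)
tokenizer = AutoTokenizer.from_pretrained(gguf_repo, gguf_file=gguf_file)

# Save a plain SafeTensors checkpoint that mlx-lm can consume.
model.save_pretrained("hf_model")
tokenizer.save_pretrained("hf_model")

# Quantize the SafeTensors checkpoint to 8-bit MLX.
from mlx_lm import convert

convert("hf_model", mlx_path="mlx_model_8bit", quantize=True, q_bits=8)
```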
I would love a SafeTensors release too so I can run this in vLLM. Glm4MoeForCausalLM GGUFs are not currently supported by vLLM, and running this model across multiple GPUs with llama.cpp is much slower than vLLM because llama.cpp lacks tensor parallelism. If you upload an FP16/BF16 GGUF, I could try converting it to SafeTensors.
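For reference, a minimal sketch of the vLLM setup this would enable, assuming the SafeTensors checkpoint produced above and a multi-GPU machine; the model path and tensor-parallel size are illustrative, not from this thread:

```python
# Minimal sketch, assuming a local SafeTensors checkpoint and 4 GPUs;
# path and parallel size are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="hf_model", tensor_parallel_size=4)  # shard weights across 4 GPUs

outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```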
Any update on this? I would like the SafeTensors version as well.
Is there any way to purchase this model using crypto?
And are we allowed to share the model with just a few people when we buy it?
In principle we don't encourage it, but it is allowed. We don't recommend sharing it on huggingface.co.
https://ko-fi.com/s/49e1ab7527
- Bitcoin: bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
If purchasing with Bitcoin, send an email to support@huihui.ai after the purchase with the last four digits of the transaction ID, and I will provide you with the download token.