FP8 non-GGUF?
Hi, @huihui-ai - do you maybe have a non-GGUF version of this that could be converted into 8-bit MLX format, please?
Can you convert GGUF to Transformers, and then to MLX format?
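For what it's worth, here is a rough sketch of that two-step path in Python. It assumes an FP16/BF16 GGUF is available, that the Transformers GGUF loader supports this architecture (I haven't verified it handles Glm4MoeForCausalLM), and that mlx-lm's `convert` API accepts these arguments; the repo and file names are placeholders:

```python
# Rough sketch, not verified for this model: dequantize the GGUF into a
# Transformers checkpoint, then quantize that checkpoint to 8-bit MLX.
from transformers import AutoModelForCausalLM, AutoTokenizer

gguf_repo = "huihui-ai/the-model"   # placeholder repo id
gguf_file = "model-BF16.gguf"       # placeholder file; needs an FP16/BF16 GGUF

# Transformers can dequantize a GGUF on load for supported architectures
# (Glm4MoeForCausalLM support is an assumption here).
model = AutoModelForCausalLM.from_pretrained(gguf_repo, gguf_file=gguf_file)
tokenizer = AutoTokenizer.from_pretrained(gguf_repo, gguf_file=gguf_file)

# Save a plain SafeTensors checkpoint that mlx-lm can consume.
model.save_pretrained("hf_model")
tokenizer.save_pretrained("hf_model")

# Quantize the SafeTensors checkpoint to 8-bit MLX.
from mlx_lm import convert

convert("hf_model", mlx_path="mlx_model_8bit", quantize=True, q_bits=8)
```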
I would love a SafeTensors release too so I can run this in vLLM. Glm4MoeForCausalLM GGUFs are not currently supported by vLLM, and running this model across multiple GPUs with llama.cpp is much slower than vLLM because llama.cpp lacks tensor parallelism. If you upload an FP16/BF16 GGUF, I could try converting it to SafeTensors.
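For reference, a minimal sketch of the vLLM setup this would enable, assuming the SafeTensors checkpoint produced above and a multi-GPU machine; the model path and tensor-parallel size are illustrative, not from this thread:

```python
# Minimal sketch, assuming a local SafeTensors checkpoint and 4 GPUs;
# path and parallel size are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="hf_model", tensor_parallel_size=4)  # shard weights across 4 GPUs

outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```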
Any update on this? I would like the SafeTensors version as well.
Is there any way to purchase this model using crypto?
And are we allowed to share the model with just a few people when we buy it?
In principle we don't encourage it, but it is allowed. We don't recommend sharing it on huggingface.co.
https://ko-fi.com/s/49e1ab7527
- Bitcoin: bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
If purchasing with Bitcoin, send an email to support@huihui.ai after the purchase with the last four digits of the transaction ID, and I will provide you with the download token.