mainline llama.cpp?

#4
by gghfez - opened

Not sure if I hallucinated it, but in the deleted model card, did you have a link to a PR / fork of mainline llama.cpp to run the XXS quant?

@gghfez

Yes, this model card still has the link at the very bottom in the reference section! And yes, I tested the smol-IQ2_XXS successfully using mainline llama.cpp. I rarely release mainline-compatible quants, but did so here to help everyone test, given there were no other GGUFs available yet.

Read this comment for the perplexity values and more background: https://github.com/ikawrakow/ik_llama.cpp/pull/837#issuecomment-3416794658

If I'm able to keep uploading, I hope to release some more ik-specific quants next.

> Yes, this model card still has the link at the very bottom in the reference section!
Thanks πŸ˜…

Nice work getting this model released btw. Nobody else has done it as you said!

> If I'm able to keep uploading, I hope to release some more ik-specific quants next.

Yeah, it's really annoying. I'm just going to have to delete these so I can upload my control vectors for GLM-4.6:

Models for user: gghfez (sorted by size)
----------------------------------------------------------------------
   1.22 TB | gghfez/DeepSeek-R1-0528-256x21B-BF16
   1.22 TB | gghfez/DeepSeek-V3-0324-256x21B-BF16
   1.22 TB | gghfez/DeepSeek-V3.1-Base-256x21B-BF16
   1.22 TB | gghfez/DeepSeek-R1-Zero-256x21B-BF16
   1.22 TB | gghfez/DeepSeek-R1-OG-256x21B-BF16
   1.22 TB | gghfez/DeepSeek-V3-OG-256x21B-BF16
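As an aside, a listing like the one above can be produced with the `huggingface_hub` client by summing per-file sizes for each repo. This is an illustrative sketch, not the actual script used in the thread; the helper names `human_tb` and `list_models_by_size` are mine.

```python
def human_tb(num_bytes: int) -> str:
    """Format a byte count as decimal terabytes, matching the '1.22 TB' style above."""
    return f"{num_bytes / 1e12:.2f} TB"

def list_models_by_size(user: str) -> None:
    """Print a user's model repos sorted by total file size, largest first."""
    # Imported inside the function so the formatting helper works standalone.
    from huggingface_hub import HfApi

    api = HfApi()
    rows = []
    for m in api.list_models(author=user):
        # files_metadata=True populates per-file sizes on info.siblings
        info = api.model_info(m.id, files_metadata=True)
        total = sum(f.size or 0 for f in info.siblings)
        rows.append((total, m.id))
    for total, repo in sorted(rows, reverse=True):
        print(f"{human_tb(total):>10} | {repo}")

if __name__ == "__main__":
    list_models_by_size("gghfez")
```

Note this counts decimal TB (10^12 bytes); the Hub UI may display sizes slightly differently depending on the unit convention it uses.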

I'm trying to move them to ModelScope at the moment, as they're still useful for creating new ik_llama.cpp quants.
