MiMo-7B-RL-nomtp-GGUF / README.md


quantized_by: quantflex
pipeline_tag: text-generation
license: mit
base_model_relation: quantized
base_model: XiaomiMiMo/MiMo-7B-RL
language:
  - zh
  - en

---- NOTE -----

I removed the MTP (multi-token prediction) layers so that the model works with llama.cpp. Output quality might be degraded as a result.

Proper MTP support in llama.cpp would be better, but these quants will work until that is implemented.

For more information, feel free to open a discussion here.

Here is a tutorial on how I made these quants without the MTP layers:

https://huggingface.co/XiaomiMiMo/MiMo-7B-RL/discussions/5

---- NOTE -----

GGUF Quants for: XiaomiMiMo/MiMo-7B-RL

Model by: XiaomiMiMo

Quants by: quantflex

Run with llama.cpp:

./llama-cli -m MiMo-7B-RL-nomtp-Q5_K_M.gguf -cnv
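
If the model file is missing, llama-cli fails with a load error. The wrapper below is a minimal sketch that checks for the file first and then runs the same command as above. The function names `run_mimo` and `fetch_mimo`, the `MODEL`, `LLAMA_CLI`, and `REPO` variables, and the repository id `quantflex/MiMo-7B-RL-nomtp-GGUF` are assumptions made for illustration; adjust them to your setup.

```shell
#!/bin/sh
# Assumed defaults; override them in the environment if your paths differ.
MODEL="${MODEL:-MiMo-7B-RL-nomtp-Q5_K_M.gguf}"
LLAMA_CLI="${LLAMA_CLI:-./llama-cli}"
REPO="${REPO:-quantflex/MiMo-7B-RL-nomtp-GGUF}"   # assumed repository id

# Hypothetical helper: download one quant file into the current directory.
# Requires the huggingface_hub CLI (pip install huggingface_hub).
fetch_mimo() {
    huggingface-cli download "$REPO" "$MODEL" --local-dir .
}

# Hypothetical helper: start a conversation only if the model file exists.
run_mimo() {
    if [ ! -f "$MODEL" ]; then
        echo "model file not found: $MODEL (try fetch_mimo first)" >&2
        return 1
    fi
    "$LLAMA_CLI" -m "$MODEL" -cnv
}
```

Source the script, then call `fetch_mimo` once and `run_mimo` whenever you want to chat. Setting `MODEL` to another quant file name from this repository should work the same way.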