---
quantized_by: quantflex
pipeline_tag: text-generation
license: mit
base_model_relation: quantized
base_model: XiaomiMiMo/MiMo-7B-RL
language:
- zh
- en
---

**---- NOTE ----**

I have deleted the MTP (multi-token prediction) layers in order to make the model work with llama.cpp. Quality might be degraded.

A proper MTP implementation would be better, but this will work until one is available.

For more information, feel free to open a discussion here.

Here is a tutorial on how I made these quants without MTP:

https://huggingface.co/XiaomiMiMo/MiMo-7B-RL/discussions/5
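The linked tutorial covers the actual procedure. As a rough illustration of the general idea only — the `"mtp"` substring match and the example key names below are hypothetical, so inspect the checkpoint for the real tensor names — stripping the MTP tensors from a weight map before GGUF conversion might look like this:

```python
# Sketch (assumptions noted below): filter MTP (multi-token prediction)
# tensors out of a checkpoint's weight map before converting to GGUF.
# The "mtp" substring and the example key names are assumptions; check
# the actual tensor names in the checkpoint you are converting.

def strip_mtp_tensors(weights: dict) -> dict:
    """Return a copy of the weight map without any MTP tensors."""
    return {name: t for name, t in weights.items() if "mtp" not in name}

# Toy demonstration with placeholder values standing in for tensors:
checkpoint = {
    "model.layers.0.self_attn.q_proj.weight": "...",
    "model.mtp_layers.0.proj.weight": "...",  # hypothetical MTP key
    "lm_head.weight": "...",
}
kept = strip_mtp_tensors(checkpoint)
print(sorted(kept))
# -> ['lm_head.weight', 'model.layers.0.self_attn.q_proj.weight']
```

The filtered weight map would then be saved back out and fed to llama.cpp's usual conversion and quantization tooling.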

**---- NOTE ----**

GGUF Quants for: [XiaomiMiMo/MiMo-7B-RL](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL)

Model by: [XiaomiMiMo](https://huggingface.co/XiaomiMiMo)

Quants by: [quantflex](https://huggingface.co/quantflex)

Run with [llama.cpp](https://github.com/ggerganov/llama.cpp):

```
./llama-cli -m MiMo-7B-RL-nomtp-Q5_K_M.gguf -cnv
```