---
quantized_by: quantflex
pipeline_tag: text-generation
license: mit
base_model_relation: quantized
base_model: XiaomiMiMo/MiMo-7B-RL
language:
- zh
- en
---

**---- NOTE ----**

I have deleted the MTP (multi-token prediction) layers in order to make the model work with llama.cpp. Quality might be degraded.

A proper MTP implementation would be better, but this will work until one is available.

For more information, feel free to open a discussion here.

Here is a tutorial on how I made these quants without MTP:

https://huggingface.co/XiaomiMiMo/MiMo-7B-RL/discussions/5
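The linked tutorial covers the actual procedure. As a rough illustration of the general idea only — the `"mtp"` substring match and the example key names below are hypothetical, so inspect the checkpoint for the real tensor names — stripping the MTP tensors from a weight map before GGUF conversion might look like this:

```python
# Sketch (assumptions noted below): filter MTP (multi-token prediction)
# tensors out of a checkpoint's weight map before converting to GGUF.
# The "mtp" substring and the example key names are assumptions; check
# the actual tensor names in the checkpoint you are converting.

def strip_mtp_tensors(weights: dict) -> dict:
    """Return a copy of the weight map without any MTP tensors."""
    return {name: t for name, t in weights.items() if "mtp" not in name}

# Toy demonstration with placeholder values standing in for tensors:
checkpoint = {
    "model.layers.0.self_attn.q_proj.weight": "...",
    "model.mtp_layers.0.proj.weight": "...",  # hypothetical MTP key
    "lm_head.weight": "...",
}
kept = strip_mtp_tensors(checkpoint)
print(sorted(kept))
# -> ['lm_head.weight', 'model.layers.0.self_attn.q_proj.weight']
```

The filtered weight map would then be saved back out and fed to llama.cpp's usual conversion and quantization tooling.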

**---- NOTE ----**

GGUF Quants for: [XiaomiMiMo/MiMo-7B-RL](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL)

Model by: [XiaomiMiMo](https://huggingface.co/XiaomiMiMo)

Quants by: [quantflex](https://huggingface.co/quantflex)

Run with [llama.cpp](https://github.com/ggerganov/llama.cpp):

```
./llama-cli -m MiMo-7B-RL-nomtp-Q5_K_M.gguf -cnv
```