baseten-admin / glm-4.7-fp4-fp4kv

8-bit precision

glm-4.7-fp4-fp4kv / hf_quant_config.json

Update hf_quant_config.json

7f2cc8d verified 2 months ago

302 Bytes

{
  "producer": {
    "name": "modelopt",
    "version": "0.40.0"
  },
  "quantization": {
    "quant_algo": "NVFP4",
    "kv_cache_quant_algo": "NVFP4",
    "group_size": 16,
    "exclude_modules": [
      "lm_head",
      "model.layers.92*"
    ]
  }
}
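
As a rough illustration of how this file might be read, the sketch below parses the config and checks module names against `exclude_modules`. It assumes the entries are shell-style glob patterns and uses Python's `fnmatch` as a stand-in. The matching rules applied by the actual loader are not stated in this file and may differ. The helper `is_excluded` and the sample module names are hypothetical.

```python
import json
from fnmatch import fnmatchcase

# Contents of hf_quant_config.json, copied from the file above.
CONFIG_TEXT = """
{
  "producer": {"name": "modelopt", "version": "0.40.0"},
  "quantization": {
    "quant_algo": "NVFP4",
    "kv_cache_quant_algo": "NVFP4",
    "group_size": 16,
    "exclude_modules": ["lm_head", "model.layers.92*"]
  }
}
"""


def is_excluded(module_name: str, patterns: list[str]) -> bool:
    """Return True if module_name matches any exclusion pattern.

    Assumption: patterns are glob-style, so "*" matches any run of
    characters. This is an illustrative helper, not the loader's logic.
    """
    return any(fnmatchcase(module_name, pattern) for pattern in patterns)


config = json.loads(CONFIG_TEXT)
quant = config["quantization"]
patterns = quant["exclude_modules"]

print("weights:", quant["quant_algo"])
print("KV cache:", quant["kv_cache_quant_algo"])
print("group size:", quant["group_size"])

# Hypothetical module names, used only to exercise the patterns.
for name in ("lm_head",
             "model.layers.92.mlp.gate_proj",
             "model.layers.9.self_attn.q_proj"):
    status = "excluded" if is_excluded(name, patterns) else "quantized"
    print(f"{name}: {status}")
```

Under plain glob matching, `model.layers.92*` would also match a name such as `model.layers.920`, since the pattern has no trailing dot. Whether that matters depends on how many layers the model has and on the loader's own matching rules.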