runtime error
Exit code: 3. Reason: __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/user/.local/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 526, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /home/user/.cache/huggingface/hub/models--onnx-community--granite-4.0-h-350m-ONNX/snapshots/b9fd29cd514d833c0f34e02e7fdd77f9d95d2d21/onnx/model_q4.onnx failed:This is an invalid model. In Node, ("/model/embed_tokens/Gather_Quant", GatherBlockQuantized, "com.microsoft", -1) : ("model_embed_tokens_weight_quant": tensor(uint8),"input_ids": tensor(int64),"model_embed_tokens_weight_scales": tensor(float),"model_embed_tokens_weight_zp": tensor(uint8),) -> ("/model/embed_tokens/Gather/output_0": tensor(float),) , Error Unrecognized attribute: bits for operator GatherBlockQuantized
ERROR: Application startup failed. Exiting.
[INFO] Loading ONNX session from /home/user/.cache/huggingface/hub/models--onnx-community--granite-4.0-h-350m-ONNX/snapshots/b9fd29cd514d833c0f34e02e7fdd77f9d95d2d21/onnx/model_q4.onnx...
[ERROR] Failed to load model: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /home/user/.cache/huggingface/hub/models--onnx-community--granite-4.0-h-350m-ONNX/snapshots/b9fd29cd514d833c0f34e02e7fdd77f9d95d2d21/onnx/model_q4.onnx failed:This is an invalid model.
In Node, ("/model/embed_tokens/Gather_Quant", GatherBlockQuantized, "com.microsoft", -1) : ("model_embed_tokens_weight_quant": tensor(uint8),"input_ids": tensor(int64),"model_embed_tokens_weight_scales": tensor(float),"model_embed_tokens_weight_zp": tensor(uint8),) -> ("/model/embed_tokens/Gather/output_0": tensor(float),) , Error Unrecognized attribute: bits for operator GatherBlockQuantized