KeyError: 'geochat'
Hi, did you get any solution for this?
I tried updating the transformers version but it didn't help
I still don't have a solution for it.
Hi, did you later get a solution for this?
Sadly, no
I just gave up
This is happening because geochat is a custom model type with its own GeoChatConfig and GeoChatLlamaForCausalLM classes, and those Python files aren’t part of the standard Transformers package.
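For intuition, the lookup that fails can be sketched as a plain dictionary keyed by `model_type`. This mock registry is only an illustration, not the actual Transformers internals:

```python
# Toy stand-in for the model_type -> config class mapping that AutoConfig
# consults; the real mapping only knows the built-in model types.
CONFIG_MAPPING = {
    "llama": "LlamaConfig",
    "gpt2": "GPT2Config",
}

def config_class_for(model_type):
    # A custom type such as "geochat" is simply absent from the mapping,
    # which surfaces as KeyError: 'geochat'.
    return CONFIG_MAPPING[model_type]

try:
    config_class_for("geochat")
except KeyError as exc:
    print(f"KeyError: {exc}")  # KeyError: 'geochat'
```

Installing the GeoChat package (or passing the custom classes in directly) is what makes the missing entry resolvable.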
To get around it you can:

- Clone the official GeoChat repo locally
- Install it in editable mode so that Python can import `geochat`: `pip install --no-deps -e ~/GeoChat`
- Import the classes directly:

```python
from geochat.model import GeoChatConfig, GeoChatLlamaForCausalLM
```

- Load the model:

```python
base_model = GeoChatLlamaForCausalLM.from_pretrained(
    "MBZUAI/geochat-7B",
    config=config,
    trust_remote_code=True,
    ignore_mismatched_sizes=True,
)
```

After that, `from_pretrained` can find and instantiate `GeoChatLlamaForCausalLM` just like any other HF model.
Tried your method @halox7000. However, after downloading a few binaries, the following error occurs. To get this far, I had to remove `config` first.
```
File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:3031, in PreTrainedModel._load_pretrained_model.<locals>._find_mismatched_keys(state_dict, model_state_dict, loaded_keys, add_prefix_to_model, remove_prefix_from_model, ignore_mismatched_sizes)
   3025 elif add_prefix_to_model:
   3026     # The model key doesn't start with `prefix` but `checkpoint_key` does so we remove it.
   3027     model_key = ".".join(checkpoint_key.split(".")[1:])
   3029 if (
   3030     model_key in model_state_dict
-> 3031     and state_dict[checkpoint_key].shape != model_state_dict[model_key].shape
   3032 ):
   3033     mismatched_keys.append(
   3034         (checkpoint_key, state_dict[checkpoint_key].shape, model_state_dict[model_key].shape)
   3035     )
   3036     del state_dict[checkpoint_key]

KeyError: 'lm_head.weight'
```
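The failing line is the `state_dict[checkpoint_key]` lookup: the mismatch check walks a list of checkpoint keys that includes `lm_head.weight`, but the shard's state dict doesn't actually contain that tensor (in many Llama-style checkpoints it is tied to the input embeddings rather than stored). A minimal sketch of that failure mode, with plain Python values standing in for tensors (the key names and shapes are illustrative):

```python
# Hypothetical reduction of the mismatch check: the key list claims
# lm_head.weight exists, but the loaded shard never stored it because
# the weight is tied to the embedding matrix.
loaded_keys = ["model.embed_tokens.weight", "lm_head.weight"]
state_dict = {"model.embed_tokens.weight": (32000, 4096)}  # shapes stand in for tensors

def shard_shapes(state_dict, loaded_keys):
    # Mirrors state_dict[checkpoint_key].shape in the traceback: indexing a
    # key that is absent from this shard raises KeyError.
    return [state_dict[key] for key in loaded_keys]

try:
    shard_shapes(state_dict, loaded_keys)
except KeyError as exc:
    print(f"KeyError: {exc}")  # KeyError: 'lm_head.weight'
```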
I did something like this to fix it:

```python
import glob
import os

import torch

state_dict = {}
if not os.path.isdir(args.model_source):
    # Merge every checkpoint shard into a single state dict.
    for shard in sorted(glob.glob(os.path.join(repo_dir, "pytorch_model-*.bin"))):
        state_dict.update(torch.load(shard, map_location="cpu"))

# lm_head is tied to the input embeddings, so the shards never store it;
# alias it explicitly so the key can be found during loading.
if "lm_head.weight" not in state_dict and "embed_tokens.weight" in state_dict:
    state_dict["lm_head.weight"] = state_dict["embed_tokens.weight"]
```
I opted to download the models locally from Hugging Face, then run `python geochat_demo.py --model-path ./geochat-7B` directly.
I can see the model being loaded. Given the conditionals, I can see line 101 of the script being executed to load the model.
In a few instances it is able to detect one or two things, but mostly the results are poor. Is it because the model is not being loaded properly? The first few lines of the logs are attached below.
```
Initializing Chat
------------------------------------------------
geochat-7B
------------------------------------------------
Loading GeoChat......
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:945: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:2025: UserWarning: for vision_model.embeddings.class_embedding: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
  warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:2025: UserWarning: for vision_model.embeddings.patch_embedding.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
  warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
```
Some logs from the middle:
```
Some weights of GeoChatLlamaForCausalLM were not initialized from the model checkpoint at ../geochat-7B and are newly initialized: ['model.vision_tower.vision_tower.vision_model.embeddings.position_ids']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of the model checkpoint at openai/clip-vit-large-patch14-336 were not used when initializing CLIPVisionModel:
```
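Those warnings are produced by diffing the checkpoint's key set against the model's expected keys; any name present in only one set is either newly initialized or silently dropped. A generic sketch of that check with plain sets (the key names below are illustrative, not taken from the GeoChat code):

```python
# Parameter names the model expects (illustrative subset).
model_keys = {
    "model.embed_tokens.weight",
    "lm_head.weight",
    "model.vision_tower.vision_tower.vision_model.embeddings.position_ids",
}
# Names the checkpoint actually provides.
checkpoint_keys = {
    "model.embed_tokens.weight",
    "lm_head.weight",
}

# Keys in the model but not the checkpoint are freshly initialized
# (the "newly initialized" warning); keys only in the checkpoint are
# ignored (the "were not used" warning).
missing = model_keys - checkpoint_keys
unexpected = checkpoint_keys - model_keys

assert "position_ids" in next(iter(missing))
assert unexpected == set()
```

A buffer like `position_ids` being "newly initialized" is usually harmless, but large blocks of missing weights would explain degraded outputs.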
🤔 In the comment above, what have you modified? More specifically, which file have you modified?

The total count should be 24. It says 11, which is very far from the actual count.
Do you mean like a Python file? The model itself is likely to perform poorly on most detection tasks. It was trained primarily through instruction tuning; that is, the researchers provided examples of airplanes along with captions describing their location. Also, VLMs are generally weak at counting, and I suspect the limitation here stems from the same cause: they struggle with precise localization and discrete object enumeration.
That said, some vision–language models specialized for geospatial imagery are explicitly geometry-aware. One recent example is RingMoGPT, published in early 2025, which is much newer than GeoChat. I am currently working with GeoChat because its architecture is relatively simple, and I only need it for captioning purposes rather than more complex reasoning.
