Fix loading TITAN from local files

#10
AI for Pathology Image Analysis Lab @ HMS / BWH org
edited 4 days ago

Fixes a bug that occurs when loading TITAN from local files.

Bug
To reproduce the issue, either start from a clean environment or remove the contents of ~/.cache/huggingface/modules/transformers_modules/TITAN.
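For the cache-clearing step, a small helper along these lines may be convenient. It is only a sketch: `clear_cached_module` and `DEFAULT_MODULES_DIR` are hypothetical names, the default path is the one quoted above and assumes the cache has not been relocated, and the helper removes the whole directory rather than only its contents.

```python
import shutil
from pathlib import Path

# Default location taken from the path quoted above (assumption: the cache has not been moved).
DEFAULT_MODULES_DIR = Path.home() / ".cache" / "huggingface" / "modules" / "transformers_modules"


def clear_cached_module(name: str, modules_dir: Path = DEFAULT_MODULES_DIR) -> bool:
    """Delete the cached copy of a dynamic module; return True if something was removed."""
    target = modules_dir / name
    if not target.is_dir():
        # Nothing cached under this name, so the environment is already clean.
        return False
    shutil.rmtree(target)
    return True
```

Calling `clear_cached_module("TITAN")` before loading the model should put the cache in the state needed to reproduce the error.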
The goal is to load TITAN from local files without pulling anything from the Hugging Face Hub:

from transformers import AutoModel

titan = AutoModel.from_pretrained('/path/to/local/titan/dir', trust_remote_code=True)

This raises a FileNotFoundError:

...
  File "/home/paul/miniconda3/envs/judith/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 582, in from_pretrained
    model_class = get_class_from_dynamic_module(
  File "/home/paul/miniconda3/envs/judith/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 582, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module, force_reload=force_download)
  File "/home/paul/miniconda3/envs/judith/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 265, in get_class_in_module
    module_files: list[Path] = [module_file] + sorted(map(Path, get_relative_import_files(module_file)))
  File "/home/paul/miniconda3/envs/judith/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 130, in get_relative_import_files
    new_imports.extend(get_relative_imports(f))
  File "/home/paul/miniconda3/envs/judith/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 99, in get_relative_imports
    with open(module_file, encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/paul/.cache/huggingface/modules/transformers_modules/TITAN_test/conch_tokenizer.py'

Root of the problem
AutoModel.from_pretrained copies the model's .py files into the transformers modules cache (at ~/.cache/huggingface/modules/transformers_modules/TITAN/...).
To decide which files to copy, it looks at the relative imports declared in modeling_titan.py.
However, the current implementation does not follow those imports recursively, so conch_tokenizer, which modeling_titan.py does not import directly, is never copied. Loading then fails when the missing file is opened.

Solution
Adding a relative import of conch_tokenizer to modeling_titan.py solves the problem: conch_tokenizer.py is now copied to the cache and loads correctly at inference time.
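The difference between a direct scan and a recursive scan of relative imports, and why the extra import helps, can be illustrated with a self-contained sketch. This is not the transformers implementation: the regex is a simplification, `helper.py` and the imported names `something` and `tokenize` are hypothetical, and only the file names modeling_titan.py and conch_tokenizer.py come from this PR.

```python
from __future__ import annotations

import re
import tempfile
from pathlib import Path

# Simplified pattern for "from .module import name" (an assumption for this sketch).
RELATIVE_IMPORT = re.compile(r"^\s*from\s+\.(\w+)\s+import", flags=re.MULTILINE)


def direct_relative_imports(module_file: Path) -> set[str]:
    """Return the module names that module_file itself imports relatively."""
    return set(RELATIVE_IMPORT.findall(module_file.read_text(encoding="utf-8")))


def recursive_relative_imports(module_file: Path) -> set[str]:
    """Follow relative imports transitively, starting from module_file."""
    found: set[str] = set()
    pending = [module_file]
    while pending:
        current = pending.pop()
        for name in direct_relative_imports(current):
            if name not in found:
                found.add(name)
                pending.append(current.parent / f"{name}.py")
    return found


def demo() -> tuple[set[str], set[str], set[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        entry = root / "modeling_titan.py"
        # Hypothetical layout: only helper.py imports conch_tokenizer.
        entry.write_text("from .helper import something\n", encoding="utf-8")
        (root / "helper.py").write_text(
            "from .conch_tokenizer import tokenize\n", encoding="utf-8"
        )
        (root / "conch_tokenizer.py").write_text(
            "def tokenize(text):\n    return text.split()\n", encoding="utf-8"
        )
        direct_before = direct_relative_imports(entry)
        recursive = recursive_relative_imports(entry)
        # The fix in this PR: import conch_tokenizer directly from the entry file.
        entry.write_text(
            "from .helper import something\nfrom .conch_tokenizer import tokenize\n",
            encoding="utf-8",
        )
        direct_after = direct_relative_imports(entry)
    return direct_before, recursive, direct_after
```

In this toy layout the direct scan misses conch_tokenizer until the entry file imports it itself, while the recursive scan finds it either way. A step that copies only the directly imported files therefore leaves conch_tokenizer.py out of the cache unless the extra import is present.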

pauldoucet changed pull request status to open
pauldoucet changed pull request title from fix-local to Fix loading TITAN from local files