Character-BERT pipeline’s tokenizer becomes a bool, causing TypeError
Problem:
Using the Hugging Face pipeline helper with model="helboukkouri/character-bert" and trust_remote_code=True results in pipe.tokenizer being set to the Boolean value False instead of a real tokenizer object. Any attempt to call the pipeline for feature extraction then raises a TypeError because a bool is not callable.
Workarounds tried:
Manually loading the tokenizer and model via the AutoTokenizer and AutoModel classes, which produces valid embeddings.
Cloning the Character-BERT repository locally and pointing from_pretrained at the disk copy, which also works—but the pipeline helper remains broken.
Importing the custom tokenizer and model classes directly from their Python files using importlib, bypassing the pipeline entirely to avoid the Boolean hijack.