A version of the gpt-oss tokenizer (o200k_harmony), filtered to exclude tokens containing characters outside the Latin, Cyrillic, Greek, and Georgian scripts. Characters whose Unicode script is Common or Unknown are kept. The filtering was done with https://github.com/spyysalo/tokenizer-filter/ as follows:
python3 filter_by_script.py openai/gpt-oss-120b Latin Cyrillic Greek Georgian --save-dir harmony-latin-plus