---
datasets:
- UW/olmo-mix-1124-subset-p99
---
# BPE tokenizer
This is a BPE counterpart to [alisawuffles/superbpe-tokenizer-128k](https://huggingface.co/alisawuffles/superbpe-tokenizer-128k), trained on the same data with the same vocabulary size.

You can experiment with this tokenizer on our [tokenizer playground](https://superbpe.github.io/) by entering this repository's ID as a custom Hugging Face repository ID.
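
To compare the two tokenizers locally, something like the following minimal sketch may help. It assumes the `transformers` package is installed and that both repositories load with `AutoTokenizer`, which has not been verified here. The helper names `count_tokens`, `load_encoder`, and `compare` are illustrative and not part of any library.

```python
from typing import Callable, Dict, List


def count_tokens(text: str, encoders: Dict[str, Callable[[str], List]]) -> Dict[str, int]:
    """Return how many tokens each encoder produces for `text`."""
    return {name: len(encode(text)) for name, encode in encoders.items()}


def load_encoder(repo_id: str) -> Callable[[str], List]:
    """Load a tokenizer from the Hugging Face Hub and return an encode function.

    Assumption: the repo is loadable with AutoTokenizer. The import is lazy so
    that count_tokens can be used without transformers installed.
    """
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # Exclude special tokens so only the text itself is counted.
    return lambda text: tokenizer.encode(text, add_special_tokens=False)


def compare(text: str, bpe_repo_id: str, superbpe_repo_id: str) -> Dict[str, int]:
    """Count tokens for `text` under the BPE and SuperBPE tokenizers."""
    encoders = {
        "bpe": load_encoder(bpe_repo_id),
        "superbpe": load_encoder(superbpe_repo_id),
    }
    return count_tokens(text, encoders)
```

Calling `compare(text, bpe_repo_id, "alisawuffles/superbpe-tokenizer-128k")` with this repository's ID as `bpe_repo_id` downloads both tokenizers and returns a token count for each. Since both were trained on the same data with the same vocabulary size, a difference in counts on the same text reflects the tokenization method rather than the training setup.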