---
datasets:
- UW/olmo-mix-1124-subset-p99
---

# BPE tokenizer

This is a BPE counterpart to [alisawuffles/superbpe-tokenizer-128k](https://huggingface.co/alisawuffles/superbpe-tokenizer-128k), trained on the same data with the same vocabulary size. You can experiment with this tokenizer on our [tokenizer playground](https://superbpe.github.io/) by entering a custom HF repository ID.
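
Because the two tokenizers share training data and vocabulary size, one natural comparison is encoding efficiency: the average number of UTF-8 bytes each token covers on the same text. The following is a minimal sketch of that comparison, not an official evaluation script. The helper names `bytes_per_token` and `load_hf_encoder` are hypothetical, and the sketch assumes that `transformers` is installed and that both repositories load through `AutoTokenizer.from_pretrained`.

```python
from typing import Callable, Sequence


def bytes_per_token(encode: Callable[[str], Sequence], text: str) -> float:
    """Average number of UTF-8 bytes covered by each token of `text`."""
    n_tokens = len(encode(text))
    if n_tokens == 0:
        raise ValueError("encoder produced no tokens")
    return len(text.encode("utf-8")) / n_tokens


def load_hf_encoder(repo_id: str) -> Callable[[str], Sequence]:
    """Wrap a Hugging Face tokenizer as a plain text -> token-ids function.

    Assumption: `transformers` is installed and `repo_id` can be loaded
    with AutoTokenizer. Special tokens are excluded so that only the
    tokens covering the text itself are counted.
    """
    from transformers import AutoTokenizer  # imported lazily on purpose

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return lambda text: tokenizer.encode(text, add_special_tokens=False)


if __name__ == "__main__":
    sample = "By the way, I am a fan of the Milky Way."
    # Stand-in encoder so the sketch runs offline: whitespace splitting.
    print(bytes_per_token(str.split, sample))
```

To compare real tokenizers, pass `load_hf_encoder("alisawuffles/superbpe-tokenizer-128k")` and the same call with this repository's ID as the `encode` argument, using the same sample text for both. A higher value means fewer tokens are needed for the same text.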