This is a BERT model from the Turkish transformer collection of the research work *Optimal Turkish Subword Strategies at Scale: Systematic Evaluation of Data, Vocabulary, Morphology Interplay*.

The Turkish Subwords Research collection contains BERT models; this model was trained with the wordpiece-2K-minimal tokenizer. The tokenizers come in several vocabulary sizes and were trained on three corpus sizes: minimal, medium, and alldata. The collection contains all the tokenizers, named following the pattern wordpiece_{vocab-size}k_{corpus-size}. For more information, please refer to the research paper.
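For illustration, a minimal sketch of loading a tokenizer from the collection by constructing its name from the pattern above; the exact hub repository IDs under turkish-nlp-suite are assumptions, so check the collection page for the actual names.

```python
# Hypothetical sketch: build a tokenizer repo ID from the pattern
# wordpiece_{vocab-size}k_{corpus-size}. The exact repository IDs are
# assumptions; verify against the Turkish Subwords Research collection.
from transformers import AutoTokenizer

vocab_size_k = 2          # vocabulary size in thousands (e.g. 2 -> 2K)
corpus_size = "minimal"   # one of: minimal, medium, alldata

repo_id = f"turkish-nlp-suite/wordpiece_{vocab_size_k}k_{corpus_size}"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
print(tokenizer.tokenize("Ankara Türkiye'nin başkentidir."))
```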

This is not a production model; it is intended for RESEARCH PURPOSES only.
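A minimal usage sketch with the Hugging Face transformers library is shown below. It assumes the checkpoint ships with a masked-LM head (typical for BERT pretraining checkpoints); the Turkish example sentence is illustrative only.

```python
# Minimal sketch: load the model and run a masked-token prediction.
# Assumes a masked-LM head is present in the checkpoint.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "turkish-nlp-suite/bert-2K-minimal"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Illustrative sentence with one masked token.
text = f"Ankara Türkiye'nin {tokenizer.mask_token} şehridir."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring token at the [MASK] position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```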

Model size: 87.6M parameters (F32, Safetensors)

Collection: Turkish Subwords Research (includes turkish-nlp-suite/bert-2K-minimal)
Paper: Optimal Turkish Subword Strategies at Scale: Systematic Evaluation of Data, Vocabulary, Morphology Interplay