---
datasets:
- bbunzeck/babylm-german
- bbunzeck/german-babylm-5m-subsets
language:
- de
pipeline_tag: text-generation
---

This is a German BabyLM model trained on a [5M token subset](https://huggingface.co/datasets/bbunzeck/german-babylm-5m-subsets) of the [German BabyLM corpus](https://huggingface.co/datasets/bbunzeck/babylm-german).

If you use this model, please cite the following publication:

```
@inproceedings{bunzeck-etal-2025-construction,
    title = "Do Construction Distributions Shape Formal Language Learning In {G}erman {B}aby{LM}s?",
    author = "Bunzeck, Bastian and Duran, Daniel and Zarrie{\ss}, Sina",
    editor = "Boleda, Gemma and Roth, Michael",
    booktitle = "Proceedings of the 29th Conference on Computational Natural Language Learning",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.conll-1.12/",
    doi = "10.18653/v1/2025.conll-1.12",
    pages = "169--186",
    ISBN = "979-8-89176-271-8",
}
```