Update README.md
README.md
CHANGED
@@ -1,7 +1,31 @@
 ---
 datasets:
 - bbunzeck/babylm-german
+- bbunzeck/german-babylm-5m-subsets
 language:
 - de
 pipeline_tag: text-generation
----
+---
+
+This is a German BabyLM model trained on a [5M token subset](https://huggingface.co/datasets/bbunzeck/german-babylm-5m-subsets) of the [German BabyLM corpus](https://huggingface.co/datasets/bbunzeck/babylm-german).
+
+If you use this model, please cite the following publication:
+```
+@inproceedings{bunzeck-etal-2025-construction,
+title = "Do Construction Distributions Shape Formal Language Learning In {G}erman {B}aby{LM}s?",
+author = "Bunzeck, Bastian and
+Duran, Daniel and
+Zarrie{\ss}, Sina",
+editor = "Boleda, Gemma and
+Roth, Michael",
+booktitle = "Proceedings of the 29th Conference on Computational Natural Language Learning",
+month = jul,
+year = "2025",
+address = "Vienna, Austria",
+publisher = "Association for Computational Linguistics",
+url = "https://aclanthology.org/2025.conll-1.12/",
+doi = "10.18653/v1/2025.conll-1.12",
+pages = "169--186",
+ISBN = "979-8-89176-271-8",
+}
+```