Commit 3e0ee9b (verified) by bbunzeck · 1 parent: 6b80a2a

Update README.md

  1. README.md +25 -1
README.md CHANGED
@@ -1,7 +1,31 @@
  ---
  datasets:
  - bbunzeck/babylm-german
+ - bbunzeck/german-babylm-5m-subsets
  language:
  - de
  pipeline_tag: text-generation
- ---
+ ---
+
+ This is a German BabyLM model trained on a [5M token subset](https://huggingface.co/datasets/bbunzeck/german-babylm-5m-subsets) of the [German BabyLM corpus](https://huggingface.co/datasets/bbunzeck/babylm-german).
+
+ If you use this model, please cite the following publication:
+ ```
+ @inproceedings{bunzeck-etal-2025-construction,
+ title = "Do Construction Distributions Shape Formal Language Learning In {G}erman {B}aby{LM}s?",
+ author = "Bunzeck, Bastian and
+ Duran, Daniel and
+ Zarrie{\ss}, Sina",
+ editor = "Boleda, Gemma and
+ Roth, Michael",
+ booktitle = "Proceedings of the 29th Conference on Computational Natural Language Learning",
+ month = jul,
+ year = "2025",
+ address = "Vienna, Austria",
+ publisher = "Association for Computational Linguistics",
+ url = "https://aclanthology.org/2025.conll-1.12/",
+ doi = "10.18653/v1/2025.conll-1.12",
+ pages = "169--186",
+ ISBN = "979-8-89176-271-8",
+ }
+ ```