Open Datasets
updated
fka/awesome-chatgpt-prompts • Viewer • 759 • 22k • 9.52k
Viewer • 470M • 46.1k • 321
Viewer • 2.2M • 6.98k • 385
Matthijs/cmu-arctic-xvectors • Viewer • 7.93k • 18.8k • 62
parler-tts/libritts-r-filtered-speaker-descriptions • Viewer • 359k • 222 • 7
Viewer • 860k • 12.1k • 517
alpindale/two-million-bluesky-posts • Viewer • 2.11M • 700 • 200
arimalabs/2.3-million-bluesky-posts • Viewer • 2.37M • 76 • 5
Viewer • 70k • 78.5k • 216
Viewer • 1.34M • 8.51k • 29
Viewer • 1.12M • 5.44k • 4
parler-tts/libritts_r_filtered • Viewer • 359k • 2.72k • 21
opendiffusionai/cc12m-cleaned • Viewer • 8.53M • 372 • 10
Viewer • 31.4k • 363 • 22
Preview • 1.11k • 7
Viewer • 61.6M • 67.5k • 1.01k
parler-tts/mls-eng-speaker-descriptions • Viewer • 10.8M • 303 • 10
Viewer • 110M • 2.14k • 97
159 • 2
Viewer • 602k • 11.1k • 144
Viewer • 4.48B • 56.4k • 707
Viewer • 1.55k • 45 • 4
10.6k • 138
Viewer • 59.1k • 3.23k • 12
keremberke/license-plate-object-detection • Viewer • 8.83k • 855 • 33
51 • 8
Viewer • 98.6k • 1.41k • 100
nebius/SWE-agent-trajectories • Viewer • 80k • 1.15k • 67
Viewer • 3.4k • 4.76k • 53
cfahlgren1/react-code-instructions • Viewer • 74.4k • 456 • 154
DAMO-NLP-SG/multimodal_textbook • 4.3k • 156
NovaSky-AI/Sky-T1_data_17k • Viewer • 16.4k • 250 • 186
Viewer • 5.45B • 7.64k • 437
Viewer • 546M • 21k • 898
hoskinson-center/proof-pile • Viewer • 363k • 6.99k • 63
HuggingFaceFW/fineweb-edu • Viewer • 3.5B • 286k • 886
EleutherAI/the_pile_deduplicated • Viewer • 134M • 16.6k • 106
MohamedRashad/multilingual-tts • Viewer • 25.5k • 255 • 45
Viewer • 16.4k • 65 • 4
facebook/multilingual_librispeech • Viewer • 1.49M • 21.9k • 167
Viewer • 1.25M • 15.9k • 85
Viewer • 2.77M • 7.28k • 112
Fumika/Wikinews-multilingual • Viewer • 15.2k • 103 • 7
ayymen/Weblate-Translations • Viewer • 11.7M • 3.9k • 16
17.1k • 152
Helsinki-NLP/opus_wikipedia • Viewer • 1.75M • 353 • 10
Viewer • 3.59M • 109 • 1
MLCommons/unsupervised_peoples_speech • 32.6k • 69
HKUSTAudio/Llasa_opensource_speech_data_160k_hours_tokenized • 446 • 30
Viewer • 10k • 3.19k • 516
Viewer • 68.1k • 155k • 20
allenai/RLVR-GSM-MATH-IF-Mixed-Constraints • Viewer • 29.9k • 1.21k • 30
allenai/olmo-2-0325-32b-preference-mix • 221 • 15
allenai/tulu-3-sft-olmo-2-mixture-0225 • Viewer • 866k • 817 • 22
Viewer • 170M • 44.9k • 88
Viewer • 621M • 35.2k • 84
Viewer • 932 • 16.2k • 526
Congliu/Chinese-DeepSeek-R1-Distill-data-110k • Viewer • 110k • 472 • 713
Viewer • 102k • 218 • 46
Viewer • 450k • 12.6k • 687
Viewer • 167M • 2.03k • 60