Experimental Models
lumees/Lumees-3.8B-Reasoning • Text Generation • 4B • Updated Nov 23, 2025 • 2 • 2
Global Corpus
lumees/turkish-corpus-100b • Viewer • Updated Nov 30, 2025 • 107M • 874 • 3
lumees/multilingual-safety-classification-dataset • Viewer • Updated Oct 24, 2025 • 213k • 226 • 2
lumees/bulgarian-corpus-33b • Viewer • Updated Nov 30, 2025 • 34.9M • 869 • 3
lumees/dutch-corpus-200b • Viewer • Updated Dec 1, 2025 • 170M • 365 • 3
Turkish Retrieval Datasets
lumees/ms-marco-tr-hard-negatives • Viewer • Updated Nov 27, 2025 • 786k • 44 • 2
lumees/wikipedia-turkish-synthetic-query • Viewer • Updated Nov 28, 2025 • 19.8k • 25 • 3
Retrieval Models
lumees/lumees-matryoshka-embedding-v1 • Sentence Similarity • 0.6B • Updated Nov 25, 2025 • 6 • 2
lumees/lumees-matryoshka-vision-embedding-v1 • Feature Extraction • Updated Nov 26, 2025 • 4 • 3
lumees/aethel-reranker-en-v1 • Text Ranking • 0.1B • Updated Nov 20, 2025 • 66 • 3
Code Retrieval Datasets
lumees/codesearchnet-hard-negatives • Viewer • Updated Nov 28, 2025 • 955k • 27 • 2
Safety & Moderation Datasets
Comprehensive collection of high-quality multilingual datasets for NLP research and production.
lumees/multilingual-safety-classification-dataset • Viewer • Updated Oct 24, 2025 • 213k • 226 • 2
lumees/age-specific-text-simplification • Viewer • Updated Aug 13, 2025 • 17.2k • 28 • 2