Collections

Discover the best community collections!

Collections including paper arxiv:2505.13136

LLäMmlein2Vec 🐑

ModernGBERT: German-only 1B Encoder Model Trained from Scratch

Paper • 2505.13136 • Published May 19 • 21
LSX-UniWue/LLaMmlein2Vec_7B

Feature Extraction • Updated 28 days ago • 10
LSX-UniWue/LLaMmlein2Vec_1B

Feature Extraction • Updated 28 days ago • 11
LSX-UniWue/LLaMmlein2Vec_120M

Feature Extraction • Updated 28 days ago • 14

LLMs for "Low Training Data Languages"

SEA-LION: Southeast Asian Languages in One Network

Paper • 2504.05747 • Published Apr 8
Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings

Paper • 2408.02237 • Published Aug 5, 2024
A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs

Paper • 2406.17377 • Published Jun 25, 2024
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts

Paper • 2306.11372 • Published Jun 20, 2023

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24, 2024 • 29
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24, 2024 • 15
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 50
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 33

ModernGBERT: German-only 1B Encoder Model Trained from Scratch

Paper • 2505.13136 • Published May 19 • 21
LSX-UniWue/ModernGBERT_1B

Feature Extraction • 1B • Updated 18 days ago • 2.32k • 7
LSX-UniWue/ModernGBERT_134M

Feature Extraction • 0.2B • Updated 18 days ago • 3.58k • 5
LSX-UniWue/LLaMmlein-Dataset

Viewer • Updated 28 days ago • 838M • 1.59k • 3

Text Classification

LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification

Paper • 2411.19638 • Published Nov 29, 2024 • 6
Word Sense Linking: Disambiguating Outside the Sandbox

Paper • 2412.09370 • Published Dec 12, 2024 • 10
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 157
Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376
