Collections
Discover the best community collections!
Collections including paper arxiv:2505.13136
- SEA-LION: Southeast Asian Languages in One Network
  Paper • 2504.05747 • Published
- Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings
  Paper • 2408.02237 • Published
- A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs
  Paper • 2406.17377 • Published
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
  Paper • 2306.11372 • Published

- CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
  Paper • 2404.15653 • Published • 29
- MoDE: CLIP Data Experts via Clustering
  Paper • 2404.16030 • Published • 15
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
  Paper • 2405.12130 • Published • 50
- Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
  Paper • 2405.12981 • Published • 33

- ModernGBERT: German-only 1B Encoder Model Trained from Scratch
  Paper • 2505.13136 • Published • 21
- LSX-UniWue/ModernGBERT_1B
  Feature Extraction • 1B • Updated • 2.32k • 7
- LSX-UniWue/ModernGBERT_134M
  Feature Extraction • 0.2B • Updated • 3.58k • 5
- LSX-UniWue/LLaMmlein-Dataset
  Viewer • Updated • 838M • 1.59k • 3

- LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification
  Paper • 2411.19638 • Published • 6
- Word Sense Linking: Disambiguating Outside the Sandbox
  Paper • 2412.09370 • Published • 10
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 157
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 376