Collections
Collections including paper arXiv:2402.06619
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 55
- YAYI 2: Multilingual Open-Source Large Language Models
  Paper • 2312.14862 • Published • 15
- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3
- TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes
  Paper • 2311.10797 • Published

- Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
  Paper • 2311.06783 • Published • 28
- To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
  Paper • 2311.07574 • Published • 16
- Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding
  Paper • 2401.04575 • Published • 17
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
  Paper • 2402.00159 • Published • 65

- Chain-of-Verification Reduces Hallucination in Large Language Models
  Paper • 2309.11495 • Published • 39
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 81
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 85
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 83