Collections
Collections including paper arXiv:2302.13971
Collection 1:
- Attention Is All You Need (Paper • 1706.03762 • Published • 96)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Paper • 1810.04805 • Published • 23)
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Paper • 1910.01108 • Published • 20)
- Language Models are Few-Shot Learners (Paper • 2005.14165 • Published • 17)

Collection 2:
- Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond (Paper • 2304.13712 • Published)
- LLaMA: Open and Efficient Foundation Language Models (Paper • 2302.13971 • Published • 18)
- Attention Is All You Need (Paper • 1706.03762 • Published • 96)
- A Comprehensive Overview of Large Language Models (Paper • 2307.06435 • Published • 2)

Collection 3:
- Attention Is All You Need (Paper • 1706.03762 • Published • 96)
- Language Models are Few-Shot Learners (Paper • 2005.14165 • Published • 17)
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Paper • 2201.11903 • Published • 14)
- Orca 2: Teaching Small Language Models How to Reason (Paper • 2311.11045 • Published • 77)

Collection 4:
- Attention Is All You Need (Paper • 1706.03762 • Published • 96)
- LLaMA: Open and Efficient Foundation Language Models (Paper • 2302.13971 • Published • 18)
- Efficient Tool Use with Chain-of-Abstraction Reasoning (Paper • 2401.17464 • Published • 21)
- MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts (Paper • 2407.21770 • Published • 22)

Collection 5:
- Adapting Large Language Models via Reading Comprehension (Paper • 2309.09530 • Published • 81)
- LLaMA: Open and Efficient Foundation Language Models (Paper • 2302.13971 • Published • 18)
- Finetuned Language Models Are Zero-Shot Learners (Paper • 2109.01652 • Published • 4)
- LIMA: Less Is More for Alignment (Paper • 2305.11206 • Published • 26)

Collection 6:
- Attention Is All You Need (Paper • 1706.03762 • Published • 96)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Paper • 1810.04805 • Published • 23)
- RoBERTa: A Robustly Optimized BERT Pretraining Approach (Paper • 1907.11692 • Published • 9)
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Paper • 1910.01108 • Published • 20)