Collections
Discover the best community collections!
Collections including paper arXiv:2402.09906
-
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Paper • 2402.07827 • Published • 48 -
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Paper • 2104.08663 • Published • 3 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 54
-
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper • 2401.02415 • Published • 53 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 109 -
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper • 2402.10193 • Published • 22 -
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 54
-
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Paper • 2311.03285 • Published • 32 -
Tailoring Self-Rationalizers with Multi-Reward Distillation
Paper • 2311.02805 • Published • 7 -
Ultra-Long Sequence Distributed Transformer
Paper • 2311.02382 • Published • 6 -
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Paper • 2309.11235 • Published • 15
-
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 85 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 39 -
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 89
-
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 21 -
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Paper • 2401.15688 • Published • 11 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74 -
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 37
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 10 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 29 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 37
-
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 89 -
LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 66 -
Table-GPT: Table-tuned GPT for Diverse Table Tasks
Paper • 2310.09263 • Published • 41 -
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 105
-
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Paper • 2402.07827 • Published • 48 -
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Paper • 2104.08663 • Published • 3 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 54
-
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 21 -
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Paper • 2401.15688 • Published • 11 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74 -
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 37
-
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper • 2401.02415 • Published • 53 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 109 -
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper • 2402.10193 • Published • 22 -
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 54
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 10 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 29 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 37
-
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Paper • 2311.03285 • Published • 32 -
Tailoring Self-Rationalizers with Multi-Reward Distillation
Paper • 2311.02805 • Published • 7 -
Ultra-Long Sequence Distributed Transformer
Paper • 2311.02382 • Published • 6 -
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Paper • 2309.11235 • Published • 15
-
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 89 -
LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 66 -
Table-GPT: Table-tuned GPT for Diverse Table Tasks
Paper • 2310.09263 • Published • 41 -
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 105
-
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 85 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 39 -
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 89