Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arXiv:2402.09906

Generative Representational Instruction Tuning (GRIT)

Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54
GritLM/GritLM-7B

Text Generation • 7B • Updated Apr 15 • 9.11k • • 111
GritLM/GritLM-8x7B

Text Generation • 47B • Updated Apr 15 • 7.89k • 38
GritLM/GritLM-7B-KTO

Text Generation • 7B • Updated Jun 14, 2024 • 8.41k • • 4

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model

Paper • 2402.07827 • Published Feb 12, 2024 • 48
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models

Paper • 2104.08663 • Published Apr 17, 2021 • 3
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54

LLaMA Pro: Progressive LLaMA with Block Expansion

Paper • 2401.02415 • Published Jan 4, 2024 • 53
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15, 2024 • 22
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54

Alignment: FineTuning-Preference

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 32
Tailoring Self-Rationalizers with Multi-Reward Distillation

Paper • 2311.02805 • Published Nov 6, 2023 • 7
Ultra-Long Sequence Distributed Transformer

Paper • 2311.02382 • Published Nov 4, 2023 • 6
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Paper • 2309.11235 • Published Sep 20, 2023 • 15

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53
Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 89

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model

Paper • 2402.07827 • Published Feb 12, 2024 • 48
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54

Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation

Paper • 2401.15688 • Published Jan 28, 2024 • 11
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Paper • 2401.15071 • Published Jan 26, 2024 • 37

Research Papers

A collection of papers focused on LLM

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77
ToolTalk: Evaluating Tool-Usage in a Conversational Setting

Paper • 2311.10775 • Published Nov 15, 2023 • 10
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning

Paper • 2311.11077 • Published Nov 18, 2023 • 29
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

Paper • 2311.11501 • Published Nov 20, 2023 • 37

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 89
LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 66
Table-GPT: Table-tuned GPT for Diverse Table Tasks

Paper • 2310.09263 • Published Oct 13, 2023 • 41
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 105

Generative Representational Instruction Tuning (GRIT)

Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54
GritLM/GritLM-7B

Text Generation • 7B • Updated Apr 15 • 9.11k • • 111
GritLM/GritLM-8x7B

Text Generation • 47B • Updated Apr 15 • 7.89k • 38
GritLM/GritLM-7B-KTO

Text Generation • 7B • Updated Jun 14, 2024 • 8.41k • • 4

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model

Paper • 2402.07827 • Published Feb 12, 2024 • 48
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model

Paper • 2402.07827 • Published Feb 12, 2024 • 48
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models

Paper • 2104.08663 • Published Apr 17, 2021 • 3
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54

Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation

Paper • 2401.15688 • Published Jan 28, 2024 • 11
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Paper • 2401.15071 • Published Jan 26, 2024 • 37

LLaMA Pro: Progressive LLaMA with Block Expansion

Paper • 2401.02415 • Published Jan 4, 2024 • 53
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15, 2024 • 22
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15, 2024 • 54

Research Papers

A collection of papers focused on LLM

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77
ToolTalk: Evaluating Tool-Usage in a Conversational Setting

Paper • 2311.10775 • Published Nov 15, 2023 • 10
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning

Paper • 2311.11077 • Published Nov 18, 2023 • 29
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

Paper • 2311.11501 • Published Nov 20, 2023 • 37

Alignment: FineTuning-Preference

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 32
Tailoring Self-Rationalizers with Multi-Reward Distillation

Paper • 2311.02805 • Published Nov 6, 2023 • 7
Ultra-Long Sequence Distributed Transformer

Paper • 2311.02382 • Published Nov 4, 2023 • 6
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Paper • 2309.11235 • Published Sep 20, 2023 • 15

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 89
LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 66
Table-GPT: Table-tuned GPT for Diverse Table Tasks

Paper • 2310.09263 • Published Oct 13, 2023 • 41
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 105

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53
Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 89

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs