- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 59
- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
  Paper • 2311.12092 • Published • 23
- DREAM: Diffusion Rectification and Estimation-Adaptive Models
  Paper • 2312.00210 • Published • 17
- HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
  Paper • 2312.00079 • Published • 17

Collections
Collections including paper arxiv:2312.06550

- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  Paper • 2211.05100 • Published • 34
- CsFEVER and CTKFacts: Acquiring Czech data for fact verification
  Paper • 2201.11115 • Published
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 24
- FinGPT: Large Generative Models for a Small Language
  Paper • 2311.05640 • Published • 32

- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 10
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 172
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 20
- Attention Is All You Need
  Paper • 1706.03762 • Published • 96

- AlpaGasus: Training A Better Alpaca with Fewer Data
  Paper • 2307.08701 • Published • 23
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 7
- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 24
- SlimPajama-DC: Understanding Data Combinations for LLM Training
  Paper • 2309.10818 • Published • 11

- DualMix: Unleashing the Potential of Data Augmentation for Online Class-Incremental Learning
  Paper • 2303.07864 • Published • 1
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks
  Paper • 2305.13547 • Published • 1
- MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
  Paper • 2304.09402 • Published • 2
- LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning
  Paper • 2305.18169 • Published • 1

- A technical note on bilinear layers for interpretability
  Paper • 2305.03452 • Published • 1
- Interpreting Transformer's Attention Dynamic Memory and Visualizing the Semantic Information Flow of GPT
  Paper • 2305.13417 • Published • 1
- Explainable AI for Pre-Trained Code Models: What Do They Learn? When They Do Not Work?
  Paper • 2211.12821 • Published • 2
- The Linear Representation Hypothesis and the Geometry of Large Language Models
  Paper • 2311.03658 • Published • 1

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 65
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 9
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7

- Creative Robot Tool Use with Large Language Models
  Paper • 2310.13065 • Published • 9
- CodeCoT and Beyond: Learning to Program and Test like a Developer
  Paper • 2308.08784 • Published • 5
- Lemur: Harmonizing Natural Language and Code for Language Agents
  Paper • 2310.06830 • Published • 34
- CodePlan: Repository-level Coding using LLMs and Planning
  Paper • 2309.12499 • Published • 79