-
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Paper • 2304.09842 • Published • 2 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 30 -
Gorilla: Large Language Model Connected with Massive APIs
Paper • 2305.15334 • Published • 5 -
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 5
Collections
Discover the best community collections!
Collections including paper arXiv:2402.16837
-
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 21 -
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Paper • 2401.15688 • Published • 11 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74 -
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 37
-
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper • 2401.12954 • Published • 33 -
Learning Universal Predictors
Paper • 2401.14953 • Published • 22 -
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper • 2402.01622 • Published • 37 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29
-
Beyond Surface: Probing LLaMA Across Scales and Layers
Paper • 2312.04333 • Published • 20 -
SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Paper • 2402.00854 • Published • 22 -
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper • 2402.01622 • Published • 37 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29
-
Scaling Instruction-Finetuned Language Models
Paper • 2210.11416 • Published • 7 -
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 146 -
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published • 66 -
Yi: Open Foundation Models by 01.AI
Paper • 2403.04652 • Published • 65
-
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper • 2402.17753 • Published • 20 -
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Paper • 2402.16671 • Published • 29 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29 -
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Paper • 2402.15000 • Published • 24
-
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Paper • 2401.03065 • Published • 11 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 66 -
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
Paper • 2312.14187 • Published • 51 -
On the Effectiveness of Large Language Models in Domain-Specific Code Generation
Paper • 2312.01639 • Published • 2
-
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Paper • 2408.08152 • Published • 59 -
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
Paper • 2402.15220 • Published • 22 -
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 56 -
Simple linear attention language models balance the recall-throughput tradeoff
Paper • 2402.18668 • Published • 20
-
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Paper • 2304.09842 • Published • 2 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 30 -
Gorilla: Large Language Model Connected with Massive APIs
Paper • 2305.15334 • Published • 5 -
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 5
-
Scaling Instruction-Finetuned Language Models
Paper • 2210.11416 • Published • 7 -
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 146 -
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published • 66 -
Yi: Open Foundation Models by 01.AI
Paper • 2403.04652 • Published • 65
-
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper • 2402.17753 • Published • 20 -
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Paper • 2402.16671 • Published • 29 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29 -
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Paper • 2402.15000 • Published • 24
-
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 21 -
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Paper • 2401.15688 • Published • 11 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74 -
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 37
-
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Paper • 2401.03065 • Published • 11 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 66 -
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
Paper • 2312.14187 • Published • 51 -
On the Effectiveness of Large Language Models in Domain-Specific Code Generation
Paper • 2312.01639 • Published • 2
-
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper • 2401.12954 • Published • 33 -
Learning Universal Predictors
Paper • 2401.14953 • Published • 22 -
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper • 2402.01622 • Published • 37 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29
-
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Paper • 2408.08152 • Published • 59 -
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
Paper • 2402.15220 • Published • 22 -
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 56 -
Simple linear attention language models balance the recall-throughput tradeoff
Paper • 2402.18668 • Published • 20
-
Beyond Surface: Probing LLaMA Across Scales and Layers
Paper • 2312.04333 • Published • 20 -
SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Paper • 2402.00854 • Published • 22 -
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper • 2402.01622 • Published • 37 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29