- End-to-End Goal-Driven Web Navigation
  Paper • 1602.02261 • Published
- Learning Language Games through Interaction
  Paper • 1606.02447 • Published
- Naturalizing a Programming Language via Interactive Learning
  Paper • 1704.06956 • Published
- Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
  Paper • 1802.08802 • Published • 1

Collections
Collections including paper arxiv:2508.01780
- AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
  Paper • 2402.15506 • Published • 18
- AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
  Paper • 2404.03648 • Published • 30
- Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
  Paper • 2405.19893 • Published • 33
- Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
  Paper • 2405.19888 • Published • 7

- AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving
  Paper • 2506.12508 • Published • 1
- Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling
  Paper • 2507.23370 • Published
- MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools
  Paper • 2509.09734 • Published • 15
- LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
  Paper • 2508.01780 • Published • 20

- LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
  Paper • 2508.15760 • Published • 46
- LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
  Paper • 2508.01780 • Published • 20
- API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs
  Paper • 2304.08244 • Published • 1
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 154

- MemOS: A Memory OS for AI System
  Paper • 2507.03724 • Published • 155
- Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
  Paper • 2507.04009 • Published • 51
- AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research
  Paper • 2507.13300 • Published • 19
- LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
  Paper • 2508.01780 • Published • 20