Adaptive Multi-Agent Response Refinement in Conversational Systems Paper • 2511.08319 • Published 11 days ago • 39
Simulating Environments with Reasoning Models for Agent Training Paper • 2511.01824 • Published 19 days ago • 1
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published 25 days ago • 66
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs Paper • 2510.04767 • Published Oct 6 • 26
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10 • 48
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs Paper • 2510.07499 • Published Oct 8 • 48
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? Paper • 2510.02209 • Published Oct 2 • 52
ACON: Optimizing Context Compression for Long-horizon LLM Agents Paper • 2510.00615 • Published Oct 1 • 32
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 171
Rethinking Reward Models for Multi-Domain Test-Time Scaling Paper • 2510.00492 • Published Oct 1 • 27
Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games Paper • 2506.03610 • Published Jun 4 • 9
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81
FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA Paper • 2505.12805 • Published May 19 • 22
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction Paper • 2505.11254 • Published May 16 • 48
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via texttt{D}ual-texttt{H}ead texttt{O}ptimization Paper • 2505.07675 • Published May 12 • 20
UniversalRAG: Retrieval-Augmented Generation over Multiple Corpora with Diverse Modalities and Granularities Paper • 2504.20734 • Published Apr 29 • 61
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 120