User-Oriented Multi-Turn Dialogue Generation with Tool Use at Scale
Abstract
Large reasoning models enable scalable multi-turn dialogue generation by combining automated task-oriented simulation with user-oriented behavioral modeling, yielding richer datasets for human-agent interaction.
The recent paradigm shift toward large reasoning models (LRMs) as autonomous agents has intensified the demand for sophisticated, multi-turn tool-use capabilities. Yet existing datasets and data-generation approaches are limited by static, predefined toolsets that cannot scale to the complexity of open-ended human-agent collaboration. To address this, we initially developed a framework for automated task-oriented multi-turn dialogue generation at scale, using an LRM-based simulator to dynamically generate high-value, domain-specific tools for solving specified tasks. However, we observe that a purely task-oriented design often yields "solely task-solving" trajectories, in which the agent completes the objective with minimal interaction and fails to produce the high turn-count conversations seen in realistic scenarios. To bridge this gap, we shift toward a user-oriented simulation paradigm. By decoupling task generation from a dedicated user simulator that mimics human behavioral rules, such as incremental request-making and turn-by-turn feedback, we elicit more authentic, extended multi-turn dialogues that reflect the iterative nature of real-world problem solving. Our generation pipeline operates as a versatile, plug-and-play module capable of initiating generation from any state, ensuring high scalability in producing extended tool-use data. Furthermore, by supporting multiple task completions within a single trajectory, it yields a high-density dataset that reflects the multifaceted demands of real-world human-agent interaction.
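To make the decoupled design concrete, below is a minimal Python sketch of such a user-oriented generation loop. The `UserSimulator` and `ToolAgent` classes and all of their methods are illustrative assumptions rather than the paper's actual interface; the sketch only shows how incremental request-making, turn-by-turn interaction, and resumable ("plug-and-play") generation could fit together.

```python
"""Minimal sketch of a user-oriented dialogue generation loop.

All names here (UserSimulator, ToolAgent, and their methods) are
illustrative assumptions, not the paper's API.
"""
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # "user" or "agent"
    content: str


@dataclass
class UserSimulator:
    """Mimics human behavioral rules: reveal the task incrementally
    and react turn by turn instead of stating everything upfront."""
    subtasks: list[str]   # a task decomposed into incremental requests
    revealed: int = 0

    def next_message(self, history: list[Turn]) -> str | None:
        if self.revealed < len(self.subtasks):
            msg = self.subtasks[self.revealed]   # incremental request-making
            self.revealed += 1
            return msg
        return None                              # all requests satisfied: stop


class ToolAgent:
    """Placeholder agent; a real system would call an LRM that selects
    among dynamically generated, domain-specific tools."""
    def respond(self, history: list[Turn]) -> Turn:
        last_user = next(t for t in reversed(history) if t.role == "user")
        return Turn("agent", f"[tool call + answer for: {last_user.content}]")


def generate_trajectory(user: UserSimulator, agent: ToolAgent,
                        history: list[Turn] | None = None) -> list[Turn]:
    """Plug-and-play: generation can resume from any prior `history` state,
    and packing several subtasks into one trajectory yields dense data."""
    history = list(history or [])
    while (msg := user.next_message(history)) is not None:
        history.append(Turn("user", msg))
        history.append(agent.respond(history))
    return history


if __name__ == "__main__":
    sim = UserSimulator(subtasks=[
        "Find flights from Berlin to Oslo next Friday.",
        "Actually, only show direct flights.",
        "Book the cheapest one and email me the receipt.",
    ])
    for turn in generate_trajectory(sim, ToolAgent()):
        print(f"{turn.role:>5}: {turn.content}")
```

Because `generate_trajectory` accepts an existing `history`, the same loop can resume from any saved state, which is one way to read the abstract's claim that generation can be initiated from any state.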
Community
While large language models have shown remarkable progress in tool use, maintaining high-quality, user-centric multi-turn conversations at scale remains a significant challenge.
Our work focuses on:
(1) Generating high-fidelity multi-turn dialogue datasets designed for practical tool-use scenarios.
(2) Enhancing model performance in complex, user-oriented interactions.
(3) Providing insights into scaling dialogue generation without compromising on user experience.
Check out the full paper here: https://arxiv.org/abs/2601.08225
This sounds silly.
If you fine-tune the model for "high turn-count conversations", it will (literally) learn to converse about things it could have just answered instead.
LLMs don't have any thought process; they don't know what they know before predicting the next token.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Jenius Agent: Towards Experience-Driven Accuracy Optimization in Real-World Scenarios (2026)
- RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction (2026)
- SpeakRL: Synergizing Reasoning, Speaking, and Acting in Language Models with Reinforcement Learning (2025)
- ToolGym: an Open-world Tool-using Environment for Scalable Agent Testing and Data Curation (2026)
- TravelBench: A Broader Real-World Benchmark for Multi-Turn and Tool-Using Travel Planning (2025)
- SimRPD: Optimizing Recruitment Proactive Dialogue Agents through Simulator-Based Data Evaluation and Selection (2026)
- Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction (2025)