arxiv:2510.10528

Merlin's Whisper: Enabling Efficient Reasoning in LLMs via Black-box Adversarial Prompting

Published on Oct 12

Authors:

Abstract

AdvPrompt, an iterative refinement framework, reduces token usage in large reasoning models without sacrificing accuracy, demonstrating effectiveness across various benchmarks and model types.

AI-generated summary

Large reasoning models (LRMs) have demonstrated remarkable proficiency in tackling complex reasoning tasks through step-by-step thinking. However, such a lengthy reasoning process incurs substantial computational and latency overheads, hindering the practical deployment of these models. In this work, we present a new perspective on mitigating overthinking in LRMs via black-box adversarial prompting. By treating both open-source LRMs and closed-source APIs as black-box communicators, we investigate how to elicit concise responses without sacrificing accuracy. We introduce AdvPrompt, an iterative refinement framework that generates high-quality adversarial prompts from diverse perspectives. Experiments across multiple benchmarks demonstrate that AdvPrompt consistently reduces token usage while preserving performance. Notably, AdvPrompt achieves a 3x reduction in average response length on simple GSM8K questions for the Qwen3 model series, and delivers an average ~40% token reduction across four benchmarks. For closed-source APIs, AdvPrompt reduces token usage on MATH-500 by 35% for Claude-3.7 and 47% for Gemini-2.5. Further analysis reveals the generalizability of AdvPrompt across various model scales and families, underscoring the potential of black-box prompting as a practical and effective strategy for enhancing LRM efficiency.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2510.10528 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2510.10528 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2510.10528 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.