Improve model card: add metadata and refine description

#1
by nielsr (HF Staff) · opened
Files changed (1)
  1. README.md +27 -3
README.md CHANGED
@@ -1,12 +1,36 @@
  ---
  license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
  ---

+ # LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning
+
  <!-- Badges -->
  [![arXiv](https://img.shields.io/badge/arXiv-2512.05325-b31b1b.svg)](https://arxiv.org/abs/2512.05325)
  [![GitHub](https://img.shields.io/badge/GitHub-Code-black?logo=github)](https://github.com/farukakgul/LYNX)

- <div style="font-size: 15px; line-height: 1.55;">
+ LYNX turns a reasoning model’s own hidden states into **confidence-controlled early exits**. At naturally occurring cue tokens (e.g., `hmm`, `wait`, `alternatively`), LYNX:
+ 1. Extracts features from a few intermediate layers.
+ 2. Uses a lightweight probe to predict whether the final answer will be correct if we stop now.
+ 3. Wraps the probe with split conformal prediction to get a **user-tunable confidence level** and explicit guarantees.
+
+ This repository contains a minimal, self-contained implementation of that pipeline for open-weight LMs (e.g., DeepSeek-R1-1.5B, QwQ-32B, and Llama-3.1-Nemotron-8B), with a Hugging Face-only workflow for training, calibration, and evaluation.
+
+ For more details, refer to the [paper](https://huggingface.co/papers/2512.05325) and the [GitHub repository](https://github.com/farukakgul/LYNX).
+
+ ## Citation
+
+ If you find LYNX useful, please cite the accompanying paper:

- Large reasoning models achieve strong performance on complex tasks by generating extended chains of thought, but they often “overthink”: continuing to reason long after they internally have enough information to answer correctly. This wastes inference-time compute and can even hurt accuracy. Existing attempts to stop early either manipulate decoding with extra sampling and heuristics, rely on auxiliary verifier models, or operate only as post-hoc analysis pipelines without formal guarantees. We introduce LYNX, an online early-exit mechanism that turns a model’s own hidden-state awareness into confidence-controlled stopping decisions. LYNX attaches exit decisions to naturally occurring reasoning cues (e.g., “hmm”, “wait”) during generation, trains a lightweight probe on hidden states at those cue tokens using supervision from forced exits, and wraps the resulting scores in split conformal prediction to obtain distribution-free control over the rate of premature exits. Crucially, we train and calibrate this probe once on a generic mathematical corpus and then reuse it unchanged across benchmarks, decoding temperatures, and even non-mathematical tasks. Across three model families spanning 1.5B to 32B parameters (DeepSeek-R1-1.5B, QwQ-32B, and Llama-3.1-Nemotron-8B), a single mathematically trained probe per base model yields strong accuracy–efficiency tradeoffs. On GSM8K, LYNX matches or improves baseline accuracy while reducing tokens by 40–65%; on MATH-500 it improves accuracy by up to 12 points with roughly 35–60% fewer tokens; on AIME 2024 it recovers baseline accuracy with more than 50% token savings; and on CommonsenseQA, a non-math benchmark, it transfers zero-shot with modest accuracy gains and up to 70% fewer tokens. Compared to state-of-the-art early-exit methods, LYNX offers competitive or superior Pareto frontiers while remaining fully online, requiring no proxy models at inference, and providing explicit, user-tunable confidence guarantees. Code is available at https://github.com/farukakgul/LYNX.
- </div>
+ ```bibtex
+ @misc{akgül2025lynxlearningdynamicexits,
+       title={LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning},
+       author={Ömer Faruk Akgül and Yusuf Hakan Kalaycı and Rajgopal Kannan and Willie Neiswanger and Viktor Prasanna},
+       year={2025},
+       eprint={2512.05325},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2512.05325},
+ }
+ ```
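
The three-step mechanism in the new card text can be pictured with a short sketch. The snippet below is an illustration only, not the repository's implementation: the model checkpoint, cue list, probe layers and architecture, and the greedy decoding loop are all assumptions made for exposition (see the GitHub repository for the real pipeline).

```python
# Illustrative sketch of confidence-controlled early exit at cue tokens.
# NOT the LYNX repo's code: checkpoint, cue list, and probe details are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True).eval()

CUE_WORDS = {"hmm", "wait", "alternatively"}  # cue tokens that trigger a probe check
PROBE_LAYERS = (-8, -4, -1)                   # assumed intermediate layers to probe

# Lightweight probe over concatenated hidden states; in LYNX it is trained on
# forced-exit supervision. Left untrained here purely for illustration.
probe = torch.nn.Linear(model.config.hidden_size * len(PROBE_LAYERS), 1)

def exit_score(hidden_states) -> float:
    """Probe's estimate that stopping at the current token yields a correct answer."""
    feats = torch.cat([hidden_states[layer][0, -1] for layer in PROBE_LAYERS])
    return torch.sigmoid(probe(feats)).item()

@torch.no_grad()
def generate_with_early_exit(prompt: str, tau: float, max_new: int = 512) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new):
        out = model(ids)  # full re-forward each step for simplicity (no KV cache)
        # The last position's hidden state belongs to the most recent token; if
        # that token is a reasoning cue, consult the probe before continuing.
        last_word = tok.decode(ids[0, -1:]).strip().lower()
        if last_word in CUE_WORDS and exit_score(out.hidden_states) >= tau:
            break  # confident enough: stop reasoning and answer now
        next_id = out.logits[0, -1].argmax()  # greedy decoding for brevity
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return tok.decode(ids[0], skip_special_tokens=True)
```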
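
Step 3, the conformal wrapper, reduces at inference time to a single exit threshold chosen on held-out calibration data. Here is a minimal split-conformal sketch, under the assumption that the calibration scores are probe outputs at cue points where a forced exit produced a wrong answer:

```python
import numpy as np

def conformal_exit_threshold(premature_scores: np.ndarray, alpha: float) -> float:
    """Split conformal threshold for the exit rule (sketch, not the repo's code).

    premature_scores: probe scores at calibration cue points where forcing an
    exit gave a wrong answer. If calibration and test premature points are
    exchangeable, exiting only when a new score exceeds the returned threshold
    keeps the rate of premature exits at most alpha.
    """
    n = len(premature_scores)
    # Finite-sample corrected (1 - alpha) quantile, capped at the max score.
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(premature_scores, q, method="higher"))

# Usage: calibrate once, then exit whenever exit_score(...) exceeds tau.
tau = conformal_exit_threshold(np.random.rand(500), alpha=0.1)  # dummy scores
```

The user-tunable confidence level mentioned in the card corresponds to `alpha` here: lowering it raises the threshold and makes exits more conservative.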