arxiv:1811.11043

Rotting bandits are not harder than stochastic ones

Published on Nov 27, 2018

Authors:

Michal Valko

Abstract

The FEWA algorithm addresses the non-parametric rotting bandit problem, improving on the regret bound of Levine et al. (2017) and matching known bounds for the stochastic bandit setting.

AI-generated summary

In stochastic multi-armed bandits, the reward distribution of each arm is assumed to be stationary. This assumption is often violated in practice (e.g., in recommendation systems), where the reward of an arm may change whenever it is selected, i.e., the rested bandit setting. In this paper, we consider the non-parametric rotting bandit setting, where rewards can only decrease. We introduce the filtering on expanding window average (FEWA) algorithm, which constructs moving averages of increasing windows to identify arms that are more likely to return high rewards when pulled once more. We prove that for an unknown horizon $T$, and without any knowledge of the decreasing behavior of the $K$ arms, FEWA achieves a problem-dependent regret bound of $\widetilde{\mathcal{O}}(\log(KT))$ and a problem-independent one of $\widetilde{\mathcal{O}}(\sqrt{KT})$. Our result substantially improves over the algorithm of Levine et al. (2017), which suffers regret $\widetilde{\mathcal{O}}(K^{1/3}T^{2/3})$. FEWA also matches known bounds for the stochastic bandit setting, thus showing that rotting bandits are not harder. Finally, we report simulations confirming the theoretical improvements of FEWA.
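The filtering idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' reference implementation: the function name `fewa_sketch`, the `reward_fns` interface, the Gaussian noise model, the confidence width `sqrt(2 * sigma^2 * log(1/delta) / h)`, the schedule `delta = 1 / (K * t^alpha)`, and the default `alpha` are all assumptions made for illustration. In each round the sketch compares the arms on the average of their last `h` rewards for `h = 1, 2, ...`, discards arms that fall too far below the best average, and pulls the first surviving arm that has too few samples for the current window.

```python
import math
import random


def fewa_sketch(reward_fns, horizon, sigma=1.0, alpha=4.0, seed=0):
    """Illustrative filtering on expanding window averages.

    reward_fns[i](n) is a hypothetical interface returning the mean reward
    of arm i after n previous pulls (non-increasing in n for a rotting arm).
    Returns the sequence of pulled arms and the observed rewards per arm.
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    history = [[] for _ in range(n_arms)]  # observed rewards, oldest first
    pulls = []

    def pull(i):
        mean = reward_fns[i](len(history[i]))
        history[i].append(mean + rng.gauss(0.0, sigma))  # assumed Gaussian noise
        pulls.append(i)

    for t in range(1, horizon + 1):
        delta = 1.0 / (n_arms * t ** alpha)  # assumed confidence schedule
        active = list(range(n_arms))
        h = 1  # current window size
        while True:
            # An active arm with fewer than h samples cannot be compared
            # on this window, so it is the arm selected in this round.
            short = [i for i in active if len(history[i]) < h]
            if short:
                pull(short[0])
                break
            # Assumed confidence width for an average of h noisy samples.
            width = math.sqrt(2.0 * sigma ** 2 * math.log(1.0 / delta) / h)
            avg = {i: sum(history[i][-h:]) / h for i in active}
            best = max(avg.values())
            # Keep only arms whose recent average is close to the best one.
            active = [i for i in active if avg[i] >= best - 2.0 * width]
            h += 1
    return pulls, history
```

As a usage example, one could pass `[lambda n: 1.0 if n < 50 else 0.0, lambda n: 0.5]` as `reward_fns`, so that the first arm rots after 50 pulls while the second stays constant. Because the filter looks at the most recent rewards first, a few low rewards from the rotted arm are enough to remove it from the active set, which is the behavior the expanding windows are meant to provide. The sketch recomputes every window average from scratch for readability; an efficient version would maintain running sums.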
