Daily Papers

by AK and the research community

Dec 24

Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments

Consider the problem of exploration in sparse-reward or reward-free environments, such as Montezuma's Revenge. In the curiosity-driven paradigm, the agent is rewarded for how much each realized outcome differs from its predicted outcome. But using predictive error as intrinsic motivation is fragile in stochastic environments, as the agent may become trapped by high-entropy areas of the state-action space, such as a "noisy TV". In this work, we study a natural solution derived from structural causal models of the world. Our key idea is to learn representations of the future that capture precisely the unpredictable aspects of each outcome, which we use as additional input for predictions, such that intrinsic rewards only reflect the predictable aspects of world dynamics. First, we propose incorporating such hindsight representations into models to disentangle "noise" from "novelty", yielding Curiosity in Hindsight: a simple and scalable generalization of curiosity that is robust to stochasticity. Second, we instantiate this framework for the recently introduced BYOL-Explore algorithm as our prime example, resulting in the noise-robust BYOL-Hindsight. Third, we illustrate its behavior under a variety of stochasticities in a grid world, and find improvements over BYOL-Explore in hard-exploration Atari games with sticky actions. Notably, we show state-of-the-art results in exploring Montezuma's Revenge with sticky actions, while preserving performance in the non-sticky setting.

6 authors · Nov 18, 2022
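
The "noisy TV" failure mode described in the abstract above can be illustrated with a toy sketch. This is not the paper's BYOL-Hindsight method: the function names, the scalar predictors, and the learning rates are all assumptions made for illustration, and the noise is handed to the predictor directly instead of being learned as a hindsight representation. The sketch only shows that prediction error stays high on a random outcome, and that it vanishes once the predictor also sees the unpredictable part as input.

```python
import random


def tail_mean(values, n=1000):
    """Average of the last n values."""
    tail = values[-n:]
    return sum(tail) / len(tail)


def curiosity_error(outcome_fn, steps=5000, lr=0.05, seed=0):
    """Train a scalar predictor online and return its late-training squared error.

    The squared prediction error plays the role of the intrinsic reward.
    """
    rng = random.Random(seed)
    prediction = 0.0
    errors = []
    for _ in range(steps):
        outcome = outcome_fn(rng)
        errors.append((outcome - prediction) ** 2)
        prediction += lr * (outcome - prediction)  # move toward the outcome
    return tail_mean(errors)


def deterministic_room(rng):
    """Hypothetical state whose outcome never changes."""
    return 1.0


def noisy_tv(rng):
    """Hypothetical state whose outcome is pure noise, uniform in [0, 1)."""
    return rng.random()


def hindsight_error(steps=5000, lr=0.1, seed=0):
    """Same noisy TV, but the predictor also receives the noise z as input.

    Assumption: z is given directly; the paper learns such a representation.
    The predictor is a linear model, prediction = w * z + b, trained by SGD.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    errors = []
    for _ in range(steps):
        z = rng.random()
        outcome = z  # the outcome is entirely determined by the noise
        diff = outcome - (w * z + b)
        errors.append(diff ** 2)
        w += lr * diff * z
        b += lr * diff
    return tail_mean(errors)


if __name__ == "__main__":
    print("deterministic room:", curiosity_error(deterministic_room))
    print("noisy TV:          ", curiosity_error(noisy_tv))
    print("noisy TV, hindsight:", hindsight_error())
```

In this toy setting, a plain curiosity agent would keep collecting reward from the noisy TV indefinitely, while the hindsight-conditioned predictor leaves no residual error to reward.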

STORI: A Benchmark and Taxonomy for Stochastic Environments

Reinforcement learning (RL) techniques have achieved impressive performance on simulated benchmarks such as Atari100k, yet recent advances remain largely confined to simulation and show limited transfer to real-world domains. A central obstacle is environmental stochasticity, as real systems involve noisy observations, unpredictable dynamics, and non-stationary conditions that undermine the stability of current methods. Existing benchmarks rarely capture these uncertainties and favor simplified settings where algorithms can be tuned to succeed. The absence of a well-defined taxonomy of stochasticity further complicates evaluation, as robustness to one type of stochastic perturbation, such as sticky actions, does not guarantee robustness to other forms of uncertainty. To address this critical gap, we introduce STORI (STOchastic-ataRI), a benchmark that systematically incorporates diverse stochastic effects and enables rigorous evaluation of RL techniques under different forms of uncertainty. We propose a comprehensive five-type taxonomy of environmental stochasticity and demonstrate systematic vulnerabilities in state-of-the-art model-based RL algorithms through targeted evaluation of DreamerV3 and STORM. Our findings reveal that world models dramatically underestimate environmental variance, struggle with action corruption, and exhibit unreliable dynamics under partial observability. We release the code and benchmark publicly at https://github.com/ARY2260/stori, providing a unified framework for developing more robust RL systems.

3 authors · Sep 1
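
Sticky actions, the perturbation named in the abstract above, are commonly defined as follows: at each step, with some probability the environment repeats the previously executed action instead of the one the agent chose. The sketch below is a minimal illustration of that definition only. It is not taken from the STORI code base; the function name `apply_sticky_actions` and the default probability of 0.25 are assumptions.

```python
import random


def apply_sticky_actions(chosen_actions, repeat_prob=0.25, seed=0):
    """Return the actions actually executed under sticky-action noise.

    With probability repeat_prob, the previously executed action is
    repeated in place of the agent's newly chosen action.
    """
    rng = random.Random(seed)
    executed = []
    previous = None
    for action in chosen_actions:
        # The first step has no previous action, so it is never replaced.
        if previous is not None and rng.random() < repeat_prob:
            action = previous
        executed.append(action)
        previous = action
    return executed


if __name__ == "__main__":
    plan = ["left", "jump", "right", "right", "jump", "left"]
    print(apply_sticky_actions(plan))
```

Because the repeated action is the one that was executed, not the one that was chosen, a single repeat can chain across several steps. This is one reason an agent tuned on deterministic dynamics may degrade under this perturbation.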

Motile Bacteria-laden Droplets Exhibit Reduced Adhesion and Anomalous Wetting Behavior

Hypothesis: Bacterial contamination of surfaces poses a major threat to public health. Designing effective antibacterial or self-cleaning surfaces requires understanding how bacteria-laden droplets interact with solid substrates and how readily they can be removed. We hypothesize that bacterial motility critically influences the early-stage surface interaction (i.e., surface adhesion) of bacteria-laden droplets, which cannot be captured by conventional contact angle goniometry. Experiments: Sessile droplets containing live and dead Escherichia coli (E. coli) were studied to probe their wetting and interfacial behavior. Contact angle goniometry was used to probe dynamic wetting, while a cantilever-deflection-based method was used to quantify adhesion. Internal flow dynamics were visualized using micro-particle image velocimetry (micro-PIV) and analyzed statistically. Complementary sliding experiments on moderately wettable substrates were performed to assess contact line mobility under tilt. Findings: Despite lower surface tension, droplets containing live bacteria exhibited lower surface adhesion forces than their dead counterparts, with adhesion further decreasing at higher bacterial concentrations. Micro-PIV revealed that flagellated live E. coli actively resist evaporation-driven capillary flow via upstream migration, while at higher concentrations, collective dynamics emerge, producing spatially coherent bacterial motion despite temporal variability. These coordinated flows disrupt passive transport and promote depinning of the contact line, thereby reducing adhesion. Sliding experiments confirmed enhanced contact line mobility and frequent stick-slip motion in droplets containing live bacteria, even with lower receding contact angles and higher hysteresis. These findings provide mechanistic insight into droplet retention, informing the design of self-cleaning/antifouling surfaces.

4 authors · Oct 28