What can AI Learn from Human Exploration? Intrinsically-Motivated Humans and Agents in Open-World Exploration

16 Dec 2023

The paper "What can AI Learn from Human Exploration? Intrinsically-Motivated Humans and Agents in Open-World Exploration" was selected for an oral presentation at the Intrinsically Motivated Open-Ended Learning workshop at the NeurIPS 2023 conference, held on 12/16/2023 in New Orleans.
In their paper, originally published on 10/20/2023, the authors Yuqing Du, Eliza Kosoy, Alyssa Dayan, Maria Rufova, Pieter Abbeel, and Alison Gopnik compare human and AI agent exploration in a complex, open-ended environment.


Abstract:

What drives exploration? Understanding intrinsic motivation is a long-standing question in both cognitive science and artificial intelligence (AI); numerous exploration objectives have been proposed and tested in human experiments and used to train reinforcement learning (RL) agents. However, experiments in the former are often in simplistic environments that do not capture the complexity of real world exploration. On the other hand, experiments in the latter use more complex environments, yet the trained RL agents fail to come close to human exploration efficiency. To study this gap, we propose a framework for directly comparing human and agent exploration in an open-ended environment, Crafter. We study how well commonly-proposed information theoretic objectives for intrinsic motivation relate to actual human and agent behaviours, finding that human exploration consistently shows a significant positive correlation with Entropy, Information Gain, and Empowerment. Surprisingly, we find that intrinsically-motivated RL agent exploration does not show the same significant correlation consistently, despite being designed to optimize objectives that approximate Entropy or Information Gain. In a preliminary analysis of verbalizations, we find that children’s verbalizations of goals correlate strongly and positively with Empowerment, suggesting that goal-setting may be an important aspect of efficient exploration.