CHAI Papers Published

04 Jun 2022

Here are some of the papers that have been published by CHAI students and affiliates recently: 

ICLR 2022 

Cassidy Laidlaw, Anca Dragan – The Boltzmann Policy Distribution: Accounting for Systematic Suboptimality in Human Models 

This paper introduces a novel model of human behavior, the Boltzmann policy distribution (BPD). By modeling systematically suboptimal human behavior, the BPD supports human action prediction and human-AI cooperation much better than existing methods such as Boltzmann rationality. Despite requiring little or no human data, it performs almost as well as human models that require far more data. 
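
For context, the Boltzmann-rationality baseline mentioned above models a human as noisily optimal at the level of individual actions; in the usual textbook notation (not taken from the paper):

P(a \mid s) \;\propto\; \exp\!\big(\beta\, Q^*(s, a)\big)

where \beta is a rationality (inverse-temperature) coefficient. The BPD instead places a Boltzmann-style distribution over entire policies rather than over individual actions, which is how it can capture suboptimality that is systematic across a whole trajectory rather than independent noise at each step.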

Scientific Reports 

Tom Lenaerts – Inferring strategies from observations in long iterated Prisoner’s dilemma experiments

While many theoretical studies have revealed the strategies that could lead to and maintain cooperation in the Iterated Prisoner’s dilemma, less is known about what human participants actually do in this game and how strategies change when confronted with anonymous partners in each round. Previous attempts used short experiments, made different assumptions about possible strategies, and led to very different conclusions. Presented here are two long treatments that differ in the partner-matching scheme used, i.e. fixed or shuffled partners. Unsupervised methods are used to cluster the players based on their actions, and a Hidden Markov Model is then used to infer the memory-one strategies in each cluster. Analysis of the inferred strategies reveals that fixed partner interaction leads to behavioral self-organization. Shuffled partners generate subgroups of memory-one strategies that remain entangled, apparently blocking the self-selection process that leads to fully cooperating participants in the fixed partner treatment. Analyzing the latter in more detail shows that AllC-, AllD-, TFT- and WSLS-like behavior can be observed. This study also reveals that long treatments are needed, as experiments with fewer than 25 rounds capture mostly the learning phase participants go through in these kinds of experiments. 
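
To make the memory-one strategies named above concrete, here is a minimal illustrative sketch (not the paper's code): each strategy is a probability of cooperating conditioned only on the previous round's outcome, written as (own move, partner's move).

import random

# Memory-one strategies for the Iterated Prisoner's Dilemma:
# probability of cooperating given last round's (own move, partner's move).
MEMORY_ONE_STRATEGIES = {
    "AllC": {"CC": 1.0, "CD": 1.0, "DC": 1.0, "DD": 1.0},
    "AllD": {"CC": 0.0, "CD": 0.0, "DC": 0.0, "DD": 0.0},
    "TFT":  {"CC": 1.0, "CD": 0.0, "DC": 1.0, "DD": 0.0},  # copy the partner's last move
    "WSLS": {"CC": 1.0, "CD": 0.0, "DC": 0.0, "DD": 1.0},  # win-stay, lose-shift (Pavlov)
}

def next_move(strategy, my_last, partner_last):
    """Sample the next move ('C' or 'D') of a memory-one strategy."""
    p_cooperate = MEMORY_ONE_STRATEGIES[strategy][my_last + partner_last]
    return "C" if random.random() < p_cooperate else "D"

print(next_move("TFT", "C", "D"))  # TFT defects after being defected against -> 'D'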

Tom Lenaerts – Delegation to artificial agents fosters prosocial behaviors in the collective risk dilemma 

Home assistant chat-bots, self-driving cars, drones or automated negotiation systems are some of the many examples of autonomous (artificial) agents that have pervaded our society. These agents enable the automation of multiple tasks, saving time and (human) effort. However, their presence in social settings raises the need for a better understanding of their effect on social interactions and how they may be used to enhance cooperation towards the public good, instead of hindering it. To this end, the paper presents an experimental study of human delegation to autonomous agents and hybrid human-agent interactions centered on a non-linear public goods dilemma with uncertain returns in which participants face a collective risk. Its aim is to understand experimentally whether the presence of autonomous agents has a positive or negative impact on social behaviour, equality and cooperation in such a dilemma. The results show that cooperation and group success increase when participants delegate their actions to an artificial agent that plays on their behalf. Yet, this positive effect is less pronounced when humans interact in hybrid human-agent groups, where humans in successful hybrid groups mostly make higher contributions earlier in the game. The paper also shows that participants wrongly believe that artificial agents will contribute less to the collective effort. In general, the results suggest that delegation to autonomous agents has the potential to work as a commitment device, preventing both the temptation to deviate to an alternative (less collectively beneficial) course of action and responses based on betrayal aversion. 
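
For readers unfamiliar with the game, a collective-risk dilemma of this general kind is often formalized as a threshold public goods game with a probabilistic loss. The sketch below is illustrative only, with arbitrary parameter values, and is not the experimental setup used in the paper.

import random

def payoffs(contributions, endowment=40.0, threshold=120.0, risk=0.9):
    """Each player keeps their endowment minus their own contribution; if the
    group total misses the threshold, everyone loses what remains with
    probability `risk` (the collective risk)."""
    kept = [endowment - c for c in contributions]
    if sum(contributions) >= threshold:
        return kept
    disaster = random.random() < risk
    return [0.0 for _ in kept] if disaster else kept

# Example: a 6-player group in which everyone contributes an equal share.
print(payoffs([20.0] * 6))  # threshold met, so each player keeps 20.0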

ICML 2022

Micah Carroll, Dylan Hadfield-Menell, Stuart Russell, Anca Dragan – Estimating and Penalizing Induced Preference Shifts in Recommender Systems 

The content that a recommender system (RS) shows to users influences them. Therefore, when choosing which recommender to deploy, one is implicitly also choosing to induce specific internal states in users. Moreover, systems trained via long-horizon optimization will have direct incentives to manipulate users, e.g. to shift their preferences so they are easier to satisfy. In this work we focus on induced preference shifts in users. We argue that, before deployment, system designers should: estimate the shifts a recommender would induce; evaluate whether such shifts would be undesirable; and even actively optimize to avoid problematic shifts.
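
As a rough illustration of the last point, one way to "actively optimize to avoid problematic shifts" is to score candidate recommendation policies by predicted engagement minus a penalty on the preference shift they are estimated to induce. The sketch below is hypothetical and not the paper's implementation; the function names, the Euclidean shift measure, and the penalty weight are all assumptions made for illustration.

import numpy as np

def score_policy(predicted_engagement, baseline_preferences,
                 predicted_preferences, penalty_weight=1.0):
    """Trade off a policy's predicted engagement against the estimated
    preference shift it would induce, measured here (arbitrarily) as the
    Euclidean distance between user-preference estimates before and after
    deployment."""
    shift = np.linalg.norm(np.asarray(predicted_preferences) -
                           np.asarray(baseline_preferences))
    return predicted_engagement - penalty_weight * shift

# Example: a high-engagement policy that drags preferences far from baseline
# can score worse than a slightly less engaging but non-manipulative one.
print(score_policy(10.0, [0.2, 0.8], [0.9, 0.1], penalty_weight=5.0))
print(score_policy(9.0, [0.2, 0.8], [0.25, 0.75], penalty_weight=5.0))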