Anca Dragan and Andrea Bajcsy from UC Berkeley, along with two other researchers from Rice University, published a paper arXiv called “Learning from Physical Human Corrections, One Feature at a Time” The abstract reads:
We focus on learning robot objective functions from human guidance: specifically, from physical corrections provided by the person while the robot is acting. Objective functions are typically parametrized in terms of features, which capture aspects of the task that might be important. When the person intervenes to correct the robot’s behavior, the robot should update its understanding of which features matter, how much, and in what way. Unfortunately, real users do not provide optimal corrections that isolate exactly what the robot was doing wrong. Thus, when receiving a correction, it is difficult for the robot to determine which features the person meant to correct, and which features were changed unintentionally. In this paper, we propose to improve the efficiency of robot learning during physical interactions by reducing unintended learning. Our approach allows the human-robot team to focus on learning one feature at a time, unlike state-of-the-art techniques that update all features at once. We derive an online method for identifying the single feature which the human is trying to change during physical interaction, and experimentally compare this one-at-a-time approach to the all-at-once baseline in a user study. Our results suggest that users teaching one-at-a-time perform better, especially in tasks that require changing multiple features.