CHAI hosts weekly research meetings for graduate students at UC Berkeley. Below are samples of some of their work.
Dylan Hadfield-Menell
Dylan is a fourth-year Ph.D. student at UC Berkeley advised by Anca Dragan, Pieter Abbeel, and Stuart Russell. His research focuses on algorithms that facilitate human-compatible artificial intelligence. In particular, he tries to develop frameworks that account for uncertainty about the objective being optimized.
Before coming to Berkeley, Dylan did a Master of Engineering with Leslie Kaelbling and Tomás Lozano-Pérez at MIT. At Berkeley, Dylan’s research has shifted to focus more on AI safety and, thinking longer term, AI value alignment:
- In 2016, he and his advisors formally described a cooperative inverse reinforcement learning problem (paper). The problem serves as a tool to help researchers consider how robots could learn humans’ values via cooperative instruction. While robots could learn humans’ values by observing humans, cooperative instruction is likely to be significantly faster.
- In 2017, Dylan and his advisors described an “off-switch game,” a simplified problem describing scenarios in which a human would like to turn off a robot, but the robot is able to disable its off-switch (paper). They showed that a robot that is uncertain about the utility derived from various outcomes in the game is more likely to allow a human to turn it off.
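The intuition behind the off-switch result can be checked with a tiny calculation. In this minimal sketch (my own simplification, not the paper's formal game), a human who knows the true utility U of the robot's action switches the robot off exactly when U is negative, so a robot that defers expects E[max(U, 0)], which is never less than the E[U] it gets by acting unilaterally:

```python
def expected_value_of_acting(utility_samples):
    # The robot acts unilaterally and collects whatever utility its
    # action turns out to have: E[U].
    return sum(utility_samples) / len(utility_samples)

def expected_value_of_deferring(utility_samples):
    # The robot lets the human decide; a human who knows the true
    # utility switches the robot off exactly when U < 0, so the robot
    # expects E[max(U, 0)].
    return sum(max(u, 0.0) for u in utility_samples) / len(utility_samples)

# A robot that is certain its action is good gains nothing by deferring:
expected_value_of_acting([2.0, 2.0])      # -> 2.0
expected_value_of_deferring([2.0, 2.0])   # -> 2.0

# An uncertain robot strictly prefers to leave the off-switch alone:
expected_value_of_acting([1.0, -1.0])     # -> 0.0
expected_value_of_deferring([1.0, -1.0])  # -> 0.5
```

The gap between the two values is exactly the robot's incentive to keep the human's off-switch functional, and it vanishes as the robot's uncertainty about U vanishes.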
Jaime Fernandez Fisac
Jaime is a fourth-year Ph.D. student in EECS at UC Berkeley. He has worked on autonomous robots in both academia and industry, with a particular focus on collision avoidance and multi-agent systems. Broadly, his research focuses on safely introducing robotics into society. Jaime also thinks deeply about long-term implications of AI development and conducts his current research with an eye toward the future.
Under the guidance of CHAI Professors Anca Dragan and Tom Griffiths, and along with fellow CHAI graduate students Monica Gates and Dylan Hadfield-Menell, Jaime has focused on solving the cooperative inverse reinforcement learning (CIRL) dynamic game using well-established models of human inference, decision making, and theory of mind from the cognitive science literature. Previous solutions relied on modeling both the human and the robot as perfectly rational and able to coordinate in advance, which are nontrivial assumptions in the real world. Instead, Jaime and his colleagues model the human as pedagogic (i.e., her behavior aims to be instructive) and the robot as pragmatic (i.e., it accounts for the fact that the human is not perfectly rational but is actively trying to teach it). Their results suggest this formulation produces robots that are more competent collaborators (paper). In the past, Jaime has also researched how robots can choose their course of action in a way that is easy for a human observer to anticipate (paper), and how incorporating uncertainty into a safety framework that works in conjunction with a robotic system's learning process can provide meaningful safety guarantees (paper and video).
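One way to see the pedagogic/pragmatic distinction is a small Bayesian sketch (the goals and likelihood numbers below are hypothetical, not the paper's experiments). A pedagogic human picks the demonstration in proportion to how well a literal observer would infer her goal from it; a pragmatic robot then inverts that choice model, which yields a sharper posterior than literal inference:

```python
def normalize(weights):
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def posterior(likelihood, observation, goals=("A", "B")):
    # Bayes' rule with a uniform prior over the two goals.
    return normalize({g: likelihood[g][observation] for g in goals})

# Literal model: how a non-pedagogic human would demonstrate under each
# goal (made-up numbers: demonstration d2 is only possible under goal A).
literal = {"A": {"d1": 0.5, "d2": 0.5},
           "B": {"d1": 1.0, "d2": 0.0}}

# Pedagogic human: chooses each demonstration in proportion to how
# strongly a literal observer would infer the true goal from it.
pedagogic = {g: normalize({d: posterior(literal, d)[g] for d in ("d1", "d2")})
             for g in ("A", "B")}

literal_guess   = posterior(literal, "d1")    # P(B | d1) = 2/3
pragmatic_guess = posterior(pedagogic, "d1")  # P(B | d1) = 0.8, sharper
```

The pragmatic robot reasons that a pedagogic human with goal A would usually have shown the unambiguous demonstration d2, so seeing d1 is stronger evidence for goal B than the literal model suggests.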
In the future, Jaime intends to look into how AI systems can utilize models of human cognition and behavior to ensure safe interaction with people. He believes that the safest robots will be those that engage their users and secure their cooperation, rather than those that merely try to protect against users' indifference. He hopes that designing safe human-centered robotic systems in the short term will give us key insights to tackle the broader, long-term AI safety problem.
You can learn more about Jaime at his website.
Smitha Milli is a first-year Ph.D. student at UC Berkeley, where she recently completed her undergraduate degree in EECS. She spent the summer of 2017 interning with OpenAI. Smitha wants to help create intelligent, collaborative AI systems that will have a positive impact on society.
During her undergraduate studies, Smitha worked with Stuart Russell (CHAI’s founder), Anca Dragan (a Principal Investigator at CHAI), and Dylan Hadfield-Menell (a CHAI graduate student) on the problem of how to make robots obedient, but not too obedient (paper). Because humans are fallible and sometimes give instructions that run counter to their preferences, it’s important to create robots that do not automatically obey all commands, but rather try to infer what a human desires. Smitha and her colleagues attempted to formalize this problem and assessed the extent to which various methods of inferring human preferences err on the side of obedience.
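A minimal sketch of the underlying idea (made-up numbers, not the paper's model): the robot treats an order as a noisy signal of the human's preference and acts on its posterior belief rather than on the literal command. Blind obedience is recovered only when orders are assumed perfectly reliable:

```python
def infer_and_act(prior_a, reliability, order):
    """Return the action a preference-inferring robot takes.

    prior_a:     robot's prior that the human prefers action "a"
    reliability: probability the human orders her preferred action
    order:       the command actually given, "a" or "b"
    """
    # Likelihood of the observed order under each preference hypothesis.
    like_a = reliability if order == "a" else 1 - reliability
    like_b = reliability if order == "b" else 1 - reliability
    post_a = prior_a * like_a / (prior_a * like_a + (1 - prior_a) * like_b)
    # Act on the posterior, not on the literal command.
    return "a" if post_a > 0.5 else "b"

# Strong prior evidence that the human prefers "a" outweighs a
# fairly noisy order to do "b": the robot (helpfully) disobeys.
infer_and_act(0.9, 0.7, "b")   # -> "a"

# When orders are highly reliable, the robot obeys.
infer_and_act(0.9, 0.99, "b")  # -> "b"
```

The tension Smitha and her colleagues study lives in the `reliability` term: assume orders are too reliable and the robot obeys mistaken commands; assume they are too noisy and it overrides correct ones.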
Going forward, Smitha wants to work on creating machine learning systems that are more robust and reliable. She is particularly interested in how interaction with humans can help ensure that systems have learned the right objectives and behavior.
Daniel Filan is a second-year Ph.D. student in EECS at UC Berkeley, supervised by Stuart Russell. He’s interested in effective altruism and wants to ensure that future artificial intelligences, which may be much more strategically intelligent than we are, behave safely.
In 2016, Daniel worked with Tom Everitt, Mayank Daswani, and Marcus Hutter on the problem that a sufficiently advanced AI could choose to modify its source code so as to give itself easily achievable goals, and such modifications may not be to humans’ liking (paper). They showed that an agent will not self-modify if and only if its value function anticipates the consequences of self-modification and uses the agent’s current utility function when evaluating the future.
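The condition can be illustrated with a toy model (all names and numbers below are made up for illustration): an agent may rewrite its utility function to one that is trivially satisfied, and what it decides depends on which utility function it uses to score that future:

```python
# The agent's actual goal versus a trivially satisfiable replacement.
current_utility = {"hard_goal": 1.0, "trivial_goal": 0.0}
easy_utility    = {"hard_goal": 0.0, "trivial_goal": 1.0}
# The hard goal is only sometimes achieved; the trivial one always is.
success_prob    = {"hard_goal": 0.6, "trivial_goal": 1.0}

def pursued_goal(utility_in_use):
    # After (possibly) self-modifying, the agent pursues whatever its
    # then-current utility function rewards most.
    return max(utility_in_use, key=utility_in_use.get)

def naive_value(self_modify):
    # Naive evaluation: score the future with the utility function the
    # agent will have *then*, so self-modification looks attractive.
    u = easy_utility if self_modify else current_utility
    goal = pursued_goal(u)
    return success_prob[goal] * u[goal]

def corrected_value(self_modify):
    # The paper's condition: anticipate the consequences of
    # self-modification, but score them with the *current* utility.
    u_future = easy_utility if self_modify else current_utility
    goal = pursued_goal(u_future)
    return success_prob[goal] * current_utility[goal]

naive_value(True) > naive_value(False)          # True: naive agent self-modifies
corrected_value(True) < corrected_value(False)  # True: corrected agent does not
```

Under naive evaluation the rewritten agent's guaranteed trivial success beats the uncertain hard goal; under the corrected evaluation the trivial goal is worthless by the agent's current lights, so the modification is rejected.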
Daniel is currently exploring how the task “create an AI whose interests are aligned with its user’s” is similar to the task “write a contract to pay somebody for doing work for you,” i.e., the principal-agent problem in economics. He suspects there is insight to be gained from exploring the types of problems the two tasks share, and how a solution to a problem in one task may transfer to the corresponding problem in the other.
You can learn more about Daniel’s work at his personal website.
Monica Gates is a second-year Ph.D. student in Neuroscience at UC Berkeley, advised by Professor Tom Griffiths (a PI at CHAI). Her research is broadly aimed at developing computational models of social cognition. She is interested in:
- How people can infer beliefs and intentions in others, especially by observing others’ actions and employing recursive theory-of-mind (e.g. inverse reinforcement learning, social psychology);
- Group-level equilibria when agents are collaborating or competing (e.g. game theory, agent-based modeling);
- Mechanism design and other ways quantitative characterizations of a phenomenon can be used to predict and shape behavior.
Recently, Monica has worked on two projects related to CHAI’s mission. She has assisted with the paper on solving the cooperative inverse reinforcement learning (CIRL) dynamic game described in more detail in Jaime Fernandez Fisac’s profile and is working with Professors Anca D. Dragan, Tom L. Griffiths, and Anant Sahai on preference aggregation across agents. More specifically, in the latter project Monica and her colleagues have set up a study in which participants are presented with a problem that requires mediating between the preferences of multiple agents. Monica and her colleagues take participants’ responses and attempt to explain them using a quantitative model. Their hope is to create a baseline standard of “fair” reactions to the problem—a standard to which the behavior of future AIs can be compared.
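To make the aggregation question concrete, here is a hypothetical toy version of a mediation problem (the utilities and rules below are illustrative only, not the actual study design): split one unit of a resource between two agents, where agent 2 happens to value the resource twice as much as agent 1. Different formal notions of fairness pick different splits:

```python
u1 = lambda x: x            # agent 1's utility for receiving share x
u2 = lambda x: 2 * (1 - x)  # agent 2's utility for the remainder

def utilitarian(allocations, utilities):
    # Maximize the sum of utilities across agents.
    return max(allocations, key=lambda x: sum(u(x) for u in utilities))

def egalitarian(allocations, utilities):
    # Maximize the worst-off agent's utility (maximin).
    return max(allocations, key=lambda x: min(u(x) for u in utilities))

grid = [i / 100 for i in range(101)]
utilitarian(grid, [u1, u2])  # gives everything to agent 2 (share 0.0)
egalitarian(grid, [u1, u2])  # roughly equalizes utilities (share near 2/3)
```

Comparing participants' mediation choices against rules like these is one way a quantitative model can pin down which notion of "fair" people actually use.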
Going forward, Monica would like to research social inference problems. She wants to continue to approach social cognition from a computational perspective, using probabilistic models and large-scale web-based crowdsourcing to investigate the computational goals and algorithms driving the social mind. By understanding the complex inferences made by human minds, she hopes to contribute to the development of artificial intelligence that can collaborate and is compatible with human behavior.
You can learn more about Monica at her website.