News

Embracing AI That Reflects Human Values: Insights from Brian Christian’s Journey

28 Mar 2024

Discover how acclaimed author Brian Christian’s quest for deeper understanding could lead to AI systems that truly mirror human values and decisions.

When Your AIs Deceive You: Challenges with Partial Observability of Human Evaluators in Reward Learning

05 Mar 2024

Researchers at the Center for Human-Compatible AI (CHAI) at the University of California, Berkeley, have embarked on a study that brings to light the nuanced challenges encountered when AI systems learn from human feedback, especially under conditions of partial observability.

The Prosocial Ranking Challenge – $60,000 in prizes for better social media algorithms

18 Jan 2024

Deadline extended! First round submissions now due April 15th. See below.

Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning

16 Jan 2024

How can a robot self-assess whether it has received enough demonstrations from an expert to ensure a desired level of performance? The authors of this paper examine the problem of determining demonstration sufficiency.

ALMANACS: A Simulatability Benchmark for Language Model Explainability

20 Dec 2023

How do we measure the efficacy of language model explainability methods? The authors of this paper present ALMANACS, a language model explainability benchmark that scores explainability methods on simulatability.

What can AI Learn from Human Exploration? Intrinsically-Motivated Humans and Agents in Open-World Exploration

16 Dec 2023

In their paper, which was selected for oral presentation at IMOL@NeurIPS2023, the authors compare human and AI agent exploration in a complex, open-ended environment.

AI heralds a ‘fourth industrial revolution.’ Why isn’t America regulating it?

11 Dec 2023

The current approach to AI is a reflection of enormous power imbalances between the tech giants and national governments. What happens when a globe-spanning corporation becomes so powerful that even nations must answer to it?

Mitigating Generative Agent Social Dilemmas

04 Dec 2023

The authors of this paper find evidence that social dilemmas involving generative agents can be mitigated with contracting and negotiation.

Orienting AI Toward Peace

21 Nov 2023

Jonathan Stray gave a talk outlining a three-part strategy to ensure that AI systems do not inadvertently escalate political conflict as a result of misaligned optimization and are resistant to bad conflict actors.

Human Compatible has been Reissued in the UK in 2023

15 Nov 2023

On September 28th, 2023, Stuart Russell’s book “Human Compatible: AI and the Problem of Control” was updated and reissued in the UK.
