Prominent AI Scientists from China and the West Propose Joint Strategy to Mitigate Risks from AI
31 Oct 2023
DITCHLEY PARK, UNITED KINGDOM – Ahead of the highly anticipated AI Safety Summit, leading AI scientists from the US, the PRC, the UK, and other countries agreed on the importance of global cooperation and jointly called for research and policies to prevent unacceptable risks from advanced AI.
Prominent scientists from the US, the PRC, the UK, Europe, and Canada gathered for the first “International Dialogue on AI Safety”. The meeting was convened by Turing Award recipients Yoshua Bengio and Andrew Yao, UC Berkeley professor Stuart Russell OBE, and Ya-Qin Zhang, founding Dean of the Institute for AI Industry Research (AIR) at Tsinghua University. The event took place earlier this month at Ditchley Park near Oxford. Attendees worked to build a shared understanding of risks from advanced AI systems, inform intergovernmental processes, and lay the foundations for further cooperation to prevent worst-case outcomes from AI development.
The expert attendees warned governments and AI developers that “coordinated global action on AI safety research and governance is critical to prevent uncontrolled frontier AI development from posing unacceptable risks to humanity.” Attendees produced a joint statement with specific technical and policy recommendations, which is attached below. Prof. Zhang remarked that it is “crucial for governments and AI corporations to invest heavily in frontier AI safety research and engineering”, while Prof. Yao stressed the importance that we “work together as a global community to ensure the safe progress of AI.” Prof. Bengio called upon AI developers to “demonstrate the safety of their approach before training and deploying” AI systems, while Prof. Russell concurred that “if they cannot do that, they cannot build or deploy their systems. Full stop.”
ABOUT THE INTERNATIONAL DIALOGUES ON AI SAFETY
The International Dialogues on AI Safety are a new initiative bringing together scientists from around the world to collaborate on mitigating the risks of artificial intelligence. The event was held in partnership with the Center for Human-Compatible AI, FAR AI, and the Ditchley Foundation.
ABOUT THE CENTER FOR HUMAN-COMPATIBLE AI
CHAI is a multi-institution research group based at UC Berkeley, with academic affiliates at a variety of other universities. CHAI’s goal is to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems.
ABOUT FAR AI
FAR AI is a non-profit organization working to ensure AI systems are trustworthy and beneficial to society. FAR AI incubates and accelerates research agendas that are too resource-intensive for academia but not yet ready for commercialization by industry.
ABOUT THE DITCHLEY FOUNDATION
Ditchley is an independent foundation working towards the renewal of democratic societies, states and alliances by bringing people together for frank conversations across divides and creating space for strategic thinking.
PRESS CONTACT
Fynn Heide
idais@far.ai
ENGLISH STATEMENT
Coordinated global action on AI safety research and governance is critical to prevent uncontrolled frontier AI development from posing unacceptable risks to humanity.
Global action, cooperation, and capacity building are key to managing risk from AI and enabling humanity to share in its benefits. AI safety is a global public good that should be supported by public and private investment, with advances in safety shared widely. Governments around the world — especially of leading AI nations — have a responsibility to develop measures to prevent worst-case outcomes from malicious or careless actors and to rein in reckless competition. The international community should work to create an international coordination process for advanced AI in this vein.
We face near-term risks from malicious actors misusing frontier AI systems: the safety filters developers currently integrate are easily bypassed. Frontier AI systems produce compelling misinformation and may soon be capable enough to help terrorists develop weapons of mass destruction. Moreover, there is a serious risk that future AI systems may escape human control altogether. Even aligned AI systems could destabilize or disempower existing institutions. Taken together, we believe AI may pose an existential risk to humanity in the coming decades.
In domestic regulation, we recommend mandatory registration for the creation, sale, or use of models above a certain capability threshold, including open-source copies and derivatives, to give governments critical and currently missing visibility into emerging risks. Governments should monitor large-scale data centers and track AI incidents, and should require that developers of frontier models undergo independent third-party audits evaluating their information security and model safety. AI developers should also be required to share with relevant authorities their comprehensive risk assessments, risk-management policies, and predictions of their systems’ behavior in third-party evaluations and post-deployment.
We also recommend defining clear red lines that, if crossed, mandate immediate termination of an AI system — including all copies — through rapid and safe shut-down procedures. Governments should cooperate to instantiate and preserve this capacity. Moreover, prior to deployment as well as during training for the most advanced models, developers should demonstrate to regulators’ satisfaction that their system(s) will not cross these red lines.
Reaching adequate safety levels for advanced AI will also require immense research progress. Advanced AI systems must be demonstrably aligned with their designers’ intent, as well as with appropriate norms and values. They must also be robust against both malicious actors and rare failure modes, and sufficient human control over these systems must be ensured. Concerted effort by the global research community, in AI and other disciplines, is essential; we need a global network of dedicated AI safety research and governance institutions. We call on leading AI developers to commit a minimum of one third of their AI R&D spending to AI safety, and on government agencies to fund academic and non-profit AI safety and governance research in at least the same proportion.
CHINESE STATEMENT (ENGLISH TRANSLATION)
Coordinated global action on AI safety research and governance is the key to preventing uncontrolled frontier AI development from posing intolerable risks to all of humanity.
Global action, cooperation, and capacity building are key to managing AI risk and enabling all of humanity to share in the fruits of AI development. AI safety is a global common good that should be supported by public and private investment, with safety-related advances shared widely. Governments around the world, especially those of the leading AI nations, have a responsibility to develop measures to prevent worst-case outcomes from malicious or irresponsible actors and to rein in reckless competition. The international community should work together on this issue to establish an international coordination process for frontier AI.
The risk of malicious misuse of frontier AI systems is already at hand: the safety measures currently adopted by developers can be easily breached. Frontier AI systems can create convincing misinformation, and may soon be capable of helping terrorists develop weapons of mass destruction. Moreover, there is a grave risk that future AI systems may escape human control entirely. Even AI systems aligned with humans could undermine or weaken existing social institutions. Taken together, we believe that AI will pose an existential risk to all of humanity in the coming decades.
In government regulation, we recommend mandatory registration of the creation, sale, and use of AI systems that exceed certain capability thresholds, including their open-source copies and derivatives, to provide governments with critical but currently missing visibility into emerging risks. Governments should monitor large-scale data centers and track AI incidents, and should require developers of frontier AI models to undergo independent third-party audits evaluating their information security and model safety. AI developers should also be required to provide relevant authorities with comprehensive risk assessments, risk-management policies, and predictions of their systems’ behavior in third-party evaluations and after deployment.
We also recommend defining clear red lines and establishing rapid and safe termination procedures: once an AI system crosses such a red line, that system and all of its copies must be shut down immediately. Governments should cooperate to establish and maintain this capacity. In addition, during the training of the most advanced models and prior to their deployment, developers must demonstrate to regulators that their systems will not cross these red lines in order to obtain regulatory approval.
Making frontier AI adequately safe will still require major research progress. Frontier AI systems must be demonstrably aligned with their designers’ intent as well as with social norms and values. They must also remain robust against malicious attacks and rare failure modes. We must ensure that these systems remain under sufficient human control. Concerted effort by the global research community, in AI and other disciplines, is essential: we need a global network of institutions dedicated to AI safety research and governance. We call on leading AI developers to commit at least one third of their AI R&D funding to AI safety research, and on government agencies to fund academic and non-profit AI safety and governance research in at least the same proportion.
SIGNATORIES
Yoshua Bengio: Scientific Director and Founder, Montreal Institute for Learning Algorithms; Professor, Department of CS and Operations Research, Université de Montréal; Turing Award Recipient
Stuart Russell: Professor of EECS, UC Berkeley; Founder and Head, Center for Human-Compatible Artificial Intelligence; Director, Kavli Center for Ethics, Science, and the Public
Andrew Yao: Dean, Institute for Interdisciplinary Information Sciences, Tsinghua University; Distinguished Professor-At-Large, The Chinese University of Hong Kong; Professor, Center for Advanced Study, Tsinghua University; Turing Award Recipient
Ya-Qin Zhang: Chair Professor of AI Science, Tsinghua University; Dean, Institute for AI Industry Research (AIR), Tsinghua University; Former President, Baidu
Ed Felten: Robert E. Kahn Professor of Computer Science and Public Affairs, Princeton University; Founding Director, Center for Information Technology Policy, Princeton University
Roger Grosse: Associate Professor of Computer Science, University of Toronto; Founding Member, Vector Institute
Gillian Hadfield: Schwartz Reisman Chair in Technology and Society, University of Toronto Faculty of Law; Director, Schwartz Reisman Institute for Technology and Society; AI2050 Senior Fellow
Dylan Hadfield-Menell: Bonnie and Marty (1964) Tenenbaum Career Development Assistant Professor of EECS, MIT; Lead, Algorithmic Alignment Group, Computer Science and Artificial Intelligence Laboratory (CSAIL)
Yang-Hui He: Fellow, London Institute
Sana Khareghani: Professor of Practice in AI, King’s College London; AI Policy Lead, Responsible AI UK; Former Head, UK Government Office for Artificial Intelligence
Karine Perset
Elizabeth Seger: Research Scholar, Centre for the Governance of AI
Dawn Song: Professor of EECS, UC Berkeley; Founder, Oasis Labs
Max Tegmark: Professor, MIT Center for Brains, Minds & Machines; President and Co-founder, Future of Life Institute
Yi Zeng: Professor and Director, Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences; Founding Director, Center for Long-term AI
HongJiang Zhang: Chairman, Beijing Academy of AI
Xin Chen: PhD Student, ETH Zurich
Adam Gleave: Founder and CEO, FAR AI
Fynn Heide: Research Scholar, Centre for the Governance of AI