
AI Will Not Want To Self-Improve

Peter N. Salib
Monday, May 20, 2024, 5:00 AM
Classic arguments for AI risk assume that capable, goal-seeking systems will naturally attempt to improve themselves, but a closer look at the operative incentives reveals a more complicated story.

Published by The Lawfare Institute in Cooperation With Brookings

In foundational accounts of AI risk, the prospect of AI self-improvement looms large. The idea is simple. For any capable, goal-seeking system, the system’s goal will be more readily achieved if the system first makes itself even more capable. Having become somewhat more capable, the system will be able to improve itself again. And so on, possibly generating a rapid explosion of AI capabilities, resulting in systems that humans cannot hope to control.

This paper argues that explosive cycles of AI self-improvement are less likely than existing accounts commonly assume. The reason is not that AI systems will never become capable enough to do cutting-edge machine learning research. Rather, there are previously unrecognized incentives cutting against AI self-improvement.

Despite being unappreciated, the incentives against AI self-improvement are familiar. They are the same ones that cut against humans building AI systems more capable than ourselves. Currently, there is no reliable way to ensure that such systems are “aligned” with their creators—that they share their creators’ goals. Absent a major breakthrough in the science of AI alignment, any entity—human or artificial—that makes an AI more capable than itself risks generating not a helpful agent, but a powerful competitor and an existential threat.

You can find that paper here or below:


Peter N. Salib is an Assistant Professor of Law at the University of Houston Law Center and Affiliated Faculty at the Hobby School of Public Affairs. He thinks and writes about constitutional law, economics, and artificial intelligence. His scholarship has been published in, among others, the University of Chicago Law Review, the Northwestern University Law Review, and the Texas Law Review.
