Solving The AI Control Problem Amplifies The Human Control Problem

The AI Control Problem is the task of controlling what an AI system will do and its outcomes, especially in the case of self-improving superintelligent AGI systems¹.

There are reasons to believe that no complete solution exists; however, let’s explore the assumption that a solution is found. Humans have discovered a way to guarantee that self-improving superintelligent AGI systems will do the good deeds we wish, without silly misunderstandings causing unintended harm.

Now what if someone wishes to use an AI for personally good deeds, such as a larger version of the WannaCry ransomware attack that affected many NHS² computers? Sure, your wish is my command!

Or did we solve the AI Goodness Problem as well as the control problem, so that AI will only do good and there is no easy way to change the source code to get it to do precisely what its controllers want? If one finds a way to create uncontrolled bodhisattva AGI, then perhaps these BGI³ systems will generally do what we wish for without harmful misunderstandings, provided only that there is net benefit. So I can imagine the BGI problem could be more figureoutable than the control problem 😉.

Back in the world of the AI Control Solution, any human with sufficient resources can unleash an ASI⁴ on any task desired. The cost to run an ASI (adequate for many tasks) is likely to decrease. Moreover, even if the initial AGI code is closed-source, protecting it against hacking may be difficult. Earlier versions of the code can be applied to self-improve their open-source code bases. Will we have hyper-ASI-backed monitoring and flagging of all open ASI code? Unlike with the regulation of nuclear technology, over 1/3 of humans own computers. How strict will monitoring and controls have to be to prevent undesired ASI use? Will we perhaps head in the direction of the 1984 surveillance state (as warned of in The Transparent Society)?

What about nation-states and large corporations? Unless we converge into a benevolent dictatorship run by the first ASI’s owners, they’ll surely wind up with control of their own ASI. Will some lock in totalitarianism for the long run? North Korea might be happy to, perhaps even taking over South Korea if they get access first, which they might, being effective hackers. What can we do to stop it? Threaten nuclear war? Run ASI on more powerful hardware to find a solution and out-hack them?

What we find is that solving the AI Control Problem lifts the problem of increasing our own safety in the face of superintelligence to the age-old Human Control Problem of coordinating humans in a harmonious manner such that we treat each other well, do good, productive acts, and avoid harm⁵.

We have not solved the Human Control Problem yet. We do seem to know much more about it, at both small and large scales, than we do about the AI Control Problem, and homicide rates are falling in the developed world. Progress. Yet look at how we’re engaged in so many wars (and the US’s involvement isn’t even tagged 🤣). Putting ASI control in the hands of humans amplifies the Human Control Problem in various ways that may not be easy to predict⁶.

Thus the question arises: do we want total control of AGI systems⁷?
