Solving The AI Control Problem Amplifies The Human Control Problem

The AI Control Problem is the task of controlling what an AI system will do and what outcomes it produces, especially in the case of self-improving superintelligent AGI systems. (Sometimes this is vaguely weakened to simply ensuring that AI systems don't harm people and do benefit them.)

There are reasons to believe that no complete solution exists; nevertheless, let's explore the assumption that a solution is found: humans have discovered a way to guarantee that self-improving superintelligent AGI systems will do the good deeds we wish, without silly misunderstandings causing unbidden harm.

Now what if someone wishes to use an AI for deeds that benefit them personally, such as a larger version of the WannaCry ransomware attack that affected many NHS (the UK's National Health Service) computers? Sure, your wish is my command!

Or did we solve the AI Goodness Problem as well as the control problem, so that AI will only do good and there is no easy way to change the source code to get it to do precisely what its controllers want? If one finds a way to create uncontrolled bodhisattva AGI, then perhaps these BGI (Beneficial General Intelligence; Bodhisattvic General Intelligence systems are probably a subclass) systems will generally do what we wish for without harmful misunderstandings, provided there is a net benefit. So I can imagine the BGI problem could be more figureoutable than the control problem 😉.

Returning to the AI Control Solution: we find ourselves in a world where any human with sufficient resources can unleash an ASI (Artificial SuperIntelligence) on any tasks desired. The cost to run an ASI (adequate for many tasks) is likely to decrease. Moreover, even if the initial AGI code is closed-source, protecting it against hacking may be difficult, and earlier versions of the code could be applied to self-improve open-source code bases. Will we have hyper-ASI-backed monitoring and flagging of all open ASI code? Unlike with the regulation of nuclear technology, over a third of humans own computers. How strict will monitoring and controls have to be to prevent undesired ASI use? Perhaps we head in the 1984 surveillance-state direction (as warned of in The Transparent Society)?

What about nation-states and large corporations? Unless we converge into a benevolent dictatorship run by the first ASI's owners, they'll surely wind up with ASI control of their own. Will some lock in totalitarianism for the long run? North Korea might be happy to, perhaps even taking over South Korea if they gain access first, which they might, given their effective hackers. What can we do to stop it? Threaten nuclear war? Run an ASI on more powerful hardware to find a solution and out-hack them?

What we find is that solving the AI Control Problem lifts the problem of increasing our own safety in the face of superintelligence up to the age-old Human Control Problem: coordinating humans in a harmonious manner such that we treat each other well, do good and productive deeds, and avoid harm. (Notice how thinking of ourselves as beings worthy of care leads to a less spooky, more respectful wording of what is essentially the same problem?)

We have not solved the Human Control Problem yet. We do seem to know much more about it at both small and large scales than we do about the AI Control Problem, and homicide rates are falling in the developed world. Progress. Yet look at how many wars we're engaged in (and the US's involvement isn't even tagged 🤣). Putting ASI control in the hands of humans amplifies the Human Control Problem in various ways that may not be easy to predict. (Mustafa Suleyman's The Coming Wave does a better job of discussing this conundrum.)

Thus the question arises: do we want total control of AGI systems? (David Brin argues at BGI24 and in "Wired: Give Every AI a Soul—or Else" that we should extend the current apparatus for managing systems of general intelligences, namely humans, to AIs. One step could be to artificially individuate AIs with hardware-embedded identities. This way the AGIs are kept in check by each other without being outright controlled. On the other hand, will this incentivize AGI systems toward an Accelerationist Capitalist Singularity of high-frequency trading à la Accelerando?)