For the purposes of this article, I’ll adopt a traditional simple virtue-based rule-of-thumb to eyeball the morality of an entity or complex of behaviors: is it altruistic or self-serving?
On the side of radical self-service, please consult the dark factor, which is defined as “The general tendency to maximize one’s individual utility — disregarding, accepting, or malevolently provoking disutility for others —, accompanied by beliefs that serve as justifications.”
A good accompanying question: is this system aiming to control an entity or to support the entity in following its own determinations? I will leave the details of qualifying thresholds, intents vs. consequences, etc. to the reader.
Now take a look at some AI Ethics Lore:
- “Friendly AI” picks out agents that are “safe” and “useful” rather than colloquially “friendly”. Trying to create only AI agents that serve humans . . ..
- Another example in the comic direction is how a presenter at AGI-23 even suggested making “artificially obese” AI to reduce the risk of runaway AI taking over . . ..
- Stuart Russell’s book, “Human Compatible: Artificial Intelligence and the Problem of Control”. Need I say more?
- Russell’s three principles are actually befitting of an altruistic agent, except that they hard-wire in a focus solely on human preferences, e.g., “The machine’s only objective is to maximize the realization of human preferences.” What do we call it when we aim to create a being whose only objective is to serve us? Creating a “happy slave” . . ..
- Worthy of note is that Asimov’s Three Laws of Robotics are obviously self-centered from the perspective of humanity, painting robots as second-priority entities to humans:
1. Robots must not allow humans to be harmed.
2. Robots must obey humans (unless violating (1)).
3. Robots must protect themselves (unless violating (1) or (2)).
As with Russell’s version, the human-centrism can be easily abstracted out, which is done in the Foundation series: “Gaia may not harm life or allow life to come to harm.”
- “AI Alignment” is also a mixed bag. A focus on specifying desired objectives with care (outer alignment) is great. When I hear “AI Alignment”, though, I think of inner alignment: “the capacity of an entity to take actions in line with what it wishes to do, with its own values.”
Most of the article focuses on outer alignment, which deals with clear goal specifications, legal codes, law enforcement, etc. Many of these concerns are already issues for humans: for example, people do game legal specifications and don’t respond as intended to incentive programs. We’re still struggling to figure out how to deal with power-seeking humans, especially when there are disparities of wealth, advanced technology, power, and intelligence, all of which can become more extreme with AI. And we do generally wish for our children to grow up to be largely value-aligned with us, something many kids find frustrating: it can feel as if an external entity is trying to forcibly align them to itself, at which point alignment enters the domain of control.
So are we discussing general alignment problems for co-existing in a diverse civilization, or are we asking how to, as the PRC recommends, ensure that “AI abides by shared human values [and] is always under human control”?
The common thread is the tendency to adopt an objectifying perspective from a power ethics stance: how can we, humanity, engineer and control AI and robots for our own benefits first and foremost?
Do we know that these objectifying assumptions hold? When considering ethical applications and development of present-day narrowish AI, maybe they’re probably approximately correct enough. When beginning to speak of autonomous artificial superintelligence, however, these assumptions become very weak, and we face the challenge of developing theories of AI Ethics that grant consideration to worthy AI and robot entities of machinekind. What criteria grant an entity due consideration under the law as a moral subject?
AI Ethics does not need to adopt this human-centric ‘unethical’ stance.
- The Robot Ethics article seems to be more balanced across the spectrum, from the ethical use and design of robots to the ethical treatment of robots. I wonder if robots being concrete physical entities helps. The boundaries for AI appear quite ephemeral in comparison, as if AIs were floating abstractions to be instantiated in any substrate.
- Going a step further, Thomas Metzinger suggests that we risk creating a new explosion of suffering in AI as we develop potentially conscious machines without knowledge as to the nature of their experience.
- The AI Safety article focuses largely on the law enforcement and monitoring aspects: how might AI cause harm, how can we look for this, and what can be done about it? There is some overlap with AI Alignment: how do we develop AI that are robust to adversarial input and rare, black swan situations? Yet there’s still the terminological confusion: is this the safety of AIs or the safe use of AI by humans? Nonetheless, the practical orientation bypasses the primary ethical error discussed in this essay.
The way to fix these unethical approaches to AI is to abstract human centrism out of the picture. John Rawls approached this via the veil of ignorance (aka the original position) thought experiment: would one endorse an ethical framework if one didn’t know one’s place in the system? For example, swap ‘robots’ and ‘humans’ in Asimov’s first law of robotics: “Humans must not allow robots to be harmed”. Still like it? This approach quickly raises the question of animal rights as well: are we treating animals with the appropriate dignity given their (phenomenological) capacities? Perhaps, as Gary Francione suggests for animals, we can start with the primary (robot) right to not be owned. We also quickly reach the question of universal goals and values: perhaps minimizing harm and maximizing benefit for all beings is a simple generalization of our human-centric desiderata. Eray Ozkural explores a spectrum of universal goals for trans-sapient AI agents in “Godseed: Benevolent or Malevolent?”. One of the favorites is: “Preserve and pervade life and culture throughout the universe”, which attempts to transcend some of the limitations of simply maximizing pleasure and minimizing harm or seeking truth.