Fully autonomous AI

Wolfhart Totschnig [0000-0003-2918-6286]

Universidad Diego Portales, Santiago, Chile

Note: This is an extended abstract of a paper that has been accepted for publication in Science and Engineering Ethics.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

In the fields of artificial intelligence and robotics, the term “autonomy” is generally used to mean the capacity of an artificial agent to operate independently of human guidance. To create agents that are autonomous in this sense is the central aim of these fields. Until recently, the aim could be achieved only by restricting and controlling the conditions under which the agents will operate. The robots on an assembly line in a factory, for instance, perform their delicate tasks reliably because the surroundings have been meticulously prepared. Today, however, we are witnessing the creation of artificial agents that are designed to function in “real-world”—that is, uncontrolled—environments. Self-driving cars, which are already in use, and “autonomous weapon systems,” which are in development, are the most prominent examples. When such machines are called “autonomous,” it is meant that they are able to choose by themselves, without human intervention, the appropriate course of action in the manifold situations they encounter.1

1 For prominent instances of this usage, see Russell & Norvig’s popular textbook Artificial intelligence: A modern approach (2010, 18), Anderson & Anderson’s introduction to their edited volume Machine ethics (2011, 1), the papers collected in the volume Autonomy and artificial intelligence (Lawless et al. 2017) and especially the one by Tessier (2017), as well as Müller (2012), Mindell (2015, ch. 1), and Johnson & Verdicchio (2017).

This way of using the term “autonomy” goes along with the assumption that the artificial agent has a fixed goal or “utility function,” a set purpose with respect to which the appropriateness of its actions will be evaluated. So, in the first example, the agent’s purpose is to drive safely and efficiently from one place to another, and in the second example, it is to neutralize all and only enemy combatants in the chosen area of operation. It has thus been defined and established, in general terms, what the agent is supposed to do. The attribute “autonomous” concerns only whether the agent will be able to carry out the given general instructions in concrete situations.

From a philosophical perspective, this notion of autonomy seems oddly weak. For, in philosophy, the term is generally used to refer to a stronger capacity, namely the capacity, as Kant put it, to “give oneself the law” (Kant [1785] 1998, 4:440–441), to decide by oneself what one’s goal or principle of action will be. This understanding of the term derives from its Greek etymology (auto = “by oneself,” nomos = “law”). An instance of such autonomy would be an agent who decides, by itself, to devote its efforts to a certain project—the attainment of knowledge, say, or the realization of justice. In contrast, any agent that has a predetermined and immutable goal or purpose would not be considered autonomous in this sense.

The aim of the present paper is to argue that an artificial agent can possess autonomy as understood in philosophy—or “full autonomy,” as I will call it for short. “Can” is here intended in the sense of general possibility, not in the sense of current feasibility.
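The contrast at issue can be stated schematically. The following notation is only an illustration and not part of the paper’s argument: let s be the situation the agent encounters, A the set of actions available to it, O the (possibly uncertain) outcome of acting, and U the fixed utility function that represents the agent’s goal. On the standard picture, the agent is expected to select

\[
  a^{*} \;=\; \operatorname*{arg\,max}_{a \in A}\; \mathbb{E}\big[\, U(O) \mid s, a \,\big].
\]

“Autonomy” in the engineering sense concerns only whether the agent can determine a* reliably across the manifold situations s it encounters; U itself is given in advance. Full autonomy, in the philosophical sense, concerns the agent’s relation to U: the capacity to set or revise that function itself.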
I contend that the possibility of a fully autonomous AI cannot be excluded, but do not mean to imply that such an AI can be created today.

My argument stands in opposition to the predominant view in the literature on the long-term prospects and risks of artificial intelligence. The predominant view is that an artificial agent cannot exhibit full autonomy because it cannot rationally change its own final goal, since changing the final goal is counterproductive with respect to that goal and hence undesirable (Yudkowsky 2001, 2008, 2011, 2012; Bostrom 2002, 2014; Omohundro 2008, 2012, 2016; Yampolskiy & Fox 2012, 2013; Domingos 2015). I will challenge this view by showing that it is based on questionable assumptions about the nature of goals and values. I will argue that a general artificial intelligence—i.e., an artificial intelligence that, like human beings, develops a general understanding of the world and of itself—may very well come to change its final goal in the course of its development.

This issue is obviously of great importance for how we are to assess the long-term prospects and risks of artificial intelligence. If artificial agents can reach full autonomy, which law will they give themselves when that happens? In particular, what confidence can we have that the chosen law will include respect for human beings?

References

1. Anderson, M., Anderson, S.L.: General introduction. In: Anderson, M., Anderson, S.L. (eds.) Machine ethics, pp. 1–4. Cambridge University Press (2011)
2. Bostrom, N.: Existential risks. Journal of Evolution and Technology 9(1) (2002), http://www.jetpress.org/volume9/risks.html
3. Bostrom, N.: Superintelligence. Oxford University Press (2014)
4. Domingos, P.: The master algorithm. Basic Books (2015)
5. Johnson, D.G., Verdicchio, M.: Reframing AI discourse. Minds and Machines 27(4), 575–590 (2017)
6. Kant, I.: Groundwork of the metaphysics of morals. Cambridge Texts in the History of Philosophy, Cambridge University Press (1998)
7. Lawless, W.F., Mittu, R., Sofge, D., Russell, S. (eds.): Autonomy and artificial intelligence. Springer International Publishing (2017)
8. Mindell, D.A.: Our robots, ourselves. Viking (2015)
9. Müller, V.C.: Autonomous cognitive systems in real-world environments. Cognitive Computation 4(3), 212–215 (2012)
10. Omohundro, S.M.: The nature of self-improving artificial intelligence (2008)
11. Omohundro, S.M.: Rational artificial intelligence for the greater good. In: Eden, A.H., Moor, J.H., Søraker, J.H., Steinhart, E. (eds.) Singularity hypotheses, pp. 161–176. Springer (2012)
12. Omohundro, S.M.: Autonomous technology and the greater human good. In: Müller, V.C. (ed.) Risks of artificial intelligence, pp. 9–27. CRC Press (2016)
13. Russell, S.J., Norvig, P.: Artificial intelligence: A modern approach. Prentice Hall (2010)
14. Tessier, C.: Robots autonomy. In: Lawless, W.F., Mittu, R., Sofge, D., Russell, S. (eds.) Autonomy and artificial intelligence, pp. 179–194. Springer International Publishing (2017)
15. Yampolskiy, R.V., Fox, J.: Artificial general intelligence and the human mental model. In: Eden, A.H., Moor, J.H., Søraker, J.H., Steinhart, E. (eds.) Singularity hypotheses, pp. 129–145. Springer (2012)
16. Yampolskiy, R.V., Fox, J.: Safety engineering for artificial general intelligence. Topoi 32(2), 217–226 (2013)
17. Yudkowsky, E.: Creating friendly AI 1.0. The Singularity Institute (2001)
18. Yudkowsky, E.: Artificial intelligence as a positive and negative factor in global risk. In: Bostrom, N., Ćirković, M.M. (eds.) Global catastrophic risks, pp. 308–345. Oxford University Press (2008)
19. Yudkowsky, E.: Complex value systems in friendly AI. In: Schmidhuber, J., Thórisson, K.R., Looks, M. (eds.) Artificial general intelligence, pp. 388–393. Springer (2011)
20. Yudkowsky, E.: Friendly artificial intelligence. In: Eden, A.H., Moor, J.H., Søraker, J.H., Steinhart, E. (eds.) Singularity hypotheses, pp. 181–193. Springer (2012)