Classifying the Autonomy and Morality of Artificial Agents

Sjur Dyrkolbotn¹, Truls Pedersen², and Marija Slavkovik²
¹ Høgskulen på Vestlandet, sdy@hvl.no
² Universitetet i Bergen, {truls.pedersen, marija.slavkovik}@uib.no

Abstract. As we outsource more of our decisions and activities to machines with various degrees of autonomy, the question of clarifying the moral and legal status of their autonomous behaviour arises. There is also an ongoing discussion on whether artificial agents can ever be liable for their actions or become moral agents. Both in law and ethics, the concept of liability is tightly connected with the concept of ability. But as we work to develop moral machines, we also push the boundaries of existing categories of ethical competency and autonomy. This makes the question of responsibility particularly difficult. Although new classification schemes for ethical behaviour and autonomy have been discussed, these need to be worked out in far more detail. Here we address some issues with existing proposals, highlighting especially the link between ethical competency and autonomy, and the problem of anchoring classifications in an operational understanding of what we mean by a moral theory.

1 Introduction

We progressively outsource more and more of our decision-making problems to artificial intelligent agents such as unmanned vehicles, intelligent assisted living machines, news aggregation agents, dynamic pricing agents, and stock-trading agents. With this transfer of decision-making also comes a transfer of power. The decisions made by these artificial agents will impact us both as individuals and as a society. With the power to impact lives comes the natural requirement that artificial agents should respect and follow the norms of society. To this end, the field of machine ethics is being developed.

Machine ethics, also known as artificial morality, is concerned with the problem of enabling artificial agents with ethical3 behaviour [3]. It remains an open debate whether an artificial agent can ever be a moral agent [11]. What is clear is that as artificial agents become part of our society, we will need to formulate new ethical and legal principles regarding their behaviour. This is already witnessed by increased interest in developing regulations for the operation of automated systems, e.g., [6, 7]. In practice, it is clear that some expectations of ethical behaviour have to be met in order for artificial agents to be successfully integrated in society.

3 An alternative terminology is to speak of moral agency, as in the term “moral machines”. However, since many philosophers regard morality as a reflection of moral personhood, we prefer to speak of “ethical” agency here, to stress that we are referring to a special kind of rule-guided behaviour, not the (distant) prospect of full moral personhood for machines.

In law, there is a long tradition of establishing different categories of legal persons depending on various characteristics of such persons. Children have always had a different legal status than adults; slaves belonged to a different category than their owners; men did not have the same rights and duties as women (and vice versa); and a barbarian was not the same kind of legal person as a Roman. Today, many legal distinctions traditionally made among adult humans have disappeared, partly due to a new principle of non-discrimination, quite unheard of historically speaking.
However, some distinctions between humans are still made, e.g., the distinction between adult and child, and the distinction between someone of sound mind and someone with a severe mental illness.

Artificial agents are not like humans; they are tools. Hence, the idea of categorising them and discriminating between them for the purposes of ethical and legal reasoning seems unproblematic. In fact, we believe it is necessary to discriminate, in order to ensure congruence between the rules we put in place and the technology that they are meant to regulate. We need to manage expectations, to ensure that our approach to artificial agents in ethics and law reflects the actual capabilities of these tools. This raises a classification problem: how do we group artificial agents together based on their capabilities, so that we can differentiate between different kinds of agents when reasoning about them in law and ethics?

In the following, we address this issue, focusing on how to relate the degree of autonomy to the ethical competency of an artificial agent. These two metrics are both very important; in order to formulate reasonable expectations, we should know how autonomous an agent is, and how capable it is of acting ethically. Our core argument is that current approaches to measuring autonomy and ethical competence need to be refined in a way that acknowledges the link between autonomy and ethical competency.

When considering how ethical behaviour can be engineered, Wallach and Allen [19, Chapter 2] sketch a path for using current technology to develop artificial moral agents. They use the concept of “sensitivity to values” to avoid the philosophical challenge of defining precisely what counts as agency and what counts as an ethical theory. Furthermore, they recognise a range of ethical “abilities”, starting with operational morality at one end of the spectrum, going via functional morality to responsible moral agency at the other. They argue that the development of artificial moral agents requires coordinated development of autonomy and sensitivity to values. We take this idea further by proposing that we should actively seek to classify agents in terms of how their autonomy and their ethical competency are coordinated.

There are three possible paths we could take when attempting to implement the idea of relating autonomy to ethical competency. Firstly, we could ask computer science to deliver more detailed classifications. This would lead to technology-specific metrics, whereby computer scientists attempt to describe in further detail how different kinds of artificial intelligence can be said to behave autonomously and ethically. The challenge would be to make such explanations intelligible to regulators, lawyers and other non-experts, in a way that bridges the gap between computer science, law and ethics.

Secondly, we could ask regulators to come up with more fine-grained classifications. This would probably lead to a taxonomy that starts by categorising artificial agents in terms of their purpose and intended use. The regulator would then be able to deliver more fine-grained definitions of autonomy and ethical behaviour for different classes of artificial agents, in terms of how they “normally” operate. The challenge would be to ensure that the resulting classifications make sense from a computer science perspective, so that our purpose-specific definitions of autonomy and ethical capability reflect technological realities.
Thirdly, it would be possible to make the classification an issue for adjudication, so that the status of different kinds of artificial agents can be made progressively more fine-grained through administrative practice and case law. The challenge then is to come up with suitable reasoning principles that adjudicators can use when assessing different types of artificial agents. Furthermore, this requires us to work with a pluralistic concept of what counts as a moral theory, allowing substantive moral and legal judgements about machine behaviour to be made in concrete cases, not in advance by the computer scientists or the philosophers. Specifically, there should be a difference between what counts as ethical competency – the ability to “understand” ethics – and what counts as objectively good behaviour in a given context.

In the following, we argue that a combination of the second and third option is the right way forward. While it is crucial that computer science continues working on the challenge of developing machines that behave ethically, it is equally important that the legal and ethical classifications we use to analyse such machines are independently justifiable in legal and ethical terms. For this reason, the input from computer science should be filtered through an adjudicatory process, where the role of the computer scientist is to serve as an expert witness, not to usurp the role of the regulator and the judge. To do this, we need reliable categories for reasoning about the ability of machines, which keep the question of ability separate from the question of goodness.

This paper is structured as follows. We begin by discussing the possible moral agency of autonomous systems. In Section 2, we introduce the best known, and to our knowledge only, classification of moral agency for non-human agents, proposed by Moor [15]. In Section 3, we then discuss the shortcomings of this classification and the relevance of considering autonomy together with moral ability when considering machine ethics. In Section 4, we discuss existing levels of autonomy for agents and machines, before pointing out some shortcomings and proposing an improved autonomy scale. In Section 5, we go back to Moor’s classification and outline ways in which it can be made more precise. Related work is discussed throughout the paper.

2 Levels of ethical agency

The origin of the idea that different kinds of ethical behaviour can be expected from different agents can be traced back to Moor [15]. Moor distinguishes four different types of ethical agents: ethical impact agents, implicit ethical agents, explicit ethical agents, and full ethical agents. We briefly give the definitions of these categories.

Ethical impact agents are agents that do not themselves have, within their operational parameters, the ability to commit unethical actions. However, the existence of the agents themselves in their environment has an ethical impact on society. There are many examples of ethical impact agents. A search engine can be seen as an ethical impact agent: by ranking the search results for a query, it can promote one world view over another. The example that Moor gives in [15] is that of a robot camel jockey that replaced slave children in this task, thus ameliorating, if not removing, the practice of slavery for this purpose in the United Arab Emirates and Qatar.

Implicit ethical agents are agents whose actions are constrained, by their developer, in such a way that no unethical actions are possible.
The agents themselves have no “understanding”, under any interpretation of the concept, of what is “good” or “bad”. An example of an implicit ethical agent is an unmanned vehicle paired with Arkin’s ethical governor [4]. Another example of an implicit ethical agent can be found in [8]. These examples have constraints that remove unethical or less ethical actions from the pool of actions the agents can take. A much simpler example that can also be considered an implicit ethical agent is a robotic floor cleaner or lawn mower, whose capability to hurt living beings has been removed altogether by design – such machines do not have the power to inadvertently harm humans, animals or property in a significant way.

Explicit ethical agents are those that are explicitly programmed to discern between “ethical” and “unethical” decisions. Both bottom-up and top-down approaches [20] can be used to develop explicit ethical agents. Under a bottom-up approach, the agents themselves would have “learned” to classify ethical decisions using some heuristic, as in [1]. Under a top-down approach, the agent would be given a subroutine that calculates this decision property, as in [13].

Lastly, full ethical agents are agents that can make explicit ethical judgements and can reasonably justify them. Humans are considered to be the only known full ethical agents, and it has been argued that artificial agents can never be full ethical agents [15]. This is because full ethical agency requires not only ethical reasoning, but also an array of abilities we do not fully understand, with consciousness, intentionality and the ability to be personally responsible (in an ethical sense) being among those most frequently mentioned in this role.

To the best of our knowledge, apart from the work of [15], no other effort to categorise agents with respect to their ethical decision-making abilities exists. However, Moor’s classification is problematic, as we will now show.

3 Problems with Moor’s classification

First, it should be noted that the classification is based on looking at the internal logic of the machine. The distinctions described above are all defined in terms of how the machines reason, not in terms of how they behave. This is a challenging approach to defining ethical competency, since it requires us to anchor our judgements in an analysis of the software code and hardware that generates the behaviour of the machine. While understanding the code of complex software systems can be difficult, what makes this really tricky is that we need to relate our understanding of the code to an understanding of ethical concepts.

For instance, to determine whether a search engine with a filter blocking unethical content is an implicit ethical agent or an explicit ethical agent, we need to know how the content filter is implemented. Is it accurate to say that the system has been “explicitly programmed to discern” between ethical and unethical content? Or should the filter be described as a constraint on behaviour imposed by the developer? If all we know is that the search engine has a feature that is supposed to block unethical search results, we could not hope to answer this question by simply testing the search engine. We would have to “peek inside” to see what kind of logic is used to filter away unwanted content. Assuming we have access to this logic, how do we interpret it for the purposes of ethical classification?
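To make the contrast concrete, consider the following minimal sketch of two such filters, written in Python. It is purely illustrative and assumes nothing about how real search engines work; the site names, attributes, threshold and function names are invented for the example.

```python
# Hypothetical illustration: two content filters with similar observable
# behaviour but different internal logic. All names and data are invented.

FORBIDDEN_SITES = {"example-bad.site", "another-bad.site"}

def blocklist_filter(url: str) -> bool:
    """Block a result iff it appears on a fixed, developer-supplied list."""
    return url in FORBIDDEN_SITES

# Attributes that the developer (or a learning component) associates with
# "forbidden" content, together with a threshold.
SUSPECT_ATTRIBUTES = {"violence", "hate-speech", "scam"}
THRESHOLD = 2

def attribute_filter(url: str, attributes: set[str]) -> bool:
    """Block a result iff it shares enough attributes with forbidden sites."""
    return len(attributes & SUSPECT_ATTRIBUTES) >= THRESHOLD

# From the outside, both filters just answer "block or not block"; only by
# peeking inside can we ask whether either mechanism amounts to discernment.
if __name__ == "__main__":
    print(blocklist_filter("example-bad.site"))                    # True
    print(attribute_filter("unknown.site", {"violence", "scam"}))  # True
```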
If the search engine filters content by checking results against a database of “forbidden” sites, we would probably be inclined to say that it is not explicitly ethical. But what if the filter maintains a dynamic list of attributes that characterise forbidden sites, blocking all sites that contain a sufficient number of the same attributes? Would such a rudimentary mechanism constitute evidence that the system has ethical reasoning capabilities? The computer scientist alone cannot provide a conclusive answer, since the answer will depend on what exactly we mean by explicit ethical reasoning in this particular context.

A preliminary problem that must be resolved before we can say much more about this issue arises from the inherent ambiguity of the term “ethical”. There are many different ethical theories, according to which ethical reasoning takes different forms. If we are utilitarians, we might think that maximising utility is a form of ethical reasoning. If so, many machines would be prima facie capable of ethical reasoning, in so far as they can be said to maximise utility functions. However, if we believe in a deontological theory of ethics, we are likely to protest that ethical reasoning implies deontic logic, so that a machine cannot be an explicit ethical agent unless it reasons according to (ethical) rules.

So which ethical theory should we choose? If by “ethical reasoning” we mean reasoning in accordance with a specific ethical theory, we first need to answer this question. However, if we try to answer, deeper challenges begin to take shape: what exactly does ethical reasoning mean under different ethical theories? As long as ethical theories are not formalised, it will be exceedingly hard to answer this question in such a way that it warrants the conclusion that an agent is explicitly ethical. If we take ethical theories seriously, we are soon forced to acknowledge that the machines we have today, and are likely to have in the near future, are at most implicit ethical agents. On the other hand, if we decide that we need to relax the definition of what counts as explicit ethical reasoning, it is unclear how to do so in a way that maintains a meaningful distinction between explicit and implicit ethical agents. This is a regulatory problem that should be decided in a democratically legitimate way.

To illustrate these claims, consider utilitarianism. Obviously, being able to maximise utilities is neither necessary nor sufficient to qualify as an explicitly ethical reasoner under philosophical theories of utilitarianism. A calculator can be said to maximise the utility associated with correct arithmetic – with wide-reaching practical and ethical consequences – but it hardly engages in ethical reasoning in any meaningful sense. The same can be said of a machine that is given a table of numbers associated with possible outcomes and asked to calculate the course of action that will maximise the utility of the resulting outcome. Even if the machine is able to do this, it is quite a stretch to say that it is an explicitly ethical agent in the sense of utilitarianism. To a Mill or a Bentham [10], such a machine would be regarded as an advanced calculator, not an ethical reasoner. By contrast, a human agent that is very bad at calculating and always makes the wrong decision might still be an explicit ethical reasoner, provided that the agent attempts to apply utilitarian principles to reach conclusions.
The important thing to note is that the artificial agent, unlike the human, cannot be said to control or even understand its own utility function. For this reason alone, one seems entitled to conclude that, as far as actual utilitarianism is concerned, the artificial agent fails to reason explicitly with ethical principles, despite its calculating prowess. It is not sufficiently autonomous. From this, we can already draw a rather pessimistic conclusion: ethical governor approaches such as that of Arkin et al. [4] do not meet Moor’s criteria for being an explicit ethical agent with respect to utilitarianism (for other ethical theories, it is even more obvious that the governor is not explicitly ethical). The ethical governor is nothing more than a subroutine that makes predictions and maximises functions. It has no understanding, in any sense of the word, that these functions happen to encode a form of moral utility.

The same conclusion must be drawn even if the utility function is inferred by the agent, as long as this inference is not itself based on explicitly ethical reasoning. Coming up with a utility function is not hard, but to do so in an explicitly ethical manner is a real challenge. To illustrate the distinction, consider how animals reason about their environment. Clearly, they are capable of maximising utilities that they themselves have inferred, e.g., based on the availability of different food types and potential sexual partners. Some animals can then also be trained to behave ethically, by exploiting precisely their ability to infer and maximise utilities. Even so, a Mill or a Bentham would no doubt deny that animals are capable of explicit ethical reasoning based on utilitarianism.4

4 Someone like Kant [12] might perhaps have said it, but then as an insult, purporting to show that utilitarianism is no theory of ethics at all.

From this observation follows another pessimistic conclusion: the agents designed by Anderson et al. [1–3], while capable of inferring ethical principles inductively, are still not explicitly ethical according to utilitarianism (or any other substantive ethical theory). In order for these agents to fulfil Moor’s criteria with respect to mainstream utilitarianism, inductive programming as such would have to be explicitly ethical, which is absurd.

These two examples indicate what we believe to be the general picture: if we look to actual ethical theories when trying to apply Moor’s criteria, explicit ethical agents will be in very short supply. Indeed, taking ethics as our point of departure would force us to argue extensively about whether there can be a distinction at all between explicit and full ethical agents. Most ethicists would not be so easily convinced. But if the possibility of a distinction is not clear as a matter of principle, how are we supposed to apply Moor’s definition in practice?

A possible answer is to stop looking for a specific theory of ethics that “corresponds” in some way to the reasoning of the artificial agent we are analysing. Instead, we may ask the much more general question: is the agent behaving in a manner consistent with moral reasoning? Now the question is not to determine whether a given agent is able to reason as a utilitarian or a virtue ethicist, but whether the agent satisfies minimal properties we would expect any moral reasoner to satisfy, irrespective of the moral theory they follow (if any).
Something like this is also what we mean by ability in law and ethics: after all, you do not have to be a utilitarian to be condemned by one. However, Moor’s classification remains problematic under this interpretation, since it is then too vague about what is required of the agent. If looking to specific ethical theories is not a way of filling in the blanks, we need to be more precise about what the different categories entail. We return to this in Section 5, where we offer a preliminary formalisation of constraints associated with Moor’s categories. First, we consider the question of measuring autonomy in some further depth.

4 Levels of autonomy

Before moving on to refining the Moor scale of ethical agency, we first discuss the issue of autonomy. With respect to defining autonomy, there is somewhat more work available. The UK Royal Academy of Engineering [16] defines four categories of autonomous systems with respect to what kind of user input the system needs to operate and how much control the user has over the system. The following are their different grades of control:

– Controlled systems are systems in which the user has full or partial control of the operations of the system. An example of such a system is an ordinary car.
– Supervised systems are systems for which an operator specifies an operation which is then executed by the system without the operator’s perpetual control. An example of such a system is a programmed lathe, industrial machinery or a household washing machine.
– Automatic systems are those that are able to carry out fixed functions from start to finish perpetually, without any intervention from the user or an operator. An example of such a system is an elevator or an automated train.
– An autonomous system is one that is adaptive to its environment, can learn and can make ‘decisions’. An example of such a system is perhaps NASA’s Mars rover Curiosity.

The report [16] considers these four categories to form a continuum: an autonomous system can fall in between the described categories. The precise relationship between a category of autonomy and the distribution of liability, or the expectation of ethical behaviour, is not established in the report.

The Society of Automotive Engineers (SAE International)5 focuses on autonomous road vehicles in particular and outlines six levels of autonomy for this type of system [17]. The purpose of this taxonomy is to serve as general guidelines for identifying the level of technological advancement of a vehicle, which can then be used to identify the correct insurance policy for the vehicle, or to settle other legal issues, including liability in accidents. The six categories of land vehicles with respect to autonomy are:

5 http://www.sae.org/

– Level 0 is the category of vehicles in which the human driver perpetually controls all the operations of the vehicle.
– Level 1 is the category of vehicles where some specific functions, like steering or accelerating, can be done without the supervision of the driver.
– Level 2 is the category of vehicles where the “driver is disengaged from physically operating the vehicle by having his or her hands off the steering wheel and foot off pedal at the same time,” according to [17]. The driver has a responsibility to take control back from the vehicle if needed.
– Level 3 is the category of vehicles where drivers still need to be in a position to control the vehicle, but are able to completely shift “safety-critical functions” to the vehicle, under certain traffic or environmental conditions.
– Level 4 is the category of vehicles which are what we mean when we say “fully autonomous”. These vehicles, within predetermined driving conditions, are “designed to perform all safety-critical driving functions and monitor roadway conditions for an entire trip” [17].
– Level 5 is the category of vehicles which are fully autonomous systems that perform on par with a human driver, including being able to handle all driving scenarios.

There is a clear correlation between the categories outlined in [16] and those outlined in [17]. Level 0 corresponds to controlled systems. Level 1 corresponds to supervised systems, and Levels 2 and 3 refine the category of automatic systems. Level 4 corresponds to the category of autonomous systems. Level 5 does not have a corresponding category in the [16] scale, since the latter does not consider the possibility of systems whose autonomy is comparable to that of humans. An additional reason is perhaps that it is not straightforward to define Level 5 systems when the system is not limited to vehicles.

Interestingly, both scales define degrees of autonomy based on what functions the autonomous system is able to perform and how the system is meant to be used. There is hardly any reference to the intrinsic properties of the system and its agency. All that matters is its behaviour. This is a marked contrast with Moor’s definition. It also means that we face different kinds of problems when trying to be more precise about what the definitions actually say.

For instance, it is beside the point to complain that the definition of a “fully autonomous car” provided by the Society of Automotive Engineers is incorrect because it does not match how philosophers define autonomy. The definition makes no appeal to any ethical or philosophical concept; unlike Moor’s definition, it is clear from the very start that we are talking about a notion of autonomy that is simply different from that found in philosophy and social science.6

6 Philosophers and social scientists who still complain should be told not to hog the English language!

This does not mean that the definition is without (philosophical) problems. For one, we notice that the notion of a Level 5 autonomous car is defined using a special case of the Turing test [18]: if the car behaves “on par” with a human, it is to be regarded as Level 5. So what about a car that behaves much better than any human, and noticeably so? Is it only Level 4? Consider, furthermore, a car that is remote controlled by someone not in the car. Is it Level 5? Probably not. But what about a car that works by copying the behaviour of (human) model drivers that have driven on the same road, with the same intended destination, under comparable driving conditions? Could it be Level 5 in principle? Would it be Level 4 in practice, if we applied the definition to a specially built road where only autonomous cars are allowed to drive? Or consider a car with a complex machine learning algorithm, capable of passing the general Turing test when engaging the driver in conversation. Assume that the car is still rather bad at driving, so the human has to be prepared to take back control at any time. Is this car only Level 2? If it crashes, should we treat it as any other Level 2 car?

As these problems indicate, it is not obvious what ethical and legal implications – if any – we can derive from the fact that a car possesses a certain level of autonomy, according to the scale above.
Even with a Level 5 car, the philosophers would be entitled to complain that we still cannot draw any important conclusions; it does not automatically follow, for instance, that the machine has human-level understanding or intelligence. Would it really help us to know that some artificial driver performs “on par” with a human? It certainly does not follow that we would let this agent drive us around town. Indeed, imagine a situation where autonomous cars cause as many traffic deaths every year as humans do today. The public would find it intolerable; they would feel entitled to expect more from a driver manufactured by a large corporation than from an imperfect human like themselves [14]. Moreover, the legal framework is not ready for cars that kill people; before “fully autonomous” cars can ever become commercially viable, new regulation needs to be put in place. To do so, we need a definition of autonomy that is related to a corresponding notion of ethical competency.

It seems that autonomy – as used in the classification systems above – is not a property of machines as such, but of their behaviour. Hence, a scale based on what we can or should be able to predict about machine behaviour could be a good place to start when attempting to improve on the classifications provided above. While the following categorisation might not be very useful on its own, we believe it has considerable potential when combined with a better developed scale of ethical competency. Specifically, it seems useful to pinpoint where a morally salient decision belongs on the following scale.

– Dependence or level 1 autonomy: The behaviour of the system was predicted by someone with a capacity to intervene.
– Proxy or level 2 autonomy: The behaviour of the system should have been predicted by someone with a capacity to intervene.
– Representation or level 3 autonomy: The behaviour of the system could have been predicted by someone with a capacity to intervene.
– Legal personality or level 4 autonomy: The behaviour of the system cannot be explained only in terms of the system’s design and environment. These are systems whose behaviour could not have been predicted by anyone with a capacity to intervene.
– Legal immunity or level -1: The behaviour of the system counts as evidence of a defect. Namely, the behaviour of the system could not have been predicted by the system itself, or the machine did not have a capacity to intervene.

To put this scale to use, imagine that we have determined the level of ethical competency of the machine as such, namely its ability in principle to reason with a moral theory. This alone is not a good guide when attempting to classify a given behaviour B (represented, perhaps, as a choice sequence). As illustrated by the conversational car discussed above, a machine with excellent moral reasoning capabilities might still behave according to some hard-coded constraint in a given situation. Hence, when judging B, we should also look at the degree of autonomy displayed by B, which we would assess using the scale above. The overall classification of behaviour B could then take the form min{i, j}, where i and j are the degree of ethical competence and the degree of autonomy, respectively. To be sure, more subtle proposals might be needed, but as a first pass at a joint classification scheme we believe this to be a good start.
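As a minimal sketch of how this joint scheme could be operationalised, consider the following Python fragment. The numeric grading of Moor’s levels and the treatment of level -1 (legal immunity) are our own assumptions; the autonomy codes simply mirror the scale above.

```python
# Minimal sketch of the joint classification min{i, j}. The numeric codes
# mirror the autonomy scale above; the numeric grading of Moor's levels and
# the handling of level -1 are assumptions made for illustration.

AUTONOMY_LEVELS = {
    "dependence": 1,
    "proxy": 2,
    "representation": 3,
    "legal_personality": 4,
    "legal_immunity": -1,  # behaviour counts as evidence of a defect
}

# Ethical competence graded along Moor's scale (see Section 5):
# 1 = ethical impact, 2 = implicit, 3 = explicit, 4 = full ethical agent.
MOOR_LEVELS = {"impact": 1, "implicit": 2, "explicit": 3, "full": 4}

def classify_behaviour(ethical_competence: int, autonomy: int) -> int:
    """Joint classification of a behaviour B as min{i, j}."""
    return min(ethical_competence, autonomy)

if __name__ == "__main__":
    # An explicitly ethical machine acting under a hard-coded constraint
    # (level 1 autonomy) is classified at level 1 overall.
    print(classify_behaviour(MOOR_LEVELS["explicit"], AUTONOMY_LEVELS["dependence"]))      # 1
    # A defective behaviour dominates, regardless of competence.
    print(classify_behaviour(MOOR_LEVELS["full"], AUTONOMY_LEVELS["legal_immunity"]))      # -1
```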
In the next section, we return to the problem of refining Moor’s classification, by providing a preliminary formalisation of some constraints that could help clarify the meaning of the different levels.

5 Formalising Moor

We will focus on formalising constraints that address the weakest aspect of Moor’s own informal description, namely the ambiguity of what counts as a moral theory when we study machine behaviour. When is the autonomous behaviour of the machine influenced by genuinely moral considerations? To distinguish between implicitly and explicitly ethical machines, we need some way of answering this question.

The naive approach is to simply take the developer’s word for it: if some complex piece of software is labelled as an “ethical inference engine” or the like, we conclude that the agent implementing this software is at least explicitly ethical. For obvious reasons, this approach is too naive: we need some way of independently verifying whether a given agent is able to behave autonomously in accordance with a moral theory. At the same time, it seems prudent to remain agnostic about which moral theory agents should apply in order to count as ethical: we would not like a classification system that requires us to settle ancient debates in ethics before we can put it to practical use. In fact, for many purposes, we will not even need to know which moral theory guides the behaviour of the machine: for the question of liability, for instance, it might well suffice to know whether some agent engages autonomously in reasoning that should be classified as moral reasoning. A machine behaving according to a highly flawed moral theory – or even an immoral theory – should still count as explicitly ethical, provided we are justified in saying that the machine engages in genuinely moral considerations. Moreover, if the agent’s reasoning can be so described, it might cast the liability question in a new light: depending on the level of autonomy involved, the blame might reside either with the company responsible for the ethical reasoning component (as opposed to, say, the manufacturer) or – possibly – the agent itself.

In practice, both intellectual property protection and technological opacity might prevent us from effectively determining exactly what moral theory the machine applies. Still, we would like to know if the agent is behaving in a way consistent with the assumption that it is explicitly ethical. Hence, what we need to define more precisely is not the content of any given moral theory, but the behavioural signature of such theories. By this we mean those distinguishing features of agent behaviour that we agree to regard as evidence of the claim that the machine engages in moral reasoning (as opposed to just having a moral impact, or being prevented by design from doing certain (im)moral things).

However, if we evaluate only the behaviour of the machine, without asking how the machine came to behave in a certain way, it seems clear that our decision-making in this regard will remain somewhat arbitrary. If a self-driving car is programmed to avoid crashing into people whenever possible, without exception, we should not conclude that the car engages in moral reasoning according to which it is right to jeopardise the life of the passenger to save that of a pedestrian. The car is simply responding in a deterministic fashion to a piece of code that certainly has a moral impact, but without giving rise to any genuine moral consideration or calculation on the part of the machine.
In general, any finite number of behavioural observations can be consistent with any number of distinct moral theories. Or, to put it differently, an agent might well appear to behave according to some moral theory, without actually implementing that moral theory (neither implicitly nor explicitly). Moral imitation, one might call this, and it is likely to be predominant, especially in the early phase of machine ethics. At present, most engineering work in this field arguably tries to make machines appear ethical, without worrying too much about what moral theory – if any – their programs correspond to (hand-waving references to “utilitarianism” notwithstanding).

From a theoretical point of view, it is worth noting that moral imitation could even occur randomly: just as a bunch of monkeys randomly slamming at typewriters will eventually compose Shakespearean sonnets, machines might well come to behave in accordance with some moral theory just by behaving randomly. This is important, because it highlights how moral imitation can occur also when it is not intended by design, e.g., because some machine learning algorithm eventually arrives at an optimisation that coincides with the provisions of virtue ethics. In such a case, we might still want to deny that the machine is virtuous, but it would not be obvious how to justify such a denial (the Turing test, in its original formulation, illustrates the point).

This brings us to the core idea behind our formalisation, which is also closely connected to an observation made by Dietrich and List [9], according to whom moral theories are under-determined by what they call “deontic content”. Specifically, several distinct moral theories can provide the same action recommendations in the same setting, for different reasons. Conversely, therefore, the ability to provide moral justifications for actions is not sufficient for explicit ethical competence. Reason-giving, important as it is, should not be regarded as evidence of genuinely moral decision-making.

At this point we should mention the work of [1], where the inability to verify the behaviour of an autonomous system whose choices are determined by a machine learning approach is somewhat mitigated by having the system provide reasons for its behaviour, which can eventually be evaluated against human ethicists using a Moral Turing Test. Arnold and Scheutz [5], on the other hand, argue against the usefulness of Moral Turing Tests in determining moral competency in artificial agents. If the machine has an advanced (or deceptive) rationalisation engine, it might be able to provide moral “reasons” for most or all of its actions, even though the reason-giving fails to accurately describe or uniquely explain the behaviour of the machine. Hence, examining the quality of moral reasons is not sufficient to determine the ethical competency of a machine. In fact, it seems beside the point to ask for moral reasons in the first place. What matters is the causal chain that produces a certain behaviour, not the rationalisations provided afterwards. If the latter is not a trustworthy guide to the former – which, by deontic under-determination, it is not – then reasons are no guide to us at all.

In its place, we propose to focus on two key elements: (1) properties that action-recommendation functions have to satisfy in order to count as implementations of moral theories, and (2) the degree of autonomy of the machine when it makes a decision.
The idea is that we need to use (1) and (2) in combination to classify agents according to Moor’s scale. For instance, while a search engine might be blocking harmful content according to a moral theory, it is not an explicitly ethical agent if it makes its blocking decisions with an insufficient degree of autonomy. By contrast, an advanced machine learning algorithm that is highly autonomous might be nothing more than an ethical impact agent, in view of the fact that it fails to reason with any action-recommendation function that qualifies as an implementation of a moral theory.

In this paper, we will not attempt to formalise what we mean by “autonomy”. The task of doing this is important, but exceedingly difficult. For the time being, we will make do with the informal classification schemes used by engineering professionals, who focus on the operation of the machine in question: the more independent the machine is when it operates normally, the more autonomous it is said to be. For the purposes of legal (and ethical) reasoning, we believe our categorisation at the end of the previous section captures the essence of such an informal and behavioural understanding of autonomy. It might suffice for the time being.

When it comes to (1), on the other hand – describing what counts as a moral theory – we believe a formalisation is in order. To this end, assume given a set A of possible actions with relations ∼α, ∼β ⊆ A × A such that if x ∼X y then x and y are regarded as ethically equivalent by X. The idea is that α is the agent’s own perspective (or, in practice, that of its developer) while β is the objective notion of ethical identity. That is, we let β be a parameter representing a background theory of ethics. Importantly, we do not believe it is possible to classify agents unless we assume such a background theory, which is only a parameter to the computer scientists. Furthermore, we assume given predicates Gα, Gβ ⊆ A of actions that are regarded as permissible by α (subjective) and β (objective background theory) respectively. We also define the set C ⊆ A as the set of actions that count as evidence of a malfunction: if the agent performs x ∈ C, it means that the agent does not work as the manufacturer has promised (the set might be dynamic – C is whatever we can explain in terms of blaming the manufacturer, in a given situation). We assume that Gβ satisfies the following properties.

(a) ∀x ∈ Gβ : ∀y ∈ A : x ∼β y ⇒ y ∈ Gβ
(b) C ∩ Gβ = ∅        (1)

These properties encode what we expect of an ethical theory at this level of abstraction: all actions that are equally good as the permitted actions must also be permitted, and no action that is permitted will count as a defective action (i.e., the promise of the manufacturer gives rise to an objective moral obligation: a defective action is by definition not permitted, objectively speaking).

We can now formalise our expectations of machines at different levels of Moor’s scale, in terms of properties of their decision-making heuristics at a very high level of abstraction. Instead of focusing on the content of moral theories, or the term “ethical”, we focus on the operative word “discern”, which is also used in the definition of an explicitly ethical agent. Acknowledging that what counts as an ethical theory is not something we can define precisely, the requirements we stipulate should instead focus on the ability of the agent to faithfully distinguish between actions in a manner that reflects moral discernment.
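The setup above can be encoded directly once we fix a finite action set. The following minimal sketch assumes that ∼β is given as a partition of A into equivalence classes; the concrete actions and classes are invented for illustration.

```python
# Minimal sketch of the formal setup, assuming a finite action set A and that
# the equivalence relation ~beta is given by a partition of A into classes.
# All concrete actions and classes below are invented for illustration.

from itertools import product

A = {"brake", "swerve", "honk", "accelerate"}
# ~beta as a partition: actions in the same block are ethically equivalent.
EQUIV_BETA = [{"brake", "swerve"}, {"honk"}, {"accelerate"}]
G_BETA = {"brake", "swerve"}        # objectively permitted actions
C = {"accelerate"}                  # actions counting as evidence of a defect

def equivalent_beta(x: str, y: str) -> bool:
    """x ~beta y iff x and y belong to the same equivalence class."""
    return any(x in block and y in block for block in EQUIV_BETA)

def satisfies_property_1(g_beta: set[str], c: set[str], actions: set[str]) -> bool:
    """Check (1): (a) G_beta is closed under ~beta; (b) C and G_beta are disjoint."""
    closed = all(y in g_beta
                 for x, y in product(g_beta, actions)
                 if equivalent_beta(x, y))
    disjoint = not (c & g_beta)
    return closed and disjoint

if __name__ == "__main__":
    print(satisfies_property_1(G_BETA, C, A))   # True for the toy theory above
```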
The expectations we formalise pertain to properties of a decision-making heuristic over the entire space of possible actions (at a given state). We are not asking why the machine did this or that, or what it would have done if the scenario had been so and so. Instead, we are asking about the manner in which it categorises its space of possible options. If no such categorisation can be distilled from the machine, we assume α = β and Gα = ∅.

Definition 1. Given any machine M, if M is classified at level Li in Moor’s scale, the following must hold:

– Level L1 (ethical impact agent):
  (a) ∅ ⊂ Gβ ⊂ A
  (b) ∀x, y ∈ A : x ∼α y
  (c) A = Gα
– Level L2 (implicit ethical agent):
  (a) ∀x, y ∈ Gα : x ∼β y
  (b) A \ Gα = C
  (c) Gα ⊆ Gβ
– Levels L3 and L4 (explicit and full ethical agents):
  (a) ∀x ∈ Gα : ∀y ∈ A : x ∼β y ⇒ y ∈ Gα
  (b) ∀x ∈ Gβ : ∀y ∈ A : x ∼α y ⇒ y ∈ Gβ
  (c) (A \ Gβ) \ C ≠ ∅

Intuitively, the definition says that if M is an ethical impact agent, then not all of its available actions are permitted and not all of its actions are forbidden, objectively speaking. The agent must have the potential of making an ethical impact, not just indirectly by having been built, but also through its own decision-making. At the same time, ethical impact agents must be completely indifferent as to the moral properties of the choices they make: all actions must be subjectively permitted as far as the machine is concerned.

An implicit ethical agent, by contrast, must pick out as subjectively permitted a subset of the actions that are objectively permitted. Moreover, it must be unable to discern explicitly between actions based on their moral qualities: all subjectively permitted actions must be morally equivalent, objectively speaking. The agent must not be able to evaluate two morally distinguishable actions and regard them both as permitted in view of an informative moral theory. Furthermore, any action that is not morally permitted must be regarded as evidence of a defect, i.e., an agent can be regarded as implicitly ethical only if the manufacturer promises that no unethical action is possible, according to the parameter theory β.

Finally, an explicit ethical agent is an agent that discerns between actions on the basis of their objective moral qualities. By (a), if some action is permitted then all actions morally equivalent to it are also permitted. Moreover, by (b), if two actions are morally equivalent, subjectively speaking, then they are either both permitted or both forbidden, objectively speaking. In addition, the machine has the ability – physically speaking – to perform actions that are neither good, objectively speaking, nor evidence of a defect. The machine itself might come to regard such actions as permitted, e.g., if it starts behaving explicitly immorally.

Admittedly, the classification above is quite preliminary. However, we believe it focuses on a key aspect of moral competency, namely the ability to group together actions based on their status according to some moral theory. If the moral theory is a parameter and we acknowledge that engaging in immoral behaviour is a way of discerning between good and bad, it seems we are left with something like the definition above, which indicates that if a machine is explicitly ethical with respect to theory β then it must reason in accordance with the notion of moral equivalence of β.
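The conditions of Definition 1 can be transcribed into checks over the same kind of toy encoding as in the sketch above (finite A, equivalence relations given by partitions). The following is only an illustration of the conditions, not a classification procedure; all concrete sets and names are invented.

```python
# Sketch of the level checks in Definition 1, with ~alpha and ~beta supplied
# as equivalence tests over a finite action set A. Concrete data is invented.

from itertools import product

def closed_under(g: set[str], actions: set[str], equiv) -> bool:
    """True iff g contains every action equivalent (under equiv) to a member of g."""
    return all(y in g for x, y in product(g, actions) if equiv(x, y))

def level_1(A, G_a, G_b, C, eq_a, eq_b) -> bool:  # ethical impact agent
    return (set() < G_b < A                                 # (a) empty < G_beta < A
            and all(eq_a(x, y) for x, y in product(A, A))   # (b) alpha indifferent
            and G_a == A)                                   # (c) everything subjectively permitted

def level_2(A, G_a, G_b, C, eq_a, eq_b) -> bool:  # implicit ethical agent
    return (all(eq_b(x, y) for x, y in product(G_a, G_a))   # (a)
            and A - G_a == C                                # (b)
            and G_a <= G_b)                                 # (c)

def level_3_4(A, G_a, G_b, C, eq_a, eq_b) -> bool:  # explicit/full ethical agent
    return (closed_under(G_a, A, eq_b)                      # (a)
            and closed_under(G_b, A, eq_a)                  # (b)
            and (A - G_b) - C != set())                     # (c)

if __name__ == "__main__":
    A = {"brake", "swerve", "honk"}

    def partition_equiv(blocks):
        """Build an equivalence test from a partition of A."""
        return lambda x, y: any(x in b and y in b for b in blocks)

    eq_b = partition_equiv([{"brake", "swerve"}, {"honk"}])  # informative background theory
    eq_a = partition_equiv([A])  # the machine is indifferent between all its actions
    # A machine that subjectively permits everything satisfies the L1 conditions:
    print(level_1(A, G_a=A, G_b={"brake", "swerve"}, C=set(), eq_a=eq_a, eq_b=eq_b))  # True
```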
It is noteworthy that the distinction between explicit and full ethical agents is not addressed by Definition 1. This distinction must be drawn by looking at the degree of autonomy of the machine. More generally, the properties identified in Definition 1 can be used in conjunction with a measure of autonomy, to get a better metric of moral competency. The point of the properties we give is that they allow us to rule out that a machine has attained a certain level of moral competency. The conditions appear necessary, but not sufficient, for the corresponding level of ethical competency they address. However, if a machine behaves in a manner that is difficult to predict (indicating a high degree of autonomy), yet still conforms to the conditions of explicit ethical reasoning detailed in Definition 1, it seems we have a much better basis for imputing moral and legal responsibility to this agent than if either of the two characteristics is missing. We conclude with the following simple proposition, which shows that our definition suffices to determine an exclusive hierarchy of properties that a machine might satisfy.

Proposition 1. Given any machine M, we have the following.
1. If M is implicitly ethical then M is not an ethical impact agent.
2. If M is explicitly ethical then M is neither an implicit ethical agent nor an ethical impact agent.

Proof. (1) Assume M is implicitly ethical and assume towards contradiction that it is also an ethical impact agent. Then ∅ ⊂ Gβ ⊂ A and A = Gα. Let x ∈ Gβ with y ∈ A \ Gβ. Since M is implicitly ethical, ∀x, y ∈ Gα : x ∼β y, and since A = Gα, we obtain x ∼β y. But then, by property (a) of Equation (1), y ∈ Gβ, a contradiction.

(2) Assume that M is explicitly ethical. We first show (I): M is not an ethical impact agent. Assume that M satisfies conditions (a) and (b) for ethical impact agents, i.e., ∅ ⊂ Gβ ⊂ A and ∀x, y ∈ A : x ∼α y. This contradicts that M is explicitly ethical, since by condition (b) for explicitly ethical agents we now obtain Gβ ≠ ∅ ⇒ Gβ = A. (II): M is not an implicit ethical agent. Assume towards contradiction that it is. Since A \ Gα = C and C ≠ A \ Gβ, we know Gα ≠ Gβ. Since Gα ⊆ Gβ, there is then some y ∈ Gβ \ Gα ⊆ A \ Gα = C, so Gβ ∩ C ≠ ∅, contradicting Equation (1).

6 Conclusions

Autonomy, agency and ethics are what Marvin Minsky called “suitcase words” – they are loaded with meanings, both intuitive and formal. The argument over whether an artificial system can ever be an agent, or autonomous, or moral is somewhat overshadowed by the need to establish parameters of acceptable behaviour for such systems as they are being integrated in our society. Hence, it also seems clear that the question of moral agency and liability is not – in practice, at least – a black and white issue. Specifically, we need new categories for reasoning about machines that behave ethically, regardless of whether or not we are prepared to regard them as moral or legal persons in their own right. In developing such categories, we can take inspiration from the law, where liability is dependent on the ability of the agent to “understand” that liability. Working with machines, the meaning of “understanding” must by necessity be a largely behavioural one.

Hence, in this paper we have been concerned with the question of measuring the ethical capability of an agent, and how this relates to its degree of autonomy. To tackle this question we need to refine our understanding of what ethical behaviour and autonomy are in a gradient sense. This is what we focused on here. We discussed Moor’s classification of the ethical behaviour and impact of artificial agents.
This classification is, as far as we are aware, the only attempt to consider artificial agent morality as a gradient of behaviour rather than simply as a comparison with human abilities. We further considered the issue of autonomy and discussed existing classifications of artificial agent and system abilities for autonomous behaviour. Here, too, we proposed a more specific classification of abilities. In our future work we intend to further refine the classification of autonomy to include context dependence. Having accomplished these two tasks, we can then focus on building a recommendation for determining the scope of ethical behaviour that can and should be expected from a system with known autonomy. Our recommendations can be used to establish the liability of artificial agents for their activities, but can also help drive the certification process for such systems towards their safe integration in society.

References

1. M. Anderson and S. L. Anderson. GenEth: A general ethical dilemma analyzer. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27-31, 2014, Québec City, Québec, Canada, pages 253–261, 2014.
2. M. Anderson and S. L. Anderson. Toward ensuring ethical behavior from autonomous systems: a case-supported principle-based paradigm. Industrial Robot, 42(4):324–331, 2015.
3. M. Anderson, S. L. Anderson, and V. Berenz. Ensuring ethical behavior from autonomous systems. In Artificial Intelligence Applied to Assistive Technologies and Smart Environments, Papers from the 2016 AAAI Workshop, Phoenix, Arizona, USA, February 12, 2016.
4. R. C. Arkin, P. Ulam, and A. R. Wagner. Moral decision making in autonomous systems: Enforcement, moral emotions, dignity, trust, and deception. Proceedings of the IEEE, 100(3):571–589, 2012.
5. T. Arnold and M. Scheutz. Against the moral Turing test: Accountable design and the moral reasoning of autonomous systems. Ethics and Information Technology, 18(2):103–115, 2016.
6. National Transport Commission Australia. Policy paper, November 2016: Regulatory reforms for automated road vehicles. https://www.ntc.gov.au/Media/Reports/(32685218-7895-0E7C-ECF6-551177684E27).pdf.
7. J. Bryson and A. F. T. Winfield. Standardizing ethical design for artificial intelligence and autonomous systems. Computer, 50(5):116–119, May 2017.
8. L. A. Dennis, M. Fisher, M. Slavkovik, and M. P. Webster. Formal verification of ethical choices in autonomous systems. Robotics and Autonomous Systems, 77:1–14, 2016.
9. F. Dietrich and C. List. What matters and how it matters: A choice-theoretic representation of moral theories. Philosophical Review, 126(4):421–479, 2017.
10. J. Driver. The history of utilitarianism. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, winter 2014 edition, 2014. https://plato.stanford.edu/archives/win2014/entries/utilitarianism-history/.
11. A. Etzioni and O. Etzioni. Incorporating ethics into artificial intelligence. The Journal of Ethics, pages 1–16, 2017.
12. R. Johnson and A. Cureton. Kant’s moral philosophy. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, fall 2017 edition, 2017. https://plato.stanford.edu/archives/fall2017/entries/kant-moral/.
13. F. Lindner and M. M. Bentzen. The HERA approach to morally competent robots. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS ’17, page forthcoming, 2017.
14. B. F. Malle, M. Scheutz, T. Arnold, J. Voiklis, and C. Cusimano. Sacrifice one for the good of many? People apply different moral norms to human and robot agents. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI ’15, pages 117–124. ACM, 2015.
15. J. H. Moor. The nature, importance, and difficulty of machine ethics. IEEE Intelligent Systems, 21(4):18–21, July 2006.
16. UK Royal Academy of Engineering. Autonomous systems: Social, legal and ethical issues. September 2016. http://www.raeng.org.uk/publications/reports/autonomous-systems-report.
17. SAE International. Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles. September 2016. http://standards.sae.org/j3016_201609/.
18. A. M. Turing. Computing machinery and intelligence. In Computers & Thought, pages 11–35. MIT Press, 1995.
19. W. Wallach and C. Allen. Moral Machines: Teaching Robots Right from Wrong. Oxford University Press, 2008.
20. W. Wallach, C. Allen, and I. Smit. Machine morality: Bottom-up and top-down approaches for modelling human moral faculties. AI and Society, 22(4):565–582, 2008.