=Paper=
{{Paper
|id=Vol-1283/paper28
|storemode=property
|title=
Explaining Best Decisions via Argumentation
|pdfUrl=https://ceur-ws.org/Vol-1283/paper_28.pdf
|volume=Vol-1283
|dblpUrl=https://dblp.org/rec/conf/ecsi/ZhongFTL14
}}
==
Explaining Best Decisions via Argumentation ==
Explaining Best Decisions via Argumentation Qiaoting Zhong1 , Xiuyi Fan2 , Francesca Toni2 , and Xudong Luo1 1 Sun Yat-sen University, China 2 Imperial College London, United Kingdom Abstract. This paper presents an argumentation-based multi-attribute decision making model, where decisions made can be explained in natural language. More specifically, an explanation for a decision is obtained from a mapping between the given decision framework and an argumentation framework, such that best deci- sions correspond to admissible sets of arguments, and the explanation is gener- ated automatically from dispute trees sanctioning the admissibility of arguments. We deploy a notion of rationality where best decisions meet most goals and ex- hibit fewest redundant attributes. We illustrate our method by a legal example, where decisions amount to past cases most similar to a given new, open case. 1 Introduction Argumentation is often portrayed as a powerful method to support several aspects of decision-making [10, 18], especially when decisions need to be explained, but, ex- cept for few studies (notably [1, 9, 15]), it is not connected to means of formal eval- uation sanctioning recommended decisions as rational. Moreover, existing approaches to argumentation-based decision making either lack automatic support for generating explanations (e.g., [5]) or directly use the outputs of argumentation engines as explana- tions (e.g., [12]), even though these may be obscure to the non-expert human users. To this end, this paper gives an argumentation-based decision-making model that can base the output of an argumentation engine to generate automatically argumentative expla- nations for rational decisions in natural language, so that humans can understand and trust the decisions recommended by our model. In this paper, we deal with multi-attribute decision making problems where deci- sions may or may not fulfil goals depending on whether they exhibit attributes capable of reaching those goals. Our recommended decisions are rational in that they meet most goals (in the sense of [9]) but in addition deviate minimally by having as few redundant attributes as possible, not contributing to fulfilling the goals. Concretely, in the spirit of [9], we define a (provably correct) mapping between the kind of decision framework we consider and a specific Assumption-Based Argumentation (ABA) [6] framework, so that minimally deviating decisions correspond to admissible sets of arguments. We use the argumentation mapping to generate explanations, in natural language, for explaining decisions, and in particular why some decisions are preferred to (more rational than) others. These natural language explanations are generated automatically, using an algorithm we design in this paper, from dispute trees computed as a standard argumentative explanation for decisions in ABA. Overall, for any pair of decisions, our Index No. a1 a2 a3 a4 a5 a6 a7 a8 a9 Sentence 1 245 0 1 0 1 0 1 0 0 0 1y+$1,000 2 97 1 0 0 0 0 0 1 0 0 10y+3y+$10,000 3 420 1 0 0 1 1 0 0 1 0 6m+$1,000 4 96 1 0 0 0 0 1 0 1 0 3y+$3,000 5 48 1 0 0 0 1 0 0 1 0 6m+$1,000 6 751 1 0 0 0 0 1 0 1 0 5y+$5,000 7 412 1 0 0 1 0 0 0 0 0 5m+$1,000 8 1962 1 0 0 0 0 0 1 0 1 7y+$7,000 9 389 1 0 0 1 1 0 0 0 0 4m+$1,000 10 686 1 0 1 0 1 0 0 1 0 6m+$1,000 11 355 1 0 0 0 0 0 1 1 0 10y+3y+$10,000 Table 1. A fragment of the Past Cases Characteristics data, where a1 , · · · , a9 stand for “older than 18”, “age between 16 and 18”, “burglary”, “repeatedly”, (value of goods) “large amount”, “huge amount”, “extremely huge amount”, “goods found” and “accessory”, respectively. technique can give succinct, human understandable descriptions to explain why one decision is better than another, which is particularly important for applications. Throughout we motivate and illustrate our work with the following legal example. Example 1. In the practice of law, when lawyers, judges, jury members or other legal entities receive a new case, they need to identify and compare similar past cases, and then use the (court) sentence for the past cases as a prediction for the possible sentence for the open case [13]. This can be viewed as a decision making problem, i.e., past cases are alternative decisions whose rationality can be measured once the sentence of the new case is out: the closer the sentence to the ones for the past cases, the more rational the choice of these past cases as similar. Consider the 11 cases summarised in Table 1,3 where: each case has a number of attributes, e.g., “the item is of large value” (a5 in Table 1) or “the goods have been found” (a8 in Table 1), and a sentence, e.g., “1 year of imprisonment with $1,000 fine” or “10 years of imprisonment, 3 years deprive of political right with $10,000 fine”. For each case, attributes can have value 1 (if the case has that attribute) or 0 (if it does not). For instance, the row for the first case in Table 1 represents: The defendant in case No. 245, with age between 16 and 18 (a2 ), stole re- peatedly (a4 ), and the value of the stolen goods was huge (a6 ). The resulting sentence was 1 year imprisonment with $1,000 fine. Throughout the paper we view case No. 355 (case 11 in Table 1) as new (thus ignoring its sentence). Case No. 97 is the past case closer to case No. 355 in terms of the final sentence, and thus it can be deemed the most similar. Case No. 97 is the case matching most attributes of case No. 355 while also “deviating minimally” from it (case No. 97 only deviates from case No. 355 on a8 ). We will define rational decisions following this intuition, and give an argumentative counterpart thereof that can be used to explain why other past cases are not so similar as case No. 97 to case No. 355. 3 These cases, all concerning theft, are adapted from real cases from the Nanhai District People’s Court in the city of Foshan, Guangdong Province, China. The rest of this paper is organised as follows. Section 2 recalls necessary back- ground knowledge. Section 3 defines minimally deviating decisions. Section 4 describes the argumentative counterpart of rational decisions, defined in terms of minimal devi- ation. Section 5 discusses how to compare decisions. Section 6 designs the algorithm for the generation of natural language explanation for (rational) decisions. Section 7 discusses related work. Finally, Section 8 concludes this paper with future work. 2 Background This work relies upon Assumption-Based Argumentation (ABA) [6] and the decision frameworks of [9]. We recap them in this section. An ABA framework is a tuple hL, R, A, Ci, where hL, Ri is a deductive system, with language L and rule set R = {s0 ← s1 , . . . , sm | s0 , · · · , sm ∈ L}; A ⊆ L is a (non-empty) set, referred to as the assumptions; and C is a total mapping from A into 2L \ {{}}, i.e., C(α) is the contrary of α ∈ A. Given rule ρ = s0 ← s1 , . . . , sm , s0 is the head (denoted Head(ρ) = s0 ) and s1 , . . . , sm constitute the body (denoted Body(ρ) = {s1 , . . . , sm }). If m = 0, ρ is represented as of s0 ← and Body(ρ) = {}. In ABA, arguments are deductions of claims using rules and supported by assump- tions, and attacks are directed at assumptions: – an argument for claim c ∈ L supported by S ⊆ A (S ` c for short) is a (finite) tree with nodes labelled by sentences in L or by the symbol τ ,4 such that the root is labelled by c, leaves are either τ or assumptions in S, and a non-leave s has as many children as the elements in the body of a rule with head s, in an one-to-one correspondence with the elements of this body; and – S1 ` c1 attacks S2 ` c2 iff c1 ∈ C(α) for some α ∈ S2 . A set of arguments is admissible if and only if it does not attack any argument it con- tains but attacks all arguments attacking it.5 Admissible sets of arguments can be char- acterised in terms of dispute trees [7], namely trees with proponent (P) and opponent (O) nodes, labelled by arguments, and such that arguments labelling a node attack the argument in their parent node. Each P-node has all their attacking arguments as its chil- dren, and each O-node has one child only. If no arguments in a dispute tree label a P-node as well as an O-node, then the dispute tree is admissible and the set of all argu- ments labelling P-nodes (called defence set) is admissible [7]. Argumentation engines, such as proxdd,6 compute admissible dispute trees and admissible sets of arguments. A decision framework is a tuple hD, A, G, DA, GAi, with a set of decisions D = {d1 , · · · , dn }, n > 0; a set of attributes A = {a1 , · · · , am }, m > 0; a set of goals G = {g1 , · · · , gl }, l > 0; two tables, DA (of size n × m) and GA (of size l × m),7 such that – every DAi,j (1 ≤ i ≤ n, 1 ≤ j ≤ m) is either 1, representing that alternative decision di has attributes aj , or 0, otherwise; and 4 τ ∈/ L stands for “true” and is used to represent the empty body of rules [6]. 5 An argument set As attacks an argument B iff some A ∈ As attacks B, and an argument A attacks an argument set Bs iff A attacks some B ∈ Bs. 6 www.doc.ic.ac.uk/∼rac101/proarg 7 We use Xi,j to represent the cell in row i and column j in X ∈ {DA, GA}. a1 a2 a3 a4 a5 a6 a7 a8 a9 g1 1 0 0 0 0 0 0 0 0 g2 0 0 0 0 0 0 1 0 0 g3 0 0 0 0 0 0 0 1 0 Table 2. GA table for Example 2. – every GAi,j (1 ≤ i ≤ l, 1 ≤ j ≤ m) is either 1, representing that goal gi is satisfied by attribute aj , or 0, otherwise. The column orders in both DA and GA are the same, and the indices of decisions, goals, and attributes in DA and GA are the row numbers of the decisions and goals and the column number of attributes in DA and GA, respectively. DEC and DF denote the set of all possible decisions and the set of all possible decision frameworks, respectively. A decision d ∈ D with row index i in DA meets a goal g ∈ G with row index j in GA if and only if there is an attribute a ∈ A with column index k in both DA and GA, such that DAi,k = 1 and GAj,k = 1. γ(d) ⊆ G denotes the set of goals met by d. A mapping ψ : DF 7→ 2DEC is a decision function if, for df = hD, A, G, DA, GAi, ψ(df ) ⊆ D. ψ(df ) is the set of decisions that are selected with respect to ψ. Ψ denotes the set of all decision functions. Strongly dominant decisions meet all goals. Weakly dominant decisions meet goals that are not met by other decisions. Formally, given df = hD, A, G, DA, GAi, – ψs ∈ Ψ is a strongly dominant decision function iff ∀d ∈ ψs (df ), γ(d) = G. – ψw ∈ Ψ is a weakly dominant decision function iff ∀d ∈ ψw (df ), @d0 ∈ D\{d} such that γ(d) ⊂ γ(d0 ). In the remainder of this paper, unless otherwise specified, we will assume a generic decision framework df = hD, A, G, DA, GAi ∈ DF. 3 Minimally Deviating Decisions This section discusses decision criteria. Meeting goals is crucial in rational decision selection, but it does not always allow to discriminate amongst decisions, and more importantly it ignores other factors in decision making: the presence of “redundant” attributes, illustrated below. Example 2. (Example 1 continued.) With respect to the new case No. 355, the decision problem can be formalised as df in which D = {d1 , · · · , d10 }, A = {a1 , · · · , a9 }, G = {g1 , g2 , g3 } (where g1 , g2 and g3 stand for “older than 18”, “extremely huge amount” and “goods found”, respectively), DA is adapted from Table 1 (without case No. 355 and sentences) and GA is in Table 2. Then, using the decision function ψs , no strongly dominant decisions exist. Since both d2 (case No. 97) and d8 (case No. 1962) meet two goals, i.e., g1 and g2 , they are both weakly dominant and thus equally good according to the decision function ψw . However, these two cases have different sentences (as shown in Table 1). But case No. 1962 also has attribute a9 that contributes to no goals and makes case No. 1962 distinct from our new case. Thus, we can deem case No. 97 more similar to the new one. This is legitimated by the actual sentence for case No. 355, since this is the same as the sentence for case No. 97. Thus, we need new decision criteria. Intuitively, given a decision framework, a de- cision d is optimal iff d is (strongly or weakly) dominant with the fewest redundant attributes. Formally, we have: Definition 1. Let α ∈ A and i be the column index in DA and GA. Then α is a deviating attribute iff ∀g ∈ G, if g has row index j in GA, then GAj,i 6= 1. The set of deviating attributes decision d has is denoted as λ(d). For our legal example, since GA1,9 = 0, GA2,9 = 0 and GA3,9 = 0, a9 is a deviating attribute. Similarly, a2 , · · · , a6 are deviating. Then, λ(d1 ) = {a2 , a4 , a6 }, λ(d2 ) = {} and so on. Intuitively, α ∈ λ(d) means that d has α but α fulfils no goals in G. Intuitively, minimally-deviating decisions have a minimal number of deviating at- tributes with respect to set inclusion. They are the output of minimally-deviating deci- sion functions. Formally, we have: Definition 2. ψm ∈ Ψ is a minimally-deviating decision function iff ∀d ∈ ψm (df ), @d0 ∈ D such that λ(d0 ) ⊂ λ(d). In words, a decision d is minimally-deviating if and only if there does not exist d0 such that the set of deviating attributes that d0 has is a proper subset of those that d has. For our legal example, it is easy to see that d2 is minimally-deviating. Minimal deviation is about decision having attributes, hence it is orthogonal to dom- inance (see Section 2), concerning decisions meeting goals. Thus, we can select deci- sions in a two-step process: first find dominant decisions, and then, amongst these, further select the minimally deviating decisions. Thus, we introduce sub-frameworks to refine decisions on grounds of deviation as follows: Definition 3. Given D0 ⊆ D, the sub-framework of df w.r.t. D0 is a decision framework hD0 , A, G, DA0 , GAi such that DA0 is the restriction of DA that contains only rows for di such that di ∈ D0 . We can then combine strongly/weakly dominance (s-dominance/w-dominance for short) and minimal deviation as follows: Definition 4. Let dfs be the sub-framework of df w.r.t. ψs (df ), and dfw be the sub- framework of df w.r.t. ψw (df ). Then: (i) ψms ∈ Ψ is a Minimally-Deviating S-Dominant (MDSD) decision function if ∀d ∈ D, d ∈ ψms (df ) iff d ∈ ψm (dfs ); and (ii) ψmw ∈ Ψ is a Minimally-Deviating W-Dominant (MDWD) decision function if ∀d ∈ D, d ∈ ψmw (df ) iff d ∈ ψm (dfw ). For our legal example, d2 (case No. 97) is MDWD as it is the only weakly dominant decision that has no deviating attributes. The following result relates MDSD and MDWD decisions. Proposition 1. Let Ds = ψs (df ), Dw = ψw (df ), Dms = ψms (df ) and Dmw = ψmw (df ). If Ds 6= {} then Dms = Dmw . Proof. Let dfs be the sub-framework of df w.r.t. Ds , and dfw be the sub-framework of df w.r.t. Dw . We firstly prove that Dmw ⊆ Dms . For any d ∈ Dmw , we have d ∈ ψm (dfw ) by Definition 4. Since Ds 6= {}, we have Ds = Dw (by Proposition 4 in [9]). Thus, we have dfw = dfs . Therefore, d ∈ ψm (dfs ) holds. By Definition 4, we have d ∈ Dms . Similarly, we can prove that Dms ⊆ Dmw . 2 4 Argumentative Counterpart of Minimally Deviating Decisions After introducing a new decision criterion, minimal deviation and related concepts of MDSD and MDWD decisions, this section presents a method to map the problem of determining minimally deviating decisions onto the problem of finding admissible sets of argument in an argumentation framework. This mapping serves as a means to gener- ate dispute trees that explain decisions in its own right and will be used in Section 6 to feed into the algorithm for providing natural language explanations. Concretely, we use ABA to develop the mapping. The idea is that for a decision framework and a given decision function (e.g., ψms or ψmw ), an equivalent ABA frame- work with rules, assumptions and contraries is constructed; then the computation of selected decisions can be performed within the ABA framework via standard argumen- tation semantics computation, using tools such as proxdd (see Section 2). We show that such a mapping is not only sound and complete but also able to generate an argu- mentative explanation. We start with giving the ABA construction for MDSD decisions. Definition 5. Let df = hD, A, G, DA, GAi be such that |D| = n, |A| = m and |G| = l. The MDSD ABA framework corresponding to hD, A, G, DA, GAi is hL, R, A, Ci, where: – L is the language. – R consists of the following rules:8 ∀dk ∈ D, gj ∈ G, ai ∈ A: isD(dk ) ←; isG(gj) ←; isA(ai ) ←; (1) ∀1 ≤ k ≤ n, 1 ≤ i ≤ m, if DAk,i = 1 then hasAttr(dk , ai ) ←; (2) ∀1 ≤ j ≤ l, 1 ≤ i ≤ m, if GAj,i = 1 then satBy(gj , ai ) ←; (3) de(X, Y ) ← isD(X), isA(Y ), hasAttr(X, Y ), deviate(Y ); (4) notDeviate(Y ) ← satBy(X, Y ), isG(X), isA(Y ); (5) notMSD(X) ← sd(X 0 ), worse(X, X 0 ), nWorse(X 0 , X), isD(X), isD(X 0 ); (6) nSD(X) ← nMet(X, Z), isD(X), isG(Z); (7) met(X, Y ) ← hasAttr(X, Z), satBy(Y, Z), isD(X), isG(Y ), isA(Z); (8) worse(X 0 , X) ← isD(X), isD(X 0 ), isA(Y ), de(X 0 , Y ), nAttr(X, Y ). (9) – A = {{nWorse(dr , dk )|dk , dr ∈ D} ∪ {nMet(dk , gj ) | dk ∈ D, gj ∈ G}∪ {ms(dk ), sd(dk ), deviate(ai ), nAttr(dk , ai ) | dk ∈ D, ai ∈ A}}. (10) – C(ms(dk )) = {notMSD(dk ), nSD(dk )}; C(sd(dk )) = {nSD(dk )}; (11) C(deviate(ai )) = {notDeviate(ai )}; C(nAttr(dk , ai )) = {hasAttr(dk , ai )}; (12) C(nWorse(dr , dk )) = {worse(dr , dk )}; C(nMet(dk , gj )) = {met(dk , gj )}. (13) We explain some crucial aspects of the above definition as follows. A decision dk is assumed to be MDSD by declaring it as an assumption ms(dk ) (see (10)). dk is not MDSD under either of the two conditions: (i) dk is not strongly dominant, or (ii) dk is not minimally-deviating. Hence, the contraries of ms(dk ) are nSD(dk ) and notM SD(dk ), respectively (see (11)). For any decision X, it is not minimally- deviating (notM SD(X)) if there exists some decision X 0 such that X 0 is strongly dom- inant (sd(X 0 )) and X contains some deviating attribute which X 0 does not (worse(X, 8 We use schemata with variables to represent compactly all rules that can be obtained by in- stantiating the variables over the appropriate domains. Fig. 1. A fragment of the admissible dispute tree for our legal example. Individual arguments are wrapped in boxes, the claim of an argument is the inner box on the right and assumptions and facts (rules with empty bodies) are the inner boxes on the left. Arguments are P or O (see Section 2). If there is only one inner box in an argument, then this is supported by the empty set (of assumptions). Attacks are labelled as arrows between outer boxes (the argument). X 0 )); and X 0 does not contain any more deviating attributes than X (nW orse(X 0 , X)) (see (6)). A decision X is not strongly dominant (nSD(X)) if there exists some goal Z that X does not meet (nM et(X, Z)) (see (7)). This mapping onto ABA for MDSD decisions can serve as the basis for the compu- tation for MDSD decisions by virtue of the following theorem: Theorem 1. Let AF be the MDSD ABA framework corresponding to df . Then ∀dk ∈ D, dk ∈ ψms (df ) iff {ms(dk )} ` ms(dk ) belongs to an admissible set of arguments in AF . This theorem can be drawn from Definitions 4 and 5 as well as the definition of admissibility in ABA (see Section 2). For the sake of space, the proof is omitted. Definition 6. Let hL, R, A, Ci be the MDSD ABA framework corresponding to df . The MDWD ABA framework corresponding to df is hL, R0 , A0 , C 0 i, where: – R0 is derived from R by replacing rules with heads notMSD(X) and nSD(X) with the rules: notMWD(X) ← wd(X 0 ), worse(X, X 0 ), nWorse(X 0 , X), isD(X), isD(X 0 ), nWD(X) ← met(Y, Z), nMet(X, Z), nMore(X, Y ), isD(X), isD(Y ), isG(Z), nMore(X, Y ) ← met(X, Z), nMet(Y, Z), isG(Z), isD(X), isD(Y ); – A0 is A ∪ {nMore(dr , dk )|dr , dk ∈ D}, where ms(dk ) and sd(dk ) are replaced by mw(dk ) and wd(dk ), respectively. – C 0 (nMore(dr , dk )) = {more(dr , dk )} and otherwise C 0 is C except that ms(dk ), nSD(dk ), notMSD(dk ) and sd(dk ) are replaced by mw(dk ), nWD, notMWD(dk ) and wd(dk ), respectively. Similarly, ABA can also be used for MDWD by virtue of the following theorem: Theorem 2. Let F be the MDWD ABA framework corresponding to df . Then ∀dk ∈ D, dk ∈ ψmw (df ) iff {mw(dk )} ` mw(dk ) belongs to an admissible set of arguments in F . This theorem can be drawn from Definitions 4 and 6 as well as the definition of admissibility in ABA (see Section 2). The proof details are omitted for lack of space. We illustrate the computation of selected decisions in ABA for our legal example: Example 3. (Example 2 continued.) Given the MDSD ABA framework corresponding to our legal df (see Definition 5), {mw(d2 )} ` mw(d2 ) belongs to an admissible set, as shown by the fragment of an admissible dispute tree in Figure 1, adapted from the output of proxdd. This tree illustrates the argumentative explanation aspect of ABA computation. The root argument of the tree (right-most box labelled P:1) claims that d2 is MDWD. This claim is attacked by three different arguments, boxes O:23 - O:25, giving reasons for why d2 is not MDWD as follows: – (O:23) d8 is better than d2 ; – (O:24) d2 is not weakly dominant since d2 does not meet goals that are not met by d8 , but d8 meets the goal g1 that are not met by d2 ; and – (O:25) d2 is not weakly dominant since d2 does not meet goals that are not met by d8 , but d8 meets the goal g2 that are not met by d2 . These three arguments are counter-attacked by arguments in P:32 - P:34, respectively: – (P:32) d8 has deviating attribute a9 but d2 does not, so d8 is worse than d2 ; and – (P:33) / (P:34) d2 meets the goal g1 / g2 (respectively). 5 Comparing Decisions We have considered several criteria for evaluating decisions, i.e., strong/weak domi- nance, MDSD and MDWD. Here we will formally compare decisions in different cate- gories and explain their differences. That is, we give a mechanism to rank decisions that meet different criteria. We start with defining the better-than relation between decisions: Definition 7. For all d, d0 ∈ D, d is better than d0 , denoted d d0 , iff: (i) d ∈ ψs (df ) and d0 6∈ ψs (df ), or (ii) d ∈ ψw (df ) and d0 6∈ ψw (df ), or (iii) d ∈ ψms (df ) and d0 ∈ ψs (df ) but d0 6∈ ψms (df ), or (iv) d ∈ ψmw (df ) and d0 ∈ ψw (df ) but d0 6∈ ψmw (df ). d is as good as d0 , denoted d ∼ d0 , iff neither d d0 nor d0 d. The intuition behind Definition 7 is that: (1) minimally deviating strongly/weakly dominant decisions are better than strongly/weakly dominant decisions, which in turn are better than non-strongly/non-weakly dominant decisions; and (2) if it is not the case that one decision is better than the other, then they are equally good. Note that if the set of strongly dominant decisions D is not empty, then D is also weakly dominant (by Proposition 4 in [9]). Hence, strongly dominant decisions are as good as weakly dominant decisions. For our legal example, since d2 (case No. 97) is the only MSWD and d8 (case No. 1962) is weakly dominant, we have d2 d8 according to Definition 7(iv). To ensure that our notions of “better than” and “as good as” are well defined, we need to show that for any two decisions in our decision framework, they can always be compared using our notions. The following several results sanction this: Proposition 2. For all d1 , d2 , d3 ∈ D, if d3 d2 and d2 d1 , then d3 d1 . Proposition 3. ∼ is an equivalence relation on D. Propositions 2 and 3 can be proved by using mathematical induction and contradic- tion, respectively. For the sake of space, their details are omitted here. Given the quotient set D/ ∼ of all equivalence classes, we define a binary relation ≥ on D/ ∼ as follows: Definition 8. Given d ∈ D, let [d] denote the equivalence class to which d belongs. For any [d], [d0 ] ∈ D/ ∼, [d] ≥ [d0 ] iff d d0 or d ∼ d0 . Proposition 4. ≥ is a total order on D/ ∼. Proof. (Sketch) We need to prove that ≥ is anti-symmetric, transitive and total. (i) Anti- symmetry. If [d1 ] ≥ [d2 ], then d1 d2 or d1 ∼ d2 . For the former, d2 6 d1 . Since [d2 ] ≥ [d1 ], we have d2 ∼ d1 , which leads to [d1 ] = [d2 ]. For the latter, [d1 ] = [d2 ] certainly. (ii) Transitivity. Since and ∼ are proven as transitive, it is easy to see that ∀d1 , d2 , d3 ∈ D, if d1 ∼ d2 and d2 d3 , then d1 d3 ; and if d1 d2 and d2 ∼ d3 , then d1 d3 . (iii) Totality can be proven by contradiction. 2 Proposition 4 actually says that any two decisions in a decision framework can be compared with respect to the “better than” or “as good as” relation given in Definition 7. 6 Natural Language Explanation Dispute trees give a comprehensive explanation to decisions. However, they are less useful for comparing decisions, i.e., for answering the question: Why is di better than dj ? Transparently answering such a comparison question is crucial for human, e.g., to trust the decision that is recommended by computer systems. Hence, in this section we will design an algorithm to extract information from dispute trees and then generate natural language explanations specifically answering pairwise comparison questions. This algorithm corresponds to the decision ranking mechanism described in Section 5. Figures 2-4 show our algorithm for generating natural language texts explaining the rationale in the decision making, where Figures 3 and 4 give subroutines for procedures used in Figure 2. We illustrate these algorithms with the legal example. Example 4 (Example 3: continued). From earlier discussions, we know that d2 is the best decision. Comparing d2 with d8 , the algorithm in Figure 2 provides an explana- tion. Since argument {mw(d2 )} ` mw(d2 ) is admissible as d2 is MDWD, the algo- rithm compares d2 and d8 with the dispute tree shown in Figure 1. Since argument {mw(d8 )} ` mw(d8 ) is not admissible as d8 is not MDWD, and nWD(d8 ) is not the conclusion of any P argument (indicating that d8 is not weakly dominant), it will check whether or not worse(d8 , d2 ) (d8 is worse than d2 for having more deviating attributes) is the claim of any P argument. Since worse(d8 , d2 ) is indeed the claim of argument P: 32, which contains deviate(a9 ), indicating that d8 has more deviating attributes (a9 ) than d2 , the system outputs: Both d2 and d8 meet most goals, however, d2 is better than d8 as d2 has fewer deviating attributes. For example, d8 has deviating attribute a9 but d2 does not. Fig. 2. Algorithm for automated explanation (labels on the right are used in proofs). Theorem 3. The algorithm shown in Figures 2, 3 and 4 is sound and complete w.r.t. the ordering given in Definition 7. Proof. (Sketch) With the labels beside each output in Figures 2-4, we can easily prove completeness. For soundness, here we just consider the case of d d0 since d ∈ ψs (df ) and d0 6∈ ψs (df ), others can be proved similarly. Since d ∈ ψs (df ), if ms(d) is admissi- ble, according to Figure 3, the output is “d is better than d0 as d meets all goals but d0 does not”; otherwise, by Proposition 1, we have d 6∈ ψmw (df ). Then from Figure 2, the output is “d is better than d0 because d meets all goals but d0 does not”. 2 Putting everything together, based on the corresponding ABA framework, we see that case No. 97 is the most similar case, because only it is MDWD. Case No. 97 fully agrees with the sentence for case No. 355 (i.e., 10 years of imprisonment, 3 years deprive of political right with $10,000 fine), so indeed it can be deemed to be the most similar, thus validating MDWD as a suitable decision criterion here. In addition, when being posed with the question: “why is case No. 97 better than case No. 245?”, our algorithm gives: No. 97 is better than No. 245 because No. 97 meets more goals. For example, No. 245 does not meet goal “the defendant is older than 18” but No. 97 does. 7 Related work Some studies exist about argumentation-based decision making and analysis. For ex- ample, in [9], Fan et al. give basic decision frameworks and functions, and in [8] they discuss how preferences over goals can be expressed and incorporated. Our work has Fig. 3. Comparison based on the dispute tree of ms(d) (labels on the right are used in proofs). extended theirs by introducing new decision criteria, a decision comparison method and the algorithms for generating natural language texts to explain the selected decisions. Müller and Hunter [17] present an argumentation-based system for decision analysis. Their work is based on ASPIC+ [20], whereas our work is based on ABA. Moreover, they present a method for generating decisions, whereas we present an algorithm for the generation of natural language explanations for decisions. Natural Language Generation (NLG) is the natural language processing task of pro- ducing readable text in ordinary language, usually from complex data streams. A wide range of practical uses of NLG has been studied, including writing weather forecasts [22], summarising medical data [19] or generating hypothesis [21]. To the best of our knowledge, ours is the first NLG work for argumentation-based decision making. Heras et al. [11] have proposed a case-based argumentation framework for agent societies for customer support. By being engaged in argumentation dialogues, agents reach agreements for best solutions. The differences between their work and ours are: (i) they use value-based argumentation [18], whereas we use ABA; and (ii) they justify the reasoning process by showing dialogue graphs, while we generate natural language text, which is (arguably) easier for ordinary human users to understand. Our model can be used for case-based reasoning (CBR), but it is quite different from existing work. For example, though both HYPO [3, 4] and our method can iden- tify the most similar cases, only ours can give an argumentative natural-language expla- nation ranking. Moreover, our work is actually a generic decision framework that can be used in any other domains, while HYPO is specific for law. Moreover, Armengol and Plaza [2] provide an explanation scheme for similarities in CBR, through which the users can understand why some cases are retrieved, but cannot know why others are not. Our work can do both by: (i) argumentative explanation with ABA computation and (ii) our natural language explanation for pairwise decision comparison. In addition, McSherry [16] applies a CBR approach to a recommender system that offers benefits in explaining the recommendation process, but its focus is on the efficiency and trans- Fig. 4. Comparison based on the dispute tree of mw(d) (labels on the right are used in proofs). parency of the recommendation process by explaining relevant questions being asked to users. Instead, our work focuses on multi-attribute decision making and the natural language explanation for a decision that our model recommend. Finally, our model can be viewed as a general framework for the explanation of multi-attribute decision models. Similarly, in [14], a selected decision can be explained by the analysis of the values of weights on criteria together with the overall scores of the decisions. However, we do not consider weighted-based decision model but rather the natural language generation of pairwise comparison explanation. 8 Conclusion This paper presents a new decision criterion, minimal deviation, and its combination with two notions of dominance to select decisions that meet more goals but with fewer “redundant” attributes (i.e., attributes not contributing to meeting goals). We have de- veloped mappings onto ABA for the two resulting types of decisions, not only serving as a basis for decision selection but also contributing to better explaining the selection. Moreover, we have formally defined a decision ranking mechanism by giving a total ordering amongst all decisions. The argumentative counterpart used to identify best de- cisions also serves as foundation to provide natural language explanations for why one decision is better than another. Our natural language explanation algorithm fully takes advantage of the potential of argumentative decision making to support transparent ex- planations automatically. Finally, we have illustrated our method in the legal domain, for identifying the most similar cases in a repository of past cases, and more impor- tantly for explaining, in natural language, why a case is more similar than others, which is critical for lawyers to trust the case(s) that our model recommends. In the future, we plan to expand our legal case-study to larger repositories. It would also be interesting to consider decision criteria involving both attribute deviation and preference ranking over goals. It is also worth furthering the link between argumentation- based decision making and other decision making methods, and applying our frame- work to other application domains. In particular, this will allows us to ascertain the generality of our method. Acknowledgements This research was supported by the EPSRC TRaDAr project: EP/J020915/1, Inter- national Program of Project 985, Sun Yat-Sen University, Raising Program of Major Project of Sun Yat-sen University (No. 1309089), and MOE Project of Key Research Institute of Humanities and Social Sciences at Universities, China (No. 13JJD720017). References 1. L. Amgoud and H. Prade. Using arguments for making and explaining decisions. Art. Int., 173(3-4):413–436, 2009. 2. E. Armengol and E. Plaza. Symbolic explanation of similarities in case-based reasoning. Comput. Inf., 25(2-3):153–171, 2012. 3. K. D. Ashley. Modeling legal arguments: Reasoning with cases and hypotheticals. MIT press, 1991. 4. K. D. Ashley, C. Lynch, N. Pinkwart, and V. Aleven. A process model of legal argument with hypotheticals. In Proc. JURIX, pages 1–10, 2008. 5. K. Atkinson, T. Bench-Capon, and P. McBurney. Justifying practical reasoning. In Proc. CMNA, pages 87–90, 2004. 6. P. M. Dung, R. A. Kowalski, and F. Toni. Assumption-based argumentation. In Argumenta- tion in Artificial Intelligence, pages 199–218. Springer, 2009. 7. P.M. Dung, R.A. Kowalski, and F. Toni. Dialectic proof procedures for assumption-based, admissible argumentation. Art. Int., 170(2):114–159, 2006. 8. X. Fan, R. Craven, R. Singer, F. Toni, and M. Williams. Assumption-based argumentation for decision-making with preferences: A medical case study. In Proc. CLIMA, pages 374–390, 2013. 9. X. Fan and F. Toni. Decision making with assumption-based argumentation. In Proc. TAFA, pages 127–142, 2013. 10. J. Fox, D. Glasspool, V. Patkar, M. Austin, L. Black, M. South, D. Robertson, and C. Vincent. Delivering clinical decision support services: There is nothing as practical as a good theory. J. of Biom. Inf., 43(5):831–843, 2010. 11. S. Heras, J. Jordn, V. Botti, and V. Julian. Argue to agree: A case-based argumentation approach. Int. J. App. Reas., 54(1):82–108, 2013. 12. A Kakas and P Moraitis. Argumentation based decision making for autonomous agents. In Proc. AAMAS, pages 883–890, 2003. 13. A. T. Kronman. Precedent and tradition. Yale Law J., 99(5):1029–1068, 1990. 14. C. Labreuche. A general framework for explaining the results of a multi-attribute preference model. Art. Int., 175(7):1410–1448, 2011. 15. P. A. Matt, F. Toni, and J. Vaccari. Dominant decisions by argumentation agents. In Proc. ArgMAS, pages 42–59. Springer, 2009. 16. D. McSherry. Explanation in recommender systems. Art. Int. Rev., 24(2):179–197, 2005. 17. J. Müller and A. Hunter. An argumentation-based approach for decision making. In Proc. ICTAI, pages 564–571, 2012. 18. F. S. Nawwab, T. J. M. Bench-Capon, and P. E. Dunne. A methodology for action-selection using value-based argumentation. In Proc. COMMA, pages 264–275, 2008. 19. F. Portet, E. Reiter, , A. Gatt, J. Hunter, S. Sripada, Y. Freer, and C. Sykes. Automatic generation of textual summaries from neonatal intensive care data. Art. Int., 173(7):789– 816, 2009. 20. H. Prakken. An abstract framework for argumentation with structured arguments. Arg. and Comp., 1(2):93–124, 2010. 21. P. R. Quinlan, A. Thompson, and C. Reed. An analysis and hypothesis generation platform for heterogeneous cancer databases. In Proc. COMMA, pages 59–70, 2012. 22. E. Reiter, S. Sripada, J.Hunter, and I. Davy. Choosing words in computer-generated weather forecasts. Art. Int., 167(1-2):137–169, 2005.