Justification in Case-Based Reasoning

Wijnand van Woerkom¹, Davide Grossi²,³,⁴, Henry Prakken¹,⁵ and Bart Verheij²

¹ Department of Information and Computing Sciences, Utrecht University, The Netherlands
² Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, The Netherlands
³ Amsterdam Center for Law and Economics, University of Amsterdam, The Netherlands
⁴ Institute for Logic, Language and Computation, University of Amsterdam, The Netherlands
⁵ Faculty of Law, University of Groningen, The Netherlands

Abstract
The explanation and justification of decisions is an important subject in contemporary data-driven automated methods. Case-based argumentation has been proposed as the formal background for the explanation of data-driven automated decision making. In particular, a method was developed in recent work based on the theory of precedential constraint which reasons from a case base, given by the training data of the machine learning system, to produce a justification for the outcome of a focus case. An important role is played in this method by the notions of citability and compensation, and in the present work we develop these in more detail. Special attention is paid to the notion of compensation; we formally specify the notion and identify several of its desirable properties. These considerations reveal a refined formal perspective on the explanation method as an extension of the theory of precedential constraint with a formal notion of justification.

Keywords
Precedential constraint, Interpretability, Law

1. Introduction

In [1] a case-based reasoning method is proposed to explain data-driven automated decisions for binary classification, based on the theory of precedential constraint introduced in [2, 3].
This method is motivated by an analogy between the way in which a machine learning system draws on training data to assign a label to a new data point and the way in which a court of law draws on previously decided cases to make a decision about a new fact situation: in both situations, the precedent that has been set must be adhered to as closely as possible. The theory of precedential constraint, which has been developed to describe the type of a fortiori reasoning used for legal decision making on the basis of case law, can therefore be applied to analyze machine-learned decisions that are made on the basis of training data. More specifically, the method of [1] formally models the kind of dialogue in which lawyers cite precedents to argue in favor of their preferred outcome of the new fact situation. These citations, and the way in which they attack the opponent's citation, are formalized using an abstract argumentation framework as in [4]. A winning strategy in the grounded argument game on this framework, starting with an initial citation of a suitable precedent case, is taken as the explanation of the decision of the new fact situation.

1st International Workshop on Argumentation for eXplainable AI (ArgXAI, co-located with COMMA '22), September 12, 2022, Cardiff, UK. Contact: w.k.vanwoerkom@uu.nl (W. van Woerkom), https://webspace.science.uu.nl/~woerk003/ (W. van Woerkom). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073, pp. 1–13.

In the present work, we examine the explanation model of [1] in detail and make various suggestions and modifications for improvement.
Particularly close attention is paid to the subject of compensation: the way in which important differences between a new fact situation and a precedent case can be compensated for by features of the focus case. We make the formal nature of this subject more explicit, and specify various desirable properties it may have. Subsequently, we show that the model can be equivalently viewed as extending the theory of precedential constraint with notions of justification and citability, the combination of which constitutes the explanations produced by the model. This equivalent formulation uses only the simple notion of a relation, thus greatly simplifying the specification of the model. The resulting view may be more broadly applied to the type of downplaying attacks seen in similar systems such as CATO [5].

We begin this work by summarizing the relevant aspects of the theory of precedential constraint in Section 2. In Section 3 we give a description of the explanation method of [1]. In Section 4 we revisit the definition of best citability, suggest some improvements, and demonstrate their potential experimentally. Then in Section 5 we reconsider the compensation relation and formulate desirable properties. These considerations lead us to give an equivalent formulation of the model just in terms of relations, which we do in Section 6. We conclude in Section 7 with some final thoughts and remarks.

2. Precedential Constraint

The theory of precedential constraint was developed in [2, 3] to describe the a fortiori reasoning involved in case law. It is taken as the point of departure of the explanation method in [1] and so we begin by recalling those aspects of it that are necessary for the rest of this work. The contents of this section are largely similar to [6, Section 2].

In order to describe the fact situation of a case we use what are called dimensions in the AI & Law literature, which are formally just partially ordered sets.

Definition 2.1.
A dimension is a partially ordered set (𝑑, ⪯).

We will frequently omit explicit reference to the dimension order ⪯ and instead refer to just the set 𝑑 when we speak of a dimension. A model of precedential constraint of a specific domain assumes there is a set of these dimensions 𝐷, relative to which the rest of the definitions are specified.

Definition 2.2. A fact situation 𝑃 is a choice function on the set of dimensions 𝐷, i.e. for each dimension 𝑑 ∈ 𝐷 an element 𝑃(𝑑) ∈ 𝑑 of that dimension is chosen by 𝑃. A case 𝑝 is a fact situation 𝑃 paired with an outcome 𝑠 ∈ {0, 1}, written 𝑝 = 𝑃 : 𝑠. A set CB of cases is called a case base. If 𝑝 = 𝑃 : 𝑠 we may write 𝑝(𝑑) instead of 𝑃(𝑑).

In the context of a case 𝑝, 𝑞, 𝑟, ... we will refer to its fact situation by the corresponding upper case letters 𝑃, 𝑄, 𝑅, ... without further explicit mention.

The order ⪯ of a dimension 𝑑 specifies the relative preference the elements of 𝑑 have towards either of two outcomes 0 and 1. More specifically, if 𝑣 ≺ 𝑤 for 𝑣, 𝑤 ∈ 𝑑 this means 𝑤 prefers outcome 1 relative to 𝑣, and conversely 𝑣 prefers outcome 0 relative to 𝑤. Usually we want to compare preference towards an arbitrary outcome 𝑠, so to do this we define for any dimension (𝑑, ⪯) the notation ⪯𝑠 := ⪯ if 𝑠 = 1 and ⪯𝑠 := ⪰ if 𝑠 = 0.

Definition 2.3. Given fact situations 𝑃 and 𝑄 we say 𝑄 is at least as good as 𝑃 for an outcome 𝑠, denoted 𝑃 ⪯𝑠 𝑄, if it is at least as good for 𝑠 on every dimension 𝑑: 𝑃 ⪯𝑠 𝑄 if and only if 𝑃(𝑑) ⪯𝑠 𝑄(𝑑) for all 𝑑 ∈ 𝐷. If moreover 𝑝 = 𝑃 : 𝑠 is a previously decided case we say that 𝑝 forces the decision of 𝑄 for 𝑠. A case base CB forces the decision of 𝑄 for 𝑠 if it contains a case that does so.

Definition 2.4. Given two cases 𝑝 = 𝑃 : 𝑠 and 𝑞 = 𝑄 : 𝑠 such that 𝑃 ⪯𝑠 𝑄 we say that the outcome of 𝑞 for 𝑠 was forced by the case 𝑝, and write 𝑝 ⪯ 𝑞.
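Definitions 2.1 to 2.4 can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: fact situations are represented as dictionaries, cases as `(fact_situation, outcome)` pairs, and the names `ORDERS`, `leq`, `at_least_as_good` and `forces` are hypothetical helpers, not part of [1] or [6]. The dimension orders are those of the running recidivism example (Example 2.1).

```python
# Dimension orders of the running recidivism example (Example 2.1).
# Each predicate answers "v ⪯ w", i.e. w favours outcome 1 at least as much as v.
ORDERS = {
    "Age": lambda v, w: v >= w,      # (N, ≥)
    "Priors": lambda v, w: v <= w,   # (N, ≤)
    "Sex": lambda v, w: (v, w) in {("F", "F"), ("M", "M"), ("F", "M")},
}


def leq(d, v, w, s):
    """v ⪯_s w on dimension d: the order itself for s = 1, its converse for s = 0."""
    return ORDERS[d](v, w) if s == 1 else ORDERS[d](w, v)


def at_least_as_good(P, Q, s):
    """P ⪯_s Q (Definition 2.3): Q is at least as good for s on every dimension."""
    return all(leq(d, P[d], Q[d], s) for d in ORDERS)


def forces(p, q):
    """p ⪯ q (Definition 2.4) for cases given as (fact_situation, outcome) pairs."""
    (P, s), (Q, t) = p, q
    return s == t and at_least_as_good(P, Q, s)


# Hypothetical cases, chosen only to exercise the definitions.
p = ({"Age": 45, "Priors": 4, "Sex": "M"}, 1)
q = ({"Age": 40, "Priors": 6, "Sex": "M"}, 1)
print(forces(p, q))  # q is younger and has more priors than p
print(forces(q, p))
```

Representing each order as a predicate keeps the sketch independent of whether a dimension is numeric or a finite poset such as Sex.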
To give some intuition for these definitions we consider a running example of risk of recidivism, as in [6, Example 2.1].

Example 2.1. Convicts are described along three dimensions: age (Age, ⪯Age), the number of prior offenses (Priors, ⪯Priors), and sex (Sex, ⪯Sex). Age and number of priors have the natural numbers as possible values, so Age := ℕ and Priors := ℕ. The values for sex are Sex := {M, F}. The outcome for this domain is a judgement of whether the person is at high (1) or low (0) risk of recidivism. The associated orders are as follows:

(Age, ⪯Age) := (ℕ, ≥), (Priors, ⪯Priors) := (ℕ, ≤), (Sex, ⪯Sex) := ({M, F}, {(F, F), (M, M), (F, M)}).

If a relation 𝑅 is defined on all dimensions we can, for fact situations 𝑃 and 𝑄, refer to the set of dimensions on which 𝑅 holds with [𝑅(𝑃, 𝑄)] := {𝑑 ∈ 𝐷 | 𝑅(𝑃(𝑑), 𝑄(𝑑))}. For instance, instantiating 𝑅 := ⪯̸𝑠 we have [𝑃 ⪯̸𝑠 𝑄] = {𝑑 ∈ 𝐷 | 𝑃(𝑑) ⪯̸𝑠 𝑄(𝑑)}: the dimensions on which 𝑄 is not at least as good for 𝑠 as 𝑃.

Besides fact situations we will also consider partial fact situations, i.e. fact situations defined only on a particular subset of the dimensions. We can do so conveniently using the well-established notation for function restriction. Let 𝑓 : 𝑋 → 𝑌 and 𝑍 ⊆ 𝑋; we obtain a function 𝑓 ↾ 𝑍 : 𝑍 → 𝑌 by restriction: 𝑓 ↾ 𝑍 := {(𝑥, 𝑦) ∈ 𝑓 | 𝑥 ∈ 𝑍}. For cases 𝑝 and 𝑞 with the same outcome 𝑠 we write 𝑊(𝑝, 𝑞) := 𝑄 ↾ [𝑃 ⪯̸𝑠 𝑄], the values of 𝑞 on which 𝑞 is worse than 𝑝 for 𝑠, and 𝐵(𝑝, 𝑞) := 𝑄 ↾ [𝑃 ⪯𝑠 𝑄], the values of 𝑞 on which 𝑞 is at least as good as 𝑝 for 𝑠.

Example 2.2. Suppose we have a case base of recidivism risk judgements, and two cases 𝑝, 𝑞 with outcome 1 (i.e. judged high risk of recidivism) such that:

𝑝(Age) = 45, 𝑝(Priors) = 4, 𝑝(Sex) = M,
𝑞(Age) = 50, 𝑞(Priors) = 5, 𝑞(Sex) = M.

Now we can compute that 𝑊(𝑝, 𝑞) = {(Age, 50)} and 𝐵(𝑝, 𝑞) = {(Priors, 5), (Sex, M)}.

3. A Case-Based Reasoning Explanation Method

In this section we detail the workings of the dimension-based model of explanation of [1], which was inspired by the work of [7]. A more detailed comparison between [1], [7], and other related works can be found in [1, Section 8].

The method is built upon the theory of precedential constraint of [2, 3] and conceptually tries to mimic the arguments relating to precedent used by lawyers with respect to case law. In such discussions, precedent cases are cited by both sides as a means of arguing that the present (focus) case should be decided in the same way as the precedent. Both sides may attack the other's citations, by pointing to important differences between the citation and the focus case; and they may defend themselves against such attacks, by pointing to aspects of the focus case that compensate for these differences. Each of the elements of such a discussion (case citations, pointing to differences, and compensating for differences) has its counterpart in the formal model of explanation.

A key idea underlying the approach is that a tabular dataset for binary classification can be interpreted as a case base CB in the sense of Definition 2.2. The method assumes access to the training data used by the system, and interprets each of the features in the data as a dimension in the sense of Definition 2.1. The corresponding dimension orders may be determined by knowledge engineering, statistical methods, or a combination thereof. This gives us a body of precedent CB upon which the machine learning system bases its decisions.

Under this interpretation the machine learning system can be seen as deciding new fact situations for one side or the other. The goal is to explain a particular decision of a fact situation 𝐹 for a side 𝑠, called the focus case 𝑓 = 𝐹 : 𝑠. This explanation is provided in the form of a best citable precedent 𝑝 ∈ CB together with an explanation dialogue in which the choice for this 𝑝 is justified.
This dialogue is formalized as a winning strategy in the grounded argument game of a particular abstract argumentation framework.

Before we can apply the theory of precedential constraint, we should specify the dimensions as in Definition 2.1, and we begin in Section 3.1 by mentioning the method used for doing so in [1, 6]. Any explanation dialogue should start with the citation of a best citable case. A suggestion for the definition of this notion is given in [1] and we continue by recalling it in Section 3.2, after which we explain and motivate the presence of the arguments occurring in the argumentation framework in Sections 3.3 and 3.4. We are then ready to give the formal definition of the framework in Section 3.5, explain what it means to have a winning strategy in the argument game it induces, and as such what constitutes an explanation according to the model.

3.1. Determining the Dimension Orders

In order to instantiate the explanation method for a particular dataset, we should specify the dimension orders as in Definition 2.1. As just noted, this may be done on the basis of knowledge engineering and/or statistical methods. In [1] a general method for determining the orders corresponding to the dimensions was proposed, using a function 𝑐 that associates each ordinal feature 𝑥 in the data with a coefficient expressing the degree to which the values in the range of 𝑥 prefer outcome 1. See [6, Section 4.2] for a more detailed explanation.

3.2. Citability of Cases

An important aspect of the explanations produced by the method of [1] is the selection of the precedent case 𝑝 with which it initiates its explanation of the outcome of the focus case 𝑓. We will now describe how this selection procedure works; later in Section 4 we return to this topic to suggest improvements. We begin with the notion of citability.

Definition 3.1.
A case 𝑝 is citable for a case 𝑓 if (a) both cases have the same outcome 𝑠; and (b) there is a dimension 𝑑 such that 𝑝(𝑑) ⪯𝑠 𝑓(𝑑).

Since this is quite a weak requirement there may in general be very many citable cases 𝑝 for any given 𝑓. For this reason the notion is strengthened by requiring that 𝑝 should have a minimal number of relevant differences with 𝑓, according to some suitable notion of minimality. To make this formal we should first define what a relevant difference is. This is accomplished by [1, Definition 11], which we repeat here.

Definition 3.2. The set 𝐷(𝑝, 𝑓) of relevant differences between 𝑝 = 𝑃 : 𝑠 and 𝑓 = 𝐹 : 𝑡 is 𝐷(𝑝, 𝑓) := 𝑃 ↾ [𝑃 ⪯̸𝑠 𝐹] = {(𝑑, 𝑃(𝑑)) | 𝑑 ∈ 𝐷, 𝑃(𝑑) ⪯̸𝑠 𝐹(𝑑)}.

In other words, the relevant differences are given by the values of the precedent 𝑝 on the dimensions on which 𝑓 is not at least as good as 𝑝 for 𝑠. Now a best citable precedent should minimize this set of differences, in the following sense.

Definition 3.3. A case 𝑝 is a best citable case for a case 𝑓 if (a) 𝑝 is citable for 𝑓; and (b) there is no other 𝑞 satisfying (a) for which 𝐷(𝑞, 𝑓) ⊂ 𝐷(𝑝, 𝑓).

3.3. Compensation of Relevant Differences

An idea central to the explanation dialogues is that when a precedent 𝑝 does not force a focus case 𝑓, the values 𝑊(𝑝, 𝑓) on which 𝑓 is worse than 𝑝 for their outcome can be compensated for by the values 𝐵(𝑝, 𝑓) on which 𝑓 is at least as good as 𝑝. This idea is often encountered in the literature on case-based reasoning, see e.g. [8], where certain compensations are described as “showing that at a more abstract level, a parallel exists between the cases, arguing in effect that the apparent distinction is merely a mismatch of details.”

In our context we assume the existence of a relation SC on partial fact situations 𝑥, 𝑦, where SC(𝑦, 𝑥) says that 𝑦 compensates for 𝑥. This is used in practice as follows. Consider a precedent 𝑝 and a focus case 𝑓, both with outcome 𝑠.
If 𝑝 forces the decision of 𝑓 then 𝑓 is at least as good as 𝑝 for 𝑠 on all dimensions, so 𝑊(𝑝, 𝑓) = ∅ or equivalently 𝐵(𝑝, 𝑓) = 𝐹. If this is not the case, then ∅ ⊂ 𝑊(𝑝, 𝑓) or equivalently 𝐵(𝑝, 𝑓) ⊂ 𝐹, and for 𝑝 to justify the outcome of 𝑓 we should have that 𝐵(𝑝, 𝑓) compensates for 𝑊(𝑝, 𝑓), as determined by whether SC(𝐵(𝑝, 𝑓), 𝑊(𝑝, 𝑓)) holds.

3.4. Opposing Citations and Case Transformations

The last component of the dialogue is opposing citations, to which a response is possible through the use of case transformations. The idea is that the proponent of the decision of 𝑓 for its outcome 𝑠 can have their citation countered by the citation of a case 𝑞 with outcome 𝑠̄, as a means of saying that 𝑞 is a more appropriate precedent to draw on. This is analogous to the argument between lawyers in a legal case.

Definition 3.4. We define a semantics function ⟦·⟧ on the compensation arguments by: ⟦Compensates𝑝(𝑦, 𝑥)⟧ := (𝑃 ∖ 𝑃 ↾ dom(𝑥)) ∪ 𝑥 : 𝑠. A case 𝑝 can be transformed into 𝑞 iff 𝑝 = 𝑞 or there exists 𝑋 ∈ 𝒜𝑝 such that ⟦𝑋⟧ = 𝑞.

The goal of the semantics function is to change 𝑝 into a case 𝑞 that forces the outcome of 𝑓. It does so by replacing the values of the precedent case with those of the focus case, on those dimensions on which the focus case is not at least as good as the precedent.

3.5. An Abstract Argumentation Framework for Explanation

We are now ready to describe the formal account of the explanation dialogues in [1] through the use of an abstract argumentation framework, a concept introduced in [4]. An abstract argumentation framework AF = (Arg, Attack) is a directed graph, in which the nodes are interpreted as arguments and the edges as an attack relation between them.

An argumentation framework (Arg, Attack) is defined in [1] that combines the types of arguments defined in the preceding Sections 3.2, 3.3, and 3.4, relative to a focus case 𝑓 = 𝐹 : 𝑠.
To do so we first define, for a particular precedent 𝑝 = 𝑃 : 𝑠 that may be cited in defense of the decision of 𝐹 for 𝑠, a subset 𝒜𝑝 ⊆ Arg as follows:

\[
\mathcal{A}_p := \bigcup \Big\{
\{\mathrm{Worse}_p(x) \mid x = W(p, f) \neq \emptyset\},\
\{\mathrm{Compensates}_p(y, x) \mid \mathrm{Worse}_p(x) \in \mathcal{A}_p,\ y \subseteq B(p, f),\ \mathrm{SC}(y, x)\},\
\{\mathrm{Transformed}_p(q) \mid p \text{ can be transformed into a case } q \text{ with } q \preceq f\}
\Big\}. \tag{1}
\]

Definition 3.5. Given a finite case base CB, a focus case 𝑓 = 𝐹 : 𝑠, and a compensation relation SC, an abstract argumentation framework for explanation with dimensions is a pair AF = (Arg, Attack) where the arguments Arg are given by

\[
\mathrm{Arg} := \mathrm{CB} \cup \bigcup \{\mathcal{A}_p \mid p \in \mathrm{CB} \text{ has the same outcome as } f\},
\]

and for arguments 𝑋, 𝑌 ∈ Arg we have Attack(𝑋, 𝑌) if and only if either:
• 𝑋, 𝑌 ∈ CB have different outcomes and [𝑋 ⪯̸ 𝑓] ⊄ [𝑌 ⪯̸ 𝑓];
• 𝑌 ∈ CB and 𝑋 is of the form Worse𝑌(𝑥);
• 𝑌 is of the form Worse𝑝(𝑥) and 𝑋 is of the form Compensates𝑝(𝑦, 𝑥); or
• 𝑌 ∈ CB has outcome 𝑠̄ and 𝑋 is of the form Transformed𝑝(𝑞).

A dialogue now takes the form of a grounded argument game played on (Arg, Attack). For the sake of brevity we only give an intuitive explanation of how this works; the reader is referred to [1] for a detailed treatment of the subject. An argument game on an AF (𝐴, 𝑅) is a two-player game, in which the players take turns playing arguments from 𝐴 which must attack the previously played argument according to the attack relation 𝑅. A player can win the game by moving an argument to which the other player cannot reply, and a winning strategy for a player is a method of playing that ensures a win regardless of how the opponent plays.

We now have the formal machinery in place to define explanations as in [1].

Definition 3.6. An explanation of a focus case 𝑓 is a winning strategy in the grounded argument game starting with the citation of a best citable precedent 𝑝 ∈ CB for 𝑓, played on the abstract argumentation framework for explanation with dimensions (Arg, Attack).
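The construction of the argument set 𝒜𝑝 in Eq. (1) can be illustrated with a small Python sketch. This is a reading of the definitions under stated assumptions, not the implementation of [1]: fact situations are dictionaries, arguments are tagged tuples, and `ORDERS`, `leq`, `worse`, `better`, `subsets`, and `arguments_for` are hypothetical names. The example values are those of Example 2.2, with the compensation relation that the paper later uses in Example 5.1.

```python
from itertools import combinations

# Orders of the recidivism example (Example 2.1): predicate for "v ⪯ w".
ORDERS = {
    "Age": lambda v, w: v >= w,
    "Priors": lambda v, w: v <= w,
    "Sex": lambda v, w: (v, w) in {("F", "F"), ("M", "M"), ("F", "M")},
}


def leq(d, v, w, s):
    """v ⪯_s w: the order for s = 1, its converse for s = 0."""
    return ORDERS[d](v, w) if s == 1 else ORDERS[d](w, v)


def worse(P, F, s):
    """W(p, f): values of f on the dimensions where P(d) ⪯_s F(d) fails."""
    return {d: F[d] for d in ORDERS if not leq(d, P[d], F[d], s)}


def better(P, F, s):
    """B(p, f): values of f on the dimensions where P(d) ⪯_s F(d) holds."""
    return {d: F[d] for d in ORDERS if leq(d, P[d], F[d], s)}


def subsets(partial):
    """All sub-dictionaries y ⊆ partial, smallest first."""
    items = list(partial.items())
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield dict(combo)


def arguments_for(P, F, s, SC):
    """The arguments of Eq. (1) for precedent P : s and focus case F : s."""
    args = []
    x = worse(P, F, s)
    transforms = [P]  # p can always be transformed into itself (Definition 3.4)
    if x:
        args.append(("Worse", x))
        for y in subsets(better(P, F, s)):
            if SC(y, x):
                args.append(("Compensates", y, x))
                t = {**P, **x}  # ⟦Compensates_p(y, x)⟧: overwrite P on dom(x)
                if t not in transforms:
                    transforms.append(t)
    for Q in transforms:
        if not worse(Q, F, s):  # Q ⪯_s F, so the transformed case forces f
            args.append(("Transformed", Q))
    return args


# Example 2.2, with p as precedent and q as focus case, both with outcome 1.
P = {"Age": 45, "Priors": 4, "Sex": "M"}
F = {"Age": 50, "Priors": 5, "Sex": "M"}
sc = lambda y, x: y.get("Priors", 0) >= 4  # compensation relation of Example 5.1
for argument in arguments_for(P, F, 1, sc):
    print(argument)
```

Enumerating every 𝑦 ⊆ 𝐵(𝑝, 𝑓) is exponential in the number of dimensions, so the sketch is meant for reading the definition rather than for use on real datasets.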
The winning strategies may be viewed as trees and have the following general shape:

𝑝
├─ Worse𝑝(𝑥)
│  └─ Compensates𝑝(𝑦, 𝑥)
├─ 𝑞1
│  └─ Transformed𝑝(𝑟1)
├─ ...
└─ 𝑞𝑛
   └─ Transformed𝑝(𝑟𝑛)

4. On the Citability of Cases

Let us now consider some possible modifications of Definition 3.3 to better formalize the intuitive notion of a most closely related case 𝑝 of our focus case 𝑓. Firstly, since Definition 3.2 does not gather just the dimensions on which 𝑓 is worse than 𝑝 but also the value of 𝑝 at that dimension, a situation can arise where there is some case 𝑞 with [𝑄 ⪯̸𝑠 𝐹] ⊂ [𝑃 ⪯̸𝑠 𝐹] but 𝑄 ↾ [𝑄 ⪯̸𝑠 𝐹] ⊄ 𝑃 ↾ [𝑃 ⪯̸𝑠 𝐹], just because there is some dimension 𝑑 ∈ [𝑄 ⪯̸𝑠 𝐹] with 𝑄(𝑑) ≠ 𝑃(𝑑). It does not seem correct to dismiss 𝑞 as a good citation simply because it disagrees with 𝑝 on a single dimension, especially when [𝑄 ⪯̸𝑠 𝐹] is only a very small subset of [𝑃 ⪯̸𝑠 𝐹]. Let us look at an example to illustrate this point.

Example 4.1. We consider three cases 𝑝, 𝑞, 𝑓 with outcome 1 (meaning they were judged high risk of recidivism) in the recidivism scenario of Example 2.1:

𝑝(Age) = 20, 𝑝(Sex) = M, 𝑝(Priors) = 3,
𝑞(Age) = 50, 𝑞(Sex) = M, 𝑞(Priors) = 4,
𝑓(Age) = 40, 𝑓(Sex) = M, 𝑓(Priors) = 2.

We have that 𝐷(𝑝, 𝑓) = {(Age, 20), (Priors, 3)} and 𝐷(𝑞, 𝑓) = {(Priors, 4)}. Therefore, even though there are fewer dimensions on which 𝑞 has relevant differences with 𝑓 (as {Priors} ⊂ {Age, Priors}), this does not prevent 𝑝 from being considered a best citable precedent for 𝑓 (as {(Priors, 4)} ⊄ {(Age, 20), (Priors, 3)}).

This consideration suggests the definition should require minimality of [𝑃 ⪯̸𝑠 𝐹] instead of 𝑃 ↾ [𝑃 ⪯̸𝑠 𝐹]. However, this modification leaves room for a second type of scenario where there is some precedent 𝑞 which is intuitively much closer to the focus case relative to some other 𝑝, without hindering 𝑝 from being considered best citable. To see why, we consider a set of 𝑛 + 1 dimensions {𝑑0, ..., 𝑑𝑛}. Now we may have that [𝑄 ⪯̸𝑠 𝐹] = {𝑑0} and [𝑃 ⪯̸𝑠 𝐹] = {𝑑1, ..., 𝑑𝑛}. This means that the presence of 𝑞 does not hinder 𝑝's being considered a best citable precedent for 𝑓, even though 𝑓 is worse than 𝑝 on 𝑛 times as many dimensions as it is worse than 𝑞 on. To remedy this, we could require minimality of the number of dimensions rather than the set of dimensions itself, i.e. of |[𝑃 ⪯̸𝑠 𝐹]|.

In addition to looking just at differences between the precedent and focus case it may be beneficial to also consider the similarities since, after all, the stare decisis doctrine states that similar cases must be decided similarly. To achieve this we can require the best citable precedent to subsequently maximize |[𝑃 = 𝐹]|, so that it both minimizes differences and maximizes similarities. In all, this leads us to the following definition.

Definition 4.1. A case 𝑝 is a best citable case for a case 𝑓 if it satisfies the conditions
(a) 𝑝 is citable for 𝑓;
(b) there is no other 𝑞 satisfying (a) with |[𝑄 ⪯̸𝑠 𝐹]| < |[𝑃 ⪯̸𝑠 𝐹]|;
(c) there is no other 𝑞 satisfying (a) and (b) with |[𝑄 = 𝐹]| > |[𝑃 = 𝐹]|.

Experimental results in [1] showed that there are in general many cases satisfying Definition 3.3 for any 𝑓. Measured on three datasets, the mean and standard deviation of the number of best citable cases were respectively 82 ± 123.6, 76 ± 134, and 106 ± 116.5 [1, Table 5]. Recalculating these statistics for the same datasets with Definition 4.1 instead results in 5.6 ± 2.0, 2.1 ± 2.6, and 2.6 ± 2.5 best citable cases on average, respectively; a substantial decrease. Still, the definition remains somewhat ad hoc, and more research is needed to assess its adequacy.

5. Specifying the Compensation Relation

In [1] no further explicit assumptions are made about the compensation relation SC. However, in order for this relation to function according to our intuitions it may be necessary to do so, and we now consider a few such requirements.
Let us first illustrate SC through a continuation of Example 2.2.

Example 5.1. We saw two example cases 𝑝, 𝑞 where 𝑞 was worse than 𝑝 on the dimension Age, but at least as good on Priors and Sex. Suppose that for a number of priors of at least 4, we no longer care about values besides the number of priors. Then we may define SC(𝑦, 𝑥) if and only if 𝑦(Priors) ≥ 4. In this case the worse values 𝑊(𝑝, 𝑞) would indeed be compensated for by the better values 𝐵(𝑝, 𝑞), since 𝑞(Priors) = 5.

A point to consider is whether the compensation relation should itself adhere to an a fortiori principle. That is to say, if a set 𝑦 is capable of compensating for a set 𝑥, should a superset 𝑧 ⊇ 𝑦 be capable of compensating for 𝑥 as well?

Definition 5.1. A compensation relation SC is monotone if for any partial fact situations 𝑥, 𝑦, 𝑧 it holds that SC(𝑦, 𝑥) implies SC(𝑦 ∪ 𝑧, 𝑥).

The same goes for values that are being compensated for; if a set 𝑦 can compensate for a set 𝑥 then we might require it to compensate for any subset 𝑧 ⊆ 𝑥 as well.

Definition 5.2. A compensation relation SC is antitone if for any partial fact situations 𝑥, 𝑦, 𝑧 it holds that SC(𝑦, 𝑥) implies SC(𝑦, 𝑥 ∩ 𝑧).

In the factor-based model of explanation in [1], i.e. the special case where the dimensions are all two-element sets with a linear order, it is possible to compensate for a set of worse values in parts through the use of a pSubstitutes(𝑦, 𝑥, 𝑐)&cCancels(𝑦′, 𝑥′, 𝑐) move [1, Definition 5]. We can translate this to the dimensional setting as follows.

Definition 5.3. A compensation relation SC is linear if for any partial fact situations 𝑤, 𝑥, 𝑦, 𝑧 it holds that SC(𝑤, 𝑥) and SC(𝑦, 𝑧) imply SC(𝑤 ∪ 𝑦, 𝑥 ∪ 𝑧).

A more fundamental question regarding the compensation relation is that of context dependence: should the compensation of two sets be allowed to depend on the context in which it takes place? This question and its consequences are the subject of Section 6.
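On a finite set of values, the properties of Definitions 5.1 to 5.3 can be checked by brute force. The Python sketch below does this under several assumptions that are not in the source: partial fact situations are frozensets of (dimension, value) pairs, the tiny value universe `VALUES` is hypothetical, and unions that would assign two values to one dimension are skipped because they are not partial fact situations. `sc_priors` follows Example 5.1, while `sc_count` is an invented relation used only as a counterexample to linearity.

```python
from itertools import product

# Hypothetical, deliberately tiny universe of values per dimension.
VALUES = {"Age": [20, 50], "Priors": [1, 5]}


def partial_fact_situations():
    """All partial fact situations over VALUES, as frozensets of (dimension, value) pairs."""
    dims = list(VALUES)
    options = [[None] + VALUES[d] for d in dims]
    return [
        frozenset((d, v) for d, v in zip(dims, choice) if v is not None)
        for choice in product(*options)
    ]


def is_functional(pairs):
    """True if no dimension is assigned two different values."""
    dims = [d for d, _ in pairs]
    return len(dims) == len(set(dims))


PFS = partial_fact_situations()


def is_monotone(SC):
    """Definition 5.1: SC(y, x) implies SC(y ∪ z, x)."""
    return all(
        SC(y | z, x)
        for x, y, z in product(PFS, repeat=3)
        if SC(y, x) and is_functional(y | z)
    )


def is_antitone(SC):
    """Definition 5.2: SC(y, x) implies SC(y, x ∩ z)."""
    return all(SC(y, x & z) for x, y, z in product(PFS, repeat=3) if SC(y, x))


def is_linear(SC):
    """Definition 5.3: SC(w, x) and SC(y, z) imply SC(w ∪ y, x ∪ z)."""
    return all(
        SC(w | y, x | z)
        for w, x, y, z in product(PFS, repeat=4)
        if SC(w, x) and SC(y, z) and is_functional(w | y) and is_functional(x | z)
    )


def sc_priors(y, x):
    """Example 5.1: y compensates for anything once it records at least 4 priors."""
    return any(d == "Priors" and v >= 4 for d, v in y)


def sc_count(y, x):
    """Invented relation: y compensates for x if it has at least as many values."""
    return len(y) >= len(x)


for name, SC in [("sc_priors", sc_priors), ("sc_count", sc_count)]:
    print(name, is_monotone(SC), is_antitone(SC), is_linear(SC))
```

In this universe `sc_priors` passes all three checks, whereas `sc_count` is monotone and antitone but not linear: two one-element sets can each compensate for a different one-element set without their union compensating for the two-element union.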
6. Justification as an Extension of Forcing

An interesting way to think of the compensation relation is as an extension of the notion of forcing between cases. In essence a compensation says that while a precedent 𝑝 might not force the decision of some other case 𝑞, the obstructing relevant differences can be compensated for, and so the precedent 𝑝 may still be said to justify the outcome of 𝑞.

6.1. Context-Dependent Compensations

A downside of the formal specification of this compensation relation is that it is defined on partial fact situations, rather than just fact situations. This makes it impossible for compensations to take the values of the precedent into account when allowing compensations to be made.

Example 6.1. In Example 2.2 the difference in age between 𝑝 and 𝑞 is only 5, and we may want to say that 𝐵(𝑝, 𝑞) compensates for 𝑊(𝑝, 𝑞) in this case if we find this difference small enough to be insignificant. To make this compensation possible formally we would need to postulate SC({(Priors, 5), (Sex, M)}, {(Age, 50)}), but this would inadvertently sanction compensations where the age of the precedent case is, say, 20, in which case we may find the difference in age large enough to be significant.

Modifying SC so that it takes the precedent's values into account yields a relation on full fact situations. A natural requirement of any such relation is that it extends the forcing relation ⪯ of Definition 2.4. This is akin to saying that any set can compensate for the empty set. This leads us to the following definition.

Definition 6.1. A relation ⊑ on cases is called a justification relation if it extends the forcing relation ⪯, i.e. if ⪯ ⊆ ⊑.

Note that any compensation relation SC gives rise to a justification relation ⊑SC:

\[
p \sqsubseteq_{\mathrm{SC}} q \text{ if and only if } p \preceq q \text{ or } \mathrm{SC}(B(p, q), W(p, q)). \tag{2}
\]

The converse does not hold, precisely because a justification relation takes into account the context of the compensation.
To see this, consider the naïve approach of obtaining a compensation relation SC⊑ from a justification relation ⊑:

\[
\mathrm{SC}_{\sqsubseteq}(y, x) \text{ if and only if } p \sqsubseteq q \text{ for some } p, q \text{ with } x = W(p, q),\ y = B(p, q). \tag{3}
\]

The problem is that this definition is not necessarily well defined, meaning that the truth value of SC⊑(𝑦, 𝑥) may depend on the particular representatives 𝑝 and 𝑞 that are used for its evaluation. This leads us to define the notion of a context-independent ⊑, requiring exactly that the relation SC⊑ above is well defined.

Definition 6.2. A justification relation ⊑ is context-independent if for any four cases 𝑝, 𝑞, 𝑟, 𝑠 with 𝑊(𝑝, 𝑞) = 𝑊(𝑟, 𝑠) and 𝐵(𝑝, 𝑞) = 𝐵(𝑟, 𝑠) it holds that 𝑝 ⊑ 𝑞 iff 𝑟 ⊑ 𝑠.

6.2. Winning Strategies and Justification

The terminology of Definition 6.1 is inspired by [1], where an argument is said to be justified if and only if the proponent has a winning strategy in the grounded argument game about the argument. We will now formally justify this comparison by showing that for any compensation relation SC the proponent of an initial citation 𝑝 has a winning strategy in the game on the argumentation framework if and only if 𝑝 ⊑SC 𝑓 (as in Eq. (2)).

Let us fix a precedent case 𝑝 and a focus case 𝑓, and introduce some shorthand terminology to ease our work. We will say a case 𝑝 has a winning strategy if the proponent has a winning strategy in the grounded argument game on the explanation AF (Arg, Attack) of Definition 3.5, starting with a citation of 𝑝. Following [1] we distinguish between nontrivial winning strategies for 𝑝, in which 𝑝 can be attacked by a Worse𝑝(𝑥) move, and trivial winning strategies for 𝑝, in which there is no Worse𝑝(𝑥) attack possible. In other words, a winning strategy for 𝑝 is nontrivial if Worse𝑝(𝑥) ∈ 𝒜𝑝 and trivial if Worse𝑝(𝑥) ∉ 𝒜𝑝, with 𝒜𝑝 as defined in Eq. (1).

Proposition 6.1. There is a trivial winning strategy for 𝑝 if and only if 𝑝 ⪯ 𝑓.

Proof. Note that Worse𝑝(𝑥) ∉ 𝒜𝑝 iff 𝑊(𝑝, 𝑓) = ∅ iff 𝑝 ⪯ 𝑓.
Hence left to right is immediate. For right to left we note in addition that any citation made by the opponent can be attacked with a Transformed𝑝(𝑝) move, and so since there is no reply possible to a Transformed move the proponent has a (trivial) winning strategy for 𝑝.

Proposition 6.2. There is a nontrivial winning strategy for 𝑝 if and only if 𝑊(𝑝, 𝑓) ≠ ∅ and SC(𝐵(𝑝, 𝑓), 𝑊(𝑝, 𝑓)).

Proof. Suppose the proponent has a nontrivial winning strategy. Since Worse𝑝(𝑥) ∈ 𝒜𝑝 attacks the initial citation of 𝑝 there should be a Compensates𝑝(𝑦, 𝑥) response to the Worse𝑝(𝑥) move available to the proponent, with 𝑦 = 𝐵(𝑝, 𝑓). This implies that SC(𝐵(𝑝, 𝑓), 𝑊(𝑝, 𝑓)).

For the other direction we begin by noting that because 𝑊(𝑝, 𝑓) ≠ ∅ there is Worse𝑝(𝑥) ∈ 𝒜𝑝, and so the assumption SC(𝐵(𝑝, 𝑓), 𝑊(𝑝, 𝑓)) guarantees that there is 𝐶 = Compensates𝑝(𝑦, 𝑥) ∈ 𝒜𝑝. Now, there are two types of moves available to the opponent to which we need a reply.
1. The first is Worse𝑝(𝑥) ∈ 𝒜𝑝. As mentioned we have a reply 𝐶 available, and since a compensation move cannot be replied to the game is won by the proponent.
2. The second is the citation of a case 𝑞 ∈ CB with outcome 𝑠̄ for which it holds that [𝑞 ⪯̸ 𝑓] ⊄ [𝑝 ⪯̸ 𝑓]. By Definition 3.4 we have that 𝑝 can be transformed into 𝑝′ = ⟦𝐶⟧, and so we can reply to the citation with Transformed𝑝(𝑝′) ∈ 𝒜𝑝. There are no more moves available to the opponent and so the proponent wins the game.

Corollary 6.2.1. There is a winning strategy for 𝑝 if and only if 𝑝 ⊑SC 𝑓.

Proof. Applying Eq. (2) and then Propositions 6.1 and 6.2 we get 𝑝 ⊑SC 𝑓 iff 𝑝 ⪯ 𝑓 or SC(𝐵(𝑝, 𝑓), 𝑊(𝑝, 𝑓)) iff 𝑝 has a trivial winning strategy or 𝑝 has a nontrivial winning strategy iff 𝑝 has a winning strategy.

Under this view of the winning strategies, and employing a fully general definition of compensation through a justification relation ⊑, we can now rephrase Definition 3.6 of explanations in the following way.
Definition 6.3. An explanation of a case 𝑓 is a best citable precedent 𝑝 ∈ CB with 𝑝 ⊑ 𝑓.

The theory of precedential constraint describes how the outcome of a fact situation can be forced by precedent. However, the collection of precedents may not be sufficient to force the outcome of all possible new fact situations. If such an undecided fact situation presents itself there may still be a precedent which, on the basis of additional reasoning, can be argued to justify an outcome for the fact situation. This is the view suggested by Corollary 6.2.1; a justification relation goes beyond the forcing relation by sanctioning citations of precedents that do not strictly force the outcome of the focus case.

6.3. A Relational Description of the Explanation Model

Having shown that a justification relation in some sense corresponds to the winning strategies underlying the explanations of [1], we can give a succinct description of the explanation method just through the use of relations on cases. Let us think of citability as a relation ⊴; then those 𝑝 ∈ CB related to the focus case 𝑓 through the intersection ⊑ ∩ ⊴ are said to explain 𝑓, i.e. those 𝑝 with 𝑝 ⊑ 𝑓 and 𝑝 ⊴ 𝑓.

The model in [1] is a top-level model as it does not give explicit definitions of these notions, apart from suggesting a definition for the citability relation ⊴ as in Definition 3.3, and a method for determining ⪯ on the basis of Pearson correlation coefficients. In its running example and the experiments in [1, Section 6] all compensations are allowed, so that ⊑ ∩ ⊴ = ⊴. Through the relational view we summarize these inputs as follows:
1. The forcing relation ⪯, determined by specifying the dimensions and their orders.
2. The justification relation ⊑, determined by specifying the compensations.
3. The citability relation ⊴, determined by the definition of a best citable precedent.
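The three inputs listed above can be combined in a few lines of Python. The sketch below is one possible reading, assuming the justification relation ⊑SC of Eq. (2) and the best citability of Definition 4.1; the case base, the compensation relation `young_enough`, and all function names are hypothetical and chosen only for illustration.

```python
# Orders of the recidivism example (Example 2.1): predicate for "v ⪯ w".
ORDERS = {
    "Age": lambda v, w: v >= w,
    "Priors": lambda v, w: v <= w,
    "Sex": lambda v, w: (v, w) in {("F", "F"), ("M", "M"), ("F", "M")},
}


def leq(d, v, w, s):
    """v ⪯_s w: the order for s = 1, its converse for s = 0."""
    return ORDERS[d](v, w) if s == 1 else ORDERS[d](w, v)


def worse_dims(P, F, s):
    """[P ⪯̸_s F]: dimensions on which F is not at least as good as P for s."""
    return {d for d in ORDERS if not leq(d, P[d], F[d], s)}


def justifies(p, f, SC):
    """p ⊑_SC f as in Eq. (2): p forces f, or B(p, f) compensates for W(p, f)."""
    (P, s), (F, t) = p, f
    if s != t:
        return False
    dims = worse_dims(P, F, s)
    W = {d: F[d] for d in dims}
    B = {d: F[d] for d in ORDERS if d not in dims}
    return not dims or SC(B, W)


def best_citable(CB, f):
    """Best citable precedents for f in the sense of Definition 4.1."""
    F, s = f
    # (a) same outcome, and at least one dimension on which f is at least as good
    citable = [p for p in CB if p[1] == s and len(worse_dims(p[0], F, s)) < len(ORDERS)]
    if not citable:
        return []
    # (b) minimise the number of dimensions with relevant differences
    fewest = min(len(worse_dims(p[0], F, s)) for p in citable)
    closest = [p for p in citable if len(worse_dims(p[0], F, s)) == fewest]
    # (c) among those, maximise the number of dimensions with equal values
    equal = lambda p: sum(p[0][d] == F[d] for d in ORDERS)
    most = max(equal(p) for p in closest)
    return [p for p in closest if equal(p) == most]


def explanations(CB, f, SC):
    """Precedents p with p ⊴ f and p ⊑ f (Definition 6.3)."""
    return [p for p in best_citable(CB, f) if justifies(p, f, SC)]


# Hypothetical case base and focus case.
CB = [
    ({"Age": 20, "Priors": 3, "Sex": "M"}, 1),
    ({"Age": 50, "Priors": 4, "Sex": "M"}, 1),
    ({"Age": 60, "Priors": 5, "Sex": "F"}, 1),
    ({"Age": 70, "Priors": 0, "Sex": "F"}, 0),
]
f = ({"Age": 40, "Priors": 2, "Sex": "M"}, 1)
young_enough = lambda y, x: "Age" in y and y["Age"] <= 40  # invented SC
print(explanations(CB, f, young_enough))
```

Swapping in a different `SC` or a different citability definition changes only `justifies` or `best_citable`, which mirrors the separation of the three relations in the list above.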
This relational view considerably simplifies the presentation of the model, as it does not rely on the concepts of argumentation frameworks and winning strategies.

7. Discussion and Conclusion

We have described the explanation model of [1] in Section 3, which provides explanations as winning strategies in the grounded argument game of an abstract argumentation framework. In Section 6 we showed that this model admits an equivalent rephrasing in terms of relations, in which explanations are provided as cases related to the focus case through the justification and citability relations. Most notably, this shows that the explanation model can be seen as adding a notion of justification to the theory of precedential constraint, in the form of a relation ⊑ extending the forcing relation ⪯.

We conclude by noting an important consideration for future work on the topic. In order to apply this notion of justification to the explanation of machine-learned decisions, it is imperative that the input parameters (the forcing, justification, and citability relations) are constructed in such a way that they are faithful to the rationale of the black box under consideration. Otherwise, a justification runs a high risk of becoming a rationalization that does not reflect the real reasons behind the decision.

Acknowledgments

This research was (partially) funded by the Hybrid Intelligence Center, a 10-year programme funded by the Dutch Ministry of Education, Culture and Science through the Netherlands Organisation for Scientific Research, grant number 024.004.022. We also thank the referees for very useful commentary and suggestions.

References

[1] H. Prakken, R. Ratsma, A top-level model of case-based argumentation for explanation: Formalisation and experiments, Argument & Computation 13 (2022) 159–194.
[2] J. F. Horty, Rules and reasons in the theory of precedent, Legal Theory 17 (2011) 1–33.
[3] J. Horty, Reasoning with dimensions and magnitudes, Artificial Intelligence and Law 27 (2019) 309–345.
[4] P. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games, Artificial Intelligence 77 (1995) 321–357.
[5] V. Aleven, K. D. Ashley, Evaluating a learning environment for case-based argumentation skills, in: Proceedings of the 6th International Conference on Artificial Intelligence and Law, 1997, pp. 170–179.
[6] W. van Woerkom, D. Grossi, H. Prakken, B. Verheij, Landmarks in case-based reasoning: From theory to data, in: Proceedings of the First International Conference on Hybrid Human-Machine Intelligence, Frontiers of AI, IOS Press, 2022, p. tbd.
[7] K. Čyras, D. Birch, Y. Guo, F. Toni, R. Dulay, S. Turvey, D. Greenberg, T. Hapuarachchi, Explanations by arbitrated argumentative dispute, Expert Systems with Applications 127 (2019) 141–156.
[8] V. Aleven, Using background knowledge in case-based legal reasoning: A computational model and an intelligent learning environment, Artificial Intelligence 150 (2003) 183–237.