Justification in Case-Based Reasoning
Wijnand van Woerkom1, Davide Grossi2,3,4, Henry Prakken1,5 and Bart Verheij2

1 Department of Information and Computing Sciences, Utrecht University, The Netherlands
2 Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, The Netherlands
3 Amsterdam Center for Law and Economics, University of Amsterdam, The Netherlands
4 Institute for Logic, Language and Computation, University of Amsterdam, The Netherlands
5 Faculty of Law, University of Groningen, The Netherlands


Abstract
The explanation and justification of decisions is an important subject in contemporary data-driven
automated methods. Case-based argumentation has been proposed as the formal background for the
explanation of data-driven automated decision making. In particular, a method was developed in recent
work, based on the theory of precedential constraint, which reasons from a case base, given by the training
data of the machine learning system, to produce a justification for the outcome of a focus case. An
important role is played in this method by the notions of citability and compensation, and in the present
work we develop these in more detail. Special attention is paid to the notion of compensation; we
formally specify the notion and identify several of its desirable properties. These considerations reveal
a refined formal perspective on the explanation method as an extension of the theory of precedential
constraint with a formal notion of justification.

Keywords
Precedential constraint, Interpretability, Law



1. Introduction
In [1] a case-based reasoning method is proposed to explain data-driven automated decisions
for binary classification, based on the theory of precedential constraint introduced in [2, 3].
This method is motivated by an analogy between the way in which a machine learning system
draws on training data to assign a label to a new data point and the way in which a court of
law draws on previously decided cases to make a decision about a new fact situation, because
in both of these situations the precedent that has been set must be adhered to as closely as
possible. The theory of precedential constraint, which has been developed to describe the type
of a fortiori reasoning used for legal decision making on the basis of case law, can therefore be
applied to analyze machine-learned decisions that are made on the basis of training data.
   More specifically, the method of [1] formally models the kind of dialogue in which lawyers
cite precedents to argue in favor of their preferred outcome of the new fact situation. These
citations, and the way in which they attack the opponent’s citation, are formalized using an

1st International Workshop on Argumentation for eXplainable AI (ArgXAI, co-located with COMMA ’22), September 12, 2022, Cardiff, UK
Email: w.k.vanwoerkom@uu.nl (W. van Woerkom)
URL: https://webspace.science.uu.nl/~woerk003/ (W. van Woerkom)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).





abstract argumentation framework as in [4]. A winning strategy in the grounded argument
game on this framework, starting with an initial citation of a suitable precedent case, is taken
as the explanation of the decision of the new fact situation.
   In the present work, we examine the explanation model of [1] in detail and make various
suggestions and modifications for improvement. Particularly close attention is paid to the
subject of compensation: the way in which important differences between a new fact situation
and a precedent case can be compensated for by features of the focus case. We make the
formal nature of this subject more explicit, and specify various desirable properties it may have.
Subsequently, we show that the model can be equivalently viewed as extending the theory of
precedential constraint with notions of justification and citability, the combination of which
constitutes the explanations produced by the model. This equivalent formulation only uses the
simple notion of a relation, thus greatly simplifying the specification of the model. The resulting
view may be more broadly applied to the type of downplaying attacks seen in similar systems
such as CATO [5].
   We begin this work by summarizing the relevant aspects of the theory of precedential
constraint in Section 2. In Section 3 we give a description of the explanation method of [1]. In
Section 4 we revisit the definition of best citability, suggest some improvements, and demonstrate
their potential experimentally. Then in Section 5 we reconsider the compensation relation and
formulate desirable properties. These considerations lead us to give an equivalent formulation
of the model just in terms of relations, which we do in Section 6. We conclude in Section 7 with
some final thoughts and remarks.


2. Precedential Constraint
The theory of precedential constraint was developed in [2, 3] to describe the a fortiori reasoning
involved with case law. It is taken as the point of departure of the explanation method in [1]
and so we begin by recalling those aspects of it that are necessary for the rest of this work. The
contents of this section are largely similar to [6, Section 2].
  In order to describe the fact situation of a case we use what are called dimensions in the AI &
law literature, which are formally just partially ordered sets.

Definition 2.1. A dimension is a partially ordered set (𝑑, ⪯).

  We will frequently omit explicit reference to the dimension order ⪯ and instead refer to just
the set 𝑑 when we speak of a dimension. A model of precedential constraint of a specific domain
assumes there is a set of these dimensions 𝐷, relative to which the rest of the definitions are
specified.

Definition 2.2. A fact situation 𝑃 is a choice function on the set of dimensions 𝐷, i.e. for each
dimension 𝑑 ∈ 𝐷 an element 𝑃 (𝑑) ∈ 𝑑 of that dimension is chosen by 𝑃 . A case 𝑝 is a fact
situation 𝑃 paired with an outcome 𝑠 ∈ {0, 1}, written 𝑝 = 𝑃 : 𝑠. A set CB of cases is called a
case base. If 𝑝 = 𝑃 : 𝑠 we may write 𝑝(𝑑) instead of 𝑃 (𝑑).

  In the context of a case 𝑝, 𝑞, 𝑟, . . . we will refer to its fact situation by the corresponding
upper case letters 𝑃, 𝑄, 𝑅, . . . without further explicit mention.





   The order ⪯ of a dimension 𝑑 specifies the relative preference the elements of 𝑑 have towards
either of two outcomes 0 and 1. More specifically, if 𝑣 ≺ 𝑤 for 𝑣, 𝑤 ∈ 𝑑 this means 𝑤 prefers
outcome 1 relative to 𝑣, and conversely 𝑣 prefers outcome 0 relative to 𝑤. Usually we want to
compare preference towards an arbitrary outcome 𝑠, so to do this we define for any dimension
(𝑑, ⪯) the notation ⪯𝑠 := ⪯ if 𝑠 = 1 and ⪯𝑠 := ⪰ if 𝑠 = 0.
Definition 2.3. Given fact situations 𝑃 and 𝑄 we say 𝑄 is at least as good as 𝑃 for an outcome
𝑠, denoted 𝑃 ⪯𝑠 𝑄, if it is at least as good for 𝑠 on every dimension 𝑑:

                    𝑃 ⪯𝑠 𝑄 if and only if       𝑃 (𝑑) ⪯𝑠 𝑄(𝑑) for all 𝑑 ∈ 𝐷.

If moreover 𝑝 = 𝑃 : 𝑠 is a previously decided case we say that 𝑝 forces the decision of 𝑄 for 𝑠. A
case base CB forces the decision of 𝑄 for 𝑠 if it contains a case that does so.
Definition 2.4. Given two cases 𝑝 = 𝑃 : 𝑠 and 𝑞 = 𝑄 : 𝑠 such that 𝑃 ⪯𝑠 𝑄 we say that the
outcome of 𝑞 for 𝑠 was forced by the case 𝑝, and write 𝑝 ⪯ 𝑞.
   To give some intuition for these definitions we consider a running example of risk of recidi-
vism, as in [6, Example 2.1].
Example 2.1. Convicts are described along three dimensions: age (Age, ⪯Age ), the number
of prior offenses (Priors, ⪯Priors ), and sex (Sex, ⪯Sex ). Age and number of priors have the
natural numbers as possible values, so Age := N and Priors := N. The values for sex are
Sex := {M, F}. The outcome for this domain is a judgement of whether the person is at high
(1) or low (0) risk of recidivism. The associated orders are as follows:

                           (Age, ⪯Age ) := (N, ≥),
                   (Priors, ⪯Priors ) := (N, ≤),
                           (Sex, ⪯Sex ) := ({M, F}, {(F, F), (M, M), (F, M)}).

   If a relation 𝑅 is defined on all dimensions we can, for fact situations 𝑃 and 𝑄, refer to the set
of dimensions on which 𝑅 holds with [𝑅(𝑃, 𝑄)] := {𝑑 ∈ 𝐷 | 𝑅(𝑃 (𝑑), 𝑄(𝑑))}. For instance,
instantiating 𝑅 := ̸⪯𝑠 we have [𝑃 ̸⪯𝑠 𝑄] = {𝑑 ∈ 𝐷 | 𝑃 (𝑑) ̸⪯𝑠 𝑄(𝑑)}; the dimensions on
which 𝑄 is not at least as good for 𝑠 as 𝑃 . Besides fact situations we will also consider partial
fact situations, i.e. fact situations defined only on a particular subset of the dimensions. We can
do so conveniently using the well-established notation for function restriction: for 𝑓 : 𝑋 → 𝑌
and 𝑍 ⊆ 𝑋 we obtain a function 𝑓 ↾ 𝑍 : 𝑍 → 𝑌 by restriction, 𝑓 ↾ 𝑍 := {(𝑥, 𝑦) ∈ 𝑓 | 𝑥 ∈ 𝑍}.
For cases 𝑝 and 𝑞 with the same outcome 𝑠 we write 𝑊 (𝑝, 𝑞) := 𝑄 ↾ [𝑃 ̸⪯𝑠 𝑄], the values of 𝑞
on which 𝑞 is worse than 𝑝 for 𝑠, and 𝐵(𝑝, 𝑞) := 𝑄 ↾ [𝑃 ⪯𝑠 𝑄], the values of 𝑞 on which 𝑞 is
at least as good as 𝑝 for 𝑠.
Example 2.2. Suppose we have a case base of recidivism risk judgements, and two cases 𝑝, 𝑞
with outcome 1 (i.e. judged high risk of recidivism) such that:

             𝑝(Age) = 45,                𝑝(Priors) = 4,                 𝑝(Sex) = M,
             𝑞(Age) = 50,                𝑞(Priors) = 5,                 𝑞(Sex) = M.

Now we can compute that 𝑊 (𝑝, 𝑞) = {(Age, 50)} and 𝐵(𝑝, 𝑞) = {(Priors, 5), (Sex, M)}.
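   To make this computation concrete, the following Python sketch (our own illustration, not
part of [1] or [6]; the names DIMS, leq_s, worse and better are ours) encodes the dimension
orders of Example 2.1 and computes 𝑊 (𝑝, 𝑞) and 𝐵(𝑝, 𝑞) for the two cases above:

    # Each dimension order is encoded as a predicate leq(v, w) meaning v ⪯ w.
    DIMS = {
        "Age":    lambda v, w: v >= w,  # (N, >=): lower ages prefer outcome 1
        "Priors": lambda v, w: v <= w,  # (N, <=): more priors prefer outcome 1
        "Sex":    lambda v, w: (v, w) in {("F", "F"), ("M", "M"), ("F", "M")},
    }

    def leq_s(d, v, w, s):
        # v ⪯_s w on dimension d: the order itself for s = 1, its reverse for s = 0.
        return DIMS[d](v, w) if s == 1 else DIMS[d](w, v)

    def worse(P, Q, s):
        # W(p, q): the values of q on the dimensions where q is not at least
        # as good as p for s, i.e. Q restricted to the set [P not-⪯_s Q].
        return {d: Q[d] for d in DIMS if not leq_s(d, P[d], Q[d], s)}

    def better(P, Q, s):
        # B(p, q): the values of q on the dimensions where q is at least
        # as good as p for s, i.e. Q restricted to the set [P ⪯_s Q].
        return {d: Q[d] for d in DIMS if leq_s(d, P[d], Q[d], s)}

    P = {"Age": 45, "Priors": 4, "Sex": "M"}
    Q = {"Age": 50, "Priors": 5, "Sex": "M"}
    print(worse(P, Q, 1))   # {'Age': 50}
    print(better(P, Q, 1))  # {'Priors': 5, 'Sex': 'M'}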





3. A Case-Based Reasoning Explanation Method
In this section we detail the workings of the dimension-based model of explanation of [1], which
was inspired by the work of [7]. A more detailed comparison between [1], [7], and other related
works, can be found in [1, Section 8]. The method is built upon the theory of precedential
constraint of [2, 3] and conceptually tries to mimic the arguments relating to precedent used by
lawyers with respect to case law. In such discussions, precedent cases are cited by both sides as
a means of arguing that the present (focus) case should be decided similarly to the precedent.
Both sides may attack the other’s citations, by pointing to important differences between the
citation and the focus case; and they may defend themselves against such attacks, by pointing
to aspects of the focus case which compensate for these differences. Each of the elements of
such a discussion – case citations, pointing to differences, and compensating for differences –
has its counterpart in the formal model of explanation.
   A key idea underlying the approach is that a tabular dataset for binary classification can be
interpreted as a case base CB in the sense of Definition 2.2. The method assumes access to the
training data used by the system, and interprets each of the features in the data as a dimension
in the sense of Definition 2.1. The corresponding dimension orders may be determined by
knowledge engineering, statistical methods, or a combination thereof. This gives us a body of
precedent CB upon which the machine learning system bases its decisions.
   Under this interpretation the machine learning system can be seen as deciding new fact
situations for one of two sides (the outcomes 0 and 1). The goal is to explain a particular
decision of a fact situation 𝐹 for a side
𝑠, called the focus case 𝑓 = 𝐹 : 𝑠. This explanation is provided in the form of a best citable
precedent 𝑝 ∈ CB together with an explanation dialogue in which the choice for this 𝑝 is justified.
This dialogue is formalized as a winning strategy in the grounded argument game of a particular
abstract argumentation framework.
   Before we can apply the theory of precedential constraint, we should specify the dimensions
as in Definition 2.1, and we begin in Section 3.1 by mentioning the method used for doing
so in [1, 6]. Any explanation dialogue should start with the citation of a best citable case. A
suggestion for the definition of this notion is given in [1] and we continue by recalling it in
Section 3.2, after which we explain and motivate the presence of the arguments occurring in
the argumentation framework in Sections 3.3 and 3.4. We are then ready to give the formal
definition of the framework in Section 3.5, explain what it means to have a winning strategy in
the argument game it induces, and as such what constitutes an explanation according to the
model.

3.1. Determining the Dimension Orders
In order to instantiate the explanation method for a particular dataset, we should specify the
dimension orders as in Definition 2.1. As just noted, this may be done on the basis of knowledge
engineering and/or statistical methods. In [1] a general method for determining the orders
corresponding to the dimensions was proposed, using a function 𝑐 that associates each ordinal
feature 𝑥 in the data with a coefficient expressing the degree to which the values in the range
of 𝑥 prefer outcome 1. See [6, Section 4.2] for a more detailed explanation.
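   As a rough illustration of this step (a hypothetical sketch of our own, not the exact procedure
of [1]; the function dimension_order is ours), the sign of the Pearson correlation between an
ordinal feature and the outcome could be used to orient the dimension order:

    import numpy as np

    def dimension_order(values, outcomes):
        # A positive correlation coefficient suggests that higher values
        # prefer outcome 1, so the natural order on the values is used;
        # a negative coefficient suggests the reversed order.
        r = np.corrcoef(values, outcomes)[0, 1]
        return (lambda v, w: v <= w) if r >= 0 else (lambda v, w: v >= w)

    # E.g. more prior offenses correlate positively with high risk (1):
    leq = dimension_order([0, 1, 3, 5], [0, 0, 1, 1])
    print(leq(1, 3))  # True: 3 priors prefer outcome 1 relative to 1 prior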






3.2. Citability of Cases
An important aspect of the explanations produced by the method of [1] is the selection of the
precedent case 𝑝 with which it initiates its explanation of the outcome of the focus case 𝑓 . We
will now describe how this selection procedure works; later in Section 4 we return to this topic
to suggest improvements. We begin with the notion of citability.

Definition 3.1. A case 𝑝 is citable for a case 𝑓 if

  (a) both cases have the same outcome 𝑠; and

  (b) there is a dimension 𝑑 such that 𝑝(𝑑) ⪯𝑠 𝑓 (𝑑).

  Since this is quite a weak requirement there may in general be very many citable cases 𝑝
for any given 𝑓 . For this reason the notion is strengthened by requiring that 𝑝 should have a
minimal set of relevant differences with 𝑓 , according to some suitable notion of minimality.
To make this formal we should first define what a relevant difference is. This is accomplished
by [1, Definition 11], which we repeat here.

Definition 3.2. The set 𝐷(𝑝, 𝑓 ) of relevant differences between 𝑝 = 𝑃 : 𝑠 and 𝑓 = 𝐹 : 𝑡 is

               𝐷(𝑝, 𝑓 ) := 𝑃 ↾ [𝑃 ̸⪯𝑠 𝐹 ] = {(𝑑, 𝑃 (𝑑)) | 𝑑 ∈ 𝐷, 𝑃 (𝑑) ̸⪯𝑠 𝐹 (𝑑)}.

In other words, the relevant differences are given by the values of the precedent 𝑝 on the
dimensions on which 𝑓 is not at least as good as 𝑝 for 𝑠. Now a best citable precedent should
minimize this set of differences, in the following sense.

Definition 3.3. A case 𝑝 is a best citable case for a case 𝑓 if

  (a) 𝑝 is citable for 𝑓 ; and

  (b) there is no other 𝑞 satisfying (a) for which 𝐷(𝑞, 𝑓 ) ⊂ 𝐷(𝑝, 𝑓 ).
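   In the style of the sketch in Section 2 (reusing its DIMS and leq_s helpers, and encoding a
case as a pair (𝑃, 𝑠) of a fact-situation dictionary and an outcome; again our own illustration,
not code from [1]), Definitions 3.1–3.3 can be computed as follows:

    def relevant_differences(p, f):
        # D(p, f): the values of p on the dimensions where f is not at
        # least as good as p for p's outcome s (Definition 3.2).
        (P, s), (F, _) = p, f
        return {(d, P[d]) for d in DIMS if not leq_s(d, P[d], F[d], s)}

    def citable(p, f):
        # Definition 3.1: same outcome, and at least one dimension on
        # which f is at least as good as p.
        (P, s), (F, t) = p, f
        return s == t and any(leq_s(d, P[d], F[d], s) for d in DIMS)

    def best_citable(case_base, f):
        # Definition 3.3: the citable cases whose set of relevant
        # differences is minimal with respect to proper set inclusion.
        cands = [p for p in case_base if citable(p, f)]
        return [p for p in cands
                if not any(relevant_differences(q, f) < relevant_differences(p, f)
                           for q in cands)]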

3.3. Compensation of Relevant Differences
An idea central to the explanation dialogues is that when a precedent 𝑝 does not force a focus
case 𝑓 , the values 𝑊 (𝑝, 𝑓 ) on which 𝑓 is worse than 𝑝 for their outcome can be compensated
for by the values 𝐵(𝑝, 𝑓 ) on which 𝑓 is better than 𝑝. This idea is often encountered in the
literature on case-based reasoning, see e.g. [8], where certain compensations are described as
“showing that at a more abstract level, a parallel exists between the cases, arguing in effect that the
apparent distinction is merely a mismatch of details.”
   In our context we assume the existence of a relation SC on partial fact situations 𝑥, 𝑦, where
SC(𝑦, 𝑥) says that 𝑦 compensates for 𝑥. This is used in practice as follows. Consider a precedent
𝑝 and a focus case 𝑓 , both with outcome 𝑠. If 𝑝 forces the decision of 𝑓 then 𝑓 is at least as good
as 𝑝 for 𝑠 on all dimensions, so ∅ = 𝑊 (𝑝, 𝑓 ) or equivalently 𝐵(𝑝, 𝑓 ) = 𝐹 . If this is not the case,
then ∅ ⊂ 𝑊 (𝑝, 𝑓 ) or equivalently 𝐵(𝑝, 𝑓 ) ⊂ 𝐹 , and for 𝑝 to justify the outcome of 𝑓 we should
have that 𝐵(𝑝, 𝑓 ) compensates for 𝑊 (𝑝, 𝑓 ), as determined by whether SC(𝐵(𝑝, 𝑓 ), 𝑊 (𝑝, 𝑓 ))
holds.





3.4. Opposing Citations and Case Transformations
The last component of the dialogue is opposing citations, to which a response is possible through
the use of case transformations. The idea is that the proponent of the decision of 𝑓 for its
outcome 𝑠 can have their citation countered by the citation of a case 𝑞 with outcome ¯𝑠, as a
means of saying that 𝑞 should be a more appropriate precedent to draw on. This is analogous
to the argument between lawyers in a legal case.
Definition 3.4. We define a semantics function J·K on the compensation arguments by:

                     JCompensates𝑝 (𝑦, 𝑥)K := (𝑃 ∖ 𝑃 ↾ dom(𝑥)) ∪ 𝑥 : 𝑠.

A case 𝑝 can be transformed into 𝑞 iff 𝑝 = 𝑞 or there exists 𝑋 ∈ 𝒜𝑝 such that J𝑋K = 𝑞, with
𝒜𝑝 the set of arguments associated with 𝑝 as defined in Eq. (1) below.
   The goal of the semantics function is to change 𝑝 into a case 𝑞 that forces the outcome of 𝑓 .
It does so by replacing the values of the precedent case with those of the focus case, on those
dimensions on which the focus case is not at least as good as the precedent.
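   In the sketch style used above (a simplified illustration of ours; cases are again pairs of a
fact-situation dictionary and an outcome), this replacement reads:

    def transform(p, x):
        # JCompensates_p(y, x)K = (P \ P|dom(x)) ∪ x : s  --  drop the values
        # of the precedent on dom(x) and put the values recorded in x,
        # i.e. the focus case's values W(p, f), in their place.
        P, s = p
        Q = dict(P)
        Q.update(x)
        return (Q, s)

    # With p and x = W(p, q) as in Example 2.2 (q playing the focus case):
    p = ({"Age": 45, "Priors": 4, "Sex": "M"}, 1)
    print(transform(p, {"Age": 50}))  # ({'Age': 50, 'Priors': 4, 'Sex': 'M'}, 1)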

3.5. An Abstract Argumentation Framework for Explanation
We are now ready to describe the formal account of the explanation dialogues in [1] through
the use of an abstract argumentation framework, a concept introduced in [4]. An abstract
argumentation framework AF = (Arg, Attack) is a directed graph, in which the nodes are
interpreted as arguments and the edges as an attack relation between them.
   An argumentation framework (Arg, Attack) is defined in [1] that combines the types of
arguments defined in the preceding Sections 3.2, 3.3, and 3.4, relative to a focus case 𝑓 = 𝐹 : 𝑠.
To do so we first define, for a particular precedent 𝑝 = 𝑃 : 𝑠 that may be cited in defense of the
decision of 𝐹 for 𝑠, a subset 𝒜𝑝 ⊆ Arg as follows:
       𝒜𝑝 := {Worse𝑝 (𝑥) | 𝑥 = 𝑊 (𝑝, 𝑓 ) ̸= ∅}
              ∪ {Compensates𝑝 (𝑦, 𝑥) | Worse𝑝 (𝑥) ∈ 𝒜𝑝 , 𝑦 ⊆ 𝐵(𝑝, 𝑓 ), SC(𝑦, 𝑥)}
              ∪ {Transformed𝑝 (𝑞) | 𝑝 can be transformed into a case 𝑞 with 𝑞 ⪯ 𝑓 }.          (1)

Definition 3.5. Given a finite case base CB, a focus case 𝑓 = 𝐹 : 𝑠, and a compensation
relation SC, an abstract argumentation framework for explanation with dimensions is a pair
AF = (Arg, Attack) where the arguments Arg are given by
               Arg := CB ∪ ⋃ {𝒜𝑝 | 𝑝 ∈ CB with the same outcome as 𝑓 },

and for arguments 𝑋, 𝑌 ∈ Arg we have Attack(𝑋, 𝑌 ) if and only if either:
    • 𝑋, 𝑌 ∈ CB have different outcomes and [𝑋 ̸⪯ 𝑓 ] ̸⊂ [𝑌 ̸⪯ 𝑓 ];

    • 𝑌 ∈ CB and 𝑋 is of the form Worse𝑌 (𝑥);

    • 𝑌 is of the form Worse𝑝 (𝑥) and 𝑋 is of the form Compensates𝑝 (𝑦, 𝑥); or

    • 𝑌 ∈ CB has outcome ¯𝑠 and 𝑋 is of the form Transformed𝑝 (𝑞).
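   These attack clauses can be rendered in the same sketch style (ours, not code from [1];
arguments are encoded as tagged tuples, and for uniformity a citation of 𝑝 is wrapped as
("Cite", 𝑝), a small deviation from the definition where citations are the cases themselves):

    def not_forced_dims(p, f):
        # [p not-⪯ f]: the dimensions on which f is not at least as good as p.
        (P, s), (F, _) = p, f
        return frozenset(d for d in DIMS if not leq_s(d, P[d], F[d], s))

    def attacks(X, Y, f):
        # The four clauses of Definition 3.5; arguments are tuples such as
        # ("Cite", p), ("Worse", p, x), ("Compensates", p, y, x), and
        # ("Transformed", p, q).
        if X[0] == "Cite" and Y[0] == "Cite":      # counter-citation
            return (X[1][1] != Y[1][1]
                    and not not_forced_dims(X[1], f) < not_forced_dims(Y[1], f))
        if X[0] == "Worse" and Y[0] == "Cite":     # Worse_Y(x) attacks Y
            return X[1] == Y[1]
        if X[0] == "Compensates" and Y[0] == "Worse":
            return X[1] == Y[1] and X[3] == Y[2]   # same precedent p, same x
        if X[0] == "Transformed" and Y[0] == "Cite":
            return Y[1][1] != f[1]                 # attacks opposing citations
        return False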





   A dialogue now takes the form of a grounded argument game played on (Arg, Attack). For
the sake of brevity we only give an intuitive explanation of how this works; the reader is referred
to [1] for a detailed treatment of the subject.
   An argument game on an AF (𝐴, 𝑅) is a two-player game, in which the players take turns
playing arguments from 𝐴 which must attack the previously played argument according to
the attack relation 𝑅. A player can win the game by moving an argument to which the other
player cannot reply, and a winning strategy for a player is a method of playing that ensures a
win regardless of how the opponent plays.
   We now have the formal machinery in place to define explanations as in [1].
Definition 3.6. An explanation of a focus case 𝑓 is a winning strategy in the grounded argument
game starting with the citation of a best citable precedent 𝑝 ∈ CB for 𝑓 , played on the abstract
argumentation framework for explanation with dimensions (Arg, Attack).
  The winning strategies may be viewed as trees and have the following general shape: the
root is the initial citation 𝑝; the opponent may reply with a Worse𝑝 (𝑥) move and with counter-
citations 𝑞1 , . . . , 𝑞𝑛 ; the proponent answers Worse𝑝 (𝑥) with a Compensates𝑝 (𝑦, 𝑥) move and
each opposing citation 𝑞𝑖 with a Transformed𝑝 (𝑟𝑖 ) move.

4. On the Citability of Cases
Let us now consider some possible modifications of Definition 3.3 to better formalize the
intuitive notion of a case 𝑝 most closely related to our focus case 𝑓 .
   Firstly, since Definition 3.2 gathers not just the dimensions on which 𝑓 is worse than 𝑝
but also the values of 𝑝 at those dimensions, a situation can arise where there is some case 𝑞
with [𝑄 ̸⪯𝑠 𝐹 ] ⊂ [𝑃 ̸⪯𝑠 𝐹 ] but 𝑄 ↾ [𝑄 ̸⪯𝑠 𝐹 ] ̸⊂ 𝑃 ↾ [𝑃 ̸⪯𝑠 𝐹 ], simply because there is some
dimension 𝑑 ∈ [𝑄 ̸⪯𝑠 𝐹 ] with 𝑄(𝑑) ̸= 𝑃 (𝑑). It does not seem correct to dismiss 𝑞 as a good
citation simply because it disagrees with 𝑝 on a single dimension, especially when [𝑄 ̸⪯𝑠 𝐹 ] is
only a very small subset of [𝑃 ̸⪯𝑠 𝐹 ]. Let us look at an example to illustrate this point.
Example 4.1. We consider three cases 𝑝, 𝑞, 𝑓 with outcome 1 (meaning they were judged high
risk of recidivism) in the recidivism scenario of Example 2.1:

             𝑝(Age) = 20,                𝑝(Sex) = M,                𝑝(Priors) = 3,
             𝑞(Age) = 50,                𝑞(Sex) = M,                𝑞(Priors) = 1,
             𝑓 (Age) = 40,               𝑓 (Sex) = M,               𝑓 (Priors) = 0.

We have that 𝐷(𝑝, 𝑓 ) = {(Age, 20), (Priors, 3)} and 𝐷(𝑞, 𝑓 ) = {(Priors, 1)}. Therefore,
even though there are fewer dimensions on which 𝑞 has relevant differences with 𝑓 – as
{Priors} ⊂ {Age, Priors} – this does not prevent 𝑝 from being considered a best citable
precedent for 𝑓 – as {(Priors, 1)} ̸⊂ {(Age, 20), (Priors, 3)}.





   This consideration suggests the definition should require minimality of [𝑃 ̸⪯𝑠 𝐹 ] instead
of 𝑃 ↾ [𝑃 ̸⪯𝑠 𝐹 ]. However, this modification leaves room for a second type of scenario,
where there is some precedent 𝑞 which is intuitively much closer to the focus case relative
to some other 𝑝, without hindering 𝑝 from being considered best citable. To see why, we
consider a set of 𝑛 + 1 dimensions {𝑑0 , . . . , 𝑑𝑛 }. Now we may have that [𝑄 ̸⪯𝑠 𝐹 ] = {𝑑0 }
and [𝑃 ̸⪯𝑠 𝐹 ] = {𝑑1 , . . . , 𝑑𝑛 }. This means that the presence of 𝑞 does not hinder 𝑝’s being
considered a best citable precedent for 𝑓 , even though 𝑓 is worse than 𝑝 on 𝑛 times as many
dimensions as it is worse than 𝑞. To remedy this, we could require minimality of the number
of dimensions rather than the set of dimensions itself, i.e. of |[𝑃 ̸⪯𝑠 𝐹 ]|.
   In addition to looking just at differences between the precedent and the focus case it may be
beneficial to also consider the similarities, since, after all, the stare decisis doctrine states that
similar cases must be decided similarly. To achieve this we can require the best citable precedent
to subsequently maximize |[𝑃 = 𝐹 ]|, so that it both minimizes differences and maximizes
similarities. In all, this leads us to the following definition.

Definition 4.1. A case 𝑝 is a best citable case for a case 𝑓 if

  (a) 𝑝 is citable for 𝑓 ;

  (b) there is no other 𝑞 satisfying (a) with |[𝑄 ̸⪯𝑠 𝐹 ]| < |[𝑃 ̸⪯𝑠 𝐹 ]|; and

  (c) there is no other 𝑞 satisfying (a) and (b) with |[𝑄 = 𝐹 ]| > |[𝑃 = 𝐹 ]|.
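   A sketch of this refined definition (ours, reusing the citable and leq_s helpers from the
earlier sketches):

    def n_relevant_diffs(p, f):
        # |[P not-⪯_s F]|: the number of dimensions with relevant differences.
        (P, s), (F, _) = p, f
        return sum(1 for d in DIMS if not leq_s(d, P[d], F[d], s))

    def n_agreements(p, f):
        # |[P = F]|: the number of dimensions on which p and f agree exactly.
        (P, _), (F, _) = p, f
        return sum(1 for d in DIMS if P[d] == F[d])

    def best_citable_v2(case_base, f):
        # Definition 4.1: among the citable cases, first minimize the number
        # of relevant-difference dimensions, then maximize exact agreements.
        cands = [p for p in case_base if citable(p, f)]
        if not cands:
            return []
        fewest = min(n_relevant_diffs(p, f) for p in cands)
        cands = [p for p in cands if n_relevant_diffs(p, f) == fewest]
        most = max(n_agreements(p, f) for p in cands)
        return [p for p in cands if n_agreements(p, f) == most]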

   Experimental results in [1] showed that there are in general many cases satisfying Definition
3.3 for any 𝑓 . Measured on three datasets, the mean and standard deviation of the number of best
citable cases were respectively 82 ± 123.6, 76 ± 134, and 106 ± 116.5 [1, Table 5]. Recalculating
these statistics for the same datasets with Definition 4.1 instead yields average numbers of best
citable cases of respectively 5.6 ± 2.0, 2.1 ± 2.6, and 2.6 ± 2.5; a substantial decrease. Still, the
definition remains somewhat ad hoc, and more research is needed to assess its adequacy.


5. Specifying the Compensation Relation
In [1] no further explicit assumptions are made of the compensation relation SC. However in
order for this relation to function according to our intuitions it may be necessary to do so, and
we now consider a few such requirements. Let us first illustrate SC through a continuation of
Example 2.2.

Example 5.1. We saw two example cases 𝑝, 𝑞 where 𝑞 was worse than 𝑝 on the dimension
Age, but at least as good on Priors and Sex. Suppose that for a number of priors of at least 4,
we no longer care about values besides the number of priors. Then we may define

                            SC(𝑦, 𝑥) if and only if      𝑦(Priors) ≥ 4.

In this case the worse values 𝑊 (𝑝, 𝑞) would indeed be compensated for by the better values
𝐵(𝑝, 𝑞), since 𝑞(Priors) = 5.
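   This particular relation is easily written down in the sketch style used so far (partial fact
situations as dictionaries; our own illustration):

    def sc(y, x):
        # SC(y, x) of Example 5.1: the better values y compensate for the
        # worse values x whenever y records at least 4 prior offenses.
        return y.get("Priors", 0) >= 4

    W = {"Age": 50}                  # W(p, q) from Example 2.2
    B = {"Priors": 5, "Sex": "M"}    # B(p, q) from Example 2.2
    print(sc(B, W))                  # True: the compensation is sanctioned

Note that this relation also satisfies the monotonicity property defined below: extending 𝑦
with further values cannot falsify the condition.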

  A point to consider is whether the compensation relation should itself adhere to an a fortiori
principle. That is to say, if a set 𝑦 is capable of compensating for a set 𝑥, should a superset 𝑧 ⊇ 𝑦
be capable of compensating for 𝑥 as well?





Definition 5.1. A compensation relation SC is monotone if for any partial fact situations 𝑥, 𝑦, 𝑧
it holds that SC(𝑦, 𝑥) implies SC(𝑦 ∪ 𝑧, 𝑥).
   The same goes for the values that are being compensated for: if a set 𝑦 can compensate for a
set 𝑥, then we might require it to compensate for any subset 𝑧 ⊆ 𝑥 as well.
Definition 5.2. A compensation relation SC is antitone if for any partial fact situations 𝑥, 𝑦, 𝑧
it holds that SC(𝑦, 𝑥) implies SC(𝑦, 𝑥 ∩ 𝑧).
   In the factor-based model of explanation in [1], i.e. the special case where the dimensions are
all two-element sets with a linear order, it is possible to compensate for a set of worse values in
parts through the use of a pSubstitutes(𝑦, 𝑥, 𝑐)&cCancels(𝑦 ′ , 𝑥′ , 𝑐) move [1, Definition 5]. We
can translate this to the dimensional setting as follows.
Definition 5.3. A compensation relation SC is linear if for any partial fact situations 𝑤, 𝑥, 𝑦, 𝑧
it holds that SC(𝑤, 𝑥) and SC(𝑦, 𝑧) imply SC(𝑤 ∪ 𝑦, 𝑥 ∪ 𝑧).
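   On a finite collection of partial fact situations such properties can be checked by brute force;
a sketch (ours) for monotonicity, where unions are only formed for compatible partial fact
situations:

    def compatible(y, z):
        # Partial fact situations (dictionaries) are compatible if they
        # agree on every shared dimension, so their union is a function.
        return all(y[d] == z[d] for d in y.keys() & z.keys())

    def is_monotone(sc, parts):
        # Definition 5.1 over a finite collection `parts`: whenever
        # SC(y, x) holds, SC(y ∪ z, x) must hold as well.
        return all(sc({**y, **z}, x)
                   for x in parts for y in parts for z in parts
                   if compatible(y, z) and sc(y, x))

    sc = lambda y, x: y.get("Priors", 0) >= 4   # the relation of Example 5.1
    parts = [{}, {"Age": 50}, {"Priors": 5}, {"Priors": 5, "Sex": "M"}]
    print(is_monotone(sc, parts))  # True

Analogous checks can be written for antitonicity and linearity.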
  A more fundamental question regarding the compensation relation is that of context depen-
dence; should the compensation of two sets be allowed to depend on the context in which it
takes place? This question and its consequences are the subject of Section 6.


6. Justification as an Extension of Forcing
An interesting way to think of the compensation relation is as an extension of the notion of
forcing between cases. In essence a compensation says that while a precedent 𝑝 might not force
the decision of some other case 𝑞, the obstructing relevant differences can be compensated for,
and so the precedent 𝑝 may still be said to justify the outcome of 𝑞.

6.1. Context-Dependent Compensations
A downside of the formal specification of this compensation relation is that it is defined on
partial fact situations, rather than on full fact situations. This makes it impossible to take the
values of the precedent into account when deciding whether a compensation should be allowed.
Example 6.1. In Example 2.2 the difference in age between 𝑝 and 𝑞 is only 5, and we may want
to say that 𝐵(𝑝, 𝑞) compensates for 𝑊 (𝑝, 𝑞) in this case if we find this difference small enough
to be insignificant. To make this compensation possible formally we would need to postulate
SC({(Priors, 5), (Sex, M)}, {(Age, 50)}), but this would inadvertently sanction compensations
where the age of the precedent case is, say, 20, in which case we may find the difference in age
large enough to be significant.
   Modifying SC so that it takes the precedents’ values into account yields a relation on full fact
situations. A natural requirement of any such relation is that it extends the forcing relation ⪯
of Definition 2.4. This is akin to saying that any set can compensate for the empty set. This
leads us to the following definition.
Definition 6.1. A relation ⊑ on cases is called a justification relation if it extends the forcing
relation ⪯, i.e. if ⪯ ⊆ ⊑.





  Note that any compensation relation SC gives rise to a justification relation ⊑SC :

                    𝑝 ⊑SC 𝑞   if and only if   𝑝 ⪯ 𝑞 or SC(𝐵(𝑝, 𝑞), 𝑊 (𝑝, 𝑞)).                  (2)

The converse does not hold, precisely because a justification relation takes into account the con-
text of the compensation. To see this, consider the naïve approach of obtaining a compensation
relation SC⊑ from a justification relation ⊑:

       SC⊑ (𝑦, 𝑥)   if and only if 𝑝 ⊑ 𝑞 for some 𝑝, 𝑞 with 𝑥 = 𝑊 (𝑝, 𝑞), 𝑦 = 𝐵(𝑝, 𝑞).          (3)

The problem is that this definition is not necessarily well defined, meaning that the truth value of
SC⊑ (𝑦, 𝑥) may depend on the particular representatives 𝑝 and 𝑞 that are used for its evaluation.
This leads us to define the notion of a context-independent ⊑, requiring exactly that the relation
SC⊑ above is well defined.

Definition 6.2. A justification relation ⊑ is context-independent if for any four cases 𝑝, 𝑞, 𝑟, 𝑠
with 𝑊 (𝑝, 𝑞) = 𝑊 (𝑟, 𝑠) and 𝐵(𝑝, 𝑞) = 𝐵(𝑟, 𝑠) it holds that 𝑝 ⊑ 𝑞 iff 𝑟 ⊑ 𝑠.
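   In the sketch style of the previous sections (reusing the worse, better and leq_s helpers
from the sketch in Section 2; ours, not code from [1]), the justification relation ⊑SC of Eq. (2)
becomes:

    def forces(p, q):
        # p ⪯ q: q is at least as good as p on every dimension for their
        # shared outcome s (Definition 2.4).
        (P, s), (Q, _) = p, q
        return all(leq_s(d, P[d], Q[d], s) for d in DIMS)

    def justifies(p, q, sc):
        # Eq. (2): p ⊑_SC q iff p forces q, or the better values of q
        # compensate for its worse values according to SC.
        (P, s), (Q, _) = p, q
        return forces(p, q) or sc(better(P, Q, s), worse(P, Q, s))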

6.2. Winning Strategies and Justification
The terminology of Definition 6.1 is inspired by [1], where an argument is said to be justified if
and only if the proponent has a winning strategy in the grounded argument game about the
argument. We will now formally justify this comparison by showing that for any compensation
relation SC the proponent of an initial citation 𝑝 has a winning strategy in the game on the
argumentation framework if and only if 𝑝 ⊑SC 𝑓 , with ⊑SC as defined in Eq. (2).
   Let us fix a precedent case 𝑝 and a focus case 𝑓 , and introduce some shorthand terminology
to ease our work. We will say a case 𝑝 has a winning strategy if the proponent has a winning
strategy in the grounded argument game on the explanation AF (Arg, Attack) of Definition 3.5,
starting with a citation of 𝑝. Following [1] we distinguish between nontrivial winning strategies
for 𝑝, in which 𝑝 can be attacked by a Worse𝑝 (𝑥) move, and trivial winning strategies for 𝑝,
in which there is no Worse𝑝 (𝑥) attack possible. In other words, a winning strategy for 𝑝 is
nontrivial if Worse𝑝 (𝑥) ∈ 𝒜𝑝 and trivial if Worse𝑝 (𝑥) ̸∈ 𝒜𝑝 , with 𝒜𝑝 as defined in Eq. (1).

Proposition 6.1. There is a trivial winning strategy for 𝑝 if and only if 𝑝 ⪯ 𝑓 .

Proof. Note that Worse𝑝 (𝑥) ̸∈ 𝒜𝑝 iff 𝑊 (𝑝, 𝑓 ) = ∅ iff 𝑝 ⪯ 𝑓 . Hence left to right is immediate.
For right to left we note in addition that any citation made by the opponent can be attacked
with a Transformed𝑝 (𝑝) move, and so since there is no reply possible to a Transformed move
the proponent has a (trivial) winning strategy for 𝑝.

Proposition 6.2. There is a nontrivial winning strategy for 𝑝 if and only if 𝑊 (𝑝, 𝑓 ) ̸= ∅ and
SC(𝐵(𝑝, 𝑓 ), 𝑊 (𝑝, 𝑓 )).

Proof. Suppose the proponent has a winning strategy. Since Worse𝑝 (𝑥) ∈ 𝒜𝑝 attacks the initial
citation of 𝑝 there should be a Compensates𝑝 (𝑦, 𝑥) response to the Worse𝑝 (𝑥) move available
to the proponent, with 𝑦 = 𝐵(𝑝, 𝑓 ). This implies that SC(𝐵(𝑝, 𝑓 ), 𝑊 (𝑝, 𝑓 )).






   For the other direction we begin by noting that because 𝑊 (𝑝, 𝑓 ) ̸= ∅ there is
Worse𝑝 (𝑥) ∈ 𝒜𝑝 , and so the assumption SC(𝐵(𝑝, 𝑓 ), 𝑊 (𝑝, 𝑓 )) guarantees that there is
𝐶 = Compensates𝑝 (𝑦, 𝑥) ∈ 𝒜𝑝 . Now, there are two types of moves available to the opponent
to which we need a reply.

   1. The first is Worse𝑝 (𝑥) ∈ 𝒜𝑝 . As mentioned we have the reply 𝐶 available, and since a
      compensation move cannot be replied to, the game is won by the proponent.

   2. The second is the citation of a case 𝑞 ∈ CB with outcome ¯𝑠 for which it holds that
      [𝑞 ̸⪯ 𝑓 ] ̸⊂ [𝑝 ̸⪯ 𝑓 ]. By Definition 3.4 we have that 𝑝 can be transformed into 𝑝′ = J𝐶K,
      and so we can reply to the citation with Transformed𝑝 (𝑝′ ) ∈ 𝒜𝑝 . There are no more
      moves available to the opponent and so the proponent wins the game.

Corollary 6.2.1. There is a winning strategy for 𝑝 if and only if 𝑝 ⊑SC 𝑓 .

Proof. Applying Eq. (2) and then Propositions 6.1 and 6.2 we get

       𝑝 ⊑SC 𝑓 iff 𝑝 ⪯ 𝑓 or SC(𝐵(𝑝, 𝑓 ), 𝑊 (𝑝, 𝑓 ))
               iff 𝑝 has a trivial winning strategy or 𝑝 has a nontrivial winning strategy
               iff 𝑝 has a winning strategy.

   Under this view of the winning strategies, and employing a fully general definition of com-
pensation through a justification relation ⊑, we can now rephrase Definition 3.6 of explanations
in the following way.

Definition 6.3. An explanation of a case 𝑓 is a best citable precedent 𝑝 ∈ CB with 𝑝 ⊑ 𝑓 .

   The theory of precedential constraint describes how the outcome of a fact situation can
be forced by precedent. However, the collection of precedents may not be sufficient to force
the outcome of all possible new fact situations. If such an undecided fact situation presents
itself there may still be a precedent which, on the basis of additional reasoning, can be argued
to justify an outcome for the fact situation. This is the view suggested by Corollary 6.2.1; a
justification relation goes beyond the forcing relation by sanctioning citations of precedents
that do not strictly force the outcome of the focus case.

6.3. A Relational Description of the Explanation Model
Having shown that a justification relation in some sense corresponds to the winning strategies
underlying the explanations of [1], we can give a succinct description of the explanation method
just through the use of relations on cases. Let us think of citability as a relation ⊴; then those
𝑝 ∈ CB related to the focus case 𝑓 through the intersection ⊑ ∩ ⊴ are said to explain the focus
case, i.e. those 𝑝 with 𝑝 ⊑ 𝑓 and 𝑝 ⊴ 𝑓 .
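   In this reading the whole method reduces to a one-line computation; a sketch (ours, with
best_citable_v2 and justifies from the earlier sketches standing in for ⊴ and ⊑):

    def explanations(case_base, f, sc):
        # The explanations of f: the best citable precedents that also
        # justify f, i.e. the cases related to f by both ⊴ and ⊑.
        return [p for p in best_citable_v2(case_base, f)
                if justifies(p, f, sc)]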
   The model in [1] is a top-level model as it does not give explicit definitions of these notions,
apart from suggesting a definition for the citability relation ⊴ as in Definition 3.3, and a method
for determining ⪯ on the basis of Pearson correlation coefficients. In its running example and
the experiments in [1, Section 6] all compensations are allowed, so that ⊑ ∩ ⊴ = ⊴. Through
the relational view we summarize these inputs as follows:

   1. the forcing relation ⪯, determined by specifying the dimensions and their orders;

   2. the justification relation ⊑, determined by specifying the compensations;

   3. the citability relation ⊴, determined by the definition of a best citable precedent.

This view considerably simplifies the presentation of the model, as it does not rely on the
concepts of argumentation frameworks and winning strategies.


7. Discussion and Conclusion
We have described the explanation model of [1] in Section 3, which provides explanations as
winning strategies in the grounded argument game of an abstract argumentation framework. In
Section 6 we showed that this model admits an equivalent rephrasing in terms of relations, in
which explanations are provided as cases related to the focus case through justification and
citability relations. Most notably, this shows that the explanation model can in some sense be
seen as adding a notion of justification to the theory of precedential constraint, as a relation ⊑
extending the forcing relation ⪯.
   We conclude by noting an important consideration for future work on the topic. In order to
apply this notion of justification to the explanation of machine-learned decisions, it is imperative
that the input parameters (the forcing, justification, and citability relations) are constructed
in such a way that they are faithful to the rationale of the black box under consideration;
otherwise such a justification runs a high risk of becoming a rationalization that does not reflect
the real reasons behind the decision.


Acknowledgments
This research was (partially) funded by the Hybrid Intelligence Center, a 10-year programme
funded by the Dutch Ministry of Education, Culture and Science through the Netherlands
Organisation for Scientific Research, grant number 024.004.022. We also thank the referees for
very useful commentary and suggestions.


References
[1] H. Prakken, R. Ratsma, A top-level model of case-based argumentation for explanation:
    Formalisation and experiments, Argument & Computation 13 (2022) 159–194.
[2] J. F. Horty, Rules and reasons in the theory of precedent, Legal Theory 17 (2011) 1–33.
[3] J. Horty, Reasoning with dimensions and magnitudes, Artificial Intelligence and Law 27
    (2019) 309–345.
[4] P. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic
    reasoning, logic programming and n-person games, Artificial Intelligence 77 (1995) 321–357.
[5] V. Aleven, K. D. Ashley, Evaluating a learning environment for case-based argumentation
    skills, in: Proceedings of the 6th International Conference on Artificial Intelligence and Law,
    1997, pp. 170–179.






[6] W. van Woerkom, D. Grossi, H. Prakken, B. Verheij, Landmarks in case-based reasoning:
    From theory to data, in: Proceedings of the First International Conference on Hybrid
    Human-Machine Intelligence, Frontiers of AI, IOS Press, 2022, p. tbd.
[7] K. Čyras, D. Birch, Y. Guo, F. Toni, R. Dulay, S. Turvey, D. Greenberg, T. Hapuarachchi,
    Explanations by arbitrated argumentative dispute, Expert Systems with Applications 127
    (2019) 141–156.
[8] V. Aleven, Using background knowledge in case-based legal reasoning: A computational
    model and an intelligent learning environment, Artificial Intelligence 150 (2003) 183–237.



