<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Longitudinal Distance: Towards Accountable Instance Attribution</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rosina O. Weber</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Prateek Goel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shideh Amiri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gideon Simpson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Science, Department of Mathematics, Drexel University</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Previous research in interpretable machine learning (IML) and explainable artificial intelligence (XAI) can be broadly categorized as either focusing on seeking interpretability in the agent's model (i.e., IML) or focusing on the context of the user in addition to the model (i.e., XAI). The former can be categorized as feature or instance attribution. Example- or sample-based methods such as those using or inspired by case-based reasoning (CBR) rely on various approaches to select instances, but the selected instances are not necessarily the ones responsible for an agent's decision. Furthermore, existing approaches have focused on interpretability and explainability but fall short when it comes to accountability. Inspired by case-based reasoning principles, this paper introduces a pseudo-metric we call Longitudinal distance and describes its use to attribute instances to a neural network agent's decisions, which can potentially be used to build accountable CBR agents.</p>
      </abstract>
      <kwd-group>
        <kwd>explainable artificial intelligence</kwd>
        <kwd>interpretable machine learning</kwd>
        <kwd>case-based reasoning</kwd>
        <kwd>pseudo-metric</kwd>
        <kwd>accountability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The literature in interpretable machine learning (IML) and explainable artificial
intelligence (XAI) has proposed various categorization schemes and taxonomies
to describe their methods. One of those is through attribution – attribution
methods attempt to explain model behavior by associating a classified (or
predicted) test instance to elements of the model (e.g., [
        <xref ref-type="bibr" rid="ref28 ref32">28, 32</xref>
        ]).
      </p>
      <p>
        There are two main categories of attribution methods. Instance attribution
methods select instances as integral elements to associate with a classification or
prediction (e.g., [
        <xref ref-type="bibr" rid="ref15 ref17 ref24 ref31 ref34 ref8">8, 15, 17, 24, 34, 31</xref>
        ]). Instance attribution methods have been
used for debugging models, detecting data set errors, and creating
visually-indistinguishable adversarial training examples [
        <xref ref-type="bibr" rid="ref17 ref24 ref34 ref8">8, 17, 24, 34</xref>
        ]. Feature attribution
methods select features to attribute to a classification or prediction indicating
which features play a more significant role than others [
        <xref ref-type="bibr" rid="ref20 ref29 ref30 ref32 ref5 ref9">5, 9, 20, 29, 30, 32</xref>
        ].
      </p>
      <p>
        When using case-based reasoning (CBR), a third category is often mentioned:
example- or sample-based. CBR can offer instances as examples but it does
not make attributions. The most successful use of CBR for explainability or
interpretability is in the ANN-CBR twin by [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In their work, Kenny and Keane [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] were able to transfer the accuracy of a CNN into CBR for full transparency.
This was done by applying the feature attribution approach DeepLift [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]. (Copyright © 2021 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).)
      </p>
      <p>
        Both instance and feature attribution methods suffer strong criticisms. DeepLift
[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] demonstrated good performance when used in the investigation reported in
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. However, feature attribution methods have been shown to be insensitive to
changes in models and in data [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In addition, they do not work in deep learning
(DL) models that use a memory [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. One important limitation of feature
attribution methods is the difficulty of evaluating them, because the data distribution changes
when a feature is removed [
        <xref ref-type="bibr" rid="ref13 ref2">13, 2</xref>
        ].
      </p>
      <p>
        If we extend the analysis beyond interpretability and explainability and
consider accountability, then feature attribution methods pose more challenges. An
accountable decision-making process should demonstrate its processes align with
legal and policy standards [
        <xref ref-type="bibr" rid="ref19 ref25">19, 25</xref>
        ]. This notion is expanded in the literature
around two goals. The first is to generate a fully reproducible [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] audit of the
system generating the decision (e.g.,[
        <xref ref-type="bibr" rid="ref2 ref26 ref27 ref6">2, 6, 26, 27</xref>
        ]); the second is to change the
system when its decision is unwarranted (e.g., [
        <xref ref-type="bibr" rid="ref11 ref25">11, 25</xref>
        ]). The additional challenge
to feature attribution lies, therefore, in the difficulty of changing the behavior of a
system that produces unwanted decisions based simply on knowing the role of
features in each decision. Notwithstanding, it seems plausible to make such changes
when the instances responsible for each decision are known.
      </p>
      <p>
        Instance attribution methods are also the target of criticism. One criticism has
been directed at the instances attributed when using influence functions [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] by
Barshan et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. It refers to the fact that the selected instances are mostly
outliers, a problem they overcome with relative influence [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Another criticism
is how computationally expensive it is to compute influence functions [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
A third criticism is that the selected sets of instances have substantial overlap across
multiple test instances. Consequently, there still seems to be
much to gain from further examining instance attribution.
      </p>
      <p>
        This paper proposes a new instance attribution method. It is intuitive and
thus simple to explain to users. It is also simple to compute. It has limitations as
well. It shares with influence functions the overlap between sets of instances
attributed to multiple test instances. It shares with ROAR [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and HYDRA [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] the
requirement to retrain the model. It is a step in a different direction that may
spur further research. The proposed method is based on a new pseudo-metric
we call Longitudinal Distance (LD). This paper focuses exclusively on
classification. For simplicity, we henceforth refer to solutions as classifications and to the
NN structures as classifiers. In the next sections, we describe the pseudo-metric
LD and a non-pseudo-metric variant, and how they can be used for explanation.
We then present some preliminary studies. We conclude with a discussion and
future steps.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Introducing Longitudinal Distances</title>
      <p>Longitudinal distances are a class of distances based on the heuristic that an
iterative learning model such as an NN can be used as an oracle for instance attribution.
Longitudinal distances operate in the metric space of the instances that are
used to train, and to be solved by, NN methods through a series of iterations (or
epochs).</p>
      <p>Consider a classification model trained with neural networks (NN) on a space
where x ∈ X are instances mapped by features f ∈ F, Xtrain ⊂ X are training
instances and hence include labels y ∈ Y to indicate the class they belong to,
and Xtest are testing instances. The classifier Ce is an NN classifier that
learns to assign labels y using a set of n training instances xi through k epochs,
e = 1, . . . , k, i = 1, . . . , n. We assume that the ability of a classifier to solve
a previously unseen instance is represented in the weights learned through a
sequence of learning iterations from a set of training instances. We therefore
hypothesize that there is at least one (or more) training instance(s) responsible
for the classification of an unseen instance, be it correct or incorrect, and that
the relation between the training and unseen instances are somehow represented
throughout the sequence of learning iterations.</p>
      <p>
        We justify the proposed hypothesis based on the fact that when a previously
known solution, i.e., a known class, is selected to classify an unseen problem,
that both the training instance and the unseen instance, now both labeled as
members of the same class, meet some condition that causes their solutions
to be interchangeable, that is, their labels are the same. This condition is an
abstract concept that we wish to represent, let us call it the oracle condition.
We know that two instances together met this oracle condition when they are
members of the same class. Consequently, a trained classifier can be perceived as
a function that indicates that two instances meet said oracle condition. However
a trained classifier may not be ideal to distinguish, among all instances that
meet the oracle condition, which ones are responsible for the classification of an
unknown instance. We consider, for example, the problem known as catastrophic
interference [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] to possibly degrade the quality of a classifier with respect to
some classes as it continues to learn new classes. Because some abilities may
be forgotten as an NN method moves through new epochs, we contend that
the complete experience learned by a classifier is better represented by every
weight matrix learned in all epochs of its learning cycle. The last weight matrix
represents the learned classifier that has refined its ability to classify all instances
through generalization. Therefore, to successfully generalize, it is reasonable to
assume that particular abilities with respect to some instances or labels were
sacrificed. Hence, we propose to use the sequence of all intermediary classifiers,
the weight matrices resultant at the end of each learning iteration, to measure
the relationship between an unknown instance and all the training instances and
identify the one(s) responsible for the classification.
      </p>
      <p>We first propose the pseudo-metric Longitudinal Distance, dL, to produce the
ratio of incorrect classifications to the total number of epochs:</p>
      <p>dL(xi, x) = 1 − (1/k) Σ_{e=1}^{k} δe(xi, x)   (1)</p>
      <p>Now we propose the non-pseudo-metric Strict Longitudinal Distance:</p>
      <p>dSL(xi, x) = 1 − [Σ_{e=1}^{k} we(xi) δe(xi, x)] / [Σ_{e=1}^{k} we(xi)]   (2)</p>
      <p>We define the distance space as (d, X) for the distance pseudo-metric dL(xi, x),
along with dSL(xi, x) where xi, x ∈ X are instances mapped by features f ∈ F
that are classified by a classifier Ce at epoch e and receive a label y ∈ Y to
indicate their outcome class.</p>
      <p>In the above expressions, δe(xi, x) = 1_{Ce(xi)=Ce(x)} (Equation 3), where
the indicator function 1_{Ce(xi)=Ce(x)} takes the value of 1
when the two classifications are equal and 0 otherwise.</p>
      <p>For assigning relevance, we incorporate we(xi) in Equation
2 as a binary weight that indicates whether the classifier Ce is correct on xi. When we(xi) =
1, the label y predicted by the classifier Ce for xi is equal to the
label of the instance and thus it is correct; otherwise we(xi) = 0. We observe
that Equation 2 assumes xi is a training instance, and therefore there is a label
y designated as its class and thus we(xi) can be computed. Note that only the
correctness of the classification of the training instance is verified, as the correct
label of unseen instances is unknown. Consequently, Equation 2 is suited for
computing the distances between a training instance xi ∈ Xtrain and a testing
instance x ∈ Xtest, which is the goal of this paper.</p>
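      <p>As a concrete illustration of Equations 1 through 3, the computation can be sketched in Python. The array layout, variable names, and function name below are our own illustrative assumptions, not the authors' implementation:</p>
      <p>
```python
import numpy as np

def longitudinal_distances(preds_train, preds_test, y_train, i, j):
    """Sketch of dL (Eq. 1) and dSL (Eq. 2) for training instance i and test instance j.

    preds_train[e, i] is the label the epoch-e classifier assigns to training
    instance x_i; preds_test[e, j] is the label it assigns to test instance x_j;
    y_train holds the true labels of the training instances.
    """
    k = preds_train.shape[0]
    # delta_e(x_i, x): 1 when the epoch-e classifier gives both instances the same label (Eq. 3)
    delta = (preds_train[:, i] == preds_test[:, j]).astype(float)
    # w_e(x_i): 1 when the epoch-e classifier labels x_i correctly
    w = (preds_train[:, i] == y_train[i]).astype(float)
    d_L = 1.0 - delta.sum() / k                                        # Eq. 1
    d_SL = 1.0 - (w * delta).sum() / w.sum() if w.sum() > 0 else 1.0   # Eq. 2
    return d_L, d_SL
```
      </p>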
      <p>Note that we can rewrite Equation (1) as
dL(xi, x) = (1/k) Σ_{e=1}^{k} 1_{Ce(xi)≠Ce(x)}   (5),
which is the mean number of classifier mismatches over all the epochs.</p>
      <p>Next we demonstrate that dL is a pseudo-metric. Recall the pseudo-metric
properties:
1. d(x, x) = 0
2. d(x, y) = d(y, x)
3. d(x, y) ≤ d(x, z) + d(z, y)
What makes d a pseudo-metric and not necessarily a metric is that in order to
be a metric, if d(x, y) = 0, then x = y. Because of the mapping into feature space
to evaluate the classifier, there may exist predictors, xi and x, for which xi ≠ x
but d(xi, x) = 0.</p>
      <p>For dL(xi, x), using the rewritten form (5) with xi = x,
dL(x, x) = (1/k) Σ_{e=1}^{k} 1_{Ce(x)≠Ce(x)} = (1/k) Σ_{e=1}^{k} 0 = 0.
The symmetry property is also obvious. For the triangle property, given x, y,
and z, suppose Ce(x) = Ce(y). Then
1_{Ce(x)≠Ce(y)} = 0 ≤ 1_{Ce(x)≠Ce(z)} + 1_{Ce(z)≠Ce(y)},
regardless of what the two terms on the right are. If, instead, Ce(x) ≠ Ce(y),
then</p>
      <p>1_{Ce(x)≠Ce(y)} = 1.
Now we argue by contradiction. Suppose for some choice of z,
1_{Ce(x)≠Ce(z)} + 1_{Ce(z)≠Ce(y)} = 0.
For this to be true, each of these two terms must be zero, so</p>
      <p>Ce(x) = Ce(z) = Ce(y).
But this contradicts the assumption that Ce(x) ≠ Ce(y). Thus, 1_{Ce(x)≠Ce(z)} +
1_{Ce(z)≠Ce(y)} ≥ 1, and
1_{Ce(x)≠Ce(y)} ≤ 1_{Ce(x)≠Ce(z)} + 1_{Ce(z)≠Ce(y)}.
Consequently the triangle inequality holds for each epoch, and it must then hold
over any finite sum of epochs:
dL(x, y) ≤ dL(x, z) + dL(z, y).
The variant Strict Longitudinal Distance considers a relevance weight based on
how correct the classifier Ce is on xi throughout its life cycle.</p>
      <p>If the we were independent of the arguments xi and x, then the analogous
computations would demonstrate dSL, Equation 2, to also be a pseudo-metric.
However, because of dSL’s dependence, through the binary weights we(xi), on its
first argument xi, it fails to be a pseudo-metric: symmetry is lost when x is itself an element of
the training set. It does, however, satisfy the properties that dSL(x, y) ≥ 0 and
dSL(x, x) = 0, so it provides some weaker notion of distance between points.</p>
    </sec>
    <sec id="sec-3">
      <title>Explaining with Longitudinal Distances</title>
      <p>To use the proposed distances dL and dSL for explaining decisions of an NN, it
is necessary that the weight matrices produced at every epoch are preserved. If
those have not been preserved, then it is necessary to retrain the NN preserving
its weight matrices. Due to its non-deterministic nature, the instance to be
explained has to be solved again by the trained classifier for an explanation to be
provided. To explain a given target instance, we propose to compute the distance
between the target instance and all training instances using the longitudinal
distance or the strict longitudinal distance. With the results of distances, it is then
possible to determine the minimum (i.e., shortest) distance observed between
the target and training instances. Now note that multiple training instances can
be equidistant to the target, and this is exactly why longitudinal distances fail
the axioms for being metrics.</p>
      <p>Definition 1 The shortest distance observed between a given target instance
and all training instances computed through longitudinal distances is defined as
the explainer distance.</p>
      <p>Definition 2 The set of instances that are equidistant to the target instance at
the explainer distance constitute the positive explainer set.</p>
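      <p>Definitions 1 and 2 can be sketched as follows; the function name and the tolerance argument eps (a stand-in for the domain-dependent precision value discussed later in this section) are our own illustrative assumptions:</p>
      <p>
```python
import numpy as np

def positive_explainer_set(dists, eps=0.0):
    """Positive explainer set (Def. 2): indices of training instances whose
    longitudinal distance to the target is within eps of the minimum,
    i.e., the explainer distance (Def. 1)."""
    d_star = dists.min()                       # the explainer distance
    return np.flatnonzero(d_star + eps >= dists)  # equidistant instances
```
      </p>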
      <p>The positive explainer set and the negative explainer set are further specified
in that the explainer set computed via longitudinal distances is, by nature of
Equations (1) and (2), the positive explainer set. The premise of the longitudinal
distances may be reversed to compute the negative explainer set by modifying
Equation (3) as follows: δe(xi, x) = 1_{Ce(xi)≠Ce(x)} (Equation 12), where the
indicator function takes the value of 1
when the two classifications are not equal and 0 otherwise.</p>
      <p>Definition 3 The negative explainer set is the set of instances computed via
longitudinal distances that represent negative instances in classifying a given
target instance.</p>
      <p>Definition 4 The explainer set results from the combination of the positive
explainer set and the negative explainer set.</p>
      <p>We theorize that the instances belonging to the explainer set are responsible for
producing the solution to the target instance and consequently can explain its
solution. Once the explainer set is known, some considerations are needed. First,
the explainer distance needs to be defined based on a precision value ε. The
value for ε depends on the domain, given that explanations tend to be
domain-dependent. Second, a training instance may produce a direct or indirect
explanation. A direct explanation would be one that does not require any further
processing, in that the content of the training instance(s) suffices to explain the
target instance. An indirect explanation may require further processing such as
comparing whether all features of the target instance match those of
the training instance chosen to explain the target. Third, the explainer set may
include one or more training instances. When cardinality is greater than one,
then a process to select the most loyal explanation may be needed. As a result of
this process, it may be that no explanation is given. Fourth, if the explanation
is meant to foster human trust, the explainer set may need to be interpreted.
When the set is meant to produce accountability, then it has to be logical.</p>
      <p>We note that the negative explainer set is needed to demonstrate positive and
negative instances that could be potentially used to train a classifier. When used
to select the best candidate for explanation, the positive explainer set suffices.
</p>
    </sec>
    <sec id="sec-4">
      <title>Studies</title>
      <p>We used the MNIST data set consisting of 55k training instances, and 10k testing
instances. We implemented a convolutional neural network design consisting of
1 convolutional layer with 28 3x3 kernels, 1 max pool layer of size 2x2 and stride
of 1. The resultant layer is flattened and then connected to 1 hidden layer with
128 nodes with ReLU activation function for every node. A dropout of 20% is
introduced following the hidden layer. The resultant output is passed to the
output layer with 10 nodes (each corresponding to the label) with the Softmax
activation function. This entire network is trained with 55,000 28x28 MNIST
images in batches of 64 over 10 epochs. This training reached an accuracy of
83.9%.</p>
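      <p>As a sanity check, the layer shapes and parameter counts implied by the described architecture can be worked out by hand. The sketch below assumes 'valid' (no-padding) convolutions and single-channel inputs, which the text does not state explicitly:</p>
      <p>
```python
# Shape and parameter arithmetic implied by the described CNN, assuming
# 'valid' convolutions on single-channel 28x28 MNIST images.
conv_side = 28 - 3 + 1                  # 3x3 kernels, valid padding
pool_side = conv_side - 2 + 1           # 2x2 max pool, stride 1
flat = pool_side * pool_side * 28       # 28 feature maps flattened
conv_params = (3 * 3 * 1 + 1) * 28      # kernel weights plus bias per filter
hidden_params = flat * 128 + 128        # dense hidden layer with ReLU
output_params = 128 * 10 + 10           # 10-way softmax output layer
print(conv_side, pool_side, flat, conv_params, hidden_params, output_params)
```
      </p>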
      <p>Fig. 1 shows the images of the entire positive explainer set for one of the
testing instances from MNIST labeled as a number nine. This positive explainer
set was selected for illustration because its cardinality is 12 and thus we can
show all the 12 images that are at distance zero from the target testing instance,
measured by dL. The negative explainer set contains 3,569 images. The image set
shown circled in blue is a 0.4% sample of this complete set of images at distance
1.
      <p>Although not a validation of accuracy, images are often presented to illustrate
feature attribution methods. In Fig. 1 the negative explainer set is a random
selection of the large overall set. In that sample, the types of nines are still nines
but they are of a different category from those in the positive explainer set and
the target instance.
</p>
      <p>
        Fidelity of explanations with longitudinal distances
Fidelity of an explanation is a metric that aims to determine how well an
explanation generated by an XAI method aligns with ground-truth [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This concept
has been discussed from multiple perspectives (e.g., [
        <xref ref-type="bibr" rid="ref12 ref16 ref21 ref22 ref3 ref32 ref33 ref7">3, 7, 12, 16, 21, 22, 32, 33</xref>
        ]),
and also related to gold features [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. In this study, we use the data and method
from [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Their approach is to create artificial data sets by populating equation
variables with values based on a given interval so that each instance can be
explained by the equation used to create it. This way, an explanation is loyal to the
data when it approximates the relationship between the variables in the original
equation. In this section, we use a data set created from seven equations. The
equations have three variables, which were randomly populated. The result of
these equations, the numeric value produced when applying the values and
executing the equation, is the fourth feature of the data set. The label corresponds
to the equation, hence there are seven labels, l = {0, 1, 2, 3, 4, 5, 6}. The data set
consists of 503,200 instances. The experiment is as follows.
      </p>
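      <p>The data-construction procedure can be sketched as follows. The seven lambda expressions below are hypothetical stand-ins (the actual equations from [4] are not reproduced in this paper), as are the function name and value range:</p>
      <p>
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the seven generating equations of [4].
equations = [
    lambda a, b, c: a + b + c,
    lambda a, b, c: a * b - c,
    lambda a, b, c: a - b * c,
    lambda a, b, c: a * b * c,
    lambda a, b, c: a + b - c,
    lambda a, b, c: a - b + c,
    lambda a, b, c: a * a + b - c,
]

def make_dataset(n_per_label, low=0.0, high=1.0):
    """Each instance: three randomly populated variables, the equation's
    result as the fourth feature, and the equation index as the label."""
    rows, labels = [], []
    for label, eq in enumerate(equations):
        abc = rng.uniform(low, high, size=(n_per_label, 3))
        res = np.array([eq(a, b, c) for a, b, c in abc])
        rows.append(np.column_stack([abc, res]))
        labels.append(np.full(n_per_label, label))
    return np.vstack(rows), np.concatenate(labels)

X, y = make_dataset(100)
```
      </p>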
      <p>Data set and deep neural network classifier. We selected 403,200 instances
for training and 100,000 for testing and validation. We trained a deep learning
architecture with 1 hidden layer and 8 nodes. The activation for the input and
hidden layers was done using ReLU functions and for the output layer using the
Softmax activation function. The loss calculated was categorical cross entropy.
The 403,200 training instances were trained in batches of 128 over 15 epochs
and reached an accuracy of 95% in the testing set. As required to implement the
longitudinal distances, we preserved the classifiers at each epoch.
Accuracy metric and hypothesis. As earlier described, the data in this
experiment was designed with equations that are known and are used to
designate the class of the instances. This way, an XAI method produces a correct
explanation if it selects to explain a target instance with a training instance
that implements the same equation with which the instance being explained was
built. (Code and results are available at https://github.com/Rosinaweber/LongitudinalDistances.)
The explanation will be given by an existing training instance, which has a
label. Consequently, correctness is observed when the class label of the training
instance selected to explain the target instance has the same label as the target
instance being explained. With the definition of a correct explanation, we can
define accuracy as the ratio of correct explanations to the total explanations
produced by the proposed approach. Our hypothesis is that the proposed method
using both longitudinal distances will select instances to explain the
classifications produced by the deep NN architecture with average accuracy above 95%.
Methodology. With the classifier trained, we randomly selected 1,000 from
the 100,000 testing instances to measure the accuracy of the proposed approach.
This evaluation, as the metric describes, only considers the class of the selected
explanations, and thus it does not use the negative explainer set. For each
instance, we followed these steps:
1. Compute the positive explainer set using dL and dSL.
2. Use the label of each instance to create subsets of instances for each label.
3. Determine the largest subset and use its label as explanation.
4. Assess whether label is correct.
5. Compute average accuracy.</p>
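      <p>The five steps above can be sketched as follows; the function names are our own, and dist_matrix is assumed to hold precomputed longitudinal distances from every training instance to each sampled test instance:</p>
      <p>
```python
import numpy as np
from collections import Counter

def explanation_label(dists, y_train):
    """Steps 1-3: the positive explainer set is every training instance at the
    shortest distance; the largest label subset provides the explanation."""
    explainer = np.flatnonzero(dists == dists.min())
    counts = Counter(y_train[explainer].tolist())
    return counts.most_common(1)[0][0]

def explanation_accuracy(dist_matrix, y_train, y_test):
    """Steps 4-5: fraction of test instances whose explanation label equals
    their true label; dist_matrix[t] holds the distances from every training
    instance to test instance t."""
    correct = sum(explanation_label(d, y_train) == label
                  for d, label in zip(dist_matrix, y_test))
    return correct / len(y_test)
```
      </p>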
      <p>Results. Of the 1,000 testing instances, there were 978 and 980 correct
explanations for the approach when using dL and dSL, respectively. Both average
accuracy levels are above 95%, which is consistent with our hypothesis. The
classifier is correct for 968 instances out of 1,000 (96.8%), being wrong in 32
instances. Both distances led to the selection of the wrong explanation only when
the classifier’s predictions were wrong. However, the distances do not always
select wrong explanations when the classifier is wrong. Interestingly, both
distances dL and dSL were correct, respectively, in 10 and 12 instances for which
the classifier was incorrect.</p>
      <p>Discussion. It is curious that the distances led to the selection of the correct
explanation in instances when the classifier was wrong because the distances are
based on the results of the classifier. We examined whether there was anything
else unusual about those instances. We found that when dL was able to select
correct explanations while the classifier was wrong, the cardinality of the explainer
sets was much smaller than in other instances. Considering when the classifier
was wrong, the average size of the explainer set was 274 instances when the
correct explanation was selected in comparison to 23,239 when the explanation was
not correct. We then examined the predictions of the classifier throughout the
15 epochs and computed two values, namely, the number of distinct predictions
and the number of times the prediction changed. Table 1 shows the values for
all instances where dL selected the correct explanation when the classifier was
wrong. The correlation between the explainer set and the respective number of
distinct predictions and changes is 0.60 and 0.74. We also computed the
correlation between the explainer set and the respective number of distinct predictions
and changes for when dL was not correct, and the results are 0.82 and 0.72.
These correlations suggest that the more changes in labels the classifier makes
as it is learning, the more selective the distances become, causing the explainer
sets to be smaller. There is much to investigate further in this finding.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>In this paper, we introduced longitudinal distances to be computed between an
unseen target instance classified by an NN and all training instances used to
train it. The training instances at the shortest distance from the target instance
constitute its explainer set. The training instances that are members of the
explainer set are hypothesized to have contributed to classify the target instance.
As such, they can be potentially used to explain the target instance.</p>
      <p>Longitudinal distances are inspired by the similarity heuristic and the
principle that similar problems have similar solutions. Although not yet demonstrated,
instance attribution methods have the potential to bring to example-based
XAI, particularly when implemented with CBR, the facet of attribution,
currently missing in those approaches.</p>
      <p>In this paper, the positive explainer set was used to select the best candidate
for explanation. We did not use the negative explainer set. It is obvious that,
if we are interested in accountability, as this paper described in its motivation,
then we first need to demonstrate that the explainer set produces the explained
decision; this is when we will use the negative explainer set. Ultimately, we also
need to determine how to modify the set to change its decisions.</p>
      <p>New XAI and IML approaches should demonstrate how they address any
criticisms. Demonstrating how they perform in the presence of noise, the proportion
of outliers in the explainer set, and the proportion of overlap across explainer
sets for multiple instances are future work for longitudinal distances.
Acknowledgments. Support for the preparation of this paper to Rosina O
Weber and Prateek Goel was provided by NCATS, through the Biomedical Data
Translator program (NIH award 3OT2TR003448-01S1). Gideon Simpson is
supported by NSF Grant no.DMS-1818716.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Adebayo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilmer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muelly</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hardt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Sanity checks for saliency maps</article-title>
          .
          <source>In: 32nd NeurIPS</source>
          . pp.
          <fpage>9525</fpage>
          -
          <lpage>9536</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Adler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Falk</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedler</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nix</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rybeck</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scheidegger</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venkatasubramanian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Auditing black-box models for indirect influence</article-title>
          .
          <source>Knowledge and Information Systems</source>
          <volume>54</volume>
          (
          <issue>1</issue>
          ),
          <fpage>95</fpage>
          -
          <lpage>122</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Alvarez-Melis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaakkola</surname>
            ,
            <given-names>T.S.</given-names>
          </string-name>
          :
          <article-title>Towards robust interpretability with self-explaining neural networks</article-title>
          .
          <source>arXiv preprint arXiv:1806.07538</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Amiri</surname>
            ,
            <given-names>S.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weber</surname>
            ,
            <given-names>R.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goel</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brooks</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gandley</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kitchell</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zehm</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Data representing ground-truth explanations to evaluate XAI methods</article-title>
          .
          <source>arXiv preprint arXiv:2011.09892</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Binder</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montavon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klauschen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Samek</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>
          .
          <source>PLoS ONE</source>
          <volume>10</volume>
          (
          <issue>7</issue>
          ),
          <fpage>e0130140</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Baldoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baroglio</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>May</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Micalizio</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tedeschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Computational accountability</article-title>
          . In:
          <source>Deep Understanding and Reasoning: A Challenge for Next-generation Intelligent Agents, URANIA 2016</source>
          . vol.
          <volume>1802</volume>
          , pp.
          <fpage>56</fpage>
          -
          <lpage>62</lpage>
          . CEUR Workshop Proceedings (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Barr</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertini</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reilly</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bruss</surname>
            ,
            <given-names>C.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wittenbach</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Towards ground truth explainability on tabular data</article-title>
          .
          <source>arXiv preprint arXiv:2007.10532</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Barshan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brunet</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dziugaite</surname>
            ,
            <given-names>G.K.</given-names>
          </string-name>
          :
          <article-title>RelatIF: Identifying explanatory training samples via relative influence</article-title>
          .
          <source>In: International Conference on Artificial Intelligence and Statistics</source>
          . pp.
          <fpage>1899</fpage>
          -
          <lpage>1909</lpage>
          . PMLR (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Bhatt</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ravikumar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>Building human-machine trust via interpretability</article-title>
          .
          <source>In: AAAI Conference on Artificial Intelligence</source>
          . vol.
          <volume>33</volume>
          , pp.
          <fpage>9919</fpage>
          -
          <lpage>9920</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miao</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>HYDRA: Hypergradient data relevance analysis for interpreting deep neural networks</article-title>
          .
          <source>arXiv preprint arXiv:2102.02515</source>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Garfinkel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matthews</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shapiro</surname>
            ,
            <given-names>S.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          :
          <article-title>Toward algorithmic transparency and accountability</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>60</volume>
          (
          <issue>9</issue>
          ) (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Guidotti</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Evaluating local explanation methods on ground truth</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>291</volume>
          ,
          <fpage>103428</fpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Hooker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erhan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kindermans</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A benchmark for interpretability methods in deep neural networks</article-title>
          . In:
          <string-name>
            <surname>Wallach</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larochelle</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beygelzimer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>d'Alché-Buc</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fox</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garnett</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (eds.)
          <source>Advances in Neural Information Processing Systems</source>
          . vol.
          <volume>32</volume>
          . Curran Associates, Inc. (
          <year>2019</year>
          ), https://proceedings.neurips.cc/paper/2019/file/fe4b8556000d0f0cae99daa5c5c5a410-Paper.pdf
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kenny</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keane</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          :
          <article-title>Twin-systems to explain artificial neural networks using case-based reasoning: Comparative tests of feature-weighting methods in ANN-CBR twins for XAI</article-title>
          . In:
          <source>Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI), Macao, 10-16 August 2019</source>
          . pp.
          <fpage>2708</fpage>
          -
          <lpage>2715</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Khanna</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koyejo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Interpreting black box predictions using fisher kernels</article-title>
          .
          <source>In: The 22nd International Conference on Artificial Intelligence and Statistics</source>
          . pp.
          <fpage>3382</fpage>
          -
          <lpage>3390</lpage>
          . PMLR (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wattenberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilmer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wexler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Viegas</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , et al.:
          <article-title>Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV)</article-title>
          .
          <source>In: International conference on machine learning</source>
          . pp.
          <fpage>2668</fpage>
          -
          <lpage>2677</lpage>
          . PMLR (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Koh</surname>
            ,
            <given-names>P.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Understanding black-box predictions via influence functions</article-title>
          .
          <source>In: International Conference on Machine Learning</source>
          . pp.
          <fpage>1885</fpage>
          -
          <lpage>1894</lpage>
          . PMLR (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Koul</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greydanus</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fern</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Learning finite state representations of recurrent policy networks</article-title>
          .
          <source>arXiv preprint arXiv:1811.12530</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Kroll</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huey</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barocas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Felten</surname>
            ,
            <given-names>E.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reidenberg</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Accountable algorithms</article-title>
          .
          <source>University of Pennsylvania Law Review</source>
          pp.
          <fpage>633</fpage>
          -
          <lpage>705</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Lundberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>S.I.</given-names>
          </string-name>
          :
          <article-title>A unified approach to interpreting model predictions</article-title>
          .
          <source>arXiv preprint arXiv:1705.07874</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Mahajan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Preserving causal constraints in counterfactual explanations for machine learning classifiers</article-title>
          .
          <source>arXiv preprint arXiv:1912.03277</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Man</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          :
          <article-title>The best way to select features? Comparing MDA, LIME, and SHAP</article-title>
          .
          <source>The Journal of Financial Data Science</source>
          <volume>3</volume>
          (
          <issue>1</issue>
          ),
          <fpage>127</fpage>
          -
          <lpage>139</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>McCloskey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Catastrophic interference in connectionist networks: The sequential learning problem</article-title>
          .
          <source>Psychology of Learning and Motivation - Advances in Research and Theory</source>
          <volume>24</volume>
          (
          <issue>C</issue>
          ),
          <fpage>109</fpage>
          -
          <lpage>165</lpage>
          (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Mercier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siddiqui</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dengel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahmed</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Interpreting deep models through the lens of data</article-title>
          .
          <source>In: 2020 International Joint Conference on Neural Networks (IJCNN)</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . IEEE (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Mittelstadt</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Russell</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wachter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Explaining explanations in AI</article-title>
          .
          <source>In: Proceedings of the conference on fairness, accountability, and transparency</source>
          . pp.
          <fpage>279</fpage>
          -
          <lpage>288</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Mittelstadt</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taddeo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wachter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Floridi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The ethics of algorithms: Mapping the debate</article-title>
          .
          <source>Big Data &amp; Society</source>
          <volume>3</volume>
          (
          <issue>2</issue>
          ),
          <fpage>2053951716679679</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <collab>AI Now Institute</collab>
          :
          <article-title>Annual report</article-title>
          . New York University (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Olah</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Satyanarayan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
          </string-name>
          , I.,
          <string-name>
            <surname>Carter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schubert</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ye</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mordvintsev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The building blocks of interpretability</article-title>
          .
          <source>Distill</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ),
          <fpage>e10</fpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>"Why should I trust you?": Explaining the predictions of any classifier</article-title>
          .
          <source>In: 22nd ACM SIGKDD</source>
          . pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Shrikumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenside</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kundaje</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Learning important features through propagating activation differences</article-title>
          .
          <source>In: International Conference on Machine Learning</source>
          . pp.
          <fpage>3145</fpage>
          -
          <lpage>3153</lpage>
          . PMLR (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Sliwinski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strobel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zick</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Axiomatic characterization of data-driven influence measures for classification</article-title>
          .
          <source>In: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          . vol.
          <volume>33</volume>
          , pp.
          <fpage>718</fpage>
          -
          <lpage>725</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Sundararajan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          :
          <article-title>Axiomatic attribution for deep networks</article-title>
          .
          <source>In: International Conference on Machine Learning</source>
          . pp.
          <fpage>3319</fpage>
          -
          <lpage>3328</lpage>
          . PMLR (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Benchmarking attribution methods with relative feature importance</article-title>
          .
          <source>arXiv preprint arXiv:1907.09701</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Yeh</surname>
            ,
            <given-names>C.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yen</surname>
            ,
            <given-names>I.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ravikumar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Representer point selection for explaining deep neural networks</article-title>
          .
          <source>arXiv preprint arXiv:1811.09720</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>