<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Knowledge-based XAI through CBR: There is more to explanations than models can tell</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rosina O. Weber</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manil Shrestha</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam J Johs</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <addr-line>Computer Science</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Drexel University</institution>
          ,
          <addr-line>Philadelphia, PA 19104</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Information Science</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The underlying hypothesis of knowledge-based explainable artificial intelligence is: the data required for data-centric artificial intelligence agents (e.g., neural networks) are less diverse in content than the data required to explain the decisions of such agents to humans. The idea is that a classifier can attain high accuracy using data that express a phenomenon from one perspective, whereas the audience of explanations can comprise multiple stakeholders and span diverse perspectives. We hence propose to use domain knowledge to complement the data used by agents. We formulate knowledge-based explainable artificial intelligence as a supervised data classification problem aligned with the CBR methodology. In this formulation, the inputs are case problems composed of both the inputs and outputs of the data-centric agent, and the case solutions, the outputs, are explanation categories obtained from domain knowledge and subject matter experts. This formulation does not typically lead to an accurate classification, preventing the selection of the correct explanation category. Knowledge-based explainable artificial intelligence extends the data in this formulation by adding features aligned with domain knowledge that can increase accuracy when selecting explanation categories.</p>
      </abstract>
      <kwd-group>
        <kwd>explainable artificial intelligence</kwd>
        <kwd>knowledge</kwd>
        <kwd>expertise</kwd>
        <kwd>interpretable machine learning</kwd>
        <kwd>case-based reasoning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Explainable artificial intelligence (XAI) (e.g., [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) is a sub-field of artificial
intelligence (AI) research that arose with substantial influence from interpretable
machine learning (IML) (e.g., [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]). The focus of IML has always been (e.g.,
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) to promote interpretability to ML experts who need to comprehend how
concepts are learned to advance the state of the art. The research in IML can be
broadly categorized as feature attribution (e.g., [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]), instance attribution (e.g.,
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]), and example-based (e.g., [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]). XAI has been proposed [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] with a focus on
explainability to users—despite this distinction, most methods for XAI are limited
to considering only the interpretability of models (e.g., [
        <xref ref-type="bibr" rid="ref5 ref6 ref9">5, 6, 9</xref>
        ]).
      </p>
      <p>
        The literature in XAI implies that seeking explanations solely within AI
models is insufficient. This implication is supported by various authors (e.g., [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ])
who have provided contents of explanations not available in models (See Section
2.1 for detailed discussion). Knowledge-based explainable artificial intelligence
(KBXAI) is an approach to XAI that seeks to bridge this gap by acquiring
knowledge for explanations from domain knowledge and subject matter experts
(SMEs). In this paper, we introduce and describe how to implement KBXAI
with case-based reasoning (CBR). KBXAI (See Fig. 1) is implemented in two
steps: 1) defining explanation categories, and 2) case extension learning.
      </p>
      <p>The next section presents background and related works. With the goal of
examining challenges and opportunities brought to bear with the introduction of
KBXAI, we illustrate and discuss KBXAI in three problem contexts, each with
different data types—tabular, image, and text.
</p>
    </sec>
    <sec id="sec-2">
      <title>Related Works</title>
      <sec id="sec-2-1">
        <title>Explanation Types</title>
        <p>In this section, we review works where authors proposed various explanation
contents for use in explanations of intelligent agents. This review is not exhaustive
but illustrates how the breadth of explanation contents extends beyond models
and data.</p>
        <p>
          Lim [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] describes a taxonomy of explanation types that include situation,
inputs, outputs, why, why not, how, what if, what else, visualization, certainty,
and control. The item input refers to external sources an intelligent agent may
have used. In the credit industry, for instance, companies purchase hundreds
of thousands of credit profiles of unidentified applicants that are not directly
considered in the explanations. Another item is situation, which Lim (ibid.)
exemplifies with an industrial process where an anomaly is presented, triggering
an agent’s decision. Lim (ibid.) states that some users would like explanations
to include what the normal process was prior to the anomalous event.
        </p>
        <p>
          Nunes and Jannach [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] conducted a systematic review of the literature
toward understanding the characteristics of explanation content provided to users
across multiple intelligent systems. Explanations proposed in the literature were
qualitatively coded to identify the types of contents communicated in
explanations — 17 types of explanation contents were identified and grouped as: 1) user
preferences and inputs, 2) decision inference process, 3) background and
complementary information, and 4) alternatives and their features. Nunes and Jannach
(ibid.) consider multiple forms of background and complementary information,
mostly external to the data or knowledge used by intelligent agents—e.g., the
background information a human would need for a classification instance;
this dovetails with Lim’s inputs and situation explanation types.
        </p>
        <p>
          Chari et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] propose a taxonomy of approaches and algorithms to
support user-centered AI system design. This taxonomy includes two components of
scientific explanations divorced from AI models and the data used by AI agents:
the scientific method and evidence from the literature. Additional contributions
to explanation types are found in [
          <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Three main categories of IML and XAI methods</title>
        <p>
          The three main categories of IML and XAI methods are feature attribution
[
          <xref ref-type="bibr" rid="ref15 ref16 ref17 ref18 ref5 ref6">5, 6, 15–18</xref>
          ], instance attribution [
          <xref ref-type="bibr" rid="ref19 ref20 ref21 ref22 ref3">3, 19–22</xref>
          ], and example-based [
          <xref ref-type="bibr" rid="ref23 ref24 ref25 ref7 ref9">7, 9, 23–25</xref>
          ]—
all predicated on obtaining explanations from an agent’s model. Attribution
methods explain model behavior by associating an input solved by an agent
to elements of the model used by that agent, either by looking at the instance
features (i.e., feature attribution) or by looking at each instance as an integral
component (i.e., instance attribution). KBXAI neither employs attribution nor
prescribes reliance on examples. KBXAI may use features from the model, but
knowledge external to the model is required, signifying better alignment with
an additional category of model-extrinsic methods. In the XAI categorization as
intrinsic and post-hoc, KBXAI is post-hoc because it is implemented after rather
than contemporaneously with the agent, as with intrinsic methods (e.g., [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]).
        </p>
        <p>
          Feature attribution methods are relatively easy to compute and have risen in
popularity (e.g., [
          <xref ref-type="bibr" rid="ref15 ref16 ref17 ref18 ref5 ref6">5, 16, 17, 15, 6, 18</xref>
          ]). Among such methods are LIME [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], which
creates perturbations and then fits a linear regression to them, explaining a point
that lies on the resulting straight line through its coefficients. Saliency methods [
          <xref ref-type="bibr" rid="ref16 ref5 ref6">5,
6, 16</xref>
          ] are widely used to explain image models because such methods afford the
construction of heat maps that emphasize regions (i.e., features) of an image
where weights are higher. Another prevailing method is SHAP [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], which adds
rigor from Shapley values to feature attribution based on perturbations. Such
methods have been criticized for producing the same explanation despite noise
added to the data or changes made to the models [
          <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
          ]. Feature attribution methods
have also been found not to work in neural architectures that use a memory [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
        </p>
        <p>
          Instance attribution methods provide the instances associated with a
decision [
          <xref ref-type="bibr" rid="ref19 ref20 ref21 ref22 ref3">3, 19–22</xref>
          ]. These methods have been shown to have multiple uses such as
debugging models, detecting data set errors, and creating visually
indistinguishable adversarial training examples [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. In addition to being computationally
expensive [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], there are other criticisms to these methods–e.g., attributed
instances are often outliers and the sets of instances attributed to different samples
have substantial overlap [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Methods that select training instances based on
some similarity concept as the basis for explanations are known as example- or
prototype-based (e.g., [
          <xref ref-type="bibr" rid="ref23 ref24 ref25 ref7 ref9">7, 9, 23–25</xref>
          ]). Example-based methods are relatively easy
to compute and have been successful in user studies [
          <xref ref-type="bibr" rid="ref25 ref7 ref9">7, 25, 9</xref>
          ]; the core problem
with such methods is the absence of attribution.
        </p>
        <p>
          Domain knowledge has been used as part of explanations for recommender
systems (e.g., [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]), expert systems (e.g., [
          <xref ref-type="bibr" rid="ref31 ref32">31, 32</xref>
          ]), and CBR systems (e.g., [
          <xref ref-type="bibr" rid="ref33 ref34 ref35">33–35</xref>
          ]).
For scientific insights and scientific discoveries, domain knowledge is considered
i) a prerequisite for attaining scientific outcomes, ii) pertinent to enhancing
scientific consistency, and iii) necessary for explainability [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ]. In the biomedical
domain, Pesquita [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ] proposed augmenting post-hoc explanations with
domain-specific knowledge graphs to produce semantic explanations.
        </p>
        <p>
          Contextual decomposition explanation penalization [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ] permits insertion of
domain knowledge into deep learning with the aim of mitigating false
associations, rectifying errors, and generalizing to other methods of interpretability;
examples of incorporable domain knowledge range from human labeled ground
truth explanations for every data point, to the importance of various feature
interactions. The explanatory interactive learning method of [
          <xref ref-type="bibr" rid="ref39">39</xref>
          ] leverages
human-in-the-loop revision to align model explanations with the knowledge of the expert
in the loop.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Knowledge-based explainable AI</title>
      <p>Based on the premise that data used by an agent is to be supplemented, we
formulate KBXAI as a problem where input data is given to an agent to execute
an intelligent task (for simplicity, henceforth this task is referred to as
classification). Consider a classifier agent Ω that maps an input space
X to an output space Y and is trained on labeled instances z1, ..., zn ∈ Z, where
zi = (xi, yi) ∈ X × Y.</p>
      <p>
        As introduced in [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ], KBXAI has two main steps: 1) defining explanation
categories, and 2) case extension learning. Fig. 2 shows the first step when KBXAI
uses domain knowledge to define a finite set of explanation categories (EC)
e ∈ EC, which are defined by a mapping def: Z × EC → {0, 1}. We refer
to these explanations as categories because each is meant to explain one or
many classifications. Within KBXAI, the explanations are textual even when
explaining images to facilitate incorporation of supplemental features.
      </p>
      <p>This step creates a new classification problem, which we formulate as
case-based. The case problems are the inputs and outputs (i.e., labels) of the
agent. The case solutions are the explanation categories that we wish to select
for each agent’s classification. This formulation produces low accuracy because
it is typically indeterminate. The reason for this is that the explanations include
contents that are not in the data and model used by the agent Ω. The goal
of KBXAI is to successfully select the correct explanation category for a given
input-output pair. This prompts the need for the next step.</p>
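      <p>
        As an illustration of this formulation, the following minimal Python sketch builds KBXAI cases from an agent's input-output pairs; the names (KBXAICase, build_cases) are ours and purely illustrative, not part of a released implementation.
      </p>
      <preformat>
# Minimal sketch of the KBXAI case structure (illustrative names).
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class KBXAICase:
    # Case problem: the agent's input features plus the agent's output (label).
    problem: Dict[str, Any]
    # Case solution: the explanation category assigned from domain knowledge/SMEs.
    explanation_category: str
    # Supplemental features added later during case extension learning.
    extensions: Dict[str, float] = field(default_factory=dict)

def build_cases(agent_inputs, agent_outputs, categories):
    """Pair each agent input-output with its explanation category."""
    cases = []
    for x, y, e in zip(agent_inputs, agent_outputs, categories):
        cases.append(KBXAICase(problem=dict(x, agent_output=y),
                               explanation_category=e))
    return cases
      </preformat>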
      <p>The second step, case extension learning, is when KBXAI supplements the
data by proposing and evaluating supplemental features. The aim is to find
features that improve the baseline accuracy to successfully select the correct
explanation category for a given input-output pair (see Fig. 3). Proposing features
from domain knowledge represents a knowledge engineering step.
</p>
      <sec id="sec-3-1">
        <title>Case-based implementation</title>
        <p>
          We implement case extension learning with CBR. There are two main reasons
for not adopting a data-centric approach like neural networks (NN), namely,
lack of transparency and sample distribution discrepancy. When comparing the
performance of an NN with and without one or more features, if the same testing
data are used to test both variations, then the testing and training data are drawn from
different distributions. This does not conform with the machine
learning (ML) principle that testing and training must come from the same
distribution (e.g., [
          <xref ref-type="bibr" rid="ref40 ref41 ref42">40–42</xref>
          ]). CBR is transparent and allows evaluation of features
without violating the ML principle. We evaluate features through ablation using weighted k-Nearest
Neighbor (kNN) and leave-one-out cross-validation (LOOCV): when a feature is
included, every instance, including the one left out in each LOOCV fold, contains that feature;
when it is excluded, no instance contains it.
        </p>
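        <p>
          A minimal sketch of this ablation protocol is shown below, assuming cases are numeric vectors and feature weights are given (e.g., learned with ReliefF); the helper names are ours, for illustration only.
        </p>
        <preformat>
# Sketch: weighted k-NN with LOOCV to compare accuracy with and without a feature.
import numpy as np

def weighted_knn_loocv(X, y, weights, k=3):
    """Average LOOCV accuracy of a weighted k-NN over the case base (X, y)."""
    X, y, weights = np.asarray(X, float), np.asarray(y), np.asarray(weights, float)
    correct = 0
    for i in range(len(X)):
        diffs = np.abs(X - X[i])               # per-feature dissimilarities
        dists = (weights * diffs).sum(axis=1)  # weighted global distance
        dists[i] = np.inf                      # leave the query case out
        nearest = np.argsort(dists)[:k]
        labels, counts = np.unique(y[nearest], return_counts=True)
        correct += int(labels[np.argmax(counts)] == y[i])
    return correct / len(X)

def ablation_gain(X_base, X_extended, y, w_base, w_ext, k=3):
    """Accuracy change when a supplemental feature column is added."""
    return (weighted_knn_loocv(X_extended, y, w_ext, k)
            - weighted_knn_loocv(X_base, y, w_base, k))
        </preformat>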
        <p>
          Implementing case extension learning with CBR through ablation is as
follows. With the problem formulation as depicted in Fig. 2, we used ReliefF [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ]
to learn weights with local similarity as either a binary (i.e., equal vs unequal)
function or, when applicable, a function that computes the difference between
values and normalizes based on the range of observed values. Average accuracy is
computed with LOOCV. Baseline accuracy is computed with the agent’s inputs
and outputs, and explanation categories; this is before adding any supplemental
features. We do not always have access to the representations of the input
to the agent. In these situations, we consider input as a nominal feature. Case
extension entails proposing candidate supplemental features and evaluating how
they impact overall average accuracy with respect to the baseline accuracy. The
supplemental features are evaluated one at a time and then in aggregation.
Supplemental features may not be independent of the features that come from the
agent’s input, which is fine because we analyze their impact and keep the ones
that contribute most to increasing accuracy. When features are redundant, they
do not increase accuracy proportionally when combined.
        </p>
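        <p>
          The local similarity functions described above can be sketched as follows; this is a simplified illustration under the assumption that the feature weights come from ReliefF and that ranges are taken from the observed values.
        </p>
        <preformat>
# Sketch: local similarities (binary or range-normalized) aggregated with weights.
import numpy as np

def binary_local_sim(a, b):
    """Equal vs. unequal local similarity, used for nominal features."""
    return 1.0 if a == b else 0.0

def range_local_sim(a, b, value_range):
    """Normalized-difference local similarity, used when values are ordered."""
    if value_range == 0:
        return 1.0
    return 1.0 - abs(a - b) / value_range

def global_similarity(case_a, case_b, weights, ranges, nominal):
    """Weighted average of local similarities across features."""
    weights = np.asarray(weights, float)
    sims = []
    for j, (a, b) in enumerate(zip(case_a, case_b)):
        if nominal[j]:
            sims.append(binary_local_sim(a, b))
        else:
            sims.append(range_local_sim(a, b, ranges[j]))
    sims = np.asarray(sims)
    return float((weights * sims).sum() / weights.sum())
        </preformat>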
        <p>Next we describe studies applying KBXAI in three data sets using different
data types. The goal of these studies is to assess potential challenges for KBXAI.
Each data set was obtained differently and the knowledge used to complement
the problems also varies. None of the studies represent a complete real-world
scenario where KBXAI could be fully deployed. The increments in accuracy
shown are modest. What demonstrates the potential of the KBXAI hypothesis is
that we found supplemental features that caused accuracy to increase.</p>
      </sec>
      <sec id="sec-3-2">
        <title>KBXAI in Tabular data</title>
        <p>
          This synthetic data set is a binary classification with labels accept and reject
[
          <xref ref-type="bibr" rid="ref44">44</xref>
          ]. This data set has 54 instances and three features with four allowable values
each. The first feature, job stability (X1), corresponds to the job status of the
applicant. This feature has integer values in [2, 5], where 2 means lack of a job, and
values 3, 4, and 5 mean, respectively, that the applicant has had a job for less than one year,
less than 3 years, or more than 3 years. The second feature is credit
score (X2), with values in [0, 3], meaning less than 580, 650, 750, and more than
750. The third feature is the ratio of debt payments to monthly income (X3),
with values in [0, 3], meaning less than 25%, 50%, 75%, and more than 75%.
        </p>
      <p>The agent is an NN architecture with four hidden layers of 512 neurons each and
ReLU activations, ending with a sigmoid output layer. The loss function
is binary cross-entropy and the optimizer used is gradient descent. The classifier
reached 100% accuracy, which likely reflects overfitting given that we did not hold out
data because of the small number of samples.</p>
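      <p>A sketch of such an agent, assuming the three features enter as a 3-dimensional input and using PyTorch, is shown below; any hyperparameter not stated in the text (learning rate, number of epochs) is our assumption.</p>
      <preformat>
# Sketch of the tabular agent: 4 hidden layers of 512 units, ReLU, sigmoid output.
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(3, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 1), nn.Sigmoid(),
)
loss_fn = nn.BCELoss()                                     # binary cross-entropy
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # plain gradient descent

def train(X, y, epochs=200):
    """X: float tensor of shape (54, 3); y: float tensor of shape (54, 1)."""
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
      </preformat>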
        <p>To identify explanation categories, these authors used their own knowledge
of credit assessment combined with online resources to identify 15 explanation
categories that align with the 54 instances. The explanation categories are
hypothetical rules combining feature values in both accept and reject classes. Some
example explanation categories are, “with lowest credit score, either job
condition and debt have to be excellent or both very good for acceptance”; “no job
and credit score is not excellent then reject”; and “despite no job, credit score is
excellent then accept”.</p>
        <p>Fig. 4 summarizes two examples of case extension learning with this data
set. On the left of Fig. 4, we implement the agent’s input as a nominal feature;
on the right, we use the three features used by the agent. The baseline accuracy
is 17.8% and 53.5%, respectively. This is not surprising, given that the agent’s inputs
are the basis of the agent’s learning.</p>
      <p>We created 29 features by combining values of subsets of features and
decisions. We only describe those that improved accuracy. Feature X13 is obtained
with a function that assigns 1 when, despite the debt-income ratio being greater
than 75%, the applicant is still approved for credit. For Feature X15, the
function assigns 1 if the debt-income ratio is less than 25% and the result is approved.
Feature X18 is the same as X13 for rejected applications. Feature X27 is valued
1 when the credit score is less than 650 and the decision is accept. Feature X29
requires the credit score to be below 750 and the debt-income ratio not to be &lt;25% when
the class is reject to receive value 1 and is zero otherwise.</p>
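      <p>
        A sketch of these rule-based feature functions appears below; it assumes the bin encodings given earlier (e.g., X3 = 3 means a debt-income ratio above 75%, X3 = 0 below 25%, and X2 values 0 and 1 correspond to credit scores below 650), so the exact thresholds should be read as our interpretation of the text.
      </p>
      <preformat>
# Sketch of rule-based supplemental features for the tabular data (interpreted encodings).
def x13(x3, decision):
    # 1 if the debt-income ratio is in the highest bin (above 75%) yet credit was approved.
    return int(x3 == 3 and decision == "accept")

def x15(x3, decision):
    # 1 if the debt-income ratio is in the lowest bin (below 25%) and credit was approved.
    return int(x3 == 0 and decision == "accept")

def x18(x3, decision):
    # Same condition as X13, but for rejected applications.
    return int(x3 == 3 and decision == "reject")

def x27(x2, decision):
    # 1 if the credit score falls below 650 (bins 0 or 1) and the decision is accept.
    return int(x2 in (0, 1) and decision == "accept")

def x29(x2, x3, decision):
    # 1 if the credit score is below 750 (bins 0-2), the debt-income ratio is not in
    # the lowest bin, and the decision is reject.
    return int(x2 in (0, 1, 2) and x3 != 0 and decision == "reject")
      </preformat>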
        <p>With the same supplemental features, the accuracy improves about 60% when
using nominal values for input, and about 23% when the input features are used
(Fig. 4). We also note the combinations of supplemental features reveal different
performances in these two executions, potentially suggesting that some of the
supplemental features are redundant with respect to the agent’s input. Note
how they improved accuracy when considered alone and in combination with
other features. The best performing feature changes from X27 to X29 in the two
executions.
</p>
      </sec>
      <sec id="sec-3-3">
        <title>KBXAI in Image data</title>
        <p>
          The data for the study with images is a subset of CIFAR-10 [
          <xref ref-type="bibr" rid="ref45">45</xref>
          ]. Out of 10
classes, we selected four, namely, dogs, trucks, cats, and horses. We formulated
these data as a binary classification of dog or not dog. The entire data set has
5,000 images per class for training and 1,000 for testing. We trained a VGG-16
architecture that reached 85% accuracy for the binary classification.
        </p>
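        <p>A sketch of such a binary classifier using torchvision's VGG-16 is given below; whether pretrained weights were used and the exact head configuration are not stated in the text, so both are assumptions here.</p>
        <preformat>
# Sketch: VGG-16 adapted to the binary dog / not-dog task on CIFAR-10 images.
import torch
from torch import nn
from torchvision import models

vgg = models.vgg16(weights=None)           # assumption: trained from scratch
vgg.classifier[6] = nn.Linear(4096, 2)     # replace the 1000-way head with 2 classes

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(vgg.parameters(), lr=1e-4)  # assumed optimizer

def train_step(images, labels):
    """images: (batch, 3, H, W) tensor; labels: 0 = not dog, 1 = dog."""
    optimizer.zero_grad()
    loss = criterion(vgg(images), labels)
    loss.backward()
    optimizer.step()
    return float(loss)
        </preformat>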
        <p>To identify explanation categories, we adopted an example-based strategy for
selecting example images for explanations (See Section 2.1). The strategy is to
select images that are similar to the image whose classification we want to explain. The
candidate images to be used for explanations are the false negatives produced
by the binary VGG-16 architecture classifier. The false negatives are all images
of the class dog that were misclassified (outliers). See example in Fig. 5a.</p>
        <p>To create explanation categories, we selected a subset of CIFAR-10 test
instances from the selected classes. For each instance, we computed the cosine similarity
between the embedding vectors of the test instance and all false negative dog
images from the initial test set, excluding the image itself. To create the
explanation category, we used two candidate images with the highest similarity score.
Embedding vectors of images were created with an autoencoder.</p>
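        <p>
          A minimal sketch of this selection step is shown below, assuming the autoencoder embeddings are available as vectors; the function names are illustrative.
        </p>
        <preformat>
# Sketch: pick the two false-negative dog images most similar to a test instance.
import numpy as np

def cosine_sim(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def explanation_category_for(instance_emb, fn_embs, instance_id, fn_ids):
    """Return ids of the two false negatives with the highest cosine similarity."""
    sims = []
    for emb, img_id in zip(fn_embs, fn_ids):
        if img_id == instance_id:
            continue                       # exclude the image itself
        sims.append((cosine_sim(instance_emb, emb), img_id))
    sims.sort(reverse=True)                # highest similarity first
    return [img_id for _, img_id in sims[:2]]
        </preformat>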
        <p>After the explanation categories were created, we created the data set that
maps the agent’s input instances and their classification to their explanation
category. Note this step was done as a proxy for having a subject matter expert
select explanation categories. We used this approach because we did not want
any of the authors to interfere with this selection, given that we would later propose the
supplemental features. This mapping refers to identifying the correct explanation
category for each instance. We selected explanation categories for each instance
by computing the median value of the cosine similarity between the embedding
vectors of the instances and the two images in the explanation category. Once
this step was completed, we removed duplicates from the resulting explanation
categories. We then removed the explanation categories obtained with lower
cosine values. We then examined the mapping of testing instances to explanation
categories to select the explanation categories that explained more instances as
a measure of their popularity. Finally, we took the 12 most popular explanation
categories and randomly selected 10 testing instances mapped to each. The final
data set has 120 instances and 12 explanation categories.</p>
        <p>
          For images, we utilized both model and domain knowledge to propose features
(Fig. 6). From the model, we computed the image’s saliency [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Saliency brought
the 24% baseline to 25%. Truck similarity is a feature that assigns 1 to truck
images and, to all other images, the cosine similarity between the embedding of the image instance and the
embedding of a truck image we selected as typical (See left of Fig. 7). This feature is
from the data, as trucks are also part of the data set.
        </p>
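        <p>
          A sketch of the truck similarity feature (and, analogously, frog similarity) follows; the embedding function and the choice of prototype image are placeholders for the autoencoder embeddings and the typical images in Fig. 7.
        </p>
        <preformat>
# Sketch: similarity-to-prototype features (truck similarity; frog similarity is analogous).
import numpy as np

def prototype_similarity(instance_emb, prototype_emb, is_prototype_class=False):
    """1.0 for images of the prototype class; otherwise cosine similarity to the prototype."""
    if is_prototype_class:
        return 1.0
    num = np.dot(instance_emb, prototype_emb)
    den = np.linalg.norm(instance_emb) * np.linalg.norm(prototype_emb)
    return float(num / den)

# Usage (hypothetical embeddings and labels):
# truck_similarity = prototype_similarity(emb(image), emb(typical_truck),
#                                         is_prototype_class=(label == "truck"))
        </preformat>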
        <p>
          All other features are from commonsense knowledge, which here replaces domain
knowledge. Frog similarity is the cosine similarity between the embedding of the image instance
and the embedding of a frog image we selected as typical of a frog (See Fig. 7). Frog similarity alone
increased accuracy from 24% to 32.5%. The two last features were selected based on what the
authors perceived in the pictures as possibly explaining the recognition of a dog. Note
these images are so blurred that human accuracy is estimated to be around
90% [
          <xref ref-type="bibr" rid="ref46">46</xref>
          ]. We found that dogs commonly have their eyes and nose as a triangle,
marked in red in Fig. 5b. Analogously, two paws refers to the images where the
two front paws of the dog are clearly distinguishable.
        </p>
        <p>Fig. 7: Images of a typical truck and a typical frog selected for the features truck similarity and frog similarity.</p>
        <p>
          This example shows that the combination of features from the model and from
commonsense knowledge together works better to increase accuracy. Individually, frog
similarity showed the best performance. Note that frog images were not included in the data,
making this a feature that is extrinsic to the model and the data.
        </p>
        <p>
          The KBXAI implementation with textual data was previously published in
[
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]. The data set was built from a selection of 10 scientific articles. The agent
used is a citation recommender [
          <xref ref-type="bibr" rid="ref47">47</xref>
          ] that produces articles to be cited in the input
article. We submitted the 10 articles as inputs to the recommender 10 times to
create 100 cases for KBXAI. The explanation categories were learned from the
domain of citation analysis. There are only two explanation categories, namely,
background and paraphrasing. Fig. 8 shows results of case extension learning.
The baseline of 63.43% was increased to an accuracy of 72.64% with only two
features. The baseline accuracy is higher than with the previous data, probably due
to the binary selection of explanation categories. The improvement was only 14.5%.
As in the image data, only nominal features were used, which limits accuracy.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Concluding remarks</title>
      <p>
        Incorporation of knowledge engineering inevitably carries its own problems, such as
difficulty scaling. However, KBXAI only requires knowledge acquisition to
capture the additional perspectives that account for multiple stakeholders; it is not
meant to acquire knowledge for an entire agent. The explanations tend to repeat,
and this is why we group them in categories. An open question is how
domain knowledge can be learned for incorporation into KBXAI. The
features we added to the tabular example are functions based on rules. This is
aligned with explanation-based learning [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ], pointing to a future direction.
      </p>
      <p>
        The studies in this paper suggest it is necessary to leverage the
representation used by the agent and not nominal features to represent the agent’s input.
This imposes on KBXAI, and consequently on CBR, the requirement to handle any
type of representation. To meet this requirement, one direction is to adopt ANN-CBR
twins [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] and utilize the original representations. The next steps also include
implementation in real-world scenarios and evaluation with humans.
      </p>
      <p>
        Acknowledgments. The authors thank the anonymous reviewers for
suggestions to improve this paper. Support for the preparation of this paper was
provided by NCATS, through the Biomedical Data Translator program (NIH award
3OT2TR003448-01S1). Authors Weber and Shrestha are also partially funded
by DARPA-PA-20-02-06-POCUS-AI-FP-023.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Amina</given-names>
            <surname>Adadi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mohammed</given-names>
            <surname>Berrada</surname>
          </string-name>
          .
          <article-title>Peeking inside the black-box: A survey on explainable artificial intelligence (xai)</article-title>
          .
          <source>IEEE Access</source>
          , PP:
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          ,
          <year>09 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Finale</surname>
            Doshi-Velez and
            <given-names>Been</given-names>
          </string-name>
          <string-name>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Towards a rigorous science of interpretable machine learning</article-title>
          .
          <source>arXiv preprint arXiv:1702.08608</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Rajiv</given-names>
            <surname>Khanna</surname>
          </string-name>
          , Been Kim, Joydeep Ghosh, and
          <string-name>
            <given-names>Sanmi</given-names>
            <surname>Koyejo</surname>
          </string-name>
          .
          <article-title>Interpreting black box predictions using fisher kernels</article-title>
          .
          <source>In The 22nd International Conference on Artificial Intelligence and Statistics</source>
          , pages
          <fpage>3382</fpage>
          -
          <lpage>3390</lpage>
          . PMLR,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Robert</given-names>
            <surname>Andrews</surname>
          </string-name>
          , Joachim Diederich, and Alan B Tickle.
          <article-title>Survey and critique of techniques for extracting rules from trained artificial neural networks</article-title>
          .
          <source>Knowledgebased systems</source>
          ,
          <volume>8</volume>
          (
          <issue>6</issue>
          ):
          <fpage>373</fpage>
          -
          <lpage>389</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Karen</given-names>
            <surname>Simonyan</surname>
          </string-name>
          , Andrea Vedaldi, and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Zisserman</surname>
          </string-name>
          .
          <article-title>Deep inside convolutional networks: Visualising image classification models and saliency maps</article-title>
          .
          <source>arXiv preprint arXiv:1312.6034</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Avanti</given-names>
            <surname>Shrikumar</surname>
          </string-name>
          , Peyton Greenside, and
          <string-name>
            <given-names>Anshul</given-names>
            <surname>Kundaje</surname>
          </string-name>
          .
          <article-title>Learning important features through propagating activation differences</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          , pages
          <fpage>3145</fpage>
          -
          <lpage>3153</lpage>
          . PMLR,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Conor</given-names>
            <surname>Nugent</surname>
          </string-name>
          and Pádraig Cunningham.
          <article-title>A case-based explanation system for black-box systems</article-title>
          .
          <source>Artificial Intelligence Review</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ):
          <fpage>163</fpage>
          -
          <lpage>178</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>David</given-names>
            <surname>Gunning</surname>
          </string-name>
          and
          <string-name>
            <given-names>David</given-names>
            <surname>Aha</surname>
          </string-name>
          .
          <article-title>Darpa's explainable artificial intelligence (xai) program</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>40</volume>
          (
          <issue>2</issue>
          ):
          <fpage>44</fpage>
          -
          <lpage>58</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Folke</surname>
          </string-name>
          , Scott Cheng-Hsin
          <string-name>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Sean Anderson</surname>
            , and
            <given-names>Patrick</given-names>
          </string-name>
          <string-name>
            <surname>Shafto</surname>
          </string-name>
          .
          <article-title>Explainable ai for medical imaging: explaining pneumothorax diagnoses with bayesian teaching</article-title>
          .
          <source>arXiv preprint arXiv:2106.04684</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. Brian Y Lim.
          <article-title>Improving understanding and trust with intelligibility in contextaware applications</article-title>
          .
          <source>PhD thesis</source>
          , Carnegie Mellon University,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Ingrid</given-names>
            <surname>Nunes</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dietmar</given-names>
            <surname>Jannach</surname>
          </string-name>
          .
          <article-title>A systematic review and taxonomy of explanations in decision support and recommender systems</article-title>
          .
          <source>User Modeling</source>
          and
          <string-name>
            <surname>User-Adapted</surname>
            <given-names>Interaction</given-names>
          </string-name>
          ,
          <volume>27</volume>
          (
          <issue>3</issue>
          ):
          <fpage>393</fpage>
          -
          <lpage>444</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Shruthi</surname>
            <given-names>Chari</given-names>
          </string-name>
          , Oshani Seneviratne, Daniel M Gruen, Morgan A Foreman,
          <article-title>Amar K Das,</article-title>
          and
          <string-name>
            <surname>Deborah L McGuinness</surname>
          </string-name>
          .
          <article-title>Explanation ontology: A model of explanations for user-centered ai</article-title>
          .
          <source>In International Semantic Web Conference</source>
          , pages
          <fpage>228</fpage>
          -
          <lpage>243</lpage>
          . Springer,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Fatih</surname>
            <given-names>Gedikli</given-names>
          </string-name>
          , Dietmar Jannach, and
          <string-name>
            <given-names>Mouzhi</given-names>
            <surname>Ge</surname>
          </string-name>
          .
          <article-title>How should i explain? a comparison of different explanation types for recommender systems</article-title>
          .
          <source>International Journal of Human-Computer Studies</source>
          ,
          <volume>72</volume>
          (
          <issue>4</issue>
          ):
          <fpage>367</fpage>
          -
          <lpage>382</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Giulia</given-names>
            <surname>Vilone</surname>
          </string-name>
          and
          <string-name>
            <given-names>Luca</given-names>
            <surname>Longo</surname>
          </string-name>
          .
          <article-title>Explainable artificial intelligence: a systematic review</article-title>
          .
          <source>arXiv preprint arXiv:2006.00093</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Scott</given-names>
            <surname>Lundberg</surname>
          </string-name>
          and
          <string-name>
            <surname>Su-In Lee</surname>
          </string-name>
          .
          <article-title>A unified approach to interpreting model predictions</article-title>
          .
          <source>arXiv preprint arXiv:1705.07874</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Sebastian</surname>
            <given-names>Bach</given-names>
          </string-name>
          , Alexander Binder, Grégoire Montavon, Frederick Klauschen,
          <article-title>Klaus-Robert Müller, and Wojciech Samek</article-title>
          .
          <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>
          .
          <source>PloS one</source>
          ,
          <volume>10</volume>
          (
          <issue>7</issue>
          ):e0130140,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. Marco Tulio Ribeiro,
          <string-name>
            <given-names>Sameer</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Carlos</given-names>
            <surname>Guestrin</surname>
          </string-name>
          .
          <article-title>“Why should I trust you?” Explaining the predictions of any classifier</article-title>
          .
          <source>In 22nd ACM SIGKDD</source>
          , pages
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Mukund</surname>
            <given-names>Sundararajan</given-names>
          </string-name>
          , Ankur Taly, and
          <string-name>
            <given-names>Qiqi</given-names>
            <surname>Yan</surname>
          </string-name>
          .
          <article-title>Axiomatic attribution for deep networks</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          , pages
          <fpage>3319</fpage>
          -
          <lpage>3328</lpage>
          . PMLR,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. Pang Wei Koh and
          <string-name>
            <given-names>Percy</given-names>
            <surname>Liang</surname>
          </string-name>
          .
          <article-title>Understanding black-box predictions via influence functions</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          , pages
          <fpage>1885</fpage>
          -
          <lpage>1894</lpage>
          . PMLR,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Chih-Kuan</surname>
            <given-names>Yeh</given-names>
          </string-name>
          , Joon Sik Kim,
          <source>Ian EH Yen, and Pradeep Ravikumar</source>
          .
          <article-title>Representer point selection for explaining deep neural networks</article-title>
          .
          <source>arXiv preprint arXiv:1811.09720</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Dominique</surname>
            <given-names>Mercier</given-names>
          </string-name>
          , Shoaib Ahmed Siddiqui, Andreas Dengel, and
          <string-name>
            <given-names>Sheraz</given-names>
            <surname>Ahmed</surname>
          </string-name>
          .
          <article-title>Interpreting deep models through the lens of data</article-title>
          .
          <source>In 2020 International Joint Conference on Neural Networks (IJCNN)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . IEEE,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Elnaz</surname>
            <given-names>Barshan</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marc-Etienne Brunet</surname>
          </string-name>
          , and Gintare Karolina Dziugaite. Relatif:
          <article-title>Identifying explanatory training samples via relative influence</article-title>
          .
          <source>In International Conference on Artificial Intelligence and Statistics</source>
          , pages
          <fpage>1899</fpage>
          -
          <lpage>1909</lpage>
          . PMLR,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Been</surname>
            <given-names>Kim</given-names>
          </string-name>
          , Cynthia Rudin, and
          <article-title>Julie A Shah. The bayesian case model: A generative approach for case-based reasoning and prototype classification</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          , pages
          <fpage>1952</fpage>
          -
          <lpage>1960</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Eoin M Kenny and Mark T Keane</surname>
          </string-name>
          .
          <article-title>Twin-systems to explain artificial neural networks using case-based reasoning: Comparative tests of feature-weighting methods in ann-cbr twins for xai</article-title>
          . In Twenty-Eighth
          <source>International Joint Conferences on Artifical Intelligence (IJCAI)</source>
          ,
          <year>Macao</year>
          ,
          <fpage>10</fpage>
          -16
          <source>August</source>
          <year>2019</year>
          , pages
          <fpage>2708</fpage>
          -
          <lpage>2715</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Eoin</surname>
            <given-names>M Kenny</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Courtney</given-names>
            <surname>Ford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Molly</given-names>
            <surname>Quinn</surname>
          </string-name>
          , and Mark T Keane.
          <article-title>Explaining blackbox classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in xai user studies</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>294</volume>
          :
          <fpage>103459</fpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Oscar</surname>
            <given-names>Li</given-names>
          </string-name>
          , Hao Liu, Chaofan Chen, and
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Rudin</surname>
          </string-name>
          .
          <article-title>Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions</article-title>
          .
          <source>In Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>32</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Julius</surname>
            <given-names>Adebayo</given-names>
          </string-name>
          , Justin Gilmer, Michael Muelly, Ian Goodfellow,
          <string-name>
            <given-names>Moritz</given-names>
            <surname>Hardt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Been</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Sanity checks for saliency maps</article-title>
          .
          <source>In 32nd NeurIPS</source>
          , pages
          <fpage>9525</fpage>
          -
          <lpage>9536</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Pieter-Jan</surname>
            <given-names>Kindermans</given-names>
          </string-name>
          , Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof T Schütt, Sven Dähne, Dumitru Erhan, and
          <string-name>
            <given-names>Been</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>The (un) reliability of saliency methods</article-title>
          .
          <source>In Explainable AI: Interpreting, Explaining and Visualizing Deep Learning</source>
          , pages
          <fpage>267</fpage>
          -
          <lpage>280</lpage>
          . Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Anurag</surname>
            <given-names>Koul</given-names>
          </string-name>
          , Sam Greydanus, and
          <string-name>
            <given-names>Alan</given-names>
            <surname>Fern</surname>
          </string-name>
          .
          <article-title>Learning finite state representations of recurrent policy networks</article-title>
          .
          <source>arXiv preprint arXiv:1811.12530</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30. Markus Zanker and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Ninaus</surname>
          </string-name>
          .
          <article-title>Knowledgeable explanations for recommender systems</article-title>
          .
          <source>In 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>657</fpage>
          -
          <lpage>660</lpage>
          . IEEE,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Moore-JD Swartout</surname>
            ,
            <given-names>WR.</given-names>
          </string-name>
          <article-title>Explanation in second generation expert systems</article-title>
          .
          <source>In Second generation expert systems</source>
          , pages
          <fpage>543</fpage>
          -
          <lpage>585</lpage>
          . Springer,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Michael R Wick and William B</surname>
          </string-name>
          <article-title>Thompson</article-title>
          .
          <article-title>Reconstructive explanation: Explanation as complex problem solving</article-title>
          .
          <source>In IJCAI</source>
          , pages
          <fpage>135</fpage>
          -
          <lpage>140</lpage>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Ralph</surname>
            <given-names>Bergmann</given-names>
          </string-name>
          , Gerd Pews, and
          <string-name>
            <given-names>Wolfgang</given-names>
            <surname>Wilke</surname>
          </string-name>
          .
          <article-title>Explanation-based similarity: A unifying approach for integrating domain knowledge into case-based reasoning for diagnosis and planning tasks</article-title>
          .
          <source>In European Workshop on Case-Based Reasoning</source>
          , pages
          <fpage>182</fpage>
          -
          <lpage>196</lpage>
          . Springer,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <given-names>Agnar</given-names>
            <surname>Aamodt</surname>
          </string-name>
          .
          <article-title>Explanation-driven case-based reasoning</article-title>
          .
          <source>In European Workshop on Case-Based Reasoning</source>
          , pages
          <fpage>274</fpage>
          -
          <lpage>288</lpage>
          . Springer,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Rosina</surname>
            <given-names>Weber</given-names>
          </string-name>
          , Adam Johs,
          <string-name>
            <given-names>Jianfei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Kent</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <article-title>Investigating textual case-based xai</article-title>
          .
          <source>In LNCS</source>
          , Springer, volume
          <volume>11156</volume>
          , page 431-447,
          <year>07 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Ribana</surname>
            <given-names>Roscher</given-names>
          </string-name>
          , Bastian Bohn, Marco F Duarte,
          <string-name>
            <surname>and Jochen Garcke.</surname>
          </string-name>
          <article-title>Explainable machine learning for scientific insights and discoveries</article-title>
          .
          <source>Ieee Access</source>
          ,
          <volume>8</volume>
          :
          <fpage>42200</fpage>
          -
          <lpage>42216</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <given-names>Catia</given-names>
            <surname>Pesquita</surname>
          </string-name>
          .
          <article-title>Towards semantic integration for explainable artificial intelligence in the biomedical domain</article-title>
          .
          <source>In HEALTHINF</source>
          , pages
          <fpage>747</fpage>
          -
          <lpage>753</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Laura</surname>
            <given-names>Rieger</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Chandan</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>William</given-names>
            <surname>Murdoch</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Bin</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>Interpretations are useful: penalizing explanations to align neural networks with prior knowledge</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          , pages
          <fpage>8116</fpage>
          -
          <lpage>8126</lpage>
          . PMLR,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Schramowski</surname>
          </string-name>
          , Wolfgang Stammer, Stefano Teso, Anna Brugger, Xiaoting Shao,
          <string-name>
            <given-names>Hans-Georg</given-names>
            <surname>Luigs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Anne-Katrin</given-names>
            <surname>Mahlein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kristian</given-names>
            <surname>Kersting</surname>
          </string-name>
          .
          <article-title>Making deep neural networks right for the right scientific reasons by interacting with their explanations</article-title>
          .
          <source>arXiv preprint arXiv:2001.05371</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <given-names>Sara</given-names>
            <surname>Hooker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dumitru</given-names>
            <surname>Erhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pieter-Jan</given-names>
            <surname>Kindermans</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Been</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>A benchmark for interpretability methods in deep neural networks</article-title>
          .
          <source>arXiv preprint arXiv:1806.10758</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          41.
          <string-name>
            <given-names>Piotr</given-names>
            <surname>Dabkowski</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yarin</given-names>
            <surname>Gal</surname>
          </string-name>
          .
          <article-title>Real time image saliency for black box classifiers</article-title>
          .
          <source>In Proceedings of the 31st International Conference on Neural Information Processing Systems</source>
          , pages
          <fpage>6970</fpage>
          -
          <lpage>6979</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          42.
          <string-name>
            <given-names>Ruth C</given-names>
            <surname>Fong</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Vedaldi</surname>
          </string-name>
          .
          <article-title>Interpretable explanations of black boxes by meaningful perturbation</article-title>
          .
          <source>In 2017 IEEE International Conference on Computer Vision</source>
          (ICCV), pages
          <fpage>3449</fpage>
          -
          <lpage>3457</lpage>
          . IEEE,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          43.
          <string-name>
            <given-names>I.</given-names>
            <surname>Kononenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Šimec</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Robnik-Šikonja</surname>
          </string-name>
          .
          <article-title>Overcoming the myopia of inductive learning algorithms with RELIEFF</article-title>
          .
          <source>Applied Intelligence</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ):
          <fpage>39</fpage>
          -
          <lpage>55</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          44.
          <string-name>
            <given-names>Shideh Shams</given-names>
            <surname>Amiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rosina O</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Prateek</given-names>
            <surname>Goel</surname>
          </string-name>
          , Owen Brooks, Archer Gandley, Brian Kitchell, and
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Zehm</surname>
          </string-name>
          .
          <article-title>Data representing ground-truth explanations to evaluate XAI methods</article-title>
          .
          <source>arXiv preprint arXiv:2011.09892</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          45.
          <string-name>
            <given-names>Alex</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Geoffrey</given-names>
            <surname>Hinton</surname>
          </string-name>
          , et al.
          <article-title>Learning multiple layers of features from tiny images</article-title>
          .
          <source>Technical Report</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          46.
          <string-name>
            <surname>Tien</surname>
          </string-name>
          Ho-Phuoc.
          <article-title>Cifar10 to compare visual recognition performance between deep neural networks and humans</article-title>
          . arXiv preprint arXiv:
          <year>1811</year>
          .07270,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          47.
          <string-name>
            <given-names>Chandra</given-names>
            <surname>Bhagavatula</surname>
          </string-name>
          , Sergey Feldman, Russell Power, and
          <string-name>
            <given-names>Waleed</given-names>
            <surname>Ammar</surname>
          </string-name>
          .
          <article-title>Content-based citation recommendation</article-title>
          .
          <source>In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          , pages
          <fpage>238</fpage>
          -
          <lpage>251</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          48.
          <string-name>
            <given-names>Gerald</given-names>
            <surname>DeJong</surname>
          </string-name>
          and
          <string-name>
            <given-names>Raymond</given-names>
            <surname>Mooney</surname>
          </string-name>
          .
          <article-title>Explanation-based learning: An alternative view</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>145</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>