<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Explaining Explanation Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Guidotti</surname>
            <given-names>Riccardo</given-names>
          </name>
          <xref ref-type="aff" rid="aff0"/>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Pisa</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>0000</year>
      </pub-date>
      <volume>0002</volume>
      <fpage>6</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>The most effective Artificial Intelligence (AI) systems exploit complex machine learning models to fulfill their tasks due to their high performance. Unfortunately, the most effective machine learning models use, for their decision processes, a logic that is not understandable to humans, which makes them real black-box models. The lack of transparency on how AI systems make decisions is a clear limitation to their adoption in safety-critical and socially sensitive contexts. Consequently, since the applications in which AI is employed are various, research in eXplainable AI (XAI) has recently caught much attention, with specific distinct requirements for different types of explanations for different users. In this paper, we briefly present the existing explanation problems, the main strategies adopted to solve them, and the desiderata for XAI methods. Finally, the most common types of explanations are illustrated with references to state-of-the-art explanation methods able to retrieve them.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Nowadays, Artificial Intelligence is one of the most important scientific and
technological areas, with a huge socio-economic impact and a pervasive adoption in
every field of modern society. High-profile applications such as autonomous
vehicles, medical diagnosis, spam filtering, image recognition, and voice
assistants are based on Artificial Intelligence (AI) systems. Modern AI is mainly
based on Machine Learning (ML) models that allow AI systems to reach impressive
performance in emulating human behavior. The most effective ML models are
black-box models [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], i.e., obscure decision-making or predictive methods that
"hide" the logic of their internal decision processes from humans, either because it is not
human-understandable or because it is not directly accessible. Examples of black-box
models include Neural Networks and Deep Neural Networks, SVMs, Ensemble
classifiers such as Random Forest, but also compositions of expert systems, data
mining, and hard-coded software. The choice for the adoption of these obscure
models is driven by the high performance in terms of accuracy [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. As a
consequence, the last decade has witnessed the rise of a black-box society [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>
        The lack of explanations of how these black-box models make decisions is
a restriction for their adoption in safety-critical contexts and socially sensitive
domains such as healthcare or law. Moreover, the problem is not only the lack of
transparency but also the possible biases inherited by black-box models from
artifacts and preconceptions hidden in the training data of the ML algorithms.
Predictive ML models learned on biased datasets may inherit such biases, possibly
leading to unfair and wrong decisions. Consequences of biased misclassifications
can damage decision-makers and put certain societal groups at risk [
        <xref ref-type="bibr" rid="ref28 ref39 ref9">9, 28, 39</xref>
        ].
For instance, the AI software used by Amazon to determine the areas of the US
to which Amazon would offer free same-day delivery unintentionally restricted
minority neighborhoods from participating in the program, often when every
surrounding neighborhood was allowed (http://www.techinsider.io/how-algorithms-can-be-racist-2016-4). Another example is related to
propublica.org. Their journalists have shown that the COMPAS score, a predictive
model for the "risk of crime recidivism" (a proprietary secret of Northpointe),
has a strong ethnic bias. Indeed, according to this score, a black person who did not
re-offend was classified as "high risk" twice as often as white people who did not
re-offend. On the other hand, white repeat offenders were classified as "low risk"
twice as often as black repeat offenders (http://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing). In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] it is shown that the neural network
used to learn word embeddings for the English language was encoding gender biases
and stereotypes. The authors show that, for the analogy "Man is to computer
programmer as woman is to X", the variable X was replaced by "homemaker"
by the neural network. Consequently, the research in eXplainable AI (XAI) and
on the study of explanation methods for obscure ML models has recently caught
much attention [
        <xref ref-type="bibr" rid="ref1 ref18 ref26 ref40 ref5">1, 5, 18, 26, 40</xref>
        ].
      </p>
      <p>
        In addition, an innovative aspect of the General Data Protection Regulation
(GDPR) promulgated by the European Parliament, which became enforceable in
May 2018, is the set of clauses on automated decision-making. The GDPR, for the
first time, introduces, to some extent, a right of explanation for all individuals to
obtain "meaningful explanations of the logic involved" when automated decision
making takes place. Despite conflicting opinions among legal scholars regarding
the real scope of these clauses [
        <xref ref-type="bibr" rid="ref15 ref24 ref37">15, 24, 37</xref>
        ], there is a joint agreement that
the implementation of such a principle is imperative and that it represents
today a huge open scientific challenge. However, without technology capable of
explaining the logic of black boxes, the right to explanation will remain a "dead
letter". How can companies trust their AI services without understanding and
validating the underlying rationale of their ML components? Furthermore, in
turn, how can users trust AI services? It will be impossible to increase the trust
of people in AI without explaining the rationale followed by these models. These
are the reasons why explanation is now at the heart of responsible, open data
science across multiple industry sectors and scientific disciplines.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Explanation Methods</title>
      <p>
        A black-box predictor is an obscure ML model, whose internals are either unknown to
the observer, or known but uninterpretable by humans [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Therefore,
in ML, to interpret means to give or provide the meaning, or to explain in
understandable terms, the predictive process of a model to a human [
        <xref ref-type="bibr" rid="ref13 ref5">5, 13</xref>
        ]. It is
assumed that the concepts composing an explanation are self-contained and do
not need further explanations [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The most widely used approach to explain
black-box models and return interpretations is a sort of reverse engineering: the
explanation is learned by observing the changes in the black-box output when
varying the input. In the following, a set of dimensions is identified to analyze ML interpretability
and explanation methods and, in turn, to reflect on the existing types of explanations.
      </p>
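      <p>
        As a minimal sketch of this reverse-engineering strategy (the black-box, the
perturbation scheme, and the surrogate below are illustrative assumptions, not the
procedure of any specific method surveyed here), one can query the black-box on
perturbed inputs and train an interpretable surrogate on its answers:
      </p>
      <preformat>
# Sketch: explaining a black-box by reverse engineering (illustrative setup).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier   # plays the black-box role
from sklearn.tree import DecisionTreeClassifier       # interpretable surrogate

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Vary the input: sample perturbed instances around the observed data.
X_pert = X + np.random.normal(scale=0.5, size=X.shape)

# Observe the changes in the black-box output on the varied input.
y_bb = black_box.predict(X_pert)

# Learn the explanation: a shallow tree that mimics the black-box decisions.
surrogate = DecisionTreeClassifier(max_depth=3).fit(X_pert, y_bb)
print("fidelity:", (surrogate.predict(X_pert) == y_bb).mean())
      </preformat>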
      <p>
        Explanation Problems. In the literature, we recognize two types of
problems: black box explanation and explanation by design [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The black-box
explanation idea is to couple an ML black-box model with an explanation method
able to interpret the black-box decisions. The underlying strategy is to maintain
the high performance of the obscure model and to use an explanation method to
retrieve the explanations [
        <xref ref-type="bibr" rid="ref12 ref23 ref29">12, 23, 29</xref>
        ]. The explanation methods generally try to
approximate the black-box behavior with an interpretable predictor, also named
surrogate model. This kind of approach is the one most addressed nowadays in
the XAI research field. On the other hand, the explanation by design consists of
directly designing a transparent model that is interpretable by design and aims
at replacing the obscure ML model with the new transparent one [
        <xref ref-type="bibr" rid="ref32 ref33">32, 33</xref>
        ].
      </p>
      <p>
        In the literature, there are various models recognized to be interpretable.
Examples are decision trees, decision rules, and linear models [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. These models
are considered easily understandable and interpretable by humans. However, they
sacrifice performance for interpretability. Besides, most of them cannot be applied to
data types such as images or text, but only to tabular data.
      </p>
      <p>
        Explanation Targets and Strategy. We recognize global and local
explanation methods depending on the target of the explanation. A global explanation
consists in providing an explanation that allows understanding the whole logic
of a black-box model and interpreting any possible decision. Global explanations
are difficult to achieve and, in the literature, are provided only for tabular data.
On the other hand, a local explanation consists in retrieving the reasons for the
prediction returned by a black-box model for a specific case. While for a global
explanation the interpretable surrogate approximates the whole black-box, for a
local explanation the interpretable surrogate model is used to approximate the
black-box behavior only in a "neighborhood" of the instance analyzed. The idea
is that, in such a neighborhood, it is easier to explain the decision boundary [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
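      <p>
        A local surrogate can be sketched in the same spirit (again an illustration under
assumed choices of neighborhood generation and surrogate model, not the procedure of
any specific method): synthetic instances are sampled around the record to be
explained, and the interpretable model is fitted only on the black-box answers in
that neighborhood:
      </p>
      <preformat>
# Sketch: a local surrogate fitted in the neighborhood of a single instance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]                                              # instance to explain
neigh = x + np.random.normal(scale=0.5, size=(500, X.shape[1]))
p_bb = black_box.predict_proba(neigh)[:, 1]           # black-box answers nearby

# A linear surrogate valid only around x: its coefficients approximate
# the local behavior of the black-box decision boundary.
local = Ridge().fit(neigh, p_bb)
print("local feature weights:", local.coef_.round(3))
      </preformat>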
      <p>
        In addition, we distinguish between model-specific and model-agnostic
explanation methods depending on the strategy adopted. An explanation method
is model-specific, or not generalizable [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], if it can be used to interpret only
particular types of black-box models. If an explanation method is designed to
interpret a Random Forest [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] and internally uses a distance between trees, such
a method cannot be used to explain the predictions of a Neural Network. On the
other hand, a generalizable or model-agnostic explanation method can be used
independently from the black-box model being explained because the internal
characteristics of the black-box are not exploited to retrieve the explanation [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>
        Desiderata of Explainable Methods. A set of desiderata should be
considered when designing and using explanation methods [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The interpretability
aspect should measure to what extent a given explanation is
human-understandable. Interpretability is generally evaluated with the complexity of the
interpretable surrogate model. For example, the complexity of a rule can be measured
with the number of clauses in the condition, for linear models with the number
of non-zero weights, while for decision trees with the depth of the tree. The
performance of the interpretable surrogate model from which explanations are
extracted is generally called fidelity and measures to what extent it accurately
imitates the black-box prediction. The fidelity is practically measured in terms
of Accuracy, F1-score, etc. [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] with respect to the prediction of the
black-box model. Moreover, an interpretable model should guarantee fairness
by protecting minorities against discrimination [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], and privacy by not revealing
sensitive information [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Also, an explanation method must return robust and
stable explanations: similar instances should have similar explanations for a given
black-box model [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. In addition, since the meaningfulness of an explanation
depends on the stakeholder [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the explanation returned must consider the user
background: common users require simple clarifications, while domain experts
may be able to understand complex explanations. Finally, the time that a user
is allowed to spend on understanding an explanation is another crucial aspect.
In contexts where decision time is not a constraint, one might prefer a more
exhaustive explanation, while when the user needs to make a decision quickly, it
is preferable to have an explanation that is "easy to read". Thus, an explanation method
must consider time limitations.
      </p>
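      <p>
        As a rough illustration of how such proxies can be computed (the surrogate models
and metrics below are assumptions chosen for brevity), the complexity and fidelity of
two interpretable surrogates can be compared as follows:
      </p>
      <preformat>
# Sketch: complexity and fidelity proxies for interpretable surrogates.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)
y_bb = black_box.predict(X)                 # black-box predictions to imitate

tree = DecisionTreeClassifier(max_depth=4).fit(X, y_bb)
linear = LogisticRegression(penalty="l1", solver="liblinear").fit(X, y_bb)

# Interpretability proxies: depth of the tree, number of non-zero weights.
print("tree depth:", tree.get_depth())
print("non-zero weights:", int(np.count_nonzero(linear.coef_)))

# Fidelity: how closely each surrogate imitates the black-box predictions.
print("tree fidelity (accuracy):", accuracy_score(y_bb, tree.predict(X)))
print("linear fidelity (F1):", round(f1_score(y_bb, linear.predict(X)), 3))
      </preformat>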
      <p>Types of Explanations. Research on XAI is producing various alternatives.
Explanation methods differ from one another depending on the type of
explanation returned. In the following, we illustrate the most used types of explanations
and highlight how explanation methods build them.</p>
      <p>
        – List of Rules. An explanation returned in the form of a list of rules implies
that the rules are read one after the other, and the first rule for which the
conditions are verified is used for prediction. Rules are in the form of if-then rules:
if conditions, then consequent. The consequent corresponds to the prediction,
while the conditions explain the factual reasons for the consequent. The
CORELS method [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a transparent-by-design method able to build a list
of rules with the aim of globally replacing the black-box model. A compact
set of rules is returned by the transparent predictive method proposed in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
– Single Tree Approximation. The black-box predictor is approximated
with a decision tree that represents all the possible decisions. The TREPAN
explanation method [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] allows one to globally explore a Neural Network through
a tree structure that, starting from the root, shows for every path the
conditions driving the decision process. TREPAN retrieves the decision tree by
maximizing a gain ratio [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] calculated on the fidelity with respect to the
predictions of an obscure Neural Network.
– Rule-based Explanation. A single if-then rule is used for local
explanations. The conditions of the rule explain the factual reasons for the
prediction. The LORE explanation method [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] builds a local decision tree in
the neighborhood of the instance analyzed, and then extracts from the tree a
single rule revealing the reasons for the decision on the specific instance. The
ANCHOR method [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] returns if-then rules called anchors. An anchor
contains a set of attributes with the values that are fundamental for obtaining
a certain prediction.
– Feature Importance. A feature importance-based local explanation
consists of attributes equipped with positive and negative values. The
explanation consists of both the sign and the magnitude of the contribution of the
attributes for a specific prediction. If the value is positive, then it contributes
by increasing the model's output; if the sign is negative, it decreases the
output of the model. LIME [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] adopts a linear model as the interpretable local
surrogate and returns the importance of the features as an explanation
exploiting the regression's coefficients. SHAP [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] provides the local unique
additive feature importance for a specific record exploiting Shapley values.
– Saliency Maps. In image processing, typical explanations consist of saliency
maps, i.e., images that show the positive (or negative) contribution of each
pixel to the black-box prediction. Saliency maps are built for locally
explaining DNN models by gradient [
        <xref ref-type="bibr" rid="ref34 ref35">34, 35</xref>
        ] and perturbation-based [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] attribution
methods. These explanation methods assign a score to each pixel such that
the probability of returning the same answer without considering irrelevant
pixels is maximized. Under appropriate image transformations that exploit
the concept of "superpixels", also methods such as LORE and LIME can be
employed to explain black-boxes working on images.
– Prototype-based Explanations. An explanation based on prototypes
returns specimens similar to the instance analyzed, which make clear the
reasons for the prediction. Prototype-based explanations can refer to any type of
data. In [
        <xref ref-type="bibr" rid="ref11 ref22">11, 22</xref>
        ], image prototypes are used as the foundation of the concept
for interpretability [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], the concept of counter-prototypes,
called criticisms, is discussed for tabular data, i.e., prototypes showing what should be
different to obtain another decision. Exemplar and counter-exemplar synthetic
images are generated by the ABELE explanation method [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to augment
the interpretability of local saliency maps.
– Counterfactual Explanations. A counterfactual explanation shows what
should have been different to change the prediction of the black-box model
(a toy search in this spirit is sketched after this list).
Counterfactuals help people in reasoning on the cause-effect relations
between observed features and classification outcomes [
        <xref ref-type="bibr" rid="ref10 ref4">4, 10</xref>
        ] and reveal what
should change in a given instance to obtain a different prediction [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]. The
explanation method proposed in [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] returns counterfactual explanations that
describe the smallest change that can be made to a given instance to obtain
a certain outcome by solving an optimization problem. The aforementioned
LORE [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], besides a factual explanation rule, also provides a set of
counterfactual rules extracted from the local decision tree, while ABELE [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
returns synthetically generated counter-exemplar images.
      </p>
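      <p>
        The toy search below illustrates the counterfactual idea under strong simplifying
assumptions (greedy single-feature perturbations on tabular data, with an assumed
black-box); it is not the optimization procedure of [<xref ref-type="bibr" rid="ref38">38</xref>]
nor the counterfactual rules of LORE or ABELE:
      </p>
      <preformat>
# Toy counterfactual search: greedily perturb one feature at a time until
# the black-box prediction flips (illustrative assumptions throughout).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def counterfactual(x, model, step=0.25, max_iter=200):
    """Return a perturbed copy of x that receives a different prediction."""
    original = model.predict([x])[0]
    cf = x.copy()
    for _ in range(max_iter):
        if model.predict([cf])[0] != original:
            return cf
        # Try every single-feature move; keep the one that most lowers the
        # confidence in the original class, breaking ties by distance from x.
        candidates = []
        for j in range(len(cf)):
            for delta in (-step, step):
                c = cf.copy()
                c[j] += delta
                conf = model.predict_proba([c])[0][original]
                candidates.append((conf, np.linalg.norm(c - x), c))
        cf = min(candidates, key=lambda t: (t[0], t[1]))[2]
    return None

x = X[0]
cf = counterfactual(x, black_box)
if cf is not None:
    changed = np.nonzero(~np.isclose(cf, x))[0]
    print("change features", changed.tolist(), "by", (cf - x)[changed].round(2))
      </preformat>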
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>AI systems based on obscure ML models cannot be the long-term solution for any
real application, especially those in which humans are affected by the final predictions.
Research on XAI has strong ethical motivations aimed at empowering users against
undesired, possibly illegal, effects of black-box automated decision-making
systems. Different types of explanations, and different explanation methods, permit
retrieving the logic of machines, which can be completely different from the logic
of humans, and help resolve unexpected bugs and issues.</p>
      <p>However, despite recent developments in XAI, some questions remain open.
Are the existing explanation methods useful for the realization of the right of
explanation declared in the GDPR? Can the current explanation methods
effectively be exploited by business companies for the industrial development of
explainable AI services and products? Are explanation methods able to reveal
forms of discrimination towards vulnerable social groups, and are they immune
from other algorithmic biases and artifacts in the data? Only when these
questions have a positive answer will the research on explanation methods
have reached a satisfactory level.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgment</title>
      <p>This work is partially supported by the European Community H2020 programme
under the funding schemes H2020-INFRAIA-2019-1: Res. Infr. G.A. 871042
SoBigData++ (sobigdata.eu) and G.A. 952026 Humane AI-Net (humane-ai.eu).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A.</given-names>
            <surname>Adadi</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Berrada</surname>
          </string-name>
          .
          <article-title>Peeking inside the black-box: A survey on explainable artificial intelligence (XAI)</article-title>
          .
          <source>IEEE Access</source>
          ,
          <volume>6</volume>
          :
          <fpage>52138</fpage>
          {
          <fpage>52160</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Y. A. A. S.</given-names>
            <surname>Aldeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salleh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Razzaque</surname>
          </string-name>
          .
          <article-title>A comprehensive review on privacy preserving data mining</article-title>
          .
          <source>SpringerPlus</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ):
          <fpage>694</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>E.</given-names>
            <surname>Angelino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Larus-Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Alabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Seltzer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          .
          <article-title>Learning certifiably optimal rule lists</article-title>
          .
          <source>In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , pages
          <volume>35</volume>
          {
          <fpage>44</fpage>
          . ACM,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A.</given-names>
            <surname>Apicella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Isgro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Prevete</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Tamburrini</surname>
          </string-name>
          .
          <article-title>Contrastive explanations to classification systems using sparse dictionaries</article-title>
          .
          <source>In International Conference on Image Analysis and Processing</source>
          , pages
          <volume>207</volume>
          {
          <fpage>218</fpage>
          . Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Arrieta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Díaz-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Del Ser</surname>
          </string-name>
          , et al.
          <article-title>Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI</article-title>
          .
          <source>Information Fusion</source>
          ,
          <volume>58</volume>
          :
          <fpage>82</fpage>
          {
          <fpage>115</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>S.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Binder</surname>
          </string-name>
          , et al.
          <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>
          .
          <source>PloS one</source>
          ,
          <volume>10</volume>
          (
          <issue>7</issue>
          ):e0130140,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>U.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Weller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          , et al.
          <article-title>Explainable machine learning in deployment</article-title>
          .
          <source>In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency</source>
          , pages
          <volume>648</volume>
          {
          <fpage>657</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>J.</given-names>
            <surname>Bien</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          .
          <article-title>Prototype selection for interpretable classification</article-title>
          .
          <source>The Annals of Applied Statistics</source>
          ,
          <volume>5</volume>
          (
          <issue>4</issue>
          ):
          <volume>2403</volume>
          {
          <fpage>2424</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>T.</given-names>
            <surname>Bolukbasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Saligrama</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Kalai</surname>
          </string-name>
          .
          <article-title>Man is to computer programmer as woman is to homemaker? debiasing word embeddings</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          , pages
          <volume>4349</volume>
          {
          <fpage>4357</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>R. M. Byrne</surname>
          </string-name>
          .
          <article-title>Counterfactuals in explainable artificial intelligence (XAI): evidence from human reasoning</article-title>
          .
          <source>In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19</source>
          , pages
          <fpage>6276</fpage>
          {
          <fpage>6282</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>C. Chen</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Barnett</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Su</surname>
            , and
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Rudin</surname>
          </string-name>
          .
          <article-title>This looks like that: deep learning for interpretable image recognition</article-title>
          .
          <source>arXiv:1806.10574</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>M.</given-names>
            <surname>Craven</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Shavlik</surname>
          </string-name>
          .
          <article-title>Extracting tree-structured representations of trained networks</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          , pages
          <volume>24</volume>
          {
          <fpage>30</fpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>F.</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Towards a rigorous science of interpretable machine learning</article-title>
          .
          <source>arXiv preprint arXiv:1702.08608</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Freitas</surname>
          </string-name>
          .
          <article-title>Comprehensible classification models: a position paper</article-title>
          .
          <source>ACM SIGKDD explorations newsletter</source>
          ,
          <volume>15</volume>
          (
          <issue>1</issue>
          ):1{
          <fpage>10</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>B.</given-names>
            <surname>Goodman</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Flaxman</surname>
          </string-name>
          .
          <article-title>EU regulations on algorithmic decision-making and a "right to explanation"</article-title>
          .
          <source>In ICML workshop on human interpretability in machine learning (WHI</source>
          <year>2016</year>
          ), New York, NY. http://arxiv.org/abs/1606.08813v1,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          , et al.
          <article-title>Black box explanation by learning image exemplars in the latent feature space</article-title>
          .
          <source>In Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          , pages
          <volume>189</volume>
          {
          <fpage>205</fpage>
          . Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          .
          <article-title>Factual and counterfactual explanations for black box decision making</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          .
          <article-title>A survey of methods for explaining black box models</article-title>
          .
          <source>ACM computing surveys (CSUR)</source>
          ,
          <volume>51</volume>
          (
          <issue>5</issue>
          ):1{
          <fpage>42</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          .
          <article-title>On the stability of interpretable models</article-title>
          .
          <source>In 2019 International Joint Conference on Neural Networks</source>
          , pages
          <fpage>1</fpage>
          {
          <fpage>8</fpage>
          . IEEE,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. O.</given-names>
            <surname>Koyejo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Khanna</surname>
          </string-name>
          .
          <article-title>Examples are not enough, learn to criticize! criticism for interpretability</article-title>
          .
          <source>In Advances In Neural Information Processing Systems</source>
          , pages
          <fpage>2280</fpage>
          {
          <fpage>2288</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21. H.
          <string-name>
            <surname>Lakkaraju</surname>
          </string-name>
          et al.
          <article-title>Interpretable decision sets: A joint framework for description and prediction</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , pages
          <volume>1675</volume>
          {
          <fpage>1684</fpage>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>O.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          .
          <article-title>Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions</article-title>
          .
          <source>In Thirty-Second AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <article-title>A unified approach to interpreting model predictions</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          , pages
          <volume>4765</volume>
          {
          <fpage>4774</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24. G. Malgieri and
          <string-name>
            <given-names>G.</given-names>
            <surname>Comande</surname>
          </string-name>
          .
          <article-title>Why a right to legibility of automated decisionmaking exists in the General Data Protection Regulation</article-title>
          .
          <source>International Data Privacy Law</source>
          ,
          <volume>7</volume>
          (
          <issue>4</issue>
          ):
          <volume>243</volume>
          {
          <fpage>265</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>D.</given-names>
            <surname>Martens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Baesens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Van</given-names>
            <surname>Gestel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanthienen</surname>
          </string-name>
          .
          <article-title>Comprehensible credit scoring models using rule extraction from support vector machines</article-title>
          .
          <source>European journal of operational research</source>
          ,
          <volume>183</volume>
          (
          <issue>3</issue>
          ):
          <volume>1466</volume>
          {
          <fpage>1476</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Explanation in artificial intelligence: Insights from the social sciences</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>267</volume>
          :1{
          <fpage>38</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>F.</given-names>
            <surname>Pasquale</surname>
          </string-name>
          .
          <article-title>The black box society</article-title>
          . Harvard University Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          .
          <article-title>Meaningful explanations of black box AI decision systems</article-title>
          .
          <source>In Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>33</volume>
          , pages
          <fpage>9780</fpage>
          {
          <fpage>9784</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          et al.
          <article-title>Why should I trust you?: Explaining the predictions of any classifier</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , pages
          <volume>1135</volume>
          {
          <fpage>1144</fpage>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>M. T. Ribeiro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Singh</surname>
            , and
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Guestrin</surname>
          </string-name>
          .
          <article-title>Anchors: High-precision model-agnostic explanations</article-title>
          .
          <source>In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI)</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <given-names>A.</given-names>
            <surname>Romei</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          .
          <article-title>A multidisciplinary survey on discrimination analysis</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          ,
          <volume>29</volume>
          (
          <issue>5</issue>
          ):
          <volume>582</volume>
          {
          <fpage>638</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          .
          <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>
          .
          <source>NMI</source>
          ,
          <volume>1</volume>
          (
          <issue>5</issue>
          ):
          <volume>206</volume>
          {
          <fpage>215</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Radin</surname>
          </string-name>
          .
          <article-title>Why are we using black box models in AI when we don't need to? A lesson from an explainable AI competition</article-title>
          .
          <source>HHDSR</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ),
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34. A.
          <string-name>
            <surname>Shrikumar</surname>
          </string-name>
          et al.
          <article-title>Not just a black box: Learning important features through propagating activation differences</article-title>
          .
          <source>arXiv:1605.01713</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <given-names>K.</given-names>
            <surname>Simonyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vedaldi</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zisserman</surname>
          </string-name>
          .
          <article-title>Deep inside convolutional networks: Visualising image classification models and saliency maps</article-title>
          .
          <source>arXiv preprint arXiv:1312.6034</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>P.-N. Tan</surname>
          </string-name>
          et al.
          <article-title>Introduction to data mining</article-title>
          .
          <source>Pearson Education India</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <given-names>S.</given-names>
            <surname>Wachter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mittelstadt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Floridi</surname>
          </string-name>
          .
          <article-title>Why a right to explanation of automated decision-making does not exist in the general data protection regulation</article-title>
          .
          <source>International Data Privacy Law</source>
          ,
          <volume>7</volume>
          (
          <issue>2</issue>
          ):
          <volume>76</volume>
          {
          <fpage>99</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <given-names>S.</given-names>
            <surname>Wachter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mittelstadt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Russell</surname>
          </string-name>
          .
          <article-title>Counterfactual explanations without opening the black box: Automated decisions and the gdpr</article-title>
          .
          <source>HJLT</source>
          ,
          <volume>31</volume>
          :
          <fpage>841</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Kosinski</surname>
          </string-name>
          .
          <article-title>Deep neural networks are more accurate than humans at detecting sexual orientation from facial images</article-title>
          .
          <source>JPSP</source>
          ,
          <volume>114</volume>
          (
          <issue>2</issue>
          ):
          <fpage>246</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>Explainable recommendation: A survey and new perspectives</article-title>
          . arXiv preprint arXiv:1804.11192,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>