<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Systematic Review on Model-agnostic XAI Libraries</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jesus M. Darias</string-name>
          <email>jdarias@ucm.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Belén Díaz-Agudo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan A. Recio-Garcia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Software Engineering and Artificial Intelligence, Instituto de Tecnologías del Conocimiento, Universidad Complutense de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>During the last few years, the topic of explainable artificial intelligence (XAI) has become a hotspot in the ML research community. Model-agnostic interpretation methods propose separating the explanations from the ML model, making these explanation methods reusable through XAI libraries. In this paper, we review selected XAI libraries and provide examples of different model-agnostic explanations. The context of this research is the iSee project, which will show how users of Artificial Intelligence (AI) can capture, share, and re-use their experiences of AI explanations with other users who have similar explanation needs.</p>
      </abstract>
      <kwd-group>
        <kwd>XAI</kwd>
        <kwd>libraries</kwd>
        <kwd>model-agnostic methods</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Interpretability and trust have become a requirement for black-box AI
models applied to real-world tasks like diagnosis or decision-making processes. At
a high level, the literature distinguishes between two main approaches to
interpretability: model-specific (also called transparent or white box) models and
model-agnostic (post-hoc) surrogate models to explain black-box models [
        <xref ref-type="bibr" rid="ref10 ref15 ref9">15,
9, 10</xref>
        ]. Transparent models are ones that are inherently interpretable by users.
Consequently, the easiest way to achieve interpretability is to use algorithms
that create interpretable models, such as decision trees, simple nearest-neighbour
models, or linear regression. However, the best-performing models are often not
interpretable, or they are partially interpretable [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. It is a permanent
challenge to ensure the high accuracy of a model while maintaining a sufficient level
of comprehensibility. Model-agnostic interpretation methods propose separating
the explanations from the ML model.
      </p>
      <p>[Table 1 fragment: availability of the explainers LIME, Anchors, SHAP, PDP, ALE, Counterfactuals, and CEM across the libraries Interpret, Alibi, Aix360, Dalex, and Dice.]</p>
      <p>The main advantage of this separation is flexibility, although
some authors consider this type of post-hoc explanation a limited justification,
because it is not linked to the real reasoning process occurring inside the black
box. The context of the research conducted in this paper is the iSee project,
which aims to provide a unifying platform where personalized explanations are
created by reasoning with Explanation Experiences using Case-Based Reasoning
(CBR). This is a very challenging, long-term goal, as we want to capture
complete, user-centered explanation experiences on complex explanation strategies.
Our proposal relies on an ontology to support the knowledge-intensive
representation of previous experiences, the different types of users and their
explanation needs, the characterization of the data, the black-box model, and the
contextual properties of the application domain and task. We aim to recommend
which explanation strategy best suits a given explanation situation. One of the
first tasks in the iSee project is to characterize the existing XAI libraries. The
explainers of these libraries will be the building blocks of our library of reusable
explanation strategies, which will be described using the unified terminology
defined by the ontology.</p>
      <p>In this position paper, we review some existing XAI libraries:
Interpret, Alibi, Aix360, Dalex, and Dice. We have compared different options to
explain the same black-box prediction model, with the same training data and the
most relevant explanation methods, namely: Local Interpretable Model-Agnostic
Explanations (LIME), Anchors, Shapley Additive Explanations (SHAP), Partial
Dependence Plots (PDPs), Accumulated Local Effects (ALE), and counterfactual
explanations. Section 2 describes the methodology used to compare the libraries and
their explainers (see Table 1) and defines the variables used to perform a
quantitative analysis of the libraries in Section 3. The XAI methods are analysed
through a qualitative evaluation described in Section 4. Finally, Section 5
concludes the paper by discussing and comparing the libraries.</p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>We propose different variables that allow us to compare the XAI libraries. The
resulting quantitative analysis of the libraries is presented in Section 3, whereas
a qualitative evaluation focusing on the XAI methods is included in Section 4.
Documentation and usability. Is the documentation well-structured and
self-explanatory? Good documentation should be complemented with usage
examples, which make a library easy to use.</p>
      <p>Interpretability metrics. Refers to the availability of metrics such as
accuracy, recall, ROC/AUC values, mean squared error, etc. These metrics allow
users to evaluate the performance of a model.</p>
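As an illustration, these performance metrics can be computed with scikit-learn on a toy set of predictions; the labels and scores below are made up for the example.

```python
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1]                 # ground-truth labels
y_pred  = [0, 0, 1, 0, 0, 1]                 # hard predictions
y_score = [0.1, 0.2, 0.9, 0.4, 0.3, 0.8]     # predicted probabilities for class 1

acc = accuracy_score(y_true, y_pred)         # fraction of correct predictions
rec = recall_score(y_true, y_pred)           # sensitivity on the positive class
auc = roc_auc_score(y_true, y_score)         # ranking quality of the scores
```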
      <p>
        Available explainers such as LIME[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], SHAP [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], Counterfactuals [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ],
Anchors[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], PDPs[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], ALE plots [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], CEMs [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and others.
      </p>
      <p>Analysis and description capabilities of the training data: refers to the
availability of tools that allow a better interpretation of data itself such as
marginal and scatter plots, data imbalances, etc.</p>
      <p>Interactivity, meaning the user can dive deeper into the outputted
explanation by examining certain features or other aspects more
thoroughly.</p>
      <p>Personalization. Refers to the capability of providing different explanations
according to the user’s requirements.</p>
      <p>Dependencies. Development language/environment and requirements (if any).</p>
      <p>Use of other methods from libraries such as TensorFlow, SKLearn, and
others. We also take into consideration the use of wrapper classes and methods
from the original authors' implementations of certain explainers.</p>
      <p>
The use case consists of explaining the prediction of cervical cancer given by two
different models: a random forest (RF) classifier and a multi-layer perceptron
(MLP), both with a scikit-learn back-end. The dataset used to train both models
was extracted from the UCI Machine Learning repository [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It contains 858
instances. Table 2 summarizes its statistical descriptors. Note that the data set
is quite unbalanced, as only 6% of the individuals had cervical cancer.
      </p>
      <p>The RF model was built with 100 estimators and was configured so it would
adjust the weights inversely proportional to class frequencies. In this way, it is
possible to mitigate data imbalances moderately. However, this approach cannot
be done when building an MLP, which affected the performance of the model
considerably. Our MLP was built with two hidden layers, 100 neurons for the
first and 50 neurons for the second. The selected optimization algorithm was
Adam.</p>
      <p>[Table 3 fragment — Interpret: documentation and usability, very good; metrics, ROC/AUC; explainers, 3; analysis, yes; interactivity, yes; personalization, no; dependencies, Python 3.6+. Dice: documentation and usability, good; personalization, no.]</p>
      <p>The random forest had an accuracy of 88.8%, a precision of 10%, and a
recall of 6.2%. On the other hand, the MLP model had an accuracy of 87.4%, a
precision of 13.3%, and a recall of 12.5%. Both models have a
considerable rate of false negatives, which must be taken into account
given the sensitive nature of this particular problem.</p>
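A minimal sketch of the two models described above, assuming a scikit-learn backend; the features here are random placeholders standing in for the actual UCI dataset, so the scores will not match the paper's figures.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((858, 8))                      # placeholder features
y = (rng.random(858) < 0.06).astype(int)      # ~6% positives, mimicking the imbalance

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# RF with 100 estimators; class_weight="balanced" adjusts weights
# inversely proportional to class frequencies, as described above.
rf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                            random_state=0).fit(X_tr, y_tr)

# MLP with two hidden layers (100 and 50 neurons) and the Adam optimizer;
# MLPClassifier has no class-weighting option, which hurts performance on
# imbalanced data, as observed in the paper.
mlp = MLPClassifier(hidden_layer_sizes=(100, 50), solver="adam",
                    max_iter=500, random_state=0).fit(X_tr, y_tr)
```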
    </sec>
    <sec id="sec-3">
      <title>Quantitative analysis of the XAI Libraries</title>
      <p>This section describes the XAI libraries according to the features described in
the previous section (see Table 3).</p>
      <p>InterpretML is one of the most popular XAI libraries. It offers state-of-the-art
explanations for black-box models both locally and globally. It implements a
dashboard that makes the communication process between the end-users and
the program more interactive, allowing them to have a better understanding
of the explanation. Table 4 contains its analysis.</p>
      <p>Dice, whose name comes from Diverse Counterfactual Explanations, uniquely
focuses on counterfactual generation. Three different approaches can be taken
when using Dice to find counterfactuals: random sampling, k-d
trees, or genetic algorithms. Its simplicity of use makes Dice a great
candidate when the only explanation needed is a set of diverse counterfactuals. Table 5
contains its analysis.</p>
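Dice's simplest strategy, random sampling, can be illustrated without the library itself: draw perturbed candidates around the query instance and keep those whose predicted class flips. This is a toy sketch of the idea, not the dice_ml API, and the model and data are made up.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # toy black box's training signal
clf = LogisticRegression().fit(X, y)

query = np.array([[-1.0, -1.0, 0.0]])         # instance predicted as class 0

# Randomly sample candidates around the query, keep the class-flipping ones,
# and prefer those closest to the query (small, interpretable changes).
candidates = query + rng.normal(scale=1.5, size=(500, 3))
flipped = candidates[clf.predict(candidates) == 1]
order = np.argsort(np.linalg.norm(flipped - query, axis=1))
counterfactuals = flipped[order[:3]]          # three nearest counterfactuals
```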
      <p>[Table 4 fragment, InterpretML — documentation and usability: the documentation is well-structured and explanatory; usage examples are provided in a simple fashion, so the user is able to begin using the library very quickly; the library is very intuitive, and using it should not raise any issues for less-experienced users. Metrics: ROC/AUC values.]</p>
      <p>ALIBI provides local and global explanation methods for classification and
regression problems, for both white-box and black-box models. It is a broad library
with many different explainers. One of the strengths of this library is that
some explainers, such as CEM and counterfactuals, are compatible with Tensorflow
models, thus increasing its versatility. Table 6 contains its analysis.</p>
      <p>
        Aix360 is a multipurpose library that provides some of the most up-to-date
explainers available. Besides implementing the widely accepted LIME and
SHAP methods, algorithms like Protodash [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and CEM with Monotonic
Attribute Functions are among the latest local explainers it offers. Aix360
also provides global explainers, such as Generalized Linear Rule Models, and
model performance metrics. Table 7 contains its analysis.
      </p>
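The idea behind prototype-based data summarization, of which Protodash is a refined example, can be sketched with a simple greedy kernel-based selection. This is an illustration only: the actual Protodash algorithm also learns non-negative prototype weights, which this sketch omits.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    # RBF kernel matrix between two sets of points.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def greedy_prototypes(X, k):
    mean_sim = rbf(X, X).mean(axis=1)       # how well each point covers the data
    chosen = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(len(X)):
            if i in chosen:
                continue
            cand = chosen + [i]
            # Reward data coverage, penalize redundancy among prototypes.
            score = mean_sim[cand].sum() - rbf(X[cand], X[cand]).mean()
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen

# Two well-separated clusters: the two prototypes should come from different ones.
X = np.vstack([np.random.default_rng(6).normal(0, 0.3, (30, 2)),
               np.random.default_rng(7).normal(3, 0.3, (30, 2))])
protos = greedy_prototypes(X, 2)
```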
      <p>Dalex is a multipurpose library that focuses on model-agnostic explanations for
black-box models. The core methodology behind it is to create a wrapper
around the given model that can later be explained through a variety of local
and global explainers. This library implements well-known explainers such
as LIME, SHAP, and ALE, and also allows measuring the fairness of the
model. It provides plenty of different performance metrics according to the
given model. Dalex is complemented by the Arena visual dashboard
(https://arena.drwhy.ai/docs/), which allows interactive exploration and
personalization of the explanation. Table 8 contains the Dalex quantitative
analysis.</p>
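The wrapper methodology described above can be sketched in plain Python: wrap a fitted model, then compute model-agnostic permutation feature importance through the wrapper. This is an illustration, not the dalex API; the class and method names are borrowed for flavor only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

class Explainer:
    def __init__(self, model, X, y):
        self.model, self.X, self.y = model, X, y

    def model_parts(self, n_repeats=5, seed=0):
        """Permutation importance: accuracy drop when a feature is shuffled."""
        rng = np.random.default_rng(seed)
        base = accuracy_score(self.y, self.model.predict(self.X))
        drops = []
        for j in range(self.X.shape[1]):
            scores = []
            for _ in range(n_repeats):
                Xp = self.X.copy()
                rng.shuffle(Xp[:, j])        # break this feature's signal
                scores.append(accuracy_score(self.y, self.model.predict(Xp)))
            drops.append(base - np.mean(scores))
        return drops

rng = np.random.default_rng(8)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 0).astype(int)                # only feature 0 matters
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
drops = Explainer(clf, X, y).model_parts()
```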
      <p>
        [Table 6 fragment, Alibi — documentation and usability: the documentation is very extensive and educational; not only does it explain how to use the methods, it also gives a mathematical background for each explainer; however, the examples provided for some explainers only cover the explanation of models with a Tensorflow backend, which may cause difficulties for users who are not experienced in this environment. Metrics: linearity measure and trust scores. Explainers: 5. Analysis: not available. Interactivity: this library is not interactive; the process is finished once the explanation is outputted; in fact, the explanations are given in low-level raw data that the user may need to convert to a more interpretable format. Personalization: not available. Dependencies: Python 3.6+; this library is heavily based on Tensorflow, and for the SHAP explainer it uses the original implementation of the author [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].]
      </p>
    </sec>
    <sec id="sec-4">
      <title>Qualitative evaluation of the XAI methods</title>
      <p>This section presents a descriptive evaluation of the XAI methods provided by
the libraries, focusing on the visualization of the explanations. In order to grasp
a general idea of the inner mechanics of the models, using SHAP as a global
explanation method is typically a good first approach although it has a high
computational cost. The results obtained for our use case are shown in Figure 1.</p>
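The idea behind such SHAP summaries can be sketched with a small Monte Carlo estimator of Shapley values for a single prediction. This is an illustration of the principle, not the shap library; the linear model and data are made up so the estimate can be checked analytically.

```python
import numpy as np

def shapley_values(predict, x, background, n_samples=2000, seed=0):
    """Monte Carlo Shapley estimate: average marginal contribution of each
    feature over random permutations and random background instances."""
    rng = np.random.default_rng(seed)
    d, phi = len(x), np.zeros(len(x))
    for _ in range(n_samples):
        z = background[rng.integers(len(background))].copy()
        prev = predict(z)
        for j in rng.permutation(d):
            z[j] = x[j]                      # switch feature j to the instance value
            cur = predict(z)
            phi[j] += cur - prev
            prev = cur
    return phi / n_samples

# For a linear model, the Shapley value of feature j is w_j * (x_j - E[x_j]).
w = np.array([2.0, -1.0, 0.5])
predict = lambda z: float(w @ z)
background = np.random.default_rng(1).normal(size=(100, 3))
x = np.array([1.0, 1.0, 1.0])
phi = shapley_values(predict, x, background)
```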
      <p>The features that impact the prediction the most on average for the random
forest model are the number of years using hormonal contraceptives, the age of
the individual, and the age at first sexual intercourse. The years of smoking
barely contribute to the predictions of the model on average. On the other hand,
the SHAP summary plot for the MLP model, which may be somewhat harder
to understand, still attributes the major contribution to the hormonal contraceptive
feature, followed by the number of pregnancies and the years of smoking.
Something interesting about this plot is that the years of smoking contribute
both negatively and positively in different situations when the instance values
are high, which might indicate that the model is not properly calibrated.</p>
      <p>
        [Table 7 fragment, Aix360 — documentation and usability: the documentation is clear and extensive; it provides many usage examples with different data sets that make the library easy to use, and the Aix360 website offers interactive tutorials as complementary guidance. Metrics: faithfulness and monotonicity; faithfulness refers to the correlation between the feature importance assigned by the interpretability algorithm and the effect of features on model accuracy, while monotonicity tests whether model accuracy increases as features are added in order of their importance. Analysis: yes; in particular, the Protodash algorithm is able to find prototypes that help summarize the data set. Interactivity: this library does not provide interactivity; explanations are outputted to the users as graphics or plain data, and there is no further interaction between the user and the program. Personalization: not available; however, the importance of personalizing explanations is mentioned on the official website throughout the interactive demo, which outlines that different users look for different kinds of explanations (as is the purpose in the iSee project). Dependencies: Python 3.6+; the implementation of the original authors is used for the LIME and SHAP explainers [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].]
      </p>
      <p>The partial dependence plots are also useful when globally examining
the behavior of a single feature. In Figure 2, the respective plots of the random
forest and MLP models are shown for the feature of years using hormonal
contraceptives. Although the average impact on the random forest model is higher,
the interpretation is the same for both plots; the more years using hormonal
contraceptives, the greater the average response on the prediction is. However,
this last statement is only true when variables are not correlated. Furthermore,
the density indicates that most instances are concentrated in the range between 0 and 1.88
years, which makes the resulting graphs less reliable as the value of this feature
increases.</p>
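The PDP computation described above reduces to sweeping one feature through a grid and averaging the model's predictions; a minimal sketch on made-up data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.random((300, 4))
y = (X[:, 0] > 0.5).astype(int)              # toy target driven by feature 0
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

feature, grid = 0, np.linspace(0, 1, 11)
pdp = []
for v in grid:
    Xv = X.copy()
    Xv[:, feature] = v                       # force the feature to the grid value
    pdp.append(model.predict_proba(Xv)[:, 1].mean())
# pdp[i] is the average predicted probability when feature 0 is fixed to grid[i]
```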
      <p>An unbiased alternative method that does consider correlations is ALE.
Although ALE plots are an excellent way to cope with the shortcomings of
PDPs regarding correlation, the reliability related to the density of instances is
still the same.</p>
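A compact sketch of first-order ALE for one feature: accumulate the average local prediction differences within quantile bins (uncentered, for illustration; the toy linear model makes the result easy to verify).

```python
import numpy as np

def ale_1d(predict, X, feature, n_bins=10):
    x = X[:, feature]
    # Quantile bin edges, so each bin holds a similar number of instances.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    effects = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        if not mask.any():
            effects.append(0.0)
            continue
        X_lo, X_hi = X[mask].copy(), X[mask].copy()
        X_lo[:, feature], X_hi[:, feature] = lo, hi
        # Local effect: average prediction change across this bin only.
        effects.append(float(np.mean(predict(X_hi) - predict(X_lo))))
    return edges, np.cumsum(effects)         # accumulated local effects

# Sanity model: linear in feature 0, so the accumulated effect is linear too.
w = np.array([3.0, 0.0])
X = np.random.default_rng(3).random((500, 2))
edges, ale = ale_1d(lambda Z: Z @ w, X, feature=0)
```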
      <p>[Table 8 fragment, Dalex — documentation and usability: the documentation is good, and plenty of examples are provided, along with complementary resources such as tutorials; however, it may be hard to find the exact usage illustration for a specific explainer in a notebook, since the notebooks are organized by data sets. Metrics: many different metrics are provided depending on the nature of the problem; for classification, F1 score, accuracy, recall, precision, specificity, and ROC/AUC; for regression, mean squared error, R squared, and median absolute deviation. Explainers: 4. Interactivity: not included, although the Dalex Arena allows the user to easily compare different explanations for the same problem, and even different models.]</p>
      <p>When the aim is to explain individual predictions, LIME is one of the most
widely used methods. It perturbs the dataset to get predictions for new, proximate
samples, which allow adjusting the weighting and training of an interpretable, linear
model. This interpretable model provides a local explanation, because its training
is based on the proximity of the generated data points to the original instance.
In Figure 3, a specific instance A is explained using LIME on the random forest
model. The attributes of instance A, which obtains a positive prediction, are
presented in Table 9. The plot shows that the features that affect the prediction the
most around the given instance are hormonal contraceptives and the STD-related
ones. Other features, such as years of smoking and the number of pregnancies,
have considerably less impact, even though they have high values in
comparison with the average. This interpretation may properly represent the behavior
of the model locally around the given instance. However, it does not necessarily
represent the global behavior of the model.</p>
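The LIME procedure described above can be sketched as follows: perturb around the instance, weight the samples by proximity, and fit a local linear surrogate. This is an illustration of the idea on made-up data, not the lime library.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)
black_box = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

x = np.array([0.5, -0.5, 0.0])                        # instance to explain
Z = x + rng.normal(scale=0.5, size=(500, 3))          # proximate perturbations
p = black_box.predict_proba(Z)[:, 1]                  # black-box outputs
weights = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2) # proximity kernel

surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=weights)
# surrogate.coef_ gives the local feature importances around x
```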
      <p>Anchors provide conditions that are locally sufficient to determine a
prediction with a certain degree of confidence. Let us look at the instance in Table 9.
This instance was predicted as negative for cancer. Using anchors, we obtain the
following conditional rule:</p>
      <p>Anchor: Age &lt;= 31.00 AND STDs..number. &lt;= 0.00
Precision: 0.97
Coverage: 0.69</p>
      <p>The anchor given is that when the age is less than or equal to 31 and the individual
has not had any STDs, the model classifies the individual as healthy with a
precision of 97%; the coverage, representing the extent of the area of the
perturbation space to which this rule applies, is rather high at 69%. The
simplicity of anchors makes them excellent for obtaining local explanations that are
easy to interpret. However, the given rule may be too complicated, or have low
precision and coverage, in certain cases.</p>
      <p>[Table 5 header fragment: Counterfactual — Sexual partners, Pregnancies, Smokes (years), Contraceptives (years).]</p>
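The precision and coverage of such an anchor can be estimated by sampling the perturbation space; a toy sketch with an illustrative rule and a stand-in decision function (not the paper's actual model):

```python
import numpy as np

rng = np.random.default_rng(5)
age = rng.integers(16, 60, size=1000)          # perturbation-space samples
stds = rng.integers(0, 3, size=1000)           # number of STDs: 0, 1 or 2

# Stand-in black-box decision (illustrative only).
model_says_healthy = (age + 10 * stds) < 55

# The anchor's conditions: Age <= 31 AND STDs (number) <= 0.
rule = (age <= 31) & (stds <= 0)

coverage = rule.mean()                         # how often the rule applies
precision = model_says_healthy[rule].mean()    # how often it implies "healthy"
```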
      <p>If the focus is to provide contrastive, concise, and easy-to-interpret
individual explanations, counterfactuals are one of the best choices. Using again the
individual from Table 9, we restrict the features allowed to vary to the years of
smoking, the number of pregnancies, the years using hormonal contraceptives,
and the number of sexual partners. The counterfactuals generated are shown in
Table 5; only the indicated features are included, and the rest remain the same.
All the counterfactuals generated show a considerable decrease in the
years-of-smoking value, but there are interesting combinations of features. For example,
if the individual had had only 1 pregnancy instead of 5 and smoked for 24.7
years instead of 37, the classification would have changed.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>
        This review of the XAI libraries gave us a better understanding of
some of the most popular and up-to-date explainers that machine learning and
data scientists use to explain black-box models. Although all the libraries
reviewed have their pros and cons, some of them proved to be highly versatile
and interactive, making the process of obtaining good explanations
considerably easier. To conclude this paper, we provide some subjective opinions on each
library regarding its usability, variety, interactivity, and other characteristics.
      </p>
      <p>
        If we had to rank the libraries, InterpretML would probably get first place.
Even though it is not the most extensive library, its usability and neat interfaces
make it the number-one choice for explainability. Interpret is very easy to use,
as most of its explainers barely require a single function call specific to the
explainer used. The explanations generated are shown in the dashboard, an
interactive interface that allows switching the visualization depending on
the attribute that is emphasized, and even shows different explanations for the
same model. This makes Interpret a very versatile tool when we need to
obtain various explanations and compare them, so that we can choose the one
that best fits our needs. Additionally, its documentation is well structured and
complemented with several examples. It is a library that a person with little
experience in machine learning would be able to use properly in a short time.
However, this library only provides LIME and SHAP as local explainers, plus
partial dependence plots, which do not provide the same reliability as ALE
plots. If Interpret widened its explainer repertoire, it would undoubtedly be the
best option for machine learning explainability.
      </p>
      <p>
        Curiously, the Interpret developers
have also developed Dice, a separate library that uniquely focuses on
counterfactual generation. Although Dice is considerably different from the rest
of the libraries reviewed in this paper, it proved to be a solid option for obtaining
counterfactual explanations. In fact, its algorithm configuration is much more
straightforward and intuitive than the one in Alibi. The library also outputs the
counterfactuals in an easy-to-understand fashion by using dataframes.
Generally speaking, it is easy to use, and the examples provided in the documentation
are illustrative and completely model-agnostic, in contrast to Alibi. In most
aspects, Dice is considerably better than the approach offered in Alibi, as it allows
generating counterfactuals easily and outputting them in an interpretable way.
      </p>
      <p>
        Unfortunately, a simple and interactive visualization of explanations is not
available in most of the XAI libraries. Moreover, for counterfactual generation and
CEM from Alibi or Aix360, the explanations are given in a low-level format that
is hard to read and comprehend. Consequently, the programmer must process
this data in order to convert it to a more readable format. This is one of the main
issues of both libraries, since their explainers do not provide a high-level abstraction of
the output so that end-users can easily understand the explanations. Although Alibi
is the most extensive library of all those reviewed, the way the explanations
are outputted is somewhat of a letdown. Furthermore, many of the usage
examples given are heavily oriented to Tensorflow models, which is a disadvantage
when the model to be explained has a different backend. Although
the documentation is very specific and illustrative of the concepts behind each of
the available explainers, using this library may prove difficult for users without a
deep background in machine learning and interpretability. On the plus
side, Alibi has a wide variety of explainers and is the only reviewed library that
offers explanations through anchors. However, it does not include LIME. On the
other hand, Aix360 is not as complete as Alibi regarding basic explainers, but
it includes many other innovative model and data explanation methods, such as
Protodash and Profweight, that may be worth diving deeper into. There are also
other global explanation methods, such as Boolean Decision Rules via Column
Generation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and Generalized Linear Rule Models [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], that are not available
in any library other than Aix360. Moreover, its documentation is well developed,
and there are many tutorials available on the official website. However, its
implementation of basic explainers such as LIME, SHAP, and CEM does not offer
any advantages over other libraries that also implement them.
      </p>
      <p>
        Lastly, we have
Dalex, which is not so different from the previously described libraries. One of the
few reasons to use it over Interpret is that it provides ALE plots, while only
PDPs are available in Interpret. It does not have contrastive methods such as
counterfactuals and CEM, but it does provide tools for data analysis and feature
importance methods. The documentation is appropriately organized, but some
of the methods are outdated, specifically the ones for SHAP plotting. In
conclusion, choosing one of these libraries over the others depends on the
specific needs and preferences of the person who will be using them, since there is
considerable overlap between them.
      </p>
      <p>We conclude that one of the greatest shortcomings of the XAI libraries currently
available is the lack of interactivity and personalization of the explanations.
Only InterpretML allows simple interaction between the user who receives the
explanations and the program, and none of the libraries reviewed provide any
form of personalization.</p>
      <p>The idea behind the iSee project is to provide personalized explanations that
suit the needs of the person receiving them, by analyzing user interactions using
a case-based reasoning system. In this way, it will be possible to merge the
already existing explainability methods with a user-oriented approach that aims
to improve the machine learning interpretability field.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Apley</surname>
            ,
            <given-names>D.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Visualizing the effects of predictor variables in black box supervised learning models (</article-title>
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dash</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Günlük</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Boolean decision rules via column generation (</article-title>
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dhurandhar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>P.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tu</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ting</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shanmugam</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Explanations based on the missing: Towards contrastive explanations with pertinent negatives (</article-title>
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dua</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graff</surname>
            ,
            <given-names>C.:</given-names>
          </string-name>
          <article-title>UCI machine learning repository (</article-title>
          <year>2017</year>
          ), http://archive.ics.uci.edu/ml
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          :
          <article-title>Greedy function approximation: A gradient boosting machine</article-title>
          .
          <source>The Annals of Statistics</source>
          <volume>29</volume>
          (
          <issue>5</issue>
          ),
          <fpage>1189</fpage>
          -
          <lpage>1232</lpage>
          (
          <year>2001</year>
          ). https://doi.org/10.1214/aos/1013203451
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gurumoorthy</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dhurandhar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cecchi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Protodash: Fast interpretable prototype selection</article-title>
          .
          <source>ArXiv abs/1707</source>
          .01212 (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lipton</surname>
            ,
            <given-names>Z.C.</given-names>
          </string-name>
          :
          <article-title>The mythos of model interpretability</article-title>
          .
          <source>Commun. ACM</source>
          <volume>61</volume>
          (
          <issue>10</issue>
          ),
          <fpage>36</fpage>
          -
          <lpage>43</lpage>
          (
          <year>2018</year>
          ). https://doi.org/10.1145/3233231
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lundberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>S.I.</given-names>
          </string-name>
          :
          <article-title>A unified approach to interpreting model predictions</article-title>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Explanation in artificial intelligence: Insights from the social sciences</article-title>
          .
          <source>CoRR abs/1706.07269</source>
          (
          <year>2017</year>
          ), http://arxiv.org/abs/1706.07269
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Molnar</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Interpretable Machine Learning</article-title>
          (
          <year>2019</year>
          ), https://christophm.github.io/interpretable-ml-book/
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>“Why should I trust you?”: Explaining the predictions of any classifier</article-title>
          .
          In:
          <source>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          . pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          . Association for Computing Machinery, New York, NY, USA (
          <year>2016</year>
          ). https://doi.org/10.1145/2939672.2939778
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Anchors: High-precision model-agnostic explanations</article-title>
          . In:
          <string-name>
            <surname>McIlraith</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weinberger</surname>
            ,
            <given-names>K.Q.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, AAAI-18</source>
          . pp.
          <fpage>1527</fpage>
          -
          <lpage>1535</lpage>
          . AAAI Press (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Verma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dickerson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hines</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Counterfactual explanations for machine learning: A review</article-title>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dash</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Günlük</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Generalized linear rule models</article-title>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Weld</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bansal</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The challenge of crafting intelligible intelligence</article-title>
          .
          <source>Commun. ACM</source>
          <volume>62</volume>
          (
          <issue>6</issue>
          ),
          <fpage>70</fpage>
          -
          <lpage>79</lpage>
          (
          <year>2019</year>
          ). https://doi.org/10.1145/3282486
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>