<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Clustering-based Approach for Interpreting Black-box Models (Discussion Paper)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Luca Ferragina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simona Nisticò</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIMES, University of Calabria</institution>
          ,
          <addr-line>87036 Rende (CS)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Classification and regression tasks involving image data are often connected to critical domains or operations. In this context, Machine and Deep Learning techniques have achieved astonishing performance. Unfortunately, the models resulting from such techniques are so complex that they are seen as black boxes, even when we have full access to the model's information. This limits experts who leverage these tools to make decisions and lowers the trust of users who are subject to their outcomes. Some methods have been proposed to explain a black box, both for generic data domains and specifically for images. Nevertheless, the most used explanation tools for image data have some limitations, as they consider pixel-level explanations (SHAP), involve an image segmentation phase (LIME) or apply only to specific neural architectures (Grad-CAM). In this work, we introduce CLAIM, a model-agnostic explanation approach that interprets black boxes by leveraging clustering to produce interpretation-dependent higher-level features. Additionally, we perform a preliminary analysis aimed at probing the potential of the proposed approach.</p>
      </abstract>
      <kwd-group>
        <kwd>eXplainable AI</kwd>
        <kwd>Post-hoc explanations</kwd>
        <kwd>Local Explanations</kwd>
        <kwd>Model-agnostic Explanations</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The pervasive use of Machine and Deep Learning models in everyday life processes has raised
the problem of models’ trustworthiness. The root of this problem lies in the level of complexity
characterizing them. Indeed, such complexity makes it difficult to understand the logic followed
by the model to perform its prediction, and this is true not only for final users but also for
Machine and Deep Learning experts, who have difficulties inspecting and debugging their own
models. When predictive models take part in decisions that affect users’ lives, the above-stated
problem also acquires a legal dimension. The GDPR enshrines the right to explanation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
which requires that users be provided with an intelligible explanation for the model outcome.
      </p>
      <p>
        All the above-stated issues have led to the birth of the eXplainable Artificial Intelligence
(XAI) field, which collects the research efforts aimed at providing instruments for a more aware use
of artificial intelligence solutions. Many types of approaches have been developed to face the
model explainability problem [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. From the taxonomy described in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], it emerges that some
works focus on building models that are interpretable by design; the most challenging
aspect of this line of work is providing explainability without degrading model performance too much.
      </p>
      <p>Others, known as post-hoc explanation methods, aim to explain already designed and trained
models. This class of methods differs from the former because it admits various levels of
information availability, ranging from access only to the model output to complete
access to all its information. In this work, we focus on the latter class of explanation methods.
Our contributions can be summarized as follows:
• we analyze the main algorithms for explaining models working on image data;
• we introduce CLAIM, CLustering-based Approach for Interpreting black-box Models;
• we provide preliminary experimental results.</p>
      <p>The structure of the paper is the following. In Section 2 we describe the background and
discuss related works. In Section 3 we introduce the CLAIM algorithm and provide an example
to illustrate its working principles. In Section 4 the results of the experiments are reported.
Finally, Section 5 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries and Related Works</title>
      <p>Regarding post-hoc explainability, various settings arise from different levels of information
availability. One of them considers model-agnostic explanations, in which only the information
provided by the model output is exploited to understand its behaviour.</p>
      <p>
        To explain the prediction for a certain data instance, some post-hoc methodologies perturb
the input, collect information about how the outcome changes, and then exploit it to estimate
the level of importance of each feature. Among them, SHAP [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is a game-theory-inspired
method that attempts to enhance interpretability by computing the importance values for each
feature. While SHAP applies to different data types, the RISE [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] method focuses on image data.
To explain a prediction, it generates an importance map indicating how salient each pixel is
for the model’s prediction by probing the model with randomly masked versions of the input
image to observe how the output changes.
      </p>
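      <p>The random-masking idea described above can be illustrated with a minimal sketch. It is a simplification and not the RISE implementation: the masks here are independent per-pixel coin flips rather than the smooth upsampled masks of the original method, and the names masking_importance, toy_model, n_masks and keep_prob are hypothetical choices made only for this illustration.</p>
      <preformat>
```python
import numpy as np

def masking_importance(f, image, n_masks=2000, keep_prob=0.5, seed=0):
    """Monte Carlo importance map built from randomly masked copies of `image`."""
    rng = np.random.default_rng(seed)
    saliency = np.zeros(image.shape, dtype=float)
    for _ in range(n_masks):
        # Binary mask: 1 keeps a pixel, 0 blanks it out (assumed masking scheme).
        mask = rng.binomial(1, keep_prob, size=image.shape).astype(float)
        # Each mask is weighted by the model score on the masked image.
        saliency += f(image * mask) * mask
    # Normalise by the expected number of times each pixel was kept.
    return saliency / (n_masks * keep_prob)

# Hypothetical black box: mean intensity of the top-left 2x2 corner.
def toy_model(img):
    return float(img[:2, :2].mean())

image = np.ones((6, 6))
heat = masking_importance(toy_model, image)
```
      </preformat>
      <p>In this toy setting the pixels of the top-left corner, the only ones the model reads, should receive a higher average score than the rest of the image; the model is queried only through its output, which is what makes the procedure model-agnostic.</p>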
      <p>
        Explanations can also be given by examples, in which case counterfactuals, which are
instances similar to the considered sample that bring the model to produce a different outcome,
are provided to users as justifications. Among the methods of this category it is possible to find
LORE [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which learns an interpretable classifier in a neighbourhood of the sample to explain.
This neighbourhood is generated by employing a genetic algorithm, using model outcomes
as labels, and then extracting from the interpretable classifier an explanation consisting of a
decision rule and a set of counterfactuals. DiCE, proposed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], generates counterfactual
examples that are both actionable and diverse. The diversity requirement aims at increasing
the richness of information delivered to the user. MILE [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] exploits an adversarial-like neural
network architecture to learn a transformation able to change the black-box model outcome for
the considered sample. It can provide simultaneously two kinds of explanation: a counterfactual,
which is the result of the transformation application, and a score for each object feature, derived
from the transformation.
      </p>
      <p>
        Finally, some methodologies explain models via local surrogates, which are self-interpretable
functions that mimic the decisions of black-box models locally. LIME [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] explains black-box
models by converting the data object into a domain composed of interpretable features, and
then perturbing it in that domain and querying the black-box to learn a simple model (the
local surrogate) using this generated dataset. A variant of LIME called -LIME, has been
proposed in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] to solve the problem related to out-of-distribution generated samples. In
particular, it exploits semantic features extracted through unsupervised learning to generate the
neighbourhood. The type of explanations provided changes in Anchor [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] where if-then rules
having high precision, called anchors, are created and utilised to represent locally sufficient
conditions for prediction. Other approaches, known as model-specific methods, exploit the
peculiarities of the class of models they are targeted to. For example, some methods focus on
neural networks and exploit information carried by the gradient. The authors of [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] propose
two methodologies to explain ConvNets through visualization that leverage the computation
of the gradient of the class score with respect to the considered input image. Grad-CAM [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ]
uses the gradients of any target concept, which can be related to classification or other tasks,
flowing into the final convolutional layer to produce a coarse localization map highlighting
important regions in the image for predicting the concept.
      </p>
      <p>Among the aforementioned methods, the ones that specifically address the issue of providing
a post-hoc explanation of black-box models dealing with image data are Grad-CAM and RISE.
However, Grad-CAM has the weakness of applying only to a very specific type of model, i.e.
convolutional neural networks.</p>
      <p>As for RISE, being designed to compute the importance of every single feature, it provides
explanations that are hard to exploit, since the number of features is considerable. Generally, to
make these explanations more user-friendly when image data is considered, the explanation
is shaped as a heatmap h that assigns a continuous importance value to each pixel. Regrettably,
this is not sufficient, since the explanations obtained in this way may be so scattered that they
remain incomprehensible to users.</p>
      <p>Even if it is not specifically tailored to image data, one of the most widespread methods for
model-agnostic explanation is LIME, because of its versatility and ease of use. When dealing
with images, LIME considers as interpretable features for the surrogate model the output
obtained from a segmentation algorithm. The issue with this approach is that the segmentation
and explanation steps are separated from each other; therefore, the aggregation of different
pixels is not based on their importance for the model. This fact may lead to a rough explanation
that identifies the most important portions of the image (according to the black box) with low
precision.</p>
      <p>In the following section, we introduce CLAIM, an algorithm for the post-hoc explanation of
black-box models specifically designed for image data. CLAIM addresses the issues described above by
aggregating pixels that the model considers of similar importance, i.e. that produce a similar
effect when they are perturbed.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>Let $f : \mathbb{R}^d \to \mathbb{R}$ be a black box assigning a real value to each data point belonging to the input
space. In principle, $f$ may be the function obtained from a machine learning method for
any specific task. Thus, for example, $f(\mathbf{x})$ may represent the result of a regression analysis on
$\mathbf{x}$, the probability that $\mathbf{x}$ belongs to a certain class, the anomaly degree of the point $\mathbf{x}$, and so on.</p>
      <p>Given a sample $\mathbf{x}$, in order to understand which features are the most important,
according to the model $f$, for the computation of the output $f(\mathbf{x})$, we investigate how feature-wise
perturbations of $\mathbf{x}$ affect the value of $f(\mathbf{x})$. Thus, for each fixed $i \in \{1, \dots, d\}$, we consider
$$\delta^{(i)}_j = f(\mathbf{x}) - f(\mathbf{x} + j\varepsilon\,\mathbf{e}_i), \qquad (1)$$
where $\varepsilon$ is a perturbation step, $j \in \{-m, \dots, -1, 1, \dots, m\}$ determines how many
perturbation steps are applied and in which direction, and $\mathbf{e}_i$ is a vector whose components are all equal to 0 except for
the $i$-th, which is equal to 1.</p>
      <p>The value in Equation (1) expresses how much the output of the black-box model $f$ varies if
we perturb by $j\varepsilon$ the feature $i$ of the input data point $\mathbf{x}$. By collecting the variations obtained
on the feature $i$ with all the different $j \in \{-m, \dots, -1, 1, \dots, m\}$, we obtain an embedded
representation $\mathbf{p}^{(i)} \in \mathbb{R}^{2m}$ relative to the feature $i$, whose components are expressed by
$$\mathbf{p}^{(i)} = \left[\delta^{(i)}_{-m}, \dots, \delta^{(i)}_{-1}, \delta^{(i)}_{1}, \dots, \delta^{(i)}_{m}\right].$$</p>
      <p>The same reasoning applied to each feature of $\mathbf{x}$ produces a finite set of embedded points
$$\mathcal{P} = \left\{\mathbf{p}^{(1)}, \dots, \mathbf{p}^{(d)}\right\} \subseteq \mathbb{R}^{2m}$$
such that each $\mathbf{p}^{(i)}$ represents how the model behaves when we perturb $\mathbf{x}$ on the feature $i$. Each
of the $2m$ dimensions of the space we build contains information about how the pixels of the
sample behave when they are subjected to a fixed perturbation.</p>
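      <p>The construction of the embedded points can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name perturbation_embedding, the toy black box f and the sample x are hypothetical, and the model is queried one perturbed copy at a time for readability rather than efficiency.</p>
      <preformat>
```python
import numpy as np

def perturbation_embedding(f, x, eps, m):
    """Return a (d, 2m) array whose i-th row collects the output variations
    obtained by perturbing feature i by j * eps, for j = -m..-1, 1..m."""
    x = np.asarray(x, dtype=float)
    steps = [j for j in range(-m, m + 1) if j != 0]
    P = np.zeros((x.size, len(steps)))
    fx = f(x)
    for i in range(x.size):
        for col, j in enumerate(steps):
            x_pert = x.copy()
            x_pert[i] += j * eps          # x + j * eps * e_i
            P[i, col] = fx - f(x_pert)    # Equation (1)
    return P

# Hypothetical black box: only the first two features influence the output.
def f(x):
    return 1.0 / (1.0 + np.exp(-(3.0 * x[0] - 2.0 * x[1])))

x = np.array([0.2, 0.1, 0.7, 0.5])
P = perturbation_embedding(f, x, eps=0.5, m=2)
```
      </preformat>
      <p>With these toy values the rows of P associated with the last two features are identically zero, since the model ignores them, while the first two rows are not.</p>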
      <p>The norm of a certain $\mathbf{p}^{(i)}$ in the $2m$-dimensional space represents a score measuring the
importance of the feature $i$ for the computation of the output provided by $f$ on the sample $\mathbf{x}$.
Indeed, if $\|\mathbf{p}^{(i)}\|$ is relatively low, perturbations on the feature $i$ of the input do
not substantially modify the output provided by $f$, and thus the contribution of the feature $i$ to the
output $f(\mathbf{x})$ is small. On the other hand, if it is high, perturbing the feature $i$ has a
large effect on the output, and thus $i$ must be a very important feature according to the model.</p>
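      <p>As a sanity check on this reading of the norm, consider a special case chosen here only for illustration (it is not a setting required by the method): a linear model $f(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b$, where the weight vector $\mathbf{w}$ and the offset $b$ are symbols introduced for this sketch, $\varepsilon$ denotes the perturbation step and $m$ the largest number of steps applied in each direction. Under this assumption every variation can be written in closed form, and the Euclidean norm of the embedded point of a feature turns out to be proportional to the absolute value of its weight.</p>
      <preformat>
```latex
\[
\delta^{(i)}_j = f(\mathbf{x}) - f(\mathbf{x} + j\varepsilon\,\mathbf{e}_i)
               = -\,j\,\varepsilon\,w_i ,
\qquad j \in \{-m, \dots, -1, 1, \dots, m\},
\]
\[
\|\mathbf{p}^{(i)}\|_2
  = \varepsilon\,|w_i|\,\sqrt{2\sum_{j=1}^{m} j^{2}}
  = \varepsilon\,|w_i|\,\sqrt{\frac{m(m+1)(2m+1)}{3}} .
\]
```
      </preformat>
      <p>For instance, with $m = 1$ the score reduces to $\sqrt{2}\,\varepsilon\,|w_i|$, so features with zero weight receive a zero score. For a non-linear model the same proportionality holds only approximately and locally.</p>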
      <p>
        In order to provide a more understandable heatmap, we apply a clustering algorithm on $\mathcal{P}$
with the aim of aggregating features with similar behaviors. Here we consider the $k$-MEANS
algorithm [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] that builds up a set of $k$ centroids $\mathcal{C} = \{\mathbf{c}^{(1)}, \dots, \mathbf{c}^{(k)}\}$ that are representative
of the $k$ clusters $\{C_1, \dots, C_k\}$ into which $\mathcal{P}$ is partitioned.
      </p>
      <p>From our perspective, each cluster $C_l$ with $l \in \{1, \dots, k\}$ constitutes a subset of features
of the input point $\mathbf{x}$ on which the model $f$ behaves similarly in the presence of perturbations. Thus,
each $C_l$ can be seen as a macro-feature, and the norm of the relative centroid $\|\mathbf{c}^{(l)}\|$ indicates
the importance of this macro-feature for the output provided by $f$ on $\mathbf{x}$. To better visualize the
explanation obtained, we build a heatmap $\mathbf{h}$ whose $i$-th element is given by
$$h_i = \|\phi(\mathbf{p}^{(i)})\|, \qquad (2)$$
where we indicate by $\phi : \mathcal{P} \to \mathcal{C}$ the function that assigns each point in $\mathcal{P}$ to the centroid
of the cluster to which it belongs. To better illustrate each step of CLAIM (also reported in
Algorithm 1), in the next subsection we provide a detailed example.</p>
      <p>Algorithm 1: CLAIM</p>
      <p>Input: black box $f$, point $\mathbf{x}$, perturbation step $\varepsilon$, number of perturbations $m$, number of
macro-features $k$</p>
      <p>Output: a heatmap $\mathbf{h}$ highlighting the most important macro-features of $\mathbf{x}$ according to $f$</p>
      <p>1 foreach feature $i = 1, \dots, d$ do
2     foreach $j \in \{-m, \dots, -1, 1, \dots, m\}$ do
3         Compute the perturbation $\delta^{(i)}_j$ using Equation (1);
4 Apply the $k$-MEANS algorithm to the set $\mathcal{P}$, obtaining clusters $\{C_1, \dots, C_k\}$;
5 Build the value $h_i$ of each feature $i$ of the heatmap $\mathbf{h}$ using Equation (2);</p>
      <p>3.1. Motivating Example</p>
      <p>When dealing with a black-box model, an analysis of the values of performance metrics, such as
accuracy, is not always sufficient to effectively assess its quality.</p>
      <p>One common situation is the one in which the presence of some bias in the data used for the
training of a model $f$ may affect the output provided by $f$, making it focus on non-relevant
features. This is potentially dangerous because, if the data used for the quality assessment
present the same issue, the measured performance of the model does not reveal its poor quality.</p>
      <p>
        To reproduce this kind of scenario, we set up the following experiment. We consider as $f$ a
logistic regression model that must classify the images belonging to the MNIST [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] data set as
3 or 9; more specifically, given an image $\mathbf{x}$, $f$ outputs the probability of $\mathbf{x}$ belonging to the class
9. Then, we train $f$ with samples of classes 3 and 9 in which we add a black rectangle in the
bottom-left corner of all the training images belonging to the class 3, as shown in Figure 1a. In
the following, we refer to the images with this modification as biased.
      </p>
      <p>At inference time we pass to the model an input sample $\mathbf{x}$ representing a non-biased 3 (Figure
1b). The model fails to recognize it as a 3, since it outputs $f(\mathbf{x}) = 0.97$.</p>
      <p>This happens because the added sign is fully discriminating between classes in the training
set, thus, the model is clearly focusing only on the portion of the image where the sign is (or
should be) located, as depicted in Figure 1c showing the magnitude of the model’s weights.</p>
      <p>We now apply CLAIM to explain the behaviour of $f$ on the image $\mathbf{x}$; in particular, we set
$\varepsilon = 0.5$, $k = 2$, and $m = 1$.</p>
      <p>Figure 2a shows the bi-dimensional data space $\mathcal{P}$ built in line 3 of Algorithm 1, and Figure 2b
shows the two centroids (the red triangles) and the splitting into the two clusters (purple and
yellow points) obtained in line 4. We can observe that the cluster in yellow is the one farther
from the origin, which means that its centroid has a larger norm and, thus, that it corresponds to the
macro-feature that contributes the most to the output of $f$.</p>
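      <p>The whole pipeline can be mimicked end to end on a toy problem. The sketch below is not the experiment of the paper: instead of a logistic regression trained on MNIST, it uses a hypothetical 8x8 logistic model whose only non-zero weights are hand-set on a 2x2 patch in the bottom-left corner, with an offset chosen so that the output on the unbiased input is close to 0.97. A few plain Lloyd iterations written in NumPy, with an assumed deterministic initialisation, stand in for the k-MEANS step. The names claim_heatmap, sigmoid, w, b and side are illustrative.</p>
      <preformat>
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def claim_heatmap(f, x, eps, m, k, n_iter=20):
    """Sketch of CLAIM: perturb, embed, cluster, score by centroid norm."""
    x = np.asarray(x, dtype=float)
    d = x.size
    steps = [j for j in range(-m, m + 1) if j != 0]
    fx = f(x)
    # Embedded points: row i holds the 2m output variations of feature i.
    P = np.zeros((d, len(steps)))
    for i in range(d):
        for col, j in enumerate(steps):
            x_pert = x.copy()
            x_pert[i] += j * eps
            P[i, col] = fx - f(x_pert)
    # Plain Lloyd iterations stand in for k-MEANS. Assumed initialisation:
    # k points evenly spaced along the ranking of the rows by norm.
    order = np.argsort(np.linalg.norm(P, axis=1))
    centroids = P[order[np.linspace(0, d - 1, k).astype(int)]].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(P[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = P[labels == c].mean(axis=0)
    # Each feature inherits the norm of the centroid of its cluster.
    heat = np.linalg.norm(centroids, axis=1)[labels]
    return heat, labels

# Hypothetical stand-in for the biased classifier: the only non-zero
# weights sit on a 2x2 patch in the bottom-left corner of an 8x8 image.
side = 8
w = np.zeros((side, side))
w[6:, :2] = [[-5.0, -5.2], [-5.4, -5.6]]
w = w.ravel()
b = 3.5

def f(x):
    return float(sigmoid(w @ x + b))

# Unbiased input: a central blob, nothing drawn on the patch.
x = np.zeros((side, side))
x[3:5, 3:5] = 1.0
x = x.ravel()

heat, labels = claim_heatmap(f, x, eps=0.5, m=1, k=2)
heat_img = heat.reshape(side, side)
```
      </preformat>
      <p>In this toy setting the four patch pixels fall into one cluster, whose centroid is far from the origin, while all the other pixels collapse onto the origin and receive a zero heatmap value, mirroring the two-cluster picture described above.</p>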
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Results</title>
      <p>The example described before is quite simple, since the bias is induced by a patch with a regular
shape that is quite easy to handle for most algorithms. Therefore, in order to analyze the
behavior of CLAIM in more challenging scenarios, in the following we consider settings similar
to the one described in Section 3.1 but in which the shapes of the patches are more elaborate.</p>
      <p>To have guidance in assessing the adherence of the explanation to the behaviour of the
explained model, in all the experiments we consider again a logistic regression as the black box $f$.
We also compare the results obtained by CLAIM with those obtained by LIME.</p>
      <p>In the first experiment, instead of a single square, the bias is given by five squares arranged
in a sort of "chessboard"-shaped patch (Figure 3a). Figure 3 reports the visual result of the
experiment on an image (Figure 3a) of a 9 that, differently from the training set, contains the
bias. As before, the model has mainly focused on the features involved in the bias;
indeed, all the weights associated with the other features are very close to 0 (Figure 3b). As
we can see from Figure 3d, CLAIM is able to perfectly isolate the bias, aggregating all its pixels
in a single macro-feature and assigning to it a large value in the heatmap. On the other hand,
LIME is only able to roughly identify the portion of the image containing the bias, but it fails to
precisely determine its shape. This happens because the segmentation algorithm used in LIME
does not exactly separate the bias from the rest of the image.</p>
      <p>A similar behavior can be observed on an image of a 3 that does not contain the patch (Figure
4a). Even in this case, in which the bias is not visible in the image, our method succeeds in capturing it, obtaining
a heatmap (Figure 4d) consistent with the weights of $f$ (Figure 4b). As for LIME (Figure 4c), its
behaviour on this sample is worse. Indeed, since the segmentation step is performed before the
rest of the algorithm and only considers the image content, it is impossible for it to obtain a
partition of the image that is suitable to identify this bias.</p>
      <p>In the second experiment, we add another difficulty by inserting a disconnected bias into the
training set. In particular, we place one "chessboard" patch in the bottom-left corner and one in
the top-right corner. As we can see from Figure 5, the heatmap provided by CLAIM is extremely
precise, since it includes both patches in a single macro-feature and judges it as the most
relevant macro-feature for $f$. As for LIME, also in this case the segmentation step
causes some issues. In particular, the two patches belong to different portions of the image,
and thus they are assigned to two different macro-features. This type of result is not desirable,
since the pixels belonging to the two patches contribute in exactly the same way to the output
provided by $f$ and, for this reason, they conceptually belong to a unique macro-feature.</p>
      <p>Still concerning this experiment, Table 1 shows Precision and Recall over the features
considered important by the weights of the logistic regressor. The metrics are computed on an
unbiased test set, and we report the mean and the standard deviation over all the samples in the
test set. The first column is relative to our method, while the others are relative to the heatmap
of LIME in which we consider the 1, 2, or 3 most important macro-features. The numerical
results confirm that CLAIM outperforms LIME in detecting this kind of bias. In particular, the
Precision of LIME is always much smaller than the Recall, which means that it is judging as
important a lot of pixels that are actually irrelevant for $f$.</p>
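      <p>A possible way to compute such pixel-level Precision and Recall is sketched below. The text does not state how the important features are derived from the weights nor how a heatmap is turned into a set of selected pixels, so both rules are assumptions made for this illustration: a pixel is taken as relevant when the absolute value of its weight exceeds a small tolerance, and as selected when it carries the largest heatmap value. The function name precision_recall and the toy arrays are hypothetical.</p>
      <preformat>
```python
import numpy as np

def precision_recall(heatmap, weights, weight_tol=1e-6):
    """Compare the top macro-feature of a heatmap with the influential weights."""
    heatmap = np.asarray(heatmap, dtype=float).ravel()
    weights = np.asarray(weights, dtype=float).ravel()
    # Ground truth (assumed rule): features whose weight is not negligible.
    relevant = np.abs(weights) > weight_tol
    # Prediction (assumed rule): features carrying the largest heatmap value.
    selected = np.isclose(heatmap, heatmap.max())
    true_pos = int(np.logical_and(relevant, selected).sum())
    precision = true_pos / max(int(selected.sum()), 1)
    recall = true_pos / max(int(relevant.sum()), 1)
    return float(precision), float(recall)

# Toy 4x4 case: the model only uses pixels 12 and 13.
weights = np.zeros(16)
weights[[12, 13]] = -5.0

tight = np.zeros(16)
tight[[12, 13]] = 0.3           # heatmap that isolates the two pixels
coarse = np.zeros(16)
coarse[[8, 9, 12, 13]] = 0.3    # heatmap whose top region is twice as large

tight_scores = precision_recall(tight, weights)
coarse_scores = precision_recall(coarse, weights)
```
      </preformat>
      <p>In the toy case the coarse heatmap keeps full Recall but loses Precision, which is the pattern described above for explanations built on a segmentation that does not follow the bias.</p>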
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this work, we deal with the post-hoc explanation of models that work with image data
and introduce the CLAIM algorithm. Our goal is to provide users with heatmaps that include
higher-level features built through the guidance of the black-box model without exploiting
segmentation algorithms, which only consider image content.</p>
      <p>The preliminary analysis performed on CLAIM’s explanations leads to promising results
that give rise to hope about the potential of this method in producing faithful explanations that
lead users to focus only on important image regions.</p>
      <p>In future development, we will focus on deepening the expressive power of the information it
extracts to compute explanations. We will also investigate CLAIM’s robustness with respect to
its parameters, to analyze how they affect explanation quality. Furthermore, we plan to enlarge
the experiments performed to include more competitors and to consider richer data sets.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We acknowledge the support of the PNRR project FAIR - Future AI Research (PE00000013),
Spoke 9 - Green-aware AI, under the NRRP MUR program funded by the NextGenerationEU.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <article-title>A survey of methods for explaining black box models</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>51</volume>
          (
          <year>2019</year>
          )
          <fpage>93:1</fpage>
          -
          <lpage>93:42</lpage>
          . URL: https://doi.org/10.1145/3236009. doi:10.1145/3236009.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. N.</given-names>
            <surname>Wu</surname>
          </string-name>
          , S.-C. Zhu,
          <article-title>Interpretable convolutional neural networks</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>8827</fpage>
          -
          <lpage>8836</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Donnelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Barnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Deformable protopnet: An interpretable image classifier using deformable prototypes</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>10265</fpage>
          -
          <lpage>10275</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>arXiv preprint arXiv:1705.07874</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Petsiuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Saenko</surname>
          </string-name>
          ,
          <article-title>RISE: Randomized input sampling for explanation of black-box models</article-title>
          ,
          <source>arXiv preprint arXiv:1806.07421</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <article-title>Factual and counterfactual explanations for black box decision making</article-title>
          ,
          <source>IEEE Intelligent Systems</source>
          <volume>34</volume>
          (
          <year>2019</year>
          )
          <fpage>14</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Mothilal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <article-title>Explaining machine learning classifiers through diverse counterfactual explanations</article-title>
          ,
          <source>in: Proceedings of the 2020 ACM FAccT</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>607</fpage>
          -
          <lpage>617</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fassetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nisticò</surname>
          </string-name>
          ,
          <article-title>Local interpretable classifier explanations with self-generated semantic features</article-title>
          ,
          <source>in: DS</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>401</fpage>
          -
          <lpage>410</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"Why should I trust you?": Explaining the predictions of any classifier</article-title>
          ,
          <source>in: Proceedings of the 22nd ACM SIGKDD KDD</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fassetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nisticò</surname>
          </string-name>
          ,
          <article-title>Finding local explanations through masking models</article-title>
          ,
          <source>in: IDEAL 2021</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>467</fpage>
          -
          <lpage>475</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>Anchors: High-precision model-agnostic explanations</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>32</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Simonyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vedaldi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zisserman</surname>
          </string-name>
          ,
          <article-title>Deep inside convolutional networks: Visualising image classification models and saliency maps</article-title>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Selvaraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cogswell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vedantam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <article-title>Grad-CAM: Visual explanations from deep networks via gradient-based localization</article-title>
          ,
          <source>in: Proceedings of the IEEE ICCV</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>618</fpage>
          -
          <lpage>626</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chattopadhay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sarkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Howlader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Balasubramanian</surname>
          </string-name>
          ,
          <article-title>Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks</article-title>
          ,
          <source>in: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>839</fpage>
          -
          <lpage>847</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>MacQueen</surname>
          </string-name>
          ,
          <article-title>Some methods for classification and analysis of multivariate observations</article-title>
          ,
          <source>in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability</source>
          , volume
          <volume>1</volume>
          , Oakland, CA, USA,
          <year>1967</year>
          , pp.
          <fpage>281</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>L.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <article-title>The MNIST database of handwritten digit images for machine learning research</article-title>
          ,
          <source>IEEE Signal Processing Magazine</source>
          <volume>29</volume>
          (
          <year>2012</year>
          )
          <fpage>141</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>