<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Explaining CNN Classifications Using Small Patches</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jean-Marc Boutay</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Quentin Leblanc</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Damian Boquete</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Deniz Köprülü</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ludovic Pfeifer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guido Bologna</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Applied Sciences and Arts of Western Switzerland</institution>
          ,
          <addr-line>Rue de la Prairie 4, 1202 Geneva</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Convolutional neural networks (CNNs) have achieved remarkable success in image classification tasks. However, their decision-making processes remain difficult to interpret, limiting their adoption in sensitive domains. We present a novel explainability method that generates global explanations in the form of propositional rules, combining both pixel values and probabilities associated with sub-image patches. Our approach integrates a multi-layer perceptron trained on image patches, a CNN trained on patch probabilities, and a global rule extraction technique. The key idea is to highlight the most relevant image regions that the model uses for its predictions while maintaining high classification performance. We apply this method to three image classification problems: MNIST, CIFAR-10, and FER2013. The generated rulesets capture meaningful patterns in the data and provide accurate and faithful explanations. Although the rules generalize well on simpler datasets like MNIST, both simple and complex image classification problems result in large rulesets, with the size increasing further as visual variability grows, as on CIFAR-10. Our method achieves competitive performance with standard CNNs while adding a rule-based explanation that highlights the inference process.</p>
      </abstract>
      <kwd-group>
        <kwd>Model Explanation</kwd>
        <kwd>Rule Extraction</kwd>
        <kwd>Convolutional Neural Networks</kwd>
        <kwd>Image Patch Explanation</kwd>
        <kwd>AI Trustworthiness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The explainability of deep neural networks (DNNs) remains
an open research problem. In the XAI domain, popular
techniques applied to DNNs include LIME [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], SHAP [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
and GradCAM [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Some object recognition techniques
generate heatmaps, highlighting, for example, important regions that
contribute to the classification. Reviews of XAI methods
have been presented in [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
      </p>
      <p>A problem with heatmaps provided by GradCAM and
similar methods is that they can highlight the same area
for different classes, and they do not clarify the inference
process. This work proposes a novel explainability method
that uses propositional rules with antecedents representing
small patches of an image to explain classification decisions.
Furthermore, each patch represented in a rule is associated
with a probability. For example, a rule could be: “If a patch of
a certain size at a given position contributes to a given class
with a certain probability, then a certain class of objects is
present”.</p>
      <p>
        Very few works have tried to generate propositional rules
from DNNs. In previous work, our rule extraction
algorithm was applied to multi-layer perceptrons (MLPs),
support vector machines (SVMs), ensembles of MLPs, and
convolutional neural networks [
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9">6, 7, 8, 9</xref>
        ]. The key idea behind
our algorithm is to determine axis-parallel discriminatory
hyperplanes using a greedy algorithm that, at each iteration,
tries to maximize fidelity. The main contribution of this
work lies in the patches obtained by applying the rule extraction
algorithm to a second trained model, which associates a
probability with each patch of each image. We illustrate
our method with three benchmark problems. Finally, we
present several examples of rules that explain CNN
classifications. The following parts present related work, the rule
extraction algorithm, the methodologies employed in this
study, the experiments, and the conclusion.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Explainability is a feature lacking in many machine learning
models. Examples of explainable models include
propositional rules, linear and logistic regression, single decision
trees, and, to a certain extent, nearest-neighbor classifiers.
Artificial neural networks and model ensembles, such as
random forests, are inherently inexplicable. However, a
number of methods have made it possible to approximate
these opaque models with interpretable ones. For instance,
a large number of techniques have been proposed to
approximate MLPs or SVMs with propositional rules [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
      </p>
      <p>
        Common explainability techniques used to understand
model decisions in connectionist models are feature
relevance methods and rule-based explanations. Feature
relevance methods, such as Shapley values [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and LIME (Local
Interpretable Model-Agnostic Explanations) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], assess the
contribution of each input feature to a specific prediction.
Shapley values, which are based on cooperative game theory,
have desirable properties such as efficiency and symmetry.
However, they are computationally complex and assume
feature independence. LIME, on the other hand, constructs
a local surrogate model (linear or tree-based) around the
instance to be explained using perturbed samples weighted
by proximity.
      </p>
      <p>
        Rule-based explanations present decisions in a logical
way, making them easier to understand. Early taxonomies
classify rule extraction methods as pedagogical,
decompositional, or eclectic [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Pedagogical approaches, such as
decision trees, learn input-output relations without using
model weights, whereas decompositional methods analyze
the weights and often encounter exponential complexity.
Eclectic methods combine both.
      </p>
      <p>Unlike visual explanations such as heatmaps, which
highlight general regions (e.g. in image classification), logical
rules provide an explanation and a prediction of the
original model’s behavior, offering greater interpretability. For
instance, in the task of classifying numerals within images,
heatmaps typically highlight pixel regions corresponding to
the digit, yet this emphasis is uniform across all classes and
fails to distinguish between them. We argue that a robust
explanation should incorporate both descriptive and
predictive elements, allowing the original model to be substituted.</p>
      <p>
        The use of propositional rules to explain deep learning
models remains largely unexplored in the existing literature
due to the inherent complexity of the problem. Townsend
et al. presented ERIC (Extracting Relations Inferred from
Convolutions) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This method assumes that each kernel
is linked to a specific concept. The output of each kernel
is quantized as a binary value, which allows rules to be
extracted that link the binarized kernels to each other. This
technique involves a phase in which the kernels are labeled
using manually established symbols. As future work, the
authors proposed the automation of symbol annotation, with
particular emphasis on leveraging established approaches to
facilitate the mapping of convolutional kernels or receptive
fields to their corresponding semantic constructs.
      </p>
      <p>
        The method introduced by Padalkar et al. is similar to the
previous one [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The authors consider that each kernel in
the last convolutional layer can be associated with multiple
concepts rather than just one. Therefore, after binarizing
the kernel activations, they introduce a method that enables
them to automatically label the concepts. However, the
entities present in the images must be annotated in advance.
For example, the ‘bedroom’ class is characterized by the
presence of one or more beds, which must be annotated in
the images in the dataset.
      </p>
      <p>Unlike the previous two approaches, this work extracts
propositional rules without requiring any prior knowledge of the
classification problems. It is worth noting that this covers
the vast majority of datasets. Consequently, the relevant
regions related to the rule antecedents, indicated by small
squares in the images, are visualized to help users
understand the classification strategy.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods and Models</title>
      <sec id="sec-3-1">
        <title>3.1. The use of Axis-Parallel Hyperplanes</title>
        <p>Let us describe a general MLP model and denote $a^{(0)}$ as
a vector for the input layer. For layer $l+1$ ($l \geq 0$), the
activation values $a^{(l+1)}$ of the neurons are
$$a^{(l+1)} = f\left(W^{(l)} a^{(l)} + b^{(l)}\right). \quad (1)$$
$W^{(l)}$ is a matrix of weight parameters between two
successive layers $l$ and $l+1$; $b^{(l)}$ is a vector called the bias and
$f(x)$ is a sigmoid activation function:
$$f(x) = \frac{1}{1 + \exp(-x)}. \quad (2)$$</p>
        <p>When $W^{(0)}$ is a diagonal matrix and the activation
function in the first hidden layer is the step function $s(x)$ given
below, we obtain axis-parallel hyperplanes, one for each
input neuron:
$$s(x) = \begin{cases} 1 &amp; \text{if } x &gt; 0; \\ 0 &amp; \text{otherwise.} \end{cases} \quad (3)$$</p>
        <p>
          To efficiently train an MLP with axis-parallel hyperplanes,
we replace the step function with its generalization, which
is a staircase activation function. Each step of this function
creates an axis-parallel hyperplane [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>
          The creation of axis-parallel hyperplanes can be applied
to CNNs. In this case, adding a special layer after the input
layer is sufficient [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. It plays the role of quantization via
a diagonal matrix of weights (see eq. 1) and a staircase
activation function. We denote this layer as QIL (Quantized
Interpretable Layer). Furthermore, the QIL can simply act as
a normalization layer, which means that it is frozen during
training. In fact, QIL weight values depend on the averages
and standard deviations of the training data for each input
neuron [
          <xref ref-type="bibr" rid="ref7 ref9">7, 9</xref>
          ].
        </p>
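        <p>To make the quantization concrete, here is a minimal Python sketch of a staircase activation and a frozen QIL; the function names, the number of steps, and the ±3 standard-deviation quantization range are illustrative assumptions of this sketch, not the exact implementation of [8, 9].</p>
        <preformat><![CDATA[
import numpy as np

def staircase(x, low, high, q=100):
    """Staircase activation: a sum of q shifted step functions.

    Each of the q thresholds, spread between `low` and `high`,
    creates one axis-parallel hyperplane per input feature.
    """
    thresholds = np.linspace(low, high, q)   # shape (q,)
    # Count how many thresholds each value exceeds, rescaled to [0, 1].
    return (x[..., None] > thresholds).sum(axis=-1) / q

def qil(x, mean, std, q=100):
    """Frozen QIL sketch: normalize with per-feature training statistics
    (the diagonal weight matrix and bias of eq. 1), then quantize."""
    z = (x - mean) / std
    return staircase(z, -3.0, 3.0, q)   # assumed quantization range
]]></preformat>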
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Our Local and Global Rule Extraction</title>
      </sec>
      <sec id="sec-3-3">
        <title>Algorithms</title>
        <p>Fidelity refers to the degree to which the extracted rules
mimic the behavior of a model. This is a measure of the
accuracy with which the rules represent the decision-making
process of a neural network. Specifically, with $n$ samples in
a training set and $n'$ samples for which the classifications of
the rules match the classifications of the model, the fidelity
is $n'/n$.</p>
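        <p>As a minimal illustration (assuming predictions are given as integer class labels), fidelity is simply the agreement rate between rule and model predictions:</p>
        <preformat><![CDATA[
import numpy as np

def fidelity(rule_preds, model_preds):
    """Fraction n'/n of samples where the rules agree with the model."""
    return (np.asarray(rule_preds) == np.asarray(model_preds)).mean()
]]></preformat>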
        <p>
          Our local rule extraction algorithm relies heavily on fidelity.
During rule construction, at each iteration, it selects the
hyperplane that yields the highest increase in fidelity. Its
computational complexity is linear with respect to the
product of the following: the dimensionality of the classification
problem, the number of training samples, the maximum
number of antecedents per rule, and the number of steps in
the staircase activation function [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>The execution time of our local rule extraction technique
can be accelerated with two dropout parameters: one
determines, at each step, the proportion of input variables that
will not be taken into account; the other is similar, but
concerns excluded hyperplanes.</p>
        <p>Our global rule extraction algorithm generates a set of
rules for a training set of size $n$. It corresponds to a covering
technique that calls our local rule extraction algorithm $n$
times. Therefore, it first generates $n$ rules and then uses
a simple heuristic to select a subset of the rule base that covers
all $n$ samples: the rules are ranked in descending order
according to the number of training samples they cover, and
selected in that order until all the samples are covered.</p>
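        <p>A compact sketch of this covering heuristic is given below; representing a rule as an object carrying the set of training-sample indices it covers (<monospace>rule.covered</monospace>) is an assumption of the sketch.</p>
        <preformat><![CDATA[
def select_covering_rules(rules, n_samples):
    """Greedy covering sketch: rank rules by covering size, then add
    rules in that order until every training sample is covered."""
    uncovered = set(range(n_samples))
    ruleset = []
    for rule in sorted(rules, key=lambda r: len(r.covered), reverse=True):
        if uncovered & rule.covered:     # the rule covers something new
            ruleset.append(rule)
            uncovered -= rule.covered
        if not uncovered:
            break
    return ruleset
]]></preformat>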
      </sec>
      <sec id="sec-3-4">
        <title>3.3. Methodologies for Obtaining Rules</title>
      </sec>
      <sec id="sec-3-5">
        <title>Involving Patches</title>
        <p>Our goal is to obtain explanations of image sample
classifications using patches, reflecting exactly the model’s behavior.
To do so, we implemented two different yet similar
methodologies. The diagram of the first one is illustrated in Figure
1.</p>
        <p>In this method, we first split an image into sub-image
patches of size $h \times w \times c$, with $h$ the height, $w$ the
width, and $c$ the number of channels. The goal is to get
localized explanations of the image. We start at the top
left of the image to get the first patch; then we slide
horizontally and vertically with a stride $s$ to scan the whole
image and get all patches. Usually, we choose a patch
size of $7 \times 7$ and a stride size of 1. If the image has
size $H \times W \times C$ (height, width, channels), we obtain
$(\lfloor (H-h)/s \rfloor + 1) \times (\lfloor (W-w)/s \rfloor + 1)$ patches per image. We
build a dataset composed of all these patches with their
corresponding location in the image, and we train an MLP
on it. Then, for each patch, we extract its classification
probability for each class. We construct another dataset
by concatenating these probabilities with the original
image samples. The shape of each sample in the new dataset is
$(\lfloor (H-h)/s \rfloor + 1) \times (\lfloor (W-w)/s \rfloor + 1) \times n_{\mathrm{classes}} + (H \times W \times C)$.</p>
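        <p>The following sketch shows this patch extraction in Python; the function name and the return format (patches plus top-left positions) are assumptions of the sketch.</p>
        <preformat><![CDATA[
import numpy as np

def extract_patches(image, h=7, w=7, stride=1):
    """Slide an h x w window over an image of shape (H, W, C) and
    return all patches with the (row, col) of their top-left corner."""
    H, W, _ = image.shape
    patches, positions = [], []
    for i in range(0, H - h + 1, stride):
        for j in range(0, W - w + 1, stride):
            patches.append(image[i:i + h, j:j + w, :])
            positions.append((i, j))
    return np.array(patches), np.array(positions)

# For a 28 x 28 x 1 MNIST image, 7 x 7 patches with stride 1 give
# (28 - 6) * (28 - 6) = 484 patches, i.e. (H - 6) x (W - 6).
]]></preformat>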
        <p>[Figure 1: Pipeline of the first methodology. The input images (H × W × C) are split into patches; an MLP trained on the patches yields probabilities of shape (H−6) × (W−6) × nbClass; the image and the patch probabilities each pass through a QIL and are processed by a VGG trained on the image and a CNN trained on the patch probabilities, respectively, reunited at the end; the model's predictions feed the global rule extraction.]</p>
        <p>We then train a custom model composed of a QIL (cf.
Sect. 3.1) and two separate networks: a VGG-16 and a
custom CNN processing, respectively, the image and the patch
probabilities, reunited at the end through a final MLP. The
predictions of this model are used to obtain rules
explaining the model with our global rule extraction algorithm
described in Sect. 3.2. A rule can contain two types
of antecedents: image pixels and the probability
of a patch for a specific class. The image pixels come from
the original image.</p>
        <p>The second methodology is similar to the first and gives
the same kind of rules. Its diagram is shown in Figure 2.
The first steps of the pipeline are the same: we train an MLP
on the patches and construct the same dataset as before. The
second model is different. The data also passes through a
QIL, but then we train different VGGs, one for each class in
the dataset. Each VGG is responsible for distinguishing one
class from all the others and outputs the probability that
each sample belongs to that class. The VGG responsible for
class $k$ takes input data of shape $H \times W \times 3$. This data
is composed of the probabilities of the patches for class $k$,
padded to match the size of the image, along
with the red and green channels of the original image. The
prediction for each sample is the maximum among all class
prediction scores. The rules are then computed in the same
way.</p>
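        <p>A minimal sketch of how the input for the VGG responsible for class $k$ could be assembled follows; the zero-padding scheme that centers the probability map and the channel ordering are assumptions of this sketch.</p>
        <preformat><![CDATA[
import numpy as np

def class_k_input(image, patch_probs, k):
    """Build the (H, W, 3) input for the VGG responsible for class k.

    image:       original image, shape (H, W, 3)
    patch_probs: patch probabilities, shape (H - 6, W - 6, n_classes)
    """
    H, W, _ = image.shape
    probs_k = np.zeros((H, W))
    probs_k[3:H - 3, 3:W - 3] = patch_probs[:, :, k]  # pad to image size
    # Stack the red and green channels with the class-k probability map.
    return np.stack([image[:, :, 0], image[:, :, 1], probs_k], axis=-1)
]]></preformat>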
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Datasets</title>
        <p>
          We applied our method to three diferent datasets: MNIST
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], CIFAR-10 [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], and FER2013 [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. The MNIST dataset
is a collection of 28 × 28 handwritten digits of classes 0-9 and
is a common benchmark for image classification. It is made
up of tiny black-and-white images. CIFAR-10 is another
classification benchmark with colorful 32 × 32 × 3 images
of ten diferent classes: airplane, automobile, bird, cat, deer,
dog, frog, horse, ship, and truck. It is known to be a more
difficult problem than MNIST. Finally, we used the FER2013
dataset. It is a set of black and white 48 × 48 images that
represent facial expressions. There are seven classes: angry,
disgust, fear, happy, sad, surprise, and neutral. To reduce
the difficulty of the classification, we chose to merge
all the non-happy classes into one new class, resulting in two
classes: happy and non-happy. This results in an imbalanced
dataset, with 33.4% of samples belonging to the happy class.
Table 1 shows for each dataset the number of training and
testing samples, the size of the images, and the number of
classes.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Metrics</title>
        <p>We evaluated our models and rulesets on the datasets using
several different metrics. We define an activated rule for
a sample as a rule whose conditions (or antecedents) are
satisfied by that specific sample. A correct rule is defined as
an activated rule that is faithful to the model’s prediction,
and a wrong rule as a non-faithful activated rule. The fidelity
of a ruleset on the training set from which it was generated is 100%. It is
essential to note that if no rule is activated for a specific
sample, the ruleset agrees by default with the model’s
prediction. The statistics we consider are the following:
– The training and testing accuracy of the first MLP
model;
– The training and testing accuracy of the second
model using VGG-16;
– The number of rules in the ruleset;
– The mean number of antecedents per rule;
– The mean covering size per rule (the mean number
of train samples covered by a rule);</p>
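        <p>As a sketch of how these notions can be checked in code (the representation of a rule as a list of antecedents (index, threshold, op) over the concatenated feature vector, plus a predicted class, is an assumption here):</p>
        <preformat><![CDATA[
def rule_status(rule, features, model_pred):
    """Return 'inactive', 'correct', or 'wrong' for one sample."""
    for index, threshold, op in rule.antecedents:
        value = features[index]
        if (op == '>=' and value < threshold) or \
           (op == '<' and value >= threshold):
            return 'inactive'     # an antecedent is not satisfied
    # The rule is activated: correct iff it is faithful to the model.
    return 'correct' if rule.prediction == model_pred else 'wrong'
]]></preformat>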
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Results</title>
        <sec id="sec-4-3-1">
          <title>4.3.1. Model architectures and training settings</title>
          <p>We detail the architecture of the models and the parameter
settings used in our experiments. We have chosen the same
configuration for MNIST and CIFAR-10. We trained the first
model with patches of size 7 × 7 and a stride of size 1. We
used 100 steps in the staircase activation function in the QIL
and a dropout rate of 0.95 for input variables and hyperplanes
during rule extraction. We applied the first methodology
described in Sect. 3.3. The three datasets were all learned
by the same MLP model on the patches. The model splits
into two branches. The first is dedicated to learning each
patch and has two dense layers of sizes 128 and 64. The
second has only one dense layer of size 8 and focuses on the
localization of the patch in the original image, characterized
by the coordinate of its uppermost point on the left. The
two branches merge and pass through a dense layer of size
64, and through an output layer with softmax, whose size
matches the number of target classes. It was trained for 60
epochs with a categorical cross-entropy loss and optimized
using the Adam algorithm with a learning rate of 1 × 10⁻³.
During the second training, the model also splits into two
branches. The images are resized to 224 × 224 and passed
through a VGG-16 followed by a 256-dimensional dense layer with a
dropout of 30% and batch normalization. On the other side,
the patch probabilities are upsampled to twice their height and
width, passed through three 3 × 3 convolutional layers with
64, 128, and 256 filters, and a dense layer of size 256. They use
batch normalization and a LeakyReLU activation function. A
max-pooling of size 2 and an L2-regularization of 5 × 10⁻⁴
are used for the convolutions, and a final dropout of 40% is
applied. The two branches merge and pass through three
dense layers of sizes 256, 128, and 64, ending with an output
layer using the softmax activation function. It was trained for 80
epochs with categorical cross-entropy loss and optimized
with Adam with a learning rate of 1 × 10⁻⁵.</p>
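          <p>The following tf.keras sketch assembles a second model matching these specifications; the exact placement of pooling, normalization, and dropout, the ReLU activations in the merged head, and the omission of the QIL are simplifying assumptions of this sketch.</p>
          <preformat><![CDATA[
import tensorflow as tf
from tensorflow.keras import layers

def build_second_model(n_classes, prob_shape):
    # Image branch: resized image -> VGG-16 -> 256-dim dense layer.
    img_in = layers.Input(shape=(224, 224, 3))
    x = tf.keras.applications.VGG16(include_top=False, pooling='avg')(img_in)
    x = layers.Dense(256)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.30)(x)

    # Patch-probability branch: upsample, then three 3x3 conv blocks.
    prob_in = layers.Input(shape=prob_shape)
    y = layers.UpSampling2D(size=2)(prob_in)    # twice height and width
    for filters in (64, 128, 256):
        y = layers.Conv2D(filters, 3, padding='same',
                          kernel_regularizer=tf.keras.regularizers.l2(5e-4))(y)
        y = layers.BatchNormalization()(y)
        y = layers.LeakyReLU()(y)
        y = layers.MaxPooling2D(2)(y)
    y = layers.Flatten()(y)
    y = layers.Dense(256)(y)
    y = layers.Dropout(0.40)(y)

    # Merge the two branches and classify.
    z = layers.Concatenate()([x, y])
    for units in (256, 128, 64):
        z = layers.Dense(units, activation='relu')(z)
    out = layers.Dense(n_classes, activation='softmax')(z)

    model = tf.keras.Model([img_in, prob_in], out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    return model
]]></preformat>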
          <p>For FER2013, we used the same staircase activation
function for the QIL element, but the patch size is 8 × 8 with
a stride of 1, and the dropout rate is 0.9 for the input variables
and hyperplanes. The second methodology was used:
the second training consists of an application of a VGG-16
for each class during 80 epochs, as described in Sect. 3.3.</p>
          <p>For each dataset, Table 2 shows the number of training
and testing patches and the input and output shape of the
first MLP model. Table 3 shows the number of training and
testing samples in the second model using VGG-16, and the
input and output shape of these samples.</p>
        </sec>
        <sec id="sec-4-3-2">
          <title>4.3.2. Classification performance</title>
          <p>The results presented in this paper were only computed on a
single execution and do not represent a mean across several
runs, nor do they contain a variance. This is mainly due
to the high computational cost of the entire pipeline, even
when using a GPU. However, based on our past experiments,
we are confident that these results would remain stable
across executions, with minimal variance, and are therefore
representative of the model’s expected behavior.</p>
          <p>The accuracies of the first MLP model trained on patches
are shown in Table 4. The results on MNIST and CIFAR-10
are not very high because it is difficult to discriminate among
10 classes with only patches of size 7 × 7. In the case of
FER2013, the performance is strongly related to the rate of
non-happy samples, which is 66.6%: the model will almost
always predict "non-happy".</p>
          <p>The performance of the second model, represented in
Table 5, is much better. This is mainly because when we
process one sample, we consider the original image sample
and all its patches together. All three problems perform well,
exceeding 97% in training accuracy. The test accuracies are
slightly lower for CIFAR-10 and FER2013, but stay above 92%.
These results can be compared with the results shown in
Table 6, which reports the mean and standard deviation over
10 runs of training a VGG-16 for 80 epochs on the original
images. The performance of our method is very close to
these, showing that adding the patch probabilities did
not noticeably affect the performance.</p>
        </sec>
        <sec id="sec-4-3-3">
          <title>4.3.3. Statistical analysis of rulesets</title>
          <p>Table 7 shows the statistics of the ruleset for each dataset.
As CIFAR-10 is a more difficult problem, more rules are
required to cover every training sample. The object of interest
in the image, for example, a cat, can appear in many
different orientations, under various perspectives, scales, and
positions. That is the main reason why there are so many
rules in the generated ruleset. On the contrary, the faces
from FER2013 and the digits from MNIST are very often
centered in the image, with similar size and position.</p>
          <p>The number of antecedents per rule is nearly the same
for each dataset, between three and four. This is fewer than
we might have expected: on average, only three or
four conditions are required in a rule to cover samples that
have been classified into the same category by the model.</p>
          <p>MNIST requires fewer conditions to obtain rules that are
faithful to the model, and these rules cover many more
training samples, meaning that they capture more
common patterns in the training set. They also generalize well,
achieving a fidelity of 99.05% and a rule accuracy of 98.82%
on the test set. In contrast, the CIFAR-10 rules are less
effective when applied to the test set. This results in a
significant decrease in both fidelity and accuracy, which reflects
the difficulty of finding reliable and representative rules.</p>
          <p>A noteworthy observation is the increase in test accuracy
when considering only samples for which the rules and the
model agree on the prediction. It increases even further
when considering only activated rules (in this case,
uncovered samples in the test set are not taken into account). This
shows the global relevance of the ruleset, which sometimes
outperforms the model. We notice that many samples
activate several rules, which means that there are various ways
to explain the decision of one sample. Sometimes, one can
obtain a rule explaining the test sample that does not reflect
the model’s decision. It is rare for MNIST but common for
the two other datasets. We should prioritize rules faithful
to the model, but it is interesting to analyze these "wrong"
rules, which may nevertheless predict the true class.</p>
          <p>Note that a rule is found for approximately 97.8% of the
MNIST test samples, 69% for CIFAR-10, and 84% for FER2013.
When no rule is found, or when the activated rules do not agree on the
same prediction, our local rule extraction algorithm could
be applied, but this is outside the scope of this work.</p>
        </sec>
        <sec id="sec-4-3-4">
          <title>4.3.4. Computational cost</title>
          <p>We now look at the computational cost. We need to train
all patches, then all probabilities and images, and finally
execute the global rule extraction algorithm. We may use
GPUs to train the models, but our rule generation is not yet
implemented for GPU usage. However, it is parallelized for
multiple CPUs. In Table 8, we show the execution times
when using one GPU for training and 48 CPUs for rule
generation.</p>
        </sec>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Rules Visualization</title>
        <p>Let us now visualize some of the rules that we have
obtained. Each figure represents a sample and one of the rules
that covers it. It is made up of the original image, an
image that contains all the antecedents, and one image for
each antecedent. There are two types of antecedents: the
probability of a patch for a specific class, represented by a
patch on the image, or a pixel value, represented by a single
pixel. They are green if the condition inequality is ≥ and
red if it is &lt;. The title of the sub-image describes exactly
the antecedent.</p>
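        <p>A minimal matplotlib sketch of this overlay follows; the antecedent representation (top-left row and column, comparison operator, and a patch/pixel flag) is an assumption of the sketch.</p>
        <preformat><![CDATA[
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

def show_antecedents(image, antecedents, patch_size=7):
    """Overlay antecedents: green squares for '>=' conditions, red for '<'."""
    fig, ax = plt.subplots()
    ax.imshow(image, cmap='gray')
    for row, col, op, is_patch in antecedents:
        size = patch_size if is_patch else 1
        color = 'green' if op == '>=' else 'red'
        ax.add_patch(Rectangle((col - 0.5, row - 0.5), size, size,
                               fill=False, edgecolor=color, linewidth=1.5))
    plt.show()
]]></preformat>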
        <sec id="sec-4-4-1">
          <title>4.4.1. Handwritten digits - MNIST</title>
          <p>Let us start by looking at some of the rules that we generated
for the MNIST dataset. Figure 3 represents a rule of class 0
for a specific sample. This rule has four antecedents. The
three patches in the rule are probability conditions for class
0 in these areas. The probability needed goes above 67.65%
for the first condition. The last antecedent is a red pixel,
which means that this specific pixel value has to be less
than 0.11 in the image. If the four conditions of the rule
are met by a sample, it will be classified as class 0 by this
rule. This specific rule covers 3362 training examples, which
means that it is highly representative of the digit 0. This
rule has perfect fidelity and accuracy on the train and test
sets. Figure 4 shows another sample covered by this rule,
which means that the rule covers different shapes of zeros.</p>
          <p>In Figures 5 and 6, we see a sample of class 9 covered by
two different rules. The two rules have, again, perfect
accuracy and fidelity. As before, they cover many samples. We
observe that a single sample activates several different rules
in the generated ruleset. Both rules have patches around
the loop of digit 9. The green ones are related to class 9,
and the red one is the probability of another class. For the
first rule, represented in Figure 5, it is interesting to notice
that without the red patch, an 8 could fit the three
green patches perfectly. That is why the rule needs to
eliminate class 8 with a red patch at the bottom, where an 8
differs from a 9. In Figure 6, it is almost the same, but
with class 4 for the red patch. Digit 4 usually has no
signal in the top right of the image, but the other two green
patches could be part of a 4. Therefore, the rule needs to
eliminate it with this red patch, requiring a low probability
of a 4 there.</p>
        </sec>
        <sec id="sec-4-4-2">
          <title>4.4.2. Colorful images - CIFAR-10</title>
          <p>Figures 7 and 8 show a sample of a CIFAR-10 truck that
activates two different rules. It is worth noting that one
of them uses many pixels to classify the sample, while the
other does not use any. As shown in Figure 7, the patches
in the first rule are related to the truck. All antecedents
are placed in important spots: the sky, the road, the truck
itself, and the wheels (or underneath the truck). They are
the principal features of a truck image that can differentiate
it from other classes.</p>
          <p>The second rule (Figure 8) does not have a sky patch,
but the highest patch contains a white area that could be
interpreted as the sky. The penultimate patch is a probability
for the class frog, likely due to the brown, gray, and blue
colors present in the front of the truck. The last one has a
probability for the class ship. As an image of a ship usually
contains a transition between the water and the ship, the
transition between the truck and its wheels can look similar.
It is interesting to see how the model can use the other
classes to predict an image when it has been trained with
patches first. This rule covers only seven train samples and
a unique test sample. It is perfectly accurate and faithful
on training samples, but even though it is accurate on the
covered test sample, it is not faithful, which means that
the ruleset predicts better than the model on this specific
sample.</p>
          <p>Another interesting class is the horse. Figure 9 illustrates
a typical rule that we obtain. We observe that patches are
located on the head and legs of the animal. There is also a
patch eliminating the frog class, as the principal colors of
the image, the grass, sky, and the brown of the horse, are
often present in a frog image.</p>
          <p>For FER2013, 7 × 7 patches proved insufficient,
because classifying small patches is complex and they do
not contain enough information. Consequently, patches
measuring 8 × 8 were used in this experiment.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion and Conclusion</title>
      <p>We presented two new methodologies for explaining deep
neural networks used for image classification. We obtained
explanatory rules in the form of patches and pixels of
interest by applying a global rule extraction technique. We
provided several visualizations to illustrate and explain the rules
and image samples. We demonstrated the approach on three different datasets,
showing possible results for different use cases. We are not
aware of any other work that generates global rules from
CNNs with conditions involving patches and pixels. As we
used a VGG-16 model with minimal modification (we added
the QIL element), our models exhibit very good predictive
accuracy. Another strength of our method is that we can
always find an explanatory rule to describe a sample because
we can activate our local rule extraction algorithm if no rule
in an extracted ruleset is applicable.</p>
      <p>Our goal was to highlight the most important areas of
the image sample that the model uses to predict. In order
to obtain good accuracy and fidelity, we needed to train
two diferent models and explain the second model with
our global rule extraction technique. The complexity of
this method increases because of this, but the results are
convincing.</p>
      <p>The number of rules obtained in the ruleset is very high
for each dataset, especially for CIFAR-10. This is mainly
because the rules need to cover the entire training set and
each rule must have perfect fidelity. The CIFAR-10 dataset
contains ten different classes, with objects appearing in
various locations within the image, at different scales and
orientations. It is harder to find rules that correspond to
many images.</p>
      <p>The highlighted patches and pixels in the rules provide
valuable information on the areas on which the model
focuses and the links it establishes between the different
classes. Often, a rule reads like this: if there is a strong
probability for a class to be at some place in the image, and
a good probability for another class to be somewhere else,
but yet another class should not be at that place, then the
rule predicts a certain class. This shows how the model uses
the patch predictions to predict the sample.</p>
      <p>We plan to try new methodologies and models to see how
far performance can be increased and how much we can
improve the relevance of the explanations. In this work, we
tested our method on benchmark problems; we will work on
speeding up the whole process to apply it to larger images.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was conducted in the context of the Horizon
Europe project PRE-ACT (Prediction of Radiotherapy side
effects using explainable AI for patient communication and
treatment modification), and it has received funding through
the European Commission Horizon Europe Program (Grant
Agreement number: 101057746). In addition, this work
was supported by the Swiss State Secretariat for
Education, Research and Innovation (SERI) under contract number
2200058.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used
ChatGPT and Grammarly in order to check grammar and spelling,
paraphrase, and reword. After using these tools, the authors
reviewed and edited the content as needed and assume full
responsibility for the content of the publication.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?" Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135-1144. doi:10.1145/2939672.2939778.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] H. Chen, S. Lundberg, S.-I. Lee, Explaining models by propagating Shapley values of local components, Explainable AI in Healthcare and Medicine: Building a Culture of Transparency and Accountability (2021) 261-270. doi:10.1007/978-3-030-53352-6_24.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618-626. doi:10.1109/ICCV.2017.74.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A survey of methods for explaining black box models, ACM Computing Surveys (CSUR) 51 (2018) 1-42. doi:10.1145/3236009.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] L. V. Haar, T. Elvira, O. Ochoa, An analysis of explainability methods for convolutional neural networks, Engineering Applications of Artificial Intelligence 117 (2023) 105606. doi:10.1016/j.engappai.2022.105606.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] G. Bologna, C. Pellegrini, Constraining the MLP power of expression to facilitate symbolic rule extraction, in: 1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No. 98CH36227), volume 1, IEEE, 1998, pp. 146-151. doi:10.1109/IJCNN.1998.682252.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] G. Bologna, Y. Hayashi, A rule extraction study from SVM on sentiment analysis, Big Data and Cognitive Computing 2 (2018) 6. doi:10.3390/bdcc2010006.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] G. Bologna, A rule extraction technique applied to ensembles of neural networks, random forests, and gradient-boosted trees, Algorithms 14 (2021) 339. doi:10.3390/a14120339.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] G. Bologna, J.-M. Boutay, D. Boquete, Q. Leblanc, D. Köprülü, L. Pfeifer, Fidex and FidexGlo: From local explanations to global explanations of deep models, Algorithms 18 (2025). URL: https://www.mdpi.com/1999-4893/18/3/120. doi:10.3390/a18030120.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] F. Sabbatini, Four decades of symbolic knowledge extraction from sub-symbolic predictors. A survey, ACM Computing Surveys 58 (2025) 1-36. doi:10.1145/37490.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] J. Diederich, Rule extraction from support vector machines: An introduction, in: Rule Extraction from Support Vector Machines, Springer, 2008, pp. 3-31.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] R. Andrews, J. Diederich, A. B. Tickle, Survey and critique of techniques for extracting rules from trained artificial neural networks, Knowledge-Based Systems 8 (1995) 373-389. doi:10.1016/0950-7051(96)81920-4.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] J. Townsend, T. Kasioumis, H. Inakoshi, ERIC: Extracting relations inferred from convolutions, in: Proceedings of the Asian Conference on Computer Vision, Springer, Cham, 2020, pp. 206-222. doi:10.1007/978-3-030-69535-4_13.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] P. Padalkar, H. Wang, G. Gupta, NeSyFOLD: A framework for interpretable image classification, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 2024, pp. 4378-4387. doi:10.1609/aaai.v38i5.2823.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] L. Deng, The MNIST database of handwritten digit images for machine learning research, IEEE Signal Processing Magazine 29 (2012) 141-142.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] A. Krizhevsky, Learning multiple layers of features from tiny images, Technical Report TR-2009, University of Toronto, 2009.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] D. Erhan, I. Goodfellow, W. Cukierski, Y. Bengio, Challenges in representation learning: Facial expression recognition challenge, https://www.kaggle.com/competitions/challenges-in-representation-learning-facial-expression-recognition-challenge, 2013. Kaggle competition, accessed 2025-06-24.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>