<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Argumentative Interpretable Image Classification</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hamed</forename><surname>Ayoobi</surname></persName>
							<email>h.ayoobi@imperial.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computing</orgName>
								<orgName type="institution">Imperial College London</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nico</forename><surname>Potyka</surname></persName>
							<email>potykan@cardiff.ac.uk</email>
							<affiliation key="aff1">
								<orgName type="institution">Cardiff University</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Francesca</forename><surname>Toni</surname></persName>
							<email>f.toni@imperial.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computing</orgName>
								<orgName type="institution">Imperial College London</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Argumentative Interpretable Image Classification</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">FE04B29260CE63BBD6426878146772A0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:16+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Interpretable Image Classification</term>
					<term>Argumentation</term>
					<term>Prototypical-Parts Learning</term>
					<term>XAI</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We propose ProtoSpArX, a novel interpretable deep neural architecture for image classification in the spirit of prototypical-part-learning as found, e.g. in ProtoPNet. While earlier approaches associate every class with multiple prototypical-parts, ProtoSpArX uses super-prototypes that combine prototypical-parts into single class representations. Furthermore, while earlier approaches use interpretable classification layers, e.g. logistic regression in ProtoPNet, ProtoSpArX improves accuracy with multi-layer perceptrons while relying upon an interpretable reading thereof based on a form of argumentation. ProtoSpArX is customisable to user cognitive requirements by a process of sparsification of the multi-layer perceptron/argumentation component. Also, as opposed to other prototypical-part-learning approaches, ProtoSpArX can recognise spatial relations between different prototypical-parts that are from various regions in images, similar to how CNNs capture relations between patterns recognized in earlier layers.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>versus the proposed superprototypes (b) for a sample in the CUB dataset <ref type="bibr" target="#b0">[1]</ref>.</p><p>Deep neural architectures are successful in various tasks, but tend to be mostly inscrutable blackboxes. In high-stakes settings, interpretability is crucial and interpretable models are advocated over black-boxes, especially if they achieve comparable performance <ref type="bibr" target="#b1">[2]</ref>. Prototypical-part learning for image classification amounts to learning prototypical-parts of classes in images by introducing a human-interpretable prototype layer between the convolutional backbone (intutively, it learns patterns in the image space) and the classification component (intuitively, it uses the patterns identified by the backbone to classify an image) of convolutional neural networks <ref type="bibr" target="#b2">[3]</ref>. Prototypicalparts can be seen as patches in images, like the beak or tail of a bird (see Figure <ref type="figure" target="#fig_1">1</ref> (a)). The prototype layer determines the similarity between prototypical-parts and patches in the latent space that the convolutional backbone maps to. Even though some prototypical-parts may correspond to background patches that are meaningless for humans (rather than exclusively meaningful parts in images as in Figure <ref type="figure" target="#fig_1">1</ref> (a)), they allow making transparent classifications, based on clearly defined prototypes, if the classification component is interpretable.</p><p>We propose ProtoSpArX (Section 4, overviewed in Figure <ref type="figure" target="#fig_2">2</ref>), a novel interpretable deep neural architecture for image classification in the spirit of prototypical-part-learning. Similar to ProtoPShare <ref type="bibr" target="#b3">[4]</ref> and ProtoTrees <ref type="bibr" target="#b4">[5]</ref>, ProtoSpArX shares prototypes among classes. However, while these and other prototypical-part-learning approaches associate every class with multiple prototypical parts, ProtoSpArX summarizes them in a single super-prototype per class that encodes spatial relations among them (see Figure <ref type="figure" target="#fig_1">1</ref> (b) for an illustration).</p><p>The use of super-prototypes allows capturing spatial relations between prototypical parts similar to how CNNs capture relations between patterns recognized in earlier layers. As we will show in the experiments with the SHAPES dataset <ref type="bibr" target="#b5">[6]</ref>, these relations are essential for some classification tasks but state-of-the-art prototypical-part-learning approaches are unable to capture them. For example, in Figure <ref type="figure" target="#fig_2">2</ref>, a positive example (Class 1) has a triangle in the left column and a circle in the right column on the same row. Merely recognizing prototypical-parts for triangles and circles in the input image (as in other prototypical-part-learning approaches) is insufficient for determining the class label in this example. ProtoSpArX effectively tackles this challenge by encoding the spatial relations between distinct prototypical-parts using the super-prototype kernels.</p><p>The classifier component in ProtoSpArX is a quantitative bipolar argumentation framework (QBAF) that is trained using the SpArX methodology of <ref type="bibr" target="#b6">[7]</ref>. 
Intuitively, the QBAF uses weighted attacks and supports between super-prototypes and meta-arguments (latent arguments attacked and supported by super-prototypes or other meta-arguments) to classify an image. This is indicated by the red and green arrows in Figure <ref type="figure" target="#fig_2">2</ref><ref type="foot" target="#foot_1">1</ref>.</p><p>We show experimentally (Sections 5 and 6) that ProtoSpArX outperforms the state-of-the-art prototypical-part-learning models ProtoPNet <ref type="bibr" target="#b2">[3]</ref>, ProtoTree <ref type="bibr" target="#b4">[5]</ref>, ProtoPShare <ref type="bibr" target="#b3">[4]</ref>, ProtoPool <ref type="bibr" target="#b7">[8]</ref> and PIP-Net <ref type="bibr" target="#b8">[9]</ref> in terms of classification accuracy and the ability to encode and detect spatial relations in images, supported by a number of ablations and the study of the cognitive complexity of local explanations derived from the sparsification of QBAFs obtained with ProtoSpArX.</p><note place="foot" n="1" xml:id="foot_1">Please note that the colour of the shapes in the input image has no bearing on the colours of the super-prototypes and edges in the QBAF, which indicate attack and support (see Section 4 for details).</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>The problem of explaining image classifiers is well studied in the literature. Examples include feature attributions <ref type="bibr" target="#b9">[10]</ref>, attention maps <ref type="bibr" target="#b10">[11]</ref> and counterfactual explanations <ref type="bibr" target="#b11">[12]</ref>. While the former can be seen as post-hoc explanations that aim at explaining the decisions of a blackbox classifier, there is also an increasing literature on interpretable-by-design approaches. One interesting interpretable direction is based on prototypical-part-learning <ref type="bibr" target="#b12">[13]</ref>. These approaches were motivated by the observation that class-prototypes <ref type="bibr" target="#b13">[14]</ref> for datasets with simple backgrounds (as in MNIST <ref type="bibr" target="#b14">[15]</ref>) do not generalize well to natural images with more complex backgrounds. To overcome this problem, ProtoPNet <ref type="bibr" target="#b2">[3]</ref> introduced prototypical parts for capturing parts of the class (like the beak or tail of a bird) rather than the whole object (the bird). The original idea has been extended in various directions including prototypes that can be shared among classes <ref type="bibr" target="#b3">[4]</ref>, the integration of prototypical parts into decision trees <ref type="bibr" target="#b4">[5]</ref> and improved similarity functions <ref type="bibr" target="#b7">[8]</ref>. Our ProtoSpArX adds super-prototypes and uses bipolar quantitative argumentation to achieve a better tradeoff between classification performance and interpretability. Speficially, ProtoSpArX extends the SpArX approach <ref type="bibr" target="#b6">[7]</ref>, originally defined for MLPs with tabular data only, to the setting of prototypical-part-learning with images.</p><p>Several other argumentation-based forms of explainability have been proposed, we refer to <ref type="bibr" target="#b15">[16]</ref> for an overview. Other works combine argumentation and image classification, e.g. <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b17">18]</ref> for explaining the outputs of CNNs and <ref type="bibr" target="#b18">[19]</ref> to obtain an interpretable image classifier. ProtoSpArX may also be deemed neuro-symbolic as it combines, end-to-end (see Figure <ref type="figure" target="#fig_2">2</ref>), neural components (the convolutional backbone, the prototype kernels, and the super-prototype kernels) with symbolic argumentation frameworks (QBAFs) drawn from MLPs. However, whereas recent neuro-symbolic systems often combine purely symbolic with purely neural systems <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b20">21]</ref>, ProtoSpArX is based on the observation that MLPs can be seen as QBAFs and vice versa <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b6">7]</ref>. We keep the reasoning process in QBAFs interpretable by sparsification , as in <ref type="bibr" target="#b6">[7]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Preliminaries</head><p>We build up on SpArX <ref type="bibr" target="#b6">[7]</ref>, a post-hoc explanation method that aims at generating structurally faithful explanations for MLPs. SpArX exploits that MLPs can be understood as Quantitative Bipolar Argumentation Frameworks (QBAFs) <ref type="bibr" target="#b21">[22]</ref>. QBAFs can be seen as graphical reasoning models whose nodes represent arguments and whose edges represent attack or support relations between the arguments, each with a (negative or positive, respectively) intensity value <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b25">26]</ref>.</p><p>Arguments in QBAFs are abstract entities (what makes them arguments is that they are in dialectical relationships). To capture MLPs as in <ref type="bibr" target="#b21">[22]</ref>, these abstract arguments represent input features, hidden neurons and output classifications, and the graphical structure of QBAFs mirror the MLP. This correspondence allows representing MLPs faithfully by QBAFs, but the QBAF representation is not useful for interpretability and explainability, because the QBAF has the same size as the original MLP. Thus, SpArX clusters neurons with similar activations and summarizes each cluster as a single argument <ref type="bibr" target="#b6">[7]</ref>. Experiments with tabular data show that SpArX can give explanations that are both sparse and faithful <ref type="bibr" target="#b6">[7]</ref>.</p><p>In this work, we extend SpArX to make ProtoSpArX interpretable and explainable. An illustration is given in Figure <ref type="figure" target="#fig_2">2</ref>: neurons in the MLP component of ProtoSpArX are treated as arguments, alongside the similarity scores from the super-prototypes, which serve as the input features for the MLP in our architecture (see the examples in Section 4 for further details on this illustration). Similarly to the original SpArX, we experiment with sparsification by various compression ratios (Section 6.4), showing that ProtoSpArX can provide explanations that are both sparse and faithful for image classification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Method</head><p>Figure <ref type="figure" target="#fig_2">2</ref> shows the architecture of ProtoSpArX. ProtoSpArX consists of a convolutional backbone 𝑓 with weights 𝑊 𝑐𝑜𝑛𝑣 , a prototype layer 𝒫, a Channel-Wise Max (CWM) layer 𝒞𝒲ℳ, a Super-Prototype kernel 𝒮𝒫 followed by an MLP ℳℒ𝒫 with weights 𝑊 ℳℒ𝒫 , mapped onto a QBAF for interpretability and explainability purposes. We discuss each component in turn, assuming that inputs are images and the classification task amounts to predicting a class in the set 𝐾 (|𝐾| ≥ 2).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Prototypes</head><p>Let 𝑧 = 𝑓 (𝑥) be the convolutional output for an input image 𝑥, where the output tensor 𝑧 has shape 𝐻 × 𝑊 × 𝐷 with height 𝐻, width 𝑊 and 𝐷 channels. This output tensor serves as input to the prototype layer, 𝒫. which represents prototypical-parts. 𝒫 consists of 𝑁 prototypes 𝑃 = {𝑝 𝑖 } 𝑁 𝑖=1 with shapes 𝐻 1 × 𝑊 1 × 𝐷 (we have used 𝐻 1 = 𝑊 1 = 1 in all experiments). For each prototype 𝑝 𝑖 ∈ 𝑃 and every 𝐻 1 × 𝑊 1 × 𝐷 sub-tensor 𝑧 𝑗 of 𝑧, the prototype layer 𝒫 computes the cosine similarity</p><formula xml:id="formula_0">𝒞𝒮(𝑝 𝑖 , 𝑧 𝑗 ) = 𝑝 𝑖 • 𝑧 𝑗 ‖𝑝 𝑖 ‖‖𝑧 𝑗 ‖<label>(1)</label></formula><p>and outputs a similarity map</p><formula xml:id="formula_1">𝒮ℳ 𝑖 = 𝒞𝒮 𝑧 𝑗 ∈𝑧 (𝑝 𝑖 , 𝑧 𝑗 )<label>(2)</label></formula><p>with shape 𝐻 × 𝑊 for each prototype 𝑝 𝑖 ∈ 𝑃 . Intuitively, 𝒮ℳ 𝑖 indicates how similar the prototypical-part 𝑝 𝑖 is to patches of the input image 𝑥 in the latent space. We implemented 𝒮ℳ using the 2D convolution operator *. It generates 𝒮ℳ 𝑖 by convoluting the normalized convolutional output</p><formula xml:id="formula_2">𝑧 ˆ= 𝑧 ‖𝑧‖ = [︀ 𝑧 𝑗 ‖𝑧 𝑗 ‖ ]︀ 𝑧 𝑗 ∈𝑧 with a normalized prototype kernel 𝑝 𝑖 ˆ= 𝑝 𝑖 ‖𝑝 𝑖 ‖ , 𝒮ℳ 𝑖 = 𝑧 ˆ* 𝑝 𝑖 ˆ.</formula><p>Since cosine similarity is used for the prototype layer, the values in similarity maps can be both positive and negative in the range [−1, 1]. The output dimensions of the prototype layer are 𝐻 × 𝑊 × 𝑁 .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Channel-Wise Max</head><p>The Channel-Wise Max layer aims to both localize and extract the max value of each similarity map while maintaining its dimensions. 𝒞𝒲ℳ takes the similarity maps as input and extracts the maximum value from each input channel by passing the maximum value and setting all other values to zero while preserving the input dimensions. Formally, for every similarity value 𝑠 ∈ 𝒮ℳ 𝑖 , the 𝑖 𝑡ℎ similarity map, the channel-wise max filter 𝒞𝒲ℳ 𝑖 retains the highest value 𝑠 𝑚𝑎𝑥 = max(𝒮ℳ 𝑖 ) within the map and assigns a value of zero to the remaining elements:</p><formula xml:id="formula_3">𝒞𝒲ℳ 𝑖 = {︃ 𝑠 𝑚𝑎𝑥 if s = max(𝒮ℳ 𝑖 ); 0 otherwise.<label>(3)</label></formula><p>The output dimensions of 𝒞𝒲ℳ are still 𝐻 × 𝑊 × 𝑁 .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Super-Prototypes and Similarity Scores</head><p>The super-prototypes kernel takes the output of the channel-wise max layer as input and provides a single representation per class. This is done in three steps.</p><p>In the first step, for each class 𝑘 ∈ 𝐾, 𝑀 linear combinations of the channel-wise max filters, denoted by ℒ𝒞 𝑘 𝑖 where 𝑖 ∈ {1, . . . , 𝑀 }, are learned. Here, 𝑀 is a customisable hyper-parameter of the model (𝑀 = 32 achieved the best results in the experiments). Formally:</p><formula xml:id="formula_4">ℒ𝒞 𝑘 𝑖 = 𝑁 ∑︁ 𝑗=1 𝑤 ℒ𝒞 𝑘 𝑖 𝑗 • 𝒞𝒲ℳ 𝑗<label>(4)</label></formula><p>where 𝑤</p><formula xml:id="formula_5">ℒ𝒞 𝑘 𝑖 𝑗</formula><p>is a trainable scalar weight. We let 𝑊 ℒ𝒞 denote the vector summarizing all these weights. This operation can be implemented with 𝑀 convolutions with kernel shape 1 × 1 × 𝑁 using the 𝑁 channel-wise max filters as input.</p><p>In the second step, the super-prototypes are constructed. Each linear combination ℒ𝒞 𝑘 𝑖 is then multiplied by a trainable weight matrix 𝑊 𝒮𝒫 𝑘 𝑖 with shape 𝐻 × 𝑊 to obtain a single super-prototype for each class from the 𝑀 linear combinations. This means that the number of super-prototypes is equal to the number of classes |𝐾|. Each super-prototype 𝒮𝒫 𝑘 is then computed as follows:</p><formula xml:id="formula_6">𝒮𝒫 𝑘 = 𝑀 ∑︁ 𝑖=1 ℒ𝒞 𝑘 𝑖 ⊙ 𝑊 𝒮𝒫 𝑘 𝑖 ,<label>(5)</label></formula><p>where ⊙ denotes element-wise product. Each super-prototype has the shape 𝐻 × 𝑊 . By utilizing the receptive field of the convolutional output 𝑓 to rescale the similarity maps 𝒮ℳ to the input dimensions, the super-prototypes can be visualized on the input image 𝑥 employing Equation <ref type="formula" target="#formula_6">5</ref>, as illustrated next.</p><p>Example 1. Figure <ref type="figure" target="#fig_2">2</ref> illustrates the visualization of the super-prototypes on the input image, where the colours indicate support (green) for Class 1 at the bottom and attack (red) against Class 0 at the top. Note that, since we are dealing with binary classification, the supporting regions for accepting one class are the attacking regions for accepting the other class. Also, the colours in the input images are irrelevant to the classification task which associates Class 1 to images with a triangle in the left column and a circle in the right column on the same row, no matter their colour.</p><p>In the third and final step, a single similarity score 𝑠𝑠 𝑘 is computed for each super-prototype by summing up the values 𝑠𝑝 ∈ 𝒮𝒫 𝑘 :</p><formula xml:id="formula_7">𝑠𝑠 𝑘 = ∑︁ 𝑠𝑝∈𝒮𝒫 𝑘 𝑠𝑝.<label>(6)</label></formula><p>Equations 5 and 6 can be simultaneously implemented by employing |𝐾| convolutions with a kernel shape of 𝐻 × 𝑊 × 𝑀 , while taking the 𝑀 linear combinations for each class as input.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Classifier Layer</head><p>Using the similarity scores as input, ℳℒ𝒫 is used for classification. After the training phase, ℳℒ𝒫 is converted to a QBAF (c.f., Section 3 -this involves sparsifying the underlying ℳℒ𝒫 and then translating it to a QBAF). The obtained QBAF can provide reasons for and against assigning an input 𝑥 to a specific class, making ProtoSpArX interpretable as illustrated next.</p><p>Example 2. The (sparsified) 1-hidden layer-MLP/QBAF in Figure <ref type="figure" target="#fig_2">2</ref> can be interpreted as follows:</p><p>• Super-prototype of Class 1 supports and attacks, with high intensity, the arguments corresponding to, respectively, the bottom and top neurons in the hidden layer; • Conversely, the super-prototype for Class 0 attacks and supports, with low intensity, the same arguments; • The hidden clusters and output neurons are visualized using the super-prototypes they "propagate" through the MLP, in the sense that these super-prototypes support them, e.g. the super-prototype for Class 0 supports the top cluster in the hidden layer and the predicted Class 1 is supported by the super-prototype for Class 1.</p><p>Overall, this interpretation indicates that the predicted Class 1 for the input image is supported by the presence of a circle in the bottom left corner and a triangle in the bottom right corner, while also pointing to the reasoning of the MLP in terms of the super-prototypes used.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Training ProtoSpArX</head><p>Unlike other prototypical-part-learning approaches, the training phase of ProtoSpArX is done in one step. This means that all amongst the prototype layer, the super-prototype kernels and the classifier are trained at once without a need for freezing the weight of the classifier first and fine-tuning it later. For the 𝑖 𝑡ℎ data point in a dataset of size 𝑛, with the data point belonging to class label 𝑦 𝑖 ∈ 𝐾 (where 𝐾 is the set of class labels), the target class super-prototype should obtain a high similarity score 𝑠𝑠 𝑦 𝑖 . Moreover, the corresponding similarity scores for the super-prototypes of other classes ({𝑠𝑠 𝑘 } |𝐾| 𝑘=1,𝑘̸ =𝑦 𝑖 ) should be low. Simultaneously, the output of the classifier should be 1 for the target class 𝑦 𝑖 and 0 for the other classes. Therefore, we integrate in the loss function two components 𝐿 𝒮𝒫 and 𝐿 𝑐𝑙𝑠 for the corresponding objectives. Definition 1. The total loss function ℒ that we aim to minimize is:</p><formula xml:id="formula_8">ℒ = 𝐿 𝐶𝐸 + 𝐿 𝒮𝒫 (7)</formula><p>where 𝐿 𝐶𝐸 is the Cross-Entropy loss and 𝐿 𝒮𝒫 is a regularization term that aims at associating super-prototypes with their associated classes by penalizing the similarity to wrong classes and rewarding the similarity to the correct class :</p><formula xml:id="formula_9">𝐿 𝐶𝐸 = 𝑛 ∑︁ 𝑖=1 𝐶𝑟𝑠𝐸𝑛𝑡(𝐺(𝑥 𝑖 ), 𝑦 𝑖 ), (8) 𝐿 𝒮𝒫 = 𝑛 ∑︁ 𝑖=1 (( |𝐾| ∑︁ 𝑘=1 𝑘̸ =𝑦 𝑖 𝑠𝑠 𝑘 ) − 𝑠𝑠 𝑦 𝑖 );<label>(9)</label></formula><p>where 𝐺(𝑥 𝑖 ) denotes the output of ProtoSpArX.</p><p>Given the definition of total loss function ℒ, we then use the Adam optimizer <ref type="bibr" target="#b26">[27]</ref> to tune the convolutional weights 𝑊 𝑐𝑜𝑛𝑣 , prototypes 𝒫, linear combination weights 𝑊 ℒ𝒞 , super-prototype weights 𝑊 𝒮𝒫 , and MLP weights 𝑊 ℳℒ𝒫 in an end-to-end fashion to minimize ℒ:</p><formula xml:id="formula_10">min 𝑊 𝑐𝑜𝑛𝑣 ,𝒫,𝑊 ℒ𝒞 ,𝑊 𝒮𝒫 ℒ(𝑊 𝑐𝑜𝑛𝑣 , 𝒫, 𝑊 ℒ𝒞 , 𝑊 𝒮𝒫 ) (10)</formula><p>Finally, for the projection of prototypes, we follow the same approach as ProtoPNet <ref type="bibr" target="#b2">[3]</ref> to push the prototypes to the latent representation of the closest image patch from the input space in the convolutional output so that each prototype has a global interpretable representation. We have compared our approach with the state-of-the-art prototypical-partlearning models ProtoPNet <ref type="bibr" target="#b2">[3]</ref>, Pro-toTrees <ref type="bibr" target="#b4">[5]</ref>, ProtoPShare <ref type="bibr" target="#b3">[4]</ref>, ProtoPool <ref type="bibr" target="#b7">[8]</ref> and PIP-Net <ref type="bibr" target="#b8">[9]</ref>. We have conducted four sets of experiments to evaluate the classification performance (Section 6.1), the role of each layer on the model's performance by an ablation study (Section 6.2), the ability to encode and detect spatial relationships in the input (Section 6.3), and the cognitive complexity of explanations naturally drawn from ProtoSpArX (Section 6.4). Notice that, for all the experiments, we use classification accuracy as our performance measure, as is the case with the baselines. For all the experiments, we have used CUB-200-2011 (CUB) <ref type="bibr" target="#b0">[1]</ref> and Stanford Cars (Cars) <ref type="bibr" target="#b27">[28]</ref>, which are the standard benchmarks for prototypical-part learning models. To assess the ability to encode spatial relationships, we use the SHAPES dataset <ref type="bibr" target="#b5">[6]</ref> adapted to binary classification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Classification Performance</head><p>The first two columns in Table <ref type="table" target="#tab_0">1</ref> show the accuracy of our method compared to the baselines, for CUB and Cars. For both datasets, our ProtoSpArX outperforms the other approaches. Ablation studies on CUB and Cars in Table <ref type="table" target="#tab_1">2</ref> show that ProtoSpArX achieves the best accuracy when employing super-prototypes atop the cosine similarity prototype layer, together with an MLP as classifier component. Alternatively, the L2distance-based prototype layer, as utilized in ProtoPNet, can be employed in conjunction with a fixed logistic regression layer for classification (fine-tuned in the second training phase in ProtoPNet). Notably, ProtoSpArX surpasses the performance of state-of-the-art methods even when utilizing a fixed logistic regression layer, instead of an MLP as the classifier (but performs best with the MLP).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Ablation study</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.">Spatial Correlations</head><p>To assess whether different image classification methods can account for spatial relationships between prototypical-parts in images, we adapted the SHAPES dataset <ref type="bibr" target="#b5">[6]</ref> as a benchmark. We randomly generated synthetic images containing 3 × 3 grids of circles, triangles, and squares in different colours (red, green, and blue), so that an image is assigned Class 1 if a triangle is located in the first column and a circle is located in the third column of the same row<ref type="foot" target="#foot_0">2</ref> , and Class 0 otherwise. The resulting dataset comprises 10,000 28 × 28 images with balanced binary class labels. Figure <ref type="figure" target="#fig_4">4</ref> shows examples of images in the dataset. The first row contains images from class 1, where a triangle is located in the first column and a circle is located in the third column of the same row. The second row contains images from class 0, where this condition is not met. The last column in Table <ref type="table" target="#tab_0">1</ref> compares the accuracy of the baselines for this SHAPES dataset. ProtoSpArX, with an accuracy of 98.4% ± 0.2%, significantly outperforms all other approaches. The accuracy of the other approaches is around 50%, suggesting that these models are unable to infer class labels solely based on the presence of prototypes in images, being unable to infer information about the relative placement of the prototypical-parts in the images. ProtoSpArX addresses this limitation by using channel-wise max and super-prototypes, which enable the   <ref type="table">3</ref>: Comparison of the number of (super-)prototypes for different approaches.</p><p>For ProtoPool and ProtoTrees, we also report the ensemble cases.</p><p>The combination of super-prototypes and QBAFs can serve as the basis for humanreadable local explanations for the outputs of ProtoSpArX. Figure <ref type="figure" target="#fig_2">2</ref> showed a generated local explanation for a data point in SHAPES (see the examples in Section 4 for details on this illustration). Figure <ref type="figure" target="#fig_3">3</ref> illustrates a local explanation generated for a data instance from the CUB dataset, specifically for the target class "Baird Sparrow." The green overlay on the super-prototype highlights the region in the input image that supports the correct classification, while the red region identifies the attacked or unsupported portion of the input. This super-prototype can be interpreted as the bird's head resembles a "Baird Sparrow, " but its tail is atypical for this species. We have added this reading manually here for illustration, simulating how to read the super-prototypes. We leave the automatic generation of natural language interpretations of the super-prototypes and the QBAF for future work. We can use the number of representative (super-)prototypes as a measure of the cognitive complexity of the explanations drawn from prototypical-part-learning methods. Table <ref type="table">3</ref> compares the number of (super-)prototypes for each approach, before and after the pruning phase if applicable. Since ProtoPool and ProtoTrees use an ensemble of multiple models, we have also reported these cases. Like ProtoPool, our ProtoSpArX does not have an additional phase for pruning unnecessary prototypes. 
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.4.">Cognitive Complexity</head><p>The combination of super-prototypes and QBAFs can serve as the basis for human-readable local explanations for the outputs of ProtoSpArX. Figure <ref type="figure" target="#fig_2">2</ref> showed a generated local explanation for a data point in SHAPES (see the examples in Section 4 for details on this illustration). Figure <ref type="figure" target="#fig_3">3</ref> illustrates a local explanation generated for a data instance from the CUB dataset, specifically for the target class "Baird Sparrow". The green overlay on the super-prototype highlights the region in the input image that supports the correct classification, while the red region identifies the attacked or unsupported portion of the input. This super-prototype can be read as: the bird's head resembles a "Baird Sparrow", but its tail is atypical for this species. We have added this reading manually here for illustration, simulating how to read the super-prototypes. We leave the automatic generation of natural language interpretations of the super-prototypes and the QBAF for future work.</p><p>We can use the number of representative (super-)prototypes as a measure of the cognitive complexity of the explanations drawn from prototypical-part-learning methods. Table <ref type="table" target="#tab_2">3</ref> compares the number of (super-)prototypes for each approach, before and after the pruning phase where applicable. Since ProtoPool and ProtoTrees use an ensemble of multiple models, we have also reported these cases. Like ProtoPool, our ProtoSpArX does not have an additional phase for pruning unnecessary prototypes. The number of super-prototypes in our approach is equal to the number of classes, since ProtoSpArX has one super-prototype per class. Notice that, when using a fixed classification layer (as in ProtoPNet) for ProtoSpArX, local explanations require only one super-prototype, while other approaches need multiple prototypes.</p><p>The global cognitive complexity of ProtoSpArX should additionally include the number of hidden nodes in the MLP, since each node in the resulting QBAF would be part of the explanation. This complexity can be controlled by sparsification as in SpArX <ref type="bibr" target="#b6">[7]</ref>, with a trade-off between the compression ratio of the MLP classifier and the accuracy of the resulting ProtoSpArX model. For illustration, considering a one-layer MLP and 10 arguments in the QBAF after the sparsification of the MLP, the cognitive complexity of the QBAF would be 210 and 206 for CUB and Cars, respectively (one super-prototype argument per class, i.e. 200 for CUB and 196 for Cars, plus the 10 arguments resulting from sparsification).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>We proposed ProtoSpArX, a novel prototypical-part-learning approach. ProtoSpArX learns a single super-prototype per class. The super-prototypes integrate multiple prototypical-parts shared between different classes into a representative prototype per class. It can be trained end-to-end and does not require an additional pruning phase. As opposed to previous prototypical-partlearning approaches, the use of super-prototypes allows ProtoSpArX to capture spatial relationships between prototypical-parts. Using an MLP for classification allows ProtoSpArX to capture non-linear relationships between super-prototypes, while applying the SpArX methodology allows explaining the classification outcome. Experiments show that ProtoSpArX outperforms state-of-the-art prototypical-part-learning approaches in terms of accuracy and the ability to model spatial relationships between prototypical-parts.</p><p>Future directions include expanding ProtoSpArX's capabilities to encompass multi-modal data. Additionally, we will investigate the implementation of a user-model feedback loop to enhance the debugging process for super-prototypes. Further, we plan to deploy ProtoSpArX with real data, e.g. in the medical domain. Finally, we plan to explore various options for obtaining explanations from ProtoSpArX, including interactive forms thereof <ref type="bibr" target="#b15">[16]</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Conventional prototypes (a) versus the proposed superprototypes (b) for a sample in the CUB dataset [1].</figDesc><graphic coords="1,410.09,452.78,92.84,72.51" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Architecture of ProtoSpArX (see Section 4 for the details), illustrated with a sample from the SHAPES dataset [6].</figDesc><graphic coords="2,89.29,75.68,416.69,106.16" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Example of ProtoSpArX explanation for a Baird Sparrow image from CUB dataset. The prototypical-parts are learned from the training image patches. The super-prototype highlights the supported regions with a green overlay and the attacked regions with a red overlay. The QBAF outlines the reasoning of the MLP while assigning the probability of 0.9 for classifying the input as Baird Sparrow. model to infer the spatial correlation of different prototypical-parts in the image when needed for classification.</figDesc><graphic coords="9,130.96,55.84,333.35,214.42" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Examples from adapted SHAPES dataset, with binary class labels.</figDesc><graphic coords="9,110.13,375.01,166.68,126.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>83.4 ± 0.2 89.3 ± 0.2 98.4 ± 0.2Table 1 :</head><label>1</label><figDesc>Accuracy of ProtoSpArX and other prototypical-part-learning methods for different datasets. (Best accuracy in bold)</figDesc><table><row><cell>Method</cell><cell>CUB</cell><cell>Accuracy Cars</cell><cell>SHAPES</cell></row><row><cell cols="4">ProtoPNet ProtoPShare 74.7 ± 0.2 86.4 ± 0.2 50.4 ± 0.8 79.2 ± 0.1 86.1 ± 0.1 51.1 ± 0.7 ProtoPool 80.3 ± 0.2 88.9 ± 0.1 50.8 ± 0.6 ProtoTrees 82.2 ± 0.7 86.6 ± 0.2 51.4 ± 0.7 PIP-Net 82.0 ± 0.3 86.5 ± 0.3 50.6 ± 0.6 ProtoSpArX</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Ablation study with different prototype layers and classifiers with respect to a super-prototype kernel. (Best accuracy in bold)</figDesc><table><row><cell cols="3">Super-Prototype Classifier Prototype Layer</cell><cell>Accuracy CUB Cars</cell></row><row><cell>------------</cell><cell>L2 L2 Cosine Cosine</cell><cell cols="2">Fixed 79.2 86.1 MLP 81.2 86.7 Fixed 81.5 87.2 MLP 81.8 87.8</cell></row><row><cell></cell><cell>L2</cell><cell cols="2">Fixed 81.0 87.3</cell></row><row><cell></cell><cell>L2</cell><cell>MLP</cell><cell>81.6 87.9</cell></row><row><cell></cell><cell>Cosine</cell><cell cols="2">Fixed 82.7 88.9</cell></row><row><cell></cell><cell>Cosine</cell><cell cols="2">MLP 83.4 89.3</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">This criterion can be customized to reflect the user's preferences. For example, the dataset could assign Class 1 to images with a square in the first column, a blue triangle in the second, and a red square in the third.</note>
		</body>
		<back>

			<div type="funding">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>F. Toni) https://profiles.imperial.ac.uk/h.ayoobi (H. Ayoobi); https://profiles.cardiff.ac.uk/staff/potykan (N. Potyka); https://www.doc.ic.ac.uk/~ft/ (F. Toni)</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">C</forename><surname>Wah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Branson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Welinder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Perona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<idno>CNS-TR-2011-001</idno>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
		<respStmt>
			<orgName>California Institute of Technology</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</title>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
		<idno type="DOI">10.1038/S42256-019-0048-X</idno>
		<ptr target="https://doi.org/10.1038/s42256-019-0048-x.doi:10.1038/S42256-019-0048-X" />
	</analytic>
	<monogr>
		<title level="j">Nat. Mach. Intell</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="206" to="215" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">This looks like that: deep learning for interpretable image recognition</title>
		<author>
			<persName><forename type="first">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barnett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">K</forename><surname>Su</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">ProtoPShare: Prototypical parts sharing for similarity discovery in interpretable image classification</title>
		<author>
			<persName><forename type="first">D</forename><surname>Rymarczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Struski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tabor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zielinski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Zhu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Ooi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Miao</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1420" to="1430" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Neural prototype trees for interpretable fine-grained image recognition</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nauta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Van Bree</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Seifert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2021</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="14933" to="14943" />
		</imprint>
	</monogr>
	<note>Computer Vision Foundation / IEEE</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Learning to reason: End-to-end module networks for visual question answering</title>
		<author>
			<persName><forename type="first">R</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Andreas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rohrbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Saenko</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCV.2017.93</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Computer Vision (ICCV)</title>
				<imprint>
			<date type="published" when="2017">2017. 2017</date>
			<biblScope unit="page" from="804" to="813" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">SpArX: Sparse argumentative explanations for neural networks</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Potyka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Artificial Intelligence (ECAI)</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Gal</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Nowé</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">J</forename><surname>Nalepa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Fairstein</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Radulescu</surname></persName>
		</editor>
		<imprint>
			<publisher>IOS Press</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">372</biblScope>
			<biblScope unit="page" from="149" to="156" />
		</imprint>
	</monogr>
	<note>Frontiers in Artificial Intelligence and Applications</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Interpretable image classification with differentiable prototypes assignment</title>
		<author>
			<persName><forename type="first">D</forename><surname>Rymarczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Struski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Górszczak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lewandowska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tabor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zieliński</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-19775-8_21</idno>
		<ptr target="https://doi.org/10.1007/978-3-031-19775-8_21" />
	</analytic>
	<monogr>
		<title level="m">Computer Vision -ECCV 2022: 17th European Conference</title>
				<meeting><address><addrLine>Tel Aviv, Israel; Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2022">October 23-27, 2022. 2022</date>
			<biblScope unit="page" from="351" to="368" />
		</imprint>
	</monogr>
	<note>Proceedings, Part XII</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Pip-net: Patch-based intuitive prototypes for interpretable image classification</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nauta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schlötterer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Keulen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Seifert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023</title>
				<meeting>the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Computer Vision Foundation / IEEE</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Learning important features through propagating activation differences</title>
		<author>
			<persName><forename type="first">A</forename><surname>Shrikumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Greenside</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kundaje</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>ICML</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Integrated grad-cam: Sensitivity-aware visual explanation of deep convolutional networks via integrated gradient-based scoring</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sattarzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sudhakar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Plataniotis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Counterfactual visual explanations</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ernst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Batra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Parikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lee</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>ICML, PMLR</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Prototypical networks for few-shot learning</title>
		<author>
			<persName><forename type="first">J</forename><surname>Snell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Swersky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions</title>
		<author>
			<persName><forename type="first">O</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">32</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">The MNIST database of handwritten digit images for machine learning research</title>
		<author>
			<persName><forename type="first">L</forename><surname>Deng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Signal Processing Magazine</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="141" to="142" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Argumentative XAI: A survey</title>
		<author>
			<persName><forename type="first">K</forename><surname>Cyras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Albini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Baroni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021</title>
				<meeting>the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">DAX: deep argumentative explanation for neural networks</title>
		<author>
			<persName><forename type="first">E</forename><surname>Albini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lertvittayakumjorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno>CoRR abs/2012.05766</idno>
		<ptr target="https://arxiv.org/abs/2012.05766.arXiv:2012.05766" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Neural QBAFs: Explaining neural networks under lrp-based argumentation frameworks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Sukpanichnant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lertvittayakumjorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-08421-8_30</idno>
		<ptr target="https://doi.org/10.1007/978-3-031-08421-8_30" />
	</analytic>
	<monogr>
		<title level="m">AIxIA 2021 -Advances in Artificial Intelligence -20th International Conference of the Italian Association for Artificial Intelligence, Virtual Event</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">December 1-3, 2021. 2021</date>
			<biblScope unit="volume">13196</biblScope>
			<biblScope unit="page" from="429" to="444" />
		</imprint>
	</monogr>
	<note>Revised Selected Papers</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Argue to learn: Accelerated argumentationbased learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICMLA52953.2021.00183</idno>
	</analytic>
	<monogr>
		<title level="m">20th IEEE International Conference on Machine Learning and Applications (ICMLA)</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Neurasp: Embracing neural networks into answer set programming</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ishay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Joint Conference on Artificial Intelligence</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Bessiere</surname></persName>
		</editor>
		<imprint>
			<publisher>ijcai</publisher>
			<date type="published" when="2020">2020. 2020</date>
			<biblScope unit="page" from="1755" to="1762" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Deep symbolic learning: Discovering symbols and rules from perceptions</title>
		<author>
			<persName><forename type="first">A</forename><surname>Daniele</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Campari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Malhotra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Serafini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th</title>
				<meeting>the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th<address><addrLine>Macao, SAR, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023-08">August 2023. 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Interpreting neural networks as quantitative argumentation frameworks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Potyka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence</title>
				<meeting>the Thirty-Third AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>AAAI-21</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Argumentation-based online incremental learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/TASE.2021.3120837</idno>
		<ptr target="https://doi.org/10.1109/TASE.2021.3120837.doi:10.1109/TASE.2021.3120837" />
	</analytic>
	<monogr>
		<title level="j">IEEE Trans Autom. Sci. Eng</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="3419" to="3433" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Argue to learn: Accelerated argumentationbased learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICMLA52953.2021.00183</idno>
		<ptr target="https://doi.org/10.1109/ICMLA52953.2021.00183.doi:10.1109/ICMLA52953.2021.00183" />
	</analytic>
	<monogr>
		<title level="m">20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021</title>
				<editor>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Wani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><forename type="middle">K</forename><surname>Sethi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Shi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Qu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Raicu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Jin</surname></persName>
		</editor>
		<meeting><address><addrLine>Pasadena, CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2021">December 13-16, 2021. 2021</date>
			<biblScope unit="page" from="1118" to="1123" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Explain what you see: Openended segmentation and recognition of occluded 3d objects</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Kasaei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICRA48891.2023.10160927</idno>
		<ptr target="https://doi.org/10.1109/ICRA48891.2023.10160927.doi:10.1109/ICRA48891.2023.10160927" />
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Robotics and Automation, ICRA 2023</title>
				<meeting><address><addrLine>London, UK</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2023-06-02">May 29 -June 2, 2023. 2023</date>
			<biblScope unit="page" from="4960" to="4966" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Leofante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dejl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Freedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Gorur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Paulino-Passos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rapberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2405.10729</idno>
		<idno type="arXiv">arXiv:2405.10729</idno>
		<ptr target="/ARXIV.2405.10729" />
		<title level="m">Contestable AI needs computational argumentation</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Adam: A method for stochastic optimization</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
		<ptr target="http://arxiv.org/abs/1412.6980" />
	</analytic>
	<monogr>
		<title level="m">3rd International Conference on Learning Representations, ICLR 2015</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</editor>
		<meeting><address><addrLine>San Diego, CA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">May 7-9, 2015. 2015</date>
		</imprint>
	</monogr>
	<note>Conference Track Proceedings</note>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">3d object representations for fine-grained categorization</title>
		<author>
			<persName><forename type="first">J</forename><surname>Krause</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Stark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fei-Fei</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCVW.2013.77</idno>
	</analytic>
	<monogr>
		<title level="m">2013 IEEE International Conference on Computer Vision Workshops</title>
				<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="554" to="561" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
