<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Argumentative Interpretable Image Classification</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hamed</forename><surname>Ayoobi</surname></persName>
							<email>h.ayoobi@imperial.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computing</orgName>
								<orgName type="institution">Imperial College London</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nico</forename><surname>Potyka</surname></persName>
							<email>potykan@cardiff.ac.uk</email>
							<affiliation key="aff1">
								<orgName type="institution">Cardiff University</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Francesca</forename><surname>Toni</surname></persName>
							<email>f.toni@imperial.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computing</orgName>
								<orgName type="institution">Imperial College London</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Argumentative Interpretable Image Classification</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">FE04B29260CE63BBD6426878146772A0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:16+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Interpretable Image Classification</term>
					<term>Argumentation</term>
					<term>Prototypical-Parts Learning</term>
					<term>XAI</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We propose ProtoSpArX, a novel interpretable deep neural architecture for image classification in the spirit of prototypical-part-learning as found, e.g. in ProtoPNet. While earlier approaches associate every class with multiple prototypical-parts, ProtoSpArX uses super-prototypes that combine prototypical-parts into single class representations. Furthermore, while earlier approaches use interpretable classification layers, e.g. logistic regression in ProtoPNet, ProtoSpArX improves accuracy with multi-layer perceptrons while relying upon an interpretable reading thereof based on a form of argumentation. ProtoSpArX is customisable to user cognitive requirements by a process of sparsification of the multi-layer perceptron/argumentation component. Also, as opposed to other prototypical-part-learning approaches, ProtoSpArX can recognise spatial relations between different prototypical-parts that are from various regions in images, similar to how CNNs capture relations between patterns recognized in earlier layers.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>versus the proposed superprototypes (b) for a sample in the CUB dataset <ref type="bibr" target="#b0">[1]</ref>.</p><p>Deep neural architectures are successful in various tasks, but tend to be mostly inscrutable blackboxes. In high-stakes settings, interpretability is crucial and interpretable models are advocated over black-boxes, especially if they achieve comparable performance <ref type="bibr" target="#b1">[2]</ref>. Prototypical-part learning for image classification amounts to learning prototypical-parts of classes in images by introducing a human-interpretable prototype layer between the convolutional backbone (intutively, it learns patterns in the image space) and the classification component (intuitively, it uses the patterns identified by the backbone to classify an image) of convolutional neural networks <ref type="bibr" target="#b2">[3]</ref>. Prototypicalparts can be seen as patches in images, like the beak or tail of a bird (see Figure <ref type="figure" target="#fig_1">1</ref> (a)). The prototype layer determines the similarity between prototypical-parts and patches in the latent space that the convolutional backbone maps to. Even though some prototypical-parts may correspond to background patches that are meaningless for humans (rather than exclusively meaningful parts in images as in Figure <ref type="figure" target="#fig_1">1</ref> (a)), they allow making transparent classifications, based on clearly defined prototypes, if the classification component is interpretable.</p><p>We propose ProtoSpArX (Section 4, overviewed in Figure <ref type="figure" target="#fig_2">2</ref>), a novel interpretable deep neural architecture for image classification in the spirit of prototypical-part-learning. Similar to ProtoPShare <ref type="bibr" target="#b3">[4]</ref> and ProtoTrees <ref type="bibr" target="#b4">[5]</ref>, ProtoSpArX shares prototypes among classes. However, while these and other prototypical-part-learning approaches associate every class with multiple prototypical parts, ProtoSpArX summarizes them in a single super-prototype per class that encodes spatial relations among them (see Figure <ref type="figure" target="#fig_1">1</ref> (b) for an illustration).</p><p>The use of super-prototypes allows capturing spatial relations between prototypical parts similar to how CNNs capture relations between patterns recognized in earlier layers. As we will show in the experiments with the SHAPES dataset <ref type="bibr" target="#b5">[6]</ref>, these relations are essential for some classification tasks but state-of-the-art prototypical-part-learning approaches are unable to capture them. For example, in Figure <ref type="figure" target="#fig_2">2</ref>, a positive example (Class 1) has a triangle in the left column and a circle in the right column on the same row. Merely recognizing prototypical-parts for triangles and circles in the input image (as in other prototypical-part-learning approaches) is insufficient for determining the class label in this example. ProtoSpArX effectively tackles this challenge by encoding the spatial relations between distinct prototypical-parts using the super-prototype kernels.</p><p>The classifier component in ProtoSpArX is a quantitative bipolar argumentation framework (QBAF) that is trained using the SpArX methodology of <ref type="bibr" target="#b6">[7]</ref>. 
Intuitively, the QBAF uses weighted attacks and supports between super-prototypes and meta-arguments (latent arguments attacked and supported by super-prototypes or other meta-arguments) to classify an image. This is indicated by the red and green arrows in Figure <ref type="figure" target="#fig_2">2</ref><ref type="foot" target="#foot_1">1</ref>.</p><p>We show experimentally (Sections 5 and 6) that ProtoSpArX outperforms the state-of-the-art prototypical-part-learning models ProtoPNet <ref type="bibr" target="#b2">[3]</ref>, ProtoTree <ref type="bibr" target="#b4">[5]</ref>, ProtoPShare <ref type="bibr" target="#b3">[4]</ref>, ProtoPool <ref type="bibr" target="#b7">[8]</ref> and PIP-Net <ref type="bibr" target="#b8">[9]</ref> in terms of classification accuracy and the ability to encode and detect spatial relations in images, supported by a number of ablations and the study of the cognitive complexity of local explanations derived from the sparsification of QBAFs obtained with ProtoSpArX.</p><note place="foot" n="1" xml:id="foot_1">Please note that the colour of the shapes in the input image has no bearing on the colours of the super-prototypes and edges in the QBAF, which indicate attack and support (see Section 4 for details).</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>The problem of explaining image classifiers is well studied in the literature. Examples include feature attributions <ref type="bibr" target="#b9">[10]</ref>, attention maps <ref type="bibr" target="#b10">[11]</ref> and counterfactual explanations <ref type="bibr" target="#b11">[12]</ref>. While the former can be seen as post-hoc explanations that aim at explaining the decisions of a blackbox classifier, there is also an increasing literature on interpretable-by-design approaches. One interesting interpretable direction is based on prototypical-part-learning <ref type="bibr" target="#b12">[13]</ref>. These approaches were motivated by the observation that class-prototypes <ref type="bibr" target="#b13">[14]</ref> for datasets with simple backgrounds (as in MNIST <ref type="bibr" target="#b14">[15]</ref>) do not generalize well to natural images with more complex backgrounds. To overcome this problem, ProtoPNet <ref type="bibr" target="#b2">[3]</ref> introduced prototypical parts for capturing parts of the class (like the beak or tail of a bird) rather than the whole object (the bird). The original idea has been extended in various directions including prototypes that can be shared among classes <ref type="bibr" target="#b3">[4]</ref>, the integration of prototypical parts into decision trees <ref type="bibr" target="#b4">[5]</ref> and improved similarity functions <ref type="bibr" target="#b7">[8]</ref>. Our ProtoSpArX adds super-prototypes and uses bipolar quantitative argumentation to achieve a better tradeoff between classification performance and interpretability. Speficially, ProtoSpArX extends the SpArX approach <ref type="bibr" target="#b6">[7]</ref>, originally defined for MLPs with tabular data only, to the setting of prototypical-part-learning with images.</p><p>Several other argumentation-based forms of explainability have been proposed, we refer to <ref type="bibr" target="#b15">[16]</ref> for an overview. Other works combine argumentation and image classification, e.g. <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b17">18]</ref> for explaining the outputs of CNNs and <ref type="bibr" target="#b18">[19]</ref> to obtain an interpretable image classifier. ProtoSpArX may also be deemed neuro-symbolic as it combines, end-to-end (see Figure <ref type="figure" target="#fig_2">2</ref>), neural components (the convolutional backbone, the prototype kernels, and the super-prototype kernels) with symbolic argumentation frameworks (QBAFs) drawn from MLPs. However, whereas recent neuro-symbolic systems often combine purely symbolic with purely neural systems <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b20">21]</ref>, ProtoSpArX is based on the observation that MLPs can be seen as QBAFs and vice versa <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b6">7]</ref>. We keep the reasoning process in QBAFs interpretable by sparsification , as in <ref type="bibr" target="#b6">[7]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Preliminaries</head><p>We build up on SpArX <ref type="bibr" target="#b6">[7]</ref>, a post-hoc explanation method that aims at generating structurally faithful explanations for MLPs. SpArX exploits that MLPs can be understood as Quantitative Bipolar Argumentation Frameworks (QBAFs) <ref type="bibr" target="#b21">[22]</ref>. QBAFs can be seen as graphical reasoning models whose nodes represent arguments and whose edges represent attack or support relations between the arguments, each with a (negative or positive, respectively) intensity value <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b25">26]</ref>.</p><p>Arguments in QBAFs are abstract entities (what makes them arguments is that they are in dialectical relationships). To capture MLPs as in <ref type="bibr" target="#b21">[22]</ref>, these abstract arguments represent input features, hidden neurons and output classifications, and the graphical structure of QBAFs mirror the MLP. This correspondence allows representing MLPs faithfully by QBAFs, but the QBAF representation is not useful for interpretability and explainability, because the QBAF has the same size as the original MLP. Thus, SpArX clusters neurons with similar activations and summarizes each cluster as a single argument <ref type="bibr" target="#b6">[7]</ref>. Experiments with tabular data show that SpArX can give explanations that are both sparse and faithful <ref type="bibr" target="#b6">[7]</ref>.</p><p>In this work, we extend SpArX to make ProtoSpArX interpretable and explainable. An illustration is given in Figure <ref type="figure" target="#fig_2">2</ref>: neurons in the MLP component of ProtoSpArX are treated as arguments, alongside the similarity scores from the super-prototypes, which serve as the input features for the MLP in our architecture (see the examples in Section 4 for further details on this illustration). Similarly to the original SpArX, we experiment with sparsification by various compression ratios (Section 6.4), showing that ProtoSpArX can provide explanations that are both sparse and faithful for image classification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Method</head><p>Figure <ref type="figure" target="#fig_2">2</ref> shows the architecture of ProtoSpArX. ProtoSpArX consists of a convolutional backbone 𝑓 with weights 𝑊 𝑐𝑜𝑛𝑣 , a prototype layer 𝒫, a Channel-Wise Max (CWM) layer 𝒞𝒲ℳ, a Super-Prototype kernel 𝒮𝒫 followed by an MLP ℳℒ𝒫 with weights 𝑊 ℳℒ𝒫 , mapped onto a QBAF for interpretability and explainability purposes. We discuss each component in turn, assuming that inputs are images and the classification task amounts to predicting a class in the set 𝐾 (|𝐾| ≥ 2).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Prototypes</head><p>Let 𝑧 = 𝑓 (𝑥) be the convolutional output for an input image 𝑥, where the output tensor 𝑧 has shape 𝐻 × 𝑊 × 𝐷 with height 𝐻, width 𝑊 and 𝐷 channels. This output tensor serves as input to the prototype layer, 𝒫. which represents prototypical-parts. 𝒫 consists of 𝑁 prototypes 𝑃 = {𝑝 𝑖 } 𝑁 𝑖=1 with shapes 𝐻 1 × 𝑊 1 × 𝐷 (we have used 𝐻 1 = 𝑊 1 = 1 in all experiments). For each prototype 𝑝 𝑖 ∈ 𝑃 and every 𝐻 1 × 𝑊 1 × 𝐷 sub-tensor 𝑧 𝑗 of 𝑧, the prototype layer 𝒫 computes the cosine similarity</p><formula xml:id="formula_0">𝒞𝒮(𝑝 𝑖 , 𝑧 𝑗 ) = 𝑝 𝑖 • 𝑧 𝑗 ‖𝑝 𝑖 ‖‖𝑧 𝑗 ‖<label>(1)</label></formula><p>and outputs a similarity map</p><formula xml:id="formula_1">𝒮ℳ 𝑖 = 𝒞𝒮 𝑧 𝑗 ∈𝑧 (𝑝 𝑖 , 𝑧 𝑗 )<label>(2)</label></formula><p>with shape 𝐻 × 𝑊 for each prototype 𝑝 𝑖 ∈ 𝑃 . Intuitively, 𝒮ℳ 𝑖 indicates how similar the prototypical-part 𝑝 𝑖 is to patches of the input image 𝑥 in the latent space. We implemented 𝒮ℳ using the 2D convolution operator *. It generates 𝒮ℳ 𝑖 by convoluting the normalized convolutional output</p><formula xml:id="formula_2">𝑧 ˆ= 𝑧 ‖𝑧‖ = [︀ 𝑧 𝑗 ‖𝑧 𝑗 ‖ ]︀ 𝑧 𝑗 ∈𝑧 with a normalized prototype kernel 𝑝 𝑖 ˆ= 𝑝 𝑖 ‖𝑝 𝑖 ‖ , 𝒮ℳ 𝑖 = 𝑧 ˆ* 𝑝 𝑖 ˆ.</formula><p>Since cosine similarity is used for the prototype layer, the values in similarity maps can be both positive and negative in the range [−1, 1]. The output dimensions of the prototype layer are 𝐻 × 𝑊 × 𝑁 .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Channel-Wise Max</head><p>The Channel-Wise Max layer aims to both localize and extract the max value of each similarity map while maintaining its dimensions. 𝒞𝒲ℳ takes the similarity maps as input and extracts the maximum value from each input channel by passing the maximum value and setting all other values to zero while preserving the input dimensions. Formally, for every similarity value 𝑠 ∈ 𝒮ℳ 𝑖 , the 𝑖 𝑡ℎ similarity map, the channel-wise max filter 𝒞𝒲ℳ 𝑖 retains the highest value 𝑠 𝑚𝑎𝑥 = max(𝒮ℳ 𝑖 ) within the map and assigns a value of zero to the remaining elements:</p><formula xml:id="formula_3">𝒞𝒲ℳ 𝑖 = {︃ 𝑠 𝑚𝑎𝑥 if s = max(𝒮ℳ 𝑖 ); 0 otherwise.<label>(3)</label></formula><p>The output dimensions of 𝒞𝒲ℳ are still 𝐻 × 𝑊 × 𝑁 .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Super-Prototypes and Similarity Scores</head><p>The super-prototypes kernel takes the output of the channel-wise max layer as input and provides a single representation per class. This is done in three steps.</p><p>In the first step, for each class 𝑘 ∈ 𝐾, 𝑀 linear combinations of the channel-wise max filters, denoted by ℒ𝒞 𝑘 𝑖 where 𝑖 ∈ {1, . . . , 𝑀 }, are learned. Here, 𝑀 is a customisable hyper-parameter of the model (𝑀 = 32 achieved the best results in the experiments). Formally:</p><formula xml:id="formula_4">ℒ𝒞 𝑘 𝑖 = 𝑁 ∑︁ 𝑗=1 𝑤 ℒ𝒞 𝑘 𝑖 𝑗 • 𝒞𝒲ℳ 𝑗<label>(4)</label></formula><p>where 𝑤</p><formula xml:id="formula_5">ℒ𝒞 𝑘 𝑖 𝑗</formula><p>is a trainable scalar weight. We let 𝑊 ℒ𝒞 denote the vector summarizing all these weights. This operation can be implemented with 𝑀 convolutions with kernel shape 1 × 1 × 𝑁 using the 𝑁 channel-wise max filters as input.</p><p>In the second step, the super-prototypes are constructed. Each linear combination ℒ𝒞 𝑘 𝑖 is then multiplied by a trainable weight matrix 𝑊 𝒮𝒫 𝑘 𝑖 with shape 𝐻 × 𝑊 to obtain a single super-prototype for each class from the 𝑀 linear combinations. This means that the number of super-prototypes is equal to the number of classes |𝐾|. Each super-prototype 𝒮𝒫 𝑘 is then computed as follows:</p><formula xml:id="formula_6">𝒮𝒫 𝑘 = 𝑀 ∑︁ 𝑖=1 ℒ𝒞 𝑘 𝑖 ⊙ 𝑊 𝒮𝒫 𝑘 𝑖 ,<label>(5)</label></formula><p>where ⊙ denotes element-wise product. Each super-prototype has the shape 𝐻 × 𝑊 . By utilizing the receptive field of the convolutional output 𝑓 to rescale the similarity maps 𝒮ℳ to the input dimensions, the super-prototypes can be visualized on the input image 𝑥 employing Equation <ref type="formula" target="#formula_6">5</ref>, as illustrated next.</p><p>Example 1. Figure <ref type="figure" target="#fig_2">2</ref> illustrates the visualization of the super-prototypes on the input image, where the colours indicate support (green) for Class 1 at the bottom and attack (red) against Class 0 at the top. Note that, since we are dealing with binary classification, the supporting regions for accepting one class are the attacking regions for accepting the other class. Also, the colours in the input images are irrelevant to the classification task which associates Class 1 to images with a triangle in the left column and a circle in the right column on the same row, no matter their colour.</p><p>In the third and final step, a single similarity score 𝑠𝑠 𝑘 is computed for each super-prototype by summing up the values 𝑠𝑝 ∈ 𝒮𝒫 𝑘 :</p><formula xml:id="formula_7">𝑠𝑠 𝑘 = ∑︁ 𝑠𝑝∈𝒮𝒫 𝑘 𝑠𝑝.<label>(6)</label></formula><p>Equations 5 and 6 can be simultaneously implemented by employing |𝐾| convolutions with a kernel shape of 𝐻 × 𝑊 × 𝑀 , while taking the 𝑀 linear combinations for each class as input.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Classifier Layer</head><p>Using the similarity scores as input, ℳℒ𝒫 is used for classification. After the training phase, ℳℒ𝒫 is converted to a QBAF (c.f., Section 3 -this involves sparsifying the underlying ℳℒ𝒫 and then translating it to a QBAF). The obtained QBAF can provide reasons for and against assigning an input 𝑥 to a specific class, making ProtoSpArX interpretable as illustrated next.</p><p>Example 2. The (sparsified) 1-hidden layer-MLP/QBAF in Figure <ref type="figure" target="#fig_2">2</ref> can be interpreted as follows:</p><p>• Super-prototype of Class 1 supports and attacks, with high intensity, the arguments corresponding to, respectively, the bottom and top neurons in the hidden layer; • Conversely, the super-prototype for Class 0 attacks and supports, with low intensity, the same arguments; • The hidden clusters and output neurons are visualized using the super-prototypes they "propagate" through the MLP, in the sense that these super-prototypes support them, e.g. the super-prototype for Class 0 supports the top cluster in the hidden layer and the predicted Class 1 is supported by the super-prototype for Class 1.</p><p>Overall, this interpretation indicates that the predicted Class 1 for the input image is supported by the presence of a circle in the bottom left corner and a triangle in the bottom right corner, while also pointing to the reasoning of the MLP in terms of the super-prototypes used.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Training ProtoSpArX</head><p>Unlike other prototypical-part-learning approaches, the training phase of ProtoSpArX is done in one step. This means that all amongst the prototype layer, the super-prototype kernels and the classifier are trained at once without a need for freezing the weight of the classifier first and fine-tuning it later. For the 𝑖 𝑡ℎ data point in a dataset of size 𝑛, with the data point belonging to class label 𝑦 𝑖 ∈ 𝐾 (where 𝐾 is the set of class labels), the target class super-prototype should obtain a high similarity score 𝑠𝑠 𝑦 𝑖 . Moreover, the corresponding similarity scores for the super-prototypes of other classes ({𝑠𝑠 𝑘 } |𝐾| 𝑘=1,𝑘̸ =𝑦 𝑖 ) should be low. Simultaneously, the output of the classifier should be 1 for the target class 𝑦 𝑖 and 0 for the other classes. Therefore, we integrate in the loss function two components 𝐿 𝒮𝒫 and 𝐿 𝑐𝑙𝑠 for the corresponding objectives. Definition 1. The total loss function ℒ that we aim to minimize is:</p><formula xml:id="formula_8">ℒ = 𝐿 𝐶𝐸 + 𝐿 𝒮𝒫 (7)</formula><p>where 𝐿 𝐶𝐸 is the Cross-Entropy loss and 𝐿 𝒮𝒫 is a regularization term that aims at associating super-prototypes with their associated classes by penalizing the similarity to wrong classes and rewarding the similarity to the correct class :</p><formula xml:id="formula_9">𝐿 𝐶𝐸 = 𝑛 ∑︁ 𝑖=1 𝐶𝑟𝑠𝐸𝑛𝑡(𝐺(𝑥 𝑖 ), 𝑦 𝑖 ), (8) 𝐿 𝒮𝒫 = 𝑛 ∑︁ 𝑖=1 (( |𝐾| ∑︁ 𝑘=1 𝑘̸ =𝑦 𝑖 𝑠𝑠 𝑘 ) − 𝑠𝑠 𝑦 𝑖 );<label>(9)</label></formula><p>where 𝐺(𝑥 𝑖 ) denotes the output of ProtoSpArX.</p><p>Given the definition of total loss function ℒ, we then use the Adam optimizer <ref type="bibr" target="#b26">[27]</ref> to tune the convolutional weights 𝑊 𝑐𝑜𝑛𝑣 , prototypes 𝒫, linear combination weights 𝑊 ℒ𝒞 , super-prototype weights 𝑊 𝒮𝒫 , and MLP weights 𝑊 ℳℒ𝒫 in an end-to-end fashion to minimize ℒ:</p><formula xml:id="formula_10">min 𝑊 𝑐𝑜𝑛𝑣 ,𝒫,𝑊 ℒ𝒞 ,𝑊 𝒮𝒫 ℒ(𝑊 𝑐𝑜𝑛𝑣 , 𝒫, 𝑊 ℒ𝒞 , 𝑊 𝒮𝒫 ) (10)</formula><p>Finally, for the projection of prototypes, we follow the same approach as ProtoPNet <ref type="bibr" target="#b2">[3]</ref> to push the prototypes to the latent representation of the closest image patch from the input space in the convolutional output so that each prototype has a global interpretable representation. We have compared our approach with the state-of-the-art prototypical-partlearning models ProtoPNet <ref type="bibr" target="#b2">[3]</ref>, Pro-toTrees <ref type="bibr" target="#b4">[5]</ref>, ProtoPShare <ref type="bibr" target="#b3">[4]</ref>, ProtoPool <ref type="bibr" target="#b7">[8]</ref> and PIP-Net <ref type="bibr" target="#b8">[9]</ref>. We have conducted four sets of experiments to evaluate the classification performance (Section 6.1), the role of each layer on the model's performance by an ablation study (Section 6.2), the ability to encode and detect spatial relationships in the input (Section 6.3), and the cognitive complexity of explanations naturally drawn from ProtoSpArX (Section 6.4). Notice that, for all the experiments, we use classification accuracy as our performance measure, as is the case with the baselines. For all the experiments, we have used CUB-200-2011 (CUB) <ref type="bibr" target="#b0">[1]</ref> and Stanford Cars (Cars) <ref type="bibr" target="#b27">[28]</ref>, which are the standard benchmarks for prototypical-part learning models. To assess the ability to encode spatial relationships, we use the SHAPES dataset <ref type="bibr" target="#b5">[6]</ref> adapted to binary classification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Classification Performance</head><p>The first two columns in Table <ref type="table" target="#tab_0">1</ref> show the accuracy of our method compared to the baselines, for CUB and Cars. For both datasets, our ProtoSpArX outperforms the other approaches. Ablation studies on CUB and Cars in Table <ref type="table" target="#tab_1">2</ref> show that ProtoSpArX achieves the best accuracy when employing super-prototypes atop the cosine similarity prototype layer, together with an MLP as classifier component. Alternatively, the L2distance-based prototype layer, as utilized in ProtoPNet, can be employed in conjunction with a fixed logistic regression layer for classification (fine-tuned in the second training phase in ProtoPNet). Notably, ProtoSpArX surpasses the performance of state-of-the-art methods even when utilizing a fixed logistic regression layer, instead of an MLP as the classifier (but performs best with the MLP).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Ablation study</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.">Spatial Correlations</head><p>To assess whether different image classification methods can account for spatial relationships between prototypical-parts in images, we adapted the SHAPES dataset <ref type="bibr" target="#b5">[6]</ref> as a benchmark. We randomly generated synthetic images containing 3 × 3 grids of circles, triangles, and squares in different colours (red, green, and blue), so that an image is assigned Class 1 if a triangle is located in the first column and a circle is located in the third column of the same row<ref type="foot" target="#foot_0">2</ref> , and Class 0 otherwise. The resulting dataset comprises 10,000 28 × 28 images with balanced binary class labels. Figure <ref type="figure" target="#fig_4">4</ref> shows examples of images in the dataset. The first row contains images from class 1, where a triangle is located in the first column and a circle is located in the third column of the same row. The second row contains images from class 0, where this condition is not met. The last column in Table <ref type="table" target="#tab_0">1</ref> compares the accuracy of the baselines for this SHAPES dataset. ProtoSpArX, with an accuracy of 98.4% ± 0.2%, significantly outperforms all other approaches. The accuracy of the other approaches is around 50%, suggesting that these models are unable to infer class labels solely based on the presence of prototypes in images, being unable to infer information about the relative placement of the prototypical-parts in the images. ProtoSpArX addresses this limitation by using channel-wise max and super-prototypes, which enable the   <ref type="table">3</ref>: Comparison of the number of (super-)prototypes for different approaches.</p><p>For ProtoPool and ProtoTrees, we also report the ensemble cases.</p><p>The combination of super-prototypes and QBAFs can serve as the basis for humanreadable local explanations for the outputs of ProtoSpArX. Figure <ref type="figure" target="#fig_2">2</ref> showed a generated local explanation for a data point in SHAPES (see the examples in Section 4 for details on this illustration). Figure <ref type="figure" target="#fig_3">3</ref> illustrates a local explanation generated for a data instance from the CUB dataset, specifically for the target class "Baird Sparrow." The green overlay on the super-prototype highlights the region in the input image that supports the correct classification, while the red region identifies the attacked or unsupported portion of the input. This super-prototype can be interpreted as the bird's head resembles a "Baird Sparrow, " but its tail is atypical for this species. We have added this reading manually here for illustration, simulating how to read the super-prototypes. We leave the automatic generation of natural language interpretations of the super-prototypes and the QBAF for future work. We can use the number of representative (super-)prototypes as a measure of the cognitive complexity of the explanations drawn from prototypical-part-learning methods. Table <ref type="table">3</ref> compares the number of (super-)prototypes for each approach, before and after the pruning phase if applicable. Since ProtoPool and ProtoTrees use an ensemble of multiple models, we have also reported these cases. Like ProtoPool, our ProtoSpArX does not have an additional phase for pruning unnecessary prototypes. 
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.4.">Cognitive Complexity</head><p>The combination of super-prototypes and QBAFs can serve as the basis for human-readable local explanations for the outputs of ProtoSpArX. Figure <ref type="figure" target="#fig_2">2</ref> showed a generated local explanation for a data point in SHAPES (see the examples in Section 4 for details on this illustration). Figure <ref type="figure" target="#fig_3">3</ref> illustrates a local explanation generated for a data instance from the CUB dataset, specifically for the target class "Baird Sparrow". The green overlay on the super-prototype highlights the region in the input image that supports the correct classification, while the red region identifies the attacked or unsupported portion of the input. This super-prototype can be read as: the bird's head resembles a "Baird Sparrow", but its tail is atypical for this species. We have added this reading manually here for illustration, simulating how to read the super-prototypes. We leave the automatic generation of natural language interpretations of the super-prototypes and the QBAF for future work.</p><p>We can use the number of representative (super-)prototypes as a measure of the cognitive complexity of the explanations drawn from prototypical-part-learning methods. Table <ref type="table" target="#tab_2">3</ref> compares the number of (super-)prototypes for each approach, before and after the pruning phase where applicable. Since ProtoPool and ProtoTrees use an ensemble of multiple models, we have also reported these cases. Like ProtoPool, our ProtoSpArX does not have an additional phase for pruning unnecessary prototypes. The number of super-prototypes in our approach is equal to the number of classes, since ProtoSpArX has one super-prototype per class. Notice that, when using a fixed classification layer (as in ProtoPNet) for ProtoSpArX, local explanations require only one super-prototype, while other approaches need multiple prototypes.</p><p>The global cognitive complexity of ProtoSpArX should additionally include the number of hidden nodes in the MLP, since each node in the resulting QBAF would be part of the explanation. This complexity can be controlled by sparsification as in SpArX <ref type="bibr" target="#b6">[7]</ref>, with a trade-off between the compression ratio of the MLP classifier and the accuracy of the resulting ProtoSpArX model. For illustration, considering a one-layer MLP and 10 arguments in the QBAF after the sparsification of the MLP, the cognitive complexity of the QBAF would be 210 and 206 for CUB and Cars, respectively (one super-prototype argument per class, i.e. 200 for CUB and 196 for Cars, plus the 10 arguments resulting from sparsification).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>We proposed ProtoSpArX, a novel prototypical-part-learning approach. ProtoSpArX learns a single super-prototype per class. The super-prototypes integrate multiple prototypical-parts shared between different classes into a representative prototype per class. It can be trained end-to-end and does not require an additional pruning phase. As opposed to previous prototypical-partlearning approaches, the use of super-prototypes allows ProtoSpArX to capture spatial relationships between prototypical-parts. Using an MLP for classification allows ProtoSpArX to capture non-linear relationships between super-prototypes, while applying the SpArX methodology allows explaining the classification outcome. Experiments show that ProtoSpArX outperforms state-of-the-art prototypical-part-learning approaches in terms of accuracy and the ability to model spatial relationships between prototypical-parts.</p><p>Future directions include expanding ProtoSpArX's capabilities to encompass multi-modal data. Additionally, we will investigate the implementation of a user-model feedback loop to enhance the debugging process for super-prototypes. Further, we plan to deploy ProtoSpArX with real data, e.g. in the medical domain. Finally, we plan to explore various options for obtaining explanations from ProtoSpArX, including interactive forms thereof <ref type="bibr" target="#b15">[16]</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Conventional prototypes (a) versus the proposed superprototypes (b) for a sample in the CUB dataset [1].</figDesc><graphic coords="1,410.09,452.78,92.84,72.51" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Architecture of ProtoSpArX (see Section 4 for the details), illustrated with a sample from the SHAPES dataset [6].</figDesc><graphic coords="2,89.29,75.68,416.69,106.16" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Example of ProtoSpArX explanation for a Baird Sparrow image from CUB dataset. The prototypical-parts are learned from the training image patches. The super-prototype highlights the supported regions with a green overlay and the attacked regions with a red overlay. The QBAF outlines the reasoning of the MLP while assigning the probability of 0.9 for classifying the input as Baird Sparrow. model to infer the spatial correlation of different prototypical-parts in the image when needed for classification.</figDesc><graphic coords="9,130.96,55.84,333.35,214.42" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Examples from adapted SHAPES dataset, with binary class labels.</figDesc><graphic coords="9,110.13,375.01,166.68,126.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>83.4 ± 0.2 89.3 ± 0.2 98.4 ± 0.2Table 1 :</head><label>1</label><figDesc>Accuracy of ProtoSpArX and other prototypical-part-learning methods for different datasets. (Best accuracy in bold)</figDesc><table><row><cell>Method</cell><cell>CUB</cell><cell>Accuracy Cars</cell><cell>SHAPES</cell></row><row><cell cols="4">ProtoPNet ProtoPShare 74.7 ± 0.2 86.4 ± 0.2 50.4 ± 0.8 79.2 ± 0.1 86.1 ± 0.1 51.1 ± 0.7 ProtoPool 80.3 ± 0.2 88.9 ± 0.1 50.8 ± 0.6 ProtoTrees 82.2 ± 0.7 86.6 ± 0.2 51.4 ± 0.7 PIP-Net 82.0 ± 0.3 86.5 ± 0.3 50.6 ± 0.6 ProtoSpArX</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Ablation study with different prototype layers and classifiers with respect to a super-prototype kernel. (Best accuracy in bold)</figDesc><table><row><cell cols="3">Super-Prototype Classifier Prototype Layer</cell><cell>Accuracy CUB Cars</cell></row><row><cell>------------</cell><cell>L2 L2 Cosine Cosine</cell><cell cols="2">Fixed 79.2 86.1 MLP 81.2 86.7 Fixed 81.5 87.2 MLP 81.8 87.8</cell></row><row><cell></cell><cell>L2</cell><cell cols="2">Fixed 81.0 87.3</cell></row><row><cell></cell><cell>L2</cell><cell>MLP</cell><cell>81.6 87.9</cell></row><row><cell></cell><cell>Cosine</cell><cell cols="2">Fixed 82.7 88.9</cell></row><row><cell></cell><cell>Cosine</cell><cell cols="2">MLP 83.4 89.3</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">This criterion can be customized to reflect the user's preferences. For example, the dataset could assign Class 1 to images with a square in the first column, a blue triangle in the second, and a red square in the third.</note>
		</body>
		<back>

			<div type="funding">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>F. Toni) https://profiles.imperial.ac.uk/h.ayoobi (H. Ayoobi); https://profiles.cardiff.ac.uk/staff/potykan (N. Potyka); https://www.doc.ic.ac.uk/~ft/ (F. Toni)</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">C</forename><surname>Wah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Branson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Welinder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Perona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<idno>CNS-TR-2011-001</idno>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
		<respStmt>
			<orgName>California Institute of Technology</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</title>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
		<idno type="DOI">10.1038/S42256-019-0048-X</idno>
		<ptr target="https://doi.org/10.1038/s42256-019-0048-x.doi:10.1038/S42256-019-0048-X" />
	</analytic>
	<monogr>
		<title level="j">Nat. Mach. Intell</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="206" to="215" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">This looks like that: deep learning for interpretable image recognition</title>
		<author>
			<persName><forename type="first">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barnett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">K</forename><surname>Su</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">ProtoPShare: Prototypical parts sharing for similarity discovery in interpretable image classification</title>
		<author>
			<persName><forename type="first">D</forename><surname>Rymarczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Struski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tabor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zielinski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Zhu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Ooi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Miao</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1420" to="1430" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Neural prototype trees for interpretable fine-grained image recognition</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nauta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Van Bree</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Seifert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2021</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="14933" to="14943" />
		</imprint>
	</monogr>
	<note>Computer Vision Foundation / IEEE</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Learning to reason: End-to-end module networks for visual question answering</title>
		<author>
			<persName><forename type="first">R</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Andreas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rohrbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Saenko</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCV.2017.93</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Computer Vision (ICCV)</title>
				<imprint>
			<date type="published" when="2017">2017. 2017</date>
			<biblScope unit="page" from="804" to="813" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">SpArX: Sparse argumentative explanations for neural networks</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Potyka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Artificial Intelligence (ECAI)</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Gal</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Nowé</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">J</forename><surname>Nalepa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Fairstein</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Radulescu</surname></persName>
		</editor>
		<imprint>
			<publisher>IOS Press</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">372</biblScope>
			<biblScope unit="page" from="149" to="156" />
		</imprint>
	</monogr>
	<note>Frontiers in Artificial Intelligence and Applications</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Interpretable image classification with differentiable prototypes assignment</title>
		<author>
			<persName><forename type="first">D</forename><surname>Rymarczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Struski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Górszczak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lewandowska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tabor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zieliński</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-19775-8_21</idno>
		<ptr target="https://doi.org/10.1007/978-3-031-19775-8_21" />
	</analytic>
	<monogr>
		<title level="m">Computer Vision -ECCV 2022: 17th European Conference</title>
				<meeting><address><addrLine>Tel Aviv, Israel; Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2022">October 23-27, 2022. 2022</date>
			<biblScope unit="page" from="351" to="368" />
		</imprint>
	</monogr>
	<note>Proceedings, Part XII</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Pip-net: Patch-based intuitive prototypes for interpretable image classification</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nauta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schlötterer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Keulen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Seifert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023</title>
				<meeting>the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Computer Vision Foundation / IEEE</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Learning important features through propagating activation differences</title>
		<author>
			<persName><forename type="first">A</forename><surname>Shrikumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Greenside</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kundaje</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>ICML</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Integrated grad-cam: Sensitivity-aware visual explanation of deep convolutional networks via integrated gradient-based scoring</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sattarzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sudhakar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Plataniotis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Counterfactual visual explanations</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ernst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Batra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Parikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lee</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>ICML, PMLR</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Prototypical networks for few-shot learning</title>
		<author>
			<persName><forename type="first">J</forename><surname>Snell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Swersky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions</title>
		<author>
			<persName><forename type="first">O</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">32</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">The MNIST database of handwritten digit images for machine learning research</title>
		<author>
			<persName><forename type="first">L</forename><surname>Deng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Signal Processing Magazine</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="141" to="142" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Argumentative XAI: A survey</title>
		<author>
			<persName><forename type="first">K</forename><surname>Cyras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Albini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Baroni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021</title>
				<meeting>the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">DAX: deep argumentative explanation for neural networks</title>
		<author>
			<persName><forename type="first">E</forename><surname>Albini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lertvittayakumjorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno>CoRR abs/2012.05766</idno>
		<ptr target="https://arxiv.org/abs/2012.05766.arXiv:2012.05766" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Neural QBAFs: Explaining neural networks under lrp-based argumentation frameworks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Sukpanichnant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lertvittayakumjorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-08421-8_30</idno>
		<ptr target="https://doi.org/10.1007/978-3-031-08421-8_30" />
	</analytic>
	<monogr>
		<title level="m">AIxIA 2021 -Advances in Artificial Intelligence -20th International Conference of the Italian Association for Artificial Intelligence, Virtual Event</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">December 1-3, 2021. 2021</date>
			<biblScope unit="volume">13196</biblScope>
			<biblScope unit="page" from="429" to="444" />
		</imprint>
	</monogr>
	<note>Revised Selected Papers</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Argue to learn: Accelerated argumentationbased learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICMLA52953.2021.00183</idno>
	</analytic>
	<monogr>
		<title level="m">20th IEEE International Conference on Machine Learning and Applications (ICMLA)</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Neurasp: Embracing neural networks into answer set programming</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ishay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Joint Conference on Artificial Intelligence</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Bessiere</surname></persName>
		</editor>
		<imprint>
			<publisher>ijcai</publisher>
			<date type="published" when="2020">2020. 2020</date>
			<biblScope unit="page" from="1755" to="1762" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Deep symbolic learning: Discovering symbols and rules from perceptions</title>
		<author>
			<persName><forename type="first">A</forename><surname>Daniele</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Campari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Malhotra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Serafini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th</title>
				<meeting>the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th<address><addrLine>Macao, SAR, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023-08">August 2023. 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Interpreting neural networks as quantitative argumentation frameworks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Potyka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence</title>
				<meeting>the Thirty-Third AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>AAAI-21</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Argumentation-based online incremental learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/TASE.2021.3120837</idno>
		<ptr target="https://doi.org/10.1109/TASE.2021.3120837.doi:10.1109/TASE.2021.3120837" />
	</analytic>
	<monogr>
		<title level="j">IEEE Trans Autom. Sci. Eng</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="3419" to="3433" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Argue to learn: Accelerated argumentationbased learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICMLA52953.2021.00183</idno>
		<ptr target="https://doi.org/10.1109/ICMLA52953.2021.00183.doi:10.1109/ICMLA52953.2021.00183" />
	</analytic>
	<monogr>
		<title level="m">20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021</title>
				<editor>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Wani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><forename type="middle">K</forename><surname>Sethi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Shi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Qu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Raicu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Jin</surname></persName>
		</editor>
		<meeting><address><addrLine>Pasadena, CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2021">December 13-16, 2021. 2021</date>
			<biblScope unit="page" from="1118" to="1123" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Explain what you see: Openended segmentation and recognition of occluded 3d objects</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Kasaei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Verbrugge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verheij</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICRA48891.2023.10160927</idno>
		<ptr target="https://doi.org/10.1109/ICRA48891.2023.10160927.doi:10.1109/ICRA48891.2023.10160927" />
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Robotics and Automation, ICRA 2023</title>
				<meeting><address><addrLine>London, UK</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2023-06-02">May 29 -June 2, 2023. 2023</date>
			<biblScope unit="page" from="4960" to="4966" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Leofante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ayoobi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dejl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Freedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Gorur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Paulino-Passos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rapberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2405.10729</idno>
		<idno type="arXiv">arXiv:2405.10729</idno>
		<ptr target="/ARXIV.2405.10729" />
		<title level="m">Contestable AI needs computational argumentation</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Adam: A method for stochastic optimization</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
		<ptr target="http://arxiv.org/abs/1412.6980" />
	</analytic>
	<monogr>
		<title level="m">3rd International Conference on Learning Representations, ICLR 2015</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</editor>
		<meeting><address><addrLine>San Diego, CA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">May 7-9, 2015. 2015</date>
		</imprint>
	</monogr>
	<note>Conference Track Proceedings</note>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">3d object representations for fine-grained categorization</title>
		<author>
			<persName><forename type="first">J</forename><surname>Krause</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Stark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fei-Fei</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCVW.2013.77</idno>
	</analytic>
	<monogr>
		<title level="m">2013 IEEE International Conference on Computer Vision Workshops</title>
				<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="554" to="561" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
