<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Impact of Sparsification on Quantitative Argumentative Explanations in Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Peacock</string-name>
          <email>daniel.peacock20@imperial.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mansi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nico Potyka</string-name>
          <email>potykan@cardiff.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Toni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiang Yin</string-name>
          <email>x.yin20@imperial.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cardiff University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Imperial College London</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Neural Networks (NNs) are powerful decision-making tools, but their lack of explainability limits their use in high-stakes domains such as healthcare and criminal justice. The recent SpArX framework sparsifies NNs and maps them to (weighted) Quantitative Bipolar Argumentation Frameworks (QBAFs) to provide an argumentative understanding of their mechanics. QBAFs can be explained by various quantitative argumentative explanation methods such as Argument Attribution Explanations (AAEs), Relation Attribution Explanations (RAEs), and Contestability Explanations (CEs), which assign numerical scores to arguments or relations to quantify their influence on the dialectical strength of an argument to be explained. However, it remains unexplored how sparsification of NNs impacts the explanations derived from the corresponding (weighted) QBAFs. In this paper we explore two directions for impact. First, we empirically investigate how varying the sparsification levels of NNs affects the preservation of these explanations: using four datasets (Iris, Diabetes, Cancer, and COMPAS), we find that AAEs are generally well preserved, whereas RAEs are not. Then, for CEs, we find that sparsification can improve computational efficiency in several cases. Overall, this study offers a preliminary investigation into the potential synergy between sparsification and explanation methods, opening up new avenues for future research.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainability</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Argumentative Explanations</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>[Figure 1: (a) Original MLP with neuron activations; (b) Sparse QBAF with argument strengths, for an
Iris example with inputs Sepal Length, Sepal Width, Petal Length, and Petal Width.]</p>
      <p>and attack relations, Contestability Explanations (CEs) [18] determine how the edge weights can be
modified to reach a desired dialectical strength for the topic argument. Together, these methods offer a
fine-grained and quantitative understanding of the reasoning process within (weighted) QBAFs.</p>
      <p>Both sparsification and quantitative argumentative explanations (i.e. AAEs, RAEs and CEs) advance
the interpretability of NNs, but there has been little investigation into how the former impacts the latter.
This gap is particularly concerning because sparsification simplifies the structure of MLPs, which may
alter or distort the quantitative explanations derived from the resulting (weighted) QBAF, potentially
misleading users, and even resulting in ethical or legal risks. For example, in a healthcare setting,
misleading explanations may lead to incorrect treatments being given to patients.</p>
      <p>To address this gap, we focus on two core research questions (as illustrated in Figure 2): (1) To what
extent does sparsification preserve AAEs and RAEs? (2) Can sparsification improve CEs’ computational
efficiency? We distinguish these two questions because the nature of the explanations differs: AAEs and
RAEs quantify how arguments and relations contribute to a fixed outcome (as opposed to identifying
changes leading to a different outcome, as in CEs), so preservation under sparsification is crucial
to assess whether interpretability can be maintained. In contrast, generating CEs typically involves
heuristic or optimization-based search procedures, rather than direct computation as in AAEs and
RAEs. Therefore, the primary concern for CEs is whether sparsification can accelerate this search. We
empirically investigate these questions and make the following contributions:
1. We analyse the impact of sparsification on AAEs, and find that AAEs are generally well-preserved
across varying sparsity levels.
2. We analyse the impact of sparsification on RAEs, and find that RAEs are not as well-preserved
under sparsification as AAEs.
      </p>
      <p>3. We propose a method that leverages sparsification to improve the runtime of computing CEs.
The code is available at https://github.com/DanielPeacock/ArguingWithNeuralNetworksPublic.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>In this paper we focus on NNs in the form of MLPs. These are directed, acyclic graphs as illustrated in
Figure 1a, processing inputs in an input layer (on the left in Figure 1a) through hidden layers (layers
1 and 2 in Figure 1a) to obtain a prediction in the last layer (on the right in Figure 1a). Nodes in all
layers amount to neurons, whose activation is determined by an activation function applied to the
(edge-)weighted sum of the neuron’s incoming connections plus a bias value assigned to each neuron.
Throughout this paper we use the logistic activation function.
(Note that AAEs and RAEs use the unweighted QBAFs, while CEs use the weighted QBAFs. With a slight
abuse of notation, we use “QBAF” to refer to both throughout the paper.)</p>
      <p>[Figure 2: Overview: the MLP is sparsified via SpArX, and both the original and sparsified MLPs are
translated to QBAFs; the explanation methods (AAEs, RAEs, CEs) are applied to both, and we study
preservation (AAEs/RAEs) and efficiency (CEs).]</p>
      <p>[Figure 3: (a) AAE scores for arguments (input features/neurons); (b) RAE scores for relations. Red/blue
scores indicate a negative/positive influence on the topic argument (output).]</p>
      <p>Each MLP can be represented by an equivalent Quantitative Bipolar Argumentation Framework
(QBAF). Here, we view each neuron as an argument, and each edge as a relation. Edges with negative
weights are attacks, and edges with positive weights are supports. Each argument has a base score
corresponding to its initial strength and relations are weighted. For more details on the translation
process see [9]. This provides a new argumentative interpretation of MLPs with dialectical strength
values for each argument (mathematically equivalent to the activations of neurons in the MLP).</p>
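      <p>As an illustration of this translation, the dialectical strength of an argument can be computed from its base score (bias) and its weighted incoming relations, mirroring the MLP's forward pass with the logistic activation. The following is a minimal sketch; the function names are ours, not from [9].</p>

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def strength(bias, incoming):
    # incoming: list of (edge_weight, parent_strength) pairs; negative
    # weights act as attacks, positive weights as supports, so the
    # strength coincides with the neuron's activation in the MLP.
    return logistic(bias + sum(w * s for w, s in incoming))

# An argument with bias 0.0, one support (weight 2.0, strength 1.0)
# and one attack (weight -1.0, strength 0.5):
s = strength(0.0, [(2.0, 1.0), (-1.0, 0.5)])
```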
      <p>We use several existing argumentation-based explanation methods, overviewed here (see original
papers for more details).</p>
      <p>SpArX [7] The QBAF interpretation of MLPs does not necessarily improve explainability since QBAFs
are of the same size and density as the MLPs, which can be very large. SpArX provides explanations
by reducing the size of the given MLPs first. The neurons in the hidden layers are clustered based on
their activations, and then merged by averaging their biases and edge weights. The sparse MLPs are
then converted to equivalent QBAFs (Figure 1b) from which qualitative explanations can be found, for
example by creating word clouds of the most important input features or examining the dialectical
relationships between the arguments. In this paper we consider instead quantitative argumentative
explanations drawn from the sparsified QBAFs.</p>
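      <p>The merging step above can be sketched as follows, assuming clusters of hidden neurons are merged by averaging their biases and incoming edge weights. This is a simplification of SpArX's procedure, and the names and data layout are hypothetical.</p>

```python
def merge_clusters(biases, in_weights, clusters):
    # biases: one bias per hidden neuron; in_weights[i][j]: weight of the
    # edge from input i to hidden neuron j; clusters: lists of neuron
    # indices to merge. Each cluster becomes one neuron whose bias and
    # incoming weights are the averages over its members.
    merged_biases = [sum(biases[j] for j in c) / len(c) for c in clusters]
    merged_weights = [
        [sum(row[j] for j in c) / len(c) for c in clusters]
        for row in in_weights
    ]
    return merged_biases, merged_weights

biases = [0.1, 0.3, -0.2]
in_w = [[1.0, 3.0, 0.5],
        [2.0, 4.0, 0.5]]          # 2 inputs, 3 hidden neurons
b, W = merge_clusters(biases, in_w, [[0, 1], [2]])
# cluster {0, 1} gets bias 0.2 and incoming weights 2.0 and 3.0
```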
      <p>AAEs [16] AAEs attempt to explain QBAFs by examining the contribution of other arguments to a
topic argument. Throughout this paper, we use topic argument to refer to the argument we are trying to
explain (usually this is an argument corresponding to one of the output neurons in the equivalent MLP).
We focus on Gradient-based AAEs (although other types exist such as Shapley-based [19] and
Removal-based AAEs [20]). Gradient methods work by computing a score which represents the sensitivity of the
topic argument to changes in the base score of other arguments. An example is shown in Figure 3a.
RAEs [17] RAEs attempt to understand the role of the relations in contributing to the strength
of a topic argument. In this paper, we focus on Shapley-based RAEs (although other types such as
Gradient-based RAEs [18] also exist). These are based on Shapley values [21], and look at every subset
of the attacks and supports to understand the influence of each one on the topic argument. Due to the
complexity in computing these scores, an approximation is used. Figure 3b shows an example.</p>
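      <p>A gradient-based AAE can be approximated numerically as the finite-difference sensitivity of the topic argument's strength to an argument's base score. The following is a toy sketch on a two-argument QBAF, not the authors' implementation.</p>

```python
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def topic_strength(base_scores, weights):
    # Toy QBAF: two input arguments feed one topic argument; input
    # strengths equal their base scores, and the topic's strength is the
    # logistic of the weighted sum (negative weight = attack).
    return sigma(sum(w * b for w, b in zip(weights, base_scores)))

def aae_score(base_scores, weights, i, eps=1e-5):
    # Gradient-based AAE: sensitivity of the topic argument's strength to
    # a small change in argument i's base score (finite differences).
    bumped = list(base_scores)
    bumped[i] += eps
    return (topic_strength(bumped, weights)
            - topic_strength(base_scores, weights)) / eps

scores = [aae_score([0.6, 0.8], [2.0, -1.0], i) for i in range(2)]
# the supporter (weight 2.0) gets a positive score, the attacker a negative one
```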
      <sec id="sec-2-1">
        <title>CEs [18]</title>
        <p>CEs calculate how the weights of each relation in the QBAF must be modified in order to reach a
certain dialectical strength in the topic argument (called the desired strength). This is similar to the
counterfactual problem in AI, where methods are used to try and explain how a model’s outputs
would change with modifications to the inputs [22, pp. 847–848]. CEs are computed by iteratively
updating the weights using the gradient-based RAE (G-RAE) to guide the search until the desired
strength is reached. Table 1 shows an example.</p>
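        <p>The iterative search can be sketched as a gradient-style weight update toward the desired strength. In this simplified sketch, a plain gradient step stands in for the G-RAE guidance of [18], and the one-layer QBAF is hypothetical.</p>

```python
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def contest(weights, base_scores, desired, lr=0.5, steps=2000, tol=1e-3):
    # Iteratively adjust the edge weights until the topic argument's
    # strength reaches the desired value; a plain gradient step on the
    # error stands in for the G-RAE-guided update.
    w = list(weights)
    s = sigma(sum(wi * b for wi, b in zip(w, base_scores)))
    for _ in range(steps):
        err = s - desired
        if abs(err) < tol:
            break
        grad = s * (1.0 - s)  # derivative of the logistic function
        w = [wi - lr * err * grad * b for wi, b in zip(w, base_scores)]
        s = sigma(sum(wi * b for wi, b in zip(w, base_scores)))
    return w, s

w, s = contest([1.0, -0.5], [0.7, 0.9], desired=0.8)
```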
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>In order to ascertain whether AAEs and RAEs are preserved after sparsification, we train MLPs of various
sizes and compare an aggregation of the scores for the original MLPs to the scores for the MLP after
sparsification. The aggregation is needed to allow a comparison of scores since there are significantly
more scores for the original MLPs due to their larger size. Here, we define these aggregations.</p>
      <p>Let C = {c_1, . . . , c_n} be a cluster of interest after sparsification in hidden layer l, containing neurons
c_i for i = 1, . . . , n. Similarly, let C′ = {c′_1, . . . , c′_m} be another cluster of interest in the next layer l + 1,
containing neurons c′_j for j = 1, . . . , m.</p>
      <p>Aggregation of AAEs We aggregate the AAE scores by averaging the score for each neuron in the
cluster of interest. Formally, the aggregated score for cluster C is</p>
      <p>Agg_aae_score(C) = (1/n) Σ_{c ∈ C} aae_score(c).</p>
      <p>A simple example of this process is shown in Figure 4.</p>
      <p>Aggregation of RAEs We aggregate the RAE scores by averaging the RAE scores of edges between
all pairs of neurons contained in two clusters of interest. Formally, the aggregated score for the edge
between clusters C and C′ is</p>
      <p>Agg_rae_score(C, C′) = (1/n) Σ_{c ∈ C} (1/m) Σ_{c′ ∈ C′} rae_score(c, c′).</p>
      <p>A simple example of this process is shown in Figure 5.</p>
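      <p>Both aggregations are plain averages and can be sketched directly; the names and score dictionaries below are hypothetical, with the AAE example matching Figure 4 (neuron scores 2 and 1 averaging to a cluster score of 1.5).</p>

```python
def agg_aae_score(aae_scores, cluster):
    # Average the AAE scores of the neurons in cluster C.
    return sum(aae_scores[c] for c in cluster) / len(cluster)

def agg_rae_score(rae_scores, cluster, cluster2):
    # Average the RAE scores of all n*m edges between clusters C and C'.
    total = sum(rae_scores[(c, c2)] for c in cluster for c2 in cluster2)
    return total / (len(cluster) * len(cluster2))

aae = {0: 2.0, 1: 1.0}
rae = {(0, 2): 0.4, (0, 3): 0.2, (1, 2): 0.0, (1, 3): 0.2}
a = agg_aae_score(aae, [0, 1])          # (2.0 + 1.0) / 2 = 1.5
r = agg_rae_score(rae, [0, 1], [2, 3])  # mean of the four edge scores = 0.2
```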
      <p>[Figure 4: two neurons with AAE scores 2 and 1 merged into a cluster with aggregated score 1.5.]</p>
      <p>De-aggregation of CEs We do not attempt to directly understand if CEs are preserved with sparsification,
since this question is ill-defined. Indeed, CEs do not give a fixed score to each component in the
same way as AAEs and RAEs, and so it is challenging to define what preservation means in this setting.
Instead, we look at CEs in the opposite direction: that is, we attempt to de-aggregate the CEs for the
sparse MLPs to approximate/recover the CEs for the original MLPs and improve computational efficiency.
The sparse CE assigns weights to each edge in the sparse QBAF. We de-aggregate by assuming the
weights are equally distributed amongst edges merged together by SpArX. Every edge merged together
is assigned the same weight in our approximate CE. Formally, consider the edge between two clusters
C and C′, assigned weight w in the sparse CE. There are a total of n · m edges between every pair of
neurons in these clusters. So every edge between these pairs is assigned weight w in the approximate
CE. A simple example of this process is shown in Figure 6.</p>
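      <p>De-aggregation then just copies each cluster-edge weight to all underlying original edges. A minimal sketch follows; the data layout is assumed and not the authors' code.</p>

```python
def deaggregate_ce(sparse_ce, members):
    # sparse_ce maps a cluster edge (C, C') to its CE weight in the sparse
    # QBAF; members maps each cluster to the original neurons merged into
    # it. Every original edge between the clusters inherits the weight.
    approx = {}
    for (c, c2), w in sparse_ce.items():
        for u in members[c]:
            for v in members[c2]:
                approx[(u, v)] = w
    return approx

approx = deaggregate_ce({("A", "B"): 0.3}, {"A": [0, 1], "B": [2]})
# original edges (0, 2) and (1, 2) both receive weight 0.3
```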
    </sec>
    <sec id="sec-4">
      <title>4. Results and Analysis</title>
      <p>To compare the aggregated AAE and RAE scores with the sparse MLP scores, we look at two approaches:
the overall pattern in the scores and the highest scoring arguments/edges. All of our analysis is for
the Iris [23], Diabetes [24], Cancer [25] and COMPAS [26] datasets, and we average our results over
the test set for each dataset. These datasets are commonly used and of varying levels of complexity
(number of input features and dataset size). For AAEs we train MLPs with 1–2 hidden layers, each with
10–100 neurons. For RAEs we train MLPs with 1–2 hidden layers, each with 2–10 neurons. These are
significantly smaller than the MLPs used for AAEs, since the time complexity of computing
Shapley-based RAEs makes it impractical to use large MLPs.</p>
      <p>[Figure 6: de-aggregation of a CE from the sparse QBAF (with CE weights) to the original QBAF (with
approximated CE weights).]</p>
      <p>Overall Pattern To check if the overall pattern was preserved, we check the Spearman rank coefficient [27]
and the Kendall τ coefficient [28]. These provide a measure of the strength of the relationship between two
variables. We use these measures to examine the strength of the correlation between the aggregated
scores and the sparse scores. A rank/coefficient close to 1 means a strong correlation, indicating the
pattern in both sets of scores is similar and hence the pattern in the scores is preserved with sparsification.
We also rank the arguments and edges based on their scores. We then look at the percentage difference
in ranking between each aggregated argument/edge and the corresponding argument/edge in the sparse
QBAF. A small difference in rankings would indicate a similar pattern after sparsifying.
Highest Scores We also look specifically at the highest scoring arguments/edges. These are important
since they are the most influential components of the MLPs, so it is important that they are preserved. Firstly,
we look at the top-ten scoring arguments/edges and check what percentage of arguments/edges in the
aggregated scores are also in the top ten of the sparse MLP scores, i.e. how many of the highest scoring
arguments/edges stay the same after sparsification. In addition, for RAEs we also look at the top-scoring
aggregated edge and check whether this edge is in the top ten of the sparse scores, i.e. checking that the
most important edge remains important after sparsification. High percentages would indicate that the
highest scoring arguments/edges are preserved after sparsifying.</p>
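      <p>For instance, the Spearman rank coefficient is the Pearson correlation of the two rank vectors; a self-contained sketch follows (in practice a library routine such as scipy.stats.spearmanr would be used, and the score vectors here are made up).</p>

```python
def rankdata(xs):
    # 1-based average ranks; tied values share the mean of their positions.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(xs, ys):
    # Spearman's rho: Pearson correlation of the two rank vectors.
    rx, ry = rankdata(xs), rankdata(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# aggregated scores vs. sparse scores for four arguments
rho = spearman([1.0, 0.5, 0.2, 0.9], [0.9, 0.4, 0.1, 1.0])
```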
      <sec id="sec-4-1">
        <title>4.1. Preservation of AAEs</title>
        <p>Overall our results are positive, showing that AAEs are preserved well by sparsification.
Overall Pattern The results can be found in Table 2. We find that the pattern/distribution of scores
matches closely before and after sparsification. We can see that the correlation between the scores
is very strong. The coefficients are always higher than 0.7 and in most cases at least 0.9. Since the
coefficients/ranks computed are all close to 1, we can conclude that the pattern in scores is preserved
well with sparsification. We should note that the coefficients do reduce slightly towards higher levels of
sparsification (around 90%), but only by a little, and the correlation still remains strong. Considering the
ranking differences (%Δ in Table 2), we can see that the rankings are similar. Towards the lower levels of
sparsification, there is only around a 10% difference in rankings, and at high levels of sparsification this
goes up to 30%. However, this is still relatively low, and only appears with high levels of sparsification
(90%). This is also to be expected, since high levels of sparsification should result in greater loss of
information.
[Table 2: results for the preservation of AAE scores (↑: higher is better; ↓: lower is better), including
Spearman ρ (↑), Kendall τ (↑), and the percentage ranking differences (%Δ, ↓) between the aggregated
and sparse AAE scores.]</p>
        <p>but again this is to be expected, as the amount of information lost should increase as we sparsify more.
There is a balance between sparsification and loss of information, but this depends on the type of dataset
and its complexity; e.g. the COMPAS dataset (the most complex) loses more information with high
sparsification levels, but this does not happen to the same extent with less complex datasets such as Iris.
However, in general, most of the top scoring arguments do remain within the top 10, and so we can
conclude that the highest scoring arguments with AAEs are preserved well.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Preservation of RAEs</title>
        <p>Overall, the results are mixed and we find that RAEs are not preserved as well as AAEs.
Overall Pattern</p>
        <p>The results can be found in Tables 4a and 4b. Considering first the Spearman rank
and Kendall τ coefficients (Table 4a), there is a relatively strong correlation between the aggregated
scores and sparse scores. The ranks/coefficients are around 0.8, although this reduces as the sparsification
level is increased. For example, the Spearman rank for the Diabetes dataset decreases from 0.888 at
20% sparsification to only 0.696 at 80%. This is a similar pattern to what we saw for AAEs, and largely
what is to be expected; the sparser an MLP, the more information is lost. This indicates that the overall
pattern in scores is relatively well preserved. However, compared to the equivalent AAE analysis
(Table 2), the correlation is significantly lower (around 0.9 for AAEs). Therefore, although the pattern
looks to be preserved, we do lose more information about RAEs with sparsification compared to AAEs.
For AAEs, large ranking differences only appeared at very high levels of sparsification (90%), but this
is always the case for RAEs, even at very low levels of sparsification. This indicates that significantly
more information is lost through sparsification for RAEs, and the overall pattern in scores is not
preserved well for RAEs. This may seem to contradict the correlation coefficients/ranks seen previously,
but those only measure the correlation in the scores and do not look at the individual scores themselves.</p>
        <p>Looking at both sets of results, we can conclude that although the scores before and after sparsification
are highly correlated, the individual scores and their rankings are affected by sparsification. The overall
pattern is to some extent preserved (strong correlation), but much information is lost, especially
in the individual scores.</p>
        <p>Highest Scoring Edges</p>
        <p>The results can be found in Tables 5a and 5b. We can see that the most
important edges are not well preserved by sparsification. First looking at Table 5a, we see that in
all cases, a very low percentage of edges remain in the top ten. The results indicate that generally
around 30% of the top-ten edges stay the same, and this is as low as 23% in some cases. This fits with our
previous analysis that the individual rankings of edges are not preserved well, and there is a large change
in rankings. This tells us that in general the preservation of the highest scoring edges is poor, and
information about RAEs is lost as a result of sparsification.</p>
        <p>Looking at Table 5b, we again see poor preservation. In most cases the top scoring edge does not
remain high scoring after sparsification. In general, only in around 40% of cases does the top scoring
edge remain high-scoring, and this is as low as 23% in some cases. We should note that for the COMPAS
dataset, the highest scoring edges do look to be better preserved. Due to the size and the complexity
of the dataset, significantly fewer MLPs were tested compared to the other datasets. This may have
resulted in the slightly different results for COMPAS compared to the other datasets. However, the
pattern across all datasets tested indicates that the highest scoring edges are not well preserved.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. De-aggregated CEs</title>
        <p>We analyse the de-aggregated CEs using a different methodology to that used for AAEs and RAEs. We
look at the validity and distance to check the quality of the de-aggregated (approximate) CEs, and use
this methodology to improve runtime.</p>
        <p>Validity A CE is valid if the topic argument attains the desired strength using the edge weights
provided by the CE. We create an approximate CE for the original QBAF from the sparse QBAF using
the de-aggregation method in Section 3. We check the validity of our approximate CE, and the results are
shown in Table 6 (results labelled 1). Clearly, we can see that the approximation does not successfully
produce a valid CE in the majority of cases. In many cases, the percentage of valid CEs is less than 10%,
so our approximation clearly does not work effectively. For the COMPAS dataset, the percentage of
valid CEs does increase, up to around 20% in some cases. This is positive since the COMPAS dataset is
the most complex of the datasets. However, this is still a very low percentage, and therefore we can
conclude that our approximation is not effective. It is likely that our assumption in the approximation
that the edge weights were equally distributed is incorrect, causing this poor performance.
Distance To further understand the quality of the approximate CEs, we also look at the distance. We
check if the topic argument’s strength gets closer to the desired strength than before the CE weights are
applied. The results can be found in Table 6 (results labelled 2). We can see that our approximate CE
does bring the strength of the topic argument closer to the desired strength in the majority of cases. For
the Iris and Cancer datasets, the approximate CE brings the strength closer in over 90% of cases. For
the Diabetes dataset, this percentage does reduce to around 80%, but this is still high, and
in most cases the approximation does succeed. Finally, for the COMPAS dataset, around 85%–90% of
cases are generally closer, except for the 90% sparsification case, where the percentage reduces to 73%.
Overall, however, this is still a positive result in bringing the strength of the topic argument closer to
the desired strength. Note also that the approximation does successfully get closer even at very high
levels of sparsification. This implies that CEs are preserved well with sparsification.</p>
        <p>Runtime The CE algorithm in [18] works by iteratively updating the weights of the QBAF until the
desired strength is reached. However, the initial weights are randomised, and so the algorithm can
take some time to converge to the desired strength (the algorithm may get stuck in a local minimum).
Therefore, to improve convergence, we can initialise the weights of the QBAF with our approximated
CE instead of randomised weights. We perform experiments to see if the runtime is improved using
our approximation. We run all experiments on a Linux PC running 64-bit Ubuntu 24.04, with an Intel
Core i7-8700 3.20GHz processor and 16GB memory. We compare the following:
(a) Apply the usual CE algorithm, i.e. translate the MLP to the equivalent QBAF and apply the CE
algorithm using randomised initial weights.
(b) Use our approximation method, i.e. sparsify the MLP, translate to a sparse QBAF, apply the CE
algorithm to the sparse QBAF with random initial weights, create an approximation for the original
MLP, and apply the CE algorithm to the original MLP using the approximation as the initial weights.
We plot graphs of the average runtime using the two methods for each dataset and MLP size. For the
second method, this involves checking both how much time is spent on (1) computing the CE on the
sparsified MLP and (2) computing the CE on the original MLP starting from the previously computed CE.
For succinctness, we give two of these graphs in Figure 7 (one small MLP, and one larger MLP) for the
COMPAS dataset only (the most complex analysed); the full set of graphs for each dataset can be
found in Appendix A. From the figure, we can see that when the MLP is small (with a small number of
edges), the runtime is longer using our method, as more steps must be done to compute the CE, and due
to the small number of edges, the CE algorithm converges quickly anyway. However, when the MLP is
larger, our method does improve the runtime. For this reason, we calculate the percentage reduction in
the average runtime for MLPs of more than 500 edges only. The results are given in Table 7. We see that
for low levels of sparsification our method can still be slower (e.g. for Diabetes). However, in all cases,
for high levels of sparsification there is a reduction in runtime, often by more than 50%. The much
larger number of edges (over 1300 in the graph shown in Figure 7b) means that the CE algorithm takes
longer to converge, so initialising the weights with the approximate CE does improve the runtime. We
can also note that, using our initial guess from a very high level of sparsification (e.g. 90%), our method
performs well. This is positive, as even at a high level of sparsification, we can still recover a large
amount of information. Although we cannot easily recover a valid CE, we can still make a good guess.</p>
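        <p>The effect of warm-starting can be illustrated on a toy CE search: initialising from a de-aggregated guess that already lies near the desired strength terminates in far fewer iterations than a neutral start. This is a sketch under our simplified gradient-step model; the warm-start weights here are chosen by hand rather than produced by actual sparsification.</p>

```python
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def ce_search(w0, base_scores, desired, lr=0.5, tol=1e-3, max_steps=10000):
    # Toy gradient-style CE search; returns the final weights and the
    # number of iterations needed to reach the desired strength.
    w = list(w0)
    for step in range(max_steps):
        s = sigma(sum(wi * b for wi, b in zip(w, base_scores)))
        err = s - desired
        if abs(err) < tol:
            return w, step
        grad = s * (1.0 - s)
        w = [wi - lr * err * grad * b for wi, b in zip(w, base_scores)]
    return w, max_steps

base, desired = [0.7, 0.9], 0.9
_, cold_steps = ce_search([0.0, 0.0], base, desired)  # neutral start
_, warm_steps = ce_search([1.6, 1.2], base, desired)  # near-valid warm start
# the warm start converges in far fewer iterations than the cold start
```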
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we explored the impact of sparsification on quantitative argumentative explanations.
Our investigations allow us to understand whether sparsification alters or distorts the argumentative
explanations (AAEs, RAEs, CEs) produced from the resulting QBAF. Without this, users could be misled
by explanations, leading to ethical or legal risks. Our findings showed that AAEs are well-preserved
under sparsification, suggesting that AAEs can be reliably used alongside sparsification to enhance
the interpretability of NNs. In contrast, RAEs appeared less robust, making them challenging to use
alongside sparsification. Finally, for CEs, we saw that sparsification can improve the computational
efficiency of explanation generation, which is particularly useful for large and dense MLPs.</p>
      <p>There are a few avenues for future work. While we found promising empirical results for the
preservation of AAEs, further work is needed to establish theoretical guarantees for such preservation.
Additionally, while our findings do not support the preservation of RAEs, future work could explore
whether gradient-based RAEs (G-RAEs) [18] exhibit better consistency, potentially enabling RAEs to
contribute more effectively to explanations in sparsified settings. Future work could also explore using
a weighted aggregation of the RAE scores, similar to the weighted averaging used by SpArX when
merging the edge weights ([7, Def. 6]) to see if this results in better preservation. Further, for both
AAEs and RAEs, other aggregation methods could be explored. Using other techniques instead of mean
aggregation (e.g. min./max. aggregation) may result in sparsification having a lower impact.</p>
      <p>Finally, although we observed that sparsification can improve the speed of computing CEs in large
MLPs, our method is a heuristic. Further work is necessary to find guarantees as to when our method is
faster, perhaps by finding a lower bound on MLP size for which our method is guaranteed to improve
the runtime. Our approximation method also did not directly produce valid CEs; further work should be
done to find a method of de-aggregating the CEs to produce valid CEs for the original QBAFs. Perhaps
weighting the edges differently rather than assuming an equal distribution would help.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research was partially funded by the European Research Council (ERC) under the European Union’s
Horizon 2020 research and innovation programme (grant agreement No. 101020934, ADIX) and by J.P.
Morgan and by the Royal Academy of Engineering under the Research Chairs and Senior Research
Fellowships scheme. Any views or opinions expressed herein are solely those of the authors.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <sec id="sec-7-1">
        <title>References</title>
        <p>[4] P. Angelov, E. Soares, Towards explainable deep neural networks (xDNN), Neural Networks 130 (2020) 185–194.</p>
        <p>[5] A. Rago, K. Čyras, J. Mumford, O. Cocarascu, Argumentation and Machine Learning, in: D. Gabbay, G. Kern-Isberner, G. R. Simari, M. Thimm (Eds.), Handbook of Formal Argumentation, Volume 3, 2024. arXiv:2410.23724.</p>
        <p>[6] K. Čyras, A. Rago, E. Albini, P. Baroni, F. Toni, Argumentative XAI: A survey, in: Z.-H. Zhou (Ed.), IJCAI-21, 2021, pp. 4392–4399.</p>
        <p>[7] H. Ayoobi, N. Potyka, F. Toni, SpArX: Sparse Argumentative Explanations for Neural Networks, in: ECAI, volume 372 of Frontiers in Artificial Intelligence and Applications, 2023, pp. 149–156.</p>
        <p>[8] T. Mossakowski, F. Neuhaus, Modular semantics and characteristics for bipolar weighted argumentation graphs, CoRR (2018). arXiv:1807.06685.</p>
        <p>[9] N. Potyka, Interpreting Neural Networks as Quantitative Argumentation Frameworks, Proceedings of the AAAI Conference on Artificial Intelligence 35 (2021) 6463–6470.</p>
        <p>[10] P. Baroni, M. Romano, F. Toni, M. Aurisicchio, G. Bertanza, Automatic evaluation of design alternatives with quantitative argumentation, Argument &amp; Computation 6 (2015) 24–49.</p>
        <p>[11] A. Rago, F. Toni, M. Aurisicchio, P. Baroni, Discontinuity-free decision support with quantitative argumentation debates, in: 15th International Conference on the Principles of Knowledge Representation and Reasoning (KR), 2016.</p>
        <p>[12] N. Potyka, Continuous dynamical systems for weighted bipolar argumentation, in: International Conference on Principles of Knowledge Representation and Reasoning (KR), 2018, pp. 148–157.</p>
        <p>[13] L. Amgoud, J. Ben-Naim, Evaluation of arguments in weighted bipolar graphs, International Journal of Approximate Reasoning 99 (2018) 39–55.</p>
        <p>[14] C. Fan, J. Liu, Y. Zhang, E. Wong, D. Wei, S. Liu, SalUn: Empowering machine unlearning via gradient-based weight saliency in both image classification and generation, in: The Twelfth International Conference on Learning Representations, 2024.</p>
        <p>[15] S. Han, J. Pool, J. Tran, W. J. Dally, Learning both weights and connections for efficient neural networks, in: Proceedings of the 29th International Conference on Neural Information Processing Systems - Volume 1, NIPS'15, MIT Press, Cambridge, MA, USA, 2015, pp. 1135–1143.</p>
        <p>[16] X. Yin, N. Potyka, F. Toni, Argument attribution explanations in quantitative bipolar argumentation frameworks, in: ECAI, volume 372, 2023, pp. 2898–2905.</p>
        <p>[17] X. Yin, N. Potyka, F. Toni, Explaining arguments’ strength: Unveiling the role of attacks and supports, in: K. Larson (Ed.), IJCAI-24, 2024, pp. 3622–3630.</p>
        <p>[18] X. Yin, N. Potyka, A. Rago, T. Kampik, F. Toni, Contestability in quantitative argumentation, arXiv preprint arXiv:2507.11323 (2025).</p>
        <p>[19] T. Kampik, N. Potyka, X. Yin, K. Čyras, F. Toni, Contribution functions for quantitative bipolar argumentation graphs: A principle-based analysis, International Journal of Approximate Reasoning 173 (2024) 109255.</p>
        <p>[20] J. Delobelle, S. Villata, Interpretability of gradual semantics in abstract argumentation, in: G. Kern-Isberner, Z. Ognjanović (Eds.), Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Cham, 2019, pp. 27–38.</p>
        <p>[21] L. S. Shapley, Notes on the N-Person Game II: The Value of an n-Person Game, Santa Monica, CA, 1951.</p>
        <p>[22] S. Wachter, B. Mittelstadt, C. Russell, Counterfactual explanations without opening the black box: Automated decisions and the GDPR, Harvard Journal of Law and Technology 31 (2018) 841–887.</p>
        <p>[23] R. A. Fisher, Iris, UCI Machine Learning Repository, 1936.</p>
        <p>[24] National Institute of Diabetes and Digestive and Kidney Diseases, Diabetes Dataset, Kaggle, 1990.</p>
        <p>[25] W. N. Street, W. H. Wolberg, O. L. Mangasarian, Breast Cancer Wisconsin (Diagnostic), UCI Machine Learning Repository, 1993.</p>
        <p>[26] ProPublica, COMPAS recidivism risk score data and analysis, GitHub, 2016.</p>
        <p>[27] C. Spearman, The proof and measurement of association between two things (1961).</p>
        <p>[28] M. G. Kendall, A new measure of rank correlation, Biometrika 30 (1938) 81–93.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>A. Runtime graphs</title>
      <p>Here we give the full results for the runtime of our modified CE method compared to the original
CE method. In the plots, the x-axis is the sparsification percentage, and the y-axis is the runtime (in
seconds). The red line represents the average runtime (over the test dataset) of the original CE algorithm,
and the black line is the runtime using our approximation method with various levels of sparsification.
• In Figure 8, we see the results for the Iris dataset.
• In Figure 9, we see the results for the Diabetes dataset.
• In Figure 10 we see the results for the Cancer dataset.</p>
      <p>• In Figure 11, we see the results for the COMPAS dataset.</p>
      <p>We can see from these plots that in general when the MLP is small (a low number of edges), the
original CE method is faster than our approximation method. However, as the MLP size increases, our
method can outperform the original CE method.</p>
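      <p>As a side note on measurement: comparisons of explanation rankings before and after sparsification rest on rank correlation measures such as Spearman's rho [27] and Kendall's tau [28]. The following is a hypothetical, self-contained sketch (not the paper's actual code; the attribution scores are made-up illustrative numbers) of computing Kendall's tau between an argument ranking before and after sparsification:</p>

```python
# Hypothetical illustration of Kendall's rank correlation [28]:
# tau = (concordant pairs - discordant pairs) / total pairs,
# assuming no ties in either score list.
from itertools import combinations

def kendall_tau(xs, ys):
    """Return Kendall's tau between two equal-length score lists."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # A pair is concordant if both lists order items i and j the same way.
        agreement = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Made-up attribution scores for five arguments, before and after
# sparsification; the sparsified ranking swaps one adjacent pair.
aae_original   = [0.90, 0.55, 0.30, 0.10, -0.20]
aae_sparsified = [0.85, 0.60, 0.05, 0.25, -0.15]

print(kendall_tau(aae_original, aae_sparsified))  # 1 discordant pair of 10 -> 0.8
```

<p>A tau of 1 indicates an identically ordered explanation; values near 0 indicate that the ordering is essentially uncorrelated after sparsification.</p>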
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"Why should I trust you?": Explaining the predictions of any classifier</article-title>
          ,
          <source>in: ACM SIGKDD</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          <volume>30</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>4765</fpage>
          -
          <lpage>4774</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Binder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Montavon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Klauschen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Samek</surname>
          </string-name>
          ,
          <article-title>On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation</article-title>
          ,
          <source>PLOS ONE 10</source>
          (
          <year>2015</year>
          )
          <elocation-id>e0130140</elocation-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>