<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SEBD</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>NT4XAI: a Framework Exploiting Network Theory to Support XAI on Classifiers</article-title>
        <subtitle>(Discussion Paper)</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gianluca Bonifazi</string-name>
          <email>g.bonifazi@univpm.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Cauteruccio</string-name>
          <email>f.cauteruccio@univpm.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Corradini</string-name>
          <email>e.corradini@pm.univpm.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michele Marchetti</string-name>
          <email>m.marchetti@pm.univpm.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio Terracina</string-name>
          <email>terracina@mat.unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Ursino</string-name>
          <email>d.ursino@univpm.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Virgili</string-name>
          <email>luca.virgili@univpm.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEMACS, University of Calabria</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DII, Polytechnic University of Marche</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>31</volume>
      <fpage>02</fpage>
      <lpage>05</lpage>
      <abstract>
        <p>Explainable AI (XAI, for short) aims to explain the behavior of closed AI systems that act as black-boxes (like many Machine Learning and Deep Learning systems). In this paper, we propose NT4XAI, a model-agnostic framework carrying out explainable AI on classifiers. NT4XAI is based on network theory and, consequently, is able to take advantage of the enormous amount of results found over the years by researchers in this area. Here, we describe both the data model and the approach used by NT4XAI to achieve its goals. Furthermore, we contextualize our framework within the existing XAI research scenarios. Finally, we illustrate some tests we carried out to assess its adequacy in performing the tasks for which it was designed.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable Artificial Intelligence</kwd>
        <kwd>Model-Agnostic XAI Systems</kwd>
        <kwd>Graph Theory</kwd>
        <kwd>Feature Relevance</kwd>
        <kwd>Feature Dyscrasia</kwd>
        <kwd>Sensitivity Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Explainable AI (XAI, for short) aims to identify transparent and interpretable explanations to the
decisions and actions of black-box AI systems [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1, 2, 3, 4, 5</xref>
        ]. It aims to know, at least partially, how
a black-box AI model acts and to use that information for improving its performance, increasing
confidence in it, as well as the level of acceptance of the knowledge it returns [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. With
the pervasive diffusion of Deep Learning (DL, for short), the number of black-box models has
grown tremendously and, hand in hand, interest in XAI has increased. One of the most challenging
issues in XAI concerns the study and development of “model-agnostic” XAI approaches. These
are capable of interpreting and explaining the decisions of any black-box system, regardless of
the model on which it is based. Therefore, they are extremely general, and investing in them
provides a considerable return, since they can be applied to explain very varied AI models. The
downside is that these systems are very difficult to design, because they must feature a high
abstraction level with respect to the black-box models they want to explain.
      </p>
      <p>
        In this paper, we aim to make a contribution in this setting by proposing NT4XAI (Network
Theory for Explainable AI), a model-agnostic framework for explainability of classifiers. NT4XAI
operates on a classifier model whose behavior is unknown. The classifier receives as input a
set of instances, all characterized by the same set of features, and assigns a class to each of
them. As its name indicates, NT4XAI is based on network theory [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]; in fact, it builds and
maintains a fully connected network. In it, nodes represent instances, while the direction of
the arc between two nodes indicates which of the two corresponding instances the classifier
classified with the higher confidence. Once the network is constructed, NT4XAI computes
the “dyscrasia” of each feature for all instances. This measure indicates the effectiveness of a
feature in discriminating instances. Starting from the values of dyscrasia thus obtained and the
properties of the constructed network, NT4XAI computes the relevance of each feature during
the classification process [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref9">9, 10, 11, 12, 13</xref>
        ]. For this purpose, it uses a version of PageRank
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] specifically defined to address this issue. The knowledge of the most relevant features
provides valuable information about the behavior of the black-box classifier, as it has already
been shown in the scientific literature on XAI [
        <xref ref-type="bibr" rid="ref1 ref15 ref16 ref17 ref9">1, 15, 9, 16, 17</xref>
        ]. The choice to use network theory
in NT4XAI is motivated by the extreme generality and flexibility characterizing network-based
representations. Furthermore, network theory has been intensively studied in the past, in terms
of both its theoretical aspects and its possible applications [
        <xref ref-type="bibr" rid="ref18 ref19 ref20">18, 19, 20</xref>
        ]. Therefore, NT4XAI can
benefit from the wide range of past results in this research field, adapting them to address the
issue for which it was conceived.
      </p>
      <p>The outline of this paper is as follows: In Section 2, we describe NT4XAI in detail. In Section
3, we present some experiments we performed to evaluate it. Finally, in Section 4, we draw
some conclusions and define some possible future developments of this research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Description of NT4XAI</title>
      <p>In this section, we illustrate the model underlying NT4XAI and the behavior of the latter. Let ℐ =
{I_1, I_2, · · · , I_n} be the set of instances to be classified and let 𝒞 = {C_1, C_2, · · · , C_m} be the set of
possible classes. Let ℱ = {F_1, F_2, · · · , F_q} be the set of features characterizing the instances
of ℐ. Accordingly, an instance I_i ∈ ℐ can be represented by the set ℱ_i = {v_i1, v_i2, · · · , v_iq}
of the values of its features. Here, v_ij indicates the value of the feature F_j ∈ ℱ in I_i. Each
feature F_j can be numeric, categorical or textual.</p>
      <p>
        Suppose we have a classifier model ℳ that was already trained. For each instance I_i ∈ ℐ, ℳ
assigns a class of 𝒞 to it with a confidence level c_i belonging to the real interval [0, 1]; the higher
c_i, the more confident ℳ is in classifying I_i (our classifier model assumes that each instance
can be assigned to exactly one class). The behavior of ℳ can be represented by a network
𝒩 = ⟨N, A⟩. The nodes of 𝒩 represent the instances of ℐ, while its arcs indicate the confidence
level of ℳ in classifying the instances associated with the corresponding nodes. Formally
speaking, there is a node n_i ∈ N for each instance I_i ∈ ℐ. Since a biunivocal correspondence
exists between a node n_i and an instance I_i, in the following we will use the terms “node” and
“instance”, as well as the symbols n_i and I_i, interchangeably. There is an arc of A for each pair
of nodes (n_i, n_h) of 𝒩. It is directed from n_i to n_h if c_i &lt; c_h; otherwise, if c_h &lt; c_i, it is directed
from n_h to n_i. Finally, if c_i = c_h, its direction is set randomly.
      </p>
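      <p>To make the construction above concrete, the following sketch builds the arc set of the fully connected network from a map of classification confidences. This is our own illustrative code, not NT4XAI's released implementation; the function name and the toy confidence values are ours.</p>

```python
import itertools
import random

def build_confidence_network(confidences):
    # Nodes are instance identifiers; for every pair of nodes, the arc points
    # from the instance classified with lower confidence to the one classified
    # with higher confidence. Ties are oriented at random, as in the text.
    nodes = list(confidences)
    arcs = set()
    for i, h in itertools.combinations(nodes, 2):
        if confidences[i] < confidences[h]:
            arcs.add((i, h))
        elif confidences[h] < confidences[i]:
            arcs.add((h, i))
        else:
            arcs.add((i, h) if random.random() < 0.5 else (h, i))
    return nodes, arcs

nodes, arcs = build_confidence_network({"a": 0.9, "b": 0.6, "c": 0.7})
# every arc goes from the lower-confidence node to the higher-confidence one
```

Note that the resulting network is always fully connected: exactly one arc exists for each of the n·(n−1)/2 node pairs.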
      <p>Having defined the model underlying NT4XAI, let us now see how our framework defines
the dyscrasia δ_j(v_ij, v_hj) between the values v_ij and v_hj of the feature F_j for the instances I_i
and I_h. The concept of dyscrasia is intended to capture the “disharmony” in the role that two
occurrences v_ij and v_hj of the same feature F_j played in the classification of the two instances
I_i and I_h made by ℳ. As we shall see below, the dyscrasia between two occurrences of the
same feature will play a key role in calculating the relevance of the latter. The reasoning
behind the definition of δ_j(v_ij, v_hj) is as follows. If ℳ assigned I_i and I_h to the same class,
the value of δ_j(v_ij, v_hj) is the greater the more: (i) v_ij and v_hj have dissimilar values, and (ii)
the confidences c_i and c_h with which ℳ classified I_i and I_h are low (meaning that there is no
significant confidence about the correctness of the actions of ℳ). In contrast, if ℳ assigned I_i
and I_h to different classes, the value of δ_j(v_ij, v_hj) is the greater the more: (i) v_ij and v_hj have
similar values, and (ii) the value of c_h is high and the one of c_i is low (meaning that the
possibility that ℳ classified I_h correctly and I_i incorrectly is significant).</p>
      <p>The dyscrasia δ_j(v_ij, v_hj) can be formalized as follows:</p>
      <p>δ_j(v_ij, v_hj) = e(n_i) · e(n_h) · f_j(v_ij, v_hj), if ℳ assigned I_i and I_h to the same class;
δ_j(v_ij, v_hj) = e(n_i) · c(n_h) · [1 − f_j(v_ij, v_hj)], otherwise.</p>
      <p>
        Here, f_j(·, ·) is a function that receives two values v_ij and v_hj and returns a value in the real
interval [0, 1] indicating the dissimilarity degree between v_ij and v_hj. Clearly, the definition
of f_j(·, ·) depends on the type of F_j. For example, if F_j is numeric, f_j(·, ·) might return the
absolute value of the difference between v_ij and v_hj, suitably normalized. c(·) is a function
that receives a node n_i and returns the confidence c_i of ℳ in classifying the instance I_i
corresponding to n_i. Finally, e(·) receives a node n_i and returns the error of ℳ in classifying
I_i. It is defined as e(n_i) = 1 − c(n_i).
      </p>
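      <p>A minimal sketch of the two-case dyscrasia computation above; the function and parameter names are ours, chosen for illustration only.</p>

```python
def dyscrasia(f_diss, c_i, c_h, same_class):
    # f_diss is the dissimilarity f_j(v_ij, v_hj) in [0, 1];
    # c_i, c_h are the classifier confidences; e(n) = 1 - c(n) is the error.
    e_i, e_h = 1.0 - c_i, 1.0 - c_h
    if same_class:
        # Same class: dyscrasia grows when the values are dissimilar
        # and both confidences are low.
        return e_i * e_h * f_diss
    # Different classes: dyscrasia grows when the values are similar,
    # the confidence on I_h is high and the one on I_i is low.
    return e_i * c_h * (1.0 - f_diss)
```

Both branches are maximal at 1.0: for the same class, when the values are fully dissimilar and both confidences are zero; for different classes, when the values coincide, c_h = 1 and c_i = 0.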
      <p>
        Having defined the dyscrasia between two occurrences of a feature, we are now able to
describe how NT4XAI defines the relevance of a feature during a classification process performed
by a (possibly) black-box classifier. Recall that, based on the definition of the model underlying
NT4XAI, given a node n_i ∈ N, its incoming (resp., outgoing) arcs start from (resp., end in) nodes whose
associated instances have been classified with lower (resp., higher) or equal confidence. The
two sets can be defined as follows: In_i = {n_h | n_h ∈ N, n_h ≠ n_i, (n_h, n_i) ∈ A} and Out_i =
{n_h | n_h ∈ N, n_h ≠ n_i, (n_i, n_h) ∈ A}. Let F_j be the feature whose relevance NT4XAI must
determine. In order to carry out this task, NT4XAI must preliminarily determine the relevance
of F_j for each instance I_i ∈ ℐ. Let n_i be the node corresponding to I_i in 𝒩. Based on what we
said above, in determining the role of F_j in the classification task, n_i can act as a “guide” for
the nodes of In_i, while it should be “guided” by the nodes of Out_i. One way to formalize this
reasoning is to adapt PageRank centrality [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to this scenario. Proceeding in this way, we have
that the relevance r(v_ij) of v_ij can be defined as:
      </p>
      <p>r(v_ij) = (1 − d_i) / |N| + d_i · ∑_{n_h ∈ In_i} ( r(v_hj) / |Out_h| )</p>
      <p>As can be seen from this formula, the relevance of v_ij includes a fixed and a variable
component. The former depends on the number of nodes in 𝒩. The latter depends on the
relevance of the feature occurrences related to the starting nodes of the arcs incoming into n_i.
The relevance r(v_hj) of each of these nodes n_h is weighted by the number of arcs outgoing
from n_h. In fact, the greater the number of these arcs, the lower the weight of r(v_hj). This is
justified considering that the number of arcs outgoing from n_h indicates the number of nodes
having a higher confidence than n_h.</p>
      <p>
        Unlike the original PageRank formula [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the damping factor d_i in the definition of r(v_ij)
does not have a constant value, but varies for each node n_i ∈ N and depends on the characteristics of
the latter. In particular, it depends on the number of arcs outgoing from n_i and the dyscrasia
between the feature occurrence of each of the ending nodes of these arcs and the feature occurrence v_ij of F_j in
n_i. More specifically, d_i can be defined as follows: d_i = σ( ( ∑_{n_h ∈ Out_i} δ_j(v_ij, v_hj) ) / |Out_i| ).
      </p>
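      <p>Under this definition, the damping factor can be sketched as the sigmoid of the average dyscrasia toward the higher-confidence neighbours in Out_i. The helper below is our own hypothetical illustration, not NT4XAI's API.</p>

```python
import math

def damping_factor(dyscrasias_to_out):
    # dyscrasias_to_out: the dyscrasia values delta_j(v_ij, v_hj) toward the
    # ending nodes of the arcs outgoing from n_i (all non-negative).
    mean = sum(dyscrasias_to_out) / len(dyscrasias_to_out)
    # Sigmoid of a non-negative argument lies in [0.5, 1).
    return 1.0 / (1.0 + math.exp(-mean))
```

With zero dyscrasia toward all higher-confidence neighbours, d_i takes its minimum value 0.5; it approaches 1 as the average dyscrasia grows, matching the positive correlation discussed in the text.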
      <p>The rationale for this definition is the following: the value of d_i depends on the magnitude
of the dyscrasia between the occurrence of F_j for n_i and the occurrence of F_j for all the ending
nodes of the arcs outgoing from n_i, which are thus characterized by a higher confidence than the one
of n_i. Therefore, there is a positive correlation between the values of the damping factor and
those of dyscrasia. Let us now consider the definition of r(v_ij); in it, if the value of d_i is high,
the weight of the first term in the formula tends to be very low. The second term depends
strongly on the number of arcs incoming into n_i. If that number is low (implying that the
confidence of ℳ in the classification of I_i is low) then the relevance of v_ij will be low. This
is correct since ℳ did not show a high confidence in classifying I_i, and v_ij showed a high
dyscrasia with the feature occurrences of the nodes whose instances were classified by ℳ with
a higher confidence than I_i. The function σ(·) present in the formula of d_i is the sigmoid
function. It varies between 0 and 1 when its argument varies from −∞ to +∞. In particular, if
the argument can only be non-negative, as in our case, σ(·) varies between 0.5 and 1 and acts
as an amplifier of the differences in the values taken on by its argument.</p>
      <p>Having defined the relevance of a single feature occurrence v_ij, we can define the relevance
of a feature F_j as the mean of the relevances of all its occurrences: r(F_j) = ( ∑_{I_i ∈ ℐ} r(v_ij) ) / |ℐ|.</p>
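      <p>The adapted PageRank iteration above can be sketched as follows. This is illustrative code of ours (names and the toy network are assumptions, not NT4XAI's implementation); it runs the fixed-point iteration with a per-node damping factor and then averages the occurrence relevances.</p>

```python
def occurrence_relevances(nodes, arcs, damping, n_iter=100):
    # Fixed-point iteration of
    #   r(v_ij) = (1 - d_i)/|N| + d_i * sum_{n_h in In_i} r(v_hj)/|Out_h|
    # with a per-node damping factor d_i. Every node in an In set is the
    # source of at least one arc, so its out-degree is never zero.
    n = len(nodes)
    incoming = {v: [h for (h, t) in arcs if t == v] for v in nodes}
    out_deg = {v: sum(1 for (h, _) in arcs if h == v) for v in nodes}
    r = {v: 1.0 / n for v in nodes}
    for _ in range(n_iter):
        r = {v: (1.0 - damping[v]) / n
                + damping[v] * sum(r[h] / out_deg[h] for h in incoming[v])
             for v in nodes}
    return r

def feature_relevance(r):
    # r(F_j): mean of the relevances of all occurrences of F_j.
    return sum(r.values()) / len(r)

nodes = ["a", "b", "c"]
arcs = {("b", "a"), ("b", "c"), ("c", "a")}  # toy confidence network
d = {v: 0.85 for v in nodes}                 # constant damping for the demo
r = occurrence_relevances(nodes, arcs, d)
```

On this small acyclic network the iteration converges in a few steps; the node with the most incoming arcs (the most confidently classified instance) obtains the highest relevance.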
    </sec>
    <sec id="sec-3">
      <title>3. Experimental campaign</title>
      <p>
        We implemented NT4XAI in Python 3.9 and performed our tests on a 2019 MacBook Pro
equipped with 16 GB of RAM and a 6-core 2.6 GHz Intel Core i7 processor. In addition, we chose multiple
classifier models among those most widely used in the literature [
        <xref ref-type="bibr" rid="ref11 ref21 ref22">11, 21, 22</xref>
        ]. Specifically, the
classifiers we chose are: (i) Naive Bayes (hereafter, NB); (ii) SVM with polynomial kernel
(hereafter, SVMP); (iii) SVM with radial basis function kernel (hereafter, SVMR); (iv) Multi-Layer
Perceptron (hereafter, MLP); (v) Random Forest (hereafter, RF). Naive Bayes is a probabilistic
classifier, unlike SVM. Regarding the latter, we considered two kernels. The first, polynomial,
considers features and their combinations. The second, radial, separates data using a nonlinear
decision boundary. Multi-Layer Perceptron is a special case of neural network and therefore
is a totally black-box model. Finally, Random Forest is an ensemble learning model. In these
experiments, we chose classifiers of different types, which exhibit very different behaviors,
because we wanted to test the real ability of NT4XAI to be model-agnostic.
      </p>
      <p>
        During the test campaign, we used the Iris dataset [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] published on the UCI Machine
Learning Repository [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. It consists of 150 instances, 4 features and 3 classes. Specifically, the
features are: (i) sepal_length, representing the sepal length in centimeters; its values range
in the real interval [4.3, 7.9]; (ii) sepal_width, denoting the sepal width in centimeters; its
values range in the real interval [2.0, 4.4]; (iii) petal_length, indicating the petal length in
centimeters; its values range in the real interval [1.0, 6.9]; (iv) petal_width, representing the
petal width in centimeters; its values range in the real interval [0.1, 2.5]. Although all features
are numerical, their values are very heterogeneous. To homogenize them, we performed a
normalization task by using a min-max scaler [25]. It operates as follows: given the value v′
of a feature, whose maximum and minimum values are v′_max and v′_min, the scaler obtains
the normalized value v of v′ as: v = (v′ − v′_min) / (v′_max − v′_min). v belongs to the real interval [0, 1].
Now, since all feature occurrences are normalized between 0 and 1, we chose as the dissimilarity
function f_j(v_ij, v_hj) between two feature occurrences v_ij and v_hj the absolute value of their
difference: f_j(v_ij, v_hj) = |v_ij − v_hj|.
      </p>
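      <p>As an illustration, the min-max scaling and the resulting dissimilarity function can be written as follows (our own sketch; the sample values only mimic the sepal_length range):</p>

```python
def min_max_scale(values):
    # v = (v' - v'_min) / (v'_max - v'_min), mapping the values into [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def dissimilarity(v_i, v_h):
    # f_j(v_ij, v_hj) = |v_ij - v_hj| on normalized occurrences.
    return abs(v_i - v_h)

scaled = min_max_scale([4.3, 6.1, 7.9])  # sepal_length-like range
```

Since both occurrences lie in [0, 1] after scaling, the dissimilarity is guaranteed to lie in [0, 1] as well, as required by the definition of f_j(·, ·).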
      <p>The first test we carried out was the computation of the accuracy of classifiers. In Table 1,
we report the results obtained. As can be seen from this table, the values are very high. This
allows us to conclude that all classifiers considered can guarantee high confidence values and,
therefore, can be employed in the next tests.</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption>
          <p>Accuracy of the classifiers</p>
        </caption>
        <table>
          <thead>
            <tr><th>Model</th><th>Accuracy</th></tr>
          </thead>
          <tbody>
            <tr><td>Naive Bayes</td><td>0.93</td></tr>
            <tr><td>SVM with polynomial kernel</td><td>0.98</td></tr>
            <tr><td>SVM with radial basis function kernel</td><td>0.96</td></tr>
            <tr><td>Multi-Layer Perceptron</td><td>0.93</td></tr>
            <tr><td>Random Forest</td><td>0.96</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Before proceeding further, a premise is necessary. The main objective of our analysis is to
check whether there are any features that have a higher relevance value than others. Therefore,
if all classifiers showed no significant differences between the relevance values of the various
features, we could reasonably conclude that the latter all have the same relevance. In contrast,
if some or all of the classifiers show significantly different relevance values for the various
features and agree in indicating which of them are the most relevant, we could reasonably
conclude that the relevance values of the features are significantly different and could determine
which features are most relevant. In this case, the best classifiers would be those that can best
show the differences in the relevances among the various features. Having this in mind, we
can proceed with the next tests. The first of them aims to compute the value of the damping
factor for the various features and classifiers. Figure 1 reports the corresponding distributions
represented by means of boxplots.</p>
      <p>From the analysis of this figure, we can see that the classifiers show completely different
behaviors. In fact:
• Naive Bayes tends to assign similar and very low values to the damping factor for all
features.
• Polynomial SVM assigns very different values to the damping factor for different features.
Therefore, it shows a very good ability to discriminate features.
• Radial SVM shows differences in the values of the damping factor, although these are
smaller than the ones shown by Polynomial SVM.
• Multi-Layer Perceptron returns very different values of the damping factor for the
occurrences of the same feature. In contrast, its median values are all very high. This classifier
proved less capable of discriminating features than the two SVM classifiers, although it
seems better than Naive Bayes.
• Random Forest returns results similar, albeit less extreme, to the ones returned by Naive
Bayes. It does not reveal much ability to discriminate features.</p>
      <p>The results on the damping factor shown above are indicative of potential trends but are still
preliminary. In fact, they need to be confirmed or corrected by the analysis of the relevance
values, which represent the final outcome of our XAI process. These results are shown in
Figure 2. From the analysis of this figure, we can conclude that:
• Naive Bayes and Random Forest are unable to discriminate feature relevances.
• The two SVM classifiers and Multi-Layer Perceptron are capable of discriminating feature
relevances, although to different degrees.
• The differences identified by the various classifiers are concordant. In fact, the two SVM
classifiers and, to some extent, also Multi-Layer Perceptron, show that petal_length
and petal_width are more relevant than sepal_length and sepal_width.
• Polynomial SVM and Radial SVM prove to be the most capable of discerning differences
in feature relevances.</p>
      <p>The conclusions drawn from the examination of Figure 2 are qualitative and only partially
quantitative. Actually, it would be important to find a way to quantify the different abilities of
the classifiers to discern feature relevance. A first way to achieve this goal is to compare the
median values of the occurrence relevances for each feature and for each classifier. These values
are reported in Table 2. The analysis of this table shows that, even at the quantitative level,
petal_length and petal_width are more relevant than sepal_length and sepal_width.</p>
      <table-wrap id="tbl2">
        <label>Table 2</label>
        <caption>
          <p>Median values of the occurrence relevances for each feature and for each classifier</p>
        </caption>
        <table>
          <thead>
            <tr><th>Model</th><th>Feature</th><th>Relevance</th></tr>
          </thead>
          <tbody>
            <tr><td>NB</td><td>sepal_length</td><td>0.014598</td></tr>
            <tr><td>NB</td><td>sepal_width</td><td>0.014572</td></tr>
            <tr><td>NB</td><td>petal_length</td><td>0.014696</td></tr>
            <tr><td>NB</td><td>petal_width</td><td>0.014714</td></tr>
            <tr><td>SVMR</td><td>sepal_length</td><td>0.009293</td></tr>
            <tr><td>SVMR</td><td>sepal_width</td><td>0.009238</td></tr>
            <tr><td>SVMR</td><td>petal_length</td><td>0.011012</td></tr>
            <tr><td>SVMR</td><td>petal_width</td><td>0.011139</td></tr>
            <tr><td>RF</td><td>sepal_length</td><td>0.014313</td></tr>
            <tr><td>RF</td><td>sepal_width</td><td>0.014280</td></tr>
            <tr><td>RF</td><td>petal_length</td><td>0.014504</td></tr>
            <tr><td>RF</td><td>petal_width</td><td>0.014534</td></tr>
            <tr><td>SVMP</td><td>sepal_length</td><td/></tr>
            <tr><td>SVMP</td><td>sepal_width</td><td/></tr>
            <tr><td>SVMP</td><td>petal_length</td><td/></tr>
            <tr><td>SVMP</td><td>petal_width</td><td/></tr>
            <tr><td>MLP</td><td>sepal_length</td><td/></tr>
            <tr><td>MLP</td><td>sepal_width</td><td/></tr>
            <tr><td>MLP</td><td>petal_length</td><td/></tr>
            <tr><td>MLP</td><td>petal_width</td><td/></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>A second, more accurate way to achieve the goal above is to introduce a new function Q(·).
It receives a classifier ℳ and returns a real number in the interval [0, 100] that measures the
ability of ℳ to differentiate feature relevances. Q(·) can be defined as follows:</p>
      <p>Q(ℳ) = ( (max_ℳ − min_ℳ) / MCPI_ℳ ) · 100</p>
      <p>Here, max_ℳ (resp., min_ℳ) is the maximum (resp., minimum) value taken by the median
relevance of a feature when ℳ is adopted. MCPI_ℳ (Maximum Central Percentile Interval)
is obtained in the following way: first, we compute the widths of the intervals between the values
corresponding to the 25th and 75th percentiles of the distributions of the feature relevances
returned by ℳ. Then, we calculate the maximum of these widths. In the formula of Q(·), we
decided to take the values corresponding to the 25th and 75th percentiles, instead of all values,
to avoid Q(·) being sensitive to outliers.</p>
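      <p>A sketch of how Q(·) could be computed from the per-feature relevance distributions; the function name and the toy data are ours, and we assume the standard-library percentile computation as a stand-in for the one actually used.</p>

```python
import statistics

def q_score(relevances_by_feature):
    # relevances_by_feature: feature name -> list of occurrence relevances
    # produced by classifier M for that feature.
    medians = [statistics.median(v) for v in relevances_by_feature.values()]
    # MCPI: the maximum width of the 25th-75th percentile intervals.
    widths = []
    for v in relevances_by_feature.values():
        q1, _, q3 = statistics.quantiles(v, n=4)  # 25th, 50th, 75th percentiles
        widths.append(q3 - q1)
    mcpi = max(widths)
    return (max(medians) - min(medians)) / mcpi * 100.0

toy = {"f1": [1, 2, 3, 4, 5], "f2": [2, 3, 4, 5, 6]}
```

Using the interquartile width rather than the full range in the denominator keeps Q(·) robust to outlying occurrence relevances, as argued above.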
      <p>In Table 3, we report the values returned by Q(·) for the classifiers of our interest. This table
gives us an accurate quantitative confirmation of what we had guessed qualitatively from examining
Figures 1 and 2 and Table 2. In particular, it allows us to conclude that the best classifier in
differentiating feature relevances is Polynomial SVM, with a value of Q(·) equal to 37.47%,
while the second best classifier is Radial SVM, with a value of Q(·) equal to 17.62%. Multi-Layer
Perceptron is still a good classifier, while Naive Bayes and Random Forest are incapable of
discriminating which features are most relevant.</p>
      <table-wrap id="tbl3">
        <label>Table 3</label>
        <caption>
          <p>Values of Q(·) for the classifiers of our interest</p>
        </caption>
        <table>
          <thead>
            <tr><th>Model</th><th>Value of Q(·)</th></tr>
          </thead>
          <tbody>
            <tr><td>Naive Bayes</td><td>1.29%</td></tr>
            <tr><td>Polynomial SVM</td><td>37.47%</td></tr>
            <tr><td>Radial SVM</td><td>17.62%</td></tr>
            <tr><td>Multi-Layer Perceptron</td><td>11.43%</td></tr>
            <tr><td>Random Forest</td><td>2.50%</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this paper, we have proposed NT4XAI, a model-agnostic, network-based XAI framework to
explain the behavior of any classifier. As its name indicates, NT4XAI is based on network theory
and the vast amount of results obtained in this research area in the past. NT4XAI achieves its
goal by evaluating the relevance of features in the behavior of a classifier. We also described
some tests that allowed us to evaluate the effectiveness of NT4XAI both quantitatively and
qualitatively. The main contributions of this paper are: (i) the definition of NT4XAI, a new
model-agnostic network-based XAI framework; (ii) the definition of the concept of dyscrasia,
by which the consistency of the occurrences of a feature during the classification process can
be qualitatively evaluated; (iii) the definition of an approach for calculating the relevance of a
feature in classifying the corresponding instances.</p>
      <p>
        As for possible future developments of this research, we can first think of extending NT4XAI
by considering latent structural properties in our network-based model. Also, we could use a
totally different network model, such as a multilayer network [
        <xref ref-type="bibr" rid="ref8">8, 26</xref>
        ], to support NT4XAI. This
would allow us to have a new point of view and capture different properties [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] using local
model knowledge.
      </p>
      <p>[25] M. Ahsan, M. Mahmud, P. Saha, K. Gupta, Z. Siddique, Effect of data scaling methods on machine learning
algorithms and model performance, Technologies 9 (2021) 52. MDPI.</p>
      <p>[26] G. Bonifazi, B. Breve, S. Cirillo, E. Corradini, L. Virgili, Investigating the COVID-19 vaccine discussions on
Twitter through a multilayer network-based approach, Information Processing &amp; Management 59 (2022)
103095. Elsevier.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barredo Arrieta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Díaz-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Del</given-names>
            <surname>Ser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bennetot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tabik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barbado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gil-Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Benjamins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chatila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <article-title>Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI</article-title>
          ,
          <source>Information Fusion</source>
          <volume>58</volume>
          (
          <year>2020</year>
          )
          <fpage>82</fpage>
          -
          <lpage>115</lpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Gunning</surname>
          </string-name>
          , D. Aha,
          <source>DARPA's Explainable Artificial Intelligence (XAI) Program, AI</source>
          Magazine
          <volume>40</volume>
          (
          <year>2019</year>
          )
          <fpage>44</fpage>
          -
          <lpage>58</lpage>
          . AAAI.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Awad</surname>
          </string-name>
          ,
          <source>On the Explainability of Natural Language Processing Deep Models, ACM Computing Surveys</source>
          <volume>55</volume>
          (
          <year>2022</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Adadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Berrada</surname>
          </string-name>
          ,
          <article-title>Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)</article-title>
          ,
          <source>IEEE Access 6</source>
          (
          <year>2018</year>
          )
          <fpage>52138</fpage>
          -
          <lpage>52160</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>Explainable Artificial Intelligence for manufacturing cost estimation and machining feature visualization</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>183</volume>
          (
          <year>2021</year>
          )
          <fpage>115430</fpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Uslu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Rittichier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Durresi</surname>
          </string-name>
          ,
          <source>Trustworthy Artificial Intelligence: A Review, ACM Computing Surveys</source>
          <volume>55</volume>
          (
          <year>2022</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Di</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Trustworthy AI: From Principles to Practices</article-title>
          ,
          <source>ACM Computing Surveys</source>
          (
          <year>2022</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <source>Networks</source>
          ,
          <year>2018</year>
          . Oxford University Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Razmjoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Xanthopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Online feature importance ranking based on sensitivity analysis</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>85</volume>
          (
          <year>2017</year>
          )
          <fpage>397</fpage>
          -
          <lpage>406</lpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Strumbelj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kononenko</surname>
          </string-name>
          ,
          <article-title>An efficient explanation of individual classifications using game theory</article-title>
          ,
          <source>The Journal of Machine Learning Research</source>
          <volume>11</volume>
          (
          <year>2010</year>
          )
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          . JMLR.org.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dabkowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gal</surname>
          </string-name>
          ,
          <article-title>Real Time Image Saliency for Black Box Classifiers</article-title>
          ,
          <source>in: Proc. of the International Conference on Neural Information Processing Systems (NIPS'17)</source>
          , Long Beach, CA, USA,
          <year>2017</year>
          , pp.
          <fpage>6970</fpage>
          -
          <lpage>6979</lpage>
          . Curran Associates Inc.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vedaldi</surname>
          </string-name>
          ,
          <article-title>Interpretable explanations of black boxes by meaningful perturbation</article-title>
          ,
          <source>in: Proc. of the International IEEE Conference on Computer Vision</source>
          (ICCV'17), Venice, Italy,
          <year>2017</year>
          , pp.
          <fpage>3449</fpage>
          -
          <lpage>3457</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Page</surname>
          </string-name>
          ,
          <article-title>The Anatomy of a Large-Scale Hypertextual Web Search Engine</article-title>
          ,
          <source>Computer Networks</source>
          <volume>30</volume>
          (
          <year>1998</year>
          )
          <fpage>107</fpage>
          -
          <lpage>117</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N.</given-names>
            <surname>Burkart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Huber</surname>
          </string-name>
          ,
          <article-title>A survey on the explainability of supervised machine learning</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          <volume>70</volume>
          (
          <year>2021</year>
          )
          <fpage>245</fpage>
          -
          <lpage>317</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Štrumbelj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kononenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Šikonja</surname>
          </string-name>
          ,
          <article-title>Explaining instance classifications with interactions of subsets of feature values</article-title>
          ,
          <source>Data &amp; Knowledge Engineering</source>
          <volume>68</volume>
          (
          <year>2009</year>
          )
          <fpage>886</fpage>
          -
          <lpage>904</lpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>“Why should I trust you?” Explaining the predictions of any classifier</article-title>
          ,
          <source>in: Proc. of the International Conference on Knowledge Discovery and Data Mining (KDD'16)</source>
          , San Francisco, CA, USA,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gosak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Markovič</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dolenšek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rupnik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marhl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stožer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perc</surname>
          </string-name>
          ,
          <article-title>Network science of biological systems at different scales: A review</article-title>
          ,
          <source>Physics of Life Reviews</source>
          <volume>24</volume>
          (
          <year>2018</year>
          )
          <fpage>118</fpage>
          -
          <lpage>135</lpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>O.</given-names>
            <surname>Sporns</surname>
          </string-name>
          ,
          <article-title>Graph theory methods: applications in brain networks</article-title>
          ,
          <source>Dialogues in Clinical Neuroscience</source>
          (
          <year>2022</year>
          ). Taylor &amp; Francis.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Camacho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panizo-LLedot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bello-Orgaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <article-title>The four dimensions of social network analysis: An overview of research methods, applications, and software tools</article-title>
          ,
          <source>Information Fusion</source>
          <volume>63</volume>
          (
          <year>2020</year>
          )
          <fpage>88</fpage>
          -
          <lpage>120</lpage>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zick</surname>
          </string-name>
          ,
          <article-title>Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems</article-title>
          ,
          <source>in: Proc. of the International Symposium on Security and Privacy (SP'16)</source>
          , IEEE, Fairmont, San Jose, CA, USA,
          <year>2016</year>
          , pp.
          <fpage>598</fpage>
          -
          <lpage>617</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>A.</given-names>
            <surname>Henelius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Puolamäki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ukkonen</surname>
          </string-name>
          ,
          <article-title>Interpreting classifiers through attribute interactions in datasets</article-title>
          ,
          <source>arXiv preprint arXiv:1707.07576</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fisher</surname>
          </string-name>
          ,
          <article-title>The use of multiple measurements in taxonomic problems</article-title>
          ,
          <source>Annals of Eugenics</source>
          <volume>7</volume>
          (
          <year>1936</year>
          )
          <fpage>179</fpage>
          -
          <lpage>188</lpage>
          . Wiley Online Library.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Asuncion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <article-title>UCI machine learning repository</article-title>
          ,
          <year>2007</year>
          . Available online at: https://archive.ics.uci.edu/ml/index.php.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>