An XAI-based masking approach to improve classification systems

Andrea Apicella 1,2,3,*,†, Salvatore Giugliano 1,2,3,†, Francesco Isgrò 1,2,3,†, Andrea Pollastro 1,2,3,4,†, Roberto Prevete 1,2,3,†

1 Laboratory of Augmented Reality for Health Monitoring (ARHeMLab)
2 Laboratory of Artificial Intelligence, Privacy & Applications (AIPA Lab)
3 Department of Electrical Engineering and Information Technology, University of Naples Federico II
4 Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

Abstract
Explainable Artificial Intelligence (XAI) seeks to elucidate the decision-making mechanisms of AI models, enabling users to glean insights beyond the results they produce. While a key objective of XAI is to enhance the performance of AI models through explanatory processes, a notable portion of the XAI literature predominantly addresses the explanation of AI systems, with limited focus on leveraging XAI methods for performance improvement. This study introduces a novel approach utilizing Integrated Gradients explanations to enhance a classification system, which is subsequently evaluated on three datasets: Fashion-MNIST, CIFAR10, and STL10. Empirical findings indicate that Integrated Gradients explanations effectively contribute to enhancing classification performance.

Keywords
XAI, Machine Learning, DNN, Integrated Gradients, attributions

2nd Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Programming (BEWARE-23), co-located with AIxIA 2023, Roma Tre University, Roma, Italy, 2023
* Corresponding author.
† These authors contributed equally.
Emails: andrea.apicella@unina.it (A. Apicella); salvatore.giugliano2@unina.it (S. Giugliano); francesco.isgro@unina.it (F. Isgrò); andrea.pollastro@unina.it (A. Pollastro); rprevete@unina.it (R. Prevete)
ORCID: 0000-0002-5391-168X (A. Apicella); 0000-0002-1791-6416 (S. Giugliano); 0000-0001-9342-5291 (F. Isgrò); 0000-0003-4075-0757 (A. Pollastro); 0000-0002-3804-1719 (R. Prevete)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

1. Introduction

Explainable Artificial Intelligence (XAI) plays a crucial role in understanding the decision-making processes of AI models, especially as they become integral to critical applications in healthcare, finance, and everyday life. While the existing XAI literature primarily focuses on providing explanations for AI systems, there is a notable gap in leveraging these explanations to enhance the performance of the models. This paper addresses this gap by examining an established XAI method commonly employed in Machine Learning (ML) classification tasks, with the goal of using its explanations for model improvement. The core concept hinges on the idea that explanations of model outputs offer insights that can be used to fine-tune the ML system parameters effectively. However, interpreting Deep Neural Networks (DNNs) can be challenging due to their inherent complexity, demanding explanations that are human-readable. This work operates on the premise that explanation-derived knowledge can be harnessed to comprehend the model's strengths and weaknesses, thereby enhancing its adaptability to various inputs. In this context, explanations are constructed based on the behavior of the ML system, shedding light on its input-output relationships. Consequently, they enable the identification of input characteristics influencing outputs, thereby empowering adjustments to the ML system itself. This paper specifically delves into the exploration of the Integrated Gradients [1] XAI method to assess whether the relevant features it highlights can be used in conjunction with the input data to augment the classification performance of an ML system. The results of this approach have been more extensively treated in [2].

2. Related works

The internal mechanisms of modern ML approaches, particularly in the realm of Deep Learning, often remain opaque, making it challenging for AI scientists to fully grasp the underlying processes guiding their behaviors. XAI methods have gained prominence in providing explanations for various classification systems across domains such as images [3, 4, 5, 6, 7], natural language processing [8, 9], clinical decision support systems [10], and more. In particular, Integrated Gradients (IG) was proposed in [1], an XAI method that averages the gradients computed between an input x and a reference x_ref, where the reference is chosen so that the model prediction C(x_ref) is neutral. IG also takes into account the magnitude of the gradients of the features for inputs close to the baseline: the relevance of each feature x_i is determined by aggregating the gradients along the intermediate inputs on the straight-line path connecting the baseline and the input.
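For reference, the attribution that IG assigns to the i-th feature of x can be written, following [1], as

\mathrm{IG}_i(\mathbf{x}) = \left(x_i - x_i^{\mathrm{ref}}\right) \int_{0}^{1} \frac{\partial C\!\left(\mathbf{x}^{\mathrm{ref}} + \alpha\,(\mathbf{x} - \mathbf{x}^{\mathrm{ref}})\right)}{\partial x_i}\, d\alpha ,

where C here denotes the model output for the class being explained; in practice, the integral is approximated by a finite sum of gradients evaluated at equally spaced points along the straight-line path.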
However, the application of XAI methods to enhance the performance of ML models in classification tasks is a relatively underexplored area in current research. A survey in [11] provides an overview of works leveraging XAI methods to improve classification systems. Furthermore, [12, 13, 2] conduct an empirical analysis of several well-known XAI methods on an ML system trained on EEG data, showing that many components identified as relevant by XAI methods can potentially be employed to build a system with improved generalization capabilities. In contrast, the primary focus of the current study is to assess the effectiveness of selected XAI methods in enhancing the performance of a machine learning system for image classification tasks. Additionally, the study delves into various strategies for integrating input data and explanations to optimize the ML system's performance. The detailed results have been further elaborated in [2], where they are also compared with an alternative strategy.

3. Method

This study endeavors to propose a viable method for leveraging an XAI explanation to enhance the performance of a classifier. It is essential to note that our approach begins with the premise that, for a specific input, an explanation of the model's output for the correct target class is accessible. While this assumption may not hold in real-world scenarios, where the correct class of a new input is unknown, it is a starting point for effectively investigating the potential improvement in classification performance through the utilization of explanations.

We suggest a potential approach for integrating IG explanations into the classification process through a soft-masking scheme. In essence, we enable the model to combine the relevance A(x, C) with the input x. To accomplish this, we introduce an additional mixer network, denoted as the Mixer, which is connected to the classifier C, as illustrated in Fig. 1. We employ two additional networks, E_x and E_A, to reduce the dimensionality of x and A(x, C), respectively. The outputs of E_x and E_A are then concatenated and fed into the Mixer. The resulting output of the Mixer can be interpreted as an input mask M, which is used to weight the input x before it is fed to the classifier C. The parameters of the Mixer, E_x, and E_A can be learned while keeping the parameters of C fixed. This involves employing standard training procedures on the non-fixed parameters, effectively searching for the set of parameters of the Mixer, E_x, and E_A that best reduces and integrates A(x, C) and x for the given classifier C.

Figure 1: Architecture of the soft-masking schema.
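As a purely illustrative sketch (not the authors' code), the scheme of Fig. 1 could be realized along the following lines in PyTorch; the layer sizes loosely follow the Fashion-MNIST column of Tab. 1, while the sigmoid used to keep the mask in [0, 1] and the element-wise product between mask and input are our assumptions.

# Hypothetical sketch of the soft-masking scheme of Fig. 1; this is not the authors' code.
import torch
import torch.nn as nn

def mlp(sizes):
    # Stack of fully-connected layers with batch norm + ReLU between them.
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers += [nn.BatchNorm1d(sizes[i + 1]), nn.ReLU()]
    return nn.Sequential(*layers)

class SoftMasking(nn.Module):
    def __init__(self, classifier, in_dim=784, emb_dim=128):
        super().__init__()
        self.E_x = mlp([in_dim, 512, 256, emb_dim])   # encoder for the input x
        self.E_A = mlp([in_dim, 512, 256, emb_dim])   # encoder for the attribution A(x, C)
        self.mixer = mlp([2 * emb_dim, 512, in_dim])  # produces the mask M
        self.C = classifier                           # pre-trained classifier, kept frozen
        for p in self.C.parameters():
            p.requires_grad = False

    def forward(self, x, A):
        z = torch.cat([self.E_x(x), self.E_A(A)], dim=1)
        M = torch.sigmoid(self.mixer(z))              # soft mask in [0, 1] (our assumption)
        return self.C(M * x)                          # classify the soft-masked input

In this sketch only E_x, E_A, and the Mixer carry trainable parameters, consistently with Sect. 3.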
4. Experimental assessment

The Fashion-MNIST [14], CIFAR10, and STL10 datasets were used as benchmarks. A ResNet18 [15] pre-trained on ImageNet was adopted as the classifier C for the CIFAR10 and STL10 datasets, while a neural network with two fully-connected layers and ReLU activation functions was used for the Fashion-MNIST dataset. Baseline models were obtained by fine-tuning C on the training set provided with each adopted dataset. Then, for each input and each baseline classifier, the Integrated Gradients explanations were computed. The adopted architectures are reported in Tab. 1.

Table 1
Architectures adopted. For each Fully-Connected (FC) layer, the number indicates how many neurons are employed; BN+ReLU denotes batch normalization followed by ReLU. The C module adopted for CIFAR10 and STL10 was a ResNet18 pretrained on ImageNet.

STL10          E_x, E_A: FC 4096 - BN+ReLU - FC 2048 - BN+ReLU - FC 1024 - BN+ReLU - FC 512 - BN+ReLU - FC 256 - BN+ReLU - FC 128
STL10          Mixer:    FC 512 - BN+ReLU - FC 1024 - BN+ReLU - FC 4096 - BN+ReLU - FC 9216
CIFAR10        E_x, E_A: FC 2048 - BN+ReLU - FC 1024 - BN+ReLU - FC 512 - BN+ReLU - FC 256 - BN+ReLU - FC 128
CIFAR10        Mixer:    FC 512 - BN+ReLU - FC 1024 - BN+ReLU - FC 1024
Fashion-MNIST  E_x, E_A: FC 512 - BN+ReLU - FC 256 - BN+ReLU - FC 128
Fashion-MNIST  Mixer:    FC 512 - BN+ReLU - FC 784
Fashion-MNIST  C:        FC 128 - ReLU - FC 64 - ReLU - FC 10

The training consisted in optimizing the Mixer network, E_x, and E_A while freezing the parameters of C. The training was carried out with the Adam algorithm, using a validation set made of 30% of the training data to stop the iterative learning process. The best batch size and learning rate were found with a grid-search approach, with batch sizes in {64, 128, 256} and learning rates in the range [0.001, 0.01] with a step of 0.02.

Table 2
Accuracy scores on the test set using the soft-masking scheme.

Model      CIFAR10   STL10    Fashion-MNIST
baseline   85.7 %    66.3 %   87.3 %
proposed   87.6 %    68.6 %   99.9 %
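As a hypothetical sketch only (the loss function, the loader format, and the function name are our assumptions), the training protocol above (Adam updates restricted to the Mixer, E_x, and E_A, with C frozen) could be set up as follows:

# Hypothetical training loop for the soft-masking model defined above; only the
# parameters left trainable (Mixer, E_x, E_A) are optimized, C stays frozen.
import torch
import torch.nn.functional as F

def fit(model, train_loader, epochs=10, lr=1e-3):
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(trainable, lr=lr)
    for _ in range(epochs):
        model.train()
        model.C.eval()                       # keep the frozen classifier in eval mode
        for x, A, y in train_loader:         # input, its IG attribution, label
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x, A), y)  # cross-entropy is our assumption
            loss.backward()
            optimizer.step()
    return model

The grid search over batch sizes and learning rates reported above would then simply wrap calls to fit, with early stopping driven by the 30% validation split.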
5. Results & conclusions

The results of the proposed schema are reported in Tab. 2. The proposed strategy leads to an improvement in accuracy on all the investigated datasets. The approach offers a way to effectively integrate explanations with input data, leading to enhanced classification performance. This is achieved by allowing the model to autonomously determine the optimal mixing strategy through a learning process. The results are promising in the experimental scenario for all the investigated datasets. It is important to note, however, that all results are derived under the assumption that accurate explanations for the correct classes are available for the test data. This assumption, while useful for this study, is unrealistic in practice, since the true class of test data is typically unknown. Therefore, the findings of this research can pave the way for the development of a system that can provide reliable approximations of explanations even in the testing phase. We intend to further explore and expand upon this avenue in our future research endeavors.

Acknowledgments

This work is supported by the European Union - FSE-REACT-EU, PON Research and Innovation 2014-2020, DM1062/2021, contract number 18-I-15350-2, and was partially supported by the Ministry of University and Research, PRIN research project "BRIO – BIAS, RISK, OPACITY in AI: design, verification and development of Trustworthy AI", Project no. 2020SSKZ7R, by the Ministry of Economic Development, "INtegrated Technologies and ENhanced SEnsing for cognition and rehabilitation" (INTENSE) project, and by Centro Nazionale HPC, Big Data e Quantum Computing (PNRR CN1, spoke 9 Digital Society & Smart Cities, CUP: E63C22000980007). Furthermore, we acknowledge financial support from the PNRR MUR project PE0000013-FAIR (CUP: E63C22002150007).

References

[1] M. Sundararajan, A. Taly, Q. Yan, Axiomatic attribution for deep networks, in: International Conference on Machine Learning, PMLR, 2017, pp. 3319–3328.
[2] A. Apicella, L. Di Lorenzo, F. Isgrò, A. Pollastro, R. Prevete, Strategies to exploit XAI to improve classification systems, in: Explainable Artificial Intelligence (xAI 2023), Springer Nature Switzerland, 2023, pp. 147–159.
[3] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
[4] A. Apicella, F. Isgrò, R. Prevete, A. Sorrentino, G. Tamburrini, Explaining classification systems using sparse dictionaries, in: ESANN 2019 - Proceedings, 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2019.
[5] G. Montavon, A. Binder, S. Lapuschkin, W. Samek, K.-R. Müller, Layer-wise relevance propagation: An overview, in: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, 2019, pp. 193–209.
[6] A. Apicella, S. Giugliano, F. Isgrò, R. Prevete, A general approach to compute the relevance of middle-level input features, in: Pattern Recognition. ICPR International Workshops and Challenges: Virtual Event, January 10–15, 2021, Proceedings, Springer, 2021, pp. 189–203.
[7] A. Apicella, S. Giugliano, F. Isgrò, R. Prevete, et al., Explanations in terms of hierarchically organised middle level features, in: CEUR Workshop Proceedings, volume 3014, CEUR-WS, 2021, pp. 44–57.
[8] K. Qian, M. Danilevsky, Y. Katsis, B. Kawas, E. Oduor, L. Popa, Y. Li, XNLP: A living survey for XAI research in natural language processing, in: 26th International Conference on Intelligent User Interfaces - Companion, 2021, pp. 78–80.
[9] T. Lei, R. Barzilay, T. Jaakkola, Rationalizing neural predictions, arXiv preprint arXiv:1606.04155 (2016).
[10] T. A. Schoonderwoerd, W. Jorritsma, M. A. Neerincx, K. Van Den Bosch, Human-centered XAI: Developing design patterns for explanations of clinical decision support systems, International Journal of Human-Computer Studies 154 (2021) 102684.
[11] L. Weber, S. Lapuschkin, A. Binder, W. Samek, Beyond explaining: Opportunities and challenges of XAI-based model improvement, Information Fusion (2022).
[12] A. Apicella, F. Isgrò, A. Pollastro, R. Prevete, Toward the application of XAI methods in EEG-based systems, in: Proceedings of the 3rd Italian Workshop on Explainable Artificial Intelligence, co-located with the 21st International Conference of the Italian Association for Artificial Intelligence (AIxIA 2022), Udine, Italy, November 28 - December 3, 2022, volume 3277 of CEUR Workshop Proceedings, CEUR-WS.org, 2022, pp. 1–15.
[13] A. Apicella, F. Isgrò, R. Prevete, XAI approach for addressing the dataset shift problem: BCI as a case study (short paper), in: Proceedings of the 1st Workshop on Bias, Ethical AI, Explainability and the Role of Logic and Logic Programming (BEWARE 2022), co-located with the 21st International Conference of the Italian Association for Artificial Intelligence (AI*IA 2022), Udine, Italy, December 2, 2022, volume 3319 of CEUR Workshop Proceedings, 2022, pp. 83–88.
[14] H. Xiao, K. Rasul, R. Vollgraf, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017).
[15] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.