<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Advanced Fuzzy Relational Neural Network</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>E. Di Nardo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Ciaramella</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Milan</institution>
          ,
          <addr-line>Milan 20122</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Science and Technology, University of Naples Parthenope, Centro Direzionale Isola C4</institution>
          ,
          <addr-line>I-80143, Napoli</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Nowadays, most research is aimed at studying artificial neural networks, and in particular convolutional neural networks, because of their impressive results in several scientific fields. However, these methodologies need post-hoc techniques to improve their interpretability and explainability. In recent years, fuzzy systems have been raising great interest for the simplicity with which trustworthy and explainable systems can be developed. This work introduces a fuzzy relational neural network based model for extracting relevant information from image data, permitting a clearer indication of the classification processes. Encouraging results are obtained on benchmark data sets.</p>
      </abstract>
      <kwd-group>
        <kwd>Deep Learning</kwd>
        <kwd>Fuzzy Logic</kwd>
        <kwd>Fuzzy Relational Neural Network</kwd>
        <kwd>Computational Intelligence</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Fuzzy Relational Neural Network model</title>
      <p>
        Fuzzy Rule-based Systems (FRSs) have been raising great interest in XAI in recent years as ante-hoc
methodologies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The main components of any FRS are the knowledge base (KB) and the
inference engine module. The KB comprises all the fuzzy rules within a rule base (RB) and
the definitions of the fuzzy sets in the database. The inference engine includes a fuzzification
interface, an inference system, and a defuzzification interface [
        <xref ref-type="bibr" rid="ref4 ref5">5, 4</xref>
        ]. Fuzzy Relational Neural
Network (FRNN) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is an adaptive model based on an FRS. The FRNN can be developed with
different norms, and a backpropagation algorithm is used for learning. In this work we model
local t-norms by modifying the inner operation of convolution, replacing the linear combination
provided by matrix multiplication with fuzzy operators. We define a receptive field that applies
a triangular operation on a restricted area. As in convolution, we have a kernel of size
k_h × k_w × c_in × c_out, where k_h and k_w are the spatial dimensions, c_in is the number of input
channels and c_out the number of output feature maps. The kernel slides over the image with a
parametric step. Weights are initialized in the range [0, 1] and constrained to remain in the same interval
after the optimization step, using a scaling operation based on the minimum and maximum values:

w' = (w − min(w)) / (max(w) − min(w))   (1)
      </p>
      <p>
        The network structure is composed of an input layer and a fuzzification layer, where the
membership function is simply a scaling of the pixel values into the range [0, 1]. We compare the results
by using one or two hidden layers. Next, there is a defuzzification operation composed
of a fully connected layer, as in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and an output layer where a categorical cross-entropy loss is
used for classification. The architectures have been tested with and without a threshold activation
function, a modification of leaky ReLU with a minimum bound greater than 0. The networks are compared
with equivalent CNN architectures.
      </p>
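      <p>
        As a minimal sketch of the receptive-field operation described above (the function name and this
plain-numpy formulation are our own illustration, not the authors' code), the max-min composition
that replaces the convolutional multiply-accumulate can be written as:

```python
import numpy as np

def maxmin_layer(x, w, stride=1):
    """Fuzzy relational 'convolution' sketch: within each receptive field the
    product is replaced by the t-norm min and the sum by the s-norm max.

    x: input of shape (H, W, c_in), values assumed in [0, 1]
    w: kernel of shape (kh, kw, c_in, c_out), values assumed in [0, 1]
    """
    kh, kw, c_in, c_out = w.shape
    H, W, _ = x.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    y = np.zeros((out_h, out_w, c_out))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + kh,
                      j * stride:j * stride + kw, :]
            # t-norm (min) between the patch and the kernel weights ...
            t = np.minimum(patch[..., None], w)
            # ... then s-norm (max) aggregation over the receptive field
            y[i, j] = t.max(axis=(0, 1, 2))
    return y
```

Any other local t-norm (e.g. the product) can be substituted for np.minimum without changing the
sliding-window structure.
      </p>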
    </sec>
    <sec id="sec-3">
      <title>3. Experimental results</title>
      <p>
        The FRNN has been applied to image classification, and the MNIST [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and CIFAR10 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] datasets are
considered. Input images are scaled into the range [0, 1] as the fuzzification step. The single-hidden-layer
architecture uses 8 feature maps; the two-hidden-layer setup uses 8 and 16
feature maps, respectively. Weights are randomly initialized using a uniform distribution
in the range [0, 1] in order to define random initial membership degrees. Furthermore, the weights are constrained
to remain in the same range after the backpropagation phase, because at any moment they have to
define a data membership degree for all channels. A soft constraint re-scales the
weights into the correct range without hard clipping of the weights at the boundaries, as in the
gradient clipping case. All layers have a kernel size of 3 on the spatial dimensions. Table 1
compares the performance of the CNN with that of the fuzzy architectures. It can be
observed that the performances on MNIST are comparable, but on CIFAR10 the CNN outperforms the FRNN.
However, observing the activations and heatmaps of the models (Fig. 1) and other
visualizations based on Grad-CAM [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Gradients*Inputs [10] and Integrated Gradients [11]
(Figs. 2, 3), it can be noted that the FRNN explains more accurately the information used in
classification. Grad-CAM shows that the conv2d and ReLU layers cut out important features of
the object while retaining irrelevant ones; the FRNN, instead, even if irrelevant areas are present,
is able to preserve the shape of the ships. This is even clearer when observing the gradients. Both
techniques show some noise, but the fuzzy module is able to focus more on the image subject.
The same analysis holds for MNIST (Fig. 3), where the gradients, i.e. the attention of the
network, follow the object shape. This does not hold for the convolutional
layer, which shows a lot of noise and is unable to focus on the main subject.
      </p>
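      <p>
        The soft re-scaling constraint mentioned above can be sketched as follows (a hedged
illustration in plain numpy; the helper name and the epsilon guard are our own additions):

```python
import numpy as np

def rescale_weights(w, eps=1e-8):
    """Soft constraint of Eq. (1): after each optimization step the weights
    are linearly re-scaled into [0, 1] using their minimum and maximum,
    instead of being hard-clipped at the boundaries as in gradient clipping."""
    w_min, w_max = w.min(), w.max()
    return (w - w_min) / (w_max - w_min + eps)
```

Unlike hard clipping, this mapping preserves the relative ordering and spacing of all weights
while guaranteeing that they remain valid membership degrees.
      </p>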
      <p>Table 1 compares the Conv2D+ReLU, MaxMin, and MaxMin (2L) models.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>In this work, a fuzzy relational neural network based model for extracting relevant information
from image data has been introduced. From the preliminary results, we observed that the
model provides a clearer indication of the classification processes.</p>
      <p>In the near future, the authors will focus on further validations of the model from both
theoretical and practical points of view.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Knapič</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Malhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Saluja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Främling</surname>
          </string-name>
          ,
          <article-title>Explainable artificial intelligence for human decision-support system in medical domain</article-title>
          ,
          <source>arXiv</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Mendel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Bonissone</surname>
          </string-name>
          ,
          <article-title>Critical thinking about explainable ai (xai) for rule-based fuzzy systems</article-title>
          ,
          <source>IEEE Transactions on Fuzzy Systems</source>
          <volume>14</volume>
          (
          <year>2019</year>
          )
          <fpage>69</fpage>
          -
          <lpage>81</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mencar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Alonso</surname>
          </string-name>
          ,
          <article-title>Paving the way to explainable artificial intelligence with fuzzy modeling</article-title>
          , volume
          <volume>24</volume>
          ,
          <year>2018</year>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Camastra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ciaramella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Giovannelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lener</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Rastelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Staiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Staiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Starace</surname>
          </string-name>
          ,
          <article-title>A fuzzy decision system for genetically modified plant environmental risk assessment using mamdani inference</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>42</volume>
          (
          <year>2015</year>
          )
          <fpage>1710</fpage>
          -
          <lpage>1716</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ciaramella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tagliaferri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pedrycz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Di</surname>
          </string-name>
          <string-name>
            <surname>Nola</surname>
          </string-name>
          ,
          <article-title>Fuzzy relational neural network</article-title>
          ,
          <source>International Journal of Approximate Reasoning</source>
          <volume>41</volume>
          (
          <year>2006</year>
          )
          <fpage>146</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ciaramella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tagliaferri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pedrycz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Di</surname>
          </string-name>
          <string-name>
            <surname>Nola</surname>
          </string-name>
          ,
          <article-title>Fuzzy relational neural network</article-title>
          ,
          <source>International Journal of Approximate Reasoning</source>
          <volume>41</volume>
          (
          <year>2006</year>
          )
          <fpage>146</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cortes</surname>
          </string-name>
          ,
          <article-title>MNIST handwritten digit database</article-title>
          (
          <year>2010</year>
          ). URL: http://yann.lecun.com/exdb/mnist/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hinton</surname>
          </string-name>
          , et al.,
          <article-title>Learning multiple layers of features from tiny images</article-title>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Selvaraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cogswell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vedantam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <article-title>Grad-CAM: Visual explanations from deep networks via gradient-based localization</article-title>
          , in: Proceedings of the IEEE International Conference on Computer Vision,
          <year>2017</year>
          , pp.
          <fpage>618</fpage>
          -
          <lpage>626</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Baehrens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schroeter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Harmeling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kawanabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <article-title>How to explain individual classification decisions</article-title>
          ,
          <source>The Journal of Machine Learning Research</source>
          <volume>11</volume>
          (
          <year>2010</year>
          )
          <fpage>1803</fpage>
          -
          <lpage>1831</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khorram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Visualizing deep networks by optimizing with integrated gradients</article-title>
          , in:
          <source>CVPR Workshops</source>
          , volume
          <volume>2</volume>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>