Neo-Fuzzy System with Special Type Membership Functions Adaptation and Fast Tuning of Synaptic Weights in Emotion Recognition Task

Yevgeniy Bodyanskiy, Nonna Kulishova and Olha Chala
Kharkiv National University of Radio Electronics, Nauky Ave. 14, Kharkiv, 61166, Ukraine

Abstract
A neo-fuzzy system for image recognition (using emotion recognition as an example) is proposed. It is designed to solve the task under consideration under the conditions of a short dataset and overlapping classes. The distinctive feature of the system is its hybrid learning, which combines supervised learning with a teacher, lazy learning based on the "neurons at data points" principle, and self-learning according to T. Kohonen, as well as the ability to tune both the synaptic weights and the membership functions of a special form, which provides improved approximation abilities. The system has a high learning speed and provides good recognition quality, as confirmed by the results of the computational experiment.

Keywords
Neo-fuzzy neuron, nonlinear synapse, supervised learning, lazy learning, self-learning, adaptive membership function of Epanechnikov kernel type

1. Introduction

Emotions are a powerful tool of interaction between people and are now increasingly used in various fields of human-computer interaction. The most promising areas for automatic recognition of human facial expressions include, for example, education, digital marketing, and automated ranking and recommendation systems [1–6]. However, such applications place special requirements on recognition approaches: high performance and high accuracy under significant changes in posture, lighting, and shooting angle. Systems for automatic recognition of a user's emotional status, as a rule, have a similar architecture, which includes subsystems for pre-processing, feature extraction and facial expression classification.
Each of these subsystems can use different approaches to initial data acquisition, machine learning and computational intelligence methods. A wide spectrum of research in this direction has been considered in several reviews [7–9]. To solve the classification problem in emotion detection systems, various machine learning methods have been used [8,10,11]. Since the problem is data-driven, classification accuracy depends on the quality of the solutions at the previous stages: pre-processing and feature extraction. Deep neural networks permit a significant improvement in recognition accuracy [9]. Among such architectures, convolutional neural networks (CNN), attentional CNN [12,13], graph CNN [14–16], component-wise LSTM (cLSTM) [17] and many others have been proposed. Despite the variety of deep networks applied to facial expression recognition, they all share serious disadvantages. First, the recognition accuracy of a trained network strongly depends on how diverse and large the dataset used for network learning was, and whether it contained information about representatives of different races, ages, and cultures. When preparing such datasets, specialists shoot video or photographs in studio conditions, where a person's posture, movements and lighting hardly change, and emotions are expressed as strongly as possible for an unambiguous interpretation. These factors reduce the adaptive recognition capabilities of the trained deep network.

II International Scientific Symposium «Intelligent Solutions» IntSol-2021, September 28–30, 2021, Kyiv-Uzhhorod, Ukraine
EMAIL: yevgeniy.bodyanskiy@nure.ua (A. 1); nonna.kulishova@nure.ua (A. 2); olha.chala@nure.ua (A. 3)
ORCID: 0000-0001-5418-2143 (A. 1); 0000-0001-7921-3110 (A. 2); 0000-0002-7603-1247 (A. 3)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
In addition, the dataset formation and labeling required for deep network training are time- and labor-consuming, and dataset size can reach millions of samples. These aspects significantly limit the applicability of deep networks to real-time recognition, when emotions are manifested only slightly or are mixed, while training data volumes are small and samples may be unlabelled. Therefore, the problem of developing a system for automatic real-time emotion recognition remains relevant. One helpful approach is the use of systems based on neo-fuzzy neurons [18], in which the fuzzy classification problem [19,20] is solved. In this work, it is also proposed to treat the recognition of a person's facial expression as a fuzzy classification problem, and to solve it with a modification of the neo-fuzzy neuron having Epanechnikov-kernel membership functions with tunable centers.

2. Architecture of neo-fuzzy system for emotion recognition

Among many approaches, deep neural networks are the best suited to solving pattern recognition tasks. Such systems have proved their effectiveness in many problems related to the processing of large amounts of information. At the same time, these neural networks are quite slow and contain a huge number of tunable synaptic weights (sometimes billions, in some cases more than a trillion). Hence, they require a huge amount of training data for their learning. When the size of the training dataset is limited, which often happens in real tasks, deep neural networks become ineffective, and the use of transfer learning does not always solve the arising problems. The neo-fuzzy neuron [21–25] is the simplest system that can restore separating hypersurfaces, and it has proved effective in numerous real tasks. In the general case, this construction is a zeroth-order Takagi-Sugeno-Kang system, i.e., a universal approximator. The architecture of the standard neo-fuzzy neuron is shown in Fig. 1.
The input vector-image $x(k) = (x_1(k), \ldots, x_i(k), \ldots, x_n(k))^T \in R^n$ (here $k = 1, 2, \ldots, N$ is either the number of an observation in the training dataset or the current discrete time) is fed to the inputs of the nonlinear synapses $NS_i$, each of which contains $h_i$ membership functions $\mu_{li}(x_i)$ and the same number of tunable synaptic weights $w_{li}$, which are adjusted either in batch mode or in online mode through optimisation of the adopted learning criterion (goal function). In the general case, the neo-fuzzy neuron implements the nonlinear transformation

$$\hat{y}(k) = \sum_{i=1}^{n} f_i(x_i(k)) = \sum_{i=1}^{n} \sum_{l=1}^{h_i} \mu_{li}(x_i(k))\, w_{li}(k-1), \qquad (1)$$

and, due to the linear dependence between the output signal $\hat{y}(k)$ and the synaptic weights, algorithms of linear adaptive identification, including speed-optimal, robust, and smoothing ones [25–27], can be used for the neo-fuzzy neuron's learning. Usually, B-splines are used as activation functions in neo-fuzzy neurons, because they satisfy the Ruspini unity partition conditions. Thereby the neo-fuzzy neuron does not need a defuzzification layer, which considerably simplifies its implementation. Typically, first-order B-splines are used, in other words, traditional triangular membership functions. Their main advantage is that only two neighboring functions fire at each discrete moment $k$, so in each nonlinear synapse only two neighboring synaptic weights are tuned ($2n$ in total in the neo-fuzzy neuron). This essentially improves learning speed, especially when data are processed sequentially in online mode. Meanwhile, such a neo-fuzzy neuron can implement only piecewise approximation of separating hypersurfaces, which cannot always provide the necessary classification quality. In this context, the extended neo-fuzzy neuron was introduced in [28], where each nonlinear synapse is a Takagi-Sugeno-Kang fuzzy system of an arbitrary order.
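The transformation (1) with triangular (first-order B-spline) membership functions can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation; the function names and the evenly-spaced-center layout are assumptions.

```python
def triangular_memberships(x, centers):
    """Membership degrees of x in triangular functions with the given
    ordered centers; at most two neighbouring functions fire for any x,
    and the degrees sum to 1 inside [centers[0], centers[-1]] (Ruspini partition)."""
    mu = [0.0] * len(centers)
    for l in range(len(centers)):
        left = centers[l - 1] if l > 0 else centers[0]
        right = centers[l + 1] if l < len(centers) - 1 else centers[-1]
        if left <= x <= centers[l] and centers[l] > left:
            mu[l] = (x - left) / (centers[l] - left)      # rising slope
        elif centers[l] <= x <= right and right > centers[l]:
            mu[l] = (right - x) / (right - centers[l])    # falling slope
    return mu

def neo_fuzzy_output(x_vec, centers_per_input, weights_per_input):
    """Eq. (1): y = sum_i sum_l mu_li(x_i) * w_li."""
    y = 0.0
    for x_i, centers, weights in zip(x_vec, centers_per_input, weights_per_input):
        mu = triangular_memberships(x_i, centers)
        y += sum(m * w for m, w in zip(mu, weights))
    return y
```

For example, with one input, centers [0, 0.5, 1] and weights [0, 1, 0], the input x = 0.25 fires only the two neighbouring functions (degrees 0.5 and 0.5), so only those two weights would be tuned online.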
The image recognition system based on extended neo-fuzzy neurons provides a high-quality solution of the classification task; however, it still requires larger training datasets, because it has a considerably larger number of tunable weights. Hence it is reasonable to use kernel functions other than B-splines in the standard neo-fuzzy neuron, and the simplest example is the Epanechnikov kernel [29], shown in Fig. 2.

Figure 1: Neo-fuzzy neuron

Figure 2: Membership functions of Epanechnikov kernel type

Here $[x_{i\min}, x_{i\max}]$ is the adjustment interval of the input signal on the $i$-th input. If $h_i$ such functions are evenly allocated on this input, then the interval between neighbouring centers is set by the formula

$$r_i = \frac{x_{i\max} - x_{i\min}}{h_i - 1}. \qquad (2)$$

These functions can be written in analytical form:

$$\mu_{li}(x_i) = \left(1 - (x_i - c_{li})^2 r_i^{-2}\right)\theta_{li}, \qquad (3)$$

where $c_{li}$ are the centers of the corresponding functions and

$$\theta_{li} = \begin{cases} 1 & \text{if } |x_i - c_{li}| \le r_i, \\ 0 & \text{otherwise.} \end{cases} \qquad (4)$$

In the more general case, the membership functions can be distributed nonuniformly, as shown in Fig. 3.

Figure 3: Unsymmetrical membership functions of Epanechnikov kernel type

It is readily seen that in this situation the membership functions are unsymmetrical and can be described by the following equations:

$$\mu_{li}^{L}(x_i) = \left[1 - \left(\frac{x_i - c_{li}}{c_{l-1,i} - c_{li}}\right)^2\right]_+, \qquad \mu_{li}^{R}(x_i) = \left[1 - \left(\frac{x_i - c_{li}}{c_{l+1,i} - c_{li}}\right)^2\right]_+, \qquad (5)$$

where $[\cdot]_+$ denotes the projection onto the positive orthant. It is easy to see that at each moment $k$ only two neighbouring functions can fire as well, and their derivatives are equal to zero at the centers. However, these functions do not satisfy the Ruspini unity partition conditions; in other words, a system built on the basis of such neo-fuzzy neurons requires an additional output defuzzification layer.
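The symmetric case (2)-(4) can be sketched as follows; the function name and signature are illustrative assumptions. Note that, unlike the triangular case, the degrees need not sum to one, which is exactly why the system needs the defuzzifying output layer introduced below.

```python
def epanechnikov_memberships(x, x_min, x_max, h):
    """Epanechnikov-kernel memberships (3)-(4) with h centers evenly
    spaced on [x_min, x_max]; the spacing r follows eq. (2)."""
    r = (x_max - x_min) / (h - 1)                    # eq. (2)
    centers = [x_min + l * r for l in range(h)]
    mu = []
    for c in centers:
        if abs(x - c) <= r:                          # indicator theta_li, eq. (4)
            mu.append(1.0 - (x - c) ** 2 / r ** 2)   # eq. (3)
        else:
            mu.append(0.0)
    return mu
```

For h = 3 on [0, 1] and x = 0.25, the two neighbouring kernels fire with degree 0.75 each: the sum is 1.5, not 1, so the unity partition indeed fails.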
Using a single neo-fuzzy neuron, the binary classification task can be solved; in our case, however, the initial dataset has to be split into $m$ possibly overlapping classes. Thus it is reasonable to introduce a neo-fuzzy system designed to solve the pattern recognition task, whose architecture is shown in Fig. 4. The system contains $m$ neo-fuzzy neurons connected in parallel, whose outputs $y_j(k)$, $j = 1, 2, \ldots, m$, are formed by softmax activation functions, which usually form the output signals of deep convolutional neural networks solving classification tasks. Therefore, on the outputs of the neo-fuzzy system the signals

$$y_j(k) = \mathrm{softmax}\,\hat{y}_j(k) = \frac{e^{\hat{y}_j(k)}}{\sum_{i=1}^{m} e^{\hat{y}_i(k)}} \qquad (6)$$

are formed, whereas $\sum_{j=1}^{m} y_j(k) = 1$. These signals set the membership level of the observation $x(k)$ in the $j$-th class.

Figure 4: Neo-fuzzy system for image recognition

It is interesting to notice that the output softmax layer plays the role of the defuzzification layer of neuro-fuzzy systems; that is to say, in this system the membership functions do not need to meet the unity partition requirement.

3. Learning of neo-fuzzy system

The learning criterion of the proposed neo-fuzzy system is based on the cross-entropy, which is usually used for tuning deep convolutional neural networks:

$$E(k) = \sum_{j=1}^{m} E_j(k) = -\sum_{j=1}^{m} y_j^*(k) \ln y_j(k), \qquad (7)$$

where $y_j^*(k)$ is the external reference signal, which takes only two values: 1 if the vector-image $x(k)$ belongs to the $j$-th class and 0 otherwise. Let us introduce the vectors of synaptic weights and membership functions of the $j$-th neo-fuzzy neuron, both of dimension $\left(\sum_{i=1}^{n} h_i \times 1\right)$:

$$\mu_j(x) = \left(\mu_{j11}(x_1), \ldots, \mu_{j h_1 1}(x_1), \mu_{j12}(x_2), \ldots, \mu_{j h_2 2}(x_2), \ldots, \mu_{jli}(x_i), \ldots, \mu_{j h_n n}(x_n)\right)^T,$$

$$w_j = \left(w_{j11}, \ldots, w_{j h_1 1}, w_{j12}, \ldots, w_{j h_2 2}, \ldots, w_{jli}, \ldots, w_{j h_n n}\right)^T,$$

so that its output signal can be written in the form

$$\hat{y}_j(k) = w_j^T(k-1)\,\mu_j(x(k)). \qquad (8)$$

Introducing the vector reference signal $y^*(k) = (y_1^*(k), \ldots, y_j^*(k), \ldots, y_m^*(k))^T$ formed of zeroes and ones (so-called "one-hot coding"), the vector output signals of the system $y(k) = (y_1(k), \ldots, y_j(k), \ldots, y_m(k))^T$ and $\hat{y}(k) = (\hat{y}_1(k), \ldots, \hat{y}_j(k), \ldots, \hat{y}_m(k))^T$, the $\left(\sum_{i=1}^{n} h_i \times 1\right)$ vector of membership functions $\mu(x(k))$ (common to all $m$ neurons, since they share the same inputs), and the $\left(m \times \sum_{i=1}^{n} h_i\right)$ matrix of synaptic weights

$$w(k-1) = \begin{pmatrix} w_1^T(k-1) \\ \vdots \\ w_j^T(k-1) \\ \vdots \\ w_m^T(k-1) \end{pmatrix}, \qquad (9)$$

the output signals of the system can be written in vector-matrix form:

$$\hat{y}(k) = w(k-1)\,\mu(x(k)), \qquad (10)$$

$$y(k) = \frac{e^{\hat{y}(k)}}{I^T e^{\hat{y}(k)}} = \frac{e^{w(k-1)\mu(x(k))}}{I^T e^{w(k-1)\mu(x(k))}}, \qquad (11)$$

where $I$ is the $(m \times 1)$ vector formed of unities and the exponent is taken elementwise. The matrix version of the speed-optimal Kaczmarz-Widrow-Hoff learning algorithm can be used for tuning the matrix of synaptic weights $w(k)$ and written in the form

$$w(k) = w(k-1) + \frac{\left(y^*(k) - w(k-1)\mu(x(k))\right)\mu^T(x(k))}{\left\|\mu(x(k))\right\|^2}, \qquad (12)$$

or its adaptive regularized modification

$$w(k) = w(k-1) + \frac{\left(y^*(k) - w(k-1)\mu(x(k))\right)\mu^T(x(k))}{\beta + \left\|\mu(x(k))\right\|^2} \qquad (13)$$

(here $\beta \ge 0$ is the regularizing momentum term), which protects the algorithm from the "exploding gradient" effect.

The quality of learning can be improved by tuning not only the synaptic weights but also the centers of the membership functions of the nonlinear synapses. To avoid the need for a larger training dataset, it is better to use the ideas of self-learning developed by T. Kohonen [30], as well as lazy learning [31]. Let us introduce a certain threshold of indecomposability $r_i^{\min}$ that sets the minimal possible distance between neighbouring centers, $\left|c_{li} - c_{l+1,i}\right| \ge r_i^{\min}$.
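The forward pass (10)-(11) and one step of the Kaczmarz-Widrow-Hoff update (12)-(13) can be sketched in plain Python. This is a minimal sketch under stated assumptions: the function names are illustrative, weights are kept as a list of rows, and `beta = 0` recovers the unregularized rule (12).

```python
import math

def softmax(v):
    """Eq. (6)/(11): normalised exponentials over the m class outputs."""
    e = [math.exp(a) for a in v]
    s = sum(e)
    return [a / s for a in e]

def kwh_update(W, mu, y_star, beta=0.0):
    """Regularised Kaczmarz-Widrow-Hoff step, eq. (13); beta = 0 gives eq. (12).
    W: m x d weight matrix (list of rows), mu: d-vector of membership degrees,
    y_star: one-hot m-vector of reference signals."""
    y_hat = [sum(w * m for w, m in zip(row, mu)) for row in W]   # eq. (10)
    denom = beta + sum(m * m for m in mu)                         # beta + ||mu||^2
    return [[w + (ys - yh) * m / denom for w, m in zip(row, mu)]
            for row, ys, yh in zip(W, y_star, y_hat)]
```

A property worth noting: with `beta = 0`, a single step makes the new output $\hat{y}(k)$ coincide exactly with $y^*(k)$ for the presented pattern (the Kaczmarz projection property), which is what makes the rule speed-optimal.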
Then the initial stages of this process take place in the following way:
- when the signal $x_i(1)$ is fed to the input of the nonlinear synapse $NS_i$, the first center $c_{1i} = x_i(1)$ is formed;
- when the signal $x_i(2)$ is fed to the input, the condition

$$|x_i(2) - c_{1i}| \le r_i^{\min} \qquad (14)$$

is checked; if this condition is met, nothing happens and no new center is formed;
- if the condition

$$r_i^{\min} < |x_i(2) - c_{1i}| \le 2 r_i^{\min} \qquad (15)$$

is satisfied, the center is corrected according to Kohonen's "winner takes all" rule [30]:

$$c_{1i}(2) = c_{1i}(1) + \eta(2)\left(x_i(2) - c_{1i}(1)\right) \qquad (16)$$

(here $\eta(2)$ is the self-learning rate parameter);
- if the condition

$$|x_i(2) - c_{1i}| > 2 r_i^{\min} \qquad (17)$$

is satisfied, then, according to the lazy learning rule "neurons at data points", the second center $c_{2i} = x_i(2)$ is formed, and the earlier formed center $c_{1i}$ stays where it is.

The process of center formation continues until $h_i$ centers have been formed, where this value is defined by

$$h_i = \frac{x_{i\max} - x_{i\min}}{r_i^{\min}} + 1. \qquad (18)$$

Further, only their coordinates are corrected according to the self-learning algorithm. Such tuning of the kernel membership functions' locations improves the approximation abilities of the system.

4. Results

The accuracy and speed of the proposed system were investigated on a dataset formed from the well-known Psychological Image Collection at Stirling (PICS) [32] and Extended Cohn-Kanade (CK+) [33] databases. The set contains 821 images that convey the development of emotions in dynamics and also contain facial micro-expressions (Fig. 5). An array of 35 feature points was selected as the face model (Fig. 6).

Figure 5: Photos from the dataset showing the development of emotions over time, including micro-expressions

Figure 6: Location of the 35 feature points

All images represent the basic emotions: surprise, joy, disgust, grief, anger, fear, and a neutral expression. Thus, there are 7 classes in the classification problem.
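To summarize the hybrid learning procedure of Section 3 before the conclusions: one step of the center-placement rules (14)-(17) can be sketched as follows. This is an illustrative sketch; the function name, the `rate` parameter, and the fallback of correcting the winner once `h_max` centers (eq. (18)) already exist are assumptions, since the paper only states that after $h_i$ centers are formed, only their coordinates are corrected.

```python
def place_center(centers, x, r_min, h_max, rate=0.1):
    """One step of hybrid center placement per rules (14)-(17): the nearest
    existing center 'wins' and is left alone, shifted by Kohonen's WTA rule
    (16), or a new center is created at the data point (lazy learning)."""
    if not centers:
        return centers + [x]                         # first observation -> first center
    j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
    d = abs(x - centers[j])
    if d <= r_min:                                   # (14): close enough, do nothing
        return centers
    if d <= 2.0 * r_min or len(centers) >= h_max:    # (15)-(16): correct the winner
        centers = centers.copy()
        centers[j] += rate * (x - centers[j])        # Kohonen "winner takes all"
        return centers
    return centers + [x]                             # (17): "neurons at data points"
```

Feeding the stream of values $x_i(k)$ for one input through this rule reproduces the staged behaviour described above: centers appear only where data actually fall, then drift toward the local data density.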
5. Conclusion

The neo-fuzzy system and its combined learning (supervised learning, self-learning, lazy learning) were proposed and designed to solve the image-based emotion recognition task under the conditions of a dataset of limited volume. The main characteristic of the proposed system is the use of special kernel constructions as activation functions, which improves the approximation properties of the system. For supervised learning, the speed-optimal algorithm adjusted for the conditions of a short dataset is used. Additionally, lazy learning and self-learning allow placing the membership functions in the nonlinear synapses in the optimal way. The proposed system is quite simple in computational implementation and provides high recognition quality, as confirmed by the computational experiment.

6. References

[1] F. Alqahtani, N. Ramzan, Comparison and Efficacy of Synergistic Intelligent Tutoring Systems with Human Physiological Response, Sensors, 2019. https://doi.org/10.3390/s19030460.
[2] Z. Hussain, M. Zhang, X. Zhang, K. Ye, C. Thomas, Z. Agha, N. Ong, A. Kovashka, Automatic Understanding of Image and Video Advertisements, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Honolulu, HI, 2017, pp. 1100–1110. https://doi.org/10.1109/CVPR.2017.123.
[3] J.J. Sun, T. Liu, G. Prasad, GLA in MediaEval 2018 Emotional Impact of Movies Task, CoRR abs/1911.12361 (2019). http://arxiv.org/abs/1911.12361.
[4] W. Hua, F. Dai, L. Huang, J. Xiong, G. Gui, HERO: Human Emotions Recognition for Realizing Intelligent Internet of Things, IEEE Access, 2019, pp. 24321–24332. https://doi.org/10.1109/ACCESS.2019.2900231.
[5] Z. Wei, J. Zhang, Z. Lin, J.-Y. Lee, N. Balasubramanian, M. Hoai, D. Samaras, Learning Visual Emotion Representations From Web Data, in: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 13103–13112. https://doi.org/10.1109/CVPR42600.2020.01312.
[6] M.S. Hossain, G.
Muhammad, An Emotion Recognition System for Mobile Applications, IEEE Access, 2017, pp. 2281–2287. https://doi.org/10.1109/ACCESS.2017.2672829.
[7] C.A. Corneanu, M.O. Simon, J.F. Cohn, S.E. Guerrero, Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications, IEEE Trans. Pattern Anal. Mach. Intell. 38 (2016) 1548–1568. https://doi.org/10.1109/TPAMI.2016.2515606.
[8] B. Martinez, M.F. Valstar, B. Jiang, M. Pantic, Automatic Analysis of Facial Actions: A Survey, IEEE Trans. Affective Comput. 10 (2019) 325–347. https://doi.org/10.1109/TAFFC.2017.2731763.
[9] S. Li, W. Deng, Deep Facial Expression Recognition: A Survey, IEEE Trans. Affective Comput., 2020. https://doi.org/10.1109/TAFFC.2020.2981446.
[10] Z. Zhang, P. Luo, C.C. Loy, X. Tang, From Facial Expression Recognition to Interpersonal Relation Prediction, CoRR, 2016.
[11] A. Aslam, B. Hussian, Emotion recognition techniques with rule based and machine learning approaches, CoRR, 2021.
[12] K.-C. Liu, C.-C. Hsu, W.-Y. Wang, H.-H. Chiang, Real-Time Facial Expression Recognition Based on CNN, in: 2019 International Conference on System Science and Engineering (ICSSE), IEEE, Dong Hoi, Vietnam, 2019, pp. 120–123. https://doi.org/10.1109/ICSSE.2019.8823409.
[13] S. Minaee, A. Abdolrashidi, Deep-Emotion: Facial Expression Recognition Using Attentional Convolutional Network, CoRR, 2019.
[14] W.-S. Chien, H.-C. Yang, C.-C. Lee, Cross Corpus Physiological-based Emotion Recognition Using a Learnable Visual Semantic Graph Convolutional Network, in: Proceedings of the 28th ACM International Conference on Multimedia, ACM, Seattle WA USA, 2020, pp. 2999–3006. https://doi.org/10.1145/3394171.3413552.
[15] Y. Fan, J.C.K. Lam, V.O.K. Li, Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution, CoRR, 2020.
[16] L. Lo, H.-X. Xie, H.-H. Shuai, W.-H.
Cheng, MER-GCN: Micro Expression Recognition Based on Relation Modeling with Graph Convolutional Network, CoRR, 2020.
[17] T. Mittal, P. Mathur, A. Bera, D. Manocha, Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality, CoRR, 2021.
[18] Y. Bodyanskiy, N. Kulishova, O. Chala, The Extended Multidimensional Neo-Fuzzy System and Its Fast Learning in Pattern Recognition Tasks, Data 3 (2018). https://doi.org/10.3390/data3040063.
[19] Y. Bodyanskiy, O. Chala, I. Pliss, A. Deineko, Adaptive Probabilistic Neural Network With Fuzzy Inference And Its Online Learning, in: 2020 IEEE 15th International Conference on Computer Sciences and Information Technologies (CSIT), IEEE, Zbarazh, Ukraine, 2020, pp. 96–99. https://doi.org/10.1109/CSIT49958.2020.9322052.
[20] Y. Bodyanskiy, I. Pliss, O. Chala, A. Deineko, Evolving fuzzy-probabilistic neural network and its online learning, in: 10th International Conference on Advanced Computer Information Technologies, Deggendorf, Germany, 2020.
[21] E. Uchino, T. Yamakawa, Soft Computing Based Signal Prediction, Restoration, and Filtering, in: D. Ruan (Ed.), Intelligent Hybrid Systems, Springer US, Boston, MA, 1997, pp. 331–351. https://doi.org/10.1007/978-1-4615-6191-0_14.
[22] T. Miki, Analog Implementation of Neo-Fuzzy Neuron and Its On-board Learning, 1999.
[23] D. Zurita, M. Delgado, J.A. Carino, J.A. Ortega, G. Clerc, Industrial Time Series Modelling by Means of the Neo-Fuzzy Neuron, IEEE Access 4 (2016) 6151–6160. https://doi.org/10.1109/ACCESS.2016.2611649.
[24] T. Yamakawa, E. Uchino, J. Miki, H. Kusanagi, A neo-fuzzy neuron and its application to system identification and prediction of the system behavior, in: Proceedings of the 2nd International Conference on Fuzzy Logic & Neural Networks, Iizuka, Japan, 1992.
[25] Y. Bodyanskiy, I. Kokshenev, V.
Kolodyazhniy, An adaptive learning algorithm for a neo-fuzzy neuron, in: Proceedings of the 3rd Conference of the European Society for Fuzzy Logic and Technology, Zittau, Germany, 2003.
[26] Y. Bodyanskiy, S. Popov, M. Titov, Robust Learning Algorithm for Networks of Neuro-Fuzzy Units, in: T. Sobh (Ed.), Innovations and Advances in Computer Sciences and Engineering, Springer Netherlands, Dordrecht, 2010, pp. 343–346. https://doi.org/10.1007/978-90-481-3658-2_59.
[27] G.C. Goodwin, P.J. Ramadge, P.E. Caines, Discrete Time Stochastic Adaptive Control, SIAM J. Control Optim. 19 (1981) 829–853. https://doi.org/10.1137/0319052.
[28] Ye.V. Bodyanskiy, N.E. Kulishova, Extended neo-fuzzy neuron in the task of images filtering, Radio Electronics, Computer Science, Control, 2014. https://doi.org/10.15588/1607-3274-2014-1-16.
[29] V.A. Epanechnikov, Non-Parametric Estimation of a Multivariate Probability Density, Theory Probab. Appl. 14 (1969) 153–158. https://doi.org/10.1137/1114019.
[30] T. Kohonen, Self-Organizing Maps, Springer Berlin Heidelberg, Berlin, Heidelberg, 2001. https://doi.org/10.1007/978-3-642-56927-2.
[31] D.R. Zahirniak, R. Chapman, Rogers, Pattern recognition using radial basis function networks, in: Sixth Annual Aerospace Applications of AI Conf, Dayton, 1990, pp. 249–260.
[32] 2D face sets, (n.d.). http://pics.psych.stir.ac.uk/2D_face_sets.htm (accessed July 13, 2021).
[33] P. Lucey, J.F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, I. Matthews, The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression, in: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, 2010, pp. 94–101. https://doi.org/10.1109/CVPRW.2010.5543262.