Adaptive Fuzzy Clustering Approach Based on Evolutionary Cat Swarm Optimization

Alina Shafronenko1[0000-0002-8040-0279], Yevgeniy Bodyanskiy1[0000-0001-5418-2143]

1 Kharkiv National University of Radio Electronics, Nauky Ave., 14, Kharkiv, Ukraine
yevgeniy.bodyanskiy@nure.ua, alina.shafronenko@nure.ua

Abstract. Computational intelligence methods are widely used to solve many complex problems, including traditional Data Mining and such new directions as Dynamic Data Mining, Data Stream Mining, Big Data Mining, Web Mining, Text Mining, etc. One of the main areas of computational intelligence is evolutionary algorithms, which essentially represent mathematical models of the evolution of biological organisms. In this paper, adaptive methods of fuzzy clustering based on evolutionary cat swarm optimization are proposed. Using the proposed approach, the clustering task can be solved in online mode.

Keywords: fuzzy clustering, learning rule, online probabilistic fuzzy clustering, online possibilistic fuzzy clustering, online credibilistic fuzzy clustering, cat swarm.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

The task of classification in the self-learning mode (clustering) of multidimensional data is an important part of traditional intelligent data analysis such as Data Mining, Dynamic Data Mining, Data Stream Mining, Big Data Mining and Web Mining [1, 2]. One of the main areas of computational intelligence is the so-called evolutionary algorithms, which are mathematical models of the evolution of biological organisms. The problem of clustering vector observations arises in many data mining applications, and first of all in fuzzy clustering, where a processed vector observation can belong to more than one class with different levels of probability, possibility, credibility, etc. Kohonen self-organizing maps [3] and evolutionary algorithms are very effective here and can improve the quality of clustering when the data are processed sequentially in online mode.

The problem of fuzzy clustering of data arrays is considered under the conditions when the formed clusters arbitrarily overlap in the feature space. The source information for solving the problem is an array of multidimensional data formed by a set of vector observations X = {x(1), x(2), ..., x(k), ..., x(N)} ⊂ R^n, where k is in the general case the observation number in the initial array and x(k) = (x_1(k), ..., x_i(k), ..., x_n(k))^T. The result of clustering is the partition of this array into m overlapping classes with prototypes-centroids Cl_j ∈ R^n, j = 1, 2, ..., m, and the computation of membership levels 0 ≤ U_j(k) ≤ 1 of each vector observation x(k) to every cluster Cl_j.

2 Adaptive algorithm for probabilistic fuzzy clustering (APrFC)

The goal function of the popular probabilistic clustering has the form [4]

E(U_j(k), Cl_j) = \sum_{k=1}^{N} \sum_{j=1}^{m} U_j^{\beta}(k) D^2(x_k, Cl_j)    (1)

where \sum_{j=1}^{m} U_j(k) = 1, \quad 0 < \sum_{k=1}^{N} U_j(k) < N.
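Before deriving the solution, the following minimal sketch (assuming NumPy arrays and the Euclidean metric for D; the function name and its arguments are notational conventions of this example only) shows how the goal function (1) is evaluated for a data matrix, a set of prototypes and a membership matrix.

```python
import numpy as np

def probabilistic_objective(X, centroids, U, beta=2.0):
    """Value of the goal function (1): sum over k and j of U_j^beta(k) * D^2(x_k, Cl_j).

    X         -- (N, n) array of observations x(k)
    centroids -- (m, n) array of prototypes Cl_j
    U         -- (N, m) membership levels, rows sum to 1
    beta      -- fuzzifier, beta = 2 in the classical FCM case
    """
    # squared Euclidean distances D^2(x_k, Cl_j), shape (N, m)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((U ** beta * d2).sum())
```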
Solving this nonlinear programming problem, we obtain the probabilistic fuzzy clustering algorithm

U_j^{(\tau+1)}(k) = \frac{\left(D^2\left(x_k, Cl_j^{(\tau)}\right)\right)^{\frac{1}{1-\beta}}}{\sum_{l=1}^{m}\left(D^2\left(x_k, Cl_l^{(\tau)}\right)\right)^{\frac{1}{1-\beta}}}, \quad Cl_j^{(\tau+1)} = \frac{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{\beta} x_k}{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{\beta}}.    (2)

In the case when β = 2 (the parameter called the fuzzifier, which defines the "blurring" of the boundaries between classes), we come to the solution that has the form

U_j^{(\tau+1)}(k) = \frac{\left\|x_k - Cl_j^{(\tau)}\right\|^{-2}}{\sum_{l=1}^{m}\left\|x_k - Cl_l^{(\tau)}\right\|^{-2}}, \quad Cl_j^{(\tau+1)} = \frac{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{2} x_k}{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{2}}.    (3)

The process of fuzzy clustering can be organized in the online adaptive mode

U_j(k+1) = \frac{\left(D^2\left(x_{k+1}, Cl_j(k)\right)\right)^{\frac{1}{1-\beta}}}{\sum_{l=1}^{m}\left(D^2\left(x_{k+1}, Cl_l(k)\right)\right)^{\frac{1}{1-\beta}}}, \quad Cl_j(k+1) = Cl_j(k) + \eta(k+1)\, U_j^{\beta}(k+1)\left(x_{k+1} - Cl_j(k)\right).    (4)

3 Adaptive algorithm for possibilistic fuzzy clustering (APosFC)

The goal function of possibilistic clustering has the form

E(U_j(k), Cl_j, \mu_j) = \sum_{k=1}^{N}\sum_{j=1}^{m} U_j^{\beta}(k) D^2(x_k, Cl_j) + \sum_{j=1}^{m}\mu_j \sum_{k=1}^{N}\left(1 - U_j(k)\right)^{\beta}    (5)

where μ_j > 0 is the scalar parameter that specifies the distance at which the membership level equals 0.5. Minimizing equation (5) with respect to U_j(k), Cl_j and μ_j, we obtain the system of equations

U_j^{(\tau+1)}(k) = \left(1 + \left(\frac{D^2\left(x(k), Cl_j^{(\tau)}\right)}{\mu_j^{(\tau)}}\right)^{\frac{1}{\beta-1}}\right)^{-1}, \quad Cl_j^{(\tau+1)} = \frac{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{\beta} x(k)}{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{\beta}}, \quad \mu_j^{(\tau+1)} = \frac{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{\beta} D^2\left(x(k), Cl_j^{(\tau+1)}\right)}{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{\beta}}.    (6)

In the case when β = 2 we come to the solution that has the form

U_j^{(\tau+1)}(k) = \left(1 + \frac{\left\|x(k) - Cl_j^{(\tau)}\right\|^{2}}{\mu_j^{(\tau)}}\right)^{-1}, \quad Cl_j^{(\tau+1)} = \frac{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{2} x(k)}{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{2}}, \quad \mu_j^{(\tau+1)} = \frac{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{2}\left\|x(k) - Cl_j^{(\tau+1)}\right\|^{2}}{\sum_{k=1}^{N}\left(U_j^{(\tau+1)}(k)\right)^{2}}.    (7)

In the online mode, algorithms (6), (7) can be written in the recurrent form

U_j(k+1) = \left(1 + \left(\frac{D^2\left(x(k+1), Cl_j(k)\right)}{\mu_j(k)}\right)^{\frac{1}{\beta-1}}\right)^{-1}, \quad Cl_j(k+1) = Cl_j(k) + \eta(k+1)\, U_j^{\beta}(k+1)\left(x(k+1) - Cl_j(k)\right), \quad \mu_j(k+1) = \frac{\sum_{p=1}^{k+1} U_j^{\beta}(p) D^2\left(x(p), Cl_j(k+1)\right)}{\sum_{p=1}^{k+1} U_j^{\beta}(p)}    (8)

and

U_j(k+1) = \left(1 + \frac{\left\|x(k+1) - Cl_j(k)\right\|^{2}}{\mu_j(k)}\right)^{-1}, \quad Cl_j(k+1) = Cl_j(k) + \eta(k+1)\, U_j^{2}(k+1)\left(x(k+1) - Cl_j(k)\right), \quad \mu_j(k+1) = \frac{\sum_{p=1}^{k+1} U_j^{2}(p)\left\|x(p) - Cl_j(k+1)\right\|^{2}}{\sum_{p=1}^{k+1} U_j^{2}(p)},    (9)

which permits solving the fuzzy clustering task in sequential mode.
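As a purely illustrative sketch of the recurrent form (9), one online step can be coded as follows. The choice β = 2, the Euclidean metric, the decaying learning rate η(k+1) = 1/(k+1) and the running sums used to approximate the refreshment of μ_j are all assumptions of this example, not prescriptions of the method.

```python
import numpy as np

def aposfc_online_step(x, centroids, mu, sum_num, sum_den, k):
    """One step of the recurrent possibilistic scheme (9), beta = 2, Euclidean metric.

    x         -- new observation x(k+1), shape (n,)
    centroids -- current prototypes Cl_j(k), shape (m, n)
    mu        -- current width parameters mu_j(k), shape (m,)
    sum_num   -- running sums of U_j^2(p) * ||x(p) - Cl_j||^2, shape (m,)
    sum_den   -- running sums of U_j^2(p), shape (m,)
    k         -- number of observations processed so far
    """
    d2 = ((centroids - x) ** 2).sum(axis=1)          # ||x(k+1) - Cl_j(k)||^2
    u = 1.0 / (1.0 + d2 / mu)                        # membership levels U_j(k+1)
    eta = 1.0 / (k + 1)                              # assumed decaying learning rate eta(k+1)
    centroids = centroids + eta * (u ** 2)[:, None] * (x - centroids)
    d2_new = ((centroids - x) ** 2).sum(axis=1)      # distances to the refreshed prototypes
    sum_num = sum_num + u ** 2 * d2_new              # running numerator of mu_j(k+1)
    sum_den = sum_den + u ** 2                       # running denominator of mu_j(k+1)
    mu = sum_num / sum_den                           # mu_j(k+1) as in the last row of (9)
    return centroids, mu, sum_num, sum_den, u
```

In a full APosFC run, such a step would be applied to every incoming observation x(k+1), with the prototypes initialized, for example, by a few batch iterations of (7).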
4 Adaptive Credibilistic Fuzzy Clustering (ACrFC)

Credibilistic fuzzy clustering is associated with minimizing the goal function

E\left(Cr_q(k), Cl_q\right) = \sum_{k=1}^{N}\sum_{q=1}^{m} Cr_q^{\beta}(k)\, D^2\left(x_k, Cl_q\right)    (10)

with the constraints 0 ≤ Cr_q(k) ≤ 1 for all q, k; sup_q Cr_q(k) ≥ 0.5 for all k; Cr_q(k) + sup_{l≠q} Cr_l(k) = 1 for any q and k for which Cr_q(k) ≥ 0.5. Here Cr_q(k) is the credibility that observation x_k belongs to the cluster Cl_q. In this case, the membership level is calculated using the membership function [5, 6]

U_q(k) = \psi_q\left(D\left(x_k, Cl_q\right)\right)    (11)

where ψ_q(·) monotonically decreases on the interval [0, ∞), ψ_q(0) = 1, ψ_q(∞) = 0.

It is easy to see that function (11) is essentially a measure of similarity based on distance [7]. As such a function, it was proposed in [8] to use the expression

U_q(k) = \left(1 + D^2\left(x_k, Cl_q\right)\right)^{-1}.    (12)

It is interesting to note that expression (12) can be rewritten in the form

U_q(k) = \frac{\left(D^2\left(x_k, Cl_q(k)\right) + 1\right)^{-1}}{\sum_{l=1}^{m}\left(D^2\left(x_k, Cl_l(k)\right) + 1\right)^{-1}} = \left(1 + \left(D^2\left(x_k, Cl_q(k)\right) + 1\right)\sum_{l=1,\, l\neq q}^{m}\left(D^2\left(x_k, Cl_l(k)\right) + 1\right)^{-1}\right)^{-1}    (13)

which for the Euclidean metric and β = 2 takes the form of a Cauchy distribution density function with a width parameter σ_q² [9]

U_q(k) = \left(1 + \frac{\left\|x_k - Cl_q(k)\right\|^{2}}{\sigma_q^{2}}\right)^{-1},    (14)

\sigma_q^{2} = \left(\sum_{l=1,\, l\neq q}^{m}\left\|x_k - Cl_l(k)\right\|^{2}\right)^{-1}.    (15)

It is easy to see that the membership function (12) is a special case of (14) for σ_q² = 1.

Finally, the batch algorithm of credibilistic fuzzy clustering can be written in the form [5, 6]:

U_q^{(\tau+1)}(k) = \left(1 + D^2\left(x_k, Cl_q^{(\tau)}\right)\right)^{-1},    (16)

U_q^{*(\tau+1)}(k) = \frac{U_q^{(\tau+1)}(k)}{\sup_l U_l^{(\tau+1)}(k)},    (17)

Cr_q^{(\tau+1)}(k) = \frac{1}{2}\left(U_q^{*(\tau+1)}(k) + 1 - \sup_{l\neq q} U_l^{*(\tau+1)}(k)\right),    (18)

Cl_q^{(\tau+1)} = \frac{\sum_{k=1}^{N} Cr_q^{(\tau+1)}(k)\, x_k}{\sum_{k=1}^{N} Cr_q^{(\tau+1)}(k)}.    (19)

Based on these formulas, we can introduce an online version of the credibilistic fuzzy clustering method in the form

\sigma_q^{2}(k+1) = \left(\sum_{l=1,\, l\neq q}^{m}\left\|x_{k+1} - Cl_l(k)\right\|^{2}\right)^{-1},
U_q(k+1) = \left(1 + \frac{\left\|x_{k+1} - Cl_q(k)\right\|^{2}}{\sigma_q^{2}(k+1)}\right)^{-1},
U_q^{*}(k+1) = \frac{U_q(k+1)}{\sup_l U_l(k+1)},
Cr_q(k+1) = \frac{1}{2}\left(U_q^{*}(k+1) + 1 - \sup_{l\neq q} U_l^{*}(k+1)\right),
Cl_q(k+1) = Cl_q(k) + \eta(k+1)\, Cr_q(k+1)\left(x_{k+1} - Cl_q(k)\right).    (20)

Therefore, from a computational point of view, the online algorithm of credibilistic fuzzy clustering is no more complicated than the recurrent versions of FCM and PCM, while retaining the advantages of the credibilistic approach.
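A minimal sketch of one step of the online credibilistic scheme (20) is given below. It assumes the Euclidean metric, the width σ_q²(k+1) taken as the reciprocal of the summed squared distances to the remaining prototypes as reconstructed in (20), and a decaying learning rate η(k+1) = 1/(k+1); it should be read as an illustration rather than a reference implementation.

```python
import numpy as np

def acrfc_online_step(x, centroids, k):
    """One step of the online credibilistic scheme (20) with the Euclidean metric.

    x         -- new observation x(k+1), shape (n,)
    centroids -- current prototypes Cl_q(k), shape (m, n), m >= 2
    k         -- number of observations processed so far
    """
    d2 = ((centroids - x) ** 2).sum(axis=1)              # ||x(k+1) - Cl_q(k)||^2
    m = centroids.shape[0]
    sigma2 = 1.0 / (d2.sum() - d2)                       # sigma_q^2(k+1), per the reconstruction above
    u = 1.0 / (1.0 + d2 / sigma2)                        # memberships U_q(k+1)
    u_star = u / u.max()                                 # normalized memberships U*_q(k+1)
    cr = np.array([0.5 * (u_star[q] + 1.0 - np.delete(u_star, q).max())
                   for q in range(m)])                   # credibility levels Cr_q(k+1)
    eta = 1.0 / (k + 1)                                  # assumed decaying learning rate
    centroids = centroids + eta * cr[:, None] * (x - centroids)
    return centroids, u, cr
```

Note that, unlike the possibilistic step, no per-cluster width has to be stored here, since σ_q²(k+1) is recomputed from the current prototypes for every new observation.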
5 Evolutionary Cat Swarm Optimization

To search for the global extremum of the goal functions (1), (5) and (10), it is expedient to use bio-inspired evolutionary particle swarm optimization algorithms [10]. Among the swarm algorithms, some of the fastest are the so-called cat swarm algorithms [11, 12], which have proved to be effective in solving a wide range of Data Mining tasks. The cat swarm (CS) model of behavior assumes that each cat cat_p of a swarm consisting of Q individuals (p = 1, 2, ..., Q) can be in one of two states: Seeking Mode (SM) and Tracing Mode (TM). The seeking mode is associated with slow movements of slight amplitude near the initial position (scanning the space in the vicinity of the current position), while the tracing mode is determined by fast jumps of large amplitude and allows the cat cat_p to escape from a local extremum if it got there. The combination of local scanning and abrupt changes of the current state makes it more likely to find the global extremum compared to traditional multi-extremal optimization methods.

In the general case, both of these modes for each cat of the swarm can be described by the recurrent optimization procedure [13]

cat_p(\tau+1) = cat_p(\tau) + \alpha\left(cat_p(\tau) - cat_p(\tau-1)\right) - \eta_c\,\hat{\nabla}E\left(cat_p(\tau)\right) + \lambda\,\xi(\tau)    (21)

where cat_p(τ+1) is the state of the p-th cat of the swarm at the (τ+1)-th iteration of the search, α is the parameter that determines the inertia properties of the tracing mode (when α → 0 the optimization process approaches the standard gradient search), η_c is the seeking mode step, \hat{\nabla}E(cat_p(τ)) is the gradient estimate of the goal function (1), (5) or (10) in the neighborhood of the point cat_p(τ), ξ(τ) is a random component that introduces additional stochastic motions into the tracing process, and λ is the parameter that specifies the amplitude of these movements.

In this algorithm, each cat can operate in two modes, seeking and tracing, in parallel with the rest of the swarm. This approach provides a search for the global extremum when the number of cats in the swarm is sufficient.
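To illustrate how the tracing-mode recurrence (21) can be applied to any of the goal functions (1), (5) or (10), the sketch below performs one update for a single cat. The central-difference gradient estimate and the default values of the inertia, step and noise-amplitude parameters are assumptions of this example only.

```python
import numpy as np

def tracing_mode_step(cat, cat_prev, goal, alpha=0.5, step=0.05, lam=0.01, eps=1e-3, rng=None):
    """One tracing-mode update of the form (21) for a single cat (candidate solution).

    cat, cat_prev -- current and previous positions cat_p(tau), cat_p(tau - 1), 1-D arrays
    goal          -- goal function to be minimized, e.g. (1), (5) or (10) as a function
                     of the flattened prototype vector
    alpha         -- inertia parameter of the tracing mode
    step          -- seeking-mode step multiplying the gradient estimate
    lam           -- amplitude of the random component xi(tau)
    eps           -- perturbation size of the finite-difference gradient estimate (assumption)
    """
    rng = rng if rng is not None else np.random.default_rng()
    # stochastic (finite-difference) estimate of the gradient of the goal function at cat_p(tau)
    grad = np.zeros(cat.shape)
    for i in range(cat.size):
        e = np.zeros(cat.shape)
        e[i] = eps
        grad[i] = (goal(cat + e) - goal(cat - e)) / (2.0 * eps)
    xi = rng.standard_normal(cat.shape)              # random component xi(tau)
    return cat + alpha * (cat - cat_prev) - step * grad + lam * xi
```

In a swarm of Q cats, this update would be applied to the cats currently in tracing mode, while the remaining cats scan the vicinity of their current positions with small random moves.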
6 Experimental research

Fuzzy clustering based on evolutionary cat swarm optimization (CSO) was performed on four different data samples: Iris, Cancer, Wine and Glass. The characteristics of each data set are presented in Table 1.

Table 1. Characteristic parameters of the data sets

Data set   Number of clusters   Number of attributes   Number of observations
Iris       3                    4                      150
Cancer     2                    9                      683
Wine       3                    13                     178
Glass      6                    8                      214

Table 2. Parameters of the cat swarm optimization algorithm (CSO)

Parameter                    Value
SRD                          Random in [0, 1]
Seeking memory pool (SMP)    5
Population size              Number of clusters
r1                           Random in [0, 1]
c1                           Constant
SPC                          Random in [0, 1]
Number of iterations         Set manually

Table 3. Comparative results of the processing time of the clustering algorithms: fuzzy c-means (FCM), particle swarm optimization (PSO), Gauss-Seidel algorithm (GSA), CSO, APrFCCSO, APosFCCSO and ACrFCCSO

Data set   FCM     PSO     GSA     CSO     APrFCCSO   APosFCCSO   ACrFCCSO
Cancer     0.009   0.138   0.204   0.026   0.007      0.010       0.012
Glass      0.010   0.431   0.431   0.021   0.020      0.025       0.024
Iris       0.008   0.020   0.022   0.043   0.012      0.017       0.015
Wine       0.009   0.282   0.098   0.076   0.013      0.015       0.013

A comparative analysis of the clustering quality of the existing methods and the proposed methods was also conducted using the main quality indices: Partition Coefficient (PC), Partition Index (SC) and Xie and Beni's Index (XB).

Table 4. Evaluation of the quality of the fuzzy data clustering methods

Data clustering method                                         PC     SC     XB
Fuzzy c-means                                                  0.50   1.62   0.19
Adaptive probabilistic fuzzy clustering                        0.25   1.44   0.01
Adaptive possibilistic fuzzy data clustering                   0.26   1.22   0.01
Adaptive credibilistic fuzzy data clustering                   0.24   1.25   0.01
Adaptive probabilistic fuzzy clustering CSO (APrFCCSO)         0.22   1.37   0.25
Adaptive possibilistic fuzzy data clustering CSO (APosFCCSO)   0.23   0.95   0.01
Adaptive credibilistic fuzzy data clustering CSO (ACrFCCSO)    0.22   0.70   0.15

Table 5. Results of clustering with CSO, APrFCCSO, APosFCCSO and ACrFCCSO for different numbers of iterations (average error, %); each cell gives the error after 50 / 100 / 150 iterations

Data set   CSO                     APrFCCSO                APosFCCSO               ACrFCCSO
Iris       23.34 / 20.84 / 21.67   17.55 / 14.78 / 16.46   17.65 / 14.88 / 16.55   17.45 / 14.78 / 16.45
Cancer     40.23 / 40.55 / 41.47   38.89 / 39.22 / 39.15   37.49 / 38.42 / 39.65   37.45 / 38.44 / 39.65
Wine       24.55 / 21.44 / 22.20   18.43 / 17.37 / 16.32   18.95 / 18.37 / 19.32   18.85 / 18.47 / 19.22
Glass      56.34 / 56.48 / 55.67   51.63 / 49.79 / 52.03   51.88 / 49.39 / 52.13   51.78 / 49.49 / 51.70

Table 6. Comparative results of the performance indicators (MSE) of the PSO, CSO, APrFCCSO, APosFCCSO and ACrFCCSO algorithms

Data set   MSE      PSO          CSO          APrFCCSO      APosFCCSO     ACrFCCSO
Iris       Best     4×10^-7      9×10^-10     8.4×10^-11    8.8×10^-11    8.56×10^-11
           Median   7×10^-6      7.3×10^-9    5.2×10^-10    5.4×10^-10    5.5×10^-10
           Worst    1.2×10^-5    9.3×10^-9    9.6×10^-10    9.4×10^-10    9.4×10^-10
Cancer     Best     1.3×10^-7    8×10^-10     8.5×10^-11    8.4×10^-11    8.35×10^-11
           Median   7×10^-6      4.4×10^-9    7.6×10^-11    6.6×10^-11    6.55×10^-11
           Worst    2.02×10^-5   6.8×10^-9    7.8×10^-10    7.78×10^-10   7.8×10^-10
Wine       Best     1.4×10^-6    7×10^-10     9.5×10^-11    9.0×10^-11    8.95×10^-11
           Median   5×10^-5      4×10^-9      6.6×10^-11    6×10^-11      5.95×10^-11
           Worst    2×10^-5      6×10^-9      6.8×10^-10    6.7×10^-10    6.69×10^-10
Glass      Best     2.5×10^-7    7.9×10^-10   8.7×10^-11    8.4×10^-11    8.35×10^-11
           Median   6.9×10^-6    5×10^-9      7.7×10^-11    7.8×10^-11    7.77×10^-11
           Worst    3×10^-5      5.9×10^-9    7.7×10^-10    7.5×10^-10    7.49×10^-10

7 Conclusion

The problem of fuzzy clustering based on probabilistic, possibilistic and credibilistic online approaches was considered. The recurrent modifications of the well-known batch procedures, designed to solve Data Stream Mining problems, allow processing information in online mode as it sequentially arrives to the system under consideration. Since the goal functions of fuzzy clustering are in the general case multi-extremal, it was proposed to refine the solutions using swarm evolutionary optimization algorithms. A modification based on the cat swarm optimization procedure, with improved properties due to the use of a stochastic gradient estimate, was proposed. The experiments confirmed the effectiveness of the developed approach, which is characterized by simplicity of numerical implementation and a sufficiently high convergence rate.

References

1. Xu, R., Wunsch, D.C.: Clustering. John Wiley & Sons, Hoboken, N.J. (2009)
2. Aggarwal, C.C.: Data Mining: The Textbook. Springer (2015)
3. Kohonen, T.: Self-Organizing Maps. Springer-Verlag, Berlin (1995)
4. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981)
5. Zhou, J., Wang, Q., Hung, C.-C., Yi, X.: Credibilistic clustering: the model and algorithms. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, No. 4, pp. 545-564 (2015)
6. Zhou, J., Wang, Q., Hung, C.-C.: Credibilistic clustering algorithms via alternating cluster estimation. J. Intell. Manuf., pp. 727-738 (2017)
7. Young, F.W., Hamer, R.M.: Theory and Applications of Multidimensional Scaling. Erlbaum, Hillsdale, N.J. (1994)
8. Zhou, J., Hung, C.-C.: A generalized approach to possibilistic clustering algorithms. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 15, pp. 117-138 (2007)
9. Hu, Zh., Bodyanskiy, Ye., Tyshchenko, O., Shafronenko, A.: Fuzzy clustering of incomplete data by means of similarity measures. In: Proc. 2019 IEEE 2nd Ukraine Conference on Electrical and Computer Engineering (UKRCON), Track 6, Lviv, Ukraine, pp. 149-152 (2019)
10. Grosan, C., Abraham, A., Chis, M.: Swarm Intelligence in Data Mining. Studies in Computational Intelligence. Springer (2006)
11. Chu, S.-C., Tsai, P.-W., Pan, J.-S.: Cat swarm optimization. In: Lecture Notes in Artificial Intelligence. Springer-Verlag, Berlin Heidelberg (2006)
12. Chu, S.-C., Tsai, P.-W.: Computational intelligence based on the behavior of cats. Int. J. of Innovative Computing, Information, and Control, No. 1, pp. 163-173 (2007)
13. Shafronenko, A., Bodyanskiy, Ye., Pliss, I., Patlan, K.: Fuzzy clusterization of distorted by missing observations data sets using evolutionary optimization. In: Proc. Advanced Computer Information Technologies (ACIT'2019), České Budějovice, Czech Republic, June 5-7, 2019, pp. 217-220 (2019). doi: 10.1109/ACITT.2019.8779888
14. Hu, Zh., Bodyanskiy, Ye., Tyshchenko, O., Shafronenko, A.: Fuzzy clustering of incomplete data by means of similarity measures. In: 2019 IEEE 2nd Ukraine Conference on Electrical and Computer Engineering (UKRCON-2019), Conference Proceedings, July 2-6, 2019, Lviv, Ukraine, pp. 149-152 (2019). doi: 10.1109/UKRCON.2019.8879844
15. Shafronenko, A.Yu., Bodyanskiy, Ye.V., Pliss, I.P.: The fast modification of evolutionary bioinspired cat swarm optimization method. In: 2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL), Sozopol, Bulgaria, Sept. 6-8, 2019, pp. 548-552 (2019). doi: 10.1109/CAOL46282.2019.9019583