=Paper=
{{Paper
|id=Vol-3392/paper12
|storemode=property
|title=Credibilistic Fuzzy Clustering Method Based on Evolutionary Approach of Crazy Wolfs in Online Mode
|pdfUrl=https://ceur-ws.org/Vol-3392/paper12.pdf
|volume=Vol-3392
|authors=Alina Shafronenko,Yevgeniy Bodyanskiy,Iryna Pliss
|dblpUrl=https://dblp.org/rec/conf/cmis/ShafronenkoBP23
}}
==Credibilistic Fuzzy Clustering Method Based on Evolutionary Approach of Crazy Wolfs in Online Mode==
Credibilistic Fuzzy Clustering Method Based on Evolutionary Approach of Crazy Wolfs in Online Mode

Alina Shafronenko (a), Yevgeniy Bodyanskiy (a), Iryna Pliss (a)
(a) Kharkiv National University of Radio Electronics, Nauky ave 14, Kharkiv, 61166, Ukraine

Abstract

The problem of credibilistic fuzzy clustering of big data is considered. This task is of interest when data are fed for processing in both batch and online modes and the objective function has many extrema. To find the global extremum of the objective function of credibilistic fuzzy clustering, a modification of the crazy grey wolf algorithm is introduced that combines the advantages of evolutionary algorithms and global random search. It is shown that different search modes are generated by a unified mathematical procedure whose special cases are well-known algorithms for both local and global optimization. The proposed approach is quite simple in computational implementation and is characterized by high speed and reliability in tasks of multi-extremal fuzzy clustering.

Keywords
Fuzzy clustering, credibilistic fuzzy clustering, adaptive goal function, evolutionary algorithms, grey wolf optimization

The Sixth International Workshop on Computer Modeling and Intelligent Systems (CMIS-2023), May 3, 2023, Zaporizhzhia, Ukraine
EMAIL: alina.shafronenko@nure.ua (A. Shafronenko); yevgeniy.bodyaskiy@nure.ua (Y. Bodyanskiy); iryna.pliss@nure.ua (I. Pliss)
ORCID: 0000-0002-8040-0279 (A. Shafronenko); 0000-0001-5418-2143 (Y. Bodyanskiy); 0000-0001-7918-7362 (I. Pliss)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

The task of classification in the self-learning mode (clustering) of multidimensional data is an important part of traditional intelligent data analysis. One of the main directions of computational intelligence is the so-called evolutionary algorithms, which are mathematical models of the evolution of biological organisms. The problem of clustering vector observations often arises in many tasks of intelligent data analysis, such as Data Mining, Dynamic Data Mining, Data Stream Mining, Big Data Mining, and Web Mining [1, 2], and primarily in fuzzy data clustering, where a vector observation can belong to more than one class with different levels of probability, possibility, reliability, etc. The most effective tools are Kohonen's self-organizing maps [3] and evolutionary algorithms, which can improve data clustering in cases where data are processed in online mode.

From a computational point of view, the clustering problem is reduced to finding the optimum of multi-extremal functions of a vector argument with the help of gradient procedures that are repeatedly launched from different starting points. The process of finding extrema can be sped up by using the ideas of evolutionary optimization. Evolutionary computation [4] and swarm intelligence are considered among the fastest growing approaches that employ computational intelligence to solve optimization problems. Evolutionary computation includes the genetic algorithm [5], the biogeography-based optimizer [6], evolutionary strategies [7], population-based incremental learning [8], genetic programming [9], and differential evolution [10]. Swarm intelligence [11] is another powerful form of computational intelligence used to solve optimization problems. Swarm intelligence algorithms use the ideas of evolutionary optimization and include algorithms inspired by nature, swarm algorithms, and population algorithms that imitate natural swarms, communities, or systems such as fish schools, bird flocks, cat swarms, bacterial growth, insect colonies, animal herds, etc. Among them are ant colony optimization [12], particle swarm optimization (PSO) [13], the artificial bee colony [14], the ant lion optimizer (ALO) [15], moth-flame optimization, the dragonfly algorithm, the sine cosine algorithm, the whale optimization algorithm, the multi-verse optimizer, the grey wolf optimizer (GWO) [16], the firefly algorithm, cuckoo search, and many others.

The purpose of the paper is to consider the problem of fuzzy clustering of data that are received for processing in both batch and online modes and to develop a method that can find the global extremum of a multi-extremal objective function.

2. Problem Statement

Baseline information for solving the clustering task in a batch mode is a sample of observations formed from $N$ $n$-dimensional feature vectors $X = \{x_1, x_2, \ldots, x_N\} \subset R^n$, $x_k \in X$, $k = 1, 2, \ldots, N$. The problem of fuzzy clustering of data arrays is considered under conditions when the formed clusters arbitrarily overlap in the feature space. The source information for solving the problem is an array of multidimensional data vectors formed by a set of vector observations $X = (x(1), x(2), \ldots, x(k), \ldots, x(N)) \subset R^n$, where $k$ is, in the general case, the observation number in the initial array and $x(k) = (x_1(k), \ldots, x_i(k), \ldots, x_n(k))^T$. The result of clustering is the partition of this array into $m$ overlapping classes with centroids $c_q \in R^n$, $q = 1, 2, \ldots, m$, and the computation of membership levels $0 \le u_q(k) \le 1$ of each vector observation $x(k)$ to every cluster $c_q$.
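As a minimal illustration of the objects introduced in the problem statement, the following NumPy sketch fixes the shapes of the data array, the centroids, and the membership levels; the names X, C, U and the sizes are assumptions made only for this example.

```python
import numpy as np

# Illustrative sizes: N observations, n features, m overlapping clusters
N, n, m = 200, 4, 3

X = np.random.randn(N, n)                       # data array; x(k) is the k-th row
C = X[np.random.choice(N, m, replace=False)]    # centroids c_q, seeded from the sample
U = np.full((N, m), 1.0 / m)                    # membership levels u_q(k), each in [0, 1]
```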
3. Credibilistic Fuzzy Clustering Method

As an alternative to probabilistic and possibilistic procedures [17, 18], a credibilistic fuzzy clustering approach was introduced that uses credibility theory [19, 20] as its basis and is largely devoid of the drawbacks of the known methods. The most common approach within the framework of probabilistic fuzzy clustering is associated with minimizing the goal function [3]

$$Goal\left(u_q(k), c_q\right) = \sum_{k=1}^{N} \sum_{q=1}^{m} u_q^{\beta}(k)\, d^2\left(x(k), c_q\right) \qquad (1)$$

where $u_q(k)$ is the membership level, $c_q$ is the centroid, $d$ is the Euclidean metric, and $\beta$ is the fuzzifier, with the constraints

$$\sum_{q=1}^{m} u_q(k) = 1, \qquad 0 < \sum_{k=1}^{N} u_q(k) < N. \qquad (2)$$

Solution of this nonlinear programming problem using the method of Lagrange indefinite multipliers leads to the well-known result

$$u_q(k) = \frac{\left(d^2\left(x(k), c_q\right)\right)^{\frac{1}{1-\beta}}}{\sum_{l=1}^{m} \left(d^2\left(x(k), c_l\right)\right)^{\frac{1}{1-\beta}}}, \qquad c_q = \frac{\sum_{k=1}^{N} u_q^{\beta}(k)\, x(k)}{\sum_{k=1}^{N} u_q^{\beta}(k)} \qquad (3)$$

coinciding for $\beta = 2$ with the popular Fuzzy C-Means (FCM) method of J. Bezdek [21]. If the data are fed for processing sequentially, the solution of the nonlinear programming problem (1), (2) using the Arrow-Hurwicz-Uzawa algorithm leads to the online procedure

$$u_q(k+1) = \frac{\left(d^2\left(x(k+1), c_q(k)\right)\right)^{\frac{1}{1-\beta}}}{\sum_{l=1}^{m} \left(d^2\left(x(k+1), c_l(k)\right)\right)^{\frac{1}{1-\beta}}}, \qquad c_q(k+1) = c_q(k) + \eta(k+1)\, u_q^{\beta}(k+1)\left(x(k+1) - c_q(k)\right). \qquad (4)$$
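For illustration, the following minimal NumPy sketch performs one batch iteration of relations (3) and one step of the online procedure (4); the squared Euclidean distance and the step-size schedule eta(k+1) = 1/(k+1) are assumptions made for the example rather than prescriptions of the paper.

```python
import numpy as np

def fcm_batch_step(X, C, beta=2.0, eps=1e-12):
    """One batch iteration of relations (3): update memberships, then centroids."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2) + eps   # d^2(x(k), c_q), shape (N, m)
    w = d2 ** (1.0 / (1.0 - beta))                                   # (d^2)^(1/(1-beta))
    U = w / w.sum(axis=1, keepdims=True)                             # membership levels u_q(k)
    C_new = (U ** beta).T @ X / (U ** beta).sum(axis=0)[:, None]     # updated centroids c_q
    return U, C_new

def fcm_online_step(x, C, k, beta=2.0, eps=1e-12):
    """One step of the online procedure (4) for a new observation x(k+1)."""
    d2 = ((C - x) ** 2).sum(axis=1) + eps                            # d^2(x(k+1), c_q(k))
    w = d2 ** (1.0 / (1.0 - beta))
    u = w / w.sum()                                                   # memberships u_q(k+1)
    eta = 1.0 / (k + 1)                                               # step size (an assumed schedule)
    C_new = C + eta * (u ** beta)[:, None] * (x - C)                  # centroids c_q(k+1)
    return u, C_new
```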
The goal function of credibilistic fuzzy clustering has a form [19, 22] close to (1),

$$Goal\left(Cred_q(k), c_q\right) = \sum_{k=1}^{N} \sum_{q=1}^{m} Cred_q(k)\, d^2\left(x(k), c_q\right) \qquad (5)$$

with constraints "softer" than (2):

$$0 \le Cred_q(k) \le 1 \text{ for all } q \text{ and } k, \qquad \sup_q Cred_q(k) \ge 0.5 \text{ for all } k, \qquad Cred_q(k) + \sup_{l \ne q} Cred_l(k) = 1 \text{ for any } q \text{ and } k \text{ for which } Cred_q(k) \ge 0.5. \qquad (6)$$

It should be noted that the goal functions (1) and (5) are similar and that (6) contains no rigid probabilistic constraint on the sum of the memberships such as in (2). In the procedures of credibilistic clustering there is also the concept of fuzzy membership, which is calculated using a neighborhood function of the form

$$u_q(k) = \mu_q\left(d\left(x(k), c_q\right)\right) \qquad (7)$$

monotonically decreasing on the interval $[0, \infty)$ so that $\mu_q(0) = 1$, $\mu_q(\infty) = 0$. Such a function is essentially an empirical similarity measure [23] related to the distance by the relation

$$u_q(k) = \frac{1}{1 + d^2\left(x(k), c_q\right)}. \qquad (8)$$

Note also that it was shown earlier in [16] that the first relation of (3) for $\beta = 2$ can be rewritten as

$$u_q(k) = \left(1 + \frac{d^2\left(x(k), c_q\right)}{\sigma_q^2}\right)^{-1}, \qquad (9)$$

where

$$\sigma_q^2 = \left(\sum_{\substack{l=1 \\ l \ne q}}^{m} d^{-2}\left(x(k), c_l\right)\right)^{-1}, \qquad (10)$$

which is a generalization of the function (8) (for $\sigma_q^2 = 1$ relation (9) coincides with (8)) and satisfies all the conditions for (7). In batch form the algorithm of credibilistic fuzzy clustering in the accepted notation can be written as [20, 22]

$$u_q(k) = \left(1 + d^2\left(x(k), c_q\right)\right)^{-1}, \qquad \bar{u}_q(k) = \frac{u_q(k)}{\sup_l u_l(k)}, \qquad Cred_q(k) = \frac{1}{2}\left(\bar{u}_q(k) + 1 - \sup_{l \ne q} \bar{u}_l(k)\right), \qquad c_q = \frac{\sum_{k=1}^{N} Cred_q(k)\, x(k)}{\sum_{k=1}^{N} Cred_q(k)} \qquad (11)$$

and in the online mode, taking into account (9), (10), as

$$\sigma_q^2(k+1) = \left(\sum_{\substack{l=1 \\ l \ne q}}^{m} d^{-2}\left(x(k+1), c_l(k)\right)\right)^{-1}, \qquad u_q(k+1) = \left(1 + \frac{d^2\left(x(k+1), c_q(k)\right)}{\sigma_q^2(k+1)}\right)^{-1}, \qquad \bar{u}_q(k+1) = \frac{u_q(k+1)}{\sup_l u_l(k+1)}, \qquad Cred_q(k+1) = \frac{1}{2}\left(\bar{u}_q(k+1) + 1 - \sup_{l \ne q} \bar{u}_l(k+1)\right), \qquad c_q(k+1) = c_q(k) + \eta(k+1)\, Cred_q(k+1)\left(x(k+1) - c_q(k)\right). \qquad (12)$$

From the point of view of computational implementation, algorithm (12) is not more complicated than procedure (4) and, in the general case, is its generalization to the case of the credibilistic approach to fuzzy clustering.
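The batch relations (11) can be sketched in NumPy as follows; this is a hedged illustration under the reconstruction above, not the authors' implementation, with the similarity-type membership from (8) and a small eps guarding the division.

```python
import numpy as np

def credibilistic_batch_step(X, C, eps=1e-12):
    """One batch iteration of the credibilistic algorithm (11)."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)          # d^2(x(k), c_q), shape (N, m)
    u = 1.0 / (1.0 + d2)                                              # similarity-type membership, cf. (8)
    u_bar = u / u.max(axis=1, keepdims=True)                          # u_q(k) / sup_l u_l(k)
    row_max = u_bar.max(axis=1, keepdims=True)                        # equals 1 after normalization
    second = np.sort(u_bar, axis=1)[:, -2][:, None]                   # second largest value per row
    sup_other = np.where(u_bar >= row_max, second, row_max)           # sup over the other memberships
    cred = 0.5 * (u_bar + 1.0 - sup_other)                            # credibility levels Cred_q(k)
    C_new = cred.T @ X / (cred.sum(axis=0)[:, None] + eps)            # credibility-weighted centroids
    return cred, C_new
```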
4. Evolutionary Approach of Crazy Wolfs

To find the global extremum of functions (1) and (5), it is advisable to use bio-inspired evolutionary algorithms of particle swarm optimization [24], among which the so-called grey wolf optimizer (GWO) can be noted as one of the fastest [25]. According to the algorithm proposed in [25], grey wolves live together and hunt in groups. If the prey is found, they first track, pursue, and approach it. The distances to the dominant wolves can be described as

$$C_\alpha = \left|B_1 \cdot GW_\alpha - X(t)\right|, \qquad C_\beta = \left|B_2 \cdot GW_\beta - X(t)\right|, \qquad C_\delta = \left|B_3 \cdot GW_\delta - X(t)\right|.$$

If the prey runs, the grey wolves chase, surround, and watch the prey until it stops moving, which can be written as

$$GW_1 = GW_\alpha - A_1 \cdot C_\alpha, \qquad GW_2 = GW_\beta - A_2 \cdot C_\beta, \qquad GW_3 = GW_\delta - A_3 \cdot C_\delta,$$

where $GW$ is the wolf movement function. After the prey is found, the attack begins:

$$GW(t+1) = \frac{GW_1 + GW_2 + GW_3}{3}.$$

Figure 1: How the α, β and δ wolves are defined in GWO

In the case when $|A| > 1$, the grey wolves flee from the dominants, which means that the omega wolves will flee from the prey and explore more of the space; if $|A| < 1$, they approach the dominants, which means that the omega wolves will follow the dominants that approach the prey.

Figure 2: Exploration and exploitation of a wolf in GWO

To achieve a proper trade-off between scouting and hunting, an improved grey wolf algorithm is proposed.

Figure 3: Scheme of operation of the grey wolf algorithm

The change in the positions of the wolves is described as follows:

$$C = \left|B \cdot X_P(t) - X(t)\right|, \qquad (10)$$

$$X(t+1) = X_P(t) - A \cdot C + \xi, \qquad (11)$$

where $X_P$ is the prey position, $X$ is the position of the wolf, and $\xi$ is a random perturbation that introduces an additional scan of the search space. To increase the reliability of finding exactly the global extremum of the objective function, one can use the idea of "crazy cats" optimization [26, 27], modified by the introduction of a random walk, which has proven its effectiveness when solving multi-extremal problems. By introducing, following L. Rastrigin, an additional search perturbation, it is possible to write down the movement of the wolf in the form

$$X(t+1) = X(t) + \alpha\, \eta^{1/2} H(k), \qquad (12)$$

where $\alpha$ is a parameter for correction of the disturbance characteristics, $0 < \eta < 1$ is a self-learning speed parameter, and $\sigma_H^2$ is the variance of the white noise $H$. In this way, the probability of finding the global extremum of the adopted objective function increases, which ultimately increases the effectiveness of the fuzzy clustering process.

5. Experimental Research

The proposed method was tested on a well-known multi-extremal function, Ackley's function, presented in the form (13) and shown in Fig. 4:

$$f(x, y) = -20 \exp\left(-0.2\sqrt{0.5\left(x^2 + y^2\right)}\right) - \exp\left(0.5\left(\cos 2\pi x + \cos 2\pi y\right)\right) + e + 20. \qquad (13)$$

Figure 4: Ackley's function

The pseudocode of the proposed method can be written in the form of the steps presented in Table 1:

Table 1: Pseudocode of the Crazy Wolfs method
- Data sampling: swarm population
- Setup options: control parameter, stop criterion
- Initialization: initial positions of the dominant grey wolves
- While the stopping criterion is not met:
  - calculation of the new value of the fitness function
  - position update
  - search
  - restrictions on the positions of the wolves
  - update of α, β and δ
  - update of the stop criterion
- End

The behavior of the Crazy GWO method was tested on the multi-extremal Ackley's function. The analysis was carried out over 100 iterations. Fig. 5 demonstrates the search history of the global extremum of the tested function (a), the trajectory of the first variable of the first wolf (b), the fitness history of all wolves (c), and the parameter A (d).

Figure 5: The search history of the Crazy GWO method over 100 iterations

To further validate the searching accuracy, the Crazy GWO method (CGWO) was compared with the GWO and PSO algorithms.

Table 2: Numerical statistics results
Accuracy   CGWO          GWO           PSO
Best       1.5099e-017   7.5495e-013   0.0035
Ave        1.2204e-016   1.0048e-012   0.0055
Worst      2.2204e-014   1.4655e-013   0.0082

Two swarm intelligence algorithms (the PSO algorithm and the GWO algorithm) were selected to carry out the simulation contrast experiments with the proposed CGWO so as to verify its superiority in convergence velocity and searching precision. The simulation convergence results on the adopted testing function are shown in Fig. 6.

Figure 6: Convergence curves on Ackley's function
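The experiment above can be reproduced in spirit with the short Python sketch below, which implements a grey-wolf-type search with an additive "crazy" random-walk perturbation and runs it on Ackley's function (13). The population size, the linearly decreasing exploration factor, the perturbation scale, and all names are illustrative assumptions, not the exact settings of the paper.

```python
import numpy as np

def ackley(p):
    """Ackley's function (13) for a 2-D point p = (x, y)."""
    x, y = p
    return (-20.0 * np.exp(-0.2 * np.sqrt(0.5 * (x ** 2 + y ** 2)))
            - np.exp(0.5 * (np.cos(2 * np.pi * x) + np.cos(2 * np.pi * y)))
            + np.e + 20.0)

def crazy_gwo(f, dim=2, n_wolves=20, n_iter=100, bounds=(-5.0, 5.0), noise=0.1, seed=0):
    """Grey-wolf-type search with an additive random-walk ('crazy') perturbation.
    An illustrative reading of the scheme above, not the authors' exact code."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    for t in range(n_iter):
        fitness = np.apply_along_axis(f, 1, wolves)
        alpha, beta, delta = wolves[np.argsort(fitness)[:3]]      # three dominant wolves
        a = 2.0 * (1.0 - t / n_iter)                              # exploration factor, decreases to 0
        for i in range(n_wolves):
            candidates = []
            for leader in (alpha, beta, delta):
                A = a * (2.0 * rng.random(dim) - 1.0)             # coefficient A
                B = 2.0 * rng.random(dim)                         # coefficient B
                C = np.abs(B * leader - wolves[i])                # distance to the dominant wolf
                candidates.append(leader - A * C)                 # candidate positions GW_1..GW_3
            step = np.mean(candidates, axis=0)                    # averaged "attack" position
            step += noise * (1.0 - t / n_iter) * rng.standard_normal(dim)  # random-walk perturbation
            wolves[i] = np.clip(step, lo, hi)                     # keep wolves inside the search box
    best = wolves[np.argmin(np.apply_along_axis(f, 1, wolves))]
    return best, f(best)

best, value = crazy_gwo(ackley)
print(best, value)   # expected to land close to the global minimum of Ackley's function at (0, 0)
```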
6. Discussion

Analyzing the results of the obtained experimental studies and the comparative analysis of the method for clustering data arrays based on the crazy grey wolf algorithm against clustering methods based both on the classical approach to data clustering and on more exotic ones, the proposed method demonstrates sufficiently high results. The main advantages of the proposed method are the simplicity of the mathematical calculations and the speed of working with data regardless of the type, size, and quality of the analyzed sample. The accuracy of the data clustering method based on the improved grey wolf algorithm and of the obtained clustering results should also be noted, which is achieved with the help of the optimization procedure of the evolutionary algorithm.

Conclusion

The problem of credibilistic fuzzy clustering of data using a crazy grey wolf algorithm is considered. The proposed method excludes the possibility of "getting stuck" in local extrema by means of a double check that the dominant wolf is at the extremum and, within the given calculation error, allows reducing the number of restarts of the procedure. A feature of the proposed modified optimization procedure is the possibility of managing the random disturbances that determine the properties of the modified random walk, which has proven its effectiveness when solving multi-extremal problems. The approach is quite simple in numerical implementation and allows finding global extrema of complex functions, which is confirmed by the results of the experimental research.

7. Acknowledgement

The work is supported by the state budget scientific research project of Kharkiv National University of Radio Electronics "Development of methods and algorithms of combined learning of deep neuro-neo-fuzzy systems under the conditions of a short training sample".

8. References

[1] R. Xu, D.C. Wunsch, Clustering. Hoboken, N.J.: John Wiley & Sons, Inc., 2009.
[2] C.C. Aggarwal, Data Mining: The Textbook. Springer, 2015.
[3] T. Kohonen, Self-Organizing Maps. Berlin: Springer, 1995, 362 p. DOI: 10.1007/978-3-642-56927-2.
[4] T. Back, D.B. Fogel, Z. Michalewicz, Handbook of Evolutionary Computation. Oxford University Press, New York, 1997.
[5] L. Davis, Handbook of Genetic Algorithms. Van Nostrand Reinhold, New York, 1991.
[6] D. Simon, Biogeography-based optimization. IEEE Trans. Evol. Comput., 2008, 12(6), pp. 702-713.
[7] N. Hansen, S. Kern, Evaluating the CMA evolution strategy on multimodal test functions. International Conference on Parallel Problem Solving from Nature. Springer, 2004, pp. 282-291.
[8] M. Hohfeld, G. Rudolph, Towards a theory of population-based incremental learning. Proceedings of the 4th IEEE Conference on Evolutionary Computation, 1997.
[9] J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, 1992.
[10] R. Storn, K. Price, Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim., 1997, 11(4), pp. 341-359.
[11] A.P. Engelbrecht, Fundamentals of Computational Swarm Intelligence. Wiley, Hoboken, 2006.
[12] M. Dorigo, L.M. Gambardella, Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans. Evol. Comput., 1997, 1(1), pp. 53-66.
[13] R. Eberhart, J. Kennedy, A new optimizer using particle swarm theory. Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS'95). IEEE, 1995, pp. 39-43.
[14] D. Karaboga, An idea based on honey bee swarm for numerical optimization. Technical report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department, 2005.
[15] S. Mirjalili, The ant lion optimizer. Adv. Eng. Softw., 2015, Vol. 83, pp. 80-98.
[16] E. Elhariri, N. El-Bendary, A.E. Hassanien, A. Abraham, Grey wolf optimization for one-against-one multi-class support vector machines. 2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR). IEEE, 2015, pp. 7-12.
[17] F. Höppner, F. Klawonn, R. Kruse, T. Runkler, Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition. Chichester, John Wiley & Sons, 1999, 289 p.
[18] R. Krishnapuram, J.M. Keller, "A Possibilistic Approach to Clustering". IEEE Transactions on Fuzzy Systems, May 1993, Vol. 1, pp. 98-110. DOI: 10.1109/91.227387.
[19] J. Zhou, Credibilistic clustering: the model and algorithms. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2015, Vol. 23, No. 4, pp. 545-564. DOI: 10.1142/S0218488515500245.
[20] J. Zhou, Credibilistic clustering algorithms via alternating cluster estimation. Journal of Intelligent Manufacturing, 2017, Vol. 28, pp. 727-738. DOI: 10.1007/s10845-014-1004-6.
[21] J.C. Bezdek, "A convergence theorem for the fuzzy ISODATA clustering algorithms". IEEE Transactions on Pattern Analysis and Machine Intelligence, 1980, Vol. 2, No. 1, pp. 1-8. DOI: 10.1109/TPAMI.1980.4766964.
[22] A. Shafronenko, Ye. Bodyanskiy, I. Klymova, O. Holovin, Online credibilistic fuzzy clustering of data using membership functions of special type. Proceedings of The Third International Workshop on Computer Modeling and Intelligent Systems (CMIS-2020), April 27 - May 1, 2020, Zaporizhzhia, 2020. URL: http://ceur-ws.org/Vol-2608/paper56.pdf.
[23] Ye. Bodyanskiy, A. Shafronenko, S. Mashtalir, Online robust fuzzy clustering of data with omissions using similarity measure of special type. Lecture Notes in Computational Intelligence and Decision Making. Cham: Springer, 2020, pp. 637-646.
[24] A. Shafronenko, Ye. Bodyanskiy, I. Pliss, K. Patlan, Fuzzy clusterization of distorted by missing observations data sets using evolutionary optimization. Proceedings "Advanced Computer Information Technologies (ACIT'2019)", Česke Budejovice, Czech Republic, June 5-7, 2019, pp. 217-220. DOI: 10.1109/ACITT.2019.8779888.
[25] A.Yu. Shafronenko, Ye.V. Bodyanskiy, I.P. Pliss, "The fast modification of evolutionary bioinspired cat swarm optimization method". 2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL), Sozopol, Bulgaria, 6-8 Sept. 2019, pp. 548-552. DOI: 10.1109/CAOL46282.2019.9019583.
[26] S.K. Saha, R. Kar, D. Mandal, S.P. Ghoshal, IIR filter design with craziness based particle swarm optimization technique. World Academy of Science, Engineering and Technology, 2011, pp. 1628-1635.
[27] A. Sarangi, S.K. Sarangi, M. Mukherjee, S.P. Panigrahi, System identification by crazy-cat swarm optimization. Proc. Int. Conf. on Microwave, Optical and Communication Engineering, Bhubaneswar, India, 2015, pp. 439-442.