=Paper=
{{Paper
|id=Vol-3611/paper3
|storemode=property
|title=Grey wolf optimizer combined with k-nn algorithm for clustering problem
|pdfUrl=https://ceur-ws.org/Vol-3611/paper3.pdf
|volume=Vol-3611
|authors=Katarzyna Prokop
|dblpUrl=https://dblp.org/rec/conf/ivus/Prokop22
}}
==Grey wolf optimizer combined with k-nn algorithm for clustering problem==
Katarzyna Prokop
Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44100 Gliwice, Poland

Abstract: The clustering problem is an important task in machine learning. Clustering algorithms allow for the division of a set into individual clusters on the basis of a specific measure. Such an approach is used for unlabeled data, where a label must be assigned automatically. One of the most popular algorithms is k-nearest neighbors. In this paper, we propose a modification of this algorithm by combining it with a heuristic, the grey wolf optimizer. The idea assumes that the samples are understood as individuals of the heuristic, and the object of unknown class as the victim they hunt; the heuristic operation is then used for analyzing the set. The proposition is described in terms of the original algorithms and their proposed hybridization, it was tested on the Iris Flower Dataset, and the obtained results are discussed in terms of their advantages.

IVUS 2022: 27th International Conference on Information Technology. Contact: ktn.prokop@gmail.com (K. Prokop). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

Machine learning algorithms are known as data-hungry: many of them need a large number of samples to fit and train the model. However, in many cases the collected data are not labeled and cannot be used in a supervised training process. Therefore, clustering methods can be used to split them into classes or clusters. An example of clustering visual features was presented in [1]. Another solution is modifying the k-means algorithm by introducing dynamically changed conditions [2, 3]. Moreover, different approaches to clustering are modeled, as can be seen in the example of deep spectral clustering that uses an auto-encoder network [4].

The optimization task is also important in the area of machine learning. Therefore, many newer algorithms are modeled as alternative and accurate approaches [5]. Apart from new models, hybridizations of them are introduced; one such example is a cooperative scheme of many such algorithms [6]. The application of these algorithms shows that it is a promising approach and can help in different areas of artificial intelligence. For instance, meta-heuristic algorithms were used in the federated training process of convolutional neural networks [7, 8]. They were also combined to create a neuro-swarm heuristic for the dynamics of COVID-19 [9]. An interesting solution was to use nature-inspired algorithms in image analysis [10, 11]. Heuristic algorithms were used for motion planning of aircraft [12] and for other engineering problems. In most cases, engineering problems can be presented as an optimization task, where the best coefficients must be found to reach the best results [13]. Besides hybridization and optimization, these algorithms are also used for feature extraction [14]; the extracted information can then be used for describing an object and for further classification.

In this paper, a hybridization of k-nn and a selected heuristic algorithm is proposed. It is an alternative way that indicates that these two solutions can be combined and result in good accuracy. For the research the Grey Wolf Optimizer was chosen. This is a fairly young method [5] that uses the hierarchy of units in the herd. This heuristic algorithm shows competitive results compared to other known metaheuristics on the function optimization problem. The Grey Wolf Optimizer can be successfully used, for example, in industry [15] or in smart home solutions [16].

2. Methodology

The main idea is to combine the operation of the k Nearest Neighbors classifier with the Grey Wolf Optimizer and test the effectiveness of the resulting method.
2.1. k Nearest Neighbors Algorithm

The k Nearest Neighbors classifier (k-nn) is an example of an algorithm that finds the k items most closely related to the one under consideration, which enables classifying this object. The algorithm is used for clustering, interpolation and even classification tasks [17, 18]. It determines the similarity between two objects using a selected measure. Based on the results, a group of the items with the least difference is created; this set contains the eponymous k elements. Objects resembling the considered item are called "neighbors". By the "voice of majority" they are responsible for assigning the tested object to the appropriate class, which means that the obtained class is the most frequently appearing label among the neighbors.

The algorithm is also used in clustering problems: by applying k-nn it is possible to create groups of elements with similar features. There are many different ways of dividing the same dataset, so the method and its variable elements can be customized for this purpose.

Records in a given database and the tested elements can be treated as vectors; then the similarity between them may be calculated by a distance function. In this paper the Euclidean metric and the Manhattan metric were applied. Assuming a and b are two records being compared, where each of them consists of m attributes, the following vectors are obtained:

a = [a_1, ..., a_m],
b = [b_1, ..., b_m],

which means that the attributes should take numerical values. The distance between a and b for the Euclidean metric is defined as

d(a, b) = \sqrt{ \sum_{i=1}^{m} (a_i - b_i)^2 }.   (1)

For the Manhattan metric, the distance function can be described by

d(a, b) = \sum_{i=1}^{m} | a_i - b_i |.   (2)

Let us mark every database element with m attributes as x_i = [x_{i1}, ..., x_{im}], i = 1, ..., n. Thus n is the number of all records in the dataset; obviously, the number k is less than or equal to n. The tested item can be presented as y = [y_1, ..., y_m]. With a selected distance function d, the method is summarized in pseudocode 1.

Algorithm 1: k Nearest Neighbors Algorithm.
<pre>
Input: dataset with n vectors x_i, unclassified vector y, neighbors number k
Output: y group
for each record x_i, i = 1, ..., n do
    calculate the distance from y to x_i using measure d;
end
sort the records in ascending order of the calculated distances;
for the first k sorted records x_i do
    note the group assigned to x_i;
end
select the most popular group;
return it as the group of y;
</pre>
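To make the procedure concrete, a minimal Python sketch of Algorithm 1 is given below. The names knn_classify, dataset and labels are illustrative (they do not come from the paper), and the distance argument can be either of the two measures defined in equations (1) and (2).

<pre>
import math
from collections import Counter

def euclidean(a, b):
    # Equation (1): square root of the sum of squared attribute differences.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    # Equation (2): sum of absolute attribute differences.
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def knn_classify(dataset, labels, y, k, distance=euclidean):
    """Return the most popular label among the k records closest to y."""
    # Distance from y to every record x_i in the dataset.
    distances = [(distance(x, y), label) for x, label in zip(dataset, labels)]
    # Sort ascending by distance and keep the k nearest neighbors.
    neighbors = sorted(distances, key=lambda pair: pair[0])[:k]
    # "Voice of majority": the most frequent label among the neighbors.
    return Counter(label for _, label in neighbors).most_common(1)[0][0]
</pre>

For instance, knn_classify(data, labels, y=[5.0, 3.4, 1.5, 0.2], k=3, distance=manhattan) would classify a single flower with the Manhattan metric.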
2.2. Grey Wolf Optimizer

Besides the classifier, the method proposed in this paper also uses an optimizer; in detail, the Grey Wolf Optimizer was applied. This algorithm is an example of a heuristic optimization method, which means that its aim is to find an approximate solution to a given problem without a guarantee of its correctness. Heuristics are useful when classic methods have a high resource cost or high computational complexity.

The Grey Wolf Optimizer was developed in 2014 [5] based on the behavior of a pack of wolves. Wolves are predators and live in herds in which a hierarchy occurs. Every pack is led by a leader, the so-called alpha male. This individual is responsible for launching attacks, is the strongest wolf in the pack and initiates all of the pack's actions; it is selected from the herd by victories in direct battles with other wolves. An important role in the pack is also played by the second strongest wolf, which complements the leader and takes command of the herd when the leader is indisposed. Two further groups of wolves are distinguished: the third level of the hierarchy consists of individuals who are doing fairly well, and the last group consists of old and sick wolves. The tasks undertaken by the pack mainly include searching for food, i.e. hunting mammals. In nature, wolves hunt in various configurations: alone, in pairs, or as a whole pack.

The Grey Wolf Optimizer uses the group hunting strategy. The levels of the wolf hierarchy are represented by the symbols α (the leading male wolf), β (the second strongest wolf), δ (the third level of the hierarchy) and ω (old and sick wolves). In particular, the first three levels have an impact on the operation of the algorithm. Firstly, a pack consisting of a fixed number of wolves is initiated. A wolf is treated like a vector

x = [x_1, ..., x_m],

the values of which determine the wolf's location. The number m defines the dimension of the given problem, that is the number of variables of the function f(·) to be optimized. In the initial pack, the coordinates x_1, ..., x_m are drawn from a given interval. Next, the three strongest individuals are selected: α, β and δ (the best wolf of the third level of the hierarchy). This is done by comparing the values of the function f(·) for all individuals in the herd. When the hierarchy is established, the wolves move around in relation to the victim they are hunting. Since α always leads the hunt, followed by β and δ, the positions of the other wolves depend on the movements of the strongest individuals (because they are closest to the victim).

Assuming the algorithm is in the t-th time step, the location of a wolf x in the next moment is defined by equation (3):

x^{t+1} = \frac{X_A + X_B + X_D}{3},   (3)

where

X_A = x_\alpha - A^t \cdot D_\alpha,   (4)
X_B = x_\beta - A^t \cdot D_\beta,   (5)
X_D = x_\delta - A^t \cdot D_\delta.   (6)

X_A, X_B and X_D are coefficients depending on the current positions of the best wolves (denoted x_\alpha, x_\beta, x_\delta), and A^t is a parameter updated in each iteration t:

A^t = 2 \cdot a^t \cdot r,   (7)

where r is a random value from the range [0, 1]. The value a^t depends on the interval [a_min, a_max] set at the beginning and is calculated for each moment t = 0, 1, ..., t_max according to the formula

a^t = a_{max} - \frac{a_{max} - a_{min}}{t_{max}} \cdot t.   (8)

Usually it is assumed that the value of a^t decreases from 2 to 0 [15]; then a_min = 0 and a_max = 2.
The values D_\alpha, D_\beta, D_\delta evaluate the distance from the given individual x to the best adapted wolves:

D_\alpha = | C^t \cdot x_\alpha - x |,   (9)
D_\beta = | C^t \cdot x_\beta - x |,   (10)
D_\delta = | C^t \cdot x_\delta - x |,   (11)

where

C^t = 2 \cdot r,   (12)

which is recalculated in each iteration t; here r is again a random value in the range [0, 1].

The above operations are performed a certain number of times (t_max) to finally select the best-adapted wolf α, the one closest to the victim, i.e. the wanted solution. The scheme of the algorithm is presented in pseudocode 2.

Algorithm 2: Grey Wolf Optimizer.
<pre>
Input: wolves number w, iterations number t_max, range of the arguments, range of a
Output: individual α
generate the initial pack of w individuals;
for t := 0, 1, ..., t_max - 1 do
    calculate a^t according to equation (8);
    calculate A^t according to equation (7);
    calculate C^t according to equation (12);
    find the best individuals α, β, δ;
    for each wolf do
        calculate the distances D_α, D_β, D_δ from the best individuals
            according to equations (9), (10), (11);
        calculate the coefficients X_A, X_B, X_D according to equations (4), (5), (6);
        update the wolf's location according to equation (3);
    end
end
find the best individuals α, β, δ;
return α;
</pre>
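For readers who prefer code, the following minimal Python sketch implements the loop of pseudocode 2 under the usual assumptions a_min = 0 and a_max = 2. All identifiers (grey_wolf_optimizer, fitness, bounds, n_wolves) are illustrative and not taken from the paper; note that, following the pseudocode above, A^t and C^t are drawn once per iteration, whereas the original formulation in [5] redraws them for every wolf and dimension.

<pre>
import random

def grey_wolf_optimizer(fitness, dim, bounds, n_wolves=20, t_max=100,
                        a_min=0.0, a_max=2.0):
    """Minimize fitness(x) over dim-dimensional vectors; return the alpha wolf."""
    low, high = bounds
    # Initial pack: coordinates drawn from the given interval.
    pack = [[random.uniform(low, high) for _ in range(dim)]
            for _ in range(n_wolves)]

    for t in range(t_max):
        a_t = a_max - ((a_max - a_min) / t_max) * t   # equation (8)
        A = 2 * a_t * random.random()                 # equation (7)
        C = 2 * random.random()                       # equation (12)
        # Hierarchy: alpha, beta, delta are the three best-adapted wolves.
        alpha, beta, delta = sorted(pack, key=fitness)[:3]

        for i, wolf in enumerate(pack):
            new_position = []
            for j in range(dim):
                # Equations (9)-(11) and (4)-(6) for each of the three leaders.
                moves = [leader[j] - A * abs(C * leader[j] - wolf[j])
                         for leader in (alpha, beta, delta)]
                new_position.append(sum(moves) / 3)   # equation (3)
            pack[i] = new_position

    return min(pack, key=fitness)
</pre>

As a usage example, grey_wolf_optimizer(lambda x: sum(v * v for v in x), dim=2, bounds=(-5, 5)) searches for the minimum of a simple quadratic function.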
2.3. Hybridization

Let y = [y_1, ..., y_m] be an m-dimensional vector of unknown class, as in the k Nearest Neighbors classifier model. Then y is identified with the victim that the wolves hunt, while the pack consists of the records of the database with m attributes, whose values are stored by the wolves. Therefore, the population consists of units of the following form:

x_i = [x_{i1}, ..., x_{im}],   (13)

where the i-th unit (wolf) stores the i-th m-dimensional record from the specified database. Naturally, the pack has the same size n as the considered dataset. Next, the strongest individuals α, β, δ have to be selected. To establish the hierarchy, the values of the function f(·) for the individual wolves are compared; here f(·) corresponds to the Euclidean distance function (1) or to the distance in the Manhattan metric (2). The strongest units are the wolves currently closest to the victim. When the hierarchy in the herd is established, the positions of the wolves are updated. Being in the t-th time step, the location of a wolf in the next moment is described by formula (3), which uses the coefficients X_A, X_B, X_D given by equations (4), (5), (6). Each of these coefficients is the difference between the location of one of the best wolves (α, β or δ) and the product of the parameter A^t defined by formula (7) with the corresponding D coefficient (D_α, D_β or D_δ, described by equations (9), (10), (11)). The parameter A^t changes in each iteration due to the variability of the parameter a^t (equation (8)). Additionally, the values of the D coefficients are influenced by the changing parameter C^t defined by formula (12).

The above steps are repeated t_max times, and then the best wolf α is selected from the pack. It is the record that after t_max iterations lies at the shortest distance from the considered vector y out of the entire pack. The search for such an individual is performed k times in total, in order to identify the k closest neighbors of the k Nearest Neighbors algorithm. Finally, according to the concept of the k Nearest Neighbors algorithm, the appropriate class for the vector y is selected based on the occurrence of the individual classes among the neighboring records. Pseudocode 3 shows the structure of solving the classification problem using the Grey Wolf Optimizer.

Algorithm 3: Combination of k-nn with Grey Wolf Optimizer.
<pre>
Input: dataset of n records x_i, unclassified vector y, nearest neighbors number k,
       iterations number t_max, range of a
Output: y class
generate the initial pack from the n records;
create an array classes of length k;
for j := 0, 1, ..., k - 1 do
    for t := 0, 1, ..., t_max - 1 do
        calculate a^t according to formula (8);
        calculate A^t according to formula (7);
        calculate C^t according to formula (12);
        find the best individuals α, β, δ;
        for each wolf do
            calculate the distances D_α, D_β, D_δ from the best individuals
                according to equations (9), (10), (11);
            calculate the coefficients X_A, X_B, X_D according to equations (4), (5), (6);
            update the wolf's location according to equation (3);
        end
    end
    find the best individuals α, β, δ;
    assign the class of individual α to classes[j];
end
choose the most frequently repeated class in the array classes;
return it as the y class;
</pre>
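The hybridization of pseudocode 3 can be sketched in the same style. The sketch below is only an illustration of the idea: it assumes that each wolf keeps the class label of the record it was initialized from, that the pack is re-initialized from the database records before each of the k optimizer runs, and that the default f(·) is the Euclidean distance (1); the paper does not fix these implementation details, and the function name gwo_knn_classify is invented here.

<pre>
import math
import random
from collections import Counter

def gwo_knn_classify(dataset, labels, y, k, t_max=25, a_min=0.0, a_max=2.0,
                     distance=None):
    """Hybrid k-nn / Grey Wolf Optimizer classifier (sketch of pseudocode 3)."""
    if distance is None:
        # Default f(.): Euclidean distance (1) between a wolf and the victim y.
        distance = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    fitness = lambda wolf: distance(wolf, y)
    neighbor_classes = []

    for _ in range(k):  # one optimizer run per neighbor
        # Pack initialized from the database records; every wolf remembers
        # the class label of its source record (assumption, see text).
        pack = [(list(record), label) for record, label in zip(dataset, labels)]

        for t in range(t_max):
            a_t = a_max - ((a_max - a_min) / t_max) * t              # equation (8)
            A = 2 * a_t * random.random()                            # equation (7)
            C = 2 * random.random()                                  # equation (12)
            leaders = sorted(pack, key=lambda w: fitness(w[0]))[:3]  # alpha, beta, delta

            for i, (wolf, label) in enumerate(pack):
                new_position = []
                for j in range(len(wolf)):
                    moves = [leader[j] - A * abs(C * leader[j] - wolf[j])
                             for leader, _ in leaders]               # eqs (9)-(11), (4)-(6)
                    new_position.append(sum(moves) / 3)              # equation (3)
                pack[i] = (new_position, label)

        alpha = min(pack, key=lambda w: fitness(w[0]))  # best wolf after t_max steps
        neighbor_classes.append(alpha[1])

    # Majority vote over the classes of the k found "neighbors".
    return Counter(neighbor_classes).most_common(1)[0][0]
</pre>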
3. Experiments

The hybrid method using the Grey Wolf Optimizer and k-nn was tested in the process of matching a class to an object from a database. The database of iris flowers was used for the experiments. The author of this dataset, the British biologist and statistician Ronald Fisher, shared the data in 1936 [19]. The iris flowers dataset consists of 150 records describing the appearance of these plants. Every record stores information about 5 attributes: the length of the sepal, the width of the sepal, the length of the petal and its width, and also the name of the species. Thus the first four characteristics are expressed by numerical values and the fifth one constitutes a class label given in text form. Three species of iris are included in the collection: Iris setosa, Iris virginica and Iris versicolor.

In order to test the designed method, a program was implemented. For 20% of all records it was checked whether the appropriate class was matched. For the calculations, the program retrieved the first four attributes of each record, omitting the class labels; the labels were stored for later comparison of the obtained results with the actual state. As described earlier, every record was treated as a vector. To determine the effectiveness of the method, the following coefficient eff was defined:

eff = \frac{t_a}{t_t} \cdot 100\%,   (14)

where t_a is the number of correct matches and t_t is the number of all tested records. The final result was rounded to two decimal places.

The program was tested for four variants of the number of iterations. The range of a in all cases was assumed to be [0; 2]. For each option the effectiveness of the method was checked for the Euclidean distance function (1) and for the distance in the Manhattan metric (2), by launching the program five times in both cases and calculating the arithmetic mean of the obtained results. These operations were performed for 15 different values of the parameter k: k = 1, ..., 15.

The first variant of the number of iterations was t_max = 100. For the Euclidean metric, the highest arithmetic mean of the eff value was obtained for k = 2 and reached 71.33%; this value did not fall below 34.00%. In the case of the Manhattan distance, the arithmetic means of the eff coefficient took values between 26.00% and 44.00%. Detailed results are given in Table 1.

Table 1: Arithmetic mean of the eff coefficient for t_max = 100, a ∈ [0; 2].
<pre>
 k    Euclidean distance    Manhattan distance
 1         55.33%                31.33%
 2         71.33%                33.33%
 3         62.00%                43.33%
 4         44.00%                34.67%
 5         34.00%                37.33%
 6         41.33%                28.67%
 7         64.67%                38.67%
 8         70.67%                28.67%
 9         47.33%                34.66%
10         53.33%                35.33%
11         65.33%                27.33%
12         53.33%                41.33%
13         58.00%                44.00%
14         58.67%                30.67%
15         48.00%                26.00%
</pre>

In the second test, the number of iterations was reduced to 25; the obtained results are presented in Table 2. It turned out that for the Euclidean distance a significant improvement of effectiveness was achieved regardless of the value of k: the arithmetic mean of eff was greater than or equal to 90.00%, which means that nearly all of the tested records were correctly classified. For the Manhattan distance, the results were similar to the first test, with arithmetic means between 27.33% and 42.67%.

Table 2: Arithmetic mean of the eff coefficient for t_max = 25, a ∈ [0; 2].
<pre>
 k    Euclidean distance    Manhattan distance
 1         96.67%                37.33%
 2         93.33%                42.67%
 3         91.33%                37.33%
 4         94.67%                33.33%
 5         92.67%                38.66%
 6         91.33%                32.67%
 7         94.67%                36.67%
 8         90.00%                27.33%
 9         92.00%                42.67%
10         92.00%                37.33%
11         94.67%                37.33%
12         94.67%                37.33%
13         92.00%                40.00%
14         97.33%                35.33%
15         93.33%                38.00%
</pre>

Another test was performed for 50 iterations. As in the previous step, the results for the Manhattan metric did not improve noticeably: the highest arithmetic mean, 38.67%, was observed for k = 4, and 30.67% is the lowest obtained value. For the Euclidean metric the results are not as good as in the previous test, but many more records were correctly classified than for 100 iterations; for example, for k = 4 the mean was 72.00% in this variant (t_max = 50) versus 44.00% when t_max = 100. The remaining values are presented in Table 3.

Table 3: Arithmetic mean of the eff coefficient for t_max = 50, a ∈ [0; 2].
<pre>
 k    Euclidean distance    Manhattan distance
 1         65.33%                35.33%
 2         61.33%                34.67%
 3         70.67%                30.67%
 4         72.00%                38.67%
 5         59.33%                36.00%
 6         71.33%                36.00%
 7         62.66%                34.00%
 8         67.33%                36.66%
 9         70.00%                31.33%
10         64.00%                30.67%
11         66.67%                34.67%
12         67.33%                36.00%
13         62.67%                36.00%
14         72.00%                34.00%
15         60.00%                30.67%
</pre>

The last test assumed t_max = 1000, so the number of iterations was significantly increased. Table 4 presents the arithmetic mean of the eff coefficient in this variant. Again, the distance function for the Manhattan metric brought results similar to the other tests, and no value of the parameter k gives results markedly better than the others. If the Euclidean metric is applied, the results depend on the value of k: for example, when k = 7 the arithmetic mean of eff equals 30.67%, which is the lowest result, while for k = 11 it is 71.33%. Thus the range of these results is quite big, almost 40 percentage points.

Table 4: Arithmetic mean of the eff coefficient for t_max = 1000, a ∈ [0; 2].
<pre>
 k    Euclidean distance    Manhattan distance
 1         36.67%                38.67%
 2         32.00%                37.33%
 3         38.00%                30.67%
 4         42.67%                35.33%
 5         42.00%                32.00%
 6         47.33%                33.33%
 7         30.67%                39.33%
 8         34.00%                26.67%
 9         57.33%                29.34%
10         44.67%                31.33%
11         71.33%                30.00%
12         61.33%                33.33%
13         60.67%                33.33%
14         56.00%                34.00%
15         40.66%                33.33%
</pre>
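A possible shape of the experimental loop is sketched below. Loading the data through scikit-learn and drawing the 20% test split with train_test_split are assumptions made purely for illustration (the paper does not state which tooling was used), and gwo_knn_classify refers to the hybrid sketch given above.

<pre>
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def run_experiment(k, t_max, runs=5):
    """Average eff (equation (14)) over several runs for a given k and t_max."""
    iris = load_iris()
    effs = []
    for _ in range(runs):
        # 20% of the records are used as the tested objects.
        x_train, x_test, y_train, y_test = train_test_split(
            iris.data.tolist(), iris.target.tolist(), test_size=0.2)
        correct = sum(
            gwo_knn_classify(x_train, y_train, y, k, t_max=t_max) == label
            for y, label in zip(x_test, y_test))
        effs.append(100.0 * correct / len(x_test))   # eff = t_a / t_t * 100%
    return round(sum(effs) / len(effs), 2)

# Example: the configuration that worked best in the paper's tests.
# print(run_experiment(k=2, t_max=25))
</pre>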
4. Conclusions

The effectiveness of the method obtained from the combination of k-nn with the Grey Wolf Optimizer depends on the established features. Using this method with appropriate input parameters can bring satisfactory results, for example as in the test with t_max = 25 and the Euclidean metric. Comparing the other values, it can be observed that, for the problem described in this paper, the distance function for the Manhattan metric is not very effective; the correctness of matches using this metric is low. On the other hand, in some cases the value of the parameter k also has an influence on the results, as the fourth performed test shows. In connection with the above, this hybrid method can be useful under appropriate assumptions; however, it requires more experimentation for specific cases.

References

[1] M. Caron, P. Bojanowski, A. Joulin, M. Douze, Deep clustering for unsupervised learning of visual features, in: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 132–149.
[2] M. Z. Hossain, M. N. Akhtar, R. B. Ahmad, M. Rahman, A dynamic k-means clustering for data mining, Indonesian Journal of Electrical Engineering and Computer Science 13 (2019) 521–526.
[3] K. P. Sinaga, M.-S. Yang, Unsupervised k-means clustering algorithm, IEEE Access 8 (2020) 80716–80727.
[4] X. Yang, C. Deng, F. Zheng, J. Yan, W. Liu, Deep spectral clustering using dual autoencoder network, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 4066–4075.
[5] S. Mirjalili, S. M. Mirjalili, A. Lewis, Grey wolf optimizer, Advances in Engineering Software 69 (2014) 46–61.
[6] M. Abd Elaziz, A. A. Ewees, N. Neggaz, R. A. Ibrahim, M. A. Al-qaness, S. Lu, Cooperative meta-heuristic algorithms for global optimization problems, Expert Systems with Applications 176 (2021) 114788.
[7] D. Połap, M. Woźniak, A hybridization of distributed policy and heuristic augmentation for improving federated learning approach, Neural Networks 146 (2022) 130–140.
[8] D. Połap, M. Woźniak, Meta-heuristic as manager in federated learning approaches for image processing purposes, Applied Soft Computing 113 (2021) 107872.
[9] M. Umar, Z. Sabir, M. A. Z. Raja, F. Amin, T. Saeed, Y. Guerrero-Sanchez, Integrated neuro-swarm heuristic with interior-point for nonlinear SITR model for dynamics of novel COVID-19, Alexandria Engineering Journal 60 (2021) 2811–2824.
[10] D. Połap, M. Woźniak, R. Damaševičius, R. Maskeliūnas, Bio-inspired voice evaluation mechanism, Applied Soft Computing 80 (2019) 342–357.
[11] D. Połap, N. Wawrzyniak, M. Włodarczyk-Sielicka, Side-scan sonar analysis using ROI analysis and deep neural networks, IEEE Transactions on Geoscience and Remote Sensing (2022).
[12] Y. Wu, A survey on population-based meta-heuristic algorithms for motion planning of aircraft, Swarm and Evolutionary Computation 62 (2021) 100844.
[13] G. Dhiman, SSC: A hybrid nature-inspired meta-heuristic optimization algorithm for engineering applications, Knowledge-Based Systems 222 (2021) 106926.
[14] M. Sharma, P. Kaur, A comprehensive analysis of nature-inspired meta-heuristic techniques for feature selection problem, Archives of Computational Methods in Engineering 28 (2021) 1103–1127.
[15] Ł. Knypiński, L. Nowak, Zastosowanie algorytmu szarych wilków do rozwiązania zadań optymalizacji urządzeń elektromagnetycznych [Application of the grey wolf algorithm to solving optimization problems of electromagnetic devices], Poznan University of Technology Academic Journals: Electrical Engineering (2019).
[16] S. N. Makhadmeh, A. T. Khader, M. A. Al-Betar, S. Naim, An optimal power scheduling for smart home appliances with smart battery using grey wolf optimizer, in: 2018 8th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), IEEE, 2018, pp. 76–81.
[17] M. Włodarczyk-Sielicka, N. Wawrzyniak, Problem of bathymetric big data interpolation for inland mobile navigation system, in: International Conference on Information and Software Technologies, Springer, 2017, pp. 611–621.
[18] D. Zhao, X. Hu, S. Xiong, J. Tian, J. Xiang, J. Zhou, H. Li, K-means clustering and KNN classification based on negative databases, Applied Soft Computing 110 (2021) 107732.
[19] R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics 7 (1936) 179–188.