Gradient Modification for Cuckoo Search Algorithm to Increase Accuracy Karolina K˛esik Institute of Mathematics Silesian University of Technology Kaszubska 23, 44-100 Gliwice, Poland Email: karola.ksk@gmail.com Abstract—Optimization problems are used in different areas Important features describing different lungs diseases on the of our life. Very often the functions that are optimized are so X-RAY images were detected by heuristic [17], and it were complex that classic methods can not deal with them. The solution used as support technique for circle detection which allow to for it are heuristic techniques that do not guarantee the correct solution in a finite time. In this work, the classic version of detect phases of bacterial life [16]. Cuckoo Search Algorithm and proposed modifications that allow to increase the precision of obtained solutions are described. The II. P ROBLEM FORMULATION proposal was presented and tested on nine different test functions, The optimization problem is understood as searching for and the obtained solutions were discussed in terms of advantages a point whose function value reaches the global extreme, and disadvantages. more precisely the minimum or maximum. The choice of I. I NTRODUCTION maximization or minimization is insignificant because in the case of a function having a maximum, it is enough to negate More and more often, contemporary problems are described the function. The same applies to the opposite case. To by mathematical equations, which makes it possible to min- define the problem formally, let us introduce some signs. imize variables in such a way that the cost of production is For a continuous function f (x) : Rn → R, the point as low as possible and the highest quality. The large space of x = (x1 , x2 , . . . , xn ) will be a global extreme if the following solutions, the number of variables or even the landscape of condition is satisfied functions can be a big challenge for computers. For this type of tasks, besides the classic methods of finding the best values min f (x) or max f (x). (1) x x of the functions are heuristic algorithms. These techniques are inspired by a certain action of physics or nature. Optimization functions are usually n-dimensional, for which Mathematical models describing the action of nature are the extreme is hard to locate. Moreover, classic and numerical created all the time, which is evident in numerous scientific methods mostly will not be able to do this. They are called works. In [14], the grasshopper behavior has been described as artificial landscape because in three dimensions they resemble an optimization algorithm. Again in [10], the authors modeled familiar shapes from nature or physics. A set of such a the way of polar bears survive in arctic condition through the functions is presented in Table I, where are defined the most way they move on ice floes and hunt for seals. are tested not known and representing the basic shapes like bowl-shaped, only by the optimization problem, but also by the graph. In plate-shaped, valley-shaped, steep ridges and others. [2], the authors presented binary monarch butterfly algorithm To find the best solution, it is necessary to use heuristic and it was used to solve knapsack problem. Again in [4], [8], algorithm. It is a technique inspired by the nature, which nature inspired algorithm were used as a solver in vehicle does not guarantee an accurate solution in a finite time (only routing problem. approximate). In the case of practical applications, such techniques have III. C UCKOO S EARCH A LGORITHM been used primarily in decision-making systems [19]. In [9], [11], it was used in feature extraction process from images In this section, the classic model has been described along and sound file for the purpose of voice/image recognition. In with the proposed modifications. [3], the heuristics allowed for the construction of commercial A. Classic version microgrids depending on the given assumptions. In a similar way it was used in the assessment of creditworthiness [6] or Cuckoo Search Algorithm (CSA) was presented in [18] economic security [5]. where the author described heuristic technique inspired by Heuristic approach has also found use in location-based the cuckoos behavior while tossing their eggs to the nests of social networks [7] or finding outlying points in diabetes data other birds. For obvious reasons, the mathematical model had sets [1]. Another important issue is medicine and biomedicine. to be simplified in order to reduce the number of unknown parameters as well as to minimize the number of calculations. Copyright held by the author(s). To this end, several simplifications have been introduced 24 Table I: Selected test function for described optimization problem. Name  v Function  Input domain x Global minimum n n u X ! X 2 u1 1 Ackley −20 −0.2t xi  − exp n n cos(2πxi ) h−32.8, 32.8i (0, . . . , 0) 0 i=1 i=1 n n x2i   X Y xi Griewank − cos √ +1 h−600, 600i (0, . . . , 0) 0 i=1 4000 i=1 i Xn x2i − 10 cos(2πxi )  Rastrigin 10n + h−5.12, 5.12i (0, . . . , 0) 0 i=1 n X p Schwefel 418.9829n − xi sin( |xi |) h−500, 500i (0, . . . , 0) 0 i=1 n X X i Rotated hyper-ellipsoid x2j h−65.5, 65.5i (0, . . . , 0) 0 i=1 j=1 Xn Sum squares ix2i h−5.12, 5.12i (0, . . . , 0) 0 i=1 !2 !4 Xn n X n X Zakharov x2i + 0.5ixi + 0.5ixi h−5, 10i (0, . . . , 0) 0 i=1 i=1 i=1 n−1 X Rosenbrock 100(xi+1 − x2i )2 + (xi − 1)2 h−5, 10i (0, . . . , 0) 0 i=1 Xn Styblinski-Tang 1 2 x4i − 16x2i + 5xi h−5, 5i (−2.903534, . . . , −2.903534) −39.16599n i=1 • each cuckoo is interpreted as a point x = (x0 , . . . , xn−1 ) where β is a random value in range h0, 1i. In most cases, pa = in n-dimensions, 0.5 what gives the equal likelihood of staying and throwing • the nature environment is understood as input domain the egg out of the nest. xi ∈ ha, bi, • one cuckoo can drop only one single egg in one iteration, B. Modified version • egg is identified with the cuckoo, Proposition is based on modification host decision to prevent • number of cuckoos in population is constant, the remove eggs that represents good solutions for a given • the host can leave or throw away an egg with a probability problem. Moreover, every move that cuckoo does is corrected pa ∈ h0, 1i. In the case of getting rid of the egg, the by using a local search method. cuckoo is forced to find a new place. 1) Gradient method: One of the classic local search tech- The above-mentioned assumptions cause that these birds has nique is gradient descent. It is based on the analysis direction a specific environment. Each cuckoo moves in each iteration in of search space by the calculating the negative gradient of search of a better nest for its offspring. The flight is carried out function f . Let assume, that the algorithm starts at position in accordance with the Levy’s equations (often called Levy’s x = (x0 , . . . , xn−1 ). Then, the negative gradient for each flight), which is continuous probability distribution defined as spatial coordinates is calculated as  c  ∂f (x0 , . . . , xn−1 ) exp −∇fxi = − , (5) r c 2(x − µ) ∂xi L(xi , µ, c) = · 3 , (2) what means the direction of the fastest descent. Then, the value 2π (x − µ) 2 of xi is recalculated as where µ is location coefficient and c is scale parameter. Using Levy’s equation, we can move birds on each spatial xti = xti + λ(−∇fxti ), (6) coordinates as where λ is the length of step and t is the current iteration for xt+1 = xti + α · L(xti , µ, c), (3) these method. After T iterations, if function value in point xT i is better then before local search, it is replaced. where α is the length of Levy’s flight, t is the current iteration. 2) Proposed modification: A cuckoo moving in search of a After the move of cuckoo, the egg is subjected to a certain nest goes to one point and then the host’s decision follows. Let test depending on whether it will remain in the nest. In case us assume that the egg-laying cuckoo looks for such a place in the host notices that he has not his egg, he can throw it away. the nest to minimize the discovery of a new egg. In practice, The probability of discovery is pa , and it’s modeled as it can be realized through the local search like method of the ( smallest gradient described in Algorithm 1. t+1 β < pa drop the egg The second modification is remodeling the host’s deci- H(xi ) = , (4) β ≥ pa leave the egg sion. In the original version depends on the randomness 25 Algorithm 1 Gradient Descent Calculation were made using two sets of basic parameters 1: Start, which are (the size of population, the number of iterations) and 2: Define f (·), length of the step λ, number of iteration T , were set as (50, 50) and (100, 100). In addition, the algorithm 3: Take point x, parameter setting was as follows pa = 0.7, c = 0.4, µ = 0.3 4: t := 1, and the step size for gradient descent was 15. 5: while t ≤ T do All presented results are the average value from 100 tests 6: Calculate xt+1 i using (6), calculated as 7: if f (xt+1 i ) ≤ f (xt i ) then 99 8: t t+1 xi = xi , 1 X f ({x0 , . . . , x99 }) = f (xi ). (8) 9: end if 100 i=0 10: t + +, 11: end while The obtained averaged solutions are presented in II. From 12: Return xti , these table, it is clearly to see that for each function, the 13: Stop. accuracy is at least slightly better. In the case of function with many local extremes like Rastrigin or Schwefel, the solution is better more then 10% which is a good result considering of parameter β. This type solution allows to remove well- the possibility of getting stuck. Especially when the algorithm matched egg in the nest. It is possible to delete the best uses local search and this can increase accuracy, but only by solution in the whole population, and it leads to a jump in finding a better point in a given part of the function. the convergence. Removing the worse solutions by finding Not only the accuracy was measured, but also average the opportunity to the location a better one is justified and trajectory in each iteration defined as therefore remodeling the decision condition is so important 99 1 X – the best solutions should remain, and only the worse ones xideal − xjbest , (9) should be removed. Moreover, the decision must depend on 100 j=0 the optimization function. In addition, it would be good if the where xideal is the exactly solution for a given function and decision depended also on the quality of the current point with xjbest is the best point in j-th test, and || · || is an Euclidean respect to the best in whole population. An example of such metric defined as the distance between two given points p and a model is presented by the following equation q in n-dimensions f (xtbest )  < pa drop the egg   v u n f (xtcurrent )  H(xt+1 ) = , (7) uX i t f (xbest ) ||p − q|| = t (pi − qi )2 . (10) ≥ p leave the egg  a  i=0 f (xtcurrent )  where xtbest is the best solution in current iteration in whole And the last measurement coefficient is rate of convergence population. in each iteration from the first to the t − 1 calculated as |f (xk+1 ) − f (xideal )| Algorithm 2 Modified Cuckoo Search Algorithm , (11) |f (xk ) − f (xideal )| 1: Start, 2: Define pα , c, µ, number of individuals N and number of where k is the number of the given iteration. All of these iteration T , values were measured during tests for a population of 100 3: Define test function f (·), individuals and 100 iterations. For each iteration, data was 4: Generate N cuckoos at random, saved, then averaged and coefficients calculated in accordance 5: t:=0, with the above formulas. The illustration of the obtained 6: while t ≤ T do results are shown in Fig. 1 (for classic version) and in Fig. 7: Move each cuckoo using Eq. (3), 2 (for modified version). At first glance, both sets of charts 8: Find the best position in nest using Algorithm 1, look similar. However, the values on the axes are much smaller 9: Hosts decision is made by Eq. (7), in the case of the algorithm with proposed modifications. 10: t + +, In almost all cases, the curves have been slightly smoothed 11: end while with the introduction of modifications. One of the most 12: Return the best cuckoo in population, interesting cases is the Schwafel function, for which the graph 13: Stop. of the average value of the fitness function slowed down to zero. The reason is primarily the landscape (that is, many local minimums) and the use of local search technique which IV. E XPERIMENTS deepened the falling into this minimum. Hence, the function For all test function are presented in Table I. Both versions graph is more bulged compared to the original. In other (classic and modified) were implemented in Mathematica 9. functions, there can be seen improvement. 26 Table II: Obtained results for all test function in 5 dimensions. Function Population Classic CSA Modified CSA iterations f(x) f(x) Ackley 50/50 8.1115809961431699E-3 1.0173805303914699E-3 100/100 3.1161468790261702E-3 4.25027784882959E-4 Griewank 50/50 1.1104141890427901E-6 1.11533858337188E-9 100/100 9.7064015180947607E-7 1.6771042340344999E-11 Rastrigin 50/50 1.02208808007731E-3 2.0757391077097499E-5 100/100 8.5738084933950599E-4 1.6665285423624399E-5 Schwefel 50/50 2.839832E-3 1.62429252212134E-4 100/100 1.89128E-3 1.4616910059213499E-4 Rotated hyper-ellipsoid 50/50 3.2147278590055301E-6 8.3496728930287895E-7 100/100 9.774778625110691E-7 1.06247907632132E-7 Sum squares 50/50 3.9473326955707401E-6 3.2406587358828998E-6 100/100 3.0576751975143799E-6 3.67314536996053E-7 Zakharov 50/50 1.1666325957392999E-5 3.7743509867365198E-6 100/100 4.8645321838795998E-6 2.7434504482140701E-7 Rosenbrock 50/50 -5.4930904358968698E-2 -6.6770179875013505E-2 100/100 -8.1294844521185197E-2 -9.8308810919833806E-3 Stybliski-Tang 50/50 -194.11209430931001 -194.2329823892 100/100 -195.2309123 -195.84287387270001 V. C ONCLUSIONS [6] S. Oreski and G. Oreski. Genetic algorithm-based heuristic for feature selection in credit risk assessment. Expert systems with applications, In this paper, two modification were presented to improve 41(4):2052–2064, 2014. the performance of Cuckoo Search Algorithm. The first one is [7] J. Peng, Y. Meng, M. Xue, X. Hei, and K. W. Ross. Attacks and defenses in location-based social networks: A heuristic number theory approach. the introduction of a local search as finding the best position In International Symposium on Security and Privacy in Social Networks in the nest to reduce the probability of being detected by the and Big Data, pages 64–71. IEEE, 2015. host. The second modification is the remodeling of the host’s [8] P. H. V. Penna, A. Subramanian, L. S. Ochi, T. Vidal, and C. Prins. A hybrid heuristic for a broad class of vehicle routing problems with decision regarding the egg. This is important because in the heterogeneous fleet. Annals of Operations Research, pages 1–70, 2017. original algorithm, it was possible to remove the best solutions [9] D. Połap. Neuro-heuristic voice recognition. In Federated Conference in a given iteration. And now, the decision depends on the on Computer Science and Information Systems, pages 487–490. IEEE, 2016. quality of the position relative to the best individual in whole [10] D. Połap et al. Polar bear optimization algorithm: Meta-heuristic with population. fast population movement and dynamic birth and death mechanism. Symmetry, 9(10):203, 2017. Both versions of CSA were tested in terms of accuracy, [11] D. Połap and M. Woźniak. Detection of important features from images quality of adaptation, trajectory, convergence of solution in using heuristic approach. In International Conference on Information each iteration. Comparison of results indicates the advantage and Software Technologies, pages 432–441. Springer, 2017. [12] D. Połap, M. Woźniak, C. Napoli, and E. Tramontana. Real-time cloud- of modification because the accuracy is more than 10% higher based game management system via cuckoo search algorithm. Interna- than the original. Moreover, in most cases, convergence was tional Journal of Electronics and Telecommunications, 61(4):333–338, faster except the cases of functions with many local minima. 2015. [13] D. Połap, M. Woźniak, C. Napoli, E. Tramontana, and R. Damaševičius. The reason is faster move to the minimum through local Is the colony of ants able to recognize graphic objects? In International search. An important element for future work is to design the Conference on Information and Software Technologies, pages 376–387. exit mechanism from the minimum and to analyze other local Springer, 2015. [14] S. Saremi, S. Mirjalili, and A. Lewis. Grasshopper optimisation search techniques. algorithm: Theory and application. Advances in Engineering Software, 105:30–47, 2017. R EFERENCES [15] J. T. Starczewski, S. Pabiasz, N. Vladymyrska, A. Marvuglia, C. Napoli, and M. Woźniak. Self organizing maps for 3d face understanding. In [1] S. Anitha and M. M. Metilda. A heuristic approach for observing International Conference on Artificial Intelligence and Soft Computing, outlying points in diabetes data set. In IEEE International Conference on pages 210–217. Springer, 2016. Smart Technologies and Management for Computing, Communication, [16] M. Woźniak, D. Połap, L. Kośmider, and T. Cłapa. Automated Controls, Energy and Materials, pages 199–202. IEEE, 2017. fluorescence microscopy image analysis of pseudomonas aeruginosa [2] Y. Feng, G.-G. Wang, S. Deb, M. Lu, and X.-J. Zhao. Solving 0–1 bacteria in alive and dead stadium. Engineering Applications of Artificial knapsack problem by a novel binary monarch butterfly optimization. Intelligence, 67:100–110, 2018. Neural computing and applications, 28(7):1619–1634, 2017. [17] M. Wozniak, D. Polap, L. Kosmider, C. Napoli, and E. Tramontana. [3] N. Liu, Q. Chen, J. Liu, X. Lu, P. Li, J. Lei, and J. Zhang. A heuristic op- A novel approach toward x-ray images classifier. In IEEE Symposium eration strategy for commercial building microgrids containing evs and Series on Computational Intelligence, pages 1635–1641. IEEE, 2015. pv system. IEEE Transactions on Industrial Electronics, 62(4):2560– [18] X.-S. Yang and S. Deb. Cuckoo search via lévy flights. In Nature & 2570, 2015. Biologically Inspired Computing, 2009. NaBIC 2009. World Congress [4] M. Ogiołda. The use of clonal selection algorithm for the vehicle on, pages 210–214. IEEE, 2009. routing problem with time windows. Symposium for Young Scientists in [19] K. Z. Zhang, S. J. Zhao, C. M. Cheung, and M. K. Lee. Examining the Technology, Engineering and Mathematics, pages 68–74, 2017. influence of online reviews on consumers’ decision-making: A heuristic– [5] E. Okewu, S. Misra, R. Maskeliūnas, R. Damaševičius, and systematic model. Decision Support Systems, 67:78–89, 2014. L. Fernandez-Sanz. Optimizing green computing awareness for environ- mental sustainability and economic security as a stochastic optimization problem. Sustainability, 9(10):1857, 2017. 27 Ackley function: Griewank function: Rastrigin function: Rosenbrock function: Rotated hyper-ellipsoid: Schwefel function: Styblinski-Tang function: Sum squares function: Zakharov function: Figure 1: Sample benchmark tests results for all test function 28 obtained by the classic version of CSA. In each row, there are: 3D plot, average trajectory, average fitness function and convergence rate. Ackley function: Griewank function: Rastrigin function: Rosenbrock function: Rotated hyper-ellipsoid: Schwefel function: Styblinski-Tang function: Sum squares function: Zakharov function: Figure 2: Sample benchmark tests results for all test function29obtained by the modified version of CSA. In each row, there are: 3D plot, average trajectory, average fitness function and convergence rate.