Evolutionary approach to S-box generation: Optimizing nonlinear substitutions in symmetric ciphers⋆ Oleksandr Kuznetsov1,2,3,†, Nikolay Poluyanenko2,†, Emanuele Frontoni3,†, Marco Arnesano1,† and Oleksii Smirnov4,∗,† 1 eCampus University, 10 Via Isimbardi, 22060 Novedrate, Italy 2 V. N. Karazin Kharkiv National University, 4 Svobody sq., 61022 Kharkiv, Ukraine 3 University of Macerata, 30/32 Via Crescimbeni, 62100 Macerata, Italy 4 Central Ukrainian National Technical University, 8 University ave., 25006 Kropyvnytskyi, Ukraine Abstract This study explores the application of genetic algorithms in generating highly nonlinear substitution boxes (S-boxes) for symmetric key cryptography. We present a novel implementation that combines a genetic algorithm with the Walsh-Hadamard Spectrum (WHS) cost function to produce 8×8 S-boxes with a nonlinearity of 104. Our approach achieves performance parity with the best-known methods, requiring an average of 49,399 iterations with a 100% success rate. The study demonstrates significant improvements over earlier genetic algorithm implementations in this field, reducing iteration counts by orders of magnitude. By achieving equivalent performance through a different algorithmic approach, our work expands the toolkit available to cryptographers and highlights the potential of genetic methods in cryptographic primitive generation. The adaptability and parallelization potential of genetic algorithms suggests promising avenues for future research in S-box generation, potentially leading to more robust, efficient, and innovative cryptographic systems. Our findings contribute to the ongoing evolution of symmetric key cryptography, offering new perspectives on optimizing critical components of secure communication systems. Keywords S-box generation, genetic algorithms, nonlinear substitutions, Walsh-Hadamard spectrum, cryptographic primitives, heuristic optimization, cryptographic strength1 1. Introduction play crucial roles in determining an S-box’s overall cryptographic efficacy [13, 14]. The realm of digital security is in a constant state of While algebraically constructed S-boxes, such as the one evolution, with symmetric key cryptography serving as a used in the Advanced Encryption Standard (AES) with its fundamental pillar in the architecture of secure optimal nonlinearity of 112 [15], might seem ideal, they are communication systems [1–3]. At the core of many not without vulnerabilities. The presence of inherent symmetric encryption algorithms lie Substitution boxes (S- algebraic structures in such S-boxes can create potential boxes) [4], which play a pivotal role in establishing the weaknesses, making them susceptible to algebraic nonlinear components essential for robust encryption [5, 6]. cryptanalysis [16–18]. This vulnerability underscores the These S-boxes are critical in creating the confusion and need for randomly generated S-boxes that lack hidden diffusion properties that Claude Shannon identified as algebraic structures, thereby enhancing resistance against crucial for secure ciphers [7, 8]. sophisticated cryptanalytic techniques [19–21]. The cryptographic strength of an S-box is multifaceted, The generation of cryptographically robust S-boxes encompassing several key indicators [9]. Nonlinearity, presents a significant computational challenge. The vast which quantifies an S-box’s resistance to linear search space of possible configurations for 8×8 S-boxes is cryptanalysis, stands as a primary measure. For 8×8 S-boxes, estimated at 28! (approximately 10506), renders exhaustive commonly employed in modern ciphers, achieving a search methods impractical. This complexity has driven nonlinearity of 104 represents a significant benchmark [10– research towards heuristic approaches for S-box generation 12]. However, other properties such as differential [22–24]. Methods such as simulated annealing, hill uniformity, algebraic degree, and algebraic immunity also climbing, and genetic algorithms have shown promise in navigating this expansive solution space efficiently [25]. CQPC-2024: Classic, Quantum, and Post-Quantum Cryptography, August 0000-0003-2331-6326 (O. Kuznetsov); 0000-0001-9386-2547 6, 2024, Kyiv, Ukraine (N. Poluyanenko); 0000-0002-8893-9244 (E. Frontoni); 0000-0003-1700- ∗ Corresponding author. 3075 (M. Arnesano); 0000-0001-9543-874X (O. Smirnov) † These authors contributed equally. © 2024 Copyright for this paper by its authors. Use permitted under oleksandr.kuznetsov@uniecampus.it (O. Kuznetsov); Creative Commons License Attribution 4.0 International (CC BY 4.0). nlfsr01@gmail.com (N. Poluyanenko); emanuele.frontoni@unimc.it (E. Frontoni); marco.arnesano@uniecampus.it (M. Arnesano); dr.smirnovoa@gmail.com (O. Smirnov) CEUR Workshop ceur-ws.org ISSN 1613-0073 1 Proceedings Recent advances in heuristic S-box generation have made high nonlinearity. Building on this work, Souravlias et al. significant advances. Researchers have studied various cost (2017) [33] proposed an algorithm portfolio approach functions, including the Walsh-Hadamard spectrum (WHS) combining simulated annealing and tabu search, showing function [23, 26], the Picek cost function (PCF) [10], improved results under limited time budgets. improved Walsh-Hadamard spectrum-based cost functions Genetic algorithms have also been explored for S-box (WCF) [10, 27], and two new extended cost functions (ECF generation [24, 34]. Tesar (2010) [35] combined a genetic and WCFS) [6, 28, 29] in conjunction with different search algorithm with a tree search method, generating 8×8 S- algorithms [10, 22]. These efforts have progressively boxes with nonlinearity up to 104. Picek et al. (2016) [11] reduced the computational cost of generating highly presented a novel cost function for evolving S-boxes, nonlinear S-boxes, with some methods achieving the target achieving significant improvements in both speed and nonlinearity of 104 in fewer than 100,000 iterations. quality of results compared to previous approaches. Despite these advancements, there remains a gap in Ivanov et al. (2016a, 2016b) [36, 37] introduced an understanding the full potential of genetic algorithms in this innovative approach using a modified immune algorithm domain. While genetic approaches have been applied to S- combined with hill climbing, rapidly generating large sets box generation, their performance in comparison to other of highly nonlinear bijective S-boxes. Their work heuristic methods, particularly in terms of consistency and demonstrated the potential of hybrid approaches in S-box efficiency in generating S-boxes with optimal cryptographic generation. properties, remains an area ripe for exploration. Recent advancements have focused on improving Our study aims to address this gap by presenting a specific cryptographic properties. Rodinko et al. (2017) [38] comprehensive investigation into the application of genetic optimized a method for generating high nonlinear S-boxes, algorithms for generating 8×8 S-boxes with a nonlinearity of achieving nonlinearity of 104, algebraic immunity of 3, and 104. We explore the synergy between genetic algorithms and 8-uniformity within reasonable computational time. Freyre the WHS cost function, aiming to match or surpass the Echevarría and Martínez Díaz (2020) [27] proposed a new efficiency of existing methods while leveraging the inherent cost function specifically designed to improve the advantages of evolutionary approaches, such as adaptability nonlinearity of bijective S-boxes. and the potential for parallelization. The importance of multiple cryptographic criteria has The remainder of this paper is structured as follows: been emphasized in recent literature. Freyre-Echevarría et Section 2 provides a comprehensive review of the literature, al. (2020) [10] introduced an external parameter- detailing the evolution of S-box generation techniques and independent cost function for evolving bijective S-boxes, the current state of the art. Section 3 offers a background on considering both nonlinearity and other important S-boxes, their cryptographic properties, and the theoretical properties. Their work highlighted the need for balanced foundations underpinning their design. Section 4 delineates optimization across multiple cryptographic criteria. our methodology and experimental setup, including the More recent studies have explored novel approaches to specifics of our genetic algorithm implementation and S-box generation. Artuğer and Özkaynak (2024) [39] evaluation criteria. Section 5 presents our results and a proposed a post-processing approach to improve the detailed discussion, comparing our findings with existing nonlinearity of chaos-based S-boxes, addressing a methods and analyzing their implications. Finally, Section 6 longstanding challenge in this area. Haider et al. (2024) [40] concludes the paper, summarizing our key findings and introduced an S-box generator based on elliptic curves, outlining promising directions for future research in this offering a balance between randomization and optimization critical area of cryptographic system design. with minimal computation time. The application of S-boxes in specific cryptographic 2. Literature review contexts has also been a focus of recent research. Jamal et al. (2024) [41] developed a region of interest-based medical The design and generation of cryptographically strong S- image encryption technique using chaotic S-boxes, boxes have been subjects of intensive research in the field demonstrating the practical applications of advanced S-box of symmetric key cryptography. This section provides a designs in specialized domains. comprehensive review of the existing literature, focusing on Emerging threats and the need for enhanced security various approaches to S-box generation and their have led to new considerations in S-box design. Fahd et al. cryptographic properties. (2024) [42] examined the reality of backdoored S-boxes, Algebraic constructions of S-boxes, such as those based highlighting the importance of thorough cryptanalysis and on finite field inversion used in the Advanced Encryption the potential vulnerabilities in S-box structures. Standard (AES) [15, 30, 31], have been widely studied. In conclusion, the literature reveals a trend towards more However, as Bard (2009) [16] and Courtois and Bard (2007) sophisticated, multi-criteria optimization approaches in S-box [17] point out, these constructions may be vulnerable to generation. While significant progress has been made in algebraic attacks due to their inherent mathematical achieving high nonlinearity and other desirable properties, structure. This vulnerability has led to increased interest in there remains a need for methods that can consistently produce generating S-boxes with more complex algebraic structures S-boxes with optimal cryptographic characteristics while [32]. balancing computational efficiency and resistance to emerging Heuristic approaches have gained significant traction in cryptanalytic techniques. recent years. Clark et al. (2005) [26] introduced a simulated annealing approach for S-box generation [23], demonstrating its effectiveness in producing S-boxes with 2 3. Background The algebraic immunity can be computed by constructing the minimal reduced Gröbner basis of the ideal I(S) using the Symmetric cryptography forms the backbone of secure degree reverse lexicographic (degrevlex) ordering, and communication in the digital age. At the heart of many finding the polynomial of minimum degree in this basis. symmetric ciphers lie Substitution boxes (S-boxes), These properties collectively contribute to the S-box’s nonlinear components crucial for ensuring the security and ability to resist various cryptanalytic attacks, including robustness of these cryptographic systems. This section differential, linear, and algebraic cryptanalysis. The concept provides a comprehensive overview of S-boxes, their role in of algebraic immunity for S-boxes, as introduced by Faugère symmetric cryptography, and the application of genetic and Perret, provides a crucial measure of resistance against algorithms in their optimization. algebraic attacks, which attempt to express the cipher as a system of low-degree multivariate polynomial equations. 3.1. S-boxes in symmetric cryptography The relationship between the algebraic immunity of an Substitution boxes (S-boxes) are fundamental components S-box and that of Boolean functions can be established in symmetric-key algorithms, serving as the primary source through the following construction. Consider a Boolean of nonlinearity [7, 8]. An S-box is essentially a vectorial function 𝑓 : 𝐹 → 𝐹 defined as [44, 45]: Boolean function that maps a fixed number of input bits to f S ( x1 , x2 ,..., xn , y1 , y2 ,..., ym )  a fixed number of output bits. Formally, an n×m S-box can be defined as [9]: 1, if i , j : fi ( x1 , x2 ,..., xn )  y j ;  S : F2n  F2m , 0, if i, j : f i ( x1 , x2 ,..., xn )  y j . where 𝐹 and 𝐹 are vector spaces over the Galois field The algebraic immunity of the S-box S is then equivalent GF(2) with dimensions n and m, respectively. to the minimum degree of non-zero polynomials in the The cryptographic strength of an S-box is determined annihilator of fS: by several critical properties [10]: AI (S )  min deg ( g) | g  Ann( f S ) . 1) Nonlinearity: A measure of the distance between the S-box and the set of all affine functions. For an n×n S-box, This formulation provides a bridge between the the nonlinearity is defined as: algebraic immunity of vectorial Boolean functions (S-boxes) and that of single Boolean functions, unifying the concept 1 across different cryptographic primitives. NL ( S )  2 n 1  max  2 aF2n ,bF2n \ 0 xF2n (1)bS ( x ) a x , 3.2. Importance of S-boxes in modern where denotes the dot product and ⊕ represents bitwise ciphers and the need for XOR. randomness 2) Differential uniformity: Quantifies the uniformity of S-boxes play a pivotal role in ensuring the security of output differences when the input is changed. The symmetric ciphers by introducing nonlinearity and differential uniformity δ is given by: complexity into the encryption process [7]. They are   max | x  F2n : S ( x)  S ( x  a)  b | employed in widely-used algorithms such as the Advanced a  0,b . Encryption Standard (AES) [15], where the SubBytes 3) Algebraic degree: The highest degree among the operation relies on a carefully designed 8×8 S-box. component Boolean functions of S. For an n×m S-box, the However, the increasing sophistication of cryptanalytic algebraic degree is: techniques has necessitated a reevaluation of traditional S- box design methods. deg (S )  max m deg (v  S ) While algebraically constructed S-boxes, such as those vF2 \0 . used in AES (based on finite field inverses) [30, 31], offer 4) Balancedness: An S-box is balanced if each output certain advantages in terms of implementation efficiency occurs with equal probability when the input is uniformly and some cryptographic properties, they may fall short in distributed. terms of algebraic immunity [43]. The structured nature of 5) Algebraic Immunity [43]: A measure of resistance these S-boxes can potentially lead to vulnerabilities against against algebraic attacks. For an S-box 𝑆: 𝐹 → 𝐹 , the algebraic attacks, which have gained significant attention in algebraic immunity is defined as: recent years [16, 17]. AI ( S )  min deg( P), P  I ( S ) , Algebraic attacks exploit the possibility of expressing the cipher as a system of low-degree multivariate where I(S) is the ideal generated by the polynomials polynomial equations [17, 18]. The complexity of solving representing the S-box: such systems is closely related to the algebraic immunity of  y1  f1 ( x1 , x2 ,..., xn ),  the S-box [43]. A low algebraic immunity allows for a   simpler representation of the cipher, potentially reducing  y2  f 2 ( x1 , x2 ,..., xn ),  the computational effort required for cryptanalysis [44, 45]. I (S )   ...,  This vulnerability has prompted researchers to explore   alternative methods for S-box generation that prioritize  ym  f m ( x1 , x2 ,..., xn )  . high algebraic immunity alongside other critical properties. 3 To address these concerns, there is growing interest in the offer a powerful and flexible approach to optimizing cryptographic community in random or pseudo-random S- multiple cryptographic properties simultaneously boxes [24, 46, 47]. These S-boxes, generated through [10, 37, 54]. heuristic methods, offer several advantages: The fundamental principle of GAs is to emulate the process of natural selection, where the fittest individuals are  Higher algebraic immunity: Random S-boxes are more likely to survive and reproduce, passing their less likely to exhibit algebraic structures that can beneficial traits to future generations [52, 53]. In the case of be exploited in attacks, potentially leading to S-box generation, an “individual” represents a candidate S- higher algebraic immunity values. box, and its “fitness” is determined by how well it satisfies  Resistance to specialized attacks: Algebraically the desired cryptographic properties. constructed S-boxes might be vulnerable to attacks The basic structure of a GA includes the following tailored to their specific structure. Random S- components [54, 55]: boxes, lacking such predictable structures, can offer better protection against these targeted  Chromosome representation: Encoding of attacks. potential solutions (S-boxes).  Flexibility in design: Heuristic methods allow for  Fitness function: Evaluates the quality of solutions the optimization of multiple cryptographic criteria based on cryptographic criteria. simultaneously, enabling a more balanced  Selection mechanism: Chooses individuals for approach to S-box design. reproduction.  Genetic operators: Crossover and mutation to Adaptability to evolving threat models: As new create new solutions. cryptanalytic techniques emerge, the criteria for S-box  Termination criteria: Conditions for ending the generation can be adjusted more easily with heuristic evolutionary process. methods compared to algebraic constructions. Various heuristic approaches have been proposed for A general pseudocode for a Genetic Algorithm applied generating high-quality random S-boxes, including: to S-box generation can be described as follows:  Simulated Annealing [23, 26, 33, 48]: This method Algorithm: Genetic algorithm for S-box generation mimics the physical process of annealing in Input: Population size N, number of generations G, metallurgy, gradually “cooling” the system to find crossover rate pc, mutation rate pm; an optimal configuration. It has shown promise in Output: Optimized S-box; generating S-boxes with good cryptographic properties. 1. Initialize population P of N random S-boxes  Hill Climbing [6, 10, 36, 49, 50]: A local search 2. For g = 1 to G do algorithm that iteratively makes small 3. Evaluate the fitness of each S-box in P improvements to a candidate solution. This 4. Select parents for reproduction using tournament approach can be effective in fine-tuning S-box selection properties. 5. Create new population P’ through crossover and  Genetic Algorithms [10, 35, 37, 51]: Evolutionary mutation: approaches that mimic natural selection to evolve 6. For i = 1 to N/2 do a population of S-boxes towards desired 7. Select two parents p1 and p2 from P properties. These algorithms have demonstrated 8. If random (0,1) < pc then the ability to generate S-boxes with excellent 9. (c1, c2) = Crossover(p1, p2) cryptographic characteristics, including high 10. Else algebraic immunity. 11. (c1, c2) = (p1, p2) 12. End If In this work, we focus on genetic algorithms due to their 13. Mutate c1 and c2 with probability pm ability to efficiently explore large search spaces and handle 14. Add c1 and c2 to P’ multi-objective optimization problems. Genetic algorithms 15. End For offer a promising approach to generating S-boxes that 16. P = P’ balance multiple cryptographic criteria, including high 17. End For algebraic immunity, nonlinearity, and differential 18. Return the best S-box from P uniformity. Key parameters and their roles: 3.3. Genetic algorithms for S-box generation Genetic Algorithms (GAs) are stochastic optimization  Population size (N): Determines the diversity of techniques inspired by the principles of natural selection solutions. A larger population allows for broader and evolution [52, 53]. They operate on a population of exploration of the search space but increases potential solutions, evolving them over successive computational cost. generations to improve their fitness concerning a defined  Number of generations (G): Controls the duration objective function. In the context of S-box generation, GAs of the evolutionary process. More generations 4 allow for further optimization but may lead to Algorithm: Modified genetic algorithm for S-box overfitting. generation  Crossover rate (pc): Probability of performing Input: Spop, Kiter, Kpop, Kmut crossover. Higher rates promote the exploration of Output: Optimized S-box or 0 (failure) new solution combinations.  Mutation rate (pm): Probability of mutating each 1. For t = 0 to Kiter - 1 do bit in a chromosome. Higher rates increase 2. Spop = elite_selection(Spop) diversity but may disrupt good solutions. 3. For p = 0 to Kpop - 1 do 4. S ← Spop[p] The fitness function is crucial in guiding the 5. For k = 0 to Kmut - 1 do evolutionary process towards S-boxes with desired 6. S’ ← S cryptographic properties. 7. i ← random(0, 255) The selection mechanism, often implemented as 8. j ← random(0, 255) tournament selection, ensures that fitter individuals have a 9. swap(S’[i], S’[j]) higher chance of being chosen for reproduction. This 10. Nf, Fc ← evaluate(S’) process mimics natural selection, where more adapted 11. If Nf ≥ 104 then individuals are more likely to pass on their genes. 12. Return S’ Crossover operators for S-boxes must be carefully 13. Spop = Spop ∪ {S’} designed to preserve the bijective property. One approach 14. End For is to use a permutation-based crossover, where segments of 15. End For the S-box permutation are exchanged between parents. For 16. End For example, given two parent S-boxes P1 and P2, a two-point 17. Return 0 crossover might produce offspring C1 and C2 as follows: Key components and parameters of the algorithm: P1  (a1 , a2 ,..., ak | ak 1 ,..., al | al 1 ,..., an ) ;  Spop: The population of S-boxes, initially generated P2  (b1 , b2 ,..., bk | bk 1 ,..., bl | bl 1 ,..., bn ) ; using the Fisher-Yates shuffle algorithm to ensure bijectivity. C1  (a1 , a2 ,..., ak | bk 1 ,..., bl | al 1 ,..., an ) ;  Kiter: Maximum number of iterations, set to 150,000 in our experiments. C2  (b1 , b2 ,..., bk | ak 1 ,..., al | bl 1 ,..., bn ) .  Kpop: Population size, representing the number of Mutation operators introduce small random changes to elite S-boxes maintained in each generation. maintain genetic diversity and prevent premature  Kmut: Number of mutations applied to each S-box convergence. For S-boxes, this might involve swapping two in the population per generation. randomly chosen elements or applying a random permutation to a subset of elements. The elite selection function performs a crucial role in our algorithm. It ranks the S-boxes based on their 4. Modified genetic algorithm nonlinearity and objective function values, prioritizing higher nonlinearity and lower objective function values. Our research focuses on developing and implementing a This function ensures that only the top Kpop S-boxes survive modified genetic algorithm for generating to the next generation, maintaining a high-quality cryptographically strong S-boxes. This section details our population. approach, the algorithm’s structure, and the experimental setup used to evaluate its performance. 4.2. Mutation operator 4.1. Modified genetic algorithm overview Our mutation operator is designed to preserve the bijectivity of the S-box while introducing controlled We have developed a modified genetic algorithm that randomness. It operates by swapping two randomly incorporates elements of hill climbing, enhancing its ability selected (distinct) elements within the S-box. This approach to navigate the complex search space of S-box ensures that the fundamental property of bijectivity is configurations. This approach allows for a more targeted maintained throughout the evolutionary process. exploration of promising regions while maintaining the Formally, the mutation can be described as: population-based nature of genetic algorithms. The core idea of our algorithm is to maintain a S [i ]  S [ j ], S [ j ]  S [ i ] , population of S-boxes, subject them to controlled mutations, where 𝑖, 𝑗 ∈ 0,1, … , 255, 𝑖 ≠ 𝑗 and all other elements evaluate their cryptographic properties, and selectively remain unchanged. propagate the best specimens to subsequent generations. This process is iterated until either an S-box meeting the 4.3. Objective function desired criteria is found or a predefined computational limit is reached. The pseudocode for our modified genetic The choice of objective function is critical in guiding the algorithm is: evolutionary process towards cryptographically strong S- boxes. We employ the WHS function proposed by Clark et 5 al. [26], which has shown effectiveness in generating high- For each combination of Kpop and Kmut, we performed 100 quality S-boxes. The WHS function is defined as [26]: independent runs of the search algorithm to ensure 255 255 statistical significance. This resulted in a total of  | WHT [b, i ] |  X R WHS  11×11×100 = 12,100 experimental runs. b 1 i  0 , The algorithm was set to terminate upon finding an S- where WHT[b,i] represents the Walsh-Hadamard box with nonlinearity ≥ 104 or reaching the maximum transform coefficients; i iterate over all component iteration limit of 150,000. For each run, we recorded the functions and their linear combinations; b iterates over all number of S-boxes generated and evaluated (KSbox), which linear functions; X and R are real-valued parameters. serves as our primary metric for computational efficiency. Based on empirical studies, we set R = 12 and X = 0, which has been shown to yield optimal results in generating 5. Results and discussion bijective S-boxes with high nonlinearity [56, 57]. This section presents the results of our comprehensive experimental study on the modified genetic algorithm for S- 4.4. Evaluation criteria box generation. We analyze the performance of the The primary criteria for evaluating the generated S-boxes algorithm across various parameter configurations and are: discuss the implications of our findings.  Nonlinearity (NL): We aim for a nonlinearity of at 5.1. Overview of experimental results least 104, which is close to the theoretical Our primary metric for evaluating the algorithm’s maximum for 8×8 S-boxes. efficiency is KSbox, which represents the number of S-boxes  Differential uniformity (δ): Lower values indicate generated and evaluated before finding an S-box with the better resistance against differential cryptanalysis. desired nonlinearity of 104. Table 1 presents the average  Algebraic degree (deg): Higher degrees provide KSbox values for different combinations of population size better resistance against algebraic attacks. (Kpop) and mutation rate (Kmut).  Algebraic immunity (AI): Higher values indicate increased resistance to algebraic cryptanalysis. 5.2. Analysis of population size impact The evaluate function in our algorithm computes these One of the most striking observations from our results is the properties for each generated S-box, allowing us to assess superior performance of the algorithm when Kpop = 1. This its cryptographic strength comprehensively. configuration consistently yielded the lowest KSbox values across all mutation rates, with averages ranging from 49,277 to 4.5. Experimental setup 58,213. This finding is somewhat counterintuitive, as genetic algorithms typically benefit from larger population sizes that Our experiments were conducted on a high-performance provide greater genetic diversity. computing cluster to handle the computational intensity of The effectiveness of a single-individual population the S-box generation process. The implementation was done suggests that our algorithm’s behavior in this configuration in C++ for efficiency, with parallelization to utilize multiple closely resembles that of a stochastic hill-climbing method. This cores. approach appears to be particularly well-suited to the S-box Given that the calculation of the objective function is optimization problem, possibly due to the following factors: the most computationally expensive operation in terms of Landscape structure: The fitness landscape of S-box processor time, the complexity of the entire search configurations may have numerous local optima that are algorithm can be considered proportional to the number of relatively close in quality to the global optimum. In such a times the objective function is calculated. This corresponds scenario, an aggressive local search can be highly effective. to the number of S-boxes that were generated and Mutation operator efficiency: Our swap-based evaluated. We denote this quantity as KSbox. mutation operator appears to be sufficiently powerful to To accelerate the algorithm’s performance, we navigate the search space effectively, even without the implemented parallel computation of the new population diversity typically provided by a larger population. using Nthread = 8 threads within each iteration. This Reduced computational overhead: With Kpop = 1, the parallelization significantly reduced the overall execution algorithm avoids the computational cost associated with time of the algorithm. managing and evaluating a large population, allowing for We conducted a comprehensive parameter sweep to more iterations within the same computational budget. analyze the impact of population size and mutation rate on the quality of the generated S-boxes and the algorithm's 5.3. Impact of mutation rate convergence rate. Specifically: While the population size shows a clear trend, the impact of  Population size (Kpop) was varied from 1 to 21 with the mutation rate (Kmut) is more nuanced. For Kpop = 1, we a step size of 2. observe that:  The mutation rate (Kmut) was varied from 1 to 31 The lowest KSbox (49,277) was achieved with Kmut = 7. with a step size of 3. Performance generally degraded with higher mutation rates, with KSbox increasing to 58,213 at Kmut = 1. 6 This pattern suggests that there exists an optimal  Slower convergence: Diversity maintenance in balance between exploration and exploitation in the search larger populations may slow down the process. Lower mutation rates may lead to premature convergence to high-quality solutions. convergence, while higher rates may disrupt good solutions too frequently. However, it’s worth noting that larger populations might offer benefits not captured by the KSbox metric alone, 5.4. Scalability and computational such as increased robustness or the ability to find a more efficiency diverse set of high-quality S-boxes. As Kpop increases, we observe a general trend of increasing 5.5. Parallelization performance KSbox values, indicating reduced computational efficiency. This scaling behavior can be attributed to: Our implementation of parallel computation using 8 threads (Nthread = 8) has proven to be effective in accelerating the  Increased evaluation overhead: Larger populations search process. This parallelization strategy is particularly require more objective function evaluations per beneficial for configurations with larger Kpop and Kmut generation. values, where the workload can be more evenly distributed across threads. Table 1 The average number of S-boxes generated (KSbox) before finding an S-box with Nf = 104 Kpop Kmut 1 3 5 7 9 11 13 15 17 19 21 1 58,213 65,942 72,830 86,642 101,726 111,990 112,718 125,113 132,806 140,336 149,339 4 56,067 64,863 75,069 89,598 94,726 105,925 122,364 137,003 136,740 151,874 163,291 7 49,277 67,198 77,848 88,353 103,154 109,618 122,382 130,901 142,463 144,601 165,918 10 54,636 65,723 82,198 92,542 102,797 114,163 129,411 137,442 147,416 161,020 165,672 13 56,042 62,660 83,216 94,538 101,073 117,611 124,466 135,244 152,048 158,696 171,756 16 56,010 68,711 79,645 93,134 107,371 120,567 125,274 140,817 150,494 155,049 169,462 19 56,532 65,910 82,883 92,911 105,144 117,877 129,718 142,017 155,463 164,902 175,531 22 54,775 67,236 77,663 92,874 105,559 120,992 131,029 140,772 156,224 162,808 176,669 25 50,066 70,394 79,596 98,967 115,462 118,406 135,294 144,708 157,321 177,621 183,087 28 54,203 70,453 82,200 91,841 108,783 121,683 133,665 152,984 159,887 176,751 181,781 31 53,709 71,581 91,827 101,536 109,625 126,616 143,233 156,987 160,573 183,069 192,493 5.6. Comparison with existing methods cryptographers and security researchers. This diversity in high-performing methods enhances The best-performing configuration of our algorithm the robustness of S-box generation techniques. (Kpop = 1, Kmut = 7) achieves an average KSbox of 49,277. To  Consistency and Reliability: Like our previous best contextualize our findings within the broader landscape of results, the genetic algorithm maintains a 100% S-box generation research, we conducted a comprehensive success rate in generating target S-boxes with comparison of our genetic algorithm approach with existing nonlinearity 104. This level of reliability is crucial methods. Table 2 presents this comparative analysis, for practical applications in cryptographic system encompassing various techniques and cost functions design. employed in the field.  Efficiency Across Methods: The similarity in Our genetic algorithm implementation, utilizing the WHS performance between our genetic algorithm and cost function, achieves results that are on par with the best- hill climbing approaches (49,399 vs. 50,000 known methods in the field. Specifically, our approach iterations) suggests that we may be approaching generates S-boxes with a nonlinearity of 104 in an average of theoretical limits of efficiency for generating S- 49,399 iterations, with a 100% success rate. This performance boxes with these properties using heuristic is comparable to our previous works using hill climbing methods. [6, 28], which required 50,000 iterations on average. Several key observations emerge from this comparative Progress from Earlier Genetic Approaches: Compared to analysis: earlier genetic algorithm implementations [11, 35], our method shows substantial improvement, reducing the  Parity in Performance: Our genetic algorithm required iterations by orders of magnitude while achieving achieves results that are statistically equivalent to higher nonlinearity. the best-known methods, particularly our earlier The achievement of parity with the best-known results hill-climbing approach. This parity is significant, using a genetic algorithm is particularly noteworthy and as it demonstrates the versatility and potential of underscores the potential of evolutionary approaches in genetic algorithms in this domain. cryptographic primitive generation.  Algorithmic Diversity: By achieving comparable results through a different algorithmic approach, we have expanded the toolkit available to 7 5.7. Practical Implications computational resources, as it doesn’t require maintaining a large population. The superior performance of the Kpop = 1 configuration has  Simplicity: The simplified population management important implications for the practical application of our makes the algorithm easier to implement and tune. algorithm:  Adaptability: The algorithm’s efficiency makes it  Resource efficiency: The algorithm can be suitable for scenarios where S-boxes need to be effectively run on systems with limited generated or updated frequently. Table 2 Comparison of S-box Generation Methods Method Cost Function Algorithm NL Success Rate, % Avg. Iterations [23,26] WHS SA 102 0.5 - [23] WHS SA 104 - 30,000,000 [35] WHS HC 100 - 2,500 [35] WHS GaT 104 - 3,239,000 [11] WHS Ga 102 - 28,200 [11] WHS GaT 104 - 3,849,881 [11] WHS LSA 102 - 6,701 [11] PCF Ga 104 - 741,371 [11] PCF GaT 104 - 167,451 [11] PCF LSA 104 - 172,280 [27] WCF LSA 104 - 89,460 [27] WCF HC 104 37 65,933 [58] WHS SA 104 56.4 450,000 [48] WCF SA 104 100 65,000 [48,59] ECF SA 104 100 55,000 … 83,000 [49] WHS HC 104 100 50,000 [6,28] WCFS HC 104 100 50,000 Our work WHS Ga 104 100 49,399 However, it’s important to note that while this valuable insights to the field of cryptographic primitive configuration is optimal for finding a single high-quality design and offer a powerful tool for the development of S-box, alternative configurations may be more suitable secure symmetric encryption systems. for generating a diverse set of S-boxes or for multi- objective optimization scenarios. 6. Conclusions 5.8. Limitations and future work This study presents a significant advancement in the field of S-box generation for symmetric key While our results are promising, several avenues for cryptography, focusing on the application of genetic future research remain: algorithms to produce highly nonlinear substitutions. Our research demonstrates that genetic algorithms,  Extended cryptographic criteria: Incorporate when properly optimized and combined with the Walsh- additional criteria such as algebraic immunity Hadamard Spectrum (WHS) cost function, can achieve and differential uniformity into the objective performance parity with the best-known methods in function. generating 8×8 S-boxes with a nonlinearity of 104.  Adaptive parameter tuning: Develop methods Key findings of our work include: to dynamically adjust Kpop and Kmut during the search process.  The genetic algorithm approach achieves an  Alternative mutation operators: Explore more average of 49,399 iterations to generate target sophisticated mutation strategies that leverage S-boxes, comparable to the best results of domain-specific knowledge about S-box 50,000 iterations using hill-climbing methods. structures.  A 100% success rate in producing S-boxes with  Multi-objective optimization: Extend the the desired nonlinearity, matching the reliability algorithm to simultaneously optimize multiple of top-performing techniques. cryptographic properties, potentially using  A significant improvement over earlier genetic Pareto-based approaches. algorithm implementations, reducing iteration counts by orders of magnitude. In conclusion, our modified genetic algorithm demonstrates exceptional efficiency in generating The achievement of performance parity using a different cryptographically strong S-boxes, particularly in its hill- algorithmic approach expands the toolkit available to climbing-like configuration. These findings contribute cryptographers and highlights the versatility of genetic 8 methods in cryptographic primitive generation. This [9] C. Carlet, Vectorial Boolean Functions for diversity in high-performing techniques enhances the Cryptography, Boolean Models and Methods in robustness of S-box generation methodologies. Mathematics, Computer Science, and Engineering Furthermore, our results underscore the potential of (2006). genetic algorithms in this domain, particularly their [10] A. Freyre-Echevarría, et al., An External Parameter adaptability to evolving cryptographic criteria and their Independent Novel Cost Function for Evolving inherent parallelization capabilities. These Bijective Substitution-Boxes, Symmetry 12 (2020) characteristics position genetic approaches as promising 1896. doi: 10.3390/sym12111896. avenues for future research, potentially leading to more [11] S. Picek, M. Cupic, L. Rotim, A New Cost Function efficient, flexible, and innovative S-box generation for Evolution of S-Boxes, Evolut. Computation 24 techniques. (2016) 695–718. doi: 10.1162/EVCO_a_00191. In conclusion, while not surpassing existing [12] J. Álvarez-Cubero, Vector Boolean Functions: methods in raw performance, our genetic algorithm Applications in Symmetric Cryptography (2015). approach offers a valuable alternative that matches the doi: 10.13140/RG. 2.2.12540.23685. best-known results. This equivalence, coupled with the [13] K. Lisitskiy, I. Lisitska, A. Kuznetsov, unique advantages of genetic algorithms, opens new Cryptographically Properties of Random S-Boxes., perspectives in cryptographic research and 16th International Conference on ICT in Education, development. Future work should focus on exploiting Research and Industrial Applications. Integration, these advantages, potentially through hybridization Harmonization and Knowledge Transfer II, vol. with other heuristic methods or by leveraging parallel 2732 (2020) 228–241. computing architectures to further enhance S-box [14] I. Gorbenko, et al., Random S-Boxes Generation generation efficiency. Methods for Symmetric Cryptography, IEEE 2nd Ukraine Conference on Electrical and Computer References Engineering (UKRCON) (2019) 947–950. doi: 10.1109/UKRCON.2019.8879962. [1] Y. Li, et al., HDLBC: A Lightweight Block Cipher [15] J. Daemen, V. Rijmen, Specification of Rijndael, with High Diffusion, Integr. 94 (2024) 102090. doi: The Design of Rijndael: The Advanced Encryption 10.1016/j.vlsi.2023. 102090. Standard (AES), Springer (2020) 31–51. doi: [2] A. Tiwari, Chapter 14—Cryptography in 10.1007/978-3-662-60769-5_3. Blockchain, Distributed Computing to Blockchain, [16] G. V. Bard, Algebraic Cryptanalysis, Springer US Academic Press (2023) 251–265. doi: 10.1016/B978- (2009). doi: 10.1007/978-0-387-88757-9. 0-323-96146-2.00011-5. [17] N. T. Courtois, G. V. Bard, Algebraic Cryptanalysis [3] M. Milanič, B. Servatius, H. Servatius, Chapter 8 - of the Data Encryption Standard, Cryptography Codes and cyphers, Discrete Mathematics With and Coding, LNSC 4887 (2007) 152–169. doi: Logic, Academic Press (2024) 163–179. doi: 10.1007/978-3-540-77272-9_10. 10.1016/B978-0-44-318782-7.00013-7. [18] N. T. Courtois, J. Pieprzyk, Cryptanalysis of Block [4] Y. Sovyn, et al., Minimization of Bitsliced Ciphers with Overdefined Systems of Equations, Representation of 4×4 S-Boxes based on Ternary Advances in Cryptology—ASIACRYPT 2002, Logic Instruction, in Cybersecurity Providing in LNCS 2501 (2002) 267–287. doi: 10.1007/3-540- Information and Telecom-munication Systems, 36178-2_17. vol. 3421 (2023) 12–24. [19] V. Buhas, et al., Using Machine Learning [5] A. S. Changle, S. P. Metkar, R. K. Patole, Techniques to Increase the Effectiveness of Implementation of S-box for Lightweight Block Cybersecurity, in: Cybersecurity Providing in Cipher, 3rd International Conference on Intelligent Information and Telecommunication Systems, vol. Technologies (2023) 1–4. doi: 3188, no. 2 (2021) 273–281. 10.1109/CONIT59222. 2023.10205535. [20] V. Buriachok, et al., Invasion Detection Model [6] A. Kuznetsov, et al., A New Cost Function for using Two-Stage Criterion of Detection of Heuristic Search of Nonlinear Substitutions, Network Anomalies, in: Cybersecurity Providing Expert Syst. Appls. 237 (2024). doi: in Information and Telecommunication Systems, 10.1016/j.eswa.2023.121684. vol. 2746 (2020) 23–32. [7] A. J. Menezes, et al., CRC Press (2018). doi: [21] V. Sokolov, P. Skladannyi, H. Hulak, Stability 10.1201/9780429466335. Verification of Self-Organized Wireless Networks [8] C. E. Shannon, Communication Theory of Secrecy with Block Encryption, in: 5th International Systems, Bell Syst. Tech. J. 28 (1949) 656–715. doi: Workshop on Computer Modeling and Intelligent 10.1002/j.1538-7305.1949.tb00928.x. Systems, vol. 3137 (2022) 227–237. 9 [22] A. Freyre Echevarría, Evolución Híbrida de S-cajas http://dspace.lib.vutbr.cz/xmlui/handle/11012/569 no Lineales Resistentes a Ataques de Potencia 57 (2020). doi: 10.13140/RG.2.2.17037.77284/1. [36] G. Ivanov, N. Nikolov, S. Nikova, [23] J. McLaughlin, Applications of Search Techniques Cryptographically Strong S-Boxes Generated by to Cryptanalysis and the Construction of Cipher Modified Immune Algorithm, Cryptography and Components, PhD, University of York (2012). URL: Information Security in the Balkans, LNSC 9540 https://etheses.whiterose.ac.uk/3674/ (2016) 31–42. doi: 10.1007/978-3-319-29172-7_3. [24] L. D. Burnett, Heuristic Optimization of Boolean [37] G. Ivanov, N. Nikolov, S. Nikova, Reversed Genetic Functions and Substitution Boxes for Algorithms for Generation of Bijective S-boxes Cryptography, PhD, Queensland University of with Good Cryptographic Properties, Cryptogr. Technology, (2005). URL: Commun. 8 (2016) 247–276. doi: 10.1007/s12095- https://eprints.qut.edu.au/16023/ 015-0170-5. [25] A. Kuznetsov, et al., SBGen: A High-performance [38] M. Rodinko, R. Oliynykov, Y. Gorbenko, Library for Rapid Generation of Cryptographic S- Optimization of the High Nonlinear S-Boxes boxes, SoftwareX 27 (2024) 101788. doi: Generation Method, Tatra Mountains Math. Publ. 10.1016/j.softx.2024.101788. 70 (2017) 93–105. doi: 10.1515/tmmp-2017-0020. [26] J. A. Clark, J. L. Jacob, S. Stepney, The Design of S- [39] F. Artuğer, F. Özkaynak, A New Post-Processing boxes by Simulated Annealing, New Gener. Approach for Improvement of Nonlinearity Comput. 23 (2005) 219–231. doi: 10.1007/BF03037 Property in Substitution Boxes, Integration 94 656. (2024) 102105. doi: 10.1016/j.vlsi.2023.102105. [27] A. Freyre Echevarría, I. Martínez Díaz, A New Cost [40] T. Haider, N. A. Azam, U. Hayat, Substitution Box Function to Improve Nonlinearity of Bijective S- Generator with Enhanced Cryptographic boxes (2020). Properties and Minimal Computation Time, [28] O. Kuznetsov, et al., Enhancing Smart Expert Syst. Appl. 241 (2024) 122779. doi: Communication Security: A Novel Cost Function 10.1016/j.eswa.2023.122779. for Efficient S-Box Generation in Symmetric Key [41] S. S. Jamal, et al., Region of Interest-Based Medical Cryptography, Cryptography 8 (2024) 17. doi: Image Encryption Technique Based on Chaotic S- 10.3390/cryptography8020017. boxes, Expert Syst. Appl. 238 (2024) 122030. doi: [29] O. Kuznetsov, et al., Enhancing Cryptographic 10.1016/j.eswa.2023.122030. Primitives through Dynamic Cost Function [42] S. Fahd, et al., The Reality of Backdoored S-Boxes— Optimization in Heuristic Search, Electronics 13 An Eye Opener, J. Inf. Secur. Appls. 80 (2024) (2024) 1825. doi: 10.3390/electronics13101825. 103674. doi: 10.1016/j.jisa.2023.103674. [30] K. Nyberg, Perfect Nonlinear S-boxes, Advances in [43] G. Ars, J.-C. Faugère, Algebraic Immunities of Cryptology — EUROCRYPT ’91, LNCS 547 (1991) Functions Over Finite Fields, INRIA, (2005). URL: 378–386. doi: 10.1007/3-540-46416-6_32. https://hal.inria.fr/inria-00070475 [31] K. Nyberg, Differentially uniform mappings for [44] O. O. Kuznetsov, et al., Algebraic Immunity of cryptography, Advances in Cryptology— Non-linear Blocks of Symmetric Ciphers, EUROCRYPT ’93, LNCS 765 (1994) 55–64. doi: Telecommun. Radio Eng. 77 (2018) 309–325. doi: 10.1007/3-540-48285-7_6. 10.1615/TelecomRadEng.v77.i4.30. [32] R. La Scala, S. K. Tiwari, Stream/Block Ciphers, [45] A. Kuznetsov, et al., Evaluation of Algebraic Difference Equations and Algebraic Attacks, J. Immunity of modern block ciphers, IEEE Int. Conf. Symbolic Comput. 109 (2022) 177–198. doi: Dependable Syst., Serv. Technol. (2018) 288–293. 10.1016/j.jsc.2021.09.001. doi: 10.1109/DESSERT.2018.8409146. [33] D. Souravlias, K. Parsopoulos, G. Meletiou, [46] K. Lisitskiy, I. Lisitska, A. Kuznetsov, Designing Bijective S-boxes Using Algorithm Cryptographically Properties of Random S-boxes, Portfolios with Limited Time Budgets, Appl. Soft in: ICT in Education, Research and Industrial Comput. 59 (2017) 475–486. doi: Applications. Integration, Harmonization and 10.1016/j.asoc.2017.05. 052. Knowledge Transfer, vol. 2732 (2020) 228–241. [34] W. Millan, et al., Evolutionary Heuristics for [47] I. Gorbenko, et al., Random S-boxes Generation Finding Cryptographically Strong S-Boxes, Methods for Symmetric Cryptography, IEEE 2nd Information and Communication Security, LNCS Ukraine Conference on Electrical and Computer 1726 (1999) 263–274. doi: 10.1007/978-3-540-47942- Engineering, (2019) 947–950. doi: 0_22. 10.1109/ukrcon.2019.8879962. [35] P. Tesar, A New Method for Generating High Non- [48] A. Kuznetsov, et al., Optimized Simulated linearity S-Boxes, (2010). URL: Annealing for Efficient Generation of Highly 10 Nonlinear S-boxes, Soft. Comput. (2023). doi: 10.1007/s00500-023-09334-y. [49] A. Kuznetsov, et al., Optimizing Hill Climbing Algorithm for S-Boxes Generation, Electronics 12 (2023) 2338. doi: 10.3390/electronics12102338. [50] A. Freyre-Echevarría, et al., Evolving Nonlinear S- Boxes With Improved Theoretical Resilience to Power Attacks, IEEE Access 8 (2020) 202728– 202737. doi: 10.1109/ACCESS.2020.3035163. [51] E.C. Laskari, G.C. Meletiou, M.N. Vrahatis, Utilizing Evolutionary Computation Methods for the Design of S-Boxes, International Conference on Computational Intelligence and Security, (2006) 1299–1302. doi: 10.1109/ICCIAS.2006.295267. [52] A. Ghosh, S. Das, B. Saha, Chapter 6—Nature- inspired Optimization Algorithms, Artificial Intelligence in Textile Engineering, Woodhead Publishing (2024) 171–231. doi: 10.1016/B978-0- 443-15395-2.00002-8. [53] C.-W. Tsai, M.-C. Chiang, Chapter Seven—Genetic algorithm, Handbook of Metaheuristic Algorithms, Academic Press (2023) 111–138. doi: 10.1016/B978-0-44-319108-4.00020-4. [54] T. Kapuściński, R. K. Nowicki, C. Napoli, Application of Genetic Algorithms in the Construction of Invertible Substitution Boxes, Artificial Intelligence and Soft Computing, LNAI 9692 (2016) 380–391. doi: 10.1007/978-3-319-39378- 0_33. [55] C.-W. Tsai, M.-C. Chiang, Chapter Sixteen—Local Search Algorithm, Handbook of Metaheuristic Algorithms, Academic Press (2023) 351–374. doi: 10.1016/B978-0-44-319108-4.00030-7. [56] A. Kuznetsov, et al., WHS Cost Function for Generating S-boxes, IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (2021) 434–438. doi: 10.1109/PICST54195. 2021.9772133. [57] A. Kuznetsov, et al., Opportunities to Minimize Hardware and Software Costs for Implementing Boolean Functions in Stream Ciphers, Int. J. Comput. 18 (2019) 443–452. [58] A. Kuznetsov, et al., Optimization of a Simulated Annealing Algorithm for S-Boxes Generating, Sensors 22 (2022) 6073. doi: 10.3390/s22166073. [59] A. Kuznetsov, et al., Generation of Nonlinear Substitutions by Simulated Annealing Algorithm, Inf. 14 (2023) 259. doi: 10.3390/info14050259. 11