=Paper=
{{Paper
|id=Vol-3829/paper1
|storemode=property
|title=Evolutionary approach to S-box generation: Optimizing nonlinear substitutions in symmetric ciphers
|pdfUrl=https://ceur-ws.org/Vol-3829/paper1.pdf
|volume=Vol-3829
|authors=Oleksandr Kuznetsov,Nikolay Poluyanenko,Emanuele Frontoni,Marco Arnesano,Oleksii Smirnov
|dblpUrl=https://dblp.org/rec/conf/cqpc/KuznetsovPFAS24
}}
==Evolutionary approach to S-box generation: Optimizing nonlinear substitutions in symmetric ciphers==
Evolutionary approach to S-box generation:
Optimizing nonlinear substitutions in symmetric ciphers⋆
Oleksandr Kuznetsov1,2,3,†, Nikolay Poluyanenko2,†, Emanuele Frontoni3,†,
Marco Arnesano1,† and Oleksii Smirnov4,∗,†
1
eCampus University, 10 Via Isimbardi, 22060 Novedrate, Italy
2
V. N. Karazin Kharkiv National University, 4 Svobody sq., 61022 Kharkiv, Ukraine
3
University of Macerata, 30/32 Via Crescimbeni, 62100 Macerata, Italy
4
Central Ukrainian National Technical University, 8 University ave., 25006 Kropyvnytskyi, Ukraine
Abstract
This study explores the application of genetic algorithms in generating highly nonlinear substitution boxes
(S-boxes) for symmetric key cryptography. We present a novel implementation that combines a genetic
algorithm with the Walsh-Hadamard Spectrum (WHS) cost function to produce 8×8 S-boxes with a
nonlinearity of 104. Our approach achieves performance parity with the best-known methods, requiring an
average of 49,399 iterations with a 100% success rate. The study demonstrates significant improvements
over earlier genetic algorithm implementations in this field, reducing iteration counts by orders of
magnitude. By achieving equivalent performance through a different algorithmic approach, our work
expands the toolkit available to cryptographers and highlights the potential of genetic methods in
cryptographic primitive generation. The adaptability and parallelization potential of genetic algorithms
suggests promising avenues for future research in S-box generation, potentially leading to more robust,
efficient, and innovative cryptographic systems. Our findings contribute to the ongoing evolution of
symmetric key cryptography, offering new perspectives on optimizing critical components of secure
communication systems.
Keywords
S-box generation, genetic algorithms, nonlinear substitutions, Walsh-Hadamard spectrum, cryptographic
primitives, heuristic optimization, cryptographic strength1
1. Introduction play crucial roles in determining an S-box’s overall
cryptographic efficacy [13, 14].
The realm of digital security is in a constant state of While algebraically constructed S-boxes, such as the one
evolution, with symmetric key cryptography serving as a used in the Advanced Encryption Standard (AES) with its
fundamental pillar in the architecture of secure optimal nonlinearity of 112 [15], might seem ideal, they are
communication systems [1–3]. At the core of many not without vulnerabilities. The presence of inherent
symmetric encryption algorithms lie Substitution boxes (S- algebraic structures in such S-boxes can create potential
boxes) [4], which play a pivotal role in establishing the weaknesses, making them susceptible to algebraic
nonlinear components essential for robust encryption [5, 6]. cryptanalysis [16–18]. This vulnerability underscores the
These S-boxes are critical in creating the confusion and need for randomly generated S-boxes that lack hidden
diffusion properties that Claude Shannon identified as algebraic structures, thereby enhancing resistance against
crucial for secure ciphers [7, 8]. sophisticated cryptanalytic techniques [19–21].
The cryptographic strength of an S-box is multifaceted, The generation of cryptographically robust S-boxes
encompassing several key indicators [9]. Nonlinearity, presents a significant computational challenge. The vast
which quantifies an S-box’s resistance to linear search space of possible configurations for 8×8 S-boxes is
cryptanalysis, stands as a primary measure. For 8×8 S-boxes, estimated at 28! (approximately 10506), renders exhaustive
commonly employed in modern ciphers, achieving a search methods impractical. This complexity has driven
nonlinearity of 104 represents a significant benchmark [10– research towards heuristic approaches for S-box generation
12]. However, other properties such as differential [22–24]. Methods such as simulated annealing, hill
uniformity, algebraic degree, and algebraic immunity also climbing, and genetic algorithms have shown promise in
navigating this expansive solution space efficiently [25].
CQPC-2024: Classic, Quantum, and Post-Quantum Cryptography, August 0000-0003-2331-6326 (O. Kuznetsov); 0000-0001-9386-2547
6, 2024, Kyiv, Ukraine (N. Poluyanenko); 0000-0002-8893-9244 (E. Frontoni); 0000-0003-1700-
∗ Corresponding author. 3075 (M. Arnesano); 0000-0001-9543-874X (O. Smirnov)
†
These authors contributed equally.
© 2024 Copyright for this paper by its authors. Use permitted under
oleksandr.kuznetsov@uniecampus.it (O. Kuznetsov); Creative Commons License Attribution 4.0 International (CC BY 4.0).
nlfsr01@gmail.com (N. Poluyanenko); emanuele.frontoni@unimc.it
(E. Frontoni); marco.arnesano@uniecampus.it (M. Arnesano);
dr.smirnovoa@gmail.com (O. Smirnov)
CEUR
Workshop
ceur-ws.org
ISSN 1613-0073
1
Proceedings
Recent advances in heuristic S-box generation have made high nonlinearity. Building on this work, Souravlias et al.
significant advances. Researchers have studied various cost (2017) [33] proposed an algorithm portfolio approach
functions, including the Walsh-Hadamard spectrum (WHS) combining simulated annealing and tabu search, showing
function [23, 26], the Picek cost function (PCF) [10], improved results under limited time budgets.
improved Walsh-Hadamard spectrum-based cost functions Genetic algorithms have also been explored for S-box
(WCF) [10, 27], and two new extended cost functions (ECF generation [24, 34]. Tesar (2010) [35] combined a genetic
and WCFS) [6, 28, 29] in conjunction with different search algorithm with a tree search method, generating 8×8 S-
algorithms [10, 22]. These efforts have progressively boxes with nonlinearity up to 104. Picek et al. (2016) [11]
reduced the computational cost of generating highly presented a novel cost function for evolving S-boxes,
nonlinear S-boxes, with some methods achieving the target achieving significant improvements in both speed and
nonlinearity of 104 in fewer than 100,000 iterations. quality of results compared to previous approaches.
Despite these advancements, there remains a gap in Ivanov et al. (2016a, 2016b) [36, 37] introduced an
understanding the full potential of genetic algorithms in this innovative approach using a modified immune algorithm
domain. While genetic approaches have been applied to S- combined with hill climbing, rapidly generating large sets
box generation, their performance in comparison to other of highly nonlinear bijective S-boxes. Their work
heuristic methods, particularly in terms of consistency and demonstrated the potential of hybrid approaches in S-box
efficiency in generating S-boxes with optimal cryptographic generation.
properties, remains an area ripe for exploration. Recent advancements have focused on improving
Our study aims to address this gap by presenting a specific cryptographic properties. Rodinko et al. (2017) [38]
comprehensive investigation into the application of genetic optimized a method for generating high nonlinear S-boxes,
algorithms for generating 8×8 S-boxes with a nonlinearity of achieving nonlinearity of 104, algebraic immunity of 3, and
104. We explore the synergy between genetic algorithms and 8-uniformity within reasonable computational time. Freyre
the WHS cost function, aiming to match or surpass the Echevarría and Martínez Díaz (2020) [27] proposed a new
efficiency of existing methods while leveraging the inherent cost function specifically designed to improve the
advantages of evolutionary approaches, such as adaptability nonlinearity of bijective S-boxes.
and the potential for parallelization. The importance of multiple cryptographic criteria has
The remainder of this paper is structured as follows: been emphasized in recent literature. Freyre-Echevarría et
Section 2 provides a comprehensive review of the literature, al. (2020) [10] introduced an external parameter-
detailing the evolution of S-box generation techniques and independent cost function for evolving bijective S-boxes,
the current state of the art. Section 3 offers a background on considering both nonlinearity and other important
S-boxes, their cryptographic properties, and the theoretical properties. Their work highlighted the need for balanced
foundations underpinning their design. Section 4 delineates optimization across multiple cryptographic criteria.
our methodology and experimental setup, including the More recent studies have explored novel approaches to
specifics of our genetic algorithm implementation and S-box generation. Artuğer and Özkaynak (2024) [39]
evaluation criteria. Section 5 presents our results and a proposed a post-processing approach to improve the
detailed discussion, comparing our findings with existing nonlinearity of chaos-based S-boxes, addressing a
methods and analyzing their implications. Finally, Section 6 longstanding challenge in this area. Haider et al. (2024) [40]
concludes the paper, summarizing our key findings and introduced an S-box generator based on elliptic curves,
outlining promising directions for future research in this offering a balance between randomization and optimization
critical area of cryptographic system design. with minimal computation time.
The application of S-boxes in specific cryptographic
2. Literature review contexts has also been a focus of recent research. Jamal et
al. (2024) [41] developed a region of interest-based medical
The design and generation of cryptographically strong S- image encryption technique using chaotic S-boxes,
boxes have been subjects of intensive research in the field demonstrating the practical applications of advanced S-box
of symmetric key cryptography. This section provides a designs in specialized domains.
comprehensive review of the existing literature, focusing on Emerging threats and the need for enhanced security
various approaches to S-box generation and their have led to new considerations in S-box design. Fahd et al.
cryptographic properties. (2024) [42] examined the reality of backdoored S-boxes,
Algebraic constructions of S-boxes, such as those based highlighting the importance of thorough cryptanalysis and
on finite field inversion used in the Advanced Encryption the potential vulnerabilities in S-box structures.
Standard (AES) [15, 30, 31], have been widely studied. In conclusion, the literature reveals a trend towards more
However, as Bard (2009) [16] and Courtois and Bard (2007) sophisticated, multi-criteria optimization approaches in S-box
[17] point out, these constructions may be vulnerable to generation. While significant progress has been made in
algebraic attacks due to their inherent mathematical achieving high nonlinearity and other desirable properties,
structure. This vulnerability has led to increased interest in there remains a need for methods that can consistently produce
generating S-boxes with more complex algebraic structures S-boxes with optimal cryptographic characteristics while
[32]. balancing computational efficiency and resistance to emerging
Heuristic approaches have gained significant traction in cryptanalytic techniques.
recent years. Clark et al. (2005) [26] introduced a simulated
annealing approach for S-box generation [23],
demonstrating its effectiveness in producing S-boxes with
2
3. Background The algebraic immunity can be computed by constructing
the minimal reduced Gröbner basis of the ideal I(S) using the
Symmetric cryptography forms the backbone of secure degree reverse lexicographic (degrevlex) ordering, and
communication in the digital age. At the heart of many finding the polynomial of minimum degree in this basis.
symmetric ciphers lie Substitution boxes (S-boxes), These properties collectively contribute to the S-box’s
nonlinear components crucial for ensuring the security and ability to resist various cryptanalytic attacks, including
robustness of these cryptographic systems. This section differential, linear, and algebraic cryptanalysis. The concept
provides a comprehensive overview of S-boxes, their role in of algebraic immunity for S-boxes, as introduced by Faugère
symmetric cryptography, and the application of genetic and Perret, provides a crucial measure of resistance against
algorithms in their optimization. algebraic attacks, which attempt to express the cipher as a
system of low-degree multivariate polynomial equations.
3.1. S-boxes in symmetric cryptography The relationship between the algebraic immunity of an
Substitution boxes (S-boxes) are fundamental components S-box and that of Boolean functions can be established
in symmetric-key algorithms, serving as the primary source through the following construction. Consider a Boolean
of nonlinearity [7, 8]. An S-box is essentially a vectorial function 𝑓 : 𝐹 → 𝐹 defined as [44, 45]:
Boolean function that maps a fixed number of input bits to f S ( x1 , x2 ,..., xn , y1 , y2 ,..., ym )
a fixed number of output bits. Formally, an n×m S-box can
be defined as [9]: 1, if i , j : fi ( x1 , x2 ,..., xn ) y j ;
S : F2n F2m , 0, if i, j : f i ( x1 , x2 ,..., xn ) y j .
where 𝐹 and 𝐹 are vector spaces over the Galois field The algebraic immunity of the S-box S is then equivalent
GF(2) with dimensions n and m, respectively. to the minimum degree of non-zero polynomials in the
The cryptographic strength of an S-box is determined annihilator of fS:
by several critical properties [10]: AI (S ) min deg ( g) | g Ann( f S ) .
1) Nonlinearity: A measure of the distance between the
S-box and the set of all affine functions. For an n×n S-box, This formulation provides a bridge between the
the nonlinearity is defined as: algebraic immunity of vectorial Boolean functions (S-boxes)
and that of single Boolean functions, unifying the concept
1 across different cryptographic primitives.
NL ( S ) 2 n 1 max
2 aF2n ,bF2n \ 0 xF2n
(1)bS ( x ) a x
, 3.2. Importance of S-boxes in modern
where denotes the dot product and ⊕ represents bitwise ciphers and the need for
XOR. randomness
2) Differential uniformity: Quantifies the uniformity of S-boxes play a pivotal role in ensuring the security of
output differences when the input is changed. The symmetric ciphers by introducing nonlinearity and
differential uniformity δ is given by: complexity into the encryption process [7]. They are
max | x F2n : S ( x) S ( x a) b | employed in widely-used algorithms such as the Advanced
a 0,b . Encryption Standard (AES) [15], where the SubBytes
3) Algebraic degree: The highest degree among the operation relies on a carefully designed 8×8 S-box.
component Boolean functions of S. For an n×m S-box, the However, the increasing sophistication of cryptanalytic
algebraic degree is: techniques has necessitated a reevaluation of traditional S-
box design methods.
deg (S ) max
m
deg (v S ) While algebraically constructed S-boxes, such as those
vF2 \0 . used in AES (based on finite field inverses) [30, 31], offer
4) Balancedness: An S-box is balanced if each output certain advantages in terms of implementation efficiency
occurs with equal probability when the input is uniformly and some cryptographic properties, they may fall short in
distributed. terms of algebraic immunity [43]. The structured nature of
5) Algebraic Immunity [43]: A measure of resistance these S-boxes can potentially lead to vulnerabilities against
against algebraic attacks. For an S-box 𝑆: 𝐹 → 𝐹 , the algebraic attacks, which have gained significant attention in
algebraic immunity is defined as: recent years [16, 17].
AI ( S ) min deg( P), P I ( S ) , Algebraic attacks exploit the possibility of expressing
the cipher as a system of low-degree multivariate
where I(S) is the ideal generated by the polynomials polynomial equations [17, 18]. The complexity of solving
representing the S-box: such systems is closely related to the algebraic immunity of
y1 f1 ( x1 , x2 ,..., xn ), the S-box [43]. A low algebraic immunity allows for a
simpler representation of the cipher, potentially reducing
y2 f 2 ( x1 , x2 ,..., xn ), the computational effort required for cryptanalysis [44, 45].
I (S )
..., This vulnerability has prompted researchers to explore
alternative methods for S-box generation that prioritize
ym f m ( x1 , x2 ,..., xn ) . high algebraic immunity alongside other critical properties.
3
To address these concerns, there is growing interest in the offer a powerful and flexible approach to optimizing
cryptographic community in random or pseudo-random S- multiple cryptographic properties simultaneously
boxes [24, 46, 47]. These S-boxes, generated through [10, 37, 54].
heuristic methods, offer several advantages: The fundamental principle of GAs is to emulate the
process of natural selection, where the fittest individuals are
Higher algebraic immunity: Random S-boxes are more likely to survive and reproduce, passing their
less likely to exhibit algebraic structures that can beneficial traits to future generations [52, 53]. In the case of
be exploited in attacks, potentially leading to S-box generation, an “individual” represents a candidate S-
higher algebraic immunity values. box, and its “fitness” is determined by how well it satisfies
Resistance to specialized attacks: Algebraically the desired cryptographic properties.
constructed S-boxes might be vulnerable to attacks The basic structure of a GA includes the following
tailored to their specific structure. Random S- components [54, 55]:
boxes, lacking such predictable structures, can
offer better protection against these targeted Chromosome representation: Encoding of
attacks. potential solutions (S-boxes).
Flexibility in design: Heuristic methods allow for Fitness function: Evaluates the quality of solutions
the optimization of multiple cryptographic criteria based on cryptographic criteria.
simultaneously, enabling a more balanced Selection mechanism: Chooses individuals for
approach to S-box design. reproduction.
Genetic operators: Crossover and mutation to
Adaptability to evolving threat models: As new create new solutions.
cryptanalytic techniques emerge, the criteria for S-box Termination criteria: Conditions for ending the
generation can be adjusted more easily with heuristic evolutionary process.
methods compared to algebraic constructions.
Various heuristic approaches have been proposed for A general pseudocode for a Genetic Algorithm applied
generating high-quality random S-boxes, including: to S-box generation can be described as follows:
Simulated Annealing [23, 26, 33, 48]: This method Algorithm: Genetic algorithm for S-box generation
mimics the physical process of annealing in Input: Population size N, number of generations G,
metallurgy, gradually “cooling” the system to find crossover rate pc, mutation rate pm;
an optimal configuration. It has shown promise in Output: Optimized S-box;
generating S-boxes with good cryptographic
properties. 1. Initialize population P of N random S-boxes
Hill Climbing [6, 10, 36, 49, 50]: A local search 2. For g = 1 to G do
algorithm that iteratively makes small 3. Evaluate the fitness of each S-box in P
improvements to a candidate solution. This 4. Select parents for reproduction using tournament
approach can be effective in fine-tuning S-box selection
properties. 5. Create new population P’ through crossover and
Genetic Algorithms [10, 35, 37, 51]: Evolutionary mutation:
approaches that mimic natural selection to evolve 6. For i = 1 to N/2 do
a population of S-boxes towards desired 7. Select two parents p1 and p2 from P
properties. These algorithms have demonstrated 8. If random (0,1) < pc then
the ability to generate S-boxes with excellent 9. (c1, c2) = Crossover(p1, p2)
cryptographic characteristics, including high 10. Else
algebraic immunity. 11. (c1, c2) = (p1, p2)
12. End If
In this work, we focus on genetic algorithms due to their 13. Mutate c1 and c2 with probability pm
ability to efficiently explore large search spaces and handle 14. Add c1 and c2 to P’
multi-objective optimization problems. Genetic algorithms 15. End For
offer a promising approach to generating S-boxes that 16. P = P’
balance multiple cryptographic criteria, including high 17. End For
algebraic immunity, nonlinearity, and differential 18. Return the best S-box from P
uniformity.
Key parameters and their roles:
3.3. Genetic algorithms for S-box generation
Genetic Algorithms (GAs) are stochastic optimization Population size (N): Determines the diversity of
techniques inspired by the principles of natural selection solutions. A larger population allows for broader
and evolution [52, 53]. They operate on a population of exploration of the search space but increases
potential solutions, evolving them over successive computational cost.
generations to improve their fitness concerning a defined Number of generations (G): Controls the duration
objective function. In the context of S-box generation, GAs of the evolutionary process. More generations
4
allow for further optimization but may lead to Algorithm: Modified genetic algorithm for S-box
overfitting. generation
Crossover rate (pc): Probability of performing Input: Spop, Kiter, Kpop, Kmut
crossover. Higher rates promote the exploration of Output: Optimized S-box or 0 (failure)
new solution combinations.
Mutation rate (pm): Probability of mutating each 1. For t = 0 to Kiter - 1 do
bit in a chromosome. Higher rates increase 2. Spop = elite_selection(Spop)
diversity but may disrupt good solutions. 3. For p = 0 to Kpop - 1 do
4. S ← Spop[p]
The fitness function is crucial in guiding the 5. For k = 0 to Kmut - 1 do
evolutionary process towards S-boxes with desired 6. S’ ← S
cryptographic properties. 7. i ← random(0, 255)
The selection mechanism, often implemented as 8. j ← random(0, 255)
tournament selection, ensures that fitter individuals have a 9. swap(S’[i], S’[j])
higher chance of being chosen for reproduction. This 10. Nf, Fc ← evaluate(S’)
process mimics natural selection, where more adapted 11. If Nf ≥ 104 then
individuals are more likely to pass on their genes. 12. Return S’
Crossover operators for S-boxes must be carefully 13. Spop = Spop ∪ {S’}
designed to preserve the bijective property. One approach 14. End For
is to use a permutation-based crossover, where segments of 15. End For
the S-box permutation are exchanged between parents. For 16. End For
example, given two parent S-boxes P1 and P2, a two-point 17. Return 0
crossover might produce offspring C1 and C2 as follows:
Key components and parameters of the algorithm:
P1 (a1 , a2 ,..., ak | ak 1 ,..., al | al 1 ,..., an ) ;
Spop: The population of S-boxes, initially generated
P2 (b1 , b2 ,..., bk | bk 1 ,..., bl | bl 1 ,..., bn ) ; using the Fisher-Yates shuffle algorithm to ensure
bijectivity.
C1 (a1 , a2 ,..., ak | bk 1 ,..., bl | al 1 ,..., an ) ; Kiter: Maximum number of iterations, set to 150,000
in our experiments.
C2 (b1 , b2 ,..., bk | ak 1 ,..., al | bl 1 ,..., bn ) .
Kpop: Population size, representing the number of
Mutation operators introduce small random changes to elite S-boxes maintained in each generation.
maintain genetic diversity and prevent premature Kmut: Number of mutations applied to each S-box
convergence. For S-boxes, this might involve swapping two in the population per generation.
randomly chosen elements or applying a random
permutation to a subset of elements. The elite selection function performs a crucial role in
our algorithm. It ranks the S-boxes based on their
4. Modified genetic algorithm nonlinearity and objective function values, prioritizing
higher nonlinearity and lower objective function values.
Our research focuses on developing and implementing a This function ensures that only the top Kpop S-boxes survive
modified genetic algorithm for generating to the next generation, maintaining a high-quality
cryptographically strong S-boxes. This section details our population.
approach, the algorithm’s structure, and the experimental
setup used to evaluate its performance. 4.2. Mutation operator
4.1. Modified genetic algorithm overview Our mutation operator is designed to preserve the
bijectivity of the S-box while introducing controlled
We have developed a modified genetic algorithm that randomness. It operates by swapping two randomly
incorporates elements of hill climbing, enhancing its ability selected (distinct) elements within the S-box. This approach
to navigate the complex search space of S-box ensures that the fundamental property of bijectivity is
configurations. This approach allows for a more targeted maintained throughout the evolutionary process.
exploration of promising regions while maintaining the Formally, the mutation can be described as:
population-based nature of genetic algorithms.
The core idea of our algorithm is to maintain a S [i ] S [ j ], S [ j ] S [ i ] ,
population of S-boxes, subject them to controlled mutations, where 𝑖, 𝑗 ∈ 0,1, … , 255, 𝑖 ≠ 𝑗 and all other elements
evaluate their cryptographic properties, and selectively remain unchanged.
propagate the best specimens to subsequent generations.
This process is iterated until either an S-box meeting the 4.3. Objective function
desired criteria is found or a predefined computational limit
is reached. The pseudocode for our modified genetic The choice of objective function is critical in guiding the
algorithm is: evolutionary process towards cryptographically strong S-
boxes. We employ the WHS function proposed by Clark et
5
al. [26], which has shown effectiveness in generating high- For each combination of Kpop and Kmut, we performed 100
quality S-boxes. The WHS function is defined as [26]: independent runs of the search algorithm to ensure
255 255 statistical significance. This resulted in a total of
| WHT [b, i ] | X
R
WHS 11×11×100 = 12,100 experimental runs.
b 1 i 0 , The algorithm was set to terminate upon finding an S-
where WHT[b,i] represents the Walsh-Hadamard box with nonlinearity ≥ 104 or reaching the maximum
transform coefficients; i iterate over all component iteration limit of 150,000. For each run, we recorded the
functions and their linear combinations; b iterates over all number of S-boxes generated and evaluated (KSbox), which
linear functions; X and R are real-valued parameters. serves as our primary metric for computational efficiency.
Based on empirical studies, we set R = 12 and X = 0,
which has been shown to yield optimal results in generating 5. Results and discussion
bijective S-boxes with high nonlinearity [56, 57]. This section presents the results of our comprehensive
experimental study on the modified genetic algorithm for S-
4.4. Evaluation criteria box generation. We analyze the performance of the
The primary criteria for evaluating the generated S-boxes algorithm across various parameter configurations and
are: discuss the implications of our findings.
Nonlinearity (NL): We aim for a nonlinearity of at 5.1. Overview of experimental results
least 104, which is close to the theoretical
Our primary metric for evaluating the algorithm’s
maximum for 8×8 S-boxes.
efficiency is KSbox, which represents the number of S-boxes
Differential uniformity (δ): Lower values indicate generated and evaluated before finding an S-box with the
better resistance against differential cryptanalysis. desired nonlinearity of 104. Table 1 presents the average
Algebraic degree (deg): Higher degrees provide KSbox values for different combinations of population size
better resistance against algebraic attacks. (Kpop) and mutation rate (Kmut).
Algebraic immunity (AI): Higher values indicate
increased resistance to algebraic cryptanalysis. 5.2. Analysis of population size impact
The evaluate function in our algorithm computes these One of the most striking observations from our results is the
properties for each generated S-box, allowing us to assess superior performance of the algorithm when Kpop = 1. This
its cryptographic strength comprehensively. configuration consistently yielded the lowest KSbox values
across all mutation rates, with averages ranging from 49,277 to
4.5. Experimental setup 58,213. This finding is somewhat counterintuitive, as genetic
algorithms typically benefit from larger population sizes that
Our experiments were conducted on a high-performance provide greater genetic diversity.
computing cluster to handle the computational intensity of The effectiveness of a single-individual population
the S-box generation process. The implementation was done suggests that our algorithm’s behavior in this configuration
in C++ for efficiency, with parallelization to utilize multiple closely resembles that of a stochastic hill-climbing method. This
cores. approach appears to be particularly well-suited to the S-box
Given that the calculation of the objective function is optimization problem, possibly due to the following factors:
the most computationally expensive operation in terms of Landscape structure: The fitness landscape of S-box
processor time, the complexity of the entire search configurations may have numerous local optima that are
algorithm can be considered proportional to the number of relatively close in quality to the global optimum. In such a
times the objective function is calculated. This corresponds scenario, an aggressive local search can be highly effective.
to the number of S-boxes that were generated and Mutation operator efficiency: Our swap-based
evaluated. We denote this quantity as KSbox. mutation operator appears to be sufficiently powerful to
To accelerate the algorithm’s performance, we navigate the search space effectively, even without the
implemented parallel computation of the new population diversity typically provided by a larger population.
using Nthread = 8 threads within each iteration. This Reduced computational overhead: With Kpop = 1, the
parallelization significantly reduced the overall execution algorithm avoids the computational cost associated with
time of the algorithm. managing and evaluating a large population, allowing for
We conducted a comprehensive parameter sweep to more iterations within the same computational budget.
analyze the impact of population size and mutation rate on
the quality of the generated S-boxes and the algorithm's 5.3. Impact of mutation rate
convergence rate. Specifically:
While the population size shows a clear trend, the impact of
Population size (Kpop) was varied from 1 to 21 with the mutation rate (Kmut) is more nuanced. For Kpop = 1, we
a step size of 2. observe that:
The mutation rate (Kmut) was varied from 1 to 31 The lowest KSbox (49,277) was achieved with Kmut = 7.
with a step size of 3. Performance generally degraded with higher mutation
rates, with KSbox increasing to 58,213 at Kmut = 1.
6
This pattern suggests that there exists an optimal Slower convergence: Diversity maintenance in
balance between exploration and exploitation in the search larger populations may slow down the
process. Lower mutation rates may lead to premature convergence to high-quality solutions.
convergence, while higher rates may disrupt good solutions
too frequently. However, it’s worth noting that larger populations
might offer benefits not captured by the KSbox metric alone,
5.4. Scalability and computational such as increased robustness or the ability to find a more
efficiency diverse set of high-quality S-boxes.
As Kpop increases, we observe a general trend of increasing 5.5. Parallelization performance
KSbox values, indicating reduced computational efficiency.
This scaling behavior can be attributed to: Our implementation of parallel computation using 8 threads
(Nthread = 8) has proven to be effective in accelerating the
Increased evaluation overhead: Larger populations search process. This parallelization strategy is particularly
require more objective function evaluations per beneficial for configurations with larger Kpop and Kmut
generation. values, where the workload can be more evenly distributed
across threads.
Table 1
The average number of S-boxes generated (KSbox) before finding an S-box with Nf = 104
Kpop
Kmut
1 3 5 7 9 11 13 15 17 19 21
1 58,213 65,942 72,830 86,642 101,726 111,990 112,718 125,113 132,806 140,336 149,339
4 56,067 64,863 75,069 89,598 94,726 105,925 122,364 137,003 136,740 151,874 163,291
7 49,277 67,198 77,848 88,353 103,154 109,618 122,382 130,901 142,463 144,601 165,918
10 54,636 65,723 82,198 92,542 102,797 114,163 129,411 137,442 147,416 161,020 165,672
13 56,042 62,660 83,216 94,538 101,073 117,611 124,466 135,244 152,048 158,696 171,756
16 56,010 68,711 79,645 93,134 107,371 120,567 125,274 140,817 150,494 155,049 169,462
19 56,532 65,910 82,883 92,911 105,144 117,877 129,718 142,017 155,463 164,902 175,531
22 54,775 67,236 77,663 92,874 105,559 120,992 131,029 140,772 156,224 162,808 176,669
25 50,066 70,394 79,596 98,967 115,462 118,406 135,294 144,708 157,321 177,621 183,087
28 54,203 70,453 82,200 91,841 108,783 121,683 133,665 152,984 159,887 176,751 181,781
31 53,709 71,581 91,827 101,536 109,625 126,616 143,233 156,987 160,573 183,069 192,493
5.6. Comparison with existing methods cryptographers and security researchers. This
diversity in high-performing methods enhances
The best-performing configuration of our algorithm the robustness of S-box generation techniques.
(Kpop = 1, Kmut = 7) achieves an average KSbox of 49,277. To
Consistency and Reliability: Like our previous best
contextualize our findings within the broader landscape of
results, the genetic algorithm maintains a 100%
S-box generation research, we conducted a comprehensive
success rate in generating target S-boxes with
comparison of our genetic algorithm approach with existing
nonlinearity 104. This level of reliability is crucial
methods. Table 2 presents this comparative analysis,
for practical applications in cryptographic system
encompassing various techniques and cost functions
design.
employed in the field.
Efficiency Across Methods: The similarity in
Our genetic algorithm implementation, utilizing the WHS
performance between our genetic algorithm and
cost function, achieves results that are on par with the best-
hill climbing approaches (49,399 vs. 50,000
known methods in the field. Specifically, our approach
iterations) suggests that we may be approaching
generates S-boxes with a nonlinearity of 104 in an average of
theoretical limits of efficiency for generating S-
49,399 iterations, with a 100% success rate. This performance
boxes with these properties using heuristic
is comparable to our previous works using hill climbing
methods.
[6, 28], which required 50,000 iterations on average.
Several key observations emerge from this comparative Progress from Earlier Genetic Approaches: Compared to
analysis: earlier genetic algorithm implementations [11, 35], our
method shows substantial improvement, reducing the
Parity in Performance: Our genetic algorithm
required iterations by orders of magnitude while achieving
achieves results that are statistically equivalent to
higher nonlinearity.
the best-known methods, particularly our earlier
The achievement of parity with the best-known results
hill-climbing approach. This parity is significant,
using a genetic algorithm is particularly noteworthy and
as it demonstrates the versatility and potential of
underscores the potential of evolutionary approaches in
genetic algorithms in this domain.
cryptographic primitive generation.
Algorithmic Diversity: By achieving comparable
results through a different algorithmic approach,
we have expanded the toolkit available to
7
5.7. Practical Implications computational resources, as it doesn’t require
maintaining a large population.
The superior performance of the Kpop = 1 configuration has
Simplicity: The simplified population management
important implications for the practical application of our
makes the algorithm easier to implement and tune.
algorithm:
Adaptability: The algorithm’s efficiency makes it
Resource efficiency: The algorithm can be suitable for scenarios where S-boxes need to be
effectively run on systems with limited generated or updated frequently.
Table 2
Comparison of S-box Generation Methods
Method Cost Function Algorithm NL Success Rate, % Avg. Iterations
[23,26] WHS SA 102 0.5 -
[23] WHS SA 104 - 30,000,000
[35] WHS HC 100 - 2,500
[35] WHS GaT 104 - 3,239,000
[11] WHS Ga 102 - 28,200
[11] WHS GaT 104 - 3,849,881
[11] WHS LSA 102 - 6,701
[11] PCF Ga 104 - 741,371
[11] PCF GaT 104 - 167,451
[11] PCF LSA 104 - 172,280
[27] WCF LSA 104 - 89,460
[27] WCF HC 104 37 65,933
[58] WHS SA 104 56.4 450,000
[48] WCF SA 104 100 65,000
[48,59] ECF SA 104 100 55,000 … 83,000
[49] WHS HC 104 100 50,000
[6,28] WCFS HC 104 100 50,000
Our work WHS Ga 104 100 49,399
However, it’s important to note that while this valuable insights to the field of cryptographic primitive
configuration is optimal for finding a single high-quality design and offer a powerful tool for the development of
S-box, alternative configurations may be more suitable secure symmetric encryption systems.
for generating a diverse set of S-boxes or for multi-
objective optimization scenarios. 6. Conclusions
5.8. Limitations and future work This study presents a significant advancement in the
field of S-box generation for symmetric key
While our results are promising, several avenues for cryptography, focusing on the application of genetic
future research remain: algorithms to produce highly nonlinear substitutions.
Our research demonstrates that genetic algorithms,
Extended cryptographic criteria: Incorporate when properly optimized and combined with the Walsh-
additional criteria such as algebraic immunity Hadamard Spectrum (WHS) cost function, can achieve
and differential uniformity into the objective performance parity with the best-known methods in
function. generating 8×8 S-boxes with a nonlinearity of 104.
Adaptive parameter tuning: Develop methods Key findings of our work include:
to dynamically adjust Kpop and Kmut during the
search process. The genetic algorithm approach achieves an
Alternative mutation operators: Explore more average of 49,399 iterations to generate target
sophisticated mutation strategies that leverage S-boxes, comparable to the best results of
domain-specific knowledge about S-box 50,000 iterations using hill-climbing methods.
structures. A 100% success rate in producing S-boxes with
Multi-objective optimization: Extend the the desired nonlinearity, matching the reliability
algorithm to simultaneously optimize multiple of top-performing techniques.
cryptographic properties, potentially using A significant improvement over earlier genetic
Pareto-based approaches. algorithm implementations, reducing iteration
counts by orders of magnitude.
In conclusion, our modified genetic algorithm
demonstrates exceptional efficiency in generating The achievement of performance parity using a different
cryptographically strong S-boxes, particularly in its hill- algorithmic approach expands the toolkit available to
climbing-like configuration. These findings contribute cryptographers and highlights the versatility of genetic
8
methods in cryptographic primitive generation. This [9] C. Carlet, Vectorial Boolean Functions for
diversity in high-performing techniques enhances the Cryptography, Boolean Models and Methods in
robustness of S-box generation methodologies. Mathematics, Computer Science, and Engineering
Furthermore, our results underscore the potential of (2006).
genetic algorithms in this domain, particularly their [10] A. Freyre-Echevarría, et al., An External Parameter
adaptability to evolving cryptographic criteria and their Independent Novel Cost Function for Evolving
inherent parallelization capabilities. These Bijective Substitution-Boxes, Symmetry 12 (2020)
characteristics position genetic approaches as promising 1896. doi: 10.3390/sym12111896.
avenues for future research, potentially leading to more [11] S. Picek, M. Cupic, L. Rotim, A New Cost Function
efficient, flexible, and innovative S-box generation for Evolution of S-Boxes, Evolut. Computation 24
techniques. (2016) 695–718. doi: 10.1162/EVCO_a_00191.
In conclusion, while not surpassing existing [12] J. Álvarez-Cubero, Vector Boolean Functions:
methods in raw performance, our genetic algorithm Applications in Symmetric Cryptography (2015).
approach offers a valuable alternative that matches the doi: 10.13140/RG. 2.2.12540.23685.
best-known results. This equivalence, coupled with the [13] K. Lisitskiy, I. Lisitska, A. Kuznetsov,
unique advantages of genetic algorithms, opens new Cryptographically Properties of Random S-Boxes.,
perspectives in cryptographic research and 16th International Conference on ICT in Education,
development. Future work should focus on exploiting Research and Industrial Applications. Integration,
these advantages, potentially through hybridization Harmonization and Knowledge Transfer II, vol.
with other heuristic methods or by leveraging parallel 2732 (2020) 228–241.
computing architectures to further enhance S-box [14] I. Gorbenko, et al., Random S-Boxes Generation
generation efficiency. Methods for Symmetric Cryptography, IEEE 2nd
Ukraine Conference on Electrical and Computer
References Engineering (UKRCON) (2019) 947–950. doi:
10.1109/UKRCON.2019.8879962.
[1] Y. Li, et al., HDLBC: A Lightweight Block Cipher [15] J. Daemen, V. Rijmen, Specification of Rijndael,
with High Diffusion, Integr. 94 (2024) 102090. doi: The Design of Rijndael: The Advanced Encryption
10.1016/j.vlsi.2023. 102090. Standard (AES), Springer (2020) 31–51. doi:
[2] A. Tiwari, Chapter 14—Cryptography in 10.1007/978-3-662-60769-5_3.
Blockchain, Distributed Computing to Blockchain, [16] G. V. Bard, Algebraic Cryptanalysis, Springer US
Academic Press (2023) 251–265. doi: 10.1016/B978- (2009). doi: 10.1007/978-0-387-88757-9.
0-323-96146-2.00011-5. [17] N. T. Courtois, G. V. Bard, Algebraic Cryptanalysis
[3] M. Milanič, B. Servatius, H. Servatius, Chapter 8 - of the Data Encryption Standard, Cryptography
Codes and cyphers, Discrete Mathematics With and Coding, LNSC 4887 (2007) 152–169. doi:
Logic, Academic Press (2024) 163–179. doi: 10.1007/978-3-540-77272-9_10.
10.1016/B978-0-44-318782-7.00013-7. [18] N. T. Courtois, J. Pieprzyk, Cryptanalysis of Block
[4] Y. Sovyn, et al., Minimization of Bitsliced Ciphers with Overdefined Systems of Equations,
Representation of 4×4 S-Boxes based on Ternary Advances in Cryptology—ASIACRYPT 2002,
Logic Instruction, in Cybersecurity Providing in LNCS 2501 (2002) 267–287. doi: 10.1007/3-540-
Information and Telecom-munication Systems, 36178-2_17.
vol. 3421 (2023) 12–24. [19] V. Buhas, et al., Using Machine Learning
[5] A. S. Changle, S. P. Metkar, R. K. Patole, Techniques to Increase the Effectiveness of
Implementation of S-box for Lightweight Block Cybersecurity, in: Cybersecurity Providing in
Cipher, 3rd International Conference on Intelligent Information and Telecommunication Systems, vol.
Technologies (2023) 1–4. doi: 3188, no. 2 (2021) 273–281.
10.1109/CONIT59222. 2023.10205535. [20] V. Buriachok, et al., Invasion Detection Model
[6] A. Kuznetsov, et al., A New Cost Function for using Two-Stage Criterion of Detection of
Heuristic Search of Nonlinear Substitutions, Network Anomalies, in: Cybersecurity Providing
Expert Syst. Appls. 237 (2024). doi: in Information and Telecommunication Systems,
10.1016/j.eswa.2023.121684. vol. 2746 (2020) 23–32.
[7] A. J. Menezes, et al., CRC Press (2018). doi: [21] V. Sokolov, P. Skladannyi, H. Hulak, Stability
10.1201/9780429466335. Verification of Self-Organized Wireless Networks
[8] C. E. Shannon, Communication Theory of Secrecy with Block Encryption, in: 5th International
Systems, Bell Syst. Tech. J. 28 (1949) 656–715. doi: Workshop on Computer Modeling and Intelligent
10.1002/j.1538-7305.1949.tb00928.x. Systems, vol. 3137 (2022) 227–237.
9
[22] A. Freyre Echevarría, Evolución Híbrida de S-cajas http://dspace.lib.vutbr.cz/xmlui/handle/11012/569
no Lineales Resistentes a Ataques de Potencia 57
(2020). doi: 10.13140/RG.2.2.17037.77284/1. [36] G. Ivanov, N. Nikolov, S. Nikova,
[23] J. McLaughlin, Applications of Search Techniques Cryptographically Strong S-Boxes Generated by
to Cryptanalysis and the Construction of Cipher Modified Immune Algorithm, Cryptography and
Components, PhD, University of York (2012). URL: Information Security in the Balkans, LNSC 9540
https://etheses.whiterose.ac.uk/3674/ (2016) 31–42. doi: 10.1007/978-3-319-29172-7_3.
[24] L. D. Burnett, Heuristic Optimization of Boolean [37] G. Ivanov, N. Nikolov, S. Nikova, Reversed Genetic
Functions and Substitution Boxes for Algorithms for Generation of Bijective S-boxes
Cryptography, PhD, Queensland University of with Good Cryptographic Properties, Cryptogr.
Technology, (2005). URL: Commun. 8 (2016) 247–276. doi: 10.1007/s12095-
https://eprints.qut.edu.au/16023/ 015-0170-5.
[25] A. Kuznetsov, et al., SBGen: A High-performance [38] M. Rodinko, R. Oliynykov, Y. Gorbenko,
Library for Rapid Generation of Cryptographic S- Optimization of the High Nonlinear S-Boxes
boxes, SoftwareX 27 (2024) 101788. doi: Generation Method, Tatra Mountains Math. Publ.
10.1016/j.softx.2024.101788. 70 (2017) 93–105. doi: 10.1515/tmmp-2017-0020.
[26] J. A. Clark, J. L. Jacob, S. Stepney, The Design of S- [39] F. Artuğer, F. Özkaynak, A New Post-Processing
boxes by Simulated Annealing, New Gener. Approach for Improvement of Nonlinearity
Comput. 23 (2005) 219–231. doi: 10.1007/BF03037 Property in Substitution Boxes, Integration 94
656. (2024) 102105. doi: 10.1016/j.vlsi.2023.102105.
[27] A. Freyre Echevarría, I. Martínez Díaz, A New Cost [40] T. Haider, N. A. Azam, U. Hayat, Substitution Box
Function to Improve Nonlinearity of Bijective S- Generator with Enhanced Cryptographic
boxes (2020). Properties and Minimal Computation Time,
[28] O. Kuznetsov, et al., Enhancing Smart Expert Syst. Appl. 241 (2024) 122779. doi:
Communication Security: A Novel Cost Function 10.1016/j.eswa.2023.122779.
for Efficient S-Box Generation in Symmetric Key [41] S. S. Jamal, et al., Region of Interest-Based Medical
Cryptography, Cryptography 8 (2024) 17. doi: Image Encryption Technique Based on Chaotic S-
10.3390/cryptography8020017. boxes, Expert Syst. Appl. 238 (2024) 122030. doi:
[29] O. Kuznetsov, et al., Enhancing Cryptographic 10.1016/j.eswa.2023.122030.
Primitives through Dynamic Cost Function [42] S. Fahd, et al., The Reality of Backdoored S-Boxes—
Optimization in Heuristic Search, Electronics 13 An Eye Opener, J. Inf. Secur. Appls. 80 (2024)
(2024) 1825. doi: 10.3390/electronics13101825. 103674. doi: 10.1016/j.jisa.2023.103674.
[30] K. Nyberg, Perfect Nonlinear S-boxes, Advances in [43] G. Ars, J.-C. Faugère, Algebraic Immunities of
Cryptology — EUROCRYPT ’91, LNCS 547 (1991) Functions Over Finite Fields, INRIA, (2005). URL:
378–386. doi: 10.1007/3-540-46416-6_32. https://hal.inria.fr/inria-00070475
[31] K. Nyberg, Differentially uniform mappings for [44] O. O. Kuznetsov, et al., Algebraic Immunity of
cryptography, Advances in Cryptology— Non-linear Blocks of Symmetric Ciphers,
EUROCRYPT ’93, LNCS 765 (1994) 55–64. doi: Telecommun. Radio Eng. 77 (2018) 309–325. doi:
10.1007/3-540-48285-7_6. 10.1615/TelecomRadEng.v77.i4.30.
[32] R. La Scala, S. K. Tiwari, Stream/Block Ciphers, [45] A. Kuznetsov, et al., Evaluation of Algebraic
Difference Equations and Algebraic Attacks, J. Immunity of modern block ciphers, IEEE Int. Conf.
Symbolic Comput. 109 (2022) 177–198. doi: Dependable Syst., Serv. Technol. (2018) 288–293.
10.1016/j.jsc.2021.09.001. doi: 10.1109/DESSERT.2018.8409146.
[33] D. Souravlias, K. Parsopoulos, G. Meletiou, [46] K. Lisitskiy, I. Lisitska, A. Kuznetsov,
Designing Bijective S-boxes Using Algorithm Cryptographically Properties of Random S-boxes,
Portfolios with Limited Time Budgets, Appl. Soft in: ICT in Education, Research and Industrial
Comput. 59 (2017) 475–486. doi: Applications. Integration, Harmonization and
10.1016/j.asoc.2017.05. 052. Knowledge Transfer, vol. 2732 (2020) 228–241.
[34] W. Millan, et al., Evolutionary Heuristics for [47] I. Gorbenko, et al., Random S-boxes Generation
Finding Cryptographically Strong S-Boxes, Methods for Symmetric Cryptography, IEEE 2nd
Information and Communication Security, LNCS Ukraine Conference on Electrical and Computer
1726 (1999) 263–274. doi: 10.1007/978-3-540-47942- Engineering, (2019) 947–950. doi:
0_22. 10.1109/ukrcon.2019.8879962.
[35] P. Tesar, A New Method for Generating High Non- [48] A. Kuznetsov, et al., Optimized Simulated
linearity S-Boxes, (2010). URL: Annealing for Efficient Generation of Highly
10
Nonlinear S-boxes, Soft. Comput. (2023). doi:
10.1007/s00500-023-09334-y.
[49] A. Kuznetsov, et al., Optimizing Hill Climbing
Algorithm for S-Boxes Generation, Electronics 12
(2023) 2338. doi: 10.3390/electronics12102338.
[50] A. Freyre-Echevarría, et al., Evolving Nonlinear S-
Boxes With Improved Theoretical Resilience to
Power Attacks, IEEE Access 8 (2020) 202728–
202737. doi: 10.1109/ACCESS.2020.3035163.
[51] E.C. Laskari, G.C. Meletiou, M.N. Vrahatis,
Utilizing Evolutionary Computation Methods for
the Design of S-Boxes, International Conference
on Computational Intelligence and Security, (2006)
1299–1302. doi: 10.1109/ICCIAS.2006.295267.
[52] A. Ghosh, S. Das, B. Saha, Chapter 6—Nature-
inspired Optimization Algorithms, Artificial
Intelligence in Textile Engineering, Woodhead
Publishing (2024) 171–231. doi: 10.1016/B978-0-
443-15395-2.00002-8.
[53] C.-W. Tsai, M.-C. Chiang, Chapter Seven—Genetic
algorithm, Handbook of Metaheuristic
Algorithms, Academic Press (2023) 111–138. doi:
10.1016/B978-0-44-319108-4.00020-4.
[54] T. Kapuściński, R. K. Nowicki, C. Napoli,
Application of Genetic Algorithms in the
Construction of Invertible Substitution Boxes,
Artificial Intelligence and Soft Computing, LNAI
9692 (2016) 380–391. doi: 10.1007/978-3-319-39378-
0_33.
[55] C.-W. Tsai, M.-C. Chiang, Chapter Sixteen—Local
Search Algorithm, Handbook of Metaheuristic
Algorithms, Academic Press (2023) 351–374. doi:
10.1016/B978-0-44-319108-4.00030-7.
[56] A. Kuznetsov, et al., WHS Cost Function for
Generating S-boxes, IEEE 8th International
Conference on Problems of Infocommunications,
Science and Technology (2021) 434–438. doi:
10.1109/PICST54195.
2021.9772133.
[57] A. Kuznetsov, et al., Opportunities to Minimize
Hardware and Software Costs for Implementing
Boolean Functions in Stream Ciphers, Int. J.
Comput. 18 (2019) 443–452.
[58] A. Kuznetsov, et al., Optimization of a Simulated
Annealing Algorithm for S-Boxes Generating,
Sensors 22 (2022) 6073. doi: 10.3390/s22166073.
[59] A. Kuznetsov, et al., Generation of Nonlinear
Substitutions by Simulated Annealing Algorithm,
Inf. 14 (2023) 259. doi: 10.3390/info14050259.
11