An extension of the class of Boolean functions used in symmetric cipher algorithms Svetlana Korabel’shchikova Northern (Arctic) Federal University named after M.V. Lomonosov Arkhangelsk, Russia s.korabelsschikova@narfu.ru Abstract—One of the standard cryptographic 128-bit k1 transformations is the addition modulo two of an open binary .. text with a key binary sequence. In this paper, we have k1 open data . block F obtained a description of all Boolean functions from n D(a) kn arguments that are suitable for use in cryptographic transformations instead of the modulo-two addition function (we will call them component-by-component Boolean Permutations Permutations functions). An example of their use in the cryptographic conversion algorithm GOST R 34.12-2015 is also given. The S(а) and L(а) S(а) and L(а) article proposes an algorithm for generating component-by- k2 component Boolean functions from n variables for different .. k2 . values of n and k, where k is the number of the variable whose F value the function returns. For the case n=3 and k=1 or k=2, all kn+1 Boolean functions that replace the addition function modulo 2 are represented. The paper proposes an encryption method Permutations Nine Permutations based on component-by-component Boolean functions and a S(а) and L(а) pseudo-random sequence generator of elements from the rounds of S(а) and L(а) GF(2n-1) field. Using Boolean functions that return the value of cryptogra- one of the arguments when repeated, expands the variety of . phic .. intermediate variants of round transformations, gives a new .. transforma . k9 variable encryption method, ultimately significantly increasing tions .. the cryptographic strength of the cipher. . k9 F kn+ Keywords—boolean functions, encryption, symmetric cipher, 8 GOST R 34.12-2015 Permutations Permutations I. INTRODUCTION S(а) and L(а) S(а) and L(а) To date, many works on cryptography have been devoted to 128-bit k10 Boolean functions, the study of certain cryptographic .. k10 encrypted . properties of Boolean functions, as well as the possibilities text block F of their use in order to protect information. These issues are kn+9 E(a) ciphertext discussed both in scientific articles and in a number of textbooks for Universities, for example, [1, 2]. For the most Fig. 1. Modification of the encryption algorithm from GOST R 34.12- 2015 using component-wise functions. complete overview of the cryptographic properties of Boolean functions and available results see [3]. This article Earlier in [4], we proposed 10 Boolean functions of three is a continuation of work [4], in which the authors proposed variables to replace addition modulo 2. We introduced the Boolean functions of three arguments that replace the term component-wise functions for them. In general case, a addition operation modulo 2. In [5], the author has Boolean function of n variables, which returns the value of previously considered the issues of information security the first argument, must satisfy the condition: using linear encoding. 𝐹(𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ), 𝑥2 , … , 𝑥𝑛 ) = 𝑥1 . (5) One of the standard cryptographic transformations used in The figure 1 shows the possible location of the various symmetric encryption algorithms is bitwise addition component functions Fj, which depend on n+ 1 arguments, modulo two of a plaintext represented in binary form with a in the process of encrypting a 128-bit block of information. key binary sequence. For example, GOST R 34.12-2015 [6] We presented two algorithms: an algorithm from GOST R is a symmetric cipher in which the beginning of the 34.12-2015 standard (left) and a modification proposed by transformation sequence is the transformation X[k]: the authors (right). X[k](a) = k ⊕ a, (1) Modern studies of the GOST R 34.12-2015 cipher are where k is the round key, a is the plaintext, k, a ∈ V128. presented in [7- 9] and others. They are mainly devoted to The decryption method consists in the repeated addition the study of the cryptographic strength of this encryption modulo 2 of the ciphertext with the same key k: standard, as well as various options for its software D X[k](a) = (k ⊕ a) ⊕ k = a. (2) implementation. This article describes all Boolean functions We use the same Boolean function to encrypt and decrypt of n arguments that are suitable for use in symmetric information: cryptographic transformations instead of the modulo two F(x, y)=x⊕ y. (3) addition function. We proposed an algorithm for their This function has the following property: generation, an encryption method based on component-wise F(F(x, y), y) = x. (4) Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0) Data Science Boolean functions and a generator of a pseudo-random satisfy condition (5) and substantially depend on all sequence of elements of the field GF(2n-1). variables. Definition 1. A component-wise (n, k) function, where II. THE GENERATION ALGORITHM AND PROPERTIES OF 1 k n, is a Boolean function 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ), that does COMPONENT-WISE FUNCTIONS not contain fictitious variables and returns the k-th argument Condition (5) means that the system of equations: when it is reused, i.e., satisfying the condition: F(F(0, x2 , … , xn ), x2 , … , xn ) = 0 F(x1 , x2 , … , xk−1 , F(x1 , x2 , … , xn ), xk+1 , … , xn ) = { , (6) F(F(1, x2 , … , xn ), x2 , … , xn ) = 1 = xk (7) whence it follows that 𝐹(0, 𝑥2 , … , 𝑥𝑛 ) = 0 if and only if We formulate an algorithm for generating all 𝐹(1, 𝑥2 , … , 𝑥𝑛 )=1 and vice versa, 𝐹(0, 𝑥2 , … , 𝑥𝑛 )=1 if and component-wise functions 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) , that only if 𝐹(1, 𝑥2 , … , 𝑥𝑛 ) =0. We will represent the Boolean substantially depend on n variables and return the variable xk function F by the vector of values 𝐹̃ =(0,1,…, 2𝑛−1 ), when reused. The input of the algorithm: n is the number of written in the order of correspondence with the ordered sets variables, k is the number of the variable whose value the of variable values from zeros to ones. Then the conditions function returns. obtained mean that if the vector of values of the component- The generation algorithm wise Boolean function is divided into two parts, then the Step 1. Enter n, k. second part will be inverse to the first. Therefore, the Step 2. Calculate 2n-1 and generate all possible binary following statement is true. vectors of length 2n-1. Theorem 1. The number of Boolean functions of n Step 3. (Checking for fictitiousness of all variables). For 𝑛−1 variables satisfying condition (5) is 22 . A Boolean verification, we use the algorithm from [10]. We delete all function satisfies condition (5) if and only if the second half vectors that did not pass verification. of its value vector is inverse to the first. Step 4. (Adding inverted parts) Run for all vectors Let us prove the first statement. We can define a remaining after step 3. Boolean function of n variables by its vector of values of Divide the vector into 2k-1 parts and add the inverse one length 2n. Since the second half of the value vector after each part. completely depends on the first (inverse to the first), you Include all received vectors in the response. can set it by specifying the first half of the value vector in an In the proposed algorithm, we check for fictitiousness of arbitrary way. It has a length of 2n- 1, and the number of variables only half of the value vector. If the check was 𝑛−1 successful, then all the variables will be significant for the binary vectors of this length is 22 . Let us prove the second statement of the theorem. The full value vector generated in step 4. Example 3. Here is an example of how the algorithm values of 𝐹(0, 𝑥2 , … , 𝑥𝑛 ) are in the first half of the vector 𝐹̃ , works for n=3, k=2. and in the same order in the second half of the vector 𝐹̃ ̃ are 23-1=4, so we generate all binary vectors of length 4: the values of 𝐹(1, 𝑥2 , … , 𝑥𝑛 ). From the conditions obtained 0000, 0001, 0010, 0011, ..., 1111. above, it follows that 𝐹(0, 𝑥2 , … , 𝑥𝑛 ) = ̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅ 𝐹(1, 𝑥2 , … , 𝑥𝑛 ), that After step 3, only 10 vectors that have passed is, the second half of the vector 𝐹̃ is inverse to the first. verification will remain out of the 16 vectors: The theorem is proved. 0001,0010,0100,0110, 0111, 1000, 1001, 1011, 1101, 2−1 Example 1. For n=2 we have 22 =4 Boolean functions 1110. that return the first argument. We list their value vectors. Performing step 4, divide each vector into 2 parts, and The first half of the vector of values is set arbitrarily; the add an inversion to each part. We get the following result: second is completed inversely with the first: 0011, 0110, 1001, 1100. TABLE I. (3,2) COMPONENT-WISE FUNCTIONS We got the following 4 functions: F(x, y)= x, F(x, y)= (3,2) component-wise (3,2) component-wise x+y, F(x, y)= xy and F(x, y)= 𝑥̅ . functions functions 3−1 Example 2. For n=3 we have 22 =16 Boolean 0001 00110110 1000 10010011 0010 00111001 1001 10010110 functions that return the first argument. They have value 0100 01100011 1011 10011100 vectors: 00001111, 00011110, 00101101, 00111100, 0110 01101001 1101 11000110 01001011, 01011010, 01101001, 01111000, 10000111, 0111 01101100 1110 11001001 10010110, 10100101, 10110100, 11000011, 11010010, Let us show that we actually got functions that return the 11100001 and 11110000. second component. Take, for example, F1(x1, x2, x3), Taking into account the requirements for cryptographic functions, we come to conclusions from examples 1 and 2. 𝐹̃1 =(00110110). We show that for arbitrary values x1 and x3 Note that the first and the last functions are the negation of and for x2=a, the function F1 returns the value a. each other. The second and penultimate functions are related If we take n=3 and k=1 in the algorithm, the first steps in a similar way, and so on. Thus, if we are going to use of the algorithm will be the same as in example 3. In step 4, several component-wise functions in the encryption we do not split the remaining 10 vectors, since 2 k-1=20=1. algorithm, we will select them only one from each pair. Add the inverse part to them and get the vectors of the It is also obvious that some of the functions obtained values of (3,1) functions from example 2, with the exception contain dummy variables, which is unacceptable for of six functions containing fictional variables. cryptographic functions. Applying the algorithm for determining the fictitiousness of the variables x2, ..., xn to the selected functions, we obtain all Boolean functions that VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020) 107 Data Science (𝑥1 , 𝑥3 ) 𝐹1 (𝑥1 , 0, 𝑥2 ) F1(x1,F1(x1, 0, x3), x3) subfunctions: 𝐹(0, 𝑥2 , … , 𝑥𝑘−1 , 0, 𝑥𝑘+1 , … , 𝑥𝑛 ) and 00 0 0 𝐹(0, 𝑥2 , … , 𝑥𝑘−1 , 1, 𝑥𝑘+1 , … , 𝑥𝑛 ) , and the second will be 01 0 0 inverse to the first. Therefore, the combined vector 10 0 0 𝐹(0, 𝑥2 , … , 𝑥𝑘−1 , 𝑥𝑘 , 𝑥𝑘+1 , … , 𝑥𝑛 ) has an equal number of 11 1 0 zeros and ones. It is proved in a similar way that the vector of values of the function 𝐹(1, 𝑥2 , … , 𝑥𝑘−1 , 𝑥𝑘 , 𝑥𝑘+1 , … , 𝑥𝑛 ) is (𝑥1 , 𝑥3 ) 𝐹1 (𝑥1 , 1, 𝑥2 ) F1(x1,F1(x1,1,x3),x3) balanced. For the same reason, the vectors of subfunction 00 1 1 values are balanced, which are obtained by fixing one of the 01 1 1 variables 𝑥2 , … , 𝑥𝑘−1 , 𝑥𝑘+1 , … , 𝑥𝑛 . Therefore, 10 1 1 11 0 1 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) is 1-stable. We prove 31. It follows from the condition that the All component-wise (n, k) functions are balanced, that is, function 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 0, 𝑥𝑘+1 , … , 𝑥𝑛 ) is balanced, that they take the values 0 and 1 equally often. However, they is, takes 2𝑛−2 times the value 1 and 2𝑛−2 times the value 0. have a different probability of replacing or saving the In addition, the function 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 1, 𝑥𝑘+1 , … , 𝑥𝑛 ) is plaintext character. So, the function from example 3 with balanced, that is, takes 2𝑛−2 times the value 1 and 2𝑛−2 the value vector 𝐹̃1 =(00110110) changes the value of the times the value 0. Then the number of substitutions is 2𝑛−1 , plaintext a only in 2 cases out of 8, that is, it has a 25% the which means the probability of transitions is 50%. probability of replacing the plaintext. The theorem is proved. The best characteristics (50% to 50%) of transition III. ENCRYPTION OPTIONS BASED ON EXPLODED BOOLEAN probabilities are those Boolean functions that correspond to FUNCTIONS the balanced vectors that remain after step 3 of the above algorithm. One of the encryption options using (3, 1) component- wise Boolean functions was proposed by us in [4], and is Example 4. When n=4 of the 256 vectors generated in shown schematically in Figure 1. However, to use step 2, only 220 are checked for the absence of fictive component-wise (n, k) functions for encryption, it is variables. Of these, 58 vectors are balanced. At n=4, it is necessary to have or generate n –1 key binary sequence, balanced and passes the check for the absence of fictive which is not always convenient. Instead, one pseudo- variables, for example, the vector 11010100. This means random sequence of elements of the field GF (2n-1) can be that (4,1) a component-wise function with the value vector generated. To do this, you can use, for example, a PSP 1101010000101011 has a 50% to 50% transition generator based on linear registers of shift registers (LFSR). probability. The same probability of transitions have (4,2) Let 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) – be a component-wise (n,1) component function with the value vector function, a –be binary plaintext, K be the key SRP of the 1101001001001011, (4,3) component function with the elements of the field GF(2n-1). Then the encryption of the value vector 1100011001100011, (4,4) component function text a is performed elementwise according to the formula: with the value vector 1010011001100101. E(a)= 𝐹(𝑎, 𝐾). (8) Let r be a non-negative number less than 𝑛. A Boolean To decrypt, we use the same function 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) function 𝑓 from n variables is called r-stable if any of its and the key sequence K: subfunctions obtained by fixing at most r variables are D(E(a))= 𝐹( 𝐹(𝑎, 𝐾), 𝐾)=a. (9) balanced. The organization of calculations largely depends on the representation of the elements of the field GF (2n- 1), Theorem 2. For any component-wise (n,k) function therefore, we will consider this question in more detail. 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) the following conditions are equivalent: Operations on elements of a finite field GF (2m) are easily 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) has transition probabilities 50%; performed when they form an index table, where m- 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 0, 𝑥𝑘+1 , … , 𝑥𝑛 ) is balanced; dimensional binary vectors are associated with the powers 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) is 1-stable. of a primitive element. Such a representation is also We prove 12. The length of the vector of values of the convenient when dividing field elements into circular function 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 0, 𝑥𝑘+1 , … , 𝑥𝑛 ) is 2𝑛−1 . Let the classes, used, for example, to find primitive polynomials or vector of values of the function in algorithms for obtaining the number of noise-resistant 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 0, 𝑥𝑘+1 , … , 𝑥𝑛 ) have t units. It is necessary codes [11]. to prove that 𝑡 = 2𝑛−2 , i.e., that the number of units is Consider the algorithm for constructing the index table exactly half the length of the vector. Since (6) holds, the of the field GF(2m), where m≤30. Input data: m is the degree vector of values of the function of expansion, f(x) is the primitive polynomial of degree m 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 1, 𝑥𝑘+1 , … , 𝑥𝑛 ) has t zeros. Then the total over GF(2). At the end of the algorithm, the program number of substitutions is 2t. By condition, the probability memory contains all 2m-1 nonzero vectors of length m in a of substitutions is 50%, that is 2𝑡 = 2𝑛−1 , whence 𝑡 = 2𝑛−2 . certain order, specifically, in powers of the primitive We prove 23. Note that it is the vector of values of the element  – the root of the polynomial f(x). function 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 0, 𝑥𝑘+1 , … , 𝑥𝑛 ) that we obtain at To optimize the speed of calculations, we will store each step 3 of the algorithm proposed above. Since, by condition, vector of coefficients in a 32-bit integer data type. At the it is balanced, the inverse vector position of the i-th bit in the binary notation of the number, 𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑘−1 , 1, 𝑥𝑘+1 , … , 𝑥𝑛 ) will also be balanced. the i-th coefficient of the vector will be stored. Due to this, Now we consider the function the multiplication of the vector by will be carried out by a 𝐹(0, 𝑥2 , … , 𝑥𝑘−1 , 𝑥𝑘 , 𝑥𝑘+1 , … , 𝑥𝑛 ). It can be divided into two bitwise left shift. The search for the remainder of division VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020) 108 Data Science by the primitive polynomial f(x) will be expressed through operation increases the possibilities of choosing round the operation of bitwise addition modulo 2. Thus, the transformations for symmetric ciphers. But a significant storage of the index table GF(2m) requires about 4 ∙ 2m disadvantage of the functions under consideration is a large bytes of memory. redundancy. We believe that consideration of K- digit Note that the calculation of each subsequent coefficient component-wise functions will help overcome this vector is performed sequentially. Thus, lines 0, 1, 2, …, 2 m- disadvantage. Further research will be aimed at introducing 3, 2m-2. are filled. In order to calculate the table using the encryption method using component-wise functions in parallel technologies, we break all the rows into k the hardware-software complex. consecutive blocks. To calculate the j-th block, you need to calculate the coefficient vector that is the first in this block. REFERENCES For this, there is no need to find all previous vectors. We use [1] N.N. Tokareva, “Simmetrichnaya kriptografiya,” Novosibirsk: NSU, 2012, 232 p. the binary exponentiation algorithm to significantly speed [2] S.N. Selezneva, “Multiplicative complexity of some functions of the up the calculations. algebra of logic,” Discrete Mathematics, vol. 26, no. 4, pp. 100-109, We tested the work of the program for constructing the 2014. index table of the GF(2m), where m=26, 27, 28, 29 and 30. [3] A.A. Gorodilova, “From cryptanalysis of a cipher to the The calculations were performed on 4 cores on an NArFU cryptographic property of a Boolean function,” Applied Discrete cluster with 20 computing nodes, each of which had 2 10- Mathematics, vol. 3, no. 33, pp. 4-44, 2016. core Intel Xeon processors and 64 GB of RAM. The table [4] I.I. Vasilishin and S.Yu. Korabelshchikova, “Using component-wise function in cryptographical transformation algorithm from Russian contains data on the operating time for various m and for a national standard GOST R 34.12-2015,” CEUR Workshop different number of threads. Proceedings, vol. 2212, pp. 392-398, 2018. [5] S.Y. Korabelshchikova, L.V. Zyablitseva, B.F. Melnikov and S.V. TABLE II. EXECUTION TIME OF SERIAL AND PARALLEL ALGORITHMS Pivneva, “Linear codes and some their applications,” Journal of (IN SECONDS) Physics: Conference Series, 012174, 2018. [6] GOST R 34.12-2015. Information technology. “Cryptographic Threads \ m 26 27 28 29 30 information security. Block ciphers,” M .: Standartinform, 2015, 25 p. Sequential algorithm 0.507 1.003 2.013 4.034 8.290 [7] E.A. Ishchukova, L.K. Babenko and M.V. Anikeev, “Fast Implementation and Cryptanalysis of GOST R 34.12-2015 Block 1 thread 0.535 1.077 2.143 4.283 8.869 Ciphers,” 9th International Conference on Security of Information and Networks SIN, Newark, Nj, pp. 104-111, 2016. 2 threads 0.311 0.624 1.247 2.495 5.103 [8] T. Isobe, “A single-key attack on the full GOST block cipher,” 3 threads 0.242 0.473 0.945 1.910 3.877 Journal of Cryptology, vol. 26, pp. 172-189, 2013. [9] J. Kim, “On the security of the block cipher GOST suitable for the 4 threads 0.205 0.403 0.794 1.597 3.283 protection in U-business services,” Personal and ubiquitous computing, vol. 17, pp. 1429-1435, 2013. From the presented table we can conclude that the [10] S.V. Yablonskiy, “Vvedeniye v diskretnuyu matematiku,” M.: Nauka, program shows an acceleration of about 2,5 times with 2005, 384 p. parallel implementation of 4 threads. [11] B.F. Melnikov and S.Yu. Korabelshchikova, “Algorithms for estimation the number of noise-immune codes of general and special IV. CONCLUSION types,” Informatization and communication, vol. 1, pp. 55-60, 2019. Component-wise (n, k) functions extend the modes of standard cryptographic transformations, in particular, GOST R 34.12-2015. Using them instead of the modulo 2 addition VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020) 109