=Paper=
{{Paper
|id=Vol-1502/paper3
|storemode=property
|title=Solving Weakened Cryptanalysis Problems for the Bivium Keystream Generator in the Volunteer Computing Project SAT@home
|pdfUrl=https://ceur-ws.org/Vol-1502/paper3.pdf
|volume=Vol-1502
}}
==Solving Weakened Cryptanalysis Problems for the Bivium Keystream Generator in the Volunteer Computing Project SAT@home==
Solving Weakened Cryptanalysis Problems for the Bivium Keystream Generator in the Volunteer Computing Project SAT@home Oleg Zaikin, Alexander Semenov, and Ilya Otpuschennikov Institute for System Dynamics and Control Theory SB RAS, Irkutsk, Russia zaikin.icc@gmail.com, biclop.rambler@yandex.ru, otilya@yandex.ru Abstract. In this paper, a cryptanalysis of the Bivium keystream generator in the SAT form is considered. For encoding the initial cryptanalysis problem into SAT a special program system T RANSALG was used. For an obtained SAT instance we use Monte Carlo method to search for a partitioning with good time estimation. Several weakened cryptanalysis instances of the Bivium generator were success- fully solved in the volunteer computing project SAT@home using corresponding partitionings found on a computing cluster. Keywords: SAT, logical cryptanalysis, Monte Carlo method, volunteer computing, Bivium, SAT@home, Transalg 1 Introduction Volunteer computing [19] is a type of distributed computing which uses computational resources of PCs of private persons called volunteers. Each volunteer computing project is designed to solve one or several hard problems. When PC is connected to the project, all the calculations are performed automatically and do not inconvenience user since only idle resources of PC are used. Nowadays the most popular platform for organizing volunteer computing projects is BOINC [1] that was developed in Berkeley in 2002. To- day there are about 70 active volunteer projects, the majority of them based on BOINC. A volunteer computing project consists of the following basic parts: server dae- mons, database, web site and client application. The daemons include work generator (generates tasks to be processed), validator (checks the correctness of the results re- ceived from volunteer’s PCs) and assimilator (processes correct results). Client appli- cation should have versions for the widespread computing platforms. One of the attractive features of volunteer computing is its low cost — to maintatin a project one needs only a dedicated server working 24/7. Main difficulty here lies in the development of software and in database administration. In addition, it is crucial to provide the feedback to volunteers using the web site of the project and special forums. Another attractive feature of this type of computing is that volunteer project can solve some particular hard problem for months or even years with good average performance. Wide class of problems from modern computer science can be effectively reduced to Boolean satisfiability problem (SAT) [2]. SAT problems are usually considered as 2 O. Zaikin, A. Semenov, I. Otpuschennikov the problems of search for solutions of Boolean equations in the form of CNF=1, where CNF is a conjunctive normal form. There are many works in which various combinato- rial problems are reduced to SAT and solved in this form. For example, such problems can be found in areas of verification, cryptography, combinatorics and bioinformatics. Usually if the cryptanalysis is considered as a SAT problem then it is called a logical cryptanalysis [15]. In this case to find a secret key it is sufficient to find a solution of corresponding satisfiable SAT instance. For solving corresponding SAT instance one usually needs large computational resources. A volunteer computing project can pro- vide such resources. Below we present a brief outline of our paper. In the second section, we describe how we utilize the T RANSALG tool for reduction of the cryptanalysis problem of the Bivium generator to SAT. In the third section various decomposition techniques for SAT are discussed. In the fourth section experimental results on solving weakened cryptanal- ysis problems for the Bivium generator in the volunteer computing project SAT@home are presented. 2 Propositional Encoding of Cryptanalysis Problem for Bivium The Bivium generator [5] uses two shift registers of a special kind (see Fig. 1). The first register contains 93 cells and the second contains 84 cells. To initialize the generator, a secret key of length 80 bit is put to the first register, and a fixed (known) initialization vector of length 80 bit is put to the second register. All remaining cells are filled with zeros. An initialization phase consists of 708 rounds during which keystream output is not released. qt 93 92 91 ... 69 ... 66 ... 1 pt keystream bit pt 177 176 175 ... 171 ... 162 ... 94 qt Fig. 1. Scheme of the Bivium generator In accordance with [16, 25] we considered cryptanalysis problem for Bivium in the following formulation. Based on the known fragment of keystream we search for the values of all registers cells at the end of the initialization phase. Therefore, in our experiments we used CNF encodings where initialization phase was omitted. Usually it is believed that to uniquely identify the secret key it is sufficient to consider keystream fragment of length comparable to the total length of shift registers. Here we followed [8, 25] and set the keystream fragment length for Bivium cryptanalysis to 200 bits. Solving weakened cryptanalysis problems for the Bivium keystream generator 3 To encode algorithm of the Bivium generator into CNF we used T RANSALG tool [20]. As an input T RANSALG takes a source code of a program in the domain specific language named TA language. TA language is a procedural programming language with C-like syntax. This source code must describe an algorithm of computing of a discrete function. Output of T RANSALG is a CNF which encodes the considered algorithm. Translation of a TA-program consists of two main stages. On the first stage a source code of a TA-program is parsed, and concrete syntax tree is constructed. On the second stage of translation, the symbolic execution of TA-program is performed. According to the concept of the symbolic execution, interpreter operates not with concrete data, but with symbols which encode this data — Boolean variables and Boolean vectors (ar- rays). The symbolic execution of a TA-program results in a set of Boolean formulas, which is called propositional encoding of the algorithm. We will consider construct- ing of propositional encoding by T RANSALG on the example of TA-program which describes algorithm of the Bivium keystream generator [5]. The TA-program for the Bivium generator starts from the declaration of integer constants (see Listing 1.1). Here constant len is equal to total length of the shift reg- isters (177 bits); constants lenA and lenB define the lengths of each register (93 and 84 bits respectively); stream len defines the number of bits of a keystream produced by the generator (200 bits). After this, arrays of variables of type bit are declared. Main data type in the TA-language is the bit type. T RANSALG uses this type to estab- lish links between variables used in a TA-program and Boolean variables included into corresponding propositional encoding. It is important to distinguish between these two sets of variables. Below we will refer to variables that appear in a TA-program as pro- gram variables. All variables included in a propositional encoding are called encoding variables. Global variables of type bit can be declared with attribute in or out. Attribute in marks program variables associated with Boolean variables that encode input of an algorithm. In fact, in such a way we declare a set of variables with the following feature: any assignment of these variables derives values of all other variables in the proposi- tional encoding. It should be noted that in every TA-program that describes some algo- rithm there must be declared variables which encode input of the considered algorithm. Attribute out marks variables which, after TA-program is executed, contain an output of the algorithm. Listing 1.1. Declarations of global variables and constants define len 177; d e f i n e lenA 9 3 ; d e f i n e lenB 84; define stream len 200; in b i t reg [ len ] ; out b i t stream [ stream len ] ; In the Listing 1.1 the bit array reg with attribute in is declared. This array con- tains a state of the shift registers of the generator. Attribute in reports to the translator that initial state of the registers is unknown. Encoding variables, linked with program variables marked with attribute in, get first len numbers in a propositional encoding. 4 O. Zaikin, A. Semenov, I. Otpuschennikov Also in Listing 1.1 the bit array stream with attribute out is declared. This array con- tains the keystream produced by the generator. Encoding variables, linked with program variables marked with attribute out, get last stream len numbers in a propositional encoding. Thus by using attributes in and out we uniquely identify the sets of Boolean variables in a propositional encoding, which encode input and output of an algorithm respectively. In the considered TA-program shifting of the registers is implemented in the form of the procedure shift regs (see Listing 1.2). This procedure updates the state of the global bit array reg, according to the Bivium algorithm [5]. Note that as a result of interpretation of the procedure shift regs, new encoding variables are created only to represent the result of feedback functions calculation. Operations of copying data between cells add nothing to a propositional encoding — only links between program variables and corresponding Boolean encoding variables are changed. Listing 1.2. Function that implements one shift of the Bivium registers void s h i f t r e g s ( ) { b i t t 1 = r e g [ 6 5 ] ˆ r e g [90]& r e g [ 9 1 ] ˆ r e g [ 9 2 ] ˆ r e g [ 1 7 0 ] ; b i t t 2 = r e g [ 1 6 1 ] ˆ r e g [174]& r e g [ 1 7 5 ] ˆ r e g [ 1 7 6 ] ˆ r e g [ 6 8 ] ; int i ; f o r ( i = lenA − 1 ; i > 0 ; i = i − 1 ) { reg [ i ] = reg [ i −1]; } reg [0] = t2 ; f o r ( i = lenA + l e n B − 1 ; i > lenA ; i = i − 1 ) { reg [ i ] = reg [ i −1]; } r e g [ lenA ] = t 1 ; } Interpretation of a TA-program starts from executing the function main. This func- tion is the entry of any TA-program. Presented in the Listing 1.3 function main im- plements the base loop of the Bivium generator. One iteration of this loop outputs one bit of keystream. For this purpose the current state of the four registers cells are taken — their sum mod 2 gives next keystream bit. After this, the shifting of the registers is performed by calling function shift regs. Listing 1.3. Implementation of the base work loop of the Bivium generator v o i d main ( ) { for ( i n t i = 0; i < s t r e a m l e n ; i = i + 1){ b i t t1 = reg [65] ˆ reg [ 9 2 ] ; b i t t2 = reg [161] ˆ reg [176]; stream [ i ] = t1 ˆ t2 ; Solving weakened cryptanalysis problems for the Bivium keystream generator 5 shift regs (); } } The considered TA-program describes the generation of 200 keystream bits by the Bivium generator. The translation of this program results in the system of 465 Boolean equations. The transition from this system to the CNF is performed using the Tseitin transformations [27] and procedures of Boolean minimization. During this transition to minimize Boolean formulas, the E SPRESSO library1 , which is integrated to T RANSALG in the form of separate program module, is used. Procedures of Boolean minimization, implemented in Espresso, operate with truth tables. If the size of a constructed formula does not allow to effectively work with its truth table, then to decompose such formulas into subformulas the Tseitin transformations are used. The final CNF which encodes generating of 200 keystream bits by Bivium is constructed over the set of 642 Boolean variables, and consists of 9560 clauses and 49920 literals. 3 SAT partitioning In [11–14] various approaches to partitioning SAT were studied. Below we will use the terminology from [11]. Consider a satisfiability problem for an arbitrary CNF C. Partitioning of C is a set of formulas C ∧ Gi , i ∈ {1, . . . , S} such that for any i, j : i ̸= j formula C ∧ Gi ∧ Gj is unsatisfiable and C ≡ C ∧ G1 ∨ . . . ∨ C ∧ GS . When one has a partitioning of an original SAT instance, satisfiability problems for C ∧ Gj , j ∈ {1, . . . , S} can be solved independently in parallel. There exist various partitioning techniques. For example one can construct {Gj }Sj=1 using a scattering procedure [12], a guiding path solver [28] or a lookahead solver [10]. Unfortunately, for these partitioning methods it is hard in general case to estimate the time required to solve an original problem. From the other hand in a number of papers about logical cryptanalysis of several keystream generators there was used a partitioning method that makes it possible to construct such estimations in quite a natural way. In particular, in [17, 8, 26, 25] for this purpose the information about the time to solve small number of subproblems randomly chosen from the partitioning of an original problem was used. In the paper [23] strict formal description of this idea within the borders of the Monte Carlo method in its classical form [18] was given. Consider a satisfiability problem for an arbitrary CNF C over a set of Boolean variables X = {x1 , . . . , xn }. We call an arbitrary set X̃ = {xi1 , . . . , xid }, X̃ ⊆ X a decomposition set. Consider a partitioning of C that consists of a set of 2d formulae C ∧ Gj , j ∈ {1, . . . , 2d } 1 http://embedded.eecs.berkeley.edu/pubs/downloads/Espresso/index.htm 6 O. Zaikin, A. Semenov, I. Otpuschennikov where Gj , j ∈ {1, . . . , 2d } are all possible minterms over X̃. ( Note that )an arbitrary formula Gj takes a value of true on a single truth assignment α1j , . . . , αdj ∈ {0, 1}d . [ ( )] Therefore, an arbitrary formula C∧Gj is satisfiable if and only if C X̃/ α1j , . . . , αdj [ ( )] is satisfiable. Here C X̃/ α1j , . . . , αdj is produced by setting values of variables xik to corresponding αkj , k ∈ {1, . . . , d} : xi1 = α1j , . . . , xid = αdj . A set of CNFs { [ ( )]} ∆(C, X̃) = C X̃/ α1j , . . . , αdj (αj1 ,...,αjd )∈{0,1}d is called a decomposition family produced by X̃. The number of CNFs in ∆(C, X̃) is 2d . It is possible to estimate the total time required for processing this family based on the time of solving of SAT problems for k, k << 2d randomly chosen CNFs from the family. This estimation can be used to plan a computational experiment. In [23] the Monte Carlo algorithm based on tabu search heuristics to search for de- composition sets with good time estimations was suggested. This algorithm was imple- mented as the parallel MPI program PDSAT. We used PDSAT running on a computing cluster to obtain decomposition sets for logical cryptanalysis instances for the Bivium generator. As a result we found the set of of 47 variables for Bivium (see Fig. 2). On this figure cells corresponding to variables from decompositions sets are marked with gray color. Time estimation for this set is 1.3391e+11 seconds for 1 processor core. qt 93 92 91 ... 87 ... 85 ... 83 ... 75 ... 72 ... 70 69 ... 66 ... keystream bit pt 177 176 175 ... 171 ... 167 ... 164 ... 162 ... 158 ... 156 ... 154 ... 152 ... ... 60 ... 58 ... 48 ... 45 ... 43 ... 31 ... 29 ... 18 17 ... 3 2 ... 1 pt ... 150 149 ... 144 ... 139 ... 137 ... 128 127 ... 125 ... 122 ... 114 ... 99 98 ... 94 qt Fig. 2. Decomposition set for logical cryptanalysis of Bivium 4 Using SAT@home project for solving weakened cryptanalysis problems for Bivium SAT@home2 [21] is a volunteer computing BOINC-based project aimed at solving hard combinatorial problems that can be effectively reduced to SAT. It was launched 2 http://sat.isa.ru/pdsat/ Solving weakened cryptanalysis problems for the Bivium keystream generator 7 on September 29, 2011 by ISDCT SB RAS and IITP RAS. On February 7, 2012 SAT@home was added to the official list of BOINC projects3 with alpha status. Re- cently its status was improved to beta. The SAT@home server uses a number of standard BOINC daemons responsible for sending and processing tasks (transitioner, feeder, scheduler, etc.). Such daemons as work generator, validator and assimilator were implemented taking into account the specificity of the project. The work generator decomposes the original SAT problem to subproblems according to partitioning found by PDSAT (see section 3). The validator checks the correctness of the results, and the assimilator processes correct results. The client application is based on the CDCL SAT solver M INI S AT [7]. The characteristics of the SAT@home project as of 22 of February 2015 are (ac- cording to BOINCstats4 ): – 2948 active PCs (active PC in volunteer computing is a PC that sent at least one result in last 30 days) about 80% of them use Microsoft Windows OS; – 1432 active users (active user is a user that has at least one active PC); – versions of the client application for CPU: Windows x86, Windows x86-64, Linux x86, Linux x86-64; – average real performance: 6 teraflops, maximal performance: 10 teraflops. The dynamics of the real performance of SAT@home can be seen at the SAT@home performance page5 . With respect to the estimation obtained by PDSAT the solving of one instance of cryptanalysis of the Bivium ciper in the SAT@home project with its current perfor- mance would take about 3 years (using decomposition set of 47 variables [23]). That is why we decided to solve weakened cryptanalysis problems for this generator. Below we use the notation BiviumK to denote a weakened cryptanalysis problem for Bivium with known values of K variables (in corresponding SAT instance) encoding last K cells of the second shift register. In particular we considered the Bivium9 problem. We used PDSAT to find a decomposition set with good time estimation for Bivium9. As a result we obtained the decomposition set of 43 variables with time estimation 4.2873e+09 seconds. From September 2014 to December 2014 with the help of this decomposition set, 5 Bivium9 instances were solved in SAT@home. During the cor- responding experiment for each cryptanalysis instance the search space was divided into 146602 equal tasks, each containing 60 millions subproblems. For processing one such task about 2 hours of modern CPU core is needed. The client application was based on a modified version of M INI S AT 2.2. Time estimation by PDSAT shows good consistency with real solving time of these cryptanalysis instances. As far as we know there are no other available experimental results on solving such problems. 5 Related Work The first work that used SAT-solvers for cryptanalysis was [15]. The authors of [8, 17, 26] presented some estimations of the time required for logical cryptanalysis of the 3 http://boinc.berkeley.edu/projects.php 4 http://boincstats.com/ 5 http://sat.isa.ru/pdsat/performance.php 8 O. Zaikin, A. Semenov, I. Otpuschennikov Bivium cipher. Topics related to organization of SAT solving in distributed environ- ments were considered in many papers, for example in [4, 6, 10–12, 28]. In [22] a desk- top grid for solving SAT which used conflict clauses exchange via peer-to-peer protocol was described. Apparently, [3] became the first paper about the use of a desktop grid based on the BOINC platform for solving SAT, but it did not evolve into a full-fledged volunteer computing project. The predecessor of the SAT@home was the BNB-Grid system [9, 24], that was used to solve first large scale cryptographic problems in 2009. 6 Conclusions Obtained results show that SAT@home can be successfully used for solving hard log- ical cryptanalysis problems. We plan to use SAT@home for solving non-weakened cryptanalysis problem of the Bivium generator. Also we plan to solve in SAT@home cryptanalysis problems for other keystream generators. 7 Acknowledgments We thank Mikhail Posypkin and Nickolay Khrapov for their help in maintaining the SAT@home project, Stepan Kochemazov for valuable comments and discussions, and all the SAT@home volunteers for their participation. This work was partly supported by Russian Foundation for Basic Research (grants 14-07-00403-a, 14-07-31172-mol-a, 15-07-07891-a) and by the President of Russian Federation grants for young scientists (grant SP-3667.2013.5). References 1. D. P. Anderson and G. Fedak. The computational and storage potential of volunteer comput- ing. In Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 16-19 May 2006, Singapore, pages 73–80. IEEE Computer Society, 2006. 2. A. Biere, M. Heule, H. van Maaren, and T. Walsh, editors. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press, 2009. 3. M. Black and G. Bard. SAT Over BOINC: An Application-Independent Volunteer Grid Project. In S. Jha, N. gentschen Felde, R. Buyya, and G. Fedak, editors, GRID, pages 226– 227. IEEE, 2011. 4. W. Blochinger, C. Sinz, and W. Küchlin. Parallel propositional satisfiability checking with distributed dynamic learning. Parallel Computing, 29(7):969–994, 2003. 5. C. D. Cannière. Trivium: A stream cipher construction inspired by block cipher design principles. In S. K. Katsikas, J. Lopez, M. Backes, S. Gritzalis, and B. Preneel, editors, ISC, volume 4176 of LNCS, pages 171–186. Springer, 2006. 6. W. Chrabakh and R. Wolski. GridSAT: a system for solving satisfiability problems using a computational grid. Parallel Computing, 32(9):660–687, 2006. 7. N. Eén and N. Sörensson. An Extensible SAT-solver. In E. Giunchiglia and A. Tacchella, editors, SAT, volume 2919 of LNCS, pages 502–518. Springer, 2003. 8. T. Eibach, E. Pilz, and G. Völkel. Attacking Bivium Using SAT Solvers. In H. K. Büning and X. Zhao, editors, SAT, volume 4996 of LNCS, pages 63–76. Springer, 2008. Solving weakened cryptanalysis problems for the Bivium keystream generator 9 9. Y. Evtushenko, M. Posypkin, and I. Sigal. A framework for parallel large-scale global opti- mization. Computer Science - R&D, 23(3-4):211–215, 2009. 10. M. Heule, O. Kullmann, S. Wieringa, and A. Biere. Cube and Conquer: Guiding CDCL SAT Solvers by Lookaheads. In K. Eder, J. Lourenço, and O. Shehory, editors, Haifa Verification Conference, volume 7261 of LNCS, pages 50–65. Springer, 2011. 11. A. E. J. Hyvärinen. Grid Based Propositional Satisfiability Solving. PhD thesis, Aalto University, 2011. 12. A. E. J. Hyvärinen, T. A. Junttila, and I. Niemelä. A Distribution Method for Solving SAT in Grids. In A. Biere and C. P. Gomes, editors, SAT, volume 4121 of LNCS, pages 430–435. Springer, 2006. 13. A. E. J. Hyvärinen, T. A. Junttila, and I. Niemelä. Partitioning SAT Instances for Distributed Solving. In C. G. Fermüller and A. Voronkov, editors, LPAR (Yogyakarta), volume 6397 of LNCS, pages 372–386. Springer, 2010. 14. A. E. J. Hyvärinen, T. A. Junttila, and I. Niemelä. Grid-Based SAT Solving with Iterative Partitioning and Clause Learning. In J. H.-M. Lee, editor, CP, volume 6876 of LNCS, pages 385–399. Springer, 2011. 15. F. Massacci and L. Marraro. Logical Cryptanalysis as a SAT Problem. J. Autom. Reasoning, 24(1/2):165–203, 2000. 16. A. Maximov and A. Biryukov. Two trivial attacks on trivium. In C. M. Adams, A. Miri, and M. J. Wiener, editors, Selected Areas in Cryptography, volume 4876 of Lecture Notes in Computer Science, pages 36–55. Springer, 2007. 17. C. Mcdonald, C. Charnes, and J. Pieprzyk. Attacking Bivium with MiniSat. Technical Report 2007/040, ECRYPT Stream Cipher Project, 2007. 18. N. Metropolis and S. Ulam. The Monte Carlo Method. J. Amer. statistical assoc., 44(247):335–341, 1949. 19. M. Nouman Durrani and J. A. Shamsi. Review: Volunteer computing: Requirements, chal- lenges, and solutions. J. Netw. Comput. Appl., 39:369–380, Mar. 2014. 20. I. Otpuschennikov, A. Semenov, and S. Kochemazov. Transalg: a tool for translating proce- dural descriptions of discrete functions to sat (tool paper). CoRR, abs/1405.1544, 2014. 21. M. Posypkin, A. Semenov, and O. Zaikin. Using BOINC desktop grid to solve large scale SAT problems. Computer Science Journal, 13(1):25–34, 2012. 22. S. Schulz and W. Blochinger. Parallel SAT Solving on Peer-to-Peer Desktop Grids. J. Grid Comput., 8(3):443–471, 2010. 23. A. Semenov and O. Zaikin. On estimating total time to solve SAT in distributed computing environments: Application to the SAT@home project. CoRR, abs/1308.0761, 2013. 24. A. Semenov, O. Zaikin, D. Bespalov, and M. Posypkin. Parallel Logical Cryptanalysis of the Generator A5/1 in BNB-Grid System. In V. Malyshkin, editor, PaCT, volume 6873 of LNCS, pages 473–483. Springer, 2011. 25. M. Soos. Grain of Salt - an Automated Way to Test Stream Ciphers through SAT Solvers. In Tools’10: Proceedings of the Workshop on Tools for Cryptanalysis, pages 131–144, 2010. 26. M. Soos, K. Nohl, and C. Castelluccia. Extending SAT Solvers to Cryptographic Problems. In O. Kullmann, editor, SAT, volume 5584 of LNCS, pages 244–257. Springer, 2009. 27. G. S. Tseitin. On the complexity of derivation in propositional calculus. In J. Siekmann and G. Wrightson, editors, Automation of Reasoning 2: Classical Papers on Computational Logic 1967-1970, pages 466–483. Springer, Berlin, Heidelberg, 1983. 28. H. Zhang, M. P. Bonacina, and J. Hsiang. PSATO: a Distributed Propositional Prover and its Application to Quasigroup Problems. J. Symb. Comput., 21(4):543–560, 1996.