=Paper=
{{Paper
|id=Vol-2341/paper-05
|storemode=property
|title=The Study of Implementations of CRCs Algorithms
|pdfUrl=https://ceur-ws.org/Vol-2341/paper-05.pdf
|volume=Vol-2341
|authors=Sergey V. Klimenko,Valentin V. Yakovlev,Boris V. Sokolov
}}
==The Study of Implementations of CRCs Algorithms==
The Study of Implementations of CRCs Algorithms Sergey V. Klimenko, Valentin V. Yakovlev Boris V. Sokolov Department of Information Systems and Laboratory of Information Technologies in Technologies, Emperor Alexander I St. System Analysis and Modeling, Petersburg State Transport University St. Petersburg Institute for Informatics and Saint Petersburg, Russia Automation of the RAS, s.klimenko@live.ru, jakovlev@pgups.ru Saint Petersburg, Russia sokolov_boris@inbox.ru The CRC was invented by W. Wesley Peterson in 1961; the 32-bit CRC function, used in Ethernet and many other standards, is the work of several Abstract researchers and was published in 1975. The logical prerequisite for using this type of Objective: To give comparative assessment checksum is that the size of the checksum is much of the basic ways of generating a checksum smaller than the format of the converted numbers / (CRC code) based on direct, table and matrix messages, therefore, the probability of distorting the algorithms. Methods: Algorithms were checksum (when transmitting information through compared by means of mathematical any channel or when it is stored on a data storage methods. In order to achieve the result Java device) is significantly lower than the probability of Development Kit software version 1.8 and distorting an array of information. NetBeans IDE 8.2 development environment CRCs are so called because the check (data were used. Results: The methods of verification) value is a redundancy (it expands the generating checksums by means of message without adding information) and the algorithms were described in detail. For each algorithm is based on cyclic codes. CRCs are method under consideration, the time popular because they are simple to implement in characteristics of their work were given. The binary hardware, easy to analyze mathematically, comparison of the analyzed methods was and particularly good at detecting common errors conducted. Practical importance: caused by noise in transmission channels. Because Based on the results of the experiment, it was the check value has a fixed length, the function that concluded which method was optimal for the generates it is occasionally used as a hash function. generation of checksums. The basic idea of the CRC algorithm is to present the message as a huge binary number, divide it by 1 Introduction another fixed binary number, and use the rest of this division as a checksum. Having received the Error detection methods are intended to detect message, the recipient can perform a similar action distortions in messages when they are transmitted and compare the balance received with the through noisy channels. For this, the transmitter “checksum”. calculates a number, called a checksum, which is a The simplest error-detection system, the parity function of the message, and adds it to this message. bit, is in fact a 1-bit CRC: it uses the generator Receiver, using the same algorithm, calculates the polynomial x + 1 (two terms), and has the name checksum of the received message and compares it CRC-1. with the transmitted value [1].1 Cyclic redundant codes (CRC) are a subclass of CRC (Cyclic Redundancy Check) checksum is a block codes and are used in HDLC, Token Ring, value calculated from a data set using mathematical Token algorithms that provide hash-collision resistance [2]. Bus protocols, Ethernet protocol families and in Hash-collision is a checksum equality for various other protocols of a link level. The popularity of input blocks of data. Checksums are widely used to CRC codes is due to the fact that the procedures control the correctness of stored and transmitted encoding and decoding are fairly simple and do not information. require large computational resources. Copyright © by the papers’ authors. Copying Conference "Problems of Mathematical and permitted for private and academic purposes. Natural-Scientific Training in Engineering In: S. V. Klimenko, V. V. Yakovlev (eds.): Education", St.-Petersburg, Russia, 8–9 Selected Papers of the Workshop Computer November, 2018, published at http://ceur- Science and Engineering in the framework of ws.org the 5th International Scientific-Methodical 27 The CRC algorithm is based on the properties of CRCs are specifically designed to protect division with the remainder of binary polynomials, against common types of errors on communication thus, the CRC value is the remainder from dividing channels, where they can provide quick and the polynomial corresponding to the input data by reasonable assurance of the integrity of messages some fixed generating polynomial [1]. delivered. However, they are not suitable for The most important task of constructing CRC codes protecting against intentional alteration of data. is the choice of the generating polynomial. There are Firstly, as there is no authentication, an attacker many standardized and recommended by various can edit a message and recompute the CRC without organizations generating polynomials used to the substitution being detected. When stored generate CRC. For example, CRC32 generating alongside the data, CRCs and cryptographic hash polynomial of IEEE 802.3 standard looks like this: functions by themselves do not protect against 𝑥 "# + 𝑥 #% + 𝑥 #" + 𝑥 ## + 𝑥&% + 𝑥 + 𝑥&& + 𝑥&' + intentional modification of data. Any application 𝑥( + 𝑥) + 𝑥* + 𝑥+ + 𝑥# + 𝑥 + 1 = that requires protection against such attacks must use 0𝑥04𝐶11𝐷𝐵7(in hexadecimal form) = cryptographic authentication mechanisms, such as 100110000010001110110110111 message authentication codes or digital signatures (in binary form). (which are commonly based on cryptographic hash One of the main conditions for choosing a functions). polynomial is that the coefficients in the high and Secondly, unlike cryptographic hash functions, low bits were equal to one. CRC is an easily reversible function, which makes it The mathematical model of finding the checksum unsuitable for use in digital signatures. is presented below: Thirdly, CRC is a linear function with a property 𝑅(𝑥) = 𝑃(𝑥) · 𝑥 J mod 𝐺(𝑥), (1) that: where R(x) – polynomial which represents the value 𝑐𝑟𝑐(𝑥⨁𝑦⨁𝑧) = 𝑐𝑟𝑐(𝑥)⨁𝑐𝑟𝑐(𝑦)⨁𝑐𝑟𝑐(𝑧) (3) of CRC; P(x) – polynomial, in which coefficients represent input blocks of data; G(x) – generating 2 Brief description of CRC counting polynomial; N – degree o generating polynomial (1≤ algorithms N ≤ 256). Thus, the CRC calculation is possible to implement The direct CRC calculation algorithm determines on the basis of any programming language by using the control CRC bit by bit [9], it is described as XOR (Exclusive or) and SHL ("shift to the left") follows in accordance with (1): logical operations, because they are included into 1) add zeros to the original message for alignment any programming language [3, 4]. (the number of zeros is determined by the degree of An example of the execution of the algorithm the generating polynomial)𝑃(𝑥)´ = 𝑃(𝑥)000 … 𝑁; for calculating the CRC remainder is shown in Fig. 2) do SHL operation of the bit sequence of the 1, where the polynomial 10011 is selected as the message P (x) ´ until the bit in the cell becomes equal generator. to one or the number of bits becomes less than in the divider; 1 0 1 1 0 0 1 1 - quotient 3) if high bit becomes equal to one, then perform an 10011101011100000 XOR operation between the message and the 10011 generating polynomial and repeat step 2; 001101 00000 4) the final residue of the sequence P (x)´ is a CRC- 011011 residue. 10011 010000 In the above description, G (x) is a polynomial, 10011 N is the degree of the polynomial, P (x) is the 000110 original message, and P (x) ´ is the augmented 00000 01100 original message. 00000 The need to perform multiple iterations when 11000 generating the CRC checksum results in significant 10011 10110 time costs. 10011 The tabular CRC calculation algorithm is used to 00100 0 1 0 0 - CRC accelerate the calculation of the CRC checksum. Acceleration is accomplished by replacing eight Figure 1: Example for CRC calculation shift operations with a single search operation in the table, which contains 256 values. Therefore, when The probability that the distortion of a calculating the CRC checksum, a cycle is performed transmitted message in several positions will be such on 256 values. that the final checksum does not change is The prerequisite for the appearance of the table determined by the formula [5–8]: was the fact that when performing a XOR operation & 𝑃 = #L (2) with a constant value at different shifts, there will always be some value, which, when applying an 28 XOR operation with the original content, will give after that a checksum is constructed for the obtained the same result. Therefore, it is possible to build a string using the methods described earlier. For table of such values, where index is the original example, a randomly generated string consisting of content [1]. 12 characters, looks like this: Table building algorithm: 𝑂 ℎ 3𝑇𝑗 + 𝐽71𝐿. 1) calculate the value in the table for each byte from To determine the running time of the algorithms, 0x00 to 0xff: the nanoTime () static method of the System class of a) perform the “right shift” operation 8 times, and if the java.lang package was used, which returns the the low bit is equal to one, then the XOR operation current time value. is performed with the polynomial G; Thus, it is possible to calculate the running time of b) all that remains of two bytes becomes the value in the entire application or its separate fragment (it is the table. shown at the Figure 2). In considered case, were used Calculating algorithm the CRC checksum using the possibility of this method and the measurement the table: of the operating time of a separate fragment 1) each byte of the P (x) message is viewed: (method) of the application, determined by the type a) an XOR operation is performed on the low byte of of algorithm. the current CRC value and the current byte of the The analysis was performed for randomly message — this is the index in the table; generated rows of the following sizes (h, in MB): b) the high byte of the current CRC value shifts to 1, 2, 4, 8, 16, 32, 64 and 128. the right by 8 and becomes low, then merged by To calculate the arithmetic average of the XOR with the value of the table - this will be the new checksum calculation time, the following formula CRC value; was used: 2) the result is the CRC value. & 𝑡̅ = _ · ∑_ab& 𝑡a , (4) The matrix CRC calculation algorithm is used to 𝑡a is the time for calculating the i checksum, P – total calculate the checksum and is similar to the tabular number. one, except for that instead of a table, the vector Standard deviation was calculated as follows: multiplication operation (extended byte) by the & matrix is used. 𝜎 = d_ · ∑_ab&(𝑡a − 𝑡̅), (5) The main advantage of the matrix algorithm over the tabular is the size of the memory required to store where t is the average value. the table. So, for the implementation of the tabular algorithm, 1 Kb (256 elements of 4 bytes each) of the memory is required to store the table, while for the matrix algorithm, only 32 bytes are required (8 Set the number of repetitions N elements of 4 bytes each) [10]. N=0? YES 3 Evaluation of temporal effectiveness NO In order to compare the considered methods of generating CRC by performance under the same Generate a random string of size n conditions, an application was written in the high- level language Java, which allows to get statistics for each of the methods with the generating polynomial Start time = System.nanoTime() 0xEDB88320 for CRC32 and 0xd800000000000000 for CRC64. Statistics show the dependence of their execution time on the size of the source line Calculate the string checksum (message). The Java Development Kit version 1.8 and the NetBeans IDE 8.2 development environment are Total Time = System.nanoTime () - used to write the application. The experiment was Start Time conducted using the following hardware and software resources: Reduce the number of N repetitions by 1) Windows 10 operating system (64-bit); one 2) dual-core processor Intel Core i7-7600U with a clock frequency of 2.8 GHz; 3) 16 GB of RAM; 4) SSD with 512 Gb capacity. The application generates strings (messages) in a randomly specified size based on the characters: Figure 2: Flowchart - test structure “A”–“Z”, “a”–“z”, “0”–“9”, and punctuation marks, 29 int crc = 0xFFFFFFFF int poly = 0xEDB88320 byte[] bytes = str.toString().getBytes() int inx = 0 inx < bytes.length false true crc = crc ^ 0xffffffff byte b = bytes[inx] int temp = (crc^b) & 0xff int i = 0 i<8 false true temp & 1 == 1 crc = (crc >>>8) ^ temp false true inx ++ temp = (temp >>> 1) ^ poly temp = (temp >>> 1) i++ Figure 3: Checksum calculation 4 Results The results of calculations by formulas (4) - (5) are shown in the table 1 (CRC32) and table 2(CRC64). Table 1: Dependence of the working time of CRC32 calculation algorithms on the size of the string Tabular algorithm Direct algorithm Matrix algorithm h, Mb 𝑡̅, s 𝜎, s 𝑡̅, s 𝜎, s 𝑡̅, s 𝜎, s 1 0.0024 0.0008 0.0085 0.0019 0.0065 0.0013 2 0.0046 0.0007 0.0165 0.0021 0.0128 0.0017 4 0.009 0.001 0.035 0.005 0.027 0.004 8 0.019 0.005 0.07 0.013 0.053 0.01 16 0.036 0.004 0.130 0.016 0.101 0.012 32 0.072 0.003 0.256 0.014 0.199 0.01 64 0.14 0.01 0.52 0.02 0.40 0.02 128 0.29 0.02 1.03 0.07 0.80 0.04 30 1,2 Time spent counting CRC32, s 1 0,8 0,6 0,4 0,2 0 1 2 4 8 16 32 64 128 String size, Mb Figure 4: Dependence of the working time of direct (red), matrix (green) and tabular (blue) CRC32 calculation algorithms on the size of the string Table 2: Dependence of the working time of CRC64 calculation algorithms on the size of the string Tabular algorithm Direct algorithm h, Mb 𝑡̅, s 𝜎, s 𝑡̅, s 𝜎, s 1 0.005 0.001 0.009 0.001 2 0.010 0.002 0.019 0.003 4 0.018 0.003 0.036 0.003 8 0.034 0.004 0.070 0.003 16 0.067 0.008 0.144 0.009 32 0.134 0.028 0.287 0.050 64 0.445 0.116 0.557 0.020 128 0.576 0.082 1.173 0.143 256 1.218 0.064 2.241 0.063 2,5 Time spent counting CRC64, s 2 1,5 1 0,5 0 1 2 4 8 16 32 64 128 256 String size, Mb Figure 5: Dependence of the working time of direct (red) and tabular (blue) CRC64 calculation algorithms on the size of the string 31 As can be seen from the results given in Tables As a result, it was established that the tabular 1-2, the time spent on the calculation of the algorithm is optimal in terms of the calculation speed checksum CRC32 and CRC64 by the direct of the checksum CRC32. Acceleration in it algorithm is almost the same. But, the calculation of compared to the direct algorithm was achieved by CRC32 in tabular way gives a gain of 2 two times replacing eight shift operations with a single search compared with CRC64. operation in the table, which contains 256 values. It In turn, it is worth noting that in CRC64 the is also important that the use of the matrix algorithm probability of not detecting an error is significantly can be conditioned to significant memory savings & & lower than fg versus hi. compared to the table algorithm. # # Acknowledgments 5 Conclusion Research carried out on this topic was carried The article describes the direct, tabular and out with partial financial support from RFBR grants matrix algorithms for calculating the checksum (No. 17-29-07073-ofi-m, 18-07-01272, 19–08– CRC32 and CRC64(tabular and direct). A 00989), under the budget theme 0004. comparison of the operating time is made (Fig. 3, Fig. 4). University], 2006, no. 1 (309), pp. 19– 24. References (In Russian) [Koo02] P. Koopman 32-Bit Cyclic Redundancy [Ros93] N. W. Ross A Painless guide to CRC error Codes for Internet Applications. Intern. detection algorithms. Australia, 1993. Conf. on Dependable Systems and Networks URL:http://www.ross.net/crc/download/crc (DSN) (Washington, July 2002). _v3.txt (accessed: 01.04.2018). Washington, DC, 2002, pp. 459–468. [Myt11] E. A. Mytsko, A. N. Malchukov Osobennosti [Jak15] V. V. Jakovlev, F. I. Kushnazarov Otsenka programmnoy realizatsii vychisleniya vlijanija pomekh na proizvoditelnost kontrolnoy summy CRC32 na primere protokolov kanalnogo urovnya [Estimating PKZIP, WINZIP, ETHERNET [The specifi the impact of interference on the cities of software implementation of CRC32 performance of link layer protocols]. checksum calculation by the example of Izvestija Peterburgskogo universiteta putej PKZIP, WINZIP, ETHERNET]. Vestnik soobshchenija [Proceedings of Petersburg nauki Sibiri [The Siberian Science State Transport University], 2015, issue 1 newsletter], 2011, no.1 (1), pp. 279–282. (In (42), pp. 133–138. (In Russian) Russian) [Hal96] F. Halsall Data communications, computer [Mal10] A. N. Malchukov, A. N. Osokin networks and open systems. Addison- Bystrodeistvuyushchiye algoritmy Wesley, Pearson Education Press., 1996, vychisleniya kontrolnoy summy na primere 907 р. CRC8 [Fast algorithms for calculating the [Oli08] V. G. Olifer, N. A. Olifer Kompjuterniye sety. checksum using the example CRC8]. Printsipy, tekhnologii, protokoly [Computer Molodezh I sovremennyye networks. Principles, technologies, informatsionnyye tekhnologii: Sbornik protocols]. Saint Petersburg, Peter Publ., trudov VIII Vseros. nauchno-prakt. konf. 2008, 958 p. (In Russian) studentov, aspirantov i molodykh uchenykh [Tem79] F. E. Temnikov, V. A. Afonin, V. I. Dmitriev [The youth and modern information Teoreticheskie osnovy informatsionnoy technologies: Proceedings of the 8th All- tekhniky: uchebnoye posobiye [Theoretical Russian research and training conference of basis of information technology: study students, postgraduates and young guide]. Moscow, Energia Publ, 1979, 512 p. scientists]. Tomsk, March 3–5th, 2010. (In Russian) Tomsk, SPB Grafi ks, 2010, pp. 34–35. (In [Mal10] A. N. Malchukov, A. N. Osokin Bystroje Russian) vychislenije kontrolnoy summy CRC: [Bur06] Yu. B. Burkatovskaya, A. N. Malchukov, A. tablitsa protiv matritsy [Fast calculation of N Osokin. Bystrodeystvuyushchiye CRC checksum: table versus matrix]. algoritmy deleniya polinomov v arifmetike Prikladnaja informatika [Applied po modulyu dva [Fast-acting algorithms for Informatics], 2010, no. 2 (26), pp. 58–63. (In dividing polynomials in arithmetics modulo Russian) two]. Izvestiya Tomskogo politekhnicheskogo universiteta [Proceedings of Tomsk Polytechnic 32