=Paper= {{Paper |id=Vol-2341/paper-05 |storemode=property |title=The Study of Implementations of CRCs Algorithms |pdfUrl=https://ceur-ws.org/Vol-2341/paper-05.pdf |volume=Vol-2341 |authors=Sergey V. Klimenko,Valentin V. Yakovlev,Boris V. Sokolov }} ==The Study of Implementations of CRCs Algorithms== https://ceur-ws.org/Vol-2341/paper-05.pdf
                    The Study of Implementations of CRCs Algorithms

 Sergey V. Klimenko, Valentin V. Yakovlev                             Boris V. Sokolov
  Department of Information Systems and                  Laboratory of Information Technologies in
   Technologies, Emperor Alexander I St.                       System Analysis and Modeling,
    Petersburg State Transport University                St. Petersburg Institute for Informatics and
           Saint Petersburg, Russia                                Automation of the RAS,
  s.klimenko@live.ru, jakovlev@pgups.ru                            Saint Petersburg, Russia
                                                                   sokolov_boris@inbox.ru

                                                             The CRC was invented by W. Wesley Peterson
                                                         in 1961; the 32-bit CRC function, used in Ethernet
                                                         and many other standards, is the work of several
                     Abstract                            researchers and was published in 1975.
                                                             The logical prerequisite for using this type of
    Objective: To give comparative assessment            checksum is that the size of the checksum is much
    of the basic ways of generating a checksum           smaller than the format of the converted numbers /
    (CRC code) based on direct, table and matrix         messages, therefore, the probability of distorting the
    algorithms. Methods: Algorithms were                 checksum (when transmitting information through
    compared by means of mathematical                    any channel or when it is stored on a data storage
    methods. In order to achieve the result Java         device) is significantly lower than the probability of
    Development Kit software version 1.8 and             distorting an array of information.
    NetBeans IDE 8.2 development environment                 CRCs are so called because the check (data
    were used. Results: The methods of                   verification) value is a redundancy (it expands the
    generating checksums by means of                     message without adding information) and the
    algorithms were described in detail. For each        algorithm is based on cyclic codes. CRCs are
    method under consideration, the time                 popular because they are simple to implement in
    characteristics of their work were given. The        binary hardware, easy to analyze mathematically,
    comparison of the analyzed methods was               and particularly good at detecting common errors
    conducted. Practical importance:                     caused by noise in transmission channels. Because
    Based on the results of the experiment, it was       the check value has a fixed length, the function that
    concluded which method was optimal for the           generates it is occasionally used as a hash function.
    generation of checksums.                                 The basic idea of the CRC algorithm is to present
                                                         the message as a huge binary number, divide it by
1 Introduction                                           another fixed binary number, and use the rest of this
                                                         division as a checksum. Having received the
Error detection methods are intended to detect           message, the recipient can perform a similar action
distortions in messages when they are transmitted        and compare the balance received with the
through noisy channels. For this, the transmitter        “checksum”.
calculates a number, called a checksum, which is a           The simplest error-detection system, the parity
function of the message, and adds it to this message.    bit, is in fact a 1-bit CRC: it uses the generator
Receiver, using the same algorithm, calculates the       polynomial x + 1 (two terms), and has the name
checksum of the received message and compares it         CRC-1.
with the transmitted value [1].1                             Cyclic redundant codes (CRC) are a subclass of
    CRC (Cyclic Redundancy Check) checksum is a          block codes and are used in HDLC, Token Ring,
value calculated from a data set using mathematical      Token
algorithms that provide hash-collision resistance [2].   Bus protocols, Ethernet protocol families and in
Hash-collision is a checksum equality for various        other protocols of a link level. The popularity of
input blocks of data. Checksums are widely used to       CRC codes is due to the fact that the procedures
control the correctness of stored and transmitted        encoding and decoding are fairly simple and do not
information.                                             require large computational resources.

   Copyright © by the papers’ authors. Copying              Conference "Problems of Mathematical and
   permitted for private and academic purposes.             Natural-Scientific Training in Engineering
   In: S. V. Klimenko, V. V. Yakovlev (eds.):               Education", St.-Petersburg, Russia, 8–9
   Selected Papers of the Workshop Computer                 November, 2018, published at http://ceur-
   Science and Engineering in the framework of              ws.org
   the 5th International Scientific-Methodical

                                                                                                            27
    The CRC algorithm is based on the properties of           CRCs are specifically designed to protect
division with the remainder of binary polynomials,      against common types of errors on communication
thus, the CRC value is the remainder from dividing      channels, where they can provide quick and
the polynomial corresponding to the input data by       reasonable assurance of the integrity of messages
some fixed generating polynomial [1].                   delivered. However, they are not suitable for
The most important task of constructing CRC codes       protecting against intentional alteration of data.
is the choice of the generating polynomial. There are         Firstly, as there is no authentication, an attacker
many standardized and recommended by various            can edit a message and recompute the CRC without
organizations generating polynomials used to            the substitution being detected. When stored
generate CRC. For example, CRC32 generating             alongside the data, CRCs and cryptographic hash
polynomial of IEEE 802.3 standard looks like this:      functions by themselves do not protect against
𝑥 "# + 𝑥 #% + 𝑥 #" + 𝑥 ## + 𝑥&% + 𝑥&# + 𝑥&& + 𝑥&' +     intentional modification of data. Any application
         𝑥( + 𝑥) + 𝑥* + 𝑥+ + 𝑥# + 𝑥 + 1 =               that requires protection against such attacks must use
       0𝑥04𝐶11𝐷𝐵7(in hexadecimal form) =                cryptographic authentication mechanisms, such as
         100110000010001110110110111                    message authentication codes or digital signatures
                   (in binary form).                    (which are commonly based on cryptographic hash
    One of the main conditions for choosing a           functions).
polynomial is that the coefficients in the high and           Secondly, unlike cryptographic hash functions,
low bits were equal to one.                             CRC is an easily reversible function, which makes it
    The mathematical model of finding the checksum      unsuitable for use in digital signatures.
is presented below:                                           Thirdly, CRC is a linear function with a property
               𝑅(𝑥) = 𝑃(𝑥) · 𝑥 J mod 𝐺(𝑥),        (1)   that:
where R(x) – polynomial which represents the value           𝑐𝑟𝑐(𝑥⨁𝑦⨁𝑧) = 𝑐𝑟𝑐(𝑥)⨁𝑐𝑟𝑐(𝑦)⨁𝑐𝑟𝑐(𝑧) (3)
of CRC; P(x) – polynomial, in which coefficients
represent input blocks of data; G(x) – generating       2 Brief description of CRC counting
polynomial; N – degree o generating polynomial (1≤      algorithms
N ≤ 256).
Thus, the CRC calculation is possible to implement      The direct CRC calculation algorithm determines
on the basis of any programming language by using       the control CRC bit by bit [9], it is described as
XOR (Exclusive or) and SHL ("shift to the left")        follows in accordance with (1):
logical operations, because they are included into      1) add zeros to the original message for alignment
any programming language [3, 4].                        (the number of zeros is determined by the degree of
     An example of the execution of the algorithm       the generating polynomial)𝑃(𝑥)´ = 𝑃(𝑥)000 … 𝑁;
for calculating the CRC remainder is shown in Fig.      2) do SHL operation of the bit sequence of the
1, where the polynomial 10011 is selected as the        message P (x) ´ until the bit in the cell becomes equal
generator.                                              to one or the number of bits becomes less than in the
                                                        divider;
           1 0 1 1 0 0 1 1 - quotient
                                                        3) if high bit becomes equal to one, then perform an
  10011101011100000                                     XOR operation between the message and the
       10011                                            generating polynomial and repeat step 2;
       001101
        00000                                           4) the final residue of the sequence P (x)´ is a CRC-
        011011                                          residue.
         10011
         010000                                             In the above description, G (x) is a polynomial,
          10011                                         N is the degree of the polynomial, P (x) is the
          000110                                        original message, and P (x) ´ is the augmented
           00000
             01100                                      original message.
             00000                                          The need to perform multiple iterations when
               11000                                    generating the CRC checksum results in significant
               10011
                 10110                                  time costs.
                 10011                                      The tabular CRC calculation algorithm is used to
                 00100
                   0 1 0 0 - CRC                        accelerate the calculation of the CRC checksum.
                                                            Acceleration is accomplished by replacing eight
      Figure 1: Example for CRC calculation             shift operations with a single search operation in the
                                                        table, which contains 256 values. Therefore, when
    The probability that the distortion of a            calculating the CRC checksum, a cycle is performed
transmitted message in several positions will be such   on 256 values.
that the final checksum does not change is                  The prerequisite for the appearance of the table
determined by the formula [5–8]:                        was the fact that when performing a XOR operation
                            &
                      𝑃 = #L                      (2)   with a constant value at different shifts, there will
                                                        always be some value, which, when applying an



                                                                                                             28
XOR operation with the original content, will give        after that a checksum is constructed for the obtained
the same result. Therefore, it is possible to build a     string using the methods described earlier. For
table of such values, where index is the original         example, a randomly generated string consisting of
content [1].                                              12 characters, looks like this:
Table building algorithm:                                                     𝑂 ℎ 3𝑇𝑗 + 𝐽71𝐿.
1) calculate the value in the table for each byte from        To determine the running time of the algorithms,
0x00 to 0xff:                                             the nanoTime () static method of the System class of
a) perform the “right shift” operation 8 times, and if    the java.lang package was used, which returns the
the low bit is equal to one, then the XOR operation       current time value.
is performed with the polynomial G;                       Thus, it is possible to calculate the running time of
b) all that remains of two bytes becomes the value in     the entire application or its separate fragment (it is
the table.                                                shown at the Figure 2). In considered case, were used
    Calculating algorithm the CRC checksum using          the possibility of this method and the measurement
the table:                                                of the operating time of a separate fragment
1) each byte of the P (x) message is viewed:              (method) of the application, determined by the type
a) an XOR operation is performed on the low byte of       of algorithm.
the current CRC value and the current byte of the             The analysis was performed for randomly
message — this is the index in the table;                 generated rows of the following sizes (h, in MB):
b) the high byte of the current CRC value shifts to       1, 2, 4, 8, 16, 32, 64 and 128.
the right by 8 and becomes low, then merged by                To calculate the arithmetic average of the
XOR with the value of the table - this will be the new    checksum calculation time, the following formula
CRC value;                                                was used:
2) the result is the CRC value.                                                       &
                                                                                 𝑡̅ = _ · ∑_ab& 𝑡a ,           (4)
    The matrix CRC calculation algorithm is used to
                                                          𝑡a is the time for calculating the i checksum, P – total
calculate the checksum and is similar to the tabular
                                                          number.
one, except for that instead of a table, the vector
                                                          Standard deviation was calculated as follows:
multiplication operation (extended byte) by the
                                                                            &
matrix is used.                                                        𝜎 = d_ · ∑_ab&(𝑡a − 𝑡̅),                  (5)
    The main advantage of the matrix algorithm over
the tabular is the size of the memory required to store   where t is the average value.
the table. So, for the implementation of the tabular
algorithm, 1 Kb (256 elements of 4 bytes each) of the
memory is required to store the table, while for the
matrix algorithm, only 32 bytes are required (8                             Set the number of repetitions N
elements of 4 bytes each) [10].
                                                                                            N=0?
                                                                                                     YES
3 Evaluation of temporal effectiveness
                                                                                              NO
In order to compare the considered methods of
generating CRC by performance under the same                               Generate a random string of size n
conditions, an application was written in the high-
level language Java, which allows to get statistics for
each of the methods with the generating polynomial                          Start time = System.nanoTime()
0xEDB88320 for CRC32 and 0xd800000000000000
for CRC64. Statistics show the dependence of their
execution time on the size of the source line                                Calculate the string checksum
(message).
    The Java Development Kit version 1.8 and the
NetBeans IDE 8.2 development environment are                              Total Time = System.nanoTime () -
used to write the application. The experiment was                                     Start Time
conducted using the following hardware and
software resources:                                                      Reduce the number of N repetitions by
1) Windows 10 operating system (64-bit);                                                one
2) dual-core processor Intel Core i7-7600U with a
clock frequency of 2.8 GHz;
3) 16 GB of RAM;
4) SSD with 512 Gb capacity.
    The application generates strings (messages) in a
randomly specified size based on the characters:                   Figure 2: Flowchart - test structure
“A”–“Z”, “a”–“z”, “0”–“9”, and punctuation marks,



                                                                                                                 29
                                                  int crc = 0xFFFFFFFF



                                                  int poly = 0xEDB88320



                                           byte[] bytes = str.toString().getBytes()



                                                         int inx = 0

                                                                inx < bytes.length

                                                    false          true


                      crc = crc ^ 0xffffffff                                          byte b = bytes[inx]



                                                                                int temp = (crc^b) & 0xff



                                                                                              int i = 0

                                                                                                      i<8


                                                                                      false               true
                                                                                                                         temp & 1 == 1

                               crc = (crc >>>8) ^ temp
                                                                                                                 false      true


                                         inx ++                                                                                      temp = (temp >>> 1) ^ poly


                                                                                  temp = (temp >>> 1)



                                                                                                i++

                                                                 Figure 3: Checksum calculation

       4 Results
       The results of calculations by formulas (4) - (5) are
       shown in the table 1 (CRC32) and table 2(CRC64).

              Table 1: Dependence of the working time of CRC32 calculation algorithms on the size of the string

                    Tabular algorithm                                               Direct algorithm                                             Matrix algorithm
h, Mb
                𝑡̅, s              𝜎, s                                        𝑡̅, s                 𝜎, s                                   𝑡̅, s               𝜎, s
  1           0.0024             0.0008                                      0.0085                0.0019                                 0.0065              0.0013
  2           0.0046             0.0007                                      0.0165                0.0021                                 0.0128              0.0017
  4           0.009               0.001                                       0.035                 0.005                                 0.027                0.004
  8           0.019               0.005                                       0.07                  0.013                                 0.053                0.01
 16           0.036               0.004                                       0.130                 0.016                                 0.101                0.012
 32           0.072               0.003                                       0.256                 0.014                                 0.199                0.01
 64            0.14               0.01                                        0.52                  0.02                                   0.40                0.02
 128           0.29               0.02                                        1.03                  0.07                                   0.80                0.04




                                                                                                                                                                  30
                                  1,2
   Time spent counting CRC32, s
                                    1

                                  0,8

                                  0,6

                                  0,4

                                  0,2

                                    0
                                            1             2          4           8        16           32         64         128
                                                                            String size, Mb


 Figure 4: Dependence of the working time of direct (red), matrix (green) and tabular (blue) CRC32 calculation
                                     algorithms on the size of the string

                                  Table 2: Dependence of the working time of CRC64 calculation algorithms on the size of the string

                                                           Tabular algorithm                         Direct algorithm
                                        h, Mb
                                                       𝑡̅, s              𝜎, s                  𝑡̅, s                𝜎, s
                                           1          0.005             0.001                  0.009                0.001
                                           2          0.010             0.002                  0.019                0.003
                                           4          0.018             0.003                  0.036                0.003
                                           8          0.034             0.004                  0.070                0.003
                                          16          0.067             0.008                  0.144                0.009
                                          32          0.134             0.028                  0.287                0.050
                                          64          0.445             0.116                  0.557                0.020
                                          128         0.576             0.082                  1.173                0.143
                                          256         1.218             0.064                  2.241                0.063

                                  2,5
   Time spent counting CRC64, s




                                    2


                                  1,5


                                    1


                                  0,5


                                    0
                                            1         2          4         8         16           32         64        128         256
                                                                               String size, Mb

Figure 5: Dependence of the working time of direct (red) and tabular (blue) CRC64 calculation algorithms on the
                                               size of the string



                                                                                                                                         31
          As can be seen from the results given in Tables                 As a result, it was established that the tabular
       1-2, the time spent on the calculation of the                 algorithm is optimal in terms of the calculation speed
       checksum CRC32 and CRC64 by the direct                        of the checksum CRC32. Acceleration in it
       algorithm is almost the same. But, the calculation of         compared to the direct algorithm was achieved by
       CRC32 in tabular way gives a gain of 2 two times              replacing eight shift operations with a single search
       compared with CRC64.                                          operation in the table, which contains 256 values. It
          In turn, it is worth noting that in CRC64 the              is also important that the use of the matrix algorithm
       probability of not detecting an error is significantly        can be conditioned to significant memory savings
                   &          &
       lower than fg versus hi.                                      compared to the table algorithm.
                  #           #

                                                                     Acknowledgments
       5 Conclusion
                                                                          Research carried out on this topic was carried
            The article describes the direct, tabular and            out with partial financial support from RFBR grants
       matrix algorithms for calculating the checksum                (No. 17-29-07073-ofi-m, 18-07-01272, 19–08–
       CRC32 and CRC64(tabular and direct). A                        00989), under the budget theme 0004.
       comparison of the operating time is made (Fig. 3,
       Fig. 4).

                                                                         University], 2006, no. 1 (309), pp. 19– 24.
References                                                               (In Russian)
                                                                [Koo02] P. Koopman 32-Bit Cyclic Redundancy
[Ros93] N. W. Ross A Painless guide to CRC error                         Codes for Internet Applications. Intern.
         detection algorithms. Australia, 1993.                          Conf. on Dependable Systems and Networks
         URL:http://www.ross.net/crc/download/crc                        (DSN)        (Washington,     July     2002).
         _v3.txt (accessed: 01.04.2018).                                 Washington, DC, 2002, pp. 459–468.
[Myt11] E. A. Mytsko, A. N. Malchukov Osobennosti               [Jak15] V. V. Jakovlev, F. I. Kushnazarov Otsenka
         programmnoy        realizatsii   vychisleniya                   vlijanija pomekh na proizvoditelnost
         kontrolnoy summy CRC32 na primere                               protokolov kanalnogo urovnya [Estimating
         PKZIP, WINZIP, ETHERNET [The specifi                            the impact of interference on the
         cities of software implementation of CRC32                      performance of link layer protocols].
         checksum calculation by the example of                          Izvestija Peterburgskogo universiteta putej
         PKZIP, WINZIP, ETHERNET]. Vestnik                               soobshchenija [Proceedings of Petersburg
         nauki Sibiri [The Siberian Science                              State Transport University], 2015, issue 1
         newsletter], 2011, no.1 (1), pp. 279–282. (In                   (42), pp. 133–138. (In Russian)
         Russian)                                               [Hal96] F. Halsall Data communications, computer
[Mal10] A. N. Malchukov, A. N. Osokin                                    networks and open systems. Addison-
         Bystrodeistvuyushchiye              algoritmy                   Wesley, Pearson Education Press., 1996,
         vychisleniya kontrolnoy summy na primere                        907 р.
         CRC8 [Fast algorithms for calculating the              [Oli08] V. G. Olifer, N. A. Olifer Kompjuterniye sety.
         checksum using the example CRC8].                               Printsipy, tekhnologii, protokoly [Computer
         Molodezh             I         sovremennyye                     networks.       Principles,      technologies,
         informatsionnyye tekhnologii: Sbornik                           protocols]. Saint Petersburg, Peter Publ.,
         trudov VIII Vseros. nauchno-prakt. konf.                        2008, 958 p. (In Russian)
         studentov, aspirantov i molodykh uchenykh              [Tem79] F. E. Temnikov, V. A. Afonin, V. I. Dmitriev
         [The youth and modern information                               Teoreticheskie osnovy informatsionnoy
         technologies: Proceedings of the 8th All-                       tekhniky: uchebnoye posobiye [Theoretical
         Russian research and training conference of                     basis of information technology: study
         students,     postgraduates     and    young                    guide]. Moscow, Energia Publ, 1979, 512 p.
         scientists]. Tomsk, March 3–5th, 2010.                          (In Russian)
         Tomsk, SPB Grafi ks, 2010, pp. 34–35. (In              [Mal10] A. N. Malchukov, A. N. Osokin Bystroje
         Russian)                                                        vychislenije kontrolnoy summy CRC:
[Bur06] Yu. B. Burkatovskaya, A. N. Malchukov, A.                        tablitsa protiv matritsy [Fast calculation of
         N       Osokin.      Bystrodeystvuyushchiye                     CRC checksum: table versus matrix].
         algoritmy deleniya polinomov v arifmetike                       Prikladnaja        informatika       [Applied
         po modulyu dva [Fast-acting algorithms for                      Informatics], 2010, no. 2 (26), pp. 58–63. (In
         dividing polynomials in arithmetics modulo                      Russian)
         two].           Izvestiya          Tomskogo
         politekhnicheskogo                universiteta
         [Proceedings of Tomsk Polytechnic


                                                                                                                          32