Research of Lossy Image Compression Algorithm
Based on Fractal Discrete Cosine Transform

Vladislav Butorov
Samara National Research University
Samara, Russia
twoeightnine@list.ru

Marina Chicheva
Image Processing Systems Institute of RAS - Branch of the FSRC "Crystallography and Photonics" RAS;
Samara National Research University
Samara, Russia
mchi@geosamara.ru


    Abstract—A lossy image compression algorithm based on the fractal discrete cosine transform is proposed in this paper. The algorithm is compared with an algorithm based on the two-dimensional discrete cosine transform. It is shown experimentally that the described algorithm introduces less noticeable block-structure distortion than the square blocks of the two-dimensional discrete cosine transform. It is also noted that the visual quality characteristics of the two algorithms differ only slightly for several ranges of the entropy of the compressed image.

    Keywords—lossy image compression, discrete cosine transform, canonical number system, fractal DCT

I. INTRODUCTION

    Lossy algorithms are widely used for image compression, since the losses introduced into the image can be invisible to the eye and have practically no effect on visual quality. In such algorithms, compression is performed in the frequency domain, which is obtained by means of discrete orthogonal transforms (DOTs). The discrete cosine transform (DCT), in particular its two-dimensional variant, is widely used in image processing. Since the two-dimensional DCT is defined on a square region, the resulting compression artifacts have a very noticeable grid structure. To eliminate this effect, one can use the classical one-dimensional DCT applied to the scan generated by some canonical number system (CNS) [1], or the fractal DCT (FDCT) defined on the fractal region generated by a CNS [2]. In this paper, we study a lossy compression algorithm that uses several variants of the FDCT. The results of the algorithm are compared with compression based on the two-dimensional DCT.

II. THE THEORETICAL BASIS

A. Fractal DCT

    This section provides brief theoretical information about CNS in imaginary quadratic fields [3]-[6], k-fundamental domains, and the FDCT [2].

    Let $Q(\sqrt{d})$ be a quadratic field, $Q(\sqrt{d}) = \{z = a + b\sqrt{d};\ a, b \in Q\}$, where d is a square-free integer. An element $z \in Q(\sqrt{d})$ is called an algebraic integer of the field if its norm and trace are integers:

$$\mathrm{Norm}(z) = (a + b\sqrt{d})(a - b\sqrt{d}),$$

$$\mathrm{Tr}(z) = (a + b\sqrt{d}) + (a - b\sqrt{d}).$$

    An algebraic integer $\alpha \in Q(\sqrt{d})$ is the base of a CNS in the ring of integers of $Q(\sqrt{d})$ if every integer element of this field is uniquely representable as a finite sum

$$z = \sum_{j=0}^{k} z_j \alpha^j, \quad z_j \in N = \{0, 1, \ldots, |\mathrm{Norm}(\alpha)| - 1\}. \qquad (1)$$

    The pair $\{\alpha, N\}$ is called a CNS in the field $Q(\sqrt{d})$. The k-fundamental domain $G_k$ is the set of elements of the field $Q(\sqrt{d})$ generated by the k-term sums of the form (1):

$$G_k = \left\{ \sum_{j=0}^{k-1} z_j \alpha^j,\ z_j \in N \right\}. \qquad (2)$$

    Let

$$\Lambda_k^{\mathrm{COS}}(m, n) = \cos\left( \frac{\pi\, \mathrm{Im}\left( \bar{\alpha}^{\,k+1} m (n + \beta) \right)}{\mathrm{Norm}(\alpha^k)\, \mathrm{Im}(\alpha)} \right),$$

where the parameter β is chosen to ensure orthogonality:

$$\sum_{x \in G_k} \Lambda_k^{\mathrm{COS}}(p, x) \cdot \Lambda_k^{\mathrm{COS}}(q, x) = 0, \quad p \neq q.$$

    For example, for $\mathrm{Norm}(\alpha) = 2$ the parameter β is calculated as

$$\beta = \frac{\alpha^{k+1} - 2\alpha^k + 1}{2(\alpha - 1)}.$$

    The FDCT over $G_k$ is then the transformation

$$X(m) = \lambda(m) \sum_{n \in G_k} x(n)\, \Lambda_k^{\mathrm{COS}}(m, n),$$

where $m \in D_k$, and $\lambda(m)$ is the normalizing coefficient of the FDCT.

    The reverse FDCT (RFDCT) is the transformation

$$x(n) = \sum_{m \in D_k} \lambda(m)\, X(m)\, \Lambda_k^{\mathrm{COS}}(m, n),$$

where $n \in G_k$, and $\lambda(m)$ is the normalizing coefficient of the FDCT.

    The normalizing coefficient is the same for the FDCT and the RFDCT and is calculated as

$$\lambda(m) = \begin{cases} \sqrt{\dfrac{1}{\mathrm{Norm}(\alpha^k)}}, & 2m \equiv 0 \pmod{\alpha^k}, \\[2ex] \sqrt{\dfrac{2}{\mathrm{Norm}(\alpha^k)}}, & 2m \not\equiv 0 \pmod{\alpha^k}. \end{cases}$$

    The set $D_k$ is found algorithmically from the requirement that the basis functions be orthogonal:

$$\sum_{x \in G_k} \Lambda_k^{\mathrm{COS}}(p, x) \cdot \Lambda_k^{\mathrm{COS}}(q, x) = 0; \quad p, q \in D_k;\ p \neq q,$$

$$\sum_{x \in G_k} \Lambda_k^{\mathrm{COS}}(p, x) \cdot \Lambda_k^{\mathrm{COS}}(q, x) \neq 0; \quad p, q \in D_k;\ p = q.$$

    The algorithm for calculating this set is described in [2].
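    The representation (1) and the k-fundamental domain (2) can be illustrated for the base α = −1 + i, which is the base used in the experiments of this paper. The following is a minimal sketch, not the authors' implementation: Gaussian integers are stored as integer pairs (a, b) for a + bi, and the function names `cns_digits`, `from_digits` and `fundamental_domain` are hypothetical. It relies on the fact that a + bi is divisible by −1 + i exactly when a + b is even, so the lowest digit is (a + b) mod 2.

```python
from itertools import product

ALPHA = (-1, 1)  # base alpha = -1 + i, Norm(alpha) = 2, digit set N = {0, 1}


def mul(u, v):
    """Product of two Gaussian integers given as (real, imag) pairs."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def cns_digits(a, b):
    """Digits z_0, z_1, ... of a + bi in base -1 + i (lowest digit first)."""
    digits = []
    while (a, b) != (0, 0):
        d = (a + b) % 2           # remainder of a + bi modulo alpha
        digits.append(d)
        a -= d                    # now a + bi is divisible by alpha
        # (a + bi) / (-1 + i) = ((b - a) + (-a - b) i) / 2, exact here
        a, b = (b - a) // 2, (-a - b) // 2
    return digits or [0]


def from_digits(digits):
    """Evaluate sum_j z_j * alpha^j for digits given lowest first."""
    z, power = (0, 0), (1, 0)
    for d in digits:
        z = (z[0] + d * power[0], z[1] + d * power[1])
        power = mul(power, ALPHA)
    return z


def fundamental_domain(k):
    """All points of G_k: k-term sums with digits from {0, 1}."""
    return [from_digits(ds) for ds in product((0, 1), repeat=k)]


if __name__ == "__main__":
    print(cns_digits(-1, 0))              # digits of -1
    print(sorted(fundamental_domain(3)))  # the 8 points of G_3
```

    For k = 3 the sketch yields 8 distinct points, which matches the length of the quantization vector quoted for k = 3 in Section II.C.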


Copyright © 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)
Image Processing and Earth Remote Sensing
B. Two-dimensional DCT

    The two-dimensional DCT is the transformation

$$X(m_1, m_2) = \lambda_1(m_1)\lambda_2(m_2) \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x(n_1, n_2) \cos\left( \frac{\pi m_1 (n_1 + 0.5)}{N_1} \right) \cos\left( \frac{\pi m_2 (n_2 + 0.5)}{N_2} \right),$$

where $x(n_1, n_2)$ is the source signal (a block of image brightness values), $N_i$ is the size of the i-th side of the block, and $X(m_1, m_2)$ is the resulting spectrum of the source signal.

    The inverse two-dimensional DCT is the transformation

$$x(n_1, n_2) = \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} \lambda_1(m_1)\lambda_2(m_2)\, X(m_1, m_2) \cos\left( \frac{\pi m_1 (n_1 + 0.5)}{N_1} \right) \cos\left( \frac{\pi m_2 (n_2 + 0.5)}{N_2} \right),$$

where $\lambda_i(m)$ is a normalizing coefficient calculated as

$$\lambda_i(m) = \begin{cases} \sqrt{\dfrac{1}{N_i}}, & m = 0, \\[2ex] \sqrt{\dfrac{2}{N_i}}, & m \neq 0. \end{cases}$$

C. Description of the Compression Algorithm

    The studied compression algorithm consists of the following steps:
    • splitting the image into blocks;
    • calculating the DOT for each of the blocks;
    • quantization of the obtained frequency domain (lossy compression);
    • packing of quantized spectral components for subsequent lossless compression.

Fig. 1. An example of dividing an image into blocks when using FDCT for α = −1 + i, k = 4.

    In the case of the FDCT, the partition is performed in accordance with the k-fundamental fractal region (2). This region defines the shape of the block, and the block offsets over the entire image are calculated from the size of the block (Fig. 1). When the two-dimensional DCT is used, the blocks are square. Where a block extends beyond the image border, the missing values are filled with the brightness of the nearest pixel.

    Lossy compression is performed by quantizing the spectral components of each block in accordance with the quantization vector (or matrix in the two-dimensional case). Quantization vectors are calculated for each algorithm from the standard deviation (SD) of the corresponding spectral components according to the formula

$$q_i = \left\lfloor \frac{\dfrac{\sigma_{max}}{\sigma_i} + 10}{3} \right\rfloor \cdot Q,$$

where $q_i$ is the i-th component of the quantization vector (or matrix), $\sigma_i$ is the i-th component of the standard deviation vector, $\sigma_{max}$ is the maximum standard deviation over all components, and Q is an algorithm parameter that sets the degree of image compression. The purpose of this formula is to give more quantization levels to a component with a larger standard deviation. For example, for the FDCT with α = −1 + i, k = 3, the quantization vector at Q = 1 is (3, 4, 5, 4, 6, 7, 7, 6).

    The quantized values of the spectral components are recorded sequentially, with 2 bytes allocated to each component.

III. RESEARCH

A. Description of the experiment

    The comparison was carried out on 10 grayscale images of size 512×512 from the Waterloo Gray Set. All images were compressed by algorithms using the two-dimensional DCT on 4×4 and 8×8 blocks and the FDCT with parameters α = −1 + i; k = 3, 4, 5, 6.

    Two comparative measures of visual quality were chosen: PSNR, the peak signal-to-noise ratio, and MSSIM, the structural similarity measure averaged over the image.

    PSNR is calculated by the formula

$$PSNR(x, y) = 20 \log_{10} 255 - 10 \log_{10} \left( \frac{1}{N_1 N_2} \sum_{i=0}^{N_1-1} \sum_{j=0}^{N_2-1} [x(i, j) - y(i, j)]^2 \right),$$

where x and y are the compared grayscale images, and $N_1$, $N_2$ are the image width and height respectively. PSNR is measured in decibels. The higher the PSNR value, the less the image has changed compared to the original.

    MSSIM is calculated as the average SSIM over disjoint 8×8 blocks:

$$SSIM(x, y) = \frac{(2\mu_x \mu_y + C_1^2)(2\sigma_{xy} + C_2^2)}{(\mu_x^2 + \mu_y^2 + C_1^2)(\sigma_x^2 + \sigma_y^2 + C_2^2)},$$

$$MSSIM = \frac{1}{M} \sum_{i=0}^{M-1} SSIM(x_i, y_i),$$

where x and y are the grayscale images being compared, $x_i$ and $y_i$ are their i-th 8×8 blocks, $\mu$ and $\sigma^2$ denote the mean and variance of a block, $\sigma_{xy}$ is the covariance of the two blocks, M is the number of 8×8 blocks, $C_1 = 2.55$, and $C_2 = 7.65$. MSSIM values range from −1 to 1; a higher value corresponds to a better visual similarity of the two images [7].

    Information entropy was used to assess the degree of compression. It shows how much information a spectral component carries on average after compression [8] and describes the theoretical limit of compression of the sequence. Accordingly, the lower the entropy, the greater the compression ratio that can be achieved by compressing this sequence. Entropy was calculated from the sequence of quantized spectral components by the formula

$$H = -\sum_{i=0}^{65535} p_i \log_2 p_i,$$

where $p_i$ is the probability of occurrence of the value i in the sequence.

B. Results

    The study showed that for most images, at equal entropy, the algorithms based on the two-dimensional DCT give better values of the comparative measures of visual quality than the algorithms based on the FDCT (Fig. 2). However, the following can be noted. First, when


VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020)                                                          154

the entropy is 1.5 bits per sample or higher, the MSSIM values of the FDCT-based algorithms differ from those of the DCT-based algorithms by no more than 1%, which means that the difference is almost imperceptible.

    Secondly, starting from a certain value of entropy, the visual quality of images compressed by the FDCT algorithm is superior to the visual quality of images compressed by the algorithm based on the two-dimensional DCT. This can also be seen from the graphs in Fig. 2. Such a property can be useful in image transmission systems for which low PSNR values (about 20 dB) are acceptable.

    Moreover, Fig. 3 shows the nature of the distortions introduced by the FDCT fractal blocks. Compared to the square blocks of the two-dimensional DCT, the fractal structure is less noticeable, and the boundaries of the objects in the image are sharper, although noisier.
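    The PSNR and information entropy measures defined in Section III.A, on which the comparison above rests, can be computed as in the following minimal sketch. It is an illustration under stated assumptions rather than the authors' code: images are plain lists of rows of 8-bit brightness values, the function names `psnr` and `entropy_bits` are hypothetical, and identical images are mapped to an infinite PSNR, a case the formula in the text leaves undefined.

```python
import math
from collections import Counter


def psnr(x, y):
    """PSNR in dB between two equally sized 8-bit grayscale images.

    x and y are lists of rows; the peak value 255 follows the formula
    in Section III.A.
    """
    n = len(x) * len(x[0])
    mse = sum((p - q) ** 2
              for row_x, row_y in zip(x, y)
              for p, q in zip(row_x, row_y)) / n
    if mse == 0:
        return math.inf  # assumption: identical images give infinite PSNR
    return 20 * math.log10(255) - 10 * math.log10(mse)


def entropy_bits(values):
    """Empirical entropy H = -sum p_i log2 p_i of a sequence, in bits."""
    counts = Counter(values)
    total = len(values)
    # values that never occur contribute nothing to the sum
    return -sum(c / total * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    decoded = [[12, 20], [30, 38]]
    print(psnr(original, decoded))
    print(entropy_bits([0, 0, 1, 1, 2, 3]))
```

    Summing only over the values that actually occur gives the same result as summing over all 65536 possible 2-byte values, since terms with zero probability vanish.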


               Fig. 2. Dependence of the visual characteristics of the Cameraman image on the information entropy: a) PSNR; b) MSSIM.


Fig. 3. Fragments of the Cameraman image: a) before compression; b) after compression by an algorithm based on two-dimensional DCT 8×8 (PSNR = 28.7 dB; MSSIM = 0.81); c) after compression by an algorithm based on FDCT k = 6 (PSNR = 25.42 dB; MSSIM = 0.74). Both compressed images have an entropy of 0.19 bit/pixel.
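    The distortions shown in Fig. 3 originate in the quantization step of Section II.C. The sketch below illustrates the quantization vector formula from that section. The standard deviations in the example are hypothetical values, not measurements from the paper, and the way the vector is applied (division by the step followed by rounding to the nearest integer) is an assumption, since the text gives only the formula for the steps themselves. The function names are likewise hypothetical.

```python
import math


def quantization_vector(sigmas, q=1):
    """Steps q_i = floor((sigma_max / sigma_i + 10) / 3) * Q."""
    sigma_max = max(sigmas)
    return [math.floor((sigma_max / s + 10) / 3) * q for s in sigmas]


def quantize(spectrum, steps):
    """Assumed rule: divide each component by its step and round."""
    return [round(c / step) for c, step in zip(spectrum, steps)]


def dequantize(levels, steps):
    """Restore approximate spectral components from quantized levels."""
    return [level * step for level, step in zip(levels, steps)]


if __name__ == "__main__":
    sigmas = [8.0, 4.0, 1.0]            # hypothetical standard deviations
    steps = quantization_vector(sigmas)  # smaller step for larger sigma
    levels = quantize([10.0, -9.0, 5.0], steps)
    print(steps, levels, dequantize(levels, steps))
```

    The component with the largest standard deviation always receives the smallest step, 3Q, which agrees with the smallest entry of the example vector quoted in Section II.C for Q = 1.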

    Finally, in experiments on images consisting of text, the FDCT-based algorithms performed better than the algorithms based on the two-dimensional DCT, which is of practical value when working with scanned documents and books. An example of the operation of the algorithms on images containing text is shown in Fig. 4.

Fig. 4. Image fragments with text: a) before compression; b) after compression by an algorithm based on FDCT k = 3 (PSNR = 24.54 dB; MSSIM = 0.92); c) after compression by an algorithm based on two-dimensional DCT 4×4 (PSNR = 28.04 dB; MSSIM = 0.96); d) after compression by an algorithm based on two-dimensional DCT 8×8 (PSNR = 25.86 dB; MSSIM = 0.89). All compressed images have an entropy of 1.2 bits/pixel.

IV. CONCLUSION

    In this paper, a lossy image compression algorithm based on the fractal discrete cosine transform was implemented and studied. The implemented algorithm was compared with the algorithm based on the two-dimensional DCT. The comparison showed that the FDCT introduces distortions of a completely different character during compression: an image compressed by the FDCT algorithm has sharper but noisier object boundaries than one compressed by the two-dimensional DCT, and the structure of the fractal blocks is less noticeable than the structure of the square blocks of the two-dimensional DCT. Although the FDCT does not show the best numerical characteristics of visual quality at equal entropy compared to the two-dimensional DCT, the actual visual quality differs insignificantly for some values of entropy, which can be used in a number of image processing areas.

    Open problems associated with the FDCT-based compression algorithm are the synthesis of fast FDCT algorithms, the study of FDCT-based algorithms on other k-fundamental domains, and the synthesis of an algorithm for reducing the noise introduced by compression when using the FDCT.

REFERENCES

[1]   A.M. Belov, “The study of the effectiveness of one-dimensional discrete cosine transforms on the scans of two-dimensional signals generated by canonical number systems,” Computer Optics, vol. 35, no. 4, pp. 519-522, 2011.
[2]   M.S. Kasparyan, “Fractal discrete cosine transformations on prefractal areas associated with the fundamental areas of canonical number systems,” Computer Optics, vol. 38, no. 1, pp. 148-153, 2014.



[3]   I. Katai and A. Kovacs, “Canonical number system in imaginary quadratic fields,” Acta Mathematica Hungarica, vol. 37, pp. 159-164, 1981.
[4]   I. Katai and J. Szabo, “Canonical number systems for complex integers,” Acta Sci. Math. (Szeged), vol. 37, pp. 255-260, 1975.
[5]   V.M. Chernov, “Arithmetic methods for the synthesis of fast discrete orthogonal transform algorithms,” M.: Fizmatlit, 2007.
[6]   V.M. Chernov, “‘Exotic’ binary number systems for rings of Gauss and Eisenstein integers,” Computer Optics, vol. 42, no. 6, pp. 1068-1073, 2018, DOI: 10.18287/2412-6179-2018-42-6-1068-1073.
[7]   Z. Wang, A.C. Bovik, H.R. Sheikh and E.P. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Transactions on Image Processing, vol. 13, pp. 600-612, 2004.
[8]   V.D. Kolesnik and G.Sh. Poltyrev, “Information theory course,” M.: Nauka, 1982.

