=Paper= {{Paper |id=Vol-2665/paper35 |storemode=property |title=Research of Lossy image compression algorithm based on fractal discrete cosine transform |pdfUrl=https://ceur-ws.org/Vol-2665/paper35.pdf |volume=Vol-2665 |authors=Vladislav Butorov,Marina Chicheva }} ==Research of Lossy image compression algorithm based on fractal discrete cosine transform == https://ceur-ws.org/Vol-2665/paper35.pdf
   Research of Lossy Image Compression Algorithm
     Based on Fractal Discrete Cosine Transform
                      Vladislav Butorov                                                                   Marina Chicheva
              Samara National Research University                                    Image Processing Systems Institute of RAS - Branch of the FSRC
                       Samara, Russia                                                           "Crystallography and Photonics" RAS;
                     twoeightnine@list.ru                                                      Samara National Research University
                                                                                                           Samara, Russia
                                                                                                        mchi@geosamara.ru


    Abstractβ€”Lossy image compression algorithm based on                      ο€             𝑧 = βˆ‘π‘˜π‘—=0 𝑧𝑗 𝛼 𝑗 , 𝑧𝑗 ∈ 𝑁 = {0,1, . . , |π‘π‘œπ‘Ÿπ‘š(𝛼)| βˆ’ 1}ο€      (1)ο€ 
fractal discrete cosine transform is proposed in this paper. The
created algorithm is compared to an algorithm based on two-
                                                                                  CNS in the field 𝑄(βˆšπ‘‘) is called a pair {Ξ±, N} , k-
dimensional discrete cosine transform. It is shown
experimentally that the described algorithm brings less                      fundamental domain πΊπ‘˜ is the set of algebraic elements of
distortion concerning block structure in comparison with                     the field 𝑄(βˆšπ‘‘), created by k-membered sum of a formula
square blocks of two-dimensional discrete cosine transform. It               (1),
is remarked that visual quality characteristics of both
algorithms vary poorly for several values ranges of entropy of               ο€                        πΊπ‘˜ = {βˆ‘π‘˜βˆ’1     𝑗
                                                                                                            𝑗=0 𝑧𝑗 𝛼 , 𝑧𝑗 ∈ 𝑁}.ο€                       (2)ο€ 
compressed image.
                                                                                                                      πœ‹Im(𝛼¯ π‘˜+1 𝑛(π‘₯+𝛽))
    Keywordsβ€”lossy image compression, discrete                   cosine             Let    𝛬COSπ‘˜ (π‘š, 𝑛) = π‘π‘œπ‘  (                            ) ,   where
                                                                                                                        π‘π‘œπ‘Ÿπ‘š(𝛼 π‘˜ )Im(𝛼)
transform, canonical number system, fractal DCT
                                                                             parameter Ξ² is set for the reason of orthogonality:
                         I.    INTRODUCTION
                                                                                          βˆ‘π‘₯βˆˆπΊπ‘˜ 𝛬 COSπ‘˜ (𝑝, π‘₯) β‹… 𝛬COSπ‘˜ (π‘ž, π‘₯) = 0, 𝑝 β‰  π‘ž,ο€ 
    For compression of images, lossy compression
algorithms are widely used, since the losses introduced into
the image can be invisible to the eyes and practically do not                       For example, for π‘π‘œπ‘Ÿπ‘š(𝛼) = 2 parameter Ξ² is calculated
affect the visual quality. In such algorithms, compression is                as
performed in the frequency domain, to obtain values in                                                          𝛼 π‘˜+1 βˆ’2𝛼 π‘˜ +1
                                                                                                           𝛽=                .ο€ 
which discrete orthogonal transforms (DOTs) are used.                                                              2(π›Όβˆ’1)
Discrete cosine transform (DCT), namely, its two-
dimensional variation, is widely used in the field of image                         Then FDCT over πΊπ‘˜ is called a transformation
processing. Since two-dimensional DCT is defined on a
square region, the resulting artifacts in compression have a                                 𝑋(π‘š) = πœ†(π‘š) βˆ‘π‘›βˆˆπΊπ‘˜ π‘₯ (𝑛)π›¬οƒοο“π‘˜ (π‘š, 𝑛),ο€ 
very noticeable mesh structure. To eliminate this effect, one
can use the classical one-dimensional DCT applied to the                     where π‘š ∈ π·π‘˜ , and πœ†(π‘š) is FDCT.
sweep generated by some canonical number system (CNS)
[1], or the fractal DCT (FDCT) defined on the fractal region                        Reverse FDCT (RFDCT) is called
generated by CNS [2]. In this paper, we study a lossy
compression algorithm that uses various variations of the                                    π‘₯(𝑛) = βˆ‘π‘šβˆˆπ·π‘˜ πœ† (π‘š)𝑋(π‘š)π›¬οƒοο“π‘˜ (π‘š, 𝑛),ο€ 
FDCT. The results of the algorithm are compared with a
compression based on two-dimensional DCT.                                    where 𝑛 ∈ πΊπ‘˜ , and πœ†(π‘š) is the normalizing coefficient of
                                                                             FDCT.
                  II.    THE THEORETICAL BASIS
                                                                                The normalizing coefficient of FDCT and RFDCT is
A. Fractal DCT                                                               equal and is calculated using the following equation:
    This section provides brief theoretical information about                                                  1
                                                                                                      βˆšπ‘π‘œπ‘Ÿπ‘š(π›Όπ‘˜) , 2π‘š ≑ 0(π‘šπ‘œπ‘‘π›Ό π‘˜ )
the CNS in imaginary quadratic fields [3]-[6], k- fundamental                                πœ†(π‘š) =                              .ο€ 
domains, and FDCT [2].                                                                                    2
                                                                                                      βˆšπ‘π‘œπ‘Ÿπ‘š(π›Όπ‘˜) , 2π‘š β‰  0(π‘šπ‘œπ‘‘π›Ό π‘˜ )
                                                                                                    {
     Let 𝑄(βˆšπ‘‘) is a quadratic field: 𝑄(βˆšπ‘‘) = {𝑧 = π‘Ž +
𝑏𝑑; π‘Ž, 𝑏 ∈ 𝑄}, d is an integer, free of squares. Then the field                  The field π·π‘˜ is found algorithmically for the reasons of
element 𝑧 ∈ 𝑄(βˆšπ‘‘) is called a whole algebraic field element                  orthogonality of the basis functions:
if its norm and trace are integers
                                                                                               βˆ‘ 𝛬 COSπ‘˜ (𝑝, π‘₯) β‹… 𝛬COSπ‘˜ (π‘ž, π‘₯) = 0;
                π‘π‘œπ‘Ÿπ‘š(𝑧) = (π‘Ž + π‘βˆšπ‘‘)(π‘Ž βˆ’ π‘βˆšπ‘‘),ο€ 
                                                                                                      π‘₯ ∈ πΊπ‘˜ ; 𝑝, π‘ž ∈ π·π‘˜ ; 𝑝 β‰  π‘ž
                π‘‡π‘Ÿ(𝑧) = (π‘Ž + π‘βˆšπ‘‘) + (π‘Ž βˆ’ π‘βˆšπ‘‘).ο€ 
                                                                                               βˆ‘ 𝛬 COSπ‘˜ (𝑝, π‘₯) β‹… 𝛬COSπ‘˜ (π‘ž, π‘₯) β‰  0;
    The whole algebraic element 𝛼 ∈ 𝑄(βˆšπ‘‘) is the basis of                                             π‘₯ ∈ πΊπ‘˜ ; 𝑝, π‘ž ∈ π·π‘˜ ; 𝑝 = π‘ž
the CNS in the ring of integer elements 𝑄(βˆšπ‘‘), if any whole                         The algorithm for calculating this region is described in
element of this field is uniquely representable in the form of               [2].
a finite sum



Copyright Β© 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)
Image Processing and Earth Remote Sensing
                                                                                                               πœŽπ‘šπ‘Žπ‘₯
B. Two-dimensional DCT                                                                                              +10
                                                                                                                πœŽπ‘–
                                                                                                        π‘žπ‘– = ⌊           βŒ‹ β‹… 𝑄,ο€ 
                                                                                                                    3
   Two-dimensional DCT is called the transformation
                       𝑋(π‘š1 , π‘š2 ) =                                          where π‘žπ‘– is i-th component of the quantization vector (or
                             πœ‹π‘˜ (𝑛 +0.5)       πœ‹π‘˜ (𝑛 +0.5)
 βˆ‘π‘1 βˆ’1 𝑁2 βˆ’1
       βˆ‘      π‘₯ (𝑛 ,
                  1 2𝑛 )π‘π‘œπ‘  ( 1 1       ) π‘π‘œπ‘  ( 2 2       ),ο€                  matrix), πœŽπ‘– is i-th component of the mean squared error
  𝑛1 =0 𝑛2 =0                                     𝑁1                     𝑁2
                                                                              vector, πœŽπ‘šπ‘Žπ‘₯ is the maximum value of the standard deviation
                                                                              for all components, Q is the algorithm parameter, which is
where π‘₯(𝑛1 , 𝑛2 ) is a source signal (block of image                          the image compression ratio setting. The meaning of this
brightness), 𝑁𝑖 is the size of the i-th side of the block, and                formula is to give more quantization levels to a component
𝑋(π‘š1 , π‘š2 ) is the resulting spectrum of the source signal.                   with a larger standard deviation. For example, for FDCT 𝛼 =
    Then the inverse two-dimensional DCT is called the                        βˆ’1 + 𝑖, π‘˜ = 3, the quantization vector at 𝑄 = 1 is equal to
transformation                                                                (3, 4, 5, 4, 6, 7, 7, 6).
                                                                                  The quantized values of the spectral components are
                    𝑁 βˆ’1 𝑁 βˆ’1                                                 recorded sequentially, 2 bytes were allocated to each
     π‘₯(𝑛1 , 𝑛2 ) = βˆ‘π‘š11=0 βˆ‘π‘š22=0 πœ†1 (π‘š1 )πœ†2 (π‘š2 )𝑋(π‘š1 , π‘š2 ) Γ—
                        πœ‹π‘˜1 (π‘š1 +0.5)              πœ‹π‘˜2 (π‘š2 +0.5)              component.
                π‘π‘œπ‘  (                   ) π‘π‘œπ‘  (                    ),ο€ 
                             𝑁1                         𝑁2
                                                                                                        III.     RESEARCH
where πœ†π‘– (π‘š) is a normalizing coefficient calculated as                       A. Description of the experiment
                                          1
                                       ,π‘š = 0                                     The comparison was carried out on 10 halftone images
                                   βˆšπ‘π‘–
                         πœ†π‘– (π‘š) = { 2         .ο€                               512Γ—512 in size from the Waterloo Gray Set. All images
                                   βˆšπ‘ , π‘š β‰  0                                 were compressed by algorithms using two-dimensional DCT
                                              𝑖                               on blocks 4 Ρ… 4 and 8 Ρ… 8 and FDCT with parameters 𝛼 =
                                                                              βˆ’1 + 𝑖; π‘˜ = 3,4,5,6.
C. Description of the Compression Algorithm                                       As a comparative measure of visual quality, PSNR, or the
    The studied compression algorithm consists of the                         ratio of peak signal to noise, and MSSIM, or a measure of
following steps:                                                              structural similarity averaged over the image, were chosen.
    ο‚· splitting the image into blocks;                                           PSNR is calculated by the formula
    ο‚· calculating the DOT for each of the blocks;                                                    𝑃𝑆𝑁𝑅(π‘₯, 𝑦) = 20π‘™π‘œπ‘”10 255 βˆ’
                                                                                                 1
                                                                                      10π‘™π‘œπ‘”10         βˆ‘π‘1 βˆ’1 𝑁2 βˆ’1                 2
                                                                                                       𝑖=0 βˆ‘π‘—=0 [π‘₯(𝑖, 𝑗) βˆ’ 𝑦(𝑖, 𝑗)] ,
    ο‚· quantization of the obtained frequency domain (lossy                                      𝑁1 𝑁2
      compression);                                                           where x and y are the compared grayscale images, 𝑁1, 𝑁2 are
                                                                              image width and height respectively; PSNR value is
    ο‚· packing of quantized spectral components for                            measured in decibels. The higher the PSNR value, the less
      subsequent lossless compression.                                        the image has changed compared to the original.
                                                                                 MSSIM is calculated as the average SSIM for disjoint
                                                                              8ο‚΄8 blocks :
                                                                                                                 (2πœ‡π‘₯ πœ‡π‘¦+𝐢12 )(2𝜎π‘₯𝑦+𝐢22 )
                                                                                          𝑆𝑆𝐼𝑀(π‘₯, 𝑦) = 2 +πœ‡ 2 +𝐢 2 )(𝜎 2 +𝜎 2 +𝐢 2 )        ,
                                                                                                     (πœ‡π‘₯   𝑦    1     π‘₯    𝑦    2
                                                                                                    1 π‘€βˆ’1
                                                                                             𝑀𝑆𝑆𝐼𝑀 = βˆ‘π‘–=0 𝑆𝑆𝐼𝑀 (π‘₯, 𝑦),
                                                                                                    𝑀
                                                                              where x and y are grayscale images being compared, M is the
                                                                              number of 8ο‚΄8 blocks, 𝐢1 = 2.55 , 𝐢2 = 7.65 . MSSIM
                                                                              values range from -1 to 1, the higher value corresponds to a
                                                                              better visual similarity of two images [7].
Fig. 1. An example of dividing an image into blocks when using FDCT
for Ξ±=-1+i, k=4.                                                                  To assess the degree of compression, informational
                                                                              entropy was used. Information entropy shows how much
    In the case of FDCP, the partition is performed in                        information the spectral component carries on average after
accordance with the k- fundamental fractal region (2). This                   compression [8], and describes the theoretical limit of
area describes the shape of the block, and the block offsets                  sequence compression. Accordingly, the lower the value of
over the entire image are calculated based on the size of the                 entropy, the greater the compression ratio can be achieved by
block (Fig.1). When using two-dimensional DCT, square                         compressing this sequence. Entropy was calculated from a
blocks are used. In cases where the blocks go beyond the                      sequence of quantized spectral components by the formula
image border, the missing values are supplemented by
brightness values from the nearest pixel.                                                          𝐻 = βˆ’ βˆ‘65535
                                                                                                            𝑖=0 𝑝𝑖 π‘™π‘œπ‘”2 𝑝𝑖 ,
                                                                              where 𝑝𝑖 – is the probability of occurrence of the value of i in
    Lossy compression is performed by quantizing the                          the sequence.
spectral components of each block in accordance with the
quantization vector (or matrix in the two-dimensional case).                  B. Results
Quantization vectors are calculated for each algorithm based                     As a result of the study, it turned out that for most images
on the standard square deviation (SSD) of the corresponding                   for equal values of entropy, algorithms based on two-
spectral components according to the formula                                  dimensional DCT show the best values of comparative
                                                                              measures of visual quality compared to algorithms based on
                                                                              FDCT (Fig. 2), but the following can be noted: firstly, when


VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020)                                                          154
Image Processing and Earth Remote Sensing

the entropy is one and a half bits per sample and higher, the               seen from the graphs in Fig. 2. Such a property can be useful
MSSIM value for FDCT-based algorithms differ no more                        in image transmission systems for which low PSNR values
than by 1%, which means that the difference is almost                       (about 20 dB) are acceptable.
imperceptible.
                                                                                Moreover, Fig. 3 shows the nature of the distortions
    Secondly, starting from a certain value of entropy, the                 introduced by the FDCT fractal blocks. Compared to the
visual quality of images compressed by the FDCT algorithm                   square blocks of two-dimensional DCT, the fractal structure
is superior to the visual quality of images compressed by the               is less noticeable, and the boundaries of the objects in the
algorithm based on two-dimensional DCT. This can also be                    image      are      sharper,   although     more      noisy.




               Fig. 2. Dependence of the visual characteristics of the Cameraman image on the information entropy: a) PSNR; b) MSSIM.




 Fig. 3. Fragments of the Cameraman image: a) before compression; b) after compression by an algorithm based on two-dimensional DCT 8ο‚΄8 (PSNR =
28.7 dB; MSSIM = 0.81); c) after compression by an algorithm based on FDCT k = 6 (PSNR = 25.42 dB; MSSIM = 0.74). Both compressed images have an
                                                               entropy of 0.19 bit/pixel.

    Finally, it can be noted that in experiments on images                  algorithms, the study of FDCT-based algorithms in other k-
consisting of text, FDCT-based algorithms showed                            fundamental areas, as well as the synthesis of the algorithm
themselves better than algorithms based on two-dimensional                  for reducing the noise introduced by compression when
DCT, which makes great practical sense when working with                    using FDCT.
scanned documents and books. An example of the operation
of algorithms in images containing text is shown in Fig. 4.
                        IV. CONCLUSION
    In this paper, a lossy image compression algorithm based
on a fractal discrete cosine transform was implemented and
studied. The implemented algorithm was compared with the
algorithm based on two-dimensional DCT. As a result, it                     Fig. 4. Image fragments with text: a) before compression; b) after
                                                                            compression by an algorithm based on FDCT k = 3 (PSNR = 24.54 dB;
turned out that FDCT has a completely different character of                MSSIM = 0.92); c) after compression by an algorithm based on two-
distortions introduced into the image during compression: an                dimensional DCT 4ο‚΄4 (PSNR = 28.04 dB; MSSIM = 0.96); c) after image
image compressed by the FDCT algorithm has sharper but                      compression by an algorithm based on two-dimensional DCT 8ο‚΄8 (PSNR =
more noisy object boundaries compared to two-dimensional                    25.86 dB; MSSIM = 0.89). All compressed images have an entropy of 1.2
                                                                            bits/pixel.
DCT; the structure of fractal blocks is less noticeable than
the structure of a square block of two-dimensional DCT.                                                  REFERENCES
Despite the fact that FDCT does not show the best numerical
                                                                            [1]   A.M. Belov, β€œThe study of the effectiveness of one-dimensional
characteristics of visual quality with an equal value of                          discrete cosine transforms on the scans of two-dimensional signals
entropy compared to two-dimensional DCT, the actual visual                        generated by canonical number systems,” Computer Optics, vol. 35,
quality differs insignificantly for some values of entropy,                       no. 4, pp. 519-522, 2011.
which can be used in a number of image processing areas.                    [2]   M.S. Kasparyan, β€œFractal discrete cosine transformations on
                                                                                  prefractal areas associated with the fundamental areas of canonical
   Actual problems associated with the FDCT-based                                 number systems,” Computer Optics, vol. 38, no. 1, pp. 148-153, 2014.
compression algorithm are the synthesis of fast FDCT



VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020)                                                           155
Image Processing and Earth Remote Sensing

[3]   I. Katai and A. Kovacs, β€œCanonical number system in imaginary          [6]   V.M. Chernov, β€œExotic" binary number systems for rings of Gauss
      quadratic fields,” Acta Mathematica Hungarica, vol. 37, pp. 159-164,         and Eisenstein integers,” Computer Optics, vol. 42, no. 6, pp. 1068-
      1981.                                                                        1073, 2018, DOI: 10.18287/2412-6179-2018-42-6-1068-1073.
[4]   I. Katai and J. Szabo, β€œCanonical number systems for complex           [7]   Z. Wang, Alan C. Bovik, Hamid R. Sheikh and E.P. Simoncelli,
      integers,” Acta Sci. Math. (Szeged), vol. 37, pp. 255-260, 1975.             β€œImage Quality Assessment: From Error Visibility to Structural
[5]   V.M. Chernov, β€œArithmetic methods for the synthesis of fast discrete         Similarity,” IEEE Transactions on Image Processing, vol. 13, pp. 600-
      orthogonal transform algorithms,” M.: Fizmatlit, 2007.                       612, 2004.
                                                                             [8]   V.D. Kolesnik and G.Sh. Poltyrev, β€œInformation theory course,” M.:
                                                                                   Nauka, 1982.




VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020)                                                             156