=Paper= {{Paper |id=Vol-2485/paper62 |storemode=property |title=A New Approach to Reducing the Distortion of the Digital Image Natural Model in the DCT Domain When Embedding Information According to the QIM Method |pdfUrl=https://ceur-ws.org/Vol-2485/paper62.pdf |volume=Vol-2485 |authors=Oleg Evsutin,Anna Melman,Roman Meshcheryakov,Anastasia Ishakova }} ==A New Approach to Reducing the Distortion of the Digital Image Natural Model in the DCT Domain When Embedding Information According to the QIM Method== https://ceur-ws.org/Vol-2485/paper62.pdf
 A New Approach to Reducing the Distortion of the Digital Image Natural
  Model in the DCT Domain When Embedding Information According to
                          the QIM Method
            O.O. Evsutin (1,2,3), A.S. Melman (3), R.V. Meshcheryakov (2), A.O. Ishakova (2,3)
            evsutin.oo@gmail.com | annakokurina94@yandex.ru | mrv@ipu.ru | shumskaya.ao@gmail.com
            (1) National Research University Higher School of Economics, Moscow, Russia;
            (2) V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Moscow, Russia;
            (3) Tomsk State University of Control Systems and Radioelectronics, Tomsk, Russia
    One of the areas of digital image processing is the steganographic embedding of additional information into images. Digital
steganography methods are used to ensure information confidentiality, as well as to track the distribution of digital content on the
Internet. The main indicators of the effectiveness of steganographic embedding are invisibility to the human eye, characterized by the
PSNR metric, and embedding capacity. However, even when the embedding is visually imperceptible, it may distort the digital image
natural model in the frequency domain. The article presents a new approach to reducing the distortion of the digital image natural
model in the discrete cosine transform (DCT) domain when embedding information using the classical QIM method. The results of the
experiments show that the proposed approach reduces the distortion of the histograms of the distribution of DCT coefficients,
and thereby eliminates the unmasking signs of embedding.
    Keywords: information security, steganography, digital images, discrete cosine transform.


1. Introduction
    One promising way to ensure the information security of multimedia data is the use of digital steganography
and digital watermarking (DWM) techniques, which make it possible to hide additional information of various
purposes in digital objects, in particular in digital images.
    The methods for embedding information in digital images are divided into methods for embedding in the spatial
domain and in the frequency domain of digital images. In practice, embedding in the frequency domain is more
effective, since such embedding in the general case provides greater resistance to various destructive influences.
However, most of the known frequency embedding algorithms lead to significant distortions of the digital image
natural model in the frequency domain. Such distortions are an unmasking feature that reduces the resistance of
the steganographic algorithm to steganalysis, which is aimed at identifying the presence of embedded additional
information in digital objects. Using steganalysis methods, an intruder can detect the presence of an embedded
message in a given stego-image and subsequently compromise or destroy it.
    In the general case, steganalysis of digital images is considered as a two-class classification problem. Many
modern methods of steganalysis develop a classical study [4], which proposes a set of features for conducting
steganalysis. The studies presented in articles [5, 8] are aimed at minimizing distortions of the natural model of
digital images by using various feature spaces. The articles [2, 6] are devoted to steganalysis, the improvement of
steganographic algorithms to counteract it and the expansion of feature spaces.
    This article proposes a new approach to reducing the distortions of the digital image natural model in the
discrete cosine transform (DCT) domain caused by the steganographic embedding of information. Embedding is
performed according to an algorithm based on the popular steganographic method of quantization index modulation
(QIM). The idea of the proposed approach consists in adaptive selection of the quantization step (the main
parameter of the classical QIM method) depending on the characteristics of a particular cover image. The aim of
the work is to study the effectiveness of this approach and its specific algorithmic implementations.

2. The embedding operation
    The embedding operation, i.e. the operation that directly changes the values of frequency coefficients, is
based on the QIM method. The idea of the QIM method [7] is to modulate the brightness of the pixels or the values
of the frequency coefficients depending on the values of the embedded bits. In this study, the QIM method is used
to embed information in DCT coefficients. Images are processed in blocks of 8 × 8 pixels. The embedding area
consists of 36 high- and mid-frequency AC-coefficients. The embedding of information is carried out by the formula

$$c' = q \left\lfloor \frac{c}{q} \right\rfloor + \frac{q}{2}\, b_i,$$

where $c$ is the DCT coefficient before embedding, $c'$ is the DCT coefficient after embedding, $b_i$ is the
secret message bit, $q$ is the quantization step.
    The algorithm used in this study is distinguished by the possibility of error-free extraction of embedded
information due to an iterative embedding procedure. Since this feature of the algorithm does not affect the
distortion of the digital image natural model in the frequency domain, which is typical for the QIM method, we
omit its description. The authors of this study give more information on the principle of iterative correction
of extraction errors, using the discrete Fourier transform as an example, in [3].

3. Proposed Approach
    The application of the QIM method to the DCT domain is associated with the problem of distortion of the
digital image natural model. In the present work, by the digital image natural model in the DCT domain we mean
the histogram of the distribution of the DCT coefficient values. An example of a typical histogram of the DCT
coefficients of an image is shown in Fig. 1 (a).
    If we embed a message in the corresponding image using the QIM method with a predefined quantization step q,
the histogram of the DCT coefficients will take the form shown in Fig. 1 (b). The obtained histogram is markedly
different from the original. This is due to the fact that the classical QIM method narrows the set of possible
DCT coefficient values [7].
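The embedding formula of Section 2 and the narrowing of coefficient values it causes can be illustrated with a minimal Python sketch. The function names `qim_embed` and `qim_extract` are hypothetical, and the decoder (parity of the nearest multiple of q/2) is an assumption, since the paper does not describe extraction; the sample coefficients are arbitrary.

```python
import math


def qim_embed(c: float, bit: int, q: int) -> float:
    """Embed one bit into coefficient c: c' = q * floor(c / q) + (q / 2) * bit."""
    return q * math.floor(c / q) + (q / 2) * bit


def qim_extract(c_marked: float, q: int) -> int:
    """Assumed decoder (not given in the paper): parity of the nearest multiple of q / 2."""
    return round(c_marked / (q / 2)) % 2


if __name__ == "__main__":
    q = 8                                # constant quantization step, as in Fig. 1 (b)
    coeffs = [5.8, -3.7, 17.4, -16.9]    # arbitrary sample coefficients
    bits = [1, 1, 0, 1]
    marked = [qim_embed(c, b, q) for c, b in zip(coeffs, bits)]
    print(marked)                                 # [4.0, -4.0, 16.0, -20.0]
    print([qim_extract(m, q) for m in marked])    # [1, 1, 0, 1]
```

Every marked coefficient is a multiple of q/2, so the real-valued coefficients of the embedding area collapse onto a sparse grid. This is the kind of concentration that a constant step produces in the histogram.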
Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

    One of the solutions to this problem was previously proposed by the authors of this study in [1], where, to
redistribute the arising distortions of the histogram, the quantization step was variable and depended on the
ratio of the average values of the moduli of the AC-coefficients of the one-dimensional DCT inside and outside
the embedding area. This approach demonstrated a positive effect on the compensation of distortions of the
histogram, but the problem was not completely solved. The form of the histogram for a number of images still made
it possible to unambiguously determine the presence of a steganographic embedding. Fig. 1 (c) demonstrates the
influence of the approach described in [1] on the reduction of histogram distortions.

[Fig. 1: histogram panels a), b), c); x-axis: the DCT coefficient values; y-axis: the number of values.]
Fig. 1. The histogram of the image: a) before the embedding; b) after the embedding with a constant quantization
step; c) after the embedding with a variable quantization step from [1].

    This study proposes to develop the idea of using the ratio of domains within a block to select a quantization
step.
    Obviously, the embedding area undergoes the most significant distortions. At the same time, the change in the
non-embedding area, i.e. in the other AC-coefficients of the block, is negligible. Therefore, to select the
quantization step, it is proposed to use the non-embedding area of the corresponding block. The invariance of the
non-embedding area makes it possible to extract embedded data without errors, since the quantization step selected
over the non-embedding area will be the same for both the cover and the stego-images. It does not require
knowledge of any additional key information. This means that the transmission of the stego-image does not require
a preliminary exchange of keys, which would contradict the very idea of steganography, because it would compromise
the fact of the hidden transmission of information.
    In order to decide how the non-embedding area can be linked with the value q, it is necessary to pay attention
to the nature of the distortions that arise when embedding information with a constant quantization step. The
histogram in Fig. 1 (b) corresponds to the stego-image obtained with q = 8. The peaks that appear in the histogram
correspond to the values ±4, i.e. ±q/2. So, a predetermined step value increases the frequency of occurrence of
values equal to ±q/2. To reduce the probability that a new value of the quantization step for the next block will
coincide with the most frequently occurring value of DCT coefficients, i.e. in order not to enhance the growth of
possible peaks in the histogram, it is proposed to choose the least frequently encountered value of DCT
coefficients from the non-embedding area as the value of the quantization step. More precisely, the quantization
step is a positive integer, while the DCT coefficients are real values; therefore, to select the next quantization
step, the DCT coefficients from the non-embedding area must be taken in absolute value and rounded (an example is
shown in Fig. 2). To determine the least common values of DCT coefficients, it is necessary to construct a
histogram of the distribution of their converted values over the non-embedding area.

        DCT coefficients                                    Rounded absolute values
               -16.9   -8.4   -0.6    2.4    5.2    0.4            17    8    1    2    5    0
         45.4   28.0    5.1   -3.7   -1.7   -8.3             45   28    5    4    2    8
        -15.1  -16.8    2.1   -7.4    4.1                    15   17    2    7    4
         17.4   11.4   -6.3    3.2                           17   11    6    3
          7.9    3.3    4.1                                   8    3    4
          5.8    0.3                                          6    0
          1.8                                                 2
Fig. 2. Transformation of the non-embedding area.

    The authors of the study considered two options for choosing the quantization step based on the obtained
histogram of the non-embedding area:
‒ from a group of values with a frequency not exceeding a set threshold;
‒ from all values of the non-embedding area.
    To implement the first option, it is necessary to pre-set the threshold for the frequency of occurrence of
rounded absolute values of the DCT coefficients, among which the quantization step will be selected. Then the
values that occur no more often than the threshold form a group. The quantization step is equal to the smallest
value in the group. The second option does not require explicitly setting a threshold value. In this case, the
smallest of the most rarely found values in the block is selected as the quantization step.
    These approaches are somewhat similar, since in both cases the decision to select the quantization step is
made using a threshold value of the frequency of occurrence of DCT coefficients. However, the fundamental
difference between them is that the first option operates on a single threshold value for all image blocks, while
the second option uses different threshold values for different blocks.
    It is empirically found that a quantization step of less than three provides a very small capacity,
insufficient for effective operation, and a quantization step of more than twenty leads to a significant
deterioration in the visual quality of images; therefore we introduce the condition 3 ≤ q ≤ 20.
    Fig. 3 shows an example of selecting a quantization step q over the non-embedding area for both variants. In
the first case, the choice of q is made according to a group of values, the
occurrence frequency of which should be no more than two. The smallest among these values is three, therefore
q = 3. In the second case, the lowest frequency of occurrence of individual values should first be determined; in
this case it is equal to one. Then, from the values encountered only once, the smallest is selected, and in the
end q = 7.

[Fig. 3: two bar charts, Option No. 1 (q = 3) and Option No. 2 (q = 7); x-axis: q; y-axis: the number of values.]
Fig. 3. The choice of the quantization step over the non-embedding area.

4. The results of the experiments
    To evaluate the effectiveness of the proposed approach and to compare the two options described above,
computing experiments were conducted. For the experiments, 20 classic test images of 512 × 512 pixels in grayscale
were taken from the “USC-SIPI Image Database”.
    When evaluating the effectiveness, two standard characteristics are used: capacity, i.e. the ratio of the
number of embedded bits to the size of the container, and the PSNR metric, which is calculated by the formula

$$\mathrm{PSNR} = 20 \log_{10}\left(\frac{255}{\mathrm{RMSE}}\right), \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - Q_i\right)^2},$$

where $n$ is the total number of pixels, $P_i$ is the pixel value of the cover image, $Q_i$ is the pixel value of
the stego-image.
    Since the value of a quantization step chosen over a group is on average less than when it is chosen over the
entire non-embedding area, the algorithm capacity when using the first option is much lower. So, the maximum
capacity for the first option is an average of 201 bit/pixel over 20 test images, with the corresponding average
PSNR of 36.19 dB, while for the second option, the average capacity is 0.34 bits/pixel with a PSNR of 35.63 dB.
The comparison of the histogram of the container, the histogram of the stego-image obtained with a constant
quantization step q = 8 at the maximum capacity, and the histogram of the stego-image obtained using the second
option at the maximum capacity is shown in Fig. 4. The form of the histogram of the stego-image obtained using the
selection of q over the non-embedding area is close to the form of the histogram of the container. However, for a
correct assessment of the effectiveness of the proposed approach and both its variants, the capacity in all cases
was further set equal to the capacity obtained in the case of a selection over a group.

[Fig. 4: histogram panels a), b), c); x-axis: the DCT coefficient values; y-axis: the number of values.]
Fig. 4. Histograms for the “Airplane” image: a) container; b) stego-image (constant q, maximum capacity);
c) stego-image (new approach, option No. 2, maximum capacity).

    Fig. 5 and Fig. 6 present the results of comparing the histograms for the “Airplane” and “Baboon” images
obtained for the cover and stego-images with different options for choosing the quantization step. For the
“Airplane” test image, the embedding capacity is 0.29 bit/pixel; for the “Baboon” test image, it is 0.11
bit/pixel. These images are selected as examples, as they belong to different types: the “Airplane” image contains
many one-colour areas, while the “Baboon” image has a high degree of detail.
    According to Fig. 5 and Fig. 6, the application of the proposed approach significantly reduces the distortions
that occur when embedding information. The histograms shown in Fig. 5 (b) and Fig. 6 (b) unambiguously show the
presence of a steganographic embedding in the corresponding images (q = 8), while histograms corresponding to
stego-images obtained with the proposed approach conform with the natural form of the histograms of cover images.
So we can conclude that the proposed approach can reduce the distortions of the digital image natural model in the
frequency domain, and therefore increase the imperceptibility of embedding.
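The two options for choosing the quantization step described in Section 3 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and three details are assumptions not stated in the paper: only values that actually occur in the block are candidates, the condition 3 ≤ q ≤ 20 is applied before the rarest value is chosen, and `None` is returned when no admissible value exists. The input data are the 27 coefficients of the non-embedding area from Fig. 2.

```python
import math
from collections import Counter

Q_MIN, Q_MAX = 3, 20  # admissible range of the quantization step from the text


def value_histogram(coeffs):
    """Histogram of rounded absolute values of the non-embedding coefficients."""
    return Counter(math.floor(abs(c) + 0.5) for c in coeffs)


def select_q_group(coeffs, threshold):
    """Option 1: smallest admissible value whose frequency does not exceed the threshold."""
    hist = value_histogram(coeffs)
    group = [v for v, n in hist.items() if n <= threshold and Q_MIN <= v <= Q_MAX]
    return min(group) if group else None  # None is an assumed fallback


def select_q_rarest(coeffs):
    """Option 2: smallest among the least frequent admissible values of the block."""
    hist = value_histogram(coeffs)
    admissible = {v: n for v, n in hist.items() if Q_MIN <= v <= Q_MAX}
    if not admissible:
        return None  # assumed fallback, not specified in the paper
    lowest = min(admissible.values())
    return min(v for v, n in admissible.items() if n == lowest)


# Non-embedding area of one block, values from Fig. 2.
FIG2_BLOCK = [
    -16.9, -8.4, -0.6, 2.4, 5.2, 0.4,
    45.4, 28.0, 5.1, -3.7, -1.7, -8.3,
    -15.1, -16.8, 2.1, -7.4, 4.1,
    17.4, 11.4, -6.3, 3.2,
    7.9, 3.3, 4.1,
    5.8, 0.3,
    1.8,
]

if __name__ == "__main__":
    print(select_q_group(FIG2_BLOCK, threshold=2))  # 3
    print(select_q_rarest(FIG2_BLOCK))              # 7
```

Under these assumptions the sketch reproduces the example of Fig. 3: q = 3 for the first option with a threshold of two, and q = 7 for the second option. Because only the non-embedding area is read, the same step can be recomputed from the stego-image.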
[Figure: four histogram panels a) to d); horizontal axis: the DCT coefficient values, from -80 to 80; vertical axis: the number of values, from 0 to 50000.]
 Fig. 5. Histograms for the “Airplane” image: a) container; b)
 stego-image (constant q); c) stego-image (new approach,
 option No. 1); d) stego-image (new approach, option No. 2).

    To evaluate the differences between the histograms numerically,
the value of the RMSE metric between the histograms of the
cover images and the corresponding stego-images was
calculated. On average over 20 test images, the RMSE value
between the histograms of cover and stego-images obtained with
a constant q = 8 was 1017.80; for the “Airplane” and “Baboon”
images it was 513.13 and 268.55, respectively. The results for the
first option of the variable q: 565.81 on average, 799.21 for
“Airplane”, 184.76 for “Baboon”. The results for the second
option of the variable q: 598.06 on average, 680.40 for
“Airplane”, 288.72 for “Baboon”. It can be concluded that, on
average, the differences between the histograms of cover and
stego-images are much smaller when using a variable q. Choosing
the quantization step over a group (the first option) gives a lower
average RMSE value than choosing it over all values (the second
option), but the difference between them is not significant. It is
also worth noting that the second option provides a larger
embedding capacity, so the first option is preferable for a small
embedding volume, while the second option should be used to
embed a larger message.
    The analysis of the presented histograms showed that, on the
one hand, they do not contain characteristic peaks and, on the
other hand, the proposed approach restores exactly the natural
form of the initial histograms and does not lead to excessive
“uniformity”, which could also become an unmasking sign.

[Figure: four histogram panels a) to d); horizontal axis: the DCT coefficient values, from -80 to 80; vertical axis: the number of values, from 0 to 17500.]
 Fig. 6. Histograms for the “Baboon” image: a) container; b)
 stego-image (constant q); c) stego-image (new approach,
 option No. 1); d) stego-image (new approach, option No. 2).
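
    The numerical comparison of cover and stego histograms by RMSE can be illustrated with a minimal sketch. It assumes that a histogram counts rounded DCT coefficient values in unit-width bins over the range shown in the figures (-80 to 80), and that the RMSE is the square root of the mean squared difference of bin counts. The function names coefficient_histogram and histogram_rmse and the toy coefficient lists are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def coefficient_histogram(coeffs, lo=-80, hi=80):
    """Count rounded coefficient values in unit-width bins lo..hi (assumed binning)."""
    # Python's round() rounds exact halves to the nearest even integer.
    counts = Counter(round(c) for c in coeffs)
    # Values outside [lo, hi] are simply ignored, as in a clipped plot.
    return [counts[v] for v in range(lo, hi + 1)]


def histogram_rmse(hist_a, hist_b):
    """Root-mean-square error between two histograms with identical bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    squared = sum((a - b) ** 2 for a, b in zip(hist_a, hist_b))
    return math.sqrt(squared / len(hist_a))


# Toy data standing in for the DCT coefficients of a cover and a stego-image.
cover = [0, 0, 1, -1, 2]
stego = [0, 1, 1, -1, 2]
h_cover = coefficient_histogram(cover, lo=-2, hi=2)
h_stego = coefficient_histogram(stego, lo=-2, hi=2)
print(histogram_rmse(h_cover, h_stego))  # about 0.632
```

    In this sketch a smaller RMSE means that the stego histogram stays closer to the cover histogram. The absolute values depend on the assumed binning and range, so they are not directly comparable with the figures reported in the article.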
5. Conclusion
    The article presented and investigated a new approach to
reducing the distortions of a digital image natural model in the
DCT domain when embedding information using the QIM
method. As can be seen from the experimental results, applying
this approach reduces the unmasking signs of the embedding in
the frequency domain. In the future, it is planned to continue
reducing the distortions of the natural model of images in the
frequency domain by adapting the embedding parameters to a
specific container.

6. Acknowledgments
    This work was financially supported by a grant from the
Russian Foundation for Basic Research and the Tomsk Region
in the framework of project No. 19-47-703003 and by a grant
from the Russian Foundation for Basic Research in the
framework of project No. 18-29-22104.

7. References
[1] Evsutin O.O. Steganographic embedding of information
    into the frequency domain of the DCT of digital images
    using the QIM method with variable quantization step /
    O.O. Evsutin, A.S. Kokurina, R.V. Meshcheryakov //
    Proceedings of the 28th International Conference on
    Computer Graphics and Vision «GraphiCon 2018». – 2018.
    – Russia, Tomsk. – 297-300.
[2] Denemark T. Steganalysis Features for Content-Adaptive
    JPEG Steganography / T. Denemark, M. Boroumand, J.
    Fridrich // IEEE Transactions on Information Forensics and
    Security. – 2016. – Vol. 11(8). – P. 1736-1746.
[3] Evsutin O. The adaptive algorithm of information
    unmistakable embedding into digital images based on the
    discrete Fourier transformation / O. Evsutin, A. Kokurina,
    R. Meshcheryakov, O. Shumskaya // Multimedia Tools and
    Applications. – 2018. – Vol. 77(21). – P. 28567-28599.
[4] Fridrich J. Feature-Based Steganalysis for JPEG Images and
    its Implications for Future Design of Steganographic
    Schemes // Lecture Notes in Computer Science. – 2004. –
    Vol. 3200. – P. 67-81.
[5] Holub V. Universal Distortion Function for Steganography
    in an Arbitrary Domain / V. Holub, J. Fridrich, T. Denemark
    // EURASIP Journal on Information Security. – 2014. – №
    1. – P. 1-13.
[6] Kodovsky J. Ensemble Classifiers for Steganalysis of
    Digital Media / J. Kodovsky, J. Fridrich, V. Holub // IEEE
    Transactions on Information Forensics and Security. –
    2012. – Vol. 7(2). – P. 432-444.
[7] Mitekin V. A new QIM-based watermarking algorithm
    robust against multi-image histogram attack / V. Mitekin,
    V. Fedoseev // Procedia Engineering. – 2017. – Vol. 201. –
    P. 453-462.
[8] Sedighi V. Content-Adaptive Steganography by
    Minimizing Statistical Detectability // V. Sedighi, R.
    Cogranne, J. Fridrich // IEEE Transactions on Information
    Forensics and Security. – 2016. –Vol. 11(2). – P. 221-234.