Discrete Transformations and Noise-Resistant Coding of Still Images in Ste-
                          ganography Problems

        Vladimir N. Kustov, Anatoly A.                                                 Boris V. Sokolov
        Kornienko, Dmitry K. Protsko
                                                                        Laboratory of Information Technologies
    Department of Computer Science and In-                                 in System Analysis and Modeling,
             formation Security,                                         St. Petersburg Institute for Informatics
    Emperor Alexander I St. Petersburg State                                  and Automation of the RAS,
            Transport University,                                               Saint Petersburg, Russia
           Saint Petersburg, Russia                                              sokolov_boris@inbox.ru
    kvnvika@mail.ru, kaa.pgups@yandex.ru

                                                                  • The need to develop and implement new, more ad-
                                                                vanced methods of noise-resistant coding of hidden mes-
                            Abstract                            sages transmitted over communication channels, to en-
                                                                sure their integrity.
       This article is another attempt of a comprehen-
       sive solution in the field of steganography data           • Development of new, simpler, faster and energy-
       transmission. We also consider software model            efficient noise-resistant decoders that have characteristics
       prototype that fully implements the process of           close to the optimal decoders, but with linear (polynomi-
       hidden messages transmission in digital photo-           al) time costs for the implementation of the decoding
       graphs. All stages of hidden message processing          process.
       on the whole way from the sender to the recipi-              These paper attempts of a comprehensive solution
       ent are taken into account. The programming              that takes into account the above features in one way or
       model uses a discrete wavelet transform and              another, and presents a prototype of the software model
       new hiding algorithms based on the «Arnold cat           for implementing the process of transmitting hidden mes-
       map» decomposition. The efficiency of using              sages in digital still images. Using discrete transfor-
       noise-resistant coding methods and multi-                mations of still images, new hiding algorithms and meth-
       threshold decoding to ensure high probability of         ods of noise-resistant coding to ensure a high probability
       integrity and reliability of hidden messages             of their integrity and reliability during transmission over
       when transmitting them through communication             communication channels with a high level of noise pro-
       channels with a high level of noise is also              posed.
       shown.
                                                                1 Discrete Transformation of Still Images
   Introduction                                                 Methods of image steganography can be divided into two
                                                                groups: the first uses the spatial area of the image, the
   The following features can characterize the current state    second - the frequency area. Methods that edit a spatial
   of research in the field of steganography:                   area directly affect the container itself (for example,
     • Extensive use of discrete transformations of digital     changing image pixels). However, such methods are
   still images used as containers for steganography mes-       highly sensitive to various container changes: scale mod-
   sage, and the need to combine the well-known methods         ification, rotation, cropping, adding noise or interference
   of discrete signal conversion with simple, well-tested old   to the channel, various loss compression - are likely to
   algorithms.                                                  destroy the hidden message.
     • Development of more and more new methods of                  Methods that allow messages to be embedded in the
   steganography, which provide high secrecy, confidential-     frequency domain first convert the container file and then
   ity and reliability of message delivery.                     only perform the embedding. These methods do not de-
     • Increasingly more complete account of the charac-        pend on image formats [Dav07], [Che08]. The infor-
   teristics of data transmission channels and interference     mation hiding the steganography methods is based on
   arising in it, especially in the conditions of their high    linear orthogonal transformations such as:
   noise level.
              1


                                                                 • Discrete Hadamard transform (DHT);
                                                                 • Discrete Fourier transform (DFT);
   Copyright © by the papers’ authors. Copying permitted
   for private and academic purposes.
   In: B. V. Sokolov, A. D. Khomonenko, A. A. Bliudov           Mathematical and Natural-Scientific Training in Engi-
   (eds.): Selected Papers of the Workshop Computer Sci-        neering Education", St.-Petersburg, Russia, 8–9 Novem-
   ence and Engineering in the framework of the 5 th Inter-     ber, 2018, published at http://ceur-ws.org
   national Scientific-Methodical Conference "Problems of


                                                                                                                       12
  • Discrete cosine transform (DCT);
  • Discrete wavelet transform (DWT);
  • Singular value decomposition (SVD).
   Among all the discrete transformations, the most
popular are the discrete cosine transform (DCT) [Kus17]
and the discrete wavelet transforms (DWT). The preva-
lence of these methods is explained by their wide use for
image compression. Especially successfully, they are
used in JPEG and JPEG2000 standards. The JPEG stand-
ard uses the DCT and the DWT is used the JPEG2000
image compression standard.
                                                                          Figure 2: Embedded Message
   In [Kus17], the authors have successfully shown the
use of combined stegoalgorithm on the basis of the
method of the Least Significant Bit (LSB) in combination
with DCT (LSB & DCT). In this paper, the authors tried
to use LSB in combination with DWT (LSB & DWT).
Let us dwell on this.
2 Steganography System Model
The General model of the steganography system is
shown in figure 1. In this model, a secret graphic digit-
ized message in 24-bit bmp format is used as an embed-
ded message (figure 2).
    The choice of a graphical object as an embedded
message was made due to its higher resistance to noise in
transformations. One of the main blocks in this model of
the steganography system is the block implementing the
function of embedding presented in figure 3. Consider its
functionality.
   First, it is designed to perform DWT and embed a
pre-converted hidden message into it.


                                                                         Figure 3: Embedding Function
                                                                For each decomposition level, first a DWT is per-
                                                            formed in the vertical direction and then a DWT in the
                                                            horizontal direction. After performing the first level of
                                                            decomposition, we obtain a block 4x4, shown in figure
                                                            4a.
                                                                The next level of decomposition is performed in the
                                                            low-frequency part obtained in the previous decomposi-
                                                            tion (figure 4b).


                                                                  a) The first stage            b) The second stage
                                                                             Figure 4: Two DWT Stages
    Figure 1: General Steganography System Model


                                                                                                                13
   The main real indicators of concealing information          also known that ACM is used to demonstrate the dynam-
methods in steganography are:                                  ics of chaotic processes (figure 6).
  • PSNR-peak signal to noise ratio;                               Also it should be noted that the ACM decomposition
  • Imbedded capacity;                                         is iteratively reversible. Let us illustrate this fact by the
  • Correlation.                                               example shown in figure 6. This figure shows that at a
    PSNR is inversely proportional to capacity and direct-     certain iteration step the image converted using ACM (in
ly proportional to correlation and vice versa. During the      this case, the embedded message) necessarily takes the
study, the correct ratio of PSNR to capacity and correla-      form of the original. In the example below, each picture
tion was found, which suggests that the information can        has an iteration number corresponding to that picture. On
be sent over an unprotected channel of information             iteration number 194 the image becomes equivalent to
transmission, without fear of unauthorized access by a         the original. Let's imagine that we used the original im-
third party. But also keep in mind that the larger the im-     age as a hidden message, converting it to iteration num-
bedded message, the greater its impact on the steganog-        ber 50. Then when decoding this image to get the origi-
raphy container and for big data, you must choose a larg-      nal message view, we need to perform 194-50 = 144 iter-
er container.                                                  ations! From here, we can conclude that the values 50
                                                               and 144 can be used as secret keys, respectively, at the
   In the proposed approach, DWT is used to decompose          stages of embedding and extracting the hidden message.
the image into high frequency and low frequency sub-
bands.                                                             After the transformation is applied to the ACM algo-
                                                               rithm, the hidden message is divided into RGB compo-
   In turn, imbedded message converted using the Ar-           nents, and embedded using LSB algorithm in the corre-
nold cat map (ACM). In mathematics, ACM is identified          sponding sub-band HL. After the implementation, the
with the chaotic mapping on the torus first proposed by        reverse DWT (RDWT) is applied, the components are
Vladimir Arnold [Div11].                                       assembled again and the filled stegocontainer is obtained
                                                               in accordance with the embedding function (figure 3).
    ACM can be viewed as a two-dimensional map de-
scribed by relations:
                     𝑝" = 𝑞 (𝑚𝑜𝑑1)
                    𝑞 " = 𝑝 + 2𝑞 (𝑚𝑜𝑑1).
    In these relations, the dash sign shows the dynamics
of parameter values change at the next time step. As the
phase space of ACM typically consider the surface of a
torus. The parameter p on the torus specifies the coordi-
nates of the Parallels, and the parameter q is the coordi-
nate along the Meridian of the torus. The range of values
of both parameters is limited by the interval from zero to
one. Typically, a unit square with p and q coordinates is
used as a graphical representation of ACM. The name
ACM is because Vladimir Arnold illustrated it in a pic-
ture resembling a cat's head (figure 5) [Div11].
    The record of these relations in the matrix representa-
tion has the form:


                   𝑝                  𝑝
            /1 1/ /𝑞 / = /0 1/ /0 1/ /𝑞 /.
             1 2          1 1 1 1
                                                                   Figure 6: ACM Hidden Message Transformation
                                                                   ACM in this combination is used to increase safety. It
                                                               allows you to extract a full secret message only to the
                                                               recipient who has information about the method of em-
                                                               bedding (key). The filled container is sent to the commu-
                                                               nication channel (figure 1) after the application of noise-
                                                               resistant coding, where it is exposed to noise generated
                                                               by the noise generator.
                                                                   A noisy hidden message after the decoding procedure
                                                               is passed to the input of the block that implements the
         Figure 5: Graphical ACM Interpretation                extraction function (figure 1). The function of extracting
    It should be noted that any picture subjected to ACM       a hidden message is performed in the following order.
(for example, the cat's head) always retains its area. It is   The DWT is first performed (similar to that shown in


                                                                                                                       14
figure 3) applied to the filled steganography container         is assembled by applying a reverse DWT (RDWT) trans-
passed an error-correcting decoding. Then, the LSB re-          formation to its RGB components and connecting them
verse conversion is extracted from the HL blocks and the        into a single unit.
RGB components of the secret message are assembled
together. Gathered secret message in turn subjected to              Let us perform statistical analysis of the effectiveness
                                                                of the developed stego-algorithm. Peak signal to noise
reverse ACM (RACM) using the key for the stage of
extracting secret message and acquiring complete, is de-        ratio (PSNR) means the ratio between the maximum pos-
livered to the recipient. If necessary, the empty container     sible signal value and the noise power that distorts the
                                                                signal
    values [Ami10]. This metric is used to show the dif-           Capacity is the relation of the hidden message size to
ference between empty and filled containers:                    container size. It is calculated by the formula:
                                  2559                                                         hidden message pixels number
                𝑃𝑆𝑁𝑅 = 10 𝑙𝑔 7          <.                                 Capacity =                                       .
                                  𝑀𝑆𝐸                                                             container pixels number
    The root mean square error (MSE) – determines the              Correlation used to display a linear relationship be-
difference between the intensities of the filled and empty      tween empty and filled containers [Mut11]:
containers:
                        H   D
                                                                                    ∑dIFG(𝑥I − 𝑥)(𝑦I − 𝑦)
                1                                                                        𝑟_` =            .
      𝑀𝑆𝐸 =        > >(𝑓(𝑖, 𝑗) − (𝑓 " (𝑖, 𝑗)))9 .                                        (𝑛 − 1)𝑠_ 𝑠`
               𝑀×𝑁                                                  Table 1 presents the main performance indicators of
                       IFG EFG
   Where f(i,j) is an empty container and f'(i,j) is a filled   this algorithm. As can be seen from the table, the pro-
container. A large MSE value indicates that the original        posed method copes with hiding data in the image.
image is of poor quality, and Vice versa.
                                        Table 1: The Main Performance Indicators

                                                                                Filled container
N   Container size MxN      Message size     Empty container PSNR                                            Correlation              Capacity
                                                                                     PSNR

1        1680 х 1050            635 х 715            45.0553                         15.2341                   0.9997                     0.29
2        1440 х 990             635 х 715            47.8423                         20.4323                   0.9998                     0.26
3        1360 х 768             600 х 400            49.3421                         12.2342                   0.9997                     0.22


3 Error-correcting Codes and Multi-                               • Ability to work efficiently with different code
                                                                speeds in high-interference channels;
threshold Decoding                                                • High performance and significant energy gain.
Of particular relevance in the transmission of hidden               Let us consider in more detail the device of MTD
messages through channels of communication with inter-          [Zol12]. An example of a multi-threshold character block
ference is the use of noise-resistant coding.                   coder (MTBC) scheme for a self-orthogonal code with
                                                                one information branch is shown in figure 7. It can be
    Recently, in the field of digital signal transmission in    seen that the encoder consists only of a shift register and
channels with a high level of noise, methods of noise-          a group of adders modulo q, where q=256.
correcting coding based on the use of multi-threshold
decoders (MTD) of self-orthogonal codes (SOC) are in-              In this example, the group of adders determined in
tensively used. The prototype of MTD is a simple decod-         accordance with the image of the polynomial
er of Massey [Mas69]. New technical solutions used in           P=x +x +x +x .
                                                                       0    1    4   6


the MTD represent the implementation of an effective
algorithm of noise-correcting coding. Distinctive features
                                                                  uj        2
of MTD are:                                                                     0    1     2     3   4   5    6        7   8   9      10 11 12
                                                                       1 K
  • Linear computational complexity;
  • High efficiency of error correction;                                                                          vj               - addition on module q

  • Iterative error correction process that constantly                     Figure 7: MTBC with One Information Branch
brings the decoding process closer to the optimal decod-
                                                                    The scheme of the multi-threshold character block
er;
                                                                decoder (MTCBD) for such code has the form shown in
  • Easy technical implementation;                              figure 8. The information register performs the role of the


                                                                                                                                                  15
information branch here, and the role of the verification      The MTCBD scheme consists only of shift registers,
branch is the syndrome register.                            adders, subtracts modulo q and a threshold element (TE).
                                                                TE task is to count the most common characters in
                                                            the corresponding positions of the syndrome and differ-
                                                            ence registers. For example, the symbol q1 occurs a1
                                                            times, and the symbol q2 occurs b1 times. Then, the val-
                                                            ue of |a1 – b1| compare with some set threshold value in
                                                            the majority element with further correction of the asso-
                                                            ciated elements in case of exceeding the threshold value.
                                                               An example of the MTBC scheme with two infor-
                                                            mation branches presented in figure 9.


          Figure 8: MTCBD with One Information and One
                      Verification Branch
                                             Table 2: Simulation results
q=256
volume =100000
P =0.25000
 0_low


P =0.15000
 0_high


P =0.00500
 0_step


The parameters of the communication channel:
q- synchronous channel

The parameters non-binary coder
Number of information branches: nk=4
Number of test branches: nr=4
Number of possible symbol values: q= 256
Code rate: R=0.50

Code distance: 9
Code length: 9704
Number of decoding iterations: 7
The probability of error in the channel: P =0.16000
                                         0


Number of blocks transmitted = 21
Number of information characters transmitted = 101892

Simulation results

The number of the iteration:
     0           1           2            3          4             5         6            7
Number of errors at the output of different decoding iterations:
   16347      14338        7384          259          0            0          0           0
Probability of error at the output of different decoding iterations:
1.60e-001 1.41e-001 7.25e-002 2.54e-003 0.00e+000 0.00e+000 0.00e+000 0.00e+000
The probability of an error on the symbol at the output of a non-binary decoder was 9.81e-006
The probability of an error on the block at the output of a non-binary decoder was 0.00e+000


                                                                                                                16
                                                             results file is shown in table 2. As can be seen in table 2,
                                                             the MTCBD is very effective. Given a sufficiently high
                                                             probability of error in the channel 0.16 (16 decibels) and
                                                             the size of the source character file consisting of 101892
                                                             characters (bytes), combined into 21 blocks, all errors,
                                                             the total number of which in the source file was 16347,
                                                             were eliminated at the fourth iteration.
                                                                A binary synchronous communication channel (BSC)
                                                             with an independent error stream (channel without
 Figure 9: MTBC Scheme with 2 Information Branches           memory) subject to Gauss distribution was chosen as a
    The simulation was carried out with the help of          communication channel model.
MTCBD, which has 4 information and 4 verification              Let us consider the results of the stenographic process
branches. The output file of the simulation results con-     modeling as a whole using the model shown in figure 1.
tained information on the simulation parameters, the en-
coder and decoder used in the simulation, as well as in-         Figure 10 shows a 24-bit BMP graphic file used as a
formation on the estimated error probability at the output   steganography container.
of the MTCBD and the number of errors remaining after
decoding iteration for each error probability in the com-
munication channel. An example of a typical entry in the


                                         Figure 10: Steganography Container


                                                                                                                    17
                                Figure 11: Steganography Container Are at the BSC Exit

                                                                From the output of the BSC stegocontainer are sup-
    After the function, embedding secret message, filled
                                                            plied to the MTCBD where all the noise is removed, and
stegocontainer are supplied to MTCBC, and after the
                                                            the file stegocontainer are taking on the appearance it had
encoding process is transmitted in noisy BSC. The output
                                                            at the entrance to the BSC.
of the BSC file has the form shown in figure 11.
                                                                Filtered noise is shown in figure 12.


                                              Figure 12: Filtered Noise
    Output MTCBD the steganography container are              • The code rate is 0.5;
supplied to the removal unit, the output of which then        • The probability of error in channel P0 = 0.25;
issued a secret message.                                      • Container size - 93640 bytes;
    The following parameters were set for the final simu-     • Number of generated errors - 23543;
lation:                                                       • Number of decoding iterations until all errors are
                                                            fully retrieved - 18;


                                                                                                                  18
 • Type of communication channel - BSC.
                                                              [Kus17] Kustov, V. N., Protsko D. K. A Software model
Conclusion                                                            of steganography on the basis of a combination
                                                                      of methods LSB and DCT. Science and educa-
In the opinion of the authors, this paper describes a suc-            tion in the XXI century. Collection of scientific
cessful attempt of a comprehensive solution in the field              papers on the materials of the international sci-
of steganography container data transmission. The au-                 entific-practical conference on February 28,
thors present a prototype of a software model based on                2017. Part 3. Tambov: LLC "Ucom Consulting
four main components:
                                                                      company", 2017. P. 54-61.
  • Discrete wavelet transformation the steganography
container are;                                                [Dav07] David Frith,” Steganography Approaches, Op-
  • Pre-coding the hidden message using Arnold's cat                  tions and Implications", Network Security, Au-
decomposition;                                                        gust 2007.
  • Embedding encoded message on LSB algorithm in
wavelet transformations the stegocontainer are;               [Che08] Cheddad A., Condell J., Curran K. and Mc Kev-
  • Application of noise-resistant coding in the com-                 itt, P. (2008). Biometric Inspired Digital Image
munication channel using advanced technologies of mul-                Steganography. Proc of the 15th Annual IEEE
ti-threshold decoding using MTD (multi-threaded decod-                International Conference and Workshops on the
er).                                                                  Engineering of Computer-Based Systems
     Conventionally, this combination of four components              (ECBS'08). P. 159-168.
can be designated as DWT & ACM & LSB & MTD.
                                                              [Div11] Divya Saxena,” Digital Watermarking Algo-
     The authors believe that this model fully implements             rithm based on Singular Value Decomposition
the process of transmission of hidden messages in digital             and Arnold Transform”, International Journal of
still images. The model takes into account and agrees on              Electronics and Computer Science Engineering
the features of all processing hidden message stages.                 (IJECSE), Vol. 1, No. 1, 2011, P. 22-27.

                                                              [Ami10] R. Amirtharajan, R. Akila, P. Deepika Chow-
Acknowledgments                                                       davarapu, “Comparative Analysis of Image Ste-
    Studies carried out on this topic were carried out with           ganography”, International Journal of Computer
partial financial support from RFBR grants (No. 16-29-                Applications, volume 2 – No.3, May 2010.
09482-ofi-m), under the budget theme No. 0073–2019–           [Mut11] S. K. Muttoo, Sushil Kumar, “A mulltilayred
0004.                                                                 secure, robust and high capacity image ste-
                                                                      ganographic algorithm”, 2011.
References
                                                              [Mas69] J. L. Massey, Shift-register synthesis and BCH
[Mes66] Messi John. Threshold decoding / Per. with
                                                                      decoding, IEEE Trans. Information Theory, IT-
        English. Ed. by E. L. Bloch. M.: Mir, 1966.
                                                                      15 (1969), P. 122-127.
[Zol12] Zolotarev V. V., Zubarev Y. B., Ovechkin G. V.
         multi-Threshold decoders and optimization the-
         ory of coding. M.: Hotline-Telecom, 2012.


                                                                                                                  19