Discrete Transformations and Noise-Resistant Coding of Still Images in Ste- ganography Problems Vladimir N. Kustov, Anatoly A. Boris V. Sokolov Kornienko, Dmitry K. Protsko Laboratory of Information Technologies Department of Computer Science and In- in System Analysis and Modeling, formation Security, St. Petersburg Institute for Informatics Emperor Alexander I St. Petersburg State and Automation of the RAS, Transport University, Saint Petersburg, Russia Saint Petersburg, Russia sokolov_boris@inbox.ru kvnvika@mail.ru, kaa.pgups@yandex.ru • The need to develop and implement new, more ad- vanced methods of noise-resistant coding of hidden mes- Abstract sages transmitted over communication channels, to en- sure their integrity. This article is another attempt of a comprehen- sive solution in the field of steganography data • Development of new, simpler, faster and energy- transmission. We also consider software model efficient noise-resistant decoders that have characteristics prototype that fully implements the process of close to the optimal decoders, but with linear (polynomi- hidden messages transmission in digital photo- al) time costs for the implementation of the decoding graphs. All stages of hidden message processing process. on the whole way from the sender to the recipi- These paper attempts of a comprehensive solution ent are taken into account. The programming that takes into account the above features in one way or model uses a discrete wavelet transform and another, and presents a prototype of the software model new hiding algorithms based on the «Arnold cat for implementing the process of transmitting hidden mes- map» decomposition. The efficiency of using sages in digital still images. Using discrete transfor- noise-resistant coding methods and multi- mations of still images, new hiding algorithms and meth- threshold decoding to ensure high probability of ods of noise-resistant coding to ensure a high probability integrity and reliability of hidden messages of their integrity and reliability during transmission over when transmitting them through communication communication channels with a high level of noise pro- channels with a high level of noise is also posed. shown. 1 Discrete Transformation of Still Images Introduction Methods of image steganography can be divided into two groups: the first uses the spatial area of the image, the The following features can characterize the current state second - the frequency area. Methods that edit a spatial of research in the field of steganography: area directly affect the container itself (for example, • Extensive use of discrete transformations of digital changing image pixels). However, such methods are still images used as containers for steganography mes- highly sensitive to various container changes: scale mod- sage, and the need to combine the well-known methods ification, rotation, cropping, adding noise or interference of discrete signal conversion with simple, well-tested old to the channel, various loss compression - are likely to algorithms. destroy the hidden message. • Development of more and more new methods of Methods that allow messages to be embedded in the steganography, which provide high secrecy, confidential- frequency domain first convert the container file and then ity and reliability of message delivery. only perform the embedding. These methods do not de- • Increasingly more complete account of the charac- pend on image formats [Dav07], [Che08]. The infor- teristics of data transmission channels and interference mation hiding the steganography methods is based on arising in it, especially in the conditions of their high linear orthogonal transformations such as: noise level. 1 • Discrete Hadamard transform (DHT); • Discrete Fourier transform (DFT); Copyright © by the papers’ authors. Copying permitted for private and academic purposes. In: B. V. Sokolov, A. D. Khomonenko, A. A. Bliudov Mathematical and Natural-Scientific Training in Engi- (eds.): Selected Papers of the Workshop Computer Sci- neering Education", St.-Petersburg, Russia, 8–9 Novem- ence and Engineering in the framework of the 5 th Inter- ber, 2018, published at http://ceur-ws.org national Scientific-Methodical Conference "Problems of 12 • Discrete cosine transform (DCT); • Discrete wavelet transform (DWT); • Singular value decomposition (SVD). Among all the discrete transformations, the most popular are the discrete cosine transform (DCT) [Kus17] and the discrete wavelet transforms (DWT). The preva- lence of these methods is explained by their wide use for image compression. Especially successfully, they are used in JPEG and JPEG2000 standards. The JPEG stand- ard uses the DCT and the DWT is used the JPEG2000 image compression standard. Figure 2: Embedded Message In [Kus17], the authors have successfully shown the use of combined stegoalgorithm on the basis of the method of the Least Significant Bit (LSB) in combination with DCT (LSB & DCT). In this paper, the authors tried to use LSB in combination with DWT (LSB & DWT). Let us dwell on this. 2 Steganography System Model The General model of the steganography system is shown in figure 1. In this model, a secret graphic digit- ized message in 24-bit bmp format is used as an embed- ded message (figure 2). The choice of a graphical object as an embedded message was made due to its higher resistance to noise in transformations. One of the main blocks in this model of the steganography system is the block implementing the function of embedding presented in figure 3. Consider its functionality. First, it is designed to perform DWT and embed a pre-converted hidden message into it. Figure 3: Embedding Function For each decomposition level, first a DWT is per- formed in the vertical direction and then a DWT in the horizontal direction. After performing the first level of decomposition, we obtain a block 4x4, shown in figure 4a. The next level of decomposition is performed in the low-frequency part obtained in the previous decomposi- tion (figure 4b). a) The first stage b) The second stage Figure 4: Two DWT Stages Figure 1: General Steganography System Model 13 The main real indicators of concealing information also known that ACM is used to demonstrate the dynam- methods in steganography are: ics of chaotic processes (figure 6). • PSNR-peak signal to noise ratio; Also it should be noted that the ACM decomposition • Imbedded capacity; is iteratively reversible. Let us illustrate this fact by the • Correlation. example shown in figure 6. This figure shows that at a PSNR is inversely proportional to capacity and direct- certain iteration step the image converted using ACM (in ly proportional to correlation and vice versa. During the this case, the embedded message) necessarily takes the study, the correct ratio of PSNR to capacity and correla- form of the original. In the example below, each picture tion was found, which suggests that the information can has an iteration number corresponding to that picture. On be sent over an unprotected channel of information iteration number 194 the image becomes equivalent to transmission, without fear of unauthorized access by a the original. Let's imagine that we used the original im- third party. But also keep in mind that the larger the im- age as a hidden message, converting it to iteration num- bedded message, the greater its impact on the steganog- ber 50. Then when decoding this image to get the origi- raphy container and for big data, you must choose a larg- nal message view, we need to perform 194-50 = 144 iter- er container. ations! From here, we can conclude that the values 50 and 144 can be used as secret keys, respectively, at the In the proposed approach, DWT is used to decompose stages of embedding and extracting the hidden message. the image into high frequency and low frequency sub- bands. After the transformation is applied to the ACM algo- rithm, the hidden message is divided into RGB compo- In turn, imbedded message converted using the Ar- nents, and embedded using LSB algorithm in the corre- nold cat map (ACM). In mathematics, ACM is identified sponding sub-band HL. After the implementation, the with the chaotic mapping on the torus first proposed by reverse DWT (RDWT) is applied, the components are Vladimir Arnold [Div11]. assembled again and the filled stegocontainer is obtained in accordance with the embedding function (figure 3). ACM can be viewed as a two-dimensional map de- scribed by relations: 𝑝" = 𝑞 (𝑚𝑜𝑑1) 𝑞 " = 𝑝 + 2𝑞 (𝑚𝑜𝑑1). In these relations, the dash sign shows the dynamics of parameter values change at the next time step. As the phase space of ACM typically consider the surface of a torus. The parameter p on the torus specifies the coordi- nates of the Parallels, and the parameter q is the coordi- nate along the Meridian of the torus. The range of values of both parameters is limited by the interval from zero to one. Typically, a unit square with p and q coordinates is used as a graphical representation of ACM. The name ACM is because Vladimir Arnold illustrated it in a pic- ture resembling a cat's head (figure 5) [Div11]. The record of these relations in the matrix representa- tion has the form: 𝑝 𝑝 /1 1/ /𝑞 / = /0 1/ /0 1/ /𝑞 /. 1 2 1 1 1 1 Figure 6: ACM Hidden Message Transformation ACM in this combination is used to increase safety. It allows you to extract a full secret message only to the recipient who has information about the method of em- bedding (key). The filled container is sent to the commu- nication channel (figure 1) after the application of noise- resistant coding, where it is exposed to noise generated by the noise generator. A noisy hidden message after the decoding procedure is passed to the input of the block that implements the Figure 5: Graphical ACM Interpretation extraction function (figure 1). The function of extracting It should be noted that any picture subjected to ACM a hidden message is performed in the following order. (for example, the cat's head) always retains its area. It is The DWT is first performed (similar to that shown in 14 figure 3) applied to the filled steganography container is assembled by applying a reverse DWT (RDWT) trans- passed an error-correcting decoding. Then, the LSB re- formation to its RGB components and connecting them verse conversion is extracted from the HL blocks and the into a single unit. RGB components of the secret message are assembled together. Gathered secret message in turn subjected to Let us perform statistical analysis of the effectiveness of the developed stego-algorithm. Peak signal to noise reverse ACM (RACM) using the key for the stage of extracting secret message and acquiring complete, is de- ratio (PSNR) means the ratio between the maximum pos- livered to the recipient. If necessary, the empty container sible signal value and the noise power that distorts the signal values [Ami10]. This metric is used to show the dif- Capacity is the relation of the hidden message size to ference between empty and filled containers: container size. It is calculated by the formula: 2559 hidden message pixels number 𝑃𝑆𝑁𝑅 = 10 𝑙𝑔 7 <. Capacity = . 𝑀𝑆𝐸 container pixels number The root mean square error (MSE) – determines the Correlation used to display a linear relationship be- difference between the intensities of the filled and empty tween empty and filled containers [Mut11]: containers: H D ∑dIFG(𝑥I − 𝑥)(𝑦I − 𝑦) 1 𝑟_` = . 𝑀𝑆𝐸 = > >(𝑓(𝑖, 𝑗) − (𝑓 " (𝑖, 𝑗)))9 . (𝑛 − 1)𝑠_ 𝑠` 𝑀×𝑁 Table 1 presents the main performance indicators of IFG EFG Where f(i,j) is an empty container and f'(i,j) is a filled this algorithm. As can be seen from the table, the pro- container. A large MSE value indicates that the original posed method copes with hiding data in the image. image is of poor quality, and Vice versa. Table 1: The Main Performance Indicators Filled container N Container size MxN Message size Empty container PSNR Correlation Capacity PSNR 1 1680 х 1050 635 х 715 45.0553 15.2341 0.9997 0.29 2 1440 х 990 635 х 715 47.8423 20.4323 0.9998 0.26 3 1360 х 768 600 х 400 49.3421 12.2342 0.9997 0.22 3 Error-correcting Codes and Multi- • Ability to work efficiently with different code speeds in high-interference channels; threshold Decoding • High performance and significant energy gain. Of particular relevance in the transmission of hidden Let us consider in more detail the device of MTD messages through channels of communication with inter- [Zol12]. An example of a multi-threshold character block ference is the use of noise-resistant coding. coder (MTBC) scheme for a self-orthogonal code with one information branch is shown in figure 7. It can be Recently, in the field of digital signal transmission in seen that the encoder consists only of a shift register and channels with a high level of noise, methods of noise- a group of adders modulo q, where q=256. correcting coding based on the use of multi-threshold decoders (MTD) of self-orthogonal codes (SOC) are in- In this example, the group of adders determined in tensively used. The prototype of MTD is a simple decod- accordance with the image of the polynomial er of Massey [Mas69]. New technical solutions used in P=x +x +x +x . 0 1 4 6 the MTD represent the implementation of an effective algorithm of noise-correcting coding. Distinctive features uj 2 of MTD are: 0 1 2 3 4 5 6 7 8 9 10 11 12 1 K • Linear computational complexity; • High efficiency of error correction; vj - addition on module q • Iterative error correction process that constantly Figure 7: MTBC with One Information Branch brings the decoding process closer to the optimal decod- The scheme of the multi-threshold character block er; decoder (MTCBD) for such code has the form shown in • Easy technical implementation; figure 8. The information register performs the role of the 15 information branch here, and the role of the verification The MTCBD scheme consists only of shift registers, branch is the syndrome register. adders, subtracts modulo q and a threshold element (TE). TE task is to count the most common characters in the corresponding positions of the syndrome and differ- ence registers. For example, the symbol q1 occurs a1 times, and the symbol q2 occurs b1 times. Then, the val- ue of |a1 – b1| compare with some set threshold value in the majority element with further correction of the asso- ciated elements in case of exceeding the threshold value. An example of the MTBC scheme with two infor- mation branches presented in figure 9. Figure 8: MTCBD with One Information and One Verification Branch Table 2: Simulation results q=256 volume =100000 P =0.25000 0_low P =0.15000 0_high P =0.00500 0_step The parameters of the communication channel: q- synchronous channel The parameters non-binary coder Number of information branches: nk=4 Number of test branches: nr=4 Number of possible symbol values: q= 256 Code rate: R=0.50 Code distance: 9 Code length: 9704 Number of decoding iterations: 7 The probability of error in the channel: P =0.16000 0 Number of blocks transmitted = 21 Number of information characters transmitted = 101892 Simulation results The number of the iteration: 0 1 2 3 4 5 6 7 Number of errors at the output of different decoding iterations: 16347 14338 7384 259 0 0 0 0 Probability of error at the output of different decoding iterations: 1.60e-001 1.41e-001 7.25e-002 2.54e-003 0.00e+000 0.00e+000 0.00e+000 0.00e+000 The probability of an error on the symbol at the output of a non-binary decoder was 9.81e-006 The probability of an error on the block at the output of a non-binary decoder was 0.00e+000 16 results file is shown in table 2. As can be seen in table 2, the MTCBD is very effective. Given a sufficiently high probability of error in the channel 0.16 (16 decibels) and the size of the source character file consisting of 101892 characters (bytes), combined into 21 blocks, all errors, the total number of which in the source file was 16347, were eliminated at the fourth iteration. A binary synchronous communication channel (BSC) with an independent error stream (channel without Figure 9: MTBC Scheme with 2 Information Branches memory) subject to Gauss distribution was chosen as a The simulation was carried out with the help of communication channel model. MTCBD, which has 4 information and 4 verification Let us consider the results of the stenographic process branches. The output file of the simulation results con- modeling as a whole using the model shown in figure 1. tained information on the simulation parameters, the en- coder and decoder used in the simulation, as well as in- Figure 10 shows a 24-bit BMP graphic file used as a formation on the estimated error probability at the output steganography container. of the MTCBD and the number of errors remaining after decoding iteration for each error probability in the com- munication channel. An example of a typical entry in the Figure 10: Steganography Container 17 Figure 11: Steganography Container Are at the BSC Exit From the output of the BSC stegocontainer are sup- After the function, embedding secret message, filled plied to the MTCBD where all the noise is removed, and stegocontainer are supplied to MTCBC, and after the the file stegocontainer are taking on the appearance it had encoding process is transmitted in noisy BSC. The output at the entrance to the BSC. of the BSC file has the form shown in figure 11. Filtered noise is shown in figure 12. Figure 12: Filtered Noise Output MTCBD the steganography container are • The code rate is 0.5; supplied to the removal unit, the output of which then • The probability of error in channel P0 = 0.25; issued a secret message. • Container size - 93640 bytes; The following parameters were set for the final simu- • Number of generated errors - 23543; lation: • Number of decoding iterations until all errors are fully retrieved - 18; 18 • Type of communication channel - BSC. [Kus17] Kustov, V. N., Protsko D. K. A Software model Conclusion of steganography on the basis of a combination of methods LSB and DCT. Science and educa- In the opinion of the authors, this paper describes a suc- tion in the XXI century. Collection of scientific cessful attempt of a comprehensive solution in the field papers on the materials of the international sci- of steganography container data transmission. The au- entific-practical conference on February 28, thors present a prototype of a software model based on 2017. Part 3. Tambov: LLC "Ucom Consulting four main components: company", 2017. P. 54-61. • Discrete wavelet transformation the steganography container are; [Dav07] David Frith,” Steganography Approaches, Op- • Pre-coding the hidden message using Arnold's cat tions and Implications", Network Security, Au- decomposition; gust 2007. • Embedding encoded message on LSB algorithm in wavelet transformations the stegocontainer are; [Che08] Cheddad A., Condell J., Curran K. and Mc Kev- • Application of noise-resistant coding in the com- itt, P. (2008). Biometric Inspired Digital Image munication channel using advanced technologies of mul- Steganography. Proc of the 15th Annual IEEE ti-threshold decoding using MTD (multi-threaded decod- International Conference and Workshops on the er). Engineering of Computer-Based Systems Conventionally, this combination of four components (ECBS'08). P. 159-168. can be designated as DWT & ACM & LSB & MTD. [Div11] Divya Saxena,” Digital Watermarking Algo- The authors believe that this model fully implements rithm based on Singular Value Decomposition the process of transmission of hidden messages in digital and Arnold Transform”, International Journal of still images. The model takes into account and agrees on Electronics and Computer Science Engineering the features of all processing hidden message stages. (IJECSE), Vol. 1, No. 1, 2011, P. 22-27. [Ami10] R. Amirtharajan, R. Akila, P. Deepika Chow- Acknowledgments davarapu, “Comparative Analysis of Image Ste- Studies carried out on this topic were carried out with ganography”, International Journal of Computer partial financial support from RFBR grants (No. 16-29- Applications, volume 2 – No.3, May 2010. 09482-ofi-m), under the budget theme No. 0073–2019– [Mut11] S. K. Muttoo, Sushil Kumar, “A mulltilayred 0004. secure, robust and high capacity image ste- ganographic algorithm”, 2011. References [Mas69] J. L. Massey, Shift-register synthesis and BCH [Mes66] Messi John. Threshold decoding / Per. with decoding, IEEE Trans. Information Theory, IT- English. Ed. by E. L. Bloch. M.: Mir, 1966. 15 (1969), P. 122-127. [Zol12] Zolotarev V. V., Zubarev Y. B., Ovechkin G. V. multi-Threshold decoders and optimization the- ory of coding. M.: Hotline-Telecom, 2012. 19