<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Towards Image Data Hiding via Facial Stego Synthesis With Generative Model</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Li</forename><surname>Dong</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Electrical Engineering and Computer Science</orgName>
								<orgName type="institution">Ningbo University</orgName>
								<address>
									<postCode>315211</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Southeast Digital Economic Development Institute</orgName>
								<address>
									<postCode>324000</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jie</forename><surname>Wang</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Electrical Engineering and Computer Science</orgName>
								<orgName type="institution">Ningbo University</orgName>
								<address>
									<postCode>315211</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Southeast Digital Economic Development Institute</orgName>
								<address>
									<postCode>324000</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rangding</forename><surname>Wang</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Yuanman</forename><surname>Li</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Electrical Engineering and Computer Science</orgName>
								<orgName type="institution">Ningbo University</orgName>
								<address>
									<postCode>315211</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Southeast Digital Economic Development Institute</orgName>
								<address>
									<postCode>324000</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution">Shenzhen University</orgName>
								<address>
									<postCode>518061</postCode>
									<settlement>Guangdong</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Weiwei</forename><surname>Sun</surname></persName>
							<affiliation key="aff3">
								<orgName type="department">Alibaba Group</orgName>
								<address>
									<postCode>310052</postCode>
									<settlement>Zhejiang</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Towards Image Data Hiding via Facial Stego Synthesis With Generative Model</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">FA78DDB3AA82ECFED4DD3263385E3DE9</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:12+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>data hiding</term>
					<term>stego synthesis</term>
					<term>generative adversarial network</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Stego synthesis-based data hiding aims to directly produce a plausible natural image to convey a secret message. However, most existing works neglect the communication degradations and forensic actions that commonly occur in practice. In this paper, we devise a generative adversarial network (GAN)-based framework to synthesize facial stego images. The framework consists of four components: a generator, an extractor, a discriminator and a forensic network. Specifically, the generator produces a realistic facial stego image from the secret message and a secret key, while the extractor aims at recovering the secret message from the stego image with the provided secret key. To combat forensics, we explicitly integrate a forensic network into the proposed framework, which is responsible for guiding the update of the generator. Three degradation layers are further incorporated, forcing the generator to characterize the communication degradations. Experimental results demonstrate that the proposed framework can accurately extract the secret message and effectively resist forensic detection and certain degradations, while producing realistic facial stego images.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Data hiding aims to embed a secret message into a cover signal without arousing the awareness of an adversary. It is widely used in many applications, e.g., covert communication <ref type="bibr" target="#b0">[1]</ref> and multimedia data protection <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. The primitive ad-hoc Least-Significant Bit (LSB) replacement substitutes the bit in the least significant bit-plane of each pixel with a secret bit, while modern data hiding methods attempt to eliminate the traces of the data hiding action and improve the steganographic capacity. For example, content-adaptive steganography <ref type="bibr" target="#b0">[1]</ref> designed sophisticated distortion functions according to prior knowledge and used Syndrome-Trellis coding to embed the secret message. Recently, neural network-based data hiding has become one of the most active research directions. Baluja <ref type="bibr" target="#b3">[4]</ref> employed convolutional neural networks to hide an entire secret image inside a cover image in an end-to-end fashion. SSGAN <ref type="bibr" target="#b4">[5]</ref> attempted to exploit GAN to synthesize a cover image that is more suitable for the subsequent steganographic data embedding. ASDL-GAN <ref type="bibr" target="#b5">[6]</ref> integrated content-adaptive steganography and GAN, in which the generator was able to produce the modification probability maps. HayersGAN <ref type="bibr" target="#b6">[7]</ref>, HiDDeN <ref type="bibr" target="#b7">[8]</ref> and SteganoGAN <ref type="bibr" target="#b8">[9]</ref> all designed encoder-decoder-like frameworks based on GAN, which could automatically learn the suitable areas for embedding the secret bitstream message.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>International Workshop on Safety &amp; Security of Deep Learning, 21st -26th August, 2021</head><p>Email: dongli@nbu.edu.cn (L. Dong); 1811082196@nbu.edu.cn (J. Wang); wangrangding@nbu.edu.cn (R. Wang); yuanmanli@szu.edu.cn (Y. Li); sunweiwei.sww@alibaba-inc.com (W. Sun)</p><p>For the last several years, adversarial examples for neural networks have met data hiding, continuously drawing extensive attention from the community. Some studies, e.g., <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>, found that adding slight perturbations to the input data can paralyze the prediction capability of learning-based classifiers. As the opponent of data hiding, steganalysis aims to expose the data hiding on a stego signal and usually involves machine-learning classifiers. Therefore, it is possible for data hiding methods to bypass steganalysis by borrowing strategies from adversarial example-related works. Tang et al. <ref type="bibr" target="#b11">[12]</ref> presented the Adversarial Embedding (ADV-EMB) method, which adjusts the modification cost of image elements according to the gradients back-propagated from the target steganalytic neural network. The constructed adversarial stego could effectively fool the steganalytic network, revealing the vulnerability of deep learning-based steganalyzers.</p><p>Note that all the aforementioned data hiding techniques are based on cover modification; they cannot avoid modifying the given cover image. 
As such, it inevitably leaves artifacts that can be exposed by steganalysis. On the contrary, stego synthesis-based data hiding, e.g., <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14]</ref>, synthesizes the stego image directly from the secret message, posing more challenges for steganalysis. Under this concept, traditional methods tried to produce stego images based on hand-crafted designs. Although the capacity was relatively high, they were limited to synthesizing patterned images, such as textures and fingerprints. As an alternative, some methods <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref> use GAN to synthesize stego images with rich semantics, e.g., faces and food. However, the accuracy of message extraction was unsatisfactory under image degradations. Moreover, the synthesized stego images can be easily identified by a well-trained forensic detector. It is thus urgent to further improve the robustness of message extraction and the anti-forensic capability of stego synthesis-based data hiding methods.</p><p>In this work, we propose a Facial Stego Image Synthesis method for data hiding with GAN, termed FSIS-GAN. Unlike cover modification-based data hiding methods, FSIS-GAN is designed without a cover image being provided beforehand. Compared with the existing stego synthesis-based methods, FSIS-GAN can not only synthesize realistic facial stego images, but also achieve superior performance in terms of robustness and anti-forensic capability. Experimental results conducted on a public facial dataset validate these merits of our proposed method. The main contributions of this work can be summarized as follows,</p><p>• We explicitly consider the image degradation during covert communication, and integrate multiple degradation layers into the framework. This boosts the robustness of message extraction. • We incorporate a forensic network during the training of FSIS-GAN. By exploiting the gradients from this forensic network, the stego images produced by the learned generator can effectively fool the forensic network. • We explicitly adopt a secret key in the data hiding procedure of FSIS-GAN, which further improves the reliability of secret message extraction.</p><p>The rest of this paper is organized as follows. Section 2 briefly reviews the related work on stego synthesis-based data hiding. Section 3 describes the proposed FSIS-GAN, including the network architecture and loss functions. Section 4 presents the experimental results, and the conclusions are drawn in Section 5.</p></div>
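The ad-hoc LSB replacement mentioned in the introduction can be sketched in a few lines of Python; the function names and toy pixel values below are purely illustrative:

```python
# Minimal sketch of LSB replacement: each secret bit overwrites the
# least significant bit of one cover pixel. Illustrative only.

def lsb_embed(cover, bits):
    """Replace the LSB of each of the first len(bits) pixels."""
    assert len(bits) <= len(cover)
    stego = list(cover)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | b   # clear the LSB, set the secret bit
    return stego

def lsb_extract(stego, n_bits):
    """Read the secret bits back from the LSB plane."""
    return [p & 1 for p in stego[:n_bits]]

pixels = [137, 200, 64, 255, 18, 91]
secret = [1, 0, 1, 1, 0, 0]
stego = lsb_embed(pixels, secret)
assert lsb_extract(stego, len(secret)) == secret
```

As the text notes, such replacement leaves statistical traces in the LSB plane, which is precisely what steganalysis exploits.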
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Stego Synthesis-based Data Hiding</head><p>The majority of data hiding methods involve modification of the given cover images. However, such cover modification inevitably leaves detectable traces. Instead, Hu et al. <ref type="bibr" target="#b14">[15]</ref> suggested using the generator of GAN to synthesize a facial stego image from the secret message. Meanwhile, the secret message can be extracted from the stego image by the corresponding extractor network. Similarly, Zhang et al. <ref type="bibr" target="#b15">[16]</ref> exploited GAN to generate stego images with different semantic labels, which could improve the robustness of data extraction but significantly sacrifices the steganographic capacity. The main advantage of the GAN-based works is that they can synthesize stego images with rich semantics. However, we shall note that such stego images can be easily identified by some well-trained forensic networks. In addition, no satisfactory trade-off between capacity and extraction accuracy has been achieved.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Facial Image Data Hiding via Generative Stego Synthesis</head><p>In this section, we first give an overview of the proposed FSIS-GAN framework and then introduce each component of the framework, accompanied by a thorough discussion of the loss functions, network structure and training procedure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Overview of FSIS-GAN</head><p>The proposed FSIS-GAN framework is illustrated in Figure <ref type="figure" target="#fig_0">1</ref>. In general, it is an end-to-end framework consisting of three parts, where each part is designed to achieve a specific goal. First, the part of facial stego image synthesis and message extraction contains a generator G, an extractor E and the degradation layers N. The generator G converts the secret message along with the secret key into a facial stego image. The degradation layers N simulate common image degradations that may occur within the communication channel. The extractor E is learned to recover the secret message from the degraded stego image. Second, the adversarial training part contains a discriminator D, which aims at distinguishing genuine data samples from the ones produced by the generator G. Third, a well-trained existing forensic network F 𝜃 (parameterized by 𝜃) is introduced in the anti-forensics part, which can distinguish genuine facial images from synthesized facial stego images. Note that this target forensic network is treated as a fixed adversary, and its network parameters are always frozen.</p></div>
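The overall dataflow can be illustrated with a toy stand-in, where the generator, degradation layers and extractor are replaced by a trivial XOR scheme — purely illustrative placeholders, not the actual networks:

```python
# Toy stand-ins for the FSIS-GAN dataflow of Section 3.1.
def G(m, k):          # generator: message + key -> stego "image" (XOR toy)
    return [(mi + ki) % 2 for mi, ki in zip(m, k)]

def N(S):             # degradation layers (identity in this sketch)
    return S

def E(S, k):          # extractor: degraded stego + key -> message
    return [(si + ki) % 2 for si, ki in zip(S, k)]

m = [1, 0, 1, 1]
k = [0, 1, 1, 0]
S = G(m, k)                      # S = G(m, k), Eq. (1)
m_prime = E(N(S), k)             # m' = E(N(S), k), Eq. (2)
assert m_prime == m              # the correct key recovers the message
k_wrong = [1, 1, 0, 0]
assert E(N(S), k_wrong) != m     # a wrong key fails, as intended
```

In the real framework both mappings are learned networks and N is a noise/blur/compression layer, but the composition E(N(G(m, k)), k) ≈ m is the same.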
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Stego Image Synthesis and Message Extraction</head><p>The part of facial stego image synthesis and message extraction achieves two functionalities. First, by using the generator G, one can convert the given secret message into a facial stego image. Second, the extractor E is responsible for extracting the secret message from the input stego image. Furthermore, a secret key is introduced to ensure communication reliability and high diversity of the generated facial stego images. Generally, the generator G and extractor E aim to learn two mappings, i.e., mapping the given secret message into a stego image, and vice versa. More formally, let m ∈ {0, 1} 𝑙 𝑚 and k ∈ {0, 1} 𝑙 𝑘 be the binary secret message and the secret key, respectively. The generator G learns the first mapping, transforming the message m along with the secret key k into a stego image:</p><formula xml:id="formula_0">S = G(m, k),<label>(1)</label></formula><p>where S denotes the synthesized facial stego image of shape 𝐶 × 𝐻 × 𝑊. To recover the secret message, we next introduce the extractor E. Considering that the facial stego image S may be degraded during transmission, the second mapping should be from the degraded stego image along with the secret key k to the secret message, which can be expressed by</p><formula xml:id="formula_1">m ′ = E(N(S), k),<label>(2)</label></formula><p>where N(⋅) models the image degradation process, and N(S) is the degraded stego image. Here, m ′ ∈ (0, 1) 𝑙 𝑚 denotes the extracted secret message. 
It shall be noted that the extracted message m ′ should (approximately) equal the original secret message m, so that an error-correcting mechanism can be employed to fully correct the erroneous bits.</p><p>To measure the distortion between the original secret message m and the extracted message m ′ , we use the cross-entropy loss as the message extraction loss L E , which is given by</p><formula xml:id="formula_2">L E (m, m ′ ) = − 1 𝑙 𝑚 𝑙 𝑚 ∑ 𝑖=1 [𝑚 𝑖 log(𝑚 ′ 𝑖 ) + (1 − 𝑚 𝑖 )log(1 − 𝑚 ′ 𝑖 )],<label>(3)</label></formula><p>where 𝑚 𝑖 and 𝑚 ′ 𝑖 are the 𝑖-th elements of m and m ′ , respectively. Note that our proposed FSIS-GAN framework explicitly receives a secret key as an input, which is designed to satisfy Kerckhoffs' principle. This means that even if the extractor network E is completely exposed to an attacker, the secret message m can be recovered only if the receiver obtains both the secret key k and the facial stego image S. It is worth emphasizing that most of the existing GAN-based methods, e.g., <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref>, involve no secret key. Further notice that, as an input to the extractor E, the dimension of the secret key k is much smaller than that of the facial stego image S. Thus, the extractor E tends to discard the secret key because it carries much less information. To mitigate this issue, we propose to additionally feed randomly generated incorrect secret keys k ∈ {0, 1} 𝑙 𝑘 , where k ≠ k, during the training stage. Besides minimizing the difference between the extracted and original message under the correct secret key, we maximize this difference when an incorrect secret key is applied. 
Mathematically, the inverse loss term L Ẽ can be expressed as the negative cross-entropy loss:</p><formula xml:id="formula_3">L Ẽ(m, m ′ ) = 1 𝑙 𝑚 𝑙 𝑚 ∑ 𝑖=1 [𝑚 𝑖 log( m′ 𝑖 ) + (1 − 𝑚 𝑖 )log(1 − m′ 𝑖 )],<label>(4)</label></formula><p>where m′ 𝑖 is the 𝑖-th element of the message m ′ extracted with the incorrect key k , i.e., m ′ = E(N(S), k ).</p></div>
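A minimal NumPy sketch of the extraction loss of Eq. (3) and the inverse loss of Eq. (4), applied to hypothetical soft-bit extractor outputs:

```python
import numpy as np

def extraction_loss(m, m_prime, eps=1e-12):
    """Cross-entropy L_E of Eq. (3): small when m' matches m."""
    m_prime = np.clip(m_prime, eps, 1 - eps)   # guard log(0)
    return -np.mean(m * np.log(m_prime) + (1 - m) * np.log(1 - m_prime))

def inverse_loss(m, m_tilde, eps=1e-12):
    """Negative cross-entropy L_~E of Eq. (4): minimized when the
    message extracted with a wrong key is far from m."""
    return -extraction_loss(m, m_tilde, eps)

m = np.array([1., 0., 1., 1.])                # secret bits
good = np.array([0.95, 0.05, 0.90, 0.97])     # hypothetical output, correct key
bad = np.array([0.5, 0.5, 0.5, 0.5])          # hypothetical output, wrong key
assert extraction_loss(m, good) < extraction_loss(m, bad)
```

Minimizing L_E plus L_Ẽ pushes the extractor towards confident bits under the correct key and towards uninformative (near-0.5) bits under a wrong key.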
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Enhancing robustness with degradation layers:</head><p>In a practical communication channel, degradations are often imposed on the synthesized stego image S when transmitting it to a receiver. The data hiding system therefore requires certain robustness to ensure the accuracy of message extraction. In this work, we take three representative degradations into account, i.e., noise pollution, blurring, and compression. For noise pollution, we consider one of the most widely-used noise models: Gaussian noise. For blurring, Gaussian blurring is used. For compression, JPEG image compression is employed, which is extensively used for reducing the transmission bandwidth. In experiments, we implement these three types of degradation as neural network layers N that degrade the stego image, with one layer simulating each type of degradation. The Gaussian noise layer (GNL) adds Gaussian noise to the facial stego image S, and the Gaussian blurring layer (GBL) blurs S. For JPEG compression, considering that the quantization operation is non-differentiable, we approximate it with a differentiable polynomial function. This differentiating technique follows the work HiDDeN <ref type="bibr" target="#b7">[8]</ref>.</p></div>
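The three degradation layers can be sketched as follows. The `soft_round` surrogate shown here is one common differentiable approximation of rounding; it is an assumption for illustration, not necessarily the exact polynomial used by HiDDeN:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise_layer(img, sigma=0.2):
    """GNL: add zero-mean Gaussian noise to the stego image."""
    return img + rng.normal(0.0, sigma, img.shape)

def gaussian_blur_layer(img, d=3, sigma=1.0):
    """GBL: Gaussian blur with a d-tap kernel (1-D demo for brevity)."""
    r = d // 2
    k = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    k /= k.sum()                         # normalized kernel
    return np.convolve(img, k, mode="same")

def soft_round(x):
    """Differentiable surrogate for the rounding step of JPEG
    quantization: round(x) approximated by x - sin(2*pi*x)/(2*pi)."""
    return x - np.sin(2 * np.pi * x) / (2 * np.pi)

# Near-integer inputs are rounded almost exactly, yet gradients flow.
x = np.array([0.1, 0.9, 2.1, 2.9])
assert np.all(np.abs(soft_round(x) - np.round(x)) < 0.1)
```

During training, these layers sit between G and E, so the generator learns stego patterns whose message survives the simulated channel.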
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Adversarial Training Part</head><p>As aforementioned, the hand-crafted stego synthesis-based data hiding methods <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14]</ref> could only synthesize patterned images such as textures and fingerprints, limiting their practical applications. Synthesizing a natural image with semantics is a challenging task. However, this problem can be alleviated with the guidance of adversarial training. In this part, the purpose of the discriminator D is to conduct adversarial training with the generator G and improve the plausibility of the synthesized facial stego images.</p><p>More specifically, let I be a genuine facial image sample of shape 𝐶 × 𝐻 × 𝑊 from a publicly available facial image dataset. The discriminator D estimates whether a given image sample was synthesized by the generator G, while the generator G attempts to fool the discriminator D. Through such adversarial training, the generator G is encouraged to synthesize much more realistic facial stego images. As a variant of GAN, the network structure and loss function of BEGAN <ref type="bibr" target="#b16">[17]</ref> provide a good reference for improving training stability. Thus, in this work we employ the adversarial training loss of BEGAN. Mathematically, the adversarial loss L adv for the generator G can be calculated as</p><formula xml:id="formula_4">L adv (D(S), S) = 1 𝐶𝐻 𝑊 [|D(S) − S|],<label>(5)</label></formula><p>where the output D(S) has the same shape as the facial stego image. 
The corresponding loss L D for the discriminator D is</p><formula xml:id="formula_5">L D (I, S) = 1 𝐶𝐻 𝑊 [|D(I) − I| − ℎ 𝑡 ⋅ |D(S) − S|],<label>(6)</label></formula><p>where ℎ 𝑡 controls the discrimination ability of D in the 𝑡-th training step so as to equilibrate the adversarial training. It can be computed as</p><formula xml:id="formula_6">ℎ 𝑡+1 = ℎ 𝑡 + 𝜆 𝐶𝐻 𝑊 [𝛾|D(I) − I| − |D(S) − S|].<label>(7)</label></formula><p>Here, the parameter 𝜆 is the learning rate, and 𝛾 is a hyper-parameter that controls the diversity of the synthesized facial images. The quality and diversity of the facial stego images can be freely adjusted by tuning 𝛾.</p></div>
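Equations (5)–(7) can be sketched as a single update step. Here `D` is a placeholder for the auto-encoder discriminator, and clipping h to [0, 1] follows common BEGAN practice (an assumption of this sketch):

```python
import numpy as np

def began_losses(D, I, S, h_t, lam=1e-3, gamma=0.7):
    """One BEGAN-style step for Eqs. (5)-(7). Reconstruction errors
    are mean absolute errors over the C*H*W image elements."""
    rec_real = np.mean(np.abs(D(I) - I))        # |D(I) - I| / CHW
    rec_fake = np.mean(np.abs(D(S) - S))        # |D(S) - S| / CHW
    L_adv = rec_fake                            # generator loss, Eq. (5)
    L_D = rec_real - h_t * rec_fake             # discriminator loss, Eq. (6)
    h_next = h_t + lam * (gamma * rec_real - rec_fake)   # Eq. (7)
    return L_adv, L_D, float(np.clip(h_next, 0.0, 1.0))

D = lambda x: 0.5 * x            # toy auto-encoder stand-in
I = np.ones(4)                   # "genuine" sample
S = np.full(4, 0.2)              # "synthesized" sample
L_adv, L_D, h1 = began_losses(D, I, S, h_t=0.0)
assert abs(L_adv - 0.1) < 1e-9
```

The equilibrium term h_t rises when the discriminator reconstructs real images too well relative to fakes, automatically rebalancing the two players.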
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Anti-forensics Part</head><p>Recall that no explicit cover image is involved in stego synthesis-based data hiding methods. This merit allows this type of data hiding to effectively resist conventional steganalysis detection. However, as pointed out in <ref type="bibr" target="#b14">[15]</ref>, a well-trained forensic network can readily distinguish a synthesized stego image from a genuine one, even if the synthesized stego image exhibits no perceptual difference to an observer. Although F 𝜃 is an expert in such a detection task, some studies <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref> have shown that deep neural network-based classifiers are vulnerable to adversarial examples. Inspired by this, we propose to apply the strategies for crafting adversarial examples to evade the stego detection network, as a way of realizing anti-forensics. In the FSIS-GAN framework, we consider a white-box scenario, i.e., we assume full knowledge of the target forensic network. The target forensic network F 𝜃 is trained with genuine images from a publicly available facial dataset and synthesized images produced by BEGAN <ref type="bibr" target="#b16">[17]</ref>. Then, we integrate the well-trained F 𝜃 into the FSIS-GAN framework, in which F 𝜃 receives the synthesized facial stego image S and outputs a confidence score. The gradients back-propagated through F 𝜃 are used to update the parameters of the generator G. To measure the loss of resisting forensic detection, we define the anti-forensic loss L F 𝜃 as the cross-entropy between the output of F 𝜃 and our target genuine-image label:</p><formula xml:id="formula_7">L F 𝜃 (S) = − log (1 − F 𝜃 (S)),<label>(8)</label></formula><p>where F 𝜃 (S) ∈ (0, 1) is the confidence output by F 𝜃 .</p><p>Clearly, a decrease of L F 𝜃 indicates an increased probability of S being identified as a genuine image by F 𝜃 .</p></div>
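A one-line sketch of the anti-forensic loss of Eq. (8); following the text, the confidence is read as the forensic network's belief that the image is synthesized:

```python
import math

def anti_forensic_loss(confidence):
    """L_F(S) = -log(1 - F_theta(S)) of Eq. (8). `confidence` is
    F_theta(S) in (0, 1), the (assumed) probability that S is
    synthesized. In training, F_theta's weights stay frozen and
    only its gradients w.r.t. S update the generator."""
    return -math.log(1.0 - confidence)

# Lower loss <=> S is more likely judged genuine by F_theta.
assert anti_forensic_loss(0.1) < anti_forensic_loss(0.9)
```

Because the loss targets the "genuine" label, minimizing it drives the generator to produce exactly the kind of adversarial stego that fools the fixed detector.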
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Network Structure and Training Strategy</head><p>The network architectures of the generator G and the extractor E are shown in Figure <ref type="figure">2</ref>. For the generator G, the secret key vector k is first concatenated to the secret message vector m and then fed to the subsequent layers. G applies two fully-connected (FC) layers and three conv-transpose (ConvT) layers to produce the facial stego image S. In particular, after each FC or ConvT layer, we apply batch normalization (BN) <ref type="bibr" target="#b17">[18]</ref> and the ReLU activation function to the intermediate vectors. In experiments, we found that since both m and k are composed of the binary numbers 0 and 1, such a form is not suitable as a direct input, and the adversarial training loss would diverge.</p><p>To solve this issue, additional BN layers were added, so that the normalization operation is carried out inside the network. Empirical results show that this trick greatly alleviates the divergence problem. For the extractor E, we shall fuse the secret key vector k and the facial stego image S in a way such that the extractor E does not neglect the information provided by the secret key. To this end, the extractor E first applies an FC layer to the secret key to form an intermediate matrix of shape 1 × 𝐻 × 𝑊. Then, the facial stego image S and the intermediate matrix are concatenated, and the fused tensor is fed to four convolutional (Conv) layers. Finally, the extractor E applies an FC layer and the Sigmoid activation function to produce the message vector m ′ (or m′ ) of size 1 × 𝑙 𝑚 .</p><p>For the discriminator D, we adopt the auto-encoder-like structure from BEGAN <ref type="bibr" target="#b16">[17]</ref>. 
For the target forensic network F 𝜃 , we use Ye-Net <ref type="bibr" target="#b18">[19]</ref>, which is a widely-used steganalytic method.</p><p>The training process of the proposed FSIS-GAN framework iteratively optimizes the loss function of each network, except that of the well-trained forensic network F 𝜃 . We apply the extraction loss L E and the adversarial loss L D as the loss functions for the extractor E and the discriminator D, respectively. In particular, the total loss L G for the generator G is a weighted fusion of the aforementioned losses:</p><formula xml:id="formula_8">L G = L adv + 𝛼(L E + L Ẽ) + 𝛽L F 𝜃 ,<label>(9)</label></formula><p>where L adv is the adversarial loss for G, L Ẽ is the inverse loss, and L F 𝜃 is the anti-forensic loss. The hyper-parameters 𝛼 and 𝛽 control the relative importance among the four losses.</p></div>
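The fusion of Eq. (9) is a simple weighted sum; the default weights below are the values the paper reports in its experimental setup:

```python
def total_generator_loss(L_adv, L_E, L_E_inv, L_F, alpha=0.1, beta=0.1):
    """Total generator loss L_G of Eq. (9): adversarial loss plus
    weighted extraction, inverse, and anti-forensic terms.
    alpha = beta = 0.1 follows the paper's experimental setting."""
    return L_adv + alpha * (L_E + L_E_inv) + beta * L_F

# Hypothetical loss values for one training step.
assert abs(total_generator_loss(1.0, 0.5, -0.5, 2.0) - 1.2) < 1e-9
```

Each mini-batch thus updates G against four objectives at once, while E and D are updated only against L_E and L_D, respectively.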
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experimental Results</head><p>In this section, we first introduce the experimental setup. Then, to verify the robustness of the proposed FSIS-GAN, we evaluate it without and with image degradations, respectively. Finally, the anti-forensic capability of FSIS-GAN is validated.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experimental Setup</head><p>Our experiments are conducted on the CelebA dataset <ref type="bibr" target="#b19">[20]</ref>, where the facial region is identified and cropped. All images are reshaped into 3 × 64 × 64. The following three metrics are used for evaluation:</p><p>• Fréchet Inception Distance (FID) <ref type="bibr" target="#b20">[21]</ref>, which is a widely-used perceptual image quality assessment metric for synthesized images. • Probability of missed detection (PMD). This metric is calculated by PMD = 𝐹 𝑁 𝐹 𝑁 +𝑇 𝑃 , where 𝐹 𝑁 (False Negative) is the ratio of cases where a synthesized facial image is misclassified as a genuine one, and 𝑇 𝑃 (True Positive) is the ratio of cases where a synthesized facial image is correctly detected. A larger PMD indicates a higher ability to resist the forensic network. • Message extraction accuracy, i.e., the percentage of secret message bits correctly recovered by the extractor E.</p><p>The proposed FSIS-GAN framework is implemented with PyTorch and trained on four NVIDIA GTX 1080Ti GPUs with 11GB memory. Training uses a mini batch-size of 64. We use Adam <ref type="bibr" target="#b21">[22]</ref> as the optimizer with a learning rate of 2 × 10 −4 . For the hyper-parameters 𝛼 and 𝛽 in (<ref type="formula" target="#formula_8">9</ref>), after a number of trials, we empirically set both to 0.1 in the experiments. The parameter 𝛾 in ( <ref type="formula" target="#formula_6">7</ref>) is set to 0.7, which is expected to produce reasonably diverse facial stego images. The competing method is the most related work <ref type="bibr" target="#b14">[15]</ref>. We implemented this work ourselves because there is no publicly available code. With certain tweaking and fine-tuning, our tested results were comparable to those originally reported in <ref type="bibr" target="#b14">[15]</ref>. 
For a fair comparison, the lengths of the secret message 𝑙 𝑚 and the secret key 𝑙 𝑘 are both set to 100, so that the payload is identical to that of the work <ref type="bibr" target="#b14">[15]</ref>. </p></div>
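The PMD metric defined above is straightforward to compute; the counts below are hypothetical:

```python
def probability_missed_detection(fn, tp):
    """PMD = FN / (FN + TP): the fraction of synthesized stego
    images that the forensic network misses (labels as genuine).
    A larger PMD means stronger anti-forensic capability."""
    return fn / (fn + tp)

# Hypothetical counts: 30 synthesized images missed, 70 detected.
assert probability_missed_detection(30, 70) == 0.3
```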
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Performance Without Degradations</head><p>Notice that the competing method <ref type="bibr" target="#b14">[15]</ref> does not consider image degradations. To verify the effectiveness of the proposed method under the same settings and make a fair comparison, we in this subsection evaluate the performance without the degradation layers N. The facial stego image S is transmitted to the extractor E without any degradation. To avoid confusion, this variant of our proposed method is termed FSIS-GAN-WD (WD is abbreviated for Without Degradations). We first compare the visual quality of the facial stego images with the competing method <ref type="bibr" target="#b14">[15]</ref>. As can be seen from Figure <ref type="figure" target="#fig_3">3</ref>, the proposed FSIS-GAN-WD synthesizes more realistic facial stego images than Hu et al. <ref type="bibr" target="#b14">[15]</ref>. Upon closer inspection, one can notice that the stego images produced by FSIS-GAN-WD are more vivid and have more correct semantic structures. It is difficult for a human observer to notice the inauthenticity of the facial stego images synthesized by FSIS-GAN-WD. In contrast, the stego images generated by Hu et al. <ref type="bibr" target="#b14">[15]</ref> are typically blurry and severely distorted, which would apparently draw attention from a forensic analyzer. For the FID evaluation, we use 10,000 pairs of genuine images and synthesized facial stego images to compute the FID score. The FID score of FSIS-GAN-WD is 23.20, which is much smaller than the 32.07 of Hu et al. <ref type="bibr" target="#b14">[15]</ref>.</p><p>Then, we evaluate the extraction accuracy for the case without degradation. The results are tabulated in Table <ref type="table">1</ref>. 
To demonstrate the impact of the inverse loss L Ẽ on the extraction accuracy, ablation experiments are also conducted by excluding the inverse loss during training. This L Ẽ-ablated version is denoted FSIS-GAN-WD (ex L Ẽ). From Table <ref type="table">1</ref>, one can draw the following conclusions. First, the extraction accuracy of FSIS-GAN-WD with the correct secret key k is 98.76%, which dramatically outperforms the 85.23% of the competing method <ref type="bibr" target="#b14">[15]</ref>. Second, by comparing FSIS-GAN-WD and FSIS-GAN-WD (ex L Ẽ), one can see that the extraction accuracy of FSIS-GAN-WD with a correct secret key k is slightly inferior to that of FSIS-GAN-WD (ex L Ẽ). This suggests that the introduced inverse loss marginally harms the extraction accuracy. However, for the case of an incorrect key k , the participation of the inverse loss L Ẽ drives the extraction accuracy towards random guessing, which is exactly the desired key-dependent behavior.</p><p>Table <ref type="table">1</ref>: Comparison of message extraction accuracy (%) for the case of no communication degradations. Here, k and k denote the correct and incorrect secret key, respectively. FSIS-GAN-WD is a variant of the proposed method excluding the degradation layers, and FSIS-GAN-WD (ex L Ẽ) represents FSIS-GAN-WD trained without the inverse loss L Ẽ.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Performance With Degradations</head><p>In this subsection, we test the robustness of the proposed framework under certain image degradations. The image degradation type and level are given as prior knowledge. This scenario is common in practice because one can obtain some prior knowledge of the degradation by probing the communication channel. Thus, one can fix the degradation layers N and their associated parameters during the training stage. Specifically, in our experiments, the standard deviation 𝜎 1 of the Gaussian noise layer (GNL) is set to 0.2. The kernel width 𝑑 and the standard deviation 𝜎 2 of the Gaussian blurring layer (GBL) are set to 3 and 1, respectively. The differentiable JPEG compression layer (JCL) is implemented as suggested by HiDDeN <ref type="bibr" target="#b7">[8]</ref>. For simplicity of reference, this variant is termed FSIS-GAN-FD (FD standing for Fixed Degradations) in the sequel. Firstly, the stego images synthesized by FSIS-GAN-FD are provided in Figure <ref type="figure" target="#fig_6">4</ref>. One can observe that some speckle noise emerges in the generated stego images, which can be clearly seen from the regions highlighted with red lines in Figure <ref type="figure" target="#fig_6">4 (b)</ref>. Quantitatively, the FID score of FSIS-GAN-FD is 41.40, which is inferior to those of FSIS-GAN-WD (23.20) and Hu et al. <ref type="bibr" target="#b14">[15]</ref> (32.07). Nevertheless, the stego images produced by FSIS-GAN-FD are intuitively more realistic than those of Hu et al. <ref type="bibr" target="#b14">[15]</ref>.</p><p>Secondly, in Table <ref type="table" target="#tab_0">2</ref>, we report the extraction accuracy under fixed degradations. Not surprisingly, one can notice that the extraction accuracy of Hu et al. 
<ref type="bibr" target="#b14">[15]</ref> and FSIS-GAN-WD degrades greatly, which can be attributed to their overlooking of the degradation-resistant message extraction issue. In contrast, FSIS-GAN-FD exhibits quite promising results: under all three types of degradation layers, its extraction accuracy typically exceeds 94% (though lower than that of FSIS-GAN-WD, which is specifically designed for the non-degradation scenario). The results verify that, for the case of known degradations, the proposed framework can learn to effectively resist the fixed degradations by employing the fixed degradation layers during training. Finally, to illustrate how the robustness of message extraction changes under different degradation levels, we test different degradation types with a variety of degradation levels. Due to space limits, we only report the JPEG compression degradation in Figure <ref type="figure" target="#fig_7">5</ref>. As can be seen, as the quality factor (𝑄𝐹) decreases, the extraction accuracy generally decreases. Although the JCL adopted from HiDDeN <ref type="bibr" target="#b7">[8]</ref> can approximate the non-differentiable JPEG compression, it cannot perfectly reproduce the JPEG compression artifacts. Nevertheless, FSIS-GAN-FD still achieves superior robustness compared with the other two schemes.</p></div>
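The fixed degradation layers described above (GNL with 𝜎1 = 0.2, GBL with 𝑑 = 3 and 𝜎2 = 1) can be sketched in NumPy. This is an illustrative single-channel approximation under our own naming, not the paper's training-time implementation (which must remain differentiable inside the network):

```python
import numpy as np

def gaussian_noise_layer(img, sigma=0.2, rng=None):
    # GNL: additive white Gaussian noise with standard deviation sigma,
    # clipped back to the valid [0, 1] intensity range
    rng = np.random.default_rng() if rng is None else rng
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def gaussian_kernel(d=3, sigma=1.0):
    # GBL kernel: d x d Gaussian weights, normalized to sum to 1
    ax = np.arange(d) - (d - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur_layer(img, d=3, sigma=1.0):
    # GBL: 2-D convolution with reflect padding (single-channel sketch)
    k = gaussian_kernel(d, sigma)
    pad = d // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + d, j:j + d] * k)
    return out
```

Because the blur kernel is normalized, a constant image passes through the GBL unchanged, which is a handy sanity check.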
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Performance of Anti-forensics</head><p>Recall that, since no cover image is involved in the data hiding, our method has relatively good undetectability when exposed to a steganalyzer. However, as pointed out in <ref type="bibr" target="#b14">[15]</ref>, a well-trained forensic network can effectively identify a synthesized image. To address this issue, we explicitly consider the anti-forensic scenario and introduce the anti-forensic loss L F 𝜃 .</p><p>To demonstrate the influence of the anti-forensic loss L F 𝜃 , we conduct an ablation experiment by excluding the loss term L F 𝜃 ; this variant is termed FSIS-GAN (ex L F 𝜃 ). As a concrete example, we employ the well-trained forensic network Ye-Net <ref type="bibr" target="#b18">[19]</ref> as F 𝜃 to detect 3000 facial stego images produced by the different methods, and record the probability of missed detection (PMD). The PMDs of Hu et al. <ref type="bibr" target="#b14">[15]</ref>, FSIS-GAN (ex L F 𝜃 ), and FSIS-GAN are 3.23%, 8.84%, and 89.91%, respectively. As clearly shown, although the facial stego images of FSIS-GAN (ex L F 𝜃 ) look natural to humans, they are easily detected by the forensic network, with a PMD value lower than 10%. In contrast, by introducing the anti-forensic loss term, the PMD of FSIS-GAN reaches 89.91%. This means the proposed FSIS-GAN can effectively bypass the existing forensic network, demonstrating a strong anti-forensic capability.</p></div>
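The probability of missed detection reported here is simply the fraction of synthesized stego images that the forensic network fails to flag. A minimal sketch, where the 0.5 decision threshold and the function name are our assumptions (the paper does not specify Ye-Net's operating threshold):

```python
import numpy as np

def probability_of_missed_detection(detector_scores, threshold=0.5):
    # PMD: fraction of synthesized stego images the forensic network
    # misses, i.e., whose "synthetic" score falls below the threshold
    scores = np.asarray(detector_scores, dtype=float)
    return float(np.mean(scores < threshold))

# Example: the detector misses 2 of the 4 stego images -> PMD = 0.5
pmd = probability_of_missed_detection([0.9, 0.2, 0.1, 0.8])
```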
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this work, we proposed a stego-synthesis-based data hiding method using a generative neural network, explicitly considering image degradation and anti-forensic needs. Specifically, the generator synthesizes a facial stego image from the given secret message and secret key, while the extractor aims to recover the secret message with the secret key. Through adversarial training with the discriminator, the generator can produce realistic facial stego images. Degradation layers are introduced during training, which significantly enhance the robustness of message extraction. A forensic network is also incorporated during training, in response to possible adversarial forensic analysis in the communication channel. Experimental results verified that our approach can generate more natural facial stego images while retaining higher message extraction accuracy and a strong anti-forensic ability.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Overview of the proposed FSIS-GAN framework.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>( a )Figure 2 :</head><label>a2</label><figDesc>Figure 2: Network structure of the generator G and the extractor E. "Concat", "FC", "ConvT", "BN", "Conv" denote the concatenation, fully-connected layer, convtranspose layer, batch norm, and convolution layer, respectively.</figDesc><graphic coords="5,137.02,184.10,158.79,56.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>FID is a de facto metric for assessing the quality of images created by GAN generators. A lower score indicates better consistency with human perception of natural images. • Accuracy of message extraction (ACC), computed as ACC = 𝐿 Ext / 𝐿 , where 𝐿 Ext is the number of correctly extracted message bits and 𝐿 is the length of the secret message m.</figDesc></figure>
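The FID described above compares Gaussians fitted to deep features of real and synthesized images: FID = ||𝜇1 − 𝜇2||² + Tr(Σ1 + Σ2 − 2(Σ1Σ2)^(1/2)). A small NumPy sketch operating on precomputed feature statistics; the symmetric-PSD square-root rewrite is a standard identity, not from the paper, and the function names are ours:

```python
import numpy as np

def _sqrtm_psd(a):
    # matrix square root of a symmetric PSD matrix via eigendecomposition
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def fid(mu1, cov1, mu2, cov2):
    # FID = ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^{1/2});
    # for PSD C1, C2, Tr((C1 C2)^{1/2}) = Tr((C2^{1/2} C1 C2^{1/2})^{1/2}),
    # which keeps the inner matrix symmetric and numerically stable
    s2 = _sqrtm_psd(cov2)
    covmean = _sqrtm_psd(s2 @ cov1 @ s2)
    diff = np.asarray(mu1) - np.asarray(mu2)
    return float(diff @ diff + np.trace(cov1) + np.trace(cov2)
                 - 2.0 * np.trace(covmean))
```

Identical feature distributions give FID 0; shifting the mean of a unit-covariance Gaussian by a vector d adds exactly ||d||².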
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Comparison of exemplar synthesized stego images. Top: Hu et al. [15]; Bottom: Proposed FSIS-GAN-WD.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Comparison of synthesized facial stego images, where the four images in (a) are produced by FSIS-GAN-WD and the images in (b) by FSIS-GAN-FD. With the introduction of degradation layers, minor speckle noise emerges (highlighted with red rectangles).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Comparison of the message extraction accuracy (%) under various levels of JPEG compression degradation.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2</head><label>2</label><figDesc>Comparison of message extraction accuracy (%) under various degradation conditions. The bold value and the value marked with an asterisk (*) denote the highest extraction accuracy with the correct secret key k and the lowest extraction accuracy with the incorrect secret key k , respectively.</figDesc><table><row><cell>Scheme</cell><cell>Hu et al. [15]</cell><cell>FSIS-GAN-WD with k</cell><cell>FSIS-GAN-WD with k</cell><cell>FSIS-GAN-FD with k</cell><cell>FSIS-GAN-FD with k</cell></row><row><cell>W/o degradation</cell><cell>85.23</cell><cell>98.76</cell><cell>71.50*</cell><cell>98.22</cell><cell>72.08</cell></row><row><cell>Fixed GNL</cell><cell>52.72</cell><cell>59.78</cell><cell>56.23*</cell><cell>95.58</cell><cell>72.74</cell></row><row><cell>Fixed GBL</cell><cell>69.68</cell><cell>57.52</cell><cell>54.68*</cell><cell>98.58</cell><cell>73.78</cell></row><row><cell>Fixed JCL</cell><cell>65.33</cell><cell>61.38</cell><cell>58.00*</cell><cell>98.46</cell><cell>72.67</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was supported in part by the National Natural Science Foundation of China under Grant 61901237, in part by the Open Project Program of the State Key Laboratory of CADCG, Zhejiang University under Grant A2006, and in part by the Ningbo Natural Science Foundation under Grant 2019A610103. Thanks to Southeast Digital Economic Development Institute for supporting the computing facility.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Contentadaptive steganography by minimizing statistical detectability</title>
		<author>
			<persName><forename type="first">V</forename><surname>Sedighi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cogranne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fridrich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Inf. Forensics Security</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="221" to="234" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Secure reversible image data hiding over encrypted domain via key modulation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">C</forename><surname>Au</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">Y</forename><surname>Tang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Circuits Syst. Video Technol</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="441" to="452" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">First steps toward concealing the traces left by reversible image data hiding</title>
		<author>
			<persName><forename type="first">L</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Circuits Syst. II, Exp. Briefs</title>
		<imprint>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="page" from="951" to="955" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Hiding images within images</title>
		<author>
			<persName><forename type="first">S</forename><surname>Baluja</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Pattern Anal. Mach. Intell</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="page" from="1685" to="1697" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">SS-GAN: secure steganography based on generative adversarial networks</title>
		<author>
			<persName><forename type="first">H</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Pacific Rim Conference on Multimedia</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="534" to="544" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Automatic steganographic distortion learning using a generative adversarial network</title>
		<author>
			<persName><forename type="first">W</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Signal Process. Lett</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="1547" to="1551" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Generating steganographic images via adversarial training</title>
		<author>
			<persName><forename type="first">J</forename><surname>Hayes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Danezis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. Adv. Neural Inf. Process. Syst</title>
				<meeting>Adv. Neural Inf. Process. Syst.</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1954" to="1963" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">HiDDeN: Hiding data with deep networks</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. Eur. Conf. Comput. Vis</title>
				<meeting>Eur. Conf. Comput. Vis</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="657" to="672" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">A</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cuesta-Infante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Veeramachaneni</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1901.03892</idno>
		<title level="m">SteganoGAN: High capacity image steganography with GANs</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaremba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bruna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Fergus</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1312.6199</idno>
		<title level="m">Intriguing properties of neural networks</title>
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">J</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shlens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1412.6572</idno>
		<title level="m">Explaining and harnessing adversarial examples</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">CNNbased adversarial embedding for image steganography</title>
		<author>
			<persName><forename type="first">W</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Barni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Inf. Forensics Security</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="2074" to="2087" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Steganography using reversible texture synthesis</title>
		<author>
			<persName><forename type="first">K</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Image Process</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="130" to="139" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Toward construction-based data hiding: From secrets to fingerprint images</title>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Image Process</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="1482" to="1497" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A novel image steganography method via deep convolutional generative adversarial networks</title>
		<author>
			<persName><forename type="first">D</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="38303" to="38314" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A generative method for steganography by cover synthesis with auxiliary semantics</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Tsinghua Science and Technology</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="516" to="527" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Berthelot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schumm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Metz</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1703.10717</idno>
		<title level="m">BEGAN: Boundary equilibrium generative adversarial networks</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Ioffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1502.03167</idno>
		<title level="m">Batch normalization: Accelerating deep network training by reducing internal covariate shift</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Deep learning hierarchical representations for image steganalysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Inf. Forensics Security</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2545" to="2557" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Deep learning face attributes in the wild</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Tang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. IEEE Int. Conf. Comput. Vis</title>
				<meeting>IEEE Int. Conf. Comput. Vis</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">GANs trained by a two time-scale update rule converge to a local Nash equilibrium</title>
		<author>
			<persName><forename type="first">M</forename><surname>Heusel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ramsauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Unterthiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Nessler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. Adv. Neural Inf. Process. Syst</title>
				<meeting>Adv. Neural Inf. Process. Syst.</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="6629" to="6640" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1412.6980</idno>
		<title level="m">Adam: A method for stochastic optimization</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
