=Paper=
{{Paper
|id=Vol-2283/MediaEval_18_paper_34
|storemode=property
|title=First Steps in Pixel Privacy: Exploring Deep Learning-based Image Enhancement against Large-Scale Image Inference
|pdfUrl=https://ceur-ws.org/Vol-2283/MediaEval_18_paper_34.pdf
|volume=Vol-2283
|authors=Zhuoran Liu,Zhengyu Zhao
|dblpUrl=https://dblp.org/rec/conf/mediaeval/LiuZ18
}}
==First Steps in Pixel Privacy: Exploring Deep Learning-based Image Enhancement against Large-Scale Image Inference==
First Steps in Pixel Privacy: Exploring Deep Learning-based Image Enhancement against Large-scale Image Inference

Zhuoran Liu, Zhengyu Zhao
Radboud University, Netherlands
{z.liu,z.zhao}@cs.ru.nl

ABSTRACT

In this paper, we present several enhancement approaches for the Pixel Privacy Task of MediaEval 2018. The goal of this task is to use image enhancement techniques to fool state-of-the-art convolutional neural network (ConvNet) classifiers on a scene classification problem, while maintaining the visual appeal of the images. Our proposed approaches are based on image cropping, adversarial perturbations, and style transfer, respectively. First, we show the potential influence of easy-to-use image processing operations, i.e., cropping (center cropping and random cropping). In the perturbation-based approach, we apply a white-box technique, which makes use of information from the ConvNet classifier. Based on the experiments, we observed some limitations of this approach, caused by, for example, image preprocessing. In addition, we demonstrate that the style transfer-based approach, although not originally developed for privacy protection, can also be used to reduce the effectiveness of classifiers for large-scale image inference. Specifically, we implement black-box techniques based on Generative Adversarial Networks. Experimental results show that the style transfer-based approach can address privacy protection and appeal improvement simultaneously.

1 INTRODUCTION

Multimedia data is generated every day and accumulated into large-scale datasets. Based on large-scale image data and the development of artificial intelligence, privacy-sensitive information, e.g., daily patterns and locations, can be efficiently inferred by state-of-the-art techniques [5]. The objective of the Pixel Privacy Task of MediaEval 2018 is to protect privacy-sensitive scene images against large-scale image inference algorithms, and at the same time maintain or even increase the visual appeal of the image.
Commonly used approaches to protection are based on hiding visible privacy-sensitive information in images. In the Visual Privacy Task of MediaEval [2], several approaches were proposed to protect information in video sequences. For example, [6] proposes an approach based on false colors within the JPEG architecture to prevent the revealing of sensitive information in video surveillance to viewers. Although these approaches can protect sensitive information in images, they are not applicable in the context of this task, because users would not like to share social images that have been blurred or obviously changed.

In the scenario of social images, we identified two main categories of techniques that can be used to protect privacy-sensitive scene images. The first is based on generating adversarial examples. There are already several techniques for generating adversarial examples, e.g., the L-BFGS method [13] and the fast gradient sign method [8]. The generated adversarial examples can fool ConvNet-based classifiers and thereby protect private information. The adversarial examples software library cleverhans [12] collects a number of such construction techniques. However, a perturbation-based approach may need information from the classifier, and the resulting images may not look good. We therefore also propose to use a style transfer approach to protect image privacy. This category of approaches protects sensitive information in images by transferring social images to another style and, at the same time, improving the image appeal. There are plenty of methods for style transfer. By making use of image representations from ConvNets, [7] renders the semantic content of an image in different styles. Generative models have also been applied to style transfer, for example, conditional adversarial networks for image-to-image translation [9] and cycle-consistent adversarial networks (CycleGAN) for unpaired image-to-image translation [16].

In this paper, we explore these two categories of approaches to image privacy and image appeal, and show their effectiveness based on the experiments.

Copyright held by the owner/author(s).

2 APPROACHES

In this section, we describe our perturbation-based approach and show its potential limitations. Then, a style transfer-based approach is proposed to achieve more effective protection and to generate images of better quality with respect to human perception.

2.1 Perturbation-based approach

Our perturbation-based approach generates a fixed 2-D perturbation vector. After adding this quasi-imperceptible vector to the original images, the performance of the ConvNet-based classifier decreases by a large margin. Our implementation of image perturbation follows Universal Adversarial Perturbation (UAP) [10], which makes use of DeepFool [11] and generalizes well across different ConvNets. This approach follows a white-box setting; in other words, the calculation of the perturbation needs explicit information from both the training dataset and the ConvNet model. Specifically, for each image in the dataset, it computes a minimal perturbation which sends the perturbed image to the decision boundary. This perturbation is then updated iteratively under a constraint, e.g., L∞ ≤ ξ, to keep the final perturbation as small as possible.

In our implementation, we calculate the perturbation vector on the basis of 3000 images from the validation set provided by the 2018 Pixel Privacy Task, and a ConvNet model (ResNet50) which was pretrained on the Places-Standard dataset [15].
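The iterative scheme described above — accumulate per-image minimal perturbations into one shared vector, projecting back onto the L∞ ball of radius ξ after each update — can be illustrated with a minimal NumPy sketch. This is our illustration rather than the authors' or the UAP reference code: the DeepFool inner step is replaced by a random stand-in, and the function names and the budget `XI` are assumptions.

```python
import numpy as np

XI = 10 / 255  # assumed L-infinity budget for the universal perturbation


def project_linf(v, xi=XI):
    """Project a perturbation onto the L-infinity ball of radius xi."""
    return np.clip(v, -xi, xi)


def minimal_step(image, v, rng):
    """Stand-in for the DeepFool step. In real UAP this would be the minimal
    perturbation sending image + v across the classifier's decision boundary."""
    return rng.normal(scale=0.01, size=image.shape)


def universal_perturbation(images, n_epochs=2, seed=0):
    """Accumulate per-image updates into one shared perturbation, projecting
    back onto the L-infinity ball after every update (the UAP constraint)."""
    rng = np.random.default_rng(seed)
    v = np.zeros_like(images[0])
    for _ in range(n_epochs):
        for img in images:
            v = project_linf(v + minimal_step(img, v, rng))
    return v


# Toy batch of 4 "images" with pixel values in [0, 1]
rng = np.random.default_rng(1)
images = rng.random((4, 224, 224, 3))
v = universal_perturbation(images)
perturbed = np.clip(images + v, 0.0, 1.0)  # one shared vector added to every image
```

The projection after every update is what keeps the final vector quasi-imperceptible; the final `np.clip` simply keeps the perturbed images in the valid pixel range.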
MediaEval’18, 29-31 October 2018, Sophia Antipolis, France

In the preprocessing step, we resize input images to 256 pixels on the short edge, keeping the original aspect ratio, and then crop a 224 × 224 square from the center of the image. After training, we add the resulting perturbation vector to the resized test images.

We summarize the potential limitations of UAP as follows. First, in most practical cases of social images, information about the inference model and its training set is not available, and it is hard to generate an optimal perturbation vector without this explicit information. Second, the quasi-imperceptible artifacts added to the perturbed images are still not satisfying in the context of social images. In addition, the generated perturbation is vulnerable to image preprocessing [1]. Exploratory experiments showed that preprocessing with cropping operations influences this approach: with additional scaling and cropping of the perturbed images, the top-1 accuracy of the classification drops from 46.4% to 41.9%.

Figure 1: An example of resulting images by different protection approaches (panels: Original, CartoonGAN, CycleGAN, UAP, Center crop, Random crop).

2.2 Style transfer-based approach

We propose a style transfer-based approach to protect image privacy and increase appeal. In particular, we apply GAN-based methods to change images to certain styles, such as the Ukiyo-e style with CycleGAN [16] and the Hayao style with CartoonGAN [4]. Both GAN-based methods are used for unpaired image-to-image translation. Given a source domain X (input images) and a target domain Y (styled images), a mapping G : X → Y is learned such that G(X) is indistinguishable from Y. The objective of the learning is a summation of adversarial losses and a cycle consistency loss.
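For reference, the full objective as formulated in [16] combines two adversarial losses with the cycle-consistency term, where F : Y → X is the inverse mapping, D_X and D_Y are the domain discriminators, and λ weights the consistency term:

```latex
\mathcal{L}(G, F, D_X, D_Y) =
  \mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
+ \mathcal{L}_{\text{GAN}}(F, D_X, Y, X)
+ \lambda\, \mathcal{L}_{\text{cyc}}(G, F),
\quad
\mathcal{L}_{\text{cyc}}(G, F) =
  \mathbb{E}_{x}\!\left[\lVert F(G(x)) - x \rVert_1\right]
+ \mathbb{E}_{y}\!\left[\lVert G(F(y)) - y \rVert_1\right]
```

The cycle-consistency term is what allows training without paired examples: translating to the target style and back must approximately recover the input.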
After training on the source and target sets, we obtain a mapping function G which can transfer the style of any input image to the target style.

Table 1: Evaluation results in terms of Top-1 and Top-5 prediction accuracy on the scene images from the MEPP18test dataset.

            Top-1 acc.   Top-5 acc.
Original      60.23%       88.63%
Hayao         41.57%       69.97%
Ukiyo-e       34.23%       62.00%
C-crop        39.33%       70.57%
R-crop        34.27%       63.17%
UAP           45.33%       76.97%

3 EVALUATION RESULTS

We submitted five runs for the Pixel Privacy Task of MediaEval 2018; Fig. 1 shows image examples enhanced by these five runs. In social multimedia, it is common for users to crop images and videos to improve their appeal before sharing them. Due to the different settings of ConvNet-based classifiers, image cropping may also influence the classification. For example, in the preprocessing stage of classification, the input images are scaled and cropped by default. We therefore submitted two runs based on center cropping and random cropping to explore the potential influence of image cropping. In addition, we submitted one run using our perturbation-based approach and two runs using our style transfer-based approach.

Table 1 presents the evaluation results of our five runs in terms of Top-1 and Top-5 classification accuracy. We see that the style transfer-based and image cropping approaches yield an obvious decrease in accuracy compared with the original performance of the classifier. The perturbation-based approach shows a smaller decrease, due to the image preprocessing discussed in Section 2.1. In addition, the number of training images and the selection of hyper-parameters in UAP may also influence the evaluation performance.

We also evaluate the aesthetic quality of the images enhanced by our runs [3], using NIMA (Neural Image Assessment) [14]. The Hayao style with CartoonGAN shows the largest increase in mean aesthetics score (5.09), compared with the score (4.472) of the original images.

4 CONCLUSIONS AND FUTURE WORK

In this paper, we propose a perturbation-based approach (white-box) and a style transfer-based approach (black-box) to protect privacy and improve appeal in privacy-sensitive scene images. The proposed style transfer-based approach performs well in terms of both image privacy and image appeal.

From the exploratory experiments and evaluation results, we find that the perturbation-based approach generally works well but is vulnerable to image preprocessing; in addition, the added perturbation vector decreases the image appeal. For the style transfer-based approach, the prediction accuracy decreases significantly; in addition, the Hayao style shows an increase in appeal score under NIMA evaluation.

In the future, we will combine the two proposed approaches to simultaneously achieve effective privacy protection, based on optimal computation of the perturbation, and improve the aesthetic quality of the images.

ACKNOWLEDGMENTS

This work is part of the Open Mind research program, financed by the Netherlands Organization for Scientific Research (NWO). The experiments were carried out on the Dutch national e-infrastructure with the support of SURF Cooperative.

REFERENCES

[1] Anish Athalye and Ilya Sutskever. 2017. Synthesizing robust adversarial examples. arXiv preprint arXiv:1707.07397 (2017).
[2] Atta Badii, Mathieu Einig, Tomas Piatrik, and others. 2014. Overview of the MediaEval 2013 Visual Privacy Task. In MediaEval.
[3] Simon Brugman, Maciej Wysokinski, and Martha Larson. 2018. MediaEval 2018 Pixel Privacy Task: Views on image enhancement. In Working Notes Proceedings of the MediaEval 2018 Workshop.
[4] Yang Chen, Yu-Kun Lai, and Yong-Jin Liu. 2018. CartoonGAN: Generative Adversarial Networks for Photo Cartoonization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
9465–9474.
[5] Jaeyoung Choi, Martha Larson, Xinchao Li, Kevin Li, Gerald Friedland, and Alan Hanjalic. 2017. The Geo-Privacy Bonus of Popular Photo Enhancements. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. ACM, 84–92.
[6] Serdar Çiftçi, Ahmet Oğuz Akyüz, and Touradj Ebrahimi. 2018. A reliable and reversible image privacy protection based on false colors. IEEE Transactions on Multimedia 20, 1 (2018), 68–81.
[7] Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2414–2423.
[8] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and Harnessing Adversarial Examples. CoRR abs/1412.6572 (2014). http://arxiv.org/abs/1412.6572
[9] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-image translation with conditional adversarial networks. arXiv preprint (2017).
[10] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. 2017. Universal adversarial perturbations. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17). 1765–1773.
[11] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. 2016. DeepFool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’16). 2574–2582.
[12] Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma, Tom Brown, Aurko Roy, Alexander Matyasko, Vahid Behzadan, Karen Hambardzumyan, Zhishuai Zhang, Yi-Lin Juang, Zhi Li, Ryan Sheatsley, Abhibhav Garg, Jonathan Uesato, Willi Gierke, Yinpeng Dong, David Berthelot, Paul Hendricks, Jonas Rauber, and Rujun Long. 2018. Technical Report on the CleverHans v2.1.0 Adversarial Examples Library. arXiv preprint arXiv:1610.00768 (2018).
[13] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
[14] Hossein Talebi and Peyman Milanfar. 2018. NIMA: Neural image assessment. IEEE Transactions on Image Processing 27, 8 (2018), 3998–4011.
[15] Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. 2017. Places: A 10 Million Image Database for Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017).
[16] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. 2017. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Computer Vision (ICCV), 2017 IEEE International Conference on.