MediaEval 2020: Maintaining Human-Imperceptibility of Image Adversarial Attack by using Human-Aware Sensitivity Map

Zhiqi Shen, Muhammad Furqan Habibi, Shaojing Fan, Mohan Kankanhalli
National University of Singapore
dcsshenz@nus.edu.sg, furqan.habibi@u.nus.edu, dcsfs@nus.edu.sg, mohan@comp.nus.edu.sg

Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval'20, December 14-15 2020, Online.

ABSTRACT

With the rapid rise of big data and the development of artificial intelligence, privacy has come under the spotlight. Adversarial attacks using image perturbations have recently been introduced to fool machines on pattern recognition tasks, and they have also been successfully employed to protect the privacy of images. However, only a few works consider how perceptible these perturbations are to humans. This report presents our submission to the Pixel Privacy task, in which we improve the imperceptibility of image perturbations by using a human-aware sensitivity map while protecting image privacy via adversarial attack techniques.

1 INTRODUCTION

The Pixel Privacy task [7] of MediaEval aims to protect personal privacy by embedding human-imperceptible noise in images so that the noise fools BIQA classifiers. The attack models use the InceptionResNetV2 architecture and are pre-trained on the KonIQ-10k dataset. The organizers evaluated performance in terms of attack success rate (accuracy) and imperceptibility of the perturbation.

Figure 1: Sensitivity map examples. The left column shows the original images and the right column shows the corresponding sensitivity maps. For example, in the first image the sensitivity map highlights the humans, indicating that noise added to the human region will be perceived more easily than noise added to the background.

Prior work usually applies the $L_2$ norm [1, 5, 6] in the loss function to improve the imperceptibility of perturbed images. However, the $L_2$ norm only guarantees that the overall noise is small, without considering the perceptual characteristics of individual regions. For example, observers perceive the same noise differently when it is added to a flat background versus a content-rich one. With this insight, we can apply a sensitivity map in the loss function that indicates which regions can be changed with the least chance of being noticed, so that the algorithm knows where to add the noise. Recent works [2, 4, 11] published after our earlier work [9] do take the human imperceptibility of perturbations into account; unlike our deep learning-based method, most of them compute human imperceptibility from texture information.

Our method is an optimization-based approach built on the CW attack [1]. We push each input image's model logits towards a target class and optimize the attack by modifying the input image to minimize a loss function. To improve human imperceptibility, we extend the loss function with human sensitivity maps learned from [9]. Experimental evaluation indicates that our approach achieves good results in terms of human imperceptibility.

2 APPROACH

2.1 Preliminaries

We denote an image by $I \in \mathbb{R}^{H \times W \times C}$, where $H$, $W$, and $C$ are the frame height, frame width, and number of channels, respectively. The BIQA classifier $f(X, \theta)$ takes an image as input and produces the corresponding logits $l \in \mathbb{R}^{K}$, where $K$ is the number of classes. A softmax layer follows the network to transform the logits into class probabilities $y$, so the whole BIQA classifier is represented by $\mathrm{softmax}(f(X, \theta)) = y$.

An image adversarial attack aims to find an adversarial image $I_{adv}$ that maximizes the classification error. We denote by $I_s = I_{adv} - I$ the adversarial image perturbation.

We propose an optimization-based approach. The general idea of generating the perturbation for an image is captured by the following optimization problem:

$\arg\min_{I_s} \; \alpha D(I_s) + \ell(f(I + I_s, \theta), \hat{l})$   (1)

where $D(\cdot)$ is the perception regularization that keeps the perturbation small and imperceptible to humans, $\hat{l}$ is the target logits, and $\ell(\cdot, \cdot)$ is the loss function that measures the difference between the actual prediction and the target prediction. To obtain a high attack success rate, we minimize the distance between the actual logits and the target logits. $\alpha$ is a hyper-parameter that balances the two terms.
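For concreteness, the following is a minimal PyTorch sketch of the optimization loop behind Eq. (1). It is illustrative only and not the authors' released code: `biqa_model`, `perception_reg`, and `target_loss` are placeholder names for the pre-trained classifier $f$, the perception regularizer $D$ (Section 2.3), and the loss term $\ell$ (Section 2.2), and the loop handles a single image.

```python
# Illustrative sketch; parameter names and defaults are ours.
import torch

def generate_perturbed_image(image, target_logits, biqa_model, perception_reg,
                             target_loss, alpha=10.0, steps=200, lr=0.01):
    """Optimize the additive perturbation I_s of Eq. (1) for one image,
    given as a float tensor of shape (3, H, W) with values in [0, 1]."""
    perturbation = torch.zeros_like(image, requires_grad=True)  # I_s, initialized to zero
    optimizer = torch.optim.Adam([perturbation], lr=lr)

    for _ in range(steps):
        adv_image = (image + perturbation).clamp(0.0, 1.0)      # I + I_s, kept a valid image
        logits = biqa_model(adv_image.unsqueeze(0)).squeeze(0)   # f(I + I_s, theta)

        # Objective of Eq. (1): the perception regularizer balanced against
        # the distance between the current logits and the target logits.
        loss = alpha * perception_reg(perturbation) + target_loss(logits, target_logits)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return (image + perturbation).clamp(0.0, 1.0).detach()
```

The number of steps, the learning rate, and the use of Adam are arbitrary choices for the sketch; only Ξ± corresponds to the hyper-parameter reported in Section 3.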
2.2 Loss to fool machines

We follow the loss in [1] to fool machines. For the sake of clarity, we write $L_C = \ell(f(I + I_s, \theta), \hat{l})$; the detailed formulation is as follows:

$L_C = \begin{cases} |\max(f(I + I_s)) - \max(\hat{l})| & \text{if } \arg\max f(I + I_s) \neq \arg\max \hat{l} \\ 0 & \text{otherwise} \end{cases}$   (2)

where $f(I + I_s)$ and $\hat{l}$ are the vectors holding the current logits and the desired (one-hot) target logits. The loss consists of two parts. The first part covers the situation in which the perturbed image has not yet been classified into our desired class; the loss value is then the absolute distance between the most confident class in the current logits and the desired class. The second part covers the situation in which the perturbed image has been classified into our desired class, in which case we set the loss value to zero.
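A direct reading of Eq. (2) in PyTorch might look as follows; the function name is ours, and `target_logits` is assumed to be the one-hot vector $\hat{l}$.

```python
import torch

def machine_fooling_loss(logits: torch.Tensor, target_logits: torch.Tensor) -> torch.Tensor:
    """L_C of Eq. (2): while the prediction has not reached the desired class,
    penalize the absolute gap between the top current logit and the top target
    score; once the desired class is predicted, the loss is zero."""
    if logits.argmax() != target_logits.argmax():
        return (logits.max() - target_logits.max()).abs()
    return torch.zeros((), device=logits.device)
```

This function can serve as the `target_loss` placeholder in the optimization loop sketched after Eq. (1).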
2.3 Loss to fool humans

We observed that the traditional norms (e.g., $L_0$, $L_2$, $L_\infty$) treat all pixels in an image equally, whereas humans have different priorities when viewing different image regions. More specifically, adding the same perturbation noise to different regions leads to different levels of human perceptibility. To quantify the perceptibility of each pixel, we integrate a sensitivity map into our loss function. The value of each pixel in the sensitivity map ranges from 0 to 1, and a larger value indicates a higher chance that noise added to that pixel will be perceived.

Human-aware sensitivity map. Human perception is a complex phenomenon that is not easily captured in a neat mathematical formulation. Therefore, we train a neural network to produce a spatially dense prediction of the human sensitivity score of each pixel. The network is designed based on a fully convolutional network (FCN) [8]. The backbone is a VGG-16 [10] model pre-trained on the ImageNet dataset, and a 1Γ—1 convolutional layer combines all feature maps extracted from VGG-16 into the final sensitivity map. The architecture is illustrated in Figure 2; a simplified code sketch is given at the end of this section.

Figure 2: The sensitivity map prediction network. The network is based on an FCN and uses VGG-16 as its backbone.

Embed sensitivity maps into the attack approach. For this workshop, we train the sensitivity map generation model on the EMOd dataset [3] and then apply it to the given Places365 test set. To integrate human perceptual sensitivity, we extend the $L_2$ norm by weighting the perturbation with the sensitivity map $s$, as shown below:

$D(I_s) = \beta \, \| s \odot I_s \|_2^2$   (3)

where $\odot$ denotes element-wise multiplication and $\beta$ is a scaling factor.
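The following sketch illustrates the two components above with our own naming and a deliberately simplified architecture: `SensitivityMapNet` stands in for the FCN-based predictor of Figure 2 (it uses only the final VGG-16 feature stage followed by a 1Γ—1 convolution, rather than the multi-stage fusion of a full FCN [8]), and `sensitivity_weighted_reg` implements the weighted regularizer of Eq. (3) as we read it.

```python
# Illustrative sketch; module and function names are ours, not the authors'.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class SensitivityMapNet(nn.Module):
    """Simplified stand-in for the predictor of Figure 2: VGG-16 features
    (pre-trained on ImageNet) fused by a 1x1 convolution into a single-channel
    sensitivity map, upsampled back to the input resolution."""
    def __init__(self):
        super().__init__()
        self.backbone = vgg16(weights="IMAGENET1K_V1").features  # torchvision >= 0.13
        self.fuse = nn.Conv2d(512, 1, kernel_size=1)              # 1x1 fusion layer

    def forward(self, x):                                         # x: (N, 3, H, W), ImageNet-normalized
        features = self.backbone(x)                               # (N, 512, H/32, W/32)
        sal = self.fuse(features)
        sal = F.interpolate(sal, size=x.shape[-2:], mode="bilinear", align_corners=False)
        return torch.sigmoid(sal)                                 # values in [0, 1]; larger = more perceivable

def sensitivity_weighted_reg(perturbation, sensitivity_map, beta=1.0):
    """D(I_s) of Eq. (3): squared L2 norm of the perturbation, weighted
    element-wise by the sensitivity map so that noise placed in sensitive
    regions is penalized more heavily."""
    return beta * (sensitivity_map * perturbation).pow(2).sum()
```

Binding the regularizer to a fixed per-image map, for example with `functools.partial(sensitivity_weighted_reg, sensitivity_map=s)`, yields the `perception_reg` placeholder used in the loop of Section 2.1; the single-channel map broadcasts over the image channels.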
3 RESULTS AND ANALYSIS

We submitted five runs to the Pixel Privacy task. The organizers selected the 20 images with the largest BIQA variance for human evaluation. They then placed the same image from all qualified runs in one folder and asked 7 experts to select the most appealing (i.e., "Best") three runs out of 17 runs; a run could therefore be selected as "Best" at most 140 times (20 images Γ— 7 experts).

Table 1: Evaluation of our five runs. The first run, with parameter Ξ± = 10, achieves a high attack success rate, and more than half of its perturbed images were selected as best.

Parameter (Ξ±)   Accuracy (%)   Number of times selected as "Best"
10              42.73          74
20              52.91          Not qualified
30              62.36          Not qualified
40              75.10          Not qualified
50              93.82          Not qualified

From Table 1, we observe that the accuracy for our first run (with parameter Ξ± = 10) dropped below random guessing (50%), meaning that our perturbed images fooled the machine's predictions. More importantly, more than half of its images were selected among the best three out of 17 runs. From the trend across parameters, we can see the potential of our algorithm: if we tried more parameter values (e.g., smaller than 10), the performance might be even better than the current one. The other runs did not achieve a good attack rate, because their parameter Ξ± is too large and forces the optimization to focus more on image quality during back-propagation.

4 CONCLUSION AND FUTURE WORK

This report introduces our approach for privacy protection, which integrates a human-aware sensitivity map into the loss function to improve the quality of perturbed images. The results demonstrate the effectiveness of the sensitivity map in maintaining noise imperceptibility. However, some aspects can be further improved. The current sensitivity map prediction network is trained on the EMOd dataset, which has only 698 images, and its network structure (FCN) is rudimentary. We expect that a more sophisticated structure, trained on a larger dataset, can improve the performance.

REFERENCES
[1] Nicholas Carlini and David Wagner. 2017. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 39–57.
[2] Francesco Croce and Matthias Hein. 2019. Sparse and imperceivable adversarial attacks. In Proceedings of the IEEE International Conference on Computer Vision. 4724–4732.
[3] Shaojing Fan, Zhiqi Shen, Ming Jiang, Bryan L. Koenig, Juan Xu, Mohan S. Kankanhalli, and Qi Zhao. 2018. Emotional attention: A study of image sentiment and visual attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7521–7531.
[4] Diego Gragnaniello, Francesco Marra, Giovanni Poggi, and Luisa Verdoliva. 2019. Perceptual quality-preserving black-box attack against deep learning image classifiers. arXiv preprint arXiv:1902.07776 (2019).
[5] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. 2016. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533 (2016).
[6] Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. 2016. Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770 (2016).
[7] Zhuoran Liu, Zhengyu Zhao, Martha Larson, and Laurent Amsaleg. 2020. Exploring quality camouflage for social images. In Working Notes Proceedings of the MediaEval Workshop.
[8] Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3431–3440.
[9] Zhiqi Shen, Shaojing Fan, Yongkang Wong, Tian-Tsong Ng, and Mohan Kankanhalli. 2019. Human-imperceptible privacy protection against machines. In Proceedings of the 27th ACM International Conference on Multimedia. 1119–1128.
[10] Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
[11] Eric Wong, Frank R. Schmidt, and J. Zico Kolter. 2019. Wasserstein adversarial examples via projected Sinkhorn iterations. arXiv preprint arXiv:1902.07906 (2019).