Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable Color Filter

Zhengyu Zhao
Radboud University, Netherlands
z.zhao@cs.ru.nl

Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval'20, December 14-15 2020, Online.

ABSTRACT

This paper presents the submission of our RU-DS team to the Pixel Privacy Task 2020. We propose to fool a blind image quality assessment model by transforming images through the optimization of a human-understandable color filter. In contrast to common work that relies on small, $L_p$-bounded additive pixel perturbations, our approach yields large yet smooth perturbations. Experimental results demonstrate that, in the specific context of this task, our approach achieves strong adversarial effects but has to sacrifice image appeal.

1 INTRODUCTION

High-quality images shared online can be misappropriated for promotional goals. The Pixel Privacy Task [15] this year focuses on developing adversarial techniques that decrease the quality scores predicted by an automatic Blind Image Quality Assessment (BIQA) model [10], which effectively camouflages images from being promoted. A key requirement of such adversarial techniques is that the adversarial image should retain its original quality or become even more appealing to the human eye. Conventional work on generating adversarial images has focused on small additive perturbations, mostly bounded by the $L_p$ distance [2, 3, 9, 16] or by other metrics better aligned with visual perception [4, 18, 19, 21]. In this way, the adversarial image is only designed to maintain its original appearance as much as possible, rather than to enhance the image appeal.

In contrast, recent studies [1, 6, 7, 13, 14, 17, 20] have started to explore non-suspicious adversarial images that accommodate larger perturbations without arousing suspicion, because they transform groups of pixels along dimensions consistent with the human interpretation of images. Among them, Adversarial Color Enhancement (ACE) [20] can simultaneously achieve adversarial effects and image enhancement by optimizing a human-understandable parametric color filter. Its effectiveness was originally validated in the domains of image classification and segmentation.

One may argue that it is easier to conduct the optimization for adversarial effects and for image enhancement separately. However, we note that the joint optimization can yield larger perturbations that enjoy two important practical properties: robustness against common image processing operations and transferability to a black-box target model [1, 17, 20]. In this paper, we specifically explore the usefulness of ACE in this Pixel Privacy Task for decreasing the BIQA score while enhancing the image appeal.

2 APPROACH

In this section, we first recall the general formulation of Adversarial Color Enhancement (ACE) as proposed in [20], and then present the modifications for applying it to our specific Pixel Privacy Task.

2.1 Parametric Image Enhancement

Most advanced automatic photo enhancement algorithms parameterize the image editing process with DNNs, which however suffer from high computational cost and low interpretability [8, 12, 22]. In contrast, recent work [5, 11] has proposed to parameterize the process as human-understandable image filters. Such methods have far fewer parameters to optimize and can be applied independently of the image resolution.

Specifically, ACE adopts the approximation of the color filter in [11], which is formulated as a simple monotonic piecewise-linear mapping function:

$$F_{\boldsymbol{\theta}}(x_k) = \sum_{i=1}^{k-1} \frac{\theta_i}{\theta_{\mathrm{sum}}} + \big(K \cdot x_k - (k-1)\big) \cdot \frac{\theta_k}{\theta_{\mathrm{sum}}}, \qquad \theta_{\mathrm{sum}} = \sum_{k=1}^{K} \theta_k, \tag{1}$$

where $K$ denotes the total number of pieces. In this case, an input image pixel $x_k$ falling into the $k$-th piece is filtered using the parameter $\theta_k$, and $F_{\boldsymbol{\theta}}(x_k)$ is its corresponding output. In this way, pixels with similar colors are filtered with the same parameter, leading to a smooth color transformation. The three RGB channels are processed independently. An example of this function with four pieces ($K = 4$) is illustrated in Fig. 1.

Figure 1: A 4-piece color filter in ACE (from [20]).
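To make Eq. 1 concrete, the following is a minimal PyTorch sketch of the filter, written for illustration rather than taken from the implementation of [20]; the function name, the $(C, H, W)$ tensor layout, and the per-channel parameter matrix are our own assumptions.

```python
import torch

def piecewise_linear_filter(x: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Monotonic piecewise-linear color filter of Eq. 1 (illustrative sketch).

    x:     image with values in [0, 1], shape (C, H, W).
    theta: positive filter parameters, shape (C, K); one K-piece curve per
           channel, since the RGB channels are processed independently.
    """
    C, K = theta.shape
    theta_sum = theta.sum(dim=1, keepdim=True)      # sum_k theta_k, per channel
    slopes = theta / theta_sum                      # theta_k / theta_sum
    # Cumulative offset of piece k: sum_{i=1}^{k-1} theta_i / theta_sum.
    offsets = torch.cumsum(slopes, dim=1) - slopes  # shape (C, K)

    # Piece index of each pixel: k - 1 = floor(K * x), clamped into [0, K-1].
    idx = (x * K).long().clamp(max=K - 1)
    flat_idx = idx.view(C, -1)
    off = torch.gather(offsets, 1, flat_idx).view_as(x)
    slo = torch.gather(slopes, 1, flat_idx).view_as(x)

    # Eq. 1: offset + (K * x_k - (k - 1)) * theta_k / theta_sum.
    return off + (K * x - idx.float()) * slo
```

With $\boldsymbol{\theta}$ initialized to $\mathbf{1}_K/K$, this mapping reduces to the identity, so the optimization described next starts from the unmodified image.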
2.2 Adversarial Color Enhancement

ACE generates non-suspicious adversarial images by iteratively updating the parameters of the color filter defined in Eq. 1, in contrast to conventional attacks, which operate in the raw pixel space. There are two methods to constrain the strength of the color transformation.

The first method imposes adjustable bounds on the filter parameters, formulated as:

$$\min_{\boldsymbol{\theta}} L_{adv}(F_{\boldsymbol{\theta}}(\boldsymbol{x})), \quad \text{s.t.} \quad 1 \leq \left\| \frac{\boldsymbol{\theta}}{\boldsymbol{\theta}_0} \right\|_{\infty} \leq \epsilon, \tag{2}$$

where $\boldsymbol{\theta}_0$ denotes the initial parameters, equal to $\mathbf{1}_K/K$. The adversarial loss, $L_{adv}$, adopts the specific logit loss from the well-known C&W method [2]. Note that this parameter bound does not need to be as tight as in the $L_p$ methods, since the color filtering inherently guarantees the uniformity of the image transformation even when the perturbations are large. This bounded variant of ACE is referred to as ACE-PGD.
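As a rough sketch of how the bounded optimization in Eq. 2 can be run (again, not the reference code of [20]): we assume a differentiable BIQA scorer `biqa_model`, pass the adversarial loss in as a callable `adv_loss` (the C&W logit loss in the original ACE, or Eq. 4 below for this task), reuse `piecewise_linear_filter` from above, and read the constraint as an elementwise clamp of the ratio $\boldsymbol{\theta}/\boldsymbol{\theta}_0$ into $[1, \epsilon]$; the Adam optimizer and the step size are likewise our own choices.

```python
import torch

def ace_pgd(x, biqa_model, adv_loss, K=64, eps=16.0, iters=20, lr=0.01):
    """Illustrative ACE-PGD loop (Eq. 2). K, eps, and iters default to
    Run 1 in Table 1; the learning rate is a guess."""
    C = x.shape[0]
    theta = torch.full((C, K), 1.0 / K, requires_grad=True)  # theta_0 = 1_K / K
    optimizer = torch.optim.Adam([theta], lr=lr)

    for _ in range(iters):
        x_adv = piecewise_linear_filter(x, theta)
        loss = adv_loss(biqa_model(x_adv.unsqueeze(0)))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            # One simple projection satisfying 1 <= ||theta / theta_0||_inf <= eps.
            theta.clamp_(min=1.0 / K, max=eps / K)

    return piecewise_linear_filter(x, theta).detach()
```

Since $F_{\boldsymbol{\theta}}$ depends only on the normalized parameters $\theta_k/\theta_{\mathrm{sum}}$, this bound limits how far each piece's slope can drift from the identity filter, which is what keeps the transformation uniform even for large $\epsilon$.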
The second method guides the transformation towards specific appealing color styles, in addition to achieving the adversarial effects. To this end, additional guidance from common enhancement practices is incorporated into the adversarial optimization. Specifically, the target appealing color styles are obtained by applying Instagram filters, and the optimization can be formulated as:

$$\min_{\boldsymbol{\theta}} L_{adv}(F_{\boldsymbol{\theta}}(\boldsymbol{x})) + \lambda \cdot \left\| F_{\boldsymbol{\theta}}(\boldsymbol{x}) - \boldsymbol{x}_{ins} \right\|_2^2, \tag{3}$$

where $\boldsymbol{x}_{ins}$ denotes the target Instagram-filtered image with a specific color style. This variant of ACE is referred to as ACE-Ins. One popular Instagram filter style, Nashville, is considered in our submitted runs, and its application is automated using the GIMP toolkit with the Instagram Effects Plugins¹.

In the context of fooling BIQA, $L_{adv}$ is formulated as:

$$L_{adv} = \max\{\mathrm{BIQA}(F_{\boldsymbol{\theta}}(\boldsymbol{x})) - C,\; 0\}, \tag{4}$$

where the target score can be set by adjusting $C$. Specifically, we set $C$ a bit lower than the standard target, 50, to make sure that the adversarial effects remain after JPEG compression.

¹ https://www.marcocrippa.it/page/gimp_instagram.php
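Putting Eqs. 3 and 4 together, the task-specific loss can be sketched as below. The concrete target value `target_c = 45.0` is only an illustrative guess at "a bit lower than 50", and `x_ins` is assumed to be precomputed offline (e.g., the Nashville-filtered image from GIMP); $\lambda = 0.01$ is taken from Run 5 in Table 1.

```python
import torch

def biqa_hinge_loss(score: torch.Tensor, target_c: float = 45.0) -> torch.Tensor:
    """Eq. 4: the loss is active only while the predicted BIQA score
    still exceeds the target C (C = 45 is an illustrative guess)."""
    return torch.clamp(score - target_c, min=0.0)

def ace_ins_loss(x_adv: torch.Tensor, x_ins: torch.Tensor,
                 biqa_model, lam: float = 0.01, target_c: float = 45.0):
    """Eq. 3: adversarial hinge plus a squared-L2 pull toward the
    Instagram-filtered target image x_ins (same shape as x_adv)."""
    adv = biqa_hinge_loss(biqa_model(x_adv.unsqueeze(0)), target_c)
    style = lam * (x_adv - x_ins).pow(2).sum()
    return adv + style
```

Substituting `ace_ins_loss` for the bounded objective in the `ace_pgd` loop above (dropping the clamp and running for 100 iterations, as in Run 5) gives one plausible reading of ACE-Ins.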
3 RESULTS AND ANALYSIS

In total, we submitted five runs. We tried different parameters of ACE-PGD for the first four runs and used ACE-Ins for the last run; the detailed settings are listed in Table 1.

Table 1: Detailed settings of our five runs.

| Runs | Methods | Parameters                                   |
|------|---------|----------------------------------------------|
| 1    | ACE-PGD | $K = 64$, $\epsilon = 16$, and iters. = 20   |
| 2    | ACE-PGD | $K = 64$, $\epsilon = 32$, and iters. = 20   |
| 3    | ACE-PGD | $K = 256$, $\epsilon = 16$, and iters. = 20  |
| 4    | ACE-PGD | $K = 256$, $\epsilon = 64$, and iters. = 20  |
| 5    | ACE-Ins | $K = 64$, $\lambda = 0.01$, and iters. = 100 |

As can be seen from Table 2, all five runs effectively decrease the model accuracy to a level below 50%. Specifically, as expected, a higher $K$ and $\epsilon$ lead to stronger adversarial effects. In addition, we find that the results before and after the JPEG compression remain similar, suggesting that our approach is stable against compression.

Table 2: Evaluation results of our five runs. The accuracy (%) is calculated over all 550 test images, which are compressed with JPEG 90 before evaluation. The number of times a run is selected as "Top-3" most appealing among the 13 qualified runs in total is evaluated by a user study with 7 people on 20 representative images that have the largest BIQA score variance. The maximum possible number is 140.

| Runs            | 1     | 2     | 3     | 4     | 5     |
|-----------------|-------|-------|-------|-------|-------|
| Acc before JPEG | 48.00 | 33.27 | 50.00 | 21.82 | 35.09 |
| Acc after JPEG  | 45.27 | 33.45 | 47.45 | 22.55 | 44.91 |
| Number of Top-3 | 2     | 7     | 6     | 4     | 7     |

However, the human evaluation results on the 20 selected images are not satisfying. This implies that the BIQA model is more stable against the interference of smooth modifications, such as ACE, than the classification models. Specifically, we notice that ACE-Ins fails to drive the image into a target appealing style, since the optimization has to focus on lowering the score. This may be because the quality assessment model tends to rely on high-frequency features, while the ImageNet classifier learns both low-frequency (e.g., shape) and high-frequency (e.g., texture) features. This makes the quality assessment model more robust against the low-frequency perturbations produced by our ACE. We will explore this in more depth in future work.

Figure 2 visualizes successful adversarial examples with high and low appeal. We can observe that ACE can yield good image examples with filtering-like styles, but the bad examples suffer from over-colorization effects.

Figure 2: Adversarial images achieved by our approach, with the original and decreased scores. The top row shows examples with relatively high appeal, and the bottom row shows failed examples with low appeal.

ACKNOWLEDGMENTS

This work was carried out on the Dutch national e-infrastructure with the support of SURF Cooperative.

REFERENCES

[1] Anand Bhattad, Min Jin Chong, Kaizhao Liang, Bo Li, and David A. Forsyth. 2020. Unrestricted Adversarial Examples via Semantic Manipulation. In ICLR.
[2] Nicholas Carlini and David Wagner. 2017. Towards Evaluating the Robustness of Neural Networks. In IEEE S&P.
[3] Pin-Yu Chen, Yash Sharma, Huan Zhang, Jinfeng Yi, and Cho-Jui Hsieh. 2018. EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples. In AAAI.
[4] Francesco Croce and Matthias Hein. 2019. Sparse and Imperceivable Adversarial Attacks. In ICCV.
[5] Yubin Deng, Chen Change Loy, and Xiaoou Tang. 2018. Aesthetic-Driven Image Enhancement by Adversarial Learning. In ACM MM.
[6] Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. 2019. Exploring the Landscape of Spatial Robustness. In ICML.
[7] Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. 2018. Robust Physical-World Attacks on Deep Learning Models. In CVPR.
[8] Michaël Gharbi, Jiawen Chen, Jonathan T. Barron, Samuel W. Hasinoff, and Frédo Durand. 2017. Deep Bilateral Learning for Real-Time Image Enhancement. ACM TOG 36, 4 (2017), 1–12.
[9] Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and Harnessing Adversarial Examples. In ICLR.
[10] Vlad Hosu, Hanhe Lin, Tamas Sziranyi, and Dietmar Saupe. 2020. KonIQ-10k: An Ecologically Valid Database for Deep Learning of Blind Image Quality Assessment. IEEE TIP 29 (2020), 4041–4056.
[11] Yuanming Hu, Hao He, Chenxi Xu, Baoyuan Wang, and Stephen Lin. 2018. Exposure: A White-Box Photo Post-Processing Framework. ACM TOG 37, 2 (2018), 26.
[12] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-Image Translation with Conditional Adversarial Networks. In CVPR.
[13] Ameya Joshi, Amitangshu Mukherjee, Soumik Sarkar, and Chinmay Hegde. 2019. Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers. In ICCV.
[14] Cassidy Laidlaw and Soheil Feizi. 2019. Functional Adversarial Attacks. In NeurIPS.
[15] Zhuoran Liu, Zhengyu Zhao, Martha Larson, and Laurent Amsaleg. 2020. Exploring Quality Camouflage for Social Images. In Working Notes Proceedings of the MediaEval Workshop.
[16] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2018. Towards Deep Learning Models Resistant to Adversarial Attacks. In ICLR.
[17] Ali Shahin Shamsabadi, Ricardo Sanchez-Matilla, and Andrea Cavallaro. 2020. ColorFool: Semantic Adversarial Colorization. In CVPR.
[18] Eric Wong, Frank Schmidt, and Zico Kolter. 2019. Wasserstein Adversarial Examples via Projected Sinkhorn Iterations. In ICML.
[19] Chaowei Xiao, Jun-Yan Zhu, Bo Li, Warren He, Mingyan Liu, and Dawn Song. 2018. Spatially Transformed Adversarial Examples. In ICLR.
[20] Zhengyu Zhao, Zhuoran Liu, and Martha Larson. 2020. Adversarial Robustness Against Image Color Transformation within Parametric Filter Space. arXiv preprint arXiv:2011.06690.
[21] Zhengyu Zhao, Zhuoran Liu, and Martha Larson. 2020. Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance. In CVPR.
[22] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. 2017. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In ICCV.