=Paper=
{{Paper
|id=Vol-2881/paper7
|storemode=property
|title=Object Hider: Adversarial Patch Attack Against Object Detectors
|pdfUrl=https://ceur-ws.org/Vol-2881/paper7.pdf
|volume=Vol-2881
|authors=Yusheng Zhao,Huanqian Yan,Xingxing Wei
}}
==Object Hider: Adversarial Patch Attack Against Object Detectors==
Yusheng Zhao*, Huanqian Yan*, Xingxing Wei†
Beihang University, Beijing, China
zhaoyusheng@buaa.edu.cn, yanhq@buaa.edu.cn, xxwei@buaa.edu.cn

* Both authors contributed equally to this research.
† Corresponding author.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). In: Dimitar Dimitrov, Xiaofei Zhu (eds.): Proceedings of the CIKM AnalytiCup 2020, 22 October 2020, Galway (Virtual Event), Ireland, published at http://ceur-ws.org.

ABSTRACT
Deep neural networks have been widely used in many computer vision tasks. However, it has been shown that they are susceptible to small, imperceptible perturbations added to the input. Inputs with elaborately designed perturbations that can fool deep learning models are called adversarial examples, and they have raised great concerns about the safety of deep neural networks. Object detection algorithms are designed to locate and classify objects in images or videos; they are the core of many computer vision tasks and have great research value and wide applications. In this paper, we focus on adversarial attacks against several state-of-the-art object detection models. As a practical alternative to full-image perturbations, we use adversarial patches for the attack. Two adversarial patch generation algorithms are proposed: a heatmap-based algorithm and a consensus-based algorithm. The experimental results show that the proposed methods are highly effective, transferable and generic. Additionally, we applied the proposed methods in the Adversarial Challenge on Object Detection competition and finished in the top 7 out of 1701 teams. Code is available at https://github.com/FenHua/DetDak

CCS CONCEPTS
• Security and privacy → Software and application security; • Computing methodologies → Object detection.

KEYWORDS
object detection, adversarial patches, patch generation algorithm

1 INTRODUCTION
While being widely used in many fields, deep neural networks have been shown to be vulnerable to adversarial examples [4]. Many early studies of adversarial examples focused on the classification task and added perturbations to the entire image. However, in real-world applications such as autonomous vehicles and surveillance, such perturbations are hard to implement. Because of this, recent studies focus mainly on adversarial patches, which restrict the perturbation to a small region such as a rectangular area. This makes adversarial examples more practical and easier to implement.
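Throughout the paper, an adversarial patch is simply a perturbation confined to a small masked region of the input image. The following minimal sketch shows this composition; the tensor names, shapes and value range are illustrative assumptions, not part of the paper.

```python
# Minimal sketch: an adversarial patch is a perturbation applied only inside a
# binary mask. Shapes and names are illustrative; images are floats in [0, 255].
import torch

def apply_patch(image: torch.Tensor, patch: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """image, patch: (C, H, W); mask: (H, W) with 1 inside the patch region."""
    return (image * (1.0 - mask) + patch * mask).clamp(0.0, 255.0)

if __name__ == "__main__":
    img = torch.rand(3, 500, 500) * 255.0      # stand-in for an input image
    patch = torch.rand(3, 500, 500) * 255.0    # candidate patch content
    mask = torch.zeros(500, 500)
    mask[100:150, 200:250] = 1.0               # one 50x50 square patch
    adv = apply_patch(img, patch, mask)
```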
Object detection is an important part of computer vision and enables many tasks such as autonomous driving, visual question answering and surveillance. However, there are relatively few studies on adversarial attacks against object detection models, especially attacks whose purpose is to make the objects disappear. Since object detection models are used in many safety-critical applications, research on the fragility of these models is of great importance. Therefore, in this work we investigate the vulnerability of object detection algorithms and attack four state-of-the-art object detection models provided by Alibaba Group on the Tianchi platform, including two white-box models, YOLOv4 [1] and Faster RCNN [2], and two black-box models used to test the transferability of the proposed algorithms. The purpose of the designed methods is to blind the detection models with size-restricted patches. The framework of the adversarial attack is shown in Figure 1.

We discover that the locations of adversarial patches are crucial to the attack, so we focus on locating the patches and propose two patch selection algorithms: the heatmap-based algorithm and the consensus-based algorithm. The heatmap-based algorithm is an improved version of Grad-CAM [3], which introduced the idea of a heatmap to visualize the gradients of intermediate convolutional layers in image classifiers. We modify and improve the algorithm to make it suitable for visualizing the gradients of object detection models and use the resulting heatmap to select patches. To the best of our knowledge, it is the first Grad-CAM-like algorithm designed specifically for the object detection task. The consensus-based algorithm is another novel patch selection method. It chooses patch locations by attacking several target models and combining the results with a voting strategy, which makes the patch locations more precise and the adversarial examples more transferable.

We test our attack algorithm with the proposed patch selection algorithms on the dataset provided by Alibaba Group. The results show that the proposed algorithms are highly competitive. In brief, the main contributions can be summarized as follows:

• We improve the Grad-CAM algorithm to make it more suitable for analysing the gradients of object detection models and use it for the heatmap-based attack.
• We propose a consensus-based attack algorithm that is very powerful for attacking object detection models.
• The experimental results show that the proposed attack methods are competitive and generic.

The rest of this paper is organized as follows. The proposed algorithms are described in Section 2, the experimental results and analysis are presented in Section 3, and we summarize the work in Section 4.

2 METHODS
Two methods have been designed for generating patches: the heatmap-based algorithm and the consensus-based algorithm. In this section, the two proposed methods are introduced in detail, and the adversarial attack algorithm that uses the selected patches is presented at the end of the section.

Figure 1: The framework of our attack method. It is an adversarial patch attack algorithm for object detection; the strategies for generating patches are described in Section 2.

2.1 Heatmap-based Algorithm
Grad-CAM [3] is a popular tool for visualizing the derivative of the output with respect to an intermediate convolutional layer. It introduced the idea of a heatmap: important regions of the input are hotter in the heatmap. The heatmap is a function of the gradients of the output with respect to an intermediate layer. However, the original Grad-CAM algorithm is designed for classification models and thus cannot be used directly in our task. On the one hand, object detection tasks usually have multiple objects in the input image, while the classification task has only one. On the other hand, the sizes of different objects can have a significant influence on the heatmap, so we cannot directly add the gradients together when computing the heatmap.

Therefore, an improved Grad-CAM algorithm is proposed for selecting patches. Firstly, we adopt the element-wise multiplication of the gradients and activations, which preserves the spatial information of the gradients and the intermediate layer. Secondly, we normalize the heatmap of each bounding box and combine them to get the heatmap of the entire image. Thirdly, we use several intermediate layers of the backbone for computing the heatmap, which combines both lower-level and higher-level features. Finally, we obtain the patch mask according to the values of the heatmap.

Mathematically, we calculate the heatmap H using the following formula:

H = \sum_{a \in \mathcal{A}} H_a,   (1)

where \mathcal{A} is the set of selected activation layers (such as conv56 and conv92 in YOLOv4). H_a is the heatmap of a single activation layer a, which is defined as:

H_a = \sum_{b \in B} \frac{h_b - \mathbb{E}[h_b]}{\sqrt{\mathrm{Var}[h_b]}} \cdot \sqrt{Area_b},   (2)

where h_b is the heatmap of a single bounding box b, and the mean \mathbb{E}[h_b] and variance \mathrm{Var}[h_b] are used for normalization. Area_b is the area of the bounding box b, and the factor \sqrt{Area_b} is used in the normalization to prevent small bounding boxes from becoming too dominant. We compute h_b as

h_b = \sum_k \max\!\left(0, \frac{\partial y_b}{\partial A^k_{ij}} \odot A^k_{ij}\right),   (3)

where A^k_{ij} denotes the activation of a convolutional layer at channel k and location (i, j), ⊙ represents the element-wise product, and y_b is the highest confidence score of the bounding box b.

Finally, we apply Gaussian filters to post-process the heatmap to make it smoother. Combining the heatmaps of several object detection models, we choose the patches in the hot regions of the input image.
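A compact PyTorch-style sketch of how Equations (1)-(3) could be computed is given below. It assumes the detector's intermediate activations are kept in the autograd graph (for example via forward hooks) and that the top confidence score y_b and area of each detected box are available; all function and variable names here are ours, not the authors'.

```python
# A sketch of the improved Grad-CAM heatmap of Eqs. (1)-(3), under the
# assumptions stated above. Activations are assumed to have shape (1, C, H, W).
import torch
import torch.nn.functional as F

def box_heatmap(y_b, activation):
    """Eq. (3): h_b = sum_k max(0, dy_b/dA^k * A^k), summed over channels k."""
    grad, = torch.autograd.grad(y_b, activation, retain_graph=True)
    return (grad * activation).clamp(min=0).sum(dim=1)[0]          # (H, W)

def layer_heatmap(box_scores, box_areas, activation):
    """Eq. (2): per-box normalization, weighted by sqrt(box area)."""
    h_a = torch.zeros(activation.shape[-2:], device=activation.device)
    for y_b, area in zip(box_scores, box_areas):
        h_b = box_heatmap(y_b, activation)
        h_b = (h_b - h_b.mean()) / (h_b.var().sqrt() + 1e-8)
        h_a = h_a + h_b * (float(area) ** 0.5)
    return h_a

def image_heatmap(box_scores, box_areas, activations, image_size, blur=5):
    """Eq. (1): sum over layers, upsampled to image size, then smoothed."""
    total = torch.zeros(image_size)
    for act in activations:                                         # e.g. conv56, conv92
        h_a = layer_heatmap(box_scores, box_areas, act)
        h_a = F.interpolate(h_a[None, None], size=image_size,
                            mode="bilinear", align_corners=False)[0, 0]
        total = total + h_a.cpu()
    # box-filter smoothing as a simple stand-in for the paper's Gaussian filter
    return F.avg_pool2d(total[None, None], blur, 1, blur // 2)[0, 0]
```

In this sketch, patch selection then amounts to thresholding the hottest cells of the (model-averaged) heatmap to obtain the binary patch mask used later in Section 2.3.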
2.2 Consensus-based Algorithm
Although the heatmap-based algorithm exploits the gradient information of the models, it is separated from the attacking process. Besides, we find that the sensitive locations of the input image can change as the attack is performed iteratively. Therefore, we propose another method for patch selection: the consensus-based algorithm.

First of all, we perform the Fast Gradient Sign Method iteratively with L2-norm regularization on each target model. Our loss function J_{L2} is originally defined as:

J_{model} = \sum_{i \in \{j \mid s_j > t\}} s_i,   (4)

J_{L2} = J_{model} + \omega \cdot \lVert P \rVert_2^2,   (5)

where s_i is the confidence score of each bounding box of the corresponding model and t is the confidence threshold (we use 0.3 in our task). Usually, when s_i > t, the bounding box is considered correct and will appear in the results, so the lower the confidence scores s_i, the fewer objects can be detected. P represents the perturbation and ω is a hyper-parameter.

In the experiments, we find that the noise perturbations of some models, such as Faster RCNN, are not concentrated, which makes it hard to fuse multiple results. To solve this problem, we modify the loss function of those models:

J_{rcnn} = \gamma \sum s_{key} + (1 - \gamma) \sum s_{other},   (6)

where s_{key} is the confidence score of the bounding boxes that appear in the clean image and s_{other} is the confidence score of boxes that do not contain any true objects during the attack. γ is a hyper-parameter, set to 0.9 in our experiments. This modification forces the perturbation to concentrate on the main objects of the image.

After the L2 attack, we obtain the noise for the input image from each model. However, we do not mix the noise maps directly, because they differ in magnitude and are not easy to balance. Instead, we sparsify each noise map into n patches of a specified scale S. Next, we take a vote to decide which patch masks should be preserved and which should be discarded: the greater the perturbation in a patch, the more likely it is to be selected. This voting strategy over the noise patches is very helpful for improving performance. The flow of the algorithm is described in Figure 2 and summarized in Algorithm 1. Here, we introduce EfficientDet [5] to join the vote; the more detection models take part, the more accurate the voting results and the higher the adaptability. Furthermore, the voting strategy also improves the transferability of our adversarial patches and the robustness of our algorithm. Additionally, the result is not very sensitive to the number of L2 attack iterations: even with only 5 iterations, the voting result is still quite decent.

Figure 2: Consensus-based algorithm, which uses a voting method for generating patch masks.

Algorithm 1: Consensus-based Attack Algorithm
Input: a clean image I, the patch number n, the scale of square patches S, the attack iteration budget it
Output: an adversarial image I'
1: Let the set of object detection models be M
2: for model m ∈ M do
3:   L2 attack on m and get perturbation P^(m)
4:   Get the top n noise patches in P^(m) as P_n^(m) with scale S
5:   Normalize P_n^(m)
6: end for
7: P_n ← Σ_{m∈M} P_n^(m)
8: P ← select the top n patch masks in P_n according to the magnitude of the perturbation
9: repeat
10:   Perform an FGSM attack on I with the patch masks in P
11:   Update the polluted image I'
12:   it ← it − 1
13: until L = 0 or it = 0   // L is the loss function
14: return I'
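The per-model L2-regularized attack and the vote over its strongest noise regions could be organized as in the sketch below. Each model is assumed to be a callable that maps a (C, H, W) image in [0, 255] to a 1-D tensor of bounding-box confidence scores; the patch count, scale and iteration defaults are illustrative, and the helper names are ours.

```python
# A sketch of consensus-based patch selection (Eqs. (4)-(5) and steps 1-8 of
# Algorithm 1), under the interface assumptions stated above.
import torch

def l2_attack_noise(model, image, steps=5, lr=2.0, t=0.3, omega=1e-4):
    """Iterative sign-gradient attack with L2 regularization (Eqs. (4)-(5))."""
    noise = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        scores = model((image + noise).clamp(0, 255))
        loss = scores[scores > t].sum() + omega * noise.pow(2).sum()
        grad, = torch.autograd.grad(loss, noise)
        with torch.no_grad():
            noise -= lr * grad.sign()             # descend: suppress confident boxes
    return noise.detach()

def top_n_cells(magnitude, n, scale):
    """Return a binary mask keeping the n strongest scale x scale cells."""
    h, w = magnitude.shape
    cells = [(magnitude[y:y + scale, x:x + scale].sum().item(), y, x)
             for y in range(0, h - scale + 1, scale)
             for x in range(0, w - scale + 1, scale)]
    mask = torch.zeros(h, w, device=magnitude.device)
    for _, y, x in sorted(cells, reverse=True)[:n]:
        mask[y:y + scale, x:x + scale] = 1.0
    return mask

def consensus_patch_mask(models, image, n=10, scale=50):
    """Steps 2-8 of Algorithm 1: per-model top-n patches, normalize, sum, vote."""
    votes = torch.zeros(image.shape[-2:], device=image.device)
    for model in models:                          # e.g. YOLOv4, Faster RCNN, EfficientDet
        mag = l2_attack_noise(model, image).abs().sum(dim=0)
        keep = top_n_cells(mag, n, scale)         # step 4: this model's strongest cells
        votes += keep * mag / (mag.max() + 1e-8)  # steps 5 and 7: normalize, accumulate
    return top_n_cells(votes, n, scale)           # step 8: the consensus patch masks
```

The normalization before accumulation is what makes the vote meaningful: without it, the model producing the largest-magnitude noise would dominate the mask selection.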
2.3 Adversarial Attack with Patches
After patch mask generation, the Fast Gradient Sign Method (FGSM) is used to carry out the attack:

\delta := \mathrm{clip}_{[0,255]}\big(\delta + \alpha \cdot \mathrm{sign}(\nabla_\delta L)\big),   (7)

where δ is the parameter of the adversarial patches and α is the learning rate, whose value we set following [4]. L is the loss function defined as:

L = \sum_{m \in \mathcal{M}} \; \sum_{i \in \{j \mid s^{(m)}_j > t\}} s^{(m)}_i,   (8)

where s^{(m)}_i is the confidence score of the i-th bounding box of model m and \mathcal{M} is the set of detection models. The loss function L is simple but effective. Figure 1 offers a comprehensible description of the attack algorithm, and the details of the attack with the consensus-based patch selection algorithm are given in Algorithm 1.
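Equations (7)-(8) can be sketched as the loop below. It reuses the illustrative "models return box confidence scores" interface from the previous sketch; we write the update as a descent step that lowers the summed confidences, so the sign convention here is ours rather than a literal transcription of Eq. (7).

```python
# A sketch of the final patch attack (Eqs. (7)-(8)): the perturbation delta is
# optimized only inside the selected patch mask until no box scores above t.
import torch

def detection_loss(models, adv, t=0.3):
    """Eq. (8): sum of above-threshold box confidences over all target models."""
    loss = adv.new_zeros(())
    for model in models:
        scores = model(adv)
        loss = loss + scores[scores > t].sum()
    return loss

def patch_attack(models, image, mask, alpha=2.0, iters=100, t=0.3):
    """Eq. (7) restricted to `mask`; image is (C, H, W) in [0, 255], mask is (H, W)."""
    delta = torch.zeros_like(image, requires_grad=True)        # patch parameters
    for _ in range(iters):
        adv = (image + delta * mask).clamp(0, 255)
        loss = detection_loss(models, adv, t)
        if loss.item() == 0:                                    # L = 0: nothing detected
            break
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= alpha * grad.sign()
            delta.clamp_(-255, 255)                             # keep pixel values sane
    return (image + delta.detach() * mask).clamp(0, 255)
```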
"Basic" represents results of two patch generation methods. "+Grid" represents the result of the proposed methods with grid-like patches ensemble. "+G&S" means the results of our methods with grid-like and different scale patches ensemble. Figure 4: Some adversarial images using the consensus- based attack algorithm with grid-like patches. Table 1: The results of the proposed two methods. heatmap-based Consensus-based object detection models from detecting the objects. Those adversar- Basic 1390 1507 ial examples are a great threat to deep neural networks deployed in +G 1936 2400 real world applications. Through the study of adversarial examples, +G & S 2713 3071 the mechanism of deep learning models can be further understood, and robust algorithms can also be proposed. In the future, we will explore how to improve the robustness of current detection models to deal with adversarial examples. Since we can successfully attack most object detection models with grid-like sparse patches, we can also expand the region of the original patches and sparsify them to cover a larger area while REFERENCES [1] Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. 2020. YOLOv4: altering a moderate number of pixels. Besides, we observe that Optimal Speed and Accuracy of Object Detection. arXiv e-prints, Article the patches of a fixed size might only be suitable for some images, arXiv:2004.10934 (April 2020), arXiv:2004.10934 pages. arXiv:2004.10934 [cs.CV] [2] S. Ren, K. He, R. Girshick, and J. Sun. 2017. Faster R-CNN: Towards Real-Time so we performed an ensemble over patches of different sizes S. Object Detection with Region Proposal Networks. TPAMI 39, 6 (2017), 1137–1149. Combining grid operations and the ensemble over patches of three [3] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedan- different sizes S (S ∈ {20, 50, 70} in our experiments), our best tam, Devi Parikh, and Dhruv Batra. 2017. Grad-CAM: Visual Explanations From Deep Networks via Gradient-Based Localization. In ICCV. result is over 3000 scores. [4] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Note that the consensus-based algorithm is generally better than Ian Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. the heatmap-based algorithm in the experiments. A possible expla- In International Conference on Learning Representations. http://arxiv.org/abs/1312. 6199 nation for this is that the consensus-based algorithm might better [5] Mingxing Tan, Ruoming Pang, and Quoc V. Le. 2020. EfficientDet: Scalable and combine the target models with the voting process and incorporate Efficient Object Detection. In CVPR. the attacking process with the selection process. Some adversarial images are shown in Figure 4. As shown, the patches are gridded and have different scales. Since there is no limit to the perturbations, the noise is obvious. In general, the greater the noise is, the better the attack transferibility is. 4 CONCLUSION In this paper, two adversarial patch generation algorithms have been proposed: heatmap-based and consensus-based patch gener- ation algorithms. The generated patches are efficient and precise. Additionally, they only rely on few pixels but are generic. Further- more, the proposed attacking methods can misguide state-of-the-art 27