Object Hider: Adversarial Patch Attack Against Object Detectors

                  Yusheng Zhao∗                                              Huanqian Yan∗                                     Xingxing Wei†
               Beihang University                                          Beihang University                                 Beihang University
                 Beijing, China                                              Beijing, China                                     Beijing, China
            zhaoyusheng@buaa.edu.cn                                        yanhq@buaa.edu.cn                                  xxwei@buaa.edu.cn

ABSTRACT
Deep neural networks have been widely used in many computer vision tasks. However, it has been proved that they are susceptible to small, imperceptible perturbations added to the input. Inputs with elaborately designed perturbations that can fool deep learning models are called adversarial examples, and they have raised great concerns about the safety of deep neural networks. Object detection algorithms are designed to locate and classify objects in images or videos, and they are the core of many computer vision tasks, which have great research value and wide applications. In this paper, we focus on adversarial attacks on some state-of-the-art object detection models. As a practical alternative to full-image perturbations, we use adversarial patches for the attack. Two adversarial patch generation algorithms have been proposed: the heatmap-based algorithm and the consensus-based algorithm. The experimental results show that the proposed methods are highly effective, transferable and generic. Additionally, we applied the proposed methods to the Adversarial Challenge on Object Detection competition and ranked in the top 7 among 1,701 teams. Code is available at https://github.com/FenHua/DetDak

CCS CONCEPTS
• Security and privacy → Software and application security; • Computing methodologies → Object detection.

KEYWORDS
object detection, adversarial patches, patch generation algorithm
1 INTRODUCTION
While being widely used in many fields, deep neural networks have been shown to be vulnerable to adversarial examples [4]. Many early studies of adversarial examples focused on the classification task and added perturbations to the entire image. However, in real-world applications such as autonomous vehicles and surveillance, such perturbations are hard to implement. Because of this, recent studies focus mainly on adversarial patches, which restrict the perturbation to a small region such as a rectangular area. This makes adversarial examples more practical and easier to implement.
   Object detection is an important part of computer vision and enables many tasks such as autonomous driving, visual question answering and surveillance. However, there are relatively few studies on adversarial attacks against object detection models, especially attacks whose purpose is to make the objects disappear. Since object detection models have been used in many safety-critical applications, research on the fragility of these models is of great importance.
   Therefore, we investigate the vulnerability of object detection algorithms in this work and attack four state-of-the-art object detection models provided by Alibaba Group on the Tianchi platform, including two white-box models, YOLOv4 [1] and Faster RCNN [2], and two black-box models used to test the transferability of the proposed algorithm. The purpose of the designed methods is to blind the detection models with the restricted patches. The framework of the adversarial attack is shown in Figure 1.
   We discover that the locations of adversarial patches are crucial to the attack, so we focus on locating the patches and propose two patch selection algorithms: the heatmap-based algorithm and the consensus-based algorithm. The heatmap-based algorithm is an improved version of Grad-CAM [3], which introduced the idea of a heatmap to visualize the gradients of intermediate convolutional layers in image classifiers. We modify and improve the algorithm to make it suitable for visualizing the gradients in object detection models and use the heatmap to select patches. To the best of our knowledge, it is the first Grad-CAM-like algorithm designed specifically for the object detection task. The consensus-based algorithm is another novel patch selection method. It chooses patch locations by attacking several target models and combining the results with a voting strategy, which makes the patch locations more precise and the adversarial examples more transferable.
   We test our attacking algorithm with the proposed patch selection algorithms on the dataset provided by Alibaba Group. The results show that the proposed algorithms are highly competitive. In brief, the main contributions can be summarized as follows:
   • We improve the Grad-CAM algorithm to make it more suitable for analysing the gradients of object detection models and use it for the heatmap-based attack.
   • We propose a consensus-based attack algorithm that is very powerful for attacking object detection models.
   • The experimental results show that the proposed attacking methods are competitive and generic.
   The rest of this paper is organized as follows. The proposed algorithms are described in Section 2. The experimental results and analysis are presented in Section 3. Finally, we summarize the work in Section 4.
∗ Both authors contributed equally to this research.
† Corresponding author.


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
In: Dimitar Dimitrov, Xiaofei Zhu (eds.): Proceedings of the CIKM AnalytiCup 2020, 22 October 2020, Galway (Virtual Event), Ireland, published at http://ceur-ws.org.
Figure 1: The framework of our attack method. It is an adversarial patch attack algorithm for object detection. The strategies
of generating patches are described in Section 2.


2 METHODS
Two methods have been designed for generating patches: the heatmap-based algorithm and the consensus-based algorithm. In this section, the two proposed methods are introduced in detail. The adversarial attack algorithm that uses the patches is also presented concretely at the end of this section.

2.1 Heatmap-based Algorithm
Grad-CAM [3] is a popular tool for visualizing the derivative of the output with respect to an intermediate convolutional layer. It introduced the idea of a heatmap: important regions of the input are hotter in the heatmap. The heatmap is a function of the gradients of the output with respect to an intermediate layer. However, the original Grad-CAM algorithm is designed for classification models and thus cannot be used directly in our task. On the one hand, object detection tasks usually involve multiple objects in the input image, while the classification task only has one. On the other hand, the size of different objects can have a significant influence on the heatmap, so we cannot directly add the gradients together when computing the heatmap.
   Therefore, an improved Grad-CAM algorithm is proposed for selecting patches. Firstly, we adopt the element-wise multiplication of the gradients and the activations, which preserves the spatial information of the gradients and the intermediate layer. Secondly, we normalize the heatmap data of all bounding boxes and combine them to get the heatmap of the entire image. Thirdly, we use several intermediate layers of the backbone for computing the heatmap, which combines both lower-level and higher-level features. Finally, we get the patch mask according to the values of the heatmap.
   Mathematically, we calculate the heatmap $H$ using the following formula:

    H = \sum_{a \in \mathcal{A}} H_a,                                                      (1)

where $\mathcal{A}$ is the set of several activation layers (like conv56 and conv92 in YOLOv4). $H_a$ is the heatmap of a single activation layer $a$, which is defined as:

    H_a = \sum_{b \in B} \frac{h_b - \mathrm{E}[h_b]}{\sqrt{\mathrm{Var}[h_b]}} \cdot \sqrt{Area_b},        (2)

where $h_b$ represents the heatmap of a single bounding box $b$, and the mean $\mathrm{E}[h_b]$ and the variance $\mathrm{Var}[h_b]$ are used for normalization. Besides, $Area_b$ is the area of the bounding box $b$, and $\sqrt{Area_b}$ is used in the normalization to prevent small bounding boxes from being too dominant. We compute $h_b$ as

    h_b = \max\Big(0, \sum_k \frac{\partial y_b}{\partial A^k_{ij}} \odot A^k_{ij}\Big),                   (3)

where $A^k_{ij}$ denotes the activation of a convolutional layer at channel $k$ and location $(i, j)$, $\odot$ represents the element-wise product, and $y_b$ is the highest confidence score of the bounding box $b$.
   Finally, we apply Gaussian filters to post-process the heatmap and make it smoother. Combining the heatmaps of several object detection models, we can choose the patches in the hot regions of the input image.
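The heatmap computation in Eqs. (1)-(3) maps directly onto automatic differentiation. The following PyTorch sketch illustrates one way to implement it, assuming the detector exposes the per-box confidence scores $y_b$ and the intermediate activations; the helper names and the box representation are illustrative placeholders, not part of the original implementation.

```python
import torch
import torch.nn.functional as F

def box_heatmap(y_b, activation):
    """Eq. (3): h_b = max(0, sum_k dy_b/dA^k * A^k) for a single bounding box."""
    grad = torch.autograd.grad(y_b, activation, retain_graph=True)[0]
    return (grad * activation).sum(dim=0).clamp(min=0)            # (H, W)

def layer_heatmap(boxes, activation):
    """Eq. (2): normalize each per-box heatmap and weight it by sqrt(Area_b)."""
    h = torch.zeros(activation.shape[-2:])
    for y_b, area in boxes:                                       # (confidence, box area)
        h_b = box_heatmap(y_b, activation)
        h_b = (h_b - h_b.mean()) / (h_b.std() + 1e-8)
        h = h + h_b * area ** 0.5
    return h

def image_heatmap(boxes, activations):
    """Eq. (1): sum the heatmaps of several intermediate layers (resized to match)."""
    maps = [layer_heatmap(boxes, a) for a in activations]
    size = maps[0].shape
    maps = [F.interpolate(m[None, None], size=size, mode="bilinear",
                          align_corners=False)[0, 0] for m in maps]
    return torch.stack(maps).sum(dim=0)
```

In practice the activations would typically be captured with forward hooks on the chosen backbone layers, and the combined heatmap would then be smoothed with a Gaussian filter before being thresholded into a patch mask.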
2.2 Consensus-based Algorithm
Although the heatmap-based algorithm exploits the gradient information of the models, it is separated from the attacking process. Besides, we find that the sensitive locations of the input image might change as the attack iterations proceed. Therefore, we propose another method for patch selection: the consensus-based algorithm.
   First of all, we perform the Fast Gradient Sign Method iteratively with $L_2$-norm regularization on each target model respectively. Our loss function $J_{L2}$ is originally defined as:

    J_{model} = \sum_{i \in \{j \mid s_j > t\}} s_i,                                       (4)

    J_{L2} = J_{model} + \omega \cdot \|P\|_2^2,                                           (5)

where $s_i$ is the confidence score of each bounding box of the corresponding model and $t$ is the confidence threshold (we use 0.3 in our task). Usually, when $s_i > t$, the bounding box is considered correct and will appear in the results. So the lower the confidence scores $s_i$, the fewer objects can be detected. $P$ represents the perturbation and $\omega$ is a hyperparameter.
   In the experiments, we find that the noise perturbations of some models like Faster RCNN are not concentrated, which makes it hard to fuse multiple results. To solve this problem, we modified the loss function of those models:

    J_{rcnn} = \gamma \sum s_{key} + (1 - \gamma) \sum s_{other},                          (6)

where $s_{key}$ is the confidence score of the bounding boxes that appear in the clean image and $s_{other}$ is the confidence score of the other boxes, which do not contain any true objects during the attack. $\gamma$ is a hyperparameter, set to 0.9 in our experiments. Such a modification forces the perturbation to concentrate on the main objects of the image.
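A minimal sketch of the losses in Eqs. (4)-(6) follows, assuming the detector returns a tensor of per-box confidence scores together with a boolean mask marking the boxes that match objects of the clean image; the value of omega below is an arbitrary placeholder, since it is not reported here.

```python
import torch

def j_model(scores, t=0.3):
    """Eq. (4): sum of the confidences of all boxes above the threshold t."""
    return scores[scores > t].sum()

def j_l2(scores, perturbation, omega=1e-3, t=0.3):
    """Eq. (5): confidence loss plus L2 regularization of the perturbation P."""
    return j_model(scores, t) + omega * perturbation.pow(2).sum()

def j_rcnn(scores, is_key, gamma=0.9):
    """Eq. (6): weight the boxes of the clean image (s_key) more heavily,
    so the perturbation concentrates on the main objects."""
    return gamma * scores[is_key].sum() + (1.0 - gamma) * scores[~is_key].sum()
```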
   After the $L_2$ attack, we obtain the noise of the input image for each model. However, we do not mix those noises directly, because they differ in magnitude and are not easy to balance. So we sparsify the noise into $n$ patches with a specified scale $S$. Next, we take a vote to decide which patch masks should be preserved and which should be discarded. Usually, the greater the perturbation, the more likely it is to be selected as a patch. The voting strategy over those noise patches is very helpful for improving performance. The flow of the algorithm is described in Figure 2. Here, we introduce EfficientDet [5] to join the vote. The more detection models participate, the more accurate the voting results and the higher the adaptability. Furthermore, the voting strategy also improves the transferability of our adversarial patches and the robustness of our algorithm. Additionally, the result is not very sensitive to the number of $L_2$ attack iterations: even with only 5 iterations, the voting result is still quite decent. The complete procedure is given in Algorithm 1, and a sketch of the voting step follows the pseudocode.

Figure 2: The consensus-based algorithm, which uses a voting method for generating patch masks.

Algorithm 1 Consensus-based Attack Algorithm
Input: a clean image $I$, the patch number $n$, the scale of square patches $S$, the iteration number for attacking $it$
Output: an adversarial image $I'$
 1: Let the set of object detection models be $\mathcal{M}$
 2: for model $m \in \mathcal{M}$ do
 3:     $L_2$ attack on $m$ and get perturbation $P^{(m)}$
 4:     Get the top $n$ noise patches in $P^{(m)}$ as $P_n^{(m)}$ with scale $S$
 5:     Normalize $P_n^{(m)}$
 6: end for
 7: $P_n \leftarrow \sum_{m \in \mathcal{M}} P_n^{(m)}$
 8: $\mathcal{P} \leftarrow$ select the top $n$ patch masks in $P_n$ according to the magnitude of the perturbation
 9: repeat
10:     Perform the FGSM attack on $I$ with the patch masks in $\mathcal{P}$
11:     Update the polluted image $I'$
12:     $it \leftarrow it - 1$
13: until $L = 0$ or $it = 0$    // $L$ is the loss function
14: return $I'$
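Lines 3-8 of Algorithm 1 can be realized in several ways; the sketch below scores non-overlapping S x S windows by their noise energy, normalizes the per-model scores, and keeps the $n$ strongest windows. The sliding-window scoring is an assumption made for illustration rather than the exact procedure used here.

```python
import torch
import torch.nn.functional as F

def patch_energy(perturbation, scale):
    """Noise energy of every non-overlapping scale x scale window."""
    energy = perturbation.pow(2).sum(dim=0, keepdim=True)             # (1, H, W)
    return F.avg_pool2d(energy[None], scale, stride=scale)[0, 0] * scale ** 2

def consensus_mask(perturbations, n, scale):
    """Vote over the per-model perturbations and keep the n strongest patches."""
    votes = None
    for p in perturbations:                      # one (3, H, W) tensor per model
        e = patch_energy(p, scale)
        e = e / (e.max() + 1e-8)                 # Algorithm 1, line 5: normalize
        votes = e if votes is None else votes + e
    flat = votes.flatten()
    keep = torch.topk(flat, n).indices           # Algorithm 1, line 8
    mask = torch.zeros_like(flat)
    mask[keep] = 1.0
    mask = mask.view(votes.shape)
    # expand the window indicators back to pixel resolution
    return mask.repeat_interleave(scale, 0).repeat_interleave(scale, 1)
```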
                                                                                better results, an indicator is defined,
2.3 Adversarial attack with patches
After the patch masks are generated, the Fast Gradient Sign Method (FGSM) is used to finish the attack:

    \delta := \mathrm{clip}_{[0,255]}(\delta + \alpha \cdot \mathrm{sign}(\nabla_{\delta} L)),             (7)

where $\delta$ denotes the parameters of the adversarial patches and $\alpha$ is the learning rate, whose value is set following [4]. $L$ is the loss function defined as:

    L = \sum_{m \in \mathcal{M}} \sum_{i \in \{j \mid s_j^{(m)} > t\}} s_i^{(m)},                           (8)

where $s_i^{(m)}$ is the confidence score of the $i$-th bounding box of model $m$ and $\mathcal{M}$ is the set of detection models. The loss function $L$ is simple but efficient. Figure 1 offers a comprehensible description of the attacking algorithm. The details of the attack with the consensus-based patch selection algorithm are also described in Algorithm 1.
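As an illustration, one iteration of the patch-restricted update in Eq. (7) could look like the sketch below. Here `detector_loss` stands for a callable that evaluates Eq. (8) on the current image and is a placeholder; the update is written as a descent step because the attack drives $L$ towards zero (Algorithm 1, line 13), and the step size is an assumed value.

```python
import torch

def fgsm_patch_step(image, mask, detector_loss, alpha=2.0):
    """One patch-restricted FGSM step in the spirit of Eq. (7).

    image         -- (3, H, W) tensor with pixel values in [0, 255]
    mask          -- (H, W) binary patch mask from the selection algorithm
    detector_loss -- callable returning Eq. (8) for the given image
    """
    image = image.clone().detach().requires_grad_(True)
    loss = detector_loss(image)                      # sum of confidences, Eq. (8)
    loss.backward()
    with torch.no_grad():
        step = alpha * image.grad.sign() * mask      # only the patch pixels change
        adv = (image - step).clamp(0.0, 255.0)       # clip back to the valid range
    return adv.detach()
```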
3 EXPERIMENT
We used the proposed methods in the AIC Phase IV CIKM-2020: Adversarial Challenge on Object Detection competition. The results of the two basic proposed methods without any ensemble operations are recorded in Table 1 and shown in Figure 3. Even without ensemble operations, the algorithms are quite competitive. In order to reduce the number of pixels in our patches, grid-like patches are designed.
   To get grid-like patches, we perform an element-wise product between the patch mask $M$ and a grid matrix $G_{ratio}$. The grid-like mask $M'$ can be calculated by:

    M' = M \odot G_{ratio},                                                                (9)

where $ratio$ represents the degree of sparsity; the larger $ratio$ is, the fewer pixels are used.
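Eq. (9) only requires an element-wise product with a binary grid. The construction of $G_{ratio}$ is not specified here, so the grid-line pattern below is only one plausible construction for illustration.

```python
import torch

def grid_matrix(height, width, ratio):
    """A binary grid G_ratio: a larger ratio keeps fewer pixels active."""
    period = max(2, round(1.0 / (1.0 - ratio)))   # spacing between kept grid lines
    g = torch.zeros(height, width)
    g[::period, :] = 1.0                          # horizontal grid lines
    g[:, ::period] = 1.0                          # vertical grid lines
    return g

def grid_like_mask(mask, ratio):
    """Eq. (9): M' = M * G_ratio (element-wise)."""
    grid = grid_matrix(mask.shape[0], mask.shape[1], ratio)
    return mask * grid
```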
   Ensemble operation is a common practice in machine learning, and we used it in this task. To combine different results, an indicator is defined as

    FinalScore = \sum_{i=1}^{2} \sum_{x} S(x, x^*, m_i).                                   (10)

Since the two white-box detectors, YOLO and Faster RCNN, can be used, the indicator is the sum of two score functions. $S$ is provided by Alibaba Group:

    S(x, x^*, m_i) = \Big(2 - \frac{\sum_k R_k}{5000}\Big) \cdot \Big(1 - \frac{\min\big(BB(x; m_i), BB(x^*; m_i)\big)}{BB(x; m_i)}\Big),   (11)

where $R_k$ is the number of pixels of the $k$-th patch, $x$ is the clean image, $x^*$ is the adversarial example, $m_i$ denotes the $i$-th model, and $BB(x; m_i)$ is the number of bounding boxes detected by $m_i$ on image $x$.
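For reference, the score of Eqs. (10) and (11) is straightforward to compute; in this sketch the box counts are assumed to have been obtained by running the corresponding detector on the clean and adversarial images.

```python
def patch_score(patch_pixel_counts, boxes_clean, boxes_adv):
    """Eq. (11): reward fewer perturbed pixels and fewer surviving detections.

    patch_pixel_counts -- list with the pixel count R_k of every patch
    boxes_clean        -- BB(x; m_i), boxes detected on the clean image
    boxes_adv          -- BB(x*; m_i), boxes detected on the adversarial image
    """
    pixel_term = 2.0 - sum(patch_pixel_counts) / 5000.0
    box_term = 1.0 - min(boxes_clean, boxes_adv) / boxes_clean
    return pixel_term * box_term

def final_score(per_image_results):
    """Eq. (10): sum the score over both white-box models and all images."""
    return sum(patch_score(r, bc, ba) for r, bc, ba in per_image_results)
```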
   When we perform the ensemble operation with grid-like patches using $ratio \in \{0.5, 0.6, 0.7\}$, we obtain a score of more than 2000. The results are also recorded in Table 1 and shown in Figure 3. As can be seen, the effect is obvious.
Table 1: The results of the two proposed methods.

             Heatmap-based    Consensus-based
  Basic           1390             1507
  +G              1936             2400
  +G & S          2713             3071

Figure 3: The scores of the two proposed algorithms. "Basic" represents the results of the two patch generation methods. "+Grid" represents the results of the proposed methods with the grid-like patch ensemble. "+G&S" means the results of our methods with the grid-like and different-scale patch ensemble.

Figure 4: Some adversarial images generated by the consensus-based attack algorithm with grid-like patches.
   Since we can successfully attack most object detection models with grid-like sparse patches, we can also expand the region of the original patches and sparsify them to cover a larger area while altering a moderate number of pixels. Besides, we observe that patches of a fixed size might only be suitable for some images, so we performed an ensemble over patches of different sizes $S$. Combining grid operations and the ensemble over patches of three different sizes $S$ ($S \in \{20, 50, 70\}$ in our experiments), our best result is over 3000 scores.
   Note that the consensus-based algorithm is generally better than the heatmap-based algorithm in the experiments. A possible explanation is that the consensus-based algorithm better combines the target models through the voting process and incorporates the attacking process into the selection process. Some adversarial images are shown in Figure 4. As shown, the patches are gridded and have different scales. Since there is no limit on the perturbations, the noise is obvious. In general, the greater the noise, the better the attack transferability.

4 CONCLUSION
In this paper, two adversarial patch generation algorithms have been proposed: the heatmap-based and the consensus-based patch generation algorithms. The generated patches are efficient and precise. Additionally, they rely on only a few pixels but are generic. Furthermore, the proposed attacking methods can prevent state-of-the-art object detection models from detecting the objects. Such adversarial examples are a great threat to deep neural networks deployed in real-world applications. Through the study of adversarial examples, the mechanism of deep learning models can be further understood, and robust algorithms can also be proposed. In the future, we will explore how to improve the robustness of current detection models against adversarial examples.
REFERENCES
[1] Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. 2020. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv:2004.10934 [cs.CV].
[2] S. Ren, K. He, R. Girshick, and J. Sun. 2017. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. TPAMI 39, 6 (2017), 1137-1149.
[3] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2017. Grad-CAM: Visual Explanations From Deep Networks via Gradient-Based Localization. In ICCV.
[4] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. In International Conference on Learning Representations. http://arxiv.org/abs/1312.6199
[5] Mingxing Tan, Ruoming Pang, and Quoc V. Le. 2020. EfficientDet: Scalable and Efficient Object Detection. In CVPR.