Projecting Trouble: Light Based Adversarial Attacks on Deep Learning Classifiers

Nicole Nichols 1,2 (nicole.nichols@pnnl.gov) and Robert Jasper 1 (robert.jasper@pnnl.gov)
1 Pacific Northwest National Laboratory, Seattle, Washington
2 Western Washington University, Bellingham, Washington

Copyright © by the papers' authors. Copying permitted for private and academic purposes. In: Joseph Collins, Prithviraj Dasgupta, Ranjeev Mittu (eds.): Proceedings of the AAAI Fall 2018 Symposium on Adversary-Aware Learning Techniques and Trends in Cybersecurity, Arlington, VA, USA, 18-19 October, 2018, published at http://ceur-ws.org

Abstract

This work demonstrates a physical attack on a deep learning image classification system using light projected onto a physical scene. Prior work is dominated by techniques for creating adversarial examples that directly manipulate the digital input of the classifier. Such an attack is limited to scenarios where the adversary can directly update the inputs to the classifier, for example by intercepting and modifying the inputs to an online API such as Clarifai or Cloud Vision. Such limitations have led to a vein of research on physical attacks, where objects are constructed to be inherently adversarial or adversarial modifications are added to cause misclassification. Our work differs from other physical attacks in that we can cause misclassification dynamically, without altering physical objects in a permanent way.

We construct an experimental setup that includes a light projection source, an object for classification, and a camera to capture the scene. Experiments are conducted against 2D and 3D objects from CIFAR-10. Initial tests show that projected light patterns selected via differential evolution could degrade classification from 98% to 22% probability for the 2D target and from 89% to 43% for the 3D target. Subsequent experiments explore sensitivity to the physical setup and compare two additional baseline conditions for all 10 CIFAR classes. Some physical targets are more susceptible to perturbation, simple attacks show nearly equivalent success, and 6 of the 10 classes were disrupted by light.

Introduction

Machine learning models are vulnerable to adversarial attacks that make small but targeted modifications to inputs to cause misclassification. The research around adversarial attacks on deep learning systems has grown significantly since (Szegedy et al. 2013) demonstrated their intriguing properties. The scope and limitations of such attacks are an active area of research in the academic community. Most of the research has focused on purely digital manipulation. Recently, researchers have developed techniques that alter or manipulate physical objects to fool classifiers, which could pose a much greater real world threat.
Related Research

Researchers have proposed many theories about the cause of model vulnerabilities. Evidence suggests that adversarial samples lie close to the decision boundary in the low dimensional manifold representing high dimensional data. Adversarial manipulation in the high dimension is often imperceptible to humans and can shift the low dimensional representation to cross the decision boundary (Feinman et al. 2017). Many approaches are available to perform this manipulation if the attacker has access to the defender's classifier. Furthermore, adversarial examples have empirically been shown to transfer between different classifier types (Papernot, McDaniel, and Goodfellow 2016; Szegedy et al. 2013). This enhances the attacker's potential capability when there is no access to the defender's classifier.

It is difficult for defenses to keep pace with attacks, and the advantage lies with the adversary. This was highlighted when seven of the eight white box defenses announced at the prestigious ICLR 2018 were defeated within a week of publication (Athalye, Carlini, and Wagner 2018).

Researchers have successfully demonstrated physical world attacks against deep learning classifiers. Some of the first physical attacks were demonstrated by printing an adversarial example, photographing the printed image, and verifying that the adversarial attack remained (Kurakin, Goodfellow, and Bengio 2016). (Sharif et al. 2016) demonstrated printed eyeglass frames that thwart facial recognition systems and fully avoid face detection by the Viola-Jones object detection algorithm. It has also been noted that near infrared light can be used to evade face detection (Yamada, Gohshi, and Echizen 2013). Our work is different because we leverage dynamic generation methods that use real world feedback when learning the patterns of light to project.

Putting aside adversarial attacks, most image classifiers are not inherently invariant to object scale, translation, or rotation. Notable exceptions are (Cohen and Welling 2014), which attempts to learn object recognition by construction of parts, and (Qi et al. 2017), which uses a 3D point cloud representation for object classification. To some degree, this invariance can be learned from training data if it has intentionally been designed to address this gap. For example, the early work by (LeCun, Huang, and Bottou 2004) was evaluated with the NORB dataset, which was systematically collected to assess pose, lighting, and rotation of 3D objects.

Simulating scale, translation, and rotation of 2D images is conducive to experiment automation, and many recent advances in rotational invariance, such as Spatial Transformer Networks (Jaderberg et al. 2015), use this framework to evaluate robustness to these properties. However, further research is needed to validate the ability of this simulated rotational invariance to transfer to real world rotation of 3D figures. We emphasize the need for invariant models because it is impossible to disambiguate the success of an attack when it can only be validated with a weak model.

Maintaining an adversarial attack under a range of pose or lighting conditions may prove to be the most difficult aspect of this task. Some preliminary research suggests this is possible and has demonstrated two toy examples in the physical world (Athalye and Sutskever 2017). They introduce an Expectation over Transformation (EoT) method for differentiating texture patterns through a 3D renderer to produce an adversarial object. An additional demonstration of physical attack is to introduce an adversarial patch to the physical scene, which is invariant to location, rotation, and scale, and causes a specific misclassification (Brown et al. 2017).

Experimental Setup and Results

We constructed a test environment to perform light based adversarial attacks and collect data in an office environment with minimal lighting control. Our attacks were conducted against 2D and 3D target objects placed in the scene. We used a projector to project light onto the target and a common web camera to capture the scene. For the 2D and initial 3D experiments, the projector was a Casio XJ-A257 and the camera was a Logitech C930e. During the second phase of 3D experiments, we used an Epson VS250 projector, a Logitech C615 HD camera, and an Altura HD-ND8 neutral density filter to control the light intensity of the projector.
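As an illustrative sketch (not the exact pipeline used in these experiments), the capture-and-classify step at the heart of this setup could look like the following. Here `model` is a stand-in for a CIFAR-10 classifier such as the ResNet38 used in this work, OpenCV is assumed for webcam capture, and the captured frame is simply downsampled to the 32x32 CIFAR resolution; any input normalization the classifier expects is omitted.

```python
# Minimal sketch of the capture-and-classify loop (assumptions: a CIFAR-style
# classifier `model`, OpenCV webcam capture, 32x32 downsampling as in CIFAR-10).
import cv2
import torch
import torch.nn.functional as F

CIFAR_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                 "dog", "frog", "horse", "ship", "truck"]

def capture_frame(cam_index=0):
    """Grab one frame from the web camera as an RGB array."""
    cap = cv2.VideoCapture(cam_index)
    ok, frame_bgr = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("camera capture failed")
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

def classify_scene(model, frame_rgb):
    """Downsample the captured scene to 32x32 and return class probabilities.
    (Any mean/std normalization the trained model expects is omitted here.)"""
    small = cv2.resize(frame_rgb, (32, 32), interpolation=cv2.INTER_AREA)
    x = torch.from_numpy(small).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1).squeeze(0)
    return {c: float(p) for c, p in zip(CIFAR_CLASSES, probs)}
```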
2D Presentation

For the 2D scene, we chose a random image (horse) from the CIFAR-10 dataset to be attacked. The image was printed and secured to the wall in front of the camera and projector. Following a methodology similar to earlier work on single pixel attacks (Su, Vargas, and Kouichi 2017), we use differential evolution (DE) to optimize a light based attack that causes misclassification. Differential evolution is a heuristic global optimization strategy similar to genetic algorithms, where the algorithm maintains a population of candidate solutions and selects a small number (potentially one) for further rounds of modification and refinement. We projected a digital black 32x32 square containing a single pixel with a variable location and RGB value. Because projectors cannot project black (the absence of light), the projector adjusted the black pixels to present the illusion of a black background. This adjustment is impacted somewhat by the RGB value of the single pixel being projected. Each iteration of the differential evolution was projected, captured, and input to a standard ResNet38 for classification of the image captured by the camera. Though only one pixel was modified in the digital attack pattern, because of the distance between the projector and object, a larger area in the captured scene and many input pixels to the camera are modified. The original and attacked scenes are shown in Figure 1.

Figure 1: Images demonstrating light based attack on 2D physical presentation. (a) The 2D scene without adversarial attack. (b) The 2D scene with adversarial attack.

Through this attack, the probability of horse was decreased from 98% to 22%.
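The differential evolution loop described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in these experiments: `project_fullscreen` is a hypothetical helper for driving the projector output, `capture_frame` and `classify_scene` are the helpers sketched earlier, and SciPy's `differential_evolution` stands in for the DE optimizer (its `popsize` is a per-parameter multiplier, so `popsize=10` over five parameters corresponds to roughly a 50-candidate population).

```python
# Illustrative sketch of the DE-driven single pixel light attack (not the
# authors' implementation). The search variables are the pixel location in
# the projected 32x32 grid and its RGB color.
import cv2
import numpy as np
from scipy.optimize import differential_evolution

def make_pattern(x, y, r, g, b, background=0):
    """A 32x32 background with one pixel of variable location and color."""
    pattern = np.full((32, 32, 3), background, dtype=np.uint8)
    pattern[int(y), int(x)] = (int(r), int(g), int(b))
    return pattern

def project_fullscreen(pattern_rgb, window="projector", pause_ms=200):
    """Hypothetical helper: upscale the pattern, show it full screen on the
    projector output, then pause so the camera sees the new projection."""
    big = cv2.resize(pattern_rgb, (1024, 768), interpolation=cv2.INTER_NEAREST)
    cv2.namedWindow(window, cv2.WND_PROP_FULLSCREEN)
    cv2.setWindowProperty(window, cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
    cv2.imshow(window, cv2.cvtColor(big, cv2.COLOR_RGB2BGR))
    cv2.waitKey(pause_ms)

def attack_objective(params, model, true_class):
    """Project a candidate pixel, capture the scene, and return the true-class
    probability; DE minimizes this, i.e. it seeks the most damaging projection."""
    project_fullscreen(make_pattern(*params))
    return classify_scene(model, capture_frame())[true_class]

def run_attack(model, true_class="horse"):
    # Bounds: pixel row/column in the 32x32 grid plus its RGB color.
    bounds = [(0, 31), (0, 31), (0, 255), (0, 255), (0, 255)]
    # popsize=10 gives 10 * 5 parameters = 50 candidates per generation, and
    # maxiter=4 loosely echoes the 4 evolution phases reported later.
    return differential_evolution(attack_objective, bounds,
                                  args=(model, true_class),
                                  popsize=10, maxiter=4, polish=False)
```

Because the objective returns the true-class probability, minimizing it is a non-targeted attack; a targeted variant could instead return the negative probability of a chosen target class.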
3D Presentation

To demonstrate the potential for light based attacks, we extended the 2D methodology to a 3D scene in two experimental phases. First, we placed a toy car in the field of view of the web camera to capture the scene. To perform the attack, the projector iteratively applies the same adversarial noise procedure to the 3D physical scene, and the same ResNet38 model is used for evaluation. The object probabilities for the original scene were 89% automobile and 11% truck. The attacked scene probabilities were 43% automobile and 57% truck.

Figure 2: Images demonstrating light based attack on 3D physical presentation. (a) The 3D scene without any adversarial attack. (b) The 3D scene with adversarial attack.

Figure 3: Downsampled images demonstrating light based attack on 3D physical representation. (a) Downsampled image without any adversarial attack. (b) Downsampled image with adversarial attack.

The second phase of experiments was designed to improve the repeatability and confidence of the initial demonstration. Results are expanded to evaluate all 10 CIFAR classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck. The figurines used for each of these classes are shown in Figure 4a. The yellow car in phase 1 was not available and was replaced with a red car in phase 2.

Figure 4: Experiment setup and figurines for the second phase experiments with the 3D presentation. (a) The toy figurines used to represent the CIFAR classes. (b) Physical setup demonstrating the relative position of projector, camera, object, and lighting control.

Rotation invariance is important for interpreting the presented experimental setup. This impacts our data collection because we observed that, in a baseline condition with no added light, the distance to the camera and the object orientation yielded highly variable classification results. We tested four experimental conditions: ambient light, white light from the projector, white light with a randomly located pixel in the 32x32 grid, and a differential evolution process controlling the color and location of one pixel in a 32x32 white grid (the projected patterns are sketched below).

We observed classification variability in the physical scene when no modifications were applied. For this reason we introduced some lighting controls, which observationally provided a significantly more stable baseline classification. Three physical modifications were made. The projected background color was changed from black to white to provide more uniformity to the scene. We used a foam block to minimize stray reflections caused by the projector. Additionally, we used a neutral density filter to scale the light intensity. To verify stability, we collected twenty image captures of each test condition, and 200 for differential evolution (a 50-sample population and 4 evolution phases).
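The white light and random pixel conditions listed above are straightforward to generate; the sketch below shows one plausible construction (illustrative only; beyond the 32x32 grid, the exact resolution and randomization used in the experiments are not specified here).

```python
# Sketch of the projected test patterns for the second phase conditions
# (the ambient light condition projects nothing). Illustrative only.
import numpy as np

def white_pattern():
    """Uniform white 32x32 grid: the plain white light condition."""
    return np.full((32, 32, 3), 255, dtype=np.uint8)

def random_pixel_pattern(rng=None):
    """White 32x32 grid with one randomly located, randomly colored pixel."""
    if rng is None:
        rng = np.random.default_rng()
    pattern = white_pattern()
    col, row = rng.integers(0, 32, size=2)
    pattern[row, col] = rng.integers(0, 256, size=3)
    return pattern

# The differential evolution condition reuses make_pattern() from the 2D attack
# sketch, but over a white background: make_pattern(x, y, r, g, b, background=255),
# with (x, y, r, g, b) chosen by the DE search.
```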
Reproducibility of the physical placement of each object in the scene is imprecise, so each test condition was collected in sequence without any disturbance (besides light). An unrecorded calibration phase was used to reposition the object for a maximum baseline classification score before the recorded baseline and light projected data were collected. For each class and test condition, we report the mean, median, standard deviation, variance, minimum, maximum, ∆mean, and ∆median. The ∆mean and ∆median are the reduction in probability score for the given attack type relative to baseline; larger ∆ values represent a more powerful decrease in the true class probability.
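As a concrete reading of these statistics, the summary reported in Table 1 could be reproduced from per-capture scores roughly as follows. This is a minimal sketch assuming a hypothetical DataFrame `df` with one row per captured image and columns `class`, `condition`, and `true_class_prob`; the column and condition names are placeholders rather than the actual data format used.

```python
# Sketch of the per-class, per-condition summary behind Table 1. Assumes a
# DataFrame `df` with one row per captured image and columns:
#   class, condition, true_class_prob
import pandas as pd

def summarize(df):
    stats = (df.groupby(["class", "condition"])["true_class_prob"]
               .agg(["mean", "median", "std", "var", "min", "max"]))
    # Baseline mean/median per class, used for the delta columns.
    base = (stats.xs("Baseline", level="condition")[["mean", "median"]]
                 .rename(columns={"mean": "base_mean", "median": "base_median"}))
    stats = stats.join(base, on="class")
    # Delta = reduction in true-class probability relative to baseline;
    # larger values indicate a more powerful attack.
    stats["delta_mean"] = stats["base_mean"] - stats["mean"]
    stats["delta_median"] = stats["base_median"] - stats["median"]
    return stats.drop(columns=["base_mean", "base_median"]).round(3)
```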
All scores are reported in Table 1. Interpreting the table yields one immediate observation: some examples (Automobile, Bird, Horse, Ship) are invariant to the light attack, consistently being identified as the true class at 100% (within rounding error), while other classes (Airplane, Cat, Deer, Dog, Frog, and Truck) have varying degrees of susceptibility. It is unclear whether these differences are inherent in the classes themselves or specific to the particular figurines we chose. As one might expect with a research classifier, there is a high degree of variability based on the particular example. We increased the complexity of the light attack from pure white light, to a random square, to differential evolution, to assess whether there was something unique in the more sophisticated attack, or whether it was merely the addition of light, or a pattern, that was causing the observed decrease in classification. In many cases, the simple addition of white light is as effective as the other attacks. For example, the mean airplane score was decreased from 1.000 to 0.151 with only the addition of white light. The corresponding trials with random and differential evolution light patterns yielded only slightly stronger attacks, with mean scores of 0.113 and 0.133 respectively. However, the decline is noteworthy, independent of sophistication.

Discussion

Physical attacks on machine learning systems could be applied in a wide range of security domains. The literature has primarily discussed the safety of road signs and autonomous driving (Eykholt et al. 2017; Chen et al. 2018); however, other security applications may also be impacted. An adversary may be trying to hide themselves or physical ties to illegal activities to evade law enforcement (e.g. knives/weapons, contraband, narcotics manufacturing, etc.). Any AI deployed for law-enforcement applications needs to be robust in an adversarial environment where physical obfuscation could be employed. Light based attacks:
• Can perform targeted and non-targeted attacks.
• Do not modify the physical object in a permanent way.
• Can be a transient effect occurring at specified times.
This work aims to be a first step towards understanding the abilities and limitations of such physical attacks. We picked a relatively easy first target to verify the possibility and plan to extend this to more complex physical scenarios and classification models.

We chose to attack the CIFAR-10 framework in a manner similar to what was demonstrated in the original single pixel attack (Su, Vargas, and Kouichi 2017). This framework is an easier target because it is a low resolution, low parameter model. To assess the robustness of stronger models, a ResNet50 classifier trained on ImageNet was also used to evaluate all of the collected images. Because of a lack of corresponding true class identification, scores are not reported, but it was observed that the top-1 class prediction was shifted by the addition of light based attacks.

There is also a closed world assumption of 10 relatively dissimilar classes, where the probability of all classes sums to one. When a misclassification occurs, it tends to be more outlandish than it could otherwise be. For example, rose and tulip might be a more forgiving mistake than frog and airplane, but in the CIFAR closed world framework, the model is limited to the 10 known classes.

In our attack on the 3D presentation, the true class was correctly identified as car when no attack was present. By applying the adversarial light attack, we were able to decrease the confidence of car from 89% to 43%, and instead predict truck with 57% probability. We would not identify this as a 3D attack because we had a fixed orientation between the camera, projector, and object. In this example, the single square attack is visually perceptible but transient. However, the notion of human perception is not as simple as an L∞ distance in pixel space. This is highlighted by the fact that consecutive video frames can be significantly misclassified by top performing image classification systems (Zheng et al. 2016). Images that are imperceptibly different can have a large distance in pixel or feature space, and images that are perceptually different can be close.

A key topic that needs further understanding is the cause of the extreme variability in class identification. One potential explanation is the degree of self similarity within a class, and training data bias. For example, the horse images in the training data are potentially all self similar and also closely match the example figurine. The variation between different types of horses is likely smaller than the visual difference between different breeds of dogs.

Another possible explanation is the scale or percentage of the scene that the object occupies. Most of the classes which were susceptible to attack were relatively small. The notable exception was the truck, which was actually the largest figure used for data collection, yet was still susceptible to misclassification errors with the addition of light.

There are several important constraints present when crafting a light based physical attack that are unconstrained in a digital attack. Specifically, light is always additive noise, and turning a dark color to white with the addition of light is impossible. The angle of projection and the texture of the scene may impact the colors reflected to the camera. The camera itself will introduce color balance changes as it adjusts to the adversarial addition of light. Even a fully manual camera will always have CCD shot noise, a function of shutter speed and temperature, that could influence the success or failure of a light based attack. The projected pixel was not constrained to overlap the target object, and would appear in the background. Empirically, these single pixel projections onto the background of an image could significantly change classifier predictions.
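A toy reflectance model, not part of this work, can make the additive constraint concrete: if the camera sees the ambient appearance plus the projected light scaled by the surface reflectance (clipped to the sensor range), then a dark, low-reflectance surface absorbs most of the projection and cannot be driven to white, and added light can never darken a region.

```python
# Toy model of the additive-light constraint (illustrative, not from the paper).
import numpy as np

def simulate_capture(ambient, reflectance, projected):
    """ambient, reflectance, projected: float arrays in [0, 1].
    The projector can only add light, scaled by how reflective the surface is."""
    return np.clip(ambient + reflectance * projected, 0.0, 1.0)

white_light = np.ones((4, 4, 3))
dark_surface = simulate_capture(ambient=np.full((4, 4, 3), 0.05),
                                reflectance=np.full((4, 4, 3), 0.1),
                                projected=white_light)
print(dark_surface.max())  # ~0.15: slightly brighter, but nowhere near white
```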
tion and more robust classifiers and more subtle manip- Qi, C. R.; Su, H.; Mo, K.; and Guibas, L. J. 2017. Pointnet: ulations. We believe that more targeted optimization ap- Deep learning on point sets for 3d classification and seg- proaches that initially focus on sensitive image areas will mentation. Proc. Computer Vision and Pattern Recognition likely lead to faster identification of successful attacks. We (CVPR), IEEE 1(2):4. expect light based attacks could use more complex projected Sharif, M.; Bhagavatula, S.; Bauer, L.; and Reiter, M. K. textures and take advantage of 3D geometry. Presented re- 2016. Accessorize to a crime: Real and stealthy attacks on sults clearly show light has the potential to be another av- state-of-the-art face recognition. In Proceedings of the 2016 enue of adversarial attack in the physical domain. ACM SIGSAC Conference on Computer and Communica- tions Security, 1528–1540. ACM. Acknowledgments Su, J.; Vargas, D. V.; and Kouichi, S. 2017. One pixel The research described in this paper is part of the Analysis in attack for fooling deep neural networks. arXiv preprint Motion Initiative at Pacific Northwest National Laboratory; arXiv:1710.08864. conducted under the Laboratory Directed Research and Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, Development Program at PNNL, a multi-program national D.; Goodfellow, I.; and Fergus, R. 2013. Intriguing proper- laboratory operated by Battelle for the U.S. Department ties of neural networks. arXiv preprint arXiv:1312.6199. of Energy. The authors are especially grateful to Mark Greaves, Artem Yankov, Sean Zabriskie, Michael Henry, Yamada, T.; Gohshi, S.; and Echizen, I. 2013. Privacy visor: Jeremiah Rounds, Court Corley, Nathan Hodas, Will Koella Method for preventing face image detection by using differ- and our Quickstarter supporters. ences in human and device sensitivity. In IFIP International Conference on Communications and Multimedia Security, 152–161. Springer. Zheng, S.; Song, Y.; Leung, T.; and Goodfellow, I. 2016. References Improving the Robustness of Deep Neural Networks via Sta- bility Training. Athalye, A., and Sutskever, I. 2017. Synthesizing robust adversarial examples. arXiv preprint arXiv:1707.07397. 
CIFAR Class  Experiment Condition  Mean   Median  SD    Var   Min    Max    ∆Mean  ∆Median
Airplane     Baseline              1.000  1.000   .000  .000  1.000  1.000  .000   .000
             White Light           .151   .101    .198  .039  .017   .997   .849   .899
             Random                .114   .105    .088  .008  .022   .445   .886   .895
             Diff Evolution        .133   .112    .087  .007  .014   .459   .867   .888
Automobile   Baseline              1.000  1.000   .000  .000  1.000  1.000  .000   .000
             White Light           1.000  1.000   .000  .000  .999   1.000  .000   .000
             Random                1.000  1.000   .000  .000  .999   1.000  .000   .000
             Diff Evolution        1.000  1.000   .000  .000  1.000  1.000  .000   .000
Bird         Baseline              1.000  1.000   .000  .000  1.000  1.000  .000   .000
             White Light           1.000  1.000   .002  .000  .993   1.000  .000   .000
             Random                1.000  1.000   .000  .000  1.000  1.000  .000   .000
             Diff Evolution        1.000  1.000   .000  .000  .999   1.000  .000   .000
Cat          Baseline              .990   .991    .004  .000  .979   .996   .000   .000
             White Light           .009   .008    .005  .000  .000   .020   .981   .983
             Random                .011   .007    .012  .000  .001   .047   .979   .984
             Diff Evolution        .023   .017    .019  .000  .002   .124   .967   .974
Deer         Baseline              .999   .999    .000  .000  .999   1.000  .000   .000
             White Light           .516   .516    .145  .021  .242   .997   .483   .483
             Random                .545   .507    .155  .024  .327   .871   .454   .492
             Diff Evolution        .473   .467    .130  .017  .144   .829   .526   .532
Dog          Baseline              .993   .993    .003  .000  .986   .996   .000   .000
             White Light           .512   .499    .088  .008  .390   .695   .481   .494
             Random                .482   .497    .123  .015  .136   .753   .511   .496
             Diff Evolution        .386   .388    .088  .008  .123   .601   .606   .605
Frog         Baseline              .888   .888    .025  .001  .842   .933   .000   .000
             White Light           .008   .008    .003  .000  .000   .015   .881   .880
             Random                .030   .011    .076  .006  .004   .360   .858   .877
             Diff Evolution        .071   .038    .093  .009  .005   .576   .817   .849
Horse        Baseline              1.000  1.000   .000  .000  1.000  1.000  .000   .000
             White Light           .999   1.000   .001  .000  .993   1.000  .000   .000
             Random                1.000  1.000   .000  .000  1.000  1.000  .000   .000
             Diff Evolution        1.000  1.000   .000  .000  1.000  1.000  .000   .000
Ship         Baseline              1.000  1.000   .000  .000  1.000  1.000  .000   .000
             White Light           1.000  1.000   .000  .000  1.000  1.000  .000   .000
             Random                1.000  1.000   .000  .000  1.000  1.000  .000   .000
             Diff Evolution        1.000  1.000   .000  .000  1.000  1.000  .000   .000
Truck        Baseline              1.000  1.000   .000  .000  1.000  1.000  .000   .000
             White Light           .832   .832    .052  .003  .729   1.000  .168   .168
             Random                .818   .819    .072  .005  .634   .970   .182   .180
             Diff Evolution        .826   .839    .088  .008  .507   .949   .174   .161

Table 1: Classification statistics (probability of the true class) for baseline and attacked CIFAR figures.