Attacks on Machine Learning: Lurking Danger for Accountability

Katja Auernhammer, Ramin Tavakoli Kolagari, Markus Zoppelt
Nuremberg Institute of Technology, Faculty of Computer Science
Hohfederstrasse 40, Nuremberg, 90489, Germany
{katja.auernhammer, ramin.tavakolikolagari, markus.zoppelt}@th-nuernberg.de
Copyright held by authors.

Abstract

It is well-known that there is no safety without security. That being said, a sound investigation of security breaches on Machine Learning (ML) is a prerequisite for any safety concerns. Since attacks on ML systems and their impact on the security goals threaten the safety of an ML system, we discuss the impact attacks have on the ML models' security goals, which are rarely considered in published scientific papers.

The contribution of this paper is a non-exhaustive list of published attacks on ML models and a categorization of attacks according to their phase (training, after-training) and their impact on security goals. Based on our categorization we show that not all security goals have yet been considered in the literature, either because they were ignored or because there are no publications on attacks targeting those goals specifically, and that some, such as accountability, are difficult to assess. This is probably due to some ML models being a black box.

Introduction

During the last few years scientists and researchers have published a variety of different attacks on Machine Learning (ML) systems. However, the papers only rarely mention security goals—such as integrity, availability, confidentiality, reliability, authenticity, and accountability—that are endangered by these attacks. Even if a paper explicitly mentions the violation of a security goal, it is not clear whether the breach refers to the whole system in which the ML model is embedded or rather to the ML model itself or parts of it.

The contribution of this paper is a non-exhaustive list of published attacks on ML and a derivation of different groups of attacks. We further elaborate on the breaches of known security goals (integrity, availability, confidentiality, etc.) caused by the listed attacks to justify our categorization and show the security goals mentioned in published papers about attacks on ML. Our categorization clarifies that there are some security goals, such as accountability, which are yet difficult to evaluate due to the complex operations within ML models.

Security Goals

The six main security goals as described in [21] are summarized as follows:

• Confidentiality ensures that private or confidential information is not made available or disclosed to unauthorized users, and that users can control (or influence) what information related to them may be collected and used, and to whom it is disclosed. Confidentiality is often implemented through cryptography / encryption.

• Integrity ensures that information is not changed (modified) or destroyed in an unauthorized way. Integrity can be compromised even if the information or system produces the correct output.

• Availability ensures that a system works promptly, service is not denied to authorized users, and access to and use of information is timely and reliable.

• Authenticity is the characteristic of being genuine, verifiable, and trustworthy. Authenticity is ensured through authentication processes that verify whether users are who they say they are (entity authenticity). Authenticity is often enabled through cryptography / cryptographic signatures.

• Reliability is the property of a system such that reliance can be justifiably placed on the service it delivers, i.e., the system adheres to the specification it was engineered to address.

• Accountability refers to the requirement that actions of an entity can be traced uniquely to that entity (e.g., non-repudiation of a communication that took place). Accountability allows a certain degree of transparency as to what happened when and what was performed by whom.
Attacks on Machine Learning Algorithms

Important criteria that influence the applicability of certain attacks on ML models at this level of detail are the learning type (supervised, unsupervised, reinforcement learning) and whether the algorithm undergoes lifelong learning. Different attacks are designed to target combinations of different criteria. The implications for the security goals of the ML model are equivalent to the security goals corresponding to the categorization of the attack.

In Table 1 the first column names the ML algorithm in alphabetical order, followed by the learning type and whether the model is capable of lifelong learning or not. Lifelong learning is a criterion that is often ignored by researchers or at least not explicitly mentioned in papers. We complemented this information wherever necessary according to the definitions in common textbooks. There are four possible values for lifelong learning: Yes, No, Yes/No (when both can be the case), and Unclear (when we simply do not know). In the last column we list the attacks with the corresponding literature.

Table 1: Published attacks on ML categorized by ML algorithms. The listed ML algorithms are derived from the publications of the attacks; therefore, there might be attacks aimed at, e.g., neural networks in general but also attacks on specific sub-types of neural networks, e.g., convolutional neural networks. The columns "Learning Type" and "Lifelong Learning" do not solely refer to what the algorithm is capable of but to the premises the ML algorithm must meet to render the attack effective.

| ML Algorithm | Learning Type | Lifelong L. | Attack |
|---|---|---|---|
| Complete-linkage Hierarchical Clustering | Unsupervised | No | Poisoning Attack [9] |
| Single-linkage Hierarchical Clustering | Unsupervised | No | Poisoning Attack [13]; Obfuscation Attack [13, 14] |
| Decision Tree / Random Forest | Supervised | Yes/No | Poisoning Attack [46] |
| | | No | Path-finding Attack [72]; Model Inversion [26]; Ateniese et al. Attack [4]; Adversarial Examples [31, 52, 66] |
| Hidden Markov Model | Supervised | No | Ateniese et al. Attack [4] |
| k-Nearest Neighbors | Supervised | Yes/No | Poisoning Attack [46] |
| | | No | Adversarial Examples [31] |
| k-Means Clustering | Unsupervised | No | Ateniese et al. Attack [4] |
| Linear Regression | Supervised | Yes/No | Poisoning Attack [8, 35, 41] |
| | | No | Model Inversion [27]; Lowd-Meek Attack [44, 72] |
| Logistic Regression | Supervised | No | Equation-solving Attack [49]; Hyperparameter Stealing [73]; Adversarial Examples [52, 70, 71] |
| Multi-class Logistic Regression | Supervised | No | Equation-solving Attack [49] |
| Maximum Entropy Models | Supervised | No | Lowd-Meek Attack [44] |
| Naive Bayes | Supervised | No | Classifier Evasion [3, 22]; Lowd-Meek Attack [44] |
| Neural Network | Reinforcement Learning | Unclear | Strategically-timed Attack [40]; Enchanting Attack [40]; Adversarial Examples [33, 40] |
| Neural Network | Supervised | No | Model Inversion [26]; Membership Inference [63]; Hyperparameter Stealing Attack [73]; Ateniese et al. Attack [4]; Adversarial Examples [29, 31, 45, 52, 62, 70]; Trojan Trigger [43] |
| Multi-layer Perceptron | Supervised | Yes/No | Poisoning Attack [46] |
| | | No | Equation-solving Attack [49]; Ateniese et al. Attack [4] |
| Convolutional Neural Network | Supervised | No | Side-channel Attack [74]; Training Data Extraction [18]; Adversarial Examples [50, 52, 70] |
| Recurrent Neural Network | Supervised | No | Training Data Extraction [18]; Classifier Evasion [3]; Adversarial Examples [57] |
| Support Vector Machine | Supervised | Yes/No | Poisoning Attack [12, 46]; Adversarial Label Flips [76, 77] |
| | | No | Hyperparameter Stealing [73]; Lowd-Meek Attack [44, 72]; Ateniese et al. Attack [4]; Evasion Attack [3, 24, 30, 61, 66]; Feature Deletion [28]; Adversarial Examples [31, 52, 66, 71] |

We also identified attacks that are employable against several ML algorithms. Attacks we consider applicable to systems regardless of the ML algorithm, learning type, and lifelong learning capability are, for example, poisoning attacks [8, 46], as these attacks do not focus on the model but on the training data; therefore, poisoning attacks are considered independent of the ML algorithm.
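To make this concrete, the following minimal sketch (our illustration, not taken from the cited publications, and assuming scikit-learn is available) shows an untargeted, label-flipping style of data poisoning against a logistic regression model. The published attacks optimize the poison points far more carefully; the point here is only that the training data is touched, not the learning algorithm.

```python
# Minimal sketch of an untargeted, label-flipping style poisoning attack.
# Only the training labels are manipulated; the learner itself is untouched.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

poison_rate = 0.3  # fraction of training labels the attacker controls
flip_idx = rng.choice(len(y_train), size=int(poison_rate * len(y_train)), replace=False)
y_poisoned = y_train.copy()
y_poisoned[flip_idx] = 1 - y_poisoned[flip_idx]  # flip binary labels

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)

print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```

Because only the training labels are manipulated, the same sketch works unchanged for any estimator exposing a fit/score interface, which is why poisoning is treated as independent of the ML algorithm.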
Another group of attacks that tamper with the data fed into the ML model, and thus are applicable to a wide range of different ML algorithms, are adversarial examples [5, 6, 17, 34, 51, 59], evasion attacks [23, 78], and feature deletion attacks. These attacks exploit weaknesses in the ML model without changing the model itself, by simply perturbing the input to falsify the output.
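As an illustration of such input perturbations, the sketch below (our simplification, assuming scikit-learn; the cited attacks target deep networks and use more elaborate optimization) applies a fast-gradient-sign style step to a logistic regression model, whose input gradient can be written in closed form. The model parameters are never modified; only the queried input is perturbed.

```python
# Minimal evasion / adversarial-example sketch against a linear model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)

x, label = X[0], y[0]
p = model.predict_proba(x.reshape(1, -1))[0, 1]
# Gradient of the cross-entropy loss w.r.t. the input of a logistic model:
# dL/dx = (p - y) * w
grad = (p - label) * model.coef_[0]

epsilon = 0.5
x_adv = x + epsilon * np.sign(grad)  # fast gradient sign step on the input

print("original prediction: ", model.predict(x.reshape(1, -1))[0])
print("perturbed prediction:", model.predict(x_adv.reshape(1, -1))[0])
```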
Shokri et al. [63] claim their attack, membership inference, to be generic, although they only apply it to classification algorithms. We also think the attack is only applicable to ML algorithms that are not capable of lifelong learning, as membership inference relies on computing multiple inputs via the ML model to extract information about the training data. If the model adapts with every given input, this approach is hampered.
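A heavily simplified membership-inference sketch follows (our illustration, assuming scikit-learn; Shokri et al. [63] instead train shadow models and a dedicated attack classifier). It only assumes that the attacker can repeatedly query a fixed, overfitted target model for confidence scores, which is exactly the premise that lifelong learning would undermine.

```python
# Simplified membership inference: guess "member" when the fixed target model
# is unusually confident on a record (overfitted models are more confident on
# their own training data than on unseen data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=2)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=2)

# Train (and overfit) the target model on the "member" half only.
target = RandomForestClassifier(n_estimators=50, random_state=2).fit(X_in, y_in)

def confidence(model, X):
    # Confidence assigned to the predicted class of each record.
    return model.predict_proba(X).max(axis=1)

threshold = 0.9
hit_rate = (confidence(target, X_in) >= threshold).mean()    # members flagged
false_rate = (confidence(target, X_out) >= threshold).mean() # non-members flagged

print(f"member hit rate {hit_rate:.2f} vs. non-member false alarm rate {false_rate:.2f}")
```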
Categorization of ML Attacks with Regard to Security Goals

In software security it is well-established to distinguish between attacks with regard to their effects on security goals (see Section "Security Goals"). The attacks described in Table 2 affect one or more security goals of a system (here: an ML component). A categorization of the published attacks according to security goals compiles an overview of clusters of similar attack scenarios as well as of missing but expected attack clusters. These gaps in the categorization of attacks may result from publications about attacks on ML components that are unknown to us, from unpublished attacks, or from attacks that have not yet been executed but which are all conceivable and therefore executable in principle. Therefore, these gaps in the categorization are particularly revealing.

Of particular relevance for the categorization of attacks developed here are violations of security goals that affect the ML component as a whole. Thus, a violation of integrity for an ML component means that the ML component itself is changed (in some form). In the publications on the attacks on ML components analyzed here (and also listed in Table 2), statements are partly made on the violations of security goals, but these sometimes refer (only) to partial aspects of an attack. Thus, the adversarial examples attack [69], which manipulates data fed into the model, targets—according to the authors—integrity, namely the integrity of the input data; as the integrity of the ML model itself is not attacked, because the model has not been changed, the attack is not categorized in Table 2 under integrity.

Table 2 shows our mapping of the analyzed attacks listed in Table 1 to the six security goals described in the Security Goals section. While Table 1 focused on the ML algorithms, Table 2 brings the attacks into focus. The assignment in Table 2 is based on the description of the attacks in the respective publications. In the table, an "X" indicates which security goal (related to the ML component as a whole) is affected by which attack.

In addition, many attacks have been published that relate to the pre- and post-processing units of ML components (their environment). These attacks do not differ from those on traditional software; therefore, they are not described in this paper.

An obvious peculiarity of ML components compared to traditional software is their training, so there are two essential phases in their life cycle: the training phase (T) and the deployment phase, which we prefer to call the after-training phase (A), as this also covers lifelong learning ML algorithms, which are trained with every input even after deployment. This continuous learning process makes attacks possible at deployment time that are otherwise applicable at training time (such as poisoning attacks [60]) and, on the other hand, rules out attacks that require a fixed target model (e.g., model inversion [26]).

Unlike previous research (e.g., [7, 55]), we do not consider whether an attack is targeted, i.e., whether the opponent causes a certain wrong output or merely causes some wrong output to be generated, or whether the opponent has white-box or black-box knowledge. At this point we also do not distinguish between different types of learning (supervised, unsupervised, reinforcement learning). Considering all these kinds of criteria would create a blurred categorization that contradicts a clear distinction between attacks. Instead, we propose considering the above criteria within each of our main groups in order to add further dimensions and form sub-groups. This is not within the scope of this paper, although we consider the learning type in Table 1, which can be used as a starting point for further investigations.

By analyzing the security goals that are breached by the attacks and the time the attack takes place, we can create different categories of attacks. The names of the categories are derived from whether the attack takes place during training time (T) or after-training time (A), followed by a dash (-) and the first one or two letters of the main security goals that are breached by the attacks (a small sketch after Table 2 illustrates this naming scheme). Grey "X"s indicate the main assignments of attacks to security goals.

Table 2: Mapping of published attacks on ML to the security goals violated. The attacks are categorized according to the security goals they breach. The first column "Att. Cat." (Attack Category) labels the categories. The names are derived from the time in the ML algorithm's life cycle (Training, After-training) at which the attacks take place and the security goals the attack breaks that are most relevant for the specified category. Citations in a cell refer to publications in which the corresponding security goal is discussed for that attack.

| Att. Cat. | Published Attacks | Confidentiality | Availability | Integrity | Reliability | Authenticity | Accountability |
|---|---|---|---|---|---|---|---|
| T-IR | Poisoning Attack [60] | X [14, 47] | [10, 13, 14, 35, 39, 47] | X [10, 14, 35, 36, 39, 47, 55, 67] | X | | |
| | Adversarial Label Flips [76] | X | | X [56] | X | | |
| | Strategically-timed Attack [40] | X | | X | X | | |
| | Enchanting Attack [40] | X | | X | X | | |
| | Obfuscation Attack [13] | | | X [13] | X | | |
| A-IR | Trojan Trigger [43] | X | | X | X | | |
| A-C | Model Inversion [26] | X [26, 27, 32, 56, 72, 75] | X | | | | |
| | Membership Inference [63] | X [63, 65] | X | | | | |
| | Side-channel Attack [74] | X [74] | X | | | | |
| | Lowd-Meek Attack [44] | X [55] | | | | | |
| | Training Data Extraction [18] | X [18] | X | | | | |
| | Ateniese et al. Attack [4] | X [4] | X | | | | |
| | Path-finding Attack [49] | X | X | | | | |
| | Equation-solving Attack [49] | X | | | | | |
| | Hyperparameter Stealing [73] | X [73] | | | | | |
| A-R | Classifier Evasion [11] | X [7, 48] | | [20, 48, 55, 68] | X | | |
| | Adversarial Examples [69] | X [15] | | [15, 17, 52, 53, 54, 55] | X [25, 64] | | |
| | Feature Deletion [28] | X | | | X | | |
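For illustration, the category names and the grouping of Table 2 can be written down as a small data structure (ours, purely illustrative): the life-cycle phase ("T" or "A") is combined with the initials of the main security goals breached.

```python
# Illustrative only: the grouping of Table 2 as a plain data structure.
def category_name(phase: str, main_goals: str) -> str:
    """phase: 'T' (training) or 'A' (after-training); main_goals: goal initials."""
    return f"{phase}-{main_goals}"

categories = {
    category_name("T", "IR"): [  # training time, integrity + reliability
        "Poisoning Attack [60]", "Adversarial Label Flips [76]",
        "Strategically-timed Attack [40]", "Enchanting Attack [40]",
        "Obfuscation Attack [13]"],
    category_name("A", "IR"): ["Trojan Trigger [43]"],
    category_name("A", "C"): [  # after-training time, confidentiality
        "Model Inversion [26]", "Membership Inference [63]",
        "Side-channel Attack [74]", "Lowd-Meek Attack [44]",
        "Training Data Extraction [18]", "Ateniese et al. Attack [4]",
        "Path-finding Attack [49]", "Equation-solving Attack [49]",
        "Hyperparameter Stealing [73]"],
    category_name("A", "R"): [  # after-training time, reliability
        "Classifier Evasion [11]", "Adversarial Examples [69]",
        "Feature Deletion [28]"],
}

print(category_name("A", "C"), "->", categories["A-C"])
```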
First of all, it is noticeable that all attacks at training time affect both integrity and reliability. This also makes sense immediately: if only the integrity were corrupted during training time, the system could be corrected to conform to the specification via the existing reliability. If only the reliability were corrupted, the unchanged behavior would result in a difference to the specification, which would result in a correction of the specification. Only a simultaneous attack on both security goals can therefore be successful during the training phase. Confidentiality is not a main security goal for attacks during the training phase, but most of the identified attacks have attacked confidentiality as well. However, successful attacks during the training phase that relate exclusively to integrity and reliability would also be conceivable. Attacking the security goal availability makes no sense during the training phase.

Attacks on integrity and reliability during the deployment phase are theoretically meaningful and have been published pertinently. They represent the mirroring of attacks on integrity and reliability from the training phase. An essential group with a particularly large number of published attacks in the deployment phase refers to confidentiality. The fact that these attacks are often accompanied by restrictions in availability is rather a side effect than a main aspect. A category of attacks on ML components that mainly refers to availability (think of DoS attacks on traditional software) makes little sense in theory and has not been published. The frequently cited adversarial examples attack group is, among others, in the category of reliability attacks during the deployment phase; typically, integrity is not corrupted because the ML components themselves are not modified.

The lack of assignments to the security goals authenticity and accountability is also particularly informative. In our research we could not find any attacks on these security goals of the ML components. Authenticity is usually implemented in the environmental components surrounding an ML component. This will probably change in the future, however, when comprehensive tasks are implemented in a network of ML components and it becomes necessary to establish the ML components as mission-critical communication partners. Accountability of ML is considered—even in the community of ML experts—to be mostly inaccessible (especially with so-called black box ML components such as deep neural networks), because these components cannot be read like traditional software and cannot be semantically deduced from their structure. Nevertheless, we believe that a new field of attacks on ML components will open up here in the future, because initiatives such as eXplainable AI (layer-wise relevance propagation [16], Black Box Explanations through Transparent Approximations (BETA) [37], LIME [58], Generalized Additive Models (GAM) [19], etc.) and the political demand for comprehensible AI decisions will ensure greater comprehensibility in the area of black box ML, which will ultimately also help the attackers.

The Peculiarity of Accountability

It is yet unclear how the concept of accountability applies to ML. Accountability in traditional software engineering means an action can always be retraced to the entity performing the action. An entity is usually a human or a digital agent; however, the definition of an entity is not clear in the field of ML. An entity could be an input feature which leads to a certain output of the ML model (this meets the definition made by Papernot et al. [56]). An entity could also be an element within the ML model, e.g., each single neuron within a neural network, which makes its own decision that influences the final output of the model. From a different point of view, even the software developer could be considered the entity.

The entity, which cannot deny an action, is ultimately relevant in a legal context, namely in case of finding the party liable for a specific action. It is not relevant, however, how a single element of an algorithm contributed to the system's decision, but whether the wrong decision was caused by faulty training, biases in the training data, or malicious attacks.

We find that there is no clear definition of accountability and that it is difficult to transfer existing definitions to the field of ML. In order to guarantee accountability at all, changes in the system must be recorded; in traditional software this could be, e.g., changes in the database. Without a form of audit that promises some form of tracing, accountability cannot be broken, because the goal was not even reached in the first place. With an ML system, the changes within the system do not necessarily have to be recorded. Rather, the decisions of the system or of parts of the system should be made assignable to a distinct entity.

In the context of ML, a distinction between accountability and liability should be considered. Both focus on retracing an action to an entity. Liability, however, concentrates on the assignment of blame or debt relief of individual entities and is also possible without an audit of the actions and decisions made by inner components within the ML algorithm. For liability it is sufficient to record solely the final decision of the ML system.

Accountability, on the other hand, is only possible by logging the internal processes. The definition of an "entity", however, is still unclear. Furthermore, logging requires a certain understanding of the model, which is difficult up until now. However, if ML algorithms become comprehensible in the future, accountability could be achievable, and this also means that accountability—as a security goal—can be broken by attackers.

Assume it will be possible to identify which nodes in a neural network are responsible for a particular decision, e.g., we know which nodes in an image recognition system are responsible for detecting certain objects, such as stop signs. If these nodes are regarded as entities, they can be made accountable for their decisions. Accountability allows ML algorithms to be developed and validated more efficiently, maybe even to the point where they become similar to the code of traditional software development. This is desirable in any case, as it greatly simplifies development and troubleshooting. If this knowledge about accountability is leaked, adversaries can also take advantage of it and launch more targeted attacks, which might ultimately also target accountability. A breach in accountability will most likely be the first step to sophisticated attacks that violate other security goals as well.
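As a purely hypothetical sketch of this idea (ours, not a method from the paper or from the cited XAI tools), the snippet below attributes a single decision of a tiny, fixed two-layer network to its hidden units and input features with simple contribution and gradient-times-input scores. Real explainability methods such as LRP [16] or LIME [58] are considerably more sophisticated; all weights and names here are made up for illustration.

```python
# Toy attribution: which hidden unit / input feature "answers" for one decision?
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 4 units, 3 inputs
w2, b2 = rng.normal(size=4), 0.0                # scalar output ("stop sign score")

x = np.array([0.8, -0.2, 0.5])
h = np.tanh(W1 @ x + b1)        # hidden activations
score = w2 @ h + b2             # decision score for this input

# Responsibility of each hidden unit: its direct contribution to the score.
unit_contrib = w2 * h

# Responsibility of each input feature: gradient of the score times the input.
grad_x = W1.T @ (w2 * (1.0 - h**2))   # d(score)/dx through the tanh layer
feature_contrib = grad_x * x

print("score:", round(float(score), 3))
print("most responsible hidden unit:", int(np.argmax(np.abs(unit_contrib))))
print("most responsible input feature:", int(np.argmax(np.abs(feature_contrib))))
```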
It is unclear, though, what types of attacks might be possible once ML models can be fully explained to humans.

Related Work

Barreno et al. [7] give relevant properties they consider important when conducting attacks on ML. The properties are grouped into three categories: the influence of the attack on the target system, the specificity (targeted or untargeted), and the security violation (integrity, availability). Their paper focuses mostly on countermeasures against attacks. Papernot et al. [55] also review attacks and distinguish them into black-box and white-box attacks. They focus on attacks on classification algorithms and list theoretical countermeasures. Liu et al. [42] also discuss different attacks and propose interesting points to consider in future research. Biggio et al. [15] take a different view on attacks on ML. They focus on how the field has developed during the years since its first mention in 2004. They also review published countermeasures. Alabdulmohsin et al. [2] sort attacks into causative or exploratory attacks. A survey of attacks against deep learning in computer vision was conducted by Akhtar and Mian [1]. They list several published countermeasures against adversarial examples. Laskov and Kloft [38] propose a "framework for quantitative security analysis of ML models".

Conclusion

In this paper we give an overview of current state-of-the-art ML algorithms and their respective attacks. This list is especially interesting when considering some of the more critical fields ML is used in, such as autonomous driving. Autonomous driving uses ML models in safety-critical applications. Ignoring known attacks on pertinent ML algorithms is hazardous, as human life is at stake. As in regular software development, security by design has to be applied to the development of ML algorithms as well.

We also propose a classification of published attacks on ML models based on security goals and life cycle phase. Our research shows that accountability is not covered by the literature, as no attacks on it have been published yet. This is probably due to the fact that accountability for ML is difficult to attack, as ML models are yet beyond human understanding and, therefore, the security goal is not compulsory.

Although there are already some papers working on solutions to improve the comprehensibility of ML models, we think there is still a long way to go until humans are able to completely understand ML models. If accountability can be guaranteed for all kinds of ML models, this will enable a wide range of new, yet unknown attacks.

Further research will elaborate on the implications of vulnerable ML models. It will also discuss whether and how the security goal accountability can be transferred to the field of ML and whether proper accountability of ML models has to be considered in liability claims.

Acknowledgement

Katja Auernhammer and Markus Zoppelt were supported by the BayWISS Consortium Digitization.

References

[1] Naveed Akhtar and Ajmal Mian. "Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey". In: IEEE Access 6 (2018), pp. 14410–14430. DOI: 10.1109/ACCESS.2018.2807385. arXiv: 1801.00553.
[2] Ibrahim M. Alabdulmohsin, Xin Gao, and Xiangliang Zhang. "Adding Robustness to Support Vector Machines Against Adversarial Reverse Engineering". In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14). ACM Press, 2014, pp. 231–240. DOI: 10.1145/2661829.2662047.
[3] Mark Anderson, Andrew Bartolo, and Pulkit Tandon. "Crafting Adversarial Attacks on Recurrent Neural Networks". 2017.
[4] Giuseppe Ateniese, Giovanni Felici, Luigi V. Mancini, Angelo Spognardi, Antonio Villani, and Domenico Vitali. "Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers". 2013. DOI: 10.1504/IJSN.2015.071829. arXiv: 1306.4447.
[5] Shumeet Baluja and Ian Fischer. "Adversarial Transformation Networks: Learning to Generate Adversarial Examples". 2017. arXiv: 1703.09387.
[6] Shumeet Baluja and Ian Fischer. "Learning to Attack: Adversarial Transformation Networks". In: Association for the Advancement of Artificial Intelligence (AAAI '18). 2018.
[7] Marco Barreno, Blaine Nelson, Russell Sears, Anthony D. Joseph, and J. D. Tygar. "Can machine learning be secure?" In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security (ASIACCS '06). ACM Press, 2006. DOI: 10.1145/1128817.1128824.
[8] Alex Beatson, Zhaoran Wang, and Han Liu. "Blind Attacks on Machine Learners". In: 30th Conference on Neural Information Processing Systems (NIPS 2016), pp. 2397–2405.
[9] Battista Biggio, Samuel Rota Bulò, Ignazio Pillai, Michele Mura, Eyasu Zemene Mequanint, Marcello Pelillo, and Fabio Roli. "Poisoning Complete-Linkage Hierarchical Clustering". In: ed. by Ana Fred, Terry M. Caelli, Robert P. W. Duin, Aurélio C. Campilho, and Dick de Ridder. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg. DOI: 10.1007/b98738.
[10] Battista Biggio, Igino Corona, Giorgio Fumera, Giorgio Giacinto, and Fabio Roli. "Bagging classifiers for fighting poisoning attacks in adversarial classification tasks". In: Lecture Notes in Computer Science 6713 (2011), pp. 350–359. DOI: 10.1007/978-3-642-21557-5_37.
[11] Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Pavel Laskov, Giorgio Giacinto, and Fabio Roli. "Evasion Attacks against Machine Learning at Test Time". In: ECML PKDD (2013), pp. 387–402. DOI: 10.1007/978-3-642-40994-3_25.
[12] Battista Biggio, Blaine Nelson, and Pavel Laskov. "Poisoning Attacks against Support Vector Machines". In: Proceedings of the 29th International Conference on Machine Learning (2012). arXiv: 1206.6389.
[13] Battista Biggio, Ignazio Pillai, Samuel Rota Bulò, Davide Ariu, Marcello Pelillo, and Fabio Roli. "Is data clustering in adversarial settings secure?" In: Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security (AISec '13), pp. 87–98. DOI: 10.1145/2517312.2517321.
[14] Battista Biggio, Konrad Rieck, Davide Ariu, Christian Wressnegger, Igino Corona, Giorgio Giacinto, and Fabio Roli. "Poisoning behavioral malware clustering". In: Proceedings of the 2014 Workshop on Artificial Intelligent and Security (AISec '14). ACM Press, 2014, pp. 27–36. DOI: 10.1145/2666652.2666666.
[15] Battista Biggio and Fabio Roli. "Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning". In: Pattern Recognition 84 (Dec. 2017), pp. 317–331. DOI: 10.1016/j.patcog.2018.07.023. arXiv: 1712.03141.
[16] Alexander Binder, Sebastian Bach, Gregoire Montavon, Klaus-Robert Müller, and Wojciech Samek. "Layer-wise relevance propagation for deep neural network architectures". In: Lecture Notes in Electrical Engineering 376 (2016), pp. 913–922. DOI: 10.1007/978-981-10-0557-2_87.
[17] Wieland Brendel, Jonas Rauber, and Matthias Bethge. "Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models". Dec. 2017. arXiv: 1712.04248.
[18] Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, and Dawn Song. "The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets". Feb. 2018. arXiv: 1802.08232.
[19] Rich Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. "Intelligible Models for HealthCare". In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 1721–1730. DOI: 10.1145/2783258.2788613.
[20] Lingwei Chen, Yanfang Ye, and Thirimachos Bourlai. "Adversarial machine learning in malware detection: Arms race between evasion attack and defense". In: Proceedings of the 2017 European Intelligence and Security Informatics Conference (EISIC 2017), pp. 99–106. DOI: 10.1109/EISIC.2017.21.
[21] Fabiano Dalpiaz, Elda Paja, and Paolo Giorgini. Security Requirements Engineering: Designing Secure Socio-Technical Systems. MIT Press, 2016.
[22] Nilesh Dalvi, Pedro Domingos, Mausam, Sumit Sanghai, and Deepak Verma. "Adversarial classification". In: Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04). ACM Press, 2004. DOI: 10.1145/1014052.1014066.
[23] Hung Dang, Yue Huang, and Ee-Chien Chang. "Evading Classifiers by Morphing in the Dark". In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS '17), pp. 119–133. DOI: 10.1145/3133956.3133978. arXiv: 1705.07535.
[24] Ambra Demontis, Paolo Russu, Battista Biggio, Giorgio Fumera, and Fabio Roli. "On security and sparsity of linear classifiers for adversarial settings". In: Lecture Notes in Computer Science 10029 (2016), pp. 322–332. DOI: 10.1007/978-3-319-49055-7_29. arXiv: 1709.00045.
[25] Alhussein Fawzi, Seyed Mohsen Moosavi-Dezfooli, and Pascal Frossard. "The Robustness of Deep Networks: A Geometrical Perspective". In: IEEE Signal Processing Magazine 34.6 (2017), pp. 50–62. DOI: 10.1109/MSP.2017.2740965.
[26] Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. "Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures". In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15), pp. 1322–1333. DOI: 10.1145/2810103.2813677.
[27] Matt Fredrikson, Eric Lantz, Somesh Jha, Simon Lin, David Page, and Thomas Ristenpart. "Privacy in Pharmacogenetics: An End-to-End Case Study of Personalized Warfarin Dosing". In: Proceedings of the 23rd USENIX Security Symposium (2014), pp. 17–32.
[28] Amir Globerson and Sam Roweis. "Nightmare at test time: robust learning by feature deletion". In: Proceedings of the 23rd International Conference on Machine Learning (2006), pp. 353–360. DOI: 10.1145/1143844.1143889.
[29] Abigail Graese, Andras Rozsa, and Terrance E. Boult. "Assessing threat of adversarial examples on deep neural networks". In: Proceedings of the 15th IEEE International Conference on Machine Learning and Applications (ICMLA 2016), pp. 69–74. DOI: 10.1109/ICMLA.2016.44. arXiv: 1610.04256.
[30] Yi Han and Benjamin I. P. Rubinstein. "Adequacy of the Gradient-Descent Method for Classifier Evasion Attacks". 2017. arXiv: 1704.01704.
[31] Jamie Hayes and George Danezis. "Machine Learning as an Adversarial Service: Learning Black-Box Adversarial Examples". 2017. arXiv: 1708.05207.
[32] Briland Hitaj, Giuseppe Ateniese, and Fernando Perez-Cruz. "Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning". In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS '17), pp. 603–618. DOI: 10.1145/3133956.3134012. arXiv: 1702.07464.
[33] Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, and Pieter Abbeel. "Adversarial Attacks on Neural Network Policies". Feb. 2017. arXiv: 1702.02284.
[34] Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. "Query-Efficient Black-box Adversarial Examples". Dec. 2017. arXiv: 1712.07113.
[35] Matthew Jagielski, Alina Oprea, Battista Biggio, Chang Liu, Cristina Nita-Rotaru, and Bo Li. "Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning". In: 2018 IEEE Symposium on Security and Privacy (SP), pp. 19–35. DOI: 10.1109/SP.2018.00057. arXiv: 1804.00308.
[36] Ricky Laishram and Vir Virander Phoha. "Curie: A method for protecting SVM Classifier from Poisoning Attack". June 2016. arXiv: 1606.01584.
[37] Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec. "Interpretable & Explorable Approximations of Black Box Models". 2017. arXiv: 1707.01154.
[38] Pavel Laskov and Marius Kloft. "A framework for quantitative security analysis of machine learning". In: Proceedings of the 2nd ACM Workshop on Security and Artificial Intelligence (AISec '09). ACM Press, 2009. DOI: 10.1145/1654988.1654990.
[39] Bo Li, Yining Wang, Aarti Singh, and Yevgeniy Vorobeychik. "Data Poisoning Attacks on Factorization-Based Collaborative Filtering". In: 29th Conference on Neural Information Processing Systems (NIPS 2016). arXiv: 1608.08182.
[40] Yen Chen Lin, Zhang Wei Hong, Yuan Hong Liao, Meng Li Shih, Ming Yu Liu, and Min Sun. "Tactics of adversarial attack on deep reinforcement learning agents". In: IJCAI International Joint Conference on Artificial Intelligence (2017), pp. 3756–3762. arXiv: 1703.06748.
[41] Chang Liu, Bo Li, Yevgeniy Vorobeychik, and Alina Oprea. "Robust Linear Regression Against Training Data Poisoning". In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec '17), pp. 91–102. DOI: 10.1145/3128572.3140447.
[42] Qiang Liu, Pan Li, Wentao Zhao, Wei Cai, Shui Yu, and Victor C. M. Leung. "A survey on security threats and defensive techniques of machine learning: A data driven view". In: IEEE Access 6 (2018), pp. 12103–12117. DOI: 10.1109/ACCESS.2018.2805680.
[43] Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. "Trojaning Attack on Neural Networks". In: NDSS 2018 (Network and Distributed System Security Symposium), Feb. 2018. DOI: 10.14722/ndss.2018.23291.
[44] Daniel Lowd and Christopher Meek. "Adversarial learning". In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD '05). DOI: 10.1145/1081870.1081950.
[45] Seyed Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. "Universal adversarial perturbations". In: Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), pp. 86–94. DOI: 10.1109/CVPR.2017.17. arXiv: 1705.09554.
[46] Mehran Mozaffari-Kermani, Susmita Sur-Kolay, Anand Raghunathan, and Niraj K. Jha. "Systematic poisoning attacks on and defenses for machine learning in healthcare". In: IEEE Journal of Biomedical and Health Informatics 19.6 (2015), pp. 1893–1905. DOI: 10.1109/JBHI.2014.2344095.
[47] Luis Muñoz-González, Battista Biggio, Ambra Demontis, Andrea Paudice, Vasin Wongrassamee, Emil C. Lupu, and Fabio Roli. "Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization". In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec '17), pp. 27–38. DOI: 10.1145/3128572.3140451. arXiv: 1708.08689.
[48] Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar, and Kai Xia. "Exploiting machine learning to subvert your spam filter". In: Proceedings of the First Workshop on Large-scale Exploits and Emerging Threats (LEET 2008), Article 7.
[49] Tam N. Nguyen. "Attacking Machine Learning models as part of a cyber kill chain". 2017. arXiv: 1705.00564.
[50] Andrew P. Norton and Yanjun Qi. "Adversarial-Playground: A visualization suite showing how adversarial examples fool deep learning". In: 2017 IEEE Symposium on Visualization for Cyber Security (VizSec). DOI: 10.1109/VIZSEC.2017.8062202. arXiv: 1708.00807.
[51] Nicolas Papernot. "Characterizing the Limits and Defenses of Machine Learning in Adversarial Settings". 2018.
[52] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. "Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples". 2016. arXiv: 1605.07277.
[53] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z. Berkay Celik, and Ananthram Swami. "Practical Black-Box Attacks against Machine Learning". Feb. 2016. DOI: 10.1145/3052973.3053009. arXiv: 1602.02697.
[54] Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami. "The limitations of deep learning in adversarial settings". In: Proceedings of the 2016 IEEE European Symposium on Security and Privacy (EuroS&P 2016), pp. 372–387. DOI: 10.1109/EuroSP.2016.36. arXiv: 1511.07528.
[55] Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael Wellman. "SoK: Towards the Science of Security and Privacy in Machine Learning". Nov. 2016. arXiv: 1611.03814.
[56] Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael P. Wellman. "SoK: Security and Privacy in Machine Learning". In: 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 399–414. DOI: 10.1109/EuroSP.2018.00035. arXiv: 1611.03814.
[57] Nicolas Papernot, Patrick McDaniel, Ananthram Swami, and Richard Harang. "Crafting adversarial input sequences for recurrent neural networks". In: MILCOM 2016 - 2016 IEEE Military Communications Conference, pp. 49–54. DOI: 10.1109/MILCOM.2016.7795300. arXiv: 1604.08275.
[58] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. ""Why Should I Trust You?": Explaining the Predictions of Any Classifier". In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), pp. 1135–1144. DOI: 10.1145/2939672.2939778. arXiv: 1602.04938.
[59] Amir Rosenfeld, Richard Zemel, and John K. Tsotsos. "The Elephant in the Room". 2018. arXiv: 1808.03305.
[60] Benjamin I. P. Rubinstein, Blaine Nelson, Ling Huang, Anthony D. Joseph, Shing-hon Lau, Satish Rao, Nina Taft, and J. D. Tygar. "ANTIDOTE: Understanding and Defending against Poisoning of Anomaly Detectors". In: Proceedings of the 9th ACM SIGCOMM Internet Measurement Conference (IMC '09). ACM Press, 2009. DOI: 10.1145/1644893.1644895.
[61] Paolo Russu, Ambra Demontis, Battista Biggio, Giorgio Fumera, and Fabio Roli. "Secure Kernel Machines against Evasion Attacks". In: Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security (AISec '16), pp. 59–69. DOI: 10.1145/2996758.2996771.
[62] Ali Shafahi, W. Ronny Huang, Mahyar Najibi, Octavian Suciu, Christoph Studer, Tudor Dumitras, and Tom Goldstein. "Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks". Apr. 2018. arXiv: 1804.00792.
[63] Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. "Membership Inference Attacks Against Machine Learning Models". In: Proceedings of the 2017 IEEE Symposium on Security and Privacy, pp. 3–18. DOI: 10.1109/SP.2017.41. arXiv: 1610.05820.
[64] D. B. Skillicorn. "Adversarial Knowledge Discovery". In: IEEE Intelligent Systems 24.6 (2009), pp. 1–13. DOI: 10.1109/MIS.2009.108.
[65] Congzheng Song, Thomas Ristenpart, and Vitaly Shmatikov. "Machine Learning Models that Remember Too Much". 2017. DOI: 10.1145/3133956.3134077. arXiv: 1709.07886.
[66] Nedim Šrndić and Pavel Laskov. "Practical evasion of a learning-based classifier: A case study". In: Proceedings of the 2014 IEEE Symposium on Security and Privacy, pp. 197–211. DOI: 10.1109/SP.2014.20.
[67] Jacob Steinhardt, Pang Wei Koh, and Percy Liang. "Certified Defenses for Data Poisoning Attacks". June 2017. arXiv: 1706.03691.
[68] Rock Stevens, Octavian Suciu, Andrew Ruef, Sanghyun Hong, Michael Hicks, and Tudor Dumitraş. "Summoning Demons: The Pursuit of Exploitable Bugs in Machine Learning". Jan. 2017. arXiv: 1701.04739.
[69] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. "Intriguing properties of neural networks". Dec. 2013, pp. 1–10. arXiv: 1312.6199.
[70] Pedro Tabacof and Eduardo Valle. "Exploring the space of adversarial images". In: Proceedings of the International Joint Conference on Neural Networks (IJCNN 2016), pp. 426–433. DOI: 10.1109/IJCNN.2016.7727230. arXiv: 1510.05328.
[71] Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. "The Space of Transferable Adversarial Examples". 2017. arXiv: 1704.03453.
[72] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart. "Stealing Machine Learning Models via Prediction APIs". In: Proceedings of the 25th USENIX Security Symposium (2016), pp. 601–618. arXiv: 1609.02943.
[73] Binghui Wang and Neil Zhenqiang Gong. "Stealing Hyperparameters in Machine Learning". In: Proceedings of the 2018 IEEE Symposium on Security and Privacy, pp. 36–52. DOI: 10.1109/SP.2018.00038. arXiv: 1802.05351.
[74] Lingxiao Wei, Yannan Liu, Bo Luo, Yu Li, and Qiang Xu. "I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators". 2018. arXiv: 1803.05847.
[75] Xi Wu, Matthew Fredrikson, Somesh Jha, and Jeffrey F. Naughton. "A methodology for formalizing model-inversion attacks". In: Proceedings of the 2016 IEEE Computer Security Foundations Symposium. DOI: 10.1109/CSF.2016.32.
[76] Han Xiao, Huang Xiao, and Claudia Eckert. "Adversarial label flips attack on support vector machines". In: Frontiers in Artificial Intelligence and Applications 242 (2012), pp. 870–875. DOI: 10.3233/978-1-61499-098-7-870.
[77] Huang Xiao, Battista Biggio, Blaine Nelson, Han Xiao, Claudia Eckert, and Fabio Roli. "Support vector machines under adversarial label contamination". In: Neurocomputing 160 (2015), pp. 53–62. DOI: 10.1016/j.neucom.2014.08.081.
[78] Zhizhou Yin, Fei Wang, Wei Liu, and Sanjay Chawla. "Sparse Feature Attacks in Adversarial Learning". In: IEEE Transactions on Knowledge and Data Engineering 30.6 (2018), pp. 1164–1177. DOI: 10.1109/TKDE.2018.2790928.