<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Infinity-norm Support Vector Machines against Adversarial Label Contamination</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ambra Demontis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Battista Biggio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio Fumera</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio Giacinto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Roli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Electrical and Electronic Eng. University of Cagliari</institution>
          ,
          <addr-line>Piazza d'Armi, 09123, Cagliari, Italy ambra.demontis, battista.biggio, fumera, giacinto, roli @diee.unica.it</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>106</fpage>
      <lpage>115</lpage>
      <abstract>
        <p>Nowadays machine-learning algorithms are increasingly being applied in security-related applications like spam and malware detection, aiming to detect never-before-seen attacks and novel threats. However, such techniques may expose specific vulnerabilities that may be exploited by carefully-crafted attacks. Support Vector Machines (SVMs) are a well-known and widely-used learning algorithm. They make their decisions based on a subset of the training samples, known as support vectors. We first show that this behaviour poses risks to system security, if the labels of a subset of the training samples can be manipulated by an intelligent and adaptive attacker. We then propose a countermeasure that can be applied to mitigate this issue, based on infinity-norm regularization. The underlying rationale is to increase the number of support vectors and balance more equally their contribution to the decision function, to decrease the impact of the contaminating samples during training. Finally, we empirically show that the proposed defence strategy, referred to as Infinity-norm SVM, can significantly improve classifier security under malicious label contamination in a real-world classification task involving malware detection.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
Spam filters allow users to correct the label (spam or legitimate) automatically assigned to
incoming emails, if wrong; an attacker may exploit this feature by creating an email account on a
provider protected by the targeted spam filter, and then purposely mislabeling incoming emails
that will be subsequently used to retrain the classifier, to gradually poison classifier training.
Another instance is PDFRate,1 which is an online tool for detecting malware embedded into
PDF files [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]: an attacker may provide wrong feedback to the system, which amounts to
manipulating the labels of (future) training samples. Since collecting labels from domain experts
is usually costly, crowdsourcing systems like Amazon Mechanical Turk are being used to assign
this task to non-expert individuals: this scenario may be exploited by attackers to provide
wrong labels. “Malicious crowdsourcing” or “crowdturfing” services are growing in
popularity: Internet users are paid to perform profitable malicious tasks, like spam dissemination,
including polluting the data used as training samples by machine learning systems [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        The above examples show that understanding label-flip attacks more thoroughly and finding
effective countermeasures is a very relevant research topic. In this paper we focus on label-flip
attacks against Support Vector Machines (SVMs), a state-of-the-art and widely used
classifier. Previous work, summarized in Sect. 2.2, has shown that SVM classification accuracy
decreases in the presence of label noise (even non-adversarial), and that some SVM variants
are more robust under random label flips. To our knowledge the only specific countermeasure
against label-flip attacks has been proposed in our previous work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]: it is a heuristic approach
that enforces the classifier to evenly weigh all the training samples, to increase the stability
of the decision function with respect to changes of the training labels. In this work we give
theoretical support to the above approach, and propose a general, more theoretically-sound
countermeasure (Sect. 3) rooted in recent findings about the relationship between regularized
and robust optimization (Sect. 2.3), which also reduces the complexity of SVM training. In
Sect. 4 we validate our approach on artificial and real-world data sets. In Sect. 5 we review
related work, and in Sect. 6 we discuss the main contributions of this work and some interesting
research directions.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>We first describe two label flip strategies we shall consider in our experiments. Then we review
state-of-the-art strategies that may be used to improve SVM security under label flip attacks.
We finally overview works highlighting a link between regularization and robustness, that will
provide a formal support to the approach we propose in Sect. 3.</p>
<p>In the following we denote by $\{(x_i, y_i)\}_{i=1}^n$ the training set, where $x_i \in \mathbb{R}^d$ is the feature
vector of the $i$-th sample and $y_i \in \{-1, +1\}$ its label (respectively, for legitimate and malicious
samples). The decision function of a trained SVM classifier (using a nonlinear kernel) is
$g(x) = \sum_{i=1}^n \alpha_i y_i k(x, x_i) + b$, where $k(\cdot, \cdot)$ is the kernel function, and $\{\alpha_i\}_{i=1}^n$ and $b$ are coefficients set
by the learning algorithm.</p>
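      <p>For concreteness, the decision function above can be evaluated as in the following minimal Python sketch, assuming a trained classifier's coefficients are available; the RBF kernel is chosen here purely for illustration.</p>
      <preformat>
import numpy as np

def rbf_kernel(x, z, sigma=1.0):
    # k(x, z) = exp(-||x - z||^2 / (2 sigma^2)); any valid kernel works here
    return np.exp(-np.linalg.norm(x - z) ** 2 / (2.0 * sigma ** 2))

def decision_function(x, X_train, y_train, alpha, b, kernel=rbf_kernel):
    # g(x) = sum_i alpha_i * y_i * k(x, x_i) + b
    return sum(a * yi * kernel(x, xi)
               for a, yi, xi in zip(alpha, y_train, X_train)) + b
      </preformat>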
      <sec id="sec-2-1">
        <title>Label-Flip Attacks</title>
        <p>We consider two different kinds of label flip attacks. In both cases we set a constraint to the
fraction of labels the adversary can change, to reflect a likely limitation in real-world scenarios.
<p>Random label flip is a baseline attack, which consists of flipping the labels of a randomly-chosen
fraction of training samples, without exploiting any knowledge of the targeted classifier.</p>
<p>$^1$Available at: http://pdfrate.com</p>
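        <p>A minimal sketch of the random label-flip attack, assuming labels in {-1, +1} and an attacker's budget expressed as a fraction of the training set; the function and parameter names below are our own.</p>
        <preformat>
import numpy as np

def random_label_flip(y, fraction, seed=None):
    # flip the labels of a randomly-chosen fraction of training samples,
    # without using any knowledge of the targeted classifier
    rng = np.random.default_rng(seed)
    n_flips = int(round(fraction * len(y)))
    idx = rng.choice(len(y), size=n_flips, replace=False)
    y_flipped = y.copy()
    y_flipped[idx] = -y_flipped[idx]
    return y_flipped
        </preformat>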
        <p>
          Adversarial Label-Flip Attack (ALFA-Tilt) is a different attack proposed in [
          <xref ref-type="bibr" rid="ref21 ref5">5, 21</xref>
          ],
and assumes a skilled attacker whose aim is to maximize the classifier error on untainted
(testing) data. Since finding the subset of samples whose label flipping maximizes the testing error
is a non-trivial problem, the authors devised a heuristic approach that maximizes a surrogate
measure of the testing error, namely, the angle between the decision hyperplane found by the
untainted classifier and the one under attack.
        </p>
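        <p>To make the heuristic concrete, below is a simplified greedy stand-in for ALFA-tilt (not the authors' exact algorithm): at each step it flips the single label whose flip, after retraining, maximizes the tilt, i.e., minimizes the cosine between the clean and the poisoned hyperplane normals. It retrains one SVM per candidate flip, so it is only practical on small training sets.</p>
        <preformat>
import numpy as np
from sklearn.svm import LinearSVC

def alfa_tilt_greedy(X, y, budget, C=1.0):
    # simplified, retraining-heavy stand-in for the ALFA-tilt heuristic
    # of [5, 21]; illustrative only
    w0 = LinearSVC(C=C).fit(X, y).coef_.ravel()
    w0 = w0 / np.linalg.norm(w0)  # normal of the untainted hyperplane

    def cosine_after_flip(i, y_cur):
        y_try = y_cur.copy()
        y_try[i] = -y_try[i]
        w = LinearSVC(C=C).fit(X, y_try).coef_.ravel()
        return float(w0 @ w / np.linalg.norm(w))

    y_adv, remaining = y.copy(), set(range(len(y)))
    for _ in range(budget):
        # smallest cosine = largest angle between the two hyperplanes
        i = min(remaining, key=lambda j: cosine_after_flip(j, y_adv))
        y_adv[i] = -y_adv[i]
        remaining.discard(i)
    return y_adv
        </preformat>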
        <p>Different attack scenarios can be simulated, depending on the attacker’s level of knowledge of
the system: either perfect knowledge, if the attacker exactly knows the coefficients of the SVM
decision function, or limited knowledge, if she is only capable of creating a data set sampled
from the same distribution of the one used for training the original classifier, and then training
a surrogate classifier for estimating the original decision function. In this work we consider the
worst case of perfect knowledge, although in real scenarios the attacker is likely to have only a
limited knowledge of the system.
</p>
      </sec>
      <sec id="sec-2-2">
        <title>SVM Variants</title>
        <p>A possible countermeasure to label-flip attacks is to enforce the decision function of an SVM
to weigh more uniformly the contribution of each training sample to the decision hyperplane.
The reason is that this would decrease the impact of each single point during learning of the
decision function. Two SVM variants can be used to this aim.</p>
        <p>
          Least-Squares SVM (LS-SVM). This SVM variant [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] uses a quadratic loss function
instead of the hinge loss. This makes its solution non-sparse, i.e., all the training samples are
assigned a non-null α value. In particular, the LS-SVM (primal) learning problem is:
$$\min_{w, b, e} \; \frac{1}{2} w^\top w + \frac{\gamma}{2} \sum_{i=1}^n e_i^2 \quad \text{s.t.} \;\; y_i = w^\top \phi(x_i) + b + e_i \;\; \forall i \,, \quad (1)$$
where $\phi$ is the kernel-induced feature mapping, and $w$ the set of primal weights. Recall that,
as in SVM learning, $w = \sum_{i=1}^n \alpha_i y_i \phi(x_i)$ and $k(x_i, x_j) = \phi(x_i)^\top \phi(x_j)$, which enables learning
of nonlinear decision functions in input space by solving the corresponding dual optimization
problem (i.e., optimizing directly the dual variables $\alpha$ instead of $w$).</p>
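        <p>In practice, the LS-SVM dual reduces to a single linear system, which is one reason why training is cheap and the solution is dense. A minimal sketch, assuming a precomputed kernel matrix K and the regression-style formulation of Eq. (1), in which each α coefficient absorbs the corresponding label:</p>
        <preformat>
import numpy as np

def train_ls_svm(K, y, gamma=1.0):
    # KKT conditions of Eq. (1) reduce to the (n+1)x(n+1) linear system
    #   [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y],
    # so every training sample gets a (generally non-zero) alpha;
    # here alpha_i plays the role of y_i * alpha_i in the SVM convention
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], y.astype(float)])
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    return alpha, b   # g(x) = sum_i alpha_i k(x, x_i) + b
        </preformat>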
        <p>
          Label Noise Robust SVM (LN-robust SVM). This is another SVM variant proposed in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]
against label flip attacks. It assumes that the label of each training sample can be independently
flipped with the same probability μ. The probability of label flips is then encoded into the kernel
matrix, which is involved in the dual SVM learning problem. The expected value of the modified
kernel matrix (which is still positive-semidefinite) is then used for solving the standard SVM
learning problem. It turns out that, by increasing the flip variance S = μ(1 − μ), the variance of the
coefficients αi decreases; accordingly, each training sample is more likely to become a support
vector, providing a more balanced contribution to the decision function. This approach only
requires a simple correction to the kernel matrix with respect to standard SVM. It is however a
heuristic solution, which also requires one to be able to reliably estimate the fraction of potential
label flips in the training data.
        </p>
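        <p>As a hedged illustration of this correction, consider the expectation of the kernel-label products under independent flips with probability $\mu$: $\mathbb{E}[\tilde{y}_i \tilde{y}_j] = (1 - 2\mu)^2$ for $i \neq j$, and $1$ for $i = j$, which shrinks the off-diagonal kernel entries while keeping the diagonal. The sketch below implements this expectation; the exact correction used in [5] may differ in its details.</p>
        <preformat>
import numpy as np

def ln_robust_kernel(K, mu):
    # expected kernel matrix under independent label flips with prob. mu:
    # off-diagonal entries shrink by (1 - 2*mu)^2, the diagonal is kept.
    # Equivalently, K' = (1 - 2*mu)^2 * K + 4*S * diag(K), with
    # S = mu * (1 - mu), which is still positive-semidefinite.
    # Illustrative sketch consistent with the described expectation.
    shrink = (1.0 - 2.0 * mu) ** 2
    return shrink * K + (1.0 - shrink) * np.diag(np.diag(K))
        </preformat>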
      </sec>
      <sec id="sec-2-3">
        <title>Robustness and Regularization</title>
        <p>
          Recently an interesting relationship between regularized and robust optimization problems has
been pointed out [
          <xref ref-type="bibr" rid="ref22">22</xref>
]. Under mild assumptions, the two kinds of problems are equivalent. In
particular, the robust optimization problem considered in [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] is:
$$\min_{w, b} \; \max_{(u_1, \ldots, u_n) \in \mathcal{U}} \; \sum_{i=1}^n \bigl(1 - y_i (w^\top (x_i - u_i) + b)\bigr)_+ \,, \quad (2)$$
where $(z)_+ = \max(0, z)$, $u_1, \ldots, u_n$ are bounded perturbations of the training data, and $\mathcal{U}$ is the so-called uncertainty set. This set is defined as:
$$\mathcal{U} \triangleq \Bigl\{(u_1, \ldots, u_n) \;\Big|\; \sum_{i=1}^n \|u_i\|_* \le c \Bigr\} \,, \quad (3)$$
being $\|\cdot\|_*$ the dual norm of $\|\cdot\|$. Typical examples of uncertainty sets include the $\ell_1$ and $\ell_2$ balls [
          <xref ref-type="bibr" rid="ref17 ref22">22, 17</xref>
          ]. The non-robust, regularized optimization problem is formulated as (cf. Th. 3 in [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]):
$$\min_{w, b} \; c\|w\| + \sum_{i=1}^n \bigl(1 - y_i (w^\top x_i + b)\bigr)_+ \,. \quad (4)$$
        </p>
        <p>
This means that, if the $\ell_1$ norm is chosen as the dual norm characterizing the uncertainty set
$\mathcal{U}$, then the optimal regularizer would be $\ell_\infty$.$^2$ If the attacker can change only a few training
labels, label-flip attacks can be seen as sparse $\ell_1$ noise affecting the training labels.
The optimal countermeasure is therefore to use an $\ell_\infty$ regularizer, to enforce the classifier to give
the same importance to all the training samples.</p>
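        <p>The dual-norm pairing invoked above follows from a one-line computation; a short derivation, using only the definition of the dual norm:</p>
        <preformat>
\|w\|_* \;=\; \max_{\|u\| \le 1} \, u^\top w
\qquad\Longrightarrow\qquad
\max_{\|u\|_1 \le 1} u^\top w \;=\; \max_i |w_i| \;=\; \|w\|_\infty ,

since, under an \ell_1 budget, the maximizing u puts all of its mass on
the coordinate of w with the largest absolute value. Hence \ell_1-bounded
(sparse) noise pairs with \ell_\infty regularization, and vice versa.
        </preformat>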
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Infinity-norm Support Vector Machines</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref15 ref8">8, 15</xref>
        ], based on the findings of Xu et al. [
        <xref ref-type="bibr" rid="ref22">22</xref>
] (Sect. 2.3), we have shown that infinity-norm
($\ell_\infty$) regularization is very effective against sparse evasion attacks, i.e., attacks in which the
attacker modifies only a small subset of the feature values. The reason is that this regularizer
bounds the maximum and minimum values of the feature weights, thus enforcing the SVM
to learn more evenly-distributed weights. Under this setting, it is not difficult to see that the
attacker is required to manipulate more features to evade detection.
      </p>
      <p>
        Label-flip attacks can be seen as sparse attacks in terms of the influenced training points,
since only the labels of a few training samples can be manipulated by the attacker. Our idea is
thus to exploit $\ell_\infty$ regularization to enforce more evenly-distributed $\alpha$ weights on the training
data, similarly to the intuition in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to learn more secure SVMs against adversarial label flips.
In this work, we obtain this effect by training a (linear) Infinity-norm SVM directly in the kernel
space, i.e., using the kernel matrix as the input training data, to learn a discriminant function
of the form $g(x) = \sum_{i=1}^m \alpha_i k(x, x_i) + b$, where $k(\cdot, \cdot)$ is the kernel function, and $\{x_i\}_{i=1}^m$ and
$\{\alpha_i\}_{i=1}^m$ are respectively the training samples and their $\alpha$ weights. Under this setting, the $\alpha$
values and the bias $b$ are obtained by solving the following linear programming problem:
$$\min_{\alpha, b} \; \|\alpha\|_\infty + C \sum_{i=1}^m \bigl(1 - y_i g(x_i)\bigr)_+ \,. \quad (5)$$
      </p>
      <p>Notably, this approach can also be used with kernels that are not necessarily positive
semidefinite (i.e., indefinite kernels).</p>
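      <p>Problem (5) can be cast as a linear program by introducing a variable $t = \|\alpha\|_\infty$ and slack variables $\xi_i$ for the hinge losses. A minimal sketch, assuming a precomputed kernel matrix K; this is our own illustrative implementation, not the authors' code. With $m$ training samples, the LP has $2m+2$ variables and $3m$ inequality constraints.</p>
      <preformat>
import numpy as np
from scipy.optimize import linprog

def train_inf_norm_svm(K, y, C=1.0):
    # illustrative sketch of problem (5); not the authors' implementation.
    # variables z = [alpha (m), b, t, xi (m)], with t = ||alpha||_inf;
    # objective: t + C * sum_i xi_i
    m = K.shape[0]
    c = np.concatenate([np.zeros(m + 1), [1.0], C * np.ones(m)])
    # |alpha_i| bounded by t:  alpha_i - t and -alpha_i - t both nonpositive
    A_abs = np.zeros((2 * m, 2 * m + 2))
    A_abs[:m, :m] = np.eye(m)
    A_abs[:m, m + 1] = -1.0
    A_abs[m:, :m] = -np.eye(m)
    A_abs[m:, m + 1] = -1.0
    # hinge constraints: 1 - y_i (K alpha + b)_i bounded by xi_i
    A_hinge = np.zeros((m, 2 * m + 2))
    A_hinge[:, :m] = -y[:, None] * K
    A_hinge[:, m] = -y
    A_hinge[:, m + 2:] = -np.eye(m)
    A_ub = np.vstack([A_abs, A_hinge])
    b_ub = np.concatenate([np.zeros(2 * m), -np.ones(m)])
    # alpha and b are free; t and xi are non-negative
    bounds = [(None, None)] * (m + 1) + [(0, None)] * (m + 1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    alpha, b = res.x[:m], res.x[m]
    return alpha, b   # g(x) = sum_i alpha_i k(x, x_i) + b
      </preformat>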
    </sec>
    <sec id="sec-4">
      <title>Experimental Analysis</title>
      <p>
        In this section, we first show on a two-dimensional example how the adversarial label-flip attack
(ALFA-tilt) affects the decision function of the different SVM classifiers described in the previous
sections. Then, to account for the fact that the impact of label-flip attacks is strongly
data-dependent, as pointed out in [
        <xref ref-type="bibr" rid="ref9">9</xref>
], we validate our approach on several real-world
datasets, including a case study on PDF malware detection.
      </p>
<p>$^2$The $\ell_1$ norm is the dual norm of $\ell_\infty$, and vice versa.</p>
      <p>[Fig. 1: decision functions $g(x)$ on the two-dimensional example; test errors without attack: 3%, 5%, 3%, 3%; under the ALFA-tilt attack: 60%, 48%, 38%, 15%.]</p>
      <p>
Two-dimensional Example. We consider here a Gaussian dataset with mean [y, 0] (for class
y) and diagonal covariance matrix equal to diag([0.5, 0.5]). We have generated 60 samples for
training and 40 for testing, and used the adversarial label-flip attack to flip 18 training labels.
We have set C = 1 for SVM and LN-SVM, and C = 0.01 for LS-SVM and Infinity-Norm SVM.
Results are reported in Fig. 1, where one can appreciate how the Infinity-norm SVM retains
a higher accuracy under attack, as it spreads the (absolute) $\alpha$ weight values more uniformly
over the training samples. Note indeed that the decision hyperplane
obtained by Infinity-norm SVM under attack, and the corresponding test error, are less affected
by the attack.</p>
      <p>
        Real-world data. Here we report the results for 6 datasets downloaded from LibSVM and
UCI repositories.$^3$ First, we have normalized the data in $[-1, 1]$ using min-max normalization.
Then we have randomly split the data into 5 distinct training and test set pairs, consisting
respectively of 60% and 40% of the data.</p>
      <p>$^3$https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html</p>
      <p>[Figs. 2 and 3: performance of each classifier vs. % of flipped labels, for each dataset and value of C (panels such as diabetes, C=0.001), under the random and the ALFA-tilt attack, respectively.]</p>
      <p>The averaged results, for different C values under the random and ALFA-tilt
attacks, are respectively reported in Fig. 2 and Fig. 3. Notably, the LS-SVM and Infinity-norm
SVM attain the best performance for low C values, as the effect of regularization is stronger.
These classifiers are indeed the most secure under both the random and the ALFA-tilt
attack. The performance attained with the best C for each classifier is dataset-dependent; however,
Infinity-norm SVM is clearly able to achieve a higher level of security on all the different datasets
considered in this evaluation.</p>
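      <p>The normalization and splitting protocol described above can be reproduced as follows; a brief sketch using scikit-learn, where the function name and the seed are our own choices:</p>
      <preformat>
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

def make_splits(X, y, n_splits=5, seed=0):
    # min-max normalize each feature to [-1, 1] (done before splitting,
    # as in the text), then build 5 random 60%/40% train/test pairs
    X = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)
    return [train_test_split(X, y, train_size=0.6, random_state=seed + k)
            for k in range(n_splits)]
      </preformat>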
      <p>
PDF Malware Detection. Nowadays PDF is one of the most widely used document formats, as it
presents documents independently of the operating system. A PDF
document can host not only text and images but also JavaScript and Flash scripts. This makes
it one of the most exploited vectors for conveying malware (i.e., malicious software). We have
used a dataset called Lux0r [
        <xref ref-type="bibr" rid="ref7">7</xref>
]. It consists of PDF documents embedding JavaScript code,
collected from different security blogs and antivirus engines. The dataset contains around 12,000
malicious PDFs and about 5,000 benign samples. Every PDF is represented by 736 features,
each counting the number of occurrences of a specific JavaScript function (API call) in the
PDF. Each API call corresponds to an action performed by one of the objects that belong to the
PDF. For this experiment, we have used the same normalization and splitting strategy used in
our previous experiment on other real-world datasets. The averaged results of this experiment
for the random and ALFA-tilt attacks are reported in Fig. 4. As in the experiments on the other
datasets, we can see that Infinity-norm SVM consistently obtains good performance, and
that it has the highest accuracy under the ALFA-tilt attack.
      </p>
<p>[Fig. 4: average test accuracy vs. % of flipped labels on the PDF data, for different values of C (from 1e-05 to 100), under the random and ALFA-tilt attacks.]</p>
    </sec>
    <sec id="sec-5">
      <title>Related Work</title>
      <p>
        Adversarial label flips are a particular case of a more general phenomenon, known in the
machine-learning literature as label noise, as thoroughly discussed in a recent survey on this topic by
Frénay et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. As mentioned in Sect. 1, some of the recently-proposed algorithms aim to
improve SVM security under random label noise. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
], Görnitz et al. propose a one-class
SVM that reduces the influence of outlying data observations during learning. In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the
authors propose a heuristic approach, named micro-bagging, that equalizes the contribution
of each training sample, bagging one SVM for each different pair of training samples (each
belonging to a different class). Natarajan et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
] propose a classifier that uses a weighted
surrogate loss function that upper-bounds the SVM risk on real data. Their classifier
achieves high accuracy even in the presence of a large amount of noise.
      </p>
      <p>
        Robustness of classifiers against adversarial (worst-case) label flips has been investigated
in [
        <xref ref-type="bibr" rid="ref12 ref20 ref21 ref5 ref6">12, 6, 5, 20, 21</xref>
], where some countermeasures to increase classifier security against
such attacks have also been proposed.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>
        Within this work we have investigated the security of different SVMs under adversarial label
contamination. We have shown that the sparsity of the SVM α values may be considered
a threat to its security in the presence of training data contamination. We have proposed a
countermeasure that consists of using an infinity-norm regularizer in kernel space. This proposal
is based on more theoretically-sound explanations (in terms of robustness and regularization)
than those provided in previous work (mainly based on heuristics and intuition) [
        <xref ref-type="bibr" rid="ref21 ref5">5, 21</xref>
        ]. We have
validated our approach on a large number of real-world datasets, confirming the soundness of the
proposed approach. We remark that we have supposed that the attacker has perfect knowledge
of the system. Although, in practice, it may be difficult for an attacker to have full knowledge
of the targeted system, this is anyway an interesting analysis as it provides an estimate of
the maximum performance degradation that the system may incur under attack. Moreover,
only relying on security through obscurity (i.e., believing that the attacker is not going to
discover some system implementation details) is normally not advocated as a best security
practice. Besides considering also limited-knowledge attack scenarios, another interesting future
extension of this work may be to investigate the trade-off between robustness to poisoning
attacks at training time and evasion attacks at test time, depending on the kind of regularization
(and, thus, on the sparsity of the solution). In this respect, it may be interesting to consider
novel regularizers that allow one to trade sparsity for classifier security, to tackle computational
complexity issues without compromising system security, as also discussed in our recent work
for the case of evasion attacks [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Marco</given-names>
            <surname>Barreno</surname>
          </string-name>
          , Blaine Nelson, Russell Sears,
          <string-name>
            <given-names>Anthony D.</given-names>
            <surname>Joseph</surname>
          </string-name>
          , and
          <string-name>
            <surname>J. D. Tygar.</surname>
          </string-name>
<article-title>Can machine learning be secure?</article-title>
          <source>In Proc. ACM Symp</source>
          . Information, Computer and Comm. Sec.,
          <source>ASIACCS '06</source>
          , pages
          <fpage>16</fpage>
          -
          <lpage>25</lpage>
          , New York, NY, USA,
          <year>2006</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Biggio</surname>
          </string-name>
          , G. Fumera,
          <string-name>
            <given-names>P.</given-names>
            <surname>Russu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Didaci</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>Adversarial biometric recognition : A review on biometric system security from the adversarial machine-learning perspective</article-title>
          .
          <source>Signal Processing Magazine</source>
          , IEEE,
          <volume>32</volume>
          (
          <issue>5</issue>
          ):
          <fpage>31</fpage>
          -
          <lpage>41</lpage>
          ,
          <year>Sept 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Battista</given-names>
            <surname>Biggio</surname>
          </string-name>
          , Giorgio Fumera, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>Pattern recognition systems under attack: Design issues and research challenges</article-title>
          .
          <source>Int'l J. Patt. Recogn. Artif. Intell.</source>
          ,
          <volume>28</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1460002</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Battista</given-names>
            <surname>Biggio</surname>
          </string-name>
          , Giorgio Fumera, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>Security evaluation of pattern classifiers under attack</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>26</volume>
          (
          <issue>4</issue>
          ):
          <fpage>984</fpage>
          -
          <lpage>996</lpage>
          ,
          <year>April 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Battista</given-names>
            <surname>Biggio</surname>
          </string-name>
          , Blaine Nelson, and
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Laskov</surname>
          </string-name>
          .
          <article-title>Support vector machines under adversarial label noise</article-title>
          .
          <source>In Journal of Machine Learning Research - Proc. 3rd Asian Conf. Machine Learning</source>
          , volume
          <volume>20</volume>
          , pages
          <fpage>97</fpage>
          -
          <lpage>112</lpage>
          ,
          <year>November 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Nader</surname>
            <given-names>H Bshouty</given-names>
          </string-name>
          , Nadav Eiron, and
          <string-name>
            <given-names>Eyal</given-names>
            <surname>Kushilevitz</surname>
          </string-name>
          .
          <article-title>Pac learning with nasty noise</article-title>
          .
          <source>Theoretical Computer Science</source>
          ,
          <volume>288</volume>
          (
          <issue>2</issue>
          ):
          <fpage>255</fpage>
          -
          <lpage>275</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Igino</given-names>
            <surname>Corona</surname>
          </string-name>
          , Davide Maiorca, Davide Ariu, and Giorgio Giacinto.
          <article-title>Lux0r: Detection of malicious pdf-embedded javascript code through discriminant analysis of API references</article-title>
          .
          <source>In Proc. 2014 Workshop on Artificial Intelligent and Security Workshop</source>
          , AISec '
          <volume>14</volume>
          , pages
          <fpage>47</fpage>
          -
          <lpage>57</lpage>
          , New York, NY, USA,
          <year>2014</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Ambra</given-names>
            <surname>Demontis</surname>
          </string-name>
          , Paolo Russu, Battista Biggio, Giorgio Fumera, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>On security and sparsity of linear classifiers for adversarial settings</article-title>
          .
          <source>In Joint IAPR Int'l Workshop on Structural, Syntactic, and Statistical Pattern Recognition</source>
          . Springer, Springer, In Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
<surname>Frénay</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Verleysen</surname>
          </string-name>
          .
          <article-title>Classification in the presence of label noise: a survey</article-title>
          .
          <source>IEEE Transactions on Neural Networks and Learning Systems</source>
          ,
<volume>25</volume>
          (
          <issue>5</issue>
          ):
          <fpage>845</fpage>
          -
          <lpage>869</lpage>
          ,
<year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
[10] Nico Görnitz, Anne Porbadnigk, Alexander Binder, Claudia Sannelli, Mikio L. Braun, Klaus-Robert Müller, and
          <string-name>
            <given-names>Marius</given-names>
            <surname>Kloft</surname>
          </string-name>
          .
          <article-title>Learning and evaluation in presence of non-i.i.d. label noise</article-title>
          .
          <source>In Proc. 17th Int'l Conf. on Artificial Intell. and Statistics</source>
          , AISTATS, Reykjavik, Iceland,
          <source>April 22-25</source>
          , pages
          <fpage>293</fpage>
          -
          <lpage>302</lpage>
          . JMLR.org,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Joseph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Rubinstein</surname>
          </string-name>
          , and
          <string-name>
            <surname>J. D. Tygar.</surname>
          </string-name>
          <article-title>Adversarial machine learning</article-title>
          .
          <source>In 4th ACM Workshop on Artificial Intelligence and Security</source>
          (AISec
          <year>2011</year>
          ), pages
          <fpage>43</fpage>
          -
          <lpage>57</lpage>
          , Chicago, IL, USA,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Kearns</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ming</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <article-title>Learning in the presence of malicious errors</article-title>
          .
          <source>SIAM J. Comput.</source>
          ,
          <volume>22</volume>
          (
          <issue>4</issue>
          ):
          <fpage>807</fpage>
          -
          <lpage>837</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Nagarajan</surname>
            <given-names>Natarajan</given-names>
          </string-name>
          , Inderjit S Dhillon,
          <article-title>Pradeep K Ravikumar,</article-title>
          and
          <string-name>
            <given-names>Ambuj</given-names>
            <surname>Tewari</surname>
          </string-name>
          .
          <article-title>Learning with noisy labels</article-title>
          . In C.J.
          <string-name>
            <surname>C. Burges</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Bottou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Welling</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <string-name>
            <surname>Ghahramani</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          K.Q. Weinberger, editors,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , pages
          <fpage>1196</fpage>
          -
          <lpage>1204</lpage>
          . Curran Associates, Inc.,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Blaine</surname>
            <given-names>Nelson</given-names>
          </string-name>
          , Battista Biggio, and
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Laskov</surname>
          </string-name>
          .
          <article-title>Microbagging estimators: An ensemble approach to distance-weighted classifiers</article-title>
          .
          <source>In Journal of Machine Learning Research - Proc. 3rd Asian Conf. Machine Learning</source>
          , volume
          <volume>20</volume>
          , pages
          <fpage>63</fpage>
          -
          <lpage>79</lpage>
          , Taoyuan, Taiwan,
          <year>November 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Paolo</surname>
            <given-names>Russu</given-names>
          </string-name>
          , Ambra Demontis, Battista Biggio, Giorgio Fumera, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>Secure kernel machines against evasion attacks</article-title>
          .
          <source>In 9th ACM Workshop on Artificial Intelligence and Security</source>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Charles</given-names>
            <surname>Smutz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Angelos</given-names>
            <surname>Stavrou</surname>
          </string-name>
          .
          <article-title>Malicious pdf detection using metadata and structural features</article-title>
          .
          <source>In Proceedings of the 28th Annual Computer Security Applications Conference</source>
          , ACSAC '
          <volume>12</volume>
          , pages
          <fpage>239</fpage>
          -
          <lpage>248</lpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Suvrit</surname>
            <given-names>Sra</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Nowozin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Stephen J.</given-names>
            <surname>Wright</surname>
          </string-name>
          .
          <article-title>Optimization for Machine Learning</article-title>
          . The MIT Press,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.A.K.</given-names>
            <surname>Suykens</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Vandewalle</surname>
          </string-name>
          .
          <article-title>Least squares support vector machine classifiers</article-title>
          .
          <source>Neural Processing Letters</source>
          ,
          <volume>9</volume>
          (
          <issue>3</issue>
          ):
          <fpage>293</fpage>
          -
          <lpage>300</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Gang</surname>
            <given-names>Wang</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tianyi</surname>
            <given-names>Wang</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Haitao</given-names>
            <surname>Zheng</surname>
          </string-name>
          , and
          <string-name>
            <surname>Ben</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>Man vs. machine: Practical adversarial detection of malicious crowdsourcing workers</article-title>
          .
          <source>In 23rd USENIX Security Symposium (USENIX Security 14)</source>
          , San Diego, CA,
          <year>2014</year>
          . USENIX Association.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Han</surname>
            <given-names>Xiao</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Huang</given-names>
            <surname>Xiao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Claudia</given-names>
            <surname>Eckert</surname>
          </string-name>
          .
          <article-title>Adversarial label flips attack on support vector machines</article-title>
          .
          <source>In 20th European Conference on Artificial Intelligence</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Huang</surname>
            <given-names>Xiao</given-names>
          </string-name>
          , Battista Biggio, Blaine Nelson, Han Xiao,
          <string-name>
            <given-names>Claudia</given-names>
            <surname>Eckert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roli</surname>
          </string-name>
          .
          <article-title>Support vector machines under adversarial label contamination</article-title>
          . Neurocomputing, Special Issue on Advances in
          <source>Learning with Label Noise</source>
          ,
          <volume>160</volume>
          (
          <issue>0</issue>
          ):
          <fpage>53</fpage>
          -
          <lpage>62</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Huan</surname>
            <given-names>Xu</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Constantine</given-names>
            <surname>Caramanis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Shie</given-names>
            <surname>Mannor</surname>
          </string-name>
          .
          <article-title>Robustness and regularization of support vector machines</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>10</volume>
          :
          <fpage>1485</fpage>
          -
          <lpage>1510</lpage>
          ,
          <year>July 2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>