<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Symposium on Adversary-Aware Learning Techniques and Trends in Cybersecurity, Arlington, VA, USA</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Exploring Adversarial Examples in Malware Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Octavian Suciu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Scott E. Coull</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jeffrey Johns</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>FireEye, Inc.</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Maryland</institution>
          ,
          <addr-line>College Park</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>1</volume>
      <fpage>8</fpage>
      <lpage>19</lpage>
      <abstract>
        <p>The Convolutional Neural Network (CNN) architecture is increasingly being applied to new domains, such as malware detection, where it is able to learn malicious behavior from raw bytes extracted from executables. These architectures reach impressive performance with no feature engineering effort involved, but their robustness against active attackers is yet to be understood. Such malware detectors could face a new attack vector in the form of adversarial interference with the classification model. Existing evasion attacks intended to cause misclassification on test-time instances, which have been extensively studied for image classifiers, are not applicable because of input semantics that prevent arbitrary changes to the binaries. This paper explores the area of adversarial examples for malware detection. By training an existing model on a production-scale dataset, we show that some previous attacks are less effective than initially reported, while simultaneously highlighting architectural weaknesses that facilitate new attack strategies for malware classification. Finally, we explore more generalizable attack strategies that increase the potential effectiveness of evasion attacks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The popularity of Convolutional Neural Network (CNN) classifiers has led to their adoption in fields which have historically been adversarial, such as malware detection
        <xref ref-type="bibr" rid="ref1 ref12 ref8 ref9">(Raff et al. 2017; Krčál et al. 2018)</xref>
        . Recent advances in
adversarial machine learning have highlighted weaknesses
of classifiers when faced with adversarial samples. One such
class of attacks is evasion
        <xref ref-type="bibr" rid="ref3">(Biggio et al. 2013)</xref>
        , which acts
on test-time instances. The instances, also called
adversarial examples, are modified by the attacker such that they are
misclassified by the victim classifier even though they still
resemble their original representation. State-of-the-art
attacks focus mainly on image classifiers
        <xref ref-type="bibr" rid="ref11 ref13 ref4 ref5">(Szegedy et al. 2013;
Goodfellow, Shlens, and Szegedy 2014; Papernot et al.
2017; Carlini and Wagner 2017)</xref>
        , where attacks add small
perturbations to input pixels that lead to a large shift in the
victim classifier feature space, potentially shifting it across
the classification decision boundary. The perturbations do
not change the semantics of the image as a human oracle
easily identifies the original label associated with the image.
      </p>
      <p>
        In the context of malware detection, adversarial examples
could represent an additional attack vector for an attacker
determined to evade such a system. However, domain-specific challenges limit the applicability of existing attacks
designed against image classifiers on this task. First, the
strict semantics of binary files disallows arbitrary
perturbations in the input space. This is because there is a structural
interdependence between adjacent bytes, and any change
to a byte value could potentially break the functionality of
the executable. Second, limited availability of representative
datasets or robust public models limits the generality of
existing studies. Existing attacks
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        use
victim models trained on very small datasets and avoid the
semantic issues entirely by appending adversarial noise at
the end of the binary files, and their generalization
effectiveness is yet to be evaluated.
      </p>
      <p>This paper sheds light on the generalization property of
adversarial examples against CNN-based malware detectors.
By training on a production-scale dataset of 12.5 million
binaries, we are able to observe interesting properties of
existing attacks, and propose a more effective and generalizable
attack strategy. Our contributions are as follows:
We measure the generalization of existing adversarial
attacks and highlight the limitations that prevent them from
being widely applicable.</p>
      <p>
        We unearth an architectural weakness of a published CNN architecture that facilitates the existing append-based attack by Kolosnjaji et al.
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        .
        We propose a new attack which, by modifying the existing bytes of a binary, has the potential to outperform append-based attacks without semantic inconsistencies.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>
        The Convolutional Neural Network (CNN) architecture has
proven to be very successful across popular vision tasks,
such as image classification
        <xref ref-type="bibr" rid="ref6">(He et al. 2016)</xref>
        . This led to an increased adoption in other fields and domains, with one such example being text classification from character-level features
        <xref ref-type="bibr" rid="ref16">(Zhang, Zhao, and LeCun 2015)</xref>
        , which turns out to be extremely similar to the malware classification problem discussed in this paper. In this setting, a natural language document is represented as a sequence of characters, and
the CNN is applied on that one-dimensional stream of
characters. The intuition behind this approach is that a CNN is
capable of automatically learning complex features, such as
words or word sequences, by observing compositions of raw
signals extracted from single characters. This approach also
avoids the requirement of defining language semantic rules,
and is able to tolerate anomalies in features, such as word
misspellings. The classification pipeline first encodes each
character into a fixed-size embedding vector. The sequence
of embeddings acts as input to a set of convolutional
layers, intermixed with pooling layers, then followed by fully
connected layers. The convolutional layers act as receptors,
picking particular features from the input instance, while
the pooling layers act as filters to down-sample the feature
space. The fully connected layers act as a non-linear
classifier on the internal feature representation of instances.
CNNs for Malware Classification. Similar to this
approach, the security community explored the applicability of
CNNs to the task of malware detection on binary files
        <xref ref-type="bibr" rid="ref1 ref12 ref8 ref9">(Raff et al. 2017; Krčál et al. 2018)</xref>
        . Analogous to text, a file could
be conceptualized as a sequence of bytes that are arranged
into higher-level features, such as instructions or functions.
By allowing the classifier to automatically learn features
indicative of maliciousness, this approach avoids the labor-intensive feature engineering process typical of malware classification tasks. Manual feature engineering proved to be challenging in the past and led to an arms race between antivirus developers and attackers aiming to evade them
        <xref ref-type="bibr" rid="ref14">(Ugarte-Pedrero et al. 2015)</xref>
        . However, the robustness
of these automatically learned features in the face of evasion
is yet to be understood.
      </p>
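      <p>The character-CNN pipeline described above can be made concrete with a toy forward pass. The sketch below is an illustrative assumption, not code from any of the cited systems; all sizes and weights are arbitrary. It embeds a character sequence, applies one convolutional layer over sliding windows, temporal max-pooling, and a linear classifier.</p>

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sizes (assumptions): vocab, embedding dim, kernel width, num filters.
V, D, K, F = 64, 8, 5, 16
emb = rng.normal(size=(V, D))     # per-character embedding table
W = rng.normal(size=(F, K, D))    # one convolutional layer's filters
w_out = rng.normal(size=F)        # linear classifier on pooled features

def forward(chars):
    e = emb[chars]                               # (T, D) embedded sequence
    T = len(chars) - K + 1                       # number of sliding windows
    # Each window of K embeddings is correlated with every filter.
    conv = np.stack([np.einsum('kd,fkd->f', e[t:t + K], W) for t in range(T)])
    pooled = conv.max(axis=0)                    # temporal max-pool per filter
    return float(pooled @ w_out)                 # scalar classifier score
```

Running `forward` on any integer sequence of length at least K yields a single score; a real model would add non-linearities, more layers, and a trained softmax output.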
      <p>
        In this paper, we explore evasion attacks by focusing on
a byte-based convolutional neural network for malware
detection, called MalConv
        <xref ref-type="bibr" rid="ref12">(Raff et al. 2017)</xref>
        , whose
architecture is shown in Figure 1. MalConv reads up to 2MB of
raw byte values from a Portable Executable (PE) file as
input, appending a distinguished padding token to files smaller
than 2MB and truncating extra bytes from larger files. The
fixed-length sequences are then transformed into an
embedding representation, where each byte is mapped to an 8-dimensional embedding vector. These embeddings are then
passed through a gated convolutional layer, followed by a
temporal max-pooling layer, before being classified through
a final fully connected layer. Each convolutional layer uses
a kernel size of 500 bytes with a stride of 500 (i.e., non-overlapping windows), and each of the 128 filters is passed
through a max-pooling layer. This results in a unique
architectural feature that we will revisit in our results: each
pooled filter is mapped back to a specific 500-byte sequence
and there are at most 128 such sequences that contribute to
the final classification across the entire input. Their reported
results on a testing set of 77,349 samples achieved a
Balanced Accuracy of 0.909 and Area Under the Curve (AUC)
of 0.982.
      </p>
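      <p>As a quick sanity check on this geometry (assuming the 2MB limit denotes 2^21 bytes), the number of non-overlapping 500-byte windows, and hence the number of candidate sequences competing in the max-pooling stage, can be computed directly:</p>

```python
import math

MAX_LEN = 2 ** 21        # 2MB input budget, assumed here to mean 2^21 bytes
KERNEL = STRIDE = 500    # non-overlapping 500-byte convolutional windows
N_FILTERS = 128

n_windows = math.ceil(MAX_LEN / STRIDE)
print(n_windows)
# At most N_FILTERS window activations survive the temporal max-pool,
# so at most 128 of these byte sequences influence the final classification.
```

The result, 4,195 windows, matches the per-sequence count reported later in the slack-attack analysis.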
      <p>
        Adversarial Binaries. Unlike evasion attacks on
images
        <xref ref-type="bibr" rid="ref11 ref13 ref4 ref5">(Szegedy et al. 2013; Goodfellow, Shlens, and Szegedy
2014; Papernot et al. 2017; Carlini and Wagner 2017)</xref>
        ,
attacks that alter the raw-bytes of PE files must maintain
the syntactic and semantic fidelity of the original file. The
Portable Executable (PE) standard
        <xref ref-type="bibr" rid="ref10">(Microsoft 2018)</xref>
        defines
a fixed structure for these files. A PE file contains a leading
header enclosing file metadata and pointers to the sections
of the file, followed by the variable-length sections which
contain the actual program code and data. Changing bytes
arbitrarily could break the malicious functionality of the
binary or, even worse, prevent it from loading at all.
      </p>
      <p>
        Recent work
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        avoided this
problem by appending adversarial noise to the end of the binary.
Since the appended adversarial bytes are not within the
defined boundaries of the PE file, their existence does not
impact the binary’s functionality and there are no inherent
restrictions on the syntax of bytes (i.e., valid instructions and
parameters). The trade-off, however, is that the impact of
the appended bytes on the final classification is offset by the
features present in the original sample, which remain
unchanged. As we will see, these attacks take advantage of
certain vulnerabilities in position-independent feature detectors
present in the MalConv architecture.
      </p>
      <p>
        Datasets. To evaluate the success of evasion attacks
against the MalConv architecture, we collected 16.3M PE
files from a variety of sources, including VirusTotal,
Reversing Labs, and proprietary FireEye data. The data was used
to create a production-quality dataset of 12.5M training
samples and 3.8M testing samples, which we refer to as the Full
dataset. It contains 2.2M malware samples in the training
set, and 1.2M in testing, which represents a realistic ratio of
goodware to malware. The dataset was created from a larger
pool of more than 33M samples using a stratified sampling
technique based on VirusTotal’s vhash clustering algorithm,
which is based on dynamic and static properties of the
binaries. Use of stratified sampling ensures uniform coverage
over the canonical ‘types’ of binaries present in the dataset,
while also limiting bias from certain overrepresented types
(e.g., popular malware families). In addition, we also created
a smaller dataset whose size and distribution is more in line
with Kolosnjaji et al.’s evaluation
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        ,
which we refer to as the Mini dataset. The Mini dataset was
created by sampling 4,000 goodware and 4,598 malware
samples from the Full dataset. Note that both datasets follow
a strict temporal split where test data was observed strictly
later than training data. We use the Mini dataset in order to
explore whether the attack results demonstrated by
Kolosnjaji et al. would generalize to a production-quality model, or
whether they are artifacts of the dataset properties.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Baseline Performance</title>
      <p>
        To validate our implementation of the MalConv
architecture
        <xref ref-type="bibr" rid="ref12">(Raff et al. 2017)</xref>
        , we train the classifier on both the
Mini and the Full datasets, leaving out the DeCov
regularization addition suggested by the authors. Our implementation
uses a momentum-based optimizer with decay and a batch
size of 80 instances. We train on the Mini dataset for 10 full
epochs. We also trained the Full dataset for 10 epochs, but
stopped the process early due to a small validation loss [1]. To
assess and compare the performance of the two models, we
test them on the entire Full testing set. The model trained on
the Full dataset achieves an accuracy of 0.89 and an AUC of
0.97, which is similar to the results published in the
original MalConv paper. Unsurprisingly, the Mini model is much
less robust, achieving an accuracy of 0.73 and an AUC of
0.82.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Append Attacks</title>
      <p>
        In this section we present various attack strategies that
address the semantic integrity constraints of PE files by
appending adversarial noise to the original file. We start
by presenting two attacks first introduced by Kolosnjaji et
al.
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        and evaluated against MalConv.
Random Append. The Random Append attack works by
appending byte values sampled from a uniform distribution.
This baseline attack measures how easily an append attack
could offset features derived from the file length, and helps
compare the actual adversarial gains from more complex
append strategies over random appended noise.
      </p>
      <p>
        Gradient Append. The Gradient Append strategy uses
the input gradient value to guide the changes in the appended
byte values. The algorithm appends numBytes to the
candidate sample and updates their values over numIter
iterations or until the victim classifier is evaded. The gradient of
the input layer with respect to the classification loss rl
indicates the direction in the input space of the change required
to shift the instance towards the other class. The
representation of all appended bytes is iteratively updated, starting
from random values. However, as the input bytes are mapped
to a discrete embedding representation in MalConv, the
endto-end architecture becomes non-differentiable and its input
gradient cannot be computed analytically. Therefore, this
attack uses a heuristic to instead update the embedding vector
and discretize it back in the byte space to the closest byte
value along the direction of the embedding gradient. We
refer interested readers to the original paper for details of this
discretization process
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        . The attack requires numBytes × numIter gradient computations and updates to the appended bytes in the worst case, which could be prohibitively expensive for large networks.
      </p>
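      <p>The worst-case cost can be made concrete with a toy re-implementation of the loop structure. The classifier, embedding, and update rule below are deliberately simplistic stand-ins (assumptions for illustration), not the MalConv model or the exact discretization heuristic of Kolosnjaji et al.:</p>

```python
import random

random.seed(0)

# Toy stand-ins: a scalar "embedding" per byte and a linear "score",
# where a positive score means "malicious". Purely illustrative.
def embed(b):
    return (b - 128) / 128.0

def score(xs):
    return sum(embed(b) for b in xs)

def gradient_append(x, num_bytes, num_iter):
    # Append random bytes, then refine every appended byte each iteration,
    # stopping early if the (toy) classifier is evaded.
    adv = x + [random.randrange(256) for _ in range(num_bytes)]
    grad_evals = 0
    for _ in range(num_iter):
        if 0 > score(adv):               # evaded: stop early
            break
        for i in range(len(x), len(adv)):
            grad_evals += 1              # one gradient computation per byte
            adv[i] = max(0, adv[i] - 32) # step each byte "down" the gradient
    return adv, grad_evals

adv, evals = gradient_append([200] * 10, num_bytes=5, num_iter=10)
# Worst case: num_bytes * num_iter gradient evaluations.
```

Here the original bytes outweigh the appended ones, so the attack never succeeds and the full numBytes × numIter budget is spent, illustrating why the cost can be prohibitive for large networks.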
      <p>
[1] This was also reported in the original MalConv study.
Benign Append. We propose two new append strategies, one intended to highlight a vulnerability specific to the MalConv architecture, and the second to address the potentially long convergence time of the previously proposed gradient-based attack. First, the Benign Append attack allows us to
observe how the MalConv architecture encodes positional
features extracted from the input byte sequences. The attack
appends leading bytes extracted from benign instances that
are correctly classified with high confidence by the victim
classifier. The intuition behind this attack uses the
observation that the leading bytes of a file are the most influential
towards the classification decision
        <xref ref-type="bibr" rid="ref12">(Raff et al. 2017)</xref>
        .
Therefore, it signals whether the maliciousness of the target could
be offset by appending highly-influential benign bytes.
Algorithm 1 The FGM Append attack
1: function FGMAPPEND(x0, numBytes, ε)
2:   x0 ← PADRANDOM(x0, numBytes)
3:   e ← GETEMBEDDINGS(x0)
4:   eu ← GRADIENTATTACK(e, ε)
5:   for i in |x0| ... |x0| + numBytes − 1 do
6:     e[i] ← eu[i]
7:   end for
8:   x* ← EMBEDDINGMAPPING(e)
9:   return x*
10: end function
11: function GRADIENTATTACK(e, ε)
12:   eu ← e − ε · sign(∇l(e))
13:   return eu
14: end function
15: function EMBEDDINGMAPPING(ex)
16:   e ← ARRAY(256)
17:   for byte in 0 ... 255 do
18:     e[byte] ← GETEMBEDDINGS(byte)
19:   end for
20:   for i in 0 ... |ex| do
21:     x*[i] ← argmin b ∈ 0...255 ‖ex[i] − e[b]‖2
22:   end for
23:   return x*
24: end function
FGM Append. Based on the observation that the
convergence time of the Gradient Append attack grows linearly
with the number of appended bytes, we propose the “one-shot” FGM Append attack, an adaptation of the Fast Gradient Method (FGM) originally described in
        <xref ref-type="bibr" rid="ref4">(Goodfellow,
Shlens, and Szegedy 2014)</xref>
        . The pseudocode is described
in Algorithm 1. Our attack starts by appending numBytes
random bytes to the original sample x0 and updating them
using a policy dictated by FGM. The FGM attack updates each embedding value by a user-specified amount ε in a direction that minimizes the classification loss l on the input, as dictated by the sign of the gradient. In order to avoid the non-differentiability issue, our attack performs the gradient-based updates of the appended bytes in the embedding space, while mapping the updated value to the closest byte
value representation in EMBEDDINGMAPPING using the L2
distance metric. Unlike Gradient Append, the FGM Append
applies a single update to the appended bytes, which makes
it very fast regardless of the number of appended bytes.
      </p>
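      <p>A minimal sketch of this procedure follows, with a random 8-dimensional embedding table and a placeholder gradient function standing in for the trained MalConv model; both are assumptions for illustration, not the trained system:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(256, 8))   # stand-in 8-dim byte embedding table

def embedding_mapping(e_adv):
    # Map each perturbed embedding to the byte whose embedding is nearest
    # under the L2 norm (EMBEDDINGMAPPING in Algorithm 1).
    d = np.linalg.norm(e_adv[:, None, :] - E[None, :, :], axis=-1)
    return d.argmin(axis=1)

def fgm_append(x, num_bytes, eps, grad_fn):
    # Pad with random bytes, take a single FGM step on the appended
    # embeddings only, then discretize those embeddings back to bytes.
    pad = rng.integers(0, 256, size=num_bytes)
    x_adv = np.concatenate([x, pad])
    e = E[x_adv]
    e_upd = e - eps * np.sign(grad_fn(e))   # the "one-shot" update
    tail = embedding_mapping(e_upd[len(x):])
    return np.concatenate([x, tail])

# Placeholder gradient of a toy loss 0.5 * norm(e)^2 with respect to e:
toy_grad = lambda e: e
x = np.arange(10)
x_adv = fgm_append(x, num_bytes=3, eps=0.0, grad_fn=toy_grad)
# With eps=0 the appended bytes round-trip unchanged through the
# embedding space; the original bytes of x are never modified.
```

Only the appended tail is perturbed and re-discretized; the original sample is untouched, which is what makes the attack functionality-preserving.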
    </sec>
    <sec id="sec-5">
      <title>Slack Attacks</title>
      <p>Besides the inability to append bytes to files that already
exceed the model’s maximum size (e.g., 2MB for MalConv),
append-based attacks suffer from an additional limitation.
Figure 2 plots the average frequency with which each of the 4,195 input byte sequences is selected as input to the fully connected layer, across a random set of 200 candidate malware samples. This
shows that, for example, while the first 1,000 byte sequences
(0.5 MB) in binaries correspond to 79% of the actual
features for the classifier, only 55% of the files are smaller than
that. Additionally, 13% of the instances cannot be attacked
at all because they are larger than the maximum file size for
the classifier. The result shows not only that appended bytes
need to offset a large fraction of the discriminative features,
but also that attacking the byte sequences of these
discriminative features directly will likely amplify the attack
effectiveness due to their importance. Driven by this intuition, we
proceed to describing an attack strategy that would exploit
the existing bytes of binaries with no side effects on the
functionality of the program.</p>
      <p>Slack FGM. Our strategy defines a set of slack bytes
where an attack algorithm is allowed to freely modify bytes
in the existing binary without breaking the PE. Once
identified, the slack bytes are then modified using a gradient-based
approach. The SLACKATTACK function in Algorithm 2
highlights the architecture of our attack. The algorithm is
independent of the strategy SLACKINDEXES employed for
extracting slack bytes or the gradient-based method in
GRADIENTATTACK used to update the bytes.</p>
      <p>In our experiments we use a simple technique that
empirically proves to be effective in finding sufficiently large slack
regions. This strategy extracts the gaps between
neighboring PE sections of an executable by parsing the executable's section header.
Algorithm 2 The Slack FGM attack
1: function SLACKATTACK(x0)
2:   m ← SLACKINDEXES(x0)
3:   e ← GETEMBEDDINGS(x0)
4:   eu ← GRADIENTATTACK(e)
5:   xu ← EMBEDDINGMAPPING(eu)
6:   x* ← x0
7:   for idx in m do
8:     x*[idx] ← xu[idx]
9:   end for
10:   return x*
11: end function
12: function SLACKINDEXES(x)
13:   s ← GETPESECTIONS(x)
14:   m ← ARRAY(0)
15:   for i in 0 ... |s| do
16:     if s[i].RawSize &gt; s[i].VirtualSize then
17:     rs ← s[i].RawAddress + s[i].VirtualSize
18:     re ← s[i].RawAddress + s[i].RawSize
19:     for idx in rs ... re do
20:       m ← APPEND(m, idx)
21:     end for
22:     end if
23:   end for
24:   return m
25: end function
The gaps are inserted by the compiler and
exist due to misalignments between the virtual addresses and
the multipliers over the block sizes on disk. We compute the size of the gap between consecutive sections in a binary as RawSize − VirtualSize, and define its byte start index in the binary by the section's RawAddress + VirtualSize.
By combining all the slack regions, SLACKINDEXES returns
a set of indexes over the existing bytes of a file, indicating
that they can be modified.</p>
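      <p>The section-gap computation described above can be sketched in a few lines. The code below operates on pre-parsed section headers; the dictionary layout is an assumption of this sketch, and the slack region is taken to run from RawAddress + VirtualSize up to RawAddress + RawSize:</p>

```python
def slack_indexes(sections):
    # Collect byte offsets inside compiler-inserted gaps: the region between
    # the end of a section's meaningful data (RawAddress + VirtualSize) and
    # the end of its on-disk allocation (RawAddress + RawSize).
    idx = []
    for s in sections:
        if s["raw_size"] > s["virtual_size"]:
            start = s["raw_address"] + s["virtual_size"]
            end = s["raw_address"] + s["raw_size"]
            idx.extend(range(start, end))
    return idx

# A section occupying 0x200 bytes on disk but holding only 0x180 meaningful
# bytes leaves a 0x80-byte slack region starting at file offset 0x580.
demo = [{"raw_address": 0x400, "raw_size": 0x200, "virtual_size": 0x180}]
```

Because these bytes lie outside the loaded VirtualSize, overwriting them does not change the program's run-time behavior, which is exactly the property the attack exploits.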
      <p>Although more complex byte update strategies are
possible, potentially accounting for the limited leverage imposed
by the slack regions, we use the technique introduced for
the FGM Append attack in Algorithm 1, which proved to be
effective. Like in the case of FGM Append, updates are
performed on the embeddings of the allowed byte indexes and
the updated values are mapped back to the byte values using
the L2 distance metric.</p>
    </sec>
    <sec id="sec-6">
      <title>Results</title>
      <p>Here, we evaluate the attacks described in the previous
section in the same adversarial settings using both our Mini and
Full datasets. Our evaluation seeks to answer the following
three questions:</p>
      <p>How do existing attacks generalize to classifiers trained
on larger datasets?
How vulnerable is a robust MalConv architecture to
adversarial samples?
Are slack-based attacks more effective than append
attacks?</p>
      <p>In an attempt to reproduce prior work, we select candidate
instances from the test set if they have a file size smaller
than 990,000 bytes and are correctly classified as malware
by the victim. We randomly pick 400 candidates and test the
effectiveness of the attacks using the Success Rate (SR): the
percentage of adversarial samples that successfully evaded
detection.</p>
      <p>Append Attacks. We evaluate the append-based attacks
on both the Mini and the Full datasets by varying the number
of appended bytes. Table 1 summarizes these results.</p>
      <p>
        We observe that the Random Append attack fails on both
datasets, regardless of the number of appended bytes. This
result is in line with our expectations, demonstrating that the
MalConv model is immune to random noise and that the
input size is not among the learned features. However, our
results do not reinforce previously reported success rates of up
to 15% in
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        .
      </p>
      <p>The SR of the Benign Append attack seems to
progressively increase with the number of added bytes on the Mini
dataset, but fails to show the same behavior on the Full
dataset. Conversely, on the FGM Append attack we observe
that the attack fails on the Mini dataset, while reaching up to
71% SR on the Full dataset. This paradoxical behavior
highlights the importance of large, robust datasets in evaluating
adversarial attacks. One reason for the discrepancy in attack
behaviors is that the MalConv model trained using the Mini
dataset (modeled after the dataset used by Kolosnjaji et al.)
has a severe overfitting problem. In particular, the success
of appending specific benign byte sequences from the Mini
dataset could be indicative of poor generalizability and this
is further supported by the disconnect between the model’s
capacity and the number of samples in the Mini dataset.
When we consider the one-shot FGM Attack’s success on
the Full dataset and failure on the Mini dataset, this can also
be explained by poor generalizability in the Mini model; the
single gradient evaluation does not provide enough
information for the sequence of byte changes made in the attack.
Recomputing the gradient after each individual byte change
is expected to result in a higher attack success rate.</p>
      <p>Aside from the methodological issues surrounding dataset
size and composition, our results also show that even a
robustly trained MalConv classifier is vulnerable to append
attacks when given a sufficiently large degree of freedom.
Indeed, the architecture uses 500-byte convolutional kernels with a stride size of 500 and a single max-pooling layer over the entire file, which means that not only is it looking at a limited set of relatively coarse features, but it also selects the 128 best activations irrespective of location. That
is, once a sufficiently large number of appended bytes are
added in the FGM attack, they quickly replace legitimate
features from the original binary in the max pool operation.
Therefore, the architecture does not encode positional
information, which is a significant vulnerability that we
demonstrate can be exploited.</p>
      <p>Additionally, we implemented the Gradient Append
attack proposed by Kolosnjaji et al., but failed to reproduce
the reported results. We aimed to follow the original
description, with one difference: our implementation, in line with
the original MalConv architecture, uses a special token for
padding, while Kolosnjaji et al. use the byte value 0 instead.
We evaluated our implementation under the same settings as
the other attacks, but none of the generated adversarial
samples were successful. One limitation of the Gradient Append
attack that we identified is the necessity to update the value
of each appended byte at each iteration. However, different
byte indexes might converge to their optimal value after a
varying number of iterations. Therefore, successive and
unnecessary updates may even lead to divergence of some of
the byte values. Indeed, empirically investigating individual
byte updates across iterations revealed an interesting
oscillating pattern, where some bytes receive the same sequence
of byte values cyclically in later iterations.</p>
      <p>Slack Attacks. We evaluate the Slack FGM attack over
the Full dataset for the same experimental settings as above.
In order to control the amount of adversarial noise added in
the slack bytes, we use the parameter ε to define an L2 ball around the original byte value in the embedding space. Only those values provided by the FGM attack that fall within the ball are considered for the slack attack; otherwise the original byte value remains. The upper bound for the SR is 28% for ε = 1.0, where we observed all the available
slack bytes being modified according to the gradient. In
order to compare it with the append attacks, in Figure 3 we
plot the SR as a function of the number of modified bytes.
The results show that, while the FGM Append attack could achieve a higher SR, it also requires a much larger number of extra byte modifications. The Slack FGM attack achieves a SR of 28% for an average of 1,005 modified bytes, while the SR of the FGM Append lies around 20% for the same setting. This result confirms our initial intuition that the coarse
nature of MalConv’s features requires consideration of the
surrounding contextual bytes within the convolutional
window. In the slack attack, we make use of existing contextual
bytes to amplify the power of our FGM attack without
having to generate a full 500-byte convolutional window using
appended bytes.</p>
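      <p>The ε-ball filtering step can be sketched as follows, with a toy one-dimensional embedding table standing in for MalConv's learned embeddings; both the table and the byte arrays are illustrative assumptions:</p>

```python
import numpy as np

# Toy 1-D embedding table: the "embedding" of byte b is just b itself,
# so L2 distance between embeddings equals the absolute byte difference.
E = np.arange(256, dtype=float).reshape(-1, 1)

def apply_eps_ball(x, x_upd, eps):
    # Keep an FGM-suggested byte only if its embedding stays within an
    # eps-radius L2 ball of the original byte's embedding; else revert.
    d = np.linalg.norm(E[x_upd] - E[x], axis=-1)
    return np.where(eps >= d, x_upd, x)

x = np.array([10, 20, 30])
x_upd = np.array([12, 200, 30])   # FGM proposes two changes
x_adv = apply_eps_ball(x, x_upd, eps=5.0)
# Only the first change (embedding distance 2) survives; the second
# (distance 180) falls outside the ball and is reverted.
```

Tuning ε trades off the amount of adversarial noise injected into the slack regions against the attack's success rate, which is the sweep reported above.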
    </sec>
    <sec id="sec-7">
      <title>Related Work</title>
      <p>
        The work by Barreno et al. (Barreno et al. 2010) was among
the first to systematize attack vectors against machine
learning, where they distinguished evasion as a type of test-time
attack. Since then, several evasion attacks have been
proposed against malware detectors. Many of these attacks
focus on additive techniques for evasion, where new
capabilities or features are added to cause misclassification. For
instance, Biggio et al.
        <xref ref-type="bibr" rid="ref3">(Biggio et al. 2013)</xref>
        use a gradient-based
approach to evade malware detectors by adding new features
to PDFs, while Grosse et al.
        <xref ref-type="bibr" rid="ref5">(Grosse et al. 2017)</xref>
        and Hu et
al.
        <xref ref-type="bibr" rid="ref7">(Hu and Tan 2018)</xref>
        add new API calls to evade
detection. More recently, Anderson et al.
        <xref ref-type="bibr" rid="ref1">(Anderson et al. 2018)</xref>
        used reinforcement learning to evade detectors by selecting
from a pre-defined list of semantics-preserving
transformations. Similarly, Xu et al.
        <xref ref-type="bibr" rid="ref15">(Xu, Qi, and Evans 2016)</xref>
        propose
a genetic algorithm for manipulating PDFs while
maintaining necessary syntax. Closest to our work is the
gradientbased append attack by
        <xref ref-type="bibr" rid="ref8">(Kolosnjaji et al. 2018)</xref>
        against the
CNN-based MalConv architecture. In comparison to earlier
work, our slack-based attack operates on the raw bytes of
the binary, and modifies them without requiring the
expensive feedback loop from the reinforcement learning agent
and has the potential to outperform append-based attacks.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Conclusion</title>
      <p>In this paper, we explored the space of adversarial
examples against deep learning-based malware detectors. Our
experiments indicate that the effectiveness of adversarial
attacks on models trained using small datasets does not
always generalize to robust models. We also observe that the
MalConv architecture does not encode positional
information about the input features and is therefore vulnerable to
append-based attacks. Finally, we proposed the Slack FGM
attack, which modifies existing bytes without affecting
semantics, with greater efficacy than append-based attacks.
</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>H. S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kharkar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Filar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Evans</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Learning to evade static PE machine learning malware models via reinforcement learning</article-title>
          .
          <source>arXiv preprint arXiv:1801.08917</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Barreno</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nelson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Joseph</surname>
            ,
            <given-names>A. D.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Tygar</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          <year>2010</year>
          .
          <article-title>The security of machine learning</article-title>
          .
          <source>Machine Learning</source>
          <volume>81</volume>
          (
          <issue>2</issue>
          ):
          <fpage>121</fpage>
          -
          <lpage>148</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Biggio</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Corona</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Maiorca</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nelson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Šrndić</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Laskov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Giacinto</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Roli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Evasion attacks against machine learning at test time</article-title>
          .
          <source>In Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          ,
          <fpage>387</fpage>
          -
          <lpage>402</lpage>
          . Springer.
        </mixed-citation>
        <mixed-citation>
          <string-name>
            <surname>Carlini</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Wagner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Towards evaluating the robustness of neural networks</article-title>
          .
          <source>In 2017 IEEE Symposium on Security and Privacy (SP)</source>
          ,
          <fpage>39</fpage>
          -
          <lpage>57</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Shlens</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Szegedy</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Explaining and harnessing adversarial examples</article-title>
          .
          <source>arXiv preprint arXiv:1412.6572</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Grosse</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Papernot</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Manoharan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Backes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>McDaniel</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Adversarial examples for malware detection</article-title>
          .
          <source>In European Symposium on Research in Computer Security</source>
          ,
          <fpage>62</fpage>
          -
          <lpage>79</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>He</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Deep residual learning for image recognition</article-title>
          .
          <source>In Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Black-box attacks against RNN based malware detection algorithms</article-title>
          .
          <source>In The Workshops of the The Thirty-Second AAAI Conference on Artificial Intelligence</source>
          , New Orleans, Louisiana, USA, February 2-7, 2018.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Kolosnjaji</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Demontis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Biggio</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Maiorca</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Giacinto</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Eckert</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Roli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Adversarial malware binaries: Evading deep learning for malware detection in executables</article-title>
          .
          <source>26th European Signal Processing Conference (EUSIPCO '18).</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Krčál</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Švec</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Bálek</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Jašek</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Deep convolutional malware classifiers can learn from raw executables and labels only</article-title>
          .
          <source>International Conference on Learning Representations.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <collab>Microsoft</collab>
          .
          <year>2018</year>
          .
          <article-title>PE format</article-title>
          . https://docs.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Papernot</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>McDaniel</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Jha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Celik</surname>
            ,
            <given-names>Z. B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Swami</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Practical black-box attacks against machine learning</article-title>
          .
          <source>In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security</source>
          ,
          <fpage>506</fpage>
          -
          <lpage>519</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Raff</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Barker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sylvester</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Brandon</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Catanzaro</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Nicholas</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Malware detection by eating a whole EXE</article-title>
          .
          <source>arXiv preprint arXiv:1710.09435</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Szegedy</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zaremba</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Bruna</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Erhan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Fergus</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Intriguing properties of neural networks</article-title>
          .
          <source>arXiv preprint arXiv:1312.6199</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Ugarte-Pedrero</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Balzarotti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Santos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Bringas</surname>
            ,
            <given-names>P. G.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>SoK: Deep packer inspection: A longitudinal study of the complexity of run-time packers</article-title>
          .
          <source>In 2015 IEEE Symposium on Security and Privacy (SP)</source>
          ,
          <fpage>659</fpage>
          -
          <lpage>673</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ; Qi,
          <string-name>
            <given-names>Y.</given-names>
            ; and
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          <year>2016</year>
          .
          <article-title>Automatically evading classifiers</article-title>
          .
          <source>In Proceedings of the 2016 Network and Distributed Systems Symposium.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>LeCun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Character-level convolutional networks for text classification</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          ,
          <fpage>649</fpage>
          -
          <lpage>657</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>