<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>with Symbolic Features</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chinmay Siwach</string-name>
          <email>chinmay.siwach@imtlucca.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriele Costa</string-name>
          <email>gabriele.costa@imtlucca.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rocco De Nicola</string-name>
          <email>rocco.denicola@imtlucca.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IMT School for Advanced Studies</institution>
          ,
          <addr-line>Lucca</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>Malware is a constant threat to the security of devices and users. Successful, automatic malware detection is a critical necessity [1]. Existing malware detection solutions cannot accurately characterize the behavior of malware and therefore rely on other indicators, e.g., digital signatures. Nevertheless, behavior-based detection is an active field of research since it can deal with zero-day malware. Although many proposals leveraging machine learning (ML) classifiers have been put forward, finding proper behavioral features is still an open problem. Existing solutions typically consider either static or dynamic software features. Static features refer to the program syntax, while dynamic features are observed at runtime. However, both suffer from limitations that impact the effectiveness of the ML classification.</p>
      </abstract>
      <kwd-group>
        <kwd>ML based malware detection</kwd>
        <kwd>symbolic program features</kwd>
        <kwd>static program analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Malware is one of the main threats that come along with the diffusion and growth of digital
technologies. In the past few years, we have observed a dramatic rise of malware attacks, also in
the context of cyberespionage and sabotage [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Nowadays, signature-based malware detection
tools are the main protection mechanism. Nevertheless, since they solely rely on databases of
known malware signatures, they can be easily circumvented by slightly changing the malware
syntax. Although databases can be kept up to date, zero-day threats cannot be countered
in this way.
      </p>
      <p>In principle, to effectively detect malware, we should consider its behavior to understand
whether it carries out malicious operations. Nevertheless, finding a suitable definition of
malware behavior is nontrivial. An approach followed by most authors is to consider API
calls. Under reasonable assumptions, APIs are the only way for software to interact with the
underlying operating system and its resources, e.g., to read files and open connections. Thus,
most of the malicious operations carried out by a piece of malware involve one or more APIs.</p>
      <p>There are two ways to inspect how APIs are used by software. Static approaches are based
on code analysis and do not require program execution. Dynamic approaches, on the other hand,
observe the API invocations performed during one or more executions of the target binary
inside a sandbox environment. Both approaches have serious limitations.
In most cases, static analysis cannot deal with runtime properties related to
data and control flows inside the program, e.g., execution branches and invocation arguments.
Symmetrically, dynamic analysis can only observe a limited number of the possible execution
traces of a program. As a result, modern malware has plenty of countermeasures to fool
automatic classifiers using either static or dynamic features.</p>
      <p>
        Ideally, there is one static technique that can accurately model the actual behavior of software,
that is symbolic execution [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Symbolic execution engines replace runtime values with abstract,
symbolic expressions that denote all the possible executions of a certain piece of software. When
symbolic execution reaches a certain program state, the generated expressions can be evaluated
by an SMT solver to verify the satisfiability of the collected constraints. If the result is negative,
the symbolic execution does not correspond to any real execution of the program. Instead, when
the constraints are satisfiable, the SMT solver returns a valid assignment to program variables
that witnesses the existence of a concrete execution. The main limitation for the adoption of
symbolic execution is its extremely high computational complexity. Although many significant
improvements in symbolic execution engines and SMT solvers have been proposed in the past,
the complete symbolic exploration of even small malware samples is still out of reach.
      </p>
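      <p>As a toy illustration of the satisfiability check described above, the following Python sketch collects the constraints of two hypothetical execution paths and searches for a witness assignment. It is only a sketch under stated assumptions: the function find_witness, the two example paths and the small integer domain are introduced here for illustration, and exhaustive search stands in for a real SMT solver.</p>

```python
from itertools import product


def find_witness(constraints, variables, domain=range(-8, 9)):
    """Return an assignment satisfying all constraints, or None.

    Exhaustive search over a small integer domain; a stand-in for an SMT solver.
    """
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            return assignment
    return None


# Constraints collected along a hypothetical feasible path:
#   if x > 3: if x + y == 2: target()
feasible_path = [lambda a: a["x"] > 3, lambda a: a["x"] + a["y"] == 2]

# Constraints collected along a hypothetical contradictory path:
#   if x > 3: if 3 > x: unreachable()
infeasible_path = [lambda a: a["x"] > 3, lambda a: 3 > a["x"]]

witness = find_witness(feasible_path, ["x", "y"])        # some valid assignment
no_witness = find_witness(infeasible_path, ["x", "y"])   # None: no real execution
```

      <p>A returned assignment plays the role of the witness of a concrete execution, while None corresponds to a symbolic path that no real execution can follow.</p>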
      <p>In this paper we propose a novel technique that combines symbolic execution and
ML-based malware detection. The idea is to take advantage of symbolic execution to extract
accurate, meaningful behavioral features from malware samples and to apply state-of-the-art
ML classifiers to identify suspicious feature profiles. To avoid the complexity blowup of
symbolic execution, we resort to local, bounded exploration. In particular, we locally explore
limited sequences of program instructions instead of the entire code. In this way, symbolic
expressions are kept under control and SMT solvers can be efficiently applied to reasonably
small problems. Clearly, this comes at the price of an over-approximation of the actual behavior
of the target program. Nevertheless, our features are both (i) more accurate than standard,
static features, and (ii) more general than dynamically generated execution traces.</p>
      <p>To obtain this result, we introduce a novel specification language for behavioral features called Symbolic
Feature Specification Language (SFSL). An SFSL feature models short sequences of API calls and the
relationships between their arguments and return values. Whenever a match is found between
an SFSL rule and the target code, the corresponding element of the feature vector is activated.
Eventually, the feature vector of a program is passed to a trained ML classifier that decides
whether the program should be considered suspicious.</p>
      <p>Contributions. The major technical contributions of this paper are the following.
• We describe a novel feature specification language, SFSL, to encode the rules
used for malware classification.
• We introduce a novel symbolic feature extraction technique based on matching the rules
with the target code in a binary sample.
• We show the effectiveness of our technique by presenting the analysis of a real-world malware
sample in Section 2.
• We implemented our approach by combining the symbolic feature extraction technique with
machine learning classifiers for malware classification.</p>
      <p>Paper organization. In Section 2 we introduce a motivating example that we develop
throughout the paper. In Section 3 we present our methodology based on symbolic feature extraction.
Experimental results are given in Section 4. In Section 5 we present the related work. Section 6
summarizes our final thoughts and ideas for further investigation.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Motivating example</title>
      <p>
        In this section we present our motivating example based on malware Derusbi [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This sample
appeared in late 2011. It was involved in several attacks against Japan and the USA carried out by
Deep Panda (APT-19). This group has targeted security, telecommunications, high tech, and
other sectors. The primary goal of Derusbi was to be a long-term intelligence gathering tool.
To this end, Derusbi is equipped with several functionalities that we list below.
• Backdoor. The sample is a DLL that is capable of registering itself as a service. It also
stops security services and opens an interactive command line shell to the Command and
Control (C2).
• Spyware. It targets user credentials stored by Internet Explorer, Mozilla
Firefox and other major browsers. It also gathers intelligence on the infected
system by identifying security tools by their process name, proxy accounts, and version
numbers of the Operating System (OS) and Internet Explorer.
• Dropper. The sample is also capable of dropping an encrypted kernel driver, which
is responsible for relaying information to the C2.
• Anti-analysis. If ZhuDongFangYu.exe (AntiVirus360 program) is running, the sample does not
write the encrypted kernel driver to disk. It also loads a DLL known as pstorec.dll to detect
the SunBelt Sandbox.
• Persistence. It can bypass User Account Control (UAC) to achieve persistence by using
sysprep.exe (a Microsoft executable shipped with Windows) to elevate its privileges.
• Disabling security tools. It can start, stop and delete system services, manage running
processes, and enumerate or alter registry keys.
The code fragment under analysis first invokes OpenSCManagerA. The return value is the pointer to the service
database and it is stored in variable service_db. If service_db is equal to NULL, the service
database cannot be accessed and execution jumps to another location.<sup>1</sup> Otherwise, the instruction
at line 4 invokes OpenServiceW. The first argument, service_db, is the handle (pointer to the service
database) returned by the previous call to OpenSCManagerA, which provides access to the service
control manager (SCM) and service objects [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The second argument, pointer + 20, is obtained
through a computation and corresponds to the service name. The third argument is 0x20, which specifies
SC_MANAGER_MODIFY_BOOT_CONFIG as the desired access, a right granted by GENERIC_WRITE access [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The return
value is the pointer to the service and it is stored in variable service. If service is equal to NULL,
the service cannot be accessed and execution jumps to another location. Otherwise, the next instruction
invokes API ControlService, which sends a control code to a service.
<fn id="fn1"><label>1</label><p>For the sake of presentation, we replaced jumps to external locations with break.</p></fn>
The first argument, service,
is the handle (pointer to a service) returned by the previous call to OpenServiceW, which provides access to
the service. The second argument is SERVICE_CONTROL_STOP, the control code
that requests a service stop [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The third argument, &amp;service_state, is a pointer to a SERVICE_STATUS
structure that receives the latest service status information.
      </p>
      <p>
        Although the above fragment only involves a few instructions, static analyzers based on API
name inspection cannot effectively understand the malicious nature of the performed
operation. The reason is twofold. On the one hand, the chain of involved APIs is standard for most
applications, including benign ones. As a matter of fact, according to the official documentation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
OpenSCManagerA, OpenServiceW and ControlService are meant to be used in sequence.
Furthermore, since there are conditional jumps between the API calls, static approaches may need to
approximate the actual instruction flow, e.g., by considering all the possible execution branches
instead of a specific sequence. On the other hand, the malicious nature of the code might be
spotted by checking some of the API arguments. For instance, one might want to evaluate
which service is actually accessed through OpenServiceW. This can be achieved by checking
whether the second argument belongs to a predefined list of service identifiers. Nevertheless,
since the second argument of the call is obtained through a computation, i.e., pointer + 20,
pure static analysis cannot be applied in general.
      </p>
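      <p>The point made above can be sketched in a few lines of Python. The two traces below are hypothetical: they use the same three APIs in the same order and differ only in the service name passed to OpenServiceW ("Spooler" is an arbitrary benign service name chosen for illustration). The helper functions name_features and stops_defense_service are likewise assumptions introduced for this sketch, not part of our tooling.</p>

```python
# Two hypothetical API traces: (api_name, arguments) pairs.
benign_trace = [
    ("OpenSCManagerA", (None, None, "GENERIC_EXECUTE")),
    ("OpenServiceW", ("scm", "Spooler", 0x20)),
    ("ControlService", ("svc", "SERVICE_CONTROL_STOP", "status")),
]
suspicious_trace = [
    ("OpenSCManagerA", (None, None, "GENERIC_EXECUTE")),
    ("OpenServiceW", ("scm", "WinDefend", 0x20)),
    ("ControlService", ("svc", "SERVICE_CONTROL_STOP", "status")),
]

DEFENSE_SERVICES = {"WinDefend", "wuauserv", "wscsvc"}


def name_features(trace):
    """API-name-only view, as a name-based static analyzer would see it."""
    return [name for name, _ in trace]


def stops_defense_service(trace):
    """Argument-aware check: is a defense service opened and a stop issued?"""
    opened = any(name == "OpenServiceW" and args[1] in DEFENSE_SERVICES
                 for name, args in trace)
    stopped = any(name == "ControlService" and args[1] == "SERVICE_CONTROL_STOP"
                  for name, args in trace)
    return opened and stopped


# The name-only views coincide, while the argument-aware check tells them apart.
same_names = name_features(benign_trace) == name_features(suspicious_trace)
```

      <p>In a real binary the service name is the result of a computation, so the argument-aware check cannot simply read it from the code; this is the gap that symbolic exploration is meant to fill.</p>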
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>Loosely speaking, our approach amounts to ML-based malware classification, that is, we train
machine learning classifiers to distinguish between malicious and benign software. The main
difference from previous proposals is the feature extraction methodology.</p>
      <sec id="sec-3-1">
        <title>3.1. Overview</title>
        <p>Our methodology involves three phases that we list below.</p>
        <p>• Phase 1: Feature definition. We define a feature specification language
called SFSL. An SFSL specification consists of a list of rules, each defining a sequence
of security-relevant APIs together with a Boolean condition. Roughly speaking, a rule
corresponds to asking whether the program under analysis can execute the API sequence
so that the rule condition is satisfied.
• Phase 2: Feature exploration. We implement a feature extraction engine that performs
symbolic exploration of a program to check whether the program instructions match the
SFSL rules. The result of this phase is a vector of program features where every position
contains 1 if the corresponding SFSL rule is satisfied (0 otherwise).
• Phase 3: Training machine learning classifiers. The features generated by the extraction
algorithm are used for training several machine learning classifiers. The trained classifiers
are then used in our experiments to assess the methodology.</p>
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <caption><p>Abstract syntax of SFSL.</p></caption>
          <preformat>SPEC ::= FEAT; …; FEAT;
FEAT ::= DECL; PATH; BPRE
DECL ::= TYPE VAR, …, TYPE VAR
TYPE ::= int | char | string
PATH ::= CALL, …, CALL
CALL ::= VAR := API(VAR, …, VAR)
BPRE ::= ¬ BPRE | BPRE ∧ BPRE | BPRE ∨ BPRE | true | false | SPRE | APRE
SPRE ::= VAR in SEXP
SEXP ::= [WCHR … WCHR]
APRE ::= AEXP = AEXP | AEXP &gt; AEXP
AEXP ::= NUM | VAR | AEXP + AEXP | AEXP - AEXP | AEXP × AEXP | AEXP / AEXP</preformat>
        </table-wrap>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Feature specification language</title>
        <p>The abstract syntax of our feature specification language is given in Table 1. A specification
(SPEC) amounts to a finite sequence of features (FEAT). Each feature is a triple consisting of
some declarations (DECL), a path (PATH) and a Boolean predicate (BPRE). Each declaration is
a pair of a type (TYPE) and a variable name (VAR). Supported types include the C base types.
For the sake of presentation, here we restrict ourselves to integers, characters and strings. A
variable name is any valid identifier. A path is a sequence of API calls (CALL). Each call consists
of an assignment of a variable (for the return value) to an API invocation with a list of variables
(for the API parameters). Boolean predicates include standard propositional logic connectives
as well as string and arithmetic predicates (SPRE and APRE, respectively). A string predicate
checks whether a variable matches a certain sequence of characters (WCHR) enclosed in square brackets.
For string matching, we use . as a wildcard to denote that the corresponding position in the
string can be any character. For instance, x in [a.a] holds for both x=aaa and x=aba.
Arithmetic predicates, instead, are comparisons between expressions (AEXP).</p>
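        <p>The wildcard semantics of string predicates can be sketched as follows. This is a minimal illustration, not the matcher used in our implementation: the function name matches is hypothetical, and we assume that a pattern must cover the whole string, which is one possible reading of the definition above.</p>

```python
import re


def matches(value: str, pattern: str) -> bool:
    """Check `value in [pattern]`, where '.' stands for any single character."""
    # Escape every literal character; keep '.' as the single-character wildcard.
    regex = "".join("." if ch == "." else re.escape(ch) for ch in pattern)
    # Assumption: the pattern must match the entire value, not a substring.
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None
```

        <p>Under this reading, both aaa and aba match the pattern a.a, while strings of a different length do not.</p>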
        <p>For brevity, we feel free to use parentheses for grouping and to introduce some abbreviations.
In particular, we use ? in API calls in place of a variable name when the variable is immaterial,
i.e., it does not appear in the Boolean predicate of the feature specification (and we skip the
declaration of such a variable). Also, in case the return variable is equal to ? we simply omit it.</p>
        <p>Finally, we use constants in API calls as a shortcut for equality checks, as in the following.</p>
        <p>int x; f(x, 0); x &gt; 2; ≜ int x, int y; f(x, y); x &gt; 2 ∧ y = 0;</p>
        <p>
Example 1. We introduce the following SFSL rule to model the behavior discussed in Section 2.</p>
        <preformat>int r, int s, int d, string e, int h, int g;
r := OpenSCManagerA(?, ?, ?), s := OpenServiceW(d, e, ?), ControlService(h, g, ?);
r = d ∧ s = h ∧ g = 1 ∧ (e in [WinDefend] ∨ e in [wuauserv] ∨ e in [wscsvc]);</preformat>
        <p>The rule above declares six variables, i.e., r, s, d, e, h and g. Then, it contains a path
consisting of the three API invocations appearing in Listing 1. Finally, the rule concludes with
a Boolean predicate. The predicate defines a relationship between the invocations by stating
that (i) OpenServiceW is invoked on the same service database returned by OpenSCManagerA, i.e.,
r = d, and (ii) ControlService is invoked on the same service returned by OpenServiceW, i.e.,
s = h. Also, the predicate says that g = 1, i.e., the command to be issued to the target service
is SERVICE_CONTROL_STOP (1). Finally, the rule is matched if the second OpenServiceW argument
belongs to a list of relevant Windows defense services, i.e., WinDefend, wuauserv or wscsvc.</p>
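        <p>To make the structure of a feature concrete, the rule of Example 1 can be encoded as plain Python data. The sketch below is only illustrative: the classes Call and Feature, their field names, and the use of None in place of ? are assumptions made here and do not reflect the internal representation of our tool.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Call:
    ret: Optional[str]                 # variable bound to the return value, or None
    api: str                           # API name
    args: Tuple[Optional[str], ...]    # argument variables; None plays the role of '?'


@dataclass
class Feature:
    decl: Dict[str, str]               # variable name -> declared type
    path: List[Call]                   # sequence of API calls
    pred: Callable[[Dict[str, object]], bool]   # Boolean predicate over the variables


DEFENSE = ("WinDefend", "wuauserv", "wscsvc")

stop_defense = Feature(
    decl={"r": "int", "s": "int", "d": "int", "e": "string", "h": "int", "g": "int"},
    path=[
        Call("r", "OpenSCManagerA", (None, None, None)),
        Call("s", "OpenServiceW", ("d", "e", None)),
        Call(None, "ControlService", ("h", "g", None)),
    ],
    # r = d and s = h and g = 1 and e is one of the defense services
    pred=lambda v: v["r"] == v["d"] and v["s"] == v["h"] and v["g"] == 1
                   and v["e"] in DEFENSE,
)
```

        <p>Evaluating stop_defense.pred on a dictionary of concrete values for r, s, d, e, h and g tells whether those values satisfy the rule predicate.</p>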
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Feature extraction algorithm</title>
        <p>The feature extraction algorithm aims to compile a vector of features for ML classification. We
implemented a bounded symbolic exploration strategy. The algorithm used for symbolic feature
extraction is given in Algorithm 1. Its application to the example discussed in Section 2 is
illustrated in Example 2.</p>
        <p>Algorithm 1. Symbolic feature extraction algorithm.</p>
        <preformat>1: Input: rule R = (DECL; PATH; PRED), sample S
2: for each function f in sample S do
3:   C := set of calls to PATH[0]
4:   for each call c in C do
5:     init := empty_symbolic_state(f)
6:     σ := explore(init, c)
7:     add_bindings(σ, DECL)
8:     target := next(PATH)
9:     repeat
10:      σ := explore(σ, target)
11:      add_bindings(σ, DECL)
12:      target := next(PATH)
13:    until end(PATH)
14:    add_predicate(σ, PRED)
15:    if satisfiable(σ) then
16:      vector(R) := 1
17:      exit
18:    end if
19:  end for
20: end for
21: vector(R) := 0
22: end</preformat>
        <p>The symbolic feature extraction methodology is based on Algorithm 1.
The algorithm takes a rule R as input. R is a triple consisting of DECL (declarations), PATH (path) and
PRED (predicate), as described in Subsection 3.2. A symbolic state σ is a set of Boolean predicates.
Exploration is bounded. It begins by collecting in set C the calls to PATH[0] (the first API of rule R)
that occur inside a function f of sample S, and by initializing an empty symbolic state. Function
explore is then called with the initialized symbolic state and a call c to PATH[0] as parameters,
and it returns a symbolic state σ. Then, at line 7, function
add_bindings binds the symbolic state σ returned by explore to the variables in DECL.
Then, we move to the next API in PATH, see Subsection 3.2. This process is repeated until
PATH is exhausted (i.e., we reach the last API in PATH). At line 14, the Boolean predicate is
added by function add_predicate, which takes σ (symbolic state) and PRED as arguments, where
σ is the state obtained after reaching the last API in PATH. At the end of the exploration for rule
R, a satisfiability check is performed at line 15. If the check succeeds, the entry of
the feature vector for R is set to one and the exploration exits, moving on to the next rule. Otherwise,
vector(R) is set to zero and we move to the next rule.</p>
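        <p>The control flow described above can be mimicked by the following Python sketch. It is a strong simplification under explicit assumptions: functions are given as lists of already resolved call records, the symbolic explore step is replaced by a bounded linear scan, and the satisfiability check is replaced by evaluating the predicate on concrete values. The names extract_feature, PATH, stops_defense and sample_functions are hypothetical and introduced only for this illustration.</p>

```python
def extract_feature(path, pred, functions, bound=10):
    """Return 1 if some function contains the API path with a satisfied predicate.

    path:      list of (ret_var, api, arg_vars); None marks an ignored position.
    pred:      function from a bindings dict to bool.
    functions: list of functions, each a list of (api, ret_value, arg_values).
    bound:     maximum number of instructions scanned between two path calls.
    """
    for body in functions:
        starts = [i for i, call in enumerate(body) if call[0] == path[0][1]]
        for start in starts:
            bindings, pos, found = {}, start, True
            for ret_var, api, arg_vars in path:
                # Bounded search for the next API of the path (stands in for explore).
                window = range(pos, min(pos + bound, len(body)))
                hit = next((i for i in window if body[i][0] == api), None)
                if hit is None:
                    found = False
                    break
                _, ret_value, arg_values = body[hit]
                # Bind rule variables to observed values (stands in for add_bindings).
                if ret_var is not None:
                    bindings[ret_var] = ret_value
                for var, value in zip(arg_vars, arg_values):
                    if var is not None:
                        bindings[var] = value
                pos = hit + 1
            # Evaluate the predicate (stands in for the satisfiability check).
            if found and pred(bindings):
                return 1
    return 0


# Rule of Example 1 in the simplified encoding used by this sketch.
PATH = [
    ("r", "OpenSCManagerA", (None, None, None)),
    ("s", "OpenServiceW", ("d", "e", None)),
    (None, "ControlService", ("h", "g", None)),
]


def stops_defense(v):
    return (v["r"] == v["d"] and v["s"] == v["h"] and v["g"] == 1
            and v["e"] in ("WinDefend", "wuauserv", "wscsvc"))


# One hypothetical function whose calls mirror the fragment of Section 2.
sample_functions = [[
    ("OpenSCManagerA", 1, (None, None, "GENERIC_EXECUTE")),
    ("OpenServiceW", 2, (1, "WinDefend", 0x20)),
    ("ControlService", 0, (2, 1, "status")),
]]

feature_bit = extract_feature(PATH, stops_defense, sample_functions)
```

        <p>Unlike this sketch, the actual algorithm reasons on symbolic states, so it can also handle arguments that are only known as expressions, such as pointer + 20.</p>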
        <p>Example 2. Consider again the fragment of code provided in Section 2. We show the symbolic
exploration steps applied to the code fragment and the feature given in Example 1. Exploration
starts from a fully symbolic state σ<sub>0</sub> = ∅. The first instruction to be symbolically executed is
service_db = OpenSCManagerA(NULL, NULL, GENERIC_EXECUTE).
Since no implementation of API OpenSCManagerA is given, it is treated as a purely symbolic
function, i.e., its return value is unconstrained. Thus, after symbolic evaluation, the
next state is σ<sub>1</sub> = σ<sub>0</sub> = ∅. Since the previous instruction corresponds to an invocation of the first
API of the feature under analysis, the next operation is to add to σ<sub>1</sub> new bindings between the
feature variables and the state variables. In this case we obtain σ<sub>2</sub> = {service_db = r}. The next
instruction is an if statement. Clearly, when the guard is satisfied, i.e., when service_db == NULL,
the execution jumps to another location and our exploration fails. Thus, we proceed with the
new symbolic state σ<sub>3</sub> = σ<sub>2</sub> ∪ {service_db ≠ 0}. The next instruction is</p>
        <p>service = OpenServiceW(service_db, (pointer + 20), 0x20).
Again, since this API is the next in the feature under evaluation, the new symbolic state σ<sub>4</sub>
must contain bindings between the invocation arguments and the feature variables, that is,
σ<sub>4</sub> = σ<sub>3</sub> ∪ {service = s, service_db = d, pointer + 20 = e}. Then, the symbolic execution
proceeds with another conditional statement. In this case, we want the guard to be satisfied,
which implies σ<sub>5</sub> = σ<sub>4</sub> ∪ {service ≠ 0}. The next instruction is</p>
        <p>err = ControlService(service, SERVICE_CONTROL_STOP, &amp;service_state)</p>
        <p>Since this API is the last one in the feature under evaluation, the new symbolic state σ<sub>6</sub> must contain
the bindings between the feature variables and the invocation arguments, as well as the feature predicate, that is,
σ<sub>6</sub> = σ<sub>5</sub> ∪ {service = h, SERVICE_CONTROL_STOP = g} ∪ {r = d ∧ s = h ∧ g = 1 ∧ (e in [WinDefend] ∨
e in [wuauserv] ∨ e in [wscsvc])}. After reaching symbolic state σ<sub>6</sub>, the SMT solver checks
whether there exist values of r, s, d, e, h and g that make the symbolic expression hold (i.e., whether it is
satisfiable). The solver constraints look like a sequence of assignments, r = d, s = h, g = 1,
e = WinDefend, over the program variables service_db, service and pointer. If the SMT solver is able to solve the constraints in
symbolic state σ<sub>6</sub>, the presence of the feature is marked. For instance, values that satisfy the constraints are
r = 1, d = 1, s = 2, h = 2, g = 1, e = WinDefend, service_db = 1, service = 2, pointer = xxxxxxxxxxxxxxxxxxxxWinDefend.
Therefore, there exists an executable program path corresponding to the
feature of Example 1, and the corresponding entry of the
feature vector is set to one.</p>
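        <p>The witness reported in Example 2 can be checked mechanically. The sketch below is an illustration under assumptions: pointer is modeled as a Python string, pointer + 20 as the slice that skips its first 20 characters, SERVICE_CONTROL_STOP as the integer 1, and the function name holds is hypothetical.</p>

```python
# Witness values reported in Example 2; the 20 leading characters of `pointer`
# are padding, so `pointer + 20` denotes the service name.
witness = {
    "r": 1, "d": 1, "s": 2, "h": 2, "g": 1, "e": "WinDefend",
    "service_db": 1, "service": 2,
    "pointer": "x" * 20 + "WinDefend",
}
SERVICE_CONTROL_STOP = 1


def holds(w):
    """Evaluate the constraints accumulated in the final symbolic state."""
    # Bindings between program variables and feature variables.
    bindings = (
        w["service_db"] == w["r"] and w["service_db"] == w["d"]
        and w["service"] == w["s"] and w["service"] == w["h"]
        and w["pointer"][20:] == w["e"] and SERVICE_CONTROL_STOP == w["g"]
    )
    # Guards of the two conditional statements (handles are not NULL).
    guards = w["service_db"] != 0 and w["service"] != 0
    # Predicate of the feature given in Example 1.
    predicate = (
        w["r"] == w["d"] and w["s"] == w["h"] and w["g"] == 1
        and w["e"] in ("WinDefend", "wuauserv", "wscsvc")
    )
    return bindings and guards and predicate


witness_is_valid = holds(witness)
```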
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental results</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset preparation</title>
        <p>
          To evaluate the proposed method and create a symbolic feature set, we collected 32-bit Portable
Executable (PE) samples of both malware and benignware. In particular, our malware
corpus consists of samples from the APT Malware Dataset.<sup>2</sup> The benignware corpus is made
up of samples presented in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and in Malware Data Science: Attack Detection and Attribution.<sup>3</sup>
Since duplicate malware and benign samples must be filtered out, we used SHA256 hashing
to retain only unique samples in both the malware and benignware groups. We also excluded packed
binary samples and samples written in VB Script, since they are beyond the scope of
our research. The resulting dataset consists of 1072 malware and 1112 benignware samples.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental settings</title>
        <p>
          We ran our experiments on a machine with 64-bit Ubuntu 18.04, an Intel® Core™ i7-9750H CPU @ 2.60GHz,
and 16 GB of RAM. The feature extraction algorithm has been implemented in Python and we
adopted Angr [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] as symbolic execution engine. The set of rules used for our experiments
consists of 102 features. Features are collected from various malware reports such as [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and are
encoded in our rules. They are categorized according to MITRE Techniques and Sub-Techniques
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. We applied a two-minute time limit to the extraction of each feature. The ML
classifiers were implemented with Scikit-learn [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. In particular, we used well-known machine
learning algorithms such as Random Forest (RF), Decision Tree (DT), Naive Bayes (NB), Logistic
Regression (LR), and K-NN.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Stratified K-fold cross validation</title>
        <p>The objective of this experiment is to study the performance of the proposed symbolic feature
set. Stratified K-fold splits the dataset into K folds such that each fold contains approximately the
same percentage of samples of each target class as the complete set. Stratified K-fold reduces the
chance of overfitting. We conducted five separate stratified 4-fold cross-validation experiments.
For each of the four partitions, three of the folds are used as training data. The resulting
model is validated on the remaining fold, which is used as a test set. The receiver operating characteristic
(ROC) curves, which track the relationship between a classifier’s detection threshold and its true and
false positive rates, for classifiers trained and tested on the symbolic feature set are shown in Figure
2 (a,b,c,d,e). The area under the curve (AUC) values shown in Figure 2 (a,b,c,d,e) provide an aggregate
measure of performance across all possible classification thresholds. It can be observed that RF
and LR perform best among all classifiers, while NB gives the lowest AUC value. Figure
2 (f) shows a box plot of the classifiers’ accuracy collected during stratified 4-fold cross
validation, trained on the symbolic feature set containing 102 features. RF and
DT have high accuracy and low variation, whereas the others have more variation and lower accuracy.
<fn id="fn2"><label>2</label><p>https://github.com/cyber-research/APTMalware</p></fn>
<fn id="fn3"><label>3</label><p>https://www.malwaredatascience.com/</p></fn></p>
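        <p>The stratified splitting scheme described above can be sketched with the Python standard library alone. This is an illustrative re-implementation, not the Scikit-learn routine: the function stratified_folds and the fixed seed are assumptions, while the class sizes are those reported in Section 4.1.</p>

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class sizes taken from Section 4.1: 1072 malware (1) and 1112 benignware (0).
labels = [1] * 1072 + [0] * 1112
folds = stratified_folds(labels, k=4)

splits = []
for test_fold in folds:
    # Three folds form the training set, the remaining one is the test set.
    train = [i for fold in folds if fold is not test_fold for i in fold]
    splits.append((train, test_fold))
```

        <p>Each of the four splits trains on three folds and tests on the remaining one, and every fold keeps the malware to benignware ratio of the whole dataset.</p>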
        <p>[Figure 2. Panels (a) to (e): receiver operating characteristic curves of the classifiers under stratified 4-fold cross validation; y-axis: True Positive Rate (Positive label: 1). Panel (f): box plot of classifier accuracy. The legends recovered from three of the ROC panels are listed below; each also includes a chance line and a ± 1 std. dev. band.]</p>
        <p>ROC fold 0 (AUC = 0.89), fold 1 (AUC = 0.89), fold 2 (AUC = 0.89), fold 3 (AUC = 0.89); mean ROC (AUC = 0.89 ± 0.00).</p>
        <p>ROC fold 0 (AUC = 0.88), fold 1 (AUC = 0.89), fold 2 (AUC = 0.89), fold 3 (AUC = 0.90); mean ROC (AUC = 0.89 ± 0.01).</p>
        <p>ROC fold 0 (AUC = 0.87), fold 1 (AUC = 0.85), fold 2 (AUC = 0.85), fold 3 (AUC = 0.87); mean ROC (AUC = 0.86 ± 0.01).</p>
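        <p>For readers less familiar with the AUC metric, the value can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The following sketch computes it by direct pairwise comparison; the function name roc_auc and the toy labels and scores are assumptions made for illustration and are unrelated to our experimental data.</p>

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```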
        <sec id="sec-4-3-1">
          <title>Receiver operating characteristic- Decision Tree</title>
        </sec>
        <sec id="sec-4-3-2">
          <title>Receiver operating characteristic- Random Forest</title>
          <p>These experiments are only a first attempt to use symbolic features for malware detection
and classification. Nevertheless, the results presented above show that malware detection based
on symbolic features is feasible. In general, we expect that the number and type of the rules used
affect the overall performance of detectors and classifiers implemented in this way. However,
the study of this aspect is beyond the scope of the present work.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Related work</title>
      <p>In this section, we discuss the related work on malware analysis. We categorize existing works
as follows.</p>
      <p>
        Static Analysis. Approaches based on static analysis scan the sample without actually
executing it [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. This technique targets syntactic signatures such as strings or instruction sequences
embedded in the binary [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. As a result, malware samples can evade detection by modifying
their appearance while maintaining their functionality [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. To tackle these limitations, a
number of features to describe binary executables have been proposed. One such approach, known as
n-gram analysis, considers specific sequences of bytes: each feature
represents the number of times a given combination of n bytes appears in the executable [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
Another feature relies on API calls to model the actions of a malware on the underlying
system [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. The list of all API calls that may be executed is obtained by disassembling the
sample. Static analysis based on strings looks for suspicious text inside the binary [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ], e.g.,
known malicious URLs. More sophisticated static analysis approaches, such as symbolic execution,
rely on symbolic semantics to capture the malicious functionality of a binary. In [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the
authors show how an analyst can use symbolic execution techniques to unveil critical behavior
of a remote access trojan (RAT) by deriving the list of commands and the corresponding system call
sequences that get activated. In [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], the author demonstrates the use of system call dependency
graphs (SCDGs) constructed using symbolic execution traces. SCDGs are then used as learning
inputs in a classifier to generate signature graphs for each malware family.
      </p>
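      <p>As an illustration of the n-gram features mentioned above, the following sketch counts every contiguous sequence of n bytes in a byte string. The function name byte_ngrams and the six-byte input are assumptions chosen for the example; real approaches apply the same counting to whole executables and then select a subset of the n-grams as features.</p>

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> Counter:
    """Count every contiguous sequence of n bytes in data."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Toy input: the two bytes 0x4d 0x5a occur twice.
counts = byte_ngrams(b"\x4d\x5a\x90\x00\x4d\x5a", n=2)
```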
      <p>
        Dynamic analysis. These techniques capture the behavior of the program at runtime. One
method to capture such a behavior is to observe the interactions of the program with the
underlying operating system in terms of API calls [
        <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], the analysis is based on
scrutinizing the genuineness of every kernel driver using flood emulation, i.e., by exploring the
program code through forced executions. Approaches such as [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ] model program behavior
in the form of a graph composed of function calls and their dependencies. Dynamic analysis solely
relies on extracting behavioral signatures encoded in a limited set of execution traces. Thus, some
execution paths may remain unexplored [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Moreover, several of these techniques execute
the sample in a protected and isolated environment, called a sandbox. However,
malware samples can employ anti-analysis techniques, such as anti-virtualization, to detect
sandboxed executions [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. The authors of [26] present an approach that detects malware based on
the conditions that trigger hidden behavior, by automatically and iteratively exploring the
code paths that may depend on trigger inputs. Its effectiveness is demonstrated using
examples such as keyloggers that only activate on targeted websites and botnets that wait for the
correct command.
      </p>
      <p>
        Some approaches [
        <xref ref-type="bibr" rid="ref24">24, 26</xref>
        ] use hybrid strategies based on a combination of concrete and
symbolic execution. This is usually done to benefit from the best of the two techniques, i.e., the
efficiency of testing and the good code coverage of symbolic execution.
      </p>
      <p>Discussion. The state of the art discussed above shows that malware detection based on
either static or dynamic features has serious limitations. Mainly, static features struggle to
capture a precise definition of malware behavior, while dynamic features cannot effectively
explore the program execution paths. While hybrid techniques can partially overcome these
issues, a proper definition of “malicious” behavior is still missing. Our technique copes with this
limitation by introducing a dedicated feature specification language.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper presented a novel symbolic execution-based feature extraction technique that is
combined with machine learning to perform malware detection. The main motivation behind
the choice of extracting features based on symbolic execution is its ability to provide us with
behavioral features without executing the malware. Symbolic execution examines alternative
execution paths by exploring the code without running it on a real machine, potentially offering
a more comprehensive understanding of the malware’s actions.</p>
      <p>As future work, we plan to continue the assessment of our technique. In particular, we plan
to extend the corpus of malware samples as well as the dataset of rules. Another experiment we want to
carry out measures the effectiveness of our methodology against zero-day malware. For
instance, this might be estimated by (i) only training our classifier with malware released before
a certain date and (ii) testing the classification accuracy by using more recent malware samples.
Finally, we want to consider the integration of our SFSL with existing specification languages, such
as YARA.</p>
      <p>[26] D. Brumley, C. Hartwig, Z. Liang, J. Newsome, D. Song, H. Yin, Automatically identifying
trigger-based behavior in malware, in: Botnet Detection, Springer, 2008, pp. 65–88.</p>
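      <p>The zero-day experiment outlined in the future work above amounts to a temporal split of the corpus. The sketch below shows one possible way to build such a split. It only illustrates the idea and is not part of the presented evaluation: the Sample record, its fields, the file names and the cutoff date are hypothetical.</p>
      <preformat>
```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sample:
    name: str        # hypothetical sample identifier
    released: date   # release (or first-seen) date of the sample
    label: int       # 1 = malware, 0 = benign


def temporal_split(samples, cutoff):
    """Train on samples released before `cutoff`, test on the remaining ones."""
    train = [s for s in samples if cutoff > s.released]
    test = [s for s in samples if s.released >= cutoff]
    return train, test


# Toy corpus with made-up dates, used only to exercise the split.
corpus = [
    Sample("a.exe", date(2019, 3, 1), 1),
    Sample("b.exe", date(2019, 11, 20), 0),
    Sample("c.exe", date(2020, 6, 5), 1),
    Sample("d.exe", date(2021, 1, 15), 1),
]
train, test = temporal_split(corpus, cutoff=date(2020, 1, 1))
print([s.name for s in train], [s.name for s in test])
```
      </preformat>
      <p>Under this split the classifier never sees samples newer than the cutoff during training, so its accuracy on the test set approximates its behavior on previously unseen malware.</p>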
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Symantec</surname>
          </string-name>
          ,
          <source>ISTR V24</source>
          ,
          <year>2019</year>
          . URL: https://docs.broadcom.com/doc/istr-24-2019-en.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Baldoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Coppa</surname>
          </string-name>
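          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>D'Elia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Demetrescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Finocchi</surname>
          </string-name>
          ,
          <article-title>A survey of symbolic execution techniques</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>51</volume>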
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Crowdstrike</surname>
          </string-name>
          ,
          <source>Report deep panda</source>
          ,
          <year>2013</year>
          . URL: https://paper.seebug.org/papers/APT/APT_CyberCriminal_Campagin/2013/2013.Deep.Panda/crowdstrike-deep-panda-report.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Windows</surname>
          </string-name>
          ,
          <source>Win32API Documentation</source>
          ,
          <year>2019</year>
          . URL: https://docs.microsoft.com/en-us/windows/win32/services.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Windows</surname>
          </string-name>
          ,
          <source>Win32API Documentation</source>
          ,
          <year>2019</year>
          . URL: https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-controlservice.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Windows</surname>
          </string-name>
          ,
          <source>Win32API Documentation</source>
          ,
          <year>2019</year>
          . URL: https://docs.microsoft.com/en-us/windows/win32/services/service-security-and-access-rights.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kuppusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Aghila</surname>
          </string-name>
          ,
          <article-title>A learning model to detect maliciousness of portable executable using integrated feature set</article-title>
          ,
          <source>Journal of King Saud University-Computer and Information Sciences</source>
          <volume>31</volume>
          (
          <year>2019</year>
          )
          <fpage>252</fpage>
          -
          <lpage>265</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shoshitaishvili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Salls</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Stephens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dutcher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Grosen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kruegel</surname>
          </string-name>
          , et al.,
          <article-title>SoK: (State of) the art of war: Offensive techniques in binary analysis</article-title>
          ,
          <source>in: 2016 IEEE Symposium on Security and Privacy (SP)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>138</fpage>
          -
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>MITRE Corporation</surname>
          </string-name>
          ,
          <source>MITRE ATT&amp;CK</source>
          ,
          <year>2019</year>
          . URL: https://attack.mitre.org/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , E. Duchesnay,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Biondi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Given-Wilson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Legay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Puodzius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Quilbeuf</surname>
          </string-name>
          ,
          <article-title>Tutorial: An overview of malware detection and evasion techniques</article-title>
          ,
          <source>in: International Symposium on Leveraging Applications of Formal Methods</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>565</fpage>
          -
          <lpage>586</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Szor</surname>
          </string-name>
          ,
          <source>The Art of Computer Virus Research and Defense</source>
          , Addison-Wesley Professional, New York, NY, USA (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Christodorescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jha</surname>
          </string-name>
          ,
          <article-title>Testing malware detectors</article-title>
          ,
          <source>ACM SIGSOFT Software Engineering Notes</source>
          <volume>29</volume>
          (
          <year>2004</year>
          )
          <fpage>34</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Kolter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Maloof</surname>
          </string-name>
          ,
          <article-title>Learning to detect and classify malicious executables in the wild</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>7</volume>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ulyanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Trofimov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Giacinto</surname>
          </string-name>
          ,
          <article-title>Novel feature extraction, selection and fusion for effective malware family classification</article-title>
          ,
          <source>in: Proceedings of the sixth ACM conference on data and application security and privacy</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>183</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Saxe</surname>
          </string-name>
          , K. Berlin,
          <article-title>Deep neural network based malware detection using two dimensional binary program features</article-title>
          ,
          <source>in: 2015 10th International Conference on Malicious and Unwanted Software (MALWARE)</source>
          , IEEE,
          <year>2015</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R.</given-names>
            <surname>Baldoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Coppa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>D'Elia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Demetrescu</surname>
          </string-name>
          ,
          <article-title>Assisting malware analysis with symbolic execution: A case study</article-title>
          ,
          <source>in: International conference on cyber security cryptography and machine learning</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>171</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sebastio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Baranov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Biondi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Decourbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Given-Wilson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Legay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Puodzius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Quilbeuf</surname>
          </string-name>
          ,
          <article-title>Optimizing symbolic execution for malware behavior classification</article-title>
          ,
          <source>Computers &amp; Security</source>
          <volume>93</volume>
          (
          <year>2020</year>
          )
          <fpage>101775</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C.</given-names>
            <surname>Willems</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Holz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Freiling</surname>
          </string-name>
          ,
          <article-title>Toward automated dynamic malware analysis using cwsandbox</article-title>
          ,
          <source>IEEE Security &amp; Privacy</source>
          <volume>5</volume>
          (
          <year>2007</year>
          )
          <fpage>32</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Salehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghiasi</surname>
          </string-name>
          ,
          <article-title>Maar: Robust features to detect malicious activity based on api calls, their arguments and return values</article-title>
          ,
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>59</volume>
          (
          <year>2017</year>
          )
          <fpage>93</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wilhelm</surname>
          </string-name>
          , T.-c. Chiueh,
          <article-title>A forced sampled execution approach to kernel rootkit identification</article-title>
          , in: International Workshop on Recent Advances in Intrusion Detection, Springer,
          <year>2007</year>
          , pp.
          <fpage>219</fpage>
          -
          <lpage>235</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>F.</given-names>
            <surname>Karbalaie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmadi</surname>
          </string-name>
          ,
          <article-title>Semantic malware detection by deploying graph mining</article-title>
          ,
          <source>International Journal of Computer Science Issues (IJCSI)</source>
          <volume>9</volume>
          (
          <year>2012</year>
          )
          <fpage>373</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>C.</given-names>
            <surname>Kolbitsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Comparetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kruegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kirda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Effective and efficient malware detection at the end host</article-title>
          ,
          <source>in: Proceedings of the 18th Conference on USENIX Security Symposium, SSYM'09</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>351</fpage>
          -
          <lpage>366</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Moser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kruegel</surname>
          </string-name>
          , E. Kirda,
          <article-title>Exploring multiple execution paths for malware analysis</article-title>
          ,
          <source>in: 2007 IEEE Symposium on Security and Privacy (SP'07)</source>
          , IEEE,
          <year>2007</year>
          , pp.
          <fpage>231</fpage>
          -
          <lpage>245</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>R.</given-names>
            <surname>Wojtczuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rutkowska</surname>
          </string-name>
          ,
          <article-title>Following the White Rabbit: Software attacks against Intel VT-d technology</article-title>
          , ITL: http://www.invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>