=Paper=
{{Paper
|id=Vol-1507/dx15paper25
|storemode=property
|title=Diagnosing Advanced Persistent Threats: A Position Paper
|pdfUrl=https://ceur-ws.org/Vol-1507/dx15paper25.pdf
|volume=Vol-1507
|dblpUrl=https://dblp.org/rec/conf/safeprocess/AbreuBEFHHKPAB15
}}
==Diagnosing Advanced Persistent Threats: A Position Paper==
Proceedings of the 26th International Workshop on Principles of Diagnosis
Rui Abreu and Danny Bobrow and Hoda Eldardiry and Alexander Feldman and
John Hanley and Tomonori Honda and Johan de Kleer and Alexandre Perez
Palo Alto Research Center
3333 Coyote Hill Rd
Palo Alto, CA 94304, USA
{rui,bobrow,hoda.eldardiry,afeldman,john.hanley,tomo.honda,dekleer,aperez}@parc.com
Dave Archer and David Burke
Galois, Inc.
421 SW 6th Avenue, Suite 300
Portland, OR 97204, USA
{dwa,davidb}@galois.com
Abstract

When a computer system is hacked, analyzing the root cause (for example, the entry point of penetration) is a diagnostic process. An audit trail, as defined in the National Information Assurance Glossary, is a security-relevant chronological set of records, and/or the destination and source of records, that provides evidence of the sequence of activities that have affected, at any time, a specific operation, procedure, or event. After detecting an intrusion, system administrators manually analyze audit trails to both isolate the root cause and assess the damage impact of the attack. Due to the sheer volume of information and low-level activities in the audit trails, this task is cumbersome and time-intensive. In this position paper, we discuss our ideas to automate the analysis of audit trails using machine learning and model-based reasoning techniques. Our approach classifies audit trails into the high-level activities they represent, and then reasons about those activities and their threat potential, both in real time and forensically. We argue that, by using the outcome of this reasoning to explain complex evidence of malicious behavior, we equip system administrators with the proper tools to promptly react to, stop, and mitigate attacks.

1 Introduction

Today, enterprise system and network behaviors are typically "opaque": stakeholders lack the ability to assert causal linkages in running code, except in very simple cases. At best, event logs and audit trails can offer some partial information on temporally and spatially localized events as seen from the viewpoint of individual applications. Thus, current techniques give operators little system-wide situational awareness, nor any viewpoint informed by a long-term perspective. Adversaries have taken advantage of this opacity by adopting a strategy of persistent, low-observability operation from inside the system, hiding effectively through the use of long causal chains of system and application code. We call such adversaries advanced persistent threats, or APTs.

To address current limitations, this position paper discusses a technique that aims to track causality across the enterprise and over extended periods of time, identify subtle causal chains that represent malicious behavior, localize the code at the roots of such behavior, trace the effects of other malicious actions descended from those roots, and make recommendations on how to mitigate those effects. By doing so, the proposed approach aims to enable stakeholders to understand and manage the activities going on in their networks. The technique exploits both current and novel forms of local causality to construct higher-level observations of long-term causality in system information flow. We propose a machine learning approach that classifies segments of low-level events by the activities they represent, reasons over these activities, and prioritizes candidate activities for investigation. The diagnostic engine investigates these candidates, looking for patterns that may represent the presence of APTs. Using pre-defined security policies and related mitigations, the approach explains discovered APTs and recommends appropriate mitigations to operators. We plan to leverage models of APT behavior and of normal business logic to diagnose such threats. Note that the technique is not constrained by the availability of human analysts, but can benefit from human-on-the-loop assistance.

The approach discussed in this paper will offer an unprecedented capability for observing long-term, subtle, system-wide activity by automatically constructing such global, long-term causality observations. The ability to automatically classify causal chains of events in terms of abstractions such as activities will provide operators with a unique capability to orient to long-term, system-wide evidence of possible threats. The diagnostic engine will provide a unique capability to identify whether groups of such activities likely represent active threats, making it easier for operators to decide whether long-term threats are active, and where they originate, even before those threats are identified by other means. Thus, the approach will pave the way for the first automated, long-horizon, continuously operating, system-wide support for an effective defender Observe, Orient, Decide, and Act (OODA) loop.

2 Running Example

The methods proposed in this article are illustrated on a realistic running example. The attackers in this example use sophisticated and recently discovered exploits to gain access to the victim's resources. The attack is remote and does not require social engineering or opening a malicious email attachment. The methods that we propose, however, are not limited to this class of attacks.

[Figure 1: Network topology for the attack — the hacker reaches, via the Internet and a router, the victim's local network, which contains the web server front-end, the data storage back end, and the administrator's machine.]

The network topology used for our running example is shown in figure 1. The attack is executed over several days. It starts by (1) compromising the web server front-end, followed by (2) a reconnaissance phase and (3) compromising the data storage back end, and ultimately extracting and modifying sensitive information belonging to the victim.

Both the front-end and the back end in this example run an unpatched Ubuntu 13.1 Linux OS on an Intel Sandy Bridge architecture.

What follows is a detailed chronology of the events:

1. The attacker uses the Apache httpd server, a cgi-bin script, and the Shellshock vulnerability (a GNU bash exploit registered in the Common Vulnerabilities and Exposures database as CVE-2014-6271; see https://nvd.nist.gov/) to gain remote shell access to the victim's front-end. It is now possible for the attacker to execute processes on the front-end as the non-privileged user www-data.

2. The attacker notices that the front-end is running an unpatched Ubuntu Linux OS version 13.1. The attacker uses the nc Linux utility to copy an exploit for obtaining root privileges. The particular exploit utilizes the x32 recvmmsg() kernel vulnerability registered in the Common Vulnerabilities and Exposures (CVE) database as CVE-2014-0038. After running the copied binary for a few minutes, the attacker gains root access to the front-end host.

3. The attacker installs a root-kit utility that intercepts all input to ssh.

4. A system administrator uses the compromised ssh to connect to the back-end, revealing his back-end password to the attacker.

5. The attacker uses the compromised front-end to bypass firewalls and uses the newly acquired back-end administrator's password to access the back-end.

6. The attacker uses a file-tree traversing utility on the back-end that collects sensitive data and consolidates it in an archive file.

7. The attacker sends the archive file to a third-party hijacked computer for analysis.

3 Auditing and Instrumentation

Almost all computing systems of sufficiently high level (with the exception of some embedded systems) leave detailed logs of all system and application activities. Many UNIX variants such as Linux log via the syslog daemon, while Windows uses the event log service. In addition to the usual logging mechanisms, there is a multitude of projects related to secure and detailed auditing. An audit log is a more detailed trail of any security- or computation-related activity such as file or RAM access, system calls, etc.

Depending on the level of security we would like to provide, there are several methods for collecting input security-related information. At one extreme, it is possible to use the existing log files. At the other extreme, there are applications for collecting detailed information about the application execution. One such approach [1] runs the processes of interest through a debugger and logs every memory read and write access.

It is also possible to automatically inject logging calls into the source files before compiling them, allowing us to have static or dynamic logging or a combination of the two. Log and audit information can be signed, encrypted, and sent in real time to a remote server to make system tampering and activity-hiding more difficult. All these configuration decisions impose different trade-offs in security versus computational and RAM load [2] and depend on the organizational context.
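Whatever collection mechanism is chosen, the raw records must eventually be parsed into structured events before any causal reasoning can happen. As a minimal illustrative sketch (not part of the proposed system), the following Python parser handles auditd-style key=value records; the sample record is hypothetical, modeled on the excerpts shown later in Figure 2:

```python
import re

def parse_audit_record(line: str) -> dict:
    """Parse a Linux auditd-style record into a dict of fields.

    Records are sequences of key=value pairs, where values are either
    bare tokens or double-quoted strings; spaces around '=' are tolerated.
    """
    fields = {}
    for key, quoted, bare in re.findall(r'(\w+)\s*=\s*(?:"([^"]*)"|(\S+))', line):
        fields[key] = quoted if quoted else bare
    return fields

# Hypothetical record modeled on the auditd excerpt in Figure 2.
record = ('type=SYSCALL msg=audit(1310392408.506:36): syscall=2 '
          'success=yes exit=3 ppid=20478 pid=21013 auid=1000 uid=0 '
          'comm="grep" exe="/bin/grep"')

event = parse_audit_record(record)
assert event["comm"] == "grep"
assert event["uid"] == "0"  # effective root: one low-level escalation indicator
```

Structured events of this kind are what the preprocessing described in the next section turns into provenance graphs.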
front_end.secure_access_log: 11.239.64.213 - [22/Apr/2014 06:30:24 +0200] "GET /cgi-bin/test.cgi HTTP/1.1" 401 381

front_end.rsyslogd.log: recvmsg(3, msg_name(0) = NULL, msg_iov(1) = ["29/Apr/2014 22:15:49 ...", 8096], msg_controllen = 0, msg_flags = MSG_CTRUNC, MSG_DONTWAIT) = 29

back_end:auditctl: type = SYSCALL msg = audit(1310392408.506:36): arch = c000003e syscall = 2 success = yes exit = 3 a0 = 7fff2ce9471d a1 = 0 a2 = 61f768 a3 = 7fff2ce92a20 items = 1 ppid = 20478 pid = 21013 auid = 1000 uid = 0 gid = 0 euid = 0 suid = 0 fsuid = 0 egid = 0 sgid = 0 fsgid = 0 ses = 1 comm = "grep" exe = "/bin/grep"

Figure 2: Part of the log files related to the attack from the running example

Figure 2 shows part of the logs collected for our running example. The first entry is from when the attacker exploits the Shellshock vulnerability through a CGI script of the web server. The second entry shows a syslog strace-like message resulting from the kernel escalation. Finally, the attacker uses the grep command on the back-end server to search for sensitive information, and the call is recorded by the audit system.

It is often the case that the raw system and security log files are preprocessed and initial causal links are computed. If we trace the exec, fork, and join POSIX system calls, for example, it is possible to add a graph-like structure to the log files, computing provenance graphs. Another method for computing local causal links is to consider shared resources, e.g., two threads reading and writing the same memory address [1].

4 Activity Classification

The Activity Classifier continuously annotates audit trails with semantic tags describing the higher-order activity they represent. For example, 'remote shell access', 'remote file overwrite', and 'intra-network data query' are possible activity tags. These tags are used by the APT Diagnostics Engine to enable higher-order reasoning about related activities, and to prioritize activities for possible investigation.

4.1 Hierarchical semantic annotation of audit trails

A key challenge in abstracting low-level events into higher-order activity patterns that can be reasoned about efficiently is that such patterns can be described at multiple levels of semantic abstraction, all of which may be useful in threat analysis. Indeed, higher-order abstractions may be composed of lower-order abstractions that are in turn abstractions of low-level events. For example, a sequential set of logged events such as 'browser forking bash', 'bash initiating Netcat', and 'Netcat listening on a new port' might be abstracted as the activity 'remote shell access'. The set of activities 'remote shell access' and 'escalation of privilege' can be abstracted as the activity 'remote root shell access'.

We approach activity annotation as a supervised learning problem that uses classification techniques to generate activity tags for audit trails. Table 1 shows multiple levels of activity classifications for the above APT example.

Table 1 represents one possible classification-enriched audit trail for such an APT. There can be many relatively small variations. For example, obscuring the password file could be done using other programs. A single classifier only allows for a single level of abstraction, and a single leap from low-level events to very abstract activities (for example, from the 'bash execute perl' level to the 'extracting modified file' level) will have a higher error caused by these additional variations.

To obtain several layers of abstraction to reason over, and thus reduce the overall classification error, we use a multi-level learning strategy that models information at multiple levels of semantic abstraction using multiple classifiers. Each classifier solves the problem at one abstraction level, mapping from a lower-level (fine) feature space to the next higher-level, conceptual (coarse) feature space.

The activity classifier relies on both a vocabulary of activities and a library of patterns describing these activities, both of which will initially be defined manually. This vocabulary and pattern set reside in a Knowledge Base. In our training approach, results from training lower-level classifiers are used as training data for higher-level classifiers. In this way, we coherently train all classifiers by preventing higher-level classifiers from being trained with patterns that will never be generated by their lower-level precursors. We use an ensemble learning approach to achieve accurate classification. This involves stacking together both bagged and boosted models to reduce both the variance and the bias error components [3]. The classification algorithm will be trained using an online-learning technique and integrated within an Active Learning Framework to improve the classification of atypical behaviors.

Generating Training Data for Classification. To build the initial classifier, training data is generated using two methods. First, an actual deployed system is used to collect normal behavior data, and a Subject Matter Expert manually labels it. Second, a testing platform is used to generate data in a controlled environment, particularly platform-dependent, vulnerability-related behavior. In addition, to generate new training data of previously unknown behavior, we use an Active Learning framework as described in Section 5.
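The multi-level strategy can be sketched as follows. In this illustrative Python fragment, fixed pattern tables stand in for the learned classifiers at each abstraction level, and all event and activity names are hypothetical; the point is only the chaining, in which the output of one level is the input of the next:

```python
# Two-level activity abstraction: each level maps a window of
# lower-level labels to a higher-level label via a pattern library.
# The patterns below are hypothetical stand-ins for learned classifiers.

LEVEL_1_PATTERNS = {
    ("browser fork bash", "bash fork netcat", "netcat listen"): "remote shell access",
    ("binary exec", "uid change to 0"): "escalation of privilege",
}

LEVEL_2_PATTERNS = {
    ("remote shell access", "escalation of privilege"): "remote root shell access",
}

def classify(events, patterns):
    """Greedily tag maximal matching windows; unmatched events pass through."""
    out, i = [], 0
    while i < len(events):
        for pat, label in patterns.items():
            if tuple(events[i:i + len(pat)]) == pat:
                out.append(label)
                i += len(pat)
                break
        else:
            out.append(events[i])
            i += 1
    return out

low = ["browser fork bash", "bash fork netcat", "netcat listen",
       "binary exec", "uid change to 0"]
mid = classify(low, LEVEL_1_PATTERNS)    # level 1: events -> activities
high = classify(mid, LEVEL_2_PATTERNS)   # level 2: activities -> higher activities
assert high == ["remote root shell access"]
```

In the proposed system each table would be a trained classifier, and the lower level's outputs would double as the higher level's training data, as described above.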
Table 1: Sample classification problem for the running example

| Activity 1                  | Activity 2                      | Activity 3                    |
| Remote Shell Access         | Remote File Overwrite           | Modified File Download        |
| Shell Shock                 | Trojan Installation             | Password Exfiltration         |
|-----------------------------|---------------------------------|-------------------------------|
| Browser (Port 80) fork bash | Netcat listen to Port 8443      | Netcat listen to Port 8443    |
| bash fork Netcat            | Port 8443 receive binary file   | Port 8443 fork bash           |
| Netcat listen to port 8080  | binary file overwrites libns.so | bash execute perl             |
|                             |                                 | Perl overwrite /tmp/stolen_pw |
|                             |                                 | Port 8443 send /tmp/stolen_pw |
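To make concrete how labeled traces like those in Table 1 could drive a supervised activity tagger, the following Python sketch scores an observed event sequence against each column with Jaccard similarity. The similarity heuristic is only an illustrative stand-in for the learned classifiers, and the event strings are paraphrased from Table 1:

```python
# Hypothetical sketch: tag an observed event sequence with the closest
# labeled activity trace from Table 1, using Jaccard similarity over
# event sets. A trained classifier would replace this heuristic.

LABELED_TRACES = {
    "Remote Shell Access (Shell Shock)": {
        "browser (port 80) fork bash", "bash fork netcat",
        "netcat listen to port 8080"},
    "Remote File Overwrite (Trojan Installation)": {
        "netcat listen to port 8443", "port 8443 receive binary file",
        "binary file overwrites libns.so"},
    "Modified File Download (Password Exfiltration)": {
        "netcat listen to port 8443", "port 8443 fork bash",
        "bash execute perl", "perl overwrite /tmp/stolen_pw",
        "port 8443 send /tmp/stolen_pw"},
}

def jaccard(a, b):
    """Set overlap: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def tag(observed):
    """Return the label of the most similar labeled trace."""
    obs = set(observed)
    return max(LABELED_TRACES, key=lambda lbl: jaccard(obs, LABELED_TRACES[lbl]))

observed = ["port 8443 fork bash", "bash execute perl",
            "port 8443 send /tmp/stolen_pw"]
assert tag(observed) == "Modified File Download (Password Exfiltration)"
```

A set-based similarity ignores event order; the actual classifiers would also exploit sequence structure, as the provenance-graph preprocessing above makes possible.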
5 Prioritizer

As the Activity Classifier annotates audit trails with activity descriptors, the two (parallel) next steps in our workflow are to 1) prioritize potential threats to be referred to the Diagnostic Engine (see Section 6) for investigation, and 2) prioritize emergent activities that (after suitable review and labeling) are added to the activity classifier training data. This module prioritizes activities by threat severity and confidence level. This prioritization process presents three key challenges.

5.1 Threat-based rank-annotation of activities

One challenge in ranking activities according to their threat potential is the complex (and dynamic) notion of what constitutes a threat. Rankings based on matching to known prior threats are necessary, but not sufficient. An ideal ranking approach should take known threats into account, while also proactively considering the unknown threat potential of new kinds of activities. Another such challenge is that risk may be assessed at various levels of activity abstraction, requiring that the overall ranking be computed by aggregating risk assessments at multiple abstraction levels.

We implement two ranking approaches: a supervised ranker based on previously known threats and an unsupervised ranker that considers unknown potential threats.

Supervised ranking using APT classification to catch known threats. The goal of APT classification is to provide the diagnostic engine with critical APT-related information such as the APT phase, the severity of the attack, and the confidence level associated with the APT tagging for threat prioritization. Since the audit trails are annotated hierarchically into different granularities of actions, multiple classifiers will be built to consider each hierarchical level separately. APT classifiers are used to identify entities that are likely to be instances of known threats or phases of an APT attack. Two types of classifiers are used. The first classifier is hand-coded and the second classifier is learned from training data.

The hand-coded classifier is designed to have high precision, using hand-coded rules, mirroring SIEM and IDS systems. Entities tagged by this classifier are given the highest priority for investigation. The second classifier, which is learned from training data, will provide higher recall at the cost of precision. Activities are ranked according to their threat level by aggregating a severity measure (determined by the classified threat type) and a confidence measure. We complement the initial set of training data used to calibrate our classifiers by using an Active Learning Framework, which focuses on improving the classification algorithm through occasional manual labeling of the most critical activities in the audit trails.

Unsupervised ranking using normalcy characterization to catch unknown threats. The second component of the prioritizer is a set of unsupervised normalcy rankers, which rank entities based on their statistical "normalcy". Activities identified as unusual will be fed to the Active Learning framework to check if any of them are "unknown" APT activities. This provides a mechanism for detecting "unknown" threats while also providing feedback to improve the APT classifier.

5.2 Combining Multiple Rankings

One of the key issues with combining the outputs of multiple risk rankings is dealing with two-dimensional risk (severity, confidence) scores that may be on very different scales. A diverse set of score normalization techniques has been proposed [4; 5; 6] to deal with this issue, but no single technique has been found to be superior to all the others. An alternative to combining scores is to combine rankings [7]. Although converting scores to rankings does lose information, it remains an open question whether the loss in information is compensated for by the convenience of working with the common scale of rankings.

We will develop combination techniques for weighted risk rankings based on probabilistic rank aggregation methods. This approach builds on our own work [8], which shows the robustness of the weighted ranking approach. We also build on principled methods for combining ranking data found in the statistics and information retrieval literature.

Traditionally, the goal of rank aggregation [9; 10] is to combine a set of rankings of the same candidates into a single consensus ranking that is "better" than the individual rankings. We extend the traditional approach to accommodate the specific context of weighted risk ranking. First, unreliable rankers will be identified and either ignored or down-weighted, lest their rankings decrease the quality of the overall consensus [7; 10]. Second, we will discount excessive correlation among rankers, so that a set of
highly redundant rankers does not completely outweigh the contribution of other, alternative rankings. To address these two issues, we will associate a probabilistic latent variable Zi with the i-th entity of interest, which indicates whether the entity is anomalous or normal. Then, we will build a probabilistic model that allows us to infer the posterior distribution over the Zi based on the observed rankings produced by each of the input weighted risk rankings. This posterior probability of Zi being normal will then be used as the weighted risk rank. Our model will make the following assumptions to account for both unreliable and correlated rankers: 1) anomalies are ranked lower than all normal instances, and these ranks tend to be concentrated near the lower rankings of the provided weighted risk rankings; and 2) normal data instances tend to be uniformly distributed near the higher rankings of the weighted risk rankings.

There are various ways to build a probabilistic model that reflects the above assumptions and allows for the inference of the Zi variables through Expectation-Maximization [11]. In addition to these assumptions, we will explore allowing other factors to influence the latent Zi variables, such as features of the entities as well as feedback provided by expert analysts.

6 Diagnosis

We view the problem of detecting, isolating, and explaining complex APT campaign behavior from rich activity data as a diagnostic problem. We will use AI-based diagnostic reasoning to guide the global search for the possible vulnerabilities that enabled the breach. Model-based diagnosis (MBD) [12] is a particularly compelling approach as it supports reasoning over complex causal networks (for example, those having multiple conjunctions, disjunctions, and negations) and identifies often subtle combinations of root causes of the symptoms (the breach).

6.1 An MBD approach for APT detection and isolation: Motivation

Attack detection and isolation are two distinct challenges. Often, diagnostic approaches use separate models for detection and isolation [13]. MBD, however, uses a single model to combine these two reasonings. The security model contains both part of the security policy (for example, that communicating with certain blacklisted hosts may indicate an information leak) and information about the possible locations and consequences of a vulnerability (a privilege escalation may lead to an information leak). The security model also contains abstract security constraints, such as: if a process requires authentication, a password must be read and compared against.

The diagnostic approach takes into consideration the bootstrapping of an APT, which we consider the root cause of the attack. What enables a successful APT is either a combination of software component vulnerabilities or the combined use of social engineering and insufficient organizational security policies. We use MBD for computing the set of simultaneously exploited vulnerabilities that allowed the deployment of the APT. Computing such explanations is possible because MBD reasons in terms of multiple faults [14]. In our running example, this set would include both the fact that the web server has been exploited due to the Shellshock vulnerability and the fact that the attacker gained privileged access on the front-end due to the use of the X64_32 escalation vulnerability.

The abstract security model is used to gather information about the types of attacks the system is vulnerable to, and to aid in deciding the set of actions required to stop an APT campaign (policy enforcement). Various heuristics exist to find the set of meaningful diagnosis candidates. As an example, one might be interested in the minimal set of actions to stop the attack [15; 16], or select those candidates that capture significant probability mass [17]. In the rest of this section, for illustration purposes, we use minimality as the heuristic of interest. MBD is the right tool for dealing with the computation of diagnosis candidates as it offers several ways to address the modeling and computational complexity [18; 19].

6.2 Detection and Isolation of Attacks from Abstract Security Model and Sensor Data

The abstract security model provides an abstraction mechanism that is originally missing in the audit trails. More precisely, what is not in the audit trails, and what is in the security model, is how to connect (possibly disconnected) activities for the purpose of global reasoning. The abstract security model and the sensor data collected from the audit trails are provided as inputs to an MBD algorithm that performs the high-level reasoning about possible vulnerabilities and attacks, similar to what a human security officer would do.

The information in the "raw" audit trails is of too high fidelity [2] and too low abstraction to be used by a "crude" security model. That is the reason the diagnostic engine needs the machine learning module: to temporally and spatially group nodes in the audit trails and to provide semantically rich variable/value sensor data about actions, suitable for MBD. Notice that in this process the audit trail structure is translated into semantic categories, i.e., the diagnostic engine receives as observations time series of sensed actions.

The listing that follows shows an abstract security model for the running example in the LYDIA language [20]. It bears some resemblance to Prolog, except that LYDIA is a language for model-based diagnosis of logical circuits while Prolog is for Horn-style reasoning. The use of LYDIA is for illustration purposes only; in reality, computer systems can be much more easily modeled as state machines. There is a significant body of literature dealing with the diagnosis of discrete-event systems; see [21; 22; 23], to name just a few.
1 system front_end (bool know_root_password) know root password
2 { root shell
3 bool httpd_shell_vuln ; // vulnerability httpd shell
>
bool buffer_overflow_vuln ; // vulnerability
2
4
r
bool escalation_vuln ; // vulnerability
leak pw1
5 2
p
6 Legend:
1
assumable variable
7 bool httpd_shell ; buffer overflow vuln 1
2
q 2
internal variable
8 bool root_shell ;
9 bool leak_passwd;
10 Figure 3: Part of the abstract security model for the
11 // weak−fault models
running example
12 if (! httpd_shell_vuln ) { // if healthy
13 ! httpd_shell ; // forbid shells via httpd
14 } rity constraints in it is notoriously difficult, hence we
15 plan to create or use specialized modal logic similar to
16 if (! escalation_vuln ) { // if healthy
17 ! root_shell ; // no root shell is possible
the one proposed in [25].
18 } Notice that the format of the Boolean circuit shown
19 in figure 3 is very close to the one used in Truth Main-
20 if (! buffer_overflow_vuln ) { // if healthy tenance System (TMS) [26]. The only assumable vari-
21 !leak_passwd; // passwords don’t leak able in figure 3 is buffer_overflow_vuln and its
22 } default value is false (i.e., there is no buffer overflow
23 vulnerability in the web server process).
24 bool access_passwd; We next show how a reasoning engine can discover
25 attribute observable (access_passwd) = true; a conflict through forward and backward propagation.
26
Looking at figure 3, it is clear that r must be true be-
27 !access_passwd => !leak_passwd;
28 cause it is an input to an AND-gate whose output is set
29 /∗∗ to true. Therefore either p or q (or both) must be true.
30 ∗ Knowing the root password can be explained This means that either buffer_overflow_vuln or
31 ∗ by a root shell ( for example there is a leak_pw must be false. If we say that leak_pw is
32 ∗ password sniffer ). assumed to be true (measured or otherwise inferred),
33 ∗/ then leak_pw and buffer_overflow_vuln are to-
34 know_root_password => gether part of a conflict. It means that the reasoning
35 (( httpd_shell || leak_passwd) && root_shell ); engine has to change one of them to resolve the con-
36 } tradiction.
37
38 system back_end(bool know_root_password)
Based on the observation from our running exam-
39 { ple and a TMS constructed from the security model
40 bool comm; shown in figure 3, the hitting set algorithm computes
41 attribute observable (comm) = true; two possible diagnostic hypotheses: (1) the attacker
42 gained a shell access through a web-server vulnerabil-
43 /∗∗ ity and the attacker performed privilege escalation or
44 ∗ Normal users can only communicate with a (2) the attacker injected binary code through a buffer
45 ∗ list of permitted hosts . overflow and the attacker performed privilege escala-
46 ∗/ tion.
47 if (! know_root_password) { If we use LYDIA to compute the set of diagnoses for
48 comm == true;
49 }
the running example, we get the following two (am-
50 } biguous) diagnoses for the root-cause of the penetra-
51 tion:
52 system main() $ lydia example.lm example.obs
53 { d1 = { fe.escalation_vuln,
54 bool know_root_password;
fe.httpd_shell_vuln }
55
56 system front_end fe (know_root_password); d2 = { fe.buffer_overflow_vuln,
57 system back_end be(know_root_password); fe.escalation_vuln }
58 } MBD uses probabilities to computes a sequence of
possible diagnoses ordered by likelihood. This proba-
LYDIA translates the model to an internal proposi- bility can be used for many purposes: decide which di-
tional logic formula. Part of this internal representa- agnosis is more likely to be the true fault explanation,
tion is shown in figure 3, which uses the standard VLSI whether there is the need for consider further evidence
[24] notation to denote AND-gates, OR-gates, and from the logs or limit the number of diagnoses that
NOT-gates. Wires are labeled with variable names. need to be identified. Many policies exist to compute
Boolean circuits (matching propositional logic), how- these probabilities [27; 28].
ever, have limited expressiveness and modeling secu- For illustration purposes we consider that the diag-
6
198
Proceedings of the 26th International Workshop on Principles of Diagnosis
noses for the running example are ambiguous. Before and their diagnostic accuracy in the context of trans-
we discuss methods for dealing with this ambiguity, parent computing.
we address the major research challenge of model gen-
eration. 7 Conclusions
6.3 Model Generation

The abstract vulnerability model can be constructed either manually or semi-automatically. The challenge is that an APT campaign generally exploits unknown vulnerabilities. Our approach to this issue is therefore to construct a model that captures the expected behavior (known goods) of the system. Starting from generic parameterized vulnerability models and security objectives, the abstract vulnerability model can then be extended with information related to known vulnerabilities (known bads).

We will first explore avenues to generate this model manually. Manual generation requires significant knowledge about potential security vulnerabilities, and it is error-prone and often not detailed enough. Amongst company-specific requirements, we envisage the abstract vulnerability model capturing the most common attacks that target software systems, as described in the Common Attack Pattern Enumeration and Classification (CAPEC¹). This comprehensive list of known attacks has been designed to better understand the perspective of an attacker exploiting vulnerabilities and, from this knowledge, to devise appropriate defenses.

As modeling is challenging, we also propose to explore semi-automatic approaches to constructing models. Semi-automatic methods suit this task because in security, as in diagnosis, there are (1) component models and (2) structure. While it is difficult to automate the building of component models (this may even require natural-language parsing of databases such as CAPEC), it is feasible to capture diagnosis-oriented information from structure (physical networking or network communication).
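As a toy illustration of what structure alone can contribute (topology and host names are hypothetical), the components that can appear on a causal chain between a suspected entry point and a compromised asset are exactly those reachable from the entry point that can also reach the asset:

```python
# Toy sketch: derive an initial diagnosis candidate set from network
# structure alone. The topology and host names are hypothetical.
from collections import deque

def reachable(graph, start):
    """Breadth-first search; returns all nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

def candidate_components(graph, entry, asset):
    """Hosts lying on some entry-to-asset path: reachable from the
    entry point AND able to reach the asset (via the reversed graph)."""
    forward = reachable(graph, entry)
    reverse = {}
    for node, succs in graph.items():
        for succ in succs:
            reverse.setdefault(succ, set()).add(node)
    backward = reachable(reverse, asset)
    return forward & backward

graph = {"web": {"app"}, "app": {"db", "cache"}, "vpn": {"app"}}
candidates = candidate_components(graph, "web", "db")
# Only web, app, and db can lie on the causal chain; the cache and
# the VPN host are exonerated without any component model at all.
```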
Yet another approach to semi-automatically generating the model is to learn it from executions of the system (e.g., during regression testing, just before deployment). This approach to system modeling is inspired by work in automatic software debugging [29], where program behavior is modeled in terms of abstractions of program traces, known as spectra [30], abstracting away from models of specific components and data dependencies.
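The spectrum abstraction can be sketched as follows: each execution records which components it exercised together with a pass/fail verdict, and components are ranked by the Ochiai similarity between their involvement and the failures, as in spectrum-based fault localization [29]. The spectrum matrix below is hypothetical:

```python
# Sketch of spectrum-based scoring (cf. [29; 30]). spectrum[i][j] is
# 1 if component j was involved in execution i; errors[i] is 1 if
# execution i failed. The example data is hypothetical.
from math import sqrt

def ochiai_ranking(spectrum, errors):
    """Rank components by Ochiai similarity to the error vector."""
    n_components = len(spectrum[0])
    total_fail = sum(errors)
    scores = []
    for j in range(n_components):
        fail_hit = sum(1 for i, row in enumerate(spectrum)
                       if row[j] and errors[i])
        pass_hit = sum(1 for i, row in enumerate(spectrum)
                       if row[j] and not errors[i])
        denom = sqrt(total_fail * (fail_hit + pass_hit))
        scores.append(fail_hit / denom if denom else 0.0)
    ranking = sorted(range(n_components), key=lambda j: -scores[j])
    return ranking, scores

# Three executions over three components; component 1 is exercised in
# every failing run and in no passing run, so it ranks first.
spectrum = [[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]]
errors = [1, 1, 0]  # the first two executions failed
ranking, scores = ochiai_ranking(spectrum, errors)
```

The same idea carries over to audit trails: executions become observed activity sequences, and components become the system entities involved in them.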
The outlined approaches to constructing the abstract vulnerability model entail different costs and diagnostic accuracies. As expected, manually building the model is the most expensive: it is a time-consuming and error-prone task. The two semi-automatic approaches also entail different costs: one exploits the available static information, while the other requires the system to be executed to compute a meaningful set of executions. We will investigate the trade-offs between these modeling approaches and their diagnostic accuracy in the context of transparent computing.

7 Conclusions

Identifying the root cause of advanced persistent threats and performing damage impact assessment can be framed as a diagnostic problem. In this paper, we discuss an approach that leverages machine learning and model-based diagnosis techniques to reason about potential attacks.

Our approach classifies audit trails into high-level activities, and then reasons about those activities and their threat potential both in real time and forensically. By using the outcome of this reasoning to explain complex evidence of malicious behavior, system administrators are provided with the proper tools to promptly react to, stop, and mitigate attacks.

¹ http://capec.mitre.org/

References

[1] Kyu Hyung Lee, Xiangyu Zhang, and Dongyan Xu. High accuracy attack provenance via binary-based execution partition. In Proceedings of the 20th Annual Network and Distributed System Security Symposium, San Diego, CA, February 2013.

[2] Kyu Hyung Lee, Xiangyu Zhang, and Dongyan Xu. LogGC: Garbage collecting audit log. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, pages 1005–1016, Berlin, Germany, 2013.

[3] M. Galar, A. Fernández, E. Barrenechea, H. Bustince, and F. Herrera. A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42(4):463–484, July 2012.

[4] Charu C. Aggarwal. Outlier ensembles: Position paper. ACM SIGKDD Explorations Newsletter, 14(2):49–58, 2013.

[5] Jing Gao and Pang-Ning Tan. Converting output scores from outlier detection algorithms into probability estimates. In Proceedings of the Sixth International Conference on Data Mining, pages 212–221. IEEE, December 2006.

[6] Hans-Peter Kriegel, Peer Kröger, Erich Schubert, and Arthur Zimek. Interpreting and unifying outlier scores. In Proceedings of the Eleventh SIAM International Conference on Data Mining, pages 13–24, April 2011.

[7] Erich Schubert, Remigius Wojdanowski, Arthur Zimek, and Hans-Peter Kriegel. On evaluation of outlier rankings and outlier scores. In Proceedings of the Twelfth SIAM International Conference on Data Mining, pages 1047–1058, April 2012.
[8] Hoda Eldardiry, Kumar Sricharan, Juan Liu, John Hanley, Bob Price, Oliver Brdiczka, and Eugene Bart. Multi-source fusion for anomaly detection: using across-domain and across-time peer-group consistency checks. Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA), 5(2):39–58, 2014.

[9] Yoav Freund, Raj D. Iyer, Robert E. Schapire, and Yoram Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4(Nov):933–969, 2003.

[10] Ke Deng, Simeng Han, Kate J. Li, and Jun S. Liu. Bayesian aggregation of order-based rank data. Journal of the American Statistical Association, 109(507):1023–1039, 2014.

[11] Arthur P. Dempster, Nan M. Laird, and Donald B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.

[12] Johan de Kleer, Olivier Raiman, and Mark Shirley. One step lookahead is pretty good. In Readings in Model-Based Diagnosis, pages 138–142. Morgan Kaufmann Publishers, San Francisco, CA, 1992.

[13] Alexander Feldman, Tolga Kurtoglu, Sriram Narasimhan, Scott Poll, David Garcia, Johan de Kleer, Lukas Kuhn, and Arjan van Gemund. Empirical evaluation of diagnostic algorithm performance using a generic framework. International Journal of Prognostics and Health Management, pages 1–28, 2010.

[14] Johan de Kleer and Brian Williams. Diagnosing multiple faults. Artificial Intelligence, 32(1):97–130, 1987.

[15] Oleg Sheyner, Joshua Haines, Somesh Jha, Richard Lippmann, and Jeannette M. Wing. Automated generation and analysis of attack graphs. In Proceedings of the 2002 IEEE Symposium on Security and Privacy, pages 273–284. IEEE, May 2002.

[16] Seyit Ahmet Camtepe and Bülent Yener. Modeling and detection of complex attacks. In Proceedings of the Third International Conference on Security and Privacy in Communications Networks, pages 234–243, September 2007.

[17] Rui Abreu and Arjan J. C. van Gemund. A low-cost approximate minimal hitting set algorithm and its application to model-based diagnosis. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation, pages 2–9, July 2009.

[18] Alexander Feldman, Gregory Provan, and Arjan van Gemund. Approximate model-based diagnosis using greedy stochastic search. Journal of Artificial Intelligence Research, 38:371–413, 2010.

[19] Nuno Cardoso and Rui Abreu. A distributed approach to diagnosis candidate generation. In Progress in Artificial Intelligence, pages 175–186. Springer, 2013.

[20] Alexander Feldman, Jurryt Pietersma, and Arjan van Gemund. All roads lead to fault diagnosis: Model-based reasoning with LYDIA. In Proceedings of the Eighteenth Belgium-Netherlands Conference on Artificial Intelligence (BNAIC'06), Namur, Belgium, October 2006.

[21] Meera Sampath, Raja Sengupta, Stephane Lafortune, Kasim Sinnamohideen, and Demosthenis C. Teneketzis. Failure diagnosis using discrete-event models. IEEE Transactions on Control Systems Technology, 4(2):105–124, 1996.

[22] Alban Grastien, Marie-Odile Cordier, and Christine Largouët. Incremental diagnosis of discrete-event systems. In DX, 2005.

[23] Alban Grastien, Patrik Haslum, and Sylvie Thiébaux. Conflict-based diagnosis of discrete event systems: theory and practice. 2012.

[24] Behrooz Parhami. Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press, Inc., New York, NY, USA, 2nd edition, 2009.

[25] Janice Glasgow, Glenn Macewen, and Prakash Panangaden. A logic for reasoning about security. ACM Transactions on Computer Systems, 10(3):226–264, August 1992.

[26] Kenneth Forbus and Johan de Kleer. Building Problem Solvers. MIT Press, 1993.

[27] Johan de Kleer. Diagnosing multiple persistent and intermittent faults. In Proceedings of the 2009 International Joint Conference on Artificial Intelligence, pages 733–738, July 2009.

[28] Rui Abreu, Peter Zoeteweij, and Arjan J. C. van Gemund. A new Bayesian approach to multiple intermittent fault diagnosis. In Proceedings of the 2009 International Joint Conference on Artificial Intelligence, pages 653–658, July 2009.

[29] Rui Abreu, Peter Zoeteweij, and Arjan J. C. van Gemund. Spectrum-based multiple fault localization. In Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering, pages 88–99, November 2009.

[30] Mary Jean Harrold, Gregg Rothermel, Kent Sayre, Rui Wu, and Liu Yi. An empirical investigation of the relationship between spectra differences and regression faults. Software Testing, Verification and Reliability, 10(3):171–194, 2000.