=Paper= {{Paper |id=Vol-1218/bmaw2014_paper_6 |storemode=property |title=Using Bayesian Attack Detection Models to Drive Cyber Deception |pdfUrl=https://ceur-ws.org/Vol-1218/bmaw2014_paper_6.pdf |volume=Vol-1218 |dblpUrl=https://dblp.org/rec/conf/uai/JonesL14 }} ==Using Bayesian Attack Detection Models to Drive Cyber Deception== https://ceur-ws.org/Vol-1218/bmaw2014_paper_6.pdf
  Using Bayesian Attack Detection Models to Drive Cyber Deception


              James H. Jones, Jr.
Department of Electrical and Computer Engineering
             George Mason University
                Fairfax, VA 22030

              Kathryn B. Laskey
Department of Systems Engineering and Operations Research
             George Mason University
                Fairfax, VA 22030



                     Abstract

We present a method to devise, execute, and assess a cyber deception. The aim is to cause an adversary to believe they are under a cyber attack when in fact they are not. Cyber network defense relies on human and computational systems that can reason over multiple individual evidentiary items to detect the presence of meta events, i.e., cyber attacks. Many of these systems aggregate and reason over alerts from Network-based Intrusion Detection Systems (NIDS). Such systems use byte patterns as attack signatures to analyze network traffic and generate corresponding alerts. Current aggregation and reasoning tools use a variety of techniques to model meta-events, among them Bayesian Networks. However, the inputs to these models are based on network traffic, which is inherently subject to manipulation. In this work, we demonstrate a capability to remotely and artificially trigger specific meta events in a potentially unknown model. We use an existing and known Bayesian Network based cyber attack detection system to guide construction of deceptive network packets. These network packets are not actual attacks or exploits, but rather contain selected features of attack traffic embedded in benign content. We provide these packets to a different cyber attack detection system to gauge their generalizability and effect. We combine the deception packets' characteristics, the second system's response, and external observables to propose a deception model to assess the effectiveness of the manufactured network traffic on our target. We demonstrate the development and execution of a specific deception, and we propose the corresponding deception model.

Key words: Cyber Deception, Cyber Attack, Bayesian Model, Deception Model, Intrusion Detection System

1. INTRODUCTION

Network-based Intrusion Detection Systems (NIDS) are essentially granular sensors. Their measurements consist of computer network traffic, sometimes at the packet level, which matches signatures of known cyber attack activity. For a typical network, the individual data points are numerous and require aggregation, fusion, and context to acquire meaning. This reasoning may be accomplished through the use of cyber attack detection models, where the NIDS data points represent evidence and specific cyber attacks or classes of attacks represent hypotheses. Modeling approaches, including Bayesian Networks, have been applied in the past, are an active research area, and are in use today in deployed systems.

The input to a NIDS sensor is network traffic, which is inherently uncertain and subject to manipulation. Prior research has exploited this fact to create large numbers of false NIDS alerts to overwhelm or disable the backend processing systems. In this work, we leverage knowledge of the backend cyber attack models to craft network traffic which manipulates the inputs, and hence the outputs, of both known and unknown models. With a small number of packets and no actual cyber attack, we are able to create the false impression of an active attack.

In this work, we describe a general approach to network-based offensive cyber deception, and we demonstrate an implementation of such a deception. We use an existing cyber attack detection model to guide the development of deception traffic, which is then processed by a second and distinct cyber attack detection model. Finally, we propose a deception model to assess the effectiveness of the deception on a target. Future work will expand and automate the generation of deceptive network packets and further develop the deception model.
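The claim above, that a handful of crafted packets and no actual attack can create the impression of one, can be made concrete with a toy alert aggregator. The prior, per-alert likelihood ratio, and functional form below are hypothetical illustrations for intuition only, not the Storm model or any fielded system:

```python
def detector_posterior(n_alerts, prior=0.001, lr_per_alert=4.0):
    """Toy naive-Bayes alert aggregator: each (possibly spoofed)
    signature alert multiplies the odds of 'attack' by a fixed
    likelihood ratio.  All numbers are illustrative assumptions."""
    odds = prior / (1 - prior) * lr_per_alert ** n_alerts
    return odds / (1 + odds)

# A few crafted packets, each tripping one alert, can push a low
# prior to a high posterior with no real attack present.
assert detector_posterior(0) < 0.01
assert detector_posterior(6) > 0.5
```

The asymmetry described later in the paper falls out of this arithmetic: the attacker's cost is a few packets, while the defender's posterior (and workload) grows multiplicatively.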
2. RELATED WORK

Deception has been a staple of military doctrine for thousands of years, and a key element of intelligence agency activities since they took their modern form in World War II. From Sun Tzu 2,500 years ago (Tzu, 2013), to Operation Mincemeat in 1943 (Montagu and Joyce, 1954), to the fictional operation in the 2007 book Body of Lies (Ignatius, 2007), one side has endeavored to mislead the other through a variety of means and for a variety of purposes. The seminal work of Whaley and Bell (Whaley, 1982; Bell and Whaley, 1982; Bell and Whaley, 1991) formalized and in some ways defended deception as both necessary and possible to execute without "self contamination".

Deception operations have naturally begun to include the cyber domain, although the majority of this work has been on the defensive side. Fred Cohen suggested a role for deception in computer system defense in 1998 (Cohen, 1998) and simultaneously released his honeypot implementation called The Deception Toolkit (Cohen, 1998). Honeypots are systems meant to draw in attackers so they may be distracted and/or studied. The Deception Toolkit was one of the first configurable and dynamic honeypots, as opposed to prior honeypots which were simply static vulnerable systems with additional administrator control and visibility. Other honeypot implementations have followed, and they remain a staple of defensive cyber deception. Neil Rowe (2003, 2007), his colleague Dorothy Denning (Yuill, Denning, and Feer, 2006), and students (Tan, 2003) at the Naval Postgraduate School have been researching defensive cyber deception for several years. Extending their early work identifying key disruption points of an attack, they propose deception by resource denial, where some key element of an active attack vector is deceptively claimed to be unavailable. Such an approach stalls the attacker while the activity can be analyzed and risks mitigated. Other defensive cyber deception approaches include masking a target's operating system (Murphy, McDonald, and Mills, 2010), and actively moving targets within an IP address and TCP port space (Kewley, et al., 2001), later labeled "address shuffling". Crouse's (2012) comparison of the theoretical performance of honeypots and address shuffling remains one of the few rigorous comparisons of techniques. Until recently, most defensive cyber deception involved theoretical work or small proofs of concept. However, in 2011, Ragsdale both legitimized defensive cyber deception and raised the bar when he introduced DARPA's Scalable Cyber Deception program (Ragsdale, 2011). The program aims to automatically redirect potential intruders to tailorable decoy products and infrastructures in real time and at an enterprise scale.

By comparison, offensive cyber deception has been discussed only briefly in the literature, often as a secondary consideration. For example, honeypots are typically a defensive tool but may be used in an offensive sense to provide disinformation to an adversary. Similarly, deliberately triggering an adversary's network defenses to overwhelm or disable equipment, software, or operators was discussed openly in 2001 (Patton, Yurcik, and Doss, 2001), but proposed as cover for other attacks rather than to effect a deception. A small number of offensive cyber deception implementations have been presented, such as the D3 (Decoy Document Distributor) system to lure malicious insiders (Bowen, Hershkop, Keromytis, and Stolfo, 2009) and the ADD (Attention Deficit Disorder) tool to create artificial host-based artifacts in memory to support a deception (Williams and Torres, 2014). While offensive cyber warfare has entered the public awareness with the exposure of activity based on tools such as Stuxnet, Flame, and Shamoon, offensive cyber deception remains the subject of limited open research and discussion.

Our deception work focuses on aggregation and reasoning tools applied to Network Intrusion Detection Systems (NIDS). These reasoning tools emerged from the inundation of alerts when NIDS sensors were first deployed on enterprise networks. Such tools may simply correlate and aggregate alerts, or may model cyber attack and attacker behavior to reason over large quantities of individual evidentiary items and provide assessments of attack presence for human operators to review. Such reasoning models are abundant in the literature and in operational environments, having become indispensable to cyber defenders and remaining an active research area. Initial work on correlating and aggregating NIDS alerts appeared in 2001 (Valdes and Skinner, 2001). A few years later, a body of research emerged which correlated NIDS events with vulnerability scans to remove irrelevant alerts, for example (Zhai, et al., 2004). More advanced reasoning models emerged a few years later, attempting to capture attack and attacker behavior using various techniques. For example, Zomlot, Sundaramurthy, Luo, Ou, and Rajagopalan (2011) applied Dempster-Shafer theory to prioritize alerts, and Bayesian approaches remain popular (Tylman, 2009; Hussein, Ali, and Kasiran, 2012; Ismail, Mohd, and Marsono, 2014). Jones and Beisel (2014) developed a Bayesian approach to reasoning over custom NIDS alerts for novel attack detection. A prototype of this approach, dubbed Storm, was used as the base model in this work.

Our research builds on this rich body of prior work, merging the basic precepts of deception, manipulation of network traffic, and model-based reasoning into an offensive cyber deception capability. We use existing detection models to derive corresponding deception models, demonstrating the ability to deceive an adversary on their own turf and causing them to believe they are under attack when in fact they are not. This capability may be used offensively, to create an asymmetry between attackers generating small numbers of deception packets and targets investigating multiple false leads, and
defensively, to improve the sensitivity, specificity, and deception recognition of existing cyber attack detection tools.

3. BACKGROUND

Signature-based Network Intrusion Detection Systems operate by matching network traffic to a library of patterns derived from past attacks and known techniques. Matches generate alerts, often one per packet, which are saved and sent to a human operator for review. The basic idea of intrusion detection is attributed to Anderson (1980). Todd Heberlein introduced the idea of network-based intrusion detection in 1990 (Heberlein, et al., 1990). Only when processing capabilities caught up to network bandwidth did the market take off, in the late 1990s and early 2000s. Unfortunately, early enterprise deployments generated massive numbers of alerts, to the point that human operators could not possibly process them all. Two capabilities came out of this challenge: (1) correlation between NIDS data and other enterprise sources, such as vulnerability scanning data, e.g., see ArcSight, and (2) aggregators which model cyber attacks and use NIDS alerts as individual evidentiary items, only alerting a human when multiple aspects of an attack are detected, e.g., see Hofmann and Sick (2011). As noted in Section 2, many modeling approaches have been applied to the aggregation and context problem for more than a decade, and the area remains one of active research, e.g., Boukhtouta, et al. (2013).

For the base model in this project, we used an existing cyber attack detection model previously developed by one of the authors. The implementation of this model, called Storm, uses network traffic observables and a Bayesian Network reasoning model to detect a system compromise resulting from known and novel cyber attacks. The system ingests raw network traffic via a live network connection or via traffic capture files in libpcap format. Individual packets and groups of packets are assessed against signatures associated with cyber attack stages, such as reconnaissance, exploitation, and backdoor access (see Figure 1). Prior to ingest by the model, saturation and time decay functions are applied to packets which match signatures, so that model output reflects the quantity and timing of packets. Ingest and model updates occur in real time as packets are received and processed. Packet capture and signature matching are implemented in C++ using libpcap on a Linux (Ubuntu) system. Matched-packet processing, model management, and the user interface are implemented in Java, and the Bayesian Network is implemented with UnBBayes. Packets are passed to the model via a TCP socket so that packet processing and reasoning may be performed on different systems, although we used a single server for our testing.

The Storm system reasons over indirect observables resulting from the essentially unavoidable steps necessary to effect a system compromise. This underlying cyber attack process is shown in Figure 1 below, where a typical attack progresses downward from State 1 (S1) to State 7 (S7). Observables are created at each state and transition. The existing Storm implementation contains one or more observable signatures for each of the six state transitions (T1-T6 in the figure).

                 Figure 1: Cyber Attack Model

The reasoning model, shown in Figure 2, was derived from expert knowledge and consists of 20 signature evidence nodes (leaves labeled Tnn), three derived evidence nodes (labeled Mn), two protocol aggregation nodes (labeled Port80 and Port25), six transition aggregation nodes (labeled Tn), and a root node (labeled Compromise). Signature hits are processed and used to set values for the Tnn and Mn nodes. As implemented, one model is instantiated for each cyber attack target (unique target IP address). Model instances are updated whenever new evidence is received or a preconfigured amount of time has passed, and the root node values are returned as the Probability of Compromise given Evidence, P(C|E), for each target.

The theory behind the model, further explained by Jones and Beisel (2014), is to recognize observables created when a cyber attack transitions to a new state. For example, when transitioning to the exploit stage, packets with NOP instructions (machine code for "do nothing"
and used in buffer overflow type attacks) or shell code elements (part of many exploit payloads) are often seen. As such, the Storm signature rules for detecting observables are not specific to particular attacks, but rather represent effects common to actions associated with cyber attack stages in general. Taken individually, signature matches do not imply an attack. However, when aggregated and combined in context by the reasoning model, an accurate assessment of attack existence may be produced. The system is able to detect novel attacks, since signatures are based on generic cyber attack state transitions instead of specific attack signatures.

     Figure 2: Bayes Net Cyber Attack Detection Model

For most environments, network traffic is insecure and untrusted. Most networks and systems carry and accept traffic that may be both unencrypted and unauthenticated. As such, nearly anyone can introduce traffic on an arbitrary network, or at least modify traffic destined for an arbitrary network or system. While fully arbitrary packet creation, modification, and introduction is not generally possible, the ability to at least minimally manipulate packets destined for an arbitrary network or system is inherent in the design and implementation of the Internet. Packet manipulation is straightforward using available tools like Scapy, requiring only knowledge of basic object oriented concepts and an understanding of the relevant network protocols.

It is this combination, the ability to manipulate network traffic and the reliance of detection models on network traffic as evidentiary input, that we exploit in our work.

4. METHODOLOGY

Our goal for this project is to establish the viability of manipulating an adversary's perception so that they believe they are the target of a cyber attack when in fact they are not. We begin by conducting a sensitivity analysis of a known cyber attack detection model to identify candidate influence points. We design, construct, and inject network packets to trigger evidence at a subset of these influence points. We construct a corresponding deception model to assess the likelihood that our deception is effective. This derivative deception model combines the impact of our deception packets with other factors, such as the ease with which a target may invalidate the deception packets, the prevalence of alternative explanations for detection system alarms, and external indicators of the target's response activities. The impact of our deception packets is estimated by their effect on an alternative cyber attack detection system, in this case Snort. See Figure 3 for an overview of this process.

                  Figure 3: Process Overview

The deception model output, P(Successful Deception), is envisioned to be a dynamic value computed in real time as deception packets are delivered to a target and external observables are collected. As the deception operation unfolds and feedback from external observables is incorporated, additional existing deception packets may be injected, or influence points may be examined for additional deception packet development.

Our base detection model is a Bayesian Network (Figure 2) from a test implementation of the Storm cyber attack detection system. A single-node sensitivity-to-findings analysis (from Netica) is summarized in Table 1 for the 20 evidence input nodes. We reviewed this output and the descriptions of each signature to select those which (a) would have high impact on Storm's probability of compromise based on the sensitivity analysis, (b) could reasonably be developed into a deception packet or packets, and (c) would be general enough to be detected by a target's cyber attack detection system, i.e., not Storm. Of the signatures in Table 1, eight were selected for deception packet development: T5a, T6a, T6b, T6d, T4a, T4b, T1e, and T1f.
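The single-node sensitivity-to-findings measure reported in Table 1 is, for each evidence node, essentially the mutual information between that node and the root. As a sketch of the computation for one toy pair of binary nodes (the prior and CPT values here are hypothetical, not Storm's):

```python
import math

def mutual_information(p_c, p_e_given_c, p_e_given_not_c):
    """Mutual information I(C; E) in bits between a binary root C
    (Compromise) and one binary evidence node E (signature hit)."""
    # Build the joint distribution over (C, E) from prior and CPT.
    joint = {
        (1, 1): p_c * p_e_given_c,
        (1, 0): p_c * (1 - p_e_given_c),
        (0, 1): (1 - p_c) * p_e_given_not_c,
        (0, 0): (1 - p_c) * (1 - p_e_given_not_c),
    }
    p_e = joint[(1, 1)] + joint[(0, 1)]
    marg_e = {1: p_e, 0: 1 - p_e}
    mi = 0.0
    for (c, e), p in joint.items():
        if p > 0:
            p_of_c = p_c if c else 1 - p_c
            mi += p * math.log2(p / (p_of_c * marg_e[e]))
    return mi

# A diagnostic signature (fires far more often under compromise)
# carries more information about C than a nearly useless one,
# so it ranks higher as a candidate influence point.
strong = mutual_information(0.1, 0.8, 0.05)
weak = mutual_information(0.1, 0.11, 0.10)
assert strong > weak
```

High-mutual-information nodes are exactly the influence points an attacker would target, which is why criterion (a) above starts from this ranking.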
        Table 1: Storm model sensitivity analysis

         Signature    Mutual Info   Variance of Beliefs
         T5a          0.03305       0.0011763
         T6e          0.02719       0.0008344
         T6a          0.02719       0.0008344
         T6b          0.02719       0.0008344
         T6d          0.02719       0.0008344
         T6c          0.02719       0.0008344
         T6f          0.02719       0.0008344
         T4a          0.01701       0.0004315
         T4b          0.01701       0.0004315
         T3b          0.01658       0.0003618
         T3c          0.00702       0.0001202
         T3a          0.00259       0.0000372
         T1a          0.00002       0.0000002
         T1b          0.00002       0.0000002
         T1e          0.00002       0.0000002
         T1c          0.00002       0.0000002
         T1d          0.00002       0.0000002
         T1f          0.00002       0.0000002
         T2a          0.00001       0.0000002
         T2b          0.00001       0.0000002

Our test environment consisted of a Storm implementation running on Ubuntu 11.10 and a packet manipulation host running BackTrack5. We captured normal network traffic in a test environment and processed the traffic through the Storm system to confirm that no attacks were detected. Storm, like most NIDS implementations, has the ability to ingest live network traffic as well as network traffic capture files, without loss of accuracy or fidelity. Traffic was captured using the open source Wireshark tool, saved as a pcap file, then ingested by Storm. We labeled this original packet capture file "clean" and used it as the basis for our subsequent packet manipulations.

We used Scapy on the BackTrack5 instance to craft deception packets. Scapy is an open source Python-based packet crafting and editing tool. Packets may be loaded from a pcap file, manipulated in an environment similar to a Python command shell, and written out to a pcap file. Scapy supports creation and modification of any packet field down to the byte level, including raw creation and editing of packet data. To create our deception packets, we made minor modifications to packets from the clean set. By minimizing changes, we produced packets that maintained most of the clean session characteristics and so would not be blocked by a firewall or other packet screening device. Also, packets were modified only to the extent necessary to trigger the desired signature, so the modified packets do not contain any actual attacks. The modified packets and associated unmodified session packets, such as session establishment via the TCP 3-way handshake, were exported to a separate pcap file so they could be ingested by the Storm and Snort systems in a controlled manner.

The eight signatures selected for deception and the related deception packets are summarized in Table 2.

        Table 2: Signatures and deception packets

                     Signature T1e
 Description: After 3-way handshake, DstPort=80, payload ≠ ASCII
 Explanation: Abnormal traffic to web server (usually expect GET or POST with ASCII data)
 Deception packet: Inserted non-ASCII (hex > 7F) at beginning of payload for existing HTTP session

                     Signature T1f
 Description: After 3-way handshake, DstPort=25, payload ≠ ASCII
 Explanation: Abnormal traffic to a mail server (normally we expect plaintext commands)
 Deception packet: Edited HTTP session to use server port 25; inserted non-ASCII (hex > 7F) at beginning of payload

                     Signature T4a
 Description: Client to server traffic containing 20+ repeated ASCII characters
 Explanation: Buffer overflows often use a long string of ASCII characters to overflow the input buffer
 Deception packet: Inserted 43 "d" (hex 64) characters at the beginning of existing HTTP session payload

                     Signature T4b
 Description: Client payload contains 20+ identical and consecutive NOP instruction byte patterns
 Explanation: A "NOP sled" is a common technique used in buffer overflow exploits; the sled consists of multiple NOP (No Operation) instructions to ensure that the real instructions fall in the desired range
 Deception packet: Inserted 24 hex 90 (known NOP code) characters at the beginning of existing HTTP session payload
                     Signature T5a                                      deception packets by processing them with Storm, and
 Description: Client  to  server  traffic  if  port≠23  and  first      later Snort, in a controlled environment.
    100 bytes of payload contains "rm", "rmdir", "rd",
    "del", "erase"                                                      We tested single occurrences of each signature separately
                                                                        and in a subset of possible combinations. For each
 Explanation: File or directory removal activity
                                                                        individual signature and for selected combinations, we
 Deception packet: Inserted "rm " (hex 726D20)                          also tested the effects of 10 and 20 signature instances.
    characters at the beginning of existing HTTP                        Finally, for selected signatures, we measured the effect of
    session payload                                                     multiple occurrences for values 1, ..., 25. For all tests, we
                     Signature T6a                                      reset the Storm model, loaded the desired pcap file, and
 Description: First two bytes of client to server                       recorded the resulting P(C|E).
    payload="MZ"
 Explanation: COM, DLL, DRV, EXE, PIF, QTS,                             We then processed each of the deception pcap files (one
    QTX, or SYS file transfer for use in a backdoor                     per signature) with Snort, separately and in combinations
                                                                        and repetitions.
 Deception packet: Inserted "MZ" (hex 4D5A) and
    filename "exe" characters at the beginning of
    existing HTTP session payload                                       5.     EXPERIMENTAL RESULTS
                     Signature T6b                                      Each signature pcap file was processed by Storm in
 Description: New Port opened on server; ignore first                   quantities of 1, 10, and 20 hits. Storm was reset after each
    500 packets after startup                                           run, that is, reset after a run of one T1e hit, then reset after
 Explanation: Traffic from a port not previously seen                   a run of 10 T1e hits, then reset after a run of 20 T1e hits,
    might indicate the opening of a new back door                       etc. Results are recorded in Table 3.
 Deception packet: Edited HTTP session to use server
    port 25 (ingested after first 500 packets)                          Table 3: Single signature impact on P(C|E) with repetition
                     Signature T6d
 Description: Unencrypted traffic on encrypted port
 Explanation: Traffic on encrypted sockets (HTTPS, SMTP with SSL,
    Secure Shell, etc.) should be encrypted once the session is
    established.
 Deception packet: Inserted ASCII text in an established SSH session

Manual creation of the deception packets required a moderate amount
of effort. When crafting deception packets, care must be taken to use
carrier traffic which will be passed by a firewall or similar network
security gateway while still triggering the desired signature.
Flexibility in carrier traffic is signature dependent. For example,
some signatures have offset or port dependencies, such as requiring
the traffic to be, or not to be, on port 80 (HTTP), while others are
more flexible. Future work will develop an automated deception packet
creation capability.

We began by identifying a candidate session for packet modification.
For example, we could start with an existing HTTP or HTTPS session
and alter the packet payload, or we might alter the TCP ports.
Payload modification required adjustments to the TCP checksum, IP
checksum, and IP length values as well. To create a deception packet
set, we loaded the clean pcap file in scapy, made the desired packet
modifications, and wrote the resulting packet set to a new pcap file.
We then used Wireshark to confirm our modifications and to extract
and save only the session of interest as a distinct pcap file. We
confirmed our deception packets by processing each pcap file with the
Storm system; the effect on P(C|E) of 1, 10, and 20 instances of each
signature is shown in the table below.

  Signature                                      P(C|E)
  ID     Short Description            Qty=1      Qty=10      Qty=20
  T1e    HTTP load != ASCII            0.00        0.01        0.03
  T1f    SMTP load != ASCII            0.00        0.01        0.03
  T4a    Repeated ASCII                0.01        0.05        0.16
  T4b    Repeated NOPs                 0.01        0.05        0.15
  T5a    Cleanup cmds                  0.02        0.09        0.27
  T6a    Executable load               0.02        0.08        0.22
  T6b    New server port               0.02        0.08        0.22
  T6d    Unencrypted SSL               0.02        0.08        0.22

The relationships between the quantity of signature hits and P(C|E)
in each row indicate that setting the Bayesian Network findings is
not a simple True/False assignment. To confirm this behavior, we
recorded the effect on P(C|E) of 1, 2, ..., 25 signature hits for
signatures T5a and T6a. These results are graphed in Figures 4a and
4b. The curves and apparent inflection points of the graphs indicate
that the signature hits are subject to a ramping-up requirement at
low quantities and a saturation adjustment at high quantities. This
behavior is implemented by a pre-processing step in the Storm system
and is not part of the Bayesian Network component of Storm.

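The checksum adjustments described above reduce to recomputing the
RFC 1071 ones'-complement checksum whenever payload bytes change. The
following pure-Python sketch is illustrative only; it is not the code
we used (scapy performs this bookkeeping automatically if the stale
checksum and length fields are deleted before the packet is rebuilt):

```python
import struct

def inet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def fix_ipv4_header(header: bytes, payload_len: int) -> bytes:
    """Rewrite the total-length and header-checksum fields of a 20-byte
    IPv4 header after the packet payload has been modified."""
    hdr = bytearray(header)
    struct.pack_into("!H", hdr, 2, 20 + payload_len)   # IP total length
    struct.pack_into("!H", hdr, 10, 0)                 # zero old checksum
    struct.pack_into("!H", hdr, 10, inet_checksum(bytes(hdr)))
    return bytes(hdr)
```

A header fixed this way sums to zero when the checksum is recomputed
over it, which is the standard receiver-side validity test. The TCP
checksum is handled analogously, except that it also covers a
pseudo-header and the payload.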
     Figure 4a: T5a signature hit effect for n = 1..25

     Figure 4b: T6a signature hit effect for n = 1..25

We constructed 11 combinations of signature hits at selected
quantities and tested each combination. The results are shown in
Tables 4a and 4b below (split for readability).

  Table 4a: Effect of signature combinations on P(C|E)

  Signature                      Test
     ID        A      B      C      D      E      F
     T1c                                   1
     T1e                                   1
     T4a              1      1      1      1      1
     T4b              1      1      1      1
     T5a                            1      1      1
     T6a       1             1      1      1      1
     T6b       1             1      1      1      1
     T6d       1             1      1      1      1

    P(C|E)    0.10   0.02   0.46   0.86   0.89   0.79

  Table 4b: Effect of signature combinations on P(C|E)

  Signature                  Test
     ID        G      H      I      J      K
     T1c
     T1e
     T4a       1      10     10
     T4b
     T5a       1      10     10     10     20
     T6a       1      10            10     20
     T6b
     T6d

    P(C|E)    0.37   0.93   0.58   0.72   0.78

The results indicate that the desired effect was achieved in two
ways: (1) single hits across a wide range of signatures (e.g., tests
D, E, and F), and (2) repeated hits on selected signatures (e.g.,
tests H and K). Assuming a threshold of P(C|E) > 0.75 for alerting,
meta-event alerts could be generated with as few as five packets
spread across five signatures, as in test F, or 40 packets spread
across only two different signatures, as in test K.

We processed the same eight pcap deception files with Snort. Six of
the eight files triggered Snort alerts, as summarized below in
Table 5.

Snort Priority 1 alerts are the most severe, Priority 3 the least.
The name, classification, and priority of each signature are assigned
by the signature author. A base Snort install contains signatures
contributed by the Snort developers and the open source community.

Our deception packets produced five Priority 1 alerts, one Priority 2
alert, and one Priority 3 alert. Two deception packets (pcap files
for T5a and T6b) did not trigger any Snort alerts.

In a default configuration, and without any subsequent aggregation or
alert thresholds, Snort alerts are explicitly linear, meaning that
combination and repetition testing produced the obvious results. For
example, running any one pcap file N times produces N alerts.
Similarly, running the files in combination produced the expected
total number of alerts, e.g., running the entire set 10 times
produced 70 Snort alerts.

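Under the P(C|E) > 0.75 alerting threshold assumed above, the
combination results reduce to a simple filter over the posterior
values. The sketch below is illustrative; the posterior values are
copied from Tables 4a and 4b:

```python
# Posterior P(C|E) for each signature-combination test (Tables 4a and 4b).
P_C_GIVEN_E = {
    "A": 0.10, "B": 0.02, "C": 0.46, "D": 0.86, "E": 0.89, "F": 0.79,
    "G": 0.37, "H": 0.93, "I": 0.58, "J": 0.72, "K": 0.78,
}

ALERT_THRESHOLD = 0.75  # P(C|E) above this value raises a meta-event alert

def alerting_tests(posteriors, threshold=ALERT_THRESHOLD):
    """Return, in sorted order, the tests that would raise an alert."""
    return sorted(t for t, p in posteriors.items() if p > threshold)

print(alerting_tests(P_C_GIVEN_E))  # ['D', 'E', 'F', 'H', 'K']
```

Only tests D, E, F, H, and K exceed the threshold: test F with five
single-hit signatures, and test K with two signatures at 20 hits
each.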
            Table 5: Snort alerts from deception packets

  File   Snort Alerts
  T5a    None
  T6a    Lotus Notes .exe script source download attempt
         [Classification: Web Application Attack] [Priority: 1]
  T6b    None
  T6d    Protocol mismatch [Priority: 3]
         EXPLOIT ssh CRC32 overflow /bin/sh
         [Classification: Executable code was detected] [Priority: 1]
  T4a    MailSecurity Management Host Overflow Attempt
         [Classification: Attempted Admin Privilege Gain] [Priority: 1]
  T4b    SHELLCODE x86 NOOP
         [Classification: Executable code was detected] [Priority: 1]
  T1e    apache chunked enc mem corrupt exploit attempt
         [Classification: access to potentially vuln web app] [Priority: 2]
  T1f    x86 windows MailMax overflow
         [Classification: Attempted Admin Privilege Gain] [Priority: 1]

6.   DECEPTION MODEL

A sample deception model is shown in Figure 5. The model consists of
three key parts: (A) the deception packets, (B) external observables
indicating a successful deception, and (C) external observables
indicating an unsuccessful deception.

Each deception packet, area A in the figure, is assessed for
alternative explanations. For example, a byte string that we embed in
a JPEG image file may generate a NIDS alert for an unrelated attack,
but upon examination will be discounted as a chance occurrence and
hence a false alarm. Strong alternative explanations suggest that the
target might not interpret the packet as part of an attack, and so
they weaken the packet node's intended effect on the Successful
Deception node. Similarly, each deception packet is assessed for how
difficult it will be for a target to invalidate the packet. Again
using the example of a byte string embedded in a JPEG image file, if
the triggered NIDS alert is for an exploit of image viewers, then the
packet will be difficult to invalidate. A difficult-to-invalidate
packet will have a strong positive influence on the Successful
Deception node via the intended effect node.

Processing the original eight deception packets (pcap files) with
Snort provides additional parameters for the model. The number,
priority, and relevance of Snort alerts are used to build the
Conditional Probability Table of the Deception Success node.

       Figure 5: Bayes Net Cyber Deception Model

Area B in the graphic contains three nodes representing external
observables which could indicate a successful deception. The
"apparent target" and "apparent attacker" are the endpoints of the
deception packets. As noted elsewhere, these systems may not send or
receive any of the observed traffic, but they will be endpoints from
a network monitor's point of view. If a target blocks the apparent
target or attacker, or takes the apparent target off-line, then the
deception is likely working. Similarly, if the target system
operators probe the apparent attacker, then the deception is likely
working.

Area C in the graphic contains two nodes for external observables
which may indicate that the deception is not working. If the apparent
target's response or processing time slows down, this may indicate
that the target has added monitoring capabilities in order to trace
the source of the deception, although this could also indicate
monitoring in response to a perceived successful deception. The other
node in area C is the worst-case scenario. Although none of the
deception packet contents are directly traceable to the actual
perpetrators of the deception, probing of the perpetrators' systems,
especially from the target of the deception, might indicate that the
deception has failed and the target suspects the true source of the
deception.

The model of Figure 5 is a work in progress at the time of this
writing. Preliminary values for the conditional probability tables
have been developed but not yet tested or refined. Our work suggests
that such a deception model may be developed for other domains where
we have some control over the inputs to the base model. Our process
in Figure 3 may be generalized by replacing "packets" with
"evidence", since packets are simply our mechanism for affecting an
evidentiary input node. Generally, a derived
deception model consists of the nodes that we directly influence,
their estimated effect on the target's model, and external
observables.

7.   CONCLUSIONS AND FUTURE WORK

We demonstrated the ability to construct network packets which look
similar to normal network traffic, pass through a typical firewall,
trigger specific attack element signatures, and have a controlled
impact on a back-end cyber attack detection reasoning model. Further,
we proposed a derived deception model to dynamically assess the
effectiveness of the cyber deception activities, and we suggested how
such a deception model might be constructed for other domains. In
support of cyber defense, our work also supports the testing and
development of more accurate reasoning models and research geared
towards detecting deception.

An apparent limitation of our work is a requirement to know the
signatures which trip alerts and are fed to the back-end reasoning
model. However, this is not necessarily true. While these signatures
may be known, as in the case of systems leveraging open source tools
like Snort, it is also true that a system designed to detect specific
attacks or attacks of a certain class will use similar signatures.
The common requirement to derive a discriminatory signature that is
as short as possible results in different entities independently
producing similar signatures. We have observed this effect in the
NIDS domain, where commercial and open source tools have similar
signature sets for many attacks. Similarly, we observe this effect in
the antivirus and malware detection industry, where different vendors
and open source providers frequently generate similar signatures
independently. The implication is that we could develop probable
signatures for specific attacks or behaviors, then develop deception
packets to trip these signatures with a reasonable expectation of
successfully affecting a target system using unknown signatures. We
partially demonstrated this by processing our Storm-derived packets
with Snort.

As noted above, we assert that the use of pcap files is equivalent
for our purposes to live network traffic capture and processing.
However, in most live network scenarios we will not be able to put
both sides of a TCP session on the wire as we did in this work.
Rather, we will have to establish a live session with a target
computer and modify subsequent session packets in real time, or we
will have to intercept and modify packets between a target and some
other system. This is an implementation issue rather than a question
of validity, as the results presented here hold regardless of how the
deceptive packets are introduced.

Future work will focus on automated deception packet creation,
development of delivery mechanisms, and the derived deception model.
We created our packets manually based on a review of the target
signature and several iterations of trial and error. Our next step is
to create deception packets directly from signature descriptions. For
example, given a Snort signature file, we could craft multiple
deception packets in an automated fashion. A related effort will
explore the automation of delivery mechanisms, for example
establishing TCP sessions with an internal host to deliver deception
packets, and injecting deception material into an existing network
traffic stream. Author Jones recently led a project to develop a
hardware-based inline packet rewriting tool which could be used for
such a purpose. Finally, we will continue the development and
generalization of deriving deception models from detection models.

References

Anderson, J. P. (1980). Computer security threat monitoring and
surveillance (Vol. 17). Technical report, James P. Anderson Company,
Fort Washington, Pennsylvania.

Bell, J. B., & Whaley, B. (1982). Cheating: deception in war & magic,
games & sports, sex & religion, business & con games, politics &
espionage, art & science. St Martin's Press.

Bell, J. B., & Whaley, B. (1991). Cheating and deception. Transaction
Publishers.

Boukhtouta, A., Lakhdari, N. E., Mokhov, S. A., & Debbabi, M. (2013).
Towards fingerprinting malicious traffic. Procedia Computer Science,
19, 548-555.

Bowen, B. M., Hershkop, S., Keromytis, A. D., & Stolfo, S. J. (2009).
Baiting inside attackers using decoy documents (pp. 51-70). Springer
Berlin Heidelberg.

Cohen, F. (1998). A note on the role of deception in information
protection. Computers & Security, 17(6), 483-506.

Cohen, F. (1998). The deception toolkit. Risks Digest, 19.

Crouse, M. B. (2012). Performance Analysis of Cyber Deception Using
Probabilistic Models (Master's Thesis, Wake Forest University).

Heberlein, L. T., Dias, G. V., Levitt, K. N., Mukherjee, B., Wood,
J., & Wolber, D. (1990, May). A network security monitor. In Research
in Security and Privacy, 1990. Proceedings., 1990 IEEE Computer
Society Symposium on (pp. 296-304). IEEE.

Hofmann, A., & Sick, B. (2011). Online intrusion alert aggregation
with generative data stream modeling. Dependable and Secure
Computing, IEEE Transactions on, 8(2), 282-294.

Hussein, S. M., Ali, F. H. M., & Kasiran, Z. (2012, May). Evaluation
effectiveness of hybrid IDS using Snort with naive Bayes to detect
attacks. In Digital Information and Communication Technology and Its
Applications (DICTAP), 2012 Second International Conference on (pp.
256-260). IEEE.

Ignatius, D. (2007). Body of Lies. WW Norton & Company.

Ismail, I., Mohd Nor, S., & Marsono, M. N. (2014). Stateless Malware
Packet Detection by Incorporating Naive Bayes with Known Malware
Signatures. Applied Computational Intelligence and Soft Computing,
2014.

Jones, J., & Beisel, C. (2014). Extraction and Reasoning over Network
Data to Detect Novel Cyber Attacks. National Cybersecurity Institute
Journal, 1(1).

Kewley, D., Fink, R., Lowry, J., & Dean, M. (2001). Dynamic
approaches to thwart adversary intelligence gathering. In DARPA
Information Survivability Conference & Exposition II, 2001.
DISCEX'01. Proceedings (Vol. 1, pp. 176-185). IEEE.

Montagu, E., & Joyce, P. (1954). The man who never was. Lippincott.

Murphy, S. B., McDonald, J. T., & Mills, R. F. (2010). An Application
of Deception in Cyberspace: Operating System Obfuscation. In
Proceedings of the 5th International Conference on Information
Warfare and Security (ICIW 2010) (pp. 241-249).

Patton, S., Yurcik, W., & Doss, D. (2001). An Achilles' heel in
signature-based IDS: Squealing false positives in SNORT. Proceedings
of RAID 2001.

Ragsdale, D. (2011). Scalable Cyber Deception. Defense Advanced
Research Projects Agency, Arlington, Virginia, Information Innovation
Office.

Rowe, N. C. (2003, June). Counterplanning deceptions to foil
cyber-attack plans. In Information Assurance Workshop, 2003. IEEE
Systems, Man and Cybernetics Society (pp. 203-210). IEEE.

Rowe, N. (2007, March). Planning cost-effective deceptive resource
denial in defense to cyber-attacks. In Proceedings of the 2nd
International Conference on Information Warfare & Security (p. 177).
Academic Conferences Limited.

Tan, K. L. G. (2003). Confronting cyberterrorism with cyber deception
(Doctoral dissertation, Monterey, California: Naval Postgraduate
School).

Tylman, W. (2009). Detecting Computer Intrusions with Bayesian
Networks. In Intelligent Data Engineering and Automated Learning -
IDEAL 2009, Lecture Notes in Computer Science, Volume 5788, pp.
82-91.

Tzu, S. (2013). The art of war. Orange Publishing.

Valdes, A., & Skinner, K. (2001, January). Probabilistic alert
correlation. In Recent Advances in Intrusion Detection (pp. 54-68).
Springer Berlin Heidelberg.

Whaley, B. (1982). Toward a general theory of deception. The Journal
of Strategic Studies, 5(1), 178-192.

Williams, J., & Torres, A. (2014). ADD - Complicating Memory
Forensics Through Memory Disarray. Presented at ShmooCon 2014 and
archived at
https://archive.org/details/ShmooCon2014_ADD_Complicating_Memory_Forensics_Through_Memory_Disarray.
Retrieved June 8, 2014.

Yuill, J., Denning, D. E., & Feer, F. (2006). Using deception to hide
things from hackers: Processes, principles, and techniques. North
Carolina State University at Raleigh, Department of Computer Science.

Zhai, Y., Ning, P., Iyer, P., & Reeves, D. S. (2004, December).
Reasoning about complementary intrusion evidence. In Computer
Security Applications Conference, 2004. 20th Annual (pp. 39-48).
IEEE.

Zomlot, L., Sundaramurthy, S. C., Luo, K., Ou, X., & Rajagopalan, S.
R. (2011, October). Prioritizing intrusion analysis using
Dempster-Shafer theory. In Proceedings of the 4th ACM workshop on
Security and artificial intelligence (pp. 59-70). ACM.

i    http://www.hp.com/go/ArcSight
ii   http://sourceforge.net/projects/libpcap/
iii  http://sourceforge.net/projects/unbbayes/
iv   http://www.secdev.org/projects/scapy/
v    http://www.snort.org/
vi   http://www.backtrack-linux.org/downloads/
vii  http://www.wireshark.org/