Using Bayesian Attack Detection Models to Drive Cyber Deception

James H. Jones, Jr.
Department of Electrical and Computer Engineering
George Mason University
Fairfax, VA 22030

Kathryn B. Laskey
Department of Systems Engineering and Operations Research
George Mason University
Fairfax, VA 22030

Abstract

We present a method to devise, execute, and assess a cyber deception. The aim is to cause an adversary to believe they are under a cyber attack when in fact they are not. Cyber network defense relies on human and computational systems that can reason over multiple individual evidentiary items to detect the presence of meta events, i.e., cyber attacks. Many of these systems aggregate and reason over alerts from Network-based Intrusion Detection Systems (NIDS). Such systems use byte patterns as attack signatures to analyze network traffic and generate corresponding alerts. Current aggregation and reasoning tools use a variety of techniques to model meta-events, among them Bayesian Networks. However, the inputs to these models are based on network traffic, which is inherently subject to manipulation. In this work, we demonstrate a capability to remotely and artificially trigger specific meta events in a potentially unknown model. We use an existing and known Bayesian Network based cyber attack detection system to guide construction of deceptive network packets. These network packets are not actual attacks or exploits, but rather contain selected features of attack traffic embedded in benign content. We provide these packets to a different cyber attack detection system to gauge their generalizability and effect. We combine the deception packets' characteristics, the second system's response, and external observables to propose a deception model to assess the effectiveness of the manufactured network traffic on our target. We demonstrate the development and execution of a specific deception, and we propose the corresponding deception model.

Key words: Cyber Deception, Cyber Attack, Bayesian Model, Deception Model, Intrusion Detection System

1. INTRODUCTION

Network-based Intrusion Detection Systems (NIDS) are essentially granular sensors. Their measurements consist of computer network traffic, sometimes at the packet level, matched against signatures of known cyber attack activity. For a typical network, the individual data points are numerous and require aggregation, fusion, and context to acquire meaning. This reasoning may be accomplished through the use of cyber attack detection models, where the NIDS data points represent evidence and specific cyber attacks or classes of attacks represent hypotheses. Modeling approaches, including Bayesian Networks, have been applied in the past, are an active research area, and are in use today in deployed systems.

The input to a NIDS sensor is network traffic, which is inherently uncertain and subject to manipulation. Prior research has exploited this fact to create large numbers of false NIDS alerts to overwhelm or disable the backend processing systems. In this work, we leverage knowledge of the backend cyber attack models to craft network traffic which manipulates the inputs, and hence the outputs, of both known and unknown models. With a small number of packets and no actual cyber attack, we are able to create the false impression of an active attack.

In this work, we describe a general approach to network-based offensive cyber deception, and we demonstrate an implementation of such a deception. We use an existing cyber attack detection model to guide the development of deception traffic, which is then processed by a second and distinct cyber attack detection model. Finally, we propose a deception model to assess the effectiveness of the deception on a target. Future work will expand and automate the generation of deceptive network packets and further develop the deception model.
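The evidence-and-hypothesis framing above can be illustrated with a minimal Bayesian update, where individual NIDS alerts are treated as evidence for or against an attack hypothesis. This is a sketch only: the alert names, priors, and likelihoods below are invented for illustration and are not parameters of any system discussed in this paper.

```python
# Illustrative only: a naive-Bayes-style update in the spirit of the
# evidence/hypothesis framing above. All numbers are hypothetical.

def posterior_attack(prior, likelihoods, evidence):
    """Return P(attack | evidence) assuming conditionally independent alerts.

    likelihoods maps alert name -> (P(alert | attack), P(alert | benign)).
    evidence maps alert name -> True/False (alert observed or not).
    """
    p_e_attack = 1.0
    p_e_benign = 1.0
    for alert, (p_a, p_b) in likelihoods.items():
        seen = evidence.get(alert, False)
        p_e_attack *= p_a if seen else (1.0 - p_a)
        p_e_benign *= p_b if seen else (1.0 - p_b)
    num = p_e_attack * prior
    return num / (num + p_e_benign * (1.0 - prior))

likelihoods = {
    "nop_sled": (0.6, 0.01),   # hypothetical detection / false-alarm rates
    "exe_load": (0.5, 0.02),
}

p0 = posterior_attack(0.001, likelihoods, {})
p1 = posterior_attack(0.001, likelihoods, {"nop_sled": True})
p2 = posterior_attack(0.001, likelihoods, {"nop_sled": True, "exe_load": True})
print(p0, p1, p2)  # the posterior rises as corroborating alerts accumulate
```

The key behavior, shared by the deployed models discussed below, is that no single alert is conclusive, while several corroborating alerts drive the posterior sharply upward.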
2. RELATED WORK

Deception has been a staple of military doctrine for thousands of years, and a key element of intelligence agency activities since they took their modern form in World War II. From Sun Tzu 2,500 years ago (Tzu, 2013), to Operation Mincemeat in 1943 (Montagu and Joyce, 1954), to the fictional operation in the 2007 book Body of Lies (Ignatius, 2007), one side has endeavored to mislead the other through a variety of means and for a variety of purposes. The seminal work of Whaley and Bell (Whaley, 1982; Bell and Whaley, 1982; Bell and Whaley, 1991) formalized and in some ways defended deception as both necessary and possible to execute without "self contamination".

Deception operations have naturally begun to include the cyber domain, although the majority of this work has been on the defensive side. Fred Cohen suggested a role for deception in computer system defense in 1998 (Cohen, 1998) and simultaneously released his honeypot implementation called The Deception Toolkit (Cohen, 1998). Honeypots are systems meant to draw in attackers so they may be distracted and/or studied. The Deception Toolkit was one of the first configurable and dynamic honeypots, as opposed to prior honeypots, which were simply static vulnerable systems with additional administrator control and visibility. Other honeypot implementations have followed, and they remain a staple of defensive cyber deception. Neil Rowe (2003, 2007), his colleague Dorothy Denning (Yuill, Denning, and Feer, 2006), and students (Tan, 2003) at the Naval Postgraduate School have been researching defensive cyber deception for several years. Extending their early work identifying key disruption points of an attack, they propose deception by resource denial, where some key element of an active attack vector is deceptively claimed to be unavailable. Such an approach stalls the attacker while the activity can be analyzed and risks mitigated. Other defensive cyber deception approaches include masking a target's operating system (Murphy, McDonald, and Mills, 2010) and actively moving targets within an IP address and TCP port space (Kewley, et al., 2001), later labeled "address shuffling". Crouse's (2012) comparison of the theoretical performance of honeypots and address shuffling remains one of the few rigorous comparisons of techniques. Until recently, most defensive cyber deception involved theoretical work or small proofs of concept. However, in 2011, Ragsdale both legitimized defensive cyber deception and raised the bar when he introduced DARPA's Scalable Cyber Deception program (Ragsdale, 2011). The program aims to automatically redirect potential intruders to tailorable decoy products and infrastructures in real time and at an enterprise scale.

By comparison, offensive cyber deception has been discussed only briefly in the literature, often as a secondary consideration. For example, honeypots are typically a defensive tool but may be used in an offensive sense to provide disinformation to an adversary. Similarly, deliberately triggering an adversary's network defenses to overwhelm or disable equipment, software, or operators was discussed openly in 2001 (Patton, Yurcik, and Doss, 2001) but proposed as cover for other attacks rather than to effect a deception. A small number of offensive cyber deception implementations have been presented, such as the D3 (Decoy Document Distributor) system to lure malicious insiders (Bowen, Hershkop, Keromytis, and Stolfo, 2009) and the ADD (Attention Deficit Disorder) tool to create artificial host-based artifacts in memory to support a deception (Williams and Torres, 2014). While offensive cyber warfare has entered the public awareness with the exposure of activity based on tools such as Stuxnet, Flame, and Shamoon, offensive cyber deception remains the subject of limited open research and discussion.

Our deception work focuses on aggregation and reasoning tools applied to Network Intrusion Detection Systems (NIDS). These reasoning tools emerged from the inundation of alerts when NIDS sensors were first deployed on enterprise networks. Such tools may simply correlate and aggregate alerts, or may model cyber attack and attacker behavior to reason over large quantities of individual evidentiary items and provide assessments of attack presence for human operators to review. Such reasoning models are abundant in the literature and in operational environments, having become indispensable to cyber defenders and remaining an active research area. Initial work on correlating and aggregating NIDS alerts appeared in 2001 (Valdes and Skinner, 2001). A few years later, a body of research emerged which correlated NIDS events with vulnerability scans to remove irrelevant alerts, for example (Zhai, et al., 2004). More advanced reasoning models emerged a few years later, attempting to capture attack and attacker behavior using various techniques. For example, Zomlot, Sundaramurthy, Luo, Ou, and Rajagopalan (2011) applied Dempster-Shafer theory to prioritize alerts, and Bayesian approaches remain popular (Tylman, 2009; Hussein, Ali, and Kasiran, 2012; Ismail, Mohd and Marsono, 2014). Jones and Beisel (2014) developed a Bayesian approach to reasoning over custom NIDS alerts for novel attack detection. A prototype of this approach, dubbed Storm, was used as the base model in this work.

Our research builds on this rich body of prior work, merging the basic precepts of deception, manipulation of network traffic, and model-based reasoning into an offensive cyber deception capability. We use existing detection models to derive corresponding deception models, demonstrating the ability to deceive an adversary on their own turf and causing them to believe they are under attack when in fact they are not. This capability may be used offensively, to create an asymmetry between attackers generating small numbers of deception packets and targets investigating multiple false leads, and defensively, to improve the sensitivity, specificity, and deception recognition of existing cyber attack detection tools.

3. BACKGROUND

Signature-based Network Intrusion Detection Systems operate by matching network traffic to a library of patterns derived from past attacks and known techniques. Matches generate alerts, often one per packet, which are saved and sent to a human operator for review. The basic idea of intrusion detection is attributed to Anderson (1980). Todd Heberlein introduced the idea of network-based intrusion detection in 1990 (Heberlein, et al., 1990). Only when processing capabilities caught up to network bandwidth did the market take off in the late 1990s and early 2000s. Unfortunately, early enterprise deployments generated massive numbers of alerts, to the point that human operators could not possibly process them all. Two capabilities came out of this challenge: (1) correlation between NIDS data and other enterprise sources, such as vulnerability scanning data, e.g., see ArcSight, and (2) aggregators which model cyber attacks and use NIDS alerts as individual evidentiary items, only alerting a human when multiple aspects of an attack are detected, e.g., see Hofmann and Sick (2011). As noted in Section 2, many modeling approaches have been applied to the aggregation and context problem for more than a decade, and the area remains one of active research, e.g., Boukhtouta, et al. (2013).
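The pattern-matching step described above can be sketched in a few lines. The two rules below are simplified stand-ins modeled on transition signatures used later in this paper (a NOP run and an "MZ" executable header); they are illustrative, not the actual rule set of Storm or any deployed NIDS.

```python
# A minimal sketch of signature-based payload matching, as described above.
# The rules are simplified illustrations; a real NIDS rule language
# (e.g., Snort's) also constrains ports, offsets, direction, and state.

SIGNATURES = {
    "T4b_nop_sled": lambda payload: b"\x90" * 20 in payload,  # 20+ NOP bytes
    "T6a_exe_load": lambda payload: payload[:2] == b"MZ",     # executable header
}

def match_packet(payload):
    """Return the list of signature IDs that a packet payload triggers."""
    return [sig_id for sig_id, rule in SIGNATURES.items() if rule(payload)]

print(match_packet(b"MZ" + b"\x90" * 24))   # triggers both rules
print(match_packet(b"GET / HTTP/1.1\r\n"))  # benign request: no alerts
```

Each match would become one alert; the aggregators discussed above then reason over many such alerts rather than reporting each one to an operator.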
For the base model in this project, we used an existing cyber attack detection model previously developed by one of the authors. The implementation of this model, called Storm, uses network traffic observables and a Bayesian Network reasoning model to detect a system compromise resulting from known and novel cyber attacks. The system ingests raw network traffic via a live network connection or via traffic capture files in libpcap format. Individual packets and groups of packets are assessed against signatures associated with cyber attack stages, such as reconnaissance, exploitation, and backdoor access (see Figure 1). Prior to ingest by the model, saturation and time decay functions are applied to packets which match signatures so that model output reflects the quantity and timing of packets. Ingest and model updates occur in real time as packets are received and processed. Packet capture and signature matching is implemented in C++ using libpcap on a Linux (Ubuntu) system. Matching packet processing, model management, and the user interface are implemented in Java, and the Bayesian Network is implemented with UnBBayes. Packets are passed to the model via a TCP socket so that packet processing and reasoning may be performed on different systems, although we used a single server for our testing.

The Storm system reasons over indirect observables resulting from the essentially unavoidable steps necessary to effect a system compromise. This underlying cyber attack process is shown in Figure 1 below, where a typical attack progresses downward from State 1 (S1) to State 7 (S7). Observables are created at each state and transition. The existing Storm implementation contains one or more observable signatures for each of the six state transitions (T1-T6 in the figure).

Figure 1: Cyber Attack Model

The reasoning model, shown in Figure 2, was derived from expert knowledge and consists of 20 signature evidence nodes (leaves labeled Tnn), three derived evidence nodes (labeled Mn), two protocol aggregation nodes (labeled Port80 and Port25), six transition aggregation nodes (labeled Tn), and a root node (labeled Compromise). Signature hits are processed and used to set values for the Tnn and Mn nodes. As implemented, one model is instantiated for each cyber attack target (unique target IP address). Model instances are updated whenever new evidence is received or a preconfigured amount of time has passed, and the root node values are returned as the Probability of Compromise given Evidence, P(C|E), for each target.

The theory behind the model, further explained by Jones and Beisel (2014), is to recognize observables created when a cyber attack transitions to a new state. For example, when transitioning to the exploit stage, packets with NOP instructions (machine code for "do nothing", used in buffer overflow type attacks) or shell code elements (part of many exploit payloads) are often seen. As such, the Storm signature rules for detecting observables are not specific to particular attacks, but rather represent effects common to actions associated with cyber attack stages in general. Taken individually, signature matches do not imply an attack. However, when aggregated and combined in context by the reasoning model, an accurate assessment of attack existence may be produced. The system is able to detect novel attacks, since signatures are based on generic cyber attack state transitions instead of specific attack signatures.

Figure 2: Bayes Net Cyber Attack Detection Model

For most environments, network traffic is insecure and untrusted. Most networks and systems carry and accept traffic that may be both unencrypted and unauthenticated. As such, nearly anyone can introduce traffic on an arbitrary network, or at least modify traffic destined for an arbitrary network or system. While fully arbitrary packet creation, modification, and introduction is not generally possible, the ability to at least minimally manipulate packets destined for an arbitrary network or system is inherent in the design and implementation of the Internet. Packet manipulation is straightforward using available tools like Scapy, requiring only knowledge of basic object-oriented concepts and an understanding of the relevant network protocols.

It is the combination of an ability to manipulate network traffic and models which use network traffic as evidentiary inputs that we exploit in our work.

4. METHODOLOGY

Our goal for this project is to establish the viability of manipulating an adversary's perception that they are the target of a cyber attack when in fact they are not. We begin by conducting a sensitivity analysis of a known cyber attack detection model to identify candidate influence points. We design, construct, and inject network packets to trigger evidence at a subset of these influence points. We construct a corresponding deception model to assess the likelihood that our deception is effective. This derivative deception model combines the impact of our deception packets with other factors, such as the ease with which a target may invalidate the deception packets, the prevalence of alternative explanations for detection system alarms, and external indicators of the target's response activities. The impact of our deception packets is estimated by their effect on an alternative cyber attack detection system, in this case Snort. See Figure 3 for an overview of this process.

Figure 3: Process Overview

The deception model output, P(Successful Deception), is envisioned to be a dynamic value computed in real time as deception packets are delivered to a target and external observables are collected. As the deception operation unfolds and feedback from external observables is incorporated, additional existing deception packets may be injected, or influence points may be examined for additional deception packet development.
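The packet manipulation described in Section 3 is done in this work with Scapy; to keep the illustration self-contained, the sketch below shows the underlying kind of edit in plain Python: replacing an IPv4 packet's payload and then repairing the header fields (total length and header checksum) that the edit invalidates. The field offsets follow the IPv4 specification (RFC 791); the sample packet and helper names are ours, and real deception packets would also need TCP-level fixes.

```python
# Sketch of a low-level payload edit with header repair, assuming a plain
# IPv4 header with no options. This illustrates the kind of adjustment a
# tool like Scapy performs automatically; it is not Storm or Scapy code.

import struct

def ipv4_checksum(header: bytes) -> int:
    """Ones'-complement sum of 16-bit words (checksum field zeroed first)."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def set_payload(packet: bytes, payload: bytes) -> bytes:
    """Replace an IPv4 packet's payload, fixing total length and checksum."""
    ihl = (packet[0] & 0x0F) * 4               # header length in bytes
    header = bytearray(packet[:ihl])
    struct.pack_into("!H", header, 2, ihl + len(payload))  # total length
    struct.pack_into("!H", header, 10, 0)                  # zero checksum
    struct.pack_into("!H", header, 10, ipv4_checksum(bytes(header)))
    return bytes(header) + payload

# Synthetic 20-byte IPv4 header (version 4, IHL 5), then a deceptive
# payload carrying file-removal trigger bytes in the style of T5a below.
base = bytearray(20)
base[0] = 0x45
pkt = set_payload(bytes(base), b"rm -rf /tmp/x")
print(len(pkt), hex(struct.unpack("!H", pkt[10:12])[0]))
```

A receiver (or NIDS) that validates the checksum will accept the edited packet, which is why such bookkeeping matters when minimally modifying "clean" carrier traffic.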
Our base detection model is a Bayesian Network (Figure 2) from a test implementation of the Storm cyber attack detection system. A single node sensitivity to findings analysis (from Netica) is summarized in Table 1 for the 20 evidence input nodes. We reviewed this output and the descriptions of each signature to select those which (a) would have high impact on Storm's probability of compromise based on the sensitivity analysis, (b) could be reasonably developed into a deception packet or packets, and (c) could be general enough to be detected by a target's cyber attack detection system, i.e., not Storm. In Table 1, the eight rows marked "*" (non-gray in the original) are those that were selected for deception packet development (signatures T5a, T6a, T6b, T6d, T4a, T4b, T1e, and T1f).

Table 1: Storm model sensitivity analysis (* = selected for deception packet development)

  Signature   Mutual Info   Variance of Beliefs
  T5a *       0.03305       0.0011763
  T6e         0.02719       0.0008344
  T6a *       0.02719       0.0008344
  T6b *       0.02719       0.0008344
  T6d *       0.02719       0.0008344
  T6c         0.02719       0.0008344
  T6f         0.02719       0.0008344
  T4a *       0.01701       0.0004315
  T4b *       0.01701       0.0004315
  T3b         0.01658       0.0003618
  T3c         0.00702       0.0001202
  T3a         0.00259       0.0000372
  T1a         0.00002       0.0000002
  T1b         0.00002       0.0000002
  T1e *       0.00002       0.0000002
  T1c         0.00002       0.0000002
  T1d         0.00002       0.0000002
  T1f *       0.00002       0.0000002
  T2a         0.00001       0.0000002
  T2b         0.00001       0.0000002

Our test environment consisted of a Storm implementation running on Ubuntu 11.10 and a packet manipulation host running BackTrack 5. We captured normal network traffic in a test environment and processed the traffic through the Storm system to confirm that no attacks were detected. Storm, like most NIDS implementations, has the ability to ingest live network traffic as well as network traffic capture files without loss of accuracy or fidelity. Traffic was captured using the open source Wireshark tool, saved as a pcap file, then ingested by Storm. We labeled this original packet capture file "clean" and used it as the basis for our subsequent packet manipulations.

We used Scapy on the BackTrack 5 instance to craft deception packets. Scapy is an open source Python-based packet crafting and editing tool. Packets may be loaded from a pcap file, manipulated in an environment similar to a Python command shell, and written out to a pcap file. Scapy supports creation and modification of any packet field down to the byte level, including the raw creation and editing of packet data. To create our deception packets, we made minor modifications to packets from the clean set. By minimizing changes, we produced packets that maintained most of the clean session characteristics and so would not be blocked by a firewall or other packet screening device. Also, packets were modified only to the extent necessary to trigger the desired signature, so the modified packets do not contain any actual attacks. The modified packets and associated unmodified session packets, such as session establishment via the TCP 3-way handshake, were exported to a separate pcap file so they could be ingested by the Storm and Snort systems in a controlled manner.

The eight signatures selected for deception and the related deception packets are summarized in Table 2.

Table 2: Signatures and deception packets

Signature T1e
  Description: After 3-way handshake, DstPort=80, payload ≠ ASCII
  Explanation: Abnormal traffic to a web server (usually expect GET or POST with ASCII data)
  Deception packet: Inserted non-ASCII (hex > 7F) at beginning of payload for existing HTTP session

Signature T1f
  Description: After 3-way handshake, DstPort=25, payload ≠ ASCII
  Explanation: Abnormal traffic to a mail server (normally we expect plaintext commands)
  Deception packet: Edited HTTP session to use server port 25; inserted non-ASCII (hex > 7F) at beginning of payload

Signature T4a
  Description: Client to server traffic containing 20+ repeated ASCII characters
  Explanation: Buffer overflows often use a long string of ASCII characters to overflow the input buffer
  Deception packet: Inserted 43 "d" (hex 64) characters at the beginning of existing HTTP session payload

Signature T4b
  Description: Client payload contains 20+ identical and consecutive NOP instruction byte patterns
  Explanation: A "NOP sled" is a common technique used in buffer overflow exploits; the sled consists of multiple NOP (No Operation) instructions to ensure that the real instructions fall in the desired range
  Deception packet: Inserted 24 hex 90 (known NOP code) characters at the beginning of existing HTTP session payload

Signature T5a
  Description: Client to server traffic if port ≠ 23 and first 100 bytes of payload contain "rm", "rmdir", "rd", "del", or "erase"
  Explanation: File or directory removal activity
  Deception packet: Inserted "rm " (hex 726D20) characters at the beginning of existing HTTP session payload
Signature T6a
  Description: First two bytes of client to server payload = "MZ"
  Explanation: COM, DLL, DRV, EXE, PIF, QTS, QTX, or SYS file transfer for use in a backdoor
  Deception packet: Inserted "MZ" (hex 4D5A) and filename "exe" characters at the beginning of existing HTTP session payload

Signature T6b
  Description: New port opened on server; ignore first 500 packets after startup
  Explanation: Traffic from a port not previously seen might indicate the opening of a new back door
  Deception packet: Edited HTTP session to use server port 25 (ingested after first 500 packets)

Signature T6d
  Description: Unencrypted traffic on encrypted port
  Explanation: Traffic on encrypted sockets (HTTPS, SMTP with SSL, Secure Shell, etc.) should be encrypted once the session is established
  Deception packet: Inserted ASCII text in an established SSH session

Manual creation of the deception packets required a moderate amount of effort. When crafting deception packets, care must be taken to use carrier traffic which will be passed by a firewall or similar network security gateway while still triggering the desired signature. Flexibility in carrier traffic is signature dependent. For example, some signatures have offset or port dependencies, like requiring the traffic to be, or not to be, on port 80 (HTTP), while others are more flexible. Future work will develop an automated deception packet creation capability.

We began by identifying a candidate session for packet modification. For example, we could start with an existing HTTP or HTTPS session and alter packet payload, or we might also alter TCP ports. Payload modification required adjustments to the TCP checksum, IP checksum, and IP length values as well. To create a deception packet set, we loaded the clean pcap file in Scapy, made the desired packet modifications, and wrote the resulting packet set to a new pcap file. We then used Wireshark to confirm our modifications and to extract and save only the session of interest as a distinct pcap file. We confirmed our deception packets by processing them with Storm, and later Snort, in a controlled environment.

We tested single occurrences of each signature separately and in a subset of possible combinations. For each individual signature and for selected combinations, we also tested the effects of 10 and 20 signature instances. Finally, for selected signatures, we measured the effect of multiple occurrences for values 1, ..., 25. For all tests, we reset the Storm model, loaded the desired pcap file, and recorded the resulting P(C|E).

We then processed each of the deception pcap files (one per signature) with Snort, separately and in combinations and repetitions.

5. EXPERIMENTAL RESULTS

Each signature pcap file was processed by Storm in quantities of 1, 10, and 20 hits. Storm was reset after each run; that is, reset after a run of one T1e hit, then reset after a run of 10 T1e hits, then reset after a run of 20 T1e hits, etc. Results are recorded in Table 3.

Table 3: Single signature impact on P(C|E) with repetition

  ID    Short Description   Qty=1   Qty=10   Qty=20
  T1e   HTTPload!=ASCII     0.00    0.01     0.03
  T1f   SMTPload!=ASCII     0.00    0.01     0.03
  T4a   Repeated ASCII      0.01    0.05     0.16
  T4b   Repeated NOPs       0.01    0.05     0.15
  T5a   Cleanup cmds        0.02    0.09     0.27
  T6a   Executable load     0.02    0.08     0.22
  T6b   New server port     0.02    0.08     0.22
  T6d   Unencrypted SSL     0.02    0.08     0.22

The relationships between the quantity of signature hits and P(C|E) in each row indicate that setting the Bayesian Network findings is not a simple True/False assignment. To confirm this behavior, we recorded the effect on P(C|E) of 1, 2, ..., 25 signature hits for signatures T5a and T6a. These results are graphed in Figures 4a and 4b. The curves and apparent inflection points of the graphs indicate that the signature hits are subject to a ramping-up requirement at low quantities and a saturation adjustment at high quantities. This is in fact implemented by a pre-processing step in the Storm system and is not actually a part of the Bayesian Network component of Storm.
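Storm's exact pre-processing functions are not given here, but the ramp-up and saturation behavior just described can be sketched with a simple logistic mapping from raw hit counts to finding strength. The midpoint and steepness constants below are illustrative choices of ours, not Storm parameters.

```python
# One simple way to reproduce the qualitative shape seen in Figures 4a/4b:
# little effect for the first few hits (ramp-up), faster growth in the
# middle, and a plateau at high quantities (saturation). Constants are
# illustrative, not Storm's actual pre-processing parameters.

import math

def hit_strength(n, midpoint=10.0, steepness=0.4):
    """Map a raw signature hit count n to a finding strength in [0, 1)."""
    if n <= 0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-steepness * (n - midpoint)))

curve = [round(hit_strength(n), 2) for n in range(1, 26)]
print(curve)  # slow start, steep middle, flattens at high counts
```

The practical effect is the one reported above: a handful of hits barely move the model, while additional hits past the inflection point yield diminishing returns.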
Figure 4a: T5a signature hit effect for n = 1..25

Figure 4b: T6a signature hit effect for n = 1..25

We constructed 11 combinations of signature hits at selected quantities and tested each combination. The results are shown in Tables 4a and 4b below (split for readability).

Table 4a: Effect of signature combinations on P(C|E) (tests A-F, each combining single hits of subsets of the signatures T1c, T1e, T4a, T4b, T5a, T6a, T6b, and T6d)

  Test     A      B      C      D      E      F
  P(C|E)   0.10   0.02   0.46   0.86   0.89   0.79

Table 4b: Effect of signature combinations on P(C|E) (tests G-K, each combining hits of signatures T4a, T5a, and T6a in quantities of 1, 10, or 20; test K used 20 hits each of T5a and T6a)

  Test     G      H      I      J      K
  P(C|E)   0.37   0.93   0.58   0.72   0.78

The results indicate that the desired effect was achieved in two ways: (1) single hits across a wide range of signatures (e.g., tests D, E, and F), and (2) repeated hits on selected signatures (e.g., tests H and K). Assuming a threshold of P(C|E) > 0.75 for alerting, meta-event alerts could be generated with as few as five packets spread across five signatures as in test F, or 40 packets spread across only two different signatures as in test K.

We processed the same eight pcap deception files with Snort. Six of the eight files triggered Snort alerts, as summarized below in Table 5. Snort Priority 1 alerts are the most severe, Priority 3 the least. The name, classification, and priority of each signature are assigned by the signature author. A base Snort install contains signatures contributed by the Snort developers and the open source community. Our deception packets produced five Priority 1 alerts, one Priority 2 alert, and one Priority 3 alert. Two deception packets (pcap files for T5a and T6b) did not trigger any Snort alerts.

In a default configuration, and without any subsequent aggregation or alert thresholds, Snort alerts are explicitly linear, meaning that combination and repetition testing produced the obvious results. For example, running any one pcap file N times produces N alerts. Similarly, running the files in combination produced the expected total of alerts, e.g., running the entire set 10 times produced 70 Snort alerts.

Table 5: Snort alerts from deception packets

  File   Snort Alerts
  T5a    None
  T6a    Lotus Notes .exe script source download attempt [Classification: Web Application Attack] [Priority: 1]
  T6b    None
  T6d    Protocol mismatch [Priority: 3]; EXPLOIT ssh CRC32 overflow /bin/sh [Classification: Executable code was detected] [Priority: 1]
  T4a    MailSecurity Management Host Overflow Attempt [Classification: Attempted Admin Privilege Gain] [Priority: 1]
  T4b    SHELLCODE x86 NOOP [Classification: Executable code was detected] [Priority: 1]
  T1e    apache chunked enc mem corrupt exploit attempt [Classification: access to potentially vuln web app] [Priority: 2]
  T1f    x86 windows MailMax overflow [Classification: Attempted Admin Privilege Gain] [Priority: 1]

6. DECEPTION MODEL

A sample deception model is shown in Figure 5. The model consists of three key parts: (A) the deception packets, (B) external observables indicating a successful deception, and (C) external observables indicating an unsuccessful deception.
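The paper's conditional probability tables for this model are preliminary and not published here, so the following sketch uses a noisy-OR combination, a common way to build a CPT when several causes (packet effects and supporting observables) each independently raise the probability of an outcome. All cause names and weights below are hypothetical.

```python
# Illustrative noisy-OR combination for a Deception Success style node.
# This is one standard CPT construction, not the paper's actual model;
# every weight and cause name here is a hypothetical placeholder.

def noisy_or(leak, weights, active):
    """Probability of success given independently contributing causes.

    leak: baseline probability with no active causes.
    weights maps cause name -> probability that cause alone yields success.
    active: set of cause names currently observed.
    """
    p_fail = 1.0 - leak
    for cause, w in weights.items():
        if cause in active:
            p_fail *= 1.0 - w
    return 1.0 - p_fail

weights = {
    "packet_effect_T5a": 0.30,                  # hypothetical contributions
    "target_blocks_apparent_attacker": 0.50,
    "target_probes_apparent_attacker": 0.40,
}

print(noisy_or(0.05, weights, set()))               # baseline only
print(noisy_or(0.05, weights, {"packet_effect_T5a"}))
print(noisy_or(0.05, weights, set(weights)))        # all causes active
```

A construction like this keeps the table compact (one weight per parent instead of one entry per parent combination), which matters as packet and observable nodes are added.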
Each deception packet, area A in the figure, is assessed for alternative explanations. For example, a byte string that we embed in a JPEG image file may generate a NIDS alert for an unrelated attack, but upon examination will be discounted as a chance occurrence and hence a false alarm. Strong alternative explanations suggest that the target might not interpret the packet as part of an attack, and so would weaken the packet node's intended effect on the Successful Deception node. Similarly, each deception packet is assessed for how difficult it will be for a target to invalidate the packet. Again using the example of a byte string embedded in a JPEG image file, if the triggered NIDS alert is an exploit of image viewers, then the packet will be difficult to invalidate. A difficult-to-invalidate packet will have a strong positive influence on the Successful Deception node via the intended effect node.

Processing the original eight deception packets (pcap files) with Snort provides additional parameters for the model. The number, priority, and relevance of Snort alerts are used to build the Conditional Probability Table of the Deception Success node.

Figure 5: Bayes Net Cyber Deception Model

Area B in the graphic contains three nodes representing external observables which could indicate a successful deception. The "apparent target" and "apparent attacker" are the endpoints of the deception packets. As noted elsewhere, these systems may not send or receive any of the observed traffic, but they will be endpoints from a network monitor's point of view. If a target blocks the apparent target or attacker, or takes the apparent target off-line, then the deception is likely working. Similarly, if the target system operators probe the apparent attacker, then the deception is likely working.

Area C in the graphic contains two nodes for external observables which may indicate that the deception is not working. If the apparent target's response or processing time slows down, this may indicate that the target has added monitoring capabilities in order to trace the source of the deception, although this could also indicate monitoring in response to a perceived successful deception. The other node in area C is the worst case scenario. Although none of the deception packet contents are directly traceable to the actual perpetrators of the deception, probing of the perpetrators' systems, especially from the target of the deception, might indicate that the deception has failed and the target suspects the true source of the deception.

The model of Figure 5 is a work in progress at the time of this writing. Preliminary values for the conditional probability tables have been developed but not yet tested or refined. Our work suggests that such a deception model may be developed for other domains where we have some control over the inputs to the base model. Our process in Figure 3 may be generalized by replacing "packets" with "evidence", since packets are simply our mechanism for affecting an evidentiary input node. Generally, a derived deception model consists of the nodes that we directly influence, their estimated effect on the target's model, and the external observables.

7. CONCLUSIONS AND FUTURE WORK

We demonstrated the ability to construct network packets which look similar to normal network traffic, pass through a typical firewall, trigger specific attack element signatures, and have a controlled impact on a back-end cyber attack detection reasoning model. Further, we proposed a derived deception model to dynamically assess the effectiveness of the cyber deception activities, and we suggested how such a deception model might be constructed for other domains. In support of cyber
defense, our work also supports the testing and Finally, we will continue the development and development of more accurate reasoning models and generalization of deriving deception models from research geared towards detecting deception. detection models. An apparent limitation of our work is a requirement to References know the signatures which trip alerts and are fed to the Anderson, J. P. (1980). Computer security threat back end reasoning model. However, this is not monitoring and surveillance (Vol. 17). Technical report, necessarily true. While these signatures may be known, as James P. Anderson Company, Fort Washington, in the case of systems leveraging open source tools like Pennsylvania. Snort, it is also true that a system designed to detect specific attacks or attacks of a certain class will use Bell, J. B., & Whaley, B. (1982). Cheating: deception in similar signatures. The common requirement to derive a war & magic, games & sports, sex & religion, business & discriminatory signature that is as short as possible results con games, politics & espionage, art & science. St in different entities independently producing similar Martin's Press. signatures. We have observed this effect in the NIDS domain, where commercial and open source tools have Bell, J. B., & Whaley, B. (1991). Cheating and deception. similar signature sets for many attacks. Similarly, we Transaction Publishers. observe this effect in the antivirus and malware detection industry, where different vendors and open source Boukhtouta, A., Lakhdari, N. E., Mokhov, S. A., & providers frequently generate similar signatures Debbabi, M. (2013). Towards fingerprinting malicious independently. The implication is that we could develop traffic. Procedia Computer Science, 19, 548-555. probable signatures for specific attacks or behaviors, then develop deception packets to trip these signatures with a Bowen, B. M., Hershkop, S., Keromytis, A. 
D., & Stolfo, reasonable expectation of successfully affecting a target S. J. (2009). Baiting inside attackers using decoy system using unknown signatures. We partially documents (pp. 51-70). Springer Berlin Heidelberg. demonstrated this by processing our Storm-derived packets with Snort. Cohen, F. (1998). A note on the role of deception in information protection. Computers & Security, 17(6), As noted above, we assert that the use of pcap files is 483-506. equivalent for our purposes to live network traffic capture and processing. However, it is true that in most live Cohen, F. (1998). The deception toolkit. Risks Digest, 19. network scenarios we will not be able to put both sides of a TCP session on the wire as we did in this work. Rather, Crouse, M. B. (2012). Performance Analysis of Cyber we will have to establish a live session with a target Deception Using Probabilistic Models (Master's Thesis, computer and modify subsequent session packets in real Wake Forest University). time, or we will have to intercept and modify packets between a target and some other system. This is an Heberlein, L. T., Dias, G. V., Levitt, K. N., Mukherjee, implementation issue vs. a question of validity, as the B., Wood, J., & Wolber, D. (1990, May). A network results presented here hold regardless of how the security monitor. In Research in Security and Privacy, deceptive packets are introduced. 1990. Proceedings., 1990 IEEE Computer Society Symposium on (pp. 296-304). IEEE. 68 Hofmann, A., & Sick, B. (2011). Online intrusion alert Information Warfare & Security (p. 177). Academic aggregation with generative data stream modeling. Conferences Limited. Dependable and Secure Computing, IEEE Transactions on, 8(2), 282-294. Tan, K. L. G. (2003). Confronting cyberterrorism with cyber deception (Doctoral dissertation, Monterey, Hussein, S. M., Ali, F. H. M., & Kasiran, Z. (2012, May). California. Naval Postgraduate School). 
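The signature-driven packet creation proposed for future work can be outlined in a few lines. The sketch below is a hypothetical illustration, not the authors' tool: it pulls the `content` byte patterns out of a single Snort rule string (including Snort's `|..|` hex-byte notation) and embeds them in benign filler, yielding a payload that would satisfy the rule's content checks. A real generator would also have to honor offsets, depth, PCRE, and flow constraints; the example rule here is made up.

```python
import re

def extract_contents(rule: str) -> list[bytes]:
    """Pull the content:"..." patterns from one Snort rule.

    Snort encodes raw bytes inside |..| as hex, e.g. content:"|90 90|cmd.exe".
    """
    patterns = []
    for match in re.findall(r'content:"([^"]+)"', rule):
        out = b""
        # Split on |hex| sections and decode each chunk appropriately.
        for i, part in enumerate(match.split("|")):
            if i % 2:  # odd chunks are runs of hex bytes
                out += bytes.fromhex(part.replace(" ", ""))
            else:      # even chunks are literal text
                out += part.encode("latin-1")
        patterns.append(out)
    return patterns

def build_deception_payload(rule: str, filler: bytes = b" benign text ") -> bytes:
    """Embed every content pattern from the rule in otherwise benign filler."""
    return filler + filler.join(extract_contents(rule)) + filler

# A made-up rule in Snort's general syntax, for illustration only.
rule = ('alert tcp any any -> any 80 (msg:"example"; '
        'content:"|90 90 90|"; content:"/cmd.exe"; sid:1000001;)')

payload = build_deception_payload(rule)
print(payload)
```

Wrapping such payloads into valid TCP sessions (e.g., with a packet-crafting library) would then produce pcap files, or live traffic, that trigger the targeted alerts without containing a working exploit.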