=Paper=
{{Paper
|id=Vol-1964/S3
|storemode=property
|title=Context-Based Heuristics in Attribution
|pdfUrl=https://ceur-ws.org/Vol-1964/S3.pdf
|volume=Vol-1964
|authors=Jim Q. Chen
|dblpUrl=https://dblp.org/rec/conf/maics/Chen17
}}
==Context-Based Heuristics in Attribution==
Jim Q. Chen MAICS 2017 pp. 175–180

Context-Based Heuristics in Attribution

Jim Q. Chen, Ph.D.

National Defense University, U.S.A.

Copyright held by the author. All rights reserved. Copying permitted for private and academic purposes.

===Abstract===
In cyber forensics, attribution of an attack, which finds out details about the individual(s) who launched an attack, is more important than mere identification of an attack, since a precise response to the cyber attack heavily depends upon attribution. The identification of the initiator(s) in attribution provides precise targeting for a counter-attack. However, heuristics are typically deployed to find out information about attack actions rather than initiator(s) of attack actions. This paper proposes a mechanism that utilizes a weight system for guiding the way in which the heuristics prioritize the discovery of attack initiator(s). Linking purpose, methods, time, location, and events with the identified device, the proposed heuristic approach can serve as a path towards accurate and prompt attribution.

===Introduction===
It is not uncommon that a cyber attack is reported without identification of the attacker(s). Quite often, cyber defense mechanisms and cyber forensics can help to identify the fact that a system has been hacked and compromised and the data on the system have been stolen. However, it always takes a lot more time and effort to find out who did it and why it was done. Attribution is hard to do even though it is possible. Without quick and accurate attribution, precise responses to the attacker(s) are delayed, and direct cyber deterrence mechanisms become less effective. In some cases, indirect deterrence mechanisms, such as diplomatic, economic, legal, military, or other national security instruments, have been employed, especially in dealing with nation-state attackers. Unfortunately, the indirect deterrence mechanisms always take a long time to be deployed and executed, as attribution and preparation for the use of non-cyber national security instruments require extra time in this process, thus causing the delay in response or retaliation. In addition, as correctly pointed out by Sterner (2011), the indirect deterrence mechanisms have limited effect on non-nation-state attackers.

What needs to be done in order to improve the process of attribution in the cyber domain so that direct retaliation in the cyber domain can be quickly launched should it be legal and necessary? To answer this question, the key components in attribution should be identified. With this identification, a novel approach can be figured out to address these key components ahead of time so that the time needed for conducting attribution can be significantly reduced.

The paper is organized as follows: In Section 1, an introduction to the challenge is provided. In Section 2, related works are examined. The current approaches and their limitations are analyzed. In Section 3, an innovative solution is proposed. In Section 4, this novel approach is applied to a hypothetical case. In Section 5, a conclusion is drawn.

===Related Works===
Beebe (2009) calls for the design and implementation of smart analytical algorithms in digital forensics since the “cost of human analytical time spent sifting through non-relevant search hits is a significant issue”. Beebe holds that even though current “computational approaches for searching, retrieving and analyzing digital evidence are unnecessarily simplistic”, there exists significant information retrieval overhead. Beebe argues that smart analytical algorithms should “clearly reduce information retrieval overhead”, “help investigators get to relevant data more quickly, reduce the noise investigators must wade through, and help transform data into information and investigative knowledge.” In order to design such an intelligent algorithm, heuristics should be looked into.

Marti and Reinelt (2011) maintain that a good heuristic algorithm should fulfill the following properties: “A solution can be obtained with reasonable computational effort”. “The solution should be near optimal (with high probability)”. “The likelihood for obtaining a bad solution (far from optimal) should be low”.

Hill-climbing algorithms belong to local search, which, according to Kokash (1998), “is a version of exhaustive search that only focuses on a limited area of the search space”. “Such algorithms consistently replace the current solution with the best of its neighbors if it is better than the current.” However, a hill-climbing algorithm “always finds the nearest local optima of low quality”. This issue is referred to as premature convergence. Heuristics are used to deal with this problem.

There are several different approaches in heuristics. The best-first search selects the best state in the list. Simulated annealing allows some moves to worse states in order to explore many regions of the state space. The A* algorithm, which uses a best-first search with a modified evaluation function, selects the shortest path that has the minimal total cost. However, in the first trial, as evaluation is not performed yet, it may select a path that is not the shortest one.

In the context of attribution, is there a structural configuration that helps to select the shortest path in the first trial? If there is one, what is it? How does this work? These are the questions that are addressed in the next section.

===Proposal===
A novel context-based heuristic approach is proposed in this section. Here, the relationship among the components for attribution is analyzed and a weight system is employed. Combining this weight system with the Contextual Binding Condition, this new context-based heuristic approach is designed to discover the shortest and the most optimal path for attribution.

To accurately attribute an event to an individual, all the following elements should be addressed: “who”, “what”, “when”, “where”, “how”, and “why”. To do so, it is crucial to find out the relationship among these elements.

Sinek (2009) does a very good job in explaining the relationship among some components, such as “what”, “how”, and “why”, via the Golden Circle, as shown in Figure 1 below:

Figure 1. Golden Circle

The Golden Circle is used for inspirational leadership. The idea is to have a goal figured out and made known first, come up with a method or craft a strategy based on the purpose, and then figure out what to do to achieve the goal.

As shown in this figure, the component “what” represents actions or events. The component “how” represents the method or the strategy used in orchestrating these events. It is relatively less obvious than the component “what”. The component “why” represents the goal to be achieved via the method or the strategy employed. It is the least comprehensible element of these three components. However, once an understanding of the goal is gained, an understanding of the whole picture and the relationship of all these events is acquired.

Given the representation in circles, this process can be depicted as being inside out. In Sinek’s terms, it all starts with why. Sinek (2009) even looks at how this representation corresponds with the major levels of the brain. The “what” level corresponds with the neocortex, while the “how” level and the “why” level correspond with the limbic brain. The neocortex is responsible for rational and analytical thought as well as language but it does not drive behavior. The limbic brain, which drives behavior, is responsible for feelings, such as trust and loyalty, as well as all human behavior and decision making.

This model demonstrates that a purpose (i.e. the “why” component) drives methods or strategies (i.e. the “how” component), which, in turn, drive actions (i.e. the “what” component). From this perspective, the “why” component is more important than the “how” component, and the “how” component is more important than the “what” component.

It has to be pointed out that as the purpose of the Golden Circle is not for attribution, other important components such as “who”, “when”, and “where” are not included in the Golden Circle. However, to build the Attribution Circle on the basis of the Golden Circle, these three components have to be included. What needs to be discovered is the relationship among all these components.

It needs to be noted that the component “who”, which represents the human component, possesses the highest priority in any investigation as it directly pinpoints the individual(s) who conducted the action. Other factors, such as the reason why the action was conducted, the way the action was conducted, the action that was conducted, the place where it was conducted, and the time when it was conducted, are all directly associated with the human component, i.e. the “who” component. To a certain extent, they are the attributes of the “who” component, which represents the initiator of an action. It is the human who has a purpose or a goal. It is the human who comes up with a method or a strategy to achieve the goal. Of course, the method or the strategy has to be associated with location and time. It is the human who conducts the action based on the method or the strategy. The action has to occur in a specific location within a specific time. This is why this human component should hold relatively the highest weight in the Attribution Circle. Also, the component “who” is closely tied to all other components as it is the initial driver who makes all these happen.

The component “why” is the second most crucial element, as it drives the component “how”, which, in turn, drives the component “what”. This is why it should possess the second highest weight in the Attribution Circle. For the same reason, the component “how” should hold a weight that is less than that of the component “why” but more than that of the component “what”. As location (i.e. the component “where”) and time (i.e. the component “when”) are the attributes for a method (i.e. the component “how”) or an action (i.e. the component “what”), they should hold a weight that is less than that of the component “how”. Naturally, a weight system comes into being.

All these relations can be successfully captured in the Attribution Circle proposed in Figure 2 below:

Figure 2. Attribution Circle

In the leadership environment, an effective directional relationship is inside out. Similarly, a well-designed attack follows this directional relationship. An attacker has a goal to achieve. To achieve that goal, the attacker needs to figure out a method or a strategy. The attacker then orchestrates various actions in different locations at different times according to the method or the strategy. This clearly reflects an inside-out directional relationship, which is displayed in Figure 3 below:

Figure 3. Inside Out

However, in the cyber forensics environment, an effective directional relationship is outside in. Investigators usually observe seemingly irrelevant actions in different locations at different times. The analysis helps them to link the dots of these actions and eventually to figure out the method or the strategy used. Based on the understanding of the method or the strategy used as well as the link between an action and an actor, the actions can eventually be attributed to the suspect(s). This reflects an outside-in directional relationship, which is displayed in Figure 4 below:

Figure 4. Outside In

Evidently, the directional relationship truly reflects the order of events. The Attribution Circle can effectively capture the relationship.

Based on the above analysis, the following stipulation can be made to capture the proportion of weight of probability for each component in attribution:

(1) Weight of probability for each component:
* “who”: W1 = 0.3
* “why”: W2 = 0.25
* “how”: W3 = 0.15
* “when”: W4 = 0.1
* “where”: W5 = 0.1
* “what”: W6 = 0.1

The total weight of probability equals 1.

If a component is known, it carries the value “1”. Otherwise, it has the value “0”.

The probability of successful attribution can be expressed as follows:

(2) <math>P(X) = \sum_{i=1}^{6} (X_i \cdot W_i)</math>

Given the weight of each component listed in (1), the formula in (2) can be expanded as follows:

(3) <math>\begin{align}
P(X) &= \sum_{i=1}^{6} (X_i \cdot W_i) \\
&= (X_1 \cdot W_1) + (X_2 \cdot W_2) + (X_3 \cdot W_3) + (X_4 \cdot W_4) + (X_5 \cdot W_5) + (X_6 \cdot W_6) \\
&= (1 \cdot 0.3) + (1 \cdot 0.25) + (1 \cdot 0.15) + (1 \cdot 0.1) + (1 \cdot 0.1) + (1 \cdot 0.1) \\
&= 0.3 + 0.25 + 0.15 + 0.1 + 0.1 + 0.1 \\
&= 1
\end{align}</math>

This means that if all the six components are known, the attack can be successfully attributed to the individual who launched it.

Also, when the attributes represented by these components are all properly addressed in an expected way, the Revised Restrictive Contextual Binding Condition proposed in Chen (2016) is satisfied, as the variables are properly bound by their corresponding contextual operators. This binding condition is listed below:

Assume X is an entity, and CO is a contextual operator.

(4) In a specialized time, location, environment, and background, if X is directly related to CO with respect to all the attributes such as action-initiator (who), action (what), action-recipient (who/what_recipient), time (when), location (where), method (how), and purpose (why) in such a setting:

CO_i[WHO_1, WHAT_2, WHAT_RECIPIENT_3, WHEN_4, WHERE_5, HOW_6, WHY_7] {……X_i[WHO_1, WHAT_2, WHAT_RECIPIENT_3, WHEN_4, WHERE_5, HOW_6, WHY_7]……}

then X_i is contextually bound by CO_i in a restrictive way.

As pointed out in Chen (2016), this is a typical representation of Type 1 Binding as all the attributes in the variable are contextually bound by the attributes in the contextual operator. “If one contextual attribute in the variable is not directly related to the corresponding attribute in the contextual operator, the variable is not contextually bound by the contextual operator in the restrictive sense.”

Putting (3) and (4) together, if all the attributes of a variable (i.e. “who”, “why”, “how”, “when”, “where”, and “what”) are known, then P(X) = 1, and the variable is properly (i.e. 100%) bound by the contextual operator (CO). However, if only “what”, “when”, and “where” are known, then P(X) = (1*0.1) + (1*0.1) + (1*0.1) = 0.3, and the variable is 30% bound by the CO.

As the attribute “who” possesses the highest weight, i.e. 0.3, and the attribute “why” possesses the second highest weight, i.e. 0.25, the absence of these two attributes immediately points out a new path of search, namely, the quest for the attributes “who” and “why”. Once these two attributes are known, 55%, i.e. (1*0.3) + (1*0.25) = 0.55, of the puzzle is solved. Let us compare the pair of the attributes “who” and “why” with the pair of attributes “how” and “what”. As the weight of the attribute “how” is 0.15 and the weight of the attribute “what” is 0.1, the total weight of the latter pair is P(X) = (1*0.15) + (1*0.1) = 0.25. This means that getting to know these two attributes solves 25% of the puzzle. Evidently, 25% is less than 55%; and the pair of the attributes “how” and “what” has less priority than the pair of the attributes “who” and “why” does. With such a weight system in place, the attribute “who” is always the first one to go after if it is unknown. The attribute “why” is the second one to go after, and the attribute “how” is the third one to go after. The pair of the attributes that possesses the highest weight, i.e. the attributes of “who” and “why”, which possesses 55% of the total weight, is the first one to go after as a pair. The pair of the attributes that holds the second highest weight, i.e. the attributes of “who” and “how”, which holds 45% of the total weight, is the second one to go after as a pair. As shown here, the weight system proposed in this paper helps to set up the priority in the search and helps to heuristically choose an optimal path for the quest. This structural configuration helps to select the shortest path in the first trial, thus making heuristic algorithms more optimal and more efficient, especially in the quest for attribution.

In addition, this weight system can help the process of intelligence collection for the sake of prevention in the cyber domain. If a request for a service is received from a device that is unknown, the server service should hold the normal response and immediately start the query for the unknown factors. Picking up the component with the heaviest weight in the list, the server service goes after the component “who”. The server service now engages the device of the attack-initiator in a dialog by asking it questions related to the “who” attribute. The idea is to make the device of the attack-initiator reveal its identity information. If no answer or an unsatisfactory answer is received, the request from the attack device is immediately rejected and the normal response is not provided at all. If a satisfactory answer is received, the server service goes after the component “why”, which possesses the second heaviest weight in the list. The server service now asks the device that makes the request to provide reasons for its request. Again, if no answer or an unsatisfactory answer is received, the request from the attack device is immediately rejected and the normal response is not provided at all. Otherwise, a normal response is provided. The questions related to the “why” attribute can help to detect a zombie since a zombie either does not have a good reason for the request or has to wait for the attack-initiator to provide a reason. The unsatisfactory answer or the delay in response is a good indicator in detecting a zombie system. Evidently, this new context-based heuristic approach can help intelligence collection for the sake of prevention.

Chen and Dinerman (2016) examine the unique characteristics of cyber conflicts and discover the following three cyber feature sets, namely intelligence collection, stealth maneuvers, and surprise effect. They argue that these unique feature sets can be turned into unique cyber capabilities that serve as force multipliers, if they are integrated appropriately into conventional conflicts as complementary military capacities. As shown in this paper, this new context-based heuristic approach not only can assist intelligence collection but also can speed up the attribution process. This capability is exactly what is needed for force multipliers.

===Case Study===
In this section, the proposed context-based heuristics is applied to a hypothetical case, which is a typical attribution challenge.

Let us assume that a server suddenly receives 2,000 repetitive packets within a second from the same source at 5:00 PM on Monday. This abnormal behavior immediately triggers the context-based heuristics for investigation, as the server usually receives less than 1,000 different packets within a minute. A quick scrutiny reveals the packets are all echo packets utilizing UDP Port 7. The message echoed is exactly the same. This started a minute ago. It only occurs on this particular server at that time.

This quick scrutiny discloses the attributes of “what”, “when”, “where”, and “how”. The fact that the server is hit by 2,000 echo packets per second accounts for the attribute of “what”. The time at 5:00 PM on Monday accounts for the attribute of “when”. The location of the server accounts for the attribute of “where”. Echo packets utilizing UDP Port 7 in that particular location at that particular time account for the attribute of “how”. So far, the known attributes are “what”, “when”, “where”, and “how”. The unknown attributes are “who” and “why”. Given the weight system, the weight of the known attributes is (1*0.15) + (1*0.1) + (1*0.1) + (1*0.1) = 0.15 + 0.1 + 0.1 + 0.1 = 0.45, namely, 45% of the puzzle is known. The context-based heuristics recommends an inquiry for the attribute “who” first as it possesses 30% of the total weight.

Now, the engagement mechanism is triggered, and the intelligence collection process gets started. It examines the source MAC address and the source IP address within the echo packets. As the source MAC address is the address of the switch that the server is directly connected to, the server asks the switch for the source MAC address of the packet that the switch receives. The switch will ask the router that it directly connects to for the source MAC address and the source IP address within the echo packets that the router receives. The router provides the information. Now, the MAC address and the IP address of the device that sends the echo packets to the router are discovered. The engagement mechanism approaches that device and asks the same question. This process keeps running until it reaches the device that launches these echo packets.

Once it gets to the device that launches these echo packets, the engagement mechanism makes an inquiry about the attribute “why”, which possesses 25% of the total weight. If this device is a zombie, it may provide an unsatisfactory reason; or it may be slow in providing the reason as it waits for it from the command and control (C2) server. Note that this type of control requires connectivity. If the engagement mechanism further asks for the current status of its connectivity, and if the zombie device provides the answer, the IP address of the C2 server is revealed.

Using the same back-tracking method, the engagement mechanism can eventually trace to the C2 server. From the right neighboring device of this C2 server, the engagement mechanism is able to find out the MAC address as well as the IP address of the C2 server. Once discovered, the engagement mechanism makes an inquiry about the attribute “why”. The C2 server either refuses to provide an answer or provides an unsatisfactory answer. This may give away its real intention. At this point, a close surveillance is initiated in order to find out the host name of the devices and the user name if possible. In addition, the engagement mechanism tries to verify if the device is used by the real attack initiator and if the owner/user of the device is the real attacker. Eventually, 100% of the puzzle is solved, or at least a very high percentage of the puzzle is solved.

Note that this operation is conducted at the very early stage of a denial of service attack. So, deterrence mechanisms, defense mechanisms, and recovery mechanisms can be immediately launched to halt the denial of service attack. In cyber operations, every minute counts. The sooner an attacker can be identified, the sooner a counter-attack can be launched, and the less impact is left on the affected systems and networks. Meanwhile, the evidence collected can be used for prosecution and retaliation purposes. This supports cyber deterrence.

As shown in this hypothetical case, the context-based heuristics plays a significant role in searching for a target and in collecting intelligence and evidence about the target. Without doubt, it helps accurate attribution.

===Conclusion===
Attribution is a challenge in the cyber domain. However, as shown in this paper, heuristics can guide the most optimal search based on some structural configurations with a weight system. Eventually, it is capable of limiting the search time of information discovery heuristics in supporting cyber operations. Linking purpose, methods, time, location, and events with the identified device, the proposed heuristic approach can serve as a path towards accurate and prompt attribution.

===References===
Beebe, N. 2009. Digital Forensic Research: the Good, the Bad and the Unaddressed. Advances in Digital Forensics V, Springer. pp. 17-36.

Chen, J. 2016. Contextual Binding and Intelligent Targeting. Proceedings of the 2016 IEEE/WIC/ACM International Conference on Web Intelligence. pp. 701-704.

Chen, J. & Dinerman, A. 2016. On Cyber Dominance in Modern Warfare. Proceedings of the 15th European Conference on Cyber Warfare and Security. pp. 52-57. Reading, UK: Academic Conferences & Publishing International (ACPI) Limited.

Kokash, N. 1998. An Introduction to Heuristic Algorithms. University of Trento, Italy.

Marti, R. & Reinelt, G. 2011. Heuristic Methods. The Linear Ordering Problem, Exact and Heuristic Methods in Combinatorial Optimization 175, DOI: 10.1007/978-3-642-16729-4_2. pp. 17-40. Berlin: Springer-Verlag.

Sinek, S. 2009. Start with Why: How Great Leaders Inspire Everyone to Take Action. USA: Penguin Group.

Sterner, E. 2011. Deterrence in Cyberspace: Yes, No, Maybe. Returning to Fundamentals: Deterrence and U.S. National Security in the 21st Century. pp. 27. Washington DC: George C. Marshall Institute.