=Paper=
{{Paper
|id=Vol-1964/S3
|storemode=property
|title=Context-Based Heuristics in Attribution
|pdfUrl=https://ceur-ws.org/Vol-1964/S3.pdf
|volume=Vol-1964
|authors=Jim Q. Chen
|dblpUrl=https://dblp.org/rec/conf/maics/Chen17
}}
==Context-Based Heuristics in Attribution==
<pdf width="1500px">https://ceur-ws.org/Vol-1964/S3.pdf</pdf>
<pre>
Jim Q. Chen                                                       MAICS 2017                                                  pp. 175–180


                                  Context-Based Heuristics in Attribution
                                                             Jim Q. Chen, Ph.D.
                                                        National Defense University, U.S.A.


                              Abstract                                              What needs to be done in order to improve the process
  In cyber forensics, attribution of an attack, which finds out                  of attribution in the cyber domain so that direct retaliation
  details about the individual(s) who launched an attack, is                     in the cyber domain can be quickly launched should it be
  more important than mere identification of an attack, since a                  legal and necessary? To answer this question, the key
  precise response to the cyber attack heavily depends upon
                                                                                 components in attribution should be identified. With this
  attribution. The identification of the initiator(s) in attribution
  provides precise targeting for a counter-attack. However,                      identification, a novel approach can be figured out to ad-
  heuristics are typically deployed to find out information                      dress these key components ahead of time so that the time
  about attack actions rather than initiator(s) of attack actions.               needed for conducting attribution can be significantly re-
  This paper proposes a mechanism that utilizes a weight sys-                    duced.
  tem for guiding the way in which the heuristics prioritize the                    The paper is organized as follows: In Section 1, an in-
  discovery of attack initiator(s). Linking purpose, methods,
  time, location, and events with the identified device, the                     troduction to the challenge is provided. In Section 2, relat-
  proposed heuristic approach can serve as a path towards ac-                    ed works are examined. The current approaches and their
  curate and prompt attribution.                                                 limitations are analyzed. In Section 3, an innovative solu-
                                                                                 tion is proposed. In Section 4, this novel approach is ap-
                                                                                 plied to a hypothetical case. In Section 5, a conclusion is
                          Introduction                                           drawn.
It is not uncommon that a cyber attack is reported without
identification of the attacker(s). Quite often, cyber defense
mechanisms and cyber forensics can help to identify the
                                                                                                     Related Works
fact that a system has been hacked and compromised and                           Beebe (2009) calls for the design and implementation of
the data on the system have been stolen. However, it al-                         smart analytical algorithms in digital forensics since the
ways takes a lot more time and effort to find out who did                        “cost of human analytical time spent sifting through non-
it and why it was done. Attribution is hard to achieve even                      relevant search hits is a significant issue”. Beebe holds that
though it is possible. Without quick and accurate attribu-                       even though current “computational approaches for search-
tion, precise responses to the attacker(s) are delayed, and                      ing, retrieving and analyzing digital evidence are unneces-
direct cyber deterrence mechanisms become less effective.                        sarily simplistic”, there exists significant information re-
In some cases, indirect deterrence mechanisms, such as                           trieval overhead. Beebe argues that smart analytical algo-
diplomatic, economic, legal, military, or other national                         rithms should “clearly reduce information retrieval over-
security instruments, have been employed, especially in                          head”, “help investigators get to relevant data more quick-
dealing with nation-state attackers. Unfortunately, the indi-                    ly, reduce the noise investigators must wade through, and
rect deterrence mechanisms always take a long time to                            help transform data into information and investigative
be deployed and executed, as attribution and preparation                         knowledge.” In order to design such an intelligent algo-
for the use of non-cyber national security instruments re-                       rithm, heuristics should be looked into.
quire extra time in this process, thus causing the delay in                         Marti and Reinelt (2011) maintain that a good heuristic
response or retaliation. In addition, as correctly pointed out                   algorithm should fulfill the following properties: “A solu-
by Sterner (2011), the indirect deterrence mechanisms have                       tion can be obtained with reasonable computational effort”.
limited effect on non-nation-state attackers.                                    “The solution should be near optimal (with high probabil-


Copyright held by the author. All rights reserved. Copying permitted for
private and academic purposes.




ity)”. “The likelihood for obtaining a bad solution (far from
optimal) should be low”.                                                                    Figure 1. Golden Circle
   Hill-climbing algorithms belong to local search, which,
according to Kokash (1998), “is a version of exhaustive                      The Golden Circle is used for inspirational leadership.
search that only focuses on a limited area of the search                  The idea is to have a goal figured out and made known
space”. “Such algorithms consistently replace the current                 first, come up with a method or craft a strategy based on
solution with the best of its neighbors if it is better than the          the purpose, and then figure out what to do to achieve the
current.” However, a hill-climbing algorithm “always finds                goal.
the nearest local optima of low quality”. This issue is re-                  As shown in this figure, the component “what” repre-
ferred to as premature convergence. Heuristics are used to                sents actions or events. The component “how” represents
deal with this problem.                                                   the method or the strategy used in orchestrating these
   There are several different approaches in heuristics. The              events. It is relatively less obvious than the component
best-first search selects the best state in the list. Simulated           “what”. The component “why” represents the goal to be
annealing allows some moves to worse states in order to                   achieved via the method or the strategy employed. It is the
explore many regions of the state space. The A* algorithm,                least comprehensible element of these three components.
which uses a best-first search with a modified evaluation                 However, once an understanding of the goal is gained, an
function, selects the shortest path that has the minimal total            understanding of the whole picture and the relationship of
cost. However, in the first trial, as evaluation is not per-              all these events is acquired.
formed yet, it may select a path that is not the shortest one.               Given the representation in circles, this process can be
   In the context of attribution, is there a structural configu-          depicted as being inside out. In Sinek’s terms, it all starts
ration that helps to select the shortest path in the first trial?         with why. Sinek (2009) even looks at how this representa-
If there is one, what is it? How does this work? These are                tion corresponds with the major levels of the brain. The
the questions that are addressed in the next section.                     “what” level corresponds with neocortex, while the “how”
                                                                          level and the “why” level correspond with limbic brain.
                                                                          Neocortex is responsible for rational and analytical thought
                          Proposal                                        as well as language but it does not drive behavior. Limbic
A novel context-based heuristic approach is proposed in                   brain, which drives behavior, is responsible for feelings,
this section. Here, the relationship among the components                 such as trust and loyalty, as well as all human behavior and
for attribution is analyzed and a weight system is em-                    decision making.
ployed. Combining this weight system with the Contextual                     This model demonstrates that a purpose (i.e. the “why”
Binding Condition, this new context-based heuristic ap-                   component) drives methods or strategies (i.e. the “how”
proach is designed to discover the shortest and the most                  component), which, in turn, drive actions (i.e. the “what”
optimal path for attribution.                                             component). From this perspective, the “why” component
   To accurately attribute an event to an individual, all the             is more important than the “how” component, and the
following elements should be addressed: “who”, “what”,                    “how” component is more important than the “what” com-
“when”, “where”, “how”, and “why”. To do so, it is crucial                ponent.
to find out the relationship among these elements.                           It has to be pointed out that as the purpose of the Golden
   Sinek (2009) does a very good job in explaining the rela-              Circle is not for attribution, other important components
tionship among some components, such as “what”, “how”,                    such as “who”, “when”, and “where”, are not included in
and “why”, via the Golden Circle, as shown in Figure 1                    the Golden Circle. However, to build the Attribution Circle
below:                                                                    on the basis of the Golden Circle, these three components
                                                                          have to be included. What needs to be discovered is the
                                                                          relationship among all these components.
                                                                             It needs to be noted that the component “who”, which
                                                                          represents the human component, possesses the highest
                                                                          priority in any investigation as it directly points to the
                                                                          individual(s) who conducted the action. Other factors, such
                                                                          as the reason why the action was conducted, the way the
                                                                          action was conducted, the action that was conducted, the
                                                                          place where it was conducted, and the time when it was
                                                                          conducted, are all directly associated with the human com-
                                                                          ponent, i.e. the “who” component. To a certain extent, they
                                                                          are the attributes of the “who” component, which repre-
                                                                          sents the initiator of an action. It is the human who has a




purpose or a goal. It is the human who comes up with a
method or a strategy to achieve the goal. Of course, the                                   Figure 3. Inside Out
method or the strategy has to be associated with location
and time. It is the human who conducts the action based on                However, in the cyber forensics environment, an effec-
the method or the strategy. The action has to occur in a               tive directional relationship is outside in. Investigators usu-
specific location within a specific time. This is why this             ally observe seemingly irrelevant actions in different loca-
human component should hold relatively the highest                     tions at different times. The analysis helps them to link the
weight in the Attribution Circle. Also, the component                  dots of these actions and eventually to figure out the meth-
“who” is closely tied to all other components as it is the             od or the strategy used. Based on the understanding of the
initial driver who makes all these happen.                             method or the strategy used as well as the link between an
   The component “why” is the second most crucial ele-                 action and an actor, the attack can eventually be attributed
ment, as it drives the component “how”, which, in turn,                to the suspect(s). This reflects an outside-in directional
drives the component “what”. This is why it should possess             relationship, which is displayed in Figure 4 below:
the second highest weight in the Attribution Circle. For the
same reason, the component “how” should hold a weight
that is less than that of the component “why” but more than
that of the component “what”. As location (i.e. the compo-
nent “where”) and time (i.e. the component “when”) are
the attributes for a method (i.e. the component “how”) or
an action (i.e. the component “what”), they should hold a
weight that is less than that of the component “how”. Natu-
rally, a weight system comes into being.
   All these relations can be successfully captured in the
Attribution Circle proposed in Figure 2 below:                                             Figure 4. Outside In

                                                                          Evidently, the directional relationship truly reflects the
                                                                       order of events. The Attribution Circle can effectively cap-
                                                                       ture the relationship.
                                                                          Based on the above analysis, the following stipulation
                                                                       can be made to capture the proportion of weight of proba-
                                                                       bility for each component in attribution:

                                                                         (1) Weight of probability for each component:
                                                                             “who”: W1 = 0.3
                                                                             “why”: W2 = 0.25
                Figure 2. Attribution Circle                                 “how”: W3 = 0.15
                                                                             “when”: W4 = 0.1
   In the leadership environment, an effective directional                   “where”: W5 = 0.1
relationship is inside out. Similarly, a well-designed attack                “what”: W6 = 0.1
follows this directional relationship. An attacker has a goal
to achieve. To achieve that goal, the attacker needs to fig-              The total weight of probability equals 1.
ure out a method or a strategy. The attacker then orches-                 If a component is known, it carries the value “1”. Oth-
trates various actions in different locations at different             erwise, it has the value “0”.
times according to the method or the strategy. This clearly               The probability of successful attribution can be expressed
reflects an inside-out directional relationship, which is dis-         as follows:
played in Figure 3 below:
                                                                         (2)  P(X) = \sum_{i=1}^{6} (X_i * W_i)
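
   The weights stipulated in (1), together with the value 1 for a known component and
0 for an unknown one, combine into a weighted sum. A minimal Python sketch follows. The
names WEIGHTS and attribution_probability and the dictionary layout are assumptions of
this sketch, not part of the paper.

```python
# Weights from stipulation (1); storing them in a dict is an assumption of this sketch.
WEIGHTS = {
    "who": 0.30,
    "why": 0.25,
    "how": 0.15,
    "when": 0.10,
    "where": 0.10,
    "what": 0.10,
}


def attribution_probability(known):
    """Sum the weights of the known components (X_i = 1 if known, else 0)."""
    return sum(weight for name, weight in WEIGHTS.items() if name in known)


all_known = attribution_probability(WEIGHTS.keys())
observed_only = attribution_probability({"what", "when", "where"})
print(round(all_known, 2), round(observed_only, 2))
```

   Up to floating-point rounding, the two results correspond to the values 1 and 0.3
that the paper gives for the fully known case and for the case where only “what”,
“when”, and “where” are known.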


                                                                          Given the weight of each component listed in (1), the
                                                                       formula in (2) can be expanded as follows:
                    Figure 3. Inside Out




  (3)  P(X) = \sum_{i=1}^{6} (X_i * W_i)                                  for the attributes “who” and “why”. Once these two attrib-
                                                                          utes are known, 55%, i.e. (1*0.3) + (1*0.25) = 0.55, of the
                                                                          puzzle is solved. Let us compare the pair of the attributes
                                                                          “who” and “why” with the pair of attributes “how” and
                                                                          “what”. As the weight of the attribute “how” is 0.15 and
                = (X1*W1) + (X2*W2) + (X3*W3) + (X4*W4)                   the weight of the attribute “what” is 0.1, the total weight of
                   + (X5*W5) + (X6*W6)
                                                                          the latter pair is P(X) = (1*0.15) + (1*0.1) = 0.25. This
                = (1*0.3) + (1*0.25) + (1*0.15) + (1*0.1)
                                                                          means that getting to know these two attributes solves 25%
                   + (1*0.1) + (1*0.1)
                                                                          of the puzzle. Evidently, 25% is less than 55%; and the
                = 0.3 + 0.25 + 0.15 + 0.1 + 0.1 + 0.1
                                                                          pair of the attributes “how” and “what” has less priority
                =1
                                                                          than the pair of the attributes “who” and “why” does. With
                                                                          such a weight system in place, the attribute “who” is al-
   This means that if all six components are known, the
                                                                          ways the first one to go after if it is unknown. The attribute
attack can be successfully attributed to the individual who
                                                                          “why” is the second one to go after, and the attribute
launched it.
                                                                          “how” is the third one to go after. The pair of the attributes
   Also, when the attributes represented by these compo-
                                                                          that possesses the highest weight, i.e. the attributes of
nents are all properly addressed in an expected way, the
                                                                          “who” and “why”, which possesses 55% of the total
Revised Restrictive Contextual Binding Condition pro-
                                                                          weight, is the first one to go after as a pair. The pair of the
posed in Chen (2016) is satisfied, as the variables are
                                                                          attributes that holds the second highest weight, i.e. the at-
properly bound by their corresponding contextual opera-
                                                                          tributes of “who” and “how”, which holds 45% of the total
tors. This binding condition is listed below:                             weight, is the second one to go after as a pair. As shown
   Assume X is an entity, and CO is a contextual operator.
                                                                          here, the weight system proposed in this paper helps to set
   (4) In a specialized time, location, environment, and                  up the priority in the search and helps to heuristically
       background, if X is directly related to CO with re-
                                                                          choose an optimal path for the quest. This structural con-
       spect to all the attributes such as action-initiator
                                                                          figuration helps to select the shortest path in the first trial,
       (who),        action      (what),        action-recipient          thus making heuristic algorithms more optimal and more
       (who/what_recipient), time (when), location
                                                                          efficient, especially in the quest for attribution.
       (where), method (how), and purpose (why) in such a
                                                                             In addition, this weight system can help the process of
       setting:                                                           intelligence collection for the sake of prevention in the
          COi[WHO1, WHAT2, WHAT_RECIPIENT3,
                                                                          cyber domain. If a request for a service is received from a
          WHEN4, WHERE5, HOW6, WHY7]                                      device that is unknown, the server service should hold the
          {……Xi[WHO1,WHAT2,
                                                                          normal response and immediately start the query for the
          WHAT_RECIPIENT3,                                                unknown factors. Picking up the component with the heav-
          WHEN4, WHERE5, HOW6, WHY7]……}                                   iest weight in the list, the server service goes after the
       then Xi is contextually bound by COi in a restrictive              component “who”. The server service now engages the
       way.
                                                                          device of the attack-initiator in a dialog by asking it
   As pointed out in Chen (2016), this is a typical represen-             questions related to the “who” attribute. The idea is to
tation of Type 1 Binding as all the attributes in the variable
                                                                          make the device of the attack-initiator reveal its identity
are contextually bound by the attributes in the contextual                information. If no answer or unsatisfactory answer is re-
operator. “If one contextual attribute in the variable is not
                                                                          ceived, the request from the attack device is immediately
directly related to the corresponding attribute in the contex-
                                                                          rejected and the normal response is not provided at all. If a
tual operator, the variable is not contextually bound by the
                                                                          satisfactory answer is received, the server service goes
contextual operator in the restrictive sense.”
                                                                          after the component “why”, which possesses the second
   Putting (3) and (4) together, if all the attributes of a vari-         heaviest weight in the list. The server service now asks the
able (i.e. “who”, “why”, “how”, “when”, “where”, and
                                                                          device that makes the request to provide reasons for its
“what”) are known, then P(X) = 1, and the variable is
                                                                          request. Again, if no answer or unsatisfactory answer is
properly (i.e. 100%) bound by the contextual operator                     received, the request from the attack device is immediately
(CO). However, if only “what”, “when”, and “where” are
                                                                          rejected and the normal response is not provided at all.
known, then P(X) = (1*0.1) + (1*0.1) + (1*0.1) = 0.3, and
                                                                          Otherwise, a normal response is provided. The questions
the variable is 30% bound by the CO.                                      related to the “why” attribute can help to detect a zombie
   As the attribute “who” possesses the highest weight, i.e.
                                                                          since a zombie either does not have a good reason for the
0.3, and the attribute “why” possesses the second highest
                                                                          request or has to wait for the attack-initiator to provide a
weight, i.e. 0.25, the absence of these two attributes imme-
                                                                          reason. The unsatisfactory answer or the delay in response
diately points out a new path of search, namely, the quest
                                                                          is a good indicator in detecting a zombie system. Evident-




ly, this new context-based heuristic approach can help in-               et that the switch receives. The switch will ask the router
telligence collection for the sake of prevention.                        that it directly connects to for the source MAC address and
   Chen and Dinerman (2016) examine the unique charac-                   the source IP address within the echo packets that the rout-
teristics of cyber conflicts and discover the following three            er receives. The router provides the information. Now, the
cyber feature sets, namely intelligence collection, stealth              MAC address and the IP address that sends the echo pack-
maneuvers, and surprise effect. They argue that these                    ets to the router are discovered. The engagement mecha-
unique feature sets can be turned into unique cyber capa-                nism approaches that device and asks the same question.
bilities that serve as force multipliers, if they are integrated         This process keeps running until it reaches the device
appropriately into conventional conflicts as complementary               that launches these echo packets.
military capacities. As shown in this paper, this new con-                  Once it gets to the device that launches these echo pack-
text-based heuristic approach not only can assist intelli-               ets, the engagement mechanism makes an inquiry about the
gence collection but also can speed up the attribution pro-              attribute “why”, which possesses 25% of the total weight. If
cess. This capability is exactly what is needed for force                this device is a zombie, it may provide an unsatisfactory
multipliers.                                                             reason; or it may be slow in providing the reason as it waits
                                                                         for it from the command and control (C2) server. Note that
                                                                         this type of control requires connectivity. If the engage-
                        Case Study                                       ment mechanism further asks for the current status of its
In this section, the proposed context-based heuristics is                connectivity, and if the zombie device provides the answer,
applied to a hypothetical case, which is a typical attribution           the IP address of the C2 server is revealed.
challenge.                                                                  Using the same back-tracking method, the engagement
   Let us assume that a server suddenly receives 2,000 re-               mechanism can eventually trace to the C2 server. From the
petitive packets within a second from the same source right              neighboring device of this C2 server, the engagement
at 5:00 PM on Monday. This abnormal behavior immedi-                     mechanism is able to find out the MAC address as well as
ately triggers the context-based heuristics for investigation,           the IP address of the C2 server. Once discovered, the en-
as the server usually receives less than 1,000 different                 gagement mechanism makes an inquiry about the attribute
packets within a minute. A quick scrutiny reveals the pack-              “why”. The C2 server either refuses to provide an answer
ets are all echo packets utilizing UDP Port 7. The message               or provides an unsatisfactory answer. This may give away its
echoed is exactly the same. This started a minute ago. It                real intention. At this point, a close surveillance is initiated
only occurs on this particular server at that time.                      in order to find out the host name of the devices and the
   This quick scrutiny discloses the attributes of “what”,               user name if possible. In addition, the engagement mecha-
“when”, “where”, and “how”. The fact that the server is hit              nism tries to verify if the device is used by the real attack
by 2,000 echo packets per second accounts for the attribute              initiator and if the owner/user of the device is the real at-
of “what”. The time at 5:00 PM on Monday accounts for                    tacker. Eventually, 100% of the puzzle is solved, or at least
the attribute of “when”. The location of the server accounts             a very high percentage of the puzzle is solved.
for the attribute of “where”. Echo packets utilizing UDP                    Note that this operation is conducted at the very early
Port 7 in that particular location at that particular time ac-           stage of a denial of service attack. So, deterrence mecha-
count for the attribute of “how”. So far, the known attrib-              nisms, defense mechanisms, and recovery mechanisms can
utes are “what”, “when”, “where”, and “how”. The un-                     be immediately launched to halt the denial of service at-
known attributes are “who” and “why”. Given the                          tack. In cyber operations, every minute counts. The sooner
weight system, the weight of the known attributes is                     an attacker can be identified, the sooner a counter-attack
(1*0.15) + (1*0.1) + (1*0.1) + (1*0.1) = 0.15 + 0.1 + 0.1                can be launched, and the less impact can be left on the af-
+ 0.1 = 0.45, namely, 45% of the puzzle is known. The                    fected systems and networks. Meanwhile, the evidence
context-based heuristics recommends an inquiry for the                   collected can be used for prosecution and retaliation pur-
attribute “who” first as it possesses 30% of the total                   poses. This supports cyber deterrence.
weight.                                                                     As shown in this hypothetical case, the context-based
   Now, the engagement mechanism is triggered, and the                   heuristics plays a significant role in searching for a target and
intelligence collection process gets started. It examines the            in collecting intelligence and evidence about the target.
source MAC address and the source IP address within the                  Without doubt, it supports accurate attribution.
echo packets. As the source MAC address is the address of
the switch that the server is directly connected to, the serv-
                                                                                                 Conclusion
er asks the switch for the source MAC address of the pack-
                                                                         Attribution is a challenge in the cyber domain. However,
                                                                         as shown in this paper, heuristics can guide the most opti-




mal search based on some structural configurations with a
weight system. Eventually, it is capable of limiting the
search time of information discovery heuristics in support-
ing cyber operations. Linking purpose, methods, time, lo-
cation, and events with the identified device, the proposed
heuristic approach can serve as a path towards accurate and
prompt attribution.
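
As an illustration of the prioritization summarized here, the rule of going after the
unknown attribute with the heaviest weight first can be sketched in Python, using the
case-study situation in which “what”, “when”, “where”, and “how” are already known. The
names WEIGHTS, query_order, and case_known are hypothetical. Keeping the listed order
for attributes of equal weight is an assumption of this sketch, since the paper does
not specify a tie-breaking rule.

```python
# Weights from stipulation (1); the dict layout is an assumption of this sketch.
WEIGHTS = {
    "who": 0.30,
    "why": 0.25,
    "how": 0.15,
    "when": 0.10,
    "where": 0.10,
    "what": 0.10,
}


def query_order(known):
    """Return the unknown attributes, heaviest weight first."""
    unknown = [name for name in WEIGHTS if name not in known]
    # sorted() is stable, so attributes with equal weights keep the order
    # in which they appear in WEIGHTS (a tie-breaking assumption).
    return sorted(unknown, key=lambda name: WEIGHTS[name], reverse=True)


# Case study: the quick scrutiny has disclosed four of the six attributes.
case_known = {"what", "when", "where", "how"}
coverage = sum(WEIGHTS[name] for name in case_known)
order = query_order(case_known)
print(round(coverage, 2), order)
```

Under these assumptions the known attributes cover about 45% of the total weight, and
the inquiry proceeds to “who” and then “why”, matching the order followed in the case
study.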


                        References
Beebe, N. 2009. Digital Forensic Research: the Good, the Bad
and the Unaddressed. Advances in Digital Forensics V, Springer.
pp. 17-36.

Chen, J. 2016. Contextual Binding and Intelligent Targeting.
Proceedings of the 2016 IEEE/WIC/ACM International Confer-
ence on Web Intelligence. pp.701-704.

Chen, J. & Dinerman, A. 2016. On Cyber Dominance in Modern
Warfare, Proceedings of the 15th European Conference on
Cyber Warfare and Security. pp.52-57. Reading, UK: Academic
Conferences & Publishing International (ACPI) Limited.

Kokash, N. 1998. An Introduction to Heuristic Algorithms. Uni-
versity of Trento, Italy.

Marti, R. & Reinelt, G. 2011. Heuristic Methods. The Linear
Ordering Problem, Exact and Heuristic Methods in Combinatorial
Optimization 175, DOI: 10.1007/978-3-642-16729-4_2. pp.17-
40. Berlin: Springer-Verlag.

Sinek, S. 2009. Start with Why: How Great Leaders Inspire Eve-
ryone to Take Action. USA: Penguin Group.

Sterner, E. 2011. Deterrence in Cyberspace: Yes, No, Maybe.
Returning to Fundamentals: Deterrence and U.S. National Secu-
rity in the 21st Century. pp. 27. Washington DC: George C. Mar-
shall Institute.



</pre>