<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>How do we effectively monitor for slow suspicious activities?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Harsha K. Kalutarage</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Siraj A. Shaikh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qin Zhou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anne E. James</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Security and Forensics (SaFe) Research Group, Department of Computing, Faculty of Engineering and Computing, Coventry University, Coventry</institution>
          ,
          <addr-line>CV1 5FB</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>As computer networks scale up in size and traffic volume, detecting slow suspicious activity, deliberately designed to stay beneath the threshold, becomes ever more difficult. Simply storing all packet captures for analysis is not feasible due to computational constraints. Detecting such activity depends on maintaining traffic history over extended periods of time, and using it to distinguish between suspicious and innocent nodes. The doctoral work presented here aims to adopt a Bayesian approach to address this problem, and to examine the effectiveness of such an approach under different network conditions: multiple attackers, traffic volume, subnet configuration and traffic sampling. We provide a theoretical account of our approach and very early experimental results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>We are particularly interested in studying subnet size and traffic volume, and how they may affect
our ability to distinguish such activity. From this we will draw network design principles for more
effective monitoring.</p>
      <p>3. How do we effectively detect the target of such activity?</p>
      <p>We acknowledge that the use of botnets and distributed sources makes it very difficult to attribute
attacks. Of further interest is determining the target of such activity. We will investigate methods
to profile such nodes. Such methods need to be effective for scalable networks.</p>
      <p>4. What effect does using sampling techniques as a logging method have?</p>
      <p>Traffic volumes will continue to increase. This makes it ever more difficult to process and effectively
monitor slow activity. Since we are not looking for strict traffic signatures, we wish to investigate
traffic sampling methods and evaluate their suitability for security monitoring of slow attacks.</p>
    </sec>
    <sec id="sec-2">
      <title>Research Methodology</title>
      <p>We treat the problem as two sub-problems: profiling and analysis. Profiling is the method for evidence
fusion across space and accumulation across time, which updates the normal node profiles dynamically
based on changes in evidence. Analysis is the method for distinguishing between anomalous and normal
profiles using statistical normality. We propose to use elements of network flow data as input to our
profiling method. Flow data contains network and port addresses, protocols, date and time, and the amount
of data exchanged during a session. We use a multivariate approach to analyse such records. For
example, suspicious port scanning activity may have the following characteristics: a single source address,
one or more destination addresses, and target port numbers increasing incrementally. When fingerprinting
such traffic, we examine multiple elements and develop a hypothesis for the cause of behaviour on that
basis. We use a Bayesian approach to achieve this.</p>
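      <p>The port scanning characteristics listed above can be illustrated with a minimal sketch. It is not the profiling method itself: the <monospace>Flow</monospace> record, its field names, the <monospace>min_flows</monospace> cut-off and the reading of "increasing incrementally" as "each port exactly one above the previous" are all assumptions made for illustration.</p>
      <preformat preformat-type="code">
```python
from typing import NamedTuple, Sequence


class Flow(NamedTuple):
    """Hypothetical, reduced flow record (field names are assumptions)."""
    src: str
    dst: str
    dst_port: int


def looks_like_port_scan(flows: Sequence[Flow], min_flows: int = 5) -> bool:
    """Return True when time-ordered flows match the listed scan traits."""
    if min_flows > len(flows):
        return False  # too little evidence to judge
    if len({f.src for f in flows}) != 1:
        return False  # more than one source address
    ports = [f.dst_port for f in flows]
    # Every next target port must be exactly one above the previous one.
    return all(b - a == 1 for a, b in zip(ports, ports[1:]))


scan = [Flow("10.0.0.7", "10.0.1.20", port) for port in range(20, 26)]
web = [Flow("10.0.0.8", "10.0.1.20", port) for port in (443, 443, 80, 443, 8080)]
print(looks_like_port_scan(scan), looks_like_port_scan(web))
```
      </preformat>
      <p>A hard rule of this kind only flags a single fingerprint; the approach described here instead treats each matching element as evidence to be weighed probabilistically.</p>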
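      <p>Building the hypothesis</p>
      <p>The posterior probability of the hypothesis <inline-formula><tex-math>H_k</tex-math></inline-formula> given the evidence <inline-formula><tex-math>E</tex-math></inline-formula> is given by the well-known Bayes' formula:
<disp-formula id="eq-1"><label>(1)</label><tex-math>p(H_k \mid E) = \frac{p(E \mid H_k)\, p(H_k)}{p(E)}</tex-math></disp-formula></p>
      <p>Let <inline-formula><tex-math>H_k</tex-math></inline-formula> be the hypothesis that the <inline-formula><tex-math>k</tex-math></inline-formula>th node is an attacker, let <inline-formula><tex-math>E_i</tex-math></inline-formula> be a flow record element, and let <inline-formula><tex-math>E = \{E_1 = e_1, E_2 = e_2, E_3 = e_3, \ldots, E_m = e_m\}</tex-math></inline-formula>
be the set of all suspicious evidence observed against node <inline-formula><tex-math>k</tex-math></inline-formula> during time <inline-formula><tex-math>t</tex-math></inline-formula> from <inline-formula><tex-math>m</tex-math></inline-formula>
different independent observation spaces. Here <inline-formula><tex-math>p(E)</tex-math></inline-formula> is the probability of node <inline-formula><tex-math>k</tex-math></inline-formula> producing suspicious events,
which on its own is difficult to calculate. This can be avoided by using the law of total probability.
For independent observations, the joint posterior probability distribution can be obtained from (1) as:
<disp-formula id="eq-2"><label>(2)</label><tex-math>p(H_k \mid E) = \frac{\prod_j p(e_j \mid H_k)\, p(H_k)}{\sum_i \prod_j p(e_j \mid H_i)\, p(H_i)}</tex-math></disp-formula></p>
      <p>To calculate the posterior probability of node <inline-formula><tex-math>k</tex-math></inline-formula> being an attacker, <inline-formula><tex-math>p(H_k \mid E)</tex-math></inline-formula>, it is necessary to estimate:
1. the likelihood of the event <inline-formula><tex-math>E</tex-math></inline-formula> given the hypothesis <inline-formula><tex-math>H_i</tex-math></inline-formula>, <inline-formula><tex-math>p(E \mid H_i)</tex-math></inline-formula>, and
2. the prior probability <inline-formula><tex-math>p(H_i)</tex-math></inline-formula>, where <inline-formula><tex-math>n \geq i &gt; 0</tex-math></inline-formula>.</p>
      <p>Assuming that the prior and the likelihoods are known, (2) combines evidence from multiple
sources (all <inline-formula><tex-math>E_i</tex-math></inline-formula>) into a single value (the posterior probability) that describes our belief, during a short
observation period, that node <inline-formula><tex-math>k</tex-math></inline-formula> is an attacker given <inline-formula><tex-math>E</tex-math></inline-formula>. Aggregating these short-period estimates over time helps
to accumulate relatively weak evidence over long periods. This accumulated probability term, <inline-formula><tex-math>\sum_t p(H_k \mid E)</tex-math></inline-formula>
(where <inline-formula><tex-math>t</tex-math></inline-formula> is time), known as the profile value hereafter, can be used as a measure of the level of suspicion for
node <inline-formula><tex-math>k</tex-math></inline-formula> at any given time. These scores are converted into Z-scores for analysis.</p>
      <p>A series of experiments has been conducted in a simulated environment to test the proposed approach.
We use NS3 [NS311] to simulate our network and generate traffic patterns of interest, assuming a Poisson
arrival model. Each simulation is run for a reasonable period of time to ensure that enough traffic is
generated (over one million events). We generate only a subset of flow data elements, including source
and destination addresses and port numbers, and judge suspicious activity by unexpected port numbers.
If <inline-formula><tex-math>\lambda_s</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_n</tex-math></inline-formula> are the mean rates of generating suspicious events by suspicious and normal nodes
respectively, we maintain <inline-formula><tex-math>\lambda_s = (\lambda_n \pm 3\sqrt{\lambda_n})</tex-math></inline-formula> and keep <inline-formula><tex-math>\lambda_n\ (\leq 0.1)</tex-math></inline-formula> sufficiently small in all our
experiments, to characterise slow suspicious activities that aim to stay beneath the threshold of
detection and hide behind the background noise.</p>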
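      <p>The evidence fusion, accumulation and Z-score steps described in this section can be sketched as follows. This is only an illustration under stated assumptions: two hypotheses (attacker, not attacker), two observation spaces, and made-up likelihood and prior values; none of these numbers come from the experiments.</p>
      <preformat preformat-type="code">
```python
import math
from statistics import mean, pstdev


def posterior(likelihoods, priors, k):
    """Bayes fusion: likelihoods[i][j] is p(e_j | H_i), priors[i] is p(H_i)."""
    joint = [math.prod(row) * prior for row, prior in zip(likelihoods, priors)]
    return joint[k] / sum(joint)


def z_scores(profiles):
    """Standardise accumulated profile values across all monitored nodes."""
    values = list(profiles.values())
    mu, sigma = mean(values), pstdev(values)
    return {node: (value - mu) / sigma for node, value in profiles.items()}


# Hypothetical likelihood tables for m = 2 independent observation spaces.
# Row 0 is H_0 (node is an attacker), row 1 is H_1 (node is not an attacker).
SUSPICIOUS = [[0.6, 0.5], [0.2, 0.25]]
BENIGN = [[0.4, 0.5], [0.8, 0.75]]
PRIORS = [0.5, 0.5]  # assumed uninformative prior

# Ten short observation windows per node; only n5 shows suspicious evidence.
windows = {f"n{i}": [BENIGN] * 10 for i in range(1, 5)}
windows["n5"] = [BENIGN] * 7 + [SUSPICIOUS] * 3

# Profile value: sum over time of the per-window posterior p(H_0 | E).
profiles = {
    node: sum(posterior(evidence, PRIORS, 0) for evidence in history)
    for node, history in windows.items()
}
scores = z_scores(profiles)
print(max(scores, key=scores.get))
```
      </preformat>
      <p>Each window on its own is weak evidence, but summing the posteriors over time separates the one node with occasional suspicious windows from the rest once the profile values are standardised.</p>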
      <p>Early Results</p>
      <p>Early results of our work are promising: our approach is able to distinguish multiple suspicious nodes
from a given set of network nodes, as shown in Figure 1.</p>
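      <p>A toy version of such an experiment can be sketched in a few lines. It is not the NS3 set-up: it accumulates raw suspicious-event counts in place of posterior-based profile values, draws events from a Poisson model, and assumes a node count, attacker indices, run length and seed chosen purely for illustration. The attacker rate is set to the upper end of the stated band, which is also an assumption.</p>
      <preformat preformat-type="code">
```python
import math
import random
from statistics import mean, pstdev


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, count, product = math.exp(-lam), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(nodes=20, attackers=(3, 11), steps=2000, lam_n=0.1, seed=1):
    """Accumulate suspicious-event counts per node and return their Z-scores."""
    rng = random.Random(seed)
    lam_s = lam_n + 3 * math.sqrt(lam_n)  # assumed: upper end of the stated band
    totals = [
        sum(poisson(rng, lam_s if node in attackers else lam_n) for _ in range(steps))
        for node in range(nodes)
    ]
    mu, sigma = mean(totals), pstdev(totals)
    return [(total - mu) / sigma for total in totals]


scores = simulate()
ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print(ranked[:2])  # indices of the two highest Z-scores
```
      </preformat>
      <p>With rates this far apart the two attacker indices rise to the top of the ranking; moving the attacker rate closer to the normal rate makes longer accumulation necessary before they separate.</p>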
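      <p>We model detection potential <inline-formula><tex-math>D</tex-math></inline-formula> as a function of subnet size <inline-formula><tex-math>S</tex-math></inline-formula> and traffic volume <inline-formula><tex-math>V</tex-math></inline-formula>, where
<inline-formula><tex-math>D = k \cdot (bVS)^{1/2}</tex-math></inline-formula> and <inline-formula><tex-math>k</tex-math></inline-formula> is a constant; this captures the effect of varying the subnet size on our ability
to monitor effectively. This effect is demonstrated in Figure 2. The effects of total traffic volume
on detection potential are demonstrated in Figure 3. Relevant details for these results can be found
in [KSZJ12].</p>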
      <p>[Figure: vertical axis "Z-Score"]</p>
      <p>Our work aims to address the stated research goals by demonstrating how effective monitoring could
be deployed in more realistic network topologies. We plan to continue with our experimental approach,
and to consolidate results towards the end so that a coherent and consistent picture emerges that is of
practical value.</p>
    </sec>
    <sec id="sec-3">
      <title>Contribution</title>
      <p>This research aims to address a difficult problem. Monitoring infrastructures are overloaded both with
data and tools. The question is: what do we do with it? The difficulty is due to the increasing scale of
networks, the diversity of user access provision to systems, the nature of suspicious activity and the
corresponding need to monitor for serious attacks, and ultimately being able to effectively manage detection
of intrusions.</p>
      <p>[Figure: horizontal axes "Subnet size" and "Traffic volume", ticks 100 to 500]</p>
      <p>Our ultimate goal is to offer a set of design principles and heuristics allowing for effective collection and
analysis of data on networks. The first two research questions from Section 2 allow us to build defensible
networks, where any source of suspicious activity can be detected effectively and quickly. This is about
both better data analysis and network design. The third research question is inspired by related work
investigating exposure maps [DOE06] and darkports [WvOK07], where we adapt our algorithm to profile
target nodes for possible slow and suspicious activity. The underlying principle remains the same: we trade
in state for computation. Ever-increasing processing capacity makes this more and more feasible. But traffic
volumes also pose a big challenge, and hence our final question is an attempt to assess the feasibility
of sampling traffic for analysis. Other work [BR12, PRTV10] has also shown this to be feasible,
and we propose to build on it.</p>
      <p>Our aim is to remain domain agnostic. This allows for research to be applied at various levels, including
better detection software, monitoring tools, and network design and configuration management solutions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [BBSP04]
          <string-name><given-names>Phillip G.</given-names> <surname>Bradford</surname></string-name>,
          <string-name><given-names>Marcus</given-names> <surname>Brown</surname></string-name>,
          <string-name><given-names>Bonnie</given-names> <surname>Self</surname></string-name>, and
          <string-name><given-names>Josh</given-names> <surname>Perdue</surname></string-name>.
          <article-title>Towards proactive computer-system forensics</article-title>.
          <source>In International Conference on Information Technology: Coding and Computing, IEEE Computer Society</source>,
          <year>2004</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [BR12]
          <string-name><given-names>Karel</given-names> <surname>Bartos</surname></string-name> and
          <string-name><given-names>Martin</given-names> <surname>Rehak</surname></string-name>.
          <article-title>Towards efficient flow sampling technique for anomaly detection</article-title>.
          <source>In Proceedings of the 4th International Conference on Traffic Monitoring and Analysis, TMA'12</source>,
          pages <fpage>93</fpage>–<lpage>106</lpage>, Berlin, Heidelberg,
          <year>2012</year>. Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [CNS+09]
          <string-name><given-names>Howard</given-names> <surname>Chivers</surname></string-name>,
          <string-name><given-names>Philip</given-names> <surname>Nobles</surname></string-name>,
          <string-name><given-names>Siraj Ahmed</given-names> <surname>Shaikh</surname></string-name>,
          <string-name><given-names>John</given-names> <surname>Clark</surname></string-name>, and
          <string-name><given-names>Hao</given-names> <surname>Chen</surname></string-name>.
          <article-title>Accumulating evidence of insider attacks</article-title>.
          <source>In MIST 2009 (in conjunction with IFIPTM 2009), CEUR Workshop Proceedings</source>,
          <year>2009</year>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [CNS+10]
          <string-name><given-names>Howard</given-names> <surname>Chivers</surname></string-name>,
          <string-name><given-names>Philip</given-names> <surname>Nobles</surname></string-name>,
          <string-name><given-names>Siraj Ahmed</given-names> <surname>Shaikh</surname></string-name>,
          <string-name><given-names>John</given-names> <surname>Clark</surname></string-name>, and
          <string-name><given-names>Hao</given-names> <surname>Chen</surname></string-name>.
          <article-title>Knowing who to watch: Identifying attackers whose actions are hidden within false alarms and background noise</article-title>.
          <source>Information Systems Frontiers</source>, Springer,
          <year>2010</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [DOE06]
          <string-name><given-names>David</given-names> <surname>Whyte</surname></string-name>,
          <string-name><given-names>P. C.</given-names> <surname>van Oorschot</surname></string-name>, and
          <string-name><given-names>Evangelos</given-names> <surname>Kranakis</surname></string-name>.
          <article-title>Exposure maps: removing reliance on attribution during scan detection</article-title>.
          <source>In Proceedings of the 1st USENIX Workshop on Hot Topics in Security</source>,
          pages <fpage>9</fpage>–<lpage>9</lpage>, Berkeley, CA, USA,
          <year>2006</year>. USENIX Association.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [ER01]
          <string-name><given-names>E. E.</given-names> <surname>Schultz</surname></string-name> and
          <string-name><given-names>R.</given-names> <surname>Shumway</surname></string-name>.
          <source>Incident Response: A Strategic Guide for System and Network Security Breaches</source>.
          <publisher-name>New Riders</publisher-name>, <publisher-loc>Indianapolis</publisher-loc>,
          <year>2001</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [KSZJ12]
          <string-name><given-names>Harsha K.</given-names> <surname>Kalutarage</surname></string-name>,
          <string-name><given-names>Siraj A.</given-names> <surname>Shaikh</surname></string-name>,
          <string-name><given-names>Qin</given-names> <surname>Zhou</surname></string-name>, and
          <string-name><given-names>Anne E.</given-names> <surname>James</surname></string-name>.
          <article-title>Sensing for suspicion at scale: A Bayesian approach for cyber conflict attribution and reasoning</article-title>.
          <source>In Proceedings of 4th International Conference on Cyber Conflict, NATO CCD COE. NATO CCD COE Publications</source>,
          Tallinn,
          <year>June 2012</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [MBK11]
          <string-name><given-names>M. H.</given-names> <surname>Bhuyan</surname></string-name>,
          <string-name><given-names>D. K.</given-names> <surname>Bhattacharyya</surname></string-name>, and
          <string-name><given-names>J. K.</given-names> <surname>Kalita</surname></string-name>.
          <article-title>Survey on Incremental Approaches for Network Anomaly Detection</article-title>.
          <source>International Journal of Communication Networks and Information Security (IJCNIS)</source>,
          <volume>3</volume>,
          <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [NS311]
          <collab>NS3 Development Team</collab>.
          <source>NS3 discrete-event network simulator for internet systems</source>,
          <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
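          [PRTV10]
          <string-name><given-names>Antonio</given-names> <surname>Pescapè</surname></string-name>,
          <string-name><given-names>Dario</given-names> <surname>Rossi</surname></string-name>,
          <string-name><given-names>Davide</given-names> <surname>Tammaro</surname></string-name>, and
          <string-name><given-names>Silvio</given-names> <surname>Valenti</surname></string-name>.
          <article-title>On the impact of sampling on traffic monitoring and analysis</article-title>.
          <source>In Proceedings of 22nd International Teletraffic Congress (ITC) 2010</source>,
          pages <fpage>1</fpage>–<lpage>8</lpage>,
          <year>2010</year>.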
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [SCW02]
          <string-name><given-names>William W.</given-names> <surname>Streilein</surname></string-name>,
          <string-name><given-names>Robert K.</given-names> <surname>Cunningham</surname></string-name>, and
          <string-name><given-names>Seth E.</given-names> <surname>Webster</surname></string-name>.
          <article-title>Improved detection of low-profile probe and novel denial-of-service attacks</article-title>.
          <source>In Workshop on Statistical and Machine Learning Techniques in Computer Intrusion Detection</source>,
          <year>2002</year>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [T.H02]
          <string-name><given-names>T.</given-names> <surname>Heberlein</surname></string-name>.
          <article-title>Tactical operations and strategic intelligence: Sensor purpose and placement</article-title>.
          <source>Technical Report TR-2002-04.02</source>,
          <publisher-name>Net Squared Inc</publisher-name>,
          <year>2002</year>.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [WvOK07]
          <string-name><given-names>David</given-names> <surname>Whyte</surname></string-name>,
          <string-name><given-names>Paul C.</given-names> <surname>van Oorschot</surname></string-name>, and
          <string-name><given-names>Evangelos</given-names> <surname>Kranakis</surname></string-name>.
          <article-title>Tracking Darkports for Network Defense</article-title>.
          <source>In Proceedings of the Computer Security Applications Conference, ACSAC 2007</source>,
          pages <fpage>161</fpage>–<lpage>171</lpage>,
          <year>2007</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>