A Target-Centric Ontology for Intrusion Detection
                     John Pinkston, Jeffrey Undercoffer, Anupam Joshi and Timothy Finin
                                   University of Maryland, Baltimore County
                           Department of Computer Science and Electrical Engineering
                                   1000 Hilltop Circle, Baltimore, MD 21250
                                  pinkston, undercoffer, joshi, finin @umbc.edu   


                           Abstract

     We have produced an ontology specifying a model of computer attacks.
     Our ontology is based upon an analysis of over 4,000 classes of
     computer intrusions and their corresponding attack strategies and is
     categorized according to: system component targeted, means of attack,
     consequence of attack and location of attacker. We argue that any
     taxonomic characteristics used to define a computer attack be limited
     in scope to those features that are observable and measurable at the
     target of the attack. We present our model as a target-centric
     ontology that is to be refined and expanded over time. We state the
     benefits of forgoing dependence upon taxonomies, in favor of
     ontologies, for the classification of computer attacks and intrusions.
     We have specified our ontology using DAML+OIL and have prototyped it
     using DAMLJessKB. We illustrate the benefits of utilizing an ontology
     in lieu of a taxonomy by presenting a use case scenario of a
     distributed intrusion detection system.

1   Introduction
Based upon empirical evidence we have produced a model of computer attacks
categorized by: the system component targeted, the means and consequence of
attack, and the location of the attacker. Our model is represented as a
target-centric ontology, where the structural properties of the classification
scheme are expressed in terms of features that are observable and measurable by
the target of the attack or some software system acting on the target’s behalf.
In turn, this ontology is used to facilitate the reasoning process of detecting
and mitigating computer intrusions.
   Traditionally, the characterization and classification of computer attacks
and other intrusive behaviors have been limited to taxonomies. Taxonomies,
however, lack the necessary and essential constructs needed to enable an
intrusion detection system (IDS) to reason over an instance that is
representative of the domain of a computer attack. Alternatively, ontologies
provide powerful constructs that include machine interpretable definitions of
the concepts within a domain and the relations between them. Ontologies,
therefore, provide software systems with the ability to share a common
understanding of the information at issue, in turn empowering the software
system with a greater ability to reason over and analyze this information.
   As detailed by Allen et al. [2] and McHugh [22], the taxonomic
characterization of intrusive behavior has typically been from the attacker’s
point of view, and each suggests that alternative taxonomies need to be
developed. Allen et al. state that intrusion detection is an immature
discipline and has yet to establish a commonly accepted framework. McHugh
suggests classifying attacks according to protocol layer or, as an alternative,
whether or not a completed protocol handshake is required. Likewise, Guha [10]
suggests an analysis of each layer of the TCP/IP protocol stack to serve as the
foundation for an attack taxonomy.
   As an alternative to a taxonomy, we propose a data model implemented with an
ontology representation language such as the Resource Description Framework
Schema (RDFS) [26] or the DARPA Agent Markup Language + Ontology Inference
Layer (DAML+OIL) [1]. We illustrate the benefits of using ontologies by
presenting an implementation of our ontology being utilized by a distributed
intrusion detection system. Accordingly, we have specified our target-centric
ontology in DAML+OIL and have implemented it using DAMLJessKB [17], an
extension to the Java Expert System Shell [7].
   Because IDSs are either adjacent to or co-located with the target of an
attack, it is imperative that any classification scheme used to represent an
attack be target-centric, where each taxonomic character is comprised of
properties and features that are observable by the target of the attack.
Consequently, our ontology only defines properties and attributes that are
observable and measurable by the target of an attack. As a basis for
establishing our a posteriori target-centric attack ontology, we evaluated and
analyzed over 4,000 computer vulnerabilities and the corresponding attack
strategies employed to exploit them.
   The remainder of this paper is organized as follows: Section 2 presents
related work in the form of alternative attack taxonomies as well as related
work in the area of ontologies for intrusion detection. Section 3 presents the
characteristics of a sufficient taxonomy. Section 4 details the motivation for
abandoning taxonomies in favor of ontologies. Our target-centric attack
ontology is presented in Section 5. Section 6 details our implementation and
Section 7 provides an example scenario illustrating the utility of the ontology
within a distributed intrusion detection system. We conclude with Section 8.

2 Related Work
As previously stated, most of the existing research in the area of the
classification of computer attacks is limited to taxonomies.
Because a taxonomy is contained within an ontology we address the research in
the area of defining intrusion taxonomies before we address ontologies.
Accordingly, this section is subdivided, with Subsection 2.1 presenting related
work in the area of taxonomies for intrusion detection and Subsection 2.2
presenting related work in the area of ontologies for intrusion detection.

2.1    Related Work: Taxonomies
There are numerous attack taxonomies proposed for use in intrusion detection
research.
   Landwehr et al. [19] present a taxonomy categorized according to genesis
(how), time of introduction (when) and location (where). They include
sub-categories of: validation errors, boundary condition errors and
serialization errors, which we incorporate into our ontology as the means of an
attack.
   During the 1998 and 1999 DARPA Off Line Intrusion Detection System
Evaluations [12] [21] [15] Weber provided a taxonomy defining the categories of
consequence, to include Denial of Service, Remote to Local and User to Root,
which we incorporate into our work.
   Lindqvist and Jonsson [20] state that they “focus on the external
observations of attacks and breaches which the system owner can make”. Our
effort is consistent with their focus.

2.2    Related Work: Ontologies
There is little, if any, published research formally defining ontologies for
use in Intrusion Detection.
   Raskin et al. [25] introduce and advocate the use of ontologies for
information security. In arguing the case for using ontologies, they state that
an ontology organizes and systematizes all of the phenomena (intrusive
behavior) at any level of detail, consequently reducing a large diversity of
items to a smaller list of properties.
   In commenting on the IETF’s IDMEF, Kemmerer and Vigna [14] state “it is but
a first step, however additional effort is needed to provide a common ontology
that lets IDS sensors agree on what they observe”.

3     Characteristics of a Sufficient Taxonomy
At this point, a clear understanding of the definition, purpose and objective
of a taxonomy is in order. Accordingly, a taxonomy is a classification system
where the classification scheme conforms to a systematic arrangement into
groups or categories according to established criteria [31]. Glass and Vessey
[9] contend that taxonomies provide a set of unifying constructs so that the
area of interest can be systemically described and aspects of relevance may be
interpreted. The overarching goal of any taxonomy, therefore, is to supply some
predictive value during the analysis of an unknown specimen, while the
classifications within the taxonomy offer an explanatory value.
   According to Simpson [27] classifications may be created either a priori or
a posteriori. An a priori classification is created non-empirically whereas an
a posteriori classification is created by empirical evidence derived from some
data set. Simpson defines a taxonomic character as a feature, attribute or
characteristic that is divisible into at least two contrasting states and used
for constructing classifications. He further states that taxonomic characters
should be observable from the object in question.
   Amoroso [3], Lindqvist et al. [20] and Krsul [18] each have identified what
they believe to be the requisite properties of a sufficient and acceptable
taxonomy for computer security. Collectively, they have identified the
following properties as essential to a taxonomy: Mutually Exclusive,
Exhaustive, Unambiguous, Repeatable, Accepted, Useful, Comprehensible,
Conforming, Objective, Deterministic and Specific. Accordingly, as an ontology
subsumes a taxonomy these characteristics form the underpinnings of our work.

4 From Taxonomies to Ontologies: The case for
  ontologies in Intrusion Detection
Ning et al. [23] propose a hierarchical model for attack specification and
event abstraction using three concepts essential to their approach: System
View, Misuse Signature and View Definition. Their model is based upon a
thorough examination of attack characteristics and attributes, and is encoded
within the logic of their proposed system. Consequently, this model is not
readily interchangeable and reusable by other systems.
   The Intrusion Detection Working Group of the Internet Engineering Task Force
(IETF) has proposed the Intrusion Detection Message Exchange Requirements [33]
which, in addition to defining the requirements for the Intrusion Detection
Message Exchange Format, also specifies the architecture of an IDS. The
Intrusion Detection Message Exchange Format Data Model (IDMEF) and accompanying
Extensible Markup Language Document Type Definition [4] is a profound effort to
establish an industry wide data model which defines computer intrusions. The
IDMEF, however, has its shortcomings. Specifically, it uses XML which is
limited to a syntactic representation of the data model and does not convey the
semantics, relationships, attributes and characteristics of the objects which
it represents. This limitation requires that each individual IDS interpret and
implement the data model programmatically.
   According to Davis et al. [5], knowledge representation is a surrogate or
substitute for an object under study. In turn, the surrogate enables an entity,
such as a software system, to reason about the object. Knowledge representation
is also a set of ontological commitments specifying the terms that describe the
essence of the object. In other words, it is meta-data, or data about data,
describing their relationships.
   Frame Based Systems are an important thread in knowledge representation.
According to Koller et al. [16], Frame Based Systems provide an excellent
representation for the organizational structure of complex domains. Frame Based
Languages, which support Frame Based Systems, include RDF, and are used to
represent ontologies. According to Welty et al. [32], an ontology, at its
deepest level, subsumes a taxonomy. Similarly, Noy and McGuinness [24] state
that the process of developing an ontology includes arranging classes in a
taxonomic hierarchy.
   In applying ontologies to the problem of intrusion detection, the power and
utility of the ontology is not realized by the simple representation of the
attributes of the attack. Instead, the power and utility of the ontology is
realized by the fact that we can express the relationships between collected
data and use those relationships to deduce that the particular data represents
an attack of a particular type. Moreover, specifying an ontological
representation decouples the data model defining an intrusion from the logic of
the intrusion detection system. The decoupling of the data model from the IDS
logic enables non-homogeneous IDSs to share data without a prior agreement as
to the semantics of the data. To effect this sharing, an instance of the
ontology is shared between IDSs in the form of a set of DAML+OIL (or RDF)
statements. If the recipient does not understand some aspect of the data, it
obtains the ontology in order to interpret and use the data as intended by its
originator.
   Ontologies therefore, unlike taxonomies, provide powerful constructs that
include machine interpretable definitions of the concepts within a specific
domain and the relations between them. In our case the domain is that of a
particular computer or a software system acting on the computer’s behalf in
order to detect attacks and intrusions. Ontologies may be utilized not only to
provide an IDS with the ability to share a common understanding of the
information at issue but also to give the IDS an improved capacity to reason
over and analyze instances of data representing an intrusion. Moreover, within
an ontology, characteristics such as cardinality, range and exclusion may be
specified and the notion of inheritance is supported.

5     Target-Centric Ontology Attributes of the
      Class Intrusion
In constructing our ontology, we relied upon an empirical analysis [30] of the
features and attributes, and their interrelationships, of over 4,000 classes of
computer attacks and intrusions. Figure 1 presents a high level view of our
ontology. The attributes of each class and subclass (denoted by ellipses) are
not shown because it would make the illustration unwieldy.
   At the topmost level we define the class Host. Host has the properties
Current State which is defined by the class System Component and Victim of
which is defined by the class Attack. As defined in Section 4 the property,
also called the predicate, defines the relationship between a subject and an
object.
   The System Component class is comprised of the following subclasses:

    1. Network. This class is inclusive of the network layers of the protocol
       stack. We have focused on TCP/IP; therefore, we only consider IP, TCP,
       and UDP subclasses. For example, and as will be later demonstrated, the
       TCP subclass includes the properties TCP MAX which defines the maximum
       number of TCP connections, WAIT STATE defining the number of connections
       waiting on the final ack of the three-way handshake to establish a TCP
       connection, THRESHOLD specifying the allowable ratio between maximum
       connections and partially established connections and EXCEED T a boolean
       value indicating that the allowable ratio has been exceeded. It should
       be noted that these are only four of several network properties.
    2. System. This includes attributes representing the operating system of
       the host. It includes attributes representing overall memory usage
       (MEM TOTAL, MEM FREE, MEM SWAP) and CPU usage (LOAD AVG). The class also
       contains attributes reflective of the number of current users, disk
       usage, the number of installed kernel modules, and change in state of
       the interrupt descriptor and system call tables.
    3. Process. This class contains attributes representing particular
       processes that are to be monitored. These attributes include the current
       value of the instruction pointer (INS P), the current top of the stack
       (T STACK), a scalar value computed from the stream of system calls
       (CALL V), and the number of child processes (N CHILD).

   The class Attack has the properties Directed to, Effected by, and Resulting
in. This construction is predicated upon the notion that an attack consists of
some input which is directed to some system component and results in some
consequence. Accordingly, the classes System Component, Input, and Consequence
are the corresponding objects. The class Consequence is comprised of several
subclasses which include:

    1. Denial of Service. The attack results in a Denial of Service to the
       users of the system. The denial of service may be because the system was
       placed into an unstable state or all of the system resources may be
       consumed by meaningless functions.
    2. User Access. The attack results in the attacker having access to
       services on the target system at an unprivileged level.
    3. Root Access. The attack results in the attacker being granted privileged
       access to the system, consequently having complete control of the
       system.
    4. Probe. This type of attack is the result of scanning or other activity
       wherein a profile of the system is disclosed.

   Finally, the class Input has the attributes Received from and Causing where
Causing defines the relationship between the Means of attack and some input. We
define the following subclasses for Means of attack:

    1. Input Validation Error. An input validation error exists if some
       malformed input is received by a hardware or software component and is
       not properly bounded or checked. This class is further sub-classed as:
       (a) Buffer Overflow. The classic buffer overflow results from an
           overflow of a static-sized data structure.
       (b) Boundary Condition Error. A process attempts to read or write beyond
           a valid address boundary or a system resource is exhausted.
       (c) Malformed Input. A process accepts syntactically incorrect input,
           extraneous input fields, or the process lacks the ability to handle
           field-value correlation errors.
    2. Logic Exploits. Logic exploits are exploited software and hardware
       vulnerabilities such as race conditions or undefined states that lead to
       performance degradation and/or system compromise. Logic exploits are
       further subclassed as follows:
       (a) Exception Condition. An error resulting from the failure to handle
           an exception condition generated by a functional module or device.
       (b) Race Condition. An error occurring during a timing window between
           two operations.
       (c) Serialization Error. An error that results from the improper
           serialization of operations.
       (d) Atomicity Error. An error occurring when a partially-modified data
           structure is used by another process, or when some process
           terminates with partially modified data where the modification
           should have been atomic.

6 Implementation
There are several reasoning systems that are compatible with DAML+OIL.
According to their functionality, reasoning systems can be classified into two
types, backward-chaining

[Figure 1: Target Centric Ontology. Diagram content: Host has the properties
Current State (System Component) and Victim of (Attack). Attack is Directed to
System Component, Effected by Input, and Resulting in Consequence. System
Component subclasses: Network (IP, TCP, UDP), System, Process. Consequence
subclasses: Denial of Service, Remote to Local, User to Root, Probe. Input is
Received from Location and Causing Means. Means subclasses: Input Validation
Error (Buffer Overflow, Boundary Condition, Malformed Input) and Logic Exploit
(Exception Condition, Race Condition, Atomicity Error, Serialization Error).
Location subclasses: Local, Remote (TCP/IP: TCP Socket, UDP Socket).]
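
The class structure in Figure 1 can be read as a set of subject, predicate,
object statements. The following is a minimal Python sketch of that reading,
not the DAML+OIL/DAMLJessKB implementation described in this paper. The class
spellings (for example SystemComponent), the property spellings other than
Directed_To, Effected_By and Resulting_In, and the helper functions are
hypothetical names chosen for illustration. It shows the kind of subsumption
a taxonomy embedded in an ontology supports: a Buffer Overflow is an Input
Validation Error, which in turn is a Means of attack.

```python
SUBCLASS_OF = "subClassOf"

# (subject, predicate, object) statements read off Figure 1.
# Class and property spellings are illustrative assumptions.
TRIPLES = {
    # System Component hierarchy
    ("Network", SUBCLASS_OF, "SystemComponent"),
    ("System", SUBCLASS_OF, "SystemComponent"),
    ("Process", SUBCLASS_OF, "SystemComponent"),
    ("IP", SUBCLASS_OF, "Network"),
    ("TCP", SUBCLASS_OF, "Network"),
    ("UDP", SUBCLASS_OF, "Network"),
    # Consequence hierarchy (names as they appear in Figure 1)
    ("DenialOfService", SUBCLASS_OF, "Consequence"),
    ("RemoteToLocal", SUBCLASS_OF, "Consequence"),
    ("UserToRoot", SUBCLASS_OF, "Consequence"),
    ("Probe", SUBCLASS_OF, "Consequence"),
    # Means hierarchy
    ("InputValidationError", SUBCLASS_OF, "Means"),
    ("LogicExploit", SUBCLASS_OF, "Means"),
    ("BufferOverflow", SUBCLASS_OF, "InputValidationError"),
    ("BoundaryCondition", SUBCLASS_OF, "InputValidationError"),
    ("MalformedInput", SUBCLASS_OF, "InputValidationError"),
    ("ExceptionCondition", SUBCLASS_OF, "LogicExploit"),
    ("RaceCondition", SUBCLASS_OF, "LogicExploit"),
    ("AtomicityError", SUBCLASS_OF, "LogicExploit"),
    ("SerializationError", SUBCLASS_OF, "LogicExploit"),
    # Location hierarchy (top level only)
    ("Local", SUBCLASS_OF, "Location"),
    ("Remote", SUBCLASS_OF, "Location"),
    # Properties, written as (domain, property, range)
    ("Host", "Current_State", "SystemComponent"),
    ("Host", "Victim_Of", "Attack"),
    ("Attack", "Directed_To", "SystemComponent"),
    ("Attack", "Effected_By", "Input"),
    ("Attack", "Resulting_In", "Consequence"),
    ("Input", "Received_From", "Location"),
    ("Input", "Causing", "Means"),
}


def superclasses(cls, triples=TRIPLES):
    """Return every ancestor of cls by following subClassOf edges transitively."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subj, pred, obj in triples:
            if subj == current and pred == SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def is_a(cls, ancestor, triples=TRIPLES):
    """True if cls equals ancestor or is a direct or indirect subclass of it."""
    return cls == ancestor or ancestor in superclasses(cls, triples)


def property_range(domain, prop, triples=TRIPLES):
    """Return the range class of prop on domain, or None if it is not declared."""
    for subj, pred, obj in triples:
        if subj == domain and pred == prop:
            return obj
    return None


if __name__ == "__main__":
    print(is_a("BufferOverflow", "Means"))          # True
    print(is_a("TCP", "Means"))                     # False
    print(property_range("Attack", "Resulting_In")) # Consequence
```

A flat set of triples is used here only because it mirrors how RDF-style
statements are exchanged; a real reasoner would also handle cardinality, range
checking and exclusion, which this sketch does not attempt.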


and forward-chaining. Backward-chaining reasoners process queries and return
proofs for the answers they provide. Forward-chaining reasoners process
assertions substantiated by proofs, and draw conclusions. Available reasoning
systems include: Stanford’s Java Theorem Prover [6], Drexel’s DAMLJessKB [17]
and the Renamed ABox and Concept Expression Reasoner [11].

<rdfs:Class rdf:about="&IntrOnt;Attack"
    rdfs:label="Consequence">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>

<rdf:Property rdf:about="&IntrOnt;Directed_To"
    rdfs:label="Directed_To">
  <rdfs:domain rdf:resource="&IntrOnt;Attack"/>
  <rdfs:range rdf:resource="&IntrOnt;SysComp"/>
</rdf:Property>

<rdf:Property rdf:about="&IntrOnt;Resulting_In"
    rdfs:label="Resulting_In">
  <rdfs:domain rdf:resource="&IntrOnt;Attack"/>
  <rdfs:range rdf:resource="&IntrOnt;Conseq"/>
</rdf:Property>

<rdf:Property rdf:about="&IntrOnt;Effected_By"
    rdfs:label="Effected_By">
  <rdfs:domain rdf:resource="&IntrOnt;Attack"/>
  <rdfs:range rdf:resource="&IntrOnt;Input"/>
</rdf:Property>

Figure 2: DAML+OIL Statements Defining the Class Attack and its Properties:
Directed_To, Resulting_In and Effected_By

   We have prototyped the logic portion of our system using the DAMLJessKB [17]
reasoning system, an extension to the Java Expert System Shell (JESS) [7]. JESS
is a Java implementation of the C Language Integrated Production System
(CLIPS) [8]. DAMLJessKB is employed to reason over instances of our data model
that are considered to be suspicious. These suspicious instances are
constrained according to our target-centric ontology and asserted into the
knowledge base.
   Upon initialization of DAMLJessKB we convert the DAML+OIL statements
representing the ontology into N-Triples and assert them into a knowledge base
as rules. The assertions are of the form:

(assert
  (PropertyValue (predicate) (subject) (object)))
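As an illustration of the assertion form above, the following minimal Python sketch renders one N-Triples statement as a PropertyValue assertion string. It is not the conversion code used by DAMLJessKB: the function name to_assertion and the regular expression are assumptions made for this example, and only statements whose three terms are all IRIs are handled (no literals or blank nodes).

```python
import re

# Matches one simple N-Triples statement: <subject> <predicate> <object> .
# Assumption: all three terms are IRIs; literals and blank nodes are not handled.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$')


def to_assertion(ntriple_line):
    """Render one N-Triples line as a PropertyValue assertion string."""
    match = TRIPLE.match(ntriple_line.strip())
    if match is None:
        raise ValueError("unsupported N-Triples line: " + ntriple_line)
    subject, predicate, obj = match.groups()
    # The assertion lists the predicate first, then the subject, then the object.
    return "(assert (PropertyValue {} {} {}))".format(predicate, subject, obj)


# Example statement: DoS is a subclass of Conseq (namespace taken from the
# rules shown in Section 6.1).
line = ('<http://security.umbc.edu/IntrOnt#DoS> '
        '<http://www.w3.org/2000/01/rdf-schema#subClassOf> '
        '<http://security.umbc.edu/IntrOnt#Conseq> .')
print(to_assertion(line))
```

Each triple of the ontology would yield one such assertion, which is why the predicate, subject and object can later be matched independently by a rule.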
Once asserted, DAMLJessKB generates additional rules which include all of the
chains of implication derived from the ontology.
   The following series of figures illustrates the DAML+OIL encoding of
selected classes, subclasses and their respective properties in our ontology.
   Figure 2 lists the DAML+OIL statements defining the class Attack and its
properties Directed_To, Resulting_In and Effected_By. These properties
correspond to the edges between the node labeled Target and the nodes labeled
System Component, Input and Consequence respectively, in Figure 1.
   Figure 3 presents the DAML+OIL notation for the class System Component, its
subclass Network, and Network’s subclass TCP. Figure 4 lists the DAML+OIL
notation for some of the attributes of the class TCP.
   Figure 5 details the specification of the class Consequence while Figures 6
and 7 show similar details for the specification of the classes Denial of
Service and Syn Flood. The Syn Flood class, which is not shown in Figure 1
illustrating our ontology, is a subclass of both Denial of Service and TCP and,
as stated in the DAML+OIL notation, will only be instantiated when the
threshold of pending TCP connections is exceeded.

<daml:Class rdf:about="&IntrOnt;SysComp"
    rdfs:label="State">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</daml:Class>

<daml:Class rdf:about="&IntrOnt;Network"
    rdfs:label="Network">
  <rdfs:subClassOf rdf:resource="&IntrOnt;SysComp"/>
</daml:Class>

<daml:Class rdf:about="&IntrOnt;TCP"
    rdfs:label="Network">
  <rdfs:subClassOf rdf:resource="&IntrOnt;Network"/>
</daml:Class>

Figure 3: DAML+OIL Statements Specifying the Class System Component and its
Subclasses Network and TCP

<rdf:Property rdf:about="&IntrOnt;TCP_Max"
    rdfs:label="TCP_Max">
  <rdfs:domain rdf:resource="&IntrOnt;Network"/>
  <rdfs:range rdf:resource="&rdfs;nonNegativeInteger"/>
</rdf:Property>

<rdf:Property rdf:about="&IntrOnt;Wait_State"
    rdfs:label="Wait_State">
  <rdfs:domain rdf:resource="&IntrOnt;Network"/>
  <rdfs:range rdf:resource="&rdfs;nonNegativeInteger"/>
</rdf:Property>

<rdf:Property rdf:about="&IntrOnt;Threshold"
    rdfs:label="Threshold">
  <rdfs:domain rdf:resource="&IntrOnt;Network"/>
  <rdfs:range rdf:resource="&rdfs;nonNegativeInteger"/>
</rdf:Property>

<rdf:Property rdf:about="&IntrOnt;Exceed_T"
    rdfs:label="Exceed_T">
  <rdfs:domain rdf:resource="&IntrOnt;Network"/>
  <rdfs:range rdf:resource="&IntrOnt;BooleanValue"/>
</rdf:Property>

Figure 4: DAML+OIL Notation Specifying Attributes of the TCP Subclass

<rdfs:Class rdf:about="&IntrOnt;Conseq"
    rdfs:label="Conseq">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>

Figure 5: DAML+OIL Specification of the Class Consequence

<rdfs:Class rdf:about="&IntrOnt;DoS"
    rdfs:label="DoS">
  <rdfs:subClassOf rdf:resource="&IntrOnt;Conseq"/>
</rdfs:Class>

Figure 6: DAML+OIL Statements Specifying the Denial of Service Subclass

<daml:Class rdf:about="&IntrOnt;Syn_Flood"
    rdfs:label="Syn_Flood">
  <rdfs:subClassOf rdf:resource="&IntrOnt;DoS"/>
  <rdfs:subClassOf rdf:resource="&IntrOnt;TCP">
    <daml:Restriction>
      <daml:onProperty rdf:resource="&IntrOnt;Exceed_T"/>
      <daml:hasValue rdf:resource="#true"/>
    </daml:Restriction>
  </rdfs:subClassOf>
</daml:Class>

Figure 7: DAML+OIL Statements Specifying the SynFlood Subclass

6.1   Querying the Knowledge Base
Once the ontology is asserted into the knowledge base and all of the derived
rules resulting from the chains of implication are generated, the knowledge
base is ready to receive instances of the ontology. Instances are asserted into
and de-asserted from the knowledge base as temporal events dictate. A query for
the existence of an attack or intrusion can be so granular that it requests an
attack of a specific type, such as a Syn Flood:

; Fires for every subject whose type is the Syn_Flood class.
(defrule isSynFlood
  (PropertyValue
    (p http://www.w3.org/1999/02/22-rdf-syntax-ns#type)
    (s ?var)
    (o http://security.umbc.edu/IntrOnt#Syn_Flood))
  =>
  (printout t "A SynFlood attack has occurred." crlf
              "with event number: " ?var))

The query could be of a medium level of granularity, asking for all attacks of
a specific class, such as denial of service. Accordingly, the following query
will return all instances of an attack of the class Denial of Service.

; Fires for every subject whose type is the DoS class.
(defrule isDOS
  (PropertyValue
    (p http://www.w3.org/1999/02/22-rdf-syntax-ns#type)
    (s ?var)
    (o http://security.umbc.edu/IntrOnt#DoS))
  =>
  (printout t "A DoS attack has occurred." crlf
              "with ID number: " ?var))

Finally, the following rule will return instances of any attack, where the
event numbers that are returned by the query need to be iterated over in order
to discern the specific type of attack:

; Fires for every subject whose type is the Conseq class.
(defrule isConseq
  (PropertyValue
    (p http://www.w3.org/1999/02/22-rdf-syntax-ns#type)
    (s ?var)
    (o http://security.umbc.edu/IntrOnt#Conseq))
  =>
  (printout t "An attack has occurred." crlf
              "with ID number: " ?var))

   These varying levels of granularity are possible because of DAML+OIL’s
notion of classes, subclasses, and the relationships that hold between them.
The variable ?var, contained in each of the queries, is instantiated with the
subject whenever a predicate and object from a matching triple are located in
the knowledge base.

7 Using the Ontology to Detect a Distributed Attack
The following example of a distributed attack illustrates the utility of our
ontology.
   The Mitnick attack is multi-phased, consisting of a Denial of Service
attack, TCP sequence number prediction and IP spoofing. When this attack first
occurred a Syn Flood was used to effect the denial of service; however, any
denial of service attack would have sufficed.
   In the following example, which is illustrated in Figure 8, Host B is the
ultimate target and Host A is trusted by Host B.
   The attack is structured as follows:
 1. The attacker initiates a Syn Flood attack against Host A to prevent Host A
    from responding to Host B.
 2. The attacker sends multiple TCP packets to the target, Host B, in order to
    be able to predict the values of TCP sequence numbers generated by Host B.
 3. The attacker then pretends to be Host A, by spoofing Host A’s IP address,
    and sends a Syn packet to Host B in order to establish a TCP session
    between Host A and Host B.
 4. Host B responds with a Syn/Ack to Host A. The attacker does not see this
    packet. Host A, since its input queue is full due to the number of
    half-open connections caused by the Syn Flood attack, cannot send a RST
    message to Host B in response to the spurious Syn message.
 5. Using the calculated TCP sequence number of Host B (recall that the
    attacker did not see the Syn/Ack message sent from Host B to Host A) the
    attacker sends an Ack packet with the predicted TCP sequence number in
    response to the Syn/Ack packet sent to Host A.
 6. Host B is now in a state where it believes that a TCP session has been
    established with a trusted host, Host A. The attacker now has a one-way
    session with the target, Host B, and can issue commands to the target.

[Figure 8 diagram: Host A, Host B (which trusts Host A) and the Attacker, with arrows labeled Step 1 through Step 5 corresponding to the steps listed above.]

Figure 8: Illustration of the Mitnick Attack

   It should be noted that an intrusion detection system running exclusively
at either host will not detect this multi-phased and distributed attack. At
best, Host A’s IDS would see a relatively short-lived Syn Flood attack, and
Host B’s IDS might observe an attempt to infer TCP sequence numbers, although
this may not stand out from other non-intrusive but ill-formed TCP connection
attempts.
   The following explains the utility of our ontology, as well as the
importance of forming coalitions of IDSs. In our IDS model, we form coalitions
of IDS services each of which is responsible for specific parts of an
enterprise or domain. For example, one IDS service may be responsible for a
specific host, another for a group of hosts, and yet another for monitoring
network traffic. The IDS’s all share a common ontology and utilize a secure
communications infrastructure that has been optimized for IDS’s. We present
such an infrastructure in [13, 28, 29].
   Consider the instance of the Syn Flood attack presented in Section 6, and
suppose that it was directed against Host A in our example scenario. As the IDS
responsible for Host A is continually monitoring for anomalous behavior,
asserting and de-asserting data as necessary, it will detect the occurrence of
an inordinate number of partially established TCP connections, and transmit the
instance of the Syn Flood to the other IDS’s in its coalition.
   That instance is repeated below:

<IntrOnt:Intrusion rdf:about="&IntrOnt;00035"
    IntrOnt:IP_Address="130.85.112.231">
  <IntrOnt:resulting_in rdf:resource="&IntrOnt;00038"/>
</IntrOnt:Intrusion>

<IntrOnt:Syn_Flood rdf:about="&IntrOnt;00038"
    IntrOnt:Exceed_T="true"
    IntrOnt:int_time="20021212 154312"/>

   This instance is converted into a set of N-Triples and asserted into the
knowledge base of each IDS in the coalition. Those same N-Triples will be
de-asserted when the responsible IDS transmits a message stating that the
particular host is no longer the victim of a Syn Flood attack. This situation,
especially in conjunction with Host B being subjected to a series of probes
meant to determine its TCP sequencing, could be the prelude to a distributed
attack, so the current connections and pending connections are also asserted
into the knowledge base.
   The following is a set of DAML+OIL statements describing connections:

<IntrOnt:Connection rdf:about="&IntrOnt;00038"
    IntrOnt:IP_Address="130.85.112.231"
    IntrOnt:conn_time="20021212 154417"/>

<IntrOnt:Connection rdf:about="&IntrOnt;00101"
    IntrOnt:IP_Address="202.85.191.121"
    IntrOnt:conn_time="20021212 151221"/>

<IntrOnt:Connection rdf:about="&IntrOnt;00102"
    IntrOnt:IP_Address="68.54.101.78"
    IntrOnt:conn_time="20021212 150152"/>

   In order to detect a Mitnick-type attack, we include the following DAML+OIL
statements that partially specify an ontology of the Mitnick attack (the class
is identified as P_Mitnick for partial):

<daml:Class rdf:about="&Intrusion;P_Mitnick"
    rdfs:label="P_Mitnick">
  <daml:intersectionOf rdf:parseType="daml:collection">
    <daml:Class rdf:about="&IntrOnt;DoS"/>
    <daml:Class rdf:about="&IntrOnt;Connection"/>
  </daml:intersectionOf>
</daml:Class>
The ontology is partial because the Mitnick attack has the ad-         to severe vulnerabilities, root access is the most common con-
ditional property that the connection time with the victim must        sequence of an exploit whereas the ICAT data shows denial of
be greater than or equal to the time of the denial of service at-      service to be the most common consequence.
tack. An instance of this ontology will be instantiated provided          Our analysis was conducted in order to identify the observ-
that there exists an instance of a denial of service attack that has   able and measurable properties of computer attacks and intru-
the same unique identifier as that of an established connection.       sions. Accordingly, we have developed a target-centric on-
In fact there will be an instance created in each case where this      tology characterized by System Component, Means of Attack,
condition holds. In our prototype, we check each instance to           Consequences of Attack and Location of Attacker. We have
determine if the time of the connection is greater than or equal       stated the case for replacing simple taxonomies with ontolo-
to the time of the attack.                                             gies for use in IDS’s and have presented an initial ontology
   The following rules are used to check each instance:                specifying the class Intrusion. Our ontology is available at:
(defrule isMitnick
                                                                       http://security.cs.umbc.edu/Intrusion.
                                                                          We have prototyped our ontology using the DAMLJessKB,
(PropertyValue                                                         which has some limitations. We intend to either modify
(p http://security.umbc.edu/IntrOnt#P_Mitnick)
(s ?eventNumber) (o "true"))                                           DAMLJessKB in order to make it a full and complete reasoner
                                                                       or use Stanford’s Java Theorem Prover [6] or Renamed ABox
(PropertyValue                                                         and Concept Expression Reasoner [11].
(p http://security.umbc.edu/IntrOnt#int_time)
(s ?eventNumber) (o ?Int_Time))

(PropertyValue
                                                                       References
(p http://security.umbc.edu/IntrOnt#conn_time)                         [1] DARPA Agent Markup Language+Ontology Interface
(s ?eventNumber) (o ?Conn_Time))
                                                                            Layer.      http://www.daml.org/2001/03/daml+oil-index,
=>                                                                          2001.
(if (>= ?Conn_Time ?Int_Time) then
(printout t "event number: "                                           [2] Julia Allen, Alan Christie, William Fithen, John
?eventNumber " is a Mitnick Attack" crlf)))                                 McHugh, Jed Pickel, and Ed Stoner. State of the Practice
                                                                            of Intrusion Detection Technologies. Technical Report
This rule will fire and event number 00038, the instance of the
                                                                            99tr028, Carnegie Mellon - Software Engineering Insti-
intersection of the connection and the denial of service attack,
                                                                            tute, 2000.
will be displayed.
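
The check performed by the isMitnick rule can be sketched outside Jess as follows. This is a minimal Python illustration, not the prototype's code: the function find_partial_mitnick and the dictionary layout are assumptions made for this example, the time values are copied from the instances shown in Section 7, and the timestamps are assumed to follow a YYYYMMDD HHMMSS layout.

```python
from datetime import datetime

# Assumed layout of the int_time and conn_time values, e.g. "20021212 154312".
TIME_FORMAT = "%Y%m%d %H%M%S"


def parse_time(stamp):
    """Parse one timestamp string into a datetime object."""
    return datetime.strptime(stamp, TIME_FORMAT)


def find_partial_mitnick(dos_events, connections):
    """Return the identifiers that are both a DoS event and a connection,
    and whose connection time is not earlier than the DoS time."""
    hits = []
    for event_id, int_time in dos_events.items():
        conn_time = connections.get(event_id)
        if conn_time is None:
            continue  # not in the intersection of DoS and Connection
        if parse_time(conn_time) >= parse_time(int_time):
            hits.append(event_id)
    return sorted(hits)


# Identifier -> time, copied from the instances shown in Section 7.
dos_events = {"00038": "20021212 154312"}
connections = {
    "00038": "20021212 154417",
    "00101": "20021212 151221",
    "00102": "20021212 150152",
}
print(find_partial_mitnick(dos_events, connections))
```

With these values only identifier 00038 appears in both sets, and its connection time (15:44:17) follows the Syn Flood time (15:43:12), so it is the single instance reported.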
   At this point it is important to review the sequence of events      [3] Edward G. Amoroso. Fundamentals of Computer Secu-
leading up to the discovery of the Mitnick attack. Recall that              rity Technology. Prentice-Hall PTR, 1994.
the IDS responsible for the victim of the Syn Flood attack             [4] D. Curry and H. Debar. Intrusion detection message ex-
queried its knowledge base for an instance of a denial of                   change format data model and extensible markup lan-
service (DoS) attack. The query returned an instance of a Syn Flood         guage (xml)document type definition. draft-ietf-idwg-
which was instantiated solely on the condition that the Exceed_T            idmef-xml-07.txt, January 2003. expires July 31, 2003.
property of the Network class was true.                                [5] Randall Davis, Howard Shrobe, and Peter Szolovits.
   The instance (its properties) of the Syn Flood attack was
                                                                            What is knowledge representation?          AI Magazine,
transmitted in the form of a set of DAML+OIL statements to
                                                                            14(1):17 – 33, 1993.
the other IDS’s in the coalition. In turn, these IDS’s converted
the DAML+OIL statements to a set of N-Triples and asserted             [6] Gleb Frank, Jessica Jenkins, and Richard Fikes.
them into their respective knowledge bases. As a Syn Flood                  Jtp: An object oriented modular reasoning system.
is a precursor to a more insidious attack, instances of estab-              http://kst.stanford.edu/software/jtp.
lished and pending connections were asserted into the knowl-           [7] Ernest J. Friedman-Hill. Jess, the java expert sys-
edge base. As the state of the knowledge base is dynamic due                tem shell. http://herzberg.ca.sandia.gov/jess/docs/52/,
to the assertions and de-assertions, the rule set of each IDS is            November 1977.
continually applied to the knowledge base.                             [8] Joseph Giarratano and Gary Riley. Expert Systems Prin-
   The ontology specifying the Mitnick class states that it is
                                                                            ciples and Programming. PWS Publishing Company,
the intersection of both the DoS and Connection classes. Be-
                                                                            third edition, 1998.
cause each IDS instantiates an instance when this constraints
imposed by intersection is true, we need to examine each in-           [9] Robert L. Glass and Iris Vessey.            Contemporary
stance to ensure that     .        application-domain taxonomies. IEEE Software, pages
                                                                            63 – 76, July 1995.
8   Conclusion and Future Work                                         [10] Biswaroop Guha and Biswanath Mukherjee. Network Se-
                                                                            curity via Reverse Engineering of TCP Code: Vulnerabil-
We have analyzed vulnerability and intrusion data derived from              ity Analysis and Proposed Solutions. In IEEE Networks,
CERT advisories and NIST’s ICAT meta-base resulting in the                  pages 40 – 48. IEEE, July/August 1997.
identification of the components (network, kernel, application
and other) most frequently attacked. We have also identified           [11] Volker Haarslev and Ralf Moller.                RACER:
the most common means and consequences of the attack as                     Renamed         ABox      and     Concept     Expression
well as the location of the attacker. Our analysis shows that               Reasoner.             http://www.cs.concordia.ca/   fac-
non-kernel space (non operating system) applications, running               ulty/haarslev/racer/index.html, June 2001.
as either root or user, are the most frequently attacked and are       [12] Joshua W. Haines, Lee M. Rossey, Richard P. Lippman,
attacked remotely. The most common means of attack are ex-                  and Robert K. Cunningham. Extending the darpa off-line
ploits. According to the CERT advisories issued in response                 intrusion detection evaluations. In DARPA Information
     Survivability Conference and Exposition II, volume 1,
     pages 77 – 88. IEEE, 2001.
[13] Lalana Kagal, Jeffrey Undercoffer, Anupam Joshi, and
     Tim Finin. Vigil: Enforcing Security in Ubiquitous
     Environments. In Grace Hopper Celebration of Women in
     Computing 2002, 2002.
[14] Richard A. Kemmerer and Giovanni Vigna. Intrusion
     detection: A brief history and overview. Security and
     Privacy, a Supplement to IEEE Computer Magazine, pages
     27 – 30, April 2002.
[15] Kristopher Kendall. A database of computer attacks for
     the evaluation of intrusion detection systems. Master’s
     thesis, MIT, 1999.
[16] Daphne Koller and Avi Pfeffer. Probabilistic Frame-Based
     Systems. In Proceedings of the Fifteenth National
     Conference on Artificial Intelligence, pages 580 – 587,
     Madison, Wisconsin, July 1998. AAAI.
[17] Joe Kopena. DAMLJessKB.
     http://edge.mcs.drexel.edu/assemblies/software/damljesskb/ articles/DAMLJessKB-2002.pdf,
     October 2002.
[18] Ivan Krsul. Software Vulnerability Analysis. PhD thesis,
     Purdue, 1998.
[19] Carl E. Landwehr, Alan R. Bull, John P. McDermott, and
     William S. Choi. A taxonomy of computer program security
     flaws. ACM Computing Surveys, 26(3):211 – 254,
     September 1994.
[20] Ulf Lindqvist and Erland Jonsson. How to systematically
     classify computer security intrusions. In Proceedings of
     the 1997 IEEE Symposium on Security and Privacy,
     pages 154 – 163. IEEE, May 1997.
[21] Richard Lippmann, David Fried, Isaac Graf, Joshua
     Haines, Kristopher Kendall, David McClung, Dan Weber,
     Seth Webster, Dan Wyschogrod, Robert Cunningham,
     and Marc Zissman. Evaluating intrusion detection systems:
     The 1998 DARPA off-line intrusion detection evaluation.
     In Proceedings of the DARPA Information Survivability
     Conference and Exposition, 2000, pages 12 – 26,
     January 2000.
[22] John McHugh. Testing Intrusion Detection Systems: A
     Critique of the 1998 and 1999 DARPA Intrusion Detection
     System Evaluations as Performed by Lincoln Laboratory.
     ACM Transactions on Information and System Security,
     November 2000.
[23] Peng Ning, Sushil Jajodia, and Xiaoyang Sean Wang.
     Abstraction-based intrusion detection in distributed
     environments. ACM Transactions on Information and System
     Security, 4(4):407 – 452, November 2001.
[24] Natalya F. Noy and Deborah L. McGuinness. Ontology
     development 101: A guide to creating your first ontology.
     Stanford University.
[25] Victor Raskin, Christian F. Hempelmann, Katrina E.
     Triezenberg, and Sergei Nirenburg. Ontology in information
     security: A useful theoretical foundation and
     methodological tool. In Proceedings of NSPW-2001,
     pages 53 – 59. ACM, September 2001.
[26] RDF. Resource description framework (RDF) schema
     specification, 1999.
[27] George Gaylord Simpson. Principles of Animal Taxonomy.
     Columbia University Press, 1961.
[28] Jeffrey Undercoffer, Filip Perich, Andrej Cedilnik,
     Lalana Kagal, and Anupam Joshi. A Secure Infrastructure
     for Service Discovery and Access in Pervasive Computing.
     Mobile Networks and Applications: Special Issue on
     Security, (2):113 – 126, 2003.
[29] Jeffrey Undercoffer, Filip Perich, and Charles Nicholas.
     Shomar: An architecture for distributed intrusion
     detection services. University of Maryland Baltimore
     County, Department of Computer Science and Electrical
     Engineering, 2002.
[30] Jeffrey Undercoffer and John Pinkston. An empirical
     analysis of computer attacks and intrusions. Technical
     Report TR-CS-03-11, University of Maryland, Baltimore
     County, 2002.
[31] WEBSTERS, inc, editor. Merriam-Webster’s Collegiate
     Dictionary. Merriam-Webster, Inc., tenth edition, 1993.
[32] Chris Welty. Towards a semantics for the web. Vassar
     College, 2000.
[33] M. Wood and M. Erlinger. Intrusion detection message
     exchange requirements. draft-ietf-idwg-requirements-08,
     August 2002.