<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Toward an Ontology Architecture for Cyber-Security Standards</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mary C. Parmelee</string-name>
          <email>mparmelee@mitre.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The MITRE Corporation</institution>
          <addr-line>7515 Colshire Drive, McLean, VA 22102-7539</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2010</year>
      </pub-date>
      <abstract>
        <p>The rapid growth in magnitude and complexity of cyber-security information and event management (CSIEM) has ignited a trend toward security automation and information exchange standards. Making Security Measurable (MSM) references a collection of open community standards for the common enumeration, expression and reporting of cyber-security-related information. Although MSM-related standards are valuable for enabling security automation, insufficient vocabulary management and data interoperability methods, as well as domain complexity that exceeds current representation capabilities, impede the adoption of these important standards. This paper describes an Agile, ontology architecture-based approach for improving the ability to represent, manage, and implement MSM-related standards. Initial cross-standard analysis revealed enough common concepts to warrant four ontologies that are reusable across standards. This reuse will simplify standards-based data interoperability. Further, early prototyping enabled us to streamline vocabulary management processes and demonstrate the ability to represent complex domain semantics in OWL ontologies.</p>
      </abstract>
      <kwd-group>
        <kwd>cyber-security</kwd>
        <kwd>ontology architecture</kwd>
        <kwd>security standards</kwd>
        <kwd>security automation</kwd>
        <kwd>making security measurable</kwd>
        <kwd>security information and event management</kwd>
        <kwd>SIEM</kwd>
        <kwd>semantic interoperability</kwd>
        <kwd>Agile Development</kwd>
        <kwd>OWL</kwd>
        <kwd>RDF</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Through its Making Security Measurable [
        <xref ref-type="bibr" rid="ref12">13</xref>
        ] and related efforts to standardize the
expression and reporting of cyber-security-related information, MITRE leads the
development of several open community standards. These standards are primarily
designed to support security automation and information interoperability, as well as
facilitate human security analysis across much of the cyber-security information and
event management (CSIEM) lifecycle. Some of the major security-related activities
supported by the standards are: vulnerability management, intrusion detection, asset
management, configuration guidance, incident management and threat analysis.
MITRE’s support of the individual standards is funded by several federal government
organizations. Many of the MSM-related standards have been adopted by the National
Institute of Standards and Technology’s (NIST’s) Security Content Automation
Protocol (SCAP) program [
        <xref ref-type="bibr" rid="ref15">16</xref>
        ]. Federal government organizations and security tool
vendors are moving toward adoption of SCAP-validated products to ensure baseline
security data and tool interoperability [
        <xref ref-type="bibr" rid="ref14">15</xref>
        ].
      </p>
      <p>
        Although MSM-related standards are valuable for enabling security automation,
insufficient vocabulary management and data interoperability methods, as well as
domain complexity that exceeds current representation capabilities, impede the
adoption of these important standards. This paper describes an Agile Development
[1], ontology architecture-based approach for improving the ability to represent,
manage, and implement MSM-related standards. The Cyber-Security Ontology
Architecture is a loosely-coupled, modular representation that is resilient to rapid
change and complexity. Architecture-based services and applications are free to
combine and extend architecture components at implementation time to fit
application-specific contexts without having to implement a single monolithic model.
The result is improved ability to support security automation, vocabulary
management, and data interoperability. Initial cross-standard analysis revealed
enough common concepts to warrant four ontologies that are reusable across
standards. This reuse is one way that this approach will simplify standards-based data
interoperability. Further, early prototyping enabled us to streamline vocabulary
management processes and demonstrate the ability to represent complex domain
semantics in OWL ontologies that are difficult or impossible to represent using the
Relational Database (RDB) and XML Schema (XSD) [
        <xref ref-type="bibr" rid="ref16 ref29">17, 30</xref>
        ] technologies in which
the standards are currently implemented.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>This section provides background descriptions of ontology architecture and controlled
vocabulary in the context of this paper.</p>
      <p>
        An ontology architecture is a conceptual information model composed of a
loosely-coupled federation of modular ontologies that form the structural and semantic
framework of an information domain. Ontology architectures have been used to relate
upper ontologies to their middle and domain level extensions [
        <xref ref-type="bibr" rid="ref20">21</xref>
        ]. Many of the
concepts involved in ontology architecture are defined elsewhere. Ontology architectures are
especially useful when applied to large, dynamic, complex domains such as
cyber-security [
        <xref ref-type="bibr" rid="ref16">17</xref>
        ]. The major benefits of this federated approach to ontology application
are [
        <xref ref-type="bibr" rid="ref22 ref7">8, 23</xref>
        ]:
1. Loose coupling and modularization make it easier to add, remove, and maintain
individual ontologies;
2. Modular ontologies are easier to reuse and process than large monolithic
ontologies;
3. Component ontologies can be dynamically combined on demand at
implementation time to meet application-specific needs.
      </p>
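      <p>The third benefit, combining component ontologies on demand, can be sketched as a small dependency resolver. This is a minimal illustration rather than the architecture's implementation: the registry, the module identifiers, and the function name <monospace>resolve_modules</monospace> are hypothetical, and the import edges are assumptions chosen only to show the idea.</p>
      <preformat><![CDATA[
```python
# Hypothetical registry: each ontology module lists the modules it imports.
# Module names and import edges are illustrative assumptions.
ONTOLOGY_IMPORTS = {
    "dublin-core": [],
    "resource-manager": ["dublin-core"],
    "point-of-contact": [],
    "content-curation": [],
    "cce-domain": ["content-curation", "resource-manager", "point-of-contact"],
}


def resolve_modules(requested, registry=ONTOLOGY_IMPORTS):
    """Return the requested modules plus their transitive imports, dependencies first."""
    ordered = []          # resolved modules, in load order
    in_progress = set()   # modules on the current import path, for cycle detection

    def visit(name):
        if name in ordered:
            return
        if name not in registry:
            raise KeyError(f"unknown ontology module: {name}")
        if name in in_progress:
            raise ValueError(f"import cycle involving: {name}")
        in_progress.add(name)
        for imported in registry[name]:
            visit(imported)
        in_progress.discard(name)
        ordered.append(name)

    for name in requested:
        visit(name)
    return ordered


if __name__ == "__main__":
    # An application that needs only resource metadata loads two modules,
    # not the whole family of ontologies.
    print(resolve_modules(["resource-manager"]))
```
]]></preformat>
      <p>Under these assumptions, an application names only the modules it needs and receives a load order that pulls in their imports, which is the sense in which a federated architecture avoids a single monolithic model.</p>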
      <p>The vocabulary of complex, dynamic domains such as cyber-security often includes
atypical linguistic expressions such as acronyms, idioms, and numeric codes. It is
important to recognize that although these linguistic expressions are not standard
language terms, they form an accepted vocabulary in the context of the domain. This
perspective of what constitutes a vocabulary calls for a broad definition of controlled
vocabulary (CV). In this context, a controlled vocabulary is a collection of linguistic
expressions that is vetted by an authority (e.g. a community) according to a set of
criteria. All of the MSM standards maintain some form of a controlled vocabulary.
These vocabularies were developed independently of each other, and are at various
stages of maturity that range from a few months to ten years of active development.
</p>
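      <p>The broad definition of a controlled vocabulary given above (expressions, a vetting authority, and a set of criteria) can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class name, the two criteria, and the placeholder expression <monospace>CCE-0000-0</monospace> are hypothetical and do not describe how any MSM standard actually vets its content.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ControlledVocabulary:
    """Expressions accepted by a named authority according to explicit criteria."""
    authority: str
    criteria: List[Callable[[str], bool]]
    accepted: Set[str] = field(default_factory=set)

    def submit(self, expression: str) -> bool:
        """Add the expression only if every vetting criterion passes."""
        if all(criterion(expression) for criterion in self.criteria):
            self.accepted.add(expression)
            return True
        return False


def is_trimmed(expression: str) -> bool:
    # Hypothetical criterion: non-empty, with no surrounding whitespace.
    return bool(expression) and expression == expression.strip()


def is_short(expression: str) -> bool:
    # Hypothetical criterion: at most 64 characters.
    return len(expression) <= 64


if __name__ == "__main__":
    cv = ControlledVocabulary(authority="example community",
                              criteria=[is_trimmed, is_short])
    # Acronyms and numeric codes are accepted as long as the criteria pass.
    for candidate in ["SIEM", "CCE-0000-0", "  padded  "]:
        print(repr(candidate), cv.submit(candidate))
```
]]></preformat>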
    </sec>
    <sec id="sec-3">
      <title>Obstacles to Standards Adoption</title>
      <p>The three major obstacles inhibiting the widespread adoption of the MSM-related
standards are:</p>
      <p>
        1. Unsustainable vocabulary management processes: Vocabulary management
involves thousands of manually developed and managed value enumerations and
vocabulary representations that are mostly encoded in XSD. The MSM-related
standards are growing rapidly in number, volume and complexity. Some of the
standards are adding hundreds to thousands of enumeration entries per month. A
semantic approach to vocabulary management would streamline the vocabulary
management process and reduce human error.
      </p>
      <p>
        2. Ineffective data interoperability methods: Data interoperability activities are
largely driven by the SCAP Validation program, which, among other things,
requires security tool vendors to translate proprietary output to a common
expression and reporting form in order to achieve SCAP compliance [
        <xref ref-type="bibr" rid="ref14">15</xref>
        ]. This
data interoperability is typically accomplished with manual ETL-style mappings
to each of the SCAP-required standards. This mapping process would be more
tractable, even semi-automatable, if common concepts were represented more
consistently across standards. A well-designed ontology architecture would
facilitate this consistency.
      </p>
      <p>3. Rapidly evolving, complex domain semantics that exceed the representation
capability of the RDB and XSD technologies in which the standards are currently
implemented: Domain complexity issues, such as how to represent the behavioral
aspects of malware and how to relate numerous software versioning schemes, call for
a more semantic representation than either XSD or RDB technologies alone can
readily provide. The semantics of these technologies are currently represented
mostly in human-interpretable documentation, which is not automatable or
machine-processable.</p>
      <p>
        The following sections of this document describe how a well-designed ontology
architecture coupled with a semantic technology-based approach to information
management could improve the productivity and efficiency of MSM-related standards
development, management and implementation [
        <xref ref-type="bibr" rid="ref18 ref19">19, 20</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>Agile Development Approach</title>
      <p>We take an Agile Development approach (Agile approach) to ontology architecture
design, development, and implementation [1]. Agile Development begins with an
envisioning phase in which we rapidly collect and prioritize user needs, perform
coarse-grained architecture modeling, and roughly estimate scope. Then we
implement the architecture by building incremental capability in short design and
development cycles called sprints. The intent is to allow the architecture to gradually
evolve based on emerging stakeholder requirements and lessons learned from each
sprint [1]. When fully mature, the Cyber-Security Ontology Architecture will
represent a comprehensive, standards-based family of ontologies.
</p>
      <sec id="sec-4-1">
        <title>Envisioning Phase</title>
        <p>We gathered high-level requirements from domain experts, which are expressed as
obstacles to adoption in Section 3 of this document. Then we developed a coarse
model of the CSIEM lifecycle to provide a rough estimate of scope. We mapped the
current MSM-related controlled vocabularies (CVs) to the CSIEM lifecycle model to
produce a CV architecture, as illustrated in Figure 1. Acronym expansions for the
standard names in Figure 1 are located in the References section, reference numbers
1, 2, 3, 4, 5, 6, 7, 12, 14, 18, and 29.</p>
        <p>
          Finally, we performed a vocabulary analysis, identifying gaps and overlaps while
extracting common concepts for reuse across vocabularies. The results are shown in
the first-draft Cyber-Security Ontology Architecture illustrated in Figure 2
[
          <xref ref-type="bibr" rid="ref17 ref5">2,4,6,18</xref>
          ]. The top two layers of the architecture designate the ontology-level tiers.
We will eventually fill the gaps with new or existing ontologies while reducing
vocabulary overlap to only intentional variation in order to control complexity and
improve structural and syntactic information interoperability.
        </p>
        <p>
          The lowest tier of the architecture holds the standards’ value-level CV content,
and the tier directly above it holds the CV representations. These two CV tiers are the
sources for the upper two ontology-level architecture tiers. Directly above the CV tiers, the
next tier contains ontologies that are specific to the cyber-security domain. Finally, the
uppermost tier contains common ontologies that emerged from vocabulary overlap
analysis. Development of the first draft Cyber-Security Ontology Architecture marks
the end of the envisioning phase of development and the beginning of Sprint 1
implementation. The ontologies that are encircled with red ovals are those that have
been developed or adopted during the Sprint 1 implementation phase. We adopt or
derive from existing ontologies where possible. The ontologies are encoded in the
Web Ontology Language (OWL) [
          <xref ref-type="bibr" rid="ref24">25</xref>
          ].
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Cyber-Security Ontology Architecture Implementation Sprint 1</title>
        <p>
          Sprint 1 focused on improving the vocabulary management process and produced five
ontologies. Four of these are common ontologies, including: an OWL (Web Ontology
Language) representation of the Dublin Core metadata standard [
          <xref ref-type="bibr" rid="ref24 ref8">9,25</xref>
          ]; a Resource
Manager ontology which imports the Dublin Core model and references parts of
SKOS (Simple Knowledge Organization System) [
          <xref ref-type="bibr" rid="ref27">28</xref>
          ]; a Point-of-Contact ontology
(which was derived from the FOAF [
          <xref ref-type="bibr" rid="ref9">10</xref>
          ] and VCard ontologies) [
          <xref ref-type="bibr" rid="ref25">26</xref>
          ]; and a Content
Curation ontology. The domain ontology was derived from the Common
Configuration Enumeration (CCE) CV. It includes the Content Curation ontology and
parts of the other three common ontologies. Figure 3 illustrates the structure of the
CCE Vocabulary Manager Ontology’s core concepts.
        </p>
        <p>
          We converted the existing CCE XML content into over 27,000 RDF [
          <xref ref-type="bibr" rid="ref26">27</xref>
          ] instances to
create the CCE Vocabulary Manager knowledge base, which contains over 500,000
RDF triples. Then we implemented a reference Semantic Web application using
TopQuadrant’s TopBraid Suite [
          <xref ref-type="bibr" rid="ref23">24</xref>
          ]. This application enables CCE content analysts to
view, query, navigate, edit and track the status of CCE content in the knowledge base.
Figure 4 shows a screenshot of the CCE vocabulary management application. The
RDF graph structure eliminates the need for the redundant content that
tabular and hierarchical structures require. The OWL ontology expands the single tacit CCE
Entry relation to many explicit user-defined relations among CCE instances. These
capabilities, among others, have the potential to streamline vocabulary management
processes and improve content quality across MSM-related standards.
        </p>
        <p>
          In the near future, we will refine the vocabulary management reference application
while building out the ontology architecture. A longer-term goal is to develop an
end-user reference implementation that semi-automates the mapping of proprietary tool
output to standard vocabularies.
        </p>
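        <p>The conversion of XML content into RDF instances can be pictured with a minimal sketch. The XML layout, the element names, the identifier <monospace>CCE-0000-0</monospace>, and the <monospace>http://example.org/cce/</monospace> namespace below are hypothetical placeholders, not the actual CCE schema or the ontology IRIs of the prototype; a production pipeline would normally use an RDF library rather than plain tuples.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real CCE content is structured differently.
SAMPLE_XML = """<cce_list>
  <cce id="CCE-0000-0">
    <description>Placeholder configuration statement</description>
    <platform>example-platform</platform>
  </cce>
</cce_list>"""

BASE = "http://example.org/cce/"  # hypothetical namespace for this sketch
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def cce_xml_to_triples(xml_text):
    """Turn each <cce> element into (subject, predicate, object) triples."""
    triples = []
    root = ET.fromstring(xml_text)
    for entry in root.iter("cce"):
        subject = BASE + entry.get("id")
        # One typing triple per entry makes it an instance of a class.
        triples.append((subject, RDF_TYPE, BASE + "Entry"))
        # Each child element becomes a property with a literal value.
        for child in entry:
            value = (child.text or "").strip()
            triples.append((subject, BASE + child.tag, value))
    return triples


if __name__ == "__main__":
    for triple in cce_xml_to_triples(SAMPLE_XML):
        print(triple)
```
]]></preformat>
        <p>Because every statement is keyed by the subject identifier, further relations among entries can be added as extra triples without restructuring the existing ones.</p>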
        <p>Acknowledgement. Thank you to MITRE subject matter experts Matthew Wojcik,
Jonathan Baker, and David Mann for their valuable contributions to this research.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>CEE Board: Common Event Expression Technical Report, Department G026, The MITRE Corporation (<year>2007</year>)</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>CPE: Common Platform Enumeration, http://cpe.mitre.org/files/cpe-specification_2.2.pdf</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>5. CRE: Common Remediation Enumeration, http://scap.nist.gov/events/2010/saddw/presentations/remediation.pdf</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>6. CVE: Common Vulnerabilities and Exposures, http://cve.mitre.org/</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>7. CWE: Common Weakness Enumeration, http://cwe.mitre.org/</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          8.
          <string-name><surname>Deshayes</surname>, <given-names>Laurent</given-names></string-name>;
          <string-name><surname>Foufou</surname>, <given-names>Sebti</given-names></string-name>;
          et al.:
          <article-title>An Ontology Architecture for Standards Integration and Conformance in Manufacturing</article-title>,
          <source>6th International IDDME</source>, Grenoble, France, May 17-19
          <year>2006</year>. http://stl.mie.utoronto.ca/publications/P0057paper.pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>9. Dublin Core Metadata Initiative: Dublin Core Element Set, http://dublincore.org/documents/dces/</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          10. FOAF:
          <article-title>Friend-of-a-Friend Vocabulary Specification</article-title>
          , http://xmlns.com/foaf/spec/
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>11. ISO: ISO 639-4:2010, http://www.iso.org/iso/catalogue_detail.htm?csnumber=39535</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          12. MAEC:
          <article-title>Malware Attribute Enumeration and Characterization</article-title>
          , http://maec.mitre.org/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>13. MSM: Making Security Measurable, http://measurablesecurity.mitre.org/</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          14.
          <string-name><surname>Mann</surname>, <given-names>David</given-names></string-name>:
          <article-title>An Introduction to the Common Configuration Enumeration (CCE)</article-title>,
          Technical Report, Department G022, The MITRE Corporation (<year>2008</year>)
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          15. NIST:
          <source>Interagency Report 7511, SCAP Validation Derived Test Requirements</source>,
          http://csrc.nist.gov/publications/drafts/nistir-7511/draft-nistir-7511_rev1.pdf (<year>2009</year>)
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>16. NIST: SCAP (Security Content Automation Protocol), http://scap.nist.gov/</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          17.
          <string-name><surname>Obrst</surname>, <given-names>Leo</given-names></string-name>:
          <article-title>Ontological Architectures</article-title>,
          Chapter 2 in Part One: Ontology as Technology in the book:
          <source>TAO - Theory and Applications of Ontology, Volume 2: The Information-science Stance</source>,
          Michael Healy, Achilles Kameas, Roberto Poli, eds. Springer (<year>2010</year>).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          18. OVAL:
          <article-title>Open Vulnerability and Assessment Language</article-title>
          , http://oval.mitre.org/
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          19.
          <string-name><surname>Parmelee</surname>, <given-names>Mary</given-names></string-name>:
          <article-title>Toward the Semantic Interoperability of the Security Information and Event Management Lifecycle</article-title>,
          In: AAAI Intelligent Security Workshop, http://www.tzi.de/~edelkamp/secart/IntSec.pdf (<year>2010</year>)
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          20.
          <string-name><surname>Parmelee</surname>, <given-names>Mary</given-names></string-name>;
          <string-name><surname>Nichols</surname>, <given-names>Deborah</given-names></string-name>;
          <string-name><surname>Obrst</surname>, <given-names>Leo</given-names></string-name>:
          <article-title>A Net-Centric Metadata Framework for Service Oriented Environments</article-title>.
          <source>IJMSO</source>
          <volume>4</volume>(<issue>4</issue>):
          <fpage>250</fpage>-<lpage>260</lpage>
          (<year>2009</year>)
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          21.
          <string-name><surname>Pease</surname>, <given-names>A.</given-names></string-name>,
          <string-name><surname>Niles</surname>, <given-names>I.</given-names></string-name>, and
          <string-name><surname>Li</surname>, <given-names>J.</given-names></string-name>:
          <article-title>The Suggested Upper Merged Ontology: A Large Ontology for the Semantic Web and its Applications</article-title>.
          In
          <source>Working Notes of the AAAI-2002 Workshop on Ontologies and the Semantic Web</source>,
          Edmonton, Canada (<year>2002</year>)
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>22. Princeton University: WordNet, http://wordnet.princeton.edu/</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          23.
          <string-name><surname>Probst</surname>, <given-names>F.</given-names></string-name>,
          <string-name><given-names>M.</given-names> <surname>Lutz</surname></string-name>:
          <article-title>Giving Meaning to GI Web Service Descriptions</article-title>,
          WSMAI (<year>2004</year>)
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>24. TopQuadrant: TopBraid Suite, http://topquadrant.com/products/TB_Suite.html</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          25. W3C:
          <article-title>OWL Overview</article-title>
          , http://www.w3.org/TR/owl-features/ (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          26. W3C:
          <article-title>Representing vCard Objects in RDF</article-title>
          , http://www.w3.org/Submission/vcard-rdf/ (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          27. W3C:
          <article-title>Resource Description Framework (RDF) Semantics</article-title>
          , W3C Recommendation http://www.w3.org/TR/rdf-mt/ (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>28. W3C SWDWG: SKOS, http://www.w3.org/2004/02/skos/ (<year>2004</year>)</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          29. XCCDF:
          <article-title>Specification for the Extensible Configuration Checklist Description Format (XCCDF) Version 1.1.4</article-title>
          , http://csrc.nist.gov/publications/nistir/ir7275r3/NISTIR7275r3.pdf (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          30. W3C XSWG:
          <article-title>XML Schema Part 1: Structures</article-title>
          , http://www.w3.org/TR/2001/PR-xmlschema-1-20010330/ (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>