=Paper= {{Paper |id=None |storemode=property |title=Toward an Ontology Architecture for Cyber-Security Standards |pdfUrl=https://ceur-ws.org/Vol-713/STIDS_A8_Parmelee.pdf |volume=Vol-713 |dblpUrl=https://dblp.org/rec/conf/stids/Parmelee10 }} ==Toward an Ontology Architecture for Cyber-Security Standards== https://ceur-ws.org/Vol-713/STIDS_A8_Parmelee.pdf
                         Toward an Ontology Architecture for
                             Cyber-Security Standards

                                             Mary C. Parmelee

                                         The MITRE Corporation
                                           7515 Colshire Drive,
                                       McLean, VA 22102-7539, USA
                                        mparmelee@mitre.org



                Abstract. The rapid growth in magnitude and complexity of cyber-security
                information and event management (CSIEM) has ignited a trend toward
                security automation and information exchange standards. Making Security
                Measurable (MSM) references a collection of open community standards for
                the common enumeration, expression and reporting of cyber-security-related
                information. While MSM-related standards are valuable for enabling security
                automation, insufficient vocabulary management and data interoperability
                methods, as well as domain complexity that exceeds current representation
                capabilities, impede the adoption of these important standards. This paper
                describes an Agile, ontology architecture-based approach for improving the
                ability to represent, manage, and implement MSM-related standards. Initial
                cross-standard analysis revealed enough common concepts to warrant four
                ontologies that are reusable across standards. This reuse will simplify
                standards-based data interoperability. Further, early prototyping enabled us to
                streamline vocabulary management processes and demonstrate the ability to
                represent complex domain semantics in OWL ontologies.

                Keywords: cyber-security, ontology architecture, security standards, security
                automation, making security measurable, security information and event
                management, SIEM, semantic interoperability, Agile Development, OWL, RDF


          Disclaimer. The views expressed in this chapter are those of the author alone and
          do not reflect the official policy or position of The MITRE Corporation or any other
          company or individual.


          1    Introduction

          Through its Making Security Measurable [13] and related efforts to standardize the
          expression and reporting of cyber-security-related information, MITRE leads the
          development of several open community standards. These standards are primarily
          designed to support security automation and information interoperability, as well as
          facilitate human security analysis across much of the cyber-security information and




STIDS 2010 Proceedings                                                                            Page 116 of 135
          event management (CSIEM) lifecycle. Some of the major security-related activities
          supported by the standards are: vulnerability management, intrusion detection, asset
          management, configuration guidance, incident management and threat analysis.
          MITRE’s support of the individual standards is funded by several federal government
          organizations. Many of the MSM-related standards have been adopted by the National
          Institute of Standards and Technology’s (NIST’s) Security Content Automation
          Protocol (SCAP) program [16]. Federal government organizations and security tool
          vendors are moving toward adoption of SCAP validated products to ensure baseline
          security data and tool interoperability [15].

          While MSM-related standards are valuable for enabling security automation,
          insufficient vocabulary management and data interoperability methods, as well as
          domain complexity that exceeds current representation capabilities, impede the
          adoption of these important standards. This paper describes an Agile Development
          [1], ontology architecture-based approach for improving the ability to represent,
          manage, and implement MSM-related standards. The Cyber-Security Ontology
          Architecture is a loosely-coupled, modular representation that is resilient to rapid
          change and complexity. Architecture-based services and applications are free to
          combine and extend architecture components at implementation time to fit
          application-specific contexts without having to implement a single monolithic model.
          The result is improved ability to support security automation, vocabulary
          management, and data interoperability. Initial cross-standard analysis revealed
          enough common concepts to warrant four ontologies that are reusable across
          standards. This reuse is one way that this approach will simplify standards-based data
          interoperability. Further, early prototyping enabled us to streamline vocabulary
          management processes and demonstrate the ability to represent complex domain
          semantics in OWL ontologies that are difficult or not possible to represent using the
          Relational Database (RDB) and XML Schema (XSD) [17, 30] technologies in which
          the standards are currently implemented.


          2    Background

          This section provides background descriptions of ontology architecture and controlled
          vocabulary in the context of this paper.

          An ontology architecture is a conceptual information model comprised of a loosely-
          coupled federation of modular ontologies that form the structural and semantic
          framework of an information domain. Ontology architectures have been used to relate
          upper ontologies to their middle and domain level extensions [21]. Many of the
          concepts involved in ontology architecture are defined Ontology architectures are
          especially useful when applied to large, dynamic, complex domains such as cyber-
          security [17]. The major benefits of this federated approach to ontology application
          are [8, 23]:




          1. Loose coupling and modularization make it easier to add, remove and maintain
             individual ontologies;
          2. Modular ontologies are easier to reuse and process than large monolithic
             ontologies;
          3. Component ontologies can be dynamically combined on demand at
             implementation time to meet application-specific needs.
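
          As a minimal sketch of the third benefit, the following Python fragment treats
          each modular ontology as a set of triples and combines only the modules that a
          given application asks for. The module names, prefixes, and triples are
          hypothetical placeholders chosen for illustration; they are not taken from the
          ontologies described in this paper.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical registry of modular ontologies, each held as a set of
# (subject, predicate, object) triples. All names are illustrative only.
MODULES: Dict[str, FrozenSet[Triple]] = {
    "point_of_contact": frozenset({
        ("poc:Contact", "rdf:type", "owl:Class"),
        ("poc:email", "rdfs:domain", "poc:Contact"),
    }),
    "content_curation": frozenset({
        ("cur:Status", "rdf:type", "owl:Class"),
    }),
    "cce_domain": frozenset({
        ("cce:Entry", "rdf:type", "owl:Class"),
        ("cce:hasStatus", "rdfs:range", "cur:Status"),
    }),
}


def combine(names: Iterable[str]) -> Set[Triple]:
    """Union the triples of the requested modules into one working model."""
    combined: Set[Triple] = set()
    for name in names:
        combined |= MODULES[name]
    return combined


# An application that only needs CCE entries and curation status
# leaves the point-of-contact module out entirely.
app_model = combine(["cce_domain", "content_curation"])
```

          Because each module is an independent set, adding or removing one does not
          require touching the others, which is the loose coupling the list above
          describes.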

          The vocabulary of complex, dynamic domains such as cyber-security often includes
          atypical linguistic expressions such as acronyms, idioms, and numeric codes. It is
          important to recognize that although these linguistic expressions are not standard
          language terms, they form an accepted vocabulary in the context of the domain. This
          perspective of what constitutes a vocabulary calls for a broad definition of controlled
          vocabulary (CV). In this context, a controlled vocabulary is a collection of linguistic
          expressions that is vetted by an authority (e.g. a community) according to a set of
          criteria. All of the MSM standards maintain some form of a controlled vocabulary.
          These vocabularies were developed independently of each other, and are at various
          stages of maturity that range from a few months to ten years of active development.


          3     Obstacles to Standards Adoption

          The three major obstacles inhibiting the widespread adoption of the MSM-related
          standards are:

          1.   Unsustainable vocabulary management processes: Vocabulary management
               involves thousands of manually developed and managed value enumerations and
               vocabulary representations that are mostly encoded in XSD. The MSM-related
               standards are growing rapidly in number, volume and complexity. Some of the
               standards are adding hundreds to thousands of enumeration entries per month. A
               semantic approach to vocabulary management would streamline the vocabulary
               management process and reduce human error.

          2.   Ineffective data interoperability methods: Data interoperability activities are
               largely driven by the SCAP Validation program, which, among other things,
               requires security tool vendors to translate proprietary output to a common
               expression and reporting form in order to achieve SCAP compliance [15]. This
               data interoperability is typically accomplished with manual ETL-style mappings
               to each of the SCAP-required standards. This mapping process would be more
               tractable, even semi-automatable, if common concepts were represented more
               consistently across standards. A well-designed ontology architecture would
               facilitate this consistency.

          3.   Rapidly evolving, complex domain semantics that exceed the representation
               capability of the RDB and XSD technologies in which the standards are currently
               implemented: Domain complexity issues such as how to represent the behavioral
                aspects of malware, and relating numerous software versioning schemes, call for
                a more semantic representation than either XSD or RDB technologies alone can
                readily provide. The semantics of these technologies are currently represented
                mostly in human interpretable documentation, which is not automatable or
                machine processable.
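
          To illustrate the second obstacle, the sketch below shows how a mapping from
          proprietary tool output to shared concepts might be driven by a single lookup
          table rather than by separate hand-written mappings per standard. The field
          names, the "common:" concept identifiers, and the sample record are all
          assumptions made for this example; none of them comes from an MSM-related
          standard or from a real tool.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from one vendor's field names to shared concept
# identifiers. Every key and value here is a placeholder.
CONCEPT_MAP: Dict[str, str] = {
    "vuln_id": "common:vulnerabilityIdentifier",
    "host": "common:assetName",
    "sw": "common:platformName",
}


def translate(record: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Re-key a vendor record by shared concept and list unmapped fields."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for field, value in record.items():
        concept = CONCEPT_MAP.get(field)
        if concept is None:
            # Left for a human analyst: this is the "semi" in semi-automated.
            unmapped.append(field)
        else:
            mapped[concept] = value
    return mapped, sorted(unmapped)


# Placeholder record; the identifier value is illustrative only.
mapped, unmapped = translate(
    {"vuln_id": "CVE-2010-0001", "host": "web01", "sev": "high"}
)
```

          Fields that have no entry in the table are returned separately so that an
          analyst can review them, instead of being silently dropped.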

          The following sections of this document describe how a well-designed ontology
          architecture coupled with a semantic technology-based approach to information
          management could improve the productivity and efficiency of MSM-related standards
          development, management and implementation [19, 20].


          4      Agile Development Approach

          We take an Agile Development approach (Agile approach) to ontology architecture
          design, development, and implementation [1]. Agile Development begins with an
          envisioning phase in which we rapidly collect and prioritize user needs, perform
          coarse grained architecture modeling, and roughly estimate scope. Then we
          implement the architecture by building incremental capability in short design and
          development cycles called sprints. The intent is to allow the architecture to gradually
          evolve based on emerging stakeholder requirements and lessons learned from each
          sprint [1]. When fully mature, the Cyber-Security Ontology Architecture will
          represent a comprehensive, standards-based family of ontologies.

          4.1     Envisioning Phase

          We gathered high level requirements from domain experts, which are expressed as
          obstacles to adoption in Section 3 of this document. Then we developed a coarse
          model of the CSIEM lifecycle to provide a rough estimate of scope. We mapped the
          current MSM-related controlled vocabularies (CVs) to the CSIEM lifecycle model to
          produce a CV architecture as illustrated in Figure 1. Acronym expansions for the
          standard names in Figure 1 are located in the References section, reference numbers
          2, 3, 4, 5, 6, 7, 12, 14, 18, and 29.

          Finally, we performed a vocabulary analysis, identifying gaps and overlaps while
          extracting common concepts for reuse across vocabularies. The results are captured in
          the first draft Cyber-Security Ontology Architecture, illustrated in Figure 2
          [2, 4, 6, 18]. The top two layers of the architecture designate the ontology-level tiers.
          We will eventually fill the gaps with new or existing ontologies while reducing
          vocabulary overlap to only intentional variation in order to control complexity and
          improve structural and syntactic information interoperability.

          The lowest tier of the architecture designates the standards' value-level CV content,
          followed by the CV representations in the second tier. These two CV tiers are the
          sources for the upper two ontology-level architecture tiers. Above the CV tiers, the
          third tier contains ontologies that are specific to the cyber-security domain. Finally the
          upper-most tier contains common ontologies that emerged from vocabulary overlap
          analysis. Development of the first draft Cyber-Security Ontology Architecture marks
          the end of the envisioning phase of development and the beginning of Sprint 1
          implementation. The ontologies that are encircled with red ovals are those that have
          been developed or adopted during the Sprint 1 implementation phase. We adopt or
          derive from existing ontologies where possible. The ontologies are encoded in the
          Web Ontology Language (OWL) [25].
                                       Fig. 1. CSIEM CV Architecture




                        Fig. 2. Cyber-Security Ontology Architecture Concept Diagram




          4.2    Cyber-Security Ontology Architecture Implementation Sprint 1

          Sprint 1 focused on improving the vocabulary management process and produced five
          ontologies. Four of these are common ontologies, including: an OWL (Web Ontology
          Language) representation of the Dublin Core metadata standard [9,25]; a Resource
          Manager ontology which imports the Dublin Core model and references parts of
          SKOS (Simple Knowledge Organization System) [28]; a Point-of-Contact ontology
          (which was derived from the FOAF [10] and vCard [26] ontologies); and a Content
          Curation ontology. The domain ontology was derived from the Common
          Configuration Enumeration (CCE) CV. It includes the Content Curation ontology and
          parts of the other three common ontologies. Figure 3 illustrates the structure of the
          CCE Vocabulary Manager Ontology’s core concepts.
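
          The dependencies among these ontologies can be sketched as a small import
          graph. In the Python fragment below, two edges follow the description above
          (the Resource Manager ontology imports Dublin Core, and the CCE ontology
          includes the Content Curation ontology); the remaining edge is an assumption
          added only to make the traversal non-trivial. The identifiers are hypothetical
          short names, not the ontologies' actual IRIs.

```python
from typing import Dict, List, Set

# Direct dependencies between ontology modules, using hypothetical short names.
IMPORTS: Dict[str, List[str]] = {
    "cce_vocabulary_manager": ["content_curation"],  # described in the text
    "resource_manager": ["dublin_core"],             # described in the text
    "content_curation": ["resource_manager"],        # assumption, illustration only
    "dublin_core": [],
}


def import_closure(name: str) -> Set[str]:
    """Return every module reachable from `name` through direct imports."""
    seen: Set[str] = set()
    stack = list(IMPORTS.get(name, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against import cycles
        seen.add(current)
        stack.extend(IMPORTS.get(current, []))
    return seen


needed = import_closure("cce_vocabulary_manager")
```

          A loader built along these lines would fetch only the modules in the closure,
          which keeps an application from pulling in ontologies it never uses.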

                          Fig. 3. CCE Vocabulary Manager Ontology Core Concepts




          We converted the existing CCE XML content into over 27,000 RDF [27] instances to
          create the CCE Vocabulary Manager knowledge base, which contains over 500,000
          RDF triples. Then we implemented a reference Semantic Web application using
          TopQuadrant’s TopBraid Suite [24]. This application enables CCE content analysts to
          view, query, navigate, edit and track the status of CCE content in the knowledge base.
          Figure 4 shows a screenshot of the CCE vocabulary management application. The
          RDF graph structure eliminates the need for the redundant content that tabular
          and hierarchical structures require. The OWL ontology expands the single tacit CCE
          Entry relation to many explicit user-defined relations among CCE instances. These
          capabilities, among others, have the potential to streamline vocabulary management
          processes and improve content quality across MSM-related standards.
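
          The conversion step can be sketched as follows. This is not the conversion we
          used; it is a minimal illustration that assumes a simplified, hypothetical XML
          layout (a cce element with an id attribute and a few text-valued child
          elements), an example.org base IRI, and placeholder content. It uses only the
          Python standard library and emits N-Triples lines.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, simplified input. The real CCE XML layout is not shown here.
SAMPLE_XML = """
<cce_list>
  <cce id="CCE-0000-0">
    <description>Placeholder description</description>
    <platform>example-platform</platform>
    <parameter>enabled / disabled</parameter>
  </cce>
</cce_list>
"""

BASE = "http://example.org/cce/"  # placeholder namespace, an assumption
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_ntriples(xml_text: str) -> List[str]:
    """Turn each <cce> element into one typed resource plus literal triples."""
    root = ET.fromstring(xml_text)
    lines: List[str] = []
    for entry in root.iter("cce"):
        subject = f"<{BASE}entry/{entry.get('id')}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{BASE}Entry> .")
        for child in entry:
            # Escape backslashes and double quotes for an N-Triples literal.
            text = (child.text or "").strip()
            literal = text.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{BASE}{child.tag}> "{literal}" .')
    return lines


triples = to_ntriples(SAMPLE_XML)
```

          Each child element becomes one triple on the same subject, so a value that
          applies to many entries could instead be modeled once as a shared resource
          and linked to, rather than repeated as a literal in every entry.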




                             Fig. 4. CCE Vocabulary Management Web Application




          5     Future Work

          In the near future, we will refine the vocabulary management reference application
          while building out the ontology architecture. A longer term goal is to develop an end
          user reference implementation that semi-automates the mapping of proprietary tool
          output to standard vocabularies.

          Acknowledgement. Thank you to MITRE subject matter experts Matthew Wojcik,
          Jonathan Baker and David Mann for their valuable contributions to this research.


          References

          1.   Ambler, Scott W.: Agile Model Driven Development,
               http://www.agilemodeling.com/essays/amdd.htm
          2.   CCE: Common Configuration Enumeration, http://cce.mitre.org/
          3.   CEE Board: Common Event Expression Technical Report, Department G026, The MITRE
               Corporation (2007)
          4.   CPE: Common Platform Enumeration, http://cpe.mitre.org/files/cpe-specification_2.2.pdf




          5.    CRE: Common Remediation Enumeration,
                http://scap.nist.gov/events/2010/saddw/presentations/remediation.pdf
          6.    CVE: Common Vulnerabilities and Exposures, http://cve.mitre.org/
          7.    CWE: Common Weakness Enumeration, http://cwe.mitre.org/
          8.    Deshayes, Laurent; Foufou, Sebti; et al.: An Ontology Architecture for Standards
                Integration and Conformance in Manufacturing, 6th International IDDME, Grenoble,
                France, May 17-19 2006. http://stl.mie.utoronto.ca/publications/P0057paper.pdf
          9.    Dublin Core Metadata Initiative: Dublin Core Element Set,
                http://dublincore.org/documents/dces/
          10.   FOAF: Friend-of-a-Friend Vocabulary Specification, http://xmlns.com/foaf/spec/
          11.   ISO: ISO 639-4:2010, http://www.iso.org/iso/catalogue_detail.htm?csnumber=39535
          12.   MAEC: Malware Attribute Enumeration and Characterization, http://maec.mitre.org/
          13.   MSM: Making Security Measurable, http://measurablesecurity.mitre.org/
          14.   Mann, David: An Introduction to the Common Configuration Enumeration (CCE),
                Technical Report, Department G022, The MITRE Corporation (2008)
          15.   NIST: Interagency Report 7511, SCAP Validation Derived Test Requirements,
                http://csrc.nist.gov/publications/drafts/nistir-7511/draft-nistir-7511_rev1.pdf (2009)
          16.   NIST: SCAP (Security Content Automation Protocol), http://scap.nist.gov/
          17.   Obrst, Leo: Ontological Architectures, Chapter 2 in Part One: Ontology as Technology in
                the book: TAO – Theory and Applications of Ontology, Volume 2: The Information-
                science Stance, Michael Healy, Achilles Kameas, Roberto Poli, eds. Springer, (2010).
          18.   OVAL: Open Vulnerability and Assessment Language, http://oval.mitre.org/
          19.   Parmelee, Mary: Toward the Semantic Interoperability of the Security Information and
                Event Management Lifecycle, In: AAAI Intelligent Security Workshop,
                http://www.tzi.de/~edelkamp/secart/IntSec.pdf (2010)
          20.   Parmelee, Mary; Nichols, Deborah; Obrst, Leo: A Net-Centric Metadata Framework for
                Service Oriented Environments. IJMSO 4 (4): 250 – 260 (2009)
          21.   Pease, A., Niles, I., and Li, J.: The Suggested Upper Merged Ontology: A Large Ontology
                for the Semantic Web and its Applications. In Working Notes of the AAAI-2002
                Workshop on Ontologies and the Semantic Web, Edmonton, Canada (2002)
          22.   Princeton University: WordNet, http://wordnet.princeton.edu/
          23.   Probst, F., M. Lutz: Giving Meaning to GI Web Service Descriptions, WSMAI (2004)
          24.   TopQuadrant: TopBraid Suite, http://topquadrant.com/products/TB_Suite.html
          25.   W3C: OWL Overview, http://www.w3.org/TR/owl-features/ (2004)
          26.   W3C: Representing vCard Objects in RDF, http://www.w3.org/Submission/vcard-rdf/
                (2010)
          27.   W3C: Resource Description Framework (RDF) Semantics, W3C Recommendation
                http://www.w3.org/TR/rdf-mt/ (2004)
          28.   W3C SWDWG: SKOS, http://www.w3.org/2004/02/skos/ (2004)
          29.   XCCDF: Specification for the Extensible Configuration Checklist Description Format
                (XCCDF) Version 1.1.4, http://csrc.nist.gov/publications/nistir/ir7275r3/NISTIR-
                7275r3.pdf (2008)
          30.   W3C XSWG: XML Schema Part 1: Structures, http://www.w3.org/TR/2001/PR-
                xmlschema-1-20010330/ (2001)



