=Paper=
{{Paper
|id=Vol-1788/STIDS2016_A02
|storemode=property
|title=A Practical Approach to Data Modeling using CCO
|pdfUrl=https://ceur-ws.org/Vol-1788/STIDS_2016_A02_Moten_Barnhill.pdf
|volume=Vol-1788
|authors=Rod Moten,Bill Barnhill
|dblpUrl=https://dblp.org/rec/conf/stids/MotenB16
}}
==A Practical Approach to Data Modeling using CCO==
Rod Moten, Datanova Scientific, Baltimore, Maryland

Bill Barnhill, EOIR Technologies, APG, MD

STIDS 2016 Proceedings, pages 69–73

Abstract. In this paper, we present work in progress on using the Information Domain ontologies of CCO (Common Core Ontologies) as a domain model for land combat. Our goal is to use the domain model as a common semantics for multiple land combat logical models. In the paper, we show how our domain model can be mapped to different logical models in a manner that is less labor intensive than the approach commonly used by users of CCO. We demonstrate our approach by describing how our domain model, which is a domain ontology of CCO, is mapped to logical models created in Ecore and NIEM (National Information Exchange Model).

===I. Introduction===
There are three primary forms of a data model: the domain model, the logical model, and the physical model [1]. A domain model specifies the concepts that data represents, the properties of the concepts, and the relationships between concepts. A logical model specifies the logical structure of data. A physical model specifies how data is represented in machine-readable format. Ideally, a logical model is derived directly from a domain model, or a formal relationship is defined between the domain model and the logical model. In these cases, the domain model serves as the semantics of the logical model. Semantics is assigned to the logical model via a mapping between the domain model and the logical model.

There are multiple approaches to performing this mapping. One approach is to develop a mapping between objects in the domain model and objects in the logical model. For example, the domain model could be defined using an ontology. The mapping specifies how to convert objects in the logical models to individuals in the ontology.

We used this approach for several projects where the domain models were domain ontologies of CCO (Common Core Ontologies) [2]. CCO is a collection of upper, middle, and domain ontologies in OWL that extend BFO (Basic Formal Ontology) [3]. Figure 1 contains a diagram of the ontologies in CCO.

Fig. 1. The ontologies of CCO and the Land Combat Information Ontology.

One of the authors of this paper has used CCO for creating domain ontologies for a motion imagery analysis application [4] and other projects. In all of these projects, we sought to use ontologies conformant to the CCO as domain models. In addition, we sought to create mappings from the logical models of existing tactical military software systems to the domain models. We required the assistance of an ontologist with in-depth knowledge of CCO to create the mappings. As a result, using CCO may have a higher cost than an approach that allows programmers or data architects to develop the mapping independently. Consequently, the government sponsor of the projects considered the use of CCO impractical for tactical military systems.

We believe that CCO is practical for tactical military systems. The problems we encountered were due to how CCO was used. They occurred because of differences in the modeling objectives of a logical model and a domain model defined as a formal ontology. A logical model defines the symbolic structure of entities for automated processing and analysis. The structure is chosen in order to simplify processing and analysis. For example, the essential properties of a person, such as name and birth date, are modeled as attributes of the same object in a logical model. However, the domain ontologies of CCO are specifications of the metaphysical makeup of entities. Therefore, essential properties of the same entity may have different structural representations as individuals in the CCO. In other words, the graph patterns of the triples representing the essential attributes of the same entity may be different. For example, a birth date for a person is a temporal interval for a birth event that occurs on a person agent. A name of a person is an information bearer that inheres in a person agent. This means that mapping a person entity in a logical model requires determining how each attribute is represented metaphysically and then creating the triples accordingly.

An approach that requires examining each attribute equates to defining a separate function for converting each attribute to individuals in the domain model. If we measure the cost of creating a mapping by the number of functions that have to be created, then an approach that uses a single function for mapping sets of entities to concepts may be less expensive than an approach that requires a function for each attribute.

To develop an approach based on converting sets of entities to concepts, we propose modeling a domain model as information about the metaphysical properties of entities. In other words, consider the domain to be the terms that designate the entities and the relationships between the entities. For example, Aircraft and F-14 would be concepts where F-14 is subsumed by Aircraft. In this case, there are multiple Aircraft individuals and multiple F-14 individuals, which are also Aircraft individuals. However, in an information model, there is only one designator term for all aircraft and one designator term for all F-14s. The subsumption relationship between Aircraft and F-14 could be modeled using a descriptive term, such as derives-from. More specifically, the relationship could be modeled as the triple ‘F-14 derives-from Aircraft’. This means the domain ontology has to extend the Information Domain ontologies of CCO. However, we have to ensure that the domain ontology isn’t just an OWL encoding of a logical model. The latter approach is used by some techniques for automatically creating schemas from ontologies [5].

Using our approach, we do not map objects in the logical model to individuals in the ontology. Instead, we create a mapping where the domain model represents concepts that map directly to syntactic classes in the logical model. This mapping should be more intuitive to data architects since it requires little knowledge of CCO and ontology development.

In this paper, we demonstrate a method for creating domain ontologies in CCO that can be systematically mapped to logical models. In Section II, we provide an overview of the Information Domain ontologies of CCO. In Section III, we describe how a domain ontology should extend the Information Domain ontologies by creating a proof-of-concept domain ontology for land combat. In Section IV, we describe how the domain ontology maps to logical models in Ecore [6] and NIEM (National Information Exchange Model, https://www.niem.gov/). We conclude the paper in Section V with a discussion of why we think our approach faithfully encodes the semantics of the domain and isn’t merely a logical model in OWL.

===II. Information Ontologies in CCO===
The information entity ontology is partitioned into two class hierarchies, information bearing entities and information content entities. We call information bearing entities information bearers for short.

An information bearer is an independent continuant that carries information. For example, a track of an aircraft is an information bearer because it contains information about the flight pattern of an aircraft.

Information content entities are things used to represent information for an information bearer. For example, a 2D graph could be the information content entity of an air track. In this case, the 2D graph is the information that represents the flight pattern of an aircraft. In addition, a 3D graph could be the information content of the air track. The information content entity does not have to be unique to its bearer. For example, -20 degrees Celsius is an information content entity that inheres in many information bearers, such as the current temperature or the lowest operating temperature.

Information content entities are organized into three hierarchies: directive information, designative information, and descriptive information. In this paper, we only use designative and descriptive information entities, so we omit describing directive information. Designative content entities consist of a set of symbols that denote some entity. Type codes are an example of designative content entities. Descriptive content entities consist of a set of propositions that describe some entity. Numeric scales are examples of descriptive content entities.

There is only one class for information bearers, Information Entity Bearer. Our domain ontology for land combat will define a hierarchy for land combat terms with Information Entity Bearer as the root.

===III. Land Combat Domain Model===
In this section, we give an overview of how we created the land combat domain model as an extension of the Information Entity Ontology.

====A. Identify Sources====
The first step in creating the domain model is identifying the sources of the information entities. For the land combat proof-of-concept, we use the standards in Table I.

{| class="wikitable"
|+ Table I. Land Combat Domain Sources
! Descriptive Name !! Acronym/Standard Name
|-
| Common Warfighting Symbology || MIL-STD-2525C
|-
| Variable Message Format || MIL-STD-6017C
|-
| US Message Text Format || MIL-STD-6040 Rev. B
|-
| Modernized Intelligence Database || MIDB
|-
| Ground-Warfighter Geospatial Data Model || GGDM
|}

====B. Define Class Hierarchy====
For the second step, we defined class hierarchies that extend Information Bearing Entity and Information Content Entity. Our approach is based on the assumption that the domain model is a conceptualization of information about entities. More specifically, the domain model consists of concepts that can be classified as an entity report, an entity artifact, or an entity representation.
An entity report is a concept that captures, in a structured machine-readable form, one or more observations about an entity’s state at a given time, as observed by an agent with a given location (where the agent can be human or software). An entity artifact is a concept that describes assertions about an entity. Entity artifacts are derived either from entity records or from other entity artifacts. For example, a detailed entity artifact about a person can be created from multiple entity records obtained from HUMINT sources. There can be more than one entity artifact asserting information about a given entity, or there may be no entity artifacts asserting information about a particular entity. An entity representation is a concept describing human-understandable signs and symbols that can be presented to a human actor via some sensory medium (e.g., an audible alert, a PowerPoint deck, a printed document). Figure 2 shows an example of the entity informational categories.

Fig. 2. Example depicting informational entity categories.

We partition the terms into two groups and define OWL classes for each group. The first group consists of terms representing entity artifacts and entity reports. We call these terms LC (Land Combat) Information Entities. The second group contains qualities, traits, roles, and characteristics of the entity referenced by an entity artifact or an entity report. The classes for this group of terms will be Information Content Entity classes. Figure 3 shows a snapshot of the object properties, LC Information Bearing Entities, and the Information Content Entity classes.

Fig. 3. Screen shot of the Land Combat Domain Model T-Box in Protégé.

====C. Convert Terms to Individuals====
In this step, we present the guidelines we used to determine which terms from the source documents become individuals in the ontology. We use the noun and adjective phrases in the source documents to create the individuals in the ontology.
For example, the terms ‘aircraft carrier’, ‘light’, ‘guided missile’, and ‘nuclear powered’ are noun and adjective phrases in USMTF. Each of these terms will be an individual. The adjective phrases will become land combat designative content individuals. The noun phrases will become information content entity individuals and information bearing entity individuals.

The usage of the noun phrase determines which class the term belongs to. If the noun phrase is an entity, such as aircraft carrier, then it will become an LC Information Entity. If the noun phrase is the value of a type code, then it will become an Information Content Entity. More specifically, it will be an individual of a subclass of LC Designative Content. If it is a multi-valued numeric attribute, then it will be an individual of an LC Ratio Measurement Content subclass.

The individuals of the LC Relation class are verb phrases that describe a relationship between terms in the standard. For example, 2525C contains a taxonomy of air tracks about different kinds of aircraft. Therefore, ‘is about’ is a relation between the LC Information Entities. Notice that the relation individuals may not be verb phrases in the standard. Instead, they are conceptualizations of the relationships between terms in the standard.

====D. Define Ontological Relationships of the Domain====
By defining relationships between terms using an individual, we can support defining an arbitrary number of relations. We can use OWL properties as meta-relationships between individuals. More specifically, we define a fixed set of OWL properties for defining subsumption and composition relationships between individuals. These relationships hold for all domains.

Each of the meta-relation properties is a CCO property or a sub-property of a CCO property. Figure 4 depicts a sample of triples using all of the meta-relation properties. The CCO properties are in blue and the derived properties are in black.

Fig. 4. Example illustrating use of meta-properties.

The ‘derives from’ property indicates that the subject has all of the same properties as the object. Therefore, ‘Strategic Bomber’ and ‘Tactical Bomber’ each have ‘Fixed Wing’ as a quality. The ‘derives from’ property is the only subsumption property in our model. The properties ‘has object’ and ‘has subject’ are used to indicate the subject and object of an LC relation. The properties ‘has feature’, ‘has part’, ‘has value’, and ‘has code’ all indicate a part-whole relationship between the subject and object. The difference among them is the range of the property. The range of ‘has feature’ is Information Content Entities, but the range of ‘has part’ is an LC Info Entity class. The range of ‘has code’ is LC Info Type Code. The range of ‘has value’ is a subclass of LC Ratio Measurement Info Term. The property ‘enumerated by’ indicates the enumerations of a type code. The property ‘has quality’ indicates that the object is a quality of the subject.

===IV. Land Combat Logical Models===
In this section, we describe how classes and individuals from the domain model created in Section III map to logical models in Ecore and NIEM.

====A. Mapping to Ecore====
Ecore is a metamodel for defining models in EMF (Eclipse Modeling Framework) [6]. Using Ecore, developers can create models similar to UML class diagrams and automatically generate code from the models. Ecore contains constructs and features common in object-oriented design, such as classes, enumerations, and inheritance.

Mapping to an object model in Ecore is straightforward. Each of the individuals of Type Code becomes an enumeration class in Ecore. The enumerations are determined by the ‘enumerated by’ property. More specifically, if A ‘enumerated by’ X and A ‘enumerated by’ Y are triples, then X and Y are the enumeration literals of the enumeration class corresponding to A.

Each LC Info Entity individual will be a class in Ecore that extends the root class InfoEntity. The ‘derives from’ property determines its subclasses and parent class. More specifically, if A ‘specialization of’ B or B ‘generalization of’ A is a triple, then the Ecore class corresponding to A will be a subclass of the Ecore class corresponding to B.

The attributes of the classes will be defined as follows. For each triple S p O, where S is an LC Info Entity individual and p is one of the properties ‘has feature’, ‘has value’, ‘has attribute’, or ‘has part’, there will be an attribute in the class corresponding to S whose type is the type corresponding to O. Each of these types will be created as classes using the same approach.

If the Ecore class created from the Entity individual A does not have any attributes, then it can be made into an enumerated class. This will require the individual B in a triple A ‘specialization of’ B or B ‘generalization of’ A to be converted into an enumeration literal.

Each A ‘is record of’ B triple will be converted into an association class. More specifically, it will be converted into a class that contains two attributes, subject and object. The type of subject will be the type corresponding to A. The type of object will be the type corresponding to B.

====B. Mapping to NIEM====
NIEM is a logical model developed by the U.S. Government to enable state and federal agencies to share data. The purpose of NIEM is to establish a common structured vocabulary for a set of terms used in all domains relevant to government activities, such as person and location, and a set of common terms used in specialized domains relevant to some government activities, such as hospital and unmanned vehicle. NIEM uses XSD and UML to define the terms so that they can be readily used in software.

In NIEM, terms are partitioned into elements and types. An element represents properties or attributes of objects. A type represents a set of objects that have the same properties and semantics.

Each Entity individual will be a NIEM type. Elements of the NIEM types are determined by the objects in triples. Objects of ‘has feature’, ‘has attribute’, and ‘has part’ will become composite elements. Objects of ‘has value’ will become scalar elements. The ‘generalization of’ and ‘specialization of’ properties will determine inheritance.

Code Lists can be created in a similar fashion to how enumerated classes are created in Ecore. Association Types can be created from ‘is record of’ triples.

A logical modeler determines whether an object of ‘has attribute’ should be considered Metadata. NIEM augmentation points and extensions are determined from ‘derives from’. The logical modeler determines whether to create an augmentation point or an extension.

===V. Conclusion===
We described an approach to create a domain model in OWL from which logical models can be derived in a systematic way. The result is truly a domain model because it uses terminology from domain documents to create the ontology entities. In addition, the domain model contains the ontological relationships from the domain. For instance, it is able to specify that two concepts are related because one concept is a quality of another concept. It is also able to capture role relationships.

We provided an overview of how we intend to use domain models created with our approach to generate logical models in Ecore and NIEM. We believe project managers will consider our approach suitable for their projects because it does not require expertise in ontologies or in-depth knowledge of CCO.

In the future, we plan to build a complete land combat domain model using the sources mentioned in Table I. We hope this domain model will be used as a common semantics for the U.S. Army’s initiative to use a single computing platform for multiple army battle command systems [7].

===References===
[1] Interim report: ANSI/X3/SPARC Study Group on Data Base Management Systems. Washington, D.C.: ACM, 1975.

[2] J. R. Schoening, D. K. Duff, D. A. Hines, K. M. Riser, T. Pham, G. H. Stolovy, J. Houser, R. Rudnicki, R. Ganger, and A. James, “PED fusion via enterprise ontology,” in Proceedings SPIE 9464, Ground/Air Multisensor Interoperability, Integration, and Networking for Persistent ISR VI. International Society for Optics and Photonics, May 2015.

[3] R. Arp, B. Smith, and A. D. Spear, Building ontologies with basic formal ontology. MIT Press, 2015.

[4] W. R. Thissell, R. Czajkowski, F. Schrenk, T. Selway, A. J. Ries, S. Patel, P. L. McDermott, R. Moten, R. Rudnicki, G. Seetharaman, I. Ersoy, and K. Palaniappan, “A Scalable Architecture for Operational FMV Exploitation,” in Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015, pp. 10–18.

[5] M. J. O’Connor and A. Das, “Acquiring OWL Ontologies from XML Documents,” in Proceedings of the Sixth International Conference on Knowledge Capture, ser. K-CAP ’11. New York, NY, USA: ACM, 2011, pp. 17–24.

[6] D. Steinberg, F. Budinsky, E. Merks, and M. Paternostro, EMF: eclipse modeling framework. Pearson Education, 2008.

[7] S. Lyngaas, “Four years on, Army common operating environment takes shape,” FCW, Sep. 2015. [Online]. Available: https://fcw.com/articles/2015/09/22/army-mobile-computing.aspx
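
The Ecore mapping rules of Section IV-A can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' implementation: it reads (subject, property, object) triples and returns plain dictionaries rather than real Ecore objects. The function name `sketch_ecore_model`, the sample terms ‘Wing Type’ and ‘Rotary Wing’, the choice to record each attribute as a (property, type) pair, and the choice to treat the endpoints of an ‘is record of’ triple as classes are all assumptions made for the example.

```python
from collections import defaultdict

# Properties that Section IV-A turns into class attributes.
ATTRIBUTE_PROPERTIES = {"has feature", "has value", "has attribute", "has part"}
ROOT_CLASS = "InfoEntity"  # root class named in Section IV-A

# Hypothetical sample triples; 'Wing Type' and 'Rotary Wing' are invented here.
SAMPLE_TRIPLES = [
    ("F-14", "specialization of", "Aircraft"),
    ("Aircraft", "has feature", "Wing Type"),
    ("Wing Type", "enumerated by", "Fixed Wing"),
    ("Wing Type", "enumerated by", "Rotary Wing"),
    ("Air Track", "is record of", "Aircraft"),
]


def sketch_ecore_model(triples):
    """Apply the Section IV-A rules to (subject, property, object) triples."""
    enums = defaultdict(list)       # enumeration name -> enumeration literals
    parents = {}                    # class name -> parent class name
    attributes = defaultdict(list)  # class name -> [(property, attribute type)]
    associations = []               # one entry per 'is record of' triple

    for s, p, o in triples:
        if p == "enumerated by":
            enums[s].append(o)
        elif p == "specialization of":
            parents[s] = o
        elif p == "generalization of":
            parents[o] = s
        elif p in ATTRIBUTE_PROPERTIES:
            # The attribute lives in the class for S; its type is the type for O.
            attributes[s].append((p, o))
        elif p == "is record of":
            associations.append({"subject": s, "object": o})

    # Collect every name that should become a class.
    class_names = set(parents) | set(parents.values()) | set(attributes)
    class_names |= {t for attrs in attributes.values() for _, t in attrs}
    # Assumption: association endpoints are classes as well.
    class_names |= {a[k] for a in associations for k in ("subject", "object")}
    # Type codes become enumerations, not classes.
    class_names -= set(enums)

    classes = {
        name: {
            "extends": parents.get(name, ROOT_CLASS),
            "attributes": attributes.get(name, []),
        }
        for name in sorted(class_names)
    }
    return {"enums": dict(enums), "classes": classes, "associations": associations}


if __name__ == "__main__":
    from pprint import pprint
    pprint(sketch_ecore_model(SAMPLE_TRIPLES))
```

Because every rule is keyed on the property name alone, one function covers all entity individuals, which is the cost argument made in Section I. A real generator would emit Ecore classes, enumerations, and references instead of dictionaries.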