Horizontal Integration of Warfighter Intelligence Data
A Shared Semantic Resource for the Intelligence Community

Barry Smith, University at Buffalo, NY, USA
Tatiana Malyuta, Data Tactics Corp., VA, USA
William S. Mandrick, Data Tactics Corp., VA, USA
Chia Fu, Data Tactics Corp., VA, USA
Kesny Parent, Intelligence and Information Warfare Directorate (I2WD), CERDEC, MD, USA
Milan Patel, Intelligence and Information Warfare Directorate (I2WD), CERDEC, MD, USA

Abstract - We describe a strategy that is being used for the horizontal integration of warfighter intelligence data within the framework of the US Army’s Distributed Common Ground System Standard Cloud (DSC) initiative. The strategy rests on the development of a set of ontologies that are being incrementally applied to bring about what we call the ‘semantic enhancement’ of data models used within each intelligence discipline. We show how the strategy can help to overcome familiar tendencies to stovepiping of intelligence data, and describe how it can be applied in an agile fashion to new data resources in ways that address immediate needs of intelligence analysts.

Index Terms - semantic enhancement, ontology, joint doctrine, intelligence analytics, intelligence data retrieval.

I. INTRODUCTION

The horizontal integration of warfighter intelligence data is described in Chairman of the Joint Chiefs of Staff Instruction J2 CJCSI 3340.02A [1] in the following way:

    Horizontally integrating warfighter intelligence data improves the consumers’ production, analysis and dissemination capabilities. Horizontal Integration (HI) requires access (including discovery, search, retrieval, and display) to intelligence data among the warfighters and other producers and consumers via standardized services and architectures. These consumers include, but are not limited to, the combatant commands, Services, Defense agencies, and the Intelligence Community.

Horizontal integration is achieved when multiple heterogeneous data resources become aligned or harmonized in such a way that search and analysis procedures can be applied to their combined content as if they formed a single resource. We describe here a methodology that is designed to achieve such alignment in a flexible and incremental way. The methodology is applied to the source data at arm’s length, in such a way that the data itself remains unaffected by the integration process.

Ironically, attempts to achieve horizontal integration have often served to consolidate the very problems of data stovepiping which they were designed to solve. Integration solution A is proposed; and works well for the data and purposes for which it was originally tailored; but it does not work at all when applied to new data, or to existing data that has to be used in new ways. Such failures arise for a variety of reasons, many of which have to do with the fact that integration systems are too closely tied to specific features of the (software/workflow) environments for which they have been developed. We propose a strategy for horizontal integration which seeks to avoid such problems by being completely independent of the processes by which the data store to which it is applied is populated and utilized. This strategy, which draws on standard features of what is now called ‘semantic technology’ [2], has been used successfully for over ten years to advance integration of the data made available to bioinformaticians, molecular biologists and clinical scientists in the wake of the successful realization of the Human Genome Project [3, 4]. The quantity and variety of such data – now spanning all species and species-interactions, at all life stages, at multiple granularity levels, and pertaining to thousands of different diseases – is at least comparable to the quantity and variety of the data which need to be addressed by intelligence analysts. As we describe in more detail in [5], however, today’s dynamic environment of military operations (from Deterrence to Crisis Response to Major Combat Operations) is one in which ever new data sources are becoming salient to intelligence analysis, in ways which will require a new sort of agile support for retrieval, integration and enrichment of data. We will thus address in particular how our strategy can be rapidly reconfigured to allow its application to emerging data sources.

The strategy is one of a family of similar initiatives designed both to rectify the legacy effects of data stovepiping in the past and to counteract the problems caused by new stovepipes arising in the future. It is currently being applied within the DCGS-A Standard Cloud (DSC) initiative, which is part of the Distributed Common Ground System-Army [6], the principal Intelligence, Surveillance and Reconnaissance (ISR) enterprise for the analysis, processing and exploitation of all US Army intelligence data, and which is designed to be interoperable with other DCGS programs. The DSC Cloud is a military program of record in the realm of Big Data that is accumulating data from multiple diverse sources and with high rapidity of change. In [5, 7] we described how the proposed strategy is already helping to improve search results within the DSC Cloud in ways that bring benefits to intelligence analysts. In this communication, we present the underlying methodology, describing also how it draws on resources developed in an incremental way that takes account of lessons learned in successive phases of application of the methodology to new kinds of data. Here we provide only general outlines. Further details and supplementary material are presented at [8].

II. OVERCOMING SEMANTIC STOVEPIPES

Every data store is based on some data model which specifies how the data in the store is to be organized. Since communities that develop data stores do so always to serve some particular purpose, each data model, too, is oriented around some specific purpose. Data models have been created in uncoordinated ways to address these different purposes, and they typically cannot easily be modified to serve additional purposes. Where there is a need to combine data from multiple existing systems, therefore, the tendency has been to invest what may be significant manual effort in building yet another data store, thereby contributing further to a seemingly never-ending process of data stovepipe proliferation.

To break out of this impasse, we believe, a successful strategy for horizontal integration must operate at a different level from the source data. It must be insulated from entanglements with specific data models and associated software applications, and it must be marked by a degree of persistence and of relative technological simplicity over against the changing source data to which it is applied.

The strategy we propose, which employs by now standard methods shared by many proponents of semantic technology [2], begins by focusing on the terms (labels, acronyms, codes) used as column headers in source data artifacts. The underlying idea is that it is very often the case that multiple distinct terms {t1, …, tn} are used in separate data sources with one and the same meaning. If, now, these terms are associated with some single ‘preferred label’ drawn from some standard set of such labels, then all the separate data items associated with the {t1, …, tn} will become linked together through the corresponding preferred labels.

Such sets of preferred labels provide the starting point for the creation of what are called ‘ontologies’, which are created (1) by selecting a preliminary list of labels in collaboration with subject-matter experts (SMEs); (2) by organizing these labels into graph-theoretic hierarchies structured in terms of the is_a (or subtype) relation and adding new terms to ensure is_a completeness; (3) by associating logical definitions, lists of synonyms and other metadata with the nodes in the resultant graphs. One assumption widespread among semantic technologists is that ontology-based integration is best pursued by building large ontology repositories (for example as at [9]), in which, while use of languages such as RDF or OWL is standardized, the ontologies themselves are unconstrained. Our experience of efforts to achieve horizontal integration in the bioinformatics domain, however, gives us strong reason to believe that, in order to counteract the creation of new (‘semantic’) stovepipes, we must ensure that the separate ontologies are constructed in a collaborative process which ensures a high degree of integration among the ontologies themselves. To this end, our strategy imposes on ontology developers a common set of principles and rules and an associated common architecture and governance regime in order to ensure that the suite of purpose-built ontologies evolves in a consistent and non-redundant fashion.

III. DEFINING FEATURES OF THE SE APPROACH

Associating terms used in source data with preferred labels in ontologies leads to what we call ‘Semantic Enhancement’ (SE) of the source data. The ontologies themselves we call ‘SE ontologies’, and the semantically enhanced source data together form what we call the ‘Shared Semantic Resource’ (SSR). To create this resource in a way that supports successful integration, our methodology must ensure realization of the following goals, which are common to many large-scale horizontal integration efforts:

• It must support an incremental process of ontology creation in which ontologies are constructed and maintained by multiple distributed groups, some of them associated with distinct agencies, working to a large degree independently.
• The content of each ontology must exist in both human-readable (natural language) and computable (logical) versions in order to allow the ontologies to be useful to multiple communities, not only of software developers and data managers, but also of intelligence analysts.
• Labels must be selected with the help of SMEs in the relevant domains. This is not because these labels are designed to be used by SMEs at the point where source data are collected; rather it is to ensure that the ontologies reflect the features of this domain in a way that coheres as closely as possible with the understanding of those with relevant expertise. Where necessary – for instance in cases where domains overlap – multiple synonyms are incorporated into the structure of the relevant ontologies to reflect usage of different communities of interest.
• Ontology development must be an arm’s-length process, with minimal disturbance to existing data and data models, and to existing data collection and management workflows and application software.
• Ontologies must be developed in an incremental process which approximates by degrees to a situation in which there is one single reference ontology for each domain of interest to the intelligence community.
• The ontologies must be capable of evolving in an agile fashion in response to new sorts of data and new analytical and warfighter needs.
• The ontologies must be linked together through logical definitions [10], and they must be maintained in such a way that they form a single, non-redundant and consistently evolving integrated network. The fact that all the ontologies in this network are being used simultaneously to create annotations of source data artifacts will in turn have the effect of virtually transforming the latter into an evolving single SSR, to which computer-based retrieval and analysis tools can be applied.

The ontology development strategy we advocate thus differs radically from other approaches (such as are propounded in [11]), which allow contextualized inconsistency. For while of course source data in the intelligence domain will sometimes involve inconsistency – the data is derived, after all, from multiple, and variably reliable, sources – to allow inconsistency among the ontologies used in annotations would, from our point of view, defeat the purposes of horizontal integration.

To achieve the goals set forth above, we require:

• A set of ontology development rules and principles, a shared governance and change management process, and a common architecture incorporating a common, domain-neutral, upper-level ontology.
• An ontology registry in which all ontology initiatives and emerging warfighter and analyst needs will be communicated to all collaborating ontology developers.
• A simple, repeatable process for ontology development, which will promote coordination of the work of distributed development teams, allow the incorporation of SMEs into the ontology development process, and provide a software-supported feedback channel through which users can easily communicate their needs, and report errors and gaps to those involved in ontology development.
• A process of intelligence data capture through ‘annotation’ [12] or ‘tagging’ of source data artifacts [7], whereby the preferred labels in the ontologies are associated incrementally with the terms embedded in source data models and terminology resources in such a way that the data in distinct data sources, where they pertain to a single topic, are represented in the SSR in a way that associates them with a single ontology term. Currently the annotation process is primarily manually driven, but it will in the future incorporate the use of Natural Language Processing (NLP) tools. Importantly, the process of annotation incrementally tests the ontologies against the data to which they must be applied, thereby helping to identify errors and gaps in the ontologies and thus serving as a vital ontology quality assurance mechanism [12].

IV. ONTOLOGICAL REALISM

The key idea underlying the SE methodology is that the successful application of ontologies to horizontal data integration requires a process for creating ontologies that is independent of specific data models and software implementations. This is achieved through the adoption of what is called ‘ontological realism’ [13], which rests on the idea that ontologies should be constructed as representations, not of data or of data models, but rather of the types of entities in reality to which the data relate.

The first step in the development of an ontology for a domain that has been identified as a target for intelligence analysis is thus not to examine what types of data we have about that domain. Rather, it is to establish in a data-neutral fashion the salient types of entities within the domain, and to select appropriate preferred labels for these types, drawing for guidance on the language used by SMEs with corresponding domain expertise. In addition, we rely on authoritative publications such as the capstone Joint Publication (JP) 1 of Joint Doctrine and the associated Dictionary (JP 1-02) [14, 15] (see Figure 1), applying adjustments where necessary to ensure logical consistency. The resultant preferred labels are organized into simple hierarchies of subtype and supertype, and each label is associated with a simple logical definition, along the lines illustrated (in a toy example) in Table 1.

vehicle =def. an object used for transporting people or goods
    personnel carrier =def. a vehicle that is used for transporting persons
    tractor =def. a vehicle that is used for towing
    crane =def. a vehicle that is used for lifting and moving heavy objects

vehicle platform =def. means of providing mobility to a vehicle
    wheeled platform =def. a vehicle platform that provides mobility through the use of wheels
    tracked platform =def. a vehicle platform that provides mobility through the use of continuous tracks

Table 1. Fragments of asserted ontologies

Figure 1 - Joint Doctrine Hierarchy

V. REALIZATION OF THE STRATEGY

There is a tension, in attempts to create a framework for horizontal integration of large and rapidly changing bodies of data, which turns on the fact that (1) to secure integration the framework needs to be free from entanglements with specific data models; yet (2) to allow effective representation of data, the framework needs to remain as close as possible to those same data models.

This same tension arises also for the SE approach, where it is expressed in the fact that:

(1) The SSR needs to be created on the basis of persistent, logically well-structured ontologies designed to be reused in relation to multiple different bodies of data; yet:
(2) To ensure agile response to emerging warfighter needs, its ontologies must be created in ways that keep them as close as possible to the new data that is becoming available locally in each successive stage.

To resolve this tension, the SE strategy incorporates a distinction between two sorts of ontologies, called ‘reference’ and ‘application’ ontologies, respectively. By ‘reference ontology’, we mean an ontology that captures generic content and is designed for aggressive reuse in multiple different types of context. Our assumption is that most reference ontologies will be created manually on the basis of explicit assertion of the taxonomical and other relations between their terms. By ‘application ontology’, we mean an ontology that is tied to specific local applications. Each application ontology is created by using ontology merging software [16] to combine new, local content with generic content taken over from relevant reference ontologies [17, 18], thereby providing rapid support for information retrieval in relation to particular bodies of intelligence data but in a way that streamlines the task of ensuring horizontal integration of this new data with the existing content of the SSR.

A. Principle of Single Inheritance

Our ontologies are ‘inheritance’ hierarchies in the sense that everything that holds (is true) of the entities falling under a given parent term holds also of all the entities falling under its is_a child terms at lower levels. Thus in Figure 2, for example, everything that holds of ‘vehicle’ holds also of ‘tractor’. Each reference ontology is required to be created around an inheritance hierarchy of this sort that is constructed in accordance with what we call the principle of asserted single inheritance. This requires that for each reference ontology the is_a hierarchy is asserted, through explicit axioms (subclass axioms in the OWL language), rather than inferred by the reasoner. In addition it requires that this asserted is_a hierarchy is a monohierarchy (a hierarchy in which each term has at most one parent). This requirement is imposed for reasons of efficiency and consistency: it allows the total ontology structure to be managed more effectively and more uniformly across distributed development teams – for example by aiding positioning and surveyability of terms. It brings also computational performance benefits [23] and provides an easy route (described in Section V.E below) to the creation of the sorts of logical definitions we will need to support horizontal integration. The principle of asserted single inheritance comes at a price, however, in that it may require reformulation of content – for example deriving from multi-inheritance ontologies already developed by the intelligence community – that is needed to support the creation of the SSR. Again, our experience in the biomedical domain is that such reformulation, while requiring manual effort, is in almost all cases trivial, and that, where it is not trivial, the effort invested often brings benefits in terms of greater clarity as to the meanings and interrelationships of the new terms that need to be imported into the SE framework.

B. A Simple Case Study

Imagine, now, that there is a need for rapid creation of an application ontology incorporating preferred labels to describe artillery units available to some specific military unit called ‘Delta Battery’. Such an ontology is enabled, first, by selecting from existing reference ontologies the terms needed to address the data in hand, for example of the sort used in Table 1. Second, we define supplementary terms needed for our specific local case, as in Table 2.

artillery weapon =def. device for projection of munitions beyond the effective range of personal weapons
artillery vehicle =def. vehicle designed for the transport of one or more artillery weapons
wheeled tractor =def. a tractor that has a wheeled platform
tracked tractor =def. a tractor that has a tracked platform
artillery tractor =def. an artillery vehicle that is a tractor
wheeled artillery tractor =def. an artillery tractor that has a wheeled platform
Delta Battery artillery vehicle =def. an artillery vehicle that is at the disposal of Unit Delta
Delta Battery artillery tractor =def. an artillery tractor that is at the disposal of Unit Delta
Delta Battery wheeled artillery tractor =def. a wheeled artillery tractor that is at the disposal of Unit Delta

Table 2: Examples of supplementary terms and definitions

Some of these terms may later be incorporated into corresponding asserted ontologies within the SE suite. For our present purposes, however, they can be understood as being simply combined together with the associated asserted ontology terms using ontology merging software, for example as developed by the Brinkley [17,19,17] and He [20,21] Groups. Because of the way the definitions are formulated, it is then possible to apply an automatic reasoner [22] to the result of merger to infer new relations, and thereby to create a new ontology hierarchy, as in Figure 2. Note that, in contrast to the reference ontologies from which it is derived, such an application ontology need not satisfy the principle of single inheritance. Note, too, that the definitions are exploited by the reasoner not only to generate the new inferred ontology, but also to test its consistency both internally and with the reference ontologies from which it is derived.

Figure 2. Inferred ontology of Delta Battery artillery vehicles. Child-parent links are inferred by the reasoner from the content of merged reference ontologies and from definitions of the supplementary terms. Note that some terms have multiple parents.

The strategy is designed to guarantee

(1) that salient reference ontology content is preserved in the new, inferred ontology in such a way that
(2) the latter can be used to semantically enhance newly added data very rapidly, and thereby
(3) bring about the horizontal integration of these data with all remaining contents of the SSR.

While ontology software has the capacity to support rapid ontology merger and consistency checking, we note that the inferred application ontology that is generated may on first pass fail to meet the local application needs. Thus, multiple iterations and investment of manual effort are needed. Requiring that all inferred ontologies rest on reference ontology content serves not only to ensure consistency, but also to bring about what we can think of as the normalization [23] of the evolving ontology suite. (This is in loose analogy with the process of normalization of a vector space, where a basis of orthogonal unit vectors is chosen, in terms of which every vector in the whole space can be represented in a standard way.)

A suite of normalized ontologies is easier to maintain, because globally significant changes – those changes which potentially have implications across the entire suite of ontologies – can be made in just one place in the relevant reference ontology, thereby allowing consequent changes in the associated inferred ontologies to be propagated automatically. This makes ontology-based integration easier to manage and scale, because when single-inheritance modules serve to constrain allowable sorts of combinations, this makes it easier to avoid problems of combinatorial explosion.

C. Modularity of Ontologies Designed for Reuse

The reference ontologies within the SE suite are to be conceived as forming a set of plug-and-play ontology modules such as the Organization Ontology, Geospatial Feature Ontology, Human Physical Characteristics Ontology, Event Ontology, Improvised Explosive Device Component Ontology, and so on. These modules need to be created at different levels of generality, with the architecture of the higher level reference ontologies being preserved as we move down to lower levels.

Each module has its own coverage domain, and the coverage domains for the more specific modules (for example artillery vehicle, military engineering vehicle) are contained as parts within the coverage domains of the more general modules (for example vehicle, equipment). It is our intention that the full SE suite of ontologies will mimic the sort of hierarchical organization that we find in the Joint Doctrine Hierarchy [15], and our strategy for identifying and demarcating modules will wherever possible follow the demarcations of Joint Doctrine. The goal is to specify a set of levels of greater and lesser generality: for example Intelligence, Operations, Logistics, at one level; Army Intelligence, Navy Intelligence, Air Force Intelligence, at the next lower level; and so on.
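As a rough illustration of the case study in Section V.B, the following Python sketch shows how genus-plus-differentia definitions of the kind given in Tables 1 and 2 can yield inferred parents, including multiple parents, from a hierarchy in which each term has only one asserted parent. It is a toy structural subsumption check written for this illustration, not the OWL reasoner [22] or the merging software referred to in the text. The dictionary `DEFINITIONS`, the helper names `features` and `inferred_parents`, and the short feature strings standing in for the differentiae are all assumptions introduced here.

```python
# Toy encoding of Tables 1 and 2 (an assumption of this sketch):
# term -> (single asserted genus, list of differentiae).
# A differentia that names another term means "is also a <term>".
DEFINITIONS = {
    "vehicle": (None, ["transports people or goods"]),
    "tractor": ("vehicle", ["used for towing"]),
    "artillery vehicle": ("vehicle", ["transports artillery weapons"]),
    "wheeled tractor": ("tractor", ["has wheeled platform"]),
    "tracked tractor": ("tractor", ["has tracked platform"]),
    "artillery tractor": ("artillery vehicle", ["tractor"]),
    "wheeled artillery tractor": ("artillery tractor", ["has wheeled platform"]),
    "Delta Battery artillery vehicle": (
        "artillery vehicle", ["at disposal of Unit Delta"]),
    "Delta Battery artillery tractor": (
        "artillery tractor", ["at disposal of Unit Delta"]),
    "Delta Battery wheeled artillery tractor": (
        "wheeled artillery tractor", ["at disposal of Unit Delta"]),
}


def features(term):
    """Expand a term into the full set of atomic features it entails."""
    genus, differentiae = DEFINITIONS[term]
    result = set() if genus is None else set(features(genus))
    for d in differentiae:
        # A differentia that is itself a defined term contributes its features.
        result |= features(d) if d in DEFINITIONS else {d}
    return result


def inferred_parents(term):
    """Return the most specific terms whose features are all entailed by `term`."""
    mine = features(term)
    # Every term with a strictly smaller feature set subsumes `term`.
    subsumers = [t for t in DEFINITIONS if features(t) < mine]
    # Keep only subsumers that are not themselves above another subsumer.
    return sorted(
        s for s in subsumers
        if not any(features(s) < features(o) for o in subsumers)
    )


if __name__ == "__main__":
    for name in DEFINITIONS:
        print(f"{name}: {inferred_parents(name)}")
```

In this sketch each entry has exactly one asserted genus, mirroring the principle of asserted single inheritance, while the inferred hierarchy can give a term such as 'artillery tractor' two parents ('artillery vehicle' and 'tractor'). A real deployment would rely on OWL axioms and a description logic reasoner, which also perform the consistency checks that this sketch omits.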
Ideally, the set of modules on each level are non-redundant in the sense that (1) they deal with non-overlapping domains of entities and thus (2) do not contain any terms in common. In this way the more general content at higher levels is inherited by the lower levels and thus does not need to be recreated anew. As the history of doctrine writing shows, drawing such demarcations and ensuring consistency of term use in each sibling domain on any given level is by no means easy. Here, however, we will have the advantage that the ontology resource we are creating is not designed to serve as a terminology and doctrine set for use by multiple distinct groups of warfighters. Rather, it is designed for use behind the scenes for the specific purpose of data discovery and integration. Thus it is assumed that disciplinary specialists will continue to use their local terminologies (and taxonomies) at the point where source data is being collected, even while, thanks to the intermediation of ontology annotation, they are contributing to the common SSR. At the same time, community-specific terms will wherever possible be added to the SE ontology hierarchies as synonyms. This will contribute not only to the effectiveness of ontology review by SMEs but also to the applicability of NLP technology in support of automatic data annotation.

Our goal is to build the SE ontology hierarchy in such a way as to ensure non-redundancy by imposing the rule that, for each salient domain, one single reference ontology module is developed for use throughout the hierarchy. Creating non-redundant modules in this way is, we believe, indispensable if we are to counteract the tendency for separate groups of ontology developers to create new ontologies for each new purpose.

D. Benefits of Normalized Ontology Modules

The grounding in modular, hierarchically organized, non-redundant, asserted ontology modules brings a number of significant benefits, of a sort which are being realized already in the biomedical ontology research referred to above [3]. First, it creates an effective division of labor among those involved in developing, maintaining and using ontologies. In particular, it allows us to exploit the existing disciplinary division of knowledge and expertise among specialists in the domains and subdomains served by the intelligence community. To ensure population of the ontologies in a consistent fashion, we are training selected SMEs from relevant disciplines in ontology development and use; at the same time we are ensuring efficient feedback between those who are using ontologies in annotating data and those who are maintaining the ontologies over time in order to assure effective update, including correction of gaps and errors.

Second, it ensures that the suite of asserted ontologies is easily surveyable: developers and users of ontologies can easily discover where the preferred label equivalents of given terms are to be found in the ontology hierarchy; they can also easily determine where new terms, or new branches, should be inserted into the SE suite. Thus, where familiar problems arise when mergers are attempted of independently developed ontologies and terminology content, the incremental approach adopted here implies that mergers will be applied almost exclusively (1) to the content of reference ontologies developed according to a common methodology and reviewed at every stage for mutual consistency and (2) to application ontology content developed by downward population from the evolving ontology suite.

E. Creating Definitions

The principle of single inheritance allows application of a simple rule for formulating definitions of ontology terms, whereby all definitions are required to have the form:

    an S = Def. a G which Ds

where ‘S’ (for: species) is the term to be defined, ‘G’ (for: genus) is the immediate parent term of ‘S’ in the relevant SE asserted ontology, and ‘D’ (for: differentia) is the species-criterion, which specifies what it is about certain G’s which makes them S’s. (Note that this rule can be applied consistently only in a context where every term to be defined has exactly one asserted parent.)

As more specific terms are defined through the addition of more detailed differentia, their definitions encapsulate the taxonomic information relating the corresponding type within the SE ontology to the sequence of higher-level terms by which it is connected to the corresponding ontology root. The task of formulating definitions thereby serves as a quality control check on the correctness of the constituent hierarchies, just as awareness of the hierarchy assists in the formulation of coherent definitions.

A further requirement is that the definitions themselves use (wherever possible) preferred labels which are taken over from other ontologies within the SE suite. Where appropriate terms are missing, the SE registry serves as a feedback channel through which the corresponding need can be transmitted to those tasked with ontology maintenance. The purpose of this requirement is to bring it about that the SE ontologies themselves will become incrementally linked together via logical relations in the way needed to ensure the horizontal integration of the data in the SSR that have been annotated with their terms. And the more logical definitions are added to the SE suite, the more its separate modules begin to act like a single, integrated network. All of this brings further benefits, including:

• Lessons learned in experience developing and using one module can be easily propagated throughout the entire system.
• The value of training in ontology development in any given domain module is increased, since the results of such training can easily be re-applied in relation to other modules.
• The incrementally expanding stock of available reference ontology terms will help to make it progressively easier to create in an agile fashion new application ontologies for emerging domains.
• The expanding set of logical definitions cross-linking the ontologies in the SE suite will mean that the use of ontology reasoners [22] for quality assurance of both asserted and inferred ontologies will become progressively more effective. These same reasoners will then be able to be used to check the consistency of the resultant annotations; and when inconsistencies are detected, these can be flagged as being of potential significance to the intelligence analyst.

VI. FROM DATA TO DECISIONS: AN EXAMPLE

Suppose, for example, that analysts are faced with a large body of new data pertaining to activities of organizations involved in the financing of terrorism through drug trafficking. The data is presented to them in multiple different formats, with multiple different types of labels (acronyms, free text descriptions, alphanumeric identifiers) for the types of organizations and activities involved.

To create a semantically enhanced and integrated version of these data for purposes of indexing and retrieval, analysts and ontology developers can use as their starting point the Organization Ontology which has already been populated with many of the general terms they will need across the entire domain of organizations, both military and non-military, formal and informal, family- or tribe- or religion-based, and so on. It will also contain the terms they need to define different kinds of member roles, organizational units and sub-units, chains of authority, and so on.

Adherence to the SE principles ensures that the Organization Ontology has been developed in such a way as to be interoperable, for example, with the Financial Event and Drug Trafficking Ontologies. Portions of each of these modules can thus be selected for merger in the creation of a new, inferred ontology, which can rapidly be applied to annotation of the new drug-financed terrorism data, which thereby becomes transformed from a mere collection of separate data sources into a single searchable store horizontally integrated within the SSR.

VII. UPPER-, MID- AND LOWEST-LEVEL ONTOLOGIES

The SE suite of ontologies is designed to serve horizontal integration. But it depends also on what we can now recognize as a vertical integration of asserted ontologies through the imposition of a hierarchy of ontology levels. In general, the SE methodology requires that all asserted ontologies are created via downward population from a common top-level ontology, which embodies the shared architecture for the entire suite of asserted ontologies – an architecture that is automatically inherited by all ontologies at lower levels.

Here, the level of an ontology is determined by the level of generality of the types in reality which its nodes represent.

[...] Formal Ontology 2.0 (BFO), which has been thoroughly tested in multiple application areas [8, 24]. Its role is to provide a framework that can serve as a starting point for downward population in order to ensure consistent ontology development at lower levels. Since almost all SE ontology development is at the lower levels within the hierarchy, BFO itself will in most cases be invisible to the user.

The Mid-Level Ontologies (MLOs) introduce successively less general and more detailed representations of types which arise in successively narrower domains until we reach the Lowest Level Ontologies (LLOs). These LLOs are maximally specific representations of the entities in a particular one-dimensional domain, as illustrated in Table 3. Some MLOs are created by adding together LLO component modules; for example, the Person MLO may be created by conjoining person-relevant ontology components from Table 3 such as: Person Name, Person Date, Hair Color, Gender, and so on. More complex MLOs will involve the use of reasoners to generate ontologies incorporating inferred labels such as ‘Male Adult’, ‘Female Infant’, and so on, along the lines sketched in Section V.B above.

Person Name (with types such as: FirstName, LastName, …)
Hair Color (with types such as: Grey, Blonde, …)
Military Role (with types such as: Soldier, Officer, …)
Blood Type (with types: O, A, …)
Eye Color (with types: Blue, Grey, …)
Gender (with types: Male, Female, …)
Age Group (with types: Infant, Teenager, Adult, …)
Person Date (with types: BirthDate, DeathDate, …)
Education History (with types: HighSchoolGraduation, …)
Education Date (with types: DateOfGraduation, …)
Criminal History (with types: FirstArrest, FirstProsecution, …)
Citizenship (based on ISO 3166 Country Codes)

Table 3. Examples of Lowest Level Ontologies (LLOs)

Figure 3 illustrates the rough architecture of the resultant suite of SE ontologies on different levels, drawing on the top-level architecture of Basic Formal Ontology.

VIII. CONCLUSION

In any contemporary operational environment, decision makers at all levels, from combatant commanders to tactical-level team leaders, need timely information pertaining to issues ranging from insurgent activity to outbreaks of malaria and from key-leader engagements to local elections. This requires the exploitation by analysts of a changing set of highly disparate databases and other sources of information, whose horizontal integration will greatly facilitate this data-to-decision cycle.
The Upper Level Ontology (ULO) in the SE The SE strategy is designed to create the resources hierarchy must be maximally general – it must provide a needed to support such integration incrementally, with high-level domain-neutral representation of distinctions thorough testing at each successive stage, and one of our between objects and events, objects and attributes, roles, current pilot projects is designed to identify the problems locations, and so forth. For this purpose we select the Basic which arise when the SE methodology is applied to support collaboration across distinct intelligence agencies, including exploring how independently developed legacy ontologies can be incorporated into the framework. REFERENCES [1] Chairman of the Joint Chiefs of Staff Instruction. J2 CJCSI 3340.02A. [2] P. Hitzler, M. Krötzsch and S. Rudolph, Foundations of Semantic Web Technologies, Chapman & Hall, 2009. [3] Barry Smith, et al., “The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration”, Nature Biotechnology, 25 (11), November 2007, 1251–1255. [4] Fahim T. Imam, et al., “Development and use of Ontologies Inside the Neuroscience Information Framework: A Practical Approach”, Frontiers in Genetics, 2012; 3: 111. [5] Barry Smith, et al., “Ontology for the Intelligence Analyst”, Crosstalk: The Journal of Defense Software Engineering (forthcoming). [6] Distributed Common Ground System - Army (DCGS-A) What is it? Pentagon Army Posture Statement, 27 December 2011. [7] David Salmen, et al., “Integration of Intelligence Data through Semantic Enhancement”, Proceedings of the Conference on Semantic Technology in Intelligence, Defense and Security (STIDS), George Mason University, Fairfax, VA, November 16-17, 2011, CEUR, Vol. 808, 6–13 [8] Supplementary material on Semantic Enhancement: http://ncorwiki.buffalo.edu/index.php/Semantic_Enhancement   [9] http://ontolog.cim3.net/cgi-bin/wiki.pl?OpenOntologyRepository.   [10] Chris J. 
Mungall et al., “Cross-product extensions of the Gene Figure 3. Organization of asserted ontologies Ontology”, Journal of Biomedical Informatics 44 (2007), 80–86.   Our work on using SE ontologies for purposes of [11] Douglas B. Lenat, “CYC: a large-scale investment in knowledge annotation has been executed thus far both manually and infrastructure”, Communications of the ACM, 38 (11), 1995 33-38.   [12]   David P. Hill, et al., “Gene Ontology Annotations: What they mean with NLP support. The results of this work have been found and where they come from”, BMC Bioinformatics, 2008; 9(Suppl 5): useful to indexing and retrieval of large bodies of data in the S2. DSC Cloud store. In our next phase we will test its capacity [13] Barry Smith and Werner Ceusters, “Ontological Realism as a to support rapid creation of application ontologies to address Methodology for Coordinated Evolution of Scientific Ontologies”, emerging analyst needs. In a subsequent, and more Applied Ontology, 5 (2010), 139–188. [14] Joint Publication 1, Doctrine for the Armed Forces of the United ambitious phase, we plan to explore the degree to which the States, Chairman of the Joint Chiefs of Staff. Washington, DC. 20 idea of semantic enhancement can be truly transformative in March 2009. the sense that it will influence the way in which source data [15] Joint Electronic Library: The Joint Publications. are collected and stored. We believe that such an influence [16] Z. Xiang, et al., “OntoFox: Web-Based Support for Ontology Reuse”, would bring a series of positive consequences flowing from BMC Research Notes. 2010, 3:175. the fact that the asserted ontologies will be focused [17] Marianne Shaw, et al., “Generating Application Ontologies from Reference Ontologies”, Proceedings, American Medical Informatics automatically upon (i.e. represent) the same entities in the Association Fall Symposium, 2008, 672-676. 
battlespace that the operators, analysts, and war-planners are [18] James Malone and Helen Parkinson, “Reference and Application concerned with, and they would treat these entities in the Ontologies.” same intuitively organized way. Thus while at this stage all [19] James F. Brinkley et al., “Project: Ontology Views.”   SE ontologies are free of entanglements with specific source [20] http://www.hegroup.org/ontoden/.   data models, our vision for the future is that the success of [21] J. Hur, et al., “Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network”, BMC the approach will provide ever stronger incentives for the Immunology 2011, 12:49   use of SE ontologies already in the field. These incentives [22] OWL 2 Reasoners, will exist, because using such ontologies at the point of data http://www.w3.org/2007/OWL/wiki/Implementations. collection will guarantee efficient horizontal integration [23] Rector, A. L. “Modularisation of Domain Ontologies Implemented in with the contents of the SSR, thereby giving rise to a Description Logics and Related Formalisms including OWL”. Proceedings of the 2nd International Conference on Knowledge network effect whereby not only the immediate utility of the Capture, ACM, 2003, 121–128. collected data will be increased, but so also will the value of [24] Pierre Grenon and Barry Smith, “SNAP and SPAN: Towards all existing data stored within the SSR. Dynamic Spatial Ontology”, Spatial Cognition and Computation, 4: 1 (March 2004), 69–103.
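Illustrative sketch. The composition of a Person MLO from LLO modules, and the generation of compound labels such as ‘Male Adult’, as described in Section VII, can be pictured with a few lines of Python. This is a minimal sketch under stated assumptions, not the SE tooling itself: the dictionary representation and the function names compose_mlo and cross_product_labels are hypothetical, only type names listed in Table 3 are used, and a simple cross product stands in for the OWL reasoner [22] that an actual implementation would presumably rely on.

```python
from itertools import product

# Lowest Level Ontologies (LLOs): each covers one one-dimensional domain.
# Type names are taken from Table 3; the dict layout is an assumption.
LLOS = {
    "Gender": ["Male", "Female"],
    "Age Group": ["Infant", "Teenager", "Adult"],
    "Hair Color": ["Grey", "Blonde"],
}


def compose_mlo(name, llo_names, llos):
    """Build a mid-level ontology by conjoining the selected LLO modules."""
    missing = [n for n in llo_names if n not in llos]
    if missing:
        raise KeyError(f"unknown LLO modules: {missing}")
    # Copy the type lists so the MLO does not alias the source modules.
    return {"name": name, "modules": {n: list(llos[n]) for n in llo_names}}


def cross_product_labels(mlo, first, second):
    """Derive compound labels (e.g. 'Male Adult') from two modules.

    Stand-in for reasoner-based inference: every pairing is generated,
    with no logical definitions or consistency checks applied.
    """
    modules = mlo["modules"]
    return [f"{a} {b}" for a, b in product(modules[first], modules[second])]


person = compose_mlo("Person", ["Gender", "Age Group"], LLOS)
labels = cross_product_labels(person, "Gender", "Age Group")
print(labels)
```

Because each LLO is kept one-dimensional, new compound labels come from combining existing modules rather than from editing them, which is the property the downward-population approach relies on.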