=Paper= {{Paper |id=Vol-1097/STIDS2013_T05 |storemode=property |title=IAO-Intel: An Ontology of Information Artifacts in the Intelligence Domain |pdfUrl=https://ceur-ws.org/Vol-1097/STIDS2013_T05_SmithEtAl.pdf |volume=Vol-1097 |dblpUrl=https://dblp.org/rec/conf/stids/SmithMRMSMDSP13 }} ==IAO-Intel: An Ontology of Information Artifacts in the Intelligence Domain== https://ceur-ws.org/Vol-1097/STIDS2013_T05_SmithEtAl.pdf
                                                         IAO-Intel
                     An Ontology of Information Artifacts in the Intelligence Domain

     Barry Smith                  Tatiana Malyuta           Ron Rudnicki              William Mandrick              David Salmen
 University at Buffalo            CUNY, NY, USA            CUBRC, Buffalo               Data Tactics                 Data Tactics
      NY, USA                 Data Tactics, McLean, VA        NY, USA                 McLean, VA, USA              McLean, VA, USA
           Peter Morosoff                   Danielle K. Duff          James Schoening                       Kesny Parent
            E-Maps, Inc.                         I2WD                      I2WD                                I2WD
         Washington, DC, USA               Aberdeen, MD, USA         Aberdeen, MD, USA                   Aberdeen, MD, USA


    Abstract—We describe on-going work on IAO-Intel, an              integration (for example in the case of registries of persons of
information artifact ontology developed as part of a suite of        interest) are all too familiar. Increasingly, however, it is
ontologies designed to support the needs of the US Army              recognized that there is the need for a unified approach to
intelligence community within the framework of the Distributed       description and classification of information resources (see for
Common Ground System (DCGS-A). IAO-Intel provides a                  example [3], [4]), and the DoD has recognized at an official
controlled, structured vocabulary for the consistent formulation     level that, to advance discoverability and analysis in the age of
of metadata about documents, images, emails and other carriers       Big (military) Data, new approaches are needed that can
of information. It will provide a resource for uniform explication   enable computational retrieval, integration and processing of
of the terms used in multiple existing military dictionaries,
                                                                     data. Thus Directive 8320.02 [5], the latest version of which is
thesauri and metadata registries, thereby enhancing the degree to
which the content formulated with their aid will be available to
                                                                     dated August 5, 2013, requires all authoritative DoD data
computational reasoning.                                             sources to be registered in the DoD Data Services
                                                                     Environment (DSE) [6]. It further requires that all salient
    Keywords—ontology; information artifacts; military doctrine;     metadata be discoverable, searchable, retrievable, and
intelligence analysis; interoperability; data services environment   understandable:
                                                                            Data, information, and IT services will be considered
                         I.     BACKGROUND
                                                                            understandable when authorized users are able to consume them and
   Standardization of terminology has been important from the               when users can readily determine how those assets may be used
very beginning of organized warfare. Imagine the Chinese                    for specific needs. Data standards and specifications that require
trying to pass reports down the Great Wall using fire beacons               associated semantic and structural metadata, including
without standardization of the signals used. In the                         vocabularies, taxonomies, and ontologies, will be published in
Revolutionary War, General Washington directed Friedrich                    the DSE, or in a registry that is federated with the DSE.
Wilhelm von Steuben to write the drill manual for the                We shall return to the DSE below. First, we present our own
Continental Army [1] so that all units would use and respond         strategy for realizing these important goals.
uniformly to the same commands.
                                                                                II.     THE INFORMATION ARTIFACT ONTOLOGY
   In our own era, DoD has directed development and use of
the DoD Dictionary of Military and Associated Terms (Joint               The Information Artifact Ontology (IAO) was originally
Publication 1-02) as the paramount terminological standard for       conceived in 2008 as part of an effort to master the Big Data
military operations [2]. JP 1-02 helps to enable joint warfare       accumulating in the wake of the Human Genome Project in
by (a) advancing consistency in communications and (b)               the context of biological research [7]. Its goal was to aid the
facilitating consistent interpretation of commands. Military         consistent description of biological data emanating from
dictionaries and related terminology artifacts continue to be        multiple heterogeneous sources. The goal of IAO-Intel is
developed, addressing these and a series of additional aims,         analogous: it is to provide common resources for the
including: (c) compiling lessons learned (outcomes assessment);      consistent description of information artifacts of relevance to
(d) providing controlled vocabularies for official reporting;        the intelligence community in a way that will allow discovery,
and (e) enhancing discoverability and analysis of data.              integration and analysis of intelligence data from both official
                                                                     and non-official sources.
    Such artifacts have until recently been conceived by
analogy with traditional free-text dictionaries published in             When biomedical informaticians work with databases,
forms designed to maximize utility to human beings. Most             publications and records generated by experimental research
existing doctrinal and related lexica and thesauri not only          or medical care they focus primarily on what these artifacts
provide little aid to computation, they also suffer from the fact    describe (for example on the genes or proteins which form the
that multiple such resources have been (and continue to be)          subject matters of a given journal publication, or on the
developed independently, in divergent and often non-                 symptoms or diseases reported in a given clinical note).
principled ways. The result is that identical data may be            Similarly, when intelligence analysts work with source data
classified and described entirely differently by different           artifacts, then they, too, focus primarily on what the data in
agencies, and the consequences of the resultant failures of          these artifacts describe, for example on the military units



                                                    STIDS 2013 Proceedings Page 33
whose movements are recorded in a given shipping report, or              which involve a particular creator, or a particular type of
on the vulnerabilities of a given forward operations base as             intelligence report, or a particular type of weblink, or have
described in some force protection assessment.                           been declassified under the authority of a particular agency, or
                                                                         are operative within a given time window.
    But while the primary focus concerns in both cases the
topic or subject of the artifacts in question, both also require a           Importantly, IAO-Intel is not designed to replace existing
secondary focus, targeted to the artifacts themselves, through           doctrinal or other standards created to guide human beings or
which information about these topics is conveyed. Such                   computer applications in the creation and description of
artifacts have attributes – including format, purpose, evidence,         documents in accordance with defined formats or document
provenance, operational relevance, security markings – data              architectures. Rather, its purpose is to allow the results of
concerning which (often called ‘metadata’) is vital to the               using such standards to generate the needed metadata in a
effective exploitation of the reports, images, or signals                uniform, non-redundant and algorithmically processable
documents with which the analyst has to deal.                            fashion. Moreover, the broad scope of IAO-Intel means that
                                                                         the metadata generated in relation to official documents will
   The dichotomy between focus on entities in the world and              be of a piece with the metadata incrementally accumulating in
focus on the information artifacts in which these entities are           relation to all information artifacts of relevance to the IC – the
represented is fundamental to the work reported here. IAO                metadata will consist, in every case, of annotations to IAs
relates precisely to the objects of this secondary focus. An             formulated in ontology terms drawn not only from IAO-Intel
information artifact (IA), as we conceive it, is an entity that          but from the entire suite of DCGS-A ontology modules.
has been created through some deliberate act or acts by one or
more human beings, and which endures through time,                           Thus while using existing standards for human or
potentially in multiple (for example digital or printed) copies.         computer-aided creation or description of IAs does indeed
IAO thus deals with information in the forms it takes when it            allow us to retrieve data pertaining to IAs prepared in
has been deliberately fixed in some medium in such a way as              accordance with these standards, for IAs of other sorts the
to become accessible to multiple subjects. Examples are: a               existing approach will fail. Only an ontology-based approach
diagram on a sheet of paper, a video file, a map on a computer           along the lines here proposed can, we believe, demonstrate the
monitor, an article in a newspaper, a message on a network,              sort of flexibility and consistent expandability which are
the output of some querying process in a computer memory.                needed in today’s dynamic and data-rich environments.
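
The kind of retrieval described for IAO-Intel (all artifacts involving a particular creator, a particular type of report, a particular declassifying agency, or a given time window) can be illustrated with a minimal Python sketch. The record fields, the function name retrieve and the sample data are all assumptions made for illustration; they are not part of IAO-Intel or of any DCGS-A system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ArtifactMetadata:
    """Hypothetical metadata record for one information artifact (IA)."""
    ia_id: str
    artifact_type: str               # e.g. "Intelligence Report"
    creator: str
    declassified_by: Optional[str]   # agency name, or None if not declassified
    operative_from: date
    operative_to: date


def retrieve(records: Iterable[ArtifactMetadata], *,
             creator: Optional[str] = None,
             artifact_type: Optional[str] = None,
             declassified_by: Optional[str] = None,
             operative_on: Optional[date] = None) -> List[ArtifactMetadata]:
    """Return the records that satisfy every criterion that was supplied."""
    hits = []
    for rec in records:
        if creator is not None and rec.creator != creator:
            continue
        if artifact_type is not None and rec.artifact_type != artifact_type:
            continue
        if declassified_by is not None and rec.declassified_by != declassified_by:
            continue
        if operative_on is not None and not (
                rec.operative_from <= operative_on <= rec.operative_to):
            continue
        hits.append(rec)
    return hits


# Invented sample data, for illustration only.
collection = [
    ArtifactMetadata("IA-1", "Intelligence Report", "analyst-a", None,
                     date(2013, 1, 1), date(2013, 6, 30)),
    ArtifactMetadata("IA-2", "Intelligence Report", "analyst-b", "Agency X",
                     date(2013, 3, 1), date(2013, 12, 31)),
    ArtifactMetadata("IA-3", "Target List", "analyst-a", "Agency X",
                     date(2013, 5, 1), date(2013, 5, 31)),
]

reports_in_may = retrieve(collection, artifact_type="Intelligence Report",
                          operative_on=date(2013, 5, 15))
print([rec.ia_id for rec in reports_in_may])
```

Such filtering only works across heterogeneous sources if every source describes creator, type and validity period with the same terms, which is the uniformity the ontology is intended to supply.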

                    III.   GOAL OF IAO-INTEL                                       IV.     EXPLICATION AND ANNOTATION
  The goal of IAO-Intel is to support the effective handling of              Currently a draft version of IAO-Intel is being applied
data concerning those attributes of IAs that are relevant to the         within the framework of the US Army’s Distributed Common
purposes of intelligence analysis. To describe such attributes           Ground System (DCGS-A) Standard Cloud (DSC) initiative as
coherently we need to distinguish:                                       part of a strategy for the horizontal integration of warfighter
                                                                         intelligence data [9]. Two sorts of application are currently
– the particular information artifact of interest, tied to some          being used to enable the ontology to support computer-aided
  particular physical information bearer: the photographic               retrieval and analytics. The first is the explication of general
  image on this piece of paper retrieved from this enemy                 terms used in source intelligence artifacts and in data models,
  combatant; the email created by this particular author on this         terminologies and doctrinal publications which provide typologies
  specific laptop; the target list compiled for this particular          of intelligence-related IAs. The second is the annotation of
  artillery unit on this particular date;                                the instance-level information captured by such IAs.
– the copyable information content that is carried by the                    Explication is performed by providing definitions of such
  artifact in question. The photographic image may be printed            general terms using the resources of IAO-Intel and of the
  out in multiple paper copies; the email or target list may be          domain ontologies (such as Agent or Event ontologies) being
  transmitted to multiple further recipients. The information            developed within the DCGS-A framework. Annotation is
  content that is copied or transmitted thereby remains in each          performed by associating ontology terms with data about
  case one and the same.                                                 particular persons, events, or places in given information artifacts.
IAO-Intel provides ontology terms relating both to official
documents and to non-official (source) artifacts. It provides               TABLE 1. SAMPLE TYPES AND SUBTYPES OF INFORMATION ARTIFACTS
also a set of relations to be used when we wish to represent the           IAO           IAO-Intel (examples)
fact that, say, IA #12345 is-about some given person, or uses-             Report        Intelligence Report (FM 6-99.2, 126)
symbols-from some specified symbology, or links-to some
second IA #56789, and so forth.                                            Summary       Electronic Warfare Mission Summary (FM 6-99.2, 87)
                                                                           Diagram       Network Analysis Diagram (from JP 2-01.3, II-51)
    IAO-Intel is designed from the start to provide the needed             Overlay       Combined Information Overlay (JP 2-01.3, II-33)
supplement in a way that will create semantic interoperability             Assessment    Assessment of Impact of Damage (FM 6-99.2, 53)
of data retrieved from different types of sources through an
incremental process of semantic enhancement as described in
                                                                           Estimate      Adversary Course of Action Estimate
[8], [9] and [10]. It is designed to allow automatic retrieval of
all documents in a given collection of heterogeneous sources               List          List of High-Value Targets (JP 2-01.3, II-61)




  Order           Airspace Control Order (FM 6-99.2, 17)                       annotated using different standard terminology resources. To
  Matrix          Target Value Matrix (JP 2-01.3, II-63)                       bring this about, the constituent terms of such resources will
                                                                               be explicated using terms from IAO-Intel so that the artificial
  Template        Ground and Air Adversary Template (JP 2-01.3, II-57)         composite terms used in certain official terminologies and
    The goal of explication is to ensure that the data captured                exchange model resources (along the lines of
in annotations is semantically enhanced in a way that enables                  ‘VehicleInspectionJurisdictionAuthorityText’) will be broken
computational integration and reasoning along the lines                        down logically into constituent elements. This will provide a
described in [11], [12]. The goal of annotation is to aid                      means to avoid the combinatoric explosion that is threatened
retrieval of information about specific persons, groups, events,               by traditional approaches. Some composite expressions – for
documents, images, and so forth, where this information is                     example ‘Essential Element of Friendly Information (EEFI)’ –
conveyed through source documents using disjointed and                         will indeed be included in pre-composed form in the IAO-Intel
disparate systems for designation.                                             ontology, but only where they are either defined in doctrine or
                                                                               already established as part of relevant SME vocabularies.
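
The combinatoric point can be made concrete with a small Python sketch. It compares the number of pre-composed terms needed to cover every combination of a few component vocabularies with the number of component terms themselves. The CamelCase split is a purely lexical stand-in for the logical decomposition described here, and the component vocabularies are invented for illustration.

```python
import re
from itertools import product


def split_composite(term):
    """Split a CamelCase composite term into its capitalized word components."""
    return re.findall(r"[A-Z][a-z]+", term)


components = split_composite("VehicleInspectionJurisdictionAuthorityText")

# Invented component vocabularies, one list per dimension.
subjects = ["Vehicle", "Vessel", "Aircraft"]
activities = ["Inspection", "Registration"]
aspects = ["Jurisdiction", "Ownership"]
representations = ["Text", "Code"]
dimensions = [subjects, activities, aspects, representations]

# Pre-composed approach: one term for every combination across the dimensions.
precomposed_terms = ["".join(combo) for combo in product(*dimensions)]

# Component approach: each simple term is defined once and combined when annotating.
component_terms = [term for dimension in dimensions for term in dimension]

print(components)
print(len(precomposed_terms), "pre-composed terms versus",
      len(component_terms), "component terms")
```

The pre-composed count grows as the product of the vocabulary sizes, while the component count grows only as their sum, so the gap widens with every term or dimension added.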
             V.     STRATEGY FOR BUILDING IAO-INTEL                                The modeling task for which compounds such as
                                                                               ‘VehicleInspectionJurisdictionAuthorityText’ were designed
   Our strategy for building IAO-Intel is to extend the draft                  is addressed in our framework by allowing single data entries
IAO to include terms and definitions tailored for the                          to be annotated by multiple ontology terms (sometimes linked
intelligence domain and specifically for the needs of our DCGS-A               by appropriate relations). A record in one of the tables
ontology initiative. The strategy has the following parts.                     containing data about an IED can be annotated, for example,
                                                                               both with ‘IED Event’ (based on its aboutness) and with
    First, IAO-Intel is created by downward population from
                                                                               ‘EEFI’ (based on its importance). A particular plan for the
the draft IAO reference ontology. That is, the highest level
                                                                               Intelligence Preparation of the Battlefield can be annotated as
terms of IAO-Intel are defined as specializations of terms from
                                                                               being at the same time a Plan (based on its purpose), a
IAO along the lines illustrated in Table 1. The coverage
                                                                               Government Document (based on its source), and a Report on Air
domain of IAO-Intel will be determined incrementally on the
                                                                               Defenses (based on its aboutness). It can be annotated also
basis of requests from analysts and other SME communities and
                                                                               through relations, for example through located-at linking the
through incorporation of terms from doctrinal publications and
                                                                               source of the plan to some city or building and linking the
relevant high-level data models and document classifications.
                                                                               planned air defenses to some region of interest.
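
A minimal Python sketch of annotating single data entries with multiple ontology terms follows. It uses the two examples given here (the IED record and the Intelligence Preparation of the Battlefield plan). The class AnnotatedEntry, the helper find_by_terms and the entry identifiers are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AnnotatedEntry:
    """A data entry carrying several ontology-term annotations (hypothetical structure)."""
    entry_id: str
    # (ontology term, basis for the annotation), e.g. ("Plan", "purpose")
    annotations: List[Tuple[str, str]] = field(default_factory=list)
    # (subject, relation, object) triples, e.g. ("source of plan", "located-at", "some city")
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def annotate(self, term: str, basis: str) -> None:
        self.annotations.append((term, basis))

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self.relations.append((subject, relation, obj))

    def terms(self) -> set:
        return {term for term, _ in self.annotations}


def find_by_terms(entries, *required_terms):
    """Return the entries annotated with all of the required ontology terms."""
    wanted = set(required_terms)
    return [entry for entry in entries if wanted <= entry.terms()]


ied_record = AnnotatedEntry("record-17")
ied_record.annotate("IED Event", "aboutness")
ied_record.annotate("EEFI", "importance")

ipb_plan = AnnotatedEntry("plan-04")
ipb_plan.annotate("Plan", "purpose")
ipb_plan.annotate("Government Document", "source")
ipb_plan.annotate("Report on Air Defenses", "aboutness")
ipb_plan.relate("source of plan", "located-at", "some city or building")
ipb_plan.relate("planned air defenses", "located-at", "some region of interest")

matches = find_by_terms([ied_record, ipb_plan], "Plan", "Government Document")
print([entry.entry_id for entry in matches])
```

Because each annotation is a simple term drawn from one dimension, a query can combine dimensions freely without any composite term having been defined in advance.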
    Second, we use these sources to identify the dimensions of
                                                                                   Currently, military terminology resources generally fail to
attributes along which IAs will be annotated. The selected
                                                                               follow established best practice principles for the formulation
dimensions are constructed in such a way as to be orthogonal
                                                                               of definitions. For example, they often confuse terms referring
in the sense in which, for example, color is orthogonal to
                                                                               to components of information artifacts with terms referring to
shape – thus ontology branches built to represent different
                                                                               the entities in reality which those information artifacts are
dimensions of attributes will contain no terms in common.
                                                                               about. The “WTI Improvised Explosive Device” Glossary, for
This will enable these branches to be structured following the
                                                                               example, defines Method of Emplacement as:
principle of single inheritance (thus as true hierarchies) [13].
                                                                                    The description of where the [improvised explosive] device was
   Third, we create low-level ontology modules (LLOs)                               delivered, used or employed.
corresponding to each of these orthogonal dimensions. LLOs
are small single-dimension attribute lists or shallow                          Similarly, the DCGS-A Logical Data Model defines
hierarchies designed to advance ease of maintenance and                        CoverConcealment as:
surveyability of the ontology and to provide a growing set of                       information about geographical features that provide protection
simple component terms which can be used:                                           from attack or observation.
1. to construct more complex terms, both terms for inclusion
   in IAO-Intel, and terms to be used to generate inferred                     Use of IAO-Intel in tandem with corresponding domain
   classifications in application ontologies created for specific              ontologies allows us to explicate CoverConcealment (properly
   local purposes, along the lines described in [10];                          so-called) as:
2. to define the terms of the IAO-Intel ontology and of its                         a geographic feature which has-role CoverRole,
   sister ontologies within the DCGS-A framework;
                                                                               and to explicate CoverConcealmentInformation as:
3. to explicate the meanings of terms standardly used by
   different agencies, or by different groups of SMEs, or by                        IA which is-about CoverConcealment,
   different existing and future systems to describe such                      where CoverRole is defined as:
   artifacts in a logically consistent way that is designed to
   allow integration of data and enhanced analytics;                                the Role acquired by a given geographic feature when it is used
                                                                                    to provide protection from attack or observation.
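
The explications of CoverConcealment and CoverConcealmentInformation given above can be represented as data and unfolded mechanically, as in the following minimal Python sketch. The class Explication, the function expand and the CamelCase class names GeographicFeature and InformationArtifact are assumptions made for illustration; only the relations has-role and is-about and the terms CoverConcealment, CoverConcealmentInformation and CoverRole come from the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Explication:
    """One explication of the form '<genus> which <relation> <filler>'."""
    genus: str      # parent class
    relation: str   # e.g. "has-role", "is-about"
    filler: str     # term the relation points to


# The two explications quoted above; class names are CamelCase by assumption.
EXPLICATIONS: Dict[str, Explication] = {
    "CoverConcealment": Explication("GeographicFeature", "has-role", "CoverRole"),
    "CoverConcealmentInformation": Explication(
        "InformationArtifact", "is-about", "CoverConcealment"),
}


def expand(term: str, table: Dict[str, Explication]) -> str:
    """Recursively unfold a term into its explication; assumes the table has no cycles."""
    explication = table.get(term)
    if explication is None:
        return term  # primitive term with no further explication in this table
    return (f"({explication.genus} which {explication.relation} "
            f"{expand(explication.filler, table)})")


print(expand("CoverConcealmentInformation", EXPLICATIONS))
```

Keeping the geographic feature and the information about it as separate entries is what prevents the confusion between artifacts and the entities they are about.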
4. to annotate instance data pertaining to particular
   information artifacts used by the intelligence community –                         VI.     MAINTAINING AND EVALUATING IAO-INTEL
   for instance analysts’ reports; harvested emails; signals
   data; and so forth.                                                              To maintain the IAO-Intel term collection over time we
                                                                               will create feedback links to enable users of the ontology to
The goal is that IAO-Intel should support integration of data                  request new terms and to report errors. We are also working




on an objective validation process which will enable us to           – Information Quality Entity (IQE). An IQE is the pattern on
determine how requested terms should be treated,                       an IBE in virtue of which it is a bearer of some information.
distinguishing options such as: 1. incorporation into IAO-Intel      – Information Structure Entity (ISE). An ISE is a structural
or into some associated reference ontology, 2. incorporation           part of an ICE; speaking metaphorically, it is an ICE with
into an application ontology maintained for some local                 the content removed: for example an empty cell in a
purpose, 3. being marked as a synonym of some existing                 spreadsheet; a blank Microsoft Word file. ISEs thus capture part of
ontology term.                                                         what is involved when we talk about the ‘format’ of an IA.
     We are identifying, and where necessary constructing de
novo, the domain ontologies that will need to be used in the         The term ‘information artifact’ can now be used to refer either
definition of complex terms, and defining the relations that         1. to some combination of ICEs and ISEs (roughly: the IA as
will link IAO-Intel terms with terms in these domain                 body of copyable information content); or 2. to some
ontologies. These ontologies, too, will be extended over time        concretization of ICEs and ISEs in some IBE in which some
on the basis of input from users.                                    IQE inheres (the information artifact is: this content here and
                                                                     now, on this specific computer screen or this printed page).
     We are also testing a series of objective criteria to be used   Different information artifact types will differ in different
in evaluation of IAO-Intel and other DCGS-A ontologies,              ways along these dimensions, as illustrated in Table 2.
starting with simple numerical measures of (a) term requests
received and dealt with, and (b) uses of terms in definitions,
explications and annotations. IAO-Intel will allow us to keep          BFO: Independent Continuant: Information Bearing Entity (IBE)
track of the number of information artifacts that make                 BFO: Generically Dependent Continuant: Information Content Entity (ICE),
reference to individuals falling under a given class, and these           Information Structure Entity (ISE)
metrics too can be used to assess the relative importance of           BFO: Specifically Dependent Continuant: Information Quality Entity
this class within the ontology framework taken as a whole.                (Pattern) (IQE)
While not definitive, such measures will help guide our
judgments concerning the content and structure both of IAO-
Intel and of its associated domain ontologies.

              VII. ORGANIZATION OF IAO-INTEL
     Given the importance of the dichotomy between primary                          Figure 1. Continuants in the IAO framework
(topic) and secondary (artifact) focus, a central role in IAO-
Intel is played by what we call Information Content Entities.                VIII. IAO AND THE BASIC FORMAL ONTOLOGY
   Information Content Entities (ICEs) are about something              Figure 1 shows how IAO and IAO-Intel are being built to
    in reality (they have this something as a subject; they          conform to Basic Formal Ontology (BFO), the upper-level
    represent, or mention or describe this something; they           architecture used in the DCGS-A ontologies [14]. IBEs are, in
    inform us about this something). Aboutness may be                BFO terms, independent continuants (they are entities made of
    identifiable from different perspectives. Thus one analyst       physical matter). An IBE is a physical entity that is created or
    may interpret a given ICE as being about the geography           modified to serve as bearer of certain patterned arrangements
    of a given encampment; another may view it as providing          – for example of ink or other chemicals, of electromagnetic
    information about the morale of those encamped there.            excitations. An IQE is a quality of an IBE which exists in
                                                                     virtue of such patterned arrangements and which is
All major classes of information artifacts involve ICEs –
                                                                     interpretable as an ICE or ISE. Such an IQE is created when
simply because all major classes of information artifacts are
                                                                     some physical artifact is deliberately created or modified to
about something. A plan of action, for example, is about a
                                                                     support it (patterned to serve as its bearer). IQEs are
certain group of persons and goals and the types and ordering
                                                                     BFO:specifically dependent continuants (SDCs) – entities
of actions that will be used to realize these goals. Even a
                                                                     which require some specific physical bearer but which are not
document that has been written in code will be assumed by an
                                                                     themselves physical. Each IBE and IQE is restricted at any
analyst to be about something (for what, otherwise, would be
                                                                     given time to some specific location in space. (If you display
the reason for its creation?). Typically, an information artifact
                                                                     the same digital image twice on your desktop, then there are
such as a copy of a newspaper will be associated with multiple
                                                                     two IQEs on your desktop, which are – at some level of
ICEs at successive levels of granularity, including separate
                                                                     granularity – indistinguishable copies of each other.)
articles within the newspaper, separate sentences within these
articles, and so on.                                                     ICEs and ISEs, in contrast, are what BFO calls generically
                                                                     dependent continuants or GDCs. This means that they are
   In addition to ICEs, we distinguish also:
                                                                     entities – such as a pdf file or an email – which can be copied
– Information Bearing Entity (IBE). An IBE is a material             from one physical bearer to another and thus may exist
  entity that has been created to serve as a bearer of               simultaneously in multiple different IQEs, which are called
  information. IBEs are either (1) self-sufficient material          ‘concretizations’ of the corresponding GDC. Each GDC is
  wholes, or (2) proper material parts of such wholes.               concretized by at least one specific IQE inhering for example
  Examples under (1) are: a hard drive, a paper printout (e.g.,      in the tiny piles of ink on the piece of paper in your pocket or
  a report); and under (2): a specific sector on a hard drive, a     in differentially excited pixels on your screen. When the GDC
  single page of a paper printout.




is copied, then a new IQE is created on a new physical
information bearer, as when a new pattern of characters is
created on the screen of the recipient of an email. This second
pattern is a copy of the pattern created on the screen of the
sender. The GDC itself exists simultaneously both at its
original site and at the site to which it has been transmitted.
GDCs can thus be multiply located.

     BFO relations between ICEs, ISEs, IQEs and IBEs can be
set forth as follows:

         ICE generically-depends-on IBE
         ISE generically-depends-on IBE
         IQE specifically-depends-on IBE
         ICE concretized-by IQE
         ISE concretized-by IQE

    IAO contains in addition relations which allow us to
formulate metadata concerning attributes of IAs such as
author, creation date, classification status, and so forth, and to
annotate also components of IAs such as the To- and
FromAddress components of email headers. The ToAddress of
email message m, for example, is defined as:

    a collection of one or more email addresses of the intended
    recipients of m, each with at most one optionally associated name.

The set of relations can be extended to include also relations
involving documents, document parts and document
collections, such as retrieved-from, curated-by, and so forth.

    When we consider examples such as those provided in
Table 2, it becomes clear that, when IAO-Intel is applied
to the explication of terms involved in describing instance
data relating to real-world IAs, multiple artifacts may
need to be distinguished. Consider, for example, a pdf file
stored on some specific laptop. When we address what is
meant by the (copyable) content of this file, we recognize
that this content may be copied in multiple ways, for example:
to a pdf file using the same version of the Acrobat software
and on the same operating system, to a pdf file using a
different version of the Acrobat software, using characters
from the same or a different character set, by being printed out
on a piece of paper, and so on. The annotation of instance data
with information of this sort may be important for example in
investigating the provenance of given information artifacts
which lie at the end of long chains of copying and processing
involving multiple authors and computer systems. One
potential application of IAO-Intel is to the systematic
annotation of data pertaining to such chains.

    Matters are complicated further when we go deeper into
the question of how IAs are stored inside the computer. Given
a generically dependent continuant which is the pdf file stored
on the hard drive of some given laptop, there is a specifically
dependent IQE which is (roughly) the pattern of 1s and 0s in
the magnetic coating of the hard drive. When the entirety of
this pdf file is displayed on your screen, then there is a further
specifically dependent IQE which is the corresponding pattern
of pixels on your screen. Both of these IQEs are concretizations
of a corresponding GDC.

    Note that we do not assume that all portions of IAO-Intel
will be of equal utility in applications for the IC. We do,
however, believe that achieving clarity of explication in the
treatment of source data artifacts will require clear definitions
of the upper-level terms in the IAO, and a clear understanding
of the relations between them.

          TABLE 2: DIMENSIONS OF INFORMATION ARTIFACTS (IAS)

Information Artifact | IBE | ISE | ICE
MS Word file (.doc, .docx) | Hard drive (magnetized sector) | MS Word format | Varies
XML file | Hard drive (magnetized sector) | XML V 2.0 format | Varies
MS Excel 2010 file (.xls, .xlsx) | Hard drive (magnetized sector) | MS Excel 2010 format | Varies
KML file | Hard drive (magnetized sector) | KML | Map overlay
JPEG file (.jpg) | Hard drive (magnetized sector) | JPEG format | Image
Email file (with embedded attachments) | Hard drive (magnetized sector) | Internet Message Format (e.g., RFC 5322 compliant) | Message
USMTF Message file | A specific government network | USMTF Format | Message
Passport | Paper document (may include photographs, RFID tags) | ID formats, security marking formats … | Name, Personal data, Passport number, Visas …
Title Deed | Official paper document | Varies | Varies
Report | Varies | Varies | Varies
Overlay Sheet (e.g. Map Overlay Sheet – see Figure 2) | Acetate sheet | MIL-STD-2525 Symbols; FM 101-1-5 Operational Terms and Graphics | Map overlay

          IX. ATTRIBUTES OF INFORMATION ARTIFACTS

    Information artifacts have attributes along a number of
distinct dimensions, treated in LLO modules of the IAO.
Terms in these modules will be applied to explicate
information relating to IAs of different types, and to annotate
data pertaining to IA instances with the help of relations
mentioned above. Some dimensions of IA attributes are
common to all areas, both military and non-military,
including: Purpose, Lifecycle Stage (draft, finished version,
revision), Language, Format, Provenance, Source (person,
organization), and so forth.
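
    The relations mentioned above between ICEs, ISEs, IQEs and IBEs can
be made concrete in a short sketch. The Python below is illustrative only
and is not part of IAO or IAO-Intel: the class names IBE, IQE and GDC follow
the abbreviations used in the text, while the function concretize, the
attribute names and the example labels are hypothetical. It models the pdf
example, in which a single GDC is concretized by one IQE on the hard drive
and by a further IQE on the screen.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IBE:
    """A material bearer of information, e.g. a hard drive or a screen."""
    label: str


@dataclass
class IQE:
    """A pattern that specifically-depends-on exactly one bearer."""
    pattern: str
    bearer: IBE


@dataclass
class GDC:
    """A copyable entity (an ICE or an ISE) that is concretized-by IQEs."""
    label: str
    kind: str  # "ICE" or "ISE"
    concretizations: List[IQE] = field(default_factory=list)

    def bearers(self) -> List[IBE]:
        # generically-depends-on: the GDC needs some bearer, not a particular one
        return [iqe.bearer for iqe in self.concretizations]


def concretize(gdc: GDC, pattern: str, bearer: IBE) -> IQE:
    """Hypothetical helper: copying a GDC creates a new IQE on a bearer."""
    iqe = IQE(pattern, bearer)
    gdc.concretizations.append(iqe)
    return iqe


pdf_file = GDC("pdf file", kind="ICE")
hard_drive = IBE("hard drive of the laptop")
screen = IBE("screen of the laptop")

concretize(pdf_file, "pattern of 1s and 0s in the magnetic coating", hard_drive)
concretize(pdf_file, "pattern of pixels", screen)

# One GDC, two concretizations, two bearers: the GDC is multiply located.
print(len(pdf_file.concretizations))
print([bearer.label for bearer in pdf_file.bearers()])
```

Each IQE in this sketch holds a reference to exactly one bearer, mirroring
specific dependence, whereas the GDC only collects whatever concretizations
currently exist, mirroring generic dependence.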




    Along the dimension of Purpose we distinguish:

–    Descriptive purpose: scientific paper, newspaper article,
     after-action report
–    Prescriptive purpose: legal code, license, statement of
     rules of engagement
–    Directive purpose (of specifying a plan or method for
     achieving something): instruction, manual, protocol
–    Designative purpose: a registry of members of an
     organization, a phone book, a database linking proper
     names of persons with their social security numbers

It should be stressed that one and the same IA may serve
multiple purposes.

   As is shown in Table 3, IAO-Intel will include additional
LLOs relating to attributes of importance to the intelligence
domain such as: Classification, Encryption Status, Encryption
Strength, and so forth. IAO-Intel will also include terms
representing specific IA Purposes such as: informing the
commander, providing targeting support, intelligence
preparation of the battlefield.

    TABLE 3. DIMENSIONS OF INFORMATION ARTIFACT ATTRIBUTES

Role in the Intelligence Process (JP 3-0, III-11)
 Priority Intelligence Requirement (PIR)
      Commander’s Critical Information Requirement (CCIR)
          Essential Element of Information (EEI)
                 Essential Element of Friendly Information (EEFI)
Confidence Level (JP 2-0, Appendix A)
    Highly Likely                  Unlikely
    Likely                         Highly Unlikely
    Even Chance
Discipline (JP 2-0, I-5)              Intelligence
     Legal                            Signal
     Ideology                         Human
     Religion                         Rumor intelligence
     Propaganda                       Web intelligence
Intelligence Excellence (JP 2-0, II-6)
     Anticipatory                      Complete
     Timely                            Relevant
     Accurate                          Objective
     Usable                            Available

   Table 3 illustrates fragments of some of the dimensional
hierarchies specific to IAO-Intel, with their doctrinal sources.

    X. EXAMPLES OF USE OF IAO-INTEL IN ANNOTATION

   As should by now be clear, IAO-Intel relates not merely to
textual documents but to information artifacts of all types
including maps, videos, photographic images, websites,
databases, and so forth, both unstructured source documents
and official documents of many different varieties. Consider
the Modified Combined Obstacle Overlay (MCOO), taken
from JP 2-01.3 [15] and illustrated in Figure 2. (We refer to
this as example IA#1 in what follows.) An MCOO is defined
as:

    A joint intelligence preparation of the operational environment
    product used to portray the militarily significant aspects of the
    operational environment, such as obstacles restricting military
    movement, key geography, and military objectives.

    Figure 2: Modified Combined Obstacle Overlay (example IA#1)

We assume that IA#1 has been prepared as part of some given
plan, IA#2. Both IAs #1 and #2 will then be referred to in
multiple further IAs including multiple databases compiled
during planning, execution and outcomes assessment.
Relevant terms used in the data models associated with these
databases will have been explicated using terms from
IAO-Intel. The latter terms can then be used along the lines
described in [9] to create annotations to both #1 and #2 on the
basis of the fact that they are referred to in the databases in
question. The results will include, for example:

     a) annotations to the attributes of IA#1:
        ICE: MCOO
        IBE: Acetate Sheet
        uses-symbology MIL-STD-2525C
        authored-by person #4644
        part-of plan IA#2
     b) annotations relating to the aboutness of IA#1:
        Avenue of Approach
        Strategic Defense Belt
        Amphibious Operations
        Objective

and so forth. Used in conjunction with the skill ontology and
the person database, the annotations above will enable a
planner to retrieve (for example) all MCOOs relating to
amphibious operations authored by persons with certain skills.
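
   The retrieval described above can be sketched as a query over
annotations stored as subject-relation-object triples. The sketch is
illustrative only and does not reproduce the method of [9]: the relations
uses-symbology, authored-by and part-of and the aboutness terms are taken
from the example, whereas the relation names has-ICE-type, has-IBE-type and
has-skill, the use of is-about for the aboutness annotations, the skill
value, and the function names are hypothetical.

```python
# Annotations for IA#1 as (subject, relation, object) triples.
TRIPLES = {
    ("IA#1", "has-ICE-type", "MCOO"),
    ("IA#1", "has-IBE-type", "Acetate Sheet"),
    ("IA#1", "uses-symbology", "MIL-STD-2525C"),
    ("IA#1", "authored-by", "person #4644"),
    ("IA#1", "part-of", "IA#2"),
    ("IA#1", "is-about", "Avenue of Approach"),
    ("IA#1", "is-about", "Strategic Defense Belt"),
    ("IA#1", "is-about", "Amphibious Operations"),
    ("IA#1", "is-about", "Objective"),
    # Hypothetical entry standing in for the person database and skill ontology.
    ("person #4644", "has-skill", "terrain analysis"),
}


def objects(subject, relation):
    """Return every object o for which (subject, relation, o) is asserted."""
    return {o for s, r, o in TRIPLES if s == subject and r == relation}


def find_artifacts(ice_type, topic, skill):
    """Find IAs of the given type, about the given topic, whose author has the skill."""
    hits = []
    for s, r, o in TRIPLES:
        if r != "has-ICE-type" or o != ice_type:
            continue
        about_topic = topic in objects(s, "is-about")
        skilled_author = any(
            skill in objects(author, "has-skill")
            for author in objects(s, "authored-by")
        )
        if about_topic and skilled_author:
            hits.append(s)
    return sorted(hits)


print(find_artifacts("MCOO", "Amphibious Operations", "terrain analysis"))
```

With the triples shown, the query returns the single artifact IA#1; asking
for a skill that no author is recorded as having returns an empty list.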




   Consider, as a second example, a collection of documents
prepared according to FM 6-99.2 [16], for example of types:

        Intelligence Report [INTREP]
        Intelligence Summary [INTSUM]
        Logistics Situation Report [LOGSITREP]
        Operations Summary [OPSUM]
        Patrol Report [PATROLREP]
        Reconnaissance Exploitation Report [RECCEXREP]
        SAEDA Report [SAEDAREP]

Suppose further that we need to cross-reference these with
comparable sets of documents prepared by other commands,
and that we need to do this in such a way as to extract and
process the information computationally. FM 6-99.2 provides
definitions of the mentioned report types, but does not take the
step of formulating these definitions computationally.
IAO-Intel addresses this problem by providing a common,
algorithmically useful set of ontology terms that is designed
to allow consistent explication of these and related types as
they appear in different doctrinal resources. The results can
then be used for computer-aided aggregation of the data
represented using corresponding IA types, cross-checking of
mismatches, and so forth.

         XI.    THE DOD DATA SERVICES ENVIRONMENT

    We can now return to Directive 8320.02 and address the
relevance of the work reported above to its successful
implementation. As we saw, the Directive requires that ‘all
salient metadata be discoverable, searchable, and retrievable’
through use of the DoD Data Services Environment (DSE) [6].
DSE’s numerous data sources include 35 ‘supporting
taxonomies’ derived from pre-existing terminology resources.
Problems arise, however, because the latter have been
constructed on the basis of multiple distinct methodologies
(for example as concerns the formulation of definitions).
When, on August 25, 2013, the DSE was queried for
information on “location”, the DSE reported 660 possibly
relevant sources of information. When the DSE was queried
for “unit types,” 882 possibly relevant sources of information
were reported. When types of “ground vehicles” were queried
for, 175 possibly relevant sources of information were
reported. Such redundancies present obstacles to discovery,
search and retrieval. They arise because different compilers of
authoritative data describe entities of the same types in
heterogeneous ways. This thwarts the sort of coherent
integration that is required for the mounting of what, in [6], we
referred to as the “massing of intelligence fires”.

     One problem is that while the terms in thesauri and
glossaries can be used in annotations, the value derived
therefrom is limited above all because they do not allow the
benefits of inferencing and of rapid introduction and definition
of new terms which are provided by a framework of
well-constructed ontologies along the lines described in [10]. There
we show how reference ontologies can be quickly expanded
with new content to meet emerging data representation needs
and in such a way that data annotated with the newly added
terms is automatically integrated with existing data.

    Imagine, for example, that we have two large bodies of data
describing (A) chemicals (properties, costs, manufacture,
transport, supply, and so forth), and (B) explosives
manufacture (raw materials, persons and skills involved,
processes and equipment and safety measures used). We will
have satisfied Directive 8320.02 in maximizing discoverability
if we annotate each body of data in accordance with
corresponding term repositories, which we can assume to have
been independently developed. Suppose now, however, that
we are called upon to integrate the data in (A) with the data in
(B). Here these annotations will likely provide no assistance,
which will in turn lead to calls for the creation of a third term
repository to be used in efforts to annotate the combined (AB)
data. The results of these efforts will then once again likely
provide no assistance when (AB) data itself needs to be
integrated with, say, data about explosives financing.

    Where, in contrast, the systems for annotating (A) and (B)
reflect a common ontological approach, then new annotation
resources for the merged data can easily be developed by
reusing the initially developed ontologies in the formulation of
both composite terms and corresponding definitions [10].

    A further problem is that the need to create new
terminology resources for the annotation of such merged
content may lead to the need for corrections of the initial
terminology resources. Such corrections may have expensive
consequences: either they will break interoperability with the
results of earlier annotation efforts, or – if resources are
invested to correct already existing annotations to make them
conform to the new usage – they will have unforeseen
consequences for third parties who have been relying on the
older resources to be maintained consistently through time.
Such problems are minimized where terminology resources
are developed in tandem from the very start as parts of a single
suite of ontology modules developed using common
principles, exactly as is proposed by our DCGS-A strategy.
We believe that only a strategy of this sort can satisfy the
requirement that data, information, and IT services are ‘made
visible, accessible, understandable, trusted, and interoperable
throughout their lifecycles for all authorized users.’ [5]

         XII. SEMANTIC TECHNOLOGY IS NOT ENOUGH

The strategy underlying DSE has much in common with a
strategy adopted widely in the semantic technology
community under the heading of Linked Open Data, a strategy
often involving the use of the Dublin Core Metadata Element
Set as controlled vocabulary. We believe that the Dublin Core
can serve as reliable controlled vocabulary for describing IA
data only where the information artifacts in question are
themselves artifacts formulated using RDF or some other
W3C recommended syntax, and unfortunately this is not the
case for many of the artifacts at issue here. We believe further
that the Linked Data approaches cannot solve the problems of
silo-formulation in the IC for the reasons outlined already in
section XI above. The semantic technology community draws
a distinction between two levels of interoperability: Level 1,
resting on shared term definitions (for example drawn from
the Dublin Core), and Level 2, of what is called Formal
Semantic Interoperability. As is recognized at [17], Level 1 is
‘so open-ended that it quickly leads to a proliferation of
custom-built solutions incompatible with each other, such as
metadata expressed in document formats that require
customized software to read and data models that cannot
easily be mapped to generic, interoperable representations
such as those expressed in RDF.’ Level 2 is designed to solve
these problems by requiring that all IAs are described via
metadata formulated using RDF. Unfortunately RDF (or even
OWL) is no panacea. Multiple conflicting ontologies can be
formulated in RDF terms, yet still remain conflicting.

   The solution, again, must rely on shared development of a
single suite of modularized ontologies, in which not only the
same formal language is used, but also consistent definitions
populating downward from a common upper level such as
BFO – and we note in this connection a parallel with the way
in which joint doctrine is elaborated, in a process that is
designed to ensure (at least ideally) that the same term is
defined and used consistently across the 80 plus Joint
Publications (JPs) that address the various aspects of joint
warfare in accordance with JP 1-02 [2].

  [Figure 3 diagram. Top: IAO:Report; Intelligence report; Geospatial
  intelligence report; Human intelligence report; Measurement and signals
  intelligence report; Human geospatial intelligence report; Signals
  intelligence report; Measurement intelligence report. Text in figure:
  "The above IAO-Intel terms are defined by using terms from the
  ontologies below with the help of relations such as is-about,
  created-by, derives-from and so forth [7]." Bottom left: Geospatial
  feature; Person; Signal measurement. Bottom right: IA source; Intel
  discipline; IA classification.]

  Figure 3. Top: Terms from IAO (unfilled) and IAO-Intel (grey) ontologies.
  Taxonomical hierarchies: asserted – solid lines, inferred – dashed lines. Bottom
  left: Domain ontologies. Bottom right: IAO-Intel LLOs.

                            XIII. CONCLUSION

To summarize: IAO-Intel forms part of a collection of
ontologies that is being applied primarily to the explication of
data models and other terminology resources of importance to
DCGS-A. The terms in these ontologies are linked together
logically in virtue of the fact that each ontology uses terms
which are defined in terms of other ontologies belonging to
this same suite (as illustrated in Figure 3). This strategy for
ontology development has been tested in use over several
years in the domain of biomedical informatics, and is
gradually being adopted also in other domains, including for
example the domain of modeling and simulation, where the
identification of authoritative data sources is needed to ensure
realistic scenarios [18]. One principal feature of the strategy is
that it provides a standard means for defining new ontologies
in light of emerging needs, in a way that guarantees
consistency with the ontologies already created and with the
data annotated in their terms. We believe that this feature
makes the strategy particularly useful in addressing the
emerging challenges to the intelligence analyst in accordance with
DoD directives concerning discovery, retrieval and search.

                            ACKNOWLEDGMENTS

Work on IAO-Intel was supported by I2WD. Thanks are due
also to Mathias Brochhausen, Werner Ceusters, Mélanie
Courtot, Janna Hastings, James Malone, Bjoern Peters,
Jonathan Rees, and Alan Ruttenberg for their work on IAO.

                            REFERENCES

[1]  Friedrich Wilhelm von Steuben, Regulations for the order and discipline
     of the troops of the United States, 1792, http://x.co/1dJEk.
[2]  Department of Defense Dictionary of Military and Associated Terms,
     2013, http://www.dtic.mil/doctrine/new_pubs/jp1_02.pdf.
[3]  Leo Obrst, Patrick Cassidy, “The need for ontologies: Bridging the
     barriers of terminology and data structure”, Geological Society of
     America Special Paper 482, 2011.
[4]  Leo Obrst, Terry Janssen, Werner Ceusters (eds.), Ontologies and
     Semantic Technologies for the Intelligence Community. Amsterdam:
     IOS Press, 2010.
[5]  Sharing Data, Information, and Information Technology (IT) Services in
     the Department of Defense, DoD Instruction 8320.02, August 5, 2013,
     http://www.dtic.mil/whs/directives/corres/pdf/832002p.pdf.
[6]  DSE Data Services Environment, https://metadata.ces.mil/dse.
[7]  https://code.google.com/p/information-artifact-ontology.
[8]  David Salmen, Tatiana Malyuta, Alan Hansen, Shaun Cronen, Barry
     Smith, “Integration of intelligence data through Semantic
     Enhancement”, Proceedings of the Conference on Semantic Technology in
     Intelligence, Defense and Security (STIDS), 2011, CEUR 808, pp. 6–13.
[9]  Barry Smith, Tatiana Malyuta, David Salmen, William Mandrick, Kesny
     Parent, Shouvik Bardhan, Jamie Johnson, “Ontology for the Intelligence
     Analyst”, CrossTalk: The Journal of Defense Software Engineering,
     November/December 2012, pp. 18–25.
[10] Barry Smith, Tatiana Malyuta, William S. Mandrick, Chia Fu, Kesny
     Parent, Milan Patel, “Horizontal integration of warfighter intelligence
     data. A shared semantic resource for the Intelligence Community”,
     Proceedings of STIDS Conference, 2012 (CEUR 996), pp. 112–119.
[11] Ron Rudnicki, Werner Ceusters, Shahid Manzoor and Barry Smith,
     “What particulars are referred to in EHR data?”, American Medical
     Informatics Association 2007 Annual Symposium, 2007, pp. 630–634.
[12] Ron Rudnicki, “DCGS-A Ontology Program Explication Procedures”,
     MS, 2013.
[13] Barry Smith and Werner Ceusters, “Ontological Realism as a
     methodology for coordinated evolution of scientific ontologies”,
     Applied Ontology, 5 (2010), pp. 139–188.
[14] Basic Formal Ontology 2.0, http://ontology.buffalo.edu/BFO/Reference.
[15] Joint Publication 2-01.3 Joint Intelligence Preparation of the Operational
     Environment, 16 June 2009.
[16] U.S. Army Report and Message Formats (FM 6-99.2), April 2007,
     http://armypubs.army.mil/doctrine/DR_pubs/dr_a/pdf/fm6_99x2.pdf.
[17] Dublin Core User Guide, last modified September 6, 2011,
     http://wiki.dublincore.org/index.php/User_Guide.
[18] Saikou Y. Diallo, Jose Padilla, “Military Interoperability Challenges”,
     Handbook on Real-World Applications in Modeling and Simulation,
     Wiley, 2012, pp. 298–332.



