=Paper=
{{Paper
|id=Vol-1097/STIDS2013_T05
|storemode=property
|title=IAO-Intel: An Ontology of Information Artifacts in the Intelligence Domain
|pdfUrl=https://ceur-ws.org/Vol-1097/STIDS2013_T05_SmithEtAl.pdf
|volume=Vol-1097
|dblpUrl=https://dblp.org/rec/conf/stids/SmithMRMSMDSP13
}}
==IAO-Intel: An Ontology of Information Artifacts in the Intelligence Domain==
Barry Smith, University at Buffalo, NY, USA
Tatiana Malyuta, CUNY, NY, USA; Data Tactics, McLean, VA
Ron Rudnicki, CUBRC, Buffalo, NY, USA
William Mandrick, Data Tactics, McLean, VA, USA
David Salmen, Data Tactics, McLean, VA, USA
Peter Morosoff, E-Maps, Inc., Washington, DC, USA
Danielle K. Duff, I2WD, Aberdeen, MD, USA
James Schoening, I2WD, Aberdeen, MD, USA
Kesny Parent, I2WD, Aberdeen, MD, USA
STIDS 2013 Proceedings Page 33

Abstract: We describe on-going work on IAO-Intel, an information artifact ontology developed as part of a suite of ontologies designed to support the needs of the US Army intelligence community within the framework of the Distributed Common Ground System (DCGS-A). IAO-Intel provides a controlled, structured vocabulary for the consistent formulation of metadata about documents, images, emails and other carriers of information. It will provide a resource for uniform explication of the terms used in multiple existing military dictionaries, thesauri and metadata registries, thereby enhancing the degree to which the content formulated with their aid will be available to computational reasoning.

Keywords: ontology; information artifacts; military doctrine; intelligence analysis; interoperability; data services environment

I. BACKGROUND

Standardization of terminology has been important from the very beginning of organized warfare. Imagine the Chinese trying to pass reports down the Great Wall using fire beacons without standardization of the signals used. In the Revolutionary War, General Washington directed Friedrich Wilhelm von Steuben to write the drill manual for the Continental Army [1] so that all units would use and respond uniformly to the same commands.

In our own era, DoD has directed development and use of the DoD Dictionary of Military and Associated Terms (Joint Publication 1-02) as the paramount terminological standard for military operations [2]. JP 1-02 helps to enable joint warfare by (a) advancing consistency in communications and (b) facilitating consistent interpretation of commands. Military dictionaries and related terminology artifacts continue to be developed, addressing these and a series of additional aims, including: (c) compiling lessons learned (outcomes assessment); (d) providing controlled vocabularies for official reporting; and (e) enhancing discoverability and analysis of data.

Such artifacts have until recently been conceived by analogy with traditional free-text dictionaries published in forms designed to maximize utility to human beings. Most existing doctrinal and related lexica and thesauri not only provide little aid to computation, they also suffer from the fact that multiple such resources have been (and continue to be) developed independently, in divergent and often non-principled ways. The result is that identical data may be classified and described entirely differently by different agencies, and the consequences of the resultant failures of integration (for example in the case of registries of persons of interest) are all too familiar. Increasingly, however, it is recognized that there is the need for a unified approach to description and classification of information resources (see for example [3], [4]), and the DoD has recognized at an official level that, to advance discoverability and analysis in the age of Big (military) Data, new approaches are needed that can enable computational retrieval, integration and processing of data. Thus Directive 8320.02 [5], the latest version of which is dated August 5, 2013, requires all authoritative DoD data sources to be registered in the DoD Data Services Environment (DSE) [6]. It further requires that all salient metadata be discoverable, searchable, retrievable, and understandable:

Data, information, and IT services will be considered understandable when authorized users are able to consume them and when users can readily determine how those assets may be used for specific needs. Data standards and specifications that require associated semantic and structural metadata, including vocabularies, taxonomies, and ontologies, will be published in the DSE, or in a registry that is federated with the DSE.

We shall return to the DSE below. First, we present our own strategy for realizing these important goals.

II. THE INFORMATION ARTIFACT ONTOLOGY

The Information Artifact Ontology (IAO) was originally conceived in 2008 as part of an effort to master the Big Data accumulating in the wake of the Human Genome Project in the context of biological research [7]. Its goal was to aid the consistent description of biological data emanating from multiple heterogeneous sources. The goal of IAO-Intel is analogous: it is to provide common resources for the consistent description of information artifacts of relevance to the intelligence community in a way that will allow discovery, integration and analysis of intelligence data from both official and non-official sources.

When biomedical informaticians work with databases, publications and records generated by experimental research or medical care they focus primarily on what these artifacts describe (for example on the genes or proteins which form the subject matters of a given journal publication, or on the symptoms or diseases reported in a given clinical note). Similarly, when intelligence analysts work with source data artifacts, then they, too, focus primarily on what the data in these artifacts describe, for example on the military units
whose movements are recorded in a given shipping report, or on the vulnerabilities of a given forward operations base as described in some force protection assessment.

But while the primary focus concerns in both cases the topic or subject of the artifacts in question, both also require a secondary focus, targeted to the artifacts themselves, through which information about these topics is conveyed. Such artifacts have attributes – including format, purpose, evidence, provenance, operational relevance, security markings – data concerning which (often called ‘metadata’) is vital to the effective exploitation of the reports, images, or signals documents with which the analyst has to deal.

The dichotomy between focus on entities in the world and focus on the information artifacts in which these entities are represented is fundamental to the work reported here. IAO relates precisely to the objects of this secondary focus. An information artifact (IA), as we conceive it, is an entity that has been created through some deliberate act or acts by one or more human beings, and which endures through time, potentially in multiple (for example digital or printed) copies. IAO thus deals with information in the forms it takes when it has been deliberately fixed in some medium in such a way as to become accessible to multiple subjects. Examples are: a diagram on a sheet of paper, a video file, a map on a computer monitor, an article in a newspaper, a message on a network, the output of some querying process in a computer memory.

III. GOAL OF IAO-INTEL

The goal of IAO-Intel is to support the effective handling of data concerning those attributes of IAs that are relevant to the purposes of intelligence analysis. To describe such attributes coherently we need to distinguish:

– the particular information artifact of interest, tied to some particular physical information bearer: the photographic image on this piece of paper retrieved from this enemy combatant; the email created by this particular author on this specific laptop; the target list compiled for this particular artillery unit on this particular date;

– the copyable information content that is carried by the artifact in question. The photographic image may be printed out in multiple paper copies; the email or target list may be transmitted to multiple further recipients. The information content that is copied or transmitted thereby remains in each case one and the same.

IAO-Intel provides ontology terms relating both to official documents and to non-official (source) artifacts. It provides also a set of relations to be used when we wish to represent the fact that, say, IA #12345 is-about some given person, or uses-symbols-from some specified symbology, or links-to some second IA #56789, and so forth.

IAO-Intel is designed from the start to provide the needed supplement in a way that will create semantic interoperability of data retrieved from different types of sources through an incremental process of semantic enhancement as described in [8], [9] and [10]. It is designed to allow automatic retrieval of all documents in a given collection of heterogeneous sources which involve a particular creator, or a particular type of intelligence report, or a particular type of weblink, or have been declassified under the authority of a particular agency, or are operative within a given time window.

Importantly, IAO-Intel is not designed to replace existing doctrinal or other standards created to guide human beings or computer applications in the creation and description of documents in accordance with defined formats or document architectures. Rather, its purpose is to allow the results of using such standards to generate the needed metadata in a uniform, non-redundant and algorithmically processable fashion. Moreover, the broad scope of IAO-Intel means that the metadata generated in relation to official documents will be of a piece with the metadata incrementally accumulating in relation to all information artifacts of relevance to the IC – the metadata will consist, in every case, of annotations to IAs formulated in ontology terms drawn not only from IAO-Intel but from the entire suite of DCGS-A ontology modules.

Thus while using existing standards for human or computer-aided creation or description of IAs does indeed allow us to retrieve data pertaining to IAs prepared in accordance with these standards, for IAs of other sorts the existing approach will fail. Only an ontology-based approach along the lines here proposed can, we believe, demonstrate the sort of flexibility and consistent expandability which are needed in today’s dynamic and data-rich environments.

IV. EXPLICATION AND ANNOTATION

Currently a draft version of IAO-Intel is being applied within the framework of the US Army’s Distributed Common Ground System (DCGS-A) Standard Cloud (DSC) initiative as part of a strategy for the horizontal integration of warfighter intelligence data [9]. Two sorts of application are currently being used to enable the ontology to support computer-aided retrieval and analytics. The first is the explication of general terms used in source intelligence artifacts and in data models, terminologies and doctrinal publications which provide typologies of intelligence-related IAs. The second is the annotation of the instance-level information captured by such IAs.

Explication is performed by providing definitions of such general terms using the resources of IAO-Intel and of the domain ontologies (such as Agent or Event ontologies) being developed within the DCGS-A framework. Annotation is performed by associating ontology terms with data about particular persons, events, or places in given information artifacts.

TABLE 1. SAMPLE TYPES AND SUBTYPES OF INFORMATION ARTIFACTS
IAO | IAO-Intel (examples)
Report | Intelligence Report (FM 6-99.2, 126)
Summary | Electronic Warfare Mission Summary (FM 6-99.2, 87)
Diagram | Network Analysis Diagram (from JP 2-01.3, II-51)
Overlay | Combined Information Overlay (JP 2-01.3, II-33)
Assessment | Assessment of Impact of Damage (FM 6-99.2, 53)
Estimate | Adversary Course of Action Estimate
List | List of High-Value Targets (JP 2-01.3, II-61)
Order | Airspace Control Order (FM 6-99.2, 17)
Matrix | Target Value Matrix (JP 2-01.3, II-63)
Template | Ground and Air Adversary Template (JP 2-01.3, II-57)

The goal of explication is to ensure that the data captured in annotations is semantically enhanced in a way that enables computational integration and reasoning along the lines described in [11], [12]. The goal of annotation is to aid retrieval of information about specific persons, groups, events, documents, images, and so forth, where this information is conveyed through source documents using disjointed and disparate systems for designation.

V. STRATEGY FOR BUILDING IAO-INTEL

Our strategy for building IAO-Intel is to extend the draft IAO to include terms and definitions tailored for the intelligence domain and specifically for the needs of our DCGS-A ontology initiative. The strategy has the following parts.

First, IAO-Intel is created by downward population from the draft IAO reference ontology. That is, the highest level terms of IAO-Intel are defined as specializations of terms from IAO along the lines illustrated in Table 1. The coverage domain of IAO-Intel will be determined incrementally on the basis of requests from analysts and other SME communities and through incorporation of terms from doctrinal publications and relevant high-level data models and document classifications.

Second, we use these sources to identify the dimensions of attributes along which IAs will be annotated. The selected dimensions are constructed in such a way as to be orthogonal in the sense in which, for example, color is orthogonal to shape – thus ontology branches built to represent different dimensions of attributes will contain no terms in common. This will enable these branches to be structured following the principle of single inheritance (thus as true hierarchies) [13].

Third, we create low-level ontology modules (LLOs) corresponding to each of these orthogonal dimensions. LLOs are small single-dimension attribute lists or shallow hierarchies designed to advance ease of maintenance and surveyability of the ontology and to provide a growing set of simple component terms which can be used:

1. to construct more complex terms, both terms for inclusion in IAO-Intel, and terms to be used to generate inferred classifications in application ontologies created for specific local purposes, along the lines described in [10];

2. to define the terms of the IAO-Intel ontology and of its sister ontologies within the DCGS-A framework;

3. to explicate the meanings of terms standardly used by different agencies, or by different groups of SMEs, or by different existing and future systems to describe such artifacts in a logically consistent way that is designed to allow integration of data and enhanced analytics;

4. to annotate instance data pertaining to particular information artifacts used by the intelligence community – for instance analysts’ reports; harvested emails; signals data; and so forth.

The goal is that IAO-Intel should support integration of data annotated using different standard terminology resources. To bring this about, the constituent terms of such resources will be explicated using terms from IAO-Intel so that the artificial composite terms used in certain official terminologies and exchange model resources (along the lines of ‘VehicleInspectionJurisdictionAuthorityText’) will be broken down logically into constituent elements. This will provide a means to avoid the combinatoric explosion that is threatened by traditional approaches. Some composite expressions – for example ‘Essential Element of Friendly Information (EEFI)’ – will indeed be included in pre-composed form in the IAO-Intel ontology, but only where they are either defined in doctrine or already established as part of relevant SME vocabularies.

The modeling task for which compounds such as ‘VehicleInspectionJurisdictionAuthorityText’ were designed is addressed in our framework by allowing single data entries to be annotated by multiple ontology terms (sometimes linked by appropriate relations). A record in one of the tables containing data about an IED can be annotated, for example, both with ‘IED Event’ (based on its aboutness) and with ‘EEFI’ (based on its importance). A particular plan for the Intelligence Preparation of the Battlefield can be annotated as being at the same time a Plan (based on its purpose), a Government Document (based on its source), a Report on Air Defenses (based on its aboutness). It can be annotated also through relations, for example through located-at linking the source of the plan to some city or building and linking the planned air defenses to some region of interest.

Currently, military terminology resources generally fail to follow established best practice principles for the formulation of definitions. For example, they often confuse terms referring to components of information artifacts with terms referring to the entities in reality which those information artifacts are about. The “WTI Improvised Explosive Device” Glossary, for example, defines Method of Emplacement as:

The description of where the [improvised explosive] device was delivered, used or employed.

Similarly the DCGS-A Logical Data Model defines CoverConcealment as:

information about geographical features that provide protection from attack or observation.

Use of IAO-Intel in tandem with corresponding domain ontologies allows us to explicate CoverConcealment (properly so-called) as:

a geographic feature which has-role CoverRole,

and to explicate CoverConcealmentInformation as:

IA which is-about CoverConcealment,

where CoverRole is defined as:

the Role acquired by a given geographic feature when it is used to provide protection from attack or observation.

VI. MAINTAINING AND EVALUATING IAO-INTEL

To maintain the IAO-Intel term collection over time we will create feedback links to enable users of the ontology to request new terms and to report errors. We are also working
on an objective validation process which will enable us to determine how requested terms should be treated, distinguishing options such as: 1. incorporation into IAO-Intel or into some associated reference ontology, 2. incorporation into an application ontology maintained for some local purpose, 3. being marked as a synonym of some existing ontology term.

We are identifying, and where necessary constructing de novo, the domain ontologies that will need to be used in the definition of complex terms, and defining the relations that will link IAO-Intel terms with terms in these domain ontologies. These ontologies, too, will be extended over time on the basis of input from users.

We are also testing a series of objective criteria to be used in evaluation of IAO-Intel and other DCGS-A ontologies, starting with simple numerical measures of (a) term requests received and dealt with, and (b) uses of terms in definitions, explications and annotations. IAO-Intel will allow us to keep track of the number of information artifacts that make reference to individuals falling under a given class, and these metrics too can be used to assess the relative importance of this class within the ontology framework taken as a whole. While not definitive, such measures will help guide our judgments concerning the content and structure both of IAO-Intel and of its associated domain ontologies.

VII. ORGANIZATION OF IAO-INTEL

Given the importance of the dichotomy between primary (topic) and secondary (artifact) focus, a central role in IAO-Intel is played by what we call

Information Content Entities (ICEs) are about something in reality (they have this something as a subject; they represent, or mention or describe this something; they inform us about this something). Aboutness may be identifiable from different perspectives. Thus one analyst may interpret a given ICE as being about the geography of a given encampment; another may view it as providing information about the morale of those encamped there.

All major classes of information artifacts involve ICEs – simply because all major classes of information artifacts are about something. A plan of action, for example, is about a certain group of persons and goals and the types and ordering of actions that will be used to realize these goals. Even a document that has been written in code will be assumed by an analyst to be about something (for what, otherwise, would be the reason for its creation?). Typically, an information artifact such as a copy of a newspaper will be associated with multiple ICEs at successive levels of granularity, including separate articles within the newspaper, separate sentences within these articles, and so on.

In addition to ICEs, we distinguish also:

– Information Bearing Entity (IBE). An IBE is a material entity that has been created to serve as a bearer of information. IBEs are either (1) self-sufficient material wholes, or (2) proper material parts of such wholes. Examples under (1) are: a hard drive, a paper printout (e.g., a report); and under (2): a specific sector on a hard drive, a single page of a paper printout.

– Information Quality Entity (IQE). An IQE is the pattern on an IBE in virtue of which it is a bearer of some information.

– Information Structure Entity (ISE). An ISE is a structural part of an ICE; speaking metaphorically, it is an ICE with the content removed: for example an empty cell in a spreadsheet; a blank Microsoft Word file. ISEs thus capture part of what is involved when we talk about the ‘format’ of an IA.

The term ‘information artifact’ can now be used to refer either 1. to some combination of ICEs and ISEs (roughly: the IA as body of copyable information content); or 2. to some concretization of ICEs and ISEs in some IBE in which some IQE inheres (the information artifact is: this content here and now, on this specific computer screen or this printed page). Different information artifact types will differ in different ways along these dimensions, as illustrated in Table 2.

BFO: Independent Continuant: Information Bearing Entity (IBE)
BFO: Generically Dependent Continuant: Information Content Entity (ICE); Information Structure Entity (ISE)
BFO: Specifically Dependent Continuant: Information Quality Entity (Pattern) (IQE)
Figure 1. Continuants in the IAO framework

VIII. IAO AND THE BASIC FORMAL ONTOLOGY

Figure 1 shows how IAO and IAO-Intel are being built to conform to Basic Formal Ontology (BFO), the upper-level architecture used in the DCGS-A ontologies [14]. IBEs are, in BFO terms, independent continuants (they are entities made of physical matter). An IBE is a physical entity that is created or modified to serve as bearer of certain patterned arrangements – for example of ink or other chemicals, of electromagnetic excitations. An IQE is a quality of an IBE which exists in virtue of such patterned arrangements and which is interpretable as an ICE or ISE. Such an IQE is created when some physical artifact is deliberately created or modified to support it (patterned to serve as its bearer). IQEs are BFO:specifically dependent continuants (SDCs) – entities which require some specific physical bearer but which are not themselves physical. Each IBE and IQE is restricted at any given time to some specific location in space. (If you display the same digital image twice on your desktop, then there are two IQEs on your desktop, which are – at some level of granularity – indistinguishable copies of each other.)

ICEs and ISEs, in contrast, are what BFO calls generically dependent continuants or GDCs. This means that they are entities – such as a pdf file or an email – which can be copied from one physical bearer to another and thus may exist simultaneously in multiple different IQEs, which are called ‘concretizations’ of the corresponding GDC. Each GDC is concretized by at least one specific IQE inhering for example in the tiny piles of ink on the piece of paper in your pocket or in differentially excited pixels on your screen. When the GDC
is copied, then a new IQE is created on a new physical information bearer, as when a new pattern of characters is created on the screen of the recipient of an email. This second pattern is a copy of the pattern created on the screen of the sender. The GDC itself exists simultaneously both at its original site and at the site to which it has been transmitted. GDCs can thus be multiply located.

BFO relations between ICEs, ISEs, IQEs and IBEs can be set forth as follows:

ICE generically-depends-on IBE
ISE generically-depends-on IBE
IQE specifically-depends-on IBE
ICE concretized-by IQE
ISE concretized-by IQE

IAO contains in addition relations which allow us to formulate metadata concerning attributes of IAs such as author, creation date, classification status, and so forth, and to annotate also components of IAs such as the To- and FromAddress components of email headers. The ToAddress of email message m, for example, is defined as:

a collection of one or more email addresses of the intended recipients of m, each with at most one optionally associated name.

The set of relations can be extended to include also relations involving documents, document parts and document collections, such as retrieved-from, curated-by, and so forth.

When we consider examples such as those provided in Table 2, then it becomes clear that, when IAO-Intel is applied to the explication of terms involved in describing instance data relating to real-world IAs, then multiple artifacts may need to be distinguished. Consider, for example, a pdf file stored on some specific laptop. When we address what is meant by the (copyable) content of this file, then we recognize that this content may be copied in multiple ways, for example: to a pdf file using the same version of the Acrobat software and on the same operating system, to a pdf file using a different version of the Acrobat software, using characters from the same or a different character set, by being printed out on a piece of paper, and so on. The annotation of instance data with information of this sort may be important for example in investigating the provenance of given information artifacts which lie at the end of long chains of copying and processing involving multiple authors and computer systems. One potential application of IAO-Intel is to the systematic annotation of data pertaining to such chains.

Matters are complicated further when we go deeper into the question of how IAs are stored inside the computer. Given a generically dependent continuant which is the pdf file stored in the hard drive on some given laptop, there is a specifically dependent IQE which is (roughly) the pattern of 1s and 0s in the magnetic coating of the hard drive. When the entirety of this pdf file is displayed on your screen, then there is a further specifically dependent IQE which is the corresponding pattern of pixels on your screen. Both of these IQEs are concretizations of a corresponding GDC.

Note that we do not assume that all portions of IAO-Intel will be of equal utility in applications for the IC. We do, however, believe that to achieve clarity of explication in the treatment of source data artifacts will require clear definitions of the upper-level terms in the IAO, and a clear understanding of the relations between them.

TABLE 2: DIMENSIONS OF INFORMATION ARTIFACTS (IAS)
Information Artifact | IBE | ISE | ICE
MS Word file (.doc, .docx) | Hard drive (magnetized sector) | MS Word format | Varies
XML file | Hard drive (magnetized sector) | XML V 2.0 format | Varies
MS Excel 2010 file (.xls, .xlsx) | Hard drive (magnetized sector) | MS Excel 2010 format | Varies
KML file | Hard drive (magnetized sector) | KML | Map overlay
JPEG file (.jpg) | Hard drive (magnetized sector) | JPEG format | Image
Email file (with embedded attachments) | Hard drive (magnetized sector) | Internet Message Format (e.g., RFC 5322 compliant) | Message
USMTF Message file | A specific government network | USMTF Format | Message
Passport | Paper document; (may include photographs, RFID tags) | Name, ID formats, security marking formats … | Personal data, Passport number, Visas …
Title Deed | Official paper document | Varies | Varies
Report | Varies | Varies | Varies
Overlay Sheet (e.g. Map Overlay Sheet, see Figure 2) | Acetate sheet | MIL-STD-2525 Symbols; FM 101-1-5 Operational Terms and Graphics | Map overlay

IX. ATTRIBUTES OF INFORMATION ARTIFACTS

Information artifacts have attributes along a number of distinct dimensions, treated in LLO modules of the IAO. Terms in these modules will be applied to explicate information relating to IAs of different types, and to annotate data pertaining to IA instances with the help of relations mentioned above. Some dimensions of IA attributes are common to all areas, both military and non-military, including: Purpose, Lifecycle Stage (draft, finished version, revision); Language, Format, Provenance, Source (person, organization), and so forth.
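
The idea of orthogonal attribute dimensions can be illustrated with a small sketch. It is not part of IAO-Intel itself: the dictionary layout and the function names `shared_terms` and `annotate` are assumptions made purely for illustration, and only the Lifecycle Stage and Source terms are taken from the text above. The sketch checks that no two dimensions share a term and rejects annotations that use a term outside its dimension.

```python
from itertools import combinations

# Hypothetical low-level ontology modules (LLOs): one flat term list per
# attribute dimension. The terms come from the examples in the text; the
# data layout is an assumption made for illustration only.
LLO_MODULES = {
    "LifecycleStage": {"Draft", "Finished version", "Revision"},
    "Source": {"Person", "Organization"},
}


def shared_terms(modules):
    """Return {(dim_a, dim_b): common_terms} for every pair of dimensions
    that have at least one term in common (empty dict if orthogonal)."""
    overlaps = {}
    for (a, terms_a), (b, terms_b) in combinations(sorted(modules.items()), 2):
        common = terms_a & terms_b
        if common:
            overlaps[(a, b)] = common
    return overlaps


def annotate(ia_id, **values):
    """Build an annotation record for one information artifact, rejecting
    unknown dimensions and terms that do not belong to the dimension."""
    record = {"ia": ia_id}
    for dimension, term in values.items():
        if dimension not in LLO_MODULES:
            raise KeyError(f"unknown dimension: {dimension}")
        if term not in LLO_MODULES[dimension]:
            raise ValueError(f"{term!r} is not a term of {dimension}")
        record[dimension] = term
    return record


# The two sample dimensions contain no terms in common.
assert shared_terms(LLO_MODULES) == {}

print(annotate("IA#12345", LifecycleStage="Draft", Source="Organization"))
```

Because each dimension is kept in its own module, adding a new dimension in this sketch requires only a new dictionary entry, and the orthogonality check can be rerun after every extension.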
Along the dimension of Purpose we distinguish: operational environment, such as obstacles restricting military
movement, key geography, and military objectives.
x Descriptive purpose: scientific paper, newspaper article,
after-action report
x Prescriptive purpose: legal code, license, statement of
rules of engagement
x Directive purpose (of specifying a plan or method for
achieving something): instruction, manual, protocol
x Designative purpose: a registry of members of an
organization, a phone book, a database linking proper
names of persons with their social security numbers
whereby it should be stressed that one and the same IA may of
course serve multiple purposes.
As is shown in Table 3 IAO-Intel will include additional
LLOs relating to attributes of importance to the intelligence
domain such as: Classification, Encryption Status, Encryption
Strength, and so forth. IAO-Intel will also include terms
representing specific IA Purposes such as: informing the
commander, providing targeting support, intelligence
preparation of the battlefield.
TABLE 3. DIMENSIONS OF INFORMATION ARTIFACT ATTRIBUTES
Role in the Intelligence Process (JP 3-0, III-11)
Priority Intelligence Requirement (PIR)
Commander’s Critical Information Requirement (CCIR)
Essential Element of Information (EEI)
Essential Element of Friendly Information (EEFI)
Confidence Level (JP 2-0, Appendix A)
Highly Likely | Unlikely
Likely | Highly Unlikely
Even Chance |
Discipline (JP 2-0, I-5) | Intelligence
Legal | Signal
Ideology | Human
Religion | Rumor intelligence
Propaganda | Web intelligence
Intelligence Excellence (JP 2-0, II-6)
Anticipatory | Complete
Timely | Relevant
Accurate | Objective
Usable | Available

Table 3 illustrates fragments of some of the dimensional
hierarchies specific to IAO-Intel, with their doctrinal sources.

X. EXAMPLES OF USE OF IAO-INTEL IN ANNOTATION

As should by now be clear, IAO-Intel relates not merely to
textual documents but to information artifacts of all types
including maps, videos, photographic images, websites,
databases, and so forth, both unstructured source documents
and official documents of many different varieties. Consider
the Modified Combined Obstacle Overlay (MCOO), taken
from JP 2-01.3 [15] and illustrated in Figure 2. (We refer to
this as example IA#1 in what follows.) An MCOO is defined
as:

A joint intelligence preparation of the operational environment
product used to portray the militarily significant aspects of the

Figure 2: Modified Combined Obstacle Overlay (example IA#1)

We assume that IA#1 has been prepared as part of some given
plan, IA#2. Both IAs #1 and #2 will then be referred to in
multiple further IAs including multiple databases compiled
during planning, execution and outcomes assessment.
Relevant terms used in the data models associated with these
data models will have been explicated using terms from
IAO-Intel. The latter terms can then be used along the lines
described in [9] to create annotations to both #1 and #2 on the
basis of the fact that they are referred to in the databases in
question. The results will include, for example:

a) annotations to the attributes of IA#1:
ICE: MCOO
IBE: Acetate Sheet
uses-symbology MIL-STD-2525C
authored-by person #4644
part-of plan IA#2

b) annotations relating to the aboutness of IA#1:
Avenue of Approach
Strategic Defense Belt
Amphibious Operations
Objective

and so forth. Used in conjunction with the skill ontology and
the person database, the annotations above will enable a
planner to retrieve (for example) all MCOOs relating to
amphibious operations authored by persons with certain skills.
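
The retrieval described above can be illustrated with a minimal sketch in
Python. It is not the DCGS-A implementation. The annotations for IA#1 are
transcribed from the example; everything else is an assumption made for
illustration: the Annotation structure, the use of "is-about" as the
relation name for the aboutness annotations, the PERSON_SKILLS table
standing in for the person database and skill ontology, and the skill
label "terrain analysis".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One subject-relation-value statement about an information artifact."""
    subject: str
    relation: str
    value: str


# Annotations to IA#1, transcribed from the example in the text.
# "is-about" is an assumed relation name for the aboutness annotations.
ANNOTATIONS = [
    Annotation("IA#1", "ICE", "MCOO"),
    Annotation("IA#1", "IBE", "Acetate Sheet"),
    Annotation("IA#1", "uses-symbology", "MIL-STD-2525C"),
    Annotation("IA#1", "authored-by", "person #4644"),
    Annotation("IA#1", "part-of", "plan IA#2"),
    Annotation("IA#1", "is-about", "Avenue of Approach"),
    Annotation("IA#1", "is-about", "Strategic Defense Belt"),
    Annotation("IA#1", "is-about", "Amphibious Operations"),
    Annotation("IA#1", "is-about", "Objective"),
]

# Hypothetical stand-in for the person database joined with the skill
# ontology. The skill label is invented for illustration.
PERSON_SKILLS = {"person #4644": {"terrain analysis"}}


def values_of(artifact, relation):
    """Return all values asserted for one artifact under one relation."""
    return {
        a.value
        for a in ANNOTATIONS
        if a.subject == artifact and a.relation == relation
    }


def find_artifacts(ice, about, author_skill):
    """Find artifacts of a given ICE type, about a given topic, that have
    at least one author with the given skill."""
    hits = []
    for artifact in sorted({a.subject for a in ANNOTATIONS}):
        if ice not in values_of(artifact, "ICE"):
            continue
        if about not in values_of(artifact, "is-about"):
            continue
        authors = values_of(artifact, "authored-by")
        if any(author_skill in PERSON_SKILLS.get(p, set()) for p in authors):
            hits.append(artifact)
    return hits


print(find_artifacts("MCOO", "Amphibious Operations", "terrain analysis"))
```

The query succeeds only because the artifact annotations and the person
records use the same identifier for the author. This is the kind of shared
vocabulary that the annotation strategy is intended to guarantee.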
Consider, as a second example, a collection of documents
prepared according to FM 6-99.2 [16], for example of types:

Intelligence Report [INTREP]
Intelligence Summary [INTSUM]
Logistics Situation Report [LOGSITREP]
Operations Summary [OPSUM]
Patrol Report [PATROLREP]
Reconnaissance Exploitation Report [RECCEXREP]
SAEDA Report [SAEDAREP]

Suppose further that we need to cross-reference these with
comparable sets of documents prepared by other commands,
and that we need to do this in such a way as to extract and
process the information computationally. FM 6-99.2 provides
definitions of the mentioned report types, but does not take the
step of formulating these definitions computationally.
IAO-Intel addresses this problem by providing a common,
algorithmically useful set of ontology terms that is designed
to allow consistent explication of these and related types as
they appear in different doctrinal resources. The results can
then be used for computer-aided aggregation of the data
represented using corresponding IA types, cross-checking of
mismatches, and so forth.

XI. THE DOD DATA SERVICES ENVIRONMENT

We can now return to Directive 8320.02 and address the
relevance of the work reported above to its successful
implementation. As we saw, the Directive requires that ‘all
salient metadata be discoverable, searchable, and retrievable’
through use of the DoD Data Services Environment (DSE) [6].
DSE’s numerous data sources include 35 ‘supporting
taxonomies’ derived from pre-existing terminology resources.
Problems arise, however, because the latter have been
constructed on the basis of multiple distinct methodologies
(for example as concerns the formulation of definitions).
When, on August 25, 2013, the DSE was queried for
information on “location”, the DSE reported 660 possibly
relevant sources of information. When the DSE was queried
for “unit types,” 882 possibly relevant sources of information
were reported. When types of “ground vehicles” were queried
for, 175 possibly relevant sources of information were
reported. Such redundancies present obstacles to discovery,
search and retrieval. They arise because different compilers of
authoritative data describe entities of the same types in
heterogeneous ways. This thwarts the sort of coherent
integration that is required for the mounting of what, in [6], we
referred to as the “massing of intelligence fires”.

One problem is that while the terms in thesauri and
glossaries can be used in annotations, the value derived
therefrom is limited above all because they do not allow the
benefits of inferencing and of rapid introduction and definition
of new terms which are provided by a framework of
well-constructed ontologies along the lines described in [10]. There
we show how reference ontologies can be quickly expanded
with new content to meet emerging data representation needs
and in such a way that data annotated with the newly added
terms is automatically integrated with existing data.

Imagine, for example, that we have two large bodies of data
describing (A) chemicals (properties, costs, manufacture,
transport, supply, and so forth), and (B) explosives
manufacture (raw materials, persons and skills involved,
processes and equipment and safety measures used). We will
have satisfied Directive 8320.02 in maximizing discoverability
if we annotate each body of data in accordance with
corresponding term repositories, which we can assume to have
been independently developed. Suppose now, however, that
we are called upon to integrate the data in (A) with the data in
(B). Here these annotations will likely provide no assistance,
which will in turn lead to calls for the creation of a third term
repository to be used in efforts to annotate the combined (AB)
data. The results of these efforts will then once again likely
provide no assistance when (AB) data itself needs to be
integrated with, say, data about explosives financing.

Where, in contrast, the systems for annotating (A) and (B)
reflect a common ontological approach, new annotation
resources for the merged data can easily be developed by
reusing the initially developed ontologies in the formulation of
both composite terms and corresponding definitions [10].

A further problem is that the need to create new
terminology resources for the annotation of such merged
content may lead to the need for corrections of the initial
terminology resources. Such corrections may have expensive
consequences: either they will break interoperability with the
results of earlier annotation efforts, or – if resources are
invested to correct already existing annotations to make them
conform to the new usage – they will have unforeseen
consequences for third parties who have been relying on the
older resources to be maintained consistently through time.
Such problems are minimized where terminology resources
are developed in tandem from the very start as parts of a single
suite of ontology modules developed using common
principles, exactly as is proposed by our DCGS-A strategy.
We believe that only a strategy of this sort can satisfy the
requirement that data, information, and IT services are ‘made
visible, accessible, understandable, trusted, and interoperable
throughout their lifecycles for all authorized users.’ [5]

XII. SEMANTIC TECHNOLOGY IS NOT ENOUGH

The strategy underlying DSE has much in common with a
strategy adopted widely in the semantic technology
community under the heading of Linked Open Data, a strategy
often involving the use of the Dublin Core Metadata Element
Set as controlled vocabulary. We believe that the Dublin Core
can serve as reliable controlled vocabulary for describing IA
data only where the information artifacts in question are
themselves artifacts formulated using RDF or some other
W3C recommended syntax, and unfortunately this is not the
case for many of the artifacts at issue here. We believe further
that the Linked Data approaches cannot solve the problems of
silo-formation in the IC for the reasons outlined already in
section XI above. The semantic technology community draws
a distinction between two levels of interoperability: Level 1,
resting on shared term definitions (for example drawn from
the Dublin Core), and Level 2, of what is called Formal
Semantic Interoperability. As is recognized at [17], Level 1 is
‘so open-ended that it quickly leads to a proliferation of
custom-built solutions incompatible with each other, such as
metadata expressed in document formats that require
customized software to read and data models that cannot
easily be mapped to generic, interoperable representations
such as those expressed in RDF.’ Level 2 is designed to solve
these problems by requiring that all IAs are described via
metadata formulated using RDF. Unfortunately RDF (or even
OWL) is no panacea. Multiple conflicting ontologies can be
formulated in RDF terms, yet still remain conflicting.

The solution, again, must rely on shared development of a
single suite of modularized ontologies, in which not only the
same formal language is used, but also consistent definitions
populating downward from a common upper level such as
BFO – and we note in this connection a parallel with the way
in which joint doctrine is elaborated, in a process that is
designed to ensure (at least ideally) that the same term is
defined and used consistently across the 80 plus Joint
Publications (JPs) that address the various aspects of joint
warfare in accordance with JP 1-02 [2].

[Figure 3 diagram. Top: IAO:Report; Intelligence report; Geospatial
intelligence report, Human intelligence report, Measurement and
signals intelligence report; Human geospatial intelligence report,
Signals intelligence report, Measurement intelligence report.
Middle: "The above IAO-Intel terms are defined by using terms from the
ontologies below with the help of relations such as is-about,
created-by, derives-from and so forth [7]."
Bottom left: Geospatial feature, Person, Signal measurement.
Bottom right: IA source, Intel discipline, IA classification.]

Figure 3. Top: Terms from IAO (unfilled) and IAO-Intel (grey) ontologies.
Taxonomical hierarchies: asserted – solid lines, inferred – dashed lines. Bottom
left: Domain ontologies. Bottom Right: IAO-Intel LLOs.

XIII. CONCLUSION

To summarize: IAO-Intel forms part of a collection of
ontologies that is being applied primarily to the explication of
data models and other terminology resources of importance to
DCGS-A. The terms in these ontologies are linked together
logically in virtue of the fact that each ontology uses terms
which are defined in terms of other ontologies belonging to
this same suite (as illustrated in Figure 3). This strategy for
ontology development has been tested in use over several
years in the domain of biomedical informatics, and is
gradually being adopted also in other domains, including for
example the domain of modeling and simulation, where the
identification of authoritative data sources is needed to ensure
realistic scenarios [18]. One principal feature of the strategy is
that it provides a standard means for defining new ontologies
in light of emerging needs, in a way that guarantees
consistency with the ontologies already created and with the
data annotated in their terms. We believe that this feature
makes the strategy particularly useful in addressing the
emerging challenges to the intelligence analyst in accordance with
DoD directives concerning discovery, retrieval and search.

ACKNOWLEDGMENTS

Work on IAO-Intel was supported by I2WD. Thanks are due
also to Mathias Brochhausen, Werner Ceusters, Mélanie
Courtot, Janna Hastings, James Malone, Bjoern Peters,
Jonathan Rees, and Alan Ruttenberg for their work on IAO.

REFERENCES

[1] Friedrich Wilhelm von Steuben, Regulations for the order and discipline of the troops of the United States, 1792, http://x.co/1dJEk.
[2] Department of Defense Dictionary of Military and Associated Terms, 2013, http://www.dtic.mil/doctrine/new_pubs/jp1_02.pdf.
[3] Leo Obrst, Patrick Cassidy, “The need for ontologies: Bridging the barriers of terminology and data structure”, Geological Society of America Special Paper 482, 2011.
[4] Leo Obrst, Terry Janssen, Werner Ceusters (eds.), Ontologies and Semantic Technologies for the Intelligence Community. Amsterdam: IOS Press, 2010.
[5] Sharing Data, Information, and Information Technology (IT) Services in the Department of Defense, DoD Instruction 8320.02, August 5, 2013, http://www.dtic.mil/whs/directives/corres/pdf/832002p.pdf.
[6] DSE Data Services Environment, https://metadata.ces.mil/dse.
[7] https://code.google.com/p/information-artifact-ontology.
[8] David Salmen, Tatiana Malyuta, Alan Hansen, Shaun Cronen, Barry Smith, “Integration of intelligence data through Semantic Enhancement”, Proceedings of the Conference on Semantic Technology in Intelligence, Defense and Security (STIDS), 2011, CEUR 808, pp. 6–13.
[9] Barry Smith, Tatiana Malyuta, David Salmen, William Mandrick, Kesny Parent, Shouvik Bardhan, Jamie Johnson, “Ontology for the Intelligence Analyst”, CrossTalk: The Journal of Defense Software Engineering, November/December 2012, pp. 18–25.
[10] Barry Smith, Tatiana Malyuta, William S. Mandrick, Chia Fu, Kesny Parent, Milan Patel, “Horizontal integration of warfighter intelligence data. A shared semantic resource for the Intelligence Community”, Proceedings of STIDS Conference, 2012 (CEUR 996), pp. 112–119.
[11] Ron Rudnicki, Werner Ceusters, Shahid Manzoor and Barry Smith, “What particulars are referred to in EHR data?”, American Medical Informatics Association 2007 Annual Symposium, 2007, pp. 630–634.
[12] Ron Rudnicki, “DCGS-A Ontology Program Explication Procedures”, MS, 2013.
[13] Barry Smith and Werner Ceusters, “Ontological Realism as a methodology for coordinated evolution of scientific ontologies”, Applied Ontology, 5 (2010), pp. 139–188.
[14] Basic Formal Ontology 2.0, http://ontology.buffalo.edu/BFO/Reference.
[15] Joint Publication 2-01.3 Joint Intelligence Preparation of the Operational Environment, 16 June 2009.
[16] U.S. Army Report and Message Formats (FM 6-99.2), April 2007, http://armypubs.army.mil/doctrine/DR_pubs/dr_a/pdf/fm6_99x2.pdf.
[17] Dublin Core User Guide, Last modified September 6, 2011, http://wiki.dublincore.org/index.php/User_Guide.
[18] Saikouy Diallo, Jose Padilla, “Military Interoperability Challenges”, Handbook on Real-World Applications in Modeling and Simulation, Wiley, 2012, pp. 298–332.