High-Level Information Fusion with Bayesian Semantics


    Paulo C. G. Costa, Kathryn Laskey, Kuo-Chu Chang, Wei Sun, Cheol Park, Shou Matsumoto
          Center of Excellence in C4I & Department of Systems Engineering and Operations Research
                                          George Mason University
                                              Fairfax, VA 22030
     (pcosta,klaskey,kchang)@gmu.edu; wsun@c4i.gmu.edu; cparkf@gmu.edu; smatsum2@masonlive.gmu.edu


                       Abstract

In an increasingly interconnected world information comes from various sources, usually with distinct, sometimes inconsistent semantics. Transforming raw data into high-level information fusion (HLIF) products, such as situation displays, automated decision support, and predictive analysis, relies heavily on human cognition. There is a clear lack of automated solutions for HLIF, making such systems prone to scalability issues. In this paper, we propose to address this issue with the use of highly expressive Bayesian models, which can provide a tighter link between information coming from low-level sources and the high-level information fusion systems, and allow for greater automation of the overall process. We illustrate our ideas with a naval HLIF system, and show the results of a preliminary set of experiments.

1     INTRODUCTION

Information fusion is defined as:

     “. . . the synergistic integration of information from different sources about the behavior of a particular system, to support decisions and actions relating to the system.”¹

   ¹ The International Society for Information Fusion, http://isif.org

A distinction is commonly made between low-level and high-level fusion. Low-level fusion combines sensor reports to identify, classify, or track individual objects. High-level fusion combines information about multiple objects, as well as contextual information, to characterize a complex situation, draw inferences about the intentions of actors, and support process refinement. In current information fusion systems, lower-level data fusion is typically accomplished by stove-piped systems that feed information directly to human users. Subsequent generation of high-level information fusion (HLIF) products, such as situation displays, automated decision support, and predictive analysis, relies heavily on human cognition. The tacit underlying assumption is that humans are still the most efficient resource for translating low-level fusion products into decision-relevant knowledge. While the current approach works well for many purposes, it cannot scale as the data influx grows. Automated assistance for HLIF tasks is urgently needed to mitigate cognitive overload and achieve the necessary throughput.

Stove-piped systems can be extremely efficient at exploiting a specific technology applied to a limited and well defined set of problems. Air Traffic Control Systems, for instance, employ radar technology in a very effective way to provide reliable situation awareness for radar controllers via sophisticated low-level information fusion (LLIF) techniques. The synthetic radar screen shown to traffic controllers in a sector of an Area Control Center (ACC) fuses multiple radar tracks. Data association algorithms infer whether geographically close signals captured by various radars are coming from a single or multiple aircraft. Despite the sophistication of its low-level fusion components, the ATC system relies heavily on humans for HLIF products. For instance, controllers rely on their own understanding of the overall picture to decide how to drive their tracks; area coordinators rely on their knowledge to decide whether the outbound traffic to a given airport should be redirected due to an upcoming storm; and so on.

The ATC system is a good example of a highly sophisticated stove-piped system that relies on human cognition for its major purpose: to ensure that thousands of airplanes in the US can share the airspace in a safe and effective way. As the volume of airplanes increases, more humans are needed to perform HLIF tasks. After a point, the overhead of transferring between ever-smaller control sectors becomes a major
scalability issue. That is, cognitive limitations (each human can control only a limited number of airplanes at once) together with the added complexity of adding extra cognitive units (a.k.a. traffic controllers) become a major obstacle to growth. This scalability problem is common to HLIF systems in other domains as well.

In this paper, we propose to address the issue with the use of highly expressive Bayesian models. Such models provide a tighter link between low-level and high-level information fusion systems. Because they are sufficiently expressive to reason about high-level information, they provide a coherent framework for expressing and reasoning with uncertain information that spans low and high level systems.

This paper describes our approach by way of a case study in information fusion for Maritime Domain Awareness. Section 2 motivates the use of explicit probabilistic semantics and explains the main concepts behind our approach. Section 3 introduces the Maritime Domain Ontology we used in our experiments. The experiments are described in Section 4. Section 5 concludes with a discussion.

2    Semantics in HLIF

Humans are more effective than computers at gathering various pieces of information and correlating them into a coherent picture, but still have a high error rate. For example, an intelligence analyst can correlate images and videos of a road with observers' reports that a convoy has passed in the early afternoon, and conclude that this was the same convoy that participated in a terrorist attack 10 miles down that road. These conclusions are based on an implicit understanding of how trucks and cars are represented in each type of media (video, imagery, human reports), as well as the temporal and spatial relationships between cars, roads, convoys, etc. For a computer program to perform the same inferences from the same set of sources, it must possess the same kind of knowledge. Conveying such knowledge to a computer program requires a means to make the human's tacit knowledge explicit and formal, so it can be retrieved and used when needed.

Ontologies are the current paradigm for specifying domain knowledge in an explicit and formal way. One of the most cited definitions of ontologies is “the specification of a conceptualization” [1]. To perform automated fusion in the above example, concepts such as cars, roads, convoys, people, etc., as well as their relationships, must be formalized in a way that computers can store, retrieve, and use. Not surprisingly, ontologies have been widely considered in the domain of information fusion as a means to enable automated systems to perform HLIF tasks (e.g. [2, 3, 4]).

Most languages for expressing ontologies, such as the most popular variant of the W3C Recommendation OWL [5], are based on Description Logic [6]. Other ontology languages, such as KIF² and the ISO Standard Common Logic³, are based on first-order logic. Classical logic has no standard means to represent uncertainty. This is a major drawback for HLIF systems, which must operate in environments in which uncertainty is pervasive. Inputs from LLIF systems come with uncertainty, as do the high-level domain relationships that analysts use to draw inferences about a complex situation.

   ² http://www-ksl.stanford.edu/knowledge-sharing/kif/
   ³ http://www.iso-commonlogic.org/

Although LLIF algorithms often have a basis in probability theory, it is common to report results according to a threshold rule without any confidence qualifier. This is often justified by cognitive limitations of human decision makers. As an example, suppose a video analysis report assigns 86% probability of person Joe being inside a car driving towards place A. If the threshold for the input source was 85%, an LLIF system might simply report the statement without qualification, and an HLIF system might treat this as a true statement. Such threshold rules lose uncertainty information. Other information sources, each with its own internal processing and threshold rules, might provide additional reports about Joe, A, and other aspects of the situation relevant to inferences about Joe's destination. Without uncertainty qualifiers, it is difficult for the HLIF system to draw sound inferences about Joe's destination. Other limitations of HLIF with respect to the handling of uncertainty are discussed within the context of the International Society of Information Fusion's working group on Evaluation of Technologies for Uncertainty Reasoning (ETURWG)⁴ [7, 8].

   ⁴ http://eturwg.c4i.gmu.edu/

Representing uncertainty with ontologies is an active area of research, especially in the area of the Semantic Web (e.g., [9, 10]). HLIF requires reasoning with uncertain information about complex situations with many interacting objects, actors, events and processes. Automating HLIF therefore requires expressive representation formalisms that can handle uncertainty. Probabilistic ontologies [11, 12] extend traditional ontologies to capture both domain semantics and associated uncertainty about the domain. The probabilistic ontology language PR-OWL [12] is based on Multi-Entity Bayesian Networks (MEBN) [13].

2.1    Multi-Entity Bayesian Networks

MEBNs represent the world as a collection of inter-related entities and their respective attributes. Knowledge about attributes of entities and their relationships is represented as a collection of repeatable patterns, known as MEBN Fragments (MFrags). A set of MFrags that collectively satisfies constraints ensuring a unique joint probability distribution is a MEBN Theory (MTheory).

An MFrag is a parametrized fragment of a directed graphical probability model. It represents probabilistic relationships among uncertain attributes of and relationships among domain entities. MFrags are templates that can be instantiated to form a joint probability distribution involving many random variables. Such a ground network is called a situation-specific Bayesian network (SSBN).

MEBN provides a compact way to represent repeated structures in a Bayesian Network. There is no fixed limit on the number of random variable instances, which can be dynamically generated as needed. The ability to form a consistent composition of parametrized model fragments makes MEBN well suited for knowledge fusion applications [14]. MEBN inference can be performed by instantiating relevant MFrags and assembling them into SSBNs to reason about a given situation. As evidence arrives, it is fused into the SSBN to provide updated hypotheses with associated levels of confidence. These are very convenient features for representing diverse information coming from various sensors, which make MEBN attractive as a logical basis for probabilistic ontologies.

2.2   PR-OWL Probabilistic Ontologies

There are basically three aspects that must be addressed for a representational and reasoning framework in support of effective higher-level knowledge fusion:

  1. A rigorous mathematical foundation,
  2. The ability to represent intricate patterns of uncertainty, and
  3. Efficient and scalable support for automated reasoning.

Current ontology formalisms deliver a partial answer to items 1 and 3, but lack a principled, standardized means to represent uncertainty. This has spurred the development of palliative solutions in which probabilities are simply inserted in an ontology as annotations (e.g. marked-up text describing some details related to a specific object or property). These solutions address only part of the information that needs to be represented, and too much information is lost to the lack of a good representational scheme that captures structural constraints and dependencies among probabilities. A true probabilistic ontology must be capable of properly representing those nuances. More formally:

     Definition 1 (from [11]): A probabilistic ontology (PO) is an explicit, formal knowledge representation that expresses knowledge about a domain of application. This includes:

        • Types of entities that exist in the domain;
        • Properties of those entities;
        • Relationships among entities;
        • Processes and events that happen with those entities;
        • Statistical regularities that characterize the domain;
        • Inconclusive, ambiguous, incomplete, unreliable, and dissonant knowledge related to entities of the domain; and
        • Uncertainty about all the above forms of knowledge;

     where the term entity refers to any concept (real or fictitious, concrete or abstract) that can be described and reasoned about within the domain of application. ∎

POs provide a principled, structured, sharable formalism for describing knowledge about a domain and the associated uncertainty and could serve as a formal basis for representing and propagating fusion results in a distributed system. They expand the possibilities of standard ontologies by introducing the requirement of a proper representation of the statistical regularities and the uncertain evidence about entities in a domain of application. POs can be implemented using PR-OWL⁵, a Probabilistic Web Ontology Language that extends OWL with constructs for expressing first-order Bayesian theories. PR-OWL structures map to MEBN structures, so PR-OWL provides a means to express MEBN theories in OWL.

   ⁵ http://www.pr-owl.org

2.3     The UnBBayes MEBN/PR-OWL Plugin

In order to develop and use POs, we have developed a MEBN/PR-OWL plugin to the graphical probabilistic package UnBBayes⁶, an open source, Java™-based application developed at the University of Brasilia. The plugin provides both a GUI for building probabilistic ontologies and a reasoner based on the MEBN/PR-OWL framework [15, 16]. Reasoning in the UnBBayes MEBN/PR-OWL plugin involves SSBN construction, which can be seen as a type of propositionalization, and the subsequent inferential process over the resulting SSBN. Figure 1 shows a screenshot of the UnBBayes MEBN/PR-OWL plugin.

   ⁶ http://unbbayes.sourceforge.net
                                Figure 1: The UnBBayes MEBN/PR-OWL plugin
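
The propositionalization step in SSBN construction can be pictured with a toy sketch: an MFrag is treated as a template whose random-variable names are parametrized by an entity variable, and grounding it once per known entity yields the nodes and edges of a situation-specific network. The sketch is illustrative only. The class `MFrag`, the functions `ground` and `build_ssbn`, the edge directions, and the node name `propellerTurnCount(ship)` are hypothetical; the other two node names are borrowed from the AggressiveBehavior example of Section 3. It handles single-argument templates only, carries no probability distributions, and is not the UnBBayes API or its SSBN construction algorithm.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MFrag:
    """Toy MFrag template: entity variables plus parent -> child edge templates."""
    name: str
    variables: tuple  # entity variables, e.g. ("ship",)
    edges: tuple      # (parent_template, child_template) pairs


def ground(template, binding):
    """Substitute entities for variables: 'speedChange(ship)' -> 'speedChange(ship1)'."""
    for var, entity in binding.items():
        template = template.replace(f"({var})", f"({entity})")
    return template


def build_ssbn(mfrags, entities):
    """Instantiate every MFrag for every binding of its variables to known entities."""
    nodes, edges = set(), set()
    for mfrag in mfrags:
        domains = [entities[v] for v in mfrag.variables]
        for combo in product(*domains):
            binding = dict(zip(mfrag.variables, combo))
            for parent, child in mfrag.edges:
                p, c = ground(parent, binding), ground(child, binding)
                nodes.update((p, c))
                edges.add((p, c))
    return nodes, edges


# Hypothetical structure: aggressive behavior influences speed changes,
# which in turn influence the (assumed) propeller turn count reports.
aggressive = MFrag(
    name="AggressiveBehavior",
    variables=("ship",),
    edges=(
        ("hasAggressiveBehavior(ship)", "speedChange(ship)"),
        ("speedChange(ship)", "propellerTurnCount(ship)"),
    ),
)

nodes, edges = build_ssbn([aggressive], {"ship": ["ship1", "ship2"]})
for parent, child in sorted(edges):
    print(parent, "->", child)
```

The same template produces one copy of the fragment per ship, which is the sense in which there is no fixed limit on the number of random variable instances.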

Many HLIF problems involve spatio-temporal entities, and require reasoning with discrete and continuous, possibly non-Gaussian, variables. To support this requirement, a capability for hybrid MTheories was added to the UnBBayes MEBN/PR-OWL plugin. The plugin can handle MTheories in which continuous variables can have discrete or continuous parents, but no discrete variable is allowed to have a continuous parent. To specify hybrid models, constructs were added to the local distribution scripting language for continuous distributions. The SSBN construction algorithm for building the ground model is basically unchanged except that local distributions can be continuous.

For inference in a hybrid SSBN, the plugin implements the direct message passing (DMP) algorithm [17] to compute, propagate, and integrate messages. DMP combines the unscented transformation [18] and the traditional message-passing algorithm to deal with arbitrary, not necessarily linear, functional relationships between continuous variables in the network. DMP gives exact results for polytree conditional linear Gaussian (CLG) networks and approximate results for networks with loops (via loopy propagation), networks with non-linear relationships (via the unscented transformation) and networks with non-Gaussian variables (via mixtures of Gaussians). Mixtures of Gaussian distributions are used to represent continuous messages. The number of mixture components can be as large as the size of the joint state space of all discrete parents. To achieve scalability, the algorithm can restrict the number of mixture components in the messages to satisfy a predefined error bound [19].

Specifically, without loss of generality, consider a typical hybrid model involving a continuous node X with a discrete parent node D and a continuous parent node U. As shown in Figure 2, messages sent between these nodes are: (1) the $\pi$ message from D to X, denoted as $\pi_X(D)$; (2) the $\pi$ message from U to X, denoted as $\pi_X(U)$; (3) the $\lambda$ message from X to D, denoted as $\lambda_X(D)$; and (4) the $\lambda$ message from X to U, denoted as $\lambda_X(U)$.
                                                                       Figure 2: Example hybrid model
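
To make the mixture representation concrete, the following sketch combines the two π messages arriving at X in the three-node model (D, U, X). It rests on two added assumptions that are not taken from the paper: X is conditional linear Gaussian, X | D = d, U = u ~ N(a_d u + b_d, s_d²), and the message from U is a single Gaussian. Under these assumptions, integrating out U gives one Gaussian component per state of D, weighted by the message from D. The second function collapses a mixture to a single Gaussian by moment matching, which is one common way to limit the number of components and may differ from the error-bounded method of [19]. All function names and numbers are hypothetical, and this is not the DMP implementation in UnBBayes.

```python
def pi_message_clg(pi_d, clg_params, pi_u):
    """Combine pi messages for a CLG node X with parents D (discrete), U (continuous).

    pi_d:       dict mapping each state d of D to its probability
    clg_params: dict mapping d to (a, b, s2), meaning X | D=d, U=u ~ N(a*u + b, s2)
    pi_u:       (mean, variance) of the Gaussian message from U
    Returns a list of (weight, mean, variance) mixture components.
    """
    m_u, v_u = pi_u
    components = []
    for d, weight in pi_d.items():
        a, b, s2 = clg_params[d]
        # Linear transformation of a Gaussian plus independent Gaussian noise.
        components.append((weight, a * m_u + b, a * a * v_u + s2))
    return components


def collapse(components):
    """Moment-match a Gaussian mixture to one Gaussian (same mean and variance)."""
    total = sum(w for w, _, _ in components)
    mean = sum(w * m for w, m, _ in components) / total
    var = sum(w * (v + (m - mean) ** 2) for w, m, v in components) / total
    return mean, var


# Hypothetical numbers: D is a Boolean parent that shifts the mean of X by 5.
pi_d = {"true": 0.2, "false": 0.8}
clg = {"true": (1.0, 5.0, 1.0), "false": (1.0, 0.0, 1.0)}
mixture = pi_message_clg(pi_d, clg, pi_u=(10.0, 4.0))
mean, var = collapse(mixture)
print(mixture)
print(mean, var)
```

With several discrete parents the dictionary would be keyed by joint parent states, which is why the component count can grow to the size of the joint discrete state space.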
In general, for a polytree network, any node X d-separates evidence into $\{e^+_X, e^-_X\}$, where $e^+_X$ and $e^-_X$ are evidence from the sub-network “above” X and “below” X respectively. The $\lambda$ and $\pi$ messages maintained in each node are defined as

$$\lambda(X) = P(e^-_X \mid X) \tag{1}$$

and

$$\pi(X) = P(X \mid e^+_X) \tag{2}$$

With the two messages, it is straightforward to see that the belief of a node X given all evidence is just the normalized product of the $\lambda$ and $\pi$ values, namely,

$$\mathrm{BEL}(X) = P(X \mid e) = P(X \mid e^+_X, e^-_X) = \alpha\,\lambda(X)\,\pi(X) \tag{3}$$

where $\alpha$ is a normalizing constant. It can be shown that for a hybrid network, the $\pi$ message can be recursively computed as

$$\pi(X) = \sum_D \int_U P(X \mid D, U)\,\pi_X(D)\,\pi_X(U)\,dU = \sum_D \pi_X(D) \int_U P(X \mid D, U)\,\pi_X(U)\,dU \tag{4}$$

where the integral of $P(X \mid D = d, U)\,\pi_X(U)$ over U is equivalent to a functional transformation of $\pi_X(U)$, which is a continuous message in the form of a Gaussian mixture.

Similarly, the $\lambda$ message for the discrete parents can be obtained as

$$\lambda_X(D = d) = \int_X \lambda(X) \int_U P(X \mid D = d, U)\,\pi_X(U)\,dU\,dX \tag{5}$$

where $\int_U P(X \mid D = d, U)\,\pi_X(U)\,dU$ is a functional transformation of a distribution over U to X.

On the other hand, the $\lambda$ message for the continuous parent U can be computed as

$$\lambda_X(U) = \int_X \lambda(X) \sum_D P(X \mid D, U)\,\pi_X(D)\,dX = \sum_D \pi_X(D) \int_X \lambda(X)\,P(X \mid D, U)\,dX \tag{6}$$

Equations (3) to (6) form a baseline for computing direct messages between mixed variables.

As mentioned earlier, with the unscented transformation, this method can be modified for arbitrary non-linear non-Gaussian hybrid models. In addition, the algorithm is scalable by combining the mixture components in the messages with any given error bound.

3       Maritime Domain Awareness PO

In 2008, the Department of Defense issued a directive to establish policy and define responsibilities for Maritime Domain Awareness (MDA).⁷ The directive defines MDA as the “effective understanding of the global maritime domain and its impact on the security, safety, economy, or environment of the United States.” The ability to automatically integrate information and recommendations from multiple intelligence sources in a complex and ever-changing environment to produce a dynamic, comprehensive, and accurate battlespace picture is a critical capability for MDA. This section reports on a prototype probabilistic ontology for maritime domain awareness (MDA-PO). This model, which is depicted in Figure 3, was developed as part of the PROGNOS project [20, 21], with the assistance of two retired Navy officers who served as subject-matter experts.

   ⁷ www.dtic.mil/whs/directives/corres/pdf/200502p.pdf

Figure 4 depicts one of the MFrags of the MDA-PO, the AggressiveBehavior MFrag. As the name implies, this is a chunk of knowledge that captures some of the concepts and relationships that are useful to infer whether a ship is displaying aggressive behavior. The three different types of MFrag nodes can be seen: Context, Input, and Resident nodes.

Resident nodes are the random variables that form the core subject of an MFrag. The MFrag defines a local distribution for each resident node as a function of the parents of the resident node in the fragment graph. Resident nodes can be discrete or continuous. There are three discrete nodes in this MFrag, which are depicted as yellow rounded rectangles in the picture, and five continuous nodes, depicted as rounded rectangles with double lines.

As an example of how the representation works, reports on the propeller turn count of a ship will be an indicator of whether the ship speed is changing or not. Also, there will be different probability distributions for speedChange(ship) if the ship is behaving aggressively or not (i.e. if the state of node hasAggressiveBehavior(ship) is true or false).

Input nodes, depicted as gray trapezoids in the figure, serve as “pointers” referring to resident nodes in other MFrags. Input nodes influence the local distributions of resident nodes, but their own distributions are defined in the MFrags in which they are resident.

In a complete MTheory, every input node must point to a resident node in some MFrag. For instance, the hasBombPortPlan(ship) input node influences the distribution of all the hasAggressiveBehavior(ship) nodes
that would be instantiated in an SSBN construction process.

                                             Figure 3: The MDA-PO

Context nodes are Boolean (i.e., true/false) random variables representing conditions that must be satisfied for the probability distribution of an MFrag to apply. Like input nodes, context nodes also have distributions defined in other MFrags.

By allowing uncertainty on context nodes, MEBN can represent several types of sophisticated uncertainty patterns, such as relational uncertainty or existence uncertainty. There is only one context node in the AggressiveBehavior MFrag, seen in the figure as a green pentagon.

The MDA-PO is described in detail in [22]. In PROGNOS, the MDA-PO was also used to build the model used to run the test and evaluation process, which we explain in the next Section.

4   Experimental Results

The main objective of the set of experiments presented in this paper was to assess the accuracy, scalability, and overall performance of the SSBN construction and DMP algorithms combined. As a benchmark, we used the UnBBayes implementation of the Junction Tree (JT) algorithm, which is a well-known belief propagation method for Bayesian networks [23] and the focus of various efforts on algorithm optimization (e.g. [24, 25]).

      Figure 4: The Aggressive Behavior MFrag

4.1   Setup and Metrics

Obtaining a real data set for maritime HLIF was not an option for our team. Therefore, we generated various synthetic datasets through an agent-based simulation module, depicted in Figure 5. The module generates simulated scenarios, including entities (e.g., ships, people) and their features. The simulated scenarios serve as the ground truth for evaluating performance.

of them. In this case, the HLIF system must integrate the data coming from the ship sensors with information coming from other sources, such as intelligence reports, signals intelligence, HUMINT, and others. We emulate this in the experiments with “area queries” in which all ships within a 60NM radius are queried by the system. More precisely, information on all n known ships within that radius triggers the instantiation of MFrags storing pertinent knowledge, including the one containing the shipOfInterest(ship) node. The system would then query the n shipOfInterest(ship) nodes that were instantiated.

All experiments were performed on a dedicated computer with an Intel quad-core i7™ processor with 8 GB of RAM, running MS Windows 7™ 64bit.

Accuracy is assessed by the quadratic scoring rule [26]:

$$B(r, i) = \sum_{j=1}^{C} (y_j - r_j)^2$$

where $y_j = 1$ when the $j$th event is correct and 0 otherwise, $r_j$ is the probability assessed for the $j$th event, and C is the number of classes. This is a proper scoring rule, i.e., the expected score is minimized when the assessed probability is equal to the actual frequency.

To assess scalability, we measured the computation time as a function of the generated SSBN size and the number of ships involved in a query.
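
As a worked illustration of the quadratic scoring rule, the sketch below evaluates B for a single query with C = 2 classes. It reads the argument i as the index of the class that is true in the ground truth, so that y_j = 1 only for j = i; this reading, the function name `quadratic_score`, and the probability values are assumptions made for the example. How per-query scores are aggregated into the reported results is not covered here.

```python
def quadratic_score(r, i):
    """Quadratic score sum_j (y_j - r_j)^2, with y_j = 1 only for the true class i.

    r: assessed probabilities, one per class (must sum to 1)
    i: zero-based index of the class that is true in the ground truth
    """
    if abs(sum(r) - 1.0) > 1e-9:
        raise ValueError("assessed probabilities must sum to 1")
    if not 0 <= i < len(r):
        raise ValueError("true class index out of range")
    return sum(((1.0 if j == i else 0.0) - r_j) ** 2 for j, r_j in enumerate(r))


# A confident forecast that matches the ground truth scores close to 0 ...
good = quadratic_score([0.9, 0.1], 0)
# ... while the same forecast scores badly when the other class is true.
bad = quadratic_score([0.9, 0.1], 1)
# A non-committal forecast sits in between, whichever class is true.
hedged = quadratic_score([0.5, 0.5], 0)
print(good, bad, hedged)
```

Because the report keeps the full probability vector instead of a thresholded true/false answer, the score distinguishes a forecast of 0.9 from one of 0.51 even when both would pass the same threshold.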
The simulation module also generates reports of the
kind the eventual operational system is expected to          4.2   Preliminary Results
receive, thus exercising the interfaces and the reason-
ing module in a realistic manner [21].                       The results for accuracy are depicted in Table 1. From
The simulation is based on maritime activities (regular      the obtained scores, it is clear that the Hybrid sys-
and suspicious) with the objective of prevention and         tem performed better in capturing both the cases in
disruption of terrorist attacks, sabotage, espionage, or     which the shipOfInterest(ship) node state was true
subversive acts. Therefore, the agents on the simu-          (⇡ 10.36% better) in the ground truth, as well as those
lation tool simulate commercial, fishing, recreational,      in which the node state was false (⇡ 14.02% better).
and other types of ships in their normal and suspicious                  Table 1: Results for Accuracy
behaviors. Suspicious behaviors are characterized by             Queries on the
ships that do not follow their regular or most prob-                                    Hybrid         Discrete
                                                               “Ship of Interest”
able routes according to their origin and destination,                                  System         System
                                                                      node
by ships that meet in the middle of the ocean for no           Ship of Interest =
apparent reason, etc.                                                                   0.88203        0.79947
                                                              true (ground truth)
For the experiments, the simulation engine generated           Ship of Interest =
                                                                                        0.89439        0.78439
9 types of scenarios with di↵erent combinations of the        false (ground truth)
number of ships and the associated entities (e.g. or-
ganizations, people, etc.). The maximum size of the          These results are consistent with expectations, given
dataset was limited to 10K entities. After generating        the inherent inaccuracies in discretizing the continuous
each dataset, we added noise as to assess robustness         random variables in the MDA-PO. The 10 to 14% im-
to model misspecification. The scenario for the ex-          provement from the hybrid with respect to the discrete
periments emulates a U.S. Navy destroyer conducting          model was consistent all over the 9 datasets. However,
Maritime Security Operations in a relatively busy area.      since the datasets were all generated from the same
This is a “needle in a haystack” type of problem, in         model, it is difficult to assess robustness with this run
which the destroyer has information about dozens of          of experiments. In any case, more complex relation-
ships within a certain radius, but can only verify a few     ships between nodes are likely to increase the di↵erence
                                   Figure 5: PROGNOS simulation module


in accuracy between the JT and the DMP systems.
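The quadratic scoring rule used as the accuracy metric in Section 4.1 can be illustrated with a short sketch. This is not the PROGNOS evaluation code: the function names (`quadratic_score`, `expected_binary_score`, `best_report`) and the grid search are assumptions made only for illustration. The second half checks numerically, for a two-class case, that the expected score is smallest when the reported probability equals the actual frequency, which is what makes the rule proper.

```python
from typing import Sequence


def quadratic_score(probs: Sequence[float], true_index: int) -> float:
    """Quadratic score B(r, i) = sum_j (y_j - r_j)^2.

    probs      -- assessed probabilities r_1..r_C over the C classes
    true_index -- zero-based index i of the class that actually occurred
    """
    if not 0 <= true_index < len(probs):
        raise ValueError("true_index must refer to one of the classes")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # y_j is 1 for the class that occurred and 0 for every other class.
    return sum(((1.0 if j == true_index else 0.0) - r) ** 2
               for j, r in enumerate(probs))


def expected_binary_score(p: float, r: float) -> float:
    """Expected score when the event truly occurs with frequency p
    but the assessor reports probability r (two-class case)."""
    report = [r, 1.0 - r]
    return p * quadratic_score(report, 0) + (1.0 - p) * quadratic_score(report, 1)


def best_report(p: float, steps: int = 100) -> float:
    """Grid-search the report r in [0, 1] that minimizes the expected score."""
    candidates = [k / steps for k in range(steps + 1)]
    return min(candidates, key=lambda r: expected_binary_score(p, r))


if __name__ == "__main__":
    print(quadratic_score([0.9, 0.1], 0))  # confident and correct: small score
    print(quadratic_score([0.1, 0.9], 0))  # confident and wrong: large score
    print(best_report(0.3))                # minimizer coincides with p
```

Lower values are better under this definition: a perfect assessment scores 0 and a maximally wrong one scores 2.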
Figure 6 below shows the results of the area query experiments. The x-axis conveys the number of nodes generated by each query, which tends to be correlated with the number of ships. However, it was not uncommon to see a few ships generating a large network or vice-versa. The y-axis depicts query time in milliseconds.

      Figure 6: Area Query Time vs. Network Size

Regarding performance, most of the generated networks were between 100 and 500 nodes, and generally yielded a query time below 2 seconds for DMP and 5 seconds for JT. The maximum query time for JT was 7.3 seconds, while the DMP system worst case was 6.4 seconds for a query. The results also show lower variance for DMP query times for a given network size.

Figure 7 shows results for query time vs. number of ships within the 60 NM area.

      Figure 7: Query Time vs. Number of Ships

This was a different run of experiments, in which the focus was on keeping a controlled number of ships within the query area. This allowed an assessment of how each system reacted to the controlled increase in that number. The performance of JT stayed relatively steady, while the DMP system performed much better for simpler problems but approached the performance of JT when the number of ships was above twenty.

These results were also expected, since our implementation of DMP was not as optimized as the JT implementation in UnBBayes. More specifically, the DMP algorithm was initially implemented in MATLAB™, and the translation to Java™ was not tuned for performance. Yet, the graph suggests a linear increase in the range considered.
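Query-time measurements of the kind reported above could be collected with a small harness such as the following. This is a minimal sketch under stated assumptions, not the instrumentation used in the experiments: `time_queries` and `dummy_query` are hypothetical names, and `dummy_query` is only a stand-in workload whose cost grows with the number of ships, since the actual SSBN construction and inference calls are not shown in the text.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def time_queries(run_query: Callable[[int], object],
                 sizes: Iterable[int],
                 repeats: int = 5) -> Dict[int, float]:
    """Return the median wall-clock time in milliseconds for each problem size."""
    results: Dict[int, float] = {}
    for size in sizes:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(size)
            samples.append((time.perf_counter() - start) * 1000.0)
        # The median is less sensitive than the mean to occasional slow runs.
        results[size] = statistics.median(samples)
    return results


def dummy_query(n_ships: int) -> int:
    """Hypothetical stand-in workload (NOT an SSBN query)."""
    return sum(i * i for i in range(n_ships * 1000))


if __name__ == "__main__":
    for n_ships, ms in time_queries(dummy_query, [5, 10, 20]).items():
        print(f"{n_ships:>3} ships: {ms:.3f} ms")
```

Replacing `dummy_query` with a call into the reasoner under test, and sweeping either the number of ships or the resulting network size, would yield the two kinds of curves discussed in this section.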
5   Discussion

The experiments were meant to simulate an HLIF system within a relatively simple scenario. In spite of the overall size of the experiments and the fact that they were conducted within a controlled environment, the performance figures are promising. There remain many ways to improve the efficacy of the algorithm. As previously mentioned, the main objective of the testing and evaluation was to assess the gains in accuracy, which clearly lived up to our expectations.

The results also show promise for the feasibility of using probabilistic ontologies as a driver for HLIF systems. Our future steps towards this goal are to continue the optimization of the algorithms, and to seek out new knowledge acquisition techniques. The latter involves automated learning, which has been the subject of our latest research efforts. We also plan to address the research on rare events, and to work with other datasets.

Acknowledgements

The PROGNOS project was partially funded by the Office of Naval Research. The authors acknowledge Richard Haberlin and Michael Lehocky, who served as subject-matter experts for developing the MDA-PO, and Rommel Carvalho, for his various contributions to the PROGNOS project.

References

 [1] T. R. Gruber, “A translation approach to portable ontology specifications,” Knowledge Acquisition, vol. 5, no. 2, pp. 199–220, 1993.

 [2] D. McGuinness, “Ontologies for information fusion,” in Information Fusion, 2003. Proceedings of the Sixth International Conference of, vol. 1, pp. 650–657, 2003.

 [3] E. G. Little and G. L. Rogova, “Designing ontologies for higher level fusion,” Information Fusion, vol. 10, pp. 70–82, Jan. 2009.

 [4] E. Blasch, E. Dorion, P. Valin, E. Bosse, and J. Roy, “Ontology alignment in geographical hard-soft information fusion systems,” in Information Fusion (FUSION), 2010 13th Conference on, pp. 1–8, July 2010.

 [5] P. F. Patel-Schneider, P. Hayes, and I. Horrocks, “OWL web ontology language semantics and abstract syntax.” http://www.w3.org/TR/owl-semantics/, Feb. 2004. W3C Recommendation.

 [6] F. Baader, I. Horrocks, and U. Sattler, “Description logics as ontology languages for the semantic web,” in Mechanizing Mathematical Reasoning, pp. 228–248, 2005.

 [7] E. Blasch, J. Llinas, D. Lambert, P. Valin, S. Das, C. Chong, M. Kokar, and E. Shahbazian, “High level information fusion developments, issues, and grand challenges: Fusion 2010 panel discussion,” in Information Fusion (FUSION), 2010 13th Conference on, pp. 1–8, July 2010.

 [8] P. C. G. Costa, R. N. Carvalho, K. B. Laskey, and C. Y. Park, “Evaluating uncertainty representation and reasoning in HLF systems,” in Proceedings of the Fourteenth International Conference on Information Fusion, (Chicago, Illinois, USA), July 2011.

 [9] K. Laskey and K. Laskey, “Uncertainty reasoning for the world wide web: Report on the URW3-XG incubator group,” URW3-XG, W3C, 2008.

[10] L. Predoiu and H. Stuckenschmidt, “Probabilistic extensions of semantic web languages - a survey,” in The Semantic Web for Knowledge and Data Management: Technologies and Practices, Idea Group Inc, 2008.

[11] P. C. G. Costa, Bayesian semantics for the Semantic Web. PhD dissertation, George Mason University, Fairfax, VA, USA, July 2005. Brazilian Air Force.

[12] P. C. G. Costa and K. B. Laskey, “PR-OWL: a framework for probabilistic ontologies,” in Proceedings of the International Conference on Formal Ontology in Information Systems (FOIS 2006) (B. Bennet and F. Christiane, eds.), vol. 150 of Frontiers in Artificial Intelligence and Applications, (Baltimore, MD, USA), pp. 237–249, IOS Press, Nov. 2006.

[13] K. B. Laskey, “MEBN: a language for first-order Bayesian knowledge bases,” Artificial Intelligence, vol. 172, pp. 140–178, Feb. 2008.

[14] P. C. G. Costa, K. Chang, K. B. Laskey, and R. N. Carvalho, “High level fusion and predictive situational awareness with probabilistic ontologies,” (George Mason University, Fairfax, VA, USA), May 2010.

[15] R. N. Carvalho, L. L. Santos, M. Ladeira, and P. C. G. Costa, “A GUI tool for plausible reasoning in the semantic web using MEBN,” (Los Alamitos, CA, USA), pp. 381–386, IEEE Computer Society, Oct. 2007.
[16] P. Costa, M. Ladeira, R. N. Carvalho, K. Laskey, L. Santos, and S. Matsumoto, “A first-order Bayesian tool for probabilistic ontologies,” in Proceedings of the Twenty-First International Florida Artificial Intelligence Research Society Conference (FLAIRS 2008), (Coconut Grove, FL, USA), pp. 631–636, AAAI Press, May 2008.

[17] W. Sun and K. Chang, “Direct message passing for hybrid Bayesian network and its performance analysis,” in SPIE Defense and Security Symposium, (Orlando, FL, USA), Apr. 2010.

[18] S. J. Julier, “The scaled unscented transformation,” in Proceedings of the American Control Conference, vol. 6, pp. 4555–4559, 2002.

[19] H. Chen, K. Chang, and C. J. Smith, “Constraint optimized weight adaptation for Gaussian mixture reduction,” in SPIE Defense and Security Symposium, (Orlando, FL, USA), Apr. 2010.

[20] P. C. G. Costa, K. B. Laskey, and K. Chang, “PROGNOS: applying probabilistic ontologies to distributed predictive situation assessment in naval operations,” in Proceedings of the Fourteenth International Command and Control Research and Technology Conference, (Washington, DC, USA), CCRP Publications, June 2009. Best paper award of the Collaborative Technologies for Network-Centric Operations Track.

[21] R. N. Carvalho, P. C. G. Costa, K. B. Laskey, and K. Chang, “PROGNOS: predictive situational awareness with probabilistic ontologies,” (Edinburgh, UK), July 2010.

[22] R. Carvalho, R. Haberlin, P. Costa, K. Laskey, and K. Chang, “Modeling a probabilistic ontology for maritime domain awareness,” in Information Fusion (FUSION), 2011 Proceedings of the 14th International Conference on, pp. 1–8, July 2011.

[23] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1 ed., Sept. 1988.

[24] L. Zheng, O. J. Mengshoel, and J. Chong, “Belief propagation by message passing in junction trees: Computing each message faster using GPU parallelization,” Proc. of the 27th Conference on Uncertainty in Artificial Intelligence (UAI-11), 2011.

[25] A. L. Madsen and F. V. Jensen, “Lazy propagation in junction trees,” in Proc. 14th Conf. on Uncertainty in Artificial Intelligence, pp. 362–369, Morgan Kaufmann Publishers, 1998.

[26] G. W. Brier, “Verification of forecasts expressed in terms of probability,” Monthly Weather Review, vol. 78, pp. 1–3, Jan. 1950.