<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A New Algorithm for Generating Situation-Specific Bayesian Networks Using Bayes-Ball Method</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lae´cio L. Santos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rommel N. Carvalho</string-name>
          <email>rommel.carvalho@cgu.gov.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcelo Ladeira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Li Weigang</string-name>
          <email>weigangg@unb.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bras ́ılia (UnB) Bras ́ılia</institution>
          ,
          <addr-line>DF</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Research and Strategic Information, Brazilian Office of the Comptroller General (CGU)</institution>
          ,
          <addr-line>Bras ́ılia, DF</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Multi-Entity Bayesian Network (MEBN) is an expressive first-order probabilistic logic that represents the domain using parameterized fragments of Bayesian networks. Probabilistic-OWL (PR-OWL) uses MEBN to add uncertainty support to OWL, the main language of the Semantic Web. The reasoning in MEBN is made by the construction of a Situation-Specific Bayesian Network (SSBN), a minimal Bayesian network sufficient to compute the response to queries. A Bottom-Up algorithm has been proposed for generating SSBNs in MEBN. However, this approach presents scalability problems since the algorithm starts from all the query and evidence nodes, which can be a very large set in real domains. To address this problem, we present a new scalable algorithm for generating SSBNs based on the Bayes-Ball method, a well-known and efficient algorithm for discovering d-separated nodes of target sets in Bayesian networks. The novel SSBN algorithm used together with Resource Description Framework (RDF) databases and PR-OWL 2 RL, an amenable version of PR-OWL, allows reasoning with probabilistic ontologies containing large assertive bases, offering a scalable approach for the treatment of uncertainty in the Semantic Web.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Uncertainty can occur in a variety of forms in several types of domains, which
encouraged the proposition of various new formalisms. Bayesian networks are one of the
probabilistic methodologies most widely studied and used when working with
uncertainty due to their power of representation, and the well-known algorithms that make
inference to them.</p>
      <p>
        Multi-Entity Bayesian Network (MEBN) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is an expressive first-order
probabilistic logic that represents the domain using parameterized fragments of Bayesian
networks (BNs). MEBN is the base for the Probabilistic-OWL (PR-OWL) language [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
PR-OWL adds uncertainty support to Web Ontology Language (OWL), the main
language of the Semantic Web, allowing the construction of probabilistic ontologies.
PROWL 2 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] extends the original language allowing a better integration with the OWL
semantics, where the modeler can define uncertainty starting with an existent OWL
ontology. PR-OWL 2 RL [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] maps PR-OWL 2 language to be represented in OWL 2
RL profile, which allows reasoning in polynomial time for the main reasoning tasks.
PR-OWL 2 RL uses RDF databases for both storage and reasoning with very large
ontologies represented as RDF triples [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Inference in MEBN consists of the generation of a Situation-Specific Bayesian
Network (SSBN), a minimal Bayesian network sufficient to solve a set of target nodes
for which it is necessary to calculate the probability. Laskey [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] presents a
BottomUp algorithm to generate SSBNs. This algorithm was implemented in the UnBBayes
framework [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which was the first tool to implement MEBN.
      </p>
      <p>
        The Bottom-Up algorithm, however, does not work well in domains with large
assertive bases. In the Procurement Fraud ontology presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], for example, there
are millions of statements about the entities related to the analyzed case (e.g.,
enterprises, enterprise owners, and committee members). The Bottom-Up algorithm starts
the generation of a Bayesian network by instantiating the query and evidence nodes.
In this kind of domain, all the information related to every entity in the dataset will
become nodes instantiated and evaluated to see whether they can influence the query
nodes. Since the query nodes are usually related to just a small subset of these entities,
usually, most of the instantiated nodes will be removed from the final BN. Thus, a new
algorithm is necessary to work with domains such as this, and with PR-OWL 2 RL
language, which is designed to work with assertive bases with millions of RDF triples.
      </p>
      <p>
        This paper proposes a novel algorithm for generating SSBNs based on Bayes-Ball
algorithm. Bayes-Ball [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is a well-known and efficient method that was developed for
searching relevant information in Bayesian networks. The algorithm can be used to see
which nodes are d-connected with a set of target nodes in polynomial time on the size
of the graph. The idea is to extend the Bayes-Ball algorithm to generate the SSBN on
demand, generating only the nodes d-connected with the query nodes.
      </p>
      <p>The rest of paper is organized as follows. Section 2 presents some necessary
concepts for this work: MEBN, PR-OWL, and Bayes-Ball algorithm. Section 3 presents the
Bottom-Up algorithm for generating SSBNs with some considerations about the
implemented version in the UnBBayes framework. Section 4 further discusses the proposed
algorithm for extending Bayes-Ball algorithm to generate SSBNs. Section 5 presents
a comparison between Bottom-Up and Bayes-Ball algorithms using plug-ins of
PROWL 2 and PR-OWL 2 RL implemented in UnBBayes framework. Finally, Section 6
describes some concluding remarks and future works.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Fundamentals</title>
      <p>In this section we will present the main concepts related to the new algorithm proposed
for generating SSBNs: MEBN, PR-OWL, and the Bayes-Ball algorithm.
2.1</p>
      <sec id="sec-2-1">
        <title>Multi-Entity Bayesian Networks</title>
        <p>
          Multi-Entity Bayesian Network (MEBN) is a first-order language that specifies
probabilistic knowledge bases as parameterized fragments of Bayesian networks [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. By the
incorporation of first-order logic expressiveness, MEBN solves the main limitation of
Bayesian networks: the impossibility of representing situations where the number of
random variables is unknown.
        </p>
        <p>
          The domain is modeled as a MEBN Theory (MTheory). An MTheory consists of
a set of MEBN Fragments (MFrags) that satisfy consistency conditions ensuring the
existence of a unique joint probability distribution over its random variables [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Each
MFrag has a collection of parametrized related random variables representing the
properties and relationships of the entities. MEBN works as a template: an MFrag can be
instantiated to create as many instances of the hypotheses as needed [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          MFrags are composed of resident, input, and context nodes. To illustrate this,
Figure 1 shows the MFrag Front of Enterprise, from the domain Procurement Fraud,
presented in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. This domain models the possibility of a fraud in public procurements
using information such as the existence of relationships between members of the
procurement committee and the winning enterprise. Resident nodes (node 9) and input nodes
(nodes 6-8) are random variables representing the properties and relationships of the
entities. They are parametrized with arguments (ordinary variables) that will be filled with
the entities during the instantiation of the model. Resident node isFrontFor, for
example, has the arguments person and enterprise. Each resident node has a Local
Probability Distribution (LPD) defined in its home MFrag. Input nodes point to resident
nodes in other MFrags. Context nodes (nodes 1-5) are first-order logic expressions that
define constraints that must be verified and valid in order to allow the instantiating of
the corresponding MFrag with its defined LPD per resident node. If the context nodes
are not satisfied, the MFrag will be instantiated using a default distribution.
        </p>
        <p>
          Inference in MEBN consists of answering queries posed to assess the belief in a
set of target random variables given a set of evidence random variables (findings) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
Therefore, the MTheory will be instantiated using the entities and findings of the
knowledge base, which generates a Situation-Specific Bayesian Network (SSBN), a minimal
Bayesian network sufficient to compute the correct probability of the query nodes [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
Over the SSBN generated, the inference can be made using any algorithm, exact and
approximate, existent for Bayesian networks.
2.2
        </p>
        <p>
          PR-OWL
Probabilistic OWL (PR-OWL) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] adds uncertainty support to OWL using MEBN.
PROWL is an upper-ontology that consists of a set of classes, subclasses, and properties
that collectively form a framework for building probabilistic ontologies [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          Figure 2, extracted from [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], shows how PR-OWL models the main concepts
involved in defining a MEBN Theory. The probabilistic part of the ontology is modeled
using the MTheory class, composed of a set of MFrags. MFrags are composed of nodes
that represent random variables with their respective probability distribution and an
exhaustive set of possible values, individuals of the Entity class.
        </p>
        <p>
          PR-OWL 2 [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] extends the original language solving its two major limitations: the
lack of a formal mapping between random variables of PR-OWL and relationships
defined in OWL, and the lack of compatibility between PR-OWL types and the types that
already exist in OWL language. To solve the first limitation, PR-OWL 2 utilizes the
relationship definesUncertaintyOf, which creates a link between a PR-OWL
random variable and an OWL property. With this construction, and the additional
properties isSubjectIn and isObjectIn, it is possible to model ontologies containing
interrelated probabilistic and deterministic declarations. To solve the second limitation,
PR-OWL 2 maps PR-OWL types to types already defined in OWL. These modifications
approximate PR-OWL semantics to OWL semantics.
        </p>
        <p>
          PR-OWL 2 RL [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] adapts PR-OWL 2 to the OWL 2 RL profile, allowing the use
of triplestores - a type of database that stores the knowledge using RDF triples. The
language is designed to work with probabilistic ontologies that have large assertive
bases. Along with restricting the ontologies to the OWL 2 RL profile to allow inference
in polynomial time, PR-OWL 2 RL also restricts the set of logical expressions allowed
to evaluate them using queries submitted to triplestores (see [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]).
        </p>
        <p>
          Both versions of PR-OWL were implemented as plug-ins for UnBBayes. UnBBayes 3
is an open-source framework developed to work with probabilistic reasoning. The
PR3 http://sourceforge.net/projects/unbbayes
OWL plug-in was the first worldwide implementation of MEBN [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. It includes a
Graphical User Interface (GUI) for visual modeling of an MTheory, a knowledge
representation and reasoning system, and one implementation of the Bottom-Up algorithm
proposed in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] for generating the SSBN. After generating the SSBN, UnBBayes uses
the Junction Tree algorithm [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] for compiling and propagating evidence in the Bayesian
network generated. Junction Tree is an exact algorithm that transforms the network in a
group of cliques, in which probabilities can be propagated in polynomial time.
2.3
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Bayes-Ball Algorithm</title>
        <p>
          The Bayes-Ball algorithm [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] allows the verification of which nodes of a Bayesian
network are d-separated from a set of target nodes, given a set of evidence nodes. Thus,
it allows the identification of the probabilistically irrelevant (independent) nodes.
        </p>
        <p>Given a Bayesian network B = (N; A), where N is the set of nodes and A the set of
arcs, Bayes-Ball algorithm allows the calculation of the set Ni(J jK), which contains
the irrelevant nodes to a set of target nodes XJ given the set of evidence XK , as shown
in Equation 1.</p>
        <p>Ni(J jK) = fi 2 N : Xi ? XJ jXK g :
(1)</p>
        <p>The d-separation, stated in Theorem 1, is a graphical criterion that identifies
conditional independences in Bayesian networks.</p>
        <p>
          Theorem 1. (d-separation) Two variables, A and B, in a directed acyclic graph are
dseparated if for every trail between A and B there is an intermediate variable W, such
as: a) connection is serial or divergent, and the state of W is known, or, b) connection
is convergence and neither W nor its descendants have any evidence [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>The idea of the Bayes-Ball algorithm is to pass an imaginary ball from the target
nodes to the nodes that have an active trail to them. The ball is initially received from
each target node from a virtual child node and is passed from one node to another
following rules based on d-separation. Depending whether the ball was received from
a parent or from a child, and whether the node that received it is evidence or not, the
ball is passed to its parents, to its children, or blocked. Figure 4 illustrates the following
rules:
R1 If the ball is received from a parent and the node does not have evidence, the ball is
passed to all children of this node.</p>
        <p>R2 If the ball is received from a child and the node does not have evidence, the ball is
bounced back to all the children of the node and passed to all parents of the node.
R3 If the ball is received from a parent and the node has evidence, the ball is bounced
back to all parents of the node.</p>
        <p>R4 If the ball is received from a child and the node has evidence, the ball is blocked (it
is not passed to any node).</p>
        <p>
          The irrelevant nodes, Ni(J jK), are the nodes not marked as visited. The set of
irrelevant nodes calculated by Bayes-Ball algorithm correspond to the set of nodes
dseparated from target nodes given K [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          The complexity of the algorithm is O(jN j+jAvj) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], where Av are the arcs visited
by the algorithm. It is necessary to pass to every node to initialize the flags. In the worst
case scenario, the algorithm is linear on the size of the graph [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] because every node
and arc will be visited.
3
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Bottom-Up Algorithm for Generating SSBNs</title>
      <p>This section presents the version of the SSBN Bottom-Up construction algorithm
implemented in UnBBayes framework.</p>
      <p>The steps are described below. The input is the generative MTheory, the knowledge
base containing the assertive information, and the queries to be solved.</p>
      <p>
        The Bayesian network built during the second step is known as Grand BN and it will
possibly contain disconnected networks. The prune phase is important for removing this
disconnected networks and for eliminating nodes that are irrelevant for determining the
probabilistic distribution of the query node. Laskey [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposes pruning the following
nodes:
– d-separated nodes: nodes d-separated from query nodes given the evidence.
– Barren nodes : nodes that not contain descendants that are query or evidence
nodes. Barren nodes should depend on evidence, but they cannot change the
probability of a query node and are computationally irrelevant [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
– Nuisance nodes: nodes that are computationally related to target nodes given the
evidence but that are not part of any evidential trail from any node in the evidence
set to any node in the query set [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. They are relevant for the queries only by their
marginal distributions: if those are pre-computed, these nodes can be removed from
the SSBN.
      </p>
    </sec>
    <sec id="sec-4">
      <title>New Algorithm for Generating SSBNs</title>
      <p>The Bottom-up algorithm is not adequate to work with domains that contain large
assertive bases and, because of this, it is not adequate to work with PR-OWL 2 RL
language. It starts in each query and evidence node, which can originate the generation
of disconnected networks and irrelevant nodes in domains with a great quantity of
evidence, consuming time and space in the Grand BN generation and evaluating some
unnecessary MFrags. For each MFrag, we have to evaluate the context nodes, making
searches and inferences on the knowledge base. This can be very complex using an
expressive logic such as first-order logic, or OWL 2 DL logic. Since irrelevant nodes will
be removed in the pruning phase, one solution that avoids its generation in the building
phase can improve the algorithm scalability.</p>
      <p>We propose an algorithm for generating SSBNs based on the Bayes-Ball algorithm.
The SSBN nodes are generated when the Bayes ball visits that node. Since the ball is
passed only for nodes relevant to answer the queries, the algorithm warranty that only
d-connected nodes to the query nodes will be generated. Consequently, only MFrags of
relevant nodes are evaluated, saving time in the evaluation of context nodes.</p>
      <p>The new algorithm makes the following modifications to the Bottom-Up algorithm:
in the initialization phase, only query nodes are included in the initial set S1; in the
structure building phase, the nodes are created based on Bayes-Ball algorithm rules;
and, in the structure Prune phase, it is no longer necessary to prune d-separated nodes
because the algorithm generated only nodes that have an active path to the query nodes.</p>
      <p>To control the evaluation of the algorithm, some marks are necessary on each SSBN
node during the reception and transmission of the virtual Bayes ball:
receivedBallFromChild: true if the ball was received from a child;
receivedBallFromParent: true if the ball was received from a parent;
evaluatedTop: true if the node was evaluated above;
evaluatedBottom: true if the node was evaluated below;
isFinding: true if the node has evidence;
visited: true if the node was visited by the Bayesian ball.</p>
      <p>Algorithm 1 presents how the new algorithm builds the SSBN. The algorithm uses
the following procedures:</p>
      <p>evaluateNode: verify whether the node is an evidence (using the knowledge base).
If it is, set isFinding = true; instantiate MFrag Mx in which the node is resident
and evaluate its context nodes.</p>
      <p>evaluateTop: evaluate the nodes above by instantiating the parents of x, setting
receivedBallFromChild to true. If y, a parent node of x, is an input node, the
MFrag My where y is resident will be instantiated using the values for that ordinary
variables in Mx to fill the arguments of y. If y is a resident node, only instantiate it,
using the arguments of Mx. Add the new generated nodes to nodeList (list of nodes
to be evaluated in the next iteration).</p>
      <p>evaluateBottom: evaluate the nodes below, generating its descendants. This
includes resident nodes that are children of x in Mx and resident nodes that are children
of y in My, where y is an input node located in MFrag My that makes reference to x. In
the last case, the MFrag My will also be instantiated, starting with the ordinary variables
of x. The other ordinary variables will be recovered using the context node evaluation.
New nodes created are added to the nodeList and receivedBallFromParent
is set to true.</p>
      <p>Before creating a new node, the algorithm verifies whether the node already
exists in the SSBN. If it exists, the algorithm only updates the information of the flags
receivedBallFromChild and receivedBallFromParent, of the existing
node, and sets visited to false if any of these flags were changed.</p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation of the new Algorithm</title>
      <p>We developed a plug-in 4 for UnBBayes that implements PR-OWL 2 RL and the
BayesBall algorithm. This section shows the results of comparative tests made between the
new plug-in and the PR-OWL 2 plug-in, the most recent one that uses the Bottom-Up
algorithm.</p>
      <p>
        PR-OWL 2 plug-in utilizes the reasoner HermiT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] to work with the knowledge
base, represented in OWL 2 DL. PR-OWL 2 RL plug-in utilizes a triplestore, GraphDB
Free5, that implements the OWL 2 RL profile over RDF graphs, using materialization
for inference. The materialization consists of expanding the rules in loading time by
calculating all new expressions that emerge from the set of added expressions.
      </p>
      <p>In order to carry out the tests, we developed a benchmark able to generate assertive
bases for the probabilistic ontology Procurement Fraud. Table 1 shows the bases used.
Base 15, for example, has 15 persons, 1 procurement, and 2 enterprises. Using these
entities, the benchmark creates the relationships (isProcurementFinished,
hasProcurementOwner, hasWinnerOfProcurement, isParticipantIn,
isMemberOfCommittee, etc.) based on the probabilities available on the ontology. The
final base has a total of 97 assertions.</p>
      <p>The tests were performed on a computer with an i5 processor and 8 GB of RAM, 5
GB dedicated to Java Runtime Engine running UnBBayes. The test consists of solving
the query isSuspiciousProcurement for the entity procurement 0,
generating the SSBN, and compiling it using the Junction Tree algorithm.</p>
      <p>
        Figure 5 shows a plot with the time used for both plug-ins to evaluate bases 15,
50, and 100. When generating the SSBN using PR-OWL 2 plug-in, the time to solve
the query increases at a greater rate than using PR-OWL 2 RL plug-in. This occurs for
two reasons: the Bottom-Up algorithm starts on each evidence existent in the base; and
HermiT reasoner takes more time to solve expressions submitted to it because it makes
inference in SHROIQ(D), the description logic in which OWL 2 DL is based, that
has complexity N2EXPTIME-complete [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. PR-OWL 2 RL plug-in performs better
since it uses the Bayes-Ball algorithm, which generates only nodes relevant to solve
the queries. It also uses a triplestore that makes the inference in polynomial time, over
4 Available in https://sourceforge.net/p/unbbayes/
5 http://graphdb.ontotext.com/
a less expressive version of OWL (OWL 2 RL), using materialization when the user
loads data on the base.
      </p>
      <p>()s 4;000
e
m
i
T
n
o
i
t
cu 2;000
e
x
E
0
0</p>
      <p>PR-OWL 2</p>
      <p>PR-OWL 2 RL
20</p>
      <p>40 60
Size of Base (#persons)
80
100
The test with Base 1,000 using PR-OWL 2 plug-in was not completed after 10
hours. As a result, the following tests were not executed in this plug-in. Table 2 shows
the time necessary to execute the query in Base 1,000 to 1,000,000 using the PR-OWL
2 RL plug-in. Step 1 is the initialization; Step 2 the construction of Grand BN; Step 3
the pruning phase; Step 4 the construction of CPT’s; and Step 5 the compilation of the
network.</p>
      <p>The table shows that the tests were executed at an approximately constant time.
This occurs because we fixed the number of persons by procurement and enterprises by
procurement to 4. When we used more than this, we have found space problems for
generating the CPTs because nodes with a large number of parents were generated. SSBN
generation using Bayes-Ball algorithm uses only the information related to the query.
Subsequently, while increasing the assertion base, as the amount of these relationships
remained constant, the time needed to solve the query remained approximately
constant. This resolved the limitation of the Bottom-Up algorithm, which starts on every
information of the assertive base. In real applications, usually only a small part of the
database information will be related to a user query, which makes the new algorithm
more adequate.
6</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>This paper presented an algorithm based on Bayes-Ball method to generate a SSBN
starting only on the query nodes. The new SSBN algorithm facilitates working with
domains that have a large assertive base, such as the Procurement Fraud detecting system.</p>
      <p>This new algorithm is part of the development of PR-OWL 2 RL, a new version
of PR-OWL amenable for working with RDF databases (triplestores). Together with
the Bayes-Ball SSBN algorithm and a grammar of FOL formulas that allow evaluation
over triplestores, the new language can be used to work with domains containing large
assertive databases. Tests comparing PR-OWL 2 RL plug-in to PR-OWL 2 plug-in show
that the Bayes-Ball algorithm together with the use of triplestores improved a lot the
scalability of SSBN generation in this kind of domains.</p>
      <p>The tests presented in this work are only an initial evaluations of the algorithm.
We plan to carry out more complete tests in the future, including tests using real data
from the Procurement Fraud domain. We also plan to make tests with others domains
to validate the Bayes-Ball algorithm and the PR-OWL 2 RL language.</p>
      <p>Some issues still need to be studied in relation to the Bayesian network generated,
given that the inference with it still depends on the algorithm’s sensibility to the size of
the network. The exact inference Junction Tree algorithm implemented in UnBBayes
presents problems when the size of the cliques is large. Future research will focus on
approximate inference, which seems to be a more effective approach.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Rommel</surname>
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Carvalho</surname>
            , Marcelo Ladeira, Lae´cio
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Santos</surname>
          </string-name>
          , Shou Matsumoto, and
          <string-name>
            <surname>Paulo</surname>
            <given-names>C. G.</given-names>
          </string-name>
          <string-name>
            <surname>Costa</surname>
          </string-name>
          .
          <article-title>A GUI tool for plausible reasoning in the semantic web using MEBN</article-title>
          .
          <source>In Innovative Applications in Data Mining</source>
          , pages
          <fpage>17</fpage>
          -
          <lpage>45</lpage>
          .
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Rommel</surname>
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Carvalho</surname>
          </string-name>
          , Kathryn B.
          <string-name>
            <surname>Laskey</surname>
          </string-name>
          , and
          <string-name>
            <surname>Paulo</surname>
            <given-names>C. G. Costa.</given-names>
          </string-name>
          <article-title>PR-OWL 2.0-bringing the gap to OWL semantics</article-title>
          .
          <source>In Uncertainty Reasoning for the Semantic Web II</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Rommel</surname>
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Carvalho</surname>
          </string-name>
          , Shou Matsumoto, Kathryn B.
          <string-name>
            <surname>Laskey</surname>
            , Paulo C. G. da Costa, Marcelo Ladeira, and Lae´cio
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Santos</surname>
          </string-name>
          .
          <article-title>Probabilistic ontology and knowledge fusion for procurement fraud detection in brazil</article-title>
          .
          <source>In Uncertainty Reasoning for the Semantic Web II</source>
          , pages
          <fpage>19</fpage>
          -
          <lpage>40</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Paulo</surname>
            <given-names>C. G.</given-names>
          </string-name>
          <string-name>
            <surname>Costa</surname>
          </string-name>
          , Kathryn B.
          <string-name>
            <surname>Laskey</surname>
            , and
            <given-names>Kenneth J.</given-names>
          </string-name>
          <string-name>
            <surname>Laskey</surname>
          </string-name>
          .
          <article-title>PR-OWL: a Bayesian ontology language for the semantic web. In Uncertainty Reasoning for the Semantic Web I: ISWC International Workshops</article-title>
          ,
          <source>URSW 2005-2007, Revised Selected and Invited Papers</source>
          , pages
          <fpage>88</fpage>
          -
          <lpage>107</lpage>
          . Springer-Verlag,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Lae´cio L. dos
          <string-name>
            <surname>Santos</surname>
            , Rommel N. Carvalho, Marcelo Ladeira,
            <given-names>Weigang</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>and Gilson Libo´rio Mendes. PR-OWL 2 RL - A language for scalable uncertainty reasoning on the semantic web information</article-title>
          .
          <source>In Proceedings of the 11th International Workshop on Uncertainty Reasoning for the Semantic Web (URSW</source>
          <year>2015</year>
          ), Bethlehem, USA, October
          <volume>12</volume>
          ,
          <year>2015</year>
          ., pages
          <fpage>14</fpage>
          -
          <lpage>25</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Birte</given-names>
            <surname>Glimm</surname>
          </string-name>
          , Ian Horrocks, Boris Motik, Giorgos Stoilos, and
          <string-name>
            <given-names>Zhe</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>Hermit: An OWL 2 reasoner</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          ,
          <volume>53</volume>
          (
          <issue>3</issue>
          ):
          <fpage>245</fpage>
          -
          <lpage>269</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Finn</surname>
            <given-names>V</given-names>
          </string-name>
          <string-name>
            <surname>Jensen</surname>
          </string-name>
          .
          <article-title>An introduction to Bayesian networks</article-title>
          , volume
          <volume>210</volume>
          . UCL press London,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Finn</surname>
            <given-names>V Jensen</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steffen L Lauritzen</surname>
          </string-name>
          , and Kristian G Olesen.
          <article-title>Bayesian updating in causal probabilistic networks by local computations</article-title>
          .
          <source>Computational statistics quarterly</source>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kathryn</surname>
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Laskey</surname>
          </string-name>
          .
          <article-title>MEBN: a language for First-Order Bayesian knowledge bases</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>172</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>140</fpage>
          -
          <lpage>178</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Yan</given-names>
            <surname>Lin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marek J.</given-names>
            <surname>Druzdzel</surname>
          </string-name>
          .
          <article-title>Computational advantages of relevance reasoning in bayesian belief networks</article-title>
          .
          <source>In UAI '97: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence</source>
          , Brown University, Providence, Rhode Island, USA,
          <year>August</year>
          1-
          <issue>3</issue>
          ,
          <year>1997</year>
          , pages
          <fpage>342</fpage>
          -
          <lpage>350</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Suzanne</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Mahoney</surname>
            and
            <given-names>Kathryn B.</given-names>
          </string-name>
          <string-name>
            <surname>Laskey</surname>
          </string-name>
          .
          <article-title>Constructing Situation Specific Belief Networks</article-title>
          .
          <source>In UAI '98: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence</source>
          , University of Wisconsin Business School, Madison, Wisconsin, USA, July
          <volume>24</volume>
          -
          <issue>26</issue>
          ,
          <year>1998</year>
          , pages
          <fpage>370</fpage>
          -
          <lpage>37</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Boris</surname>
            <given-names>Motik</given-names>
          </string-name>
          , Bernardo Cuenca Grau, Ian Horrocks,
          <string-name>
            <surname>Zhe Wu</surname>
          </string-name>
          , and
          <article-title>Carsten Lutz Achille Fokoue. OWL 2 Web Ontology Language profiles</article-title>
          . http://www.w3.org/TR/owl2-profiles/,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. Lae´cio
          <string-name>
            <given-names>L.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rommel N.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          , Marcelo Ladeira, Li Weigang,
          <string-name>
            <surname>Kathryn B. Laskey</surname>
          </string-name>
          , and
          <string-name>
            <surname>Paulo</surname>
            <given-names>C. G.</given-names>
          </string-name>
          <string-name>
            <surname>Costa</surname>
          </string-name>
          .
          <article-title>Scalable uncertainty treatment using triplestores and the OWL 2 RL profile</article-title>
          .
          <source>In 18th International Conference on Information Fusion, FUSION 2015</source>
          , Washington, DC, USA, July 6-
          <issue>9</issue>
          ,
          <year>2015</year>
          , pages
          <fpage>924</fpage>
          -
          <lpage>931</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ross</surname>
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Shachter</surname>
          </string-name>
          .
          <article-title>Bayes-ball: The rational pastime (for determining irrelevance and requisite information in belief networks and influence diagrams)</article-title>
          .
          <source>In UAI '98: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence</source>
          , University of Wisconsin Business School, Madison, Wisconsin, USA, July
          <volume>24</volume>
          -
          <issue>26</issue>
          ,
          <year>1998</year>
          , pages
          <fpage>480</fpage>
          -
          <lpage>487</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>