A Study of MEBN Learning for Relational Model

Cheol Young Park, Kathryn Blackmond Laskey, Paulo Costa, Shou Matsumoto
Volgenau School of Engineering, George Mason University, Fairfax, VA USA
[cparkf, klaskey, pcosta]@gmu.edu, smatsum2@masonlive.gmu.edu

Abstract— In the past decade, Statistical Relational Learning (SRL) has emerged as a new branch of machine learning for representing and learning a joint probability distribution over relational data. Relational representations have the expressive power needed for important real-world problems, but until recently did not support uncertainty; statistical relational models fill this gap. Among the languages recently developed for statistical relational representation is Multi-Entity Bayesian Networks (MEBN), the logical basis for Probabilistic OWL (PR-OWL), a language for uncertainty reasoning in the Semantic Web. Until now, however, there has been no implementation of MEBN learning. This paper describes the first such implementation: an algorithm that learns a MEBN theory for a domain from data stored in a relational database. Several open issues, such as aggregating influences and the optimization problem, are discussed. Our contributions are the MEBN-RM (Relational Model) Model, a bridge between MEBN and RM, and a basic structure learning algorithm for MEBN. The method was applied to a test case in the maritime domain to validate the basic approach.

Keywords: Probabilistic ontology, Multi-Entity Bayesian networks, PR-OWL, Relational Model/Database, Machine Learning, Statistical Relational Learning

I. INTRODUCTION

Statistical Relational Learning (SRL) is a new branch of machine learning for representing and learning a joint distribution over relational data [1, 2]. As its name suggests, it combines statistical and relational knowledge representations. A relational model represents a domain as a collection of objects that may have attributes and can participate in relationships with other objects. Relational representations are expressive enough for important real-world problems, but until recently did not support uncertainty. This gap has been filled by SRL methods. Statistical relational knowledge representations combine statistical and relational approaches, allowing a probability distribution to be represented over a relational model of a domain, and SRL methods allow such representations to be learned from data. Representation languages for SRL include Probabilistic Relational Models (PRMs), Markov Logic Networks (MLNs), Relational Dependency Networks (RDNs), Bayesian Logic Programs (BLPs), Join Bayes Nets (JBNs), and Multi-Entity Bayesian Networks (MEBN) [2, 3, 4, 5, 6, 7]. A comparison of some of these models is given in [1].

Typically, SRL models provide a representation for relational knowledge, along with methods for both induction and deduction. Relational representations provide both class and instance models. A class model describes statistical information that applies to classes of objects; for example, a class model might describe the false positive and false negative rates for a class of sensor. The instance model is generated from the class model by a deduction method; for example, the instance model would be used to infer the probability that a given detection is a false positive. An induction method learns the structure and parameters of a domain theory from observations; for example, induction would be used to learn the false positive and false negative rates from a data set annotated with ground truth. SRL has been applied to problems such as Object Classification, Object Type Prediction, Link Type Prediction, Predicting Link Existence, Link Cardinality Estimation, Entity Resolution, Group Detection, Sub-graph Discovery, Metadata Mining, and so on [2].

This paper is concerned with Multi-Entity Bayesian Networks (MEBN), a relational language that forms the logical basis of Probabilistic OWL (PR-OWL), a language for uncertainty reasoning in the Semantic Web [7, 8]. PR-OWL has been extended to PR-OWL 2, which provides a tighter link between the deterministic and probabilistic aspects of the ontology [9]. MEBN extends Bayesian networks to a relational representation. A MEBN Theory, or MTheory, consists of a set of Bayesian network fragments, or MFrags, that together represent a joint distribution over instances of the random variables represented in the MTheory [7].

However, until now there has been no implementation of induction, or learning, for MEBN or PR-OWL. This paper describes such an implementation. We follow an approach used by other SRL models [1] and use a Relational Database (RDB) to store the observations from which the representation is learned. This paper focuses on a basic learning algorithm that addresses the following issues:

1. Developing a bridge between MEBN and RDB;
2. Developing basic structure and parameter learning for MEBN.

Ultimately, a relational learning algorithm should also address issues such as aggregation of data, reference uncertainty, type uncertainty, and continuous variable learning. These issues are left for future research. Our learning method is exact, and assumes discrete random variables and complete data. It is evaluated by an inference accuracy test.

In Section 2, we give brief definitions of MEBN and RM as background. In Section 3, we introduce the MEBN-RM Model. In Section 4, we present the basic structure learning algorithm. The application of the algorithm is described in Section 5.

Type  Name               Example
1     Isa                Isa( Person, P ), Isa( Car, C )
2     Value-Constraint   Height( P ) = high
3     Slot-Filler        P = OwnerOf( C )
4     Entity-Constraint  Friend( A, B )
Table 2. Context Node Types in the MEBN-RM Model
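The four context-node types of Table 2 are largely determined by the structure of the relational schema, as Section III details: entity tables yield Isa nodes, non-key foreign keys yield Slot-Filler nodes, and relationship tables whose primary key consists of foreign keys yield Entity-Constraint nodes. A minimal Python sketch of this mapping (the `pk`/`fk` schema metadata and the function are our own illustration, not the paper's implementation):

```python
def context_nodes(schema):
    """Derive MEBN context-node terms (Table 2) from relational schema metadata.

    schema maps table name -> {"pk": [primary-key columns],
                               "fk": {foreign-key column: referenced table}}.
    Value-Constraint nodes (e.g. Height(P) = high) depend on attribute
    values rather than schema structure, so they are not derived here.
    """
    nodes = []
    for name, t in schema.items():
        if t["pk"] and set(t["pk"]) <= set(t["fk"]):
            # Relationship table: primary key made entirely of foreign keys
            # -> Entity-Constraint, e.g. Friend( A, B ).
            nodes.append(f"{name}({', '.join(t['fk'][c] for c in t['pk'])})")
        else:
            # Entity table -> Isa node over its entities.
            nodes.append(f"Isa({name}, {name[0].lower()})")
            for col, target in t["fk"].items():
                # Non-key foreign key -> Slot-Filler, e.g. P = OwnerOf( C ).
                nodes.append(f"{target} = {col}({name})")
    return nodes

# Hypothetical toy schema mirroring the examples in Table 2.
schema = {
    "Person": {"pk": ["key"], "fk": {}},
    "Car":    {"pk": ["key"], "fk": {"OwnerOf": "Person"}},
    "Friend": {"pk": ["a", "b"], "fk": {"a": "Person", "b": "Person"}},
}
```

For this toy schema the derived terms are Isa(Person, p), Isa(Car, c), Person = OwnerOf(Car), and Friend(Person, Person), matching the example column of Table 2.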
II. MULTI-ENTITY BAYESIAN NETWORKS (MEBN) AND RELATIONAL MODEL (RM)

A. Multi-Entity Bayesian Networks (MEBN)

MEBN extends Bayesian Networks (BNs) to represent relational information. BNs have been very successful as an approach to representing uncertainty about many interrelated variables, but they are not expressive enough for relational domains. MEBN extends Bayesian networks to represent the repeated structure of relational domains.

MEBN represents knowledge about a domain as a collection of MFrags. An MFrag is a fragment of a graphical model that serves as a template for probabilistic relationships among instances of its random variables. Random variables in an MFrag can contain ordinary variables, which can be filled in with domain entities. An MFrag includes context, input, and resident nodes, which serve respectively to restrict the entities to which the MFrag applies, to reference random variables defined elsewhere, and to define the MFrag's own random variables. We can think of an MFrag as a class which can generate instances of BN fragments, which can then be assembled into a Bayesian network [7].

B. Relational Model (RM)

In 1969, Edgar F. Codd proposed RM as a database model based on first-order predicate logic [10]. RM is composed of Relations, Attributes, Keys, Tuples, Instances, and Cells. The relational database, the most popular kind of database, is based on RM.

III. MEBN-RM MODEL

As a bridge between MEBN and RM, we propose the MEBN-RM Model, which specifies how elements of MEBN correspond to elements of RM. The key MEBN node types for this mapping are the context node and the resident node. To make the mapping concrete, we use the following example of a university relational model. (In our definitions, Attribute means a descriptive attribute and Key means a primary key.)

Course           Registration                   Student        Professor
Key  Difficulty  Course Key  Student Key Grade  Key  Advisor   Key  Major
c1   low         c1          s1          low    s1   p4        p1   SYST
c2   high        c1          s2          high   s2   p2        p2   OR
c3   high        c2          s2          high   s3   p3        p3   OR
c4   low         c2          s4          low    s4   p1        p4   CS
c5   med         c3          s5          med    s5   p5        p5   SYST
c6   low         c4          s6          low    s6   null      p6   OR
Table 1. Example of university relational model

1) Isa
In MEBN, the Isa random variable represents the type of an entity. In RM, an entity table represents a collection of entities of a given type. Thus, an entity table corresponds to an Isa random variable in MEBN. Note that a relationship table, whose primary key is composed of foreign keys, does not correspond to an Isa RV; a relationship table corresponds instead to the Entity-Constraint type of context node.

2) Value-Constraint
In some cases, a value of an attribute picks out exactly the keys associated with that value. For example, consider Table 1, in which the course table has the difficulty attribute. The course table has the key instances c1, c2, c3, c4, c5, and c6. If we focus on the entities whose attribute value is "high", we obtain {c2, c3}. In this way, the group of entities associated with any attribute value can be derived. We encode this as "Difficulty( Course ) = high" in MEBN.

3) Slot-Filler
In Table 1, the professor key appears in the student table as a foreign key, Advisor. This foreign key is not a primary key of the student table. In this case, the connection is expressed as "Professor = Advisor( Student )" in MEBN; an instance of it is that s1's advisor is p4, and so on.

4) Entity-Constraint
The registration table is a relationship table that bridges the course and student entities. In this case, the registration table defines an intersection group, described as "Registration( Course, Student )" in MEBN.

B. Resident Node

In MFrags, a resident node can be described as a function, predicate, or formula of first-order logic (FOL), together with a probability distribution. An FOL function takes arguments and produces an output, while an FOL predicate takes arguments and produces only a Boolean output. We define the following relationship between elements of RM and resident nodes in MEBN.
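These correspondences can be reproduced directly with SQL over the data of Table 1. A self-contained sketch using an in-memory SQLite database (the table and column names are our own rendering of Table 1, not the paper's code):

```python
import sqlite3

# The university example of Table 1, loaded into an in-memory relational database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE course ("key" TEXT PRIMARY KEY, difficulty TEXT);
CREATE TABLE student ("key" TEXT PRIMARY KEY, advisor TEXT);
CREATE TABLE professor ("key" TEXT PRIMARY KEY, major TEXT);
CREATE TABLE registration (course_key TEXT, student_key TEXT, grade TEXT,
                           PRIMARY KEY (course_key, student_key));
""")
cur.executemany('INSERT INTO course VALUES (?, ?)',
    [("c1", "low"), ("c2", "high"), ("c3", "high"),
     ("c4", "low"), ("c5", "med"), ("c6", "low")])
cur.executemany('INSERT INTO student VALUES (?, ?)',
    [("s1", "p4"), ("s2", "p2"), ("s3", "p3"),
     ("s4", "p1"), ("s5", "p5"), ("s6", None)])
cur.executemany('INSERT INTO professor VALUES (?, ?)',
    [("p1", "SYST"), ("p2", "OR"), ("p3", "OR"),
     ("p4", "CS"), ("p5", "SYST"), ("p6", "OR")])
cur.executemany('INSERT INTO registration VALUES (?, ?, ?)',
    [("c1", "s1", "low"), ("c1", "s2", "high"), ("c2", "s2", "high"),
     ("c2", "s4", "low"), ("c3", "s5", "med"), ("c4", "s6", "low")])

# Value-Constraint: Difficulty( Course ) = high picks out the entity set {c2, c3}.
high = {r[0] for r in cur.execute(
    'SELECT "key" FROM course WHERE difficulty = ?', ("high",))}

# Slot-Filler: Professor = Advisor( Student ); e.g. s1's advisor is p4.
advisor_s1 = cur.execute(
    'SELECT advisor FROM student WHERE "key" = ?', ("s1",)).fetchone()[0]

# Entity-Constraint: Registration( Course, Student ) is the set of
# (course, student) pairs in the relationship table.
pairs = cur.execute("SELECT course_key, student_key FROM registration").fetchall()
```

Here `high` evaluates to {'c2', 'c3'}, `advisor_s1` to 'p4', and `pairs` holds the six (course, student) tuples of the Entity-Constraint term.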
RM                 Resident Node
Attribute          Function / Predicate
Key                Arguments
Cell of Attribute  Output
Table 3. Resident Node Types of the MEBN-RM Model

For example, in Table 1, the grade of the registration table is a function having the course and student keys as arguments; its output is the value of the grade cell, such as low, med, or high. On the other hand, if the domain type of the attribute (e.g., the grade) were Boolean, it would be a predicate in MEBN.

A. Context Node

In MFrags, context terms (or nodes) are used to specify the constraints under which the local distributions apply; they determine which specific entities a given context applies to. In the MEBN-RM Model, we define four types of data structure corresponding to context nodes: the Isa, Value-Constraint, Slot-Filler, and Entity-Constraint types (Table 2), each illustrated above with the university example.

IV. THE BASIC STRUCTURE LEARNING FOR MEBN

To address the issues listed in Section 1, we suggest a basic structure learning algorithm for MEBN. The inputs to the algorithm are a dataset in RM form, a Bayesian network structure search algorithm, and a chain size. For parameter learning, we use only Maximum Likelihood Estimation (MLE). The algorithm is restricted to discrete variables with complete data. We utilize a standard Bayesian network structure search algorithm to generate a local BN from each joined dataset of the RM. To avoid infinite loops, we employ the chain size as a bound: the structure search process terminates once the chain size is reached.

First, the algorithm creates the default MTheory. All keys of the database are defined as entities of the MEBN theory, and one default reference MFrag is created. For each table of the database, its dataset is retrieved and, using the BN structure search algorithm, a graph is generated from the dataset. If the graph has a cycle or an undirected edge, a knowledge expert for the domain sets the arc directions. Based on the revised graph, an MFrag is created. Until the chain size is reached, joined datasets derived by the SQL "Join" command are retrieved, and the graphs for the joined datasets are generated in the same way as above. If any node of a newly generated graph is not yet used in any MFrag, a resident node named for the graph's dataset is created in the default reference MFrag, and a new MFrag is created for the dataset; otherwise, only edges are added between resident nodes in the different MFrags. Lastly, for all resident nodes in the MTheory, the local probability distributions (LPDs) are generated by MLE.

V. CASE STUDY

To evaluate the algorithm, we used a dataset from PROGNOS (Probabilistic OntoloGies for Net-centric Operation Systems) [11, 12]. The purpose of the system is to provide higher-level knowledge representation, fusion, and reasoning in the maritime domain.

Queries were posed with several evidence nodes located at the leaf nodes. Figure 3 presents the resulting SSBN, in which the nodes of the ship and person entities are connected to each other. To compare accuracies, we used the single-entity Bayesian network that had been used for sampling; it provided another result for the same query and evidence. Figure 1 shows the Receiver Operating Characteristic (ROC) curves describing the accuracy of the learned MTheory and of the single-entity Bayesian network. The areas under the curves are shown in Table 4.

Model                           AUC
Learned MTheory                 0.874479929
Single Entity Bayesian Network  0.87323784
Table 4. AUC of Learned MTheory and Single Entity Bayesian Network

Figure 1. ROC of Learned MTheory and Single Entity Bayesian Network

As we can see from Figure 1 and Table 4, the accuracies of the learned MTheory and the single-entity Bayesian network are almost the same. This means that the learned MTheory faithfully reflects the data in the relational database, which was sampled using the single-entity Bayesian network. In this paper, we compared the learned MTheory only to the true model, the single-entity Bayesian network; this result shows that our approach recovers the true model correctly.
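The final MLE step of Section IV can be made concrete with a small frequency-counting sketch (our own illustration, not the authors' implementation), applied to the joined Course-Registration data of Table 1:

```python
from collections import Counter

def mle_lpd(rows, child, parents):
    """Estimate the LPD P(child | parents) by maximum likelihood,
    i.e. conditional frequency counts over complete discrete data."""
    joint, marginal = Counter(), Counter()
    for row in rows:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[child])] += 1
        marginal[pa] += 1
    return {(pa, v): n / marginal[pa] for (pa, v), n in joint.items()}

# Course JOIN Registration from Table 1: each row pairs a course's
# difficulty with the grade a student earned in it.
rows = [
    {"difficulty": "low",  "grade": "low"},   # (c1, s1)
    {"difficulty": "low",  "grade": "high"},  # (c1, s2)
    {"difficulty": "high", "grade": "high"},  # (c2, s2)
    {"difficulty": "high", "grade": "low"},   # (c2, s4)
    {"difficulty": "high", "grade": "med"},   # (c3, s5)
    {"difficulty": "low",  "grade": "low"},   # (c4, s6)
]

lpd = mle_lpd(rows, child="grade", parents=["difficulty"])
```

Here P(grade = low | difficulty = low) = 2/3, and each grade is equally likely given difficulty = high; with complete discrete data these frequency ratios are exactly the maximum-likelihood LPD entries.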
PROGNOS includes a simulation that provides ground-truth information for the system. The simulation uses a given single-entity Bayesian network (we use this term to distinguish it from the SSBN of a Multi-Entity Bayesian Network) to sample data. The simulation generated 85,000 person, 10,000 ship, and 1,000 organization entities with various attribute values, and the data for these entities were stored in the relational database.

For the evaluation of the model, training and test datasets were generated by the simulation. Using the basic structure learning algorithm for MEBN, the PROGNOS MTheory was derived as shown in Figure 2. In the model, a total of four MFrags were generated: the default reference, org_members, person, and ship MFrags. To generate an SSBN from this MTheory, we assumed one person, one ship, and one organization, related as ship_crews( Ship S, Person P ) and org_members( Organization O, Person P ). We queried the isShipOfInterest node.

However, this result is only a beginning and a baseline for a full MEBN learning method, because we did not address the aggregating-influence problem, which is an important issue in SRL models.

VI. DISCUSSION AND FUTURE WORK

Because of the flood of complex and huge data, efficient and accurate methods are needed for learning expressive models that incorporate uncertainty. In this paper, we have introduced a learning approach for MEBN. As a bridge between MEBN and RM, the MEBN-RM Model was introduced, and for induction, the basic structure learning algorithm for MEBN was suggested. We are currently studying a heuristic approach, called the Framework of Function Searching for LPDs (FFS-LPD), to address the aggregating-influence problem, and we plan to extend the learning algorithm to continuous random variables.

REFERENCES
[1] H. Khosravi and B. Bina, "A survey on statistical relational learning," in Proceedings of the Canadian Conference on AI, 2010, pp. 256-268.
[2] L. Getoor and B. Taskar, Introduction to Statistical Relational Learning. Cambridge, MA: MIT Press, 2007.
[3] P. Domingos and M. Richardson, "Markov logic: A unifying framework for statistical relational learning," in Introduction to Statistical Relational Learning, ch. 12, pp. 339-367, 2007.
[4] J. Neville and D. Jensen, "Relational dependency networks," in Introduction to Statistical Relational Learning.
[5] K. Kersting and L. de Raedt, "Bayesian logic programming: Theory and tool," in Introduction to Statistical Relational Learning.
[6] O. Schulte, H. Khosravi, F. Moser, and M. Ester, "Join Bayes nets: A new type of Bayes net for relational data," Technical Report 2008-17, Simon Fraser University, 2008. Also in CS-Learning Preprint Archive.
[7] K. B. Laskey, "MEBN: A language for first-order Bayesian knowledge bases," Artificial Intelligence, 172(2-3), 2008.
[8] P. C. G. Costa, Bayesian Semantics for the Semantic Web, PhD Dissertation, George Mason University, July 2005.
[9] R. N. Carvalho, Probabilistic Ontology: Representation and Modeling Methodology, PhD Dissertation, George Mason University, July 2011.
[10] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM, 1970.
[11] P. C. G. Costa, K. B. Laskey, and K. C. Chang, "PROGNOS: Applying probabilistic ontologies to distributed predictive situation assessment in naval operations," in Proceedings of the 14th International Command and Control Research and Technology Symposium, Washington, D.C., USA, 2009.
[12] R. N. Carvalho, P. C. G. Costa, K. B. Laskey, and K. Chang, "PROGNOS: Predictive situational awareness with probabilistic ontologies," in Proceedings of the 13th International Conference on Information Fusion, Edinburgh, UK, Jul. 2010.

Figure 2. Generated PROGNOS MTheory
Figure 3. Generated SSBN of PROGNOS MTheory