<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Generation of Parametrically Uniform Knowledge Bases in a Relational Probabilistic Logic with Maximum Entropy Semantics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christoph Beierle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Markus Höhnerbach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcus Marto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, University of Hagen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In a relational setting, the maximum entropy model of a set of probabilistic conditionals can be defined by referring to the full set of ground instances of the conditionals. The logic FO-PCL uses the notion of parametric uniformity to ensure that the full grounding of the conditionals can be avoided, thereby greatly simplifying the maximum entropy model computation. In this paper, we describe a system that realises an approach transforming an FO-PCL knowledge base consisting of relational probabilistic conditionals into a parametrically uniform knowledge base having the same maximum entropy model. The implemented system provides different execution and evaluation modes, including the generation of all possible solutions, and is available within an integrated development environment for relational probabilistic logic.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        There are different approaches addressing the challenge of modelling uncertainty by
combining logic with probabilities, see e.g. the foundational work reported in [18, 6, 7].
Probabilistic conditional logic considers conditional probabilities P(B|A) for
conditionals of the form if A then B with probability x [
        <xref ref-type="bibr" rid="ref1">1, 19</xref>
        ], formally denoted by (B|A)[x];
thus, P(B|A) denotes the probability of B given that we know A holds. While these
approaches are often based upon propositional logic, there are also several developments
aiming at exploiting the expressiveness of relational or general first-order logic in
combination with probabilities, e.g. Bayesian logic programs and Markov logic networks
(see [11] for an overview). Some of these approaches (see e.g. [15, 22]) employ the
principle of maximum entropy (ME) [20, 14]. Among these, there is also FO-PCL [10],
a first-order extension of propositional probabilistic conditional logic. An example of a
conditional in FO-PCL is “If V likes U, then U likes V with probability 0.9, for different
U, V”, formally denoted by ⟨(likes(U, V) | likes(V, U)) [0.9]; U ≠ V⟩.
      </p>
      <p>The models of an FO-PCL knowledge base R consisting of a set of such
conditionals are probability distributions over possible worlds [12] satisfying each conditional
in R, where a possible world is a subset of the Herbrand base which is determined by
the set of all ground instances of conditionals satisfying the respective constraint. The
ME principle is used to select the uniquely determined model ME(R) having
maximum entropy. The computation of ME(R) leads to an optimisation problem with one
optimisation parameter to be determined for each admissible ground instance of
every conditional in R. However, if the knowledge base is parametrically uniform, i.e.
all ground instances of a conditional share the same optimisation parameter value, for
each conditional in R just one optimisation parameter has to be determined [10]. Thus,
parametric uniformity significantly simplifies the task of computing ME(R) [8].</p>
      <p>In [17, 4], a set of transformation rules PU is presented that allows transforming
any consistent knowledge base into a parametrically uniform knowledge base with the
same maximum entropy model. In this paper, we introduce the system PUsys,
implementing PU and automatically generating PU(R) for any consistent R. This allows
for a simpler ME model computation by computing ME(PU(R)) instead of ME(R).</p>
      <p>This paper is an extended version of [3]. In particular, we expand the background
about FO-PCL by recalling the notions of inter-rule and intra-rule interactions, give a
more detailed overview of PUsys, and provide results obtained from an evaluation of
the system. After recalling the basics of FO-PCL and PU (Sec. 2), we present PUsys in
Sec. 3. We illustrate the reasons for multiple solutions of PU transformations (Sec. 4),
describe their optimised generation (Sec. 5), give some first application and evaluation
results (Sec. 6), and conclude and point out further work (Sec. 7).</p>
      <sec id="sec-1-1">
        <title>Background: Interactions and PU Transformation Rules</title>
        <p>An FO-PCL conditional ⟨(B | A) [ξ]; C⟩ consists of an antecedent A and a consequent B,
both being quantifier-free, non-equational first-order formulas, a probability value
ξ ∈ [0, 1], and a constraint formula C with inequations over the variables (and
constants) occurring in A and B. As an illustration of a simple FO-PCL knowledge base we
consider the following:
Example 1 (Misanthrope). The misanthrope example (adapted from [10]) models the
friendship relations for a group of people. Usually, if a person V likes a person U, then
U also likes V. However, there is a misanthrope a, and the chances that a likes someone
are considerably low. The knowledge base RMI = {R1, R2} contains two conditionals:
R1 : ⟨(likes(U, V) | likes(V, U)) [0.9]; U ≠ V⟩
R2 : ⟨(likes(a, V)) [0.05]; V ≠ a⟩</p>
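        <p>To make this representation concrete, conditionals like R1 and R2 can be sketched as plain data structures in Python; the classes Atom and Conditional and all field names are our own illustration, not part of the PUsys implementation:</p>

```python
from dataclasses import dataclass
from typing import Optional, Tuple, FrozenSet

@dataclass(frozen=True)
class Atom:
    pred: str               # predicate symbol, e.g. "likes"
    args: Tuple[str, ...]   # capitalised names are variables, others constants

@dataclass(frozen=True)
class Conditional:
    consequent: Atom
    antecedent: Optional[Atom]             # None for a fact-like conditional
    prob: float                            # probability value in [0, 1]
    constraint: FrozenSet[FrozenSet[str]]  # inequations as unordered pairs

# The misanthrope knowledge base RMI = {R1, R2} from Example 1:
R1 = Conditional(Atom("likes", ("U", "V")), Atom("likes", ("V", "U")),
                 0.9, frozenset({frozenset({"U", "V"})}))
R2 = Conditional(Atom("likes", ("a", "V")), None,
                 0.05, frozenset({frozenset({"V", "a"})}))
RMI = {R1, R2}
```

        <p>Encoding inequations as unordered pairs reflects that U ≠ V and V ≠ U denote the same constraint.</p>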
        <p>A ground instance of a conditional satisfying the conditional’s constraint formula,
like U ≠ V, is called admissible. A model of an FO-PCL knowledge base R is a joint
probability function P satisfying each conditional in R, where P satisfies a conditional
⟨(B | A) [ξ]; C⟩ iff P(B_g | A_g) = ξ for every admissible ground instance ⟨(B_g | A_g) [ξ]; ⊤⟩
of the conditional [10]. Note that a full grounding of RMI ignoring the constraint
formulas would immediately yield an inconsistency since there is no probability
distribution P with P(likes(a, a)) = 0.05 and P(likes(a, a) | likes(a, a)) = 0.9. Parametric
uniformity [10] significantly simplifies the task of computing the maximum entropy
[20, 14] model ME(R) since it reduces the number of optimisation parameters to be
taken into account from the number of admissible ground instances to the number of
conditionals in R [8]. For details about the underlying optimisation problem and its
parameters determining the FO-PCL model ME(R), we refer to [10, 8].</p>
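        <p>The effect of the constraint formula on grounding can be illustrated with a small Python sketch (again our own simplified representation, not PUsys code): for D = {a, b, c}, the ground instance with U = V = a that would cause the inconsistency above is exactly one of those filtered out as inadmissible:</p>

```python
from itertools import product

D = ["a", "b", "c"]

def ground_instances(variables, inequations, D):
    """Enumerate all assignments of constants in D to `variables` and split
    them into admissible ones (all inequations satisfied) and the rest."""
    admissible, inadmissible = [], []
    for combo in product(D, repeat=len(variables)):
        env = dict(zip(variables, combo))
        value = lambda t: env.get(t, t)  # constants denote themselves
        ok = all(value(x) != value(y) for (x, y) in inequations)
        (admissible if ok else inadmissible).append(env)
    return admissible, inadmissible

# R1 of Example 1: variables U, V with the constraint U != V
adm1, inadm1 = ground_instances(["U", "V"], [("U", "V")], D)
print(len(adm1))  # 3*3 - 3 = 6 admissible ground instances
# The inadmissible instance U = V = a is the one whose grounding
# (likes(a,a) | likes(a,a))[0.9] would clash with P(likes(a,a)) = 0.05.
```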
        <p>In [17], the reasons causing R to be not parametrically uniform are investigated in
detail. A central observation is that for any parametrically uniform R, the sets of ground
instances of atoms and pairs of different atoms occurring in the admissible ground
instances of two conditionals must be either disjoint or identical (otherwise there is an
imbalanced sharing [17, 4]). For instance, RMI from Ex. 1 is not parametrically uniform
since the sets of admissible ground atoms originating from likes(U, V) of conditional
R1 and from likes(a, V) of conditional R2 are neither equal nor disjoint. For a single
conditional it is required that each admissible ground instance of an atom p or of a pair
of different atoms p′, p″ occurs the same number of times in the ground instances of
the conditional as any other admissible ground instance of the atom p or of the pair of
atoms p′, p″ (otherwise there is an imbalanced use [17, 4]). The syntactic criterion of
inter-rule and intra-rule interactions [17] identifies these constellations by analysing
the syntactic structure of the conditionals in R; no grounding operation is required.
Inter-Rule Interactions An inter-rule interaction of type 1 between two conditionals
R1 and R2 with respect to P, regarding the variable V and the constant c, denoted</p>
        <sec id="sec-1-1-1">
          <title>R1 ⟨P⟩V,c → R2, is a situation with</title>
          <p>a) R1 = ⟨(… P(…, c, …) …) [ξR1]; CR1⟩,</p>
          <p>R2 = ⟨(… P(…, V, …) …) [ξR2]; CR2⟩, and CR2 ⊭ V ≠ c, or
b) R1 = ⟨(… P(…, U, …) …) [ξR1]; CR1⟩, CR1 ⊨ U ≠ c,</p>
          <p>R2 = ⟨(… P(…, V, …) …) [ξR2]; CR2⟩, and CR2 ⊭ V ≠ c,
where CR2 ⊭ V ≠ c denotes that from CR2, it does not follow that V is not equal to c.</p>
          <p>An inter-rule interaction of type 2 between R1 and R2 with respect to P, regarding
the variables V and Z, denoted R1 ⟨P⟩V,Z → R2, is a situation with
a) R1 = ⟨(… P(…, U, …, U, …) …) [ξR1]; CR1⟩,</p>
          <p>R2 = ⟨(… P(…, V, …, Z, …) …) [ξR2]; CR2⟩, and CR2 ⊭ V ≠ Z, or
b) R1 = ⟨(… P(…, U, …, W, …) …) [ξR1]; CR1⟩, CR1 ⊨ U ≠ W,
R2 = ⟨(… P(…, V, …, Z, …) …) [ξR2]; CR2⟩, and CR2 ⊭ V ≠ Z.</p>
          <p>An inter-rule interaction of type 3 between R1 and R2 with respect to P and Q,
regarding the variables V and Z, denoted R1 ⟨P, Q⟩V,Z → R2, involves two different
atoms P(…) and Q(…) and is a situation with
a) R1 = ⟨(… P(…, U, …) … Q(…, U, …) …) [ξR1]; CR1⟩,</p>
          <p>R2 = ⟨(… P(…, V, …) … Q(…, Z, …) …) [ξR2]; CR2⟩, and CR2 ⊭ V ≠ Z, or
b) R1 = ⟨(… P(…, U, …) … Q(…, W, …) …) [ξR1]; CR1⟩, CR1 ⊨ U ≠ W,
R2 = ⟨(… P(…, V, …) … Q(…, Z, …) …) [ξR2]; CR2⟩, and CR2 ⊭ V ≠ Z.</p>
          <p>The three types of inter-rule interactions capture exactly the different reasons for
imbalanced sharing [17]. E.g., for RMI the imbalanced sharing sketched above is
identified by the inter-rule interaction of type 1, R2 ⟨likes⟩U,a → R1.</p>
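          <p>A much-simplified sketch of recognising inter-rule interactions of type 1 (case a only) can be given in Python; here entailment of an inequation by a constraint formula is approximated by syntactic membership, and the tuple-based representation is our own, not that of PUsys:</p>

```python
def is_var(t):
    # convention of this sketch: variables are capitalised, constants are not
    return t[0].isupper()

def type1_interactions(atoms_c, atoms_v, constr_v):
    """Type-1 inter-rule interactions (case a only): the first conditional
    uses a constant c at some argument position of predicate P, the second
    uses a variable V there, and the second conditional's constraint does
    not (syntactically) contain V != c."""
    found = set()
    for pred1, args1 in atoms_c:
        for pred2, args2 in atoms_v:
            if pred1 != pred2:
                continue  # different predicate symbols cannot interact
            for t1, t2 in zip(args1, args2):
                if not is_var(t1) and is_var(t2) \
                        and frozenset({t1, t2}) not in constr_v:
                    found.add((pred1, t2, t1))  # records (P, V, c)
    return found

# Example 1: R2 contributes the constant a, R1 the variables U and V.
R2_atoms = [("likes", ("a", "V"))]
R1_atoms = [("likes", ("U", "V")), ("likes", ("V", "U"))]
R1_constr = {frozenset({"U", "V"})}

inter = type1_interactions(R2_atoms, R1_atoms, R1_constr)
# yields the two interactions R2 ⟨likes⟩U,a → R1 and R2 ⟨likes⟩V,a → R1
```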
          <p>For each of the different types of interactions, there is a corresponding
interaction-removing transformation rule in PU (cf. Figure 1). For instance, the transformation rule
(TE1) removes an inter-rule interaction of type 1. The application of (TE1) replaces a
conditional R with two new conditionals ν(σ(R)) and ν(σ̄(R)), where σ(R) is the
result of applying the variable substitution σ = {V/c} to R, and σ̄(R) is the result of
adding the constraint V ≠ c to the constraint formula of R. The operator ν transforms
a conditional into constraint normal form, a normal form required for the recognition of
interactions. Similarly, (TE2) and (TE3) remove inter-rule interactions of type 2 and
3, respectively.</p>
          <p>[Figure 1 lists the transformation rules of PU: the inter-rule rules (TE1)–(TE3) replace
R ∪ {R1, R2} by R ∪ {R1} ∪ {σ(R2), σ̄(R2)} for the interaction-removing substitution
σ = {V/c} resp. σ = {V/Z}, and the intra-rule rules (TA1)–(TA3) replace R ∪ {R}
by R ∪ {σ(R), σ̄(R)} for σ = {V/c} resp. σ = {V/Z}.]</p>
          <p>Example 2 (Application of PU to RMI). The knowledge base RMI of Ex. 1 is not
parametrically uniform since it contains two inter-rule interactions of type 1, denoted by
R2 ⟨likes⟩U,a → R1 and R2 ⟨likes⟩V,a → R1. The application of (TE1) for the
first interaction replaces R1 with
R1.1 : ⟨(likes(a, V) | likes(V, a)) [0.9]; V ≠ a⟩
R1.2 : ⟨(likes(U, V) | likes(V, U)) [0.9]; U ≠ V ∧ U ≠ a⟩.
R1.1 is the result of the substitution σ = {U/a} applied to R1, and R1.2 is the result of
adding the constraint U ≠ a to R1. The atom likes(U, V) of R1 that caused the first
interaction becomes likes(a, V) in R1.1, and its set of ground atoms is now identical to
the set of ground atoms from likes(a, V) of R2. The added constraint U ≠ a in R1.2
leads to disjoint sets of the ground atoms of the discussed atoms. The second interaction,
between R2 and R1.2, is also removed by applying (TE1), thereby replacing R1.2 with
R1.2.1 : ⟨(likes(U, a) | likes(a, U)) [0.9]; U ≠ a⟩
R1.2.2 : ⟨(likes(U, V) | likes(V, U)) [0.9]; U ≠ V ∧ U ≠ a ∧ V ≠ a⟩.
The resulting knowledge base is PU(RMI) = {R1.1, R1.2.1, R1.2.2, R2}.</p>
          <p>Intra-Rule Interactions While inter-rule interactions involve two conditionals,
intra-rule interactions occur within a single conditional and capture the notion of imbalanced
use. Suppose R is a conditional containing atoms PR, QR with the predicate symbols
P and Q, respectively.</p>
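          <p>The effect of (TE1) in Example 2 can be traced with a small Python sketch over our own tuple representation, in which σ(R) applies the substitution and σ̄(R) adds the inequation (constraint normalisation via ν is omitted):</p>

```python
def substitute(atom, subst):
    pred, args = atom
    return (pred, tuple(subst.get(t, t) for t in args))

def apply_TE1(atoms, constraint, var, const):
    """Replace a conditional (atoms, constraint) by sigma(R), applying
    {var/const} everywhere, and sigma_bar(R), adding var != const."""
    subst = {var: const}
    sigma = ([substitute(a, subst) for a in atoms],
             {frozenset(subst.get(t, t) for t in c) for c in constraint})
    sigma_bar = (list(atoms), constraint | {frozenset({var, const})})
    return sigma, sigma_bar

# R1 of Example 1 and the first interaction of Example 2 (sigma = {U/a}):
R1_atoms = [("likes", ("U", "V")), ("likes", ("V", "U"))]
R1_constr = {frozenset({"U", "V"})}
(r11_atoms, r11_constr), (r12_atoms, r12_constr) = \
    apply_TE1(R1_atoms, R1_constr, "U", "a")
# r11 corresponds to R1.1 : ⟨(likes(a, V) | likes(V, a)) [0.9]; V ≠ a⟩
# r12 corresponds to R1.2 : ⟨(likes(U, V) | likes(V, U)) [0.9]; U ≠ V ∧ U ≠ a⟩
```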
          <p>An intra-rule interaction of type 1 in R with respect to Q, regarding the variable V
and the constant c, denoted ⟨Q⟩V,c → R, is a situation with PR = P(…, U, …), QR =
Q(…, V, …), U ∉ vars(QR), CR ⊨ U ≠ V, CR ⊨ U ≠ c, and CR ⊭ V ≠ c.</p>
          <p>An intra-rule interaction of type 2 in R with respect to Q, regarding the variables
V and Z, denoted ⟨Q⟩V,Z → R, is a situation with PR = P(…, U, …), QR =
Q(…, V, …, Z, …), U ∉ vars(QR), CR ⊨ U ≠ V, CR ⊨ U ≠ Z, and CR ⊭ V ≠ Z.</p>
          <p>If SR with predicate symbol S is a further atom in R, an intra-rule interaction
of type 3 in R with respect to Q and S, regarding the variables V and Z, denoted
⟨Q, S⟩V,Z → R, is a situation with PR = P(…, U, …), QR = Q(…, V, …), SR =
S(…, Z, …), U ∉ vars(QR), U ∉ vars(SR), CR ⊨ U ≠ V, CR ⊨ U ≠ Z, and
CR ⊭ V ≠ Z.</p>
          <p>Each type of intra-rule interaction can be removed by one of the three rules (TA1),
(TA2), (TA3) in PU (Figure 1).</p>
          <p>Example 3 (Intra-rule interaction of type 1). Suppose that the signature for the
knowledge base RIA1 = {R1} contains the constants D = {a, b, c}:</p>
          <p>R1 : ⟨(p(U) | q(V)) [ξ]; U ≠ V ∧ U ≠ a⟩
There is an intra-rule interaction of type 1, denoted ⟨q⟩V,a → R1. The reason for this
interaction is that the number of occurrences of the ground atom q(a) in the admissible
ground instances of R1 is 2, but for q(b) (and also q(c)) it is only 1. Note that this imbalance
persists for any larger set of constants D. The application of (TA1) replaces R1 with
R1.1 : ⟨(p(U) | q(a)) [ξ]; U ≠ a⟩
R1.2 : ⟨(p(U) | q(V)) [ξ]; U ≠ V ∧ U ≠ a ∧ V ≠ a⟩
and the resulting knowledge base PU(RIA1) is parametrically uniform.</p>
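          <p>The occurrence counts in this example can be verified with a few lines of Python (a sketch over our own encoding of the admissible ground instances):</p>

```python
from itertools import product
from collections import Counter

# Example 3: admissible ground instances of R1 over D = {a, b, c},
# i.e. assignments (U, V) with U != V and U != a.
D = ["a", "b", "c"]
admissible = [(u, v) for u, v in product(D, D) if u != v and u != "a"]
occ = Counter(f"q({v})" for _, v in admissible)
print(occ)  # q(a) occurs twice, q(b) and q(c) once each: imbalanced use
```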
          <p>The importance of inter- and intra-rule interactions is that they fully capture the
reasons for a knowledge base to be not parametrically uniform. Correctness and
completeness of the set of interaction-removing transformation rules PU is given by the
following proposition:
Proposition 1. [17] Exhaustively applying PU to a knowledge base R yields a
knowledge base PU(R) such that R and PU(R) have the same maximum entropy model
and PU(R) is parametrically uniform.</p>
          <p>A proof of this proposition and further details of PU can be found in [17, 4]. As a
consequence of Proposition 1, we can obtain the maximum entropy model ME(R)
by computing ME(PU(R)), thereby reducing the number of optimisation parameters
to be considered from the number of admissible ground instances of R to the
number of conditionals in PU(R). For instance, considering a set D of constants having
15 elements in Ex. 1, there are 224 admissible ground instances for RMI, but only 4
conditionals in PU(RMI) (cf. Ex. 2).</p>
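          <p>The parameter count for this instance can be checked directly (a sketch; the counting follows the constraints U ≠ V of R1 and V ≠ a of R2 from Ex. 1):</p>

```python
n = 15                      # |D| = 15 constants
r1_instances = n * (n - 1)  # ground pairs (U, V) with U != V
r2_instances = n - 1        # ground instances of R2 with V != a
total = r1_instances + r2_instances
print(total)  # 224 optimisation parameters before the transformation,
              # versus 4 conditionals in PU(RMI)
```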
        </sec>
      </sec>
      <sec id="sec-1-2">
        <title>Implementation of the Transformation System PU</title>
        <p>In this section, we present an overview of the software system PUsys that implements
the transformation system PU. PUsys has been designed as a plugin for KReator [9],
which is an integrated development environment for relational probabilistic logic.
(Source code of KReator and PUsys can be found at http://kreator-ide.sourceforge.net/.)
The basic architecture of the implemented transformation system is illustrated in Figure 2.</p>
        <p>[Figure 2 shows the architecture of the implemented transformation system: an FO-PCL
knowledge base passes through lexer, parser, and AST into an object structure; a semantic
analysis checks type consistency and that the variables of the constraint formula occur in
the conditional; the transformation phase performs constraint normal form computation,
recognition of interactions, and application of the transformation rules.]</p>
        <p>The input knowledge base is parsed into an abstract syntax tree (AST) from which an
object structure is created. Thereby it is ensured that the types of variables and
constants in a conditional conform with the signature of the knowledge base (type
consistency) and that the variables of the constraint formula are used in either the premise
or the conclusion of the conditional. The key operations of the transformation phase are
the transformation of conditionals into constraint normal form, the recognition of
interactions, and the application of the transformation rules. These operations operate directly
on the object structure. After the transformation phase, the output knowledge base is
created from the resulting object structure.</p>
        <p>Transformation Modes PUsys ensures that all conditionals of the initial
knowledge base are transformed into constraint normal form. If more than one interaction
is found, one of these interactions has to be selected for the application of the
corresponding transformation rule. Therefore, PUsys offers different transformation modes
for different rule application strategies; moreover, PUsys can be easily extended with
additional transformation modes. The Interactive mode allows the user to control, monitor and
trace single steps of a transformation process through a graphical user interface. All
interactions present in a knowledge base are listed in a short-form notation. They can
be selected separately to apply the corresponding transformation rule or to view more
detailed information in a higher-level, declarative notation which originates from the
definition of the interactions. In the Automatic mode, an applicable transformation rule
is selected automatically and applied, and this step is repeated until a parametrically
uniform knowledge base is reached. The All Solutions transformation mode creates all
results that are obtainable by applying different orders of rule applications. Thereby,
it avoids the multiple generation of the same parametrically uniform knowledge base,
and moreover, it avoids the generation of knowledge bases that are just variants of each
other with respect to variable renamings. As this mode is of particular interest when
investigating properties of PU related e.g. to minimal solutions or confluence properties,
this mode will be described in more detail in Sec. 5.</p>
        <p>User Interface A transformation process can be started with KReator by executing a
KReator-Script [9]. All transformation parameters (e.g. transformation mode) can be set
either by using a graphical user interface or within the script itself. The transformation
system also supports batch processing for multiple knowledge base files. A screenshot
of the Interactive transformation mode applied to RMI from Ex. 1 is shown in Figure 3.
The right table displays the conditionals of the knowledge base to be transformed and
is updated accordingly after a transformation rule has been applied. All interactions
that have been recognised are listed in a short-form notation in the left table. They can
be selected to apply the corresponding transformation rule. The bottom pane displays
detailed information about the reasons why an interaction exists by tracing it back to
the declarative notation used in the definitions of the interactions.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Multiple Solutions</title>
      <p>The application of different transformation rules from PU to a knowledge base R may
lead to different parametrically uniform knowledge bases (though still having the same
maximum entropy model due to Proposition 1), i.e. PU is not confluent. The following
knowledge base presented in [16] illustrates this.</p>
      <p>Example 4. Let R = {R1, R2} be the knowledge base with the conditionals
R1 : ⟨(p(U, U) | q(V)) [0.2]; U ≠ V⟩ and R2 : ⟨(p(X, Y) | q(W)) [0.3]; ⊤⟩.</p>
      <sec id="sec-2-1">
        <title>There are three interactions in R:</title>
        <p>Choosing first the interaction Ia and applying PU exhaustively yields the
parametrically uniform knowledge base Ra with four conditionals. Choosing first the
interaction Ib and applying PU exhaustively yields Rb with six conditionals:</p>
        <p>R1 : ⟨(p(U, U) | q(V)) [0.2]; U ≠ V⟩
Rb2 : ⟨(p(Y, Y) | q(Y)) [0.3]; ⊤⟩
Rb3 : ⟨(p(X, Y) | q(X)) [0.3]; X ≠ Y⟩
Rb4 : ⟨(p(X, Y) | q(Y)) [0.3]; X ≠ Y⟩
Rb5 : ⟨(p(Y, Y) | q(W)) [0.3]; W ≠ Y⟩</p>
        <p>Rb6 : ⟨(p(X, Y) | q(W)) [0.3]; W ≠ X ∧ W ≠ Y ∧ X ≠ Y⟩
Choosing first the interaction Ic and applying PU exhaustively yields a knowledge
base Rc also with six conditionals; in fact, Rc differs from Rb only by a renaming of
variables.</p>
        <p>Thus, even when taking variable renamings into account, in Example 4, PU can
transform R into two different parametrically uniform knowledge bases, Ra and Rb.
Here, the choice of the interaction that gets removed first determines the solution, while
in general, the split into different solutions may occur at any stage of the
transformation process. From a computational point of view, one is especially interested in
small knowledge bases. Since it is not clear which particular choice of interaction
removal will lead to a minimal knowledge base, such a minimal knowledge base can be
obtained by selecting it from the set of all solutions.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Generation of All Solutions</title>
      <p>In principle, it is possible to enumerate the solutions in a simple way by branching out
the search algorithm every time there is more than one option which interaction
to remove first. However, this might not be feasible even for small knowledge bases. It
would also give no information about the number of unique solutions, i.e. solutions that
differ in more than a variable renaming.</p>
      <p>Knowledge bases obtained by PU whose conditionals differ only in variable naming
are equivalent. A source for this ambiguity in the transformation process is that equality
constraints are realised using substitutions. However, an equality constraint A = B can
be realised by a substitution A/B as well as by B/A if A and B are both variables.
Definition 1 (pt-equivalent conditionals). Let R be a knowledge base, R ∈ R, and
let σ = σn ∘ ⋯ ∘ σ1 and σ′ = σ′m ∘ ⋯ ∘ σ′1 be substitutions obtained from applying
two sequences of PU transformations to R. Then the conditionals σ(R) and σ′(R)
are equivalent with respect to PU transformations (or just pt-equivalent) iff there is a
variable renaming ρ such that ρ(σ(R)) = σ′(R).</p>
      <p>Note that this notion is relative to the root conditional R. For instance, in Example 4,
R2′ : ⟨(p(X, X) | q(X)) [0.3]; ⊤⟩</p>
      <p>R2″ : ⟨(p(Y, Y) | q(Y)) [0.3]; ⊤⟩
originating from R2 with the substitutions W/X and Y/X, respectively W/Y and X/Y,
are pt-equivalent as there is the variable renaming ρ = {X/Y} such that ρ(R2′) = R2″.</p>
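      <p>Checking pt-equivalence of two concrete conditionals amounts to searching for a consistent, injective variable renaming; the following Python sketch over our own tuple representation illustrates this for R2′ and R2″:</p>

```python
def is_var(t):
    # convention of this sketch: variables are capitalised, constants are not
    return t[0].isupper()

def renaming(atoms1, atoms2):
    """Return a variable renaming rho with rho(atoms1) == atoms2, or None
    if no consistent, injective renaming exists."""
    if len(atoms1) != len(atoms2):
        return None
    rho = {}
    for (p1, args1), (p2, args2) in zip(atoms1, atoms2):
        if p1 != p2 or len(args1) != len(args2):
            return None
        for t1, t2 in zip(args1, args2):
            if is_var(t1) and is_var(t2):
                if rho.setdefault(t1, t2) != t2:
                    return None       # t1 already mapped elsewhere
            elif t1 != t2:
                return None           # constants must match exactly
    # injectivity: no two variables may map to the same target
    return rho if len(set(rho.values())) == len(rho) else None

# R2' = ⟨(p(X,X) | q(X)) [0.3]; ⊤⟩ and R2'' = ⟨(p(Y,Y) | q(Y)) [0.3]; ⊤⟩:
r2p  = [("p", ("X", "X")), ("q", ("X",))]
r2pp = [("p", ("Y", "Y")), ("q", ("Y",))]
print(renaming(r2p, r2pp))  # the renaming rho = {X/Y}
```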
      <p>An algorithm to find the solutions has to make two choices during the process:
(Q1) which interaction-removing substitutions can be applied to a given conditional,
and (Q2) which of these choices lead to genuinely different solutions.</p>
      <p>[Figure 4 shows the auxiliary graph representing the solution knowledge base Rb from
Example 4, with the root conditional R2 = ⟨(p(X, Y) | q(W)) [0.3]; ⊤⟩, its
interaction-removing substitution nodes, and the leaf conditionals Rb2–Rb6.]</p>
      <p>A naive approach might choose all conditionals and all transformations, leading to an
algorithm that branches out extremely and that may generate a huge number of
knowledge bases. The algorithm introduced in this paper uses an auxiliary graph to answer
those two questions in a more sensible way. An auxiliary graph is essentially a
representation of the set of knowledge bases that can be reached through the transformation
process. It is a directed graph containing two types of nodes: conditional nodes,
representing a single conditional, and substitution nodes, representing a substitution acting
on a conditional. The nodes are connected such that conditional nodes are connected
to their respective interaction-removing substitution nodes, and substitution nodes are
connected to the conditional nodes that are the result of applying said substitution to
the parent conditional.</p>
      <p>Example 5. Fig. 4 is an auxiliary graph representing the solution knowledge base Rb
from Example 4. It shows what such graphs look like: On the top level there are the
conditionals of the original knowledge base (rectangles). Below these there are the
interaction-removing substitutions (ellipses), connected to the conditional node R they
apply to and to the two resulting conditional nodes σ(R) and σ̄(R). This means that
each substitution node has exactly one incoming and two outgoing edges. The
conditionals in Rb are precisely the six leaf nodes in the graph.</p>
      <p>Such an auxiliary graph can also be constructed for the whole transformation
process behind PU. The algorithm starts with the empty graph and adds a conditional node
for each conditional in the initial knowledge base. Then we successively pick one
conditional node, compute the set of conditional nodes in the graph that it can possibly
interact with, check for interactions with said nodes, and add the corresponding
substitution nodes for the found interactions. When a substitution node gets added, we
also have to connect its outgoing edges. At this point we use the equivalence between
conditionals from Definition 1 to check whether a pt-equivalent conditional is already
contained in the graph. If this is the case, then it suffices to connect the substitution node
to said conditional node, and we do not have to add a new conditional node to the graph.</p>
      <p>Example 6. Fig. 5 is the auxiliary graph corresponding to the knowledge base R from
Example 4. In the first row, there are the two conditionals of the original knowledge
base R, and the second row contains the substitution nodes corresponding to the three
interactions Ia, Ib, Ic in R. The third row contains the six conditionals obtained by
applying the corresponding interaction-removing transformations. The fourth row
contains the substitution nodes corresponding to the interactions among the conditionals in
the third row. Note that three of the resulting conditionals in the fifth row have multiple
incoming edges since, up to pt-equivalence, they can be generated in different ways.</p>
      <p>This operation effectively transforms the graph from a tree into a directed acyclic
graph. This graph can now answer the question Q1 posed before: The substitution nodes
denote exactly the substitutions that can be applied to their parent conditional node during
the interaction removal process.</p>
      <p>In order to answer question Q2, the auxiliary graph is reduced by identifying and
removing redundancies caused by substitution nodes. We give an exemplary
situation of removing redundancies introduced by the order of interaction removal for
independent interactions on the same conditional: Let R ∈ R be a conditional that has
two interactions in R with interaction-removing substitutions σ1, σ2. Assume that those
are independent, i.e. removing one interaction changes nothing about the other
interaction. Then the graph will contain both σ1 and σ2 as substitution nodes below R. As
these are independent from each other, σ2 is also a substitution child node of σ1(R) as
well as of σ̄1(R), and vice versa. Thus, both substitution nodes σ1 and σ2 below R lead
to the same conditionals below. This means that we can fuse the two substitution child
nodes of R into one substitution node {σ1, σ2}. We can pick an arbitrary representative
that will represent the fused substitution node and especially determine its edges.
Removing all such redundancies in a bottom-up manner yields the reduced auxiliary graph,
which is uniquely determined if a choice function for picking a representative for fused
substitution nodes is given.</p>
      <p>Example 7. Fig. 6 shows the reduced graph for Example 4. Note how there is just one
conditional node with more than one substitution child node, corresponding to R2.</p>
      <p>The reduced graph can be used to determine which interaction-removing
substitutions on a given conditional are sufficient for generating all solutions. Starting with the
set M containing the conditional nodes in the first row of the graph (i.e., the set of
conditionals in the original knowledge base), do the following: While there is a conditional
node C in M that is not a leaf node, choose (non-deterministically) one of C’s child
substitution nodes and replace C in M by the two child nodes of the chosen substitution
node. Using the reduced graph in this way allows the generation of all solutions without
getting multiple copies or variable-renamed versions.</p>
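      <p>The generation procedure just described can be sketched in Python over a toy reduced graph; the node names and the dictionary representation are our own, and the graph merely mirrors the branching structure of Fig. 6 rather than reproducing it exactly:</p>

```python
def all_solutions(roots, cond_children, subst_children):
    """Expand every non-leaf conditional node in M along each of its
    substitution child nodes; return the set of reachable leaf-node sets,
    i.e. the distinct solutions."""
    solutions = set()
    stack = [frozenset(roots)]
    while stack:
        M = stack.pop()
        expandable = [c for c in M if cond_children.get(c)]
        if not expandable:
            solutions.add(M)  # M consists of leaf nodes only: one solution
            continue
        c = expandable[0]
        for s in cond_children[c]:
            stack.append((M - {c}) | set(subst_children[s]))
    return solutions

# Toy graph: root r2 offers two substitution choices; one leads directly to
# leaves {a1, a2}, the other, after a further forced step, to {b1, b2, b3}.
cond_children = {"r2": ["s1", "s2"], "m": ["s3"]}
subst_children = {"s1": ["a1", "a2"], "s2": ["b1", "m"], "s3": ["b2", "b3"]}
sols = all_solutions(["r1", "r2"], cond_children, subst_children)
print(sorted(sorted(s) for s in sols))  # two distinct solutions
```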
      <p>[Fig. 5. The auxiliary graph of Example 4.]</p>
      <p>Example 8. As there is only one conditional node with more than one substitution child
node in the reduced graph in Fig. 6 (i.e. R2), there is only one (non-deterministic) choice
to be made. Thus, the graph represents exactly the two parametrically uniform solutions
Ra and Rb (cf. Example 4), which correspond to the leaf nodes obtained by choosing
either the left substitution child node X/Y or the right substitution child node X/W of R2.</p>
    </sec>
    <sec id="sec-4">
      <title>Applications and First Evaluation Results</title>
      <p>The system PU sys was successfully applied to many different examples, including all
examples discussed in [10, 17, 8, 4]. As expected, the optimised generation of all
solutions presented in Sec. 5 turned out to be much more efficient than an naive approach
of simply executing all possible transformation sequences. For instance, for the
knowledge base R (Example 4) exactly the two solutions were generated, compared to 28
knowledge bases in the naive approach. There are also examples, where the naive
approach generates many solutions (e.g. 96), while the optimised version yields a single
unique solution [2]. For another knowledge base, all four (non-redundant) solutions
were generated within 10 seconds, while the naive approach did not terminate within
more than four hours. In [2], PU sys is tested and evaluated with respect to many
different knowledge bases, including a series of synthetically generated knowledge bases
varying in the number of sorts, constants, predicates, predicate arities, and
conditionals. Among these parameters, increasing the arity of predicates in particular increased the
number of conditionals in PU(R) due to an increased number of interactions in the
original knowledge base R. For all knowledge bases considered in [10, 17, 8, 4], the
increase from |R| to |PU(R)| was at most by a factor of 2.2, while for some synthetically
generated knowledge bases there was a factor of 8.1. However, as pointed out in Sec. 2,
crucial for the maximum entropy computation is the relationship between |PU(R)| and
|gnd(R)| (i.e. the number of admissible groundings of R); in the evaluation in [2], this
relationship is often characterised by factors in the order of magnitude between 10<sup>2</sup> and
10<sup>4</sup>. As shown in [8], this factor between |PU(R)| and |gnd(R)| is responsible for the
simplification obtained in computing ME(R) by determining ME(PU(R)) instead.</p>
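      <p>The orders of magnitude above can be made concrete with a back-of-the-envelope computation. The sketch below uses hypothetical numbers, not a knowledge base from the paper: a conditional with k free variables over a domain of n constants has up to n^k ground instances, so |gnd(R)| grows exponentially in the number of variables while |PU(R)| stays within a small factor of |R|:</p>

```python
# Back-of-the-envelope sketch (hypothetical numbers, not from the paper):
# a conditional with k free variables over n constants has up to n**k
# ground instances (ignoring constraints that exclude some groundings).
def max_groundings(n_constants, var_counts):
    return sum(n_constants ** k for k in var_counts)

var_counts = [2, 2, 3]   # three conditionals with 2, 2 and 3 free variables
n = 10                   # a domain with 10 constants
gnd = max_groundings(n, var_counts)
print(gnd)               # 100 + 100 + 1000 = 1200

# Even if PU blows the knowledge base up by the worst factor reported in
# the evaluation (2.2), the ME computation works on a set smaller than
# the full grounding by roughly two orders of magnitude:
pu_upper = 2.2 * len(var_counts)
print(round(gnd / pu_upper))   # ~182, i.e. in the order of 10**2
```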
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Further Work</title>
      <p>The software system PU sys implements the transformation system PU and provides
an optimised generation of all possible solutions, thereby transforming any consistent
FO-PCL knowledge base R into a semantically equivalent and parametrically uniform
one and thus simplifying the maximum entropy model computation. An open question
is how PU should be modified so that a confluent set of transformation rules is
obtained. While the objective of PU is to simplify the ME model computation, thereby
not taking inference into account, the techniques used in PU are related to techniques
like shattering used in lifted inference [21, 5, 13]. Our current work also includes the
development of sophisticated methods of using the obtained maximum entropy model
ME (R) for deriving the probability ME (R)(F ) for a given formula or conditional F ;
here, techniques of lifted inference might be employed.</p>
      <p>2. A. Becker. Test, evaluation and assessment of a transformation system for knowledge bases
with relational probabilistic conditionals. M.Sc. Thesis, Dept. of Computer Science,
University of Hagen, Germany, 2014. (in German).
3. C. Beierle, M. Höhnerbach, and M. Marto. Implementation of a transformation system for
relational probabilistic knowledge bases simplifying the maximum entropy model
computation. In Proc. FLAIRS 2014, pages 486–489, Menlo Park, CA, 2014. AAAI Press.
4. C. Beierle and A. Krämer. Achieving parametric uniformity for knowledge bases in a
relational probabilistic conditional logic with maximum entropy semantics. Annals of
Mathematics and Artificial Intelligence, 2014. (to appear).
5. R. de Salvo Braz, E. Amir, and D. Roth. Lifted first-order probabilistic inference. In L. P.</p>
      <p>Kaelbling and A. Saffiotti, editors, IJCAI-05, Proceedings of the Nineteenth International
Joint Conference on Artificial Intelligence, pages 1319–1325. Professional Book Center,
2005.
6. R. Fagin, J. Halpern, and N. Megiddo. A logic for reasoning about probabilities. Information
and Computation, 87:78–128, 1990.
7. R. Fagin and J. Y. Halpern. Reasoning about knowledge and probability. J. ACM, 41(2):340–
367, 1994.
8. M. Finthammer and C. Beierle. How to exploit parametric uniformity for maximum
entropy reasoning in a relational probabilistic logic. In L. Fariñas del Cerro, A. Herzig, and
J. Mengin, editors, JELIA, volume 7519 of LNAI, pages 189–201. Springer, 2012.
9. M. Finthammer and M. Thimm. An integrated development environment for probabilistic
relational reasoning. Logic Journal of the IGPL, 20(5):831–871, 2012.
10. J. Fisseler. Learning and Modeling with Probabilistic Conditional Logic, volume 328 of</p>
      <p>Dissertations in Artificial Intelligence. IOS Press, Amsterdam, 2010.
11. L. Getoor and B. Taskar, editors. Introduction to Statistical Relational Learning. MIT Press,
2007.
12. J. Halpern. Reasoning About Uncertainty. MIT Press, 2005.
13. A. Jaimovich, O. Meshi, and N. Friedman. Template based inference in symmetric relational
Markov random fields. In Proceedings of the Twenty-Third Conference on Uncertainty in
Artificial Intelligence. AUAI Press, 2007.
14. G. Kern-Isberner. Characterizing the principle of minimum cross-entropy within a
conditional-logical framework. Artificial Intelligence, 98:169–208, 1998.
15. G. Kern-Isberner and T. Lukasiewicz. Combining probabilistic logic programming with the
power of maximum entropy. Artificial Intelligence, Special Issue on Nonmonotonic
Reasoning, 157(1-2):139–202, 2004.
16. A. Krämer. Transformation rules for lifted inference in relational probabilistic logic
knowledge bases. B.Sc. Thesis, Dept. of Computer Science, University of Hagen, Germany, 2011.
17. A. Krämer and C. Beierle. On lifted inference for a relational probabilistic conditional logic
with maximum entropy semantics. In T. Lukasiewicz and A. Sali, editors, Foundations of
Information and Knowledge Systems (FoIKS 2012), volume 7153 of LNCS, pages 224–243.</p>
      <p>Springer, 2012.
18. N. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71–87, 1986.
19. D. Nute and C. Cross. Conditional logic. In D. Gabbay and F. Guenther, editors, Handbook
of Philosophical Logic, volume 4, pages 1–98. Kluwer Academic Publishers, second edition,
2002.
20. J. Paris. The uncertain reasoner’s companion – A mathematical perspective. Cambridge</p>
      <p>University Press, 1994.
21. D. Poole. First-order probabilistic inference. In G. Gottlob and T. Walsh, editors, Proc.</p>
      <p>IJCAI-03, pages 985–991. Morgan Kaufmann, 2003.
22. M. Thimm, G. Kern-Isberner, and J. Fisseler. Relational probabilistic conditional reasoning
at maximum entropy. In ECSQARU, volume 6717 of LNCS, pages 447–458. Springer, 2011.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>E.</given-names>
            <surname>Adams</surname>
          </string-name>
          . The Logic of Conditionals. D. Reidel, Dordrecht,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>