<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>T</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>MCINTYRE: A Monte Carlo Algorithm for Probabilistic Logic Programming</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>44122 Ferrara</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <volume>1</volume>
      <fpage>0</fpage>
      <lpage>2</lpage>
      <abstract>
        <p>Probabilistic Logic Programming is receiving an increasing attention for its ability to model domains with complex and uncertain relations among entities. In this paper we concentrate on the problem of approximate inference in probabilistic logic programming languages based on the distribution semantics. A successful approximate approach is based on Monte Carlo sampling, that consists in verifying the truth of the query in a normal program sampled from the probabilistic program. The ProbLog system includes such an algorithm and so does the cplint suite. In this paper we propose an approach for Monte Carlo inference that is based on a program transformation that translates a probabilistic program into a normal program to which the query can be posed. In the transformation, auxiliary atoms are added to the body of rules for performing sampling and checking for the consistency of the sample. The current sample is stored in the internal database of the Yap Prolog engine. The resulting algorithm, called MCINTYRE for Monte Carlo INference wiTh Yap REcord, is evaluated on various problems: biological networks, arti cial datasets and a hidden Markov model. MCINTYRE is compared with the Monte Carlo algorithms of ProbLog and cplint and with the exact inference of the PITA system. The results show that MCINTYRE is faster than the other Monte Carlo algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>Probabilistic Logic Programming</kwd>
        <kwd>Monte Carlo Methods</kwd>
        <kwd>Logic Programs with Annotated Disjunctions</kwd>
        <kwd>ProbLog</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Probabilistic Logic Programming (PLP) is an emerging eld that has recently
seen many proposals for the integration of probability in logic programming.
Such an integration overcomes the limit of logic of dealing only with certain
propositions and the limit of works in probability theory that consider mostly
simple descriptions of domain entities instead of complex relational descriptions.</p>
      <p>
        PLP is of interest also for its many application domains, the most promising
of which is maybe Probabilistic Inductive Logic Programming [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] in which PLP
languages are used to represent the theories that are induced from data. This
allows a richer representation of the domains that often leads to increased
modeling accuracy. This trend can be cast in a more general tendency in Machine
Learning to combine aspects of uncertainty with aspects of logic, as is testi ed
by the development of the eld of Statistical Relational Learning [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Many languages have been proposed in PLP. Among them, many share a
common approach for de ning the semantics, namely the so called distribution
semantics [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. This approach sees a probabilistic logic program as a
description of a probability distribution over normal logic programs, from which the
probability of queries is computed. Example of languages following the
distribution semantics are Probabilistic Logic Programs [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], Probabilistic Horn
Abduction [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], Independent Choice Logic [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], PRISM [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], Logic Programs with
Annotated Disjunctions (LPADs) [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] and ProbLog [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. These languages have
essentially the same expressive power [
        <xref ref-type="bibr" rid="ref20 ref4">20,4</xref>
        ] and in this paper we consider only
LPADs and ProbLog, because they stand at the extremes of syntax complexity,
LPADs having the most complex syntax and ProbLog the simplest, and because
most existing inference algorithms can be directly applied to them.
      </p>
      <p>
        The problem of inference, i.e., the problem of computing the probability of a
query from a probabilistic logic program, is very expensive, being #P complete
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Nevertheless, various exact inference algorithms have been proposed, such
as the ProbLog system1 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], cplint2 [
        <xref ref-type="bibr" rid="ref12 ref13">12,13</xref>
        ] and PITA3 [
        <xref ref-type="bibr" rid="ref14 ref16">14,16</xref>
        ] and have been
successfully applied to a variety of non-trivial problems. All of these algorithms
nd explanations for queries and then use Binary Decision Diagrams (BDDs)
for computing the probability. This approach has been shown to be faster than
algorithms not using BDDs. Reducing the time to answer a probabilistic query
is important because in many applications, such as in Machine Learning, a high
number of queries must be issued. To improve the speed, approximate inference
algorithms have been proposed. Some compute a lower bound of the probability,
as the k-best algorithm of ProbLog [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] which considers only the k most probable
explanations for the query, while some compute an upper and a lower bound,
as the bounded approximation algorithm of ProbLog [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] that builds an SLD
tree only to a certain depth. A completely di erent approach for approximate
inference is based on sampling the normal programs encoded by the
probabilistic program and checking whether the query is true in them. This approach,
called Monte Carlo, was rst proposed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] for ProbLog, where a lazy sampling
approach was used in order to avoid sampling unnecessary probabilistic facts.
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presented algorithms for k-best, bounded approximation and Monte Carlo
inference for LPADs that are all based on a meta-interpreter. In particular, the
Monte Carlo approach uses the arguments of the meta-interpreter predicate to
store the samples taken and to ensure consistency of the sample.
      </p>
      <p>In this paper we present the algorithm MCINTYRE for Monte Carlo
INference wiTh Yap REcord that computes the probability of queries by means of
a program transformation technique. The disjunctive clauses of an LPAD are
rst transformed into normal clauses to which auxiliary atoms are added to the
body for taking samples and storing the results. The internal database of the
1 http://dtai.cs.kuleuven.be/problog/
2 http://www.ing.unife.it/software/cplint/
3 https://sites.google.com/a/unife.it/ml/pita
Yap Prolog engine is used to record all samples taken thus ensuring that samples
are consistent. The truth of a query in a sampled program can be then tested
by asking the query to the resulting normal program.</p>
      <p>MCINTYRE is compared with the Monte Carlo algorithms of ProbLog and
cplint and with the exact inference algorithm of the PITA system on various
problems: biological networks, arti cial datasets and a hidden Markov model.
The results show that the performances of MCINTYRE overcome those of the
other Monte Carlo algorithms.</p>
      <p>The paper is organized as follows. In Section 2 we review the syntax and the
semantics of PLP. Section 3 illustrates previous approaches for inference in PLP
languages. Section 4 presents the MCINTYRE algorithm. Section 5 describes
the experiments and Section 6 concludes the paper.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Probabilistic Logic Programming</title>
      <p>
        One of the most interesting approaches to the integration of logic programming
and probability is the distribution semantics [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], which was introduced for the
PRISM language but is shared by many other languages.
      </p>
      <p>
        A program in one of these languages de nes a probability distribution over
normal logic programs called worlds. This distribution is then extended to queries
and the probability of a query is obtained by marginalizing the joint distribution
of the query and the programs. We present the semantics for programs without
function symbols but the semantics has been de ned also for programs with
function symbols [
        <xref ref-type="bibr" rid="ref15 ref17">17,15</xref>
        ].
      </p>
      <p>
        The languages following the distribution semantics di er in the way they
de ne the distribution over logic programs. Each language allows
probabilistic choices among atoms in clauses: Probabilistic Logic Programs, Probabilistic
Horn Abduction , Independent Choice Logic, PRISM and ProbLog allow
probability distributions over facts, while LPADs allow probability distributions over
the heads of disjunctive clauses. All these languages have the same expressive
power: there are transformations with linear complexity that can convert each
one into the others [
        <xref ref-type="bibr" rid="ref20 ref4">20,4</xref>
        ]. Next we will discuss LPADs and ProbLog.
      </p>
      <p>
        Formally a Logic Program with Annotated Disjunctions T [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] consists of a
nite set of annotated disjunctive clauses. An annotated disjunctive clause Ci is
of the form hi1 : i1; : : : ; hini : ini : bi1; : : : ; bimi . In such a clause hi1; : : : hini
are logical atoms and bi1; : : : ; bimi are logical literals, f i1; : : : ; ini g are real
numbers in the interval [0; 1] such that Pkn=i1 ik 1. bi1; : : : ; bimi is called
the body and is indicated with body(Ci). Note that if ni = 1 and i1 = 1 the
clause corresponds to a non-disjunctive clause. If Pkn=i1 ik &lt; 1 the head of the
annotated disjunctive clause implicitly contains an extra atom null that does
not appear in the body of any clause and whose annotation is 1 Pkn=i1 ik. We
denote by ground(T ) the grounding of an LPAD T .
      </p>
      <p>An atomic choice is a triple (Ci; j; k) where Ci 2 T , j is a substitution
that grounds Ci and k 2 f1; : : : ; nig. (Ci; j; k) means that, for the ground
clause Ci j, the head hik was chosen. In practice Ci j corresponds to a random
variable Xij and an atomic choice (Ci; j ; k) to an assignment Xij = k. A set of
atomic choices is consistent if (Ci; j ; k) 2 ; (Ci; j ; l) 2 ) k = l, i.e., only
one head is selected for a ground clause. A composite choice is a consistent set
of atomic choices. The probability P ( ) of a composite choice is the product of
the probabilities of the individual atomic choices, i.e. P ( ) = Q(Ci; j;k)2 ik.</p>
      <p>A selection is a composite choice that contains an atomic choice (Ci; j ; k)
for each clause Ci j in ground(T ). A selection identi es a normal logic program
w de ned as w = f(hik : body(Ci)) j j(Ci; j ; k) 2 g. w is called a world of
T . Since selections are composite choices, we can assign a probability to possible
worlds: P (w ) = P ( ) = Q(Ci; j;k)2 ik.</p>
      <p>Since the program does not have function symbols, the set of worlds is nite:
WT = fw1; : : : ; wmg and, since the probabilities of the individual choices sum to
1, P (w) is a distribution over worlds: Pw2WT P (w) = 1. We also assume that
each world w has a two-valued well founded model W F M (w). If a query Q is
true in W F M (w) we write w j= Q.</p>
      <p>We can de ne the conditional probability of a query Q given a world: P (Qjw) =
1 if w j= Q and 0 otherwise. The probability of the query can then be obtained
by marginalizing over the query</p>
      <p>P (Q) =</p>
      <p>X P (Q; w) =
w</p>
      <p>X P (Qjw)P (w) = X P (w)
w
wj=Q
Example 1. The following LPAD T encodes a very simple model of the
development of an epidemic or a pandemic:</p>
      <p>C1 = epidemic : 0:6; pandemic : 0:3 : f lu(X); cold:
C2 = cold : 0:7:
C3 = f lu(david):</p>
      <p>C4 = f lu(robert):
This program models the fact that if somebody has the u and the climate
is cold, there is the possibility that an epidemic or a pandemic arises. We are
uncertain about whether the climate is cold but we know for sure that David
and Robert have the u. Clause C1 has two groundings, both with three atoms
in the head, while clause C2 has a single grounding with two atoms in the head,
so overall there are 3 3 2 = 18 worlds. The query epidemic is true in 5 of
them and its probability is</p>
      <p>
        P (epidemic) = 0:6 0:6 0:7 + 0:6 0:3 0:7 + 0:6 0:1 0:7+
0:3 0:6 0:7 + 0:1 0:6 0:7 =
0:588
A ProbLog program is composed by a set of normal clauses and a set of
probabilistic facts, possibly non-ground. A probabilistic fact takes the form
:: f:
where is in [
        <xref ref-type="bibr" rid="ref1">0,1</xref>
        ] and f is an atom. The semantics of such program can be
given by considering an equivalent LPAD containing, for each ProbLog normal
B, a clause h : 1 :
      </p>
      <p>B and, for each probabilistic ProbLog fact, a
clause h :
clause</p>
      <p>f : :
The semantics of the ProbLog program is the same as that of the equivalent
LPAD.</p>
      <p>
        It is also possible to translate an LPAD into a ProbLog program [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A clause
Ci of the LPAD with variables X
hi1 : i1; : : : ; hini : ini :
      </p>
      <p>Bi
is translated into
in 1 :: fini 1(X):
where problog not=1 is a ProbLog builtin predicate that implements negation
for probabilistic atoms and i1 = i1, i2 = 1 i2i1 , i3 = (1 i1)i(31 i2) ; : : :. In
general ij = Qjk=11(i1j ik) .</p>
      <p>Example 2. The ProbLog program equivalent to the LPAD of Example 1 is
C11 = epidemic : f lu(X); cold; f 1(X):
C12 = pandemic : f lu(X); cold; problog not(f 1(X)); f 2(X):
C13 = 0:6 :: f 1(X):
C14 = 0:75 :: f 2(X):
C21 = cold : f 3:
C22 = 0:7 :: f 3:
C3 = f lu(david):</p>
      <p>C4 = f lu(robert):
3</p>
    </sec>
    <sec id="sec-3">
      <title>Inference Algorithms</title>
      <p>
        In order to compute the probability of a query from a probabilistic logic program,
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] proposed the ProbLog system that rst nds a set of explanations for the
query and then computes the probability from the set by using Binary Decision
Diagrams. An explanation is a set of probabilistic facts used in a derivation of the
query. The set of explanations can be seen as a Boolean DNF formula in which
the Boolean propositions are random variables. Computing the probability of
the formula involves solving the disjoint sum problem which is #P-complete
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. BDDs represent an approach for solving this problem that has been shown
to work well in practice [
        <xref ref-type="bibr" rid="ref13 ref14 ref6">6,13,14</xref>
        ].
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposed various approaches for approximate inference that are now
included in the ProbLog system. The k-best algorithm nds only the k most
probable explanations for a query and then builds a BDD from them. The resulting
probability is only a lower bound but if k is su ciently high it represents a
good approximation. The bounded approximation algorithm computes a lower
bound and an upper bound of the probability of the query by using iterative
deepening to explore the SLD tree for the query. The SLD tree is built partially,
the successful derivations it contains are used to build a BDD for computing
the lower bound while the successful derivations plus the incomplete ones are
used to compute the upper bound. If the di erence between the upper and the
lower bound is above the required precision, the SLD tree is built up to a greater
depth. This process is repeated until the required precision is achieved. These
algorithms are implemented by means of a program transformation technique
applied to the probabilistic atoms: they are turned into clauses that, when the
atom is called, add the probabilistic fact to the current explanation.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presented an implementation of k-best and bounded approximation for
LPADs that is based on a meta-interpreter and showed that in some cases this
gives good results.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] also presented a Monte Carlo algorithm that samples the possible
programs and tests the query in the samples. The probability of the query is then
given by the fraction of programs where the query is true. The Monte Carlo
algorithm for ProbLog is realized by using an array with an element for each
ground probabilistic fact that stores one of three values: sampled true, sampled
false and not yet sampled. When a probabilistic fact is called, the algorithm rst
checks whether the fact has already been sampled by looking at the array. If
it has not been sampled, then it samples it and stores the result in the array.
Probabilistic facts that are non-ground in the program are treated di erently. A
position in the array is not reserved for them since their grounding is not known
at the start, rather samples for groundings of these facts are stored in the
internal database of Yap and the sampled value is retrieved when they are called.
If no sample has been taken for a grounding, a sample is taken and recorded in
the database.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presents a Monte Carlo algorithm for LPADs that is based on a
metainterpreter. In order to keep track of the samples taken, two arguments of the
meta-interpreter predicate are used, one for keeping the input set of choices and
one for the output set of choices. This algorithm is included in the cplint suite
available in the source tree of Yap4.
4 http://www.dcc.fc.up.pt/~vsc/Yap/downloads.html
      </p>
    </sec>
    <sec id="sec-4">
      <title>MCINTYRE</title>
      <p>: : :
MCINTYRE rst transforms the program and then queries the transformed
program. The disjunctive clause Ci = hi1 : i1 _ : : : _ hin : ini : bi1; : : : ; bimi :
where the parameters sum to 1, is transformed into the set of clauses M C(Ci):
M C(Ci; 1) = hi1 : bi1; : : : ; bimi ;</p>
      <p>sample head(P arList; i; V C; N H); N H = 1:
M C(Ci; ni) = hini : bi1; : : : ; bimi ;</p>
      <p>sample head(P arList; i; V C; N H); N H = ni:
where V C is a list containing each variable appearing in Ci and P arList is
[ i1; : : : ; ini ]. If the parameters do not sum up to 1 the last clause (the one
for null ) is omitted. Basically, we create a clause for each head and we sample a
head index at the end of the body with sample head/4. If this index coincides
with the head index, the derivation succeeds, otherwise it fails. Thus failure can
occur either because one of the body literals fails or because the current clause
is not part of the sample.</p>
      <p>For example, clause C1 of epidemic example becomes
M C(C1; 1) = epidemic : f lu(X); cold;</p>
      <p>sample head([0:6; 0:3; 0:1]; 1; [X]; N H); N H = 1:
M C(C1; 2) = pandemic : f lu(X); cold;</p>
      <p>sample head([0:6; 0:3; 0:1]; 1; [X]; N H); N H = 2:
The predicate sample head/4 samples an index from the head of a clause and
uses the builtin Yap predicates recorded/3 and recorda/3 for respectively
retrieving or adding an entry to the internal database. Since sample head/4 is at
the end of the body and since we assume the program to be range restricted, at
that point all the variables of the clause have been grounded. A program is range
restricted if all the variables appearing in the head also appear in positive literals
in the body. If the rule instantiation had already been sampled, sample head/4
retrieves the head index with recorded/3, otherwise it samples a head index
with sample/2:
sample_head(_ParList,R,VC,NH):</p>
      <p>recorded(exp,(R,VC,NH),_),!.
sample_head(ParList,R,VC,NH):sample(ParList,NH),
recorda(exp,(R,VC,NH),_).
sample(ParList, HeadId)
:random(Prob),
sample(ParList, 0, 0, Prob, HeadId).
sample([HeadProb|Tail], Index, Prev, Prob, HeadId)
:Succ is Index + 1,
Next is Prev + HeadProb,
(Prob =&lt; Next -&gt;
;
).</p>
      <p>HeadId = Index
sample(Tail, Succ, Next, Prob, HeadId)
Tabling can be e ectively used to avoid re-sampling the same atom. To take a
sample from the program we use the following predicate
sample(Goal):abolish_all_tables,
eraseall(exp),
call(Goal).</p>
      <p>A xed number of samples n is taken and the fraction p^ of samples in which
the query succeds is computed. In order to compute the con dence interval of p^,
we use the central limit theorem to approximate the binomial distribution with
a normal distribution. Then the 95% binomial proportion con dence interval is
calculated as
where z1 =2 is the 1 =2 percentile of a standard normal distribution and
usually = 0:05 . If the width of the interval is below a user de ned threshold
, we stop and we return the fraction of successful samples.</p>
      <p>This estimate of the con dence interval is good for a sample size larger than
30 and if p^ is not too close to 0 or 1. The normal approximation fails totally
when the sample proportion is exactly zero or exactly one. Empirically, it has
been observed that the normal approximation works well as long as np^ &gt; 5 and
n(1 p^) &gt; 5.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Experiments</title>
      <p>
        We considered three sets of benchmarks: graphs of biological concepts from [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
arti cial datasets from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and a hidden Markov model from [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. On these dataset,
we compare MCINTYRE, the Monte Carlo algorithm of ProbLog [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the Monte
Carlo algorithm of cplint [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and the exact system PITA which has been shown
to be particularly fast [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. All the experiments have been performed on Linux
machines with an Intel Core 2 Duo E6550 (2333 MHz) processor and 4 GB of
RAM. The algorithms were run on the data for 24 hours or until the program
ended for lack of memory. = 0:01 was chosen as the maximum con dence
interval width for Monte Carlo algorithms. The normal approximation tests np^ &gt;
5 and n(1 p^) &gt; 5 were disabled in MCINTYRE because they are not present
in ProbLog. For each experiment we used tabling when it gave better results.
      </p>
      <p>
        In the graphs of biological concepts the nodes encode biological entities such
as genes, proteins, tissues, organisms, biological processes and molecular
functions, and the edges conceptual and probabilistic relations among them. Edges
are thus represented by ground probabilistic facts. The programs have been
sampled from the Biomine network [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] containing 1,000,000 nodes and 6,000,000
edges. The sampled programs contain 200, 400, : : :, 10000 edges. Sampling was
repeated ten times, to obtain ten series of programs of increasing size. In each
program we query the probability that the two genes HGNC 620 and HGNC 983
are related.
      </p>
      <p>For MCINTYRE and ProbLog we used the following de nition of path
path(X,X).
path(X,Y):-X\==Y, path(X,Z),arc(Z,Y).
arc(X,Y):-edge(Y,X).
arc(X,Y):-edge(X,Y).</p>
      <p>For MCINTYRE, we tabled path/2 using Yap tabling with the directive
:- table path/2.
while for ProbLog we tabled the path predicate by means of ProbLog tabling
with the command
problog_table(path/2),
For PITA we used the program
path(X,Y):-path(X,Y,[X],Z).
path(X,X,A,A).
path(X,Y,A,R):-X\==Y, arc(X,Z), \+ member(Z,A), path(Z,Y,[Z|A],R).
arc(X,Y):-edge(Y,X).
arc(X,Y):-edge(X,Y).
that performs loop checking by keeping a list of visited nodes rather than by using
tabling because this approach gave the best results. We used the same program
also for cplint because it does not allow to use tabling for loop checking.</p>
      <p>Figure 1(a) shows the number of graphs for each size for which MCINTYRE,
ProbLog, cplint and PITA were able to compute the probability. Figure 1(b)
shows the execution times of the four algorithms as a function of graph size
averaged over the graphs on which the algorithms succeeded.</p>
      <p>MCINTYRE and ProbLog were able to solve all graphs, while PITA and
cplint stopped much earlier. As regards speed, MCINTYRE is much faster
than cplint and slightly faster than ProbLog. For non-small programs it is also
faster than PITA.</p>
      <p>
        The growing head dataset from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] contains propositional programs in which
the head of clauses are of increasing size. For example, the program for size 4 is
a0 :- a1.
a1:0.5.
a0:0.5; a1:0.5 :- a2.
a2:0.5.
a0:0.333333333333; a1:0.333333333333; a2:0.333333333333 :- a3.
a3:0.5.
      </p>
      <p>2000 4000Edge6s000 8000 10000
(a) Solved graphs.</p>
      <p>2000 4000Size6000 8000 10000
(b) Average execution times .</p>
      <p>The equivalent ProbLog program is
a0 :- a1. 0.5::a1f.
a1:-a1f. 0.5::a0_2.
a0:-a2,a0_2.
a1:-a2,problog_not(a0_2). 0.5::a2f.
a2:-a2f.
0.333333333333::a0_3. 0.5::a1_3.
a0:-a3,a0_3.
a1:-a3,problog_not(a0_3),a1_3.
a2:-a3,problog_not(a0_3),problog_not(a1_3).
0.5::a3f.
a3:-a3f.</p>
      <p>In this dataset no predicate is tabled for both MCINTYRE and ProbLog. Figure
2(a) shows the time for computing the probability of a0 as a function of the
size. MCINTYRE is faster than ProbLog and PITA for non-small programs but
all of them are much slower and less scalable than cplint. The reason why
10−4
20
10−2
102
101
cplint performs so well is that the meta-interpreter checks for the consistency
of the sample when choosing a clause to resolve with the goal, rather than after
having resolved all the body literals as in MCINTYRE and ProbLog. However,
since the clauses are ground, the sampling predicates of MCINTYRE can be
put at the beginning of the body, simulating cplint behavior. Similarly, the
probabilistic atoms can be put at the beginning of the body of ProbLog clauses.
With this approach, we get the timings depicted in Figure 2(b) which shows that
now MCINTYRE and ProbLog are faster than cplint and MCINTYRE is the
fastest.</p>
      <p>
        The blood type dataset from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] determines the blood type of a person on
the basis of her chromosomes that in turn depend on those of her parents. The
blood type is given by clauses of the form
bloodtype(Person,a):0.90 ; bloodtype(Person,b):0.03 ;
bloodtype(Person,ab):0.03 ; bloodtype(Person,null):0.04
:pchrom(Person,a),mchrom(Person,a).
where pchrom/2 indicates the chromosome inherited from the father and mchrom
/2 that inherited from the mother. There is one such clause for every combination
of the values fa, b, nullg for the father and mother chromosomes. In turn, the
chromosomes of a person depend from those of her parents, with clauses of the
form
mchrom(Person,a):0.90 ; mchrom(Person,b):0.05 ;
mchrom(Person,null):0.05
:
      </p>
      <p>mother(Mother,Person), pchrom(Mother,a), mchrom(Mother,a).
There is one such clause for every combination of the values fa, b, nullg for
the father and mother chromosomes of the mother and similarly for the father
chromosome of a person. In this dataset we query the blood type of a person
on the basis of that of its ancestors. We consider families with an increasing
number of components: each program adds two persons to the previous one.
The chromosomes of the parentless ancestors are given by disjunctive facts of
the form
mchrom(p,a):0.3 ; mchrom(p,b):0.3 ; mchrom(p,null):0.4.
pchrom(p,a):0.3 ; pchrom(p,b):0.3 ; pchrom(p,null):0.4.
For both MCINTYRE and ProbLog all the predicates are tabled.</p>
      <p>Figure 3 shows the execution times as a function of the family size. Here
MCINTYRE is faster than ProbLog but slower than the exact inference of PITA.
This is probably due to the fact that the bodies of clauses with the same atoms
in the head are mutually exclusive in this dataset and the goals in the bodies
are independent, making BDD operations particularly fast.</p>
      <p>
        In the growing body dataset [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] the clauses have bodies of increasing size.
For example, the program for size 4 is,
a0:0.5 :- a1.
102
)
s
(100
e
m
i
T10−2
10−40
50
      </p>
      <p>In this dataset as well no predicate is tabled for both MCINTYRE and ProbLog
and the sampling predicates of MCINTYRE and the probabilistic atoms of
ProbLog have been put at the beginning of the body since the clauses are ground.
Figure 4(a) shows the execution time for computing the probability of a0. Here
PITA is faster and more scalable than Monte Carlo algorithms, again probably
due to the fact that the bodies of clauses with the same heads are mutually
exclusive thus simplifying BDD operations. Figure 4(b) shows the execution time
of the Monte Carlo algorithms only, where it appears that MCINTYRE is faster
than ProbLog and cplint.
106</p>
      <p>
        The UWCSE dataset [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] describes a university domain with predicates such as
taught_by/2, advised_by/2, course_level/2, phase/2, position/2, course/1,
professor/1, student/1 and others. Programs of increasing size are considered
by adding facts for the student/1 predicate, i.e., by considering an increasing
number of students. For both MCINTYRE and ProbLog all the predicates are
tabled. The time for computing the probability of the query taught_by(c1,p1)
as a function of the number of students is shown in Figure 5(a). Here
MCINTYRE is faster than ProbLog and both scale much better than PITA.
      </p>
      <p>The algorithms are used to compute the probability of hmm(O) for random
sequences O of increasing length. Tabling was not used for MCINTYRE nor for
ProbLog.</p>
      <p>Figure 5(b) show the time taken by the various algorithms as a function of
the sequence length. Since the probability of such a sequence goes rapidly to
zero, the derivations of the goal terminate mostly after a few steps only and
all Monte Carlo algorithms take constant time with MCINTYRE faster that
ProbLog and cplint.
6</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>Probabilistic Logic Programming is of high interest for its many application
elds. The distribution semantics is one of the most popular approaches to PLP
and underlies many languages, such as LPADs and ProbLog. However, exact
inference is very expensive, being #P complete and thus approximate approaches
have to be investigated. In this paper we propose the algorithm MCINTYRE that
performs approximate inference by means of a Monte Carlo technique, namely
random sampling. MCINTYRE transforms an input LPAD into a normal
program that contains a clause for each head of an LPAD clause. The resulting
clauses contain in the body auxiliary predicates that perform sampling and check
for the consistency of the sample.</p>
      <p>
        MCINTYRE has been tested on graphs of biological concepts, on four arti
cial datasets from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and on a hidden Markov model. In all cases it turned out
to be faster than the Monte Carlo algorithms of ProbLog and cplint. It is also
faster and more scalable than exact inference except in two datasets, blood type
and growing body, that however possess peculiar characteristics. MCINTYRE
is available in the cplint package of the source tree of Yap and instructions on
its use are available at http://www.ing.unife.it/software/cplint/.
      </p>
      <p>In the future we plan to investigate other approximate inference techniques
such as lifted belief propagation and variational methods.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bragaglia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riguzzi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Approximate inference for logic programs with annotated disjunctions</article-title>
          .
          <source>In: International Conference on Inductive Logic Programming. LNAI</source>
          , vol.
          <volume>6489</volume>
          , pp.
          <volume>30</volume>
          {
          <fpage>37</fpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Christiansen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gallagher</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>Non-discriminating arguments and their uses</article-title>
          .
          <source>In: International Conference on Logic Programming. LNCS</source>
          , vol.
          <volume>5649</volume>
          , pp.
          <volume>55</volume>
          {
          <fpage>69</fpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dantsin</surname>
          </string-name>
          , E.:
          <article-title>Probabilistic logic programs and their semantics</article-title>
          .
          <source>In: Russian Conference on Logic Programming. LNCS</source>
          , vol.
          <volume>592</volume>
          , pp.
          <volume>152</volume>
          {
          <fpage>164</fpage>
          . Springer (
          <year>1991</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demoen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fierens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutmann</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janssens</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kimmig</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Landwehr</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mantadelis</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meert</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rocha</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Santos</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Thon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Vennekens</surname>
          </string-name>
          , J.:
          <article-title>Towards digesting the alphabet-soup of statistical relational learning</article-title>
          . In: Roy,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Winn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>McAllester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Mansinghka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name>
            <surname>J</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 1st Workshop on Probabilistic Programming: Universal Languages, Systems and Applications</source>
          , in NIPS (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frasconi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kersting</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muggleton</surname>
          </string-name>
          , S. (eds.):
          <article-title>Probabilistic Inductive Logic Programming - Theory and Applications</article-title>
          , LNCS, vol.
          <volume>4911</volume>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kimmig</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toivonen</surname>
          </string-name>
          , H.:
          <article-title>ProbLog: A probabilistic prolog and its application in link discovery</article-title>
          .
          <source>In: International Joint Conference on Arti cial Intelligence</source>
          . pp.
          <volume>2462</volume>
          {
          <fpage>2467</fpage>
          . AAAI Press (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Getoor</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taskar</surname>
            ,
            <given-names>B</given-names>
          </string-name>
          . (eds.):
          <article-title>Introduction to Statistical Relational Learning</article-title>
          . MIT Press (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kimmig</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demoen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>V.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rocha</surname>
          </string-name>
          , R.:
          <article-title>On the implementation of the probabilistic logic programming language ProbLog</article-title>
          .
          <source>Theory and Practice of Logic Programming</source>
          <volume>11</volume>
          (
          <issue>2-3</issue>
          ),
          <volume>235</volume>
          {
          <fpage>262</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Meert</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Struyf</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blockeel</surname>
          </string-name>
          , H.:
          <article-title>CP-Logic theory inference with contextual variable elimination and comparison to BDD based inference methods</article-title>
          .
          <source>In: International Conference on Inductive Logic Programming. LNCS</source>
          , vol.
          <volume>5989</volume>
          , pp.
          <volume>96</volume>
          {
          <fpage>109</fpage>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Poole</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Logic programming, abduction and probability - a top-down anytime algorithm for estimating prior and posterior probabilities</article-title>
          .
          <source>New Generation Computing</source>
          <volume>11</volume>
          (
          <issue>3-4</issue>
          ),
          <volume>377</volume>
          {
          <fpage>400</fpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Poole</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The Independent Choice Logic for modelling multiple agents under uncertainty</article-title>
          .
          <source>Arti cial Intelligence</source>
          <volume>94</volume>
          (
          <issue>1-2</issue>
          ),
          <volume>7</volume>
          {
          <fpage>56</fpage>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Riguzzi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>A top-down interpreter for LPAD and CP-Logic</article-title>
          . In:
          <article-title>Congress of the Italian Association for Arti cial Intelligence</article-title>
          .
          <source>LNCS</source>
          , vol.
          <volume>4733</volume>
          , pp.
          <volume>109</volume>
          {
          <fpage>120</fpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Riguzzi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Extended semantics and inference for the Independent Choice Logic</article-title>
          .
          <source>Logic Journal of the IGPL</source>
          <volume>17</volume>
          (
          <issue>6</issue>
          ),
          <volume>589</volume>
          {
          <fpage>629</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Riguzzi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swift</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Tabling and Answer Subsumption for Reasoning on Logic Programs with Annotated Disjunctions</article-title>
          . In: International Conference on Logic Programming.
          <source>LIPIcs</source>
          , vol.
          <volume>7</volume>
          , pp.
          <volume>162</volume>
          {
          <fpage>171</fpage>
          .
          <string-name>
            <given-names>Schloss</given-names>
            <surname>Dagstuhl-Leibniz-Zentrum fuer Informatik</surname>
          </string-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Riguzzi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swift</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>An extended semantics for logic programs with annotated disjunctions and its e cient implementation</article-title>
          .
          <source>In: Italian Conference on Computational Logic. No. 598 in CEUR Workshop Proceedings, Sun SITE Central Europe</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Riguzzi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swift</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>The PITA system: Tabling and answer subsumption for reasoning under uncertainty</article-title>
          .
          <source>Theory and Practice of Logic Programming</source>
          ,
          <source>International Conference on Logic Programming Special Issue</source>
          <volume>11</volume>
          (
          <issue>4-5</issue>
          ) (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Sato</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A statistical learning method for logic programs with distribution semantics</article-title>
          .
          <source>In: International Conference on Logic Programming</source>
          . pp.
          <volume>715</volume>
          {
          <fpage>729</fpage>
          . MIT Press (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Sevon</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eronen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hintsanen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kulovesi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toivonen</surname>
          </string-name>
          , H.:
          <article-title>Link discovery in graphs derived from biological databases</article-title>
          .
          <source>In: International Workshop on Data Integration in the Life Sciences. LNCS</source>
          , vol.
          <volume>4075</volume>
          , pp.
          <volume>35</volume>
          {
          <fpage>49</fpage>
          . Springer (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Valiant</surname>
            ,
            <given-names>L.G.</given-names>
          </string-name>
          :
          <article-title>The complexity of enumeration and reliability problems</article-title>
          .
          <source>SIAM Journal on Computing</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ),
          <volume>410</volume>
          {
          <fpage>421</fpage>
          (
          <year>1979</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Vennekens</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verbaeten</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Logic programs with annotated disjunctions</article-title>
          .
          <source>Tech. Rep. CW386</source>
          , Department of Computer Science, Katholieke Universiteit Leuven,
          <string-name>
            <surname>Belgium</surname>
          </string-name>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Vennekens</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verbaeten</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bruynooghe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Logic programs with annotated disjunctions</article-title>
          .
          <source>In: International Conference on Logic Programming. LNCS</source>
          , vol.
          <volume>3131</volume>
          , pp.
          <volume>195</volume>
          {
          <fpage>209</fpage>
          . Springer (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>