=Paper=
{{Paper
|id=Vol-1413/paper-04
|storemode=property
|title=Constraint-Based Inference in Probabilistic Logic Programs
|pdfUrl=https://ceur-ws.org/Vol-1413/paper-04.pdf
|volume=Vol-1413
|dblpUrl=https://dblp.org/rec/conf/iclp/NampallyR15
}}
==Constraint-Based Inference in Probabilistic Logic Programs==
Constraint-Based Inference in
Probabilistic Logic Programs?
(Extended Abstract)
Arun Nampally, C. R. Ramakrishnan
Department of Computer Science, Stony Brook University, Stony Brook, NY 11794
{anampally,cram}@cs.stonybrook.edu
1 Introduction
A wide variety of models that combine logical and statistical knowledge can be
expressed succinctly in the Probabilistic Logic Programming (PLP) paradigm.
Specifically, models in standard statistical formalisms such as probabilistic graph-
ical models (PGMs) (e.g. Bayesian Networks), can be easily encoded as PLP
programs. For instance, Fig. 1(a) shows a program in PRISM, a pioneering PLP
language [16]. A widget, represented by a random variable X, is tested by two
different processes b1 and b2 . The outcomes of these tests are represented by
random variables Y and Z, respectively. In PRISM, a special predicate of the
form msw(a,X) associates random variable X with a random process a. Con-
sider the problem of determining the distribution of X given that Y and Z are
identical. Note that evidence is defined as a constraint over instantiations of the
random variables, in contrast to a specific instantiation as in traditional PGMs.
However, such evidence can be easily specified in PLP (see predicate e/0). The
probability of a specific instantiation of X can also be computed based on a PLP
query (e.g. q(1) using predicate q/1). This simple example illustrates how logi-
cal clauses can be used to specify evidence and queries in PLP that go beyond
what is possible in traditional PGMs.
The Driving Problem. The expressiveness of PLP comes at a cost. Since PLP
is an extension to traditional logic programming, inference in PLP is undecid-
able in general. Probabilistic inference for a large class of statistical models (e.g.
Bayesian networks) is intractable. Even problems for which inference is tractable
can be encoded in multiple ways in PLP, with different inference complexity. For
instance, consider the PRISM program in Fig. 1(b). In that program, genlist/2
defines a list of the outcomes of N identically distributed random variables rang-
ing over {a,b}. Predicate palindrome/1 tests, using a definite clause grammar
definition, if a given list is a palindrome; and count as/2 tests if a given list
contains k (not necessarily consecutive) “a”s. Using these predicates, consider
the inference of the conditional probability of query(n, k) given evidence(n):
i.e., the probability that an n-element palindrome has k “a”s.
?
This research was supported in part by NSF grants CCF-1018459 and IIS-1447549.
46
Constraint-Based Inference in Probabilistic Logic Programs
1 % generate a list of N random variables.
2 genlist(N, L) :- (N=0 -> L= []
1 % Model: Y and Z depend on X. 3 ; msw(flip, N, X),
2 w(X,Y,Z) :- 4 L = [X|L1], N1 is N-1,
3 msw(a, X), 5 genlist(N1, L1) ).
4 msw(b1(X), Y), 6 % Evidence: string is a palindrome.
5 msw(b2(X), Z). 7 evidence(N) :- genlist(N, L), palindrome(L).
6 % Evidence: Y and Z are same. 8 % Query: string has K ’a’s
7 e :- w(_,S,S). 9 query(N, K) :- genlist(N, L), count_as(L, K).
8 % Query: value of X. 10 % Check if a given list is a palindrome
9 q(X) :- w(X,_,_). 11 palindrome(L) :- phrase(palindrome, L).
10 % Domains: 12 palindrome --> [].
11 values(a, [1,2]). 13 palindrome --> [_X].
12 values(b1(_), [1,2,3]). 14 palindrome --> [X], palindrome, [X].
13 values(b2(_), [2,3,4]). 15 % Query condition:
14 % Distribution parameters: 16 count_as([], 0).
15 set_sw(a, [0.4,0,6]). 17 count_as([X|Xs], K) :-
16 set_sw(b1(1), [0.1,0.3,0.6]). 18 K > 0, (X=a -> L is K-1; L=K),
17 set_sw(b1(2), [0.2,0.4,0.4]). 19 count_as(Xs, L).
18 set_sw(b2(1), [0.5,0.3,0.2]). 20 % Domains:
19 set_sw(b2(2), [0.6,0.1,0.3]). 21 values(flip, [a,b]).
22 % Distribution parameters:
23 set_sw(flip, [0.5, 0.5]).
(a) Bayesian Network PLP (b) Palindrome PLP
Fig. 1: Examples of PLPs
The conditional probability is well-defined according to PRISM’s distribution
semantics [17]. However, the PRISM itself will be unable to correctly compute the
conditional query’s probability, since the conditional query, as encoded above,
will violate the PRISM system’s assumptions of independence among random
variables used in an explanation. It should be noted that the above conditional
probability may be efficiently inferred by transforming the “generate-and-test”
program to one where the tests are folded into the generation phase. However,
such transformations are dependent on the encoding of the query and evidence
predicates, and are hard to generalize. Moreover, while the probability of goal
evidence(N) can be computed in linear time (by exploiting sharing in expla-
nation graphs), the size of the explanation graph for goal query(N) may be
exponential in N when the subgoals in the explanations are placed in the order
in which they are encountered.
Approximate inference based on rejection sampling performs poorly, reject-
ing a vast number of generated samples, since the likelihood of a string being
a palindrome decreases exponentially in N . Alternatives such as Metropolis-
Hastings-based Markov Chain Monte Carlo (MCMC) techniques [9, e.g.] do
not behave much better due to the fact that the chains exhibit poor conver-
gence (mixing), since most transitions lead to strings inconsistent with evidence.
Gibbs-sampling-based MCMC [8] cannot be readily applied since the dependen-
cies between random variables are hidden in the program and not explicit in the
model.
47
Constraint-Based Inference in Probabilistic Logic Programs
msw(a,X) msw(a,X)
msw(b1(X), Y) msw(b1(X), Y)
Y ∈ {2, 3}
msw(b2(X), Z)
msw(b2(X), Z)
Y=Z Z ∈ {2, 3}, Z = Y
2 2
(a) Before Constraint Propagation (b) After Constraint Propagation
Fig. 2: Symbolic Derivation for evidence “e” in BN Example
Our Approach. We identify two basic problems that contribute to the diffi-
culty of inference in PLPs. First is that the random variable dependencies are not
explicit in the program but may vary based on the program’s control and data
flow. The second is that evidence (and query) specifications may be complex ren-
dering it difficult to predict whether a variable’s valuation will be consistent with
the evidence (or lead to query’s success). We use constraint propagation both
to uncover the hidden dependencies and to predict consistency with evidence.
We explicitly construct symbolic derivations that abstract actual valuations of
random variables and use a graphical structure to represent the derivations. We
then provide inference algorithms, both approximate and exact, that compute
the probability of a (possibly conditional) query based on this graphical struc-
ture.
Summary of Contributions. This paper describes a novel technique that
addresses the problem of scalability of inference in PLPs.
1. The paper introduces a structure called an Ordered Symbolic Derivation
Diagram to represent succinctly the set of possible derivations for a PLP
query or evidence (Section 2).
2. The paper presents a likelihood-weighted sampling method based on OSDDs
that can be used for approximate inference (Section 3).
3. The paper also presents an exact inference algorithm that operates directly
on OSDDs. While this algorithm has relatively narrow applicability, it pro-
vides a powerful way to infer over large problem sizes without enumerating
random variable valuations (Section 3).
We present experimental results which show the effectiveness of OSDD-based
inference methods, as well as their cost (Section 4). Related work is discussed in
detail in Section 5. The paper concludes with a discussion on the other uses of
OSDDs for inference in PLPs.
2 Symbolic Derivations and Diagrams
In this paper, we use PRISM’s syntax and distribution semantics, but without
the independence and mutual exclusion requirements on the explanations of a
48
Constraint-Based Inference in Probabilistic Logic Programs
goal. Thus we consider PRISM programs with their intended model-theoretic
semantics, rather than that computed by the PRISM system.
As stated in the Introduction, the dependencies between random variables
implicit in the control and data flow of a PLP, and the impossibility of completely
representing the set of all random variable valuations that are consistent with
evidence contribute to the difficulty of inference in PLPs. To address these two
problems, we devise an inference technique based on constraint propagation,
to uncover hidden dependencies and predict evidence consistency. In the first
step, we build derivations symbolically, instantiating random variables only when
necessary. A symbolic derivation is a sequence of msw goals and constraints,
as illustrated in Fig. 2(a) by the derivation for evidence “e” from example of
Fig. 1(a).
In the second step, we propagate the constraints in a symbolic derivation,
resulting in possible restrictions on the domains of variables. For instance, in
the Bayesian Network (BN) example of Fig. 1(a), since the evidence demands
Y = Z, the domains of b1 and b2 get restricted to [2,3]. Such constraint
propagation is done using light-weight techniques such as node-consistency and
arc-consistency algorithms. We add inferred domain restrictions (if any) to the
derivation. We also place the constraints where their satisfaction can be effec-
tively tested. Fig. 2(b) shows the symbolic derivation for “e” in the BN example
after constraint propagation. The constraints denote a sufficient condition for
any concrete instance of the symbolic derivation to represent a successful deriva-
tion. Symbolic derivations, parameterized by the consistency algorithms used in
their construction, can be readily formalized; see [14]. Symbolic derivations can
be subsequently used in a number of ways, two of which are described below.
Generalization. For many standard statistical models (e.g. PGMs) our tech-
nique will construct at most one symbolic derivation. In general, however, PLPs
may have more than one symbolic derivation, as illustrated by the Birthday
Collision example in Fig. 5. This example encodes the problem of determining
the (unconditional) probability that two persons in a population of a given size
share the same birthday. The query same birthday(3), which fixes a population
of size 3, has 6 symbolic derivations, 3 of which are shown in Fig. 3(a). In such
cases, we combine the set of symbolic derivations into a tree structure, called
the Ordered Symbolic Derivation Diagram (OSDD), illustrated in Fig. 3(b). An
OSDD is analogous to a Constraint Decision Diagram (CDD) [3]: each node
defines a variable, and the outgoing edges are guarded by constraints on that
variable. An OSDD is constructed based on a total order over variables, as in
an Ordered Binary Decision Diagram [2]. Each path in an OSDD is a symbolic
derivation. In fact, every symbolic derivation is a rudimentary OSDD (with 0-
branches removed).
3 Inference Based on Symbolic Derivation Diagrams
We illustrate the process of generating likelihood-weighted samples [7,18] for goal
“e” from its symbolic derivation. We start with likelihood weight of 1. When
49
Constraint-Based Inference in Probabilistic Logic Programs
msw(b(1),X1)
msw(b(1),X1) msw(b(2),X2) msw(b(3),X3) msw(b(2), X2)
X2 = X1 X26=X1
msw(b(2), X2) msw(b(3), X3) msw(b(1), X1)
X2=X1 X3=X2 X1=X3 1 msw(b(3), X3)
X3 = X1 X3 6= X1 ∧ X3 6= X1 ∧
2 2 2 X3 = X2 X36= X2
1 1 0
(a) Selected Symbolic Derivations (b) Symbolic Derivation Diagram
Fig. 3: Symbolic Derivations for query “same birthday(3)” in Birthday Colli-
sion Example
visiting msw(a,X), we notice no domain restrictions on X, and hence bind X to a
random sample generated from a’s distribution. Assume the sample we drew is
X=1. We then visit msw(b1(1), Y) to sample Y. However, since Y has a domain
restriction Y ∈ {2, 3}, we generate a sample for Y such that Y ∈ {2, 3}. This
is done by picking from {2, 3} uniformly, and multiplying the likelihood weight
with the probability of the picked value. Assume we pick Y=2; then the current
likelihood is set to 0.3, the probability of 2 in b1(1)’s distribution. Finally,
we visit msw(b2(1),Z), whose two constraints restrict Z to {2}. Selecting Z=2,
we multiply the likelihood weight of the current derivation with 0.5. Thus we
generate a sample (X=1, Y=2, Z=2) consistent with evidence e with a likelihood
weight of 0.15.
In summary, likelihood-weighted samples are drawn by (a) independently
sampling random variables whose valuations are unconstrained; (b) uniformly
sampling variables whose valuations have domain constraints; and (c) computing
the probability of the sample as the product of probabilities of values picked in
step (b). This procedure can be readily formalized; see [14].
Exact Inference. For certain class of programs and queries, symbolic deriva-
tions can be directly used for exact inference. Fig. 4 shows the symbolic deriva-
tion of evidence evidence(6) from the Palindrome example (Fig. 1(b)). Note
that only the constraints in the symbolic derivation determine whether a con-
crete instance succeeds. Thus, if the distribution of flip is uniform, the three
constraints are each satisfied independently, resulting in 0.125 as the probability.
Such exact computation of probabilities is formalized in terms of measurability;
a variable X with domain constraint η is said to be measurable if the size of X’s
domain (consistent with η) is independent of the valuation of other variables.
When a symbolic derivation diagram consists only of uniformly distributed mea-
surable variables, then the associated probability can be computed exactly. Such
exact computation is readily formalized as well; see [14].
50
Constraint-Based Inference in Probabilistic Logic Programs
msw(flip, 6, X1) % Two from a population of size N
1
% share a birthday.
2
msw(flip, 5, X2) 3 same_birthday(N) :-
4 person(N, P1),
msw(flip, 4, X3)
5 % P1’s birthday is D
6 msw(b(P1), D),
msw(flip, 3, X4)
7 person(N, P2),
X4 = X3
8 P1 \= P2,
msw(flip, 2, X5)
9 % and so is P2’s.
X5 = X2
10 msw(b(P2), D).
11
msw(flip, 1, X6) 12 person(N, P) :-
X6 = X1 13 % bind P, backtracking through 1..N
14 basics:for(P, 1, N).
2 15
16 % Distribution parameters:
Fig. 4: Symbolic Derivation for ev- 17 set_sw(b(_), uniform(1,365)).
idence “evidence(6)” in Palin-
drome Example Fig. 5: Birthday Collision PLP
4 Experimental Evaluation
We present the results of experiments using a prototype implementation of a
likelihood-weighted sampler based on symbolic derivations. The prototype uses
XSB Prolog to build symbolic derivations, propagate constraints and construct
OSDDs; and a few modules written in C for maintaining the sampler’s state and
dealing with random variable distributions. We used the following examples in
the experiments.
– Grid BN is a Bayesian Network with Boolean random variables arranged
in a 6 × 6 grid (with dependencies going left-to-right and top-to down). This
simple structure was used to evaluate the effectiveness of our technique when
the evidence probability is extremely low (˜10−12 ).
– Ising Model is a well-known undirected graphical model. We used a 6 × 6
grid of Boolean random variables with factors on edges. The PRISM program
independently generates values of terminal nodes of all edges, and ties them
together by expressing equality constraints between shared variables of edges.
– Palindrome, which is shown in Fig. 1(b), with evidence limited to strings
of length 20, and query checking for a string with 4 “a”s.
– Birthday Collision, shown in Fig. 5 (page 6), with population size of 6,
i.e. query same birthday(6).
The first three examples involved conditional queries with low-likelihood evi-
dence. The birthday collision example had an unconditional query. It should
be noted that only the first example, Grid BN, can be evaluated in the PRISM
system; the other examples have queries that violate PRISM’s mutual exclusion
and independence assumptions and hence cannot be directly evaluated in that
51
Constraint-Based Inference in Probabilistic Logic Programs
6x6 grid BN 6x6 ising model
0.011 0.001 0.51 0.01
exact probability lw-probability
lw-probability lw-variance
0.0105 lw-variance
0.505
computed probability
computed probability
0.01
0.0001
0.0095
0.5 0.001
variance
variance
0.009
0.0085 1e-05 0.495
0.008
0.49 0.0001
0.0075
1e-06
0.007
0.485
0.0065
0.006 1e-07 0.48 1e-05
0 100000 200000 300000 400000 500000 600000 700000 800000 900000 1e+06 0 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000
samples samples
(a) BN (b) Ising Model
palindrome example birthday collision
0.06 0.01 0.044 0.001
exact answer exact answer
lw-probability lw-probability
is-probability 0.043 is-probability
lw-variance lw-variance
0.05 is-variance is-variance
computed probability
computed probability
0.001 0.042
0.0001
0.041
0.04
variance
variance
0.0001 0.04
0.03 0.039 1e-05
1e-05 0.038
0.02
0.037
1e-06
1e-06 0.036
0.01
0.035
0 1e-07 0.034 1e-07
0 100000 200000 300000 400000 500000 600000 700000 800000 900000 1e+06 0 100000 200000 300000 400000 500000 600000 700000 800000 900000 1e+06
samples samples
(c) Palindrome (d) Birthday
Fig. 6: Experimental Results
system. Our inference procedure, however, removes PRISM’s assumptions and
correctly evaluates the query probabilities for all the above examples.
The results of the experiments are shown in Fig. 6. Each subfigure plots the
estimated probability and variance of the estimate(on log scale), for two sam-
plers: the LW method described in this paper, and a simple independent sampler
(with rejection sampling for conditional queries). Note that the LW sampler’s re-
sults show significantly lower variance in all the examples. For the Grid BN and
Ising Model, the evidence probability was low enough that a rejection sampler
was unable to draw a consistent sample. The LW sampler, however, was able
to converge to a reasonable estimate of low variance in about 500,000 samples.
Both examples generated a single symbolic derivation. We directly sampled from
this instead of materializing a OSDD structure. For the Grid BN, node consis-
tency was sufficient to derive domain restrictions. For the Ising model, we found
that standard LW sampling (picking a restricted value uniformly and assigning
a likelihood weight) generated a number of samples with extremely low weights.
Instead the probabilites of the set of allowed values were normalized to create
a new proposal distribution. This resulted in generating samples with higher
likelihood weights.
52
Constraint-Based Inference in Probabilistic Logic Programs
For the Palindrome example, we get a single symbolic derivation and the
LW sampler quickly converges to the actual probability, while the independent
sampler fails to converge even after a million samples. However, node- and arc-
consistency can discover no further domain restrictions; forward checking at
sampling time generated all the restrictions for LW sampler. The unusual pat-
tern of variance for independent sampler in the initial iterations is due to it
not being able to generate consistent samples and hence not having an estimate
for the answer probability. The birthday collision example produces a number
of symbolic derivations which were incorporated in an explicit OSDD. Domain
restrictions are discovered only via forward checking, and that too for only one
variable. The results show smaller difference between independent sampling and
LW sampling for this example, compared to the others. One interesting ob-
servation from this example was that independent sampling using the OSDD
structure was significantly faster (up to 2×) than using the program directly.
This is because the program’s non-deterministic evaluation has been replaced
by a deterministic traversal through the OSDD.
Overheads. For all the examples, the time to construct the symbolic deriva-
tions, and propagate constraints was negligible (ranging from 4ms for Grid BN
to 7ms for Birthday Collision, with XSB 3.5.0 on a 2.5GHz Intel Core 2 Duo
machine). The overheads for sampling however were more pronounced. While an
independent sampler picks values from the given distributions, the likelihood-
weighting sampler needs to construct restricted domains to draw samples from.
Consequently, our LW sampler takes up to 4× per sample as an independent
sampler.
Comparison with PITA and ProbLog. We evaluated the exact inference proce-
dures of PITA and ProbLog on the same examples. We used a timeout of 15
minutes for both systems. The exact inference algorithm of ProbLog using sen-
tential decision diagrams was able to handle Grid BN instances of size up to
9 × 9. However, ProbLog’s inference does not scale beyond small problem sizes
for the remaining three examples. In contrast, exact inference algorithm of PITA
scaled much better. PITA could successfully compute the conditional probabili-
ties for Grid BN (up to size 10 × 10), Ising model (up to 13 × 13) and Palindrome
with n = 18. For the Ising model example, PITA’s inference completes but with
numerical errors due to the low probability of evidence. Finally, PITA’s infer-
ence completed for the Birthday example with population size 2, but ran out of
memory for larger population sizes.
5 Related Work
Probabilistic Constraint Logic Programming [11] extends PLP with constraint
logic programming (CLP). It allows the specification of models with imprecise
probabilities. Whereas a world in PLP denotes a specific assignment of values to
random variables, a world in PCLP can define constraints on random variables,
53
Constraint-Based Inference in Probabilistic Logic Programs
rather than specific values. Lower and upper bounds are given on the probabil-
ity of a query by summing the probabilities of worlds where query follows and
worlds where query is possibly true respectively. While the way in which “proof
constraints” of a PCLP query are obtained is similar to the way in which sym-
bolic derivations are obtained (i.e., through constraint based evaluation), the
inference techniques employed are completely different with PCLP employing
satisfiability modulo theory (SMT) solvers.
cProbLog extends ProbLog with first-order constraints [6]. This gives the
ability to express complex evidence in a succinct form. The semantics and in-
ference are based on ProbLog. In contrast, our work makes the underlying con-
straints in a query explicit and uses the OSDDs to drive inference.
CLP(BN ) [4] extends logic programming with constraints which encode con-
ditional probability tables. A CLP(BN ) program defines a joint distribution on
the ground skolem terms. Operationally, queries are answered by constructing
the relevant BN and performing BN inference.
There has been a significant interest in the area of lifted inference as ex-
emplified by the work of [15,1,12]. The main idea of lifted inference is to treat
indistinguishable instances random variables as one unit and perform inference
at the population level. In contrast, exact inference using OSDDs treats indistin-
guishable values of random variables as one unit, thereby computing probabilities
without grounding the random variables. Consequently, the method in this pa-
per is orthogonal to traditional lifted inference and can be used when inversion
and counting elimination are inapplicable (e.g. Birthday Collision example in
Fig. 5).
The use of sampling methods for inference in PLPs has been widespread.
The evidence has generally been handled by heuristics to reduce the number
of rejected samples [5,13]. However we provide a systematic approach to deal
with constraints imposed by evidence. When our constraint processing algorithm
is powerful enough, the sampler can generate consistent samples without any
rejections.
Adaptive sequential rejection sampling [10] is an algorithm that adapts its
proposal distributions to avoid generating samples which are likely to be rejected.
However, it requires a decomposition of the target distribution, which may not
be available in PLPs. Further, in our work the distribution from which samples
are generated is not adapted. It is an interesting direction of research to combine
adaptivity with the proposed sampling algorithm.
6 Discussion
We presented a technique for inference in PLPs based on constructing a symbolic
structure called OSDD using constraint propagation. The technique effectively
performs inference without enumeration for a number of programs. The tech-
nique also uncovers the dependencies between random variables, which can then
be exploited by more powerful inference techniques (e.g. Gibbs-sampling-based
MCMC) that were inapplicable otherwise. However, for programs where sym-
54
Constraint-Based Inference in Probabilistic Logic Programs
bolic derivations match one-to-one with concrete derivations, the technique offers
no benefit. An important topic of future work is to statically analyze a program
to determine when (and when not to) use this technique. OSSDs are constructed
by exploiting the presence of explicit random variables due to msw’s in PRISM.
Application to other (equally expressive) PLP languages remains to be explored.
Finally, OSDDs introduce a style of inference where indistinguishable valuations
of random variables are treated together; combining this with lifted inference
that groups indistinguishable random variables together will improve the scala-
bility of inference in PLPs.
References
1. Rodrigo De Salvo Braz, Eyal Amir, and Dan Roth. Lifted first-order probabilis-
tic inference. In Proceedings of the Nineteenth International Joint Conference on
Artificial Intelligence, pages 1319–1325, 2005.
2. Randal E Bryant. Symbolic boolean manipulation with ordered binary-decision
diagrams. ACM Computing Surveys (CSUR), 24(3):293–318, 1992.
3. Kenil CK Cheng and Roland HC Yap. Constrained decision diagrams. In Pro-
ceedings of the National Conference on Artificial Intelligence, volume 20, page 366.
Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999, 2005.
4. Vı́tor Santos Costa, David Page, Maleeha Qazi, and James Cussens. CLP (BN):
Constraint logic programming for probabilistic knowledge. In Proceedings of the
Nineteenth conference on Uncertainty in Artificial Intelligence, pages 517–524.
Morgan Kaufmann Publishers Inc., 2002.
5. James Cussens. Stochastic logic programs: Sampling, inference and applications.
In Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence,
pages 115–122. Morgan Kaufmann Publishers Inc., 2000.
6. Daan Fierens, Guy Van den Broeck, Maurice Bruynooghe, and Luc De Raedt.
Constraints for probabilistic logic programming. In Proceedings of the NIPS Prob-
abilistic Programming Workshop, pages 1–4, 2012.
7. Robert M. Fung and Kuo-Chu Chang. Weighing and integrating evidence for
stochastic simulation in Bayesian Networks. In Proceedings of the Fifth Annual
Conference on Uncertainty in Artificial Intelligence, UAI ’89, pages 209–220, Am-
sterdam, The Netherlands, 1990. North-Holland Publishing Co.
8. Stuart Geman and Donald Geman. Stochastic relaxation, Gibbs distributions, and
the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and
Machine Intelligence, (6):721–741, 1984.
9. W Keith Hastings. Monte Carlo sampling methods using Markov chains and their
applications. Biometrika, 57(1):97–109, 1970.
10. Vikash K Mansinghka, Daniel M Roy, Eric Jonas, and Joshua B Tenenbaum. Exact
and approximate sampling by systematic stochastic search. In Proceedings of the
Twelfth International Conference on Artificial Intelligence and Statistics, pages
400–407, 2009.
11. Steffen Michels, Arjen Hommersom, Peter JF Lucas, Marina Velikova, and Pieter
Koopman. Inference for a new probabilistic constraint logic. In Proceedings of
the Twenty-Third international joint conference on Artificial Intelligence, pages
2540–2546. AAAI Press, 2013.
55
Constraint-Based Inference in Probabilistic Logic Programs
12. Brian Milch, Luke S Zettlemoyer, Kristian Kersting, Michael Haimes, and
Leslie Pack Kaelbling. Lifted probabilistic inference with counting formulas. In
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, pages
1062–1068, 2008.
13. Bogdan Moldovan, Ingo Thon, Jesse Davis, and Luc De Raedt. MCMC estima-
tion of conditional probabilities in probabilistic programming languages. In Sym-
bolic and Quantitative Approaches to Reasoning with Uncertainty, pages 436–448.
Springer, 2013.
14. Arun Nampally and C. R. Ramakrishnan. Constraint-based inference in
probabilistic logic programs. Technical report, Computer Science Depart-
ment, Stony Brook University, http://www.cs.stonybrook.edu/~cram/Papers/
NR_LW15/lw15.pdf, 2015.
15. David Poole. First-order probabilistic inference. In Proceedings of the Eighteenth
International Joint Conference on Artificial Intelligence, volume 3, pages 985–991,
2003.
16. Taisuke Sato and Yoshitaka Kameya. PRISM: a language for symbolic-statistical
modeling. In Proceedings of the Fifteenth International Joint Conference on Arti-
ficial Intelligence, volume 97, pages 1330–1339, 1997.
17. Taisuke Sato and Yoshitaka Kameya. Parameter learning of logic programs for
symbolic-statistical modeling. Journal of Artificial Intelligence Research, pages
391–454, 2001.
18. Ross D. Shachter and Mark A. Peot. Simulation approaches to general probabilistic
inference on belief networks. In Proceedings of the Fifth Annual Conference on
Uncertainty in Artificial Intelligence, UAI ’89, pages 221–234, Amsterdam, The
Netherlands, 1990. North-Holland Publishing Co.
56