=Paper=
{{Paper
|id=Vol-2970/plppaper3
|storemode=property
|title=Reasoning about Independence in Open Universe Probabilistic Logic Programs
|pdfUrl=https://ceur-ws.org/Vol-2970/plppaper3.pdf
|volume=Vol-2970
|authors=Kilian Rueckschloss,Felix Weitkämper
|dblpUrl=https://dblp.org/rec/conf/iclp/RueckschlossW21
}}
==Reasoning about Independence in Open Universe Probabilistic Logic Programs==
Reasoning about Independence in Open Universe Probabilistic Logic Programs Kilian Rückschloß, Felix Weitkämper Lehr- und Forschungseinheit für Programmier- und Modellierungssprachen Institut für Informatik Ludwig-Maximilians-Universität München Oettingenstraße 67 D-80538 München Abstract In the 1980s J. Pearl and T. Verma [1] developed a graphical criterion to reason about probabilistic independence in Bayesian networks, which is famously known as d-separation. As J. Vennekens and S. Verbaeten [2, §5.2] pointed out in 2003, ground acyclic probabilistic logic programs correspond to Bayesian networks. Hence, we can pass from an acyclic ground program to the associated Bayesian network in order to derive independence statements from d-separation. In the present paper we lift this procedure so as to reason about independence in open universe probabilistic logic programs. 1. Introduction The aim of this paper is to establish an analogue of the graphical independence analysis in Bayesian networks (BN) for open universe probabilistic logic programs (PLPs). A BN stores a probability distribution in the form of a directed acyclic graph (DAG) on the involved random variables together with a table of conditional probabilities for every node in this graph. J. Pearl and T. Verma [1] developed the following criterion to derive conditional independence statements considering only the DAG underlying a BN: Definition 1 (d-Separation). Let 𝐺 be a DAG and let Z be a set of nodes. Moreover, let 𝑃 ∶= 𝑋 − ... − 𝑌 be an undirected path in 𝐺. We call 𝑃 a d-connecting path with respect to Z if the following assertions hold for every triple ... − 𝑅1 − 𝑅2 − 𝑅3 − ... in 𝑃: i) If 𝑅1 ← 𝑅2 or 𝑅2 → 𝑅3 in 𝐺, we have that 𝑅2 ∉ Z. ii) If 𝑅1 → 𝑅2 ← 𝑅3 in 𝐺, there exists a directed path 𝑅2 → ... → 𝑍 from 𝑅2 to a 𝑍 ∈ Z. Two sets X and Y of nodes in 𝐺 are said to be d-separated by Z if there exists no d-connecting path between an 𝑋 ∈ X and a 𝑌 ∈ Y. Remark 1. 
The "d" in the term "d-separation" is a shorthand for "directional" ([3]). The intuition behind this definition is that d-connecting paths indicate dependencies, and indeed one obtains: PLP 2021: The 8th Workshop on Probabilistic Logic Programming, September 20-27, 2021, Porto kilian.rueckschloss@lmu.de (K. Rückschloß); felix.weitkaemper@lmu.de (F. Weitkämper) © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Theorem 1. Consider a BN with underlying DAG 𝐺 and let X, Y and Z be sets of random variables. If X and Y are d-separated by Z in 𝐺, then X and Y are independent, conditioned on Z. Remark 2. Note that the converse of Theorem 1 fails in general. It is referred to as the causal faithfulness assumption and plays a central role in causal structure discovery [4, §4]. We believe that formulating and verifying the causal faithfulness assumption for open universe PLPs would be an interesting direction for future work. Example 1. Assume we are given a BN with underlying DAG ℎ1 → ℎ2 → ℎ3 . In this case Theorem 1 yields without further calculations that ℎ1 and ℎ3 become independent once we observe ℎ2 . In probabilistic logic programming (PLP) J. Vennekens and S. Verbaeten established a correspondence between ground acyclic programs and BNs [2, §5.2]. In particular, a ground acyclic program without function symbols induces a distribution, which can be stored in a Boolean BN on the dependency graph. This gives us the possibility to derive statements about independence considering only the program structure: Example 2. Let us for instance consider the following acyclic ground LPAD-program: ℎ1 ∶ 𝛼1 ← ℎ2 ∶ 𝛼2 ← ℎ1 ℎ3 ∶ 𝛼3 ← ℎ2 We see that the induced distribution can be stored in a BN with underlying graph ℎ1 → ℎ2 → ℎ3 , i.e. 
by Example 1 we can immediately conclude that ℎ1 and ℎ3 become independent if we observe ℎ2 . In this paper we want to transfer the reasoning of Example 2 to open universe PLPs, which then correspond to families of BNs. Since the field of causal structure discovery for BNs heavily relies on d-separation ([5], [6], [7]), we believe that such an independence analysis is a first step towards a new PLP-based relational causal structure discovery algorithm. Moreover, considering the work of S. Holtzen et al. [8], it also seems that such an analysis may contribute to speeding up (lifted) inference in PLP. Let us now proceed to the formulation of the main problem in this paper. 2. Formulation of the Problem Let us consider an open universe LPAD-program P, which is given by the clauses 𝐻𝑗 ∶ 𝛼 (𝑗) ← 𝐵1(𝑗) , ..., 𝐵𝑚𝑗(𝑗) for 1 ≤ 𝑗 ≤ 𝑛, i.e. the 𝐻𝑗 are atoms, the 𝐵1(𝑗) , ..., 𝐵𝑚𝑗(𝑗) are literals and 𝛼 (𝑗) ∈ [0, 1] is a real number, denoting the probability for the clause to hold. Further, assume that C1 , C2 and C3 are finite sets of literals, which take probabilities 0 or 1 in every realization of the program P, and let us choose head atoms 𝐻𝑗1 , 𝐻𝑗2 and 𝐻𝑗3 . For a given realization Π of P we denote by {𝐻𝑗𝑖 : C𝑖 }Π the definable set of all ground atoms, i.e. random variables, which we obtain from 𝐻𝑗𝑖 under the condition that the literals in C𝑖 hold true. A more formal definition of the sets {𝐻𝑗𝑖 : C𝑖 }Π will be given in Section 4.1. This paper investigates the following question: Problem 1. Is {𝐻𝑗1 : C1 }Π independent of {𝐻𝑗2 : C2 }Π if we condition on {𝐻𝑗3 : C3 }Π for every realization Π of the program P? Example 3. Let us consider storages, which consist of two sections 𝑠1 and 𝑠2 . Further, for each storage a database 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑆) tells us which rooms 𝑅 belong to which section 𝑆. 
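The reasoning of Examples 1 and 2 can be checked numerically. The following sketch (with made-up clause probabilities 𝛼1 , 𝛼2 , 𝛼3 ) enumerates the joint distribution that the chain program of Example 2 induces under ProbLog-style semantics, where an atom is true exactly if its clause fires and its body holds, and verifies that ℎ1 and ℎ3 are independent given ℎ2 :

```python
from itertools import product

a1, a2, a3 = 0.3, 0.6, 0.7  # hypothetical clause probabilities

def joint(h1, h2, h3):
    # h1 :: a1.   h2 :: a2 :- h1.   h3 :: a3 :- h2.
    p = a1 if h1 else 1 - a1
    p *= (a2 if h2 else 1 - a2) if h1 else (0.0 if h2 else 1.0)
    p *= (a3 if h3 else 1 - a3) if h2 else (0.0 if h3 else 1.0)
    return p

def prob(pred):
    return sum(joint(*w) for w in product([False, True], repeat=3) if pred(*w))

# Conditional independence of h1 and h3 given h2 = True:
p_h2 = prob(lambda h1, h2, h3: h2)
p13 = prob(lambda h1, h2, h3: h1 and h3 and h2) / p_h2
p1 = prob(lambda h1, h2, h3: h1 and h2) / p_h2
p3 = prob(lambda h1, h2, h3: h3 and h2) / p_h2
assert abs(p13 - p1 * p3) < 1e-12
```

Here the factorization P(ℎ1 , ℎ3 | ℎ2 ) = P(ℎ1 | ℎ2 ) P(ℎ3 | ℎ2 ) holds for any choice of the probabilities, exactly as d-separation of ℎ1 and ℎ3 by {ℎ2 } predicts.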
Since 𝑠1 and 𝑠2 are the only sections and each room lies in one section, we obtain the following constraint: ∀𝑅 ¬ (𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 ) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠2 )) (1) Moreover, in each room 𝑅 we find tanks 𝑇, denoted by 𝑖𝑛(𝑇 , 𝑅), and each tank 𝑇 stores a liquid 𝐿, denoted by 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿). Finally, we assume a database 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿), indicating the inflammable liquids. An employee 𝐸 opens a tank 𝑇 with probability 𝛼, denoted by 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ), i.e. we obtain: 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ) ∶ 𝛼 ← (2) Further, if employee 𝐸 opens tank 𝑇, it may be that with probability 𝛽 employee 𝐸 doesn't close the tank 𝑇 properly, which then causes the tank 𝑇 to leak, denoted by 𝑙𝑒𝑎𝑘𝑠(𝑇 ): 𝑙𝑒𝑎𝑘𝑠(𝑇 ) ∶ 𝛽 ← 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ) (3) Moreover, an employee 𝐸 smokes in a room 𝑅, denoted by 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅), with probability 𝛾: 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) ∶ 𝛾 ← (4) If employee 𝐸 smokes in a room 𝑅, which contains a leaking tank storing an inflammable liquid, this causes a fire in the corresponding section with probability 𝛿: 𝑓 𝑖𝑟𝑒(𝑆) ∶ 𝛿 ← 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅), 𝑙𝑒𝑎𝑘𝑠(𝑇 ), 𝑖𝑛(𝑇 , 𝑅), 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑆), 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿), 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿) (5) In this paper we present a reasoning method which verifies, for instance, that the event {𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟) : 𝑒 employee, 𝑟 room, 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑟, 𝑠1 )} of an employee smoking in a room of section 𝑠1 is independent of the event {𝑜𝑝𝑒𝑛𝑠(𝑒, 𝑡) : 𝑒 employee, 𝑡 tank, 𝑟 room, 𝑖𝑛(𝑡, 𝑟), 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑟, 𝑠1 )} of an employee opening a tank in section 𝑠1 . Further, we discuss how things change once we observe a fire or a leaking tank in 𝑠1 . 3. A Formalism for our Solution Recall that events of the trivial probabilities 0 and 1 are independent of every other event. To overcome this obstruction we introduce a language which separates domain predicates, denoting logical statements with probabilities 0 and 1, from random predicates, denoting events with a probability lying strictly between 0 and 1. 3.1. Syntax We consider a language in two parts: a probabilistic vocabulary 𝔓 and a domain vocabulary 𝔇 ⊆ 𝔓. Here 𝔓 is a finite multisorted relational vocabulary, i.e. 
it consists of a finite set of sorts, a finite set of relation symbols and a finite set of constants in those sorts, as well as a countably infinite set of variables in every sort. For every variable or constant 𝑥 we will denote its sort by 𝔰(𝑥). Further, 𝔇 is a subvocabulary of 𝔓, which contains all of the variables and constants of 𝔓 as well as a (possibly empty) subset of the relation symbols of 𝔓. Example 4. In Example 3 the vocabulary 𝔇 consists of the predicates 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑆), 𝑖𝑛(𝑇 , 𝑅), 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) and 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿), which we assume to be given by databases. From now on we fix vocabularies 𝔇 ⊆ 𝔓 as above. As usual, an atom is an expression of the form 𝑟(𝑡1 , … , 𝑡𝑛 ), where 𝑟 is a relation symbol and 𝑡1 to 𝑡𝑛 are constants or variables of the correct sorts; a literal is an expression of the form 𝐴 or ¬𝐴 for an atom 𝐴. It is called a domain atom or literal if 𝑟 is in 𝔇 and a random atom or literal if 𝑟 is in 𝔓 ⧵ 𝔇. A literal of the form 𝐴 is called positive and a literal of the form ¬𝐴 is called negative. Example 5. In Example 4 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 ) is a domain atom, whereas 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) is a random atom. Formulas, as well as existential and universal formulas, are defined as usual in first-order logic. Moreover, we will use the operator var(_) to refer to the free variables in an expression. The domain vocabulary will be used to formulate constraints and conditions for our PLP. The purpose of a PLP, however, is to define distributions for the random variables determined by the language 𝔓. This is done by so-called random clauses, i.e. LPAD-clauses with one head atom and only random literals in their body, which we additionally annotate by existential domain formulas to reflect the circumstances under which they are available. Definition 2 (Random Clause). 
A random clause 𝑅𝐶 is an expression of the form (𝑅 ∶ 𝛼 ← 𝑅1 , ..., 𝑅𝑚 ) ⇐ 𝐶 which is given by i) a random atom 𝑅 ∶= effect(𝑅𝐶), called the effect of 𝑅𝐶 ii) a possibly empty set of random literals causes(𝑅𝐶) ∶= {𝑅1 , ..., 𝑅𝑚 }, called the causes of 𝑅𝐶 iii) a condition condition(𝑅𝐶) ∶= 𝐶 of 𝑅𝐶, which is an existential domain formula 𝐶, such that all free variables of 𝐶 also occur in at least one of 𝑅, 𝑅1 , ..., 𝑅𝑚 iv) a real number 𝑝(𝑅𝐶) ∶= 𝛼 ∈ [0, 1], called the probability of 𝑅𝐶. Finally, we define the set var(𝑅𝐶) of free variables occurring in 𝑅𝐶 to be var(effect(𝑅𝐶)) ∪ ⋃{var(𝑅) : 𝑅 ∈ causes(𝑅𝐶)}. Remark 3. In i) and ii) of Definition 2 we use the terminology of cause and effect to reflect that a ground program with Problog semantics denotes a structural equation model in the sense of [9, §1.4]. Example 6. To reason about the independence statements implied by the program in Example 3, we rephrase the clause (5), i.e. we arrange all domain literals in the body on the right side of ⇐ and quantify over all variables not occurring in the remaining literals. Hence, we obtain the following random clause 𝑅𝐶: (𝑓 𝑖𝑟𝑒(𝑆) ∶ 𝛿 ← 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅), 𝑙𝑒𝑎𝑘𝑠(𝑇 )) ⇐ ∃𝐿 𝑖𝑛(𝑇 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑆) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿) (6) In the following examples we will refer to the program P consisting of the clauses (1), (2), (3), (4) and (6), with the convention that we denote a clause of the form (effect(_) ∶ 𝑝(_) ← causes(_)) ⇐ 𝑇 𝑟𝑢𝑒 by effect(_) ∶ 𝑝(_) ← causes(_). 3.2. Semantics After having established the necessary syntax we introduce the corresponding semantics. Let us begin with the domain expressions. 3.2.1. Semantics of Domain Expressions The semantics of domain expressions are given in a straightforward way. 
We just want to highlight the unique names assumption in our definition of a structure: A 𝔇-structure Δ consists of a sort universe 𝔰Δ for every sort 𝔰, an element of the corresponding sort universe for every constant in 𝔇, such that two different constants are mapped to different elements, and a relation among the corresponding sort universes for every relation symbol in 𝔇. Whether a domain formula is satisfied by a given 𝔇-structure (under a given interpretation of its free variables) is determined by the usual rules of first-order logic. 3.2.2. Semantics of Probabilistic Expressions Let us proceed with the semantics for probabilistic expressions. As a 𝔓-structure should extend a 𝔇-structure by random variables we obtain: Definition 3 (Ground Variable & 𝔓-Structure). i) Let Δ be a 𝔇-structure. A ground variable 𝐺 ∶= 𝑟(𝑥1 , … , 𝑥𝑞 ) consists of a random predicate pred(𝐺) ∶= 𝑟 of arity ar(𝑟) ∶= (𝔰1 , ..., 𝔰𝑞 ), called the predicate of 𝐺, and a tuple of suitable realizations arguments(𝐺) ∶= (𝑥1 , ..., 𝑥𝑞 ) ∈ ∏𝑞𝑖=1 𝔰𝑖Δ , called the arguments of 𝐺. ii) A 𝔓-structure Π consists of a 𝔇-structure Δ and a distinct, non-trivially distributed, Boolean random variable 𝐺 Π ∶= 𝑟 Π (𝑥1 , ..., 𝑥𝑞 ) for every ground variable 𝐺 = 𝑟(𝑥1 , ..., 𝑥𝑞 ). Remark 4. Please note that in ii) of Definition 3 we also require a version of the unique names assumption. Now we can make sense of the low level constructs in the language 𝔓. Definition 4 (Semantics of Random Literals). Let Π be a 𝔓-structure. We define for every random atom 𝐴 ∶= 𝑟(𝑥1 , ..., 𝑥𝑞 ) and for every variable interpretation 𝜄 on var(𝐴) the following Boolean random variables: 𝑟(𝑥1 , ..., 𝑥𝑞 )𝜄 ∶= 𝑟 Π (𝑥1𝜄 , ..., 𝑥𝑞𝜄 ) (¬𝑟(𝑥1 , ..., 𝑥𝑞 ))𝜄 ∶= ¬𝑟 Π (𝑥1𝜄 , ..., 𝑥𝑞𝜄 ). Probabilistic Logic Programs For the semantics of clauses and programs we choose Problog semantics, since a ground Problog program is essentially the same as a structural equation model in the sense of [9, §1.4]. 
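The correspondence just mentioned can be sketched in code: every ground clause instance contributes its own independent Boolean noise variable 𝑢, and each atom is true exactly if some clause instance for it has a true body and a true noise variable. The two-clause mini-program and its probabilities (0.4, 0.5) below are hypothetical and only illustrate the shape of such a structural equation model; we also assume the clauses are listed in a topological order:

```python
import random

random.seed(0)

# Hypothetical ground clauses (effect, causes, probability), listed so that
# every cause appears as an effect earlier (acyclicity assumption).
clauses = [
    ("opens", [], 0.4),
    ("leaks", ["opens"], 0.5),
]

def sample_world():
    world = {}
    # One independent noise variable u per clause instance, true with prob. p,
    # mirroring the u(RC_i, kappa) of the functional causal model.
    u = {i: random.random() < p for i, (_, _, p) in enumerate(clauses)}
    for i, (effect, causes, _) in enumerate(clauses):
        fired = all(world[c] for c in causes) and u[i]
        world[effect] = world.get(effect, False) or fired
    return world

n = 100_000
count = sum(sample_world()["leaks"] for _ in range(n))
print(count / n)  # should be close to 0.4 * 0.5 = 0.2
```

Sampling the noise variables and then evaluating the equations in order is exactly how a functional causal model in the sense of [9, §1.4] generates a joint distribution.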
We start with the definition of a program: Definition 5 (Program). A program P ∶= (R, D) is a pair, which consists of i) a finite set of universal domain formulas D ii) a finite set of random clauses R with the following property: For every random clause 𝑅𝐶 ∈ R, every random literal 𝐶 ∈ causes(𝑅𝐶), every 𝔇-structure Δ and every interpretation 𝜄 such that Δ satisfies D and the condition of 𝑅𝐶 under 𝜄, there exists a sequence of random clauses 𝑅𝐶0 , ..., 𝑅𝐶𝑛 ∈ R such that the condition of 𝑅𝐶𝑖 is satisfied by Δ under 𝜄 for all 1 ≤ 𝑖 ≤ 𝑛, head(𝑅𝐶𝑖+1 ) ∈ causes(𝑅𝐶𝑖 ), head(𝑅𝐶0 ) = 𝐶 and causes(𝑅𝐶𝑛 ) = ∅. In this case R(P) ∶= R is called the random part and D(P) ∶= D is called the deterministic part of the program P. Moreover, a 𝔇-structure Δ is called a domain of P if and only if Δ ⊧ D(P), i.e. if Δ satisfies the deterministic part of P. Remark 5. Note that the property in Definition 5 ii) is needed to ensure that we obtain a well-defined distribution in Definition 7. Example 7. In Example 6 we obtain that the deterministic part D(P) of P consists of the constraint (1). The remaining clauses of P form the random part R(P) of P. In the present paper we want to reason on a syntactic level about the independence statements which are implied by a program P. To obtain a reasonable criterion we proceed as in the propositional theory [10] and reduce ourselves to those independence statements which follow from d-separation in the corresponding BNs. Now the ground graphs are the DAGs in which we want to apply d-separation. Definition 6 (Ground Graph & Acyclic Program). Let P be a program. For every domain Δ of P we define the ground graph GraphΔ (P) to be the directed graph obtained by the following procedure: For every random clause 𝑅𝐶 ∈ R(P), for every cause 𝑅 ∈ causes(𝑅𝐶) and for every interpretation 𝜄 on var(𝑅𝐶), satisfying condition(𝑅𝐶)𝜄 = 𝑇 𝑟𝑢𝑒, we draw an arrow effect(𝑅𝐶)𝜄 ← 𝑅 𝜄 . 
In this case we say that the edge effect(𝑅𝐶)𝜄 ← 𝑅 𝜄 is induced by the clause 𝑅𝐶. The program P is called acyclic if the ground graph GraphΔ (P) is acyclic for all domains Δ of P. Example 8. It is easy to see that the program P of Example 6 is an acyclic program, since there are only arrows from the variables 𝑙𝑒𝑎𝑘𝑠(_) and 𝑠𝑚𝑜𝑘𝑒𝑠(_, _) to 𝑓 𝑖𝑟𝑒(_) and from the variables 𝑜𝑝𝑒𝑛𝑠(_, _) to 𝑙𝑒𝑎𝑘𝑠(_). Finally, we are in the position to give the semantics of an acyclic program. Definition 7 (Semantics of Acyclic Programs). Let P be an acyclic program. Assume the random part R(P) of P consists of the random clauses 𝑅𝐶𝑖 , which are given by (𝑅𝑖 ∶ 𝛼𝑖 ← 𝑅1(𝑖) , ..., 𝑅𝑚𝑖(𝑖) ) ⇐ 𝐶𝑖 for 1 ≤ 𝑖 ≤ 𝑚. Further, let Δ be a domain of P. We define the grounding PΔ of the program P w.r.t. Δ to be the Boolean functional causal model [9, §1.4] on the set of random variables ℛ(Δ) ∶= {effect(𝑅𝐶𝑖 )𝜄 : 1 ≤ 𝑖 ≤ 𝑚, 𝜄 interpretation on var(𝑅𝐶𝑖 ) such that condition(𝑅𝐶𝑖 )𝜄 = 𝑇 𝑟𝑢𝑒}, which is given by the following equation for every 𝑅 𝜄 ∈ ℛ(Δ):

$$R^{\iota} := \bigvee_{\substack{RC_i \in \mathbf{R}(\mathbf{P}),\ \kappa \text{ interpretation on } \operatorname{var}(RC_i) \\ \operatorname{effect}(RC_i)^{\kappa} = R^{\iota},\ \operatorname{condition}(RC_i)^{\kappa} = True}} \left( \bigwedge_{j=1}^{m_i} \left(R_j^{(i)}\right)^{\kappa} \wedge u(RC_i, \kappa) \right),$$

where 𝑢 (𝑅𝐶𝑖 , 𝜅) is a Boolean random variable with the distribution ℙ (𝑢 (𝑅𝐶𝑖 , 𝜅) = 𝑇 𝑟𝑢𝑒) ∶= 𝛼𝑖 = 𝑝(𝑅𝐶𝑖 ) for every interpretation 𝜅 on var(𝑅𝐶𝑖 ). Besides, the random variables 𝑢 (𝑅𝐶𝑖 , 𝜅) are assumed to be mutually independent for every random clause 𝑅𝐶𝑖 , every interpretation 𝜅 on var(𝑅𝐶𝑖 ) and every 1 ≤ 𝑖 ≤ 𝑚. Notation 1. We say that a 𝔓-structure Π is generated by a program P and write Π ⊧ P if the following assertions hold: i) the underlying 𝔇-structure Δ is a domain of P ii) the joint distribution on the random variables effect(𝑅𝐶)𝜄 for 𝑅𝐶 ∈ R(P) and 𝜄 an interpretation with condition(𝑅𝐶)𝜄 = 𝑇 𝑟𝑢𝑒 coincides with the one induced by the functional causal model PΔ . In this case Π is called a (probabilistic) realization of P. Remark 6. 
Note that the semantics of Definition 7 do not coincide with the classical LPAD semantics. However, our semantics still yield the same BN structures after grounding. Further, note that the functional causal model PΔ has the causal diagram GraphΔ (P) for every domain Δ of the program P, i.e. by [9, §1.4] we obtain that PΔ induces a distribution which can be stored in a BN with the ground graph GraphΔ (P) as underlying DAG. Example 9. It is easy to observe that the program P of Example 6 is indeed an acyclic program in our sense. Now assume we are given a specific storage, which consists of three rooms 𝑟1 , 𝑟2 and 𝑟3 . These rooms are assigned to sections as follows: 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑟1 , 𝑠1 ), 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑟2 , 𝑠1 ) and 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑟3 , 𝑠2 ). Moreover, we have 4 tanks 𝑡1 , 𝑡2 , 𝑡3 and 𝑡4 with 𝑖𝑛(𝑡1 , 𝑟1 ), 𝑖𝑛(𝑡2 , 𝑟1 ), 𝑖𝑛(𝑡3 , 𝑟2 ) and 𝑖𝑛(𝑡4 , 𝑟3 ). The tanks contain three types of liquids 𝑙1 , 𝑙2 and 𝑙3 , which we describe in the following way: 𝑠𝑡𝑜𝑟𝑒𝑠(𝑡1 , 𝑙1 ), 𝑠𝑡𝑜𝑟𝑒𝑠(𝑡2 , 𝑙2 ), 𝑠𝑡𝑜𝑟𝑒𝑠(𝑡3 , 𝑙2 ) and 𝑠𝑡𝑜𝑟𝑒𝑠(𝑡4 , 𝑙3 ) with 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝑙1 ) and 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝑙3 ). Finally, assume there are two employees Mary and John, whom we denote simply by 𝑚𝑎𝑟𝑦 and 𝑗𝑜ℎ𝑛. In this case one checks that the storage above satisfies the deterministic part (1), i.e. we are given a domain Δ of the program P. 
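For this concrete domain, the ground graph and the d-separation criterion of Definition 1 can be sketched in a few lines. The string node names and the naive path-enumeration approach are our own illustration under the domain facts of Example 9, not the paper's algorithm; for a collider we test whether some descendant (including the collider itself) is observed:

```python
from itertools import product

# Hypothetical encoding of the domain of Example 9.
employees, tanks = ["mary", "john"], ["t1", "t2", "t3", "t4"]
belongs = {"r1": "s1", "r2": "s1", "r3": "s2"}
in_room = {"t1": "r1", "t2": "r1", "t3": "r2", "t4": "r3"}
stores = {"t1": "l1", "t2": "l2", "t3": "l2", "t4": "l3"}
inflammable = {"l1", "l3"}

children = {}
def edge(a, b):
    children.setdefault(a, []).append(b)
    children.setdefault(b, [])

for e, t in product(employees, tanks):       # clause (3): opens -> leaks
    edge(f"opens({e},{t})", f"leaks({t})")
for t in tanks:                              # clause (5)/(6): smokes, leaks -> fire
    r, s = in_room[t], belongs[in_room[t]]
    if stores[t] in inflammable:
        edge(f"leaks({t})", f"fire({s})")
        for e in employees:
            edge(f"smokes({e},{r})", f"fire({s})")

def d_separated(x, y, z):
    parents = {n: [a for a in children if n in children[a]] for n in children}
    def desc(n):
        out, stack = {n}, [n]
        while stack:
            for c in children[stack.pop()]:
                if c not in out:
                    out.add(c); stack.append(c)
        return out
    def connecting(path):
        for a, b, c in zip(path, path[1:], path[2:]):
            if b in children[a] and b in children[c]:   # collider a -> b <- c
                if not desc(b) & z:
                    return False
            elif b in z:                                # observed non-collider
                return False
        return True
    def paths(a, visited):
        if a == y:
            yield list(visited)
        for nxt in children[a] + parents[a]:
            if nxt not in visited:
                yield from paths(nxt, visited + [nxt])
    return not any(connecting(p) for p in paths(x, [x]))

print(d_separated("smokes(mary,r1)", "opens(john,t1)", set()))         # marginally independent
print(d_separated("smokes(mary,r1)", "opens(john,t1)", {"fire(s1)"}))  # observing the fire d-connects them
```

The only path between the two queried nodes runs through the collider 𝑓 𝑖𝑟𝑒(𝑠1 ), so it is blocked marginally and opened by observing the fire, matching the discussion of Example 3.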
Further, for 𝑒 ∈ {𝑗𝑜ℎ𝑛, 𝑚𝑎𝑟𝑦}, 𝑡 ∈ {𝑡1 , 𝑡2 , 𝑡3 , 𝑡4 } and 𝑟 ∈ {𝑟1 , 𝑟2 , 𝑟3 } the grounding PΔ is given by equations of the form: • 𝑜𝑝𝑒𝑛𝑠(𝑒, 𝑡) ∶= 𝑢 (𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ) ∶ 𝛼 ←, {𝐸 ↦ 𝑒, 𝑇 ↦ 𝑡}) such that 𝑢 (𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ) ∶ 𝛼 ←, {𝐸 ↦ 𝑒, 𝑇 ↦ 𝑡}) is true with probability 𝛼 • 𝑙𝑒𝑎𝑘𝑠(𝑡) ∶= 𝑜𝑝𝑒𝑛𝑠(𝑒, 𝑡) ∧ 𝑢 (𝑙𝑒𝑎𝑘𝑠(𝑇 ) ∶ 𝛽 ← 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ), {𝐸 ↦ 𝑒, 𝑇 ↦ 𝑡}) such that 𝑢 (𝑙𝑒𝑎𝑘𝑠(𝑇 ) ∶ 𝛽 ← 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ), {𝐸 ↦ 𝑒, 𝑇 ↦ 𝑡}) is true with probability 𝛽 • 𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟) ∶= 𝑢 (𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) ∶ 𝛾 ←, {𝐸 ↦ 𝑒, 𝑅 ↦ 𝑟}) such that 𝑢 (𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) ∶ 𝛾 ←, {𝐸 ↦ 𝑒, 𝑅 ↦ 𝑟}) is true with probability 𝛾 • 𝑓 𝑖𝑟𝑒(𝑠1 ) ∶= 𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟1 ) ∧ 𝑙𝑒𝑎𝑘𝑠(𝑡1 ) ∧ 𝑢(𝑅𝐶, {𝑆 ↦ 𝑠1 , 𝐸 ↦ 𝑒, 𝑅 ↦ 𝑟1 , 𝑇 ↦ 𝑡1 }) such that 𝑢(𝑅𝐶, {𝑆 ↦ 𝑠1 , 𝐸 ↦ 𝑒, 𝑅 ↦ 𝑟1 , 𝑇 ↦ 𝑡1 }) is true with probability 𝛿 • 𝑓 𝑖𝑟𝑒(𝑠2 ) ∶= 𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟3 ) ∧ 𝑙𝑒𝑎𝑘𝑠(𝑡4 ) ∧ 𝑢(𝑅𝐶, {𝑆 ↦ 𝑠2 , 𝐸 ↦ 𝑒, 𝑅 ↦ 𝑟3 , 𝑇 ↦ 𝑡4 }) such that 𝑢(𝑅𝐶, {𝑆 ↦ 𝑠2 , 𝐸 ↦ 𝑒, 𝑅 ↦ 𝑟3 , 𝑇 ↦ 𝑡4 }) is true with probability 𝛿 In the situation above we obtain the ground graph GraphΔ (P) with the nodes 𝑜𝑝𝑒𝑛𝑠(𝑒, 𝑡), 𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟), 𝑙𝑒𝑎𝑘𝑠(𝑡), 𝑓 𝑖𝑟𝑒(𝑠1 ) and 𝑓 𝑖𝑟𝑒(𝑠2 ) for 𝑒 ∈ {𝑚𝑎𝑟𝑦, 𝑗𝑜ℎ𝑛}, 𝑡 ∈ {𝑡1 , ..., 𝑡4 } and 𝑟 ∈ {𝑟1 , 𝑟2 , 𝑟3 }, and with the edges 𝑜𝑝𝑒𝑛𝑠(𝑒, 𝑡) → 𝑙𝑒𝑎𝑘𝑠(𝑡) for every employee 𝑒 and tank 𝑡, as well as 𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟1 ) → 𝑓 𝑖𝑟𝑒(𝑠1 ), 𝑙𝑒𝑎𝑘𝑠(𝑡1 ) → 𝑓 𝑖𝑟𝑒(𝑠1 ), 𝑠𝑚𝑜𝑘𝑒𝑠(𝑒, 𝑟3 ) → 𝑓 𝑖𝑟𝑒(𝑠2 ) and 𝑙𝑒𝑎𝑘𝑠(𝑡4 ) → 𝑓 𝑖𝑟𝑒(𝑠2 ). Further, it is easy to check that this is the causal diagram, i.e. the corresponding BN structure, of the functional causal model PΔ . As we have now defined the necessary refinement of the LPAD-language, we can return to Problem 1. 4. Reasoning about Probabilistic Independence in Programs Having the language of Section 3 at our disposal, we now reformulate Problem 1, i.e. we formally define the sets {𝐻𝑗𝑖 : C𝑖 }Π . 4.1. Independence of Context Variables We introduce context variables to reason about definable sets of random variables in an open universe. Definition 8 (Context Variables). 
A context variable 𝐶 ∶= {atom(𝐶) : context(𝐶)} is a pair, which consists of a random atom atom(𝐶), called the atom of 𝐶, and an existential formula context(𝐶) with var(context(𝐶)) ⊆ var(atom(𝐶)), called the context of 𝐶. Moreover, we associate to every 𝔓-structure Π the following set of random variables: 𝐶 Π = {atom(𝐶) : context(𝐶)}Π ∶= {atom(𝐶)𝜄 : 𝜄 interpretation on var(𝐶) with context(𝐶)𝜄 = 𝑇 𝑟𝑢𝑒} Finally, we define for every set C of context variables CΠ ∶= ⋃𝐶∈C 𝐶 Π . Example 10. Let us consider the situation in Example 6. We obtain that the pair {𝑙𝑒𝑎𝑘𝑠(𝑇 ) : ∃𝑅 ∃𝐿 𝑖𝑛(𝑇 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿)} is a context variable, denoting the event that in section 𝑠1 there is a leaking tank, which stores an inflammable liquid. Moreover, for the realization Π of the program P in Example 9 we obtain that {𝑙𝑒𝑎𝑘𝑠(𝑇 ) : ∃𝑅 ∃𝐿 𝑖𝑛(𝑇 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿)}Π = {𝑙𝑒𝑎𝑘𝑠(𝑡1 )}. We are interested in the independence relations, which hold across all realizations of an acyclic program P. Hence, let us fix an acyclic program P. Altogether we are interested in the following notion: Definition 9 (Independence of Context Variables). We say that two sets of context variables X and Y are independent, conditioned on a third set of context variables Z if for every realization Π ⊧ P we have that XΠ is independent of YΠ , whenever we condition on ZΠ . In this case we write 𝑋 ⊔ 𝑌 |𝑍. To transfer d-separation from the ground graphs GraphΔ (P) to an open universe we need a representation for all d-connecting paths that might occur in a specific ground graph GraphΔ (P) for a domain Δ of P. Before we introduce this representation in Section 4.3, we still need to lift the notion of an interpretation to the open universe setting. 4.2. Substitutions A substitution is the open universe analogue of an interpretation. Definition 10 (Substitution). 
A substitution 𝜃 is a function that associates to each 𝑆 from a finite set of variables Dom(𝜃) a variable or constant 𝑆 𝜃 of the same sort. If V ⊆ Dom(𝜃), we call 𝜃 a substitution on V. We extend 𝜃 to a function on atoms, clauses and formulas. To avoid variables being substituted into or out of the scope of a quantifier, first rename all bound variable occurrences in 𝜑 using variable names outside the domain and the range of 𝜃. Then obtain 𝜑 𝜃 by applying 𝜃 to every remaining occurrence of the variables of Dom(𝜃) in 𝜑. An expression 𝐵 is called a specialization of an expression 𝐴 if there exists a substitution 𝜃 such that 𝐴𝜃 = 𝐵. Example 11. Let us reconsider Example 6. Further, let 𝐸 and 𝐸 ′ be variables of the sort employee, let 𝑆 be a variable of the sort section and let 𝑅 be a variable of the sort room. Then 𝐸 𝜃 ∶= 𝐸 ′ , 𝑆 𝜃 ∶= 𝑠1 and 𝑅 𝜃 ∶= 𝑅 defines a substitution on the set {𝐸, 𝑆, 𝑅} and we obtain: 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅)𝜃 = 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸 ′ , 𝑅), 𝑓 𝑖𝑟𝑒(𝑆)𝜃 = 𝑓 𝑖𝑟𝑒(𝑠1 ), 𝑓 𝑖𝑟𝑒(𝑠2 )𝜃 = 𝑓 𝑖𝑟𝑒(𝑠2 ) As a preparation for the definition of the big dependence graph in Section 4.3 we need the following lemma about the existence of substitutions. Since it is an immediate consequence of the unique names assumption, its proof is omitted here. Lemma 1. Let 𝐴 and 𝐵 be random atoms such that 𝐴𝜄 = 𝐵𝜄′ for interpretations 𝜄 and 𝜄 ′ of a 𝔓-structure Π. Then there exist substitutions 𝜃 and 𝜃 ′ such that 𝐴𝜃 = 𝐵𝜃′ , (𝐴𝜃 )𝜄 = 𝐴𝜄 and such that (𝐵𝜃′ )𝜄′ = 𝐵𝜄′ . When we come to decidability in Section 4.4, we consider the following special substitutions: Definition 11. A substitution 𝜃 on V is called a relabelling if it is injective and for all variables 𝑆 ∈ Dom(𝜃) we have that 𝑆 𝜃 is again a variable. We say that two atoms 𝐴 and 𝐵 are unifiable and write 𝐴 ∼ 𝐵 if there exists a relabelling 𝜃 such that 𝐴𝜃 = 𝐵. Notation 2. Note that unifiability ∼ is an equivalence relation. We denote by [𝑅] the equivalence class of a random atom 𝑅 up to unifiability. 
Further, we call [𝑅] the unification class of 𝑅. Example 12. Note that 𝜃 in Example 11 is not a relabelling, whereas 𝐸 𝜌 ∶= 𝐸 ′ , 𝑅 𝜌 ∶= 𝑅 and 𝑆 𝜌 ∶= 𝑆 would be a relabelling on {𝐸, 𝑅, 𝑆}. As it turns out the decidability of our independence analysis relies on the following lemma: Lemma 2. There are finitely many unification classes of random atoms in 𝔓. Moreover, determin- ing whether two random atoms are unifiable is decidable. Proof. Note that for every two random atoms we have that 𝑟1 (𝑥1 , ..., 𝑥𝑛 ) ∼ 𝑟2 (𝑦1 , ..., 𝑦𝑚 ) if and only if the following assertions hold: i) 𝑟1 = 𝑟2 , i.e. 𝑛 = 𝑚 and 𝔰(𝑥𝑖 ) = 𝔰(𝑦𝑖 ) for all 1 ≤ 𝑖 ≤ 𝑛. ii) For all 1 ≤ 𝑖 ≤ 𝑛 we have that 𝑥𝑖 is a variable if and only if 𝑦𝑖 is. iii) For all 1 ≤ 𝑖 ≤ 𝑛 we have that 𝑥𝑖 = 𝑦𝑖 if 𝑥𝑖 or 𝑦𝑖 is a constant. iv) For all 1 ≤ 𝑖 < 𝑗 ≤ 𝑛 we have that 𝑥𝑖 ≠ 𝑥𝑗 if and only if 𝑦𝑖 ≠ 𝑦𝑗 . As there are only finitely many random predicates of finite arity and finitely many constants in 𝔓, we obtain the desired result. □ Now let us move on to the big dependence graph, which turns out to be the right object for reasoning about d-separation in our setting. 4.3. The Big Dependence Graph As already mentioned, in order to reason about d-separation in an open universe we need a structure, which allows for a representation of all d-connecting paths, that might occur in a ground graph GraphΔ (P) for a possible domain Δ of the program P. Definition 12 (Big Dependence Graph). The big dependence graph Graph(P) is the directed labelled graph, obtained by the following procedure: For every random clause 𝑅𝐶 ∈ R(P), for each substitution 𝜃 on var(𝑅𝐶) and for every cause 𝑅 ∈ causes(𝑅𝐶) we draw an edge 𝑅 𝜃 → effect(𝑅𝐶)𝜃 . An edge of this kind is said to be induced by 𝑅𝐶. Moreover, we label each edge 𝑅 𝜃 → effect(𝑅𝐶)𝜃 , which is induced by the random clause 𝑅𝐶 ∈ R(P), with the existential formula 𝜃 𝜔 (𝑅 𝜃 → effect(𝑅𝐶)𝜃 ) ∶= (∃y condition(𝑅𝐶)) . Here y denotes the set of variables, which occur in 𝑅𝐶 but not in effect(𝑅𝐶) and 𝑅. 
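The unifiability criterion of Lemma 2, conditions i)–iv), translates directly into a decision procedure. The sketch below is our own illustration: atoms are encoded as (predicate, argument list) pairs, a Prolog-style convention (uppercase initial = variable) replaces the sort bookkeeping, and the sort check of condition i) is omitted for brevity:

```python
def unifiable(atom1, atom2):
    (p1, args1), (p2, args2) = atom1, atom2
    if p1 != p2 or len(args1) != len(args2):            # condition i) (sorts omitted)
        return False
    is_var = lambda t: t[0].isupper()
    for x, y in zip(args1, args2):
        if is_var(x) != is_var(y):                      # condition ii): var vs constant
            return False
        if not is_var(x) and x != y:                    # condition iii): constants must match
            return False
    n = len(args1)
    for i in range(n):                                  # condition iv): same equality pattern
        for j in range(i + 1, n):
            if is_var(args1[i]) and is_var(args1[j]):
                if (args1[i] == args1[j]) != (args2[i] == args2[j]):
                    return False
    return True

print(unifiable(("smokes", ["E", "R"]), ("smokes", ["E1", "R1"])))  # True
print(unifiable(("fire", ["s1"]), ("fire", ["S"])))                  # False: constant vs variable
print(unifiable(("opens", ["E", "E"]), ("opens", ["E1", "E2"])))     # False: equality pattern differs
```

Condition iv) is what makes the relabelling injective: a repeated variable on one side must be matched by a repeated variable on the other.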
Remark 7. One should read 𝜔 (𝑅 𝜃 → effect(𝑅𝐶)𝜃 ) as follows: "For every interpretation 𝜄 on var(𝑅𝐶) the edge 𝑅 𝜄 → effect(𝑅𝐶)𝜄 exists in the ground graph GraphΔ (P) if 𝜔 (𝑅 𝜃 → effect(𝑅𝐶)𝜃 ) holds true under 𝜄." Example 13. Let us reconsider the program P in Example 6. Moreover, let (𝑆𝑖 )𝑖∈ℕ , (𝑅𝑖 )𝑖∈ℕ , (𝑇𝑖 )𝑖∈ℕ and (𝐸𝑖 )𝑖∈ℕ be enumerations of the variables of sort section, room, tank and employee, respectively. In that case, the big dependence graph Graph(P) is the infinite graph with the nodes 𝑓 𝑖𝑟𝑒(𝑆𝑖 ), 𝑓 𝑖𝑟𝑒(𝑠𝑚 ), 𝑙𝑒𝑎𝑘𝑠(𝑇𝑗 ), 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸𝑘 , 𝑅𝑙 ), 𝑜𝑝𝑒𝑛𝑠(𝐸𝑘 , 𝑇𝑗 ) for 𝑖, 𝑗, 𝑘, 𝑙 ∈ ℕ, 𝑚 = 1, 2 and with the following labelled edges: • 𝑜𝑝𝑒𝑛𝑠(𝐸𝑘 , 𝑇𝑗 ) → 𝑙𝑒𝑎𝑘𝑠(𝑇𝑗 ) with label 𝑇 𝑟𝑢𝑒 • 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸𝑘 , 𝑅𝑙 ) → 𝑓 𝑖𝑟𝑒(𝑠𝑚 ) with label ∃𝑇 ∃𝐿 𝑖𝑛(𝑇 , 𝑅𝑙 ) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅𝑙 , 𝑠𝑚 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿) • 𝑙𝑒𝑎𝑘𝑠(𝑇𝑗 ) → 𝑓 𝑖𝑟𝑒(𝑠𝑚 ) with label ∃𝑅 ∃𝐿 𝑖𝑛(𝑇𝑗 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠𝑚 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇𝑗 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿) • 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸𝑘 , 𝑅𝑙 ) → 𝑓 𝑖𝑟𝑒(𝑆𝑖 ) with label ∃𝑇 ∃𝐿 𝑖𝑛(𝑇 , 𝑅𝑙 ) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅𝑙 , 𝑆𝑖 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿) • 𝑙𝑒𝑎𝑘𝑠(𝑇𝑗 ) → 𝑓 𝑖𝑟𝑒(𝑆𝑖 ) with label ∃𝑅 ∃𝐿 𝑖𝑛(𝑇𝑗 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑆𝑖 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇𝑗 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿) d-Separation for PLP The next step will be to establish an analogue of d-separation on the big dependence graph Graph(P). This criterion will consist of two components: A d-connecting path in Graph(P) is a graph-theoretical object, which indicates possible d-connecting paths that might occur in a ground graph GraphΔ (P) for a domain Δ of the program P. Further, we assign to each d-connecting path in the big dependence graph Graph(P) a set of conditions, which tells us under which assumptions on the domain Δ we expect this path to appear in the corresponding ground graph GraphΔ (P). Definition 13 (d-Connecting Path). Let 𝐴 and 𝐵 be random atoms. Further, let Z be a set of context variables. A d-connecting path between 𝐴 and 𝐵 with respect to Z is a finite undirected path of the form 𝒫 = 𝐴𝜃 − ... − 𝐵𝜃′ for substitutions 𝜃 and 𝜃 ′ on var(𝐴), respectively var(𝐵), with the following property: For every collider ... → 𝐶 ← ... in 𝒫 there exists a directed path of the form 𝒫 (𝐶) ∶= 𝐶 → ... 
→ atom(𝑍 )𝜃(𝐶) for a context variable 𝑍 ∈ Z and a substitution 𝜃(𝐶) on var(atom(𝑍 )). Example 14. In Example 13 we have that 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸𝑗 , 𝑅𝑘 ) → 𝑓 𝑖𝑟𝑒(𝑠1 ) ← 𝑙𝑒𝑎𝑘𝑠(𝑇𝑖 ) ← 𝑜𝑝𝑒𝑛𝑠(𝐸𝑗 ′ , 𝑇𝑖 ) yields a d-connecting path 𝒫 between 𝑠𝑚𝑜𝑘𝑒𝑠(𝐸1 , 𝑅1 ) and 𝑜𝑝𝑒𝑛𝑠(𝐸2 , 𝑇1 ) with respect to Z ∶= {{𝑓 𝑖𝑟𝑒(𝑠1 ) : 𝑇 𝑟𝑢𝑒}, {𝑙𝑒𝑎𝑘𝑠(𝑇1 ) : ∃𝑅 𝑖𝑛(𝑇1 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠2 )}}. We still need to find out under which conditions we expect a given d-connecting path in the big dependence graph Graph(P) to occur in a ground graph GraphΔ (P). This is done by the following conditions function, which we define by recursion on the structure of a d-connecting path: Definition 14 (Conditions Function). As before we assume that 𝐴 and 𝐵 are random atoms. Further, we choose a set of context variables Z. i) If 𝒫 = 𝐴𝜃 is a d-connecting path of length 1 in Graph(P), we set conditions(𝒫 ) ∶= ∅. ′ ii) If 𝒫 = 𝐴𝜃 − 𝐵𝜃 is a d-connecting path of length 2 in Graph(P), we set ′ conditions(𝒫 ) ∶= {𝜔 (𝐴𝜃 − 𝐵𝜃 )} . ′ iii) Assume we have two d-connecting paths 𝒫 ′ ∶= 𝐴𝜃 − ... − 𝐷 − 𝐶 and 𝒫 ″ = 𝐵𝜃 − ... − 𝐸 − 𝐶 such that 𝐷 ← 𝐶 or 𝐸 ← 𝐶. Then by Definition 13 we obtain a d-connecting path ′ 𝒫 ∶= 𝐴𝜃 −...−𝐷 −𝐶 −𝐸 −...−𝐵𝜃 , for which we define conditions(𝒫 ) to be the following set: conditions(𝒫 ′ ) ∪ conditions(𝒫 ″ ) ∪ {¬ context(𝑍 )𝜎 : 𝑍 ∈ Z, 𝜎 substitution, atom(𝑍 )𝜎 = 𝐶} iv) Finally, assume that the following two d-connecting paths, 𝒫 ′ ∶= 𝐴𝜃 − ... − 𝐷 → 𝐶 and ′ ′ 𝒫 ″ = 𝐵𝜃 − ... − 𝐸 → 𝐶, yield a d-connecting path 𝒫 ∶= 𝐴𝜃 − ... − 𝐷 → 𝐶 ← 𝐸 − ... − 𝐵𝜃 . Then by Definition 13 there exists a directed path 𝒫 (𝐶) ∶= 𝐶 → ... → atom(𝑍 )𝜃(𝐶) for a context variable 𝑍 ∈ Z and a substitution 𝜃(𝐶). In this case we define conditions(𝒫 ) to be the following set: conditions(𝒫 ′ ) ∪ conditions(𝒫 ″ ) ∪ {𝜔(𝐸) : 𝐸 edge of 𝒫 (𝐶)} ∪ {context (𝑍)𝜃(𝐶) } Example 15. 
In the case of the d-connecting path 𝒫 in Example 14 we obtain for conditions(𝒫 ) the following set: {∃𝑇 ∃𝐿 𝑖𝑛(𝑇 , 𝑅𝑘 ) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅𝑘 , 𝑠1 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿), ∃𝑅 ∃𝐿 𝑖𝑛(𝑇𝑖 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 ) ∧ 𝑠𝑡𝑜𝑟𝑒𝑠(𝑇𝑖 , 𝐿) ∧ 𝑖𝑛𝑓 𝑙𝑎𝑚𝑚𝑎𝑏𝑙𝑒(𝐿), ¬∃𝑅 𝑖𝑛(𝑇𝑖 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠2 ), 𝑇 𝑟𝑢𝑒} Now we bring the graphical and the logical component together and define the correct notion of a d-connecting path between context variables in the big dependence graph Graph(P). Definition 15 (d-Connecting Path between Context Variables). Let 𝑋 and 𝑌 be context vari- ables. Further, let Z be a set of context variables. A d-connecting path 𝒫 between 𝑋 and 𝑌 with respect to Z is a d-connecting path 𝒫 between atom(𝑋 ) and atom(𝑌 ). We call 𝒫 a valid d-connecting path if the set conditions(𝒫 ) ∪ context(𝑋 ) ∪ context(𝑌 ) ∪ D(P) is satisfiable. We say that Z d-connects 𝑋 and 𝑌 in Graph(P) if there exists a valid d-connecting path between 𝑋 and 𝑌 with respect to Z in Graph(P), otherwise we say that Z d-separates 𝑋 and 𝑌. Finally, we say that Z d-connects two sets of context variables X and Y if there exist a 𝑋 ∈ X and a 𝑌 ∈ Y such that Z d-connects 𝑋 and 𝑌. Otherwise, we say that Z d-separates X and Y. Example 16. Let 𝑋 ∶= {𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇 ) : ∃𝑅 𝑖𝑛(𝑇 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 )}, let 𝑌 ∶= {𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) : 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 )} and let 𝑌 ′ ∶= {𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) : 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠2 )}. Obviously, the storage in Example 9 satisfies conditions(𝒫 ) ∪ {𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅𝑘 , 𝑠1 ), ∃𝑅 𝑖𝑛(𝑇𝑖 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 )} ∪ D(P) for 𝒫 as in Example 14, i.e. 𝒫 is a valid d-connecting path between 𝑋 and 𝑌 with respect to Z. Further, we also see that conditions(𝒫 ) ∪ {𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅𝑘 , 𝑠2 ), ∃𝑅 𝑖𝑛(𝑇𝑖 , 𝑅) ∧ 𝑏𝑒𝑙𝑜𝑛𝑔𝑠(𝑅, 𝑠1 )} ∪ D(P) yields a contradiction as by (1) each room is allowed to lie in at most one section, i.e. 𝒫 is not a valid d-connecting path between 𝑋 and 𝑌 ′ with respect to Z. With these definitions at our hand we can prove the following results: Lemma 3. Let 𝑋 and 𝑌 be context variables. 
Further, let Z be a set of context variables. Assume there exists a domain Δ of P such that ZΔ d-connects 𝑋Δ and 𝑌Δ in the ground graph GraphΔ(P). Then we obtain that Z d-connects 𝑋 and 𝑌 in the big dependence graph Graph(P).

Proof. This can be proven with the help of Lemma 1 by an induction on the structure of a d-connecting path. □

Lemma 4. Let 𝑋 and 𝑌 be context variables and let Z be a set of context variables. Further, assume that Z d-connects 𝑋 and 𝑌 in the big dependence graph Graph(P). Then there exists a domain Δ of P such that ZΔ d-connects 𝑋Δ and 𝑌Δ in the ground graph GraphΔ(P).

Proof. For every valid d-connecting path 𝒫 = atom(𝑋)𝜃 − ... − atom(𝑌)𝜃′ we obtain that

context(𝑋)𝜃 ∪ context(𝑌)𝜃′ ∪ conditions(𝒫) ∪ D(P)

is satisfiable. Now take Δ to be a Herbrand model [11, §11.3] of the skolemization of the existential formulas in the set above, with the corresponding interpretation 𝜄, and we obtain by construction that

atom(𝑋)𝜄∘𝜃 − ... − atom(𝑌)𝜄∘𝜃′

is a d-connecting path between atom(𝑋)𝜄∘𝜃 ∈ 𝑋Δ and atom(𝑌)𝜄∘𝜃′ ∈ 𝑌Δ. □

From these two lemmas we can easily deduce the next theorem, which yields the desired generalization of d-separation to open universe PLPs.

Theorem 2. Let X, Y and Z be sets of context variables. Then we obtain that Z d-separates X and Y in the big dependence graph Graph(P) if and only if ZΔ d-separates XΔ and YΔ in the ground graph GraphΔ(P) for every domain Δ of P.

If we combine Theorem 2 with Theorem 1, we get:

Corollary 1. Let X, Y and Z be sets of context variables. If Z d-separates X and Y in the big dependence graph Graph(P), we obtain that X and Y are independent conditioned on Z, i.e. X ⊥ Y | Z.

Our final goal is to show that the problem of determining whether two context variables are d-separated with respect to a given set of context variables is decidable.

4.4.
A Normal Form for d-Connecting Paths

Here we want to investigate the following setting: let us fix two context variables 𝑋 and 𝑌 together with a set of context variables Z. A priori, searching for a d-connecting path between 𝑋 and 𝑌 in the big dependence graph Graph(P) means searching for an arbitrarily long path in an infinite graph, which would render this problem undecidable. This is the reason why we establish a normal form for d-connecting paths and reduce the problem of finding a d-connecting path to that of finding one in normal form.

Definition 16 (Normal Form of a d-Connecting Path). A d-connecting path

𝒫 ∶= atom(𝑋)𝜃 = 𝑅1 − 𝑅2 − ... − 𝑅𝑛 = atom(𝑌)𝜃′

between 𝑋 and 𝑌 in the big dependence graph Graph(P) is said to be in normal form if for all 1 ≤ 𝑖 < 𝑗 ≤ 𝑛 we have that 𝑅𝑖 and 𝑅𝑗 are not unifiable, i.e. [𝑅𝑖] ≠ [𝑅𝑗].

Example 17. Consider the following valid d-connecting path between {𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) : 𝑇𝑟𝑢𝑒} and {𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇) : 𝑇𝑟𝑢𝑒} with respect to {𝑓𝑖𝑟𝑒(𝑆) : 𝑇𝑟𝑢𝑒} in the big dependence graph of Example 13:

𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) → 𝑓𝑖𝑟𝑒(𝑆) ← 𝑙𝑒𝑎𝑘𝑠(𝑇′) → 𝑓𝑖𝑟𝑒(𝑆′) ← 𝑙𝑒𝑎𝑘𝑠(𝑇) ← 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇)

Note that this path is not in normal form, as 𝑓𝑖𝑟𝑒(𝑆) and 𝑓𝑖𝑟𝑒(𝑆′) are unifiable. However, in order to see that {𝑓𝑖𝑟𝑒(𝑆) : 𝑇𝑟𝑢𝑒} d-connects {𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) : 𝑇𝑟𝑢𝑒} and {𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇) : 𝑇𝑟𝑢𝑒}, we can also take the valid d-connecting path

𝑠𝑚𝑜𝑘𝑒𝑠(𝐸, 𝑅) → 𝑓𝑖𝑟𝑒(𝑆) ← 𝑙𝑒𝑎𝑘𝑠(𝑇) ← 𝑜𝑝𝑒𝑛𝑠(𝐸, 𝑇),

which is in normal form. In Lemma 5 we generalize this observation to a calculus which transforms every valid d-connecting path into one in normal form. Finally, we reduce the problem of finding a d-connecting path between 𝑋 and 𝑌 with respect to Z to the problem of finding a d-connecting path between them in normal form.

Lemma 5. If Z d-connects 𝑋 and 𝑌, then there exists a valid d-connecting path 𝒫 in normal form between 𝑋 and 𝑌.

Proof. Assume 𝒫 ∶= atom(𝑋)𝜃 − ... − 𝑅𝑖 − ... − 𝑅𝑗 − ... − atom(𝑌)𝜃′ is a valid d-connecting path which is not in normal form, i.e.
we may assume that 𝑅𝑖 and 𝑅𝑗 are unifiable for 𝑖 < 𝑗. Hence, there exists a relabelling 𝜌 such that 𝑅𝑗𝜌 = 𝑅𝑖. By an induction on the structure of a d-connecting path one verifies that 𝒫̃ ∶= atom(𝑋)𝜃 − ... − 𝑅𝑖 = 𝑅𝑗𝜌 − ... − atom(𝑌)𝜌∘𝜃′ is also a d-connecting path between 𝑋 and 𝑌. Moreover, observe that 𝒫̃ is valid, as the conditions function in Definition 14 is monotone in the length of a d-connecting path. Now exhaustively applying the procedure above yields the desired result. □

From Lemma 5 we obtain the desired decidability of d-separation for open universe PLPs:

Theorem 3. Let 𝑋, 𝑌 be context variables and let Z be a set of context variables. Determining whether Z d-connects 𝑋 and 𝑌 is decidable.

Proof. Note that we can apply a suitable relabelling to a d-connecting path. Hence, to determine whether 𝑋 and 𝑌 are d-connected, we can proceed by Lemma 5 as follows:

1.) List all unification classes [𝑅] of specializations of atom(𝑋).
2.) List all d-connecting paths in normal form which start with an 𝑅 of step 1.).
3.) Delete all paths which do not end with a specialization of atom(𝑌).
4.) Check these paths for satisfiability.

Obviously, steps 1.), 3.) and 4.) are decidable. To achieve 2.), we build undirected paths starting from an 𝑅 of step 1.) by applying the random clauses as indicated in Definition 12. We stop as soon as we would enter the same unification class a second time. Since by Lemma 2 there are only finitely many unification classes, this method terminates. Further, we proceed analogously to check whether these paths are d-connecting paths, i.e. to check whether the property in Definition 13 holds. □

5. Conclusion

First, we introduced a new semantics for open universe probabilistic logic programs. Subsequently, we used the correspondence between ground acyclic programs and Bayesian networks to apply the method of d-separation in our setting.
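For a ground program, the d-separation test of Definition 1 on the DAG of the associated Bayesian network can be sketched as follows. This is a minimal illustration under our own assumptions: the ground graph is encoded as a dictionary mapping each node to its set of parents, and the ground atom names are invented for the example; it is not the procedure of Theorem 3, which additionally handles unification classes and satisfiability of conditions.

```python
def d_separated(x, y, z, parents):
    """Naive d-separation test (Definition 1) on a small ground DAG.

    `parents` maps each node to the set of its parents; `z` is the
    conditioning set. We enumerate all simple undirected paths between
    x and y and check whether any of them is d-connecting.
    """
    children = {}
    for node, ps in parents.items():
        for p in ps:
            children.setdefault(p, set()).add(node)
    nodes = set(parents) | set(children)
    neighbours = {n: set(parents.get(n, ())) | children.get(n, set())
                  for n in nodes}

    def descendants(node):
        # All nodes reachable from `node` via directed edges, `node` included.
        seen, stack = set(), [node]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(children.get(n, ()))
        return seen

    def d_connecting(path):
        for r1, r2, r3 in zip(path, path[1:], path[2:]):
            collider = r1 in parents.get(r2, ()) and r3 in parents.get(r2, ())
            if collider:
                # Condition ii): a collider needs a descendant (or itself) in z.
                if not descendants(r2) & z:
                    return False
            elif r2 in z:
                # Condition i): a non-collider is blocked by conditioning on it.
                return False
        return True

    stack = [[x]]
    while stack:
        path = stack.pop()
        if path[-1] == y:
            if d_connecting(path):
                return False
            continue
        stack.extend(path + [n] for n in neighbours[path[-1]] if n not in path)
    return True
```

On a ground instance of the gas-leak scenario, smokes(e1,r1) and opens(e2,t1) are d-separated by the empty set, since the only undirected path is blocked at the collider fire(s1), but become d-connected once we condition on fire(s1), mirroring the d-connecting path of Example 14.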
In particular, we assigned to each open universe program an infinite directed labelled graph, the big dependence graph. Finally, we saw that in order to deduce independence statements it is sufficient to search the big dependence graph for d-connecting paths in normal form, and that searching for those paths is a decidable problem.

In the future it would be interesting to extend the notion of a program by incorporating Prolog programs. This would reflect logic programming inside probabilistic logic programming. Further, it would also be desirable to bring our semantics closer to the LPAD semantics. We believe that an independence analysis for LPAD-programs at compilation time can be used to build modified binary decision trees as in [8] in order to speed up (lifted) inference. Moreover, we plan to construct a causal structure discovery algorithm for open universe LPAD-programs on the basis of the big dependence graph.

References

[1] T. Verma, J. Pearl, Causal Networks: Semantics and Expressiveness, in: Uncertainty in Artificial Intelligence, volume 9 of Machine Intelligence and Pattern Recognition, North-Holland, 1990, pp. 69–76. doi:10.1016/B978-0-444-88650-7.50011-1.
[2] J. Vennekens, S. Verbaeten, Logic Programs with Annotated Disjunctions, Technical Report CW 368, Katholieke Universiteit Leuven, Department of Computer Science, 2003. URL: https://www.cs.kuleuven.be/publicaties/rapporten/cw/CW368.abs.html.
[3] D. Geiger, T. Verma, J. Pearl, Identifying Independence in Bayesian Networks, Networks 20 (1990) 507–534. doi:10.1002/net.3230200504.
[4] P. Spirtes, C. Glymour, R. Scheines, Causation, Prediction, and Search, 2nd ed., MIT Press, 2000. URL: https://www.researchgate.net/publication/242448131_Causation_Prediction_and_Search.
[5] T. Verma, J.
Pearl, Equivalence and Synthesis of Causal Models, in: Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, UAI ’90, Elsevier Science Inc., USA, 1990, pp. 255–270. URL: https://dl.acm.org/doi/10.5555/647233.719736.
[6] C. Meek, Causal inference and causal explanation with background knowledge, in: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, UAI ’95, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1995, pp. 403–410. URL: https://dl.acm.org/doi/10.5555/2074158.2074204.
[7] M. Maier, Causal Discovery for Relational Domains: Representation, Reasoning, and Learning, Doctoral Dissertations 279, University of Massachusetts Amherst, 2014. URL: https://scholarworks.umass.edu/dissertations_2/279/.
[8] S. Holtzen, G. Van den Broeck, T. Millstein, Scaling exact inference for discrete probabilistic programs, Proc. ACM Program. Lang. 4 (2020). doi:10.1145/3428208.
[9] J. Pearl, Causality, 2nd ed., Cambridge University Press, Cambridge, UK, 2009. doi:10.1017/CBO9780511803161.
[10] D. Geiger, J. Pearl, On the logic of causal models, in: Uncertainty in Artificial Intelligence, volume 9 of Machine Intelligence and Pattern Recognition, North-Holland, 1990, pp. 3–14. doi:10.1016/B978-0-444-88650-7.50006-8.
[11] H. Ebbinghaus, J. Flum, W. Thomas, Einführung in die mathematische Logik, 6th ed., Springer Spektrum, Berlin, 2018. URL: https://www.springer.com/de/book/9783662580288.