<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reasoning about Independence in Open Universe Probabilistic Logic Programs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kilian Rückschloß</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felix Weitkämper</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>In the 1980's J. Pearl and T. Verma [1] developed a graphical criterion to reason about probabilistic independence in Bayesian networks, which is famously known as d-separation. As J. Vennekens and S. Verbaeten [2, §5.2] pointed out in 2003, ground acyclic probabilistic logic programs correspond to Bayesian networks. Hence, we can pass from an acyclic ground program to the associated Bayesian network in order to derive independence statements from d-separation. In the present paper we lift this procedure so as to reason about independence in open universe probabilistic logic programs.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>A path π in a DAG G is called d-connecting with respect to a set of nodes Z if the following assertions hold for every triple ... − v1 − v2 − v3 − ... in π:
i) If v1 ← v2 or v2 → v3 in π, we have that v2 ∉ Z.
ii) If v1 → v2 ← v3 in π, there exists a directed path v2 → ... → z from v2 to a z ∈ Z.</p>
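      <p>The two conditions above can be checked mechanically on a finite DAG. The following sketch is ours, not from the paper; all helper names are invented. It tests whether a given undirected path is d-connecting with respect to a set of observed nodes Z:</p>

```python
# Sketch (not from the paper): test the two conditions of a d-connecting
# path on a DAG given as a set of directed edges (parent, child).
def descendants(node, edges):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        for a, b in edges:
            if a == n and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def d_connecting(path, edges, Z):
    """True iff the undirected `path` (a list of nodes) is d-connecting w.r.t. Z."""
    for v1, v2, v3 in zip(path, path[1:], path[2:]):
        if (v1, v2) in edges and (v3, v2) in edges:
            # collider v1 -> v2 <- v3: needs a directed path from v2 into Z
            if not ({v2} | descendants(v2, edges)) & set(Z):
                return False
        elif v2 in Z:
            # non-collider: must not be observed
            return False
    return True

E = {("h1", "h2"), ("h2", "h3")}                     # the chain h1 -> h2 -> h3
print(d_connecting(["h1", "h2", "h3"], E, set()))    # True: path is open
print(d_connecting(["h1", "h2", "h3"], E, {"h2"}))   # False: blocked by h2
```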
      <p>Remark 1. The "d" in the term "d-separation" is a shorthand for "directional" ([3]).</p>
      <p>The intuition behind this definition is that d-connecting paths indicate dependencies and indeed one obtains:</p>
      <p>PLP 2021: The 8th Workshop on Probabilistic Logic Programming, September 20-27, 2021, Porto. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>Theorem 1. Consider a BN with underlying DAG G and let X, Y and Z be sets of random variables. If X and Y are d-separated by Z in G, then X and Y are independent, conditioned on Z.
Remark 2. Note that the converse of Theorem 1 fails in general. It is referred to as the causal faithfulness assumption and plays a central role in causal structure discovery [4, §4]. We believe that formulating and verifying the causal faithfulness assumption for open universe PLPs would be an interesting direction for future work.</p>
      <p>Example 1. Assume we are given a BN with underlying DAG ℎ1 → ℎ2 → ℎ3. In this case Theorem 1 yields without further calculations that ℎ1 and ℎ3 become independent, once we observe ℎ2.</p>
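      <p>For readers who prefer a concrete check, the following sketch (ours and purely illustrative; the numeric parameters are invented) verifies the conditional independence of Example 1 by brute force over the joint distribution of the chain:</p>

```python
# Numeric sanity check (illustrative, parameters invented): in a chain
# h1 -> h2 -> h3, h1 and h3 become independent once h2 is observed.
from itertools import product

p_h1 = 0.3                       # P(h1 = 1)
p_h2 = {0: 0.2, 1: 0.9}          # P(h2 = 1 | h1)
p_h3 = {0: 0.1, 1: 0.7}          # P(h3 = 1 | h2)

def joint(h1, h2, h3):
    """Joint probability of the chain BN, factorized along h1 -> h2 -> h3."""
    p = p_h1 if h1 else 1 - p_h1
    p *= p_h2[h1] if h2 else 1 - p_h2[h1]
    p *= p_h3[h2] if h3 else 1 - p_h3[h2]
    return p

# P(h1=1, h3=1 | h2=1) should factor as P(h1=1 | h2=1) * P(h3=1 | h2=1).
z = sum(joint(a, 1, c) for a, c in product((0, 1), repeat=2))
p13 = joint(1, 1, 1) / z
p1 = sum(joint(1, 1, c) for c in (0, 1)) / z
p3 = sum(joint(a, 1, 1) for a in (0, 1)) / z
print(abs(p13 - p1 * p3) < 1e-12)  # True
```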
      <p>In probabilistic logic programming (PLP) J. Vennekens and S. Verbaeten established a correspondence between ground acyclic programs and BNs [2, §5.2]. In particular, a ground acyclic program without function symbols induces a distribution, which can be stored in a Boolean BN on the dependency graph. This gives us the possibility to derive statements about independence by considering only the program structure:
Example 2. Let us for instance consider the following acyclic ground LPAD-program:
ℎ1 : p1 ←
ℎ2 : p2 ← ℎ1
ℎ3 : p3 ← ℎ2
We see that the induced distribution can be stored in a BN with underlying graph ℎ1 → ℎ2 → ℎ3, i.e. by Example 1 we can immediately conclude that ℎ1 and ℎ3 become independent if we observe ℎ2.</p>
      <p>In this paper we want to transfer the reasoning of Example 2 to open universe PLPs, which
then correspond to families of BNs. Since the field of causal structure discovery for BNs heavily
relies on d-separation ([5], [6], [7]), we believe that such an independence analysis is a first step
towards a new PLP-based relational causal structure discovery algorithm. Moreover, considering
the work of S. Holtzen et al. [8], it also seems that such an analysis may contribute to speeding
up (lifted) inference in PLP. Let us now proceed to the formulation of the main problem in this
paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Formulation of the Problem</title>
      <p>Let us consider an open universe LPAD-program P, which is given by the clauses
hi : pi ← b1(i), ..., bmi(i)
for 1 ≤ i ≤ n, i.e. the hi are atoms, b1(i), ..., bmi(i) are literals and pi ∈ [0, 1] is a real number, denoting the probability for the clause to hold. Further, assume that C1, C2 and C3 are finite sets of literals, which take probabilities 0 or 1 in every realization of the program P, and let us choose head atoms hi1, hi2 and hi3. For a given realization Π of P we denote by {hiΠ : Ci} the definable set of all ground atoms, i.e. random variables, which we obtain from hi under the condition that the literals in Ci hold true. A more formal definition of the sets {hiΠ : Ci} will be given in Section 4.1. This paper investigates the following question: Are the sets {hi1Π : C1} and {hi2Π : C2} independent, conditioned on {hi3Π : C3}, in every realization Π of the program P?</p>
Example 3. Let us consider storages, which consist of two sections  1 and  2. Further, for each
storage a database (, ) tells us, which rooms  belong to which section  . Since  1 and  2
are the only sections and each room lies in one section, we obtain the following constraint:
∀ ¬ ((, 
1) ∧ (, 
2))
Moreover, in each room  we find tanks  , denoted by ( , )
denoted by  ( , ) . Finally, we assume a database  ()
liquids.</p>
      <p>An employee  opens a tank  with probability  , denoted by (,  )
and each tank  stores a liquid  ,
, indicating the inflammable
, i.e. we obtain:
Further, if employee  opens tank  , it may be with probability  that  doesn’t close the tank 
properly, which then causes the tank  to leak, denoted by ( ) :</p>
      <p>Moreover, an employee e smokes in a room r, denoted by smokes(e, r), with probability p_s:
smokes(e, r) : p_s ←   (4)
If employee e smokes in a room r, which contains a leaking tank storing an inflammable liquid, this causes a fire in the corresponding section with probability p_f:
fire(s) : p_f ← smokes(e, r), leak(t), in(t, r), in(r, s), stores(t, l), flammable(l)   (5)
In this paper we present a reasoning, which verifies for instance that the event
{smokes(e, r) : e employee, r room, in(r, s1)}
of an employee smoking in a room of section s1 is independent of the event
{opens(e, t) : e employee, t tank, r room, in(t, r), in(r, s1)}
of an employee opening a tank in section s1. Further, we discuss how things change once we observe a fire or a leaking tank in s1.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. A Formalism for our Solution</title>
      <p>Recall that events with the trivial probabilities 0 and 1 are independent of every other event. To overcome this obstruction we introduce a language, which separates domain predicates, denoting logical statements with probabilities 0 and 1, from random predicates, denoting events with a probability lying strictly between 0 and 1.</p>
      <sec id="sec-3-1">
        <title>3.1. Syntax</title>
        <p>We consider a language in two parts: a probabilistic vocabulary 𝒫 and a domain vocabulary 𝒟 ⊆ 𝒫. Here 𝒫 is a finite multisorted relational vocabulary, i.e. it consists of a finite set of sorts, a finite set of relation symbols and a finite set of constants in those sorts, as well as a countably infinite set of variables in every sort. For every variable or constant x we will denote its sort by sort(x). Further, 𝒟 is a subvocabulary of 𝒫, which contains all of the variables and constants of 𝒫 as well as a (possibly empty) subset of the relation symbols of 𝒫.
Example 4. In Example 3 the vocabulary 𝒟 consists of the predicates in(r, s), in(t, r), stores(t, l) and flammable(l), which we assume to be given by databases.</p>
        <p>From now on we fix vocabularies 𝒟 ⊆ 𝒫 as above. As usual, an atom is an expression of the form R(t1, …, tn), where R is a relation symbol and t1 to tn are constants or variables of the correct sorts, and a literal is an expression of the form a or ¬a for an atom a. It is called a domain atom or literal if R is in 𝒟 and a random atom or literal if R is in 𝒫 ⧵ 𝒟. A literal of the form a is called positive and a literal of the form ¬a is called negative.
Example 5. In Example 4 in(r, s1) is a domain atom, whereas smokes(e, r) is a random atom.</p>
        <p>Formulas, as well as existential and universal formulas are defined as usual in first-order
logic. Moreover, we will use the operator var(_) to refer to the free variables in an expression.</p>
        <p>The domain vocabulary will be used to formulate constraints and conditions for our PLP. The purpose of a PLP, however, is to define distributions for the random variables determined by the language 𝒫. This is done by so-called random clauses, i.e. LPAD-clauses with one head atom and only random literals in their body, which we additionally annotate by existential domain formulas to reflect the circumstances under which they are available.</p>
        <sec id="sec-3-1-1">
          <title>Definition 2 (Random Clause).</title>
          <p>A random clause C is an expression of the form
(h : p ← b1, ..., bm) ⇐ 𝒞
which is given by
i) a random atom effect(C) := h, called the effect of C
ii) a possibly empty set of random literals causes(C) := {b1, ..., bm}, called the causes of C
iii) a condition condition(C) := 𝒞 of C, which is an existential domain formula, such that all free variables of 𝒞 also occur in at least one of h, b1, ..., bm
iv) a real number π(C) := p ∈ [0, 1], called the probability of C.</p>
          <p>Finally, we define the set var(C) of free variables occurring in C to be
var(effect(C)) ∪ ⋃{var(b) : b ∈ causes(C)}.</p>
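          <p>A random clause can be represented directly as a small data structure. The sketch below is our illustration, not the authors' implementation; the class and field names, the Prolog-style convention that variables start with an uppercase letter, and the probability value are all assumptions:</p>

```python
# Illustrative encoding of Definition 2 (names and conventions invented).
from dataclasses import dataclass

def is_var(name):
    """Prolog-style convention (assumption): variables start uppercase."""
    return name[0].isupper()

@dataclass(frozen=True)
class Atom:
    pred: str
    args: tuple        # variable or constant names

    def variables(self):
        return {a for a in self.args if is_var(a)}

@dataclass(frozen=True)
class RandomClause:
    effect: Atom       # the head atom h
    causes: frozenset  # of (sign, Atom) literal pairs
    condition: str     # the existential domain formula, kept symbolic here
    prob: float        # p in [0, 1]

    def variables(self):
        """var(C) = var(effect(C)) together with the variables of all causes."""
        vs = set(self.effect.variables())
        for _, a in self.causes:
            vs |= a.variables()
        return vs

# Clause (6) of the running example, with an invented probability 0.05:
c6 = RandomClause(
    effect=Atom("fire", ("S",)),
    causes=frozenset({("+", Atom("smokes", ("E", "R"))),
                      ("+", Atom("leak", ("T",)))}),
    condition="exists L: in(T,R) & in(R,S) & stores(T,L) & flammable(L)",
    prob=0.05,
)
print(sorted(c6.variables()))  # ['E', 'R', 'S', 'T']
```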
          <p>Remark 3. In i) and ii) of Definition 2 we use the terminology of cause and effect to reflect that a ground program with Problog semantics denotes a structural equation model in the sense of [9, §1.4].</p>
          <p>Example 6. To reason about the independence statements implied by the program in Example 3, we rephrase the clause (5), i.e. we arrange all domain literals in the body on the right side of ⇐ and quantify over all variables, not occurring in the remaining literals. Hence, we obtain the following random clause C:
(fire(s) : p_f ← smokes(e, r), leak(t)) ⇐ ∃l (in(t, r) ∧ in(r, s) ∧ stores(t, l) ∧ flammable(l))   (6)
In the future examples we will refer to the program P consisting of the clauses (1), (2), (3), (4) and (6) with the convention that we denote a clause of the form
(effect(C) : π(C) ← causes(C)) ⇐ ⊤
by effect(C) : π(C) ← causes(C).</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Semantics</title>
        <p>After having established the necessary syntax we introduce the corresponding semantics. Let us begin with the domain expressions.
3.2.1. Semantics of Domain Expressions
The semantics of domain expressions are given in a straightforward way.</p>
        <p>We just want to highlight the unique names assumption in our definition of a structure: A 𝒟-structure Δ consists of a sort universe SΔ for every sort S, an element of the corresponding sort universe for every constant in 𝒟, such that two different constants are mapped to different elements, and a relation among the corresponding sort universes for every relation in 𝒟.</p>
        <p>Whether a domain formula is satisfied by a given 𝒟-structure (under a given interpretation of its free variables) is determined by the usual rules of first-order logic.
3.2.2. Semantics of Probabilistic Expressions
Let us proceed with the semantics for probabilistic expressions. As a 𝒫-structure should extend a 𝒟-structure by random variables we obtain:</p>
        <sec id="sec-3-2-1">
          <title>Definition 3 (Ground Variable &amp;  -Structure).</title>
          <p>i) Let Δ be a 𝒟-structure. A ground variable x := R(a1, ..., an) consists of a random predicate pred(x) := R of arity ar(R) := (S1, ..., Sn), called the predicate of x, and a tuple of suitable arguments arguments(x) := (a1, ..., an) ∈ ∏i=1..n SiΔ, called the arguments of x.
ii) A 𝒫-structure Π consists of a 𝒟-structure Δ, and a distinct, non-trivially distributed, Boolean random variable xΠ := RΠ(a1, ..., an) for every ground variable x = R(a1, ..., an).
Remark 4. Please note that in ii) of Definition 3 we also require a version of the unique names assumption.</p>
          <p>Now we can make sense of the low level constructs in the language 𝒫.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>Definition 4 (Semantics of Random Literals).</title>
          <p>Let Π be a 𝒫-structure. We define for every random atom a := R(t1, ..., tn) and for every variable interpretation ι on var(a) the following Boolean random variables:
(R(t1, ..., tn))ι := RΠ(t1ι, ..., tnι)</p>
          <p>(¬R(t1, ..., tn))ι := ¬RΠ(t1ι, ..., tnι).</p>
          <p>3.2.3. Probabilistic Logic Programs
For the semantics of clauses and programs we choose Problog semantics, since a ground Problog program is essentially the same as a structural equation model in the sense of [9, §1.4]. We start with the definition of a program:</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>Definition 5 (Program).</title>
          <p>A program P := (R, D) is a pair, which consists of
i) a finite set of universal domain formulas D
ii) a finite set of random clauses R with the following property:</p>
          <p>For every random clause C ∈ R, every random literal b ∈ causes(C), every 𝒟-structure Δ and every interpretation ι such that Δ satisfies D and the condition of C under ι, there exists a sequence of random clauses C0, ..., Ck ∈ R such that the condition of Ci is satisfied by Δ under ι for all 1 ≤ i ≤ k, head(Ci+1) ∈ causes(Ci) for all 0 ≤ i &lt; k, head(C0) = b and causes(Ck) = ∅.</p>
          <p>In this case R(P) := R is called the random part and D(P) := D is called the deterministic part of the program P. Moreover, a 𝒟-structure Δ is called a domain of P if and only if Δ ⊧ D(P), i.e. if Δ satisfies the deterministic part of P.</p>
          <p>Remark 5. Note that the property in Definition 5 ii) is needed to ensure that we obtain a well-defined distribution in Definition 7.</p>
          <p>
            Example 7. In Example 6 we obtain that the deterministic part D(P) of P consists of the
constraint (
            <xref ref-type="bibr" rid="ref1">1</xref>
            ). The remaining clauses of P form the random part R(P) of P.
          </p>
          <p>In the present paper we want to reason on a syntactic level about the independence statements, which are implied by a program P. To obtain a reasonable criterion we proceed as in the propositional theory [10] and reduce ourselves to those independence statements, which follow from d-separation in the corresponding BNs. Now the ground graphs are the DAGs, in which we want to apply d-separation.</p>
        </sec>
        <sec id="sec-3-2-4">
          <title>Definition 6 (Ground Graph &amp; Acyclic Program).</title>
          <p>Let P be a program. For every domain Δ of P we define the ground graph GraphΔ(P) to be the directed graph, obtained by the following procedure:</p>
          <p>For every random clause C ∈ R(P), for every cause b ∈ causes(C) and for every interpretation ι on var(C), satisfying Δ ⊧ condition(C)ι, we draw an arrow effect(C)ι ← bι.</p>
          <p>In this case we say that the edge effect(C)ι ← bι is induced by the clause C.
The program P is called acyclic if the ground graph GraphΔ(P) is acyclic for all domains Δ of P.
Example 8. It is easy to see that the program P of Example 6 is an acyclic program, since there are only arrows from the variables smokes(_, _) and leak(_) to fire(_) and from the variables opens(_, _) to leak(_).</p>
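          <p>The grounding procedure of Definition 6 can be sketched in code as follows. This is our illustration, not the authors' implementation: the helper names and the clause encoding are invented, and for brevity we treat only positive causes, all-variable argument lists and an opaque condition test:</p>

```python
# Illustrative sketch of Definition 6 (encoding invented): ground each
# clause over a finite domain and draw an edge cause -> effect whenever
# the domain structure satisfies the clause's condition.
from itertools import product

def ground_graph(clauses, domain, satisfies):
    """clauses: list of (effect, causes, condition, sorts) with flat atom
    tuples like ("leak", "T"); domain: sort name -> list of constants;
    satisfies(condition, binding) decides the condition under a binding."""
    edges = set()
    for effect, causes, condition, sorts in clauses:
        variables = sorted(sorts)                     # clause variables
        for values in product(*(domain[sorts[v]] for v in variables)):
            binding = dict(zip(variables, values))
            if satisfies(condition, binding):
                ground = lambda atom: (atom[0],
                                       tuple(binding[v] for v in atom[1:]))
                for cause in causes:
                    edges.add((ground(cause), ground(effect)))
    return edges

# Toy instance of clause (3): leak(T) <- opens(E, T), trivial condition.
clauses = [(("leak", "T"), [("opens", "E", "T")], None,
            {"E": "employee", "T": "tank"})]
dom = {"employee": ["e1"], "tank": ["t1", "t2"]}
E = ground_graph(clauses, dom, lambda cond, b: cond is None)
print(sorted(E))
```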
          <p>Finally, we are in the position to give the semantics of an acyclic program.</p>
          <p>Definition 7 (Semantics of Acyclic Programs). Let P be an acyclic program. Assume the random part R(P) of P consists of the random clauses Cj, which are given by
(hj : pj ← b1(j), ..., bmj(j)) ⇐ 𝒞j
for 1 ≤ j ≤ n. Further, let Δ be a domain of P. We define the grounding PΔ of the program P w.r.t. Δ to be the Boolean functional causal model [9, §1.4] on the set of random variables
ℛ(Δ) := {effect(Cj)ι : 1 ≤ j ≤ n, ι interpretation on var(Cj) such that Δ ⊧ condition(Cj)ι},
which is given by the following equation for every X ∈ ℛ(Δ):</p>
          <p>X := ⋁ { ⋀k=1..mj (bk(j))ι ∧ U(Cj, ι) : Cj ∈ R(P), ι interpretation on var(Cj) with effect(Cj)ι = X and Δ ⊧ condition(Cj)ι },
where U(Cj, ι) is a Boolean random variable with the distribution
ℙ(U(Cj, ι) = ⊤) := pj = π(Cj)
for every interpretation ι on var(Cj). Besides, the random variables U(Cj, ι) are assumed to be mutually independent for every random clause Cj, every interpretation ι on var(Cj) and every 1 ≤ j ≤ n.</p>
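          <p>Under the acyclicity assumption, the functional causal model PΔ can be forward-sampled in topological order. The following sketch is ours (atom names, probabilities and the encoding are invented); each switch U(C, ι) is drawn as an independent Bernoulli variable with success probability π(C):</p>

```python
# Forward sampling of a ground Boolean functional causal model (sketch).
import random

def sample(ground_clauses, order, rng):
    """ground_clauses: effect atom -> list of (causes, prob) groundings;
    order: a topological order of the ground graph."""
    value = {}
    for x in order:
        v = False
        for causes, prob in ground_clauses.get(x, []):
            u = rng.random() < prob                 # the switch U(C, iota)
            v = v or (u and all(value[c] for c in causes))
        value[x] = v
    return value

rng = random.Random(0)
model = {"opens(e1,t1)": [([], 0.5)],
         "leak(t1)":     [(["opens(e1,t1)"], 0.8)]}
order = ["opens(e1,t1)", "leak(t1)"]
n = 10_000
hits = sum(sample(model, order, rng)["leak(t1)"] for _ in range(n))
print(0.3 < hits / n < 0.5)   # expect a frequency near 0.5 * 0.8 = 0.4
```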
          <p>Notation 1. We say that a 𝒫-structure Π is generated by a program P and write Π ⊧ P if the following assertions hold:
i) the underlying 𝒟-structure Δ is a domain of P
ii) the joint distribution on the random variables effect(C)ι for C ∈ R(P), ι interpretation with Δ ⊧ condition(C)ι coincides with the one induced by the functional causal model PΔ.
In this case Π is called a (probabilistic) realization of P.</p>
          <p>Remark 6. Note that the semantics of Definition 7 do not coincide with the classical LPAD semantics. However, our semantics still cause the same BN structures after grounding. Further, note that the functional causal model PΔ has the causal diagram GraphΔ(P) for every domain Δ of the program P, i.e. by [9, §1.4] we obtain that PΔ induces a distribution, which can be stored in a BN with the ground graph GraphΔ(P) as underlying DAG.</p>
          <p>Definition 8 (Context Variables). A context variable V := {atom(V) : context(V)} is a pair, which consists of a random atom atom(V), called the atom of V, and an existential formula context(V) with var(context(V)) ⊆ var(atom(V)), called the context of V.</p>
          <p>Moreover, we associate to every 𝒫-structure Π the following set of random variables:
VΠ := {atom(V)ι : ι interpretation on var(atom(V)) such that the underlying 𝒟-structure of Π satisfies context(V)ι}.
For instance, {leak(t) : ∃r ∃l (in(t, r) ∧ in(r, s1) ∧ stores(t, l) ∧ flammable(l))} is a context variable, denoting the event that in section s1 there is a leaking tank, which stores an inflammable liquid.
Definition 9 (Independence of Context Variables). We say that two sets of context variables X and Y are independent, conditioned on a third set of context variables Z, if for every realization Π ⊧ P we have that XΠ is independent of YΠ, whenever we condition on ZΠ. In this case we write X ⫫ Y | Z.</p>
          <p>To transfer d-separation from the ground graphs GraphΔ(P) to an open universe we need a
representation for all d-connecting paths that might occur in a specific ground graph GraphΔ(P)
for a domain Δ of P. Before we introduce this representation in Section 4.3, we still need to lift
the notion of an interpretation to the open universe setting.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>4.2. Substitutions</title>
        <p>A substitution is the open universe analogue of an interpretation.</p>
        <p>Definition 10 (Substitution). A substitution σ is a function that associates to each x from a finite set of variables Dom(σ) a variable or constant xσ of the same sort.</p>
        <p>If V ⊆ Dom(σ), we call σ a substitution on V. We extend σ to a function on atoms, clauses and formulas. To avoid variables being substituted into or out of the scope of a quantifier, first rename all bound variable occurrences in an expression E using variable names outside the domain and the range of σ. Then obtain Eσ by applying σ to every remaining occurrence of the variables of Dom(σ) in E.</p>
        <p>An expression F is called a specialization of an expression E if there exists a substitution σ such that Eσ = F.</p>
        <p>Example 11. Let us reconsider Example 6. Further, let e and e′ be variables of the sort employee, let s be a variable of the sort section and let r be a variable of the sort room. Then eσ := e′, sσ := s1 and rσ := r defines a substitution σ on the set {e, s, r} and we obtain:
smokes(e, r)σ = smokes(e′, r), fire(s)σ = fire(s1), fire(s2)σ = fire(s2)</p>
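        <p>Applying a substitution to a quantifier-free atom, as in the example above, can be sketched as follows (our illustration; the flat atom encoding and the convention that constants are simply names absent from the substitution's domain are assumptions):</p>

```python
# Sketch: applying a substitution to an atom. Names absent from `sigma`
# (in particular constants) are left unchanged.
def apply(atom, sigma):
    pred, args = atom
    return (pred, tuple(sigma.get(a, a) for a in args))

# The substitution of Example 11: E -> E2, S -> s1, R -> R.
sigma = {"E": "E2", "S": "s1", "R": "R"}
print(apply(("smokes", ("E", "R")), sigma))   # ('smokes', ('E2', 'R'))
print(apply(("fire", ("S",)), sigma))         # ('fire', ('s1',))
print(apply(("fire", ("s2",)), sigma))        # constants stay: ('fire', ('s2',))
```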
        <p>As a preparation for the definition of the big dependence graph in Section 4.3 we need the
following lemma about the existence of substitutions. Since it is an immediate consequence of
the unique names assumption, its proof is omitted here.</p>
        <p>When we come to decidability in Section 4.4, we consider the following special substitutions:
Definition 11. A substitution σ on V is called a relabelling if it is injective and for all variables x ∈ Dom(σ) we have that xσ is again a variable. We say that two atoms a and b are unifiable and write a ∼ b if there exists a relabelling σ such that aσ = b.</p>
        <p>Notation 2. Note that unifiability ∼ is an equivalence relation. We denote by [a] the equivalence class of a random atom a up to unifiability. Further, we call [a] the unification class of a.
Example 12. Note that σ in Example 11 is not a relabelling, whereas eσ := e′, sσ := s and rσ := r would define a relabelling on {e, s, r}.</p>
        <p>As it turns out the decidability of our independence analysis relies on the following lemma:
Lemma 2. There are finitely many unification classes of random atoms in 𝒫. Moreover, determining whether two random atoms are unifiable is decidable.
Proof. Note that for every two random atoms we have that R1(t1, ..., tn) ∼ R2(u1, ..., um) if and only if the following assertions hold: R1 = R2 and n = m; for all 1 ≤ i ≤ n the argument ti is a constant if and only if ui is, in which case ti = ui; and for all 1 ≤ i, j ≤ n we have ti = tj if and only if ui = uj.</p>
        <p>As there are only finitely many random predicates of finite arity and finitely many constants in 𝒫, we obtain the desired result. □</p>
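        <p>The characterization in the proof of Lemma 2 yields a simple decision procedure for unifiability. The sketch below is ours (the atom encoding and the uppercase-variable convention are assumptions): two atoms are unifiable iff they share predicate and arity, agree literally on constants, and their variable occurrence patterns are related by an injective renaming:</p>

```python
# Sketch of the unifiability test behind Lemma 2.
def unifiable(a, b, is_var):
    (p, args_a), (q, args_b) = a, b
    if p != q or len(args_a) != len(args_b):
        return False
    fwd, bwd = {}, {}
    for x, y in zip(args_a, args_b):
        if is_var(x) != is_var(y):
            return False
        if not is_var(x):
            if x != y:                 # constants must agree literally
                return False
            continue
        # the relabelling x -> y must be a bijection on the variables seen so far
        if fwd.setdefault(x, y) != y or bwd.setdefault(y, x) != x:
            return False
    return True

is_var = lambda name: name[0].isupper()   # assumption: variables start uppercase
print(unifiable(("smokes", ("E", "R")), ("smokes", ("E1", "R1")), is_var))  # True
print(unifiable(("fire", ("S",)), ("fire", ("s1",)), is_var))   # False: s1 is a constant
print(unifiable(("opens", ("E", "E")), ("opens", ("E1", "E2")), is_var))    # False: not injective
```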
        <p>Now let us move on to the big dependence graph, which turns out to be the right object for
reasoning about d-separation in our setting.</p>
      </sec>
      <sec id="sec-3-4">
        <title>4.3. The Big Dependence Graph</title>
        <p>As already mentioned, in order to reason about d-separation in an open universe we need a structure, which allows for a representation of all d-connecting paths that might occur in a ground graph GraphΔ(P) for a possible domain Δ of the program P.</p>
        <sec id="sec-3-4-1">
          <title>Definition 12 (Big Dependence Graph).</title>
          <p>The big dependence graph Graph(P) is the directed labelled graph, obtained by the following procedure:</p>
          <p>For each random clause C ∈ R(P), for each substitution σ on var(C) and for every cause b ∈ causes(C) we draw an edge bσ → effect(C)σ. An edge of this kind is said to be induced by C. Moreover, we label each edge bσ → effect(C)σ, which is induced by the random clause C ∈ R(P), with the existential formula 𝒞(bσ → effect(C)σ) := ∃y condition(C)σ. Here y denotes the set of variables, which occur in condition(C)σ but not in effect(C)σ and bσ.</p>
          <p>Remark 7. One should read 𝒞(bσ → effect(C)σ) as follows: "For every interpretation ι on the free variables of the edge, the edge bσι → effect(C)σι exists in the ground graph GraphΔ(P) if 𝒞(bσ → effect(C)σ)ι holds true."</p>
          <p>Example 13. Let us reconsider the program P in Example 6. Moreover, let (si)i∈ℕ, (ri)i∈ℕ, (ti)i∈ℕ and (ei)i∈ℕ be enumerations of the variables of sort section, room, tank and employee, respectively. In that case, the big dependence graph Graph(P) is the infinite graph with the nodes fire(si), fire(sk), leak(ti), smokes(ei, rj), opens(ei, tj) for i, j ∈ ℕ, k = 1, 2 and with the edges:
smokes(ei, rj) → fire(s) labelled with ∃t ∃l (in(t, rj) ∧ in(rj, s) ∧ stores(t, l) ∧ flammable(l)),
leak(ti) → fire(s) labelled with ∃r ∃l (in(ti, r) ∧ in(r, s) ∧ stores(ti, l) ∧ flammable(l)),
opens(ei, tj) → leak(tj) labelled with ⊤,
where s ranges over the variables si and the constants s1, s2.</p>
          <p>The next step will be to establish an analogue of d-separation on the big dependence graph Graph(P). This criterion will consist of two components: A d-connecting path in Graph(P) is a graph-theoretical object, which indicates possible d-connecting paths that might occur in a ground graph GraphΔ(P) for a domain Δ of the program P. Further, we assign to each d-connecting path in the big dependence graph Graph(P) a set of conditions, which tells us under which assumptions on the domain Δ we expect this path to appear in the corresponding ground graph GraphΔ(P).</p>
          <p>Definition 13 (d-Connecting Path). Let a and b be random atoms. Further, let Z be a set of context variables. A d-connecting path between a and b with respect to Z is a finite undirected path of the form π = aσ − ... − bσ′ for substitutions σ and σ′ on var(a), respectively var(b), with the following property:</p>
          <p>For every collider ... → X ← ... in π there exists a directed path of the form π(X) := X → ... → atom(Z)τ(X) in Graph(P) for a context variable Z ∈ Z and a substitution τ(X) on var(atom(Z)).</p>
          <p>We still need to find out under which conditions we expect a given d-connecting path in the big dependence graph Graph(P) to occur in a ground graph GraphΔ(P). This is done by the following conditions function, which we define by recursion on the structure of a d-connecting path:
Definition 14 (Conditions Function). As before we assume that a and b are random atoms. Further, we choose a set of context variables Z.</p>
          <p>i) If π = aσ is a d-connecting path of length 1 in Graph(P), we set conditions(π) := ∅.
ii) If π = aσ − bσ′ is a d-connecting path of length 2 in Graph(P), we set conditions(π) := {𝒞(aσ − bσ′)}.
iii) Assume we have two d-connecting paths π′ := aσ − ... − U − X and π″ := bσ′ − ... − W − X such that U ← X or X → W. Then by Definition 13 we obtain a d-connecting path π := aσ − ... − U − X − W − ... − bσ′, for which we define conditions(π) to be the following set:
conditions(π′) ∪ conditions(π″) ∪ {¬context(Z)τ : Z ∈ Z, τ substitution, atom(Z)τ = X}
iv) Finally, assume that the following two d-connecting paths, π′ := aσ − ... − U → X and π″ := bσ′ − ... − W → X, yield a d-connecting path π := aσ − ... − U → X ← W − ... − bσ′. Then by Definition 13 there exists a directed path π(X) := X → ... → atom(Z)τ(X) for a context variable Z ∈ Z and a substitution τ(X). In this case we define conditions(π) to be the following set:</p>
          <p>conditions(π′) ∪ conditions(π″) ∪ {𝒞(E) : E edge of π(X)} ∪ {context(Z)τ(X)}
Example 15. In the case of the d-connecting path π in Example 14 we obtain for conditions(π) the following set:
{∃t ∃l (in(t, ri) ∧ in(ri, s1) ∧ stores(t, l) ∧ flammable(l)), ∃r ∃l (in(tj, r) ∧ in(r, s1) ∧ stores(tj, l) ∧ flammable(l)), ¬∃r (in(tj, r) ∧ in(r, s2)), ⊤}
Now we bring the graphical and the logical component together and define the correct notion of a d-connecting path between context variables in the big dependence graph Graph(P).
Definition 15 (d-Connecting Path between Context Variables). Let A and B be context variables. Further, let Z be a set of context variables. A d-connecting path π between A and B with respect to Z is a d-connecting path π between atom(A) and atom(B). We call π a valid d-connecting path if the set
{context(A)σ, context(B)σ′} ∪ conditions(π) ∪ D(P)
is satisfiable. We say that Z d-connects A and B in Graph(P) if there exists a valid d-connecting path between A and B with respect to Z in Graph(P), otherwise we say that Z d-separates A and B. Finally, we say that Z d-connects two sets of context variables X and Y if there exist A ∈ X and B ∈ Y such that Z d-connects A and B. Otherwise, we say that Z d-separates X and Y.</p>
          <p>With these definitions at our hand we can prove the following results:
Lemma 3. Let A and B be context variables. Further, let Z be a set of context variables. Assume there exists a domain Δ of P such that ZΔ d-connects AΔ and BΔ in the ground graph GraphΔ(P). Then we obtain that Z d-connects A and B in the big dependence graph Graph(P).
Proof. This can be proven with the help of Lemma 1 by an induction on the structure of a d-connecting path. □
Lemma 4. Let A and B be context variables and let Z be a set of context variables. Further, assume that Z d-connects A and B in the big dependence graph Graph(P). Then there exists a domain Δ of P such that ZΔ d-connects AΔ and BΔ in the ground graph GraphΔ(P).</p>
          <p>Proof. For every valid d-connecting path π = atom(A)σ − ... − atom(B)σ′ we obtain that
{context(A)σ, context(B)σ′} ∪ conditions(π) ∪ D(P)
is satisfiable. Now take Δ to be a Herbrand model [11, §11.3] of the skolemization of the existential formulas in the above set with the corresponding interpretation ι and we obtain by construction that
atom(A)ι∘σ − ... − atom(B)ι∘σ′
is a d-connecting path between atom(A)ι∘σ ∈ AΔ and atom(B)ι∘σ′ ∈ BΔ. □</p>
          <p>From these two lemmas we can easily deduce the next theorem, which yields the desired generalization of d-separation to open universe PLPs.
Theorem 2. Let X and Y be sets of context variables and let Z be a further set of context variables. If Z d-separates X and Y in the big dependence graph Graph(P), then X and Y are independent, conditioned on Z, i.e. X ⫫ Y | Z.</p>
          <p>Our final goal is to show that the problem of determining whether two context variables are d-separated with respect to a given set of context variables is decidable.</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>4.4. A Normal Form for d-Connecting Paths</title>
        <p>Here we want to investigate the following setting: Let us fix two context variables A and B together with a set of context variables Z.</p>
        <p>A priori, searching for a d-connecting path between A and B in the big dependence graph Graph(P) also means searching for an arbitrarily long path in an infinite graph, which would render this problem undecidable. This is the reason why we establish a normal form for d-connecting paths and reduce the problem of finding a d-connecting path to finding one in normal form.</p>
        <sec id="sec-3-5-1">
          <title>Definition 16 (Normal form of a d-connecting path).</title>
          <p>A d-connecting path
π := atom(A)σ = a1 − a2 − ... − an = atom(B)σ′
between A and B in the big dependence graph Graph(P) is said to be in normal form if for all 1 ≤ i &lt; j ≤ n we have that ai and aj are not unifiable, i.e. [ai] ≠ [aj].</p>
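          <p>Normal form can be decided by mapping every atom to a canonical representative of its unification class: two atoms are unifiable exactly when their canonical forms coincide, since both conditions amount to an injective renaming of variables that fixes all constants. A sketch (ours; the encoding conventions are assumptions):</p>

```python
# Sketch: deciding normal form via canonical representatives of
# unification classes (assumption: variables start uppercase).
def is_var(name):
    return name[0].isupper()

def canonical(atom):
    """Rename variables to V0, V1, ... in order of first occurrence."""
    pred, args = atom
    names = {}
    canon = tuple(names.setdefault(a, f"V{len(names)}") if is_var(a) else a
                  for a in args)
    return (pred, canon)

def in_normal_form(path):
    classes = [canonical(a) for a in path]
    return len(classes) == len(set(classes))

p = [("smokes", ("E", "R")), ("fire", ("S",)), ("fire", ("S1",)),
     ("leak", ("T",)), ("opens", ("E", "T"))]
print(in_normal_form(p))              # False: the two fire atoms are unifiable
print(in_normal_form(p[:2] + p[3:]))  # True
```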
          <p>Example 17. Consider the following valid d-connecting path between {smokes(e, r) : ⊤} and {opens(e, t) : ⊤} with respect to {fire(s) : ⊤} in the big dependence graph of Example 13:
smokes(e, r) → fire(s) ← leak(t′) → fire(s′) ← leak(t) ← opens(e, t)
Note that this path is not in normal form as fire(s) and fire(s′) are unifiable. However, in order to see that {fire(s) : ⊤} d-connects {smokes(e, r) : ⊤} and {opens(e, t) : ⊤}, we can also take the valid d-connecting path
smokes(e, r) → fire(s) ← leak(t) ← opens(e, t),
which is in normal form. In Lemma 5 we generalize this observation to a calculus, which transforms each valid d-connecting path to one in normal form.</p>
          <p>Finally, we reduce the problem of finding a d-connecting path between A and B with respect to Z to the problem of finding a d-connecting path between them in normal form.
Lemma 5. If Z d-connects A and B, then there exists a valid d-connecting path π in normal form between A and B.</p>
          <p>Proof. Assume π := atom(A)σ − ... − ai − ... − aj − ... − atom(B)σ′ is a valid d-connecting path between A and B such that ai and aj are unifiable. Hence, there exists a relabelling ρ such that ajρ = ai. By an induction on the structure of a d-connecting path one verifies that π̃ := atom(A)σ − ... − ai = ajρ − ... − atom(B)ρ∘σ′, obtained by cutting out the segment between ai and aj and applying ρ to the remaining tail, is also a d-connecting path between A and B. Moreover, observe that π̃ is valid as the conditions function in Definition 14 is monotone in the length of a d-connecting path. Now exhaustively applying the procedure above yields the desired result. □</p>
          <p>From Lemma 5 we obtain the desired decidability of d-separation for open universe PLPs:
Theorem 3. Let A, B be context variables and let Z be a set of context variables. Determining whether Z d-connects A and B is decidable.</p>
          <p>Proof. Note that we can apply a suitable relabelling to a d-connecting path. Hence, to determine whether A and B are d-connected we can proceed by Lemma 5 as follows:
1.) List all unification classes [a] of specializations of atom(A).
2.) List all d-connecting paths in normal form, which start with a representative of step 1.).
3.) Delete all paths, which don't end with a specialization of atom(B).</p>
          <p>4.) Check these paths for satisfiability.</p>
          <p>Obviously, steps 1.), 3.) and 4.) are decidable. To achieve 2.), we build undirected paths starting
from an atom of step 1.) by applying the random clauses as indicated in Definition 12. We stop as
soon as a path would visit the same unification class twice. Since by Lemma 2 there are only finitely
many unification classes, this method terminates. Further, we proceed analogously to check whether
these paths are d-connecting paths, i.e. to check whether the property in Definition 13 holds. □</p>
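          <p>Step 2.) of this proof is essentially an enumeration of paths that never revisit a unification class. Assuming the quotient of the big dependence graph by unifiability is given as a finite adjacency map (Lemma 2 guarantees finiteness, though not this concrete representation), a depth-first enumeration over that map terminates, as the following sketch illustrates.</p>

```python
def normal_form_paths(adj, start, goal):
    """Enumerate undirected paths from `start` to `goal` that never
    revisit a node.  Here nodes stand for unification classes and
    `adj` maps each class to its neighbours; since the quotient
    graph is finite, the enumeration terminates."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            yield path
        for nxt in adj.get(node, []):
            if nxt not in path:     # never visit a class twice
                stack.append((nxt, path + [nxt]))
```

          <p>In the actual procedure one would run this from every class of step 1.) and then filter the results by steps 3.) and 4.); the triangle graph below shows the two loop-free paths from one corner to another.</p>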
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>First, we introduced a new semantics for open universe probabilistic logic programs.
Subsequently, we used the correspondence between ground acyclic programs and Bayesian networks
to apply the method of d-separation in our setting. In particular, we assigned to each open
universe program an infinite directed labelled graph, the big dependence graph. Finally, we
saw that in order to deduce independence statements it is sufficient to search the big
dependence graph for d-connecting paths in normal form, and that searching for those paths is
a decidable problem.</p>
      <p>In the future it would be interesting to extend the notion of a program by incorporating
Prolog programs. This would reflect logic programming inside probabilistic logic programming.
Further, it would also be desirable to bring our semantics closer to the LPAD semantics. We
believe that an independence analysis for LPAD-programs at compilation time can be used to
build modified binary decision trees as in [8] in order to speed up (lifted) inference. Moreover,
we plan to construct a causal structure discovery algorithm for open universe LPAD-programs
on the basis of the big dependence graph.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>T.</given-names> <surname>Verma</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name>,
          <article-title>Causal Networks: Semantics and Expressiveness</article-title>,
          <source>in: Uncertainty in Artificial Intelligence, volume 9 of Machine Intelligence and Pattern Recognition, North-Holland</source>,
          <year>1990</year>, pp.
          <fpage>69</fpage>-<lpage>76</lpage>. doi:10.1016/B978-0-444-88650-7.50011-1.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>J.</given-names> <surname>Vennekens</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Verbaeten</surname></string-name>,
          <article-title>Logic Programs with Annotated Disjunctions</article-title>,
          <source>Technical Report CW 368</source>,
          Katholieke Universiteit Leuven, Department of Computer Science,
          <year>2003</year>. URL: https://www.cs.kuleuven.be/publicaties/rapporten/cw/CW368.abs.html.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>D.</given-names> <surname>Geiger</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Verma</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name>,
          <article-title>Identifying Independence in Bayesian Networks</article-title>,
          <source>Networks</source>
          <volume>20</volume> (<year>1990</year>)
          <fpage>507</fpage>-<lpage>534</lpage>. doi:10.1002/net.3230200504.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>P.</given-names> <surname>Spirtes</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Glymour</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Scheines</surname></string-name>,
          <source>Causation, Prediction, and Search</source>, 2nd ed., MIT Press,
          <year>2000</year>. URL: https://www.researchgate.net/publication/242448131_Causation_Prediction_and_Search.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>T.</given-names> <surname>Verma</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name>,
          <article-title>Equivalence and Synthesis of Causal Models</article-title>,
          <source>in: Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence</source>,
          UAI '90, Elsevier Science Inc., USA,
          <year>1990</year>, pp.
          <fpage>255</fpage>-<lpage>270</lpage>. URL: https://dl.acm.org/doi/10.5555/647233.719736.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>C.</given-names> <surname>Meek</surname></string-name>,
          <article-title>Causal inference and causal explanation with background knowledge</article-title>,
          <source>in: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence</source>,
          UAI'95, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA,
          <year>1995</year>, pp.
          <fpage>403</fpage>-<lpage>410</lpage>. URL: https://dl.acm.org/doi/10.5555/2074158.2074204.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>M.</given-names> <surname>Maier</surname></string-name>,
          <article-title>Causal Discovery for Relational Domains: Representation, Reasoning, and Learning</article-title>,
          Doctoral Dissertations <volume>279</volume>, University of Massachusetts Amherst,
          <year>2014</year>. URL: https://scholarworks.umass.edu/dissertations_2/279/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>S.</given-names> <surname>Holtzen</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Van den Broeck</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Millstein</surname></string-name>,
          <article-title>Scaling exact inference for discrete probabilistic programs</article-title>,
          <source>Proc. ACM Program. Lang.</source>
          <volume>4</volume> (<year>2020</year>). doi:10.1145/3428208.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name>,
          <source>Causality</source>, 2nd ed., Cambridge University Press, Cambridge, UK,
          <year>2000</year>. doi:10.1017/CBO9780511803161.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>D.</given-names> <surname>Geiger</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name>,
          <article-title>On the logic of causal models</article-title>,
          <source>in: Uncertainty in Artificial Intelligence, volume 9 of Machine Intelligence and Pattern Recognition</source>,
          North-Holland,
          <year>1990</year>, pp.
          <fpage>3</fpage>-<lpage>14</lpage>. doi:10.1016/B978-0-444-88650-7.50006-8.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>H.</given-names> <surname>Ebbinghaus</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Flum</surname></string-name>,
          <string-name><given-names>W.</given-names> <surname>Thomas</surname></string-name>,
          <source>Einführung in die mathematische Logik</source>, 6th ed., Springer Spektrum, Berlin, DE,
          <year>2018</year>. URL: https://www.springer.com/de/book/9783662580288.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>