=Paper= {{Paper |id=Vol-1413/paper-06 |storemode=property |title=The Distribution Semantics is Well-Defined for All Normal Programs |pdfUrl=https://ceur-ws.org/Vol-1413/paper-06.pdf |volume=Vol-1413 |dblpUrl=https://dblp.org/rec/conf/iclp/Riguzzi15 }} ==The Distribution Semantics is Well-Defined for All Normal Programs== https://ceur-ws.org/Vol-1413/paper-06.pdf
 The Distribution Semantics Is Well-Defined for
             All Normal Programs

                                  Fabrizio Riguzzi

         Dipartimento di Matematica e Informatica, Università di Ferrara
                      Via Saragat 1, I-44122, Ferrara, Italy
                          fabrizio.riguzzi@unife.it



      Abstract. The distribution semantics is an approach for integrating
      logic programming and probability theory that underlies many languages
      and has been successfully applied in many domains. When the program
      has function symbols, the semantics was defined for special cases: either
      the program has to be definite or the queries must have a finite number
      of finite explanations. In this paper we show that it is possible to define
      the semantics for all programs.


Keywords: Distribution Semantics, Function Symbols, ProbLog, Probabilistic
Logic Programming


1   Introduction
The distribution semantics [1, 2] was successfully applied in many domains and
underlies many languages that combine logic programming with probability the-
ory such as Probabilistic Horn Abduction, Independent Choice Logic, PRISM,
Logic Programs with Annotated Disjunctions and ProbLog.
    The definition of the distribution semantics can be given quite simply in the
case of no function symbols in the program: a probabilistic logic program under
the distribution semantics defines a probability distribution over normal logic
programs called worlds and the probability of a ground query can be obtained
by marginalizing the joint distribution of the worlds and the query. In the case
the program has function symbols, however, this simple definition does not work
as the probability of individual worlds is zero.
    A definition of the distribution semantics for programs with function symbols
was proposed in [1, 3] but restricted to definite programs. The case of normal
programs was taken into account in [4] where the semantics required that the
programs are acyclic. A looser condition was proposed in [5] but still required
each goal to have a finite set of finite explanations.
    In this paper we show that the distribution semantics can be defined for all
programs, thus also for programs that have goals with an infinite number of
possibly infinite explanations. We do so by adapting the definition of the well-
founded semantics in terms of iterated fixpoints of [6] to the case of ProbLog,
similarly to the way in which the TP operator has been adapted in [7] to the



                                         69
The Distribution Semantics is Well-Defined for All Normal Programs

  case of stratified ProbLog programs using parameterized interpretations. In the
  case of infinite number of infinite explanations, we show that the probability of
  queries is defined in the limit and the limit always exists.
      We consider the case of ProbLog but the results are equally applicable to all
  other languages under the distribution semantics, as there are linear transfor-
  mations from one language to another that preserve the semantics.
      The paper is organized as follows. Section 2 presents preliminary material
  on fixpoints and the well-founded semantics. Section 3 introduces the distribu-
  tion semantics for programs without function symbols. Section 4 discusses the
  definition of the distribution semantics with function symbols in the case of fi-
  nite set of finite explanations. Section 5 represents the main contribution of this
  paper and discusses the case of infinite set of infinite explanations. Finally, Sec-
  tion 6 concludes the paper. The proofs of the main results are reported in the
  Appendix.


  2     Preliminaries
  A relation on a set S is a partial order if it is reflexive, antisymmetric and
  transitive. In the following, let S be a set with a partial order ≤. a ∈ S is an
  upper bound of a subset X of S if x ≤ a for all x ∈ X. Similarly, b ∈ S is a lower
  bound of X if b ≤ x for all x ∈ X.
      a ∈ S is the least upper bound of a subset X of X if a is an upper bound
  of X and, for all upper bounds a0 of X, we have a ≤ a0 . Similarly, b ∈ S is the
  greatest lower bound of a subset X of S if b is a lower bound of X and, for all
  lower bounds b0 of X, we have b0 ≤ b. The least upper bound of X is unique, if
  it exists, and is denoted by lub(X). Similarly, the greatest lower bound of X is
  unique, if it exists, and is denoted by glb(X).
      A partially ordered set L is a complete lattice if lub(X) and glb(X) exist for
  every subset X of L. We let > denote the top element lub(L) and ⊥ denote the
  bottom element glb(L) of the complete lattice L.
      Let L be a complete lattice and T : L → L be a mapping. We say T is
  monotonic if T (x) ≤ T (y), whenever x ≤ y. We say a ∈ L is the least fixpoint of
  T if a is a fixpoint (that is, T (a) = a) and for all fixpoints b of T we have a ≤ b.
  Similarly, we define greatest fixpoint.
      Let L be a complete lattice and T : L → L be monotonic. Then we define
  T ↑ 0 = ⊥; T ↑ α = T (T ↑ (α − 1)), if α is a successor ordinal; T ↑ α = lub({T ↑
  β|β < α}), if α is a limit ordinal; T ↓ 0 = >; T ↓ α = T (T ↓ (α − 1)), if α is a
  successor ordinal; T ↓ α = glb({T ↓ β|β < α}), if α is a limit ordinal.
  Proposition 1. Let L be a complete lattice and T : L → L be monotonic. Then
  T has a lest fixpoint, lfp(T ) and a greatest fixpoint gfp(T ).
  A normal program P is a set of normal rules. A normal rule has the form

                              r = h ← b1 , . . . , bn , not c1 , . . . , not cm    (1)

  where h, b1 , . . . , bn , c1 , . . . , cm are atoms.



                                                   70
The Distribution Semantics is Well-Defined for All Normal Programs

      The set of ground atoms that can be built with the symbols of a program P
  is called the Herbrand base and is denoted as BP .
      A two-valued interpretation I is a subset of BP . I is the set of true atoms.
  The set Int2 of two-valued interpretations for a program P forms a complete
  lattice where the partial order ≤ is given by the subset relation ⊆.          S The least
  upper bound T   and   greatest   lower bound    are  defined   as lub(X)  =     I∈X I and
  glb(X) = I∈X I. The bottom and top element are respectively ∅ and BP .
      A three-valued interpretation I is a pair hIT ; IF i where IT and IF are subsets
  of BP and represent respectively the set of true and false atoms. The union of
  two three-valued interpretations hIT , IF i and hJT , JF i is defined as hIT , IF i ∪
  hJT , JF i = hIT ∪JT , IF ∪JF i. The intersection of two three-valued interpretations
  hIT , IF i and hJT , JF i is defined as hIT , IF i ∩ hJT , JF i = hIT ∩ JT , IF ∩ JF i.
      The set Int3 of three-valued interpretations for a program P forms a complete
  lattice where the partial order ≤ is defined as hIT , IF i ≤ hJT , JF i if IT ⊆ JT
  and IF ⊆ JFS    . The least upper bound   T and greatest lower bound are defined
  as lub(X) = I∈X I and glb(X) = I∈X I. The bottom and top element are
  respectively h∅, ∅i and hBP , BP i.
      The well-founded semantics (WFS) assigns a three-valued model to a pro-
  gram, i.e., it identifies a three-valued interpretation as the meaning of the pro-
  gram. The WFS was given in [8] in terms of the least fixpoint of an operator that
  is composed by two sub-operators, one computing consequences and the other
  computing unfounded sets. We give here the alternative definition of the WFS
  of [6] that is based on a different iterated fixpoint.
  Definition 1. For a normal program P , sets Tr and Fa of ground atoms, and
  a 3-valued interpretation I we define
  OpT rueP I (Tr ) = {a|a is not true in I; and there is a clause b ← l1 , ..., ln in P ,
     a grounding substitution θ such that a = bθ and for every 1 ≤ i ≤ n either
     li θ is true in I, or li θ ∈ Tr };
  OpFalse PI (Fa) = {a|a is not false in I; and for every clause b ← l1 , ..., ln in P
     and grounding substitution θ such that a = bθ there is some i (1 ≤ i ≤ n)
     such that li θ is false in I or li θ ∈ Fa}.
  In words, the operator OpT rueP   I extends the interpretation I to add the new
  atomic facts that can be derived from P knowing I, while OpFalse PI adds the new
  negations of atomic facts that can be shown false in P by knowing I. OpT rueP   I
  and OpFalse PI are both monotonic [6], so they both have a least and greatest
  fixpoints. An iterated fixpoint operator builds up dynamic strata by constructing
  successive three-valued interpretations as follows.
  Definition 2 (Iterated Fixed Point). For a normal program P , let IFP P :
                                                                         P
  Int3 → Int3 be defined as IFP P (I) = I ∪ hlfp(OpT rueP
                                                        I ), gfp(OpFalse I )i.

  IFP P is monotonic [6] and thus as a least fixed point lfp(IFP P ). Moreover, the
  well-founded model WFM (P ) of P is in fact lfp(IFP P ). Let δ be the smallest
  ordinal such that WFM (P ) = IFP P ↑ δ. We refer to δ as the depth of P . The
  stratum of atom a is the least ordinal β such that a ∈ IFP P ↑ β (where a may



                                            71
The Distribution Semantics is Well-Defined for All Normal Programs

  be either in the true or false component of IFP P ↑ β). Undefined atoms of the
  well-founded model do not belong to any stratum – i.e. they are not added to
  IFP P ↑ δ for any ordinal δ.

  3    The Distribution Semantics for Programs without
       Function Symbols
  We present the distribution semantics for the case of ProbLog [9] as it is the
  language with the simplest syntax. A ProbLog program P is composed by a set
  of normal rules R and a set F of probabilistic facts. Each probabilistic fact is of
  the form pi :: ai where pi ∈ [0, 1] and ai is an atom, meaning that each ground
  instantiation ai θ of ai is true with probability pi and false with probability
  1 − pi . Each world is obtained by selecting or rejecting each grounding of all
  probabilistic facts.
      An atomic choice is the selection or not of grounding F θ of a probabilistic
  fact F . It is represented with the triple (F, θ, i) where i ∈ {0, 1}. A set κ of atomic
  choices is consistent if it does not contain two atomic choices (F, θ, i) and (F, θ, j)
  with i 6= j (only one alternative is selected for a ground probabilistic fact).
  The function consistent(κ) returns true if κ is consistent. A composite choice
  κ is a consistent
               Q       set of Q
                              atomic choices. The probability of composite choice κ
  is P (κ) = (Fi ,θ,1)∈κ pi (Fi ,θ,0)∈κ 1 − pi where pi is the probability of the i-th
  probabilistic fact Fi . A selection σ is a total composite choice, i.e., it contains
  one atomic choice for every grounding of each probabilistic fact. A world wσ is
  a logic program that is identified by a selection σ. The world wσ is formed by
  including the atom corresponding to each atomic choice (F, θ, 1) of σ.
      The probability of a world wσ is P (wσ ) = P (σ). Since in this section we
  are assuming programs without function symbols, the set of groundings of each
  probabilistic fact is finite, and so is the set of worlds WP . Accordingly, for a
  ProbLog program P     P, WP = {w1 , . . . , wm }. Moreover, P (w) is a distribution
  over worlds: w∈WP P (w) = 1. We call sound a program for which every world
  has a two-valued well-founded model. We consider only sound programs, as the
  uncertainty should be handled by the choices rather than by the semantics of
  negation.
      Let q be a query in the form of a ground atom. We define the conditional
  probability of q given a world w as: P (q|w) = 1 if q is true in w and 0 otherwise.
  Since the program is sound, q can be only true or false in a world. The probability
  of q can thus be computed by summing out P         the worlds fromPthe joint distribu-
  tion
  P     of  the  query  and  the worlds:  P (q)  =     w P (q, w) =     w P (q|w)P (w) =
     w|=q P  (w).

  4    The Distribution Semantics for Programs with
       Function Symbols
  When a program contains functions symbols there is the possibility that its
  grounding may be infinite. If so, the number of atomic choices in a selection that



                                           72
The Distribution Semantics is Well-Defined for All Normal Programs

  defines a world is countably infinite and there is an uncountably infinite number
  of worlds. In this case, the probability of each individual world is zero since it is
  the product of infinite numbers all smaller than one. So the semantics of Section
  3 is not well-defined.
  Example 1. Consider the program
                p(0) ← u(0).          t ← ¬s.    F1 = a :: u(X).
                p(s(X)) ← p(X), u(X). s ← r, ¬q. F2 = b :: r.
                                      q ← u(X).
  The set of worlds is infinite and uncountable. In fact, each world can be put
  in a one to one relation with a selection and a selection can be represented as
  a countable sequence of atomic choices of which the first involves fact F2 , the
  second F1 /{X/0}, the third F1 /{X/s(0)} and so on. The set of selections can be
  shown uncountable by Cantor’s diagonal argument. Suppose the set of selections
  is countable. Then the selections could be listed in order, suppose from top to
  bottom. Suppose the atomic choices of each selection are listed from left to right.
  We can pick a composite choice that differs from the first selection in the first
  atomic choice (if (F2 , ∅, k) is the first atomic choice of the first selection, pick
  (F2 , ∅, 1 − k)), from the second selection in the second atomic choice (similar to
  the case of the first atomic choice) and so on. In this way we have obtained a
  selection that is not present in the list because it differs from each selection in
  the list for at least an atomic choice. So it is not possible to list the selections
  in order.
  We now present the definition of the distribution semantics for programs with
  function symbols following [4]. The semantics for a probabilistic logic program P
  with function symbols is given by defining a probability measure µ over the set of
  worlds WP . Informally, µ assigns a probability to a set of subsets of WP , rather
  than to every element of (the infinite set) WP . The approach dates back to [10]
  who defined a probability measure µ as a real-valued function whose domain is
  a σ-algebra Ω on a set W called the sample space. Together hW, Ω, µi is called
  a probability space.
  Definition 3. [11, Section 3.1] The set Ω of subsets of W is a σ-algebra on
  the set W iff (σ-1) W ∈ Ω; (σ-2) Ω is closed under complementation, i.e.,
  ω ∈ Ω → (W \ ω) ∈ Ω; and       S (σ-3) Ω is closed under countable union, i.e., if
  ωi ∈ Ω for i = 1, 2, . . . then i ωi ∈ Ω.
  The elements of Ω are called measurable sets. Importantly, for defining the dis-
  tribution semantics for programs with function symbols, not every subset of W
  need be present in Ω.
  Definition 4. [10] Given a sample space W and a σ-algebra Ω of subsets of W,
  a probability measure is a function µ : Ω → R that satisfies the following axioms:
  (µ-1) µ(ω) ≥ 0 for all ω ∈ Ω; (µ-2) µ(W) = 1; (µ-3) µ is countably additive,
           S = {ω1 , ωP
  i.e., if O          2 , . . .} ⊆ Ω is a countable collection of pairwise disjoint sets,
  then µ( ω∈O ) = i µ(ωi ).



                                           73
The Distribution Semantics is Well-Defined for All Normal Programs

  We first consider the finite additivity version of probability spaces. In this
  stronger version, the σ-algebra is replaced by an algebra.
  Definition 5. [11, Section 3.1] The set Ω of subsets of W is an algebra on the
  set W iff it respects conditions (σ-1), (σ-2) and condition (a-3): Ω is closed
  under finite union, i.e., ω1 ∈ Ω, ω2 ∈ Ω → (ω1 ∪ ω2 ) ∈ Ω
  The probability measure is replaced by a finitely additive probability measure.
  Definition 6. Given a sample space W and an algebra Ω of subsets of W, a
  finitely additive probability measure is a function µ : Ω → R that satisfies
  axioms (µ-1) and (µ-2) of Definition 4 and axiom (m-3): µ is finitely additive,
  i.e., ω1 ∩ ω2 = ∅ → µ(ω1 ∪ ω2 ) = µ(ω1 ) + µ(ω2 ) for all ω1 , ω2 ∈ Ω.
  Towards defining a suitable algebra given a probabilistic logic program P, we
  define the set of worlds ωκ compatible with a composite choice κ as ωκ = {wσ ∈
  WP |κ ⊆ σ}. Thus a composite choice   P identifies a set of worlds. For programs
  without function symbols P (κ) = w∈ωκ P (w).
      GivenSa set of composite choices K, the set of worlds ωK compatible with K
  is ωK = κ∈K ωκ . Two composite choices κ1 and κ2 are incompatible if their
  union is not consistent. A set K of composite choices is pairwise incompatible if
  for all κ1 ∈ K, κ2 ∈ K, κ1 6= κ2 implies that κ1 and κ2 are incompatible.
      Regardless of whether a probabilistic logic program has a finite number of
  worlds or not, obtaining pairwise incompatible sets of composite choices is an
  important problem. This is because the probabilityP        of a pairwise incompatible
  set K of composite choices is defined as P (K) = κ∈K P (κ) which is easily
  computed. Two sets K1 and K2 of finite composite choices are equivalent if they
  correspond to the same set of worlds: ωK1 = ωK2 .
      One way to assign probabilities to a set K of composite choices is to construct
  an equivalent set that is pairwise incompatible; such a set can be constructed
  through the technique of splitting. More specifically, if F θ is an instantiated fact
  and κ is a composite choice that does not contain an atomic choice (F, θ, k)
  for any k, the split of κ on F θ is the set of composite choices Sκ,F θ = {κ ∪
  {(F, θ, 0)}, κ∪{(F, θ, 1)}}. It is easy to see that κ and Sκ,F θ identify the same set
  of possible worlds, i.e., that ωκ = ωSκ,F θ , and that Sκ,F θ is pairwise incompatible.
  The technique of splitting composite choices on formulas is used for the following
  result [12].
  Theorem 1 (Existence of a pairwise incompatible set of composite choices [12])
  Given a finite set K of composite choices, there exists a finite set K 0 of pairwise
  incompatible composite choices such that K and K 0 are equivalent.
  Proof: Given a finite set of composite choices K, there are two possibilities to
  form a new set K 0 of composite choices so that K and K 0 are equivalent:
   1. removing dominated elements: if κ1 , κ2 ∈ K and κ1 ⊂ κ2 , let K 0 =
      K \ {κ2 }.
   2. splitting elements: if κ1 , κ2 ∈ K are compatible (and neither is a superset
      of the other), there is a (F, θ, k) ∈ κ1 \ κ2 . We replace κ2 by the split of κ2
      on F θ. Let K 0 = K \ {κ2 } ∪ Sκ2 ,F θ .



                                           74
The Distribution Semantics is Well-Defined for All Normal Programs

  In both cases ωK = ωK 0 . If we repeat this two operations until neither is ap-
  plicable we obtain a splitting algorithm that terminates because K is a finite
  set of composite choices. The resulting set K 0 is pairwise incompatible and is
  equivalent to the original set.                                             ♦
  Theorem 2 (Equivalence of the probability of two equivalent pairwise incom-
  patible finite set of finite composite choices [13]) If K1 and K2 are both pairwise
  incompatible finite sets of finite composite choices such that they are equivalent
  then P (K1 ) = P (K2 ).
  For a probabilistic logic program P, we can thus define a unique probability
  measure µ : ΩP → [0, 1] where ΩP is defined as the set of sets of worlds identified
  by finite sets of finite composite choices: ΩP = {ωK |K is a finite set of finite
  composite choices }. ΩP is an algebra over WP since WP = ωK with K = {∅}.
                                  c
  Moreover, the complement ωK       of ωK where K is a finite set of finite composite
  choice is ωK where K is a finite set of finite composite choices. In fact, K can
  obtained with the function duals(K) of [12] that performs Reiter’s hitting set
  algorithm over K, generating an element κ of K by picking an atomic choice
  (F, θ, k) from each element of K and inserting in κ the atomic choice (F, θ, 1−k).
  After this process is performed in all possible ways, inconsistent sets of atom
  choices are removed obtaining K. Since the possible choices of the atomic choices
  are finite, so is K. Finally, condition (a-3) holds since the union of ωK1 with ωK2
  is equal to ωK1 ∪K2 for the definition of ωK .
      The corresponding measure µ is defined by µ(ωK ) = P (K 0 ) where K 0 is a
  pairwise incompatible set of composite choices equivalent to K. hWP , ΩP , µi is a
  finitely additive probability space according to Definition 6 because µ(ω{∅} ) = 1,
  µ(ωK ) ≥ 0 for all K and if ωK1 ∩ ωK2 = ∅ and K10 (K20 ) is pairwise incompatible
  and equivalent to K1 (K2 ), then K10 ∪ K20 is pairwise incompatible and
                       X                X              X
  µ(ωK1 ∪ ωK2 ) =             P (κ) =        P (κ1 ) +      P (κ2 ) = µ(ωK1 ) + µ(ωK2 ).
                   κ∈K10 ∪K20        κ1 ∈K10         κ2 ∈K20


  Given a query q, a composite choice κ is an explanation for q if ∀w ∈ ωκ : w |= q.
  A set K of composite choices is covering wrt q if every world in which q is true
  belongs to ωK
  Definition 7. For a probabilistic logic program P, the probability of a ground
  atom q is given by P (q) = µ({w|w ∈ WP , w |= q}).
  If q has a finite set K of finite explanations such that K is covering then {w|w ∈
  WP ∧ w |= q} = ωK ∈ ΩT and we say that P (q) is finitely well-defined for the
  distribution semantics. A program P is finitely well-defined if the probability of
  all ground atoms in the grounding of P is finitely well-defined.
  Example 2. Consider the program of Example 1. The set K = {κ} with κ =
  {(F1 , {X/0}, 1), (F1 , {X/s(0)}, 1)} is a pairwise incompatible finite set of finite
  explanations that are covering for the query p(s(0)). Definition 7 therefore ap-
  plies, and P (p(s(0))) = P (κ) = a2



                                          75
The Distribution Semantics is Well-Defined for All Normal Programs

  5     Infinite Covering Set of Explanations
  In this section we go beyond [4] and we remove the requirement of the finiteness
  of the covering set of explanations and of each explanation for a query q.

  Example 3. In Example 1, the query s has the pairwise incompatible covering
  set of explanations K s = {κs0 , κs1 , . . .} with

      κsi = {(F2 , ∅, 1), (F1 , {X/0}, 1), . . . , (F1 , {X/si−1 (0)}, 1), (F1 , {X/si (0)}, 0)}

  where si (0) is the term where the functor s is applied i times to 0. So K s is count-
  able and infinite. A covering set of explanation for t is K t = {{(F2 , ∅, 0)}, κt }
  where κt is the infinite composite choice

                   κt = {(F2 , ∅, 1), (F1 , {X/0}, 1), (F1 , {X/s(0)}, 1), . . .}

  For a probabilistic logic program P, we can define the probability measure
  µ : ΩP → [0, 1] where ΩP is defined as the set of sets of worlds identified
  by countable sets of countable composite choices: ΩP = {ωK |K is a countable
  set of countable composite choices }.
  Lemma 3 ΩP is a an σ-algebra over WP .
                                                                                        c
  Proof: (σ-1) is true as in the algebra case. To see that the complement ωK              of
  ωK is in ΩP , let us prove by induction that the dual K of K is a countable
                                                            c
  set of countable composite choices and then that ωK          = ωK . In the base case,
  if K1 = {κ1 }, then we can obtain K1 by picking each atomic choice (F, θ, k) of
  κ1 and inserting in K1 the composite choice {(F, θ, 1 − k)}. As there is a finite
  or countable number of atomic choices in κ1 , K1 is a finite or countable set of
  composite choices each with one atomic choice.
       In the inductive case, assume that Kn−1 = {κ1 , . . . , κn−1 } and that Kn−1
  is a finite or countable set of composite choices. Let Kn = Kn−1 ∪ {κn } and
  Kn−1 = {κ01 , κ02 , . . .}. We can obtain Kn by picking each κ0i and each atomic
  choice (F, θ, k) of κn . If (F, θ, k) ∈ κ0i , then discard κ0i , else if (F, θ, k 0 ) ∈ κ0i
  with k 0 6= k, insert κ0i in Kn . Otherwise generate the composite choice κ00i where
  κ00i = κ0i ∪ {(F, θ, 1 − k)} and insert it in Kn . Doing this for all atomic choices
  (F, θ, k) in κn generates a finite set of composite choices if κn is finite and a
  countable number if κn is countable. Doing this for all κ0i we obtain that Kn is a
                                                                                   c
  countable union of countable sets which is a countable set [14, page 3]. ωK          = ωK
  because all composite choices of K are incompatible with each world of ωK , as
                                                                       c
  they are incompatible with each composite choice of K. So ωK             ∈ ΩP . (σ-3) is
  true as in the algebra case.                                                            ♦
       We can see K as limn→∞ Kn where Kn = {κ1 , . . . , κn }. Each Kn is a finite
  set of composite choices and we can compute an equivalent finite pairwise incom-
  patible set of composite choices Kn0 . For each Kn0 we can compute the probability
  P (Kn0 ), noting that the probability of infinite composite choices is 0.
       Now consider limn→∞ P (Kn0 ). We can see the P (Kn0 )s as the partial sums
                                                          0
  of a series. Moreover, it can be shown that P (Kn−1         ) ≤ P (Kn0 ) so the series



                                                76
The Distribution Semantics is Well-Defined for All Normal Programs

  has non-negative terms. Such a series converges if the sequence of partial sums
  is bounded from above [15, page 92]. Since P (Kn0 ) is bounded by 1, the limit
  limn→∞ P (Kn0 ) exists. So we can define measure µ as µ(ωK ) = limn→∞ P (Kn0 ).

  Theorem 4 hWP , ΩP , µi is a probability space according to Definition 4.

  Proof: (µ-1) and (µ-2) hold as for the finite case and for (µ-3) let O =
  {ωL1 , ωL2 , . . .} be a countable set of subsets of ΩP such that the ωLi s are pair-
                           0
  wise disjoint.
         S∞ 0 Let Li be the pairwise incompatible set equivalent to Li and let
  L be i=1 Li . Since the ωLi s are pairwise disjoint, then L is pairwise incom-
  patible. ΩP is a σ-algebra, so L is countable. Let L be {κ1 , κ      P2 , . . .} and let
  Kn0 be {κ1 , . . . , κn }. Then µ(O) = limn→∞ P (Kn0 ) = limn→∞ κ∈Kn0 P (κ) =
  P                           S∞ 0
     κ∈L P (κ). Since P L = i=1 LP  i , by rearranging
                                        ∞            Pthe
                                                       ∞
                                                          terms in the last summation
  we get µ(O) = κ∈L P (κ) = n=1 P (L0n ) = n=1 µ(ωLn ).                                 ♦
      For a probabilistic logic program P, the probability of a ground atom q is
  again given by P (q) = µ({w|w ∈ WP , w |= q}). If q has a countable set K of
  explanations such that K is covering then {w|w ∈ WP ∧ w |= q} = ωK ∈ ΩP
  and we say that P (q) is well-defined for the distribution semantics. A program
  P is well-defined if the probability of all ground atoms in the grounding of P is
  well-defined.
  Example 4. Consider Example 3. Since the explanations in K s are pairwise in-
  compatible the probability of s can be computed as
                                                                  b(1 − a)
           P (s) = b(1 − a) + ba(1 − a) + ba2 (1 − a) + . . . =            = b.
                                                                   1−a
  since the sum is a geometric series. K t is also pairwise incompatible and P (κt ) =
  0 so P (t) = 1 − b + 0 = 1 − b which is what we intuitively expect.
  We now want to show that every program has countable set of countable expla-
  nations that is covering for each query. In the following, we consider only ground
  programs that however may be countably infinite, thus they can be the result of
  grounding a program with function symbols.
     Given two sets of composite choices K1 and K2 , define the conjunction K1 ⊗
  K2 of K1 and K2 as K1 ⊗ K2 = {κ1 ∪ κ2 |κ1 ∈ K1 , κ2 ∈ K2 , consistent(κ1 ∪ κ2 )}
     Similarly to [7], we define parametrized interpretations and a IFPC P oper-
  ator. Differently from [7], here parametrized interpretations associate a set of
  composite choices instead of a Boolean formula to each atom.
  Definition 8 (Parameterized two-valued interpretations). A parameter-
  ized positive two-valued interpretation Tr of a ground probabilistic logic program
  P with and atoms BP is a set of pairs (a, Ka ) with a ∈ atoms and Ka a set
  of composite choices. A parameterized negative two-valued interpretation Fa of
  a ground probabilistic logic program P with atoms BP is a set of pairs (a, K¬a )
  with a ∈ BP and K¬a a set of composite choices.
  Parametrized two-valued interpretations form a complete lattice where the par-
  tial order is defined as I ≤ J if ∀(a, Ka ) ∈ I, (a, La ) ∈ J : ωKa ⊆ ωLa . The



                                           77
The Distribution Semantics is Well-Defined for All Normal Programs

  leastSupper bound and greatest lower bound always N exist and are lub(X) =
  {(a, (a,Ka )∈I,I∈X Ka )|a ∈ BP } and glb(X) = {(a, (a,Ka )∈I,I∈X Ka )|a ∈ BP }.
  The top element > is {(a, {∅})|a ∈ BP } and the bottom element ⊥ is {(a, ∅)|a ∈
  BP }.

  Definition 9 (Parameterized three-valued interpretation). A parameter-
  ized three-valued interpretation I of a ground probabilistic logic program P with
  atoms BP is a set of triples (a, Ka , K¬a ) with a ∈ BP and Ka and K¬a sets of
  composite choices.

  Parametrized three-valued interpretations form a complete lattice where the par-
  tial order is defined as I ≤ J if ∀(a, Ka , K¬a ) ∈ I, (a, La , L¬a ) ∈ J : ωKa ⊆ ωLa
  and ωK¬a ⊆ ωL¬a . The least
                           S upper bound and S      greatest lower bound always exist
  and are lub(X) = {(a, (a,Ka ,K¬a )∈I,I∈X Ka , (a,Ka ,K¬a )∈I,I∈X K¬a )|a ∈ BP }
                      N                        N
  and glb(X) = {(a, (a,Ka ,K¬a )∈I,I∈X Ka , (a,Ka ,K¬a )∈I,I∈X K¬a )|a ∈ BP }. The
  top element > is {(a, {∅}, {∅})|a ∈ BP }, the bottom element ⊥ is {(a, ∅, ∅)|a ∈
  BP }.

  Definition 10. For a ground program P, a two-valued parametrized positive
  interpretation Tr with pairs (a, La ), a two-valued parametrized negative inter-
  pretation Fa with pairs (a, M¬a ) and a three-valued parametrized interpretation
  I with triples (a, Ka , K¬a ), we define OpT rueCIP (Tr ) = {(a, L0a )|a ∈ BP } where
                
                 S  {{(a, ∅, 1)}}                                       if a ∈ F
          L0a =       a←b1 ,...,bn ,¬c1 ,...,cm ∈R ((Lb1 ∪ Kb1 ) ⊗ . . . if a ∈ B \ F
                                                                                P
                    ⊗(Lbn ∪ Kbn ) ⊗ K¬c1 ⊗ . . . ⊗ K¬cm )

  OpFalseC P               0
           I (Fa) = {(a, Ma )|a ∈ BP } where
              
               {{(a,
                N ∅, 0)}}                                              if a ∈ F
         0
      M¬a  =                                 ((M ¬b1 ⊗ K ¬b1 ) ∪ . . .
               a←b1 ,...,bn ,¬c1 ,...,cm ∈R                           if a ∈ BP \ F
                ∪(M¬bn ⊗ K¬bn ) ∪ Kc1 ∪ . . . ∪ Kcm )

  Proposition 5 OpT rueCIP and OpFalseC P
                                        I are monotonic.

  Since OpT rueCIP and OpFalseC P
                                I are monotonic, they have a least fixpoint and
  a greatest fixpoint.

  Definition 11 (Iterated Fixed Point). For a ground program P, let IFPC P
  be defined as IFPC P (I) = {(a, Ka , K¬a )|(a, Ka ) ∈ lfp(OpT rueCIP ), (a, K¬a ) ∈
  lfp(OpFalseC PI )}.

  Proposition 6 IFPC P is monotonic.

  So IFPC P has a least fixpoint. Let WFMC (P) denote lfp(IFPC P ), and let δ
  the smallest ordinal such that IFPC P ↑ δ = WFMC (P). We refer to δ as the
  depth of P.



                                           78
The Distribution Semantics is Well-Defined for All Normal Programs

  Theorem 7 For a ground probabilistic logic program P with atoms BP , let Kaα
        α
  and K¬a  be the formulas associated with atom a in IFPC P ↑ α. For every atom
  a and total choice σ, there is an iteration α0 such that for all α > α0 we have:

           wσ ∈ ωKaα ↔ WFM (wσ ) |= a             wσ ∈ ωK¬a
                                                         α ↔ WFM (wσ ) |= ¬a



  Theorem 8 For a ground probabilistic logic program P, let Kaα and K¬a α
                                                                           be the
                                            P
  formulas associated with atom a in IFPC ↑ α. For every atom a and every
  iteration α, Kaα and K¬a
                        α
                           are countable sets of countable composite choices.

  So every query for every program has a countable set of countable explanations
  that is covering and the probability measure is well defined. Moreover, since the
                                                 c
  program is sound, for all atoms a, ωKaδ = ωK     δ   where δ is the depth of the
                                                   ¬a
  program, as in each world a is either true or false.


  5.1     Comparison with Sato and Kameya’s Definition

  Sato and Kameya [3] define the distribution semantics for definite programs.
  They build a probability measure on the sample space WP from a collection
  of finite distributions. Let F be {F1 , F2 , . . .} and let Xi be a random variable
  associated to Fi whose domain is {0, 1}.
                                        (n)
      The finite distributions PP (X1 = k1 , . . . , Xn = kn ) for n ≥ 1 must be such
  that
               (n)
    
    P 0 ≤ PP (X1 = k1 , . . . , Xn = kn ) ≤ 1
                       (n)
          k1 ,...,kn PP (X1 = k1 , . . . , Xn = kn ) = 1
    
      P            (n+1)                                     (n)
          kn+1 PP         (X1 = k1 , . . . , Xn+1 = kn+1 ) = PP (X1 = k1 , . . . , Xn = kn )
                                                                                           (2)
  The last equation is called the compatibility condition. It can be proved [16] from
  the compatibility condition that there exists a probability space (WP , ΨP , η)
  where η is a probability measure on ΨP , the minimal σ-algebra containing open
  sets of WP such that for any n,
                                                  (n)
               η(X1 = k1 , . . . , Xn = kn ) = PT (X1 = k1 , . . . , Xn = kn ).           (3)
    (n)                                                  (n)
  PP (X1 = k1 , . . . , Xn = kn ) is defined as PP (X1 = k1 , . . . , Xn = kn ) =
  p1 . . . pn where pi is the annotation of alternative ki in fact Fi . This definition
                                                 (n)
  clearly satisfies the properties in (2). PP (X1 = k1 , . . . , Xn = kn ) is then ex-
  tended to a probability measure over BP .
       We conjecture that this definition of the distribution semantics with function
  symbols coincides for definite programs with the one given above.
       To show that the two definition coincide, we conjecture that ΨP = ΩT .
  Moreover, X1 = k1 , . . . , Xn = kn is equivalent to the set of composite choices
  K = {{(F1 , ∅, k1 ), . . . , (Fn , ∅, kn )}} and µ(ωK ) gives p1 . . . pn which satisfies
  equation (3).



                                             79
The Distribution Semantics is Well-Defined for All Normal Programs

  6    Conclusions

  We have presented a definition of the distribution semantics in terms of an
  iterated fixpoint operator that allowed us to prove that the semantics is well
  defined for all programs. The operator we have presented is also interesting from
  an inference point of view, as it can be used for forward inference similarly to
  [7].


  References

   1. Poole, D.: Logic programming, abduction and probability - a top-down anytime
      algorithm for estimating prior and posterior probabilities. New Gen. Comp. 11
      (1993) 377–400
   2. Sato, T.: A statistical learning method for logic programs with distribution se-
      mantics. In: International Conference on Logic Programming. (1995) 715–729
   3. Sato, T., Kameya, Y.: Parameter learning of logic programs for symbolic-statistical
      modeling. J. Artif. Intell. Res. 15 (2001) 391–454
   4. Poole, D.: The Independent Choice Logic for modelling multiple agents under
      uncertainty. Artif. Intell. 94 (1997) 7–56
   5. Riguzzi, F., Swift, T.: Terminating evaluation of logic programs with finite three-
      valued models. ACM T. Comput. Log. 15 (2014) 32:1–32:38
   6. Przymusinski, T.: Every logic program has a natural stratification and an iterated
      least fixed point model. In: ACM Conference on Principles of Database Systems.
      (1989) 11–21
   7. Vlasselaer, J., Van den Broeck, G., Kimmig, A., Meert, W., De Raedt, L.: Anytime
      inference in probabilistic logic programs with Tp-compilation. In: Internation Joint
      Conference on Artificial Intelligence. (2015) 1852–1858
   8. Van Gelder, A., Ross, K.A., Schlipf, J.S.: The well-founded semantics for general
      logic programs. J. ACM 38 (1991) 620–650
   9. De Raedt, L., Kimmig, A., Toivonen, H.: ProbLog: A probabilistic Prolog and
      its application in link discovery. In: Internation Joint Conference on Artificial
      Intelligence. (2007) 2462–2467
  10. Kolmogorov, A.N.: Foundations of the Theory of Probability. Chelsea Publishing
      Company, New York (1950)
  11. Srivastava, S.: A Course on Borel Sets. Graduate Texts in Mathematics. Springer
      (2013)
  12. Poole, D.: Abducing through negation as failure: stable models within the inde-
      pendent choice logic. J. Logic Program. 44 (2000) 5–35
  13. Poole, D.: Probabilistic Horn abduction and Bayesian networks. Artif. Intell. 64
      (1993)
  14. Cohn, P.: Basic Algebra: Groups, Rings, and Fields. Springer (2003)
  15. Brannan, D.: A First Course in Mathematical Analysis. Cambridge University
      Press (2006)
  16. Chow, Y., Teicher, H.: Probability Theory: Independence, Interchangeability, Mar-
      tingales. Springer Texts in Statistics. Springer (2012)




                                           80
The Distribution Semantics is Well-Defined for All Normal Programs

  A        Proofs of Theorems

  Proposition 5 OpT rueCIP and OpFalseC P
                                        I are monotonic.

      Proof: Let us consider OpT rueCIP . We have to prove that if Tr 1 ≤ Tr 2
  then OpT rueCIP (Tr 1 ) ≤ OpT rueCIP (Tr 2 ). Tr 1 ≤ Tr 2 means that ∀(a, La ) ∈
  Tr 1 , (a, Ma ) ∈ Tr 2 : La ⊆ Ma . Let (a, L0a ) be the elements of OpT rueCIP (Tr 1 )
  and (a, Ma0 ) the elements of OpT rueCIP (Tr 2 ). We have to prove that L0a ⊆ Ma0
      If a ∈ F then L0a = Ma0 = {{(a, θ, 1)}}. If a ∈ BP \ F, then L0a and Ma0 have
  the same structure. Since ∀b ∈ BP : Lb ⊆ Mb , then L0a ⊆ Ma0
      We can prove similarly that OpFalseC P      I is monotonic.                ♦
  Proposition 6 IFPC P is monotonic.
      Proof: We have to prove that if I1 ≤ I2 then IFPC P (I1 ) ≤ IFPC P (I2 ).
  I1 ≤ I2 means that ∀(a, La , L¬a ) ∈ I1 , (a, Ma , M¬a ) ∈ I2 : La ⊆ Ma , L¬a ⊆
  M¬a . Let (a, L0a , L0¬a ) be the elements of IFPC P (I1 ) and (a, Ma0 , M¬a
                                                                            0
                                                                               ) the el-
                    P                                0        0      0         0
  ements of IFPC (I2 ). We have to prove that La ⊆ Ma and L¬a ⊆ M¬a               . This
                                                 P                P
  follows from the montonicity of OpT rueCI1 and OpFalseC I2 in I1 and I2 re-
  spectively, which can be proved as in Proposition 5.                               ♦
  Lemma 9 For a ground probabilistic logic program P with probabilistic facts
  F, rules R and atoms BP , let Lαa be the formula associated with atom a in
  OpT rueCIP ↑ α. For every atom a, total choice σ and iteration α, we have:
                                    wσ ∈ ωLαa → WFM (wσ |I) |= a
  where wσ |I is obtained by adding to wσ the atoms a for which (a, Ka , K¬a ) ∈ I
  and wσ ∈ Ka as facts and by removing all the rules with a in the head for which
  (a, Ka , K¬a ) ∈ I and wσ ∈ K¬a .
  Proof: Let us prove the lemma by transfinite induction: let as assume the thesis
  for all β < α and let us prove it for α. If α is a successor ordinal, then it is easily
  verified for a ∈ F. Otherwise assume wσ ∈ ωLαa where
                  [
  Lαa =                       ((Lα−1                    α−1
                                 b1 ∪Kb1 )⊗. . . ⊗(Lbn ∪Kbn )⊗K¬c1 ⊗. . .⊗K¬cm )
           a←b1 ,...,bn ,¬c1 ,...,cm ∈R

  This means that there is rule a ← b1 , . . . , bn , ¬c1 , . . . , cm ∈ R such that wσ ∈
  ωLα−1 ∪Kb for i = 1, . . . , n and wσ ∈ ωK¬cj for j = 1 . . . , m. By the induc-
      bi       i
  tive assumption and because of how wσ |I is built then WFM (wσ |I) |= bi and
  WFM (wσ |I) |= ¬cj so WFM (wσ |I) |= a.
      If α is a limit ordinal, then
                                                     [
                            Lα          β
                              a = lub({La |β < α}) =   Lβa
                                                          β<α

  If wσ ∈ ωLαa then there must exist a β < α such that wσ ∈ ωLβa . By the inductive
  assumption the hypothesis holds.                                               ♦



                                                81
The Distribution Semantics is Well-Defined for All Normal Programs

  Lemma 10 For a ground probabilistic logic program P with probabilistic facts
                                  α
  F, rules R and atoms BP , let M¬a  be the set of composite choices associated
  with atom a in OpFalseC P
                          I ↓ α. For every atom a, total choice σ and iteration
  α, we have:
                       wσ ∈ ωM¬a → WFM (wσ |I) |= ¬a
  where wσ |I is built as in Lemma 9.

  Proof: Similar to the proof of Theorem Lemma 9.                                 ♦

  Lemma 11 For a ground probabilistic logic program P with probabilistic facts
  F, rules R and atoms BP , let Kaα and K¬a
                                         α
                                            be the formulas associated with atom
            P
  a in IFPC ↑ α. For every atom a, total choice σ and iteration α, we have:

                           wσ ∈ ωKaα → WFM (wσ ) |= a                           (4)
                          wσ ∈ ωK¬a
                                 α → WFM (wσ ) |= ¬a                            (5)

  Proof: Let us first prove that for all α, WFM (wσ ) = WFM (wσ |IFPC P ↑ α).
  We can prove it by transfinite induction. Consider the case of α a successor
  ordinal. Consider an atom b. If wσ 6∈ ωKbα and wσ 6∈ ωK¬bα then the rules for b in

                     P
  wσ and wσ |IFPC ↑ α are the same. If wσ ∈ ωKb then b is a fact in wσ |IFPC P ↑
                                                   α

  α but, according to Lemma 9, WFM (wσ |IFPC P ↑ (α−1)) |= b. For the inductive
  hypothesis WFM (wσ ) |= b so b has the same truth value in WFM (wσ ) and
  WFM (wσ |IFPC P ↑ α). Similarly, if wσ ∈ ωK¬b   α , then WFM (wσ ) |= ¬b and b

  has the same truth value in WFM (wσ ) and WFM (wσ |IFPC P ↑ α). So overall
  WFM (wσ ) = WFM (wσ |IFPC P ↑ α).
                                        S                     S
     If α is a limit ordinal, then Kbα = β<α Kbβ and K¬b α
                                                            = β<α Kbβ . So if wσ ∈
  ωKbα there is a β such wσ ∈ ωK β and for the inductive hypothesis WFM (wσ ) |= b
                                 b

  so b has the same truth value in WFM (wσ ) and WFM (wσ |IFPC P ↑ α). Similarly
  if wσ ∈ ωK¬bα .

      We can now prove the lemma by transfinite induction. Consider the case of
                                                       P
  α a successor ordinal. Since (a, Kaα ) ∈ lfp(OpT rueCIFPC ↑(α−1) ), by Lemma 9


                  wσ ∈ ωKaα → WFM (wσ |IFPC P ↑ (α − 1)) |= a

  Since WFM (wσ |IFPC P ↑ (α − 1)) = WFM (wσ ), (4) is proved.
                α
     Since (a, K¬a ) ∈ gfp(OpFalseC P
                                    IFPC P ↑(α−1) ), by Lemma 10

                                          P
                 wσ ∈ ωK¬a
                        α → WFM (wσ |IFPC   ↑ (α − 1)) |= ¬a

  Since WFM (wσ |IFPC P ↑ (α − 1)) = WFM (wσ ), (5) is proved.
                                   S                     S
     If α is a limit ordinal, Kaα = β<α Kaβ and K¬aα
                                                      = β<α Kaβ . If wσ ∈ ωKaα
  there is a β such that wσ ∈ ωKbα and by the inductive hypothesis (4) is proved.
  Similarly for (5).                                                           ♦



                                        82
The Distribution Semantics is Well-Defined for All Normal Programs

  Lemma 12 For a ground probabilistic logic program P with probabilistic facts
  F, rules R and atoms BP , let Kaα and K¬a
                                         α
                                            be the formulas associated with atom
  a in IFPC P ↑ α. For every atom a, total choice σ and iteration α, we have:
                                 a ∈ IFP wσ ↑ α → wσ ∈ Kaα
                                ¬a ∈ IFP wσ ↑ α → wσ ∈ K¬a
                                                        α


  Proof: Let us prove it by double transfinite induction. If α is a successor ordinal,
  assume that
                        a ∈ IFP wσ ↑ (α − 1) → wσ ∈ Kaα−1
                            ¬a ∈ IFP wσ ↑ (α − 1) → wσ ∈ K¬a
                                                          α−1

  Let us perform transfinite induction on the iterations of OpTrue P
                                                                   IFPC P ↑(α−1) .
  Let us consider a successor ordinal δ: assume that
                     a ∈ OpTrue w                               δ−1
                                IFP wσ ↑(α−1) ↑ (δ − 1) → wσ ∈ La
                                 σ




                    ¬a ∈ OpFalse w                               δ−1
                                 IFP wσ ↑(α−1) ↓ (δ − 1) → wσ ∈ M¬a
                                  σ



  and prove that
                          a ∈ OpTrue w                         δ
                                     IFP wσ ↑(α−1) ↑ δ → wσ ∈ La
                                       σ



                        ¬a ∈ OpFalse w                         δ
                                     IFP wσ ↑(α−1) ↓ δ → wσ ∈ M¬a
                                      σ



  Consider a. If a ∈ F then it is easily proved.
        For other atoms a, a ∈ OpTrue w         IFP wσ ↑(α−1) ↑ δ means that there is a rule a ←
                                                 σ


  b1 , . . . , bn , ¬c1 , . . . , cm such that for all i = 1, . . . , n bi ∈ OpTrue w
                                                                                    IFP wσ ↑(α−1) ↑
                                                                                      σ

                                                              wσ
  (δ − 1) and for all j = 1, . . . , m ¬cj ∈ IFP                     ↑ (α − 1). For the inductive
  hypothesis ∀i : wσ ∈ Lδ−1            bi  ∨ w σ  ∈  K  α−1
                                                       bi   and     ∀j  : wσ ∈ K¬cα−1
                                                                                    j
                                                                                        so, for the
                                   wσ                    δ
  definition of OpTrue IFP wσ ↑(α−1) , wσ ∈ La . Analogously for ¬a.
                                                    S                         N
        If δ is a limit ordinal, then Lδa = µ<δ Lµa and M¬a              δ
                                                                            = µ<δ M¬a   µ
                                                                                          . For the
  inductive hypothesis for all µ < δ
                          a ∈ OpTrue w                         µ
                                     IFP wσ ↑(α−1) ↑ µ → wσ ∈ La
                                      σ



                        ¬a ∈ OpFalse w                         µ
                                     IFP wσ ↑(α−1) ↓ µ → wσ ∈ M¬a
                                       σ



  If a ∈ OpTrue w    IFP wσ ↑(α−1) ↑ δ, then there exists a µ < δ such that a ∈
                      σ


  OpTrue w IFP wσ ↑(α−1) ↑ µ. For the inductive hypothesis, wσ ∈ La .
            σ                                                        δ

      If ¬a ∈ OpFalse IFP wσ ↑(α−1) ↓ δ, then, for all µ < δ, ¬a ∈ OpFalse w
                        wσ
                                                                           IFP wσ ↑(α−1) ↓
                                                                             σ


  µ. For the inductive hypothesis, wσ ∈SMaδ .                      S
      Consider a limit α. Then Kaα = β<α Kaβ and K¬a          α
                                                                 = β<α K¬a  β
                                                                               . The in-
  ductive hypothesis is
                                a ∈ IFP wσ ↑ β → wσ ∈ Kaβ
                                ¬a ∈ IFP wσ ↑ β → wσ ∈ K¬a
                                                        β

  If a ∈ IFP wσ ↑ α, then there exists a β < α such that a ∈ IFP wσ ↑ β. For the
  inductive hypothesis wσ ∈ Kaβ so wσ ∈ Kaα . Similarly for ¬a.               ♦




                                               83
The Distribution Semantics is Well-Defined for All Normal Programs

  Theorem 7 For a ground probabilistic logic program P with atoms BP , let Kaα
        α
  and K¬a  be the formulas associated with atom a in IFPC P ↑ α. For every atom
  a and total choice σ, there is an iteration α0 such that for all α > α0 we have:

          wσ ∈ ωKaα ↔ WFM (wσ ) |= a         wσ ∈ ωK¬a
                                                    α ↔ WFM (wσ ) |= ¬a



     Proof: The → direction is Lemma 11. In the other direction, WFM (wσ ) |= a
  implies ∃α0 ∀α ≥ α0 : IFP wσ ↑ α |= a. For Lemma 12, wσ ∈ ωKaα . WFM (wσ ) |=
  ¬a implies ∃α0 ∀α ≥ α0 : IF P wσ ↑ α |= ¬a. For Lemma 12, wσ ∈ ωK¬a
                                                                    α .  ♦
  Theorem 8 For a ground probabilistic logic program P, let Kaα and K¬a α
                                                                           be the
                                            P
  formulas associated with atom a in IFPC ↑ α. For every atom a and every
  iteration α, Kaα and K¬a
                        α
                           are countable sets of countable composite choices.
                                                                         P
     Proof: It can be proved by observing that each iteration of OpT rueCIFPC P ↑β

  and OpFalseC P  IFPC P ↑β generates countable sets of countable explanations since
  the set of rules is countable.                                                  ♦




                                        84