The Complexity of Probabilistic EL

Vı́ctor Gutiérrez Basulto1 , Jean Christoph Jung1 , Carsten Lutz1 , and Lutz Schröder1,2
      1
          University Bremen, Germany, {victor,jeanjung,clu}@informatik.uni-bremen.de
                      2
                        DFKI Bremen, Germany, lutz.schroeder@dfki.de


       Abstract. We analyze the complexity of subsumption in probabilistic variants of
       the description logic EL. In the case where probabilities apply only to concepts,
       we map out the borderline between tractability and E XP T IME-completeness. One
       outcome is that any probability value except zero and one leads to intractability
       in the presence of general TBoxes, while this is not the case for classical TBoxes.
       In the case where probabilities can also be applied to roles, we show PS PACE-
       completeness. This result is (positively) surprising as the best previously known
       upper bound was 2-E XP T IME and there were reasons to believe in completeness
       for this class.


1   Introduction
The fact that traditional description logics (DLs) do not provide any built-in means for
representing uncertainty has led to various proposals for probabilistic extensions, see
for example [14, 10, 5, 13, 12] and references therein. Recently, a new family of prob-
abilistic DLs was introduced in [15], with the distinguishing feature that its members
relate to the well-established probabilistic first-order logic (FOL) of Halpern and Bac-
chus [7, 4] in the same way as classical DLs relate to traditional FOL. The main purpose
of DLs from the new family, from now on called Prob-DLs, is to enable concept defini-
tions that require reference to (degrees of) possibility, likelihood, and certainty. To this
effect, Prob-DLs provide a probabilistic constructor P∼p with ∼ ∈ {<, ≤, =, ≥, >}
and p ∈ [0, 1] that can be applied to concepts and sometimes also to roles. For example,

                    Patient u ∃finding.(Disease u P>0.25 Infectious)

describes patients having a disease that is infectious with probability at least .25. It
was argued in [15] that Prob-DLs are well-suited to capture aspects of uncertainty in
biomedical ontologies such as S NOMED CT. Since such ontologies are often formulated
in DLs from the EL family for which subsumption can be solved in polynomial time [2,
16], probabilistic extensions of EL in the style of Prob-DLs is particularly relevant in
this context. Some initial results have already been obtained in [15].
    In this paper, we establish a rather complete picture of the complexity of subsum-
tion in Prob-DLs based on EL. In the first part, we consider Prob-EL in which prob-
abilities can only be applied to concepts, but not to roles. In [15], it was shown that
some concrete combinations of probability constructors such as P>0 and P>0.4 lead
to intractability (in fact, E XP T IME-completeness) of subsumption while a restriction
to the probability values zero and one does not. Here, we prove the much more gen-
eral result that the extension of EL with any single concept constructor P∼p , where
∼ ∈ {<, ≤, =, ≥, >} and p ∈ (0, 1), results in E XP T IME-completeness. More specif-
ically, this result applies to general TBoxes, i.e., to sets of concept inclusions C v D
when ∼ ∈ {=, ≥, >} and even to the empty TBox when ∼ ∈ {<, ≤, }. Inspired by
the observation that many biomedical ontologies such as S NOMED CT are classical
TBoxes, i.e., sets of concept definitions A ≡ D with atomic and unique left-hand sides,
we then show that probabilities other than zero and one can be used without losing
tractability in (possibly cyclic) classical TBoxes for the cases ∼ ∈ {>, ≥}. More pre-
cisely, subsumption in Prob-EL is tractable when only the constructors P∼p and P=1
are admitted, for any (single!) choice of ∼ ∈ {≥, >} and p ∈ (0, 1). The resulting logic
actually ‘coincides’ for all possible choices. We also show that when a second probabil-
ity value from the range (0, 1) sufficiently ‘far away’ from p is added, the complexity
of subsumption snaps back to E XP T IME-completeness.
     In the second part of the paper, we consider Prob-ELr , where probabilities can be
applied to both concepts and roles, concentrating on general TBoxes. While decidability
is an open problem for full Prob-ELr , it was known that subsumption is in 2-E XP T IME
and PS PACE-hard in Prob-EL>0;=1 r    , where probability values are restricted to zero and
one. Since subsumption in the ALC-version of Prob-EL>0;=1   r      is 2-E XP T IME-complete
and the complexity of the EL-version and the ALC-version of many-dimensional DLs
(such as Prob-DLs) coincides in all known cases, it was thus tempting to conjecture
2-E XP T IME-completeness also of subsumption in Prob-EL>0;=1    r      . We show that this
is not the case by establishing a tight PS PACE upper bound for subsumption in Prob-
EL>0;=1
    r      . This also implies PS PACE-completeness for the two-dimensional DL S5EL ,
in sharp contrast with the 2-E XP T IME-completeness of S5ALC .
    This paper is a workshop version of [6]. Proofs can be found in the long version of
that paper, to be found at http://www.informatik.uni-bremen.de/˜clu/papers/.


2   Preliminaries

Let NC and NR be countably infinite sets of concept names and role names. Prob-EL is
the extension of EL that allows the application of probabilities to concepts, i.e., Prob-
EL concepts are built according to the rule

                        C, D ::= > | A | C u D | ∃r.C | P∼p C

where A ranges over NC , r over NR , ∼ over {<, ≤, =, ≥, >}, and p ∈ [0, 1]. The
concept P∼p C denotes the class of objects that are an instance of C with probability
∼ p. For example, the S NOMED CT concept ‘animal bite by potentially rabid animal’
can be expressed as

                       Bite u ∃by.(Animal u P>0.5 ∃has.Rabies).

When we admit only a few values for ∼ and n, we put them in superscript; for example,
Prob-EL>0.4,<0.1 denotes the extension of EL with P>0.4 C and P<0.1 C. Probabilities
can be applied to roles using the concept constructors ∃P∼p r.C where ∼ and p range
over the same values as above, expressing that there is an element satisfying C that is
related to the current element by the role name r with probability ∼ p. For example, the
S NOMED CT concept ‘disease of possible viral origin’ can be modeled as

                              Disease u ∃P>0 origin.Viral.

We denote the extension of Prob-EL with the constructor ∃P∼p r.C with Prob-ELr . We
also consider the restriction Prob-EL>0;=1
                                        r    of Prob-ELr to probabilities 0 and 1 (both
on concepts and roles).
    The semantics of the probabilistic DLs is given in terms of a probabilistic interpre-
tation I = (∆I , W, (Iw )w∈W , µ), where ∆I is the (non-empty) domain, W a non-
empty set of possible worlds, µ a discrete probability distribution on W , and for each
w ∈ W , Iw is a classical DL interpretation with domain ∆I . We usually write C I,w
for C Iw , and likewise for rI,w . For concept names A and role names r, we define the
probability

 – pId (A) that d ∈ ∆I is an A as µ({w ∈ W | d ∈ AI,w });
 – pId,e (r) that d, e ∈ ∆I are related by r as µ({w ∈ W | (d, e) ∈ rI,w }).

Next, we extend pId (A) to compound concepts C and define the extension C I,w of
compound concepts by mutual recursion on C. The definition of pId (C) is exactly as in
the base case, with A replaced by C. The extension of compound concepts is defined as
follows:
                        >I,w = ∆I (C u D)I,w = C I,w ∩ DI,w
                   (∃r.C)I,w = {d ∈ ∆I | ∃e.(d, e) ∈ rI,w ∧ e ∈ C I,w }
                 (P∼p C)I,w = {d ∈ ∆I | pId (C) ∼ p}
              (∃P∼p r.C)I,w = {d ∈ ∆I | ∃e ∈ C I,w : pId,e (r) ∼ p}
A general TBox is a finite set of concept inclusions C v D, where C, D are concepts.
A classical TBox is a set of concept definitions A ≡ C, where A is a concept name
and the left-hand sides of concept definitions are unique. Note that cyclic definitions
are allowed.
     A probabilistic interpretation I satisfies a concept inclusion C v D if C I,w ⊆
   I,w
D      and a concept definition A ≡ C if AI,w = C I,w , for all w ∈ W . I is a model of
a TBox T if it satisfies all inclusions/definitions in T . A concept C is subsumed by a
concept D relative to a TBox T (written T |= C v D) if every model I of T satisfies
C v D.
     The above definition is the result of transferring the notion of subsumption from
standard DLs to probabilistic DLs in a straightforward way. However, there is an al-
ternative variant of subsumption that is natural for probabilistic DLs: a concept C is
positively subsumed by a concept D relative to a TBox T (written T |=+ C v D)
if C I,w ⊆ DI,w for every probabilistic model I = (∆I , W, (Iw )w∈W , µ) and every
w ∈ W with µ(w) > 0. Intuitively, classical subsumption is about subsumptions that
are logically implied whereas positive subsumption is about subsumptions that are cer-
tain. For example, when T∅ is the empty TBox, then T∅ 6|= P=1 A v A, but we can only
have d ∈ (P=1 A)I,v \AI,v when µ(v) = 0, thus non-subsumption is only witnessed by
worlds that we are certain to not be the actual world. Consequently, T∅ |=+ P=1 A v A.
In the extension Prob-ALC of Prob-EL with negation studied in [15], positive sub-
sumption can easily be reduced to subsumption. This does not seem easily possible in
Prob-EL. In fact, we will sometimes use (Turing) reductions in the opposite direction.


3   Probabilistic Concepts

In [15], it was shown that subsumption in Prob-EL>0;=1 with general TBoxes is in
PT IME, whereas the same problem is E XP T IME-complete in Prob-EL>0;>0.4 (both in
the positive and in the unrestricted case). This raises the question whether any proba-
bility except 0,1 can be admitted in Prob-EL without losing tractability. The following
theorem provides a strong negative result.
Theorem 1. For all p ∈ (0, 1), (positive) subsumption in Prob-EL∼p relative to

 1. general TBoxes is E XP T IME-hard when ∼ ∈ {=, >, ≥}
 2. the empty TBox is E XP T IME-hard when ∼ ∈ {≤, <}

Matching upper bounds are an immediate consequence of the fact that each logic Prob-
EL∼p is a fragment of the DL Prob-ALC c for which subsumption was proved E X -
P T IME -complete in [15]. To prove the lower bounds, it suffices to show that each
logic Prob-EL∼p is non-convex, i.e., that there are a general TBox T and concepts
C, D1 , . . . , Dn , n ≥ 2, such that T |= C v D1 t · · · t Dn , but T 6|= C v Di for all i.
Once that this is established, standard proof techniques from [2] can be used to reduce
satisfiability in ALC relative to general TBoxes, which is E XP T IME-complete, to sub-
sumption in Prob-EL∼p . The following constructions work for standard subsumption
and positive subsumption alike.
     First consider ∼ = ≥ and assume p ≤ 0.5. Fix a k > 0 such that k · p > 1 and set

                      T = {Ai u Aj v P≥p Bij | 1 ≤ i < j ≤ k}
                      C = P≥p A1 u · · · u P≥p Ak
                     Dij = P≥p Bij

Intuitively, the probabilities stipulated by C sum up to > 1, thus some of the Ai have
to overlap, but there is a choice as to which ones these are. Formally, we can show non-
convexity by proving that T |= C v t1≤i<j≤k Dij , but T 6|= C v Dij for any i, j.
The comparisons ∼ ∈ {=, >} can be handled similarly.
    Now assume that p > 0.5. We start with the case ∼ = > and use a variation of the
above. The main idea is to employ P>p C to simulate P>q C, for some q ≤ 0.5, which
brings us back to a case already dealt with. More precisely, let n > 0 be smallest such
            1
that n > 2(1−p) and set q = pn − n + 1. An easy computation shows that 0 ≤ q < 0.5.
Moreover, it can be shown that

                    P>p X1 u · · · u P>p Xn v P>q (X1 u · · · u Xn )

which allows us to redo the above reduction with probability q < 0.5. Details are given
in the long version of [6]. The comparisons ∼ ∈ {=, ≥} can be handled similarly.
    For the remaining cases ∼ ∈ {<, ≤}, there is a very simple argument for non-
convexity even w.r.t. the empty TBox: we have > v P<p A t P<p P<p A, but neither
> v P<p A nor > v P<p P<p A, and likewise when ∼ is ≤.
    When ∼ ∈ {=, >, ≥}, the proof of Theorem 1 relies on general TBoxes in a crucial
way. It turns out that when we restrict ourselves to classical TBoxes, tractability can be
attained even with probabilities other than 0 and 1.
Theorem 2. For all ∼ ∈ {>, ≥} and p ∈ [0, 1], (positive) subsumption in Prob-
EL∼p;=1 relative to classical TBoxes is in PT IME.
To prove Theorem 2, we start with positive subsumption. We can assume p > 0 since
subsumption in Prob-EL>0;=1 is in PT IME even with general TBoxes. To prove a
PT IME upper bound, we use a ‘consequence-driven’ procedure similar to the ones in
[2, 11]. A concept name A is defined in a classical TBox T if there is a concept defi-
nition A ≡ C ∈ T , and primitive otherwise. We can w.l.o.g. restrict our attention to
the subsumption of defined concept names relative to TBoxes. We also assume that the
input TBox is normalized to a set of concept definitions of the form

                         A ≡ P1 u · · · u Pn u C1 u · · · u Cm

n, m ≥ 0, and where P1 , . . . , Pn are primitive concept names and C1 , . . . , Cm are of
the form P∼p A, P=1 A, and ∃r.A with A a defined concept name (note that the top
concept is completely normalized away). It is well-known that such a normalization
can be achieved in polytime, see [1] for details. For a given TBox T and a defined
concept name A in T , we write CA to denote the defining concept for A in T , i.e.,
A ≡ CA ∈ T . Moreover, we deliberately confuse the concept CA = D1 u · · · u Dk
with the set {D1 , . . . , Dk }. We define a set of concepts ‘certain for CA ’ as
                                                              [
                   cert(CA ) = {P∗ B | P∗ B ∈ CA } ∪                 {CB }
                                                       P=1 B∈CA

where, here and in what follows, P∗ ranges over P=1 and P>p . Intuitively, cert(CA )
contains concepts that hold with probability 1 whenever A is satisfied in some world.
The algorithm starts with the normalized input TBox and then exhaustively applies the
completion rules displayed in Figure 1. As a general proviso, each rule can be applied
only if it adds a concept that occurs in T and actually changes the TBox, e.g., R1 can
only be applied when ∃r.B 0 occurs in T and ∃r.B 0 ∈ / CA . Exemplarily, we explain rule
R5 in more detail. If all defining concepts CB of B are certain for A, then A v P=1 B,
thus we can add P=1 B to CA . Let T ∗ be the result of exhaustive rule application and
      ∗
let CA  be the defining concept for A in T ∗ , for all concept names A. The ‘only if’
direction requires a careful and surprisingly subtle model construction.
                                                                      ∗    ∗
Lemma 1. For all defined concept names A, B, we have T |=+ A v B iff CB ⊆ CA .
It is easy to see that TBox completion requires only polytime: every rule application
extends the TBox, but both the number of concept definitions and of conjuncts in each
concept definition is bounded by the size of the original TBox.
     To prove Theorem 2 for unrestricted subsumption, we provide a Turing reduction
from unrestricted subsumption to positive subsumption. We again assume that the input
 R1 If ∃r.B ∈ CA , and CB 0 ⊆ CB
    then replace A ≡ CA with A ≡ CA ∪ {∃r.B 0 }
 R2 If P=1 B ∈ CA
    then replace A ≡ CA with A ≡ CA ∪ CB
 R3 If P=1 B ∈ CA
    then replace A ≡ CA with A ≡ CA ∪ {P∼p B}
 R4 If P∼p B ∈ CA , and D ∈ cert(CB )
    then replace A ≡ CA with A ≡ CA ∪ {D}
 R5 If CB ⊆ cert(CA )
    then replace A ≡ CA with A ≡ CA ∪ {P=1 B}
 R6 If P∼p B ∈ CA and CB 0 ⊆ cert(CA ) ∪ CB
    then replace A ≡ CA with A ≡ CA ∪ {P∼p B 0 }

                    Fig. 1. TBox completion rules for positive subsumption


TBox is in the described normal form and then exhaustively apply the rules shown in
Figure 2, calling the result T ∗ with defining concept of the form CA
                                                                    ∗
                                                                      .
                                                                     ∗    ∗
Lemma 2. For all defined concept names A, B, we have T |= A v B iff CB ⊆ CA .

Clearly, the Turing reduction and thus the overall algorithm runs in polytime.
    It is interesting to note that the proof of Theorem 2 is based on exactly the same
algorithm, for all ∼ ∈ {≥, >} and p ∈ (0, 1]. It follows that there is in fact only a
single logic Prob-EL∼p , for all such ∼ and p. Formally, given a Prob-EL∼p -concept
C, ≈ ∈ {≥, >} and q ∈ (0, 1], let C≈q denote the result of replacing each subconcept
P∼p D in C with P≈q D in C and similarly for Prob-EL∼p -TBoxes T .

Theorem 3. For any p, q > 0, ∼, ≈ ∈ {>, ≥}, EL∼p -concepts C, D and -TBox T ,
we have T |=+ C v D iff T≈q |=+ C≈q v D≈q , and likewise for unrestricted
subsumption.

Consequently, the (potentially difficult!) choice of a concrete ∼ ∈ {≥, >} and p ∈
(0, 1] is moot. In fact, it might be more intuitive to replace the constructor P∼p C with
a constructor L C that describes elements which ‘are likely to be a C’, and to replace
P=1 C with the constructor C C to describe elements that ‘are certain to be a C’, see e.g.
[8, 9] for other approaches to logics of likelihood. Note that the case p = 0 is differ-
ent from the cases considered above: for example, we have T∅ |=+ ∃r.A v ∃r.P>p A
iff p = 0, and likewise T∅ |= P>p ∃r.A v P>p ∃r.P>p A iff p = 0. In the spirit of
the constructors C and L, P>0 C can be replaced with a constructor PC that describes
elements for which ‘it is possible that they are a C’. For example, the S NOMED CT con-
cepts ‘definite thrombus’ and ‘possible thrombus’ can then be written as C Thrombus
and P Thrombus (although we speculate that the S NOMED CT designers mean ‘likely’
rather than ‘possible’).
 S1 If ∃r.B ∈ CA , and CB 0 ⊆ CB
    then replace A ≡ CA with A ≡ CA ∪ {∃r.B 0 }
 S2 If T |=+ cert(CA ) v P∗ B
    then replace A ≡ CA with A ≡ CA ∪ {P∗ B}

                     Fig. 2. TBox completion rules for Turing reduction


     It is a natural question whether the PT IME upper bound for classical TBoxes extends
to the case of multiple probability values (except one, which is apparently always un-
critical). The following result shows that many combinations of two probability values
lead to (non-convexity, thus) intractability, even without any TBox.
Theorem 4. Let ∼ ∈ {>, ≥}, and p, q ∈ [0, 1). Then (positive) subsumption in Prob-
EL∼p;∼q relative to the empty TBox is E XP T IME-hard if (i) q = 0, (ii) p ≤ 1/2 and
q < p2 , or more generally (iii) 2p − 1 < q < p2 .
In particular, we cannot combine the constructors P and L mentioned above without
losing tractability. The above formulation of Theorem 4 is actually only a consequence
of a more general (but also more complicated to state) result established in the long
version of [6]. We conjecture that (positive) subsumption in Prob-EL∼p;∼q relative to
classical TBoxes is in PT IME whenever p ≥ q ≥ p2 and that, otherwise, it is E XP T IME-
hard.


4   Probabilistic Roles
Adding probabilistic roles to Prob-EL tends to increase the complexity of subsump-
tion. While for full Prob-ELr even decidability is open, it was shown in [15] that sub-
sumption is in 2-E XP T IME and PS PACE-hard in Prob-EL>0;=1  r     . As discussed in the
introduction, there were reasons to believe that this problem is actually 2-E XP T IME-
complete. We show that this is not the case by proving a PS PACE upper bound, thus
establishing PS PACE-completeness. This result holds both for positive and unrestricted
subsumption, we start with the positive case.
    We again concentrate on subsumption between concept names and assume that the
input TBox is in a certain normal form, defined as follows. A basic concept is a concept
of the form >, A, P>0 A, P=1 A, or ∃α.A, where A is a concept name and, here and in
what follows, α is a role, i.e., of the form r, P>0 r, or P=1 r with r a role name. Now,
every concept inclusion in the input TBox is required to be of the form

                                   X1 u · · · u Xn v X

with X1 , . . . , Xn , X basic concepts. It is not hard to show that every TBox can be trans-
formed into this normal form in polynomial time such that (non-)subsumption between
the concept names that occur in the original TBox is preserved.
    Let T be the input TBox in normal form, CN the set of concept names that occur
in T , BC the set of basic concepts in T , and ROL the set of roles in T . Call a role
probabilistic if it is of the form P=1 r or P>0 r. Our algorithm maintains the following
data structures:

 – a mapping Q that associates with each A ∈ CN a subset Q(A) ⊆ BC such that
   T |= A v X for all X ∈ Q(A);
 – a mapping Qcert that associates with each A ∈ CN a subset Qcert (A) ⊆ BC such
   that T |= A v P=1 X for all X ∈ Qcert (A);
 – a mapping R that associates with each probabilistic role α ∈ ROL a binary relation
   R(α) on CN such that T |= A v P>0 (∃α.B) for all (A, B) ∈ R(α).

Some intuition about the data structures is already provided above; e.g., X ∈ Q(A)
means that T |= A v X. However, there is also another view on these structures
that will be important in what follows: they represent an abstract view of a model of
T , where each set Q(A) describes the concept memberships of a domain element d
in a world w with d ∈ AI,w and R describes role memberships, i.e., when (A, B) ∈
R(α), then d ∈ AI,w implies that in some world v with positive probability, d has an
element described by Q(B) as an α-successor. In this context, Qcert (A) contains all
concepts that must be true with probability 1 for any domain element that satisfies A
in some world. Note that non-probabilistic roles r and probabilistic roles P=1 r are not
represented in the R(·) data structure; we will treat them in a more implicit way later
on.
    The data structures are initialized as follows, for all A ∈ CN and relevant roles α:

                 Q(A) = {>, A}            Qcert (A) = {>}            R(α) = ∅.

The sets Q(·), Qcert (·), and R(·) are then repeatedly extended by the application of
various rules. Before we can introduce these rules, we need some preliminaries. As the
first step, Figure 3 presents a (different!) set of rules that serves the purpose of saturating
a set of concepts Γ . We use cl(Γ ) to denote the set of concepts that is the result of
exhaustively applying the displayed rules to Γ , where any rule can only be applied if
the added concept is in BC, but not yet in Γ . The rules access the data structure Q(·)
introduced above and shall later be applied to the sets Q(A) and Qcert (A), but they will
also serve other purposes as described below. It is not hard to see that rule application
terminates after polynomially many steps.


 R1 If X1 u . . . u Xn v X ∈ T and X1 , . . . , Xn ∈ Γ then add X to Γ
 R2 If P=1 A ∈ Γ then add A to Γ
 R3 If ∃P=1 r.A ∈ Γ then add ∃r.A to Γ
 R4 If A ∈ Γ then add P>0 A to Γ
 R5 If ∃r.A ∈ Γ then add ∃P>0 r.A to Γ
 R6 If ∃α.A ∈ Γ and B ∈ Q(A) then add ∃α.B to Γ

                               Fig. 3. Saturation rules for cl(Γ )
    The rules that are used for completing the data structures Q(·), Qcert (·), and R(·)
are more complex and refer to ‘traces’ through these data structures.

Definition 1. Let B ∈ CN. A trace to B is a sequence S, A1 , α2 , A2 , . . . , αn , An where
 1. S = A for some P>0 A ∈ Q(A1 ) or S = (r, B) for some (A1 , B) ∈ R(P>0 r);
 2. each Ai ∈ CN and each αi ∈ ROL is a probabilistic role, such that An = B;
 3. (Ai , Ai−1 ) ∈ R(αi ) for 1 < i ≤ n.

If t is a trace of length n, we use tk , k ≤ n, to denote the trace S, A1 , α2 , . . . , αk , Ak .
Intuitively, the purpose of a trace is to deal with worlds that are generated by concepts
P>0 A and ∃P>0 r.A; there can be infinitely many such worlds as Prob-EL>0;=1      r        lacks
the finite model property, see [15]. The trace starts at some domain element represented
by a set Q(A1 ) in the world generated by the first element S of the trace, then repeat-
edly follows role edges represented by R(·) backwards until it reaches the final domain
element represented by Q(B). The importance of traces stems from the fact that infor-
mation can be propagated along them, as captured by the following notion.
Definition 2. Let t = S, A1 , α2 , . . . , αn , An be a trace of length n. Then the type
Γ (t) ⊆ BC of t is
• cl({A} ∪ Qcert (A1 )) if n = 1 and S = A;
• cl(Qcert (A1 ) ∪ {∃r.B 0 ∈ BC | B 0 ∈ Qcert (B)}) if n = 1 and S = (r, B);
• cl(Qcert (An ) ∪ {∃αn .B 0 ∈ BC | B 0 ∈ Γ (tn−1 )}) if n > 1.
Note that the rules R1 to R6 are used in every step of this inductive definition. The
mentioned propagation of information along traces is now as follows: if there is a trace
t to B, then any domain element that satisfies B in some world must satisfy the concepts
in Γ (t) in some other world. So if for example P>0 A ∈ Γ (t), we need to add P>0 A
also to Qcert (B) and to Q(B).
     Figure 4 shows the rules used for completing the data structures Q(·), Qcert (·), and
R(·). Note that S6 and S7 implement the propagation of information along traces,
as discussed above. Our algorithm for deciding (positive) subsumption starts with the
initial data structures defined above and then exhaustively applies the rules shown in
Figure 4. To decide whether T |=+ A v B, it then simply checks whether B ∈ Q(A).
Lemma 3. Let T be a Prob-EL>0;=1
                              r     -TBox in normal form and A, B be concept names.
Then T |=+ A v B iff, after exhaustive rule application, B ∈ Q(A).
We now argue that the algorithm can be implemented using only polynomial space.
First, it is easy to see that there can be only polynomially many rule applications: every
rule application extends the data structures Q(·), Qcert (·), and R(·), but these structures
consist of polynomially many sets, each with at most polynomially many elements. It
thus remains to verify that each rule application can be executed using only polyspace,
which is obvious for all rules except those involving traces, i.e., S6 and S7. For these
rules, we first note that it is not necessary to consider all (infinitely many!) traces. In
fact, a straightforward ‘pumping argument’ can be used to show that there is a trace
t to B with some relevant concept C ∈ Γ (t) iff there is a non-repeating such trace,
i.e., a trace t0 of length n such that for all distinct k, ` ≤ n, we have Γ (t0k ) 6= Γ (t0` ).
 S1 apply R1-R6 to Q(A) and Qcert (A)
 S2 if P∗ B ∈ Q(A)
    then add P∗ B to Qcert (A)
 S3 if C ∈ Qcert (A)
    then add P=1 C and C to Q(A)
 S4 If ∃α.B ∈ Q(A) with α a probabilistic role
    then add (A, B) to R(α).
 S5 If P>0 B1 ∈ Q(A), (B1 , B2 ) ∈ R(α), B3 ∈ Qcert (B2 )
    then add ∃α.B3 to Qcert (A)
 S6 if t is a trace to B and P∗ A ∈ Γ (t)
    then add P∗ A to Qcert (B)
 S7 if t is a trace to B and ∃α.A ∈ Γ (t) with α a probabilistic role

                       Fig. 4. The rules for completing the data structures.


Clearly, the length of non-repeating traces is bounded by 2m , m the size of T . To get
to polyspace, we use a non-deterministic approach, enabled by Savitch’s theorem: to
check whether there is a trace t to B with C ∈ Γ (t), we guess t step-by-step, at each
time keeping only a single Ai , αi and Γ (ti ) in memory. When we reach a situation
where Ai = B and C ∈ Γ (ti ), our guessing was successful and we apply the rule. We
also maintain a binary counter of the number of steps that have been guessed so far. As
soon as this counter exceeds 2m , the maximum length of non-repeating traces, we stop
the guessing and do not apply the rule. Clearly, this yields a polyspace algorithm.
Theorem 5. Deciding positive subsumption in Prob-EL>0;=1
                                                   r     with respect to general
TBoxes is PS PACE-complete.
As a byproduct, the proof of Lemma 3 yields a unique least model (in the sense of
Horn logic), thus proving convexity of Prob-EL>0;=1
                                                  r    . Note that positive subsumption
in Prob-EL>0;=1
            r      is actually the same  as subsumption in the two-dimensional descrip-
tion logic S5EL , which is thus also PS PACE-complete. Using a Turing reduction similar
to that shown in Figure 2, we can ‘lift’ the result from positive subsumption to unre-
stricted subsumption.
Theorem 6. Subsumption in Prob-EL>0;=1
                                 r     relative to general TBoxes is PS PACE-
complete.


5   Conclusion

We have established a fairly complete picture of the complexity of subsumption in Prob-
EL, although some questions remain open. We speculate that Theorem 2 can be proved
also when ∼ is = with only minor changes (e.g. rule R3 becomes unsound). It would be
interesting to verify the conjecture made below Theorem 4 that (positive) subsumption
in Prob-EL∼p;∼q relative to classical TBoxes is in PT IME whenever p ≥ q ≥ p2 and
that, otherwise, it is E XP T IME-hard relative to the empty TBox. It is even conceivable
that the conjectured PT IME result can be further generalized to any set of probability
values P ⊆ [0, 1] as long as q ≥ p2 whenever p, q ∈ P and p ≥ q. Moreover, variants
of Theorem 4 that involve, additionally or exclusively, the case where ∼ is = would
also be of interest.

Acknowledgements The present work was supported by the DFG project on proba-
bilistic description logics (LU1417/1-1) and DAAD-CONACYT grant 206550.

References
 1. Baader, F.: Terminological cycles in a description logic with existential restrictions. In: Proc.
    of the 18th Int. Joint Conf. on Artif. Intell. (IJCAI03). pp. 325–330. Morgan Kaufmann
    (2003)
 2. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope. In: Proc. of the 19th Int. Joint
    Conf. on Artif. Intell. (IJCAI05). pp. 364–369. Morgan Kaufmann (2005)
 3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The
    Description Logic Handbook. Cambridge University Press (2003)
 4. Bacchus, F.: Representing and Reasoning with Probabilistic Knowledge. MIT Press, Cam-
    bridge, MA (1990)
 5. da Costa, P.C.G., Laskey, K.B.: PR-OWL: A framework for probabilistic ontologies. In:
    Formal Ontology in Information Systems (FOIS06). Frontiers in Artif. Intell. and Applic.,
    vol. 150, pp. 237–249. IOS Press (2006)
 6. Gutierrez-Basulto, V., Jung, J.C., Lutz, C., Schröder, L.: A closer look at the probabilistic
    description logic Prob-EL. In: Proc. of the 25th AAAI Conf. on Artif. Intell. (AAAI11).
    AAAI Press (2011)
 7. Halpern, J.Y.: Reasoning About Uncertainty. MIT Press (2003)
 8. Halpern, J.Y., Rabin, M.O.: A logic to reason about likelihood. Artif. Intell. 32, 379–405
    (1987)
 9. Herzig, A.: Modal probability, belief, and actions. Fund. Inf. 57, 323–344 (2003)
10. Jaeger, M.: Probabilistic reasoning in terminological logics. In: Proc. of the 4th Int. Conf. on
    Princ. of Knowledge Repres. and Reas. (KR94), pp. 305–316. Morgan Kaufmann (1994)
11. Kazakov, Y.: Consequence-driven reasoning for Horn SHIQ ontologies. In: Proc. of the 21st
    Int. Joint Conf. on Artif. Intell. (IJCAI09). pp. 2040–2045 (2009)
12. Lukasiewicz, T.: Tractable probabilistic description logic programs. In: Proc. of the 1st Int.
    Conf. in Scalable Uncertainty Management (SUM07), number 4772 in LNCS, 143–156.
    Springer Verlag.
13. Lukasiewicz, T.: Expressive probabilistic description logics. Artif. Intell. 172, 852–883
    (2008)
14. Lukasiewicz, T., Straccia, U.: Managing uncertainty and vagueness in description logics for
    the semantic web. J. Web Sem. 6(4), 291–308 (2008)
15. Lutz, C., Schröder, L.: Probabilistic description logics for subjective uncertainty. In: Proc. of
    the 12th Int. Conf. on Princ. of Knowledge Repres. and Reas. (KR10). AAAI Press (2010)
16. Schulz, S., Suntisrivaraporn, B., Baader, F.: SNOMED CT’s problem list: Ontologists’ and
    logicians’ therapy suggestions. In: Proc. of The Medinfo 2007 Congress. Studies in Health
    Technology and Informatics (SHTI-series), IOS Press (2007)