The Complexity of Probabilistic EL Vı́ctor Gutiérrez Basulto1 , Jean Christoph Jung1 , Carsten Lutz1 , and Lutz Schröder1,2 1 University Bremen, Germany, {victor,jeanjung,clu}@informatik.uni-bremen.de 2 DFKI Bremen, Germany, lutz.schroeder@dfki.de Abstract. We analyze the complexity of subsumption in probabilistic variants of the description logic EL. In the case where probabilities apply only to concepts, we map out the borderline between tractability and E XP T IME-completeness. One outcome is that any probability value except zero and one leads to intractability in the presence of general TBoxes, while this is not the case for classical TBoxes. In the case where probabilities can also be applied to roles, we show PS PACE- completeness. This result is (positively) surprising as the best previously known upper bound was 2-E XP T IME and there were reasons to believe in completeness for this class. 1 Introduction The fact that traditional description logics (DLs) do not provide any built-in means for representing uncertainty has led to various proposals for probabilistic extensions, see for example [14, 10, 5, 13, 12] and references therein. Recently, a new family of prob- abilistic DLs was introduced in [15], with the distinguishing feature that its members relate to the well-established probabilistic first-order logic (FOL) of Halpern and Bac- chus [7, 4] in the same way as classical DLs relate to traditional FOL. The main purpose of DLs from the new family, from now on called Prob-DLs, is to enable concept defini- tions that require reference to (degrees of) possibility, likelihood, and certainty. To this effect, Prob-DLs provide a probabilistic constructor P∼p with ∼ ∈ {<, ≤, =, ≥, >} and p ∈ [0, 1] that can be applied to concepts and sometimes also to roles. For example, Patient u ∃finding.(Disease u P>0.25 Infectious) describes patients having a disease that is infectious with probability at least .25. It was argued in [15] that Prob-DLs are well-suited to capture aspects of uncertainty in biomedical ontologies such as S NOMED CT. Since such ontologies are often formulated in DLs from the EL family for which subsumption can be solved in polynomial time [2, 16], probabilistic extensions of EL in the style of Prob-DLs is particularly relevant in this context. Some initial results have already been obtained in [15]. In this paper, we establish a rather complete picture of the complexity of subsum- tion in Prob-DLs based on EL. In the first part, we consider Prob-EL in which prob- abilities can only be applied to concepts, but not to roles. In [15], it was shown that some concrete combinations of probability constructors such as P>0 and P>0.4 lead to intractability (in fact, E XP T IME-completeness) of subsumption while a restriction to the probability values zero and one does not. Here, we prove the much more gen- eral result that the extension of EL with any single concept constructor P∼p , where ∼ ∈ {<, ≤, =, ≥, >} and p ∈ (0, 1), results in E XP T IME-completeness. More specif- ically, this result applies to general TBoxes, i.e., to sets of concept inclusions C v D when ∼ ∈ {=, ≥, >} and even to the empty TBox when ∼ ∈ {<, ≤, }. Inspired by the observation that many biomedical ontologies such as S NOMED CT are classical TBoxes, i.e., sets of concept definitions A ≡ D with atomic and unique left-hand sides, we then show that probabilities other than zero and one can be used without losing tractability in (possibly cyclic) classical TBoxes for the cases ∼ ∈ {>, ≥}. More pre- cisely, subsumption in Prob-EL is tractable when only the constructors P∼p and P=1 are admitted, for any (single!) choice of ∼ ∈ {≥, >} and p ∈ (0, 1). The resulting logic actually ‘coincides’ for all possible choices. We also show that when a second probabil- ity value from the range (0, 1) sufficiently ‘far away’ from p is added, the complexity of subsumption snaps back to E XP T IME-completeness. In the second part of the paper, we consider Prob-ELr , where probabilities can be applied to both concepts and roles, concentrating on general TBoxes. While decidability is an open problem for full Prob-ELr , it was known that subsumption is in 2-E XP T IME and PS PACE-hard in Prob-EL>0;=1 r , where probability values are restricted to zero and one. Since subsumption in the ALC-version of Prob-EL>0;=1 r is 2-E XP T IME-complete and the complexity of the EL-version and the ALC-version of many-dimensional DLs (such as Prob-DLs) coincides in all known cases, it was thus tempting to conjecture 2-E XP T IME-completeness also of subsumption in Prob-EL>0;=1 r . We show that this is not the case by establishing a tight PS PACE upper bound for subsumption in Prob- EL>0;=1 r . This also implies PS PACE-completeness for the two-dimensional DL S5EL , in sharp contrast with the 2-E XP T IME-completeness of S5ALC . This paper is a workshop version of [6]. Proofs can be found in the long version of that paper, to be found at http://www.informatik.uni-bremen.de/˜clu/papers/. 2 Preliminaries Let NC and NR be countably infinite sets of concept names and role names. Prob-EL is the extension of EL that allows the application of probabilities to concepts, i.e., Prob- EL concepts are built according to the rule C, D ::= > | A | C u D | ∃r.C | P∼p C where A ranges over NC , r over NR , ∼ over {<, ≤, =, ≥, >}, and p ∈ [0, 1]. The concept P∼p C denotes the class of objects that are an instance of C with probability ∼ p. For example, the S NOMED CT concept ‘animal bite by potentially rabid animal’ can be expressed as Bite u ∃by.(Animal u P>0.5 ∃has.Rabies). When we admit only a few values for ∼ and n, we put them in superscript; for example, Prob-EL>0.4,<0.1 denotes the extension of EL with P>0.4 C and P<0.1 C. Probabilities can be applied to roles using the concept constructors ∃P∼p r.C where ∼ and p range over the same values as above, expressing that there is an element satisfying C that is related to the current element by the role name r with probability ∼ p. For example, the S NOMED CT concept ‘disease of possible viral origin’ can be modeled as Disease u ∃P>0 origin.Viral. We denote the extension of Prob-EL with the constructor ∃P∼p r.C with Prob-ELr . We also consider the restriction Prob-EL>0;=1 r of Prob-ELr to probabilities 0 and 1 (both on concepts and roles). The semantics of the probabilistic DLs is given in terms of a probabilistic interpre- tation I = (∆I , W, (Iw )w∈W , µ), where ∆I is the (non-empty) domain, W a non- empty set of possible worlds, µ a discrete probability distribution on W , and for each w ∈ W , Iw is a classical DL interpretation with domain ∆I . We usually write C I,w for C Iw , and likewise for rI,w . For concept names A and role names r, we define the probability – pId (A) that d ∈ ∆I is an A as µ({w ∈ W | d ∈ AI,w }); – pId,e (r) that d, e ∈ ∆I are related by r as µ({w ∈ W | (d, e) ∈ rI,w }). Next, we extend pId (A) to compound concepts C and define the extension C I,w of compound concepts by mutual recursion on C. The definition of pId (C) is exactly as in the base case, with A replaced by C. The extension of compound concepts is defined as follows: >I,w = ∆I (C u D)I,w = C I,w ∩ DI,w (∃r.C)I,w = {d ∈ ∆I | ∃e.(d, e) ∈ rI,w ∧ e ∈ C I,w } (P∼p C)I,w = {d ∈ ∆I | pId (C) ∼ p} (∃P∼p r.C)I,w = {d ∈ ∆I | ∃e ∈ C I,w : pId,e (r) ∼ p} A general TBox is a finite set of concept inclusions C v D, where C, D are concepts. A classical TBox is a set of concept definitions A ≡ C, where A is a concept name and the left-hand sides of concept definitions are unique. Note that cyclic definitions are allowed. A probabilistic interpretation I satisfies a concept inclusion C v D if C I,w ⊆ I,w D and a concept definition A ≡ C if AI,w = C I,w , for all w ∈ W . I is a model of a TBox T if it satisfies all inclusions/definitions in T . A concept C is subsumed by a concept D relative to a TBox T (written T |= C v D) if every model I of T satisfies C v D. The above definition is the result of transferring the notion of subsumption from standard DLs to probabilistic DLs in a straightforward way. However, there is an al- ternative variant of subsumption that is natural for probabilistic DLs: a concept C is positively subsumed by a concept D relative to a TBox T (written T |=+ C v D) if C I,w ⊆ DI,w for every probabilistic model I = (∆I , W, (Iw )w∈W , µ) and every w ∈ W with µ(w) > 0. Intuitively, classical subsumption is about subsumptions that are logically implied whereas positive subsumption is about subsumptions that are cer- tain. For example, when T∅ is the empty TBox, then T∅ 6|= P=1 A v A, but we can only have d ∈ (P=1 A)I,v \AI,v when µ(v) = 0, thus non-subsumption is only witnessed by worlds that we are certain to not be the actual world. Consequently, T∅ |=+ P=1 A v A. In the extension Prob-ALC of Prob-EL with negation studied in [15], positive sub- sumption can easily be reduced to subsumption. This does not seem easily possible in Prob-EL. In fact, we will sometimes use (Turing) reductions in the opposite direction. 3 Probabilistic Concepts In [15], it was shown that subsumption in Prob-EL>0;=1 with general TBoxes is in PT IME, whereas the same problem is E XP T IME-complete in Prob-EL>0;>0.4 (both in the positive and in the unrestricted case). This raises the question whether any proba- bility except 0,1 can be admitted in Prob-EL without losing tractability. The following theorem provides a strong negative result. Theorem 1. For all p ∈ (0, 1), (positive) subsumption in Prob-EL∼p relative to 1. general TBoxes is E XP T IME-hard when ∼ ∈ {=, >, ≥} 2. the empty TBox is E XP T IME-hard when ∼ ∈ {≤, <} Matching upper bounds are an immediate consequence of the fact that each logic Prob- EL∼p is a fragment of the DL Prob-ALC c for which subsumption was proved E X - P T IME -complete in [15]. To prove the lower bounds, it suffices to show that each logic Prob-EL∼p is non-convex, i.e., that there are a general TBox T and concepts C, D1 , . . . , Dn , n ≥ 2, such that T |= C v D1 t · · · t Dn , but T 6|= C v Di for all i. Once that this is established, standard proof techniques from [2] can be used to reduce satisfiability in ALC relative to general TBoxes, which is E XP T IME-complete, to sub- sumption in Prob-EL∼p . The following constructions work for standard subsumption and positive subsumption alike. First consider ∼ = ≥ and assume p ≤ 0.5. Fix a k > 0 such that k · p > 1 and set T = {Ai u Aj v P≥p Bij | 1 ≤ i < j ≤ k} C = P≥p A1 u · · · u P≥p Ak Dij = P≥p Bij Intuitively, the probabilities stipulated by C sum up to > 1, thus some of the Ai have to overlap, but there is a choice as to which ones these are. Formally, we can show non- convexity by proving that T |= C v t1≤i} can be handled similarly. Now assume that p > 0.5. We start with the case ∼ = > and use a variation of the above. The main idea is to employ P>p C to simulate P>q C, for some q ≤ 0.5, which brings us back to a case already dealt with. More precisely, let n > 0 be smallest such 1 that n > 2(1−p) and set q = pn − n + 1. An easy computation shows that 0 ≤ q < 0.5. Moreover, it can be shown that P>p X1 u · · · u P>p Xn v P>q (X1 u · · · u Xn ) which allows us to redo the above reduction with probability q < 0.5. Details are given in the long version of [6]. The comparisons ∼ ∈ {=, ≥} can be handled similarly. For the remaining cases ∼ ∈ {<, ≤}, there is a very simple argument for non- convexity even w.r.t. the empty TBox: we have > v P

v P

v P

, ≥}, the proof of Theorem 1 relies on general TBoxes in a crucial way. It turns out that when we restrict ourselves to classical TBoxes, tractability can be attained even with probabilities other than 0 and 1. Theorem 2. For all ∼ ∈ {>, ≥} and p ∈ [0, 1], (positive) subsumption in Prob- EL∼p;=1 relative to classical TBoxes is in PT IME. To prove Theorem 2, we start with positive subsumption. We can assume p > 0 since subsumption in Prob-EL>0;=1 is in PT IME even with general TBoxes. To prove a PT IME upper bound, we use a ‘consequence-driven’ procedure similar to the ones in [2, 11]. A concept name A is defined in a classical TBox T if there is a concept defi- nition A ≡ C ∈ T , and primitive otherwise. We can w.l.o.g. restrict our attention to the subsumption of defined concept names relative to TBoxes. We also assume that the input TBox is normalized to a set of concept definitions of the form A ≡ P1 u · · · u Pn u C1 u · · · u Cm n, m ≥ 0, and where P1 , . . . , Pn are primitive concept names and C1 , . . . , Cm are of the form P∼p A, P=1 A, and ∃r.A with A a defined concept name (note that the top concept is completely normalized away). It is well-known that such a normalization can be achieved in polytime, see [1] for details. For a given TBox T and a defined concept name A in T , we write CA to denote the defining concept for A in T , i.e., A ≡ CA ∈ T . Moreover, we deliberately confuse the concept CA = D1 u · · · u Dk with the set {D1 , . . . , Dk }. We define a set of concepts ‘certain for CA ’ as [ cert(CA ) = {P∗ B | P∗ B ∈ CA } ∪ {CB } P=1 B∈CA where, here and in what follows, P∗ ranges over P=1 and P>p . Intuitively, cert(CA ) contains concepts that hold with probability 1 whenever A is satisfied in some world. The algorithm starts with the normalized input TBox and then exhaustively applies the completion rules displayed in Figure 1. As a general proviso, each rule can be applied only if it adds a concept that occurs in T and actually changes the TBox, e.g., R1 can only be applied when ∃r.B 0 occurs in T and ∃r.B 0 ∈ / CA . Exemplarily, we explain rule R5 in more detail. If all defining concepts CB of B are certain for A, then A v P=1 B, thus we can add P=1 B to CA . Let T ∗ be the result of exhaustive rule application and ∗ let CA be the defining concept for A in T ∗ , for all concept names A. The ‘only if’ direction requires a careful and surprisingly subtle model construction. ∗ ∗ Lemma 1. For all defined concept names A, B, we have T |=+ A v B iff CB ⊆ CA . It is easy to see that TBox completion requires only polytime: every rule application extends the TBox, but both the number of concept definitions and of conjuncts in each concept definition is bounded by the size of the original TBox. To prove Theorem 2 for unrestricted subsumption, we provide a Turing reduction from unrestricted subsumption to positive subsumption. We again assume that the input R1 If ∃r.B ∈ CA , and CB 0 ⊆ CB then replace A ≡ CA with A ≡ CA ∪ {∃r.B 0 } R2 If P=1 B ∈ CA then replace A ≡ CA with A ≡ CA ∪ CB R3 If P=1 B ∈ CA then replace A ≡ CA with A ≡ CA ∪ {P∼p B} R4 If P∼p B ∈ CA , and D ∈ cert(CB ) then replace A ≡ CA with A ≡ CA ∪ {D} R5 If CB ⊆ cert(CA ) then replace A ≡ CA with A ≡ CA ∪ {P=1 B} R6 If P∼p B ∈ CA and CB 0 ⊆ cert(CA ) ∪ CB then replace A ≡ CA with A ≡ CA ∪ {P∼p B 0 } Fig. 1. TBox completion rules for positive subsumption TBox is in the described normal form and then exhaustively apply the rules shown in Figure 2, calling the result T ∗ with defining concept of the form CA ∗ . ∗ ∗ Lemma 2. For all defined concept names A, B, we have T |= A v B iff CB ⊆ CA . Clearly, the Turing reduction and thus the overall algorithm runs in polytime. It is interesting to note that the proof of Theorem 2 is based on exactly the same algorithm, for all ∼ ∈ {≥, >} and p ∈ (0, 1]. It follows that there is in fact only a single logic Prob-EL∼p , for all such ∼ and p. Formally, given a Prob-EL∼p -concept C, ≈ ∈ {≥, >} and q ∈ (0, 1], let C≈q denote the result of replacing each subconcept P∼p D in C with P≈q D in C and similarly for Prob-EL∼p -TBoxes T . Theorem 3. For any p, q > 0, ∼, ≈ ∈ {>, ≥}, EL∼p -concepts C, D and -TBox T , we have T |=+ C v D iff T≈q |=+ C≈q v D≈q , and likewise for unrestricted subsumption. Consequently, the (potentially difficult!) choice of a concrete ∼ ∈ {≥, >} and p ∈ (0, 1] is moot. In fact, it might be more intuitive to replace the constructor P∼p C with a constructor L C that describes elements which ‘are likely to be a C’, and to replace P=1 C with the constructor C C to describe elements that ‘are certain to be a C’, see e.g. [8, 9] for other approaches to logics of likelihood. Note that the case p = 0 is differ- ent from the cases considered above: for example, we have T∅ |=+ ∃r.A v ∃r.P>p A iff p = 0, and likewise T∅ |= P>p ∃r.A v P>p ∃r.P>p A iff p = 0. In the spirit of the constructors C and L, P>0 C can be replaced with a constructor PC that describes elements for which ‘it is possible that they are a C’. For example, the S NOMED CT con- cepts ‘definite thrombus’ and ‘possible thrombus’ can then be written as C Thrombus and P Thrombus (although we speculate that the S NOMED CT designers mean ‘likely’ rather than ‘possible’). S1 If ∃r.B ∈ CA , and CB 0 ⊆ CB then replace A ≡ CA with A ≡ CA ∪ {∃r.B 0 } S2 If T |=+ cert(CA ) v P∗ B then replace A ≡ CA with A ≡ CA ∪ {P∗ B} Fig. 2. TBox completion rules for Turing reduction It is a natural question whether the PT IME upper bound for classical TBoxes extends to the case of multiple probability values (except one, which is apparently always un- critical). The following result shows that many combinations of two probability values lead to (non-convexity, thus) intractability, even without any TBox. Theorem 4. Let ∼ ∈ {>, ≥}, and p, q ∈ [0, 1). Then (positive) subsumption in Prob- EL∼p;∼q relative to the empty TBox is E XP T IME-hard if (i) q = 0, (ii) p ≤ 1/2 and q < p2 , or more generally (iii) 2p − 1 < q < p2 . In particular, we cannot combine the constructors P and L mentioned above without losing tractability. The above formulation of Theorem 4 is actually only a consequence of a more general (but also more complicated to state) result established in the long version of [6]. We conjecture that (positive) subsumption in Prob-EL∼p;∼q relative to classical TBoxes is in PT IME whenever p ≥ q ≥ p2 and that, otherwise, it is E XP T IME- hard. 4 Probabilistic Roles Adding probabilistic roles to Prob-EL tends to increase the complexity of subsump- tion. While for full Prob-ELr even decidability is open, it was shown in [15] that sub- sumption is in 2-E XP T IME and PS PACE-hard in Prob-EL>0;=1 r . As discussed in the introduction, there were reasons to believe that this problem is actually 2-E XP T IME- complete. We show that this is not the case by proving a PS PACE upper bound, thus establishing PS PACE-completeness. This result holds both for positive and unrestricted subsumption, we start with the positive case. We again concentrate on subsumption between concept names and assume that the input TBox is in a certain normal form, defined as follows. A basic concept is a concept of the form >, A, P>0 A, P=1 A, or ∃α.A, where A is a concept name and, here and in what follows, α is a role, i.e., of the form r, P>0 r, or P=1 r with r a role name. Now, every concept inclusion in the input TBox is required to be of the form X1 u · · · u Xn v X with X1 , . . . , Xn , X basic concepts. It is not hard to show that every TBox can be trans- formed into this normal form in polynomial time such that (non-)subsumption between the concept names that occur in the original TBox is preserved. Let T be the input TBox in normal form, CN the set of concept names that occur in T , BC the set of basic concepts in T , and ROL the set of roles in T . Call a role probabilistic if it is of the form P=1 r or P>0 r. Our algorithm maintains the following data structures: – a mapping Q that associates with each A ∈ CN a subset Q(A) ⊆ BC such that T |= A v X for all X ∈ Q(A); – a mapping Qcert that associates with each A ∈ CN a subset Qcert (A) ⊆ BC such that T |= A v P=1 X for all X ∈ Qcert (A); – a mapping R that associates with each probabilistic role α ∈ ROL a binary relation R(α) on CN such that T |= A v P>0 (∃α.B) for all (A, B) ∈ R(α). Some intuition about the data structures is already provided above; e.g., X ∈ Q(A) means that T |= A v X. However, there is also another view on these structures that will be important in what follows: they represent an abstract view of a model of T , where each set Q(A) describes the concept memberships of a domain element d in a world w with d ∈ AI,w and R describes role memberships, i.e., when (A, B) ∈ R(α), then d ∈ AI,w implies that in some world v with positive probability, d has an element described by Q(B) as an α-successor. In this context, Qcert (A) contains all concepts that must be true with probability 1 for any domain element that satisfies A in some world. Note that non-probabilistic roles r and probabilistic roles P=1 r are not represented in the R(·) data structure; we will treat them in a more implicit way later on. The data structures are initialized as follows, for all A ∈ CN and relevant roles α: Q(A) = {>, A} Qcert (A) = {>} R(α) = ∅. The sets Q(·), Qcert (·), and R(·) are then repeatedly extended by the application of various rules. Before we can introduce these rules, we need some preliminaries. As the first step, Figure 3 presents a (different!) set of rules that serves the purpose of saturating a set of concepts Γ . We use cl(Γ ) to denote the set of concepts that is the result of exhaustively applying the displayed rules to Γ , where any rule can only be applied if the added concept is in BC, but not yet in Γ . The rules access the data structure Q(·) introduced above and shall later be applied to the sets Q(A) and Qcert (A), but they will also serve other purposes as described below. It is not hard to see that rule application terminates after polynomially many steps. R1 If X1 u . . . u Xn v X ∈ T and X1 , . . . , Xn ∈ Γ then add X to Γ R2 If P=1 A ∈ Γ then add A to Γ R3 If ∃P=1 r.A ∈ Γ then add ∃r.A to Γ R4 If A ∈ Γ then add P>0 A to Γ R5 If ∃r.A ∈ Γ then add ∃P>0 r.A to Γ R6 If ∃α.A ∈ Γ and B ∈ Q(A) then add ∃α.B to Γ Fig. 3. Saturation rules for cl(Γ ) The rules that are used for completing the data structures Q(·), Qcert (·), and R(·) are more complex and refer to ‘traces’ through these data structures. Definition 1. Let B ∈ CN. A trace to B is a sequence S, A1 , α2 , A2 , . . . , αn , An where 1. S = A for some P>0 A ∈ Q(A1 ) or S = (r, B) for some (A1 , B) ∈ R(P>0 r); 2. each Ai ∈ CN and each αi ∈ ROL is a probabilistic role, such that An = B; 3. (Ai , Ai−1 ) ∈ R(αi ) for 1 < i ≤ n. If t is a trace of length n, we use tk , k ≤ n, to denote the trace S, A1 , α2 , . . . , αk , Ak . Intuitively, the purpose of a trace is to deal with worlds that are generated by concepts P>0 A and ∃P>0 r.A; there can be infinitely many such worlds as Prob-EL>0;=1 r lacks the finite model property, see [15]. The trace starts at some domain element represented by a set Q(A1 ) in the world generated by the first element S of the trace, then repeat- edly follows role edges represented by R(·) backwards until it reaches the final domain element represented by Q(B). The importance of traces stems from the fact that infor- mation can be propagated along them, as captured by the following notion. Definition 2. Let t = S, A1 , α2 , . . . , αn , An be a trace of length n. Then the type Γ (t) ⊆ BC of t is • cl({A} ∪ Qcert (A1 )) if n = 1 and S = A; • cl(Qcert (A1 ) ∪ {∃r.B 0 ∈ BC | B 0 ∈ Qcert (B)}) if n = 1 and S = (r, B); • cl(Qcert (An ) ∪ {∃αn .B 0 ∈ BC | B 0 ∈ Γ (tn−1 )}) if n > 1. Note that the rules R1 to R6 are used in every step of this inductive definition. The mentioned propagation of information along traces is now as follows: if there is a trace t to B, then any domain element that satisfies B in some world must satisfy the concepts in Γ (t) in some other world. So if for example P>0 A ∈ Γ (t), we need to add P>0 A also to Qcert (B) and to Q(B). Figure 4 shows the rules used for completing the data structures Q(·), Qcert (·), and R(·). Note that S6 and S7 implement the propagation of information along traces, as discussed above. Our algorithm for deciding (positive) subsumption starts with the initial data structures defined above and then exhaustively applies the rules shown in Figure 4. To decide whether T |=+ A v B, it then simply checks whether B ∈ Q(A). Lemma 3. Let T be a Prob-EL>0;=1 r -TBox in normal form and A, B be concept names. Then T |=+ A v B iff, after exhaustive rule application, B ∈ Q(A). We now argue that the algorithm can be implemented using only polynomial space. First, it is easy to see that there can be only polynomially many rule applications: every rule application extends the data structures Q(·), Qcert (·), and R(·), but these structures consist of polynomially many sets, each with at most polynomially many elements. It thus remains to verify that each rule application can be executed using only polyspace, which is obvious for all rules except those involving traces, i.e., S6 and S7. For these rules, we first note that it is not necessary to consider all (infinitely many!) traces. In fact, a straightforward ‘pumping argument’ can be used to show that there is a trace t to B with some relevant concept C ∈ Γ (t) iff there is a non-repeating such trace, i.e., a trace t0 of length n such that for all distinct k, ` ≤ n, we have Γ (t0k ) 6= Γ (t0` ). S1 apply R1-R6 to Q(A) and Qcert (A) S2 if P∗ B ∈ Q(A) then add P∗ B to Qcert (A) S3 if C ∈ Qcert (A) then add P=1 C and C to Q(A) S4 If ∃α.B ∈ Q(A) with α a probabilistic role then add (A, B) to R(α). S5 If P>0 B1 ∈ Q(A), (B1 , B2 ) ∈ R(α), B3 ∈ Qcert (B2 ) then add ∃α.B3 to Qcert (A) S6 if t is a trace to B and P∗ A ∈ Γ (t) then add P∗ A to Qcert (B) S7 if t is a trace to B and ∃α.A ∈ Γ (t) with α a probabilistic role Fig. 4. The rules for completing the data structures. Clearly, the length of non-repeating traces is bounded by 2m , m the size of T . To get to polyspace, we use a non-deterministic approach, enabled by Savitch’s theorem: to check whether there is a trace t to B with C ∈ Γ (t), we guess t step-by-step, at each time keeping only a single Ai , αi and Γ (ti ) in memory. When we reach a situation where Ai = B and C ∈ Γ (ti ), our guessing was successful and we apply the rule. We also maintain a binary counter of the number of steps that have been guessed so far. As soon as this counter exceeds 2m , the maximum length of non-repeating traces, we stop the guessing and do not apply the rule. Clearly, this yields a polyspace algorithm. Theorem 5. Deciding positive subsumption in Prob-EL>0;=1 r with respect to general TBoxes is PS PACE-complete. As a byproduct, the proof of Lemma 3 yields a unique least model (in the sense of Horn logic), thus proving convexity of Prob-EL>0;=1 r . Note that positive subsumption in Prob-EL>0;=1 r is actually the same as subsumption in the two-dimensional descrip- tion logic S5EL , which is thus also PS PACE-complete. Using a Turing reduction similar to that shown in Figure 2, we can ‘lift’ the result from positive subsumption to unre- stricted subsumption. Theorem 6. Subsumption in Prob-EL>0;=1 r relative to general TBoxes is PS PACE- complete. 5 Conclusion We have established a fairly complete picture of the complexity of subsumption in Prob- EL, although some questions remain open. We speculate that Theorem 2 can be proved also when ∼ is = with only minor changes (e.g. rule R3 becomes unsound). It would be interesting to verify the conjecture made below Theorem 4 that (positive) subsumption in Prob-EL∼p;∼q relative to classical TBoxes is in PT IME whenever p ≥ q ≥ p2 and that, otherwise, it is E XP T IME-hard relative to the empty TBox. It is even conceivable that the conjectured PT IME result can be further generalized to any set of probability values P ⊆ [0, 1] as long as q ≥ p2 whenever p, q ∈ P and p ≥ q. Moreover, variants of Theorem 4 that involve, additionally or exclusively, the case where ∼ is = would also be of interest. Acknowledgements The present work was supported by the DFG project on proba- bilistic description logics (LU1417/1-1) and DAAD-CONACYT grant 206550. References 1. Baader, F.: Terminological cycles in a description logic with existential restrictions. In: Proc. of the 18th Int. Joint Conf. on Artif. Intell. (IJCAI03). pp. 325–330. Morgan Kaufmann (2003) 2. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope. In: Proc. of the 19th Int. Joint Conf. on Artif. Intell. (IJCAI05). pp. 364–369. Morgan Kaufmann (2005) 3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook. Cambridge University Press (2003) 4. Bacchus, F.: Representing and Reasoning with Probabilistic Knowledge. MIT Press, Cam- bridge, MA (1990) 5. da Costa, P.C.G., Laskey, K.B.: PR-OWL: A framework for probabilistic ontologies. In: Formal Ontology in Information Systems (FOIS06). Frontiers in Artif. Intell. and Applic., vol. 150, pp. 237–249. IOS Press (2006) 6. Gutierrez-Basulto, V., Jung, J.C., Lutz, C., Schröder, L.: A closer look at the probabilistic description logic Prob-EL. In: Proc. of the 25th AAAI Conf. on Artif. Intell. (AAAI11). AAAI Press (2011) 7. Halpern, J.Y.: Reasoning About Uncertainty. MIT Press (2003) 8. Halpern, J.Y., Rabin, M.O.: A logic to reason about likelihood. Artif. Intell. 32, 379–405 (1987) 9. Herzig, A.: Modal probability, belief, and actions. Fund. Inf. 57, 323–344 (2003) 10. Jaeger, M.: Probabilistic reasoning in terminological logics. In: Proc. of the 4th Int. Conf. on Princ. of Knowledge Repres. and Reas. (KR94), pp. 305–316. Morgan Kaufmann (1994) 11. Kazakov, Y.: Consequence-driven reasoning for Horn SHIQ ontologies. In: Proc. of the 21st Int. Joint Conf. on Artif. Intell. (IJCAI09). pp. 2040–2045 (2009) 12. Lukasiewicz, T.: Tractable probabilistic description logic programs. In: Proc. of the 1st Int. Conf. in Scalable Uncertainty Management (SUM07), number 4772 in LNCS, 143–156. Springer Verlag. 13. Lukasiewicz, T.: Expressive probabilistic description logics. Artif. Intell. 172, 852–883 (2008) 14. Lukasiewicz, T., Straccia, U.: Managing uncertainty and vagueness in description logics for the semantic web. J. Web Sem. 6(4), 291–308 (2008) 15. Lutz, C., Schröder, L.: Probabilistic description logics for subjective uncertainty. In: Proc. of the 12th Int. Conf. on Princ. of Knowledge Repres. and Reas. (KR10). AAAI Press (2010) 16. Schulz, S., Suntisrivaraporn, B., Baader, F.: SNOMED CT’s problem list: Ontologists’ and logicians’ therapy suggestions. In: Proc. of The Medinfo 2007 Congress. Studies in Health Technology and Informatics (SHTI-series), IOS Press (2007)