Comparing the Expressiveness of Description Logics Ali Rezaei Divroodi Institute of Informatics, University of Warsaw Banacha 2, 02-097 Warsaw, Poland rezaei@mimuw.edu.pl Abstract. This work studies the expressiveness of the description log- ics that extend ALC reg (a variant of PDL) with any combination of the features: inverse roles, nominals, quantified number restrictions, the uni- versal role, the concept constructor for expressing the local reflexivity of a role. We compare the expressiveness of these description logics w.r.t. concepts, positive concepts, TBoxes and ABoxes. Our results about sepa- rating the expressiveness of description logics are based on bisimulations and bisimulation-based comparisons. They are naturally extended to the case when instead of ALC reg we have any sublogic of ALC reg that ex- tends ALC. 1 Introduction Expressiveness (expressive power) is a topic studied in the fields of formal lan- guages, databases and logics. The Chomsky hierarchy provides fundamental re- sults on the expressiveness of formal languages. In the field of databases, the works by Fagin [11, 12], Immerman [15, 16], Abiteboul and Vianu [1] provide important results on the expressiveness of query languages. Many results on the expressiveness of logics have also been obtained, e.g. in [13, 15, 3, 17, 22, 18, 23]. The expressiveness of description logics (DLs) has been studied in a number of works [2, 5, 6, 19, 20]. In [2] Baader proposed a formal definition of the expres- sive power of DLs. His definition is liberal in that it allows the compared logics to have different vocabularies. His work provides separation results for some early DLs. In [5] Borgida showed that certain DLs have the same expressiveness as the two or three variable fragment of first-order logic. The class of DLs considered in [5] is large, but the results only concern DLs without the reflexive and transi- tive closure of roles. In [6] Cadoli et al. considered the expressiveness of hybrid knowledge bases that combine a DL knowledge base with Horn rules. The used DL is ALCN R. The work [19] by Kurtonina and de Rijke is a comprehensive work on the expressiveness of DLs that are sublogics of ALCN R. It is based on bisimulation and provides many interesting results. In [20] Lutz et al. char- acterized the expressiveness and rewritability of DL TBoxes for the DLs that are sublogics of ALCQIO. They used semantic notions such as bisimulation, equisimulation, disjoint union and direct product. This work studies the expressiveness of the DLs that extend ALC reg (a vari- ant of PDL) with any combination of the features: inverse roles, nominals, quan- tified number restrictions, the universal role, the concept constructor for express- ing the local reflexivity of a role. We compare the expressiveness of these DLs w.r.t. concepts, positive concepts, TBoxes and ABoxes. Our results about sep- arating the expressiveness of DLs are based on bisimulations and bisimulation- based comparisons studied in our joint works [8–10]. They are naturally extended to the case when instead of ALC reg we have any sublogic of ALC reg that extends ALC. Our work differs significantly from all of [2, 5, 6, 19, 20], as the class of con- sidered DLs is much larger than the ones considered in those works (we allow PDL-like role constructors as well as the universal role and the concept con- structor ∃r.Self) and our results about separating the expressiveness of DLs are obtained not only w.r.t. concepts and TBoxes but also w.r.t. positive concepts and ABoxes. The rest of this paper is structured as follows. In Section 2 we recall notation and semantics of DLs, including the notion of positive concepts [10]. In Section 3 we recall the definitions of bisimulations and bisimulation-based comparisons as well as some invariance or preservation results of [8–10]. In Section 4 we present our results on the expressiveness of DLs. Section 5 concludes this work. 2 Notation and Semantics of Description Logics Our languages use a finite set ΣC of concept names (atomic concepts), a finite set ΣR of role names (atomic roles), and a finite set ΣI of individual names. Let Σ = ΣC ∪ ΣR ∪ ΣI . We denote concept names by letters like A and B, role names by letters like r and s, and individual names by letters like a and b. We consider some DL-features denoted by I (inverse), O (nominal), Q (quan- tified number restriction), U (universal role), Self (the local reflexivity of a role). A set of DL-features is a set consisting of some or zero of these names. We some- times abbreviate sets of DL-features, writing, e.g., IOQ instead of {I, O, Q}. From now on, if not stated otherwise, let Φ be any set of DL-features and let L stand for ALC reg . Definition 2.1. The DL language LΦ allows roles and concepts defined induc- tively as follows: – if r ∈ ΣR then r is a role of LΦ – if A ∈ ΣC then A is a concept of LΦ – if R and S are roles of LΦ and C is a concept of LΦ then • ε, R ◦ S, R t S, R∗ and C? are roles of LΦ • >, ⊥, ¬C, C t D, C u D, ∃R.C and ∀R.C are concepts of LΦ • if I ∈ Φ then R− is a role of LΦ • if O ∈ Φ and a ∈ ΣI then {a} is a concept of LΦ • if Q ∈ Φ, r ∈ ΣR and n is a natural number then ≥ n r.C and ≤ n r.C are concepts of LΦ • if {Q, I} ⊆ Φ, r ∈ ΣR and n is a natural number then ≥ n r− .C and ≤ n r− .C are concepts of LΦ • if U ∈ Φ then U is a role of LΦ • if Self ∈ Φ and r ∈ ΣR then ∃r.Self is a concept of LΦ . 2 The following definition introduces positive concepts of LΦ . Definition 2.2. Let Lpos pos pos Φ be the smallest set of concepts and LΦ,∃ , LΦ,∀ be the smallest sets of roles defined recursively as follows: – if r ∈ ΣR then r is a role of Lpos pos Φ,∃ and LΦ,∀ , – if I ∈ Φ and r ∈ ΣR then r− is a role of Lpos pos Φ,∃ and LΦ,∀ , – if R and S are roles of Lpos Φ,∃ and C is a concept of LΦ pos ∗ pos then ε, R ◦ S, R t S, R and C? are roles of LΦ,∃ , – if R and S are roles of Lpos Φ,∀ and C is a concept of LΦ pos pos then ε, R ◦ S, R t S, R∗ and (¬C)? are roles of LΦ,∀ , – if A ∈ ΣC then A is a concept of Lpos Φ , – if O ∈ Φ and a ∈ ΣI then {a} is a concept of Lpos Φ , – if Self ∈ Φ and r ∈ ΣR then ∃r.Self is a concept of Lpos Φ , – if C is a concept of Lpos pos pos Φ , R is a role of LΦ,∃ and S is a role of LΦ,∀ then pos • >, C t D, C u D, ∃R.C and ∀S.C are concepts of LΦ , • if Q ∈ Φ, r ∈ ΣR and n is a natural number then ≥ n r.C and ≤ n r.(¬C) are concepts of Lpos Φ , • if {Q, I} ⊆ Φ, r ∈ ΣR and n is a natural number then ≥ n r− .C and ≤ n r− .(¬C) are concepts of Lpos Φ , • if U ∈ Φ then ∀U.C and ∃U.C are concepts of Lpos Φ . A concept of Lpos Φ is called a positive concept of LΦ . 2 We introduce both Lpos pos Φ,∀ and LΦ,∃ due to the test constructor of roles. The concepts ∃(A?).B and ∀((¬A)?).B are positive concepts. As we will see, they are equivalent to A u B and A t B, respectively.1 If Φ is empty then we abbreviate LΦ by L. We use letters like R and S to denote arbitrary roles, and use letters like C and D to denote arbitrary concepts. ± We refer to elements of ΣR also as atomic roles. Let ΣR = ΣR ∪ {r− | r ∈ ± ΣR }. From now on, by basic roles we refer to elements of ΣR if the considered language allows inverse roles, and refer to elements of ΣR otherwise. In general, the language decides whether inverse roles are allowed in the considered context. An interpretation I = h∆I , ·I i consists of a non-empty set ∆I , called the domain of I, and a function ·I , called the interpretation function of I, which maps every concept name A to a subset AI of ∆I , maps every role name r to a binary relation rI on ∆I , and maps every individual name a to an element aI of ∆I . The interpretation function ·I is extended to complex roles and complex concepts as shown in Figure 1, where #Γ stands for the cardinality of the set Γ . We write C I (x) to denote x ∈ C I , and write RI (x, y) to denote hx, yi ∈ RI . 1 That the concept ≤ n R.(¬A) is positive should not be a surprise, as ∀R.A is equiv- alent to ≤ 0 R.(¬A). (R ◦ S)I = RI ◦ S I >I = ∆I (R t S)I = RI ∪ S I ⊥I = ∅ (R∗ )I = (RI )∗ (¬C)I = ∆I \ C I (C?)I = {hx, xi | C I (x)} (C t D)I = C I ∪ DI εI = {hx, xi | x ∈ ∆I } (C u D)I = C I ∩ DI U I = ∆I × ∆I {a}I = {aI } (R ) = (RI )−1 − I (∃r.Self)I = {x ∈ ∆I | rI (x, x)} (∃R.C)I = {x ∈ ∆I | ∃y [RI (x, y) and C I (y)] (∀R.C)I = {x ∈ ∆I | ∀y [RI (x, y) implies C I (y)]} (≥ n R.C)I = {x ∈ ∆I | #{y | RI (x, y) and C I (y)} ≥ n} (≤ n R.C)I = {x ∈ ∆I | #{y | RI (x, y) and C I (y)} ≤ n} Fig. 1. Interpretation of complex roles and complex concepts. A terminological axiom in LΦ , also called a general concept inclusion (GCI) in LΦ , is an expression of the form C v D, where C and D are concepts of LΦ . An interpretation I validates an axiom C v D, denoted by I |= C v D, if C I ⊆ DI . A TBox in LΦ is a finite set of terminological axioms in LΦ . An interpretation I is a model of a TBox T , denoted by I |= T , if it validates all the axioms of T . An individual assertion in LΦ is an expression of one of the forms C(a) (concept assertion), R(a, b) (positive role assertion), ¬R(a, b) (negative role as- sertion), a = b, and a 6= b, where C is a concept and R is a role in LΦ . Given an interpretation I, define that: I |= a = b if aI = bI , I |= a 6= b if aI 6= bI , I |= C(a) if C I (aI ) holds, I |= R(a, b) if RI (aI , bI ) holds, I |= ¬R(a, b) if RI (aI , bI ) does not hold. We say that I satisfies an individual assertion ϕ if I |= ϕ. An ABox in LΦ is a finite set of individual assertions in LΦ . An interpretation I is a model of an ABox A, denoted by I |= A, if it satisfies all the assertions of A. 3 Bisimulations and Bisimulation-Based Comparisons Bisimulation is a very useful notion for DLs. It can be used for analyzing ex- pressiveness of DLs (as investigated in [19] and the current paper), minimizing interpretations [8–10] and concept learning in DLs [21, 25, 14, 7, 24, 26]. The fol- lowing definition comes from our joint works [8–10]. Definition 3.1. Let I and I 0 be interpretations. A binary relation 0 Z ⊆ ∆I × ∆I is called an LΦ -bisimulation between I and I 0 if the following 0 conditions hold for every a ∈ ΣI , A ∈ ΣC , r ∈ ΣR , x, y ∈ ∆I , x0 , y 0 ∈ ∆I : 0 Z(aI , aI ) (1) 0 I I0 0 Z(x, x ) ⇒ [A (x) ⇔ A (x )] (2) 0 I 0 I0 0 I0 0 0 [Z(x, x ) ∧ r (x, y)] ⇒ ∃y ∈ ∆ [Z(y, y ) ∧ r (x , y )] (3) 0 I0 0 0 I 0 I [Z(x, x ) ∧ r (x , y )] ⇒ ∃y ∈ ∆ [Z(y, y ) ∧ r (x, y)], (4) if I ∈ Φ then 0 0 [Z(x, x0 ) ∧ rI (y, x)] ⇒ ∃y 0 ∈ ∆I [Z(y, y 0 ) ∧ rI (y 0 , x0 )] (5) 0 I0 0 0 I 0 I [Z(x, x ) ∧ r (y , x )] ⇒ ∃y ∈ ∆ [Z(y, y ) ∧ r (y, x)], (6) if O ∈ Φ then 0 Z(x, x0 ) ⇒ [x = aI ⇔ x0 = aI ], (7) if Q ∈ Φ then if Z(x, x0 ) holds then, for every role name r, there exists a bijection 0 (8) h : {y | rI (x, y)} → {y 0 | rI (x0 , y 0 )} such that h ⊆ Z, if {Q, I} ⊆ Φ then (additionally) if Z(x, x0 ) holds then, for every role name r, there exists a bijection 0 (9) h : {y | rI (y, x)} → {y 0 | rI (y 0 , x0 )} such that h ⊆ Z, if U ∈ Φ then 0 ∀x ∈ ∆I ∃x0 ∈ ∆I Z(x, x0 ) (10) 0 I0 I 0 ∀x ∈ ∆ ∃x ∈ ∆ Z(x, x ), (11) if Self ∈ Φ then 0 Z(x, x0 ) ⇒ [rI (x, x) ⇔ rI (x0 , x0 )]. (12) 0 0 0 By (2 ), (7 ) and (12 ) we denote the conditions obtained respectively from (2), (7) and (12) by replacing equivalence (⇔) by implication (⇒). If the conditions (2), (7) and (12) are replaced by (20 ), (70 ) and (120 ), respectively, then the relation Z is called an LΦ -comparison between I and I 0 [10]. 2 As shown in [4], the PDL-like role constructors are “safe” for the conditions (3)-(6). That is, we need to specify these conditions only for atomic roles, and as a consequence, they also hold for complex roles. Definition 3.2. A concept C in LΦ is invariant for LΦ -bisimulation if, for any interpretation I, I 0 and any LΦ -bisimulation Z between I and I 0 , if Z(x, x0 ) 0 holds then x ∈ C I iff x0 ∈ C I . A TBox T in LΦ is invariant for LΦ -bisimulation if, for every interpretations I and I 0 , if there exists an LΦ -bisimulation between I and I 0 then I is a model of T iff I 0 is a model of T . The notion of whether an ABox in LΦ is invariant for LΦ -bisimulation is defined similarly. 2 Definition 3.3. An interpretation I is said to be unreachable-objects-free (w.r.t. the considered language) if every element of ∆I is reachable from some aI , where a ∈ ΣI , via a path consisting of edges being instances of basic roles. 2 The following theorem comes from our joint work [8]. Theorem 3.4. 1. All concepts in LΦ are invariant for LΦ -bisimulation. 2. If U ∈ Φ then all TBoxes in LΦ are invariant for LΦ -bisimulation. 3. Let T be a TBox in LΦ and I, I 0 be unreachable-objects-free interpretations (w.r.t. LΦ ) such that there exists an LΦ -bisimulation between I and I 0 . Then I is a model of T iff I 0 is a model of T . 4. Let A be an ABox in LΦ . If O ∈ Φ or A contains only assertions of the form C(a) then A is invariant for LΦ -bisimulation. Definition 3.5. A concept C of LΦ is preserved by LΦ -comparisons if, for any interpretations I, I 0 and any LΦ -comparison Z between I and I 0 , if Z(x, x0 ) 0 holds and x ∈ C I then x0 ∈ C I . 2 The following theorem comes from our joint work [10]. Theorem 3.6. All concepts of Lpos Φ are preserved by LΦ -comparisons. 4 Comparing the Expressiveness of Description Logics Definition 4.1. Two concepts C and D are equivalent if, for every interpreta- tion I, C I = DI . Two TBoxes T1 and T2 are equivalent if, for every interpre- tation I, I is a model of T1 iff I is a model of T2 . Two ABoxes A1 and A2 are equivalent if, for every interpretation I, I is a model of A1 iff I is a model of A2 . Definition 4.2. We say that a logic L1 is at most as expressive as a logic L2 w.r.t. concepts (resp. positive concepts, TBoxes, ABoxes), denoted by L1 ≤C L2 (resp. L1 ≤P C L2 , L1 ≤T L2 , L1 ≤A L2 ), if every concept (resp. positive concept, TBox, ABox) in L1 has an equivalent concept (resp. positive concept, TBox, ABox) in L2 . We say that a logic L2 is more expressive than a logic L1 (or L1 is less expressive than L2 ) w.r.t. concepts (resp. positive concepts, TBoxes, ABoxes), denoted by L1