Is Query Inseparability for ALC Ontologies Decidable? E. Botoeva,1 C. Lutz,2 V. Ryzhikov,1 F. Wolter3 and M. Zakharyaschev4 1 Univ. of Bozen-Bolzano, 2 Univ. of Bremen, 3 Univ. of Liverpool, 4 Birkbeck, Univ. of London 1 Introduction While query answering with Description Logics (DLs) is now well-developed, this is much less the case for reasoning services that support ontology engineering and target query answering as an application. In ontology versioning, for example, one would like to know whether two versions of an ontology give the same answers to all queries formulated over a given vocabulary of interest, which means that they can be safely replaced in an application [5]. Similarly, if one wants to know whether a given ontology can be safely replaced by a smaller subset (a module), it is the answers to all queries that should be preserved [7]. In this context, the fundamental relationship between ontologies is thus not whether they are logically equivalent (have the same models), but whether they give the same answers to all relevant queries. The resulting entailment problem can be formalized in two ways, with different applications. First, given a class Q of queries, knowledge bases (KBs) K1 = (T1 , A1 ) and K2 = (T2 , A2 ), and a signature Σ of relevant concept and role names, we say that K1 Σ-Q-entails K2 if the answers to any Σ- query in Q over K2 are contained in the answers to the query over K1 . K1 and K2 are Σ- Q-inseparable if they Σ-Q-entail each other. This notion of entailment is appropriate if the data is known and does not change frequently. Applications include data-oriented KB versioning and KB module extraction, KB forgetting [9], and knowledge exchange [1]. If the data is not known or changes frequently, it is not KBs that should be compared, but TBoxes. Given a pair Θ = (Σ1 , Σ2 ) specifying a relevant signature Σ1 for ABoxes and Σ2 for queries, we say that a TBox T1 Θ-Q-entails a TBox T2 if, for every Σ1 - ABox A, the KB (T1 , A) Σ2 -Q-entails (T2 , A). T1 and T2 are Θ-Q-inseparable if they Θ-Q-entail each other. Applications include data-oriented TBox versioning, TBox modularization and TBox forgetting [7]. Most important choices for Q are conjunctive queries (CQs) and unions thereof (UCQs); we also consider the practically relevant classes of rooted CQs (rCQs) and UCQs (rUCQs), in which every variable is connected to an answer variable. So far, CQ- entailment has been studied for Horn DL KBs [3], EL TBoxes [8, 5], DL-Lite TBoxes [6], and also for OBDA specifications, that is, DL-Lite TBoxes with mappings [2]. No results are available for non-Horn DLs (neither in the KB nor in the TBox case) and for expressive Horn DLs in the TBox case. In particular, query entailment in non-Horn DLs had the reputation of being a technically challenging problem. This paper makes a first breakthrough into understanding query entailment and inseparability in these cases, with the main results summarized in Figures 1 and 2 (those marked with (?) are from [3]). Most unexpected is the undecidability of CQ- and rCQ- entailments between ALC KBs, even when the first KB is formulated in Horn-ALC (in fact, EL) and without any signature restriction. This should be contrasted with the Horn-ALC ALC to Horn-ALC ALC to ALC Horn-ALC Queries ALC Horn-ALC to ALC Horn-ALC to ALC Horn-ALC undecidable CQ undecidable ? =E XP T IME(?) ? =2E XP T IME ? UCQ ? undecidable rCQ undecidable ≤ 2E XP T IME =E XP T IME(?) =E XP T IME =E XP T IME ≤ 2E XP T IME rUCQ ? Fig. 1. KB query entailment. Fig. 2. TBox query entailment. decidability of subsumption-based entailment between ALC TBoxes [4] and of CQ- entailment between Horn-ALC KBs [3]. The second surprising result is that entailment between ALC KBs becomes decidable when CQs are replaced with rUCQs. For ALC TBoxes, CQ- and rCQ-entailments are undecidable as well. We obtain decidability for Horn-ALC TBoxes (where CQ- und UCQ-entailments coincide) using the fact that non-entailment is always witnessed by tree-shaped ABoxes. As another surprise, CQ- entailment of Horn-ALC TBoxes is 2E XP T IME-complete while rCQ-entailment is only E XP T IME-complete. This should be contrasted with the EL case, where both problems are E XP T IME-complete [8]. All upper bounds and most lower bounds hold also when entailment is replaced with inseparability. 2 Undecidability of CQ-Entailment for ALC KBs Here we sketch the construction showing that the problem whether an EL KB Σ-CQ entails an ALC KB is undecidable. The proof is by reduction of the undecidable N × M - tiling problem: given a finite set T of tile types T with four colours up(T ), down(T ), left(T ) and right(T ), a tile type I ∈ T, and two colours W (for wall) and C (for ceiling), decide whether there exist N, M ∈ N such that the N × M grid can be tiled using T in such a way that (1, 1) is covered by a tile of type I; every (N, i), for i ≤ M , is covered by a tile of type T with right(T ) = W ; and every (i, M ), for i ≤ N , is covered by a tile of type T with up(T ) = C. Given an instance of this problem, we first describe a KB K2 = (T2 , {A(a)}) that uses (among others) three concept names Tk , k = 0, 1, 2, for each tile type T ∈ T. If a point x in a model I of K2 is in Tk and right(T ) = left(T 0 ), then x has an R-successor in Tk0 . Thus, branches of I define (possibly infinite) horizontal rows of tilings with T. If a branch contains a point y ∈ Tk with right(T ) = W , then this y can be the last point in the row, which is indicated by an R-successor z ∈ Row of y. In turn, z has R-successors in all T(k+1) mod 3 that can be possible beginnings of the next row of tiles. To coordinate the up and down colours between the rows—which will be done by the CQs separating K1 and K2 — we make every x ∈ Tk , starting from the 0 0 second row, an instance of all T(k−1) mod 3 with down(T ) = up(T ). The row started by z ∈ Row can be the last one in the tiling, in which case we require that each of its tiles T has up(T ) = C. After the point in Row indicating the end of the final row, we add an R-successor in End for the end of tiling. The beginning of the first row is indicated by a P -successor in Start of the ABox element a, after which we add an R-successor in I0 for the given initial tile type I; see the lowest branch in the figure below. To generate a tree with all possible branches described above, we only require EL axioms of the form E v D and E v ∃S.D. Start B1 BN BN +1 Bn−N Bn−1 Bn End qn hl hr End hl Row Row ∨ Row Start T2 hr T0 T0 T0 T1 T1 T1 T1 Il Row End P Row Row Row End T2 a ∨ T2 A Start I0 T0N 1 Start T112 T1N 2 T11M -1 T1N M -1 Row I0 . . . T0N 1 . . . T01M -2 . . . T0N M -2 . . . Ir T 1M T2N M End 2 T11M -1 . . . T1N M -1 . . . The existence of a tiling of some N × M grid for the given instance can be checked by Boolean CQs q n that require Vn an R-path fromV Start to End going through Tk - or n Row-points: ∃x(Start(x0 ) ∧ i=0 R(xi , xi+1 ) ∧ i=1 Bi (xi ) ∧ End(xn+1 )), where Bi ∈ {Row} ∪ {Tk | T ∈ T, k = 0, 1, 2}. The key trick is—using an axiom of the form D v E t E 0 —to ensure that the Row-point before the final row of the tiling has two alternative continuations: one as described above, and the other one having just a single R-successor in End where ∨ indicates an or-node. This or-node gives two models of K2 denoted Il and Ir in the picture. If K2 |= q n , then q n holds in both of them, and so there are homomorphisms hl : q n → Il and hr : q n → Ir . As hl (xn−1 ) and hr (xn−1 ) are instances of Bn−1 , we have Bn−1 = T1N M −1 in the picture, and so up(T N M −1 ) = down(T N M ). By repeating this argument until x0 , we see that the colours between horizontal rows match and the rows are of the same length. (For this trick to work, we have to make the first Row-point in every branch an instance of Start.) In fact, an instance of the N × M -tiling problem has a positive answer iff there exists q n such that K2 |= q n . It is to be noted that to construct T2 with the properties described above, one needs quite a few auxiliary concept names. Let K1 = (T1 , {A(a)}) be an EL KB with the following canonical model: A Start, Σ0 Start, Σ0 Start, Σ0 a P End, End, End, Σ0 E Σ 0 End, Σ0 End, nd, Σ Σ Σ0 0 0 where Σ0 = {Row} ∪ {Tk | T ∈ T, k = 0, 1, 2}. Note that the vertical R-successors of the Start-points are not instances of any concept name, and so K1 does not satisfy any query of the form q n . On the other hand, K2 |= q implies K1 |= q, for every Σ-CQ q without a subquery of the form q n and Σ = sig(K1 ). Therefore, K1 Σ-CQ entails K2 iff there exists no tiling as specified above. 3 Future Work We have made first steps towards understanding query entailment and inseparability for KBs and TBoxes in expressive DLs. From a theoretical viewpoint, it would be of interest to solve the open problems in Figures 1 and 2, and also consider other expressive DLs such as DL-Lite H bool or ALCI. For example, our undecidability proof goes through for DL-Lite Hbool , but the other cases remain open. From a practical viewpoint, our model- theoretic criteria for query entailment are a good starting point for developing algorithms for approximations of query entailment based on simulations. Our undecidability and complexity results also indicate that rUCQ-entailment is more amenable to practical algorithms than, say, CQ-entailment and can be used as an approximation of the latter. References 1. Arenas, M., Botoeva, E., Calvanese, D., Ryzhikov, V.: Exchanging OWL 2 QL knowledge bases. In: Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI 2013). AAAI Press (2013) 2. Bienvenu, M., Rosati, R.: Query-based comparison of OBDA specifications. In: Proc. of the 28th Int. Workshop on Description Logics (DL 2015). vol. 1350. CEUR-WS (2015) 3. Botoeva, E., Kontchakov, R., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: Query inseparability for description logic knowledge bases. In: Proc. of the 14th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2014) (2014) 4. Ghilardi, S., Lutz, C., Wolter, F.: Did I damage my ontology? A case for conservative extensions in description logics. In: Doherty, P., Mylopoulos, J., Welty, C. (eds.) Proc. of the 10th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2006). pp. 187–197 (2006) 5. Konev, B., Ludwig, M., Walther, D., Wolter, F.: The logical difference for the lightweight description logic EL. Journal of Artificial Intelligence Research (JAIR) 44, 633–708 (2012) 6. Kontchakov, R., Pulina, L., Sattler, U., Schneider, T., Seimer, P., Wolter, F., Zakharyaschev, M.: Minimal module extraction from DL-Lite ontologies using QBF solvers. In: Proc. of the 21st Int. Joint Conf. on Artificial Intelligence (IJCAI 2009). pp. 836–840 (2009) 7. Kontchakov, R., Wolter, F., Zakharyaschev, M.: Logic-based ontology comparison and module extraction, with an application to DL-Lite. Artificial Intelligence 174, 1093–1141 (2010) 8. Lutz, C., Wolter, F.: Deciding inseparability and conservative extensions in the description logic EL. J. of Symbolic Computation 45(2), 194–228 (2010) 9. Wang, K., Wang, Z., Topor, R.W., Pan, J.Z., Antoniou, G.: Eliminating concepts and roles from ontologies in expressive descriptive logics. Computational Intelligence 30(2), 205–232 (2014)