Is Query Inseparability for ALC Ontologies Decidable?

          E. Botoeva,1 C. Lutz,2 V. Ryzhikov,1 F. Wolter3 and M. Zakharyaschev4
1
    Univ. of Bozen-Bolzano, 2 Univ. of Bremen, 3 Univ. of Liverpool, 4 Birkbeck, Univ. of London


1     Introduction

While query answering with Description Logics (DLs) is now well-developed, this is
much less the case for reasoning services that support ontology engineering and target
query answering as an application. In ontology versioning, for example, one would
like to know whether two versions of an ontology give the same answers to all queries
formulated over a given vocabulary of interest, which means that they can be safely
replaced in an application [5]. Similarly, if one wants to know whether a given ontology
can be safely replaced by a smaller subset (a module), it is the answers to all queries that
should be preserved [7]. In this context, the fundamental relationship between ontologies
is thus not whether they are logically equivalent (have the same models), but whether
they give the same answers to all relevant queries. The resulting entailment problem can
be formalized in two ways, with different applications. First, given a class Q of queries,
knowledge bases (KBs) K1 = (T1 , A1 ) and K2 = (T2 , A2 ), and a signature Σ of
relevant concept and role names, we say that K1 Σ-Q-entails K2 if the answers to any Σ-
query in Q over K2 are contained in the answers to the query over K1 . K1 and K2 are Σ-
Q-inseparable if they Σ-Q-entail each other. This notion of entailment is appropriate if
the data is known and does not change frequently. Applications include data-oriented KB
versioning and KB module extraction, KB forgetting [9], and knowledge exchange [1].
     If the data is not known or changes frequently, it is not KBs that should be compared,
but TBoxes. Given a pair Θ = (Σ1 , Σ2 ) specifying a relevant signature Σ1 for ABoxes
and Σ2 for queries, we say that a TBox T1 Θ-Q-entails a TBox T2 if, for every Σ1 -
ABox A, the KB (T1 , A) Σ2 -Q-entails (T2 , A). T1 and T2 are Θ-Q-inseparable if
they Θ-Q-entail each other. Applications include data-oriented TBox versioning, TBox
modularization and TBox forgetting [7].
     Most important choices for Q are conjunctive queries (CQs) and unions thereof
(UCQs); we also consider the practically relevant classes of rooted CQs (rCQs) and
UCQs (rUCQs), in which every variable is connected to an answer variable. So far, CQ-
entailment has been studied for Horn DL KBs [3], EL TBoxes [8, 5], DL-Lite TBoxes [6],
and also for OBDA specifications, that is, DL-Lite TBoxes with mappings [2]. No results
are available for non-Horn DLs (neither in the KB nor in the TBox case) and for
expressive Horn DLs in the TBox case. In particular, query entailment in non-Horn DLs
had the reputation of being a technically challenging problem.
     This paper makes a first breakthrough into understanding query entailment and
inseparability in these cases, with the main results summarized in Figures 1 and 2 (those
marked with (?) are from [3]). Most unexpected is the undecidability of CQ- and rCQ-
entailments between ALC KBs, even when the first KB is formulated in Horn-ALC
(in fact, EL) and without any signature restriction. This should be contrasted with the
         Horn-ALC   ALC to                                 Horn-ALC ALC to
 ALC                             Horn-ALC Queries     ALC                        Horn-ALC
           to ALC  Horn-ALC                                  to ALC Horn-ALC
    undecidable                                  CQ    undecidable
                        ?       =E XP T IME(?)                           ?      =2E XP T IME
          ?                                     UCQ         ?
    undecidable                                 rCQ    undecidable
                  ≤ 2E XP T IME =E XP T IME(?)                      =E XP T IME =E XP T IME
    ≤ 2E XP T IME                              rUCQ         ?

         Fig. 1. KB query entailment.                       Fig. 2. TBox query entailment.


decidability of subsumption-based entailment between ALC TBoxes [4] and of CQ-
entailment between Horn-ALC KBs [3]. The second surprising result is that entailment
between ALC KBs becomes decidable when CQs are replaced with rUCQs. For ALC
TBoxes, CQ- and rCQ-entailments are undecidable as well. We obtain decidability for
Horn-ALC TBoxes (where CQ- und UCQ-entailments coincide) using the fact that
non-entailment is always witnessed by tree-shaped ABoxes. As another surprise, CQ-
entailment of Horn-ALC TBoxes is 2E XP T IME-complete while rCQ-entailment is only
E XP T IME-complete. This should be contrasted with the EL case, where both problems
are E XP T IME-complete [8]. All upper bounds and most lower bounds hold also when
entailment is replaced with inseparability.


2    Undecidability of CQ-Entailment for ALC KBs
Here we sketch the construction showing that the problem whether an EL KB Σ-CQ
entails an ALC KB is undecidable. The proof is by reduction of the undecidable N × M -
tiling problem: given a finite set T of tile types T with four colours up(T ), down(T ),
left(T ) and right(T ), a tile type I ∈ T, and two colours W (for wall) and C (for ceiling),
decide whether there exist N, M ∈ N such that the N × M grid can be tiled using T in
such a way that (1, 1) is covered by a tile of type I; every (N, i), for i ≤ M , is covered
by a tile of type T with right(T ) = W ; and every (i, M ), for i ≤ N , is covered by a tile
of type T with up(T ) = C. Given an instance of this problem, we first describe a KB
K2 = (T2 , {A(a)}) that uses (among others) three concept names Tk , k = 0, 1, 2, for
each tile type T ∈ T. If a point x in a model I of K2 is in Tk and right(T ) = left(T 0 ),
then x has an R-successor in Tk0 . Thus, branches of I define (possibly infinite) horizontal
rows of tilings with T. If a branch contains a point y ∈ Tk with right(T ) = W , then
this y can be the last point in the row, which is indicated by an R-successor z ∈ Row of
y. In turn, z has R-successors in all T(k+1) mod 3 that can be possible beginnings of the
next row of tiles. To coordinate the up and down colours between the rows—which will
be done by the CQs separating K1 and K2 — we make every x ∈ Tk , starting from the
                                    0                                 0
second row, an instance of all T(k−1)    mod 3 with down(T ) = up(T ). The row started by
z ∈ Row can be the last one in the tiling, in which case we require that each of its tiles
T has up(T ) = C. After the point in Row indicating the end of the final row, we add an
R-successor in End for the end of tiling. The beginning of the first row is indicated by a
P -successor in Start of the ABox element a, after which we add an R-successor in I0
for the given initial tile type I; see the lowest branch in the figure below. To generate a
tree with all possible branches described above, we only require EL axioms of the form
E v D and E v ∃S.D.
                              Start     B1                      BN              BN +1                                                           Bn−N                    Bn−1           Bn         End
                    qn
                    hl                    hr                                                                 End                                                                        hl
                              Row
                                                                                            Row       ∨                                              Row
                              Start                                                                        T2                                                                                                  hr
                         T0        T0           T0         T1              T1                     T1                                                       T1        Il
                                                                                                                                                                                       Row         End
        P                                       Row                                                        Row                                                  Row        End
                                                                                                                                                                                                         T2
a                                                                                                                                                                  ∨                         T2
    A       Start   I0                  T0N 1        Start      T112                     T1N 2                          T11M -1                T1N M -1                                                        Row
                                                                I0 . . .                T0N 1 . . .                    T01M -2 . . .           T0N M -2 . . .        Ir T 1M                      T2N M              End
                                                                                                                                                                           2
                                                                                                                                                                       T11M -1 . . .          T1N M -1 . . .


    The existence of a tiling of some N × M grid for the given instance can be checked
by Boolean CQs q n that require  Vn an R-path fromV    Start to End going through Tk - or
                                                        n
Row-points: ∃x(Start(x0 ) ∧ i=0 R(xi , xi+1 ) ∧ i=1 Bi (xi ) ∧ End(xn+1 )), where
Bi ∈ {Row} ∪ {Tk | T ∈ T, k = 0, 1, 2}. The key trick is—using an axiom of the
form D v E t E 0 —to ensure that the Row-point before the final row of the tiling
has two alternative continuations: one as described above, and the other one having
just a single R-successor in End where ∨ indicates an or-node. This or-node gives two
models of K2 denoted Il and Ir in the picture. If K2 |= q n , then q n holds in both of
them, and so there are homomorphisms hl : q n → Il and hr : q n → Ir . As hl (xn−1 )
and hr (xn−1 ) are instances of Bn−1 , we have Bn−1 = T1N M −1 in the picture, and
so up(T N M −1 ) = down(T N M ). By repeating this argument until x0 , we see that the
colours between horizontal rows match and the rows are of the same length. (For this
trick to work, we have to make the first Row-point in every branch an instance of Start.)
In fact, an instance of the N × M -tiling problem has a positive answer iff there exists q n
such that K2 |= q n . It is to be noted that to construct T2 with the properties described
above, one needs quite a few auxiliary concept names.
    Let K1 = (T1 , {A(a)}) be an EL KB with the following canonical model:
                                              A              Start, Σ0                    Start, Σ0                      Start, Σ0
                                          a
                                                       P

                                                                            End,                            End,                           End,
                                                                                   Σ0 E                          Σ 0    End,                    Σ0    End,
                                                                                           nd, Σ                             Σ                                  Σ0
                                                                                                       0                               0


where Σ0 = {Row} ∪ {Tk | T ∈ T, k = 0, 1, 2}. Note that the vertical R-successors of
the Start-points are not instances of any concept name, and so K1 does not satisfy any
query of the form q n . On the other hand, K2 |= q implies K1 |= q, for every Σ-CQ q
without a subquery of the form q n and Σ = sig(K1 ). Therefore, K1 Σ-CQ entails K2
iff there exists no tiling as specified above.


3            Future Work

We have made first steps towards understanding query entailment and inseparability for
KBs and TBoxes in expressive DLs. From a theoretical viewpoint, it would be of interest
to solve the open problems in Figures 1 and 2, and also consider other expressive DLs
such as DL-Lite H  bool or ALCI. For example, our undecidability proof goes through for
DL-Lite Hbool , but the other cases remain open. From a practical viewpoint, our model-
theoretic criteria for query entailment are a good starting point for developing algorithms
for approximations of query entailment based on simulations. Our undecidability and
complexity results also indicate that rUCQ-entailment is more amenable to practical
algorithms than, say, CQ-entailment and can be used as an approximation of the latter.
References
1. Arenas, M., Botoeva, E., Calvanese, D., Ryzhikov, V.: Exchanging OWL 2 QL knowledge
   bases. In: Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI 2013). AAAI Press
   (2013)
2. Bienvenu, M., Rosati, R.: Query-based comparison of OBDA specifications. In: Proc. of the
   28th Int. Workshop on Description Logics (DL 2015). vol. 1350. CEUR-WS (2015)
3. Botoeva, E., Kontchakov, R., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: Query inseparability
   for description logic knowledge bases. In: Proc. of the 14th Int. Conf. on the Principles of
   Knowledge Representation and Reasoning (KR 2014) (2014)
4. Ghilardi, S., Lutz, C., Wolter, F.: Did I damage my ontology? A case for conservative extensions
   in description logics. In: Doherty, P., Mylopoulos, J., Welty, C. (eds.) Proc. of the 10th Int.
   Conf. on the Principles of Knowledge Representation and Reasoning (KR 2006). pp. 187–197
   (2006)
5. Konev, B., Ludwig, M., Walther, D., Wolter, F.: The logical difference for the lightweight
   description logic EL. Journal of Artificial Intelligence Research (JAIR) 44, 633–708 (2012)
6. Kontchakov, R., Pulina, L., Sattler, U., Schneider, T., Seimer, P., Wolter, F., Zakharyaschev, M.:
   Minimal module extraction from DL-Lite ontologies using QBF solvers. In: Proc. of the 21st
   Int. Joint Conf. on Artificial Intelligence (IJCAI 2009). pp. 836–840 (2009)
7. Kontchakov, R., Wolter, F., Zakharyaschev, M.: Logic-based ontology comparison and module
   extraction, with an application to DL-Lite. Artificial Intelligence 174, 1093–1141 (2010)
8. Lutz, C., Wolter, F.: Deciding inseparability and conservative extensions in the description
   logic EL. J. of Symbolic Computation 45(2), 194–228 (2010)
9. Wang, K., Wang, Z., Topor, R.W., Pan, J.Z., Antoniou, G.: Eliminating concepts and roles from
   ontologies in expressive descriptive logics. Computational Intelligence 30(2), 205–232 (2014)