Conjunctive Query Entailment: Decidable in Spite of O, I, and Q Birte Glimm1 and Sebastian Rudolph2 1 University of Oxford, UK 2 Institut AIFB, Universität Karlsruhe, DE 1 Introduction We present a decidability result for entailment of conjunctive queries (CQs) in the very expressive Description Logic (DL) ALCHOIQb [1] by establishing finite representability of countermodels in the case of non-entailment. Our result also generalizes to unions of conjunctive queries and SHOIQ provided the query contains only simple roles, and we are confident that the technique extends to SROIQ under the simple roles restriction as well. Full proofs and additional material can be found in the accompanying technical report [2]. Throughout the paper, we use the DL ALCOIFb, where b stands for safe Boolean role expressions. Any ALCHOIQb knowledge base (KB) can be polyno- mially reduced to an ALCOIFb KB with extended signature, while preserving query (non-)entailment [3].W.l.o.g., we assume that KBs contain an empty ABox (with nominals the ABox can be internalized) and that all GCIs are simplified to one of the following forms: l G Ai v Bj | A ≡ {o} | A v ∀U.B | A v ∃U.B | func(f ), (1) where A(i) and B(j) are atomic concepts, o is an individual name, U is a safe Boolean role d expression, and f is a role that is F declared functional. If i = 0, we interpret Ai as > and if j = 0, we interpret Bj as ⊥. We use con(K), rol(K), and nom(K) to denote, respectively, the set of concept, role, and individual names occurring in K, and cl(K) to denote the closure of K. A role f is (inverse) functional in K if K contains an axiom func(f ) (func(f − )). Let NV be a countably infinite set of variables, A a concept name, r a role name, and x, y ∈ NV variables. An atom is an expression A(x) or r(x, y). A Boolean conjunctive query q is a non-empty set of atoms. We use var(q) to denote the set of (existentially quantified) variables occurring in q and ](q) for the number of atoms in q. For I = (∆I , ·I ) an interpretation, A(x), r(x, y) atoms, and π : var(q) → ∆I a total function, we write (i) I |=π A(x) if π(x) ∈ AI and (ii) I |=π r(x, y) if hπ(x), π(y)i ∈ rI . If I |=π At for all atoms At ∈ q, we write I |=π q and say that I satisfies q. We write I |= q if there exists a function π such that I |=π q and call π a match for q in I. If I |= K implies I |= q, we say that K entails q and write K |= q. W.l.o.g., we assume that queries are connected. Given a KB K and a CQ q, the query entailment problem is to decide whether K |= q. As a running example, we use a KB K containing the axioms {o} v ∃r.A, A v ∃r.A, A v ∃s.B, B v CtD, C v ∃f.E, D v ∃g.E, E v Bt{o}, func(f − ), func(g − ). The left hand side of Fig. 1 (p. 6) displays a model for K. Unless stated otherwise, we use A for a concept name, r for a role, f for a functional or inverse functional role, U for a safe Boolean role expression, o for a nominal, q for a connected Boolean conjunctive query, K for a simplified ALCOIFb knowledge base, and I for an interpretation (∆I , ·I ). We show decidability by proving that non-entailment is always witnessed by a regular model which is finitely representable. Our procedure enumerates these finite representations and terminates if the entailment does not hold. If it holds, termination can be ensured because first-order logic is recursively enumerable [4]. To construct finitely representable models, we first “unravel” models into interpretations, called forest quasi-models, that have a real forest shape. We then provide a way of “collapsing” such forest quasi-models back into real models that still have a kind of forest shape, called forest model, but which have a possibly infinite set of roots. Finally, we introduce transformations that work on the unraveled forest quasi-models, and the resulting interpretations can be collapsed into forest models with only a finite set of roots. We can then adapt standard cycle detection techniques such as (tree) blocking to obtain the desired finite representations. 2 Model Construction We first introduce interpretations and models that have a kind of forest shape. Please note that, unless we call a forest strict, our notion of a forest is very weak since we do also allow for arbitrary relations between tree elements and roots. Definition 1. A tree T is a non-empty, prefix-closed subset of IN∗ . A forest F is a subset of R × IN∗ , where R is a non-empty, countable, possibly infinite set of elements {ρ1 , . . . , ρn } such that, for each ρ ∈ R, the set {w | (ρ, w) ∈ F } is a tree. Each pair (ρ, ε) ∈ F is called a root of F . For (ρ, w), (ρ0 , w0 ) ∈ F , we call (ρ0 , w0 ) a successor of (ρ, w) if ρ0 = ρ and w0 = w · c for some c ∈ IN, where “·” denotes concatenation; (ρ0 , w0 ) is a predecessor of (ρ, w) if ρ0 = ρ and w = w0 · c for some c ∈ IN; (ρ0 , w0 ) is a neighbor of (ρ, w) if (ρ0 , w0 ) is a successor of (ρ, w) or vice versa. A node (ρ, w) is an ancestor of a node (ρ0 , w0 ) if ρ = ρ0 and w is a prefix of w0 and it is a descendant if ρ = ρ0 and w0 is a prefix of w. We use |w| to denote the length of w. The branching degree d(w) of a node w in a tree T is the number of successors of w. A forest interpretation of K is an interpretation I that satisfies: FI1 ∆I is a forest with roots R; FI2 there is a total and surjective function λ : nom(K) → R×{ε} s.t. λ(o) = (ρ, ε) iff oI = (ρ, ε); FI3 for each role r ∈ rol(K), if h(ρ, w), (ρ0 , w0 )i ∈ rI , then either (a) w = ε or w0 = ε, or (b) (ρ, w) is a neighbor of (ρ0 , w0 ). If I |= K we say that I is a forest model for K. With nomFree(K), we denote a knowledge base obtained from K by replacing each nominal concept {o} with o ∈ nom(K) with a fresh concept name No . A forest quasi-interpretation for K is an interpretation J that satisfies FI1 and FI3, and the adapted version FI20 of FI2 that there is a total and surjective function λ : nom(K) → R × {ε} s.t. λ(o) = (ρ, ε) iff (ρ, ε) ∈ NoJ (there might be other (ρ, w) ∈ ∆J with w 6= ε s.t. (ρ, w) ∈ NoJ ). If J |= nomFree(K) we say that J is a forest quasi-model for K. A forest (quasi) interpretation I is a strict forest (quasi) interpretation if, in condition FI3, only (b) is allowed; it is a tree interpretation, if it has a single root. If there is a k such that d(w) ≤ k for each (ρ, w) ∈ ∆I , then we say that I has branching degree k. 0 Let I, I 0 be two forest interpretations of K with δ1 , δ2 ∈ ∆I , δ10 , δ20 ∈ ∆I . The pairs hδ1 , δ2 i, hδ10 , δ20 i are isomorphic w.r.t. K, written hδ1 , δ2 i ∼ =K hδ10 , δ20 i iff 0 – hδ1 , δ2 i ∈ rI iff hδ10 , δ20 i ∈ rI for each r ∈ rol(K), 0 – δi ∈ AI iff δi0 ∈ AI for i ∈ {1, 2} and each A ∈ con(K), 0 – δi = oI iff δi0 = oI for i ∈ {1, 2} and each o ∈ nom(K). We say that I and I 0 are isomorphic w.r.t. K, written: I ∼ =K I 0 , if there is a I0 bijection ϕ : ∆ → ∆ such that, for each δ1 , δ2 ∈ ∆ , hδ1 , δ2 i ∼ I I =K hϕ(δ1 ), ϕ(δ2 )i and δ1 is a successor of δ2 iff ϕ(δ1 ) is a successor of ϕ(δ2 ). If clear from the context, we omit the subscript K of ∼ =K and we extend the definition to forest quasi-interpretations in the obvious way. Forest quasi-models represent, intuitively, an intermediate step between arbitrary models of K and forest models of K. Since KBs are assumed to be simplified, it can be checked locally (by looking at an element of the domain and its direct neighbors) whether an interpretation I is a model of K. We call an element δ ∈ ∆I locally K-consistent if it satisfies each GCI in K (a functionality restriction func(f ) is satisfied if δ has at most one f -neighbor); I is a model of K if each δ ∈ ∆I is locally K-consistent. Deciding whether a quasi interpretation is a model for K cannot be decided locally since nominals impose a global restriction on the cardinality of concepts. Therefore, a quasi interpretation J for K is a model for K if, additionally, for each o ∈ nom(K), NoJ is a singleton set. We now show how we can obtain a forest quasi-model from a model of K by using an adapted version of unraveling. Definition 2. Let I be a model for K and choose a function that returns, for a concept C = ∃U.B ∈ cl(K) and δ ∈ C I some δC,δ ∈ ∆I s.t. hδ, δC,δ i ∈ U I and δC,δ ∈ B I . W.l.o.g., we assume that if δ ∈ C1I ∩ C2I with Ci = ∃Ui .Bi ∈ cl(K), choose(Ci , δ) = δi for i ∈ {1, 2} and hδ, δ1 i ∼ = hδ, δ2 i, then δ1 = δ2 . An unraveling for some δ ∈ ∆I , denoted ↓(I, δ), is an interpretation obtained from I and δ as follows: we define the set S ⊆ (∆I )∗ of sequences to be the smallest set such that δ is a sequence and δ1 · · · δn · δn+1 is a sequence, if – δ1 · · · δn is a sequence, – if n > 2 and hδn , δn−1 i ∈ f I for some functional role f , then δn+1 6= δn−1 , – δn+1 = choose(C, δn ) for some C = ∃U.B ∈ cl(K). Now fix a set F ⊆ {δ} × IN∗ and a bijection λ : F → S such that (i) F is a forest, (ii) λ(δ, ε) = δ, (iii) if (δ, w), (δ, w · c) ∈ F with w · c a successor of w, then λ(δ, w · c) = λ(δ, w) · δn+1 for some δn+1 ∈ ∆I . For each (δ, w) ∈ F , set Tail(δ, w) = δn if f (δ, w) = δ1 · · · δn . The unraveling for δ is the interpretation J with ∆J = F and, for each (δ, w) ∈ ∆J : (a) for each o ∈ nom(K), NoJ = {(δ, w) ∈ ∆J | Tail(δ, w) ∈ oI } for No ∈ NC a fresh concept name; (b) for each concept name A ∈ con(K), (δ, w) ∈ AJ iff Tail(δ, w) ∈ AI ; (c) for each role name r ∈ rol(K), h(δ, w), (δ, w0 )i ∈ rJ iff (δ, w0 ) is a neighbor of (δ, w), and hTail(δ, w), Tail(δ, w0 )i ∈ rI . Let R ⊆ ∆I contain each δ s.t. oI = δ for some o ∈ nom(K). The union of all ↓(I, δ) with δ ∈ R is called an unraveling for I, denoted ↓(I), where unions of interpretations are defined in the natural way. Please note that the function Tail can also be seen as a homomorphism (up to signature extension) from the elements in the unraveling to elements in the original model. Fig. 1b shows the unraveling for our example KB and model. The dotted lines for the non-root elements labeled No indicate that a copy of the whole tree should be appended. Unravelings are the first step in the process of transforming an arbitrary model of K into a forest model since the resulting interpretation is a strict forest quasi-model of K. Lemma 1. Let I be a model of K, then J =↓(I) is a strict forest quasi-model for K with branching degree bounded in |cl(K)|. In the following steps, we traverse a forest quasi-model in an order in which elements with smaller tree depth are always of smaller order than elements with greater tree depth. Elements with the same tree depth are ordered lexicograph- ically. The bounded branching degree of unravelings then guarantees that, after a finite number of steps, we go on to the next level in the forest and process all nodes eventually. Definition 3. Let ≤ be a total order over NI , K a consistent ALCOIFb KB, and J a forest quasi-interpretation for K. We extend the order to elements in ∗ ∗ ∆J as follows: let w1 = wp · c11 · · · cn1 , w2 = wp · c12 · · · cm 2 ∈ IN where wp ∈ IN is the longest common prefix of w1 and w2 , then w1 < w2 if either |w1 | < |w2 | or both |w1 | = |w2 | and c11 < c12 . For i ∈ {1, 2} and (ρi , ε) ∈ ∆J , let oi ∈ nom(K) be the smallest nominal such that (ρi , ε) ∈ NoJi . Now (ρ1 , w1 ) < (ρ2 , w2 ) if either (i) |w1 | < |w2 | or (ii) |w1 | = |w2 | and o1 < o2 or (ii) |w1 | = |w2 |, o1 = o2 and w1 < w2 . When collapsing, we create new elements of the form (ρw, w0 ) from (ρ, ww0 ). We extend, therefore, the order as follows: (ρ1 w1 , w10 ) < (ρ2 w2 , w20 ) if (ρ1 , w1 w10 ) < (ρ2 , w2 w20 ). During the traversal, we merge nodes such that, finally, all nominal place- holders can be interpreted as singleton sets. To satisfy functionality restrictions, we merge not only nominal placeholders, but also elements that are related to a nominal placeholder by an inverse functional role since, by definition of the semantics, these elements have to correspond to the same element in a model. In order to identify such elements, we define backwards counting paths as follows: Definition 4. Let I be a (quasi) forest model for K. We call p = δ1 · . . . · δn a path from δ1 to δn if, for each i with 1 ≤ i < n, hδi , δi+1 i ∈ riI for some role U Un−1 ri ∈ rol(K). The length |p| of a path p is n − 1. We write δ1 →1 δ2 . . . → δn to denote that hδi , δi+1 i ∈ UiI for each 1 ≤ i < n. The path p is a descending path if there is some (ρ, ε) ∈ ∆I s.t., for each 1 ≤ i ≤ n, δi = (ρ, wi ) and, for each 1 ≤ i < n, |wi | < |wi+1 |; p is a backwards counting path (BCP) in I if δn ∈ oI (δn ∈ NoI ) for some o ∈ nom(K) and, for each 1 ≤ i < n, hδi , δi+1 i ∈ fiI for some inverse functional role fi ; p is a descending BCP if it is descending and a f1 fn BCP. Given a BCP p = δ1 → δ2 . . . → δn+1 with δn+1 ∈ oJ (δn+1 ∈ NoJ ), we call the sequence f1 · · · fn o a path sketch of p. Please note that (ρ, w) is already a descending BCP if (ρ, w) ∈ oI (NoI ). We now show how we can “collapse” a forest quasi-model into a forest model provided it satisfies some admissibility restrictions. During the traversal, we dis- tinguish two situations: (i) we encounter an element (ρ, w) that starts a descend- ing BCP and we have not seen another element before that starts a descending BCP with the same path sketch. In this case, we promote (ρ, w) to become a new root node of the form (ρw, ε) and we shift the subtree rooted in (ρ, w) with it; (ii) we encounter a node (ρ, w) that starts a descending BCP, but we have already seen a node (ρ0 , w0 ) that starts a descending BCP with that path sketch and which is now a root of the form (ρ0 w0 , ε). In this case, we delete the subtree rooted in (ρ, w) and identify (ρ, w) with (ρ0 w0 , ε). If (ρ, w) is an f -successor of its predecessor for some inverse functional role f , we delete all f − -successors of (ρ0 w0 , ε) and their subtrees in order to satisfy the functionality restriction. We use a notion of collapsing admissibility to characterize models s.t. the the predecessor of (ρ, w) satisfies the same atomic concepts as the deleted successor of (ρ0 , w0 ), which ensures that local consistency is preserved. Definition 5. Let K0 = nomFree(K) and J be a forest quasi-interpretation for K, then J is collapsing-admissible if there exists a function ch : (cl(K) × ∆J ) → ∆J s.t., for each C = ∃U.B ∈ cl(K) and δ ∈ C J , 1. hδ, ch(C, δ)i ∈ U J , ch(C, δ) ∈ B J and, if there is no functional role f s.t. hδ, ch(C, δ)i ∈ f J , then ch(C, δ) is a successor of δ, 2. if there is some δ 0 ∈ C J s.t. δ and δ 0 start descending BCPs with identical path sketches, then hδ, ch(C, δ)i ∼= hδ 0 , ch(C, δ 0 )i. We define ∼ as the smallest equivalence relation on ∆J that satisfies δ1 ∼ δ2 if δ1 , δ2 start descending BCPs with identical path sketches. If J is a strict forest quasi-model for K, we call J0 = J an initial collapsing for J and the smallest element (ρ0 , w0 ) ∈ ∆J0 with w0 6= ε that starts a descend- ing BCP the focus of J0 . Let Ji be a collapsing for J and (ρi , wi ) ∈ ∆Ji the focus of Ji . We obtain a collapsing Ji+1 for J from Ji with focus (ρi+1 , wi+1 ) B B B B {o} E No E {o} E C E DE C E DE r r f g f g A r f B s A s C E r A A s BC r g r s E B r A s DE A s BD A s f r f E No E r r s A B A s BC g BC A s C E E E r g r r A s BD f BD f A B E E No E A s DE Fig. 1. a) A representation of a model for the running example. b) Result of unravel- ing this model. c) Result of collapsing this unraveling with infinitely many new root elements displayed in the top line. the smallest element starting a descending BCP and (ρi+1 , wi+1 ) > (ρi , wi ) ac- cording to the following two cases: 1. There is no element (ρ, ε) ∈ ∆Ji s.t. (ρ, ε) < (ρi , wi ) and (ρ, ε) ∼ (ρi , wi ). Then Ji+i is obtained from Ji by renaming each element (ρi , wi wi0 ) ∈ ∆Ji to (ρi wi , wi0 ). 2. There is an element (ρ, ε) ∈ ∆Ji s.t. (ρ, ε) < (ρi , wi ) and (ρ, ε) ∼ (ρi , wi ). Let (ρ, ε) be the smallest such element. (a) ∆Ji+1 = ∆Ji \ ({(ρi , wi wi0 ) | wi0 ∈ IN∗ } ∪ {(ρ, w) | w = c · w0 , c ∈ IN, w0 ∈ IN∗ , (ρi , wi ) has a predecessor (ρi , wi0 ) such that h(ρi , wi0 ), (ρi , wi )i ∈ f Ji for an inverse functional role f in rol(K) and h(ρ, c), (ρ, ε)i ∈ f Ji }); (b) for each A ∈ con(K), δ ∈ ∆Ji+1 , δ ∈ AJi+1 iff δ ∈ AJi ; (c) for each r ∈ rol(K), δ1 , δ2 ∈ ∆Ji+1 , hδ1 , δ2 i ∈ rJi+1 iff (a) hδ1 , δ2 i ∈ rJi or (b) δ1 is the predecessor of (ρi , wi ) in Ji , δ2 = (ρ, ε), and hδ1 , (ρi , wi )i ∈ r Ji . For a collapsing Ji , safe(Ji ) is the restriction of Ji to elements (ρ, w) s.t. (ρ, w) ∈ Jj for all j ≥ i. With Jω we denote the non-disjoint union of all interpretations safe(Ji ) obtained from subsequent collapsings Ji for J . The in- terpretation obtained from Jω by interpreting each o ∈ nom(K) as (ρ, ε) ∈ NoJω is denoted by collapse(J ) and called a purified interpretation w.r.t. J. If collapse(J ) |= K, we call collapse(J ) a purified model of K. Since, in unravelings, elements that start (desccending) BCPs with identical path sketches have been generated from the same element in the unravelled model, collapsing-admissibility is immediate and the function ch can be defined using the function choose from the unraveling. Using collapsing-admissibility, we can show that, whenever we delete an element and the subtree rooted in it during the collapsing, the predecessor of the focus is a suitable replacement. Lemma 2. Let J be a strict forest quasi-model for K with branching degree b that is collapsing-admissible. Then collapse(J ) is a forest model for K that still has branching degree b. By Lemma 1 and since unravelings are collapsing-admissible, we immediately get that collapsing an unraveling yields a forest model with bounded branching degree. At this point, the number of roots might still be infinite and we could have obtained the same result by unraveling an arbitrary model, where we take all elements on BCPs as roots instead of taking just the nominals and creating new roots in the collapsing process. In the next sections, however, we show how we can transform an unraveling of a counter-model for the query such that it remains collapsing-admissible and such that it can in the end be collapsed into a forest model with a finite number of roots that is still a counter model for the query. For this transformation it is much more convenient to work with real trees and forests, which is why we work with strict forest quasi-interpretations. 3 Quasi-Entailment in Quasi-Models In this section, we provide a characterization for query entailment in forest quasi- models that mirrors query entailment for the corresponding “proper models”. In our further argumentation, we talk about the initial part of a tree, i.e., the part that remains if one cuts branches down to a fixed length. For a forest interpre- tation I and some n ∈ IN, we denote, therefore, with cutn (I) the interpretation obtained from I by restricting ∆I to those pairs (ρ, w) for which |w| ≤ n. One can show that, in the case of purified models, we find only finitely many unrav- eling trees of depth n that “look different” (i.e., that are non-isomorphic). For our further considerations, we introduce the notion of “anchored n-com- ponents”. These are certain substructures of forest quasi-interpretations that we use to define a notion of quasi entailment. Definition 6. Let J be a forest quasi-interpretation and δ ∈ ∆J . An interpre- tation C is called anchored n-component of J with witness δ if C can be created by restricting J to a set W ⊆ ∆J obtained as follows: – Let Jδ be the subtree of J that is started by δ and let Jδ,n := cutn (Jδ ). Select a subset W 0 ⊆ ∆Jδ,n that is closed under predecessors. – For every δ 0 ∈ W 0 , let P be a finite set (possibly empty) of descending BCPs p starting fromSδ 0 and let Wδ0 contain all nodes from all p ∈ P . – Set W = W 0 ∪ δ0 ∈W 0 Wδ0 . The following definition and lemma employ the notion of anchored n-compo- nents to come up with the notion of quentailment (short for quasi-entailment), a criterion that reflects query-entailment in the world of forest quasi-models. Fig. 2 illustrates this correspondence. Definition 7. Let J be a forest quasi-model for K and q a CQ with ](q) = n and V = var(q). We say that J quentails q, written J |≈ q, if J contains connected anchored n-components C1 , . . . , C` and there are variable assignment functions Ci µi : V → 2∆ such that: Q1 For every x ∈ V , there is at least one Ci , such that µi (x) 6= ∅ No E µ(x1 ) r µ(x2 ) r µ(x5 ) r C1 A r r µ1 (x1 ) A A A A r s B A C {o} E s BC s BC s BC s BC C2 µ1 (x2 )r s E E E E E A BD f f g g µ2 (x5 ) µ1 (x3 ) E No E f µ(x3 ) µ(x4 ) A r s B g BC C µ2 (x4 ) E E A r s B f B f D D E µ2 (x3 ) E No E r s B g g A C BC BC E E E A r s B f B f B f D D D E E E No E Fig. 2. Correspondence between entailment and quentailment. Left: match µ for q = {r(x1 , x2 ), s(x2 , x3 ), f (x4 , x3 ), s(x5 , x4 )} into the “proper model”. Right: anchored com- ponents C1 and C2 and functions µ1 and µ2 establishing ↓(I) |≈ q. Q2 For all A(x) ∈ q, we have µi (x) ⊆ AJ for some i. Q3 For every r(x, y) ∈ q there is a Ci such that there are δ1 ∈ µi (x) and δ2 ∈ µi (y) such that hδ1 , δ2 i ∈ rJ . Q4 If, for some x ∈ V , there are connected anchored n-components Ci and Cj with δ ∈ µi (x) and δ 0 ∈ µj (x), then there is – a sequence Cn1 , . . . , Cnk with n1 = i and nk = j and – a sequence δ1 , . . . , δk with δ1 = δ and δk = δ 0 as well as δm ∈ µnm (x) for all 1 ≤ m < k, such that, for every m with 1 ≤ m < k, we have that – Cnm contains a descending BCP p1 started by δm , – Cnm+1 contains a descending BCP p2 started by δm+1 , – p1 and p2 have the same path sketch. Note that an anchored component may contain none, one or several instan- tiations of a variable x ∈ V . Intuitively, the definition ensures, that we find matches of query parts which when fitted together by identifying BCP-equal elements yield a complete query match. Lemma 3. For any model I of K, ↓(I) |≈ q implies I |= q and, for any collapsing- admissible strict forest quasi-model J of K, collapse(J ) |= q implies J |≈ q. 4 Limits and Forest Transformations One of the major obstacles for a decision procedure for CQ entailment is that for DLs including inverses, nominals, and cardinality restrictions (or alternatively functionality), there are potentially infinitely many new roots. If we want to elim- inate new roots such that only finitely many remain, they have to be replaced by “uncritical” elements. We will construct such elements as “environment-limits” – new domain elements which can be approximated with arbitrary precision by No E r A δ r A s BC E r A A s BD f E No E A r s BC E ∼ = A r s BC E g BC E r r A s BD f BD A s BD f BD f E E E E No E r r A s BC g BC g BC A s BC g BC BC E E E E E E Fig. 3. Pseudo forest model (right) and according 2-secure replacement for δ (left). already present domain elements – possibly without themselves being present in the domain.3 Definition 8. Let I with δ ∈ ∆I be a model of K. A tree interpretation J is said to be generated by δ, written: J C δ, if it is isomorphic to the restriction of ↓(I, δ) to elements of {(δ, cw) | (δ, cw) ∈ ∆↓(I,δ) , c 6∈ H} for some H ⊆ IN. The set of limits of I, written lim I, is the set of all tree interpretations J s.t., for every k ∈ IN, there are infinitely many δ ∈ ∆I with cutk (L) ∼ = cutk (J ) for some L C δ. The right hand side of Fig. 3 displays one limit element of our example model. The following lemma gives some useful properties of limits. Lemma 4. Let K0 = nomFree(K), I a purified model of K, and n some fixed natural number. Then the following hold: 1. Let L0 be a tree interpretation such that there are infinitely many δ ∈ ∆I with L0 ∼ = cutn (L) for some L C δ. Then, there is at least one limit J ∈ lim I such that cutn (J ) ∼ = L0 . 2. Every J ∈ lim I is locally K0 -consistent apart from its root (ρ, ε). 3. For every J ∈ lim I, every root (ρ, ε) in J has no BCP to any (ρ, w) ∈ ∆J . 4. Every J ∈ lim I is collapsing-admissible. Having defined and justified limit elements as convenient building blocks for restructuring forest quasi-interpretations, the following definition states how this restructuring is carried out. Definition 9. Let I be a model for K and J some forest quasi-model for K with δ ∈ ∆J . A strict tree quasi-interpretation J 0 ∈ lim I is called an n-secure replacement for δ if (i) cutn (↓(J , δ)) is isomorphic to cutn (J 0 ) and (ii) for every anchored n-component of J 0 with witness δ 0 , there is an isomorphic anchored n-component of J with witness δ. If δ ∈ ∆J has an n-secure replacement in lim I, δ is n-replaceable w.r.t. I and it is n-irreplaceable w.r.t. I otherwise. 3 As an analogy, consider the fact that any real number can be approximated by a sequence of rational numbers, even if it is itself irrational. B No E No E f DE r r A A s r r A s BC A E r r A s BD f A s BD E No E E r r A s BC g BC A s BC g BC E E E E r r A s BD f BD f BD A s BD f BD f BD E E E E E E Fig. 4. a) Result of replacing the element δ by the 2-secure replacement depicted in Fig. 3. The inserted component is highlighted. b) Result of collapsing the forest quasi- model displayed on the left hand side. Now, let (ρ, w) ∈ ∆J be n-replaceable w.r.t. I and J 0 an according n- replacement for (ρ, w) from lim I with root (ς, ε). The result of replacing (ρ, w) J0 by J 0 is an interpretation R with ∆R = ∆J 00 00 red ∪ {(ρ, ww ) | (ς, w ) ∈ ∆ } for J ∆red = (∆J \ {(ρ, ww0 ) | |w0 | > 1}) s.t. 0 – for each A ∈ con(K0 ), AR = (AJ ∩ ∆J 0 0 J red ) ∪ {(ρ, ww ) | (ς, w ) ∈ A } J J – for each r ∈ rol(K ), r = (r ∩ ∆red × ∆red ) ∪ {h(ρ, ww ), (ρ, ww00 )i | 0 R J 0 0 h(ς, w0 ), (ς, w00 )i ∈ rJ } For J =↓(I), an interpretation J 0 is called an n-secure transformation of J if it is obtained by (possibly infinitely) repeating the following step: Choose one unvisited w.r.t. tree-depth minimal node (ρ, w) that is n-replaceable w.r.t. I. Replace (ρ, w) with one of its n-secure replacements from lim I and mark (ρ, w) as visited. Fig. 3 displays a 32-secure replacement in the considered unraveling of our example model and Fig. 4a displays the result of carrying out this replacement step on our example. The following lemma ensures that not too many elements (actually defined in terms of the original model) are exempt from being replaced. Lemma 5. Every purified model I of K contains only finitely many distinct elements that start a BCP and are the cause for n-irreplaceable nodes in the unraveling of I. Next, we can show that the process of unraveling, n-secure transformation and collapsing preserves the property of being a model of a knowledge base and (with the right choice of n) also preserves the property of not entailing a conjunctive query. Moreover, this model conversion process ensures that the resulting model contains only finitely many new nominals (witnessed by a bound on the length of BCPs). Fig. 4b illustrates these properties for our example model. Note that only two new nominals are left whereas collapsing the original unraveling yields infinitely many. Lemma 6. Let I be a purified model of K, J =↓(I), and J 0 an n-secure trans- formation of J . Then the following hold: 1. collapse(J 0 ) is a model of K. 2. There is a natural number m such that J 0 does not contain any node whose shortest descending BCP has a length greater than m. 3. If, for some CQ q, we have J |6≈ q and n > ](q), then J 0 |6≈ q. 4. If, for some CQ q, we have I 6|= q and n > ](q), then collapse(J 0 ) 6|= q. Now we are able to establish our first milestone on the way to showing finite representability of countermodels. Theorem 1. For every ALCOIFb KB K and CQ q s.t. K 6|= q, there is a forest model I of K with finitely many roots and bounded branching degree s.t. I 6|= q. 5 Finite Representations of Models We can now use standard techniques from tableau algorithms (adapted to work on models) to construct finite representations for a forest model of K with finitly many roots. In particular the tableau algorithm with n-tree-blocking, n ≥ ](q), for deciding CQ entailment in SHIQ, SHOQ, and SHOI with only simple roles in the query [5, 6] works exactly like that. We call an interpretation on which we applied n-tree-blocking and discarded all blocked elements an n-representation. Such an n-representation corresponds to a complete and clash-free completion graph in tableau algorithms. Since we fixed a bound on the number of roots in Theorem 1 and otherwise only consider the closure cl(K) of K, one can show that there are only finitely many non-isomorphic n-blocking-trees even though we take links back to roots into account. Similarly, we can now use an adapted version of the technique for building a tableau from a complete and clash-free completion graph to show that we can obtain a model for K from an n-representation. Lemma 7. Let n ≥ ](q). If K 6|= q, then there is an n-representation R of K s.t. R 6|= q, and from R one can build a model I of K such that I 6|= q. Thus, we can enumerate all (finite) n-representations for K and check whether they entail q. Together with the semi-decidability result for FOL [4], we get the desired theorem. Theorem 2. It is decidable whether K |= q for K an ALCOIFb knowledge base and q a Boolean conjunctive query. This solves the long-standing open problem of deciding conjunctive query entailment in the presence of nominals, inverse roles, and qualified number re- strictions. Since the approach is purely a decision procedure, the computational complexity of the problem remains open and will be part of our future work. Similarly, we will embark on extending our results to SHOIQ with non-simple roles as query predicates. Acknowledgements. Sebastian Rudolph was supported by a scholarship of the German Academic Exchange Service (DAAD) and Birte Glimm is funded by the EPSRC project HermiT: Reasoning with Large Ontologies. References 1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F.: The Description Logic Handbook. Cambridge University Press (2003) 2. Glimm, B., Rudolph, S.: Nominals, inverses, counting, and conjunctive queries or Why infinity is your friend! Technical report, University of Oxford (2009) http: //www.comlab.ox.ac.uk/files/2175/paper.pdf. 3. Rudolph, S., Krötzsch, M., Hitzler, P.: Terminological reasoning in SHIQ with ordered binary decision diagrams. In: Proc. 23rd National Conference on Artificial Intelligence (AAAI 2008), AAAI Press/The MIT Press (2008) 529–534 4. Gödel, K.: Über die Vollständigkeit des Logikkalküls. PhD thesis, Universität Wien (1929) 5. Ortiz, M.: Extending CARIN to the description logics of the SH family. In: Proceed- ings of Logics in Artificial Intelligence, European Workshop (JELIA 2008), Lecture Notes in Artificial Intelligence (2008) 324–337 6. Ortiz, M., Calvanese, D., Eiter, T.: Data complexity of query answering in expressive description logics via tableaux. Journal of Automated Reasoning 41(1) (2008) 61–98