On Query Answering in Description Logics with Number Restrictions on Transitive Roles? Vı́ctor Gutiérrez-Basulto1 , Yazmı́n Ibáñez-Garcı́a2 , and Jean Christoph Jung3 1 Cardiff University, UK ( gutierrezbasultov@cardiff.ac.uk ) 2 TU Wien, Austria ( ibanez@kr.tuwien.ac.at ) 3 Universität Bremen, Germany ( jeanjung@informatik.uni-bremen.de ) Abstract. We study query answering in the description logic SQ sup- porting number restrictions on both transitive and non-transitive roles. Our main contributions are (i) a tree-like model property for SQ knowl- edge bases and, building upon this, (ii) an automata based decision procedure for answering two-way regular path queries, which gives a 3ExpTime upper bound. 1 Introduction In the last years, several efforts have been put into the study of the query an- swering problem (QA) in description logics (DLs) featuring transitive roles (or generalisations thereof, such as regular expressions on roles) and number re- strictions, see e.g., [10, 11, 9, 7, 8] and references therein. However, all these DLs heavily restrict the interaction between these two features, or altogether forbid number restrictions on transitive roles. Unfortunately, this comes as a shortcom- ing in crucial DL application areas like medicine and biology in which many terms, e.g., proteins, are defined and classified according to the number of com- ponents they contain or are part of (in a transitive sense) [27, 22, 24]. The lack of investigations of query answering in DLs of this kind is partly because (i) the interaction of these features often leads to undecidability of the standard reasoning tasks (e.g., satisfiability) - even in lightweight sub-Boolean DLs with unqualified number restrictions [17, 20, 15]; and (ii) for those DLs known to be decidable, such as SQ and SOQ [20, 18], only recently tight com- plexity bounds were obtained [15]. Moreover, even if these features (with re- stricted interaction) do not necessarily increase the complexity of QA, they do pose additional challenges for devising decision procedures [10, 11, 9] since they lead to the loss of properties, such as the tree model property, which make the design of algorithms for QA simpler. In fact, these difficulties are present already in DLs with transitivity, but without number restrictions [9]. Clearly, these issues are exacerbated if number restrictions are imposed on transitive roles. The objective of this paper is to start the investigation of query answering in DLs supporting number restrictions on transitive roles. In particular, we look ? Gutiérrez-Basulto was funded by the EU’s Horizon 2020 programme under the Marie Sklodowska-Curie grant No 663830 at the problem of answering regular path queries, which generalise standard query languages like positive existential queries, over SQ knowledge bases [16]. We first develop tree-like decompositions of SQ-interpretations based on a novel unraveling that is specially tailored to handle the interaction of transitivity with number restrictions. Using these decompositions, we design an algorithm for the query answering problem using two-way alternating tree automata in the spirit of [10, 7, 8], resulting in a 3ExpTime upper bound (leaving an exponential gap). Related Work. Schröder and Pattinson [23] investigate the DL PHQ support- ing number restrictions on transitive parthood roles, which are, in contrast to SQ, interpreted as trees: parthood-siblings cannot have a common part. They show that under this assumption decidability (for satisfiability) can be attained. There has been some work on the extension of decidable first-order logic fragments, such as the guarded fragment, with transitivity and counting, see e.g., [25, 21]. Unfortunately, this case leads to undecidability unless the inter- action is severely restricted [25]. Closer to DLs is the detailed investigation of modal logics with graded modalities carried out in [19]. Finally, in the context of existential rules, several efforts have been recently made to design languages with decidable QA supporting transitivity [12, 4, 1]. However, we are not aware of any attempts to additionally support number restrictions. 2 Preliminaries Syntax. We introduce the DL SQ, which extends the classical DL ALC with transitivity declarations on roles (S) and qualified number restrictions (Q). We consider a vocabulary consisting of countably infinite disjoint sets of concept names NC , role names NR , and individual names NI , and assume that NR is partitioned into two countably infinite sets of non-transitive role names Nnt R and transitive role names NtR . The syntax of SQ-concepts C, D is given by the grammar rule C, D ::= A | ¬C | C u D | (≤ n r C) where A ∈ NC , r ∈ NR , and n is a number given in binary. We use (≥ n r C) as an abbreviation for ¬(≤ (n−1) r C), and other standard abbreviations like ⊥, >, C tD, ∃r.C, ∀r.C. Concepts of the form (≤ n r C) and (≥ n r C) are called at most-restrictions and at least-restrictions, respectively. An SQ-TBox T is a finite set of concept inclusions C v D where C, D are SQ-concepts. An ABox is a finite set of concept and role assertions of the form A(a), r(a, b) where A ∈ NC , r ∈ NR and {a, b} ⊆ NI ; ind(A) denotes the set of individual names occurring in A. A knowledge base (KB) K is a pair (T , A). Semantics. As usual, the semantics is defined in terms of interpretations. An interpretation I = (∆I , ·I ) consists of a non-empty domain ∆I and an inter- pretation function ·I mapping concept names to subsets of the domain and role names to binary relations over the domain such that transitive role names are mapped to transitive relations. We define C I for complex concepts C by inter- preting ¬ and u as usual and (≤ n r D)I by taking (≤ n r D)I = {d ∈ ∆I | |{e ∈ C I | (d, e) ∈ rI }| ≤ n}. For ABoxes A we adopt the standard name assumption (SNA), that is, aI = a, for all a ∈ ind(A), but we conjecture that our results hold without the unique name assumption. The satisfaction relation |= is defined in the standard way: I |= C v D iff C I ⊆ DI ; I |= A(a) iff a ∈ AI ; I |= r(a, b) iff (a, b) ∈ rI . An interpretation I is a model of a TBox T , denoted I |= T , if I |= α for all α ∈ T ; it is a model of an ABox A, written I |= A, if I |= α for all α ∈ A; it is a model of a KB K if I |= T and I |= A. A KB is satisfiable if it has a model. Query Language. As query language, we consider regular path queries, sup- porting regular expressions over roles. Recall that a regular expression E over an alphabet Σ is given by the grammar E ::= ε | σ | E·E | E ∪ E | E ∗ , where σ ∈ Σ and ε denotes the empty word. We denote with L(E) the language defined by E. We use N± R to refer to NR ∪ {r − | r ∈ NR } with (r− )I defined as {(d, e) | (e, d) ∈ r }, and identify r with s ∈ NR if r = s− . A positive 2-way regular path I − query (P2RPQ) is a formula of the form q(x) = ∃y.ϕ(x, y) where x and y are tuples of variables and ϕ is constructed using ∧ and ∨ of atoms of the form A(t) or E(t, t0 ) where A ∈ NC , E is a regular expression over S ::= N± R ∪{A? | A ∈ NC }, and t, t0 are terms, i.e., individual names or variables from x, y. We define as usual when a possible answer tuple a ∈ ind(A) is a certain answer of q over K and write K |= q(a) in case it is [8, 6]. Reasoning Problem. We study the certain answers problem: Given a KB K, a query q(x) and a tuple of individuals a, determine whether K |= q(a). Without loss of generality, we consider Boolean queries. 3 Decomposing SQ-Interpretations Existing algorithms for QA in expressive DLs, e.g., SHIQ (without number restrictions on transitive roles), exploit the fact that for answering queries it suffices to consider canoni- (a) ¬B cal models that are forest-like, roughly consisting of an inter- pretation of the ABox and a collection of tree-interpretations whose roots are elements of the ABox. However, for SQ this A C tree-model property is lost: B B Example 1. Let T = {A v (≤ 1 r C) u ∃r.B u ∃r.¬B, > v (b) ∃r.C} with r ∈ NtR . The number restrictions in T force that ev- ery model of T satisfying A contains the structure in Fig. 1(a). A Moreover, in SQ strongly connected components can be en- B forced. Let T 0 = {A v (= 3 r B), B v (= 3 r B), A v ¬B} with r ∈ NtR . Then, in every model of T 0 , an element satis- B fying A roots the structure depicted in Fig. 1(b), where the elements satisfying B form a strongly connected component. Fig. 1. Nevertheless, we will define tree-like canonical models for SQ that suffice for query answering. We start with introducing a basic form of tree decompositions of SQ-interpretations. A tree is a connected, acyclic graph (T, E) with a distinguished root, which we usually denote with ε. We usually write only T instead of (T, E), thus leaving E implicit. Fix some interpretation I = (∆I , ·I ). A bag M is a set of assertions of the form A(d), r(d, e) with d, e ∈ ∆I . We denote with dom(M ) the set of domain elements appearing in M . Given a set Λ ⊆ ∆I , we denote with bagI (Λ) the set bagI (Λ) = {A(d) | d ∈ Λ, d ∈ AI } ∪ {r(d, e) | d, e ∈ Λ, (d, e) ∈ rI }. Definition 2. A tree decomposition of an interpretation I is a tuple (T, bg) where T is a tree and bg assigns a bag to every node w in T such that: (i) bg(w) S = bagI (dom(bg(w))); (ii) ∆I = w∈T dom(bg(w)); (iii) rI = χr for non-transitive r and rI = χ+ r for transitive r, where χr = {(d, e) | r(d, e) ∈ bg(w), w ∈ T }; (iv) for all d ∈ ∆I , the set {w ∈ T | d ∈ dom(bag(w))} is connected in T . Definition 2 provides a variant of tree decompositions of interpretations with transitive relations. This formalisation does not yet enable tree automata to count over transitive roles (with a small number of states) since assertions r(d, e) can appear far away from each other in the decomposition. To address this, we introduce canonical tree decompositions which extend tree decompositions with a third component rl which assigns a role name to every non-root node of T and ⊥ to the root ε. Intuitively, a node w ∈ T labeled with r = rl(w) will be responsible for capturing r-successors of some element(s) in the previous bag. Fix a triple (T, bg, rl) such that (T, bg) is a tree decomposition of I. By Item (iv) of Definition 2, for each d ∈ ∆I , there is a unique node w ∈ T such that all occurrences of d are in or below w in T . In this case, we say that d is fresh in w, and we denote with F (w) the set of all fresh elements in w. We will also need a relativised version of freshness which takes into account the role associated to the predecessor bag. In particular, for each transitive role r and each w ∈ T with rl(w) ∈ {r, ⊥}, we define a set Fr (w) by taking: – Fr (w) = F (w) if the predecessor ŵ of w satisfies rl(ŵ) ∈ {r, ⊥}; – Fr (w) = dom(bg(w)) otherwise. Intuitively, Fr (w) contains all elements that are eligible as origins for r-successors. For a transitive role r and a bag M , we call ∅ ( a ⊆ dom(M ) an r-cluster in M if (i) r(a, b) ∈ M for all a 6= b ∈ a, and (ii) for all a ∈ a, b ∈ dom(M ) with r(a, b), r(b, a) ∈ M , we have b ∈ a. An r-cluster a in M is an r-root cluster in M if r(d, e) ∈ M for all d ∈ a and e ∈ dom(M ) \ a. Definition 3. A triple T = (T, bg, rl) is a canonical tree decomposition of I if (T, bg) is a tree decomposition of I and the following conditions are satisfied for every w ∈ T with M = bg(w) and r = rl(w) and every successor w0 of w with M 0 = bg(w0 ) and r0 = rl(w0 ): (C1) if r0 ∈ Nnt 0 0 R , then dom(M ) = {d, e}, for some d ∈ F (w), e ∈ F (w ), and 0 0 r (d, e) is the only role assertion in M ; (C2) if r0 ∈ NtR and r ∈ / {⊥, r0 }, then there are d ∈ F (w) and an r-root cluster a in M such that dom(M ) ∩ dom(M 0 ) = {d} and d ∈ a; moreover, there is no 0 successor v 0 of w different from w0 satisfying this for d and rl(v 0 ) = r0 ; (C3) if r0 ∈ NtR and r ∈ {⊥, r0 }, then there is an r0 -cluster a in M with: (a) a ⊆ Fr0 (w); (b) a is an r0 -root cluster in M 0 ; (c) for all d ∈ a and r0 (d, e) ∈ M , we have e ∈ dom(M 0 ); and (d) for all r0 (d, e) ∈ M 0 , d ∈ a ∪ F (w0 ) or e ∈ / F (w0 ). Definition 3 imposes restrictions on the structural relation between neighbour- ing bags. Note that Condition (C1) is also satisfied by standard unravelings over non-transitive roles [2]. Condition (C2) reflects that neighbouring bags associated with different role names do only interact via single domain elements; this con- forms with viewing SQ as a fusion logic [3]. Most interestingly, Condition (C3) plays the role of (C1), but for transitive roles. It is important to note that (C3) is based on r-clusters now since they can be enforced, see Example 1 above. Item (a) restricts for which elements a we can have successor bags; Item (b) requires that a is a root cluster in the successor bag M 0 ; Item (c) states that everything which was reachable from a via r0 in M should be also included in M 0 ; finally, Item (d) requires that there are no connections r(d, e) ∈ M 0 between elements d from M and fresh elements e from M 0 . As a consequence of Definition 3, we can address r-successors of elements in a canonical tree decomposition T = (T, bg, rl) of I. For a non-transitive role r, Condition (C1) ensures that r-successors e of d are contained only in suc- cessor nodes of the (unique) node where d is fresh. For a transitive role r, note first that (d, e) ∈ rI iff there is a sequence d0 , w0 , d1 , . . . , wn−1 , dn with di ∈ ∆I and wi ∈ T such that d = d0 , e = dn , and r(di , di+1 ) ∈ bg(wi ), for all 0 ≤ i < n; we call such a sequence an r-path from d to e in T. We call an r-path d0 , w0 , d1 , . . . , wn−1 , dn downward if wi is a successor of wi−1 and di is contained in an r-root cluster of wi , for all 0 < i < n. An r-path in T is canon- ical if P1: it is downward; or P2: d0 ∈ Fr (w0 ), d1 ∈ / Fr (w0 ), and, if n > 1, then d1 , w1 , . . . , dn is a downward path in T and the predecessor ŵ of w1 is an ancestor of w0 and satisfies d1 ∈ Fr (ŵ). Two r-paths d0 , w0 , d1 , . . . , wn−1 , dn and e0 , w00 , e1 , . . . , wm−1 0 , em from d to e are equivalent if n = m, wi = wi0 , for 0 ≤ i < n, and di and ei are in the same r-cluster in bg(wi ), for every 1 ≤ i < n. Lemma 4 establishes the basis for uniquely identifying transitive r-successors in a canonical tree decomposition which is essential for the design of automata. Lemma 4. Let T be a canonical decomposition of I, r transitive, and (d, e) ∈ rI . Then there is a unique canonical r-path up to equivalence from d to e in T. 3.1 Unraveling into Canonical Decompositions We give now the main technical contribution of our paper: an unraveling opera- tion into canonical decompositions of small width, and consequently a tree-like model property for SQ-interpretations. A canonical tree decomposition (T, bg, rl) has width k − 1 if k is the maximum size of dom(bg(w)), where w ranges over T ; its outdegree is the outdegree of the underlying tree T . Theorem 5. Let K = (T , A) be an SQ KB and I |= K. Then, there is an interpretation J and a canonical tree decomposition (T, bg, rl) of J such that: (1) A ⊆ bg(ε); (2) J |= K; (3) there is a homomorphism from J to I; (4) width and outdegree of (T, bg, rl) are bounded by O(|A| · 2poly(|T |) ). We outline the proof of Theorem 5. As a first step, we show that wlog. we can assume that I has a restricted outdegree and width, as defined below. This will be used later on to ensure the satisfaction of Condition (4) above. Given d ∈ ∆I and a transitive role r, the r-cluster of d in I, denoted by QI,r (d), is the set of all elements e ∈ ∆I such that both (d, e) ∈ rI and (e, d) ∈ rI . The width of I is the minimum k such that |QI,r (d)| ≤ k for all d ∈ ∆I , r ∈ NtR . Moreover, for a transitive role r, we say that e is a direct r-successor of d if (d, e) ∈ rI but e ∈/ QI,r (d), and for each f with (d, f ), (f, e) ∈ rI , we have f ∈ QI,r (d) or f ∈ QI,r (e); if r is non-transitive, then e is a direct r-successor of d if (d, e) ∈ rI . The breadth of I is the maximum k such that there are d, d1 , . . . , dk and a role name r, all di are direct r-successors of d, and – if r is non-transitive, then di 6= dj for all i 6= j; – if r is transitive, then QI,r (di ) 6= QI,r (dj ), for all i 6= j. We can assume that width and breath of I are within the following boundaries. Lemma 6 (adapting [19, 15]). For each I |= K, there is a sub-interpretation I 0 of I with I 0 |= K and width and breadth of I 0 are bounded by O(|A|+2poly(|T |) ). We need to introduce one more notion for dealing with at-most restrictions over transitive roles. Let cl(T ) be the set of all subconcepts occurring in T , closed under single negation. For each transitive role r, define a binary relation I,r on ∆I , by taking d I,r e if there is some (≤ n r C) ∈ cl(T ) such that d ∈ (≤ n r C)I , e ∈ C I , and (d, e) ∈ rI . Based on the transitive, reflexive closure ∗I,r of I,r , we define, for every d ∈ ∆I , the set WitI,r (d) of witnesses for d by taking S WitI,r (d) = e|d ∗ e QI,r (e). I,r Intuitively, WitI,r (d) contains all witnesses of at-most restrictions of some el- ement d, and due to using ∗I,r , also the witnesses of at-most restrictions of those witnesses and so on. It is important to note that the size of WitI,r (T ) is bounded exponentially in T (and linearly in A), see appendix. We describe now the construction of the interpretation J and its tree de- composition via a possibly infinite unraveling process. Elements of ∆J will be either of the form a with a ∈ ind(A) or of the form dx with d ∈ ∆I and some index x. We usually use δ to refer to domain elements in J (in either form), and define a function τ : ∆J → ∆I by setting τ (δ) = δ, for all δ ∈ ind(A), and τ (δ) = d, for all δ of the form dx in ∆J . To start the construction of J and (T, bg, rl), we set J = I|ind(A) and, for every transitive role r, define two sets ∆r , ∆0r by taking ∆0r = ∆r ∪ ind(A). S ∆r = {dr | d ∈ a∈ind(A) WitI,r (a) \ ind(A)} and Then extend J by adding, for each transitive r, ∆r to the domain and extending the interpretation of concept and role names such that, for all δ, δ 0 ∈ ∆0r , we have δ ∈ AJ ⇔ τ (δ) ∈ AI , for all A ∈ NC , and (δ, δ 0 ) ∈ rJ ⇔ (τ (δ), τ (δ 0 )) ∈ rI . (†) Now, initialise (T, bg, rl) with T = {ε}, bg(ε) = bagJ (∆J ), and rl(ε) = ⊥. Intuitively, this first step ensures that all witnesses of ABox individuals appear in the first bag. This finishes the initialisation phase. Next, extend J and (T, bg, rl) by applying the following rules exhaustively and in a fair way: R1 Let r be non-transitive, w ∈ T , δ ∈ F (w), and d a direct r-successor of τ (δ) in I with {δ, d} 6⊆ ind(A). Then, add a fresh successor v of w to T , add a fresh element dv to ∆J , extend J by adding (δ, dv ) ∈ rJ and dv ∈ AJ iff d ∈ AI , for all A ∈ NC , and set bg(v) = bagJ ({δ, dv }) and rl(v) = r. R2 Let r be transitive, w ∈ T , and δ ∈ F (w) such that: (a) w = ε and δ ∈ ∆s , s 6= r (∆s defined in the initialisation phase), or (b) w 6= ε and rl(w) 6= r. Then add a fresh successor v of w to T , and define ∆ = {ev | e ∈ WitI,r (τ (δ)) \ {τ (δ)}} and ∆0 = ∆ ∪ {δ}. Extend the domain of J with ∆ and the interpretation of concept and role names such that (†) is satisfied for all δ, δ 0 ∈ ∆0 . Finally, set bg(v) = bagJ (∆0 ) and rl(v) = r. R3 Let r be transitive, w ∈ T , a ⊆ Fr (w) an r-cluster in bg(w) such that: (a) w = ε and a ⊆ ∆0r , or (b) w 6= ε and rl(w) = r. If there is a direct r-successor e of τ (δ) in I for some δ ∈ a such that (δ, δ 0 ) ∈ / rJ for any δ 0 with τ (δ 0 ) = e, then add a fresh successor v of w to T , and define ∆ = {fv | f ∈ WitI,r (e) \ WitI,r (τ (δ))} and ∆0 = ∆ ∪ a ∪ {δ 00 | r(δ 0 , δ 00 ) ∈ bg(w) for some δ 0 ∈ a}. Then extend the domain of J with ∆ and the interpretation of concept names such that (†) is satisfied for all pairs δ, δ 0 with δ ∈ a ∪ ∆ and δ 0 ∈ ∆0 . Finally, set bg(v) = bagJ (∆0 ) and rl(v) = r. Rules R1 –R3 are, respectively, in one-to-one correspondence with Conditions (C1)-(C3) in Definition 3. In particular, R1 implements the well-known unravel- ing procedure for non-transitive roles. R2 is used to change the ‘role component’ for transitive roles by creating a bag which contains all witnesses of the chosen element δ. Finally, R3 describes how to unravel direct r-successors in case of transitive roles r. In the definition of ∆ it is taken care that witnesses which are ‘inherited’ from predecessors are not introduced again, in order to preserve at-most restrictions. Fig. 2. Example 7. Let K = ({A1 v (≤ 1 r B), A2 v (≤ 1 r C)}, {A1 (a)}) with r ∈ NtR . Fig. 2 shows a model I of T and a canonical decomposition T of its unraveling (transitivity connections are omitted). In the initialisation phase, the ⊥-bag is constructed starting from individual a. Since a I,r e and e I,r f , we have WitI,r (a) = {e, f }, thus er and fr are added in this phase. It is verified in the appendix that (T, bg, rl) and J satisfy the conditions from Theorem 5. Theorem 5 yields a tree-like model property for SQ-knowledge bases, which is interesting on its own since existing decidability results (for sat- isfiability) [20, 15] are based on the finite model property. 4 Automata-Based Approach to Query Answering In this section, we devise an automata-based decision procedure for query an- swering in SQ. By Theorem 5, if K 6|= q, there is an interpretation of small width and outdegree witnessing this. The idea is now to design two automata AK and Aq working over tree decompositions which accept precisely the models (of a fixed width) of the KB K and the query q, respectively. Query answering is then reduced to the question if some tree is accepted by AK , but not by Aq [8]. Trees are represented as prefix-closed subsets T ⊆ (N \ {0})∗ such that addi- tionally, wc ∈ T implies w(c − 1) ∈ T for all c > 1. A tree is k-ary if each node has exactly k successors. As a convention, we set w · 0 = w and wc · (−1) = w, N leave ε · (−1) undefined, and for any k ∈ , set [k] = {−1, 0, . . . , k}. Let Σ be a finite alphabet. A Σ-labeled tree is a pair (T, τ ) with T a tree and τ : T → Σ as- signs a letter from Σ to each node. An alternating 2-way tree automaton (2ATA) over Σ-labeled k-ary trees is a tuple A = (Q, Σ, q0 , δ, F ) where Q is a finite set of states, q0 ∈ Q is an initial state, δ is the transition function, and F is the (parity) acceptance condition [26]. The transition function maps a state q and an input letter a ∈ Σ to a positive Boolean formula over the constants true and false, and variables from [k] × Q. The semantics is given in terms of runs, see the appendix. As usual, L(A) denotes the set of trees accepted by A. Emptiness of L(A) can be checked in exponential time in the number of states of A [26]. In principle, tree decompositions T can be represented as labeled trees, where each node label consists of a bag and a role name (or ⊥). However, 2ATAs cannot run over such labeled trees because the domain underlying the bags is potentially infinite. Exploiting the bounded width, we encode the infinite domain with finitely many elements in the following well-known way [14, 5]. Let K be an SQ KB, let K be the bound on the width obtained in Theorem 5, and choose a set of elements ∆ of size 2K. We define the input alphabet Σ as the set of all pairs hM, xi such that M is a bag that uses only constants from ∆, |dom(M )| ≤ K, and x is a role appearing in K or ⊥. A Σ-labeled tree (T, τ ) represents a tree decomposition (and thus an interpretation) as follows. Each domain element d ∈ ∆ induces an equivalence relation ∼d on T by taking v ∼d w iff d appears in all bags on the path from v to w. Domain elements in the represented interpretation are then all equivalence classes obtained in this way. Moreover, for all w ∈ T , τ (w) represents the following bag: bg(w) = {A([w]∼d ) | A(d) ∈ τ (w))} ∪ {r([w]∼d , [w]∼e ) | r(d, e) ∈ τ (w)}. We denote the interpretation associated with a Σ-labeled tree (T, τ ) with IT,τ . Moreover, we consider only k-ary trees where k is the bound on the outdegree given by Theorem 5. Since tree decompositions are not necessarily uniformly branching, we include an auxiliary symbol • to refer to non-existing branches. Lemma 8. There are 2ATAs A1 , A2 , A3 of size O(|A| · 2poly(|T |) ) such that: (T, τ ) ∈ L(A1 ) iff (T, τ ) is the canonical decomposition of some interpretation; (T, τ ) ∈ L(A2 ) iff IT,τ |= A, and (T, τ ) ∈ L(A3 ) iff IT,τ |= T . The mentioned automaton AK is obtained as the conjunction of A1 , A2 , and A3 . Note that AK can be used to decide KB satisfiability in double exponential time, thus not optimal [15]. We concentrate here on the most interesting 2ATA A3 . Denote with nnf(C) the negation normal form of a concept C, and define nnf(T ) = {nnf(C) | C ∈ cl(T )}. Moreover, let Rol(K) be the set of role names appearing in K. Then, define A3 = (Q3 , Σ, q0 , δ3 , F3 ); start by including in Q3 0 0 {q0 } ∪ Qnt ∪ Qt ∪ {Fx,d , Fx,d , F x,d , F x,d | d ∈ ∆, x ∈ {⊥} ∪ Rol(K)} ∪ ∗ 0 {qd , qC,d | C ∈ nnf(T ), d ∈ ∆} ∪ {qC,d , qC,d | C = (∼ n r D) ∈ nnf(T ), d ∈ ∆} ∗ where Qt and Qnt are the states that are used after entering states q(∼n r D),d for transitive and non-transitive roles, respectively. Then, we define the transition ∗ function for all states except states of the form q(∼n r D),d : ^ ^ ^  δ3 (q0 , hM, xi) = (i, q0 ) ∧ (0, qnnf(¬C),d ) ∨ (0, qD,d ) i∈[k] d∈dom(M ) CvD∈T δ3 (q0 , •) = true δ3 (qA,d , hM, xi) = if A(d) ∈ M , then true else false δ3 (q¬A,d , hM, xi) = if A(d) ∈ / M , then true else false δ3 (qC1 tC2 ,d , hM, xi) = (0, qC1 ,d ) ∨ (0, qC2 ,d ) δ3 (qC1 uC2 ,d , hM, xi) = (0, qC1 ,d ) ∧ (0, qC2 ,d ) ∗  _ δ3 (q(∼n r D),d , hM, xi) = (0, Fx,d ) ∧ (0, q(∼n r D),d ) ∨ (i, q(∼n r D),d ) ∧ (i, qd ) i∈[k] δ3 (qd , hM, xi) = if d ∈ dom(M ), then true else false δ3 (F⊥,d , hM, xi) = true ( 0 (−1, Fr,d ) if r ∈ Nnt t R or (r ∈ NR and x = r) δ3 (Fr,d , hM, xi) = false otherwise ( 0 true d ∈ / dom(M ) or (r ∈ NtR and x ∈ / {⊥, r}) δ3 (Fr,d , hM, xi) = false otherwise Intuitively, q0 is used to verify that the TBox is globally satisfied. A state qC,d assigned to a node w is used as an obligation to verify that the element d satisfies the concept C. This can be done locally for Boolean concept constructors u, t, ¬, as implemented in the transitions above. For concepts of the form (∼ n r D), we have to be more careful: the automaton has to move to the unique node w where d ∈ Fr (w), identified using states Fr,d and qd (and the acceptance condition). The transitions for number restrictions on non-transitive roles are defered to the appendix. For transitive roles, Lemma 4 provides the following observation: For counting the r-successors satisfying D of some d ∈ dom(bg(w)), it suffices to look at three “locations” in the tree decomposition: in the bag at w itself, along canonical paths satisfying P1, and along canonical paths satisfying P2. We next implement this strategy for at-least restrictions. In the following transitions, we assume that a1 , . . . , a` are all r-clusters in M , and that a1 , . . . , a` are represen- tatives of each cluster. A partition n1 + . . . + n` = n respects M relative to d if ni = 0 whenever r(d, ai ) ∈ / M ; it d-respects M relative to d if ni = 0 whenever r(d, ai ) ∈ / M or d ∈ ai . Moreover, we define Mr (d) = {e | r(d, e), r(e, d) ∈ M }, and define transitions for (the complement of Fx,d ) F x,d similar to Fx,d . _ ^ ∗ 0 1 δ3 (q(≥n r D),d , hM, xi) = (0, q(≥n i r D),ai ) ∨ (0, q(≥n i r D),ai ) n1 +...+n` =n ni 6=0 respects M rel. to d 0 ↓ δ3 (q(≥n r D),d , hM, xi) = (0, Fr,d ) ∧ (0, q(≥n r D),d ) 1 ↑ δ3 (q(≥n r D),d , hM, xi) = (0, F r,d ) ∧ (−1, q(≥n r D),d ) ↓ _ ^ δ3 (q(≥n r D),d , hM, xi) = (0, pn0 ,r,D,d ) ∧ (i, p(≥ni r D),d ) n0 +n1 +...+nk =n ni 6=0 _ ^ ^  δ3 (pn,r,D,d , hM, xi) = qD,e ∧ q∼D,e Y ⊆Mr (d),|Y |=n e∈Y y∈Mr (d)\Y δ3 (p(≥n r D),d , •) = if n = 0, then true else false  false _  if x 6= r or d not in root cluster ^ 0 δ3 (p(≥n r D),d , hM, xi) = (0, q(≥n i r D),ai ) otherwise   n1 +...+n` =n ni 6=0 d-respects M rel. to d ↑ 0 1  δ3 (q(≥n r D),d , hM, xi) = (0, qd ) ∧ (0, q(≥n r D),d ) ∨ (0, q(≥n r D),d ) Intuitively, the automaton non-deterministically guesses a partition n1 + . . . + nk of n and verifies that, starting from ai at least ni elements are reachable via r and satisfy D. For each such r-cluster, it proceeds either downwards (in states of the form q·0 and q·↓ ) or looks for the world where the cluster ai was a root (in states q·1 and q·↑ ) and proceeds downwards from there on. In states q(≥n ↓ r D),d , the automaton again partitions n this time into n0 , . . . , nk ; it then verifies that there are n0 elements in the r-cluster of d satisfying D and, recursively, that via the i-th successor of the current node, there are ni elements that are reachable via ↓ r and satisfy D. Using the parity condition, we make sure that states q(≥n r D),d with n ≥ 1 are not suspended forever, that is, eventualities are finally satisfied. For the at-most restrictions, recall that (≤ n r D) is equivalent to ¬(≥ n+ 1 r D); we can thus obtain the transitions for q(≤n r D),d by “complementing” the transitions for q(≥n+1 r D),d ; details are given in the appendix. In order to construct, for a given query q, an automaton Aq which accepts a tree (T, τ ) ∈ L(A1 ) iff IT,τ |= q, we adapt and extend ideas from [8] to canonical tree decompositions. The result is a nondeterministic parity tree automaton [13] of size exponential in q, and doubly exponential in K. In contrast to [8], the query automaton depends on the KB. Indeed, for checking whether a fact r(x, y) from the query is true (given some match candidate), it has to recall domain elements in the states; their number, however, is bounded by the width only. It remains to remark that the question of whether L(AK ) \ L(Aq ) is empty can be decided in 3ExpTime, given the mentioned bounds on the sizes of the involved automata. Theorem 9. The certain answers problem for P2RPQs over SQ-KBs is decid- able in 3ExpTime. 5 Discussion and Future Work We have developed a tree-like decomposition for SQ which handles the interac- tion of number restrictions over transitive roles and enables the use of automata- based techniques for query answering. Our techniques yield a 3ExpTime upper bound, leaving an exponential gap to the known 2ExpTime lower bound, for answering positive existential queries over ALC KBs [8]. As immediate future work, we plan to close this gap, taking into account also other techniques for query answering such as rewriting [11]. Another rele- vant question is the precise data complexity – the present techniques give only exponential bounds, but we expect coNP-completeness. Moreover, we plan to extend our approach to nominals and inverses. We will also look at the prob- lem of answering conjunctive queries (CQs) in SQ; in general, the proposed automata-based approach yields the same upper bound for the problem of an- swering P2RPQs or CQs, but we expect it to be easier for CQs. Finally, we plan to see whether our techniques extend to the query containment problem, and develop techniques for finite query answering in (extensions of) SQ. References 1. Amarilli, A., Benedikt, M., Bourhis, P., Vanden Boom, M.: Query answering with transitive and linear-ordered data. In: Proc. of IJCAI-16. pp. 893–899 (2016) 2. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation, and Applica- tions. Cambridge University Press (2003) 3. Baader, F., Lutz, C., Sturm, H., Wolter, F.: Fusions of description logics and abstract description systems. J. Artif. Intell. Res. (JAIR) 16, 1–58 (2002) 4. Baget, J., Bienvenu, M., Mugnier, M., Rocher, S.: Combining existential rules and transitivity: Next steps. In: Proc. of IJCAI-15. pp. 2720–2726 (2015) 5. Benedikt, M., Bourhis, P., Vanden Boom, M.: A step up in expressiveness of de- cidable fixpoint logics. In: Proc. of LICS-16. pp. 817–826 (2016) 6. Bienvenu, M., Ortiz, M., Simkus, M.: Regular path queries in lightweight descrip- tion logics: Complexity and algorithms. J. Artif. Intell. Res. (JAIR) 53, 315–374 (2015) 7. Calvanese, D., Eiter, T., Ortiz, M.: Regular path queries in expressive description logics with nominals. In: Proc. of IJCAI-09. pp. 714–720 (2009) 8. Calvanese, D., Eiter, T., Ortiz, M.: Answering regular path queries in expressive description logics via alternating tree-automata. Inf. Comput. 237, 12–55 (2014), http://dx.doi.org/10.1016/j.ic.2014.04.002 9. Eiter, T., Lutz, C., Ortiz, M., Simkus, M.: Query answering in description logics with transitive roles. In: Proc. of IJCAI-09. pp. 759–764 (2009) 10. Glimm, B., Horrocks, I., Sattler, U.: Unions of conjunctive queries in SHOQ. In: Proc. of KR-08. pp. 252–262 (2008) 11. Glimm, B., Lutz, C., Horrocks, I., Sattler, U.: Conjunctive query answering for the description logic SHIQ. J. Artif. Intell. Res. (JAIR) 31, 157–204 (2008) 12. Gottlob, G., Pieris, A., Tendera, L.: Querying the guarded fragment with transi- tivity. In: Proc. of ICALP-13. pp. 287–298 (2013) 13. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games: A Guide to Current Research, LNCS, vol. 2500. Springer 14. Grädel, E., Walukiewicz, I.: Guarded fixed point logic. In: Proc. of LICS-99. pp. 45–54 (1999) 15. Gutiérrez-Basulto, V., Ibáñez-Garcı́a, Y., Jung, J.C.: Number restrictions on tran- sitive roles in description logics with nominals. In: Proc. of AAAI-17 (2017) 16. Hollunder, B., Baader, F.: Qualifying number restrictions in concept languages. In: Proc. of KR-91. pp. 335–346 (1991) 17. Horrocks, I., Sattler, U., Tobies, S.: Practical reasoning for very expressive descrip- tion logics. Logic Journal of the IGPL 8(3), 239–263 (2000) 18. Kaminski, M., Smolka, G.: Terminating tableaux for SOQ with number restrictions on transitive roles. In: Proc. of the 6th IFIP TC. pp. 213–228 (2010) 19. Kazakov, Y., Pratt-Hartmann, I.: A note on the complexity of the satisfiability problem for graded modal logics. In: Proc. of LICS-09. pp. 407–416 (2009) 20. Kazakov, Y., Sattler, U., Zolin, E.: How many legs do I have? non-simple roles in number restrictions revisited. In: Proc. of LPAR-07. pp. 303–317 (2007) 21. Pratt-Hartmann, I.: The two-variable fragment with counting and equivalence. Math. Log. Q. 61(6), 474–515 (2015) 22. Rector, A.L., Rogers, J.: Ontological and practical issues in using a description logic to represent medical concept systems: Experience from GALEN. In: In Proc. of RW-06. pp. 197–231 (2006) 23. Schröder, L., Pattinson, D.: How many toes do I have? parthood and number restrictions in description logics. In: Proc. of KR-08. pp. 307–317 (2008) 24. Stevens, R., Aranguren, M.E., Wolstencroft, K., Sattler, U., Drummond, N., Hor- ridge, M., Rector, A.L.: Using OWL to model biological knowledge. International Journal of Man-Machine Studies 65(7), 583–594 (2007) 25. Tendera, L.: Counting in the two variable guarded logic with transitivity. In: Proc. of STACS-05. pp. 83–96 (2005) 26. Vardi, M.Y.: Reasoning about the past with two-way automata. In: Proc. ICALP- 98. pp. 628–641 (1998) 27. Wolstencroft, K., Brass, A., Horrocks, I., Lord, P., Sattler, U., Turi, D., Stevens, R.: A little semantic web goes a long way in biology. In: Proc. of ISWC-05 (2005)