Polynomial Disjunctive Datalog Rewritings of Instance Queries in Expressive Description Logics Shqiponja Ahmetaj, Magdalena Ortiz, and Mantas Šimkus Institute of Information Systems, TU Vienna, Austria Abstract. Rewriting ontology mediated queries (OMQs) into traditional query languages like FO-queries and Datalog is central for understanding their relative expressiveness, and plays a crucial role in the ongoing efforts to develop OMQ answering tools by reusing existing database technologies. However, the vast majority of works focus on Horn ontologies, and for OMQs where the ontologies are written in extensions of ALC, only a couple of exponential rewritings into disjunctive Datalog are known. In this paper we propose a translation of instance queries mediated by ontologies in the expressive DL ALCHI, into polynomial-sized disjunctive Datalog programs. The translation is based on a simple game-like algorithm, and can be extended to accommodate nominals. We can also rewrite OMQs with closed-predicates into Datalog programs with (limited) negation. Closed predicates are useful for combining complete and incomplete information, but make OMQs non-monotonic and thus not rewritable into positive disjunctive Datalog. 1 Introduction In ontology-mediated queries (OMQs), a database query is enriched with an ontology, providing knowledge to obtain more complete answers from incomplete data. OMQs are receiving much attention in the database and knowledge repre- sentation research communities, particularly when the ontological knowledge is expressed in Description Logics (DLs) or in rule-based formalisms like existential rules and Datalog±, see e.g., [4, 3, 13] and their references. One of the most central questions in OMQ research is rewritability: given an OMQ Q, specified by a query and an ontology, obtain a query Q0 —in some standard database language, like first-order (FO) queries or (fragments of) Datalog—such that, for any dataset A, the certain answers to Q and Q0 coincide. Such a Q0 is called a rewriting of Q. Its existence and size are crucial for understanding the relative expressiveness of different families of OMQs in terms of more traditional query languages. Rewritings are also very relevant in practice, since they allow to reuse existing database technologies to support OMQ answering. Research into OMQs that can be rewritten into first-order (FO) queries has produced the successful DL-Lite family [5] of DLs. The succinctness of rewritings for OMQs consisting of the so-called (unions of ) conjunctive queries (UCQs) paired with ontologies in DL-Lite, or in related families of existential rules, has been extensively studied. It has been shown that polynomially sized non-recursive Datalog queries can be constructed for a few well known ontology languages, formulated as DLs or existential rules [14]; but for many other cases, the lack of succinctness of such rewritings has also been established [11], and some authors have considered rewritings that, unlike the ones we consider here, are not data- independent [18, 12]. For DLs beyond DL-Lite, which are often not FO-rewritable, rewritings of UCQs into Datalog queries have been proposed, and lie at the core of implemented systems, e.g., [24, 9, 25]. The pioneering work in [15] showed that instance queries in an expressive extension of ALC can be rewritten into a program in disjunctive Datalog, using a constant number of variables per rule, but exponentially many rules. The first translation from CQs in expressive DLs (SH, SHQ) to disjunctive Datalog programs was introduced in [8], but the program may contain double exponentially many predicates. For ALC and UCQs, the existence of exponential rewritings in disjunctive Datalog was shown recently [4]. For restricted fragments of SHI and classes of CQs translations to Datalog were investigated in [16, 17]. A polynomial time Datalog translation of instance queries has been proposed in [23], but for a so-called Horn-DL that lacks disjunction. To our knowledge, this was until now the only polynomial rewriting for a DL that is not FO-rewritable. The main contribution of this paper is a polynomial time translation of instance queries mediated by ALCHI TBoxes into disjunctive Datalog. The translation is based on a simple game-theoretic characterization, and implements a so-called type-elimination procedure using a polynomially sized Datalog program. To our knowledge, this is the first polynomial time translation of an expressive (non-Horn) DL into disjunctive Datalog. The translation we present here can be extended to richer OMQs, and in particular, to nominals and closed predicates. In fact, we report here a simplified version of the translation in [1], which is a polynomial translation of instance queries mediated by ALCHOI TBoxes with closed predicates into disjunctive Datalog queries with negation. 2 Preliminaries We give some basic notions of DLs and Datalog. The DL ALCHI We assume countably infinite, mutually disjoint sets NR of role names, NC of concept names, and NI of individual names. A role r is either a role name p, or an expression p− , called the inverse of p. We let r− = p if r = p− . Concepts are defined inductively: (i) >, ⊥ and all A ∈ NC are concepts; (ii) if C1 , C2 are concepts, then C1 u C2 , C1 t C2 and ¬C1 are concepts; (iii) if r is a role, and C is a concept, then ∃r.C, ∀r.C are concepts. A concept inclusion is an expression of form C1 v C2 , where C1 , C2 are concepts. A role inclusion is an expression of form r1 v r2 , where r1 , r2 are roles. A TBox T is a finite set of (concept and role) inclusions. An ABox A is a finite set of assertions of the forms A(a) and p(a, b), where {a, b} ⊆ NI , A ∈ NC , and p ∈ NR . A knowledge base (KB) is a tuple K = (T , A), where T is a TBox and A is an ABox. An interpretation is a pair I = h∆I , ·I i, where ∆I is a non-empty set (called the domain), AI ⊆ ∆I for each A ∈ NC , rI ⊆ ∆I × ∆I for each r ∈ NR , and aI ∈ ∆I for each a ∈ NI . The function ·I is extended to the remaining concepts and roles in the standard way [2]. An interpretation I satisfies an inclusion q1 v q2 , if q1I ⊆ q2I , in symbols I |= q1 v q2 ; and it satisfies an assertion q(a) if (a)I ∈ q I , in symbols, I |= q(a). For Γ a TBox or ABox, we write I |= Γ if I |= α for all α ∈ Γ . For a KB K = (T , A), we write I |= K if the following hold: (i) a ∈ ∆I and aI = a for each a ∈ NI occurring in K,1 (ii) I |= T , and (iii) I |= A. For an assertion α, we write K |= α if I |= α for all I with I |= K. Instance Queries An (ontology mediated) instance query is a tuple Q = (T , q), where T is a TBox, and q ∈ NC ∪ NR . Let a ∈ NI in case q ∈ NC , and a ∈ N2I otherwise. Then a is a certain answer to Q over an ABox A if (T , A) |= q(a). To ease presentation, we assume that every concept and role of A also occurs in T . Normal Form Our results apply to arbitrary TBoxes, but to simplify pre- sentation, we consider TBoxes in normal form where inclusions take one of the following forms: (N1) A1 u · · · u An v An+1 t · · · t Ak (N3) A v ∀r.A0 (N2) A v ∃r.A0 (N4) r v s with r, s roles, {A1 , . . . , Ak } ⊆ NC , and {A, A0 } ⊆ NC . We also assume that T is closed under role inclusions as follows: (a) p v p ∈ T , for each p ∈ NR occurring in T , (b) if r v s ∈ T , then r− v s− ∈ T , and (c) if r1 v r2 ∈ T and r2 v r3 ∈ T , then r1 v r3 ∈ T . For Γ a TBox, ABox, or KB, we denote by NI (Γ ), NR (Γ ), NC (Γ ) the set of individuals, role and concept names that occur in Γ , resp. Disjunctive Datalog We assume countably infinite sets NP and NV of predicate symbols (each with an associated arity) and variables, respectively. We further assume that NC ∪ NR ⊆ NP with each A ∈ NC being unary, and each r ∈ NR being binary. An atom is an expression of the form R(t1 , . . . , tn ), where {t1 , . . . , tn } ⊆ NI ∪ NV , and R is an n-ary relation symbol. A rule ρ is an expression of the form h1 ∨ . . . ∨ hn ← b1 , . . . , bk , where H = {h1 , . . . , hn } and B = {b1 , . . . , bk } are sets of atoms, called the head and the body of ρ, respectively. Each variable that occurs in H must also occur in B. Rules of the form h ← (known as facts) are simply identified with the atom h, thus ABox assertions are valid facts in our syntax. For a role name p, we may use p− (t1 , t2 ) to denote the atom p(t2 , t1 ). A program is any finite set P of rules. We use ground(P ) to denote the grounding of P , i.e. the variable-free program that is obtained from P by applying on its rules all the possible substitutions of variables by individuals of P . An (Herbrand) interpretation (or, database) I is any finite set of variable-free (or, ground ) atoms. An interpretation I is a model of a program P if {b1 , . . . , bk } ⊆ I implies I ∩ {h1 , . . . , hn } = 6 ∅ for all rules h1 ∨ . . . ∨ hn ← b1 , . . . , bk in ground(P ). A ( Datalog) query is a pair (P, q), where P is a program, and q is a predicate 1 This is called the standard name assumption (SNA), which we apply to the individuals occurring in K. symbol from P . A tuple a of constants is a certain answer to (P, q) over a database I if q(a) ∈ J for all models J of P ∪ I. 3 Characterization of Counter Models Assume a KB K = (T , A) and an assertion β. Towards deciding K 6|= β using a polynomially sized program, we decompose the problem into two steps: (1) Guess a core interpretation Ic for K, whose domain is NI (A). Core interpreta- tions fix how the individuals of K participate in concepts and roles, ensuring the satisfaction of A, and the non-entailment of β. (2) Check that Ic can be extended to satisfy all axioms in T . Defining Datalog rules that do (1) is not hard, but (2) is more challenging, and will rely on a game-theoretic characterization we describe below. But first we need to define core interpretations. Definition 1. A core interpretation for a KB K = (T , A) is any interpretation Ic such that (c1) ∆Ic = NI (A) and aIc = a for all a ∈ NI (A), (c2) Ic |= A, (c3) Ic |= A v ∀r.A0 for all A v ∀r.A0 ∈ T , (c4) Ic |= r v s for all r v s ∈ T , (c5) q Ic = ∅ for each q 6∈ NC (T ) ∪ NR (T ). An interpretation J is called an extension of Ic , if Ic is the result of restricting J to NI (A). A core and its extensions coincide on the assertions they entail, and deciding non-entailment of an instance query q amounts to deciding whether there is a core that does not entail q, and that can be extended into a model. But verifying whether a core can be extended into a full model is hard: it corresponds to testing consistency (of Ic viewed as an ABox) with respect to T , an ExpTime-hard problem already for fragments of ALCHI. In order to obtain a polynomial set of rules that solves this ExpTime-hard problem, we characterize it as a game, revealing a simple algorithm that admits an elegant implementation in Datalog. Definition 2. A type (over a TBox T ) τ is a subset of NC (T ) ∪ {>} such that ⊥ 6∈ τ and > ∈ τ . We say that τ satisfies an inclusion α = A1 u · · · u An v An+1 t · · · t Ak of type (N1), if {A1 , . . . , An } ⊆ τ implies {An+1 , . . . , Ak } ∩ τ = 6 ∅; otherwise τ violates α. For an element e ∈ ∆I in an interpretation I, we let type(e, I) = {B ∈ NC | e ∈ B I }. A type τ is realized in I if there is some e ∈ ∆I s.t. type(e, I) = τ . We now describe a game to decide whether a given core Ic can be extended into a model of a KB K. The game is played by Bob (the builder), who wants to extend Ic into a model, and Sam (the spoiler), who wants to spoil all Bob’s attempts. Sam starts by picking an individual a, and they look at its type type(a, Ic ). If it doesn’t satisfy the inclusions of type (N1) Sam wins. Otherwise, in each turn Sam chooses an inclusion of the form A v ∃rA0 which would need to be satisfied by (an element with) the current type, forcing Bob to pick a type for the corresponding r-successor that satisfies A0 . The game continues for as long as Bob can respond to the challenges of Sam. The game on I starts by Sam choosing an individual a ∈ ∆I , and τ = type(a, I) is set to be the current type. Then: () If τ does not satisfy all inclusions of type (N1) in T , then Sam is declared the winner. Otherwise, Sam chooses an inclusion A v ∃r.A0 ∈ T with A ∈ τ ; if there is no such inclusion, Bob is declared the winner. Otherwise, Bob chooses a new type τ 0 such that: (C1) A0 ∈ τ 0 , and (C2) for all inclusions A1 v ∀s.A2 ∈ T : • if r v s ∈ T and A1 ∈ τ then A2 ∈ τ 0 , • if r− v s ∈ T and A1 ∈ τ 0 then A2 ∈ τ . 0 τ is set to be the current type, and the game continues with a new round, i.e. we go back to . A run of the game on I is a (possibly infinite) sequence aα1 τ1 α2 τ2 . . . where a is the individual picked initially by Sam, and each αi and τi are the inclusion picked by Sam and the type picked by Bob in round i, respectively. A strategy for Bob is a partial function str that maps pairs of a type τ and an inclu- sion A v ∃r.A0 with A ∈ τ to a type τ 0 that satisfies (C1) and (C2); intuitively, it gives a move for Bob in response to moves of Sam. A run aα1 τ1 α2 τ2 . . . with type(a, I) = τ0 follows a strategy str if τi = str(τi−1 , αi ) for every i ≥ 1. For a finite run w, we let tail(w) = type(a, I) if w = a, and tail(w) = τ` if w = a . . . α` τ` with ` ≥ 1. The strategy str is called non-losing on I if for every finite run w that follows str, tail(w) satisfies all inclusions of type (N1) in T , and str(tail(w), A v ∃r.A0 ) is defined for every A v ∃r.A0 ∈ T with A ∈ tail(w). Theorem 1. Assume a KB K and an assertion β. Then K 6|= β iff there is a core interpretation Ic for K such that: (1) Ic 6|= β, and (2) there is a non-losing strategy for Bob on Ic . Proof. (Sketch.) We focus on showing that there is a non-losing strategy str for Bob on Ic iff there exists an extension J of Ic s.t. J |= K. The claim follows from this, and the easy claim that extensions preserve non-entailment of β. For the “⇒” direction, from an arbitrary non-losing str for Ic , we build J as follows. We denote by frn(Ic , str) the set of all finite runs aα1 τ1 . . . αi τi , i ≥ 0 that follow str. The domain of J is: ∆J = frn(Ic , str) and for each a ∈ NI , each A ∈ NC , and each p ∈ NR we let: aJ = aIc AJ = {w | A ∈ tail(w)} pJ = pIc ∪ {(w, wαi τi ) | ri v p ∈ T } ∪ {(wαi τi , w) | ri− v p ∈ T } where αi = A v ∃ri .A0 ∈ T and {w, wαi τi } ⊆ ∆J . It is not hard to see that J is an extension of Ic . We show that J is a model of K. First, J |= A by definition of Ic and the fact that J is an extension of Ic . It is left to prove that J satisfies all inclusions in T . Note that for each w ∈ ∆J , type(w, J ) = tail(w). That J satisfies all inclusions of type (N1) holds by definition of non-losing str and the fact that tail(w) satisfies all such inclusions. For the inclusions α = A v ∃r.A0 of type (N2), we see that for each w ∈ ∆J , if w ∈ AJ , then A ∈ tail(w) and w is a run that follows str. Hence, since str is a non-losing strategy str(tail(w), α) is defined, that is there exists a τ 0 over T such that str(tail(w), α) = τ 0 and A0 ∈ τ 0 . Thus wατ 0 is a run that follows str, i.e. wατ 0 ∈ frn(Ic , str), so wατ 0 is in A0J and (w, wατ 0 ) ∈ rJ if r is a role name, and (wατ 0 , w) ∈ rJ otherwise. For the inclusions A1 v ∀r.A2 of type (N3), we distinguish the case of (a, a0 ) ∈ rIc for a pair a, a0 of individuals in ∆Ic , for which a ∈ AJ 0 J 1 implies a ∈ A2 holds by definition of Ic , and the case of an arbitrary pair of objects (w, w ) ∈ rIc where w0 is a child of w (that is, w0 is of the form 0 wαi τi ), or vice-versa, for which w0 ∈ AJ J 2 whenever w ∈ A1 is ensured by the condition (C2) in the definition of the game, the fact that str is a non-losing strategy, and the construction of J . Finally, the inclusions of type (N4) are also guaranteed by the definition of the core Ic for pairs of individual names, and by the condition (C2) in the definition of the game for arbitrary pairs of an object and its child. For “⇐”, assume an arbitrary model J of K that is an extension of Ic . We can extract from it a non-losing strategy for Bob as follows. First, let T be a the set of all the types realized in J , and observe that each type in T satisfies all inclusions of type (N1) and that for all a ∈ NI (A), type(a, J ) ∈ T . Then it suffices to set, for each τ ∈ T and each A v ∃r.A0 ∈ T with A ∈ τ , str(τ, A v ∃r.A0 ) = τ 0 for an arbitrarily chosen τ 0 ∈ T that satisfies (C1) and (C2), which exists because J is a model, hence it satisfies all existential and universal inclusions in T , and all types realized in J are contained in T . This str is a non-losing strategy for Bob. t u To decide whether Bob has a non-losing strategy on a given core we use the type elimination procedure Mark in Algorithm 1, which marks (or, eliminates) all types from which Sam has a winning strategy. It takes as input the TBox T , and an interpretation I which intuitively is the core being checked. The algorithm starts by building the set N of all possible types over T , and then it eliminates bad choices (from which Bob would loose) by marking them. In step (MN1 ), the algorithm marks in N all types that violate some inclusion of type (N1) in T ; Sam wins already in the first round on these types. Then, in the loop, (M∃ ) exhaustively marks types τ that allow Sam to pick an inclusion A v ∃r.A0 for Algorithm 1: Mark input : TBox T , interpretation I output : Set of (possibly marked) types N ← {τ | τ is a type over T } (MN1 ) Mark each τ ∈ N that violates some inclusion of the form (N1) in T repeat (M∃ ) Mark each τ ∈ N such that A v ∃r.A0 ∈ T , A ∈ τ , and for each τ 0 ∈ N , at least one the following holds: (C0) τ 0 is marked, 0 (C1 ) A0 ∈ / τ 0 , or (C20 ) there exists A1 v ∀s.A2 ∈ T with / τ 0 , or – r v s ∈ T and A1 ∈ τ and A2 ∈ − 0 – r v s ∈ T and A1 ∈ τ and A2 ∈ /τ until no new type is marked return N which Bob cannot reply with any τ 0 . At the end the marked types, are those from which Bob can lose, so we get: Theorem 2. Let Ic be a core interpretation. Then Bob has a non-losing strategy on Ic iff none of the types realized in Ic is marked by Mark(T , Ic ). Proof. (Sketch.) For the “⇒” direction, we can show (by induction in the number of iterations of Mark(T , Ic )) that if a type is marked, then it cannot occur in a non-losing str for Bob. For the “⇐” direction, a non-losing str for Ic is obtained by taking all unmarked τ ∈ N , and for each of them, and each A v ∃r.A0 ∈ T with A ∈ τ , setting str(τ, A v ∃r.A0 ) = τ 0 for an arbitrary unmarked τ 0 that satisfies (C1) and (C2). t u 4 Translation into Datalog Assume an instance query (T , q). We build next a polynomially sized program P such that the queries (T , q) and (P, q) have the same certain answers for all ABoxes over the signature of T . Roughly, the program P has 3 major components: (a) rules to non-deterministically generate a core interpretation Ic for the KB (T , A), where A is an input ABox; (b) rules that implement the type elimination algorithm presented in the previous section; (c) rules that glue (a) and (b) together, ensuring that all types that occur in Ic are not marked by the marking procedure. We remark that the construction of P is independent from any particular ABox. (I) Collecting the individuals We first add rules to collect in the unary predicate ind all the individuals that occur in the input ABox. For each A ∈ NC (T ), r ∈ NR (T ), we have: ind(x) ← A(x) ind(x) ← r(x, y) ind(y) ← r(x, y) (II) Generating core interpretations For each A ∈ NC (T ) (resp., r ∈ NR (T )) we will use a fresh concept name A (resp., role name r). We add the following rules to P : A(x) ∨ A(x) ← ind(x) A ∈ NC (T ) ← A(x), A(x) A ∈ NC (T ) r(x, y) ∨ r(x, y) ← ind(x), ind(y) r ∈ NR (T ) ← r(x, y), r(x, y) r ∈ NR (T ) To ensure (c3) in Definition 1, for each A v ∀r.A0 ∈ T we add the rule A0 (y) ← A(x), r(x, y). Then, to ensure (c4), for each role inclusion r v s ∈ T we add the rule s(x, y) ← r(x, y). Intuitively, the stable models of the above rules generate the different core interpretations Ic of the KB K = (T , A) for any given A. We next implement the algorithm Mark from Section 3. To obtain a poly- nomially sized program, we need to use non-ground rules whose number of variables depends on the number of different concept names in T . Assume an arbitrary enumeration A1 , . . . , Ak of NC (T ). Assume also a pair 0, 1 of special individuals. Intuitively, we will use a k-ary relation Type = {0, 1}k to store the set of all types over T . Naturally, a k-tuple (a1 , . . . , ak ) ∈ Type encodes the type τ = {Ai | ai = 1, 1 ≤ i ≤ k} ∪ {>}. We are most interested in computing a k-ary relation Marked ⊆ {0, 1}k that contains precisely the types marked by the Mark algorithm. We next define the rules to compute Type and Marked, and other relevant relations. (III) A linear order over types The first ingredient is a linear order over all possible types. To this end, for every 1 ≤ i ≤ k, we inductively define i-ary relations firsti and lasti , and a 2i-ary relation nexti , which will provide the first, the last and the successor elements from a linear order on {0, 1}i . In particular, given u, v ∈ {0, 1}i , the fact nexti (u, v) will be true if v follows u in the ordering of {0, 1}i . The rules to populate nexti are quite standard (see, e.g., Theorem 4.5 in [6]). For the case i = 1, we simply add the following facts: first1 (0) ← last1 (1) ← next1 (0, 1) ← Then, for all 1 ≤ i ≤ k − 1 we add the following rules: nexti+1 (0, x, 0, y) ← nexti (x, y) nexti+1 (1, x, 1, y) ← nexti (x, y) nexti+1 (0, x, 1, y) ← lasti (x), firsti (y) firsti+1 (0, x) ← firsti (x) lasti+1 (1, x) ← lasti (x) We can now collect in the k-ary relation Type all types over T (thus computing the set N of the Mark algorithm): Type(x) ← firstk (x) Type(y) ← nextk (x, y) (IV) Implementing Step (MN1 ) First, we add the auxiliary facts F(0) ← and T(1) ← to P . For a k-tuple of variables x, we let B ∈ x denote the atom T (xj ), where j is the index of B in the enumeration of NC (T ). Similarly, we let B 6∈ x denote the atom F (xj ), where j is the index of B in the enumeration. Then the step (MN1 ), which marks types violating inclusions of type (N1), is implemented using the following rule for every inclusion A1 u · · · u An v A01 t · · · t A0m of T : Marked(x) ← Type(x), A1 ∈ x, . . . , An ∈ x, A01 6∈ x, . . . , A0m 6∈ x (V) Implementing Step (M∃ ) The following rules are added for each inclu- sion α = A v ∃r.A0 ∈ T . Recall that we need to mark a type τ if A ∈ τ , and for each type τ 0 at least one of (C0), (C10 ) or (C20 ) holds. We first use an auxiliary (2k + 1) relation MarkedOne to collect all such types τ 0 . Assume a fresh individual aα . For collecting each τ 0 that satisfies (C0), we add: MarkedOne(x, aα , y) ← Type(x), Marked(y) For the condition (C10 ), we add the rule: MarkedOne(x, aα , y) ← Type(x), Type(y), A0 6∈ y The rules for (C20 ) are as follows. For all inclusions A1 v ∀r.A2 ∈ T with r v s ∈ T , we add the rule MarkedOne(x, aα , y) ← Type(x), Type(y), A1 ∈ x, A2 6∈ y For all A1 v ∀r.A2 ∈ T with r− v s ∈ T , we also add MarkedOne(x, aα , y) ← Type(x), Type(y), A1 ∈ y, A2 6∈ x We infer Marked(t) in case MarkedOne(t, aα , v) is true for all types (bit vectors) v. To this end, we need another (2k + 1)-ary relation MarkedUntil. We add: MarkedUntil(x, aα , z) ←MarkedOne(x, aα , z), firstk (z) MarkedUntil(x, aα , u) ←MarkedUntil(x, aα , z), nextk (z, u), MarkedOne(x, aα , u) Intuitively, with the above rules we traverse all types checking the conditions (C0), (C10 ), (C20 ) described in (M∃ ). If we manage to reach the last type, we know the condition is satisfied. We add the following rule: Marked(x) ← MarkedUntil(x, aα , z), A ∈ x, lastk (z) (VI) Forbidding marked types in the core We need to forbid each indi- vidual in the generated core interpretation from having a type from Marked. For all 0 ≤ i ≤ k, we take a fresh (i + 1)-ary relation symbol Proji . We first add: Projk (x, y) ← ind(x), Marked(y) We will now project away bits from the Proji relations by looking at the actual types of individuals. For all 1 ≤ i ≤ k we have the following rules: Proji−1 (x, y) ← Proji (x, y, 1), Ai (x) Proji−1 (x, y) ← Proji (x, y, 0), Ai (x) Intuitively, Proji−1 (a, b1 , . . . , bi−1 ) says the partial type given by the bit values b1 , . . . , bi−1 can be extended to a marked type by choosing additional concepts according to the actual type of the individual a. Thus the fact Proj0 (a) represent the situation where a has a marked type. Such situations are ruled out by adding the constraint: ← Proj0 (x) This concludes the translation of the instance query (T , q), where T is an ALCHI TBox, into the disjunctive Datalog program P . This construction, together with Theorems 1 and 2, yields the main result of this paper. Theorem 3. For an instance query (T , q), we can build in polynomial time a Datalog query (P, q) such that the certain answers to (T , q) and (P, q) coincide for any given ABox A over the signature of T . The above encoding employs disjunctive programs. Entailment of ground atoms in such programs is coNExpTime-complete [7], which does not match the ExpTime-completeness of satisfiability of ALCHI KBs. However, we employ disjunction in a limited way, and thus our programs fall into a class of programs that can be evaluated in (deterministic) exponential time. In particular, the above program P can be partitioned into two programs P1 , P2 as follows: P1 consists of all rules in (I) and (II), and P2 consists of the remaining rules. Observe that P1 is a disjunctive program with at most two variables in each rule, while P2 is is disjunction-free. Note also that P2 does not define any relations used in P1 , i.e. none of the relation symbols of P1 occurs in the head of a rule in P2 . Assume a set F of facts over the signature of P1 . Due to the above properties, the successful runs of the following non-deterministic procedure generate the set of all minimal models of P ∪ F : (S1) Compute a minimal model I1 of P1 ∪ F . If I1 does not exist due to a constraint violation, then return failure. (S2) Compute the least model I2 of P2 ∪ I1 . Again, if I2 does not exist due to a constraint violation, then return failure. Otherwise, output I2 . Since P1 has at most two-variables in every rule, each minimal model I1 of P1 ∪ F is of polynomial size in the size of P1 ∪ F , and the set of all such models can be traversed in polynomial space. For a given I1 , performing the step (S2) is feasible in (deterministic) exponential time, because P2 is disjunction-free and of polynomial size in the size of the original instance query. It follows that computing the certain answers to (P, q) for any given ABox A over the signature of T requires single exponential time. 5 Discussion We have presented our results for ALCHI, but they also apply to SHI, its extension with transitivity axioms; it is well known that instance queries mediated by SHI TBoxes can be rewritten in polynomial time to instance queries mediated by ALCHI TBoxes (see, e.g., [15]). Moreover, we have presented the translation for instance queries, but the results can be easily generalized, e.g., to DL-safe rules of [22], or the quantifier-free conjunctive queries like in [21]. These queries are syntactically restricted to ensure that the relevant variable assignments only map into individuals of the input ABox. We note that generalizing our translation to arbitrary conjunctive queries while remaining polynomial is not possible under common assumptions in complexity theory. Indeed, cautious inference from a disjunctive program is in coNExpTime, but entailment of (Boolean) conjunctive queries is 2ExpTime-hard already for the DL ALCI [19]. As mentioned in the introduction, both the game theoretic characterization and the Datalog translation can be extended to ontologies in ALCHOI, the extension of ALCHI with nominals. This is possible even in the presence of the so-called closed predicates, which are useful for combining complete and incomplete knowledge, by explicitly specifying predicates assumed complete, thus given a closed-world semantics [10, 20]. For example, take the following TBox T : BScStud v Student Student v ∃attends.Course BScStud v ∀attends.¬GradCourse and the ABox A: Course(c1 ), Course(c2 ), GradCourse(c2 ), BScStud(a) Then (a, c1 ) is not a certain answer to (T , q) with q = attends(x, y). But if c1 and c2 are known to be the only courses, then (a, c1 ) should be a certain answer. This can be achieved by declaring Course a closed predicate: (a, c1 ) is indeed a certain answer to the OMQ with closed predicates (T , Σ, q), where Σ = {Course}. Interestingly, OMQs with closed predicates are non-monotonic. Indeed, (a, c1 ) is a certain answer to (T , Σ, q) over A, but it is not a certain answer over the extended set of facts A0 = A ∪ {Course(c3 )}. For this reason, these queries cannot be rewritten into monotonic variants of Datalog, like positive Datalog with disjunction as considered here. However, one can devise a polynomial time translation into disjunctive Datalog with negation as failure, for OMQs of the form (T , Σ, q) where T is an ALCHOI TBox, Σ is a set of closed predicates, and q is an instance query. The translation uses the same ideas presented here, but handling closed predicates (both in the game-theoretic characterization, and in the Datalog translation) implies reasoning about the types that are already realized in the core; in Datalog, this requires negation. Details can be found in [1]. Acknowledgements This work was supported by the Austrian Science Fund (FWF): projects P25207- N23, T515 and W1255-N23. References 1. S. Ahmetaj, M. Ortiz, and M. Šimkus. Polynomial datalog rewritings for expressive description logics with closed predicates. In Proc. of IJCAI’16, 2016. To appear. 2. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider, edi- tors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, second edition, 2007. 3. M. Bienvenu and M. Ortiz. Ontology-mediated query answering with data-tractable description logics. In Proc. of RW 2015, pages 218–307. Springer, 2015. 4. M. Bienvenu, B. ten Cate, C. Lutz, and F. Wolter. Ontology-based data access: A study through disjunctive Datalog, CSP, and MMSNP. ACM Trans. Database Syst., 39(4):33:1–33:44, 2014. 5. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. Autom. Reasoning, 39(3):385–429, 2007. 6. E. Dantsin, T. Eiter, G. Gottlob, and A. Voronkov. Complexity and expressive power of logic programming. ACM Comput. Surv., 33(3):374–425, 2001. 7. T. Eiter, G. Gottlob, and H. Mannila. Disjunctive datalog. ACM Trans. Database Syst., 22(3):364–418, 1997. 8. T. Eiter, M. Ortiz, and M. Šimkus. Conjunctive query answering in the description logic SH using knots. J. Comput. Syst. Sci., 78(1):47–85, 2012. 9. T. Eiter, M. Ortiz, M. Šimkus, T. Tran, and G. Xiao. Query rewriting for Horn-SHIQ plus rules. In Proc. of AAAI 2012. AAAI Press, 2012. 10. E. Franconi, Y. A. Ibáñez-Garcı́a, and I. Seylan. Query answering with DBoxes is hard. Electr. Notes Theor. Comput. Sci., 278:71–84, 2011. 11. G. Gottlob, S. Kikot, R. Kontchakov, V. V. Podolskii, T. Schwentick, and M. Za- kharyaschev. The price of query rewriting in ontology-based data access. Artif. Intell., 213:42–59, 2014. 12. G. Gottlob, M. Manna, and A. Pieris. Polynomial combined rewritings for existential rules. In Proc. of KR 2014. AAAI Press, 2014. 13. G. Gottlob, M. Manna, and A. Pieris. Polynomial rewritings for linear existential rules. In Proc. of IJCAI 2015. AAAI Press, 2015. 14. G. Gottlob and T. Schwentick. Rewriting ontological queries into small nonrecursive datalog programs. In Proc. of KR 2012. AAAI Press, 2012. 15. U. Hustadt, B. Motik, and U. Sattler. Reasoning in description logics by a reduction to disjunctive datalog. J. Autom. Reasoning, 39(3):351–384, 2007. 16. M. Kaminski, Y. Nenov, and B. C. Grau. Computing datalog rewritings for disjunctive datalog programs and description logic ontologies. In Proc. of RR 2014, pages 76–91, 2014. 17. M. Kaminski, Y. Nenov, and B. C. Grau. Datalog rewritability of disjunctive datalog programs and its applications to ontology reasoning. In Proc. of AAAI 2014, pages 1077–1083, 2014. 18. R. Kontchakov, C. Lutz, D. Toman, F. Wolter, and M. Zakharyaschev. The combined approach to ontology-based data access. In Proc. of IJCAI 2011. IJCAI/AAAI, 2011. 19. C. Lutz. Inverse roles make conjunctive queries hard. In Proc. of DL 2007. CEUR- WS.org, 2007. 20. C. Lutz, I. Seylan, and F. Wolter. Ontology-based data access with closed predicates is inherently intractable(sometimes). In Proc. of IJCAI 2013. IJCAI/AAAI, 2013. 21. C. Lutz, I. Seylan, and F. Wolter. Ontology-mediated queries with closed predicates. In Proc. of IJCAI 2015. IJCAI/AAAI, 2015. 22. B. Motik, U. Sattler, and R. Studer. Query answering for OWL-DL with rules. J. Web Sem., 3(1):41–60, 2005. 23. M. Ortiz, S. Rudolph, and M. Šimkus. Worst-case optimal reasoning for the Horn-DL fragments of OWL 1 and 2. In Proc. of KR 2010. AAAI Press, 2010. 24. H. Pérez-Urbina, B. Motik, and I. Horrocks. Tractable query answering and rewriting under description logic constraints. J. Applied Logic, 8(2):186–209, 2010. 25. D. Trivela, G. Stoilos, A. Chortaras, and G. B. Stamou. Optimising resolution-based rewriting algorithms for OWL ontologies. J. Web Sem., 33:30–49, 2015.