New inconsistency-tolerant semantics for robust ontology-based data access Meghyn Bienvenu1 and Riccardo Rosati2 1 Laboratoire de Recherche en Informatique CNRS & Université Paris-Sud, France 2 Dipartimento di Ingegneria Informatica, Automatica e Gestionale Sapienza Università di Roma, Italy 1 Introduction In ontology-based data access (OBDA) [17], an ontology provides an abstract and for- mal representation of the domain of interest, which is used as a virtual schema when formulating queries over the data. Current research in OBDA mostly focuses on ontol- ogy specification languages for which conjunctive query answering is first-order (FO) rewritable. In a nutshell, FO-rewritability means that query answering can be performed by rewriting the input query into a first-order query which encodes the relevant knowl- edge from the ontology, and then evaluating the resulting query over the data. Among FO-rewritable ontology languages, description logics (DLs) of the DL-Lite family [8, 2] have played an especially prominent role and notably served as the inspiration for the OWL 2 QL profile 3 of the latest version of the OWL web ontology language. In real-world applications involving large amounts of data and/or multiple data sources, chances are that the data will be inconsistent with the ontology. Standard OBDA querying algorithms are next to useless in such circumstances, since first-order logic semantics (upon which DLs and standard ontology languages are based) dictates that everything can be derived from a contradiction. Appropriate mechanisms for han- dling inconsistent data are thus critical to the success of OBDA in practice. Clearly, the best solution is to restore consistency by removing the pieces of data that are responsi- ble for the inconsistencies. However, this strategy cannot always be applied, since the system may not have enough information to localize the errors, or may lack the autho- rization to modify the data (as is often the case in information integration applications). Thus, a robust OBDA system must be capable of providing meaningful answers to user queries in the presence of inconsistent data. Recently, several approaches have pursued the idea of adopting an inconsistency- tolerant semantics for OBDA, taking inspiration from the work on consistent query an- swering in databases [1, 4]. The most well-known and intuitive among such semantics, which we will call the CQA semantics, considers as a repair of a knowledge base (KB) consisting of an ontology T and a dataset A, a maximal subset of A that is consistent with T . Query answering under the CQA semantics then amounts to computing those answers that hold in every repair of the KB. Unfortunately, conjunctive query answer- ing (as well as simpler forms of reasoning) under CQA semantics is computationally 3 http://www.w3.org/TR/owl2-profiles/ hard, even for extremely simple ontology languages for which reasoning under classical semantics is tractable [13, 5]. To overcome this computational problem, approximations of the CQA semantics have been recently proposed. In particular, [13, 14] introduces a sound approximation (called IAR semantics) that evaluates queries over the intersection of all the repairs of the CQA semantics. It was shown that conjunctive query answering under this seman- tics is tractable (in particular, it is first-order rewritable) for logics of the DL-Lite family. However, the IAR semantics has the drawback that it often constitutes a very rough ap- proximation of the CQA semantics, and desirable query answers may be missed. In an effort to obtain more answers than the IAR semantics, a family of parameterized inconsistency-tolerant semantics, called k-lazy consistent semantics, was proposed in [15] and shown to converge in the limit to the CQA semantics. However, since the con- vergence is not monotone in k, these semantics are not sound approximations of the CQA semantics. Moreover, these semantics do not retain the nice computational prop- erties of the IAR semantics: the polynomial data complexity result shown for linear Datalog+/- ontologies only holds for atomic queries, and it follows from results in [5] that conjunctive query answering under k-lazy consistent semantics is coNP-hard in data complexity, for every k ≥ 1. In this paper, we address the above issues and provide the following contributions: (i) we propose two new families of inconsistency-tolerant semantics, called k-defeater and k-support semantics, that approximate the CQA semantics from above (complete approximations) and from below (sound approximations), respectively, and converge to the CQA semantics in the limit; (ii) we study the data complexity of conjunctive query answering under the new semantics, and show a general tractability result for a broad class of ontology languages that includes all known first-order rewritable languages, in particular almost all DLs of the DL-Lite family and several rule-based languages of the Datalog+/- family [6]; (iii) we analyze the combined complexity of conjunctive query answering under the above semantics for ontology languages of the DL-Lite family. The k-support and k-defeater semantics proposed in this paper provide the basis for a semantically grounded and computationally tractable approximation of the CQA semantics in OBDA systems. In particular, we envision a flexible, iterated execution of query q under both k-support and k-defeater semantics with increasing values of k, which stops as soon as the answers to q under both semantics coincide, or when the user is not interested in (or does not want to pay further computational cost for) an exact classification of the tuples that are answers to q under the CQA semantics. 2 Preliminaries Ontologies and KBs An ontology T is a finite set of first-order logic sentences, and an ontology (specification) language L is a (typically infinite) set of first-order logic sentences. If T ⊆ L, then T is called an L ontology. A knowledge base (KB) is a pair consisting of an ontology T and a finite set A of ground facts. A KB hT , Ai is said to be consistent if the first-order theory T ∪ A has a model. Otherwise, it is inconsistent, which we denote by hT , Ai |= ⊥. We are interested in the problem of answering instance queries and conjunctive queries over KBs. Without loss of generality, and for ease of exposition, we only con- sider Boolean queries (i.e. queries without free variables). A first-order (FO) query, or simply query, is a first-order sentence. An instance query (IQ) is a FO query con- sisting of a single ground fact. A conjunctive query (CQ) is a FO query of the form ∃x(α1 ∧ . . . ∧ αn ) where every αi is an atom whose arguments are either constants or variables from x. A query q is entailed by a KB K under classical semantics (denoted by K |= q) if q is satisfied in every model of K. The instance checking problem consists in deciding, for a KB K and IQ q, whether K |= q. The conjunctive query entailment problem is defined analogously, but with q a CQ. We introduce some terminology for referring to sets of facts which are responsible for inconsistency or query entailment. A set S of ground facts is called T -consistent if hT , Si 6|= ⊥. A minimal T -inconsistent subset of A is any S ⊆ A such that hT , Si |= ⊥ and every S 0 ( S is T -consistent. A set of facts S ⊆ A is said to be a T -support for query q in A if S is T -consistent and hT , Si |= q, and it is called a minimal T -support for q in A if no proper subset of S is a T -support for q in A. We sometimes omit “for q” or “in A”, when these are understood. Given a set of ground facts A, we define IA as the interpretation isomorphic to A, i.e., the interpretation defined over the domain of constants occurring in A and such that the interpretation of every relation R in IA is equal to the set {a | R(a) ∈ A}. DL-Lite ontology languages We focus on DLs of the DL-Lite family [8, 2] and recall the syntax and semantics of two specific dialects, called DL-Lite4 and DL-LiteHorn . A DL-Lite ontology consists of a finite set of inclusions B v C, where B and C are defined according to the following syntax: B → A | ∃R C → B | ¬B R → P | P− with A a concept name (unary relation) and P a role name (binary relation). In a DL-LiteHorn ontology, inclusions take the form B1 u . . . u Bn v C, with B1 , . . . , Bn and C as above. The classical semantics of DL-Lite and DL-LiteHorn ontologies is obtained by trans- lating inclusions into first-order sentences using the following function Φ: Φ(A(x)) = A(x) Φ(¬B(x)) = ¬Φ(B(x)) Φ(∃P (x)) = ∃y(P (x, y)) Φ(B1 u B2 (x)) = Φ(B1 (x)) ∧ Φ(B2 (x)) Φ(∃P − (x)) = ∃y(P (y, x)) Φ(C v D) = ∀x(Φ(C(x)) → Φ(D(x)) The classical semantics of a DL-LiteHorn KB hT , Ai (and in particular, the notions of model, consistency, and entailment) corresponds to the semantics of the first-order KB hΦ(T ), Ai. Note that when considering DL KBs, we assume as is standard that the dataset A uses only unary and binary relations. First-order rewritability We say that an ontology T is first-order (FO) rewritable (for CQ answering) under semantics S if, for every CQ q, there exists an effectively computable FO query q 0 such that, for every set of ground facts A, hT , Ai entails q 4 This DL is referred to as DL-Litecore in [8, 2]. under semantics S iff q 0 is satisfied in IA (in the classical sense). Such a query q 0 is called a FO-rewriting of q relative to T under semantics S. Moreover, we say that an ontology language L is FO-rewritable (for CQ answering) under semantics S if, for every ontology T ⊆ L, T is FO-rewritable for CQ answering under S. Complexity There are two common ways of measuring the complexity of query entail- ment. The first, called combined complexity, is with respect to the size of the whole input (T , A, q), whereas the second, called data complexity, is only with respect to the size of A. Our complexity results utilize standard complexity classes, such as NLS PACE, P, NP, and coNP. We also require the following classes which may be less well-known: AC0 (problems which can be solved by a family of circuits of constant depth and poly- nomial size, with unlimited fan-in AND gates and OR gates), Π2p (problems whose complement is solvable in non-deterministic polynomial time with access to an NP or- acle), and ∆p2 [O(log n)] (problems which are solvable in polynomial time with at most logarithmically many calls to an NP oracle). 3 Inconsistency-tolerant Semantics In this section, we formally introduce the consistent query answering (CQA) semantics and other relevant inconsistency-tolerant semantics. All of the semantics considered in this paper rely on the notion of a repair, defined as follows: Definition 1. A repair of a KB K = hT , Ai is an inclusion-maximal subset of A that is T -consistent. We use Rep(K) to denote the set of repairs of K. The repairs of a KB correspond to the different ways of achieving consistency while retaining as much of the original data as possible. Hence, if we consider that the data is mostly reliable, then it is reasonable to assume that one of the repairs accurately reflects the correct portion of the data. The consistent query answering semantics (also known as the AR semantics [13]) is based upon the idea that, in the absence of further information, a query can be con- sidered to hold if it can be inferred from each of the repairs. Formally: Definition 2. A query q is entailed by a KB K = hT , Ai under the consistent query answering (CQA) semantics, written hT , Ai |=CQA q, if hT , Bi |= q for every repair B ∈ Rep(K). Example 1. Consider the DL-Lite ontology Tuniv : Prof v Faculty Lect v Faculty Fellow v Faculty Prof v ¬Lect Prof v ¬Fellow Lect v ¬Fellow Prof v ∃teaches Lect v ∃teaches ∃teaches− v ¬Faculty which states that professors, lecturers, and research fellows are disjoint classes of fac- ulty, that professors and lecturers must teach something, and that whatever is taught is not faculty. Now let Asam be as follows: {Prof(sam), Lect(sam), Fellow(sam)}. It is easy to see that KB hTuniv , Asam i is inconsistent and has 3 repairs: R1 = {Prof(sam)}, R2 = {Lect(sam)} and R3 = {Fellow(sam)}. Observe that from each of the re- pairs, we can infer q1 = Faculty(sam), so hTuniv , Asam i |=CQA q1 . However, q2 = ∃x.Faculty(sam) ∧ teaches(sam, x) is not entailed from hTuniv , R3 i, so we have that hTuniv , Asam i 6|=CQA q2 . Unfortunately, while the CQA semantics is intuitively appealing, it is well-known that answering queries under this semantics is usually intractable w.r.t. data complexity [13, 5]. This stems from the fact that the number of repairs of hT , Ai may be exponential in the size of A, even when T is formulated in extremely simple ontology languages. To overcome the computational problems of the CQA semantics, a sound approxi- mation of it, called the IAR semantics, was proposed in [13]. Definition 3. A query q is entailed by a KB K =ThT , Ai under the IAR semantics, written hT , Ai |=IAR q, if hT , Di |= q where D = B∈Rep(K) B. The IAR semantics is more conservative than the CQA semantics, as it only uses those facts which are not involved in any contradiction. This has the advantage of yield- ing query results which are almost surely correct, but also the drawback that some plausible inferences may be missed, as demonstrated by the following example. Example 2. Reconsider the KB hTuniv , Asam i and CQ q1 from Example 1. The intersec- tion of the repairs R1 ∩ R2 ∩ R3 is the empty set, so hTuniv , Asam i 6|=IAR q1 , despite the fact that all the information in Asam supports q1 being true. From the computational perspective, the IAR semantics can be much better-behaved than the CQA semantics. Indeed, it was shown in [14] that DL-LiteA is FO-rewritable for CQ answering under the IAR semantics, and this result was recently extended to linear Datalog +/- ontologies [16]. Finally, to obtain a natural overapproximation of the CQA semantics, we introduce its brave version. Definition 4. A query q is entailed by a KB K = hT , Ai under the brave semantics, written hT , Ai |=brave q, if hT , Bi |= q for some repair B ∈ Rep(K). Example 3. As q2 is entailed by hTuniv , R1 i, we have hTuniv , Asam i |=brave q2 . Also note that every fact in Asam appear in some repair, hence, all facts in Asam are entailed under the brave semantics. As Example 3 demonstrates, the brave semantics has the undesirable feature of allowing contradictory statements to be entailed. Nonetheless, this semantics can still serve a useful role by providing a means of showing that a query is not entailed under the CQA semantics. 4 Approximations of the CQA Semantics In this section, we propose two new families of inconsistency-tolerant semantics, which provide increasingly fine-grained under- and over-approximations of the CQA seman- tics. As these semantics will be shown in Section 5 to enjoy the same nice computational properties as the IAR semantics, our new approach allows us to marry the advantages of the IAR and CQA semantics. We begin by presenting our new family of sound approximations of the CQA se- mantics. The intuition is as follows: if a query q is entailed under the CQA semantics, then this is because there is a set {S1 , . . . , Sn } of T -supports for q such that every re- pair contains some Si . The k-support semantics we propose is obtained by allowing a maximum of k different supports to be used. Definition 5. A query q is entailed by K = hT , Ai under the k-support semantics, written K |=k-supp q, if there exist (not necessarily distinct) subsets S1 , . . . , Sk of A satisfying the following conditions: – each Si is a T -support for q in A – for every R ∈ Rep(K), there is some Si with Si ⊆ R Example 4. The three repairs of hTuniv , Asam i all use different supports for q1 . We thus have hTuniv , Asam i |=3-supp q1 , but hTuniv , Asam i 6|=2-supp q1 . The following theorem resumes the important properties of the family of k-support semantics, showing that they interpolate between the IAR and CQA semantics. Theorem 1. Let K = hT , Ai be a KB and q a query. Then: 1. K |=IAR q if and only if K |=1-supp q 2. K |=CQA q if and only if K |=k-supp q for some k 3. for every k ≥ 0, if K |=k-supp q, then K |=k+1-supp q The k-support semantics allows us to approximate more and more closely the set of queries entailed under the CQA semantics, but provides no way of showing that a par- ticular query is not entailed under this semantics. This motivates the study of complete approximations of the CQA semantics. The observation underlying our new family of complete approximations is the fol- lowing: if a query q is not entailed under the CQA semantics, this is because there is a T -consistent set of facts which contradicts all of the T -supports of q. The k-defeater semantics corresponds to there being no way to construct such a “defeating” set using at most k facts. Definition 6. A query q is entailed by K = hT , Ai under the k-defeater semantics, written K |=k-def q, if there does not exist a T -consistent subset S of A with |S| ≤ k such that hT , S ∪ Ci |= ⊥ for every minimal T -support C ⊆ A of q. Note that if q has no T -support, then it is not entailed under 0-defeater semantics since one can simply take S = ∅. Example 5. We have hTuniv , Asam i 6|=1-def q2 , since by choosing S = {Fellow(sam)}, we can invalidate the minimal T -supports of q2 , which are {Prof(sam)} and {Lect(sam)}. The next theorem shows that the family of k-defeater semantics provides increas- ingly closer over-approximations of the CQA semantics, starting from the brave seman- tics presented in Section 3. Theorem 2. Let K = hT , Ai be a KB and q a query. Then: 1. K |=brave q if and only if K |=0-def q 2. K |=CQA q if and only if K |=k-def q for every k 3. for every k ≥ 1, if K |=k+1-def q, then K |=k-def q 5 Data Complexity In this section, we study the data complexity of conjunctive query answering under the k-support and k-defeater semantics. Our main result is the following theorem which shows that, for a broad class of ontology languages, conjunctive query answering un- der these semantics can be done using FO-rewriting, and hence is in AC0 w.r.t. data complexity. Theorem 3. Let T be an ontology that is FO-rewritable for CQ answering under clas- sical semantics and such that for every CQ q, there exist `, m such that for every A, every minimal T -support for q relative to A has cardinality at most `, and every mini- mal T -inconsistent subset of A has cardinality at most m. Then: (i) for every k ≥ 1, T is FO-rewritable for conjunctive query answering under the k-support semantics; (ii) for every k ≥ 0, T is FO-rewritable for conjunctive query answering under the k-defeater semantics. Proof (sketch). Let T be as stated, and let q be a CQ. By assumption, we can find ` and m such that for every A, the minimal T -supports for q relative to A have cardinality at most `, and the minimal T -inconsistent subsets of A have cardinality bounded by m. For point (i), a FO-rewriting of q relative to T for the k-support semantics can be obtained by considering the first-order query ϕq = q1 ∨ . . . ∨ qn , where the disjuncts qi correspond to the different possible choices of k T -supports for q of cardinality at most `, and each qi asserts that the chosen supports are present in A and that there is no T -consistent subset of A of cardinality at most km which conflicts with each of the supports. For point (ii), the desired FO-rewriting of q takes the form ¬(q1 ∨ . . . ∨ qn ), where every qi asserts the existence of a T -consistent set of facts of cardinality at most k which conflicts with every minimal T -support for q. Here we again utilize the fact that the size of minimal T -supports is bounded by `, and hence there are only finitely many types of supports to consider. Theorem 3 significantly strengthens earlier positive results for the IAR semantics [14, 15] by covering a full range of semantics and an entire class of practically rele- vant ontology languages. Indeed, it is easy to verify that all ontology languages that are currently known to be first-order rewritable under classical semantics satisfy the hy- potheses of Theorem 3: that is, all logics of the original DL-Lite family [8] and almost all members of the extended DL-Lite family [2], as well as all dialects of Datalog+/- that are known to be FO-rewritable under classical semantics [7]. The following examples illustrate the construction of FO-rewritings for the k-support and k-defeater semantics. Example 6. We consider how to rewrite the CQ q1 under the k-support semantics. When k = 1, we can take as our FO-rewriting the disjunction of the formulas: Faculty(sam) ∧ ¬∃x teaches(x, sam) Prof(sam) ∧ ¬∃x teaches(x, sam) ∧ ¬Lect(sam) ∧ ¬Fellow(sam) Lect(sam) ∧ ¬∃x teaches(x, sam) ∧ ¬Prof(sam) ∧ ¬Fellow(sam) Fellow(sam) ∧ ¬∃x teaches(x, sam) ∧ ¬Lect(sam) ∧ ¬Prof(sam) Note that each disjunct expresses that one of the four possible T -supports is present and is not contradicted by other facts. To obtain the rewriting for k = 2, we must introduce additional disjuncts which assert that a pair of T -supports is present and cannot be simultaneously contradicted. We obtain three new disjuncts (the other combinations being subsumed by one of the other disjuncts): Prof(sam) ∧ Lect(sam) ∧ ¬∃x teaches(x, sam) ∧ ¬Fellow(sam) Lect(sam) ∧ Fellow(sam) ∧ ¬∃x teaches(x, sam) ∧ ¬Prof(sam) Fellow(sam) ∧ Prof(sam) ∧ ¬∃x teaches(x, sam) ∧ ¬Lect(sam) Finally, for k = 3, we must add further disjuncts to check for the existence of a triple of T -supports which are present and cannot be defeated. In our case, this leads to one new (non-subsumed) disjunct: Prof(sam) ∧ Lect(sam) ∧ Fellow(sam) ∧ ¬∃x teaches(x, sam) Note that this last disjunct is satisfied in IAsam , witnessing hTuniv , Asam i |=3-supp q1 . Notice also that in this particular example, the CQA and 3-support semantics coincide, and so the FO-rewriting for k = 3 is also a FO-rewriting under the CQA semantics. Example 7. We now consider how to rewrite the query q2 under the k-defeater seman- tics. When k = 0, the construction yields the following FO-rewriting: ¬ ¬(∃x Faculty(sam) ∧ teaches(sam, x)) ∧ ¬Prof(sam)  ∧ ¬Lect(sam) ∧ ¬(∃x Fellow(sam) ∧ teaches(sam, x)) Inside the negation, there is a single disjunct which asserts that the empty set conflicts with every T -support, or equivalently, that there are no T -supports. When k = 1, we must add further disjuncts inside the negation to capture single facts which conflict with all T -supports. In our case, we must add two new disjuncts: ∃x teaches(x, sam) Fellow(sam) ∧ ¬teaches(sam, x) The first disjunct is needed since any fact of the form teaches(x, sam) contradicts Faculty(sam), and hence, every T -support for q2 . The second disjunct treats the case where there is no atom teaches(x, sam), in which case the only possible T -supports for q2 are Prof(sam) and Lect(sam), both of which are contradicted by Fellow(sam). Notice that this last disjunct holds in IAsam , which proves that hTuniv , Asam i 6|= q2 . We briefly remark that polynomial data complexity is not preserved under the new semantics. Indeed, in the lightweight DL EL⊥ , CQ answering and unsatisfiability are P-complete w.r.t. data complexity, but it was shown in [18] that instance checking under the IAR (equiv. 1-support) semantics is coNP-hard w.r.t. data complexity, and it is not hard to show intractability also for the brave (equiv. 0-defeater) semantics. 6 Combined Complexity To gain further insight into the computational properties of the different inconsistency- tolerant semantics considered in this paper, we study the combined complexity of in- stance checking and CQ entailment for DL-Lite and DL-LiteHorn KBs under these se- mantics. The results of our analysis are reported in Figure 1. IAR k-supp (k > 1) CQA k-def (k > 0) brave IC DL-Lite NLS PACE NLS PACE coNP NLS PACE NLS PACE DL-LiteHorn coNP ≥ coNP coNP NP NP ≤ ∆p2 [O(log n)] CQ DL-Lite NP NP Π2p NP NP DL-LiteHorn ∆p2 [O(log n)] ∆p2 [O(log n)] Π2p NP NP Fig. 1. Combined complexity of instance checking (IC) and conjunctive query entailment (CQ) under various inconsistency-tolerant semantics. All results are completeness results, unless oth- erwise noted. Before presenting the results in more detail, let us begin with some general ob- servations. First, it is interesting to note that for DL-Lite KBs, the complexities ob- tained for the IAR, k-support, brave, k-defeater, and classical semantics all coincide, and are strictly lower than the complexity w.r.t. the CQA semantics. By contrast, for DL-LiteHorn KBs, instance checking under any of the considered inconsistency-tolerant semantics is of higher complexity than under classical semantics. Moreover, we lose the symmetry between the sound and complete approximations. Indeed, for CQ entailment, the complexities of the sound approximations (IAR and k-support) is higher than for the complete approximations (brave and k-defeater semantics). Finally, we remark that in several cases, and in particular, for the k-support seman- tics, the complexity for DL-LiteHorn is higher than for DL-Lite. This can be explained by the fact that for DL-Lite KBs, the size of a minimal T -support of a query is linear in the size of the query and independent of T , whereas for DL-LiteHorn KBs, the bound on minimal T -supports depends also on the size of T . Overall, these results suggest that while the k-support and k-defeater semantics are tractable w.r.t. data complexity for both DL-Lite and DL-LiteHorn , it will likely be much easier to obtain practical al- gorithms for DL-Lite KBs. We now present our different complexity results and some brief ideas concerning the proofs. We start by showing that for DL-Lite, instance checking under the proposed semantics has the same low complexity as under classical semantics. Theorem 4. In DL-Lite, instance checking under the k-support semantics is NLS PACE- complete w.r.t. combined complexity, for every k ≥ 1. The same holds for the k-defeater semantics, for every k ≥ 0. Proof (idea). The proof exploits the fact that when T is a DL-Lite ontology, minimal T -supports for IQs consist of single facts, and minimal T -inconsistent subsets contain at most two facts. This means in particular that every k-tuple of minimal T -supports contains at most k facts, and at most k facts are needed to contradict all k supports. This enables a NLS PACE procedure which guesses k facts and verifies that each fact is a T -support, and that there is no set with at most k facts which contradicts all of the guessed facts. The upper bound for the k-defeater semantics uses similar ideas. In DL-LiteHorn , instance checking is intractable already for the IAR and brave se- mantics, and the lower bounds can be used to show intractability also for the k-support and k-defeater semantics. For the k-defeater semantics, Theorem 6 provides a matching upper bound, while the precise complexity for the k-support semantics remains open. Theorem 5. Instance checking in DL-LiteHorn is coNP-complete w.r.t. combined com- plexity under the IAR semantics, coNP-hard w.r.t. combined complexity under k-support semantics, and NP-complete w.r.t. combined complexity under both the brave semantics and k-defeater semantics. Proof (idea). We sketch the coNP lower bound for the IAR semantics, which is by reduction from UNSAT. Let ϕ = c1 ∧ . . . ∧ cn be a propositional CNF over variables x1 , . . . , xm . Consider the DL-LiteHorn KB with T ={Ti v Cj | xi ∈ cj } ∪ {Fi v Cj | ¬xi ∈ cj }∪ {Ti u Fi v ⊥ | 1 ≤ i ≤ m}∪{A u C1 u . . . u Cn v ⊥} and A = {A(a)}∪{Ti (a), Fi (a) | 1 ≤ i ≤ m}. Then it can be shown that hT , Ai |=IAR A(a) if and only if the formula ϕ is unsatisfiable. We next consider the complexity of CQ entailment under our proposed semantics. For DL-Lite, we obtain precisely the same complexity as under the classical semantics. Theorem 6. In DL-Lite, CQ entailment under the k-support semantics is NP-complete w.r.t. combined complexity, for every k ≥ 1. For both DL-Lite and DL-LiteHorn , CQ entailment under the k-defeater semantics is NP-complete w.r.t. combined complexity, for every k ≥ 1. Proof (idea). We sketch the upper bound for the k-defeater semantics. Fix a DL-LiteHorn KB hT , Ai and a CQ q. Let S1 , . . . , Sm be the T -consistent subsets of A with cardi- nality at most k. Guess a sequence C1 , . . . , Cm of subsets of A of cardinality at most c = 2 · |q| · |T |, together with polynomial certificates that hT , Ci i |= q, for each Ci . Output yes if for every 1 ≤ i ≤ m, the certificate is valid and Si ∪ Ci is T -consistent. As m is polynomial in |A| (since k is fixed), and both conditions can be verified in polynomial time for DL-LiteHorn KBs, we obtain an NP procedure. Correctness relies on the fact that because T is a DL-LiteHorn ontology, every minimal T -support for q has cardinality at most c. For DL-LiteHorn , CQ entailment under the IAR and k-support semantics rises to ∆p2 [O(log n)]-complete. Theorem 7. In DL-LiteHorn , CQ entailment under k-support semantics is ∆p2 [O(log n)]- complete w.r.t. combined complexity, for every k ≥ 1. Proof (idea). The lower bound is by a non-trivial reduction from the Parity(SAT) prob- lem [21]. For the upper bound, consider the following algorithm which takes as input a DL-LiteHorn KB hT , Ai and CQ q: 1. For every k-tuple (α1 , . . . , αk ) ⊆ Ak of facts, use an NP oracle to decide whether every repair contains some αi . Let S contain all k-tuples for which the test succeeds. 2. A final oracle call checks if there is a k-tuple (C1 , . . . , Ck ) of subsets of A of cardi- nality at most c = 2 · |T | · |q| such that (i) every Ci is T -consistent and hT , Ci i |= q, and (ii) every k-tuple (β1 , . . . , βk ) with βi ∈ Ci belongs to S. Return yes if the call succeeds, else no. Every minimal T -support for q contains at most c facts. It follows that the algorithm returns yes if hT , Ai |=k-supp q. Conversely, if the output is yes, with (C1 , . . . , Cn ) the k-tuple from Step 2, then by (i), every Ci is a T -support for q. Moreover, (ii) ensures that every repair contains some Ci , for it not, we could find some k-tuple (β1 , . . . , βk ) ∈ C1 × . . . × Cn which does not belong to S, contradicting (ii). Note that the algorithm runs in polynomial time with an NP oracle, since there are only polynomially many k-tuples of facts to consider, for fixed k. As the oracle calls can be organized into a tree, membership in ∆p2 [O(log n)] follows by a result from [11]. Finally, we determine the combined complexity of instance checking and CQ en- tailment for the CQA semantics (prior results only considered data complexity). Theorem 8. For DL-Lite and DL-LiteHorn , instance checking (resp. CQ entailment) under CQA semantics is coNP-complete (resp. Π2p -complete) for combined complexity. Proof (idea). The upper bounds are easy: guess a repair and show that it does not en- tail the query. The coNP-lower bound for instance checking follows from the coNP- hardness of this problem w.r.t. data complexity. The Π2p -hardness result involves a non- trivial reduction from 2-QBF validity. We should point out that our proofs are quite generic and can be directly used (or trivially extended) to obtain results for a whole rangle DL-Lite dialects (as well as other ontology languages). 7 Future Work The present work can be extended in several directions. First, we believe that our ap- proach can have a practical impact on OBDA systems, so we aim to implement and experiment with the approach. It would also be very interesting to investigate the con- nections between our approach and approximate knowledge compilation [19]: in par- ticular, it would be important (also for practical purposes) to study the possibility of effectively “compiling” our semantics. Moreover, it is also relevant to extend our anal- ysis to more complex OBDA systems, where the ontology elements are related to the data sources through complex mappings [17]. Finally, while the present approach is computationally attractive for all known FO-rewritable ontology languages, tractable approximations of the CQA semantics for other tractable yet non-FO-rewritable ontol- ogy languages (like EL⊥ [3]) are still missing. Acknowledgments. The first author has been supported by a Université Paris-Sud At- tractivité grant and ANR project PAGODA (ANR-12-JS02-007-01). The second author has been partially supported by EU FP7 project Optique – Scalable End-user Access to Big Data (grant n. FP7-318338). References 1. Arenas, M., Bertossi, L.E., Chomicki, J.: Consistent query answers in inconsistent databases. In: Proc. of PODS. pp. 68–79. ACM Press (1999) 2. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: The DL-Lite family and re- lations. Journal of Artificial Intelligence Research 36, 1–69 (2009) 3. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope. In: Proc. of IJCAI. pp. 364–369 (2005) 4. Bertossi, L.E.: Database Repairing and Consistent Query Answering. Synthesis Lectures on Data Management, Morgan & Claypool Publishers (2011) 5. Bienvenu, M.: On the complexity of consistent query answering in the presence of simple ontologies. In: Proc. of AAAI (2012) 6. Calı̀, A., Gottlob, G., Pieris, A.: New expressive languages for ontological query answering. In: Proc. of AAAI (2011) 7. Calı̀, A., Gottlob, G., Pieris, A.: Towards more expressive ontology languages: The query answering problem. Artificial Intelligence 193, 87–128 (2012) 8. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Auto- mated Reasoning 39(3), 385–429 (2007) 9. Eiter, T., Gottlob, G.: The complexity of logic-based abduction. J. ACM 42(1), 3–42 (1995) p 10. Eiter, T., Gottlob, G.: The complexity class θ : Recent results and applications in AI and 2 modal logic. In: Proc. of FCT. pp. 1–18 (1997) 11. Gottlob, G.: Np trees and Carnap’s modal logic. Journal of the ACM 42(2), 421–457 (1995) 12. Immerman, N.: Nondeterministic space is closed under complementation. SIAM Journal of Computing 17(5), 935–938 (1988) 13. Lembo, D., Lenzerini, M., Rosati, R., Ruzzi, M., Savo, D.F.: Inconsistency-tolerant seman- tics for description logics. In: Proc. of RR. pp. 103–117 (2010) 14. Lembo, D., Lenzerini, M., Rosati, R., Ruzzi, M., Savo, D.F.: Query rewriting for inconsistent dl-lite ontologies. In: Proc. of RR. pp. 155–169 (2011) 15. Lukasiewicz, T., Martinez, M.V., Simari, G.I.: Inconsistency handling in datalog+/- ontolo- gies. In: Proc. of ECAI. pp. 558–563 (2012) 16. Lukasiewicz, T., Martinez, M.V., Simari, G.I.: Inconsistency-tolerant query rewriting for lin- ear datalog+/-. In: Proc. of Datalog 2.0. pp. 123–134 (2012) 17. Poggi, A., Lembo, D., Calvanese, D., Giacomo, G.D., Lenzerini, M., Rosati, R.: Linking data to ontologies. Journal of Data Semantics 10, 133–173 (2008) 18. Rosati, R.: On the complexity of dealing with inconsistency in description logic ontologies. In: Proc. of IJCAI. pp. 1057–1062 (2011) 19. Selman, B., Kautz, H.A.: Knowledge compilation and theory approximation. Journal of the ACM 43(2), 193–224 (1996) 20. Szelepcsényi, R.: The method of forcing for nondeterministic automata. Bulletin of the EATCS 33, 96–99 (1987) 21. Wagner, K.W.: More complicated questions about maxima and minima, and some closures of NP. Theoretical Computer Science 51, 53–80 (1987)