First Order Rewritability for Ontology Mediated Querying in Horn-DLF D David Toman and Grant Weddell Cheriton School of CS, University of Waterloo, Canada {david,gweddel}@uwaterloo.ca Abstract. We study first order (FO) rewritability for query answering in the setting of ontology mediated queries (OMQ) in which ontologies are formulated in the Horn fragments of the feature description logic DLFD. In general, OMQ approaches for logics such as DLFD rely on non-FO completion of the data (ABoxes) that can commonly be ex- pressed as Datalog programs. In this paper, we study the problem of the existence of FO rewritings for a given Horn-DLFD theory (TBox) in terms of Beth definability, and show how Craig interpolation can then be used to effectively construct the rewritings (when they exist) from the Clark’s completion of Datalog programs for computing ABox comple- tions. In addition, since data that populates OMQ is commonly residing in relational database systems, we show how constraints enforced by such systems can be used to expand the scope of rewritable queries. 1 Introduction In earlier work [25,27], we presented a combined-combined approach to ontology mediated querying (OMQ) of backend relational data sources when ontologies are formulated as TBoxes in the Horn fragments of the description logic (DL) DLFD and where ABoxes are induced by data residing in the data sources. The logic DLFD is the most expressive member of the FunDL family of feature-based DLs [26], which are particularly well suited for capturing relational schemata by virtue of being feature-based and also by virtue of including the path functional dependency (PFD) concept constructor. PFDs makes it possible to capture a variety of equality generating dependencies commonly occurring in relational schemata, such as primary keys, uniqueness constraints and functional depen- dencies. To illustrate, we show in the next section how the relational schema in Figure 1 can be captured as a Horn-DLFD TBox, the FunDL dialect that will be the focus of the remainder of this paper. This example also serves to illustrate how the parts of a TBox that reflect a relational schema could in principle be automatically generated, and thereby make it unnecessary to author so-called mapping rules in ontology based data access (OBDA) (again see [25,27]). The combined-combined approach is so named because any of the Horn frag- ments of DLFD in the FunDL family will require using ontological knowledge, Copyright c 2020 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). a Horn-DLFD TBox in our case, for two steps in OMQ. The first step, as with the combined approach to OMQ [19,20,22,23], is to complete the ABox via a Datalog program that adds tuples to tables in the backend relational database. Note that this first step depends only on the TBox. The second step, as with the perfect rewriting approach to OMQ and OBDA [9], is to rewrite a query to account for such artifacts as inheritance and attribute typing in the TBox. In this paper, we show how Beth definability can be used to resolve the problem of recognizing when a first order (FO) rewriting of the Datalog program needed in the first step exists, and then how Craig interpolation can be used to effectively construct such a rewriting. A key insight is to employ the Clark’s completion of the Datalog program in both the diagnosis and synthesis of FO expressibility of ABox completion. The existence of such a rewriting enables an OMQ frontend to a relational data source to operate entirely by a more refined query reformulation that, for a given user query, ultimately yields an SQL query over the data source, that is, with no requirement to update tables in the data source beforehand. An additional advantage of our approach is that it also becomes straightforward to incorporate the benefits of integrity constraints such as foreign key constraints that are enforced by a relational database management system to simplify query reformulation. Indeed, such integrity constraints can enable FO rewriting of data completion that would otherwise not be possible. In summary, our contributions are as follows. 1. We show how to decide FO rewritability of OMQ in Horn-DLFD via Clark’s completion of Datalog programs and Beth definability; 2. We show how an equivalent non-recursive Datalog program can be derived via Craig’s theorem and interpolation in cases where ABox completion for a given TBox is indeed FO expressible; 3. We illustrate how integrity constraints in backend relational schema can enable FO rewritability of OMQ that would otherwise not be possible; and 4. We show how our framework extends to query specific OMQ in a straight- forward manner. The remainder of the paper is organized as follows. Section 2 immediately fol- lowing provides the necessary background and definitions. Here, we review Horn- DLFD and the combined combined approach to OMQ. Our main results follow in Section 3 in which we show how the above-mentioned artifacts, Clark’s com- pletion of Datalog programs for example, can be employed to both decide FO rewritability and to synthesize FO rewritings of ABox completion in the com- bined combined approach to OMQ. In Section 4 shows how a Datalog program computing an ABox completion can usefully incorporate database constraints enforced by backend relational database systems. Section 5 then considers query specific OMQ, that is, when the combination of a Datalog ABox completion procedure and a specific query are FO expressible. Summary comments and our conclusions are in Section 6. create table EMP ( name STRING not null, dept STRING, man STRING, primary key ( name ), constraint deptRef foreign key ( dept ) references DEPT, constraint manRef foreign key( man ) references EMP ); create table DEPT ( name STRING not null, head STRING, primary key ( name ), candidate key ( head ), constraint headRef foreign key( head ) references EMP ); Fig. 1. A Relational Schema. 2 Background and Definitions All member dialects of the FunDL family of DLs [26] are fragments of FOL with underlying signatures based on disjoint sets of unary predicate symbols PC, called primitive concepts, constant symbols IN, called individuals, and unary function symbols F, called features. The latter replace roles in other DLs and are interpreted as either total functions or partial functions, depending on the particular FunDL dialect. The most expressive member of the FunDL family is the dialect DLFD. In this paper, we consider Horn-DLFD, the most expressive Horn fragment of DLFD, assuming in particular that an ontology is given as a Horn-DLFD TBox.1 To simplify our presentation, we also assume features are interpreted as total functions. Definition 1 (Horn-DLFD) A path function Pf is a word in F∗ , with the empty word denoted by id , and concatenation by “.”. Concept descriptions of two kinds, C and D, are defined by the grammar rules on the left-hand-side of Figure 2, where A ∈ PC, f ∈ F and Pf (possibly subscripted) refers to a path function. An instance of the final production is called a path functional dependency (PFD). Semantics is defined in the standard way with respect to an interpretation I = (4, (·)I ) that fixes the meaning of symbols in PC, IN and F. Here, features are interpreted as total functions. We also assume the unique name assumption (UNA) whereby, for any distinct individuals a and b, aI 6= bI . The interpretation function I is extended to path expressions by interpreting id (the empty word) as the identity function λx.x, concatenation as function composition, and to complex concept descriptions C or D as given on the right- hand-side of Figure 2. An interpretation I satisfies a subsumption C v D if 1 Here, we do not consider the problem of recognizing when TBoxes in very expressive DLs have unique minimal models. Syntax Semantics: “(·)I ” C ::= A AI ⊆ 4 | ∀ Pf .C {x | Pf I (x) ∈ C I } | C1 u C2 C1I ∩ C2I D ::= C CI ⊆ 4 | ⊥ ∅ | ∃f −1 {x | ∃y ∈ 4 : f I (y) = x} | ∀ Pf .D {x | Pf I (x) ∈ DI } Vk | C : Pf 1 , . . . , Pf k → Pf {x | ∀y ∈ C I : i=1 Pf Ii (x) = Pf Ii (y) ⇒ Pf I (x) = Pf I (y)} Fig. 2. Horn-DLFD Concepts. C I ⊆ DI , a concept assertion A(a) if aI ∈ AI , and a feature assertion f (a) = b if f I (aI ) = bI , where a, b ∈ IN. A knowledge base (KB) K = (T , A) consists of a TBox T of subsumptions, and an ABox A of assertions. I satisfies K if it satisfies each subsumption and assertion in K. Although incorporating features deviates from the normal practice of using bi- nary predicate symbols called roles, features make it easier to incorporate con- cept constructors suited to the capture of relational data sources that include various dependencies by a straightforward reification of n-ary predicates. Thus, e.g., a role R can be reified as a primitive concept RC and two features domR and ranR in Horn-DLFD, and a subsumption A v ∀R.B can then be captured as a subsumption ∀domR.A v ∀ranR.B. As usual, to ensure decidability in the presence of ABoxes [28] and reasonable computational properties of the logic, one looks for additional restrictions on the PFD concept constructors. One restriction, which has been investigated (e.g., [17,27]), is to limit the PFD constructor to one of the following two forms: 1. C : Pf 1 , . . . , Pf . Pf i , . . . , Pf k → Pf or 2. C : f1 , . . . , fk → g We can allow a more complex variant of (2) if the ABox completion admits/requi- res creating additional constant symbols; for details—beyond the scope of this paper—see [27]. Also, in this paper we will not explore the parameter tractable fragments of Horn-DLFD nor the issues connected with interpreting features as partial functions. Both these issues were studied in [25] which shows, among other things, how partiality can be easily simulated in Horn-DLFD. Example 2 Figure 3 lists the subsumptions for a Horn-DLFD TBox that pro- vides an ontological view of the relational schema illustrated in Figure 1. Part (a) are subsumptions that capture column definitions and data dependencies mani- fest in the schema itself. This part of an ontology can be automatically generated (disjoint classes) EMP u DEPT v ⊥ (disjoint columns) EMP v DEPT : name → id EMP u STRING v ⊥ (b) additional data dependencies DEPT u STRING v ⊥ (attribute typing) EMP v ∀name.STRING (inheritance) MAN v EMP EMP v ∀deptRef.DEPT (local typing) EMP v ∀manRef.MAN EMP v ∀manRef.EMP DEPT v ∀headRef.EMP DEPT v ∀headRef.MAN (inverses) MAN v ∃headRef−1 DEPT v ∀name.STRING (c) ontology enhancement (primary keys) EMP v EMP : name → id DEPT v DEPT : name → id (candidate keys) DEPT v DEPT : head → name (a) derived from the relational schema Fig. 3. An Ontology as a Horn-DLFD TBox. (for an outline of a procedure to do this, see [25]). Observe in particular how the name attached to a foreign key constraint, such as headRef, is mapped to a feature with the same name. Part (b) illustrates how additional data depen- dencies not expressible without appeal to general assertions in SQL can also be captured in Horn-DLFD. Here, the (presumed) fact that name values of em- ployees are disjoint from name values of departments is captured by appeal to an asymmetric use of PFDs. Finally, Part (c) illustrates how an ontology can be enhanced by adding further subsumptions, in this case relating to the virtual concept of a manager. Conjunctive queries are, as usual, formed from atomic queries (or atoms) of the form “A(x)” and “x.f = y”, where x and y are variables, using conjunc- tion and existential quantification. To simplify notation, we conflate conjunctive queries with the set of its constituent atoms and a set of answer variables: Definition 3 (Conjunctive Queries and Certain Answers) Let A(xi ) and xi1 .fi = xi2 be (first-order) atoms, where A is a primitive concept , fi is a feature, and x̄ and ȳ tuples V of variables. A conjunctive query (CQ) ϕ is an expression of the form ∃ȳ. ψ with free variables x̄, that is constructed from the above atoms ψ using conjunctions and existential quantification. Let K be a Horn-DLFD knowledge base and ϕ a CQ. A certain answer to ϕ over K is a substitution of constant symbols ā, [x̄ 7→ ā], such that K |= ϕ[x̄ 7→ ā]. As usual, UCQ stands for a union of CQs. To study the first-order rewritability of conjunctive queries over Horn-DLFD TBoxes, we use the following version of the combined combined approach to OMQ in our setting. Proposition 4 (Combined Combined Approach to OMQ [25,27,30]) Let K = (T , A) be a Horn-DLFD knowledge base and ϕ a conjunctive query. Then there is a UCQ query ϕT and a Datalog program ΠT , both of which can be effectively constructed from T , such that K |= ϕ[x̄ 7→ ā] ⇐⇒ ΠT (A) |= ϕT [x̄ 7→ ā] for a tuple of constant symbols ā and where ΠT (A) is the minimal model of ΠT when evaluated over A. The proposition uses a Datalog program ΠT to define an ABox completion over which the query ϕT , the rewriting of the original user query, is evaluated to com- pute the certain answers. The Proposition also shows that the data complexity of OMQ in Horn-DLFD is complete for PTIME. 3 Classification of Horn-DLF D TBoxes To simplify the presentation, we assume that Horn-DLFD TBoxes are in a normal form given as follows: Definition 5 (Normal Form of Horn-DLFD TBoxes) Let T be a Horn-DLFD TBox. We say that T is in normal form if all subsump- tions adhere to one of the following six forms: 1. A v ⊥, 4. ∀f.A v B, 2. A1 u A2 v B, 5. A v ∃f −1 , and 3. A v ∀f.B, 6. A v B : Pf 1 , . . . , Pf k → Pf. It is an easy exercise to convert an arbitrary Horn-DLFD TBox to its conserva- tive extension in this normal form by introducing additional primitive concepts. Definition 6 (Datalog Program ΠT ) The Datalog program ΠT used in Pro- position 4 consists of completion rules obtained by translating subsumptions that are logical consequences of T . The form of these subsumptions and their trans- lation are given as follows: (consequences of T ) (completion rule in ΠT ) Av⊥ C⊥ (x) ← CA (x) A1 u A2 v B CB (x) ← CA1 (x), CA2 (x) A v ∀f.B CB (x) ← CA (y), Rf (y, x) ∀f.A v B CB (x) ← CA (y), Rf (x, y) For every primitive concept B, we introduce an unary EDB predicate PB (x) together with an additional clause CB (x) ← PB (x) that captures explicit data in an ABox of the form A(a), and an IDB predicate CB (x) corresponding to the completion of the ABox w.r.t. T . For features, we use Rf (x, y) as a synonym for ABox assertion f (x) = y to conform with standard Datalog syntax. We also define an auxiliary symbol C⊥ (x) that will be used to test for ABox consistency. Note that the subsumptions for unqualified inverse features (5) and PFDs (6) do not contribute to the definition of an ABox completion since they do not generate additional ABox assertions for primitive concepts. Also note that, unlike in the classical combined approach [19,23], our combined combined approach does not introduce any new constants in the ABox [27]. To test for first-order definability of the completion (i.e., all predicates that stand for the completed ABox instance), we use the following construction: Definition 7 (Clark’s Completion ΣT ) The Clark’s Completion [11] ΣT of ΠT is given by a set of formulas CB (x) ↔ PB (x) ∨ (∃y.α1 ) ∨ . . . ∨ (∃y.αn ) corresponding to all clauses CB (x) ← αi ∈ ΠT with the head CB (x). This completion closes the original Datalog program in the following sense: Proposition 8 ([11], simplified for this paper) – ΠT |= CB (a) implies ΣT |= CB (a), and – ΠT 6|= CB (a) implies ΣT |= ¬CB (a). Note that Clark’s result works in the much more general setting of logic pro- grams with function symbols and possibly infinite resolution proofs and under the Negation As Failure semantics. Since Clark’s completion makes all IDB pred- icates closed, we can now use standard tools for testing for explicit definability. Note that had we used ΠT instead, none of the definability results could possibly hold. Note that in absence of role/feature subsumptions (such as role hierarchies) we do not need to apply the completion to the Rf atoms. Proposition 9 (Projective Beth Definability [3]) Let Σ be an FO theory over symbols in L and L0 ⊆ L. Then the following are equivalent: 1. For M1 and M2 models of Σ such that M1|L0 = M2|L0 , it holds that M1 |= ϕ[a] iff M2 |= ϕ[a] for all M1 , M2 , and a tuples of constants, and 2. ϕ is equivalent under Σ to a formula ψ in L0 . This gives us a complete characterization of first-order rewritability of the ABox closure of individual primitive concepts with respect to Horn-DLFD TBoxes as follows: Theorem 10 Let T be a Horn-DLFD TBox over the primitive concepts and features {A1 , . . . , Ak , f1 , . . . , fn }, and let T 0 be a conservative extension of T in normal form. Then the completion of the primitive concept A w.r.t. T is first-order definable if and only if CAi (x) is Beth definable over ΣT 0 and L0 = {PA1 , . . . , PAk , Rf1 , . . . , Rfn }. Proof. Follows immediately from the properties of Beth definability (Proposi- tion 9) and the definition and properties of the Clark’s completion (Proposi- tion 8). Observe that one can restrict the alphabet of the ABox (L0 ) to target only ABoxes over restricted signature(s). Given ΣT , one can now reformulate (1) in Proposition 9 as a logical impli- cation problem by making a copy of all formulas of ΣT in which all non-logical symbols not in {PA1 , . . . , PAk , Rf1 , . . . , Rfn } are starred. Hence, the definability question for CA (x) can be expressed as a logical implication question of the form: ΣT ∪ ΣT∗ |= ∀x.CA (x) → CA ∗ (x) (1) Note that, on closer inspection, all formulas in ΣT can be written as ALCI subsumptions.2 Hence: Theorem 11 Let T be a Horn-DLFD TBox. Then the existence of 1. the first order rewritability of the A completion with respect to T , 2. the consistency of T , and 3. the uniform query rewritability over T are all decidable and in EXPTIME. Proof. The first claims follow immediately from Theorem 10 applied to all atoms of the form CB and the decidability and complexity of reasoning in ALCI. The second claim relies on definability of C⊥ (x) and, in the presence of key PFDs of the form A v B : Pf 1 , . . . , Pf k → Pf, on the definability of CA (x) and CB (x) since the remaining consistency check for PFD satisfaction can then be expressed as a first-order query with respect to these explicit definitions. The last claim follows by observing that (i) definability of atomic queries implies definability of arbitrary (U)CQs using the combined-combined approach [25], and that (ii) non-definability of a single atomic query exhibits the need for a non first-order ABox completion for queries containing/consisting of this atom. Note that in the third case one can restrict the definability conditions to atoms that can appear in the query ϕT . A matching lower bound can be obtained for expressive fragments of Horn-DLFD (for which the reasoning complexity is EXPTIME-complete). However, the exact complexity is open for PTIME frag- ments, such as CF Dnc and CF DI ∀− . Note, however, that since the size (and the kc construction) of rewritings will dominate this cost (even for the simplest ontology languages [18]), exact complexity bounds are mostly of academic interest. Example 12 Returning to our introductory example, we find out that neither CEMP (x) nor CDEPT (x) nor the TBox-derived CMAN (x) atom are first-order definable. We confirmed this using a tableau-based depth-limited interpolant generator [16,31]. One of the reasons is that an ABox can contain a chain of alternating deptRef/empRef features without having the objects along this chain alternately belonging to the EMP and DEPT concepts, save the individual at the beginning of this chain. The Datalog completion rules are needed to propagate these concept assertions along such a chain. 2 Disregarding functionality of the original features does not matter at this juncture. 3.1 Construction of Rewritings To obtain an algorithm that constructs rewritings from our characterization of FO rewritability, we utilize Craig Interpolation: Proposition 13 (Craig Interpolation [12]) Let ϕ and φ be first order for- mulas such that |= ϕ → φ. Then there is a first order formula ψ, called Craig interpolant, containing only symbols common to ϕ and φ such that |= ϕ → ψ and |= ψ → φ. Moreover, the interpolant can be extracted, typically in linear time, from a proof of |= ϕ → φ, as long as a reasonably structural proof system, such as resolution, (cut-free) sequent calculus, and/or analytic tableau is used. The formulation (1), after a minor manipulation of the formulas, can thus be used to construct a Craig interpolant for CA (x) from a (analytic) tableau proof of the above logical implication statement (if one exists): one can con- struct, with the help of blocking [1,13], a finite analytic tableau for (1). The size of the rewriting, however, will depend crucially on the proof system used and on the presentation of the rewriting. However, by using an optimal tableau for ALC [13] (modified for ALCI), one can construct an exponential represen- tation of the rewriting by sharing common subexpressions in the same way the tableau algorithm caches already unsatisfiable sets. However, plain first-order representation of the formula will suffer from additional exponential blowup, as one would expect. Combining the above construction with the already existing rewriting ϕT from [25] we get: Theorem 14 Let K = (T , A) be a Horn-DLFD knowledge base. Then the data complexity of conjunctive query answering is in AC 0 whenever the A completion with respect to T is first-order definable with respect to ΣT . Proof. Let ψA (x) be a first order definition of CA w.r.t. ΣT . Then K |= ϕ(a) iff A |= ϕT [ψA [y/x]/A(y) | A ∈ PC](a). The claim follows since ϕT [ψA [y/x]/A(y) | A ∈ PC] is a first order formula, in particular a UCQ in the case of Horn-DLFD. 4 Integration of Database Constraints For ABoxes constructed from relational instances (see the discussion in our In- troduction and Background Sections), relational systems commonly enforce in- tegrity constraints such as primary and foreign keys. The effect of these database constraints is that parts of the corresponding ABox are not only consistent with such constraints but are a model. This observation is in particular helpful in the case of foreign keys since they stipulate that parts of the ABox completion are no longer necessary: for a foreign key to hold in an instance of a relational database, either its target exists in the appropriate table or the source is NULL, which can then be interpreted as value unknown and is taken care of by query rewriting ϕT . Definition 15 (Adding Foreign Keys) Let A v B and A v ∀f.B be sub- sumptions corresponding to foreign keys that capture ISA and attribute typ- ing, respectively. For each such constraint, add PB (x) ← CA (x) and PB (x) ← CA (y), Rf (x, y), respectively, to ΣA . This definition allows us to sidestep parts of the ABox completion that are mandated by the database constraints and thus already exist in the original instance of the ABox. As a consequence, we have the following extension of Theorem 14: Theorem 16 Let K = (T , A) be a Horn-DLFD knowledge base. Then the data complexity of conjunctive query answering is in AC 0 whenever the A completion with respect to T is first-order definable with respect to the theory ΣT ∪ ΣA . Example 17 Continuing with our example, adding the constraints PEMP (x) ← CEMP (y), RmanRef (x, y) PEMP (x) ← CDEPT (y), RdeptRef (x, y) PDEPT (x) ← CEMP (y), RheadRef (x, y) to ΣA , both CEMP (x) and CDEPT (x) become definable with the expected inter- polants CEMP (x) and CDEPT (x) found by the above-mentioned interpolant genera- tor [16,31]. Moreover, CMAN (x) also becomes definable by ∃y.PEMP (x)∧RmanRef (y, x) (again, using the interpolant found by the interpolant generator). 5 Query-specific Rewritability Our approach also provides decidability for the non-uniform (query-specific) problems: Theorem 18 Let T be a Horn-DLFD TBox and ϕ a (U)QC. Then the follow- ing are equivalent: 1. ΣT ∪ ΣT∗ |= ∀x.ϕT → ϕ∗T (x), and 2. ϕ is first-order rewritable with respect to T . The exact complexity again depends on the complexity of (1) above. In the general case, an EXPTIME bound follows from [15], but again, a more refined analysis is in order for fragments of Horn-DLFD. 6 Summary and Conclusions We have shown how the combined combined approach to OMQ can be used to determine when query answering mediated by a Horn-DLFD TBox can be reduced to a first order query over a plain ABox (or, in turn, over the underlying data source), and when a non-first order (Datalog-based) completion of the data is needed. Interestingly enough, and unlike other approaches that aim on deter- mining first-order rewritability, our approach links the problem to well studied issues of explicit definability [3], interpolation [12], and to Clark’s completion of logic programs [11]. In addition, our approach naturally and seamlessly incorpo- rates database constraints that hold in the data source to improve our chances of finding a first order rewriting. Further extensions of this work along the following lines is needed: – Performing a fine-grained analysis of computational complexity for tractable members of Horn-DLFD, such as CF Dnc . (We conjecture that the rewritabil- ity problem will remain intractable.) – Determining if a dichotomy holds, in particular, if non-rewritability implies NL hardness of OMQ in data complexity. (We conjecture that this is the case since blocked branches of a tableau proof signal non-rewritability and can be used to simulate, e.g., s-t connectivity similarly to the approach in [21].) – Extensions to situations in which ABox closure requires a finite number of additional auxiliary constants. (The approach itself can immediately use Skolem functions to deal with such constants while retaining completeness. However, decidability and complexity issues stemming from the use of Skolem functions remain to be determined.) – Extensions to DLs outside of the FunDL family, such as the various members of the EL family and to more expressive logics such as Horn-SHIQ. – And integration with an interpolation-based relational query optimizer [16,31] via depth-limited tableau construction, which, for a sufficiently large depth, leads to completeness as in [10]. An additional issue arises from the assumption of UNA. Indeed, this assumption is implicit in [21], where explicit equality among constant symbols is simply not considered, and also in most of the other approaches to OMQ via query rewriting, including the original approach introduced in [9]. In the case of Horn-DLFD, UNA is needed explicitly to accommodate keys and PFDs. We conjecture that some form of UNA is necessary, but whether restricted variants could suffice is unknown. 6.1 Related Work First order rewritability for Horn logics in the ALC/SHIQ family has been studied by others, e.g., see [4,6]. This earlier work has also developed algorithms for generating such rewritings efficiently for logics in the EL family [14]. Our approach seems to provide an alternative path to detecting rewritability and to generating rewritings. A feature of our approach is its link to interpolation-based query optimization [16,29]. The link to query optimization reveals that minimal sized rewritings are often not optimal for query execution. However, establishing limits on the size of rewriting [5] does provide a guide to what rewritings are reasonable to consider during query optimization. The use of database constraints, or constraints implied by data mapping rules and database constraints, has been explored in several systems that im- plement variants of perfect rewriting [9], such as Ontop and MASTRO [2,8] and others. Our approach, however, seamlessly accommodates such constraints into the rewriting via interpolation and, again, could serve as an unifying approach between perfect rewriting-based approaches and the combined approach, even for DL-Lite based approaches to OMQ/OBDA. Our approach cannot deal with rewritings to Datalog/and or Linear Datalog and with dichotomies on the NL-PTIME [21] and PTIME-coNP boundaries [24]: extensions of definability/interpolation beyond first order theories do not seem to exist at present and it is unclear whether similar approaches can be developed due to the close ties between interpolation and compactness of first order theories. Finally, note that Beth definability and Craig Interpolation have been used for other purposes, such as query reformulation under first-order constraints [7,16,29,31]. That use, however, is orthogonal to the topic of this paper. References 1. Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi, and Pe- ter F. Patel-Schneider. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003. 2. Timea Bagosi, Diego Calvanese, Josef Hardi, Sarah Komla-Ebri, Davide Lanti, Martin Rezk, Mariano Rodriguez-Muro, Mindaugas Slusnys, and Guohui Xiao. The ontop framework for ontology based data access. In The Semantic Web and Web Science - 8th Chinese Conference, CSWS 2014, Revised Selected Papers, pages 67–77, 2014. 3. Evert Willem Beth. On Padoa’s method in the theory of definition. Indagationes Mathematicae, 15:330–339, 1953. 4. Meghyn Bienvenu, Peter Hansen, Carsten Lutz, and Frank Wolter. First order- rewritability and containment of conjunctive queries in horn description logics. In Proceedings of the 29th International Workshop on Description Logics, Cape Town, South Africa, 2016. 5. Meghyn Bienvenu, Stanislav Kikot, Roman Kontchakov, Vladimir V. Podolskii, Vladislav Ryzhikov, and Michael Zakharyaschev. The complexity of ontology- based data access with OWL 2 QL and bounded treewidth queries. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, pages 201–216, 2017. 6. Meghyn Bienvenu, Balder ten Cate, Carsten Lutz, and Frank Wolter. Ontology- based data access: A study through disjunctive datalog, csp, and MMSNP. ACM Trans. Database Syst., 39(4):33:1–33:44, 2014. 7. Alexander Borgida, Jos de Bruijn, Enrico Franconi, Inanç Seylan, Umberto Strac- cia, David Toman, and Grant E. Weddell. On finding query rewritings under expressive constraints. In SEBD, pages 426–437, 2010. 8. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The MASTRO system for ontology-based data access. Se- mantic Web, 2(1):43–53, 2011. 9. Diego Calvanese, Giuseppe de Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable Reasoning and Efficient Query Answering in De- scription Logics: The DL-Lite Family. J. of Automated Reasoning, 39(3):385–429, 2007. 10. Jan Chomicki. Depth-bounded bottom-up evaluation of logic program. J. Log. Program., 25(1):1–31, 1995. 11. Keith L. Clark. Negation as failure. In Logic and Data Bases, Symposium on Logic and Data Bases, Centre d’études et de recherches de Toulouse, France, 1977, pages 293–322, 1977. 12. William Craig. Three uses of the Herbrand-Genzen theorem in relating model theory and proof theory. Journal of Symbolic Logic, 22:269–285, 1957. 13. Francesco M. Donini and Fabio Massacci. EXPTIME tableaux for ALC. Artif. Intell., 124(1):87–138, 2000. 14. Peter Hansen, Carsten Lutz, Inanç Seylan, and Frank Wolter. Efficient query rewriting in the description logic EL and beyond. In Proceedings of the Twenty- Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, pages 3034–3040, 2015. 15. Ian Horrocks, Ulrike Sattler, Sergio Tessaris, and Stephan Tobies. How to decide query containment under constraints using a description logic. In Logic for Pro- gramming and Automated Reasoning, 7th International Conference, LPAR 2000, Reunion Island, France, November 11-12, 2000, Proceedings, pages 326–343, 2000. 16. Alexander K. Hudek, David Toman, and Grant E. Weddell. On enumerating query plans using analytic tableau. In Automated Reasoning with Analytic Tableaux and Related Methods - 24th International Conference, TABLEAUX 2015, Wroclaw, Poland, September 21-24, 2015. Proceedings, pages 339–354, 2015. 17. Vitaliy L. Khizder, David Toman, and Grant Weddell. Reasoning about Duplicate Elimination with Description Logic. In DOOD, pages 1017–1032, 2000. 18. Stanislav Kikot, Roman Kontchakov, Vladimir V. Podolskii, and Michael Za- kharyaschev. Long rewritings, short rewritings. In Proceedings of the 2012 In- ternational Workshop on Description Logics, DL-2012, Rome, Italy, June 7-10, 2012, 2012. 19. Roman Kontchakov, Carsten Lutz, David Toman, Frank Wolter, and Michael Za- kharyaschev. The combined approach to query answering in DL-Lite. In Proc. KR, pages 247–257, 2010. 20. Roman Kontchakov, Carsten Lutz, David Toman, Frank Wolter, and Michael Za- kharyaschev. The combined approach to ontology-based data access. In Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), pages 2656–2661, 2011. 21. Carsten Lutz and Leif Sabellek. Ontology-mediated querying with the description logic EL: trichotomy and linear datalog rewritability. In Proceedings of the Twenty- Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, pages 1181–1187, 2017. 22. Carsten Lutz, Inanç Seylan, David Toman, and Frank Wolter. The combined approach to OBDA: Taming role hierarchies using filters. In ISWC (1), pages 314–330, 2013. 23. Carsten Lutz, David Toman, and Frank Wolter. Conjunctive query answering in the description logic EL using a relational database system. In Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), pages 2070–2075, 2009. 24. Carsten Lutz and Frank Wolter. Non-uniform data complexity of query answering in description logics. In Principles of Knowledge Representation and Reasoning: Proceedings of the Thirteenth International Conference, KR 2012, 2012. 25. Stephanie McIntyre, Alexander Borgida, David Toman, and Grant E. Weddell. On limited conjunctions and partial features in parameter-tractable feature logics. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, pages 2995–3002, 2019. 26. Stephanie McIntyre, David Toman, and Grant E. Weddell. FunDL - A family of feature-based description logics, with applications in querying structured data sources. In Description Logic, Theory Combination, and All That - Essays Dedi- cated to Franz Baader on the Occasion of His 60th Birthday, pages 404–430, 2019. 27. Jason St. Jacques, David Toman, and Grant E. Weddell. Object-relational queries over CFDI nc knowledge bases: OBDA for the SQL-Literate. In Proc. IJCAI, pages 1258–1264, 2016. 28. David Toman and Grant E. Weddell. On keys and functional dependencies as first-class citizens in description logics. J. Aut. Reasoning, 40(2-3):117–132, 2008. 29. David Toman and Grant E. Weddell. Fundamentals of Physical Design and Query Compilation. Synthesis Lectures on Data Management. Morgan & Claypool Pub- lishers, 2011. 30. David Toman and Grant E. Weddell. Conjunctive Query Answering in CFDnc : A PTIME Description Logic with Functional Constraints and Disjointness. In Australasian Conference on Artificial Intelligence, pages 350–361, 2013. 31. David Toman and Grant E. Weddell. An interpolation-based compiler and opti- mizer for relational queries (system design report). In IWIL@LPAR 2017 Workshop and LPAR-21 Short Presentations, Maun, Botswana, May 7-12, 2017, 2017.