Exact Query Reformulation over SHOQ DBoxes Enrico Franconi, Volha Kerhet, Nhung Ngo Free University of Bozen-Bolzano, Italy lastname@inf.unibz.it Abstract We formalise the problem of query reformulation over a description lo- gic ontology and a DBox in a general framework. This framework supports decid- ing the existence of a safe-range first-order equivalent reformulation of a concept query in terms of the signature of a DBox. A constructive method to compute the reformulation is provided. We are particularly interested in safe-range reformula- tions since they can be transformed to relational queries and executed using SQL. We also discuss the completeness of the proposed framework with respect to fi- nite and unrestricted models. As a case study we consider ontologies and queries expressed in SHOQ. 1 Introduction In this paper we study and develop a query rewriting framework which is applicable to description logics systems where data is stored in a classical finite relational data- base, in a way that in the literature has been called DBox [5,6]. A DBox is a set of ground atoms which semantically behaves like a database, i.e., the interpretation of the database predicates in the DBox is exactly equal to the database relations. The DBox predicates are closed, i.e., their extensions are the same in every interpretation, whereas the other predicates in the knowledge base are open, i.e., their extensions may vary among different interpretations. We do not consider here the open interpretation for the database predicates – i.e., the classical ABox. In an ABox, the interpretation of database predicates contains the database relations and possibly more. This notion clearly is less faithful in the representation of a database semantics since it would allow for spurious interpretations of database predicates with additional unwanted tuples not present in the original database. In our general framework an ontology is a TBox in a first-order description logic, and queries are concept expressions. Within this setting, the framework provides sup- port to decide the existence of a relational algebra (i.e., safe-range first-order) equivalent (a.k.a. exact) reformulation of the query in terms of the DBox signature. It also provides an effective approach to construct the reformulation. We are particularly interested in safe-range reformulations of queries because their range-restricted syntax is needed to reduce the original query answering problem to a relational algebra evaluation (e.g., via SQL) over the original database [7]. Our framework points out several conditions on the ontology and the query in order to guarantee the existence of a safe-range equivalent reformulation. We show that these conditions are not infeasible in practice and we also provide an efficient method to ensure their validation. Standard tableau techniques can be used to compute the reformulation. In order to be complete, our framework is applicable to ontologies and queries ex- pressed in any fragment of first-order logic enjoying the finitely controllable determin- acy [3,8]. If the employed logic does not enjoy the finitely controllable determinacy our approach would become sound but incomplete, by still effectively implementable using standard theorem proving techniques. We have explored non-trivial applications where the framework is complete; in this paper, the application with SHOQ ontology and concept queries is discussed. We show how (i) to check whether the answers to a given query with an ontology are solely determined by the extension of the DBox predicates and, if so, (ii) to find an equivalent rewriting of the query in terms of the DBox pre- dicates to allow the use of standard database technology for answering the query. This means we benefit from the low computational complexity in the size of the data for answering queries on relational databases. In addition, it is possible to reuse standard techniques of description logics reasoning to find rewritings, such as in [5]. The query reformulation problem has received strong interest in classical relational database research as well as modern knowledge representation studies. Differently from the mainstream research on query reformulation [9], which is mostly based on per- fect or maximally contained rewritings with sound views (see, e.g., the DL-Lite ap- proach [10]), we focus here on exact rewritings with exact views, since it character- ises more precisely the query answering problem with ontologies and databases, and it allows for very expressive ontology languages. An exact reformulation has the same answer in any model of the ontology with the DBox, and it provides a fully determined answer, which may be useful, e.g., for materialisation. This work extends the seminal works on exact rewritings with exact views [2,5,3] by focussing on safe-range reformulations and on the conditions ensuring their existence in description logics. This is necessary when the description logic at hand is not enjoying the Beth definability property [11], which would guarantee the rewriting to be safe- range. The detailed algorithms and all the proofs for a more general framework are available in the technical report [8]. The paper is organised as follows. Section 2 provides the necessary formal back- ground and definitions. Section 3 introduces a characterisation of the query reformula- tion problem, and the conditions allowing for an effective reformulation are analysed. At the end, we discuss in details the application to SHOQ ontologies with a DBox. 2 Preliminaries In this section we define the basic concepts that are used in the paper. 2.1 Description Logics and DBox Let NC , NR and NI be sets of concept, role and individual names respectively. And let L(NC , NR , NI ) be some description logic language over NC , NR and NI . An ontology is a set of TBox assertions in L(NC , NR , NI ). Let C be a (possibly complex) concept or an assertion in L(NC , NR , NI ). We de- note as σ(C) the signature of C, that is the union of all concept, role and individual names occurring in C. A DBox D is a finite set of atomic concept and role assertions of the form A(a) and R(a, b) respectively, where A ∈ NC , R ∈ NR and a, b ∈ NI . The sets of all concept, role and individual names appearing in D are denoted as σD (C), σD (R) and σD (I) respectively. We call DBox predicates the set σD (P ) = σD (C) ∪ σD (R). As usual, an interpretation I = h∆I , ·I i includes a domain ∆I and an interpreta- tion function ·I that maps concepts to subsets of ∆I , roles to binary relations on ∆I and individuals to elements of ∆I . We say that an interpretation I embeds a DBox D, written I(D) , if it holds that: (i) aI = a for every DBox individual a ∈ σD (I), i.e. a follows the standard name assumption (SNA) [7]; (ii) for every concept name A in σD (C) and every u ∈ ∆I , u ∈ AI if and only if A(u) ∈ D; and (iii) for every role name R in σD (R) and every pair (u, v) ∈ ∆I × ∆I , (u, v) ∈ RI if and only if R(u, v) ∈ D. In other words, in every interpretation embedding D, the interpretation of any DBox predicate is always the same and it is given exactly by its content in the DBox; this is, in general, not the case for the interpretation of the non-DBox predicates. Under above embedding conditions, we say that all the DBox predicates are closed, while all the other predicates are open and may be interpreted differently in different interpretations. In order to allow for an arbitrary DBox to be embedded, we generalise the standard name assumption to all the individual names in NI ; this implies that the domain of any interpretation necessarily includes the set of all the individual names NI . We denote an interpretation I with a specific domain ∆ as I (∆) . Given an in- terpretation I, we denote as I|S the interpretation restricted to the smaller signature S ⊆ NC ∪ NR ∪ NI , i.e., the interpretation with the same domain ∆I and the same interpretation function ·I defined only for the concept, role and individual names from the set S. We call F OL(C, P) a function free first-order language with equality over a signa- ture Σ = (C, P), where C = NI is a set of constants and P = NC ∪ NR is a set of predicates with arities 1 (for concept names) and 2 (for role names). An interpretation in which an assertion ϕ (TBox or ABox) is true is called a model of ϕ; the set of all models of ϕ is denoted as M (ϕ). The set of all models of all assertions in an ontology T is denoted as M (T ). We say that a DBox D is legal for an ontology T if there exists a model of T embedding D. In the paper, we consider only consistent non-tautological ontologies and legal DBoxes. 2.2 Queries and certain answers A query is a concept in L(NC , NR , NI ). Given a query Q, we define its certain answer to a DBox D under T as follows: Definition 1 (Certain Answer) The (certain) answer of a query Q to a DBox D under an ontology T is the set of individuals: {a ∈ NI | ∀ I(D) ∈ M (T ) : I(D) |= Q(a)}. We now show that we can weaken the standard name assumption for the constants by just assuming unique names, without changing the certain answers. As we said be- fore, an interpretation I embedding a DBox D satisfies the standard name assumption – written I(D)SNA – if cI = c for any c ∈ NI . Alternatively, an interpretation I embed- ding a DBox D satisfies the unique name assumption – written I(D)UNA – if aI 6= bI for any different a, b ∈ NI . The following proposition allows us to freely interchange the standard name and the unique name assumptions in dealing with interpretations embed- ding DBoxes. This is a practical advantage, since most description logics reasoners do have a native unique name assumption. Proposition 1 (SNA vs UNA) For any query Q(x), ontology T and DBox D, {a ∈ NI | ∀ I(D)SNA ∈ M (T ) : I(D)SNA |= Q(a)} = {a ∈ NI | ∀ I(D)UNA ∈ M (T ) : I(D)UNA |= Q(a)}. A query is DBox-relativised if and only if its answer is bounded by the DBox. Definition 2 (DBox-relativised query) A concept query Q is DBox-relativised under ontology T , if in each model of T the interpretation of Q includes only domain elements which are among the interpretation of DBox predicates or of individuals from T or Q. 2.3 Safe-range formulas Since a query can be an arbitrary first-order formula, its answer can be infinite (since the domain is not restricted to be finite) or it may depend on the domain. To eliminate such cases, we will consider domain independent queries. For example, the query Q = ¬Student over the DBox Student(A), Student(B), with domain {A, B, C} has the answer {x = C}, with domain {A, B, C, D} has the answer {x = C, x = D}, and if we change the domain to an infinite one, the answer will be infinite even in presence of such a finite database. Therefore, the notion of domain independent queries has been introduced in relational databases. In general, the problem of checking whether a FOL formula is domain independent is undecidable [7]. The well known safe-range syntactic fragment of FOL introduced by Codd is an equally expressive language; indeed any safe-range formula is domain independent, and any domain independent formula can be easily transformed into a logically equivalent safe-range formula. Intuitively, a formula is safe-range if and only if its variables are bounded by positive predicates or equalities – for the exact syntactical definition see, e.g., [7]. For example, the formula ¬A(x) ∧ B(x) is safe-range, while queries ¬A(x) and ∀x. A(x) are not. To check whether a formula is safe-range, the formula is transformed into a logically equivalent safe-range normal form and its range restriction is computed according to a set of syntax based rules; the range restriction of a formula is a subset of its free variables, and if it coincides with the free variables then the formula is said to be safe-range. Any formula in F OL(C, P) can be transformed to a logically equivalent safe range normal form (SRNF) by recursively applying the following steps : – Variable substitution: no distinct pair of quantifiers may employ same variable. – Elimination of universal quantifiers – Elimination of implications – Pushing negation – Flattening of and/or A formula is said to be SRNF if none of the aforementioned steps can be applied any more. Let ϕ be a formula in F OL(C, P), we denote the set of all variables appearing in ϕ as VAR(ϕ), and the set of the free variables appearing in ϕ as FREE(ϕ). The safe range normal form of ϕ is denoted as SRNF(ϕ). Let ϕ be a formula in SRNF. The range restriction of ϕ, denoted as rr(ϕ), is either a subset of FREE(ϕ) or ⊥, and it is computed according to the following rules: – rr(R(t1 , . . . , tn )) = VAR(R(t1 , . . . , tn )); – rr(x = y) = ∅; – rr(x = c) = {x}, where c ∈ C; – rr(ϕ1 ∧ ϕ2 ) = rr(ϕ1 ) ∪ rr(ϕ2 ); – rr(ϕ1 ∨ ϕ2 ) = rr(ϕ1 ) ∩ rr(ϕ2 ); – rr(ϕ∧x = y) = rr(ϕ), if {x, y}∩rr(ϕ) = ∅; and rr(ϕ∧x = y) = rr(ϕ)∪{x, y} otherwise; – rr(¬ϕ) = ∅ ∩ rr(ϕ); – rr(∃xϕ) = rr(ϕ) \ x if x ∈ rr(ϕ) and rr(∃xϕ) = ⊥ otherwise. We consider ⊥ as a special set, such that for any set Z : ⊥ ∪ Z = ⊥ ∩ Z = ⊥\Z = Z\⊥ = ⊥. If ϕ is not in SRNF, then rr(ϕ) := rr(SRNF(ϕ)). We say that a variable x ∈ FREE(ϕ) has restricted range in ϕ if x ∈ rr(ϕ). Definition 3 (Safe range) A formula ϕ is safe range iff rr(SRNF(ϕ)) = FREE(ϕ). We also consider a weaker version of safe-range property called ground safe-range. Given a formula, its grounding is the formula itself where all free variables are replaced by new constants. Definition 4 (Ground safe-range) A formula is ground safe-range if its grounding is safe-range. The safe-range fragment of first-order logic with the standard name assumption is equally expressive to the relational algebra, which is the core of the SQL query lan- guage [7]. For any concept C in L(NC , NR , NI ) we denote the corresponding logically equi- valent formula in FOL(C, P) with one free variable x as C(x). We will call any axiom (concept) in L(NC , NR , NI ) (ground) safe-range, if the corresponding logically equi- valent formula in FOL(C, P) is (ground) safe-range. An ontology T in L(NC , NR , NI ) is safe-range, if every formula in T is safe-range. 3 Exact Safe-range Query Reformulation In this section we state the problem of finding a first-order safe-range reformulation of a concept query. We then find the conditions to reduce the original query answering problem – which corresponds to an entailment problem – to a model checking problem of the reformulation over the DBox. Let us consider the class of queries of interest. The certain answer to a query in- cludes all the individuals which make the query true in all the models of the ontology: so, if an individual would make the query true only in some model, then it would be discarded from the certain answer. In other words, it may be the case that the answer to the query is not necessarily the same among all the models of the knowledge base. In this case, the query is not fully determined by the given source data; indeed, there is some answer which is possible, but not certain. Due to the indeterminacy of the data wrt the query, the complexity to compute the certain answer in general increases, and it corresponds to the complexity of entailment in the logic. In this paper we focus on the case when a query has the same answer over all the models of the ontology, namely, on the case when the information requested by the query is fully available from the source data without ambiguity. In this way, the indeterminacy disappears, and the complexity of the process may decrease. A query is definable [12] if its truth value in any model of the ontology depends only on the domain and on the interpretation of the database predicates and constants. The answer of a definable query does not depend on the interpretation of non-database pre- dicates. Once the database and a domain are fixed, it is never the case that an individual would make the query true in some model of the knowledge base and false in others, since the truth value of an implicitly defined query depends only on the interpretation of the database predicates and constants and on the domain (which are fixed). [12] proved that, in first-order logic, looking for definable queries from the DBox predicates amounts at having an exact reformulation of the query in terms of the DBox predicates. Definition 5 (Exact Reformulation) The FOL(C, P) formula Q b is an exact reformu- lation of Q under T over σD (P ) if σ(Q) b ⊆ σD (P ) and T |= ∀x.Q(x) ↔ Q(x). b Since we are dealing with finite databases, in the following we will focus on those fragments of FOL(C, P) for which the exact reformulation over unrestricted models and over finite models coincide; we say that these fragments have finitely controllable determinacy. Given DBox predicates σD (P ), an ontology T , and a query Q in the L(NC , NR , NI ) language, our goal is to find a safe-range exact reformulation Qb of Q in FOL(C, P) expressed in terms of DBox predicates, that being evaluated as a relational algebra ex- pression over a legal DBox (e.g., using a relational database system with SQL) gives the same answer as the certain answer of Q to the DBox under T . Since an exact reformulation is equivalent under the ontology to the original query, the certain answer of the original query and of the reformulated query are identical. More precisely, the following proposition holds. b be an exact reformulation of Q under T over σD (P ), then: Proposition 2 Let Q {a ∈ NI | ∀ I(D) ∈ M (T ) : I(D) |= Q(a)} = {a ∈ NI | ∀ I(D) ∈ M (T ) : I(D) |= Q(a)}. b From the above equation it is clear that in order to answer an exactly reformulated query, one still may need to consider all the models I(D) of the ontology embedding the DBox – i.e., we still have an entailment problem to solve. The following theorem states the condition to reduce the original query answering problem – based on entailment – to the problem of checking the validity of the exact reformulation over a single model: the condition is that the reformulation should be safe-range. Theorem 1 (Adequacy of Exact safe-range Query Reformulation) Let T be an on- tology in L(NC , NR , NI ), Q be a query in L(NC , NR , NI ) and D be a legal DBox for T . If Q b is an exact reformulation of Q under T over σD (P ) and Q b is safe-range, then: {a ∈ NI | ∀ I(D) ∈ M (T ) : I(D) |= Q(a)} = {a ∈ adom(σ(Q),b D) | I (NI ) |σ (P )∪N |= Q(a)}, b (D) D I b D) consists of all the constants from Q where adom(σ(Q), b and from the assertions in D corresponding to concept and role names appearing in Q. b A safe-range reformulation is necessary to transform a first-order query to a rela- tional algebra query which can then be evaluated by using SQL techniques. The theorem above shows in addition that being safe-range is also a sufficient property for an exact reformulation to be correctly evaluated as an SQL query. However, given an arbitrary input (an ontology, a DBox and a concept query), one can not guarantee the existence of an exact safe-range reformulation. Therefore, in the rest of this section we introduce conditions on the input to get an exact safe-range reformulation. Moreover, since we are dealing with a finite DBox, we have also to consider the condition under which the existence of an exact safe-range reformulation under unrestricted reasoning and under finite reasoning coincide. Let Q be any formula and Q e the formula obtained from it by uniformly replacing every occurrence of each non-DBox predicate P with a new predicate Pe. We extend this renaming operator e· to any set of formulas in a natural way. Then the following constructive theorem gives us sufficient conditions to check the existence of an exact safe-range reformulation. Theorem 2 (Constructive Theorem) If the following conditions hold: 1. T ∪ Te |= Q ≡ Q;e 2. Q is DBox-relativised under T ; 3. Q is ground safe-range; 4. T is safe-range; b of Q in FOL(C, P) over σD (P ) then there exists an exact safe-range reformulation Q under T . The above conditions can be divided into two groups: the first condition forces the exist- ence of an exact reformulation, while the three last conditions guarantee its safe-range property. The first condition says that one does not need to consider non-DBox pre- dicates to answer the query. In other words, its answer in any model of the ontology depends only on the domain and on the interpretation of the DBox predicates and con- stants. This property of a query is called implicit definability from a set of predicates (the DBox predicates) in first-order logic [12]. The second condition points out that the answer of the query is necessarily in the set of individuals appearing in the DBox original query or ontology. The first two conditions are necessary to have an exact safe-range reformulation, i.e. if there is an exact safe-range reformulation, then the original query should be implicitly definable and DBox-relativised. The last two conditions are just sufficient ones, as the following example shows. Example 1 Let P = {A, B, C}, σD (I) = {C}, T = {A ≡ C, > v B}, Q = A u B. – Q is implicitly definable from the DBox predicates under T because the first asser- tion of T gives an explicit definition of Q; – Q is safe-range; – Q is DBox-relativised under T because of the first assertion of T . – Q(x) b = C is an exact safe-range reformulation of Q under T over σD (I). But T is not safe-range because of the second assertion. t u 4 A case study: SHOQ Syntax Semantics A AI ⊆ ∆I R RI ⊆ ∆I × ∆I C uD C I ∩ DI C tD C I ∪ DI ¬C ∆I \C I ∃R.C {x|exists y such that (x, y) ∈ RI and y ∈ C I } ∀R.C {x|forall y (x, y) ∈ RI implies y ∈ C I } {o} {o}I ⊆ ∆I ≥ nR.C {x|#({y|(x, y) ∈ RI } ∩ C I ) ≥ n} ≤ nR.C {x|#({y|(x, y) ∈ RI } ∩ C I ) ≤ n} Table 1. Syntax and semantics of SHOQ concepts SHOQ is an extension of the description logic ALC with transitive roles, role hier- archies, qualified number restrictions, and individuals; it is a fragment of first-order logic and of OWL2. The syntax and semantics of SHOQ is summarised in table 1, where A is an atomic concept, C and D are concepts, o is an individual name and R is an atomic role. SHOQ is a pretty much standard description logic; for more details see, e.g., [13]. A TBox in SHOQ is a set of concept inclusion axioms C v D, role inclusion axioms R v S, and transitivity axioms Trans(R) (where C, D are concepts and R, S are atomic roles) with the usual expected semantics. In this section, we present an application of our framework where the ontology is a TBox in SHOQ, and the query is a SHOQ concept. Finitely controllable determinacy. Does SHOQ have finite controllability of determ- inacy? It is enough to check that the entailment T ∪ Te |= Q ≡ Q e coincide in the unrestricted and finite cases. The finite controllability of this equivalence axiom entail- ment in SHOQ is guaranteed because of the two following reasons: – The entailment T ∪ Te |= Q ≡ Qe can be reduced in SHOQ to a concept satisfiab- ility problem for an empty TBox. – SHOQ has finite model property [14]. So, we can use a standard SHOQ reasoner (e.g., an OWL2 reasoner) to check the first condition. Safe-range ontology. Let’s now check whether a SHOQ ontology is safe-range. Role inclusion and transitivity axioms are always safe-range. Unfortunately, concept inclu- sion axioms in SHOQ ontologies may not be safe-range: for example, the axiom ¬ male v female is not safe-range. It is easy to see that an axiom C v D is not safe- range if and only if C(x) is not safe-range and D(x) is safe-range: just observe that the axiom is logically equivalent to the formula ¬∃x. C(x) ∧ ¬D(x) in FOL(C, P). The following proposition provides rules deciding whether a SHOQ concept is safe-range. Proposition 3 Let A be an atomic concept, C and D be SHOQ concepts. Then the open formulas: 1. A(x), (∃R.C)(x), {o}(x), (≥ nR.C)(x) are safe-range; 2. (∀R.C)(x), (≤ nR.C)(x) are not safe-range; 3. (C u D)(x) is safe-range if and only if C(x) is safe-range or D(x) is safe-range; 4. (C t D)(x) is safe-range if and only if C(x) is safe-range and D(x) is safe-range; 5. ¬C(x) is safe-range if and only if C(x) is not safe-range. Proposition 4 For any SHOQ concept C, C(x) is ground safe-range. The presence of non-safe-range axioms in an ontology would prevent the applica- tion of our framework, but we argue that non-safe-range axioms should not appear in a cleanly designed SHOQ ontology, and, if present, they should be fixed. Indeed, the use of absolute negative information in the subsumee – such as in the axiom “a non- male is a female" (¬ male v female) – should be deprecated by a clean design methodology, since the subsumer would include all sorts of objects in the universe (but the ones of the subsumee type) without any obvious control. Only relativised negative information in the subsumee should be allowed – such as in the axiom “a non-male person is a female" (person u ¬ male v female). This observations suggests a fix for non-safe-range axioms: for every non-safe-range axiom C v D users will be asked to replace it by the safe-range one C u E v D, where E is an arbitrary safe-range concept. Therefore, the user is asked to make explicit the type of the subsumee, in a way to make it safe-range; note that the type could be also a fresh new atomic concept. We believe that the fix we are proposing for SHOQ is a reasonable one, and would make all SHOQ ontologies eligible to be used with our framework. Ground safe-range and DBox-relativised query. Let T be a SHOQ ontology, and Q an implicitly definable query, which is a possibly complex concept in SHOQ. In order to be able to use our framework, a query should be ground safe-range and DBox-relativised under the ontology. We already know by proposition 4 that a concept query is always ground safe-range. A query is DBox-relativised if it returns only DBox individuals; it may be strange for a user to issue a query which is not meant to return just DBox objects. One can check if Q is DBox-relativised under the ontology by using the following proposition. Proposition 5 The query Q is DBox-relativised under T if and only if: k G n G m G T |=SHOIQ Q v {oi } t Ai t (∃Ri .> t ∃Ri− .>), (1) i=1 i=1 i=1 where {A1 , . . . , An } is the set of all DBox concepts appearing in T and Q; {R1 , . . . , Rm } is the set of all DBox roles appearing in T and Q; and {o1 , . . . , ok } is the set of all individual names appearing in T and Q. In other words, if the SHOIQ entailment in the proposition is valid, then the query is DBox-relativised under the ontology. We use SHOIQ instead of SHOQ because we need inverse roles. Due to the incompleteness wrt finite model reasoning of the SHOIQ test, one might conclude that a query is not DBox-relativised but in fact it is DBox-relativised under finite model reasoning. In the rare case a user is issuing a real non-DBox-relativised query, or a DBox-relativised query which failed the above test due to its incompleteness, we would ask the user to conjoin the query with a safe-range concept composed only by database atomic concepts, which would become the type of the query. We believe that also this fix for the queries is a reasonable one, and would make all queries eligible to be used with our framework. A complete procedure. Given a SHOQ ontology T , a legal DBox D and a concept query Q, one can apply the procedure below to generate a safe-range exact reformula- tion over the DBox predicates. Input: A SHOQ TBox T , a concept query Q in SHOQ and a DBox predicates (DBox atomic concepts and roles). 1. Check implicit definability of the query Q by testing T ∪ Te |= Q ≡ Q e using standard DL reasoner of SHOQ. If it is the case, continue. 2. Check whether T is safe-range, and fix it if it is not safe-range. 3. Check the DBox-relativisation of Q, and fix it if it is not DBox-relativised. 4. Use the constructive theorem to (a) compute a ground safe-range reformulation Q0 (x) from the tableau proof gen- erated in step 1 (this is an extension of what has been presented in [5,11]; see [8] for a complete characterisation); (b) transform it to a safe-range one as follows: Q(x) b := Q0 (x)∧ ADOM (x),where ADOM is a predicate containing all the individuals in the DBox, in T , and in Q. ADOM actually represents the subsumer of the TBox axiom in (1). Output: A safe-range first-order exact reformulation Q(x)b expressed over the DBox predicates. Note that the above procedure could be executed once for all at compile time: in- deed, it could be run for each atomic concept in the ontology, and the outcome for each of them could be stored persistently, if the reformulation has been successful. 5 Conclusion We have introduced a framework to compute the exact reformulation of concept queries to a DBox in description logics. We have found the conditions which guarantee that a safe-range reformulation exists, and we show that it can be evaluated as a relational algebra query over the database to give the same answer as the original query under the ontology. A non-trivial case study has been presented in the field of description logics, with the SHOQ language. As a future work, we would like to study optimisations of reformulations. From the practical perspective, since there might be many rewritten queries from one original query, the problem of selecting an optimised one in terms of query evaluation is very important. In fact, one has to take into account which criteria should be used to optim- ise, such as: the size of the rewritings, the numbers of used predicates, the priority of predicates, the number of relational operators, and clever usage of duplicates. We wish to thank David Toman, İnanç Seylan, Jos de Bruijn, Alex Borgida, Grant Weddell, Tommaso Di Noia, Umberto Straccia, Balder ten Cate with whom we have learnt a lot about query rewriting based on Beth definability, and anonymous reviewers for insightful comments. References 1. Etzioni, O., Golden, K., Weld, D.S.: Sound and efficient closed-world reasoning for planning. Artif. Intell. 89 (January 1997) 113–148 2. Marx, M.: Queries determined by views: pack your views. In: Proceedings of the 26th ACM symposium on Principles of Database Systems. PODS ’07 (2007) 23–30 3. Nash, A., Segoufin, L., Vianu, V.: Views and queries: Determinacy and rewriting. ACM Trans. Database Syst. 35 (July 2010) 21:1–21:41 4. Fan, W., Geerts, F., Zheng, L.: View determinacy for preserving selected information in data transformations. Inf. Syst. 37 (March 2012) 1–12 5. İnanç Seylan, Franconi, E., de Bruijn, J.: Effective query rewriting with ontologies over DBoxes. In: Proc. of the 21st International Joint Conference on Artificial Intelligence (IJCAI 2009). (2009) 923–925 6. Franconi, E., Ibanez-Garcia, Y.A., İnanc Seylan: Query answering with DBoxes is hard. Electronic Notes in Theoretical Computer Science, Elsevier 278 (November 2011) 71–84 7. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995) 8. Franconi, E., Kerhet, V., Ngo, N.: Exact query reformulation with expressive ontolo- gies and databases. Technical Report 12158, KRDB Tech research group, Free Univer- sity of Bozen-Bolzano (March 2012) http://www.inf.unibz.it/krdb/pub/TR/ KRDB-Tech-12158.pdf. 9. Halevy, A.Y.: Answering queries using views: A survey. The VLDB Journal 10 (December 2001) 270–294 10. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: The DL-Lite family and relations. J. Artif. Intell. Res. (JAIR) 36 (2009) 1–69 11. ten Cate, B., Franconi, E., İnanç Seylan: Beth definability in expressive description logics. In: Proc. of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011). (2011) 1099–1106 12. Beth, E.: On Padoa’s method in the theory of definition. Indagationes Mathematicae 15 (1953) 330–339 13. Horrocks, I., Sattler, U.: Ontology reasoning in the SHOQ(D) description logic. In: In Proc. of the 17th Int. Joint Conf. on Artificial Intelligence (IJCAI 2001). (2001) 199–204 14. Lutz, C., Areces, C., Horrocks, I., Sattler, U.: Keys, nominals, and concrete domains. J. Artif. Int. Res. 23 (June 2005) 667–726