=Paper=
{{Paper
|id=None
|storemode=property
|title=Towards Practical Query Answering for Horn-SHIQ
|pdfUrl=https://ceur-ws.org/Vol-846/paper_20.pdf
|volume=Vol-846
|dblpUrl=https://dblp.org/rec/conf/dlog/EiterOSTX12
}}
==Towards Practical Query Answering for Horn-SHIQ==
Towards Practical Query Answering for Horn-SHIQ? Thomas Eiter1 , Magdalena Ortiz1 , Mantas Šimkus1 Trung-Kien Tran2 , and Guohui Xiao1 1 Institute of Information Systems, Vienna University of Technology {eiter|ortiz|xiao}@kr.tuwien.ac.at simkus@dbai.tuwien.ac.at 2 STARLab, Vrije Universiteit Brussel truntran@vub.ac.be 1 Introduction Query answering has become a prominent reasoning task in Description Logics. This is witnessed not only by the high number of publications on the topic in the last decade, but also by the increasing number of query answering engines. A number of systems provide full conjunctive query (CQ) answering capabilities, including [1, 23, 25, 4, 11]. A common feature of these approaches is that they rely on existing technologies for relational or deductive databases. They focus on lightweight DLs like DL-Lite and EL, and they use query rewriting to reduce the problem of answering a query over a DL ontology to a database query evaluation problem. For more expressive DLs that are not tractable in combined complexity, however, CQs (with the complete first-order semantics) are not yet supported by current reasoning engines. A range of algorithms has been designed, but they serve for theoretical purposes such as showing complexity bounds and are not amenable to practical implementation. The only exception is the rewriting algorithm implemented in the R EQUIEM system, which covers ELHI [23], an expressive extension of EL for which standard reasoning is E XP T IME-hard. In this paper, we contribute to the development of practical query answering systems beyond DL-Lite and EL. We consider Horn-SHIQ, the Horn fragment of the popular DL SHIQ that underlies OWL DL. It combines all the expressive features of DL-Lite and EL, and simultaneously extends them with transitive roles, qualified number restric- tions and some universal quantification. Standard reasoning tasks in Horn-SHIQ are already E XP T IME-hard in combined complexity but, due to the absence of disjunction, they are polynomial in data complexity. Since in Horn-SHIQ models are significantly more complex than in DL-Lite and (most dialects of) EL, extending existing query rewriting techniques is not straightforward. The main contribution of this paper is a query answering method for Horn-SHIQ that appears to be promising for practicable systems, as confirmed by the experimental evaluation of a prototype implementation. – The core of the method is a novel query rewriting technique which transforms an input query q into a union Q of CQs (a UCQ) such that the answers of q over an ontology O = hT , Ai with TBox T and ABox A coincide with the answers over A ? This work was supported by the Austrian Science Fund (FWF) grants P20840 and T515. of a Datalog program comprising Q and some rules to complete A, showing Datalog- rewritability of CQ answering in Horn-SHIQ. – Naturally, the set Q may be exponential in the size of q in the worst case, but in practice we obtain rewritings Q that are of manageable size for real world ontologies. This is mostly due to the fact that new queries are only generated taking into account the anonymous domain elements implied by the terminology. Notably our algorithm is worst-case optimal in both data and combined complexity. – We describe a prototype implementation of the approach that uses off-the-shelf Datalog reasoners. Despite being preliminary and lacking sophisticated optimizations, it shows that the approach is promising. It can answer queries efficiently over Horn- SHIQ ontologies and scales down nicely to DL-Lite, where it is competitive with state of the art query rewriting systems. – The technique works for full Horn-SHIQ and arbitrary CQs, but the imple- mented version does not support transitive (or more generally, non-simple) roles in the query. To keep presentation simple, we present here only the case without transi- tive roles, and refer to [6, 9] for a more general version with transitive roles and richer queries formulated in weakly DL-safe Datalog. 2 Preliminaries Horn-SHIQ The syntax and semantics of Horn-SHIQ is defined in the usual way. A role is a role name p or its inverse p− . A Horn-SHIQ TBox T in normal form is a set of role inclusion axioms r v s, transitivity axioms trans(r), and general concept inclusion axioms (GCIs) of the forms (F1) A1 u. . .uAn vB, (F2) A1 v∀r.B, (F3) A1 v ∃r.B, and (F4) A1 v 61 r.B,where A1 , . . . , An , B are concept names and r, s are roles. Axioms (F3) are called existential. W.l.o.g. we consider only Horn-SHIQ TBoxes in normal form [16, 17]. We call s transitive in T , if trans(s) ∈ T or trans(s− ) ∈ T , and we call s simple in T , if there is no transitive r in T s.t. r v∗T s, where v∗ is the reflexive transitive closure of {(r, s) | r v s ∈ T or inv(r) v inv(s) ∈ T }. Only simple roles are allowed in axioms of the form (F4). An ABox A is a set of assertions A(a) and r(a, b), where A is a concept name, r a role, and a, b are individuals; the set of all individuals is denoted by NI . An ontology is a pair (T , A) of a TBox T and an ABox A. The semantics is given by interpretations I = h∆I , ·I i in the usual way. A Horn-ALCHIQ TBox is a Horn-SHIQ TBox with no transitivity axioms. Horn- ALCHIQu TBoxes are obtained by allowing the conjunction r1 u r2 of roles r1 and r2 , interpreted (r1 u r2 )I = r1I ∩ r2I . We let inv(r1 u r2 ) = inv(r1 ) u inv(r2 ) and assume w.l.o.g. that for each role inclusion r v s of a Horn-ALCHIQu TBox T , (i) inv(r) v inv(s) ∈ T , and (ii) s ∈ {p, p− } for a role name p. For a set W and a concept or role conjunction Γ = γ1 u . . . u γm , we write Γ ⊆ W for {γ1 , . . . , γm } ⊆ W . Conjunctive Queries A conjunctive query (CQ) is an expression of the form q(u) ← p1 (v1 ), . . . , pm (vm ) where each pi (vi ) is an atom of the form A(x) or r(x, y), where A is a concept name and r is a role, x, y are variables, and q is a special query predicate not occurring elsewhere. For convenience, we may identify such a CQ with the set of its atoms, and Table 1: Inference rules. M (0) , N (0) , (resp., S (0) ) are conjunctions of atomic concepts (roles); A, B are atomic concepts M v ∃S.(N u N 0 ) N v A M v ∃(S u S 0 ).N Svr M v ∃S.(N u ⊥) Rc v Rr v R⊥ 0 0 M v ∃S.(N u N u A) M v ∃(S u S u r).N M v⊥ M v ∃(S u r).N A v ∀r.B M v ∃(S u inv(r)).(N u A) A v ∀r.B R∀ R− ∀ M u A v ∃(S u r).(N u B) M vB M v ∃(S u r).(N u B) Av 61 r.B M 0 v ∃(S 0 u r).(N 0 u B) R≤ M u M 0 u A v ∃(S u S 0 u r).(N u N 0 u B) M v ∃(S u inv(r)).(N1 u N2 u A) Av 61 r.B N1 u A v ∃(S 0 u r).(N 0 u B u C) R− ≤ M uBvC M u B v ∃(S u inv(S 0 u r)).(N1 u N2 u A) S use q(u) (or simply q) to refer to it. We call u ⊆ 1≤i≤m vi the distinguished variables of q. A match for q in I is a mapping from variables in q to elements in ∆I such that |u| π(t) ∈ pI for each atom p(t) of q. The answer to q over O is the set of all c ∈ NI such that in every model I of O some match π for q exists with π(u) = (c)I . Elimination of Transitivity. As usual, transitivity axioms roles can be eliminated from Horn-SHIQ TBoxes. To obtain a Horn-ALCHIQ TBox T ∗ that is also in normal form, we can use the transformation from [16]. This transformation preserves satisfia- bility and, provided that queries contain only simple roles, also query answers (answers are not preserved for arbitrary queries unless the notion of match is suitably relaxed). In the rest of the paper we describe a procedure for answering CQs in Horn-ALCHIQu . 3 Canonical Models For answering CQs in Horn DLs usually the canonical model property is employed [7, 21, 2]. In particular, for a consistent Horn-ALCHIQu ontology O = (T , A), there exists a model I of O that can be homomorphically mapped into any other model I 0 of O. We show that such an I can be built in three steps: (1) close T under specially tailored inferences rules, (2) close A under all but existential axioms of T , and (3) extend A by “applying” the existential axioms of T . For Step (1) we use the calculus in Table 1, which is similar to [16, 20]. Given a Horn-ALCHIQu TBox T , we denote by Ξ(T ) the TBox obtained from T by ex- haustively applying the inference rules in Table 1. In Step (2) we simply ‘apply’ in the ABox all but existential axioms in T . For convenience, this is done using the set cr(T ) of Datalog rules in Table 2. Since every ABox A can be seen as a set of Datalog facts, A ∪ cr(T ) is a Datalog program (with constraints) which has a unique minimal Her- brand model J = MM (A ∪ cr(T )) if O is consistent. This model is almost a canonical model of (T , A); however, existential axioms may be violated. To deal with this, in Step (3) we extend J with new domain elements as required by axioms M v ∃r.N in Ξ(T ), using a procedure similar to the well known chase in databases. Table 2: (Completion rules) Datalog program cr(T ). B(y) ← A(x), r(x, y) for each A v ∀r.B ∈ T B(x) ← A1 (x), . . . , An (x) for all A1 u . . . uAn v B ∈ Ξ(T ) r(x, y) ← r1 (x, y), . . . , rn (x, y) for all r1 u . . . u rn v r ∈ T ⊥(x) ← A(x), r(x, y1 ), r(x, y2 ), B(y1 ), B(y2 ), y1 6= y2 for each Av 61 r.B ∈ T Γ ← A(x), A1 (x), . . . , An (x), r(x, y), B(y) for all A1 u . . . uAn v ∃(r1 u . . . urm ).(B1 u . . . uBk ) and Av 61 r.B of Ξ(T ) such that r=ri and B=Bj for some i, j with Γ ∈ {B1 (y), . . . , Bk (y), r1 (x, y), . . . , rk (x, y)} Definition 1. Let T be a Horn-ALCHIQu TBox and let I be an interpretation. A GCI M v ∃S.N is applicable at e ∈ ∆I if (a) e ∈ M I , (b) there is no e0 ∈ ∆I with (e, e0 ) ∈ S I and e0 ∈ N I , (c) there is no axiom M 0 v ∃S 0 .N 0 ∈ T such that e ∈ (M 0 )I , S ⊆ S 0 , N ⊆ N 0 , and S ⊂ S 0 or N ⊂ N 0 . An interpretation J obtained from I by an application of an applicable axiom M v ∃S.N at e ∈ ∆I is defined as: - ∆J = ∆I ∪ {d} with d a new element not present in ∆I (we call d a successor of e), - For each concept name A and each o ∈ ∆J , we have o ∈ AJ if (a) o ∈ ∆I and o ∈ AI ; or (b) o = d and A ∈ N . - For each role name r and o, o0 ∈ ∆J , we have (o, o0 ) ∈ rJ if (a) o, o0 ∈ ∆I and (o, o0 ) ∈ rI ; or (b) (o, o0 ) = (e, d) and r ∈ S; or (c) (o, o0 ) = (d, e) and inv(r) ∈ S. We denote by chase(I, T ) a possibly infinite interpretation obtained from I by applying the existential axioms in T . We require fairness: the application of an applicable axiom can not be infinitely postponed. We note that chase(I, T ) is unique up to renaming of domain elements. As usual in DLs, it can be seen as a ‘forest’: the application of existential axioms simply attaches ‘trees’ to an arbitrarily shaped model I. Theorem 1. Let O = (T , A) be a Horn-ALCHIQu ontology. Then O is consistent iff A ∪ cr(T ) consistent. Moreover, if O is consistent, then (a) chase(MM (A ∪ cr(T )), Ξ(T )) is a model of O, and (b) chase(MM (A ∪ cr(T )), Ξ(T )) can be homomorphi- cally mapped into any model of O. The proof of Theorem 1 can be found in [9]; see [21] for a proof of a similar result. Observe that checking consistency of O = (T , A) reduces to evaluating the Data- log program A ∪ cr(T ). We note that Ξ(T ) can be computed in exponential time in the size of T : the calculus only infers axioms of the form M v B and M v ∃S.N , where M, N are conjunctions of atomic concepts, B is atomic and S is a conjunction of roles, and there are exponentially many such axioms. 4 Query Rewriting The following theorem, which is immediate from Theorem 1, allows us to concentrate on models obtained by the chase procedure. q1 q10 q2 q20 A1 x1 A1 x1 A1 x1 A1 x1 r1 A v ∃(r u r2 ).(B u A3 ) r1 r1 r2 A v ∃(r u r3 u r4− ).(B u A3 ) r , r 1 2 A2 x2 x3 A2 , A x4 A4 A2 x2 A4 , A2 , A x3 r2 r3 r2 r3 r4 r3 r3 , r4− x3 A3 A4 x4 A3 A4 x4 A3 x3 A3 (a) Example 1 (b) Example 2 Fig. 1: Examples of query rewriting Theorem 2. Let O = (T , A) be a Horn-ALCHIQu ontology. Then A ∪ cr(T ) is con- sistent iff O is consistent. Moreover, if O is consistent, then ans(O, q) = ans(IO , q), where IO = chase(MM (A ∪ cr(T )), Ξ(T )). as IO can be infinite. Hence, we rewrite q Computing ans(IO , q) is still not trivial, S into a set Q of CQs such that ans(IO , q) = q0 ∈Q ans(MM (A ∪ cr(T )), q 0 ), that is, we can evaluate them over the finite MM (A ∪ cr(T )). The intuition is the following. Suppose q has a non-distinguished variable x, and that there is some match π in IO such that π(x) is an object in the ‘tree part’ introduced by the chase procedure, and it has no descendant in the image of π. Then for all atoms r(y, x) of q, the “neighbor” variable y must mapped to the parent of π(x). A rewrite step makes a choice of such an x, and employs an existential axiom from Ξ(T ) to ‘clip off’ x, eliminating all query atoms involving it. By repeating this procedure, we can clip off all variables matched in the tree part and obtain a query with a match in MM (A ∪ cr(T )). Definition 2 (Query rewriting). For a CQ q and a Horn-ALCHIQu TBox T , we write q →T q 0 if q 0 can be obtained from q in the following steps: (S1) Select in q an arbitrary non-distinguished variable x such that there are no atoms of the form r(x, x) in q. (S2) Replace each role atom r(x, y) in q, where y is arbitrary, by the atom inv(r)(y, x). (S3) Let Vp = {y | ∃r : r(y, x) ∈ q}, and select some M v ∃S.N ∈ Ξ(T ) such that (a) {r | r(y, x) ∈ q ∧ y ∈ Vp } ⊆ S, and (b) {A | A(x) ∈ q} ⊆ N . (S4) Drop from q each atom containing x. (S5) Rename each y ∈ Vp of q by x. (S6) Add the atoms {A(x) | A ∈ M } to q. We write q →∗T q 0 if q = q0 and q 0 = qn for some finite rewrite sequence q0 →T q1 · · · →T qn , n ≥ 0. Furthermore, we let rewT (q) = {q 0 | q →∗T q 0 }. Example 1. The query q1 (x1 ) ← A1 (x1 ), r1 (x1 , x2 ), A2 (x2 ), r2 (x2 , x3 ), A3 (x3 ), r3 (x2 , x4 ), A4 (x4 ) is depicted on the left hand side of Figure 1a. The node in bold corresponds to the answer variable x1 . Assume that Av∃(r u r2 ).(B u A3 ) and Av∃(r u r3 u r4− ).(B (1) Ontology (2) O Prepro- Saturation (6) cessing Datalog (7) Datalog (4) answers Translation Engine (5) Query Pre- (3) Query q processing Rewriting (1) ABox assertions (4) Existential axioms (7) Datalog rules (2) TBox axioms (5) Rewritten queries (3) Conjunctive queries (6) Axioms Fig. 2: C LIPPER system architecture uA3 ) are in Ξ(T ). If we pick for (S1) the variable x3 , we get Vp = {x2 } and we can select A v ∃(r u r2 ).(B u A3 ) ∈ Ξ(T ), as it satisfies (S3.a) and (S3.b). After performing (S4), (S5) and (S6) we obtain the rewritten query q10 (x1 ) ← A1 (x1 ), r1 (x1 , x3 ), A2 (x3 ), A(x3 ), r3 (x3 , x4 ), A4 (x4 ). Intuitively, we can safely remove from q1 all atoms containing x3 because the added atom A(x3 ) ensures that whenever q10 has a match so does q1 . Example 2 (ctd). Now we consider the query q2 (x1 ) ← A1 (x1 ), r2 (x1 , x2 ), A2 (x2 ), r3 (x2 , x3 ), A3 (x3 ), r1 (x1 , x4 ), A4 (x4 ), r4 (x3 , x4 ) in Figure 1b. We choose the variable x3 , replace r4 (x3 , x4 ) by r4− (x4 , x3 ) in step (S2), and get Vp = {x2 , x4 }. Intuitively, if π(x3 ) is a leaf in a tree-shaped match π, then x2 and x4 must both be mapped to the parent of π(x3 ). Since the GCI A v ∃(r u r3 u r4− ).(B u A3 ) in Ξ(T ) satisfies (S3.a,b), we can drop the atoms containing x3 from q2 , and perform (S5) and (S6) to obtain the rewritten query q20 (x1 ) ← A1 (x1 ), r1 (x1 , x3 ), r2 (x1 , x3 ), A4 (x3 ), A2 (x3 ), A(x3 ). Now we can state our main result (see [9] for the proof of a more general result): Theorem 3. Suppose O = (T S, A) is a consistent Horn-ALCHIQu ontology and let q be a CQ. Then ans(O, q) = q0 ∈rewT (q) ans(MM (A ∪ cr(T )), q 0 ). By the above reduction, we can answer q over O = (T , A)) by evaluating rewT (q) over MM (A∪cr(T ) or, equivalently, by evaluating the Datalog program rewT (q)∪A∪ cr(T ) and collecting the tuples u with q(u) in the minimal model. We note that rewT (q) is finite and computable in time exponential in the size of T and q: rules in rewT (q) use only relation names and variables that occur in q and T . Furthermore, the grounding of rewT (q) ∪ A ∪ cr(T ) is exponential in the size of O, but polynomial for fixed T and q. By the complexity of Datalog, it follows that the resulting algorithm is exponential in combined but polynomial in data complexity; this is worst-case optimal [7]. 5 Implementation To evaluate the feasibility of the new rewriting, we have implemented a prototype sys- tem C LIPPER,3 which supports CQ answering over Horn-SHIQ ontologies (non-simple 3 http://www.kr.tuwien.ac.at/research/systems/clipper/ Algorithm 1: Answering CQs via Query Rewriting Input: Horn-SHIQ ontology O = (T , A), Conjunctive Query q Output: query results T ← Normalize(T ) ; B Normalization T ∗ ← ElimTrans(T ) ; B Eliminate Transitive Roles Ξ(T ∗ ) ← Saturate(T ∗ ) ; B TBox Saturation Q ← Rewrite(q, Ξ(T ∗ )) ; B Query Rewriting cr(T ) ← CompletionRules(T ) ; B Completion Rules P = A ∪ cr(T ) ∪ Q ; B Datalog Translation ans ← {u | q(u) ∈ MinimalModel(P)}; B Call Datalog Reasoner return ans ; roles are disallowed in queries). To the best of our knowledge, it is the first such system for Horn-SHIQ (under the standard semantics of first-order logic), and in expressive- ness subsumes similar DL-Lite and EL reasoning engines (see below). We describe the architecture of C LIPPER in Figure 2, and the main steps in Al- gorithm 1. C LIPPER is implemented in Java and uses OWLAPI 3.2.2 [13] for parsing ontologies. It accepts an ontology O = (T , A) and a CQ q in the SPARQL syntax as input. For efficiency reasons we implemented a lightweight ontology representation: all concepts, roles and individuals are encoded as integers; the conjunction of concepts and roles are stored in hash sets. Since we often need to manipulate large tables of axioms, we built inverted indexes over such axioms to support fast lookup and matching. Ontology Preprocessing. This component is responsible for (1) ontology parsing (us- ing OWLAPI 3.2.2), (2) profile checking and ontology normalization [16], and (3) con- verting the ontology into the internal format. Query Preprocessing. This component simply parses CQs in SPARQL syntax and converts them into the internal format. Saturation. This component exhaustively applies the saturation rules in Table 1 on TBox. We use the index structure to find which rules can be applied and the new axioms are generated incrementally. Query Rewriting. This component uses Algorithm 2 to rewrite the input q. It imple- ments the rewriting step from Definition 2, exhaustively traversing all existential axioms in Ξ(T ) for all the non-distinguished variables. The index structure helps the system efficiently search through the set of existential axioms while rewriting. Datalog Translation. This component generates a Datalog program with the rewritten set of queries Q, the completion rules cr(T ) in Table 2, and the facts in A. Datalog Engine. The resulting program is evaluated using the Datalog engine DLV- 20101014 [18] or Clingo 3.0.3 [10]. If the program (and hence the ontology) is consis- tent, its minimal model is returned and the answer tuples are filtered from it. 6 Experiments We tested C LIPPER on a Pentium Core2 Duo 2.00GHZ with 2GB RAM under Ubuntu 10.04 and 512MB heap for the Java VM. We conducted the following experiments. Algorithm 2: Rewrite(q, T ) Input: CQ q with only simple roles; TBox T Output: Rewritten queries of q w.r.t. T rewT (q) ← ∅ ; B will be updated from the sub procedure rewrite(q) ; B Call sub procedure return rewT (q); Sub Procedure rewrite (q) rewT (q) ← rewT (q) ∪ {q}; foreach non-distinguished variables x of q do if r(x, x) 6∈ q then Replace each r(x, y) in q by r− (y, x) ; S ← {r | r(y, x) ∈ q} ; P ← {y | r(y, x) ∈ q} ; N ← {A | A(x) ∈ q} ; foreach M v ∃S 0 N 0 ∈ T do if S ⊆ S 0 and N ⊆ N 0 then Obtain q 0 from q by: begin (1) Drop from q each atom containing the variable x; (2) Rename each y ∈ P by x; (3) Add {A(x) | A ∈ M } to q; if q 0 6∈ rewT (q) then rewrite(q 0 ) ; B Recursion 1. Downscaling test. We compared C LIPPER with other query rewriting systems for DL-Lite, viz. R EQUIEM (Perez-Urbina et al. [23]) and P RESTO [25], and found that it is competitive and scales down well on DL-Lite ontologies. We used the ontologies and queries (Q1–Q5) from the REQUIEM test suite, which have been widely used for system tests; in addition we considered the queries in Table 3a. Table 3b shows the number of rewritten queries and the rewriting time for the on- tologies ADOLENA (A), STOCK-EXCHANGE (S), VICODI (V), and UNIVERSITY (U); the rewriting time excludes loading and preprocessing. C LIPPER and P RESTO generated in most cases rule sets of comparable size, and in short time. In a few cases P RESTO generated significantly less rules than C LIPPER, and only for V P RESTO was notably faster. R EQUIEM generated in several cases significantly more rules, despite consider- ing the G-version which generates optimized rules (and hence uses considerably more time). The difference seems to be caused by rule unfolding required in their rewriting. For UNIVERSITY, the only ontology in the suite having an ABox, we evaluated the rewritten queries over four different ABoxes (67k to 320k assertions) using DLV. Interestingly, in all cases the execution times for the three rewritings were very similar; the average runtime of each query on the four ABoxes is shown in brackets. 2. Full Horn-SHIQ. To test C LIPPER on a full Horn-SHIQ ontology, we modified the UOBM ontology [19], which is in SHOIN (D), by dropping or strengthening (in case of disjunctions) non-Horn-SHIQ TBox axioms; the final ontology has 196 TBox axioms. We used ABoxes Ai , 1 ≤ i ≤ 4, with 20k, 80k, 140k and 200k assertions. The test queries in Table 4a were tailored to require reasoning with Horn-SHIQ constructs Table 3: Experiments with DL-Lite ontology (a) Additional Queries A Q6(x) ←Device(x), assistsWith(x,y), ReadingDevice(y) Q7(x) ←Device(x), assistsWith(x,y), ReadingDevice(y), assistsWith(y,z), SpeechAbility(z) S Q6(x,z) ←Investor(x), hasStock(x,y), Stock(y), Company(z), hasStock(z,y) Q7(x,z,w) ← Investor(x), hasStock(x,y), Stock(y), isListedIn(y,z), StockExchangeList(z), Company(w), hasStock(w,y) U Q6(x,y) ←Professor(x), teacherOf(x,y), GraduateCourse(y) Q7(x,z) ←Faculty(y), Professor(z), Student(x), memberOf(x,y),worksFor(z,y) V Q6(x,y,z) ←Person(x), hasRole(x,y), Leader(y), exists(y,z) Q7(x,y,z,w) ← Person(x), hasRole(x,y), Leader(y), exists(y,z), TemporalInterval(z), related(x,w), Country(w) (b) Downscaling evaluation # Rules/CQs Time (ms) # Rules/CQs Time (ms) RG Presto C LIPPER RG Presto C LIPPER RG Presto C LIPPER RG Presto C LIPPER Q1 27 53 42 281 45 50 Q1 15 16 15 13 8 73 Q2 50 32 31 184 46 62 Q2 10 3 10 16 10 58 Q3 104 32 31 292 27 65 Q3 72 28 26 77 12 63 A Q4 224 43 36 523 32 71 V Q4 185 44 41 261 17 71 Q5 624 37 36 1177 25 70 Q5 30 16 8 99 15 44 Q6 364 35 30 523 31 65 Q6 18 22 18 27 11 69 Q7 2548 43 32 7741 61 64 Q7 180 34 27 359 12 105 Q1 6 7 10 14 7 19 Q1 2 4 2 14 ( 1247 ) 12 ( 1252 ) 27 ( 1255 ) Q2 2 3 22 263 9 22 Q2 1 2 45 201 ( 1247 ) 23 ( 1262 ) 36 ( 1637 ) Q3 4 4 9 1717 10 21 Q3 4 8 17 477 ( 2055 ) 26 ( 2172 ) 29 ( 1890 ) S Q4 4 4 24 1611 9 23 U Q4 2 56 63 2431 ( 1260 ) 20 ( 1235 ) 28 ( 1735 ) Q5 8 5 10 18941 10 22 Q5 10 8 16 7216 ( 1267 ) 26 ( 1305 ) 36 ( 1372 ) Q6 4 8 5 204 11 21 Q6 10 13 10 13 ( 1272 ) 14 ( 1260 ) 27 ( 1262 ) Q7 8 6 7 1733 11 17 Q7 960 24 19 1890 ( 1730 ) 15 ( 1310 ) 35 ( 1322 ) unavailable in DL-Lite and EL. Table 4b shows the number of rewritten queries, rewrit- ing time and DLV running time. We see that C LIPPER answered all queries in reasonable time and scaled well (time printed A1 / A2 / A3 /A4 ). The rewriting times for all the queries are small and at most within a factor of 3. The high number of rules generated for Q3 is due to many different possibilities for deriving some atoms in the query, like Person(x). However, the evaluation still performs well (it stays within a small factor). 7 Related Work Since Calvanese et al. introduced query rewriting in their seminal work on DL-Lite [3], many query rewriting techniques have been developed and implemented, e.g. (Perez- Urbina et al. [23], Rosati and Almatelli [25], Chortaras et al. [4], Gottlob et al. [11]), usually aiming at an optimized rewriting size. Some of them also go beyond DL-Lite; e.g. Perez-Urbina et al. cover ELHI, while Gottlob et al. consider Datalog ± . Most approaches rewrite a query into a (union of) CQs; in [25] a non-recursive Datalog pro- gram is generated, while Perez-Urbina et al. produce a CQ for DL-Lite and a (recursive) Datalog program for DLs of the EL family. Our approach rewrites a CQ into a union of CQs, but generates possibly recursive Datalog rules to capture the TBox. The closest DL to Horn-SHIQ for which a rewriting technique has been implemented is ELHI [23]. Unlike ELHI, Horn-SHIQ can express functionality, a feature supported by DL-Lite Table 4: Experiments with UOBM Horn-SHIQ ontology (a) Queries (b) Running time Q1(x) ←worksFor(x,y), isAffiliatedOrganizationOf(y,z), College(z) # of Rew Datalog time (DLV) Q2(x) ←Postdoc(x), worksFor(x,y), University(y), hasAlumnus(y,x) Rules (ms) (ms) Q3(x) ←Person(x), like(x,y), Chair(z), isHeadOf(z,w), like(z,y) Q1 3 87 120/ 370/ 620/ 870 Q4(x) ←takeCourse(x,y), GraduateCourse(y), Q2 16 98 130/ 380/ 630/ 880 isTaughtBy(y,z), Professor(z) Q5(x,z) ←LeisureStudent(x), takesCourse(x,y), CSCourse(y), Q3 180 212 370/ 890/ 1420/ 1960 isStudentOf(x, z), University(z) Q4 45 156 180/ 480/ 780/ 1070 Q6(x,y) ←enrollIn(x,y), hasDegreeFrom(x,y), University(y) Q7(x,y) ←PeopleWithManyHobbies(x), isMemberOf(x,z), like(x,w), Q5 37 152 160/ 440/ 720/ 1010 TennisClass(w), hasMember(z,y), like(y,w) Q6 16 112 130/ 370/ 630/ 880 Q8(x,z) ←TennisFan(x), like(x,y), Sport(y), Q7 55 143 180/ 470/ 750/ 1050 isHeadOf(x,z), ReserachGroup(z) Q9(x,y,z) ←Student(x), hasDegreeFrom(x,y), Professor(z), Q8 17 105 130/ 380/ 630/ 870 worksFor(z,y), isAdvisorOf(z,x) Q9 33 126 150/ 430/ 720/ 1010 Q10(x,y,w) ←Professor(x), Dean(y), isMemberOf(y,w), worksFor(x,w), Q10 23 137 150/ 390/ 640/ 880 hasPublication(x,z), isPublicationOf(z,y) considered relevant for applications. A comparison of both systems on ontologies be- yond DL-Lite remains for future work. Our technique resembles Rosati’s for CQs in EL [24], which incorporates the CQ into the TBox before saturation and then (after saturation) translates it into Datalog, resulting in a best-case exponential algorithm. We avoid this by doing a rewrite step only if the TBox has an applicable existential axiom. Rewriting approaches for more expressive DLs are less common. The most notable exception is Hustadt et al.’s translation of SHIQ terminologies into disjunctive Data- log [15], which is implemented in the KAON2 reasoner. The latter can be used to answer queries over arbitrary ABoxes, but supports only instance queries. An extension to CQs (without transitive roles) is given in [14], but it is not implemented. To our knowledge, also the extension of the rewriting in [23] to nominals remains to be implemented [22]. In [20] a Datalog rewriting is used to establish complexity bounds of standard reasoning in the Horn fragments of SHOIQ and SROIQ, but it does not cover CQs. 8 Conclusion We presented a rewriting-based algorithm for answering CQs over Horn-SHIQ on- tologies. Our prototype implementation shows potential for practical applications, and further optimizations will improve it. Future versions of C LIPPER will support transi- tive roles and queries formulated in weakly DL-safe Datalog, for which the theoretic foundations have been already developed here and in [9]. As an interesting application, we mention that our method allows to improve rea- soning with DL-programs, which loosely couple rules and ontologies [5]. To avoid the overhead caused by the interaction of a rule reasoner and an ontology reasoner of tra- ditional methods, the inline evaluation framework translates ontologies into rules [12, 8]. The techniques of this paper can be faithfully integrated into the inline evaluation framework to efficiently evaluate DL-programs involving Horn-SHIQ ontologies. References 1. Acciarri, A., Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Palmieri, M., Rosati, R.: QuOnto: Querying ontologies. In: AAAI. pp. 1670–1671 (2005) 2. Calı̀, A., Gottlob, G., Lukasiewicz, T.: Datalog± : a unified approach to ontologies and in- tegrity constraints. In: ICDT’09. pp. 14–30. ACM (2009) 3. Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. Autom. Reasoning 39(3), 385–429 (2007) 4. Chortaras, A., Trivela, D., Stamou, G.: Optimized query rewriting for OWL 2 QL. In: CADE’11. pp. 192–206. Springer-Verlag (2011) 5. Eiter, T., Ianni, G., Lukasiewicz, T., Schindlauer, R., Tompits, H.: Combining answer set programming with description logics for the Semantic Web. Artificial Intelligence 172(12- 13), 1495–1539 (2008) 6. Eiter, T., Ortiz, M., Šimkus, M., Tran, T., Xiao, G.: Query rewriting for Horn-SHIQ plus rules. In: AAAI’12 (2012), (To appear) 7. Eiter, T., Gottlob, G., Ortiz, M., Simkus, M.: Query answering in the description logic Horn- SHIQ. In: JELIA’08. pp. 166–179. Springer (2008) 8. Eiter, T., Krennwallner, T., Schneider, P., Xiao, G.: Uniform evaluation of nonmonotonic DL-programs. In: FoIKS’12. LNCS, vol. 7153, pp. 1–22. Springer (March 2012) 9. Eiter, T., Ortiz, M., Šimkus, M., Tran, T., Xiao, G.: Query rewriting for Horn-SHIQ plus rules. Tech. Rep. INFSYS RR-1843-12-04, TU Vienna (2012), http://www.kr. tuwien.ac.at/research/reports/rr1204.pdf 10. Gebser, M., Kaufmann, B., Kaminski, R., Ostrowski, M., Schaub, T., Schneider, M.T.: Potassco: The potsdam answer set solving collection. AI Commun. 24(2), 107–124 (2011) 11. Gottlob, G., Orsi, G., Pieris, A.: Ontological queries: Rewriting and optimization. In: ICDE’11. pp. 2 –13 (2011) 12. Heymans, S., Eiter, T., Xiao, G.: Tractable reasoning with DL-programs over Datalog- rewritable description logics. In: ECAI’10. pp. 35–40. IOS Press (2010) 13. Horridge, M., Bechhofer, S.: The OWL API: A java API for OWL ontologies. Semantic Web 2(1), 11–21 (2011) 14. Hustadt, U., Motik, B., Sattler, U.: A decomposition rule for decision procedures by resolution-based calculi. In: LPAR’04. pp. 21–35. Springer (2004) 15. Hustadt, U., Motik, B., Sattler, U.: Reasoning in description logics by a reduction to disjunc- tive Datalog. J. Autom. Reasoning 39(3), 351–384 (2007) 16. Kazakov, Y.: Consequence-driven reasoning for Horn SHIQ ontologies. In: IJCAI’09. pp. 2040–2045 (2009) 17. Krötzsch, M., Rudolph, S., Hitzler, P.: Complexity boundaries for Horn description logics. In: AAAI’07. pp. 452–457. AAAI Press (2007) 18. Leone, N., Pfeifer, G., Faber, W., Eiter, T., Gottlob, G., Perri, S., Scarcello, F.: The DLV system for knowledge representation and reasoning. ACM ToCL 7(3), 499–562 (2006) 19. Ma, L., Yang, Y., Qiu, Z., Xie, G.T., Pan, Y., Liu, S.: Towards a complete OWL ontology benchmark. In: ESWC’06. pp. 125–139. Springer (2006) 20. Ortiz, M., Rudolph, S., Simkus, M.: Worst-case optimal reasoning for the Horn-DL frag- ments of OWL 1 and 2. In: KR’10. AAAI Press (2010) 21. Ortiz, M., Rudolph, S., Simkus, M.: Query answering in the Horn fragments of the descrip- tion logics SHOIQ and SROIQ. In: IJCAI’11. pp. 1039–1044. IJCAI/AAAI (2011) 22. Pérez-Urbina, H., Motik, B., Horrocks, I.: Tractable query answering and rewriting under description logic constraints. J. Applied Logic 8(2), 186–209 (2010) 23. Pérez-Urbina, H., Motik, B., Horrocks, I.: A comparison of query rewriting techniques for DL-Lite. In: DL’09. CEUR-WS.org (2009) 24. Rosati, R.: On conjunctive query answering in EL. In: DL’07. CEUR-WS.org (2007) 25. Rosati, R., Almatelli, A.: Improving query answering over DL-Lite ontologies. In: KR’10. AAAI (2010)