=Paper= {{Paper |id=None |storemode=property |title=Towards Practical Query Answering for Horn-SHIQ |pdfUrl=https://ceur-ws.org/Vol-846/paper_20.pdf |volume=Vol-846 |dblpUrl=https://dblp.org/rec/conf/dlog/EiterOSTX12 }} ==Towards Practical Query Answering for Horn-SHIQ== https://ceur-ws.org/Vol-846/paper_20.pdf
Towards Practical Query Answering for Horn-SHIQ?

                    Thomas Eiter1 , Magdalena Ortiz1 , Mantas Šimkus1
                         Trung-Kien Tran2 , and Guohui Xiao1
             1
                 Institute of Information Systems, Vienna University of Technology
                         {eiter|ortiz|xiao}@kr.tuwien.ac.at
                                 simkus@dbai.tuwien.ac.at
                               2
                                  STARLab, Vrije Universiteit Brussel
                                     truntran@vub.ac.be



1     Introduction

Query answering has become a prominent reasoning task in Description Logics. This is
witnessed not only by the high number of publications on the topic in the last decade,
but also by the increasing number of query answering engines. A number of systems
provide full conjunctive query (CQ) answering capabilities, including [1, 23, 25, 4, 11].
A common feature of these approaches is that they rely on existing technologies for
relational or deductive databases. They focus on lightweight DLs like DL-Lite and
EL, and they use query rewriting to reduce the problem of answering a query over a
DL ontology to a database query evaluation problem. For more expressive DLs that
are not tractable in combined complexity, however, CQs (with the complete first-order
semantics) are not yet supported by current reasoning engines. A range of algorithms
has been designed, but they serve for theoretical purposes such as showing complexity
bounds and are not amenable to practical implementation. The only exception is the
rewriting algorithm implemented in the R EQUIEM system, which covers ELHI [23], an
expressive extension of EL for which standard reasoning is E XP T IME-hard.
    In this paper, we contribute to the development of practical query answering systems
beyond DL-Lite and EL. We consider Horn-SHIQ, the Horn fragment of the popular
DL SHIQ that underlies OWL DL. It combines all the expressive features of DL-Lite
and EL, and simultaneously extends them with transitive roles, qualified number restric-
tions and some universal quantification. Standard reasoning tasks in Horn-SHIQ are
already E XP T IME-hard in combined complexity but, due to the absence of disjunction,
they are polynomial in data complexity. Since in Horn-SHIQ models are significantly
more complex than in DL-Lite and (most dialects of) EL, extending existing query
rewriting techniques is not straightforward.
    The main contribution of this paper is a query answering method for Horn-SHIQ
that appears to be promising for practicable systems, as confirmed by the experimental
evaluation of a prototype implementation.
    – The core of the method is a novel query rewriting technique which transforms
an input query q into a union Q of CQs (a UCQ) such that the answers of q over an
ontology O = hT , Ai with TBox T and ABox A coincide with the answers over A
?
    This work was supported by the Austrian Science Fund (FWF) grants P20840 and T515.
of a Datalog program comprising Q and some rules to complete A, showing Datalog-
rewritability of CQ answering in Horn-SHIQ.
    – Naturally, the set Q may be exponential in the size of q in the worst case, but in
practice we obtain rewritings Q that are of manageable size for real world ontologies.
This is mostly due to the fact that new queries are only generated taking into account
the anonymous domain elements implied by the terminology. Notably our algorithm is
worst-case optimal in both data and combined complexity.
    – We describe a prototype implementation of the approach that uses off-the-shelf
Datalog reasoners. Despite being preliminary and lacking sophisticated optimizations,
it shows that the approach is promising. It can answer queries efficiently over Horn-
SHIQ ontologies and scales down nicely to DL-Lite, where it is competitive with
state of the art query rewriting systems.
    – The technique works for full Horn-SHIQ and arbitrary CQs, but the imple-
mented version does not support transitive (or more generally, non-simple) roles in
the query. To keep presentation simple, we present here only the case without transi-
tive roles, and refer to [6, 9] for a more general version with transitive roles and richer
queries formulated in weakly DL-safe Datalog.


2   Preliminaries
Horn-SHIQ The syntax and semantics of Horn-SHIQ is defined in the usual way.
A role is a role name p or its inverse p− . A Horn-SHIQ TBox T in normal form is
a set of role inclusion axioms r v s, transitivity axioms trans(r), and general concept
inclusion axioms (GCIs) of the forms (F1) A1 u. . .uAn vB, (F2) A1 v∀r.B, (F3) A1 v
∃r.B, and (F4) A1 v 61 r.B,where A1 , . . . , An , B are concept names and r, s are roles.
Axioms (F3) are called existential. W.l.o.g. we consider only Horn-SHIQ TBoxes in
normal form [16, 17]. We call s transitive in T , if trans(s) ∈ T or trans(s− ) ∈ T ,
and we call s simple in T , if there is no transitive r in T s.t. r v∗T s, where v∗ is the
reflexive transitive closure of {(r, s) | r v s ∈ T or inv(r) v inv(s) ∈ T }. Only simple
roles are allowed in axioms of the form (F4).
     An ABox A is a set of assertions A(a) and r(a, b), where A is a concept name, r a
role, and a, b are individuals; the set of all individuals is denoted by NI . An ontology is
a pair (T , A) of a TBox T and an ABox A. The semantics is given by interpretations
I = h∆I , ·I i in the usual way.
     A Horn-ALCHIQ TBox is a Horn-SHIQ TBox with no transitivity axioms. Horn-
ALCHIQu TBoxes are obtained by allowing the conjunction r1 u r2 of roles r1 and
r2 , interpreted (r1 u r2 )I = r1I ∩ r2I . We let inv(r1 u r2 ) = inv(r1 ) u inv(r2 ) and
assume w.l.o.g. that for each role inclusion r v s of a Horn-ALCHIQu TBox T , (i)
inv(r) v inv(s) ∈ T , and (ii) s ∈ {p, p− } for a role name p. For a set W and a concept
or role conjunction Γ = γ1 u . . . u γm , we write Γ ⊆ W for {γ1 , . . . , γm } ⊆ W .
Conjunctive Queries A conjunctive query (CQ) is an expression of the form
                             q(u) ← p1 (v1 ), . . . , pm (vm )
where each pi (vi ) is an atom of the form A(x) or r(x, y), where A is a concept name
and r is a role, x, y are variables, and q is a special query predicate not occurring
elsewhere. For convenience, we may identify such a CQ with the set of its atoms, and
Table 1: Inference rules. M (0) , N (0) , (resp., S (0) ) are conjunctions of atomic concepts
(roles); A, B are atomic concepts
M v ∃S.(N u N 0 ) N v A            M v ∃(S u S 0 ).N     Svr         M v ∃S.(N u ⊥)
                              Rc
                               v                                Rr
                                                                 v                  R⊥
                    0                               0
    M v ∃S.(N u N u A)                M v ∃(S u S u r).N                M v⊥

    M v ∃(S u r).N A v ∀r.B               M v ∃(S u inv(r)).(N u A)       A v ∀r.B
                             R∀                                                      R−
                                                                                      ∀
    M u A v ∃(S u r).(N u B)                            M vB
          M v ∃(S u r).(N u B) Av 61 r.B           M 0 v ∃(S 0 u r).(N 0 u B)
                                                                                R≤
                     M u M 0 u A v ∃(S u S 0 u r).(N u N 0 u B)
 M v ∃(S u inv(r)).(N1 u N2 u A) Av 61 r.B N1 u A v ∃(S 0 u r).(N 0 u B u C)
                                                                                        R−
                                                                                         ≤
             M uBvC         M u B v ∃(S u inv(S 0 u r)).(N1 u N2 u A)
                                                  S
use q(u) (or simply q) to refer to it. We call u ⊆ 1≤i≤m vi the distinguished variables
of q. A match for q in I is a mapping from variables in q to elements in ∆I such that
                                                                                    |u|
π(t) ∈ pI for each atom p(t) of q. The answer to q over O is the set of all c ∈ NI
such that in every model I of O some match π for q exists with π(u) = (c)I .
Elimination of Transitivity. As usual, transitivity axioms roles can be eliminated from
Horn-SHIQ TBoxes. To obtain a Horn-ALCHIQ TBox T ∗ that is also in normal
form, we can use the transformation from [16]. This transformation preserves satisfia-
bility and, provided that queries contain only simple roles, also query answers (answers
are not preserved for arbitrary queries unless the notion of match is suitably relaxed). In
the rest of the paper we describe a procedure for answering CQs in Horn-ALCHIQu .


3    Canonical Models
For answering CQs in Horn DLs usually the canonical model property is employed [7,
21, 2]. In particular, for a consistent Horn-ALCHIQu ontology O = (T , A), there
exists a model I of O that can be homomorphically mapped into any other model I 0 of
O. We show that such an I can be built in three steps:
(1) close T under specially tailored inferences rules,
(2) close A under all but existential axioms of T , and
(3) extend A by “applying” the existential axioms of T .
    For Step (1) we use the calculus in Table 1, which is similar to [16, 20]. Given a
Horn-ALCHIQu TBox T , we denote by Ξ(T ) the TBox obtained from T by ex-
haustively applying the inference rules in Table 1. In Step (2) we simply ‘apply’ in the
ABox all but existential axioms in T . For convenience, this is done using the set cr(T )
of Datalog rules in Table 2. Since every ABox A can be seen as a set of Datalog facts,
A ∪ cr(T ) is a Datalog program (with constraints) which has a unique minimal Her-
brand model J = MM (A ∪ cr(T )) if O is consistent. This model is almost a canonical
model of (T , A); however, existential axioms may be violated. To deal with this, in
Step (3) we extend J with new domain elements as required by axioms M v ∃r.N in
Ξ(T ), using a procedure similar to the well known chase in databases.
                    Table 2: (Completion rules) Datalog program cr(T ).
    B(y) ← A(x), r(x, y) for each A v ∀r.B ∈ T
    B(x) ← A1 (x), . . . , An (x) for all A1 u . . . uAn v B ∈ Ξ(T )
r(x, y) ← r1 (x, y), . . . , rn (x, y) for all r1 u . . . u rn v r ∈ T
    ⊥(x) ← A(x), r(x, y1 ), r(x, y2 ), B(y1 ), B(y2 ), y1 6= y2 for each Av 61 r.B ∈ T
       Γ ← A(x), A1 (x), . . . , An (x), r(x, y), B(y)
           for all A1 u . . . uAn v ∃(r1 u . . . urm ).(B1 u . . . uBk )
           and Av 61 r.B of Ξ(T ) such that r=ri and B=Bj for some i, j
           with Γ ∈ {B1 (y), . . . , Bk (y), r1 (x, y), . . . , rk (x, y)}

Definition 1. Let T be a Horn-ALCHIQu TBox and let I be an interpretation. A
GCI M v ∃S.N is applicable at e ∈ ∆I if (a) e ∈ M I , (b) there is no e0 ∈ ∆I
with (e, e0 ) ∈ S I and e0 ∈ N I , (c) there is no axiom M 0 v ∃S 0 .N 0 ∈ T such that
e ∈ (M 0 )I , S ⊆ S 0 , N ⊆ N 0 , and S ⊂ S 0 or N ⊂ N 0 . An interpretation J obtained
from I by an application of an applicable axiom M v ∃S.N at e ∈ ∆I is defined as:
- ∆J = ∆I ∪ {d} with d a new element not present in ∆I (we call d a successor of e),
- For each concept name A and each o ∈ ∆J , we have o ∈ AJ if (a) o ∈ ∆I and
  o ∈ AI ; or (b) o = d and A ∈ N .
- For each role name r and o, o0 ∈ ∆J , we have (o, o0 ) ∈ rJ if (a) o, o0 ∈ ∆I and
  (o, o0 ) ∈ rI ; or (b) (o, o0 ) = (e, d) and r ∈ S; or (c) (o, o0 ) = (d, e) and inv(r) ∈ S.
We denote by chase(I, T ) a possibly infinite interpretation obtained from I by applying
the existential axioms in T . We require fairness: the application of an applicable axiom
can not be infinitely postponed.
    We note that chase(I, T ) is unique up to renaming of domain elements. As usual in
DLs, it can be seen as a ‘forest’: the application of existential axioms simply attaches
‘trees’ to an arbitrarily shaped model I.
Theorem 1. Let O = (T , A) be a Horn-ALCHIQu ontology. Then O is consistent iff
A ∪ cr(T ) consistent. Moreover, if O is consistent, then (a) chase(MM (A ∪ cr(T )),
Ξ(T )) is a model of O, and (b) chase(MM (A ∪ cr(T )), Ξ(T )) can be homomorphi-
cally mapped into any model of O.
The proof of Theorem 1 can be found in [9]; see [21] for a proof of a similar result.
    Observe that checking consistency of O = (T , A) reduces to evaluating the Data-
log program A ∪ cr(T ). We note that Ξ(T ) can be computed in exponential time in the
size of T : the calculus only infers axioms of the form M v B and M v ∃S.N , where
M, N are conjunctions of atomic concepts, B is atomic and S is a conjunction of roles,
and there are exponentially many such axioms.


4     Query Rewriting
The following theorem, which is immediate from Theorem 1, allows us to concentrate
on models obtained by the chase procedure.
               q1                                     q10                         q2                                     q20

     A1        x1                               A1    x1                     A1   x1                                A1   x1

              r1     A v ∃(r u r2 ).(B u A3 )        r1                     r1         r2    A v ∃(r u r3 u r4− ).(B u A3 ) r , r
                                                                                                                             1 2


         A2    x2                                     x3     A2 , A    x4    A4         A2    x2              A4 , A2 , A x3
     r2                  r3                   r2                 r3         r4         r3                                  r3 , r4−
    x3    A3        A4    x4          A3                    A4    x4         A3   x3                                A3


                              (a) Example 1                                                  (b) Example 2

                                           Fig. 1: Examples of query rewriting
Theorem 2. Let O = (T , A) be a Horn-ALCHIQu ontology. Then A ∪ cr(T ) is con-
sistent iff O is consistent. Moreover, if O is consistent, then ans(O, q) = ans(IO , q),
where IO = chase(MM (A ∪ cr(T )), Ξ(T )).

                                                as IO can be infinite. Hence, we rewrite q
    Computing ans(IO , q) is still not trivial, S
into a set Q of CQs such that ans(IO , q) = q0 ∈Q ans(MM (A ∪ cr(T )), q 0 ), that is,
we can evaluate them over the finite MM (A ∪ cr(T )).
    The intuition is the following. Suppose q has a non-distinguished variable x, and
that there is some match π in IO such that π(x) is an object in the ‘tree part’ introduced
by the chase procedure, and it has no descendant in the image of π. Then for all atoms
r(y, x) of q, the “neighbor” variable y must mapped to the parent of π(x). A rewrite
step makes a choice of such an x, and employs an existential axiom from Ξ(T ) to
‘clip off’ x, eliminating all query atoms involving it. By repeating this procedure, we
can clip off all variables matched in the tree part and obtain a query with a match in
MM (A ∪ cr(T )).

Definition 2 (Query rewriting). For a CQ q and a Horn-ALCHIQu TBox T , we
write q →T q 0 if q 0 can be obtained from q in the following steps:

(S1) Select in q an arbitrary non-distinguished variable x such that there are no atoms
     of the form r(x, x) in q.
(S2) Replace each role atom r(x, y) in q, where y is arbitrary, by the atom inv(r)(y, x).
(S3) Let Vp = {y | ∃r : r(y, x) ∈ q}, and select some M v ∃S.N ∈ Ξ(T ) such that
     (a) {r | r(y, x) ∈ q ∧ y ∈ Vp } ⊆ S, and
     (b) {A | A(x) ∈ q} ⊆ N .
(S4) Drop from q each atom containing x.
(S5) Rename each y ∈ Vp of q by x.
(S6) Add the atoms {A(x) | A ∈ M } to q.

We write q →∗T q 0 if q = q0 and q 0 = qn for some finite rewrite sequence q0 →T
q1 · · · →T qn , n ≥ 0. Furthermore, we let rewT (q) = {q 0 | q →∗T q 0 }.

Example 1. The query q1 (x1 ) ← A1 (x1 ), r1 (x1 , x2 ), A2 (x2 ), r2 (x2 , x3 ), A3 (x3 ), r3 (x2 ,
x4 ), A4 (x4 ) is depicted on the left hand side of Figure 1a. The node in bold corresponds
to the answer variable x1 . Assume that Av∃(r u r2 ).(B u A3 ) and Av∃(r u r3 u r4− ).(B
                       (1)


                    Ontology     (2)
            O        Prepro-               Saturation    (6)
                     cessing
                                                                  Datalog      (7)    Datalog
                                            (4)                                                     answers
                                                                 Translation          Engine
                                                         (5)
                   Query Pre-    (3)        Query
             q
                   processing              Rewriting



                 (1) ABox assertions              (4) Existential axioms        (7) Datalog rules
                 (2) TBox axioms                  (5) Rewritten queries
                 (3) Conjunctive queries          (6) Axioms


                                 Fig. 2: C LIPPER system architecture
uA3 ) are in Ξ(T ). If we pick for (S1) the variable x3 , we get Vp = {x2 } and we can
select A v ∃(r u r2 ).(B u A3 ) ∈ Ξ(T ), as it satisfies (S3.a) and (S3.b). After performing
(S4), (S5) and (S6) we obtain the rewritten query q10 (x1 ) ← A1 (x1 ), r1 (x1 , x3 ), A2 (x3 ),
A(x3 ), r3 (x3 , x4 ), A4 (x4 ). Intuitively, we can safely remove from q1 all atoms containing
x3 because the added atom A(x3 ) ensures that whenever q10 has a match so does q1 .
Example 2 (ctd). Now we consider the query q2 (x1 ) ← A1 (x1 ), r2 (x1 , x2 ), A2 (x2 ),
r3 (x2 , x3 ), A3 (x3 ), r1 (x1 , x4 ), A4 (x4 ), r4 (x3 , x4 ) in Figure 1b. We choose the variable
x3 , replace r4 (x3 , x4 ) by r4− (x4 , x3 ) in step (S2), and get Vp = {x2 , x4 }. Intuitively,
if π(x3 ) is a leaf in a tree-shaped match π, then x2 and x4 must both be mapped to the
parent of π(x3 ). Since the GCI A v ∃(r u r3 u r4− ).(B u A3 ) in Ξ(T ) satisfies (S3.a,b),
we can drop the atoms containing x3 from q2 , and perform (S5) and (S6) to obtain the
rewritten query q20 (x1 ) ← A1 (x1 ), r1 (x1 , x3 ), r2 (x1 , x3 ), A4 (x3 ), A2 (x3 ), A(x3 ).
     Now we can state our main result (see [9] for the proof of a more general result):
Theorem 3. Suppose O = (T S, A) is a consistent Horn-ALCHIQu ontology and let
q be a CQ. Then ans(O, q) = q0 ∈rewT (q) ans(MM (A ∪ cr(T )), q 0 ).
     By the above reduction, we can answer q over O = (T , A)) by evaluating rewT (q)
over MM (A∪cr(T ) or, equivalently, by evaluating the Datalog program rewT (q)∪A∪
cr(T ) and collecting the tuples u with q(u) in the minimal model. We note that rewT (q)
is finite and computable in time exponential in the size of T and q: rules in rewT (q) use
only relation names and variables that occur in q and T . Furthermore, the grounding of
rewT (q) ∪ A ∪ cr(T ) is exponential in the size of O, but polynomial for fixed T and q.
By the complexity of Datalog, it follows that the resulting algorithm is exponential in
combined but polynomial in data complexity; this is worst-case optimal [7].

5     Implementation
To evaluate the feasibility of the new rewriting, we have implemented a prototype sys-
tem C LIPPER,3 which supports CQ answering over Horn-SHIQ ontologies (non-simple
 3
     http://www.kr.tuwien.ac.at/research/systems/clipper/
    Algorithm 1: Answering CQs via Query Rewriting
     Input: Horn-SHIQ ontology O = (T , A), Conjunctive Query q
     Output: query results
     T ← Normalize(T ) ;                                        B Normalization
     T ∗ ← ElimTrans(T ) ;                    B Eliminate Transitive Roles
     Ξ(T ∗ ) ← Saturate(T ∗ ) ;                              B TBox Saturation
     Q ← Rewrite(q, Ξ(T ∗ )) ;                               B Query Rewriting
     cr(T ) ← CompletionRules(T ) ;                         B Completion Rules
     P = A ∪ cr(T ) ∪ Q ;                               B Datalog Translation
     ans ← {u | q(u) ∈ MinimalModel(P)};             B Call Datalog Reasoner
     return ans ;

roles are disallowed in queries). To the best of our knowledge, it is the first such system
for Horn-SHIQ (under the standard semantics of first-order logic), and in expressive-
ness subsumes similar DL-Lite and EL reasoning engines (see below).
    We describe the architecture of C LIPPER in Figure 2, and the main steps in Al-
gorithm 1. C LIPPER is implemented in Java and uses OWLAPI 3.2.2 [13] for parsing
ontologies. It accepts an ontology O = (T , A) and a CQ q in the SPARQL syntax as
input. For efficiency reasons we implemented a lightweight ontology representation: all
concepts, roles and individuals are encoded as integers; the conjunction of concepts and
roles are stored in hash sets. Since we often need to manipulate large tables of axioms,
we built inverted indexes over such axioms to support fast lookup and matching.
Ontology Preprocessing. This component is responsible for (1) ontology parsing (us-
ing OWLAPI 3.2.2), (2) profile checking and ontology normalization [16], and (3) con-
verting the ontology into the internal format.
Query Preprocessing. This component simply parses CQs in SPARQL syntax and
converts them into the internal format.
Saturation. This component exhaustively applies the saturation rules in Table 1 on
TBox. We use the index structure to find which rules can be applied and the new axioms
are generated incrementally.
Query Rewriting. This component uses Algorithm 2 to rewrite the input q. It imple-
ments the rewriting step from Definition 2, exhaustively traversing all existential axioms
in Ξ(T ) for all the non-distinguished variables. The index structure helps the system
efficiently search through the set of existential axioms while rewriting.
Datalog Translation. This component generates a Datalog program with the rewritten
set of queries Q, the completion rules cr(T ) in Table 2, and the facts in A.
Datalog Engine. The resulting program is evaluated using the Datalog engine DLV-
20101014 [18] or Clingo 3.0.3 [10]. If the program (and hence the ontology) is consis-
tent, its minimal model is returned and the answer tuples are filtered from it.


6     Experiments

We tested C LIPPER on a Pentium Core2 Duo 2.00GHZ with 2GB RAM under Ubuntu
10.04 and 512MB heap for the Java VM. We conducted the following experiments.
 Algorithm 2: Rewrite(q, T )
   Input: CQ q with only simple roles; TBox T
   Output: Rewritten queries of q w.r.t. T
   rewT (q) ← ∅ ;              B will be updated from the sub procedure
   rewrite(q) ;                                    B Call sub procedure
   return rewT (q);
   Sub Procedure rewrite (q)
   rewT (q) ← rewT (q) ∪ {q};
   foreach non-distinguished variables x of q do
       if r(x, x) 6∈ q then
            Replace each r(x, y) in q by r− (y, x) ;
            S ← {r | r(y, x) ∈ q} ; P ← {y | r(y, x) ∈ q} ; N ← {A | A(x) ∈ q} ;
            foreach M v ∃S 0 N 0 ∈ T do
                if S ⊆ S 0 and N ⊆ N 0 then
                     Obtain q 0 from q by:
                     begin
                         (1) Drop from q each atom containing the variable x;
                         (2) Rename each y ∈ P by x;
                         (3) Add {A(x) | A ∈ M } to q;
                    if q 0 6∈ rewT (q) then
                          rewrite(q 0 ) ;                               B Recursion



1. Downscaling test. We compared C LIPPER with other query rewriting systems for
DL-Lite, viz. R EQUIEM (Perez-Urbina et al. [23]) and P RESTO [25], and found that it is
competitive and scales down well on DL-Lite ontologies. We used the ontologies and
queries (Q1–Q5) from the REQUIEM test suite, which have been widely used for system
tests; in addition we considered the queries in Table 3a.
    Table 3b shows the number of rewritten queries and the rewriting time for the on-
tologies ADOLENA (A), STOCK-EXCHANGE (S), VICODI (V), and UNIVERSITY (U);
the rewriting time excludes loading and preprocessing. C LIPPER and P RESTO generated
in most cases rule sets of comparable size, and in short time. In a few cases P RESTO
generated significantly less rules than C LIPPER, and only for V P RESTO was notably
faster. R EQUIEM generated in several cases significantly more rules, despite consider-
ing the G-version which generates optimized rules (and hence uses considerably more
time). The difference seems to be caused by rule unfolding required in their rewriting.
    For UNIVERSITY, the only ontology in the suite having an ABox, we evaluated
the rewritten queries over four different ABoxes (67k to 320k assertions) using DLV.
Interestingly, in all cases the execution times for the three rewritings were very similar;
the average runtime of each query on the four ABoxes is shown in brackets.

2. Full Horn-SHIQ. To test C LIPPER on a full Horn-SHIQ ontology, we modified
the UOBM ontology [19], which is in SHOIN (D), by dropping or strengthening (in
case of disjunctions) non-Horn-SHIQ TBox axioms; the final ontology has 196 TBox
axioms. We used ABoxes Ai , 1 ≤ i ≤ 4, with 20k, 80k, 140k and 200k assertions. The
test queries in Table 4a were tailored to require reasoning with Horn-SHIQ constructs
                            Table 3: Experiments with DL-Lite ontology
                                            (a) Additional Queries
    A       Q6(x) ←Device(x), assistsWith(x,y), ReadingDevice(y)
            Q7(x) ←Device(x), assistsWith(x,y), ReadingDevice(y), assistsWith(y,z), SpeechAbility(z)
    S     Q6(x,z) ←Investor(x), hasStock(x,y), Stock(y), Company(z), hasStock(z,y)
        Q7(x,z,w) ← Investor(x), hasStock(x,y), Stock(y), isListedIn(y,z),
                    StockExchangeList(z), Company(w), hasStock(w,y)
    U     Q6(x,y) ←Professor(x), teacherOf(x,y), GraduateCourse(y)
          Q7(x,z) ←Faculty(y), Professor(z), Student(x), memberOf(x,y),worksFor(z,y)
    V Q6(x,y,z) ←Person(x), hasRole(x,y), Leader(y), exists(y,z)
     Q7(x,y,z,w) ← Person(x), hasRole(x,y), Leader(y), exists(y,z), TemporalInterval(z), related(x,w), Country(w)


                                         (b) Downscaling evaluation
            # Rules/CQs          Time (ms)                  # Rules/CQs                  Time (ms)
          RG Presto C LIPPER RG Presto C LIPPER           RG Presto C LIPPER     RG         Presto     C LIPPER
      Q1 27     53     42    281    45     50          Q1 15 16        15        13            8          73
      Q2 50     32     31    184    46     62          Q2 10    3      10        16           10          58
      Q3 104 32        31    292    27     65          Q3 72 28        26        77           12          63
    A Q4 224 43        36    523    32     71        V Q4 185 44       41        261          17          71
      Q5 624 37        36 1177 25          70          Q5 30 16         8        99           15          44
      Q6 364 35        30    523    31     65          Q6 18 22        18        27           11          69
      Q7 2548 43       32 7741 61          64          Q7 180 34       27        359          12         105
      Q1 6       7     10     14    7      19          Q1 2     4       2    14 ( 1247 ) 12 ( 1252 ) 27 ( 1255 )
      Q2 2       3     22    263     9     22          Q2 1     2      45 201 ( 1247 ) 23 ( 1262 ) 36 ( 1637 )
      Q3 4       4      9    1717 10       21          Q3 4     8      17 477 ( 2055 ) 26 ( 2172 ) 29 ( 1890 )
    S Q4 4       4     24 1611       9     23        U Q4 2    56      63 2431 ( 1260 ) 20 ( 1235 ) 28 ( 1735 )
      Q5 8       5     10 18941 10         22          Q5 10    8      16 7216 ( 1267 ) 26 ( 1305 ) 36 ( 1372 )
      Q6 4       8      5    204    11     21          Q6 10 13        10    13 ( 1272 ) 14 ( 1260 ) 27 ( 1262 )
      Q7 8       6      7    1733 11       17          Q7 960 24       19 1890 ( 1730 ) 15 ( 1310 ) 35 ( 1322 )


unavailable in DL-Lite and EL. Table 4b shows the number of rewritten queries, rewrit-
ing time and DLV running time. We see that C LIPPER answered all queries in reasonable
time and scaled well (time printed A1 / A2 / A3 /A4 ). The rewriting times for all the
queries are small and at most within a factor of 3. The high number of rules generated
for Q3 is due to many different possibilities for deriving some atoms in the query, like
Person(x). However, the evaluation still performs well (it stays within a small factor).


7       Related Work

Since Calvanese et al. introduced query rewriting in their seminal work on DL-Lite [3],
many query rewriting techniques have been developed and implemented, e.g. (Perez-
Urbina et al. [23], Rosati and Almatelli [25], Chortaras et al. [4], Gottlob et al. [11]),
usually aiming at an optimized rewriting size. Some of them also go beyond DL-Lite;
e.g. Perez-Urbina et al. cover ELHI, while Gottlob et al. consider Datalog ± . Most
approaches rewrite a query into a (union of) CQs; in [25] a non-recursive Datalog pro-
gram is generated, while Perez-Urbina et al. produce a CQ for DL-Lite and a (recursive)
Datalog program for DLs of the EL family. Our approach rewrites a CQ into a union of
CQs, but generates possibly recursive Datalog rules to capture the TBox. The closest DL
to Horn-SHIQ for which a rewriting technique has been implemented is ELHI [23].
Unlike ELHI, Horn-SHIQ can express functionality, a feature supported by DL-Lite
                  Table 4: Experiments with UOBM Horn-SHIQ ontology
                             (a) Queries                                        (b) Running time
      Q1(x) ←worksFor(x,y), isAffiliatedOrganizationOf(y,z), College(z)        # of Rew Datalog time (DLV)
      Q2(x) ←Postdoc(x), worksFor(x,y), University(y), hasAlumnus(y,x)        Rules (ms)       (ms)
      Q3(x) ←Person(x), like(x,y), Chair(z), isHeadOf(z,w), like(z,y)     Q1   3 87 120/ 370/ 620/ 870
      Q4(x) ←takeCourse(x,y), GraduateCourse(y),
                                                                          Q2  16 98 130/ 380/ 630/ 880
             isTaughtBy(y,z), Professor(z)
    Q5(x,z) ←LeisureStudent(x), takesCourse(x,y), CSCourse(y),            Q3 180 212 370/ 890/ 1420/ 1960
             isStudentOf(x, z), University(z)                             Q4  45 156 180/ 480/ 780/ 1070
    Q6(x,y) ←enrollIn(x,y), hasDegreeFrom(x,y), University(y)
    Q7(x,y) ←PeopleWithManyHobbies(x), isMemberOf(x,z), like(x,w),        Q5  37 152 160/ 440/ 720/ 1010
             TennisClass(w), hasMember(z,y), like(y,w)                    Q6  16 112 130/ 370/ 630/ 880
    Q8(x,z) ←TennisFan(x), like(x,y), Sport(y),                           Q7  55 143 180/ 470/ 750/ 1050
             isHeadOf(x,z), ReserachGroup(z)
  Q9(x,y,z) ←Student(x), hasDegreeFrom(x,y), Professor(z),                Q8  17 105 130/ 380/ 630/ 870
             worksFor(z,y), isAdvisorOf(z,x)                              Q9  33 126 150/ 430/ 720/ 1010
 Q10(x,y,w) ←Professor(x), Dean(y), isMemberOf(y,w), worksFor(x,w),
                                                                          Q10 23 137 150/ 390/ 640/ 880
             hasPublication(x,z), isPublicationOf(z,y)



considered relevant for applications. A comparison of both systems on ontologies be-
yond DL-Lite remains for future work. Our technique resembles Rosati’s for CQs in
EL [24], which incorporates the CQ into the TBox before saturation and then (after
saturation) translates it into Datalog, resulting in a best-case exponential algorithm. We
avoid this by doing a rewrite step only if the TBox has an applicable existential axiom.
    Rewriting approaches for more expressive DLs are less common. The most notable
exception is Hustadt et al.’s translation of SHIQ terminologies into disjunctive Data-
log [15], which is implemented in the KAON2 reasoner. The latter can be used to answer
queries over arbitrary ABoxes, but supports only instance queries. An extension to CQs
(without transitive roles) is given in [14], but it is not implemented. To our knowledge,
also the extension of the rewriting in [23] to nominals remains to be implemented [22].
In [20] a Datalog rewriting is used to establish complexity bounds of standard reasoning
in the Horn fragments of SHOIQ and SROIQ, but it does not cover CQs.

8    Conclusion
We presented a rewriting-based algorithm for answering CQs over Horn-SHIQ on-
tologies. Our prototype implementation shows potential for practical applications, and
further optimizations will improve it. Future versions of C LIPPER will support transi-
tive roles and queries formulated in weakly DL-safe Datalog, for which the theoretic
foundations have been already developed here and in [9].
    As an interesting application, we mention that our method allows to improve rea-
soning with DL-programs, which loosely couple rules and ontologies [5]. To avoid the
overhead caused by the interaction of a rule reasoner and an ontology reasoner of tra-
ditional methods, the inline evaluation framework translates ontologies into rules [12,
8]. The techniques of this paper can be faithfully integrated into the inline evaluation
framework to efficiently evaluate DL-programs involving Horn-SHIQ ontologies.
References
 1. Acciarri, A., Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Palmieri, M., Rosati,
    R.: QuOnto: Querying ontologies. In: AAAI. pp. 1670–1671 (2005)
 2. Calı̀, A., Gottlob, G., Lukasiewicz, T.: Datalog± : a unified approach to ontologies and in-
    tegrity constraints. In: ICDT’09. pp. 14–30. ACM (2009)
 3. Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning
    and efficient query answering in description logics: The DL-Lite family. J. Autom. Reasoning
    39(3), 385–429 (2007)
 4. Chortaras, A., Trivela, D., Stamou, G.: Optimized query rewriting for OWL 2 QL. In:
    CADE’11. pp. 192–206. Springer-Verlag (2011)
 5. Eiter, T., Ianni, G., Lukasiewicz, T., Schindlauer, R., Tompits, H.: Combining answer set
    programming with description logics for the Semantic Web. Artificial Intelligence 172(12-
    13), 1495–1539 (2008)
 6. Eiter, T., Ortiz, M., Šimkus, M., Tran, T., Xiao, G.: Query rewriting for Horn-SHIQ plus
    rules. In: AAAI’12 (2012), (To appear)
 7. Eiter, T., Gottlob, G., Ortiz, M., Simkus, M.: Query answering in the description logic Horn-
    SHIQ. In: JELIA’08. pp. 166–179. Springer (2008)
 8. Eiter, T., Krennwallner, T., Schneider, P., Xiao, G.: Uniform evaluation of nonmonotonic
    DL-programs. In: FoIKS’12. LNCS, vol. 7153, pp. 1–22. Springer (March 2012)
 9. Eiter, T., Ortiz, M., Šimkus, M., Tran, T., Xiao, G.: Query rewriting for Horn-SHIQ
    plus rules. Tech. Rep. INFSYS RR-1843-12-04, TU Vienna (2012), http://www.kr.
    tuwien.ac.at/research/reports/rr1204.pdf
10. Gebser, M., Kaufmann, B., Kaminski, R., Ostrowski, M., Schaub, T., Schneider, M.T.:
    Potassco: The potsdam answer set solving collection. AI Commun. 24(2), 107–124 (2011)
11. Gottlob, G., Orsi, G., Pieris, A.: Ontological queries: Rewriting and optimization. In:
    ICDE’11. pp. 2 –13 (2011)
12. Heymans, S., Eiter, T., Xiao, G.: Tractable reasoning with DL-programs over Datalog-
    rewritable description logics. In: ECAI’10. pp. 35–40. IOS Press (2010)
13. Horridge, M., Bechhofer, S.: The OWL API: A java API for OWL ontologies. Semantic Web
    2(1), 11–21 (2011)
14. Hustadt, U., Motik, B., Sattler, U.: A decomposition rule for decision procedures by
    resolution-based calculi. In: LPAR’04. pp. 21–35. Springer (2004)
15. Hustadt, U., Motik, B., Sattler, U.: Reasoning in description logics by a reduction to disjunc-
    tive Datalog. J. Autom. Reasoning 39(3), 351–384 (2007)
16. Kazakov, Y.: Consequence-driven reasoning for Horn SHIQ ontologies. In: IJCAI’09. pp.
    2040–2045 (2009)
17. Krötzsch, M., Rudolph, S., Hitzler, P.: Complexity boundaries for Horn description logics.
    In: AAAI’07. pp. 452–457. AAAI Press (2007)
18. Leone, N., Pfeifer, G., Faber, W., Eiter, T., Gottlob, G., Perri, S., Scarcello, F.: The DLV
    system for knowledge representation and reasoning. ACM ToCL 7(3), 499–562 (2006)
19. Ma, L., Yang, Y., Qiu, Z., Xie, G.T., Pan, Y., Liu, S.: Towards a complete OWL ontology
    benchmark. In: ESWC’06. pp. 125–139. Springer (2006)
20. Ortiz, M., Rudolph, S., Simkus, M.: Worst-case optimal reasoning for the Horn-DL frag-
    ments of OWL 1 and 2. In: KR’10. AAAI Press (2010)
21. Ortiz, M., Rudolph, S., Simkus, M.: Query answering in the Horn fragments of the descrip-
    tion logics SHOIQ and SROIQ. In: IJCAI’11. pp. 1039–1044. IJCAI/AAAI (2011)
22. Pérez-Urbina, H., Motik, B., Horrocks, I.: Tractable query answering and rewriting under
    description logic constraints. J. Applied Logic 8(2), 186–209 (2010)
23. Pérez-Urbina, H., Motik, B., Horrocks, I.: A comparison of query rewriting techniques for
    DL-Lite. In: DL’09. CEUR-WS.org (2009)
24. Rosati, R.: On conjunctive query answering in EL. In: DL’07. CEUR-WS.org (2007)
25. Rosati, R., Almatelli, A.: Improving query answering over DL-Lite ontologies. In: KR’10.
    AAAI (2010)