<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Hybrid Query Answering Over Expressive DL Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giorgos Stoilos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgos Stamou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Checking (In)Completeness of Systems</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Electrical and Computer Engineering</institution>
          ,
          <addr-line>NTUA</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The design of tractable Description Logics, like E L [1], DL-Lite [2], and RL [5] have spurred the design and implementation of highly scalable systems such as OWLim, Oracle's Semantic Graph, and more. The attractive properties of the these systems have in many occasions led application developers to use them even in cases where the input ontology is expressed in the far more expressive SROIQ language. Although, in these cases there can be combinations of inputs for which these systems will fail to compute all certain answers, nevertheless, this notion of completeness is too coarse and these might still be able to compute the correct answers for many input queries. In the current paper, we report on the following problem: given a query Q containing only distinguished variables (i.e., a SPARQL query) over a SROIQ TBox T and a (scalable) system ans complete for some fragment L of SROIQ, is it possible to identify in an e cient way if ans is complete for Q; T ? We show that this is possible if one has pre-computed the set (U ) of atomic queries for which ans is known to be complete. Then, using these techniques we develop an approach to query answering which can decide at run-time whether Q can be evaluated using ans or a fully- edged SROIQ reasoner needs to be employed together with ans to compute the right answers. We have instantiated our framework using the well-known RL system OWLim and the SROIQ reasoner HermiT. Our experimental evaluation has shown that for most test queries over LUBM and UOBM we can correctly determine if OWLim is (in)complete in less than 50 millisecond, while there are only 3 queries for which our tool replied \unknown". Finally, our hybrid query answering system was able to compute all answers to the test queries within milliseconds. Compared to previous techniques that attempt to deliver scalable query answering over SROIQ TBoxes by using RL systems [11, 8], our approach has the advantage that the set U always exists regardless of the expressivity of T , hence it is readily applicable to arbitrary SROIQ ontologies. Moreover, the framework is highly modular|that is, it can be instantiated using any combination of systems and not only RL ones. This is an extended abstract of the paper [9].</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Consider the TBox T = fA v 9S; 9S v B; D v Bg and a system ans that is
complete for RL fragment of SHOIQ [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Then, the behaviour of ans can be
characterised by the TBox T jRL = f9S v B; D v Bg since ans discards axioms
with existentials in the RHS. Clearly, ans is complete for the atomic queries
Q1 = S(x; y) and Q2 = D(x) and, moreover, it is clearly complete for the query
Q = S(x; y) ^ D(x).
      </p>
      <p>The completeness of ans w.r.t. Q; T is rather expected given the fact that
Q is formed by atoms S(x; y) and D(x) and ans is complete for queries Q1 and
Q2 which correspond to these atoms. This suggests that given a set of atomic
queries over which ans is known to be complete, called query base (QB) U of ans
for T , then we can deduce the completeness of ans w.r.t. an arbitrary SPARQL
query Q by checking if for each of its atoms there is an isomorphic query in U .
Actually, as shown next, it su ces for an atom to \appear" in U or be \entailed"
by other atoms of Q that appear in U .</p>
      <p>Theorem 1. Let T be a SROIQ-TBox, let ans be a system complete for a DL
L which over T is characterised by the TBox T jL, let U be a QB of ans for T , and
let Q be a SPARQL query. Let also B be all atoms of Q for which an isomorphic
query in U exists. If each atom in Q is either in B or T jL j= V B ! , then
ans is (Q; T )-complete.</p>
      <p>
        However, the previous provides only a su cient condition for completeness
(cf. [9, Example 7]). Unfortunately, to devise both a su cient and necessary
condition one most likely needs to quantify over all minimal \representative"
ABoxes A such that ans would not compute all answers over T [ A and this set
can be in nite [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Hence, there are (perhaps) strong theoretical limitations in
devising such a condition.
      </p>
      <p>
        To alleviate this issue we have also devised a su cient (easy to check)
syntactic condition for concluding incompleteness. This condition is based on the
notion of reachability between the symbols of a TBox T [
        <xref ref-type="bibr" rid="ref10 ref7">10, 7</xref>
        ]. Note that Q
needs to additionally satisfy a certain consistency Property (]) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Theorem 2. Let LH be a Horn DL and let ans be a system whose behaviour
over T is characterised by T jLH . Moreover, let T ; U ; B be as in Theorem 1, and
let Q be a SPARQL CQ satisfying Property (]) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. If an atom of Q is not
Sig(B)-reachable in T jLH , then ans is (Q; T )-incomplete.
      </p>
      <p>Clearly, there can be CQs for which neither of the above theorems hold and
hence for which we cannot determine if ans is complete or not.</p>
      <p>
        Constructing QBs In Practice An interesting feature of QBs is that they
always exist (a system ans is either complete or not for an atomic query and for
a given TBox T there is only a nite number of them). However, a central issue
is how to construct the QB in an easy (i.e., automatic) way. By recent results [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
and developed software1 this is to a large extent possible and, although there are
some negative results regarding the techniques in [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], we strongly believe that
even in the theoretically problematic cases QBs can be computed e ectively (cf.
[9, Section 5]).
1 https://code.google.com/p/sygenia/
      </p>
    </sec>
    <sec id="sec-2">
      <title>Hybrid Query Answering</title>
      <p>
        The straightforward way to exploit our techniques is to use them at query time to
decide if a given query Q can be evaluated using some scalable system ans or we
need to resort to a SROIQ reasoner. However, in queries with only distinguished
variables we can provide an even more ne-grained approach. More precisely, Q
can be split into the part Qc for which ans is known to be complete (determined
using the previous techniques) and the part Qi that is not. Then, we can evaluate
Qc using ans, Qi using a SROIQ reasoner and nally join the results. Evan
more, ans can also be used to further improve the performance of the second
step as follows: rst, the evaluation of Qc using ans provides an upper bound
of the answers (since the contraints in the Qi part are missing) and, second, we
can also evaluate Qi using ans to quickly provide with a lower bound. These
two bounds are passed to the SROIQ reasoner in order to restrict the search
to only those individuals that are in the upper and not in the lower bound. For
the detailed algorithm the reader is referred to [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
4
      </p>
    </sec>
    <sec id="sec-3">
      <title>Evaluation</title>
      <p>We have instantiated our framework using the well-known scalable RL
system OWLim and the SROIQ reasoner HermiT. The implemented system is
called Hydrowl and is available at http://www.image.ece.ntua.gr/~gstoil/
hydrowl.</p>
      <p>First, at a pre-processing step, we used Hydrowl to compute a QB for OWLim
for ontologies LUBM and UOBM. For LUBM we required 14.5 seconds while for
UOBM we required 48.7 seconds (some manual editing was required for UOBM).
Second, we used Hydrowl to check if OWLim is complete for all the test queries
of LUBM and UOBM. When all the atoms of a test query appear in the set B
(cf. Theorem 1) our tool replies \complete" almost instantaneously (less than
5ms). However, even when for some atom we tested T jL j= V B ! , the
tool still required less than 50 milliseconds. For LUBM we were able to correctly
identify (in)completeness of OWLim for all except for two queries (Hydrowl
replied \unknown"), wihle for UOBM for all except just one. It is worth noting
that for the unknown cases OWLim is actually incomplete.</p>
      <p>
        Fnially, we used Hydrowl to evaluate all test queries of LUBM using 5
universities and of UOBM using 1 department and we have compared against the
HermiT-BGP system [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Table 1 presents the results (in seconds) for all the
interesting queries (for the rest both systems have similar response times). Grey
coloured columns mark those queries where Hydrowl uses both OWLim and
HermiT; in all other cases only OWLim is used. As can be seen, in all queries
Hydrowl is considerably faster than HermiT-BGP.
      </p>
      <p>Acknowledgements Giorgos Stoilos was funded by a Marie Curie Career
Reintegration Grant within EU's FP7 under grant agreement 303914.
Query</p>
      <p>LUBM
1 3 8
9</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Brandt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>Pushing the EL envelope</article-title>
          .
          <source>In Proceedings of the 19th International Joint Conference on Arti cial Intelligence (IJCAI-05)</source>
          , pages
          <fpage>364</fpage>
          {
          <fpage>369</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Tractable reasoning and e cient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>J. of Automated Reasoning</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ):
          <volume>385</volume>
          {
          <fpage>429</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Bernardo</given-names>
            <surname>Cuenca</surname>
          </string-name>
          <string-name>
            <surname>Grau</surname>
          </string-name>
          , Boris Motik, Giorgos Stoilos, and
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Completeness guarantees for incomplete ontology reasoners: Theory and practice</article-title>
          .
          <source>Journal of Arti cial Intelligence Research</source>
          ,
          <volume>43</volume>
          :
          <fpage>419</fpage>
          {
          <fpage>476</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Bernardo</given-names>
            <surname>Cuenca</surname>
          </string-name>
          <string-name>
            <surname>Grau</surname>
          </string-name>
          , Boris Motik, Giorgos Stoilos, and
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Computing datalog rewritings beyond horn ontologies</article-title>
          .
          <source>In Proceedings of the Twenty-Third International Joint Conference on Arti cial Intelligence (IJCAI</source>
          <year>2013</year>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Benjamin</surname>
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Grosof</surname>
            , Ian Horrocks, Raphael Volz, and
            <given-names>Stefan</given-names>
          </string-name>
          <string-name>
            <surname>Decker</surname>
          </string-name>
          .
          <article-title>Description logic programs: Combining logic programs with description logic</article-title>
          .
          <source>In Proceedings of the Twelfth International World Wide Web Conference (WWW</source>
          <year>2003</year>
          ), pages
          <fpage>48</fpage>
          {
          <fpage>57</fpage>
          . ACM,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Ilianna</given-names>
            <surname>Kollia</surname>
          </string-name>
          and
          <string-name>
            <given-names>Birte</given-names>
            <surname>Glimm</surname>
          </string-name>
          .
          <article-title>Optimizing sparql query answering over owl ontologies</article-title>
          .
          <source>Journal of Arti cial Intelligence Research (JAIR)</source>
          ,
          <volume>48</volume>
          :
          <fpage>253</fpage>
          {
          <fpage>303</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Riku</given-names>
            <surname>Nortje</surname>
          </string-name>
          , Katarina Britz, and Thomas Meyer.
          <article-title>Reachability modules for the description logic SRIQ</article-title>
          .
          <source>In Proceedings of the International Conference on Logic for Programming Arti cial Intelligence and Reasoning (LPAR 19)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Giorgos</given-names>
            <surname>Stoilos</surname>
          </string-name>
          .
          <article-title>Ontology-based data access using rewriting, OWL 2 RL systems and repairing</article-title>
          .
          <source>In Proceedings of the 11th European Semantic Web Conference (ESWC</source>
          <year>2014</year>
          ),
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Giorgos</given-names>
            <surname>Stoilos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Giorgos</given-names>
            <surname>Stamou</surname>
          </string-name>
          .
          <article-title>Hybrid query answering for owl ontologies</article-title>
          .
          <source>Proceedings of the 21st European Conference on Arti cial Intelligence (ECAI 14)</source>
          ,
          <year>2014</year>
          . available at: http://www.image.ece.ntua.gr/~gstoil/temp/main.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Boontawee</given-names>
            <surname>Suntisrivaraporn</surname>
          </string-name>
          .
          <article-title>Module extraction and incremental classi cation: A pragmatic approach for ontologies</article-title>
          .
          <source>In Proceedings of European Semantic Web Conference (ESWC</source>
          <year>2008</year>
          ), pages
          <fpage>230</fpage>
          {
          <fpage>244</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Yujiao</surname>
            <given-names>Zhou</given-names>
          </string-name>
          , Yavor Nenov, Bernardo Cuenca Grau, and
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Complete query answering over horn ontologies using a triple store</article-title>
          .
          <source>In Proceedings of the 12th International Semantic Web Conference (ISWC</source>
          <year>2013</year>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>