<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OBDA Using RL Reasoners and Repairing?</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Electrical and Computer Engineering National Technical University of Athens</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Rewriting the input TBox T (and query Q) into a datalog program, called T rewriting ((Q,T )-rewriting ), is a prominent approach to ontology-based data access [1]. It was used as early as the KAON2 system [5] and it currently consists of (perhaps) the standard approach to answering queries over ontologies expressed in the languages DL-Lite [2, 12] and E LHI [8, 16]. Apart from computing a rewriting, a problem that has been proven very di cult is how to subsequently (e ectively) evaluate it over the data. In many cases, due to its size and/or complexity, additional techniques need to be devised [11, 9, 10]. Unfortunately, all of them have so far been applied only on DL-Lite while for more expressive languages, apart from the work in [3], to the best of our knowledge, no other integrated query answering system has been reported or evaluated with large and complex ontologies. An interesting observation is that T -rewritings are usually of a particular form|that is, they can be translated back into an RL TBox [14].1 Hence, scalable RL system, like OWLim, can be used to evaluate them over the data. More precisely, it was shown how, given a SROIQ TBox T and an RL system ans, a rewriting of T can be used to compute a set of axioms R (called repair ) such that for every CQ Q with only distinguished variables (i.e., SPARQL CQs) and every ABox A we have cert(Q; T [ A) ans(Q; T [ R [ A). That is, after repairing, ans is indistinguishable from a SROIQ reasoner w.r.t. SPARQL CQs. Although, the experimental results in [14] provided with encouraging results, these were quite preliminary. First, the implementation used an arguably obsolete system (Requiem) and had no optimisations. Second, the evaluation was based on LUBM (an arguably trivial TBox) and a small fragment of Galen. Hence, it was unclear whether repairing can be applied to large and complex TBoxes. Third, the approach could only handle SPARQL CQs. In the current paper we attempt to extensively study repairing as an approach to OBDA over expressive TBoxes. First, we propose several optimisations and re nements to the rst prototype in order to handle large and complex TBoxes. Second, we show how arbitrary queries can be supported. Third, we perform an extensive experimental evaluation which showed that we can handle very large and complex TBoxes (e.g., Galen, GO, Fly) and answer arbitrary CQs over one of them (Fly) within milliseconds. This is an extended abstract of paper [13].</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Computing Repairs E ciently</title>
      <p>Let T be a TBox and let ans be an RL system. Roughly speaking, a repair R
of ans for T is computed by rst computing a T -rewriting (Rew) for T , then
removing from Rew the parts that ans can already handle, and nally, minimising
the resulting set. For example, if T = fA v 9R; 9R v Bg, then Rew = fA(x) !
B(x); R(x; y) ! B(x)g is a T -rewriting of T , while R = fA v Bg is the desirable
repair for ans. We call Repair(T ) the overall procedure.</p>
      <p>
        Compared to [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] we have performed two important re nements. First,
according to [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] a repair R can only be computed by T -rewritings Rew such
that T j= Rew. This is quite restrictive as it prohibits the use of many e cient
rewriting algorithms which normalise the input TBox T into T 0 by introducing
fresh symbols. To be able to use such approaches to compute repairs we notice
that, for T 0 the normalised TBox used internally by the rewriting system to
compute Rew, we have cert(Q; T [ A) ans(Q; T 0 [ Repair(T 0) [ A). Second,
the minimisations steps applied by Repair are based on expensive FOR-loops
over the initial T -rewriting in which HermiT is invoked. Despite how optimised
a SROIQ reasoner is, in large ontologies, these entailment checks could simply
be too many. To improve the behaviour we have interveaned in the internals of
HermiT to implement a form of incremental entailment checking. First, we
exhaustively apply the calculus over T to construct an initial model and we mark
the completion of the execution. Then, to check T j= we resume the execution
from the previous point and after completion we backtrack to the marked point.
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Supporting non-SPARQL Queries</title>
      <p>Let Q be an arbitrary CQ with query predicate Q, let T be a TBox, and let
RewD ] RewQ be a (Q; T )-rewriting (RewD is a datalog program not mentioning
Q and RewQ a UCQ that mentions Q). Since RewD does not mention Q it
captures only ground entialments of T over some A. But, after repairing, ans
is complete w.r.t. all ground entailments for any A; hence cert(Q; T [ A)
ans(RewQ; T [ Repair(T ) [ A). The following summarises our approach:
1. Compute a repair R of T for ans using procedure Repair.
2. Load the dataset A, the input TBox T , and the repair R to ans.
3. For a CQ Q, if Q is SPARQL then evaluated it over ans; otherwise compute
a (Q; T )-rewriting RewD ] RewQ and evaluate RewQ over ans.</p>
      <p>Note that most steps, i.e., steps 1 and 2, can be done only once as a pre-processing
(changes in A can be handled incrementally).
4</p>
    </sec>
    <sec id="sec-4">
      <title>Implementation and Evaluation</title>
      <p>
        We have implemented a prototype ontology repair and query answering tool
called Hydrowl.2 The tool uses Rapid [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] to compute T -rewritings and
HermiT [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and OWLim [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] to minimise it into the nal repair.
2 http://www.image.ece.ntua.gr/~gstoil/hydrowl/
      </p>
      <p>
        Using Hydrowl we managed to compute repairs for 151 out of the 152
ontologies of our dataset, which contained some large and highly complex ones. In the
vast majority of cases a repair could be computed in less than a second, only a
handfull required up to a few minutes, and only the very large ones several
minutes; Table 1 presents results for the latter. Despite their size and complexity we
see that we can compute repairs for them in a reasonable amount of time (recall
this is usually done only once). Actually, if we discard a very expensive
minimisation step, then we can compute some (non-minimal) repair very e ciently while
its size is not considerably larger than the minimal one (see Table 1 numbers in
parenthesis). The system in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] could not handle any of these ontologies.
      </p>
      <p>Next, Table 2 presents loading experiments to OWLim using UOBM (1 to
20 universities) and Fly (ABox multiplied up to 5 times). As can be seen, the
overhead introduced by repairing is signi cantly noticeable only in Fly, mostly
due to the size of the computed repair (R), however, recall that loading is usually
performed once. In Fly we have also loaded the non-minimal repair (R ) and
as it turns out there is no signi cant di erence compared to the minimal one.</p>
      <p>The Fly ontology comes with 4 non-SPARQL queries. Table 3 presents the
results of our technique from Section 3. As we can see in most cases we were
able to compute and evaluate a rewriting almost instantaneously. This greatly
outperformes previous approaches on Fly [17, 18] which require several seconds
per query. The good behaviour of our approach can be attributed to the fact
that most hard work is pushed to a pre-processing step while RewQ, the only
thing computed on-line, is usually small and of simple structure.
17. Yujiao Zhou, Bernardo Cuenca Grau, Ian Horrocks, Zhe Wu, and Jay Banerjee.</p>
      <p>Making the most of your triple store: Query answering in OWL 2 using an RL
reasoner. In Proc, WWW 2013, pages 1569{1580, 2013.
18. Yujiao Zhou, Yavor Nenov, Bernardo Cuenca Grau, and Ian Horrocks. Complete
query answering over horn ontologies using a triple store. In Proc. of the 12th
International Semantic Web Conference (ISWC). Springer LNCS, 2013.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Diego Calvanese De Giacomo Giuseppe Antonella Poggi</surname>
            , Domenico Lembo, Maurizio Lenzerini, and
            <given-names>Riccardo</given-names>
          </string-name>
          <string-name>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Linking data to ontologies</article-title>
          .
          <source>Journal on Data Semantics</source>
          , X:
          <volume>133</volume>
          {
          <fpage>173</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Tractable reasoning and e cient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ):
          <volume>385</volume>
          {
          <fpage>429</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Eiter</surname>
          </string-name>
          , Magdalena Ortiz, Mantas Simkus,
          <string-name>
            <surname>Trung-Kien Tran</surname>
            , and
            <given-names>Guohui</given-names>
          </string-name>
          <string-name>
            <surname>Xiao</surname>
          </string-name>
          .
          <article-title>Query rewriting for Horn-SHIQ plus rules</article-title>
          .
          <source>In Proc. of AAAI</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Benjamin</surname>
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Grosof</surname>
            , Ian Horrocks, Raphael Volz, and
            <given-names>Stefan</given-names>
          </string-name>
          <string-name>
            <surname>Decker</surname>
          </string-name>
          .
          <article-title>Description logic programs: Combining logic programs with description logic</article-title>
          .
          <source>In Proceedings of the Twelfth International World Wide Web Conference (WWW</source>
          <year>2003</year>
          ), pages
          <fpage>48</fpage>
          {
          <fpage>57</fpage>
          . ACM,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Ullrich</given-names>
            <surname>Hustadt</surname>
          </string-name>
          , Boris Motik, and
          <string-name>
            <given-names>Ulrike</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Deciding expressive description logics in the framework of resolution</article-title>
          .
          <source>Information &amp; Computation</source>
          ,
          <volume>206</volume>
          (
          <issue>5</issue>
          ):
          <volume>579</volume>
          {
          <fpage>601</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Atanas</given-names>
            <surname>Kiryakov</surname>
          </string-name>
          , Barry Bishoa, Damyan Ognyano , Ivan Peikov, Zdravko Tashev, and
          <string-name>
            <given-names>Ruslan</given-names>
            <surname>Velkov</surname>
          </string-name>
          .
          <article-title>The Features of BigOWLIM that Enabled the BBCs World Cup Website</article-title>
          .
          <source>In Workshop on Semantic Data Management (SemData)</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Boris</given-names>
            <surname>Motik</surname>
          </string-name>
          , Rob Shearer, and
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Hypertableau Reasoning for Description Logics</article-title>
          .
          <source>Journal of Arti cial Intelligence Research</source>
          ,
          <volume>36</volume>
          :
          <fpage>165</fpage>
          {
          <fpage>228</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Hector</given-names>
            <surname>Perez-Urbina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Boris</given-names>
            <surname>Motik</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Tractable query answering and rewriting under description logic constraints</article-title>
          .
          <source>Journal of Applied Logic</source>
          ,
          <volume>8</volume>
          (
          <issue>2</issue>
          ):
          <volume>186</volume>
          {
          <fpage>209</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Mariano</surname>
          </string-name>
          Rodriguez-Muro and
          <article-title>Diego Calvanese. High performance query answering over DL-Lite ontologies</article-title>
          .
          <source>In Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2012</year>
          ),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Mariano Rodr</surname>
            guez-Muro,
            <given-names>Roman</given-names>
          </string-name>
          <string-name>
            <surname>Kontchakov</surname>
            , and
            <given-names>Michael</given-names>
          </string-name>
          <string-name>
            <surname>Zakharyaschev</surname>
          </string-name>
          .
          <article-title>Ontology-based data access: Ontop of databases</article-title>
          .
          <source>In Proceedings of the International Semantic Web Conference (ISWC</source>
          <year>2013</year>
          ), pages
          <fpage>558</fpage>
          {
          <fpage>573</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Rosati</surname>
          </string-name>
          . Prexto:
          <article-title>Query rewriting under extensional constraints in dl-lite</article-title>
          .
          <source>In 9th Extended Semantic Web Conference (ESWC</source>
          <year>2012</year>
          ), pages
          <fpage>360</fpage>
          {
          <fpage>374</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Rosati</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Almatelli</surname>
          </string-name>
          .
          <article-title>Improving query answering over DLLite ontologies</article-title>
          .
          <source>In Proceedings of the 12th International Conference on the Principles of Knowledge Representation and Reasoning (KR-10)</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Giorgos</given-names>
            <surname>Stoilos</surname>
          </string-name>
          .
          <article-title>Ontology-based data access using rewriting, OWL 2 RL systems and repairing</article-title>
          .
          <source>In Proceedings of the 11th European Semantic Web Conference (ESWC</source>
          <year>2014</year>
          ),
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Giorgos</surname>
            <given-names>Stoilos</given-names>
          </string-name>
          , Bernardo Cuenca Grau, Boris Motik, and
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Repairing ontologies for incomplete reasoners</article-title>
          .
          <source>In Proceedings of the 10th International Semantic Web Conference (ISWC-11)</source>
          , Bonn, Germany, pages
          <volume>681</volume>
          {
          <fpage>696</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Depoina</surname>
            <given-names>Trivela</given-names>
          </string-name>
          , Giorgos Stoilos, Alexandros Chortaras, and
          <string-name>
            <given-names>Giorgos</given-names>
            <surname>Stamou</surname>
          </string-name>
          .
          <article-title>Optimising resolution-based rewriting algorithms for DL ontologies</article-title>
          .
          <source>In Proceedings of the 26th Workshop on Description Logics (DL</source>
          <year>2013</year>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Despoina</surname>
            <given-names>Trivela</given-names>
          </string-name>
          , Giorgos Stoilos, Alexandros Chortaras, and
          <string-name>
            <given-names>Giorgos</given-names>
            <surname>Stamou</surname>
          </string-name>
          .
          <article-title>Optimising resolution-based rewriting algorithms for dl ontologies</article-title>
          .
          <source>In Proceedings of the 26th Workshop on Description Logics (DL</source>
          <year>2013</year>
          ), Ulm, Germany,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>