<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Polynomial Combined Rewritings for Linear Existential Rules and DL-Lite with n-ary Relations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georg Gottlob</string-name>
          <email>georg.gottlob@cs.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Manna</string-name>
          <email>manna@mat.unical.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Pieris</string-name>
          <email>pieris@dbai.tuwien.ac.at</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics and Computer Science, University of Calabria</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institute of Information Systems, Vienna University of Technology</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>This paper considers the setting of ontology-based query answering (OBQA). In this
setting, Description Logics (DLs) and existential rules (a.k.a. tuple-generating
dependencies or Datalog rules) are popular ontology languages, while conjunctive queries
(CQs) is the main querying tool. Among KR researchers there is a clear consensus that
the required level of scalability in OBQA can only be achieved via query rewriting,
which attempts to reduce OBQA into the problem of evaluating a query over a
relational database. Two query languages are usually considered: first-order queries (FO)
and non-recursive Datalog queries (NDL).</p>
      <p>
        An interesting approach to query rewriting is the polynomial combined approach [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
which can be described as follows: an ontology can be incorporated together with a
CQ q into a database query q in polynomial time, and also with a database D into
a database D in polynomial time, such that q over D yields the same result as q
evaluated over the knowledge base consisting of D and . The polynomial combined
approach has been applied to E LHd?r [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], an extension of the well-known DL E L, to
DL-LitehNorn [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], one of the most expressive logics of the DL-Lite family, and only
recently to the main guarded- and sticky-based classes of existential rules [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Research Challenges. The problem of applying the polynomial combined approach
to existing DLs and classes of existential rules is relatively understood. Nevertheless,
there are still basic open questions that deserve our attention. Regarding DLs, little is
known about formalisms with n-ary relations such as DLR-LiteR, that is, the
extension of DL-LiteR with n-ary roles. Regarding existential rules, it is open whether the
polynomial combined approach can be applied to the class of linear existential rules (or
simply linear rules), that is, assertions of the form 8X8Y(s(X; Y) ! 9Z p(X; Z)),
where s(X; Y) and p(X; Z) are atomic formulas [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        It is not difficult to show that, if linear rules are polynomially combined rewritable,
then also DLR-LiteR is polynomially combined rewritable — this follows from the
fact that query answering under DLR-LiteR can be easily reduced to query answering
under linear rules [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Thus, the key question that we need to answer, which has been
explicitly stated as an open problem in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], is the following:
p(z1,a,z2)
p(z1,z3,a)
p(z3,a,z1)
p(a,b,c)
p(a,z1,b)
∃A∃B∃C∃D (p(A,a,B) ∧ p(C,B,b) ∧ p(D,c,b))
p(b,z4,z1)
p(z4,z1,b)
(a)
p(b,c,d)
p(b,z5,c)
p(z5,c,b)
z3
z1
      </p>
      <p>z5
(b)
z4</p>
      <p>Given a (Boolean) CQ q, a database D, and a set of linear rules, can we rewrite
in polynomial time: (i) q and , independently of D, into a Q-query q , where
Q 2 fFO; NDLg, and (ii) D and , independently of q, into a database D , such
that (D [ ) j= q iff D j= q ?</p>
      <p>
        The answer to the above question is affirmative under the assumption that the arity
of the underlying schema is bounded; implicit in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, little is known for
arbitrary linear rules, without any assumption on the underlying schema. We give a positive
answer even for linear rules that use predicates of unbounded arity. For more details,
we refer the reader to [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>Proof Generator</title>
      <p>
        We assume the reader is familiar with the chase procedure. Recall that the chase for
a database D and a set of rules, denoted chase(D; ), is a universal model of D
and , and thus (D [ ) j= q iff chase(D; ) j= q, for each CQ q. The instance
chase(D; ) can be naturally seen as a directed labeled graph, called chase graph,
denoted CG (D; ). It is also easy to verify that for linear rules, CG (D; ) is a directed
forest; for details on the chase, see, e.g., [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Our main technical tool is called proof
generator, and it formalizes the intuitive idea that (Boolean) CQ answering under linear
rules can be realized as a reachability problem on the chase graph. Let us illustrate the
key ideas underlying the proof generator via a simple example.
      </p>
      <p>Example 1. Let D = fp(a; b; c); p(b; c; d)g, and let
brevity, the universal quantifiers are omitted):
be the set of linear rules (for
p(X; Y; Z) ! 9W p(X; W; Y )
p(X; Y; Z) ! 9W p(Y; X; W )
p(X; Y; Z) ! 9W p(Z; W; Y )
p(X; Y; Z) ! p(Y; Z; X):
Given the BCQ q = 9A9B9C9D (p(A; a; B) ^ p(C; B; b) ^ p(D; c; b)), as shown in
Figure 1(a), there exists a homomorphism h (dashed arrows in the figure) that maps q to
an initial segment of the chase graph of D and , and thus (D [ ) j= q. It is interesting
to observe that the nulls occurring in h(q), i.e., z1, z3, z4 and z5, form a rooted forest
F , depicted in Figure 1(b), with the following properties; for brevity, let (z) be the
atom in CG (D; ) where the null z is invented (see shaded atoms in Figure 1(a) for
(z1), (z3), (z4) and (z5)): (i) for every root node z, (z) is reachable from D; (ii)
for every edge (z; w), (w) is reachable from (z); and (iii) for every atom a 2 h(q),
there exists a unique path in F that contains all the nulls in a, and, assuming that
the leaf node of is z, a is reachable from (z). Indeed, it is easy to verify that (z1)
and (z5) are reachable from D, (z3) and (z4) are reachable from (z1), and finally,
h(p(A; a; B)) = p(z3; a; z1) is reachable from (z3), h(p(C; B; b)) = p(z4; z1; b) is
reachable from (z4), and h(p(D; c; b)) = p(z5; c; b) is reachable from (z5).</p>
      <p>Roughly speaking, the triple consisting of: (1) the homomorphism h, that maps q to
the chase; (2) the function , that gives the atoms in the chase where the nulls occurring
in h(q) were invented; and (3) the forest F , that encodes how the nulls of h(q) depend
on each other, as well as the order of their generation, is what we call a proof generator
for q w.r.t. D and . The existence of such a triple, allows us to generate the part of
the chase due to which the query is entailed, i.e., the proof of the query (hence the
name “proof generator”). Therefore, for query answering purposes under linear rules,
we simply need to check if such a proof generator exists.</p>
      <p>Lemma 1. (D [</p>
      <p>) j= q iff there exists a proof generator for q w.r.t. D and .
3</p>
    </sec>
    <sec id="sec-3">
      <title>The Combined Rewriting</title>
      <p>We give a positive answer to our research question regarding linear rules and the
polynomial combined approach. More precisely, we show that:
Theorem 1. The class of linear rules is polynomially combined Q-rewritable, where
Q 2 fFO; NDLg.</p>
      <p>To establish the above theorem, we heavily rely on the notion of the proof generator.
Fix a (Boolean) CQ q, a database D, and a set of linear rules. By Lemma 1, it
suffices to construct in polynomial time a query q 2 Q (independently of D), and
a database D (independently of q), such that D j= q iff a proof generator for q
w.r.t. D and exists. Roughly, the query q guesses a triple (h; ; F ) (as described in
Example 1), and then verifies that the guessed triple is a proof generator for q w.r.t. D
and . The interesting part of q is the component that applies the crucial reachability
checks required by the definition of the proof generator. Although the path among two
atoms in the chase graph may be of exponential size, its existence can be checked via
Q-queries of polynomial size. An immediate consequence of Theorem 1 is that:
Corollary 1. The description logic DLR-LiteR is polynomially combined Q-rewritable,
where Q 2 fFO; NDLg.</p>
      <p>The results of this work are, for the moment, of theoretical nature and we do not
claim that they will directly lead to better practical algorithms. We believe that a smart
implementation of the proposed techniques can lead to more efficient rewriting
procedures; this will be the subject of future research. We are also planning to optimize the
proposed rewriting algorithm, with the aim of constructing queries of optimal size.</p>
      <p>Acknowledgements. G. Gottlob was supported by the EPSRC Programme Grant
EP/M025268/ “VADA: Value Added Data Systems – Principles and Architecture”. M.
Manna was supported by the MIUR project “SI-LAB BA2KNOW – Business
Analitycs to Know”, and by Regione Calabria, programme POR Calabria FESR 2007-2013,
projects “ITravel PLUS” and “KnowRex: Un sistema per il riconoscimento e l’estrazione
di conoscenza”. A. Pieris was supported by the Austrian Science Fund (FWF):
P25207N23 and Y698.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Cal`ı,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Gottlob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Lukasiewicz</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>A general Datalog-based framework for tractable query answering over ontologies</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>14</volume>
          ,
          <fpage>57</fpage>
          -
          <lpage>83</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kikot</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Podolskii</surname>
            ,
            <given-names>V.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwentick</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The price of query rewriting in ontology-based data access</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>213</volume>
          ,
          <fpage>42</fpage>
          -
          <lpage>59</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manna</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pieris</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Polynomial combined rewritings for existential rules</article-title>
          .
          <source>In: KR</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manna</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pieris</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Polynomial rewritings for linear existential rules</article-title>
          .
          <source>In: IJCAI</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The combined approach to query answering in DL-Lite</article-title>
          . In: KR (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The combined approach to ontology-based data access</article-title>
          .
          <source>In: IJCAI</source>
          . pp.
          <fpage>2656</fpage>
          -
          <lpage>2661</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Conjunctive query answering in the description logic EL using a relational database system</article-title>
          .
          <source>In: IJCAI</source>
          . pp.
          <fpage>2070</fpage>
          -
          <lpage>2075</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>