<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>FO-Rewritability of Expressive Ontology-Mediated Queries</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cristina Feier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antti Kuusisto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carsten Lutz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fachbereich Informatik, Universitat Bremen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We show that FO-rewritability of OMQs based on (extensions of) ALC and unions of conjunctive queries (UCQs) is decidable and 2NExpTime-complete. Previously, decidability was only known for atomic queries (AQs). On the way, we establish the same results also for monotone monadic SNP without inequality (MMSNP) and for monadic disjunctive Datalog (MDDLog). We also analyze the shape of FO-rewritings, thus making a step towards their actual computation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Query rewriting is a widely used technique for e ciently answering
ontologymediated queries (OMQs) using o -the-shelf database systems. In particular, if
an OMQ is rewritable into a rst-order (FO) query, then it can be answered
using a relational database system. It is thus a fundamental problem to design
algorithms that compute an FO-rewriting of a given OMQ. For OMQs based
on expressive DLs such as ALC, nding complete such algorithms turns out to
be a technically very challenging problem; moreover, the desired rewritings are
not always guaranteed to exist. A natural rst step is thus to nd an algorithm
that decides the existence of a rewriting; in fact, any complete and terminating
algorithm for computing rewritings will implicitly also solve the decision problem
and one can expect to learn important lessons already from the latter case.</p>
      <p>
        An OMQ is a triple Q = (T ; ; q) with T a TBox, an ABox signature, and
q a query [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We use (L; Q) to denote the OMQ language that consists of all OMQs
where T is formulated in the DL L and q in the query language Q. It has been
shown in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] that decidability results and tight NExpTime complexity bounds
for FO-rewritability in (ALC; AQ) and related OMQ languages can be obtained
by translating the input OMQ Q into a constraint satisfaction problem (CSP)
whose complement is equivalent to Q and then applying known algorithms that
decide FO-rewritability of the CSP [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. When atomic queries (AQs) are replaced
with conjunctive queries (CQs) or unions thereof (UCQs), such
equivalencepreserving translation to CSP is no longer possible; instead and as also shown
in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], one can translate Q into a formula ' of the strictly more expressive logical
generalization MMSNP of CSP such that :' is equivalent to Q [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In contrast
to the CSP case, though, it was not known whether FO-rewritability of MMSNP
formulas is decidable. In this abstract, we show that this is the case and that
the complexity is 2NExpTime-complete, with the lower bounds coming from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
We then lift this result to FO-rewritability of OMQs formulated in any OMQ
language between (ALCI; UCQ) and (SHI; UCQ). Technically, our approach
consists in a reduction of the MMSNP case to the CSP case. We also analyze
the shape of the rewritings, which we hope will provide guidance for nding
algorithms that compute actual rewritings. One can prove analogous results
for rewritability into monadic Datalog in a very similar way. We do not report
about details here for lack of space. We are currently working on the unrestricted
Datalog case, which is more challenging.
      </p>
      <p>
        Related work. FO-rewritability was rst studied in an OMQ context for
the inexpressive DL-Lite family of DLs [
        <xref ref-type="bibr" rid="ref1 ref14 ref17 ref9">1, 9, 14, 17</xref>
        ]. FO-rewritability in OMQ
languages based on more expressive Horn DLs has been investigated in [
        <xref ref-type="bibr" rid="ref12 ref3 ref4">3, 4, 12</xref>
        ].
Rewritability of Horn DLs into Datalog was considered in [10, 13, 18{20].
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>Results</title>
      <p>
        Instead of working with MMSNP, we prefer monadic disjunctive Datalog, MDDLog.
It was observed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] that these have the same expressive power up to
complementation and can be mutually translated in polynomial time. We refer to [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
for full de nitions and notation. An MDDLog program is Boolean if it only
returns yes/no answers. The diameter of is the maximum number of variables
occuring in a rule in . A generalized CSP is de ned by a set of templates S
instead of a single template and asks for a homomorphism from the input I to
at least one template T 2 S (a template is simply a nite relational structure).
With coCSP, we mean the complement of a (potentially generalized) CSP. The
girth of a structure is the length of a smallest cycle in it and 1 if there is no
cycle, see e.g. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for details.
      </p>
      <p>
        We start with Boolean MDDLog programs. Our proofs make use of the
classical translation of MMSNP sentences into generalized CSPs rst described
by Feder and Vardi and stated in the following in terms of MDDLog.
Theorem 1 ([
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]). Given a Boolean MDDLog program over EDB schema
SE of diameter k, one can e ectively construct a set of templates S over a
di erent EDB schema S0E such that
1. every nite SE -structure I can be converted in polytime into an S0E -structure
      </p>
      <p>I0 such that I j= i I0 2= CSP(S );
2. every nite S0E -structure I0 of girth exceeding k can be converted in polytime
into an SE -instance I such that I j= i I0 2= CSP(S ).</p>
      <p>
        Exact size bounds on the set S and its elements are given in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; although they
are needed for our nal result, we omit them from this abstract for brevity.
      </p>
      <p>We next observe that the reduction described in Theorem 1 preserves
FOrewritability, up to a certain gap related to inputs of small girth. In the following,
we will consider UCQ-rewritability instead of FO-rewritability since for MDDLog,
coCSP, and all other formalisms considered here FO-rewritability implies
UCQrewritability.</p>
      <p>Proposition 1. Let be a Boolean MDDLog program of diameter k and let
S be as in Theorem 1. Then
1. every UCQ-rewriting of coCSP(S ) can be converted into a UCQ-rewriting
of in polynomial time;
2. every UCQ-rewriting of can e ectively be converted into a UCQ-rewriting
of coCSP(S ) on structures of girth exceeding k.</p>
      <p>Thus, FO-rewritability of is equivalent to FO-rewritability of coCSP(S ) on
structures of su ciently high girth. The `high girth' quali cation is removed by
the following observation.</p>
      <p>Lemma 1. Let S be a set of templates over schema SE and g 0. Then, if
coCSP(S) is UCQ-de nable on nite SE -structures of girth exceeding g, it is
UCQ-de nable on (unrestricted) nite SE -structures.</p>
      <p>
        Analyzing the involved blowups and exploiting that FO-rewritability of generalized
CSPs can be decided in NP [
        <xref ref-type="bibr" rid="ref15 ref5">5, 15</xref>
        ] and that FO-rewritability of Boolean MDDLog
programs is 2NExpTime-hard [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], we obtain the following.
      </p>
      <p>Theorem 2. FO-rewritability of Boolean MDDLog programs is decidable in
2 NExpTime, thus 2 NExpTime-complete.</p>
      <p>
        The limitation to Boolean programs can be lifted by a simple reduction which
replaces answer variables with fresh monadic relation symbols. Moreover, it
was shown in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] that OMQs formulated in (SHI; UCQ) can be translated into
MDDLog with a blowup that is double exponential, but does not add up with
the blowup caused by the translation of MDDLog programs to generalized CSPs.
Finally, [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] also establishes a 2NExpTime lower bound for FO-rewritability in
(ALCI; CQ) and (ALC; UCQ). We can thus extend Theorem 2 as follows.
Theorem 3. FO-rewritability is 2 NExpTime-complete in MMSNP, in MDDLog,
and in OMQ languages between (ALC; UCQ) and (SHI; UCQ) as well as between
(ALCI; CQ) and (SHI; UCQ) (without transitive roles in UCQs).
We now consider the shape of rewritings. For brevity, we concentrate on Boolean
queries. An analysis of the proof of Proposition 1 yields the following.
Proposition 2. Every FO-rewritable Boolean MDDLog program of diameter k
has a rewriting of the form q1 _ _ qn with each qi a CQ of treewidth bounded by
(1; k). The same holds for Boolean OMQs from the OMQ languages in Theorem 3.
See [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for a de nition of treewidth (1; k). We believe that Proposition 2 is
interesting for at least two reasons. Firstly, it says that when constructing
rewritings, it is enough to look for a structurally very restricted type of UCQ
instead of for an unrestricted FO formula. And secondly because it clari es the
shape of obstructions of FO-rewritable MMSNP sentences in the spirit of CSP
obstructions [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In particular, it is interesting to contrast Proposition 2 with the
fact that if a CSP is FO-rewritable, then it has a nite set of obstructions that
are nite trees, that is, its complement is rewritable into a UCQ that consists of
tree-shaped CQs [
        <xref ref-type="bibr" rid="ref16 ref2">2, 16</xref>
        ].
      </p>
      <p>Acknowledgements. We thank Manuel Bodirsky and Florent Madelaine for
inspiring discussions and the ERC for funding this work under grant 647289.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Artale</surname>
          </string-name>
          , Diego Calvanese, Roman Kontchakov, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          .
          <article-title>The DL-Lite family and relations</article-title>
          .
          <source>J. of Art. Int. Res. (JAIR)</source>
          ,
          <volume>36</volume>
          :1{
          <fpage>69</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Albert</given-names>
            <surname>Atserias</surname>
          </string-name>
          .
          <article-title>On digraph coloring problems and treewidth duality</article-title>
          .
          <source>Eur. J. Comb.</source>
          ,
          <volume>29</volume>
          (
          <issue>4</issue>
          ):
          <volume>796</volume>
          {
          <fpage>820</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Meghyn</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          , Carsten Lutz, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Deciding FO-rewritability in EL</article-title>
          .
          <source>In Proc. of DL</source>
          , pages
          <volume>70</volume>
          {
          <fpage>80</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Meghyn</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          , Carsten Lutz, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>First-order rewritability of atomic queries in Horn description logics</article-title>
          .
          <source>In Proc. of IJCAI</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Meghyn</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          , Balder ten Cate, Carsten Lutz, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Ontologybased data access: A study through disjunctive Datalog, CSP, and MMSNP</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>39</volume>
          (
          <issue>4</issue>
          ):
          <volume>33</volume>
          :1{
          <fpage>33</fpage>
          :
          <fpage>44</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Manuel</given-names>
            <surname>Bodirsky</surname>
          </string-name>
          and
          <article-title>V ctor Dalmau</article-title>
          .
          <article-title>Datalog and constraint satisfaction with in nite templates</article-title>
          .
          <source>J. Comput. Syst. Sci.</source>
          ,
          <volume>79</volume>
          (
          <issue>1</issue>
          ):
          <volume>79</volume>
          {
          <fpage>100</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Bourhis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>Containment in monadic disjunctive Datalog, MMSNP, and expressive description logics</article-title>
          .
          <source>In Proc. of KR</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Bulatov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krokhin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Larose</surname>
          </string-name>
          .
          <article-title>Dualities for constraint satisfaction problems</article-title>
          .
          <source>Complexity of Constraints</source>
          ,
          <volume>5250</volume>
          LNCS:
          <volume>93</volume>
          {
          <fpage>124</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Tractable reasoning and e cient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>J. Autom. Reasoning</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ):
          <volume>385</volume>
          {
          <fpage>429</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Thomas</surname>
            <given-names>Eiter</given-names>
          </string-name>
          , Magdalena Ortiz, Mantas Simkus,
          <string-name>
            <surname>Trung-Kien Tran</surname>
            , and
            <given-names>Guohui</given-names>
          </string-name>
          <string-name>
            <surname>Xiao</surname>
          </string-name>
          .
          <article-title>Query rewriting for Horn-SHIQ plus rules</article-title>
          .
          <source>In Proc. of AAAI</source>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Feder</surname>
          </string-name>
          and
          <string-name>
            <given-names>Moshe Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>The computational structure of monotone monadic SNP and constraint satisfaction: A study through datalog and group theory</article-title>
          .
          <source>SIAM J. Comput.</source>
          ,
          <volume>28</volume>
          (
          <issue>1</issue>
          ):
          <volume>57</volume>
          {
          <fpage>104</fpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Peter</surname>
            <given-names>Hansen</given-names>
          </string-name>
          ,
          <article-title>Carsten Lutz, I_nanc Seylan, and</article-title>
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>E cient query rewriting in the description logic el and beyond</article-title>
          .
          <source>In Proc. of IJCAI</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mark</surname>
            <given-names>Kaminski</given-names>
          </string-name>
          , Yavor Nenov, and Bernardo Cuenca Grau.
          <article-title>Computing datalog rewritings for disjunctive datalog programs and description logic ontologies</article-title>
          .
          <source>In Proc. of RR</source>
          , pages
          <volume>76</volume>
          {
          <fpage>91</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Stanislav</surname>
            <given-names>Kikot</given-names>
          </string-name>
          , Roman Kontchakov,
          <string-name>
            <surname>Vladimir</surname>
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Podolskii</surname>
            , and
            <given-names>Michael</given-names>
          </string-name>
          <string-name>
            <surname>Zakharyaschev</surname>
          </string-name>
          .
          <article-title>Exponential lower bounds and separation for query rewriting</article-title>
          .
          <source>In Proc. of ICALP (2)</source>
          , pages
          <fpage>263</fpage>
          {
          <fpage>274</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Benoit</surname>
            <given-names>Larose</given-names>
          </string-name>
          , Cynthia Loten, and
          <string-name>
            <given-names>Claude</given-names>
            <surname>Tardif</surname>
          </string-name>
          .
          <article-title>A characterisation of rst-order constraint satisfaction problems</article-title>
          .
          <source>Logical Methods in Comp. Sci.</source>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Jaroslav</given-names>
            <surname>Nesetril</surname>
          </string-name>
          and
          <string-name>
            <given-names>Claude</given-names>
            <surname>Tardif</surname>
          </string-name>
          .
          <article-title>Duality theorems for nite structures (characterising gaps and good characterisations)</article-title>
          .
          <source>J. Comb. Theory, Ser. B</source>
          ,
          <volume>80</volume>
          (
          <issue>1</issue>
          ):
          <volume>80</volume>
          {
          <fpage>97</fpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Hector</surname>
            Perez-Urbina,
            <given-names>Boris</given-names>
          </string-name>
          <string-name>
            <surname>Motik</surname>
            , and
            <given-names>Ian</given-names>
          </string-name>
          <string-name>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>A comparison of query rewriting techniques for DL-Lite</article-title>
          .
          <source>In Proc. of DL</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Hector</surname>
            Perez-Urbina,
            <given-names>Boris</given-names>
          </string-name>
          <string-name>
            <surname>Motik</surname>
            , and
            <given-names>Ian</given-names>
          </string-name>
          <string-name>
            <surname>Horrocks</surname>
          </string-name>
          .
          <article-title>Tractable query answering and rewriting under description logic constraints</article-title>
          .
          <source>J. of Applied Logic</source>
          ,
          <volume>8</volume>
          (
          <issue>2</issue>
          ):
          <volume>186</volume>
          {
          <fpage>209</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>On conjunctive query answering in EL</article-title>
          .
          <source>In Proc. of DL</source>
          , pages
          <volume>451</volume>
          {
          <fpage>458</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Despoina</surname>
            <given-names>Trivela</given-names>
          </string-name>
          , Giorgos Stoilos, Alexandros Chortaras, and
          <string-name>
            <surname>Giorgos</surname>
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Stamou</surname>
          </string-name>
          .
          <article-title>Optimising resolution-based rewriting algorithms for OWL ontologies</article-title>
          .
          <source>J. Web Sem</source>
          .,
          <volume>33</volume>
          :
          <fpage>30</fpage>
          {
          <fpage>49</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>