<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Query-by-Example for Expressive Horn Description Logics?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>V ctor Gutierrez-Basulto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jean Christoph Jung</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leif Sabellek</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>KU Leuven</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Belgium</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cardi University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universitat Bremen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We initiate the study of query-by-example (QBE) in the framework of ontologybased data access. Query-by-example is a reverse engineering problem that asks for the existence of a query complying with a set of speci ed examples. Formally, we study the following problem QBE(L; Q) for some ontology language L and a query language Q: { Input: L-knowledge base (T ; A), sets S+; S , and a signature { Question: is there a query q(x) 2 Q using only symbols in (T ; A) j= q(a) for all a 2 S+, and (T ; A) 6j= q(b) for all b 2 S ? As a simple example, consider the knowledge base consisting of T = fHuman v Vertebrate; Vertebrate v 9hasPart:Spineg; A = fHuman(ax); hasPart(an; sp); Spine(sp); Bug(bug)g: If the positive examples are S+ = fax; ang and the negative example is S = fbugg, then q(x) = 9y hasPart(x; y) ^ Spine(y) is a witness query. However, there is no witnessing query in the case S+ = fan; bugg and S = faxg. Motivation. Query-by-example is a classical querying paradigm [2] which has, due to `big data', lately gained new interest since even expert users might nd it useful to explore the data in this way. The practical motivation for our paper is to improve the usability of ontology-based systems. In such systems, users access the knowledge base through queries usually formulated in powerful query languages such as (unions of) conjunctive queries (U)CQs. Unfortunately, it has been observed that casual non-expert users are often not able to specify queries in these formalisms, e.g., Statoil geologists [3], clearly hampering the usability of ontology-based systems. Query-by-example suggests an alternative approach to querying: instead of writing the query itself, the user provides examples that she does (not) expect as answers and the system supports the user by providing a conforming query. This is possibly repeated until the user is satis ed. From the theoretical perspective, our motivation is to continue bridging the gap between DL and machine learning research. Indeed, QBE over description logic knowledge bases can be viewed as an instantiation of the inductive logic ? Accepted at IJCAI-ECAI 2018 [1]. Authors were funded by EU's Horizon 2020 programme under the Marie Sklodowska-Curie grant 663830, the Research Foundation Flanders project G.0428.15, and ERC consolidator grant 647289 CODA, respectively</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        programming (ILP) framework [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], where typically the knowledge base P is given
in Prolog or a di erent rule language, and the goal is to learn a set of rules P0
such that P [ P0 conforms with the examples. The cases of our query languages
correspond to single rule programs P0 for CQs and sets of rules with the same
head for UCQs, respectively.
      </p>
      <p>
        It is worth mentioning that the query-by-example framework has been
studied and implemented for ALC-knowledge bases and ALC concept queries as
query language [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Contribution. Our main contributions are characterizations, algorithms, and
complexity bounds for QBE(L; Q) for L an expressive Horn DL L 2 fHorn-ALCI;
Horn-ALCg and Q 2 fCQ; UCQg. We start with providing natural
modeltheoretic characterizations for QBE(Horn-ALCI; Q) for Q 2 fCQ; UCQg by
lifting characterizations known from the relational database setting [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] by
replacing the database with the universal model of the knowledge base.
Unfortunately, our characterizations do not give immediate rise to a decision procedure
because the universal model is typically in nite. To overcome this, we exploit
the regularity of universal models and provide decision procedures running in
2-ExpTime and coNExpTime for Horn-ALCI and Horn-ALC, respectively.
Having these, we prove matching lower bounds, the most challenging one being
a 2-ExpTime-lower bound for QBE(Horn-ALCI; Q), Q 2 fCQ; UCQg.
Interestingly, some results depend on restricting the signature, so we consider also the
variant QBEf of QBE with unrestricted signature. Some of our techniques are
inspired from approaches to query conservative extensions [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ], but we are not
aware of direct reductions. The following table summarizes our results.
      </p>
      <p>L !
QBE(L; CQ)
QBE(L; UCQ)
QBEf (L; CQ)
QBEf (L; UCQ)</p>
    </sec>
    <sec id="sec-2">
      <title>Horn-ALCI</title>
      <p>2-ExpTime
2-ExpTime
2-ExpTime
ExpTime</p>
    </sec>
    <sec id="sec-3">
      <title>Horn-ALC</title>
      <p>coNExpTime</p>
      <p>ExpTime
coNExpTime</p>
      <p>
        ExpTime
Most notably,
{ in the cases of QBE (Horn-ALC; CQ), the availability of TBoxes does not
increase the complexity compared to the relational database setting [
        <xref ref-type="bibr" rid="ref6 ref9">9, 6</xref>
        ],
{ inverse roles lead to a jump in the complexity, and
{ 2-ExpTime-hardness already holds for ELI TBoxes and concept queries.
      </p>
      <p>We obtain the same results for the variant QDEF of QBE, the problem to
decide whether some q 2 Q returns precisely the positive examples. We also
investigate the size of witness queries which is vital for practical purposes since
eventually the user is interested in obtaining a query to further explore the
data. We particularly show that they can be double exponentially large, which
is in contrast to the relational database setting. We further show that { if a
witness query exists { there is always one of at most double exponential size
(Horn-ALC), respectively fourfold exponential size (Horn-ALCI). Such sizes are
clearly impractical which motivates the study of approximations in the future.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gutierrez-Basulto</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sabellek</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Reverse engineering queries in ontology-enriched systems: The case of expressive horn description logic ontologies</article-title>
          .
          <source>In: Proc. of IJCAI-ECAI-18</source>
          , AAAI Press (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Zloof</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          :
          <article-title>Query-by-example: the invocation and de nition of tables and forms</article-title>
          .
          <source>In: Proc. of VLDB-75</source>
          . (
          <year>1975</year>
          )
          <volume>1</volume>
          {
          <fpage>24</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hovland</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , Skj veland, M.G.,
          <string-name>
            <surname>Waaler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based data access to Slegge</article-title>
          .
          <source>In: Proc. of ISWC-17</source>
          . (
          <year>2017</year>
          )
          <volume>120</volume>
          {
          <fpage>129</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Nienhuys-Cheng, S., de Wolf, R., eds.:
          <source>Foundations of Inductive Logic Programming. Volume 1228 of Lecture Notes in Computer Science</source>
          . Springer (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Concept learning in description logics using re nement operators</article-title>
          .
          <source>Machine Learning</source>
          <volume>78</volume>
          (
          <issue>1</issue>
          ) (
          <year>Sep 2009</year>
          )
          <fpage>203</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. ten Cate,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Dalmau</surname>
          </string-name>
          ,
          <string-name>
            <surname>V.</surname>
          </string-name>
          :
          <article-title>The product homomorphism problem and applications</article-title>
          .
          <source>In: Proc. of ICDT-15</source>
          . (
          <year>2015</year>
          )
          <volume>161</volume>
          {
          <fpage>176</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Botoeva</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konev</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ryzhikov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Inseparability and conservative extensions of description logic ontologies: A survey</article-title>
          .
          <source>In: Proc. of RW-16</source>
          . (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Query conservative extensions in horn description logics with inverse roles</article-title>
          .
          <source>In: Proc. of IJCAI-17</source>
          . (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Barcelo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romero</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The complexity of reverse engineering problems for conjunctive queries</article-title>
          .
          <source>In: Proc. of ICDT-17</source>
          . (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>