<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards an Equivalence Degree of E L CQs? (Extended abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oliver Fernández Gil</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anni-Yasmin Turhan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Theoretical Computer Science</institution>
          ,
          <addr-line>TU Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We report on initial steps towards designing similarity measures for conjunctive queries formulated over E L ontologies. In applications that use conjunctive queries (CQs) it is often the question how similar two queries are to each other. We consider here similarity of CQs in general and develop a particular family of conjunctive query similarity measures (CQSMs). Assessing the similarity in the presence of TBoxes has been considered for concept similarity measures (CSMs), which are functions that compute for a pair of concepts a value from the unit interval. As similarity per se is not a formal notion and similarity measures can be defined in an arbitrary fashion, often a catalog of properties is devised that a “well-behaved” measure should fulfill [3]. The overall idea is to perceive similarity of concepts as a form of gradual equivalence. CSMs with these properties have been devised for E L-concepts in [3], albeit only for acyclic TBoxes, and in [2,5] for general TBoxes. These CSMs implement gradual equivalence by computing a combination of the mutual subsumption degree of the two concepts. More precisely, this is done by finding simulations between the canonical models of the TBox and the respective concepts- as such simulations characterize subsumption in E L [4]. We want to transfer this approach to CQs and perceive similarity of CQs as a form of “gradual query equivalence” which yields measures that are defined independent of the data. However, the transfer is not straightforward for several reasons. While the existence of simulations characterize equivalence and subsumption of E L concepts w.r.t. a TBox, CQs require homomorphisms and the compact canonical model representation used for concepts does no longer suffice. In addition, pairs of CQs can differ in their arity or in the set of individual names occurring in them. In this initial investigation we concentrate on CQSMs for CQs that are rooted (every quantified variable can be reached from an answer variable) and that are uniform in the sense that the input queries must have the same number of answer variables and the same set of individuals. For the empty TBox, it is well-known that equivalence of queries can be characterized by the existence of isomorphism between the queries, provided that the queries are minimized [1]. Hence, we apply techniques for graph similarity to define a similarity measure for conjunctive queries, more precisely we use error-tolerant graph matchings.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>In the presence of a TBox, however, the measure has to take the TBox
information into account, which means that equivalence of CQs cannot be tested
by a comparison of the plain query graphs.</p>
      <p>
        Our approach to develop a uniform CQSM for rooted E L CQs consists of
several steps. First, we define a CQSM for plain query graphs in terms of graph
edit distance, where the costs of some edit operations are adapted to the specific
setting of CQs. So, for instance, to compare two nodes with concept labels, the
similarity of these node labels, is assessed by the CSM from [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The second step is to extend this measure to consider non-empty E L TBoxes.
To this end, we employ the result that CQ equivalence w.r.t. an E L TBox can
be reduced to the case with an empty TBox, by rewriting the queries with TBox
information. The general idea is to rewrite the input CQs and the TBox, by using
essentially the “rolling-up technique” [6], which replaces in the input queries
certain tree structures with fresh concept names. The TBox is then extended with
concept definitions for the fresh concept names that capture the trees pruned
from the two queries. Next, the pruned queries are rewritten again by adding
new atoms that capture the consequences of the extended TBox for the query
structure. One can show that the final CQs are equivalent w.r.t. the empty TBox
iff the initial ones are equivalent w.r.t. the original TBox. Thus, the similarity
between the input queries is the similarity between the rewritten queries. By
using the CSM from [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the comparison of the nodes for individuals (without their
relational neighborhood) still considers the information in the original TBox.
      </p>
      <p>The contributions are the following: we define CQSMs and formalize a
collection of useful properties for CQSMs that can ensure predictable behavior of the
measure to a certain extent. Furthermore, we construct a (family of) CQSMs in
E L that fulfill some of the most important properties by following the approach
just described. We show that the resulting CQSMs fulfill equivalence invariance
and equivalence closure.
6. Riccardo Rosati. On conjunctive query answering in EL. In Proc. of the 2007
Description Logic Workshop (DL 2007), volume 250 of CEUR, 2007.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ashok</surname>
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Chandra and Philip M. Merlin</surname>
          </string-name>
          .
          <article-title>Optimal implementation of conjunctive queries in relational data bases</article-title>
          .
          <source>In Proc. of the 9th ACM Symp. on Theory of Computing (STOC'77)</source>
          , pages
          <fpage>77</fpage>
          -
          <lpage>90</lpage>
          . ACM,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Ecke</surname>
          </string-name>
          , Rafael Peñaloza, and
          <string-name>
            <surname>Anni-Yasmin Turhan</surname>
          </string-name>
          .
          <article-title>Similarity-based relaxed instance queries</article-title>
          .
          <source>J. Applied Logic</source>
          ,
          <volume>13</volume>
          (
          <issue>4</issue>
          ):
          <fpage>480</fpage>
          -
          <lpage>508</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Karsten</given-names>
            <surname>Lehmann</surname>
          </string-name>
          and
          <string-name>
            <surname>Anni-Yasmin Turhan</surname>
          </string-name>
          .
          <article-title>A framework for semantic-based similarity measures for ELH-concepts</article-title>
          .
          <source>In Proc. of the 13th Eur. Conf. on Logics in Artificial Intelligence (JELIA</source>
          '
          <year>2012</year>
          ), volume
          <volume>7519</volume>
          <source>of LNCS</source>
          , pages
          <fpage>307</fpage>
          -
          <lpage>319</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Conservative extensions in the lightweight description logic EL</article-title>
          .
          <source>In Proc. of the 21st Int. Conf. on Automated Deduction (CADE</source>
          <year>2007</year>
          ), volume
          <volume>4603</volume>
          <source>of LNCS</source>
          , pages
          <fpage>84</fpage>
          -
          <lpage>99</lpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Teeradaj</given-names>
            <surname>Racharak</surname>
          </string-name>
          , Boontawee Suntisrivaraporn, and
          <string-name>
            <given-names>Satoshi</given-names>
            <surname>Tojo</surname>
          </string-name>
          .
          <article-title>Personalizing a concept similarity measure in the description logic ELH with preference profile</article-title>
          .
          <source>Computing and Informatics</source>
          ,
          <volume>37</volume>
          (
          <issue>3</issue>
          ):
          <fpage>581</fpage>
          -
          <lpage>613</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>