<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Characterising Data Exchange Solutions in Open and Closed Words (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Henrik Forssell</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evgeny Kharlamov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evgenij Thorstensen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Oslo</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Oxford</institution>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Data exchange is the problem of translating information structured under a source
schema into a target schema, given a source data set and a set of declarative mappings
between the source and target schemata. The study of data exchange has recently received
significant attention from both database theory and systems communities and we refer the
reader to PODS and SIGMOD keynotes [
        <xref ref-type="bibr" rid="ref10 ref2">2, 10</xref>
        ] for overviews. Moreover, major database
systems have adapted existing data exchange implementations [
        <xref ref-type="bibr" rid="ref13 ref4">4, 13</xref>
        ].
      </p>
      <p>
        In data exchange, a set of schema mapping M is defined as a set of source-to-target
tuple generating dependences [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In general such mappings only partially specify how to
populate attributes of the target schema with data from the source instance S. Thus, a data
exchange solution is in general an incomplete target data instance V that contains labeled
nulls. Such V represents a set of possible complete target data instances denoted Rep(V ).
Several Rep functions were considered in the context of data exchange [
        <xref ref-type="bibr" rid="ref12 ref5 ref6 ref7">5–7, 12</xref>
        ], and
they correspond to different data exchange semantics. Fagin et al [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] proposed an
open world (OWA) semantics based on the classical Rep of [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which we denote RepO,
Hernich et al [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] proposed a closed world (CWA) semantics, which was further extended
by Libkin et al [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to a semantics of mixed open-and-closed (OCWA) worlds and
they both are based on a different notion of Rep, called RepA defined for annotated
incomplete instances.
      </p>
      <p>
        The canonical solution that is obtained by chasing [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] the source instance with
mappings is considered in all semantics as a good data exchange solution for materialisation.
However, the canonical solution may not be optimal for storing: it may contain redundant
information and in general there might be another ‘smaller’ solution V that represents the
same target instances. Thus, from the practical point of view, a minimal such V according
to some order would be the best for materialisation. As the consequence, deciding
whether two incomplete instances V1 and V2 represent the same set of complete ones is
a fundamental problem underlying data exchange. For OWA semantics this decision
problem can be characterised in terms of homomorphisms [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]: RepO(V1) = RepO(V2)
iff V1 and V2 are homomorphically equivalent. Moreover, the minimality problem has a
unique solution, called the core [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The situation changes when we turn our attention
to OCWA semantics: RepA(V1) = RepA(V2) now cannot be characterised in terms
of homomorphisms as before. Moreover, to the best of our knowledge, the problem of
characterisation and minimality has not been studied in the context of OCWA.
      </p>
      <p>The goal of this work is to address both the characterisation and minimality problem
in the setting of OCWA semantics. As a first step we study the case of a restricted but
natural class of OCWA mappings where all nulls are open while occurrences of constants
can be either open or closed.</p>
      <p>
        Our contributions are as follows. We propose an alternate definition of Rep, which
we call RepC , that is based on homomorphic covers, and a new data exchange semantics
based on RepC . A homomorphic cover from an instance V 0 to V 00 is a finite set of
homomorphisms from V 0 to V 00 such that the union of their images is all of V 00. This
allows us to characterise when RepC (V1) = RepC (V2) in terms of homomorphisms and
thus opens doors for the study of minimality. In particular, we show that RepC (V1) =
RepC (V2) iff V1 and V2 are cover-equivalent (they homomorphically cover each other).
We then show that our definitions naturally extend the OCWA semantics [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], in the
sense that each their data exchange solution can be translated into our that represents the
same set of complete instances, but not the other way around. Finally for the problem of
minimisation we introduce several natural orders on incomplete instances, show that for
all of them there is in general no unique minimal element. At the same time we identify
one, which we called cover-core, or c-core that has desirable semantic properties.
      </p>
      <p>
        The notion of homomorphic cover has been used elsewhere (e.g. [
        <xref ref-type="bibr" rid="ref11 ref3 ref9">3, 9, 11</xref>
        ]). In our
opinion several more data management scenarios can benefit from it. For instance, two
conjunctive queries whose relational structures cover each other retrieve the same tuples
from every relation of any database instance, a fact of potential relevance in e.g. data
privacy settings. For another example, treating one conjunctive query as a view, it can be
used to completely rewrite another if there exists a cover from the view. Thus in this
setting, cover-equivalence corresponds to mutual complete rewritability.
Acknowledgements This was was partially supported by: the Norwegian Research
Council grant no. 230525; SIRIUS SFI1; the EPSRC projects MaSI3, DBOnto, and ED3.
1 http://sirius-labs.no
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          .
          <source>Foundations of Databases. Addison-Wesley</source>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Melnik</surname>
          </string-name>
          .
          <source>Model management 2</source>
          .
          <article-title>0: manipulating richer mappings</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>Optimization of real conjunctive queries</article-title>
          .
          <source>In PODS</source>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Haas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Velegrakis</surname>
          </string-name>
          . Clio:
          <article-title>Schema mapping creation and data exchange</article-title>
          .
          <source>In Conceptual Modeling: Foundations and Applications</source>
          , pages
          <fpage>198</fpage>
          -
          <lpage>236</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Data exchange: semantics and query answering</article-title>
          .
          <source>Theor. Comput. Sci.</source>
          ,
          <volume>336</volume>
          (
          <issue>1</issue>
          ):
          <fpage>89</fpage>
          -
          <lpage>124</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Data exchange: getting to the core</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>30</volume>
          (
          <issue>1</issue>
          ):
          <fpage>174</fpage>
          -
          <lpage>210</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Hernich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Libkin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Schweikardt</surname>
          </string-name>
          .
          <article-title>Closed world data exchange</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>36</volume>
          (
          <issue>2</issue>
          ):
          <volume>14</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          :
          <fpage>40</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>T.</given-names>
            <surname>Imielinski</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. L.</given-names>
            <surname>Jr</surname>
          </string-name>
          .
          <article-title>Incomplete information in relational databases</article-title>
          .
          <source>J. ACM</source>
          ,
          <volume>31</volume>
          (
          <issue>4</issue>
          ):
          <fpage>761</fpage>
          -
          <lpage>791</lpage>
          ,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>K.</given-names>
            <surname>Knauer</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Ueckerdt</surname>
          </string-name>
          .
          <article-title>Three ways to cover a graph</article-title>
          .
          <source>Discrete Mathematics</source>
          ,
          <volume>339</volume>
          (
          <issue>2</issue>
          ):
          <fpage>745</fpage>
          -
          <lpage>758</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. P. G. Kolaitis.
          <article-title>Schema mappings, data exchange, and metadata management</article-title>
          .
          <source>In PODS</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Reutter</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A. Z.</given-names>
            <surname>Salamon</surname>
          </string-name>
          .
          <article-title>Classification of annotation semirings over containment of conjunctive queries</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>39</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>L.</given-names>
            <surname>Libkin</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Sirangelo</surname>
          </string-name>
          .
          <article-title>Data exchange and schema mappings in open and closed worlds</article-title>
          .
          <source>J. Comput. Syst. Sci.</source>
          ,
          <volume>77</volume>
          (
          <issue>3</issue>
          ):
          <fpage>542</fpage>
          -
          <lpage>571</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. L.
          <string-name>
            <surname>Popa</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Velegrakis</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          <string-name>
            <surname>Hernández</surname>
            , and
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Fagin</surname>
          </string-name>
          .
          <article-title>Translating web data</article-title>
          .
          <source>In VLDB</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>