<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Undecidability of the Equivalence of Second-Order Tuple Generating Dependencies ?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ingo Feinerer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Reinhard Pichler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuel Sallinger</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vadim Savenkov</string-name>
          <email>savenkovg@dbai.tuwien.ac.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vienna University of Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Second-Order tuple generating dependencies (SO tgds) were introduced by Fagin et al. to capture the composition of simple schema mappings. Testing the equivalence of SO tgds would be important for applications like model management and mapping optimization. However, we prove the undecidability of the logical equivalence of SO tgds. Moreover, under weak additional assumptions, we also show the undecidability of a relaxed notion of equivalence between two SO tgds, namely the so-called conjunctive query equivalence.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Schema mappings play an important role in several areas of database research, notably
in data integration [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], data exchange [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], peer data management [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and model
management [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. A schema mapping is given by two schemas, called the source schema and
the target schema, as well as a set of dependencies describing the relationship between
the source and target schema. The most fundamental form of schema mappings are
mappings defined by a set of source-to-target tuple generating dependencies (s-t tgds):
They are First-Order formulae of the form 8x('(x) ! 9y (x; y)), where the
antecedent '(x) is a conjunctive query (CQ) over the source schema and the conclusion
(x; y) is a CQ over the target schema. Intuitively, such an s-t tgd defines a constraint
that the presence of certain tuples in the source database I (namely those in the image
of some homomorphism h from '(x) to I) enforce the presence of certain tuples in the
target database J (s.t. h can be extended to a homomorphism from (x; y) to J ).
      </p>
      <p>
        Several algebraic operators [
        <xref ref-type="bibr" rid="ref15 ref6">6, 15</xref>
        ] on schema mappings have been intensively
studied in recent time like computing inverses [
        <xref ref-type="bibr" rid="ref2 ref3 ref8">8, 3, 2</xref>
        ] and composing schema mappings [
        <xref ref-type="bibr" rid="ref12 ref14 ref16 ref7">7,
12, 14, 16</xref>
        ]. Our work is rather related to the composition operator. Fagin et al. proved
that s-t tgds are not powerful enough to express the composition of two mappings
defined by s-t tgds [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. To remedy this defect, so-called Second-Order tuple generating
dependencies (SO tgds) were introduced in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. SO tgds extend s-t tgds by existentially
quantified function-variables and equalities of (possibly functional) terms. Details and
formal definitions are given in Section 2. It was shown in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] that SO tgds capture
exactly the closure under composition of mappings defined by s-t tgds.
Example 1 ([
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]). Consider the following three schemas. Let S1 consist of the unary
relation symbol Emp( ) of employees. Schema S2 consists of a single binary relation
? This work was supported by the Vienna Science and Technology Fund (WWTF), project
      </p>
      <p>ICT08-032. Savenkov was supported by the Erasmus Mundus ECW Program of the EU.
symbol Mgr0( ; ) that associates each employee with a manager. Schema S3 consists
of a similar binary relation symbol Mgr( ; ) that is intended to provide a copy of Mgr0
and an additional unary relation symbol SelfMgr( ) to store employees who are their
own manager. Consider the mappings M12 = (S1; S2; 12) and M23 = (S2; S3; 23)
with 12 = f8e(Emp(e) ! 9m Mgr0(e; m))g and 23 = f8e; m(Mgr0(e; m) !
Mgr(e; m)), 8e(Mgr0(e; e) ! SelfMgr(e))g. We are looking for the composition of
M12 and M23. It can be verified that this composition can be expressed by the SO tgd
= 9f (8e (Emp(e) ! Mgr(e; f (e))) ^ 8e (Emp(e) ^ (e = f (e)) ! SelfMgr(e))):</p>
      <p>In this paper, we want to study the equivalence of SO tgds. Note that the question of
equivalence naturally arises in several scenarios. Figure 1 illustrates a model evolution
scenario, where data structured under some schema S is first migrated to a database
with schema T and then further transformed to meet schema U . Now suppose that there
exists an alternative migration path from
σ schema S via T 0 to schema U . The question</p>
      <p>if the two migration paths yield the same
re</p>
      <p>T sult comes down to checking if the
dependenΣ12 Σ23 cies and 0 (which represent the respective
mapping compositions) are equivalent.
ActuS U ally, Figure 1 can also be thought of as
illusΣ‘12 Σ‘23 trating a peer data management scenario, where
T‘ some peer with data structured according to S</p>
      <p>provides part of its data to some other peer
σ‘ with schema T (resp. T 0). The latter peer in</p>
      <p>turn passes this data on to yet another peer with
Fig. 1. Mapping compositions. schema U . Now suppose that a user may access
the data only at the peer with schema U . What
happens if some link in this peer data network is broken, say the one corresponding to
mapping 23? Will the path of mappings from S via T 0 to U still give the user full
access to the data provided by the peer with schema S? Testing the equivalence of
and 0 is thus crucial for answering questions of redundancy and reliability in a peer
data network.</p>
      <p>
        The equivalence of mappings is also fundamental to mapping optimization. As
mentioned in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], optimizing a mapping ultimately means replacing the mapping by an
“equivalent” one with better (computational) properties. This raises the question of
how the “equivalence” of two mappings should be defined. Since dependencies are
logical formulae, the most natural notion of equivalence is logical equivalence. In this
paper, we show that logical equivalence of SO tgds is undecidable. In order to allow
for more flexibility in optimizing mappings, Fagin et al. introduced relaxed notions of
equivalence [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In particular, the potential of conjunctive query (CQ) equivalence for
optimizing several kinds of mappings was studied in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Intuitively, two mappings
are CQ-equivalent, if conjunctive queries posed against the target database yield the
same result for both mappings (for details, see Section 2). For instance, it was shown
in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] that the composition of the mappings M12 and M23 in Example 1 cannot be
represented by an SO tgd without equalities in the antecedent. On the other hand,
in Example 1 is CQ-equivalent to 0 = 9f (8e (Emp(e) ! Mgr(e; f (e))). Under the
weak additional assumption that the source schema may have key dependencies, we
shall prove the undecidability of CQ-equivalence of SO tgds.
      </p>
      <sec id="sec-1-1">
        <title>Organization of the paper and summary of results. In Section 2, we recall some basic</title>
        <p>notions and results. A conclusion and an outlook to future work are given in Section 5.
The main results of the paper are detailed in Sections 3 and 4, namely:</p>
        <p>
          Logical equivalence. In Section 3 we prove that the logical equivalence of SO tgds
is undecidable. This result is easily obtained via a previous undecidability result in the
context of inverse schema mappings [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>Logical equivalence vs. CQ-equivalence. In Section 3, we also present a different
proof strategy for the undecidability of the logical equivalence of two SO tgds in order
to identify an important difference between logical equivalence and CQ-equivalence.
More precisely, we show that logical equivalence of two SO tgds remains undecidable
even if the two SO tgds are already known to be CQ-equivalent.</p>
        <p>CQ-equivalence. In Section 4 we prove the undecidability of CQ-equivalence of SO
tgds if the source schema may have key dependencies. The proof is by reduction from
the domino problem. As a by-product of this proof, we also get the undecidability of
logical equivalence of SO tgds without equalities.</p>
        <p>Due to lack of space, proofs are sketched. Details will be provided in the full version.
2</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>Schemas and instances. A schema R = fR1; : : : ; Rng is a set of relation symbols
Ri each of a fixed arity. An instance I over a schema R consists of a relation for each
relation symbol in R, s.t. both have the same arity. For a relation symbol R, we write
RI to denote the relation of R in I. We only consider finite instances here.
Schema mappings. Let S = fS1; : : : ; Sng and T = fT1; : : : ; Tmg be schemas with no
relation symbols in common. A schema mapping is given by a triple M = (S; T; )
where S is the source schema, T is the target schema, and is a set of logical formulae
called dependencies expressing the relationship between S and T.</p>
      <p>Instances over S (resp. T) are called source (resp. target) instances. We write hS; Ti
to denote the schema fS1; : : : ; Sn; T1; : : : ; Tmg. If I is a source instance and J a target
instance, then hI; J i is an instance of the schema hS; Ti.</p>
      <p>Given a (ground) source instance I, a target instance J is called a solution for I
under M if hI; J i j= . The set of all solutions for I under M is denoted by Sol (I; M).
Dependencies. Source-to-target tuple generating dependencies (as already defined in
the introduction) are logical formulae of the form 8x('(x) ! 9y (x; y)). We write
x for a tuple (x1; : : : ; xn). However, by slight abuse of notation, we also refer to the set
fx1; : : : ; xng as x. Hence, we may use expressions like xi 2 x or x X, etc.</p>
      <p>A key dependency over schema R is of the form 8x(R(u; y; v) ^ R(u; z; w) !
y = z) where R denotes a relation symbol in R with u; v; w x and y; z 2 x.</p>
      <p>A second-order tuple generating dependency (SO tgd) is a logical formula of the
form 9f ((8x1('1 ! 1)) ^ ^ (8xn('n ! n))) where (1) each member of
f is a function symbol, (2) each 'i is a conjunction of atomic formulas of the form
S(y1; : : : ; yk) (with S 2 S and yj 2 xi), and equalities of the form t = t0 (with t and t0
terms based on xi and f ), (3) each i is a conjunction of atomic formulas of the form
T (t1; : : : ; t`) (with T 2 T where t1, . . . , t` are terms based on xi and f ), and (4) each
variable in xi appears in some atomic formula of 'i.</p>
      <p>
        When dealing with instances hI; J i in the context of SO tgds the domain of source
instances I is allowed to consist only of constants (i.e., grounded) whereas target
instances J may contain functional terms which can be treated like fresh variables, also
called labelled nulls. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
Homomorphisms. Let I, I0 be instances. A homomorphism h: I ! I0 is a mapping s.t.
(1) whenever R(x) 2 I, then R(h(x)) 2 I0, and (2) for every constant c, h(c) = c.
If such h exists, we write I ! I0. Moreover, if I $ I0 then we say that I and I0 are
homomorphically equivalent.
      </p>
      <p>If a homomorphism h: I ! I0 is invertible, s.t. h 1 is also a homomorphism from
I0 to I, then h is called an isomorphism, denoted I = I0. An endomorphism is a
homomorphism I ! I. An endomorphism is proper if it is not surjective (for finite instances,
this is equivalent to being not injective), i.e., if it reduces the domain of I.</p>
      <p>If I is an instance, and I0 I is such that I ! I0 holds but for no other I00
I0: I ! I00 (that is, I0 cannot be further “shrunk” by a proper endomorphism), then I0
is called a core of I. The core is unique up to isomorphism. Hence, we may speak about
the core of I. Cores have the following important property: for arbitrary instances J
and J 0, J $ J 0 iff core(J ) = core(J 0).</p>
      <p>
        If J 2 Sol (I; M) is such that J ! J 0 holds for any other solution J 0 2 Sol (I; M),
then J is called a universal solution. Since the universal solutions for a source instance
I are homomorphically equivalent, their core is unique up to isomorphism [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. We
write UnivSol (I; M) to denote the set of universal solutions for I under mapping M
and we write core(I; M) to denote the core of the universal solutions.
      </p>
      <p>A substitution is a mapping which sends variables to other domain elements (i.e.,
variables or constants). We write = fx1 a1; : : : ; xn ang if maps each xi to ai
and is the identity outside fx1; : : : ; xng. The application of a substitution is usually
denoted in postfix notation, e.g.: x denotes the image of x under . For an expression
'(x) (e.g., a conjunctive query with variables in x ), we write '(x ) to denote the
result of replacing every occurrence of every variable x 2 x by x .</p>
      <p>
        Chase. A target instance for a given source instance and a schema mapping can be
computed by the chase [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. For SO tgds, the original chase procedure of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has to be
modified since homomorphisms now have to take the equalities in the antecedents into
account. Formally, a mapping h from a conjunct Ci = 8x('i ! i) of an SO tgd to
an instance I is a homomorphism if for every relational atom S(y1; : : : ; yk) in 'i the
tuple (h(y1); : : : ; h(yk)) is in SI and for every equality t = t0 we have h(t) = h(t0).
      </p>
      <p>Let hI; J i be an instance of a schema hS; Ti and let Ci = 8x('i ! i) be an SO
conjunct. If there is a homomorphism h of 'i into the instance I, then the conjunct Ci
can be applied to hI; J1i with homomorphism h. I.e., for every target atom T (t1; : : : ; t`)
of i the tuple (h(t1); : : : ; h(t`)) is added to the T relation and the union with J1
is taken yielding a new instance J2. The instance hI; J2i is now called the result of
applying Ci to hI; J1i with h. The actual application is called a chase step.</p>
      <p>
        A chase sequence is a finite sequence of chase steps. The chase of hI; ;i with an SO
tgd is a chase sequence until no conjuncts can be applied anymore. The result of this
chase is denoted by chase(I; ). Chasing hI; ;i with SO tgds can be done in
polynomial time w.r.t. the size of the source instance and results in a universal solution [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
Equivalence of schema mappings. Let M = (S; T; ) and M0 = (S, T; 0) be two
schema mappings.
      </p>
      <p>M and M0 are logically equivalent (denoted as M M0) if, for every source
instance I and target instance J , the equivalence hI; J i j= , hI; J i j= 0 holds.
This is the case iff Sol (I; M) = Sol (I; M0) holds for every source instance I.</p>
      <p>
        M and M0 are conjunctive-query equivalent (denoted as M CQ M0) if, for every
source instance I, either Sol (I; M) = ; = Sol (I; M0) or core(I; M) = core(I; M0).
Note that the original definition of M CQ M0 is that, for every source instance I,
any conjunctive query posed against the target schema yields the same certain answers
for the mappings M and M0. The above characterization via the core, which is more
convenient for our purposes here, was proved to be equivalent in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Undecidability of Logical Equivalence</title>
      <p>
        The undecidability of logical equivalence of two SO tgds follows easily from a previous
undecidability result in the context of inverse schema mappings [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Theorem 1. Let and 0 be two SO tgds. It is undecidable whether
is logically equivalent to 0.
0, i.e. that
Proof. (Sketch).1 In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a schema mapping M2 is defined to be an inverse of another
mapping M1 if their composition M1 M2 is logically equivalent to the identity
mapping, i.e., a set of s-t tgds of the form (8x) P (x) ! P^(x) for every source relation P .
In [3, Corollary 9.5] the authors show that the following problem is undecidable: Given
mappings M1 and M2 specified by s-t tgds, check whether M2 is an inverse of M1.
      </p>
      <p>In other words, checking if the composition M1 M2 is equivalent to the identity
mapping is undecidable. Clearly, a set of s-t tgds can be represented as an SO tgd. Let
denote the SO tgd corresponding to the set of s-t tgds of the identity mapping. Likewise,
M1 M2 can be represented as an SO tgd, say 0. Then 0 holds iff M2 is an
inverse of M1. Hence, the logical equivalence of two SO tgds is undecidable.</p>
      <p>We now want to strengthen the above result by relating logical equivalence to
CQequivalence. Below, we show that logical equivalence of two SO tgds remains
undecidable even if the two SO tgds are already known to be CQ-equivalent. To this end, we
adopt a different proof strategy which goes by reduction from the following variant of
the Halting problem: Given a Turing machine TM, does TM halt when run on empty
input? Given an arbitrary instance TM of the Halting problem, we have to construct
two SO tgds and 0, s.t. the following equivalence holds: TM halts on empty input ,
6 0. A major challenge of this reduction is that SO tgds do not directly provide us
with recursion or with a means to enforce equalities.
1 We thank the anonymous referee of AMW 2011 who pointed out this proof.
Schemas. Let TM be a Turing machine (Q; A; ; q0) with set of states Q = f0; : : : ; mg,
tape alphabet A = f0; : : : ; ng, transition function : Q A ! Q A f ; !g, and
initial state q0. W.l.o.g., assume that q0 = 0 and let 1 denote the halting state. Moreover,
let 1 denote the tape starting symbol . and let 0 denote the blank symbol t.</p>
      <p>We represent the state, tape and cursor of a Turing machine configuration at time
t and tape location l by the function symbols stateq(t), tapea(t; l), and cursor(t; l)
(where q 2 Q, a 2 A). Each of these functions is intended to have Boolean range, e.g.
tapea(t; l) = 1 indicates that at time t, the symbol a is stored on the tape at location l,
and tapea(t; l) = 0 otherwise. We will ensure this Boolean range using our SO tgds.</p>
      <p>As source schema, we use root( ) and chain( ; ), which is intended to describe a
linear order starting at root(r) and continued by chain( ; ) atoms, i.e. source instance I
should have the form I = froot(0); chain(0; 1); chain(1; 2); : : :g. Still, we must make
sure through our SO tgds that what is described by root and chain does not deviate from
a linear order in a way that adversely affects our simulation.</p>
      <p>We define our target schema to consist of the relation symbol range( ), which
is intended to describe the range of the functions we defined above. Furthermore, we
use the target relation symbol halt to indicate a halting computation as well as to
signify an error condition. That is, an intended target instance should have the form
J = frange(0); range(1)g and possibly include halt().</p>
      <p>Dependencies. Given an arbitrary Turing machine TM, we define the SO tgds and 0
as a “big” conjunction consisting of several groups of conjuncts (i.e., implications). As
we will see, it is possible to construct an SO tgd that simulates a computation of TM.</p>
      <p>We start by describing initialization conjuncts which are used to encode the initial
configuration of the TM on the empty input tape. Since we cannot enforce equalities on
the right-hand side of implications, all the implications are formulated in such a way
that taking the wrong value will lead to an error condition indicated by halt(). E.g., we
want to enforce the initial state state0(r) = 1. Therefore, in the case that state0(r) = 0
holds, we enforce halt() in the target instance to indicate this error condition:
root(r) ^ state0(r) = 0 ! halt()
Note that this does not yet cover the case that state0(r) 62 f0; 1g. We construct other
implications which define the rest of the initial configuration analogously.</p>
      <p>We now define transition conjuncts, which implement the transition function of
TM. In what follows, we write dom(z) as a short-hand for a predicate that is satisfied
by instantiating z to an arbitrary constant occurring in the source instance, i.e., dom(z)
has to be replaced by root(z), chain(z; ), or chain( ; z). Such a predicate is needed to
fulfil the safety condition of SO tgds. We first define a common part of the following
implications, matching the current state q and the current tape symbol a.</p>
      <p>' := stateq(t) = 1 ^ cursor(t; l) = 1 ^ tapea(t; p) = 1 ^ chain(t; t0) ^ dom(l)
Suppose that the transition function contains the transition (q; a) = (q0; a0;
the SO tgds and 0 contain the following conjuncts:
). Then
' ^ stateq0 (t0) = 0 ! halt()
' ^ tapea0 (t0; l) = 0 ! halt()
' ^ chain(l0; l) ^ cursor(t0; l0) = 0 ! halt()</p>
      <p>While the transition conjuncts define what changes, we now construct inertia
conjuncts which describe the function values of what remains the same. In particular, we
can construct (but for space reasons omit here) conjuncts, which define that the cursor is
not located at positions that are (a) more than one move away from the previous cursor
location, (b) at the same location, or (c) at the location opposite to the cursor movement.</p>
      <p>Furthermore, we need to state that the tape symbol remains the same wherever the
cursor is not located. That is, for each q 2 Q, a 2 A:
stateq(t) = 1 ^ cursor(t; l) = 0 ^ tapea(t; l) = 1 ^ dom(l) ^</p>
      <p>chain(t; t0) ^ tapea(t0; l) = 0 ! halt()
Finally, we also have to construct uniqueness conjuncts, ensuring that at each time t,
there is exactly one stateq(t) and at each time t and location l exactly one tapea(t; l)
whose function value is 1.</p>
      <p>Guarding conjuncts. It is clear that, since the domain is in general non-Boolean, we
cannot prevent functions stateq, tapea and cursor to take a range outside of f0; 1g for
arbitrary interpretations. However, the structure of our reduction will show that we have
to either, for a given interpretation of the function symbols, define the interpretation of
the relation symbols, or for a given interpretation of the relation symbols, define the
interpretation of the function symbols.</p>
      <p>So if we define the functions, we can guarantee Boolean domain ourselves. If we
define the relations, and in particular the target relations, we can use function guarding
conjuncts to materialize the range in the target:</p>
      <p>dom(x) ^ dom(y) ^ state (x) = y ! range(y)
Analogous conjuncts are introduced for tape and cursor. This ensures the Boolean range
of our functions, given appropriately constructed target relations (rangeJ = f0; 1g).</p>
      <p>Now that we have sufficient control over the functions, we still need to gain some
control over the source instance, since our simulation is intended to work on a linear
order described by root and chain. We therefore define source guarding conjuncts.
What we can directly enforce is that the chain relation includes no self-loops and never
returns to any root using the following conjuncts:
chain(x; x) ! halt()
root(r) ^ chain( ; r) ! halt()</p>
      <p>However, what we cannot directly exclude is that some element x of the chain has
two distinct predecessors px 6= px0 with chain(px; x) and chain(px0; x). In particular,
we do not have equality on the right-hand side, which would allow us to require px =
px0. Nor can we use a derivation of halt() when there are two predecessors chain(px; x)
and chain(px0; x) because we cannot guarantee that px 6= px0 on the left-hand side.</p>
      <p>But what we can do is control the effect of such multiple predecessors (or
successors). Notably, if these multiple predecessors or successors actually have no effect on
our simulation, then we have no problem with these discrepancies from linear order,
since the simulation is doing the correct thing anyway. Altogether, we cannot detect all
deviations from a linear order, e.g., two distinct root(r) atoms cannot be ruled out. But
we can gain just enough control to ensure the correctness of our reduction.
Halting detection. We now construct the two SO tgds for our reduction, namely such
that TM halts , 6 0. The first one, , simply consists of all the conjuncts we
constructed up to this point, that is simulation conjuncts and guarding conjuncts. The
function symbols tape , state and cursor are in the scope of an existential quantifier
over the entire conjunction. We now define the halting detection conjunct:
dom(t) ^ state1(t) = 1 ! halt()
and construct 0 to be the same as plus this halting detection conjunct. So overall,
both and 0 simulate the computation of Turing machine TM, but only 0 indicates a
halting state by enforcing halt().</p>
      <p>Theorem 2. Let and 0 be two SO tgds. It is undecidable whether 0 (i.e. is
logically equivalent to 0), even when CQ 0 (i.e. is CQ-equivalent to 0).
Proof. (Sketch). We constructed 0 as with an additional conjunct, therefore it is
sufficient to show that TM halts , 6 0. We show both directions separately. The
challenging part of the reduction is that, in one direction we have control over the functions,
but not over the relations, and the converse for the other direction.</p>
      <p>Assume that TM halts. Then we have to find an instance K = (I; J ) such that K
but K 6 0. Since TM halts, there is a computation of length n that ends in the halting
state. We construct the instance K = (I; J ) as</p>
      <p>I = froot(1); chain(1; 2); : : : ; chain(n 1; n)g J = frange(0); range(1)g
Then K is clearly the case. What needs to be shown is that K 6 0. In particular,
K 6 0 means that for arbitrary f , we have (K; f ) 6 '0. That is, we constructed K,
but must deal with arbitrary f for '0. We proceed by case distinction, assuming vt;l;a is
the correct value of tapea(t; l) according to the transition function of TM:
1. tapea(t; l) = vt;l;a for all t; l; a, that is, it correctly represents the TM computation
2. tapea(t; l) = :vt;l;a for some t; l; a, that is, the simulation violates TM
3. For some tapea(t; p) = y we have that y 62 f0; 1g, i.e., non-Boolean range
and in each case show K; f 6 0. This also holds for state and cursor.
Assume that TM does not halt. Given an arbitrary instance K = (I; J ), we have to
show that K implies K 0. That is, we deal with arbitrary K, but only have
to find a single f for '0. Assume that TM does not halt. Then let K = (I; J ) be an
arbitrary instance. We proceed by case distinction based on K:
1. J already contains an atom halt()
2. Source instance I deviates from a linear order significantly for our simulation
3. J contains some range(c) atom with c 62 f0; 1g, i.e., allows non-Boolean range
4. All range(c) atoms in J have c 2 f0; 1g
and in each case show K; f 0.</p>
      <p>Finally, note that the SO tgds and 0 in the above problem reduction are CQ-equivalent.
The claim of the theorem thus follows.</p>
      <p>
        Equalities are fundamental in SO tgds in order to define the composition of two
schema mappings [12, Theorem 5.4]. Equalities also play a central role in our
undecidability proof of logical equivalence for SO tgds in Theorem 2 above. However, if
we consider CQ-equivalence rather than logical equivalence, then the equalities are not
strictly necessary anymore and can be removed: In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] it is proved that every SO tgd is
CQ-equivalent to a plain SO tgd (a class of SO tgds which do not contain equalities).
As a consequence the previous proof idea cannot be reused and we need a set of new
proof techniques to show undecidability also for CQ-equivalence.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Undecidability of Conjunctive Query Equivalence</title>
      <p>
        We now show the undecidability of CQ-equivalence of SO tgds by a reduction from the
Domino Problem (DP), which was first shown undecidable by Berger [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We use the
following, slightly modified version of DP [1, p.414].
      </p>
      <p>Definition 1. A domino system (DS) is given by a set D of domino types defined as
follows. Let C be a set of colours. Then, each domino type is a quadruple hl; t; r; bi of
values from C, corresponding respectively to the left, top, right and bottom colour of
a square domino piece. Let d:side denote an accessor function D ! C returning the
colour of d, corresponding to side. Then, the Domino Problem (DP) given by D asks
if for each m; n 2 N a consistent tiling of an m n grid with the domino types from
D exists. That is, if there is a function tm;n: f1 : : : mg f1 : : : ng ! D, such that the
equality tm;n(i; j):right = tm;n(i + 1; j):left holds for 1 i &lt; m and 1 j n
and tm;n(k; l):bottom = tm;n(k; l + 1):top for 1 k m and 1 l &lt; n.</p>
      <sec id="sec-4-1">
        <title>Chains, proper instances, and coordinates. We model a grid using the source schema</title>
        <p>S consisting of two binary relations Ch and Cv intended to define a horizontal and
vertical successor relation: For an instance I of S, ChI and CvI denote the relations of
I. Their intended form is f(a0; a1); (a1; a2) : : : (an 1; an)g, with ai 6= aj , for every
0 i &lt; j n. We define a &lt; b w.r.t. an order C, if (a; b) 2 C or 9x; C(x; b) ^ a &lt; x.
Taking the tuples of C as edges of a directed graph (which we call the dual graph),
the order defined above corresponds to an acyclic chain. If each relation of an instance
I over S contains a single acyclic chain, I is called proper. The size (m; n) of such a
proper instance I is defined as (jChj; jCvj).</p>
        <p>Domino tilings are modelled by the target relation D=4. Each of its four arguments
is populated by a functional term cs(x; y) where c corresponds to the colour, s to the
side of the domino piece, and the pair (x; y) defines the position in the grid. To save
notation, we will omit parentheses when writing the terms in the relation D (e.g. the
above term will be written as csxy). Given a fact d 2 D, we define the accessor function
d:n, returning the term in the nth place of d, 1 n 4.</p>
        <p>Definition 2. Let D be a DS with types d1 : : : dn. A simulation mapping D for D over
the source schema S = fCh( ; ); Cv( ; )g and the target schema T = fD( ; ; ; )g
consists of an SO tgd D and four source key dependencies:
1: Ch(x; x1) ^ Ch(x; x2) ! x1 = x2
3: Cv(x; x1) ^ Cv(x; x2) ! x1 = x2
2: Ch(x1; x) ^ Ch(x2; x) ! x1 = x2
4: Cv(x1; x) ^ Cv(x2; x) ! x1 = x2</p>
        <p>The SO tgd D has the form 9c9ph9pv9qh9qv( d1 ^ : : : ^ dn ^ 1 ^ 2), where
d1 : : : dn encode the corresponding domino types:</p>
        <p>di : Ch(x1; x2) ^ Cv(y1; y2) ! D(lh x1 y2; tv x2 y1; rh x2 y2; bv x2 y2);
l; t; r; b denoting respectively the left, top, right and bottom colours of di. Superscripts
distinguish the horizontal and vertical dimensions, so that there are at most two distinct
functions in c for each colour in D. Moreover, c contains neither phjv nor qhjv.</p>
        <p>The conjuncts 1 and 2 are called guards and have a form similar to di :
h: Ch(x0; x1) ^ Ch(x1; x2) ^ Cv(y1; y2) ! D(ph x1 y2; pv x2 y1; ph x2 y2; pv x2 y2)
v: Ch(x1; x2) ^ Cv(y0; y1) ^ Cv(y1; y2) ! D(qh x1 y2; qv x2 y1; qh x2 y2; qv x2 y2)
wv1 0
Informally, given a proper source instance of size (m; n), the conjuncts d1 : : : djDj
create a stack of jDj domino tiles at each position in the m n grid. Additionally, the
guards h and v create two more tiles with unique colours p and q on each position
(i; j) with i; j &gt; 1, or a single tile if i = 1 or j = 1. The guards tackle source instances
with cycles instead of the linear order, as will be explained later.</p>
        <p>For a source instance I, a D-fact d enforced by the SO tgd in a simulation
mapping (that is, d 2 chase(I; D)) has a form D(lhxp y; tvxyp; rhxy; bvxy), such that
(xp; x) 2 ChI , (yp; y) 2 CvI .</p>
        <p>Definition 3. Let D be a DS and I a proper instance I of size (m; n). A fact d =
D(lhxpy; tvxyp; rhxy; bvxy) 2 chase(I; D) is associated with the domino type
hl; t; r; bi, and with the pair (x; y) of I-values, occurring as arguments in its two last
terms. Moreover, let i be the ordinal of x w.r.t. the order defined by ChI and j be the
ordinal of y w.r.t. CvI . We call the pair (i; j) the coordinates of d defining its grid position.
Property 1. Let I be a proper source instance, and D the simulation mapping of a DS
D. Moreover, let d1 and d2 be two facts in chase(I; D), d1 associated with the pair of
I-values (x1; y1), and d2 with (x2; y2). Then, the following properties hold true:
1. d1:3 = d2:1 ) (x1; x2) 2 CI</p>
        <p>h ^ y1 = y2 ^ 8i 6= 3 8j 6= 1 d1:i 6= d2:j.
2. d1:2 = d2:4 ) (y1; y2) 2 CvI ^ x1 = x2 ^ 8 6
i = 4 8j 6= 2 d1:i 6= d2:j.
3. 9i 2 f1:::4g d1:i = d2:i ) x1 = x2 ^ y1 = y2.
4. d1:1 6= d2:2 ^ d1:3 6= d2:4 ^ d1:1 6= d2:4.</p>
        <p>Claim 1 stipulates, that whenever a term is shared between the 3rd place of d1 and
the 1st place of d2, the facts correspond to horizontally adjacent grid positions (i.e., have
first coordinates i and i + 1 and the same second coordinate) and can have no further
terms in common. Claim 2 is an analogue of Claim 1 for the equality d1:4 = d2:2 and
vertical adjacency: d1 and d2 must have coordinates (i; j) and (i; j + 1). Claim 3 states
that terms at respective positions can be only shared by facts with the same coordinates.
Finally, Claim 4 restricts term equalities to the above cases.</p>
        <p>Example 2. Consider a DS D containing the four types a; b; c; d in Figure 2. In the
simulation mapping D, the first type is modelled by the following implicational conjunct
in the SO tgd a: Ch(x1; x2)^Cv(y1; y2) ! D(wh x1 y2; wv x2 y1; gh x2 y2; gv x2 y2).
Here, w and g stand for white and gray colours, respectively.</p>
        <p>The 3 2 grid is represented by a proper source instance I (see top of Figure 2). Six
dotted rectangles represent a consistent tiling of this grid with D. Each rectangle bears
the four terms of the corresponding D-fact in chase(I; D): E.g., the topmost leftmost
tile (the one at position (1,1), that is) is encoded by the fact D(wh01; wv10; gh11; gv11).
Problem reduction. Let D be a DS and D its simulation mapping. Furthermore,
consider a singleton colour set Cb = fbg (where b stands for “black”) and a corresponding
domino set B = fhb; b; b; big with the simulation mapping B. We claim that D has a
soluTtihoen ifSfO DtgdC Qin B Bholhdas.s the following form: B: 9bh9bv9ph9pv9qh9qv</p>
        <p>Ch(x1; x2) ^ Cv(y1; y2) ! D(bh x1 y2; bv x2 y1; bh x2 y2; bv x2 y2) ^ h ^ v .
It is easy to see that the conjuncts h and v are redundant in B. Hence, in the
following we assume B to contain a single implicational conjunct and no guards. Moreover,
the following relationship holds between the chase result of B and D:
Property 2. Let D be a DS with simulation mapping D, let (i1; j1) and (i2; j2) be
two distinct grid positions, and let d1; d2 2 chase(I; D) be facts at positions (i1; j1)
resp. (i2; j2). Then also chase(I; B) contains two facts d01; d02 at these grid positions.
Moreover, for every 1 k; l 4, if d1:k = d2:l then d01:k = d02:l, i.e., if two terms in
d1 and d2 are equal, then the terms at these places in d01 and d02 are also equal.</p>
        <p>This property uses an observation, that every implicational conjunct di encoding a
domino type di in the SO tgd of a simulation mapping produces exactly one D-fact for
each grid position, and that B 2 B uses the minimal set of functional symbols.</p>
        <p>For DPs that have a solution (of which B is obviously an example) the following
lemma, which is crucial for our construction, holds:
Lemma 1. Let I be a proper instance of size (m; n) and let D be a DS. Then, there is
a consistent tiling of an m n grid with D iff core(I; D) = core(I; B) holds.
Example 3. It is easy to check, that the six D-facts in Figure 2 indeed form core(I; D).
For I, B yields a solution with the isomorphic core: each fact in core(I; B) can be
obtained from some fact of core(I; D) by replacing leading symbols w and g with b.</p>
        <p>To lift the equivalence between solvability of D and existence of an isomorphism
core(I; D) = core(I; B) to unrestricted source instances I, we will first show, that
whenever ChI or CvI contains a single cycle (that is, its dual graph is connected, each
vertex having a single incoming and a single outgoing edge), then any simulation mapping
yields an instance with the same core as chase(I;
B).</p>
        <p>Lemma 2. Let I contain a single cycle in Ch or in Cv. Then, for arbitrary DS D,
core(I; D) = core(I; B) holds.</p>
        <p>Proof (Sketch). W.l.o.g. assume that the single cycle is contained in Ch. For such I,
the guard h in the SO tgd D 2 D is indistinguishable from B in B. Indeed,
the conclusions of these implications coincide up to renaming of functional terms. The
antecedent 'h of h has an extra atom C(x0; x1), that prevents it from being satisfied
by a Ch-fact at the beginning of the chain. However, if ChI is a cycle, the relations q1j2
defined as q1(x1; x2) $ Ch(x1; x2) and q2(x1; x2) $ 9x0 Ch(x0; x1) ^ Ch(x1; x2),
coincide, and so do B and h. Hence, J = chase(I; D) contains a subinstance JB
isomorphic to chase(I; B). Using Property 2, J ! JB can be shown.</p>
        <p>Next, we address the source instances I containing multiple connected components
iGnKChhIajvs. tThheefaccotnsnoefctKednaesssveirstidceefis,necdonansefcotleldowbsy: aGnivuenndairnecintesdtaendcgeeKw,htheneefvaecrt
cgorarrpehsponding facts have a common term. The facts f1; f2 2 K are connected, if there is a
path between the corresponding vertices in GK . The following property can be shown
by induction on the length of the path between two D-facts:
Lemma 3. Let D be a simulation mapping, I a source instance over S, and let d1; d2
be facts in J = chase(I; D) containing respectively terms (x1; y1) and (x2; y2).
Assume that x1 is a term in a fact c0h 2 ChI , and x2 in c0h0 2 ChI . Likewise, let y1 and y2
occur, respectively, in the facts c0v; c0v0 2 CvI . Then, d1 and d2 are connected in J only if
the facts c0h and c0h0 are connected in ChI , and so are c0v and c0v0 in CvI .</p>
        <p>The above lemma implies, that if the source I contains lh and lv connected components
in ChI and CvI respectively, then for any simulation mapping , chase(I; ) can be
decomposed into lhlv pairwise disconnected instances. Furthermore, if dual graphs of
the connected components of CI</p>
        <p>hjv are acyclic chains or cycles, then by applying one of
the Lemmas 1, 2 in each of the lhlv cases, the reduction established by these lemmas can
be lifted to the the case when relations CI</p>
        <p>Finally, we observe that union of chahinjvs caonndtaciyncmleuslitsiptlheecoonnlnyectytepde coofmdupaolngenratsp.hs
which is relevant. Indeed, if the dual graph of a source instance I contains a V-formed
subgraph fC(x; y1); C(x; y2)g or fC(y1; x); C(y2; x)g with y1 6= y2, then the key
dependencies ensure that for any simulation mapping (including B) the set Sol (I; )
is empty. This concludes our reduction from the domino problem to CQ-equivalence of
SO tgds. We may thus formulate the main result of this section.</p>
        <p>Theorem 3. The CQ-equivalence of mappings consisting of SO tgds and source key
dependencies is undecidable.</p>
        <sec id="sec-4-1-1">
          <title>For the simulation mappings D and</title>
          <p>easily proved that D CQ B holds iff</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>B in the above problem reduction, it can be D B holds. We thus immediately get</title>
          <p>Theorem 4. The logical equivalence of mappings consisting of SO tgds without
equalities in the antecedents and source key dependencies is undecidable.
5</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>
        Testing the equivalence of schema mappings is a fundamental task. For mappings
defined by s-t tgds and target dependencies in the form of tgds and equality generating
dependencies (egds) this problem has been well studied. In the general case, the (logical)
equivalence of such mappings is undecidable. If the target tgds admit a finite chase, then
the problem becomes decidable [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], two relaxed notions of equivalence were
introduced, namely the CQ-equivalence studied here and data exchange equivalence
(DE-equivalence). Moreover, in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the undecidability of CQ-equivalence was proved
for mappings consisting of s-t tgds and target tgds. The undecidability was extended in
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] in two directions, namely to mappings with target egds and from CQ-equivalence
to DE-equivalence. The equivalence of SO tgds has been unexplored so far.
      </p>
      <p>
        In this paper, we have proved the undecidability of logical equivalence. For the
case of source schemas with key dependencies, we proved the undecidability of
CQequivalence. The status of CQ-equivalence without any assumptions on the source
schema has been left open. Closing this gap is on top of the agenda for future work.
Moreover, the search for decidable fragments should be initiated. Finally, we also want
to study the DE-equivalence [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] mentioned above. We conjecture that DE-equivalence
of SO tgds is also undecidable. However, this has to be proved yet.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>C.</given-names>
            <surname>Allauzen</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Durand</surname>
          </string-name>
          .
          <article-title>The classical decision problem, by E</article-title>
          . Bo¨rger, E.
          <source>Gra¨del and Yu</source>
          . Gurevich, chapter “Appendix A: Tiling Problems”, pages
          <fpage>407</fpage>
          -
          <lpage>420</lpage>
          . Springer,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , J. Pe´rez,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Reutter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Riveros</surname>
          </string-name>
          .
          <article-title>Composition and inversion of schema mappings</article-title>
          .
          <source>SIGMOD Record</source>
          ,
          <volume>38</volume>
          (
          <issue>3</issue>
          ):
          <fpage>17</fpage>
          -
          <lpage>28</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , J. Pe´rez, and
          <string-name>
            <given-names>C.</given-names>
            <surname>Riveros</surname>
          </string-name>
          .
          <article-title>The recovery of a schema mapping: Bringing exchanged data back</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>34</volume>
          (
          <issue>4</issue>
          ),
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>C.</given-names>
            <surname>Beeri</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>A proof procedure for data dependencies</article-title>
          .
          <source>J. ACM</source>
          ,
          <volume>31</volume>
          (
          <issue>4</issue>
          ):
          <fpage>718</fpage>
          -
          <lpage>741</lpage>
          ,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Berger</surname>
          </string-name>
          .
          <article-title>The undecidability of the domino problem</article-title>
          .
          <source>Amer Mathematical Society</source>
          ,
          <year>1966</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          .
          <article-title>Applying model management to classical meta data problems</article-title>
          .
          <source>In Proc. CIDR'03</source>
          ,
          <year>2003</year>
          . available from http://www-db.cs.wisc.edu/cidr/cidr2003/program/p19.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Melnik</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Nash</surname>
          </string-name>
          .
          <article-title>Implementing mapping composition</article-title>
          .
          <source>VLDB J</source>
          .,
          <volume>17</volume>
          (
          <issue>2</issue>
          ):
          <fpage>333</fpage>
          -
          <lpage>353</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          .
          <article-title>Inverting schema mappings</article-title>
          .
          <source>In Proc. PODS'06</source>
          , pages
          <fpage>50</fpage>
          -
          <lpage>59</lpage>
          . ACM,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Data exchange: semantics and query answering</article-title>
          .
          <source>Theor. Comput. Sci.</source>
          ,
          <volume>336</volume>
          (
          <issue>1</issue>
          ):
          <fpage>89</fpage>
          -
          <lpage>124</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nash</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Towards a theory of schema-mapping optimization</article-title>
          .
          <source>In Proc. PODS'08</source>
          , pages
          <fpage>33</fpage>
          -
          <lpage>42</lpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Data exchange: getting to the core</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>30</volume>
          (
          <issue>1</issue>
          ):
          <fpage>174</fpage>
          -
          <lpage>210</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Tan</surname>
          </string-name>
          .
          <article-title>Composing schema mappings: Second-order dependencies to the rescue</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>30</volume>
          (
          <issue>4</issue>
          ):
          <fpage>994</fpage>
          -
          <lpage>1055</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>A. Y. Halevy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Rajaraman</surname>
            , and
            <given-names>J. J.</given-names>
          </string-name>
          <string-name>
            <surname>Ordille</surname>
          </string-name>
          .
          <article-title>Data integration: The teenage years</article-title>
          .
          <source>In Proc. VLDB'06</source>
          , pages
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          . ACM,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>J.</given-names>
            <surname>Madhavan</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          .
          <article-title>Composing mappings among data sources</article-title>
          .
          <source>In Proc. VLDB'03</source>
          , pages
          <fpage>572</fpage>
          -
          <lpage>583</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>S.</given-names>
            <surname>Melnik</surname>
          </string-name>
          .
          <source>Generic Model Management: Concepts and Algorithms</source>
          , volume
          <volume>2967</volume>
          of Lecture Notes in Computer Science. Springer,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>A.</given-names>
            <surname>Nash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Melnik</surname>
          </string-name>
          .
          <article-title>Composition of mappings given by embedded dependencies</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>32</volume>
          (
          <issue>1</issue>
          ):
          <fpage>4</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>R.</given-names>
            <surname>Pichler</surname>
          </string-name>
          , E. Sallinger, and
          <string-name>
            <given-names>V.</given-names>
            <surname>Savenkov</surname>
          </string-name>
          .
          <article-title>Relaxed notions of schema mapping equivalence revisited</article-title>
          .
          <source>In Proc. ICDT'11</source>
          , pages
          <fpage>90</fpage>
          -
          <lpage>101</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>