<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Decidability of Consistent Query Answering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marcelo Arenas</string-name>
          <email>marenas@ing.puc.cl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leopoldo Bertossi</string-name>
          <email>bertossi@scs.carleton.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Carleton University, School of Computer Science</institution>
          ,
          <addr-line>Ottawa</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Pontificia Universidad Cat ́olica de Chile, Department of Computer Science</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Consistent query answering (CQA) is about formally characterizing and computing semantically correct answers to queries posed to a database that may fail to satisfy certain integrity constraints. In this paper, we revisit the decidability status of consistent query answering by considering different parameters of the problem as input to the decision problem. More specifically, we obtain some new results about the undecidability and combined complexity of CQA.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        e.g. size of the query or of the set of ICs. There are also few results about
undecidability of CQA. Exceptions in this direction are [
        <xref ref-type="bibr" rid="ref11 ref8">11, 8</xref>
        ]; and both refer to
a form of combined undecidability, i.e., without a fixed query and a fixed set of
ICs. We do not elaborate more on [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], because the repair semantics is different
from the one investigated in this paper. Furthermore, the results in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] heavily
rely on having denial constraints that involve numerical attributes.
      </p>
      <p>
        However, some results in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] are highly relevant in our context. The
framework there considers a combination of key constraints (KCs) and inclusion
dependencies (IDs). In case of inconsistency, KCs are repaired via tuple deletions;
but IDs are repaired only by insertion of tuples. If the original instance is
consistent wrt the KCs, then inconsistencies are solved by insertion of whole tuples,
which can happen in possibly many different ways, and only constrained by the
KCs. The new tuples can take values in the database domain associated to the
schema. In this case, the repairs minimize the set of inserted tuples wrt set
inclusion.
      </p>
      <p>
        It is proved in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] that CQA is undecidable. This result is obtained by
reduction from the problem of deciding entailment of dependencies from a set
of functional dependencies (FDs) and inclusion dependencies: F ∪ I |= δ? Here,
F and I are sets of FDs and IDs, respectively, and δ is a dependency. This
problem is undecidable [1, theorem 9.2.4]. A careful examination of the proof
in [11, theorems 3.2, 3.4], shows that the reduction in it requires, in essence,
generating for the combination of F , I and δ an ad hoc combination of KCs,
IDs, a query and a database instance. In consequence, the undecidability result
involves as input to the problem all the three natural parameters of CQA, namely
the ICs, the query and the instance. The set I is actually a cyclic set of IDs.
      </p>
      <p>In this paper, we revisit the decidability status of consistent query
answering by considering different parameters of the problem as input to the decision
problem. We obtain some new results about the undecidability and combined
complexity of CQA. More specifically, we prove that CQA is undecidable for
all the combinations of the possible inputs (instance, ICs, query) to the
problem depending upon the parameters that are left fixed. We also establish the
decidability of CQA for a broad class of universal ICs, with the combination
of instance, ICs, and query as the input. Furthermore, we study the combined
complexity on data and ICs (fixed query) for this class of dependencies.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>If we start from a database schema S, including a countably infinite database
domain U , we may consider the first-order language L(S) based on the predicates
in S and the constants for the elements of U . Database instances for schema S are
Herbrand structures with domain U and finite extensions for the interpretations
of the predicates in S. In consequence, a database instance D can be identified
with a set of ground atoms R(t¯), with R ∈ S and t¯ a finite sequence of elements
of U of the same length as the arity of R. The active domain of D, denoted by
dom(D), is the finite set that contains all the elements of U that appear in D.
We also allow built-in predicates in L(S), like =, 6=, &lt;, etc., that have fixed and
predefined extensions in a database instance.</p>
      <p>An integrity constraint (IC) over a database schema S is a first-order sentence
in L(S). In particular, a universal IC is a sentence of the form ∀x¯ ψ, where ψ does
not contain any quantifiers. We use IC to denote a set of integrity constraints,
and we assume that IC is consistent in the sense that there is some database
instance that makes it true, but not necessarily by the one at hand. It is always
the case that in CQA, and also in this paper, we consider database instances
that may not satisfy the given set of ICs.</p>
      <p>
        Definition 1. A database instance D is consistent with respect to IC if D
satisfies IC , denoted by D IC; and inconsistent, otherwise. 2
Definition 2 ([
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]).
(a) The distance Δ(D1, D2) between database instances D1, D2 is the symmetric
difference (D1 r D2) ∪ (D2 r D1).
(b) For a fixed instance D and instances D′, D′′: D′ ≤D D′′ iff Δ(D, D′) ⊆
Δ(D, D′′).
(c) For a given database instances D, we say that an instance D′ is a repair of D
wrt IC iff D′ IC and D′ is ≤D-minimal in the class of database instances
that satisfy IC . 2
Notice that the extensions of the built-in predicates do not contribute to Δ,
because they have fixed extensions, identical in every database instance.
      </p>
      <p>
        Queries are formulas of L(S). If a query is a sentence, i.e., it has not free
variables, it is called a Boolean query. We assume that queries and ICs satisfy the
usual syntactic safety conditions [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] that make them domain independent [
        <xref ref-type="bibr" rid="ref17 ref24">17,
24</xref>
        ]. This means that they can be evaluated or checked against the active domain
of the database instance.
      </p>
      <p>
        For a query Q(x¯), with free variables in x¯, and a database instance D, a tuple
t¯ of constants of U is an answer to Q in D iff D |= Q[t¯], i.e., Q is true in D when
the variables in x¯ are replaced by the corresponding values in t¯.
Definition 3 ([
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). Given a set of integrity constraints, a (ground) tuple t¯ is
a consistent answer to a query Q(x¯) in a database instance D wrt a set of
integrity constraints IC , denoted D |=IC Q[t¯], if for every repair D′ of D wrt
IC , D′ Q[t¯]. If Q is Boolean, then yes is the consistent answer if for every
repair D′ of D, D′ Q; otherwise the consistent answer is no. 2
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>The Decision Problem of CQA</title>
      <p>The computational problem of retrieving consistent answers to a query from a
possibly inconsistent database can be studied by concentrating on the decision
problem of consistent query answering. Actually, this gives rise to several decision
problems depending on which of the natural parameters involved are considered
to be fixed and which are considered to be a part of the input. The most basic
decision problems of CQA are the following.
(a) For a Boolean query Q and a consistent set IC of integrity constraints:
d -CQA(Q, IC ) := {D | D is a database instance and D |=IC Q}.
(b) For a database instance D and a Boolean query Q:</p>
      <p>
        ic-CQA(D, Q) := {IC | IC is a consistent set of ICs and D |=IC Q}.
(b) For a database instance D and a consistent set IC of integrity constraints:
q-CQA(D, IC ) := {Q | Q is a Boolean query and D |=IC Q}.
2
Problem d -CQA(Q, IC ) is the problem of CQA in data, where the ICs and the
query are fixed -the parameters of the problem- and the database is variable,
i.e., the input. The data complexity [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] of CQA is precisely the complexity of
this problem, for different classes of parameters IC, Q. As common in database
research, this is the problem that has been investigated the most. This problem
is decidable for all common classes of ICs and queries, and tight complexity
bounds have been established [
        <xref ref-type="bibr" rid="ref11 ref12 ref18 ref2 ref21 ref23 ref27 ref28 ref8">11, 12, 27, 18, 8, 21, 28, 2, 23</xref>
        ].
      </p>
      <p>We may consider combined versions of the CQA decision problem. For
example, for a given Boolean query Q:
{d, ic}-CQA(Q) := {(D, IC ) | D is a database instance,</p>
      <p>IC is a set of ICs and D |=IC Q}, (1)
and, more generally,
{d, q, ic}-CQA := {(D, Q, IC ) | D is a database instance,</p>
      <p>Q is a Boolean query, IC is a set of ICs and D |=IC Q}. (2)
In what follows, when we refer to {d}-CQA(Q, IC ), we mean d-CQA(Q, IC ), as in
Definition 4; the same for ic and q. The complexity of {d , q}-CQA(IC ) is usually
called combined complexity of CQA (with fixed ICs). It is measured in terms of
the size of both the database and the query. In the case of {q, ic, d}-CQA, the
instance, the query and the set of ICs become part of the input. All this decision
problems can also be stated for queries with free variables, say Q(x¯), and tuples
t¯ of constants in U , asking if t¯ is a consistent answer. Since the schema has
the constants in U , we may restrict ourselves to Boolean queries. As already
indicated above, it holds:</p>
      <sec id="sec-3-1">
        <title>Proposition 1 ([11]). {d, q, ic}-CQA is undecidable.</title>
        <p>
          2
The proof of this proposition given in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] uses key constraints and cyclic sets
of referential integrity constraints (RICs) [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The existential quantifiers of the
RICs are satisfied by picking up values from the underlying database domain
U . We should mention that, in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], {q, ic, d}-CQA was made decidable for this
type of ICs by considering repairs based on introduction of null values as used
in SQL databases. Those null values are used for the existential variables in
the RICs. Actually, the decidability follows in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] from the use of disjunctive
Datalog programs with stable model semantics [
          <xref ref-type="bibr" rid="ref15 ref19">19, 15</xref>
          ] to specify the repairs of
the original instance wrt universal ICs and RICs, including those considered in
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. These repair programs have been used before [
          <xref ref-type="bibr" rid="ref20 ref5 ref6">5, 20, 6</xref>
          ], and they provide an
upper bound of Π2P for the time complexity of d-CQA(Q, IC ) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. This is also
a lower bound since CQA can be Π2P -complete in data [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
4
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Undecidability of CQA</title>
      <p>Now we turn to some of the other decision problems for CQA. Clearly, for Y ⊆
X ⊆ {d, ic, q}, if X-CQA is decidable, then also Y-CQA is decidable, but not
necessarily the other way around. We first consider the problem of CQA in
integrity constrains.</p>
      <p>Proposition 2. There exist a database instance D and a Boolean query Q such
that ic-CQA(D, Q) is undecidable.</p>
      <p>
        Proof. Let S be a database schema that contains a binary predicate E and
a unary predicate P . Consider the instance D of S whose extension for P is
{P (a)}, where a is an element of U , and whose extension for E is ∅.
Furthermore, consider the Boolean query Q : ¬P (a); and the problem SAT := {ψ ∈
L({E}) | ψ is satisfiable}. This problem is undecidable over finite graphs [16,
sec. 7.2.1], and can be reduced to the complement of ic-CQA(D, Q) as follows:
Given a first-order sentence ψ ∈ L({E}), define a set of integrity constraints IC
by {ψ ↔ ∃xP (x)}. It is easy to check that ψ ∈ SAT iff D 6|=IC Q. 2
Now we address the problem of CQA in queries, that is, when queries are the
input of the problem. The complexity of this decision problem is the query
complexity of CQA [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
      </p>
      <p>Proposition 3. There exists a database instance D and a consistent set IC of
integrity constraints such that q-CQA(D, IC ) is undecidable.</p>
      <p>Proof. Let S be a database schema containing a unary predicate P , a binary
predicate E, a ternary predicate F and built-in predicate ≤ that is interpreted
in such a way that (U, ≤) is isomorphic to (N, ≤). Then let D be the empty
instance of S, and IC be a set consisting of the following integrity constraints:
∃xP (x),
∀u∀v(E(u, v) ↔ ∃yF (u, v, y)),
∀x∀y(P (x) ∧ y ≤ x → ∃u∃vF (u, v, y)).</p>
      <p>Consider SAT := {ψ ∈ L({E}) | ψ is satisfiable}. As mentioned in the proof
of Proposition 2, this problem is undecidable over finite graphs. Next we show
that the restriction of SAT to finite graphs can be reduced to the complement
of q-CQA(D, IC ). In fact, we prove that ψ ∈ SAT iff ¬ψ 6∈ q-CQA(D, IC ).</p>
      <p>Assume that ψ ∈ SAT , and let D′ be a database instance over {E} satisfying
ψ. Define a database instance D⋆ over S as follows. The interpretation of E in
D⋆ coincides with the interpretation of E in D′. Let t¯1, . . ., t¯m be the tuples
in the interpretation of E in D′, and assume that a1, . . ., am are the first m
elements of U according to linear order ≤. Then the interpretations of P and F
in D⋆ are {am} and {(t¯i, ai) | 1 ≤ i ≤ m}, respectively. It is straightforward to
prove that D⋆ |= IC . Furthermore, D⋆ is a repair of D since no proper subset
of D⋆ satisfies IC . Thus, given that D⋆ |= ψ, we conclude that D 6|=IC ¬ψ
and, therefore, ¬ψ ∈/ q-CQA(D, IC ). The other implication is immediate, which
concludes the proof of the proposition. 2
Finally, we also consider the case when integrity constraints and queries are
considered to be fixed.</p>
      <p>Proposition 4. There exist a Boolean query Q and a consistent set IC of
integrity constraints such that d -CQA(Q, IC ) is undecidable.</p>
      <p>Proof. We will reduce to the complement of our problem the halting problem
for deterministic Turing machines with empty string on an infinite unidirectional
tape with alphabet 0, 1, B (blank). The integrity constraints will codify the
dynamics of the machine, and the database instance will contain the transition
function δ of the machine. Repairing the instance wrt the dynamics of the
machine corresponds to making the machine compute. The initial and final states
are q0, qf , respectively. We need the following predicates:
1. Head (t, p): At time t the head is at position with number p of the tape.
2. State(t, q): At time t the state is q.
3. Conf a(t, p): At time t and position p the symbol read is a. Here, a 6= B; and
a blank is represented by the absence of 0 or 1.
4. δa,a′,R(q, q′): Being at state q, reading a causes, according to the transition
function δ, to write a′, move to the right (R) and jump to state q′. There
are similar predicates, L, for left.
5. Succ(n, n′): For natural numbers n, n′, it holds that n′ is the successor of
n. This predicate can be defined using the built-in predicate &lt;, and its
definition can be part of IC . Thus, we also assume that the database schema
contains a built-in predicate &lt; that is interpreted in such a way that (U, &lt;)
is isomorphic to (N, &lt;).</p>
      <p>Assume that a0 is the first element of U according to linear order &lt;. Then IC
contains the following sentences:
(a) Head (a0, a0) and State(a0, q0).
(b) ∀p(¬Conf 0(a0, p) ∧ ¬Conf 1(a0, p)). This formula states that that we start
with a blank tape. In general, we represent with ¬Conf 0(t, p) ∧ ¬Conf 1(t, p)
the fact that at time t in position p there is a blank.
∀t∀t′∀q∀q′∀p∀p′ State(t, q) ∧ Head (t, p) ∧ Conf a(t, p) ∧
δa,a′,L(q, q′) ∧ Succ(t, t′) ∧ Succ(p′, p) →</p>
      <p>Conf a′ (t′, p) ∧ Head (t′, p′) ∧ State(t′, q′) .</p>
      <p>There is a similar sentence for δa,a′,R. If a is B, in the previous formula
Conf a(t, p) is replaced by ¬Conf 0(t, p) ∧ ¬Conf 1(t, p), obtaining:
∀t∀t′∀q∀q′∀p∀p′ State(t, q) ∧ Head (t, p) ∧ ¬Conf 0(t, p) ∧
¬Conf 1(t, p) ∧ δB,a′,L(q, q′) ∧ Succ(t, t′) ∧ Succ(p′, p) →</p>
      <p>Conf a′ (t′, p) ∧ Head (t′, p′) ∧ State(t′, q′) .</p>
      <p>Similar formulas are obtained when a′ = B.
(d) For every a 6= B:</p>
      <p>∀t∀t′∀p∀p′(Head (t, p) ∧ p 6= p′ ∧ Succ(t, t′) → (Conf a(t, p′) ↔ Conf a(t′, p′))).
The query Q is ¬∃t State(t, qf ). All these elements determine a particular
decision problem of the form q-CQA(IC , Q), parameterized by IC and Q. Now, we
show that this problem is undecidable by reduction from the halting problem.</p>
      <p>For a machine M , the database instance D(M ) corresponds to the transition
function δ by means of the δ-predicates above and their contents. The other
relations are empty. It holds that M halts at state qf iff D(M ) 6|=IC Q. In fact:
(⇒) In this case, there is a repair D(M )′ corresponding to the finite computation
of M , i.e., it contains all the configuration tuples that lead to the final state. For
this repair, it holds that D(M )′ 6|= Q. Thus, D(M ) 6|=IC Q.
(⇐) Assume M does not halt at state qf . We have two cases: First, assume that
M does not halt (at any state). The infinite computation does not generate a
repair, because repairs have finite relations. In this case, repairs can be obtained
by deletion of tuples from D(M ), and then they correspond to sub-function
of the original δ. None of the computations related to such a repair can reach
state qf , because they represent subcomputation of the original machine (the
determinism of the machine is crucial here). In this case, all the repairs satisfy
Q and, therefore D(M ) |=IC Q. In the second case, M halts at a state different
from qf . A similar argument as in the previous case can be applied. 2
It is important to notice that the proof of Proposition 4 relies on the
availability of a built-in linear order. Also, in this proof we use universal ICs except
for the definition of the successor function. However, this definition, namely
∀m∀n(Succ(m, n) ↔ m &lt; n ∧ ¬∃j(m &lt; j ∧ j &lt; n)), as an IC does not
contain “repairable” predicates, i.e., database predicates. Thus, one could also rely
on the availability of a built-in successor predicate to prove this proposition,
avoiding the use of non-universal constraints.</p>
      <p>The propositions above have been established for the single input cases in
Definition 4. However, several possible inputs can be combined, as in (1) and
(2). From Propositions 2, 3 and 4, we obtain:
Theorem 1 (Undecidability of consistent query answering). For every
nonempty subset X of {d, q, ic}, there are cases where X-CQA is undecidable.
The cases depend upon the parameters of the problem, i.e., on {d, q, ic} r X. 2
5</p>
    </sec>
    <sec id="sec-5">
      <title>Combined Decidability and Complexity of CQA</title>
      <p>In this section, we concentrate on a particular but common class of ICs.
Definition 5. An integrity constraint is a safe universal IC if both:
(a) It is logically equivalent to a sentence of the form</p>
      <p>
        m n
∀(_ Pi(x¯i) ∨ _ ¬Qj(y¯j) ∨ ψ),
i=1 j=1
(3)
where ∀ represents the universal closure of the formula, x¯i, y¯j are tuples
of variables, the Pi, Qj are database predicates, and ψ is a formula that
mentions only the built-in predicates.
(b) Each variable in an x¯i or ψ in (3) appears in some y¯j. 2
The second condition is the safety condition that guarantees domain
independence [
        <xref ref-type="bibr" rid="ref17 ref24">17, 24</xref>
        ]. It is easy to see that, according to this definition, the ICs in
Proposition 4 are not safe, nor domain independent.
      </p>
      <p>Theorem 2. For finite sets IC of safe universal ICs, and Boolean queries Q,
the problem of CQA, i.e., {(D, Q, IC ) | D |=IC Q}, is decidable.
Proof. We may assume that the sentences in IC are in the clausal form (3). This
representation for ICs allows us to determine all the repairs of a database that
does not satisfy them: From the domain independence condition, the finitely
many tuples that participate in a violation and whose modification may lead to
a restoration of the consistency of the database, can be obtained by posing to D,
for each IC (3), the domain independent query about the set of violating tuples:
n m
V (y¯1, . . . , y¯n) :- ^ Qj(y¯j) ∧ ¬ψ ∧ ^ ¬Pi(x¯i).</p>
      <p>
        j=1 i=1
Repairs are obtained by deleting Qj-ground tuples and/or inserting Pi-ground
tuples. In this way, the finitely many repairs can be computed. Then, the query
Q can be evaluated in every single repair. If for all of them it becomes true, the
answer is yes, otherwise, is no. 2
Notice that the decision algorithm in the proof of Theorem 2 requires the
computation all of possible repairs; and we know that there may be exponentially
many of them in the size of the database [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        This decidability result extends similar implicit results [
        <xref ref-type="bibr" rid="ref10 ref20 ref5 ref6">5, 20, 6, 10</xref>
        ] that
guarantee decidability for certain syntactic classes of universal ICs. Their syntax
makes them safe, and then, also domain independent. This is obtained from the
representation of repairs as stable models of disjunctive logic programs. A direct
proof of decidability of {d, ic, q}-CQA for this class of universal constraints plus
RICs can be found in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] (existential variables are satisfied with an SQL null).
Decidability of {d, ic, q}-CQA is proved in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for sets of non-conflicting key
constrains and inclusion dependencies.
      </p>
      <p>In the following, we study a form of combined complexity of CQA, where the
input consists of the database and the ICs.3 This is a special case of the general
{d, ic}-CQA problem in (1) where only universal, domain independent ICs are
considered in the input.</p>
      <sec id="sec-5-1">
        <title>Theorem 3.</title>
        <p>(a) For every Boolean query Q,</p>
        <p>
          CQA(Q) := {(D, ϕ) | ϕ is a safe universal IC and D |={ϕ} Q}
is in co-2-NEXP.
(b) There are Boolean queries for which the same problem is co-NEXP-hard.
Proof. (a) We prove that the complement of CQA(Q) belongs to 2-NEXP, that
is, we prove that there exists a non-deterministic Turing machine M that works
in double exponential time and accepts the complement of CQA(Q). Let D
be a database instance and ϕ a safe universal IC. In order to verify whether
(D, ϕ) 6∈ CQA(Q), Turing machine M first transforms ϕ into an equivalent
IC ψ of the form (3), then it guesses a repair D′ of D wrt ψ, and finally it
checks whether D′ 6|= Q (which implies that D 6|={ϕ} Q). Sentence ψ can be of
exponential size in the size of ϕ, and it takes exponential time to generate this
formula. Checking that D′ is a repair of D can be done in exponential time in the
size of ψ and D, because it might be necessary to compare D′ for inclusion with
exponentially many candidates (see the proof of Theorem 2). Finally, verifying
that D′ 6|= Q can be done in polynomial time in the size of D′ since Q is assumed
to be a fixed query. Thus, we conclude that M works in double exponential time
in the size of ϕ and D.
(b) Let P be a unary predicate and a ∈ U . Next we prove that the complement
of CQA(Q) is NEXP -hard, where Q is the Boolean query P (a). More precisely,
next we provide a reduction from SAT for the Bernays-Schoenfinkel’s syntactic
class of first-order formulas (BSC ) to the complement of CQA(Q). The former
problem is decidable and NEXP -complete [22, chapter 20].
3 Other cases of combined complexity for CQA around key constraints and inclusion
dependencies can be found in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>Consider a schema S, and let χ: ∃x1 · · · ∃xn∀y¯ ψ(x1, . . . , xn, y¯) ∈ L(S), with
ψ quantifier free, be an instance for BSC . Without loss of generality, assume that
S does not contain unary predicate P . If χ is satisfiable, then it is satisfiable in
the universe U = {1, . . . , n}. In consequence, a sentence in BSC is satisfiable iff
it is satisfiable in a finite structure (actually with finite universe), which allows
us to use this result in our context.</p>
        <p>Let a1, . . ., an be pairwise distinct elements from U , which are all distinct
from a, U1, . . ., Un be unary predicates not contained in S ∪ {P }, and D be the
empty database instance over S ∪ {U1, . . . , Un, P }. Furthermore, let ϕ be the
conjunction of:
(4)
(5)
(6)
n
^ ∀x∀y
i=1
n n
^ _ Ui(aj ),
i=1 j=1</p>
        <p>Ui(x) ∧ Ui(y) → x = y ,
∀x1 · · · ∀xn U1(x1) ∧ · · · ∧ Un(xn) → ∀y¯ ψ(x1, . . . , xn, y¯) ∨ P (a).
The first two sentences make sure that each Ui contains exactly one element of
U . If χ is satisfiable, then there will be a repair where each of the U1, ..., Un will
contain exactly one element of U and nothing else, and P will be empty. In that
repair, the query P (a) will be false and, thus, D 6|={ϕ} P (a). On the other hand,
if χ is not satisfiable, then no matter how U1, . . ., Un and the relations in S are
populated, sentence
∀x1 · · · ∀xn</p>
        <p>U1(x1) ∧ · · · ∧ Un(xn) → ∀y¯ ψ(x1, . . . , xn, y¯)
will not be satisfied. Therefore, all the repairs of D will include element a into
P in order to satisfy (6), which implies that D |={ϕ} P (a). Hence, we conclude
that χ is satisfiable iff D 6|={ϕ} P (a).</p>
        <p>The IC ϕ is clearly universal, but we have to check safety. The first two
conjuncts obviously are. For the third conjunct, due to the universal
quantification on x1, ..., xn and their occurrence in the predicates U1, ..., Un, the only
source of non-safety could be formula ψ. There is no reason for ψ to be safe.
However, for the purpose of establishing our complexity lower-bound, we may
restrict ourselves to a subclass of sentences in BSC that still is NEXP -hard,
namely sentences that encode bounded tiling problems. Those sentences turn out
to be safe (cf. [9, theorem 6.2.21] for details).4 2
4 In particular, due to the bounded domain to be tiled, for each instance we do not
need to go beyond a finite numerical domain U . The same would happen if we were
encoding computations of Turing machines that are bounded in time.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Abiteboul</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hull</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Vianu</surname>
          </string-name>
          , V. Foundations of Databases. Addison-Wesley,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Afrati</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <source>Ph. Repair Checking in Inconsistent Databases: Algorithms and Complexity. Proc. of the International Conference on Database Theory (ICDT 09)</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Chomicki</surname>
          </string-name>
          ,
          <source>J. Consistent Query Answers in Inconsistent Databases. Proc. ACM Symposium on Principles of Database Systems (PODS 99)</source>
          ,
          <year>1999</year>
          , pp.
          <fpage>68</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chomicki</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raghavan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Spinrad</surname>
          </string-name>
          ,
          <source>J. Scalar Aggregation in Inconsistent Databases. Theoretical Computer Science</source>
          ,
          <year>2003</year>
          ,
          <volume>296</volume>
          (
          <issue>3</issue>
          ):
          <fpage>405</fpage>
          -
          <lpage>434</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Chomicki</surname>
            ,
            <given-names>L. Answer</given-names>
          </string-name>
          <article-title>Sets for Consistent Query Answering in Inconsistent Databases</article-title>
          .
          <source>Theory and Practice of Logic Programming</source>
          ,
          <year>2003</year>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          -5):
          <fpage>393</fpage>
          -
          <lpage>424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Barcelo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bravo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Characterizing</surname>
          </string-name>
          and
          <article-title>Computing Semantically Correct Answers from Databases with Annotated Logic and Answer Sets</article-title>
          . Chapter in Semantics of Databases, Springer LNCS 2582,
          <year>2003</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Consistent Query Answering in Databases</article-title>
          .
          <source>In ACM Sigmod Record</source>
          ,
          <year>2006</year>
          ,
          <volume>35</volume>
          (
          <issue>2</issue>
          ):
          <fpage>68</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bravo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lopatenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>The Complexity and Approximation of Fixing Numerical Attributes in Databases Under Integrity Constraints</article-title>
          .
          <source>Information Systems</source>
          ,
          <year>2008</year>
          ,
          <volume>33</volume>
          (
          <issue>4</issue>
          ):
          <fpage>407</fpage>
          -
          <lpage>434</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B</given-names>
            <surname>¨orger</surname>
          </string-name>
          , E., Gr¨adel, E. and
          <string-name>
            <surname>Gurevich</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <article-title>The Classical Decision Problem</article-title>
          . Springer,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Bravo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Semantically Correct Query Answers in the Presence of Null Values</article-title>
          .
          <source>Proc. EDBT International Workshop on Inconsistency and Incompleteness in Databases (IIDB 06)</source>
          . Springer LNCS 4254,
          <year>2006</year>
          , pp.
          <fpage>336</fpage>
          -
          <lpage>357</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Cali</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>On the Decidability and Complexity of Query Answering over Inconsistent and Incomplete Databases</article-title>
          .
          <source>Proc. ACM Symposium on Principles of Database Systems (PODS 03)</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>260</fpage>
          -
          <lpage>271</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Chomicki</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Marcinkowski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Minimal-Change Integrity Maintenance Using Tuple Deletions</surname>
          </string-name>
          .
          <source>Information and Computation</source>
          ,
          <year>2005</year>
          ,
          <volume>197</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>90</fpage>
          -
          <lpage>121</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Chomicki</surname>
            ,
            <given-names>J. Consistent</given-names>
          </string-name>
          <string-name>
            <surname>Query</surname>
          </string-name>
          <article-title>Answering: Five Easy Pieces</article-title>
          .
          <source>Proc. International Conference on Database Theory (ICDT 07)</source>
          , Springer LNCS 4353,
          <year>2007</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Dantsin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eiter</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Voronkov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Complexity And</surname>
          </string-name>
          <article-title>Expressive Power Of Logic Programming</article-title>
          .
          <source>ACM Computer Surveys</source>
          ,
          <year>2001</year>
          ,
          <volume>33</volume>
          (
          <issue>3</issue>
          ):
          <fpage>374</fpage>
          -
          <lpage>425</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Eiter</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Mannila</surname>
            ,
            <given-names>H. Disjunctive</given-names>
          </string-name>
          <string-name>
            <surname>Datalog</surname>
          </string-name>
          .
          <source>ACM Transactions on Database Systems</source>
          ,
          <year>1997</year>
          ,
          <volume>22</volume>
          (
          <issue>3</issue>
          ):
          <fpage>364</fpage>
          -
          <lpage>418</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Ebbinghaus</surname>
          </string-name>
          , H.-D. and
          <string-name>
            <surname>Flum</surname>
          </string-name>
          ,
          <source>J. Finite Model Theory. 2nd edition</source>
          , Springer,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Fagin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>Horn Clauses and Database Dependencies</article-title>
          .
          <source>Journal of the ACM</source>
          ,
          <year>1982</year>
          ,
          <volume>29</volume>
          (
          <issue>4</issue>
          ):
          <fpage>952</fpage>
          -
          <lpage>985</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Fuxman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>First-Order Query Rewriting for Inconsistent Databases</article-title>
          .
          <source>Journal of Computer and System Sciences</source>
          ,
          <year>2007</year>
          ,
          <volume>73</volume>
          (
          <issue>4</issue>
          ):
          <fpage>610</fpage>
          -
          <lpage>635</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Gelfond</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lifschitz</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <article-title>Classical Negation in Logic Programs</article-title>
          and
          <string-name>
            <given-names>Disjunctive</given-names>
            <surname>Databases</surname>
          </string-name>
          . New Generation Computing,
          <year>1991</year>
          ,
          <volume>9</volume>
          :
          <fpage>365</fpage>
          -
          <lpage>385</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Greco</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greco</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zumpano</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <article-title>A Logical Framework for Querying and Repairing Inconsistent Databases</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <year>2003</year>
          ,
          <volume>15</volume>
          (
          <issue>6</issue>
          ):
          <fpage>1389</fpage>
          -
          <lpage>1408</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Lopatenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bertossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Complexity of Consistent Query Answering in Databases under Cardinality-Based and Incremental Repair Semantics</article-title>
          .
          <source>Proceedings of the International Conference of Database Theory (ICDT 07)</source>
          , Springer LNCS 4353,
          <year>2007</year>
          , pp.
          <fpage>179</fpage>
          -
          <lpage>193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Papadimitriou</surname>
            ,
            <given-names>Ch. Computational</given-names>
          </string-name>
          <string-name>
            <surname>Complexity</surname>
          </string-name>
          . Addison-Wesley,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Staworko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Chomicki</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Consistent Query Answering in the Presence of Universal Constraints</article-title>
          .
          <source>Information Systems</source>
          ,
          <year>2010</year>
          ,
          <volume>35</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Ullman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Principles of Database and Knowledge-Base Systems</article-title>
          , Vol. I. Computer Science Press,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Van Gelder</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Topor</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>Safety and Correct Translation of Relational Calculus Formulas</article-title>
          .
          <source>Proceedings of the Sixth ACM Symposium on Principles of Database Systems (PODS 87)</source>
          . ACM Press,
          <year>1987</year>
          , pp.
          <fpage>313</fpage>
          -
          <lpage>327</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Vardi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>The Complexity of Relational Query Languages</article-title>
          .
          <source>Proc. ACM Symposium on Theory of Computing (STOC 82)</source>
          ,
          <year>1982</year>
          , pp.
          <fpage>137</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Wijsen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Database Repairing Using Updates</article-title>
          .
          <source>ACM Transactions on Database Systems</source>
          ,
          <year>2005</year>
          ,
          <volume>30</volume>
          (
          <issue>3</issue>
          ):
          <fpage>722</fpage>
          -
          <lpage>768</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Wijsen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>On the Consistent Rewriting of Conjunctive Queries Under Primary Key Constraints</article-title>
          .
          <source>Information Systems</source>
          ,
          <year>2009</year>
          ,
          <volume>34</volume>
          (
          <issue>7</issue>
          ):
          <fpage>578</fpage>
          -
          <lpage>601</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>