<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Turning The Partial-closed World Assumption Upside Down</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Simon Razniewski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ognjen Savkovic´</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Werner Nutt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>razniewski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>savkovic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>nutt}@inf.unibz.it</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The partial-closed world setting provides an intermediate ground between open-world and closed-world settings. Previous work on the partial-closed world assumption assumes that incompleteness is the default, i.e., that databases in general can be incomplete, and are only definitely complete in specified parts. In this work we turn this assumption around, and study databases where completeness is the default, and incompleteness only occurs in specified parts. We present four languages for describing potentially incomplete parts of databases, full-table statements, patterns, local statements and query statements. We show that except for full-table statements, it is not possible to translate between the two settings without extending the languages. Finally, we present techniques to decide query completeness entailment for each of the languages both on the schema and instance level, finding that for queries under bag semantics, the complexity is sometimes easier than under the setting where complete parts are specified.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Databases are traditionally considered under the closed-world assumption [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ],
while for many knowledge bases and the semantic web, the open-world
assumption [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is used. In many applications however, only parts of databases
are complete, while in other parts, data is known to be missing, or potentially
missing. One can call this the partial-closed world assumption (PCWA). A problem
for databases under the PCWA is to decide whether a query returns all possible
answers, in which case the query is called complete.
      </p>
      <p>
        Previous work on query completeness under the PCWA used so-called
completeness statements to describe complete parts, while assuming that the
undescribed parts of the database can be incomplete [
        <xref ref-type="bibr" rid="ref11 ref13 ref14 ref3 ref4 ref8">11,8,14,13,4,3</xref>
        ]. We call this
the incompleteness-as-default (IAD) setting.
      </p>
      <p>In this paper, we turn this assumption around and instead assume that
databases are only incomplete in explicitly described parts, while all other parts
are complete. We call this setting completeness-as-default (CAD). The study of this
setting is motivated by the observation that in many business applications, data
is generally of good quality, while only specific parts are questionable.</p>
      <p>Motivating Example. As an example, consider a university database
containing three tables:
– student(name,degree)
– lecturer(name,faculty)
– takes(name,course).</p>
      <p>Assume that for all faculties but Computer Science the course enrolment has
already finished, so takes is only potentially incomplete for enrolments of CS
students. Then clearly, queries for courses of students in the Philosophy degree
will still produce complete results, while queries for subjects of students in the
Computer Science degree might not.</p>
      <p>In order to translate this situation into the IAD setting, we would need a
completeness statement for the takes table for all entries for students that are
not in the CS degree, which requires the use of negation.</p>
      <p>Another motivation for incompleteness statements is that they can be
naturally translated into tasks. In the research information system of our university,
for instance, academics are periodically given tasks to complete information
about publications in recent years. These tasks are the same as incompleteness
statements for the respective periods. Also, the use of incompleteness
statements allows to directly point out why query answers are not complete, which
in general, in the incompleteness-as-default setting cannot be done.</p>
      <p>Research questions. Motivated by the example above, we study three
questions:
1. How can one describe potentially incomplete parts of a database?
2. Which language extensions are needed to translate descriptions from the</p>
      <p>CAD to the IAD setting?
3. How can one decide whether a query answer is complete?
Contribution. Our contribution is to study the completeness-as-default setting
for partially complete databases. In particular we present (1) four di erent
languages for describing potential incompleteness, which are not symmetric to
the ones used in the IAD setting, (2) we show that except for a trivial case, it is not
possible to translate between the two settings without extending the languages,
and (3) we present techniques to decide query completeness entailment for each
of the languages both on the schema and instance level.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Formalization</title>
      <p>We assume a fixed set of relation symbols , and we deal with conjunctive queries
that are satisfiable and do not contain arithmetic comparisons.</p>
      <p>Incompleteness Statement Languages We study four languages for specifying
possibly incomplete parts of a database:</p>
      <sec id="sec-2-1">
        <title>1. Full-table statements,</title>
      </sec>
      <sec id="sec-2-2">
        <title>2. Pattern statements [13],</title>
        <p>
          3. Local statements, similar to the local completeness statements in [
          <xref ref-type="bibr" rid="ref14 ref8">8,14</xref>
          ],
4. Query statements, similar to the query completeness statements in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
Intuitively, full-table statements can be used to assert that a whole table is
potentially incomplete. They are a very coarse way to describe possible
incompleteness, nevertheless, they are the only of the languages studied here for which it is
possible to translate between completeness as default and incompleteness as
default without extending the language. Patterns were introduced in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], while the
local statements closely resemble the local completeness statements introduced
in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], and the query statements somewhat the query completeness statements
introduced in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. We next formalize the statements and their semantics.
        </p>
        <sec id="sec-2-2-1">
          <title>Potential Incompleteness Statements.</title>
          <p>A full table statement has the form PotInc(R), where R is a table.</p>
          <p>A pattern statement has the form PotInc( p(R)), where p is a selection by
constants, i.e., p is “x¯ = c¯”, with x¯ being a subset of the attributes of R, and R
being a table.</p>
          <p>A local statement has the form PotInc(R(x¯); G), where R is a table and G is a
conjunction of atoms.</p>
          <p>A query statement has the form PotInc(Q), where Q is a conjunctive query.
Example 1. Consider a database with three tables, student, lecturer and takes.
A full-table statement PotInc(student) for student would assert that the student
table is possibly incomplete.. A pattern statement PotInc( degree=CS(student))
would assert that students in the CS degree are possibly incomplete. A local
statement PotInc(student(n; d); takes(n; Databases)) would assert that records
in student of students taking Databases are possibly incomplete. A query
statement PotInc(QDatabases(n) :- student(n; d); takes(n; Databases)) would
assert that the query for students taking Databases is potentially incomplete,
which can be either due to missing records in student or to missing records in
takes.</p>
          <p>
            We next formalize the models of potential incompleteness statements. As
common [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ], an incomplete database is a pair (Di; Da) of an ideal and an available
database with Da Di. Given an ideal database Di and a fact f , we say that a
potential incompleteness statement explains the absence of f from Da, if
– Full table statement PotInc(R): f is a fact in relation R.
– Pattern statement PotInc( p(R)): f is an R-fact that satisfies the selection
condition p
– Local statement PotInc(R(x¯); G): f is in QR;G(Di), where QR;G is the query
          </p>
          <p>QR;G(x¯) :- R(x¯); G:
– Query statement PotInc(Q): There exists a valuation for Q over Di that maps
some atom in the body of Q to f .</p>
          <p>Intuitively, models of a set of incompleteness statements are those incomplete
databases where incompleteness occurs only in parts described by
incompleteness statements.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Definition 1. Given a set of incompleteness statements S, models are all incomplete</title>
        <p>database (Di; Da) where the absence of each fact f in Di n Da is explained by some
statement in S.</p>
        <p>Example 2. Consider the statement PotInc(student(n; c); takes(n; Databases))
from Ex. 1, which says that records in student of students taking Databases
are possibly incomplete. Models of this statement are all incomplete databases
(Di; Da) where the only di erence between Di and Da are student facts for
students that take Databases according to Di.</p>
        <p>Unlike for instance in propositional logic, models of potential
incompleteness statements are monotonic. The incomplete database (Di; Da) with Di =
fstudent(John; CS); lecturer(Mary; ST)g and Da = ; is neither a model of
PotInc(student) nor of PotInc(lecturer) alone, but of the two together. In
general, supersets of potential incompleteness statements have more models than
their subsets. We find the following order of expressivity of languages for
potential incompleteness:</p>
        <sec id="sec-2-3-1">
          <title>Proposition 1 (Expressiveness of Languages).</title>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>1. Local statements subsume query statements.</title>
      </sec>
      <sec id="sec-2-5">
        <title>2. Query statements subsume pattern statements.</title>
      </sec>
      <sec id="sec-2-6">
        <title>3. Pattern statements subsume full table statements.</title>
        <p>Proof. Subsumption (2) holds because any pattern statement PotInc( p(R)) can
be expressed using a query statement PotInc(Q() :- R). Similarly, (3) holds as any
potential incompleteness of a table can be expressed equivalently with a pattern
stating that any part of the table is potentially incomplete. Regarding (1), observe
that given a query Q :- R1; : : : ; Rn that is potentially incomplete, one can
equivalently formulate n local statements which assert PotInc(Ri; R1; : : : ; Ri 1; Ri+1; : : : ; Rn).</p>
        <p>An illustration is shown in Fig. 1.</p>
        <p>Query Completeness Under the OWA, conjunctive
queries only return certain answers, while under the
CWA, the returned answers are also all possible
answers. Under the PCWA, one can investigate whether
a query returns all possible answers, which is a
property called query completeness. Formally, an incomplete
database satisfies query completeness under bag (set)
semantics, if the bag (set) of answers to the query
over the ideal database is the same as over the
available database. We then write Compl(Q). Similarly, a set
of incomplete databases satisfies query completeness,
if each member satisfies query completeness. Note
that query completeness inferences are nonmonotonic
wrt. potential incompleteness statements: More
potential incompleteness statements imply that fewer
queries can be inferred to be complete.</p>
        <p>In the rest of the paper, we study three problems.</p>
        <p>Problem 1 (Translation). Given a language of potential incompleteness
statements, which additions are needed to translate any set of statements from the
CAD to the IAD setting?
Problem 2 (Schema Reasoning). Given a set S of statements and a query Q, is Q
complete over all models of S, i.e., S j= Compl(Q)?
Problem 3 (Instance Reasoning). Given statements S, a database Da and a query
Q, is Q complete over all models of S where Da is fixed, i.e., S j=Da Compl(Q)?
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Translation</title>
      <p>In this section we study how to translate from the CAD to the IAD setting.
Example 3. Recall the full-table statement PotInc(student) from Ex. 1. Given that
the database has only three tables, this can be translated into the IAD setting by
stating that the tables lecturer and takes are complete, i.e., Compl(lecturer)
and Compl(takes). In the motivating example we already saw that negation may
be needed.</p>
      <p>
        We next briefly formalize the semantics of completeness statements. An
incomplete database (Di; Da) satisfies a completeness statement, if:
– Full table statement PotInc(R): R(Di) = R(Da).
– Pattern statement PotInc( p(R)) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]: p(R(Di) R(Da)
– Local statement PotInc(R(x¯); G) [
        <xref ref-type="bibr" rid="ref14 ref8">8,14</xref>
        ]: The result of QR;G(x¯) :- R(x¯); G over Di
is contained in R(Da).
      </p>
      <p>
        – Query statement PotInc(Q) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]: Q(Di) = Q(Da).
      </p>
      <p>Unlike potential incompleteness statements, completeness statements are
monotonic, i.e., an incomplete database satisfies a set of completeness statements, i
it satisfies each of them. We next show how translation between CAD and IAD
can work in principle.</p>
      <p>The next proposition shows that unless for full table statements, translation
from the CAD to the IAD setting is not possible without additional assumptions.
Proposition 2 (Translatability from CAD to IAD).</p>
      <sec id="sec-3-1">
        <title>1. For full table statements, translation is possible.</title>
      </sec>
      <sec id="sec-3-2">
        <title>2. For pattern statements, translation is possible if attribute domains are finite, or</title>
        <p>disequality is added to the pattern statements.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3. For local and query statements, translation is possible if (1) the domains of all attributes instantiated in the pattern statement are finite or disequality is added, and (2) if negation is added.</title>
        <p>Proof. (1): Translation is possible by creating a completeness statement for every
relation in n fR j PotInc(R) 2 Sg.</p>
        <p>(2): A translation is possible if instantiated domains are finite, e.g. if gender is
either male or female, then PotInc( gender=male(person)) is analogue to
Compl( gender=female(person)), otherwise, we would need an infinite set of
statements. If disequality is allowed, one could assert Compl( gender,male(person)).</p>
        <p>(3): Additionally to finite domains or disequality, negations is needed, as
shown in the introductory example.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Completeness Entailment for Queries under Bag Semantics</title>
      <p>In the following, we develop techniques for deciding query completeness
entailment for queries under bag semantics. Somewhat surprisingly, for all four
languages, schema reasoning has the same complexity (PTIME), and instance
reasoning has the same complexity (coNP), which is generally easier than in the
IAD setting.</p>
      <p>Let us see first how to reason with full-table and pattern statements:</p>
      <sec id="sec-4-1">
        <title>Example 4 (Full-table and Pattern Statements). Consider the following query</title>
        <p>QDatabases(n) :- student(n; c); takes(n; Databases), and a potential
incompleteness statement for the takes table, that is, PotInc(takes). Then, this statement
does not entail query completeness, because missing records in takes could lead
to an incomplete query result. Alternatively, assume a potential incompleteness
statement PotInc( course=Algorithms(takes)). Clearly, this statement has no influence
on the query QDatabases that asks for students that take Databases, thus, query
completeness is entailed.</p>
        <p>For local statements, we find that, unlike in the IAD setting, even statements that
contain relations not appearing in the query can influence query completeness.
Example 5 (Local Statements). Consider an incompleteness statement for takes
for remote students, that is, PotInc(takes(n; d); remoteStudent(n)). This can lead
to a potentially incomplete query result for QDatabases, because remote students
might take Databases, thus, no completeness holds.</p>
        <p>As query statements are just collections of atoms that might be incomplete, they
behave analogously to pattern statements.</p>
        <p>Example 6 (Query Statements). Consider an incompleteness statement for a query
for all lecturers that are also students, PotInc(Q(n) :- student(n; d); lecturer(n; f )).
Clearly, such students could also take Databases and thus, if missing, could lead
to an incomplete query result for QDatabases, thus, no query completeness holds.
Formally, we find the following:</p>
        <sec id="sec-4-1-1">
          <title>Theorem 1 (Schema Reasoning for Bag Semantics). For all languages, the com</title>
          <p>bined complexity of schema reasoning is PTIME.</p>
          <p>Proof. We only discuss local statements, as they entail all other statements. For
local statements of the form PotInc(Ri(x¯i); Gi), completeness holds if and only if
none of the Ri(x¯i) atoms can be unified with any atom in Q.</p>
          <p>Wrt. the database instance, in general, more conclusions are possible.
Example 7 (Instance Reasoning). Consider again the situation from Ex. 4, where
PotInc(takes) holds, but assume that we additionally know that student is
empty. Then, no matter whether records are missing in takes, the result of
QDatabases will always be empty and hence complete.</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>Theorem 2 (Instance Reasoning for Bag Semantics). For all languages, instance</title>
          <p>reasoning is coNP-hard in combined complexity. For full table and pattern statements,
the combined complexity is also coNP, and the data complexity is PTIME.
Proof. (coNP-membership): To show that completeness does not hold, it su ces
to guess an instantiation for the body B of Q such that at least one fact in B is
not in Da, and such that (Da; Da [ B) satisfies S. As the latter part can be done
in PTIME, the entailment checking is in coNP.</p>
          <p>(PTIME data complexity): Observe that for full-table and pattern statements,
the guesses are especially easy, as it su ces to consider the constants in the
database plus at most as many new ones as there are variables in the query,
thus, there are polynomial many possible valuations to consider. For local and
query statements, one also has to show that the new facts are explained by
potential incompleteness statements, which may increase the complexity.</p>
          <p>(coNP-hardness): We can reduce graph colorability to instance reasoning
with full table statements as follows: Consider a graph G. We construct a
Boolean query Q() :- R(); G0, where G0 contains for each edge from node x to
node y in G an atom E(x; y) (where x and y are variables). Furthermore, let
Da = fE(r; g); E(r; b); E(g; r); E(g; b); E(b; r); E(b; g)g, that is, the database contains
the six allowed colorings for adjacent vertices. Finally, let S = fPotInc(R)g. Then,
S and Da entail Compl(Q) if and only if G is not three-colorable.. Assume G is not
colorable. Then there is no homomorphism from G0 into the (complete) set of
allowed colorings in any Di, and hence, Q would return the false also over any
Di and hence Q would be complete. On the other hand, if G is three-colorable, a
homomorphism from G0 to Da exists and hence we can construct a Di = Da [ R()
where Q evaluates to true.</p>
          <p>For local and query statements it is not yet known how to decide entailment,
because new facts in B might require justifications that use again new facts,
which require again justifications, and so on, similarly to chase-procedures.
All results for queries are summarized in Table 2, where we also list known
complexities from the IAD setting.
5</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Completeness Entailment for Queries under Set Semantics</title>
      <p>Queries can be evaluated under bag or set semantics, and it is possible that a
query is complete under set, but not under bag semantics.</p>
      <p>Full-table
statements
Pattern
statments
Local
statements
Query
statements</p>
      <p>PTIME* cocmoNplPet-e* PTIME</p>
      <p>
        NPcomplete in 2P*
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
      </p>
      <p>PTIME</p>
      <p>
        NP-
2Pcomplete complete PTIME
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
coNPcoNPhard
coNPhard
      </p>
      <p>Example 8. Consider the query Q(n) :- takes(n; Databases); takes(n; x) that,
under bag semantics, asks for the number of courses that students taking databases
take. Consider also the incompleteness statement PotInc( course=OS(takes)). Then
the above query might not return the correct count (bag), but it would still return
the correct set of all students that take databases and something else.
For queries under set semantics, entailment is generally more di cult.</p>
      <sec id="sec-5-1">
        <title>Theorem 3 (Schema Reasoning for Set Semantics). Schema reasoning for full</title>
        <p>table statements in PTIME, for pattern statements in 2P and coNP-hard for pattern,
local and query statements in combined complexity.</p>
        <p>Proof. Membership for full table statements holds as one can just check whether
any relation used in the query is potentially incomplete.</p>
        <p>CoNP-hardness for pattern statements can be shown by reduction of
3colorability. Consider a graph G encoded into a query Q as before. Consider
furthermore potential incompleteness statements PotInc(E(r; g)), PotInc(E(r; b))
and so on (the six allowed colorings). Then Q is complete exactly if G is not
colorable.</p>
        <p>2P-membership can be shown by the same technique as used in the proof
of coNP-membership in Thm. 2. Now however it needs to be shown that the
newly retrieved fact is not already returned over Da, thus, after the first guess,
a call to an NP-oracle is needed.</p>
        <p>Instance reasoning is hard even for full-table statements.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Theorem 4 (Instance Reasoning for Set Semantics). Instance reasoning for all</title>
        <p>languages is 2P-hard in combined complexity, and in 2P for full-table and pattern
statements.</p>
        <p>
          Proof. (Hardness): Can be shown by reduction of 3-coloring extension problem [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>(Membership): Can be shown using the same technique as in the proof of
Thm. 2.</p>
        <p>Interestingly, even reasoning for Boolean queries provides a challenge.</p>
        <sec id="sec-5-2-1">
          <title>Proposition 3 (Boolean Queries). Instance reasoning with full table statements for boolean queries is co-DP-complete.</title>
          <p>
            Proof. We show that incompleteness is DP-complete. We show hardness by
encoding critical 3-colorability problem which is DP-complete [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ]. To show
membership we check incompleteness by calling two NP oracles: one that checks if a
boolean query evaluates to false over a given database, and the other that checks
if the query evaluates to true over a representative ideal database.
          </p>
          <p>All hardness results hold also for linear queries (because the complexity
comes from the choices for the mapping), as one could reduce satisfiability
of quantified boolean formulas (QBF). So all complexities above hold even
with fixed database. Also, if the number of variables in the query is bounded,
entailment becomes polynomial, as there are only polynomially many possible
mappings.
6</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>
        The problem of query completeness entailment in the IAD setting has been
introduced by Motro [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and Halevy [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Most complexity results for this setting
appeared in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>Query completeness in the CAD setting is closely related to the problem of
queries independent of updates (QIU). QIU investigates whether a query result can
change as a consequence of a database update. Intuitively, query completeness
entailment holds, if the given query is independent from all insertion operations
corresponding to the incompleteness statements.</p>
      <p>
        QIU has been introduced by Blakeley et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], who investigated QIU for
conjunctive queries without selfjoins and deletion, insertion and modification
updates that correspond to patterns. Both queries and statements were allowed
to contain comparisons. They showed that deciding QIU wrt. a database instance
in these cases is in coNP for selfjoin-free queries that may contain comparisons.
Elkan [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and Halevy and Sagiv [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] expanded this work to recursive queries and
queries containing negation. They provided su cient conditions for QIU for
such cases in terms of query reachability, and showed that in special cases these
conditions were also necessary. Recent work studies QIU also for the SPARQL
language [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
    </sec>
    <sec id="sec-7">
      <title>Discussion</title>
      <p>Implementation Techniques Schema reasoning is straightforward for all four
languages, each time requiring to check for unifiability between single atoms.
Implementing instance reasoning for full table and pattern statements is
possible as follows: Consider that the query has the form Q() :- A1; : : : ; An, where
atoms A1 to Am are complete, and atoms Am+1 to An are potentially incomplete.
If some atom in Am+1 to An contains a variable that does not appear in A1 to Am,
it su ces to check whether the A1 to Am can be evaluated over Da. If none of the
potentially incomplete atoms has a new variable, one can proceed as follows:
For each atom Ai(s¯i), i = m + 1 : : : n one can build a query Q(s¯i) :- A1; : : : ; Am; :Ai.
If this query returns any answer over Da, this means that the query is not
complete. To deal with pattern statements, one would add the condition to the query.
Thus, completeness reasoning can be implemented using query evaluation.
Combining CAD and IAD One can imagine situations where some relations are
interpreted under IAD and some under CAD. An example could be internal
research evaluation by publication analysis, where publications of the members
of an institute are in a local information system and thus complete by default,
while publications of coauthors are taken from the web and thus generally
incomplete. To reason about query completeness in such settings, one could
proceed in three steps: (1) one would assert completeness for all relations
under CAD assumption, (2) one would perform completeness reasoning using all
completeness statements, and (3) one would perform completeness reasoning
using only the potential incompleteness statements. Consequently, the
complexity would be the union of the complexities of reasoning under CAD and
IAD.</p>
      <p>Incompleteness statements Instead of potential incompleteness statements, one
can also image statements that state that a certain part of a database is
definitely incomplete, for instance, that the student table is definitely missing some
records. For the nonspecified parts, one may either assume completeness or
potential incompleteness. In both cases, one could additionally check whether
a query is known to be definitely incomplete, which would be a di erent
reasoning problem. However, this inference is not very practical, as it is hardly ever
possible. For instance, consider a query Q(x) :- R(x); S(x); and consider that R and
S are known to be definitely incomplete (miss some facts). Then this does not
allow to conclude that Q is incomplete, because sets of facts missing from R and
S might be disjoint. More generally, for pattern statements, inferring definite
query incompleteness seems only possible in trivial cases where the query has
exactly the same shape as a single definite incompleteness statement.</p>
      <p>Future work will focus filling the gaps in the complexities of completeness
entailment, and on the connections to data exchange.
This work has been partially supported by the projects “MAGIC”, funded by
the province of Bozen-Bolzano, and “TaDaQua”, funded by the Free University
of Bozen-Bolzano.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Miklós</given-names>
            <surname>Ajtai</surname>
          </string-name>
          , Ronald Fagin, and
          <string-name>
            <surname>Larry J. Stockmeyer.</surname>
          </string-name>
          <article-title>The closure of monadic NP</article-title>
          .
          <source>J. Comput. Syst. Sci.</source>
          ,
          <volume>60</volume>
          (
          <issue>3</issue>
          ):
          <fpage>660</fpage>
          -
          <lpage>716</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Blakeley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Coburn</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.-A.</given-names>
            <surname>Larson</surname>
          </string-name>
          .
          <article-title>Updating derived relations: Detecting irrelevant and autonomously computable updates</article-title>
          .
          <source>In VLDB</source>
          , pages
          <fpage>457</fpage>
          -
          <lpage>466</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Denecker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cortés-Calabuig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bruynooghe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Arieli</surname>
          </string-name>
          .
          <article-title>Towards a logical reconstruction of a theory for locally closed databases</article-title>
          .
          <source>ACM TODS</source>
          ,
          <volume>35</volume>
          (
          <issue>3</issue>
          ),
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>P.</given-names>
            <surname>Doherty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lukaszewicz</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Szalas</surname>
          </string-name>
          .
          <article-title>E cient reasoning using the local closedworld assumption</article-title>
          .
          <source>In AIMSA</source>
          , pages
          <fpage>49</fpage>
          -
          <lpage>58</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ch</surname>
          </string-name>
          . Elkan.
          <article-title>Independence of logic database queries and updates</article-title>
          .
          <source>In Proc. PODS</source>
          , pages
          <fpage>154</fpage>
          -
          <lpage>160</lpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>R. E. Prasojo W. Nutt F.</given-names>
            <surname>Darari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Razniewski</surname>
          </string-name>
          .
          <article-title>Enabling fine-grained RDF data completeness assessment</article-title>
          .
          <source>ICWE</source>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>N.</given-names>
            <surname>Guido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Genevès</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Layaïda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Roisin</surname>
          </string-name>
          .
          <article-title>On query-update independence for SPARQL</article-title>
          .
          <source>CIKM '15</source>
          , pages
          <fpage>1675</fpage>
          -
          <lpage>1678</lpage>
          , New York, NY, USA,
          <year>2015</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Alon</surname>
            <given-names>Y. Levy.</given-names>
          </string-name>
          <article-title>Obtaining complete answers from incomplete databases</article-title>
          .
          <source>In VLDB</source>
          , pages
          <fpage>402</fpage>
          -
          <lpage>412</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Alon</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Levy</surname>
            and
            <given-names>Yehoshua</given-names>
          </string-name>
          <string-name>
            <surname>Sagiv</surname>
          </string-name>
          .
          <article-title>Queries independent of updates</article-title>
          .
          <source>In Proc. VLDB</source>
          , pages
          <fpage>171</fpage>
          -
          <lpage>181</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>J.</given-names>
            <surname>Minker</surname>
          </string-name>
          .
          <article-title>On indefinite databases and the closed world assumption</article-title>
          .
          <source>In Conference on Automated Deduction</source>
          , pages
          <fpage>292</fpage>
          -
          <lpage>308</lpage>
          . Springer,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A.</given-names>
            <surname>Motro</surname>
          </string-name>
          . Integrity = Validity + Completeness.
          <source>ACM TODS</source>
          ,
          <volume>14</volume>
          (
          <issue>4</issue>
          ):
          <fpage>480</fpage>
          -
          <lpage>502</lpage>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Christos</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Papadimitriou</surname>
          </string-name>
          .
          <article-title>Computational complexity</article-title>
          .
          <source>Addison-Wesley</source>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>S.</given-names>
            <surname>Razniewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Korn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Nutt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          .
          <article-title>Identifying the extent of completeness of query answers over partially complete databases</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>S.</given-names>
            <surname>Razniewski</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Nutt</surname>
          </string-name>
          .
          <article-title>Completeness of queries over incomplete databases</article-title>
          .
          <source>In VLDB</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Raymond</given-names>
            <surname>Reiter</surname>
          </string-name>
          .
          <article-title>On closed world data bases</article-title>
          .
          <source>In Logic and Data Bases</source>
          , pages
          <fpage>55</fpage>
          -
          <lpage>76</lpage>
          ,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>