Introduction

Computing Approximate Certain Answers over Incomplete Databases

Sergio Greco

Cristian Molinaro

Irina Trubitsyna

greco

cmolinaro

trubitsynag@dimes.unical.it

0 0 DIMES, Universita` della Calabria , 87036 Rende (CS) , Italy

Certain answers are a widely accepted semantics of query answering over incomplete databases. Since their computation is a coNP-hard problem, recent research has focused on developing evaluation algorithms with correctness guarantees, that is, techniques computing a sound but possibly incomplete set of certain answers. In this paper, we show how novel evaluation algorithms with correctness guarantees can be developed leveraging conditional tables and the conditional evaluation of queries, while retaining polynomial time data complexity.

Introduction

Incomplete information arises naturally in many database applications, such as data integration [9, 21], data exchange [ 3, 10, 17, 7 ], inconsistency management [ 4, 5, 16, 20 ], data cleaning [11], ontological reasoning [6, 8], and many others.

A principled way of answering queries over incomplete databases is to compute certain answers, which are query answers that can be obtained from all the complete databases an incomplete database represents. This central notion is illustrated below. Example 1. Consider the database D consisting of the two unary relations R = fhaig and S = fh?ig, where ? is a null value. Under the missing value interpretation of nulls (i.e., a value for ? exists but is unknown), D represents all the databases obtained by replacing ? with an actual value.

A certain answer to a query is a tuple that belongs to the query answers for every database represented by D.

For instance, consider the query $1=a(R), selecting the tuples in R whose first value is equal to a. The certain answers to the query are fhaig, because no matter how ? is replaced, the query answers always contain hai. 2

For databases containing (labeled) nulls, certain answers to positive queries can be easily computed in polynomial time by applying a “naive” evaluation, that is, treating nulls as standard constants. However, for more general queries with negation the problem becomes coNP-hard.

To make query answering feasible in practice, one might resort to SQL’s evaluation, but unfortunately, the way SQL behaves in the presence of nulls may result in wrong answers.

Specifically, as evidenced in [25], there are two ways in which certain answers and SQL’s evaluation may differ: (i) SQL can miss some of the tuples that belong to certain answers, thus producing false negatives, or (ii) SQL can return some tuples that do not belong to certain answers, that is, false positives. While the first case can be seen as an under-approximation of certain answers (a sound but possibly incomplete set of certain answers is returned), the second scenario must be avoided, as the result might contain plain incorrect answers, that is, tuples that are not certain. The experimental analysis in [18] showed that false positive are a real problem for queries involving negation— they were always present and sometimes they constitute almost 100% of the answers. Example 2. Consider again the database D of Example 1. There are no certain answers to the query R S, as when ? is replaced with a, the query answers are the empty set.

Assuming that R and S’s attribute is called A, the same query can be expressed in SQL as follows:

SELECT R.A FROM R WHERE NOT EXISTS (

SELECT * FROM S WHERE R.A = S.A )

However, the answer to the query above is hai, which is not a certain answer. The problem with the SQL semantics is that every comparison involving at least one null evaluates to the truth value unknown, then 3-valued logic is used to evaluate the classical logical connectives and, or, and not, and eventually only those tuples whose condition evaluates to true are kept.

Going back to the query above, the nested subquery compares ? with a, and since such a comparison evaluates to unknown (because a null is involved) then the result of the nested subquery is empty. As a consequence, the final result of the query is hai. 2

Thus, on the one hand, SQL’s evaluation is efficient but flawed, on the other hand, certain answers are a principled semantics but with high complexity. One way of dealing with this issue is to develop polynomial time evaluation algorithms computing approximate certain answers.

In this regard, there has been recent work on evaluation algorithms with correctness guarantees, that is, techniques providing a sound but possibly incomplete set of certain answers [18, 24, 25]. Such approaches consists in rewriting a query Q into two queries Qt and Qf (resp., Q+ and Q?) computing approximations of certainly true and certainly false answers (resp., certainly true and possible answers). However, there are still very simple queries for which the approximation can be unsatisfactory.

Example 3. Consider the database D of Example 1 and the query R $1=b(S). In this case, the certain answers are fhaig. However, the approximation provided by the approach in [18] is the empty set. 2

In this paper, we show how conditional tables and the conditional evaluation of queries [19] can be leveraged to develop new approximation algorithms with correctness guarantees. The approach allows us to keep track of useful information that can be profitably used to determine if a tuple is a certain answer. The very basic idea is illustrated in the following example.

Example 4. Consider again the database D of Example 1 and the query R $1=b(S). The conditional evaluation of the query is carried out by applying the “conditional” counterpart of each relational algebra operator. Rather than returning a set of tuples, the conditional evaluation of a relational algebra operator returns pairs of the form ht; 'i, where t is a tuple and ' is an expression stating under which conditions t can be derived.

Considering the query above, first the conditional evaluation of $1=b(S) is performed, which gives h?; '0i, where '0 is the condition (? = b). Then, the conditional evaluation of the difference operator is carried out, yielding ha; '00i, where '00 is the condition (? 6= a) _ (? 6= b). Indeed, this is the result of the conditional evaluation of the whole query. 2

As we will show, conditions are valuable information that can exploited to determine which tuples are certain answers. For instance, from an analysis of '00 in the example above, one can realize that the condition is always true—thus, hai is a certain answer. There might be different ways of evaluating tuples’ conditions. In this paper, we propose some strategies. 2

Preliminaries

In this section, we recall basic notions on incomplete databases, conditional tables, and approximation algorithms for query answering over incomplete databases.

2.1 Incomplete databases

Basics. We assume the existence of the following disjoint countably infinite sets: a set Const of constants and a set Null of (labeled) nulls. Nulls are denoted by the symbol ? subscripted.

A tuple t of arity k is an element of (Const [ Null)k, where k is a non-negative integer. The i-th element of t is denoted as t[i], where 1 i k. Given a possibly empty ordered sequence Z of integers i1; : : : ; ih in the range [1::k], we use t[Z] to denote the tuple ht[i1]; : : : ; t[ih]i.

A relation of arity k is a finite set of tuples of arity k. A relational schema is a set of relation names, each associated with a non-negative arity. A database D associates a relation RD of arity k with each relation name R of arity k. With a slight abuse of notation, when the database is clear from the context, we simply write R instead of RD for the relation itself. The arity of R is denoted as ar (R).

The sets of all constants and nulls occurring in a database D are denoted by Const(D) and Null(D), respectively. The active domain of D is adom(D) = Const(D)[Null(D). If Null(D) = ;, we say that D is complete. Likewise, a relation is complete if it does not contain nulls. Databases as defined above have been called also naive tables, V-tables, and e-tables [ 19, 1, 15 ].

A valuation is a mapping from Const [ Null to Const s.t. (c) = c for every c 2 Const. Valuations can be applied to tuples, relations, and databases in the obvious way. For instance, the result of applying to a database D, denoted (D), is the complete database obtained from D by replacing every null ?i with (?i).

The semantics of a database D is given by the set of complete databases f (D) j is a valuationg, which are also called possible worlds. Indeed, this is the semantics under the missing value interpretation of nulls (i.e., every null stands for a value that exists but is unknown), and is referred to as the closed-world semantics of incompleteness.1

Query answering. We consider queries expressed with the relational algebra, that is, by means of the following operators: selection , projection , cartesian product , union [, intersection \, and difference . In the rest of the paper, a query is understood to be a relational algebra expression built up from the above operators, unless otherwise indicated. The result of evaluating a query Q on a database D, treating nulls as standard constants (i.e., every (labeled) null or constant is equal to itself and different from every other element of Const [ Null), is denoted as Q(D). A query Q returning k-tuples is said to be of arity k, and ar (Q) denotes its arity.

A widely accepted semantics of query answering relies on the notion of a certain answer. The certain answers to a query Q on a database D are the tuples in TfQ( (D)) j is a valuationg. Computing certain answers is coNP-hard (data complexity), even in the case of the more restricted Codd tables, that is, databases as defined above where the same null cannot occur multiple times [ 2 ].

For query answering, we will use a more general notion first proposed in [26] and called certain answers with nulls in [25].

Definition 1. Given a query Q and a database D, the certain answers with nulls, denoted cert(Q; D), is the set of all tuples t such that (t) 2 Q( (D)) for every valuation .

As opposed to the more commonly used notion of certain answers, certain answers with nulls allow tuples with nulls in query answers. Consider for instance the relation R = fha; ?i; hb; cig and the identity query. The certain answers consist only of hb; ci, but it is more intuitive to return the entire relation, as we are certain that its tuples are query answers. In this case, the certain answer with nulls are both ha; ?i and hb; ci. We refer the interested reader to [22, 23] for a discussion of other drawbacks of the standard notion of certain answers.

Below we report the definition of evaluation algorithm with correctness guarantees [18], which we use as the soundness property an approximation algorithm must satisfy.

Definition 2. A query evaluation algorithm has correctness guarantees for a query Q if for every database D it returns a subset of cert(Q; D).

When a query evaluation algorithm has correctness guarantees for every query, we say that it has correctness guarantees. 1 Under the open-world semantics of incompleteness, which we do not consider in this paper, the semantics of D is f (D) [ D0 j is a valuation and D0 is a complete databaseg. 2.2

Conditional tables

Syntax and semantics. Conditional tables have been proposed in [19].2 Essentially, they are relations (as defined in the previous subsection, thus possibly containing nulls) extended by one additional special column that specifies a “condition” for each tuple.

Formally, let E be the set of all expressions, called conditions, that can be built using the standard logical connectives ^, _, and : with expressions of the form ( = ), ( 6= ), true, and false, where ; 2 Const [ Null. We say that a valuation satisfies a condition ', denoted j= ', if its assignment of constants to nulls makes ' true.

A conditional tuple (c-tuple for short) of arity k (k 0) is a pair ht; 'i, where t is a tuple of arity k and ' 2 E . Notice that ' may involve nulls and constants not necessarily appearing in t—e.g., t is the tuple ha; ?1i and ' is the condition (?2 = c)^(?1 6= ?3).

A conditional table (c-table for short) of arity k is a finite set of c-tuples of arity k. A conditional database C associates a c-table RC of arity k with each relation name R of arity k. With a slight abuse of notation, when the conditional database is clear from the context, we simply write R instead of RC for the c-table itself.

Let T be a conditional table. The result of applying a valuation to T is (T ) = f (t) j ht; 'i 2 T and j= 'g: Thus, (T ) is the (complete) relation obtained from T by keeping only the c-tuples in T whose condition is satisfied by , and applying to such c-tuples.

The set of complete relations represented by T is rep(T ) = f (T ) j is a valuationg: Likewise, a conditional database C = fT1; : : : ; Tmg represents the following set of complete databases: rep(C) = ff (T1); : : : ; (Tm)g j is a valuationg:

Conditional evaluation. Below we recall the conditional evaluation of a query over a conditional database (see [19, 15]). Basically, it consists in evaluating the relational algebra operators so that they can take c-tables as input and return a c-table as output. The conditional evaluation of a query over a conditional database is then simply obtained by applying the conditional evaluation of each operator.

Let T1 and T2 be c-tables of arity n and m, respectively. In the definitions below, for the union and difference operators it is assumed that n = m. For projection, Z is a possibly empty ordered sequence of integers in the range [1::n]. For selection, is a Boolean combination of expressions of the form ($i = $j), ($i = c), ($i 6= $j), ($i 6= c), where 1 i; j n, and c 2 Const. The conditional evaluation of a relational algebra operator op is denoted as o_p. In the following, given two tuples t1 and t2 of arity n, we use (t1 = t2) as a shorthand for the condition V (t1[i] = t2[i]). i2[1::n] – Projection: – Selection:

_ Z (T1) = fht[Z]; 'i j ht; 'i 2 T1g: _ (T1) = fht; '0i j ht; 'i 2 T1 and '0 = ' ^ (t)g; where (t) is the condition obtained from by replacing every $i with t[i]. 2 A generalization of conditional tables augmenting them with “global conditions” has been proposed in [14], see also [15]. – Union: – Difference:

T1 [_ T2 = fht; 'i j ht; 'i 2 T1 or ht; 'i 2 T2g:

T1 _ T2 = fht1; '0i j ht1; '1i 2 T1 and '0 = '1 ^ 't1;T2 g; where 't1;T2 =

ht2;'2i2T2 – Cartesian product:

:('2 ^ (t1 = t2)).

T1 _ T2 = fht1 t2; '1 ^ '2i j ht1; '1i 2 T1; ht2; '2i 2 T2g; where t1 t2 is the tuple obtained as the concatenation of t1 and t2.

The result of the conditional evaluation of a query Q over a conditional database C is denoted as Q_ (C). Notice that Q_ (C) is a c-table. For a fixed query Q and a conditional database C, Q_ (C) can be evaluated in polynomial time in the size of C (see [15]). The size of a conditional table T is jjT jj = jT j + P jj'jj, where jT j is the number of ht;'i2T c-tuples in T and jj'jj is the length in symbols of condition '. The size of a conditional database C is jjCjj = P jjTijj.

Ti2C

Recall that c-tables are a strong representation system for the relational algebra, that is, for every relational algebra query Q and conditional database C, it is possible to compute a c-table T s.t. rep(T ) = fQ(D) j D 2 rep(C)g. Indeed, such a c-table can be computed through the conditional evaluation, that is, T = Q_ (C).

In the rest of the paper we assume that every selection condition is a conjunction of expressions of the form ($i = $j), ($i = c), ($i 6= $j), and ($i 6= c). There is no loss of generality, as an arbitrary selection (R) can be rewritten as follows to comply with our assumption. First, is rewritten in disjunctive normal form (DNF), that is, into a formula 1 _ _ m, where each i is a conjunction of expressions of the form ($i = $j), ($i = c), ($i 6= $j), and ($i 6= c). Then, (R) is replaced by 1 (R) [ [ m (R). Even though the conversion to DNF can lead to an exponential blow-up in size, this does not affect the data complexity (as the query is fixed). 3

Approximation Algorithms

In this section, we show how to exploit conditional tables to compute a sound (but possibly incomplete) set of certain answers with nulls. The basic idea is to rely on the conditional evaluation of the relational algebra operators, in order to keep track of how each tuple is derived during query evaluation, and then apply a strategy to evaluate conditions, so that each tuple is eventually associated with a truth value. Tuples that are associated with the condition true are certain answers with nulls.

We start by introducing an explicit conditional evaluation for intersection. In the previous section, we recalled the conditional evaluation of different relational algebra operators. Clearly, even if intersection was not reported, it can be expressed in terms of the other operators. Let T1 and T2 be c-tables of arity n. Then,

T1 \_ T2 = fht1; '0i j ht1; '1i 2 T1; ht2; '2i 2 T2; '0 = '1 ^ '2 ^ (t1 = t2)g: We slightly generalize c-tables to allow also unknown as a condition. Thus, from now on, E is the set of all expressions that can be built using the standard logical connectives with expressions, called atomic conditions, of the form ( = ), ( 6= ), true, false, and unknown, where ; 2 Const [ Null. We will discuss different strategies to “evaluate” conditions, that is, to reduce them to either true or false or unknown—as shown in the following, tuples having condition true are certain answers with nulls.

We assume the following strict ordering false < unknown < true.

The three-valued evaluation of a condition ' 2 E , denoted eval ('), is defined inductively as follows: – eval (( 8>true < = )) = false >:unknown if = ; if 6= and ; otherwise: 8>true if 6= and ; < – eval (( 6= )) = false if = ;

>:unknown otherwise: – eval (('1 ^ '2)) = minfeval ('1) ; eval ('2)g. – eval (('1 _ '2)) = maxfeval ('1) ; eval ('2)g.

>8true if eval (') = false; < – eval ((:')) = false if eval (') = true;

>:unknown otherwise. – eval (v) = v for v 2 ftrue; unknown; falseg.

The basic idea of our first evaluation algorithm, which we call naive evaluation, is to perform the three-valued evaluation of conditions after each relational algebra operator is applied (that is, after its conditional evaluation). One interesting fact about the naive evaluation is that it is equivalent to the evaluation algorithm of [18].

Given a c-tuple t = ht; 'i, with a slight abuse of notation, we use eval (t) to denote ht; eval (')i. Likewise, given a conditional table T , eval (T ) denotes feval (t) j t 2 T g.

To define the naive evaluation, we first provide the following definitions: Evaln(R) = eval (R) Evaln(Q1 [ Q2) = eval (Evaln(Q1) [_ Evaln(Q2)) Evaln(Q1 \ Q2) = eval (Evaln(Q1) \_ Evaln(Q2)) Evaln(Q1 Q2) = eval Evaln(Q1) _ Evaln(Q2) Evaln(Q1 Q2) = eval Evaln(Q1) _ Evaln(Q2) Evaln( (Q)) = eval ( _ (Evaln(Q)))

Evaln( Z (Q)) = eval ( _ Z (Evaln(Q)))

Given a relation R, we define the c-table R = fht; truei j t 2 Rg. Analogously, given a database D, we define D as the conditional database obtained from D by replacing every relation R of it with R. Here the basic idea is to convert a database D into a simple conditional database D where all conditions are true, so that D can be used as the starting point for the conditional evaluation of queries.

Given a query Q and a database D, we use Evaln(Q; D) to denote the result of evaluating Evaln(Q) over D. Finally, we define:

Evaltn(Q; D) = ft j ht; truei 2 Evaln(Q; D)g;

Evalnp(Q; D) = ft j ht; 'i 2 Evaln(Q; D) and ' 6= falseg: Example 5. Consider the database D consisting of the two unary relations R = fhaig and S = fh?1ig. Consider also the query Q5 = R S. The conditional database D consist of the two c-tables R = fha; trueig and S = fh?1; trueig. Then, Evaln(Q5; D) = eval eval R _ eval S = eval R _ S = eval (fha; true ^ :(true ^ (a = ?1))ig) = fha; unknownig: Thus, Evaltn(Q5; D) = ; and Evalnp(Q5; D) = fhaig.

The following theorem states that the naive evaluation is equivalent to the evaluation algorithm of [18]. We point out that, in the following claim, D is a database (thus, possibly containing nulls), but without conditions. However, the naive evaluation first converts D into a conditional database D with all conditions being true, and then performs the evaluation Evaln() over D, so that eventually tuples associated with true are certain answers with nulls.

Theorem 1. The naive evaluation is equivalent to the evaluation algorithm of [18]. Corollary 1. Evaltn(Q) has correctness guarantees.

Theorem 2. Evaln(Q; D) can be computed in polynomial time in the size of D, for every query Q and database D.

Obviously, Evaltn(Q; D) and Evalnp(Q; D) can be computed in polynomial time too (data complexity).

The naive evaluation presented above is somehow limited, since it does not use much the power of conditional tables, as shown in the following example. Example 6. Consider the database D of Example 5 and the query Q6 = R $1=b(S).

Clearly, cert(Q6; D) = fhaig, as the selection returns either fhbig or ; in every possible world, and thus the result of the difference is fhaig in every possible world. However, the naive evaluation is not able to realize this aspect and behaves as follows: 2 Evaln(Q6; D) = = eval eval R _ eval _ $1=b eval S = eval fha; trueig _ eval ( _ $1=b (fh?1; trueig)) = eval fha; trueig _ eval (fh?1; true ^ (?1 = b)ig) = eval fha; trueig _ fh?1; unknownig = eval (fha; true ^ :(unknown ^ (a = ?1))ig) = fha; unknownig: Thus, Evaltn(Q6; D) = ;. In this case, the crucial point is that eval _ $1=b eval S fh?1; unknownig, and the fact that ?1 can only be equal to b gets lost.

The previous example suggests that equalities might be exploited to refine the evaluation of conditions, in order to provide more accurate information, as illustrated in the following example.

Example 7. Consider again the database D of Example 5 and the query Q6 of Example 6. By “propagating” equalities into conditions and tuples after each relational algebra operator is conditionally evaluated, we get:

fha; trueig _ _ $1=b(fh?1; trueig) = fha; trueig _ fh?1; true ^ (?1 = b)ig = fha; trueig _ fhb; unknownig = fha; true ^ :(unknown ^ (a = b))ig = fha; trueig:

Thus, we conclude that hai is a certain answer with nulls. Recall that cert(Q6; D) = fhaig and Evaltn(Q6; D) = ;. 2

In the previous example, it is interesting to notice that the selection returns the ctuple hb; unknowni, meaning that the only tuple that can be returned (in some cases) is hbi, and thus providing more accurate information than the naive evaluation. 4

Conclusion

Certain answers are a principled manner to answer queries on incomplete databases. Since their computation is a coNP-hard problem, recent research has focused on polynomial time algorithms providing under-approximations. Leveraging conditional tables, we have shown how new approximation algorithms can be developed.

We are currently working on more refined evaluation strategies which better exploit the power of conditional tables and the conditional evaluation, with the aim of computing strictly more certain answers with nulls, while retaining polynomial time data complexity. For instance, conditions might be rewritten into a more suitable form (e.g., conjunctive normal form or disjunctive normal form [12, 13]) so as to allow better analyses. 6. Meghyn Bienvenu and Magdalena Ortiz. Ontology-mediated query answering with datatractable description logics. In Reasoning Web, pages 218–307, 2015. 7. Marco Calautti, Sergio Greco, Cristian Molinaro, and Irina Trubitsyna. Exploiting equality generating dependencies in checking chase termination. PVLDB, 9(5):396–407, 2016. 8. Andrea Cal`ı, Georg Gottlob, and Thomas Lukasiewicz. A general datalog-based framework for tractable query answering over ontologies. Journal of Web Semantics, 14:57–83, 2012. 9. Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. On reconciling data exchange, data integration, and peer data management. In Proc. Symposium on Principles of Database Systems (PODS), pages 133–142, 2007. 10. Ronald Fagin, Phokion G. Kolaitis, Rene´e J. Miller, and Lucian Popa. Data exchange: semantics and query answering. Theoretical Computer Science, 336(1):89–124, 2005. 11. Floris Geerts, Giansalvatore Mecca, Paolo Papotti, and Donatello Santoro. The LLUNATIC data-cleaning framework. Proceedings of the VLDB Endowment, 6(9):625–636, 2013. 12. Georg Gottlob and Enrico Malizia. Achieving new upper bounds for the hypergraph duality problem through logic. In Proc. Symposium on Logic in Computer Science (LICS), pages 43:1–43:10, 2014. 13. Georg Gottlob and Enrico Malizia. Achieving new upper bounds for the hypergraph duality problem through logic. SIAM Journal on Computing, 2017. 14. Go¨sta Grahne. Dependency satisfaction in databases with incomplete information. In Proc.

Very Large Data Bases (VLDB) Conference, pages 37–45, 1984. 15. Go¨sta Grahne. The Problem of Incomplete Information in Relational Databases, volume 554 of Lecture Notes in Computer Science. Springer, 1991. 16. Sergio Greco, Cristian Molinaro, and Francesca Spezzano. Incomplete Data and Data Dependencies in Relational Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2012. 17. Sergio Greco, Francesca Spezzano, and Irina Trubitsyna. Checking chase termination: Cyclicity analysis and rewriting techniques. IEEE Transactions on Knowledge and Data Engineering, 27(3):621–635, 2015. 18. Paolo Guagliardo and Leonid Libkin. Making SQL queries correct on incomplete databases: A feasibility study. In Proc. Symposium on Principles of Database Systems (PODS), pages 211–223, 2016. 19. Tomasz Imielinski and Witold Lipski Jr. Incomplete information in relational databases.

Journal of the ACM, 31(4):761–791, 1984. 20. Paraschos Koutris and Jef Wijsen. The data complexity of consistent query answering for self-join-free conjunctive queries under primary key constraints. In Proc. Symposium on Principles of Database Systems (PODS), pages 17–29, 2015. 21. Maurizio Lenzerini. Data integration: A theoretical perspective. In Proc. Symposium on

Principles of Database Systems (PODS), pages 233–246, 2002. 22. Leonid Libkin. Certain answers as objects and knowledge. In Proc. International Conference on Principles of Knowledge Representation and Reasoning (KR), pages 328–337, 2014. 23. Leonid Libkin. Incomplete data: what went wrong, and how to fix it. In Proc. Symposium on Principles of Database Systems (PODS), pages 1–13, 2014. 24. Leonid Libkin. Sql’s three-valued logic and certain answers. In Proc. International Conference on Database Theory (ICDT), pages 94–109, 2015. 25. Leonid Libkin. SQL’s three-valued logic and certain answers. ACM Transactions Database

Systems, 41(1):1, 2016. 26. Witold Lipski. On relational algebra with marked nulls. In Proc. Symposium on Principles of Database Systems (PODS), pages 201–203, 1984.

Serge

Abiteboul and Go¨sta Grahne. Update semantics for incomplete databases . In Proc. Very Large Data Bases (VLDB) Conference , pages 1 - 12 , 1985 .

2. Serge

Abiteboul

, Paris C. Kanellakis, and Go¨sta Grahne. On the representation and querying of sets of possible worlds . Theoretical Computer Science , 78 ( 1 ): 158 - 187 , 1991 .

Marcelo

Arenas , Pablo Barcelo´,

Leonid

Libkin , and

Filip

Murlak . Foundations of Data Exchange . Cambridge University Press, 2014 .

Marcelo

Arenas ,

Leopoldo E.

Bertossi , and

Jan

Chomicki . Consistent query answers in inconsistent databases . In Proc. Symposium on Principles of Database Systems (PODS) , pages 68 - 79 , 1999 .

5. Leopoldo

Bertossi . Database Repairing and Consistent Query Answering. Synthesis Lectures on Data Management . Morgan & Claypool Publishers, 2011 .