Lossless Selection Views under Constraints

            Ingo Feinerer1 , Enrico Franconi2 , and Paolo Guagliardo2
                         1
                          Vienna University of Technology
                               feinerer@logic.at
                        2
                          Free University of Bozen-Bolzano
                      {franconi,guagliardo}@inf.unibz.it


1   Introduction
The problem of updating a database through a set of views consists in propagat-
ing updates of the views to the base relations over which the view relations are
defined, so that the changes to the database reflect exactly those to the views.
This is a classical problem in database research, known as the view update prob-
lem [1], which in recent years has received renewed attention.
    View updates can be consistently propagated in an unambiguous way under
the condition that the mapping between database and view relations is lossless,
and that each database relation can be expressed in terms of the view relations
by means of a query, in much the same way the latter are defined from the former
[3]. In such a context, database decompositions play an important role, in that
their losslessness is associated with the existence of an explicit reconstruction
operator that prescribes how a database relation can be rebuilt from the pieces
into which it has been decomposed. In the case of horizontal decomposition, in
which a relation is split into sub-relations containing each a subset of its rows,
the reconstruction operator is the union.
    The study of horizontal decomposition in the literature has mostly focused on
settings where data values can only be compared for equality. However, most real-
world applications make use of data values coming from domains with a richer
structure (e.g., ordering) on which a variety of other restrictions besides equality
can be expressed (e.g., that of being within a range or above a threshold). It is
therefore of practical interest to consider a scenario where some of the attributes
in the database schema are interpreted over a specific domain, such as the reals
or the integers, on which a set of special predicates (e.g., smaller/greater than)
and functions (e.g., addition/subtraction) are defined, according to a first-order
language C. We consider horizontal decomposition in such a setting and invest-
igate its losslessness in the presence of integrity constraints. The only restriction
we impose on the language C is that of being closed under negation.

2   Fundamentals
We consider a source relation symbol R under so-called conditional domain con-
straints (CDCs) of the form

                             ∀x, y . R(x, y) ∧ λ(x) → δ(y) ,                     (1)
where λ(x) is a Boolean combination of equalities x = a between a variable from
x and a constant, and δ(y) ∈ C. The variables in x and y are associated with
non-interpreted and interpreted positions (that is, attributes) of R, respectively.
By means of formulae in C, CDCs restrict the admissible values at interpreted
positions, whenever a certain condition holds on the non-interpreted ones.
   We consider a decomposed schema V consisting of view symbols having the
same arity as R, each of which is defined by a formula of the form
                                                            
                  ∀x, y . V (x, y) ↔ R(x, y) ∧ λ(x) op σ(y) ,                  (2)

where λ(x) is as in (1), σ(y) ∈ C, and op is ∧. A set Σ of formulae of the form
(2), one for each V ∈ V, specifies a horizontal decomposition of R.

3    Losslessness
Lossless horizontal decomposition under CDCs can be characterised in terms of
unsatisfiability in C, which gives a decision procedure for losslessness whenever
the satisfiability of sets of formulae in C is decidable. The main idea is as follows:
1) With each equality between a variable and a constant a we associate a pro-
   positional variable pai , whose truth-value indicates whether the value at (non-
   interpreted) position i is a.
2) For a given valuation α of such propositional variables, we consider the set
   consisting of the C-formulae in the r.h.s. of all the CDCs that are “applicable”
   (i.e., the propositional representation of their l.h.s. is true) under α and the
   negation of any C-formula δ(y) appearing in some selection in Σ where the
   propositional representation of λ(x) is satisfied by α.
3) Checking losslessness is equivalent to checking that there exists no valuation
   α for which the above set of C-formulae is satisfiable.
     The class of Unit Two-Variable Per Inequality formulae (UTVPIs) is a frag-
ment of linear arithmetic over the integers for which the satisfiability problem is
decidable in polynomial time [4]. UTVPI have the form ay+by 0 ≤ d, where y and
y 0 are integer variables, a, b ∈ {−1, 0, 1} and d ∈ Z. We refer to Boolean combin-
ations of UTVPIs as BUTVPIs; the satisfiability problem for sets of BUTVPIs is
NP-complete [5]. It can be shown that, when C is the language of either UTVPIs
or BUTVPIs, the problem of deciding losslessness is co-NP-complete.

4    Separability
We also investigated lossless horizontal decomposition under CDCs in combina-
tion with traditional integrity constraints, showing that functional dependencies
(FDs) do not interact with CDCs and can therefore be allowed without any re-
striction, whereas this is not the case for unary inclusion dependencies (UINDs).
    The main technical tool we employ in our study is the notion of separability: a
class of constraints is separable from CDCs if, after making explicit the result of
their interaction, captured by a suitable finite set S of sound inference rules, we
can focus solely on CDCs and disregard the other constraints, as far as lossless


                                          2
horizontal decomposition is concerned. Given a horizontal decomposition Σ and
a combination ∆ of CDCs with other constraints that are separable from them
w.r.t. S, we can check whether Σ is lossless under ∆ as follows:
 1) compute the deductive closure ∆∗ of ∆ w.r.t. S, which makes explicit the
    interaction between CDCs and the other constraints in ∆ by adding entailed
    constraints;
 2) by using the technique of Section 3, check whether Σ is lossless under the
    set cdc(∆∗ ) obtained from ∆∗ by discarding all of the constraints that are
    not CDCs.
   A UIND R[i] ⊆ R[j] is satisfied if, for every source instance I, every value in
the i-th column of the extension RI of R under I appears also in the j-th column
of RI . Let yi and yj be the variables associated with the interpreted positions i
and j of R, respectively; by means of the following domain propagation rule
                         > → δ(yj )         R[i] ⊆ R[j]
                                                        ,                         (dp)
                                   > → δ(yi )
it is possible to derive a set of CDCs that fully captures the interaction between
a given set of UINDs and opportunely restricted CDCs w.r.t. lossless horizontal
decomposition, which makes possible to employ the general technique for decid-
ing losslessness also in the presence of UINDs.

5    Discussion
The results on losslessness and separability briefly reported in this short paper
are under revision for journal publication; a preliminary version appeared in [2].
We remark that our technique for checking losslessness can also be applied when
op in (2) is ∨.
    We are currently investigating the possibility of generalising the separability
results for UINDs to arbitrary inclusion dependencies (INDs). We also strongly
believe that equalities between non-interpreted variables (i.e., from x) in (1) and
(2) can be allowed and our technique extended by representing such equalities by
means of additional propositional variables and by adding suitable propositional
axioms to handle transitivity and symmetry.

References
1. Bancilhon, F., Spyratos, N.: Update semantics of relational views. ACM Trans.
   Database Syst. 6(4), 557–575 (Dec 1981)
2. Feinerer, I., Franconi, E., Guagliardo, P.: Lossless horizontal decomposition with
   domain constraints on interpreted attributes. In: Big Data – Proc. of BNCOD 2013,
   LNCS, vol. 7968, pp. 77–91. Springer (2013)
3. Franconi, E., Guagliardo, P.: Effectively updatable conjunctive views. In: Proc. of
   AMW 2013. CEUR Workshop Proceedings, vol. 1087. CEUR-WS.org (2013)
4. Schutt, A., Stuckey, P.J.: Incremental satisfiability and implication for UTVPI con-
   straints. INFORMS Journal on Computing 22(4), 514–527 (2010)
5. Seshia, S.A., Subramani, K., Bryant, R.E.: On solving boolean combinations of
   UTVPI constraints. JSAT 3(1-2), 67–90 (2007)


                                          3