New
          New Closure
               Closure Operators
                        Operators and
                                  and Lattice
                                      Lattice
     RepresentationsRepresentations
                     for Multivalued Dependencies
               and Related
      for Multivalued       Expressions
                       Dependencies  and Related
                                     Expressions
                     Jaume Baixeries1 and José Luis Balcázar1

                       Dept. Baixeries
                     Jaume   Llenguatges
                                      1 i Sistemes Informàtics 1
                                       and José Luis Balcázar
                            Universitat Politècnica de Catalunya
                       1            c/ Jordi Girona,   1-3Informàtics
                           Dept. Llenguatges   i Sistemes
                            Universitat08034  Barcelona
                                         Politècnica de Catalunya
                             {jbaixer,    balqui}@lsi.upc.edu
                                    c/ Jordi   Girona, 1-3
                                       08034 Barcelona
                             {jbaixer, balqui}@lsi.upc.edu

        Abstract. In Database Theory, Multivalued Dependencies are the main
        tool to define the Fourth Normal Form and, as such, their inference prob-
        lem has been deeply studied; two related notions appearing in that study
        are a syntactical analog in propositional logic and a restriction that main-
        tains to this logic the same relationship as Functional Dependencies do
        to Horn logic. We present semantic, lattice-theoretic characterizations
        of such multivalued dependencies that hold in a given relation, as well
        as similar results for the related notions just mentioned. Our charac-
        terizations explain better some previously known facts by providing a
        unifying framework that is also consistent with the studies of Functional
        Dependencies.


 1    Introduction

 Functional Dependencies (FDs) and Multivalued Dependencies (MVDs) are im-
 portant notions in Database Theory. They help to formalize the necessary con-
 ditions on a given relation to ensure, up to various degrees, a reduced amount
 of redundancy; and they indicate decomposition (more properly, normalization)
 operations to be performed to obtain such irredundant (normal form) relations.
     The most widely studied specific problems about these notions concern how
 to axiomatize, with a limited number of dependencies, a large set of them, or
 how to test as efficiently as possible whether a given dependency is a logical
 consequence of others ([7], [8], [14], [18], [19]).
     The less famous notion of Degenerate Multivalued Dependencies (DMVDs)
 does not play such an important role in database normalization, and has been
 usually neglected by database design practitioners, but has been studied by
 theorists. The interest of DMVDs is that they share some features with FDs
 (they are equality-generating instead of tuple-generating) and some features with
 MVDs (they have a similar form and analogous syntactical inference rules); then,
 their study may clarify relationships between FDs and MVDs. In particular,
 the highly more complex combinatorics of MVDs compared to FDs makes it


Radim Bělohlávek, Václav Snášel (Eds.): CLA 2005, pp. 22–33, ISBN 80–248–0863–3.
                          Multivalued Dependencies and Related Expressions        23
2       J. Baixeries and J.L. Balcázar

nontrivial to extend results for FDs into the MVD realm, and then DMVDs may
offer a useful intermediate stepping stone.
    In some ways, dependencies (functional or multivalued) behave as certain
fragments of propositional logic. The main interest of studying this analogy is
double: first, to propose alternative ways to solve inference problems, either for
sets of propositional clauses or for sets of dependencies; and, second, to offer
alternative semantic proofs of the syntactical correspondences of the (fully anal-
ogous) inference calculi available for FDs and MVDs and for their propositional
counterparts. This approach thus allows one to simplify, to some extent, the
combinatorics contained in the dependencies. More precise explanations of how
functional dependencies and multivalued dependencies are related to proposi-
tional clauses are given in the next section, and additional details can be found
in [18], [19].
    Recent progresses in [3] makes it appropriate to revisit the somewhat unsatis-
factory results in [6], trying to obtain a better understanding of the comparison-
based binarization. We considered there MVD clauses and stated a restricted
form of closure under intersection that characterizes the theories axiomatized by
such clauses in a form similar to Horn clauses (a full proof can be found in [4]);
whereas that result gave a clear characterization for these propositional theories,
it did not provide a lattice form due to the fact that, sometimes, the intersection
operation may not be of the restricted form that ensures remaining within the
theory. Likewise, the lattice form for MVDs did not consist only of propositional
models, but they were labeled with the specific pairs of original tuples that gave
rise to them, and the intersection had to be extended to the label through a sort
of join.
    Thanks to the additional knowledge obtained in [1], [2] and [3], we are now
able to refine these facts into somewhat more satisfactory lattice-theoretic char-
acterizations. Indeed, the purpose of this paper is twofold: on one hand, to
provide a lattice-theoretic characterization of a set of MVD clauses that hold in
a propositional theory; it generalizes that of [6] and parallels that of [5] for Horn
clauses. On the other hand, to clarify the relationship between MVD lattices
and MVD clauses for a larger-than-two-tuples world, by exactly identifying the
nodes that must be removed from the binarization in order to get the lattice
corresponding to MVDs. This result complements the discussion started in [18]
about the fragment of propositional logic induced by MVDs; provides a new
method to characterize the set of MVDs that hold in a relation; and contributes
a new view of how close the MVD lattice is to the DMVD lattice, with the pre-
cise understanding of where the difference lies. We expect this progress to allow
for further advances in the study of binary representations for more sophisti-
cated dependencies that fully lack, so far, lattice-theoretic characterizations. A
global longer-term aim guiding this research is to elucidate whether there are
deep implications between two properties of dependencies, seen in a general view:
some of them have Armstrong relations, others do not; and some of them ad-
mit lattice-theoretic characterizations, whereas it appears like others do not. We
24       Jaume Baixeries, José Luis Balcázar
      New Closure Operators and Lattices for MVD’s and Related Expressions          3

believe that these two properties have a deep interconnection that is worth to
investigate and bring into light.


2     Basic Definitions

Since we will be dealing with sets of data that have different domains (mul-
tivalued and binary) for a set of attributes R, we will denote a set of tuples
or a relation r = {t1 , t2 . . . tn } when the attributes are defined over a multi-
valued domain, following the usual notation in database research. Likewise we
denote a theory, that is, a set of propositional models, each a binary tuple, as
T = {m1 , m2 . . . mn } when the domain is binary.

Definition 1. The binarization of r, BIN (r), is the set of binary tuples (the-
ory) that result from comparing all the pairs of tuples attributewise.

    For instance, given a set of tuples r = {{a, b, c, d}, {a, b, d, c}, {a, c, c, d},
{a, c, d, c}}, it would yield BIN (r) = {{1, 0, 0, 0}, {1, 0, 1, 1}, {1, 1, 0, 0}} Ob-
serve that, being a set, BIN (r) has no duplicates.


2.1    Partitions of attributes

A main combinatorial ingredient of our contributions is given by partitioning
the set of attributes into equivalence classes.

Definition 2. The partition of attributes P ART (mi ) that results from a
model mi is the partition of R such that each attribute that has a true value in
mi , forms a singleton class, and all the attributes with false value are together
in the same class.

    For instance, if mi = {11001}, then P ART (mi ) = {A|B|CD|E}. We freely
say that P ART (T ) is the set of partitions for each model of T , with no du-
plications, of course, even if the same partition arises from several models. For
instance, if T = {1101, 1011}, then P ART (T ) would be {A|B|C|D}. The set of
all the partitions is denoted ℘.
    The key operation on partitions will be here a standard meet operation cor-
responding to the lattice of partitions; the corresponding join operation is also a
usual operation on partitions. Namely, given two partitions, their join is formed
by all the classes that can be obtained by pairwise intersecting one class from
each, that is, the coarsest partition that is finer than both; whereas the meet is
the finest partition that is coarser than both, so that the partial ordering asso-
ciated to the lattice is the natural ordering where P ≤ P 0 whenever P 0 is finer
than P . More details on alternative ways to define the meet operation can found
in [3].
    Moreover, through the operation P ART , we obtain a way of computing a
meet operation on models that is different from the standard bitwise intersec-
tion. For instance, 1110 ∧ 0011 = 0011, because P ART (1110) = {a|b|c|d} and
                           Multivalued Dependencies and Related Expressions                        25
4        J. Baixeries and J.L. Balcázar

P ART (0011) = {ab|c|d}, and {a|b|c|d} ∧ {ab|c|d} = {ab|c|d} corresponding to
the model 0011. Note that the operator ∧ may not be applicable to some models,
in the sense that the meet partition might have more than one nonsingleton class
and then there would be no model corresponding to the meet partition (e.g., for
11100 and 00111). Either among partitions or among models, we employ the
following notation:

Definition 3. For any given set S, if the operator ∧ is defined on that set, the
closure under meet of a subset Q ⊆ S, denoted [Q]∧ , is the smallest subset of
S that contains Q and is closed under ∧.

2.2    Clauses and Dependencies
All our expressions have two parts, an antecedent and a consequent, each being
a set of attributes or of propositional variables, denoted X, Y , or the like; juxta-
position as in XY denotes union. It is customary to separate them by some sort
of double arrow. For now on, we use only the word “attributes” to mean “at-
tributes or propositional variables” whenever this is the necessary interpretation.
The various expressions, and their semantics, are as follows.

Definition 4. A multivalued dependency clause (MVDcl) X ⇒⇒ Y holds
in a theory T if for each mi ∈ T if mi |= X then mi |= Y or mi |= R \ XY . It
is, if the X attributes are true, then, the Y attributes are true or the R \ XY
attributes are true.

Definition 5. A degenerate multivalued dependency (DMVD) X 7→                   7→ Y
holds in r if whenever two tuples ti , tj ∈ r, ti [X] = tj [X] implies that ti [Y ] =
tj [Y ] or ti [R \ XY ] = tj [R \ XY ].

Definition 6. A multivalued dependency (MVD) X →→ Y holds in r if
whenever two tuples ti , tj ∈ r coincide in X, ti [X] = tj [X], it implies that the
tuples t0i , t0j exist in r, where t0i [X] = t0j [X] = ti [X], t0i [Y ] = ti [Y ], t0j [Y ] = tj [Y ],
and conversely t0i [R \ XY ] = tj [R \ XY ] and t0j [R \ XY ] = ti [R \ XY ].

Definition 7. The syntactically equivalent MVD (DMVD) of a MVDcl
X ⇒⇒ Y is X → → Y (X 7→
                      7→ Y ); and similarly between MVDs and DMVDs.

    It is immediate to check that, if a degenerate multivalued dependency holds in
a relation r, so does the syntactically equivalent multivalued dependency. Some
common properties for DMVDs, MVDs and MVD clauses are the following:

 1. Complementation: If X → Y holds, then X → R \ Y holds.
 2. Reflexivity: If Y ⊂ X, then X → Y holds.
 3. Union of right-hand side: If X → Y and X → Y 0 hold, then X → Y Y 0 holds.

where X → Y can be a DMVD, a MVD or a MVD clause. For a proof of these
properties the reader can refer to [21] or [2]. For each left-hand side of a MVDcl,
a DMVD or a MVD, we can define its dependency basis as follows:
26      Jaume Baixeries, José Luis Balcázar
     New Closure Operators and Lattices for MVD’s and Related Expressions            5

Definition 8. Given a theory T (a relation r), the MVDcl dependency basis
(DMVD dependency basis, MVD dependency basis) for a set of attributes X,
DEP (X), is a partition of the set of attributes such that for all MVDcl (DMVD,
MVD) that have X in their left hand side, they hold in T (in r) if and only if
their right hand side may be formed by the union of classes of DEP (X).

    We note that, for a given set of attributes X, all these attributes will appear
as singletons in DEP (X) regardless of the kind of dependency basis. This is
so because of the reflexivity property that holds in all these dependencies; they
are often omitted in this generalized dependency. Additionally, the unions Y
mentioned at the end include, as particular cases, the individual classes of the
partition DEP (X). The definition of a dependency basis allows us to group
all the MVDcl, DMVDs or MVDs that have the same left-hand side in one
generalized MVDcl, DMVD or MVD:

Definition 9. All the MVDcl (DMVD, MVD) that have X in their lefthand side
may be represented by a generalized MVDcl (DMVD, MVD) X ⇒⇒ DEP (X)
(X 7→
    7→ DEP (X), X →   → DEP (X).).

    In [2] and [3], we defined closure operators on partitions, characterizing all
the DMVDs (MVDs respectively) that hold in a relation; the characterizations
have the same overall form as Theorem 1 below. They also were proved to derive
Armstrong relations for that set of DMVDs (MVDs). An Armstrong relation for
a set of dependencies is a relation such that all those dependencies hold in this
relation, as well as all the dependencies that are a logical consequence of this
set, but the rest of dependencies do not hold. It can be used as an alternative
way to characterize a set of dependencies. In this present paper, those closure
operators will be called ΓDM V D and ΓM V D respectively.
    Another relevant notion about partitions of attributes is that of matching,
which varies depending whether the partition is considered to be modelling a
set of DMVDs or a set of MVDs. We provide here a brief description of this
property (more details can be found in [2] and [3]): we say that a partition of
attributes P = {P1 |P2 | . . . |Pn } matches a set of tuples C (when modelling a set
of DMVDs) if all the tuples in C differ only in one class of attributes. If this same
partition is considered to be modelling a set of MVDs, we say that P matches a
set of tuples C if C = P r(P1 , C) × P r(P2 , C) × · · · × P r(Pn , C), where P r(X, C)
is the projection of C on the attributes X, that is, the set of all tuples obtained
from C by removing the values corresponding to attributes not in X.


3    Lattice Characterization of Multivalued Dependency
     Clauses

It is easy to prove that, given a relation r, a DMVD X 7→
                                                        7→ Y holds in r if and
only if the syntactically equivalent MVDcl X ⇒⇒ Y holds in BIN (r), as the
next proposition states:
                             Multivalued Dependencies and Related Expressions       27
6          J. Baixeries and J.L. Balcázar

Proposition 1. Let be r a relation, then X 7→
                                            7→ Y holds in r if and only if
X ⇒⇒ Y holds in BIN (r).

    According to this result, and due to the fact that the operator ΓDM V D as
defined in [2] provides a complete characterization of a set of DMVDs, the lattice
that characterizes the MVD clauses that hold in BIN (r) must be the same.
However, given a theory T whose lattice we wish to construct, we would need
to start by going back into a relation r such that BIN (r) = T and operate
on the DMVDs of r; and care must be exercised to make sure that BIN (r)
does not produce additional models. We deem useful to provide a construction
of the lattice that goes on directly from T , by means of a new closure operator
appropriate to work on models. We indeed provide now an operator that, given
a theory, calculates a lattice that characterizes the set of all MVD clauses that
hold in that theory, which is the following:

Definition 10. Fix a theory T . Let ΓM V Dcl : ℘ 7→ [P ART (T )]∧ a map such
that given a partition of attributes P , returns a partition of attributes P 0 such
that P ≤ P 0 and for all P 00 ∈ [P ART (T )]∧ , if P ≤ P 00 ≤ P 0 implies that
P 00 = P 0 .

  Note that it is well-defined since P ≤ P 0 and P ≤ P 00 implies that P ≤
P ∧ P 00 , and the codomain is closed under meet.
    0


Proposition 2. ΓM V Dcl is a closure operator.

Proof. We prove the three axioms for closure operators.

 1. P ≤ ΓM V Dcl (P ). Obvious since ΓM V Dcl always returns a P 0 ∈ [P ART (T )]∧
    such that P ≤ P 0 .
 2. ΓM V Dcl (P ) = ΓM V Dcl (ΓM V Dcl (P )). Let P 0 = ΓM V Dcl (P ) and, besides,
    Q = ΓM V Dcl (P 0 ) so that P 0 ≤ Q. For all P 00 ∈ [P ART (T )]∧ , P 0 ≤ P 00 ≤ Q
    implies that P 00 = Q, and since P 0 ∈ [P ART (T )]∧ and P 0 ≤ P 0 ≤ Q, we
    have that Q = P 0 .
 3. P ≤ P 0 ⇒ ΓM V Dcl (P ) ≤ ΓM V Dcl (P 0 ). Since P ≤ P 0 , we have that P ≤
    ΓM V Dcl (P 0 ), and also P ≤ ΓM V Dcl (P ). We then have that

                       P ≤ ΓM V Dcl (P ) ∧ ΓM V Dcl (P 0 ) ≤ ΓM V Dcl (P )

        By the closure under meet, ΓM V Dcl (P ) ∧ ΓM V Dcl (P 0 ) ∈ [P ART (T )]∧ ; by
        the last condition of the definition, we get ΓM V Dcl (P ) ∧ ΓM V Dcl (P 0 ) =
        ΓM V Dcl (P ) or, equivalently, ΓM V Dcl (P ) ≤ ΓM V Dcl (P 0 ) as desired.

    Note that P ART (mi ) can only express a partition that has only one class that
is not a singleton. Then, all the partitions that belong to P ART (T ) will have only
one class that is not a singleton. Partitions with more than one nonsingleton can
be obtained through the meet operation. The next theorem proves that ΓM V Dcl
characterizes all the MVD clauses that hold in a theory T :
28      Jaume Baixeries, José Luis Balcázar
     New Closure Operators and Lattices for MVD’s and Related Expressions                  7

Theorem 1. A generalized MVD clause X ⇒⇒ Y1 | . . . |Ym holds in T if and
only if for P = {X1 | . . . |Xn |Y1 | . . . |Ym } and P 0 = {X1 | . . . |Xn |Y1 . . . Ym } we
have that ΓM V Dcl (P 0 ) = P , where the Xj are the attributes in X.

Proof. (⇒) We first note that the fact that a generalized MVD clause X ⇒⇒
Y1 | . . . |Ym holds in T means that these Yi constitute the finest possible right
hand side for a generalized MVD clause that has X in its left hand side. For
each i, each model in T must satisfy X ⇒⇒ Yi ∨ Y 0 , where Y 0 stands for the
union of the remaining classes of attributes. This implies that all the zeros of
the model are in the same class Yi , since otherwise it is easy to check that this
last condition becomes violated. Conversely, it is immediate to see that, if all
the models satisfying X have all their zeros together in one single Yi , then T
satisfies the dependency clause.
     We must prove that P 0 = {X1 | . . . |Xn |Y1 | . . . |Ym } belongs to [P ART (T )]∧ ,
and that no other partition in [P ART (T )]∧ exists in                        0
                                                           V between P and P . Consider
                0
the theory T = {m ∈ T m |= X}, and let Q = m∈T 0 P ART (m). Clearly all
Xi are singletons in all P ART (m) form m ∈ T 0 , and, as just described, all of
them have their zeros (the only nonsingleton in P ART (m)) included in one of
the Yi ’s, so that for allVsuch m we have P 0 ≤ P ART (m), and the properties of
the meet ensure P 0 ≤ m∈T 0 P ART (m) = Q. We argue that actually P 0 = Q
so that indeed P 0 ∈ [P ART (T )]∧ : assume that P 0 < Q, that is, some class Yi
of P 0 is split in Q into more than one class. Fix two attributes A and B that
belong to Yi that are in different classes of Q, say ZA and ZB respectively; from
P 0 ≤ Q as just proved, ZA ∪ ZB ⊆ Yi . Noting the definition of Q, we see that no
model in T 0 has zeros both in ZA and in ZB , since otherwise the meet operation
would have joined them into a single class. Then, the observation we made in the
previous paragraph tells us that the multivalued dependency clause X ⇒⇒ ZA
is true in T , and ZA should be obtained as a union of Yi ’s, which is not the case.
Therefore P 0 = Q, so that P 0 ∈ [P ART (T )]∧ .
     It remains to prove that there is no other partition P 00 ∈ [P ART (T )]∧
with P ≤ P 00 ≤ P 0 , except, of course, for P 0 itself, and this will imply that
ΓM V Dcl (P )V= P 0 , as we wish. Assume P 00 ∈ [P ART (T )]∧ , and let T 00 ⊆ T such
that P 00 = m∈T 00 P ART (m). Again all the Xi must be singletons in P 00 due to
P ≤ P 00 , which means that the models         00
                                      V in T satisfy X, or,V       equivalently, T 00 ⊆ T 0 .
It is immediate to see, then, that m∈T 0 P ART (m) ≤ m∈T 00 P ART (m) be-
cause, due to the way the meet operation works, having at least as many tuples
in T 0 makes the partition at least as coarse as the one corresponding to T 00 . That
is, we see that P 0 ≤ P 00 ; and from the assumption P 00 ≤ P 0 we get P 00 = P 0 .
Taken together, all our consequents prove that ΓM V Dcl (P ) = P 0 .
     (⇐) We prove that if ΓM V Dcl (P 0 ) = P , then X ⇒⇒ Y1 | . . . |Yn holds in T .
We use the same simple observation as in the first paragraph of the converse
direction. Pick any model m ∈ T that satisfies X: we will prove that all the
zeros of m are in the same Yi , and this implies that the MVD clause is true in T .
This is not difficult now, since whenever m satisfies X all the singletons of P are
also singletons of m, that is, P ≤ P ART (m), and because P ART (m) belongs
obviously to the lattice, the closure of P , namely P 0 under our assumption,
                          Multivalued Dependencies and Related Expressions           29
8       J. Baixeries and J.L. Balcázar

must also obey P 0 ≤ P ART (m). The only way for this to hold is that the only
nonsingleton class of P ART (m) is fully included in one of the classes of P 0 , that
is, all the zeros of m are in one single Yi , and this implies that the MVD clause
is true.

     In fact, completely analogous theorems for DMVDs and MVDs, using the
corresponding closure operators instead, were proved in [2] and [3] respectively.
     We define LM V Dcl (T ) as the set of closed partitions of attributes that are
closed under ΓM V Dcl for T . Since ΓM V Dcl is a closure operator, the set LM V Dcl (T )
is closed under the operator meet. We have seen how to create a lattice to repre-
sent the MVD clauses that hold in a theory T and, as we previously mentioned,
since the equivalence of a DMVD that holds in a relation and a MVDcl that
holds in the binarization of that relation is direct, then we can conclude that the
lattices generated by ΓM V Dcl and ΓDM V D must be the same.
     Theorem 1 could have been proved instead by the following argumentation:
By Proposition 1, we have that ΓM V Dcl generates the same set of closed parti-
tions in BIN (r) as ΓDM V D in r. Then, to prove Theorem 1 we may also derive
it from the following theorem, whose proof resorts heavily to our previous work:

Theorem 2. Let P = {P1 | . . . |Pn } be a partition of attributes. P = ΓM V Dcl (P )
⇔ P = ΓDM V D (P ).

Proof. For this proof, we will use the Armstrong relation that can be derived
from a set of closed partitions as described in [2].
    ⇒) P = ΓM V Dcl (P ) implies that in the Armstrong relation for each Pi ∈ P
there will be a pair of tuples that will only differ in Pi , and each pair will be
different one from each other. It implies that if BIN is performed between the
two tuples of each pair (otherwise it will result in a all zeros model), it will result
in a binary model such that all the attributes in Pi will be set to zero, and the
rest will be set to one. It induces a set of partitions such that all the attributes
not in Pi will be singleton, and those in Pi will be in the same partition. Since it
happens for all Pi ∈ P , the meet for all those partitions will be P , which proves
that it will be closed under ΓDM V D .
    ⇐) P = ΓDM V D (P ) implies that for each Pi ∈ P that is not a singleton,
there is a P 0 such that all the attributes are singleton except those that are in Pi .
For all those models, it would imply that there is a pair of tuples that disagree
only in Pi , which constitues a set of tuples that imply that P = ΓM V Dcl (P ).

    Theorem 1 and the correspondence between MVD clauses and DMVD’s gives
us an alternative way to construct a lattice and derive the DMVDs of a relation
r: binarizing the tuples in r, and closing under meet the resulting partitions of
attributes. This process parallels that of the construction of Armstrong relations
for a set of FD’s [15] and that of constructing an Armstrong relation for a set of
DMVDs in [2].

Corollary 1. [P ART (BIN (r))]∧ = LM V Dcl (BIN (r)) = LDM V D (r).
30      Jaume Baixeries, José Luis Balcázar
     New Closure Operators and Lattices for MVD’s and Related Expressions          9

4    Multivalued Dependency Clauses and Multivalued
     Dependencies

We just proved that [P ART (BIN (r))]∧ = LDM V D (r), it is, that binarizing a
relation, then transforming these models into partitions of attributes and then
closing the resulting models under meet, the set of partitions of attributes so
constructed characterizes the DMVDs that hold in r. Our previous paper [2] pro-
vided such a construction by a Galois connection operating on partitions and re-
lying strongly on the “matching” condition defined before: now we see that bina-
rization offers a different, more natural way for both obtaining the set of DMVDs
that hold in a relation and performing the implication test for DMVDs. Equipped
with these facts, we move now to the most relevant notion, MVDs proper: in
this section we compare LM V D (r) and [P ART (BIN (r))]∧ . First consider the
possibility of having an equivalence of the form LM V D (r) = [P ART (BIN (r))]∧ .
This correspondence holds in relations were the sets of attributes that have some
values in common are reduced to sets of two tuples (a two-tuples world) as it
is proved in [19]. Proposition 3 next states that the MVD lattice is actually in-
cluded in the DMVD lattice; therefore, if the inclusion is an equality, then the
equality [P ART (BIN (r))]∧ = LM V D holds, otherwise, it does not.

Proposition 3. Let r be a relation, and let LDM V D (r) and LM V D (r) the lat-
tices that characterize the DMVDs and MVDs respectively that hold in r. Then,
LM V D (r) ⊆ LDM V D (r).

Proof. Fixed r, we prove that if partition P is not in LDM V D (r) then P is not
in LM V D (r) either. We use the facts, proved in [2] and [3] respectively, that par-
allel theorem 1 for DMVDs and MVDs: each application of the corresponding
closure operator to a partition P that gives a different partition corresponds to
a DMVD, respectively MVD, that holds in the relation given. If P ∈      / LDM V D (r)
then a nontrivial degenerate multivalued dependency with singletons from P as
left hand side holds in r; as indicated just after the definition, if the degenerate
multivalued dependency holds then its syntactically equivalent multivalued de-
pendency holds as well; the syntactical equivalence implies that its left hand side
is the same and therefore singletons of P , so that the closure operator for MVDs
also maps P into a different partition and thus P is not closed in LM V D (r)
either.

    Our next result provides a postprocessing method that eliminates the appro-
priate models in BIN (r) to obtain the MVD lattice. The method consists of
traversing the DMVD lattice top-down, and testing, for each node, the condi-
tions stated in the theorem below; at that point, it is necessary to assume that
all the nodes above the current node have been already tested, since we need for
the proof the meet of all the nodes in the MVD lattice that sit above the node
undergoing the test.
    In fact, it is immediate to see that, upon considering a node P , if we take
the meet of all the nodes above it in the MVD lattice, we obtain a node Q ≥ P
                          Multivalued Dependencies and Related Expressions             31
10      J. Baixeries and J.L. Balcázar

which is, in fact, one of them due to the closure under meet; and, of course, to
maintain the closure property, in case Q = P then P also belongs to the MVD
lattice.
    Thus, assume from now on that P < Q; since all nodes above P are already
correctly placed in the MVD lattice, we know that either P belongs to it, or its
closure is exactly Q. Thus, we consider a generalized dependency on the basis
of P and Q, and we know from [3] that P is in the lattice if and only if the
dependency does not hold. Namely, assuming that P = {X1 | . . . |Xn |Y } with
X = X1 . . . Xn and Q = {X1 | . . . |Xn |Y1 | . . . |Ym }, P must be removed from
the lattice if and only if r |= X →→ Yi for each i. We prove that considering
information local to P and Q suffices.
Theorem 3. Let m ∈ BIN (r) and P = P ART (m), where P = {X1 | . . . |Xn |Y }
with X = X1 . . . Xn , and Q = {X1 | . . . |Xn |Y1 | . . . |Ym } is the meet of all the
nodes above P in LM V D (r); assume P < Q. For each Yi , consider the related
models m1 and m2 : m1 has true the true variables of m and those corresponding
to Yi , and m2 has all variables true except those in Yi . It holds that P ∈ LM V D (r)
if and only if there is a Yi (with associated m1 and m2 ) and a pair of tuples t1
and t2 whose comparison results in m, for which there are no pairs t3 and t4 so
that the comparisons of t1 with t3 and of t2 with t4 yield m1 and the comparisons
of t1 with t4 and of t2 with t3 yield m2 .
Proof. One direction is easy. If indeed there is Yi and the pair of tuples, then it
can be readily checked from the properties of m1 and m2 that this pair witness
that r 6|= X → → Yi , so the generalized dependency constructed from P and Q
does not hold; it follows from the results in [3] that Q cannot be the closure of P .
But ΓM V D (P ) is in the MVD lattice and, if it was above P , the definition of Q
would imply P < Q < ΓM V D (P ) which contradicts the properties of the closure
operator. Thus the only possibility is ΓM V D (P ) = P so that P ∈ LM V D (r).
    For the converse, in fact all the steps in the previous paragraph are “if and
only if”, up to the point that r 6|= X →→ Yi However, this does not complete the
proof since there could be pairs of tuples violating the dependency somewhere
else: it remains to argue that t1 and t2 violating the implication can be picked
so that their comparison is m. In fact we prove more: we prove that all these
pairs are there. Fix any pair of tuples whose comparison gives m0 6= m: we will
prove that they satisfy the conditions for X →→ Yi . Note that what we want
to prove, that the X →   → Yi are satisfied for all Yi in this case, is the same as
proving that the generalized dependency X →       → Y1 | . . . |Ym is satisfied: it is the
one that would correspond to P in case its closure was Q.
    Since P = P ART (m) and X is the union of the singletons of P , that is,
the ones of m, the result is immediate whenever m0 has a zero among these,
because the pair of tuples would not satisfy the antecedent. Thus, we only need
to consider m0 > m. Let P 0 = P ART (m0 ), which implies that P 0 has at most
one class that is not a singleton and that P ≤ P 0 , and suppose first that Q ≤ P 0 .
Then, Q being coarser, the zeros of m0 must be all in the same class in Q and
the pair of tuples differ only in one class of Q, which implies that they trivially
satisfy the generalized dependency.
32      Jaume Baixeries, José Luis Balcázar
     New Closure Operators and Lattices for MVD’s and Related Expressions            11

    Therefore, we suppose now that Q 6≤ P 0 , that is, P ≤ P 0 ∧ Q < Q, which
implies that P 0 ∈    / LM V D (r) since otherwise, due to the closure under meet,
Q would not have been the meet of all the part of the MVD lattice above P .
This implies that a dependency holds in r associated to P 0 and ΓM V D (P 0 ). Its
left hand side X 0 is formed by the singletons of P 0 , which include those of P ,
say X1 , . . . , Xk , and some more, say Xk+1 , . . . , Xn ; call the union of this set
W = W1 W2 according to whether they come from Yi or from its complement.
    Regarding its right hand side, made up by the rest of the classes of ΓM V D (P 0 ),
we know that P ≤ P 0 ≤ ΓM V D (P 0 ), which surely is in the MVD lattice, so that
Q ≤ ΓM V D (P 0 ) by the definition of Q. Thus, we find that Yi is W1 (from the
left hand side X 0 ) union a union of classes of the right hand side. A pair of
tuples corresponding to m0 must coincide in X1 , . . . , Xn = XW1 W2 , and from
the fact that the generalized dependency associated to P 0 holds we easily obtain
the tuples needed to prove that the pair fulfills X →     → Yi as well.

   According to this process we can identify one by one, proceeding from the
top down, those nodes in LDM V D that must remain in LM V D and distinguish
them from those that must be deleted to construct LM V D from LDM V D .


References
 1. Baixeries J. A Formal Concept Analysis framework to model functional dependen-
    cies. Mathematical Methods for Learning (2004).
 2. Baixeries J., Balcázar, J.L. Characterization and Armstrong relations for Degen-
    erate Multivalued Dependencies using Formal Concept Analysis. The Third Inter-
    national Conference on Formal Concept Analysis 2005.
 3. Baixeries J., Balcázar, J.L. A Lattice Representation of Relations, Multivalued De-
    pendencies and Armstrong Relations. 13th International Conference on Conceptual
    Structures (ICCS’05).
 4. Balcázar, J.L. Query learning of Horn formulas revisited.. To appear in: Com-
    putability in Europe 2005..
 5. Balcázar, J.L., Baixeries J. Discrete Deterministic Data Mining as Knowledge
    Compilation. Workshop on Discrete Mathematics and Data Mining in SIAM In-
    ternational Conference on Data Mining (2003).
 6. Balcázar, J.L., Baixeries J. Characterizations of multivalued dependencies and re-
    lated expressions. Discovery Science 2004.
 7. Beeri C., Fagin R., Howard J. H. A complete axiomatization for functional and mul-
    tivalued dependencies in database relations. Jr. Proc. 1977 ACM SIGMOD Sym-
    posium, ed. D. C. P. Smith, Toronto, pp. 47-61.
 8. Beeri C. On the Membership Problem for Functional and Multivalued Dependencies
    in Relational Databases. ACM Transactions on Database Systems, Vol 5, No. 3,
    September 1980, pages 241-259.
 9. Davey B.A., Priestley H.A. Introduction to Lattices and Order. Second edition.
    Cambridge University Press, 1990, 2002.
10. Day A. A lattice interpretation of database dependencies. Semantics of Program-
    ming Languages and Model Theory (M. Droste and Yu. Gurevich, eds), Gordon
    and Breach, London.
                          Multivalued Dependencies and Related Expressions       33
12      J. Baixeries and J.L. Balcázar

11. Demetrovics J., Hencsey G., Libkin L., Muchnik I. Normal Form Relation Schemes:
    A New Characterization. Acta Cybernetica. 10(3): 141-164 (1992).
12. Demetrovics J., Libkin L., Muchnik I. Functional Dependencies in Relational
    Databases: A Lattice Point of View. Discrete Applied Mathematics 40(2): 155-185
    (1992).
13. Guigues J. L., Duquenne V. Familles minimales d’implications informatives resul-
    tant d’un tableau de données binaires. Math. Sci. hum., 24(95):5–18, 1986.
14. Fagin R., Vardi Y. V. The theory of data dependencies: a survey. Mathematics of
    Information Processing, Proceedings of Symposia in Applied Mathematics, AMS,
    1986, vol. 34, pp. 19-72.
15. Fagin R. Armstrong databases. Invited paper, Proc. 7th IBM Symposium on Math-
    ematical Foundations of Computer Science, Kanagawa, Japan, May 1982.
16. Ganter, B., Wille R. Formal Concept Analysis. Mathematical Foundations.
    Springer, 1999.
17. Lopes, S., Petit J-M., Lakhal L. Functional and approximate dependency mining:
    database and FCA points of view. Special issue of Journal of Experimental and
    Theoretical Artificial Intelligence (JETAI) on Concept Lattices for KDD, 14(2-
    3):93-114, Taylor and Francis, 2002.
18. Sagiv Y. An algorithm for inferring multivalued dependencies with an application
    to propositional logic. Journal of the ACM, 27(2):250-262, April 1980.
19. Sagiv Y., Delobel C., Parker S., Fagin R. An Equivalence Between Relational
    Database Dependencies and a Fragment of Propositional Logic. Journal of the
    ACM, Vol 28, No 3, July 1981, 435-453.
20. Thalheim B. Conceptual Treatment of Multivalued Dependencies. 22nd Interna-
    tional Conference on Conceptual Modeling, Chicago, IL, USA, Lecture Notes in
    Computer Science Springer 2003, pp 363 - 275.
21. Ullman J.D. Principles of Database and Knowledge-Base Systems. Computer Sci-
    ence Press, Inc. 1988.