<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>August</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1109/TKDE.2010</article-id>
      <title-group>
        <article-title>Dependency Covers from a FCA Perspective</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jaume Baixeries</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victor Codocedo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mehdi Kaytoue</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amedeo Napoli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universitat Politècnica de Catalunya</institution>
          ,
          <addr-line>Barcelona, Catalonia</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université de Lorraine</institution>
          ,
          <addr-line>CNRS, LORIA, Nancy</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Université de Lyon. CNRS</institution>
          ,
          <addr-line>INSA-Lyon, LIRIS, Lyon</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>20</volume>
      <issue>2023</issue>
      <fpage>57</fpage>
      <lpage>66</lpage>
      <abstract>
        <p>Implications in Formal Concept Analysis (FCA), Horn clauses in Logic, and Functional Dependencies (FDs) in the Relational Database Model, are very important dependency types in their respective fields. Moreover, they have been proved to be equivalent from a syntactical point of view. Then notions and algorithms related to one dependency type in a field can be reused and applied to another dependency type in the other field. One of these notions is that of cover, also known as a base or basis, i.e., a compact representation of a complete set of implications, FDs, or Horn clauses. Although the notion of cover exists in the three fields, the characterization and the related uses of a cover are diferent. In this paper, we study and compare, from an FCA perspective, the principles on which rely the most important covers in each field. Finally, we discuss some open questions that are of interest in the three fields, and especially to the FCA community.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Functional dependencies</kwd>
        <kwd>Implications</kwd>
        <kwd>Horn Clauses</kwd>
        <kwd>Dependency Covers</kwd>
        <kwd>Closure</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and Motivation</title>
      <p>
        A dependency is a relation between sets of attributes in a dataset. In this paper, they are
represented as  →  , where the type of the subsets of attributes  and  , and the semantics
of → may vary w.r.t. the context. There are many diferent kinds of dependencies: complete
and comprehensive surveys, from a Relational Database Theory perspective, can be found in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
and in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Here, we focus on those dependencies that follow the so called Armstrong axioms,
this is, reflexivity, augmentation and transitivity, which appear in diferent fields of computer
science: functional dependencies (FDs) in the Relational Database Model, Horn clauses in logic
and logic programming, and implications in Formal Concept Analysis.
      </p>
      <p>
        Functional dependencies [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] are of paramount importance in the Relational Database
Model (RDBM), where they are used to express constraints or rules that need to hold in a
database, to help the design of a database or to check for errors and inconsistencies. A set of
Horn clauses [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is a special case of Boolean functions that are crucial in logic programming
[
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] and artificial intelligence (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for a detailed explanation). Implications are at the
core of Formal Concept Analysis (FCA) where they are used to model and deduce relevant
information that is contained in a formal context [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ].
      </p>
      <p>
        Although they appear in diferent fields, these three constructions have been applied on
diferent kinds of data and have been used for diferent purposes. They also all share the same
axioms, which means that, from a syntactical point of view, they are all equivalent. More
specifically, the equivalence between functional dependencies and Horn clauses is presented
in [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ] (see also [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for a more detailed explanation). The equivalence between functional
dependencies and implications is explained in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and the equivalence between implications
and Horn clauses is explained in Section 5.1 in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] as well as in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. These equivalences allow
us to talk in a generic way of Armstrong dependencies or, simply, dependencies.
      </p>
      <p>
        One of the consequences of this equivalence is the transversality of concepts, problems and
algorithms between these three fields. One of the most typical examples is the decision problem
of the logical implication which consists in, given a set of dependencies Σ and a single
dependency  , to determine whether  is logically entailed by Σ , that is, Σ |=  . Entailment
means that  can be derived from Σ by the iterative application of the Armstrong axioms. This
problem is named implication problem in the RDBM [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ] and FCA fields, and deduction
in logic [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. It is of capital interest in all three fields. In the RDBM it allows to test whether
two diferent sets of functional dependencies are equivalent [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and it also allows to compute a
more succinct set of functional dependencies, which is relevant to assist the design process of
databases [15, 16]. In logic the deduction problem is used to check whether a logical expression
is consistent w.r.t. a knowledge base and to compute the prime implicants of a Horn function [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
In Formal Concept Analysis this problem is used, for instance, in attribute exploration [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
which consists in creating a data table (context) in agreement w.r.t. a set of attributes and a set
of objects, and also for computing the Duquenne-Guigues basis [17].
      </p>
      <p>Roughly speaking, the computation of the logical implication problem Σ |=  →  is
performed by iterating over Σ and applying the Armstrong axioms to infer new dependencies
until a positive or negative answer is found. However, this problem can be reduced to the
computation of the closure of  with respect to Σ (closureΣ()). This closure returns the
largest set such that Σ |=  → closureΣ() holds. Therefore, the implication problem
Σ |=  →  boils down to testing whether  ⊆ closureΣ().</p>
      <p>
        As an example of transversality, the algorithm that computes closureΣ() appears in most
of the main database textbooks, where it is called closure [
        <xref ref-type="bibr" rid="ref13 ref14 ref3">14, 13, 3</xref>
        ], and also in logic, where it
is called forward chaining [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], while in FCA the same algorithm that first appeared in the
RDBM is discussed and reused in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Another “transversal notion” which is present in all three fields is the notion of cover. In
general terms, it is not suitable to handle the set Σ of all dependencies that hold, because of
its potential large size, but rather a subset of Σ that contains the same information and that
is significantly smaller in size. By “containing the same information” we mean that this subset
may generate, thanks to the application of the Armstrong axioms, the complete set Σ . This
compact and representative subset is called “cover” in the RDBM, “basis” in FCA and “set of
prime implicants” in logic. Moreover, each field has defined and used a diferent kind of cover.
While in the RDBM this base is the Canonical-Direct Basis or the Minimal Cover, the cover of
choice in FCA is the so-called Duquenne-Guigues basis.</p>
      <p>
        Both the implication problem and the problem of computing a cover are related: the
implication problem is used in the algorithm Canonical Basis (Algorithm 16, page 103 in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]) to
compute the Duquenne-Guigues basis, and it is also used in the algorithm Direct (Section 5.4,
Chapter 5 in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]) which is used to compute the Minimal Cover. Again, the transversality of
the Armstrong dependencies appears in a general concept (computing a cover) but in diferent
forms (Duquenne-Guigues basis and Minimal Cover).
      </p>
      <p>This paper is a short version of the paper Three Views on Dependency Covers from an FCA
Perspective that was accepted at the ICFCA 2023 (International Conference on Formal Concept
Analysis). The purpose of this study is to present a discussion of the main diferent covers
used in the RDBM, in Logic, and in FCA from the perspective of the formal concept analysis
community. To do so, we review three diferent main covers that appeared in the literature.</p>
      <p>The paper is organized as follows. Section 2 provides the necessary definitions needed in the
paper. Section 3 includes a detailed comparison of the main covers. Finally, Section 4 concludes
the paper and proposes an extensive discussion about these diferent and important covers.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Definitions</title>
      <p>In this section we introduce the definitions used in this paper. We do not provide the references
for all of them because they can be found in all the textbooks and papers related to the RDBM,
Logic and FCA.</p>
      <p>
        As explained in the introduction, implications[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], functional dependencies [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and Horn
clauses [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] are dependencies between sets of attributes, which are equivalent from a syntactical
point of view, since they are in agreement with the Armstrong axioms.
      </p>
      <p>Definition 2.1. Given set of attributes  , for any , ,  ⊆  , the Armstrong axioms are:
1. Reflexivity : If  ⊆ , then  →  holds.
2. Augmentation. If  →  holds, then  →   holds.
3. Transitivity. If  →  and  →  hold, then  →  holds.</p>
      <p>When we write that a dependency  →  holds, we mean all the instances in which this
dependency is valid or true. Therefore, the sentence “If  →  holds, then  →   holds”
can be rephrased as “In any instance in which  →  is valid, the dependency  →   is
valid as well”.</p>
      <p>The Armstrong axioms allow us to define the closure of a set of dependencies as the iterative
application of these axioms over a set of dependencies.</p>
      <p>Definition 2.2. Σ + denotes the closure of a set of dependencies Σ , and can be constructed thanks
to the iterative application of the Armstrong axioms over Σ . This iterative application terminates
when no new dependency can be added, and it is finite. Therefore, Σ + contains the largest set of
dependencies that hold in all instances in which all the dependencies in Σ hold.</p>
      <p>The closure of a set of dependencies induces the definition of the cover of such a set of
dependencies.</p>
      <sec id="sec-2-1">
        <title>Definition 2.3.</title>
        <p>The cover or basis of a set of dependencies Σ is any set Σ ′ such that Σ ′+ = Σ +.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Covers of Dependencies</title>
      <p>We now present the diferent types of covers that are present and adopted in the three fields,
namely RDBM, FCA, and Logic. Since the definition of a cover (Definition 2.3) is very generic,
the covers reviewed here have been defined with respect to specific characteristics and for
diferent purposes.</p>
      <sec id="sec-3-1">
        <title>3.1. Four Main Characteristics of Covers</title>
        <p>In general terms, a cover is simply a set of dependencies Σ . The definition of a cover is very
generic and below we introduce some relevant properties which are useful for characterizing
the diferent covers. In particular, Σ is equivalent to another set of dependencies Σ ′ modulo the
Armstrong axioms when the closures Σ + and Σ ′+ are the same.</p>
        <p>There are three sources of redundancy in a set of dependencies: reflexivity, augmentation
and transitivity (with respect to the Armstrong axioms 2.1), but here dependencies that can
be derived by reflexivity 2.1 are trivial and not considered in the following discussion. This
assumption does not invalidate any of the arguments that are to be presented, but simplifies the
discussion. We will present three bases that try to reduce the number of dependencies that are
needed in order to compute the closure closureΣ for any set of attributes. But first, we need to
characterize the said bases according to the diferent ways to remove redundancy that are used.
Definition 3.1. A set of dependencies Σ is left-reduced if and only if for all  →  ∈ Σ there
is no ′ →  , where ′ ⊂ , such that changing  →  by ′ →  in Σ gives an equivalent
base.</p>
        <p>The process of left-reducing a set of dependencies is also mentioned as the removal of
extraneous attributes in the left-hand sides of all dependencies. As it can be expected, an
attribute  is extraneous in the left-hand side of a dependency  ∈ Σ if removing  from the
left-hand side in  does not change Σ +.</p>
        <p>Example 3.1. Let us take the set of dependencies Σ = {  → ,  → ,  → ,  →  }. In
this set, the dependency  →  contains the extraneous attribute  in the left-hand side, because
changing  →  by  → , we have that Σ + is the same.</p>
        <p>The equivalent case, in the right-hand side, is right-reduction. An attribute  is extraneous in
the right-hand side of a dependency  ∈ Σ if removing  from the right-hand side in  does
not change Σ +.</p>
        <sec id="sec-3-1-1">
          <title>Definition 3.2.</title>
          <p>∈ Σ .</p>
          <p>
            A set of dependencies Σ is non redundant if and only if (Σ ∖  )+ ̸= Σ + for all
If a set of dependencies Σ is non redundant and all the right-hand sides are singletons then
Σ is right-reduced (Definition 5.6 in [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]).
          </p>
          <p>Since {  →  } ≡ {  → ,  →  }, there is no diference between considering a
cover such that all the right-hand sides are singletons, or just joining in one single dependency
all those dependencies with the same left-hand side. Actually, when the right-hand sides are
singletons, the number of dependencies is “artificially” increased.</p>
          <p>Definition 3.3. A set of dependencies Σ is direct if and only if closureΣ() can be computed
with a single pass of Σ , for all  ⊆  .</p>
          <p>As we stated in Section 1, we assume that the computation of closureΣ is performed by the
algorithm Closure or LinClosure. The Definition 3.3 means that computing the closure of
any set of attributes  ⊆  implies only one complete exploration of Σ . Actually, this unique
exploration represents a very important property, especially when Σ is large or very large.
Moreover, it should also be noticed that (i) a cover Σ is direct regardless of how it is represented
or sorted, and also (ii) Σ is direct for all possible sets of attributes  ⊆  .</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>Definition 3.4.</title>
          <p>Σ + = Σ ′+.</p>
          <p>A set of dependencies Σ is minimal if and only if |Σ | ≤ | Σ ′| for all Σ ′ such that
It is important to notice that when a cover is minimal, it is also non redundant, but the
opposite does not necessarily hold. Moreover, let us recall also the importance of computing a
non redundant cover given a set of dependencies Σ , and that computing a non redundant cover
is equivalent to the logical implication problem.</p>
          <p>In the next sections we review three diferent and major types of covers that are of interest
in the RDBM, in Logic, and in FCA.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. The Canonical-Direct Basis</title>
        <p>
          The Canonical-Direct Basis is defined with diferent characterizations in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and aims to remove
the redundancy caused by augmentation. The computation of this cover is performed in three
steps:
1. All the dependencies in Σ must have one single attribute in the right-hand side, as it was
already the case above for the computation of the minimal cover. Again this is performed
by simply replacing a dependency  →  by the dependencies  →  for all  ∈  .
2. Σ is closed by “pseudo-transitivity”, that is, Augmentation plus transitivity. Then
dependencies  →  such that  ∈  are removed.
3. Σ is left-reduced (see Definition 3.1 and step 2 in the construction of the minimal cover).
        </p>
        <p>It can be noticed that there is no removal of redundant dependencies. The only source of
redundancy that is taken into account and removed is the one generated by the application of
augmentation, but not of transitivity. The Canonical-Direct Basis is not necessarily minimal,
nor it is non-redundant, but it is direct.</p>
        <p>Example 3.2. We continue with this example: Σ = {  → ,  → ,  → ,  →  }. Applying
step 1 produces Σ = {  → ,  → ,  → ,  → ,  →  }. Here, step 2 consists in closing
Σ by pseudo-transitivity, which outputs: Σ ′ = {  → ,  → ,  → ,  → ,  → ,  →
,  → ,  → ,  →  }. When applying step 2 to left-reduce Σ , and since  →  can be
left-reduced to  → , the following equivalent set is obtained and constitutes the final result:
Σ ′′′ = {  → ,  → ,  → ,  →  }, Since there is no removal of redundant dependencies, then
dependency  →  appears in this canonical-direct base.</p>
        <p>
          We need to notice that this base is also characterized in Formal Concept Analysis by the
so called minimal generators (or minimal generating set). A minimal generator is defined
in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] (Section 2.3.3) as follows: A generating set of a closed set  is a subset  ⊆  such that
 = ′′ , and, obviously, a minimal generating set of  is a subset of  minimal with respect to
this property. A similar definition exists in [ 18]: A minimal generator B of a closed set  is also
called a base for  , i.e. () =  and () ⊂ () for every  ⊂ , or a free subset, i.e. for
every  ∈ ,  ∈ ( ∖ ) (where  is a closure operator). What is also relevant is that, in
this same paper, the characterization of the canonical direct base Σ  is defined as follows:
Σ  = {  → closureΣ() ∖  :  ⊆  is a minimal generator of closureΣ() }. That is,
the left-hand sides of a Canonical-Direct Basis are the set of minimal generators of the implicit
closure operator.
3.3. The Minimal Cover in the RDBM and its variations
Below we introduce the Minimal Cover, which is very popular among the RDBM community
and can be found in most of the database textbooks under diferent names (see Table 1). This
cover aims to remove the redundancy caused by both augmentation and transitivity.
        </p>
        <p>Name
Canonical Cover
Minimal Cover
Minimal Cover
Irreducible Set of Dependencies
Minimal Cover
Canonical Cover</p>
        <p>
          Ref Where
Maier [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] p. 79, Section 5.6
        </p>
        <p>
          Ullman [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] p. 390
Abiteboul [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] p. 286, Exercice 11.16
        </p>
        <p>Date [19] p. 341, Section 11.6</p>
        <p>Elmasri [20] p. 549, Section 16.1.3</p>
        <p>Silberschatz [21] p. 324, Section 7.4.3</p>
        <p>The computation of the Minimal Cover is performed in three steps:
1. All the dependencies in Σ must have only one single attribute in the right-hand side. This
is performed by simply replacing a dependency  →  by the dependencies  →  for
all  ∈  .
2. Σ is left-reduced. This is performed by changing a dependency  →  by a dependency
′ → , where ′ ⊂ , whenever (Σ ∖ { →  } ∪ {′ →  })+ ≡ Σ + (see Definition
3.1).
3. Redundant dependencies are removed from Σ (see Definition 3.2).</p>
        <p>
          It is important to notice that the order of steps 2 and 3 is relevant and mandatory. Section 5.3
in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] includes a discussion explaining why left-reduction needs to be performed before the
removal of redundant dependencies. The output of computing the Minimal Cover depends also
on the order in which dependencies are processed. As a consequence it comes that there may be
diferent minimal covers for the same Σ . Finally, the Minimal Cover does not ensure directness.
        </p>
        <sec id="sec-3-2-1">
          <title>Example 3.3 (Adapted from Section 5.2 in [13]). Let us suppose that we have the following</title>
          <p>set of dependencies: Σ = {  → ,  → ,  → ,  →  }. Applying step 1 outputs Σ = {  →
,  → ,  → ,  → ,  →  }. Then step 2 is applied to left-reduce Σ . Since  →  can be
left-reduced to  →  (thanks to Augmentation), then the following equivalent set is produced:
Σ ′ = {  → ,  → ,  → ,  →  }. Finally, applying step 3 outputs: Σ ′′ = {  → ,  →  }.
By contrast, let us assume that the order of Σ is changed as follows: Σ = {  → ,  → ,  →
,  →  }. Applying step 1 yields the same set as above: Σ = {  → ,  → ,  → ,  →
,  →  }. When applying step 2, it comes: Σ ′ = {  → ,  → ,  → ,  →  }. And finally
applying step 3 outputs the base Σ ′ = {  → ,  →  }.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.4. The Duquenne-Guigues basis</title>
        <p>
          The Duquenne-Guigues basis [
          <xref ref-type="bibr" rid="ref8">17, 8</xref>
          ], also called the Canonical Basis in the FCA community,
is the cover based on pseudo-closed sets [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. More precisely, it is defined as follows:
        </p>
        <sec id="sec-3-3-1">
          <title>Definition 3.5.</title>
          <p>The Duquenne-Guigues basis of a set of dependencies Σ is defined as
{  → closureΣ() |  ⊆ 
and  pseudoclosed }
where the definition of a pseudo-closed set of attributes w.r.t. a set of dependencies Σ is:</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>Definition 3.6.</title>
          <p>pseudoclosed if:</p>
          <p>Let Σ be a set of dependencies, and  the related set of attributes.  ⊆ 
is
1.  ̸= closureΣ(), that is,  is not closed.
2. If  ⊂  is a proper subset of  and pseudo-closed, then closureΣ( ) ⊆ .</p>
          <p>
            This base is not direct, but it is minimal and non-redundant. According to [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ], this base is
also presented by Maier in [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ], where it is called the Minimum Cover: “It has been obtained
independently (and with diferent formulations) by Maier (Minimum Cover), and Guigues and
Duquenne (Duquenne-Guigues basis)”. We interpret the expression with diferent formulations
as the fact that the left and right-hand sides of the Duquenne-Guigues basis can be further
left/right-reduced. This is: modulo left/right-reduction, the Minimum Cover and the
DuquenneGuigues basis are essentially the same. We note that [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ] (Section 5.6.2) presents a way to
compute a Minimum Cover out of a non redundant set of dependencies. However, at present,
the use and popularity of the Duquenne-Guigues basis seems to be rather restricted within the
FCA community [
            <xref ref-type="bibr" rid="ref8 ref9">9, 8</xref>
            ].
          </p>
          <p>Example 3.4. Let us consider Example 3.1: Σ = {  → ,  → ,  → ,  →  }.</p>
          <p>As in Σ the pseudo-closed sets are simply {  } and {  }, the following Duquenne-Guigues basis
is obtained: Σ ′ = {  → ,  →  }.
3.5. Some Notes on Complexity
Concerning the complexity of the computation of the diferent bases, in all cases the worst-case
scenario is that of a base of exponential size with respect to the number of attributes as well as
the computing time of such bases.</p>
          <p>In [22] it is proved that the complexity of computing the Canonical-Direct Basis is of order
(2 (  )2 2)), where  is the number of attributes and  is the number of records (the
2
size of the dataset). In [16] it is proved that any algorithm computing the dependencies that
hold in a dataset has a complexity of (︀( )︀ ), where  is the number of attributes. As for the
2
Duquenne-Guigues basis, in [23] is is proved that (a) the size of a Duquenne-Guigues basis is
exponential in the worst-case and that (b) to estimate its size without computing it is #P-hard.
In fact, deciding whether a set is a pseudo-closed set is a coNP-complete problem [24].</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and Conclusion</title>
      <p>
        Relating the three main covers plus the D-Basis is relevant and very interesting. Actually,
while the D-Basis and the Canonical-Direct Basis are related by a subset relationship, such a
relationship is not known to exist between the Duquenne-Guigues basis and the Minimum
Cover, or between the Canonical-Direct Basis and the Duquenne-Guigues basis. In fact, in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ],
in the last sentence before the acknowledgements page 28, it is stated that:
      </p>
      <p>We conclude that this paper is contradicting a conjecture of the literature (in [37]).
Indeed, one observes that the premise of implication (10) of Σ  (a direct cover) is not
contained in a premise of any implication of Σ  (the Duquenne-Guigues base).</p>
      <p>This sentence goes back to a conjecture stated in [25] and can be interpreted as follows. The
left-hand sides of a Duquenne-Guigues basis, which is minimal, may not be a subset of the
left-hand sides of a direct cover.</p>
      <p>The RDBM, Logic, and FCA fields are addressing two diferent, yet related problems: the
implication problem and the computation of a compact and representative set –cover or base–
of a complete set of dependencies. Although the first question is solved in the three fields
thanks to the same algorithms, that is, Closure or Linclosure, this unanimity disappears in
confronting with the choice of a cover. Table 2 summarizes the three types of covers reviewed
in Section 3, plus the D-Basis. It can be noticed that the Canonical-Direct Basis and the D-Basis
do not keep dependencies that can be inferred by the application of the augmentation axiom:
 →  |=  →  , while they include dependencies that can be inferred by transitivity.
This additional amount of information is enough to make direct these two covers. By contrast,
the Minimal Cover does not contain dependencies that can be inferred by augmentation or
transitivity as they are removed in the last step of the computation. And this explains why the
Minimal Cover is not direct.</p>
      <p>We have there a kind of “no free lunch theorem”: the more information a base is keeping
the more direct the base can be. By contrast, minimality and non redundancy do not favor
directness.</p>
      <p>(found in)
Minimum
Direct
Redundant
Unique</p>
      <p>Canonical
Minimal
RDBM
no
no
no
no</p>
      <p>Canonical</p>
      <p>Direct
RDBM, Logic
no
yes
yes
yes</p>
      <p>Duquenne-Guigues</p>
      <p>Minimum
FCA/RDBM
yes
no
no
yes</p>
      <p>D-Basis
Lattice Th.</p>
      <p>no
yes
yes
yes</p>
      <p>
        The lack of unanimity can also be noticed in the RDBM only. Indeed textbooks such as
[
        <xref ref-type="bibr" rid="ref13 ref14 ref3">14, 13, 3, 19, 20, 21</xref>
        ] tend to present the Minimal Cover as the preferred cover, while algorithms
computing FDs output the Canonical-Direct Basis [26]. This can be interpreted as follows:
textbooks are preferring a cover left-reduced and non-redundant, and, hence, containing less
and more compact information. However, for discovering FDs, a left-reduced cover is computed
without removing redundant dependencies. A possible explanation is that algorithms computing
FDs take a dataset as input. Then it is easy to perform a left-reduction w.r.t. the input dataset
by removing an attribute from the left-hand side and test whether the dependency still holds.
However, a redundancy test is diferent in the sense that it can only be performed w.r.t. a set of
dependencies, once this set has completely been computed. Such a test is of a diferent nature
as it is not performed w.r.t. a dataset. Moreover, this does not prevent any algorithm from
performing it.
      </p>
      <p>
        Although the Minimum Cover (or Duquenne-Guigues basis) is also introduced in the RDBM
ifeld, it has not enjoyed the same popularity as the Minimal Cover. In fact, the Minimum Cover
is introduced and discussed only in Maier [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This lack of popularity is probably due to the
rather intricate characterization of the Minimum Cover and the related algorithm at that time
(see for example Section 5.6, Chapter 5 in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]). This is especially true when compared with the
simplicity and expressiveness of the presentation of the Minimal Cover. The characterization
of Minimum Cover cannot compete either with the clear characterization of the
DuquenneGuigues basis in FCA, or the simplicity of the NextClosure Algorithm in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, this
“simplicity” is not for free and it comes at the expense of the whole theoretical framework of
FCA.
      </p>
      <p>The choice of a cover can be decided depending on the performance that Closure or Linclosure
are ofering. Both Closure and Linclosure are closely dependent on the “nature” of the set of
dependencies Σ . By “nature” we mean the diferent characteristics explained in Section 3,
which have an impact on the amount of information as well as on the number of dependencies
considered. If Σ is a direct cover (Definition 3.3), both algorithms have to perform only a single
pass of their outer loop. If the cover is not direct, e.g., Canonical-Direct Basis or
DuquenneGuigues basis, then, the number of iterations of the outer loop may be larger, while, at the
same time, the number of iterations of the inner loop may be shorter. Again, we are facing the
well-known trade-of between “expressivity and complexity”: the more expressive in terms of
information containment a cover is, the higher the cost of Closure and Linclosure is.</p>
      <p>
        The question of the preference between the Duquenne-Guigues basis and Minimum Cover
remains, even if things have changed. What is left for future work is in concern with the
practical behavior of the main covers, which has to be compared and investigated from the
experimental point of view, especially having in mind the needs in the RDBM, in Logic, and in
FCA, but also in knowledge representation and ontology engineering. In particular, attribute
exploration [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] may be a good support for evaluating the potential of the main covers studied
above in ontology engineering and database design.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>J. Baixeries is supported by a recognition 2021SGR-Cat (01266 LQMC) from AGAUR (Generalitat
de Catalunya) and is funded by a grant AGRUPS-2022 from Universitat Politècnica de Catalunya.
Mehdi Kaytoue and Amedeo Napoli are carrying out this research work as part of the French
ANR-21-CE23-0023 SmartFCA Research Project.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P. C.</given-names>
            <surname>Kanellakis</surname>
          </string-name>
          , Chapter 17 - Elements of Relational Database Theory, in: J. Van Leeuwen (Ed.),
          <source>Formal Models and Semantics</source>
          , Handbook of Theoretical Computer Science, Elsevier, Amsterdam,
          <year>1990</year>
          , pp.
          <fpage>1073</fpage>
          -
          <lpage>1156</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          ,
          <source>The Theory of Data Dependencies - An Overview, in: Proceedings of the 11th International Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer Science 172</source>
          , Springer,
          <year>1984</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          , Principles of Database and
          <string-name>
            <surname>Knowledge-Base</surname>
            <given-names>Systems</given-names>
          </string-name>
          , Volume I, volume
          <volume>14</volume>
          of Principles of computer science series, Computer Science Press,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Horn</surname>
          </string-name>
          ,
          <article-title>On sentences which are true of direct unions of algebras</article-title>
          ,
          <source>J. Symb. Log</source>
          .
          <volume>16</volume>
          (
          <year>1951</year>
          )
          <fpage>14</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Kowalski</surname>
          </string-name>
          ,
          <article-title>Logic for problem solving</article-title>
          , volume
          <volume>7</volume>
          of The computer science library :
          <source>Artificial intelligence series</source>
          , North-Holland,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Padawitz</surname>
          </string-name>
          , Computing in Horn Clause Theories, volume
          <volume>16</volume>
          <source>of EATCS Monographs on Theoretical Computer Science</source>
          , Springer,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Crama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. L.</given-names>
            <surname>Hammer</surname>
          </string-name>
          , Boolean Functions - Theory, Algorithms, and Applications, volume
          <volume>142</volume>
          <source>of Encyclopedia of mathematics and its applications</source>
          , Cambridge University Press,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ganter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          ,
          <source>Formal Concept Analysis - Mathematical Foundations</source>
          , Springer,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ganter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Obiedkov</surname>
          </string-name>
          , Conceptual Exploration, Springer,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sagiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Delobel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S. P.</given-names>
            <surname>Jr.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <article-title>An Equivalence Between Relational Database Dependencies and a Fragment of Propositional Logic</article-title>
          ,
          <source>J. ACM</source>
          <volume>28</volume>
          (
          <year>1981</year>
          )
          <fpage>435</fpage>
          -
          <lpage>453</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ibaraki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Makino</surname>
          </string-name>
          ,
          <article-title>Functional dependencies in horn theories</article-title>
          ,
          <source>Artif. Intell</source>
          .
          <volume>108</volume>
          (
          <year>1999</year>
          )
          <fpage>1</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Bertet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Monjardet</surname>
          </string-name>
          ,
          <article-title>The multiple facets of the canonical direct unit implicational basis</article-title>
          ,
          <source>Theor. Comput. Sci</source>
          .
          <volume>411</volume>
          (
          <year>2010</year>
          )
          <fpage>2155</fpage>
          -
          <lpage>2166</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Maier</surname>
          </string-name>
          , The Theory of Relational Databases, Computer Science Press,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          , Foundations of Databases, Addison-Wesley,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>