Comparing Algorithms for Computing Lower Covers of Implication-closed Sets Alexandre Bazin LIMOS, CNRS, Blaise Pascal University, Clermont-Ferrand, France contact@alexandrebazin.com Abstract. In this paper, we consider two methods for computing lower cover of elements in closure systems for which we know an implica- tional basis: intersecting meet-irreducible elements or computing min- imal transversals of sets of minimal generators. We provide experimen- tal results on the runtimes for single computations of lower covers and depth-first searches. 1 Introduction Closed sets are essential objects in many fields such as data mining or database theory. Most of the time, one is interested in computing all or parts of the closure system for a given closure operator [6, 8]. However, we are sometimes faced with the problem of finding a specific closed set that respects some property and computing too much of the rest of the closure system is a waste of time. An example of this is the problem of finding a maximal frequent closed set. As frequency is an anti-monotone property, it suffices to start with the least closed set and perform a depth-first search by repeatedly computing upper covers of sets until we cannot find frequent sets anymore. Given a closure operator and a closed set, it is easy to compute the closed sets immediately above it. Problems start to arise when we are looking for specific closed sets respecting a property that is not anti-monotone. For example, when computing the Duquenne-Guigues basis B for some closure operator c, a pseudo-closed set P is I-closed but not c-closed for I = B \ {P → c(P )}. The non-anti-monotonicity of the c-closedness prevents us from using depth-first searches from the bottom. What we are interested in is the problem of performing depth-first searches starting from the maximal element in closure systems for which we know an implicational basis. This amounts to computing lower covers of closed sets using the implications, a problem shown to be NP-hard [1]. In this work, we compare two algorithms - one based on meet-irreducibles, the other on minimal generators - that solve this problem. Section 2 reviews basic definitions and properties of lower covers that the algorithms use. Sections 3 and 4 respectively describe the methods using meet-irreducible elements and minimal generators. In Section 5, we present experimental results on the runtimes for both algorithms. 2 Preliminaries 2.1 Basic Definitions Let us use E to denote a set of elements. Definition 1 A closure operator c : 2E 7→ 2E is an extensive (X ⊆ c(X)), increasing (X ⊆ Y ⇒ c(X) ⊆ c(Y )) and idempotent (c(c(X)) = c(X)) function. A set S ∈ 2E such that S = c(S) is said to be closed. The intersection of two closed sets is closed. The set of all sets closed for a given closure operator c ordered by inclusion forms a lattice that will here be denoted by Φc . Definition 2 An implication on E is a pair (A, B) ∈ 2E × 2E , most commonly written A → B. Definition 3 Let I be a set of implications. We denote by I(·) the closure operator, sometimes called logical closure, that maps a set X to its smallest superset Y such that ∀A → B ∈ I, A ⊆ Y ⇒ B ⊆ Y The logical closure is a closure operator. As such, for an implication set I, we will use ΦI to denote the lattice of implication-closed sets ordered by inclusion. Definition 4 A minimal generator G of X ∈ 2E for a closure operator c is an inclusion-minimal set such that c(G) = X. In the remainder of this paper, we will use the term minimal generator, without any more details, to talk about the minimal generators for the logical closure. The set of minimal generators of a set S for an implication set I will be denoted GenI (S). Definition 5 Let L = (E, ≤) be a lattice. A meet-irreducible element is an element e ⊆ E that has a single upper cover, i.e. {x ∈ E | x > e} has a single inclusion-minimal element. We use M(L) to denote the set of meet-irreducible elements of the lattice L. Any element of the lattice L is the infimum of a set of meet-irreducible elements. Definition 6 Let H = (E, V ) be a hypergraph. A minimal transversal of E is a set S ⊆ V such that ∀X ∈ E, X ∩ S 6= ∅. We use T r(E) to denote the set of minimal transversals of a set of hyperedges E. 2.2 Lower Covers If we want to perform a depth-first search from the top in a lattice ΦI for which we know I, we have to be able to compute the lower covers of a set S ∈ ΦI . We present here two methods that can be used to compute them. Proposition 1 For any S ∈ ΦI , the lower covers of S are the inclusion-maximal elements of {S ∩ M | M ∈ M(ΦI ) and S 6⊆ M }. Proof Every element of ΦI is the infimum of a subset of M(ΦI ). Additionally, ΦI being a closure system, the infimum of two sets is their intersection. As such, for any M ∈ M(ΦI ), S ∩M ⊆ S is an I-closed set. S ∩M being strictly contained in S, if it is inclusion-maximal then it is a lower cover of S.  Using Proposition 1 to compute lower covers requires that we first know the meet-irreducible elements of the lattice. In most case, they must be explicitly computed beforehand. Algorithm 1, presented in Section 3.1, does this. Proposition 2 For any S ∈ ΦI , the lower covers of S are the sets S \ T with T ∈ T r(GenI (S)). Proof Let T be a minimal transversal of GenI (S). For any e ∈ S \ T , (S \ T ) ∪ {e} = S \ (T \ {e}) contains a minimal generator of S because of the minimality of T . Therefore, there is no closed set between S and S \ T and S \ T is a lower cover of S.  Using Proposition 2 to compute the lower covers of S would require that we compute the minimal generators S first, then their minimal transversals. The algorithms we use for these problems are described in Section 4. 3 Computing with Meet-irreducible Elements 3.1 Computing Meet-irreducible Elements from an Implication Base We use Wild’s algorithm [9] to compute the set of meet-irreducible elements of the lattice ΦI for which we know an implication base I. It is based on the fact that a set S ∈ ΦI is meet-irreducible if and only if it is maximal among closed sets that do not contain some element e. Let us call max(e, I) the set of maximal elements of ΦI that do not contain e and max0 (e, I \ {I}) and max00 (e, I \ {I}) the subsets of max(e, I \ {I}) for which I = A → B respectively holds and does not hold. We then have [ max(e, I) ⊆ max0 (e, I \ {I}) max00 (e, I \ {I}) ∗ max(a, I \ {I}) a∈A where X ∗ Y = {x ∩ y | x ∈ X, y ∈ Y }. Algorithm 1 uses this property to compute the meet-irreducible elements of ΦI from I = {A1 → B1 , ..., An → Bn } by computing the meet irreducibles of every lattice ΦIi such that Ii = {A1 → B1 , ..., Ai → Bi } with 0 ≤ i ≤ n. Algorithm 1: Meet-irreducibles 1 Meet-Irreducibles (E, I) Input : Implication set I = {A1 → B1 , ..., An → Bn } on E Output: M(ΦI ) 2 foreach e ∈ E do 3 max(e) = {E \ {e}} 4 end 5 foreach i ∈ 1..n do 6 foreach e ∈ E do 7 max0 (e) = {Z ∈ max(e) | Ai 6⊆ Z or Bi ⊆ Z} 8 max00 (e) = max(e) \ max0 (e) 9 foreach a ∈ Ai do 10 foreach X ∈ max00 (e) and Y ∈ max(a) do 11 T (e) = T (e) ∪ {X ∩ Y } 12 end 13 end 14 max(e) = {Z ∈ T (e) | Z is inclusion-maximal} 15 end 16 end S 17 M(ΦI ) = e∈E max(e) Algorithm 1 has a runtime exponential in the size of its output, which can itself be exponential in the size of the implication base. 3.2 Intersecting Sets Once the meet-irreducible elements of the lattice are known, we can compute the lower covers of a set S by intersecting it with the meet-irreducible sets that are not supersets of S and keeping the inclusion-maximal elements. Algorithm 2 runs in time polynomial in the size of M(ΦI ). 4 Computing with Transversals of Minimal Generators 4.1 Computing Minimal Generators We compute the minimal generators of an implication-closed set with Algorithms 3 and 4 proposed in [7]. Algorithm 2: Intersection of meet-irreducible elements Input : A set S and meet-irreducibles M(ΦI ) Output: The lower covers of S 1 C =∅ 2 foreach M ∈ M(ΦI ) do 3 C = C ∪ (S ∩ M ) 4 end 5 Return max(C) Algorithm 3: First minimal generator 1 MinGen (P, L) Input : Implications L on the set E and a subset P ⊆ E such that L(P ) = P Output: A minimal generator Q of P 2 Q←P 3 foreach m ∈ P do 4 if L(Q \ {m}) = P then 5 Q ← Q \ {m} 6 end 7 end Algorithm 3, given a set P and an implication set L, computes a first minimal generator of P . Algorithm 4 computes all the minimal generators of a set P for an implication set L. It has time complexity O(|L| × |G| × |P| × (|G| + |P|)). 4.2 Computing Minimal Transversals The problem of computing minimal transversals is a classic that, while exten- sively studied, still holds many interesting question [4, 5, 3]. It has been shown to be solvable in quasi-polynomial total time. Here, we chose to compute minimal transversals using the, arguably, most simple algorithm, Berge Multiplication [2]. Given two hypergraphs H1 and H2 , their edgewise union is defined as : H1 ∨ H2 = {h1 ∪ h2 | h1 ∈ H1 and h2 ∈ H2 } We then have T r(H1 ∪ H2 = min(T r(H1 ) ∨ T r(H2 )) Which gives rise to Algorithm 5. 5 Experimental Results We implemented both methods, used them on randomly generated implica- tional bases and compared their runtimes. Implications were generated by using Algorithm 4: All minimal generators Input : Implication set L on the attribute set E and an L-closed set P ⊆ E Output: All minimal generators G of P 1 G ← M inGen(P, L) 2 foreach G ∈ G do 3 foreach L → R ∈ L such that L ∪ R ∪ G ⊆ P do 4 S ← L ∪ (K \ R) 5 f lag ← true 6 foreach H ∈ G do 7 if H ⊆ S then 8 f lag ← f alse 9 end 10 end 11 if f lag then 12 G ← G ∪ {M inGen(S, L)} 13 end 14 end 15 end Algorithm 5: All minimal transversals Input : Hypergraph H Output: T r(H) 1 T =∅ 2 foreach E ∈ H do 3 T = min(T ∨ {{v} | v ∈ E}) 4 end 5 Return T NextClosure on formal contexts (O, A, R) randomly generated with 50 objects, 12 attributes and a probability d (called density) to have (o, a) ∈ R. Two cases were considered: – Single computations of lower covers – Depth-first searches in lattices involving multiple computations of lower cov- ers For the first case, we simply computed the lower covers of A. For the second case, we removed some implications as to introduce, in the lattice, sets that are not intents of the formal contexts and we performed the depth-first searches starting from A and going deeper everytime we encountered a non-intent. 5.1 Single Computations We generated 1000 contexts with varying numbers of implications and computed the lower covers of A using the two methods we presented. Figure 1 presents the quadratic interpolation of the runtimes for each method relative to the number of implications in the basis. Fig. 1. Runtimes for both algorithms relative to the number of implications on a single computation (Red = Meet-irreducibles, Blue = Minimal generators) In our experiments, the algorithm using meet-irreducible elements outper- formed the one based on minimal generators 79% of the time. Computing the meet-irreducible elements represents 98% of its runtime. The second method de- votes 75% of its time to computing minimal generators and 25% on minimal transversals. 5.2 Depth-first Searches As with single computations, we randomly generated 1000 contexts with varying numbers of implications and performed depth-first searches starting from A. Runtimes Relative to the Size of the Basis Figure 2 presents the interpo- lation of the runtimes for each methods relative to the number of implications in the basis. Fig. 2. Runtimes for both algorithms relative to the number of implications for depth- first searches (Red = Meet-irreducibles, Blue = Minimal generators) Once again, using meet-irreducibles is the most efficient method (it outper- formed the minimal generators 72% of the time) and this trend accentuates with the size of the implicational basis. Computing the meet-irreducible elements rep- resents 86% of the total runtime. The second methods devotes 69% of its runtime to the computation of minimal generators and 31% on minimal transversals. Runtimes Relative to the Depth Figure 3 presents the interpolation of the runtimes for each methods relative to the depth of the search. Fig. 3. Runtimes for both algorithms relative to the maximum depth of the search (Red = Meet-irreducibles, Blue = Minimal generators) We can notice that the algorithm using meet-irreducible elements is much less affected by the depth of the search, most likely due to the fact that most of the computation is done prior to the actual search. Interestingly, the algorithm using minimal generators is least efficient for relatively shallow searches. We believe this is due to the fact that deep searches imply that the lattice is mostly composed of sets that have the property we are looking for. As we used closedness relative to a formal context, this means that those lattices correspond to nearly empty implicational bases. Runtimes with Increasing Bases We used both methods to perform depth- first searches in a sequence of lattices corresponding to implicational bases ∅ ⊆ I1 ⊆ ... ⊆ In . Results are presented in Figure 4. Fig. 4. Runtimes for both algorithms relative to a growing implicational basis (x-axis = number of implications)(Red = meet-irreducibles, Blue = Minimal generators) Once again, the runtime of the algorithm using minimal generators skyrockets when the basis grows. The meet-irreducibles method presents much less variance. It should be noted that the meet-irreducible elements are computed from scratch for each basis when Algorithm 1 could be used to speed up the process by using the meet-irreducible elements of the previous lattice to compute the new ones. 6 Discussion Experimental results indicate that the most efficient way to compute lower covers of implication-closed sets is to first compute the meet-irreducible elements of the lattice and then intersect them. It outperforms the method based on computing minimal transversals of minimal generators even when minimal generators have to be computed only once. During deeper searches, minimal generators have to be computed at each step and the difference is thus more pronounced. While the algorithm we used for computing minimal transversals is not the most efficient, the fact that this computation represents only a small portion of the second method indicates that, even with the best algorithm, using meet- irreducible elements would be best. Computing these meet-irreducible elements is time-consuming but, once we have them, they are easy to intersect. We believe there is still more room for improvement on the problem of computing meet- irreducible elements from implicational bases. Acknowledgments The author was supported by the European Union and the French Région Auvergne - Rhône-Alpes. References 1. Mikhail A. Babin and Sergei O. Kuznetsov. Dualization in lattices given by ordered sets of irreducibles. Theoretical Computer Science, 2016. CoRR, abs/1504.01145, 2015. 2. Claude Berge. Hypergraphs: Combinatorics of Finite Sets, volume 45. Elsevier, 1984. 3. Thomas Eiter, Georg Gottlob, and Kazuhisa Makino. New Results on Monotone Du- alization and Generating Hypergraph Transversals. SIAM J. Comput., 32(2):514– 537, 2003. 4. Thomas Eiter, Kazuhisa Makino, and Georg Gottlob. Computational As- pects of Monotone Dualization: A Brief Survey. Discrete Applied Mathematics, 156(11):2035–2049, 2008. 5. Michael L. Fredman and Leonid Khachiyan. On the Complexity of Dualization of Monotone Disjunctive Normal Forms. J. Algorithms, 21(3):618–628, 1996. 6. Bernhard Ganter and Klaus Reuter. Finding all Closed Cets: A General Approach. Order, 8(3):283–290, 1991. 7. Miki Hermann and Baris Sertkaya. On the Complexity of Computing Generators of Closed Sets. In Raoul Medina and Sergei A. Obiedkov, editors, ICFCA, volume 4933 of Lecture Notes in Computer Science, pages 158–168. Springer, 2008. 8. Petr Krajca, Jan Outrata, and Vilém Vychodil. Advances in Algorithms Based on CbO. In Marzena Kryszkiewicz and Sergei A. Obiedkov, editors, CLA, volume 672 of CEUR Workshop Proceedings, pages 325–337. CEUR-WS.org, 2010. 9. Marcel Wild. Computations with Finite Closure Systems and Implications. In Ding-Zhu Du and Ming Li 0001, editors, COCOON, volume 959 of Lecture Notes in Computer Science, pages 111–120. Springer, 1995.