AIM: Another Itemset Miner

Amos Fiat, Sagi Shporer
School of Computer Science, Tel-Aviv University
Tel Aviv, Israel
{fiat, shporer}@tau.ac.il

Abstract

We present a new algorithm for mining frequent itemsets. Past studies have proposed various algorithms and techniques for improving the efficiency of the mining task. We integrate a combination of these techniques into an algorithm which utilizes them dynamically according to the input dataset. The main features of the algorithm include depth first search with a vertical compressed database, diffsets, parent equivalence pruning, dynamic reordering and projection. Experimental testing suggests that our algorithm and implementation significantly outperform existing algorithms/implementations.

1. Introduction

Finding association rules is one of the driving applications in data mining, and much research has been done in this field [10, 7, 4, 6]. Using the support-confidence framework, proposed in the seminal paper [1], the problem is split into two parts: (a) finding frequent itemsets, and (b) generating association rules.

Let I be a set of items. A subset X ⊆ I is called an itemset. Let D be a transactional database, where each transaction T ∈ D is a subset of I: T ⊆ I. For an itemset X, support(X) is defined to be the number of transactions T for which X ⊆ T. For a given parameter minsupport, an itemset X is called a frequent itemset if support(X) ≥ minsupport. The set of all frequent itemsets is denoted by F.

The remainder of this paper is organized as follows. Section 2 contains a short review of related work. In Section 3 we describe the AIM-F algorithm. Section 4 contains experimental results. In Section 5 we conclude this short abstract with a discussion.
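To make the definitions above concrete, here is a small illustrative sketch (in Python, with made-up item names; it is not part of the implementation discussed in this paper) that counts support over a toy transactional database:

D = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def support(itemset, database):
    # Number of transactions T with itemset a subset of T.
    return sum(1 for T in database if itemset <= T)

minsupport = 2
print(support({"bread", "milk"}, D) >= minsupport)            # True: frequent
print(support({"bread", "butter", "milk"}, D) >= minsupport)  # False: support is only 1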
1.1. Contributions of this paper

We combine several pre-existing ideas in a fairly straightforward way and get a new frequent itemset mining algorithm. In particular, we combine the sparse vertical bit vector technique with the difference sets technique of [14], thus reducing the computation time when compared with [14]. The various techniques are applied dynamically according to the input dataset, thus utilizing the advantages and avoiding the drawbacks of each technique.

Experimental results suggest that for a given level of support, our algorithm/implementation is faster than the other algorithms with which we compare ourselves. This set includes the dEclat algorithm of [14], which seems to be the fastest amongst the others.

2. Related Work

Since the introduction of the Apriori algorithm by [1, 2], many variants have been proposed to reduce time, I/O and memory.

Apriori uses a breadth-first, bottom-up approach to generate frequent itemsets, i.e., it constructs (i+1)-item frequent itemsets from i-item frequent itemsets. The key observation behind Apriori is that all subsets of a frequent itemset must be frequent. This suggests a natural approach to generating frequent itemsets. The breakthrough with Apriori was that the number of itemsets explored was polynomial in the number of frequent itemsets. In fact, on a worst case basis, Apriori explores no more than n itemsets to output a frequent itemset, where n is the total number of items.

Subsequent to the publication of [1, 2], a great many variations and extensions were considered [3, 7, 13]. In [3] the number of passes over the database was reduced. [7] tried to reduce the search space by combining bottom-up and top-down search: if a set is infrequent then so are its supersets, so infrequent itemsets found during the top-down search can be pruned away. [13] uses equivalence classes to skip levels in the search space. A new mining technique, FP-Growth, proposed in [12], is based upon representing the dataset itself as a tree; the mining is then performed from the tree representation.

We build upon several ideas appearing in previous work, a partial list of which is the following:

• Vertical bit vectors [10, 4] - The dataset is stored in vertical bit vectors. Experimentally, this has been shown to be very effective.

• Projection [4] - A technique to reduce the size of vertical bit vectors by trimming the bit vector to include only the transactions relevant to the subtree currently being searched.

• Difference sets [14] - Instead of holding the entire tidset at any given time, Diffsets suggest that only the changes in the tidsets are needed to compute the support.

• Dynamic reordering [6] - A heuristic for reducing the search space: dynamically changing the order in which the search space is traversed. This attempts to rearrange the search space so that infrequent itemsets can be pruned earlier rather than later.

• Parent equivalence pruning [4, 13] - Skipping levels in the search space when a certain item added to the itemset contributes no new information.

To the best of our knowledge no previous implementation makes use of this combination of ideas, and some of these ideas are non-trivial to combine. For example, projection has never previously been used with difference sets, and doing so requires some new observations as to how the two elements interact.

We should add that there is a wide variety of other techniques introduced over time to find frequent itemsets, which we do not make use of. A very partial list of these other ideas is:

• Sampling - [11] suggests searching over a sample of the dataset, and later validating the results using the entire dataset. This technique was shown to generate the vast majority of frequent itemsets.

• Adjusting support - [9] introduces SLPMiner, an algorithm which lowers the support threshold as the itemsets grow larger during the search. This attempts to avoid the problem of generating small itemsets which are unlikely to grow into large itemsets.

Figure 1. Full lexicographic tree of 3 items (nodes {:123}, {1:23}, {2:3}, {3:}, {12:3}, {13:}, {23:}, {123:})

3. The AIM-F algorithm

In this section we describe the building blocks that make up the AIM-F algorithm. High level pseudo code for the AIM-F algorithm appears in Figure 7.

3.1. Lexicographic Trees

Let < be some lexicographic order on the items in I such that for every two items i and j, i ≠ j, either i < j or i > j. Every node n of the lexicographic tree has two fields: n.head, which is the itemset that node n represents, and n.tail, which is a list of items, the possible extensions to n.head. A node of the lexicographic tree has a level. Itemsets for nodes at level k contain k items; we will also say that such itemsets have length k. For the root (level 0) node, n.head is empty and n.tail = I. Figure 1 is an example of the lexicographic tree for 3 items. The use of lexicographic trees for itemset generation was proposed by [8].

3.2. Depth First Search Traversal

In the course of the algorithm we traverse the lexicographic tree in depth-first order. At node n, for every element α in the node's tail, a new node n' is generated such that n'.head = n.head ∪ α and n'.tail = n.tail − α. After the generation of n', α is removed from n.tail, as it will no longer be needed (see Figure 3). Several pruning techniques, on which we elaborate later, are used in order to speed up this process.

DFS(n : node)
(1) t = n.tail
(2) while t ≠ ∅
(3)   Let α be the first item in t
(4)   remove α from t
(5)   n'.head = n.head ∪ α
(6)   n'.tail = t
(7)   DFS(n')

Figure 3. Simple DFS
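As a minimal illustration of the traversal of Figure 3 (a sketch only, with names of our own choosing; it is not the paper's implementation), the following Python function enumerates every node of the lexicographic tree, passing its head to a visitor:

def dfs(head, tail, visit):
    # head: current itemset (a tuple), tail: items that may still extend it
    visit(head)
    while tail:
        alpha, tail = tail[0], tail[1:]    # remove alpha from the tail
        # child node: head' = head + alpha, tail' = remaining tail
        dfs(head + (alpha,), tail, visit)

# Visits (), (1,), (1,2), (1,2,3), (1,3), (2,), (2,3), (3,)
dfs((), (1, 2, 3), print)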
Apriori(n : node, minsupport : integer)
(1) t = n.tail
(2) while t ≠ ∅
(3)   Let α be the first item in t
(4)   remove α from t
(5)   n'.head = n.head ∪ α
(6)   n'.tail = t
(7)   if (support(n'.head) ≥ minsupport)
(8)     Report n'.head as a frequent itemset
(9)     Apriori(n')

Figure 4. Apriori

PEP(n : node, minsupport : integer)
(1) t = n.tail
(2) while t ≠ ∅
(3)   Let α be the first item in t
(4)   remove α from t
(5)   n'.head = n.head ∪ α
(6)   n'.tail = t
(7)   if (support(n'.head) = support(n.head))
(8)     add α to the list of items removed by PEP
(9)   else if (support(n'.head) ≥ minsupport)
(10)    Report n'.head × {all subsets of items removed by PEP} as frequent itemsets
(11)    PEP(n')

Figure 5. PEP

3.3 Vertical Sparse Bit-Vectors

A comparison between horizontal and vertical database representations done in [10] shows that the representation of the database has a high impact on the performance of the mining algorithm. In a vertical database the data is represented as a list of items, where every item holds a list of the transactions in which it appears.

The list of transactions held by every item can be represented in many ways. In [13] the list is a tid-list, while [10, 4] use vertical bit vectors. Because the data tends to be sparse, vertical bit vectors hold many "0" entries for every "1", thus wasting memory and CPU time when processing the information. In [10] the vertical bit vector is compressed using an encoding called skinning, which shrinks the size of the vector.

We choose to use a sparse vertical bit vector. Every such bit vector is built from two arrays - one for values and one for indexes. The index array gives the position in the vertical bit vector, and the value array gives the value at that position; see Figure 8. The index array is sorted, to allow fast AND operations between two sparse bit vectors in a manner similar to the AND operation between tid-lists. Empty values are thrown away during the AND operation, saving space and computation time.
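The following Python sketch (our illustration; the class name and word size are assumptions, not the paper's code) shows one possible form of such a sparse vertical bit vector, with parallel index/value arrays and an AND that merges the sorted index arrays and drops words that become zero:

class SparseBitVector:
    """index[k] is a word offset into the full bit vector, value[k] its nonzero bits."""
    def __init__(self, index, value):
        self.index = index   # sorted word offsets
        self.value = value   # corresponding nonzero machine words

    def AND(self, other):
        idx, val = [], []
        i = j = 0
        while i < len(self.index) and j < len(other.index):
            if self.index[i] < other.index[j]:
                i += 1
            elif self.index[i] > other.index[j]:
                j += 1
            else:
                w = self.value[i] & other.value[j]
                if w:                      # words that become empty are dropped
                    idx.append(self.index[i])
                    val.append(w)
                i += 1
                j += 1
        return SparseBitVector(idx, val)

# With 64-bit words: item A appears in transactions {0, 1, 65},
# item B in {1, 2, 65, 70}; their AND covers the common transactions {1, 65}.
a = SparseBitVector([0, 1], [0b11, 0b10])
b = SparseBitVector([0, 1], [0b110, 0b1000010])
c = a.AND(b)    # c.index == [0, 1], c.value == [0b10, 0b10]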
3.3.1 Bit-vector projection

In [4], a technique called projection was introduced. Projection is a sparse bit vector compression technique specifically useful in the context of mining frequent itemsets. The idea is to eliminate redundant zeros in the bit-vector: for an itemset P, all the transactions which do not include P are removed, leaving a vertical bit vector containing only 1s. For every itemset generated from P (a superset of P), P ∪ X, all the transactions removed from P's vector are also removed. In this way all the extraneous zeros are eliminated.

The projection is done directly from the vertical bit representation. At initialization, a two-dimensional matrix of 2^w by 2^w entries is created, where w is the word length or some smaller value that we choose to work with. Every entry (i,j) is calculated to be the projection of j on i (thus covering all possible projections of a single word). For every row of the matrix, the number of bits being projected is constant (a row represents the word being projected upon).

Projection is then done by traversing both the vector to project upon, p, and the vector to be projected, v. For every word index we compute the projection by table lookup, and the resulting bits are concatenated together. Thus, computing the projection takes no longer than the AND operation between two compressed vertical bit lists.

Project(p : vector, v : vector)
/* p - vector to be projected upon
   v - vector being projected */
(1) t = Empty Vector
(2) i = 0
(3) for each nonzero bit in p, at offset j, in ascending order of offsets:
(4)   Set the i'th bit of the target vector t to be the j'th bit of v
(5)   i = i + 1
(6) return t

Figure 2. Projection

In [4] projection is used whenever a rebuilding threshold is reached. Our tests show that because we use sparse bit vectors anyway, the gain from projection is smaller, and the highest gains occur when projection is applied only when calculating the 2-itemsets from the 1-itemsets. This is also because of the penalty of using projection together with diffsets, described later, for large k-itemsets. Even so, projection is used only if the sparse bit vector will shrink significantly; as a threshold we set 10% - if the sparse bit vector contains less than 10% '1's, it is projected.

3.3.2 Counting and support

To count the number of ones within a sparse bit vector, one can hold a translation table of 2^w values, where w is the word length. Counting the ones in a word then requires only one memory access, to the translation table. This idea first appeared in the context of frequent itemsets in [4].

DynamicReordering(n : node, minsupport : integer)
(1) t = n.tail
(2) for each α in t
(3)   Compute sα = support(n.head ∪ α)
(4) Sort the items α in t by sα in ascending order
(5) while t ≠ ∅
(6)   Let α be the first item in t
(7)   remove α from t
(8)   n'.head = n.head ∪ α
(9)   n'.tail = t
(10)  if (support(n'.head) ≥ minsupport)
(11)    Report n'.head as a frequent itemset
(12)    DynamicReordering(n')

Figure 6. Dynamic Reordering

AIM-F(n : node, minsupport : integer)
/* Uses DFS traversal of the lexicographic itemset tree
   Fast computation of small frequent itemsets for sparse datasets
   Uses difference sets to compute support
   Uses projection and bit vector compression
   Makes use of parent equivalence pruning
   Uses dynamic reordering */
(1) t = n.tail
(2) for each α in t
(3)   Compute sα = support(n.head ∪ α)
(4)   if (sα = support(n.head))
(5)     add α to the list of items removed by PEP
(6)     remove α from t
(7)   else if (sα < minsupport)
(8)     remove α from t
(9) Sort the items in t by sα in ascending order
(10) while t ≠ ∅
(11)   Let α be the first item in t
(12)   remove α from t
(13)   n'.head = n.head ∪ α
(14)   n'.tail = t
(15)   Report n'.head × {all subsets of items removed by PEP} as frequent itemsets
(16)   AIM-F(n')

Figure 7. AIM-F

3.4 Diffsets

Difference sets (Diffsets), proposed in [14], are a technique to reduce the size of the intermediate information needed in the traversal when using a vertical database. Using Diffsets, only the differences between a candidate and its generating itemsets are calculated and stored (if necessary). With this method the intermediate vertical bit-vectors in every step of the DFS traversal are shorter, which results in faster intersections between those vectors.

Let t(P) be the tidset of P. The Diffset d(PX) is the tidset of tids that are in t(P) but not in t(PX); formally, d(PX) = t(P) − t(PX) = t(P) − t(X). By definition support(PXY) = support(PX) − |d(PXY)|, so only d(PXY) needs to be calculated. However, d(PXY) = d(PY) − d(PX), so the Diffset of every candidate can be calculated from its generating itemsets.

Diffsets have one major drawback: in datasets where the support drops rapidly from k-itemsets to (k+1)-itemsets, the size of d(PX) can be larger than the size of t(PX) (for an example see Figure 9). In such cases the usage of diffsets should be delayed (in the depth of the DFS traversal) until the k-itemsets where the rapid drop in support stops. Theoretically the break-even point is 50%, i.e., |t(PX)| / |t(P)| = 0.5, where the size of d(PX) equals the size of t(PX); however, experiments show only small differences for any threshold value between 10% and 50%. For this algorithm we used 50%.

Diffsets and Projection: As d(PXY) is not a subset of d(PX), Diffsets cannot be used directly for projection. Instead, we notice that d(PXY) ⊆ t(PX) and t(PX) = t(P) − d(PX). Here d(PX) is known, and t(P) can be calculated in the same way. For example, t(ABCD) = t(ABC) − d(ABCD), t(ABC) = t(AB) − d(ABC) and t(AB) = t(A) − d(AB), thus t(ABCD) = t(A) − d(AB) − d(ABC) − d(ABCD). Using this formula, t(PX) can be calculated from the intermediate data along the DFS trail. As the DFS goes deeper, the penalty of calculating the projection this way becomes higher.
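The following sketch (an illustration with our own toy data, not the paper's code) traces the diffset identities support(PXY) = support(PX) − |d(PXY)| and d(PXY) = d(PY) − d(PX) with tidsets held as Python sets, for the prefix P = A and extensions X = B, Y = C:

# Tidsets of the 1-itemsets A, B, C (toy data).
t = {"A": {1, 2, 3, 4, 5}, "B": {1, 2, 3, 5}, "C": {2, 3, 5}}

# Diffsets of AB and AC with respect to their common generator A.
d_AB = t["A"] - t["B"]                  # {4}
d_AC = t["A"] - t["C"]                  # {1, 4}
support_AB = len(t["A"]) - len(d_AB)    # 4, i.e. |t(AB)| = |{1, 2, 3, 5}|

# Diffset of ABC computed from its generating itemsets AB and AC.
d_ABC = d_AC - d_AB                     # {1}
support_ABC = support_AB - len(d_ABC)   # 3, i.e. |t(ABC)| = |{2, 3, 5}|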
Figure 8. Sparse Bit-Vector data structure

Figure 9. Diffset threshold

3.5 Pruning Techniques

3.5.1 Apriori

Proposed by [2], the Apriori pruning technique is based on the monotonicity property of support: support(P) ≥ support(PX), as PX is contained in no more transactions than P. Therefore, if for an itemset P, support(P) < minsupport, the support of any extension of P will also be lower than minsupport, and the subtree rooted at P can be pruned from the lexicographic tree. See Figure 4 for pseudo code.

3.5.2 Parent Equivalence Pruning (PEP)

This is a pruning method based on the following property: if support(n.head) = support(n.head ∪ α), then all the transactions that contain n.head also contain n.head ∪ α. Thus, α can be moved from the tail to the head, saving the traversal of the intermediate node and skipping directly to the extended itemset. This method was described by [4, 13]. Later, when the frequent itemsets are generated, the items which were moved from the tail to the head must be taken into account when listing all frequent itemsets. For example, if k items were pruned by PEP during the DFS traversal of frequent itemset X, then all 2^k subsets of those k items can be added to X without reducing the support, creating 2^k new frequent itemsets. See Figure 5 for pseudo code.

3.6 Dynamic Reordering

To increase the chance of early pruning, nodes are traversed not in lexicographic order but in an order determined by support. This technique was introduced by [6]. Instead of lexicographic order we reorder the children of a node as follows. At node n, for all α in the tail we compute sα = support(n.head ∪ α), and the items are sorted by sα in increasing order. Items α in n.tail for which support(n.head ∪ α) < minsupport are trimmed away. This way, the rest of the subtree benefits from a shortened tail. Items with smaller support, which are heuristically "likely" to be pruned earlier, are traversed first. See Figure 6 for pseudo code.

3.7 Optimized Initialization

In sparse datasets, computing the frequent 2-itemsets can be done more efficiently than by performing n^2 itemset intersections. We use a method similar to the one described in [13]: as a preprocessing step, for every transaction in the database, all 2-itemsets are counted and stored in an upper-triangular matrix of dimensions n × n. This step may take up to O(n^2) operations per transaction. However, as this is done only for sparse datasets, experimentally one sees that the number of operations is small. After this initialization step, we are left with the frequent 2-itemsets, from which we can start the DFS procedure.
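A minimal sketch of this initialization (ours, with illustrative names; a dictionary keyed by ordered pairs stands in for the upper-triangular matrix): each transaction contributes one count for every pair of its items, and only pairs reaching minsupport seed the DFS:

from collections import defaultdict
from itertools import combinations

def frequent_2_itemsets(database, minsupport):
    counts = defaultdict(int)           # counts[(a, b)] with a < b
    for T in database:
        for a, b in combinations(sorted(T), 2):
            counts[(a, b)] += 1
    return {pair for pair, c in counts.items() if c >= minsupport}

D = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 2, 3}]
print(frequent_2_itemsets(D, 3))        # {(1, 2), (2, 3)}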
4. Experimental Results

The experiments were conducted on an Athlon 1.2GHz with 256MB DDR RAM running Microsoft Windows XP Professional. All algorithms were compiled with VC 7. In the experiments described herein we only count the frequent itemsets; we do not create output.

We used five datasets to evaluate the algorithms' performance. These datasets were studied extensively in [13].

1. connect - A database of game states in the game connect 4.
2. chess - A database of game states in chess.
3. mushroom - A database with information about various mushroom species.
4. pumsb* - This dataset was derived from the pumsb dataset and describes census data.
5. T10I4D100K - Synthetic dataset.

The first 3 datasets were taken from the UC Irvine ML Database Repository (http://www.ics.uci.edu/~mlearn/MLRepository). The synthetic dataset was created by the IBM Almaden synthetic data generator (http://www.almaden.ibm.com/cs/quest/demos.html).

4.1 Comparing Data Representation

We compare the memory requirements of the sparse vertical bit vector (with the projection described earlier) versus the standard tid-list. For every itemset length, the total memory requirement of all tid-sets is given in Figures 10, 11 and 12. We do not consider itemsets removed by PEP.

Figure 10. Chess - support 2000 (65%)

Figure 11. Connect - support 50000 (75%)

Figure 12. T10I4D100K - support 100 (0.1%)

As the figures show, our sparse vertical bit vector representation requires less memory than the tid-list for the dense datasets (chess, connect). However, for the sparse dataset (T10I4D100K) the sparse vertical bit vector representation requires up to twice as much memory as the tid-list. Tests in which we moved dynamically from the sparse vertical bit vector representation to tid-lists showed no significant improvement in performance; however, this should be verified carefully in further experiments.

4.2 Comparing The Various Optimizations

We analyze the influence of the various optimization techniques on the performance of the algorithm. We first run the final algorithm on a given dataset, and then repeat the task with a single change in the algorithm, thus trying to isolate the influence of each optimization technique, as shown in Figures 13 and 14.

As the graphs show, the behavior differs considerably between the datasets. In the dense dataset, Connect, the various techniques had a tremendous effect on performance. PEP, dynamic reordering and diffsets behaved in a similar manner, and the performance improvement factor gained by each of them increased as the support dropped. On the other hand, the sparse bit vector gives a constant improvement factor over the tid-list for all the tested support values, and projection gives only a minor improvement.

In the second figure, for the sparse dataset T10I4D100K, the behavior is different. PEP gives no improvement, as can be expected in a sparse dataset, since every single item has low support and adding an item rarely preserves the parent's support. The support drops sharply from k-itemsets to (k+1)-itemsets because of the low support, therefore diffsets also have no impact, and the same goes for projection. A large gain in performance comes from the optimized initialization; however, this gain is an additive constant rather than a factor. Last is the dynamic reordering, which contributes to early pruning much as in the dense dataset.
Figure 13. Influence of the various optimizations on the Connect dataset mining

Figure 14. Influence of the various optimizations on the T10I4D100K dataset mining

4.3 Comparing Mining Algorithms

For comparison, we used implementations of:

1. Apriori [2] - horizontal database, BFS traversal of the candidate tree.
2. FPgrowth [5] - tree projected database, searching for frequent itemsets directly without candidate generation.
3. dEclat [13] - vertical database, DFS traversal using diffsets.

All of the above algorithm implementations were provided by Bart Goethals (http://www.cs.helsinki.fi/u/goethals/) and were used for comparison with the AIM-F implementation.

Figures 15 to 19 give experimental results for the various algorithms and datasets. Not surprisingly, Apriori [2] generally has the lowest performance amongst the algorithms compared, and in some cases its running time could not be measured as it did not finish even at the highest level of support checked. For these datasets, and compared with the specific algorithms and implementations described above, our algorithm/implementation, AIM-F, seemingly outperforms all others.

In general, for the dense datasets (Chess, Connect, Pumsb* and Mushroom; Figures 15, 16, 17 and 18 respectively), the sparse bit vector gives AIM-F an order of magnitude improvement over dEclat. The diffsets give dEclat and AIM-F another order of magnitude improvement over the rest of the algorithms.

Figure 15. Chess dataset

Figure 16. Connect dataset

Figure 17. Pumsb* dataset

Figure 18. Mushroom dataset

Figure 19. T10I4D100K dataset

For the sparse dataset T10I4D100K (Figure 19), the optimized initialization gives AIM-F a head start, which is combined at the lower supports with the advantage of the sparse vertical bit vector (see details in Figure 14).

5. Afterword

This paper presents a new frequent itemset mining algorithm, AIM-F. The algorithm is based upon a mixture of previously used techniques combined dynamically. It seems to behave quite well experimentally.

References

[1] R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In SIGMOD, pages 207–216, 1993.

[2] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In J. B. Bocca, M. Jarke, and C. Zaniolo, editors, Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pages 487–499. Morgan Kaufmann, 12–15 1994.

[3] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. Dynamic itemset counting and implication rules for market basket data. In SIGMOD, pages 255–264, 1997.

[4] D. Burdick, M. Calimlim, and J. Gehrke. MAFIA: A maximal frequent itemset algorithm for transactional databases. In ICDE, 2001.

[5] J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In SIGMOD, pages 1–12, 2000.

[6] R. J. Bayardo Jr. Efficiently mining long patterns from databases. In SIGMOD, pages 85–93, 1998.

[7] D.-I. Lin and Z. M. Kedem. Pincer search: A new algorithm for discovering the maximum frequent set. In EDBT'98, volume 1377 of Lecture Notes in Computer Science, pages 105–119, 1998.

[8] R. Rymon. Search through systematic set enumeration. In KR-92, pages 539–550, 1992.

[9] M. Seno and G. Karypis. SLPMiner: An algorithm for finding frequent sequential patterns using length decreasing support constraint. In ICDE, 2002.

[10] P. Shenoy, J. R. Haritsa, S. Sudarshan, G. Bhalotia, M. Bawa, and D. Shah. Turbo-charging vertical mining of large databases. In SIGMOD, 2000.
[11] H. Toivonen. Sampling large databases for association rules. In VLDB, pages 134–145, 1996.

[12] S. Yen and A. Chen. An efficient approach to discovering knowledge from large databases. In 4th International Conference on Parallel and Distributed Information Systems.

[13] M. J. Zaki. Scalable algorithms for association mining. Knowledge and Data Engineering, 12(2):372–390, 2000.

[14] M. J. Zaki and K. Gouda. Fast vertical mining using diffsets. Technical Report 01-1, RPI, 2001.