APRIORI, A Depth First Implementation

Walter A. Kosters
Leiden Institute of Advanced Computer Science
Universiteit Leiden
P.O. Box 9512, 2300 RA Leiden
The Netherlands
kosters@liacs.nl

Wim Pijls
Department of Computer Science
Erasmus University
P.O. Box 1738, 3000 DR Rotterdam
The Netherlands
pijls@few.eur.nl

Abstract

We will discuss DF, the depth first implementation of APRIORI as devised in 1999 (see [8]). Given a database, this algorithm builds a trie in memory that contains all frequent itemsets, i.e., all sets that are contained in at least minsup transactions from the original database. Here minsup is a threshold value given in advance. In the trie, which is constructed by adding one item at a time, every path corresponds to a unique frequent itemset. We describe the algorithm in detail, derive theoretical formulas, and provide experiments.
1 Introduction

In this paper we discuss the depth first (DF, see [8]) implementation of APRIORI (see [1]), one of the fastest known data mining algorithms to find all frequent itemsets in a large database, i.e., all sets that are contained in at least minsup transactions from the original database. Here minsup is a threshold value given in advance. There exist many implementations of APRIORI (see, e.g., [6, 11]). We would like to focus on algorithms that assume that the whole database fits in main memory, this often being the state of affairs; among these, DF and FP (the FP-growth implementation of APRIORI, see [5]) are the fastest. In most papers so far little attention has been given to theoretical complexity. In [3, 7] a theoretical basis for the analysis of these two algorithms was presented.

The depth first algorithm DF is simple and proceeds as follows. After some preprocessing, which involves reading the database and sorting the single items with respect to their support, DF builds a trie in memory, where every path from the root downwards corresponds to a unique frequent itemset; in consecutive steps items are added to this trie one at a time. Both the database and the trie are kept in main memory, which might cause memory problems: both are usually very large, and in particular the trie gets much larger as the support threshold decreases. Finally the algorithm outputs all paths in the trie, i.e., all frequent itemsets. Note that once completed, the trie allows for fast itemset retrieval in the case of online processing.

We formerly had two implementations of the algorithm, one being time efficient, the other being memory efficient (called dftime.cc and dfmemory.cc, respectively), where the time efficient version could not handle low support thresholds. The newest version (called dffast.cc) combines them into one even faster implementation, and runs for all support thresholds.

In this paper we first describe DF; we then give some formal definitions and theoretical formulas, discuss the program, provide experimental results, and conclude with some remarks.
2 The Algorithm

An appropriate data structure to store the frequent itemsets of a given database is a trie. As a running example in this section we use the dataset of Figure 1. Each line represents a transaction. The trie of frequent patterns is shown in Figure 2. The entries (or cells) in a node of a trie are usually called buckets, as is also the case for a hash-tree. Each bucket can be identified with its path to the root and hence with a unique frequent itemset. The example trie has 9 nodes and 18 buckets, representing 18 frequent itemsets. As an example, the frequent itemset ABEF can be seen as the leftmost path in the trie; a set such as AC is not present.

Dataset:

  transaction number   items
  1                    BCD
  2                    ABEF
  3                    ABEF
  4                    ABCF
  5                    ABCEF
  6                    CDEF

Frequent itemsets when minsup = 3:

  support   frequent itemsets
  5         B, F
  4         A, AB, AF, ABF, BF, C, E, EF
  3         AE, ABE, ABEF, AEF, BC, BE, BEF, CF

Figure 1. An example of a dataset along with its frequent itemsets.

Figure 2. An example of a trie (without support counts).

One of the oldest algorithms for finding frequent patterns is APRIORI, see [1]. This algorithm successively finds all frequent 1-itemsets, all frequent 2-itemsets, all frequent 3-itemsets, and so on. (A k-itemset has k items.) The frequent k-itemsets are used to generate candidate (k+1)-itemsets, where the candidates are only known to have two frequent subsets with k elements. After a pruning step, where candidates still having infrequent subsets are discarded, the support of the candidates is determined. The way APRIORI finds the frequent patterns implies that the trie is built layer by layer. First the nodes in the root (depth 0) are constructed, next the trie nodes at depth 1 are constructed, and so on. So, APRIORI can be thought of as an algorithm that builds the pattern trie in a breadth first way.

We propose an algorithm that builds the trie in a depth first way. We will explain the depth first construction of the trie using the dataset of Figure 1. Note that the trie grows from right to left. The algorithm proceeds as follows. In a preprocessing step, the support of each single item is counted and the infrequent items are eliminated. Let the m frequent items be denoted by I_1, I_2, ..., I_m. Next, the code from Figure 3 is executed.

  (1)  T := the trie including only bucket I_m;
  (2)  for k := m-1 downto 1 do
  (3)    T' := T;
  (4)    T := T' with I_k added to the left and
           a copy of T' appended to I_k;
  (5)    T'' := the subtrie rooted in I_k;
  (6)    count(I_k, T'');
  (7)    delete the infrequent itemsets from T'';

  (9)  procedure count(I_k, T''):
  (10)   for every transaction t including item I_k do
  (11)     for every itemset X in T'' do
  (12)       if t supports X then X.support++;

Figure 3. The algorithm.

The procedure count(I_k, T'') determines the support of each itemset (bucket) in the subtrie T''. This is achieved by a database pass, in which each transaction including item I_k is considered.
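To make the construction concrete, the following Python sketch (ours — the actual implementation is the C++ program dffast.cc; all names are hypothetical) follows Figure 3 literally on the running example of Figure 1, representing the trie as a nested dictionary:

```python
import copy

MINSUP = 3
TRANSACTIONS = ["BCD", "ABEF", "ABEF", "ABCF", "ABCEF", "CDEF"]
ITEMS = "ABCEF"  # the frequent single items I_1, ..., I_m (D is infrequent)

def count(transactions, root_item, node):
    # one database pass: push every transaction containing I_k through T''
    for t in transactions:
        if root_item in t:
            node["sup"] += 1
            push(t, node["children"])

def push(t, children):
    # a transaction only traverses a subtrie if it contains its root item
    for item, child in children.items():
        if item in t:
            child["sup"] += 1
            push(t, child["children"])

def prune(node, minsup):
    # line (7): delete the infrequent itemsets from T''
    node["children"] = {i: c for i, c in node["children"].items()
                        if c["sup"] >= minsup}
    for c in node["children"].values():
        prune(c, minsup)

def reset(children):
    for c in children.values():
        c["sup"] = 0
        reset(c["children"])

def df(transactions, items, minsup):
    last = items[-1]
    trie = {last: {"sup": sum(last in t for t in transactions),
                   "children": {}}}
    for item in reversed(items[:-1]):     # for k := m-1 downto 1
        subtrie = copy.deepcopy(trie)     # a copy of T' appended to I_k
        reset(subtrie)
        node = {"sup": 0, "children": subtrie}
        count(transactions, item, node)   # count(I_k, T'')
        prune(node, minsup)
        trie = {item: node, **trie}       # I_k added to the left
    return trie

def paths(children, prefix=""):
    # every path in the final trie is a frequent itemset
    for item, child in children.items():
        yield prefix + item, child["sup"]
        yield from paths(child["children"], prefix + item)

frequent = dict(paths(df(TRANSACTIONS, ITEMS, MINSUP)))
```

The 18 itemsets found, with their supports, match the table of Figure 1.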
Any such transaction is one at a time "pushed" through T'', where it only traverses a subtrie if it includes the root of this subtrie, meanwhile updating the support fields in the buckets. In the last paragraph of Section 4 a refinement of this part of the algorithm is presented. On termination of the algorithm, T exactly contains the frequent itemsets.

Figure 4 illustrates the consecutive steps of the algorithm applied to our example. The single items surpassing the minimum support threshold 3 are A, B, C, E and F. In the figure, the shape of T after each iteration of the for loop is shown. Also the infrequent itemsets to be deleted at the end of an iteration are mentioned. At the start of the iteration with index k, the root of trie T consists of the 1-itemsets I_{k+1}, ..., I_m. (We denote a 1-itemset by the name of its only item, omitting curly braces and commas, as in Figure 1 and Figure 4.) By the statement in line (3) from Figure 3, this trie may also be referred to as T'. A new trie T is composed by adding bucket I_k to the root and by appending a copy of T' (the former value of T) to I_k. The newly added buckets are the new candidates and they make up the subtrie T''. In Figure 4, the candidate set is in the left part of each trie and is drawn in bold.

The number of iterations in the for loop is one less than the number of frequent 1-itemsets. Consequently, the number of database passes is also one less than the number of frequent 1-itemsets. This causes the algorithm to be tractable only if the database under consideration is memory-resident.
Given the present-day memory sizes, this is not a real constraint any more.

As stated above, our algorithm has a preprocessing step which counts the support of each single item. After this preprocessing step, the items may be re-ordered. The most favorable execution time is achieved if we order the items by increasing frequency (see Section 3 for a more formal motivation). It is better to have low support at the top of the deeper side (to the left bottom) of the trie and hence high support at the top of the shallow part (to the upper right) of the trie.
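The re-ordering step just described might be sketched as follows (a Python sketch of ours, not the paper's code; the alphabetical tie-break is our choice):

```python
from collections import Counter

def order_items(transactions, minsup):
    """Select the frequent single items and sort them by increasing
    support; ties are broken alphabetically (an arbitrary choice)."""
    supp = Counter(item for t in transactions for item in set(t))
    frequent = [i for i in supp if supp[i] >= minsup]
    return sorted(frequent, key=lambda i: (supp[i], i))

# the running example of Figure 1: D (support 2) is eliminated,
# and the least frequent items come first
order = order_items(["BCD", "ABEF", "ABEF", "ABCF", "ABCEF", "CDEF"], 3)
```

Note that Figures 2 and 4 use the alphabetical order A, B, C, E, F instead; the algorithm is correct for any order of the frequent items, the ordering only influences the amount of work.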
We may distinguish between "dense" datasets and "sparse" datasets. A dense dataset has many frequent patterns of large size and high support, as is the case for test sets such as chess and mushroom (see Section 5). In those datasets, many transactions are similar to each other. Datasets with mainly short patterns are called sparse. Longer patterns may exist, but with relatively small support. Real-world transaction databases of supermarkets mostly belong to this category. Also the synthetic datasets from Section 5 have similar properties: interesting support thresholds are much lower than in the dense case.

Algorithms for finding frequent patterns may be divided into two types: algorithms with and without candidate generation. Any APRIORI-like instance belongs to the first type. Eclat (see [9]) may also be considered as an instance of this type. The FP-growth algorithm FP from [5] is the best-known instance of the second type (though one can also defend the point of view that it does generate candidates). For dense datasets, FP performs better than candidate generating algorithms: FP stores the dataset in a way that is very efficient especially when the dataset has many similar transactions. In the case of algorithms that do apply candidate generation, dense sets produce a large number of candidates; since each new candidate has to be related to each transaction, the database passes take a lot of time. However, for sparse datasets, candidate generation is a very suitable method for finding frequent patterns. In our experience, the instances of the APRIORI family are very useful when searching transaction databases. According to the results in [7] the depth first algorithm DF outperforms FP-growth on the synthetic transaction sets (see Section 5 for a description of these sets).

Finally, note that technically speaking DF is not a full implementation of APRIORI, since every candidate itemset is known to have only one frequent subset (resulting from the part of the trie which has already been completed) instead of two. Apart from this, its underlying candidate generation mechanism strongly resembles the one from APRIORI.

Figure 4. Illustrating the algorithm. (Shown is the shape of T after each iteration: after adding C, the itemsets CE and CEF are infrequent and hence deleted; after adding B, BCF is infrequent and hence deleted; after adding A, ABC, AC and ACF are infrequent and hence deleted. The final trie, after deleting the infrequent itemsets, is identical to Figure 2.)

3 Theoretical Complexity

Let n denote the number of transactions (also called customers), and let m denote the number of products (also called items). Usually n is much larger than m.
For a non-empty itemset X ⊆ {1, 2, ..., m} we define:

- supp(X) is the support of X: the number of customers that buy all products from X (and possibly more), or equivalently the number of transactions that contain X;
- min(X) is the smallest number in X;
- max(X) is the largest number in X.

In line with this we let min(∅) = m + 1. We also put max(∅) = 0 and supp(∅) = n. A set X ⊆ {1, 2, ..., m} is called frequent if supp(X) ≥ minsup, where the so-called support threshold minsup is a fixed number given in advance; let F denote the set of frequent itemsets. We assume every 1-itemset to be frequent; this can be effected by the first step of the algorithms we are looking at, which might be considered as preprocessing.

A "database query" is defined as a question of the form "Does customer c buy product i?" (or "Does transaction t have item i?"), posed to the original database. Note that we have n · m database queries in the "preprocessing" phase, in which the supports of the 1-itemsets are computed and ordered: every field of the database is inspected once. (By the way, the sorting, in which the items are assigned the numbers 1, 2, ..., m, takes O(m log m) time.) The number of database queries for DF equals:

    \sum_{|X| \ge 2,\; X \setminus \{\min(X)\} \in F} \mathrm{supp}(X \setminus \{\max(X)\})        (1)
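To make the definitions above concrete, the following sketch (ours; the frequent items of the running example A, B, C, E, F are renumbered 1, ..., 5) computes supp and enumerates the frequent sets by brute force:

```python
from itertools import combinations

def supp(transactions, x):
    """supp(X): the number of transactions containing X; supp({}) == n."""
    return sum(1 for t in transactions if x <= t)

# Figure 1 with items renumbered: A=1, B=2, C=3, E=4, F=5 (D is infrequent)
TS = [{2, 3}, {1, 2, 4, 5}, {1, 2, 4, 5}, {1, 2, 3, 5},
      {1, 2, 3, 4, 5}, {3, 4, 5}]
MINSUP = 3

# brute-force enumeration over all non-empty subsets of {1, ..., 5}
freq = [set(c) for r in range(1, 6)
        for c in combinations(range(1, 6), r)
        if supp(TS, set(c)) >= MINSUP]
```

The 18 sets found agree with Figure 1; e.g., supp({1, 2}) = supp(AB) = 4, and supp(∅) = n = 6.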
For a proof of Equation (1), see [3]. It relies on the fact that, in order for a bucket to occur in the trie, the path to it (except for the root) should be frequent, and on the observation that this particular bucket is "questioned" every time a transaction follows this same path. In [3] a simple version of FP is described in a similar style, leading to

    \sum_{|X| \ge 2,\; X \in F} \mathrm{supp}(X \setminus \{\min(X)\})        (2)

database queries in "local databases" (FP-trees), except for the preprocessing phase. Note the extra condition on the summation (which is "good" for FP: we have less summands there), while on the other hand the summands are larger (which is "good" for DF: we have a smaller contribution there).

It also makes sense to look at the total number of nodes of the trie during its construction, which is connected to the effort of maintaining and using the data structures. Counting each trie-node with the number of buckets it contains, the total is computed to be:

    \sum_{X \in F \cup \{\emptyset\}} (\min(X) - 1)        (3)

When the trie is finally ready, the number of remaining buckets equals the number of frequent sets, each item in a node being the end of the path that represents the corresponding itemset.

Notice that the complexity heavily depends on the sorting order of the items at the top level. It turns out that an increasing order of items is beneficial here. This is suggested by the contribution of the 1-itemsets in Equation (1):

    \sum_{k=1}^{m} (m - k) \cdot \mathrm{supp}(\{I_k\})        (4)

which happens to be minimal in that case. This 1-itemset contribution turns out to be the same for both DF and FP: see [3, 7], where also results for FP are presented in more detail.

4 Implementation Issues

In this section we discuss some implementation details of our program. As mentioned in Section 2, the database is traversed many times. It is therefore necessary that the database is memory-resident. Fortunately, only the occurrences of frequent items need to be stored. The database is represented by a two-dimensional boolean array. For efficiency reasons, one array entry corresponds to one bit. Since the function count in the algorithm considers the database transaction by transaction, a horizontal layout is chosen, cf. [4].
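A minimal sketch of such a horizontal bit layout (ours — here one Python integer serves as the row of bits for one transaction; in the C++ program this is a packed two-dimensional array):

```python
def encode(transactions, frequent_items):
    """Horizontal layout: one row per transaction, one bit per frequent item."""
    bit = {item: b for b, item in enumerate(frequent_items)}
    return [sum(1 << bit[i] for i in t if i in bit) for t in transactions]

def mask(itemset, frequent_items):
    bit = {item: b for b, item in enumerate(frequent_items)}
    return sum(1 << bit[i] for i in itemset)

def supp(rows, m):
    # a transaction supports an itemset iff its row contains all the bits
    return sum(1 for r in rows if r & m == m)

ROWS = encode(["BCD", "ABEF", "ABEF", "ABCF", "ABCEF", "CDEF"], "ABCEF")
```

On the running example this gives, e.g., supp(AB) = 4 and supp(ABEF) = 3, in agreement with Figure 1; the item D is simply not encoded.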
We have four preprocessing steps before the algorithm of Section 2 actually starts.

1. The range of the item values is determined. This is necessary, because some test sets, e.g., the BMS-WebView sets, have item values in quite a different range.

2. This is an essential initial step. First, for each item the support is counted. Next, the frequent items are selected and sorted by frequency. This process is relevant, since the frequency order also prescribes the order in the root of the trie, as stated before. The sorted frequent items along with their supports are retained in an array.

3. If a transaction has zero or one frequent item, it need not be stored into the memory-resident representation of the database: the root of the trie is constructed according to the information gathered in step 2, and for constructing the other buckets only transactions with at least two frequent items are relevant. In this step, we count the relevant transactions.

4. During this step the database is stored into a two-dimensional array with horizontal layout. Each item is given a new number, according to its rank in the frequency order. The length of the array equals the result of step 3; the width is determined by the number of frequent items.

After this preparatory work, which in practice usually takes a few seconds, the code as described in Section 2 is executed. The cells of the root are constructed using the result of initial step 2.

In line (12) from Figure 3 in Section 2, backtracking is applied to inspect each path of T''. Inspecting a path is aborted as soon as an item outside the current transaction is found. Obviously, processing one transaction during the count procedure is a relatively expensive task, which is unfortunately inevitable, whichever version of APRIORI is used.

As mentioned in the introduction, we used to have two implementations, one being time efficient, the other being memory efficient. These two have been used in the overall FIMI'03 comparisons. The newest implementation (called dffast.cc) combines these versions by using the following refinement. Instead of appending a copy of T' to I_k (see Figure 3 in Section 2), first the counting is done in auxiliary fields in the original T', after which only the frequent buckets are copied underneath I_k. This makes the deletion of infrequent itemsets (line (7) from Figure 3) unnecessary and leads to better memory management. Another improvement might be achieved by using more auxiliary fields while adding two root items simultaneously in each iteration, thereby halving the number of database passes at the cost of more bookkeeping.
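The refinement can be sketched as follows (a simplified Python version of ours, not the actual dffast.cc code): candidate supports are first accumulated in auxiliary fields of the existing trie, and only the frequent buckets are then copied under the new root item, so no deletion step is needed.

```python
MINSUP = 3
TRANSACTIONS = ["BCD", "ABEF", "ABEF", "ABCF", "ABCEF", "CDEF"]
ITEMS = "ABCEF"  # frequent single items, in the root order of the example

def aux_count(t, children):
    # walk the existing trie T', updating auxiliary counters only
    for item, node in children.items():
        if item in t:
            node["aux"] += 1
            aux_count(t, node["children"])

def copy_frequent(children, minsup):
    # copy only the buckets whose auxiliary count reached minsup,
    # resetting every auxiliary field on the way
    out = {}
    for item, node in children.items():
        sup, node["aux"] = node["aux"], 0
        sub = copy_frequent(node["children"], minsup)
        if sup >= minsup:
            out[item] = {"sup": sup, "aux": 0, "children": sub}
    return out

def build(transactions, items, minsup):
    trie = {}
    for item in reversed(items):
        for t in transactions:
            if item in t:
                aux_count(t, trie)        # one database pass, no copy yet
        node = {"sup": sum(item in t for t in transactions), "aux": 0,
                "children": copy_frequent(trie, minsup)}
        trie = {item: node, **trie}
    return trie

def itemsets(children, prefix=""):
    for item, node in children.items():
        yield prefix + item, node["sup"]
        yield from itemsets(node["children"], prefix + item)

FREQUENT = dict(itemsets(build(TRANSACTIONS, ITEMS, MINSUP)))
```

Infrequent candidates such as BCF are counted but never copied, so they never occupy trie memory.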
5 Experiments

Using the relatively small database chess (342 kB, with 3,196 transactions; available from the FIMI'03 website at http://fimi.cs.helsinki.fi/testdata.html), the database mushroom (570 kB, with 8,124 transactions; also available from the FIMI'03 website) and the well-known IBM-Almaden synthetic databases (see [2]), we shall examine the complexity of the algorithm. These databases have either few, but coherent, records (chess and mushroom), or many records (the synthetic databases). The parameters for generating a synthetic database are the number of transactions D (in thousands), the average transaction size T and the average length I of the so-called maximal potentially large itemsets. The number of items was set to 1,000, following the design in [2]. We use T10I4D100K (4.0 MB) and T40I10D100K (15.5 MB), both also available from the FIMI'03 website mentioned above; they both contain 100,000 transactions.

The experiments were conducted on a Pentium-IV machine with 512 MB memory at 2.8 GHz, running Red Hat Linux 7.3. The program was developed under the GNU C++ compiler, version 2.96.

The following statistics are plotted in the graphs: the execution time in seconds of the algorithm (see Section 4), and the total number of frequent itemsets; in all figures the corresponding axis is on the right hand side and scales 0–5,500,000 (0–8,000,000 for T10I4D100K). The execution time excludes preprocessing: in this phase the database is read three times in order to detect the frequent items (see before); also excluded is the time needed to print the resulting itemsets. These actions together usually only take a few seconds.

Figure 5. Experimental results for database chess (execution time of DF in seconds against relative support (%); number of frequent sets, in 1,000,000s, on the right axis).

Figure 6. Experimental results for database mushroom (execution time of DF in seconds against relative support (%); number of frequent sets, in 1,000,000s, on the right axis).
The number of frequent 1-itemsets (m from the previous sections, where we assumed all 1-itemsets to be frequent) has range 31–39 for the experiments on the database chess, 54–76 for mushroom, 844–869 for T10I4D100K and 610–862 for T40I10D100K. Note the very high support thresholds for mushroom (at least 5%) and chess (at least 44%); for T10I4D100K a support threshold as low as 0.003% was even feasible.

The largest output files produced are of size 110.6 MB (chess, minsup = 1,400, having 3,771,728 frequent sets with 13 frequent 17-itemsets), 121.5 MB (mushroom, minsup = 400, having 3,457,747 frequent sets with 24 frequent 17-itemsets), 131.1 MB (T10I4D100K, minsup = 3, having 6,169,854 frequent sets with 30 frequent 13-itemsets and 1 frequent 14-itemset) and 195.9 MB (T40I10D100K, minsup = 300, having 5,058,313 frequent sets, with 21 frequent 19-itemsets and 1 frequent 20-itemset). The final trie in the T40I10D100K case occupies approximately 65 MB of memory, the output file in this case being 3 times as large.

Note that the 3,457,747 sets for the chess database with minsup = 1,400 require 829 seconds to find, whereas the 3,771,728 frequent itemsets for mushroom with minsup = 400 take 158 seconds, differing approximately a factor 5 in time. This difference in runtime is probably caused by the difference in the absolute minsup value. Each cell corresponding to a frequent itemset is visited at least 1,400 times in the former case against 400 times in the latter case. A similar phenomenon is observed when comparing T40I10D100K with absolute minsup value 300 and T10I4D100K with minsup = 3: this takes 378 versus 88 seconds. Although the outputs have the same orders of magnitude, the runtimes differ substantially. We see that, besides the number of frequent itemsets and the sizes of these sets, the absolute minsup value is a major factor determining the runtime.

Figure 7. Experimental results for database T10I4D100K (execution time of DF in seconds against relative support (%); number of frequent sets, in 1,000,000s, on the right axis).

Figure 8. Experimental results for database T40I10D100K (execution time of DF in seconds against relative support (%); number of frequent sets, in 1,000,000s, on the right axis).

6 Conclusions

In this paper, we addressed DF, a depth first implementation of APRIORI. In our experience, DF competes with any other well-known algorithm, especially when applied to large transaction databases.

Storing the database in primary memory is no longer a problem. On the other hand, storing the candidates causes trouble in situations where a dense database is considered with a small support threshold. This is the case for any algorithm using candidates.
Therefore, it would be desirable to look for a method which stores candidates in secondary memory. This is an obvious topic for future research. To our knowledge, FP is the only algorithm that can cope with memory limitations. However, for real world retail databases this algorithm is surpassed by DF, as we showed in [7]. Other optimizations might also be possible. Besides improving the C++ code, ideas from, e.g., [10] on diffsets with vertical layouts might be used.

Our conclusion is that DF is a simple, practical, straightforward and fast algorithm for finding all frequent itemsets.

References

[1] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A.I. Verkamo. Fast discovery of association rules. In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pages 307–328. AAAI/MIT Press, 1996.
[2] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In J.B. Bocca, M. Jarke, and C. Zaniolo, editors, Proceedings 20th International Conference on Very Large Data Bases (VLDB), pages 487–499. Morgan Kaufmann, 1994.

[3] J.M. de Graaf, W.A. Kosters, W. Pijls, and V. Popova. A theoretical and practical comparison of depth first and FP-growth implementations of Apriori. In H. Blockeel and M. Denecker, editors, Proceedings of the Fourteenth Belgium-Netherlands Artificial Intelligence Conference (BNAIC 2002), pages 115–122, 2002.

[4] B. Goethals. Survey on frequent pattern mining. Manuscript, Helsinki, 2003.

[5] J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proceedings 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD'00), pages 1–12, 2000.

[6] J. Hipp, U. Güntzer, and G. Nakhaeizadeh. Mining association rules: Deriving a superior algorithm by analyzing today's approaches. In D.A. Zighed, J. Komorowski, and J. Żytkow, editors, Principles of Data Mining and Knowledge Discovery, Proceedings of the 4th European Conference (PKDD 2000), Springer Lecture Notes in Computer Science 1910, pages 159–168. Springer Verlag, 2000.

[7] W.A. Kosters, W. Pijls, and V. Popova. Complexity analysis of depth first and FP-growth implementations of Apriori. In P. Perner and A. Rosenfeld, editors, Machine Learning and Data Mining in Pattern Recognition, Proceedings MLDM 2003, Springer Lecture Notes in Artificial Intelligence 2734, pages 284–292. Springer Verlag, 2003.

[8] W. Pijls and J.C. Bioch. Mining frequent itemsets in memory-resident databases. In E. Postma and M. Gyssens, editors, Proceedings of the Eleventh Belgium-Netherlands Conference on Artificial Intelligence (BNAIC 1999), pages 75–82, 1999.

[9] M.J. Zaki. Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering, 12:372–390, 2000.

[10] M.J. Zaki and K. Gouda. Fast vertical mining using diffsets. In Proceedings 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[11] Z. Zheng, R. Kohavi, and L. Mason. Real world performance of association rule algorithms. In F. Provost and R. Srikant, editors, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001), pages 401–406, 2001.