<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AIM: Another Itemset Miner</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name><given-names>Amos</given-names> <surname>Fiat</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Sagi</given-names> <surname>Shporer</surname></string-name>
        </contrib>
        <aff id="aff0">
          <institution>School of Computer Science, Tel-Aviv University</institution>, Tel Aviv,
          <country country="IL">Israel</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a new algorithm for mining frequent itemsets. Past studies have proposed various algorithms and techniques for improving the efficiency of the mining task. We integrate a combination of these techniques into an algorithm which utilizes them dynamically according to the input dataset. The algorithm's main features include depth first search with a vertical compressed database, diffsets, parent equivalence pruning, dynamic reordering and projection. Experimental testing suggests that our algorithm and implementation significantly outperform existing algorithms/implementations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Finding association rules is one of the driving
applications in data mining, and much research has been
done in this field [10, 7, 4, 6]. Using the
support-confidence framework, proposed in the seminal paper
of [1], the problem is split into two parts — (a) finding
frequent itemsets, and (b) generating association rules.</p>
      <p>Let I be a set of items. A subset X ⊆ I is called
an itemset. Let D be a transactional database, where
each transaction T ∈ D is a subset of I: T ⊆ I. For an
itemset X, support(X) is defined to be the number of
transactions T for which X ⊆ T. For a given parameter
minsupport, an itemset X is called a frequent itemset
if support(X) ≥ minsupport. The set of all frequent
itemsets is denoted by F.</p>
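      <p>As a concrete illustration of these definitions, the following
minimal Python sketch (ours, not part of AIM-F; the tiny dataset is
made up) computes support(X) by brute force and enumerates F. It is
exponential in |I| and meant only to make the notation concrete.</p>
      <preformat>
# Brute-force illustration of support(X) and F; not the AIM-F code.
from itertools import combinations

def support(X, D):
    """Number of transactions T in D for which X is a subset of T."""
    return sum(1 for T in D if X &lt;= T)

def frequent_itemsets(D, minsupport):
    I = sorted(set().union(*D))              # the item universe I
    F = []
    for k in range(1, len(I) + 1):
        for X in combinations(I, k):
            if support(frozenset(X), D) >= minsupport:
                F.append(frozenset(X))
    return F

D = [frozenset("ab"), frozenset("abc"), frozenset("bd")]
print(frequent_itemsets(D, 2))               # {a}, {b}, {a, b}
</preformat>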
      <p>The remainder of this paper is organized as follows.
Section 2 contains a short survey of related work. In Section 3
we describe the AIM-F algorithm. Section 4 contains
experimental results. In Section 5 we conclude this
short abstract with a discussion.</p>
      <sec id="sec-1-1">
        <title>1.1. Contributions of this paper</title>
        <p>We combine several pre-existing ideas in a fairly
straightforward way and get a new frequent itemset
mining algorithm. In particular, we combine the sparse
vertical bit vector technique along with the difference
sets technique of [14], thus reducing the computation
time when compared with [14]. The various techniques
are applied dynamically according to the input
dataset, thus utilizing the advantages and avoiding the
drawbacks of each technique.</p>
        <p>Experimental results suggest that for a given level of
support, our algorithm/implementation is faster than
the other algorithms with which we compare ourselves.
This set includes the dEclat algorithm of [14], which
seems to be the fastest of the others.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related</title>
    </sec>
    <sec id="sec-3">
      <title>Work</title>
      <p>Since the introduction of the Apriori algorithm by
[1, 2] many variants have been proposed to reduce time,
I/O and memory.</p>
      <p>Apriori uses a breadth-first, bottom-up
approach to generate frequent itemsets (i.e., it constructs
(i+1)-item frequent itemsets from i-item frequent
itemsets). The key observation behind Apriori is that all
subsets of a frequent itemset must be frequent. This
suggests a natural approach to generating frequent
itemsets. The breakthrough with Apriori was that
the number of itemsets explored was polynomial in the
number of frequent itemsets. In fact, on a worst case
basis, Apriori explores no more than n itemsets to
output a frequent itemset, where n is the total number of
items.</p>
      <p>Subsequent to the publication of [1, 2], a great many
variations and extensions were considered [3, 7, 13].
In [3] the number of passes over the database was
reduced. [7] tried to reduce the search space by
combining bottom-up and top-down search – if a set is
infrequent then so are its supersets, and one can prune away
infrequent itemsets found during the top-down search.
[13] uses equivalence classes to skip levels in the search
space. A new mining technique, FP-growth, proposed
in [5], is based upon representing the dataset itself as
a tree, and performs the mining from the tree
representation.</p>
      <p>We build upon several ideas appearing in previous
work, a partial list of which is the following:
• Vertical Bit Vectors [10, 4] - The dataset is stored
in vertical bit vectors. Experimentally, this has
been shown to be very effective.
• Projection [4] - A technique to reduce the size of
vertical bit vectors by trimming the bit vector to
include only transactions relevant to the subtree
currently being searched.
• Difference sets [14] - Instead of holding the entire
tidset at any given time, Diffsets suggest that only
changes in the tidsets are needed to compute the
support.
• Dynamic Reordering [6] - A heuristic for reducing
the search space - dynamically changing the order
in which the search space is traversed. This
attempts to rearrange the search space so that one
can prune infrequent itemsets earlier rather than
later.
• Parent Equivalence Pruning [4, 13] - Skipping
levels in the search space, when a certain item added
to the itemset contributes no new information.</p>
      <p>To the best of our knowledge no previous
implementation makes use of this combination of ideas, and
some of these ideas are non-trivial to combine.
For example, projection has never previously been used
with difference sets, and to do so requires some new
observations as to how to combine these two elements.</p>
      <p>We should add that there are a wide variety of other
techniques introduced over time to find frequent
itemsets, which we do not make use of. A very partial list
of these other ideas is
• Sampling - [11] suggests searching over a sample of
the dataset, and later validating the results using
the entire dataset. This technique was shown to
generate the vast majority of frequent itemsets.
• Adjusting support - [9] introduces SLPMiner, an
algorithm which lowers the support threshold as the
itemsets grow larger during the search. This
attempts to avoid the problem of generating small
itemsets which are unlikely to grow into large
itemsets.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The AIM-F algorithm</title>
      <p>In this section we describe the building blocks that
make up the AIM-F algorithm. High level pseudo code
for the AIM-F algorithm appears in Figure 7.</p>
      <sec id="sec-3-1">
        <title>3.1. Lexicographic Trees</title>
        <p>Let &lt; be some lexicographic order of the items in I
such that for every two items i and j, i ≠ j: i &lt; j or
i &gt; j. Every node n of the lexicographic tree has two
fields: n.head, which is the itemset that node n represents,
and n.tail, which is a list of items, the possible extensions
to n.head. A node of the lexicographic tree has a level.
Itemsets of nodes at level k contain k items. We
will also say that such itemsets have length k. For the root
(level 0) node, n.head is empty and n.tail = I. Figure
1 is an example of a lexicographic tree for 3 items.</p>
        <p>The use of lexicographic trees for itemset generation
was proposed by [8].</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Depth First Search Traversal</title>
        <p>In the course of the algorithm we traverse the
lexicographic tree in a depth-first order. At node n, for every
element α in the node’s tail, a new node n′ is generated
such that n′.head = n.head ∪ α and n′.tail = n.tail − α.
After the generation of n′, α is removed from n.tail, as
it will no longer be needed (see Figure 3).</p>
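        <p>The traversal can be sketched in a few lines of Python; this is
our illustration of the head/tail scheme (without any pruning), not
the actual AIM-F code.</p>
        <preformat>
# Head/tail depth-first traversal of the lexicographic tree.
# Without pruning, every subset of the root tail is visited once.
def dfs(head, tail, visit):
    visit(head)                            # n.head
    while tail:
        alpha, tail = tail[0], tail[1:]    # remove alpha from n.tail
        dfs(head + [alpha], tail, visit)   # n'.head = n.head + alpha

dfs([], ["a", "b", "c"], print)            # prints all 8 itemsets
</preformat>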
        <p>Several pruning techniques, on which we elaborate
later, are used in order to speed up this process.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Vertical Sparse Bit-Vectors</title>
        <p>A comparison between horizontal and vertical
database representations done in [10] shows that the
representation of the database has a high impact on the
performance of the mining algorithm. In a vertical
database the data is represented as a list of items,
where every item holds a list of transactions in which
it appears.</p>
        <preformat>
Project(p : vector, v : vector)
/* p - vector to be projected upon
   v - vector being projected */
t = Empty Vector
i = 0
for each nonzero bit in p, at offset j, in
        ascending order of offsets:
    Set i'th bit of target vector t to be the
        j'th bit of v
    i = i + 1
return t
</preformat>
        <preformat>
DFS(n : node)
t = n.tail
while t ≠ ∅
    Let α be the first item in t
    remove α from t
    n′.head = n.head ∪ α
    n′.tail = t
    DFS(n′)
</preformat>
        <p>The list of transactions held by every item can be
represented in many ways. In [13] the list is a tid-list,
while [10, 4] use vertical bit vectors. Because the data
tends to be sparse, vertical bit vectors hold many “0”
entries for every “1”, thus wasting memory and CPU
time on processing the information. In [10] the vertical bit
vector is compressed using an encoding called skinning,
which shrinks the size of the vector.</p>
        <p>We choose to use a sparse vertical bit vector.
Every such bit vector is built from two arrays - one for
values, and one for indexes. The index array gives the
position in the vertical bit vector, and the value array
holds the value at that position, see Figure 8. The index
array is sorted to allow fast AND operations between
two sparse bit vectors in a similar manner to the AND
operation between tid-lists. Empty values are
thrown away during the AND operation, saving space
and computation time.</p>
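        <p>For illustration, the following Python sketch (ours; the
word-based layout is an assumption) ANDs two such sparse vectors by
merging their sorted index arrays; words that become zero are dropped
on the fly.</p>
        <preformat>
# Sparse vertical bit vector as parallel arrays of word indexes and
# word values; AND merges the two sorted index arrays.
def sparse_and(idx_a, val_a, idx_b, val_b):
    idx, val = [], []
    i = j = 0
    while i &lt; len(idx_a) and j &lt; len(idx_b):
        if idx_a[i] &lt; idx_b[j]:
            i += 1
        elif idx_a[i] > idx_b[j]:
            j += 1
        else:
            v = val_a[i] &amp; val_b[j]
            if v:                    # empty words are thrown away
                idx.append(idx_a[i])
                val.append(v)
            i += 1
            j += 1
    return idx, val
</preformat>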
        <sec id="sec-3-3-1">
          <title>3.3.1. Bit-vector projection</title>
          <p>In [4], a technique called projection was introduced.
Projection is a sparse bit vector compression technique
specifically useful in the context of mining frequent
itemsets. The idea is to eliminate redundant zeros in
the bit-vector - for itemset P, all the transactions which
do not include P are removed, leaving a vertical bit
vector containing only 1s. For every itemset PX
generated from P (a superset of P), all the transactions
removed from P are also removed. This way all the
extraneous zeros are eliminated.</p>
          <preformat>
Apriori(n : node, minsupport : integer)
t = n.tail
while t ≠ ∅
    Let α be the first item in t
    remove α from t
    n′.head = n.head ∪ α
    n′.tail = t
    if (support(n′.head) ≥ minsupport)
        Report n′.head as frequent itemset
        Apriori(n′)
</preformat>
          <preformat>
PEP(n : node, minsupport : integer)
t = n.tail
while t ≠ ∅
    Let α be the first item in t
    remove α from t
    n′.head = n.head ∪ α
    n′.tail = t
    if (support(n′.head) = support(n.head))
        add α to the list of items removed by PEP
    else if (support(n′.head) ≥ minsupport)
        Report n′.head ∪ {All subsets of items
            removed by PEP} as frequent itemsets
        PEP(n′)
</preformat>
          <p>The projection is done directly from the vertical bit
representation. At initialization a two dimensional
matrix of 2^w by 2^w is created, where w is the word length
or some smaller value that we choose to work with.
Every entry (i,j) is calculated to be the projection of
j on i (thus covering all possible projections of a single
word). For every row of the matrix, the number of bits
being projected is constant (a row represents the word
being projected upon).</p>
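          <p>As an illustration, the following Python sketch (ours; w = 8
is an assumed word size) builds such a table and uses it to project
one vector on another, concatenating the looked-up bit groups as
described next.</p>
          <preformat>
# Table-driven projection sketch; w = 8 is an assumption.
W = 8

def project_word(p_word, v_word):
    """Pack the bits of v_word found at the '1' positions of p_word."""
    out, k = 0, 0
    for bit in range(W):
        if p_word &amp; (1 &lt;&lt; bit):
            out |= ((v_word >> bit) &amp; 1) &lt;&lt; k
            k += 1
    return out, k        # k bits produced (constant per row i)

# Entry (i, j) is the projection of word j on word i.
TABLE = [[project_word(i, j) for j in range(1 &lt;&lt; W)]
         for i in range(1 &lt;&lt; W)]

def project(p_words, v_words):
    """Concatenate the per-word projections of v on p."""
    out, acc, nbits = [], 0, 0
    for pw, vw in zip(p_words, v_words):
        bits, k = TABLE[pw][vw]
        acc |= bits &lt;&lt; nbits
        nbits += k
        while nbits >= W:            # emit completed output words
            out.append(acc &amp; ((1 &lt;&lt; W) - 1))
            acc >>= W
            nbits -= W
    if nbits:
        out.append(acc)
    return out
</preformat>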
          <p>Projection is done by traversing both the vector to
project upon, p, and the vector to be projected, v. For
every word index we compute the projection by table
lookup; the resulting bits are then concatenated
together. Thus, computing the projection takes no longer
than the AND operation between two compressed
vertical bit lists.</p>
          <preformat>
DynamicReordering(n : node, minsupport : integer)
t = n.tail
for each α in t
    Compute sα = support(n.head ∪ α)
Sort items α in t by sα in ascending order
while t ≠ ∅
    Let α be the first item in t
    remove α from t
    n′.head = n.head ∪ α
    n′.tail = t
    if (support(n′.head) ≥ minsupport)
        Report n′.head as frequent itemset
        DynamicReordering(n′)
</preformat>
          <p>In [4] projection is used whenever a rebuilding
threshold is reached. Our tests show that because
we’re using sparse bit vectors anyway, the gain from
projection is smaller, and the highest gains are when
we use projection only when calculating the 2-itemsets
from 1-itemsets. This is also because of the penalty
of using projection with diffsets, as described later, for
large k-itemsets. Even so, projection is used only if the
sparse bit vector will shrink significantly - as a
threshold we set 10% - if the sparse bit vector contains less
than 10% ’1’s it will be projected.</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Counting and support</title>
          <p>To count the number of ones within a sparse bit vector,
one can hold a translation table of 2^w values, where w
is the word length. To count the number of ones in a
word then requires only one memory access to the
translation table. This idea first appeared in the context of
frequent itemsets in [4].</p>
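          <p>A minimal Python sketch of this counting scheme (ours;
w = 16 is an arbitrary choice):</p>
          <preformat>
# Population-count lookup table with 2^w entries; w = 16 assumed.
W = 16
ONES = [bin(x).count("1") for x in range(1 &lt;&lt; W)]

def vector_support(val_words):
    """Support = total number of '1' bits over the value array."""
    return sum(ONES[w] for w in val_words)

print(vector_support([0b1011, 0b0001]))      # 4
</preformat>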
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Diffsets</title>
        <p>Difference sets (Diffsets), proposed in [14], are a
technique to reduce the size of the intermediate
information needed in the traversal using a vertical
database. Using Diffsets, only the differences between
the candidate and its generating itemsets are calculated
and stored (if necessary). Using this method the
intermediate vertical bit-vectors in every step of the DFS
traversal are shorter; this results in faster intersections
between those vectors.</p>
        <preformat>
AIM-F(n : node, minsupport : integer)
/* Uses DFS traversal of lexicographic itemset tree
   Fast computation of small frequent itemsets
       for sparse datasets
   Uses difference sets to compute support
   Uses projection and bit vector compression
   Makes use of parent equivalence pruning
   Uses dynamic reordering */
t = n.tail
for each α in t
    Compute sα = support(n.head ∪ α)
    if (sα = support(n.head))
        add α to the list of items removed by PEP
        remove α from t
    else if (sα &lt; minsupport)
        remove α from t
Sort items in t by sα in ascending order
while t ≠ ∅
    Let α be the first item in t
    remove α from t
    n′.head = n.head ∪ α
    n′.tail = t
    Report n′.head ∪ {All subsets of items
        removed by PEP} as frequent itemsets
    AIM-F(n′)
</preformat>
        <p>Let t(P) be the tidset of P. The Diffset d(PX) is
the set of tids that are in t(P) but not in t(PX),
formally: d(PX) = t(P) − t(PX) = t(P) − t(X). By
definition support(PXY) = support(PX) − |d(PXY)|,
so only d(PXY) need be calculated. However
d(PXY) = d(PY) − d(PX), so the Diffset for every
candidate can be calculated from its generating
itemsets.</p>
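        <p>A small worked example in Python (with made-up tidsets)
checks these identities:</p>
        <preformat>
# d(PX) = t(P) - t(X), support(PXY) = support(PX) - |d(PXY)|,
# and d(PXY) = d(PY) - d(PX); tidsets are plain Python sets here.
t_P = {1, 2, 3, 4, 5}
t_X = {1, 2, 3, 4}
t_Y = {1, 2, 5}

d_PX = t_P - t_X                              # {5}
d_PY = t_P - t_Y                              # {3, 4}
d_PXY = d_PY - d_PX                           # {3, 4}
support_PX = len(t_P) - len(d_PX)             # 4
support_PXY = support_PX - len(d_PXY)         # 2
assert support_PXY == len(t_P &amp; t_X &amp; t_Y)
</preformat>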
        <p>Diffsets have one major drawback - in datasets
where the support drops rapidly between k-itemsets and
(k+1)-itemsets, the size of d(PX) can be larger than
the size of t(PX) (for example see figure 9). In such
cases the usage of diffsets should be delayed (in the
depth of the DFS traversal) to the k-itemsets where
the support stops its rapid drop. Theoretically the
break-even point is 50%: when t(PX)/t(P) = 0.5, the size
of d(PX) equals that of t(PX); however experiments show
only small differences for any value between 10% and 50%.
For this algorithm we used 50%.</p>
        <sec id="sec-3-4-1">
          <title>Diffsets and Projection</title>
          <p>As d(PXY) is not a subset of d(PX), Diffsets cannot
be used directly for projection. Instead, we notice that
d(PXY) ⊆ t(PX) and t(PX) = t(P) − d(PX). Here d(PX)
is known, and t(P) can be calculated in the same
way. For example t(ABCD) = t(ABC) − d(ABCD),
t(ABC) = t(AB) − d(ABC), t(AB) = t(A) − d(AB),
thus t(ABCD) = t(A) − d(AB) − d(ABC) − d(ABCD).
Using this formula, t(PX) can be calculated using
the intermediate data along the DFS trail. As the DFS
goes deeper, the penalty of calculating the projection
grows higher.</p>
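          <p>In Python this amounts to subtracting the stored diffsets
along the DFS trail from the root tidset (a sketch of ours, with sets
standing in for bit vectors):</p>
          <preformat>
# t(ABCD) = t(A) - d(AB) - d(ABC) - d(ABCD): rebuild a tidset
# from the diffsets accumulated along the DFS trail.
def tidset_from_trail(t_root, trail_diffsets):
    t = set(t_root)
    for d in trail_diffsets:       # d(AB), d(ABC), d(ABCD), ...
        t -= d
    return t

print(tidset_from_trail({1, 2, 3, 4}, [{4}, {2}]))   # {1, 3}
</preformat>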
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Pruning Techniques</title>
        <sec id="sec-3-5-1">
          <title>3.5.1. Apriori</title>
          <p>Proposed by [2], the Apriori pruning technique is
based on the monotonicity property of support:
support(P) ≥ support(PX), as PX is contained in no
more transactions than P. Therefore if for an itemset P,
support(P) &lt; minsupport, the support of any
extension of P will also be lower than minsupport, and the
subtree rooted at P can be pruned from the
lexicographic tree. See Figure 4 for pseudo code.</p>
        </sec>
        <sec id="sec-3-5-2">
          <title>3.5.2. Parent Equivalence Pruning (PEP)</title>
          <p>This is a pruning method based on the following
property: if support(n.head) = support(n.head ∪ α) then
all the transactions that contain n.head also contain
n.head ∪ α. Thus, α can be moved from the tail to
the head, saving the traversal of P and skipping directly
to PX. This method was described by [4, 13]. Later, when
the frequent itemsets are generated, the items which were
moved from tail to head should be taken into account
when listing all frequent itemsets. For example, if k
items were pruned using PEP during the DFS
traversal of frequent itemset X, then all 2^k subsets of
those k items can be added to X without reducing the
support, creating 2^k new frequent itemsets.
See Figure 5 for pseudo code.</p>
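          <p>For illustration, the 2^k expansion can be written as
follows (our sketch, not the paper's code):</p>
          <preformat>
# Expand frequent itemset X with every subset of the k items removed
# by PEP; all 2^k resulting itemsets have the same support as X.
from itertools import combinations

def pep_expand(X, pep_items):
    for r in range(len(pep_items) + 1):
        for S in combinations(pep_items, r):
            yield frozenset(X) | frozenset(S)

print(sorted(map(sorted, pep_expand({"a"}, ["b", "c"]))))
# [['a'], ['a', 'b'], ['a', 'b', 'c'], ['a', 'c']]
</preformat>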
        </sec>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Dynamic Reordering</title>
        <p>To increase the chance of early pruning, nodes are
traversed, not in lexicographic order, but in order
determined by support. This technique was introduced
by [6].</p>
        <p>Instead of lexicographic order we reorder the
children of a node as follows. At node n, for all α in the
tail, we compute sα = support(n.head ∪ α), and the
items are sorted by sα in increasing order. Items α
in n.tail for which support(n.head ∪ α) &lt; minsupport
are trimmed away. This way, the rest of the sub-tree
will benefit from a shortened tail. Items with smaller
support, which are heuristically “likely” to be pruned
earlier, will be traversed first. See Figure 6 for pseudo
code.</p>
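        <p>A compact Python rendering of this reordering step (ours;
support is assumed to be a callable over itemsets):</p>
        <preformat>
# Trim infrequent tail items, then sort the survivors by support in
# increasing order, so "likely pruned" items are traversed first.
def reorder_tail(head, tail, support, minsupport):
    s = {a: support(head | {a}) for a in tail}
    kept = [a for a in tail if s[a] >= minsupport]
    return sorted(kept, key=lambda a: s[a])
</preformat>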
      </sec>
      <sec id="sec-3-7">
        <title>3.7. Optimized Initialization</title>
        <p>In sparse datasets computing the frequent 2-itemsets
can be done more efficiently than by
performing O(n^2) itemset intersections. We use a method similar
to the one described in [13]: as a preprocessing step,
for every transaction in the database, all 2-itemsets are
counted and stored in an upper-triangular matrix of dimensions
n × n. This step may take up to O(n^2) operations per
transaction. However, as this is done only for sparse
datasets, experimentally one sees that the number of
operations is small. After this initialization step, we
are left with the frequent 2-itemsets, from which we
can start the DFS procedure.</p>
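        <p>The initialization step can be sketched as follows (our
Python illustration; items are assumed to be mapped to the integers
0..n-1):</p>
        <preformat>
# Count all 2-itemsets per transaction in an upper-triangular matrix.
from itertools import combinations

def count_pairs(D, n):
    C = [[0] * n for _ in range(n)]      # only entries i &lt; j are used
    for T in D:
        for i, j in combinations(sorted(T), 2):
            C[i][j] += 1
    return C

D = [{0, 1, 2}, {0, 1}, {1, 3}]
C = count_pairs(D, 4)
print(C[0][1])                           # support of {0, 1} -> 2
</preformat>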
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Results</title>
      <p>The experiments were conducted on an Athlon
1.2GHz with 256MB DDR RAM running Microsoft
Windows XP Professional. All algorithms were
compiled with VC 7. In the experiments described herein, we
only count the frequent itemsets; we do not create output.</p>
      <p>We used five datasets to evaluate the algorithms’
performance. These datasets were studied extensively in
[13].</p>
      <p>1. connect — A database of game states in the game
connect 4.
2. chess — A database of game states in chess.
3. mushroom — A database with information about
various mushroom species.
4. pumsb* — This dataset was derived from the
pumsb dataset and describes census data.</p>
      <p>5. T10I4D100K - Synthetic dataset.</p>
      <p>The first 3 datasets were taken from
the UC Irvine ML Database Repository
(http://www.ics.uci.edu/~mlearn/MLRepository).
The synthetic dataset was created by the IBM Almaden
synthetic data generator
(http://www.almaden.ibm.com/cs/quest/demos.html).</p>
      <sec id="sec-4-1">
        <title>4.1. Comparing Data Representation</title>
        <p>We compare the memory requirements of sparse
vertical bit vector (with the projection described earlier)
versus the standard tid-list. For every itemset length,
the total memory requirements of all tid-sets are given
in figures 10, 11 and 12. We do not consider itemsets
removed by PEP.</p>
        <p>As follows from the figures, our sparse vertical bit
vector representation requires less memory than
tid-lists for the dense datasets (chess, connect). However
for the sparse dataset (T10I4D100K) the sparse
vertical bit vector representation requires up to twice
as much memory as tid-lists. Tests that dynamically
moved from the sparse vertical bit vector representation to
tid-lists showed no significant improvement in
performance; however, this should be carefully verified in
further experiments.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Comparing The Various Optimizations</title>
        <p>We analyze the influence of the various
optimization techniques on the performance of the algorithm.
We first run the final algorithm on a given dataset, and then
repeat the task with a single change in the
algorithm, thus trying to isolate the influence of every
optimization technique, as shown in figures 13 and 14.</p>
        <p>As follows from the graphs, there is much difference
in the behavior between the datasets. In the dense
dataset, Connect, the various techniques had a
tremendous effect on the performance. PEP, dynamic
reordering and diffsets behaved in a similar manner, and the
performance improvement factor gained by each of them
increased as the support dropped. On the other hand,
the sparse bit vector gives a constant improvement
factor over the tid-list for all the tested support values,
and projection gives only a minor improvement.</p>
        <p>In the second figure, for the sparse dataset
T10I4D100K, the behavior is different. PEP gives no
improvement, as can be expected in a sparse dataset, as
every single item has a low support. There is a rapid
drop in the support from k-itemsets to (k+1)-itemsets
due to the low support; therefore diffsets also give no
benefit, and the same goes for projection. A large gain
in performance is made by the optimized initialization;
however, the performance gain is an additive constant,
not a factor. Last is the dynamic reordering, which
contributes to early pruning much like in the dense
dataset.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Comparing Mining Algorithms</title>
        <p>For comparison, we used implementations of
1. Apriori [2] - horizontal database, BFS traversal of
the candidates tree.
2. FPgrowth [5] - tree projected database, searching
for frequent itemsets directly without candidate
generation, and
3. dEclat [14] - vertical database, DFS traversal using
diffsets.</p>
        <p>All of the above algorithm
implementations were provided by Bart Goethals
(http://www.cs.helsinki.fi/u/goethals/) and were used
for comparison with the AIM-F implementation.</p>
        <p>Figures 15 to 19 give experimental results for the
various algorithms and datasets. Not surprisingly,
Apriori [2] generally has the lowest performance amongst
the algorithms compared, and in some cases its
running time could not be computed as it did not
finish even at the highest level of support checked. For
these datasets, and compared with the specific
algorithms and implementations described above, our
algorithm/implementation, AIM-F, seemingly
outperforms all others.</p>
        <p>In general, for the dense datasets (Chess, Connect,
Pumsb* and Mushroom, figures 15,16,17 and 18
respectively), the sparse bit vector gives AIM-F an order
of magnitude improvement over dEclat. The diffsets
give dEclat and AIM-F another order of magnitude
improvement over the rest of the algorithms.</p>
        <p>For the sparse dataset T10I4D100K (Figure 19), the
optimized initialization gives AIM-F a head start, which
is combined at the lower supports with the advantage
of the sparse vertical bit vector (see details in figure
14).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Afterword</title>
      <p>This paper presents a new frequent itemset mining
algorithm, AIM-F. The algorithm is based upon a
mixture of previously used techniques combined
dynamically. It seems to behave quite well
experimentally.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Imielinski</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Swami</surname>
          </string-name>
          .
          <article-title>Mining association rules between sets of items in large databases</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>207</fpage>
          -
          <lpage>216</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Srikant</surname>
          </string-name>
          .
          <article-title>Fast algorithms for mining association rules</article-title>
          . In J. B.
          <string-name>
            <surname>Bocca</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Jarke</surname>
          </string-name>
          , and C. Zaniolo, editors,
          <source>Proc. 20th Int. Conf. Very Large Data Bases, VLDB</source>
          , pages
          <fpage>487</fpage>
          -
          <lpage>499</lpage>
          . Morgan Kaufmann,
          <fpage>12</fpage>
          -
          <lpage>15</lpage>
          1994.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Motwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Tsur</surname>
          </string-name>
          .
          <article-title>Dynamic itemset counting and implication rules for market basket data</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>255</fpage>
          -
          <lpage>264</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Burdick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Calimlim</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          .
          <article-title>Mafia: a maximal frequent itemset algorithm for transactional databases</article-title>
          .
          <source>In ICDE</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <surname>J</surname>
          </string-name>
          . Pei, and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yin</surname>
          </string-name>
          .
          <article-title>Mining frequent patterns without candidate generation</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R. J. B.</given-names>
            <surname>Jr</surname>
          </string-name>
          .
          <article-title>Efficiently mining long patterns from databases</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>85</fpage>
          -
          <lpage>93</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.-I.</given-names>
            <surname>Lin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Z. M.</given-names>
            <surname>Kedem</surname>
          </string-name>
          .
          <article-title>Pincer search: A new algorithm for discovering the maximum frequent set</article-title>
          .
          <source>In EDBT'98</source>
          , volume
          <volume>1377</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>105</fpage>
          -
          <lpage>119</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rymon</surname>
          </string-name>
          .
          <article-title>Search through systematic set enumeration</article-title>
          .
          <source>In KR-92</source>
          , pages
          <fpage>539</fpage>
          -
          <lpage>550</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Seno</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis. Slpminer</surname>
          </string-name>
          :
          <article-title>An algorithm for finding frequent sequential patterns using length decreasing support constraint</article-title>
          .
          <source>In ICDE</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Shenoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Haritsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundarshan</surname>
          </string-name>
          , G. Bhalotia,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bawa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Shah</surname>
          </string-name>
          .
          <article-title>Turbo-charging vertical mining of large databases</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname>Toivonen</surname>
          </string-name>
          .
          <article-title>Sampling large databases for association rules</article-title>
          .
          <source>In VLDB</source>
          , pages
          <fpage>134</fpage>
          -
          <lpage>145</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yen</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>An efficient approach to discovering knowledge from large databases</article-title>
          .
          <source>In 4th International Conference on Parallel and Distributed Information Systems.</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Zaki</surname>
          </string-name>
          .
          <article-title>Scalable algorithms for association mining</article-title>
          .
          <source>Knowledge and Data Engineering</source>
          ,
          <volume>12</volume>
          (
          <issue>2</issue>
          ):
          <fpage>372</fpage>
          -
          <lpage>390</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Zaki</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Gouda</surname>
          </string-name>
          .
          <article-title>Fast vertical mining using diffsets</article-title>
          .
          <source>Technical Report 01-1</source>
          , RPI,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>