<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>In-stream Frequent Itemset Mining with Output Proportional Memory Footprint</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Trabold</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mario Boley</string-name>
          <email>mario.boley@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Mock</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tamas Horvath</string-name>
          <email>tamas.horvatg@iais.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, University of Bonn</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Fraunhofer IAIS</institution>
          ,
          <addr-line>Schloss Birlinghoven, 53754 St. Augustin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>93</fpage>
      <lpage>104</lpage>
      <abstract>
<p>We propose an online partial counting algorithm based on statistical inference that approximates itemset frequencies from data streams. The space complexity of our algorithm is proportional to the number of frequent itemsets in the stream at any time. Furthermore, the longer an itemset is frequent, the closer the approximation is to its frequency, implying that the results become more precise as the stream evolves. We empirically compare our approach in terms of correctness and memory footprint to CARMA and Lossy Counting. Though our algorithm outperforms only CARMA in correctness, it requires much less space than both of these algorithms, providing an alternative to Lossy Counting when the available memory is limited.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Mining frequent itemsets from data streams with a small memory footprint is an
important data mining task with many applications such as, for example, credit
card payment monitoring or phone call processing. A stream setting requires
algorithms that provide aggregated results on the current status of the stream
with low latency at any time. Furthermore, the memory consumption must remain
small, ideally proportional to the number of frequent itemsets in the input; even a
small overhead per transaction may add up to a huge amount for long streams.</p>
      <p>
        The goal of itemset frequency estimation is to report the frequencies of all
itemsets that occur more frequently than some threshold in a stream of
transactions. A number of papers consider the simpler problem of counting single items
or 1-itemsets from streams (see, e.g., [<xref ref-type="bibr" rid="ref1">1</xref>-<xref ref-type="bibr" rid="ref3">3</xref>]). All known solutions for the task of
itemset frequency estimation fall into one of the following two categories:
algorithms without buffers and algorithms with some form of transaction buffering.
The former include Carma [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and hMiner [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Carma determines a superset of all frequent itemsets in a first pass and
needs a second pass to compute the exact family of frequent itemsets. Requiring a
second pass is, however, unrealistic for large data streams. In contrast to Carma,
our algorithm does not require a second pass. hMiner, in turn, has the
drawback that the maximum number of possible items in the stream must be
known to the algorithm in advance. In contrast to hMiner, we do not assume
any such prior knowledge.</p>
      <p>
        One of the representative algorithms belonging to the second category is
the Lossy Counting algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which has a frequency threshold σ and an
error parameter ε with ε &lt; σ. This algorithm stores not only the itemsets with
frequency at least σ, but also all itemsets with frequency above ε, implying that
Lossy Counting also counts a large number of infrequent itemsets with frequency
between ε and σ and keeps those counts in memory.
      </p>
      <p>This paper presents an algorithm that, at any point in time, approximates
the family of frequent itemsets and their support counts from a data stream.
The central idea of our algorithm is to combine partial counts with a simple, yet
sound statistical model of inference. Our algorithm starts counting an itemset
when all of its non-empty proper subsets have reached the minimum frequency
threshold and prunes itemsets immediately when they are no longer frequent.
Because counting starts only when all subsets are frequent, we may have only
partial counts even for the frequent itemsets. To obtain the frequency for the
entire stream, our algorithm uses statistical inference to explicitly model the
probabilities of itemsets observed from a certain time point t and estimates the
frequency for the unobserved interval [1, t − 1]. Each transaction contributes
evidence to this model, making our approach more accurate. In fact, one can show
that for infinite data streams, our algorithm is always correct in the limit.</p>
      <p>
        We have compared our algorithm to the CARMA [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and Lossy
Counting [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] algorithms. In our experimental evaluation we measured the
accuracy and the memory footprint of all three algorithms, where accuracy
was measured by calculating the Jaccard distance between the set of correct
frequent itemsets and the output of the algorithms. While our algorithm
outperformed CARMA both in accuracy and memory, Lossy Counting
generated the most accurate output. However, regarding memory
consumption, our algorithm required much less space than Lossy Counting.
Thus, when the available memory is limited, our algorithm can be regarded as an
alternative to Lossy Counting.
      </p>
      <p>The rest of the paper is organized as follows. We first collect the necessary
preliminaries in Section 2 and then present our algorithm in Section 3. In Section 4
we discuss some estimation strategies natural for our algorithm. Section 5
provides some important details about the implementation, while Section 6 presents
our empirical results. We conclude in Section 7 with some interesting problems
for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        We follow the standard terminology used in frequent itemset mining (see, e.g., [
        <xref ref-type="bibr" rid="ref6 ref7">6,
7</xref>
        ]). In particular, for a set I of items, a subset X ⊆ I will be referred to as an itemset.
      </p>
      <p>If for the cardinality of X we have |X| = k, then X is a k-itemset. We consider a
potentially infinite stream of transactions, i.e., a potentially unbounded sequence
of subsets of I that are observed consecutively at discrete time points. Moreover,
we assume that each element of this stream is generated independently according
to some fixed¹, but unknown distribution D : 2^I → [0, 1]. Formally, we have an
input sequence D_1, D_2, … with D_t ⊆ I and D_t ∼ D for each point in time t ∈ ℕ.</p>
      <p>For an itemset X ⊆ I we define the support count of X from time i to j by</p>
      <p>C_X^{i,j} = |{ t ∈ ℕ : i ≤ t ≤ j, X ⊆ D_t }| ,</p>
      <p>i.e., the support count of X is equal to the number of transactions from D_i to D_j
that contain X. We define the frequency of X at time t as the relative support
count</p>
      <p>freq_t(X) = C_X^{1,t} / t .</p>
      <p>Finally, using these concepts, the family of frequent sets at time t with respect
to a frequency threshold σ ∈ [0, 1] is given by</p>
      <p>F_{t,σ} = { X ⊆ I : freq_t(X) > σ } .</p>
      <p>Our goal in this paper is to design an algorithm that produces a sequence
C_1, C_2, … of families of itemsets such that at each point in time t ∈ ℕ, the family
C_t approximates F_{t,σ} closely, while maintaining a small memory footprint. To
evaluate the approximation performance of our and other algorithms, we will
use the Jaccard distance as loss function, defined for all A, B ⊆ 2^I by</p>
      <p>l(A, B) = 1 − |A ∩ B| / |A ∪ B| .</p>
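      <p>As a concrete illustration of these definitions, the following minimal Python sketch (with hypothetical helper names, not part of the paper) computes support counts, frequencies, the frequent family, and the Jaccard-distance loss on a toy stream; the brute-force subset enumeration is for illustration only.</p>

```python
from itertools import chain, combinations

def support_count(stream, itemset, i, j):
    """C_X^{i,j}: number of transactions D_i..D_j (1-indexed) containing X."""
    x = set(itemset)
    return sum(1 for t in range(i, j + 1) if x.issubset(stream[t - 1]))

def freq(stream, itemset, t):
    """freq_t(X) = C_X^{1,t} / t."""
    return support_count(stream, itemset, 1, t) / t

def frequent_family(stream, t, sigma):
    """F_{t,sigma}: all itemsets with frequency strictly above sigma
    (brute force over all subsets of the observed items; toy streams only)."""
    items = sorted(set().union(*stream[:t]))
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(1, len(items) + 1)
    )
    return {x for x in subsets if freq(stream, x, t) > sigma}

def jaccard_loss(a, b):
    """l(A, B) = 1 - |A intersect B| / |A union B|."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a.intersection(b)) / len(a.union(b))
```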
    </sec>
    <sec id="sec-3">
      <title>Estimating support counts based on partial counts</title>
      <p>In this section we present our Partial Counting algorithm, which approximates
the frequency of frequent itemsets from a data stream of transactions at any
point of time. The inputs to the algorithm are a minimum frequency threshold
σ ∈ [0, 1] and, for each time point t ∈ ℕ, a transaction D_t ⊆ I. For each
t ∈ ℕ, the algorithm estimates the frequency of all frequent itemsets with respect
to the transaction database {D_1, …, D_t}, in one pass and without storing the
transactions D_1, …, D_{t−1} seen earlier.
(¹ In the long version of this paper we will show how to get rid of this assumption.)</p>
      <p>
        For all t 2 N, the algorithm keeps only those itemsets in the memory that
have been estimated as frequent after having processed Dt; all other itemsets,
except for the 1-itemsets, are removed from the memory. Thus, in contrast to
e.g. the Lossy Counting algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the space complexity of our algorithm
is only O(|F̂_{t,σ}|), where F̂_{t,σ} is the family of itemsets estimated as frequent with
respect to D_1, …, D_t.
      </p>
      <p>The algorithm is depicted in Algorithm 1. To process the next transaction D_t
arriving at time t, the algorithm takes in a first step (Lines 4-12) all non-empty
subsets X of D_t (Line 4) and increments the counter for X if X is already in the
memory (Lines 5-6). Otherwise, if X is a singleton, or it is a k-itemset for some
k > 1 and all of its (k−1)-subsets are already in the memory, it stores X with some
additional auxiliary information. We utilize the Apriori property, i.e., that a
k-itemset cannot be frequent if at least one of its (k−1)-subsets is infrequent. More
precisely, for X we store a quadruple (x.set, x.s, x.t, x.count) that will be used
to estimate the frequency of X for the data stream D_1, …, D_{t−1} (see Lines 8-12
for the definitions of the entries in the quadruple). It is important to note that the
subsets of D_t are processed in increasing cardinality, as otherwise potential
new frequent itemsets can be lost (see Lines 4, 8, and 9).</p>
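      <p>This first step can be sketched as follows (a minimal sketch with simplified bookkeeping: we record only the counting start time t_X and the partial count, omitting the auxiliary x.s entries and the pruning step; all names are illustrative, not the paper's implementation):</p>

```python
from itertools import combinations

def process_transaction(memory, transaction, t):
    """First step of processing D_t: increment or admit subsets of D_t.

    `memory` maps frozenset(X) to a dict with 't_start' (the time t_X from
    which X is counted) and 'count' (the partial count C_X^{t_X,t}).
    Subsets are enumerated in increasing cardinality so that a k-itemset
    can see its freshly admitted (k-1)-subsets in the same transaction.
    """
    items = sorted(transaction)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            if x in memory:
                # already tracked: just extend the partial count
                memory[x]["count"] += 1
            elif k == 1 or all(
                frozenset(y) in memory for y in combinations(combo, k - 1)
            ):
                # singleton, or all (k-1)-subsets tracked: start counting X
                memory[x] = {"t_start": t, "count": 1}
```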
      <p>In a second step (Lines 13-16) the algorithm then prunes from the memory all
quadruples corresponding to itemsets X that satisfy |X| > 1 and are
estimated as infrequent at point t. In Line 14, we first calculate an estimation of
the support count of X; we present different strategies for this estimation step
in Section 4, noting that all these strategies estimate the support count from
the counts (i.e., x.count) maintained by the algorithm using some statistical
inference. Figure 1 illustrates the general setting: for an itemset X (re)counted
from time t_X, its support count for the period from 1 to t_X − 1 must be estimated
from the information available at time t. If the frequency derived from this
estimation is below the threshold σ, X is removed from the memory. Thus, when
an itemset becomes frequent, it is stored and counted as long as it is estimated
as frequent. When it becomes infrequent, it is immediately pruned.</p>
      <p>For a query after time point t, we output all itemsets from F, with their estimated
support counts, that meet the minimum frequency condition (Lines 18-19).
By construction, all itemsets X in F with |X| > 1 will
automatically be part of the output. In summary, we have a true online algorithm
that returns a family F̂_{t,σ} of itemsets predicted as frequent from the stream of
transactions from the beginning of the stream up to time t.</p>
    </sec>
    <sec id="sec-4">
      <title>Support count estimators</title>
      <p>In this section we describe a generic framework for support count estimation and
present different strategies for this problem. Except for one, all of the strategies
in this section are based on some careful combination of the observed counts
with estimated ones that are derived from conditional probabilities.</p>
      <p>[Algorithm 1 (Partial Counting) — 1: Initialization; 2: F ← ∅ // current set of frequent patterns with auxiliary information]</p>
      <p>We write Ĉ for estimated support counts. As illustrated in Figure 1, whenever
there is some observed support count for an itemset X, the estimated support
count for the entire period of all transactions is given by the observed support
count C_X^{t_X,t} plus the estimation Ĉ_X^{1,t_X−1} of the support count for the time period
[1, t_X − 1], for which the support count of X is not available at point t. As long
as no observed count for X exists, its estimated frequency is 0, i.e.,</p>
      <p>Ĉ_X^{1,t} = 0 if t &lt; t_X, and Ĉ_X^{1,t} = Ĉ_X^{1,t_X−1} + C_X^{t_X,t} otherwise.</p>
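      <p>The combination rule above can be written as a small helper (a sketch with hypothetical names; `estimate_prefix` stands for any of the strategies presented in this section):</p>

```python
def estimated_total_count(t, t_start, observed_count, estimate_prefix):
    """Estimated support count C_X^{1,t} for the whole stream: 0 before X is
    first counted (t smaller than t_X), otherwise the estimated prefix count
    for [1, t_X - 1] plus the observed count C_X^{t_X,t}.

    `t_start` is t_X, or None if X has never been observed;
    `estimate_prefix` is a callable implementing one estimation strategy."""
    if t_start is not None and t >= t_start:
        return estimate_prefix() + observed_count
    return 0.0
```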
      <p>
        We now present different strategies to compute Ĉ_X^{1,t_X−1}. We first recall an
estimation from [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and then propose two natural strategies based on conditional
probabilities.
      </p>
      <sec id="sec-4-1">
        <title>Upper bound estimation (ube)</title>
        <p>
          This estimation strategy is used in Carma [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. It takes the minimum count of
all (k−1)-subitemsets Y of a k-itemset X, i.e.,
        </p>
        <p>Ĉ_X^{1,t_X−1} = min_{Y ⊂ X, |Y| = |X|−1} Ĉ_Y^{1,t_X−1} .</p>
        <p>Clearly, this formula gives an upper bound on the true support count of X.
Notice that this is a static strategy in the sense that it does not improve the
estimation Ĉ_X^{1,t_X−1} as further transactions arrive.</p>
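        <p>In code, the upper bound estimation amounts to a minimum over the stored prefix estimates of the (k−1)-subsets (an illustrative sketch; the dictionary-based interface is our own, not Carma's):</p>

```python
from itertools import combinations

def ube(x, prefix_estimates):
    """Upper bound estimation (as in Carma): estimate C_X^{1,t_X-1} by the
    minimum stored prefix estimate among the (k-1)-subsets of the k-itemset X.

    `prefix_estimates` maps frozenset(Y) to its estimate for [1, t_X - 1]."""
    return min(
        prefix_estimates[frozenset(y)]
        for y in combinations(sorted(x), len(x) - 1)
    )
```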
      </sec>
      <sec id="sec-4-2">
        <title>Estimation based on conditional probabilities</title>
        <p>We now turn to more complex, dynamic estimation strategies. They are based
on the probabilistic view of frequencies that for any itemset X, any t ∈ ℕ, and
any Y ⊂ X,</p>
        <p>p(X)^{1,t} = p(Y)^{1,t} · p(X|Y)^{1,t} .</p>
        <p>To estimate p(X)^{1,t}, we need to estimate p(X)^{1,t_X−1}, as all
information about X is available for the interval [t_X, t]. We estimate p(X)^{1,t_X−1}
by estimating (i) p(Y)^{1,t_X−1} and (ii) p(X|Y)^{1,t_X−1}.</p>
        <p>(i) Regarding p(Y)^{1,t_X−1}, we estimate it recursively by</p>
        <p>p(Y)^{1,t_X−1} ≈ Ĉ_Y^{1,t_X−1} / (t_X − 1) , (1)</p>
        <p>noting that the support counts are stored from the very beginning for all
1-itemsets.</p>
        <p>(ii) Regarding p(X|Y)^{1,t_X−1}, we make use of the fact that [t_X, t] is a common
observation period for both X and Y and of the assumption that the underlying
distribution D is stationary, and estimate p(X|Y)^{1,t_X−1} by</p>
        <p>p(X|Y)^{1,t_X−1} ≈ p(X|Y)^{t_X,t} , (2)</p>
        <p>which, in turn, can be calculated by</p>
        <p>p(X|Y)^{t_X,t} = C_X^{t_X,t} / C_Y^{t_X,t} . (3)</p>
        <p>One can show that for sufficiently large t_X and t, (2) gives a close estimation
with high probability.</p>
        <p>Putting everything together, it follows from (1), (2), and (3) that p(X)^{1,t_X−1} can be
estimated by</p>
        <p>p(X)^{1,t_X−1} ≈ (Ĉ_Y^{1,t_X−1} / (t_X − 1)) · (C_X^{t_X,t} / C_Y^{t_X,t}) . (4)</p>
        <p>As the frequency of X in [1, t_X − 1] is identical to the probability p(X)^{1,t_X−1},
by (4) the support count Ĉ_X^{1,t_X−1} can be estimated by</p>
        <p>Ĉ_X^{1,t_X−1} = p(X)^{1,t_X−1} · (t_X − 1) = Ĉ_Y^{1,t_X−1} · C_X^{t_X,t} / C_Y^{t_X,t} .</p>
        <p>The two strategies presented below build upon the idea discussed above. They
differ from each other only in the particular choice of Y. All strategies have in
common that they estimate the support counts only for itemsets X with |X| ≥ 2.</p>
        <p>(a) Minimum estimation (me): This strategy uses the single (|X|−1)-subset Y that
results in the minimum estimated count for X, i.e.,
Ĉ_X^{1,t_X−1} = min_{Y ⊂ X, |Y| = |X|−1} Ĉ_Y^{1,t_X−1} · C_X^{t_X,t} / C_Y^{t_X,t} .</p>
        <p>(b) Average estimation (ae): This strategy averages the estimates over all
(|X|−1)-subsets Y of X, i.e.,
Ĉ_X^{1,t_X−1} = (1/|X|) Σ_{Y ⊂ X, |Y| = |X|−1} Ĉ_Y^{1,t_X−1} · C_X^{t_X,t} / C_Y^{t_X,t} .</p>
        <p>To illustrate the three different strategies presented above, we use the small
example given in Table 1. It shows a total of 8 transactions in the first row, t
in the second row, and the frequencies of a, b, and ab in rows three to five. The
last three rows show the estimated frequencies for the three strategies.</p>
        <p>In the example, the first estimation occurs for transaction 4, as the set ab
occurs for the first time in transaction 4 and is counted from this transaction
onwards. Up to and including the third transaction, the set a occurs twice and the
set b once. All estimations rely on the counts of a and b for the first three
transactions. The strategies start to differ from the 6th transaction onwards; we
therefore illustrate the different strategies for the 6th transaction. C_ab^{4,6} = 1
for all strategies.</p>
        <p>(i) ube estimates Ĉ_ab^{1,3} = min(2, 1) = 1, which gives
Ĉ_ab^{1,6} = Ĉ_ab^{1,3} + C_ab^{4,6} = 1 + 1 = 2 and thus a frequency of 2/6 ≈ 0.33.</p>
        <p>(ii) me estimates Ĉ_ab^{1,3} = min(1 · 1/2, 2 · 1/2) = 0.5, which gives
Ĉ_ab^{1,6} = Ĉ_ab^{1,3} + C_ab^{4,6} = 0.5 + 1 = 1.5 and a frequency of 1.5/6 = 0.25.</p>
        <p>(iii) ae estimates Ĉ_ab^{1,3} = (1/2)(1 · 1/2 + 2 · 1/2) = 0.75, which gives
Ĉ_ab^{1,6} = Ĉ_ab^{1,3} + C_ab^{4,6} = 0.75 + 1 = 1.75 and a frequency of 1.75/6 ≈ 0.29.</p>
        <p>All other estimations are computed accordingly.</p>
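        <p>The worked example above can be reproduced with a short sketch of the me and ae strategies (illustrative names and dictionary interface are ours; prefix_est holds the exact singleton counts on [1,3], obs the observed counts on [4,6]):</p>

```python
from itertools import combinations

def per_subset_estimates(x, prefix_est, obs_x, obs):
    """One estimate per (k-1)-subset Y of X, following equation (4)
    scaled by t_X - 1: prefix estimate of Y times C_X^{t_X,t} / C_Y^{t_X,t}."""
    return [
        prefix_est[frozenset(y)] * obs_x / obs[frozenset(y)]
        for y in combinations(sorted(x), len(x) - 1)
    ]

def me(x, prefix_est, obs_x, obs):
    """Minimum estimation: smallest per-subset estimate."""
    return min(per_subset_estimates(x, prefix_est, obs_x, obs))

def ae(x, prefix_est, obs_x, obs):
    """Average estimation: mean of the per-subset estimates."""
    estimates = per_subset_estimates(x, prefix_est, obs_x, obs)
    return sum(estimates) / len(estimates)
```

        <p>With prefix_est = {a: 2, b: 1}, obs_x = 1, and obs = {a: 2, b: 2} this yields 0.5 for me and 0.75 for ae, matching the example.</p>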
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Implementation</title>
      <p>In this section we briefly discuss some important implementation issues of the
Partial Counting algorithm, in particular the data structure, insertion into
the data structure, and pruning.</p>
      <p>The data structure F can be stored as a prefix tree. This allows for a compact
representation of the family of itemsets, as it suffices to store only a single item
for each itemset X. Indeed, the itemset corresponding to a node can be uniquely
recovered by concatenating the items on the path from the root to the node
at hand. Furthermore, it also allows for pruning entire branches when a node
that has further children must be deleted.</p>
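      <p>Such a prefix tree can be sketched as follows (a minimal illustration under our own naming, omitting the per-node counters and auxiliary data of the actual implementation):</p>

```python
class Node:
    """Prefix-tree node: each node stores a single item; the itemset a node
    represents is recovered by concatenating the items on the path from
    the root to that node."""
    def __init__(self, item=None):
        self.item = item
        self.children = {}  # maps item to child Node

def insert(root, itemset):
    """Insert an itemset (items kept in sorted order) and return its node."""
    node = root
    for item in sorted(itemset):
        node = node.children.setdefault(item, Node(item))
    return node

def prune(root, itemset):
    """Delete the node of `itemset`; the entire branch below it (all tracked
    supersets sharing this prefix) is discarded with it."""
    items = sorted(itemset)
    node = root
    for item in items[:-1]:
        node = node.children[item]
    node.children.pop(items[-1], None)
```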
      <p>The set x.s can be stored compactly as an array of integers. For an itemset
X with |X| = k, there are k different (k−1)-subsets. We sort all these subsets
Y lexicographically and take only the index of the missing item i (i.e., the item
that satisfies Y ∪ {i} = X) to store the count for itemset Y. Thus, in this way there
are k counters for the subsets and one for the itemset X.</p>
      <p>Theoretically, we may start observing an itemset X as soon as the estimated
frequencies of all subsets Y ⊂ X have reached the minimum frequency threshold
σ. However, X need not occur in the transaction in which this condition is met,
and X may become frequent only once it occurs in the current transaction. That is,
we start counting X with the first transaction containing X after all Y have already
reached the minimum frequency threshold. The price we pay for this is that the
first estimation of p(X|Y) is 1 and, as such, very inaccurate.</p>
      <p>An itemset that is inserted into F at time t is never pruned at time t, as otherwise it
would not have been inserted. We exploit this by skipping the pruning test for
itemsets that were inserted in the same transaction.</p>
    </sec>
    <sec id="sec-6">
      <title>Experimental Evaluation</title>
      <p>
        To assess the performance of our algorithm in practice, we analyzed the number
of counters in memory and the empirical loss for real-world datasets taken from
the UCI machine learning repository [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The empirical evaluations focus on the
following three main aspects:
- Comparing the memory and loss of the different estimation strategies.
- Comparing Partial Counting with Lossy Counting and Carma in terms of memory requirements and loss.
- Evaluating the memory and loss of Partial Counting with decreasing frequency threshold.
      </p>
      <p>
        For each dataset we shuffle the transactions, add them one by one, and mine
at regular intervals. The ground truth is obtained with Apriori [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. We compute
the Jaccard distance based loss for each algorithm as defined in Section 2. This
loss accounts for both over- and underestimation of itemset frequencies, without
further distinguishing between the two.
      </p>
      <p>Overall, the results show that
(i) our algorithm requires less memory than CARMA and Lossy Counting (see
Figure 4),
(ii) mining at lower thresholds results in a better approximation (see Figures 2,
3, and 4),
(iii) the two strategies (me and ae) based on statistical inference outperform the
simpler strategy (ube) (see Figures 3(b) and 3(d)), and
(iv) the loss decreases as the stream evolves (Figures 2(d) and 4(b)).</p>
      <p>We report the memory requirements and loss for the dataset connect-4 with a
maximal itemset length of 3 and minimum frequency thresholds of 70% and
90% in Figure 2, and for the chess dataset at 10% and 30% frequency thresholds
in Figure 3.</p>
      <p>We first compare the memory and loss of the different estimation strategies.
The memory is measured in counters per truly frequent itemset. The average and
minimum estimation strategies consistently require fewer counters per frequent
itemset than the upper bound estimation strategy (Figures 2(a), 2(c), 3(a), and
3(c)). In terms of loss, the inference-based estimation strategies are the best: the
average estimation strategy has the overall lowest loss, the upper bound strategy
the highest (Figures 2(b), 2(d), 3(b), and 3(d)).</p>
      <p>To compare our approach with state-of-the-art algorithms, we reimplemented
Lossy Counting and Carma. In Figure 4 we compare the counters in memory
per truly frequent itemset (4(c)) and the loss (4(d)) of Carma, Lossy Counting,
and Partial Counting. The number of counters per truly frequent itemset
is a fair mode of comparison, as it is an implementation-independent measure,
whereas megabytes depend more on the internal data representation. Our
Partial Counting algorithm clearly outperforms both algorithms in terms of
memory, and Carma also in loss. Note that the number of counters in memory of
Lossy Counting increases as the stream evolves (see Figure 4(c)), while the
memory required by our Partial Counting algorithm remains low as the stream
evolves. It is not surprising that Lossy Counting performs better in terms
of loss for the 10% frequency threshold, as it stores more information, typically
yielding better results.</p>
      <p>We now take a look at the effect of the frequency threshold on our
algorithm. The experiments show that a lower frequency threshold results in a lower
loss for our algorithm (Figures 2(b) vs 2(d), 3(b) vs 3(d), and 4(d) vs 4(b)). The
effect on memory is the same but less prominent (Figure 2(a) vs 2(c)).</p>
      <p>Finally, we observe that the loss decreases as the stream evolves (Figures
2(b), 2(d), 4(b) and 4(d)). This trend may not be observed for the short stream
in Figure 3.</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>We have presented the Partial Counting algorithm for mining frequent
itemsets from data streams with a space complexity proportional to the number of
frequent itemsets generated from the stream at any point in time. The algorithm
starts counting the frequency of an itemset once all of its subsets are estimated to
be frequent. The support count of itemsets is estimated by statistical inference.</p>
      <p>We have evaluated our algorithm empirically and compared its memory
consumption and accuracy with those of CARMA and Lossy Counting. While
our algorithm outperforms CARMA in both of these aspects, Lossy Counting
turned out to have better accuracy (measured with the Jaccard distance).
This is, however, not surprising because Lossy Counting, as our experiments
clearly demonstrate, uses significantly more space than Partial Counting. Thus,
our algorithm is an alternative to Lossy Counting when memory is limited.</p>
      <p>In its current state, Partial Counting is outperformed in terms of
accuracy by Lossy Counting. As future work, we would like to reduce this
gap, following two ideas. We will experiment
with a confidence parameter to reduce the effect of overestimation when we first
observe itemsets X and Y together. We also believe that we can further improve
our probabilistic inference mechanism based on the observation that an itemset
which is not counted, while all of its subsets are counted, must be infrequent.</p>
      <p>Another important point is that, as we will show in the long version of this
paper, our algorithm is correct for any infinite data stream. We note that this
holds even if we relax the assumption that the underlying distribution is
stationary. An interesting question we are investigating is whether concept
drift can be handled for finite data streams as well.</p>
      <p>Last but not least, we are going to analyse our algorithm for distributed
data streams as well. One of the most challenging questions in this direction
is to find a distributed algorithm that generates frequent itemsets with good
approximation performance and as little communication as possible.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgment</title>
      <p>This research has been supported by the EU FP7-ICT-2013-11 under grant
619491 (FERARI).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Demaine</surname>
            ,
            <given-names>E.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez-Ortiz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Munro</surname>
            ,
            <given-names>J.I.</given-names>
          </string-name>
          :
          <article-title>Frequency estimation of internet packet streams with limited space</article-title>
          .
          <source>In: Proceedings of the 10th Annual European Symposium on Algorithms. ESA '02</source>
          , London, UK, Springer-Verlag (
          <year>2002</year>
          )
          <fpage>348</fpage>-<lpage>360</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Manku</surname>
            ,
            <given-names>G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motwani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Approximate frequency counts over data streams</article-title>
          .
          <source>In: Proceedings of the 28th International Conference on Very Large Data Bases. VLDB '02</source>
          ,
          VLDB Endowment
          (
          <year>2002</year>
          )
          <fpage>346</fpage>-<lpage>357</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cormode</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muthukrishnan</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>An improved data stream summary: the count-min sketch and its applications</article-title>
          .
          <source>Journal of Algorithms</source>
          <volume>55</volume>
          (
          <issue>1</issue>
          ) (
          <year>2005</year>
          )
          <fpage>58</fpage>-<lpage>75</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hidber</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Online association rule mining</article-title>
          . In Delis,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Faloutsos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Ghandeharizadeh</surname>
          </string-name>
          , S., eds.
          <source>: SIGMOD</source>
          <year>1999</year>
          ,
          <source>Proceedings ACM SIGMOD International Conference on Management of Data, June 1-3</source>
          ,
          <year>1999</year>
          , Philadelphia, Pennsylvania, USA, ACM Press (
          <year>1999</year>
          )
          <fpage>145</fpage>-<lpage>156</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A novel hash-based approach for mining frequent itemsets over data streams requiring less memory space</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          <volume>19</volume>
          (
          <issue>1</issue>
          ) (
          <year>2009</year>
          )
          <fpage>132</fpage>-<lpage>172</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Imielinski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swami</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Mining association rules between sets of items in large databases</article-title>
          .
          <source>In: SIGMOD '93: Proceedings of the 1993 ACM SIGMOD international conference on Management of data</source>
          , New York, NY, USA, ACM (
          <year>1993</year>
          )
          <fpage>207</fpage>-<lpage>216</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mannila</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toivonen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verkamo</surname>
            ,
            <given-names>A.I.:</given-names>
          </string-name>
          <article-title>Efficient algorithms for discovering association rules</article-title>
          .
          <source>In: Knowledge Discovery in Databases: Papers from the 1994 AAAI Workshop</source>
          , Seattle, Washington,
          <year>July 1994</year>
          .
          <source>Technical Report WS-94-03</source>
          . (
          <year>1994</year>
          )
          <fpage>181</fpage>-<lpage>192</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bache</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lichman</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>UCI machine learning repository</article-title>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>