<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AAvvooiiddiinngg tthhee iitteemmsseett cclloossuurree ccoommppuuttaattiioonn ””ppiittffaallll””</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>T. Hamrouni</string-name>
          <email>n@isf</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Ben Yahia</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Y. Slimani Tarek Hamrouni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sadok Ben Yahia</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yahya Slimani</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Csaamdopku.bseUnnyaivheiras</institution>
          ,
          <addr-line>ityaaihrey,a1.s0l6im0 aTnui</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Facult ́e des Sciences de Tunis D ́eparteFmaecnutletd ́deessSSccieienncceessddeel'TInufnoirsmatique, DCep ́aamrtpeumseUntnidveesrsSitcaieirnec</institution>
          ,
          <addr-line>e1s0d6e0 lT'Iunnfoisr,mTautinqiusiee,</addr-line>
        </aff>
      </contrib-group>
      <fpage>46</fpage>
      <lpage>59</lpage>
      <abstract>
        <p>Extracting generic bases of association rules seems to be a promising issue in order to present informative and compact user addedvalue knowledge. However, extracting generic bases requires partially ordering costly computed itemset closures. To avoid the nightmarish itemset closure computation cost, specially for sparse contexts, we introduce an algorithm, called Prince, allowing an astute extraction of generic bases of association rules. The Prince algorithm main originality is that the partial order is maintained between frequent minimal generators and no more between frequent closed itemsets. A structure called minimal generator lattice is then built, from which the derivation of itemset closures and generic association rules becomes straightforward. An intensive experimental evaluation, carried out on benchmarking and ”worst case” datasets, showed that Prince largely outperforms the pioneer algorithms, i.e., Close, A-Close and Titanic.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        It is widely known that frequent itemset based algorithms suffer from the
generation of a very large number of frequent itemsets and hence association rules.
Thus, this prohibitive generation reduces not only efficiency but also effectiveness
of the mined knowledge. In fact, users have to perform tedious rummage within
an overwhelming large number of mined association rules [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In this context,
the approach based on the extraction of frequent closed itemsets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] presented
a clear promise to reduce the frequent itemset extraction cost and mainly to
offer, to users, irreducible nuclei of association rules that are commonly known
as ”generic bases” of association rules. This approach, relying on the Formal
Concept Analysis mathematical background [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], proposes to reduce the search
space by detecting intrinsic structural properties. Therefore, the problem of
mining association rules might be reformulated, under the (frequent) closed itemsets
discovery point of view, as follows [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]:
1. Discover both distinct ”closure systems”, i.e., sets of sets which are closed
under the intersection operator, namely the set of closed itemsets and the set
of minimal generators. Also, the upper cover (Covu) of each closed itemset
should be available.
2. From all discovered information during the first step, i.e., both closure
systems and the upper cover sets, derive generic bases of association rules (from
which all remaining association rules can be derived).
      </p>
      <p>
        The essential report after an overview of the state of the art of frequent closed
itemset based algorithms (e.g., [
        <xref ref-type="bibr" rid="ref1 ref2 ref5 ref6 ref7 ref8 ref9">1, 2, 5–9</xref>
        ]) can be summarized in what follows:
1. These algorithms mainly concentrate on the first task, i.e., reducing the
computation time of the frequent itemset extraction step. Their performances are
interesting on dense contexts. However, they present modest performances
on sparse contexts. Indeed, computing itemset closures in this type of
contexts is heavily handicapping on these algorithm performances, since frequent
closed itemset search space tends to overlap that of frequent itemsets.
2. The frequent closed itemset based algorithms neglect the second task, i.e.,
extracting generic association rule bases. Indeed, none of them accepted to
maintain the order covering the relationship between frequent closed
itemsets.
      </p>
      <p>
        In this paper, we propose a new algorithm, called Prince, aiming to extract
generic bases of association rules. Prince performs a level-wise browsing of the
search space. Its main originality is that it is the only one who accepted to
bear the cost of building the partial order. Interestingly enough, to amortize
this prohibitive cost, the partial order is maintained between frequent minimal
generators and no more between frequent closed itemsets. The obtained partially
ordered structure is called minimal generator lattice [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], in which each
equivalence class is reduced to the corresponding set of frequent minimal generators.
Hence, itemset closures are not computed but derived when Prince performs
a simple sweeping of the minimal generator lattice to derive generic bases of
association rules. Practical performances of the Prince algorithm have been
compared to those of well known level-wise browsing algorithms, i.e., Close [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
A-Close [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and Titanic [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Our experiments were carried out on benchmark
datasets (dense and sparse) and on ”worst case” datasets. Obtained results are
very encouraging: although our algorithm performs the partial order
construction task, it largely outperforms Close, A-Close , and Titanic algorithms. In
addition to the ”worst case” datasets and due to space limit, we report our
results only on two benchmark datasets, frequently used for evaluating data mining
algorithms.
      </p>
      <p>
        It is important to note that omitting to compare Prince performances to
those of more recent algorithms, e.g., LCM [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], DCI-Closed [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], is argued by
two reasons:
1. Close, A-Close and Titanic algorithms determine at least the ”key”
information provided by the frequent minimal generator set.
2. Following our claim that stressing on fast enumeration of frequent closed
itemsets will not be of any interest nor presents any added-value knowledge
for end-users, since not all required information is extracted.
The remainder of the paper is organized as follows. Section 2 sketches the generic
association rule basis extraction problem. Section 3 is dedicated to the
presentation of Prince algorithm. Experimental results showing the utility of the
proposed approach are reported in section 4. The conclusion and future work
are presented in section 5.
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>Generic association rule basis extraction</title>
      <p>
        Since the apparition of the approach based on the extraction of frequent closed
itemsets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], several generic association rule bases were introduced among which
those of Bastide et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and which are defined as follows:
1. The generic basis for exact association rules is defined as follows:
Definition 1. Let F CIK be the set of frequent closed itemsets extracted from
the extraction context K. For each entry f in F CIK, let M Gf be the set of
its minimal generators. The generic basis for exact association rules GB is
given by: GB = {R: g ⇒ (f - g) | f ∈ F CI K and g ∈ M Gf and g = f (1)}.
2. The transitive reduction of the informative basis [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which is a basis for all
approximate association rules, is defined as follows(2):
Definition 2. Let F MGK be the set of frequent minimal generators
extracted from the extraction context K. The transitive reduction RI is given
by: RI = {R | R: g ⇒ (f - g) | f ∈ F CI K and g ∈ F MG K and g ≺ f (3)
and Conf(R) ≥ minconf }.
      </p>
      <p>
        In the remainder of the paper, we will refer to the generic association rules
formed by the couple (GB, RI). This couple is informative, sound and lossless
[
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] and the association rules forming it are referred as informative
association rules. Thus, given an Iceberg Galois lattice – in which each frequent
closed itemset is decorated by its list of minimal generators – the derivation of
these association rules can be performed straightforwardly. Indeed, approximate
generic association rules represent ”inter-node” implications, assorted with the
confidence measure, between two adjacent comparable equivalence classes, i.e.,
from a frequent closed itemset to another frequent closed itemset immediately
covering it. For example, referring to the Iceberg Galois lattice depicted by
Figure 1 (Right), the approximate generic association rule C0⇒ .75ADEF is generated
from both equivalence classes topped respectively by the frequent closed itemsets
”CE” and ”ACDEF”. Inversely, exact generic association rules are ”intra-node”
implications, with a confidence equal to 1, extracted from each node in the
partially ordered structure. For example, from the closed itemset ”ACDEFG”,
three exact generic association rules are obtained: AG⇒ CDEF, DG⇒ ACEF and
FG⇒ ACDE.
1 The condition g = f ensures discarding non-informative association rules of the form
g ⇒ ∅ .
2 The closure operator is noted .
3 The notation ≺ indicates that f covers g in the Iceberg Galois lattice.
      </p>
      <p>
        A B C D E F G
1 × × × × × ×
2 × × × × × ×
3 × × ×
4 × × × ×
5 × × × × × ×
In order to palliate the frequent closed itemset based algorithm insufficiencies,
i.e., the cost of the closure computation as well as neglecting the partial
order construction, we will introduce a new algorithm called Prince. Prince
highly reduces the cost of closure computation and generates the partially
ordered structure, which makes it able to extract straightforwardly generic
association rule bases without coupling it with another algorithm. Prince takes as
input an extraction context K where the items are sorted by lexicographic order,
the minimum threshold of support minsup and the minimum threshold of
confidence minconf. It outputs the list of frequent closed itemsets and their associated
minimal generators as well as the informative association rules formed by the
couple (GB, RI). Thus, Prince operates in three successive steps: (i) Minimal
generator determination (ii) Partial order construction (iii) Generic association
rule basis extraction.
Following the ”Test-and-generate” technique, Prince traverses the search space
by level to determine the set of frequent minimal generators F MGK sorted by
decreasing support values. F MGK is then considered as divided into several
subsets. Each subset represents a given support. Thus, each time that a
frequent minimal generator is determined, it is added to the subset representing its
support. Prince also keeps track of the negative border of minimal generators
GBd− (4) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In the second step, the set of frequent minimal generators will
serve as a backbone to construct the minimal generator lattice. As shown by the
following property, the union of F MGK and GBd− will be used, in the second
step, as a concise lossless representation of frequent itemsets:
Property 1. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] Let X be an itemset. If ∃ Z ∈ GB d− and Z ⊆ X then X is
infrequent. Otherwise, X is frequent and Supp(X) = min {Supp(g) | g ∈ F MG K
and g ⊆ X}.
4 An itemset belong to GBd− if it is an infrequent minimal generator and all its subsets
are frequent minimal generators.
Prince uses, in this step, the same pruning strategies introduced in Titanic
namely minsup, the ideal order of the frequent minimal generator set and the
estimated support. A trie is used to store the minimal generator set in order to
speed-up the extraction of information that will be later of use. The path from
the root to each node represents a minimal generator.
3.2
      </p>
      <sec id="sec-2-1">
        <title>Partial order construction</title>
        <p>
          In this step, the frequent minimal generator set F MGK will form a minimal
generator lattice, and this without any access to the extraction context. The main
idea is how to construct the partial order without computing itemset closures,
i.e., how guessing the subsumption relation by only comparing minimal
generators? To achieve this goal, the list of immediate successors(5) of each equivalence
class will be updated in an iterative way. The processing of the frequent minimal
generator set is done according to the order imposed in the first step (i.e., by
decreasing support values). Each frequent minimal generator g of size k (k ≥ 1) is
introduced into the minimal generator lattice by comparing it to the immediate
successors of its (k-1)-subsets(6). This is based on the isotony property of the
closure operator [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Indeed, let g1 , a (k-1)-itemset, be one of the subsets of g,
g1 ⊂ g ⇒ g1 ⊂ g . Thus, the equivalence class to which belongs g is a successor
(not necessarily an immediate one) of the equivalence class to which belongs g1 .
        </p>
        <p>While comparing g to the immediate successor list of g1 , noted L, two cases
are to be distinguished. If L is empty then g is added to L. Otherwise, g is
compared to the elements already belonging to L (cf. Proposition 1). The imposed
order in the first step allows to distinguish only two cases sketched by
Proposition 1 by replacing the frequent minimal generators X and Y by respectively g
and one of the elements of L.</p>
        <p>
          Proposition 1. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] Let X, Y ∈ F MG K, CX and CY their respective
equivalence classes:
a. If Supp(X) = Supp(Y ) = Supp(X ∪ Y ) then X and Y belong to the same
equivalence class.
b. If Supp(X) &lt; Supp(Y ) and Supp(X) = Supp(X ∪ Y ) then CX (resp. CY ) is
a successor (resp. predecessor) of CY (resp. CX ).
        </p>
        <p>The computation of the support of (X ∪ Y ) is performed in a direct manner
if (X ∪ Y ) belongs to F MGK ∪ GB d−. CX and CY are then incomparable.
Otherwise, Property 1 is applied. The support computation stops then as soon
as we find a minimal generator that is included in (X ∪ Y ) and has a support
strictly lower than that of X and that of Y . CX and CY are then incomparable.</p>
        <p>During these comparisons and to avoid redundant closure computations,
Prince introduces two complementary functions. These functions make it
possible to maintain the concept of equivalence class throughout processing. To this
5 By the term ”immediate successor”, we indicate a frequent minimal generator, unless
otherwise specified.
6 In the first step and for each k-candidate, links towards its (k-1)-subsets are stored
during the check of the ideal order property.
end, each equivalence class C will be characterized by a representative
itemset, which is the first frequent minimal generator introduced into the minimal
generator lattice. Both functions are described below:</p>
        <p>1. Manage-Equiv-Class : This function is used if a frequent minimal
generator, say g, is compared to the representative itemset of its equivalence class,
say R. The Manage-Equiv-Class function replaces all occurrences of g by R
in the immediate successor lists in which g was added. Then, comparisons to
carry out with g will be made with R. Thus, for each equivalence class, only its
representative itemset appears in the lists of immediate successors.</p>
        <p>2. Representative: This function makes it possible to find, for each frequent
minimal generator g, the representative R of its equivalence class in order to
complete the immediate successor list of Cg. This allows to manage only one
immediate successor list for all frequent minimal generators belonging to the
same equivalence class.</p>
        <p>The pseudo-code of the second step is given by the GenO-rder procedure
(Algorithm 1). Each entry, say g, in F MGK is composed by the following fields:
(i) support: the support of g (ii) direct-subsets : the list of (k-1)-subsets of g
(iii) immediates-uccs : the list of immediate successors of g. At the end of the
execution of the GenO-rder procedure, g.immediate-succs is empty if g is not
the representative itemset of its equivalence class or if g belongs to a maximal
equivalence class, i.e., not subsumed by any equivalence class. Otherwise, this
list will contain only representative frequent minimal generators.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Algorithm 1 Gen-Order</title>
        <p>Require: - F MGK.</p>
        <p>Ensure: - The elements of F MGK partially ordered in the form of a minimal generator
lattice.
1: for all (g ∈ F MG K) do
2: for all (g1 ∈ g.direct-subsets ) do
3: R = Representative(g1 );
4: for all (g2 ∈ R .immediate-succs ) do
5: if (g.support = g2 .support = Supp(g ∪ g2 )) then
6: ManageE-quivC-lass (g,g2 ); /*g, g2 ∈ C g and g2 is the representative of</p>
        <p>Cg*/
7: else if (g.support &lt; g2 .support and g.support = Supp(g ∪ g2 )) then
8: g is compared with g2 .immediate-succs ;
9: /*For the remainder of the element of R.immediate-succs , g is compared
only with each g3 | g3 .support &gt; g.support;*/
10: end if
11: end for
12: if (∀ g2 ∈ R .immediate-succs , Cg and Cg2 are incomparable) then
13: R.immediate-succs = R.immediate-succs ∪ { g};
14: end if
15: end for
16: end for
3.3</p>
      </sec>
      <sec id="sec-2-3">
        <title>Generic association rule basis extraction</title>
        <p>In this step, Prince extracts the valid informative association rules. For this
purpose and using Proposition 2, Prince finds the frequent closed itemset
corresponding to each equivalence class.</p>
        <p>
          Proposition 2. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] Let f and f1 be two closed itemsets such that f covers f1
in the Galois lattice LCK . Let M Gf be the set of minimal generators of f . The
closed itemset f can be composed as follows: f = ∪{ g|g ∈ M Gf } ∪ f1 .
        </p>
        <p>The traversal of the minimal generator lattice is carried out in an
ascending manner from the equivalence class whose frequent minimal generator is the
empty set(7) (denoted C∅ ) to the non subsumed equivalence class(es). If the
closure of the empty set is not null, the exact generic association rule between the
empty set and its closure is then extracted. Having the partial ordered structure
built, Prince extracts the valid approximate generic association rules between
the empty set and the frequent closed itemsets of the upper cover of C∅ . These
closures are found, by applying Proposition 2, using the minimal generators
of each equivalence class and the closure of the empty set. Equivalence classes
forming the upper cover of C∅ are stored which makes it possible to apply the
same process to them. By the same manner, Prince treats higher levels of the
minimal generator lattice until reaching the maximal equivalence class(es).</p>
        <p>
          The pseudo-code of this step is given by the procedure Gen-GRB (Algorithm
2). We use the same notations of the procedure Gen-Order to which we add the
field FCI to each element of F MGK. Thus, for each frequent minimal generator g,
this field allows to store the frequent closed itemset corresponding to Cg if g is its
representative. In the GenG-RB procedure, L1 indicates the list of equivalence
classes from which are extracted the valid informative association rules. By L2 ,
we note the list of equivalence classes which cover those forming L1 (8).
Example 1. Let us consider the extraction context K given by Figure 1 (Left)
for minsup=2 and minconf =0.5. The first step allows the determination of the
empty set closure, the sorted set F MGK and the negative border of minimal
generators GBd−. Thus, ∅ =E, F MGK = {(∅ ,5), (C,4), (D,4), (A,3), (B,3), (F,3),
(G,3), (CD,3), (AG,2), (BC,2), (BD,2), (DG,2), (FG,2)} and GBd−={(AB,1),
(BF,1), (BG,1), (BCD,1)}. During the second step, Prince processes the
element of F MGK by comparing each frequent minimal generator g, of size k (k ≥
1), with the immediate successor lists of its (k-1)-subsets. Since the list of
immediate successors of the empty set is empty, C is added to ∅ .immediate-succs .
Then, D is compared to C. Since CD is a minimal generator, CC and CD are then
incomparable and D is added to ∅ .immediate-succs . A is then compared to this
7 This class is called the Bottom element of the lattice [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. The corresponding closure
is calculated in the first step by collecting items appearing in all transactions of the
extraction context.
8 A test is carried out to check that an equivalence class does not belong to L2 . This
test consists in checking if the corresponding frequent closed itemset were already
calculated (Line 11 in Algorithm 2).
list. By comparing A to C, A.support &lt; C.support and A.support = Supp(AC)
and CA is then a successor of CC . A is added to C.immediate-succs without any
comparison since this list is still empty. A is also added to D.immediate-succs
since A.support &lt; D.support and A.support = Supp(AD). At this moment
of processing, we have ∅ .immediate-succs = {C,D} and B is added to this list
since CB is incomparable with CC (BC is a minimal generator) and CD (BD
is also a minimal generator). F is then introduced into the minimal
generator lattice by comparing it with the immediate successor list of its unique
0subset, i.e., the empty set. By comparing F to C, F.support &lt; C.support and
F.support = Supp(CF) and then CF is a successor of CC . F is then compared
to C.immediate-succs which contains A. F.support = A.support = Supp(AF)
and thus F ∈ C A whose A is the representative one. The Manage-Equiv-Class
function is then applied by replacing occurrences of F, in the immediate successor
lists, by A (in this case, there is no occurrence) and by continuing comparisons
with A instead of F (in this case, there are no more comparisons to do with
F). G is then compared to ∅ .immediate-succs equal to {C,B,D}. CG is a
successor of CC since G.support &lt; C.support and G.support = Supp(CG). After
comparing G with C.immediate-succs which only contains A, G is added to
C.immediate-succs since CG is incomparable with CA (AG is a minimal
generator). By comparing G to D (resp. B), CG is incomparable with CD (resp. CB) since
DG (resp. BG) is a minimal generator. Then, CD is compared to the immediate
successor lists of its 1-subsets, i.e., C and D. CC has CA and CG as immediate
successors. By comparing CD and A, CD is affected to CA since CD.support
= A.support = Supp(ACD). The Manage-Equiv-Class function is then
applied. In particular, comparisons to carry out with CD will be made with A. A is
then compared to the immediate successor list of the second 1-subset of CD, i.e.,
D. However, D.immediate-succs contains only A and the comparison process
stops. It is the same for the remainder of F MGK. Having the minimal
generator lattice built (cf. Figure 1 (Center)), an ascending sweeping is carried out
from C∅ . As ∅ =E, the exact generic association rule ∅ ⇒ E is then extracted.
∅ .immediates-uccs ={C,D,B}. The frequent closed itemset associated to CC is
then found and is equal to CE. The approximate generic association rule ∅ ⇒
CE, of a support equal to 4 and a confidence equal to 0.8, will be extracted.
It is the same for CD and CB. Using the same process and from CC , CD and
CB, the traversal of the minimal generator lattice is performed in an ascending
way until extracting all valid informative association rules. The resulting generic
association rule bases are sketched by Figure 2.
        </p>
        <p>Exact generic association rules
R1 : ∅ ⇒
R2 : C ⇒
R3 : D ⇒
R4 : B ⇒
R5 : A ⇒
R6 : F ⇒
R7 : CD ⇒</p>
        <p>E R8 : G ⇒
E R9 : BC ⇒
E R10 : BD ⇒
E R11 : AG ⇒
CDEF R12 : DG ⇒
ACDE R13 : FG ⇒</p>
        <p>AEF</p>
        <p>CE</p>
        <p>E
E
CDEF
ACEF
ACDE</p>
        <p>Approximate generic association rules
R14 : ∅ 0⇒.8CE R21 : D0⇒.5BE
R15 : ∅ 0⇒.8DE R22 : B0⇒ .66CE
R16 : ∅ 0⇒.6BE R23 : B0⇒ .66DE
R17 : C0⇒ .75ADEF R24 : A0⇒ .66CDEFG
R18 : C0⇒ .75GE
R19 : C0⇒.5BE</p>
        <p>R25 : F0⇒ .66ACDEG</p>
        <p>R26 : CD0⇒ .66AEFG</p>
        <p>R20 : D0⇒ .75ACEF R27 : G0⇒ .66ACDEF</p>
        <p>Fig. 2. Left: GB basis. Right: RI basis.
3.4</p>
      </sec>
      <sec id="sec-2-4">
        <title>Correctness and Computational cost</title>
        <p>In this section, we prove the correctness of Prince algorithm and we evaluate
its computational cost in the worst case.</p>
        <p>Theorem 1. (correctness) The Prince algorithm extracts all frequent
minimal generators and derives all frequent closed itemsets and all valid informative
association rules.</p>
        <p>Proof. During the first step, a minimal generator candidate c is pruned only
if its estimate support is equal to its actual support or if it does not verify
the ideal order of minimal generators. Otherwise, c is a minimal generator and
by comparing its actual support to minsup, Prince algorithm adds it to the
frequent minimal generator set F MGK or to the negative border of minimal
generators GBd−. Thus, at the end of the first step of Prince, all frequent
minimal generators are extracted in addition to the negative border of minimal
generators.</p>
        <p>During the second step, Prince takes care to introduce all frequent
minimal generators into the minimal generator lattice. Indeed, a frequent minimal
generator g is compared to the immediate successor list of all its (k-1)-subsets.
The Representative function allows to find the representative itemset of the
equivalence class of a (k-1)-subset of g. Once the representative found, the used
Proposition 1 treats both possible cases. The Manage-Equiv-Class function
is used only if g is compared to the representative of Cg. At the end of this step,
the minimal generator lattice is completely built.</p>
        <p>During the third step, all equivalence classes are taken in consideration when
deriving frequent closed itemsets and valid informative association rules.
Indeed, each equivalence class C, except C∅ , has at least one immediate predecessor.
Hence, the representative of C belongs at least to one immediate successor list of
another equivalence class, say C1 . When treating C1 , the frequent closed itemset
of C is derived and C is added to the equivalence class list from which valid
informative association rules will be derived in the next iteration. Thus, at the
end of this step, all frequent closed itemsets and all valid informative association
rules are derived.</p>
        <p>Proposition 3. (computational cost) In the worst case, the time complexity
of Prince is O((n3 + m) × 2n), where n (resp. m) is the number of distinct
items (resp. transactions) in the extraction context.</p>
        <p>Proof. The worst case is obtained when each extracted frequent itemset is a
frequent closed minimal generator. Thus, the frequent itemset lattice strictly
overlaps both the Iceberg Galois lattice and the minimal generator lattice. The
number of frequent closed minimal generators is then equal to 2n. We consider
that each transaction contains the n distinct items.</p>
        <p>During the first step, Prince performs two main tasks. The first task consists
in candidate support computations and is of order O(m × 2n). The second task
consists in trying to prune non-minimal generator candidates and it is done in
the order of O(n2×2n). The cost of the first step is then of order O((n2+m)×2n).</p>
        <p>During the second step, and for each frequent minimal generator g of size k,
Prince performs, in the worst case, O(k × (n − k)) comparisons ((k × (n − k))
will be over-estimated by n2). Indeed, the number of its (k-1)-subset is equal
to k. Each (k-1)-subset g1 has, in the worst case, (n − k) immediate successors
when comparing g with g1 .immediate-succs . Each comparison is performed
by making the union of g with an element of g1 .immediate-succs . The union
cost is O(n). The search of the support of the itemset, result of this union,
costs O(n) since it is a minimal generator. The cost of the second step is then
O((n + n) × n2 × 2n), i.e., O(n3 × 2n).</p>
        <p>During the third step, and for each equivalence class C, Prince performs two
main tasks. The first task consists in deriving the corresponding frequent closed
itemset f . This is carried out by performing the union of the set of frequent
minimal generators of f , containing only one element, and a frequent closed itemset
f1, which is an immediate predecessor of f . The first task then costs O(n). The
second task consists in deriving valid informative association rules. As each
frequent minimal generator is also closed, there is no exact generic association rules.
However, by fixing minconf to 0, there are k approximate generic association
rules, for an equivalence class whose frequent closed minimal generator is of size
k. To derive each approximate generic association rule, Prince performs the
difference between the frequent closed itemset f and the corresponding premise and
this costs O(n). The second task then costs O(k × n) (k will be over-estimated
by n). Hence, the cost of the third step is O((n + n2) × 2n), i.e., O(n2 × 2n).</p>
        <p>Thus, in the worst case, the time complexity of Prince is the sum of costs
of its three steps and is of order O((n3 + m) × 2n).</p>
        <p>
          It is important to mention that although Prince constructs the partial order,
its running time remains of the same order of magnitude as that of algorithms
dedicated to the extraction of frequent closed itemsets [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
4
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental results</title>
      <p>In this section, we shed light on Prince performances vs those of Close,
AClose and Titanic algorithms. Prince was implemented in the C language
using gcc version 3.3.1. All experiments were carried out on a PC with a 2.4 GHz
Pentium IV and 512 MB of main memory (with 2 GB of Swap) and running
S.u.s.e Linux 9.0.</p>
      <p>In all our experiments, all times reported are real times, including system
and user times, into benchmark datasets (dense and sparse(9)) and ”worst case”
datasets. Figure 3 (Left) summarizes the characteristics of benchmark datasets.
The definition of a ”worst case” context is given as follows:
Definition 3. A ”worst case” context is a context K = (O, A, R) where O
represents a finite set of objects (or transactions) of size (n+1), A is a finite set
of attributes (or items) of size n and R is a binary (incidence) relation (i.e.,
R ⊆ O × A ). Each object, among the first n ones, is verified by (n-1) distinct
attributes. The last object is verified by all attributes. Each attribute is checked
by n distinct objects.</p>
      <p>Thus, in a ”worst case” dataset, each closed itemset is equal to its (minimal)
generator. Hence, from a ”worst case” dataset of dimension equal to (n+1)×n, 2n
frequent closed itemsets can be extracted when minsup is fixed to 1 transaction.
Figure 3 (Right) presents an example of a ”worst case” dataset for n=4.
9 All these datasets are
http://fimi.cs.helsinki.fi/data.</p>
      <p>downloadable
on
the
following
address:
Dataset Type # items Avg. tr. size # transactions
Mushroom dense 119 23 8124
T40I10D100K sparse 1000 40 100000
1 × × ×
2 × × ×
3 × × ×
4 × × ×
5 × × × ×
Fig. 3. (Left) Benchmark dataset characteristics. (Right) A ”worst case” dataset for
n=4.</p>
      <p>Figure 4 shows execution times of Prince(10) algorithm compared to those
of Close, A-Close and Titanic algorithms.</p>
      <p>- Mushroom: In the case of Mushroom dataset, Prince performances are
better than those of Close, A-Close and Titanic for all minsup values, given
the important role played by equivalence class management functions. Indeed,
for a value of minsup equal to 0.1%, the number of frequent minimal generators
(equal to 360,166) is almost to 2.2 times the number of frequent closed itemsets
(equal to 164,117). Titanic performances decrease in a significant way due to
the extension attempts carried out for each frequent minimal generator. Indeed,
for minsup = 0.1%, 116 items, among 119, are frequent and the maximum size
of a frequent minimal generator is only equal to 10 items.</p>
      <p>- T40I10D100K: Prince performances for this dataset are largely better
than those of Close, A-Close and Titanic for all minsup values. Thus, Close
and A-Close are handicapped by a large average transaction size (40 items). In
the same way, Titanic performances regress considerably for the same reasons
previously evoked. The comparison cost for a frequent minimal generator, in
the case of Prince, being definitely more reduced than the intersection
operations performed in Close and A-Close and the extension attempts elaborated
in Titanic, explains the big gap between Prince performances and those of
remaining algorithms.</p>
      <p>- ”Worst case” datasets: For these experiments, minsup was fixed to 1
transaction. We tested 26 datasets showing the variation of n from 1 to 26. The
execution times of the four algorithms began to be distinguishable only starting
from the value of n equal to 15. The Prince algorithm performances remain
better than those of Close, A-Close and Titanic algorithms. Close and
Titanic executions stop for n=24 for lack of memory space. It is the same for
A-Close for n=25 and Prince for n=26. It is important to mention that the
partial order construction requires to store much more information than needed
when aiming only to extract frequent closed itemsets. Thus, the use of only one
trie to store information about all minimal generators instead of several tries,
as in Close, A-Close and Titanic algorithms(11), is an attempt aiming to
reduce the memory need of Prince algorithm.
10 The minconf value is set to 0.
11 Indeed, in the case of these three algorithms, a trie is used to save information about
each set of (frequent) minimal generators of size k.</p>
      <p>Mushroom</p>
      <p>Prince
Close
A-Close</p>
      <p>
        Titanic
In this paper, we proposed a new algorithm, called Prince, for an efficient
extraction of frequent closed itemsets and their respective minimal generators as
well as the generic association rule bases. To this end, Prince builds the partial
order contrary to the existing algorithms. A main characteristic of Prince
algorithm is that it relies only on minimal generators to build the underlying partial
order. Carried out experiments outlined that Prince largely outperforms
existing ”Test-and-generate” algorithms of the literature for both benchmark and
”worst case” contexts. In the near future, we plan to tackle two issues. Firstly,
we plan to study the possibility of integrating the work of Calders et al. [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] in
the first step of Prince. Indeed, this work can be applied to any set verifying
the property of ideal order such as the set of frequent minimal generators in our
case. Secondly, we propose to add constraints [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], so that the number of generic
association rules will be reduced while keeping the most interesting for the user.
Acknowledgements The authors are deeply grateful to Yves Bastide who
kindly accepted to provide source codes of Close, A-Close and Titanic
algorithms.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Pei</surname>
            ,
            <given-names>J</given-names>
            ., Han, J
          </string-name>
          .,
          <string-name>
            <surname>Mao</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nishio</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Closet: An efficient algorithm for mining frequent closed itemsets</article-title>
          .
          <source>In: Proceedings of the ACM-SIGMOD DMKD'00</source>
          , Dallas, Texas, USA. (
          <year>2000</year>
          )
          <fpage>21</fpage>
          -
          <lpage>30</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Efficient Mining of Association Rules Using Closed Itemset Lattices</article-title>
          .
          <source>Journal of Information Systems</source>
          <volume>24</volume>
          (
          <year>1999</year>
          )
          <fpage>25</fpage>
          -
          <lpage>46</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <article-title>Restructuring lattices theory: An approach based on hierarchies of concepts. I. Rival, editor</article-title>
          ,
          <source>Ordered Sets</source>
          , Dordrecht-Boston (
          <year>1982</year>
          )
          <fpage>445</fpage>
          -
          <lpage>470</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>BenYahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguifo</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          : Approches d'extraction de r`
          <article-title>egles d'association bas´ees sur la correspondance de Galois</article-title>
          . In Boulicault,
          <string-name>
            <given-names>J.F.</given-names>
            ,
            <surname>Cremilleux</surname>
          </string-name>
          , B., eds.:
          <article-title>Revue d'Ing´enierie des Syst`emes d'Information (ISI), Herm`es-Lavoisier</article-title>
          . Volume
          <volume>9</volume>
          . (
          <year>2004</year>
          )
          <fpage>23</fpage>
          -
          <lpage>55</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Touil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Discovering frequent closed itemsets</article-title>
          . In Beeri,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Buneman</surname>
          </string-name>
          , P., eds.
          <source>: Proceedings of 7th International Conference on Database Theory (ICDT'99)</source>
          , LNCS, volume
          <volume>1540</volume>
          , Springer-Verlag, Jerusalem,
          <string-name>
            <surname>Israel.</surname>
          </string-name>
          (
          <year>1999</year>
          )
          <fpage>398</fpage>
          -
          <lpage>416</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Stumme</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Computing iceberg concept lattices with Titanic</article-title>
          .
          <source>Journal on Knowledge and Data Engineering (KDE) 2</source>
          (
          <year>2002</year>
          )
          <fpage>189</fpage>
          -
          <lpage>222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Zaki</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hsiao</surname>
            ,
            <given-names>C.J.:</given-names>
          </string-name>
          <article-title>Charm: An efficient algorithm for closed itemset mining</article-title>
          .
          <source>In: Proceedings of the 2nd SIAM International Conference on Data Mining</source>
          , Arlington, Virginia, USA. (
          <year>2002</year>
          )
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Uno</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiyomi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arimura</surname>
          </string-name>
          , H.:
          <article-title>LCM ver. 2: Efficient mining algorithms for frequent/closed/maximal itemsets</article-title>
          . In Goethals,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Zaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.J.</given-names>
            ,
            <surname>Bayardo</surname>
          </string-name>
          , R., eds.
          <source>: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI'04)</source>
          . Volume 126 of CEUR Workshop Proceedings, Brighton, UK. (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lucchesse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orlando</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perego</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>DCI-Closed : a fast and memory efficient algorithm to mine frequent closed itemsets</article-title>
          . In Goethals,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Zaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.J.</given-names>
            ,
            <surname>Bayardo</surname>
          </string-name>
          , R., eds.
          <source>: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI'04)</source>
          . Volume 126 of CEUR Workshop Proceedings, Brighton, UK. (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>BenYahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cherif</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mineau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaoua</surname>
          </string-name>
          , A.:
          <article-title>D´ecouverte des r`egles associatives non redondantes : application aux corpus textuels. Revue d'Intelligence Artificielle (special issue of Intl. Conference of Journ´ees francophones d'Extraction et Gestion des Connaissances (EGC'</article-title>
          <year>2003</year>
          )), Lyon, France
          <volume>17</volume>
          (
          <year>2003</year>
          )
          <fpage>131</fpage>
          -
          <lpage>143</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stumme</surname>
          </string-name>
          , G.:
          <article-title>Mining minimal non-redundant association rules using frequent closed itemsets</article-title>
          .
          <source>In: Proceedings of the International Conference DOOD'</source>
          <year>2000</year>
          , LNAI, volume
          <year>1861</year>
          , Springer-Verlag, London, UK. (
          <year>2000</year>
          )
          <fpage>972</fpage>
          -
          <lpage>986</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Kryszkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Concise representations of association rules</article-title>
          . In Hand, D.J.,
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bolton</surname>
          </string-name>
          , R., eds.
          <source>: Proceedings of Exploratory Workshop on Pattern Detection and Discovery in Data Mining (ESF)</source>
          ,
          <year>2002</year>
          , LNAI, volume
          <volume>2447</volume>
          , SpringerVerlag, London, UK. (
          <year>2002</year>
          )
          <fpage>92</fpage>
          -
          <lpage>109</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kryszkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Concise representation of frequent patterns based on disjunctionfree generators</article-title>
          .
          <source>In: Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM)</source>
          , San Jose, California, USA. (
          <year>2001</year>
          )
          <fpage>305</fpage>
          -
          <lpage>312</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Davey</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Priestley</surname>
          </string-name>
          , H.:
          <article-title>Introduction to Lattices and Order</article-title>
          . Cambridge University Press (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Hamrouni</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>BenYahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Slimani</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Prince : Extraction optimis´ee des bases g´en´eriques de r`egles sans calcul de fermetures</article-title>
          .
          <source>In: Proceedings of the 23rd International Conference INFORSID</source>
          ,
          <string-name>
            <surname>Inforsid</surname>
            <given-names>Editions</given-names>
          </string-name>
          , Grenoble, France. (
          <year>2005</year>
          )
          <fpage>353</fpage>
          -
          <lpage>368</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal Concept Analysis</source>
          . Springer-Verlag (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Pasquier</surname>
          </string-name>
          , N.:
          <string-name>
            <surname>Datamining: Algorithmes d'</surname>
          </string-name>
          extraction et de r´
          <article-title>eduction des r`egles d'association dans les bases de donn´ees</article-title>
          . Th`ese de doctorat,
          <source>Ecole Doctorale Sciences pour l'Ing´enieur de Clermont Ferrand</source>
          , Universit´e
          <string-name>
            <surname>Clermont Ferrand II</surname>
          </string-name>
          , France (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Calders</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goethals</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Mining all non-derivable frequent itemsets</article-title>
          . In Elomaa, T.,
          <string-name>
            <surname>Mannila</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toivonen</surname>
          </string-name>
          , H., eds.
          <source>: Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery</source>
          ,
          <string-name>
            <surname>PKDD</surname>
          </string-name>
          <year>2002</year>
          ,
          <article-title>LNCS</article-title>
          , volume
          <volume>2431</volume>
          , Springer-Verlag, Helsinki, Finland. (
          <year>2002</year>
          )
          <fpage>74</fpage>
          -
          <lpage>85</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Bonchi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucchese</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>On closed constrained frequent pattern mining</article-title>
          .
          <source>In: Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM'04)</source>
          , Brighton, UK. (
          <year>2004</year>
          )
          <fpage>35</fpage>
          -
          <lpage>42</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>