<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>GGeennAAllll AAllggoorriitthhmm:: DDeeccoorraattiinngg GGaallooiiss llaattttiiccee wwiitthh mmiinniimmaall ggeenneerraattoorrss</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>S. Ben Tekaya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Ben Yahia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Y. Slimani Sondess Ben Tekaya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sadok Ben Yahia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yahya Slimani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>D ́epartement des Sciences de l'Informatique, Facult ́e des Sciences de Tunis Dep ́artement desCSacmiepnucessUdnei vl'eIrnsfiotarmirea</institution>
          ,
          <addr-line>ti1q0u6e0, FTaucnuislet, ́ dTeusnSiscieie.nces de</addr-line>
          <country>Tunis</country>
        </aff>
      </contrib-group>
      <fpage>166</fpage>
      <lpage>178</lpage>
      <abstract>
        <p>The problem of relevance and the usefulness of extracted association rules is becoming of primary importance, since an overwhelming number of association rules may be derived. This paper proposes an algorithm, called GenAll, to build a formal concept lattice, in which each formal concept is ”decorated” by its minimal generators. The main characteristic of this algorithm is to use a refinement process of upper cover lists to determine, in a simultaneous manner, the set of formal concepts, their underlying partial order and the set of minimal generators associated to each formal concept. Experimental results have showed that the proposed algorithm is specially efficient for dense formal contexts compared to that of Nourine et al.. Response times pointed out by GenAll algorithm largely outperform those of Nourine et al..</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data Mining is a discipline which aims to discover regularities, in voluminous
datasets, commonly expressed by association rules [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Several studies in
particular underlined the prohibitive number of association rules drawn from even
reasonably sized datasets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This fact bootstrapped the development of more
acute techniques or methods to reduce the size of the reported rule sets. In
this context, the battery of results provided by the Formal Concept Analysis
(FCA) permitted to define ”irreductible” nucleus of association rule subset
better known as generic bases. These bases constitute reduced sets of informative
rules allowing to preserve the most relevant rules, without loss of information.
The generation of these informative association rules relies on the extraction of
formal concepts, its associated minimal generators and the underlying partial
order [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In this paper, our main claim is that there is no single algorithm
allowing to obtain the generic base of association rules. In fact, a critical survey
of dedicated literature pointed out the following remarks:
1. Oriented Data Mining algorithms generate the set of closed itemsets1
and the set of associated minimal generators [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4–6</xref>
        ]. The underlying order
      </p>
      <sec id="sec-1-1">
        <title>1 or equivalently the set of formal concept intents.</title>
        <p>
          between closed itemsets is badly missing or not of interest. To obtain the
underlying partial relation, one can use the algorithm proposed by Valtchev
et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] to construct the covering graph.
2. Oriented formal concept algorithms generate the set of formal concepts
and the underlying order [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Here, the set of associated minimal generators
is missing. In this case, we can apply the JEN algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to catch out
minimal generators given that the covering graph is already generated.
In this paper, we propose a new algorithm, called GenAll, merging the above
mentioned views. Indeed, aiming to derive generic bases of association rules,
we propose to build the graph in which each formal concept is decorated by
its associated minimal generators. As a starting point, we choose to improve the
algorithm presented by Nourine et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. This choice can be justified by the fact
that the latter presents the best theoretical complexity, even though practical
experiments highlighted that it is the worst compared to other algorithms [
          <xref ref-type="bibr" rid="ref11 ref8">8, 11</xref>
          ].
Hence, a careful review of Nourine et al. algorithm showed that it operates in two
steps: An original approach, based on a special trie, to discover and store formal
concepts. A costly repeated access to originally resident disk to compute the
underlying order between formal concepts. However, the bad results of Nourine
et al. algorithm are not a fatality and we feel that its bad performances can be
largely improved using the GenAll algorithm proposed in this paper.
        </p>
        <p>The remainder of the paper is organized as follows: Section 2 briefly sketches
the basic FCA constructs and stresses on the link between the notion of
minimal blocker and minimal generator. Section 3 discusses in depth the Genall
algorithm. Results of experiments carried out on benchmarking datasets are
reported in Section 4. Finally, section 5 concludes this paper and points out some
research directions for future work.
2</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Basic notions</title>
      <p>Formal Concept: Let us consider an formal context K = (O, I, R), where O
is a set of objects, I is a set of attributes (or items) and R is a binary relation
between the objects and the attributes (R ⊆ O × I). Each couple (o, a) ∈ R
expresses that the transaction o ∈ O contains the attribute a ∈ I. Within a
context, objects are denoted by numbers and attributes by letters.</p>
      <p>We define two functions summarizing links between subsets of objects and
subsets of attributes induced by R, that map sets of objects to sets of attributes
and vice versa.</p>
      <p>
        Thus, for a set O ⊆ O, we define: φ(O) = {a|∀o, o ∈ O ⇒ (o, a) ∈ R} and for
a set A ⊆ I, ψ(A) = {o|∀a, a ∈ A ⇒ (o, a) ∈ R}. Both functions φ and ψ form
a Galois connection between the sets P (I) and P (O) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Consequently, both
compound operators of φ and ψ are closure operators, in particular ω = φ ◦ ψ is
a closure operator.
      </p>
      <p>
        A formal concept is a pair CK = (O, A), where O is called extent, and A is a
closed itemset2, called intent. Furthermore, both O and A are related through
the Galois connection, i.e., φ(O) = A and ψ(A) = O. Let an itemset A ⊆ I, the
support of the itemset A in the context K is defined by: support(A) = |ψ(A)| .
|O|
Face: Let CK = (O, A) be a formal concept and let predi(CK) be the ith
immediate predecessor of CK in a Galois lattice extracted from a context K. The ith
face of the formal concept CK corresponds to the difference between its intent
and the intent of its ith predecessor [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Let p be the number of immediate predecessors of the formal concept CK. The
family of faces FC of the formal concept CK is expressed by the following
relation: FC = {A − KIntent(predi(CK)}, i ∈ {1, . . . , p} [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. For example, according
to the foKrmal concept lattice presented on the figure 3, F(abde,23) = {b, e}.
Minimal blocker: Let G = {G1, ..., Gn} be a family of n sets. KA blocker B of
the family G is a set where the intersection with all sets Gi ∈ G is not empty
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. A blocker B of family of G = {G1, ..., Gn} is said to be minimal if ∃B1 ⊂ B
and ∀Gi ∈ G, B1 ∩ Gi = ∅ [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Given the family of faces F(abde,23) = {b, e}, the
minimal blocker B = {be} where the intersection with the two fKaces is always
different of the empty set.
      </p>
      <p>
        Generator: Let CK be a formal concept and FC its family of faces. The set
G of the generators associated with the intent AK of the formal concept CK,
corresponds to the minimal blockers associated to the family of faces FC [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Equivalently, An itemset g ⊆ I is called minimal generator of a formal coKncept
CK = (O, A), if and only if ω(g) = A and ∃g1 ⊂ g such that ω(g1) = A [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
Minimal association rules: An association rule is an assorted statistical
metric implication between itemsets of the form r : X ⇒ (Y − X ) in which X and Y
are frequent itemsets (their supports are at least equal to a minimal threshold,
called minsup), and X ⊂ Y . Itemsets X and (Y − X ) are called, respectively,
antecedent and conclusion of the rule r. The valid association rules are those
whose measure of confidence Conf (r) = ssuuppppoorrtt((XY)) is greater than or equal to
the minimal threshold of confidence, named minconf . An association rule whose
confidence is equal to 1 is called exact association rule. Otherwise it is called
approximative association rule. Bastide et al. characterized what they called ”the
generic base for exact association rules” (adapting the global implication base
of Guigues and Duquenne [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). The generic base for exact association rules is
defined as follows :
Trie3: It is a research tree, whose elements are stored in a condensed way. The
”trie” used by the Nourine algorithm presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is a special trie, whose
edges are labelled by attributes of formal concepts, and only some nodes are
labelled by the extents of formal concepts. The way going from the root node
towards a labelled node by an extent form a formal concept. As shown in Figure
1, (acde, 34) is a formal concept.
      </p>
      <sec id="sec-2-1">
        <title>2 an itemset is a set of items or attributes.</title>
      </sec>
      <sec id="sec-2-2">
        <title>3 From reTrieval.</title>
        <p>2 3
2 3 4
e</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The proposed GenAll algorithm</title>
      <p>In this section, we present a new algorithm, called GenAll, to derive generic
bases of association rules. To do so, it gathers in a single pass all the required
informations which are : the set of formal concepts and their associated minimal
generators, the underlying partial order.</p>
      <p>
        The GenAll algorithm is inspired in particular from the first part of the
Nourine et al. algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], stressing on the construction of the formal concept
trie. The choice of this algorithm is motivated by the fact that it presents the
best theoretical complexity [
        <xref ref-type="bibr" rid="ref10 ref11 ref8">8, 10, 11</xref>
        ]. In addition to its linear complexity, the
originality of this algorithm resides in its way to discover formal concepts and
their storage in a special trie.
      </p>
      <p>However, this algorithm presents some disadvantages, that we can summarize
as follows:
1. A costly systematic back to the dataset, originally resident disk, to compute
the covering graph of the formal concepts,
2. Ignore the determination of minimal generators associated to each formal
concept.</p>
      <p>To avoid these disadvantages, we present a new algorithm, relying on the
construction of a trie to calculate on the fly, the set of formal concepts, their
associated minimal generators and the partial order between these formal concepts.
Notations used by GenAll algorithm are given in Table 1, while the
pseudocode is illustrated by Algorithm 1. In each iteration, the algorithm builds the
set of formal concepts, the set of candidate minimal generators and a potential
order, which has to be later refined, between the already discovered formal
concepts. Each transaction is passed only once. Each iteration consists of two steps.
The first calculates the formal concepts and generates a potential order between
formal concepts, whereas the second aims to refine the order and to compute the
associated minimal generators. These two steps are discussed in the following.
Algorithm 1 Construction of formal concept lattice decorated by the minimal
generators
1: Algorithm GenAll</p>
      <p>Input : Formal context K
Output : Lexicographic tree of F(trie), ImmSucc of each formal concept, list gen
Begin
2: F = (∅, I)</p>
      <p>{/* step 1: Generation of formal concepts */}
3: For Each transaction t ∈ K Do
4: L = ∅
5: For Each concept Ci ∈ F Do
6: C.intent = Ci.intent ∩ t
7: If C.intent ∈ F Then
8: F = F ∪ C
9: C.extent = Ci.extent ∪ t.T ID
10: C.ImmSucc = {t} ∪ {Ci} \ {C}
11: L = L ∪ C {/* Sorted insertion */}
12: Else
13: C.extent = C.extent ∪ Ci.extent ∪ t.T ID
14: If C ∈ L Then
15: L = L ∪ C {/* Sorted insertion */}
16: End If</p>
      <p>{/* Updating C.ImmSucc */}
17: C.LP = {t} ∪ {Ci} \ {C}
18: For Each Pi ∈ C.LP Do
19: For Each Succ ∈ C.ImmSucc Do
20: Comparec-oncept (Succ, Pi)
21: End For
22: End For
23: End If
24: End For{/* Refinement of immediate successor list and determination of minimal
generators*/}
25: For Each concept Ci ∈ L Do
26: Ci.ImmSucc = findS-ucc (Ci, L)
27: For Each Succ ∈ Ci.ImmSucc Do
28: f ace = Succ \ Ci
29: For Each f acei ∈ Succ.list f ace Do
30: Succ.list f ace = comparef-ace (f ace, f acei)
31: End For
32: End For
33: End For
34: End For</p>
      <sec id="sec-3-1">
        <title>Structure Fields</title>
      </sec>
      <sec id="sec-3-2">
        <title>Description</title>
        <p>F
L
LP
C</p>
        <p>Family of the formal concepts.</p>
        <p>Lists of formal concepts calculated in an iteration i.</p>
        <p>Intermediary list of formal concepts.</p>
        <p>Formal concept.</p>
        <p>Intent of the formal concept</p>
        <p>Extent of the formal concept
intent
extent
list gen
ImmSucc List of immediate successors of the formal concept
list f ace List of faces of the formal concept</p>
        <p>List of the minimal generators of the formal concept</p>
        <p>Table 1. Notations
3.1</p>
        <p>Generation of formal concepts
In this step (lines 4-24), we start by initializing the L list with the empty set. This
list will be useful for the refinement of the set of the immediate successors of each
formal concept found in an iteration. To calculate the set of formal concepts, we
perform an intersection between the intent of each formal concept of the family
F and each transaction of the dataset. Two cases are distinguishable:
1. The intent does not exist in the family F (a new formal concept is
found): it must then be added to the family F . Then, the associated formal
concept extent is calculated (line 9). We can notice that all the transactions
of the formal context are formal concepts. Hence, the result of the
intersection of the transactions with the formal concept where its intent is equal to
the set of attributes of the formal context, is equal to the attributes of the
transaction.</p>
        <p>Potential immediate successors of this new formal concept are firstly
initialized with the formal concept utilized to generate it (line 10). Indeed, to
determine potential immediate successor of a formal concept, we should
distinguish two particular cases:
If the generated formal concept is equal to the transaction, then only the
formal concept used in this intersection can be an immediate successor.
Otherwise both the transaction and the formal concept are considered as
potential immediate successors of the formal concept. Thereafter, this new formal
concept is added to the list L (line 11). It is important to mention that this
insertion is performed by maintaining an ascending order of formal concept
intents cardinality.
2. The intent already exists in the family F : the formal concept extent
(line 13) should be updated, and we check whether the concept is already
existing in the list L (lines 14-15). Aiming to update the immediate
successors list ImmSucc of the formal concept, and given that for each formal
concept we maintain an ImmSucc list, we build a list LP containing the
formal concepts used to generate it. This list is necessary, to be able to
make comparisons and to update the ImmSucc list. Indeed, for each
element in LP and for each element in ImmSucc list, we test the inclusion
of these formal concepts using the Comparec-oncept function (line 20).
The Comparec-oncept function is applied to update the ImmSucc list of
the formal concepts under consideration. This list has to be modified in two
cases: the first case, the element of LP is smaller (in term of inclusion) than
one element of ImmSucc. In this case, the old immediate successor will be
replaced by the new one. The second case, the two elements are
incomparable: a new successor will then be added to the ImmSucc list of the formal
concept under concern.
3.2</p>
        <p>Refinement of immediate successor list and determination of
minimal generators
We notice that the formal concepts, found in an iteration, represent a branch of
the lattice, i.e., for each formal concept (except the largest one), we find another
formal concept calculated in the same iteration, which covers it. For that, we
traverse the L list of formal concepts found in an iteration and for each formal
concept we call the findS-ucc function (lines 25-26). This function is based
on the fact that a formal concept of cardinality n (i.e., the length of its intent
is equal to n) is at least covered by a formal concept of cardinality (n + 1)
or higher. For that, and since the L list is sorted according to the cardinality’s
intent ascending order, we seek for each formal concept of cardinality n, a formal
concept which covers it in the list of cardinality (n + 1). In the event of defect, we
pass to the cardinality (n + 2), until we find a minimal covering formal concept.
Then, we compare these formal concepts to update the ImmSucc list.</p>
        <p>Once the list of the immediate successors (ImmSucc) of the current formal
concept is built, we traverse this list and for each immediate successor we have
to compute the set of its minimal generators. To obtain such result, we calculate
the associated faces of successors (line 28), then we compare them to the
corresponding list of faces by calling the Comparef-ace function (line 30). The set
of faces list f ace of the formal concept (immediate successor) can be modified
in two cases as illustrated by the following:
1. The f ace is smaller (in term of inclusion) than an element of list f ace: in
this case, we calculate the dif f erence f ace, and we replace the old face by
the new one. If the obtained dif f erence f ace does not exist in one of the
faces of this formal concept, then we remove each generator containing this
difference. This removal is necessary, since a non existing attribute in the
faces cannot exist in the generators.
2. The f ace is incomparable with all the faces of list f ace: in this case, it will
be added to list f ace.</p>
        <p>
          When the faces list is updated, the ComputeM-inb-lockers function of
the JEN algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] is applied for determining the minimal generators.
Example
Let us consider the Formal context illustrated by Figure 3 with the set of
attributes I = {a, b, c, d, e, f } and the set of transactions of the formal context,
denoted from 1 to 4. Firstly, the family F is initialized with the set of attributes
R
1
2
3
4
a
×
×
×
×
b
×
×
×
c
×
×
d
×
×
×
×
e f
×
×
× ×
Fig. 2. Formal context K.
        </p>
        <p>f
(acdef,4)</p>
        <p>bf
(acde, 3 4)</p>
        <p>e
(ade, 2 3 4)
(abcdef,)
be
(abcde,3)
(abde,2 3)</p>
        <p>b
(abd, 1 2 3)
(line 2). It present the top formal concept of the lattice.</p>
        <p>First step: Generation of formal concepts
Each formal concept of F is handled individually. Let us consider the first
transaction {abd}:
Transaction 1:
The application of the intersection operation with the single formal concept of
the family F gives raise to a new formal concept with an intent equal to {abd}.
The extent of this new formal concept, denoted C2, will be later computed.
Thus, the family F is updated by adding C2:
F = {(abcdef, ∅), (abd, ∅)}.</p>
        <p>At first, the set of immediate successor of C2 is initialized with C1 = (abcdef, ∅)
and the extent of C2 is initialized with union of C1.extent and the transaction
ID under concern.</p>
        <p>Hence, C2.extent = {1} and C2.ImmSucc = {(abcdef, ∅)} and the L list is
initialized with {(abd, 1)}.</p>
        <p>
          Second step: Refinement of immediate successor list and
determination of minimal generators
For the single formal concept in L = {(abd, 1)} we haven’t to update the
ImmSucc list of this concept, and (abd, 1).ImmSucc = {(abcdef, ∅)}. For each
immediate successor we compute its minimal generators set using the
ComputeMinb-lockers function of the JEN algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>The f ace of the formal concept (abcdef, ∅) = cef . Then C1.listgen = {c, e, f }.</p>
        <p>Transaction 2:
Let now,consider the second transaction {abde}:
{abcdef } ∩ {abde} = {abde}, {abde} is an intent of a new formal concept.
F = {(abcdef, ∅), (abd, 1), (abde, ∅)}, C3.extent = {2}, then:
F = {(abcdef, ∅), (abd, 1), (abde, 2)}
C3.ImmSucc = {(abcdef, ∅)} and L = {(abde, 2)}.</p>
        <p>These process above is repeated for the second formal concept in the family F :
{abd} ∩ {abde} = {abd}, this set is an intent of an existing formal concept. Then
we update the extent of this formal concept (line 13): C2.extent = {12} and
L = {(abd, 12), (abde, 2), }
C2.LP = {abde}. Using Comparec-oncept function we have {abde} ⊂ {abcdef }
then we update C2.ImmSucc = {abde}.</p>
        <p>The second step is performed and we obtain:
C2.ImmSucc = {(abde, 2)} and C3.ImmSucc = {(abcdef, ∅)}.</p>
        <p>At the end of the thirst transaction we obtain:
F = {(abcdef, ∅), (abd, 123), (abde, 23), (abcde, 3)}</p>
        <p>C2.ImmSucc = {(abde, 23)}
C4.ImmSucc = {(abcdef, ∅)}
C1.list gen = {f }
C3.list gen = {e}</p>
        <p>C3.ImmSucc = {(abcde, 3)}
C2.list gen = {a, b, d}</p>
        <p>C4.list gen = {c}</p>
        <p>Transaction 4= {acdef }
At the first step we obtain:
F = {(abcdef, ∅), (abd, 123), (abde, 23), (abcde, 3), (acdef, 4), (ad, 1234),
(ade, 234), (acde, 34)}
C5.ImmSucc = {(abcdef, ∅)} C6.ImmSucc = {(abd, 123), (acdef, 4)}
C7.ImmSucc = {(abde, 23), ((acdef, 4)} C8.ImmSucc = {(abcde, 3), (acdef, 4)}
L = {(ad, 1234), (ade, 234), (acde, 34), (acdef, 4)}. Let consider the first formal
concept in L list, (ad, 1234). The later formal concept, has as immediate
successor the formal concept (acdef, 4). But we find in L list, a formal concept
with cardinality 3 who can cover the formal concept (ad, 1234). Then we replace
(acdef, 4) by (ade, 234) because {ade} ⊂ {acdef }. The same process is done for
the other formal concepts. We have at the end:
C5.ImmSucc = {(abcdef, ∅)}
C7.ImmSucc = {(abde, 23), ((acde, 34)}
C1.list gen = {bf }
C3.list gen = {be}
C5.list gen = {f }
C7.list gen = {e}
C6.ImmSucc = {(abd, 123), (ade, 234)}
C8.ImmSucc = {(abcde, 3), (acdef, 4)}
C2.list gen = {b}
C4.list gen = {bc}
C6.list gen = {a, d}</p>
        <p>C8.list gen = {c}</p>
        <p>Figure 2 depicts the formal concept lattice associated to the formal context
K, presented in Figure 3. A sketch of some minimal generators, indicated by
dotted lines, are decorating formal concepts nodes.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental results</title>
      <p>We implemented the Nourine et al. and GenAll algorithms in C on a Linux
platform with a Mandrake distribution to assess their relative performances.
Both algorithms used the same data structure. Our experiments were carried
out on Pentium IV with CPU clock rate of 2Ghz and 512Mb of main memory.
To examine the practical efficiency of our algorithm, we run experiments on real
and synthetic datasets, whose characteristics are detailed in what follows.
4.1</p>
      <p>Test data
Hence the results depend on the dataset density, algorithms were tested on
two types of datasets: synthetic data, which mimic market basket, and dense
datasets, which belong to the domain of statistical databases. All these test
datasets are freely available on Internet4. Chess base is derived from steps of
chess game. Typically, real datasets are very dense. We also chose some synthetic
databases. Generally, synthetic datasets are sparse compared to real ones.
4.2</p>
      <p>Computational experiments
In the sequel, we assess performances of Nourine et al. and GenAll algorithms
both on sparse and dense datasets.</p>
      <p>
        Case of sparse datasets: For this type of datasets, the list of successors
associated to a given formal concept changes frequently. Thus, comparisons of all
the list of immediate successors will be carried out. Given that the successor list
of the formal concept is long, in sparse datasets compared to that of the dense
datasets, the step of updating the immediate successor list consumes more time
comparatively to the Nourine et al. algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], in which this comparison is
not performed. Indeed, this algorithm calculates, in each step, only the
immediate predecessor list of each formal concept. Hence,an important factor in the
performance of the algorithm is the average length of the transaction. In fact,
the smaller the number of items per transaction is, the faster the GenAll
algorithm, for determining formal concepts and their underlying order.
Case of dense datasets: The number of reported formal concepts is by far
more important than that obtained for sparse datasets. This will slacken the
corresponding execution time compared to that of sparse datasets. From the
point of view of execution time, we noticed that the list of immediate successors
does not change frequently as it is the case for sparse datasets.
4.3
      </p>
      <p>
        Relative performances of Nourine and GenAll algorithms
In what follows, we will put the focus on comparing performances pointed out
by GenAll algorithm and those obtained by Nourine et al. algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], in
4 http://fimi.cs.helsinki.fi/data
the context of formal concept discovery and their underlying partial order. The
behavior of both algorithms is somehow different for different levels of density.
      </p>
      <p>According to Figure 5, we note that the GenAll algorithm largely
outperforms Nourine et al. algorithm, especially on dense datasets. Indeed, the time
ratio is very large (the average is 40 and sometimes reaches 83). However,
acT10I4D1K</p>
      <p>GenAll</p>
      <p>Nourine
3000
)c 2500
e
(se 2000
itnm 1500
tuo 1000
i
c
e
xE 500
0 0 200 400 600 800 1000
|O| (# transactions)
Kosarak</p>
      <p>GenAll</p>
      <p>
        Nourine
7000
)c 6000
(se 5000
iem 4000
t
ino 3000
tceu 2000
xE 1000
0 0 200 400 600 800 1000120014001600
|O| (# transactions)
Fig. 4. Execution time of Nourine et al. and GenAll algorithms (case of sparse
dataset)
cording to Figure 4, Nourine et al. algorithm and GenAll show similar
performances on sparse datasets. The behavior of the sparse dataset Kosarak,
compared to the remaining sparse datasets is noteworthy. In fact, the execution time
of the GenAll algorithm [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] on the Kosarak base is slightly lower than that
of Nourine et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
We proposed a new algorithm for the construction at the same time the set of
formal concepts, its underlying order and the determination of the set of minimal
generators associated to each formal concept. GenAll algorithm [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] performs
only one scan on the datasets. We conduced a comparison of performances of
two algorithms : GenAll and Nourine et al. algorithms. The results shows that
GenAll outperforms largely Nourine et al. algorithm on dense datasets.
      </p>
      <p>By using the algorithm output, the derivation of the generic bases for the rules
of association can be performed in a straightforward manner. In the near future,
we plan to tackle two issues. Firstly, we try to improve Genall performances by
using advanced data structures to reduce the retrieval operations cost. Secondly,
we propose to examine the potential benefits of a parallel version of GenAll
algorithm on a MIMD machine (IBM SP2 with 32 processors).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skirant</surname>
          </string-name>
          , R.:
          <article-title>Fast algorithms for mining association rules</article-title>
          .
          <source>In: Proceedings of the 20th International Conference on Very Large Databases</source>
          , Santiago,
          <string-name>
            <surname>Chile.</surname>
          </string-name>
          (
          <year>1994</year>
          )
          <fpage>478</fpage>
          -
          <lpage>499</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Zaki</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Mining Non-Redundant Association Rules. Data Mining and Knowledge Discovery (</article-title>
          <year>2004</year>
          )
          <fpage>223</fpage>
          -
          <lpage>248</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>BenYahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguifo</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          : Approches d'extraction de r`
          <article-title>egles d'association bas´ees sur la correspondance de Galois. Ing´enierie des Syst`emes d'Information (ISI)</article-title>
          ,
          <source>Herm`es-Lavoisier 3-4</source>
          (
          <year>2004</year>
          )
          <fpage>23</fpage>
          -
          <lpage>55</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zaki</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hsiao</surname>
            ,
            <given-names>C.J.: CHARM</given-names>
          </string-name>
          :
          <article-title>An efficient algorithm for closed itemset mining</article-title>
          .
          <source>In: Proceedings of the 2nd SIAM International Conference on Data Mining</source>
          , Arlington. (
          <year>2002</year>
          )
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pei</surname>
            ,
            <given-names>J</given-names>
            ., Han, J
          </string-name>
          .,
          <string-name>
            <surname>Mao</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nishio</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Closet: An efficient algorithm for mining frequent closed itemsets</article-title>
          .
          <source>In: Proceedings of the ACM SIGMOD DMKD'00</source>
          , Dallas,TX. (
          <year>2002</year>
          )
          <fpage>21</fpage>
          -
          <lpage>30</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Data Mining : algorithmes d'extraction et de r´eduction des r`egles d'association dans les bases de donn´ees</article-title>
          . Doctorat d'universit´e, Universit´e de
          <string-name>
            <surname>Clermont-Ferrand</surname>
            <given-names>II</given-names>
          </string-name>
          , France (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Valtchev</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missaoui</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lebrun</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A fast algorithm for building the hasse diagram of a galois lattice</article-title>
          .
          <source>In: Proceedings of the Colloque LaCIM</source>
          <year>2000</year>
          ,Montr´eal (CA).
          <article-title>(</article-title>
          <year>2000</year>
          )
          <fpage>293</fpage>
          -
          <lpage>306</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obedkov</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Comparing performance of algorithms for generating concept lattices</article-title>
          .
          <source>JETAI</source>
          <volume>14</volume>
          (
          <year>2002</year>
          )
          <fpage>189</fpage>
          -
          <lpage>216</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Floc'h</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fisette</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missaoui</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valtchev</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Godin</surname>
          </string-name>
          , R.: JEN : un algorithme efficace de construction de g´
          <article-title>en´erateurs pour l'identification des r`egles d'association</article-title>
          . In: Num´ero sp´ecial de la revue des Nouvelles Technologies de l'Information, Vol.
          <volume>1</volume>
          No.
          <issue>1</issue>
          ,
          <string-name>
            <surname>Editions</surname>
            <given-names>C</given-names>
          </string-name>
          ´epadu`es. (
          <year>2003</year>
          )
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Nourine</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raynaud</surname>
            ,
            <given-names>O.:</given-names>
          </string-name>
          <article-title>A fast algorithm for building lattices</article-title>
          .
          <source>Information Processing Letters</source>
          (
          <year>1999</year>
          )
          <fpage>199</fpage>
          -
          <lpage>214</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguifo</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          :
          <article-title>Etude et conception d'algorithmes de g´en´eration de concepts formels</article-title>
          .
          <source>Revue Ing´enierie des Syst`emes d'Information</source>
          <volume>9</volume>
          (
          <year>2004</year>
          )
          <fpage>109</fpage>
          -
          <lpage>132</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Barbut</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monjardet</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Ordre et classification</article-title>
          .
          <source>Alg`ebre et Combinatoire. Hachette</source>
          ,
          <string-name>
            <surname>Tome II</surname>
          </string-name>
          (
          <year>1970</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Pfaltz</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
          </string-name>
          , C.M.:
          <article-title>Scientific discovery through iterative transformation of concept lattices</article-title>
          .
          <source>In: Proceedings of Workshop on Discrete Applied Mathematics in conjunction with the 2nd SIAM International Conference on Data Mining</source>
          , Arlington. (
          <year>2002</year>
          )
          <fpage>65</fpage>
          -
          <lpage>74</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stumme</surname>
          </string-name>
          , G.:
          <article-title>Mining minimal non-redundant association rules using frequent closed itemsets</article-title>
          .
          <source>In: Proceedings of the International Conference DOOD'2000, Lecture Notes in Computer Sciences</source>
          , Springer-Verlag.
          <article-title>(</article-title>
          <year>2000</year>
          )
          <fpage>972</fpage>
          -
          <lpage>986</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Guigues</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duquenne</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Familles minimales d'implications informatives r´esultant d'un tableau de donn´ees binaires</article-title>
          .
          <source>Math´ematiques et Sciences Humaines</source>
          (
          <year>1986</year>
          )
          <fpage>5</fpage>
          -
          <lpage>18</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>BenTekaya</surname>
            ,
            <given-names>S.</given-names>
            , BenYahia, S.
          </string-name>
          ,
          <string-name>
            <surname>Slimani</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Algorithme de construction d'un treillis des concepts formels et de d´etermination des g´en´erateurs minimaux</article-title>
          .
          <source>In: Proceedings of the 7th African Conference on Research in Computer Science</source>
          . (
          <year>2004</year>
          , Tunis)
          <fpage>247</fpage>
          -
          <lpage>254</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>