<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A newAinnfoerwmIantifvoermgeantievreicGbeanseeriocf Basassoeciation of Assocriualteiosn Rules</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gh. Gasmi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Ben Yahia</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>E. Mephu Nguifo</string-name>
          <email>mephu@cril.univ-artois.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Y. Slimani</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gh. Gasmi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Ben Yahia</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>E. Mephu Nguifo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Y. Slimani</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>D´epartment des Sciences de l'Informatique</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>dFeaTcuulnet´idses Sciences de Tunis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CCeennttrree ddee RReecchheerrcchhee eenn IInnffoorrmmaattiiqquuee ddee LLeennss--IIUUTT ddee LLeennss Rue de l'Universiet ́e ́SSPP1166,</institution>
          ,
          <addr-line>6622330077LLenenssCCededexex</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Campus Universitaire</institution>
          ,
          <addr-line>1060 Tunis, Tunisie</addr-line>
        </aff>
      </contrib-group>
      <fpage>67</fpage>
      <lpage>79</lpage>
      <abstract>
        <p>The problem of the relevance and the usefulness of extracted association rules is becoming of primary importance, since an overwhelming number of association rules may be derived from even reasonably sized real-life databases. In this paper, we introduce a novel generic base of association rules, based on the Galois connection semantics. The novel generic base is sound and informative. We also present a sound axiomatic system, allowing to derive all association rules that can be drawn from an extraction context.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data mining has been extensively addressed for the last years, particularly the
problem of discovering association rules. These latter aim at exhibiting
correlations between data items (or attributes), whose interestingness is assessed by
statistical metrics. However, an unexploited huge amount of association rules is
drawn from real-life databases. This drawback encouraged many research issues,
aiming at finding the minimal nucleus of relevant knowledge can be extracted
from several thousands of highly redundant rules. Various techniques are used to
limit the number of reported rules, starting by basic pruning techniques based on
thresholds for both the frequency of the represented pattern (called the support )
and the strength of the dependency between premise and conclusion (called the
confidence). More advanced techniques that produce only a limited number of
the entire set of rules rely on closures and Galois connections [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
        ]. These formal
concept analysis (FCA) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] based techniques have in common a feature, which is
to present a better trade-off between the size of the mining result and the
conveyed information than the ”frequent patterns” algorithms. Finally, works on
FCA have yielded a row of results on compact representations of closed set
families, also called bases, whose impact on association rule reduction is currently
under intensive investigation within the community [
        <xref ref-type="bibr" rid="ref1 ref2 ref5">1, 2, 5</xref>
        ].
      </p>
      <p>Once these generic bases are obtained, all the remaining (redundant) rules
can be derived ”easily”. In this context, little attention was paid to reasoning
from generic bases comparatively to the battery of papers to define them.
Essentially, they were interested in defining syntactic mechanisms for deriving rules
from generic bases.</p>
      <p>In this paper, we introduce a novel generic base of association rule, which is
sound and informative. The soundness property assesses the ”syntactic”
derivation, since it ensures that all association rules can be derived from the generic
base. The informativeness property ensures that the support and confidence of
a derivable rule can be exactly determined.</p>
      <p>The remainder of the paper is organized as follows. Section 2 introduces
the mathematical background of FCA and its connection with the derivation
of (non-redundant) association rule bases. Section 3 presents the related work
on defining and reasoning from generic bases of association rules. In section 4,
we introduce a novel, sound and informative generic base of association rules.
We also provide a set of inference axioms, for deriving association rules and we
we prove its soundness. Section 5 concludes this paper and points out future
research directions.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Mathematical background</title>
      <p>In the following, we recall some key results from the Galois lattice-based paradigm
in FCA and its applications to association rules mining.
2.1</p>
      <sec id="sec-2-1">
        <title>Basic notions</title>
        <p>
          In the rest of the paper, we shall use the theoretical framework presented in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
In this paragraph, we recall some basic constructions from this framework.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Formal context:</title>
        <p>A formal context is a triplet K = (O, A, R), where O represents a finite set
of objects (or transactions), A is a finite set of attributes and R is a binary
(incidence) relation (i.e., R ⊆ O × A). Each couple (o, a) ∈ R expresses that the
transaction o ∈ O contains the attribute a ∈ A. Within a context (c.f., Figure 1
on the left), objects are denoted by numbers and attributes by letters.</p>
        <p>
          We define two functions, summarizing links between subsets of objects and
subsets of attributes induced by R, that map sets of objects to sets of attributes
and vice versa. Thus, for a set O ⊆ O, we define φ(O) = {a | ∀o, o ∈ O ⇒ (o, a) ∈
R}; and for A ⊆ A, ψ(A) = {o | ∀a, a ∈ A ⇒ (o, a) ∈ R}. Both functions φ and
ψ form a Galois connection between the sets P( ) and P(O) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Consequently,
A
both compound operators of φ and ψ are closure operators, in particular ω = φ◦ψ
is a closure operator.
        </p>
        <p>In what follows, we introduce the frequent closed itemset 3, since we may
only look for itemsets that occur in a sufficient number of transactions.</p>
        <sec id="sec-2-2-1">
          <title>3 Itemset stands for a set of items</title>
          <p>Frequent closed itemset: An itemset A ⊆ A is said to be closed if A =
ω(A), and is said to be frequent with respect to minsup threshold if supp(A)=
|ψ(A)| ≥ minsup.</p>
          <p>|O|</p>
          <p>Formal Concept: A formal concept is a pair c = (O, A), where O is called
extent, and A is a closed itemset, called intent. Furthermore, both O and A are
related through the Galois connection, i.e., φ(O) = A and ψ(A) = O.</p>
          <p>
            Minimal generator: An itemset g ⊆ A is called minimal generator of a
closed itemset A, if and only if ω(g) = A and @g0 ⊆ g such that ω(g0) = A [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ].
The closure operator ω induces an equivalence relation on items power set, i.e.,
the power set of items is partionned into disjoint subsets (also called classes). In
each distinct class, all elements are equal support value. The minimal generator
is the smallest element in this subset, while the closed itemset is the largest one.
Figure 1(Right) sketches sample classes of the induced equivalence relation from
the context K.
          </p>
          <p>
            Galois lattice: Given a formal context K, the set of formal concepts CK
is a complete lattice Lc = (C, ≤), called the Galois (concept) lattice, when CK
is considered with inclusion between itemsets [
            <xref ref-type="bibr" rid="ref4 ref6">4, 6</xref>
            ]. A partial order on formal
concepts is defined as follows ∀ c1, c2 ∈ CK, c1 ≤ c2 iif intent(c2) ⊆ intent(c1),
or equivalently extent(c1) ⊆ extent(c2).
          </p>
          <p>The partial order is used to generate the lattice graph, called Hasse diagram,
in the following manner: there is an arc (c1, c2), if c1 ¹ c2 where ¹ is the
transitive reduction of ≤, i.e., ∀c3 ∈ CK, c1 ≤ c3 ≤ c2 implies either c1 = c3 or
c2 = c3.</p>
          <p>Iceberg Galois lattice: When only frequent closed itemsets are considered
with set inclusion, the resulting structure (Lˆ, ⊆) only preserves the LUBs, i.e.,
the joint operator. This is called a join semi-lattice or upper semi-lattice. In the
remaining of the paper, such structure is referred to as ”Iceberg Galois Lattice”.
Example 1. Let us consider the extraction context given by Figure 1 (Left). The
associated Iceberg Galois lattice, for minsup=2, is depicted by Figure 1(Bottom)4.
Each node in the Iceberg is represented as couple (closed itemset; support) and
is decorated with its associated minimal generators list.</p>
          <p>In the following, we present the general framework for the derivation of
association rules, then we establish its important connexion with the FCA framework.
2.2</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Derivation of association rules</title>
        <p>Let I = {i1, i2, . . . , im} be a set of m distinct items. A transaction T , with
an identifier further called TID, contains a set of items in I. A subset X of I
where k = |X| is referred to as a k−itemset (or simply an itemset), and k is
called the length of X. A transaction database, say D, is a set of transactions,
which can be easily transformed in an extraction context K. The number of
transactions of D containing the itemset X is called the support of X, i.e.,
4 We use a separator-free form for sets, e.g., AB stands for {A, B}.</p>
        <p>A B C D E
1 × × ×
2 × × ×
3 × × × ×
4 × ×
5 × × × ×</p>
        <p>AB</p>
        <p>AC</p>
        <p>AD</p>
        <p>AE</p>
        <p>BC BD BE CD CE</p>
        <p>DE
A</p>
        <p>B</p>
        <p>C</p>
        <p>D</p>
        <p>E
{A}
(AC;3)
{C}
(C;4)
{AE}, {AB}
(ABCE;2)</p>
        <p>{BC},{CE}
(BCE;3)
{B}, {E}
(BE;4)
(∅;5)
∅
supp(X)=| {T ∈ D | X ⊆ T } |. An itemset X is said to be frequent when
support(X) reaches at least the user-specified minimum threshold minsup.</p>
        <p>Association rule derivation is achieved from a set F of frequent itemsets in
an extraction context K, for a minimal support minsup. An association rule
R is a relation between itemsets of the form R : X ⇒ (Y − X), in which X
and Y are frequent itemsets, and X ⊂ Y . Itemsets X and (Y − X) are called,
respectively, premise and conclusion of the rule R. The valid association rules
are those whose the strength metric conf(R)= |XY | is greater than or equal to
|X|
the minimal threshold of confidence minconf. If conf(R)=1 then R is called exact
association rule (ER), otherwise it is called approximative association rule (AR).
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Related work on generic bases of association rules</title>
      <p>
        The problem of the relevance and usefulness of extracted association rules is
of primary importance. Indeed, in most real life databases, thousands and even
millions of high-confidence rules are generated among which many are
redundant. This problem encouraged the development of tools for rule classification
according to their properties, for rule selection according to user-defined criteria,
and for rule visualization. In this paper, we are specially interested in the lossless
information reduction, which is based on the extraction of a generic subset of
all association rules, called base, from which the remaining (redundant)
association rules may be derived. The concept of rule redundancy can be considered
as follows [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]:
Definitio.nA1.ruLleetRA: RXK⇒cbYe t5he set of association rules that can be drawn from a
context K ∈ ARK is considered as redundant (or derivable)
with respect to (from) R1: X1 ⇒cY1 if R fulfils the following conditions:
1. Supp(R)=Supp(R1) and conf(R)=conf(R1)=c
2. (X1 ⊂ X and Y ⊂ Y1) or (X1 = X and Y ⊂ Y1)
Exact association rules, of the form R : X ⇒ (Y ), are implications between two
itemsets X and XY whose closures are identical, i.e., ω(X) = ω(XY ). Indeed,
from the aforementioned equality, we deduce that X and XY belong to the same
class of the equivalence relation induced by the closure operator ω on P(A) and
then supp(X)= supp(XY) (i.e., conf(R) = 1).
Definition 2. Let FC be the set of frequent closed itemsets extracted from the
context and, for each frequent closed itemset c, let us denote Gc the set of minimal
generators of c. The generic basis for exact association rules, called GBE , is
defined as follows:
GBE ={R : g ⇒ (c − g) | c ∈ F CI and g ∈ Gc and g 6= c}.
      </p>
      <p>
        The authors also characterized a generic base of approximate association
rules, based on extending the partial implications base of Luxenburger [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The
GBA base is defined as follows :
Definition 3. The generic base of approximate association rules given by : GBA
= {R| R : X ⇒ Y, Y ∈ F CI and ω(X) ≤ Y and c=conf(R) ≥ minconf and
supp(Y) ≥ minsup}.
      </p>
      <p>
        As pointed out in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the couple (GBE ,GBA) of generic bases form a sound
and informative generic base. With respect to Definitions 2 and 3, we consider
that given an Iceberg Galois lattice, representing precedence-relation-based
ordered closed itemsets, generic bases of association rules can be derived in a
straightforward manner. We assume that in such structure, each closed itemset
is ”decorated” with its associated list of minimal generators. Hence, AR
represent ”inter-node” implications, assorted with a statistical information, i.e., the
confidence, from a sub-closed-itemset to a super-closed-itemset while starting
from a given node in an ordered structure. Inversely, ER are ”intra-node”
implications extracted from each node in the ordered structure. For example, Generic
bases of exact and approximative association rules that can be drawn from the
context K, are respectively depicted in Figure 2 (Left and Right).
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the authors presented sound inference axioms for both (GBE and GBA)
bases, permitting to derive all association rules from generic bases of association
rules.
5 When conf(R: X⇒c Y) =1, then c is omitted, i.e., R is written as R: X⇒Y.
Rule # Implication
R1 E⇒B
R2 B⇒E
R3 A⇒C
R4 BC⇒E
R5 CE⇒B
R6 AB⇒CE
R7 AE⇒BC
      </p>
      <p>Rule # Implication Rule # Implication
R8
R9
R10
R11
R12</p>
      <p>C.⇒75A
C.⇒75BE
C ⇒.5 ABE
A.⇒66BCE
E.⇒75BC</p>
      <p>R13
R14
R15
R16
R17</p>
      <p>
        B.⇒75CE
E ⇒.5 ABC
B ⇒.5 ACE
BC.⇒66AE
CE.⇒66AB
In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], the author presented a definition of a generic base and the
associated derivation axioms. The author broke the ”monopoly” of the FCA
basedsemantics related work for deriving generic bases. Indeed, the presented approach
to derive the generic base takes as input the set of frequent itemsets (and not
closed) yield for example by the Apriori algorithm. Another peculiarity of the
proposed generic base is that it is composed of association rules, in which the
respective premise and conclusion parts are not necessarily disjoint.
      </p>
      <p>In what follows, we present the GBP han algorithm, whose pseudo-code is
given by Algorithm 1.1, permitting to derive the GBP han generic base.
Example 2. Let us consider the extraction context given by Figure 1 (Left).
Then, in what follows the running process of the GBP han algorithm for minsup
= 52 and minconf = 21 . The set R is initialized with the set of frequent itemsets in
a decreasing order, i.e., R= {ABCE, ABC, ABE, ACE, BCE, AB, AC, AE, BC,
BE, CE, A, B, C, E}. For J = ABCE, we have supp(ABCE) = 25 ≤ minconf.
Then, we have : LABCE = A, B, C, E, AB, AC, AE, BC, BE, CE, ABC, ABE,
ACE, BCE. For I = A, we have |ABCE| = 32 ≥ minconf. In this case, the generic
|A|
rule A⇒ ABCE is added to the generic base and from LABCE we delete all
element including A, i.e., AB, AC, AE, ABC, ABE, ACE, and A. Therefore, the
set LABCE equals {B, C, E, BC, CE, BCE}. Next for I= B, we have |ABCE| = 1 ,
|B| 2
we have to generate the generic rule B⇒ ABCE and to delete BC, C and BCE
from LABCE. Therefore, the set LABCE is equal to {C, E, CE}. For I= C, we have
supp(ABCE) = 21 , then we generate the generic rule C⇒ ABCE and we delete CE
supp(C)
and C from LABCE. Finally for I= E, we have |ABCE| = 1 . Hence, we generate
|E| 2
the generic rule E⇒ABCE and E is removed from LABCE. Next, we have to
remove from the set R, the following itemsets ABC, ABE, ACE, AB and AE.
The set R is found to be equal to {BCE, AC, BC, BE, CE, A, B, C, E}. Then,
for J= BCE, we have |BCE|= 53 ≥ minconf. In this case, we have to generate
the generic rule ∅ ⇒BCE and we delete BCE, BC, BE, CE, B, C and E from R.
Therefore, R= {AC, A} and for J= AC we have |AC = 35 ≥ minconf. Then, we
|
have to generate the generic rule ∅ ⇒ AC. After the removal of AC and A from
Algorithm 1.1: GBP han
Data:
– R= {frequent itemsets in a decreasing order}
– minsup
– minconf
Result: GBP han: generic base
begin
foreach J ∈ R of size i from m down to 1| m= |largest frequent itemset| do
if support(J) ≥ minconf then</p>
      <p>GBP han=GBP han∪ {∅ ⇒J }</p>
      <p>R=R-{J 0|J 0 ⊆ J }
else</p>
      <p>LJ = {all nonempty subset of J in an ascendant order }
foreach I ∈ LJ of size k from 1 to i-1 do
if @ I0 ⇒ J 0 ∈ GBP han | I0 ⊆ I ∧ and J ⊆ J 0 ∧ ||JI|| ≥ minconf
then</p>
      <p>GBP han=GBP han∪ {I ⇒ J }</p>
      <p>LJ =LJ -{ I0| I ⊂ I0 ⊂ J }</p>
      <p>LJ =LJ - I</p>
      <p>R= R- {J0 ⊆ J | |J0|=|J|}
end
R, the set R is found to be empty and the algorithm terminates. The resulting
generic base is depicted by Table 1.</p>
      <p>Implication Support Confidence</p>
      <p>ACBE∅∅⇒⇒⇒⇒⇒⇒AAAABBABBBCCCCCCEEEEE 545245525522 323521521312</p>
      <p>Table 1. Generic Base GBP han.</p>
      <p>For the derivation of association rules, the author presented a sound
axiomatic system composed of the following two axioms:
A1:Left augmentation If I⇒ JK ∈ GBP han, then IJ⇒JK is a derivable valid
rule.</p>
      <p>A2:Decomposition If I ⇒ JK∈ GBP han, then I⇒J and I⇒ K are derivable
valid rules.
For example, from ∅ ⇒AC and using the above mentioned axioms, it is possible
to derive A⇒ AC, C⇒ AC and A⇒ C.</p>
      <p>However, two drawbacks may be addressed. At first, only frequent itemsets
are used to define the generic base, and one can imagine the length and the
number of such frequent itemsets that could be derived from correlated extraction
contexts. Second, the presented generic base is not informative, i.e., it may exist
a derivable rule for which it is impossible to determine exactly both its support
and confidence. For example, the association rule BE⇒C is derivable from the
generic rule E⇒ABCE. However, it is impossible to derive the exact confidence
and support of the derivable rule from the GBP han generic base.
3.3</p>
      <sec id="sec-3-1">
        <title>Work of Kryszkiewicz</title>
        <p>
          In [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], the author introduced another syntactic derivation operator, called the
cover, defined as follows:
        </p>
        <p>Cover(X ⇒Y) = {X ∪ Z ⇒ V | Z, V ⊆ Y ∧ Z ∩ V = ∅ ∧ V 6= ∅}.</p>
        <p>The number of the derived rules from a rule R: X ⇒ Y is equal to
|Cover(R)|=3m2m, where |Y|=m. For a derived rule R’, we have conf(R’)≥ max { conf(Ri)| R’∈
Cover(Ri)} and supp(R’)≥ max { supp(Ri)| R’∈ Cover(Ri)}. In order to
determine whether a rule R’:X’⇒Y’ belongs to the cover of a rule R:X⇒Y, we have
to check that X ⊆ X0 and Y 0 ⊂ Y .</p>
        <p>The author proposed a minimal base of rules called representative association
rules (RR) defined as follows: RR={R ∈ AR | @ R’∈ AR, R6=R’and R ∈ C(R’)}
where AR is the set of all valid association rules.</p>
        <p>
          As pointed out in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], using the cover operator as a rule inference mechanism,
RR is lossless, sound but it is not informative. However, the author claimed that
the couple (GBE ,GBA) forms a lossless, sound and informative generic base, by
using the cover operator as inference mechanism.
        </p>
        <p>On the other hand, the author showed that given the following bases:
– The DG Duquenne-Guigues basis for exact rules, i.e.,</p>
        <p>DG={R:X⇒ω(X)-X |X ∈F P}, where F P the pseudo-closed itemsets.
– the PB Proper basis for approximative rules, i.e.,</p>
        <p>
          PB={R:X⇒Y-X |X,Y ∈F CI and X ⊂ Y and conf(R)≤ minconf}
then the couple(PB,DG) forms a lossless, sound and informative base, by
using the conjunction of closure-closure inference rule [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and Armstong’s
axioms [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] as inference mechanism. Clearly, the drawback of the solution proposed
is the considerable increase in the size of base in order to be sound and
informative.
4
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>New generic base</title>
      <p>Intuitively, we are looking for defining a new informative generic base
providing the ”maximal boundaries” in which stand all the derivable rules. Indeed,
a derivable rule cannot present a smaller premise than that of its associated
generic rule, i.e., from which it can be derived. On the other hand, a derivable
rule cannot present a larger conclusion than that of its associated generic rule.
In what follows, we present the definition of the introduced IGB generic base.
Definition 4. The generic base IGB of association rules given by : IGB = {R
: gs ⇒ A-gs | A ∈ F CI ∧ ω(gs) ≤ A ∧ conf(R) ≥ minconf ∧ @ g0 such that gs
⊂ g0 and conf(g0 ⇒ A-g0)≥minconf }.</p>
      <p>Proposition 1. Let I a frequent closet itemset. If |I| ≥ minconf, then the generic
rule that can be drawn from I is ∅ ⇒I.</p>
      <p>Proof. since conf(∅ ⇒I)=supp(I), then the generic rule ∅ ⇒I is valid. It is also
the largest rule that can be drawn from the frequent closed itemset I. Indeed, @
R1: X1 ⇒ Y1 such that X1 ⊂ ∅ and I ⊆Y1.
4.1</p>
      <p>The IGB generic base construction
In what follows, we present the Igb algorithm, whose pseudo-code is given by
Algorithm1.2, permitting to construct the introduced generic base of association
rules. The Igb algorithm is based on Proposition 1. So, it considers the set of
frequent closed itemsets F CI. For each closed itemset I, it checks whether its
support is greater than or equal to minconf. If it is the case, then we generate the
generic rule ∅ ⇒I. Otherwise, it has to look for the smallest minimal generator,
say gs, associated to a frequent closed itemset subsumed by I, and generates the
generic rule gs ⇒I-gs.</p>
      <p>Example 3. Let us consider the extraction context given by Figure 1 (Left).
Then, in what follows the running process of the Igb algorithm for minsup = 2
5
and minconf = 12 . For I = ABCE, we have |ABCE|= 25 &lt; 12 , LABCE = {C, A,
B, E, BC, CE, AE, AB}. Since |ABCE| = 12 , we have to generate the generic rule
|C|
C⇒ABE and to remove BC, CE and C from LABCE . Next, we have |ABCE|
|A|
= 23 &gt; 12 . Then, we generate A⇒ BCE and AE, AB and A are removed from
LABCE . Then, we have |ABCE| = 1 . So, we generate the generic rule B⇒ACE
|B| 2
and we delete B from LABCE . From, the evaluation of |ABCE| = 1 , we have to
|E| 2
generate E⇒ ABC, and to remove E from LABCE , which is found to be empty.</p>
      <p>For I = BCE we have |BCE| ≥ minconf, then we generate the generic rule
∅ ⇒BCE. Next, we have to apply the same process, respectively, for I = AC, I
= BE and I = C. The resulting generic base is depicted by Table 2.</p>
      <p>Obviously, the introduced IGB generic base is largely more compact than
that proposed by Bastide et al. (8 generic rules vs 17). Comparatively to that
proposed by Phan, the IGB generic base is not more compact but it presents the
informative property. The Igb algorithm is currently under implementation, and
we have to assess its compactness output versus those presented in the literature
survey.
Algorithm 1.2: Igb
Data:</p>
      <sec id="sec-4-1">
        <title>Result: IGB:Informative generic base</title>
        <p>begin
foreach frequent closed itemset I ∈ F CI / I6= ∅ do
if (|I| ≥ minconf ) then</p>
        <p>R= ∅ ⇒ I
R.supp=|I|
R.conf=|I|</p>
        <p>IGB=IGB ∪ R
else</p>
      </sec>
      <sec id="sec-4-2">
        <title>1. F CI: set of frequent closed itemsets and their associated supports.</title>
        <p>2. minconf</p>
        <p>LI1 ={non empty minimal generators of frequent closed itemset I1 |
I1 ⊆ I}
foreach minimal generator g ∈ LI1 do
if ( ||gI|| ≥ minconf ∧ I 6= g) then</p>
        <p>R= g ⇒I-g
R.supp=|I|
R.conf= |I|</p>
        <p>|g|
IGB=IGB ∪ R</p>
        <p>LI1 =LI1 -{g0| g ⊂ g0}</p>
        <p>LI1 =LI1 -g
end
4.2</p>
        <sec id="sec-4-2-1">
          <title>Association rule derivation</title>
          <p>ARK/ Z ⊂ Y ∧ c0= |X|XZ|| .</p>
          <p>c0= | X|YZ| | .</p>
          <p>In this subsection, we will try to axiomatize the derivation of association rules
from the the IGB generic base. In the following, we provide a set of inference
axioms and we prove their soundness. Then, we show that the IGB generic base
is informative.</p>
          <p>Proposition 2. Let IGBK and ARK be a generic base and the set of valid
association rules, respectively, that can be drawn from a context K. Then, the
following inference axioms are sound:
A0.Conditional Reflexivity: If X ⇒c Y ∈ IGBK ∧ X 6= ∅ then X ⇒cY ∈ ARK
A1.Conditional Decomposition: If X ⇒cY ∈ IGBK ∧ X 6= ∅ then X ⇒c0 Z ∈
A2.Augmentation If X ⇒cY ∈ IGBK then X ∪ Z ⇒c0 Y-{Z} ∈ ARK /Z ⊂ Y ∧
Proof. A0.Conditional Reflexivity: follows from the proper definition of the
IGB generic base. The condition X 6= ∅ ensures the non emptiness of the
derived rule.
Implication Support Confidence
∅∅∅∅EBCA⇒⇒⇒⇒⇒⇒⇒⇒AAABBBCABCCBECCCEEEE 5235355225454525 1223512335541254</p>
          <p>Table 2. The IGB generic base.</p>
          <p>A1.Conditional Decomposition: Since, X ⇒cY ∈ IGBK then conf(X ⇒cY)=c
≥ minconf. Since Z ⊂ XY, we have |XZ|≥ |XY| and then |X|XZ|| ≥ |X|XY|| ≥
minconf. Hence, X ⇒c0 Z is a valid rule with a confidence value c0= |XZ| .The
|X|
condition X 6= ∅ ensures the non emptiness of the derived rule.</p>
          <p>A2.Augmentation Since, X ⇒cY ∈ IGBK then conf(X ⇒cY)=c ⇔
|XY | =c ≥
|X|
minconf. From X ⊂ XZ, we have |X| &gt;|XZ| &gt; and minconf &lt; |XY | &lt; |X|XYZZ|| .
|X|</p>
          <p>Hence, X ∪ Z ⇒c0 Y-{Z} is a valid rule, with a confidence value c0= | X|YZ| | .
Proposition 3. The IGB generic base is informative.</p>
          <p>Proof. To prove that the IGB generic base is informative, it is sufficient to show
that it contains all the necessary information to determine the support of an
itemset in a derived rule. Therefore, it means that we have to be able to
reconstitute all closed itemset by concatenation of the premise and the conclusion of
a generic rule 6. The algorithm considers all the discovered frequent closed
itemsets. Hence for a given frequent closed itemset, say c, it tries to find the smallest
minimal generator, say gs, associated to frequent closed itemsets subsumed by
c and fulfilling the minsup constraint. Therefore, the algorithm generates the
following generic rule gs ⇒c. Since gs ⊂ c, then gs ∪ c=c. Then by construction,
all frequent closed itemsets can be reconstituted from the IGB generic base.</p>
          <p>Conclusion: The IGB generic base is informative.</p>
          <p>In what follows, we present the AR-deriv algorithm, whose pseudo-code is given
by Algorithm1.3, permitting to derive all valid association rules from the IGB
generic base.
5</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper, we presented a critical survey of the reported approaches for
defining generic bases of association rules. Then, we introduced a novel generic</p>
      <sec id="sec-5-1">
        <title>6 It is known that the support of an itemset is equal to the support of the smallest</title>
        <p>closed itemset containing it.
Algorithm 1.3: A-r-deriv</p>
      </sec>
      <sec id="sec-5-2">
        <title>Data: IGBK: Informative generic base</title>
        <p>Result: ARK:set of valid association rules
begin
foreach R:X⇒c Y ∈ IGBK do
/* Applying Reflexivity axiom*/
if X 6= ∅ then ARK=ARK ∪ X⇒c Y
if |Y| &gt; 1 then
foreach Z | Z⊂ Y do
/*Applying Decomposition axiom*/
R’=X⇒Z
s=Get-smallest-support(XZ, IGBK)
/* the Get-smallest-support function yields the support value of the
smallest closed itemset containing XZ*/
c’= |cX×Ys|
ARK=ARK ∪ X ⇒c0 Z
/*Applying Augmentation axiom*/
R’=XZ⇒Y-{Z}
s=Get-smallest-support(XZ,IGBK)
c’= |XY |</p>
        <p>s</p>
        <p>ARK=ARK ∪ XZ ⇒c0 Y − {Z}
end
base, which is sound and informative. We also provided a set of sound inference
axioms for deriving all association rules from the introduced generic base of
association rules. The reported algorithms are currently under implementation
in order to include them in a Information Retrieval prototype. Specially, we are
interested in assessing the well-known IR metrics, namely recall and precision,
by using the introduced generic rule base in a query expansion process.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stumme</surname>
          </string-name>
          , G.:
          <article-title>Mining minimal non-redundant association rules using frequent closed itemsets</article-title>
          .
          <source>In: Proceedings of the Intl. Conference DOOD'</source>
          <year>2000</year>
          , LNCS, Springer-verlag. (
          <year>2000</year>
          )
          <fpage>972</fpage>
          -
          <lpage>986</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Stumme</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Intelligent structuring and reducing of association rules with formal concept analysis</article-title>
          .
          <source>In: Proc. KI'2001 conference, LNAI 2174</source>
          , Springer-verlag. (
          <year>2001</year>
          )
          <fpage>335</fpage>
          -
          <lpage>350</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Zaki</surname>
            ,
            <given-names>M.J.:</given-names>
          </string-name>
          <article-title>Generating non-redundant association rules</article-title>
          .
          <source>In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          ,Boston, MA. (
          <year>2000</year>
          )
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal Concept Analysis</source>
          . Springer-Verlag (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>BenYahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cherif</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mineau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaoua</surname>
          </string-name>
          , A.:
          <article-title>D´ecouverte des r`egles associatives non redondantes : application aux corpus textuels. Revue d'Intelligence Artificielle (special issue of Intl. Conference of Journ´ees francophones d'Extraction et Gestion des Connaissances(EGC'</article-title>
          <year>2003</year>
          ), Lyon, France
          <volume>17</volume>
          (
          <year>2003</year>
          )
          <fpage>131</fpage>
          -
          <lpage>143</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Barbut</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monjardet</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Ordre et classification</article-title>
          .
          <source>Alg`ebre et Combinatoire. Hachette</source>
          ,
          <string-name>
            <surname>Tome II</surname>
          </string-name>
          (
          <year>1970</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>BenYahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguifo</surname>
            ,
            <given-names>E.M.:</given-names>
          </string-name>
          <article-title>Revisiting generic bases of association rules</article-title>
          .
          <source>In: Proceedings of 6th International Conference on Data Warehousing and Knowledge Discovery (DaWaK</source>
          <year>2004</year>
          ) (Springer-Verlag) to appear, Zaragoza,
          <string-name>
            <surname>Spain.</surname>
          </string-name>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Guigues</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duquenne</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Familles minimales d'implications informatives r´esultant d'un tableau de donn´ees binaires</article-title>
          .
          <source>Math´ematiques et Sciences Humaines</source>
          (
          <year>1986</year>
          )
          <fpage>5</fpage>
          -
          <lpage>18</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Luxenburger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Implication partielles dans un contexte</article-title>
          .
          <source>Math´ematiques et Sciences Humaines</source>
          <volume>29</volume>
          (
          <year>1991</year>
          )
          <fpage>35</fpage>
          -
          <lpage>55</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kryszkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Concise representations of association rules</article-title>
          . In Hand, D.J.,
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bolton</surname>
          </string-name>
          , R., eds.
          <source>: Proceedings of Pattern Detection and Discovery</source>
          , ESF Exploratory Workshop, London, UK. Volume
          <volume>2447</volume>
          of Lecture Notes in Computer Science., Springer (
          <year>2002</year>
          )
          <fpage>92</fpage>
          -
          <lpage>109</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Luong</surname>
            ,
            <given-names>V.P.</given-names>
          </string-name>
          :
          <article-title>Raisonnement sur les r`egles d'association</article-title>
          .
          <source>In: Proceedings</source>
          <volume>17</volume>
          `eme Journ´ees Bases de Donn´ees Avanc´ees BDA'
          <year>2001</year>
          ,
          <string-name>
            <surname>Agadir</surname>
          </string-name>
          (Maroc),
          <source>C´epadu`es Edition</source>
          . (
          <year>2001</year>
          )
          <fpage>299</fpage>
          -
          <lpage>310</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Armstrong</surname>
          </string-name>
          , W.:
          <article-title>Dependency structures of database relationships</article-title>
          .
          <source>In: IFIP Congress</source>
          .
          <article-title>(</article-title>
          <year>1974</year>
          )
          <fpage>580</fpage>
          -
          <lpage>583</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>