=Paper=
{{Paper
|id=Vol-162/paper-16
|storemode=property
|title=VIE_MGB: A Visual Interactive Exploration of Minimal Generic Basis of Association Rules
|pdfUrl=https://ceur-ws.org/Vol-162/paper16.pdf
|volume=Vol-162
|dblpUrl=https://dblp.org/rec/conf/cla/LatiriBYG05
}}
==VIE_MGB: A Visual Interactive Exploration of Minimal Generic Basis of Association Rules==
VIE
VIE MGB:
MGB: AA Visual
Visual Interactive
Interactive Exploration
Exploration of
of
Minimal Generic Basis of Assopciation Rules
Chiraz Latiri Cherif1 , Wissem Bellegha1 , Sadok Ben Yahia2 , and
1
Chiraz LatiriGhada
CherifGuesmi
, Wissem
2 Bellegha1
Sadok Ben Yahia and Ghada Guesmi2
2
1
1 École Supérieure de Commerce de Tunis,
École Supérieure de Commerce de Tunis,
Computer
Computer Sciences
Sciences Department,
Department,
Campus Universitaire de La Manouba,
2010, Tunisia
chiraz.latiri@gnet.tn
chiraz.latiri@gnet.tn
wissembellegha@gmail.com
2 wissembellegha@gmail.com
2
Faculty of Sciences of Tunis, Computer Sciences Department
Faculty of Sciences of Tunis,
Research Computer
Team URPAH Sciences Department
Research
Campus Universitaire ELTeam URPAH
Manar, 1060 Tunis, Tunisia
sadok.benyahia@fst.rnu.tn
Campus Universitaire EL Manar, 1060 Tunis, Tunisia
ghada gasmi@yahoo.fr
sadok.benyahia@fst.rnu.tn
ghada gasmi@yahoo.fr
Abstract. Mining association rules is an important task, even though
the number of rules discovered is often huge. A possible solution to this
problem, is to use the Formal Concept Analysis (FCA) mathematical
settings to restrict rules extraction to a generic basis of association rules.
This one is considered as a reduced set to which we can apply appropriate
inference mechanisms to derive redundant rules. In this paper, we intro-
duce a new minimal generic basis MGB of non-redundant association
rules based on the augmented Iceberg Galois lattice. The proposed ap-
proach involves the inference mechanisms used and a set of experiments
applied to several real and synthetic databases. Carried out experiments
showed important benefits in terms of reduction in the number of generic
rules extracted. We present also a new framework for generating and vi-
sually exploring the minimal generic basis MGB.
1 Introduction
Discovering association rules in large datasets is an interesting problem of re-
search in data mining. Its principal objective is to find out correlations between
frequents itemsets. An association rule is a strong one when its support (i.e.,
frequency of the represented pattern) and confidence (i.e., strength of the de-
pendency between the premise and the conclusion) are higher than minimum
Radim Bělohlávek, Václav Snášel (Eds.): CLA 2005, pp. 179–196, ISBN 80–248–0863–3.
180 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
thresholds fixed by the user (minsup and minconf ). However the number of
association rules generated is often huge because many of them are redundant.
So, a possible solution to this problem, is to restrict rules extraction to a reduced
set containing only non redundant ones which are strictly related to the user’s
need [1].
This reduced set is known as generic basis for association rules, to which
we can apply appropriate inference mechanisms to derive all strong redundant
association rules. A generic basis can be evaluated by three properties such as
informativity, compacity and inference mechanism which will be detailed later.
This paper is organized as follows. Section 2 presents the mathematical back-
ground of Formal Concept Analysis (FCA) [2] and describes the association rules
mining task. Section 3 deals with the problem of eliminating redundant rules by
surveying some of the proposed approaches for constructing generic bases of
association rules. In section 4, we present a new minimal generic basis of asso-
ciation rules denoted as MGB. Section 5 consolidates the proposed approach by
a set of experiments applied to several real synthetic databases. Finally, we will
present VIE MGB as a new framework for visually generating and exploring
MGB and the Iceberg Galois lattice. We were inspired by MIRAGE [3] which
allows interactive graphical visualization and exploration of minimal association
rules.
2 Mathematical background
In this section, we introduce the mathematical notions based on the FCA.
2.1 Basic notions
Formal context: A formal context is a triplet K = (O, A, R), where O rep-
resents a finite set of objects, A a finite set of attributes, and R a binary re-
lation(i.e., R ⊆ O × A). Each couple (o, a) ∈ R expresses that the transaction
o ∈ O contains the attribute a ∈ A.
We define two functions, summarizing links between subsets of objects and
subsets of attributes induced by R, that map sets of objects to sets of attributes
and vice versa. Thus, for a set O ⊆ O, we define
φ(O) = {a | ∀o, o ∈ O =⇒ (o, a) ∈ R} ; and for A ⊆ A, ψ(A) = {o | ∀a, a ∈
A =⇒ (o, a) ∈ R}. Both functions φ and ψ form a Galois connection between
the sets P(A) and P(O) [4]. Consequently, both compound operators of φ and
ψ are closure operators, in particular, ω = φ ◦ ψ is a closure operator.
VIE MGB: A Visual Interactive Exploration 181
Frequent closed itemset: An itemset A ⊆ A is said to be closed if
A = ω(A), and is said to be frequent with respect to minsup threshold if
support(A) = |ψ(A)| |O| ≥ minsup.
Formal concept: A formal concept is a couple c = (O; A), where O is called
extent, and A is a closed itemset, called intent. Furthermore, both O and A are
related through the Galois connection, i.e., φ(O) = A and ψ(A) = O.
Minimal generator: An itemset g ⊆ A is called minimal generator of a
closed itemset A, if and only if ω(g) = A and g ⊆ g such that ω(g) = A [5].
Galois lattice: Given a formal context K, the set of formal concepts CK
is a complete lattice Lc = (C, ≤), called the Galois (concept) lattice, when CK
is considered with inclusion between itemsets [2] [4]. A partial order on formal
concepts is defined as follows ∀c1 , c2 ∈ CK , c1 ≤ c2 iff intent(c2 ) ⊆ intent(c1 ),
or equivalently extent(c1 ) ⊆ extent(c2 ). The partial order is used to generate
the lattice graph, called Hasse diagram, in the following manner: there is an
arc (c1 , c2 ), if c1 c2 where is the transitive reduction of ≤, i.e., ∀c3 ∈ CK ,
c1 ≤ c3 ≤ c2 implies either c1 = c3 or c2 = c3 .
Property 1 Lattice operator join provides the least upper bound (LUB) in the
concept lattice, and it is defined as follows:
Let S ⊆ Lc
Join(S) = ω ∪ c
c∈S
Iceberg Galois lattice: When only frequent closed itemsets are considered
with set inclusion, the resulting structure (Lc , ⊆) only preserves the LUBS , i.e.,
the join operator. This is called a join semi-lattice or upper semi-lattice. In the
remaining of the paper, such structure is referred to as ”Iceberg Galois Lattice”.
Example 1 Let us consider the extraction context given by table 1. The asso-
ciated Iceberg Galois lattice, for minsup = 12 , is depicted by figure 1. Each node
in the Iceberg is represented as couple (closed itemset,support) and is decorated
with its associated minimal generators list.
2.2 Association rules
An association rule represents an implication between frequent itemsets. It is
an expression of the form: R : X =⇒ Y [6], in which X and XY are frequent
itemsets, and X ⊂ XY . Itemsets X and Y are called, respectively, premise and
conclusion of the rule R. The support of R denoted supp(R), is equal to the
support of XY , and its confidence denoted by conf (R) is determined as follows:
182 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
RACDTW
1 ×× × ×
2 ×× ×
3 ×× × ×
4 ××× ×
5 ×××× ×
6 ×××
Table 1. Extraction context k
Fig. 1. Iceberg Galois lattice for minsup = 12
conf (R) = supp(XY )
supp(X) . If conf (R) = 1, then R is an exact association rule
(ER), otherwise it is called approximate association rule (AR).
The process of association rules extraction can be split into two steps [6]:
1. Finding out all frequent itemsets from a given context extraction k for a
specified minsup value.
2. Generating all association rules from the frequent itemsets obtained and
restricting extraction to rules which confidence satisfies the minconf value.
3 Extraction of non redundant association rules
The problem of the relevance and usefulness of extracted association rules is of
primary importance. Indeed, in most real life databases, the set of association
rules can rapidly grow to be unwieldy, especially as we lower the minsup value,
and among which many are redundant. So, a possible solution to this problem
VIE MGB: A Visual Interactive Exploration 183
is to restrict extraction of rules to a generic basis of non redundant association
rules. Once this generic basis is obtained, all the remaining (redundant) rules
can be derived using an appropriate inference mechanism. In an ideal case a
generic basis should satisfy the following conditions [7]:
- Informativity: to exactly determine the support and confidence of redundant
association rules.
- Deriving redundant rules: the derivation axioms should be lossless (should
enable derivation of all strong rules) and sound (should forbid derivation of
rules that are not strong).
- Compacity: to have the most reduced set of generic rules allowing the deriva-
tion of all strong remaining ones(redundant rules).
In this section, we will overview some representations of strong association
rules.
3.1 Minimal non redundant association rules (MNR)
In [5], Bastide et al. present a couple of generic bases, one for the exact rules
denoted GB, and the second for approximate ones denoted IB.
Bastide et al. consider the following rule-redundancy definition:
Definition 1 An association rule R : X =⇒ Y is a minimal non redundant
association rule iff an association rule R1 : X1 =⇒ Y1 fulfilling the following
constraints:
1. supp(R) = supp(R1 ) and conf (R) = conf (R1 )
2. X1 ⊂ X and Y ⊂ Y1
Exact association rules, of the form R : X =⇒ Y , are implications between
two itemsets X and XY whose closures are identical, i.e., ω(X) = ω(XY ).
Indeed, from the aforementioned equality, we deduce that X and XY belong
to the same class of the equivalence relation induced by the closure operator
ω on P (A) and then supp(X) = supp(XY ) (i.e., conf (R) = 1). Bastide et
al. characterized what they called “the generic basis for exact association rules”
(adapting the global implications base of Duquenne and Guigues [8]). The generic
basis for exact association rules, that has been proved to be lossless information,
is defined as follows:
Definition 2 Let F C be the set of frequent closed itemsets extracted from a
context and, for each frequent closed itemset c, let us denote Gc the set of minimal
generators of c. The generic basis for exact association rules is as follows:
184 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
Association rules Support Conf idence
2
Association rules Support Conf idence
D =⇒ C 3
1 1 3
2
T =⇒ ACW 2 4
T =⇒ C 3
1 1 3
5
A =⇒ CT W 2 4
W =⇒ C 6
1 1 3
2
D =⇒ CW 2 4
A =⇒ CW 3
1 5 5
1
C =⇒ W 6 6
DW =⇒ C 2
1 2 2
1
C =⇒ T 3 3
T W =⇒ AC 2
1 2 2
1
C =⇒ D 3 3
AT =⇒ CW 2
1
Table 2. Non redundant exact and approximate association rules extracted from the
context extraction k for minsup = 12 and minconf = 23
GB = {R : g =⇒ c − g | c ∈ F C ∧ g ∈ Gc ∧ g = c}.
The authors also characterized the informative basis for approximate asso-
ciation rules based on extending the family of bases of partial implications of
Luxemburger [8], e.g., cover, structural and arborescent bases. The informative
basis is defined as follows:
Definition 3 The informative basis for approximate association rules is given
by:
IB = {R : g =⇒ c − g, c ∈ F C ∧ g ∈ G ∧ ω(g) ⊂ c}.
With respect to Definitions 2 and 3, we consider that given an Iceberg Galois
lattice, representing precedence-relation-based ordered closed itemsets, generic
bases of association rules can be derived in a straightforward manner. We assume
that in such structure, each closed itemset is ”decorated” with its associated list
of minimal generators. Hence, AR represent ”inter-node” implications, assorted
with a statistical information, i.e., the confidence, from a sub-closed-itemset to
a super-closed-itemset while starting from a given node in an ordered structure.
Inversely, ER are ”intra-node” implications extracted from each node in the
ordered structure.
Example 2 Generic bases of exact and approximate association rules that can
be drawn from the context k, are respectively depicted in table 2.
As shown in [7] the couple (GB,IB) form a sound and informative generic
basis. However, it produces a very important number of association rules, so this
generic basis is not compact.
VIE MGB: A Visual Interactive Exploration 185
Association rules Support Conf idence
1 3
A =⇒ ACT W 2 4
1 3
T =⇒ ACT W 2 4
1 3
D =⇒ CDW 2 4
2 2
∅ =⇒ CD 3 3
2 2
∅ =⇒ CT 3 3
2 2
∅ =⇒ ACW 3 3
Table 3. Representative basis RBP han extracted from context extraction k for
minsup = 12 and minconf = 23 .
3.2 Representative basis (RBP han )
In [9], the author proposes a new generic basis for association rules denoted
RBP han , and presents its associated inference mechanism. This approach takes
as input all frequent itemsets obtained from a given context extraction k by
applying an appropriate algorithm such as Apriori [10]. Phan defines the rule-
redundancy as follows:
Definition 4 Let E be the set of all strong association rules extracted from a
context extraction k. The rule r : l1 =⇒ l2 ∈ E is a redundant one if and
only if there is a rule r : l1 =⇒ l2 ∈ E, and the two following conditions are
satisfied:
1. l1 ⊆ l1
2. l2 ⊆ l2
So the representative basis RBP han is defined as follows:
Definition 5 Let minsup and minconf be the support and confidence thresholds
for interesting association rules.
RBP han = r : I =⇒ J | I ⊂ J ∧ r : I =⇒ J | I ⊆ I ∧ J ⊆ J
Remark 1. RBP han consists in a redefinition of representative rules denoted as
RR and defined in [7]. However, the difference between them is that for RBP han
the premise and the conclusion of a given rule are not necessarily disjoint.
Example 3 Given the context extraction k depicted by table 1, we present the
representative basis extracted for minsup = 12 and minconf = 23 in table 3.
186 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
Once the representative basis is generated, we can derive all strong redundant
association rules by applying the inference mechanism defined in [9] as follows:
Definition 6 Let I, J, K be frequent itemsets and having RBP han , we consider
two derivation axioms:
1. Left augmentation: if I =⇒ JK is interesting, then IJ =⇒ JK is
interesting.
2. Decomposition: if I =⇒ I1 I2 is interesting, then I =⇒ I1 and I =⇒ I2
are interesting.
Moreover, Phan showed that the representative basis is the most reduced set of
association rules, from which all remaining rules (redundant) can be derived, so
the property of compacity is satisfied. RBP han is sound and lossless, therefore,
this approach presents some drawbacks such as:
1. The representative basis is not informative. The support and confidence of
derived rules can not be found.
2. For each rule R : X =⇒ Y , we must verify if there is no rule R : X =⇒ Y
such as X ⊆ X ∧Y ⊆ Y .
3.3 Informative Generic Basis (IGB)
In [11], the authors propose a new informative generic basis for association rules
denoted IGB, which takes as input all the frequent closed itemsets. IGB is
defined in [11] as follows:
Definition 7 Let F C be the set of frequent closed itemsets and Gc , the list of
minimal generators for a given frequent closed itemset c.
IGB={R : gs =⇒ I − gs | I ∈ F C ∧ gs ∈ Gc , c ∈ F C ∧ c ⊆ I
∧ conf (R)≥minconf ∧ g | g ⊂ gs ∧ conf (g =⇒ A − g ) ≥ minconf }
Example 4 Given the context extraction k of table 1, we present the basis IGB
extracted for minsup = 12 and minconf = 23 in table 4.
In [11], the authors prove that IGB is informative. In fact, IGB contains all
frequent closed itemsets, and the support of a frequent itemset is equal to the
support of the smallest frequent closed itemset which contains it. So, the support
and the confidence of redundant rules can easily be determined.
In other way, IGB is sound and lossless, and the inference mechanism is
composed by three derivation axioms which are [11]:
VIE MGB: A Visual Interactive Exploration 187
Association rules Support Conf idence
1 3
A =⇒ CT W 2 4
1 3
T =⇒ ACW 2 4
1 3
D =⇒ CW 2 4
2 2
∅ =⇒ CD 3 3
2 2
∅ =⇒ CT 3 3
2 2
∅ =⇒ ACW 3 3
5 5
∅ =⇒ CW 6 6
∅ =⇒ C 1 1
Table 4. IGB extracted from the context extraction k for minsup = 12 and minconf =
2
3
.
1. Conditional Reflexivity: if X =⇒ Y ∈ IGB∧ X = ∅, then X =⇒ Y is
a strong association rule.
2. Augmentation: if X =⇒ Y ∈ IGB, then X ∪ Z =⇒ Y − {Z} is a strong
association rule, Z ⊂ Y .
3. Conditional Decomposition: if R : X =⇒ Y ∈ IGB∧ X = ∅, then
X =⇒ Z is a strong association rule, Z ⊂ Y .
Certainly, IGB is informative, therefore this property is satisfied by the presence
of all frequent closed itemsets in the generic basis. So, one can imagine the num-
ber of frequent closed itemsets extracted from real databases and consequently
the number of generic rules that can be extracted.
4 MGB: A new Minimal Generic Basis
In the previous section, we have studied the properties satisfied by each approach
presented, and so, we deduce that, the problem of the construction of generic
bases is to find a compromise among the compacity and the informativity of
such basis. Then, we propose a new minimal generic basis of association rules
that guarantees a partial informativity and which means that we can exactly
determine the support and confidence of some derived rules, and for others, this
approach allows to delimit this measures in an interval.
MGB is defined as follows:
Definition 8 Given
1. Lc : Iceberg Galois lattice decorated by minimal generators and their supports.
188 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
2. ci : frequent closed itemset.
3. S: set of immediate successors of a given concept c.
4. Gci : list of minimal generators of a concept ci .
R : g =⇒ ci − g | g ∈ Gci ∧ ci ∈ Lc ∧ conf (R) ≥ minconf
MGB= support(s)
∧ s ∈ S | support(g) ≥ minconf
4.1 Algorithm for discovery MGB from frequent closed itemsets
In this section, we present the algorithm that extracts the MGB basis directly
from Iceberg Galois lattice decorated with its associated list of minimal genera-
tors and their supports. It is assumed that for every frequent closed itemset we
determine the set of its immediate successors and the list of generators satisfying
a minconf constraint.
Algorithm 1 MGB
Input: Lc : Iceberg Galois lattice decorated by minimal generators and their
supports.
Output: MGB
begin
for each (ci ∈ F C | |ci | > 1) do
Gci = {minimal generators of concept c | c ⊆ c∧ support(c i)
support(c )
≥ minconf }
for each generator g ∈ Gci do
support(s)
if (s ∈ Sci | support(g) ≥ minconf ) then
MGB =MGB ∪R : g → ci − g
Gci = Gci − {g | g ⊆ g }
else
Gci = Gci − {g}
end
Example 5 Given the context extraction shown by table 1, we will illustrate in
table 5 the MGB generic basis for minsup = 12 and minconf = 23 .
As we have shown, MGB requires to determine the list of all immediate suc-
cessors of a given concept c, thus, we will present how to construct this list. So,
we will explore the Iceberg Galois lattice considering the set inclusion and the
partial order defined between the different frequent closed itemsets. In fact, this
process is composed of two steps, which are:
1. Generating the list of all successors of a given concept c.
2. Using this list, we determine the immediate successors of c.
VIE MGB: A Visual Interactive Exploration 189
Association rules Support Conf idence
1 3
A =⇒ CT W 2 4
1 3
T =⇒ ACW 2 4
2 4
W =⇒ AC 3 5
2 2
C =⇒ AW 3 3
2 2
C =⇒ T 3 3
2 2
C =⇒ D 3 3
1 3
D =⇒ CW 2 4
Table 5. The generic basis MGB obtained from the context extraction k for minsup =
1
2
and minconf = 23
4.2 Deriving redundant association rules
For deriving all strong redundant association rules, we propose to adopt two
derivation axioms of the inference mechanism defined in [11].
1. Augmentation: if X =⇒ Y ∈ MGB, then X ∪ Z =⇒ Y − {Z} is a strong
association rule, Z ⊂ Y .
2. Conditional Decomposition: if R : X =⇒ Y ∈ MGB, then X =⇒ Z is
a strong association rule, Z ⊂ Y .
This inference mechanism has been proved in [11] to be sound and complete.
Remark 2. It is important to mention that for deriving all approximate asso-
ciation rules, the order of applying the inference axioms is important. Indeed,
we have to apply first the Augmentation axiom and next apply the Conditional
Decomposition axiom on the resulting set of association rules, i.e., set of approx-
imate generic rules and those derived by the Augmentation axiom.
In fact, MGB is not informative. The support and confidence of derived
rules cannot be found. For example we cannot find the confidence of the rule
AT =⇒ CW derived from A =⇒ CT W because the support of AT is unknown.
Nevertheless, the non-informativity problem can be partially resolved by in-
troducing a new concept denoted ”Partial informativity”. In fact, it means that
we can exactly determine the support and confidence of some derived rules, and
for others, this approach allows to delimit this measures in an interval.
Proposition 1. Let R : X =⇒ Y ∈MGB, A and B two integers, if we apply
the inference mechanism, we obtain rules of the following forms:
1. Augmentation: R1 : XZ =⇒ Y − Z
190 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
A=B A=B
Augmentation Augmentation
R1.supp=R.supp R1.supp=R.supp
R1.conf=R.supp/supp(XZ) R1.conf∈[R.conf,1]
Conditional Decomposition Conditional Decomposition
R2.supp=supp(XZ) B ≤ R2 .supp ≤ A
R2.conf=supp(XZ)/supp(X) R2.conf∈ [R.conf,1]
Table 6. Support an confidence of derived rules
2. Conditional Decomposition: R2 : X =⇒ Z
(a) support(XZ) ≤ min {support(i1 ), support(i2 ), ..., support(ik ) | ii ⊂ XZ},
A, where ii is a frequent subset of XZ.
(b) support(XZ) ≥ max{support(I1 ), support(I2 ), ..., support(Ik )}, B, where
Ii is a frequent closed itemset containing XZ and existing in MGB.
Remark 3. if A = B, then we have supp(XZ) = A.
So, table 6 illustrates how we determine support and confidence of these rules.
Proof. a: Since the support of a frequent itemset is less than the support of all
its subsets, then we deduce that support(XZ) is less than the smallest support
of all its subsets.
b: By same, the support of a frequent itemset is higher than the support
of all its supersets, so support(XZ) is higher than the support of the smallest
frequent closed itemset which contains it and existing in MGB.
c: If we apply the augmentation axiom, the generic rule’s base and the re-
dundant rule’s base are similar, so support(R1 ) = support(R).
d : If we apply the conditional decomposition axiom, then the derived rule’s
base is XZ, and so support(R2 ) = support(XZ)
Remark 4. Before deriving the redundant rules, we determine the support of all
frequent itemsets which are the antecedent of generic rules.
Example 6 Given the generic rule R : C =⇒ AW , supp(R) = 23 and conf (R) =
2 supp(R)
3 , so, supp(C) = conf (R) = 1.
In table 7, we summarize the properties of the reviewed association rules
representations. These ones contain only rules whose bases are frequent closed
itemsets and whose antecedent are frequent generators. Let us consider the fol-
lowing notations:
VIE MGB: A Visual Interactive Exploration 191
Generic basis Intended derivable rules Infer. mechanism Lossless Sound Informative
RBP han AR LA + D yes yes no
IGB AR CR + A + CD yes yes yes
(GB,IB) certainAR+approxAR A + CD yes yes yes
scMGB AR A + CD yes yes no
Table 7. Summary of Association Rules Representations
Database Nature #transactions #items |f req − clos| |max − clos|
Retail sparse 88162 16470 580 284
Mushrooms dense 8124 120 427 63
Connect dense 67557 130 810 98
Chess dense 3196 76 1194 71
Table 8. Dataset characteristics
– AR: strong association rules
– LA: left augmentation
– D: decomposition
– CR: conditional reflexivity
– A: augmentation
– CD: conditional decomposition
5 Visual implementation and experimental evaluation
All experiments described below were performed on a 3,06GHz PC with 512MB
of memory, running Windows XP. The algorithm MGB was coded in JAVA.
Table 8 shows characteristics of real and synthetic datasets used in our eval-
uation.
Table 9 shows the experimental results. The last column consists of the number
of all strong association rules extracted.
5.1 Experimental results
We compare MGB’results with those concerning RBP han , IGB and (GB,IB)
which are detailed in [11].
192 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
Database Minsup Minconf MGB RBP han IGB (GB,IB) AR
Retail 0.5% 0.5% 435 284 580 1382 1382
1% 402 459 757 1334 1334
10% 232 214 427 770 770
50% 304 305 353 438 438
100% 0 0 0 0 0
Connect 95% 95% 635 97 809 25336 77816
96% 852 1003 2140 23780 73869
97% 1403 1178 2438 18470 60101
98% 1033 1404 2598 11717 41138
99% 1386 1667 2284 5250 19967
100% 682 682 682 682 2260
Mushrooms 30% 30% 332 63 427 7623 94894
50% 366 318 967 5761 79437
70% 364 429 966 4520 58010
90% 498 519 799 2159 24408
100% 557 558 558 557 8450
Chess 87% 87% 440 71 1194 31538 42740
89% 519 430 2423 29704 40451
91% 627 486 2655 26147 36098
93% 793 616 2766 21350 29866
95% 671 768 2754 14373 20312
100% 342 342 342 342 342
Table 9. Number of generic association rules
VIE MGB: A Visual Interactive Exploration 193
– For some minconf values, MGB is the most reduced set of generic associa-
tion rules:
- In fact, concerning the representative basis RBP han , for each frequent
closed itemset c, if support(c) ≥ minconf , then we have a factual rule
of the form ∅ =⇒ c.
- However, for MGB and for the same minconf value, if the set of generators
of c is empty, the algorithm do not generate any rule.
– If we consider the database Retail, one can observe that for minconf = 1
the number of association rules is equal to 0, so we can say that all the
frequent closed itemsets have only one minimal generator which overlaps the
associated closed itemset.
– By setting minsup = minconf
- For IGB, the number of association rules is equal to the number of frequent
closed itemsets. In fact, the support of all the frequent closed itemsets is
higher or equal than the minconf value.
- The number of association rules in RBP han is equal to the number of
maximal frequent closed itemsets,
– For Mushrooms dataset we can observe that the bases IGB and RBP han
have one more association rule than MGB and (GB,IB), because of the
presence of the item ”85” in all transactions which produces a factual rule
in IGB and RBP han of the form ∅ =⇒ 85.
5.2 VIE MGB: Visual Interactive Exploration of Minimal Generic
Basis
In this section, we present VIE MGB, a new framework for an interactive
graphical visualization and exploration of MGB and the Iceberg Galois lattice.
VIE MGB allows to display generic rules and to derive all strong redundant
association rules with their supports and confidences. The database of inter-
est has previously been mined, using an efficient algorithm for mining frequent
closed itemsets and their minimal generators. Once these closed itemsets are
mined, they are taken as input.
In fact, we have two independent processes executed by VIE MGB, which
are:
1. Generating MGB.
2. Constructing the Iceberg Galois lattice and to visualize redundant rules.
The visual interactive framework for exploring and visualizing minimal generic
basis is depicted in figure 2.
194 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
To generate MGB As we have shown below, this tool takes as input a pre-
viously stored file that contains all the frequent closed itemsets, their minimal
generators and supports extracted from a given dataset, then the user must
specify a minconf value, thus VIE MGB generates a text file which represents
MGB containing all the generic association rules with their supports and confi-
dences.
Fig. 2.
Interactive lattice and rule exploration In other way, from the closed item-
sets, VIE MGB creates the Iceberg Galois lattice as shown in figure 2. Each
closed itemset is represented by a node, and there is a link connecting two nodes
if they are related by a set inclusion relation and there is no intermediate set
between them. Once the lattice is displayed on the panel, one can generate the
redundant association rules as follows:
The user can click on a chosen node on the lattice, then VIE MGB dis-
plays on the text area the generic rules corresponding to this closed itemset and
all strong redundant association rules derived from the generic ones with their
supports and confidences.
VIE MGB: A Visual Interactive Exploration 195
Remark 5. When generating the redundant association rules, each frequent item-
set is mapped to the smallest frequent closed itemset which contains it to deter-
mine the support of the derived rule and its confidence. So, the problem of the
informativity of the generic basis is completely resolved.
6 Conclusion
In this paper, we have presented theoretical foundations of generic bases of asso-
ciation rules and we have pointed out the properties satisfied by some approaches.
We have proposed MGB which is a new minimal generic basis of associations
rules and its associated inference mechanism. We have also introduced the con-
cept of ”partial informativity”.
This new approach is consolidated by a set of experiments using real and
synthetic databases showing important benefits in terms of reduction in the
number of association rules presented to the user. Finally, we have proposed a
new framework denoted VIE MGB allowing a visual interactive exploration of
MGB.
References
1. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Intelligent structur-
ing and reducing of association rules with formal concept analysis. In: Proceedings
of KI’2001 conference, Lecture Notes in Artificial Intelligence 2174, Springer-verlag.
(2001) 335–350
2. Ganter, B., Wille, R.: Formal Concept Analysis. Springer-Verlag, Heidelberg (1999)
3. Zaki, M., Phoophakdee, B.: MIRAGE: A framework for mining, exploring and
visualizing minimal association rules. Rpi cs dept technical report (2003)
4. Barbut, M., Monjardet, B.: Ordre et classification. Algèbre et Combinatoire. Ha-
chette, Tome II (1970)
5. Bastide, Y., Pasquier, N., Taouil, R., Lakhal, L., Stumme, G.: Mining minimal
non-redundant association rules using frequent closed itemsets. In: Proceedings of
the International Conference DOOD’2000, Lecture Notes in Computer Sciences,
Springer-verlag. (2000) 972–986
6. Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between sets of
items in large Databases. ACM SIGMOD Records (1993) 207–216
7. Kryszkiewicz, M.: Concise representations of association rules. In: Proceedings of
Pattern Detection and Discovery, ESF Exploratory Workshop, London, UK. (2002)
92–109
196 Chiraz Latiri Cherif, Wissem Bellegha, Sadok Ben Yahia, Ghada Guesmi
8. Duquenne, J., Guigues, J.: Famille minimale d’implications informatives résultant
d’un tableau de données binaires. Mathématiques et Sciences Humaines 95 (1986)
5–18
9. Luong, V.P.: Raisonnement sur les rgles d’association. In: In Proceedings 17me
Journe Bases de Donnes Avances BDA’2001, Agadir(Maroc), Cpadus Edition.
(2001) 299–310
10. Agrawal, R., Skirant, R.: Fast algorithms for mining association rules. In: Pro-
ceedings of the 20th International Conference on Very Large Databases. (1994)
478–499
11. Gasmi, G., BenYahia, S., Nguifo, E.M., Slimani, Y.: A new informative generic
base of association rules. In: In Proceedings of the Ninth Pacific-Asia Conference
on Knowledge Discovery and Data Mining (PAKDD-05),Hanoi,Vietnam. (2005)
81–90