1 Introduction

VVIIEE MMGGBB:: AA VViissuuaall IInntteerraaccttiivvee EExxpplloorraattiioonn ooff MMiinniimmaall GGeenneerriicc BBaassiiss ooff AAssssooppcciiaattiioonn RRuulleess

Chiraz Latiri Cherif

Wissem Bellegha

wissembellegha@gmail.com 0

Sadok Ben Yahia

Chiraz Latiri Cherif

smissi

Sadok Ben Yahia

Ghada Guesmi

0 0 Campus UniverRseitsaeiarrechELTeMa manUarR , P10A6H0 Tunis , Tunisia Campus

179 196

Mining association rules is an important task, even though the number of rules discovered is often huge. A possible solution to this problem, is to use the Formal Concept Analysis (FCA) mathematical settings to restrict rules extraction to a generic basis of association rules. This one is considered as a reduced set to which we can apply appropriate inference mechanisms to derive redundant rules. In this paper, we introduce a new minimal generic basis MGB of non-redundant association rules based on the augmented Iceberg Galois lattice. The proposed approach involves the inference mechanisms used and a set of experiments applied to several real and synthetic databases. Carried out experiments showed important benefits in terms of reduction in the number of generic rules extracted. We present also a new framework for generating and visually exploring the minimal generic basis MGB.

1 Introduction

Discovering association rules in large datasets is an interesting problem of research in data mining. Its principal objective is to find out correlations between frequents itemsets. An association rule is a strong one when its support (i.e., frequency of the represented pattern) and confidence (i.e., strength of the dependency between the premise and the conclusion) are higher than minimum thresholds fixed by the user (minsup and minconf ). However the number of association rules generated is often huge because many of them are redundant. So, a possible solution to this problem, is to restrict rules extraction to a reduced set containing only non redundant ones which are strictly related to the user’s need [ 1 ].

This reduced set is known as generic basis for association rules, to which we can apply appropriate inference mechanisms to derive all strong redundant association rules. A generic basis can be evaluated by three properties such as informativity, compacity and inference mechanism which will be detailed later.

This paper is organized as follows. Section 2 presents the mathematical background of Formal Concept Analysis (FCA) [ 2 ] and describes the association rules mining task. Section 3 deals with the problem of eliminating redundant rules by surveying some of the proposed approaches for constructing generic bases of association rules. In section 4, we present a new minimal generic basis of association rules denoted as MGB. Section 5 consolidates the proposed approach by a set of experiments applied to several real synthetic databases. Finally, we will present VIE MGB as a new framework for visually generating and exploring MGB and the Iceberg Galois lattice. We were inspired by MIRAGE [ 3 ] which allows interactive graphical visualization and exploration of minimal association rules. 2

Mathematical background

In this section, we introduce the mathematical notions based on the FCA. 2.1

Basic notions

Formal context: A formal context is a triplet K = (O, A, R), where O represents a finite set of objects, A a finite set of attributes, and R a binary relation(i.e., R ⊆ O × A). Each couple (o, a) ∈ R expresses that the transaction o ∈ O contains the attribute a ∈ A.

We define two functions, summarizing links between subsets of objects and subsets of attributes induced by R, that map sets of objects to sets of attributes and vice versa. Thus, for a set O ⊆ O, we define φ(O) = {a | ∀o, o ∈ O =⇒ (o, a) ∈ R} ; and for A ⊆ A, ψ(A) = {o | ∀a, a ∈ A =⇒ (o, a) ∈ R}. Both functions φ and ψ form a Galois connection between the sets P(A) and P(O) [ 4 ]. Consequently, both compound operators of φ and ψ are closure operators, in particular, ω = φ ◦ ψ is a closure operator.

Frequent closed itemset: An itemset A ⊆ A is said to be closed if A = ω(A), and is said to be frequent with respect to minsup threshold if support(A) = |ψ(A)| ≥ minsup.

|O|

Formal concept: A formal concept is a couple c = (O; A), where O is called extent, and A is a closed itemset, called intent. Furthermore, both O and A are related through the Galois connection, i.e., φ(O) = A and ψ(A) = O.

Minimal generator: An itemset g ⊆ A is called minimal generator of a closed itemset A, if and only if ω(g) = A and g ⊆ g such that ω(g ) = A [ 5 ].

Galois lattice: Given a formal context K, the set of formal concepts CK is a complete lattice Lc = (C, ≤), called the Galois (concept) lattice, when CK is considered with inclusion between itemsets [ 2 ] [ 4 ]. A partial order on formal concepts is defined as follows ∀c1, c2 ∈ CK, c1 ≤ c2 iff intent(c2) ⊆ intent(c1), or equivalently extent(c1) ⊆ extent(c2). The partial order is used to generate the lattice graph, called Hasse diagram, in the following manner: there is an arc (c1, c2), if c1 c2 where is the transitive reduction of ≤, i.e., ∀c3 ∈ CK, c1 ≤ c3 ≤ c2 implies either c1 = c3 or c2 = c3.

Property 1 Lattice operator join provides the least upper bound (LUB) in the concept lattice, and it is defined as follows:

Let S ⊆ Lc

J oin(S) = ω ∪ c c∈S

Iceberg Galois lattice: When only frequent closed itemsets are considered with set inclusion, the resulting structure (Lc, ⊆) only preserves the LUBS , i.e., the join operator. This is called a join semi-lattice or upper semi-lattice. In the remaining of the paper, such structure is referred to as ”Iceberg Galois Lattice”. Example 1 Let us consider the extraction context given by table 1. The associated Iceberg Galois lattice, for minsup = 12 , is depicted by figure 1. Each node in the Iceberg is represented as couple (closed itemset,support) and is decorated with its associated minimal generators list. 2.2

Association rules

An association rule represents an implication between frequent itemsets. It is an expression of the form: R : X =⇒ Y [ 6 ], in which X and X Y are frequent itemsets, and X ⊂ X Y . Itemsets X and Y are called, respectively, premise and conclusion of the rule R. The support of R denoted supp(R), is equal to the support of X Y , and its confidence denoted by conf (R) is determined as follows: R A C D T W 1 × × × × 2 × × × 3 × × × × 4 × × × × 5 × × × × × 6 × × × conf (R) = ssuupppp((XXY)) . If conf (R) = 1, then R is an exact association rule (ER), otherwise it is called approximate association rule (AR).

The process of association rules extraction can be split into two steps [ 6 ]: 1. Finding out all frequent itemsets from a given context extraction k for a specified minsup value. 2. Generating all association rules from the frequent itemsets obtained and restricting extraction to rules which confidence satisfies the minconf value. 3

Extraction of non redundant association rules

The problem of the relevance and usefulness of extracted association rules is of primary importance. Indeed, in most real life databases, the set of association rules can rapidly grow to be unwieldy, especially as we lower the minsup value, and among which many are redundant. So, a possible solution to this problem is to restrict extraction of rules to a generic basis of non redundant association rules. Once this generic basis is obtained, all the remaining (redundant) rules can be derived using an appropriate inference mechanism. In an ideal case a generic basis should satisfy the following conditions [ 7 ]: - Informativity: to exactly determine the support and confidence of redundant association rules. - Deriving redundant rules: the derivation axioms should be lossless (should enable derivation of all strong rules) and sound (should forbid derivation of rules that are not strong). - Compacity: to have the most reduced set of generic rules allowing the derivation of all strong remaining ones(redundant rules).

In this section, we will overview some representations of strong association rules. 3.1

Minimal non redundant association rules (MNR)

In [ 5 ], Bastide et al. present a couple of generic bases, one for the exact rules denoted GB, and the second for approximate ones denoted IB.

Bastide et al. consider the following rule-redundancy definition: Definition 1 An association rule R : X =⇒ Y is a minimal non redundant association rule iff an association rule R1 : X1 =⇒ Y1 fulfilling the following constraints: 1. supp(R) = supp(R1) and conf (R) = conf (R1) 2. X1 ⊂ X and Y ⊂ Y1

Exact association rules, of the form R : X =⇒ Y , are implications between two itemsets X and X Y whose closures are identical, i.e., ω(X ) = ω(X Y ). Indeed, from the aforementioned equality, we deduce that X and X Y belong to the same class of the equivalence relation induced by the closure operator ω on P (A) and then supp(X ) = supp(X Y ) (i.e., conf (R) = 1). Bastide et al. characterized what they called “the generic basis for exact association rules” (adapting the global implications base of Duquenne and Guigues [ 8 ]). The generic basis for exact association rules, that has been proved to be lossless information, is defined as follows: Definition 2 Let F C be the set of frequent closed itemsets extracted from a context and, for each frequent closed itemset c, let us denote Gc the set of minimal generators of c. The generic basis for exact association rules is as follows:

Association rules Support Conf idence

D =⇒ C 32 1 T =⇒ C 32 1 W =⇒ C 65 1 A =⇒ CW 32 1 DW =⇒ C 12 1 T W =⇒ AC 21 1 AT =⇒ CW 12 1

Association rules Support Conf idence

TADCC==C=⇒⇒==⇒=⇒⇒⇒ACCCWTTDWWW 111222263523 434334632523

GB = {R : g =⇒ c − g | c ∈ F C ∧ g ∈ Gc ∧ g = c}.

The authors also characterized the informative basis for approximate association rules based on extending the family of bases of partial implications of Luxemburger [ 8 ], e.g., cover, structural and arborescent bases. The informative basis is defined as follows: Definition 3 The informative basis for approximate association rules is given by:

IB = {R : g =⇒ c − g, c ∈ F C ∧ g ∈ G ∧ ω(g) ⊂ c}.

With respect to Definitions 2 and 3, we consider that given an Iceberg Galois lattice, representing precedence-relation-based ordered closed itemsets, generic bases of association rules can be derived in a straightforward manner. We assume that in such structure, each closed itemset is ”decorated” with its associated list of minimal generators. Hence, AR represent ”inter-node” implications, assorted with a statistical information, i.e., the confidence, from a sub-closed-itemset to a super-closed-itemset while starting from a given node in an ordered structure. Inversely, ER are ”intra-node” implications extracted from each node in the ordered structure.

Example 2 Generic bases of exact and approximate association rules that can be drawn from the context k, are respectively depicted in table 2.

As shown in [ 7 ] the couple (GB,IB) form a sound and informative generic basis. However, it produces a very important number of association rules, so this generic basis is not compact. In [ 9 ], the author proposes a new generic basis for association rules denoted RBP han, and presents its associated inference mechanism. This approach takes as input all frequent itemsets obtained from a given context extraction k by applying an appropriate algorithm such as Apriori [ 10 ]. Phan defines the ruleredundancy as follows: Definition 4 Let E be the set of all strong association rules extracted from a context extraction k. The rule r : l1 =⇒ l2 ∈ E is a redundant one if and only if there is a rule r : l1 =⇒ l2 ∈ E, and the two following conditions are satisfied:

So the representative basis RBP han is defined as follows: Definition 5 Let minsup and minconf be the support and confidence thresholds for interesting association rules.

RBP han = r : I =⇒ J | I ⊂ J ∧ r : I =⇒ J | I ⊆ I ∧ J ⊆ J Remark 1. RBP han consists in a redefinition of representative rules denoted as RR and defined in [ 7 ]. However, the difference between them is that for RBP han the premise and the conclusion of a given rule are not necessarily disjoint. Example 3 Given the context extraction k depicted by table 1, we present the representative basis extracted for minsup = 12 and minconf = 23 in table 3. Once the representative basis is generated, we can derive all strong redundant association rules by applying the inference mechanism defined in [ 9 ] as follows: Definition 6 Let I, J , K be frequent itemsets and having RBP han, we consider two derivation axioms: 1. Left augmentation: if I =⇒ J K is interesting, then IJ =⇒ J K is interesting. 2. Decomposition: if I =⇒ I1I2 is interesting, then I =⇒ I1 and I =⇒ I2 are interesting.

Moreover, Phan showed that the representative basis is the most reduced set of association rules, from which all remaining rules (redundant) can be derived, so the property of compacity is satisfied. RBP han is sound and lossless, therefore, this approach presents some drawbacks such as: 1. The representative basis is not informative. The support and confidence of derived rules can not be found. 2. For each rule R : X =⇒ Y , we must verify if there is no rule R : X =⇒ Y such as X ⊆ X ∧Y ⊆ Y . 3.3

Informative Generic Basis (IGB)

In [ 11 ], the authors propose a new informative generic basis for association rules denoted IGB, which takes as input all the frequent closed itemsets. IGB is defined in [ 11 ] as follows: Definition 7 Let F C be the set of frequent closed itemsets and Gc, the list of minimal generators for a given frequent closed itemset c.

IGB={R : gs =⇒ I − gs | I ∈ F C ∧ gs ∈ Gc, c ∈ F C ∧ c ⊆ I ∧ conf (R)≥minconf ∧ g | g ⊂ gs ∧ conf (g =⇒ A − g ) ≥ minconf } Example 4 Given the context extraction k of table 1, we present the basis IGB extracted for minsup = 12 and minconf = 23 in table 4.

In [ 11 ], the authors prove that IGB is informative. In fact, IGB contains all frequent closed itemsets, and the support of a frequent itemset is equal to the support of the smallest frequent closed itemset which contains it. So, the support and the confidence of redundant rules can easily be determined.

In other way, IGB is sound and lossless, and the inference mechanism is composed by three derivation axioms which are [ 11 ]:

Association rules Support Confidence

∅TAD∅∅∅======⇒=⇒⇒⇒⇒⇒⇒ACACCCCCTCWTDWWWW 22111223232356 33434432323256

∅ =⇒ C 1 1

Certainly, IGB is informative, therefore this property is satisfied by the presence of all frequent closed itemsets in the generic basis. So, one can imagine the number of frequent closed itemsets extracted from real databases and consequently the number of generic rules that can be extracted. 4

MGB: A new Minimal Generic Basis

In the previous section, we have studied the properties satisfied by each approach presented, and so, we deduce that, the problem of the construction of generic bases is to find a compromise among the compacity and the informativity of such basis. Then, we propose a new minimal generic basis of association rules that guarantees a partial informativity and which means that we can exactly determine the support and confidence of some derived rules, and for others, this approach allows to delimit this measures in an interval.

MGB is defined as follows:

Definition 8 Given

1. Lc: Iceberg Galois lattice decorated by minimal generators and their supports. 2. ci: frequent closed itemset. 3. S: set of immediate successors of a given concept c. 4. Gci : list of minimal generators of a concept ci.

MGB=

R : g =⇒ ci − g | g ∈ Gci ∧ ci ∈ Lc ∧ conf (R) ≥ minconf

∧ s ∈ S | ssuuppppoorrtt((gs)) ≥ minconf

Algorithm for discovery MGB from frequent closed itemsets

In this section, we present the algorithm that extracts the MGB basis directly from Iceberg Galois lattice decorated with its associated list of minimal generators and their supports. It is assumed that for every frequent closed itemset we determine the set of its immediate successors and the list of generators satisfying a minconf constraint.

Algorithm 1 MGB

Input: Lc: Iceberg Galois lattice decorated by minimal generators and their supports.

Output: MGB begin for each (ci ∈ F C | |ci| > 1) do Gci = {minimal generators of concept c | c ⊆ c∧ ssuuppppoorrtt((cci)) ≥ minconf } for each generator g ∈ Gci do

support(s) if ( s ∈ Sci | support(g) ≥ minconf ) then

MGB =MGB ∪R : g → ci − g end

Gci = Gci − {g | g ⊆ g } else

Gci = Gci − {g} Example 5 Given the context extraction shown by table 1, we will illustrate in table 5 the MGB generic basis for minsup = 12 and minconf = 23 . As we have shown, MGB requires to determine the list of all immediate successors of a given concept c, thus, we will present how to construct this list. So, we will explore the Iceberg Galois lattice considering the set inclusion and the partial order defined between the different frequent closed itemsets. In fact, this process is composed of two steps, which are: 1. Generating the list of all successors of a given concept c. 2. Using this list, we determine the immediate successors of c. For deriving all strong redundant association rules, we propose to adopt two derivation axioms of the inference mechanism defined in [ 11 ]. 1. Augmentation: if X =⇒ Y ∈ MGB, then X ∪ Z =⇒ Y − {Z} is a strong association rule, Z ⊂ Y . 2. Conditional Decomposition: if R : X =⇒ Y ∈ MGB, then X =⇒ Z is a strong association rule, Z ⊂ Y .

This inference mechanism has been proved in [ 11 ] to be sound and complete. Remark 2. It is important to mention that for deriving all approximate association rules, the order of applying the inference axioms is important. Indeed, we have to apply first the Augmentation axiom and next apply the Conditional Decomposition axiom on the resulting set of association rules, i.e., set of approximate generic rules and those derived by the Augmentation axiom.

In fact, MGB is not informative. The support and confidence of derived rules cannot be found. For example we cannot find the confidence of the rule AT =⇒ CW derived from A =⇒ CT W because the support of AT is unknown.

Nevertheless, the non-informativity problem can be partially resolved by introducing a new concept denoted ”Partial informativity”. In fact, it means that we can exactly determine the support and confidence of some derived rules, and for others, this approach allows to delimit this measures in an interval. Proposition 1. Let R : X =⇒ Y ∈MGB, A and B two integers, if we apply the inference mechanism, we obtain rules of the following forms: 1. Augmentation: R1 : XZ =⇒ Y − Z

A=B A=B Augmentation Augmentation

R1.supp=R.supp R1.supp=R.supp

R1.conf=R.supp/supp(XZ) R1.conf∈[R.conf,1] Conditional Decomposition Conditional Decomposition

R2.supp=supp(XZ) B ≤ R2 .supp ≤ A

R2.conf=supp(XZ)/supp(X) R2.conf∈ [R.conf,1]

Table 6. Support an confidence of derived rules 2. Conditional Decomposition: R2 : X =⇒ Z (a) support(XZ) ≤ min {support(i1), support(i2), ..., support(ik) | ii ⊂ XZ},

A, where ii is a frequent subset of XZ. (b) support(XZ) ≥ max{support(I1), support(I2), ..., support(Ik)}, B, where

Ii is a frequent closed itemset containing XZ and existing in MGB. Remark 3. if A = B, then we have supp(XZ) = A.

So, table 6 illustrates how we determine support and confidence of these rules. Proof. a: Since the support of a frequent itemset is less than the support of all its subsets, then we deduce that support(XZ) is less than the smallest support of all its subsets.

b: By same, the support of a frequent itemset is higher than the support of all its supersets, so support(XZ) is higher than the support of the smallest frequent closed itemset which contains it and existing in MGB.

c: If we apply the augmentation axiom, the generic rule’s base and the redundant rule’s base are similar, so support(R1) = support(R).

d : If we apply the conditional decomposition axiom, then the derived rule’s base is XZ, and so support(R2) = support(XZ) Remark 4. Before deriving the redundant rules, we determine the support of all frequent itemsets which are the antecedent of generic rules.

Example 6 Given the generic rule R : C =⇒ AW , supp(R) = 23 and conf (R) = 23 , so, supp(C) = csounpfp((RR)) = 1.

In table 7, we summarize the properties of the reviewed association rules representations. These ones contain only rules whose bases are frequent closed itemsets and whose antecedent are frequent generators. Let us consider the following notations: Generic basis Intended derivable rules Infer. mechanism Lossless Sound Informative RBP han AR LA + D yes yes no

IGB AR CR + A + CD yes yes yes (GB,IB) certainAR+approxAR A + CD yes yes yes scMGB AR A + CD yes yes no

Table 7. Summary of Association Rules Representations

Visual implementation and experimental evaluation

All experiments described below were performed on a 3,06GHz PC with 512MB of memory, running Windows XP. The algorithm MGB was coded in JAVA.

Table 8 shows characteristics of real and synthetic datasets used in our evaluation. We compare MGB’results with those concerning RBP han, IGB and (GB,IB) which are detailed in [ 11 ]. Mushrooms 30% – For some minconf values, MGB is the most reduced set of generic association rules: - In fact, concerning the representative basis RBP han, for each frequent closed itemset c, if support(c) ≥ minconf , then we have a factual rule of the form ∅ =⇒ c. - However, for MGB and for the same minconf value, if the set of generators of c is empty, the algorithm do not generate any rule. – If we consider the database Retail, one can observe that for minconf = 1 the number of association rules is equal to 0, so we can say that all the frequent closed itemsets have only one minimal generator which overlaps the associated closed itemset. – By setting minsup = minconf - For IGB, the number of association rules is equal to the number of frequent closed itemsets. In fact, the support of all the frequent closed itemsets is higher or equal than the minconf value. - The number of association rules in RBP han is equal to the number of maximal frequent closed itemsets, – For Mushrooms dataset we can observe that the bases IGB and RBP han have one more association rule than MGB and (GB,IB), because of the presence of the item ”85” in all transactions which produces a factual rule in IGB and RBP han of the form ∅ =⇒ 85. 5.2

VIE MGB: Visual Interactive Exploration of Minimal Generic Basis

In this section, we present VIE MGB, a new framework for an interactive graphical visualization and exploration of MGB and the Iceberg Galois lattice. VIE MGB allows to display generic rules and to derive all strong redundant association rules with their supports and confidences. The database of interest has previously been mined, using an efficient algorithm for mining frequent closed itemsets and their minimal generators. Once these closed itemsets are mined, they are taken as input.

In fact, we have two independent processes executed by VIE MGB, which are: 1. Generating MGB. 2. Constructing the Iceberg Galois lattice and to visualize redundant rules. The visual interactive framework for exploring and visualizing minimal generic basis is depicted in figure 2.

Interactive lattice and rule exploration In other way, from the closed item

sets, VIE MGB creates the Iceberg Galois lattice as shown in figure 2. Each closed itemset is represented by a node, and there is a link connecting two nodes if they are related by a set inclusion relation and there is no intermediate set between them. Once the lattice is displayed on the panel, one can generate the redundant association rules as follows:

The user can click on a chosen node on the lattice, then VIE MGB displays on the text area the generic rules corresponding to this closed itemset and all strong redundant association rules derived from the generic ones with their supports and confidences. Remark 5. When generating the redundant association rules, each frequent itemset is mapped to the smallest frequent closed itemset which contains it to determine the support of the derived rule and its confidence. So, the problem of the informativity of the generic basis is completely resolved. 6

Conclusion

In this paper, we have presented theoretical foundations of generic bases of association rules and we have pointed out the properties satisfied by some approaches. We have proposed MGB which is a new minimal generic basis of associations rules and its associated inference mechanism. We have also introduced the concept of ”partial informativity”.

This new approach is consolidated by a set of experiments using real and synthetic databases showing important benefits in terms of reduction in the number of association rules presented to the user. Finally, we have proposed a new framework denoted VIE MGB allowing a visual interactive exploration of MGB.

1. Stumme , G. , Taouil , R. , Bastide , Y. , Pasquier , N. , Lakhal , L. : Intelligent structuring and reducing of association rules with formal concept analysis . In: Proceedings of KI'2001 conference, Lecture Notes in Artificial Intelligence 2174 , Springer-verlag. ( 2001 ) 335 - 350

2. Ganter , B. , Wille , R.: Formal Concept Analysis . Springer-Verlag, Heidelberg ( 1999 )

3. Zaki , M. , Phoophakdee , B. : MIRAGE: A framework for mining, exploring and visualizing minimal association rules . Rpi cs dept technical report ( 2003 )

4. Barbut , M. , Monjardet , B. : Ordre et classification . Alg`ebre et Combinatoire. Hachette , Tome II ( 1970 )

5. Bastide , Y. , Pasquier , N. , Taouil , R. , Lakhal , L. , Stumme , G.: Mining minimal non-redundant association rules using frequent closed itemsets . In: Proceedings of the International Conference DOOD'2000, Lecture Notes in Computer Sciences , Springer-verlag. ( 2000 ) 972 - 986

6. Agrawal , R. , Imielinski , T. , Swami , A. : Mining Association Rules between sets of items in large Databases . ACM SIGMOD Records ( 1993 ) 207 - 216

7. Kryszkiewicz , M. : Concise representations of association rules . In: Proceedings of Pattern Detection and Discovery , ESF Exploratory Workshop, London, UK. ( 2002 ) 92 - 109

8. Duquenne , J. , Guigues , J.: Famille minimale d'implications informatives r´esultant d'un tableau de donn´ees binaires . Math´ematiques et Sciences Humaines 95 ( 1986 ) 5 - 18

9. Luong , V.P. : Raisonnement sur les rgles d'association . In: In Proceedings 17me Journe Bases de Donnes Avances

BDA

' 2001 , Agadir (Maroc), Cpadus Edition . ( 2001 ) 299 - 310

10. Agrawal , R. , Skirant , R.: Fast algorithms for mining association rules . In: Proceedings of the 20th International Conference on Very Large Databases . ( 1994 ) 478 - 499

11. Gasmi , G. , BenYahia, S. , Nguifo , E.M. , Slimani , Y. : A new informative generic base of association rules . In: In Proceedings of the Ninth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-05) ,Hanoi, Vietnam. ( 2005 ) 81 - 90