<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>GARM: Generalized Association Rule Mining</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>T. Hamrouni</string-name>
          <email>hamrouni@cril.univ-artois.fr</email>
          <email>tarek.hamrouni@fst.rnu.tn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Ben Yahia</string-name>
          <email>sadok.benyahia@fst.rnu.tn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>E. Mephu Nguifo</string-name>
          <email>mephu@cril.univ-artois.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CRIL-CNRS, IUT de Lens</institution>
          ,
          <addr-line>Lens</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Faculty of Sciences of Tunis</institution>
          ,
          <addr-line>Tunis</addr-line>
          ,
          <country country="TN">Tunisia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2008</year>
      </pub-date>
      <fpage>145</fpage>
      <lpage>156</lpage>
      <abstract>
        <p>A thorough scrutiny of the literature dedicated to association rule mining highlights that a determined effort focused so far on mining the co-occurrence relations between items, i.e., conjunctive patterns. In this respect, disjunctive patterns presenting knowledge about complementary occurring items were neglected in the literature. Nevertheless, recently a growing number of works is shedding light on their importance for the sake of providing a richer knowledge for users. For this purpose, we propose in this paper a new tool, called GARM, aiming at building a partially ordered structure amongst some particular disjunctive patterns, namely the disjunctive closed ones. Starting from this structure, deriving generalized association rules, i.e., those offering conjunctive, disjunctive and negative connectors between items, becomes straightforward. Our experimental study put the focus on the mining performances as well as the quantitative aspect and proved the utility of the proposed approach.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <label>1</label>
      <title>Introduction</title>
      <p>
        Association rule mining is a fundamental topic in Data mining [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. It has been
extensively investigated since its inception. Its key idea consists in looking for causal
relationships between sets of items, commonly called itemsets, where the presence of some
items suggests that others follow from them. A typical example of a successful
application of association rules is the market basket analysis, where the discovered rules can
lead to important marketing and management strategic decisions. Recently, mining
association rules was extended to various pattern classes like sequential patterns, graphs,
etc. Nevertheless, the main criticism that can be leveled at contributions related to
association rules is their focus on co-occurrences between items [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], probably as a
heritage of the market basket analysis framework. Indeed, almost all related works neglect
the other kinds of relations, like mutually exclusive occurrences [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], that can also bring
valuable information for users.
      </p>
      <p>
        In this paper, we propose a new tool, called GARM<sup>1</sup>, covering the whole process
of extracting generalized association rules. These generalize classical
rules (positive rules) by offering disjunctive and negative connectors between items,
in addition to the conjunctive one [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>1 GARM is the acronym of generalized association rule miner.</title>
      <p>
        Our tool includes a first component that
extracts a concise representation of frequent patterns based on disjunctive
patterns. Thanks to a second component, these patterns are partially ordered w.r.t. set
inclusion. Once the partially ordered structure is obtained, generalized association rules
can be easily derived thanks to the last component of our tool.
      </p>
      <p>
        Notably, extracting an exact concise representation of frequent patterns in the
first component of the process makes it possible to derive exactly the different supports
of each frequent pattern. This enables us to compute the exact values of
quality measures. Indeed, it was shown in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] that almost all interestingness measures for
association rules are expressed depending on the support of the rule and those of its
associated premise and conclusion. In addition, using disjunctive patterns – in
particular closed and essential patterns [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] – will provide an interesting starting point towards
mining association rules conveying complementary occurrences between items, rather
than co-occurrences. Indeed, the latter relationships (co-occurrences between literals<sup>2</sup>)
were explored in depth in the literature through association rules having conjunctions
of literals, called literalsets, in premise and conclusion. This leads to what is commonly
known as positive and negative association rules. Disjunctive association rules, by contrast,
have only recently begun to attract the interest of researchers.
      </p>
      <p>
        In general, generalized association rules are useful in many applications. In
particular, disjunctive association rules – having disjunction of items either in premise or in
conclusion – were considered for two main purposes: On the one hand, they were used
as an intermediate step for defining some concise representations for frequent patterns
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. On the other hand, they were exploited to provide users with new forms of
association rules [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. For example, the added-value of such association rules has been
recently highlighted in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. It is however important to note that generalized association
rules can be considered as particular GUHA rules [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Note that we restrict ourselves in this work to disjunctive closed patterns whose
smallest seeds, i.e., essential patterns, are frequent with respect to a minimum
conjunctive support threshold. This choice is motivated by our aim of retaining the spirit of
association rule mining, where this threshold, as well as the confidence-based one, is
used to dramatically limit the number of extracted association rules. In addition, the use
of a partially ordered structure makes it possible to select representative subsets of
rules to be extracted. This nucleus of rules is of paramount help to avoid
overwhelming users with very large rule lists.</p>
      <p>The remainder of the paper is organized as follows. The next section discusses the
related work. Section 3 recalls the key notions used throughout this paper. The
structural properties of the disjunctive search space are explored in Section 4, followed in Section 5 by a
detailed description of the GARM tool, which offers a complete process
for the extraction of generalized association rules. Experimental results
focusing on the mining time as well as the quantitative aspect are reported and discussed
in Section 6. Section 7 concludes the paper and points out future works.</p>
    </sec>
    <sec id="sec-3">
      <title>2 A literal is an item or the negation of an item.</title>
      <sec id="sec-3-1">
        <label>2</label>
        <title>Related Work</title>
        <p>
          Contributions related to association rule mining mainly concentrated on the classical
rule form, namely that presenting conjunction of items in both premise and conclusion
parts. In this respect, many concise representations for such rules were proposed in the
literature [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Recently, some works focused on introducing negative items.
However, since most items are absent from any given transaction, the number of
association rules with negation explodes. Thus, existing approaches have tried to
address this problem through the use of additional background information about the
data, attribute correlations, and additional rule interestingness measures.
Here, we mainly detail the small number of related works on association
rules that rely on the disjunctive connector between items.
        </p>
        <p>
          Some works [
          <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
          ] were interested in using the disjunction connector within
association rule mining to define what are called generalized association rules.
These rules attracted the interest of many researchers since they offer richer types of
knowledge in many applications. In addition to the inclusive disjunction operator, i.e.,
the operator ∨, Nanavati et al. in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] were also interested in the exclusive disjunction
operator, denoted ⊕. The authors hence proposed two kinds of rules which are the
simple disjunctive rules and the generalized disjunctive ones. Simple disjunctive rules
are those having either the premise or the conclusion (but not both)
composed of a disjunction of items. This disjunction can be inclusive (the simultaneous
occurrence of items is possible) or exclusive (two distinct items cannot occur together).
On the other hand, generalized disjunctive rules are disjunctive rules whose premises
or conclusions contain a conjunction of disjunctions. These disjunctions can either be
inclusive or exclusive. In [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], the author mainly focuses on extracting association rules
having conclusions containing mutually exclusive items, i.e., the presence of one of
them leads to the absence of the others, which is expressed in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] using the operator ⊕.
Other forms of generalized association rules were also described in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. In [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], Shima
et al. extract what they called disjunctive closed rules. In their work, a disjunctive closed
rule simply stands for a clause under the disjunctive normal form (DNF) such that its
disjuncts are constituted by frequent closed patterns. Elble et al. used disjunctive rules
to handle numerical attributes by considering disjunctions between intervals [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. This
latter work extends other ones taking also into account categorical attributes (see [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]
for references). Finally, it is worth noting that the disjunction connector has also been
used to define some concise representations of frequent patterns through the so-called
disjunctive rule (see for example [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] for references).
        </p>
      </sec>
      <sec id="sec-3-2">
        <label>3</label>
        <title>Basic Concepts</title>
        <p>In this section, we briefly sketch the key notions that will be of use throughout the paper.</p>
        <p>Definition 1. An extraction context is a triplet K = (O, ℐ, R) where O and ℐ are,
respectively, a finite set of objects (or transactions) and a finite set of items (or attributes), and R ⊆
O × ℐ is a binary relation between the objects and items. A couple (o, i) ∈ R denotes
that the object o ∈ O contains the item i ∈ ℐ.</p>
        <p>Example 1. We will consider in the remainder a context that consists of transactions
(1, AB), (2, ACD), (3, CDE), (4, DEF), (5, ABCDE), and (6, ABC)<sup>3</sup>.</p>
        <p>Definition 2. (SUPPORTS OF A PATTERN) Let K = (O, ℐ, R) be a context and I ⊆ ℐ be a
pattern. We mainly distinguish three kinds of supports related to I:</p>
        <list list-type="bullet">
          <list-item><p>the conjunctive support Supp(∧I) = |{o ∈ O | (∀ i ∈ I, (o, i) ∈ R)}|, i.e., the number of objects containing all items of I;</p></list-item>
          <list-item><p>the disjunctive support Supp(∨I) = |{o ∈ O | (∃ i ∈ I, (o, i) ∈ R)}|, i.e., the number of objects containing at least one item of I;</p></list-item>
          <list-item><p>the negative support Supp(¬I) = |{o ∈ O | (∀ i ∈ I, (o, i) ∉ R)}|, i.e., the number of objects that do not contain any item of I.</p></list-item>
        </list>
        <p>Note also that Supp(∨I) and Supp(¬I) are two complementary quantities w.r.t. |O| in
the sense that Supp(∨I) + Supp(¬I) = |O|.</p>
        <p>Example 2. Consider our running context. We have Supp(∧CDE) = |{3, 5}| = 2,
Supp(∨CDE) = |{2, 3, 4, 5, 6}| = 5 and Supp(¬CDE) = |{1}| = 1.</p>
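        <p>As an illustration, the three supports can be computed by a direct scan of the running context. The following Python sketch is not part of GARM (whose implementation is in C++); the names CONTEXT, conjunctive_support, disjunctive_support and negative_support are assumptions made for this example only.</p>
        <preformat>
```python
# Running context of Example 1: object identifier -> set of items.
CONTEXT = {
    1: set("AB"),
    2: set("ACD"),
    3: set("CDE"),
    4: set("DEF"),
    5: set("ABCDE"),
    6: set("ABC"),
}


def conjunctive_support(pattern, context=CONTEXT):
    """Number of objects containing every item of the pattern."""
    return sum(1 for items in context.values() if set(pattern).issubset(items))


def disjunctive_support(pattern, context=CONTEXT):
    """Number of objects containing at least one item of the pattern."""
    return sum(1 for items in context.values() if set(pattern).intersection(items))


def negative_support(pattern, context=CONTEXT):
    """Number of objects containing no item of the pattern."""
    return sum(1 for items in context.values() if set(pattern).isdisjoint(items))


# Values of Example 2 for the pattern CDE.
print(conjunctive_support("CDE"), disjunctive_support("CDE"), negative_support("CDE"))
```
        </preformat>
        <p>For every pattern, the disjunctive and negative supports returned by this sketch add up to the number of objects, as stated above.</p>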
        <p>Hereafter, Supp(∧I) will simply be denoted Supp(I). In addition, if there is no risk of
confusion, the conjunctive support will simply be called support. A pattern I is said to
be frequent if Supp(I) is greater than or equal to a minimum support threshold, denoted
minsupp. Since the set of frequent patterns is an order ideal, the set of items ℐ will be
considered as containing only frequent items. Lemma 1 states that conjunctive supports
can be derived from disjunctive ones.</p>
        <p>
          Lemma 1. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] Let I ⊆ ℐ. The following equality holds:
        </p>
        <p>
          <disp-formula><tex-math>\mathit{Supp}(I) = \sum_{\emptyset \subset I' \subseteq I} (-1)^{|I'| - 1}\, \mathit{Supp}(\vee I')</tex-math></disp-formula>
        </p>
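        <p>The inclusion-exclusion identity of Lemma 1 can be checked numerically on the running context. The sketch below is only an illustration under that assumption; the function names are hypothetical, and the brute-force enumeration of subsets is not how GARM derives supports.</p>
        <preformat>
```python
from itertools import combinations

# Running context of Example 1: object identifier -> items.
CONTEXT = {1: "AB", 2: "ACD", 3: "CDE", 4: "DEF", 5: "ABCDE", 6: "ABC"}
ITEMS = sorted(set("".join(CONTEXT.values())))


def conjunctive_support(pattern):
    """Direct count of the objects containing every item of the pattern."""
    return sum(1 for items in CONTEXT.values() if set(pattern).issubset(items))


def disjunctive_support(pattern):
    """Direct count of the objects containing at least one item of the pattern."""
    return sum(1 for items in CONTEXT.values() if set(pattern).intersection(items))


def conjunctive_from_disjunctive(pattern):
    """Lemma 1: alternating sum of the disjunctive supports of all non-empty subsets."""
    items = sorted(set(pattern))
    total = 0
    for size in range(1, len(items) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(items, size):
            total += sign * disjunctive_support(subset)
    return total


# 4 + 4 + 3 - (5 + 5 + 4) + 5 = 2, the conjunctive support of CDE.
print(conjunctive_from_disjunctive("CDE"))
```
        </preformat>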
      </sec>
      <sec id="sec-3-3">
        <label>4</label>
        <title>Structural Properties of the Disjunctive Search Space</title>
        <p>
          In this section, we will characterize disjunctive patterns through the associated
equivalence classes induced by the following closure operator:
Definition 3. Let K = (O, ℐ, R) be an extraction context. The disjunctive closure
operator h is defined as follows [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]:
          <disp-formula><tex-math>h : \mathcal{P}(\mathcal{I}) \rightarrow \mathcal{P}(\mathcal{I}), \quad I \mapsto h(I) = \{ i \in \mathcal{I} \mid (\forall o \in \mathcal{O})\ \bigl( (o, i) \in \mathcal{R} \Rightarrow (\exists i_1 \in I)\ ((o, i_1) \in \mathcal{R}) \bigr) \}</tex-math></disp-formula>
        </p>
        <p>
          The disjunctive closure h(I) of a pattern I is equal to the maximal set of items which
only appear in the transactions that contain at least one item of I. The closure operator h
induces an equivalence relation on the power set of ℐ, which partitions it into so-called
disjunctive equivalence classes. In each class, all the elements have the same
disjunctive support. The smallest incomparable elements, w.r.t. set inclusion, of a disjunctive
equivalence class are called essential patterns, while the disjunctive closed pattern is the
largest one [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. These particular patterns are defined as follows.
          <fn><label>3</label><p>We use a separator-free form for the sets, e.g., ABC stands for the set of items {A, B, C}.</p></fn>
        </p>
        <sec id="sec-3-3-1">
          <title>Definition 4.</title>
          <p>• A pattern I ⊆ ℐ is a disjunctive closed pattern if I = h(I) or, equivalently, Supp(∨I)
&lt; min{Supp(∨I′) | I′ ⊆ ℐ s.t. I ⊂ I′}.
• A pattern I ⊆ ℐ is an essential pattern if ∀ I′ ⊂ I, I ⊈ h(I′) or, equivalently, Supp(∨I)
&gt; max{Supp(∨I′) | I′ ⊆ ℐ s.t. I′ ⊂ I}.</p>
          <p>Example 3. Consider our running context. The pattern CDEF is disjunctively closed,
while BE is not, since Supp(∨ BE ) = Supp(∨ BEF ). On the other hand, the pattern AC
is essential, while DE is not, since Supp(∨ DE ) = Supp(∨ D ).</p>
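          <p>The two notions of Definition 4 can be illustrated by a small brute-force sketch on the running context. The helper names (covered, closure, is_essential) are assumptions made for this example. Since the disjunctive support is monotone w.r.t. set inclusion, the essentiality test below only compares a pattern with its subsets obtained by removing one item.</p>
          <preformat>
```python
# Running context of Example 1: object identifier -> items.
CONTEXT = {1: "AB", 2: "ACD", 3: "CDE", 4: "DEF", 5: "ABCDE", 6: "ABC"}
ITEMS = sorted(set("".join(CONTEXT.values())))


def covered(pattern):
    """Objects containing at least one item of the pattern."""
    return {o for o, items in CONTEXT.items() if set(pattern).intersection(items)}


def closure(pattern):
    """Disjunctive closure h: items appearing only in objects covered by the pattern."""
    objects = covered(pattern)
    return "".join(i for i in ITEMS if covered(i).issubset(objects))


def is_closed(pattern):
    return closure(pattern) == "".join(sorted(set(pattern)))


def is_essential(pattern):
    """Removing any single item must strictly decrease the disjunctive support."""
    full = len(covered(pattern))
    return all(full > len(covered(set(pattern) - {i})) for i in pattern)


# Example 3: CDEF is closed, BE is not (h(BE) = BEF); AC is essential, DE is not.
print(closure("BE"), is_closed("CDEF"), is_essential("AC"), is_essential("DE"))
```
          </preformat>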
          <p>In the remainder, FEP<sub>K</sub><sup>4</sup> denotes the set of frequent essential patterns associated
with a given context K and a fixed minsupp value. The associated set of disjunctive closures
is denoted EDCP<sub>K</sub><sup>5</sup>. This set is hence equal to {h(I) | I ∈ FEP<sub>K</sub>}.</p>
          <p>
            To establish the link with conjunctive equivalence classes – gathering patterns having
the same Galois closure [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ] – we notice that essential patterns (resp. disjunctive closed
patterns) are equivalent to minimal generators aka free-sets (resp. closed patterns) (see
[
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] for references). These latter patterns were at the basis of the main concise
representations of association rules that were proposed in the literature [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ]. This clearly
motivates the use of their counterparts within the disjunctive search space.
          </p>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <label>5</label>
        <title>Detailed Description of the GARM Tool</title>
        <p>As mentioned in the first section, the GARM tool is composed of three
complementary components which are as follows: (i) Extracting an exact concise representation
of frequent patterns based on disjunctive closed patterns and frequent essential ones.
(ii) Building a partially ordered structure w.r.t. set inclusion within disjunctive closed
patterns. Each one of these latter will be accompanied by its set of frequent essential
patterns. (iii) Deriving generalized association rules from the built structure.
        </p>
        <sec id="sec-3-4-1">
          <label>5.1</label>
          <title>Extracting a New Concise Representation based on Disjunctive Patterns</title>
          <p>
            Our representation is based on the sets FEP<sub>K</sub> and EDCP<sub>K</sub>, as stated by Theorem 1.
Theorem 1. The set EDCP<sub>K</sub> ∪ FEP<sub>K</sub> is an exact concise representation of the set of
frequent patterns FP<sub>K</sub> [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ].
          </p>
          <p>Example 4. Figure 1 (Left) lists the set of disjunctive closed patterns associated with
the running context. For each closed pattern, its disjunctive support and its
frequent essential patterns, for minsupp = 1, are also given.</p>
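          <p>For a context as small as the running one, the content of Figure 1 (Left) can be recomputed by exhaustive enumeration. The sketch below is such a brute-force illustration and should not be read as the DCPR MINER algorithm; build_representation and the other names are assumptions made for this example.</p>
          <preformat>
```python
from itertools import combinations

# Running context of Example 1: object identifier -> items.
CONTEXT = {1: "AB", 2: "ACD", 3: "CDE", 4: "DEF", 5: "ABCDE", 6: "ABC"}
ITEMS = sorted(set("".join(CONTEXT.values())))


def covered(pattern):
    """Objects containing at least one item of the pattern."""
    return {o for o, items in CONTEXT.items() if set(pattern).intersection(items)}


def conjunctive_support(pattern):
    return sum(1 for items in CONTEXT.values() if set(pattern).issubset(items))


def closure(pattern):
    """Disjunctive closure of the pattern, written in separator-free form."""
    objects = covered(pattern)
    return "".join(i for i in ITEMS if covered(i).issubset(objects))


def is_essential(pattern):
    full = len(covered(pattern))
    return all(full > len(covered(set(pattern) - {i})) for i in pattern)


def build_representation(minsupp):
    """Map each closure to (disjunctive support, list of frequent essential patterns)."""
    representation = {}
    for size in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            if conjunctive_support(combo) >= minsupp and is_essential(combo):
                closed = closure(combo)
                entry = representation.setdefault(closed, (len(covered(closed)), []))
                entry[1].append("".join(combo))
    return representation


REPRESENTATION = build_representation(minsupp=1)
for closed, (support, essentials) in REPRESENTATION.items():
    print(closed, support, essentials)
```
          </preformat>
          <p>With minsupp = 1, an essential pattern such as AF is discarded because its conjunctive support is 0, so its closure ABF does not appear in the result.</p>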
          <p>
            This representation will be denoted DSSR<sub>K</sub><sup>6</sup>. It is extracted thanks to an
adaptation of our DCPR MINER<sup>7</sup> algorithm [
            <xref ref-type="bibr" rid="ref17">17</xref>
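            ], which constitutes the first component of the GARM tool.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Stands for frequent essential patterns. 5 Stands for essential disjunctive closed patterns. 6 Stands for disjunctive search space-based representation. 7 DCPR MINER is the acronym of disjunctive closed pattern-based representation miner.</title>
      <sec id="sec-4-1">
        <title>Figure 1. (Left) Disjunctive closed patterns of the running context with their disjunctive supports and frequent essential patterns for minsupp = 1. (Right) The associated partially ordered structure.</title>
        <table-wrap>
          <table>
            <thead>
              <tr><th>EDCP<sub>K</sub></th><th>Disj. Supp.</th><th>FEP<sub>K</sub></th></tr>
            </thead>
            <tbody>
              <tr><td>B</td><td>3</td><td>B</td></tr>
              <tr><td>C</td><td>4</td><td>C</td></tr>
              <tr><td>F</td><td>1</td><td>F</td></tr>
              <tr><td>AB</td><td>4</td><td>A</td></tr>
              <tr><td>EF</td><td>3</td><td>E</td></tr>
              <tr><td>ABC</td><td>5</td><td>AC, BC</td></tr>
              <tr><td>BEF</td><td>5</td><td>BE</td></tr>
              <tr><td>DEF</td><td>4</td><td>D</td></tr>
              <tr><td>CDEF</td><td>5</td><td>CD, CE</td></tr>
              <tr><td>ABCDEF</td><td>6</td><td>AD, AE, BD, BCE</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Nodes of the partially ordered structure (Right), each written as ({frequent essential patterns}: disjunctive closed pattern, disjunctive support):</p>
        <list list-type="simple">
          <list-item><p>({AD, AE, BD, BCE}: ABCDEF, 6)</p></list-item>
          <list-item><p>({AC, BC}: ABC, 5) ({BE}: BEF, 5) ({CD, CE}: CDEF, 5)</p></list-item>
          <list-item><p>({A}: AB, 4)</p></list-item>
          <list-item><p>({D}: DEF, 4)</p></list-item>
          <list-item><p>({E}: EF, 3)</p></list-item>
          <list-item><p>({B}: B, 3)</p></list-item>
          <list-item><p>({C}: C, 4)</p></list-item>
          <list-item><p>({F}: F, 1)</p></list-item>
          <list-item><p>∅</p></list-item>
        </list>
        <p>
          Starting from DSSR<sub>K</sub>, the conjunctive and negative supports of frequent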
patterns can thus be deduced using disjunctive supports. This representation also allows
the derivation of the support of each literalset whose positive variation is based on a
frequent pattern. This is carried out using the following formula [
          <xref ref-type="bibr" rid="ref4">4</xref>
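          ]:
          <disp-formula><tex-math>\mathit{Supp}(x_1 \wedge x_2 \wedge \ldots \wedge x_n \wedge \neg y_1 \wedge \ldots \wedge \neg y_m) = \sum_{S \subseteq \{y_1, \ldots, y_m\}} (-1)^{|S|}\, \mathit{Supp}(x_1 \wedge x_2 \wedge \ldots \wedge x_n \wedge S)</tex-math></disp-formula>
          where ∧ S stands for the conjunction of the items of S, and such
that the positive variation of the literalset, namely {x1, x2, . . ., xn, y1, y2, . . ., ym}, belongs to FP<sub>K</sub>.
        </p>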
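        <p>As a sanity check of this formula, the support of a literalset can be computed both by inclusion-exclusion over its negated items and by a direct scan. The sketch below is an illustration only: it counts conjunctive supports directly in the running context rather than deriving them from DSSR<sub>K</sub>, and the function names are assumptions.</p>
        <preformat>
```python
from itertools import combinations

# Running context of Example 1: object identifier -> items.
CONTEXT = {1: "AB", 2: "ACD", 3: "CDE", 4: "DEF", 5: "ABCDE", 6: "ABC"}


def conjunctive_support(pattern):
    return sum(1 for items in CONTEXT.values() if set(pattern).issubset(items))


def literalset_support(positive, negative):
    """Support of x1 and ... and xn and not y1 and ... and not ym by inclusion-exclusion."""
    negated = sorted(set(negative))
    total = 0
    for size in range(len(negated) + 1):
        for subset in combinations(negated, size):
            total += (-1) ** size * conjunctive_support(set(positive).union(subset))
    return total


def literalset_support_direct(positive, negative):
    """Direct count: objects with every positive item and no negated item."""
    return sum(
        1
        for items in CONTEXT.values()
        if set(positive).issubset(items) and set(negative).isdisjoint(items)
    )


# Supp(C and not A and not B) = Supp(C) - Supp(AC) - Supp(BC) + Supp(ABC) = 4 - 3 - 2 + 2.
print(literalset_support("C", "AB"), literalset_support_direct("C", "AB"))
```
        </preformat>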
      </sec>
      <sec id="sec-4-2">
        <label>5.2</label>
        <title>Building the Partially Ordered Structure</title>
        <p>
          In this section, we propose a new algorithm, called POSB<sup>8</sup>, for partially ordering
disjunctive closed patterns w.r.t. set inclusion. The POSB algorithm takes as
input the representation DSSR<sub>K</sub>, in which each disjunctive closed pattern is associated
with its set of frequent essential patterns and its disjunctive support. A node in the partially
ordered structure is associated with each disjunctive closed pattern. The
pseudocode of POSB is given in Algorithm 1. Our algorithm inherits two main optimizations
used in the algorithm proposed by Valtchev et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], namely the sorting of disjunctive
closed patterns and the use of a border. Indeed, the set of disjunctive closed patterns
EDCP<sub>K</sub> is sorted by increasing pattern size. Since closures of equal size cannot be
comparable, this sorting avoids unnecessary comparisons. In addition, it ensures
that the closure f being processed is at least as large as every closure already processed. Thus,
it suffices to find its lower cover among the nodes already inserted in the structure. This lower
cover is composed of those closures which are immediately covered by f.
        </p>
        <p>On the other hand, the border B is an anti-chain w.r.t. set inclusion containing
maximal closures among those already treated. In fact, the Valtchev et al. algorithm
constructs the Hasse diagram representing the subset-superset relationship among concepts
in the Galois lattice. It begins at the top of the lattice and then recursively identifies the
lower neighbors of each concept. Nevertheless, it is not directly applicable to our
situation.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>8 POSB is the acronym of partially ordered structure builder.</title>
      <sec id="sec-5-2">
        <title>Algorithm 1: POSB</title>
        <preformat>Input: The set EDCP<sub>K</sub> of disjunctive closed patterns.
Output: The disjunctive closed patterns ordered by set inclusion.
Begin
  B := ∅;
  Foreach (f ∈ EDCP<sub>K</sub>) do
    Prohibited_List := ∅;
    Foreach (b ∈ B) do
      inter := b ∩ f;
      If (inter = b) then
        LOWER_COVER_INSERTION(f, b);
        B := B \ b;
      Else If (inter ≠ ∅) then
        LOWER_COVER_MANAGEMENT(f, b);
    B := B ∪ f;
End</preformat>
        <p>
          The algorithm of Valtchev et al. is not directly applicable because, although the intersection of two disjunctive closed patterns is obviously
a disjunctive closed pattern, it does not necessarily belong to EDCP<sub>K</sub>. This is
due to the fact that all its essential patterns may be infrequent, in which case it has
already been pruned. By contrast, the algorithm proposed in [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] relies on the fact that the
intersection of two concepts was already treated and it suffices to locate the corresponding
node within the Hasse diagram.
        </p>
        <p>In Algorithm 1, disjunctive closed patterns are inserted one at a time into a
partially built structure until the entire structure is obtained. Let f be the current
disjunctive closed pattern to be inserted in the partially ordered structure. f is
compared to the elements of the border B. If an element b ∈ B is included in f, then it is an
element of its lower cover. A link between the node representing b and that representing
f is constructed by the LOWER_COVER_INSERTION procedure (cf. Algorithm 2).
The element b is then deleted from the border. If b is not included in f but
their intersection is not empty, then the LOWER_COVER_MANAGEMENT procedure
identifies the common immediate predecessors of b and f (cf. Algorithm 3). Finally, f
is added to the border. It is important to note that, in the LOWER_COVER_MANAGEMENT
procedure, a prohibited list is associated with each disjunctive closed pattern to be
inserted in the partially ordered structure. Indeed, when updating the precedence links
between disjunctive closed patterns, a node can be visited more than once since it can
be an immediate predecessor of many other nodes. The list avoids such useless
processing by only allowing the visit of nodes that do not belong to it.</p>
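        <p>To make the interplay of the three procedures concrete, the following Python sketch transcribes Algorithms 1 to 3 as they are described above. It is an illustration rather than the C++ implementation of GARM: the data structures (a list for the border, a dictionary of sets for the lower covers), the tie-breaking order among closures of equal size and all function names are assumptions.</p>
        <preformat>
```python
def lower_cover_insertion(lower_cover, f, pred):
    """Algorithm 2: insert pred in the lower cover of f, keeping only immediate predecessors."""
    for l in list(lower_cover[f]):
        inter = l.intersection(pred)
        if inter == pred:
            return  # pred lies below (or equals) l, so it is not an immediate predecessor
        if inter == l:
            lower_cover[f].discard(l)  # l lies below pred and is no longer immediate
    lower_cover[f].add(pred)


def lower_cover_management(lower_cover, f, b, prohibited):
    """Algorithm 3: look below b for closures included in f."""
    for pred_b in list(lower_cover[b]):
        if pred_b not in prohibited:
            inter = pred_b.intersection(f)
            if inter == pred_b:
                lower_cover_insertion(lower_cover, f, pred_b)
            elif inter:
                lower_cover_management(lower_cover, f, pred_b, prohibited)
            prohibited.add(pred_b)


def posb(closed_patterns):
    """Algorithm 1: return the lower cover of every disjunctive closed pattern."""
    ordered = sorted(
        (frozenset(p) for p in closed_patterns), key=lambda p: (len(p), sorted(p))
    )
    lower_cover = {f: set() for f in ordered}
    border = []
    for f in ordered:
        prohibited = set()
        for b in list(border):
            inter = b.intersection(f)
            if inter == b:
                lower_cover_insertion(lower_cover, f, b)
                border.remove(b)
            elif inter:
                lower_cover_management(lower_cover, f, b, prohibited)
        border.append(f)
    return lower_cover


# Disjunctive closed patterns of the running context (Figure 1, Left).
EDCP = ["B", "C", "F", "AB", "EF", "ABC", "BEF", "DEF", "CDEF", "ABCDEF"]
COVERS = posb(EDCP)
for f in sorted(COVERS, key=lambda p: (len(p), sorted(p))):
    print("".join(sorted(f)), sorted("".join(sorted(p)) for p in COVERS[f]))
```
        </preformat>
        <p>On the running context, this sketch reports ABC, BEF and CDEF as the immediate predecessors of ABCDEF.</p>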
        <p>Example 5. The structure associated with our running context is given in Figure 1 (Right).</p>
      </sec>
      <sec id="sec-5-3">
        <label>5.3</label>
        <title>Deriving Generalized Association Rules</title>
        <p>Once the partially ordered structure is built, (subsets of) generalized association
rules can easily be derived. An association rule R: X ⇒ Y based on a pattern Z, denoted
Z-based rule, is such that X = {x1, x2, . . . , xn} ⊆ ℐ and Y = {y1, y2, . . . , ym} ⊆ ℐ are
two patterns, X ∩ Y = ∅, and X ∪ Y = Z. An association rule is usually considered
interesting w.r.t. two statistical measures, namely the support and the confidence. The
formulae of these measures for an arbitrary rule are as follows:</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Algorithm 2: LOWER COVER INSERTION</title>
      <preformat>Input: A disjunctive closure f, and an element pred to be inserted in its lower cover.
Output: The updated lower cover of f.
Begin
  Foreach (l ∈ Lower_Cover(f)) do
    inter := l ∩ pred;
    If (inter = pred) then
      return;
    Else If (inter = l) then
      Lower_Cover(f) := Lower_Cover(f) \ l;
  Lower_Cover(f) := Lower_Cover(f) ∪ pred;
End</preformat>
    </sec>
    <sec id="sec-7">
      <title>Algorithm 3: LOWER COVER MANAGEMENT</title>
      <preformat>Input: A disjunctive closed pattern f, and an element b of the border B.
Output: The updated lower cover of f.
Begin
  Foreach (pred_b ∈ Lower_Cover(b)) do
    If (pred_b ∉ Prohibited_List) then
      inter := pred_b ∩ f;
      If (inter = pred_b) then
        LOWER_COVER_INSERTION(f, pred_b);
      Else If (inter ≠ ∅) then
        LOWER_COVER_MANAGEMENT(f, pred_b);
      Prohibited_List := Prohibited_List ∪ pred_b;
End</preformat>
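      <p>
        <disp-formula><tex-math>\mathit{Supp}(X \Rightarrow Y) = \mathit{Supp}(X \wedge Y), \qquad \mathit{Conf}(X \Rightarrow Y) = \frac{\mathit{Supp}(X \wedge Y)}{\mathit{Supp}(X)}</tex-math></disp-formula>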
A rule is said to be exact if its confidence is equal to 1. Otherwise, it is said to be
approximate. In addition, it is said to be interesting or valid if its support and confidence
values are greater than or equal to their respective minimum thresholds minsupp and
minconf. It is clear that whenever we are able to evaluate Supp(X ⇒ Y ), the derivation
of the confidence value will be straightforward.</p>
      <p>Let us now adapt the association rule framework to our context. As shown in
Subsection 5.1, the DSSR<sub>K</sub> representation allows deriving the disjunctive, conjunctive and
negative supports of each set of positive and negative items whose positive variation is
based on a frequent pattern. In the sequel, we present an overview of the process by
which we retrieve generalized association rules and evaluate their associated supports
by traversing the partially ordered structure. Rules can be classified according to
the number of nodes required for their extraction. We distinguish two cases:
1. An intra-node rule requires a single node and highlights relationships between
a frequent essential pattern and its disjunctive closure f (here Z = f).
2. An inter-nodes rule is extracted using two nodes N1 and N2 s.t. the
disjunctive closure associated with N1, denoted f1, is one of the immediate predecessors of that
of N2, denoted f2. Let e1 be a frequent essential pattern of f1. An inter-nodes rule
describes relationships between either f1 and f2 or e1 and f2 (here Z = f2).
Both kinds of rules, intra-node and inter-nodes, can be either exact or approximate.</p>
      <p>
        Different forms of generalized association rules can be extracted starting from our
representation (cf. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] for a detailed description). To limit the number of possible
extracted rule forms, we mainly focus here on the following ones:
1. Form 1, disjunction of items in premise and conclusion, ∨X ⇒ ∨Y: Supp(∨X
⇒ ∨Y) = Supp((∨X) ∧ (∨Y)) = Supp(∨X) + Supp(∨Y) - Supp((∨X) ∨ (∨Y))
= Supp(∨X) + Supp(∨Y) - Supp(∨Z),
2. Form 2, negation of items in premise and conclusion, ¬X ⇒ ¬Y: Supp(¬X ⇒ ¬Y) =
Supp(¬X ∧ ¬Y) = Supp(¬((∨X) ∨ (∨Y))) = Supp(¬Z) = |O| - Supp(∨Z),
3. Form 3, disjunction of items in premise and negation of items in conclusion, ∨X
⇒ ¬Y: Supp(∨X ⇒ ¬Y) = Supp((∨X) ∧ ¬Y) = Supp((∨X) ∨ (∨Y)) - Supp(∨Y) =
Supp(∨Z) - Supp(∨Y), and,
4. Form 4, negation of items in premise and disjunction of items in conclusion, ¬X ⇒
∨Y: Supp(¬X ⇒ ∨Y) = Supp(¬X ∧ (∨Y)) = Supp((∨X) ∨ (∨Y)) - Supp(∨X) =
Supp(∨Z) - Supp(∨X).
      </p>
      <p>In these formulae, ¬X denotes the conjunction of the negations of the items of X,
either X or Y is a frequent essential pattern or a disjunctive closed one, and Z =
X ∪ Y is a disjunctive closed pattern (as described above). For each rule, the support
of Z is known. So is that of either X or Y, since one of them is assumed to be a
frequent essential pattern or a disjunctive closed pattern. For the sake of simplicity, we
assume in the remainder that X is a frequent essential pattern or a disjunctive closed
pattern. Since Y = Z \ X, Y does not necessarily belong to DSSR<sub>K</sub> and may not even
be a frequent pattern. Nevertheless, its disjunctive support is required to evaluate
that of the associated rule. To this end, we bound the support of Y using a lower bound,
denoted lb_Supp, and an upper bound, denoted ub_Supp, computed as follows:
• lb_Supp(∨Y) = max{Supp(∨e) | e ∈ FEP<sub>K</sub> and e ⊆ Y},
• ub_Supp(∨Y) = min{Supp(∨f) | f ∈ EDCP<sub>K</sub> and Y ⊆ f}.</p>
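      <p>The four support formulae only involve the disjunctive supports of X, Y and Z. The sketch below evaluates them on the running context and compares the result with a direct count; it is an illustration under the assumption that all three disjunctive supports are available, and the names rule_supports and direct_counts are hypothetical.</p>
      <preformat>
```python
# Running context of Example 1: object identifier -> items.
CONTEXT = {1: "AB", 2: "ACD", 3: "CDE", 4: "DEF", 5: "ABCDE", 6: "ABC"}


def disjunctive_support(pattern):
    return sum(1 for items in CONTEXT.values() if set(pattern).intersection(items))


def rule_supports(x, y):
    """Supports of Forms 1 to 4 for the rule built on Z = X union Y."""
    n = len(CONTEXT)
    dx, dy = disjunctive_support(x), disjunctive_support(y)
    dz = disjunctive_support(set(x).union(y))
    return {
        "form1": dx + dy - dz,  # (or X) and (or Y)
        "form2": n - dz,        # (not X) and (not Y)
        "form3": dz - dy,       # (or X) and (not Y)
        "form4": dz - dx,       # (not X) and (or Y)
    }


def direct_counts(x, y):
    """Same four quantities, counted object by object."""
    counts = {"form1": 0, "form2": 0, "form3": 0, "form4": 0}
    for items in CONTEXT.values():
        has_x = bool(set(x).intersection(items))
        has_y = bool(set(y).intersection(items))
        counts["form1"] += int(has_x and has_y)
        counts["form2"] += int(not has_x and not has_y)
        counts["form3"] += int(has_x and not has_y)
        counts["form4"] += int(not has_x and has_y)
    return counts


print(rule_supports("ABC", "DEF"))
```
      </preformat>
      <p>For X = ABC and Y = DEF, the four supports sum to |O| = 6, since every object falls into exactly one of the four cases.</p>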
      <p>In this respect, if Y lies between a frequent essential pattern and its
disjunctive closure, then lb_Supp(∨Y) = ub_Supp(∨Y). Hence, the support and
confidence of the associated rule are exactly computed. Otherwise, these measures
are bounded by a minimal and a maximal possible value using the bounds associated
with Y. Such rules, further denoted approximated rules, are defined as follows:
Definition 5. An association rule is said to be approximated if either its support
or its confidence is not exactly determined.</p>
      <p>
        Then, only valid rules having minimum possible values of support and confidence
greater than or equal to minsupp and minconf, respectively, are retained. Note that
an approximated rule is different from an approximate rule in the sense that the latter
has its support and confidence exactly computed (with a confidence not equal to 1),
which is not the case of the former. In this respect, approximated rules were shown to
convey interesting knowledge in the case of positive rules (see for example [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]).
      </p>
      <p>It is worth noting that the bounds lb_Supp(∨ Y ) and ub_Supp(∨ Y ) always exist. Indeed, on
the one hand, since the set of items I is pruned w.r.t. minsupp, Y is composed
of frequent items even if it is itself infrequent. These items belong to FEPK, which
ensures the existence of the lower bound. On the other hand, Y is covered by at least one
disjunctive closed pattern, namely Z, which ensures the existence of the upper bound.</p>
      <p>Example 6. Let minsupp = 1 and minconf = 0.7. Consider the intra-node rule R1
of Form 1 based on the disjunctive closed pattern ABCDEF and its frequent essential
pattern BCE: ∨ BCE ⇒ ∨ ADF. Supp(R1) = Supp(∨ BCE) + Supp(∨ ADF) - Supp(∨
ABCDEF) = Supp(∨ ADF) (since h(BCE) = ABCDEF). Since ADF ∉ DSSRK, we
need to evaluate its support. Since AD ⊆ ADF ⊆ h(AD) = ABCDEF (cf. Figure 1
(Left)), lb_Supp(∨ ADF) = ub_Supp(∨ ADF) = 6. Hence, Supp(R1) = 6 and
Conf(R1) = 6/6 = 1, so R1 is a valid rule. Now, consider the inter-nodes rule R2 of Form
1 based on ABCDEF and one of its immediate predecessors, namely ABC (cf. Figure 1
(Right)): ∨ ABC ⇒ ∨ DEF. In this case, DEF ∈ EDCPK. Hence, Supp(R2) = Supp(∨
ABC) + Supp(∨ DEF) - Supp(∨ ABCDEF) = 5 + 4 - 6 = 3, and Conf(R2) = 3/5 = 0.6.
Here, we took X = ABC. If we set Y = ABC instead, then the associated rule R3: ∨ DEF
⇒ ∨ ABC has the same support as R2. Nevertheless, its confidence is equal to
3/4 = 0.75. Hence, R3 is a valid rule while R2 is not.</p>
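      <p>The arithmetic of Example 6 can be checked with a short script. The disjunctive supports are those quoted in the example; the function name and the layout of the dictionary are assumptions made for this sketch.</p>
      <preformat>
```python
MINSUPP, MINCONF = 1, 0.7


def form1_measures(supp_premise, supp_conclusion, supp_closed):
    """Support and confidence of a Form 1 rule from disjunctive supports."""
    supp = supp_premise + supp_conclusion - supp_closed
    return supp, supp / supp_premise


# (Supp of premise, Supp of conclusion, Supp of ABCDEF) for each rule.
RULES = {
    "R1": (6, 6, 6),  # or BCE implies or ADF
    "R2": (5, 4, 6),  # or ABC implies or DEF
    "R3": (4, 5, 6),  # or DEF implies or ABC
}

for name, supports in RULES.items():
    supp, conf = form1_measures(*supports)
    valid = supp >= MINSUPP and conf >= MINCONF
    print(name, supp, conf, valid)
# R1 6 1.0 True
# R2 3 0.6 False
# R3 3 0.75 True
```
      </preformat>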
      <sec id="sec-7-1">
        <title>Experimental Results</title>
        <p>Our experiments<sup>9</sup> focused on the mining time as well as the number of extracted valid
rules w.r.t. their type, i.e., exact, approximate or approximated. They were
carried out on a PC equipped with a 3 GHz Pentium (R) processor and
1.75 GB of main memory, running the GNU/Linux distribution Fedora Core 7 (with
2 GB of swap memory). The executable was generated from our C++ implementation
using the gcc 4.1.2 compiler.</p>
        <p>In the proposed experiments, the minconf value is set to the relative minimum
support value, i.e., minsupp/|O|. Table 1 presents the mining time in seconds of the three
components of GARM. This table shows the efficiency of our tool in
extracting generalized association rules. Indeed, even for low minsupp values, GARM remains
very fast. The share of the total time consumed by each component
closely depends on the context characteristics. Nevertheless, the second and third
components are in general faster than the first one. On the other hand, Table 2 highlights that
the number of extracted rules closely depends on the context density. Indeed, the denser
the context, the larger the associated equivalence classes, and the greater
the number of frequent essential patterns and closed ones. This increases the
number of rules for dense contexts, even for high minsupp values. Interestingly enough,
the number of exact and approximated rules for RETAIL and KOSARAK is equal to 0
for the tested minsupp values. This is due to the fact that, for both contexts, each
essential pattern is equal to its disjunctive closure, which is not the case for the CONNECT and
PUMSB contexts. Note that the variation of the mining time and of the number of extracted rules
with minconf is omitted here due to space limitations.</p>
        <p><sup>9</sup> Test contexts are available at: http://fimi.cs.helsinki.fi/data .</p>
      </sec>
      <sec id="sec-7-2">
        <title>Conclusion and Perspectives</title>
        <p>In this paper, we presented a complete tool, called GARM, for the extraction
of generalized association rules. Our tool is composed of three components. The first
extracts a concise representation of frequent patterns based on disjunctive
closed ones. The second partially orders these closures w.r.t. set
inclusion. Once the structure is built, extracting subsets of generalized association rules
becomes a straightforward task thanks to the last component. The experiments carried out
showed the effectiveness of the proposed tool. It is also worth mentioning that
GARM is easily adaptable to the case where the input is composed of conjunctive
(closed) patterns instead of disjunctive ones.</p>
        <p>
          Avenues for future work mainly address the following points. First, a detailed
comparison of our approach with the general GUHA approach [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] will be carried out.
Second, the relationships between the various rule forms will be studied. The purpose
is to retain only a lossless subset of rules from which the remaining
redundant ones can be derived. Adequate axiomatic systems thus need to be set up.
        </p>
        <p>Acknowledgments: We would like to thank the anonymous reviewers for their helpful
comments and suggestions. We are also grateful to Mrs. Nassima Ben Younes for
fruitful discussions and help in the implementation of the tool. This work is supported by
the French-Tunisian project CMCU-Utique 05G1412.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ceglar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roddick</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>Association mining</article-title>
          .
          <source>ACM Computing Surveys</source>
          , volume
          <volume>38</volume>
          (
          <issue>2</issue>
          ) (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Steinbach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Generalizing the notion of confidence</article-title>
          .
          <source>Knowledge and Information Systems</source>
          , volume
          <volume>12</volume>
          (
          <issue>3</issue>
          ) (
          <year>2007</year>
          )
          <fpage>279</fpage>
          -
          <lpage>299</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Tzanis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berberidis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Mining for mutually exclusive items in transaction databases</article-title>
          .
          <source>International Journal of Data Warehousing and Mining</source>
          , volume
          <volume>3</volume>
          (
          <issue>3</issue>
          ) (
          <year>2007</year>
          )
          <fpage>45</fpage>
          -
          <lpage>59</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Toivonen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Discovery of frequent patterns in large data collections</article-title>
          .
          <source>PhD thesis</source>
          , University of Helsinki, Helsinki, Finland (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
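          5.
          <string-name>
            <surname>Hébert</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crémilleux</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :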
          <article-title>A unified view of objective interestingness measures</article-title>
          .
          <source>In: Proceedings of the 5th International Conference Machine Learning and Data Mining in Pattern Recognition</source>
          , Springer-Verlag, LNCS, volume
          <volume>4571</volume>
          . (
          <year>2007</year>
          )
          <fpage>533</fpage>
          -
          <lpage>547</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hamrouni</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denden</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ben Yahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mephu Nguifo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>A new concise representation of frequent patterns through disjunctive search space</article-title>
          .
          <source>In: Proceedings of the 5th International Conference on Concept Lattices and their Applications</source>
          . (
          <year>2007</year>
          )
          <fpage>50</fpage>
          -
          <lpage>61</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>H.D.</given-names>
          </string-name>
          :
          <article-title>Complementary occurrence and disjunctive rules for market basket analysis in data mining</article-title>
          .
          <source>In: Proceedings of the 2nd IASTED International Conference Information and Knowledge Sharing</source>
          . (
          <year>2003</year>
          )
          <fpage>155</fpage>
          -
          <lpage>157</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Nanavati</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chitrapura</surname>
            ,
            <given-names>K.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joshi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krishnapuram</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Mining generalised disjunctive association rules</article-title>
          .
          <source>In: Proceedings of the 10th International Conference on Information and Knowledge Management</source>
          . (
          <year>2001</year>
          )
          <fpage>482</fpage>
          -
          <lpage>489</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
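          9.
          <string-name>
            <surname>Hájek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havránek</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <source>Mechanizing Hypothesis Formation: Mathematical Foundations for a General Theory</source>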
          . Springer-Verlag (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kryszkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Concise representations of association rules</article-title>
          .
          <source>In: Proceedings of the ESF Exploratory Workshop on Pattern Detection and Discovery in Data Mining</source>
          , Springer-Verlag, LNCS, volume
          <volume>2447</volume>
          . (
          <year>2002</year>
          )
          <fpage>92</fpage>
          -
          <lpage>109</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Grün</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>New forms of association rules</article-title>
          .
          <source>Technical Report TR 1998-15</source>
          , School of Computing Science, Simon Fraser University, Burnaby, BC, Canada (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Shima</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hirata</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harao</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yokoyama</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsuoka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Izumi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Extracting disjunctive closed rules from MRSA data</article-title>
          .
          <source>In: Proceedings of the 1st International Conference on Complex Medical Engineering</source>
          . (
          <year>2005</year>
          )
          <fpage>321</fpage>
          -
          <lpage>325</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Elble</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heeren</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Optimized disjunctive association rules via sampling</article-title>
          .
          <source>In: Proceedings of the 3rd IEEE International Conference on Data Mining</source>
          . (
          <year>2003</year>
          )
          <fpage>43</fpage>
          -
          <lpage>50</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Galambos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simonelli</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Bonferroni-type inequalities with applications</article-title>
          . Springer (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <source>Formal Concept Analysis</source>
          . Springer (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Hamrouni</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denden</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ben Yahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mephu Nguifo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Exploring the disjunctive search space towards discovering new exact concise representations for frequent patterns</article-title>
          .
          <source>Technical report, CRIL-CNRS of Lens</source>
          , Lens, France (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Denden</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamrouni</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ben Yahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Efficient exploration of the disjunctive lattice towards extracting concise representations of frequent patterns</article-title>
          . To appear
          <source>in the Proceedings of the 9th African Conference on Research in Computer Science and Applied Mathematics (in French)</source>
          . (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Valtchev</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missaoui</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lebrun</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>A fast algorithm for building the Hasse diagram of a Galois lattice</article-title>
          .
          <source>In: Proceedings of the Conference on Combinatorics, Computer Science and Applications</source>
          . (
          <year>2000</year>
          )
          <fpage>293</fpage>
          -
          <lpage>306</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Boulicaut</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bykowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rigotti</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Free-sets: A condensed representation of Boolean data for the approximation of frequency queries</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          volume
          <volume>7</volume>
          (
          <issue>1</issue>
          ) (
          <year>2003</year>
          )
          <fpage>5</fpage>
          -
          <lpage>22</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>