<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>New Approach to Mining Fuzzy Association Rule with Linguistic Threshold Based on Hedge Algebras</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Le Anh Phuong</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tran Dinh Khang</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nguyen Vinh Trung</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Hue University of Education, Hue University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Information Technology Center, Hue University of Education, Hue University</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>SoICT, Hanoi University of Science and Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The authors of [2-5] studied and presented quantitative methods for linguistic variables and linguistic thresholds based on fuzzy sets. Chien-Hua Wang and Chin-Tzong Pang proposed an algorithm for mining fuzzy association rules [2]. In this paper, we extend the algorithm proposed in [2] to numerical data and linguistic variables by using hedge algebras.</p>
      </abstract>
      <kwd-group>
        <kwd>fuzzy association rules</kwd>
        <kwd>linguistic threshold</kwd>
        <kwd>hedge algebra</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Mining association rules is one of the important tasks
in the field of data mining.</p>
      <p>Many authors have presented methods and algorithms for mining
association rules with numerical support and confidence values. In practice, however,
these values are often expressed in natural language. Moreover, the importance of each
item is determined not only by its quantity and frequency of occurrence in the
transactions but also by the qualitative evaluation that administrators give to the item
in natural language. Hedge algebras meet the requirement of computing directly
on linguistic values (without fuzzification, using direct calculation
based on the qualitative semantic function and flexible calculation).
Thus, it is necessary to establish a method for mining association rules
with hedge algebras in which the input is a qualitative transaction database and
a qualitative evaluation table of the database items, and the support and confidence
values are also given in natural language.</p>
      <sec id="sec-1-1">
        <title>Association rules</title>
        <p>Let I = {I<sub>1</sub>, I<sub>2</sub>, ..., I<sub>m</sub>} be a set of items. Let D, the task-relevant data, be a set of
database transactions where each transaction T is a set of items such that T ⊆ I.
Each transaction is associated with an identifier, called TID.</p>
        <p>Definition 1. An association rule has the form X → Y, where X ⊆ I,
Y ⊆ I, and X ∩ Y = ∅.</p>
        <p>Definition 2. The support of the association rule X → Y is the probability that X ∪ Y
exists in a transaction in the database D:
<disp-formula><tex-math>\mathrm{support}(X \rightarrow Y) = \frac{|X \cap Y|}{N}</tex-math></disp-formula></p>
        <p>Definition 3. The confidence of the association rule X → Y is the probability
that X ∪ Y exists given that a transaction contains X, i.e.
<disp-formula><tex-math>\mathrm{confidence}(X \rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)} = \frac{|X \cap Y|}{|X|}</tex-math></disp-formula>
where |X| is the number of transactions that include X, |X ∩ Y| is the number
of transactions that include both X and Y, and N is the total number of transactions in the database.</p>
        <p>Mining the association rules of a database means finding all rules whose
support and confidence are no less than the minimum support minsup
and the minimum confidence minconf specified by the user.</p>
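        <p>Definitions 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the four-transaction toy database are assumptions made for the example and do not come from the paper.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Collection, Set


def support(itemset: Set[str], transactions: Collection[Set[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset.issubset(t))
    return hits / len(transactions)  # assumes a non-empty database


def confidence(x: Set[str], y: Set[str], transactions: Collection[Set[str]]) -> float:
    """support(X u Y) / support(X); returns 0.0 when X never occurs."""
    sup_x = support(x, transactions)
    if sup_x == 0:
        return 0.0
    return support(x | y, transactions) / sup_x


# Hypothetical toy database of four transactions (not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

sup_xy = support({"bread", "milk"}, transactions)
conf_xy = confidence({"bread"}, {"milk"}, transactions)
print(f"support = {sup_xy:.2f}, confidence = {conf_xy:.2f}")
```
]]></preformat>
        <p>In this toy database the rule bread → milk holds in 2 of 4 transactions, and bread occurs in 3 of 4, so the confidence is 2/3.</p>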
      </sec>
      <sec id="sec-1-2">
        <title>Hedge algebras (HA)</title>
        <p>Let 𝒳 be a linguistic variable and X be the set of its terms, called the term-domain of
𝒳. For example, if 𝒳 is the rotation speed of an electrical motor and the linguistic hedges used
to describe its speed are Very, More, Possibly, Little, denoted
for short by V, M, P and L respectively, then X = {fast, V fast, M fast, LP fast, L fast,
P fast, L slow, slow, P slow, V slow, ...} ∪ {0, W, 1} is a term-domain of 𝒳.</p>
        <p>It can be considered as an abstract algebra AX = (X, C, H, ≤), where H is
a set of linguistic hedges, which can be regarded as one-argument operations,
≤ is called a semantics-based ordering relation on X, and {W, 0, 1} is a set of
constants in X, with fast and slow being the primary terms of X and W, 0, 1 being
additional elements of X interpreted as the neutral, the least and the greatest
elements, respectively. Denote by hx the result of applying a hedge h ∈ H to x ∈ X and
by H(x) the set of all u ∈ X generated algebraically from x by using hedges in
H, i.e. H(x) = {u : u = h<sub>n</sub>...h<sub>1</sub>x, h<sub>1</sub>, ..., h<sub>n</sub> ∈ H}.</p>
        <p>
          It is natural that there is a demand to transform fuzzy sets defined on a
real interval [a, b], which represent the meaning of terms in a term-domain X,
into [a, b] or, for normalization, into [0, 1]. This defines a mapping of the
term-domain X into [0, 1], called in the algebraic approach a semantically quantifying
mapping. We use these mappings to define a notion of fuzziness
measure. Consider a mapping f from X into [0, 1] that preserves the
ordering relation on X. Then the “size” of the set H(x), for x ∈ X, can be
measured by the diameter of f(H(x)) ⊆ [0, 1]. This diameter is
taken as the fuzziness measure of the term x. With this model of fuzziness
measure in mind, we may adopt the following definition:
        </p>
        <p>
          Let AX = (X, C, H, ≤) be a linear HA and let fm: X → [0, 1] be a
fuzziness measure of terms in X.
        </p>
        <p>Definition 4. For each x ∈ X, the length of x is denoted by |x| and defined
as follows:
1) if x = c<sup>+</sup> or x = c<sup>−</sup> then |x| = 1;
2) if x = hx′ then |x| = 1 + |x′|, for all h ∈ H.</p>
        <p>Proposition 1. The fuzziness measure fm and the fuzziness measure of a hedge
h, denoted by μ(h), ∀h ∈ H, have the following properties:
1) fm(hx) = μ(h) × fm(x), ∀x ∈ X;
2) fm(c<sup>+</sup>) + fm(c<sup>−</sup>) = 1;
3) <inline-formula><tex-math>\sum_{-q \le i \le p,\ i \ne 0} fm(h_i c) = fm(c)</tex-math></inline-formula>, c ∈ {c<sup>+</sup>, c<sup>−</sup>};
4) <inline-formula><tex-math>\sum_{-q \le i \le p,\ i \ne 0} fm(h_i x) = fm(x)</tex-math></inline-formula>;
5) <inline-formula><tex-math>\sum_{-q \le i \le -1} \mu(h_i) = \alpha</tex-math></inline-formula>, <inline-formula><tex-math>\sum_{1 \le j \le p} \mu(h_j) = \beta</tex-math></inline-formula>, α + β = 1.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Algorithm</title>
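      <p>Step 1: Identify minsup and minconf from the pre-defined linguistic thresholds.
+ Calculate the fuzziness measure of the variable X: fm(X);
+ Identify the fuzziness interval of X: I(X) = [a, b];
+ Calculate the fuzzy average value of the variable X:
<disp-formula><tex-math>gt(X) = \frac{a + b}{2}</tex-math></disp-formula></p>
      <p>Step 2: Handling the qualitative table: a set of m items with their importance
evaluated by d managers.
+ Calculate the fuzziness measure of the linguistic variables;
+ Calculate the average of the fuzziness intervals of the qualitative terms for all items:
<disp-formula><tex-math>kdt{\sim}tb(j) = \frac{1}{d} \times \sum_{i=1}^{d} \left(a^{(j)}_{i}, b^{(j)}_{i}\right)</tex-math></disp-formula>
which has the form [a<sub>j</sub>, b<sub>j</sub>];
+ Calculate the average fuzzy value for each item:
<disp-formula><label>(3)</label><tex-math>gtdt{\sim}tb(j) = \frac{a_j + b_j}{2}</tex-math></disp-formula>
where a<sub>j</sub> and b<sub>j</sub> are the endpoints of kdt∼tb(j) = [a<sub>j</sub>, b<sub>j</sub>].</p>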
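      <p>The averaging in Step 2 can be sketched as follows. This is an illustration under stated assumptions: the helper names and the three manager ratings are hypothetical, each rating is assumed to be already converted to its fuzziness interval, and the interval average is taken endpoint by endpoint.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def average_interval(intervals: Sequence[Interval]) -> Interval:
    """Endpoint-wise mean of the d intervals given by the d managers."""
    d = len(intervals)
    a = sum(lo for lo, _ in intervals) / d
    b = sum(hi for _, hi in intervals) / d
    return (a, b)


def midpoint(interval: Interval) -> float:
    """Average fuzzy value of an interval [a, b]: (a + b) / 2."""
    a, b = interval
    return (a + b) / 2


# Hypothetical ratings of one item by three managers, as intervals in [0, 1].
ratings = [(0.88, 1.0), (0.25, 0.75), (0.88, 1.0)]

kdt_tb = average_interval(ratings)  # average qualitative interval
gtdt_tb = midpoint(kdt_tb)          # average fuzzy value
print(kdt_tb, gtdt_tb)
```
]]></preformat>
      <p>Because both operations are linear, taking the midpoint of the averaged interval gives the same value as averaging the midpoints of the individual ratings.</p>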
      <sec id="sec-2-1">
        <title>Step 3: Handling n quantitative transactions.</title>
        <p>
          + Transform the quantitative values A<sub>j</sub> (j = 1, ..., m) into variables X of the HA
(X ∈ X<sub>sl</sub>), determined as follows:
𝒳<sub>sl</sub> = (X<sub>sl</sub>, G<sub>sl</sub>, H<sub>sl</sub>, ≤), with: G<sub>sl</sub> = {High, Low}, (High = H, Low = L);
c<sup>+</sup> = High; c<sup>−</sup> = Low; H<sup>+</sup><sub>sl</sub> = {Very, More}; H<sup>−</sup><sub>sl</sub> = {Less, Possibly}; (with
Very &gt; More; Less &gt; Possibly)
- Select: Dom(sl); fm(High); fm(Low); fm(Very); fm(More); fm(Less); fm(Possibly);
- Identify the fuzziness interval I(X) of each X ∈ X<sub>sl</sub>;
- Transform the quantitative value of each item into [0, 1];
- Assign each A<sub>j</sub> ∈ [0, 1] to the fuzziness interval I(X) that contains it;
+ Count the fuzzy partitions in D∼;
+ Find the largest fuzzy partition as the representative of the j-th item:
<disp-formula><label>(4)</label><tex-math>max\_count_j = \max_{i}\left(count_{ji}\right), \quad i = 1, \ldots, K</tex-math></disp-formula>
        </p>
        <p>Step 4: Calculate the fuzzy support of each item (j = 1, ..., m) as:
<disp-formula><label>(5)</label><tex-math>sup(j) = \frac{gtdt{\sim}tb(j) \times max\_count_j}{N}</tex-math></disp-formula>
where gtdt∼tb(j) is the qualitative value (calculated by formula (3), in Step 2),
max_count<sub>j</sub> is the quantitative value (calculated by formula (4), in Step 3), and
N is the total number of transactions, N = |D|.
        </p>
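        <p>A minimal sketch of the support computation in Step 4 is given below. It assumes that the support is the qualitative value multiplied by the largest fuzzy count and divided by N, which is the reading consistent with the worked figure ([0.6, 1] × 3.04)/6 in the example of this paper. The function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Tuple

Interval = Tuple[float, float]


def fuzzy_support(gtdt_tb: float, max_count: float, n_transactions: int) -> float:
    """Support from the average fuzzy value (assumed form of formula (5))."""
    return gtdt_tb * max_count / n_transactions


def fuzzy_support_interval(kdt_tb: Interval, max_count: float, n_transactions: int) -> Interval:
    """Same scaling applied to both endpoints of the qualitative interval."""
    a, b = kdt_tb
    return (a * max_count / n_transactions, b * max_count / n_transactions)


# Figures taken from the worked example: interval [0.6, 1], count 3.04, N = 6.
lo, hi = fuzzy_support_interval((0.6, 1.0), 3.04, 6)
value = (lo + hi) / 2
print(f"[{lo:.3f}, {hi:.3f}] -> {value:.2f}")
```
]]></preformat>
        <p>With the example figures the interval is about [0.304, 0.507] and its midpoint rounds to 0.41, matching the 41% reported in the example.</p>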
        <p>Step 5: Filter the items in D∼, keeping only the frequent items that satisfy
the minimum support: sup(item) ≥ minsup.</p>
        <p>Step 6: Establish Fuzzy FP-tree: establish Header table; establish FP-tree</p>
      </sec>
      <sec id="sec-2-2">
        <title>Step 7: Calculate the fuzzy qualitative value of each n-itemset (K ≥ n ≥ 2).</title>
        <p>+ Find all frequent itemsets (denoted n-itemsets) from the FP-tree;
+ Calculate the qualitative value of each n-itemset.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Step 8: Calculate the fuzzy support of each n-itemset.</title>
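        <p>+ Using formula (5) from Step 4:
<disp-formula><tex-math>sup(n\text{-itemset}) = \frac{gtdt{\sim}tb(n\text{-itemset}) \times count(n\text{-itemset})}{N}</tex-math></disp-formula></p>
        <p>Step 9: Export rules, calculate the confidence and check it against minconf,
using the following substeps:
+ Check the association rules from the result of Step 8; each n-itemset with items
(A<sub>1</sub>, A<sub>2</sub>, ..., A<sub>n</sub>), (n = 2, ..., M), yields the rules A<sub>1</sub> ∧ ... ∧ A<sub>i−1</sub> ∧ A<sub>i+1</sub> ∧ ... ∧ A<sub>n</sub> → A<sub>i</sub>; (i = 1, ..., M)
+ Calculate the fuzzy confidence value of each possible fuzzy association rule as:
<disp-formula><label>(6)</label><tex-math>conf(A \rightarrow B) = \frac{sup(A \cup B)}{sup(A)}</tex-math></disp-formula>
+ Select the fuzzy association rules that satisfy the minimum confidence.</p>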
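        <p>Step 9 can be sketched as below: each item of a frequent itemset is tried in turn as the consequent, and a rule is kept when its confidence from formula (6) reaches minconf. The function name and the single-item supports are hypothetical values chosen only for illustration; 0.32 and 73.75% are the pair support and the minconf used in the example of this paper.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Rule = Tuple[FrozenSet[str], str, float]  # (antecedent, consequent, confidence)


def single_consequent_rules(
    itemset: Sequence[str],
    supports: Dict[FrozenSet[str], float],
    minconf: float,
) -> List[Rule]:
    """Rules (itemset minus A_i) -> A_i whose confidence is at least minconf."""
    whole = frozenset(itemset)
    rules: List[Rule] = []
    for consequent in itemset:
        antecedent = whole - {consequent}
        conf = supports[whole] / supports[antecedent]  # formula (6)
        if conf >= minconf:
            rules.append((antecedent, consequent, conf))
    return rules


# Supports of B.PH and E.PH alone are hypothetical; 0.32 is the pair support.
supports = {
    frozenset({"B.PH"}): 0.40,
    frozenset({"E.PH"}): 0.41,
    frozenset({"B.PH", "E.PH"}): 0.32,
}

rules = single_consequent_rules(("B.PH", "E.PH"), supports, minconf=0.7375)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", consequent, f"{conf:.2%}")
```
]]></preformat>
        <p>Only single-item consequents are generated here, following the rule form stated in Step 9; every antecedent must already have a support entry, which holds when all subsets of a frequent itemset are themselves frequent.</p>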
        <p>When HA is used for the fuzzy transaction database and for quantifying linguistic terms,
each element of the HA is viewed as a fuzzy region. The process of creating fuzzy
regions based on the structure of the HA is therefore simple, intuitive, and more efficient.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>An example</title>
      <p>In this section, an example is given to illustrate the proposed algorithm.
Input: consists of the following three items:
1. A data set of six quantitative transactions, as shown in Table 2.
2. The importance of the items, evaluated by three managers, as shown in
Table 3.</p>
      <p>3. A pre-defined linguistic minimum support value min<sub>s</sub> and linguistic
minimum confidence value min<sub>c</sub>.</p>
      <sec id="sec-3-1">
        <title>Output: A set of fuzzy association rules. Method: Consists of the following 9 general steps:</title>
        <p>Step 1: Identify minsup and minconf from the pre-defined linguistic
thresholds.</p>
        <p>Identify the parameters of the HA: 𝒳 = (X, G, H, ≤), with:
G = {Low, High}; c<sup>+</sup> = High (denoted by H); c<sup>−</sup> = Low (denoted by L); H<sup>+</sup> =
{Very, More}, H<sup>−</sup> = {Less, Possibly}; (with: Very &gt; More; Less &gt; Possibly)
with: fm(Low) = 0.3; fm(High) = 0.7; fm(Very) = fm(More) = fm(Less) = fm(Possibly) = 0.25.
Identify the fuzziness measure and the fuzziness interval of each X:</p>
        <p>For the terms X built on c<sup>−</sup> = “Low”:
+ fm(VL) = 0.25 × 0.3 = 0.075 ⇒ I(VL) = [0, 0.075] ⇒ I(VL)<sub>TB</sub> = 3.75%
+ fm(ML) = 0.25 × 0.3 = 0.075 ⇒ I(ML) = [0.075, 0.15] ⇒ I(ML)<sub>TB</sub> = 11.25%
+ fm(PL) = 0.25 × 0.3 = 0.075 ⇒ I(PL) = [0.15, 0.225] ⇒ I(PL)<sub>TB</sub> = 18.75%
+ fm(LL) = 0.25 × 0.3 = 0.075 ⇒ I(LL) = [0.225, 0.3] ⇒ I(LL)<sub>TB</sub> = 26.25%</p>
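        <p>The interval construction of Step 1 can be reproduced with a short sketch. It assumes that the hedged terms are laid out left to right in the order used in this example (V, M, P, L on the Low side, then L, P, M, V on the High side) and that each term has width μ(h) × fm(c). The function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, Sequence, Tuple

Interval = Tuple[float, float]


def fuzziness_intervals(
    fm_low: float,
    fm_high: float,
    mu: Dict[str, float],
    low_order: Sequence[str],
    high_order: Sequence[str],
) -> Dict[str, Interval]:
    """Lay the hedged terms side by side on [0, 1], from left to right."""
    intervals: Dict[str, Interval] = {}
    left = 0.0
    for fm_primary, suffix, order in ((fm_low, "L", low_order), (fm_high, "H", high_order)):
        for hedge in order:
            width = mu[hedge] * fm_primary  # fm(hx) = mu(h) * fm(x)
            intervals[hedge + suffix] = (left, left + width)
            left += width
    return intervals


def midpoint_percent(interval: Interval) -> float:
    """Midpoint of the interval as a percentage, rounded to two decimals."""
    a, b = interval
    return round(100 * (a + b) / 2, 2)


# Parameters of the example: fm(Low) = 0.3, fm(High) = 0.7, all hedges 0.25.
mu = {"V": 0.25, "M": 0.25, "P": 0.25, "L": 0.25}
ivs = fuzziness_intervals(0.3, 0.7, mu, low_order="VMPL", high_order="LPMV")

minsup = midpoint_percent(ivs["LL"])   # threshold "Less Low"
minconf = midpoint_percent(ivs["MH"])  # threshold "More High"
print(minsup, minconf)
```
]]></preformat>
        <p>The midpoints obtained this way are 3.75%, 11.25%, 18.75% and 26.25% on the Low side and 38.75%, 56.25%, 73.75% and 91.25% on the High side, so the last interval ends at 1.</p>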
        <p>Similarly, for the terms X built on c<sup>+</sup> = “High”:
+ fm(LH) = 0.25 × 0.7 = 0.175 ⇒ I(LH) = [0.3, 0.475] ⇒ I(LH)<sub>TB</sub> = 38.75%
+ fm(PH) = 0.25 × 0.7 = 0.175 ⇒ I(PH) = [0.475, 0.65] ⇒ I(PH)<sub>TB</sub> = 56.25%
+ fm(MH) = 0.25 × 0.7 = 0.175 ⇒ I(MH) = [0.65, 0.825] ⇒ I(MH)<sub>TB</sub> = 73.75%
+ fm(VH) = 0.25 × 0.7 = 0.175 ⇒ I(VH) = [0.825, 1] ⇒ I(VH)<sub>TB</sub> = 91.25%</p>
        <p>- Select minsup with the linguistic threshold “Less Low” (denoted by LL);
- Select minconf with the linguistic threshold “More High” (denoted by MH):
minsup = minsup(LL) = 26.25%
minconf = minconf(MH) = 73.75%</p>
        <p>Step 2: Handling the qualitative table: a set of m items with their importance
evaluated by 3 managers.</p>
        <p>Identify the parameters of the HA. Denote:
I: Important; uI: UnImportant; O: Ordinary;
VI: Very Important; VuI: Very UnImportant;</p>
        <p>𝒳<sub>qt</sub> = (X<sub>qt</sub>, G<sub>qt</sub>, H<sub>qt</sub>, ≤), with: G<sub>qt</sub> = {Important, UnImportant}; c<sup>+</sup> =
Important; c<sup>−</sup> = UnImportant; H<sup>+</sup><sub>qt</sub> = {Very, More}; H<sup>−</sup><sub>qt</sub> = {Less, Possibly};
(with: Very &gt; More; Less &gt; Possibly).</p>
        <p>Let: W<sub>qt</sub> = 0.5; fm(I) = 0.4; fm(uI) = 0.6; fm(V) = 0.3; fm(M) = 0.2;
fm(L) = 0.3; fm(P) = 0.2;</p>
        <p>We obtain: fm(VI) = 0.3 × 0.4 = 0.12 ⇒ I(VI) = [0.88, 1]; fm(VuI) =
0.3 × 0.6 = 0.18 ⇒ I(VuI) = [0, 0.18]; fm(O) = 0.5 ⇒ I(O) = [0.25, 0.75];</p>
        <p>Table 3 is converted into Table 4, where kdt∼tb is the average
fuzziness interval of the qualitative terms and gtdt∼tb is the average fuzzy value.
+ fuzziness interval of the support: ([0.6, 1] × 3.04)/6 = [0.304, 0.51];
+ fuzzy value of the support: (0.304 + 0.51)/2 = 0.41 = 41%.</p>
        <p>Step 5: Filter the items in D∼, keeping only the frequent items that satisfy
the minimum support: sup(item) ≥ minsup.</p>
        <p>If sup(item) &lt; minsup (with minsup = 26.25%, the result of Step 1),
then remove the item from Table 8.</p>
        <sec id="sec-3-1-1">
          <title>Step 6: Establish fuzzy FP-tree: see figure 1</title>
        </sec>
        <sec id="sec-3-1-2">
          <title>Step 7: Calculate the fuzzy qualitative of n-itemset</title>
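          <p>Substep 7.1: Find all frequent itemsets (denoted n-itemsets) from the FP-tree
(see Table 11).</p>
          <table-wrap>
            <table>
              <thead>
                <tr><th>2-item</th><th>3-item</th></tr>
              </thead>
              <tbody>
                <tr><td>F.PH, B.PH: 1.52; F.PH, E.PH: 1.52; B.PH, E.PH: 2.28</td><td>F.PH, B.PH, E.PH: 1.52</td></tr>
              </tbody>
            </table>
          </table-wrap>
          <table-wrap>
            <table>
              <thead>
                <tr><th>itemset</th><th>kdt∼tb</th><th>gtdt∼tb</th></tr>
              </thead>
              <tbody>
                <tr><td>F.PH, B.PH</td><td>(0.693, 0.92)</td><td>81%</td></tr>
                <tr><td>F.PH, E.PH</td><td>(0.6, 0.92)</td><td>76%</td></tr>
                <tr><td>B.PH, E.PH</td><td>(0.693, 1)</td><td>85%</td></tr>
                <tr><td>F.PH, B.PH, E.PH</td><td>(0.693, 0.92)</td><td>81%</td></tr>
              </tbody>
            </table>
          </table-wrap>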
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Result, we have 2 rules:</title>
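        <table-wrap>
          <table>
            <thead>
              <tr><th>Itemset</th><th>Support</th><th>Minsup = 26.25%</th></tr>
            </thead>
            <tbody>
              <tr><td>F.PH, E.PH</td><td>19%</td><td>unselected</td></tr>
              <tr><td>F.PH, B.PH</td><td>21%</td><td>unselected</td></tr>
              <tr><td>E.PH, B.PH</td><td>32%</td><td>selected</td></tr>
              <tr><td>F.PH, B.PH, E.PH</td><td>21%</td><td>unselected</td></tr>
            </tbody>
          </table>
        </table-wrap>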
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>
        This paper extends the method for mining fuzzy association rules
studied by Chien-Hua Wang and Chin-Tzong Pang [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], using hedge algebras instead of
fuzzy sets. The optimization of the parameters of the quantitative semantics
to fit various problems will be discussed in our future papers.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>Tran Thai Son</string-name>
          and
          <string-name>Nguyen Anh Tuan</string-name>
          ,
          <article-title>Improve efficiency of fuzzy association rule using hedge algebra approach</article-title>
          ,
          <source>Journal of Computer Science and Cybernetics</source>
          , v.
          <volume>30</volume>
          , n.
          <issue>4</issue>
          ,
          <fpage>397</fpage>
          -
          <lpage>408</lpage>
          ,
          <year>2014</year>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>Chien-Hua Wang</string-name>
          and
          <string-name>Chin-Tzong Pang</string-name>
          ,
          <article-title>Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences</article-title>
          ,
          <source>World Academy of Science, Engineering and Technology</source>
          ,
          <volume>29</volume>
          ,
          <fpage>1133</fpage>
          -
          <lpage>1141</lpage>
          ,
          <year>2009</year>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>Chien-Hua Wang</string-name>
          ,
          <string-name>Chin-Tzong Pang</string-name>
          and
          <string-name>Sheng-Hsing Liu</string-name>
          ,
          <article-title>Mining association rules uses fuzzy weighted FP-growth</article-title>
          ,
          <source>Soft Computing and Intelligent Systems (SCIS) and 13th International Symposium on Advanced Intelligent Systems (ISIS)</source>
          ,
          <year>2012</year>
          Joint 6th International Conference on,
          <volume>13498461</volume>
          ,
          <fpage>983</fpage>
          -
          <lpage>988</lpage>
          ,
          <year>2012</year>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>Tzung-Pei Hong</string-name>
          ,
          <string-name>Chun-Wei Lin</string-name>
          and
          <string-name>Wen-Hsiang Lu</string-name>
          ,
          <article-title>Linguistic data mining with fuzzy FP-trees</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>37</volume>
          ,
          <fpage>4560</fpage>
          -
          <lpage>4567</lpage>
          ,
          <year>2010</year>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>Tzung-Pei Hong</string-name>
          ,
          <string-name>Minh-Jer Chiang</string-name>
          and
          <string-name>Shyue-Liang Wang</string-name>
          ,
          <article-title>Data Mining with Linguistic Thresholds</article-title>
          ,
          <source>Int. J. Contemp. Math. Sciences</source>
          , vol
          <volume>7</volume>
          , n.
          <issue>35</issue>
          ,
          <fpage>1711</fpage>
          -
          <lpage>1725</lpage>
          ,
          <year>2012</year>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>