<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning Fuzzy Cognitive Maps by a Hybrid Method Using Nonlinear Hebbian Learning and Extended Great Deluge Algorithm</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zhaowei Ren</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Electronic and Computing Systems University of Cincinnati 2600 Clifton Ave.</institution>
          ,
          <addr-line>Cincinnati, OH 45220</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2007</year>
      </pub-date>
      <abstract>
        <p> Fuzzy Cognitive Maps (FCM) is a technique to represent models of causal inference networks. Data driven FCM learning approach is a good way to model FCM. We present a hybrid FCM learning method that combines Nonlinear Hebbian Learning (NHL) and Extended Great Deluge Algorithm (EGDA), which has the efficiency of NHL and global optimization ability of EGDA. We propose using NHL to train FCM at first, in order to get close to optimization, and then using EGDA to make model more accurate. We propose an experiment to test the accuracy and running time of our methods.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Fuzzy Cognitive Maps (FCM) (1) are a modeling
methodology that represents causal relations between
different variables in a system as a graph. One way of developing the
inferences is by matrix computation. An FCM is a cognitive
map with fuzzy logic (2). FCM allows loops in its network,
so it can model feedback and discover hidden relations
between concepts (3). Another advantage is that neural
network techniques can be used to train FCM, e.g. Hebbian
learning (4), the Genetic Algorithm (GA) (5), and Simulated
Annealing (SA) (6).</p>
      <p>
The structure of an FCM is similar to an artificial neural
network, e.g. Figure 1. There are two elements in an FCM:
concepts and relations. Concepts reflect attributes, qualities
and states of the system. The value of a concept ranges from 0
to 1. Concepts can reflect both Boolean and quantitative
values. For example, a concept can reflect either the state of
a light (where 0 means off and 1 means on) or the water level of
a tank. If it reflects a quantitative value, equation [1] can be
used for normalization:
A' = (A − A_min) / (A_max − A_min)   [1]
where A is the concept value before normalization, and
A_max and A_min are the possible maximum and minimum
values of A. Relations reflect causal inference from one
concept to another. Relations have a direction and a weight
value. w_ij denotes the weight value from concept C_i
to concept C_j. For a pair of nodes, there may be two, one
or no relations between them. There are three possible
types of causal relations:
      </p>
      <p>If w_ij is positive, the relation from concept C_i to concept C_j
is positive: when C_i increases (decreases),
C_j also increases (decreases).</p>
      <p>If w_ij is negative, the relation from concept C_i to concept C_j
is negative: when C_i increases (decreases),
C_j, on the contrary, decreases (increases).
If w_ij is zero, there is no relation between C_i
and C_j.</p>
      <p>
        When the initial state of an FCM is given, the FCM converges
to a steady state through an iterative process. Each concept
value is computed from the weighted sum of all the
concepts that may be related to it. In each iteration, a concept
value is calculated by equation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
A_i(k+1) = f( A_i(k) + Σ_{j=1, j≠i}^{N} w_ji · A_j(k) )   [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
where A_i(k+1) is the value of concept C_i in iteration k+1, A_j(k) is
the value of concept C_j in iteration k, and w_ji is the weight
of the edge from concept C_j to concept C_i.
f(x) = 1 / (1 + e^{−λx}) is a transfer function that normalizes the
result, and λ is a parameter that determines its steepness.
      </p>
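      <p>As a concrete illustration of the iteration in equation [2], the following minimal Python sketch (our own illustrative code, not the paper's; the sigmoid transfer function and the convergence tolerance are assumptions) runs an FCM from an initial state until it stabilizes:</p>

```python
import math

def fcm_step(A, W, lam=1.0):
    """One FCM iteration (equation [2]): each concept's next value is the
    transfer function applied to its current value plus the weighted sum
    of the other concepts. W[j][i] is the weight from concept j to i."""
    n = len(A)
    nxt = []
    for i in range(n):
        s = A[i] + sum(W[j][i] * A[j] for j in range(n) if j != i)
        nxt.append(1.0 / (1.0 + math.exp(-lam * s)))  # f(x) = 1/(1+e^(-lam*x))
    return nxt

def fcm_converge(A, W, lam=1.0, tol=1e-5, max_iter=200):
    """Iterate until successive states differ by less than tol."""
    for _ in range(max_iter):
        nxt = fcm_step(A, W, lam)
        if tol > max(abs(a - b) for a, b in zip(A, nxt)):
            return nxt
        A = nxt
    return A

# Weights of the three-concept example used later in the paper (Table 2)
W = [[0.0, 0.3, 0.0],
     [0.7, 0.0, 0.2],
     [-0.6, -0.3, 0.0]]
steady = fcm_converge([0.5, 0.5, 0.1], W)
```

      <p>Because f maps every weighted sum into (0, 1), the steady-state concept values always stay in the valid range.</p>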
      <p>For example, Figure 2 shows an industrial
process control problem (8). There is a tank with two
valves through which liquids flow into the tank. These two liquids
react in the tank. There is another valve which
empties the fluid in the tank. There is also a sensor to
gauge the specific gravity of the produced liquid (proportional to the
rate of reaction) in the tank, as described in the figure below.</p>
      <p>There are two constraints in this problem. The first
is to maintain the value of G in a particular range, and the
second is to keep the height of the liquid (T) in a range.
Parsopoulos et al. (8) propose that there should be five
concepts: (a) the height of liquid in the tank, (b) the state of
valve 1, (c) the state of valve 2, (d) the state of valve 3, and
(e) the gravity of the produced liquid in the tank. Our aim is to
find the causal inference value from one concept to
another.</p>
      <p>There are mainly two strategies for learning an FCM. One is
to exploit expert domain knowledge and formulate a
specific application’s FCM (7); this can be used when
there are no good automated or semi-automated methods to
build the model. If multiple domain experts are
available, each expert chooses a value (e.g. very weak,
weak, medium, strong, very strong) for the causal effect
from one concept to another; the values are then
quantified and combined into one value between
−1 and 1. This strategy has its own shortcomings: when the
problem is complex and needs a large number of concepts
to describe the system, the cost of the expert strategy is very
high; moreover, it is difficult to discover new hidden
relations with this strategy. The other strategy is to develop a
data-driven learning method. The input, output and state of a
system are recorded while it is running, and these records
are used as a neural-network training dataset.</p>
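      <p>The expert strategy above can be sketched as follows. The linguistic-to-numeric mapping and the simple averaging are our own illustrative assumptions; the paper does not fix a particular scheme:</p>

```python
# Illustrative sketch of the expert strategy: the mapping below and the
# averaging rule are assumptions, not values from the paper.
LINGUISTIC = {
    "very weak": 0.1,
    "weak": 0.3,
    "medium": 0.5,
    "strong": 0.7,
    "very strong": 0.9,
}

def combine_expert_weights(opinions, sign=1):
    """Quantify each expert's linguistic value and average them; `sign`
    is +1 for a positive causal relation and -1 for a negative one, so
    the combined weight lies between -1 and 1."""
    values = [LINGUISTIC[o] for o in opinions]
    return sign * sum(values) / len(values)

w = combine_expert_weights(["weak", "medium", "strong"], sign=-1)  # ≈ -0.5
```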
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>One branch of Fuzzy Cognitive Map (FCM) learning is
Hebbian learning. Different Hebbian learning methods have been
proposed, for example Differential Hebbian Learning
(DHL) (4) and its modified version, Balanced Differential
Hebbian Learning (BDHL) (9). DHL changes the weight
matrix based on the difference between two records, but it does not
consider the scenario in which multiple concepts affect
one concept jointly. BDHL covers this situation, but it is costly
owing to its lack of optimization. The two Hebbian learning
methods that have been used in the real world are Active
Hebbian Learning (AHL) (10) and Nonlinear Hebbian
Learning (NHL) (11), and both of them require expert
knowledge before computation. AHL explores a method to
determine the sequence of activated concepts. For each
concept, only the concepts that may affect it are activated.
AHL is fast but requires expert intervention: experts
must determine the desired set of concepts and the initial
structure of the FCM. NHL is a nonlinear extension of DHL. In
NHL, before the iteration starts, experts have to indicate an
initial structure and the sign of each non-zero weight. Weight
values are updated synchronously, and only for the concepts
that the experts indicate.</p>
      <p>Another branch of FCM structure learning is the
population-based methods. Koulouriotis (12) proposed evolution
strategies to train fuzzy cognitive maps. In 2007
Ghazanfari et al. (6) proposed using Simulated Annealing
(SA) to learn FCM and compared the Genetic Algorithm
(GA) (5) with SA. They concluded that when there are more
concepts in the network, SA performs better than the
genetic algorithm. In 2011, Baykasoglu et al. (13)
proposed an algorithm called the Extended Great Deluge
Algorithm (EGDA) to train FCM. EGDA is quite similar to
SA, but it demands fewer parameters than SA.
Population-based methods are capable of reaching a global
optimum even if the initial weight matrix is not good,
but they are usually computationally costly, especially when the
initial weight matrix is far from the optimal position. Moreover,
population-based methods have many parameters that have
to be set before processing. The parameters are usually set
from experience, and repeated experiments with
different parameters must then be made to get better
performance. Hebbian learning methods are relatively fast,
but their performance depends strongly on the initial weight
matrix and the predefined FCM structure. Expert intervention
is usually essential: experts need to indicate a structure
before the experiments.</p>
      <p>The third branch is the hybrid methods, which combine the
effectiveness of Hebbian learning with the global search
capability of population-based methods. Papageorgiou and
Groumpos (14) proposed a hybrid learning method that
combines NHL and the Differential Evolution (DE) algorithm.
First, NHL is used to learn the FCM, and then its result is fed
to the DE algorithm. This method makes use of both the
effectiveness of Hebbian learning and the global search
ability of a population-based method. The three experiments
they performed show that this hybrid method is capable of
training FCM effectively. Zhu et al. (15) proposed another hybrid
method which combines NHL and the Real-coded Genetic Algorithm
(RCGA).</p>
      <p>Here we propose a hybrid method combining NHL and
EGDA. EGDA has global search ability and relatively few
parameters to set. If its initial weight matrix is close to
the optimal condition, it saves much computing expense.
We therefore use NHL to train the FCM roughly first, and then feed
its result to EGDA. NHL is picked because it is simple and
fast, and it can deal with a continuous range of concept
values.</p>
    </sec>
    <sec id="sec-3">
      <title>Hybrid Method Using NHL and EGDA</title>
      <p>This hybrid method proceeds in two stages.</p>
      <sec id="sec-3-1">
        <title>Stage 1: use Nonlinear Hebbian Learning (NHL) (11) to train the FCM</title>
        <p>Step 1: Initialize the weight matrix with the help of experts
and read the input concept values. The initial weight
matrix is fed to step 2.</p>
        <p>Step 2: Calculate A_i(1) (the concept values in iteration 1; the initial
values can be denoted as the values in iteration 0) by equation [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]:
A_i(k+1) = f( A_i(k) + Σ_{j≠i} w_ji(k) · A_j(k) )   [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]
where f(x) = 1 / (1 + e^{−λx}). λ is a parameter that determines the
increasing rate of the curve. f is a transfer function: as x
changes from −∞ to ∞, f(x) changes from 0 to 1.
Therefore, the final concept values still lie between zero and
one.</p>
        <p>
          Step 3:
Use equation [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] to update the weights:
w_ji(k+1) = w_ji(k) + η(k) · A_i(k) · ( A_j(k) − w_ji(k) · A_i(k) )   [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]
where η(k) is the learning rate function, which decreases as k
increases.
        </p>
        <p>
          Step 4: At the end of each update, the error function is
computed as equation [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]:
J(k) = Σ_{i=1}^{N} ( A_i(k+1) − A_i(k) )²   [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]
where k is the iteration number. There are two
termination conditions: one is that the value of the error function
[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] falls below a threshold, and the other is that enough
iterations have been performed. If one of the termination conditions is
reached, the iteration ends. Otherwise, the next
iteration begins.
        </p>
        <p>For example, suppose we have time-series data for each
concept value as in Table 1.</p>
        <p>A1 A2 A3
0.5 0.5 0.1
0.6 0.4 0.2
0.5 0.3 0.3</p>
        <p>Table 1. Concept value records.
Each tuple is a record of the three concept values.</p>
        <p>The initial weight matrix is predicted by experts or generated
randomly. Here it is given in Table 2.</p>
        <p>W 1 2 3
1 N/A 0.3 0
2 0.7 N/A 0.2
3 -0.6 -0.3 N/A</p>
        <p>Table 2. Initial weight matrix.
w_ij (the weight from concept i to concept j) is the value
in row i and column j. For example, w_12 = 0.3.</p>
        <p>
          For example, suppose we want to update the weights into concepts 2 and 3 using the
first tuple of data. First, we use equation [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to calculate
the new concept values; λ is set to 1 here.
        </p>
        <p>If J is larger than the termination threshold, go back to step 2.
Otherwise, terminate this algorithm and go to stage 2.</p>
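        <p>Stage 1 as a whole can be sketched in Python. This is our own illustrative code: the learning rate η is held constant rather than decreasing, and the exact form of the NHL update in the paper may differ from the version assumed here:</p>

```python
import math

def sigmoid(x, lam=1.0):
    return 1.0 / (1.0 + math.exp(-lam * x))

def nhl_train(A, W, eta=0.05, lam=1.0, threshold=1e-4, max_iter=100):
    """Stage 1 sketch: iterate equations [3]-[5] until the error J falls
    below the threshold or max_iter is reached. W[j][i] is the weight
    from concept j to concept i; zero weights stay zero, preserving the
    expert-given structure."""
    n = len(A)
    for _ in range(max_iter):
        # equation [3]: next concept values
        A_next = [sigmoid(A[i] + sum(W[j][i] * A[j]
                                     for j in range(n) if j != i), lam)
                  for i in range(n)]
        # equation [4]: Hebbian-style update of the non-zero weights
        for j in range(n):
            for i in range(n):
                if i != j and W[j][i] != 0.0:
                    W[j][i] += eta * A_next[i] * (A_next[j] - W[j][i] * A_next[i])
        # equation [5]: error between successive iterations
        J = sum((A_next[i] - A[i]) ** 2 for i in range(n))
        A = A_next
        if threshold > J:
            break
    return A, W

# Run on the worked example: initial concepts from Table 1, weights from Table 2
A_final, W_final = nhl_train([0.5, 0.5, 0.1],
                             [[0.0, 0.3, 0.0],
                              [0.7, 0.0, 0.2],
                              [-0.6, -0.3, 0.0]])
```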
      </sec>
      <sec id="sec-3-2">
        <title>Stage 2: use the Extended Great Deluge Algorithm (EGDA) (13) to train the FCM</title>
        <p>Step 1: Initialize the weight matrix with the values suggested
by stage 1; the output of stage 1 is fed to this step.
Assume the weight matrix we obtained from the last step is as in Table
3. Steps 2 and 3 generate a neighbor configuration of the current
weight matrix and compute its fitness,
where K is the number of iterations, and N is the number
of concepts.</p>
        <p>Step 4: If the fitness function value of the neighbor
configuration is better than the tolerance, it is picked as the current
configuration and we go to step 5; otherwise, go to step
2. Then the tolerance is reduced.</p>
        <p>Step 5: If the value of the fitness function is better than that of the best
configuration, update the best configuration.</p>
        <p>For example, first we find a new neighbor of the current configuration;
the tolerance is set to 0.2.</p>
        <p>If this value is better than the tolerance, it is taken as the current
configuration, and it is then compared with the best
configuration to see if it is the best so far. If the new
neighbor does not beat the tolerance, find another neighbor near
the current one. The tolerance is reduced after each search. If the total
error falls below a threshold or enough iterations have been
performed, the algorithm terminates.</p>
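        <p>Stage 2 can be sketched as follows. This is our own illustrative code: the neighborhood step size, the tolerance-reduction factor, and the treatment of fitness as an error to be minimized are assumptions, not the paper's exact settings:</p>

```python
import random

def egda_minimize(error, w0, sigma=0.2, shrink=0.98, max_iter=500):
    """Extended-great-deluge sketch: accept any neighbor whose error
    beats the current tolerance (the "water level"), tighten the
    tolerance after each search, and remember the best weights seen."""
    current = list(w0)
    best = list(w0)
    tol = error(current)              # start the tolerance at the initial error
    for _ in range(max_iter):
        # random neighbor of the current weight vector, clipped to [-1, 1]
        neighbor = [max(-1.0, min(1.0, w + random.uniform(-sigma, sigma)))
                    for w in current]
        if tol > error(neighbor):     # better than the tolerance: accept it
            current = neighbor
            if error(best) > error(current):
                best = list(current)  # new best configuration
        tol *= shrink                 # reduce the tolerance after each search
    return best

random.seed(0)
best = egda_minimize(lambda w: sum(x * x for x in w), [0.9, -0.8])
```

        <p>In the hybrid method, `error` would be the fitness over the training records and `w0` the flattened weight matrix produced by stage 1.</p>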
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiment Design</title>
      <p>There are two steps in our experiment. First, we
test our method on simulated data and try to find
the scenario in which our method is most efficient and
accurate. In the second step, we apply our method to a
real application.</p>
      <p>In this experiment, data are generated by a random
process. First, the number of concepts and the density of
relations are set. We can try different numbers of concepts,
from small to large, in order to test the performance of the
method on networks of different complexity. Density
represents the percentage of possible edges that exist in a network.
The number of edges is defined by equation [8]:
E = density × N × (N − 1)   [8]</p>
      <p>For example, if we set the number of concepts to 5 and the
density to 0.4, the number of edges is computed as
E = 0.4 × 5 × (5 − 1) = 8,
so there would be eight edges in this network.</p>
      <p>
        After the numbers of concepts and edges are set, a model can
be generated with random weights; we name it the original
model. Then random data are generated and fed to
equation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] iteratively until the model reaches a steady state (the
error in equation [7] is lower than a threshold). The steady
state becomes one record of the simulated data. If, after a certain
number of iterations, the model still cannot reach a steady state, a new
tuple of data is generated randomly and fed to
equation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. After hundreds of such runs, we have a series
of data as the training set. These data are used to learn an FCM with
our method. The weight matrix we obtain is then compared
with the original model. The error is calculated as equation
[9]:
Error = Σ_{i=1}^{N} Σ_{j=1}^{N} ( w_ij − w̄_ij )²   [9]
where w̄_ij are the weights of the original model and N is the
number of concepts in the model.
      </p>
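      <p>The comparison against the original model can be computed as below. This is our own sketch of equation [9]; whether the paper additionally normalizes the sum by N is not recoverable from the text, so this version does not:</p>

```python
def model_error(W_learned, W_original):
    """Equation [9]: sum of squared differences between the learned
    weights and the original model's weights. The diagonal is excluded,
    since a concept has no edge to itself."""
    n = len(W_original)
    return sum((W_learned[i][j] - W_original[i][j]) ** 2
               for i in range(n) for j in range(n) if i != j)

# Identical matrices give zero error; a single weight off by 0.5
# contributes 0.25 to the total.
W_orig = [[0.0, 0.3], [0.7, 0.0]]
W_learn = [[0.0, 0.8], [0.7, 0.0]]
err = model_error(W_learn, W_orig)
```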
      <p>Some other methods (NHL, EGDA, SA) can also be
implemented and compared with this method. These
methods will be compared in accuracy and running time,
under several conditions.</p>
      <p>After the simulated experiments, based on the best conditions
found for our method, we will apply it to a real practical problem.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We propose a hybrid method for learning FCM. Our method
takes advantage of the speed of NHL and the global
search ability of EGDA. Moreover, we propose an
experiment to test our algorithm and aim to apply it in
practice.</p>
      <p>7. Khan MS, Quaddus M. Group decision support using
fuzzy cognitive maps for causal reasoning. Group Decis
Negotiation 2004; 13: 463-480.</p>
      <p>8. Parsopoulos KE, Papageorgiou EI, Groumpos PP, and
Vrahatis MN. A first study of fuzzy cognitive maps
learning using particle swarm optimization. 2003; 2: 1440.</p>
      <p>9. Huerga AV. A balanced differential learning
algorithm in fuzzy cognitive maps. 2002.</p>
      <p>10. Papageorgiou EI, Stylios CD, and Groumpos PP.
Active hebbian learning algorithm to train fuzzy cognitive
maps. International Journal of Approximate Reasoning
2004; 37: 219.</p>
      <p>11. Papageorgiou E, Stylios C, and Groumpos P. Fuzzy
Cognitive Map Learning Based on Nonlinear Hebbian
Rule. In: Gedeon T and Fung L eds. AI 2003: Advances in
Artificial Intelligence. Springer Berlin / Heidelberg, 2003:
256-268.</p>
      <p>12. Koulouriotis DE, Diakoulakis IE, and Emiris DM.
Learning fuzzy cognitive maps using evolution strategies:
A novel schema for modeling and simulating high-level
behavior. 2001; 1: 364.</p>
      <p>13. Baykasoglu A, Durmusoglu ZDU, and Kaplanoglu
V. Training fuzzy cognitive maps via extended great
deluge algorithm with applications. Comput Ind 2011; 62:
187.</p>
      <p>14. Papageorgiou EI, Groumpos PP. A new hybrid
method using evolutionary algorithms to train fuzzy
cognitive maps. Applied Soft Computing 2005; 5: 409.</p>
      <p>15. Yanchun Z, Wei Z. An integrated framework for
learning fuzzy cognitive map using RCGA and NHL
algorithm. 2008; 1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Kosko</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Fuzzy cognitive maps</article-title>
          .
          <source>International Journal of Man-Machine Studies</source>
          <year>1986</year>
          ;
          <volume>24</volume>
          :
          <fpage>65</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kosko</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Fuzzy engineering</article-title>
          . Upper Saddle River, NJ, USA: Prentice-Hall, Inc.
          <year>1997</year>
          : .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Papageorgiou</surname>
            <given-names>EI</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stylios</surname>
            <given-names>C</given-names>
          </string-name>
          , and Groumpos PP.
          <article-title>Unsupervised learning techniques for fine-tuning fuzzy cognitive map causal links</article-title>
          .
          <source>International Journal of Human-Computer Studies</source>
          <year>2006</year>
          ;
          <volume>64</volume>
          :
          <fpage>727</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dickerson</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosko</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Virtual worlds as fuzzy cognitive maps</article-title>
          .
          <source>Virtual Reality Annual International Symposium, 1993 IEEE</source>
          <year>1993</year>
          ;
          <fpage>471</fpage>
          -
          <lpage>477</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Stach</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kurgan</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedrycz</surname>
            <given-names>W</given-names>
          </string-name>
          , and Reformat M.
          <article-title>Genetic learning of fuzzy cognitive maps</article-title>
          .
          <source>Fuzzy Sets Syst</source>
          <year>2005</year>
          ;
          <volume>153</volume>
          :
          <fpage>371</fpage>
          -
          <lpage>401</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ghazanfari</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alizadeh</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fathian</surname>
            <given-names>M</given-names>
          </string-name>
          , and Koulouriotis DE.
          <article-title>Comparing simulated annealing and genetic algorithm in learning FCM</article-title>
          .
          <source>Applied Mathematics and Computation</source>
          <year>2007</year>
          ;
          <volume>192</volume>
          :
          <fpage>56</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>