=Paper= {{Paper |id=Vol-2212/paper16 |storemode=property |title=Application of time series analysis for structural and parametric identification of fuzzy cognitive models |pdfUrl=https://ceur-ws.org/Vol-2212/paper16.pdf |volume=Vol-2212 |authors=Ruslan Isaev,Aleksandr Podvesovskii }} ==Application of time series analysis for structural and parametric identification of fuzzy cognitive models == https://ceur-ws.org/Vol-2212/paper16.pdf
Application of time series analysis for structural and
parametric identification of fuzzy cognitive models

                    R A Isaev¹ and A G Podvesovskii¹

                    ¹ Bryansk State Technical University, 50 let Oktyabrya 7, Bryansk, Russia, 241035


                    Abstract. The article deals with problems of structural and parametric identification of fuzzy
                    cognitive models on the basis of statistical data analysis. The feasibility of applying time
                    series analysis to these problems is justified. The Granger causality test is proposed
                    for structural identification. An approach to parametric identification based on a
                    distributed-lag time series model is also proposed. The results of experimental verification
                    of the described approaches are presented.



1. Introduction
The cognitive approach is one of the approaches to the study of semi-structured systems that is widely
used at present. According to the definition given in [1], this approach focuses on the
development of formal models and methods that support the intelligent problem-solving process by
incorporating human cognitive capabilities (perception, conception, cognition, understanding,
explanation) into the solution of management problems. Structure and target modeling and simulation
modeling methods based on the cognitive approach are commonly subsumed under the umbrella term
“cognitive modeling”. In general terms, cognitive modeling refers to the study of the structure,
functioning and development of a system by analyzing its cognitive model. The cognitive model is based
on a cognitive map, which reflects the researcher's subjective view (individual or collective) of the
system as a number of semantic categories (known as factors or concepts) and a set of cause-and-effect
relationships between them.
    A cognitive model is an effective tool for exploratory and estimative analysis of a situation. It
does not yield accurate quantitative characteristics of the system under study, but it makes it
possible to assess trends in its functioning and development and to identify the key factors
influencing these processes. Thus, we can search for, generate and develop effective solutions for
system management, as well as identify risks and develop strategies to reduce them.
    Cognitive modeling starts with creating a cognitive map of the system under study on the basis of
information received from experts. The next step is the simulation itself. Its main objectives are
forming and testing hypotheses about the structure of the system under study that can explain its
behavior, and developing strategies for various situations in order to reach the specified target states.
    Tasks solved by means of cognitive modeling can be divided into two groups:
    1. Tasks of structure and target analysis:
       • finding the key factors influencing the targets;
       • identification of contradictions between the targets;
       • identification of feedback loops.
    2. Tasks of dynamic analysis (scenario simulation):
       • self-development (“what if we do nothing”);


IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018)
Data Science
R A Isaev and A G Podvesovskii




        • managed development:
          o direct task (“what if”);
          o inverse task (“how to”).
   Thus, scenario simulation allows prediction of the simulated system states under different
control actions and a search for alternative control solutions that bring the system to the target state.
   The mathematical apparatus most commonly used to represent cognitive models and the methods for
their analysis is fuzzy logic. As a result, a whole class of cognitive models based on different types
of fuzzy cognitive maps (FCM) has appeared. A detailed overview of such models can be
found, for instance, in monograph [2]. One FCM variety that has proven itself in the practical analysis
and modeling of semi-structured organizational, social and economic systems is Sylov’s FCM. It
was first proposed in [3] and is a development of signed cognitive maps [4].

2. Formal definition and structure of Sylov’s fuzzy cognitive map
As previously mentioned, a cognitive model is based on formalization of cause-and-effect
relationships which occur between factors characterizing the system under study. The result of the
formalization represents the system in the form of a cause-and-effect network, termed a cognitive map
and having the following form:
                                                G = < E, W >,
where E = {e1, e2, …, eK} is a set of factors (also called concepts), W is a binary relation on the set E,
which specifies a set of cause-and-effect relationships between its elements.
    Concepts can specify both relative (qualitative) characteristics of the system under study, such as
popularity, social tension, and absolute, measurable values: population size, cost, etc. Moreover, every
concept ei is connected with a state variable vi, which specifies the value of the corresponding index at
a particular instant. State variables can possess values expressed on a certain scale, within the
established limits. Value vi(t) of state variable at instant t is called the state of concept ei at the given
instant. Thus, the state of the simulated system at any given instant is described by the state of all
concepts included in its cognitive map.
    Concepts ei and ej are considered to be connected by relation W (designated as (ei , ej) ∈ W or eiWej)
if changing the state of concept ei (cause) results in changing the state of concept ej (effect). In this
case, we say that concept ei influences concept ej. If an increase in the value of the concept-cause
state variable leads to an increase in the value of the concept-effect state variable, the influence is
considered positive (“strengthening”); if it leads to a decrease, the influence is considered negative
(“inhibition”). Therefore, the relation W can be represented as a union of two disjoint subsets
$W = W^{+} \cup W^{-}$, where $W^{+}$ is a set of positive relationships and $W^{-}$ is a set of
negative relationships.
    A fuzzy cognitive model is based on the assumption that the influence between concepts may vary in
intensity, and that this intensity may be constant or variable in time. Taking this assumption into
account, W is specified as a fuzzy relation; how exactly it is specified depends on the adopted approach
to formalizing cause-and-effect relationships. A cognitive map with a fuzzy relation W is termed a fuzzy
cognitive map.
    Sylov’s fuzzy cognitive map is an FCM characterized by the following features:
    1. State variables of concepts can possess values on the interval [0, 1].
    2. Influence intensity is considered constant, so relation W is specified as a set of numbers wij,
       characterizing the direction and degree of influence intensity (weight) between concepts ei and
       ej:
                                                wij = w(ei , ej),
where w is a normalized index of influence intensity (characteristic function of the relation W) with the
following properties:
    a) –1 ≤ wij ≤ 1;
    b) wij = 0, if ej does not depend on ei (no influence);
    c) wij = 1 if positive influence of ei on ej is maximum, i.e. when any changes in the system related
       to concept ej are univocally determined by the actions associated with concept ei;



   d) wij = –1 if negative influence is maximum, i.e. when any changes related to concept ej are
      uniquely constrained by the actions associated with concept ei;
   e) wij possesses the value from the interval (–1, 1), when there is an intermediate degree of positive
      or negative influence.
   Clearly, an FCM of this structure can be graphically represented as a weighted directed graph whose
vertices correspond to elements of set E (concepts) and whose arcs correspond to nonzero elements of
relation W (cause-and-effect relationships). Each arc has a weight specified by the corresponding
value wij. In this case, relation W can be represented as a matrix of dimension K×K (where K is the
number of concepts in the system), which can be considered as the graph adjacency matrix and is
termed a cognitive matrix.
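
   As an illustration of the matrix representation, the following Python sketch builds a cognitive
matrix from a list of weighted connections. The function name `build_cognitive_matrix`, the concept
names and the example weights are hypothetical and serve only to show one possible data structure.

```python
def build_cognitive_matrix(concepts, links):
    """Return the cognitive matrix w[i][j] = influence of concept i on concept j.

    `links` maps (cause, effect) name pairs to weights; the names and the
    layout are illustrative assumptions, not part of the original formalism.
    """
    index = {name: i for i, name in enumerate(concepts)}
    size = len(concepts)
    matrix = [[0.0] * size for _ in range(size)]  # 0 means "no influence"
    for (cause, effect), weight in links.items():
        if not -1.0 <= weight <= 1.0:
            raise ValueError(f"weight {weight} for {cause}->{effect} is outside [-1, 1]")
        matrix[index[cause]][index[effect]] = weight
    return matrix


# Hypothetical three-concept map.
concepts = ["price", "demand", "revenue"]
links = {("price", "demand"): -0.6, ("demand", "revenue"): 0.8, ("price", "revenue"): 0.4}
W = build_cognitive_matrix(concepts, links)

# Split the connections into the positive and negative subsets of W.
positive = [pair for pair, w in links.items() if w > 0]
negative = [pair for pair, w in links.items() if w < 0]
```

   Rows index the cause and columns the effect, so a zero entry means that no arc is drawn between the
corresponding vertices.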

3. The present state of research in the field of fuzzy cognitive models identification
In the course of building an FCM we can distinguish two stages:
    • structural identification, which implies determining a set of concepts E and a crisp relation W
       over this set, i.e. verification of connections between concepts;
    • parametric identification, which implies transition from the crisp relation W to a fuzzy one, i.e.
       determination of connection weights (influence intensity) between concepts.
    Experts are the key source of information at both stages of building a map. In particular, at the
structural identification stage, a list of concepts is formed by an expert (or a group of experts). Then,
connections between concepts are added to the cognitive model on the basis of the expert notion of the
simulated situation.
    Expert methods are also most commonly used at the parametric identification stage. They can be
direct or indirect. The direct methods imply immediate (explicit) weighting by an expert. The indirect
methods are used to minimize the impact of subjectivity in the weighting process, and they are based
on breaking the general task of determining weights into a number of simpler sub-tasks. Saaty’s
pairwise comparison method, Yager’s level set method and the Churchman-Ackoff method are examples
of indirect methods. A description of these methods, as applied to defining FCM weights, can be
found in monograph [5] (section 3.2).
    As previously noted, some concepts can represent quantitative parameters of the system under study
and consequently have numerical state variables. Provided that statistical information about the
values of these variables is available, it can be used instead of expert assessment to identify
connection weights between such concepts. Thus, statistical methods can be used to identify FCM
parameters alongside the expert methods. Which method is applicable is determined by the nature of the
available statistical information [6]. For instance, if statistical data about concepts are represented
in the form of spatial sampling (i.e. a cross-sectional sample), a linear regression model can be
applied to identify the sign and intensity of influence between the concepts.
    A method based on a pair (simple) linear regression model was proposed in monograph [7]
(sections 4.2-4.3). It makes it possible to identify the sign and intensity of influence between two
concepts. Generalizations of this method based on multiple regression analysis are of special interest,
since they allow identifying the parameters of influence of several concepts on one concept.
    Attempts to apply correlation and multiple regression analysis to build fuzzy cognitive models of
social and economic systems were undertaken in [8, 9]. Nevertheless, the results given in these papers
cannot be regarded as satisfactory for several reasons. First, the authors use correlation and
regression analysis to reveal the very existence of cause-and-effect relationships between concepts and
to ascertain the direction of these relationships. However, it is well known that neither a high
correlation coefficient between factors nor the reliability of a regression model built on them is
sufficient to conclude that there is a cause-and-effect relationship between these factors. Moreover,
these methods cannot reliably determine the direction of such a relationship.
Second, the authors propose to use regression equation coefficient values as connection weights
between concepts. Yet, the weights obtained can take values outside the range [-1, 1], which
contradicts the formal definition of Sylov’s FCMs. Finally, these papers do not examine the
multicollinearity problem, i.e. a high degree of intercorrelation among explanatory variables in
regression models. This inevitably leads to an abundance of redundant connections in the FCMs obtained.
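
    The second objection can be illustrated with a small Python sketch: the slope of a least-squares
line depends on the units in which the variables are measured, so nothing confines it to [-1, 1]. The
observations below are made up for the illustration, and the helper name `slope` is hypothetical.

```python
def slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Made-up observations: y is roughly three times x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 5.9, 9.2, 11.8, 15.0]

b_raw = slope(x, y)                         # about 3, outside [-1, 1]
b_rescaled = slope(x, [v / 10 for v in y])  # same data, y expressed in other units
print(b_raw, b_rescaled)
```

    The same dependence yields two different coefficients, which is why a raw regression coefficient
cannot serve directly as a connection weight.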
    The approach described in [10] is also based on multiple regression analysis but is free of the
enumerated drawbacks. Nevertheless, a number of open problems remain in the identification of fuzzy
cognitive models on the basis of statistical data.
    First of all, since the modeled systems are dynamic (i.e. their state changes in the course of
time), statistical information about them is in most cases likely to be represented in the form of time
series. In this context, regression analysis is not viable, because one of its conditions
is that data are represented in the form of spatial sampling. Considering this, methods should be
developed to identify connection weights between concepts on the basis of time series analysis. This
issue was partially addressed in [11], but no approach to identification based directly on time series
analysis was proposed there; the primary focus was on correlation analysis.
    The development of FCM structural identification methods based on statistical data is another
pressing problem. As mentioned above, methods based on spatial sampling analysis do not
enable us to establish cause-and-effect relationships between concepts, while time series analysis
methods provide us with such an opportunity.
    Further, we describe approaches to solving the specified problems.

4. Application of Granger causality test to structural identification
As noted above, the decision to add a connection between two concepts to a cognitive model is
made on the basis of the expert's view of the system modeled. Even if there are statistical data about
the concepts in the form of spatial sampling, it is impossible to establish either a cause-and-effect
relationship between concepts or its direction. In this case, statistical information is used only to
identify the sign and intensity of an influence that exists in the expert’s opinion.
    If there are statistical data about some concepts X and Y in the form of time series, then the
Granger causality test can be used to check whether adding a connection between them is justified
[12].
    The idea of the test is as follows: if X influences Y, then a change in X must precede a change in Y,
but not vice versa. Moreover, the following two conditions must be met:
    • X must contribute significantly to the prediction of Y;
    • Y must not contribute significantly to the prediction of X.
    If every variable contributes significantly to the prediction of the other one, there are two options
possible:
    • there is a two-way causality between them;
    • there is a third variable influencing both.
    Two null hypotheses are checked in sequence in the Granger test:
    • “X does not Granger-cause Y”;
    • “Y does not Granger-cause X”.
    To test these hypotheses two regressions are built; in each of them, the regressand is one of the
variables tested for causality, and the regressors are the lags of both variables:
$$y_t = a_0 + a_1 y_{t-1} + \dots + a_p y_{t-p} + b_1 x_{t-1} + \dots + b_p x_{t-p} + \varepsilon_t; \tag{1}$$
$$x_t = c_0 + c_1 x_{t-1} + \dots + c_p x_{t-p} + d_1 y_{t-1} + \dots + d_p y_{t-p} + u_t. \tag{2}$$
   For each regression (1) and (2) the null hypothesis is that the coefficients of the lagged values of
the second variable simultaneously equal zero:
$$H_0^1: b_1 = \dots = b_p = 0; \tag{3}$$
$$H_0^2: d_1 = \dots = d_p = 0. \tag{4}$$
    To test hypotheses (3) and (4), an F-test should be performed. To conclude that X
influences Y, the first hypothesis must be rejected and the second one must not be rejected
(generally, at the 0.05 significance level).
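
    The test can be sketched in plain Python as follows. This is a minimal illustration, not a
definitive implementation: the function names `ols_rss` and `granger_f` and the synthetic series are
assumptions made for the example. The F statistic compares the regression without the lags of the
second variable (restricted) with the full regression (1) or (2) (unrestricted).

```python
import random


def ols_rss(rows, y):
    """Least-squares fit; returns (coefficients, residual sum of squares)."""
    k = len(rows[0])
    # Normal equations (X'X) beta = X'y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss


def granger_f(cause, effect, p):
    """F statistic for H0: the p lags of `cause` do not help predict `effect`."""
    n = len(effect)
    target = effect[p:]
    own = [[1.0] + [effect[t - k] for k in range(1, p + 1)] for t in range(p, n)]
    full = [row + [cause[t - k] for k in range(1, p + 1)]
            for row, t in zip(own, range(p, n))]
    _, rss_restricted = ols_rss(own, target)
    _, rss_unrestricted = ols_rss(full, target)
    df2 = len(target) - (2 * p + 1)  # observations minus estimated parameters
    f_stat = ((rss_restricted - rss_unrestricted) / p) / (rss_unrestricted / df2)
    return f_stat, p, df2


# Synthetic demonstration data (an assumption, not the data of this paper):
# y follows x with a one-step delay plus a small disturbance.
rng = random.Random(0)
x = [rng.random() for _ in range(60)]
y = [0.5] + [0.8 * x[t - 1] + 0.05 * rng.uniform(-1, 1) for t in range(1, 60)]

f_xy, df1, df2 = granger_f(x, y, p=1)  # tests "X does not Granger-cause Y"
f_yx, _, _ = granger_f(y, x, p=1)      # tests "Y does not Granger-cause X"
print(f_xy, f_yx, (df1, df2))
```

    Each statistic is then compared with the critical value of the F distribution with (df1, df2)
degrees of freedom at the chosen significance level, taken from a table or a statistics library.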



   The number of lagged variables included in the regressions influences the result of the test.
Therefore, it is recommended to perform the test for several values of p.
   Granger causality between variables does not guarantee a cause-and-effect relationship between
them; it only indicates that such a relationship is possible. Meanwhile, the absence of Granger
causality is taken to rule out such a relationship. In other words, Granger causality between time
series is a necessary but not sufficient condition for a cause-and-effect relationship between the
corresponding concepts. Thus, the final decision whether to add a connection to the cognitive model
remains with the expert.
   Suppose a cognitive model includes concepts $X_1, \dots, X_n$ and there are data about them in the
form of time series $x_t^1, \dots, x_t^n$. Then, at the structural identification stage, it is required
to establish between which pairs of concepts connections should be added. For this purpose, the
described test should be conducted between the series corresponding to each pair of concepts.
   Moreover, it is important to consider that influence between concepts can be realized not
only directly but also transitively. Granger causality will also be detected in the latter case, but
with a longer lag (i.e. at larger p) than under direct influence. Since the existence of a relationship
between concepts in a cognitive model means that a change of the cause concept state leads to a change
of the effect concept state in one step, the question of adding a connection should be raised only if
causality is detected between the time series at the minimum value of p.
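
   The pairwise procedure can be sketched as follows. The sketch assumes that "the minimum value of
p" means p = 1, and it takes the test itself as a hypothetical callback `is_causal(cause, effect, p)`
that returns True when the hypothesis "cause does not Granger-cause effect" is rejected at lag p. The
function name `candidate_links` and the stub outcomes are likewise illustrative.

```python
from itertools import permutations


def candidate_links(names, is_causal, max_p):
    """Propose direct links cause -> effect for an expert to review."""
    proposals = []
    for cause, effect in permutations(names, 2):
        for p in range(1, max_p + 1):
            one_way = is_causal(cause, effect, p) and not is_causal(effect, cause, p)
            if one_way:
                # Causality first seen at p > 1 is treated as transitive influence.
                if p == 1:
                    proposals.append((cause, effect))
                break
    return proposals


# Stub test outcomes: A -> B and B -> C at lag 1, A -> C only at lag 2.
detected = {("A", "B", 1), ("A", "B", 2), ("B", "C", 1), ("B", "C", 2), ("A", "C", 2)}
links = candidate_links(["A", "B", "C"], lambda c, e, p: (c, e, p) in detected, max_p=2)
print(links)
```

   In this stub the pair (A, C) is not proposed, because causality between A and C appears only at the
longer lag and is therefore attributed to the chain through B. The proposals remain candidates; the
decision to add each connection still rests with the expert.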

5. Parametric identification based on distributed-lag time series model
At the parametric identification stage, it is necessary to determine signs and weights of all connections
between concepts added to the model following the results of the structural identification stage.
   When choosing a time series model for weighting FCM connections, it is necessary to match it to the
impulse process model which is supposed to be used for the dynamic analysis of the map under study.
Research on various impulse process models can be found in one of the authors’ previous papers [13].
   The proposed approach is described below using the most common impulse process model as an example,
namely an additive model with absolute changes. Within this model, a change of the state of
concept Y at a given step t is supposed to be determined (apart from control and external actions) by
absolute changes of the influencing concept states at the previous step (t – 1). Meanwhile, previous
state changes of concept Y itself are not taken into account. Considering this, for the simplest case
(when concept Y is influenced by a single concept X) we obtain the expression:
$$\Delta y_t = a_1 \Delta x_{t-1} + \varepsilon_t, \tag{5}$$
where $\Delta y_t = y_t - y_{t-1}$; $\Delta x_{t-1} = x_{t-1} - x_{t-2}$; $a_1$ is a coefficient
determining the intensity of influence transmission from X to Y; $\varepsilon_t$ is an error.
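
   To make the recursion in (5) concrete, the following Python sketch generates the states of Y from
the states of X with the error term omitted. The function name `simulate_effect`, the starting state
and the series are made up for the illustration.

```python
def simulate_effect(x, a1, y_start):
    """Generate y from model (5) without the error term:
    y[t] = y[t-1] + a1 * (x[t-1] - x[t-2])."""
    y = [y_start, y_start]  # y[0], y[1]: no earlier change of x is known
    for t in range(2, len(x)):
        y.append(y[t - 1] + a1 * (x[t - 1] - x[t - 2]))
    return y


x = [0.2, 0.3, 0.5, 0.5, 0.4]  # made-up states of the cause concept
y = simulate_effect(x, a1=0.5, y_start=0.1)
print(y)
```

   Each change of X reaches Y one step later, scaled by the coefficient a1.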
   It is easily seen that model (5) is equivalent to the following:
$$y_t = a_0 + a_1 x_{t-1} + u_t, \tag{6}$$
where $a_0$ is an absolute term; $u_t$ is an error.
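
   The equivalence can be checked by summing the increments. A short sketch, under the assumption that
(5) holds for every step from 2 to t and that $x_0$ is the earliest observed state of X:

```latex
\begin{aligned}
y_t - y_1 &= \sum_{s=2}^{t} \Delta y_s
           = a_1 \sum_{s=2}^{t} \Delta x_{s-1} + \sum_{s=2}^{t} \varepsilon_s
           = a_1 (x_{t-1} - x_0) + \sum_{s=2}^{t} \varepsilon_s, \\
y_t &= \underbrace{(y_1 - a_1 x_0)}_{a_0} + a_1 x_{t-1}
       + \underbrace{\sum_{s=2}^{t} \varepsilon_s}_{u_t}.
\end{aligned}
```

   Conversely, taking first differences of (6) gives $\Delta y_t = a_1 \Delta x_{t-1} + (u_t - u_{t-1})$,
so both forms share the same coefficient $a_1$; only the error terms differ.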
    Model (6) is a special case of a distributed-lag time series model (DL), which, in
turn, can be represented as a special case of an autoregressive distributed lag model (ADL) [12].
    This model can be estimated by the least squares method (in fact, it is a regression of the values
of concept Y on the previous values of the influencing concept X), which yields the desired value of
a1 (the regression coefficient).
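
    A minimal Python sketch of this estimation step is given below. It uses the closed-form solution
of simple least squares; the function name `fit_lagged_regression` and the series are assumptions made
for the example.

```python
def fit_lagged_regression(x, y):
    """Least-squares fit of y[t] = a0 + a1 * x[t-1] + u[t]; returns (a0, a1, r2)."""
    cause = x[:-1]   # x[t-1] for t = 1 .. n-1
    effect = y[1:]   # y[t]   for t = 1 .. n-1
    n = len(effect)
    mx, my = sum(cause) / n, sum(effect) / n
    sxx = sum((c - mx) ** 2 for c in cause)
    sxy = sum((c - mx) * (e - my) for c, e in zip(cause, effect))
    syy = sum((e - my) ** 2 for e in effect)
    a1 = sxy / sxx
    a0 = my - a1 * mx
    r2 = (sxy * sxy) / (sxx * syy)  # coefficient of determination
    return a0, a1, r2


# Made-up series in which y follows x exactly with a one-step delay.
x = [0.10, 0.40, 0.30, 0.70, 0.50, 0.90, 0.60]
y = [0.00] + [0.2 + 0.5 * v for v in x[:-1]]
a0, a1, r2 = fit_lagged_regression(x, y)
print(round(a0, 3), round(a1, 3), round(r2, 3))
```

    The estimate a1 is still a regression coefficient, not yet a connection weight.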
    The described model generalizes naturally to the case of several influencing concepts, in which
case we obtain a multiple regression model. The application of this model to parametric identification
was examined in detail by the authors in [10].
    The transition from regression coefficients to connection weights should follow the same principle
as in regression analysis. This principle was also described in detail in [10].




6. Experimental validation of the proposed approaches to structural and parametric
identification
Suppose concepts X and Y are added to a fuzzy cognitive model, and there is statistical information
about them in the form of time series. Time series corresponding to the concepts are illustrated by
graphs in figure 1.




                            Figure 1. Time series corresponding to concepts X and Y.

    According to the described approach to structural identification and taking p = 1, let us
construct the following regressions:
$$y_t = a_0 + a_1 y_{t-1} + b_1 x_{t-1} + \varepsilon_t; \tag{7}$$
$$x_t = c_0 + c_1 x_{t-1} + d_1 y_{t-1} + u_t. \tag{8}$$
    Estimating models (7) and (8) by the least squares method, we obtain:
$$y_t = 0.011 + 0.053\, y_{t-1} + 0.858\, x_{t-1}; \tag{9}$$
$$x_t = 0.61 - 0.31\, x_{t-1} - 0.159\, y_{t-1}. \tag{10}$$
    Next, for each of the regressions (9) and (10), it is necessary to test the hypothesis that the
coefficient of the lagged second variable equals zero, that is $H_0^1: b_1 = 0$ and $H_0^2: d_1 = 0$.
The F-test shows that the first hypothesis is rejected at the 0.05 significance level, while the second
one is not rejected. Thus, X Granger-causes Y at p = 1.
    Detection of Granger causality between these concepts at the minimum value of p is a reason to
ask the expert whether a connection directed from concept X to concept Y should be added to the fuzzy
cognitive model.
    Suppose the expert decides to add such a connection to the model. In this case, the same
statistical data about the concepts that were used at the previous stage can be applied to identify the
sign and intensity of influence between the concepts. For this, let us build the following model using
the existing time series:
$$y_t = a_0 + a_1 x_{t-1} + u_t. \tag{11}$$
    Having estimated the model by the least squares method, we obtain a1 = 0.857. The coefficient of
determination R² of model (11) equals 0.9, which indicates its acceptable quality. Besides, the obtained
value a1 is significant according to Student’s t-test. With the help of the transformations described
in [10], let us pass from the regression coefficient obtained to the intensity of influence of concept
X on concept Y. As a result, we obtain wXY = 0.88 (with normalizing function parameter b = 3).

7. Conclusion
The paper deals with problems of structural and parametric identification of fuzzy cognitive models
and the existing problem-solving techniques: expert and statistical. The viability of new approaches to
solving these problems on the basis of time series analysis is substantiated. An approach to solving the
problem of structural identification is proposed, based on Granger causality test. Also a possible
approach to parametric identification on the basis of distributed-lag time series model is studied. The
results of experimental validation of the proposed approaches are presented.
   Let us consider possible directions for further research which are of major interest.


    First, one of the features of data represented in the form of time series is that, besides the
measurements themselves (levels of a series), there is information about the real moments in time at
which these measurements were obtained. Knowing the time difference between two successive levels of a
time series (and consequently knowing the time it takes an influence to spread between directly
connected concepts), we can approximately relate model time steps to the real time of the simulated
system and thus make the predictions resulting from the dynamic (scenario) analysis of the cognitive
model more accurate and specific.
    Second, it is worthwhile developing the existing impulse process models towards taking account of
different rates of influences: between different pairs of concepts, influences can spread at a variable
speed (at a varying number of simulation steps). Meanwhile, rates of influence spread between pairs of
concepts are determined on the basis of time series pairs corresponding to them.
    Finally, the use of statistical data in the form of time series for fuzzy cognitive model
identification opens up new opportunities for model verification. If sufficient statistical data are
available, model identification can be performed using only a part of them, and the rest can be used
for verification. The degree of model adequacy is then determined by how accurately dynamic simulation
reproduces the data on which the model was trained and by how well it predicts the data that were not
used for training.
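
    One possible form of such a verification is sketched below in Python, assuming a chronological
split and the single-influence model y[t] = a0 + a1 * x[t-1]. The function name `holdout_rmse`, the
split point and the series are hypothetical.

```python
def holdout_rmse(x, y, n_train):
    """Fit y[t] = a0 + a1 * x[t-1] on the first n_train pairs and return
    (a1, RMSE of one-step predictions on the remaining pairs)."""
    pairs = list(zip(x[:-1], y[1:]))  # (x[t-1], y[t])
    train, test = pairs[:n_train], pairs[n_train:]
    mx = sum(c for c, _ in train) / len(train)
    my = sum(e for _, e in train) / len(train)
    a1 = (sum((c - mx) * (e - my) for c, e in train)
          / sum((c - mx) ** 2 for c, _ in train))
    a0 = my - a1 * mx
    errors = [(e - (a0 + a1 * c)) ** 2 for c, e in test]
    return a1, (sum(errors) / len(errors)) ** 0.5


# Made-up series: y follows x exactly with a one-step delay.
x = [0.1, 0.4, 0.3, 0.7, 0.5, 0.9, 0.6, 0.2, 0.8, 0.35, 0.55]
y = [0.0] + [0.2 + 0.5 * v for v in x[:-1]]
a1, rmse = holdout_rmse(x, y, n_train=7)
print(a1, rmse)
```

    The split is chronological rather than random, so that the held-out part plays the role of future
observations with respect to the part used for identification.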

8. References
[1] Avdeeva Z K, Kovriga S V and Makarenko D I 2007 Cognitive modeling approach to control of
      semi-structured systems (situations) Managing Large Systems 16 26-39
[2] Borisov V V, Kruglov V V and Fedulov A S 2012 Fuzzy Models and Networks (Moscow:
      “Goryachaya Liniya – Telekom” Publisher)
[3] Sylov V B 1995 Strategic Decision Making in Fuzzy Environment (Moscow: “INPRO-RES”
      Publisher)
[4] Roberts F S 1976 Discrete Mathematical Models with Application to Social, Biological and
      Environmental Problems (Englewood Cliffs, N.J.: Prentice-Hall)
[5] Erokhin D V, Lagerev D G, Laricheva E A and Podvesovskii A G 2010 Strategic Enterprise
      Innovation Management: Monograph (Bryansk: Bryansk State Technical University Press)
[6] Denisova A Y and Sergeev V V 2015 Impulse response identification for remote sensing
      images using gis data Computer Optics 39(4) 557-563 DOI: 10.18287/0134-2452-2015-39-4-557-563
[7] Averchenkov V I, Kozhukhar V M, Podvesovskii A G and Sazonova A S 2010 Monitoring and
      Prediction of Regional Demand for Highest Scientific Degree Specialists: Monograph
      (Bryansk: Bryansk State Technical University Press)
[8] Makarova E A, Gabdullina E R, Zakieva E Sh and Valiullina K M 2016 Algorithms for
      intelligent analysis of life quality in the domain of public health on a regional level Proc. of the
      6th Int. Conf. on Information Technologies for Intelligent Decision Making Support 2 222-228
[9] Makarova E A, Zakieva E Sh, Gabdullina E R and Makhmutova A E 2016 Knowledge
      generation algorithms for construction of cognitive model of life quality in the domain of high
      education on a regional level Proc. of the 6th Int. Conf. on Information Technologies for
      Intelligent Decision Making Support 2 54-59
[10] Podvesovskii A G and Isaev R A 2016 Application of multiple regression analysis for
      parametric identification of fuzzy cognitive models Proc. of the 6th Int. Conf. on Information
      Technologies for Intelligent Decision Making Support 2 28-33
[11] Rogachyov A F and Melikhova E V 2014 Problems of statistical estimation of cognitive map
      characteristics on the basis of correlation analysis Proc. of the Int. Conf.
      “Physico-Mathematical Sciences: Theory and Practice” 55-62
[12] Magnus Ya R, Katyshev P K and Persetskii A A 2004 Econometrics: Basic Course (Moscow:
      “Delo” Publisher)
[13] Isaev R A and Podvesovskii A G 2017 Generalized model of pulse process for dynamic analysis
      of Sylov’s fuzzy cognitive maps CEUR Workshop Proceedings 1904 57-63

