Hybrid Neo-Fuzzy Neural Networks Based on Self-Organization and Their Application for Forecasting in Financial Sphere

Yuriy Zaychenko a and Galib Hamidov b
a National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Peremohy av. 37, Kyiv, 03056, Ukraine
b Information Technologies Department, Azershiq, str. K. Kazimzade 20, Baku, AZ1008, Azerbaijan

Abstract
In this paper a new class of deep learning networks, the cascade neo-fuzzy neural network (CNFNN), is considered and investigated. A neo-fuzzy neuron with two inputs is used as a node of the hybrid network. Experimental investigations were carried out during which the optimal parameters of the neo-fuzzy network were found: the number of inputs, the training/test sample ratio and the number of linguistic variables. The Group Method of Data Handling (GMDH) was used for constructing the optimal structure of the deep hybrid network. Experimental investigations of the deep NFNN were carried out on the problem of forecasting the market index of the German stock exchange and Google share prices. Comparison experiments of the deep neo-fuzzy network against the alternative methods, GMDH and the conventional cascade neo-fuzzy network, were carried out, and the efficiency of the suggested hybrid NFNN was estimated.

Keywords
Deep learning, GMDH, cascade neo-fuzzy network, parameters and structure optimization.

1. Introduction

Nowadays the problems of forecasting share prices and market indicators attract great attention from investors and managers of investment funds. For forecasting at financial markets, methods of regression analysis such as ARIMA, ARCH and GARCH models, as well as exponential smoothing, are usually applied [1]. In recent years fuzzy neural networks (FNN) have been suggested for the solution of this problem [5-7]. Their main advantages are the capability to work with fuzzy, incomplete and qualitative information and to utilize expert knowledge. Besides, FNNs possess high approximation properties due to the FAT theorem [5] and are interpretable. But to apply an FNN to forecasting problems it is necessary to train the rule base and the membership functions of the fuzzy rules. This demands large computational resources and much training time.

In recent years a new class of FNN, the cascade neo-fuzzy neural networks (CNFNN), has appeared [8]. Their main advantage is that there is no need to train membership functions: only the rule weights are trained on the input sample. This enables one to substantially cut the training time and computational expense and to apply this class of FNN to high-dimensional problems (Big Data).

Usually the training of a neural network means the adjustment of the weights between neurons. But the efficiency of training can be substantially improved if not only the neuron weights but also the network structure is adapted using the training sample. For this aim the application of the Group Method of Data Handling (GMDH) seems very promising. GMDH is based on the principle of self-organization and enables the construction of the model structure in the process of its run. Besides, simple sub-models of only two variables, called partial descriptions, are used as building blocks of the model [2-4]. That allows GMDH to work with short training samples.
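To make the notion of a partial description concrete, a common choice in the GMDH literature is Ivakhnenko's two-variable quadratic polynomial y = a0 + a1 xi + a2 xj + a3 xi xj + a4 xi^2 + a5 xj^2, fitted by least squares on the training subsample. The following minimal Python sketch (the function names are ours, for illustration only) shows such a node:

    import numpy as np

    def _design(xi, xj):
        # Regressor matrix of the two-variable quadratic partial description
        return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])

    def fit_partial_description(xi, xj, y):
        # Only six coefficients have to be estimated, hence short samples suffice
        coeffs, *_ = np.linalg.lstsq(_design(xi, xj), y, rcond=None)
        return coeffs

    def eval_partial_description(coeffs, xi, xj):
        return _design(xi, xj) @ coeffs

Because each node has only six tunable coefficients, it can be estimated reliably even from a short training sample; the network as a whole is then grown from such nodes layer by layer.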
For the first time, the application of GMDH to construct the structure of neural networks and to train their weights was suggested by A. G. Ivakhnenko and his collaborators (so-called "active neurons") [2-4]. In subsequent works the GMDH method was successfully applied to the construction of hybrid neuro-fuzzy networks with kernel activation functions [9], spiking neurons [10], wavelet functions [11, 12] and other classes of fuzzy neural networks [13]. But in these works an important property of GMDH, the use of basic models with only two inputs and a small number of tunable parameters, was not exploited. This property is very important for deep learning fuzzy networks, since it enables one to cut the number of adapted parameters and the training time as well.

The goal of this paper is to find the optimal parameters of a deep CNFNN, construct its structure using GMDH, investigate its efficiency in the problem of forecasting share prices at a stock exchange, and compare its efficiency with a CNFNN of standard structure.

2. Experimental investigations of deep learning neo-fuzzy neural networks

The goal of the experiments was to analyze the forecasting efficiency of the deep learning neo-fuzzy neural network (NFNN) in the problem of forecasting share prices and the market index of the German stock exchange DAX, in particular to find the optimal parameters of the neo-fuzzy network: the number of inputs, the number of fuzzy membership functions and the optimal structure of the deep NFNN. As input data, the average monthly values of the market index DAX in the period from January 2010 to December 2016 were taken. The total sample size was 80 elements (average monthly values). The input data are presented in Table 1.

Table 1
Dynamics of the German stock exchange index

Month       2010       2011       2012      2013      2014      2015      2016
January     100        112,5032   96,8841   123,189   154,8426  140,4772  127,3783
February    91,32258   119,0592   107,396   122,1897  155,3302  148,9331  123,2948
March       96,74254   116,4645   109,996   122,595   154,2723  152,4478  131,1697
April       99,67204   124,9138   105,958   120,2768  156,6237  154,271   135,7509
May         89,38131   125,5138   98,0491   128,8783  159,2462  154,6962  135,1146
June        88,73958   123,1834   92,7264   127,6049  161,3321  150,7816  132,4843
July        92,74674   124,5796   96,2023   127,7148  157,7584  148,275   131,6502
August      94,39056   101,4666   103,017   132,616   147,5007  144,4888  141,021
September   97,24559   88,79165   111,930   135,6962  148,5193  134,0826  140,8114
October     106,9113   96,40158   112,958   143,4353  136,0243  136,9621  139,7606
November    109,9191   94,39673   111,048   147,9172  141,4704  140,8389  136,0866
December    110,4825   92,2427    118,689   151,2206  144,6012  138,8777  139,7837

Network training was performed by the gradient descent method with adaptive step and by the Widrow-Hoff method (6) in sequential mode (see the previous section). The goal of the experiments was to find the optimal parameters of the NFNN. In the experiments the following parameters were varied: the number of inputs (prehistory length), the number of layers, the number of linguistic variables (fuzzy sets per variable), the number of rules, and the training/test sample ratio Ntrain/Ntest (%).

In the first experiment the number of layers was varied under different training/test sample ratios and its influence on the forecasting accuracy was explored. The corresponding results are presented in Table 2. In the notation CNFNN(m, n, k) the first index m indicates the number of layers, the second index n the number of inputs, and the third index k the number of linguistic variables.
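Before turning to the results, it may help to recall what is being trained in each node. A neo-fuzzy neuron passes every input through fixed triangular membership functions and combines the membership degrees with trainable linear weights, so the Widrow-Hoff (LMS) rule applies directly to the weights. The sketch below is a minimal illustration under simplifying assumptions (inputs scaled to [0, 1], uniformly spaced membership functions, a fixed learning rate eta in place of the adaptive step), not the authors' exact implementation:

    import numpy as np

    def triangular_mfs(x, centers):
        # Membership degrees of scalar x in triangular functions over the sorted
        # centers; each pair of neighbouring functions sums to 1 on its segment
        mu = np.zeros(len(centers))
        for j in range(len(centers) - 1):
            left, right = centers[j], centers[j + 1]
            if left <= x <= right:
                mu[j] = (right - x) / (right - left)
                mu[j + 1] = 1.0 - mu[j]
        return mu

    class NeoFuzzyNeuron:
        # y = sum_i sum_j w[i, j] * mu_j(x_i); membership functions are fixed,
        # only the weights w are trained
        def __init__(self, n_inputs, n_mfs):
            self.centers = np.linspace(0.0, 1.0, n_mfs)  # inputs assumed in [0, 1]
            self.w = np.zeros((n_inputs, n_mfs))

        def forward(self, x):
            self.mu = np.array([triangular_mfs(xi, self.centers) for xi in x])
            return float(np.sum(self.w * self.mu))

        def train_step(self, x, target, eta=0.1):
            e = target - self.forward(x)  # Widrow-Hoff (LMS) update of weights
            self.w += eta * e * self.mu
            return e

    # Example: a node with two inputs and four linguistic variables per input
    neuron = NeoFuzzyNeuron(n_inputs=2, n_mfs=4)
    for x, y in [((0.3, 0.7), 0.5), ((0.6, 0.2), 0.4)]:
        neuron.train_step(np.array(x), y)

Because the output is linear in the weights, each training step is cheap; this is the source of the low computational cost noted in the comparison with ANFIS below.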
Table 2
Dependence of forecasting accuracy on the number of network layers

Data point  Real      CNFNN(2,4,4)  CNFNN(3,4,4)  CNFNN(4,4,4)  CNFNN(5,4,4)
31          107.7221  130.2851      93,3602       110,1878      94,9563
32          110.0536  130.4976      93,0511       109,9537      94,6139
33          109.9942  131.1365      92,7585       109,7220      94,2873
34          107.4861  131.82        92,4814       109,4927      93,9759
35          105.812   132.1484      92,219        109,2658      93,6788
36          108.7387  132.4245      91,9706       109,0412      93,3955
37          109.2837  133.0117      91,7354       108,8190      93,1252
38          105.0767  133.6728      91,5126       108,5990      92,8674
39          102.8346  134.0803      91,3017       108,3813      92,6215
40          105.3798  134.4172      91,1021       108,1659      92,3870
MAPE, %     -         25.78         23,8          14,3          15,6

The corresponding dependence of the MAPE criterion on the number of layers is presented in Figure 1.

Figure 1: Dependence of MAPE (%) on the number of layers for the ratio Ntrain/Ntest = 50/50

As follows from Figure 1, the optimal number of layers for the considered problem equals 4. Further, the dependence of MAPE on the number of inputs was investigated. The corresponding results are presented in Figure 2 for the ratio Ntrain/Ntest = 50/50.

Figure 2: Dependence of the MAPE criterion (%) on the number of inputs

As follows from the presented results, an optimal number of inputs exists which, in the general case, depends on the ratio Ntrain/Ntest. The optimal number of inputs for the ratio Ntrain/Ntest = 50/50 equals 4.

An important parameter of a deep NFNN is the number of linguistic variables (fuzzy sets per variable). The corresponding investigations were carried out and the results are presented in Figure 3.

Figure 3: Dependence of the MAPE criterion (%) on the number of linguistic variables

The presented results show that the optimal number of linguistic variables equals 4 for the considered problem.

The dependence of the forecasting accuracy on the ratio Ntrain/Ntest was also investigated. The corresponding results of forecasting accuracy versus the number of layers for different ratios Ntrain/Ntest are presented in Figure 4 and in Table 3.

Table 3
Forecasting accuracy dependence on the number of layers

Ntrain/Ntest   50-50    60-40   70-30     80-20    90-10
CNFNN(2,4,4)   25,78%   20,2%   11,052%   7,0012%  3,5213%
CNFNN(3,4,4)   23,8%    17,3%   10,5341%  5,9654%  3,2592%
CNFNN(4,4,4)   16,3%    15,4%   9,6584%   4,4325%  3,1952%
CNFNN(5,4,4)   19,6%    19,2%   12,9532%  6,3454%  3,2421%

Figure 4: Dependence of MAPE (%) on the number of layers for different ratios Ntrain/Ntest

The found dependences of the MAPE criterion on the number of inputs for different ratios Ntrain/Ntest are presented in Table 4.

Table 4
Forecasting accuracy MAPE (%) dependence on the number of inputs

Ntrain/Ntest  50-50   60-40     70-30     80-20    90-10
NFNN(4,3,4)   17,7%   17,642%   10,5329%  4,8543%  4,5213%
NFNN(4,4,4)   15,5%   15,4%     9,6584%   4,4325%  3,1952%
NFNN(4,5,4)   19,7%   16,5922%  8,5811%   3,2151%  1,6819%
NFNN(4,6,4)   21,5%   19,6483%  8,6954%   4,9623%  1,7651%

The forecasting accuracy versus the number of linguistic variables was also investigated for different ratios Ntrain/Ntest; the results are presented in Figure 5.

Figure 5: Dependence of MAPE (%) on the number of linguistic variables

As follows from the presented results, for each class of financial processes there exists an optimal number of layers of the NFNN. With its further increase the MAPE criterion on the test sample begins to grow or stops changing. This complies well with the GMDH principle of self-organization [2-4]. A similar dependence was detected for the number of inputs and for the number of linguistic variables.
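All accuracies above and below are measured by the MAPE criterion. For reference, a minimal computation of MAPE over the ten test points listed in Table 2 (CNFNN(2,4,4) column) might look as follows; note that Table 2 shows only a fragment of the test sample, so the value obtained here (about 23.5 %) differs from the 25.78 % reported for the full sample:

    import numpy as np

    def mape(actual, forecast):
        # Mean absolute percentage error, in percent
        actual, forecast = np.asarray(actual), np.asarray(forecast)
        return 100.0 * np.mean(np.abs((actual - forecast) / actual))

    real = [107.7221, 110.0536, 109.9942, 107.4861, 105.812,
            108.7387, 109.2837, 105.0767, 102.8346, 105.3798]
    cnfnn_2_4_4 = [130.2851, 130.4976, 131.1365, 131.82, 132.1484,
                   132.4245, 133.0117, 133.6728, 134.0803, 134.4172]
    print(round(mape(real, cnfnn_2_4_4), 2))  # 23.49 on these ten points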
Further, similar forecasting experiments were carried out with the FNN ANFIS. The number of inputs and the number of linguistic values were both taken equal to 4. An efficiency comparison with the deep neo-fuzzy network with similar parameter values, NFNN(4,4,4), was performed. The corresponding results for both networks are presented in Table 5.

Table 5
Comparison of forecasting accuracy of NFNN and FNN ANFIS

MAPE              50-50   60-40    70-30    80-20    90-10
FNN ANFIS         19,7%   17,65%   12,54%   6,9554%  4,5614%
Deep NFNN(4,4,4)  15,5%   15,4%    9,6584%  4,4325%  3,1952%

As the results of the comparison show, the deep neo-fuzzy neural network has higher forecasting accuracy than ANFIS. Additional advantages of the NFNN are its lower computational complexity and shorter training time owing to the absence of the need to adjust membership functions. These properties enable the use of the NFNN in Big Data forecasting problems.

In the course of the investigations GMDH was applied to the construction of the optimal structure of the hybrid cascade network. In this research the close prices of Google shares from August to December 2019 were forecasted. The hybrid network structure obtained by the GMDH generation algorithm is presented in Figure 6 [14].

Inputs:  X1, X2, X3, X4, X5
Layer 1: A0 = f(X1, X2); A1 = f(X1, X3); A2 = f(X2, X4); A3 = f(X1, X5); A4 = f(X3, X4); A5 = f(X2, X5)
Layer 2: B0 = f(A0, A4); B1 = f(A1, A2); B2 = f(A1, A5); B3 = f(A3, A4)
Layer 3: C0 = f(B0, B3); C1 = f(B1, B2)
Layer 4: D0 = f(C0, C1)

Figure 6: The optimal structure of the hybrid neo-fuzzy network

The optimal structure generated by GMDH was the following: 6 neurons (A0, A1, A2, A3, A4, A5) at the first layer, 4 neurons (B0, B1, B2, B3) at the second layer, 2 neurons (C0, C1) at the third layer and one neuron (D0) at the last layer. All 5 inputs were used in the structure.
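To give the flavor of how such a structure emerges, the sketch below outlines layer-wise GMDH-style structure generation: candidate nodes are built for pairs of the previous layer's outputs, ranked by their error on the test subsample (the external criterion), and only the best F survive; growth stops when the criterion ceases to improve. This is an illustrative simplification (a linear two-input node stands in for the neo-fuzzy neuron, and a fixed freedom-of-choice F is used, whereas the run above narrowed the layer widths from 6 to 4 to 2 to 1), not the exact algorithm of [14]:

    from itertools import combinations
    import numpy as np

    def fit_node(X2, y):
        # Stand-in for training a two-input node: least squares w0 + w1*xi + w2*xj
        A = np.column_stack([np.ones(len(X2)), X2])
        w, *_ = np.linalg.lstsq(A, y, rcond=None)
        return w

    def predict_node(w, X2):
        return np.column_stack([np.ones(len(X2)), X2]) @ w

    def gmdh_structure(Xtr, ytr, Xte, yte, F=6):
        # Greedy layer-wise construction; the test subsample serves as the
        # external criterion of GMDH self-organization
        layers, best_err = [], np.inf
        while Xtr.shape[1] >= 2:
            candidates = []
            for i, j in combinations(range(Xtr.shape[1]), 2):
                node = fit_node(Xtr[:, [i, j]], ytr)
                err = np.mean((predict_node(node, Xte[:, [i, j]]) - yte) ** 2)
                candidates.append((err, (i, j), node))
            candidates.sort(key=lambda c: c[0])
            survivors = candidates[:F]
            if survivors[0][0] >= best_err:  # criterion stopped improving: stop growth
                break
            best_err = survivors[0][0]
            layers.append([(pair, node) for _, pair, node in survivors])
            # Outputs of the surviving nodes become the inputs of the next layer
            Xtr = np.column_stack([predict_node(n, Xtr[:, list(p)]) for _, p, n in survivors])
            Xte = np.column_stack([predict_node(n, Xte[:, list(p)]) for _, p, n in survivors])
        return layers

With five inputs, the first layer evaluates all ten pairs f(Xi, Xj) and keeps the six best, which matches the A0, ..., A5 pattern seen in Figure 6.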
3. Conclusion

1. In this paper a new class of deep learning networks, the hybrid cascade neo-fuzzy neural network (NFNN) based on GMDH, was developed and explored on the problem of forecasting the market indicator of the German stock exchange and Google share prices. In this type of deep network a neo-fuzzy neuron with two inputs is used as a node. Experimental explorations were carried out during which the optimal parameters of the hybrid neo-fuzzy network were determined.
2. The problem of optimal structure generation of the hybrid cascade network was considered, and for its solution the GMDH method was applied and investigated.
3. Comparative experiments of the deep hybrid network against the alternative methods, GMDH and the cascade network, were carried out; the forecasting efficiency of the suggested hybrid network was estimated and proved to be the best.
4. The experiments showed that the developed deep hybrid neo-fuzzy network is very promising for forecasting in the financial sphere. Besides, it is free from the typical drawbacks of conventional deep learning networks.

4. References

[1] L. Lewis, Methods of forecasting economic indicators (transl. from English). Finance and Statistics, Moscow, 1986.
[2] A. G. Ivakhnenko, G. A. Ivakhnenko, J. A. Mueller, Self-organization of the neural networks with active neurons. Pattern Recognition and Image Analysis 4, 2 (1994): 177-188.
[3] A. G. Ivakhnenko, D. Wuensch, G. A. Ivakhnenko, Inductive sorting-out GMDH algorithms with polynomial complexity for active neurons of neural networks. Neural Networks 2 (1999): 1169-1173.
[4] G. A. Ivakhnenko, Self-organization of neuronet with active neurons for effects of nuclear test explosions forecasting. System Analysis Modeling Simulation 20 (1995): 107-116.
[5] M. Zgurovsky, Yu. Zaychenko, Fundamentals of Computational Intelligence: System Approach. Springer, 2016.
[6] L.-X. Wang, J. M. Mendel, Fuzzy basis functions, universal approximation, and orthogonal least-squares learning. IEEE Trans. on Neural Networks 3, 5 (1992): 807-814.
[7] J.-S. Jang, ANFIS: Adaptive-network-based fuzzy inference systems. IEEE Trans. on Systems, Man, and Cybernetics 23 (1993): 665-685.
[8] T. Yamakawa, E. Uchino, T. Miki, H. Kusanagi, A neo-fuzzy neuron and its applications to system identification and prediction of the system behavior, in: Proceedings of the 2nd Intern. Conf. on Fuzzy Logic and Neural Networks "IIZUKA-92", Iizuka, 1992, pp. 477-483.
[9] Ye. Bodyanskiy, N. Teslenko, P. Grimm, Hybrid evolving neural network using kernel activation functions, in: Proc. 17th Zittau East-West Fuzzy Colloquium, Zittau/Goerlitz, HS, 2010, pp. 39-46.
[10] Ye. Bodyanskiy, O. Vynokurova, A. Dolotov, Self-learning cascade spiking neural network for fuzzy clustering based on Group Method of Data Handling. J. of Automation and Information Sciences 45, 3 (2013): 23-33.
[11] Ye. Bodyanskiy, O. Vynokurova, A. Dolotov, O. Kharchenko, Wavelet-neuro-fuzzy network structure optimization using GMDH for solving forecasting tasks, in: Proceedings of the 4th Int. Conf. on Inductive Modelling ICIM 2013, Kyiv, 2013, pp. 61-67.
[12] Ye. Bodyanskiy, O. Vynokurova, N. Teslenko, Cascade GMDH-wavelet-neuro-fuzzy network, in: Proceedings of the 4th Int. Workshop on Inductive Modelling "IWIM 2011", Kyiv, Ukraine, 2011, pp. 22-30.
[13] Ye. Bodyanskiy, O. Boiko, Yu. Zaychenko, G. Hamidov, Evolving hybrid GMDH-neuro-fuzzy network and its applications, in: Proceedings of the International Conference SAIC 2018, Kyiv, Ukraine, 2018.
[14] Ye. Bodyanskiy, Yu. Zaychenko, O. Boiko, G. Hamidov, A. Zelikman, The hybrid GMDH-neo-fuzzy neural network in forecasting problems in financial sphere, in: Proceedings of the International Conference IEEE SAIC 2020, Kyiv, Ukraine, 2020.