Genetic Algorithm Neural Network model vs Backpropagation Neural Network model for GDP Forecasting

Dezdemona Gjylapi, Eljona Proko, Alketa Hyso
Computer Science Dept., University “Ismail Qemali”, Vlorë
dezdemona.gjylapi@univlora.edu.al, eljona.proko@univlora.edu.al, alketa.hyso@univlora.edu.al

Abstract

This paper evaluates the usefulness of neural networks in GDP forecasting. It compares a neural network model trained with a genetic algorithm (GANN) to a backpropagation neural network model (BPNN), both used to forecast the GDP of Albania. GDP forecasting is of particular importance for decision-making in the field of economy. The conclusion is that the GANN model achieves higher accuracy in GDP forecasting: the GA improves ANN model performance compared with the standard backpropagation ANN model.

1. Introduction

Quarterly GDP data are important for economic analysis, because they give insight into general economic activity, the fluctuations of the business cycle and the economic turning points. Forecasts of macroeconomic variables such as GDP play an important role in monetary policy decisions and in assessing the future economic situation. Economic policymakers and analysts can adapt their theoretical analysis of economic conditions according to the forecasts of macroeconomic variables, or use them as support for and an explanation of that analysis. Better-performing forecasts of macroeconomic variables lead to better decisions.

To assess the economic situation of the country, policy institutions need information on GDP, in order to analyze the macroeconomic policies implemented in the past and to take policy decisions about the future.

Tkacz and Hu (1999) [TKA99] studied the forecasting of Canadian GDP. The result of their study is that the best neural network models outperform the best linear models by between 15 and 19 per cent, implying that neural network models can be exploited for noticeable gains in forecast accuracy.

Giovanis (2009) [GIO09] uses ARIMA and ANN models to predict the rate of economic growth in the USA. His study examines the estimation and forecasting performance of ARIMA models in comparison with some of the most popular and common neural network models. The results indicate that the neural network models outperform ARIMA forecasting.

Çeliku, Kristo and Boka (2009), in their discussion paper, treat several models to forecast quarterly GDP in Albania. These consist of ARIMA models with seasonal components and indicator models, similar to bridge models. Their paper presents a first attempt to model the GDP using a multi-equation system which accounts for the sectoral interactions [ÇEL09].

Models that predict GDP in the short term use survey indicators because these indicators have several advantages:

1. They provide preliminary signals about short-term developments in the activities of economic agents.
2. They are published earlier than the main macroeconomic aggregates.
3. They are rarely subject to adjustments.

Because quarterly GDP is difficult to model, and because ANNs have proven to be an efficient tool for the non-parametric modeling of data in the form of a non-linear function, we have developed a model for forecasting Albania’s Gross Domestic Product with an artificial neural network approach. In this paper, we forecast the GDP using a neural network model trained by a genetic algorithm (GANN) and compare it with a backpropagation neural network model, also used to predict Albania’s GDP.

The rest of the paper is organized as follows.
Section 2 presents a brief description of the ANN for GDP forecasting, Section 3 explains the details of the GANN model, Section 4 compares the results of the GANN and BPNN models, and finally, Section 5 presents our conclusions.

2. Artificial Neural Networks for GDP Forecasting

The advantage of the artificial neural network approach is that the model can capture relationships in nonlinear data, especially if the economy is very volatile, and it is superior when forecasting chaotic data [IMA11].

An ANN consists of a number of simple, interconnected processors, also called neurons, analogous to biological neurons in the brain. Each neuron receives a number of input signals through its connections; however, it gives no more than one output signal. The output signal is transmitted through the neuron’s exit line [NEG05].

Developing an ANN comprises the definition of:

1 - the network architecture, which is defined by the basic processing elements (i.e. neurons) and by the way in which they are interconnected (i.e. layers);

2 - the NN learning, which implies that a processing unit is capable of changing its input or output behavior as a result of changes in the environment, i.e. of adjusting the weights based on input vector values;

3 - the data used for training, testing and validating the neural network.

In this paper two neural networks have been implemented: the multilayer perceptron (MLP, Figure 1) neural network trained by backpropagation (BP), and the MLP neural network trained by a genetic algorithm (GANN). Both use the bipolar sigmoid activation function, whose form is as follows:

$$f(x) = \frac{2}{1 + e^{-\alpha x}} - 1$$

Figure 1. The structure of a MLP network

2.1 The data used for the neural network

The accuracy of GDP forecasting with an ANN depends on the selection of the variables to include as input to the network. In this selection process we followed the paper of Çeliku et al. [ÇEL09], choosing the variables by judging their potential economic correlations with the quarterly GDP. In this paper we use all these economic and financial variables combined with survey variables.

a. Economic variables
o Government expenditures, data provided by the Ministry of Finance.
o Construction permissions for residential and business purposes, in value, data provided by INSTAT.
o Total imports, data provided by INSTAT.

b. Financial variables
o Interest rate on loans denominated in EURO, data provided by the Bank of Albania.

c. Variables from surveys
Variables from surveys are based on indicators constructed from qualitative surveys conducted by the Bank of Albania with businesses and consumers.
o ESI (Alb. TNE), the Economic Sentiment Indicator, which aggregates in a single indicator the opinions of the main market agents. These are collected from the confidence surveys for the industry, construction and services sectors and for the consumers.
o EI (Alb. TE), the survey indicator for the economy
o II (Alb. TI), the survey indicator for the industry sector
o CI (Alb. TN), the survey indicator for the construction sector
o SI (Alb. TSH), the survey indicator for the services sector
o MPI (Alb. TBM), the survey indicator for major purchases

Figure 2 presents the model of a neuron of the hidden layer for the GDP forecasting, used in both models, GANN and BPNN.

Figure 2. The model of a neuron in the hidden layer

The data used to forecast GDP are quarterly data from 2002Q2 to 2016Q1. These data were stored in a .csv file and served as input to the neural network models. In general, ANN models require large amounts of data. In the case of Albania’s GDP forecasting, the biggest inhibitor is the lack of sufficient data. This encompasses the availability of data, its consistency and the time span it covers.

2.2 BPNN for Albania’s GDP Forecasting

For the GDP forecast with the neural network trained by the backpropagation method, a three-layer architecture was used, i.e. an input layer, a hidden layer and an output layer. As in the GANN (“neuro-genetic”) model, the input layer has 10 neurons; the hidden layer has 20 and the output layer has only one. The main parameters of the BP used in the BPNN model were set as:

Momentum = 0.5
Learning rate = 0.3

The results are presented in Figure 3.

Figure 3. LOG(GDP) vs LOG(GDP forecasted) using BPNN

Table 1 presents the results of the evaluation performed to calculate the accuracy of the forecast.

Table 1: Indicators of the accuracy of the GDP forecast with Backpropagation NN.

Indicator   Value
MFE (1)     0.0047
MAD (2)     0.0184
MSE (3)     0.0056
TS (4)      0.0052

(1) MFE - Mean Forecast Error
(2) MAD - Mean Absolute Deviation
(3) MSE - Mean Squared Error
(4) TS - Tracking Signal

Table 1 shows that this model works, as TS = 0.0052, and since MFE > 0 the model tends to under-forecast, with a mean squared error of 0.0056 units.

3. GANN Model for Albania’s GDP Forecasting

3.1 Network architecture

This model uses a feedforward artificial neural network with multiple layers. Several tests were conducted with networks of 3, 4, and 5 layers. In all three cases the first layer, which is the input layer, has 10 neurons, the same as the number of factors determined to influence the forecast of the GDP. Likewise, the last layer, the output one, consists of only one neuron.

The hidden layers have different numbers of neurons: the 3-layer model has one hidden layer consisting of 20 neurons (two times the number of neurons of the input layer); the 4-layer model has two hidden layers with 20 and 10 neurons respectively; and the 5-layer model has three hidden layers with 20, 20, and 10 neurons respectively.

3.2 The learning algorithm of the GANN model

There are two types of learning algorithms: gradient descent methods and global search methods. Methods such as Backpropagation, Newton, Quasi-Newton, and Levenberg-Marquardt can be classified as gradient descent methods, while the genetic algorithm is a global search method.

The ANN model is very useful, especially in cases where the data-generating process is nonlinear and complex, or when the functional form is not clear. However, the learning process, i.e. the adjustment of the parameters in an ANN model, can take time and can also fall into a trap such as a local minimum, especially when there are too many parameters. To avoid such potential problems, we can use the GA in the learning process.

The advantage of the ANN model with genetic algorithms is that this model can capture relationships in nonlinear data, especially if the economy is very unstable, and it is a superior model when it comes to predicting chaotic data.

Genetic algorithms (GAs) differ in four main respects from traditional search and optimization procedures, and these differences make them such a robust method [GOL92]:

1) GAs use an encoding of the parameters, not the parameters themselves;
2) GAs search from a population of search points, not a single point;
3) GAs only use the objective function to judge solution quality, not derivatives or other auxiliary knowledge;
4) GAs use probabilistic transition rules, not deterministic rules.

The basic idea in GA is finding the most suitable individual in the current population. In this context, the GA searches for the vector of coefficients which is the global optimal solution. The basic process of the GA is as follows. It starts with an initial population selected in a random manner. If the population converges, the process stops and the solution is presented. Otherwise, new individuals are created from the old ones and, after some operations, the new generation is created. This process is repeated until the objective is reached, as explained in more detail below.

The objective of the GA is to minimize the sum of squared errors (SSE) with respect to the parameters generated by the algorithm, i.e.

$$\min_{\theta} \sum_{i} \left(y_i - \hat{y}_i\right)^2, \quad \text{where } \hat{y} = f(x \mid \theta)$$

where $\hat{y}$ is the output generated by the model, and $\theta$ are the coefficients (parameters) which minimize the error function. The GA has 8 main steps, which are described as follows:

1) An initial population of coefficient vectors $[\theta_1, \theta_2, \ldots, \theta_P]$ is created, where $P$ is an even number and each $\theta_i$ is a vector of $K \times 1$ elements. The initial population can be created randomly from the standard normal distribution, or using constraints on the sign or interval of the parameters’ values.

2) Two couples are selected randomly from the initial population. The fitness of these four vectors is evaluated with regard to the objective function, and the two vectors with the best fitness (smallest SSE) are selected as winners. These two vectors are also referred to as parents.

3) By applying the crossover operator, two new vectors, called children, are created from the parent vectors. The simplest form of the crossover operator is the one-point crossover, in which the two parents are cut at point $I$ and the parts to the right of point $I$ switch positions. $I$ is selected randomly from the set $\{1, \ldots, K-1\}$. For $K = 6$ and $I = 3$, the operator would turn out as below:

$$
P_1 = \begin{pmatrix}\theta_{11}\\ \theta_{12}\\ \theta_{13}\\ \theta_{14}\\ \theta_{15}\\ \theta_{16}\end{pmatrix},\quad
P_2 = \begin{pmatrix}\theta_{21}\\ \theta_{22}\\ \theta_{23}\\ \theta_{24}\\ \theta_{25}\\ \theta_{26}\end{pmatrix}
\;\longrightarrow\;
F_1 = \begin{pmatrix}\theta_{11}\\ \theta_{12}\\ \theta_{13}\\ \theta_{24}\\ \theta_{25}\\ \theta_{26}\end{pmatrix},\quad
F_2 = \begin{pmatrix}\theta_{21}\\ \theta_{22}\\ \theta_{23}\\ \theta_{14}\\ \theta_{15}\\ \theta_{16}\end{pmatrix}
$$

4) The mutation operator is applied to each element of the children vectors. Under this operator, each of these elements is subject to a hit with a low probability, $\mu > 0$. The probability is typically given by the formula $\mu = 0.15 + 0.33 / G$, where $G$ is the size of the generation, while in our application it is defined by the user. The mutation operator looks like the following [MIC96]:

$$
\tilde{\theta} =
\begin{cases}
\theta + s\left[1 - r_2^{(1 - t/T)^b}\right] & \text{if } r_1 > 0.5 \\
\theta - s\left[1 - r_2^{(1 - t/T)^b}\right] & \text{if } r_1 \le 0.5
\end{cases}
$$

In our application we use the following mutation operator instead:

$$
\tilde{\theta} =
\begin{cases}
\theta \;[\ldots]\; r_2 & \text{if } r_1 > 0.5 \\
\theta * r_2 & \text{if } r_1 \le 0.5
\end{cases}
$$

where $r_1$, $r_2$ are two real numbers selected randomly from the range [0, 1], $s$ is a random number from the standard normal distribution, $t$ is the number of the next generation, $T$ is the maximal number of generations, and $b$ is a parameter which defines the degree to which the mutation operator is non-uniform.

5) A “fight” takes place between the four individuals, the parents and the children ($P_1$, $P_2$, $F_1$ and $F_2$). The two best vectors, the ones with the lowest sum of squared errors, survive and move to the next generation, whereas the other two are eliminated.

6) The process repeats, returning the parents to the population pool so that they can be selected again, until the next generation is populated with $P$ vectors of coefficients.

7) The members of the current generation are evaluated together with the ones of the new generation, with regard to the fitness criterion, and the best one is selected for the next generation. Hence, the concept of “elitism” is applied.

8) New generations of populations with $P$ individuals are created, and convergence is assessed through the behavior of the best member in each generation, based on the fitness criterion. If the change in the fitness of the best member over 50 generations is small, it can be claimed that the genetic search has converged on an optimum.

The GANN learning flowchart is presented in Figure 4.

Figure 4. NN learning using GA (GANN) [NUR14]

3.3 GANN model results

The main parameters of the GA used in the GANN model were set as:

o Genetic population size = 100
o Crossover probability in genetic population = 0.6
o Mutation probability in genetic population = 0.4
o Probability to add newly generated chromosome to population = 0.25
o The bipolar sigmoid coefficient α = 1.5

We tested three types of architectures. The results are presented in Table 2.

Table 2: Indicators of the accuracy of the GDP forecast with different GANN architectures.

Indicator   3-layers   4-layers   5-layers
MFE         0.0006     0.0005     0.0005
MAD         0.0087     0.0081     0.0082
MSE         0.0004     0.0004     0.0005
TS          3.9555     3.6100     3.4870

As shown in Table 2, the model with the 4-layer architecture is the best model: it has the lowest MAD and ties for the lowest MFE and MSE. The results of this model are presented in Figure 5 and Table 3.

Figure 5. 4-layers GANN model forecasting

Table 3: Indicators of the accuracy of the GDP forecast with GANN.

Indicator   Value
MFE         0.0005
MAD         0.0081
MSE         0.0004
TS          3.6100

Table 3 shows that this model works, as TS = 3.61, and since MFE > 0 the model tends to under-forecast, with a mean squared error of 0.0004 units.

4. GANN model vs BPNN Model

We compare the BPNN and GANN models using the indicators of accuracy. Details of the comparison are shown in Table 4.

Table 4: Indicators of the accuracy of the GDP forecast with BPNN and GANN.

Indicator   BP       GANN
MFE         0.0047   0.0005
MAD         0.0184   0.0081
MSE         0.0056   0.0004
TS          0.0052   3.6100

5. Conclusions

The model developed in this paper evolves the neural network weights through a genetic algorithm. Because of the simplicity and generality of evolution, and because gradient-based training algorithms often need to be executed several times in order to avoid being trapped in a local minimum, the evolutionary technique is highly competitive.

The results show that the neural network model for forecasting GDP works and performs very satisfactorily, regardless of the method used to train the weights. As long as the tracking signal (TS) is between -4 and 4 (in our models TS is 3.61 for GANN and 0.0052 for BPNN), we can say that the model is working correctly.

The GDP forecast by the GANN in this study has an MSE of 0.0004, while the GDP forecast by the BPNN has an MSE of 0.0056. The GANN model tends to slightly under-forecast, with a mean absolute deviation of 0.0081 units, while the BPNN model has a mean absolute deviation of 0.0184 units. GANN outperforms BPNN in Albania’s GDP forecasting.

References

[ÇEL09] Çeliku, Evelina, Ermelinda Kristo, and Merita Boka. 2009. "Modelimi i PBB-së tremujore. Roli i treguesve ekonomikë dhe atyre të vrojtimeve." Tirane: Banka e Shqipërisë.
[GIO09] Giovanis, Eleftherios. 2009. "ARIMA and Neural Networks. An application to the real GNP growth rate and the unemployment rate of U.S.A." SSRN Electronic Journal. doi:10.2139.
[GOL92] Goldberg, David E. 1992. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing.
[IMA11] Imansyah, Muhammad Handry, Suryani, Nurhidayat, and Muzdalifah. 2011. "GDP Estimation and Slow Down Signal Model for Indonesia: An Artificial Neural Network Approach." Finance and Banking Journal 13 (1): 77-94.
[MIC96] Michalewicz, Zbigniew. 1996.
Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag Berlin Heidelberg.
[NEG05] Negnevitsky, Michael. 2005. Artificial Intelligence: A Guide to Intelligent Systems. Pearson Education.
[NUR14] Nurcahyo, Septian, Fhira Nhita, and Adiwijaya. 2014. "Rainfall Prediction in Kemayoran Jakarta Using Hybrid Genetic Algorithm (GA) and Partially Connected Feedforward Neural Network (PCFNN)." 2nd International Conference on Information and Communication Technology (ICoICT).
[TKA99] Tkacz, G. and S. Hu. 1999. "Forecasting GDP growth using artificial neural networks." Bank of Canada WP, No 99.
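
As a supplementary illustration of the GA learning steps of Section 3.2, the following Python sketch trains a small three-layer MLP with the bipolar sigmoid by selection, one-point crossover, mutation, the parent-child "fight" and elitism. It is a minimal sketch under stated assumptions, not the implementation used in this paper: the function names, the flat packing of the weights into one vector, the Gaussian mutation, the fixed generation budget (instead of the 50-generation convergence check) and the small population size are all choices made for the example, and the data are synthetic rather than the Albanian GDP series.

```python
import math
import random

ALPHA = 1.5  # bipolar sigmoid coefficient reported in Section 3.3


def bipolar_sigmoid(x, alpha=ALPHA):
    """f(x) = 2 / (1 + exp(-alpha * x)) - 1, with values in (-1, 1)."""
    z = max(-50.0, min(50.0, alpha * x))  # clamp to keep exp() from overflowing
    return 2.0 / (1.0 + math.exp(-z)) - 1.0


def forward(theta, x, n_hidden):
    """Output of a 3-layer MLP whose weights are packed in the flat vector theta."""
    n_in, pos, total = len(x), 0, 0.0
    for _ in range(n_hidden):
        weights = theta[pos:pos + n_in]
        bias, out_weight = theta[pos + n_in], theta[pos + n_in + 1]
        pos += n_in + 2
        hidden = bipolar_sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
        total += out_weight * hidden
    return bipolar_sigmoid(total + theta[pos])  # theta[pos] is the output bias


def sse(theta, data, n_hidden):
    """Sum of squared errors, the objective the GA minimises."""
    return sum((y - forward(theta, x, n_hidden)) ** 2 for x, y in data)


def crossover(p1, p2, rng):
    """One-point crossover: the parts to the right of cut point I are swapped."""
    i = rng.randint(1, len(p1) - 1)  # I drawn from {1, ..., K-1}
    return p1[:i] + p2[i:], p2[:i] + p1[i:]


def mutate(theta, rng, mu):
    """Assumed mutation: each element gets Gaussian noise with probability mu."""
    return [t + rng.gauss(0.0, 0.5) if rng.random() < mu else t for t in theta]


def next_generation(pop, data, n_hidden, rng, mu):
    def fit(theta):
        return sse(theta, data, n_hidden)

    new_pop = []
    while len(new_pop) < len(pop):
        # Step 2: draw four vectors (two couples); the two fittest become parents.
        p1, p2 = sorted(rng.sample(pop, 4), key=fit)[:2]
        # Steps 3-4: crossover, then mutation of the children.
        c1, c2 = (mutate(c, rng, mu) for c in crossover(p1, p2, rng))
        # Step 5: the "fight"; the two lowest-SSE vectors of the four survive.
        new_pop.extend(sorted([p1, p2, c1, c2], key=fit)[:2])
    new_pop = new_pop[:len(pop)]
    # Step 7 (elitism): the best old member replaces the worst new one if better.
    best_old = min(pop, key=fit)
    worst = max(range(len(new_pop)), key=lambda j: fit(new_pop[j]))
    if fit(best_old) < fit(new_pop[worst]):
        new_pop[worst] = best_old
    return new_pop


def train(data, n_in, n_hidden, pop_size=20, generations=30, mu=0.1, seed=0):
    rng = random.Random(seed)
    k = n_hidden * (n_in + 2) + 1  # length K of each coefficient vector
    # Step 1: initial population drawn from the standard normal distribution.
    pop = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(pop_size)]
    history = [min(sse(t, data, n_hidden) for t in pop)]
    for _ in range(generations):  # fixed budget instead of the step-8 check
        pop = next_generation(pop, data, n_hidden, rng, mu)
        history.append(min(sse(t, data, n_hidden) for t in pop))
    return min(pop, key=lambda t: sse(t, data, n_hidden)), history


# Synthetic toy data (NOT the Albanian GDP series): y = 0.5*x1 - 0.3*x2.
toy = [((0.1 * i, -0.05 * i), 0.5 * 0.1 * i + 0.3 * 0.05 * i) for i in range(-5, 6)]
best, history = train(toy, n_in=2, n_hidden=3)
print(f"best SSE: first generation {history[0]:.4f}, last {history[-1]:.4f}")
```

Because of the elitism step, the best SSE recorded in `history` can never increase from one generation to the next, which is the behavior that the convergence check in step 8 relies on.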
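
The accuracy indicators reported in Tables 1 to 4 (MFE, MAD, MSE and TS) are not defined by formula in the text. The sketch below computes them under the common textbook definitions, which are an assumption here: the error is taken as actual minus forecast (consistent with the statement that MFE > 0 means under-forecasting), and the tracking signal is the cumulative error divided by the MAD. The function name `forecast_accuracy` and the sample numbers are purely illustrative.

```python
def forecast_accuracy(actual, forecast):
    """Return MFE, MAD, MSE and TS for two equally long series.

    Assumed definitions (not stated in the paper):
      error_i = actual_i - forecast_i
      MFE = mean(error), MAD = mean(|error|), MSE = mean(error^2)
      TS  = sum(error) / MAD
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mfe = sum(errors) / n
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    ts = sum(errors) / mad if mad else 0.0  # a perfect forecast has no bias
    return {"MFE": mfe, "MAD": mad, "MSE": mse, "TS": ts}


# Illustrative numbers only, not data from the study.
metrics = forecast_accuracy([10, 12, 14, 16], [9, 13, 13, 15])
print(metrics)
```

Under these definitions a positive MFE and a positive TS both indicate that the forecasts lie below the actual values on average.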