=Paper=
{{Paper
|id=Vol-3126/paper8
|storemode=property
|title=Preparing data and determining parameters for a feedforward neural network used for short-term air temperature forecasting
|pdfUrl=https://ceur-ws.org/Vol-3126/paper8.pdf
|volume=Vol-3126
|authors=Boris Perelygin,Tatiana Tkach,Anna Gnatovskaya
}}
==Preparing data and determining parameters for a feedforward neural network used for short-term air temperature forecasting==
Boris Perelygin, Tatiana Tkach, Anna Gnatovskaya

Odessa State Environmental University, Lvivska st., 15, Odessa, 65016, Ukraine

ISIT 2021: II International Scientific and Practical Conference «Intellectual Systems and Information Technologies», September 13–19, 2021, Odesa, Ukraine. EMAIL: b.perelygin@gmail.com (A.1); tatkatkach@gmail.com (A.2); aninfo2000@gmail.com (A.3). ORCID: 0000-0002-6049-8897 (A.1); 0000-0002-5403-5933 (A.2); 0000-0002-0018-5696 (A.3). © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

===Abstract===
This article presents the results of preparing the initial data and determining the specifications of an artificial feedforward neural network used for short-term forecasting of ambient air temperature. Based on the accuracy requirements for forecasts, the data for network training were optimized: the number of training vectors and their length, the type of source data, and the way a training sample is formed from the array of source data were determined. In addition, the network specifications that provide the required forecast accuracy were selected, namely the requirements for the neuron activation function and the number of hidden layers.

Keywords: artificial neural network, short-term forecast, air temperature

===1. Introduction===
Forecasting is one of the most important tasks in almost all areas of science and life. Predicting weather factors is one of the oldest forecasting tasks because of their great influence on all aspects of human life. Meteorological weather forecasts are a scientifically based assumption about the future state of the weather. The success rate of modern short-term weather forecasts is quite high, but inaccurate forecasts still occur, especially in cases of abnormal weather. Therefore, research in this area remains relevant.

In recent decades, along with traditional methods of weather forecasting, the use of artificial neural networks (ANN) has been considered a promising area of research [1, 2, 3, 4]. The initial data for ANN-based weather forecasting are commonly the results of regular measurements of weather characteristics in the form of numerical values. With the help of an ANN, it is possible to model the nonlinear dependence between the future value of a time series, its past values, and the values of external factors [5]. For instance, fully connected feedforward neural networks have been proposed for predicting the time series of mountain soil humidity [6]. Deep neural networks could be used to predict the meteorological visibility range [7]. Multi-wavelet polymorphic networks could be employed to predict geophysical time series [2]. For long-term series, extreme learning machines [8] and convolutional neural networks [9] have been proposed. In particular, examples of predicting temperature values are presented in [1, 3, 10].

However, the papers on this subject concentrate on a comparative analysis of different ANN architectures, and, as a rule, the methodology for preparing initial data, numerical estimates of forecasting quality, and the influence of the ANN parameters on these estimates are not sufficiently covered. In other words, these papers do not sufficiently address the problem of selecting ANN parameters for forecasting meteorological elements. This work aims to fill this gap. To that end, a well-known feedforward ANN was taken as the subject of research and used for short-term forecasting.

===2. Problem statement===
The object of the study is the process of using a feedforward ANN to predict air temperature values. The subject of the study is a feedforward ANN designed for short-term forecasting of air temperature values.

In the course of the research it was necessary to determine how forecast accuracy is affected by: 1) the parameters of the training data (the length of the training vectors, the number of training vectors, the location of the training sample within the overall series of observations, and the type of source data) that provide the best accuracy for short-term forecasts with lead times of 3 hours, 1 day and 3 days; 2) the parameters of the neural network (the number of hidden layers and the presence of restrictions in the activation functions of hidden-layer neurons) for the same lead times.

===3. Initial data===
Air temperature values were chosen as data for the research because of the continuity of these data and the clarity of the results obtained. The data are a long 15-year series of air temperature values (43569 samples) obtained during regular eight-term observations at the weather station 33837 Odessa from February 01, 2005 to December 31, 2019 (Fig. 1) [11].

Figure 1: A fragment of data on the air temperature of the weather station 33837 Odesa

When forming the array of initial data, missing air temperature values were interpolated as the arithmetic mean of the neighboring temperature values. There were only 6 such omissions in the data, so with a series length of 43569 samples their correction did not significantly affect the result of the study.

===4. Research methodology and tools===
Due to the large variability of meteorological quantities in space and time, the specific value of any quantity given in a forecast should be regarded as the most likely value of that quantity during the forecast period. At the end of the validity period of a short-term forecast, its success is assessed on the basis of accuracy. Accuracy is the degree to which predicted and actual meteorological values and phenomena match, within certain established tolerances. The accuracy of a temperature forecast is measured alternatively: if the forecast temperature differs from the actual one by no more than 2.0 °C, the forecast accuracy is 100%; if the difference is 3.0 °C, the accuracy is 50%; if it is ≥ 4.0 °C, the accuracy is 0% [12]. In this study, accuracy was calculated similarly to the above method, but without the alternative grading, as the ratio of the number of accurate forecasts (falling within ±2 °C) to the total number of forecasts for a given lead time.

In the context of this paper, different types of data should be understood as the actual series of observations (the so-called "raw" data) and its two transformations: a centered series, obtained by subtracting the arithmetic mean from all the values of the series, and a normalized series, obtained by dividing all the values of the series by the maximum absolute value of the series. All series of observations were divided into two large groups: the 1st group of data for training and the 2nd group of data for forecasting (Fig. 2).

Figure 2: Splitting the source data into groups and arrays necessary for training an artificial neural network and forecasting

The required arrays of initial data were formed from the first group of data, intended for training, as follows. The whole group was divided into 3 parts. The first part (I in Fig. 2) was used for training the network (training set). The second part (II in Fig. 2) was used as a verification (validation) set to check the quality of training. Repeating the experiments many times leads to the control set beginning to play a key role in creating the model, that is, it becomes part of the learning process. This weakens its role as an independent criterion of model quality: with a large number of experiments, there is a risk of choosing a network that merely gives a good result on the control set. To give the final model proper reliability, the third part of the first group of data was reserved as a backup (test) set of observations (III in Fig. 2). The final model was tested on data from this set to make sure that the results achieved using the training and validation sets are real. From the data obtained, the training quality indicator was calculated by the same method as the forecast accuracy. During the research, the parameters of the neural network and the length and number of training vectors were varied, and the starting position of the array was also shifted to assess the seasonal effect of the data on forecast quality.

For the research, we used a traditional feedforward ANN, shown in Fig. 3. The number of its inputs varied depending on the size of the training set, and the number of hidden layers varied from 1 to 3. The neuron activation functions were also varied.

Figure 3: ANN model used in research

The simulation was carried out on a computer with an Intel® Core™2 Quad CPU Q8200 2.34 GHz processor and 12 GB of RAM.

===5. Result analysis===
To determine the best length and the best number of training vectors in terms of forecast accuracy, the training and forecasting procedure was simulated repeatedly (N cycles, approximately 50). The size of the training array (the length of the vectors and their number) was chosen so that the training procedure completed in no more than one hour. Otherwise, short-term forecasting with a 3-hour lead time would lose its meaning, since the result could be obtained after the forecast time had passed. During the simulations, the quality of training and the accuracy of all three forecasts with different lead times were evaluated while varying: a) the type of source data ("raw", centered, normalized); b) the number of hidden layers (1, 2, 3); and c) the activation function of the hidden and output layers, from linear to sigmoidal. These studies produced 72·N three-dimensional graphs. Since the volume of this paper does not allow us to present them all, two of them are shown as an example in Fig. 4.

Figure 4: An example of displaying the parameters of the quality of training (a) and the accuracy of forecasts (b)

The number of training vectors is plotted along the abscissa axis, the length of the vectors along the ordinate axis, and either the training quality or the forecast accuracy for the corresponding lead time along the applicate axis. Each point of these graphs was calculated for a specified vector length, number of vectors, number of hidden layers of the ANN, type of activation function and type of source data. From a mathematical standpoint, the graph in Fig. 4, a shows the quality of the network's approximation of the training array, and the graph in Fig. 4, b shows the quality of extrapolation to data on which the network was not trained. The graph in Fig. 4, b clearly shows the instability of forecasts for any length of training vectors when their number is small. The analysis of all the graphs showed that, in order to obtain the required forecast accuracy, the initial data for training the network should be presented in the form of 150 vectors with a length of 16 samples each (the circled area in Fig. 5, a). This also provides stable short-term forecasting, one of the results of which is shown in Fig. 6.
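The data preparation and the accuracy measure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the sliding-window pairing of each training vector with the sample `lead` steps after it is an assumption, since the paper does not spell out how the vectors are cut from the series. With eight observations per day, lead times of 3 hours, 1 day and 3 days correspond to 1, 8 and 24 samples.

```python
from typing import List, Sequence, Tuple


def center(series: Sequence[float]) -> List[float]:
    """Centered series: subtract the arithmetic mean from every value."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def normalize(series: Sequence[float]) -> List[float]:
    """Normalized series: divide every value by the largest absolute value."""
    peak = max(abs(x) for x in series)
    return [x / peak for x in series]


def make_training_vectors(
    series: Sequence[float], length: int = 16, count: int = 150, lead: int = 1
) -> Tuple[List[List[float]], List[float]]:
    """Cut `count` overlapping windows of `length` samples from the end of `series`.

    Assumption (not stated in the paper): window i is paired with the sample
    `lead` steps after its last element.
    """
    needed = count + length + lead - 1
    if len(series) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(series)}")
    recent = series[-needed:]  # most recent samples only
    inputs = [list(recent[i:i + length]) for i in range(count)]
    targets = [recent[i + length + lead - 1] for i in range(count)]
    return inputs, targets


def hit_rate(
    forecasts: Sequence[float], actuals: Sequence[float], tolerance: float = 2.0
) -> float:
    """Share of forecasts within +/- `tolerance` degrees C of the actual value."""
    hits = sum(1 for f, a in zip(forecasts, actuals) if abs(f - a) <= tolerance)
    return hits / len(forecasts)


if __name__ == "__main__":
    toy = [10.0 + 0.1 * i for i in range(200)]  # synthetic values, not station data
    x, y = make_training_vectors(toy, length=16, count=150, lead=1)
    print(len(x), len(x[0]), len(y))
    print(hit_rate([1.0, 5.0, 9.5], [2.5, 9.0, 10.0]))
```

Taking the windows from the end of the series keeps the training sample adjacent in time to the forecast period; shifting the starting position, as the study does to assess seasonal effects, would only require slicing `series` before the call.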
Figure 5: Accuracy of forecasting with a linear neuron activation function (a) and in the presence of non-linearity in the neuron activation function (b), with the same form of initial data and the same number of hidden layers

In addition, the presence of non-linearity (restriction) in the activation function greatly worsens the prediction quality indicator, i.e. accuracy (Fig. 5), and increases the training time by more than 2 times. Therefore, when solving such a problem, the neurons of the network must have a purely linear activation function.

Figure 6: The result of forecasting for one day (a) and for three days (b)

The simulation showed that an increase in the number of hidden network layers neither improves nor worsens the quality of forecasting: the accuracy does not change significantly, but the network architecture becomes more complicated and, with the same learning algorithm (the Levenberg-Marquardt algorithm in the error backpropagation procedure), the training time increases several-fold.

The type of source data does not affect the quality of forecasting: the accuracy does not change significantly when the "raw" source data are replaced with centered or normalized ones.

===6. Conclusions===
The analysis of the obtained results made it possible to determine the type of initial data, the volume of initial data, and the main parameters of the feedforward ANN for predicting temperature values:
* the presence of non-linearity (restriction) in the activation function significantly worsens the prediction quality indicator, i.e. accuracy, and increases the training time by more than 2 times; therefore, when solving such a problem, the neurons of the network must have a purely linear activation function;
* an increase in the number of hidden network layers neither improves nor worsens the quality of forecasting: the accuracy does not change significantly, but the network architecture becomes more complicated and, with the same learning algorithm (the Levenberg-Marquardt algorithm in the error backpropagation procedure), the training time increases several-fold;
* the type of source data does not affect the quality of forecasting: the accuracy does not change significantly when the "raw" source data are replaced with centered or normalized ones;
* the initial data for training the network should be presented in the form of 150 vectors with a length of 16 samples each, which ensures stable short-term forecasting;
* when training on such a small amount of data, the following condition must be observed: when forecasting for a certain season (time of year), the earlier data used for training must be taken from exactly the same time of year, otherwise the forecast error increases significantly.

===7. References===
[1] B. F. Kuznetsov. Short-term temperature prediction based on neural networks // Topical issues of agrarian science. 2019. No. 30. Pp. 59-65.

[2] S. N. Verzunov, N. M. Lychenko. Multi-wavelength polymorphic network for forecasting geophysical time series // Problems of automation and control. 2017. No. 1 (32). Pp. 78-87.

[3] A. S. Kozadaev, A. A. Arzamassev. Forecasting of time series using the apparatus of artificial neural networks. Short-term forecast of air temperature // Bulletin of the Tambov University. Series: Natural and Technical Sciences. 2006. Vol. 11. Issue 3. Pp. 299-304.

[4] A. S. Gribin. Application of artificial neural network algorithms for short-term prognosis: Dis. ... candidate of physical and mathematical sciences. SP-b. 2005. 154 p.

[5] J. W. Taylor, R. Buizza. Neural Network Load Forecasting with Weather Ensemble Predictions // IEEE Trans. on Power Systems. 2002. Vol. 17. Pp. 626-632. DOI: 10.1109/TPWRS.2002.800906

[6] L. I. Velikanova. Short-term forecasting of humidity of mountain soils // Problems of automation and control. 2015. No. 4. Pp. 158-166.

[7] S. N. Verzunov. Application of deep neural networks for short-term prediction of visibility range // Problems of automation and control. 2019. No. 1 (36). Pp. 118-130.

[8] Lei Yu, Zhao Danning, Cai Hongbing. Prediction of length-of-day using extreme learning machine // Geodesy and geodynamics. 2015. Vol. 6. No. 2. Pp. 151-159.

[9] Irena Koprinska et al. Convolutional Neural Networks for Energy Time Series Forecasting // 2018 International Joint Conference on Neural Networks (IJCNN). 2018. Pp. 1-8.

[10] Imran Maqsood, Muhammad Riaz Khan, Ajith Abraham. An ensemble of neural networks for weather forecasting // Neural Comput & Applic. 2004. Vol. 13. Pp. 112-122. DOI: 10.1007/s00521-004-0413-4

[11] Archive of meteorological data in Odessa. URL: https://rp5.ru/Архив_погоды_в_Одессе (Accessed 19.11.2020)

[12] Guidelines for hydrometeorological forecasting. Ukrainian Hydrometeorological Center. Kyiv, 2019. 35 p.