=Paper=
{{Paper
|id=Vol-3126/paper8
|storemode=property
|title=Preparing data and determining parameters for a feedforward neural network used for short-term air temperature forecasting
|pdfUrl=https://ceur-ws.org/Vol-3126/paper8.pdf
|volume=Vol-3126
|authors=Boris Perelygin,Tatiana Tkach,Anna Gnatovskaya
}}
==Preparing data and determining parameters for a feedforward neural network used for short-term air temperature forecasting==
<pdf width="1500px">https://ceur-ws.org/Vol-3126/paper8.pdf</pdf>
<pre>
Preparing Data and Determining Parameters for a Feedforward
Neural Network Used for Short-Term Air Temperature
Forecasting
Boris Perelygin 1, Tatiana Tkach 2, Anna Gnatovskaya 3
1,2,3 Odessa State Environmental University, Lvivska st., 15, Odessa, 65016, Ukraine

                   Abstract
                   This article presents the results of preparing the initial data and determining the
                   specifications of an artificial feedforward neural network used for short-term
                   forecasting of ambient air temperature. Based on the accuracy requirements for the
                   forecasts, the training data were optimized: the number of training vectors and their
                   length, the type of source data, and the way a training sample is formed from the array
                   of source data were determined. In addition, the network specifications that provide
                   the required forecast accuracy were selected, namely the requirements for the neuron
                   activation function and the number of hidden layers.

                   Keywords
                   Artificial neural network, short-term forecast, air temperature


1. Introduction

    Forecasting is one of the most important tasks in almost all areas of science and life.
Predicting weather factors is one of the oldest forecasting tasks because of their great
influence on all aspects of human life. Meteorological weather forecasts are a scientifically
based assumption about the future state of the weather. The success rate of modern short-term
weather forecasts is quite high, but there are also inaccurate forecasts, especially in cases of
abnormal weather. Therefore, research in this area remains relevant.
    In recent decades, along with traditional methods of weather forecasting, the use of
artificial neural networks (ANN) for forecasting has been considered a promising area of
research [1, 2, 3, 4]. The initial data for ANN-based weather forecasting are commonly the
results of regular measurements of weather characteristics in the form of numerical values.
With the help of an ANN, it is possible to model the nonlinear dependence between the future
value of a time series, its past values, and the values of external factors [5]. For instance,
it is proposed to use fully connected feedforward neural networks to predict the time series of
mountain soil humidity [6]. Deep neural networks could be used to predict the meteorological
visibility range [7]. Multi-wavelet polymorphic networks could be employed to predict
geophysical time series [2]. To predict long-term series, it is proposed to use extreme learning
machines [8] and convolutional neural networks [9]. In particular, examples of predicting
temperature values are presented in [1, 3, 10].
    However, the papers on this subject mostly carry out a comparative analysis of different ANN
architectures. As a rule, the methodology for preparing the initial data, numerical estimates of
forecasting quality, and the influence of the ANN parameters on these estimates are not
sufficiently covered. In other words, these papers do not sufficiently cover


ISIT 2021: II International Scientific and Practical Conference
«Intellectual Systems and Information Technologies», September
13–19, 2021, Odesa, Ukraine
EMAIL: b.perelygin@gmail.com (A.1); tatkatkach@gmail.com
(A.2); aninfo2000@gmail.com (A.3)
ORCID: 0000-0002-6049-8897 (A.1); 0000-0002-5403-5933
(A.2); 0000-0002-0018-5696 (A.3)
               ©️ 2021 Copyright for this paper by its authors. Use permitted under Creative
               Commons License Attribution 4.0 International (CC BY 4.0).
               CEUR Workshop Proceedings (CEUR-WS.org)
solutions to the problem of selecting ANN parameters for forecasting meteorological elements.
This work aims to fill this gap. To this end, a well-known feedforward ANN was taken as the
subject of research and used for short-term forecasting.

2. Problem statement

    The object of the study is the process of using a feedforward ANN to predict air
temperature values. The subject of the study is a feedforward ANN designed for short-term
forecasting of air temperature values.
    In the course of the research, it was necessary to determine how the accuracy of forecasts
is affected by: 1) the parameters of the training data (the length of the training vectors, the
number of training vectors, the location of the training sample within the overall series of
observations, the type of source data) that provide the best accuracy for short-term forecasts
with a lead time of 3 hours, 1 day and 3 days; 2) the parameters of the neural network (the
number of its hidden layers, the presence of restrictions in the activation functions of the
hidden-layer neurons) for the same lead times.

3. Initial data

    Air temperature values were chosen as the research data because of the continuity of these
data and the clarity of the results obtained. The data are a long 15-year series of air
temperature values (43569 samples) obtained from regular eight-term observations at weather
station 33837 Odessa from February 1, 2005 to December 31, 2019 (Fig. 1) [11].


Figure 1: A fragment of data on the air temperature of the weather station 33837 Odesa
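
The sampling step implied by the data description can be made explicit with a short sketch. It assumes that "eight-term observations" means eight evenly spaced readings per day (one sample every 3 hours); the names SAMPLES_PER_DAY, STEP and lead_time_in_samples are illustrative and do not come from the paper.

```python
from datetime import timedelta

# Assumption: eight-term observations = eight evenly spaced readings per day.
SAMPLES_PER_DAY = 8
STEP = timedelta(days=1) / SAMPLES_PER_DAY  # one sample every 3 hours


def lead_time_in_samples(lead: timedelta) -> int:
    """Convert a forecast lead time into a whole number of series steps."""
    steps, remainder = divmod(lead, STEP)
    if remainder:
        raise ValueError("lead time is not a multiple of the sampling step")
    return steps


# The three lead times studied in the paper, expressed in samples.
LEAD_STEPS = {
    "3 hours": lead_time_in_samples(timedelta(hours=3)),
    "1 day": lead_time_in_samples(timedelta(days=1)),
    "3 days": lead_time_in_samples(timedelta(days=3)),
}
print(LEAD_STEPS)  # {'3 hours': 1, '1 day': 8, '3 days': 24}
```

Under this reading, the three forecasts correspond to predicting 1, 8 and 24 samples ahead in the series.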

    When forming the array of initial data, missing air temperature values were interpolated as
the arithmetic mean of the neighboring temperature values. There were only 6 such omissions in
the data, so with a series length of 43569 samples their correction did not significantly
affect the result of the study.

4. Research methodology and tools

    Due to the large variability of meteorological quantities in space and time, any specific
value given in a forecast should be considered the most likely value that the quantity will
take during the forecast period. At the end of the validity period of a short-term forecast,
its success is assessed on the basis of accuracy. Accuracy is the degree to which the predicted
and actual meteorological values and phenomena match, within certain established tolerances.
The accuracy of a temperature forecast is measured alternatively, that is, in discrete grades.
If the predicted temperature differs from the actual one by no more than 2.0 °C, the forecast
accuracy is 100%; if the difference is 3.0 °C, the accuracy is 50%; if it is ≥ 4.0 °C, the
accuracy is 0% [12]. In this study, the accuracy was calculated in a similar way but without
the alternative grades, as the ratio of the number of accurate forecasts (those falling within
±2 °C) to the total number of forecasts for a given lead time.
    In the context of this paper, the types of data are the actual series of observations (the
so-called "raw" data) and two transformations of it: a centered series, obtained by subtracting
the arithmetic mean from all values of the series, and a normalized series, obtained by
dividing all values of the series by the maximum absolute value of the series. All series of
observations were divided into two large groups: the 1st group for training and the 2nd group
for forecasting (Fig. 2).

Figure 2: Splitting the source data into groups and arrays necessary for training an artificial
neural network and forecasting

    The required arrays of initial data were formed from the first group of data, intended for
training, as follows. The whole group was divided into 3 parts. The first part (I in Fig. 2)
was used for training the network (training set). The second part (II in Fig. 2) was used as a
verification (validation) set to check the quality of training.
    Repeating the experiments many times means that the control (validation) set begins to play
a key role in creating the model, that is, it becomes part of the learning process. This
weakens its role as an independent criterion of model quality: with a large number of
experiments, there is a risk of choosing a network that gives a good result merely on the
control set. To give the final model proper reliability, the third part of the first group of
data was kept as a reserve (test) set of observations (III in Fig. 2).
    The final model was tested on data from this set to make sure that the results achieved on
the training and validation sets are real. From the data obtained, the training quality
indicator was calculated using the same method as for the forecast accuracy. During the
research, the parameters of the neural network and the length and number of training vectors
were varied, and the starting position of the array was also changed to assess the seasonal
impact of the data on forecast quality.
    For the research, we used a conventional feedforward ANN, shown in Fig. 3. The number of
its inputs varied depending on the size of the training set, and the number of hidden layers
varied from 1 to 3. The neuron activation functions were also varied.

Figure 3: ANN model used in research

    The simulation was carried out on a computer with an Intel® Core™2 Quad CPU Q8200 2.34 GHz
processor and 12 GB of RAM.

5. Result analysis

    To determine the best training vector length and the best number of training vectors in
terms of forecast accuracy, the training and forecasting procedure was simulated multiple times
(N cycles, approximately 50). The size of the training array (the length of the vectors and
their number) was chosen so that training completed in no more than one hour. Otherwise,
short-term forecasting with a 3-hour lead time would lose its meaning, since the result could
be obtained after the forecast time had passed. During the simulations, the quality of training
and the accuracy of all three forecasts with different lead times were evaluated while
changing: a) the type of source data ("raw", centered, normalized), b) the number of hidden
layers (1, 2, 3), and c) the activation function of the hidden and output layers, from linear
to sigmoidal. These studies produced 72·N three-dimensional graphs. Since the length of this
paper does not allow all of them to be presented, two are shown as examples in Fig. 4.
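
The experiment grid described above can be enumerated with a short sketch. The paper does not spell out how the factor 72 arises. The breakdown below (3 data types, 3 layer counts, 2 activation types, and 4 plotted quantities per configuration) is one reading that reproduces it and should be treated as an assumption, as should all names in the code.

```python
from itertools import product

DATA_TYPES = ("raw", "centered", "normalized")
HIDDEN_LAYERS = (1, 2, 3)
ACTIVATIONS = ("linear", "sigmoidal")
# Assumption: one surface for training quality plus one per forecast lead time.
PLOTTED_QUANTITIES = (
    "training quality",
    "accuracy, 3 hours",
    "accuracy, 1 day",
    "accuracy, 3 days",
)


def experiment_grid():
    """Return every (data type, hidden layers, activation) combination."""
    return list(product(DATA_TYPES, HIDDEN_LAYERS, ACTIVATIONS))


configs = experiment_grid()
graphs_per_cycle = len(configs) * len(PLOTTED_QUANTITIES)
print(len(configs), graphs_per_cycle)  # 18 72
```

Each of the N simulation cycles would then contribute one set of 72 surfaces over the (number of vectors, vector length) plane.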


Figure 4: An example of displaying the parameters of the quality of training (a) and the accuracy of
forecasts (b)
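
The accuracy measure plotted in these graphs, as defined in Section 4, is the share of forecasts that fall within ±2 °C of the actual value. A minimal sketch follows; the function name, argument names and sample values are illustrative only.

```python
def forecast_accuracy(predicted, actual, tolerance=2.0):
    """Share of forecasts whose absolute error does not exceed `tolerance` (deg C)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one forecast is required")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


# Made-up values: absolute errors are 1.0, 2.5, 2.0 and 0.5 deg C.
predicted = [10.0, 12.5, 15.0, 9.0]
actual = [11.0, 15.0, 13.0, 9.5]
print(forecast_accuracy(predicted, actual))  # 0.75
```

The same function can be applied to the training set to obtain the training quality indicator, since the paper states that it is calculated by the same method.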

    The number of training vectors is plotted along the abscissa axis, the length of the
vectors along the ordinate axis, and either the training quality or the forecast accuracy for
the corresponding lead time along the applicate axis. Each point of these graphs was calculated
for the specified vector length, number of vectors, number of hidden layers of the ANN, type of
activation function and type of source data. From a mathematical standpoint, the graph in
Fig. 4, a shows the quality of the network's approximation of the training array, and the graph
in Fig. 4, b shows the quality of extrapolation to data on which the network was not trained.
The graph in Fig. 4, b clearly shows the instability of forecasts for any length of training
vectors when their number is small.
    The analysis of all the graphs showed that, to obtain the required accuracy of forecasts,
the initial data for training the network should be presented as 150 vectors with a length of
16 samples each (the circled area in Fig. 5, a). This also provides stable short-term
forecasting, one result of which is shown in Fig. 6.
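
The data preparation steps described in the paper (gap filling, optional centering or normalization, and slicing into 150 vectors of 16 samples) can be sketched as follows. The paper does not state how consecutive vectors overlap or how targets are paired with them. The stride of one sample and the target placed `lead` steps after each window are assumptions, and all function names are hypothetical.

```python
import math


def fill_gaps(series):
    """Replace isolated missing values (None) with the mean of the two neighbors."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            # Assumption: gaps are isolated and never at the ends of the series.
            at_edge = i == 0 or i == len(series) - 1
            if at_edge or series[i - 1] is None or series[i + 1] is None:
                raise ValueError(f"cannot interpolate gap at index {i}")
            filled[i] = (series[i - 1] + series[i + 1]) / 2
    return filled


def center(series):
    """Centered series: subtract the arithmetic mean from every value."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def normalize(series):
    """Normalized series: divide every value by the maximum absolute value."""
    peak = max(abs(x) for x in series)
    return [x / peak for x in series]


def make_training_vectors(series, length=16, count=150, lead=1):
    """Take the most recent `count` windows of `length` samples (stride 1, assumed),
    each paired with the value `lead` steps after the window's last sample."""
    last_start = len(series) - length - lead
    first_start = last_start - count + 1
    if first_start < 0:
        raise ValueError("series is too short for the requested training array")
    inputs, targets = [], []
    for start in range(first_start, last_start + 1):
        inputs.append(series[start:start + length])
        targets.append(series[start + length - 1 + lead])
    return inputs, targets


# Synthetic stand-in for the temperature series, with one artificial gap.
raw = [10 + 5 * math.sin(i / 8) for i in range(400)]
raw[100] = None
inputs, targets = make_training_vectors(center(fill_gaps(raw)))
print(len(inputs), len(inputs[0]), len(targets))  # 150 16 150
```

With these assumptions, one training array for a 3-hour lead time spans 166 consecutive samples, or a little under 21 days of observations.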


Figure 5: Accuracy of forecasting with a linear function of neuron activation (a) and in the presence of
nonlinearity in the activation function of neurons (b) with the same form of initial data and the same
number of hidden layers

    In addition, the presence of non-linearity (restriction) in the activation function greatly
worsens the prediction quality indicator, i.e. accuracy (Fig. 5), and increases the training
time by more than 2 times. Therefore, when solving such a problem, the neurons of the network
must have a fully linear activation function.
Figure 6: The result of forecasting for one day (a) and for three days (b)

    The simulation showed that increasing the number of hidden layers neither improves nor
worsens the quality of forecasting: the accuracy does not change significantly, but the network
architecture becomes more complicated and, with the same learning algorithm (the
Levenberg-Marquardt algorithm in the error backpropagation procedure), the training time
increases several-fold.
    The type of source data does not affect the quality of forecasting: the accuracy does not
change significantly when the "raw" source data are replaced with centered or normalized ones.

6. Conclusions

    The analysis of the obtained results made it possible to determine the type of initial
data, the volume of initial data, and the main parameters of the feedforward ANN for predicting
temperature values:
    - the presence of non-linearity (restriction) in the activation function significantly
      worsens the prediction quality indicator, i.e. accuracy, and increases the training time
      by more than 2 times; therefore, when solving such a problem, the neurons of the network
      must have a fully linear activation function;
    - increasing the number of hidden layers neither improves nor worsens the quality of
      forecasting: the accuracy does not change significantly, but the network architecture
      becomes more complicated and, with the same learning algorithm (the Levenberg-Marquardt
      algorithm in the error backpropagation procedure), the training time increases
      several-fold;
    - the type of source data does not affect the quality of forecasting: the accuracy does not
      change significantly when the "raw" source data are replaced with centered or normalized
      ones;
    - the initial data for training the network should be presented as 150 vectors with a
      length of 16 samples each, which ensures stable short-term forecasting;
    - when training on such a small amount of data, the following condition must be observed:
      when forecasting for a certain season (time of year), the earlier data used for training
      must be taken from exactly the same time of year, otherwise the forecast error increases
      significantly.
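
The seasonal condition in the last conclusion can be illustrated with a small index calculation. The paper does not say how the training span is positioned. The sketch assumes eight samples per day, a 365-day year (leap days ignored), windows taken with a stride of one sample, and a span that ends either right before the forecast or at the same point of an earlier year. The function name and defaults are hypothetical.

```python
SAMPLES_PER_DAY = 8                        # assumed: one reading every 3 hours
SAMPLES_PER_YEAR = 365 * SAMPLES_PER_DAY   # leap days ignored in this sketch


def same_season_span(forecast_index, years_back=0, length=16, count=150, lead=1):
    """Return (start, stop) indices of a training span from the same time of year.

    years_back=0 takes the data immediately before the forecast point;
    years_back=1 takes the matching period one year earlier.
    """
    span = count + length + lead - 1       # samples needed for stride-1 windows
    stop = forecast_index - years_back * SAMPLES_PER_YEAR
    start = stop - span
    if start < 0:
        raise ValueError("not enough earlier data for the requested span")
    return start, stop


print(same_season_span(10000))                # (9834, 10000)
print(same_season_span(10000, years_back=1))  # (6914, 7080)
```

Either span covers about three weeks of observations, which stays within a single season.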
7. References

[1] B. F. Kuznetsov. Short-term temperature prediction based on neural networks. Topical issues
     of agrarian science. 2019. No. 30. Pp. 59-65.
[2] S. N. Verzunov, N. M. Lychenko. Multi-wavelength polymorphic network for forecasting
     geophysical time series. Problems of automation and control. 2017. No. 1 (32). Pp. 78-87.
[3] A. S. Kozadaev, A. A. Arzamassev. Forecasting of time series using the apparatus of
     artificial neural networks. Short-term forecast of air temperature. Bulletin of the Tambov
     University. Series: Natural and Technical Sciences. 2006. Vol. 11. Issue 3. Pp. 299-304.
[4] A. S. Gribin. Application of artificial neural network algorithms for short-term prognosis:
     Dis. ... candidate of physical and mathematical sciences. SP-b. 2005. 154 p.
[5] J. W. Taylor, R. Buizza. Neural Network Load Forecasting with Weather Ensemble Predictions.
     IEEE Trans. on Power Systems. 2002. Vol. 17. Pp. 626-632. DOI: 10.1109/TPWRS.2002.800906
[6] L. I. Velikanova. Short-term forecasting of humidity of mountain soils. Problems of
     automation and control. 2015. No. 4. Pp. 158-166.
[7] S. N. Verzunov. Application of deep neural networks for short-term prediction of visibility
     range. Problems of automation and control. 2019. No. 1 (36). Pp. 118-130.
[8] Lei Yu, Zhao Danning, Cai Hongbing. Prediction of length-of-day using extreme learning
     machine. Geodesy and geodynamics. 2015. Vol. 6. No. 2. Pp. 151-159.
[9] Koprinska, Irena et al. Convolutional Neural Networks for Energy Time Series Forecasting.
     2018 International Joint Conference on Neural Networks (IJCNN). 2018. Pp. 1-8.
[10] Imran Maqsood, Muhammad Riaz Khan, Ajith Abraham. An ensemble of neural networks for
     weather forecasting. Neural Comput & Applic. 2004. Vol. 13. Pp. 112-122.
     DOI: 10.1007/s00521-004-0413-4
[11] Archive of meteorological data in Odessa.
     URL: https://rp5.ru/Архив_погоды_в_Одессе (Accessed 19.11.2020)
[12] Guidelines for hydrometeorological forecasting. Ukrainian Hydrometeorological Center.
     Kyiv, 2019. 35 p.

</pre>