<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Preparing Data and Determining Parameters for a Feedforward Neural Network Used for Short-Term Air Temperature Forecasting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Boris Perelygin</string-name>
          <email>b.perelygin@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tatiana Tkach</string-name>
          <email>tatkatkach@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna Gnatovskaya</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>This article presents the results of preparing the initial data and determining the specifications of a feedforward artificial neural network used for short-term forecasting of ambient air temperature. Based on the requirements for forecast accuracy, the training data were optimized: the number of training vectors and their length, the type of source data, and the way a training sample is formed from the array of source data were determined. In addition, the neural network specifications that provide the required forecast accuracy were selected, namely the requirements for the neuron activation function and the number of hidden layers.</p>
      </abstract>
      <kwd-group>
        <kwd>Artificial neural network</kwd>
        <kwd>short-term forecast</kwd>
        <kwd>air temperature</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Forecasting is one of the most important tasks
in almost all areas of science and life. Predicting
weather factors is one of the oldest forecasting
tasks because of their great influence on all
aspects of human life. Meteorological weather
forecasts are a scientifically based assumption
about the future state of the weather. The success
of modern short-term weather forecasts is quite
high, but there are also inaccurate forecasts,
especially in cases of abnormal manifestations of
the weather. Therefore, research in this area
remains relevant at the present time.</p>
      <p>
        In recent decades, along with traditional
methods of weather forecasting, the use of
artificial neural networks (ANN) for forecasting is
considered as a promising area of research [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2,
3, 4</xref>
        ]. The initial data for weather forecasting for
ANN are commonly the results of regular
measurements of weather characteristics in the
form of numerical values. With the help of ANN,
it is possible to model the nonlinear dependence
between the future value of a time series, its past
values, and the values of external factors [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. For
instance, it is proposed to use fully connected
feedforward neural networks to predict the time
series of mountain soil humidity [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Deep neural
networks could be used to predict the
meteorological visibility range [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Multi-wavelet
polymorphic networks could be employed to
predict geophysical time series [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. To predict
long-term series, it is proposed to use extreme
learning machines [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and convolutional neural
networks [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In particular, examples of predicting
temperature values are presented [
        <xref ref-type="bibr" rid="ref1 ref10 ref3">1, 3, 10</xref>
        ].
However, the papers on this subject mainly
compare the applicability of different ANN
architectures and, as a rule, do not sufficiently
cover the methodology for preparing the initial
data, numerical estimates of forecasting quality,
or the influence of the ANN parameters on these
estimates. In other words, the selection of ANN
parameters for forecasting meteorological
elements remains insufficiently covered.
This work aims to fill this gap. To this end, a
well-known feedforward ANN was taken as the
subject of research and used for short-term
forecasting.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem statement</title>
      <p>The object of the study is the process of using
a feedforward ANN to predict air temperature
values. The subject of the study is a feedforward
ANN designed for short-term forecasting of air
temperature values.</p>
      <p>The research had to determine how the accuracy
of short-term forecasts with lead times of 3 hours,
1 day, and 3 days is affected by: 1) the parameters
of the training data (the length of the training
vectors, the number of training vectors, the
location of the training sample within the overall
series of observations, and the type of source
data); 2) the parameters of the neural network (the
number of hidden layers and the presence of
restrictions in the activation functions of
hidden-layer neurons).</p>
    </sec>
    <sec id="sec-3">
      <title>3. Initial data</title>
      <p>
        Air temperature values were chosen as data
for the research because of the continuity of these
data and the clarity of the results obtained. The
data are a 15-year series of air temperature
values (43569 samples) obtained from regular
observations made eight times a day at weather
station 33837 Odessa from February 1, 2005 to
December 31, 2019 (Fig. 1) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>When forming the array of initial data,
missing air temperature values were interpolated
as the arithmetic mean of the neighboring
temperature values. There were only 6 such
gaps in the data, so with a series length of
43569 samples, their correction did not
significantly affect the result of the study.</p>
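      <p>The gap-filling step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name <monospace>fill_gaps_with_neighbor_mean</monospace> is hypothetical, and the sketch assumes that each gap is a single missing sample with valid neighbors on both sides, since the paper does not state how longer gaps would be treated.</p>
      <preformat><![CDATA[
```python
from typing import List, Optional


def fill_gaps_with_neighbor_mean(series: List[Optional[float]]) -> List[float]:
    """Replace isolated missing values (None) with the mean of the two adjacent values.

    Assumption: every gap is a single sample with valid neighbors on both
    sides; longer gaps and gaps at the series ends are rejected.
    """
    filled = list(series)
    last = len(series) - 1
    for i, value in enumerate(series):
        if value is None:
            if i == 0 or i == last:
                raise ValueError("a gap at the edge of the series has no two neighbors")
            left, right = series[i - 1], series[i + 1]
            if left is None or right is None:
                raise ValueError("adjacent gaps are not covered by this sketch")
            filled[i] = (left + right) / 2.0
    return filled


# Illustrative values only, not taken from the Odessa series.
temps = [4.0, None, 6.0, 5.5]
print(fill_gaps_with_neighbor_mean(temps))
```
]]></preformat>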
    </sec>
    <sec id="sec-4">
      <title>4. Research methodology and tools</title>
      <p>
        Because meteorological quantities vary greatly
in space and time, the specific value of any
quantity given in a forecast should be considered
the most likely value of that quantity during the
forecast period. At the end of the validity period
of a short-term forecast, its success is assessed
on the basis of accuracy. Accuracy is the degree to
which predicted and actual meteorological values
and phenomena match within established
tolerances. The accuracy of a temperature forecast
is assessed on a graded scale: if the predicted
temperature differs from the actual one by no more
than 2.0 °C, the forecast accuracy is 100%; if the
difference is 3.0 °C, the accuracy is 50%; if it is
≥ 4.0 °C, the accuracy is 0% [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In this study, accuracy was calculated in a
similar way, but without the graded scale, as the
ratio of the number of accurate forecasts (falling
within the range of ±2 °C) to the total number of
forecasts for a given lead time.
      </p>
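      <p>The accuracy ratio used in the study can be written down in a few lines. The sketch below is only an illustration under stated assumptions: the function name <monospace>forecast_accuracy</monospace> and the sample temperatures are hypothetical, while the ±2 °C tolerance comes from the text above.</p>
      <preformat><![CDATA[
```python
from typing import Sequence


def forecast_accuracy(predicted: Sequence[float], actual: Sequence[float],
                      tolerance_c: float = 2.0) -> float:
    """Share of forecasts whose absolute error does not exceed tolerance_c (in °C)."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance_c)
    return hits / len(predicted)


# Illustrative values only: the absolute errors are 1.0, 2.5, 0.5 and 2.0 °C,
# so three of the four forecasts fall within the tolerance.
predicted = [10.0, 12.5, 7.0, 3.0]
actual = [11.0, 15.0, 7.5, 5.0]
print(forecast_accuracy(predicted, actual))
```
]]></preformat>
      <p>Multiplying the returned share by 100 gives the accuracy as a percentage for one lead time.</p>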
      <p>In this paper, different types of data mean
the actual series of observations (the so-called
"raw" data) and its two transformations: a
centered series, obtained by subtracting the
arithmetic mean from all values of the series, and
a normalized series, obtained by dividing all
values of the series by the maximum absolute value
of the series. All series of observations were
divided into two large groups: the first group for
training and the second group for forecasting
(Fig. 2).</p>
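      <p>The two transformations of the series can be sketched as follows. The function names <monospace>center</monospace> and <monospace>normalize</monospace> and the sample values are hypothetical. The sketch assumes that the mean and the maximum absolute value are taken over the whole series passed in, which the paper does not specify further.</p>
      <preformat><![CDATA[
```python
from typing import List, Sequence


def center(series: Sequence[float]) -> List[float]:
    """Subtract the arithmetic mean of the series from every value."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def normalize(series: Sequence[float]) -> List[float]:
    """Divide every value by the largest absolute value in the series."""
    max_abs = max(abs(x) for x in series)
    if max_abs == 0:
        raise ValueError("cannot normalize an all-zero series")
    return [x / max_abs for x in series]


# Illustrative values only: the mean is 5.0 and the largest absolute value is 15.0.
raw = [-5.0, 0.0, 10.0, 15.0]
print(center(raw))
print(normalize(raw))
```
]]></preformat>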
      <p>The required arrays of initial data were formed
from the first group of data, intended for
training, as follows. The whole group was divided
into 3 parts. The first part (I in Fig. 2) was used
for training the network (training set). The second
part (II in Fig. 2) was used as a validation set to
check the quality of training.</p>
      <p>When experiments are repeated many times, the
validation set begins to play a key role in
creating the model; that is, it becomes part of
the learning process. This weakens its role as an
independent criterion of model quality: with a
large number of experiments, there is a risk of
choosing a network that merely gives a good result
on the validation set. To give the final model
proper reliability, the third part of the first
group of data was held back as a test set of
observations (III in Fig. 2).</p>
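      <p>A chronological three-way split of the training group might look like the sketch below. The paper does not give the sizes of parts I, II, and III, so the sizes are left as parameters. The function name <monospace>split_training_group</monospace> and the example sizes are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import List, Sequence, Tuple


def split_training_group(group: Sequence[float], n_train: int, n_validation: int
                         ) -> Tuple[List[float], List[float], List[float]]:
    """Cut the training group into consecutive parts I (training),
    II (validation) and III (test) without shuffling the time order.

    The part sizes are inputs because the paper does not state them.
    """
    if n_train <= 0 or n_validation <= 0 or n_train + n_validation >= len(group):
        raise ValueError("each of the three parts must contain at least one sample")
    part_1 = list(group[:n_train])
    part_2 = list(group[n_train:n_train + n_validation])
    part_3 = list(group[n_train + n_validation:])
    return part_1, part_2, part_3


# Illustrative sizes only.
group = [float(i) for i in range(10)]
train, validation, test = split_training_group(group, n_train=6, n_validation=2)
print(train, validation, test)
```
]]></preformat>
      <p>Keeping the parts contiguous preserves the time order of the observations, so the test part is never seen while the network parameters are being selected.</p>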
      <p>The final model was tested on data from this
set to make sure that the results achieved with the
training and validation sets are real. From the
data obtained, the indicator of training quality
was calculated using the same methodology as for
forecast accuracy. During the research, the
parameters of the neural network and the length and
number of training vectors were varied, as was the
starting position of the training array, in order
to assess the seasonal impact of the data on
forecast quality.</p>
      <p>For the research, we used a conventional
feedforward ANN, shown in Fig. 3. The number of
its inputs varied depending on the size of the
training set, and the number of hidden layers
varied from 1 to 3. The neuron activation
functions were also varied.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Result analysis</title>
      <p>To determine the best length of the training
vector and the best number of vectors for training
the neural network in terms of forecast accuracy,
the training and forecasting procedure was
simulated multiple times (N cycles, approximately
50). The size of the training array (the length of
the vectors and their number) was chosen so that
training completed in no more than one hour.
Otherwise, short-term forecasting with a 3-hour
lead time would lose its meaning, since the result
could be obtained after the forecast time. During
the simulations, the quality of training and the
accuracy of all three forecasts with different
lead times were evaluated while varying: a) the
type of source data ("raw", centered, normalized);
b) the number of hidden layers (1, 2, 3); and c)
the activation function of the hidden and output
layers, from linear to sigmoidal. As a result of
these studies, 72·N three-dimensional graphs were
obtained. Since the volume of this paper does not
allow us to present them all, two of them are
shown as examples in Fig. 4.</p>
      <p>The number of training vectors is plotted
along the abscissa axis, the length of the vectors
along the ordinate axis, and either the training
quality or the forecast accuracy for the
corresponding lead time along the applicate axis.
Each point of these graphs was calculated for the
specified vector length, number of vectors, number
of hidden layers of the ANN, type of activation
function, and type of source data. From a
mathematical standpoint, the graph in Fig. 4a
shows how well the network approximates the
training array, and the graph in Fig. 4b shows the
quality of extrapolation to data on which the
network was not trained. The graph in Fig. 4b
clearly shows that forecasts are unstable for any
length of training vectors when the number of
vectors is small.</p>
      <p>The analysis of all the graphs showed that, to
obtain the required accuracy of forecasts, the
initial data for training the network should be
presented as 150 vectors with a length of 16
samples each (the circled area in Fig. 5a). This
also provides stable short-term forecasting, one
result of which is shown in Fig. 6.</p>
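      <p>One way to build a training sample of 150 vectors with 16 samples each from a temperature series is sketched below. The paper does not describe how the vectors are cut from the series, so this sketch assumes windows shifted by one sample, taken from the most recent part of the series. The function name <monospace>make_training_vectors</monospace> and the parameter <monospace>lead_steps</monospace> are hypothetical. With eight observations a day, lead times of 3 hours, 1 day, and 3 days would correspond to 1, 8, and 24 steps.</p>
      <preformat><![CDATA[
```python
from typing import List, Sequence, Tuple


def make_training_vectors(series: Sequence[float], n_vectors: int = 150,
                          vector_length: int = 16, lead_steps: int = 1
                          ) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) from the most recent part of the series.

    Assumptions of this sketch: consecutive windows are shifted by one
    sample, and the target of each window is the value lead_steps samples
    after the window's last element.
    """
    needed = n_vectors + vector_length + lead_steps - 1
    if len(series) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(series)}")
    tail = list(series[-needed:])
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(n_vectors):
        end = start + vector_length
        inputs.append(tail[start:end])
        targets.append(tail[end + lead_steps - 1])
    return inputs, targets


# Synthetic series used only to show the shapes; not real temperatures.
series = [float(i) for i in range(200)]
inputs, targets = make_training_vectors(series)
print(len(inputs), len(inputs[0]), len(targets))
```
]]></preformat>
      <p>Because the windows are taken from the end of the series, the training sample stays close in time, and therefore in season, to the moment of the forecast.</p>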
      <p>Figure 5: Accuracy of forecasting with a linear function of neuron activation (a) and in the presence of
nonlinearity in the activation function of neurons (b) with the same form of initial data and the same
number of hidden layers</p>
      <p>In addition, the presence of nonlinearity
(restriction) in the activation function greatly
worsens the prediction quality indicator, i.e.
accuracy (Fig. 5), and more than doubles the
training time. Therefore, for this problem, the
neurons of the network should have a purely
linear activation function.</p>
      <p>The simulation showed that increasing the
number of hidden layers neither improves nor
worsens the quality of forecasting: the accuracy
does not change significantly. However, the
network architecture becomes more complicated,
and with the same learning algorithm (the
Levenberg-Marquardt algorithm in the error
backpropagation procedure), the training time
increases several-fold.</p>
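      <p>One possible reading of this result, offered here as an observation rather than as the authors' explanation, is that a chain of layers with purely linear activation computes a single linear map, so extra hidden layers add parameters but no expressive power. The sketch below checks this on small hand-picked weight matrices. All names and values are hypothetical, and biases are omitted for brevity.</p>
      <preformat><![CDATA[
```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def linear_network(x: Matrix, weights: List[Matrix]) -> Matrix:
    """Apply layers with identity activation and no bias, first layer first."""
    out = x
    for w in weights:
        out = matmul(w, out)
    return out


def collapse(weights: List[Matrix]) -> Matrix:
    """Multiply all layer matrices into one equivalent single-layer matrix."""
    total = weights[0]
    for w in weights[1:]:
        total = matmul(w, total)
    return total


# Hand-picked weights: 2 inputs, hidden layers of 3 and 2 neurons, 1 output.
w1 = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
w2 = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
w3 = [[2.0, -1.0]]
x = [[1.0], [2.0]]  # input as a column vector

deep_output = linear_network(x, [w1, w2, w3])
single_output = matmul(collapse([w1, w2, w3]), x)
print(deep_output, single_output)
```
]]></preformat>
      <p>Both computations give the same output for any input, which is consistent with the observation that accuracy did not change with the number of hidden layers when linear activation was used.</p>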
      <p>The type of source data does not affect the
quality of forecasting: the accuracy does not
change significantly when the "raw" source data
are replaced with centered or normalized data.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>The analysis of the obtained results made it
possible to determine the type of initial data, the
volume of initial data, and the main parameters of
the feedforward ANN for predicting temperature
values:</p>
      <list list-type="bullet">
        <list-item>
          <p>the presence of nonlinearity (restriction) in the activation function significantly worsens the prediction quality indicator, i.e. accuracy, and more than doubles the training time; therefore, for this problem, the neurons of the network should have a purely linear activation function;</p>
        </list-item>
        <list-item>
          <p>increasing the number of hidden layers neither improves nor worsens the quality of forecasting, and the accuracy does not change significantly, but the network architecture becomes more complicated, and with the same learning algorithm (the Levenberg-Marquardt algorithm in the error backpropagation procedure), the training time increases several-fold;</p>
        </list-item>
        <list-item>
          <p>the type of source data does not affect the quality of forecasting: the accuracy does not change significantly when the "raw" source data are replaced with centered or normalized data;</p>
        </list-item>
        <list-item>
          <p>the initial data for training the network should be presented as 150 vectors with a length of 16 samples each, which ensures stable short-term forecasting;</p>
        </list-item>
        <list-item>
          <p>when training on such a small amount of data, the following condition must be observed: when forecasting for a certain season (time of year), the earlier data used for training must come from exactly the same time of year; otherwise, the forecast error increases significantly.</p>
        </list-item>
      </list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B. F.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          .
          <article-title>Short-term temperature prediction based on neural networks</article-title>
          .
          <source>Topical issues of agrarian science</source>
          .
          <year>2019</year>
          . No.
          <issue>30</issue>
          . Pp.
          <fpage>59</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Verzunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Lychenko</surname>
          </string-name>
          .
          <article-title>Multiwavelength polymorphic network for forecasting geophysical time series</article-title>
          .
          <source>Problems of automation and control</source>
          .
          <year>2017</year>
          . No.
          <issue>1 (32)</issue>
          . Pp.
          <fpage>78</fpage>
          -
          <lpage>87</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Kozadaev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Arzamassev</surname>
          </string-name>
          .
          <article-title>Forecasting of time series using the apparatus of artificial neural networks. Short-term forecast of air temperature</article-title>
          .
          <source>Bulletin of the Tambov University. Series: Natural and Technical Sciences</source>
          .
          <year>2006</year>
          . Vol.
          <volume>11</volume>
          . Issue
          <issue>3</issue>
          . Pp.
          <fpage>299</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Gribin</surname>
          </string-name>
          .
          <source>Application of artificial neural network algorithms for short-term prognosis</source>
          : Dis. ... candidate of physical and mathematical sciences. SP-b.
          <year>2005</year>
          . 154 p.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Buizza</surname>
          </string-name>
          .
          <article-title>Neural Network Load Forecasting with Weather Ensemble Predictions</article-title>
          .
          <source>IEEE Trans. on Power Systems</source>
          .
          <year>2002</year>
          . Vol.
          <volume>17</volume>
          . Pp.
          <fpage>626</fpage>
          -
          <lpage>632</lpage>
          . DOI:
          <pub-id pub-id-type="doi">10.1109/TPWRS.2002.800906</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L. I.</given-names>
            <surname>Velikanova</surname>
          </string-name>
          .
          <article-title>Short-term forecasting of humidity of mountain soils</article-title>
          .
          <source>Problems of automation and control</source>
          .
          <year>2015</year>
          . No.
          <issue>4</issue>
          . Pp.
          <fpage>158</fpage>
          -
          <lpage>166</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Verzunov</surname>
          </string-name>
          .
          <article-title>Application of deep neural networks for short-term prediction of visibility range</article-title>
          .
          <source>Problems of automation and control</source>
          .
          <year>2019</year>
          . No.
          <issue>1 (36)</issue>
          . Pp.
          <fpage>118</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>Lei Yu</string-name>
          ,
          <string-name>Zhao Danning</string-name>
          ,
          <string-name>Cai Hongbing</string-name>
          .
          <article-title>Prediction of length-of-day using extreme learning machine</article-title>
          .
          <source>Geodesy and geodynamics</source>
          .
          <year>2015</year>
          . Vol.
          <volume>6</volume>
          . No.
          <issue>2</issue>
          . Pp.
          <fpage>151</fpage>
          -
          <lpage>159</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Irena</given-names>
            <surname>Koprinska</surname>
          </string-name>
          et al.
          <article-title>Convolutional Neural Networks for Energy Time Series Forecasting</article-title>
          .
          <source>2018 International Joint Conference on Neural Networks (IJCNN)</source>
          .
          <year>2018</year>
          . Pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Imran</given-names>
            <surname>Maqsood</surname>
          </string-name>
          ,
          <string-name>Muhammad Riaz Khan</string-name>
          ,
          <string-name>
            <given-names>Ajith</given-names>
            <surname>Abraham</surname>
          </string-name>
          .
          <article-title>An ensemble of neural networks for weather forecasting</article-title>
          .
          <source>Neural Comput &amp; Applic</source>
          .
          <year>2004</year>
          . Vol.
          <volume>13</volume>
          . Pp.
          <fpage>112</fpage>
          -
          <lpage>122</lpage>
          . DOI:
          <pub-id pub-id-type="doi">10.1007/s00521-004-0413-4</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <source>Archive of meteorological data in Odessa</source>
          . URL:
          <ext-link ext-link-type="uri" xlink:href="https://rp5.ru/Архив_погоды_в_Одессе">https://rp5.ru/Архив_погоды_в_Одессе</ext-link>
          (Accessed
          <date-in-citation content-type="access-date">19.11.2020</date-in-citation>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <source>Guidelines for hydrometeorological forecasting</source>
          .
          <publisher-name>Ukrainian Hydrometeorological Center</publisher-name>
          .
          <publisher-loc>Kyiv</publisher-loc>
          ,
          <year>2019</year>
          . 35 p.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>