

      Time Series Forecasting with Metric Analysis Approach

   Victor V. Ivanov†∗, Alexander V. Kryanev∗†, Leonid A. Sevastianov‡†, David K. Udumyan∗‡§
                  ∗ National Research Nuclear University “MEPhI”
                  † Joint Institute for Nuclear Research
                  ‡ Peoples’ Friendship University of Russia (RUDN University)
                  § University of Miami
  Email: ivanov@jinr.ru, a_v_kryanev@mtu-net.ru, sevastianov_la@rudn.university, mathudum@gmail.com

   A time series forecasting scheme based on the metric analysis approach is presented. The
scheme first filters out noisy components with the help of singular spectrum analysis.
   The scheme uses an autoregressive model in which the predicted value is treated as a
function of the m previous filtered values of the series. The autoregressive model thus
reduces the prediction problem to nonlinear interpolation of a function of several variables,
which are the smoothed (filtered) values of the original chaotic time series. To solve this
interpolation problem, the article uses metric analysis, which makes it possible to reveal the
patterns of autoregressive dependence in the deterministic component of the chaotic time
process under investigation. In classical interpolation schemes, the interpolating function is
recovered at once over the entire region of the independent variables under consideration. In
metric analysis, the interpolated values are restored separately at each point, taking into
account the location of that point relative to the points at which the values of the function
are given. Metric analysis therefore incorporates the individual features of the location of
each restoration point directly into the interpolation formula, which makes it possible to
obtain a more accurate result in the absence of additional information about the function.
   The article considers time series from various areas: stock prices, sales volumes,
passenger traffic volumes in the metro, and electricity consumption in the Moscow region.
The examples demonstrate the effectiveness of the scheme. In particular, they show that the
scheme can predict the trend dynamics of the chaotic time series under investigation with
acceptable accuracy several tens of time steps ahead. The accuracy of the forecast depends
largely on the dimension of the forecast model, that is, the number m of previous values used
in the autoregression. The scheme allows one to select an optimal value of this dimension,
which, on average, provides the best prediction accuracy.
   The publication was partially supported by the Ministry of Education and Science of the
Russian Federation (the Agreement number 02.A03.21.0008).

   Key words and phrases: time series, autoregressive model, interpolation of functions
of many variables, metric analysis, prediction scheme, forecasting examples.


                                       1.   Introduction
   One of the main problems of data processing in many areas is predicting the values of
time processes. To date, many methods and schemes have been developed that solve various
particular problems of forecasting time processes [1, 2]. Below we briefly describe the
metric analysis approach to predicting the values of time series [3–5] and its application
to the prediction of specific time series [6].


Copyright © 2017 for the individual papers by the papers’ authors. Copying permitted for private and
academic purposes. This volume is published and copyrighted by its editors.
In: K. E. Samouilov, L. A. Sevastianov, D. S. Kulyabov (eds.): Selected Papers of the VII Conference
“Information and Telecommunication Technologies and Mathematical Modeling of High-Tech Systems”,
Moscow, Russia, 24-Apr-2017, published at http://ceur-ws.org
60                                                                                              ITTMM—2017


                                    2.    Forecasting Scheme
     The prediction scheme uses the interpolation method for the functional dependence

$$Y = F(X_1, \ldots, X_m) = F(\vec{X}),$$

where the function $F(\vec{X})$ is unknown and is to be recovered, either at a single point
$\vec{X}^*$ or at a set of given points, from the known values $Y_k$, $k = 1, \ldots, n$, of
the function at the fixed points $\vec{X}_k = (X_{k1}, \ldots, X_{km})^T$ [3].
   According to the interpolation method based on metric analysis, the interpolated values
are found by minimizing, with respect to $\vec{z} = (z_1, \ldots, z_n)^T$, the measure of
metric uncertainty [3–5]

$$\sigma_{ND}^2(Y^*; \vec{z}) = \bigl(W(\vec{X}^*; \vec{X}_1; \ldots; \vec{X}_n)\,\vec{z}, \vec{z}\bigr).$$

The interpolated value is determined by the linear combination

$$Y^* = \sum_{i=1}^{n} z_i Y_i$$

and is given by

$$Y^* = \frac{(W^{-1}\vec{Y}, \vec{1})}{(W^{-1}\vec{1}, \vec{1})},$$

where $\vec{Y} = (Y_1, \ldots, Y_n)^T$ and $\vec{1}$ is the $n$-dimensional vector of ones.
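The closed form above is consistent with minimizing the quadratic form under a normalization constraint on the weights. The following sketch assumes the constraint $(\vec{z}, \vec{1}) = 1$, which this section does not state explicitly, and assumes that $W$ is symmetric and invertible; $\lambda$ is a Lagrange multiplier introduced only for this derivation.

```latex
\begin{aligned}
L(\vec{z}, \lambda) &= (W\vec{z}, \vec{z}) - 2\lambda\bigl((\vec{z}, \vec{1}) - 1\bigr), \\
\nabla_{\vec{z}} L = 0 \;&\Rightarrow\; W\vec{z} = \lambda\vec{1}
  \;\Rightarrow\; \vec{z} = \lambda W^{-1}\vec{1}, \\
(\vec{z}, \vec{1}) = 1 \;&\Rightarrow\; \lambda = \frac{1}{(W^{-1}\vec{1}, \vec{1})}, \\
Y^* = (\vec{z}, \vec{Y}) &= \frac{(W^{-1}\vec{1}, \vec{Y})}{(W^{-1}\vec{1}, \vec{1})}
  = \frac{(W^{-1}\vec{Y}, \vec{1})}{(W^{-1}\vec{1}, \vec{1})}.
\end{aligned}
```

The last equality uses the symmetry of $W^{-1}$, so the two ways of writing the numerator that appear in this paper agree.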
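   The matrix of metric uncertainty $W$ is defined by

$$W = \begin{pmatrix}
\rho^2(\vec{X}_1, \vec{X}^*)_{\vec{w}} & (\vec{X}_1, \vec{X}_2)_{\vec{w}} & \cdots & (\vec{X}_1, \vec{X}_n)_{\vec{w}} \\
(\vec{X}_2, \vec{X}_1)_{\vec{w}} & \rho^2(\vec{X}_2, \vec{X}^*)_{\vec{w}} & \cdots & (\vec{X}_2, \vec{X}_n)_{\vec{w}} \\
\vdots & \vdots & \ddots & \vdots \\
(\vec{X}_n, \vec{X}_1)_{\vec{w}} & (\vec{X}_n, \vec{X}_2)_{\vec{w}} & \cdots & \rho^2(\vec{X}_n, \vec{X}^*)_{\vec{w}}
\end{pmatrix},$$

where

$$\rho^2(\vec{X}_i, \vec{X}^*)_{\vec{w}} = \sum_{k=1}^{m} w_k \left(X_{ik} - X_k^*\right)^2,$$

$$(\vec{X}_i, \vec{X}_j)_{\vec{w}} = \sum_{k=1}^{m} w_k \left(X_{ik} - X_k^*\right)\left(X_{jk} - X_k^*\right), \qquad i, j = 1, \ldots, n.$$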
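As an illustration, the matrix $W$ and the interpolation formula can be sketched in Python with NumPy. The function name `metric_interpolate`, the default unit weights $w_k = 1$, and the small `ridge` term are assumptions made for this sketch and do not come from the source. The ridge term is included because, with the entries defined above, $W$ equals $D\,\mathrm{diag}(\vec{w})\,D^T$ for the matrix $D$ of differences $\vec{X}_i - \vec{X}^*$, so its rank is at most $m$ and it cannot be inverted directly when $n > m$.

```python
import numpy as np


def metric_interpolate(X, Y, x_star, w=None, ridge=1e-8):
    """Interpolate F at x_star from samples (X[i], Y[i]) by metric analysis.

    X: (n, m) array of nodes, Y: (n,) values, x_star: (m,) query point.
    w: (m,) metric weights; unit weights are an assumption of this sketch.
    ridge: relative diagonal regularization (an assumption, not in the source).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_star = np.asarray(x_star, dtype=float)
    n, m = X.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)

    D = X - x_star                  # row i holds X_i - X*
    W = (D * w) @ D.T               # W[i, j] = sum_k w_k * D[i, k] * D[j, k]
    scale = np.trace(W) / n
    if scale == 0.0:                # every node coincides with x_star
        scale = 1.0
    W_reg = W + ridge * scale * np.eye(n)

    u = np.linalg.solve(W_reg, np.ones(n))   # u approximates W^{-1} 1
    z = u / u.sum()                           # weights z_i, normalized to sum to one
    return float(z @ Y), z


# Three nodes in the plane with the values of f(x1, x2) = 2*x1 + 3*x2 + 1 at them.
nodes = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
values = [1.0, 3.0, 4.0]
y_star, z = metric_interpolate(nodes, values, [0.25, 0.25])
```

In this small example the query point lies inside the triangle of nodes, so the recovered weights are close to its barycentric coordinates and the interpolated value is close to the value of the linear function at that point.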

Consider a function of time $y = f(t)$ with known values $Y_1 = f(t_1), \ldots, Y_n = f(t_n)$
at the times $t_1 < \ldots < t_n$.
   It is required to find the predicted value $y_{n+1}$ at $t_{n+1}$.
   The problem of finding the predicted value $y_{n+1}$ is reduced to the interpolation of a
function of several variables by means of a nonlinear autoregressive model [3–5]:

$$\begin{aligned}
y(t_{m+1}) &= y_{m+1} = F(y_1, \ldots, y_m), \\
y(t_{m+2}) &= y_{m+2} = F(y_2, \ldots, y_{m+1}), \\
&\;\;\vdots \\
y(t_n) &= y_n = F(y_{n-m}, \ldots, y_{n-1}).
\end{aligned}$$
                                     Ivanov Victor V. et al.                              61


   Then the prediction of the function $y = f(t)$ is reduced to interpolating the function
of $m$ variables $Y = F(y_1, y_2, \ldots, y_m)$ from its values at the $n - m$ points

$$\begin{aligned}
\vec{X}_1 &= (Y_1, \ldots, Y_m)^T, \\
\vec{X}_2 &= (Y_2, \ldots, Y_{m+1})^T, \\
&\;\;\vdots \\
\vec{X}_{n-m} &= (Y_{n-m}, \ldots, Y_{n-1})^T.
\end{aligned}$$
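The construction of these points from a series can be sketched as follows. The helper name `delay_embed` is hypothetical; the pairing of each point $\vec{X}_k$ with the value $Y_{k+m}$ follows the autoregressive relations $y_{k+m} = F(y_k, \ldots, y_{k+m-1})$.

```python
import numpy as np


def delay_embed(series, m):
    """Return the points X_1..X_{n-m} (as rows) and the values of F at them.

    Row k-1 is X_k = (Y_k, ..., Y_{k+m-1}); the matching value is Y_{k+m}.
    """
    y = np.asarray(series, dtype=float)
    n = y.size
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < len(series)")
    X = np.stack([y[k:k + m] for k in range(n - m)])
    targets = y[m:]
    return X, targets


X, targets = delay_embed([1, 2, 3, 4, 5, 6], m=2)
# X has rows (1, 2), (2, 3), (3, 4), (4, 5); targets are 3, 4, 5, 6
```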
  The predicted value $y_{for} = y_{n+1}$ is defined as the interpolated value of the function
$Y = F(y_1, y_2, \ldots, y_m)$ at the point $\vec{X}^*$:

$$y_{n+1} = F(\vec{X}^*) = \frac{(W^{-1}\vec{1}, \vec{Y})}{(W^{-1}\vec{1}, \vec{1})},$$

where $\vec{X}^* = (Y_{n-m+1}, \ldots, Y_n)^T$, $W^{-1}$ is the inverse of the
$(n - m) \times (n - m)$ matrix of metric uncertainty, and $\vec{Y} = (Y_{m+1}, \ldots, Y_n)^T$
is the $(n - m)$-dimensional vector of the values of the predicted time process.
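A minimal one-step forecast along these lines might look as follows. This is a sketch under stated assumptions, not the authors' implementation: the name `metric_forecast` is hypothetical, unit weights $w_k = 1$ are used, and a small relative `ridge` term is added to the diagonal because the $(n - m) \times (n - m)$ matrix built from $m$-dimensional differences has rank at most $m$.

```python
import numpy as np


def metric_forecast(series, m, ridge=1e-8):
    """One-step forecast y_{n+1} from Y_1..Y_n (unit weights, ridge assumed)."""
    y = np.asarray(series, dtype=float)
    n = y.size
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < len(series)")

    X = np.stack([y[k:k + m] for k in range(n - m)])   # X_1, ..., X_{n-m}
    targets = y[m:]                                      # (Y_{m+1}, ..., Y_n)
    x_star = y[n - m:]                                   # (Y_{n-m+1}, ..., Y_n)

    D = X - x_star
    W = D @ D.T                                          # unit weights w_k = 1
    scale = np.trace(W) / (n - m)
    if scale == 0.0:                                     # constant series
        scale = 1.0
    W_reg = W + ridge * scale * np.eye(n - m)

    u = np.linalg.solve(W_reg, np.ones(n - m))           # approximates W^{-1} 1
    return float((u @ targets) / u.sum())                # (W^{-1}1, Y) / (W^{-1}1, 1)


# A linear series is continued linearly, up to the effect of the ridge term.
forecast = metric_forecast(np.arange(1.0, 11.0), m=2)
```

Because the weights sum to one, the forecast is an affine combination of past values, which is why a purely linear series is extrapolated along the same line.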
    The natural number $m$ determines the dimension of the space of the vectors $\vec{X}$,
and its value is found as the solution of the extremal problem [3–5]

$$m = \arg\min \|\vec{Y} - \vec{Y}_{for}\|.$$
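The section does not spell out how $\vec{Y}_{for}$ is formed for this minimization. One plausible reading, assumed here purely for illustration, is a retrospective check: for each candidate $m$, the last `n_val` known values are forecast one step ahead from the data preceding them, and the $m$ with the smallest Euclidean error norm is kept. The names `one_step`, `select_m`, `candidates` and `n_val`, the unit weights, and the ridge term are all assumptions of this sketch.

```python
import numpy as np


def one_step(y, m, ridge=1e-8):
    """One-step metric-analysis forecast with unit weights (ridge is assumed)."""
    n = y.size
    X = np.stack([y[k:k + m] for k in range(n - m)])
    D = X - y[n - m:]
    W = D @ D.T
    scale = np.trace(W) / (n - m)
    if scale == 0.0:
        scale = 1.0
    u = np.linalg.solve(W + ridge * scale * np.eye(n - m), np.ones(n - m))
    return float((u @ y[m:]) / u.sum())


def select_m(series, candidates, n_val):
    """Pick the m that minimizes the norm of retrospective forecast errors."""
    y = np.asarray(series, dtype=float)
    n = y.size
    errors = {}
    for m in candidates:
        # Forecast each of the last n_val values from the data preceding it.
        preds = np.array([one_step(y[:t], m) for t in range(n - n_val, n)])
        errors[m] = float(np.linalg.norm(y[n - n_val:] - preds))
    best_m = min(errors, key=errors.get)
    return best_m, errors


t = np.arange(60)
best_m, errors = select_m(np.sin(0.3 * t), candidates=[2, 3, 4, 5], n_val=10)
```

Other validation designs, such as scoring multi-step forecasts, would fit the same formula; the choice above is only the simplest one to state.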

   In the scheme proposed in this paper, the original time series is first filtered (its
trend is isolated) by means of singular spectrum analysis [1].
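The paper does not give the SSA variant or its parameters. As a hedged illustration, a basic SSA filter (embedding into a trajectory matrix, truncated SVD, diagonal averaging) can be sketched as below; the function name `ssa_filter` and the parameters `window` and `rank` are assumptions of this sketch.

```python
import numpy as np


def ssa_filter(series, window, rank):
    """Basic SSA smoothing: keep the leading `rank` SVD components."""
    y = np.asarray(series, dtype=float)
    n = y.size
    k = n - window + 1
    if not (1 <= window <= n and 1 <= rank <= min(window, k)):
        raise ValueError("invalid window or rank for this series length")

    # Trajectory (Hankel) matrix: entry (i, j) is y[i + j].
    traj = np.column_stack([y[j:j + window] for j in range(k)])

    U, s, Vt = np.linalg.svd(traj, full_matrices=False)
    approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]

    # Diagonal averaging: average all entries that map to the same time index.
    total = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        total[i:i + k] += approx[i]
        counts[i:i + k] += 1
    return total / counts


t = np.arange(40)
trend = ssa_filter(2.0 * t + 1.0, window=5, rank=2)
```

The trajectory matrix of a linear series has rank two, so keeping two components reproduces it; for noisy data, `window` and `rank` control how much of the series is treated as trend.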

                                3.    Numerical Results
    Fig. 1 shows an example of forecasting the daily closing prices of a company’s shares
using the metric analysis scheme.
    Fig. 2 shows the forecast of the sums of one-time sales of a shoe store (data provided
by A. Khokhlov).
    Figs. 3 and 4 show the forecasts of the daily passenger traffic (in thousands of
passengers) on the Moscow metro during various periods of 2014 (data source: the Moscow
Metro; data provided by E. Osetrov, see also [6]).
    Figs. 5 and 6 show the forecasts of the daily electricity consumption (in billion
kilowatt-hours) in the Moscow region (Moscow and the Moscow region), on working days only,
during different periods of 2014 (data source: System Operator of the Unified Energy
Systems / JSC “SO UES” branch of the Moscow Regional Dispatch Office; data provided by
E. Osetrov, see also [6]).

                                      4.    Conclusion
   The forecasting scheme presented in this article allows one to predict the trend
dynamics of the chaotic time series under analysis. The numerical results obtained for time
processes from various fields show that the scheme achieves acceptable forecast accuracy.

Figure 1. Forecasting 20 steps ahead. The continuous line is the original raw data, the
dashed line is the filtered component, the solid line is the forecast
(the optimal value of m is 20)


Figure 2. Forecasting 10 steps ahead. The continuous line is the original raw data, the
dashed line is the filtered component, the solid line is the forecast
(the optimal value of m is 10)


Figure 3. Forecasting 30 steps ahead. The continuous line is the original raw data, the
dashed line is the filtered component, the solid line is the forecast
(the optimal value of m is 20)

Figure 4. Forecasting 40 steps ahead. The continuous line is the original raw data, the
dashed line is the filtered component, the solid line is the forecast
(the optimal value of m is 30)


                                                References
1.    N. Golyandina, V. Nekrutkin, A. Zhigljavsky. Analysis of Time Series Structure:
      SSA and Related Techniques. Chapman & Hall/CRC, 2001.
2.    A. V. Kryanev, G. V. Lukin. Mathematical methods for processing indeterminate
      data. Leningrad: Nauka, 2006 [in Russian].
3.    A. V. Kryanev, D. K. Udumyan. Metric Analysis, Properties and Applications as a
      Tool for Forecasting. International Journal of Mathematical Analysis, Vol. 8, 2014,
      no. 60, pp. 2971–2978.
4.    V. V. Ivanov, A. V. Kryanev, D. K. Udumyan, G. V. Lukin. Metric Analysis Approach
      for Interpolation and Forecasting of Time Processes. Applied Mathematical
      Sciences, Vol. 8, 2014, no. 22, pp. 1053–1060.
5.    A. V. Kryanev, G. V. Lukin, D. K. Udumyan. Metric Analysis and data processing.
      Leningrad: Nauka, 2012 [in Russian].
6.    V. V. Ivanov, A. V. Kryanev, E. S. Osetrov. Forecasting daily electricity consumption
      in the Moscow region using artificial neural networks. Physics of Particles and
      Nuclei Letters, 2017, Issue 2.

Figure 5. Forecasting 20 steps ahead. The continuous line is the original raw data, the
dashed line is the filtered component, the solid line is the forecast
(the optimal value of m is 10)


Figure 6. Forecasting 50 steps ahead. The continuous line is the original raw data, the
dashed line is the filtered component, the solid line is the forecast
(the optimal value of m is 12)