Modification of the “Piramidal” Algorithm of Small Time Series Forecasting

Yuriy Turbal a, Mariana Turbal a, Andriy Bomba a, Abd Alkaleg Hsen Driwi a and Nataliia Kunanets b

a National University of Water and Environmental Engineering, Soborna, 12, Rivne, 33022, Ukraine
b Lviv Polytechnic National University, Bandery str., 28a, Lviv, 79013, Ukraine

IntelITSIS’2021: 2nd International Workshop on Intelligent Information Technologies and Systems of Information Security, March 24–26, 2021, Khmelnytskyi, Ukraine
EMAIL: turbaly@gmail.com (Y. Turbal); turbal.mariana.1@gmail.com (M. Turbal); a.bomba@ukr.net (A. Bomba); abdo_sum83@yahoo.com (A. A. H. Driwi); nek.lviv@gmail.com (N. Kunanets)
ORCID: 0000-0002-5727-5334 (Y. Turbal); 0000-0001-5675-861X (M. Turbal); 0000-0001-5528-4192 (A. Bomba); 0000-0001-5680-2502 (A. A. H. Driwi); 0000-0002-5671-3638 (N. Kunanets)

Abstract
A modification of the “piramidal” algorithm for small time series forecasting is proposed. The “piramidal” approach was developed in recent years; numerical results show its advantages over known extrapolation approaches based on polynomials, including Newton’s extrapolation. Until now, however, the approach had been tested only on deterministic time series. In this paper the piramidal approach is applied to forecasting in the case where the time series contains a random component. The procedure for constructing the forecast value according to the pyramidal method is studied, and the main selection criterion of the method is improved. The key idea of the improvement is to find special patterns in the table of finite differences. The improved method is applied to forecasting the number of COVID-19 patients in Ukraine.

Keywords
Time series, piramidal algorithm, forecasting, extrapolation, pattern, COVID-19.

1. Introduction

Today, forecasting is one of the most important tasks in the study of various processes: we would always like to look into the future. There are many methods of time-series forecasting. In many tasks it becomes necessary to find patterns in large volumes of data and use them for forecasting [3]. Data mining, as well as predictive modeling, is used in many fields of scientific research. When a large amount of data is available, well-known statistical approaches can be applied [17]-[21]. But what can be done when very little is known? Small time series exhibit many specific features. It is often impossible to determine the nature of the process from the point of view of determinism, or the ratio of its deterministic and random components. In the deterministic case a mathematical model can be built from the observation data and used to obtain the predicted value. There are a number of methods for solving the extrapolation problem. Various interpolation functions can be used for extrapolation, such as generalized polynomials based on systems of Chebyshev functions [1]; exponential and trigonometric functions [12]; flat radial basis functions [14]; splines (cubic, B-splines) and Bezier curves [4]; special analytic functions and trend analysis [9]-[13], [15]. Neural networks are also widely used for extrapolation [8]. But how can one choose the optimal model corresponding to a finite set of experimental data? Obviously, an infinite set of curves passes through a finite set of points on the plane, and each of them can serve as a model of the process.
In the paper [1] a new method of short time series extrapolation, called “piramidal”, was proposed. The aim of the authors was to develop a forecasting method that does not rely on specific classes of functions or on any particular mathematical model. The “piramidal” method is based on finding special conditions in the data, expressed through special finite differences. Calculations for test functions showed the advantages of this method in comparison with extrapolation approaches based on interpolation polynomials. However, the piramidal approach is comparatively new and requires in-depth research and validation on further data series. In this paper we attempt to apply the piramidal approach to forecasting in the case when the time series contains a random component. We study the procedure of constructing the forecast value in accordance with the pyramidal method and improve the main criterion for choosing the optimal row. The main idea of the improvement is based on finding patterns in the table of finite differences. Our modification makes it possible to use the pyramidal approach for data with a stochastic component.

2. “Piramidal” algorithm without midpoints

The “piramidal” method of data extrapolation was proposed in [1]. Its main feature is the construction of special divided differences and the search for the order for which a better predicted value, in a certain sense, can be found. The value of the original function at a point located outside the interpolation interval is then obtained from the predicted value of the divided differences by a special computational procedure. In [1], [12] this method was described taking into account additional interpolation at intermediate points. Since such interpolation did not play a significant role, here we consider an analogue of the corresponding algorithm without midpoints and use different notation.

Let $f_1, f_2, \ldots, f_n$ be a time series and $x_1, x_2, \ldots, x_n$ the corresponding points of time. It is required to estimate the future observation $f_{n+1}$ at a point $x > x_n$. Consider the finite differences modified as follows:

$$\Delta^1 f_i = \frac{f_{i+2} - f_i}{x_{i+2} - x_i}, \quad i = \overline{1, n-1},$$
$$\Delta^2 f_i = \frac{\Delta^1 f_{i+2} - \Delta^1 f_i}{x_{i+3} - x_{i+1}}, \quad i = \overline{1, n-2},$$
$$\Delta^3 f_i = \frac{\Delta^2 f_{i+2} - \Delta^2 f_i}{x_{i+4} - x_{i+2}}, \quad i = \overline{1, n-3}, \ldots$$

In the general case we have

$$\Delta^j f_i = \frac{\Delta^{j-1} f_{i+2} - \Delta^{j-1} f_i}{x_{i+j+1} - x_{i+j-1}}, \qquad (1)$$

where $j = \overline{1, p}$, $i = \overline{1, n-j}$, and

$$p = \begin{cases} \dfrac{n-1}{2}, & n = 2k+1, \\[4pt] \dfrac{n-2}{2}, & n = 2k. \end{cases}$$

It is obvious that the finite differences (1) approximate the derivatives and differ from the classical ones used in the construction of Newton’s interpolation polynomials. Note that if we find the value $\Delta^k f_{n-2k+1}$ for some index $k$ of the table of finite differences, the predicted value of the function at the point $x_{n+1}$ can easily be constructed (see Fig. 1, 2) by the following computational procedure:

$$\Delta^{j-1} f_{n-2j+3} = \Delta^{j-1} f_{n-2j+1} + \Delta^{j} f_{n-2j+1}\,(x_{n-j+2} - x_{n-j}), \quad j = \overline{k, 1}. \qquad (2)$$

Let us consider the following modification of the finite differences:

$$\tilde{\Delta}^{j} f_{n-2j+1} = \frac{\dfrac{\Delta^{j-2} f_{n-2(j-2)} - \Delta^{j-2} f_{n-2(j-2)-1}}{x_{n-j+2} - x_{n-j+1}} - \dfrac{\Delta^{j-2} f_{n-2(j-2)-1} - \Delta^{j-2} f_{n-2(j-2)-2}}{x_{n-j+1} - x_{n-j}}}{(x_{n-j+2} - x_{n-j})/2}. \qquad (3)$$
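To make the construction concrete, the following minimal Python sketch (ours, not taken from [1]) builds the table of modified finite differences (1) and applies the back-substitution procedure (2). The names `diff_table` and `forecast_from_row` are illustrative, 0-based indices replace the paper’s 1-based ones, and the starting estimate `dk_estimate` is assumed to be supplied by (3) or by the row-selection criterion discussed below.

```python
def diff_table(x, f):
    """Build the table of modified finite differences (1).

    table[0] is the data itself; for j >= 1,
    table[j][i] = (table[j-1][i+2] - table[j-1][i]) / (x[i+j+1] - x[i+j-1])
    (0-based indices; the paper counts from 1).
    """
    table = [list(map(float, f))]
    j = 1
    while len(table[j - 1]) >= 3:
        prev = table[j - 1]
        row = [(prev[i + 2] - prev[i]) / (x[i + j + 1] - x[i + j - 1])
               for i in range(len(prev) - 2)]
        table.append(row)
        j += 1
    return table


def forecast_from_row(x, table, k, dk_estimate, x_next):
    """Procedure (2): given an estimate of Delta^k f_{n-2k+1}, climb back
    down the pyramid to the predicted value f_{n+1} at the point x_next."""
    x_ext = list(x) + [x_next]
    n = len(table[0])                      # number of known observations
    value = dk_estimate                    # current diagonal entry Delta^j f_{n-2j+1}
    for j in range(k, 0, -1):
        base = table[j - 1][n - 2 * j]     # Delta^{j-1} f_{n-2j+1} (known entry)
        step = x_ext[n - j + 1] - x_ext[n - j - 1]   # x_{n-j+2} - x_{n-j} in 1-based terms
        value = base + value * step        # becomes Delta^{j-1} f_{n-2j+3}
    return value                           # after j = 1 this is f_{n+1}
```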
The logic for constructing the finite differences (3) is as follows. Let us consider the simplest case (see Fig. 1), $j = 2$:

$$\tilde{\Delta}^{2} f_{n-3} = \frac{\dfrac{f_n - f_{n-1}}{x_n - x_{n-1}} - \dfrac{f_{n-1} - f_{n-2}}{x_{n-1} - x_{n-2}}}{(x_n - x_{n-2})/2}.$$

It is obvious that $\tilde{\Delta}^{2} f_{n-3}$ is a discrete analogue of the second derivative. The main idea of this approach is to find an additional condition under which the following equation is satisfied:

$$\Delta^{k} f_{n-2k+1} = \tilde{\Delta}^{k} f_{n-2k+1}. \qquad (4)$$

Considering (1) and (3), we have

$$\Delta^{k} f_{n-2k+1} = \frac{\Delta^{k-1} f_{n-2k+3} - \Delta^{k-1} f_{n-2k+1}}{x_{n-k+2} - x_{n-k}},$$

so that

$$\Delta^{k-1} f_{n-2k+3} - \Delta^{k-1} f_{n-2k+1} = (x_{n-k+2} - x_{n-k})\,
\frac{\dfrac{\Delta^{k-2} f_{n-2(k-2)} - \Delta^{k-2} f_{n-2(k-2)-1}}{x_{n-k+2} - x_{n-k+1}} - \dfrac{\Delta^{k-2} f_{n-2(k-2)-1} - \Delta^{k-2} f_{n-2(k-2)-2}}{x_{n-k+1} - x_{n-k}}}{(x_{n-k+2} - x_{n-k})/2}.$$

From the last equation, using $\Delta^{k-1} f_{n-2k+3} = (\Delta^{k-2} f_{n-2k+5} - \Delta^{k-2} f_{n-2k+3})/(x_{n-k+3} - x_{n-k+1})$, we get

$$\Delta^{k-2} f_{n-2k+5} - \Delta^{k-2} f_{n-2k+3} = (x_{n-k+3} - x_{n-k+1}) \left[ \Delta^{k-1} f_{n-2k+1} + 2\left( \frac{\Delta^{k-2} f_{n-2(k-2)} - \Delta^{k-2} f_{n-2(k-2)-1}}{x_{n-k+2} - x_{n-k+1}} - \frac{\Delta^{k-2} f_{n-2(k-2)-1} - \Delta^{k-2} f_{n-2(k-2)-2}}{x_{n-k+1} - x_{n-k}} \right) \right]. \qquad (5)$$

Figure 1: Structure of the table of modified finite differences (rows $\Delta^1 f_i$, $\Delta^2 f_i$, $\Delta^3 f_i, \ldots$ above the rows of data values $f_i$ and time points $x_i$, extended by the predicted column $f_{n+1}$, $x_{n+1}$).

Figure 2: Illustration of the spatial generalization of the “pyramidal” method.

The method is based on the search for conditions under which the error $|\Delta^{k} f_{n-2k+1} - \tilde{\Delta}^{k} f_{n-2k+1}|$ is minimal. In [1], [6] the following algorithm $\Xi$ was proposed, consisting of the next steps.

1. Construct the table of finite differences according to (1).
2. Find a row of the table of finite differences according to the condition
$$k = \arg\min_{i} \left| \Delta^{i-1} f_{n-2i+1} - \frac{\Delta^{i-2} f_{n-2i+3} - \Delta^{i-2} f_{n-2i+2}}{x_{n-i+1} - x_{n-i}} \right|. \qquad (6)$$
3. Calculate the value $\tilde{\Delta}^{k} f_{n-2k+1}$ according to (3).
4. Build the predicted value according to procedure (2).

A spatial generalization of the “pyramidal” method was proposed in [12]. To construct the “predictive” value of some surface at a selected point, it is proposed to consider paths passing through lattice nodes where the values of the corresponding surface are known, and a special parameter (measure) of the predictability of the function is determined. The predictive value is then the result of the one-dimensional “pyramidal” approach applied to the function values along the path for which the degree of predictability is maximal.

3. Modification of the Ξ-algorithm

Without loss of generality we can consider a uniform grid, $x_k - x_{k-1} = 0.5$. In this case the finite differences (3) are easy to calculate. The calculation of the value $\tilde{\Delta}^{k} f_{n-2k+1}$ is illustrated in Fig. 3 (a part of the transposed table of Fig. 1). In this table the values $\Delta^{k} f_{l}, \Delta^{k} f_{l+1}, \Delta^{k} f_{l+2}, \Delta^{k} f_{l+3}$ are known and $\Delta^{k} f_{l+4}$ is unknown; the other values recorded in the highlighted cells are also unknown. According to (3) we can find $4\Delta^{k} f_{l+3} - 8\Delta^{k} f_{l+2} + 4\Delta^{k} f_{l+1}$, and the other unknown values are easily found by procedure (5), for example $4\Delta^{k} f_{l+3} - 8\Delta^{k} f_{l+2} + 4\Delta^{k} f_{l+1} + (\Delta^{k} f_{l+2} - \Delta^{k} f_{l})$, and finally

$$\Delta^{k} f_{l+4} = 4\Delta^{k} f_{l+3} - 8\Delta^{k} f_{l+2} + 4\Delta^{k} f_{l+1} + (\Delta^{k} f_{l+2} - \Delta^{k} f_{l}) + \Delta^{k} f_{l+2} = 4\Delta^{k} f_{l+3} - 6\Delta^{k} f_{l+2} + 4\Delta^{k} f_{l+1} - \Delta^{k} f_{l}.$$

Figure 3: Illustration of the calculation of the modified finite differences (3) in the case of a uniform grid. Unknown values in the table cells are highlighted.
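The identity obtained above can be sanity-checked numerically. The following small sketch is ours, with arbitrary test numbers and the illustrative helper name `cubic_next`; it compares the closed-form expression with an explicit degree-3 fit on a uniform grid with step 0.5.

```python
import numpy as np

def cubic_next(a0, a1, a2, a3):
    """Next entry of a row on a uniform grid via the identity from Fig. 3:
    the fourth difference of a cubic vanishes, so a4 = 4*a3 - 6*a2 + 4*a1 - a0."""
    return 4 * a3 - 6 * a2 + 4 * a1 - a0

# Cross-check against an explicit degree-3 fit on a uniform grid with step 0.5.
xs = np.array([0.0, 0.5, 1.0, 1.5])
ys = np.array([5.0, 2.0, -1.5, 4.0])            # arbitrary test values of one table row
coeffs = np.polyfit(xs, ys, 3)                  # exact interpolation: 4 points, degree 3
print(cubic_next(*ys), np.polyval(coeffs, 2.0)) # both give the same extrapolated value, 28
```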
For a more detailed analysis of the Ξ-algorithm it is necessary to consider the necessary and sufficient conditions for relation (4) to hold. We can use two results. In [7] it was shown that the procedure of building the prediction according to formula (4) is equivalent to cubic extrapolation. Thus, the task of determining the forecast value in the corresponding row of the pyramidal method is equivalent to a cubic forecast based on the last four values of the data series, $\Delta^{k} f_{n-2k-3}$, $\Delta^{k} f_{n-2k-2}$, $\Delta^{k} f_{n-2k-1}$, $\Delta^{k} f_{n-2k}$: if a cubic curve passes through the last four points and the predicted fifth point, equation (4) is satisfied. The next additional result can easily be obtained and deals with quadratic extrapolation: equation (4) is satisfied if and only if a parabola passes through the points

$$\left( x_{n-i},\; \frac{\Delta^{i-2} f_{n-2i+3} - \Delta^{i-2} f_{n-2i+1}}{x_{n-i+1} - x_{n-i-1}} \right), \quad \left( \frac{x_{n-i+1} + x_{n-i}}{2},\; \frac{\Delta^{i-2} f_{n-2i+3} - \Delta^{i-2} f_{n-2i+2}}{x_{n-i+1} - x_{n-i}} \right),$$
$$\left( \frac{x_{n-i+2} + x_{n-i+1}}{2},\; \frac{\Delta^{i-2} f_{n-2i+4} - \Delta^{i-2} f_{n-2i+3}}{x_{n-i+2} - x_{n-i+1}} \right), \quad \left( x_{n-i+2},\; \frac{\Delta^{i-2} f_{n-2i+5} - \Delta^{i-2} f_{n-2i+3}}{x_{n-i+3} - x_{n-i+1}} \right). \qquad (7)$$

Thus, we have two criteria for the satisfaction of (4): a “cubic” one and a “quadratic” one. Let us analyze the cases in which a parabola or a cubic curve gives the best forecast. An obvious property is that the faster the interpolation curve grows on the forecast interval, the greater the probability of a large extrapolation error based on this curve. Consider the first three points of the sequence (7) for the “quadratic” criterion, or the four points $(x_{n-k-3}, \Delta^{k} f_{n-2k-3})$, $(x_{n-k-2}, \Delta^{k} f_{n-2k-2})$, $(x_{n-k-1}, \Delta^{k} f_{n-2k-1})$, $(x_{n-k}, \Delta^{k} f_{n-2k})$ for the “cubic” one.

Suppose the data sequence and its rate of change are both increasing. In this case the quadratic or cubic forecast will also give an increase, but the real function may increase according to a significantly different law, and the forecasting error may be large. Suppose the data sequence is increasing while its rate of change decreases. Then the nature of the uncertainty depends significantly on the rate of growth and on the approach to the corresponding local extremum: the farther the extremum point is from the observed interval, the greater the degree of uncertainty of the real function. Suppose the abscissa of the local extremum lies inside the observed interval. In this case the quadratic or cubic prediction is in the region of exiting the zone of small change of the function, and the uncertainty can be large. Finally, suppose the quadratic or cubic interpolation curve has an extremum that coincides with the last observed point. In this case the uncertainty is minimal: if the real function also has a local extremum there, the error is minimal, and even if the real function does not have a local extremum at the last point, it still reduces its growth rate. The curve optimally predicts a given sequence of data if the forecast interval is in the area of a local extremum.

Thus, we can propose the following modification of the procedure for selecting the row of the finite difference table for which the unknown predictive value is constructed by formula (3).

Condition β. In the piramidal algorithm, instead of condition (6), we select the row of the table of finite differences for which the last observation point deviates minimally from the point of the local extremum determined by the cubic or quadratic interpolation curve.
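One possible formalization of the quadratic variant of condition β is sketched below (our reading, not code from the paper): for each finite-difference row produced by the `diff_table` helper above, fit a parabola through its last three points, locate its vertex, and pick the row whose last point is closest to that vertex. The cubic variant would instead locate the extremum of the cubic through the last four points; abscissas are taken only up to a shift, which is harmless on a uniform grid.

```python
import numpy as np

def extremum_distance(xs, ys):
    """Distance from the last observed abscissa to the extremum of the
    parabola through the last three points (quadratic reading of condition beta)."""
    a, b, _ = np.polyfit(xs[-3:], ys[-3:], 2)
    if abs(a) < 1e-12:                      # (almost) linear: no finite extremum
        return np.inf
    return abs(-b / (2.0 * a) - xs[-1])

def choose_row_beta(x, table):
    """Pick the finite-difference row whose last point lies closest to the
    local extremum of its quadratic fit; `table` is produced by diff_table."""
    best_k, best_d = None, np.inf
    for k in range(1, len(table)):          # search over the finite-difference rows
        row = table[k]
        if len(row) < 4:                    # keep enough points for the cubic step too
            break
        xs = np.asarray(x[:len(row)], float)   # uniform grid: only the spacing matters
        d = extremum_distance(xs, np.asarray(row, float))
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d
```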
Note that condition (6) describes a particular case of condition β. It can be proved that under condition (6) the points $(x_{n-k-3}, \Delta^{k} f_{n-2k-3})$, $(x_{n-k-2}, \Delta^{k} f_{n-2k-2})$, $(x_{n-k-1}, \Delta^{k} f_{n-2k-1})$ lie on one line. This means that the function passing through these points changes convexity. Then at the last point the cubic polynomial either approaches a local extremum or exhibits a rapid increase of the function, which leads to a larger prediction error.

4. Numerical results

To illustrate our method, let us consider the data set on the incidence of COVID-19 in Ukraine (official statistics of the Ministry of Health of Ukraine, https://www.pravda.com.ua/cdn/covid-19/cpa/). Consider the statistics from 22.12.20 until 10.01.21. We have the input time series: 6545, 8513, 10136, 11490, 11035, 7709, 6113, 4385, 6988, 7986, 9699, 9432, 5038, 4576, 4158, 5334, 6911, 8997, 5676, 4846, 5011. The results of the evaluation according to our modified piramidal algorithm are shown in Fig. 4. According to condition β, we analyze the distances from the last observation point $(x_{n-k+2}, \Delta^{k-2} f_{n-2k+4})$ for the cubic extrapolation, or from the point

$$\left( \frac{x_{n-k+2} + x_{n-k+1}}{2},\; \frac{\Delta^{k-2} f_{n-2k+4} - \Delta^{k-2} f_{n-2k+3}}{x_{n-k+2} - x_{n-k+1}} \right)$$

for the quadratic extrapolation, to the point of the corresponding local extremum. The process of analyzing the finite differences is illustrated in Fig. 4. A small distance was found for row 8 for the quadratic extrapolation (k = 8); the optimal distance was found for row 2. Graphs of the corresponding interpolation curves for the first case (row 8) are shown in Fig. 5. Both extrapolation curves give good results: the last points are not far from the points of the corresponding local extremums. Our predictive value is 4023, while the real value is 4288. For this data set we can also consider row 2, shown in Fig. 6. This is the optimal situation: for the quadratic extrapolation the distance from the last observation point to the point of the local extremum tends to 0 (see Fig. 5). The cubic extrapolation also gives a good result. Our predictive value is 4675.

Figure 4: Illustration of the process of finite differences table analysis.

Figure 5: Graphs of the cubic (left) and quadratic (right) extrapolation curves.

Let us consider the next value, 4288 (the number of COVID-19 incidences in Ukraine on 11.01.21), and add it to our data set. If we try to build a prediction using the piramidal approach, there is no good situation according to condition β for any row of the table of finite differences; the predictive value is 794 (see Fig. 8), which is far from reality. This means that we cannot find predictive patterns in such a dataset, and in such a situation another method must be used. Let us consider further points of observation: 5116, 6409. Here we can again find a good situation for forecasting (see Fig. 11); the predictive value is 7081 (see Fig. 10), while the real observation is 7925.
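The following end-to-end sketch shows how the pieces of Sections 2 and 3 could be combined on the series above. It reuses the illustrative helpers `diff_table`, `choose_row_beta` and `forecast_from_row` from the earlier listings (our own names), assumes a uniform grid with step 0.5 and implements only the quadratic variant of condition β, so the selected row and the resulting forecast are not guaranteed to coincide exactly with the values reported in Fig. 4.

```python
# End-to-end sketch on the series of Section 4 (22.12.20-10.01.21), reusing the
# illustrative helpers diff_table, choose_row_beta and forecast_from_row defined
# above. The selected row and the resulting forecast depend on these
# implementation choices.
f = [6545, 8513, 10136, 11490, 11035, 7709, 6113, 4385, 6988, 7986, 9699,
     9432, 5038, 4576, 4158, 5334, 6911, 8997, 5676, 4846, 5011]
x = [0.5 * i for i in range(len(f))]

table = diff_table(x, f)
k, dist = choose_row_beta(x, table)            # condition beta, quadratic variant
row = table[k]
# cubic estimate of the next (unknown) entry of the chosen row, as in Fig. 3
d_next = 4 * row[-1] - 6 * row[-2] + 4 * row[-3] - row[-4]
prediction = forecast_from_row(x, table, k, d_next, x_next=x[-1] + 0.5)
print(k, round(prediction))
```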
Let us consider the next point, 7925, and add it to our data set. The result of the forecasting is shown in Fig. 12: the predicted value is 9422, while the real value is 9699.

Figure 6: Illustration of the process of finite differences table analysis.

Figure 7: Graphs of the cubic (left) and quadratic (right) extrapolation curves for the optimal case.

Figure 8: Part of the finite differences table.

Figure 9: Graphs of the quadratic extrapolation curves.

Figure 10: Part of the finite differences table.

Figure 11: Graphs of the cubic (left) and quadratic (right) extrapolation curves.

The peculiarity of this example is that we have good compliance with condition β only for the quadratic extrapolation. The cubic extrapolation shows (see Fig. 13) that the forecast point is in the zone of convexity change. This gives good agreement with the quadratic extrapolation. However, cubic extrapolation cannot be used independently, since it is impossible to assert from four points that the fifth lies in the zone of convexity change of the predicted function.

Figure 12: Part of the finite differences table.

Figure 13: Graphs of the cubic (left) and quadratic (right) extrapolation curves.

5. Conclusions

Thus, a new modification of the “piramidal” algorithm of data forecasting has been presented. Keeping the basic idea of the pyramidal approach, we have changed the procedure for selecting the row of the finite difference table in which the predicted value is found. The improved procedure allowed us to efficiently use the previously proposed piramidal approach for forecasting time series containing a stochastic component. Our approach works by finding certain patterns in a small series of data. To illustrate the method, we considered the data set on the incidence of COVID-19 in Ukraine from 22.12.2020 until 14.01.21. The numerical results demonstrated the high efficiency of our forecasting technique: relative forecasting errors are within 2.8%-10.5%.
Note that the errors could also be associated with inaccuracies in recording the number of cases in different regions of Ukraine.

In the process of justifying the algorithm we obtained interesting additional results. For example, the equivalence of the prediction procedure according to formula (4) and cubic extrapolation makes it possible to significantly improve, in terms of computational complexity, the classical method of constructing a forecast based on a cubic interpolation polynomial. Indeed, there is no need to compose and solve a system of four algebraic equations to find the parameters of a cubic polynomial: it is enough to construct the table of Fig. 3 and perform the simple corresponding calculations described in detail in Section 2 (the abscissa of the first interpolation point can be arbitrary, but the distances between the abscissas of all points must be the same). The proposed method is generic and can be used to extrapolate time series in arbitrary areas of research, including the construction of short-term forecasts of economic dynamics.

6. References

[1] Y. Turbal, A. Bomba, A. Sokh, O. Radoveniuk, M. Turbal, Pyramidal method of small time series extrapolation, International Journal of Computing Science and Mathematics 10.4 (2019) 122-130. doi:10.1504/IJCSM.2019.104025.
[2] A. Bomba, Y. Turbal, Data analysis method and problems of identification of trajectories of solitary waves, Journal of Automation and Information Sciences 5 (2015) 34-43. doi:10.1615/JAutomatInfScien.v47.i10.20.
[3] I. H. Witten, E. Frank, M. A. Hall, C. J. Pal, Data Mining (Fourth Edition): Practical Machine Learning Tools and Techniques, Chapter 8 - Data transformations (2017) 285-334. doi:10.1016/B978-0-12-804291-5.00008-8.
[4] A. S. Kostinsky, On the principles of a spline extrapolation concerning geophysical data, Reports of the National Academy of Sciences of Ukraine 2 (2014) 111-117. doi:10.15407/dopovidi2014.02.111.
[5] Z. Zhan, R. Yang, Z. Xi, et al., A Bayesian inference based model interpolation and extrapolation, SAE Int. J. Mater. Manf. 5.2 (2012) 357-364. doi:10.4271/2012-01-0223.
[6] Y. Turbal, A. Bomba, A. Sokh, O. Radoveniuk, M. Turbal, Spatial generalization of the pyramidal data extrapolation, Bulletin of Taras Shevchenko National University of Kyiv, Series Physics & Mathematics 2 (2017) 146-151.
[7] Y. Turbal, M. Turbal, A. A. Driwi, S. Al Shukri, On the equivalence of the forecast value construction in the “pyramidal” extrapolation method and cubic forecast, MCIT 4 (2020) 67-70. doi:10.31713/MCIT.2020.15.
[8] K. Xu, M. Zhang, J. Li, S. S. Du, K. Kawarabayashi, S. Jegelka, How neural networks extrapolate: from feedforward to graph neural networks, in: Proceedings of the International Conference on Learning Representations (ICLR), Vienna, Austria, 2021. URL: https://arxiv.org/abs/2009.11848.
[9] J. I. Monroe, H. W. Hatch, N. A. Mahynski, M. S. Shell, V. K. Shen, Extrapolation and interpolation strategies for efficiently estimating structural observables as a function of temperature and density, J. Chem. Phys. 153 (2020) 144101. doi:10.1063/5.0014282.
[10] N. A. Mahynski, J. R. Errington, V. K. Shen, Multivariable extrapolation of grand canonical free energy landscapes, J. Chem. Phys. 147 (2017) 234111. doi:10.1063/1.5006906.
[11] L.-Y. Wang, W.-C. Lee, One-step extrapolation of the prediction performance of a gene signature derived from a small study, BMJ Open 5:e007170 (2014). doi:10.1136/bmjopen-2014-007170.
[12] N. P. Bakas, Numerical solution for the extrapolation problem of analytic functions, Research 2019 (2019) 3903187. doi:10.34133/2019/3903187.
[13] T. Coşkun, Approximation of analytic functions of several variables by linear k-positive operators, Turkish Journal of Mathematics 41(2) (2017) 426-435. doi:10.3906/mat-1512-96.
[14] N. Mai-Duy, T. T. Le, C. M. Tien, D. Ngo-Cong, T. Tran-Cong, Compact approximation stencils based on integrated flat radial basis functions, Engineering Analysis with Boundary Elements 74 (2017) 79-87. doi:10.1016/j.enganabound.2016.11.002.
[15] S. Makridakis, N. Bakas, Forecasting and uncertainty: a survey, Risk and Decision Analysis 6(1) (2016) 37-64. doi:10.3233/RDA-150114.
[16] M. A. Negrin, J. Nam, A. H. Briggs, Bayesian solutions for handling uncertainty in survival extrapolation, Med Decis Making 37(4) (2017) 367-376. doi:10.1177/0272989X16650669.
[17] A. Vehtari, A. Gelman, J. Gabry, Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC, Stat Comput 27 (2017) 1413-1432. doi:10.1007/s11222-016-9696-4.
[18] P. B. Conn, D. S. Johnson, P. L. Boveng, On extrapolating past the range of observed data when making statistical predictions in ecology, PLoS One 10(10) (2015) e0141416. doi:10.1371/journal.pone.0141416.
[19] N. Demiris, D. Lunn, L. D. Sharples, Survival extrapolation using the poly-Weibull model, Stat Methods Med Res 24(2) (2015) 287-301. doi:10.1177/0962280211419645.
[20] C. Jackson, J. Stevens, S. Ren, Extrapolating survival from randomized trials using external data: a review of methods, Med Decis Making 37(4) (2017) 377-390. doi:10.1177/0272989X16639900.
[21] A. Vickers, An evaluation of survival curve extrapolation techniques using long-term observational cancer data, Medical Decision Making 39(8) (2019) 926-938. doi:10.1177/0272989X19875950.