The Use of The Kolmogorov–Wiener Filter for Prediction of Heavy-Tail Stationary Processes

Vyacheslav Gorev, Alexander Gusev and Valerii Korniienko

Dnipro University of Technology, 19 Dmytra Yavornytskoho Ave., 49005 Dnipro, Ukraine

Abstract
We investigate the possibility of the practical use of the Kolmogorov–Wiener filter for the prediction of a heavy-tail stationary random process. A discrete process and a discrete filter are considered. Nowadays telecommunication traffic in telecommunication systems with data packet transfer is considered to be a heavy-tail random process, so the problem under consideration may be applied to the prediction of telecommunication traffic, which may be important, for example, for the prevention of network congestion, for the maximization of the network utilization rate and for cyber security, because a comparison of the actual traffic with the predicted one may help to detect cyber-attacks. There are a lot of different and rather sophisticated approaches to traffic prediction, for example, the ARIMA approach, neural network approaches and so on, which may be applicable to the prediction of a non-stationary traffic in various cases. However, in the rather simple case of a stationary telecommunication traffic, more simple approaches may be applied. For example, such a simple prediction approach as the Kolmogorov–Wiener filter is not sufficiently developed in the literature. In this paper it is shown that if a stationary heavy-tail random process is smooth enough, then the Kolmogorov–Wiener filter may be used for its practical prediction. The obtained results may be taken into account for practical telecommunication traffic prediction in telecommunication systems with data packet transfer.

Keywords
Kolmogorov–Wiener filter, prediction, heavy-tail stationary random process, power-law correlation function, telecommunication traffic

1. 
Introduction and related works

The problem of telecommunication traffic prediction is important for telecommunications. For example, it is important for the prevention of network congestion and for the maximization of the network utilization rate [1]; it is significant for understanding future market dynamics and reducing the decision risks [2]. The telecommunication traffic prediction is also important for cyber security [3] because the comparison of the actual traffic with the predicted one may help to detect cyber-attacks. There are a lot of different approaches to traffic prediction. For example, the following ones can be indicated: Auto Regressive Integrated Moving Average (ARIMA), Markov Modulated Poisson Process models (MMPP), Kalman filtering, Seasonal ARIMA (SA), a neural network approach (including deep neural networks [4]), wavelet transforms [1], the least-squares support vector machine (LSSVM), gray models [2], Holt-Winters models [3]. Of course, rather complicated approaches should be used for non-stationary randomly fluctuating traffic prediction. But if the traffic is stationary and rather smooth, sophisticated approaches may not be needed. For example, in [2] some methods are presented for a description of rather simple cases. In [2] it is stressed that in stationary cases the ARMA approach may be used too, and in the case of a smooth monotone process the gray model may be applied.

IntelITSIS’2022: 3rd International Workshop on Intelligent Information Technologies and Systems of Information Security, March 23–25, 2022, Khmelnytskyi, Ukraine
EMAIL: lordjainor@gmail.com (V. Gorev); gusev1950@ukr.net (A. Gusev); vikor7@ukr.net (V. Korniienko)
ORCID: 0000-0002-9528-9497 (V. Gorev); 0000-0002-0548-728X (A. Gusev); 0000-0002-0800-3359 (V. Korniienko)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)
As is known [5], such a simple filter as the Kolmogorov–Wiener one may be used for the prediction of stationary random processes. However, as far as we know, such an approach is not sufficiently developed in the literature for traffic prediction even for rather simple cases. The Kolmogorov–Wiener filter is widely used for signal extraction in different fields of knowledge [6]. It is widely used in econometric analyses [7, 8] and in image restoration [9]. The theoretical fundamentals of the Kolmogorov–Wiener filter for continuous telecommunication traffic prediction are developed in our recent paper [10]. The paper [10] is dedicated to the solution of the Wiener–Hopf integral equation in the unknown filter weight function for two telecommunication traffic models: the power-law structure function model and the model of fractional Gaussian noise; the solutions based on the truncated polynomial expansion method and the truncated trigonometric Fourier series method are obtained. However, the possibility of using the Kolmogorov–Wiener filter for practical traffic prediction is still under question. The aim of this work is to show that the Kolmogorov–Wiener filter may be applicable to traffic prediction if the traffic is stationary and smooth enough. As is known [11, 12], the telecommunication traffic in systems with data packet transfer is considered to be a self-similar heavy-tail random process. So, if we show that the Kolmogorov–Wiener filter is applicable to the prediction of simulated data of a stationary random self-similar heavy-tail process, then we will be able to conclude that it may be applied to practical telecommunication traffic prediction. In this paper we restrict ourselves to the investigation of a discrete process and a discrete filter. 
The corresponding simulated data may be generated via the symmetric moving average approach [13]; the generated process is in fact similar to the fractional Gaussian noise process, which may describe telecommunication traffic, see [14].

The paper is organized as follows. In Sec. 1 the introduction and the literature review are given. In Sec. 2 the discrete Kolmogorov–Wiener filter and the symmetric moving average approach for obtaining simulated stationary heavy-tail data are described. In Sec. 3 heavy-tail simulated data are obtained. In Sec. 4 the prediction results are described, and in Sec. 5 conclusions are made.

2. Description of the discrete Kolmogorov–Wiener filter and of the method of generation of heavy-tail simulated data

Let the filter input $x'_t$ be a stationary random process which is the sum of the signal $s_t$ and the noise $n_t$:

$$x'_t = s_t + n_t. \tag{1}$$

The Kolmogorov–Wiener filter output $y_t$ should be «the closest» to the value $s_{t+z}$, where $z$ is the number of points for which the prediction is made, so we have the following requirement:

$$\langle (y_t - s_{t+z})^2 \rangle \to \min. \tag{2}$$

The correlation function $R_{x'}(t)$ of the filter input $x'_t$ and the cross-correlation function $R_{sx'}(t)$ of the processes $s_t$ and $x'_t$ are considered to be given. The Kolmogorov–Wiener filter is considered to be a linear one, so the filter output is expressed in terms of the filter input as follows:

$$y_t = \sum_{i=0}^{T} h_i x'_{t-i} \tag{3}$$

where $h_i$ are the unknown filter weight coefficients and the input data are given for $t = 0, 1, 2, \dots, T$. The coefficients $h_i$ should minimize expression (2). The term $\langle s_{t+z}^2 \rangle$ is a constant and does not depend on the weight coefficients $h_i$, so (2) can be rewritten as

$$\langle y_t^2 \rangle - 2 \langle y_t s_{t+z} \rangle \to \min, \tag{4}$$

which in view of (3) gives

$$\sum_{i,j=0}^{T} h_i h_j \langle x'_{t-i} x'_{t-j} \rangle - 2 \sum_{i=0}^{T} h_i \langle x'_{t-i} s_{t+z} \rangle = f(h_0, h_1, \dots, h_T) \to \min. \tag{5}$$

With account for the facts that

$$\langle x'_{t-i} x'_{t-j} \rangle = R_{x'}(i - j) \tag{6}$$

and

$$\langle x'_{t-i} s_{t+z} \rangle = R_{sx'}(i + z) \tag{7}$$

one can finally write

$$\sum_{i,j=0}^{T} h_i h_j R_{x'}(i - j) - 2 \sum_{i=0}^{T} h_i R_{sx'}(i + z) = f(h_0, h_1, \dots, h_T) \to \min. \tag{8}$$

The function $f(h_0, h_1, \dots, h_T)$ is a quadratic one, and thus it has one minimum, which is described by the conditions

$$\frac{\partial f(h_0, h_1, \dots, h_T)}{\partial h_k} = 0; \quad k = 0, 1, 2, \dots, T. \tag{9}$$

These conditions, with account for the evenness of the correlation function and the fact that

$$\frac{\partial h_i}{\partial h_j} = \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} \tag{10}$$

lead to

$$\sum_{i=0}^{T} h_i R_{x'}(i - k) = R_{sx'}(k + z); \quad k = 0, 1, 2, \dots, T, \tag{11}$$

which is a set of linear equations in the unknown coefficients $h_i$. In matrix form, this set may be presented as

$$R_{x'} \cdot h = R_{sx'} \tag{12}$$

where

$$R_{x'} = \begin{pmatrix} R_{x'}(0) & R_{x'}(1) & R_{x'}(2) & \cdots & R_{x'}(T) \\ R_{x'}(1) & R_{x'}(0) & R_{x'}(1) & \cdots & R_{x'}(T-1) \\ R_{x'}(2) & R_{x'}(1) & R_{x'}(0) & \cdots & R_{x'}(T-2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_{x'}(T) & R_{x'}(T-1) & R_{x'}(T-2) & \cdots & R_{x'}(0) \end{pmatrix} \tag{13}$$

is the correlation matrix [5], $h$ is the vector column of the unknown weight coefficients, and $R_{sx'}$ is the vector column of the free terms:

$$h = \begin{pmatrix} h_0 \\ h_1 \\ h_2 \\ \vdots \\ h_T \end{pmatrix}, \quad R_{sx'} = \begin{pmatrix} R_{sx'}(z) \\ R_{sx'}(z+1) \\ R_{sx'}(z+2) \\ \vdots \\ R_{sx'}(z+T) \end{pmatrix}. \tag{14}$$

So, the vector column $h$ may be found as

$$h = R_{x'}^{-1} \cdot R_{sx'}. \tag{15}$$

Then the filter output may be obtained by formula (3). It should be noticed that all the above-mentioned calculations are described in [6].

The Kolmogorov–Wiener filter may be used both for the extraction of a signal from the sum of a signal and a noise and for the signal prediction. In the case where the input signal is non-noisy, the Kolmogorov–Wiener filter may be used for the prediction of the stationary process given at the filter input. In the non-noisy case, the filter weight coefficients are given by formula (15) with account for the fact that

$$R_{sx'} = \begin{pmatrix} R_{x'}(z) & R_{x'}(z+1) & R_{x'}(z+2) & \dots & R_{x'}(z+T) \end{pmatrix}^{\mathsf{T}}. \tag{16}$$

Now let us describe the method of the generation of heavy-tail simulated data, which is used in the paper.
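Before that, the construction of the weight coefficients by (11)–(16) in the non-noisy case may be illustrated by a short sketch. It is only an illustration under stated assumptions and not the implementation used in the paper: the names `solve_linear` and `kw_weights` are hypothetical, plain Gaussian elimination is used instead of a dedicated Toeplitz solver, and the correlation function $R(k) = \rho^{k}$ is chosen only because the answer is known in that case ($h_0 = \rho^z$, all other weights equal to zero).

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay intact.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def kw_weights(corr, T, z):
    """Weights h_0..h_T from (11)-(16) in the non-noisy case.

    corr(k) is assumed to return R_{x'}(k) for an integer lag k >= 0.
    """
    # Correlation matrix (13): element (k, i) equals R_{x'}(|i - k|).
    matrix = [[corr(abs(i - k)) for i in range(T + 1)] for k in range(T + 1)]
    # Free terms (16): R_{x'}(z), R_{x'}(z + 1), ..., R_{x'}(z + T).
    rhs = [corr(k + z) for k in range(T + 1)]
    return solve_linear(matrix, rhs)


# Illustrative correlation function (an assumption, not the one of the paper).
rho = 0.5
h = kw_weights(lambda k: rho ** k, T=4, z=1)
```

For large $T$ a solver that exploits the Toeplitz structure of (13) would be preferable; the dense elimination above is kept only for brevity.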
In this paper we use the symmetric moving average approach, which is described in detail in [13]. Such an approach was chosen because of its simplicity. Let $V_t$ be a stationary white noise process with an average value equal to zero and a variance equal to 1. Then a heavy-tail process $X_i$ similar to the fractional Gaussian noise may be generated as follows [13]:

$$X_i = \sum_{j=-q}^{q} a_{|j|} V_{i+j} = a_q V_{i-q} + a_{q-1} V_{i-q+1} + \dots + a_q V_{i+q}. \tag{17}$$

Theoretically, $q$ should be infinite; in practical calculations it may be a rather large, but finite number. The coefficients $a_j$ are as follows:

$$a_0 = \frac{\sqrt{(2 - 2H)\gamma_0}}{1.5 - H} \tag{18}$$

and

$$a_j = \frac{a_0}{2} \left( (j+1)^{H+0.5} + (j-1)^{H+0.5} - 2 j^{H+0.5} \right), \tag{19}$$

here, $\gamma_0$ is the variance and $H$ is the Hurst exponent of the process $X_i$. The number $q$ may be very large; it is estimated as follows [13]:

$$q \ge \max \left( m, \left( \frac{H^2 - 0.25}{2\beta} \right)^{\frac{1}{1.5 - H}} \right) \tag{20}$$

where $m$ is the number of correlation function points of the process $X_i$ which should be obtained, and the small number $\beta$ is in fact the given accuracy of the coefficients $a_j$ in (17): the values $a_{j>q}$ should be less than $\beta a_0$. The accuracy of this method depends on $q$, and the method is not exact even in the case where $q \to \infty$. However, for a rather large $q$ the method may lead to good practical results [13].

3. The generation of non-smooth and smooth heavy-tail simulated data

$10^6$ points of the white noise process $V_t$ with an average value equal to 0 and a variance equal to 1 are generated on the basis of the generator built in the Wolfram Mathematica package. The following parameters were chosen:

$$m = 10^5, \quad \beta = 10^{-4}, \quad H = 0.8, \quad \gamma_0 = 1. \tag{21}$$

The corresponding number $q = 3 \cdot 10^5$ is chosen. In fact, the inequality (20) holds even for $q = 10^5$; the value $q = 3 \cdot 10^5$ was chosen for a higher accuracy.
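The coefficients (18)–(19), the bound (20) and the sum (17) may be illustrated by the following sketch. It is a minimal illustration and not the Wolfram Mathematica code used in the paper: the function names are hypothetical, the white noise is taken to be Gaussian (an assumption), and a small $q$ is used for the sample so that the example runs quickly. The noise array is indexed with an offset of $q$ so that all indices stay inside the array.

```python
import math
import random


def sma_coefficients(hurst, gamma0, q):
    """Coefficients a_0..a_q from (18)-(19)."""
    a0 = math.sqrt((2.0 - 2.0 * hurst) * gamma0) / (1.5 - hurst)
    p = hurst + 0.5
    coeffs = [a0]
    for j in range(1, q + 1):
        coeffs.append(0.5 * a0 * ((j + 1) ** p + (j - 1) ** p - 2.0 * j ** p))
    return coeffs


def q_lower_bound(hurst, beta, m):
    """Right-hand side of (20), rounded up to an integer."""
    tail = ((hurst ** 2 - 0.25) / (2.0 * beta)) ** (1.0 / (1.5 - hurst))
    return max(m, math.ceil(tail))


def sma_sample(coeffs, n, rng):
    """n points of X_i by the symmetric sum (17).

    Gaussian white noise of length n + 2q is an assumption of this sketch;
    the offset q keeps every index i + j + q within the noise array.
    """
    q = len(coeffs) - 1
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * q)]
    return [
        sum(coeffs[abs(j)] * noise[i + j + q] for j in range(-q, q + 1))
        for i in range(n)
    ]


a = sma_coefficients(hurst=0.8, gamma0=1.0, q=200)   # small q for speed only
q_min = q_lower_bound(hurst=0.8, beta=1e-4, m=10 ** 5)
xs = sma_sample(a, n=50, rng=random.Random(0))
```

With the parameters (21) the second argument of the maximum in (20) is about $5 \cdot 10^4$, so the bound is determined by $m = 10^5$.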
On the basis of the idea (17)–(19), $10^5$ points of the process $X_i$ were generated as follows:

$$X_i = \sum_{j=-q}^{q} a_{|j|} V_{i+j+q}. \tag{22}$$

Because $V_t$ is stationary white noise, it does not matter whether formula (17) or formula (22) is used; formula (22) is chosen in order to avoid indices beyond the bounds of the array $V_i$. The coefficients $a_j$ are calculated on the basis of (18)–(19). The average value of $X_i$ is close to zero. We have to construct simulated data that may describe telecommunication traffic, which is obviously non-negative. So we build the array $x_i$ as follows:

$$x_i = X_i + |\min(X)| + 10^{-3}, \tag{23}$$

where a small summand $10^{-3}$ is added in order to avoid obtaining an infinite value of the prediction mean absolute percentage error (MAPE). The process $x_i$ is a non-negative random stationary heavy-tail process; its graph is given in Fig. 1.

Let us make sure that the generated process $x_i$ is a heavy-tail one. Let us describe the corresponding centralized process $xc_i$:

$$xc_i = x_i - \langle x \rangle \tag{24}$$

where the average value $\langle x \rangle$ is

$$\langle x \rangle = \frac{1}{10^5} \sum_{i=1}^{10^5} x_i, \tag{25}$$

here we take into account the fact that the number of points of the generated array $x_i$ is equal to $10^5$. The correlation function of the process $xc_i$ is built as follows:

$$R_x(\tau) = \langle xc_i \cdot xc_{i+\tau} \rangle = \frac{1}{10^5 - \tau} \sum_{i=1}^{10^5 - \tau} (xc_i \cdot xc_{i+\tau}). \tag{26}$$

The corresponding correlation function and its least-square fit are given in Fig. 2.

Figure 1: The values of the simulated non-smooth heavy-tail non-negative random process

Figure 2: The correlation function of the simulated non-smooth heavy-tail random process and its least-square power-law fit; $t \ge 1$.

The corresponding least-square fit is sought as

$$R_{\text{fit}}(t) = a \cdot t^{b}. \tag{27}$$

The following numerical coefficients were obtained:

$$a = 0.39, \quad b = -0.44, \tag{28}$$

here, the coefficients are rounded off to two significant digits.
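The estimate (26) and a fit of the form (27) may be sketched as follows. This is an illustration only: the function names are hypothetical, and the fit is made by ordinary least squares on the logarithms of $t$ and $R$, which is an assumption, because the fitting procedure is not specified in the text. A direct non-linear least-square fit weighs the points differently and may give somewhat different coefficients; the logarithmic variant also requires all fitted correlation values to be positive.

```python
import math


def centered(values):
    """Centralized process (24): subtract the sample mean (25)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def sample_correlation(xc, tau):
    """Estimate (26): mean of xc_i * xc_{i+tau} over the available pairs."""
    n = len(xc)
    return sum(xc[i] * xc[i + tau] for i in range(n - tau)) / (n - tau)


def power_law_fit(ts, rs):
    """Fit R(t) = a * t**b by least squares on (log t, log R).

    Assumes t >= 1 and R > 0 for every fitted point.
    """
    lx = [math.log(t) for t in ts]
    ly = [math.log(r) for r in rs]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


# Self-check on synthetic values that follow a power law exactly.
ts = list(range(1, 51))
a_fit, b_fit = power_law_fit(ts, [0.39 * t ** -0.44 for t in ts])
variance = sample_correlation(centered([1.0, 2.0, 3.0, 4.0]), 0)
```

For real data one would pass `[sample_correlation(xc, t) for t in ts]` as the second argument of `power_law_fit`.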
So,

$$R_{\text{fit}}(t) = 0.39 \cdot t^{-0.44}, \tag{29}$$

and on the basis of formula (29) and Fig. 2 one can conclude that the correlation function exhibits a power-law decay rather than an exponential one. So, indeed, the generated process is a heavy-tail one. It should also be noticed that according to [13] the following property should be valid for $t \ge 1$:

$$R_x(t) \sim t^{2H-2}, \tag{30}$$

so, according to the least-square fit

$$2H - 2 = -0.44, \tag{31}$$

which leads to

$$H = 0.78, \tag{32}$$

which is very close to the value 0.8, see (21). The variance of the process is equal to

$$R_x(0) = 0.93, \tag{33}$$

which is rather close to the value $\gamma_0 = 1$, see (21). So one can conclude that the generated process is close to the fractional Gaussian noise with given variance and Hurst exponent.

The generated process is non-smooth, i.e. it is really highly fluctuating, so it is rather difficult to predict it. So it is reasonable to investigate smooth heavy-tail processes. In order to obtain smoother processes, we use a very simple smoothing algorithm [15]:

$$\tilde{X}_i = \frac{1}{2l + 1} \sum_{j=-l}^{l} X_{i+j} \tag{34}$$

where $\tilde{X}_i$ are the values of a smooth process; expression (34) is valid for every point except for the first $l$ and the last $l$ ones. The first $l$ and the last $l$ points of the process $\tilde{X}_i$ may be obtained as the corresponding linear least-square fit of the first $l$ and the last $l$ points of the process $X_i$, respectively. The corresponding non-negative process may be expressed similarly to (23):

$$\tilde{x}_i = \tilde{X}_i + |\min(\tilde{X})| + 10^{-3}, \tag{35}$$

and the corresponding centralized process is

$$\widetilde{xc}_i = \tilde{x}_i - \langle \tilde{x} \rangle \tag{36}$$

where the average value is

$$\langle \tilde{x} \rangle = \frac{1}{10^5} \sum_{i=1}^{10^5} \tilde{x}_i. \tag{37}$$

The simulated data for the process $\tilde{x}_i$ for $l = 3$ are given in Fig. 3.

Figure 3: The values of the simulated smooth heavy-tail non-negative random process for $l = 3$

Figure 4: The correlation function of the simulated smooth heavy-tail random process and its least-square power-law fit; $t \ge 1$.

It should be stressed that the obtained smooth process $\tilde{x}_i$ is also a heavy-tail one.
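The smoothing (34), the shift to non-negative values (35) and the relation (30)–(32) between the fitted exponent and the Hurst exponent may be sketched as follows. The function names are hypothetical, and the treatment of the edges is a simplification made only for this sketch: the first and the last $l$ points are copied unchanged instead of being replaced by a linear least-square fit.

```python
def smooth(values, l):
    """Centered moving average (34) with 2*l + 1 smoothing points.

    Simplification of this sketch: the first and last l points are left
    unchanged rather than replaced by a linear least-squares fit.
    """
    n = len(values)
    out = list(values)
    for i in range(l, n - l):
        out[i] = sum(values[i - l:i + l + 1]) / (2 * l + 1)
    return out


def shift_non_negative(values, eps=1e-3):
    """Shift as in (23) and (35): add |min| and a small positive summand."""
    offset = abs(min(values)) + eps
    return [v + offset for v in values]


def hurst_from_exponent(b):
    """Solve 2H - 2 = b for H, see (30)-(32)."""
    return 1.0 + b / 2.0


smoothed = smooth([0.0, 3.0, 0.0, 3.0, 0.0, 3.0, 0.0], l=1)
shifted = shift_non_negative([-2.0, 0.0, 1.0])
hurst = hurst_from_exponent(-0.44)   # exponent b taken from (28)
```

The small summand `eps` keeps every value strictly positive, so a relative error with the actual value in the denominator stays finite.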
Let us consider the corresponding correlation function:

$$R_{\tilde{x}}(\tau) = \langle \widetilde{xc}_i \cdot \widetilde{xc}_{i+\tau} \rangle = \frac{1}{10^5 - \tau} \sum_{i=1}^{10^5 - \tau} (\widetilde{xc}_i \cdot \widetilde{xc}_{i+\tau}). \tag{38}$$

For example, for $l = 3$ the following correlation function and its fit are obtained, see Fig. 4. The least-square fit is sought in the form (27); the following numerical coefficients were obtained:

$$a = 0.43, \quad b = -0.46, \tag{39}$$

here, the coefficients are rounded off to two significant digits. So,

$$R_{\text{fit}}(t) = 0.43 \cdot t^{-0.46}. \tag{40}$$

As can be seen from Fig. 4, the correlation function of a smooth process is also well described by a power-law function, so the obtained smooth process $\tilde{x}_i$ is also a heavy-tail one, and, in fact, this process may also be roughly considered as fractional Gaussian noise.

4. Prediction on the basis of the Kolmogorov–Wiener filter

The prediction for non-smooth data is built as follows. In fact, the prediction for the centralized process is used. The filter weight coefficients are built on the basis of (13)–(16); the corresponding correlation function is taken in the form (26). First of all, the points $xc_1, xc_2, \dots, xc_{T+1}$ of the simulated process $xc$ are taken as the filter input, and the points $xc_{T+2}, xc_{T+3}, \dots, xc_{T+z+1}$ are predicted. Then the points $xc_2, xc_3, \dots, xc_{T+2}$ are taken from the simulated data, and the points $xc_{T+3}, xc_{T+4}, \dots, xc_{T+z+2}$ are predicted, and so on. At the $i$th iteration of the algorithm the prediction is calculated as follows. The filter input data are

$$x'_0 = xc_i, \quad x'_1 = xc_{i+1}, \quad \dots, \quad x'_T = xc_{i+T}, \tag{41}$$

so

$$x'_j = xc_{i+j}. \tag{42}$$

The filter output $y_t$ is the predicted value for $x'_{t+z}$ (the non-noisy case is investigated). According to (3) we have

$$y_t = \sum_{k=0}^{t} h_k x'_{t-k}, \tag{43}$$

here, the upper bound of summation is changed in order to avoid obtaining indices beyond the array of the input data. Such a change of the bound does not lead to a significant error for the prediction under consideration.
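The truncated sum (43) for one window of input data (41) may be sketched as follows. The function name `predict_window` is hypothetical; the weights `h` are assumed to be already known (for example, from the solution of the system (12)) and to contain $T + 1$ values, and the numbers in the usage lines are arbitrary illustrative values rather than data from the paper.

```python
def predict_window(h, window, z):
    """Predict the z points that follow `window` by the truncated sum (43).

    window holds x'_0..x'_T (length T + 1) and h holds h_0..h_T.
    The output y_t estimates x'_{t+z}, so only t = T + 1 - z, ..., T
    correspond to points beyond the window.
    """
    T = len(window) - 1
    predictions = []
    for t in range(T + 1 - z, T + 1):
        # Upper summation bound is t (not T), so t - k never goes below 0.
        y_t = sum(h[k] * window[t - k] for k in range(t + 1))
        predictions.append(y_t)
    return predictions


# Arbitrary illustrative weights and input window (T = 2).
h_demo = [0.5, 0.0, 0.0]
window_demo = [1.0, 2.0, 4.0]
one_step = predict_window(h_demo, window_demo, z=1)
two_step = predict_window(h_demo, window_demo, z=2)
```

Sliding the window by one point and calling the function again reproduces the iteration scheme described above; the predictions refer to the centralized process.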
On the basis of (41)–(43) one can conclude that

$$\widehat{xc}_{i+t+z} = \sum_{k=0}^{t} h_k \, xc_{t+i-k} \tag{44}$$

where $\widehat{xc}_{i+t+z}$ is the predicted value of $xc_{i+t+z}$. Obviously, the prediction is made only for the values $i + t + z = T + 1 + i, T + 2 + i, \dots, T + z + i$. We should also remember that we should make the prediction for the non-negative simulated data. So, the predicted non-negative data may be expressed as

$$\hat{x}_{i+t+z} = \widehat{xc}_{i+t+z} + \langle x \rangle = \langle x \rangle + \sum_{k=0}^{t} h_k \, xc_{t+i-k}, \quad t = \overline{T + 1 - z, \, T}. \tag{45}$$

The MAPE and MAE errors for the corresponding prediction are calculated as

$$\text{MAPE} = \frac{1}{z} \sum_{t=T+1-z}^{T} \left| \frac{\hat{x}_{i+t+z} - x_{i+t+z}}{x_{i+t+z}} \right| \cdot 100\% \tag{46}$$

and

$$\text{MAE} = \frac{1}{z} \sum_{t=T+1-z}^{T} \left| \hat{x}_{i+t+z} - x_{i+t+z} \right|. \tag{47}$$

The corresponding prediction errors are calculated at each iteration.

Let us say a few words about why the above-mentioned change of the upper bound of summation has no significant effect on the result. In order to make the prediction for $\widehat{xc}_{T+1+i}$, one should calculate the sum of $T + 2 - z$ summands; in order to make the prediction for $\widehat{xc}_{T+2+i}$, one should calculate the sum of $T + 3 - z$ summands, and so on. We obviously deal with the case where $T \gg z$, so the value $T + 1 - z$ is rather close to $T + 1$, and the above-mentioned change of the upper bound is not significant for the calculations.

Similarly, the prediction for the smooth heavy-tail process is made as follows. At the $i$th iteration of the algorithm the prediction is calculated as

$$\hat{\tilde{x}}_{i+t+z} = \langle \tilde{x} \rangle + \sum_{k=0}^{t} h_k \, \widetilde{xc}_{t+i-k}, \quad t = \overline{T + 1 - z, \, T} \tag{48}$$

and the corresponding MAPE and MAE errors are

$$\text{MAPE} = \frac{1}{z} \sum_{t=T+1-z}^{T} \left| \frac{\hat{\tilde{x}}_{i+t+z} - \tilde{x}_{i+t+z}}{\tilde{x}_{i+t+z}} \right| \cdot 100\% \tag{49}$$

and

$$\text{MAE} = \frac{1}{z} \sum_{t=T+1-z}^{T} \left| \hat{\tilde{x}}_{i+t+z} - \tilde{x}_{i+t+z} \right|. \tag{50}$$

The MAPE and MAE are calculated for each above-mentioned iteration. The $10^5 - T - z$ MAPE and MAE values both for smooth and for non-smooth processes are obtained. The following parameters are chosen:

$$T = 100, \quad z = 1. \tag{51}$$

Figure 5: The MAPE and MAE histograms for the prediction of a non-smooth heavy-tail process

Table 1
The prediction results for a smooth heavy-tail process

  l    ⟨x̃⟩    Average MAPE, %    Average MAE
  1    2.98    9.11               0.235
  2    2.52    6.26               0.142
  3    2.34    4.85               0.103
  4    2.31    3.92               0.081
  5    2.22    3.37               0.067
  6    2.11    2.98               0.057
  7    2.04    2.68               0.050

The following results are obtained. The MAPE and MAE histograms in the case of the non-smooth process are shown in Fig. 5. The y-axes of the histograms indicate the number of MAPE and MAE values that belong to the corresponding intervals. For the non-smooth process the average MAPE is 24.7%, and the average MAE is 0.70 (the average value of the process is $\langle x \rangle = 3.88$). It should also be stressed that for some points the MAPE is more than 100%. So one can conclude that the prediction accuracy is not high in the case of the non-smooth process. So, if the process is a highly fluctuating one, then the prediction based on the Kolmogorov–Wiener filter may not lead to good results.

But if the process is rather smooth, the prediction results are much better. The corresponding results are given in Table 1. In Table 1, $l$ is the parameter used in (34), i.e. $2l + 1$ is the number of smoothing points. As can be seen, the smoother the process is, the better the prediction results are, and the prediction accuracy increases with $l$. The corresponding histograms for $l = 3$ are given in Fig. 6. The predictions for $l \ge 6$ have an average MAPE value less than 3%.

Figure 6: The MAPE and MAE histograms for the prediction of a smooth heavy-tail process for $l = 3$

For example, for $l = 3$ the average MAPE is less than 5%. As can be seen from the corresponding histogram, the MAPE for the overwhelming majority of points is less than 10%. For some very rare points the MAPE may be rather high (up to 40%), but in our opinion this may be explained as follows. As can be seen from Fig.
3, the values for some points of the process 𝑥̃ are rather close to zero, and the MAPE may not be an adequate characteristic for the prediction of points close to zero. So, one can conclude that the Kolmogorov–Wiener filter may give good results for the prediction of a stationary heavy-tail random process if the process is smooth enough. 5. Conclusions and plans for the future The use of the Kolmogorov–Wiener filter for the prediction of stationary random heavy-tail processes is considered. The attention is paid to the discrete case. The problem under consideration may be connected with the telecommunication traffic prediction, which is important, for example, for cyber security, see [3]. There are many rather sophisticated approaches to telecommunication traffic prediction [1]. For rather simple cases (stationary or smooth traffic) the ARMA or gray model approaches may be used [2]. The traffic in telecommunication systems with data packet transfer is considered to be a self-similar heavy-tail process, see [11]. Such a simple filter as the Kolmogorov– Wiener one may be used in the prediction of stationary random processes [6]. However, as far as we know, the corresponding approach for traffic prediction is not sufficiently developed in the literature. In this paper we generate data for a stationary heavy-tail process on the basis of the symmetric moving average approach [13]. The corresponding non-smooth and smooth data are generated. The prediction for 1 point forward on the basis of the previous 101 points is investigated. It is shown that the Kolmogorov–Wiener filter is not good for non-smooth processes, but may give a good prediction for a stationary random heavy-tail process if the process is rather smooth. So, if the traffic is stationary and rather smooth, the Kolmogorov–Wiener filter may be used for its prediction. The advantage of the corresponding approach is the simplicity of the method in contrast with, for example, neural networks or ARIMA models. 
The plans for the future are as follows. In this paper only the values T = 100 and z = 1 are investigated. So the prediction investigation for a wider range of parameters may be a plan for the future. In our recent paper [10] the theoretical approach to the Kolmogorov–Wiener filter construction in the continuous case is considered. In this paper we generated a large number of data points, which may allow one to try to investigate the continuous case, so the investigation of the applicability of the method [10] may be another plan for the future. This paper is based on the generation of simulated data, so the investigation of real experimental traffic data may be another plan for the future. It should also be stressed that the use of the Kolmogorov–Wiener filter for the prediction of stationary processes may be useful not only in telecommunications, but also in other fields of knowledge, for example, in electrical engineering, see [16].

6. References

[1] Q. H. Do, T. T. H. Doan, T. V. A. Nguyen, N. T. Duong, V. Van Linh, Prediction of Data Traffic in Telecom Networks based on Deep Neural Networks, Journal of Computer Science 16 (2020) 1268-1277. doi: 10.3844/jcssp.2020.1268.1277.
[2] J.-X. Liu, Z.-H. Jia, Telecommunication Traffic Prediction Based on Improved LSSVM, International Journal of Pattern Recognition and Artificial Intelligence, 32, No. 3 (2018) 1850007 (16 pages), doi: 10.1142/S0218001418500076.
[3] H. Brugner, Holt-Winters Traffic Prediction on Aggregated Flow Data, Proceedings of the Seminars Future Internet and Innovative Internet Technologies and Mobile Communication Focal Topic: Advanced Persistent Threats. Summer Semester 2017 (2017), 25-32. doi: 10.2313/NET-2017-09-1_04.d
[4] P. Kaushik, S. Singh, P. Yadav, Traffic Prediction in Telecom Systems Using Deep Learning, Proceedings of 7th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), August 29-31, 2018, Noida, India (2018), 302-207, doi: 10.1109/ICRITO.2018.8748386.
[5] P. S. R. Diniz, Adaptive Filtering Algorithms and Practical Implementation, 5th ed., Springer Nature Switzerland AG, Cham, 2020, doi: 10.1007/978-3-030-29057-3.
[6] T. Bao, J. Duffy, Signal extraction: experimental evidence, Theory and Decision 90 (2021), 219–232. doi: 10.1007/s11238-020-09785-x
[7] S. G. Pollock, Filters, Waves and Spectra, Econometrics 6 (2018), 35 (33 pages). doi: 10.3390/econometrics6030035
[8] S. G. Pollock, E. Mise, A Wiener–Kolmogorov Filter for Seasonal Adjustment and the Cholesky Decomposition of a Toeplitz Matrix, Computational Economics 59 (2022), 913–933. doi: 10.1007/s10614-020-10087-1
[9] V. Pronina, F. Kokkinos, D.V. Dylov, S. Lefkimmiatis, Microscopy Image Restoration with Deep Wiener-Kolmogorov Filters, in: A. Vedaldi, H. Bischof, T. Brox, JM. Frahm (Eds.), Lecture Notes in Computer Science, vol 12365, Springer, Cham, 2020, pp. 185–201. doi: 10.1007/978-3-030-58565-5_12
[10] V. Gorev, A. Gusev, V. Korniienko, M. Aleksieiev, Kolmogorov–Wiener Filter Weight Function for Stationary Traffic Forecasting: Polynomial and Trigonometric Solutions, in: P. Vorobiyenko, M. Ilchenko, I. Strelkovska (Eds.), Lecture Notes in Networks and Systems, vol 212, Springer, 2021, pp. 111–129. doi: 10.1007/978-3-030-76343-5_7
[11] D. Zhuang, C. Li, Loss Analysis for Networks based on Heavy-Tailed and Self-Similar Traffic, Journal of Physics: Conference Series 1584 (2020), 012054 (8 pages). doi: 10.1088/1742-6596/1584/1/012054.
[12] D. Radev, I. Lokshina, Advanced models and algorithms for self-similar IP network traffic simulations and performance analysis, Journal of Electrical Engineering 61, No. 6 (2010), 341-349. doi: 10.2478/v10187-010-0053-0.
[13] D. Koutsoyiannis, The Hurst phenomenon and fractional Gaussian noise made easy, Hydrological Sciences Journal, 47 (2002), 573-595. doi: 10.1080/02626660209492961.
[14] M. Li, Generalized fractional Gaussian noise and its application to traffic modeling, Physica A 579 (2021), 126138 (22 pages). doi: 10.1016/j.physa.2021.126138.
[15] K. Molugaram, G. S. Rao, Statistical Techniques for Transportation Engineering, Butterworth-Heinemann (Elsevier), Oxford, 2017, doi: 10.1016/B978-0-12-811555-8.00012-X.
[16] Yu. A. Papaika, O. H. Lysenko, Ye. V. Koshelenko, I. H. Olishevskyi, Mathematical modeling of power supply reliability at low voltage quality, Naukovyi Visnyk Natsionalnoho Hirnychoho Universytetu, No. 2 (2021), 97-103. doi: 10.33271/nvngu/2021-2/097.