<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Forecasting Network Exchange Time Series</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aleksandr Moshnikov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aleksandr Syrov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ITMO University</institution>
          ,
          <addr-line>49 Kronverksky Pr., St. Petersburg, Russian Federation</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This article addresses the problem of network traffic forecasting with time series methods. A special feature of the task is that the traffic of an individual network user is considered. We consider 5 types of user traffic characterized by different volumes and numbers of transmitted packets. Thirteen main models for traffic forecasting are considered. To assess their effectiveness, the RMSE, MAPE, and p-value were evaluated. The Ljung-Box test is used for model assessment. The problem is solved in the R software environment using the forecast package. The results of an experiment with the 13 models are presented.</p>
      </abstract>
      <kwd-group>
        <kwd>Network traffic forecasting</kwd>
        <kwd>Network modelling</kwd>
        <kwd>R programming language</kwd>
        <kwd>Time series</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Modeling network traffic is a notoriously difficult task. This is primarily due to the
ever-increasing complexity of network traffic and the various ways in which the network can be
excited by user activity. The ongoing development of new network applications, protocols, and
usage profiles further necessitates models that can adapt to the specific networks in which they
are deployed [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Demand for telecommunications network traffic continues to grow
exponentially worldwide. Research has shown that the number of mobile cellular subscribers is growing
worldwide and is gradually approaching the size of the world's population [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Meeting this exponential growth requires effective planning and rapid expansion of
telecommunications systems, as well as the introduction of modern equipment. One approach open to
leading industry players is to develop and adopt appropriate forecasting models. Forecasting
methods can be classified as long-term and short-term. According to [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], forecasting over time intervals of weeks,
months, and years is long-term, while short-term forecasts cover milliseconds, seconds,
minutes, hours, and days. Time series modeling and forecasting are widely used for analyzing
telecommunications network traffic [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. ARIMA models have been shown to give stable
forecasts for BitTorrent traffic [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In contrast, [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] indicated that ARIMA models predict accurately only over
a limited time interval. Data sets of HTTP network traffic have also been analyzed grouped by
period of the day [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and by activity [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        In addition to the traffic itself, the task of analyzing the reliability of a network structure
that provides information exchange is also important [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], as well as aspects related to the
computational reliability of operations performed [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem statement</title>
      <p>The problem of network traffic forecasting with time series methods is considered. A special
feature of the task is that the traffic of a single network user is considered. As part of this
work, the following tasks were performed:
• typical traffic sections (time series) characteristic of a particular network user behavior
model were formed;
• an overview of existing time series forecasting models was performed;
• a time series forecasting method was selected that gives acceptable
forecast results for various models of network user behavior.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Description of initial data</title>
        <p>To generate time series, raw sockets are used to capture and analyze frames sent or received
by the node in question. Time series are formed by measuring (at intervals of one second) the
amount of data arriving at the node under consideration over the TCP/IP protocol stack.</p>
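        <p>As an illustration of how captured frames become a time series, the following minimal Python sketch sums frame sizes into one-second bins. The record layout (arrival time in seconds, frame length in bytes), the function name bytes_per_second, and the sample values are assumptions made for this example; they are not taken from the capture tool used in the study.</p>
        <preformat>
```python
from collections import defaultdict


def bytes_per_second(frames):
    """Sum frame lengths into one-second bins.

    frames: iterable of (arrival_time_seconds, frame_length_bytes) pairs.
    Returns a list with one entry per second between the first and the
    last observed second; seconds without frames are reported as 0.
    """
    if not frames:
        return []
    totals = defaultdict(int)
    for timestamp, length in frames:
        totals[int(timestamp)] += length  # non-negative timestamps assumed
    first, last = min(totals), max(totals)
    return [totals.get(second, 0) for second in range(first, last + 1)]


# Hypothetical capture records: (arrival time in seconds, frame size in bytes)
frames = [(0.10, 60), (0.75, 1500), (1.20, 1500), (3.05, 40)]
series = bytes_per_second(frames)
print(series)  # [1560, 1500, 0, 40]
```
        </preformat>
        <p>Seconds in which no frame arrives are kept as zeros, so the resulting series stays equally spaced, which the forecasting models assume.</p>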
        <p>Information about frames is saved in a table (the format of such a table is shown in Fig.
1). This table is then converted to a time series.</p>
        <p>Typical traffic sections (time series) were formed for various network user behavior models.
The following models of network user behavior are considered (based on personal experience):
• clicking on links (time series R1);
• downloading files (time series R2);
• listening to music (time series R3);
• watching video (time series R4);
• the user's browser is running but not in use (time series R5).</p>
        <p>The results are shown in Fig. 2 and Fig. 3.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Overview of existing time series forecasting models</title>
        <p>
          This review is based on the following works [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
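          ]. The following designations are used:
<inline-formula><tex-math>y_t</tex-math></inline-formula> is the observed (actual) value of the series at time <inline-formula><tex-math>t</tex-math></inline-formula>, and <inline-formula><tex-math>\hat{y}_{t+h|t}</tex-math></inline-formula> is the predicted value of the time
series for time <inline-formula><tex-math>t+h</tex-math></inline-formula>.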
        </p>
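        <p>A summary of the time series forecasting models and their function parameters is presented in Table 1.</p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Forecasting models, their parameters, and the corresponding functions of the forecast library</p>
          </caption>
          <table>
            <thead>
              <tr><th>Model</th><th>Parameters</th><th>Function (forecast library)</th></tr>
            </thead>
            <tbody>
              <tr><td>Drift method</td><td></td><td>rwf(ts, drift=TRUE)</td></tr>
              <tr><td>Average method</td><td></td><td>meanf(ts)</td></tr>
              <tr><td>Naive method</td><td></td><td>naive(ts)</td></tr>
              <tr><td>Simple exponential smoothing</td><td>α</td><td>ses(ts)</td></tr>
              <tr><td>Holt's linear trend method</td><td>α, β</td><td>holt(ts)</td></tr>
              <tr><td>Holt's linear trend method, damped trend</td><td>α, β, φ</td><td>holt(ts, damped=TRUE)</td></tr>
              <tr><td>Holt-Winters' additive method</td><td>α, β, γ</td><td>hw(ts, seasonal="additive")</td></tr>
              <tr><td>Holt-Winters' multiplicative method</td><td>α, β, γ</td><td>hw(ts, seasonal="multiplicative")</td></tr>
              <tr><td>Holt-Winters' multiplicative method, damped</td><td>α, β, γ, φ</td><td>hw(ts, seasonal="multiplicative", damped=TRUE)</td></tr>
              <tr><td>STL with multiple seasonal periods</td><td></td><td>stlf(ts, method)</td></tr>
              <tr><td>Autoregression based on neural networks</td><td></td><td>nnetar(ts)</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>
          All calculations related to forecasting are performed in the R software environment. R is
a programming language for statistical data processing and graphics, as well as a free and open
source computing environment within the GNU project. The R language contains tools that
allow several parallel threads of calculation to run (by loading several
processor cores simultaneously), which reduces the time spent on modeling several times over [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. The series forecasting
models provided in the forecast package are considered.
        </p>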
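        <p>The simplest forecasting rules behind the functions meanf, naive, rwf with drift, and ses can be written out directly. The Python sketch below is illustrative only and is not the forecast package implementation: the function names are hypothetical, and the smoothing parameter alpha is fixed by hand with the level initialised to the first observation, whereas ses estimates these values from the data.</p>
        <preformat>
```python
def mean_forecast(y, h):
    """Average method: every future value equals the mean of the history."""
    return [sum(y) / len(y)] * h


def naive_forecast(y, h):
    """Naive method: every future value equals the last observation."""
    return [y[-1]] * h


def drift_forecast(y, h):
    """Drift method: extend the line drawn from the first to the last observation."""
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + step * slope for step in range(1, h + 1)]


def ses_forecast(y, h, alpha):
    """Simple exponential smoothing with a fixed alpha (assumed, not estimated)."""
    level = y[0]  # simple initialisation chosen for this sketch
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * h


# Hypothetical history of bytes per second and a two-step horizon
history = [10, 12, 14, 16]
print(mean_forecast(history, 2))      # [13.0, 13.0]
print(naive_forecast(history, 2))     # [16, 16]
print(drift_forecast(history, 2))     # [18.0, 20.0]
print(ses_forecast(history, 2, 0.5))  # [14.25, 14.25]
```
        </preformat>
        <p>All four rules produce flat or straight-line forecasts, which is why they mainly serve as baselines against which the trend, seasonal, and neural network models are compared.</p>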
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Estimation of forecast accuracy</title>
      <sec id="sec-4-1">
        <title>4.1. Evaluation of prediction accuracy</title>
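        <p>Forecast error refers to the difference between the observed value and its forecast:
<disp-formula><label>(1)</label><tex-math>e_{t+h} = y_{t+h} - \hat{y}_{t+h|t}</tex-math></disp-formula>
The root-mean-square error is used to estimate the prediction accuracy:
<disp-formula><label>(2)</label><tex-math>\mathrm{RMSE} = \sqrt{\operatorname{mean}\left(e_{t+h}^{2}\right)}</tex-math></disp-formula></p>
        <p>4.1.1. Schematic diagram of estimation of forecasting accuracy.</p>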
        <p>The time series under consideration is divided into a training part and a test part. The parameters of the
forecasting model are estimated on the training part. The Ljung-Box test is then applied to check
that the model residuals show no autocorrelation. If
the Ljung-Box test is not passed (the residuals of the model are strongly autocorrelated),
the model is rejected. If the test is passed, the accuracy of the forecast (RMSE value) is
determined on the test part of the series. The algorithm is presented in Fig. 4.</p>
        <p>4.1.2. Combination of forecasting models.</p>
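        <p>As an additional study, we consider the possibility of predicting time series with
combinations of several forecasting models. For a combination of forecasting models, the
total forecast error is calculated as
<disp-formula><label>(3)</label><tex-math>e_{t+h} = \frac{1}{n} \sum_{i=1}^{n} e^{(i)}_{t+h}</tex-math></disp-formula>
where <inline-formula><tex-math>n</tex-math></inline-formula> is the number of prediction models in the combination and <inline-formula><tex-math>e^{(i)}_{t+h}</tex-math></inline-formula> is the prediction error of the <inline-formula><tex-math>i</tex-math></inline-formula>-th
forecasting model.</p>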
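        <p>The quantities used in this section can be sketched in a few lines of Python. The sketch below computes the forecast error of equation (1), the RMSE of equation (2), the Ljung-Box statistic used to screen models, and the averaged error of a model combination. All function names and sample values are hypothetical; the p-value of the Ljung-Box test is not computed here, since it requires a chi-square distribution function from a statistics library, with the degrees of freedom reduced by the number of fitted model parameters.</p>
        <preformat>
```python
import math


def forecast_errors(actual, forecast):
    """Equation (1): observed value minus its forecast, step by step."""
    return [a - f for a, f in zip(actual, forecast)]


def rmse(errors):
    """Equation (2): square root of the mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n-k), k = 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    variance_sum = sum(c * c for c in centered)
    total = 0.0
    for lag in range(1, max_lag + 1):
        autocov = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        r_lag = autocov / variance_sum  # sample autocorrelation at this lag
        total += r_lag * r_lag / (n - lag)
    return n * (n + 2) * total


def combined_errors(errors_by_model):
    """Average the errors of several models at each forecast step."""
    count = len(errors_by_model)
    return [sum(step) / count for step in zip(*errors_by_model)]


# Hypothetical test part of a series and two hypothetical model forecasts
test_part = [10.0, 12.0, 14.0]
errors_a = forecast_errors(test_part, [11.0, 11.0, 15.0])
errors_b = forecast_errors(test_part, [9.0, 13.0, 11.0])
errors_ab = combined_errors([errors_a, errors_b])
print(rmse(errors_a), rmse(errors_b), rmse(errors_ab))

# Strongly alternating residuals give a large Q value
q_stat = ljung_box_q([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], max_lag=1)
print(q_stat)
```
        </preformat>
        <p>In this example the alternating residuals give Q of about 6.67, which exceeds 3.841, the 5% critical value of the chi-square distribution with one degree of freedom, so a model with such residuals would be rejected. The two hypothetical models err in opposite directions on the first two steps, so their combination has a smaller RMSE than either model alone.</p>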
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. The results of evaluation of prediction accuracy for different models</title>
      <p>As an example, the results of estimating the forecasting accuracy for traffic type R1 are given: the original series
and its partial autocorrelation function are shown in Fig. 5, and the residual plot and fitted
model are shown in Fig. 6.</p>
      <p>For a number of models, it was not possible to calculate the RMSE, MAPE, and p-value
indicators. This is because the sample size exceeded the parameters of the input file
for the foreach package procedure. The results of estimating the accuracy of forecasting time series
R1-R5 are shown in Table 2.</p>
      <p>Time diagrams of the forecasts for traffic type R1 are shown in Fig. 7 for models 1-6 and in
Fig. 8 for models 7-13.</p>
      <p>Testing the forecasting models with the Ljung-Box test gave the following results:
• the Average method, Simple exponential smoothing, Holt's linear trend method, Holt's linear
trend method with damped trend, Holt-Winters' additive method, Holt-Winters'
multiplicative method with damping, ARIMA, and Neural network models passed the test for
time series R1;
• Neural network models passed the test for time series R2;
• ARIMA and Neural network models passed the test for time series R3;
• ARIMA passed the test for time series R4;
• the Seasonal naive method, STL with multiple seasonal periods, and Neural network models
passed the test for time series R5.
The Neural network models showed the greatest flexibility,
passing the Ljung-Box test for the largest number of traffic types.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The article considers the problem of network traffic forecasting with time series methods. A
special feature of the task is that the traffic of a single network user is considered. The
following activities were used as traffic types: clicking on links (time series R1), downloading
files, for example movies (time series R2), and others.</p>
      <p>Typical, generally accepted models were used for forecasting, such as the drift model, the
Holt-Winters model, the ARIMA model, and others. The accuracy was estimated
for each of the considered models. All implementation was carried out in the R software environment
using the forecast package. Based on the results obtained, we can conclude that
the most acceptable model for predicting the network traffic of an individual user is a forecasting
model based on the decomposition of a series into separate components and their independent
forecasting. Further work will focus on the use of combinations of forecasting models for traffic
forecasting.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Ntlangu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baghai-Wadji</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Modelling Network Traffic Using Time Series Analysis: A Review</article-title>
          .
          <source>BDIOT2017</source>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3175684.3175725</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <collab>International Telecommunication Union (ITU)</collab>
          (
          <year>2014</year>
          ),
          <source>The World in 2014: ICT Facts and Figures</source>
          , Technical Report.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Gowrishankar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2008</year>
          ),
          <article-title>A Time Series Modelling and Prediction of Wireless Network Traffic</article-title>
          ,
          <source>Georgian Electronic Scientific Journal: Computer Science and Telecommunications</source>
          , Vol.
          <volume>2</volume>
          , No.
          <issue>16</issue>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Diaz-Aviles</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinelli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lynch</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nabi</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkoufas</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bouillet</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Calabrese</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2015</year>
          ),
          <article-title>Towards Real-time Customer Experience Prediction for Telecommunication Operators</article-title>
          ,
          <source>IEEE International Conference on Big Data</source>
          , pp.
          <fpage>1063</fpage>
          -
          <lpage>1072</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>KuanHoong</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>I.K.T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>YikKeong</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Bit Torrent Network Traffic Forecasting with ARMA</article-title>
          ,
          <source>International Journal of Computer Networks and Communications</source>
          , Vol.
          <volume>4</volume>
          , No.
          <issue>4</issue>
          , pp.
          <fpage>143</fpage>
          -
          <lpage>156</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Moussas</surname>
            ,
            <given-names>V. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daglis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kolega</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2005</year>
          ),
          <article-title>Network Traffic Modelling and Prediction Using Multiplicative Seasonal ARIMA Models</article-title>
          , 1st International Conference on Experiments/Process/System Modelling/Simulation/Optimisation, 1st IC-EpsMsO, 6-9 July, Athens, pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Santos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Network traffic characterization based on Time Series Analysis and Computational Intelligence</article-title>
          .
          <source>Journal of Computational Interdisciplinary Sciences</source>
          ,
          <volume>2</volume>
          . doi:
          <pub-id pub-id-type="doi">10.6062/jcis.2011.02.03.0046</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Oduro-Gyimah</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boateng</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Analysis and modelling of telecommunications network traffic: a time series approach</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Moshnikov</surname>
          </string-name>
          and
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Kolomoitcev</surname>
          </string-name>
          ,
          <article-title>Reliability Assessment of Distributed Control Systems with Network Structure</article-title>
          ,
          <source>2020 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF)</source>
          , Saint-Petersburg, Russia,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          , doi:
          <pub-id pub-id-type="doi">10.1109/WECONF48837.2020.9131490</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>V. A.</given-names>
            <surname>Bogatyrev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Bogatyrev</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Bogatyrev</surname>
          </string-name>
          ,
          <article-title>Reliability and Probability of Timely Servicing in a Cluster of Heterogeneous Flow of Query Functionality</article-title>
          ,
          <source>2020 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF)</source>
          , Saint-Petersburg, Russia,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          , doi:
          <pub-id pub-id-type="doi">10.1109/WECONF48837.2020.9131165</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Tsay</surname>
            ,
            <given-names>R. S.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <source>Analysis of Financial Time Series</source>
          . 2nd edn. John Wiley and Sons, New Jersey.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Shumway</surname>
            ,
            <given-names>R. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoffer</surname>
            ,
            <given-names>D. S.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <source>Time Series Analysis and Its Applications: With R Examples</source>
          . Third edition, Springer.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Crawley</surname>
            ,
            <given-names>M. J.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <source>The R Book</source>
          . 2nd ed. Wiley Publishing.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>