468


      Estimation of Buffer Capacity for Weibull Traffic*

    Oleg I. Kutuzov 1 [0000-0001-9318-6454] and Tatiana M. Tatarnikova 1 [0000-0002-6419-0072]
 1Saint Petersburg Electrotechnical University "LETI", ul. Professora Popova 5, 197376 St. Pe-

                                          tersburg, Russia

                                kutuzov-oleg@mail.ru
                                 tm-tatarn@yandex.ru


        Abstract. The solution to the problem of estimating the probability of exceeding
        a given value by a random value is discussed, in particular, estimating the small
        and very small probabilities of packet loss due to buffer overflows. It is shown
        that statistics of extreme values is an effective means of solving the problems of
        this class. With the appropriate choice of the structure of the extrapolation for-
        mula, which underlies the extreme values, it is possible to significantly accelerate
        the computational experiment. The use of the Extreme Value Theory method in
        combination with the Monte Carlo method for estimating small and very small
        values of the probabilities of random variables allows us to reduce the simulation
        time by 4-5 tens compared to the classical Monte Carlo method. Demonstrated
        the solution to the problem of calculating the buffer capacity for Weibull traffic
        based on the admissible probability 10-6 of its overflow.

        Keywords: Network Traffic, Fractal Traffic, Distribution with Heavy «Tails»,
        Weibull, Buffer Storage Volume, Probability of Loss, Extreme Value Theory,
        Simulation, Experiment Acceleration.


1       Introduction

A feature of the traffic of modern packet-switched info-communication networks is its
large-scale invariance (fractality). Fractal traffic is described by long-term dependen-
cies (LTD). The concept of a pulsating structure is often used in this context [2].
   An adequate description of fractal traffic is given by probability distributions with
heavy tails [1], [2]. One of these distributions used to describe the packet structure of
network traffic is the Weibull distribution [3], [4]
   The Weibull distribution, hereinafter W(c, b), is widely used in the technique and it
is distributions with heavy tails with the distribution function of y values:

                                  F ( y )  1  e  ( y ) , y  0.
                                                      c
                                                                                                 (1)


*   Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License
    Attribution 4.0 International (CC BY 4.0).
                                                                                                                  469


The distribution W(c, b) can be obtained as a function of the exponential distribution.
Let X be a random variable that has an exponential distribution Fx  X   1  ex . Then
the random variable Y has a Weibull distribution Y  X 1/ с . Really

                                                                                                     
      Fy (Y )  P(Y  y )  P  X 1/ с  y   P  X  y c   1  e y  1  e
                                                                                                          c
                                                                                c           y1/ c
                                                                                                              .   (2)

The resulting final expression is a function for the Weibull distribution.
   Let us compare the values of the tails W(c, b) and exponentially distributed random
variables with the same probability of their observation

                            
                                      e x   y1/ c   x  y c  x  y  x1/ c .
                                 c
                   y1/ c                             c
              e                                                                                                   (3)

    From the relation, y  x1/ c it follows that for values x> 1 and c <1 value y> x, that
is, the “tail” of the W(c, b) distribution is heavier than the “tail” of the exponential
distribution.
    “Heavy tail” means that in distributions with heavy tails, significant (large) values
of a random variable are more likely (more often) than the same values, for example,
in an exponential distribution.
    The fractal properties of traffic have a significant effect on the characteristics of the
network [1-5]. The solution to many problems of the theory and technology of infor-
mation networks requires an assessment of such characteristics as the probability of a
random value exceeding a given value. These may be tasks related to the estimation of
small and very small probabilities of packet loss due to buffer overflow. For such prob-
lems, a relatively small part of the random variable values given by the tail of the dis-
tribution function becomes decisive [3], [ 4].
    An effective means of solving problems of this class is the statistics of extreme val-
ues [6]. This is an extrapolation method based on knowledge of the asymptotic behavior
of the probability distribution. The "tail" of the probability distribution of values ran-
dom variable is approximated by a dependence, the parameters of which are estimated
from the results of direct modeling by the Monte Carlo method [7]. With the proper
choice of the structure of the extrapolation formula, one can significantly accelerate the
computational experiment [8].
    The purpose of the paper is to present a solution to the problem of calculating the
buffer capacity for "heavy" traffic, based on the admissible probability Pover≤10-6 of its
overflow.


2      Node Model and Problem Statement

We consider the network node as a queuing system (QS) of class G|G|1|m. To describe
the recurrent flow of applications and/or service time, we use the probability distribu-
tion with a heavy tail - the Weibull distribution. With the ON-OFF model, the Pareto
distribution is more often considered. However, the Weibull distribution with the form
470


parameter c<1 has all finite moments. This makes the distribution W(c, b) suitable for
modeling the convergence of tail parts in a heavy multiplexing environment [3].
   We set the task of calculating the required buffer capacity as the problem of finding
the dependence

                                           P  (m)                                               (4)

the probability P of failure (loss of claim) on the size of the buffer m (m=0, 1, 2, ...).
This will allow for any predetermined admissible 1 loss probability P* to determine the
corresponding smallest admissible buffer size m* by the formula following from (4):

                                        m*  –1 ( P * ),                                         (5)

where -1 is the inverse of the function . Assuming that the function  is monoton-
ically decreasing2, we can state that the inverse function -1 exists3. Thus, having solved
the main problem, we can, for any given quality level (that is, for any maximum admis-
sible probability of losses P*), by the formula (5) determine the smallest buffer size m*
that provides it.
   In solving this problem, we will rely on the method of extreme values (Extreme
Value Theory, EVT) [6].


3       Extreme Value Method

The theory of extreme values is based on the postulate that for a large number of initial
distributions F(x), the distribution of maxima Fn(x) taken from samples of size n tends
to an asymptote of the form

                              lim F n (bn  an )  exp(  exp(  x)),                             (6)

where an and bn are parameters estimated from independent sample values of a contin-
uous random variable X.
   A class of distributions converging to the asymptote (6) is called exponential.
Weibull distribution belongs to this class.
   The EVT method was developed to analyze continuous random variables.
   The classical expression obtained using EVT for approximating the "tails" of con-
tinuous distributions belonging to the exponential class has the form [6]


1   In practice, it is required that the probability of loss of the application is very small (for ex-
    ample, the admissible probability of losses can be 10 –6–10–9).
2   We believe that the proportion of P requests lost due to lack of queue space cannot increase
    with increasing buffer size m.
3   The function  is defined in formula (4) on the discrete set {m} = {0, 1, ...} of values of the
    argument m. Therefore, the inverse function -1, strictly speaking, is defined on a discrete set
    – the set {P} = { (0),  (1), ...}.
                                                                                                       471


                                                     1        x  bn 
                                          Q( x)         exp         ,                                (7)
                                                    Eτ n      an 

where Eτ is the average length of the sample sequences of values of the random variable
X, generated serially;
n is a fixed length of a series of sample sequences;
an and bn are parameters calculated from the results of the general sample.
   The accuracy of the approximation can be estimated by comparing the results ob-
tained by the EVT method with the results calculated from the corresponding theoreti-
cal distributions
   To this end, a computer model was constructed to simulate Weibull and exponential
distributions. The following results were obtained from the experiment with the model
sampling.
   In the Table. 1 shows the results of modeling the “tail” of the Weibull distribution
e ( x / b ) with parameters c=0,5; b=1.
            c


                                   Table 1. Weibull distribution tail results
 x                  200       400          600         800        1000      1200       1600      2000
  e   ( x / b )c   8,4910-4 4,510-5     4,810-6    7,210-7   1,3610-7 3,010-8   2,0610-9 1,910-10
 Simu-              9,110-4   5,510-5    6,110-6    9,210-7   1,710-7   3,6410-8 2,2710-9 1,910-10
 lation

                                                                                                    ( x / m)
Table 2 presents the results of modeling the “tail” of the exponential distribution e
with parameter m=3,37.

                                 Table 2. Exponential distribution tail results
 x                  25       35      45      50                   55        60      65     70
  e( x / m)        6,110 3,0810 1,5810 3,610-7
                          -4      -5      -6                      8,1610 1,8910 4,210 9,5210-10
                                                                         -8      -8     -9

 Simu-              5,610-4 2,810-5      1,4410-6 3,2410-7 7,310-8      1,6410-8 3,710-9 8,3310-10
 lation

The results of an experimental study of the use of the EVT method for estimating the
tail of known theoretical distributions show a rather high approximation accuracy – for
the same tail values the probabilities are of the same order of smallness.
    Therefore, it can be argued that the function constructed by the EVT method and
approximating the tail of the distribution is equivalent to the theoretical function.
    Also, a comparison of the values of the Weibull “tails” and the exponential distribu-
tion for probabilities of the same order of smallness shows how much the Weibull “tail”
is heavier than the exponential one.
472


4      The Model of the Process of Losses in the Buffer of the
       Limit Capacity

The process of receipt and servicing of requirements in the QS with a limited queue
(hereinafter referred to as the limited buffer – LB ) can be considered as a regenerative
process [9] of occupation and discharge of the system.
    During the busy interval, requirements arrive and leave the system. The regeneration
point in the model coincides with the moment the request arrives, which catches the
service node unoccupied [9]. A separate busy interval ends when the entire buffer is
freed. The number of escalation processes and queue reduction processes occurring in
the same occupation period is the same.
    From the analysis of the QS model with an unlimited queue (hereinafter referred to
as the unlimited buffer - UB) [7], it follows that the number of requirements lost in the
LB due to the overflow of its capacity is equal to the difference between the maximum
value of the queue formed in the UB during the busy period and the value of the LB
capacity.
    Due to the stationary nature of the QS, the analysis of events in one employment
interval is equivalent for all intervals.
    The maximum value of the queue Li in the UB achieved during the escalation process
at the i-th employment interval does not depend on the number of escalation and reduc-
tion processes, nor the number of requests served in this interval and does not depend
on the length of the employment gap.
    The value of Li depends only on the time intervals between arrivals and service in-
tervals. Since these times are random variables, the value Li for  i is also a random
variable. When observing an infinite number of employment intervals, it can be argued
that the maximum value of the queue L has a probability distribution function of its
values.
    This allows us to finally recognize that the probability of losing requests in an LB
with a capacity of N (probability of buffer overflow) is a function of the tail of the
distribution of the values of the maximum value L, starting with the value L=N+1.
    Therefore, if the distribution function for L=N is F(x), then the probability of losses is

                          P loss( N )  f Q( N  1)  f 1  F ( N ).                  (8)

that is, it is a distribution function of the tail of the probabilities of the values of the
maximum queue length in the UB.
Experimental estimates of the required capacity of the limit buffer
   The queue in the buffer is a discrete random variable. To describe the probability
function of the values of the “tails” of discrete random values, for expression (4) is
obtained extension in the form [10]

                                         an       x  bn  1
                           Q  x  R       exp(           ),                            (9)
                                         nK           an

were an and bn are the coefficients calculated from the sample data;
                                                                                           473


 K – the average number of applications received at the entrance of the QS for the
regeneration period;
n is the number of values in the sequences into which the sampling sequence is com-
posed of the maximum values of the queue at the regeneration intervals;
 R  exp  x an  – the constant R depends on the value of the interval 2x=hi-hi-1 between
the values hi and hi-1 of the discrete values x.
   Let Ploss_ad be the admissible value of the probability of losses due to the excess of
the buffer capacity N.
   We set the value of Ploss_ad and solve (9) concerning the variable x=N. We obtain an
expression for estimating the value of the buffer capacity N, which provides the value
of the probability of losses Ploss (N) ≤ Ploss_ad (N), in the form

                                             an R 
                             N   an ln                 bn  1                      (10)
                                             nKPloss_ad 

where  x  is the nearest integer not less than x.
    To use expression (10) to determine the required value of the capacitance of the LB, it
is necessary to calculate the coefficients, an and bn from the sample data K , ab, and bn.
    For this purpose, a simulation model of a single-phase QS with LB was realized. The
simulation model reproduced the periods of regeneration of receipt of applications in
the buffer and their servicing. The maximum values of the queues and their processing
were taken into account.
    The model corresponded to four types of QS: M|W|1, W|W|1, W|M|1, and M|M|1
with load factors = 0,5; 0,7; 0,9 for each type of QS. The simulation was carried out
for the given values of the probability of losses Ploss_ad: 10-4, 10-6, 10-8, 10-10.
    Tables 1-4 show the simulation results for this source data and PM is the value of the
probability of losses obtained from the Monte Carlo simulation with the found values
of the buffer capacity N.
    The corresponding values of Ploss_ad and PM presented in the tables are of the same
order of smallness, which confirms the sufficient accuracy of the calculated results by
expression (10).
    Also, a comparative analysis of the calculated data in Tables 1-3 with the data in
Table 4 confirms the fact that the packet traffic, in which receiving of requests are
described as distribution with heavy tails, requires significantly more significant
amounts of the buffer memory in network nodes.

                           Table 3. Simulation results for QS M|W|1
                                 N                                         PM
   Ploss_ad
                 =0,5         =0,7           =0,9           =0,5      =0,7       =0,9
    10-4          42            66              180           6,710-5   1,710-4   1,910-4
    10-6          69            113             300              -       3,010-6        -
    10-8          94            154             430              -          -            -
    10-10         122           203             576              -          -            -
474


                        Table 4. Simulation results for QS W|M|1
                              N                                       PM
   Ploss_ad
                =0,5       =0,7        =0,9         =0,5         =0,7      =0,9
      10-4       53          81           198         1,410-4      1,710-4   1,210-4
      10-6       81          128          343         6,010-6      2,610-6   4,610-6
      10-8       121         179          482            -             -          -
      10-10      150         227          603            -             -          -

                        Table 5. Simulation results for QS W|W|1
                              N                                       PM
   Ploss_ad
                =0,5       =0,7        =0,9         =0,5         =0,7      =0,9
      10-4       83          128          311         1,310-4      1,610-4   2,810-4
      10-6       128         217          546            -          1,310-6   5,910-6
      10-8       184         279          780            -             -          -
      10-10      242         366         1026            -             -          -

                         Table 6. Simulation results for QS M|M|1
                              N                                       PM
   Ploss_ad
                =0,5       =0,7        =0,9         =0,5         =0,7      =0,9
      10-4       12          21           60          5,910-5      1,110-4   1,310-4
      10-6       18          32           103         8,010-7      2,710-6   1,710-6
      10-8       23          49           147         3,010-7         -          -
      10-10      32          61           187            -             -          -

   The use of the analytical-statistical method and expression (10) in the case of multi-
server traffic allows us to calculate the values of the LB capacity based on the admissi-
ble value of the probability of losses Ploss_ad <<10-9.


         Conclusion

Fractal models are difficult to analytically interpret. The presented method for calcu-
lating the buffer capacity under the conditions of “heavy” traffic described by the
Weibull distribution is relatively simple and efficient. The use of the EVT method in
combination with the Monte Carlo method for estimating small and very small values
of the probabilities of random variables allows us to reduce the simulation time by 4–5
tens in comparison with the classical Monte Carlo method [8].
   The EVT method was developed to approximate extreme statistics of distributions
of an exponential class [6].
   However, the case of small values of the shape parameter (c<1) is of particular in-
terest since Weibull distributions with such parameters occupy an intermediate place
between distributions with an exponential decrease in tails (exponential distribution,
                                                                                              475


gamma distribution) and “heavy-tailed” distributions with power-law descending tails
of the Zipf-Pareto type [11].
   In the same work [11], the possibility of expanding the convergence of the distribu-
tions of statistics of a fairly general form, based on samples of random volume, to the
Weibull distribution was shown.
   In our opinion, this is a significant point, since it expands the Weibull distribution
capabilities to describe real traffic.


References
 1. Shelukhin O. I. Multifractals. Infocommunication applications. Moscow: Hot line - Tele-
    com, (2011).
 2. Stolings W. Modern computer networks. 2nd ed. – St. Petersburg: Peter, (2003).
 3. Arfeen M.A., Pawlikowski K., McNickle N., Willig A. The role of the Weibull Distribution
    in Internet traffic modeling. In 25th International Teletraffic Congress (ITC). – Shanghai,
    2013. pp. 1–8. DOI: 10. 1109/ITC.2013.6662948
 4. Ramirez-Cobo R., Lillo R.E., Wiper M. P. Bayesian analysis of a gueeuing system with a
    long-tailed arrival process. Communication in Statistics, vol. 37, 697–712 (2008).
 5. Kutuzov O. I., Tatarnikova T. M. Model of a self-similar traffic generator and evaluation
    of buffer storage for classical and fractal queuing system// Moscow Workshop on Electronic
    and Networking Technologies, MWENT 2018 - Proceedings 1, pp. 1-3, 2018.
 6. Galambos Janos. Asymptotic theory of extreme order statistics. Moscow: Nauka, (1984).
 7. Kutuzov O. I., Haddad M. Analytical and statistical method for calculating the capacity of
    the drive / Equipment in ecology and human activity. Materials of the International Confer-
    ence "IENS - 2002". St. Petersburg, 9 p. (2002).
 8. Francesco Bernabei, Roberto Ferretti, Marco Listanti, Giuseppe Zingrillo. A methodology
    for buffer design in ATM switches // J.European transactions on telecommunications and
    related technologia vol. 2, no. 4, 367-379 (1991).
 9. Crane M., Lemoine O. Introduction to the regenerative method of model analysis. - M.:
    Nauka, 1982.
10. Haddat Mufid. Development of a method for calculating the characteristics of the exchange
    of information in wide-band networks for integrated maintenance of automated control sys-
    tems. The dissertation for the degree of candidate of technical sciences. St. Petersburg State
    Electrotechnical Institute (LETI). St. Petersburg, 1994.
11. Korolev V.Yu., Sokolov I.A. On the conditions for convergence of distributions of extreme
    order statistics to the Weibull distribution. Informatics and its application, no3(8), 3-11
    (2014).