=Paper= {{Paper |id=Vol-3187/short4 |storemode=property |title=Exposing Deviations in Information Processes using Multifractal Analysis (short paper) |pdfUrl=https://ceur-ws.org/Vol-3187/short4.pdf |volume=Vol-3187 |authors=Yevhen Ivanichenko,Valerii Kozachok,Yurii Dreis,Olena Nesterova,Kate Dmytriienko |dblpUrl=https://dblp.org/rec/conf/cpits/IvanichenkoKDND21 }} ==Exposing Deviations in Information Processes using Multifractal Analysis (short paper)== https://ceur-ws.org/Vol-3187/short4.pdf
Exposing Deviations in Information Processes
using Multifractal Analysis
Yevhen Ivanichenko1, Valerii Kozachok1, Yurii Dreis2, Olena Nesterova1,3,
and Kate Dmytriienko1
1
  Borys Grinchenko Kyiv University, 18/2 Bulvarno-Kudriavska str., Kyiv, 04053, Ukraine
2
  Polissia National University, 7 Staryi ave., Zhytomyr, 10008, Ukraine
3
  National Pedagogical Dragomanov University, 9 Pyrohova str., Kyiv, 01601, Ukraine

               Abstract
               The main requirement for modern systems of intrusion detection is the possibility of identifying
               deviations in information processes in order to detect unknown attack types. An overview of
               existing approaches to identifying network deviations based on multifractal analysis methods
               is given. The results of the calculation of the Hurst exponent for the time series of CPU usage
               for different types of user activity are presented.

               Keywords1
               Fractal analysis, the Hurst exponent, network deviation detection.

1. Introduction
   Signature methods of analysis used in modern intrusion detection systems aimed at identifying
known and more specific methods of attacks, appear not to be able to detect their modifications or new
types, which makes the use of such systems ineffective. Existing solutions to individual cases of
detection of network deviations to this time do not allow to develop a single universal mechanism for
detecting previously unknown attack types.
   The current task at the moment is to find more effective universal methods for detecting network
deviations that are a consequence of technical failures or unauthorized impacts. The main requirement
for these methods is the possibility of identifying arbitrary types of intruders, including distributed in
time. Statistical studies of network traffic indicate that it has the properties of fractality or self-
similarity, as well as the variability of these characteristics in the event of deviations in the network,
which allows the use of fractal analysis to detect attacks [1, 2].
   The purpose of this study is an overview of modern existing approaches to identifying network
deviations based on the method of fractal analysis.

2. Methods of Detecting Attacks
   Attacks are deliberate actions of the offender, leading to violation of confidentiality, integrity or
accessibility of the system.
   Methods of detection of attacks are divided into:
   1. Signature.
   2. Behavioral.
   Signature methods are intended to detect known and clearly described attacks and founded on a
reference verification of symbol sequences and events with the database of attack signatures.
Advantages of signature methods include low demands for computing power and probability of
detection of attacks. Disadvantages of signature methods are the impossibility of identifying new types

CPITS-II-2021: Cybersecurity Providing in Information and Telecommunication Systems, October 26, 2021, Kyiv, Ukraine
EMAIL: y.ivanichenko@kubg.edu.ua (Y. Ivanichenko); v.kozachok@kubg.edu.ua (V. Kozachok); dreisyuri@gmail.com (Y. Dreis);
o.d.nesterova@npu.edu.ua (O. Nesterova); k.dmytriienko.asp@kubg.edu.ua (K. Dmytriienko)
ORCID: 0000-0002-6408-443X (Y. Ivanichenko); 0000-0003-0072-2567 (V. Kozachok); 0000-0003-2699-1597 (Y. Dreis); 0000-0002-
0402-0370 (O. Nesterova); 0000-0001-7984-7279 (K. Dmytriienko)
            ©️ 2022 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)



                                                                              251
of attacks and modifications that exist without a clear formalization of keywords of network traffic and
updating the signature database.
   Behavioral methods are intended to identify unknown attacks and are based on detection of
deviations from normal operation mode. Advantages of behavioral methods comprise the possibility of
analyzing the dynamics of processes and the possibility of identifying new types of attacks.
Disadvantages of behavioral methods include higher requirements for computing resources and
capacities and lower probability of detection.

3. Fractal Analysis
   The time series is a sequence of values of the studied magnitude measured at regular intervals.
   The central concepts of fractal analysis are fractal dimension (D) and the Hurst exponent (H).
   The fractal dimension of the set (according to Hausdorff) is determined by:
                                                 lg⁡[N(ε)]
                                      D = − lim            ,                                           (1)
                                              ε→∞ lg⁡(ε)
where N(ε) – the minimum number of non-empty cells ε that cover a given set.
    The Hurst exponent characterizes the degree of similarity of the process:
    1. 0  0.5 – a trend-resistant process that has a long memory and is self-similar.
    The fractal dimension is directly related to the Hurst exponent: D = 2 – H.
    This ratio is fair when the structure of the curve that describes the fractal function is investigated
with high resolution, that is, in the local limits. One of the popular methods of finding fractal dimension
is R / S analysis [2]:
                                           R(n)
                                       M [ S(n) ] ~cnH , n → ∞                                         (2)
where n – high resolution; с – a positive finite constant that does not depend on n; Н – the Hurst
exponent; R(n) – the scope of the time series.
                                      R(n) = max ∆j − ⁡ min ∆j                                 (3)
                                                1≤j≤n       1≤j≤n
                                       ∆j = ∑i=n xi − kx̅ , k = ⃗⃗⃗⃗⃗⃗
                                               n
                                                                 1, n                                  (4)
                                            1 n
                                       x̅ = n ∑i=1 xi                                                  (5)
                                                 1
                                       S(n) = n−1 ∑ni=1(xi − x̅)2                                      (6)

4. Review of Existing Methods and Approaches
    In [1], a method of maximum modules of wavelet transform (MMWT) is used to detect traffic
deviations, which allows us to detect the singularity of the signal. The network traffic collected on the
boundary router of the university network was taken as the analyzed data. Each sequence is about 24
hours long with a sampling step of 1 second. Samples of "pure" traffic without attacks, and also with
various deviations are presented: at DDoS-attacks of different types of scanning. The algorithm for
estimating the parameters of the multifractal spectrum is as follows:
    The output signal f (t) is decomposed by means of a wavelet transform by the mother wavelet Ψ (t)
into the corresponding coefficients:
                                                                          j
                                                                              t−u
                                      Wf (u, i) = (f(t), Ψu,s (t)) = 2−2 ∫ 2j dt                       (7)
   The partition function is calculated:
                                                               q
                                      S(q, j) = ∑p|Wf (up , j|                                         (8)
   For each value of q∈R it is necessary to calculate the scale indicator:
                                                               lnS(q,j)
                                      τ(q, j) = log j→0 inf ln2j                                       (9)
   Then the multifractal spectrum fl (a), using the Legendre transformation is calculated:




                                                   252
                                                                   1
                                      fl (a) = min [q (a + 2) − τ(q)]                                    (10)
                                                 q∈R
   For each octave j, the multifractal dimension of the order q is calculated:
                                             1
                                     Dq,j = q−1 [q(a(q, j) − f(a(q), j))]                                (11)
   When q <0, the value of S (q, j) depends on small maxima of the amplitude | W_f (u_p, j) |, as a
result, the calculations may not be stable.
   In order to avoid the emergence of false maxima of modules created by computing errors in areas
where f is almost constant, wavelet-maxima are combined in a chain to form a maximum curve
depending on the scale.
                                         −t  2
                 P (p)               1
    If Ψ = (−1) θ        , where θ = 2π e 2    – Gaussian function, then all lines of maxima up (j) are
                                    √
determined by curves that are limited by j = 0. Therefore, all maximum lines that do not extend to the
smallest scale are deleted when calculating S (q, j).
    Formalizing the difference in traffic spectra with certain deviations and without them, it is possible to
compare the fractal dimensions D1, correlation dimensions D2 and intervals characterizing the “width”
of the Lezhandr spectrum for each of the implementations for each octave of decomposition of j.
    Information dimensions of the comparative implementations D1 are distinguished by a small stable
value and practically do not depend on the number of levels of sampling. This allows us to conclude
that the presence of long-term attacks in the signal and non-predicted activity changes the self-similar
nature of traffic, and this property can be used in the future to detect attacks.
    In [3] it is proposed to determine the deviations, based on their identity and distribution of "heavy
tails". Network deviations may occur as a result of overloads, errors by network devices, DDOS attacks,
attempts of unauthorized access. To reduce the impact of the periodicity of network traffic on the
estimation of the Hurst exponent, the time series is divided into 24 sets of values.
    For each set, a histogram is built for 24 equal intervals. For each group there are a packet number
and the average package length for the same time intervals. At the next stage, the Hurst exponent is
calculated by the method of periodogram that use the slope of the power spectrum. The Hurst exponent
is calculated from the ratio:
                                      β − 1 = 1 − 2Н,                                                     (12)
where β -the slope of the line on a logarithmic scale.
    In practice, you must first analyze traffic in the regular network operation mode during the day.
When the deviation detection mode is turned on, at first the Hurst exponent is calculated and compared
with the corresponding reference value calculated in normal mode, for each parameter separately.
    In the paper [4], an algorithm for detecting deviations based on a discrete stationary wavelet
transform and fractal dimension is used. As the first step, a time series filtering with a discrete stationary
wavelet transform is performed. This preliminary processing is necessary to increase the accuracy of
the proposed method: the main components are allocated, the details are filtered.
    The main advantage of the discrete stationary wavelet transform over a classic one is the preservation
of the time information of the output signal at each level. In the second step, the time series is bypassed
by two adjacent windows R and S. For each window, a fractal dimension is calculated according to the
algorithm:
                                                 L
                                              lg⁡( )
                                                 a
                                      FD =       d                                                       (13)
                                              lg( )
                                                 a
where L – the length of the time series, d – the distance between the first point of the series and the
farthest from it, a - the average distance between two adjacent points of the series.
    Changes in the statistical parameters of the signal are reflected in the fractal dimension, to account
for which the following function is introduced:
                                       Gk = |FDk+1 − FDk |, k = 1, … , n                              (14)
where n = number of points G.
    The third step is the search for local maxima G that exceed a given threshold, which are considered
as deviations from normal behavior. The accuracy of the method is significantly affected by the length
of the window. For the analyzed window of length l, the energy of the function⁡Gl ⁡⁡⁡⁡ is calculated as:
                                                         2
                                             ∑ k|Gl |
                                                     k
                                      Gl =       N
                                                                                                         (15)


                                                             253
   The window length is calculated as the minimum of the standard energy function EGl .
   In [5], a method for detecting DDOS attacks is offered based on the estimation of the Hurst exponent
using the Fourier fractional transformation, which makes the transition to the frequency-time area.
   For the signal x (t) the fractional Fourier transform is determined as:
                                                           ∞
                                       X a (u) = Fa (u) = ∫−∞ x(t)K a (t, u)dt                      (16)
             K a (t, u) = ⁡ √1 − i ∙ cot(α) ∙ exp⁡[iπ(t 2 cot⁡(α) − 2ut ∙ csc(α) + u2 cot⁡(α))], α ≠ nπ (17)
                                         K a (t, u) = δ(t − u), a = (2 ∓ 1)π                            (18)
                                                      aπ
                                         n ∈ Z, α = 2                                                   (19)
where a – the order of fractional Fourier transformation, provided that a=1, then the formula changes
to the usual Fourier transformation.
    Using a discrete wavelet transform and a multi-scale method of analysis, we can calculate the Hurst
exponent if we analyze the expression:
                                 G(j) ↔ (2H + 1)j + constant, where j – scale.
    Next, the optimal selection of the range of scale intervals is made, using the method of one-
dimensional weighted estimation of least squares [8].
    Experimental verification of the proposed method showed its high accuracy, which reduced the
number of false positives and omissions during the detection of the attack.
    It was stated that network traffic is divided into several disjoint segments. The Hurst exponent for
each segment is estimated. When the threshold values are exceeded, the traffic loses the property of
self-similarity, which is regarded as a DDoS attack. But the intensity of the DDoS attack can change,
which leads to a change in the Hurst exponent, so detection methods based on a fixed threshold require
flexibility and adaptability.
    This article proposes a method consisting of two stages:
    1. Statistical analysis of the time series of network traffic using discrete wavelet transform and the
Schwartz information criterion to find the change point of the Hurst exponent, which signals the start
of a DDoS attack.
    2. Adaptive regulation of the intensity of a DDoS-attack on the basis of fuzzy logic, by analyzing
the Hurst exponent and the rate of its change. The Schwartz information criterion is based on the
maximum likelihood function for the model and can be used to detect the presence of a threshold point
by comparing the probability of a null hypothesis (no point) and an alternative (point of presence).
    The Hurst exponent is estimated using a discrete wavelet transform, because in practice this method
is one of the most reliable, as it is more resistant to gentle polynomial trends and noise.

5. The Evaluation is Performed According to the Following Algorithm
    For the time series of network traffic X in real time, the wavelet coefficients d (j, k) are calculated
for each scale j and position k. Next, it is necessary to perform a detailed assessment of the dispersion
at each scale j:
                                              nj
                                       Sj = ∑k=1  d2 (j, k)                                            (20)
where nj ⁡– the number of wavelet coefficients that are available in scale j
    We assume that a new sample of traffic is received, then the amount will be updated as follows:
                                       nj ≠ nj + 1                                                     (21)
                                       Sj = Sj + d2 (j, nj )                                           (22)
    Estimation of variance in scale j:
                                           S
                                      εj = nj                                                          (23)
                                               j
   Next, the dependence of log 2 (εj ) on scaling j is constructed and a weighted linear regression is
performed for the linear section, α is calculated.You do not need to build this dependency every time
you receive a new segment of traffic, this action is performed only when necessary.
   Then the Hurst exponent is calculated
                                          α+1
                                     H= 2                                                         (24)




                                                    254
6. The Principle of Detecting Attacks
    Let X be a time series of normal traffic, Y is a time series of traffic with deviations, Z - a time series
of deviations, i.e. the relation Y = X + Z holds. Based on the theorems, we can conclude that regardless
of the presence of self-similarity in Z, if X is a stationary self-similar process of the second order, then
Y will still be a self-similar process. But the degree of self-similarity may change.
    Let ry , rz be autocorrelation functions X, Y, Z, respectively. Then a ‖ry − rx ‖, is of interest during
the attack, and ry = rx + rz . For each value of H ∈ (0.5,1] there is only one autocorrelation function
on self-similarity. Thus, we examine ‖Hy − Hx ‖, where Hx ⁡and Hx is an average values of the Hurst
exponents Y and X, respectively.
    The disadvantage of this method is that the wavelet transform coefficients and statistics based on
the Schwartz information criterion are updated at the moment when new traffic values arrive, and the
detection of the traffic self-similarity threshold will be restarted for each scale. Thus, a signal of the
change in the point of self-similarity will be given, even if this change occurred on a different scale at
the same moment.
    After an attack is detected, close to the detection time the traffic is divided into parts. By analyzing
the Hurst exponent and the speed of its change (the difference between the Hurst exponents of traffic
parts before and after the moment of detection), we can determine the intensity of DDoS-attack, using
the rules of fuzzy logic.
    Determining the point of change of traffic self-similarity by the Schwartz information criterion
which is based on the assumption that the entropy of a sequence with a variable self-similarity boundary
point is greater than the entropy of the sequence in which this point is fixed. Suppose there is a sequence
of length M. It is assumed that there is only one point of the self-similarity boundary at position 1