<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>ORCID:</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>for Decentralized Distributed Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmytro Ageyev</string-name>
          <email>dmytro.aheiev@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tamara Radivilova</string-name>
          <email>tamara.radivilova@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio and Electronics</institution>
          ,
          <addr-line>14 Nauka ave., Kharkiv, 61166</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Internet traffic monitoring is a crucial task for the security and reliability of communication networks. A description of traffic statistics is used to detect traffic anomalies. Modern methods of detecting attacks and other traffic anomalies in the network are not reliable enough, in particular because the moment of attack is determined inaccurately, so an attacker can easily disrupt the operation of the system and incapacitate it with DDoS attacks. To solve the problem of searching for network anomalies, a method is proposed for forming a set of informative features that formalize the normal and anomalous behavior of the system, and criteria are defined that make it possible to detect and identify various types of network anomalies. The paper discusses methods for detecting anomalies that are based on statistical approaches such as fractal traffic analysis. It also analyzes the detection of network attacks that have similar statistical features, expressed as changes in mean and variance.</p>
        <p>Keywords: traffic abnormalities, statistical analysis, decentralized distributed networks</p>
        <p>Cybersecurity Providing in Information and Telecommunication Systems, January 28, 2021, Kyiv, Ukraine</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the network security area, an intrusion is defined as a set of malicious actions against the integrity,
confidentiality, and availability of information in a system or network that make it vulnerable to future
attacks.</p>
      <p>In modern conditions, one of the main processes in managing an infocommunication network
is managing its information security. This process must take place both at the design stage of the
network and during its operation, and it consists of a continuous analysis of the network. The
effectiveness of information security management is significantly affected by changes in
the structure, topology, and operating modes of the network. Such changes may be due to the addition of new
devices, changes to the settings of existing mechanisms, hardware failures, or incorrect actions of
network administrators, users, etc. When analyzing the state of the network, the main evaluated
parameters are the probability of attack and the degree (probability) of vulnerability of network elements
to information attacks. To counter threats, Intrusion Detection Systems (IDS) are deployed to detect
and identify intrusion attempts into a system or network. Using a set of hardware and software
resources, an IDS attempts to detect intrusions by tracking data collected from a single host or network, and
generates an alarm when an intrusion is detected. IDS can be divided into different categories
depending on the source of information and detection techniques.</p>
      <p>The rapid development of computer networks and information technology raises a number of
problems related to the security of network resources, which require new approaches. Building
intrusion detection systems is currently an active topic in the field of information technology.
There are many papers on detecting and classifying attacks using a variety of methods,
which include traditional approaches based on matching signature templates and adaptive
models using data mining techniques.</p>
      <p>2021 Copyright for this paper by its authors.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Traffic Abnormality Description</title>
      <p>Methods of detecting attacks and preventing intrusions into information systems and
infocommunication networks are one of the actively developing areas in the field of
information security.</p>
      <p>A number of specialized algorithms and tools are used to detect known and unknown
attacks: behavioral and signature methods, as well as methods for detecting abnormal activity, which are
particularly effective against insider attacks and "zero day" attacks.</p>
      <p>The following are used as the main classification features:
1. Source type;
2. Place of origin;
3. Method of manifestation;
4. The cause;
5. The nature of change.</p>
      <p>To solve the problem of detecting network attacks, the most important features will be such as the
source, the nature of traffic changes and the area of manifestation. The classification of network
abnormalities by causes and nature of traffic changes is given in Table 1.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <title>Classification of network abnormalities by causes and nature of traffic changes</title>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Cause</th>
              <th>Characteristics of traffic changes</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Unusually high point-to-point traffic</td>
              <td>Spike in traffic (bytes/s, packets/s) on one dominant source-destination flow. Short duration (up to 10 minutes).</td>
            </tr>
            <tr>
              <td>Distributed denial-of-service attack on a single victim</td>
              <td>Spike in traffic (packets/s, flows/s) from multiple sources to a single destination address.</td>
            </tr>
            <tr>
              <td>Unusually high demand for one network resource or service</td>
              <td>Jump in traffic (flows/s) to one dominant IP address and a dominant port. Usually a short-term anomaly.</td>
            </tr>
            <tr>
              <td>Scanning the network for specific open ports, or scanning a single host on all ports to look for vulnerabilities</td>
              <td>Jump in traffic (flows/s), with several packets in flows from one dominant IP address.</td>
            </tr>
            <tr>
              <td>A malicious program that spreads itself over a network and exploits OS vulnerabilities</td>
              <td>Spike in traffic without a dominant destination address, but always with one or more dominant destination ports.</td>
            </tr>
            <tr>
              <td>Distribution of content from one server to many users</td>
              <td>Spike in packets and bytes from the dominant source to several destinations, all to one well-known port.</td>
            </tr>
            <tr>
              <td>Network problems that cause a drop in traffic between one source-destination pair</td>
              <td>Drop in traffic (packets, flows and bytes), usually to zero. Can be long-term and include all source-destination flows from or to a single router.</td>
            </tr>
            <tr>
              <td>Unusual switching of traffic flows from one inbound router to another</td>
              <td>Drop in bytes or packets in one traffic flow and a spike in another. May affect multiple traffic flows.</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-3">
      <title>3. Intrusion Detection Systems for Decentralized Distributed Networks</title>
      <p>Modern networks are characterized by large volumes and high speeds of information transfer. This has
made IDSs that collect and analyze traffic packet by packet inefficient. IDS efficiency and
performance can be improved by changing the methods used in the IDS, moving to network traffic flow analysis
with new AI-based methods.</p>
      <p>To solve these problems, the paper proposes the following IDS architecture [1], which consists of two
main virtualized components: abnormalities symptoms detection (ASD) and network
abnormalities detection (NAD). The ASD is a distributed component located in the network
infrastructure. It focuses on the rapid search for symptoms of abnormalities in the network traffic
generated by user equipment and other network nodes.</p>
      <p>The NAD, in turn, collects timestamps and symptoms; a central process then
analyzes this data and tries to identify patterns that can be attributed to abnormal traffic. As soon as an
anomaly is detected, a message reporting it is immediately sent to the monitoring and diagnostic
module.</p>
      <p>This approach is very flexible, as it allows new virtualized resources to be deployed dynamically
to detect anomaly symptoms as network traffic grows. Symptom detection, which is one of the most costly
parts of the analysis, is distributed across the network, while anomaly detection is centralized
and requires only the symptoms as input data.</p>
      <p>In this architecture, the detection of anomalies is organized on two levels. At the
lower level, the flow collector gathers all the different flows over a given period of time and
calculates a feature vector that the ASD module classifies as abnormal or normal. This initial
classification should be done as quickly as possible, even if it sacrifices accuracy for a lower response time. If
an anomaly is suspected, a symptom packet consisting of the feature vector, a timestamp, and the type of the
detected anomaly is sent to the next level, the NAD module. The NAD receives several streams of
symptoms from all ASDs, sorts them by time, and assembles the temporal sequence of symptoms.</p>
      <p>With this approach, it is important to note that each ASD must handle a huge amount of traffic,
which is why it is extremely important that it can process a sufficient number of flows per second,
even if the detection is not quite as accurate as it could be.</p>
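      <p>The NAD step that assembles one temporal sequence from several ASD symptom streams can be sketched as follows. This is only an illustration under stated assumptions: the <monospace>Symptom</monospace> record, its field names, and the function <monospace>merge_symptom_streams</monospace> are hypothetical, and each ASD stream is assumed to arrive already ordered by timestamp.</p>
      <code language="python"><![CDATA[
```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class Symptom:
    """Hypothetical symptom packet sent by an ASD to the NAD."""
    timestamp: float             # time at which the ASD flagged the window
    asd_id: str                  # which ASD produced the symptom
    anomaly_type: str            # type of the suspected anomaly
    features: Tuple[float, ...]  # feature vector of the suspicious window


def merge_symptom_streams(streams: Iterable[Iterable[Symptom]]) -> Iterator[Symptom]:
    """Merge per-ASD streams (each assumed time-ordered) into one time sequence."""
    return heapq.merge(*streams, key=lambda s: s.timestamp)


# Illustrative data only; not taken from the paper.
asd_a = [Symptom(1.0, "asd-a", "scan", (0.9,)), Symptom(4.0, "asd-a", "ddos", (0.7,))]
asd_b = [Symptom(2.5, "asd-b", "ddos", (0.8,))]
sequence = list(merge_symptom_streams([asd_a, asd_b]))
print([(s.timestamp, s.asd_id) for s in sequence])
```
]]></code>
      <p>A lazy merge keeps the central component cheap: it never re-sorts whole streams, which matches the idea that the costly work stays in the distributed ASDs.</p>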
    </sec>
    <sec id="sec-4">
      <title>4. Abnormalities Symptoms Detection</title>
      <p>When developing a machine learning model, it is important to decide which features should be used
as input data for the training algorithm. Feature selection when forming the feature space
is a mandatory procedure both at the preparatory stage (prior to training) and at the stage of assessing
the results obtained and subsequently adjusting the training sample and/or model hyperparameters.</p>
      <p>4.1. Statistical Characteristics Used for Traffic Anomaly Detection</p>
      <p>To assess the current statistical characteristics of network traffic, we will use “sliding windows” of
a given duration, which allow you to “view” network traffic in the “online” mode.</p>
      <p>Statistical analysis conducted within each window can be used as a basis for constructing a space of
informative features and determining the criteria for detecting network anomalies.</p>
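      <p>A sliding-window pass over a traffic-intensity series can be sketched as below, taking the mean and variance as the simplest per-window statistics. The function names, the window width, and the step are assumptions made for illustration, and the variance is computed in its population form (division by the window length), which the text does not specify.</p>
      <code language="python"><![CDATA[
```python
from statistics import fmean, pvariance
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(samples: Sequence[float], width: int,
                    step: int = 1) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start index, window) pairs of `width` consecutive samples."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    for start in range(0, len(samples) - width + 1, step):
        yield start, samples[start:start + width]


def window_mean_variance(samples: Sequence[float], width: int,
                         step: int = 1) -> List[Tuple[int, float, float]]:
    """Return (start, mean, population variance) for every window."""
    return [(start, fmean(w), pvariance(w))
            for start, w in sliding_windows(samples, width, step)]


# Illustrative intensities with a short burst in the middle.
traffic = [10, 12, 11, 13, 90, 95, 12, 11]
for start, mean, var in window_mean_variance(traffic, width=4, step=2):
    print(start, mean, var)
```
]]></code>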
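      <p>The analysis involves calculating the following statistical characteristics for each window. Below, <inline-formula><tex-math>S_i</tex-math></inline-formula> is the sample value of traffic intensity at the time <inline-formula><tex-math>t_i</tex-math></inline-formula>, and the window starting at sample <inline-formula><tex-math>j</tex-math></inline-formula> contains <inline-formula><tex-math>W</tex-math></inline-formula> samples.</p>
      <p>The sample mean is determined by the equation
<disp-formula><tex-math>\hat{m}_j = \frac{1}{W} \sum_{i=j}^{j+W-1} S_i .</tex-math></disp-formula></p>
      <p>The sample variance is determined by the equation
<disp-formula><tex-math>\hat{\sigma}_j^2 = \frac{1}{W} \sum_{i=j}^{j+W-1} \left( S_i - \hat{m}_j \right)^2 .</tex-math></disp-formula></p>
      <p>The asymmetry coefficient is determined by the equation
<disp-formula><tex-math>\gamma_1 = \frac{1}{\hat{\sigma}_j^3} \cdot \frac{1}{W} \sum_{i=j}^{j+W-1} \left( S_i - \hat{m}_j \right)^3 ,</tex-math></disp-formula>
which determines the degree of asymmetry of the probability density relative to the axis passing through its
center of gravity.</p>
      <p>The kurtosis is determined by the equation
<disp-formula><tex-math>\gamma_2 = \frac{1}{\hat{\sigma}_j^4} \cdot \frac{1}{W} \sum_{i=j}^{j+W-1} \left( S_i - \hat{m}_j \right)^4 - 3 ,</tex-math></disp-formula>
which shows how sharp the peak of the probability density is compared to the normal distribution. If the
excess coefficient is greater than zero, the distribution has a sharper peak than the Gaussian
distribution; if it is less than zero, the peak is flatter than that of the normal distribution.</p>
      <p>Antikurtosis is determined by the equation
<disp-formula><tex-math>\varkappa = \frac{1}{\sqrt{\varepsilon}} ,</tex-math></disp-formula>
where <inline-formula><tex-math>\varepsilon</tex-math></inline-formula> is the kurtosis parameter, which is determined by the equation
<disp-formula><tex-math>\varepsilon = \frac{\mu_4}{\sigma^4} ,</tex-math></disp-formula>
where <inline-formula><tex-math>\sigma</tex-math></inline-formula> is the standard deviation and <inline-formula><tex-math>\mu_4</tex-math></inline-formula> is the sample fourth central moment.</p>
      <p>To study the spectral properties of traffic in the absence and presence of anomalous spikes,
correlation analysis is used, which includes the calculation of the correlation function, correlation
coefficient and correlation interval. The correlation function is calculated according
to the equation
<disp-formula><tex-math>R_j(k) = \frac{1}{W-k} \sum_{i=j}^{j+W-k-1} \left( S_i - \hat{m}_j \right) \left( S_{i+k} - \hat{m}_j \right) ,</tex-math></disp-formula>
where <inline-formula><tex-math>k</tex-math></inline-formula> is the lag (time shift of the original series).</p>
      <p>The correlation coefficient <inline-formula><tex-math>r_j(k)</tex-math></inline-formula> is understood as the normalized value of the correlation function:
<disp-formula><tex-math>r_j(k) = \frac{R_j(k)}{R_j(0)} .</tex-math></disp-formula>
The correlation interval <inline-formula><tex-math>T_{cor}</tex-math></inline-formula> is understood as the value of the argument at which the
autocorrelation function for each window changes sign for the first time.</p>
      <p>Information characteristics such as entropy, conditional entropy, relative entropy, information gain
and information cost have been used in anomaly detection models [2]. We provide the following
definitions of these measures; <inline-formula><tex-math>C_D</tex-math></inline-formula> and <inline-formula><tex-math>C_Y</tex-math></inline-formula> denote the sets of values taken by <inline-formula><tex-math>x</tex-math></inline-formula> and <inline-formula><tex-math>y</tex-math></inline-formula>.</p>
      <p>Entropy is a key parameter of information theory which measures the uncertainty of a collection
of data. For a dataset <inline-formula><tex-math>D</tex-math></inline-formula>, the entropy is defined as
<disp-formula><tex-math>H(D) = \sum_{x \in C_D} P(x) \log \frac{1}{P(x)} ,</tex-math></disp-formula>
where <inline-formula><tex-math>P(x)</tex-math></inline-formula> is the probability of <inline-formula><tex-math>x</tex-math></inline-formula> in <inline-formula><tex-math>D</tex-math></inline-formula>.</p>
      <p>Conditional entropy is the entropy of <inline-formula><tex-math>D</tex-math></inline-formula> given <inline-formula><tex-math>Y</tex-math></inline-formula>, that is, the entropy of the probability distribution
<inline-formula><tex-math>P(x|y)</tex-math></inline-formula>:
<disp-formula><tex-math>H(D|Y) = \sum_{x, y \in C_D, C_Y} P(x, y) \log \frac{1}{P(x|y)} ,</tex-math></disp-formula>
where <inline-formula><tex-math>P(x, y)</tex-math></inline-formula> is the joint probability of <inline-formula><tex-math>x</tex-math></inline-formula> and <inline-formula><tex-math>y</tex-math></inline-formula>, and <inline-formula><tex-math>P(x|y)</tex-math></inline-formula> is the conditional probability of <inline-formula><tex-math>x</tex-math></inline-formula> given <inline-formula><tex-math>y</tex-math></inline-formula>.</p>
      <p>Relative entropy is the entropy between two probability distributions <inline-formula><tex-math>p(x)</tex-math></inline-formula> and <inline-formula><tex-math>q(x)</tex-math></inline-formula> defined over
the same <inline-formula><tex-math>x \in C_D</tex-math></inline-formula>:
<disp-formula><tex-math>\mathrm{rel\_entropy}(p|q) = \sum_{x \in C_D} p(x) \log \frac{p(x)}{q(x)} .</tex-math></disp-formula></p>
      <p>Relative conditional entropy is the entropy between two probability distributions <inline-formula><tex-math>p(x|y)</tex-math></inline-formula> and
<inline-formula><tex-math>q(x|y)</tex-math></inline-formula> defined over the same <inline-formula><tex-math>x \in C_D</tex-math></inline-formula> and <inline-formula><tex-math>y \in C_Y</tex-math></inline-formula>:
<disp-formula><tex-math>\mathrm{rel\_cond\_entropy}(p|q) = \sum_{x, y \in C_D, C_Y} p(x, y) \log \frac{p(x|y)}{q(x|y)} .</tex-math></disp-formula></p>
      <p>Information gain is a measure of the information gained from an attribute or feature <inline-formula><tex-math>A</tex-math></inline-formula> in a dataset <inline-formula><tex-math>D</tex-math></inline-formula>
and is defined as
<disp-formula><tex-math>\mathrm{Gain}(D, A) = H(D) - \sum_{v \in values(A)} \frac{|D_v|}{|D|} H(D_v) ,</tex-math></disp-formula>
where <inline-formula><tex-math>values(A)</tex-math></inline-formula> is the set of possible values of <inline-formula><tex-math>A</tex-math></inline-formula> and <inline-formula><tex-math>D_v</tex-math></inline-formula> is the subset of <inline-formula><tex-math>D</tex-math></inline-formula> where <inline-formula><tex-math>A</tex-math></inline-formula> has the value <inline-formula><tex-math>v</tex-math></inline-formula>.</p>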
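      <p>The information measures defined in this subsection can be sketched in Python as follows. The probabilities are estimated from observed frequencies and the logarithm is taken to base 2; both choices are assumptions, since the text does not fix them, and all function names are illustrative.</p>
      <code language="python"><![CDATA[
```python
import math
from collections import Counter
from typing import Hashable, List, Mapping, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """H(D) = sum_x P(x) * log2(1 / P(x)), with P(x) estimated by frequency."""
    n = len(values)
    return sum((c / n) * math.log2(n / c) for c in Counter(values).values())


def conditional_entropy(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """H(D|Y) = sum_{x,y} P(x,y) * log2(1 / P(x|y))."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    marginal_y = Counter(ys)
    # 1 / P(x|y) = count(y) / count(x, y)
    return sum((c / n) * math.log2(marginal_y[y] / c) for (_, y), c in joint.items())


def relative_entropy(p: Mapping[Hashable, float], q: Mapping[Hashable, float]) -> float:
    """sum_x p(x) * log2(p(x) / q(x)); assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)


def information_gain(labels: Sequence[Hashable], attribute: Sequence[Hashable]) -> float:
    """Gain(D, A) = H(D) - sum_v |D_v| / |D| * H(D_v)."""
    n = len(labels)
    subsets: dict = {}
    for label, v in zip(labels, attribute):
        subsets.setdefault(v, []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


# Illustrative records: the protocol fully determines the label here.
labels: List[str] = ["normal", "normal", "attack", "attack"]
protocol: List[str] = ["tcp", "tcp", "udp", "udp"]
print(entropy(labels), conditional_entropy(labels, protocol),
      information_gain(labels, protocol))
```
]]></code>
      <p>With frequency estimates, the weighted sum in the information gain equals the conditional entropy of the labels given the attribute, so the two functions can be used to cross-check each other.</p>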
      <p>Based on this knowledge, appropriate abnormalities symptoms detection models can be built.
Supervised anomaly detection techniques require a training dataset followed by test data to evaluate
the performance of a model.</p>
    </sec>
    <sec id="sec-5">
      <title>4.2. Analysis of the Selected Statistical Characteristics Applicability for Abnormalities Symptoms Detection</title>
      <p>This method was tested on a set of real traffic data [3]. To do this, we used records of real network
traffic, which contained attacks of various classes such as Flash-crowd, ICMP-flooding, Fraggle, Smurf,
SYN-flooding, UDP-storm, and Neptune. Experiments have shown that, under the impact of anomalies in the
analysis window, the sample mean, sample variance, and antikurtosis take maximum values and can be used as
symptoms in intrusion detection algorithms. The kurtosis and asymmetry coefficient can also be used,
but they take minimal values (Table 2).</p>
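      <p>The per-window characteristics of Section 4.1 can be sketched in Python as below. This is an illustration rather than the authors' implementation: the function names are hypothetical, moments are normalized by the window length, and the correlation interval is taken as the first lag at which the sample autocorrelation becomes negative.</p>
      <code language="python"><![CDATA[
```python
import math
from typing import Dict, Optional, Sequence


def central_moment(window: Sequence[float], order: int) -> float:
    """Sample central moment of the given order, normalized by the window length."""
    m = sum(window) / len(window)
    return sum((s - m) ** order for s in window) / len(window)


def window_features(window: Sequence[float]) -> Dict[str, float]:
    """Mean, variance, asymmetry, kurtosis and antikurtosis of one window."""
    variance = central_moment(window, 2)
    if variance == 0:
        raise ValueError("constant window: normalized moments are undefined")
    eps = central_moment(window, 4) / variance ** 2  # kurtosis parameter mu4 / sigma^4
    return {
        "mean": sum(window) / len(window),
        "variance": variance,
        "asymmetry": central_moment(window, 3) / math.sqrt(variance) ** 3,
        "kurtosis": eps - 3.0,
        "antikurtosis": 1.0 / math.sqrt(eps),
    }


def correlation_interval(window: Sequence[float]) -> Optional[int]:
    """First lag at which the autocorrelation changes sign, or None if it never does."""
    n = len(window)
    m = sum(window) / n
    for k in range(1, n):
        r_k = sum((window[i] - m) * (window[i + k] - m) for i in range(n - k)) / (n - k)
        if r_k < 0:  # R(0) is the variance and is positive, so this is the first sign change
            return k
    return None


# Illustrative windows only: one quiet, one containing a burst.
quiet = [10, 12, 11, 13]
burst = [11, 13, 90, 95]
print(window_features(quiet))
print(window_features(burst))
print(correlation_interval([1, 2, 3, 4]))
```
]]></code>
      <p>In a detector, such feature dictionaries would be computed for every sliding window and compared with values observed on attack-free traffic.</p>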
      <p>Similarly, using records of real traffic, the information parameters of traffic were analyzed.
Through experiments, we observed that while the network is not under attack, the entropy values
for different header fields each fall in a fairly narrow range. While the network is under
attack, these entropy values fall outside these ranges.</p>
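      <p>The range check described above can be sketched as follows. The header field names, the baseline ranges, and the function names are hypothetical, and the sketch flags any departure from the baseline range, below as well as above it, which is an assumption about how the ranges are applied.</p>
      <code language="python"><![CDATA[
```python
import math
from collections import Counter
from typing import Dict, Hashable, Mapping, Sequence, Tuple


def field_entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (base 2) of the observed values of one header field."""
    n = len(values)
    return sum((c / n) * math.log2(n / c) for c in Counter(values).values())


def out_of_range_fields(packets: Sequence[Mapping[str, Hashable]],
                        baseline: Mapping[str, Tuple[float, float]]) -> Dict[str, float]:
    """Return the header fields whose window entropy leaves its baseline range."""
    flagged = {}
    for name, (low, high) in baseline.items():
        h = field_entropy([p[name] for p in packets])
        if not low <= h <= high:
            flagged[name] = h
    return flagged


# Hypothetical baseline ranges, as if measured on attack-free traffic.
baseline = {"dst_ip": (1.0, 2.0), "dst_port": (0.5, 2.0)}

normal = [{"dst_ip": ip, "dst_port": port}
          for ip, port in [("10.0.0.1", 80), ("10.0.0.2", 443),
                           ("10.0.0.3", 80), ("10.0.0.4", 443)]]
# Every packet aimed at one address and one port, as in a flood on one victim.
attack = [{"dst_ip": "10.0.0.5", "dst_port": 80} for _ in range(4)]

print(out_of_range_fields(normal, baseline))
print(out_of_range_fields(attack, baseline))
```
]]></code>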
      <p>During the experiment, not only DDoS attacks were identified, but also UDP-flood, HTTP flood,
TCP SYN, and Ping of Death attacks. The analysis of the experimental results showed that the entropy changed
for each type of attack, as shown in Table 2.</p>
      <p>Attack types listed in the table: DDoS, UDP-flood, TCP SYN, Ping of Death, HTTP flood.</p>
      <p>The data in Table 3 show that the entropy analysis method is suitable for identifying different types of
attacks.</p>
      <p>Table 4 presents the values of the quality of attack detection by entropy analysis of protocols for
different types of attacks. The proposed method produces false positives. These can occur because of high traffic
speed, limited packet header information (encrypted data), etc.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusions</title>
      <p>Modern information communication networks are characterized by large volumes and high speeds of
information transfer. Therefore, creating distributed intrusion detection systems that consist
of two main virtualized components, abnormalities symptoms detection and network
abnormalities detection, makes it possible to increase the efficiency of intrusion detection.</p>
      <p>Analysis of changes in the statistical traffic characteristics shows that when an
abnormality occurs, a sharp jump in the sample mean, sample variance, entropy and antikurtosis is observed.
The data obtained show that these indicators can be used in intrusion detection tasks
to detect abnormality symptoms.</p>
      <p>A method for monitoring, detecting intrusions and identifying attacks based on packet entropy
analysis has been developed. It relies on the calculation of conditional entropy and statistical
characteristics of packet data, which reduces intrusion detection time and makes it possible to identify previously unknown
attacks.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Acknowledgements</title>
    </sec>
    <sec id="sec-8">
      <title>7. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.F.</given-names>
            <surname>Maimó</surname>
          </string-name>
          , et al.,
          <article-title>On the performance of a deep learning-based anomaly detection system for 5G networks</article-title>
          , in: 2017 IEEE SmartWorld,
          <source>Ubiquitous Intelligence &amp; Computing</source>
          , Advanced &amp; Trusted ...,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          , doi: 10.1109/UIC-ATC.2017.8397440.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Naser</given-names>
            <surname>Mahmood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>A survey of network anomaly detection techniques</article-title>
          ,
          <source>Journal of Network and Computer Applications</source>
          ,
          <volume>60</volume>
          (
          <year>2016</year>
          )
          <fpage>19</fpage>
          -
          <lpage>31</lpage>
          , doi: 10.1016/j.jnca.2015.11.016.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O.I.</given-names>
            <surname>Sheluhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.S.</given-names>
            <surname>Filinova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.V.</given-names>
            <surname>Vasina</surname>
          </string-name>
          ,
          <article-title>Obnaruzhenie anomal'nyh vtorzhenij v komp'juternye ... statistical methods]</article-title>
          ,
          <source>T-Comm - Telekommunikacii i Transport</source>
          ,
          <volume>9</volume>
          (
          <issue>10</issue>
          ) (
          <year>2015</year>
          )
          <fpage>42</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>