<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Network traffic analyzing algorithms on the basis of machine learning methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>R I Battalov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A V Nikonov</string-name>
          <email>nikonovandrey1994@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M M Gayanova</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V V Berkholts</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R Ch Gayanov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Higher School of Economics</institution>
          ,
          <addr-line>Myasnitskaya str., 20, Moscow, Russia, 101000</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ufa State Aviation Technical University</institution>
          ,
          <addr-line>K. Marks str., 12, Ufa, Russia, 450008</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>445</fpage>
      <lpage>456</lpage>
      <abstract>
        <p>Traffic analysis systems are widely used to monitor the network activity of users or of a specific user and to restrict client access to certain types of services (VPN, HTTPS) that make content analysis impossible. Algorithms for classifying encrypted traffic and detecting VPN traffic are proposed. Three algorithms for constructing classifiers are considered: MLP, RFT and KNN. The proposed classifier demonstrates recognition accuracy of up to 80% on a test sample. The MLP, RFT and KNN algorithms had almost identical performance in all experiments. It was also found that the proposed classifiers work better when the network traffic flows are generated using short values of the time parameter (timeout). The novelty lies in the development of network traffic analysis algorithms based on a neural network that differ in the method of feature generation and selection, which makes it possible to classify the traffic of protected connections of selected users according to a predetermined set of categories.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The term “deep packet inspection” (DPI) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] refers to the analysis of a network packet at the upper
layers (application and presentation) of the Open Systems Interconnection (OSI) model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In addition to analyzing network packets [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] against standard patterns that unambiguously determine whether a packet belongs to a
specific application (for example, by header format or port numbers), the DPI system performs
behavioural analysis of traffic. This makes it possible to recognize applications that do not use
known headers and data structures for data exchange.
      </p>
      <p>For identification, a sequence of packets with the same characteristics is analyzed. The
analyzed characteristics include Source_IP:port - Destination_IP:port, packet size, the frequency of
opening new sessions per unit of time, etc. The analysis is based on behavioral (heuristic) models
corresponding to such applications.</p>
      <p>
        The main component of the DPI solution [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is the classification module, which is responsible for classifying network flows. Depending
on the purpose of the DPI application, classification can be performed at different levels of detail:
• the type of protocol or application (for example, Web, P2P, VoIP);
• a specific application-layer protocol (HTTP, BitTorrent, SIP);
• the application using the protocol (Google Chrome, uTorrent, Skype).
      </p>
      <p>
        For encrypted data streams (for example, TLS/SSL), traffic analysis with traditional tools is
impossible without recovering the encryption key. Finding the key takes a lot of resources, so
breaking the encryption remains practical only at the governmental or military level [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the classification of encrypted network traffic by application type is discussed for the
purpose of detecting security threats, using machine learning methods such as the Naive Bayes, C4.5,
AdaBoost and Random Forest algorithms. For the analysis, more than two million network packets
were collected from four applications that transmit encrypted traffic: Skype, Tor, PuTTY (SSHv2) and
CyberGhost (VPN). Two classification approaches were considered: forming and analyzing flows of
network packets that share the same sender/recipient IP addresses and network protocol, and
intercepting and analyzing each network packet individually [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. For each approach, the attributes used for classification were identified. The results
obtained can be used to build traffic classifiers and intrusion detection systems that effectively
process the encrypted traffic used by various network applications.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] it is shown that comparing different algorithms for classifying network traffic is
difficult because there is no publicly available collection of complete network traces on
which such comparisons could be made. One of the most actively developing areas at the
moment is the use of machine learning algorithms and of graph and statistical analysis, because
they remain applicable to encrypted traffic (unlike the DPI approaches), whose share is growing
rapidly. Another emerging focus is the development of combined approaches and classification
systems, which attempt to overcome the shortcomings of individual approaches (for example, low
accuracy or processing speed) while retaining their advantages [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ].
      </p>
      <p>
        Therefore, the development of algorithms that allow classifying the traffic of secure connections
with the required level of detail by protocol is relevant [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>Objective: to improve the algorithms for analyzing the network traffic of secure connections.
The main tasks of the study are:
1. Analysis of algorithms for network traffic classification;
2. Development of the structure of the network traffic analysis system;
3. Development of an algorithm for analyzing the network traffic of secure connections based on
feature generation and selection algorithms for constructing a neural network classifier;
4. Software implementation of the network traffic analysis algorithms and evaluation of the
effectiveness of the proposed solution on real-world data.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Analysis of algorithms and systems for network traffic processing</title>
      <p>
        Traffic classification [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] makes it possible to identify the various applications and protocols whose data is transmitted
over the network. Classification also serves to manage this traffic, optimize it and prioritize it.
After classification, every packet is marked as belonging to a specific protocol or application.
This allows network devices to apply quality of service (QoS) policies based on these
labels and flags.
      </p>
      <p>There are two main methods of traffic classification (Figure 1):
• Classification based on data blocks (Payload-Based Classification). It is based on the analysis
of data packet fields. This method is the most common but does not work with encrypted and
tunneled traffic.
• Classification based on statistical analysis (time between packets, session time, etc.).</p>
      <p>
        A universal approach to traffic classification is based on the information in the header of the
IP packet: usually the IP address (Layer 3), the MAC address (Layer 2) and the protocol used. This
approach has its limitations [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Deep packet inspection (DPI) makes more advanced classification possible. The main
mechanism for identifying applications in DPI is signature analysis [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Each application has its own
unique characteristics, which are entered in a database of signatures. Comparing a sample from the
database with the analyzed traffic makes it possible to determine the application or protocol. However,
because new applications appear periodically, the signature database must be updated regularly to
maintain high identification accuracy.
      </p>
      <p>There are several methods of signature analysis:</p>
    </sec>
    <sec id="sec-3">
      <title>Pattern analysis</title>
      <p>Applications produce characteristic byte sequences (patterns) in the packet payload, which can be
used for identification and classification. Not every packet contains such an application pattern, so
the method does not always work.</p>
    </sec>
    <sec id="sec-4">
      <title>Numerical analysis</title>
      <p>Numerical analysis uses the quantitative characteristics of the sequence of packets, such as: size of
the data block, response time, interval between packets. Simultaneous analysis of multiple packets is
time consuming, which reduces the effectiveness of this method.</p>
    </sec>
    <sec id="sec-5">
      <title>Behavioral analysis, Heuristic analysis</title>
      <p>
        The method is based on the analysis of the traffic dynamics of a running application. While the
application is running, it generates traffic that can also be identified and labeled [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>Protocol/state analysis</title>
      <p>The protocols of some applications consist of a fixed sequence of actions. Analyzing such
sequences makes it possible to identify the application accurately.</p>
      <p>
        Behavioral and heuristic analysis are used when working with encrypted traffic. For more accurate
identification, cluster analysis is used, which combines the methods of heuristic and behavioral
analysis [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-7">
      <title>3. Development of the network traffic analysis system based on machine learning</title>
      <p>
        Developing an analysis algorithm that classifies the network traffic of secure connections of
selected users according to a predefined set of categories is a relevant task [
        <xref ref-type="bibr" rid="ref18 ref19 ref20">18-20</xref>
        ].
      </p>
      <p>Two scenarios of network traffic analysis are considered:
• analysis of encrypted traffic;
• analysis of encrypted traffic passing through a virtual private network (VPN).</p>
      <p>Figure 2 shows the structure of the organization's LAN with an analysis module for the network
traffic of encrypted connections.</p>
      <p>Traffic arrives from the edge router and is captured and preprocessed using the
libpcap library. The primary features of the data flow are extracted from the resulting pcap files.
A vector of primary features is formed for sessions with a duration of 15, 30, 60 and 120 seconds.
Features are then generated and selected for training the neural network classifier. The
prepared feature vector is fed to the neural network module that analyzes user sessions. The settings
for training and operation are set by the administrator.</p>
      <p>This process is shown in Figure 3.</p>
      <p>Next, the decision block receives the decision of the base block on the type of traffic, the
probability that the traffic belongs to one of the basic types, and the types of traffic recognized
by the NA block that analyzes user sessions. The administrator can adjust the current decision of the
decision block on the type of traffic.</p>
      <p>Then the current traffic from the decision block and the marked traffic from the basic traffic
analysis module (sender's IP, recipient's IP, sender's port, recipient's port) are sent to the user
sessions store.</p>
      <p>Finally, data on the recognized user sessions is sent to the module that analyzes traffic types and
user types. An information security specialist receives information about the types of users and their
rights. The administrator interacts with the repository to view and replenish the database and sets the
parameters for capturing traffic.</p>
      <p>Development of an algorithm for analyzing network traffic</p>
      <p>In the first step, a fragment of the intercepted traffic is loaded, and then the classifier scenario is
selected. Based on the features specified in the scenario, a training sample is formed to build the initial
knowledge base. After the given features are analyzed on the test sample, the accuracy of the classifier is
determined. If the accuracy satisfies the requirements, the current state is saved; otherwise the cycle
returns to the selection of the scenario type. The block diagram of the network traffic classifier is
shown in Figure 4.</p>
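      <p>The loop described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function <monospace>build_classifier</monospace>, its parameters <monospace>train_fn</monospace> and <monospace>evaluate_fn</monospace>, and the accuracy threshold are hypothetical names introduced here.</p>
      <preformat>
```python
from typing import Callable, Optional, Sequence, Tuple


def build_classifier(
    scenarios: Sequence[str],
    train_fn: Callable[[str], object],
    evaluate_fn: Callable[[object], float],
    required_accuracy: float,
) -> Optional[Tuple[str, object, float]]:
    """Try each scenario in turn until the test accuracy is acceptable.

    All parameter names are hypothetical; the text only describes the loop.
    """
    for scenario in scenarios:
        model = train_fn(scenario)       # build the knowledge base from the training sample
        accuracy = evaluate_fn(model)    # accuracy on the test sample
        if accuracy >= required_accuracy:
            return scenario, model, accuracy   # corresponds to saving the current state
    return None  # no scenario met the requirement
```
      </preformat>
      <p>Returning <monospace>None</monospace> when no scenario reaches the threshold is a design choice of this sketch; the text does not say how the loop ends in that case.</p>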
    </sec>
    <sec id="sec-8">
      <title>4. Implementation of network traffic analysis algorithms and experiment on real data</title>
      <p>Traffic classification is based on the analysis of the temporal characteristics of the intercepted network
packets stream for the formation of encrypted and VPN-traffic features (time-related features). The
temporal characteristics of the flow make it possible to reduce the computational cost of building a set
of features extracted from the encrypted network traffic by reducing the set of fixed parameters.</p>
      <p>
        The experiment uses a dump of network traffic [
        <xref ref-type="bibr" rid="ref21 ref22 ref23 ref24 ref25 ref26 ref27 ref28 ref29">21-29</xref>
        ] with 14 different traffic type tags generated
by different applications (7 for conventional encrypted traffic and 7 for VPN traffic).
      </p>
      <p>Time-related flow features: duration, fiat, biat, flowiat, active, idle, fb_psec, fp_psec.</p>
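      <p>As an illustration of how such time-related features could be derived from packet timestamps, a minimal sketch is given below. It assumes that fiat, biat and flowiat denote forward, backward and overall inter-arrival times, that fb_psec and fp_psec denote flow bytes and packets per second, and that a gap longer than a hypothetical <monospace>idle_threshold</monospace> counts as idle time. Only mean values are computed, and the flow is assumed to be non-empty.</p>
      <preformat>
```python
from statistics import mean
from typing import Dict, List, Tuple

Packet = Tuple[float, str, int]  # (timestamp in seconds, "f" or "b", size in bytes)


def gaps(times: List[float]) -> List[float]:
    """Inter-arrival times between consecutive timestamps."""
    return [b - a for a, b in zip(times, times[1:])]


def safe_mean(values: List[float]) -> float:
    """Mean of the values, or 0.0 when there are none."""
    return mean(values) if values else 0.0


def time_features(packets: List[Packet], idle_threshold: float = 1.0) -> Dict[str, float]:
    """Time-related features of one non-empty flow (assumed definitions)."""
    packets = sorted(packets)
    times = [t for t, _, _ in packets]
    duration = times[-1] - times[0]
    all_gaps = gaps(times)
    # A gap longer than idle_threshold counts as idle time, the rest as active time.
    idle = sum(g for g in all_gaps if g > idle_threshold)
    active = duration - idle
    total_bytes = sum(size for _, _, size in packets)
    rate_base = duration if duration > 0 else 1.0  # avoid division by zero
    return {
        "duration": duration,
        "fiat": safe_mean(gaps([t for t, d, _ in packets if d == "f"])),
        "biat": safe_mean(gaps([t for t, d, _ in packets if d == "b"])),
        "flowiat": safe_mean(all_gaps),
        "active": active,
        "idle": idle,
        "fb_psec": total_bytes / rate_base,
        "fp_psec": len(packets) / rate_base,
    }
```
      </preformat>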
    </sec>
    <sec id="sec-9">
      <title>Traffic</title>
      <table-wrap id="table-1">
        <label>Table 1</label>
        <caption>
          <p>Types of traffic and applications in the source dataset</p>
        </caption>
        <table>
          <thead>
            <tr><th>Traffic</th><th>Applications</th></tr>
          </thead>
          <tbody>
            <tr><td>Web browsing</td><td/></tr>
            <tr><td>Email</td><td/></tr>
            <tr><td>Chat</td><td>ICQ, AIM, Skype, Facebook, Hangouts</td></tr>
            <tr><td>Streaming video</td><td>Vimeo, YouTube</td></tr>
            <tr><td>File Transfer</td><td>Skype, FTPS and SFTP using FileZilla and an external service</td></tr>
            <tr><td>VoIP</td><td>Voice calls on Facebook, Skype</td></tr>
            <tr><td>P2P</td><td>uTorrent and Transmission (BitTorrent)</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-9-5">
        <p>
          The quality criterion for traffic classification is the accuracy of classifying samples. The
classification accuracy can be assessed by cross-validation [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. The sample is split into training
and test sets in a fixed proportion: the training set is two-thirds of the data and the test set is
one-third of the data.
        </p>
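        <p>The split and the accuracy criterion can be sketched as follows. The function names and the fixed random seed are assumptions made for illustration; the text does not say whether the sample is shuffled before splitting.</p>
        <preformat>
```python
import random
from typing import List, Sequence, Tuple


def split_two_thirds(samples: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle and split: two-thirds for training, one-third for testing."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = (2 * len(items)) // 3
    return items[:cut], items[cut:]


def accuracy(true_labels: Sequence, predicted: Sequence) -> float:
    """Share of samples whose predicted label equals the true label."""
    hits = sum(1 for t, p in zip(true_labels, predicted) if t == p)
    return hits / len(true_labels)
```
        </preformat>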
        <p>
          To solve the classification problem, the following algorithms are considered:
• Random Forest algorithm (RFT);
• K-Nearest Neighbor method (KNN);
• Multilayer Perceptron (MLP) [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ].
        </p>
        <p>As the source data, the real traffic generated by such applications and services as Skype, Facebook,
etc. is used. Table 1 provides a complete list of the different types of traffic and applications included
in the source dataset.</p>
        <p>For each type of traffic (VoIP, P2P, etc.), both regular sessions and sessions inside a VPN
tunnel are used, so there are 14 categories of traffic in total: VoIP, VPN-VoIP, P2P, VPN-P2P, etc.</p>
        <p>Traffic was captured using the Wireshark sniffer. For VPN traffic, the external service of a VPN
provider was used, and the connection was made using OpenVPN. To generate SFTP and FTPS traffic, an
external service provider was used, with FileZilla as the client.</p>
        <p>We define two different scenarios, A and B (Figure 5). Four different flow duration values were
used to generate the data sets.</p>
        <p>Scenario A: the purpose is to select the features of encrypted traffic with VPN identification, for
example, distinguishing between voice calls (VoIP) and voice calls passing through a VPN
(VPN-VoIP). As a result, there are 14 different types of traffic: 7 regular types of encrypted traffic and 7
types of traffic passing through the VPN. The first classifier separates VPN traffic from non-VPN
traffic, and then the traffic type is classified separately within each group (VPN and non-VPN).</p>
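        <p>The two-stage structure of scenario A can be sketched as a cascade. The three callables below are hypothetical placeholders for trained classifiers, and the label format with the "VPN-" prefix follows the category names used in the text.</p>
        <preformat>
```python
from typing import Callable, Sequence

Features = Sequence[float]


def classify_scenario_a(
    x: Features,
    is_vpn: Callable[[Features], bool],
    vpn_model: Callable[[Features], str],
    non_vpn_model: Callable[[Features], str],
) -> str:
    """Stage 1 separates VPN from non-VPN; stage 2 assigns the traffic type."""
    if is_vpn(x):
        return "VPN-" + vpn_model(x)
    return non_vpn_model(x)
```
        </preformat>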
      </sec>
      <sec id="sec-9-8">
        <p>Traffic labels:
• Streaming: multimedia applications that require a continuous and stable data flow, for example the
YouTube (HTML5 and Flash versions) and Vimeo services accessed through Chrome and Firefox.
• File Transfer: application traffic for sending or receiving files and documents. The files were
transferred with Skype, FTP over SSH (SFTP) and FTP over SSL (FTPS).
• VoIP: all traffic generated by voice applications (Facebook, Hangouts and Skype).
• P2P: file sharing protocols, such as BitTorrent.</p>
        <p>Scenario B: in this case a mixed data set is used. The classifier's input is both regular encrypted
traffic and VPN traffic, and its output is one of the same 14 categories.</p>
        <p>Following the common definition, a flow is a sequence of packets with the same values
of source IP address, destination IP address, source port, destination port and protocol (TCP or
UDP).</p>
        <p>
          Flows are considered bidirectional. The features associated with each type of traffic are
determined as the flows are generated. A TCP flow is usually terminated when the connection is
closed (by a FIN packet), whereas a UDP flow is terminated by a flow timeout. The value of
the flow timeout can be assigned arbitrarily. In particular, we set the duration of the flows to 15,
30, 60 and 120 seconds [
          <xref ref-type="bibr" rid="ref32 ref33 ref34">32-34</xref>
          ].
        </p>
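        <p>A minimal sketch of assembling bidirectional flows with a timeout is given below. The packet tuple layout is an assumption, and the timeout is measured from the first packet of a flow, which is only one possible reading of the flow duration described above. FIN-based termination of TCP flows is not modelled.</p>
        <preformat>
```python
from typing import Dict, List, Tuple

# (timestamp, src_ip, src_port, dst_ip, dst_port, protocol)
Packet = Tuple[float, str, int, str, int, str]


def flow_key(p: Packet) -> tuple:
    """Direction-independent key, so both directions land in one flow."""
    _, src_ip, src_port, dst_ip, dst_port, proto = p
    ends = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (ends[0], ends[1], proto)


def split_flows(packets: List[Packet], timeout: float) -> List[List[Packet]]:
    """Group packets into bidirectional flows shorter than `timeout` seconds."""
    open_flows: Dict[tuple, List[Packet]] = {}
    finished: List[List[Packet]] = []
    for p in sorted(packets):
        key = flow_key(p)
        current = open_flows.get(key)
        if current is not None and p[0] - current[0][0] >= timeout:
            finished.append(current)   # the flow reached its time limit
            current = None
        if current is None:
            current = []
            open_flows[key] = current
        current.append(p)
    finished.extend(open_flows.values())
    return finished
```
        </preformat>
        <p>Calling <monospace>split_flows(packets, 15)</monospace>, and likewise with 30, 60 and 120, would produce the four data sets that differ only in the timeout value.</p>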
      </sec>
    </sec>
    <sec id="sec-10">
      <title>Scenario A analysis</title>
      <p>
        The MLP is a feedforward neural network with two hidden layers: 30 neurons in the first hidden
layer and 15 neurons in the second. The size of the input feature vector is
23. The activation function of the neurons is the hyperbolic tangent [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. The root-mean-square error is used as the numerical metric of the network error. The
conjugate gradient method is used as the learning algorithm. The number of learning epochs is 5000.
      </p>
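      <p>The forward pass of a network with this topology can be sketched in plain Python. The output layer size of 14 (one unit per category), the use of the hyperbolic tangent in the output layer and the random initial weights are assumptions; training by the conjugate gradient method is not shown.</p>
      <preformat>
```python
import math
import random
from typing import List

Matrix = List[List[float]]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Matrix:
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights: Matrix, inputs: List[float]) -> List[float]:
    """Weighted sum plus bias, followed by the hyperbolic tangent."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + row[-1])
        for row in weights
    ]


def mlp_forward(layers: List[Matrix], features: List[float]) -> List[float]:
    """Propagate a feature vector through all layers in order."""
    activations = features
    for weights in layers:
        activations = layer_forward(weights, activations)
    return activations


def rmse(predicted: List[float], target: List[float]) -> float:
    """Root-mean-square error between network output and target vector."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))


rng = random.Random(0)
sizes = [23, 30, 15, 14]  # the output size of 14 is an assumption
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
outputs = mlp_forward(layers, [0.0] * 23)
predicted_class = max(range(len(outputs)), key=outputs.__getitem__)
```
      </preformat>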
      <p>When testing the KNN algorithm, the number of neighbors is set to 50.</p>
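      <p>For reference, a minimal k-nearest-neighbor classifier with the number of neighbors equal to 50 could look as follows. The Euclidean distance and majority voting are assumptions, since the text does not state the distance metric or the voting rule.</p>
      <preformat>
```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def knn_predict(train: List[Sample], x: Sequence[float], k: int = 50) -> str:
    """Majority label among the k training samples closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```
      </preformat>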
      <p>When testing the random forest method (RFT), the classifier type “Bag” is used. The number of ensemble
training cycles is 150.</p>
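      <p>The idea of bagging with 150 training cycles can be illustrated by the sketch below. It shows bootstrap aggregation with a pluggable base learner passed in as the hypothetical <monospace>base_fit</monospace> argument; it is not a random forest, which would additionally use decision trees with random feature subsets as base learners.</p>
      <preformat>
```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]        # (feature vector, label)
Model = Callable[[Sequence[float]], str]    # maps a feature vector to a label


def bagging_fit(
    data: List[Sample],
    base_fit: Callable[[List[Sample]], Model],
    cycles: int = 150,
    seed: int = 0,
) -> Model:
    """Train `cycles` base models on bootstrap samples and combine them by voting."""
    rng = random.Random(seed)
    # Each bootstrap sample draws len(data) items with replacement.
    models = [base_fit([rng.choice(data) for _ in data]) for _ in range(cycles)]

    def predict(x: Sequence[float]) -> str:
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict
```
      </preformat>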
      <p>Cross-validation (Tables 4, 5) was performed to assess how well the classifiers work
with real data, that is, whether they produce results whose accuracy is comparable to the accuracy on the test sample
[36].</p>
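      <p>A minimal sketch of k-fold cross-validation is shown below. The number of folds (five) and the interleaved assignment of samples to folds are assumptions; <monospace>fit</monospace> is a hypothetical callable that trains a classifier and returns a prediction function. The data set is assumed to contain at least k samples.</p>
      <preformat>
```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def k_fold_accuracy(
    data: List[Sample],
    fit: Callable[[List[Sample]], Callable[[Sequence[float]], str]],
    k: int = 5,
) -> float:
    """Mean accuracy over k folds; each fold serves once as the held-out part."""
    scores = []
    for i in range(k):
        held_out = data[i::k]  # every k-th sample, starting at offset i
        training = [s for j, s in enumerate(data) if j % k != i]
        predict = fit(training)
        hits = sum(1 for x, label in held_out if predict(x) == label)
        scores.append(hits / len(held_out))
    return sum(scores) / k
```
      </preformat>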
      <p>There is a clear relationship between the duration of the captured flow and the
performance of the classifiers. With the RFT classifier, the accuracy on the test sample
decreases from 0.857 for a flow duration of 15 seconds to 0.828 for a flow duration of 120 seconds. Similar
behavior is observed for the KNN and MLP algorithms. The best results are achieved by the RFT
algorithm with a flow duration of 15 seconds. These results show that using
shorter timeout values for the traffic classifier can increase the accuracy.</p>
      <p>The second part of scenario A focuses on the separation of VPN and non-VPN traffic. The input is
classified according to traffic categories. The results for shorter duration values are better than the
results for longer ones, albeit with some exceptions. One such exception for the VPN classifier is
VPN-mail, where the best result is obtained with a value of ftm equal to 30 seconds. The same
happens in the case of the non-VPN classifier.</p>
    </sec>
    <sec id="sec-11">
      <title>Analysis of scenario B</title>
      <p>All encrypted streams and VPN traffic are mixed in one set of data. The goal is to classify traffic
without prior VPN separation from non-VPN traffic. There are 14 types of traffic: 7 encrypted and 7
VPN traffic types.</p>
      <p>Here, a short duration of the captured flow does not give the highest accuracy. For
example, for the MLP algorithm, the test accuracy is 0.795 and 0.51 for 15 seconds, and for a flow
duration of 30 seconds, the accuracy on the test sample for the same algorithm is 0.798 and 0.637. The
highest accuracy on the test sample across the timeout values is 0.847 (RFT algorithm with a flow
duration of 120 seconds).</p>
      <p>The cross-validation (Table 8) was performed to assess the effectiveness of the KNN method and
the RFT method.</p>
    </sec>
    <sec id="sec-12">
      <title>5. Conclusion</title>
      <p>Traffic analysis systems are widely used to monitor the network activity of users or of a specific user
and to restrict client access to certain types of services (VPN, HTTPS) that make content analysis
impossible. Algorithms for classifying encrypted traffic and detecting VPN traffic are
proposed. Three algorithms for constructing classifiers are considered: MLP, RFT and KNN.</p>
      <p>The effect of the duration of the captured flow on the classification accuracy is
established. The developed classifier demonstrates recognition accuracy of up to
80% on the test sample. The MLP, RFT and KNN algorithms performed almost identically in all experiments.</p>
      <p>It is also established that the proposed classifiers work better when network traffic flows are
generated using short time-out values.</p>
      <p>The novelty lies in the development of algorithms for analyzing network traffic on the basis of a
neural network. The method differs in the way features are generated and selected, which makes it
possible to classify the traffic of protected connections of selected users according to a predefined set
of categories.</p>
      <p>The developed algorithms can improve the security of the data transmission network by improving
the algorithms for analyzing network traffic as part of a data leak prevention system.</p>
    </sec>
    <sec id="sec-13">
      <title>Acknowledgments</title>
      <p>This work is partially supported by the Russian Science Foundation under grant № 17-07-00351.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>DPI Technology Overview - Deep Packet Inspection</article-title>
          URL: https://habr.com/post/111054/
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Olifer</surname>
            <given-names>V G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olifer</surname>
            <given-names>N A</given-names>
          </string-name>
          <year>2011</year>
          <source>Computer networks. Principles, technologies, protocols: Textbook for high schools</source>
          (SPb.: Peter) p
          <fpage>944</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <article-title>Analyzers of network packets</article-title>
          URL: https://compress.ru/article.aspx?id=16244
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Smith</surname>
            <given-names>R</given-names>
          </string-name>
          <year>2008</year>
          <article-title>Deflating the big bang: fast and scalable deep packet inspection with extended finite automata</article-title>
          <source>ACM SIGCOMM Computer Communication Review</source>
          <volume>38</volume>
          (
          <issue>4</issue>
          )
          <fpage>207</fpage>
          -
          <lpage>218</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <article-title>Traffic security</article-title>
          URL: https://habr.com/post/46321/
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Kostin</surname>
            <given-names>D V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheluhin</surname>
            <given-names>O I</given-names>
          </string-name>
          <year>2016</year>
          <article-title>Comparison of machine learning algorithms for encrypted traffic classification</article-title>
          <source>T-Comm</source>
          <volume>10</volume>
          (
          <issue>9</issue>
          )
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Get'man</surname>
            <given-names>A I</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Overview of tasks and methods for solving them in the field of classification of network traffic</article-title>
          <source>Proc. of the Institute for Syst. Prog. of the Russian Academy of Sciences</source>
          <volume>29</volume>
          (
          <issue>3</issue>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Lim</surname>
            <given-names>Y</given-names>
          </string-name>
          <year>2010</year>
          <article-title>Internet traffic classification demystified: on the sources of the discriminative power</article-title>
          <source>Proc. of the 6th Int. Conf. (ACM)</source>
          p
          <fpage>9</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Moore</surname>
            <given-names>A W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuev</surname>
            <given-names>D</given-names>
          </string-name>
          <year>2005</year>
          <article-title>Internet traffic classification using bayesian analysis techniques</article-title>
          <source>ACM SIGMETRICS Performance Evaluation Review</source>
          <volume>33</volume>
          (
          <issue>1</issue>
          )
          <fpage>50</fpage>
          -
          <lpage>60</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>Federal Law № 149-FZ “On Information, Information Technologies and Information Protection”</article-title>
          URL: www.internet-law.en/law/inflaw/inf.htm
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Traffic classification and Deep Packet Inspection</article-title>
          URL: https://vasexperts.ru/blog/klassifikatsiya-trafika-i-deep-packet-inspection/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Sukhov</surname>
            <given-names>A M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sagatov</surname>
            <given-names>E S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baskakov</surname>
            <given-names>A V</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Analysis of Internet service user audiences for network security problems</article-title>
          <source>2nd Int. Symp. on Telecommunication Technologies (ISTT)</source>
          <fpage>214</fpage>
          -
          <lpage>219</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <article-title>The composition of the DPI system</article-title>
          URL: https://vasexperts.ru/blog/sostav-sistemy-dpi/
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Moore</surname>
            <given-names>A W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuev</surname>
            <given-names>D</given-names>
          </string-name>
          <year>2005</year>
          <article-title>Internet traffic classification using bayesian analysis techniques</article-title>
          <source>Int. Conf. Measurement and Modeling of Comp. Syst.</source>
          URL: http://www.cl.cam.ac.uk/~awm22/publications/moore2005internet.pdf
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <article-title>Russian manufacturers of DPI and their platforms</article-title>
          URL: https://vasexperts.ru/blog/rossijskieproizvoditeli-dpi-i-ih-platfo/
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <article-title>Foreign DPI manufacturers and their platforms</article-title>
          URL: https://vasexperts.ru/blog/inostrannyeproizvoditeli-dpi-i-ih-platf/
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Sherry</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lan</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popa</surname>
            <given-names>R A</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ratnasamy</surname>
            <given-names>S</given-names>
          </string-name>
          <article-title>Blindbox: Deep packet inspection over encrypted traffic</article-title>
          URL: http://iot.stanford.edu/pubs/sherry-blindbox-sigcomm15.pdf
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Shen</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            <given-names>C</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ren</surname>
            <given-names>X</given-names>
          </string-name>
          <year>2007</year>
          <article-title>Research of P2P traffic identification based on BP neural network</article-title>
          <source>3rd Int. Conf. on Intelligent Information Hiding and Multimedia Signal Processing</source>
          <volume>2</volume>
          <fpage>75</fpage>
          -
          <lpage>78</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Raahemi</surname>
            <given-names>B</given-names>
          </string-name>
          <year>2008</year>
          <article-title>Classification of Peer-to-Peer traffic using incremental neural networks (Fuzzy ARTMAP)</article-title>
          <source>Canadian Conf. on Electrical and Comp. Eng.</source>
          <fpage>719</fpage>
          -
          <lpage>724</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <article-title>The UNSW-NB15 Dataset Description</article-title>
          URL: https://www.unsw.adfa.edu.au/australian-centrefor-cyber-security/cybersecurity/ADFA-NB15-Datasets/
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <article-title>Tor-nonTor dataset (ISCXTor2016)</article-title>
          URL: http://www.unb.ca/cic/datasets/tor.html
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Lotfollahi</surname>
            <given-names>M</given-names>
          </string-name>
          et al
          <year>2017</year>
          <article-title>Deep Packet: A Novel Approach For Encrypted Traffic Classification Using Deep Learning</article-title>
          <source>arXiv preprint</source>
          arXiv:1709.02656
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Lopez-Martin</surname>
            <given-names>M</given-names>
          </string-name>
          et al
          <year>2017</year>
          <article-title>Network traffic classifier with convolutional and recurrent neural networks for Internet of Things</article-title>
          <source>IEEE Access</source>
          <volume>5</volume>
          <fpage>18042</fpage>
          -
          <lpage>50</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Miller</surname>
            <given-names>B</given-names>
          </string-name>
          <year>2014</year>
          <article-title>I know why you went to the clinic: Risks and realization of HTTPS traffic analysis</article-title>
          <source>Int. Symp. on Privacy Enhancing Technologies</source>
          (Springer, Cham)
          <fpage>143</fpage>
          -
          <lpage>163</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Foremski</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2013</year>
          <article-title>On different ways to classify Internet traffic: a short review of selected publications</article-title>
          <source>Theoretical and Applied Informatics</source>
          <volume>25</volume>
          (
          <issue>2</issue>
          )
          <fpage>119</fpage>
          -
          <lpage>136</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Smit</surname>
            <given-names>D</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Looking deeper: Using deep learning to identify internet communications traffic</article-title>
          <source>Macquarie Matrix: Special edition, ACUR</source>
          <volume>1</volume>
          <fpage>1318</fpage>
          -
          <lpage>1323</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Michael</surname>
            <given-names>A K J</given-names>
          </string-name>
          <article-title>Network traffic classification via neural networks</article-title>
          <source>Technical Report</source>
          (University of Cambridge, Computer Laboratory) p
          <fpage>25</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Belov</surname>
            <given-names>S D</given-names>
          </string-name>
          <year>2008</year>
          <article-title>Detection of patterns and recognition of abnormal events in the data stream of network traffic</article-title>
          <source>Vestnik NGU: Information Technologies</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          )
          <fpage>57</fpage>
          -
          <lpage>68</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Haykin</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2006</year>
          <article-title>Neural networks</article-title>
          .
          <source>Neural networks: a full course</source>
          (Moscow: Publishing house “Williams”) p
          <fpage>1104</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <article-title>Data science. What is cross-validation?</article-title>
          URL: http://datascientist.one/cross-validation/
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Sukhov</surname>
            <given-names>A M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sagatov</surname>
            <given-names>E S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Baskakov</surname>
            <given-names>A V</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Rank distribution for determining the threshold values of network variables and the analysis of DDoS attacks</article-title>
          <source>Procedia Engineering</source>
          <volume>201</volume>
          <fpage>417</fpage>
          -
          <lpage>27</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Galtsev</surname>
            <given-names>A A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sukhov</surname>
            <given-names>A M</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Network attack detection at flow level</article-title>
          <source>11th Int. Conf., NEW2AN: Lecture Notes in Computer Science</source>
          <volume>6869</volume>
          <fpage>326</fpage>
          -
          <lpage>334</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <surname>Salimov</surname>
            <given-names>A S</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Application of SDN Technologies to Protect Against Network Intrusions</article-title>
          <source>Int. Scien. and Techn. Conf. Modern Comp. Network Technologies (MoNeTeC)</source>
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <article-title>Multilayer perceptron</article-title>
          URL: http://www.aiportal.ru/articles/neural-networks/multiperceptron.html
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <article-title>Mastering fuzzy modeling methods and developing an algorithm to optimize the fuzzy classifier rules base based on observable data using a genetic algorithm</article-title>
          URL: http://refleader.ru/jgernayfsyfs.html
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>