Development of Algorithmic Solutions for Solving the Problem
of Identifying Network Attacks Based on Adaptive Neuro-Fuzzy
Networks ANFIS
Denis Parfenov a, Lubov Zabrodina a, Irina Bolodurina a and Anton Parfenov a
a
    Orenburg State University, Prospekt Pobedy, 13, Orenburg, 460018, Russia

                 Abstract
                 Currently, the problem of detecting and classifying network attacks is one of the topical
                 problems in ensuring network security. Existing intrusion detection systems, as a rule, do not
                 provide the ability to identify all existing types of attacks, since today there is no universal
                 algorithm for solving this problem. In this study, we propose an approach to
                 searching for and detecting network attacks based on checking whether network traffic meets
                 certain flexible rules. The difficulty of forming a base of fuzzy rules lies in developing
                 optimal membership functions and creating term sets that yield a system of fuzzy
                 inference independent of the subjective assessments of specialists in a particular
                 area. One effective method for solving this problem is the construction of a
                 neuro-fuzzy network ANFIS. However, its operation requires preprocessing of the
                 data array. We propose a solution that makes it possible to
                 sequentially form data arrays using the C4.5 algorithm and the neuro-fuzzy network ANFIS.
                 The study of a hybrid approach to forming adaptive neuro-fuzzy networks ANFIS
                 based on various representations of fuzzy rules made it possible to improve the classification
                 of incoming network traffic in terms of both accuracy and performance.

                 Keywords
                 Classifying network attacks, network traffic, multiclass fuzzy classification

1. Introduction
   Traditional approaches to the organization of network architecture are currently changing.
This is primarily due to the annual increase in the number of devices participating in
network data exchange. A recent trend is networks of "smart" mobile devices. The traffic of
smart devices is, as a rule, directed mainly towards interaction with neighboring devices. Moreover, the
number of such devices can reach several thousand, so their interaction with
each other generates significant amounts of data transmitted through the existing communication
channels of telecom operators.
   In networks with such a large number of devices, it is not technically possible to clearly define the
border and areas of responsibility between customers and telecom operators. Nevertheless, telecom
operators are responsible for ensuring the uninterrupted functioning of the network, which in turn
requires the development of new approaches to ensuring network security [1].
   The main tools for securing such networks are traffic analysis tools. To ensure effective security,
researchers propose approaches based on the placement of security elements, as well as data mining
methods, including machine learning [2, 3].


YRID-2020: International Workshop on Data Mining and Knowledge Engineering, October 15-16, 2020, Stavropol, Russia
EMAIL: parfenovdi@mail.ru (Denis Parfenov); zabrodina97@inbox.ru (Lubov Zabrodina); prmat@mail.osu.ru (Irina Bolodurina);
anton_parfenov@bk.ru (Anton Parfenov);
ORCID: 0000-0002-1146-1270 (Denis Parfenov); 0000-0003-2752-7198 (Lubov Zabrodina); 0000-0003-0096-2587 (Irina Bolodurina);
0000-0003-0411-3582 (Anton Parfenov)
                 © 2020 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)


   One of the most common methods for detecting network attacks is to check network traffic data
for an exact match against an existing database of network attack indicators. Due to the
increasing complexity of threats, this method has low efficiency. One way to bypass this detection
algorithm is to hide or proxy the attacker's IP address. In this case, when a threat occurs, the
protection tools block the hosts of legitimate users, who then cannot access resources. In
addition, other methods for detecting network attacks, such as statistical methods, expert
systems, and neural networks, are currently used, both in combination and as separate
tools for analyzing network traffic. This approach is more flexible and simplifies the
process of updating the security system.
   Each of the above solutions is effective in its own way, but only for traditional networks. Modern
networks of telecom operators increasingly use flexible topologies characterized by variable
traffic volumes. In such networks, the threat metrics recorded in security
systems may not be effective.
   Therefore, this work is devoted to the study and development of solutions for network protection
tools based on multi-class fuzzy classification of network traffic to identify attacks.
   The choice of the proposed approach is motivated primarily by the fact that in networks with a constantly changing
topology there is no way to accurately identify attacks. An even more difficult task is to classify
detected attacks according to known threat types. Nevertheless, the amount of data collected by
network monitoring systems, in conjunction with well-known metrics of security systems, makes it
possible, with a certain degree of probability, to identify harmful traffic.

2. Related works
    Currently, the issue of detecting and classifying network attacks is one of the most relevant in
ensuring network security. Existing intrusion detection systems usually do not provide the ability to
identify all existing types of attacks, since there is no universal algorithm for solving this problem. In
this regard, the problem of identifying network attacks is studied by various authors around the world.
    For example, article [4] suggests an intrusion detection system built using the Feed Forward Deep
Neural Network (FFDNN), which includes a feature extraction unit based on the wrapper method
(Wrapper Based Feature Extraction Unit, WFEU) using the Extra Trees (ET) algorithm, which
generates an optimal set of features. Experimental studies were conducted on the UNSW-NB15 data
set. The authors found that the proposed approach exceeds the classical methods of machine learning
and allows us to obtain a solution with high accuracy, both for binary classification and for multi-
class classification of network traffic.
    In work [5], an intrusion detection system based on the ensemble method (IBk(K-NN), Random
Tree, REP Tree, j48graft, Random Forest) was developed to improve the accuracy and reliability of
network traffic classification. Also, the filter-based attribute evaluation technique is used to reduce the
number of attributes. To evaluate the performance of the proposed method, the NSL-KDD data set is
selected. The ensemble method demonstrates an accuracy of 99.72% for binary classification and
99.68% for multi-class classification.
    The authors of [6] propose an approach to multiclass classification of network traffic based on
deep convolutional neural networks. The performance of this approach was studied on the
UNSW-NB15 dataset. The results of the experimental study allowed us to conclude that the proposed
classifier is more effective than traditional machine learning classifiers, as well as dense neural
networks. In addition, classification results based on convolutional neural networks are superior to
previously obtained results in [7], including for complex types of attacks such as Analysis, Backdoor,
Shellcode, Worms, with an increase in the F-score parameter by about 20%.
    The study [8] proposes a two-stage intrusion detection system based on a stacked autoencoder and
a softmax classifier. At the first stage, network traffic is classified as normal or abnormal,
and at the second stage, abnormal traffic is assigned to a specific class of intrusions. The
proposed approach was studied on two data sets, KDD-CUP 99 and UNSW-NB15, and
showed high detection accuracy.
    One of the promising areas of research for network security is the use of fuzzy logic methods.


    The study [9] proposes an approach to detecting intrusions into a wireless network based on
fuzzy logic implemented with a genetic algorithm. The KDD-CUP 99 data set was
used to study the proposed approach. The system developed by the authors controls the distribution of
connection request messages.
    An approach based on the application of a genetic algorithm for fuzzy classification is also
considered in the article [10]. The system proposed by the authors for detecting anomalies in real
network traffic flows achieves a detection accuracy of 96.53% and a false-positive rate of
0.56%. This approach shows better results than CNN, SVM, and ACODS (Ant Colony Optimization
for Digital Signature).
    The authors of [11] propose a classifier based on a modified enhanced fuzzy min-max neural network
(EFMN) for rapid identification of network attacks. They compared
the performance of the developed classifier with other standard classifiers, such as SVM,
RIPPER (or JRIP), PART, ANFIS, and FMN. The comparison showed that the proposed
approach achieves a higher detection rate for low-frequency attacks. It also surpasses the
other approaches in terms of indicators such as the false-positive rate and recall time.
    In the article [12] an approach using an adaptive neuro-fuzzy inference system (ANFIS) and a
particle swarm optimization (PSO) method for detecting and preventing blackhole attacks in mobile
ad-hoc networks (MANET) is proposed. According to the results of experimental studies, this
approach has a good detection rate (on average 99.13%) and a low false alarm rate (on average
1.39%).
    To ensure the security of MANET networks, a fuzzy system for detecting RREQ message flooding
attacks is also proposed [13]. This system is based on the first-order Mamdani-type fuzzy inference
system. The system proposed by the authors uses network parameters such as routing costs,
bandwidth, and packet loss rate. As part of the experimental study, it was found that any deviation
from the normal behavior of nodes is immediately detected by the proposed system.
    The study [14] presents a comparative analysis of "soft computing" methods, such as genetic
programming (GP), fuzzy logic, artificial neural networks (ANN), and a probabilistic model using
clustering methods, for binary classification of network traffic packets. All studies were conducted on
the NSL-KDD dataset. Based on the research, it was found that algorithms based on fuzzy logic show
better performance. Thus, the FURIA algorithm provides a high detection rate of 99.69% with a low
false alarm rate of 0.31% for a time of 78.14 sec, and the FRNN algorithm provides an accuracy of
99.51% and an acceptable false alarm rate of 0.49% for a computational time of 0.33 sec.
    Research has shown that existing methods for identifying network attacks based on fuzzy
classification allow us to determine the type of attacks with high accuracy. Most works based on
fuzzy logic methods classify network traffic as secure or abnormal. However, determining the type of
attack is an important aspect of network security. The research conducted on the UNSW-NB15
dataset does not address the problem of intersecting classes of different types of attacks in terms of
analyzing similar characteristics. In this paper, we will describe the application of a multi-class fuzzy
classification for network traffic and conduct a comparative analysis of the results obtained.

3. Problem statement
    Consider a network of telecommunications service providers that provide end-user access to
information systems. It is necessary to detect and classify malicious fragments of continuous network
traffic. In other words, we will consider the network security problem as a multi-class classification of
network traffic for detecting network attacks.
    For the experimental study, we will use the UNSW-NB15 data set, which contains data on normal
traffic and data from 9 classes of attacks [15]:
     • Normal is legitimate (secure) data transactions;
     • Fuzzers is an attack that causes a program or network to fail by feeding it a large amount
of randomly generated input data;
     • Analysis is an attack that involves scanning ports, sending spam, and embedding in HTML
files;
     • Backdoors is a method of bypassing the security mechanisms of a system in order to obtain
hidden access to a computer, its data, or its programs;
     • DoS is a denial-of-service attack on a server or network resource that makes it difficult for
authorized users to access it;
     • Exploits is an attack that causes unexpected behavior of a host or network by
exploiting known errors, failures, and vulnerabilities in an operating system or program;
     • Generic is a technique that works against any block cipher, regardless of its structure;
     • Reconnaissance is an attack that collects information about a network in order to
circumvent its security system;
     • Shellcode is a small piece of code used as the payload when exploiting software vulnerabilities;
     • Worms is an attack associated with self-replication of the attacking code.
    The data set in question was generated in 2015 with the IXIA PerfectStorm tool in the
Cyber Range Lab of the Australian Centre for Cyber Security (ACCS). Unlike
the KDD CUP 99 and NSL-KDD datasets, the UNSW-NB15 dataset addresses their main shortcomings and includes
information about modern types of attacks. The database contains 2540044 network traffic records stored in
four CSV files. Each record is represented as a set of 49 characteristics (attributes), each of a specific data
type. Subsets containing 175341 and 82332 records are allocated for training and
testing, respectively.
    All the features considered in the data set can be divided into 5 main groups [16]:
     • flow features;
     • basic features;
     • content features;
     • time features;
     • additional generated features.
    Let us assume that the network attack model is represented as a time-ordered series of events
(states) of a single node with an additive overlay of the attack profile on each element of the network
of telecommunications service providers.
    Let us consider the problem of constructing a system of neuro-fuzzy classification of network
attacks from the point of view of predictive modeling, the solution of which can be obtained using
supervised machine learning. Because the set of identified attacks is limited only from a practical
point of view and is represented by the most common types of attacks, this task is a multi-class
classification. Note that the neuro-fuzzy system allows transforming both continuous and categorical
data into term sets, which significantly expands the set of variable characteristics that can be used.
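   The transformation of continuous and categorical data into term sets can be illustrated with a minimal sketch. The triangular membership functions, the three terms low/medium/high, and the function names below are assumptions chosen for illustration; the source does not specify which membership functions or term sets were used.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (left, peak, right)."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify_continuous(value, lo, hi):
    """Map a continuous value from [lo, hi] to the assumed terms low/medium/high."""
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)  # normalise to [0, 1]
    return {
        "low": triangular(t, -0.5, 0.0, 0.5),
        "medium": triangular(t, 0.0, 0.5, 1.0),
        "high": triangular(t, 0.5, 1.0, 1.5),
    }


def fuzzify_categorical(value, categories):
    """Treat every category as a crisp (singleton) term with degree 0 or 1."""
    return {c: (1.0 if c == value else 0.0) for c in categories}


# Hypothetical feature values, for illustration only.
duration_terms = fuzzify_continuous(25.0, lo=0.0, hi=100.0)
protocol_terms = fuzzify_categorical("tcp", ["tcp", "udp", "icmp"])
```

   With this choice of overlapping triangles, the membership degrees of a continuous value always sum to one, which keeps the later normalization step well behaved.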
    Let us give a formal mathematical statement of the problem of classifying network attacks.
Assume that information about the events taking place in the network is recorded at some
fairly short time interval. In addition to data about each device and its technical
characteristics, information about the actions performed by end users through the devices under
consideration is also recorded.
    Let the set $X$ contain information about the states of all network objects $x_i \in X$, $i = 1, \dots, m$, with
which some records of the event log are associated, i.e. $x_i = (x_{i1}, x_{i2}, \dots, x_{ik})$. The task of
multiclass classification of network attacks is to associate the network objects with a set of attack types
$Y = \{1, \dots, K\}$.
   Thus, the problem of identifying network attacks is to construct a mapping
$f_c(X): X \to Y$ that describes the dependence between the recorded characteristics of
network traffic and the class labels: it compares the behavior of network objects with the characteristics
of each class and chooses the most probable one, either the absence of attacks or a specific type of attack.
    This study analyzes the classification of network attacks on the UNSW-NB15 dataset [16],
restricted to traffic with five different types of network attacks, so the label set has the form
{Normal, Fuzzers, Generic, Reconnaissance, Exploits, DoS}. Note that the network
traffic data is described by more than 40 characteristics and comprises more than 2.5 million records. In addition,
balanced sets for training and testing are used when analyzing the accuracy of the
resulting classification models.
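   A minimal sketch of how the label set above can be turned into the target set Y = {1, ..., K} is given below. The column name `attack_cat`, the treatment of an empty label as Normal, and the helper name are assumptions made for illustration; they are not taken from the source.

```python
# The six classes considered in this study, numbered 1..K as in the set Y.
CLASSES = ["Normal", "Fuzzers", "Generic", "Reconnaissance", "Exploits", "DoS"]
LABEL_TO_INDEX = {name: k for k, name in enumerate(CLASSES, start=1)}


def select_and_encode(records, label_key="attack_cat"):
    """Keep records whose label is in CLASSES; return feature dicts and targets.

    Assumption: a missing or empty label denotes normal traffic.
    """
    features, targets = [], []
    for record in records:
        label = (record.get(label_key) or "Normal").strip()
        if label not in LABEL_TO_INDEX:
            continue  # attack types outside the studied subset are skipped
        features.append({k: v for k, v in record.items() if k != label_key})
        targets.append(LABEL_TO_INDEX[label])
    return features, targets


# Hypothetical records, for illustration only.
sample = [
    {"dur": 0.1, "attack_cat": "DoS"},
    {"dur": 0.2, "attack_cat": "Worms"},
    {"dur": 0.3, "attack_cat": ""},
]
X, y = select_and_encode(sample)
```

   The classifier $f_c$ is then any function that maps an element of `X` to one of the integers in `y`.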

4. Approaches to identifying attacks based on systems of neuro-fuzzy
   classification
    In this work, attacks are identified using fuzzy
rules extracted from a decision tree built on the UNSW-NB15 training dataset (175341 unique records). The
input features for building the decision tree are the network traffic characteristics extracted from the
data set at the preprocessing stage. The output feature is the field with the
attack class label. The C4.5 algorithm is used to construct the decision tree.
    Algorithm C4.5(T)
    Input: training data set T; attributes S.
    Output: decision tree Tree.
    if T is NULL then
        return failure
    if S is NULL then
        return Tree as a single node with the most frequent class label in T
    if |S| == 1 then
        return Tree as a single node S
    set Tree = {}
    for a ∈ S do
        set Info(a, T) = 0 and SplitInfo(a, T) = 0
        compute Entropy(a)
        for v ∈ values(a, T) do
            set T_{a,v} as the subset of T with attribute a = v
            Info(a, T) += (|T_{a,v}| / |T_a|) · Entropy(a_v)
            SplitInfo(a, T) −= (|T_{a,v}| / |T_a|) · log(|T_{a,v}| / |T_a|)
        Gain(a, T) = Entropy(a) − Info(a, T)
        GainRatio(a, T) = Gain(a, T) / SplitInfo(a, T)
    set a_best = argmax_a {GainRatio(a, T)}
    attach a_best into Tree
    for v ∈ values(a_best, T) do
        call C4.5(T_{a_best,v})
    return Tree
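   The attribute selection step of the algorithm above can be sketched in Python as follows. This is an illustrative sketch for nominal attributes only: it interprets Entropy(a) as the class entropy of the current data set, uses base-2 logarithms, and does not cover the threshold search that C4.5 applies to continuous attributes. The function names and the sample records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of a nominal attribute, following Info/SplitInfo/Gain above."""
    total = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    info = sum(len(s) / total * entropy(s) for s in subsets.values())
    split_info = -sum(len(s) / total * math.log2(len(s) / total)
                      for s in subsets.values())
    gain = entropy(labels) - info
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels, attrs):
    """Attribute with the largest gain ratio (a_best in the pseudocode)."""
    return max(attrs, key=lambda a: gain_ratio(rows, labels, a))


# Hypothetical toy data: the protocol separates the classes, the state does not.
rows = [
    {"proto": "tcp", "state": "FIN"},
    {"proto": "tcp", "state": "INT"},
    {"proto": "udp", "state": "FIN"},
    {"proto": "udp", "state": "INT"},
]
labels = ["Normal", "Normal", "DoS", "DoS"]
chosen = best_attribute(rows, labels, ["proto", "state"])
```

   The recursion of the full algorithm would then split the data on `chosen` and repeat the selection on each subset.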
   Neuro-fuzzy networks are often used for more accurate identification of semi-structured data.
Therefore, in addition to the already constructed decision tree, we propose to use neuro-fuzzy
networks, with the Sugeno-Takagi algorithm as the algorithm for fuzzy
transformations. This method makes it possible to approximate, with a given accuracy, arbitrary continuous
functions of many variables by a sum of functions of one variable. Let us
consider the basic ideas of constructing neuro-fuzzy ANFIS networks using the selected algorithm
and present an approach to the formation of neuro-fuzzy inference.
   The Sugeno-Takagi algorithm uses the following fuzzy rule model:
                                 $R_i$: IF $x_1$ is $A_{i1}$ and … and $x_n$ is $A_{in}$ THEN $y = f(X)$


   Note that for each fuzzy Sugeno-Takagi rule, a cut-off level is selected, at which the rule
conclusions are calculated. In the framework of this study, a first-order polynomial was used as an
output function.
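   For reference, a rule base with a first-order polynomial output is commonly written as shown below. The coefficient names $p_{ij}$, the firing strengths $w_i$, and the use of the product as the fuzzy "and" are notational assumptions made here for illustration; the source does not fix them.

```latex
\[
R_i:\ \text{IF } x_1 \text{ is } A_{i1} \text{ and } \dots \text{ and } x_n \text{ is } A_{in}
\ \text{THEN } y_i = p_{i0} + \sum_{j=1}^{n} p_{ij}\, x_j ,
\]
\[
w_i = \prod_{j=1}^{n} \mu_{A_{ij}}(x_j),
\qquad
y = \frac{\sum_{i} w_i\, y_i}{\sum_{i} w_i} .
\]
```

   Here $\mu_{A_{ij}}$ denotes the membership function of the term $A_{ij}$, and the final output is the average of the rule outputs weighted by their firing strengths.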
   The neuro-fuzzy network ANFIS corresponding to the Sugeno-Takagi inference model is shown in
Fig. 1 and has the following structure:
   Layer 1. Maps the continuous input signal values to membership degrees of specific term sets
(fuzzification).
   Layer 2. Determines the premises of the fuzzy rules from the membership degrees of the input values;
its output is interpreted as the degree of fulfillment of each rule.
   Layer 3. Calculates the relative degree of fulfillment of each fuzzy rule (normalization).
   Layer 4. Calculates the contribution of each fuzzy rule to the result from its normalized degree of fulfillment.
   Layer 5. Aggregates the contributions of the fuzzy rules into the network output.
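   The five layers listed above can be sketched as a single forward pass. The Gaussian membership functions, the product used for the rule premises, and the data layout of `rules` are assumptions chosen for illustration; the sketch computes one real-valued output and does not show training or how the output is mapped to an attack class.

```python
import math


def gaussian(x, center, sigma):
    """Assumed membership function of a term (Gaussian bell)."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, rules):
    """Forward pass of a first-order Sugeno-Takagi ANFIS.

    x     -- list of n input values
    rules -- list of dicts with keys
             "mf":   list of n (center, sigma) pairs, one term per input
             "coef": list [p0, p1, ..., pn] of the first-order polynomial
    """
    # Layer 1: fuzzification of every input for every rule.
    memberships = [[gaussian(xj, c, s) for xj, (c, s) in zip(x, rule["mf"])]
                   for rule in rules]
    # Layer 2: degree of fulfillment of each rule (product of memberships).
    firing = [math.prod(m) for m in memberships]
    # Layer 3: normalization of the degrees of fulfillment.
    total = sum(firing)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    normalized = [w / total for w in firing]
    # Layer 4: contribution of each rule (normalized weight times polynomial).
    contributions = [
        w * (rule["coef"][0] + sum(p * xj for p, xj in zip(rule["coef"][1:], x)))
        for w, rule in zip(normalized, rules)
    ]
    # Layer 5: aggregation into the network output.
    return sum(contributions)


# Hypothetical rule base with one input and two equally activated rules.
toy_rules = [
    {"mf": [(0.0, 1.0)], "coef": [1.0, 0.0]},
    {"mf": [(0.0, 1.0)], "coef": [3.0, 0.0]},
]
output = anfis_forward([0.0], toy_rules)
```

   In this toy case both rules fire equally, so the output is the plain average of the two rule conclusions.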


Figure 1: Scheme of neuro-fuzzy network ANFIS using Sugeno-Takagi inference

   To test how well various types of attacks are identified by neuro-fuzzy classification and fuzzy
inference systems and to evaluate their effectiveness, we conduct an experimental study of the
classification of cybersecurity incidents on real network traffic.

5. Simulation results
    To carry out a computational experiment on identifying attacks with the proposed
approach, which combines rule construction using the C4.5 algorithm with the ANFIS neuro-fuzzy
classification algorithm, we implemented a module for a traffic monitoring system in Python.
    The proposed module was run on a virtual machine running Ubuntu 19.10 LTS Linux. As part of
the study, we compared its performance with other machine learning approaches, including Naïve Bayes,
SVM, and KNN. To implement the research plan, an experimental testbed was built that
makes it possible to:
    1. Use, where possible, similar parameters of the experiment traffic generator on the equipment.
    2. Use PCAP files with the saved original experiment traffic on the hardware.


   At the first step of the experimental study, we assessed how effectively the constructed fuzzy inference
systems determine the attack class, based on the
analysis of network traffic from the UNSW-NB15 dataset.


Figure 2: ANFIS error matrix using the Sugeno-Takagi algorithm

   The results obtained are also presented as a general assessment of the effectiveness of identifying
network attacks using the Accuracy, Precision, Recall, and F-measure metrics:

Table 1
Experimental results
     Method                             Accuracy, %    Precision, %    Recall, %    F-measure, %
     ANFIS                                 86.15           85.60          86.60          86.40
     Naïve Bayes                           85.20           84.70          85.65          86.05
     SVM                                   86.10           85.20          84.75          86.25
     KNN                                   84.50           85.35          85.95          86.15
     Multiclass Fuzzy Classification       85.00           85.15          85.87          85.97
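   The measures reported in Table 1 can be derived from an error matrix such as the one in Fig. 2. The sketch below assumes macro-averaging over classes, which the source does not state, and uses a hypothetical two-class matrix rather than the experimental data.

```python
def classification_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F-measure.

    confusion[i][j] is the number of records of true class i
    that were assigned to class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(k))
    precisions, recalls, f_measures = [], [], []
    for c in range(k):
        true_positive = confusion[c][c]
        predicted = sum(confusion[row_idx][c] for row_idx in range(k))
        actual = sum(confusion[c])
        p = true_positive / predicted if predicted else 0.0
        r = true_positive / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f_measures.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f_measure": sum(f_measures) / k,
    }


# Hypothetical error matrix, for illustration only.
metrics = classification_metrics([[8, 2], [1, 9]])
```

   Other averaging schemes (micro or weighted) would give different precision, recall, and F-measure values for the same matrix, so the choice should be reported together with the results.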
   At the next stage of the experiment, we tested the load on the equipment created by each of the traffic
analysis modules running in the IDS. We assessed the load on the device in terms
of processor and RAM usage, as well as the delay that each classification method introduces in the
network before a decision is made. The analysis results are presented in Table 2.

Table 2
Experimental results for equipment load
   Parameter    ANFIS    Multiclass Fuzzy Classification    Naïve Bayes    SVM    KNN
   CPU %         <1%                   2%                        3%         4%     4%
   RAM           <1%                  <1%                        1%         2%     3%
   Delay         <1%                  <1%                        3%         3%     4%
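   A rough way to obtain comparable load figures for a classifier implemented in Python is sketched below. It measures the mean decision delay, the share of wall-clock time spent on the CPU, and the peak memory traced by the interpreter for one process; these are absolute, process-level quantities and not the device-level percentages of Table 2. The function name and the stand-in classifier are hypothetical.

```python
import time
import tracemalloc


def measure_decision_cost(classify, records):
    """Measure delay, CPU share, and peak traced memory of classify()."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for record in records:
        classify(record)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    count = max(len(records), 1)
    return {
        "mean_delay_s": wall_used / count,
        "cpu_share": cpu_used / wall_used if wall_used > 0 else 0.0,
        "peak_memory_bytes": peak_bytes,
    }


# Stand-in classifier and records, for illustration only.
cost = measure_decision_cost(lambda record: sum(record), [[1, 2, 3]] * 100)
```

   Running the same function with each classifier on the same recorded traffic gives figures that can be compared with one another, although they remain specific to the machine used.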

6. Conclusion


   In this study, we analyzed network traffic using an approach to
identifying attacks based on multi-class fuzzy classification. The results obtained showed that
a sufficiently accurate model can be built to identify certain types of attacks. We also
studied the performance of the proposed solution on real traffic. According to the
general assessment of the effectiveness of identifying network attacks using various accuracy measures, the
ANFIS neuro-fuzzy classifier proved the most effective. Messages with a message about the information
management system and security events. In future studies, it is planned to investigate other neuro-
fuzzy classifier algorithms for the ANFIS network.

7. Acknowledgments
   The study was carried out with the financial support of the RFBR in the framework of scientific
project No. 20-07-01065, as well as a grant from the President of the Russian Federation for state
support of leading scientific schools of the Russian Federation (NSh-2502.2020.9) and a grant from
the President of the Russian Federation for state support of young Russian scientists -
candidates of sciences (MK-860.2019.9).

8. References
   [1] I. Bolodurina, D. Parfenov, V. Torchin, L. Legashev "Development and Investigation of Multi-
       Cloud Platform Network Security Algorithms Based on the Technology of Virtualization
       Network Functions" 2018 International Scientific and Technical Conference Modern Computer
       Network Technologies (MoNeTeC) (2018): 1-7.
   [2] I. Bolodurina, D. Parfenov "The development and study of the methods and algorithms for the
       classification of data flows of cloud applications in the network of the virtual data center"
       International Journal of Computer Networks and Communications 10(2) (2018): 15-22.
   [3] A. E. Krasnov, D. N. Nikol'skii, D. S. Repin, V. S. Galyaev, E. A. Zykova "Detecting DDoS
       Attacks Using the Analysis of Network Traffic as Dynamical System" 2018 International
       Scientific and Technical Conference Modern Computer Network Technologies (MoNeTeC)
       (2018): 1-7.
   [4] S. M. Kasongo, Y. Sun "A Deep Learning Method With Wrapper Based Feature Extraction For
       Wireless Intrusion Detection System" Computers & Security 92 (2020): 1-21.
   [5] Kunal, M. Dua "Attribute Selection and Ensemble Classifier based Novel Approach to
       Intrusion Detection System" Procedia Computer Science 167 (2020): 2191-2199.
   [6] R. Chapaneri, S. Shah "Detection of Malicious Network Traffic using Convolutional Neural
       Networks" 2019 10th International Conference on Computing, Communication and
       Networking Technologies (ICCCNT) (2020): 1-6.
   [7] S. Potluri, S. Ahmed, C. Diedrich "Convolutional neural networks for multi-class intrusion
       detection system" Lecture Notes in Computer Science 11308 (2018): 225-238.
   [8] F. A. Khan, A. Gumaei, A. Derhab, A. Hussain "TSDL: A Two-Stage Deep Learning Model
       for Efficient Network Intrusion Detection" IEEE Access 7 (2019): 30373-30385.
   [9] S. Sai Satyanarayana Reddy, P. Chatterjee, C. Mamatha "Intrusion Detection in Wireless
       Network Using Fuzzy Logic Implemented with Genetic Algorithm" Computing and Network
       Sustainability. Lecture Notes in Networks and Systems 75 (2019): 425-432.
   [10] A. H. Hamamoto, L. F. Carvalho, L. D. H. Sampaio, T. Abrão, M. L. Proença "Network
       Anomaly Detection System using Genetic Algorithm and Fuzzy Logic" Expert Systems with
       Applications 92 (2018): 390-402.
   [11] N. Upasani, H. Om "A modified neuro-fuzzy classifier and its parallel implementation on
       modern GPUs for real time intrusion detection" Applied Soft Computing 82 (2019): 1-16.
   [12] H. Moudni, M. Er-rouidi, H. Mouncif, B. E. Hadadi "Black Hole attack Detection using
       Fuzzy based Intrusion Detection Systems in MANET" Procedia Computer Science 151 (2019):
       1176-1181.
   [13] B. Nithya, A. Nair, A. S. Sreelakshmi "Detection of RREQ Flooding Attacks in MANETs"
       Data and Communication Networks. Advances in Intelligent Systems and Computing 847
       (2018): 109-121.
   [14] J. E. Varghese, B. Muniyal "A Comparative Analysis of Different Soft Computing
       Techniques for Intrusion Detection System" Security in Computing and Communications.
       SSCC 2018. Communications in Computer and Information Science 969 (2019): 563-577.
   [15] T. T. L. Le "Intrusion detection on the modern database UNSW-NB15 using multilayer
       neural network" Informatization and communication 1 (2017): 61-66.
   [16] N. Moustafa, J. Slay "UNSW-NB15: a comprehensive data set for network intrusion detection
       systems (UNSW-NB15 network data set)" 2015 Military Communications and Information
       Systems Conference (MilCIS) (2015): 1-6.

