Botnet Detection Approach Based on DNS
Sergii Lysenko a, Kira Bobrovnikova a, Bohdan Savenko a, Piotr Gaj b, and Oleg Savenko a
a
    Khmelnitsky National University, Khmelnitsky, Ukraine
b
    Silesian University of Technology, Gliwice, Poland

                 Abstract
                 Botnets that use DNS are a serious threat on the Internet today. Their potential for
                 harm is large, ranging from the spread of malware, ransomware, and spam to the theft of
                 confidential information and money from bank accounts. An analysis of known methods and
                 means of identifying botnets that use DNS shows that their detection capability is
                 insufficient, so botnet detection methods need to be improved. The paper presents a
                 botnet detection approach based on DNS. It proposes a method of identifying botnets
                 that use DNS based on the Decision Tree classifier with the application of the AdaBoost
                 algorithm. The method detects such botnets from the characteristic features of how this
                 type of malware uses DNS. The Decision Tree was chosen because it is a powerful tool for
                 classification and prediction, and the AdaBoost algorithm was used to strengthen this
                 classifier.

                 Keywords
                 DNS, Botnet, Cyberattack, Computer Network, Cybersecurity, Computer system, Malware,
                 Malicious traffic, Botnet Detection, AdaBoost, Decision Tree

1. Introduction
   The problem of information protection remains relevant: it has no final solution, and with the
rapid development of technology new types of threats are constantly emerging. Modern computer
systems typically rely on the Domain Name System (DNS). However, cybercriminals often abuse domain
names because DNS traffic is usually unfiltered or allowed through firewalls, which gives them a
stable and seamless communication channel [1].
   The relevance of the work lies in the development of an approach to identifying botnets that use
DNS. The detection of new, previously unknown threats should draw on all available knowledge about
botnets that use DNS-based evasion technologies. The use of such an evasion technology can be
detected by applying machine learning to features extracted from DNS messages [2, 3]. The
method should ensure the detection of botnet attacks that use DNS in the early stages or even before
they occur [4, 5].
   The purpose of this work is to increase the reliability of the process of identifying botnets
that use the "domain flux" technology, based on the analysis of DNS traffic.
        The goal is achieved by solving the following main tasks:
        1.      Investigate the peculiarities of the functioning of botnets that use the "domain flux"
technology, taking into account the domain name system.
        2.      Analyze modern methods and means of identifying botnets based on DNS traffic in
order to determine ways to increase the reliability of botnet detection.


IntelITSIS’2022: 3rd International Workshop on Intelligent Information Technologies and Systems of Information Security, March 23–25,
2022, Khmelnytskyi, Ukraine
EMAIL: sirogyk@ukr.net (S. Lysenko); bobrovnikova.kira@gmail.com (K. Bobrovnikova); savenko_bohdan@ukr.net (B. Savenko);
piotr.gaj@polsl.pl (P. Gaj); savenko_oleg_st@ukr.net (O. Savenko)
ORCID: 0000-0001-7243-8747 (S. Lysenko); 0000-0002-1046-893X (K. Bobrovnikova); 0000-0001-5647-9979 (B. Savenko); 0000-0002-
2291-7341 (P. Gaj); 0000-0002-4104-745X (O. Savenko).
              ©️ 2022 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)
         3.      Develop an appropriate model of the botnet that takes into account the domain name
system, a DNS traffic model, and a model of the process of identifying the "domain flux"
evasion technology based on the analysis of DNS traffic.
         4.      Develop an improved method of identifying botnets that use the "domain flux" technology
by filtering DNS traffic against an accumulated whitelist database, applying frequency
lexical analysis to domain names, collecting features of DNS traffic, and analyzing them with the
Decision Tree classifier and the AdaBoost algorithm.
         5.      Develop a botnet identification system that improves the reliability and efficiency of
the process of detecting botnets that use the "domain flux" technology.
    The object of research is the process of identifying botnets that use the "domain flux" technology,
based on the analysis of DNS traffic.
    The subject of research is models, methods and software tools for identifying botnets using "domain
flux" technology, based on the analysis of DNS traffic.
    The aim of the work is to increase the reliability of the process of identification of botnets using the
technology of "domain flux".
    Scientific novelty of the obtained results. The study improves the method of identifying
botnets that use the "domain flux" technology based on the analysis of DNS traffic. Unlike known
methods, the improved method performs a comprehensive analysis of DNS traffic using the Decision Tree
classifier and the AdaBoost algorithm.
    The practical significance of the results is the developed system for identifying botnets that use
the "domain flux" technology. The system is built on the RapidMiner platform and performs a
comprehensive analysis of DNS traffic, which allows it to detect such botnets with high reliability
and efficiency.

2. Related works
2.1. Botnets and DNS

   Botnets play an important role in the spread of malware, and they are widely used to spread malicious
activity on the Internet. The study of the literature shows that a large subgroup of botnets uses DNS to
spread malicious actions and that there are different methods for detecting them using DNS queries.
The Domain Name System (DNS) is a system that establishes a correspondence between the IP address
and the domain name (and vice versa), and is designed to respond to DNS queries using the
corresponding protocol [6].
   A botnet is a computer network made up of a finite number of hosts that run autonomous
software, known as bots. Typically, a bot is a program that is covertly installed on the
victim's computer and allows the attacker to access the resources of the infected computer [7] and
perform certain actions. Botnets are most often used for illegal activities: sending spam emails,
collecting passwords on remote systems, organizing denial-of-service attacks, accessing personal
information about users, and stealing credit card numbers and access passwords [8]. "Domain flux" is a
method of sustaining the malicious activity of a botnet by constantly changing the domain name of the
C&C server. Domain names are replaced over time by an algorithm that is known
only to the attacker (botmaster). This makes it much harder to detect the malicious traffic
generated by the botnet.
   Modern botnets such as Zeus (Zbot, PRG, Wsnpoem, Gorhax, Kneber, Chthonic, Panda), Torpig,
Kraken, Conficker (DownUp, DownAndUp, DownAdUp, Kido), Mirai, "Star Wars" Twitter, Satori
IoT, Trickbot, and Emotet usually use the "domain flux" technology and a domain generation algorithm
(DGA) [9] to generate a large number of pseudo-random domain names, which lets the botnet operator
manage the bots dynamically.
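   To make the idea of a DGA concrete, the following is a minimal illustrative sketch of a date-seeded generator. It does not reproduce the algorithm of any botnet named above; the function name `toy_dga`, the seed string, the label length, and the `.example` suffix are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date
from typing import List


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12) -> List[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date."""
    domains = []
    for index in range(count):
        # Bot and botmaster can both compute this from the seed and the date.
        material = f"{seed}|{day.isoformat()}|{index}".encode("ascii")
        digest = hashlib.sha256(material).digest()
        # Map each digest byte onto a lowercase letter.
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        domains.append(label + ".example")
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2022, 3, 23)):
        print(name)
```

   Because most of the generated names are never registered, a bot that walks such a list produces the bursts of failed DNS queries that the features discussed later rely on.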
   Typically, botnets generate a large number of DNS queries registered to the same IP address, and
they often produce many failures in DNS traffic. Failed DNS queries may indicate the presence of bots
on clients, while successful queries that occur close in time to unsuccessful ones are likely related to
benign user activity.
   In addition, a botnet can quickly transmit messages to all of its bots, which is one of its main
advantages. DNS traffic monitoring is therefore an important task that helps to detect botnets that use
DNS [10]. Identifying algorithmically generated domain names through DNS traffic analysis has several
advantages.
   For example, DNS accounts for only a small share of the traffic in a network, which makes it
suitable for analysis even in large-scale networks. Additionally, DNS traffic is typically
cached, which reduces network load. Moreover, the analysis of DNS queries helps to detect attacks in
the early stages or even before they occur.

2.2.    Related works
    Today, the scientific community has developed a large number of methods for solving the problem
of identification of botnets using the technology of "domain flux".
    In particular, [11] describes a method that is based on the analysis of DNS queries, determines the
correlation of various logs and error messages, diagnoses based on the history of suspicious activity,
detects the activities of botnets based on DNS traffic failure and diagnoses based on DNS group activity,
and analyzes group activity on the network.
    In [12] a method is proposed to identify DGA based on the dictionary using graph theory.
    The article [13] describes a method of detecting algorithmically generated malicious domain names
using frequency analysis of character distribution and weighted domain name scores.
    Paper [14] presents a method for detecting botnets by classifying text strings of domain names based
on 𝑛-grams.
    The paper [15] provides a method for detecting botnets based on the analysis of DNS traffic
features. This method passively captures all DNS traffic at the network gateway and then extracts
key features to identify pseudo-random domain names.
    Anomaly detection and passive DNS analysis approaches to botnet detection are presented in
[16, 17]. An Information Bottleneck approach to identifying legitimate Web users and bots with
different traffic profiles is presented in [18].
    Thus, in contrast to heuristic methods, machine learning-based methods achieve high accuracy in
detecting botnets that use the "domain flux" technology. However, if the logic of generating
pseudo-random domain names changes, or if certain characteristics of the names change dramatically,
the efficiency of detecting botnets of this type may drop significantly.
    The anti-virus tools in question detect all known types of malware using signature databases and
existing heuristic approaches. However, none of the popular anti-virus tools today detect 100% of
botnets that use DNS.

2.3.    Conclusions and the problem statement
   Known methods of identifying botnets that use DNS lose accuracy when they face new, unknown
botnets of this type whose pseudo-random domain names are generated by methods other than the
known ones. Also, most methods do not allow
detecting attacks in the early stages or even before they occur.
    Therefore, in order to increase the reliability and efficiency of the process of identifying botnets
that use DNS, it is necessary to develop or improve the method based on a comprehensive analysis of
DNS traffic through the use of machine learning algorithms.

3. Model of the process of identification of botnets that use DNS
3.1. Process Formalization for the functioning of botnets that use DNS
    Botnets are distributed over the Internet using approaches similar to those of other malware. They
can spread like a worm, or they can disguise themselves as a Trojan. An attack proceeds as follows: the
botmaster issues a command (a set of parameters and settings) to the command-and-control center to carry
out the attack; in turn, the command-and-control center sends a message to all bots of the botnet, which
immediately begin to execute the commands of the botmaster.
   Consider the model of a botnet that uses DNS as a control system for infected bots of a computer
system and present it in the form of a tuple:

                                                       𝑀𝐷𝐹 = 〈𝐶, 𝐴, 𝑃, 𝐵, 𝐿, 𝐹〉,                                  (1)

   where $C = \{c_j\}_{j=1}^{N_{CC}}$ – a set of command-and-control (C&C) servers of the botnet, $N_{CC}$ – the number of C&C servers of the botnet;
   $A = \{a_j\}_{j=1}^{3}$ – type of botnet architecture;
   $P = \{p_j^{port}\}_{j=1}^{N_P}$ – a set of network protocols used for the functioning of the botnet, $N_P$ – the number of network protocols used by the botnet, $port$ – the port used for communication with the botnet, $port \in N_{Port}$, $N_{Port} = \{1, \dots, 65535\}$ [27];
   $B = \{b_j\}_{j=1}^{N_B}$ – a set of bots of the botnet, $N_B$ – the number of bots included in the botnet;
   $L = \{l_j\}_{j=1}^{5}$ – a set of stages of the life cycle of the botnet;
   $F = \{f_j\}_{j=1}^{N_F}$ – a set of functions that bots can perform in the corresponding phase of the botnet life cycle, $N_F$ – the number of functions that the bots of the botnet can perform.

3.2.    Model of attack carried out by a botnet that use DNS
   Consider the attack model of a botnet that uses DNS as a set 𝐴 of malicious
activities that can be performed by the bots of the botnet, together with their possible use scenarios:

                                 𝐴 = {𝑎𝐷𝐷𝑜𝑆 , 𝑎𝑠𝑝𝑎𝑚 , 𝑎𝑝ℎ𝑖𝑠ℎ𝑖𝑛𝑔 , 𝑎𝑒𝑠𝑝𝑖𝑜𝑛𝑎𝑔𝑒 , 𝑎𝑝𝑜𝑠𝑡𝑖𝑛𝑔 , 𝑎𝑝𝑟𝑜𝑥𝑦 },   (2)

    where 𝑎𝐷𝐷𝑜𝑆 – distributed denial of service attack;
    𝑎𝑠𝑝𝑎𝑚 – spam attack;
    𝑎𝑝ℎ𝑖𝑠ℎ𝑖𝑛𝑔 – phishing attack;
    𝑎𝑒𝑠𝑝𝑖𝑜𝑛𝑎𝑔𝑒 – espionage;
    𝑎𝑝𝑜𝑠𝑡𝑖𝑛𝑔 – placement of harmful content, such as unwanted content or advertising;
    𝑎𝑝𝑟𝑜𝑥𝑦 – carrying out attacks using proxy servers.
    Botnets are distributed over the Internet using similar approaches to other malware. They can spread
like a worm, or they can disguise themselves as a Trojan. Let's take a closer look at the model of ways
to spread malware [24] and present it as follows:

                                              𝑎𝑑𝑖𝑠𝑡𝑟𝑖𝑏𝑢𝑡𝑖𝑜𝑛 = {𝑑𝑜𝑠 , 𝑑𝑠 , 𝑑𝑎 , 𝑑𝑠𝑒 },                             (3)

   where 𝑑𝑜𝑠 – distribution due to vulnerabilities in the operating system;
   𝑑𝑠 – distribution through network services;
   𝑑𝑎 – distribution through applications;
   𝑑𝑠𝑒 – distribution through social engineering.

3.3.    Botnet detection model
   Since the detection of botnets that use DNS is based on the botnet model, which takes into account
the domain name system, it is important to develop a model of DNS traffic and
a model of a DNS message.
   Let's present a model of DNS traffic as a tuple:
                                  𝐷𝑁𝑆𝑡𝑟𝑎𝑓𝑓𝑖𝑐 = 〈𝑀, 𝐶, 𝑆, 𝐷〉,                                        (4)

   where 𝑀 – a set of DNS messages sent and received from a set of computer systems of the network,
𝑀 = 𝑀𝑂 ∪ 𝑀𝐼 , where 𝑀𝑂 and 𝑀𝐼 – are the set of outgoing and incoming DNS messages in the network,
respectively;
    𝐶 – a set of computer systems of the network;
   𝑆 – a set of DNS servers to which DNS queries and DNS responses were sent and received,
respectively 𝑆 = 𝑆𝐿 ∪ 𝑆𝑁 ,
   where 𝑆𝐿 and 𝑆𝑁 – set of local and non-local DNS servers, respectively;
   𝐷 – a set of domain names requested by the hosts of the network,
   $D = \{d_j\}_{j=1}^{N_D}$, where $N_D$ – the number of different domain names.
   Let us present a model of a DNS message. It includes the fields of incoming DNS messages
that can be used to identify botnets that use DNS.
   Thus, a DNS message may be presented as:

                          𝑅 = 〈𝑅𝑀𝑎𝑐 , 𝑅𝐼𝑃 , 𝑅𝑃𝑜𝑟𝑡 , 𝑅𝑇 , 〈𝑅𝐻 , 𝑅𝑅𝑒𝑞 , 𝑅𝐴𝑛𝑠 , 𝑅𝐴𝑡ℎ , 𝑅𝐴𝑑𝑑 〉〉,        (5)

   where 𝑅𝑀𝑎𝑐 – the host MAC address;
   𝑅𝐼𝑃 – the IP address of the host that is the source of the DNS packet;
   𝑅𝑃𝑜𝑟𝑡 – the source port of the DNS packet;
   𝑅𝑇 – the time of receipt of the DNS packet;
   𝑅𝐻 – the header section of the DNS message;
   𝑅𝑅𝑒𝑞 – the question section of the DNS message;
   𝑅𝐴𝑛𝑠 – the answer section of the DNS message;
   𝑅𝐴𝑡ℎ – the authority section of the DNS message;
   𝑅𝐴𝑑𝑑 – the additional section of the DNS message.
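   As a rough illustration, the tuple (5) can be mirrored by a small data structure. The class and field names below are hypothetical, and deriving the success flag from the response code (RCODE 0, NOERROR) together with a non-empty answer section is an assumption, not something the model itself prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DnsSections:
    header: Dict[str, int]                                # R_H, e.g. {"rcode": 0}
    question: List[str]                                   # R_Req
    answer: List[str]                                     # R_Ans
    authority: List[str] = field(default_factory=list)    # R_Ath
    additional: List[str] = field(default_factory=list)   # R_Add


@dataclass
class DnsMessage:
    mac: str              # R_Mac
    ip: str               # R_IP
    port: int             # R_Port
    received_at: float    # R_T, seconds since the epoch
    sections: DnsSections

    def is_successful(self) -> bool:
        """Assumed success rule: RCODE 0 (NOERROR) and at least one answer record."""
        return self.sections.header.get("rcode") == 0 and bool(self.sections.answer)
```

   A response with RCODE 3 (NXDOMAIN) and an empty answer section would then count as a failed query.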
   The process of extracting features from incoming DNS messages for a specific domain name is
presented as a function:

                                      𝑓𝑒𝑥𝑡𝑟 (𝐷, 𝑀, 𝑅, 𝑉) → 𝐼,                                       (6)

   where 𝑉 – a set of features that indicate the presence of a botnet.
   The set of features indicating the activity of a botnet that uses DNS consists of the following
elements:

                            𝑉 = {𝑁𝑑𝑜𝑚 , 𝑆𝑏𝑖𝑡 , 𝑇𝑡𝑡𝑙 , 𝐿𝑑𝑜𝑚 , 𝑁𝑛𝑢𝑚 , 𝑊𝑑𝑜𝑚 },                         (7)

   where 𝑁𝑑𝑜𝑚 is the number of domain names that share an IP address;
     𝑆𝑏𝑖𝑡 – a binary sign of the success of the DNS query (if 𝑆𝑏𝑖𝑡 = false – a failed DNS query, and if
𝑆𝑏𝑖𝑡 = true – a successful DNS query);
   𝑇𝑡𝑡𝑙 – TTL-period;
     𝐿𝑑𝑜𝑚 – the length of the domain name;
     𝑁𝑛𝑢𝑚 – number of digits in the domain name;
     $W_{dom}$ – a weighted estimate of the frequency lexical analysis of domain names, determined by the
formula:

                                                $W_{dom} = \frac{\sum_{i=1}^{n} X_i}{n}$,                              (8)

   where $n$ – the number of letters in the domain name; $X_i$ – the frequency of use of the $i$-th letter.
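   A minimal sketch of how the estimate (8) could be computed is given below. It assumes that the letter frequencies are estimated from a reference set of benign domain names; the text does not say which frequency table is used, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def letter_frequencies(names: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each letter across a reference set of domain names."""
    counts = Counter(ch for name in names for ch in name.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: c / total for ch, c in counts.items()}


def w_dom(name: str, freqs: Dict[str, float]) -> float:
    """Mean reference frequency of the letters in `name`, following equation (8)."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    # Letters absent from the reference set contribute a frequency of 0.
    return sum(freqs.get(ch, 0.0) for ch in letters) / len(letters)


if __name__ == "__main__":
    reference = ["google", "wikipedia", "example"]  # assumed benign sample
    freqs = letter_frequencies(reference)
    print(w_dom("google", freqs), w_dom("xqzjvk", freqs))
```

   Names built from common letters score higher than names made of rare letters, which is the property that makes the estimate useful as one feature among several.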
   Let's present a model of the process of identification of botnets that use DNS as follows:
                                           𝑃 = 〈𝑀𝐷𝐹 , 𝐷𝑁𝑆𝑡𝑟𝑎𝑓𝑓𝑖𝑐 , 𝑓𝑒𝑥𝑡𝑟 , 𝑓𝑚𝑎𝑝 , 𝑓𝑐𝑙𝑎𝑠 , 𝑓𝑚𝑒𝑠 〉,     (9)

   where 𝑀𝐷𝐹 – the botnet model; 𝐷𝑁𝑆𝑡𝑟𝑎𝑓𝑓𝑖𝑐 – the DNS traffic model;
   𝑓𝑒𝑥𝑡𝑟 – the function that extracts features from incoming DNS messages;
   𝑓𝑚𝑎𝑝 (𝐷𝑁𝑆𝑡𝑟𝑎𝑓𝑓𝑖𝑐 , 𝐼) → 𝑉 – the function that selects features from DNS traffic;
   𝑓𝑐𝑙𝑎𝑠 (𝐷𝑁𝑆𝑡𝑟𝑎𝑓𝑓𝑖𝑐 , 𝑅, 𝑉) → 𝑅𝑒𝑠 – the function that classifies the DNS messages of the DNS traffic in
the network;
   𝑓𝑚𝑒𝑠 (𝑅𝑒𝑠) → 𝑀𝑒𝑠 – the function that reports the detection of bots of a botnet.

4. Botnet Cyberattacks Detection Approach Based on DNS
4.1. Method for Botnet Detection Based on Decision Tree Classifier
    The paper proposes a method of identifying botnets that use DNS based on the Decision Tree
classifier with the application of the AdaBoost algorithm. The method detects such botnets
from the characteristic features of how this type of malware uses DNS.
    The Decision Tree was chosen because it is a powerful tool for
classification and prediction, and the AdaBoost algorithm was used to strengthen this classifier.
    The method consists of the following stages: preparation, training of the system, and direct
detection of the activity of botnets that use DNS.
         The preparation stage includes the following steps:
         1)       analysis, modeling and identification of key features that will be used to identify
botnets that use DNS;
         2)       collection of test data (network traffic) for training.
         The training stage includes the following steps:
         1)       loading the collected data (network traffic);
         2)       data conversion: in most cases the available data cannot be used directly to
train the machine learning model and must be pre-processed;
         3)       frequency lexical analysis of domain names;
         4)       formation of a database of whitelisted domain names;
         5)       model training using the Decision Tree classifier and the AdaBoost algorithm, based on the
features identified at the preparation stage for identifying botnets that use DNS;
         6)       evaluation of the model.
         The stage of detecting the activity of botnets that use DNS includes the following steps:
         1)       monitoring of network traffic;
         2)       filtering of DNS traffic by weeding out known DNS queries that contain legitimate
domain names;
         3)       collection of all available parameters and features from the filtered traffic;
         4)       identification of groups in which the DNS query is unsuccessful;
         5)       identification of queries whose domain names, according to statistical analysis, are most
likely generated algorithmically;
         6)       comparison of multiple groups of features and their analysis using the Decision Tree
classifier and the AdaBoost algorithm;
         7)       formation of conclusions.
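         Steps 2 and 4 of the detection stage can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(client IP, queried name, success flag)`, the function names, and the rule of matching the last two labels of a name against the whitelist are hypothetical simplifications (the two-label rule is wrong for suffixes such as `co.uk`).

```python
from typing import Dict, Iterable, List, Set, Tuple

Query = Tuple[str, str, bool]  # (client IP, queried name, query succeeded)


def registered_suffix(name: str, labels: int = 2) -> str:
    """Last `labels` labels of a name, a rough stand-in for the registered domain."""
    parts = name.lower().rstrip(".").split(".")
    return ".".join(parts[-labels:])


def filter_whitelisted(queries: Iterable[Query], whitelist: Set[str]) -> List[Query]:
    """Step 2: drop queries whose domain is on the whitelist."""
    return [q for q in queries if registered_suffix(q[1]) not in whitelist]


def failed_by_client(queries: Iterable[Query]) -> Dict[str, List[str]]:
    """Step 4: group the names of failed queries by the client that sent them."""
    groups: Dict[str, List[str]] = {}
    for client, name, succeeded in queries:
        if not succeeded:
            groups.setdefault(client, []).append(name)
    return groups


if __name__ == "__main__":
    traffic = [
        ("10.0.0.5", "www.example.com", True),
        ("10.0.0.5", "qzkxvbtr.net", False),
        ("10.0.0.7", "mail.example.com", True),
        ("10.0.0.5", "plmwqazx.org", False),
    ]
    remaining = filter_whitelisted(traffic, {"example.com"})
    print(failed_by_client(remaining))
```

         Filtering first keeps the later, more expensive feature extraction limited to names that are not already known to be legitimate.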
         To train the model, the Decision Tree classifier was used with the AdaBoost algorithm.
         The pseudocode of the AdaBoost algorithm is given below:
         Given: $(x_1, y_1), \dots, (x_m, y_m)$, where $x_i \in X$, $y_i \in Y = \{-1, +1\}$.
         Initialize $D_1(i) = \frac{1}{m}$, $i = 1, \dots, m$.
         For every $t = 1, \dots, T$:
         Find the classifier $h_t : X \to \{-1, +1\}$ that minimizes the weighted classification error:

                  $h_t = \arg\min_{h_j \in H} e_j$, where $e_j = \sum_{i=1}^{m} D_t(i)\,[y_i \ne h_j(x_i)]$.

         If $e_t \ge 0.5$, the algorithm stops.
         Select $\alpha_t \in R$, $\alpha_t = \frac{1}{2} \ln \frac{1 - e_t}{e_t}$, where $e_t$ is the weighted error of the classifier $h_t$.
         Update the distribution:

                  $D_{t+1}(i) = \frac{D_t(i)\, e^{-\alpha_t y_i h_t(x_i)}}{Z_t}$,                       (10)

         where $Z_t$ is the normalization parameter, selected so that $D_{t+1}$ is a probability
distribution, that is, $\sum_{i=1}^{m} D_{t+1}(i) = 1$.
         Build the resulting classifier:

                  $H(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right)$.                         (11)

         The expression for updating the distribution must be designed in such a way that the following
condition is met:

                  $e^{-\alpha_t y_i h_t(x_i)} \begin{cases} < 1, & y_i = h_t(x_i) \\ > 1, & y_i \ne h_t(x_i) \end{cases}$                   (12)

       Thus, after selecting the optimal classifier $h_t$ for the distribution $D_t$, the objects $x_i$ that the
classifier identifies correctly have smaller weights than those it identifies incorrectly.
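   The pseudocode above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the RapidMiner operator used in the study: the weak learners are one-level decision trees (stumps) found by exhaustive search, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def stump_predict(stump: Stump, x: Sequence[float]) -> int:
    """Predict `polarity` when the feature is at or below the threshold, else its negation."""
    feature, threshold, polarity = stump
    return polarity if x[feature] <= threshold else -polarity


def best_stump(X, y, D) -> Tuple[Stump, float]:
    """Exhaustively pick the stump h_t with the smallest weighted error e_t."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (+1, -1):
                stump = (feature, threshold, polarity)
                err = sum(d for row, label, d in zip(X, y, D)
                          if stump_predict(stump, row) != label)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost_fit(X, y, rounds: int = 10) -> List[Tuple[float, Stump]]:
    m = len(X)
    D = [1.0 / m] * m                          # D_1(i) = 1/m
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, D)
        if err >= 0.5:                         # no better than chance: stop
            break
        # The max() guard avoids division by zero for a perfect stump.
        alpha = 0.5 * math.log((1.0 - err) / max(err, 1e-10))
        ensemble.append((alpha, stump))
        if err == 0:                           # nothing left to reweight
            break
        # Equation (10): reweight, then divide by Z_t so the weights sum to 1.
        D = [d * math.exp(-alpha * label * stump_predict(stump, row))
             for row, label, d in zip(X, y, D)]
        Z = sum(D)
        D = [d / Z for d in D]
    return ensemble


def adaboost_predict(ensemble, x) -> int:
    """Equation (11): sign of the alpha-weighted vote of the weak classifiers."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

   On the three-point set `X = [[1.0], [2.0], [3.0]]`, `y = [1, -1, 1]`, no single stump is correct on all points, but three boosting rounds combine three stumps into a classifier that labels all three points correctly.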
   In the initial step, DNS traffic is obtained by monitoring the network through the SPAN (Switched
Port Analyzer) port of the network switch, which duplicates packets from one or more ports to a
separate port. For each domain name, the value of the weighted estimate of frequency lexical analysis
is then determined.
   The weighted estimate 𝑊𝑑𝑜𝑚 of the frequency lexical analysis of DNS domain names is then used by
the Decision Tree classifier and the AdaBoost algorithm as one of the features indicating the use of
the "domain flux" evasion technology.
   Then all selected and analyzed data from the filtered DNS traffic are combined into a set of features,
which will allow detecting botnets that use DNS, based on the Decision Tree classifier with the
application of the AdaBoost algorithm.
    Conversion and normalization are applied to all the collected and grouped data, resulting in a set
of features for each domain name:

                               𝑉𝑗 = {𝑁𝑑𝑜𝑚,𝑗, 𝑆𝑏𝑖𝑡,𝑗 , 𝑇𝑡𝑡𝑙,𝑗 , 𝐿𝑑𝑜𝑚,𝑗 , 𝑁𝑛𝑢𝑚,𝑗 , 𝑊𝑑𝑜𝑚,𝑗 },

   where 𝑗 ∈ 𝑅′ – the index of a DNS message in the set 𝑅′ of DNS messages collected after filtering.
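   A minimal sketch of assembling and normalizing the feature set is shown below. The text does not specify the normalization, so min-max scaling is an assumption here, as are the function names and the choice to pass 𝑁𝑑𝑜𝑚, the success flag, the TTL, and 𝑊𝑑𝑜𝑚 in as precomputed values.

```python
from typing import List


def build_features(name: str, n_dom: int, success: bool, ttl: int, w_dom: float) -> List[float]:
    """Assemble V_j = {N_dom, S_bit, T_ttl, L_dom, N_num, W_dom} for one domain name."""
    return [
        float(n_dom),                              # N_dom: names sharing the IP address
        1.0 if success else 0.0,                   # S_bit: success of the DNS query
        float(ttl),                                # T_ttl: TTL value
        float(len(name)),                          # L_dom: length of the domain name
        float(sum(ch.isdigit() for ch in name)),   # N_num: digits in the domain name
        w_dom,                                     # W_dom: weighted lexical estimate
    ]


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Rescale every feature column to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    spans = [max(column) - min(column) for column in columns]
    return [[(value - low) / span if span else 0.0
             for value, low, span in zip(row, lows, spans)]
            for row in rows]


if __name__ == "__main__":
    rows = [
        build_features("example.com", 1, True, 3600, 0.08),
        build_features("x7k2q9zt1.net", 40, False, 60, 0.02),
    ]
    for row in min_max_normalize(rows):
        print(row)
```

   Scaling the columns to a common range matters less for tree-based classifiers than for distance-based ones, but it keeps the feature table uniform for inspection.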
   The last step is to form conclusions based on the analysis of the feature set 𝑉𝑗 of each domain
name by the AdaBoost machine learning algorithm. Since the algorithm performs binary classification,
the output for each requested domain name takes one of two values: a malicious request from a bot
or a benign request.
4.2.   Implementation of the Botnet Detection system
   In order to verify the effectiveness of the proposed method, a botnet identification system was
implemented, which is based on the use of the RapidMiner open source platform [19].
   RapidMiner is an integrated environment for data processing in large information arrays, machine
learning, text analytics and construction of predictive models, as well as for solving applied and
scientific problems.
   The model of the botnet detection system designed in the RapidMiner environment is presented in
Figure 1.


Figure 1: Model of the botnet detection system designed in the RapidMiner environment

   Subsystem of application of the AdaBoost algorithm is shown in Figure 2.


Figure 2: Subsystem of application of the AdaBoost algorithm
   The process of strengthening the Decision Tree classifier by the AdaBoost algorithm is shown in
Figure 3. Parameters for Decision Tree evaluation are presented in Figure 4.


Figure 3: The process of strengthening the Decision Tree classifier by the AdaBoost algorithm


Figure 4: Parameters for Decision Tree evaluation

4.3. Experimental studies of the effectiveness of the botnet detection system
   To assess the effectiveness of the proposed botnet detection method, a number of experiments were
carried out. The experimental environment is based on the framework described in [20-23].
   To ensure unbiased results at the training stage, the dataset [24] was divided into two parts.
   The first is 75% for training, and the remaining 25% is used to check the correctness of the system.
   A total of 19,500 domain names (components of the training sample) were analyzed, among which
14,625 (75%) were selected for training, and 4,875 (25%) were used to verify correctness.
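   The split described above can be reproduced with a short sketch. The study performed it inside RapidMiner; the shuffling, the fixed seed, and the function name `split_dataset` are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence, train_fraction: float = 0.75,
                  seed: int = 0) -> Tuple[List, List]:
    """Shuffle a copy of `items` and cut it into training and verification parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train, verify = split_dataset(range(19500))
    print(len(train), len(verify))  # 14625 4875
```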
   A total of 9,969 requests were used as input to the experiments. Of these, 306 were requests from
bots of a botnet using the "domain flux" technology.
   The system correctly identified 9,775 requests, which is 98.05% of the total.
   The numbers of correctly and incorrectly identified requests produced by the system are presented
in Figure 5.
   Thus, the proposed method demonstrated the possibility of identifying botnets using "domain flux"
technology with high reliability (98.05%).
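   The reported percentage follows directly from the counts given above, as the short check below shows. The function name is hypothetical; the two counts are taken from the text.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Share of correctly identified requests, in percent."""
    return 100.0 * correct / total


total_requests = 9969      # requests used as input (from the text)
correct_requests = 9775    # correctly identified requests (from the text)

print(f"accuracy: {accuracy_percent(correct_requests, total_requests):.2f}%")  # accuracy: 98.05%
print(f"misidentified: {total_requests - correct_requests}")                   # misidentified: 194
```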


Figure 5: Results of botnet detection system

5. Conclusions
   A method of identifying botnets that use DNS has been developed. The study
obtained the following scientific results.
   The peculiarities of the functioning of botnets that use DNS were investigated, taking into account
the domain name system. Modern methods and means of identifying botnets based on DNS traffic
were analyzed in order to determine ways to increase the efficiency of botnet detection.
   The corresponding model of the botnet was developed taking into account the domain name system,
together with the DNS traffic model and the model of the process of botnet detection based on the
analysis of DNS traffic.
   An improved method of identification of botnets that use DNS by filtering DNS traffic using the
accumulated database of white lists, collecting signs of DNS traffic and analyzing them based on the
Decision Tree classifier with the application of the AdaBoost algorithm has been developed.
   A system for identifying botnets has been developed that will ensure an increase in the reliability
and efficiency of the process of detecting botnets that use DNS.
   Experimental research demonstrated the ability of the proposed method to identify botnets that use
DNS with high reliability (up to 98.05%).

6. References
[1] Check point software cyber security report 2022. URL: https://www.ntsc.org (accessed on
    February 1, 2022).
[2] Wang Zishuo, Wang Chunyang, Ding Lianghua, Wang Zeng, Liang Shuning, Parameter
    identification of fractional-order time delay system based on Legendre wavelet, Mechanical
    Systems and Signal Processing, Volume 163, 2022, 108141, ISSN 0888-3270,
    https://doi.org/10.1016/j.ymssp.2021.108141.
[3] Zhang, Wang Huiqin, J Wang Chun, Meng Xudong Chen,. Integration of cuckoo search and fuzzy
    support vector machine for intelligent diagnosis of production process quality. Journal of Industrial
    & Management Optimization. 2017. 13. 10.3934/jimo.2020150.
[4] S.N.Thanh, M.Stege, P.I.El-Habr, J.Bang, N.Dragoni, Survey on botnets: Incentives, evolution,
    detection and current trends. Future Internet, 2021, 13(8), 198.
[5] G.Suchacka, A.Cabri, S.Rovetta, F.Masulli, Efficient on-the-fly Web bot detection. Knowledge-
    Based Systems, 2021, 223,107074.
[6] RFC 1034, Domain Names - Concepts and Facilities.
[7] Lekssays, A., Landa, L, Carminati, B., Ferrari, E. PAutoBotCatcher: A blockchain-based privacy-
    preserving botnet detector for Internet of Things. Computer Networks. 2021, 200,108512.
[8] J.Shi., Y.-B.Leau, K. Li, J.H.Obit. A comprehensive review on hybrid network traffic prediction
    model. International Journal of Electrical and Computer Engineering, 2021, 11(2), pp. 1450-1459.
[9] D.Truong, G.Cheng Detecting domain‐flux botnet based on DNS traffic features in managed
     network. Security Comm. Networks. 2016. P. 2338–2347. DOI: 10.1002/sec.1495.
[10] O.Olowoyo, P. Owolawi, Malware Classification using Deep Learning Technique. 2020 2nd
     International Multidisciplinary Information Technology and Engineering Conference, IMITEC
     2020, 9334071.
[11] C.Nafari, E.Mahdipoor, Sayed Javadi H.Hajj Detection of active botnets based on DNS traffic
     analysis. Journal of Advances in Computer Engineering and Technology. 2019, Vol. 5, No. 3. P.
     129–142.
[12] M.Pereira, S. Yu B.Coleman, M.D.Cock, A.C. Nascimento. Dictionary Extraction and Detection
     of Algorithmically Generated Domain Names in Passive DNS Traffic. 21st International
     Symposium : RAID 2018. Heraklion, September 10-12, 2018. DOI: 10.1007/978-3-030-00470-
     5_14.
[13] E.Agyepong, W.J.Buchanan, K.Jones. Detection of Algorithmically Generated Malicious Domain
     Using Frequency Analysis. International Journal of Computer Science and Information
     Technology. 2018. P. 91–111 DOI: 10.5121/ijcsit.2018.10306.
[14] T.Wang, L.C.Seidenberg. Detecting Algorithmically Generated Domains Using Data
     Visualization and N-Grams Methods. Proceedings of Student-Faculty Research Day, CSIS, Pace
     University, May 5, 2017.
[15] J.Spooren, D.Preuveneers, L.Desmet, P.Janssen, W.Joosen. Detection of algorithmically generated
     domain names used by botnets: a dual arms race. SAC '19: Proceedings of the 34th ACM/SIGAPP
     Symposium on Applied Computing. 2019. P. 1916–1923. DOI:
     https://doi.org/10.1145/3297280.3297467.
[16] H.Qin, J.Yang, X.Luo, Z.Li, Q.Guo, Research on DNS anomaly detection technology based on
     multiple features. Journal of Shenzhen University Science and Engineering. 2020, 37. pp. 36-43.
[17] X.Guo, Z.Pan, Y.Chen, Application of Passive DNS in Cyber Security. Proceedings of 2020 IEEE
     International Conference on Power, Intelligent Computing and Systems, CPICS 2020, pp. 257-
     259, 9202344.
[18] G.Suchacka, J.Iwański. Identifying legitimate Web users and bots with different traffic profiles -
     an Information Bottleneck approach. Knowledge-Based Systems, 2020, 197, 105875.
[19] RapidMiner’s data science platform. https://rapidminer.com/ (accessed on February 1, 2022).
[20] Tomas Sochor, Nadezda Chalupova. Interpersonal Internet Messaging Prospects in Industry 4.0
     Era. In: Recent Advances in Soft Computing and Cybernetics. Springer, Cham, 2021. p. 285-295.
[21] O.Savenko, S.Lysenko, A.Kryschuk Multi-agent based approach of botnet detection in computer
     systems. Communications in Computer and Information Science. 2012. Vol. 291. PP.171-180,
     ISSN: 1865-0929.
[22] Oleg Savenko, Sergii Lysenko, Andrii Kryshchuk, Yuriu Klots. Botnet detection technique for
     corporate area network. Proceedings of the 7-th IEEE International Conference on Intelligent Data
     Acquisition and Advanced Computing Systems: Technology and Applications, Berlin (Germany),
     September 12–14, 2013. Berlin, 2013. Pp. 363–368. ISBN 978-1-4799-1426-5.
[23] Sergii Lysenko, Kira Bobrovnikova, Serhii Matiukh, Ivan Hurman, Oleg Savenko. Detection of
     the botnets’ low-rate DDoS attacks based on self-similarity. International Journal of Electrical and
     Computer Engineering. 2020. Vol. 10., №4. PP.3651-3659, ISSN: 2088-8708.
[24] O. Savenko, A. Nicheporuk, I. Hurman, S. Lysenko. Dynamic signature-based malware detection
     technique based on API call tracing. CEUR-WS. 2019. Vol. 2393. P.633-643, ISSN: 1613-0073.
[25] Canadian Institute for Cybersecurity. Botnet dataset, https://www.unb.ca/cic/datasets/botnet.html
     (accessed 15.01.2022).
[26] IoT dataset. URL: https://github.com/thieu1995 /iot dataset (accessed on 15.01.2022).
[27] IoTPOT dataset. URL:https://ipsr.ynu.ac.jp/iot /index. html# datasets (accessed on 15.01.2022).