<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ML-trained model and method for blocking dangerous requests</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tetiana Korobeinikova</string-name>
          <email>tetiana.i.korobeinikova@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nazar Kravchuk</string-name>
          <email>nazar.v.kravchuk@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>12 Stepan Bandera str., 79000 Lviv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>This work aimed to develop a trained machine-learning model that automatically detects and blocks dangerous requests in cybersecurity systems. The main classification algorithms were analyzed during the study: logistic regression, support vector machine, decision trees, and deep neural networks. For each algorithm, models were trained on a set of network traffic data, which allowed us to automate the process of classifying and detecting dangerous requests in real time. The models were trained in Matlab on the Network Traffic Dataset. The study's main results were a comparison of the effectiveness of the different algorithms using metrics such as accuracy, F-score, precision, and recall. In particular, the decision tree-based model demonstrated the highest level of accuracy, which ensured the effective detection of known and new types of attacks. In addition, methods of blocking dangerous queries are considered. The supervised learning method accurately detects known threats but requires significant computing resources. Signature-based methods have shown effectiveness in detecting known attacks, but they are unable to detect new, unknown threats. Behavioral analysis and anomaly detection are good at detecting new attacks but can be circumvented by encryption or polymorphic code. The blacklist and whitelist methods are easy to implement but require regular updates of the lists and are not always effective against new threats. Overall, the results obtained confirm the high efficiency of the proposed methods and models in real conditions and the possibility of their integration into modern cybersecurity systems to automate the protection process.</p>
      </abstract>
      <kwd-group>
        <kwd>vulnerability</kwd>
        <kwd>data protection</kwd>
        <kwd>cyberattacks</kwd>
        <kwd>data classification</kwd>
        <kwd>selection automation</kwd>
        <kwd>machine learning</kwd>
        <kwd>LR</kwd>
        <kwd>SVM</kwd>
        <kwd>DT</kwd>
        <kwd>DNN</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Modern cybersecurity threats require the use of effective methods to detect and block dangerous
requests automatically. Machine learning (ML) plays a key role in this process, allowing the
creation of automated classifiers to analyze traffic and identify potentially malicious activities. The
main classification algorithms include logistic regression (LR), which estimates the probability of a
query belonging to a certain class; support vector machine (SVM), which finds the optimal
hyperplane for class separation; decision trees (DT), which build a hierarchical decision-making
structure based on rules; and deep neural networks (DNN), which can handle complex
dependencies in large amounts of data. However, the effective application of these algorithms
requires their software implementation, in particular, the creation of an automated classification
model that will accurately identify dangerous queries in real time. In addition, it is important to
analyze and improve existing methods of blocking dangerous requests to increase the effectiveness
of cyber defense by combining ML approaches with traditional methods.</p>
      <p>The goal of this work is to develop a trained model using ML algorithms, which was not
implemented in the reviewed works. The tasks were to analyze classification algorithms and
develop their architectures, train an automated classification model based on these algorithms,
and analyze existing methods for blocking dangerous queries.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature overview</title>
      <p>
        For a better understanding, similar works should be considered. For example, V. Gnatyuk et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
developed a data model to improve the cybersecurity of content management systems, which
allows identifying vulnerabilities and taking preventive measures to increase the level of
protection. Researchers A. Nafiiev et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] provided an optimized model for malware detection
using ML techniques. The study compared several models, of which the model based on binary data
representation demonstrated the highest performance in terms of F-score, precision, and recall.
Moreover, A. Yanko et al. developed a model for detecting low-rate distributed denial-of-service
(DDoS) attacks using ML in software-defined networking [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The authors emphasized the importance of
online classifiers. They demonstrated the high efficiency of the proposed model, which achieved an
average detection rate of 99.7% for normal and DDoS traffic, which outperforms the results of
previous models.
      </p>
      <p>
        On the other hand, A. González Álvarez et al. conducted a study of the impact of optimization
on the performance of pre-trained ML models for image classification [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It was found that
dynamic quantization significantly reduces inference time and energy consumption, making it a
good choice for scalable systems, while global model pruning causes high costs due to longer
optimization time. As for the work of I.G.A. Mulyawarman et al. in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], they considered policies for
blocking dangerous content on the Internet. Policies for blocking dangerous requests should be
sensitive to human rights and not violate the principles of democratic freedoms. In addition, H.
Zhao et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] proposed a continual forgetting technique for trained models that effectively
removes unwanted information without significantly affecting the rest of the model’s knowledge.
This is an important solution for maintaining privacy and security, as it allows keeping the model
free of dangerous or sensitive data while minimizing the negative impact on the rest of the
functionality. At the same time, the authors C. Sun et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] have developed a framework for
blocking entity resolution tasks based on pre-trained language models. This approach effectively
filters comparisons and speeds up the entity resolution process, demonstrating an advantage in
working with textual and noisy data.
      </p>
      <p>
        In turn, A. Vrincean [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] compared traditional blocking approaches to query processing with
newer methods, particularly non-blocking I/O. The author investigated how these methods can be
used to improve server efficiency in the context of scalability of query processing, where
non-blocking technologies significantly reduce the overhead of creating and maintaining threads,
especially with high I/O requirements. The study by J. R. García et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] presents the THREAD
architecture, which allows medical data collection while protecting users from dangerous actions.
THREAD also allows tracking the origin of data and its use, which is an important aspect of the
ethical use of personal information for training ML models in the healthcare industry. Additionally,
D. Papathanasiou et al. proposed the MYRTO methodology [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which allows for optimal
placement of Internet of Things (IoT) data in Pervasive Edge Computing ecosystems, considering
processing efficiency and latency reduction, which is important when using ML models to process
data requests in cybersecurity. This helps to improve the efficiency of data processing and
classification of dangerous requests on different network nodes. Additionally, Brydinskyi et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
conducted a comprehensive comparison of modern deep learning models for speaker verification,
highlighting the effectiveness of advanced neural architectures in improving accuracy and
robustness, which can be analogously applied to the classification of dangerous queries in
cybersecurity contexts. Furthermore, recent works by Shevchuk et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] emphasize the
importance of designing secured services for authentication, authorization, and accounting,
providing a foundational approach to protecting systems against unauthorized or harmful requests
through robust security policies and mechanisms.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Initial information that establishes the research</title>
      <p>
        The study examined the main classification algorithms used to detect and block dangerous requests
automatically in cybersecurity systems. LR, SVM, DT, and DNN are analyzed. For each algorithm, a
general description and examples of use in the field of cybersecurity are presented [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], as well as
their query classification schemes. As part of the LR analysis, a model was built based on such
parameters as Internet Protocol (IP), Hypertext Transfer Protocol (HTTP), and principal component
analysis (PCA) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. For SVM, the algorithm was tested on a sample of network traffic to determine
its effectiveness in detecting attacks [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In the case of DT, the accuracy of query classification was
tested using the Uniform Resource Locator (URL) as a key classification parameter [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. For DNN, a
multilayer neural network was built, and the model was trained and tested to evaluate its
effectiveness in detecting threats [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        We used Matlab and the Network Traffic Dataset, obtained from the Kaggle platform in
comma-separated values (CSV) format, to implement the automated query classification [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The data
were pre-loaded, cleaned, and transformed. The training (80%) and test (20%) samples were used to
ensure the proper quality of training ML models (LR, SVM, DT, DNN). For DNN, the model
architecture was defined, and training was performed with visualization of the training process.
After outputting the DNN training result for data classification, the effectiveness of LR, SVM, and
DT models was compared. Then, the prediction was performed, and the accuracy of the
classification models was evaluated using the accuracy, F-score, precision, and recall metrics.
      </p>
      <p>
        In addition, the study examined methods of blocking dangerous queries [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], analyzing their advantages and
disadvantages to assess the feasibility of combining approaches [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. The supervised learning
method is implemented using dynamic classifier selection (DCS), where the optimal classifier is
selected for each query according to [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]
      </p>
      <p>C′ = arg max_{Ci ∈ C} confidence(Ci, x) (1)</p>
      <p>where C′ is the optimal classifier selected to process the query; C = {C1, C2, ..., Cn} is the set of
available classifiers; Ci is an individual classifier from the set; x is the input query to be classified;
confidence(Ci, x) is the confidence function of classifier Ci for the query x, which evaluates how
well this classifier suits the current query.</p>
      <p>
        For signature blocking, we used query verification against a set of known attack signatures, which is
formalized by [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]
      </p>
      <p>S(x) = { 1, if x ∈ S; 0, otherwise } (2)</p>
      <p>where x is an input request; S is the set of known attack signatures.</p>
      <p>
        In turn, the method of behavioral analysis and anomaly detection was based on assessing the
deviation of a new request from the mean value of the profile, which is determined by [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]
      </p>
      <p>D = |x_new − μ| / σ (3)</p>
      <p>where D is the distance between the new query and the mean value of the profile; x_new is the
value of the new query; μ is the mean value of the profile; σ is the standard deviation of the
profile.</p>
      <p>
        Moreover, the blacklist and whitelist methods were implemented using the following rule [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]
      </p>
      <p>B(x) = { 1, if x ∈ W; 0, if x ∈ B } (4)</p>
      <p>where W is the set of allowed queries (whitelist); B is the set of prohibited queries (blacklist).</p>
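      <p>The blocking rules above (dynamic classifier selection, the signature rule S(x), the anomaly distance D, and the list rule B(x)) can be sketched in Python; the study itself implemented them in Matlab, and the classifier objects, signatures, and lists below are toy illustrations, not the study's implementation:</p>

```python
# Illustrative Python sketch of the four blocking rules; all data is toy data.

def select_classifier(classifiers, x):
    """Dynamic classifier selection: choose the classifier whose
    confidence function scores highest for query x."""
    return max(classifiers, key=lambda c: c["confidence"](x))

def signature_match(x, signatures):
    """Signature rule S(x): 1 if the query matches a known attack signature, else 0."""
    return 1 if x in signatures else 0

def anomaly_distance(x_new, mu, sigma):
    """Anomaly distance D: normalized deviation of a new query from the profile mean."""
    return abs(x_new - mu) / sigma

def list_filter(x, whitelist, blacklist):
    """List rule B(x): 1 (allow) if whitelisted, 0 (block) if blacklisted,
    None if the query appears in neither list."""
    if x in whitelist:
        return 1
    if x in blacklist:
        return 0
    return None  # unknown query: fall through to other methods

# Toy usage
signatures = {"' OR 1=1 --", "<script>alert(1)</script>"}
classifiers = [{"name": "LR",  "confidence": lambda x: 0.6},
               {"name": "SVM", "confidence": lambda x: 0.8}]
print(signature_match("' OR 1=1 --", signatures))            # 1
print(anomaly_distance(120.0, 100.0, 10.0))                  # 2.0
print(select_classifier(classifiers, "GET /index")["name"])  # SVM
```

      <p>In practice the three rules complement each other: unknown queries that pass the list and signature checks can still be flagged when their anomaly distance exceeds a profile threshold.</p>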
    </sec>
    <sec id="sec-4">
      <title>4. Analysis of machine learning classification algorithms</title>
      <p>As information technology advances, the number of cyberattacks aimed at unauthorized access to
confidential data is growing. Information protection is becoming one of the key challenges for
cybersecurity systems, requiring the use of innovative approaches to detecting and neutralizing
threats. One of the most promising areas is the use of ML methods and algorithms that automate
the process of classifying and blocking dangerous requests. In cybersecurity, this helps to improve
the accuracy of detecting potential attacks, minimize the risk of false blocking, and adapt
algorithms to new types of threats. In this context, developing an effective ML-trained model for
analyzing queries and determining their danger is a pressing task that significantly impacts
improving information systems’ protection level. Choosing an optimal ML classifier to implement
an automated mechanism for blocking dangerous queries is important. Modern classification
algorithms, such as LR, SVM, DT, and DNN, have different characteristics that affect the
performance of a cybersecurity system (Table 1). LR is one of the basic ML algorithms used for
binary classification. Due to its mathematical simplicity and effectiveness in cases where the data
has linearly separated classes, LR allows you to quickly identify potentially malicious queries and
take measures to block them.</p>
      <sec id="sec-4-1">
        <title>Description</title>
        <p>An algorithm for binary classification
that predicts the probability of an object
belonging to a certain category based on
independent variables.</p>
        <p>A classification method that uses
hyperplanes to divide data into classes,
capable of working with linearly and
nonlinearly separated data.</p>
        <p>An algorithm that builds a decision tree
for classification, where each node is a
certain feature, and the branches are
possible values.</p>
        <p>A network of neurons with many layers
that can learn complex, non-linear
patterns in data.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Example of use in сybersecurity It is used for basic classification tasks (e.g., identifying suspicious requests as ‘attacker’ or ‘normal’).</title>
      </sec>
      <sec id="sec-4-3">
        <title>It is used to classify malicious queries based on large datasets with many features and non-linear relationships.</title>
      </sec>
      <sec id="sec-4-4">
        <title>It is used to detect malicious requests, where easy allows requests to be blocked.</title>
      </sec>
      <sec id="sec-4-5">
        <title>Suitable for detecting complex</title>
        <p>anomalies in large semantic data sets,
such as real-time threat analysis.</p>
        <p>The main limitation of LR is its poor ability to handle complex non-linear dependencies between
features. However, its use is justified in cases where it is necessary to quickly assess the risk of
threats based on key request parameters, such as the source IP address, HTTP request structure,
frequency of requests, etc. For a better understanding, consider the process of classifying requests
using LR in a cybersecurity system (Figure 1).</p>
        <p>The provided scheme starts by processing input data, such as query parameters or user
behavior. This data is passed to the model to calculate the probability that the request is malicious
or normal. The result is a threat probability that is compared to a predefined threshold. If the
calculated probability exceeds this threshold, the request is classified as malicious and is subject to
blocking. This model allows you to quickly and accurately detect suspicious requests, reducing the
risk of cyberattacks.</p>
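        <p>As a minimal illustration of this flow, the probability-plus-threshold step reduces to a sigmoid over weighted request features; the study trained LR in Matlab, so the Python sketch below, with invented weights, bias, and features, is illustrative only:</p>

```python
import math

def logistic_score(features, weights, bias):
    """Predicted probability that a request is malicious: sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def classify_request(features, weights, bias, threshold=0.5):
    """Block the request when the predicted threat probability exceeds the threshold."""
    p = logistic_score(features, weights, bias)
    return ("block", p) if p > threshold else ("allow", p)

# Toy example: features = [request frequency, suspicious-header flag]
weights, bias = [0.8, 2.5], -3.0
print(classify_request([1.0, 1.0], weights, bias)[0])  # block
print(classify_request([0.5, 0.0], weights, bias)[0])  # allow
```

        <p>The threshold mirrors the predefined value in the scheme: raising it reduces false blocking at the cost of letting more borderline requests through.</p>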
        <p>On the other hand, SVM algorithm is a powerful classification method that uses hyperplanes to
divide data into classes. The advantage of SVM is the ability to efficiently process both linearly and
nonlinearly separated data. SVM also has the ability to adapt to changes in the types of attacks and
queries, which makes it a reliable tool for protecting information systems from new threats. In the
process of classification using SVM, an optimal hyperplane is first constructed to maximize the
distance between classes. This allows you to separate malicious queries from normal ones
effectively. After that, the query is compared with this hyperplane to determine its class. It is worth
considering the classification of queries using the SVM algorithm (Figure 2).</p>
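        <p>The comparison of a query with the hyperplane can be sketched for the linear case as the sign of w &#183; x + b; the weights and bias below are invented for illustration and do not come from the study's Matlab models:</p>

```python
def hyperplane_side(x, w, b):
    """Which side of the separating hyperplane the query's feature
    vector falls on: the sign of w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_classify(x, w, b):
    """Label the query by comparing it with the hyperplane:
    positive side -> 'attack', otherwise 'normal'."""
    return "attack" if hyperplane_side(x, w, b) > 0 else "normal"

# Toy hyperplane (weights and bias are illustrative)
w, b = [1.5, -0.5], -1.0
print(svm_classify([2.0, 1.0], w, b))  # attack
print(svm_classify([0.5, 1.0], w, b))  # normal
```

        <p>In a trained SVM, w and b are chosen to maximize the margin between the two classes, which is what makes the side-of-hyperplane test a reliable decision rule.</p>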
        <p>First, the input query with extracted features goes through a preprocessing stage, where the
data is normalized and, if necessary, PCA is applied. Next, a hyperplane is constructed to maximize
the distance between classes, allowing for a clear query distribution. After that, the query is
compared to the hyperplane, and based on this comparison, it is determined whether the query is
malicious (attack) or normal. If a malicious request is detected, it is blocked, while a normal request
is allowed to be processed further. In turn, the DT algorithm works on the principle of sequentially
dividing data into subgroups depending on the values of certain characteristics. This makes it
possible to clearly distinguish between normal and malicious queries using a hierarchical
decision-making structure. The main advantage of DT is its interpretability, which makes it easy to
track the logic of classification and threat identification. The disadvantage may be a tendency to overfit,
especially if the tree becomes too deep. For a more detailed analysis, consider the classification of
requests using DT (Figure 3).</p>
        <p>The first step is to receive an HTTP request and extract key characteristics from it. If the
address is known and safe, the risk level is checked. If the URL is suspicious, the request can be
blocked. Finally, if all checks are successful, the request is allowed, otherwise it is blocked. In
addition, DNNs can also efficiently analyze large data sets and detect complex, non-linear patterns.</p>
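        <p>The hierarchical URL checks just described can be written out as a small rule cascade; the field names and the risk threshold below are hypothetical, chosen only to mirror the decision flow:</p>

```python
def classify_http_request(request):
    """Rule cascade sketch of the DT flow: suspicious URL -> block,
    unknown address -> block, high risk -> block, otherwise allow.
    Field names and the 0.5 threshold are illustrative assumptions."""
    if request["url_suspicious"]:
        return "block"
    if not request["url_known_safe"]:
        return "block"
    if request["risk_level"] >= 0.5:
        return "block"
    return "allow"  # all checks passed

print(classify_http_request({"url_suspicious": False,
                             "url_known_safe": True,
                             "risk_level": 0.2}))  # allow
```

        <p>A trained DT learns such thresholds and the order of checks from data rather than having them hand-written, which is what the interpretability of the model exposes.</p>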
      <p>Due to their multi-layered architecture, DNNs can accurately classify cybersecurity requests, adapt to
new threats, and minimize false blocking. One of the key advantages of DNNs is the ability to
automatically extract features, which allows you to find hidden correlations in the input data. This
algorithm can detect complex attacks that are difficult to identify using traditional methods.
However, DNN requires significant computing resources and may have problems with the
interpretability of solutions, which is an important aspect of cybersecurity systems. To understand
how query classification works using DNNs, we should analyze this process (Figure 4).</p>
        <p>The system receives a request to be verified. After that, the key characteristics of the request are
extracted, such as the IP address, URL structure, HTTP headers, request content, etc. Next, the data
is normalized and converted into a format suitable for neural network processing. A multilayer
model processes data by passing through several hidden layers that are trained on a large sample.
DNN determines the probability of a request belonging to a certain class: “malicious” or “normal”.
A request is blocked if it is identified as dangerous; if not, it is passed on for further processing.</p>
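        <p>The forward pass through such a network, hidden ReLU layers followed by a softmax over the two classes, can be sketched in plain Python; the tiny toy weights below stand in for a trained model and are not the study's Matlab network:</p>

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def dense(x, W, b):
    """Fully connected layer: y_i = sum_j W[i][j] * x[j] + b[i]."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def dnn_forward(x, layers):
    """Pass normalized request features through hidden ReLU layers and a
    softmax output, yielding [P(normal), P(attack)]."""
    for W, b in layers[:-1]:
        x = relu(dense(x, W, b))
    W, b = layers[-1]
    return softmax(dense(x, W, b))

# Toy 2-feature network: one hidden layer, then a 2-class output
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
          ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])]
print(dnn_forward([1.0, 0.0], layers))  # two probabilities summing to 1
```

        <p>Training adjusts the weight matrices so that the softmax probabilities separate the &#8216;malicious&#8217; and &#8216;normal&#8217; classes; the request is then blocked when the attack probability dominates.</p>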
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Development and training of an automated classification model</title>
      <p>
        For the practical implementation of automated query classification, it is necessary to implement an
effective mechanism for training the ML model. For this purpose, we propose to use the Matlab tool
and the Network Traffic Dataset [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. First, the dataset is loaded from a CSV file using readtable, keeping the variable
names unchanged, for training and testing the models (Figure 5). Next,
the rmmissing function removes rows with missing values. The code then checks all variables and,
if they are text (cell type), converts them to categorical variables and indexes them to replace text
values with numeric values.
      </p>
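      <p>The described preprocessing (readtable, rmmissing, categorical indexing, and the cvpartition split) can be re-expressed in Python for readers without Matlab; the column names and toy rows below are illustrative, not the actual Network Traffic Dataset schema:</p>

```python
import random

def prepare(rows, train_frac=0.8, seed=42):
    """Sketch of the preprocessing pipeline. `rows` is a list of dicts,
    as produced by csv.DictReader; column names are illustrative."""
    # rmmissing equivalent: drop rows containing empty cells
    rows = [r for r in rows if all(v != "" for v in r.values())]
    # categorical indexing: replace text values with integer codes
    for col in rows[0]:
        values = sorted({r[col] for r in rows})
        if any(not v.lstrip("-").replace(".", "", 1).isdigit() for v in values):
            codes = {v: i for i, v in enumerate(values)}
            for r in rows:
                r[col] = codes[r[col]]
    # cvpartition equivalent: shuffled 80/20 holdout split
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Toy usage: 6 rows, one dropped for a missing value, then split 4/1
data = [{"proto": "tcp", "bytes": "100"},
        {"proto": "udp", "bytes": "200"},
        {"proto": "tcp", "bytes": ""},
        {"proto": "udp", "bytes": "50"},
        {"proto": "tcp", "bytes": "75"},
        {"proto": "udp", "bytes": "10"}]
train, test = prepare(data)
print(len(train), len(test))  # 4 1
```
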
      <p>The next step is to generate random class labels for classification (normal traffic or attack). After
that, the data is divided into training (80%) and test (20%) samples using cvpartition. Finally, the
tables are converted into arrays, where X_train and X_test contain the features, and Y_train and
Y_test contain the corresponding class labels. After data preprocessing, the next step is to train the
ML model (Figure 6). Since the amount of data is significant (315309 rows), a subset of 10000 rows
is selected for convenience and speed of training. To do this, we use the randperm
function, which generates random indices that are used to select appropriate samples of features
(X_train_subset) and class labels (Y_train_subset). Next, three classical models are trained: LR using
fitclinear, SVM using fitcsvm, and DT using fitctree. After that, the class labels are converted to the
categorical format so that they can be used for deep learning (DL). Finally, the DNN architecture is
defined, which consists of an input layer (featureInputLayer), two hidden fully connected layers
with 100 and 50 neurons, respectively, ReLU activation functions, an output layer with two
neurons, a softmaxLayer to convert the output values into probabilities, and a classificationLayer for
classification.</p>
      <p>Next, we describe the process of training a DNN for data classification using Matlab, using
training options such as the number of epochs (50), mini-batch size (32), and the ‘sgdm’
optimization algorithm (Figure 7).</p>
      <p>During training, information on each iteration’s accuracy, loss, and time is displayed, allowing
you to track the model’s progress. As you can see, at iteration 1401 out of 15600, the total elapsed
time was 24 seconds, with the model being on the 5th epoch out of 50, and the number of iterations
per epoch was 312. This allows us to estimate the speed of learning and the potential need to
optimize the model parameters.</p>
      <p>As a result, the training process ends when the maximum number of epochs is reached,
displaying model performance metrics at each stage of training (Figure 8). The analysis of these
metrics allows us to assess the stability and convergence rate of the model, as well as to identify
possible over- or under-training problems. Based on this data, optimization parameters such as
learning rate, number of neurons in hidden layers, or mini-packet size can be adjusted to improve
the model’s overall performance.</p>
      <p>Next, the four classification models (LR, SVM, DT, DNN) are predicted on the test dataset, and
the accuracy of each model is calculated (Figure 9). Accuracy is calculated as the ratio of correct
predictions to the total number of test cases. In addition, a confusion matrix is built for the DNN
model, which allows you to visually assess the number of correct and incorrect predictions for each
class.</p>
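      <p>The two evaluation steps, accuracy as the share of correct predictions and a per-class confusion matrix, are straightforward to express outside Matlab; a Python sketch with toy labels (the class names mirror the &#8216;Normal&#8217;/&#8216;Attack&#8217; labels used in the study):</p>

```python
def accuracy(y_true, y_pred):
    """Share of correct predictions over all test cases."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_matrix(y_true, y_pred, labels=("Normal", "Attack")):
    """counts[i][j] = number of cases with true label labels[i]
    predicted as labels[j]."""
    idx = {l: i for i, l in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[idx[t]][idx[p]] += 1
    return counts

# Toy evaluation on four test cases
y_true = ["Normal", "Normal", "Attack", "Attack"]
y_pred = ["Normal", "Attack", "Attack", "Attack"]
print(accuracy(y_true, y_pred))          # 0.75
print(confusion_matrix(y_true, y_pred))  # [[1, 1], [0, 2]]
```

      <p>Off-diagonal cells are exactly the misclassifications discussed below: row &#8216;Normal&#8217;, column &#8216;Attack&#8217; counts normal queries wrongly blocked as attacks.</p>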
      <p>The results obtained from the DNN confusion matrix show that the classification model has
some problems with correctly recognizing queries of the ‘Normal’ class, as a significant proportion
of such queries were misclassified as ‘Attack’ (39358 cases). At the same time, the model does a
good job of classifying queries of the ‘Attack’ class, correctly identifying them as ‘Attack’ (39464
cases). Compared to other methods (LR, SVM, DT), DNN may have better results when configuring
the right parameters and architecture, but it can also suffer from the problem of class imbalance. In
addition, the accuracy of each model is displayed as a percentage. The results show that all four
models (LR, SVM, DT, DNN) have similar accuracies, around 50% (Figure 10).</p>
      <p>It is worth noting that the results for LR, SVM, and DT classification models show different
efficiencies in classifying normal queries and attacks.</p>
      <p>The LR model has a significant number of false positives and false negatives, which indicates
difficulties with accurate classification. The SVM model improves accuracy by reducing the number
of false positives, but there are still errors. At the same time, DT demonstrates the best accuracy
among all models, reducing the number of misclassifications due to its flexibility in class
distribution. The confusion matrices can evaluate these results, which distinguish between
correctly and incorrectly classified queries as ‘Normal’ and ‘Attack’ (Figure 11).</p>
      <p>Thus, LR showed lower efficiency than other models, with many false positive and false
negative classifications. This indicates difficulties in accurately recognizing classes, particularly for
attacks. LR has limited flexibility in class distribution, which may be the main reason for its errors.
On the other hand, SVM has shown improvement over LR by reducing the number of false
classifications. However, like LR, the SVM model still makes mistakes when classifying some
queries, especially for the Normal class. It does a better job of distinguishing between classes due to
the flexible use of the hyperplane. In turn, DT proved to be the most effective among the three
models, reducing the number of false classifications due to its ability to adapt flexibly to the data.
The DT model demonstrated better accuracy in recognizing both Normal and Attack queries. This
may indicate that DT is better suited for solving problems with more complex relationships
between classes. In addition, DNN showed similar results to the other methods but was the most
susceptible to the problem of class imbalance. Although DNN can achieve better results when
configured with the right parameters and architecture, its effectiveness is limited due to the need
for DL and the difficulty in choosing the optimal settings.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion</title>
      <p>
        The results of this study are focused on the development of a trained ML model for automatically
detecting and blocking dangerous requests in cybersecurity systems using LR, SVM, DT, and DNN
algorithms. At the same time, P. Bova et al. presented a quantitative model for assessing the
dangerous capabilities of artificial intelligence (AI), including mechanisms for early warning of
potential AI risks [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Both studies use intelligent systems to ensure security, but the current
results emphasize practical methods of real-time traffic classification for cyber defense, providing
greater accuracy in threat detection, while the work under review is more focused on general
strategies to prevent potential AI risks.
      </p>
      <p>
        Like the work conducted, where ML algorithms such as SVM and LR are used, the study by S.
Khan et al. [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] focuses on the classification of user interface errors using these algorithms, as well
as text vectorization techniques such as Term Frequency-Inverse Document Frequency (TF-IDF)
and Bag of Words (BoW). The best accuracy is achieved with SVM, TF-IDF, and data balancing
techniques. In addition, the authors apply Natural Language Processing (NLP) and Random Forest
(RF) methods.
      </p>
      <p>The current study also uses SVM and LR, but additionally considers DT and DNN algorithms,
which allows achieving higher accuracy and reliability in automating real-time attack detection.
This makes it possible to detect both known and new types of threats more efficiently, which is an
important aspect of cybersecurity compared to the interface error-oriented methods in the work
cited above.</p>
      <p>
        The results of this work are aimed at automating the detection and blocking of dangerous
requests in cybersecurity systems with a trained ML model using LR, SVM, DT, and DNN
algorithms. The study by U. Ahmed et al. [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] also applied ML and deep learning (DL) algorithms
to classify network traffic to improve network security. They tested SVM, k-nearest neighbours
(KNN), RF, DT, long short-term memory (LSTM), and artificial neural networks (ANN), where SVM
and RF were found to be effective for use in IDS, and DL models showed high accuracy in
recognizing complex attacks. Both studies demonstrate the effectiveness of SVM and DT
algorithms in the field of cybersecurity, and the conclusions of this work complement the study by
emphasizing the practical application of these algorithms in real-time to block threats
automatically.
      </p>
      <p>
        Unlike this study, which focuses on the ML model and its algorithms, the work of B. Ayyorgun
[
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] is aimed at using ML to work on edge devices in noisy environments. The main difference is
that the current study considers network traffic classification for cyber defense purposes, while the
presented work deals with IoT-based physical event analytics. At the same time, both works have a
common aspect—the use of ML for real-time and the importance of model optimization. Thus, the
results of this work complement the current study, particularly in optimizing ML models for
detecting cybersecurity anomalies, which opens up prospects for improving the effectiveness of
real-time protection.
      </p>
      <p>
        The study by G.A. López-Ramírez et al [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] uses ML methods to predict path loss in millimetre
wave (mmWave) networks, including LR, ANN, and Extreme Gradient Boosting (XGBoost),
improving the prediction accuracy compared to traditional methods. Similarly, the current work
also applies the LR algorithm to classify network traffic to detect dangerous requests. The results of
both studies confirm the effectiveness of using LR to improve prediction and classification
accuracy, although the current work focuses on automating real-time threat detection rather than
path loss prediction.
      </p>
      <p>
        While M. Kim et al. developed ML models to predict the risk of outcomes in medical processes
[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], databases [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], and cloud systems [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], the current work focuses on using ML to train the
model to detect and block dangerous queries. Their work used the CatBoost model to predict
treatment effectiveness, while the current study uses LR, SVM, DT, and DNN algorithms to
automate the classification and detection of threats in real-time. The results of both studies confirm
the effectiveness of ML in high-precision prediction and classification, but the scope and types of
algorithms are different.
      </p>
      <p>
        The study compared the effectiveness of ML algorithms and methods of blocking dangerous
requests, and the DT-based model demonstrated the best results. In turn, the study by A. Aggarwal
et al. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] used ML regression to predict the physical parameters of systems and achieved high
accuracy in forecasting. While both studies confirm the effectiveness of ML for providing accuracy
in prediction and classification, the current study is distinguished by its focus on real-time
detection and blocking of cyber threats. This makes it focused on real-world cybersecurity needs,
as opposed to general methods of predicting physical parameters.
      </p>
      <p>
        In addition to ML algorithms, the findings include an overview of methods for blocking
dangerous requests, such as supervised learning, signature methods, behavioral analysis, and
anomaly detection, as well as blacklists and whitelists. At the same time, the study by E. Peixoto et al.
[
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] proposed an environmentally sustainable approach to automating ML processes in dynamic
data environments, which involves the reuse of models based on data similarity metrics. This
approach reduces the frequency of model retraining without losing performance, which can be
useful for cybersecurity, where frequent updates of threat detection models are critical. Thus, this
study’s results confirm the feasibility of blocking dangerous queries considered in the current work
since optimizing the processes of updating models contributes to the efficiency of real-time threat
detection.
      </p>
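      <p>As a minimal illustration of the list-based and signature-based blocking methods surveyed above, the following Python sketch (not from the study; the rule sets, request fields, and patterns are hypothetical) combines a whitelist pass, a blacklist check, and a simple signature match:</p>

```python
import re

# Hypothetical rule sets; a real deployment would load and refresh these regularly.
WHITELIST = {"10.0.0.5", "10.0.0.6"}        # trusted sources, always allowed
BLACKLIST = {"203.0.113.7"}                 # known-bad sources, always blocked
SIGNATURES = [re.compile(p, re.IGNORECASE)  # patterns of known attack payloads
              for p in (r"union\s+select", r"<script\b", r"\.\./\.\./")]

def decide(src_ip: str, payload: str) -> str:
    """Return 'allow' or 'block' for a single request."""
    if src_ip in WHITELIST:
        return "allow"
    if src_ip in BLACKLIST:
        return "block"
    if any(sig.search(payload) for sig in SIGNATURES):
        return "block"   # matches a known-attack signature
    return "allow"       # unknown traffic passes; an ML stage could score it here

print(decide("203.0.113.7", "GET /index.html"))          # block (blacklisted)
print(decide("198.51.100.2", "id=1 UNION SELECT pass"))  # block (signature hit)
print(decide("198.51.100.2", "GET /about"))              # allow
```

      <p>As the section notes, such rules catch only known threats: the final "allow" branch is exactly where a behavioral or ML-based classifier would be inserted to cover new attacks.</p>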
      <p>
        Like this study, the work of M. D’Orazio et al. [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] uses classification algorithms, in
particular LR and SVM. At the same time, the current study additionally uses DT and DNN, while
that work uses neural networks (NN) and naïve Bayes (NB). Both works are aimed at automating
query classification, so they complement each other, demonstrating the versatility of ML methods
in different domains. This also highlights the potential of using NLP to analyze text queries and
identify potential threats.
      </p>
      <p>
        Similarly to this study, which focuses on ML and DNN for classifying dangerous queries, J.-J.
Hou et al. [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] focus on DL with an emphasis on convolutional neural networks (CNNs) and
recurrent neural networks (RNNs) for recognizing dangerous behaviour. The use of DNNs in the
current work and CNNs and RNNs in the above study shows that different NNs can be adapted to
the tasks of automated recognition of dangerous actions, so these works extend each other,
confirming the wide capabilities of NNs.
      </p>
      <p>
        This study aims to develop methods for blocking dangerous queries, while Z. Su et al. [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]
investigate data compression methods for pre-trained models. The results of both works can be
complementary, since memory optimization, as in the above work, can improve the performance of
current methods in real-world conditions, especially for resource-intensive algorithms. In addition,
the use of compression techniques can be useful for reducing the load on cybersecurity systems.
      </p>
      <p>It is worth noting that the study focuses on the development of an ML model in Matlab using
LR, SVM, DT, and DNN classification algorithms. Meanwhile, Z. Al Shara et al. [38]
applied the KMeans and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH)
clustering algorithms to recover the links between pull requests and issues on GitHub, achieving
91.5% accuracy with BIRCH. Although both works use ML algorithms, the current approach
focuses on classification to identify and block dangerous requests, while the other work focuses on
clustering. Therefore, these studies complement each other, as classification and clustering are
interrelated processes in data analysis: classification assigns categories to objects based on their
characteristics, while clustering groups similar objects, which can be useful for further
improving classification models or identifying new, previously unknown patterns.</p>
      <p>This work applies ML algorithms using trained models with strong accuracy and adaptability,
whereas K. E. Brown et al. [39] compare large language models (LLMs), such as GPT-3.5
(Generative Pre-trained Transformer) and GPT-4, with traditional ML methods. The current work
focuses on the effectiveness and stability of classical ML methods and algorithms for accurately
and reliably detecting dangerous queries. In contrast to that work, this paper demonstrates the
advantage of such methods in stability and adaptability, which is critical for cybersecurity.</p>
      <p>Like this work, the study by T. Matyja et al. [40] uses the Matlab environment. This paper
discusses business process optimization using simulation methods, where the authors use
SimEvents in Matlab/Simulink to model and generate artificial query sequences. They apply ML
techniques to transform real data into random data. At the same time, the current work uses
Matlab to apply classical ML algorithms, using functions such as fitclinear, fitcsvm, and fitctree,
together with the Training Progress monitor. Thus, the results of the presented work confirm the conclusions of the current
study, indicating the effectiveness of using ML in query analysis tasks and the convenience of
using Matlab for this purpose [41–43].</p>
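      <p>The MATLAB functions mentioned above encapsulate the training step; as a language-neutral sketch of the same idea (the feature, the threshold rule, and the toy data below are invented for illustration), a depth-1 decision tree, or "stump", can be fitted by exhaustively choosing the threshold that best separates benign from dangerous requests:</p>

```python
# Toy training set: (requests_per_second, label), label 1 = dangerous.
# The feature and the tiny dataset are illustrative, not from the study's dataset.
data = [(2, 0), (3, 0), (5, 0), (8, 0), (40, 1), (55, 1), (70, 1), (90, 1)]

def fit_stump(samples):
    """Exhaustively choose the threshold with the fewest misclassifications."""
    best = None
    for threshold, _ in samples:
        # Predict 1 (dangerous) when the rate is at or above the threshold.
        errors = sum(int((x >= threshold) != bool(y)) for x, y in samples)
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best[0]

def predict(threshold, x):
    return int(x >= threshold)

t = fit_stump(data)
print(t, predict(t, 4), predict(t, 60))  # learned threshold, then two predictions
```

      <p>A full decision tree recursively applies such splits over different features, which is why the DT model in this study could capture more complex patterns than any single threshold.</p>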
      <p>This study has shown that ML effectively automates the classification of insecure requests, and
the methods of supervised learning, signature analysis, behavioral analysis, and black- and
whitelists differ in effectiveness. In turn, M. Rahimifar et al. presented a method for predicting
resource utilization and inference latency of NNs before their implementation on
field-programmable gate arrays (FPGAs) [44]. Although this approach is focused on optimizing
hardware implementation, both papers emphasize the importance of automated evaluation of ML
models to improve their efficiency [45].</p>
      <p>Finally, the study demonstrated the effectiveness of DNNs in classifying insecure queries and
provided a software implementation in Matlab. H. Bakır [46] proposed TuneDroid, a method for
optimizing CNN configurations to improve Android malware detection through code visualization.
While TuneDroid focuses on improving the accuracy of CNNs through Bayesian optimization, the
current work proposes an approach to automated threat classification in general network traffic,
not just for Android.</p>
      <p>Thus, the proposed ML model for blocking dangerous requests outperforms the considered
approaches due to the combination of accuracy, the ability to detect both known and new threats,
and the optimal balance between performance and computational costs. The combination of
different ML methods and algorithms allowed us to achieve high classification accuracy, and the
use of a neural network ensured efficiency in detecting anomalies in real time. As a result, the proposed
approach has the potential to be integrated into modern cybersecurity systems, improving their
ability to autonomously detect and neutralize threats and risks [47, 48].</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>This work has shown that ML algorithms such as LR, SVM, DT, and DNN are effective tools for
automated classification and blocking of dangerous requests in cybersecurity systems. The analysis
showed that DT provides the highest classification accuracy due to its ability to recognize complex
patterns in input data. SVM works well with large amounts of data and complex hyperplanes of
separation, while LR is fast and easy to implement but inferior to other models in complex
scenarios. At the same time, DNN has demonstrated high efficiency in detecting complex attack
patterns, but its use requires significant computing resources.</p>
      <p>MATLAB modeling based on the Network Traffic Dataset allowed us to evaluate the
performance of the selected algorithms. Using the accuracy, F-score, precision, and recall metrics,
we confirmed that DTs provide the best balance between computational speed and classification
quality, while DNNs have the potential to detect complex and new attacks. The comparison also
showed that combining several methods increases the overall effectiveness of a cybersecurity
system. The considered methods of blocking dangerous requests allowed us to establish that the
use of supervised learning based on dynamic classifier selection (DCS) improves classification accuracy, as it adapts the choice of
algorithm according to the nature of the input data. Signature analysis proved to be effective in
recognizing known attacks, but its disadvantage is the inability to respond to new threats.
Behavioral analysis and anomaly detection methods provide adaptability but carry a risk of
false positives. The use of black- and whitelists proved to be the least flexible but is useful as an
additional defense mechanism.</p>
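      <p>The four evaluation metrics used throughout the study have standard definitions over the binary confusion matrix; the following pure-Python sketch (the example labels are invented) computes them for a set of request classifications:</p>

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 for binary labels (1 = dangerous)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy  = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall    = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Invented example: 10 requests, 4 truly dangerous.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))  # (0.8, 0.75, 0.75, 0.75)
```

      <p>In this invented example one dangerous request is missed (a false negative) and one benign request is flagged (a false positive), which is why precision and recall both drop to 0.75 while accuracy remains 0.8.</p>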
      <p>Among the study’s limitations are the use of only one dataset, which may affect the
generalizability of the results, and the high computational complexity of DNN training, making it
difficult to apply them in real time. In addition, signature-based attack detection methods
demonstrate limited effectiveness against new threats, and behavioral analysis has a risk of false
positives.</p>
      <p>Further research could focus on expanding and diversifying datasets, optimizing NNs to
improve performance, and developing hybrid approaches that combine different classification and
blocking methods to achieve greater efficiency in cybersecurity systems.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>While preparing this work, the authors used the AI programs Grammarly Pro to correct text
grammar and Strike Plagiarism to search for possible plagiarism. After using this tool, the authors
reviewed and edited the content as needed and took full responsibility for the publication’s content.</p>
      <p>[38] Z. Al Shara, H. Eyal-Salman, A. Shatnawi, A.-D. Seriai, ML-Augmented Automation for
Recovering Links Between Pull-Requests and Issues on GitHub, IEEE Access 99 (2023).
doi:10.1109/ACCESS.2023.10015726
[39] K. E. Brown, C. Yan, Z. Li, X. Zhang, B. X. Collins, Y. Chen, E. W. Clayton, M. Kantarcioglu, Y. Vorobeychik, B. A. Malin, Not the Models You Are Looking For: Traditional ML Outperforms
LLMs in Clinical Prediction Tasks, medRxiv (2024). doi:10.1101/2024.12.03.24318400
[40] T. Matyja, Z. Stanik, K. Włodkowski, A Method of Generating Customer Requests in a Car
Rental Simulation Model, Sci. J. Silesian Univ. of Technology, Series Transport 123 (2024) 191–
208.
[41] V. Buhas, et al., Using Machine Learning Techniques to Increase the Effectiveness of
Cybersecurity, in: Cybersecurity Providing in Information and Telecommunication Systems,
vol. 3188, no. 2 (2021) 273–281.
[42] V. Zhebka, et al., Methodology for Predicting Failures in a Smart Home based on Machine
Learning Methods, in: Workshop on Cybersecurity Providing in Information and
Telecommunication Systems, CPITS, vol. 3654 (2024) 322–332.
[43] M. Adamantis, V. Sokolov, P. Skladannyi, Evaluation of State-Of-The-Art Machine Learning
Smart Contract Vulnerability Detection Method, Advances in Computer Science for
Engineering and Education VII, vol. 242 (2025) 53–65. doi:10.1007/978-3-031-84228-3_5
[44] M. Rahimifar, H. Ezzaoui Rahali, A. Corbeil Therrien, Rule4ML: An Open-Source Tool for
Resource Utilization and Latency Estimation for ML Models on FPGA, Machine Learning:
Science and Technology 6(1) (2025). doi:10.1088/2632-2153/ada71c
[45] V. Zhebka, et al., Methodology for Choosing a Consensus Algorithm for Blockchain
Technology, in: Workshop on Digital Economy Concepts and Technologies Workshop,
DECaT, vol. 3665 (2024) 106–113.
[46] H. Bakır, A New Method for Tuning the CNN Pre-Trained Models as a Feature Extractor for
Malware Detection, Pattern Analysis and Applications 28(1) (2025).
doi:10.1007/s10044-024-01381-x
[47] T. Korobeinikova, I. Tachenko, R. Chekhmestruk, P. Mykhaylov, O. Romanyuk, S. Romanyuk,
A General Method of Risk Estimation, in: 13th Int. Conf. Advanced Computer Information
Technologies (ACIT), Wrocław, Poland, 2023, 410–413. doi:10.1109/ACIT58437.2023.10275626
[48] V. Susukailo, I. Opirsky, O. Yaremko, Methodology of ISMS Establishment Against Modern
Cybersecurity Threats, in: M. Klymash, M. Beshley, A. Luntovskyy (eds), Future Intent-Based
Networking, Lecture Notes in Electrical Engineering, vol. 831, Springer, Cham, 2022.
doi:10.1007/978-3-030-92435-5_15</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Gnatyuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zozulja</surname>
          </string-name>
          ,
          <article-title>Model of Data for Improving Cybersecurity Content Management System</article-title>
          ,
          <source>Infocommun. Comput. Technol</source>
          .
          <volume>2</volume>
          (
          <issue>02</issue>
          ) (
          <year>2022</year>
          ). doi:10.36994/2788-5518-2021-02-02-15
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nafiiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <article-title>Malware Detection Model based on Machine Learning</article-title>
          ,
          <source>Bulletin of Cherkasy State Technological University</source>
          <volume>28</volume>
          (
          <issue>3</issue>
          ) (
          <year>2023</year>
          ). https://bulletin-chstu.com.ua/en/journals/t-23-3-2023/model-viyavlennya-shkidlivogo-programnogo-zabezpechennya-na-osnovi-mashinnogo-navchannya
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Yanko</surname>
          </string-name>
          , A. Prokudin, I. Fil, O. Kruk,
          <article-title>Detection of LDDoS Attacks using SDN Networks with Machine Learning Elements</article-title>
          ,
          <source>Measuring and Computing Devices in Technological Processes</source>
          (4) (
          <year>2024</year>
          )
          <fpage>287</fpage>
          -
          <lpage>296</lpage>
          . doi:10.31891/2219-9365-2024-80-36
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Á. González</given-names>
            <surname>Álvarez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Castaño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Franch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Martínez-Fernández</surname>
          </string-name>
          ,
          <article-title>Impact of ML Optimization Tactics on Greener Pre-Trained ML Models</article-title>
          , arXiv,
          <year>2024</year>
          . doi:10.48550/arXiv.2409.12878
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I. G. A.</given-names>
            <surname>Mulyawarman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.G.A.S.</given-names>
            <surname>Yasa</surname>
          </string-name>
          , L. Cait,
          <article-title>Blocking Dangerous Content in Electronic Communications Networks: Evidence from Netherlands, United States and Singapore</article-title>
          ,
          <source>J. Human Rights Culture and Legal System</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ) (
          <year>2024</year>
          )
          <fpage>237</fpage>
          -
          <lpage>262</lpage>
          . URL: https://jhcls.org/index.php/JHCLS/article/view/216
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Practical Continual Forgetting for Pretrained Vision Models</article-title>
          ,
          <source>in: CVPR 2025</source>
          , CVF Open Access (
          <year>2025</year>
          ). URL: https://openaccess.thecvf.com/content/CVPR2024/papers/Zhao_Continual_Forgetting_for_Pre-trained_Vision_Models_CVPR_2024_paper.pdf
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Exploring the Design Space of Unsupervised Blocking with Pre-trained Language Models in Entity Resolution</article-title>
          , in: Advanced Data Mining and Applications, Computer Science,
          <year>2023</year>
          ,
          <fpage>228</fpage>
          -
          <lpage>244</lpage>
          . doi:10.1007/978-3-031-46661-8_16
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vrincean</surname>
          </string-name>
          ,
          <article-title>Optimizing Request Handling using Blocking &amp; Non-Blocking I/O Middleware</article-title>
          , Babes-Bolyai Univ., Cluj-Napoca,
          <year>2021</year>
          . https://www.researchgate.net/publication/353103884_Optimizing_request_handling_using_blocking_non-blocking_IO_middleware
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Favela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Sánchez-Torres</surname>
          </string-name>
          ,
          <article-title>Traceable Health Data for Consciously Trained ML Models</article-title>
          , in
          <source>: Proc. 15th Int. Conf. Ubiquitous Computing &amp; Ambient Intelligence</source>
          ,
          <year>2023</year>
          ,
          <fpage>279</fpage>
          -
          <lpage>284</lpage>
          . doi:10.1007/978-3-031-48642-5_28
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Papathanasiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tziouvaras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kolomvatsos</surname>
          </string-name>
          ,
          <article-title>MYRTO: An Efficient Pervasive Method for Hybrid ML-based Data Filtered Allocations</article-title>
          ,
          <source>J. Intelligent Information Systems</source>
          (
          <year>2024</year>
          ).
          doi:10.1007/s10844-024-00909-1
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V.</given-names>
            <surname>Brydinskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Khoma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sabodashko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Podpora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Khoma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Konovalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kostiak</surname>
          </string-name>
          ,
          <article-title>Comparison of Modern Deep Learning Models for Speaker Verification</article-title>
          , Appl. Sci. (Switzerland)
          <volume>14</volume>
          (
          <issue>4</issue>
          ) (
          <year>2024</year>
          )
          <fpage>1329-1</fpage>
          -
          <lpage>1329-12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
             
            <surname>Shevchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
             
            <surname>Harasymchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
             
            <surname>Partyka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
             
            <surname>Korshun</surname>
          </string-name>
          ,
          <article-title>Designing Secured Services for Authentication, Authorization, and Accounting of Users</article-title>
          , in: Cybersecurity Providing in Information and Telecommunication Systems II (CPITS-II)
          ,
          <volume>3550</volume>
          ,
          <year>2023</year>
          , pp.
          <fpage>217</fpage>
          -
          <lpage>225</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tavasoli</surname>
          </string-name>
          ,
          <source>10 Types of Machine Learning Algorithms and Models, Simplilearn Solutions</source>
          ,
          <year>2025</year>
          . https://www.simplilearn.com/10-algorithms-machine-learning-engineers-need-to-know-article
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chalichalamala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Govindan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kasarapu</surname>
          </string-name>
          ,
          <article-title>Logistic Regression Ensemble Classifier for Intrusion Detection System in Internet of Things</article-title>
          ,
          <source>Sensors</source>
          <volume>23</volume>
          (
          <issue>23</issue>
          ) (
          <year>2023</year>
          )
          9583. doi:10.3390/s23239583
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Katiyar</surname>
          </string-name>
          ,
          <article-title>Off-Line Handwritten Character Recognition System Using Support Vector Machine</article-title>
          ,
          <source>Amer. J. Neural Netw. Appl.</source>
          <volume>3</volume>
          (
          <issue>2</issue>
          ) (
          <year>2017</year>
          )
          <fpage>22</fpage>
          . URL: https://www.sciencepublishinggroup.com/article/10.11648/j.ajnna.20170302.12
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Patil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Patil</surname>
          </string-name>
          ,
          <article-title>Malicious URLs Detection using Decision Tree Classifiers and Majority Voting Technique</article-title>
          ,
          <source>Cybernetics and Information Technologies</source>
          <volume>18</volume>
          (
          <issue>1</issue>
          ) (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Subasi</surname>
          </string-name>
          ,
          <source>Machine Learning Techniques, in: Practical Machine Learning for Data Analysis Using Python</source>
          ,
          <year>2020</year>
          ,
          <fpage>91</fpage>
          -
          <lpage>202</lpage>
          . https://www.sciencedirect.com/science/article/abs/pii/B9780128213797000035
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gattu</surname>
          </string-name>
          , Network Traffic Dataset, Kaggle,
          <year>2024</year>
          . https://www.kaggle.com/datasets/ravikumargattu/network-traffic-dataset
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I.</given-names>
            <surname>Zhuravel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semenyuk</surname>
          </string-name>
          ,
          <article-title>Stochastic Models for Computer Malware Propagation</article-title>
          ,
          <source>in: Proc. 2024 IEEE 17th Int. Conf. Advanced Trends in Radioelectronics</source>
          , Telecommunications and Computer Engineering (TCSET), Lviv, Ukraine,
          <year>2024</year>
          ,
          <fpage>424</fpage>
          -
          <lpage>427</lpage>
          . doi:10.1109/TCSET64720.2024.10755827
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Korobeinikova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Zhuravel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Mychuda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sikora</surname>
          </string-name>
          ,
          <article-title>The Practice of Block Symmetric Encryption for a Secure Internet Connection</article-title>
          ,
          <source>in: Computational Intelligence Application Workshop (CIAW)</source>
          ,
          <volume>3861</volume>
          (
          <year>2024</year>
          )
          <fpage>114</fpage>
          -
          <lpage>122</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Brownlee</surname>
          </string-name>
          ,
          <article-title>Dynamic Classifier Selection Ensembles in Python</article-title>
          ,
          <source>Machine Learning Mastery</source>
          ,
          <year>2021</year>
          . https://machinelearningmastery.com/dynamic-classifier-selection-in-python/
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <article-title>What is Signature-based Detection?</article-title>
          , Corelight Inc.,
          <year>2025</year>
          . https://corelight.com/resources/glossary/signature-based-detection
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.</given-names>
            <surname>Korchenko</surname>
          </string-name>
          ,
          <article-title>Methods for Identifying Anomalous Conditions in Intrusion Detection Systems</article-title>
          , Komprint, Kyiv,
          <year>2019</year>
          . https://nubip.edu.ua/sites/default/files/u34/monografiy_a_korchenko_anna.pdf
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>C.</given-names>
            <surname>Kime</surname>
          </string-name>
          ,
          <article-title>Whitelisting vs Blacklisting: How Are They Different?</article-title>
          , TechnologyAdvice, eSecurity Planet,
          <year>2023</year>
          . https://www.esecurityplanet.com/applications/whitelisting-vs-blacklisting-which-is-better/
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Di</given-names>
            <surname>Stefano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.A.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <article-title>Quantifying Detection Rates for Dangerous Capabilities: A Theoretical Model of Dangerous Capability Evaluations</article-title>
          , arXiv,
          <year>2024</year>
          . doi:10.48550/arXiv.2412.15433
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <article-title>User Interface Bug Classification Model Using ML and NLP Techniques: A Comparative Performance Analysis of ML Models</article-title>
          ,
          <source>Int. J. Experimental Research and Review</source>
          <volume>45</volume>
          (
          <year>2024</year>
          )
          <fpage>56</fpage>
          -
          <lpage>69</lpage>
          . https://qtanalytics.in/journals/index.php/IJERR/article/view/3829
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>U.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          , et al.,
          <article-title>Signature-based Intrusion Detection using Machine Learning and Deep Learning Approaches Empowered with Fuzzy Clustering</article-title>
          ,
          <source>Sci. Rep</source>
          .
          <volume>15</volume>
          (
          <year>2025</year>
          ). doi:10.1038/s41598-025-85866-7
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ayyorgun</surname>
          </string-name>
          ,
          <article-title>Developing Ultra-lite ML Models for Crash Detection in Noisy Environments</article-title>
          , OSF Preprints (
          <year>2025</year>
          ). doi:10.31219/osf.io/b6zpc
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>G.A.</given-names>
            <surname>López-Ramírez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aragón-Zavala</surname>
          </string-name>
          ,
          <article-title>Enhancing Indoor mmWave Communication With ML-Based Propagation Models</article-title>
          ,
          <source>IEEE Access</source>
          <volume>99</volume>
          (
          <year>2025</year>
          ). doi:10.1109/ACCESS.2025.10835075
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Koo</surname>
          </string-name>
          ,
          <article-title>Chemotherapy Selection for Advanced or Metastatic Pancreatic Cancer using Machine Learning Models Trained with Multi-Center Datasets</article-title>
          ,
          <source>J. Clin. Oncol</source>
          .
          <volume>43</volume>
          (
          <year>2025</year>
          )
          <fpage>738</fpage>
          -
          <lpage>738</lpage>
          . doi:10.1200/JCO.2025.43.4_suppl.738
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>T.</given-names>
            <surname>Korobeinikova</surname>
          </string-name>
          , et al.,
          <article-title>Web-Applications Fault Tolerance and Autoscaling Provided by the Combined Method of Databases Scaling</article-title>
          ,
          <source>in: 12th Int. Conf. Advanced Computer Information Technologies (ACIT)</source>
          ,
          <year>2022</year>
          ,
          <fpage>27</fpage>
          -
          <lpage>32</lpage>
          . doi:10.1109/ACIT54803.2022.9913098
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>O.</given-names>
            <surname>Vakhula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Opirskyy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Mykhaylova</surname>
          </string-name>
          ,
          <article-title>Research on Security Challenges in Cloud Environments and Solutions based on the “Security-as-Code” Approach</article-title>
          , in:
          <source>CEUR Workshop Proc. 3550</source>
          ,
          <year>2023</year>
          ,
          <fpage>55</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.K.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bajpai</surname>
          </string-name>
          ,
          <article-title>Using ML-Supervised Learning-Based Algorithms to Create a Relative Permeability Model</article-title>
          ,
          <source>SPE Caspian Technical Conf. and Exhibition</source>
          (
          <year>2024</year>
          ). http://researchgate.net/publication/386139521
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>E.</given-names>
            <surname>Peixoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Carneiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.M.L.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Marques</surname>
          </string-name>
          ,
          <article-title>Reusing ML Models in Dynamic Data Environments: A Data Similarity-based Approach for Efficient MLOps</article-title>
          ,
          <source>Preprints</source>
          (
          <year>2025</year>
          ). doi:10.20944/preprints202501.1385.v1
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>M.</given-names>
            <surname>D'Orazio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bernardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Di Giuseppe</surname>
          </string-name>
          ,
          <article-title>Influence of Pre-Processing Methods on the Automatic Priority Prediction of Native-Language End-Users' Maintenance Requests through Machine Learning Methods</article-title>
          ,
          <source>J. IT in Construction</source>
          <volume>29</volume>
          (
          <year>2024</year>
          ). https://itcon.org/paper/2024/6
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>J.-J.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Research Progress of Dangerous Driving Behavior Recognition Methods Based on Deep Learning</article-title>
          ,
          <source>World Electric Vehicle J</source>
          .
          <volume>16</volume>
          (
          <issue>2</issue>
          ) (
          <year>2025</year>
          )
          <fpage>62</fpage>
          . doi:10.3390/wevj16020062
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask</article-title>
          ,
          <source>Proc. VLDB Endowment</source>
          <volume>17</volume>
          (
          <issue>8</issue>
          ) (
          <year>2024</year>
          )
          <fpage>2036</fpage>
          -
          <lpage>2049</lpage>
          . doi:10.14778/3659437.3659456
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>