Models and Technologies for Autoscaling Based on Machine Learning for Microservices Architecture

Serhiy Semerikov1, Dmytro Zubov2, Andrey Kupin1, Maxim Kosei1 and Vladyslav Holiver1
1 Kryvyi Rih National University, Vitaly Matusevich 11, Kryvyi Rih, 50027, Ukraine
2 University of Central Asia, 125/1 Toktogul Street, Bishkek, 720001, Kyrgyzstan

Abstract
The object of the research in this article is machine learning processes in web service systems used for providing online services. The subject of the study is methods and tools for autoscaling these web services using machine learning. The evolution of web services and their structure, including development history, scaling options, key concepts of microservices architecture, and general principles of artificial intelligence and machine learning, are analyzed, providing an important foundation for understanding technological innovations and potential enhancements for web services. The most significant aspects of applying machine learning in microservices architecture are identified, including various design patterns and machine learning models, which form the basis for improving the efficiency and capabilities of complex systems. Relevant mathematical models and the necessary software are proposed.

Keywords
Microservices architecture, artificial intelligence, machine learning, deep learning, SAGA, CRUD, CQRS, API gateway, circuit breaker, Python, containers, Docker, Ubuntu.

1. Introduction
Web services are an essential part of the modern Internet. They enable web applications to interact with each other regardless of the platforms or programming languages they are written in. This makes web services a valuable tool for developers who want to create flexible and scalable web applications [1, 2]. Web services are used in a wide range of applications, including e-commerce systems, banks, mobile operators, and web stores.
For example, e-commerce systems often use web services to process payments, deliver goods, and provide customer support; banks use them to provide financial services such as accounts, loans, and deposits; mobile operators use them to manage their networks and provide services to their customers; and web stores use them to catalog products, process orders, and deliver goods. The significance of web services is growing every year, so it is becoming crucial to ensure their scalability, i.e., the ability of the system to increase its processing capacity as the number of users grows [3]. The application of artificial intelligence (AI) to automate the scaling of web services is an active area of research. Scientists and engineers are developing new AI methods and algorithms that can help improve the efficiency and accuracy of automated scaling in microservices architecture (MSA). Here are a few specific examples of how AI can be used to automate web service scaling: Amazon Web Services (AWS) uses AI in its Auto Scaling system, which automatically deploys or shuts down servers based on load; Google Cloud Platform (GCP) uses AI in its Cloud Autoscaling system, which has similar functionality to AWS's Auto Scaling; Microsoft Azure uses AI in its Azure Autoscale system, which also has similar functionality.

COLINS-2024: 8th International Conference on Computational Linguistics and Intelligent Systems, April 12–13, 2024, Lviv, Ukraine
semerikov@gmail.com (S. Semerikov); dzubov@ieee.org (D. Zubov); kupin@knu.edu.ua (A. Kupin); kosei@knu.edu.ua (M. Kosei); holivervlad@gmail.com (V. Holiver)
0000-0003-0789-0272 (S. Semerikov); 0000-0002-5601-7827 (D. Zubov); 0000-0001-7569-1721 (A. Kupin); 0009-0008-3445-5675 (M. Kosei); 0000-0002-8276-5992 (V. Holiver)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Autoscaling based on machine learning (ML) can be utilized not only for web services but also for other types of applications, such as mobile apps, embedded systems, and more. For instance, autoscaling can be employed for mobile applications that experience a high volume of server requests, such as online shopping apps or social networks. ML-based autoscaling of web services allows for automatically adjusting server resources based on the workload of the web service. This is achieved using ML algorithms that analyze service monitoring data and forecast future workload; for example, classification, regression, clustering, or neural network algorithms can be employed [4-7]. Advantages of using ML-based autoscaling methods and tools:
- high availability and reliability of the system due to horizontal scaling and the use of microservices architecture;
- efficient use of resources due to automatic scaling of resources based on the actual load;
- flexibility and scalability of the system due to the ability to add new servers and containers as needed and to scale individual services depending on the load;
- high performance and avoidance of unpredictable failures through the use of machine learning to predict future load and automatically scale server resources accordingly.
Although machine learning and container-based autoscaling methods and tools have many advantages, they also have some disadvantages [8]. One of the main drawbacks is the complexity of implementing and customizing these tools. In addition, the use of machine learning can require significant computing resources, which can increase hardware and system maintenance costs. Finally, autoscaling can be difficult when using third-party services, such as databases or other tools, that may not be compatible with autoscaling.
Moreover, it is worth noting that when employing autoscaling methods and tools for web services, security considerations must be taken into account. For example [9], when using containers, ensuring container security is essential, as they may contain sensitive data. Additionally, the security of the network on which the web service operates should be ensured, as inadequate security measures can lead to system breaches and loss of confidential information. It is also essential to remember that security is an ongoing process that requires continuous monitoring and updating of protection methods and tools. Therefore, when selecting methods and tools for autoscaling web services, security considerations must be taken into account, and continuous monitoring and system updates should be ensured to maintain the highest level of security for the web service. When choosing autoscaling methods and tools, it is also essential to consider economic feasibility. The utilization of complex methods and tools may result in a significant increase in deployment and maintenance costs. Therefore, when selecting autoscaling methods and tools, it is necessary to weigh their effectiveness against their cost and determine which methods and tools are optimal for the specific system. The aim of this research is to develop efficient methods and algorithms for autoscaling web services based on machine learning to ensure stable operation of services under varying workloads. The tasks include analyzing existing autoscaling methods, developing new algorithms, implementing them, and testing them on real web services.

2. Analyzing Patterns for Designing in Microservices Architecture (MSA)
MSA offers numerous advantages in terms of flexibility, scalability, and service independence, but it also introduces complexities associated with managing these distributed systems. Utilizing design patterns in MSA becomes exceedingly important, as they offer ready-made and proven solutions to such challenges.
Patterns help find optimal ways to manage data consistency, implement service-to-service relationships, monitoring, and scaling. They contribute to the creation of stable, efficient microservices applications, allowing developers to focus on application functionality rather than solving complex technical tasks. In MSA, where system components are divided into separate services, situations may arise where one service successfully makes changes to the database while another service that depends on these changes has not yet updated its data. This can lead to an inconsistent data state in the system. Additionally, problems may arise during the execution of distributed transactions (Figure 1), where part of the transaction is executed successfully while another part is not. This may be caused by network issues, software errors, or other unforeseen circumstances.
Figure 1: Example of a distributed transaction where transaction boundaries are crossed by multiple services and databases
To address issues with distributed transactions, the SAGA design pattern is used. This pattern is based on the idea of breaking down a large transaction into smaller sub-transactions, which are executed separately in each microservice (Figure 2). If one of the sub-transactions fails, a compensating transaction is applied to undo or adjust the changes made by the preceding transactions.
Figure 2: An example of a distributed transaction using the SAGA Flow pattern
The SAGA Execution Coordinator (SEC) is the central component for executing a SAGA flow (Figure 3). It contains the SAGA Log, which records the sequence of events of a distributed transaction. There are two types of SAGA pattern implementation: Choreography and Orchestration. In the SAGA Choreography pattern, each microservice involved in a transaction publishes an event that is processed by the next microservice.
This is an interaction of services where each service receives events from others and responds to them, controlling the course of the transaction. To use this pattern, you need to decide whether the microservice will be part of the SAGA; accordingly, the microservice must use an appropriate framework to implement it. In this pattern, the SAGA runtime coordinator can be embedded in the microservice or act as a separate component.
Figure 3: Saga Execution Coordinator Component
In SAGA Choreography, the flow is considered successful if all microservices successfully complete their local transactions and none of them report a failure (Figure 4). In case of failure, the microservice notifies the SAGA Execution Coordinator (SEC), through which the corresponding compensating transactions are invoked. If the invocation of a compensating transaction fails, it is retried until it succeeds; compensating transactions must therefore be idempotent and safe to retry.
Figure 4: SAGA Choreography diagram illustrating the successful execution of a transaction
In traditional systems, especially in monolithic applications, it is very common to use a shared relational database hosted on the server side and accessible from the user application. This centralized database is accessed using Create-Read-Update-Delete (CRUD, Figure 5) operations. However, in today's complex MSA applications, especially when scaling the application, this traditional implementation creates a problem. When processing multiple CRUD requests to a database, table joins are created, which leads to a high probability of database locks. Locking tables causes delays and contention for resources, which significantly affects the overall performance of the system.
Figure 5: CRUD Pattern
Complex queries contain a large number of table joins and can lock tables, preventing any write or update operations until the query completes and the database releases the locks.
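The compensation mechanics described above can be sketched in a few lines of Python. This is an illustrative orchestration-style coordinator, not a production SAGA framework; the service names, the failure scenario, and the in-memory saga log are assumptions made for the example.

```python
# Minimal sketch of SAGA orchestration: each step pairs a local
# transaction with a compensating transaction that can undo it.

class SagaStep:
    def __init__(self, name, action, compensation):
        self.name, self.action, self.compensation = name, action, compensation

class SagaCoordinator:
    """Plays the role of the SAGA Execution Coordinator (SEC)."""
    def __init__(self, steps):
        self.steps = steps
        self.log = []  # the SAGA log: sequence of transaction events

    def execute(self):
        completed = []
        for step in self.steps:
            try:
                step.action()
                self.log.append(("done", step.name))
                completed.append(step)
            except Exception:
                self.log.append(("failed", step.name))
                # Undo already-completed steps in reverse order.
                for done in reversed(completed):
                    done.compensation()  # must be idempotent
                    self.log.append(("compensated", done.name))
                return False
        return True

# Illustrative order flow: the payment succeeds, then inventory fails,
# so the payment is compensated (refunded).
state = {"paid": False}

def charge():   state["paid"] = True
def refund():   state["paid"] = False
def reserve():  raise RuntimeError("inventory unavailable")
def release():  pass

saga = SagaCoordinator([
    SagaStep("payment", charge, refund),
    SagaStep("inventory", reserve, release),
])
ok = saga.execute()
print(ok, state["paid"])  # -> False False
```

The key property is visible in the log: after the inventory step fails, the coordinator replays the compensations in reverse order, leaving the system in a consistent state.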
Database read operations are usually performed several times more frequently than write operations, and in systems with intensive transaction execution, this problem can be exacerbated. The CQRS (Command Query Responsibility Segregation) pattern divides one object into two: instead of executing both commands and queries against one object, we split it into two objects, one for commands and one for queries. A command is an operation that changes the state of an object, while a query does not change the state of the system; it just returns a result. The object here is the database, and this separation can be physical or logical. Best practice is to have two physical databases for CQRS, but it is still possible to use one physical database for both commands and queries; for example, the database can be divided into two logical representations, one for commands and the other for queries. When using two physical databases in CQRS, a replica is created from the primary database. The replica needs to be synchronized with the primary database for data consistency. Synchronization can be achieved by implementing an event-driven architecture (EDA), where a message broker handles all system events. The replica subscribes to the message broker, and whenever the primary database publishes an event to the message broker, the replica database synchronizes that specific change. There will be a delay between the exact time the primary database was actually changed and the time that change is reflected in the replica; the two databases will not be 100% consistent during this period, but they will become consistent after the change is synchronized. In CQRS, this property is called eventual consistency.
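As a minimal sketch of the CQRS synchronization just described, the following toy Python code separates the command and query sides and keeps a read replica in sync through a message broker. All components are in-memory stand-ins invented for the example; in a real EDA the broker delivers events asynchronously, which is exactly what makes the consistency eventual rather than immediate.

```python
# Toy CQRS: commands write to a primary store; a message broker
# propagates change events to a read replica serving queries.

class Broker:
    def __init__(self):
        self.subscribers = []
    def subscribe(self, handler):
        self.subscribers.append(handler)
    def publish(self, event):
        # Synchronous delivery here; asynchronous in a real system.
        for handler in self.subscribers:
            handler(event)

class CommandSide:
    """Handles state-changing commands against the primary database."""
    def __init__(self, broker):
        self.primary = {}
        self.broker = broker
    def handle(self, key, value):
        self.primary[key] = value
        self.broker.publish(("upsert", key, value))

class QuerySide:
    """Serves reads from a replica kept in sync via broker events."""
    def __init__(self, broker):
        self.replica = {}
        broker.subscribe(self.apply)
    def apply(self, event):
        _, key, value = event
        self.replica[key] = value
    def get(self, key):
        return self.replica.get(key)

broker = Broker()
commands, queries = CommandSide(broker), QuerySide(broker)
commands.handle("order:42", {"status": "paid"})
print(queries.get("order:42"))  # -> {'status': 'paid'}
```

The design point is that the query side never touches the primary store: it only reacts to published events, so reads and writes can be scaled and stored independently.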
When applying CQRS in MSA, the latency of database operations is significantly reduced; therefore, the performance of communication between individual services is significantly improved, which leads to an overall increase in system performance. The type of database used for CQRS may vary depending on the business requirements of a particular service in the MSA. It may be a relational database (RDB), a document-oriented database, a graph database, or another type of NoSQL database. Microservices can communicate directly with each other without the need for a centralized manager (Figure 6). However, as the MSA system evolves and matures, and the number of microservices gradually increases, direct communication between microservices can lead to significant overhead, especially for calls that require multiple round trips between the API consumer and the API provider.
Figure 6: MSA System with Direct Service Interaction
Following the principle of microservice autonomy, each microservice can use its own technology stack and may communicate using a different API contract than other microservices in the same MSA system. For example, one microservice may only understand RESTful APIs with a JSON data structure, while others may communicate only through Thrift APIs or Avro APIs. Furthermore, the location (IP address and listening port) of active microservice instances can dynamically change within the system based on microservice needs. Therefore, the system needs a mechanism to identify the destination to which the API consumer can send its calls. All of this requires additional built-in code in each microservice to map legacy API contracts, discover the network location of other microservices in the system (Service Discovery), and understand the overall needs of each microservice.
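These concerns are commonly centralized in a single entry point, as the API Gateway pattern discussed below does. A toy Python sketch of the idea follows; all routes, payload shapes, and service contracts are invented for illustration.

```python
# Sketch: a gateway-style dispatcher that hides each provider's
# contract behind one entry point and translates requests for it.

class ApiGateway:
    def __init__(self):
        self.routes = {}  # route -> (service callable, translator)
    def register(self, route, service, translate):
        self.routes[route] = (service, translate)
    def call(self, route, payload):
        service, translate = self.routes[route]
        return service(translate(payload))  # adapt contract, then invoke

# Two providers with different "contracts": one expects a dict,
# the legacy one expects a positional tuple.
def orders_service(req):          # expects {"id": ...}
    return {"order": req["id"], "status": "created"}

def legacy_inventory(req):        # expects (sku, qty)
    sku, qty = req
    return f"reserved {qty} x {sku}"

gw = ApiGateway()
gw.register("POST /orders", orders_service, lambda p: {"id": p["order_id"]})
gw.register("POST /reserve", legacy_inventory, lambda p: (p["sku"], p["qty"]))

print(gw.call("POST /orders", {"order_id": 7}))
print(gw.call("POST /reserve", {"sku": "A1", "qty": 3}))
```

Consumers talk to one uniform interface, while the translators absorb the contract differences that would otherwise be duplicated inside every microservice.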
However, such an approach violates the principle of microservice autonomy and increases system complexity, as with each new microservice added to the system, the number of potential interactions between microservices grows. A more efficient approach to addressing these issues is to use the API Gateway pattern (Figure 7). Here, microservices communicate through the gateway: it receives API calls from consumers and then transforms the received data into a data structure and protocol that the API providers can understand and process. Using the API Gateway significantly reduces direct one-to-one communication between services. Additionally, microservices are relieved of the code required to perform tasks such as API contract mapping, service discovery, and tasks related to controlling and providing access to system resources (AAA: Authentication, Authorization, Accounting).
Figure 7: MSA System with API Gateway
Another important issue in MSA systems is the stability and reliability of executing business processes. Patterns like SAGA (Figures 2-3) are used to ensure that all transactions in a specific business process either all succeed or none of them do. However, this is insufficient for the reliable operation of an MSA system. Let's consider a scenario (Figure 8) where the called microservice is too slow to react to API calls. Requests are successfully executed, but responses from the microservice end in a timeout. The consumer of the microservice may assume the operation failed and accordingly retry it, which can be very problematic. In MSA systems, when creating a microservice, limited resources and threads are set aside for it to prevent one particular microservice from taking up all the system resources.
Figure 8: Example of interaction between microservices with excessively slow response time
Let's consider another scenario (Figure 9), where the Inventory microservice is part of a workflow and fails to respond to API calls for any reason.
In this case, both the Order and Payment microservices will continue to wait for confirmation from the Inventory microservice before releasing their resources.
Figure 9: Example of interaction between microservices when one of the microservices does not respond to calls
Such scenarios in MSA systems can trigger a domino effect, leading to a cascading failure of multiple microservices, which in turn can cause a system-wide failure. To prevent a cascading failure of the system, the Circuit Breaker pattern is used. The Circuit Breaker monitors the performance of the microservice using real traffic metrics. It analyzes parameters such as response time and the percentage of successful responses and determines the state of the microservice in real time. If the microservice stops responding to calls, the Circuit Breaker transitions to an open state and immediately returns an error to the consumers of the microservice. While open, the Circuit Breaker continues to monitor and evaluate the microservice's state; after a delay it transitions to a half-open state, allowing only a limited number of requests to pass through to the monitored microservice. If the breaker then detects normal behavior from the microservice, it transitions back to a closed state, and all requests are again routed to the microservice for processing.

3. Model Selection for Machine Learning (ML)
Among the most significant types of ML models are regression, neural network, and multiclass classification models [10]. Let's analyze their main features. Regression models are modeling methods used to determine the relationship between independent and dependent variables. Typically, the outcome of a regression model is a continuous value, also known as a quantitative variable. Typical examples include predicting house prices based on their characteristics or forecasting sales of a particular product in a new store based on information about previous sales.
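Before moving on, the Circuit Breaker's closed/open/half-open state machine from the previous section can be sketched in Python. The failure threshold and reset timeout are illustrative choices, not values prescribed by the pattern.

```python
import time

# Minimal Circuit Breaker: fail fast while "open", probe while
# "half-open", pass traffic through while "closed".

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, func):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"  # allow a probe request through
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed probe, or too many failures, (re)opens the circuit.
            if self.state == "half-open" or self.failures >= self.max_failures:
                self.state = "open"
                self.opened_at = time.monotonic()
            raise
        # Any success closes the breaker and resets the failure counter.
        self.state, self.failures = "closed", 0
        return result

# Demo: a service that never responds trips the breaker after 2 failures.
def flaky():
    raise TimeoutError("no response")

cb = CircuitBreaker(max_failures=2, reset_timeout=60.0)
for _ in range(2):
    try:
        cb.call(flaky)
    except TimeoutError:
        pass
print(cb.state)  # -> open
```

Once open, further calls are rejected immediately instead of tying up the caller's threads, which is what stops the cascading failure described above.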
Before creating a regression model, it is necessary to understand the data and its structure. Most regression models use supervised learning (SL) methods. For regression models, the training data usually consists of a set of features and the values of the dependent variable, known as the label. Features are typically denoted as X, and labels as Y. Usually, the training data is divided into two subsets: training and testing sets. The training set typically consists of 70-80% of the total data, while the testing set contains the rest. This allows the model to learn from the training set and then be evaluated on the testing set to assess its accuracy and quality. From the obtained results, we can draw conclusions about how the model performs on the dataset. For a linear regression model to work effectively, the data must have a linear structure. The model learns from the data using the formula

$y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$, (1)

where $y_i$ is the final outcome (actual value) of the target parameter; $x_1, \dots, x_n$ are the input parameters; $\beta_0, \beta_1, \dots, \beta_n$ are the free parameters (coefficients); and $n$ is the number of input parameters. In the case of linear regression, there are two common metrics used to evaluate the model: the Root Mean Square Error (RMSE) and the coefficient of determination $R^2$. RMSE represents the standard deviation of the residual errors in predictions. A residual is the distance between an actual data point and the regression line. The greater the average deviation of all points from the line, the higher the error; this indicates a weak model, as it fails to capture the correlation between data points. This metric can be calculated using the formula

$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$, (2)

where $\hat{y}_i$ is the predicted value and $n$ is the number of data points. The coefficient of determination $R^2$ measures the proportion of the variance in the dependent variable (Y) that can be explained by the independent variables (X). Essentially, it indicates how well the data fit the model.
Unlike RMSE, which can be any number, $R^2$ is typically expressed as a number ranging from 0 to 1, making it easier to interpret. The closer $R^2$ is to 1, the better the model explains the data. Although this is a useful indicator, values close to 1 are not always indicative of a strong model; the quality of the value depends on the specific application and the user's understanding of the data. $R^2$ can be calculated using the formula

$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$, (3)

where $\bar{y}$ is the mean value of the dependent variable over the entire dataset. Many other metrics can assess the effectiveness of a regression model, but these two are sufficient to get an idea of how it performs. When creating and evaluating a model, it is important to visualize the data and the model, as this can reveal key insights. Graphs can help determine whether the model is overfitting (Figure 10) or underfitting (Figure 11).
Figure 10: Example of an overfitted regression model
Figure 11: Example of an underfitted regression model
Multiclass classification is an ML task that involves classifying objects into more than two classes. Unlike binary classification, where an object can belong to only one of two classes, in multiclass classification an object is assigned to one of three or more classes. Multiclass classification models are considered versatile, as they can be applied in both supervised learning (SL) and unsupervised learning (UL), whereas regression models are primarily used in SL. Several regression models (such as logistic regression and support vector machines) are also considered classification models because they use a threshold to partition their continuous outputs into different categories. UL is widely used in practice. Although SL typically performs better and provides significant results, since we know the expected output, most of the data we collect is unlabeled, and it takes a lot of time and money for experts to review and label data.
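A compact NumPy illustration of the regression workflow and the two metrics discussed above: fit a one-feature linear model by least squares on synthetic data, then compute RMSE and $R^2$ by hand. The data, the noise level, and the 80/20 split are arbitrary choices made for the example.

```python
import numpy as np

# Synthetic data: y = 3x + 5 plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 0.5, size=100)

# 80/20 train/test split, as described in the text.
split = 80
X_tr, X_te = X[:split], X[split:]
y_tr, y_te = y[:split], y[split:]

# Fit the coefficients beta by least squares; the column of ones
# corresponds to the intercept beta_0 in formula (1).
A = np.c_[np.ones(len(X_tr)), X_tr]
beta, *_ = np.linalg.lstsq(A, y_tr, rcond=None)

# Evaluate on the test set with RMSE (2) and R^2 (3).
y_hat = np.c_[np.ones(len(X_te)), X_te] @ beta
rmse = float(np.sqrt(np.mean((y_te - y_hat) ** 2)))
r2 = float(1 - np.sum((y_te - y_hat) ** 2)
             / np.sum((y_te - np.mean(y_te)) ** 2))
print(round(rmse, 2), round(r2, 3))
```

Because the synthetic relationship really is linear, RMSE lands near the noise level (about 0.5) and $R^2$ close to 1; on real monitoring data the same two numbers reveal how much of the workload's variance the model actually captures.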
UL helps reduce costs and time by allowing models to attempt to determine labels for data and extract meaningful information from them; sometimes they can even outperform humans. The number of categories at the output of a classification model determines its type (Figure 12). If the model has only two outputs (for example, dividing email messages into "spam" and "not spam"), then it is a binary classifier. If the model has more than two outputs, it is a multiclass classifier.
Figure 12: Binary and Multiclass Classifiers
Classifiers can also be grouped by the type of learning algorithm they use: lazy learning and eager learning. Lazy learning algorithms simply store the training data and wait until they receive new test data; once they do, they classify the new data based on the stored data. These algorithms require less time during training, since new data can be continuously added without retraining the entire model, but they spend more time during classification, as they need to go through all the stored data points. The main algorithm of this type is the K-Nearest Neighbors algorithm (KNN), along with other instance-based methods such as case-based reasoning. Eager learning algorithms work in the opposite way: each time new data is added, the model needs to be retrained. Although this takes more time compared to lazy learning, querying the model is much faster, since it does not need to go through all the data points. Main algorithms of this type include: 1) Decision trees. 2) Support Vector Machines (SVM). 3) The Naive Bayes classifier. 4) Artificial Neural Networks (ANN). Neural networks can also be used for modeling and forecasting time series. The main advantage of neural networks compared to traditional methods lies in their ability to automatically detect complex dependencies in data and adapt to changes in the time series.
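The lazy-learning behavior of KNN is easy to see in a from-scratch sketch: "training" merely stores the data, and all the distance computation happens at prediction time. The dataset and class labels are synthetic, chosen only to make the majority vote visible.

```python
import numpy as np

# A from-scratch K-Nearest Neighbors classifier (a lazy learner).

class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" is just memorising the data -- no model is built.
        self.X, self.y = np.asarray(X, dtype=float), np.asarray(y)
        return self

    def predict(self, x):
        # All work happens here: distances to every stored point.
        dists = np.linalg.norm(self.X - np.asarray(x, dtype=float), axis=1)
        nearest = self.y[np.argsort(dists)[: self.k]]
        labels, counts = np.unique(nearest, return_counts=True)
        return labels[np.argmax(counts)]  # majority vote

# Two well-separated clusters of "load" observations.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = ["low", "low", "low", "high", "high", "high"]

model = KNN(k=3).fit(X, y)
print(model.predict([0.5, 0.5]), model.predict([5.5, 5.5]))  # -> low high
```

The trade-off from the text is visible directly: adding a training point is free (append to the stored arrays), but every prediction scans the whole dataset.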
TLRN (Time-Lagged Recurrent Network) networks and the NARX model (Nonlinear AutoRegressive model with eXogenous inputs) (Figure 13) are subtypes of Recurrent Neural Networks (RNNs), which are typically used for modeling time series, particularly for forecasting the next element of a time series based on previous elements. TLRN allows for considering time delays in the input data and preserving the network's previous states for better forecasting of future values.
Figure 13: Architecture of NARX type recurrent network model
Unlike a feedforward artificial neural network, where data are processed in one direction from the input to the output layer without feedback, an RNN preserves information about previous data states using feedback loops, allowing the model to retain previous states and use them for further data processing. Also prevalent are recurrent neural networks such as LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit). The LSTM architecture was specifically designed to address the vanishing gradient problem that can occur during backpropagation in traditional recurrent neural networks. The LSTM cell architecture includes three gates: input, output, and forget gates (Figure 14). The gating blocks can learn to open or close based on the input data and the previous memory state, allowing the network to selectively retain or discard information over time.
Figure 14: Architecture of LSTM Cell Network

4. Libraries for Machine Learning (ML) Models in Python
There are many programming languages used for creating ML models, including MATLAB, R, and Python. Among them, Python has become the most popular in the field of machine learning due to its versatility and the abundance of libraries [11] that facilitate the creation of machine learning models.
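The time-lagged idea behind TLRN/NARX models can be demonstrated without a deep learning framework: build a design matrix of lagged values and fit a predictor of the next element. Here a linear autoregression stands in for the neural network so the windowing mechanics stay visible; the lag count and the synthetic periodic "load" series are assumptions made for the example.

```python
import numpy as np

def make_lagged(series, lags):
    """Build (X, y) where each row of X holds the `lags` previous values
    and y is the next value of the series."""
    X = np.array([series[t - lags:t] for t in range(lags, len(series))])
    y = np.array(series[lags:])
    return X, y

# A toy periodic signal standing in for a load time series.
t = np.arange(200)
series = np.sin(0.1 * t)

# Fit a linear autoregression on 5-step windows via least squares.
X, y = make_lagged(series, lags=5)
A = np.c_[np.ones(len(X)), X]
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# One-step-ahead forecast from the last observed window.
window = series[-5:]
forecast = float(np.r_[1.0, window] @ beta)
actual = float(np.sin(0.1 * 200))
print(abs(forecast - actual) < 1e-3)  # -> True
```

A TLRN, NARX network, or LSTM replaces the linear map with a learned nonlinear one (and, for RNNs, an internal state), but the input side is the same: the model sees a window of past values and predicts the next one.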
NumPy is a key library for developing machine learning models in Python, because model creation involves a lot of work with large multidimensional arrays. Since the bulk of the work involves transformations, splitting, and complex mathematical operations on these arrays, NumPy provides fast and efficient tools for this. Matplotlib is an important library for visualizing results and model data. It provides an easy way to create plots, ranging from simple line plots to more complex ones such as contour plots and 3D plots. The popularity of this library is also due to its ease of interaction with NumPy. The Pandas library has become popular in the Python community due to its convenience and versatility in working with tabular data, such as data stored in CSV files. This library is used for data analysis and stores data in a tabular format. Users are provided with simple functions for preprocessing and manipulating data, allowing them to adapt the data to their needs. Additionally, Pandas is useful for working with time series data, which is an important aspect when building forecasting models. The TensorFlow and Keras libraries are the foundation for building deep learning models. Although both can be used independently, Keras is utilized as an interface for the TensorFlow framework, allowing users to easily create powerful deep learning models. TensorFlow, developed by Google, serves as the backend for building machine learning models. It operates by creating static dataflow graphs that specify how data moves through the model. The graph consists of nodes and edges, where nodes represent mathematical operations and edges carry the data between them as multidimensional arrays known as tensors. Keras, which was later integrated into TensorFlow, can be considered a frontend for designing deep learning models.
It was implemented with user convenience in mind, allowing users to focus on designing their neural network models without delving into the complex details of the backend. It resembles object-oriented programming, as it replicates the style of object creation: users can freely add different types of layers, activation functions, and more. They can even utilize pre-built neural networks for easy training and testing. PyTorch is another machine learning framework, created by Meta (formerly Facebook). Similar to Keras/TensorFlow, it allows users to create machine learning models. This framework is well suited for natural language processing (NLP) and computer vision tasks, but it can be configured for most applications. What makes PyTorch unique is its dynamic computational graph: it has a module called Autograd, which performs automatic differentiation dynamically, unlike TensorFlow, where the graph is static. Additionally, PyTorch is more closely aligned with the Python language itself, making it easier to understand and to combine with useful Python features such as parallel programming. SciPy is a library designed for scientific computing. It contains many built-in functions and methods for linear algebra, optimization, and integration, which are often used in machine learning. This library is useful for calculating certain statistical indicators and transformations when building machine learning models. Scikit-Learn is a machine learning library built on top of NumPy, SciPy, and Matplotlib. It includes many built-in machine learning models, such as Random Forest, k-means clustering, and Support Vector Machines (SVM).

5. Integration of machine learning (ML) into MSA
Having analyzed the core concepts and approaches used in MSA and ML, the next step is to synthesize approaches for integrating ML into the MSA system.
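Before examining the integration areas, here is a small, self-contained illustration of the NumPy-centred data preparation workflow described in the previous section: shuffling, an 80/20 split, and feature standardization. All numbers are synthetic, and the split ratio is the one quoted earlier in the text.

```python
import numpy as np

# Synthetic "monitoring" dataset: 100 samples, 3 numeric features.
rng = np.random.default_rng(42)
X = rng.normal(50, 10, size=(100, 3))
y = X @ np.array([0.2, -0.1, 0.5])

# Shuffle before splitting so both subsets are representative.
idx = rng.permutation(len(X))
X, y = X[idx], y[idx]

# 80/20 train/test split, as described in Section 3.
split = int(0.8 * len(X))
X_tr, X_te = X[:split], X[split:]
y_tr, y_te = y[:split], y[split:]

# Standardise using *training* statistics only, to avoid leaking
# information from the test set into preprocessing.
mu, sigma = X_tr.mean(axis=0), X_tr.std(axis=0)
X_tr_s = (X_tr - mu) / sigma
X_te_s = (X_te - mu) / sigma

print(X_tr_s.shape, X_te_s.shape)  # -> (80, 3) (20, 3)
```

The same arrays can then be handed directly to Scikit-Learn estimators or wrapped in Pandas DataFrames; the libraries above interoperate precisely because they all accept this array representation.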
The space for integrating machine learning into MSA systems is quite extensive and can cover many different scenarios, but four main integration areas can be highlighted [12-14]. The first area is forecasting system load, which allows determining when microservices experience higher-than-usual loads and taking measures to prevent system failures (Figure 15). Forecasting system load is a common issue when working with web services. MSA has an advantage over monolithic systems because resources are allocated separately for each microservice, simplifying maintenance and scalability. However, there are situations in MSA where a microservice experiences a high load, leading to a cascading effect where failures spread to other microservices. With the help of ML, a model can be trained on various features, such as the response times of critical microservices, and used to detect patterns in the operation of the MSA system. Similar to the Circuit Breaker, this model can promptly determine whether a microservice is under heavy load and address the issue before it becomes critical and starts negatively impacting other microservices.
Figure 15: System Load Forecasting Model
The second area is forecasting system performance degradation. This is similar to forecasting system load but has a more specific goal: to identify issues that may lead to a decrease in system performance or reliability (Figure 16). Like the system load forecasting model, we can build a model to detect anomalies in the MSA that may lead to performance degradation. Instead of focusing solely on the load of a specific microservice, it is possible to study the entire MSA and identify patterns in how it operates overall. MSA systems may experience varying loads and errors during specific times and periods. For instance, an MSA system may encounter spikes in requests during certain periods, such as holidays and seasonal events, when the number of users can sharply increase.
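Both forecasting areas ultimately feed a scaling decision. As a minimal end-to-end sketch, the following Python class pairs a naive trend forecaster with a replica-count rule; the window size, the requests-per-replica capacity, and the demand figures are illustrative assumptions, not recommendations (a trained regression or recurrent model would replace the `forecast` method in practice).

```python
import math
from collections import deque

class PredictiveAutoscaler:
    """Toy predictive autoscaler: forecast demand, derive replica count."""
    def __init__(self, window=5, rps_per_replica=100, min_replicas=1):
        self.history = deque(maxlen=window)  # recent requests-per-second
        self.rps_per_replica = rps_per_replica
        self.min_replicas = min_replicas

    def observe(self, rps):
        self.history.append(rps)

    def forecast(self):
        """Naive trend forecast: last value plus the mean recent delta."""
        h = list(self.history)
        if len(h) < 2:
            return h[-1] if h else 0.0
        trend = (h[-1] - h[0]) / (len(h) - 1)
        return h[-1] + trend

    def desired_replicas(self):
        # Scale proactively to the *forecast* load, not the current one.
        return max(self.min_replicas,
                   math.ceil(self.forecast() / self.rps_per_replica))

scaler = PredictiveAutoscaler()
for rps in [100, 150, 200, 250, 300]:  # steadily rising load
    scaler.observe(rps)
print(scaler.forecast(), scaler.desired_replicas())  # -> 350.0 4
```

The point of the sketch is the division of labour: the forecaster (here trivial, in the article's proposal an ML model) predicts demand, while a simple capacity rule turns the prediction into a scaling action before the load actually arrives.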
By allowing the model to learn how the MSA system operates over time, we can prepare it to better detect anomalies and avoid false positives. Moreover, rather than tracking individual microservices, it is necessary to evaluate clusters of microservices and how they interact with the entire MSA system. This way, it becomes possible to identify specific bottlenecks and errors that may arise at the scale of the entire system.

Figure 16: System Performance Degradation Prediction Model

The third area of ML application is system security. In an era of growing cyber threats, it is important to be able to protect the MSA system from targeted attacks. By studying the behavior of the MSA system, a model can predict and detect attacks that may threaten its security. Machine learning is already used successfully in cybersecurity, and as hacker attacks grow more sophisticated, protecting the MSA system becomes increasingly relevant. Machine learning simplifies the creation of reliable models that can analyze and predict attacks before they affect the operation of the system. For example, Denial of Service (DoS) attacks, aimed at disrupting users' access to certain services, are becoming increasingly sophisticated as technology advances. With machine learning, it is possible to recognize DoS attacks within the context of the MSA system, determine whether the system is susceptible to such attacks, and either alert the security team or implement countermeasures to combat specific attacks and maintain system integrity.

Figure 17: System Protection Model

The fourth area of ML application is system resource planning. As the system grows and evolves, it is important to allocate resources properly and adapt to the system's needs.
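The DoS-recognition idea described above can be reduced, in its simplest rule-based form, to checking each client's request rate inside a sliding time window. The sketch below is a hand-written baseline of that check, not the paper's model; the window size and request limit are assumed values.

```python
from collections import defaultdict

def flag_dos_suspects(requests, window_s=10, max_per_window=100):
    """Group request timestamps by client and flag any client whose
    request count inside a sliding window exceeds the limit.
    `requests` is a list of (client_id, timestamp_seconds) pairs."""
    by_client = defaultdict(list)
    for client, ts in requests:
        by_client[client].append(ts)

    suspects = set()
    for client, times in by_client.items():
        times.sort()
        lo = 0  # left edge of the sliding window
        for hi, t in enumerate(times):
            while t - times[lo] > window_s:
                lo += 1
            if hi - lo + 1 > max_per_window:
                suspects.add(client)
                break
    return suspects

# One client floods 200 requests in 5 seconds; another behaves normally.
traffic = [("10.0.0.5", i * 0.025) for i in range(200)]
traffic += [("10.0.0.7", float(i)) for i in range(20)]
suspects = flag_dos_suspects(traffic)
print(suspects)
```

An ML-based detector would replace the fixed threshold with a model trained on the MSA system's historical traffic, so the notion of "too many requests" adapts to normal load patterns instead of being hard-coded.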
With machine learning, it is possible to learn which services require more resources and how many resources are needed for effective system scaling (Figure 18). Part of the system's self-healing process involves allocating resources to certain microservices as the MSA system grows and expands. Over time, as the number of users increases and with it the number of requests and the load on the system, problems may be identified incorrectly and resources may be planned and allocated erroneously. However, a forward-looking ML model that tracks the gradual growth of the MSA system and identifies when certain services require additional resources helps avoid such mistakes and significantly improves system reliability, since resources can be allocated accurately and efficiently.

Figure 18: System Resource Planning Model

Conclusion

This article extensively examines the key aspects of machine learning and their integration into microservices architecture. The authors analyze various design patterns in the context of microservices architecture, such as SAGA, CQRS, API Gateway, and Circuit Breaker, identifying their advantages and drawbacks. Various classes of machine learning (ML) models are examined, from regression to multiclass classification and time series models, as well as modern Python libraries for building the ML models used in MSA. Based on the analysis of MSA and ML elements and approaches, it is concluded that integrating ML into MSA is an important step toward enhancing the efficiency and capabilities of complex and large-scale systems. Such an approach makes it possible to automate processes, improve decision-making quality, and respond to changes in real time.
Overall, this provides a well-founded understanding of the interaction between machine learning and microservices, as well as practical recommendations and examples for successfully integrating these technologies into modern systems. The material presented here on the interaction of machine learning and microservices provides a valuable foundation for the authors' further investigations in the field of intelligent web service scaling systems based on MSA using ML.

References

[1] E. Zharikov, S. Telenyk, O. Rolik, Method of Distributed Two-Level Storage System Management in a Data Center, Advances in Intelligent Systems and Computing, 938 (2020) 301–315. DOI: 10.1007/978-3-030-16621-2_28.
[2] P. Raj, A. Raman, H. Subramanian, Architectural Patterns. Packt Publishing, 2017. ISBN: 9781787287495.
[3] S. Newman, Building Microservices: Designing Fine-Grained Systems. Beijing: O'Reilly, 2021. ISBN: 978-1492034025.
[4] M. Bruce, P. Pereira, Microservices in Action. Shelter Island, NY: Manning Publications Co., 2019. ISBN: 9781617294457.
[5] A. Müller, S. Guido, Introduction to Machine Learning with Python: A Guide for Data Scientists. Sebastopol: O'Reilly Media, 2018. ISBN: 978-1-449-36941-5.
[6] J. Mueller, Machine Learning Security Principles: Use various methods to keep data, networks, users, and applications safe from prying eyes. Birmingham: Packt Publishing, 2023. ISBN: 978-1-80461-885-1.
[7] S. Raschka, Y. Liu, V. Mirjalili, Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python. Birmingham: Packt Publishing, 2022. ISBN: 978-1-80181-931-2.
[8] M. Abouahmed, O. Ahmed, Machine Learning in Microservices: Productionizing Microservices Architecture for Machine Learning Solutions. Birmingham: Packt Publishing, 2023. ISBN: 978-1-80461-774-8.
[9] Ubuntu Server - for scale out workloads, Ubuntu, 2023. URL: https://ubuntu.com/server/
[10] A. Kupin, Y. Osadchuk, R. Ivchenko, O. Gradovoy, The Methods for Training Technological Multilayered Neural Network Structures, in: CEUR Workshop Proceedings, 3013 (2021), pp. 327–333. URL: https://ceur-ws.org/Vol-3013/20210327.pdf.
[11] JetBrains, PyCharm: The Python IDE for professional developers, 2021. URL: https://www.jetbrains.com/pycharm/
[12] DBeaver Community, 2023. URL: https://dbeaver.io/
[13] MySQL, 2023. URL: https://www.mysql.com/
[14] Accelerated Container Application Development, Docker, 2023. URL: https://www.docker.com/