Architecture of intelligent system for webservices scaling
                                Andrey Kupin1, Dmytro Zubov2, Maxim Kosei1 and Vladyslav Holiver1
                                1
                                    Kryvyi Rih National University, Vitaly Matusevich 11, Kryvyi Rih, 50027, Ukraine
                                2
                                    University of Central Asia, 125/1 Toktogul Street, Bishkek, 720001, Kyrgyzstan

                                                   Abstract
                                                   The key aspects of web services development, administration structures, and the use of machine learning
                                                   technologies for server optimization are explored. The tendencies of web services development, scaling
                                                   options, importance and basic concepts of microservice architecture are considered.
                                                   The article highlights the general principles of artificial intelligence, machine learning, and deep learning
                                                   and their impact on the functionality of web services. To enhance the operation of web services, an
                                                   architecture of an intelligent system for automatic scaling is presented and machine learning algorithms
                                                   with increased reliability are elaborated. The article optimizes the performance of such a system. Methods
                                                   for detecting abnormal system behavior are proposed, which allows preventing failures or a decrease in
                                                   overall performance.

                                                   Keywords
                                                   Microservices architecture, scaling, artificial intelligence, machine learning, deep learning, pattern, API,
                                                   DevOps, CI/CD, PBW, PAD, Docker, One-Class SVM.1


                                1. Introduction
                                The research deals with the issue of developing effective models, methods, and algorithms for
                                scaling web services in modern information systems based on machine learning to ensure stable
                                operation of servers when the load changes.
                                   A detailed analysis of the relevance of the problem, the task statement, and the main research
                                directions were defined by the authors in their previous work [1]. In particular, this article presents
                                the necessary architectural and algorithmic solutions.
                                   The development trends of web services at the present stage have been significantly influenced
                                by the following events:
                                   - emergence of cloud platforms (2010s): In the 2010s, cloud-based platforms for developing web
                                services, such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform, were
                                launched. These platforms provide infrastructure and tools for deploying, managing, and scaling
                                web applications. Web analytics and performance optimization services, cloud storage, mobile
                                applications, etc. have also appeared;
                                   - spread of microservice architecture (since the 2010s): One of the current trends in web
                                development is the use of microservice architecture for web services. Instead of creating monolithic
                                applications, developers break down functionality into small, independent components that can be
                                deployed, scaled, and managed separately. This allows for greater flexibility, faster development
                                and deployment, and easier integration with other services;
                                   - expansion of the capabilities of artificial intelligence and the Internet of Things (since the
                                2010s): Recently, web services have started to use artificial intelligence to automate routine tasks,
                                analyze data, and improve user experience. Web services are also being developed to connect to the
                                Internet of Things, allowing physical devices to be controlled over the network.
                                   Preliminary analysis [2-8, 15-17] shows that there is a lack of research in this area. This is
                                especially true when it comes to identifying effective models, methods, techniques, and algorithmic
                                hardware and software for reliable management of servers on the global Internet. With this in
                                mind, the purpose of this article is to justify the choice of a rational architecture and develop

                                ICST-2024: Information Control Systems & Technologies, September 23-25, 2023, Odesa, Ukraine.
                                    kupin@knu.edu.ua (A. Kupin); dzubov@ieee.org (D. Zubov); kosei@knu.edu.ua (M. Kosei); holivervlad@gmail.com
                                (V. Holiver)
                                    0000-0001-7569-1721 (A. Kupin); 0000-0002-5601-7827 (D. Zubov); 0009-0008-3445-5675 (M. Kosei); 0000-0002-8276-
                                5992 (V. Holiver)
                                              © 2024 Copyright for this paper by its authors.
                                              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
algorithms for an intelligent web service scaling system based on Microservices Architecture
(MSA).
   The observed literature thoroughly discusses the architectural patterns in MSA and common
machine learning models. However, there is a significant gap in research regarding the application
of machine learning techniques specifically for scaling Docker-based microservices within MSA.
This article aims to address this gap by developing and justifying an intelligent system architecture
and algorithms focused on optimizing the performance and reliability of such systems.
   Traditional scaling methods for web services typically rely on threshold-based policy rules.
While effective, these methods have limitations. Incorporating machine learning into the scaling
process offers significant benefits, including improved adaptability, more efficient resource
management, and better performance prediction.

2. Types Of Scaling
There are two main types of scaling that are used to provide growth in resources and system
performance (Figure 1):
   - Vertical Scaling: This type of scaling involves increasing the capacity of hardware such as
processors, memory, and disk. With vertical scaling, one single server can handle more tasks or
process more data. For example, increasing the amount of RAM or upgrading to a more powerful
processor.
   - Horizontal Scaling: This type of scaling is adding other servers or nodes to the system. It
spreads the load across many physical or virtual servers to provide more processing power and
availability. Horizontal scaling is often used in cloud environments and distributed computing
systems.


Figure 1: The main types of web service scaling

3. Web Service Architecture
Web service architecture can be organized using two main approaches: monolithic architecture and
microservice architecture (Figure 2). Monolithic architecture is a traditional approach to web
application development in which all application features are located in a single software module,
usually a monolithic application or a monolithic server. In a monolithic architecture, all code,
database, and logic are located in a single application, which facilitates development and
deployment.
   The advantages of a monolithic architecture include ease of development and testing, no
problems with interactions between components, and reduced infrastructure costs. However,
monolithic applications can become difficult to scale and develop in large projects, and they can be
less flexible in introducing new features.
Figure 2: Monolithic and Microservices Architecture [9]

    Microservice architecture (Figure 3) is an approach where a large web application is broken
down into small, independent services that work together using lightweight communication
mechanisms such as APIs. Each service is responsible for limited functionality and has its own
database.
    Microservice architecture provides greater modularity, scalability, and flexibility in web
application development. Each service can be independently developed, scaled, and maintained. In
addition, microservices can use different technologies and programming languages, which gives
developers more freedom to choose technologies.
    However, the microservice architecture also has its challenges, including the complexity of
interactions between services, configuration, and monitoring management, and greater complexity
in implementing and managing multiple services.


Figure 3: Example of Microservices Architecture
  The advantages and disadvantages of MSA compared to monolithic architecture are discussed in
more detail in Table 1.
Table 1
Comparison of MSA with monolithic architecture
                                                                   Monolithic
 Characte
                           MSA - architecture                     architecture
   ristic
                      The high degree of autonomy. The              Lack of autonomy.
                      system functions are divided into            System functions are
  Structure
                    independent, slightly connected parts         tightly coupled in one
                         with a smaller code volume.                large block of code.
                                 Very high.                             Very limited
 Portability
                                                                        portability.
                               Highly reusable.                      Very limited code
 Reusability
                                                                        reusability.
 Modularity             Highly modular and scalable.                Limited modularity
    and                                                            and difficult to scale.
 scalability
                   The start time to market depends on the        Long time to market,
                       readiness of individual services.           especially in large
                   The more code is reused, the shorter the             systems.
                                     time.                          Shorter time to
   Time to
                      If the system's microservices are           market in small and
   market
                     developed from scratch, the time is            simple systems.
                    usually longer than for a monolithic
                                 architecture.
                        Very short release cycle, rapid           The long and typically
                   implementation of changes and updates.         very laborious release
 Release cycle
                                                                  cycle for new versions
 and updates                                                           and updates.
                   Usually high. It depends on the size of the     Typically low. They
                                      system.                     become larger in large
   Initial costs
                   Initial costs are offset by operational cost     corporate systems.
                                      savings.
   Operational       Low. Easier to maintain and operate.          High. Difficult to
     costs                                                        maintain and operate.
                                      High                                Low
   Complexity
   API control                        High                                  Low
                   Decentralized databases, so maintaining         Centralized database,
 Structural data     data integrity is more challenging.            making it easier to
    integrity                                                     maintain data integrity
                                                                  throughout the system.
  Performance                   Typically lower.                     Typically higher.

    Security                  More security issues                 Less security issues.

                      Hard to implement depending on the            Easy to implement.
 Implementation        organizational structure. Requires         Minimal organizational
   in software        adoption of flexible development and           transformation is
  development         DevOps (CI/CD, etc.). Organizational         required, if any at all.
  organization     transformation may be needed, which can
                          take a long time to achieve.
 Fault tolerance                Typically higher.                    Typically lower.
    Microservice architecture (Figure 3) is better designed for scaling than monolithic architecture
for the following reasons:
    - Decentralization. Microservices are distributed across multiple servers, which makes them
more scalable than monolithic applications that run on a single server. This means that you can
easily add or remove servers as needed to maintain the desired performance.
    - Isolation. Each microservice is isolated from the others, which means that you don't need to
scale the entire application if there is a significant load on just one microservice. This also means
that you can scale microservices independently of each other, which can be useful for cost
optimization.
    - Layer architecture. Microservices are often built using a layered architecture, which allows
you to scale the application using different technologies for each layer. For example, the storage
tier can be scaled horizontally and the processing tier can be scaled vertically.

4. Microservice Architecture (MSA)
MSA is a technique for creating a complex system from a set of smaller applications, each of which
is designed to perform a specific limited function.
    These minor applications (or services, or microservices) are developed independently of each
other and can function independently of each other. Each microservice has an API interface to
communicate with other microservices in the system.
    The way these individual microservices are organized together determines the functionality of
the larger system.
    To comprehend the value of microservices and the challenges that come with developing an
MSA, it is important to understand how microservices interact and communicate with each other.
    This interaction can be linear or non-linear.
    In a linear interaction (Figure 4), microservices transfer data to each other sequentially,
processing it in the system. Input data is always transferred to the first microservice, and output
data is always generated by the last microservice in the system.


Figure 4: Linear interaction of microservices

   In almost most existing systems, the interaction is non-linear (Figure 5).
   In a nonlinear microservice interaction, data is distributed among different functions in the
system. Input data can be passed to any function in the system, and output data can be generated
by any function in the system.
   Let's consider nonlinear interaction using a practical example in a typical e-commerce system
(Figure 6).

                                                                                    save or
update customer information. This microservice is solely responsible for managing customer
information based on the data it receives from the API call.


Figure 5: Nonlinear interaction of microservices
microservice, depending on the type of payment specified in the API call. It's worth noting here
how the payment verification process is split into two different microservices, each of which is
specifically designed for a specific payment function.
   This provides flexibility and portability of these microservices to other parts of the system or
another system if necessary.
   After the payment is processed, other microservices in the system receive API calls to fulfill the
order.
   This example shows how modular and flexible the MSA system design could be.


Figure 6: An example of a nonlinear interaction of microservices

5. Artificial Intelligence (AI), Machine Learning (ML) And Deep
   Learning (DL)
Despite the recent rise in popularity of Artificial Intelligence (AI) and Machine Learning (ML), the
field of artificial intelligence has existed since the 60s of the XX century. With the emergence of
various AI subfields, it is important to be able to distinguish them from each other and understand
what they mean and include.
    First, AI is a general field that encompasses all the subfields we see today, such as ML, Deep
Learning (DL) (Figure 7), and others.
    Any system that perceives or receives information from the environment and performs actions
to maximize rewards or achieve its goal is considered an AI system.
    This is very common in robotics today. Most of our machines are designed so that they can
collect data through their sensors, such as cameras, sonars, or gyroscopes, and use the collected
data to perform a particular task efficiently.
    This concept is very similar to how humans behave. Humans use their senses to gather
information from the environment and, based on the information they receive, perform certain
actions.
    AI is a vast field, but it can be broken down into different subfields, one of which we know
today as ML. What makes ML unique is that this field works to create systems or machines that
can learn and improve their models without explicit programming.
    ML does this by collecting data, known as training data, and trying to find patterns and
regularities in that data to make accurate predictions without being explicitly programmed to do
so. ML uses different methods to learn from data, and these methods are chosen depending on the
problems to be dealt with.
    The approaches used in ML are traditionally divided into three broad categories:
    1. Supervised Learning (SL);
    2. Unsupervised Learning (UL);
    3. Reinforcement Learning (RL).
    SL helps to understand the relationship between input and output data. One typical example of
SL is predicting the price of a house in a certain city.
    Data is collected on existing houses, namely their characteristics and current prices (training
set), and then the patterns between the characteristics of these houses and their prices are studied.
    After that, you can take a house that is not part of the training set and use the model you built
to predict its price based on its characteristics.
    UL involves learning the structure of data using grouping or clustering methods. This method is
often used for marketing purposes.
    For example, a store wants to divide its customers into different groups to effectively tailor its
products to different demographics.
    It can obtain the purchase history of its customers; study this data to determine purchase
patterns, and recommend certain products or services that might be of interest to them, thereby
maximizing its profits.
    Before looking at DL, which is a subfield of ML, it is important to understand what Artificial
Neural Networks (ANN) is.
    Taking the neurons in the brain as an example, ANNs are models that consist of a network of
interconnected nodes, also known as artificial neurons. They contain a set of inputs (Input), hidden
layers (Hidden Layer) connecting the neurons, and an output node (Output) (Figure 8). [10]
    Each neuron has an input and an output that can be transmitted throughout the network. To
calculate the neuron's output, the weighted sum of all inputs is taken, multiplied by the neuron's
weight, and usually a shift parameter is added.
    This process continues until the last layer is reached, which is the output neuron. A nonlinear
activation function, such as a sigmoid function, is applied to obtain the final prediction.
    The resulting predicted value is input into the cost function. This function shows how well our
network is learning.
    The value of the cost function is used to backpropagate errors through all layers back to the
first layer by adjusting the weights of the neurons. This allows us to create powerful models that
can perform tasks such as handwriting recognition, gaming AI, etc.
    In some cases, ANNs can be very powerful, but there are serious drawbacks that limit their
application:
    - Black Box: ANNs can be hard to interpret, making it complicated to understand how they
work and why they make certain predictions. This can make it difficult to debug ANNs and trust
their results.
    - Computational cost: Training an ANN can be computationally expensive, especially for large
and complex networks. It may require specialized hardware such as GPUs and can take a long time
to train.
    - Overfitting: ANNs are prone to overlearning, which means they can learn the training data too
well and fail to generalize to new data. This can lead to poor performance on real-world examples.


Figure 7: Relationship between AI, ML, DL
Figure 8: Artificial Neural Networks - ANN

    This is where DL comes into play.
    DL can be categorized according to the following key features:
    - Hierarchical composition of layers: Instead of having only fully connected layers in the
network, we can create and combine several different layers consisting of nonlinear and linear
transformations. These different layers play a role in extracting key features in the data that would
otherwise be difficult to find in an ANN.
    - End-to-end training: The network starts with a method called feature extraction. It analyzes
the data and finds a way to group redundant information and identify important features of the
data. The network then uses these features to learn and make predictions or classifications using
fully connected layers.
    Distributed representation of neurons: With feature extraction, the network can group neurons
to encode a larger feature of the data. Unlike ANNs, no single neuron encodes everything. This
allows the model to reduce the number of parameters it has to learn from while retaining key
elements in the data.
    DL is widely used in computer vision. Due to advances in photo and video capture technology,
it has become very difficult for ANNs to learn and recognize images with high accuracy. The
reason is that when using an image to train a model, you need to consider each pixel as an input
parameter of the model. For example, a 256x256 image has more than 65,000 input parameters.
Depending on the number of neurons in a fully connected layer, the number of parameters can
reach millions. With such a large number of parameters, there is a chance of overfitting and
training can take a very long time. With DL, you can create a group of layers called Convolutional
Neural Networks (CNNs). These layers are responsible for reducing the number of parameters that
the model needs to learn while preserving the key features of our data. With these additional
elements, we can learn how to extract certain features and use them to train our model with high
efficiency and accuracy (Figure 9).


Figure 9: Convolutional Neural Networks      CNN
6. Algorithms of AI Web Service Scaling System Based on MSA
Commonly used for web service autoscaling is the Threshold Rules Policy, which consists in
setting certain thresholds or limits that, when exceeded or reached, cause resources to
automatically scale to ensure optimal system performance and reliability.
   The use of ML techniques can greatly improve web service autoscaling strategies, especially for
large and complex systems, as such systems often have a large number of parameters and loads
that change in a very dynamic way.
   In such systems, patterns emerge that are the result of recurrence or similarity in the data,
interactions between system components, or the way the system processes the data.
   These patterns can be detected using various machine learning algorithms, which can
significantly improve the efficiency and scalability of the system, as well as ensure that the system
as a whole performs more optimally.
   As mentioned, there are many areas in the MSA system where artificial intelligence can be used.
   The focus will be on two main potential areas of improvement (Figure 10), which are
implemented by individual additional AI services. The first is to increase the system's response
speed in the event of microservice failure or performance degradation. The second area of
improvement is the introduction of the proactive role of the Circuit Breaker.


Figure 10: PBW and PAD - Artificial Intelligence Microservices for Enhancing Reliability and
Manageability of MSA System

    The first AI microservice is called Performance Baseline Watchdog (PBW). PBW is an ML
microservice that determines whether the performance of each microservice in the system meets
expectations. If the performance of a microservice falls below the expected level by a certain
amount, PBW sends an alert to operations support or network management systems. If
performance falls even further, PBW sends an alert to the Operation Support System (OSS) or
Network Management System (NMS) and can take action to automatically correct the problem.
    The second artificial intelligence microservice is the Performance Anomaly Detector (PAD).
PAD is a machine learning service that covers the entire MSA system. It analyzes MSA
performance patterns and tries to detect any unusual behavior. PAD finds problematic patterns in
the behavior of microservices, automatically detects problems before they occur, and proactively
acts to resolve them.
    The PBW algorithm calculates the expected performance based on the collected performance
statistics. The collected performance statistics include API response time statistics, errors or error
rates of individual microservices, API response codes, and the load applied to the microservice
itself. Predefined actions are triggered depending on how much the microservice deviates from the
calculated performance indicator. Based on the PBW configuration, the larger the deviation, the
more likely it is that a proactive action will be initiated to try to self-heal. However, in the case of a
minor deviation, no self-healing action should be triggered - a system warning informing the
system administrator is sufficient.
Table 2
Challenges of the MSA System and PBW.
             Problem                             Action triggered by PBW

                               Scaling microservice vertically or horizontally or restarting
         Slow response or
             timeouts                                    microservice
                                      Checking the status of Apache, Flask, JVM, Docker
          HTTP response
                                      volumes, SQL service, etc. Restarting the service if
             errors
                                                          necessary.
         Microservice does                 Restarting the microservice container.
        not respond (turned
                off).


Figure 11: Self-healing Microservices Algorithm

   Table 2 shows some of the possible system problems [11-12] that can be encountered during
system operation and the actions that the PBW service will take to try to fix the problem, and
Figure 11 and Table 3 show the microservice self-healing algorithm.

Table 3
Explanation of Terms for Figure 11.
            Term                                        Description
       Healing Action                            Action taken to fix a failure.
                             State of the microservice where only the self-healing algorithm
     Healing Lock State
                                      can interact with the problematic microservice.
                              The time to wait when a treatment fails before retrying. The
      Retry Wait Period
                                   default timeout period before retrying is 2 minutes.
                            The state in which the microservice is marked as unhealable after
       Unhealable State
                                               its failed attempt to heal itself.
      Maximum Healing            The maximum number of attempts made to rectify the
          Attempts                     microservice before marking it as unhealable.
   PBW uses a linear regression model for training and prediction, while PAD uses a One-Class
Support Vector Machine (One-Class SVM [13-14]).
   Compared to traditional support vector machines, which are used for classification tasks where
the data is labeled, One-Class SVM is designed for situations where only one class of data is
available (unlabeled data). Its main goal is to identify and classify normal data points from outliers
or anomalies [15-17].

Conclusion
The paper researches the main aspects of web services development, their structure, and the
impact of ML on this field. In particular, the paper considers web services development trends,
scaling options, importance, and basic concepts of MSA architecture.
    The general principles of artificial intelligence, machine learning, and deep learning and their
impact on the functionality of web services are also covered. This helps to understand what
technological innovations are used to improve the performance of web services and how machine
learning changes their capabilities.
    The described approach provides a general idea of the basic principles and trends of web
services development and the impact of machine learning on this industry. This is an important
basis for further research and implementation of innovations in the field of web services and their
connection with machine learning.
    Implementation of ML methods in web service autoscaling can provide significant benefits and
improve the efficiency of the MSA system.
    ML is especially useful for large and complex systems, as it enables the detection of patterns in
data and the interaction of system components.
    The proposed PBW and PAD algorithms provide the following advantages:
    1. Improved system reliability: These algorithms allow the system to respond to deviations in
microservice performance and detect anomalies, even before they occur. This allows system
operators to take action to fix problems faster and more efficiently, increasing overall system
reliability.
    2. Performance optimization: Rapid problem detection and automatic correction avoids loss of
productivity. Timely response to abnormalities helps maintain system stability and optimal
performance, which in turn improves productivity.
    3. Preliminary detection of problems: PAD helps detect anomalous patterns or unusual
behavior before they can cause serious problems. This allows the system to prevent failures or
performance degradation, enabling operators to prepare for potential problems and prevent them
from spreading.

References
[1] S. Semerikov, D. Zubov, A. Kupin, M. Kosei, V. Holiver, Models and Technologies for
    Autoscaling Based on Machine Learning for Microservices Architecture (2024), in: CEUR
    Workshop Proceedings, 2024, 3664, pp. 316 330. URL: https://ceur-ws.org/Vol-
    3664/paper22.pdf
[2] P. Raj, A. Raman, H. Subramanian, Architectural Patterns. Packt Publishing, 2017.
[3] S.Newman, Building microservices: Designing fine-
             , 2021.
[4] M. Bruce, P. Pereira, Microservices in action. Shelter Island, NY: Manning Publications Co.,
    2019.
[5] A. Müller, S. Guido, Introduction to machine learning with python: A guide for data scientists.
                                , 2018.
[6] J. Mueller, Machine learning security principles: Use various methods to keep data, networks,
    users, and applications safe from Prying eyes. Birmingham: Packt Publishing, 2023.
[7] S. Raschka, Y. Liu, and V. Mirjalili. Machine learning with pytorchand Scikit-Learn: Develop
    machine learning and deep learning models with python. Birmingham: Packt Publishing, 2022.
[8] M. Abouahmed, and O. Ahmed. Machine learning in microservices: Productionizing
     Microservices Architecture for Machine Learning Solutions. Birmingham: Packet Publishing,
     2023.
[9] Ubuntu server - for scale out workloads Ubuntu, 2023. URL: https://ubuntu.com/server/
[10] A. Kupin. Application of neurocontrol principles and classification optimisation in conditions
     of sophisticated technological processes of beneficiation complexes (2014), in: Metallurgical
     and Mining Industry, 2014, 6(6), pp. 16 24. ISSN: 20760507.
[11] J. Brains, PyCharm: The python IDE for professional developers by jetbrains, JetBrains, 2021.
     URL: https://www.jetbrains.com/pycharm/
[12] DBeaver Community, 2023. URL: https://dbeaver.io/
[13] MySQL, 2023. URL: https://www.mysql.com/
[14] Accelerated Container Application Development, 2023 Docker. URL: https://www.docker.com/
[15] A. Davis, Bootstrapping Microservices with Docker, Kubernetes, and Terraform: A project-
     based guide. Manning, 2021.
[16] S. Wells, Enabling Microservice Success: Managing Technical, Organizational, and Cultural

[17] C. Richardson, Microservices Patterns: With examples in Java. Manning, 2018.