Distribute load among concurrent servers ⋆

Denys Bakhtiiarov1,2,*,†, Bohdan Chumachenko1,†, Oleksandr Lavrynenko1,†, Volodymyr Chupryn1,† and Veniamin Antonov1,†

1 National Aviation University, 1 Kosmonavta Komarova ave., 03058 Kyiv, Ukraine
2 State Scientific and Research Institute of Cybersecurity Technologies and Information Protection, 3 Maksym Zaliznyak, 03142 Kyiv, Ukraine

Abstract
A technical implementation option for load balancing among concurrently operating application servers is proposed to mitigate the risk of overload amid substantial unpredictable fluctuations in the request flow entering the application system and the variable processing durations of each application server. A structural-functional model for load balancing within the server line of the application system is delineated, designed to operate under conditions where the incoming request flow from clients is random, unexpected, non-stationary, and pulsing. A scheme is proposed for generating the flow of requests to the application server line that aligns the stationary intervals of this flow with the intervals of discrete control used to equalize server load factors. A technological framework for load balancing on application servers is proposed that equalizes the load factors of the application system's servers by redistributing, in real time, a portion of the incoming request traffic from more heavily loaded servers to those with lighter loads.

Keywords
request, application, server, client, load balancing

1. Introduction
In practice, when utilizing computerized application systems of the "client/server" type that permit remote access for clients via the Internet, such as various interactive help systems, effectiveness is assessed by the value of τs, the average service duration of each stream of customer requests entering the application system input. A lower value indicates that the consumer is likely to receive a response to their request more promptly [1]. At low request flow intensities, queues at the application system's input are virtually nonexistent, making τs directly contingent upon the performance of the server hardware hosting the application software. Issues occur when the volume of incoming requests is misaligned with the processing speed of the server infrastructure, leading to the accumulation of unprocessed requests, which in turn results in an unacceptable increase in service duration and, in certain instances, the loss of some requests. Given the high intensity of the request flow in several applications, it is essential to partition it in real time into parallel demultiplexed substreams and execute their concurrent online processing on a series of application servers with identical functionality, as illustrated in Fig. 1.

Before a user's request is processed by an application server, it is first received by the request redirection server (step 1), which employs a dedicated block to ascertain the number of the application server designated for the request and allocates the request stream in real time between the line servers (steps 2 and 3) according to the distribution strategy outlined below. The request redirection server transmits the IP address of the selected application server, as determined by the distribution method, to the user terminal (step 4), and then readies itself to handle a new request from another user, returning to step 1. The user utilizes the IP address of the designated application server to retrieve the result of processing his request from that server online (step 5). The designated server resolves the application task and transmits the outcome to the user (step 6) [2].

Specifically, Fig. 1 illustrates that a series of specialized application software and hardware servers process client requests concurrently. The number of servers in the configuration should be chosen so that the request traffic intensity is aligned with the application system's performance. Nonetheless, the problem becomes intricate when addressing an erratic and unpredictable influx of requests characterized by substantial fluctuations in both intensity and duration. In this scenario, due to erratic variations in request volume and the uncertain processing times of the application servers, these servers, in the absence of specific interventions, experience uneven and arbitrary loading: some servers become overloaded and consequently lose requests, while others remain underutilized. Unforeseen variations in the volume of requests directed to any application server can impede request processing due to potential transient server overloads.

CPITS-II 2024: Workshop on Cybersecurity Providing in Information and Telecommunication Systems II, October 26, 2024, Kyiv, Ukraine
∗ Corresponding author.
† These authors contributed equally.
bakhtiiaroff@tks.nau.edu.ua (D. Bakhtiiarov); bohdan.chumachenko@npp.nau.edu.ua (B. Chumachenko); oleksandrlavrynenko@tks.nau.edu.ua (O. Lavrynenko); volodymyr.chupryn@npp.nau.edu.ua (V. Chupryn); veniamin.antonov@npp.nau.edu.ua (V. Antonov)
0000-0003-3298-4641 (D. Bakhtiiarov); 0000-0002-0354-2206 (B. Chumachenko); 0000-0002-3285-7565 (O. Lavrynenko); 0000-0001-9412-7413 (V. Chupryn); 0000-0003-2244-262X (V. Antonov)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org, ISSN 1613-0073)

Figure 1: Generalized structural and functional model for the allocation of user requests among application servers

Consequently, there is both theoretical and practical interest in developing a mechanism for load balancing on application servers, specifically a dynamic load-balancing approach among collaborating application servers in real time. This method aims to avert potential short-term overloads of individual application servers during operation, thereby fostering the sustainable functioning of the application system amid uncertainties in the dynamics of the aforementioned environmental factors. The suggested technique must assure the stability of the request distribution process, considering the dynamics of unforeseen fluctuations in this flow. The theoretical foundation of this strategy is explained in [3–5]. This paper presents a potential option for its technical implementation, the core of which is as follows. The application system hardware depicted in Fig. 1 comprises a software server (the request redirection server together with the server definition unit) that concurrently and autonomously manages multiple application servers. This software server facilitates a real-time adaptive distribution of requests among the application servers to maintain a more uniform load during unpredictable surges in request flow.

2. Main Part

The theoretical foundation of the employed load balancing method is delineated in [1, 2, 6]. This paper presents a potential option for its technical implementation, the core of which is as follows. The application system comprises a series of application servers that must function concurrently and autonomously, together with a software server that facilitates real-time adaptive distribution of the request flow among the application servers to achieve more or less uniform load balancing. The parameters of the examined load balancing technology are established through the resolution of the boundary value problem associated with the analytical design of the relevant regulator, utilizing the synthesis of the corresponding R. Bellman functional and iterative numerical integration of the derived tuning equation. The implemented technical solution facilitates nearly uniform loading of server equipment under the specified conditions while maintaining an acceptable average waiting time for service requests with the minimal necessary server resources.
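The six-step redirection flow described above can be sketched as a minimal redirector that hands the chosen server's IP address back to the client. This is only an illustration: the class name, the `handle` method, and the server addresses are assumptions, and plain round robin merely stands in for the adaptive distribution method the paper develops.

```python
from itertools import cycle

# Hypothetical pool of application servers with identical functionality;
# the addresses are illustrative placeholders, not from the paper.
SERVERS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

class RedirectionServer:
    """Step 1: accept a request; steps 2-4: pick an application server
    and return its IP to the client, then become ready for the next user.
    The client then contacts that server directly (steps 5-6)."""

    def __init__(self, servers):
        self._next = cycle(servers)  # simplest stand-in strategy: round robin

    def handle(self, user_id: str) -> str:
        # In the paper the choice is made by an adaptive distribution
        # method; round robin substitutes for it in this sketch.
        return next(self._next)

balancer = RedirectionServer(SERVERS)
print([balancer.handle(f"user{i}") for i in range(4)])
# cycles through the pool: 10.0.0.11, 10.0.0.12, 10.0.0.13, 10.0.0.11
```

In a real deployment the redirector would answer over HTTP (e.g., with a 302 redirect) rather than return a string, but the control flow is the same.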
2.1. System model for load balancing on servers

This work introduces a structural and functional model for load balancing throughout the server line of the application system, designed to operate under conditions where the incoming request flow from clients is random, unexpected, non-stationary, and pulsing. Server load balancing entails the real-time redistribution of incoming request flows from heavily loaded application servers to those with lighter loads, thereby achieving a more uniform distribution of load across the servers. Fig. 2 illustrates this model as a series of numbered blocks, each representing a certain functional component of the model's structure [7].

Figure 2: Structural and functional paradigm for load balancing between concurrently operating servers of the applied information system

Fig. 2 uses the following designations for functional blocks: 1—smoothing of the input request stream; 2—creation of quasi-stationary segments of incoming request traffic at time intervals ∆ti, the smoothing steps (the formation process is executed as a stepwise iterative procedure with step ∆ti while monitoring fluctuations in the intensity of the incoming request flow); 3—demultiplexing of the resulting input stream of requests at each smoothing interval ∆ti; 4—configurator of the smoothing and alignment procedures (i.e., of the process of equalizing the current load factors of the application servers shown in Fig. 2), executed by software-controlled clock generators; 5—assessment of the current intensity of the generated input request stream at each smoothing interval ∆ti; 6—buffering of requests (establishing a queue of requests for processing by the i-th application server) at the input of the i-th application server; 7—evaluation of the current load factor of the i-th application server at each alignment step; 8—determination of a singular matrix of regulatory relationships among the variables to be aligned (i.e., between the load factors of the servers) at each alignment step; 9—determination of the precise values of the resource share (i.e., the number of requests) to be allocated among the input queues of the application servers at each alignment step; 10—processing of the relevant application task; A—incoming request stream; B—generated flow of requests; C—request substreams after demultiplexing.

Fig. 2 illustrates that, to create quasi-stationary traffic segments, the non-stationary incoming request stream is first smoothed and structured accordingly. The generated input stream is demultiplexed, and the resulting parallel substreams are allocated to the application system's servers according to the established load-balancing method. The primary objective of balancing is to approach as closely as possible a uniform load across the application system servers. In other words, under conditions of unpredictable fluctuations in incoming traffic and varying request processing times on each server, the balancing algorithm must operate so that the generated quasi-stationary traffic segments yield approximately equal load factors across all servers.

The model illustrated in Fig. 2 is founded on the adaptive principle of reallocating demultiplexed subflows of requests among application servers through real-time monitoring of fluctuations in the current intensity of the incoming request stream and the existing load levels of the application servers. Consequently, this paradigm necessitates the real-time implementation of the following three processes:

1) The establishment of an incoming request flow with a more uniform temporal distribution, thereby preventing short-term overloads in the application server line.
2) The demultiplexing of the incoming request stream into several concurrently operating subflows, their number corresponding to the number of application servers in the line.
3) The equalization of the current application server load factors, which diminishes the likelihood of short-term overload on any individual server.

Let us examine the characteristics of each of these processes.

2.2. Establishment of the incoming request flow

For the proper functioning of this load-balancing method, the incoming request traffic must be transformed into a series of quasi-stationary segments representing a discrete random process, which can be partially refined by specialized averaging techniques. The load balancing technology on the application system's servers necessitates the accurate structuring of the request flow, specifically maintaining consistency between the stationary intervals of this flow, ∆Ts, and the intervals of the discrete control process for equalizing server load factors, τk. Some traffic shaping technologies do not allow for this. The "token bucket" method [6, 8] has a notable constraint on its applicability: it is suitable solely for scenarios where the actual traffic exhibits the traits of a stationary random process.
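For reference, the classic token bucket in its stationary form, the one the text says must be modified, can be sketched as follows. The rate and capacity values, and the idea of driving it with explicit timestamps, are assumptions of this sketch.

```python
import time

class TokenBucket:
    """Classic token bucket: tokens accrue at a fixed rate up to a capacity;
    a request passes the gateway only if a token is available. This is the
    stationary-traffic form, with a rate that never adapts to the flow."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens per second (fixed in this form)
        self.capacity = capacity  # bucket depth, which bounds burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # replenish tokens for the elapsed interval, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # request stays in the input queue

bucket = TokenBucket(rate=2.0, capacity=2.0)
bucket.last = 0.0  # use explicit timestamps so the behavior is deterministic
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0)])
# [True, True, False] — a burst of two passes, the third is held back
print(bucket.allow(1.0))
# True — after 1 s at rate 2, tokens have accrued again
```

The fixed `rate` is exactly the limitation discussed next: for a non-stationary flow there is no single correct rate.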
Nevertheless, actual traffic and its derivatives must be regarded as a non-stationary discontinuous process, rendering the direct application of the "token bucket" method, along with other established traffic shaping techniques, in adaptive load redistribution systems on servers largely unjustifiable. This study presents a structural and functional framework for the formation of the request flow, intended as a component of adaptive load-balancing technology for parallel servers within the application system. This framework is illustrated in Fig. 3.

Figure 3: Structural and functional diagram of the request processing pipeline by a series of application servers

Fig. 3 employs the following designations for functional blocks: 1—the request queue buffer at the input of the application system (i.e., the input request storage); 2—the setter (generator) defining the size of the smoothing step; 3—the meter of the number of requests received at the input of the balancing system during a single smoothing step; 4—the generator of virtual events permitting a request to pass through the gateway (the token generator); 5—the repository of virtual events for requests sent through the gateway (the "bucket of tokens"); 6—the gateway routing requests to the input of the demultiplexer; 7—the demultiplexer of the input stream of requests.

The implementation of this traffic processing scheme is warranted if it can transform a non-stationary flow, marked by unpredictable average speeds and fluctuating volumes, into a series of quasi-stationary process segments with defined maximum current thresholds. This transformation enables the implementation of discrete control. The token bucket technique is extensively discussed in the literature, albeit within rather limited domains of applicability. The operational architecture of this algorithm is therefore altered to facilitate its integration into the load-balancing system circuit.

Fig. 3 illustrates that the foundation of this approach is the token bucket method, but with adjustments and enhancements that facilitate its application to non-stationary request flows. In this scenario, the request gateway 6 functions as a lock, allowing requests from the input queue to pass to the demultiplexer only when the fill level of the "bucket" of virtual events permits a request to traverse it, thereby achieving the average flow rate at the current smoothing step. The rate of the token generator 4 is contingent upon the intensity of the incoming request stream: based on the intensity measurements conducted by meter 3 at each smoothing step, the token generator is reconfigured. Consequently, we acquire quasi-stationary segments of the generated request flow. The applicability of this traffic shaping strategy is restricted to instances where it is possible to:

1) establish time intervals, referred to as stationary intervals (∆Tc), during which the average flow rate (Rc) at the input of the load balancing system remains almost constant;
2) ensure a regulated magnitude of pulsations in the smoothed stream of requests.

2.3. Demultiplexing the incoming request stream

Demultiplexing the incoming request stream from the application system's clients is essential when the performance of a single application server is inadequate to process this stream effectively, necessitating the utilization of multiple parallel application servers with identical functionality. One can select from many ways of demultiplexing the stream. The most straightforward option is to allocate requests from the incoming stream uniformly across the application system's servers. In this instance, however, the disparity in request processing times would result in certain servers experiencing temporary overloads, leading to request losses, while other application servers operate under capacity. Consequently, it is prudent to demultiplex the input stream precisely as described below.

2.4. Model training

The processing time of each request is an unpredictable variable, resulting in real-time fluctuations of the application server load factors. Under these circumstances, balancing the server load factors is recommended. Fig. 4 illustrates the structural and functional framework of load balancing on application servers.

Figure 4: Structural and functional framework for load balancing on application servers

Fig. 4 uses the following designations for functional blocks: 1—the setter (generator) of the alignment step magnitude; 2—the buffer of the request queue at the application server input; 3—assessment of the current value of the application server load factor (evaluations are conducted at each alignment step); 4—calculation of the determinant of the matrix of regulatory connections among the application servers (resulting from the resolution of the configuration equation); 5—computation of the resource share ∆ (specifically, the number of requests to be redistributed among the application servers at each alignment step).
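One alignment step built from these blocks can be sketched with a greedy stand-in: measure each server's load factor, then shift a small batch of queued requests from the most loaded server to the least loaded one. This is only an illustration; the paper's actual regulator derives the redistribution matrix from the Bellman/Riccati synthesis, and the function names, bandwidth figures, and batch size here are assumptions.

```python
from collections import deque

def load_factor(queue, bandwidth):
    """Load factor of one application server: queued requests relative
    to its bandwidth (requests it can serve per alignment step)."""
    return len(queue) / bandwidth

def alignment_step(queues, bandwidths, batch=1):
    """One iteration of the equalization procedure: move up to `batch`
    requests from the most loaded server's queue to the least loaded
    one, shrinking the pairwise load-factor discrepancies."""
    factors = [load_factor(q, f) for q, f in zip(queues, bandwidths)]
    src = max(range(len(queues)), key=lambda i: factors[i])
    dst = min(range(len(queues)), key=lambda i: factors[i])
    moved = 0
    while moved < batch and queues[src] and factors[src] > factors[dst]:
        queues[dst].append(queues[src].popleft())
        moved += 1

# three servers with equal bandwidths and deliberately uneven queues
queues = [deque(range(9)), deque(), deque(range(3))]
bandwidths = [3.0, 3.0, 3.0]
for _ in range(4):
    alignment_step(queues, bandwidths, batch=1)
print([len(q) for q in queues])  # [5, 4, 3] — the spread narrows each step
```

Each step leaves the total number of queued requests unchanged and monotonically reduces the sum of squared load-factor differences, which is the efficiency criterion the paper states for the balancing procedure.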
The load balancing process is a deliberate iterative procedure for the real-time redistribution of requests among the request queue buffers at the inputs of the application servers. A specific quantity of requests is extracted from one server's queue and transferred to another server's queue according to the established alignment procedure. This redistribution aims to diminish the disparity between the load factor values of the servers comprising the line, thereby facilitating load balancing across each server in the line. The technique operates so that at each alignment step, determined by setter 1, it ascertains from the measured current load values of each server the current state of the control link matrix 4 (as a result of the incremental solution). This matrix delineates the direction of request redistribution across server pairs, while the resource share determined by block 5, derived from measurements of the current incoming request traffic intensity, specifies the number of requests to be transferred from one server to another.

This publication does not include a formal synthesis of the adaptive system controller that executes load balancing on the application servers; such a synthesis was conducted in [1]. The principles of analytical regulator theory are presented in [9–14]. Only the following should be noted. The objective of synthesizing an adaptive controller for a specified quantity of application servers is to mitigate the risk of server equipment overload and to maintain the stability of the load balancing process amidst the unpredictable duration of request processing by each server. The synthesis of such a regulator pertains to the established boundary value problem of analytically designing regulators that minimize the R. Bellman functional, within the realm of continuous dynamic control systems for objects described by ordinary first-order linear differential equations. The application of the synthesis results facilitated a more uniform loading of the server equipment and ensured the requisite stability and duration of the balancing procedure despite the aforementioned unanticipated events. The trajectory of traffic flow regulation is dictated by the suitably constructed R. Bellman functional. Trends in the variations of the processed flow intensity on the servers are monitored through the incremental integration of the relevant differential tuning equation. In the analytical design of the controller, the structure of the Bellman function was defined, enabling the formulation of the tuning equation, the specification of the function, and the derivation of the appropriate Bellman equation. The task of designing the controller is thus reduced to solving the Riccati equation, a matrix quadratic equation essential for determining the matrix component of the Bellman function. Substituting the identified matrix into the control expression yields the final formulation of the required controller. The regulator is synthesized to maintain a consistent trajectory of state changes in the phase space C2 of the regulation object, adhering to defined quality parameters of the transient process. The controller must observe both the variations in the intensity of the incoming request flows and the dynamics of the transient process of load factor equalization, so as to minimize control errors while respecting constraints that maintain the stability of the control system.

The initial parameters of the equalization system are the number of servers in the line and the attenuation coefficient α of the Bellman function. The design of this regulator must address the following inherent physical restrictions.

Physical constraint 1:

s1 + s2 + s3 + … + sn ≤ F. (1)

Here F represents the total bandwidth of the application server line, F = f1 + f2 + f3 + … + fn = const, where f1, f2, f3, …, fn are the server bandwidths and s1, s2, s3, …, sn are the intensities of the request flows at the inputs of the application servers.

Physical constraint 2: the unpredictability of request flow ripples.

Physical constraint 3: the ambiguity of the processing duration of each specific request by each application server.

The efficiency of the load balancing procedure on the servers is, from a physical perspective, measured by the sum of the squares of the discrepancies between the load factors of each pair of application servers. This quantity should be minimized: a value of zero indicates that the load factors of all servers in the line are identical. Adhering to the aforementioned constraints decreases the risk of server traffic overflow.

2.5. Essential Factors for Operating PHP Applications Across Multiple Servers

Having addressed load balancing, the next pertinent inquiry is: how are sessions managed? Sessions enable programs to circumvent the stateless nature of HTTP and retain information across multiple requests (e.g., authentication status and shopping cart contents). By default, PHP retains sessions on the disk of the server that processes the user's request. For instance, when User A submits a request to Server B, a session for User A is established and retained on Server B (Fig. 5) [11].

Figure 5: Basic load balancer schematic

Nonetheless, when requests are distributed among numerous servers, this setup is likely to lead to malfunctioning functionality. For instance, consumers may discover their shopping cart is unexpectedly empty midway through the process; they may be arbitrarily redirected to the login page; or they may find that all their responses in a survey have been erased while completing it. Two alternatives exist to mitigate this: centrally stored sessions and sticky sessions.

Centrally Stored Sessions. Sessions may be centrally stored in a caching server (e.g., Redis or Memcached), a database (e.g., MySQL or PostgreSQL), or a shared filesystem (e.g., NFS or GlusterFS). The optimal choice among these is a caching server, for two reasons: caching servers are in-memory key-value stores, providing superior responsiveness compared to SQL databases; and sessions are written once, at the conclusion of a request, whereas an SQL database would be written to on each request, which may result in table locking and sluggish write operations. When centrally storing sessions, it is imperative to ensure that the session store does not become a single point of failure. This can be circumvented by configuring the store in a clustered arrangement; consequently, if one server in the cluster fails, it is not catastrophic, as another can be incorporated to substitute it [15].
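The centrally stored session flow can be sketched as follows. A plain dictionary stands in for the Redis/Memcached backend so the example stays self-contained, and the `SessionStore` interface is an assumed illustration, not a real library API.

```python
import json

class SessionStore:
    """Centralized session store shared by all application servers.
    A dict stands in for the caching cluster here; in production the
    same load/save calls would go to Redis or Memcached instead."""

    def __init__(self):
        self._backend = {}  # session_id -> serialized session data

    def load(self, session_id: str) -> dict:
        raw = self._backend.get(session_id)
        return json.loads(raw) if raw else {}

    def save(self, session_id: str, data: dict) -> None:
        # written once, at the end of the request, as the text notes
        self._backend[session_id] = json.dumps(data)

store = SessionStore()

# Request 1, handled by "Server A": the user logs in and adds an item.
session = store.load("sess-42")
session.update(user="alice", cart=["book"])
store.save("sess-42", session)

# Request 2, handled by "Server B": the cart is still there.
print(store.load("sess-42")["cart"])  # ['book']
```

Because both requests read and write the same shared store, the balancer is free to route them to different servers without losing session state.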
Persistent Sessions. An alternative to session caching is session stickiness, also known as session persistence: user queries are routed to the same server for the duration of their session. Although it may initially appear to be a wonderful concept, there are various possible downsides: Will hot spots emerge within the cluster? What occurs when a server is inaccessible, overloaded, or requires an upgrade? Consequently, I do not endorse this strategy.

3. Conclusions

In several application systems, such as "client/server", which exhibit high traffic intensity, the processing of client requests is executed by a series of concurrently operating application servers. Owing to the erratic fluctuations in request flow and the variable duration of their processing by application servers, these servers, unless specific measures are implemented, experience random and uneven loading: some servers become overloaded and consequently lose requests, while others remain underutilized. In [1], a formal balancing method was developed to avert potential short-term overloads of application servers during their operation, thereby promoting the sustainable functioning of the application system amidst uncertainties in the dynamics of the aforementioned factors. This study presents a potential option for the technical implementation of this strategy.

The structural-functional model of load balancing for the application system's server line is delineated, designed to operate in conditions where the incoming request flow from clients is random, unexpected, non-stationary, and pulsating. The model utilizes the adaptive principle of reallocating demultiplexed request sub-streams across application servers through real-time monitoring of fluctuations in the incoming request stream intensity and the current load levels of the application servers. This paradigm necessitates the implementation of the following three processes:

1) Establishment of the incoming request flow to prevent short-term server line overloads.
2) Demultiplexing the incoming request stream into multiple parallel substreams based on the number of application servers in the line.
3) Equalization of the current load factor values of application servers.

The formation of an incoming request stream to the application server line is examined. It is demonstrated that the proper functioning of this load-balancing method requires the incoming request traffic to be converted into a sequence of quasi-stationary segments representing a discrete random process. It is essential to align the intervals of stationarity of this request flow with the intervals of the discrete control steps for equalizing the load factor values of application servers. A modification of the established technological approach to packet traffic shaping, referred to as the "token bucket", is proposed. The token generator's rate is determined by the intensity of the incoming request stream: based on the intensity measurements conducted by the meter at each smoothing step, the token generator is calibrated. Consequently, we acquire quasi-stationary segments of the generated request flow.

A technological technique for load balancing on application servers has been created, characterized as a deliberate iterative procedure for the real-time redistribution of requests stored in the buffers of the request queues at the entry points of each application server. This redistribution aims to diminish the disparity between the load factor values of the servers constituting the line. The implemented balancing algorithm enables a specified number of application servers to mitigate the risk of short-term server overloads and ensures the stability of the load-balancing process amidst the unpredictable duration of request processing by each server.

References

[1] D. Bakhtiiarov, G. Konakhovych, O. Lavrynenko, An Approach to Modernization of the Hat and COST 231 Model for Improvement of Electromagnetic Compatibility in Premises for Navigation and Motion Control Equipment, in: 5th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC) (2018) 271–274. doi: 10.1109/MSNMC.2018.8576260.
[2] F. Xia, et al., Community-based Event Dissemination with Optimal Load Balancing, IEEE Trans. Comput. 64(7) (2015) 1857–1869.
[3] A. Nahir, A. Orda, D. Raz, Schedule First, Manage Later: Network-Aware Load Balancing, Proc. IEEE INFOCOM (2013) 510–514.
[4] J. Doncel, S. Aalto, U. Ayesta, Economies of Scale in Parallel-Server Systems, Proc. IEEE INFOCOM (2017) 1–9.
[5] O. Veselska, et al., A Wavelet-Based Steganographic Method for Text Hiding in an Audio Signal, Sensors 22(15) (2022) 5832.
[6] R. Odarchenko, et al., Empirical Wavelet Transform in Speech Signal Compression Problems, in: IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&T) (2021) 599–602. doi: 10.1109/PICST54195.2021.9772156.
[7] D. S. Boger, J. S. Fraga, E. Alchieri, Reconfigurable Scalable State Machine Replication, LADC (2016) 1–8.
[8] N. Santos, A. Schiper, Achieving High-Throughput State Machine Replication in Multi-Core Systems, ICDCS (2013).
[9] O. Lavrynenko, et al., Protected Voice Control System of UAV, in: IEEE 5th International Conference Actual Problems of Unmanned Aerial Vehicles Developments (APUAVD) (2019) 295–298. doi: 10.1109/APUAVD47061.2019.8943926.
[10] O. Solomentsev, et al., A Procedure for Failures Diagnostics of Aviation Radio Equipment, Proceedings—International Conference on Advanced Computer Information Technologies, ACIT (2023) 100–103. doi: 10.1109/ACIT58437.2023.10275337.
[11] D. Bakhtiiarov, et al., Method of Binary Detection of Small Unmanned Aerial Vehicles, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 3654 (2024) 312–321.
[12] P. J. Marandi, et al., Filo: Consolidated Consensus as a Cloud Service, ATC (2016).
[13] M. Poke, T. Hoefler, DARE: High-Performance State Machine Replication on RDMA Networks, HPDC (2015) 107–118.
[14] W. Zhao, Performance Optimization for State Machine Replication based on Application Semantics, J. Syst. Software 122(C) (2016) 96–109.
[15] J. R. Lorch, et al., Leveraging Lightweight Virtual Machines to Easily and Efficiently Construct Fault-Tolerant Services, NSDI (2015).