<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Auto-scaling Policies to Adapt the Application Deployment in Kubernetes</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
<institution>Department of Civil Engineering and Computer Science Engineering, University of Rome Tor Vergata</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>30</fpage>
      <lpage>38</lpage>
      <abstract>
<p>The ever-increasing diffusion of computing devices enables a new generation of containerized applications that operate in a distributed cloud environment. Moreover, the dynamism of working conditions calls for an elastic application deployment, which can adapt to changing workloads. Despite this, most of the existing orchestration tools, such as Kubernetes, include best-effort threshold-based scaling policies whose tuning can be cumbersome and application-dependent. In this paper, we compare the default threshold-based scaling policy of Kubernetes against our model-based reinforcement learning policy. Our solution learns a suitable scaling policy from experience so as to meet Quality of Service requirements expressed in terms of average response time. Using prototype-based experiments, we show the benefits and flexibility of our reinforcement learning policy with respect to the default Kubernetes scaling solution.</p>
      </abstract>
      <kwd-group>
        <kwd>Kubernetes</kwd>
        <kwd>Elasticity</kwd>
        <kwd>Reinforcement Learning</kwd>
<kwd>Self-adaptive systems</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        Fabiana Rossi
Elasticity allows the application deployment to be adapted at run-time in the face of
changing working conditions (e.g., incoming workload) and to meet stringent
Quality of Service (QoS) requirements. Exploiting operating system-level
virtualization, software containers simplify the deployment and management
of applications, while also introducing a reduced computational overhead with respect to
virtual machines. The most popular container management system is Docker, which
simplifies the creation, distribution, and execution of applications inside
containers. Although container management systems can be used to deploy
simple containers, managing a complex application (or multiple applications)
at run-time requires an orchestration tool, which automates container
provisioning, management, communication, and fault-tolerance. Although several
orchestration tools exist [
orchestration tools exist [
        <xref ref-type="bibr" rid="ref5 ref8">5,8</xref>
], Kubernetes, an open-source platform introduced
by Google in 2014, is the most popular solution. Kubernetes includes a Horizontal
Pod Autoscaler that automatically scales the application deployment using
a threshold-based policy driven by cluster-level metrics (i.e., CPU utilization).
However, this threshold-based scaling policy is not well suited to satisfy the QoS
requirements of latency-sensitive applications. Determining a suitable threshold
is cumbersome: it requires identifying the relation between a system metric (i.e.,
utilization) and an application metric (i.e., response time), as well as knowing
the application bottleneck (e.g., in terms of CPU or memory). In this paper,
we compare the default threshold-based scaling policy of Kubernetes against
model-free and model-based reinforcement learning policies [
        <xref ref-type="bibr" rid="ref14">14</xref>
]. Our model-based
solution automatically learns a suitable scaling policy from experience so as to
meet QoS requirements expressed in terms of average response time. To perform
such a comparison, we use our extension of Kubernetes, which includes a more
flexible autoscaler that can be easily equipped with new scaling policies. The
remainder of the paper is organized as follows. In Section 2, we discuss related
work. In Section 3, we describe the main Kubernetes features. Then, we propose a
reinforcement learning-based scaling policy to adapt at run-time the deployment
of containerized applications (Section 4). In Section 5, we evaluate the proposed
solutions using prototype-based experiments, showing the flexibility and efficacy
of a reinforcement learning solution compared to the default Kubernetes
scaling policy. In Section 6, we outline ongoing and future research directions.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
The elasticity of containers is exploited to achieve different objectives:
improving application performance (e.g., [
        <xref ref-type="bibr" rid="ref4">4</xref>
]), improving load balancing and resource
utilization (e.g., [
        <xref ref-type="bibr" rid="ref1 ref11">1,11</xref>
        ]), energy efficiency (e.g., [
        <xref ref-type="bibr" rid="ref3">3</xref>
]), and reducing the deployment
cost (e.g., [
<xref ref-type="bibr" rid="ref2 ref6">2,6</xref>
]). A few works also consider a combination of deployment goals
(e.g., [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]). Threshold-based policies are the most popular approaches to scale
containers at run-time (e.g., [
        <xref ref-type="bibr" rid="ref10 ref4">4,10</xref>
]). The noteworthy orchestration tools (e.g.,
Kubernetes, Docker Swarm, Amazon ECS, and Apache Hadoop YARN) also usually
rely on best-effort threshold-based scaling policies driven by cluster-level
metrics (e.g., CPU utilization). However, all these approaches require a
nontrivial manual tuning of the thresholds, which can also be application-dependent.
To overcome this issue, solutions in the literature propose container deployment
methods ranging from mathematical programming to machine learning.
Mathematical programming approaches exploit methods from operations
research to solve the application deployment problem (e.g., [
        <xref ref-type="bibr" rid="ref12 ref13 ref18">12,13,18</xref>
]).
Since such a problem is NP-hard, more efficient solutions are needed. In the
last few years, reinforcement learning (RL) has become a widespread approach
for solving the application deployment problem at run-time. RL is a machine
learning technique by which an agent learns how to make (scaling) decisions
through a sequence of interactions with the environment [
        <xref ref-type="bibr" rid="ref15">15</xref>
]. Most of the existing
solutions consider classic model-free RL algorithms (e.g., [
        <xref ref-type="bibr" rid="ref16 ref17 ref7">7,16,17</xref>
]), which,
however, suffer from a slow convergence rate. To tackle this issue, in [
        <xref ref-type="bibr" rid="ref14">14</xref>
], we
propose a novel model-based RL solution that exploits what is known (or can be
estimated) about the system dynamics to adapt the application deployment at
run-time. Experimental results based on Docker Swarm have shown the flexibility
of our approach, which can learn different adaptation strategies according to
the optimized deployment objectives (e.g., meeting QoS requirements in terms of
average response time). Moreover, we have shown that the model-based RL agent
learns a better adaptation policy than other model-free RL solutions. Encouraged
by these promising results, in this paper we integrate the model-based
RL solution in Kubernetes, one of the most popular container orchestration
tools in the academic and industrial worlds. Experimental results in [
        <xref ref-type="bibr" rid="ref8">8</xref>
]
demonstrate that Kubernetes performs better than other existing orchestration
tools, such as Docker Swarm, Apache Mesos, and Cattle. However, Kubernetes is
not suitable for managing latency-sensitive applications in an extremely dynamic
environment: it is equipped with a static best-effort deployment policy that relies
on system-oriented metrics to scale applications in the face of workload variations.
In this paper, we first extend Kubernetes to easily introduce self-adaptation
capabilities. Then, we integrate RL policies in Kubernetes and compare them
against the default Kubernetes auto-scaling solution.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Kubernetes</title>
<p>Kubernetes is an open-source orchestration platform that simplifies the
deployment, management, and execution of containerized applications. Based on a
master-worker decentralization pattern, it can replicate containers to improve
resource usage, load distribution, and fault-tolerance. The master node
maintains the desired state at run-time by orchestrating applications (using pods). A
worker is a computing node that offers its computational capability to enable the
execution of pods in a distributed manner. A pod is the smallest deployment unit
in Kubernetes. When multiple containers run within a pod, they are co-located
and scaled as an atomic entity. To simplify the deployment of applications,
Kubernetes introduces Deployment Controllers that can dynamically create and
destroy pods, so as to ensure that the desired state (described in the deployment file)
is preserved at run-time. Kubernetes also includes a Horizontal Pod Autoscaler2
to automatically scale the number of pods in a Deployment based on the ratio
between the target value and the observed value of the pods' CPU utilization. Setting
the CPU utilization threshold is a cumbersome and error-prone task and may
require knowledge of the application resource usage to be effective.</p>
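<p>For reference, the documented Horizontal Pod Autoscaler rule computes the desired replica count as ceil(currentReplicas · currentMetric / targetMetric), taking no action while the ratio stays close to 1. A minimal sketch (the function name and the tolerance default are ours):</p>

```python
import math

def hpa_desired_replicas(current_replicas, current_util, target_util, tolerance=0.1):
    """Sketch of the HPA rule: desired = ceil(current * observed / target).
    Utilizations are expressed in the same unit (e.g., percent).
    No action is taken while the ratio stays within the tolerance band."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: do nothing
    return math.ceil(current_replicas * ratio)
```

<p>For example, 4 pods observed at 90% utilization with a 60% target yield ceil(4 · 1.5) = 6 pods.</p>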
      <p>To address this limitation, we equip Kubernetes with a decentralized control
loop. In a single loop iteration, it monitors the environment and the containerized
applications, analyzes application-level (i.e., response time) and cluster-level (i.e.,
CPU utilization) metrics, and plans and executes the corresponding scaling
actions. The modularity of the control loop allows us to easily equip it with different
QoS-aware scaling policies. To dynamically adapt the application deployment
according to the workload variations, we consider RL policies.
2 https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/</p>
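<p>A single iteration of the decentralized control loop described above can be sketched as follows; the monitor and execute callables are hypothetical stand-ins for Kubernetes API calls, and the scaling policy is a pluggable function returning −1, 0, or +1:</p>

```python
def control_loop_iteration(monitor, policy, execute):
    """One loop iteration: monitor and analyze metrics, plan a scaling
    action via the pluggable policy, and execute it."""
    metrics = monitor()               # e.g., {"rt": 0.05, "cpu": 0.8, "pods": 3}
    action = policy(metrics)          # -1 scale-in, 0 do nothing, +1 scale-out
    new_pods = max(1, metrics["pods"] + action)
    execute(new_pods)                 # e.g., patch the Deployment replica count
    return new_pods

def threshold_policy(metrics, cpu_target=0.6):
    """A simple threshold policy, comparable in spirit to the default
    autoscaler (the scale-in threshold at half the target is our assumption)."""
    if metrics["cpu"] > cpu_target:
        return 1
    if metrics["cpu"] < cpu_target / 2:
        return -1
    return 0
```

<p>A QoS-aware policy can be swapped in by replacing threshold_policy with, e.g., an RL agent's action selection, which is what the extended autoscaler enables.</p>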
    </sec>
    <sec id="sec-4">
      <title>Reinforcement Learning Scaling Policy</title>
      <p>
Differently from the Kubernetes scaling policy, we aim to design a flexible
solution that can be easily customized through a few high-level configuration
parameters (i.e., the cost function weights) rather than application-dependent
thresholds. In this paper, we customize the RL solution proposed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
] to
scale at run-time the number of application instances (i.e., pods). RL refers to
a collection of trial-and-error methods by which an agent learns to prefer actions
that it found to be effective in the past (exploitation); however, to discover
such actions, it has to try new ones (exploration). In a single control loop
iteration, the RL agent selects the adaptation action to be performed. As a first step,
according to the received application- and cluster-oriented metrics, the RL agent
determines the Deployment Controller state and updates the expected long-term
cost (i.e., the Q-function). We define the application state as s = (k, u), where k is
the number of application instances (i.e., pods) and u is the monitored CPU
utilization. We denote by S the set of all the application states. We assume that
k ∈ {1, 2, ..., Kmax}; since the CPU utilization u is a real number, we discretize
it by defining u ∈ {0, ū, ..., Lū}, where ū is a suitable quantum. For each state
s ∈ S, we define the set of possible adaptation actions as A(s) ⊆ {−1, 0, 1}, where
±1 denotes a scaling action (i.e., +1 to scale-out and −1 to scale-in) and 0 is the
do-nothing decision. Obviously, not all the actions are available in every application
state, due to the upper and lower bounds on the number of pods per application
(i.e., Kmax and 1, respectively). Then, according to an action selection policy, the
RL agent identifies the scaling action a to be performed in state s. The execution
of a in s leads to a transition to a new application state s′ and to the
payment of an immediate cost. We define the immediate cost c(s, a, s′) as the
weighted sum of different terms: the performance penalty cperf, the resource
cost cres, and the adaptation cost cadp. We normalize each term in the interval [0, 1],
where 0 represents the best value (no cost) and 1 the worst value (highest cost).
Formally, we have: c(s, a, s′) = wperf · cperf + wres · cres + wadp · cadp, where wperf,
wres, and wadp are non-negative weights, with wperf + wres + wadp = 1, that express
the relative importance of each cost term. We observe that this formulation of the
immediate cost function c(s, a, s′) is general and can be easily customized with
other QoS requirements. The performance penalty is paid whenever the average
application response time exceeds the target value Rmax. The resource cost is
proportional to the number of application instances (i.e., pods). The adaptation
cost captures the cost introduced by Kubernetes to perform a scaling operation.
The traffic routing strategy used in Kubernetes forwards application requests to a
newly added pod even if not all the containers in the pod are running yet. For this
reason, we prefer horizontal scaling to vertical scaling operations. When a vertical
scaling changes a pod configuration (e.g., to update its CPU limit), Kubernetes
spawns new pods as a replacement of those with the old configuration. In this
phase, the application availability decreases and only a subset of the incoming
requests is processed. Conversely, a scale-out action introduces a reduced
adaptation cost, inversely proportional to the number of application instances.
      </p>
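<p>The state discretization and per-state action set defined above can be sketched as follows (the parameter defaults mirror the experimental setting reported in Section 5, Kmax = 10 and ū = 0.1):</p>

```python
def discretize_state(pods, cpu_utilization, u_quantum=0.1, levels=10):
    """Map (k, real-valued CPU utilization) to the discrete state s = (k, u),
    with u rounded to the nearest multiple of the quantum u_quantum."""
    u = min(round(cpu_utilization / u_quantum), levels) * u_quantum
    return (pods, round(u, 10))

def actions(state, k_max=10):
    """A(s) is a subset of {-1, 0, +1}: scale-in is unavailable at k = 1,
    scale-out is unavailable at k = Kmax."""
    k, _ = state
    available = [0]
    if k > 1:
        available.append(-1)
    if k < k_max:
        available.append(+1)
    return available
```
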
      <p>
The received immediate cost contributes to update the Q-function. The
Q-function consists of Q(s, a) terms, which represent the expected long-term
cost that follows the execution of action a in state s. The existing RL policies
differ in how they update the Q-function and how they select the adaptation action
to be performed (i.e., the action selection policy) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. To adapt the application deployment,
we consider a model-based solution which we have extensively evaluated in [
        <xref ref-type="bibr" rid="ref14">14</xref>
].
At any decision step, the proposed model-based RL solution does not use a
randomized action selection policy (e.g., the ε-greedy policy); it always selects
the best action in terms of Q-values, i.e., a = arg min_{a′ ∈ A(s)} Q(s, a′). Moreover,
to update the Q-function, the simple weighted average of the traditional RL
solutions (e.g., Q-learning) is replaced by the Bellman equation [
        <xref ref-type="bibr" rid="ref15">15</xref>
]:
Q(s, a) = Σ_{s′ ∈ S} p(s′|s, a) [ c(s, a, s′) + γ min_{a′ ∈ A(s′)} Q(s′, a′) ],  ∀s ∈ S, ∀a ∈ A(s)  (1)
      </p>
      <p>
        where γ ∈ [0, 1) is the discount factor, and p(s′|s, a) and c(s, a, s′) are, respectively,
the transition probabilities and the cost function, ∀s, s′ ∈ S and a ∈ A(s). Thanks
to the experience, the proposed model-based solution maintains an empirical
model of the unknown external system dynamics (i.e., p(s′|s, a) and c(s, a, s′)),
thus speeding up the learning phase. Further details on our model-based
RL solution can be found in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>
        We show the self-adaptation capabilities of Kubernetes when equipped with
model-free and model-based RL policies as well as the default threshold-based
solution (by the Horizontal Pod Autoscaler). The RL solutions scale pods using
user-oriented QoS attributes (i.e., response time), whereas the Horizontal Pod
Autoscaler uses a best-effort threshold-based policy based on cluster-level metrics
(i.e., CPU utilization). The evaluation uses a cluster of 4 virtual machines of
the Google Cloud Platform; each virtual machine has 2 vCPUs and 7.5 GB of
RAM (type: n1-standard-2). We consider a reference CPU-intensive application
that computes the sum of the first n elements of the Fibonacci sequence. As
shown in Figure 1, the application receives a varying number of requests. It
follows the workload of a real distributed application [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], accordingly amplified
and accelerated so as to further stress the application resource requirements.
[Figure 1: application data rate (req/s) over time (minutes).]
      </p>
      <p>
        [Figure 2: (a) threshold at 60% of CPU utilization; (b) threshold at 70% of CPU utilization.] The
application expresses its QoS in terms of a target response time Rmax = 80 ms.
To meet Rmax, it is important to adapt the number of application instances
accordingly. The Kubernetes autoscaler executes a control loop every 3 minutes. To
learn an adaptation policy, we parameterize the model-based RL algorithm as in
our previous work [
        <xref ref-type="bibr" rid="ref14">14</xref>
]. For the sake of comparison, we also consider the model-free
Q-learning approach, which chooses a scaling action according to the ε-greedy selection
policy: at any decision step, the Q-learning agent chooses, with probability ε,
a random action, whereas, with probability 1 − ε, it chooses the best known
action. For Q-learning, we set ε to 10%. To discretize the application state, we use
Kmax = 10 and ū = 0.1. For the immediate cost function, we consider the set of
weights wperf = 0.90, wres = 0.09, wadp = 0.01. This weight configuration
optimizes the application response time, considered to be more important than
saving resources and reducing the adaptation costs.
      </p>
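<p>The immediate cost used in these experiments can be sketched with the weights above; the exact normalization of each term is our assumption (binary performance and adaptation penalties, resource cost as the fraction of the pod budget), following the structure given in Section 4:</p>

```python
def immediate_cost(resp_time, pods, scaled, r_max=0.080, k_max=10,
                   w_perf=0.90, w_res=0.09, w_adp=0.01):
    """c = w_perf*c_perf + w_res*c_res + w_adp*c_adp, each term in [0, 1].
    c_perf: 1 if the average response time exceeds Rmax, else 0 (assumed binary);
    c_res: fraction of the maximum number of pods in use;
    c_adp: 1 if a scaling action was performed, else 0 (assumed binary)."""
    c_perf = 1.0 if resp_time > r_max else 0.0
    c_res = pods / k_max
    c_adp = 1.0 if scaled else 0.0
    return w_perf * c_perf + w_res * c_res + w_adp * c_adp
```

<p>With these weights, a single Rmax violation (cost 0.90) dominates both the resource term (at most 0.09) and the adaptation term (0.01), which is what pushes the learned policy to prioritize response time.</p>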
<p>The default Kubernetes threshold-based scaling policy is application-unaware
and inflexible: it is not easy to satisfy the QoS requirements of latency-sensitive
applications by setting a threshold on CPU utilization (see Figures 2a–2c).
From Table 1, we can observe that small changes in the threshold setting
lead to a significant performance deterioration.</p>
      <p>Setting the scaling threshold is
cumbersome, e.g., with the threshold at 80% of CPU utilization, we obtain a rather
high number of Rmax violations. With the scaling threshold at 70% of CPU
utilization, the application violates Rmax 21% of the time, with 54% average CPU
utilization. With the scaling threshold at 60% of CPU utilization, the application
achieves better performance (Rmax is exceeded only 9% of the time), even though
a finer threshold tuning might further improve it.</p>
<p>Conversely, the RL approach is general and more flexible, requiring only
the specification of the desired deployment objectives. It allows the user to indicate
what to obtain (through the cost function weights), instead of how it should
be obtained. In particular, an RL agent learns the scaling policy in an
automatic manner. Figures 3a and 3b show the application performance when
the model-free and model-based RL solutions are used. The RL agent starts with
no knowledge of the adaptation policy, so it begins by exploring the cost of each
adaptation action. When Q-learning is used, the RL agent slowly learns how to
adapt the application deployment. As we can see from Figure 3a and Table 1, the
application deployment is continuously updated (i.e., 66% of the time) and the
RL agent does not learn a good adaptation policy within the experiment duration.
As a consequence, the application response time exceeds Rmax most of the time.
Taking advantage of the system knowledge, the model-based solution has a very
different behavior: it obtains better performance and reacts more quickly to
workload variations. We can see that, in the first minutes of the experiment, the
model-based solution does not always respect the target application response time.
However, as soon as a suitable adaptation policy is learned, the model-based RL
solution can successfully scale the application and meet the application response
time requirement most of the time. The learned adaptation policy deploys a
number of pods that follows the application workload (see Figures 1 and 3b),
maintaining a reduced number of Rmax violations (14.4%) and a good average
resource utilization (49%).</p>
      <p>
We should observe that, even though a fine-grained threshold tuning can be
performed (thus improving the performance of the default Kubernetes scaling policy),
the RL-based approach automatically learns a suitable and satisfactory adaptation
strategy. Moreover, by changing the cost function weights, the RL solution can easily
learn different scaling policies, e.g., to improve resource utilization or to reduce
deployment adaptations [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>Kubernetes is one of the most popular orchestration tools to manage containers
in a distributed environment. To react to workload variations, it includes a
threshold-based scaling policy that changes the application deployment according
to cluster-level metrics. However, this approach is not well-suited to meet stringent
QoS requirements. In this paper, we compare model-free and model-based RL
scaling policies against the default threshold-based solution. The prototype-based
results have shown the flexibility and benefits of RL solutions: while the
model-free Q-learning suffers from a slow convergence time, the model-based approach
can successfully learn the best adaptation policy, according to the user-defined
deployment goals.</p>
<p>As future work, we plan to investigate the deployment of applications in a
geo-distributed environment, including edge/fog computing resources located at the
network edges. The default Kubernetes scheduler spreads containers on computing
resources without taking into account the non-negligible network delay among them.
This can negatively impact the performance of latency-sensitive applications.
Therefore, alongside the elasticity problem, also the placement (or
scheduling) problem should be efficiently solved at run-time. We want to extend
the proposed heuristic so as to efficiently control the scaling and placement of
multi-component applications (e.g., micro-services). When an application consists of
multiple components that cooperate to accomplish a common task, adapting the
deployment of one component impacts the performance of the other components. We
are interested in considering the application as a whole, so as to develop policies that
can adapt, in a proactive manner, the deployment of inter-connected components,
avoiding performance penalties.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgment</title>
<p>The author would like to thank her supervisor, Prof. Valeria Cardellini, and to
acknowledge the support of Google through the GCP research credits program.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abdelbaky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz-Montes</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parashar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Unuvar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Docker containers across multiple clouds and data centers</article-title>
          .
          <source>In: Proc. of IEEE/ACM UCC</source>
          <year>2015</year>
          . pp.
          <fpage>368</fpage>
          -
          <lpage>371</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Al-Dhuraibi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paraiso</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Djarallah</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Merle</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Autonomic vertical elasticity of Docker containers with ElasticDocker</article-title>
          .
          <source>In: Proc. of IEEE CLOUD '17</source>
          . pp.
          <fpage>472</fpage>
          -
          <lpage>479</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Asnaghi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferroni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Santambrogio</surname>
          </string-name>
          , M.D.:
          <article-title>DockerCap: A software-level power capping orchestrator for Docker containers</article-title>
          .
          <source>In: Proc. of IEEE EUC '16</source>
          . pp.
          <fpage>90</fpage>
          -
          <lpage>97</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Barna</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khazaei</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokaefs</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litoiu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Delivering elastic containerized cloud applications to enable DevOps</article-title>
          .
          <source>In: Proc. of SEAMS '17</source>
          . pp.
          <fpage>65</fpage>
          -
          <lpage>75</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Casalicchio</surname>
          </string-name>
          , E.:
          <article-title>Container orchestration: A survey</article-title>
          .
          <source>In: Systems Modeling: Methodologies and Tools</source>
          , pp.
          <fpage>221</fpage>
          -
          <lpage>235</lpage>
. Springer International Publishing, Cham
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Guan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>B.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
          </string-name>
          , J.:
          <article-title>Application oriented dynamic resource allocation for data centers using Docker containers</article-title>
          .
          <source>IEEE Commun. Lett</source>
          .
          <volume>21</volume>
          (
          <issue>3</issue>
          ),
          <fpage>504</fpage>
          -
          <lpage>507</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Horovitz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arian</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Efficient cloud auto-scaling with SLA objective using Q-learning</article-title>
          .
          <source>In: Proc. of IEEE FiCloud '18</source>
          . pp.
          <fpage>85</fpage>
          -
          <lpage>92</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jawarneh</surname>
            ,
            <given-names>I.M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bellavista</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foschini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martuscelli</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanari</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palopoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Container orchestration engines: A thorough functional and performance comparison</article-title>
          .
          <source>In: Proc. of IEEE ICC</source>
          <year>2019</year>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Jerzak</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ziekow</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>The DEBS 2015 grand challenge</article-title>
          .
          <source>In: Proc. of ACM DEBS '15</source>
          . pp.
          <fpage>266</fpage>
          -
          <lpage>268</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Khazaei</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ravichandiran</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bannazadeh</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tizghadam</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leon-Garcia</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Elascale: Autoscaling and monitoring as a service</article-title>
          .
          <source>In: Proc. of CASCON '17</source>
          . pp.
          <fpage>234</fpage>
          -
          <lpage>240</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oak</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pompili</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>DRAPS: dynamic and resource-aware placement scheme for Docker containers in a heterogeneous cluster</article-title>
          .
          <source>In: Proc. of IEEE IPCCC '17</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Nardelli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cardellini</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casalicchio</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Multi-level elastic deployment of containerized applications in geo-distributed environments</article-title>
          .
          <source>In: Proc. of IEEE FiCloud '18</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Rossi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cardellini</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lo Presti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Elastic deployment of software containers in geo-distributed computing environments</article-title>
          .
          <source>In: Proc. of IEEE ISCC '19</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rossi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardelli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cardellini</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Horizontal and vertical scaling of container-based applications using Reinforcement Learning</article-title>
          .
          <source>In: Proc. of IEEE CLOUD '19</source>
          . pp.
          <fpage>329</fpage>
          -
          <lpage>338</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Sutton</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barto</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          :
          <source>Reinforcement Learning: An Introduction</source>
          . MIT Press, Cambridge, MA, 2nd edn. (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jia</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Migration modeling and learning algorithms for containers in fog computing</article-title>
          .
          <source>IEEE Trans. Serv. Comput</source>
          .
          <volume>12</volume>
          (
          <issue>5</issue>
          ),
          <fpage>712</fpage>
          -
          <lpage>725</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Tesauro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jong</surname>
            ,
            <given-names>N.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bennani</surname>
            ,
            <given-names>M.N.</given-names>
          </string-name>
          :
          <article-title>A hybrid Reinforcement Learning approach to autonomic resource allocation</article-title>
          .
          <source>In: Proc. of IEEE ICAC '06</source>
          . pp.
          <fpage>65</fpage>
          -
          <lpage>73</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mohamed</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ludwig</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Locality-aware scheduling for containers in cloud computing</article-title>
          .
          <source>IEEE Trans. Cloud Comput</source>
          .
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>