<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Zitzler E, Thiele L. Multiobjective evolutionary
algorithms: a comparative case study and the
strength Pareto approach. IEEE transactions on
Evolutionary Computation.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An effective resource management approach in a FaaS environment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Christoforou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas S. Andreou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cyprus University of Technology</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>3</volume>
      <issue>4</issue>
      <fpage>159</fpage>
      <lpage>169</lpage>
      <abstract>
        <p>cloud providers adopted and currently support and offer this architecture. AWS Lambda [3], IBM Cloud Serverless computing introduces a new Cloud service Functions [4], Google Cloud Functions [5] and Miwhich consist an increasingly popular architecture for crosoft Azure Functions [6] are the major serverless building distributed applications. This paper investi- providers. This new cloud paradigm is becoming ingates and proposes a new resource management ap- creasingly popular and is gaining great attention from proach in a FaaS platform, based on intelligent tech- the software industry and research community. A numniques. A number of experiments applied through an ber of new technical challenges and open problems [7] indicative framework consisting of a client application have emerged and a number of questions have been and a Lambda function. Three Genetic Algorithms set about the importance and the future of serverless were employed to deliver optimal solutions in a multi- computing. objective environment. Resource management support in such a FaaS environment is essential for the software development pro1 Introduction cess itself, which is directed towards satisfying the SLA and providing QoS assurance. The identification of the optimum scenario for resource allocation to serve adequately a specific workload is a tedious, computationally complex and time-consuming process since multiple objectives need to be satisfied. This research work is motivated by the following two Research Questions (RQ) Serverless computing is a relatively new processing paradigm or model that emerged through the continuous and vast development of the Cloud. Serverless computing provides a service in which developers can write and deploy code without provisioning or managing servers or containers. The adoption of this paradigm has great impact on several software engineering aspects such as development process, pricing • RQ1: Is it possible to implement easy to use model and Quality of Sercice (QoS) assurance. and efficient resource management algorithms in Despite the various benefits and advantages of a FaaS platform? serverless computing, like zero server management, no • RQ2: How intelligent techniques can deliver effiup-front provisioning, high availability, auto-scalability cient resource management to developers in a FaaS and pay only for the resources used, there are also some environment with the minimum possible cost and weaknesses that should also be taken into account: As time? currently offered from providers, it is not suitable for long term tasks because of the limited time a service The Amazon Lambda platform is considered a comcan run; additionally, there is increasing complexity of plete platform as it offers the most features, while the underlying architecture, which is intensified by the presents the greatest market share; for these reasons lack of appropriate operational tools. it was selected to constitute our experimental environThe main representative of this new service architec- ment. ture is Function as a Service (FaaS) or event-based pro- The rest of this paper is organized as follows: Section gramming [1], where a function may triggered through 2 provides a brief overview of relevant literature focusan api call or by an event. Since Amazon introduced ing more on resource management approaches. Section Lambda serverless platform in late 2014 [2], many other 3 introduces the proposed approach which addresses</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>the two research questions. The experimental process to the first research question. In this research work an
is described in section 4, while section 5 discuses the exhaustive algorithm will be applied on a low demand
results obtained. Finally, section 6 concludes the paper environment and a small scale workload with the
reand outlines future research steps. sults obtained being considered as the reference data
for the proposed intelligent approaches, the latter
being able to reach to solutions faster. Our proposition is
2 Literature Review the employment of a multi-objective genetic algorithms
(MOGAs) [15] as our optimization method that will
generate near-optimal solutions and their performance
will be compared with the reference optimal solutions
extracted by the exhaustive algorithm.</p>
      <p>A short literature review has been carried out that is
not limited to resource management approaches, since
very few relevant research works were identified.</p>
      <p>Two research works refer to efficient resource
management: In [8] a solution using a well-known
resource allocation strategy for a Lambda platform was
presented based on the model predictive controller Genetic algorithms are a type of evolutionary
algo(MPC). This solution designs a resource allocation rithm, which are widely used to solve search-based
oppolicy through the understanding of the run-time at- timization problems by simulating the theory of
natutributes of the workload. The authors in [9] performed ral evolution on a population of individuals (candidate
a dedicated test to identify if cost optimization is feasi- solutions). Problems like the one this study is dealing
ble when utilizing increased resources for lowering pro- with, require the optimization of multiple criteria at
cessing times. the same time. In such a case multi-objective genetic</p>
      <p>Evaluation of main FaaS providers was performed algorithms can be adopted, where the goal is to find
in [10] and relevant results were presented in terms of the best solution by otimizing a set of objective
functhroughput, network bandwidth, a file I/O and perfor- tions. In case of conflicting or competing objectives,
mance computed according to concurrent invocations. a multi-objective genetic algorithm normally delivers a
In [11], the authors report results from a comprehen- set of optimal solutions instead of a single one. This
sive investigation on the performance of microservices set of optimal solutions is called Pareto optimal set (or
hosted by a serveless platform. The investigation dealt Pareto front) and contains those solutions that are not
with implications of infrastructure elasticity, load bal- dominated by any other solution yielded during
evoluancing, provisioning variation, infrastructure retention, tion. Each optimal solution constitutes a specific
baland memory reservations. A micro-benchmark intro- ance between the objectives under optimization, where
duced in [12] was used to evaluate the performance and any improvement in one of them leads to worsening the
cost model of popular FaaS providers. other (conflicting targets, e.g. cost vs time in our case).</p>
      <p>An open-source FaaS tool called Snafu was intro- Therefore, a decision maker is provided with the set of
duced in [13] which is employed for managing, execut- optimal solutions and is therefore supported to take
deing and testing functions across provider-specific inter- cisions as to which values of the decision variables are
faces. Finally the work of Hong et al. [14] describes most suited based on the targets and the requirements
six serverless design patterns that can be used to build of his/her application.
serverless applications and services.</p>
    </sec>
    <sec id="sec-2">
      <title>In the context of our application, the candidate so</title>
      <p>3 Multi-Objective Optimization lutions, or alternatively the decision variables, consist
Approach of two of the available configuration options offered
by AWS Lambda platform, namely memory allocation
Our general aim is to allocate a sufficient amount of re- and number of maximum concurrency functions, as
sources in a FaaS environment that should be able to well as a third variable, the batch size, that represents
serve a specific workload. By utilizing an exhaustive the number of inputs that each individual function is
algorithm we first aim to identify the optimal solutions required to process. Our approach is graphically
repfor both objectives, cost and performance. Exhaustive resented in Figure 1 where the three decision variables
approaches have great demands on resources and, sub- in conjunction with the characteristics of the workload
sequently, on cost, and thus they cannot be the answer define the level of Cost and Duration.
4</p>
      <sec id="sec-2-1">
        <title>Experimental Process</title>
        <p>Experimental Environment
An application that is used to perform the
experimental process has been implemented by utilizing services
offered by Amazon AWS platform. More specifically,
this application is based on the idea introduced in
Amazon Big Data Blog 1 where in a map-reduce style
it counts the words in files stored in an S3 2 bucket.</p>
        <p>This application streams the total number of words
each function has calculated, returns it back to the user
in real-time and demonstrates how a Lambda function
can efficiently process large amounts of data in short Figure 2: Experimental Environment
time and provide immediate results to users.</p>
        <p>In our experimentation study our application was Python. The synchronous invocations of lambda
funcimplemented as follows: On the client side, besides tions is executed and controlled by utilizing Python’s
the main function that triggers the whole process, a multi-threading module. The integrated experimental
cascade function was also implemented which is re- environment is depicted in Figure 2.
sponsible to sense the workload size and accordingly
distribute the data in a synchronous way over a num- 4.2 Exhaustive Algorithm
ber of lambda functions. This function is also
responsible to collect and aggregate the results taken from As mentioned before, our multi-objective optimization
the lambda functions responses and when all func- approach was adjusted and configured based on the
tions are completed it returns the final result to the AWS Lambda platform and taking into account the
user. On the AWS platform a lambda function was available options offered. The two objectives are cost
created that counts the words of a given batch of files and performance. The cost objective is the
minimizawhich are stored in a AWS S3 bucket. Data stored tion of the total cost required for the completion of the
in the S3 bucket represents the workload that will be process of the input workload and is calculated using
processed with specific features. Both the client and the formulas and rules as these are given by Amazon 4.
lambda sides were implemented in Python 3.7 and the The calculation of the cost depends on the number of
integration with the AWS services, S3 and Lambda lambda functions executions, the total duration of all
was achieved through the Boto3 3, the AWS SDK for executed functions and the allocated memory. The
performance objective is also the minimization of the total
duration needed for the workload process completion
and is calculated as the time from the moment the
1https://aws.amazon.com/blogs/big-data/building-scalableand-responsive-big-data-interfaces-with-aws-lambda/
2https://aws.amazon.com/s3/
3https://aws.amazon.com/sdk-for-python/
4https://aws.amazon.com/lambda/pricing/
user sends the start request until the application de- objective functions, Cost and Duration, for values that
livers back to the user the total count of words. The belong to the PS set (see equation 2).
set of decision variables consists of the memory
allocation size, the number of maximum concurrent
functions and the batch size. Memory allocation denotes minimizef (x) = (fduration(x), fcost(x)), x ∈ P S (2)
the amount of memory you want to allocate for your
lambda function. The values memory can get, ranged Since all three decision variables are real-valued, we
from 128MB to 3008 MB with 64MB increment step. accordingly use best practices for setting the MOGAs
The concurrent execution limit can be set from 1 to configuration. For crossover and mutation operators,
1000. Finally, the batch size represents the number the Simulated Binary Crossover (SBX) and Polynomial
of files that each function will process and in our ex- Mutation (PM) were selected respectively. The same
periments its values are relative to the percentage to parameters and settings were used for all executions.
the workload size and fall into the following set: [1, 2, As one can easily discern from Figure 3, all algorithms
5, 10, 20, 25, 50, 100]. The S3 bucket which is used yielded very similar solutions which are almost
identifor the workload, contains 100 text files with each file cal to the reference optimal. A more detailed analysis
containing 638 words. The exhaustive algorithm that of the results follows in the next section.
was used calculated and delivered all possible
candidate solutions. To reduce execution time and cost we
discarded solutions for concurrency limit over 100 since 5 Results and Discussion
the size of the workload is 100 and it does not make
sense to take these solutions into account. The number The execution of the exhaustive algorithm on the
exof Possible Solutions (PS) is calculated using equation perimental application delivered a complete list of PS.
1 and is calculated to be equal to 36800. The extracted values constitute the aggregation of five
different executions in order to minimize or even
elim|P S| = N × M × K (1) inate possible variations between the results under
exactly the same configuration conditions. As described
where, N is the number of the possible values of the above, the results from the exhaustive algorithm were
concurrency limit and is equal to 100, M denotes the considered as the reference data and were used for
total number of values for memory and is equal to 46 the assessment of the three MOGAs employed. Each
and K is the number of values the workload batch size MOGA was run 100 times for different values of fitness
can get and is equal to 8. evaluations (FE) ranging from 500 to 4500 with
increment step 500. The Pareto optimal front that emerged
4.3 Multi-objective Genetic Algorithms from the reference optimal solutions in contrast to the
Pareto near-optimal solutions yielded by each MOGA
We selected three well-known and widespread MO- for 1000 fitness evaluations, is depicted in Figure 3. By
GAs to assess their ability to solve the problem in observing the Pareto fronts, one can easily conclude
hand and compare their results: The Non-dominated that all three MOGAs approached the optimal
soluSorting Genetic Algorithm II (NSGA-II) [15], the tions to a high degree; in fact, in some cases the
domNon-dominated Sorting Genetic Algorithm III (NSGA- inant MOGA solutions are exactly the same as those
III) [16] and the Strength Pareto Evolutionary Algo- of the reference set. At this point we will utilize some
rithm 2 (SPEA2) [17]. The selection of these specific metrics aiming to assess further and compare the
peralgorithms was made after performing some prelimi- formance [18] of the three MOGAs we employed on our
nary experimentation which indicated that these three approach.
present a consistently good performance.</p>
        <p>The relative configuration of the required
parameters, as well as the overall implementation of the al- 5.1 MOGAs performance through quality
gorithms, were performed using Platypus 5, a Python- indicators
based multi-objective optimization algorithms library.</p>
        <p>The aim is to minimize a vector consisting of the two</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The hypervolume (HV) [19] and the inverted genera</title>
      <p>tional distance (IGD) [20] quality indicators were
selected to assist in comparing the three MOGAs with
5https://platypus.readthedocs.io/en/latest/index.html
respect to performance and scalability, given the
ability of the aforementioned metrics to assess both
convergence and diversity (uniformity and spread) of the
algorithms. Specifically, the HV indicator assesses the
volume covered by the non-dominated solutions of a
Pareto front in the objective space. Therefore, the
larger the volume covered by the solutions generated
in a run, the higher the HV value, which indicates a
better performance. The IGD indicator assesses how
far the elements of the true Pareto front (reference data
in our case) are from the non-dominated points of an
approximation Pareto front. Therefore, the greater the
extent of the true Pareto front that is covered by the
non dominated points generated by a run in the
objective space, the lower the IGD value, which denotes a
better performance. Each algorithm was run 100 times
for each number of FE and the median values of the
HV and IGD were calculated for each algorithm. These
values are presented in Tables 1 and 2 respectively.</p>
      <p>The differences observed in the indicators values
between the compared algorithms are too small, and this
most probably is the result of the small complexity
of the workload used; however, in cases of application
workloads with increasingly greater complexity and/or
scale, these differences will become more profound. As
regards the HV indicator, SPEA2 presents the best
performance, that is, the highest hypervolume value,
and this is consistent along all fitness evaluation
numbers used. Second best for this indicator is the
NSGAII. It is important to observe that in the case SPEA2 in
the 1000 fitness evaluations rub the HV value stabilizes
to a constant value from the second measurement
onFE</p>
    </sec>
    <sec id="sec-4">
      <title>The paper is part of the outcomes of the Twinning</title>
      <p>project Dossier Cloud. This project has received
fundwards. The same behaviour is also observed for NSGA- ing from the European Union’s Horizon 2020 research
II but from the third measurement onwards. On the and innovation programme under grant agreement No.
contrary, NSGA-III presents more fluctuations in its 692251.
measurements.</p>
      <p>In the case of the IGD indicator, things appear more References
complicated since none of the algorithms seems to
prevail. A notable point is that for the lowest measure of [1] Fox GC, Ishakian V, Muthusamy V, Slominski A.
the fitness evaluations numbers, SPEA2 clearly outper- Status of serverless computing and
function-asforms the others. The values for NSGA-II and SPEA2 a-service (faas) in industry and research. arXiv
starting from the third measurement onwards are fully preprint arXiv:170808028. 2017;.
identical, while NSGA-III shows ups and downs and in
three cases seems better than the other two. [2] Adzic G, Chatley R. Serverless computing:
eco</p>
      <p>The Wilcoxon signed-rank test was applied on both nomic and architectural impact. In: Proceedings
quality indicators to detect whether or not a statisti- of the 2017 11th Joint Meeting on Foundations of
cally significant difference exist among the three algo- Software Engineering. ACM; 2017. p. 884–889.
rithms. To handle the family-wise error rate
accumulated, p-values were adjusted using a post-hoc Holm [3] AWS Lambda;. Accessed: 2018-09-30. https:
procedure. The p-values resulting from the pairwise //aws.amazon.com/lambda/.
comparison for HV and IGD indicators are shown in
Tables 3 and 4 respectively where pairs of algorithms [4] IBM Cloud Functions;. Accessed: 2018-09-30.
with a statistically significant difference (p &lt;0.05) are https://www.ibm.com/cloud/functions.
shown in boldface. According to the pairwise
comparisons, no significant difference is observed between [5] Google Cloud Functions;. Accessed: 2018-09-30.
NSGA-II and NSGA-III, NSGA-II and SPEA2, and https://cloud.google.com/functions/.
NSGA-III and SPEA2 in either indicator except only
for the pair NSGA-III and SPEA2 in the case of HV.
6</p>
      <sec id="sec-4-1">
        <title>Conclusions and Future Work</title>
        <p>This research work performed a preliminary
investigation to assess whether heuristic approaches for
multiobjective optimization, like the specific MOGAs used,
are able to solve the problem of finding a set of
nearoptimal solutions that support developers in a FaaS
environment to select an efficient resource allocation
scheme with respect to cost and time. In our case
this was verified through the experimental process
followed; therefore, the question raised now, and will be
the source of future research work, is whether the
support obtained by the MOGAs is worth investigating
further in terms of real-time response, ”execution on
the fly as workloads come in” and the level to which
this support can be generalized to cover also workloads
of unknown characteristics.</p>
        <p>Acknowledgments
[6] Microsoft Azure Functions;. Accessed:
201809-30. https://azure.microsoft.com/en-us/
services/functions/.
[7] Baldini I, Castro P, Chang K, Cheng P, Fink S,</p>
        <p>Ishakian V, et al. Serverless computing: Current
trends and open problems. In: Research Advances
in Cloud Computing. Springer; 2017. p. 1–20.
[8] HoseinyFarahabady M, Taheri J, Tari Z, Zomaya [18] Riquelme N, Von Lu¨cken C, Baran B.
PerforAY. A dynamic resource controller for a lambda mance metrics in multi-objective optimization. In:
architecture. In: Parallel Processing (ICPP), 2017 Computing Conference (CLEI), 2015 Latin
Amer46th International Conference on. IEEE; 2017. p. ican. IEEE; 2015. p. 1–11.
332–341.
[20] Van Veldhuizen DA, Lamont GB.
Multiobjective evolutionary algorithms: Analyzing
the state-of-the-art. Evolutionary computation.
2000;8(2):125–147.
[9] Hoang A. Analysis of microservices and
serverless architecture for mobile application
enablement (PhD Dissertation). California State
University, Northridge; 2017.
[10] Lee H, Satyam K, Fox G. Evaluation of
production serverless computing environments. In: 2018
IEEE 11th International Conference on Cloud</p>
        <p>Computing (CLOUD). IEEE; 2018. p. 442–450.
[16] Deb K, Jain H. An evolutionary many-objective
optimization algorithm using
reference-pointbased non dominated sorting approach, part
I: Solving problems with box constraints.</p>
        <p>IEEE Trans Evolutionary Computation.</p>
        <p>2014;18(4):577–601.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>