     On the Importance of Simulation in Enabling
    Continuous Delivery and Evaluating Deployment
                Pipeline Performance
     Andrea D’Ambrogio                  Alberto Falcone, Alfredo Garro                   Andrea Giglio
Dept. of Enterprise Engineering         Dept. of Informatics, Modeling,            Dept. of Innovation and
      University of Rome             Electronics and Systems Engineering           Information Engineering
          Tor Vergata                         University of Calabria             Guglielmo Marconi University
     Via del Politecnico 1                     Via P. Bucci 41C                          Via Plinio 44
      00133, Rome, Italy                      87036, Rende, Italy                     00193, Rome, Italy
     dambro@uniroma2.it         {alberto.falcone, alfredo.garro}@dimes.unical.it    a.giglio@unimarconi.it

   Abstract—In modern Software Engineering, Continuous De-             Furthermore, the dramatic impact of the Agile Manifesto on
livery (CD) is a development approach in which software is             the way software production is managed has led in recent
iteratively developed in short cycles ensuring, for each cycle,        years to the widespread adoption of new emerging approaches
that the new features are available to end users as soon as they
are implemented and tested. CD aims at defining, building and          in software engineering. The first of the twelve fundamental
releasing software with greater speed and frequency through the        principles defined in the aforementioned manifesto states that
deployment pipeline, resulting in three major benefits (visibility,    “Our highest priority is to satisfy the customer through early
feedback and continuous deployment), enabling the                      and continuous delivery of valuable software” [20].
software functional items to efficiently flow from development
to production. In this domain, the need for evaluating the                Despite the fact that continuous delivery was explicitly
performance of the deployment pipeline emerges, since the con-         mentioned in this principle, its widespread adoption is the
ventional metrics available in the software engineering discipline     result of an evolution of different approaches through the
are not suited to handle all the involved aspects. In this paper,      years. Starting from continuous integration (CI) shifting to-
the metrics suited for supporting CD are introduced and an
integration with Modeling and Simulation (M&S) techniques is
                                                                       wards modern continuous deployment pipelines, releasing new
discussed, based on the Business Process Model and Notation            software versions early and often has become mainstream, and
(BPMN) standard, which could represent a valid support offering        now represents a concrete option also for an ever growing
a graphical notation to easily specify the deployment pipeline steps   number of companies and practitioners. Let us summarize
as a standard and repeatable process. The main objective is to         some baseline definitions:
identify feasible perspectives in which simulation methods and
principles can be exploited, thus evaluating the effectiveness of        • Continuous Integration. CI is the process of ensuring that
M&S to support performance analysis of a deployment pipeline,
seen as a predictable process. Specifically, M&S can be seen as an         a software build is in a correct working state, guaranteeing
enabling tool for the evaluation and comparison of different CD            developers that the release is suitable for production. In
choices against requirements through an effective implementation           this approach, members of a team are pushed to integrate
of simulation techniques and virtual testing.                              their work frequently, typically each person integrates
   Index Terms—Continuous Delivery, Business Process Model                 at least daily leading to several integrations per day.
and Notation (BPMN), Modeling and Simulation
                                                                           It is worth noting that each integration is verified by
                     I. INTRODUCTION                                       an automated build and test procedure with the aim of
                                                                           detecting integration errors as soon as possible.
   Nowadays, in the software engineering and software devel-
                                                                         • Continuous Delivery. Deriving from CI, CD ensures that
opment domains, many influencing factors play a fundamental
                                                                           a build is always in a releasable state. Going hand in hand
role in making software production activities increasingly
                                                                           with a DevOps approach, where the team is responsible
challenging. Fast-changing customer requirements, as well as
                                                                           for all aspects of software delivery, after integration and
unpredictability of the market and the need for an ever shorter time-
                                                                           testing activities the software is actually provisioned to a
to-market are non-negligible aspects that have to be taken into
                                                                           full, production-like stack, and delivered to some set of
account, since software development has become a demanding
                                                                           end users.
area of business [8]. Early approaches, strongly based on attrac-
                                                                         • Continuous Deployment. When the entire process, from
tive lean principles, proposed the adoption of strategies such
                                                                           check-in to production, is fully automated, with no human
as “Decide as late as possible” [30] and can be regarded
                                                                           intervention, continuous deployment is implemented.
as early steps towards a Continuous Delivery (CD) approach.
                                                                         • Deployment Pipeline. The foundation of continuous deliv-
  Copyright © held by the author                                           ery is the deployment pipeline, which can be seen as the path
      that a code change takes from check-in (i.e. the commit         simplifying and automating the whole delivery process for an
      operation) to production.                                       application after the source code has been committed into a
   CD is a software engineering approach that represents a            repository. CD is centered around the collaboration aspects of
huge step forward in productivity and quality for companies           teams involved in the software production such as, developers,
and organizations that adopt it. CD is gaining attention since it     project managers and users, as well as automating the software
provides a well-established process that allows team members          delivery steps so as to ensure that a software is always ready
to work together to define and create large and complex               to be deploy and then delivered to end-users.
software with a higher level of control. CI claims to enable             Basically, the deployment pipeline is an automated process
companies to release new features, configuration properties,          for getting software from the repository into the hands of end-
bug fixes and tests, to customers safely and quickly in a             users. CD is preceded by the Continuous Integration (CI) in
sustainable way. This means that it mainly focuses on asserting       which the team members integrate the source code [31], [34].
the correct compilation of the source code and that it passes         This step leads to a faster feedback cycle and to benefits such
a chain of test units.                                                as improved productivity and increased communication [31].
   However, measuring the performance of a CD deployment              Figure 1 shows the deployment pipeline.
pipeline is a challenging task that requires a considerable effort
in terms of both time and cost. Many companies including
Google, Facebook and IBM are developing their software
products by using CD [8] and are trying to find out viable
solutions to measure the CD performance. To this purpose,
significant benefits can derive from the possibility to perform
Modeling and Simulation (M&S) activities on the CD deploy-
ment pipeline in order to understand, measure and optimize
the behavior of systems on which to perform experiments and
theoretical analyses.
   In this context, the paper introduces metrics that are suited to
evaluate the performance of a CD deployment pipeline through
M&S techniques based on the Business Process Model and
Notation (BPMN) standard, with the purpose of defining the
pipeline as a standard process and also of analyzing it as a
repeatable and predictable process.
   The ultimate aim is to provide an exploratory analysis of the
available approaches to minimize the cycle-time in software
production (i.e. the period it takes from an idea to provide
actual business value), and also to investigate the different
domains in which simulation techniques can be fruitfully
exploited to enforce continuous delivery activities through a
deployment pipeline optimization.
   The rest of this paper is organized as follows. Section II
provides an introduction to the essential concepts and back-          [Figure 1: diagram of the CD pipeline. Labels include Source Code, Commit, Compile,
ground knowledge on the research domain. The Continuous               Test, Deploy, Manual, and Functional, Non-Functional and Stability tests.]
Delivery (CD) methodology and the typical architecture of                       Fig. 1. Overview of the CD deployment pipeline.
a CD deployment pipeline are presented along with some
related works available in literature. In Section III, the CD            The CD deployment pipeline starts when developers commit
deployment pipeline is defined through the concepts provided          the source code changes into the version control system. At
by the Business Process Model and Notation (BPMN) standard            this point, the CI Management System (CI-MS) triggers a
with particular focus on how to make available Modeling and           new instance of the deployment pipeline and compiles the
Simulation (M&S) practices during the pipeline flow in order          source code. If the compile procedure completes successfully,
to evaluate, predict and optimize its behavior. Finally, Section      a transition to the test phase is performed; otherwise the
IV discusses the main objectives of the work and presents             procedure terminates with an error. In the test phase, the
some future works.                                                    CI-MS runs a set of unit tests, performs code analysis, and
                                                                      builds the installer package. In this step, some manual tests
          II. BACKGROUND AND RELATED WORK                             need to be performed in order to evaluate domain-dependent
   Continuous Delivery (CD) is a concept which has been               functional and non-functional requirements. If all the tests
taken as a pillar of modern Agile software engineering [11]           pass, the executable code is used to generate the binaries that
because it enables higher software development velocity               will be stored in an artifact repository [21].
and productivity. The concept of continuous delivery means               The related works available in the literature are vast and
extend through empirical and modeling and simulation ap-                 III. THE PIPELINE AS A STANDARD AND PREDICTABLE
proaches to studying processes in software development.                                         PROCESS
   In [33] Van Der Hoek et al. identified the problems of              A. The Pipeline
releasing component-based software and presented a flexible
software release management tool that acts as a bridge to                 Under the assumption that both a business process collabo-
reduce the gap between the development process and deploy-             ration and a CD deployment pipeline implement an exchange
ment process.                                                          of information between logical processes, in this paper the
   Lahtela et al. in [24] presented nine significant challenges        adoption of BPMN as a notation to define a deployment
regarding the delivery process of software. The authors also           pipeline is proposed. This makes it possible to run simulation test
presented solutions to overcome these challenges through well-         units in order to evaluate the performance of a pipeline through
defined guidelines on how to build an ITIL-based release               a set of well-known metrics such as those presented in [25].
management process and how it should be performed.                        The deployment pipeline is the workflow, with related activ-
   Kajko-Mattson and Fan in [22] outlined a model of the               ities, that a source code follows from commit to production.
release management process integrating both the vendor and             Each commit, made by the development team, generates a
acceptor sides.                                                        release candidate of the software which flows through the
                                                                       pipeline. If everything goes well, which means that all steps
   Krishnan in [23] presented an economic model to capture
                                                                       of the pipeline are correctly executed, the software is ready
and analyze a set of tradeoffs involved in the software release
                                                                       for release. Figure 2 shows a general CD deployment pipeline
decisions and discussed a method to optimize the delivery
                                                                       formalized in BPMN notation [17].
process that takes into account crucial factors such as the
                                                                          With reference to the BPMN/CD deployment pipeline de-
software development cycle and the changes in market needs.
                                                                       picted in Figure 2, the pipeline starts in the Development
   The above presented works do not cover all the aspects
                                                                       swing-lane where the source code is ready to be committed
engaged in the continuous delivery process and do not provide
                                                                       into a repository. A transition to the Continuous Integration
an adequate view on the issues that can emerge in performing
                                                                       swing-lane happens and during the activity transition a connec-
the release management for IT services [3]. Moreover, they
                                                                       tion to the repository is performed. At this point, the commit
are based on empirical results and not on simulation ones that
                                                                       activity is performed in order to save the changes to both the
can help to better understand, predict and optimize the delivery
                                                                       local and remote repository. In the Build activity, the source
process of software.
                                                                       code is built and the resulting executable file is generated.
   Through the following related works, an insight into the            After that, a transition to the Simulation activity is done.
main issues related to the use of modeling and simulation              Throughout this stage, the new software version is rigorously
techniques in delivering software, is given.                           tested through simulation techniques to evaluate, predict and
   Dlugi et al. in [10] highlighted the difficulties to evaluate the   optimize its behavior. It is important that all the requirements
performance of a software release, mainly related to the lack of       aspects whether functional, non-functional, performance, se-
a test environment that is comparable to a production system.          curity and compliance are verified. For each simulation test,
The authors introduced a model-based performance change                some software metrics related to various constructs like class,
detection process that uses modeling and simulation methods            cohesion and inheritance have been evaluated. To evaluate
for evaluating performance metrics of different software ver-          the complexity of the source code, three standard metrics,
sions. Also, they developed a plug-in for a CD deployment              which are proposed by various researchers, can be considered
pipeline based on Jenkins.                                             [36]: (i) CCM (Cyclomatic Complexity Metric), which is a
   In [35], Vöst and Wagner investigated the CD pipeline              metric based on graph theory that represents the number
in the automotive domain. They evaluated the effect of                 of linearly independent paths through a program’s code; (ii)
various functional and non-functional requirements on the              HCM (Halstead Complexity Metric), which measures the logic
development life cycle. They presented a typical delivery              volume of the code. It is calculated on the count of the
pipeline in automotive software that adopts Hardware-in-the-           operators and operands; (iii) NF (Number of Functions),
Loop-Simulation approaches to evaluate the software with the           which represents the total number of functions. Typically,
operating environment.                                                 the lower the values of these metrics, the lower the
   Zia et al. in [1] used the Business Process Model and               complexity of the source code and thus the higher should be the
Notation (BPMN) standard to evaluate DevOps alternatives.              code compactness, readability and reliability.
The effectiveness of BPMN has been demonstrated in other                  In the last activity, named Test units, a set of parametrized
domains [15], [16], [17], and this can represent a viable solution     test units are executed. This makes it possible to measure the progress
to model and simulate CD pipelines.                                    of the software and detect side effects. A transition to the
   This paper has various purposes in common with the pre-             Production swing-lane happens if all the activities have been
sented works but unlike them it introduces metrics to evaluate         successfully completed. Otherwise, if at least one activity
the performance of a CD deployment pipeline through M&S                failed, a state transition to the source code activity is per-
techniques based on BPMN so as to define a standard CD                 formed, since the source code needs to be revised, and the
pipeline on which to perform analysis.                                 pipeline terminates.
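
   The complexity metrics mentioned for the Simulation activity can be approximated with standard tooling. The following Python sketch is not taken from [36]; it estimates CCM as one plus the number of decision points found in a Python source string and counts NF as the number of function definitions. The set of node types treated as decision points and the function names are assumptions made for illustration, and HCM is omitted for brevity.

```python
import ast

# Node types counted as decision points. This is a common approximation of
# cyclomatic complexity, chosen here for illustration only.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                   ast.IfExp, ast.Assert, ast.comprehension)


def cyclomatic_complexity(source: str) -> int:
    """Approximate CCM: 1 + number of decision points in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' contributes two additional branches.
            complexity += len(node.values) - 1
    return complexity


def number_of_functions(source: str) -> int:
    """NF: total number of function definitions (sync and async)."""
    return sum(isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
               for node in ast.walk(ast.parse(source)))


SAMPLE = '''
def classify(x):
    if x < 0 and x != -1:
        return "neg"
    for i in range(x):
        if i % 2:
            return "odd"
    return "other"

def noop():
    pass
'''

print(cyclomatic_complexity(SAMPLE), number_of_functions(SAMPLE))
```

   In a pipeline, such values could be compared against thresholds before the release candidate moves on; the thresholds themselves would have to be chosen per project.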
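[Figure 2: BPMN diagram of the CD deployment pipeline. Labels recoverable from the figure: source code; Continuous Integration; commit, Build, Simulation, Test units; Environment configuration, Source code configuration, manual test units; Manual fix, Error trace; Software Images, Push Infrastructure; repository.]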

                                                                             Fig. 2. Overview of the CD deployment pipeline defined in BPMN.
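
   The control flow of the Continuous Integration lane in Fig. 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each activity is reduced to a boolean check, the names `CI_ACTIVITIES` and `run_ci_lane` are hypothetical, and the Production and Release lanes are not modeled.

```python
from typing import Callable, Dict, List, Tuple

# Activity names follow the Continuous Integration lane of Fig. 2.
CI_ACTIVITIES: List[str] = ["commit", "Build", "Simulation", "Test units"]


def run_ci_lane(checks: Dict[str, Callable[[], bool]]) -> Tuple[str, List[str]]:
    """Run the CI activities in order.

    Returns (outcome, executed). The outcome is "production" when every
    activity succeeds, or "source code" as soon as one activity fails,
    meaning the code has to be revised. Activities without a registered
    check are assumed to succeed.
    """
    executed: List[str] = []
    for activity in CI_ACTIVITIES:
        executed.append(activity)
        check = checks.get(activity, lambda: True)
        if not check():
            return "source code", executed
    return "production", executed


# A release candidate that fails during the Simulation activity.
outcome, executed = run_ci_lane({"Simulation": lambda: False})
print(outcome, executed)
```

   Returning the list of executed activities makes it easy to see where a release candidate stopped, which is the kind of visibility the pipeline is meant to provide.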

   In the Production swing-lane, the parameters related to the tar-                                                             to identify process improvement potential.
get environment (e.g. AWS Amazon Cloud, Microsoft Azure,                                                                           On the other hand, we find those metrics that refer to the
Google Cloud, etc.) are configured in the Environment con-                                                                      amount of work done in a certain time, for example the number
figuration activity. Thereafter, the Source code configuration                                                                  of users (or jobs) that complete the whole process during an
activity is performed, in which the software is configured to                                                                   observation time. In this case, the abovementioned metrics are
run into the target environment. Finally, additional manual test                                                                related to the concept of throughput.
units, which take into account the production environment, are                                                                     In the context of continuous delivery and deployment, the
executed in the manual test units activity.                                                                                     concepts of “user” or “job” can be effectively intended as
   A transition to the Release swing-lane happens if everything                                                                 “features” flowing throughout the pipeline, from the initial
goes well. Otherwise, the issues that occurred during the produc-                                                               commit to the final deployment in production.
tion activities are collected and stored into the error repository                                                                 Since from a business analyst perspective, the cycle time
in order to be evaluated and fixed by developers. In the Release                                                                is the period it takes from an idea to provide actual business
swing-lane, the image of the software release is created and                                                                    value, it is easy to understand that most profits could be gained
deployed in the environment.                                                                                                    from that idea if it can be implemented quickly. If the cycle
B. Relevant Metrics                                                                                                             time is too high, competitors might be first or the idea might
   Generally speaking, in the study of business processes                                                                       not be relevant anymore.
performance, metrics of particular interest fall under two main                                                                    Under this perspective, it becomes clear that op-
categories.                                                                                                                     timizing the cycle time requires an efficient process which
   On the one hand we find those metrics that are directly                                                                      usually spans multiple departments and involves a lot of people
related to time, for example the time in which a user (or job)                                                                  (i.e., a complex process).
crosses the entire process starting from the initial state arriving                                                                In order to optimize a complex process, we need an efficient
at the final state, executing different tasks in its path. In this                                                              way to specify and analyze each task that is involved in it. As
case, this metric is defined as cycle time.                                                                                     mentioned above, we propose BPMN to describe the deployment
   Since cycle time includes both value-adding and non value-                                                                   pipeline process, and propose the use of simulation techniques
adding activity times (e.g., waiting times), it is a powerful tool                                                              to enact analysis activities.
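
   As an illustration of how simulation can yield cycle time and throughput for a pipeline, the following Python sketch treats the pipeline as a sequence of stages, each processing one feature at a time in arrival order. It is a minimal sketch under stated assumptions, not the simulation approach of the paper: stage durations are deterministic, and the function names and example values are hypothetical.

```python
from typing import List, Sequence


def simulate_pipeline(arrivals: Sequence[float],
                      stage_times: Sequence[float]) -> List[float]:
    """Return the completion time of each feature.

    A feature starts a stage only when it has left the previous stage
    and the stage itself is free, so waiting time builds up in front
    of slow stages.
    """
    stage_free_at = [0.0] * len(stage_times)
    completions: List[float] = []
    for arrival in arrivals:
        t = arrival
        for s, duration in enumerate(stage_times):
            start = max(t, stage_free_at[s])
            t = start + duration
            stage_free_at[s] = t
        completions.append(t)
    return completions


def mean_cycle_time(arrivals: Sequence[float],
                    completions: Sequence[float]) -> float:
    """Average time between a feature entering and leaving the pipeline."""
    return sum(c - a for a, c in zip(arrivals, completions)) / len(arrivals)


def throughput(completions: Sequence[float], observation_time: float) -> float:
    """Features completed per time unit within the observation window."""
    done = sum(1 for c in completions if c <= observation_time)
    return done / observation_time


# Three features committed one hour apart; stage durations in hours.
arrivals = [0, 1, 2]
completions = simulate_pipeline(arrivals, stage_times=[1, 3, 1])
print(completions, mean_cycle_time(arrivals, completions),
      throughput(completions, observation_time=10))
```

   In this example the three-hour stage is the bottleneck: features complete at hours 5, 8 and 11, the mean cycle time is 7 hours, and shortening that single stage would improve both metrics.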
   A similar approach enabling performance predictions over           •  Releases Per Month (RPM), measures the number of
a business process has been introduced and discussed in [5],             releases that have been completed during a single month.
[9].                                                                     It is worth noting how changes in the long term of this
   Benefits of analyzing the process of such a pipeline include          metric suggest information on changes in the release
finding and removing bottlenecks (e.g., a specific activity that         cycle. Data for obtaining such a measure can be retrieved
is too expensive in terms of time, due to some inefficiency),            from the version control system or from the integration
shortening the feedback loop (i.e., measuring activities in the          server logs, for instance.
middle of the pipeline can be useful to provide an early               • Fastest Possible Feature Lead Time, measures a sort of
feedback to end users), automating as much as possible (i.e.,            best case scenario in which, without delays and latencies
an accurate analysis can lead to the identification and the              (e.g., caused by the use of feature branches or by separate
reduction of expensive manual activities and eliminate any               build processes), the feature under study spends time only
error-prone manual tasks). In other words, the ultimate purpose          in the build and test phase on the pipeline.
is to optimize and visualize what’s going on to improve the            Also monitoring Mean Time Between Failures (MTBF) and
flow.                                                               Mean Time To Repair (MTTR) and their balancing can be an
   In the continuous delivery and deployment domain, it is          effective way to optimize the deployment pipeline.
easy to understand how the throughput of the pipeline used to          Mean time between failures reminds the team to avoid
deliver new features to production (i.e., to the end user) is a     easy failures. However, the core point of software development is
relevant metric.                                                    to provide new value to users, and only looking at MTBF can
   Referring to the lean world and to kanban practices, a metric    result in teams becoming overly cautious and never releasing
that completely captures the concept of the pipeline is certainly   anything new.
flow efficiency, which is calculated based on the actual work          In order to avoid such a drawback, a focus on mean time
time (i.e., the actual time spent working on a feature) measured    to recover (i.e., a metric that measures the ability to rollback
against the total wait time (e.g., data transfer, hardware or       in case of a mistake) can be a key counterbalance.
software provisioning, environment setting, etc.), presented in
                                                                       • Mean Time Between Failures (MTBF), is the predicted
                                                                         elapsed time between inherent failures of a system, during
   The calculation of flow efficiency is as follows:                     normal system operation. In the deployment pipeline can
   $$\text{Flow Efficiency (\%)} = \frac{\text{Work Time}}{\text{Work Time} + \text{Wait Time}} \times 100$$
                                                                         be a measure of the mean time between a feature not
                                                                         passing some test, for example. MTBF can be calculated
                                                                         as the arithmetic mean (average) time between failures
                                                                         of a system (i.e., a feature scoring a KO in some test).
  • Work Time is the time in which the development of the                Achieving a good MTBF in a pipeline involves getting
     feature under study can be seen as in progress.                     feedback early on and making sure that thorough val-
  • Wait Time is the time in which the development of the                idation occurs in testing environments. Moreover, such
     feature under study can be seen as blocked or waiting.              validations should be run on environments that are similar
  Other interesting metrics have been introduced, which can              to production, and with realistic data.
be fruitfully taken into account for analyzing a deployment            • Mean Time To Repair (MTTR), represents the average
pipeline, as Features Per Month (FPM), Releases Per Month                time required to repair a failed component or device (i.e.,
(RPM), or Fastest Possible Feature Lead Time [25].                       the failure of a feature in the pipeline). Assuming failure
                                                                         is inevitable, it’s important to ensure that mean time to
                             TABLE I                                     recover is as fast as possible. In other words, it measures
                  S UMMARY OF R ELEVANT M ETRICS .                       the time that is needed to “rollback” to a previous build
                                                                         after a release failure. In this scenario it is clear how
                                     Metric    Unit of Measure
                                                                         robust monitoring of production is essential. In order to
                                 Cycle Time    Time                      make this approach more effective, teams should learn
                                 Throughput    Frequency
                             Flow Efficiency   %                         about failures through monitoring and alerts, not through
                  Features Per Month (FPM)     Features / Time           customer complaints.
                 Releases Per Month (RPM)      Releases / Time
         Fastest Possible Feature Lead Time    Time                 Table I summarizes relevant metrics of interest and their units
                                      MTBF     Time                 of measure.
                                     MTTR      Time
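
  The metrics in Table I can be derived from simple timing records.
The following Python sketch is only illustrative: the FeatureRecord
type, the function names, and the use of hours as the time unit are
assumptions that do not come from the paper. It assumes that Flow
Efficiency has Work Time as its numerator, that MTBF is estimated as
the mean interval between consecutive failure timestamps, and that
MTTR is the mean of the observed repair durations.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FeatureRecord:
    """Hypothetical per-feature timing record, in hours (an assumption)."""
    work_time: float  # time in which the feature is in progress
    wait_time: float  # time in which the feature is blocked or waiting


def flow_efficiency(rec: FeatureRecord) -> float:
    """Return Work Time / (Work Time + Wait Time) as a percentage."""
    total = rec.work_time + rec.wait_time
    if total <= 0:
        raise ValueError("work_time + wait_time must be positive")
    return 100.0 * rec.work_time / total


def mtbf(failure_times: list[float]) -> float:
    """Arithmetic mean of the intervals between consecutive failures.

    failure_times holds the timestamps (in hours) at which a feature
    failed some test; at least two failures are needed for one interval.
    """
    if len(failure_times) < 2:
        raise ValueError("at least two failure timestamps are required")
    ordered = sorted(failure_times)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return mean(intervals)


def mttr(repair_durations: list[float]) -> float:
    """Arithmetic mean of the time spent repairing (or rolling back) failures."""
    if not repair_durations:
        raise ValueError("at least one repair duration is required")
    return mean(repair_durations)


if __name__ == "__main__":
    record = FeatureRecord(work_time=6.0, wait_time=18.0)
    print("Flow Efficiency (%):", flow_efficiency(record))
    print("MTBF (hours):", mtbf([10.0, 40.0, 100.0]))
    print("MTTR (hours):", mttr([1.5, 0.5, 1.0]))
```

  Keeping each metric as a separate function over plain records makes
it straightforward to feed the same computations with either measured
pipeline data or data produced by a simulation run.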
  • Features Per Month (FPM) identifies the number of new features
    that have crossed the entire pipeline during a month. The metric
    is based on Day-by-the-Hour (DbtH), which measures quantity
    produced over hours worked [29].

C. Simulation Benefits
   The ever-growing complexity of modern software, which also
involves physical and/or virtual platforms, requires the adoption of
effective analysis techniques to support design and operation. Using
M&S techniques, it is possible to generate scenarios that would be
hard to obtain in the real world, especially in distributed
environments [13], [14], [18], [27], [12], [28]. Moreover, when
moving from a monolithic to a microservice architecture, each of the
involved parts has its own lifecycle [32].
   The solution introduced in Section III (see Figure 2) relies on a
Simulation activity that combines static test units with
simulation-based ones. The benefit of the Simulation activity is that
it continuously exercises the software services, creating a constant
and uniform load on them. This makes it possible to monitor the CD
development pipeline and notify developers if any activity fails.
   Moreover, through the Simulation activity it is possible to
evaluate the source code managed by a CD development pipeline on
specific embedded platforms, thereby overcoming typical issues of
hardware-based setups, such as:
  • Hardware Checking. Managing simulation tests is much easier than
    using hardware, especially when complete tests on actual hardware
    are too expensive in terms of cost, time and other resources.
    Through simulation it is also easier to handle multiple hardware
    configurations, since this means changing configuration parameters

   In this paper, we have discussed the adoption of a well-known
standard (i.e., BPMN) to represent the deployment pipeline as a
process, in order to make it repeatable and clearly specified.
   Moreover, we have discussed a set of candidate metrics suitable
for analyzing the performance of such a process, most of them related
to the concepts of throughput and cycle time.
   Finally, we have discussed the benefits of simulation approaches
and techniques for carrying out performance predictions over a
deployment pipeline.
   More specifically, the support provided by simulation is twofold.
On the one hand, the end-to-end performance of the whole pipeline can
be analyzed over different configurations; on the other hand,
particular hardware or software components can be simulated (e.g.,
using contracts in the case of microservices, or using a virtual
hardware component).
   As a further step, work is ongoing to implement a full-stack
framework to enact modeling and simulation of a deployment pipeline,
leveraging previous experience in a cloud-based environment [4], [6],
[7], [2], [19].
