=Paper= {{Paper |id=Vol-2703/paperDC4 |storemode=property |title=Interactive Data-Driven Business Process Simulation (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-2703/paperDC4.pdf |volume=Vol-2703 |authors=Gerhardus van Hulzen |dblpUrl=https://dblp.org/rec/conf/icpm/Hulzen20 }} ==Interactive Data-Driven Business Process Simulation (Extended Abstract)== https://ceur-ws.org/Vol-2703/paperDC4.pdf
                   Interactive Data-Driven Business Process
                        Simulation (Extended Abstract)
                                                          Gerhardus van Hulzen
                                                   Research group Business Informatics
                                                            Hasselt University
                                                            Hasselt, Belgium
                                                          0000-0001-8962-9515


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

I. INTRODUCTION

Today, healthcare systems worldwide are under constant pressure. On the one hand, increasing population numbers, ageing populations, lifestyle factors, and new technologies are increasing the yearly expenses on healthcare. On the other hand, budgets are under pressure due to economic austerity [1]. In order to provide high-quality care to all patients, healthcare managers are forced to improve their care processes. Efficient Capacity Management (CM) is one of the key aspects to ensure this. This involves, amongst others, determining suitable resource levels, i.e. staff size, equipment, and facilities [2].

Business Process Simulation (BPS) can be used to support managers during CM decisions. BPS uses a (computer) model to imitate the behaviour of a business process. This approach allows evaluating the effects of changes before implementing them [3]. For instance, BPS can be used to determine suitable equipment levels, e.g. by simulating the effect of an additional X-ray scanner on patient waiting times, throughput rates, and staff workload.

In Process Mining (PM), the emerging field of data-driven process simulation provides promising first results to generate simulation models from information captured in event logs [4]. These “discovered” models can form the basis to compare the operational effects of various capacity levels. The main advantage of data-driven process simulation over “traditional” simulation model development is the availability and objectivity of event logs compared to information sources such as interviews, process documentation, and observations [5]. However, some challenges remain in the field of automated BPS discovery. Most importantly, the lack of domain knowledge makes it challenging to extract a reliable and usable simulation model. In addition, event logs often suffer from data quality issues, which strongly affect the reliability of the simulation results [6]. Therefore, it is imperative to take these problems seriously.

II. RESEARCH OBJECTIVES

Given the context outlined above, this PhD research pursues the following two objectives:

1) Extended support for key BPS modelling tasks: While the field of automated BPS discovery renders promising results, there are still challenges ahead in discovering the individual BPS model components needed to make the model usable for supporting CM decisions.

2) Enabling interactive data-driven process simulation: Domain knowledge should be closely integrated during the discovery of BPS models to ensure the reliability and usability of the discovered simulation models.

III. PLANNED RESEARCH ACTIVITIES

The following subsections give an overview of the planned research activities for the two research objectives.

A. Extended Support for Key BPS Modelling Tasks

Based on a systematic literature review, we concluded that defining the control-flow, entity arrival rates, activity execution times, gateway routing logic, entity types, queueing disciplines, resource schedules, resource requirements, and resource roles are the most important modelling tasks to support CM decisions via simulation. These tasks correspond to a subset of the modelling tasks given by [7]. Most attention in PM research has been dedicated to control-flow definition [7]. However, for creating a simulation model for supporting CM decisions, we believe that all aforementioned tasks are required, albeit some tasks are more important than others.

In PM, only a limited amount of work has been devoted to integrating the various tasks needed to build a simulation model. The authors in [8] were the first to generate an initial simulation model from data. They included the process-flow, gateway routing logic, and resource pools. Later, the authors extended their work with activity durations and entity inter-arrival times [5]. Nevertheless, the authors emphasise that the derived initial model still has to be verified and, if required, augmented by domain experts to ensure validity.

In [9], a PM approach is proposed to generate BPS models for short-term KPI prediction. An approach similar to that in [5] is used. However, the resource perspective is left aside, assuming an infinite amount of resources is available [9].

Control-flow, resources, activity durations, and gateway routing logic are supported by the approach in [10]. It also supports inter-arrival times and resource schedules. However, the latter have to be defined manually by the domain expert.

None of the aforementioned studies tried to integrate all elements into a single, simulation-ready model. This is where Simod [11] extends the work on data-driven process simulation. Simod is a tool which automatically discovers BPS models from event logs. In addition, Simod is capable of measuring the accuracy of the obtained simulation model and allows optimising this accuracy using hyper-parameters [11].

While the initial results of data-driven BPS algorithms are promising, challenges remain in automatically deriving a simulation model for supporting CM decisions from event logs. The resource perspective is especially crucial for CM decisions. Incorrect resource requirements, pools, and schedules make the results of the model unreliable, resulting in inaccurate capacity requirement estimations. The state of the art still has limitations when it comes to defining the resource perspective. Part of this PhD research will be dedicated to improving the support of the resource perspective in data-driven BPS.

B. Enabling Interactive Data-Driven Process Simulation

As mentioned earlier, data quality issues should be taken seriously to ensure the reliability of the data-driven simulation model. Detecting these issues often requires domain knowledge. Therefore, it would be beneficial to involve the domain experts as early as possible to detect and handle data quality issues before integrating everything into a single simulation model. Especially in stochastic models, such as simulation, a problem in one part of the model may have a profound impact on other parts. It is much easier to solve issues at the root than to trace back the problem in a full simulation model.

Ideally, domain experts would conduct simulation studies themselves. After all, they know the process best. However, conducting simulation studies requires specific knowledge which domain experts often do not possess. Of course, they could learn more about constructing simulation models, but usually they are very busy and do not have the time to master the required skills.

Against this background, we propose a framework to interactively involve domain experts during the development of data-driven simulation models. The framework consists of three cycles. The first cycle is the initial model construction. In this step, for each required modelling task (e.g. determining the inter-arrival rates, activity durations, resource requirements, the control-flow, etc.), the data requirements are established. If these requirements are fulfilled, the quality of the data is assessed, and a discovery algorithm is applied. The results of this algorithm, together with the detected data quality issues (e.g. missing values, outliers, inconsistencies, etc.), are presented to the domain expert for validation. If needed, the expert can correct these issues and alter the discovery parameters until he or she is satisfied with the results.

In the second cycle, all the initial model components from the first cycle are integrated into a single simulation-ready model. The entire model is run for the first time, and the preliminary results are validated by the domain expert. By altering parameters, the domain expert can “calibrate” the model until he or she is satisfied with the preliminary results. During this calibration, the domain expert should immediately obtain an estimation of the impact of the changed parameter, instead of having to wait until the simulation has finished running, which could take quite a while depending on the complexity of the model.

The third cycle of the framework involves the actual model validation. The calibrated model is simulated extensively, and the domain expert validates the simulation results. If needed, the parameters of the simulation model can be altered again to obtain more realistic results. The validated model can be used for further analyses and to evaluate different scenarios.

The goal of this part of the PhD research is to develop a prototype which supports the interactive development of data-driven simulation models.

IV. CONCLUDING REMARKS

This PhD will mainly focus on the resource aspect of data-driven BPS and how domain experts can be interactively involved in the discovery of simulation models. This should culminate in the development of a prototype tool which allows interactive data-driven generation of BPS models based on event logs and domain knowledge. The derived simulation model will form the basis for supporting CM decisions in healthcare. Nevertheless, the prototype would also be usable in many other applications in different fields besides healthcare, such as production planning in manufacturing, supply chain logistics, and transportation.

REFERENCES

[1] C. Hicks, T. McGovern, G. Prior, and I. Smith, “Applying Lean Principles to the Design of Healthcare Facilities,” International Journal of Production Economics, vol. 170, pp. 677–686, 2015.
[2] F. R. Jacobs and R. B. Chase, “Strategic Capacity Management,” in Operations and Supply Management: The Core, ser. Operations and Decision Sciences. New York, NY, USA: McGraw Hill/Irwin, 2008, pp. 51–79.
[3] N. Melão and M. Pidd, “Use of Business Process Simulation: A Survey of Practitioners,” Journal of the Operational Research Society, vol. 54, no. 1, pp. 2–10, 2003.
[4] B. Depaire and N. Martin, “Data-Driven Process Simulation,” Encyclopedia of Big Data Technologies, 2018.
[5] A. Rozinat, R. S. Mans, M. Song, and W. M. P. van der Aalst, “Discovering Simulation Models,” Information Systems, vol. 34, no. 3, pp. 305–327, 2009.
[6] L. Vanbrabant, N. Martin, K. Ramaekers, and K. Braekers, “Quality of Input Data in Emergency Department Simulations: Framework and Assessment Techniques,” Simulation Modelling Practice and Theory, vol. 91, pp. 83–101, 2019.
[7] N. Martin, B. Depaire, and A. Caris, “The Use of Process Mining in Business Process Simulation Model Construction,” Business & Information Systems Engineering, vol. 58, no. 1, pp. 73–87, 2016.
[8] A. Rozinat, R. S. Mans, and W. M. P. van der Aalst, “Mining CPN Models: Discovering Process Models with Data from Event Logs,” in Workshop and Tutorial on Practical Use of Coloured Petri Nets and the CPN Tools, K. Jensen, Ed., Aarhus, Denmark, 2006, pp. 57–76.
[9] I. Khodyrev and S. Popova, “Discrete Modeling and Simulation of Business Processes Using Event Logs,” in Proceedings of the 14th International Conference on Computational Science, ser. Procedia Computer Science, D. Abramson, M. Lees, V. Krzhizhanovskaya, J. Dongarra, and P. M. A. Sloot, Eds., vol. 29. Cairns, QLD, Australia: Elsevier, 2014, pp. 322–331.
[10] B. Gawin and B. Marcinkowski, “How Close to Reality is the “as-is” Business Process Simulation Model?” Organizacija, vol. 48, no. 3, pp. 155–175, 2015.
[11] M. Camargo, M. Dumas, and O. González-Rojas, “Automated Discovery of Business Process Simulation Models from Event Logs,” Decision Support Systems, vol. 134, 2020.
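
To make the first cycle of the framework in Section III-B and the X-ray scanner example from the Introduction more concrete, the following is a minimal Python sketch, not the prototype envisaged in this research. It assumes a hypothetical toy event log with one X-ray activity per case and treats each activity start as the patient's arrival. The names `LOG`, `quality_issues`, `discover`, and `mean_wait` are illustrative assumptions. The sketch flags data quality issues for expert review, estimates the mean inter-arrival time and activity duration from the remaining records, and replays arrivals through a configurable number of identical scanners.

```python
import heapq
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Hypothetical toy event log: (case id, activity start, activity completion).
LOG = [
    ("c1", "2020-01-06 09:00", "2020-01-06 09:20"),
    ("c2", "2020-01-06 09:10", "2020-01-06 09:30"),
    ("c3", "2020-01-06 09:20", "2020-01-06 09:15"),  # completion before start
    ("c4", "2020-01-06 09:30", None),                # missing completion
    ("c5", "2020-01-06 09:40", "2020-01-06 10:00"),
]


def parse(ts):
    """Parse a timestamp string, or return None for a missing value."""
    return datetime.strptime(ts, FMT) if ts else None


def quality_issues(log):
    """Flag records that a domain expert should review before discovery."""
    issues = []
    for case, start, end in log:
        s, e = parse(start), parse(end)
        if s is None or e is None:
            issues.append((case, "missing timestamp"))
        elif e < s:
            issues.append((case, "completion before start"))
    return issues


def discover(log, excluded):
    """Estimate mean inter-arrival time and mean duration (in minutes),
    skipping the cases the expert chose to exclude."""
    rows = sorted((parse(s), parse(e)) for c, s, e in log if c not in excluded)
    starts = [s for s, _ in rows]
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(starts, starts[1:])]
    durations = [(e - s).total_seconds() / 60 for s, e in rows]
    return mean(gaps), mean(durations)


def mean_wait(arrivals, duration, servers):
    """Deterministic FIFO replay: mean waiting time when `servers`
    identical resources each need `duration` minutes per case."""
    free_at = [0.0] * servers  # time at which each resource becomes free
    heapq.heapify(free_at)
    waits = []
    for t in arrivals:
        start = max(t, heapq.heappop(free_at))  # earliest available resource
        waits.append(start - t)
        heapq.heappush(free_at, start + duration)
    return mean(waits)
```

With the toy log, cases `c3` and `c4` are flagged. Excluding them gives a mean inter-arrival time and a mean duration of 20 minutes each. Replaying arrivals at minutes 0, 10, 20, 30, and 40 with that duration gives a mean wait of 20 minutes with one scanner and no waiting with two. A real study would use stochastic distributions and many replications rather than a single deterministic replay.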