<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Interactive Data-Driven Business Process Simulation (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
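        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0001-8962-9515</contrib-id>
          <name>
            <surname>van Hulzen</surname>
            <given-names>Gerhardus</given-names>
          </name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Research group Business Informatics, Hasselt University</institution>
          ,
          <addr-line>Hasselt, Belgium</addr-line>
        </aff>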
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Today, healthcare systems worldwide are under constant
pressure. On the one hand, increasing population numbers,
ageing populations, lifestyle factors, and new technologies are
increasing the yearly expenses on healthcare. On the other
hand, budgets are under pressure due to economic austerity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
In order to provide high-quality care to all patients, healthcare
managers are forced to improve their care processes. Efficient
Capacity Management (CM) is one of the key aspects to ensure
this. This involves, amongst others, determining the suitable
resource levels – i.e. staff size, equipment, and facilities [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Business Process Simulation (BPS) can be used to support
managers during CM decisions. BPS uses a (computer) model
to imitate the behaviour of a business process. This approach
allows evaluating the effects of changes before implementing
them [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. For instance, BPS can be used to determine suitable
equipment levels, e.g. by simulating the effect of an additional
X-ray scanner on patient waiting times, throughput rates, and
staff workload.
      </p>
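      <p>The X-ray example can be made concrete with a minimal sketch of such a simulation. It assumes a single first-come-first-served queue in front of identical scanners, exponentially distributed inter-arrival and scan times, and purely illustrative parameter values (a patient every 10 minutes on average, 8 minutes per scan). None of these assumptions or numbers come from a real process, and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import heapq
import random


def simulate_mean_wait(n_scanners, n_patients=5000, mean_interarrival=10.0,
                       mean_scan_time=8.0, seed=42):
    """Simulate a FIFO queue in front of identical scanners.

    Returns the mean patient waiting time (same time unit as the inputs).
    All default parameter values are illustrative assumptions, not data.
    """
    rng = random.Random(seed)
    # Min-heap holding the time at which each scanner becomes free.
    free_at = [0.0] * n_scanners
    heapq.heapify(free_at)

    clock = 0.0
    total_wait = 0.0
    for _ in range(n_patients):
        # Exponential inter-arrival and scan times (a modelling assumption).
        clock += rng.expovariate(1.0 / mean_interarrival)
        scan_time = rng.expovariate(1.0 / mean_scan_time)
        # The next patient in line takes the scanner that frees up first.
        earliest_free = heapq.heappop(free_at)
        start = max(clock, earliest_free)
        total_wait += start - clock
        heapq.heappush(free_at, start + scan_time)
    return total_wait / n_patients


if __name__ == "__main__":
    for scanners in (1, 2):
        print(scanners, "scanner(s): mean wait",
              round(simulate_mean_wait(scanners), 2))
```
]]></preformat>
      <p>Because both runs use the same random seed, the two scenarios see the same patients and scan times, so the difference in mean waiting time can be attributed to the additional scanner alone.</p>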
      <p>
        In Process Mining (PM), the emerging field of data-driven
process simulation has shown promising first results in generating
simulation models from information captured in event logs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
These “discovered” models can form the basis for comparing
the operational effects of various capacity levels. The main
advantage of data-driven process simulation over “traditional”
simulation model development is the availability and
objectivity of event logs compared to other information sources, such as
interviews, process documentation, and observations [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
However, some challenges remain in the field of automated BPS
discovery. Most importantly, the lack of domain knowledge
makes it challenging to extract a reliable and usable simulation
model. In addition, event logs often suffer from data quality
issues, which strongly affect the reliability of the simulation
results [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Therefore, it is imperative to take these problems
seriously.
      </p>
    </sec>
    <sec id="sec-2">
      <title>II. RESEARCH OBJECTIVES</title>
      <p>Given the context outlined above, this PhD research pursues
the following two objectives:</p>
      <list list-type="order">
        <list-item>
          <label>1)</label>
          <p>Extended support for key BPS modelling tasks: While
the field of automated BPS discovery has produced promising
results, challenges remain in discovering the
individual BPS model components needed to make a discovered model usable for
supporting CM decisions.</p>
        </list-item>
        <list-item>
          <label>2)</label>
          <p>Enabling interactive data-driven process simulation:
Domain knowledge should be closely integrated during
the discovery of BPS models to ensure the reliability
and usability of the discovered simulation models.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-3">
      <title>III. PLANNED RESEARCH ACTIVITIES</title>
      <p>The following subsections give an overview of the planned research activities for the two research objectives.</p>
      <sec id="sec-3-1">
        <title>A. Extended Support for Key BPS Modelling Tasks</title>
        <p>
          Based on a systematic literature review, we concluded
that defining the control-flow, entity arrival rates, activity
execution times, gateway routing logic, entity types, queueing
disciplines, resource schedules, resource requirements, and
resource roles are the most important modelling tasks to
support CM decisions via simulation. These tasks correspond
to a subset of modelling tasks given by [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Most attention
of PM research has been dedicated to control-flow definition
[
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. However, to create a simulation model that supports
CM decisions, we believe that all aforementioned tasks are
required, although some tasks are more important than others.
        </p>
        <p>
          In PM, only a limited amount of work has been devoted
to integrating the various tasks needed to build a simulation
model. The authors in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] were the first to generate an initial
simulation model from data. They included the process-flow,
gateway routing logic, and resource pools. Later, the authors
extended their work with activity durations and entity
interarrival times [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Nevertheless, the authors emphasise that the
derived initial model still has to be verified and – if required
– augmented by domain experts to ensure validity.
        </p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], a PM approach is proposed to generate BPS models
for short-term KPI prediction. The approach is similar to that in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
However, the resource perspective is left aside by assuming
that an unlimited number of resources is available [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>
          Control-flow, resources, activity durations, and gateway
routing logic are supported by the approach in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. In
addition, the approach supports inter-arrival times and resource
schedules. However, the latter have to be defined manually
by the domain expert.
        </p>
        <p>
          None of the aforementioned studies tried to integrate all
elements into a single, simulation-ready model. This is where
Simod [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] extends the work on data-driven process
simulation. Simod is a tool that automatically discovers BPS
models from event logs. In addition, Simod can measure
the accuracy of the obtained simulation model and
allows this accuracy to be optimised using hyper-parameters [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
        <p>While the initial results of data-driven BPS algorithms are
promising, there are still challenges in automatically deriving
a simulation model for supporting CM decisions from event
logs. The resource perspective is especially crucial for CM
decisions: incorrect resource requirements, pools, and
schedules make the results of the model unreliable, resulting in
inaccurate capacity requirement estimations. The
state of the art still has limitations when it comes to defining the resource
perspective. Part of this PhD research will be dedicated to
improving the support of the resource perspective in
data-driven BPS.</p>
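        <p>To illustrate why the resource perspective is hard to derive from data alone, consider a deliberately naive heuristic for discovering resource pools: resources that executed exactly the same set of activities are placed in the same pool. This heuristic is an assumption made for illustration only; it is not the method used by Simod or by any of the cited approaches, and the event attribute names <monospace>resource</monospace> and <monospace>activity</monospace> are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def discover_resource_pools(event_log):
    """Group resources that executed exactly the same set of activities.

    `event_log` is an iterable of dicts with the (assumed) keys
    "resource" and "activity". Returns a sorted list of
    (activities, resources) pairs, each given as sorted lists.
    """
    activities_by_resource = defaultdict(set)
    for event in event_log:
        activities_by_resource[event["resource"]].add(event["activity"])

    # Resources with an identical activity set end up in the same pool.
    pools = defaultdict(list)
    for resource, activities in activities_by_resource.items():
        pools[frozenset(activities)].append(resource)

    # Sort everything so the result is deterministic and easy to review.
    return sorted(
        (sorted(activities), sorted(members))
        for activities, members in pools.items()
    )
```
]]></preformat>
        <p>Under this heuristic, a nurse who exceptionally performed one activity outside the usual role would be placed in a pool of their own, which would distort the capacity of both pools. Deciding whether such a split is meaningful is exactly the kind of judgement that requires domain knowledge.</p>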
      </sec>
      <sec id="sec-3-2">
        <title>B. Enabling Interactive Data-Driven Process Simulation</title>
        <p>As mentioned earlier, data quality issues should be taken
seriously to ensure the reliability of the data-driven simulation
model. Detecting these issues often requires domain
knowledge. Therefore, it would be beneficial to involve the domain
experts as early as possible to detect and handle data quality
issues before integrating everything into a single simulation
model. Especially in stochastic models, such as simulation models, a
problem in one part of the model may have a profound impact
on other parts. It is much easier to solve issues at the root than
to trace the problem back through a full simulation model.</p>
        <p>Ideally, domain experts would conduct simulation studies
themselves. After all, they know the process best. However,
conducting simulation studies requires specific knowledge
which domain experts often do not possess. Of course, they
could learn more about constructing simulation models, but
usually, they are very busy and do not have the time to master
the required skills.</p>
        <p>Against this background, we propose a framework to
interactively involve domain experts during the development
of data-driven simulation models. The framework consists
of three cycles. The first cycle is the initial model
construction. In this step, the data requirements are established
for each required modelling task (e.g. determining the
inter-arrival times, activity durations, resource
requirements, and the control-flow). If these requirements are fulfilled, the quality of
the data is assessed, and a discovery algorithm is applied. The
results of this algorithm, together with the detected data quality
issues (e.g. missing values, outliers, and inconsistencies), are
presented to the domain expert for validation. If needed,
the expert can correct these issues and alter the discovery
parameters until he or she is satisfied with the results.</p>
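        <p>The first cycle can be sketched for a single modelling task, here the inter-arrival times. The sketch checks the data requirements per event, collects data quality issues instead of silently dropping records, and applies a very simple discovery step (the mean gap between case arrivals). The event schema, the function name, and the example log are all assumptions made for illustration; a real prototype would present the returned issues to the domain expert and rerun the step after corrections.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from datetime import datetime
from statistics import mean

# Assumed minimal event schema for this modelling task (hypothetical).
REQUIRED_FIELDS = ("case_id", "activity", "timestamp")

# Made-up example log; the fourth event lacks a timestamp.
EXAMPLE_LOG = [
    {"case_id": "c1", "activity": "register", "timestamp": datetime(2020, 1, 6, 9, 0)},
    {"case_id": "c1", "activity": "x-ray", "timestamp": datetime(2020, 1, 6, 9, 20)},
    {"case_id": "c2", "activity": "register", "timestamp": datetime(2020, 1, 6, 9, 10)},
    {"case_id": "c3", "activity": "register", "timestamp": None},
    {"case_id": "c3", "activity": "x-ray", "timestamp": datetime(2020, 1, 6, 9, 40)},
]


def discover_interarrival_times(event_log):
    """Return (mean inter-arrival time in minutes, list of quality issues)."""
    issues = []
    usable = []
    for index, event in enumerate(event_log):
        # Step 1: data requirements and data quality check per event.
        missing = [f for f in REQUIRED_FIELDS if event.get(f) is None]
        if missing:
            issues.append(f"event {index}: missing {', '.join(missing)}")
        else:
            usable.append(event)

    # Step 2: a case arrives at the timestamp of its earliest usable event.
    arrivals = {}
    for event in usable:
        case, stamp = event["case_id"], event["timestamp"]
        if case not in arrivals or stamp < arrivals[case]:
            arrivals[case] = stamp

    ordered = sorted(arrivals.values())
    if len(ordered) < 2:
        issues.append("fewer than two cases: inter-arrival time undefined")
        return None, issues

    # Step 3: discovery, here simply the mean gap between arrivals.
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]
    return mean(gaps), issues


if __name__ == "__main__":
    estimate, found_issues = discover_interarrival_times(EXAMPLE_LOG)
    print("mean inter-arrival time (min):", estimate)
    print("issues for the domain expert:", found_issues)
```
]]></preformat>
        <p>In the example log, the missing registration timestamp of case c3 makes that case appear to arrive at its X-ray event. The estimate is therefore biased, and only someone who knows the process can tell whether the reported issue matters.</p>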
        <p>In the second cycle, all the initial model components from
the first cycle are integrated into a single simulation-ready
model. The entire model is then run for the first time, and the
domain expert validates the preliminary results.
By altering parameters, the domain expert
can “calibrate” the model until he or she is satisfied with
the preliminary results. During this calibration, the domain
expert should immediately obtain an estimation of the impact
of the changed parameter, instead of having to wait until the
simulation has finished running, which could – depending on
the complexity of the model – take quite a while.</p>
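        <p>How such an immediate estimate would be obtained is not fixed at this stage. One conceivable option, shown here purely as an assumption, is to approximate a single resource pool by an M/M/c queue and use the Erlang C formula to estimate the mean waiting time analytically. This requires Poisson arrivals, exponential service times, and identical resources, which is a strong simplification of a real care process.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from math import factorial


def erlang_c_mean_wait(arrival_rate, service_rate, servers):
    """Mean waiting time in queue for an M/M/c system (Erlang C formula).

    `arrival_rate` and `service_rate` are per time unit; `service_rate`
    is the rate of a single server. Returns float("inf") when the pool
    cannot keep up with demand (utilisation of 1 or more).
    """
    offered_load = arrival_rate / service_rate      # a = lambda / mu
    utilisation = offered_load / servers            # rho = a / c
    if utilisation >= 1.0:
        return float("inf")
    tail = offered_load ** servers / factorial(servers) / (1.0 - utilisation)
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    prob_wait = tail / (head + tail)                # P(arrival has to wait)
    return prob_wait / (servers * service_rate - arrival_rate)


if __name__ == "__main__":
    # Illustrative values only: 6 patients/hour, 7.5 scans/hour per scanner.
    for scanners in (1, 2, 3):
        print(scanners, erlang_c_mean_wait(6.0, 7.5, scanners))
```
]]></preformat>
        <p>An estimate of this kind is available instantly after a parameter change, at the price of ignoring routing, schedules, and non-exponential durations. It could therefore only complement, not replace, the full simulation run.</p>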
        <p>The third cycle of the framework involves the actual model
validation. The calibrated model is simulated extensively, and
the domain expert validates the simulation results. If needed,
the parameters of the simulation model can be altered again
to obtain more realistic results. The validated model can be
used for further analyses and to evaluate different scenarios.</p>
        <p>The goal of this part of the PhD research is to develop a
prototype which supports the interactive development of
data-driven simulation models.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IV. CONCLUDING REMARKS</title>
      <p>This PhD research will mainly focus on the resource aspect of
data-driven BPS and how domain experts can be interactively
involved in the discovery of simulation models. This should
culminate in the development of a prototype tool which allows
interactive data-driven generation of BPS models based on
event logs and domain knowledge. The derived simulation
model will form the basis for supporting CM decisions in
healthcare. Nevertheless, the prototype would also be usable in
many other applications in different fields besides healthcare,
such as production planning in manufacturing, supply chain
logistics, and transportation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Hicks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>McGovern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Prior</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Smith</surname>
          </string-name>
          , “
          <article-title>Applying Lean Principles to the Design of Healthcare Facilities</article-title>
          ,”
          <source>International Journal of Production Economics</source>
          , vol.
          <volume>170</volume>
          , pp.
          <fpage>677</fpage>
          -
          <lpage>686</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F. R.</given-names>
            <surname>Jacobs</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. B.</given-names>
            <surname>Chase</surname>
          </string-name>
          , “
          <article-title>Strategic Capacity Management</article-title>
          ,” in
          <source>Operations and Supply Management: The Core</source>
          , ser. Operations and Decision Sciences. New York, NY, USA: McGraw Hill/Irwin,
          <year>2008</year>
          , pp.
          <fpage>51</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Melão</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Pidd</surname>
          </string-name>
          , “
          <article-title>Use of Business Process Simulation: A Survey of Practitioners</article-title>
          ,”
          <source>Journal of the Operational Research Society</source>
          , vol.
          <volume>54</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>2</fpage>
          -
          <lpage>10</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Depaire</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Martin</surname>
          </string-name>
          , “
          <article-title>Data-Driven Process Simulation</article-title>
          ,”
          <source>Encyclopedia of Big Data Technologies</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rozinat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Mans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Song</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , “
          <article-title>Discovering Simulation Models</article-title>
          ,”
          <source>Information Systems</source>
          , vol.
          <volume>34</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>305</fpage>
          -
          <lpage>327</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Vanbrabant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ramaekers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Braekers</surname>
          </string-name>
          , “
          <article-title>Quality of Input Data in Emergency Department Simulations: Framework and Assessment Techniques</article-title>
          ,”
          <source>Simulation Modelling Practice and Theory</source>
          , vol.
          <volume>91</volume>
          , pp.
          <fpage>83</fpage>
          -
          <lpage>101</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Depaire</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Caris</surname>
          </string-name>
          , “
          <article-title>The Use of Process Mining in Business Process Simulation Model Construction</article-title>
          ,”
          <source>Business &amp; Information Systems Engineering</source>
          , vol.
          <volume>58</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>87</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rozinat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Mans</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , “
          <article-title>Mining CPN Models: Discovering Process Models with Data from Event Logs</article-title>
          ,” in
          <source>Workshop and Tutorial on Practical Use of Coloured Petri Nets and the CPN Tools</source>
          , K. Jensen, Ed., Aarhus, Denmark,
          <year>2006</year>
          , pp.
          <fpage>57</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>I.</given-names>
            <surname>Khodyrev</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Popova</surname>
          </string-name>
          , “
          <article-title>Discrete Modeling and Simulation of Business Processes Using Event Logs</article-title>
          ,” in
          <source>Proceedings of the 14th International Conference on Computational Science</source>
          , ser. Procedia Computer Science,
          <string-name>
            <given-names>D.</given-names>
            <surname>Abramson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Krzhizhanovskaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dongarra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. M. A.</given-names>
            <surname>Sloot</surname>
          </string-name>
          , Eds., vol.
          <volume>29</volume>
          . Cairns, QLD, Australia: Elsevier,
          <year>2014</year>
          , pp.
          <fpage>322</fpage>
          -
          <lpage>331</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>Gawin</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Marcinkowski</surname>
          </string-name>
          , “
          <article-title>How Close to Reality is the “as-is” Business Process Simulation Model?</article-title>
          ”
          <source>Organizacija</source>
          , vol.
          <volume>48</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>155</fpage>
          -
          <lpage>175</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Camargo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>González-Rojas</surname>
          </string-name>
          , “
          <article-title>Automated Discovery of Business Process Simulation Models from Event Logs</article-title>
          ,”
          <source>Decision Support Systems</source>
          , vol.
          <volume>134</volume>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>