<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Event Log Knowledge to Support Business Process Simulation Model Construction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Niels Martin</string-name>
          <email>niels.martin@uhasselt.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hasselt University</institution>
          ,
          <addr-line>Agoralaan Building D, 3590 Diepenbeek</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>My dissertation focuses on the use of event log knowledge, i.e. process mining, to support the development of business process simulation models. Although the Process Mining Manifesto highlights this topic as a key research challenge, prior research efforts tend to have a proof-of-concept nature. Against this background, this dissertation contributes towards fundamentally bridging the gap between these domains by (i) providing the required conceptualization and (ii) developing a set of methods that extract knowledge from event logs to support specific business process simulation modeling tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Niels Martin</kwd>
        <kwd>Business process simulation</kwd>
        <kwd>Process mining</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Positioning and dissertation objectives</title>
      <p>Every organization comprises a set of business processes, such as the
production process and the transportation process. When decision-makers analyze
these processes and particular issues appear, several ideas for potential process
changes are likely to be generated.</p>
      <p>
        To evaluate the effects of policy measures, managers can experiment with
the real-life process by, e.g., changing the staffing policy and measuring its
operational effect. However, this is a high-risk approach, as implementing
a measure does not guarantee the desired outcomes. This is a setting in which
business process simulation can be a valuable instrument. Business process
simulation (BPS) refers to the imitation of business process behavior through the
use of a simulation model. Using a BPS model, an organization can verify the
consequences of proposed process modifications prior to implementation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        The use of BPS requires the construction of a simulation model, which, in
turn, requires profound insight into the business process. To this end,
extensive information needs to be collected. Typical information sources include
business documents, interviews with business experts and observations of the
process. Despite the valuable insights that can stem from these information
sources and their common use, their limitations should also be recognized.
Business documents can contain information deviating from real-life process behavior
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Interviews, in their turn, can result in contradictory information [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and can,
e.g., be heavily influenced by recent experiences within the business process [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Collecting observational data requires a significant time investment and can
suffer from the Hawthorne effect, which refers to the performance increase of staff
members due to the mere fact that their actions are observed [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>The aforementioned limitations stress the need for information sources that
are more readily available and less influenced by human perception. In this
respect, an important trend is that business processes are increasingly supported
by process-aware information systems such as Enterprise Resource Planning
systems. These systems record process execution information in an event log, the
structure of which is illustrated in Table 1. Each row in the event log
represents 'something' that happens in the process. For instance, the first row
indicates that Sue started to register application 143 on April 3rd, 2018 at 08:52:41.
She completes this registration at 09:04:04, as shown in the second row.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>case id</th><th>timestamp</th><th>activity</th><th>transaction type</th><th>resource</th><th>...</th></tr>
          </thead>
          <tbody>
            <tr><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
            <tr><td>143</td><td>03/04/2018 08:52:41</td><td>Register application</td><td>start</td><td>Sue</td><td>...</td></tr>
            <tr><td>143</td><td>03/04/2018 09:04:04</td><td>Register application</td><td>complete</td><td>Sue</td><td>...</td></tr>
            <tr><td>144</td><td>03/04/2018 09:04:04</td><td>Register application</td><td>start</td><td>Sue</td><td>...</td></tr>
            <tr><td>107</td><td>03/04/2018 09:06:22</td><td>Judge application</td><td>start</td><td>Mike</td><td>...</td></tr>
            <tr><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
          </tbody>
        </table>
      </table-wrap>
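      <p>To make the event log structure concrete, the following minimal Python sketch pairs start and complete events into activity instances and derives their durations. The rows are the four events shown in Table 1; the function names and the pairing rule (same case, activity, and resource) are illustrative assumptions and are not taken from the dissertation.</p>
      <preformat>
```python
from datetime import datetime

# Rows mirror the illustrative event log excerpt; dates are day/month/year.
EVENT_LOG = [
    ("143", "03/04/2018 08:52:41", "Register application", "start", "Sue"),
    ("143", "03/04/2018 09:04:04", "Register application", "complete", "Sue"),
    ("144", "03/04/2018 09:04:04", "Register application", "start", "Sue"),
    ("107", "03/04/2018 09:06:22", "Judge application", "start", "Mike"),
]


def parse_timestamp(text):
    """Parse a 'dd/mm/yyyy hh:mm:ss' timestamp string."""
    return datetime.strptime(text, "%d/%m/%Y %H:%M:%S")


def build_activity_instances(rows):
    """Pair each start event with the next complete event that has the
    same case, activity and resource (an assumed pairing rule).

    Returns the completed instances and the start events left unmatched.
    """
    open_starts = {}
    instances = []
    for case_id, stamp, activity, transaction, resource in rows:
        key = (case_id, activity, resource)
        moment = parse_timestamp(stamp)
        if transaction == "start":
            open_starts[key] = moment
        elif transaction == "complete" and key in open_starts:
            start = open_starts.pop(key)
            instances.append({
                "case_id": case_id,
                "activity": activity,
                "resource": resource,
                "start": start,
                "complete": moment,
                "duration_seconds": (moment - start).total_seconds(),
            })
    return instances, open_starts


instances, still_running = build_activity_instances(EVENT_LOG)
for instance in instances:
    print(instance["case_id"], instance["activity"], instance["duration_seconds"])
```
      </preformat>
      <p>For the excerpt, only the registration of application 143 has both a start and a complete event, so it is the only activity instance with a known duration; the two remaining start events stay open.</p>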
      <p>
        As an event log is automatically recorded and captures real-life process
behavior, it is well-suited as an additional information source for the construction
of a BPS model. My dissertation focuses on the use of event log knowledge, i.e.
process mining, to support simulation model construction. This is marked as one
of the key research challenges in the Process Mining Manifesto [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Literature on this topic includes [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], with [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] providing the most
comprehensive support. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a stepwise method to mine a simulation model
from an event log is provided together with suggestions for suitable ProM plugins
to perform mining operations. However, simplifying assumptions were made, such
as equating case arrival time to the first activity's start time. Moreover, modeling
tasks such as the queue discipline and resource schedules are excluded.
      </p>
      <p>From a thorough analysis of existing literature, it follows that, despite the
intrinsic potential of process mining to support BPS model construction, limited
insights exist into the systematic use of event log knowledge within this context.
A significant research gap is present, as literature on this matter tends to have
a proof-of-concept nature. This implies that important simplifying assumptions
are made and that various modeling tasks that need to be performed when
building a real-life BPS model are not taken into consideration. Moreover, the
required conceptual foundation to fundamentally integrate event log knowledge
in BPS model development is not present, marking another research challenge.</p>
      <p>Consistent with these research gaps, the two main objectives of this work are
(i) providing the required conceptualization to fundamentally integrate event log
knowledge in BPS model construction and (ii) developing a set of methods that
extract knowledge from event logs to support specific BPS modeling tasks.</p>
    </sec>
    <sec id="sec-2">
      <title>Overview of the dissertation</title>
      <p>Given the limited research insights into the systematic use of event log knowledge
within this context, the first part of the dissertation is situated at a conceptual
level (Section 2.1). Taking this conceptualization as an input, the second part
of the thesis develops methods to mine relevant insights related to specific BPS
modeling tasks from an event log (Section 2.2).</p>
      <sec id="sec-2-1">
        <title>Conceptualization of the use of event log knowledge in BPS model construction</title>
        <p>A first stage of conceptualization relates to defining the key steps of a simulation
study. This enables positioning the use of event log knowledge in the wider
picture of a BPS study. Simulation literature proposes a multitude of methods
describing the key steps in a simulation study. Even though key components
such as computer modeling are included in most of them, differences can also be
observed. However, existing methods tend to be defined in isolation. Given this
observation, a new method for conducting a simulation study is developed, based
on a critical analysis of 14 existing methods. The developed method consists of
9 steps with continuous assessment as a central feedback mechanism.</p>
        <p>After elaborating on the wider perspective of a BPS study, the second stage
of conceptualization uses a generic representation of a BPS model to define a
series of modeling tasks. For each of them, the potential of event log
knowledge to support their specification is outlined. These insights are compared to
the state of the art in process mining literature in general and research on the
use of process mining for BPS purposes in particular. From this comparison, it
follows that few process mining algorithms are directly applicable to support
BPS model construction. This can be attributed to the differences between the
underlying paradigms in process mining and simulation. While many process
mining techniques treat an event log as a series of independent cases, simulation
mimics the behavior of an operational business process in which the interaction
between cases that are simultaneously present in the process influences process
behavior. Hence, tailored methods need to be developed for a wide range of BPS
modeling tasks, which demonstrates that additional research efforts are required.
The structured set of research challenges presented in the dissertation offers
clear directions for future research and, in this way, forms a solid foundation for the
future development of the domain.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Developed methods to support specific BPS modeling tasks</title>
        <p>Using the conceptualization outlined in Section 2.1 as an input, the dissertation
also develops a series of new algorithms to retrieve event log insights to support
specific BPS modeling tasks. Three modeling tasks are selected based on criteria
including the potential impact on the simulation study results when a task is
performed inaccurately and the absence of process mining algorithms that are
directly applicable or easily adjustable. Based on this assessment, new methods
are developed to support the specification of (i) the entity arrival rate, (ii) batch
processing, and (iii) resource schedules. All algorithms are rigorously evaluated
using artificial data and, whenever relevant, real-life event logs.</p>
        <p>Entity arrival rate. An entity is a dynamic object (e.g., a patient or a parcel)
that moves through the process. The entity arrival rate models the pace at which
new entities arrive, which is often expressed using a parameterized probability
distribution for the time between two consecutive arrivals. As it defines the inflow
at the beginning of the process, inaccurately modeling the arrival rate can have
a major influence on simulation outputs.</p>
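        <p>As an illustration of a parameterized inter-arrival time distribution, the following minimal Python sketch estimates the rate of an exponential distribution from a list of arrival times. The exponential form, the function names, and the arrival times are assumptions made for this example only. The sketch takes the arrival times as given, does not correct for queueing at the first activity, and is not the dissertation's algorithm.</p>
        <preformat>
```python
from datetime import datetime
from statistics import mean


def interarrival_times_minutes(arrival_stamps):
    """Return the gaps, in minutes, between consecutive arrivals."""
    ordered = sorted(arrival_stamps)
    return [
        (later - earlier).total_seconds() / 60.0
        for earlier, later in zip(ordered, ordered[1:])
    ]


def fit_exponential_rate(gaps):
    """Maximum likelihood estimate of an exponential rate parameter:
    the reciprocal of the mean inter-arrival time."""
    if not gaps:
        raise ValueError("at least two arrivals are needed")
    average_gap = mean(gaps)
    if average_gap == 0:
        raise ValueError("all arrivals coincide; the rate is undefined")
    return 1.0 / average_gap


# Hypothetical arrival times, invented for this example.
arrivals = [
    datetime(2018, 4, 3, 8, 50),
    datetime(2018, 4, 3, 9, 0),
    datetime(2018, 4, 3, 9, 4),
    datetime(2018, 4, 3, 9, 6),
]

gaps = interarrival_times_minutes(arrivals)   # 10, 4 and 2 minutes
rate = fit_exponential_rate(gaps)             # arrivals per minute
print(gaps, rate)
```
        </preformat>
        <p>The quality of such an estimate depends entirely on how well the supplied timestamps reflect the true arrival times.</p>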
        <p>Process mining literature on this topic implicitly assumes that an entity's
arrival time corresponds to its first recorded timestamp. However, the developed
taxonomy shows that this assumption is only appropriate when no queues are
formed at the start activity. Given this observation, an Arrival Rate
Parameter Retrieval Algorithm (ARPRA) is created, which is the first algorithm that
explicitly takes queue formation into account during arrival rate mining.</p>
        <p>Batch processing. Batch processing refers to a resource's tendency to
accumulate entities in order to process them simultaneously, concurrently, or
sequentially. Regarding batch processing, two distinct topics are treated: batch
identification and batch activation rules. For both topics, a new mining
algorithm is developed, implemented and evaluated. Firstly, for batch identification,
the Batch Organization of Work Identification algorithm (BOWI) is introduced,
which is based on a distinction between simultaneous, sequential and concurrent
batching. It is the first algorithm that systematically identifies batches in an
event log and calculates a set of batch processing metrics.</p>
        <p>Secondly, related to batch activation rules, the Batch Activation Rule
Identification algorithm (BARI) is created. A batch activation rule captures the
circumstances under which a resource starts processing a batch, e.g. when a
particular number of entities has been collected. Using explicit and implicit event log
information, BARI transforms the problem into a classification problem, which
is solved using decision tree analysis to obtain batch activation rules.</p>
        <p>Resource schedules. A resource schedule expresses the availability of a resource
for the process under consideration. When focusing on human resources, staff
members might not work full-time or might be involved in multiple processes.
Consequently, it can be difficult to deduce their availability for one specific
process from generic schedules provided by the HR department. This is especially
the case when work organization is less rigid, leaving significant freedom for
resources to divide their attention between processes. As resource availability
is critical information when building a simulation model, a Resource Schedule
Identification Method (RSIM) is developed, which supports the specification of
resource schedules. More specifically, daily availability records are mined from
the event log, which express a resource's allocation to a particular process on a
particular day. It is the first algorithm of its kind, as the daily availability records
take into account (i) the temporal dimension of availability, i.e. the time of day
at which a resource is available, and (ii) intermediate availability interruptions.</p>
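        <p>The idea of a daily availability record with a time-of-day dimension and intermediate interruptions can be illustrated with a short Python sketch. It merges the execution periods of a resource on one day and treats every gap longer than a threshold as an interruption. The 30-minute threshold, the function name, and the sample data are hypothetical, instances are assumed not to span midnight, and the sketch is a simplification rather than the RSIM method itself.</p>
        <preformat>
```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_availability(instances, max_gap=timedelta(minutes=30)):
    """Derive availability periods per resource and day.

    `instances` is an iterable of (resource, start, complete) tuples.
    Execution periods separated by at most `max_gap` are merged into one
    availability period; longer gaps are treated as interruptions.
    The threshold is an assumption made for this sketch.
    """
    per_day = defaultdict(list)
    for resource, start, complete in instances:
        per_day[(resource, start.date())].append((start, complete))

    records = {}
    for key, intervals in per_day.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, complete in intervals[1:]:
            last_start, last_complete = merged[-1]
            if max_gap >= start - last_complete:
                # Close enough to the previous period: extend it.
                merged[-1] = (last_start, max(last_complete, complete))
            else:
                # Gap exceeds the threshold: start a new period.
                merged.append((start, complete))
        records[key] = merged
    return records


# Hypothetical activity instances, invented for this example.
sample = [
    ("Sue", datetime(2018, 4, 3, 8, 52), datetime(2018, 4, 3, 9, 4)),
    ("Sue", datetime(2018, 4, 3, 9, 4), datetime(2018, 4, 3, 9, 20)),
    ("Sue", datetime(2018, 4, 3, 13, 0), datetime(2018, 4, 3, 13, 30)),
]

records = daily_availability(sample)
for (resource, day), periods in records.items():
    print(resource, day, [(s.time(), c.time()) for s, c in periods])
```
        </preformat>
        <p>For the sample data, the two morning instances merge into one availability period and the afternoon instance forms a second one, so the record contains one intermediate interruption.</p>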
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Key contribution of the dissertation</title>
      <p>
        My dissertation presents an important step towards the integration of process
mining in BPS model construction. Despite the fact that the Process Mining
Manifesto [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] highlights this integration as a key research challenge, prior
research efforts tend to have a proof-of-concept nature. Consequently, from a
scientific perspective, the developed conceptualization in this dissertation provides
much-needed structure to this research field. Moreover, the developed
algorithms demonstrate how an event log can be used in previously unexplored
directions to, e.g., mine batching behavior or resource schedules. From a
managerial perspective, the improved BPS models due to the use of process mining
will provide more accurate insights into the effect of particular policy alternatives
on process performance. In this way, BPS will become a more powerful decision
support tool, which will translate into an increased use of BPS in practice. Even
though further research is required, e.g. by developing algorithms for other
modeling tasks, this dissertation lays important foundations towards more powerful
BPS models through the use of process mining.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adriansyah</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Medeiros</surname>
            ,
            <given-names>A.K.A.</given-names>
          </string-name>
          , ...,
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westergaard</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wynn</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Process mining manifesto</article-title>
          .
          <source>Lecture Notes in Business Information Processing</source>
          <volume>99</volume>
          ,
          <fpage>169</fpage>
          –
          <lpage>194</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aguirre</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parra</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alvarado</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Combination of process mining and simulation techniques for business process redesign: a methodological approach</article-title>
          .
          <source>Lecture Notes in Business Information Processing</source>
          <volume>162</volume>
          ,
          <fpage>24</fpage>
          –
          <lpage>43</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dickey</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pearson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Recency effect in college student course evaluations</article-title>
          .
          <source>Practical Assessment, Research and Evaluation</source>
          <volume>10</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>10</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Leyer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moormann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Comparing concepts for shop floor control of information-processing services in a job shop setting: a case from the financial services sector</article-title>
          .
          <source>International Journal of Production Research</source>
          <volume>53</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1168</fpage>
          –
          <lpage>1179</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Melão</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pidd</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Use of business process simulation: a survey of practitioners</article-title>
          .
          <source>Journal of the Operational Research Society</source>
          <volume>54</volume>
          (
          <issue>1</issue>
          ),
          <fpage>2</fpage>
          –
          <lpage>10</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Maruster</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Beest</surname>
            ,
            <given-names>N.R.T.P.</given-names>
          </string-name>
          :
          <article-title>Redesigning business processes: a methodology based on simulation and process mining techniques</article-title>
          .
          <source>Knowledge and Information Systems</source>
          <volume>21</volume>
          (
          <issue>3</issue>
          ),
          <fpage>267</fpage>
          –
          <lpage>297</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Simulation: the practice of model development and use</article-title>
          . Wiley, Chichester (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Rozinat</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mans</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          :
          <article-title>Discovering simulation models</article-title>
          .
          <source>Information Systems</source>
          <volume>34</volume>
          (
          <issue>3</issue>
          ),
          <fpage>305</fpage>
          –
          <lpage>327</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Vincent</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Input data analysis</article-title>
          . In:
          <string-name>
            <surname>Banks</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (ed.)
          <source>Handbook of simulation: principles, advances, applications, and practice</source>
          , pp.
          <fpage>3</fpage>
          –
          <lpage>30</lpage>
          . John Wiley &amp; Sons, Hoboken (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>