<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A New Process Discovery Algorithm for Exploratory Data Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jonas Lieben</string-name>
          <email>jonas.lieben@uhasselt.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benoît Depaire</string-name>
          <role>Supervisor</role>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mieke Jans</string-name>
          <role>Supervisor</role>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hasselt University</institution>
          ,
          <addr-line>Martelarenlaan 42, 3500 Diepenbeek, Belgium; FWO, Egmontstraat 5, 1000 Brussels</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <fpage>19</fpage>
      <lpage>27</lpage>
      <abstract>
        <p>The domain of process mining has produced many discovery techniques which can be used to generate a process representation of event data. However, existing techniques have a flaw for exploratory data analysis (EDA): they tend to produce process models which contain more process behaviour than is observed in the data, and they do not optimise for understandability. This severely limits their value for EDA, because only patterns which can be observed in the data should be distilled when performing an EDA. We explain why this limitation is important and present a methodology to overcome it. This methodology describes how a discovery algorithm that is suitable for EDA can be developed.</p>
      </abstract>
      <kwd-group>
        <kwd>Process</kwd>
        <kwd>Exploratory data analysis</kwd>
        <kwd>Comprehensibility</kwd>
        <kwd>Precision</kwd>
        <kwd>Process discovery algorithm</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In recent years, companies have increasingly collected and stored event
data. This type of data describes the occurrences of events during the execution
of a business process. Originally, its main source was the IT systems supporting
business operations. Recently, the Internet of Things, with all its sensors measuring
changes in the environment, has become an important new source of event data.</p>
      <p>
        The analysis of event data and the underlying process belongs to the domain
of process mining. Within process mining, three broad categories exist: process
discovery, conformance checking and process enhancement [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This project fits
in the subdomain of process discovery. The goal of process discovery is to create
a (visual) model representing the process based on event data. These models are
typically visualised in a graph-based language such as Petri nets or BPMN.
      </p>
      <p>
        Models learned from data serve multiple purposes. In this project we focus
on describing and summarising event data and on revealing
interesting patterns within this data. Such models are used for exploratory research.
Exploratory data analysis (EDA) is an important preliminary to confirmatory
and predictive analysis. John Tukey stated that: "Exploratory data analysis can
never be the whole story, but nothing else can serve as the foundation stone as
the first step" [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. When presented with a large event log, EDA provides a good
understanding of the data at hand which is essential for a useful further analysis
of the underlying process.
      </p>
      <p>Researchers from the domain of process mining proposed many discovery
techniques which can be used to create a process representation of event data.
However, these techniques come with a limitation for EDA. They tend to
generate process models which contain more process behaviour than is observed in
the data, because they were developed with the rationale that an event log is
incomplete. Therefore, they produce models which generalise the behaviour in
the event log to represent all possible behaviour.</p>
      <p>Figure 1 shows a simple example of a process model which was discovered from
event data. A process can be executed multiple times and each execution is referred
to as a case. The sequence of events during a process execution is called a
trace. Two cases share the same trace if their events occur in the same order. The
left side of Figure 1 shows an example of an event log containing several traces.</p>
      <p>
        Figure 1 shows the BPMN representation of the model
discovered by the Evolutionary Tree Miner [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] using the traces on the left side. Other
miners discover similar models. While the model is a concise representation of
the event data, it is not perfectly precise. In process mining, a perfectly precise
model only contains the observed process behaviour. The model in Figure 1 is
not perfectly precise, because it allows the execution of the unobserved traces
ACDEFG and ABDEGF. To our knowledge, there are not many process
discovery techniques that guarantee a perfectly precise model as outcome.
      </p>
      <p>We consider this an important research gap for EDA as caution is needed
when visualising patterns which are not completely present within the data. Such
patterns might mislead the researcher in their conclusions. Based on the model in
Figure 1 the researcher might conclude that E, F and G occur in any order.
However, the data does not support this conclusion. Close inspection of the data
reveals for example that G never occurs between E and F. Therefore, a good
exploratory model should only hint at patterns that are not fully supported in
the data, but should never present them as facts.</p>
      <p>Mining a perfectly precise model is not difficult. The trace model, which
consists of a single exclusive choice where each possible path represents a trace
from the event log, is always perfectly precise. Figure 2 shows the trace model
for the traces in Figure 1.</p>
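      <p>The notion of perfect precision can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a model is reduced to the set of traces it can replay, activities are single characters, and the log below is invented (it is not the log of Figure 1), although it mimics the situation in which G is never observed between E and F. The function names are hypothetical.</p>
      <preformat preformat-type="code">
```python
from itertools import permutations

# Invented log: E, F and G look concurrent, but G never occurs between E and F.
log = {"EFG", "FEG", "GEF", "GFE"}


def parallel_block(activities):
    # Language of a fully parallel construct: every interleaving is allowed.
    return {"".join(p) for p in permutations(activities)}


def trace_model(traces):
    # Language of the trace model: one exclusive-choice path per observed trace.
    return set(traces)


def unobserved(model_language, traces):
    # Behaviour the model allows although the log never shows it.
    return model_language - set(traces)


def is_perfectly_precise(model_language, traces):
    # Perfectly precise: the model contains only observed behaviour.
    return not unobserved(model_language, traces)


print(sorted(unobserved(parallel_block("EFG"), log)))  # ['EGF', 'FGE']
print(is_perfectly_precise(parallel_block("EFG"), log))  # False
print(is_perfectly_precise(trace_model(log), log))  # True
```
      </preformat>
      <p>In this invented example the parallel construct allows two traces that were never observed, whereas the trace model allows none.</p>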
      <p>
        Although the trace model is perfectly precise, it performs poorly in some
aspects of comprehensibility, because it is difficult to identify patterns of choice
and concurrency. Exploratory models should not only be perfectly precise, but
also be optimised for comprehensibility. Figure 3, for example, shows a
perfectly precise model for the traces in Figure 1 that is more comprehensible
than the trace model. The main difference between both models is the number
of duplicate tasks. The relation between duplicate tasks and comprehensibility
is complex and non-linear. Too many duplicate tasks hide patterns (cf. Figure
2) and decrease comprehensibility. However, a certain number of duplicate
tasks also adds structure to the process and reduces clutter [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] which increases
comprehensibility.
      </p>
      <p>
        This leads to the second research gap which we will address in this project.
According to a recent literature review [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], none of the existing metrics measuring
process model comprehension account for the influence of duplicate tasks on
comprehensibility. This is because process models are implicitly assumed to
have unique labels.
      </p>
      <p>This research project is important because the current discovery algorithms
produce imprecise models which limit their value for EDA. As EDA is an
important first step for any data analysis project, having an algorithm which produces
comprehensible and precise models will make it significantly easier to identify
interesting patterns and ideas for follow-up (confirmatory/predictive) analysis.
Our research is unconventional in the sense that the guiding principles for our
discovery algorithm will be precision and comprehensibility, whereas current
techniques rely on the assumption that the event log is incomplete.</p>
    </sec>
    <sec id="sec-2">
      <title>Research Objectives</title>
      <p>The overall research goal is to develop a process discovery algorithm for
exploratory data analysis which generates a perfectly precise and comprehensible
process visualization of the event data. To achieve this overall research goal,
three research objectives need to be achieved:</p>
      <p>
        Firstly, a discovery algorithm needs to be developed. This algorithm should
be able to generate models that are perfectly precise, are optimised for comprehensibility
and represent a given number of traces from the event log. In addition
to generating perfectly precise models, the algorithm must meet the following
requirements to be of value during exploratory research:
        <list list-type="bullet">
          <list-item><p>Generate comprehensible models: the purpose of EDA is to get a good
understanding of the data and to easily recognise interesting patterns.
Optimisation of comprehensibility must directly guide the algorithm.</p></list-item>
          <list-item><p>Generate models representing a certain number of traces: traditional
discovery algorithms focus on learning a model for the entire event log, which often
results in overly complex models. During EDA, the researcher is not always
interested in a single model representing all traces. Therefore, the data
analyst should be able to set the number of traces which should be represented
by the model. The algorithm should select the set of traces which results in
the most comprehensible perfectly precise model.</p></list-item>
          <list-item><p>Allow for different comprehensibility measures: the algorithm should be
flexible enough such that a different measure can be used without changes to
the algorithm.</p></list-item>
          <list-item><p>Be extensible: part of this project will focus on how to optimally visualise
certain aspects of a process. These insights will be incorporated into the
algorithm. The mechanism to do so must be flexible enough such that future
insights can be easily incorporated.</p></list-item>
          <list-item><p>Use BPMN as the graphical notation: empirical research has shown that
the BPMN notation appears to be the strongest in providing for a good
understanding by model readers [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].</p></list-item>
        </list>
      </p>
      <p>Secondly, more comprehensible visualizations for partial parallelism and
long-term dependencies need to be designed. Partial parallelism occurs when a set of
activities seems to happen in parallel, but not all possible orderings are
observed in the event log. Long-term dependencies are observed when an exclusive
choice is partially determined by the occurrence or non-occurrence of previous
activities in the trace. Both constructs are present in the data of Figure 1.
Activities E, F and G are only partially in parallel since we never observe a trace with
G occurring between E and F. The exclusive choice after activity D is limited to
activity H when activity C occurred. Both constructs have a tendency to make
a model less comprehensible. The goal of this research objective is to search for
different kinds of visualization to improve comprehensibility.</p>
      <p>Thirdly, an empirically validated comprehensibility measure needs to be
developed. This measure should be applicable to the models generated by our
algorithm. Our algorithm uses a comprehensibility measure as its guiding
mechanism. This implies that the measure also needs to account for the
comprehensibility cost of duplicate tasks and the different visualizations developed as part
of research objective two.</p>
    </sec>
    <sec id="sec-3">
      <title>State of the Art</title>
      <p>
        In the domain of process mining, many algorithms have been developed to
discover the control-flow of process models. To our knowledge, none of the existing
algorithms create perfectly precise models while optimising for
understandability. Most algorithms assume a less stringent notion of completeness than global
completeness. The notion of global completeness implies that all possible
behaviour of the process is included in the log [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. As the creators of most existing
algorithms assumed that the log is incomplete, their process models include
patterns which are not present in the log.
      </p>
      <p>
        Existing discovery algorithms can be grouped into five categories [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The
first category consists of the abstraction-based algorithms. One of the best-known
discovery algorithms is the α-algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The α-algorithm and its derivatives are
all abstraction-based algorithms. The second category, the heuristic-based
algorithms, contains only the heuristics miner [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], which takes into account the
presence of noise. The third category consists of the search-based algorithms, that is, all
algorithms which use metaheuristics to infer a process model. It includes the
Evolutionary Tree Miner, which is based on genetic algorithms [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Models created by existing algorithms of the first three categories are not perfectly
precise, because one of the underlying assumptions is the incompleteness of the
event log. The fourth category, the language-based region algorithms, uses the
theory of regions to construct a process model
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. These algorithms can generate perfectly precise models. However, they do not
directly optimise for understandability. The
last category contains the state discovery algorithms [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. These algorithms first
construct a transition system and then derive a Petri net. Nevertheless, they do
not directly optimise for understandability either.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Research Methodology</title>
      <p>
        The general methodology for this project follows the principles of design
science research (DSR). DSR deals with the creation of artifacts and scientific
knowledge about these artifacts with the goal to provide solutions to a class of
problems [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. A typical DSR project consists of five steps: problem identification,
requirement specification, artifact design and development, artifact evaluation
and result communication. The problem identification step has largely been done
during the preliminary study in preparation for this research paper. For each
research objective of Section 2, a methodology will be described.
      </p>
      <sec id="sec-4-1">
        <title>The Creation of the Comprehensibility Measure</title>
        <p>We start with the development of the empirically validated comprehensibility
measure, because the new discovery algorithm can only be created using the
measure. The first step in the development of this measure is a literature review
to identify different aspects of a process model which have been empirically
proven to influence comprehensibility. Through this literature review, we are
able to gather the requirements for the measure. Therefore, this step is the
requirement specification.</p>
        <p>
          During the artifact design and development phase, we will develop an
algorithm which quantifies the presence of these aspects within the process model.
This part of the study has already been carried out. 23 existing metrics were
identified and implemented in the programming language R. The results are
published as an R package on CRAN (<ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/understandBPMN/index.html">https://cran.r-project.org/web/packages/understandBPMN/index.html</ext-link>) and will be submitted to the journal
SoftwareX. At the moment, there are no other software packages that can
calculate all implemented metrics for a batch of BPMN models. In addition, an
exploratory factor analysis was performed. This factor analysis makes it possible to discover
the underlying dimensions of the large number of metrics. The sample of
models used for the factor analysis consisted of BPMN models from the BPM AI
(Business Process Management Academic Initiative) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and models generated
by the PTandLogGenerator [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The results of this factor analysis are published
as conference proceedings and are presented at the EOMAS (Enterprise &amp;
Organizational Modeling and Simulation) workshop which takes place in conjunction
with CAiSE 2018.
        </p>
        <p>Next, we will conduct an experiment to determine the impact of each
dimension on comprehensibility. Participants will receive process models and a set of
questions to test their understanding of the models. We apply a within-subjects
design to control for the effects on comprehensibility related to the model reader.
The dependent variables will be an objective comprehension accuracy measure
such as percentage of correct answers, a time-taken measure and a subjective
comprehension difficulty measure. The independent variables in this study will
be the quantifications of the different model aspects within the models. After
running the experiment, we will apply a multi-level regression analysis on the
collected data to determine the impact of each factor on comprehensibility. The
parameter estimates will become the empirically-validated weights for our new
comprehension measure. The artifact will afterwards be evaluated and
demonstrated and the results will be communicated as a journal paper.</p>
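        <p>One possible reading of this step can be written down as follows. The notation is an assumption made for illustration only: participant i answers questions about model j, x_{jk} is the score of model j on dimension k (k = 1, ..., K), y_{ij} is one of the dependent variables, and u_i is a random intercept per participant, which is one common way to set up a multi-level regression for a within-subjects design. The first line is the regression model; the second line shows how the estimated coefficients could act as weights of a comprehensibility measure C for a new model m.</p>
        <disp-formula>
          <tex-math>
\begin{aligned}
y_{ij} = \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{jk} + u_i + \varepsilon_{ij}, \qquad
C(m) = \hat{\beta}_0 + \sum_{k=1}^{K} \hat{\beta}_k \, x_k(m)
\end{aligned}
          </tex-math>
        </disp-formula>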
      </sec>
      <sec id="sec-4-2">
        <title>The Development of the Discovery Algorithm</title>
        <p>
          Once the comprehensibility measure has been created, the discovery algorithm can be
developed. Two versions of the discovery algorithm will be developed: one which
generates a model representing all traces and one which generates a model
representing a predefined minimum number of traces. To develop and design the
algorithms, we will transform the discovery problem into an optimization
problem. This approach has been applied before by [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], which used genetic algorithms
as search strategy. However, genetic algorithms are less suited for our problem
since it would be difficult to define mutation and cross-over operators that are
guaranteed to result in perfectly precise models.
        </p>
        <p>
          Our approach is inspired by Iterated Local Search [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], which has not been
applied before in this context. Our algorithm will use the trace model as initial
solution and apply domain-specific local search operators (LSOs) to modify the
model by transforming duplicate tasks into more complex process structures. For
the first version, we will develop at least two LSOs: one for parallel constructs
and one for exclusive choice constructs. Other LSOs may be defined later. Each
LSO should guarantee perfect precision after transformation. The LSOs are the
mechanism that makes the algorithm extensible.
        </p>
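        <p>The idea of starting from the trace model and applying precision-preserving operators can be illustrated with a minimal sketch. It is not the algorithm proposed in this project: it performs plain greedy local search without the perturbation step of Iterated Local Search, it knows only one hypothetical exclusive-choice operator (merge_choice), and it uses the number of task nodes as a stand-in for the comprehensibility measure. Activities are assumed to be single characters, a model is a list of branches, each branch is a sequence of choice sets, and the example log is invented.</p>
        <preformat preformat-type="code">
```python
from itertools import combinations, product


def trace_model(log):
    # Initial solution: one branch per distinct trace, one activity per position.
    return [tuple(frozenset([a]) for a in t) for t in sorted(set(log))]


def language(model):
    # All traces the model can replay.
    return {"".join(p) for branch in model for p in product(*branch)}


def cost(model):
    # Stand-in comprehensibility cost: number of task nodes in the model.
    return sum(len(choice) for branch in model for choice in branch)


def merge_choice(b1, b2):
    # Hypothetical LSO: two branches that differ in exactly one position are
    # merged into one branch with an exclusive choice at that position.
    if len(b1) != len(b2):
        return None
    diff = [i for i in range(len(b1)) if b1[i] != b2[i]]
    if len(diff) != 1:
        return None
    i = diff[0]
    return b1[:i] + (b1[i] | b2[i],) + b1[i + 1:]


def local_search(log):
    model = trace_model(log)
    improved = True
    while improved:
        improved = False
        for b1, b2 in combinations(model, 2):
            merged = merge_choice(b1, b2)
            if merged is None:
                continue
            candidate = [b for b in model if b not in (b1, b2)] + [merged]
            # Accept only cheaper models that replay exactly the observed traces.
            if cost(model) > cost(candidate) and language(candidate) == set(log):
                model = candidate
                improved = True
                break
    return model


log = ["ABDE", "ACDE", "ABDF", "ACDF"]  # invented example log
model = local_search(log)
print(cost(trace_model(log)), cost(model))  # 16 6
```
        </preformat>
        <p>On the invented log the four-branch trace model collapses into a single path A, (B or C), D, (E or F), which replays exactly the four observed traces while drawing six task nodes instead of sixteen.</p>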
        <p>For the second version of the algorithm, two new operators will be created: a
trace removal and a trace imputation operator. The removal operator will remove
parts of the process model that correspond to entire traces. The imputation
operator will add entire traces from the event log to the model. Both operators
should modify the model in such a way that perfect precision remains guaranteed.</p>
        <p>
          To verify whether our models are perfectly precise and represent all traces,
we first use the Behavior Recall metric [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] to check whether all traces are
represented by the model. If so, we will use the ETC Precision [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] to test if the
model is perfectly precise. Because both metrics require the process model to be
represented as a Petri net, we will use the transformation algorithm in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. For
the BPMN constructs in our models, this algorithm guarantees bisimilarity. To
evaluate the comprehensibility of the models, there is no other algorithm to
compare with. Therefore, we are limited to a descriptive analysis of the algorithm's
performance in terms of comprehensibility. We will analyse the improvements
with respect to the trace model and apply a sensitivity analysis to see which
aspects influence the algorithm's ability to improve comprehensibility. For
evaluation we will use a broad set of event logs, both real and artificial. The real
data will be taken from the collection made available by the IEEE Task Force
on Process Mining. The artificial data sets will be created using [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The results
will be made available in an R package on CRAN and in scientific papers.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>The Design of Alternative Visualizations for Partial Parallelism and Long-term Dependencies</title>
        <p>
          We aim to create more comprehensible visualizations for partial parallelism and
long-term dependencies. To determine the requirements of these alternative
visualizations, we will apply a multi-dimensional long-term case study with expert
users as suggested in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Since the purpose of the case study is to increase
transferability to people who have the same needs, a sample of 3 to 5 expert
users is appropriate [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Experts will be data analysts, both from academia and
industry. The case study is multi-dimensional because it combines different
research methods such as interviews and observations. It is also long-term, because
it involves a longitudinal study throughout the entire DSR cycle.
        </p>
        <p>These new visualisations need to be incorporated in the comprehensibility
measure of Section 4.1 and in the discovery algorithm of Section 4.2.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>Industry is becoming increasingly data-driven. Over the past decade, both the amount
of data collected and the nature of the data have changed. This project focuses on
event data, which describes how (business) processes are executed. The first step
in retrieving insights from data is exploratory data analysis (EDA).
Despite the many algorithms which discover process models from event data, none
of them are really suited for EDA, for two reasons. Firstly, they tend to create
models which contain behaviour that was not observed in the data. Secondly, almost
none of the existing algorithms optimise their models in terms of
comprehensibility, while this is necessary to easily recognise interesting patterns.</p>
      <p>This project contributes to both process mining and data analytics. It creates
a discovery algorithm suitable for EDA. The resulting models only represent the
observed behaviour and are optimised for comprehensibility. Further
contributions of this project are a first comprehensibility measure which takes duplicate
tasks into account and alternative visualizations for partial parallelism and
long-term dependencies.</p>
      <p>Acknowledgments I would like to thank FWO for my PhD scholarship.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          : Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer-Verlag, Berlin Heidelberg (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aalst</surname>
            ,
            <given-names>W.V.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weijters</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maruster</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Workflow mining: Discovering process models from event logs</article-title>
          .
          <source>IEEE Trans. on Knowl. and Data Eng.</source>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Buijs</surname>
            ,
            <given-names>J.C.A.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dongen</surname>
            ,
            <given-names>B.F.v.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          v.d.:
          <article-title>A genetic algorithm for discovering process trees</article-title>
          .
          <source>In: 2012 IEEE Congress on Evolutionary Computation</source>
          . pp.
          <volume>1</volume>
          –
          <issue>8</issue>
          (Jun
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dongen</surname>
            ,
            <given-names>B.F.v.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medeiros</surname>
            ,
            <given-names>A.K.A.d.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Process Mining: Overview and Outlook of Petri Net Discovery Algorithms</article-title>
          .
          <source>In: Transactions on Petri Nets and Other Models of Concurrency II</source>
          , pp.
          <volume>225</volume>
          –
          <fpage>242</fpage>
          . Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Figl</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Comprehension of Procedural Visual Business Process Models: A Literature Review</article-title>
          .
          <source>Business &amp; Information Systems Engineering</source>
          <volume>59</volume>
          (
          <issue>1</issue>
          ),
          <volume>41</volume>
          –67 (Feb
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Figl</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendling</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strembeck</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The influence of notational deficiencies on process model comprehension</article-title>
          .
          <source>Journal of the Association for Information Systems</source>
          <volume>14</volume>
          (
          <issue>6</issue>
          ),
          <volume>312</volume>
          –
          <fpage>338</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Goedertier</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanthienen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baesens</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Robust process discovery with artificial negative events</article-title>
          .
          <source>Journal of Machine Learning Research 10(Jun)</source>
          ,
          <volume>1305</volume>
          –
          <fpage>1340</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Johannesson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perjons</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          : An Introduction to Design Science. Springer (Oct
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Jouck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Depaire</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Generating Artificial Data for Empirical Analysis of Control-flow Discovery Algorithms: A Process Tree and Log Generator</article-title>
          .
          <source>Business &amp; Information Systems Engineering</source>
          pp.
          <volume>1</volume>
          –
          <issue>18</issue>
          (Mar
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kalenkova</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lomazova</surname>
            ,
            <given-names>I.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>V.A.</given-names>
          </string-name>
          :
          <article-title>Process mining using BPMN: relating event logs and process models</article-title>
          .
          <source>Software &amp; Systems Modeling</source>
          <volume>16</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1019</fpage>
          –
          <lpage>1048</lpage>
          (Oct
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kunze</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berger</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weske</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lohmann</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moser</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>BPM Academic Initiative - Fostering Empirical Research</article-title>
          .
          In:
          <source>BPM (Demos)</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>5</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Muñoz-Gama</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carmona</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A Fresh Look at Precision in Process Conformance</article-title>
          . In: Business Process Management. pp.
          <fpage>211</fpage>
          –
          <lpage>226</lpage>
          . Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (Sep
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>La Rosa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wohed</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendling</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ter Hofstede</surname>
            ,
            <given-names>A.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reijers</surname>
            ,
            <given-names>H.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.</given-names>
          </string-name>
          :
          <article-title>Managing process model complexity via abstract syntax modifications</article-title>
          .
          <source>IEEE Transactions on Industrial Informatics</source>
          <volume>7</volume>
          (
          <issue>4</issue>
          ),
          <fpage>614</fpage>
          –
          <lpage>629</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Lorenz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mauser</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juhás</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>How to synthesize nets from languages - a survey</article-title>
          .
          In:
          <source>2007 Winter Simulation Conference</source>
          . pp.
          <fpage>637</fpage>
          –
          <lpage>647</lpage>
          (Dec
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lourenço</surname>
            ,
            <given-names>H.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>O.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stützle</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Iterated local search</article-title>
          . In: Handbook of metaheuristics, pp.
          <fpage>320</fpage>
          –
          <lpage>353</lpage>
          . Springer (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Shneiderman</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plaisant</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Strategies for evaluating information visualization tools: multi-dimensional in-depth long-term case studies</article-title>
          .
          In:
          <source>Proceedings of the 2006 AVI workshop on BEyond time and errors: novel evaluation methods for information visualization</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>7</lpage>
          . ACM
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Tukey</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          :
          <source>Exploratory data analysis</source>
          , vol.
          <volume>2</volume>
          . Reading, Massachusetts (
          <year>1977</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Weijters</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medeiros</surname>
            ,
            <given-names>A.K.A.</given-names>
          </string-name>
          :
          <article-title>Process Mining with the Heuristics Miner-algorithm</article-title>
          , vol.
          <volume>166</volume>
          (Jan
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>