<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A New Process Discovery Algorithm for Exploratory Data Analysis</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Jonas</forename><surname>Lieben</surname></persName>
							<email>jonas.lieben@uhasselt.be</email>
							<affiliation key="aff0">
								<orgName type="institution">Hasselt University</orgName>
								<address>
									<addrLine>Martelarenlaan 42</addrLine>
									<postCode>3500</postCode>
									<settlement>Diepenbeek</settlement>
									<country key="BE">Belgium</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Benoît</forename><surname>Depaire</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">FWO</orgName>
								<address>
									<addrLine>Egmontstraat 5</addrLine>
									<postCode>1000</postCode>
									<settlement>Brussels</settlement>
									<country key="BE">Belgium</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mieke</forename><surname>Jans</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">FWO</orgName>
								<address>
									<addrLine>Egmontstraat 5</addrLine>
									<postCode>1000</postCode>
									<settlement>Brussels</settlement>
									<country key="BE">Belgium</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A New Process Discovery Algorithm for Exploratory Data Analysis</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4028A293B06D8B5696CB52B7F4E5F854</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T12:06+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Process</term>
					<term>Exploratory data analysis</term>
					<term>Comprehensibility</term>
					<term>Precision</term>
					<term>Process discovery algorithm</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The domain of process mining created many discovery techniques which can be used to generate a process representation of the data. However, existing techniques come with a flaw for exploratory data analysis (EDA). They tend to produce process models which contain more process behaviour than is observed in the data and do not optimize for understandability. This severely limits their value for EDA, because only patterns which can be observed from the data should be distilled when performing an EDA. We explain why this limitation is important and give a methodology to overcome this. This methodology describes how a discovery algorithm can be developed that is suitable for EDA.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Research Context</head><p>In recent years, companies have increasingly been collecting and storing event data. This type of data describes the occurrences of events during the execution of a business process. Originally, its main source was the IT systems supporting business operations. More recently, the Internet of Things, with all its sensors measuring changes in the environment, has become an important new source of event data.</p><p>The analysis of event data and the underlying process belongs to the domain of process mining. Within process mining, three broad categories exist: process discovery, conformance checking and process enhancement <ref type="bibr" target="#b0">[1]</ref>. This project fits in the subdomain of process discovery. The goal of process discovery is to create a (visual) model representing the process based on event data. These models are typically visualised in a graph-based language such as Petri nets or BPMN.</p><p>Models learned from data serve multiple purposes. In this project we focus on the purpose of describing and summarising event data and revealing interesting patterns within this data. Such models are used for exploratory research. Exploratory data analysis (EDA) is an important preliminary to confirmatory and predictive analysis. John Tukey stated: "Exploratory data analysis can never be the whole story, but nothing else can serve as the foundation stone as the first step" <ref type="bibr" target="#b16">[17]</ref>. When presented with a large event log, EDA provides a good understanding of the data at hand, which is essential for a useful further analysis of the underlying process.</p><p>Researchers from the domain of process mining have proposed many discovery techniques which can be used to create a process representation of event data. However, these techniques come with a limitation for EDA. 
They tend to generate process models which contain more process behaviour than is observed in the data, because they were developed with the rationale that an event log is incomplete. Therefore, they produce models which generalise the behaviour in the event log to represent all possible behaviour.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows a simple example of a process model which was discovered from event data. A process can be executed multiple times and each execution is referred to as a case. The sequence of the events during the process execution is called a trace. Two cases share the same trace if the events occur in the same order. The left side of Figure <ref type="figure" target="#fig_0">1</ref> shows an example of an event log containing several traces.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows the BPMN representation of the model discovered by the Evolutionary Tree Miner <ref type="bibr" target="#b2">[3]</ref> from the traces on the left side. Other miners discover similar models. While the model is a concise representation of the event data, it is not perfectly precise. In process mining, a perfectly precise model only contains the observed process behaviour. The model in Figure <ref type="figure" target="#fig_0">1</ref> is not perfectly precise, because it allows the execution of the unobserved traces ACDEFG and ABDEGF. To our knowledge, few process discovery techniques guarantee a perfectly precise model as outcome. We consider this an important research gap for EDA, as caution is needed when visualising patterns which are not completely present within the data. Such patterns might lead the researcher to wrong conclusions. Based on the model in Figure <ref type="figure" target="#fig_0">1</ref>, the researcher might conclude that E, F and G can occur in any order. However, the data does not support this conclusion. Close inspection of the data reveals, for example, that G never occurs between E and F. 
Therefore, a good exploratory model may hint at patterns that are not fully supported by the data, but should never present them as facts.</p><p>Mining a perfectly precise model is not difficult. The trace model, which consists of a single exclusive choice where each possible path represents a trace from the event log, is always perfectly precise. Figure <ref type="figure" target="#fig_1">2</ref> shows the trace model for the traces in Figure <ref type="figure" target="#fig_0">1</ref>. Although the trace model is perfectly precise, it performs poorly in some aspects of comprehensibility, because it is difficult to identify patterns of choice and concurrency. Exploratory models should not only be perfectly precise, but also be optimised for comprehensibility. Figure <ref type="figure">3</ref>, for example, shows a perfectly precise model for the traces in Figure <ref type="figure" target="#fig_0">1</ref> that is more comprehensible than the trace model. The main difference between both models is the number of duplicate tasks. The relation between duplicate tasks and comprehensibility is complex and non-linear. Too many duplicate tasks hide patterns (cf. Figure <ref type="figure" target="#fig_1">2</ref>) and decrease the comprehensibility. However, a certain number of duplicate tasks also adds structure to the process and reduces clutter <ref type="bibr" target="#b12">[13]</ref>, which increases comprehensibility.</p><p>This leads to the second research gap which we will address in this project. According to a recent literature review <ref type="bibr" target="#b4">[5]</ref>, none of the existing metrics measuring process model comprehension account for the influence of duplicate tasks on comprehensibility. This is because these metrics implicitly assume that process models have unique labels.</p><p>This research project is important because the current discovery algorithms produce imprecise models, which limits their value for EDA. 
As EDA is an important first step for any data analysis project, having an algorithm which produces comprehensible and precise models will make it significantly easier to identify interesting patterns and ideas for follow-up (confirmatory/predictive) analysis. Our research is unconventional in the sense that the guiding principles for our discovery algorithm will be precision and comprehensibility, whereas current techniques rely on the assumption that the event log is incomplete.</p><figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 3.</head><label>3</label><figDesc>Fig. 3. A perfectly precise model with limited duplicate activities</figDesc></figure></div>
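<div xmlns="http://www.tei-c.org/ns/1.0"><p>The notion of perfect precision used in this section can be illustrated with a minimal sketch. It assumes that the set of traces a model allows (its language) is finite and can be enumerated. The function names, the toy log and the toy model languages are hypothetical and are not the data of Figure 1.</p>
<p><code lang="python"><![CDATA[
```python
from typing import Iterable, Set, Tuple

Trace = Tuple[str, ...]


def unobserved_behaviour(model_language: Iterable[Trace],
                         log: Iterable[Trace]) -> Set[Trace]:
    """Return the traces the model allows but the log never contains."""
    return set(model_language) - set(log)


def is_perfectly_precise(model_language: Iterable[Trace],
                         log: Iterable[Trace]) -> bool:
    """A model is perfectly precise if it allows only observed traces."""
    return not unobserved_behaviour(model_language, log)


# Hypothetical toy log: three cases sharing two distinct traces.
toy_log = [("A", "B", "C"), ("A", "C", "B"), ("A", "B", "C")]

# The trace model allows exactly the distinct traces of the log.
trace_model_language = set(toy_log)

# A hypothetical generalising model that also allows an unseen ordering.
generalising_language = trace_model_language | {("B", "A", "C")}
```
]]></code></p>
<p>Under these assumptions the trace model passes the check by construction, while the generalising model fails it because of the single unobserved trace it allows.</p></div>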
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Research Objectives</head><p>The overall research goal is to develop a process discovery algorithm for exploratory data analysis which generates a perfectly precise and comprehensible process visualization of the event data. To achieve this overall research goal, three research objectives need to be achieved:</p><p>Firstly, a discovery algorithm needs to be developed. This algorithm should be able to generate models that are perfectly precise, are optimised for comprehensibility and represent a chosen number of traces from the event log. In addition to generating perfectly precise models, the algorithm must meet the following requirements to be of value during exploratory research:</p><list><item>Generate comprehensible models: the purpose of EDA is to get a good understanding of the data and to easily recognise interesting patterns. Optimization of comprehensibility must directly guide the algorithm.</item><item>Generate models representing a certain number of traces: traditional discovery algorithms focus on learning a model for the entire event log, which often results in overly complex models. During EDA, the researcher is not always interested in a single model representing all traces. Therefore, the data analyst should be able to set the number of traces which should be represented by the model. The algorithm should select the set of traces which results in the most comprehensible perfectly precise model.</item><item>Allow for different comprehensibility measures: the algorithm should be flexible enough such that a different measure can be used without changes to the algorithm.</item><item>Be extensible: part of this project will focus on how to optimally visualise certain aspects of a process. These insights will be incorporated into the algorithm. The mechanism to do so must be flexible enough such that future insights can be easily incorporated.</item><item>Use BPMN as the graphical notation: empirical research has shown that the BPMN notation appears to be the strongest in providing for a good understanding by model readers <ref type="bibr" target="#b5">[6]</ref>.</item></list><p>Secondly, more comprehensible visualizations for partial parallelism and long-term dependencies need to be designed. Partial parallelism occurs when a set of activities seems to happen in parallel, but not all possible combinations are observed in the event log. Long-term dependencies are observed when an exclusive choice is partially determined by the occurrence or non-occurrence of previous activities in the trace. Both constructs are present in the data of Figure <ref type="figure" target="#fig_0">1</ref>. Activities E, F and G are only partially in parallel since we never observe a trace with G occurring between E and F. The exclusive choice after activity D is limited to activity H when activity C occurred. Both constructs have a tendency to make a model less comprehensible. The goal of this research objective is to search for alternative visualizations that improve comprehensibility.</p><p>Thirdly, an empirically validated comprehensibility measure needs to be developed. This measure should be applicable to the models generated by our algorithm. Our algorithm uses a comprehensibility measure as its guiding mechanism. This implies that the measure also needs to account for the comprehensibility cost of duplicate tasks and the different visualizations developed as part of research objective two.</p></div>
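<div xmlns="http://www.tei-c.org/ns/1.0"><p>Partial parallelism, as defined above, could be detected along the following lines. This is only a sketch under stated assumptions: the function names are hypothetical, only traces in which every activity of interest occurs exactly once are considered, and the toy log is invented so that G never occurs between E and F. It is not the event log of Figure 1.</p>
<p><code lang="python"><![CDATA[
```python
from itertools import permutations
from typing import Iterable, Set, Tuple

Trace = Tuple[str, ...]


def observed_orderings(log: Iterable[Trace], activities: Set[str]) -> Set[Trace]:
    """Project every trace onto `activities` and collect the distinct orders.

    Only traces in which each activity occurs exactly once are considered.
    """
    orders = set()
    for trace in log:
        projection = tuple(a for a in trace if a in activities)
        if sorted(projection) == sorted(activities):
            orders.add(projection)
    return orders


def missing_orderings(log: Iterable[Trace], activities: Set[str]) -> Set[Trace]:
    """Return the interleavings of `activities` that the log never shows."""
    all_orders = set(permutations(sorted(activities)))
    return all_orders - observed_orderings(log, activities)


# Hypothetical log in which G never occurs between E and F.
toy_log = [
    ("A", "E", "F", "G"),
    ("A", "F", "E", "G"),
    ("A", "G", "E", "F"),
    ("A", "G", "F", "E"),
]
missing = missing_orderings(toy_log, {"E", "F", "G"})
```
]]></code></p>
<p>A non-empty result together with more than one observed ordering indicates that the activities are only partially in parallel. For the toy log the two missing orderings are exactly those with G between E and F.</p></div>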
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">State of the Art</head><p>In the domain of process mining, many algorithms have been developed to discover the control-flow of process models. To our knowledge, none of the existing algorithms creates perfectly precise models while optimizing for understandability. Most algorithms assume a notion of completeness that is less stringent than global completeness. The notion of global completeness implies that all possible behaviour of the process is included in the log <ref type="bibr" target="#b3">[4]</ref>. As the creators of most existing algorithms assumed that the log is incomplete, their process models include patterns which are not present in the log.</p><p>Existing discovery algorithms can be divided into five categories <ref type="bibr" target="#b3">[4]</ref>. The first category consists of the abstraction-based algorithms. It includes one of the best-known discovery algorithms, the α-algorithm <ref type="bibr" target="#b1">[2]</ref>, and its derivatives. The second category, the heuristic-based algorithms, contains only the heuristics miner <ref type="bibr" target="#b17">[18]</ref>, which takes the presence of noise into account. The third category consists of the search-based algorithms, which use metaheuristics to infer a process model. It contains the Evolutionary Tree Miner, which is based on genetic algorithms <ref type="bibr" target="#b2">[3]</ref>. Models created by existing algorithms of the first three categories are not perfectly precise, because one of the underlying assumptions is the incompleteness of the event log. The fourth category, the language-based region algorithms, uses the theory of regions to construct a process model <ref type="bibr" target="#b13">[14]</ref> and can generate perfectly precise models. However, these algorithms do not directly optimize for understandability. 
The last category contains the state discovery algorithms <ref type="bibr" target="#b3">[4]</ref>. These algorithms first construct a transition system and then derive a Petri net. Nevertheless, they do not directly optimise for understandability either.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Research Methodology</head><p>The general methodology for this project follows the principles of design science research (DSR). DSR deals with the creation of artifacts and scientific knowledge about these artifacts with the goal of providing solutions to a class of problems <ref type="bibr" target="#b7">[8]</ref>. A typical DSR project consists of five steps: problem identification, requirement specification, artifact design and development, artifact evaluation and result communication. The problem identification step has largely been done during the preliminary study in preparation for this research paper. For each research objective of Section 2, a methodology is described below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">The Creation of the Comprehensibility Measure</head><p>We start with the development of the empirically validated comprehensibility measure, because the new discovery algorithm can only be created using the measure. The first step in the development of this measure is a literature review to identify different aspects of a process model which have been empirically proven to influence comprehensibility. Through this literature review, we are able to gather the requirements for the measure. Therefore, this step is the requirement specification.</p><p>During the artifact design and development phase, we develop an algorithm which quantifies the presence of these aspects within the process model. This part of the study has already been carried out. In total, 23 existing metrics were identified and implemented using the programming language R. The results are published in the form of an R package on CRAN<ref type="foot" target="#foot_0">1</ref> and will be submitted to the journal SoftwareX. At the moment, there are no other software packages that can calculate all implemented metrics for a batch of BPMN models. In addition, an exploratory factor analysis was performed. This factor analysis makes it possible to discover the underlying dimensions of the large number of metrics. The sample of models used for the factor analysis consisted of BPMN models from the BPM AI (Business Process Management Academic Initiative) <ref type="bibr" target="#b10">[11]</ref> and models generated by the PTandLogGenerator <ref type="bibr" target="#b8">[9]</ref>. The results of this factor analysis are published as conference proceedings and presented at the EOMAS (Enterprise &amp; Organizational Modeling and Simulation) workshop, which takes place in conjunction with CAISE 2018.</p><p>Next, we will conduct an experiment to determine the impact of each dimension on comprehensibility. 
Participants will receive process models and a set of questions to test their understanding of the models. We apply a within-subjects design to control for the effects on comprehensibility related to the model reader. The dependent variables will be an objective comprehension accuracy measure such as percentage of correct answers, a time-taken measure and a subjective comprehension difficulty measure. The independent variables in this study will be the quantifications of the different model aspects within the models. After running the experiment, we will apply a multi-level regression analysis on the collected data to determine the impact of each factor on comprehensibility. The parameter estimates will become the empirically-validated weights for our new comprehension measure. The artifact will afterwards be evaluated and demonstrated and the results will be communicated as a journal paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">The Development of the Discovery Algorithm</head><p>Once the comprehensibility measure has been created, the discovery algorithm can be developed. Two versions of the discovery algorithm will be developed: one which generates a model representing all traces and one which generates a model representing a predefined minimum number of traces. To design and develop the algorithms, we will transform the discovery problem into an optimization problem. This approach has been applied before in <ref type="bibr" target="#b2">[3]</ref>, where genetic algorithms were used as the search strategy. However, genetic algorithms are less suited for our problem since it would be difficult to define mutation and crossover operators that are guaranteed to result in perfectly precise models.</p><p>Our approach is inspired by Iterated Local Search <ref type="bibr" target="#b14">[15]</ref>, which has not been applied before in this context. Our algorithm will use the trace model as initial solution and apply domain-specific local search operators (LSOs) to modify the model by transforming duplicate tasks into more complex process structures. For the first version, we will develop at least two LSOs: one for parallel constructs and one for exclusive choice constructs. Other LSOs may be defined later. Each LSO should guarantee perfect precision after transformation. The LSOs are the mechanism that makes the algorithm extensible.</p><p>For the second version of the algorithm, two new operators will be created: a trace removal and a trace imputation operator. The removal operator will remove parts of the process model that correspond to entire traces. The imputation operator will add entire traces from the event log to the model. 
Both operators should modify the model in such a way that perfect precision remains guaranteed.</p><p>To verify whether our models are perfectly precise and represent all traces, we will first use the Behavior Recall metric <ref type="bibr" target="#b6">[7]</ref> to check whether all traces are represented by the model. If so, we will use the ETC Precision metric <ref type="bibr" target="#b11">[12]</ref> to test whether the model is perfectly precise. Because both metrics require the process model to be represented as a Petri net, we will use the transformation algorithm in <ref type="bibr" target="#b9">[10]</ref>. For the BPMN constructs in our models, this algorithm guarantees bisimilarity. There is no other algorithm against which the comprehensibility of our models can be compared. Therefore, we are limited to a descriptive analysis of the algorithm's performance in terms of comprehensibility. We will analyze the improvements with respect to the trace model and apply a sensitivity analysis to see which aspects influence the algorithm's ability to improve comprehensibility. For evaluation we will use a broad set of event logs, both real and artificial. The real data will be taken from the collection made available by the IEEE Task Force on Process Mining. The artificial data sets will be created using <ref type="bibr" target="#b8">[9]</ref>. The results will be made available in an R package on CRAN and in scientific papers.</p></div>
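<div xmlns="http://www.tei-c.org/ns/1.0"><p>The search strategy described in this section can be sketched as a greedy best-improvement local search. This is a simplification and not the planned algorithm: it omits the perturbation step of Iterated Local Search, and the names <code>local_search</code>, <code>Operator</code> and <code>merge_one_duplicate</code> are hypothetical. Each operator is assumed to return only perfectly precise candidate models, and the cost function stands in for a comprehensibility measure where lower values mean a more comprehensible model.</p>
<p><code lang="python"><![CDATA[
```python
from typing import Callable, Iterable, List, TypeVar

Model = TypeVar("Model")
# An operator proposes candidate models; each candidate is assumed to be
# perfectly precise, mirroring the requirement placed on every LSO.
Operator = Callable[[Model], Iterable[Model]]


def local_search(initial: Model,
                 operators: List[Operator],
                 cost: Callable[[Model], float]) -> Model:
    """Greedy best-improvement search over operator-generated neighbours."""
    current, current_cost = initial, cost(initial)
    while True:
        candidates = [m for op in operators for m in op(current)]
        if not candidates:
            return current  # no operator is applicable any more
        best = min(candidates, key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:
            return current  # no candidate lowers the comprehensibility cost
        current, current_cost = best, best_cost


# Toy stand-in: a "model" is represented only by its number of duplicate tasks.
def merge_one_duplicate(n_duplicates: int) -> List[int]:
    # Hypothetical operator: merging is possible while more than 2 remain.
    return [n_duplicates - 1] if n_duplicates > 2 else []


result = local_search(5, [merge_one_duplicate], cost=float)
```
]]></code></p>
<p>In this sketch the initial solution plays the role of the trace model. New operators can be added to the list without changing the search loop, which reflects the extensibility requirement, and a different comprehensibility measure can be used by passing another cost function.</p></div>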
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">The Design of Alternative Visualizations for Partial Parallelism and Long-term Dependencies</head><p>We aim to create more comprehensible visualizations for partial parallelism and long-term dependencies. To determine the requirements of these alternative visualizations, we will apply a multi-dimensional long-term case study with expert users as suggested in <ref type="bibr" target="#b15">[16]</ref>. Since the purpose of the case study is to increase transferability to people who have the same needs, a sample of 3 to 5 expert users is appropriate <ref type="bibr" target="#b15">[16]</ref>. Experts will be data analysts, both from academia and industry. The case study is multi-dimensional because it combines different research methods such as interviews and observations. It is also long-term, because it involves a longitudinal study throughout the entire DSR cycle. These new visualizations need to be incorporated in the comprehensibility measure of Section 4.1 and in the discovery algorithm of Section 4.2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>Industry is becoming increasingly data-driven. Over the past decade, both the amount of data collected and the nature of the data have changed. This project focuses on event data, which describes how (business) processes are executed. The first step in retrieving insights from data is exploratory data analysis (EDA). Despite the many algorithms which discover process models from event data, none of them is really suited for EDA, for two reasons. Firstly, they tend to create models which contain behaviour that was not observed in the data. Secondly, almost none of the existing algorithms optimise their models in terms of comprehensibility, although this is necessary to easily recognise interesting patterns.</p><p>This project contributes to both process mining and data analytics. It creates a discovery algorithm suitable for EDA. The resulting models only represent the observed behaviour and are optimised for comprehensibility. Further contributions of this project are a first comprehensibility measure which takes duplicate tasks into account and alternative visualizations for partial parallelism and long-term dependencies.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. An imprecise model representing a set of traces</figDesc><graphic coords="2,134.77,402.47,359.70,122.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. A trace model for the traces</figDesc><graphic coords="3,134.77,188.66,335.10,213.90" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="4,134.77,116.83,360.30,162.60" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://cran.r-project.org/web/packages/understandBPMN/index.html</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>I would like to thank FWO for my PhD scholarship.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Process Mining: Discovery, Conformance and Enhancement of Business Processes</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Berlin Heidelberg</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Workflow mining: Discovering process models from event logs</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">V D</forename><surname>Aalst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Weijters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Maruster</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. on Knowl. and Data Eng</title>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A genetic algorithm for discovering process trees</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C A M</forename><surname>Buijs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">F V</forename><surname>Dongen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P V</forename><surname>Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Congress on Evolutionary Computation</title>
				<imprint>
			<date type="published" when="2012-06">Jun 2012</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Process Mining: Overview and Outlook of Petri Net Discovery Algorithms</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">F V</forename><surname>Dongen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K A D</forename><surname>Medeiros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Transactions on Petri Nets and Other Models of Concurrency II</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="225" to="242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Comprehension of Procedural Visual Business Process Models: A Literature Review</title>
		<author>
			<persName><forename type="first">K</forename><surname>Figl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Business &amp; Information Systems Engineering</title>
		<imprint>
			<biblScope unit="volume">59</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="41" to="67" />
			<date type="published" when="2017-02">Feb 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The influence of notational deficiencies on process model comprehension</title>
		<author>
			<persName><forename type="first">K</forename><surname>Figl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mendling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Strembeck</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Association for Information Systems</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="312" to="338" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Robust process discovery with artificial negative events</title>
		<author>
			<persName><forename type="first">S</forename><surname>Goedertier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Martens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vanthienen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Baesens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="1305" to="1340" />
			<date type="published" when="2009-06">Jun. 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">An Introduction to Design Science</title>
		<author>
			<persName><forename type="first">P</forename><surname>Johannesson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Perjons</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014-10">Oct 2014</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Generating Artificial Data for Empirical Analysis of Control-flow Discovery Algorithms: A Process Tree and Log Generator</title>
		<author>
			<persName><forename type="first">T</forename><surname>Jouck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Depaire</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Business &amp; Information Systems Engineering</title>
		<imprint>
			<biblScope unit="page" from="1" to="18" />
			<date type="published" when="2018-03">Mar 2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Process mining using BPMN: relating event logs and process models</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Kalenkova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">A</forename><surname>Lomazova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Rubin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Software &amp; Systems Modeling</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="1019" to="1048" />
			<date type="published" when="2017-10">Oct 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">BPM Academic Initiative - Fostering Empirical Research</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kunze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Berger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Weske</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lohmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Moser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">BPM (Demos)</title>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="1" to="5" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A Fresh Look at Precision in Process Conformance</title>
		<author>
			<persName><forename type="first">J</forename><surname>Muñoz-Gama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Carmona</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Business Process Management</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010-09">Sep 2010</date>
			<biblScope unit="page" from="211" to="226" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Managing process model complexity via abstract syntax modifications</title>
		<author>
			<persName><forename type="first">M</forename><surname>La Rosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wohed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mendling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">H</forename><surname>Ter Hofstede</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Reijers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Industrial Informatics</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="614" to="629" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">How to synthesize nets from languages - a survey</title>
		<author>
			<persName><forename type="first">R</forename><surname>Lorenz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mauser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Juhas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Winter Simulation Conference</title>
		<imprint>
			<date type="published" when="2007-12">Dec 2007</date>
			<biblScope unit="page" from="637" to="647" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Iterated local search</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">R</forename><surname>Lourenço</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">C</forename><surname>Martin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Stützle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Handbook of metaheuristics</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="320" to="353" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Strategies for evaluating information visualization tools: multi-dimensional in-depth long-term case studies</title>
		<author>
			<persName><forename type="first">B</forename><surname>Shneiderman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Plaisant</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2006 AVI workshop on BEyond time and errors: novel evaluation methods for information visualization</title>
		<meeting>the 2006 AVI workshop on BEyond time and errors: novel evaluation methods for information visualization</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Tukey</surname></persName>
		</author>
		<title level="m">Exploratory data analysis</title>
		<meeting><address><addrLine>Reading, Massachusetts</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1977">1977</date>
			<biblScope unit="volume">2</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Process Mining with the Heuristics Miner-algorithm</title>
		<author>
			<persName><forename type="first">A</forename><surname>Weijters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Medeiros</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="volume">166</biblScope>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
