<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analyzing Business Process Changes Using Influence Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Teemu Lehto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Markku Hinkka</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaakko Hollmen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Aalto University, School of Science, Department of Computer Science</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>QPR Software Plc</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <fpage>32</fpage>
      <lpage>46</lpage>
      <abstract>
        <p>Real world business operations are continuously changing. Periodic business performance review sessions typically focus on monitoring changes in key performance indicator (KPI) measures. However, the detection and review of activity level changes in actual business processes is often based on subjective manual observations. This means that many changes are not detected in a timely manner, making the organization slower to adapt to changes. In this paper we present a systematic method for detecting business process changes for business review purposes based on transaction level data. Our method uses process mining principles and is based on our previously published influence analysis methodology. Unlike most process mining change detection algorithms, which operate on the case level, our method analyzes changes on the individual event level. We show how case level data can be used to construct features on the event level. Our method detects changes in a timely manner since there is no need to wait for the cases to be completed. We present two alternative ways, a binary approach and a continuous event-age approach, for dividing events into recent and old for business review purposes. We also demonstrate the method with data from a real-life case.</p>
      </abstract>
      <kwd-group>
        <kwd>process analysis</kwd>
        <kwd>process improvement</kwd>
        <kwd>change detection</kwd>
        <kwd>concept drift</kwd>
        <kwd>process mining</kwd>
        <kwd>performance management</kwd>
        <kwd>key performance indicator</kwd>
        <kwd>root cause analysis</kwd>
        <kwd>data mining</kwd>
        <kwd>influence analysis</kwd>
        <kwd>contribution</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The ability to detect changes is crucial for developing and improving agile
business operations. Unwanted changes need to be mitigated quickly and desired
changes need to be reinforced and shared as best practices. In this paper we
present a systematic approach for analyzing business process data using process
mining principles [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and specifically our previously published influence
analysis methodology [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Our methodology is capable of detecting business
process changes and describing for each change the important business relevant
attributes: what changed, how big the change is, what may be the cause of
the change and what might be the effect and outcome of the change.
      </p>
      <p>Our method is particularly useful for periodic business reviews,
which take place in most business organizations globally. During the
business review, managers typically review the performance of business operations
using Key Performance Indicators (KPIs). One problem is that managers
typically do not have an accurate fact-based understanding and analysis of what
has changed during the review period. Instead they typically rely on subjective
comments, views and suggestions biased by acute business challenges and crises.
Using our method in this situation, the managers easily see what has changed
during the review period by comparing the new process mining data against data
from previous business review periods. Our method discovers changes that
take place very fast as well as more gradual changes that occur over the course of
several years, giving managers accurate data about changes and trends.</p>
      <p>The rest of this paper is organized as follows: Section 2 introduces relevant
background in process mining, concept drift and business process management.
Section 3 presents our methodology for analyzing business process changes.
Section 4 shows a real-life example of using the methodology on the loan application
process, followed by a section for Discussions and Summary.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        This paper is based on our previously published influence analysis
methodology which shows how the root causes can be identified for generic process
related problems [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In this paper we discuss the concept drift
that occurs over time and show how influence analysis can be used to
discover changes. Data preparation for influence analysis is based on process mining
methodologies [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] where data from ERP systems is transformed into event log
format containing events with event attributes connected to cases with case
attributes.
      </p>
      <p>
        Handling concept drift in process mining has been discussed in detail in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. These papers establish that operational processes change and suggest
three main problems to be studied: detection of change points, characterization
of change and insight into the process evolution. This work is excellent for
understanding how complete process executions have changed. However, during the
business review situation the management is reviewing a fixed period of time and
trying to identify signals as early as possible hinting how the processes might
be changing at this very moment. In effect the change point is set to be the
beginning of the review period and the question is "show us the things that have
changed after the start of the review period as compared to the things that took
place before the review period". As [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] analyze complete cases, they
can be categorized as off-line analysis of changes. If cases take 6 months to
complete, the analysis results based on complete cases are at least 6 months old. In
this paper we present a method for on-line analysis of changes.
      </p>
      <p>
        Concept drift in relation to machine learning has been studied extensively, for
example in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The objective of those studies is to increase the accuracy
of predictions by utilizing machine learning algorithms that discover the changes
in the process. Instead of making accurate predictions our method is tailored to
discover and explain changes as part of the systematic periodical business review.
      </p>
      <p>
        A novel Trace Clustering algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] presents an approach for analyzing
attribute data from events and cases in addition to the traditional business
process data. The approach is based on the Markov cluster (MCL) algorithm [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
for finding similar cases. Although the results look promising, the challenge of
this approach is that it uses complete cases and is thus more useful for off-line
analysis than for periodical on-line analysis.
      </p>
      <p>
        An approach more targeted for on-line business process drift detection is
presented in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. It uses the concept of partial order runs to run statistical tests
in order to find the exact point in time for the change. A somewhat similar
method for concept-drift detection in event log streams has been studied in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
which presents a method for detecting the actual concept-drift time and individual
anomalies using histograms and clustering. However, these methods do not take
into account the attribute data and are not aimed at providing insight into the
business review question of what has changed during the current business review
period in comparison to the operations before.
      </p>
      <p>
        Since it is difficult to detect the changes by using only traditional statistical
measures, an interesting set of visual analytics tools enabling interactive process
analysis and process mining is presented in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. By plotting all events in a stacked
area graph with absolute calendar time on the horizontal axis, this paper presents
a visualization for detecting concept drift and changes in business process and
case attribute data. Even though the presented visual analytics are useful, they
do not clearly provide a concrete answer to what has changed during the past
review period. The presented visualization techniques also have challenges when the
number of case attributes is so large that all case attributes cannot be included
in the visualizations at the same time.
      </p>
      <p>
        Yet another approach to make business people aware of changes that require
active intervention is presented in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The method uses a sophisticated cost
model to optimize the generation of alarms for business people. The challenge with
this active intervention method is that it requires a lot of settings and detailed
knowledge about the importance of various process issues. These settings
must be defined beforehand so that the algorithm can then suggest active intervention
when needed. Our understanding from actual business operations is that this
kind of settings and detailed information is not available, or is very difficult and
expensive to maintain over time. On the other hand, practically all organizations
do have some kind of business reviews, so it is beneficial to present the discoveries
as part of the business review meetings.
      </p>
      <p>Summary of related work:
        <list list-type="bullet">
          <list-item><p>Process mining and concept drift have been studied extensively.</p></list-item>
          <list-item><p>Most of the presented studies can be categorized as off-line analysis. They detect and analyze changes based on completed cases.</p></list-item>
          <list-item><p>Limited tools exist for on-line analysis that could be used to compare a fixed business period (like the previous month, week, day, quarter or year) with past performance.</p></list-item>
          <list-item><p>Some methods are tailored for detecting process flow changes and some methods detect case attribute data changes. Our approach can detect both kinds of changes and compare them on a uniform scale for reporting purposes.</p></list-item>
          <list-item><p>Machine learning is mostly used for making predictions and less for supporting business review analysis.</p></list-item>
        </list>
      </p>
      <p>
        Our previously published influence analysis methodology shows how the root
causes can be identified for generic process related problems [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In this paper we
discuss the concept drift that occurs over time and show how influence
analysis can be used to discover changes. As shown in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] the influence analysis
can be used as binary analysis or continuous analysis. In this paper we show
both approaches and discuss the benefits and challenges of each approach.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Analyzing Business Process Changes</title>
      <p>The idea of this paper is to analyze the variance and deviations in business processes
on the individual transaction level in order to discover and explain changes that have
taken place. Using the influence analysis methodology our objective is to find
areas that have more variance compared to the average areas. Our method is
based on the idea that if there are no changes in the operations, then the data in
the ERP system for the review period is similar to the data for the previous periods.
On the other hand, if there are changes, then the data will be different than in the
past. In this section we refine and augment the previously presented influence
analysis steps.</p>
      <sec id="sec-3-1">
        <title>Identify the relevant business process and define the case</title>
        <p>Our approach detects changes from one business process at a time. A large
organization with multiple processes needs to run the analysis separately for
each business process to detect the changes in all business operations. Typically,
the business reviews are based on consolidated data, for example a dashboard
report can contain several Key Performance Indicators (KPIs). The ERP system
in a large organization can easily contain 1 billion new database level transactions
(i.e. database rows) per month. If the review is based on 10 KPIs with 100
consolidated drill-down measures each, then we could say that we use 1 000
out of 1 billion, i.e. 0.0001% of the available data for making findings in the business
review. However, if we set up 10 process mining models that contain an average
of 1 million transaction level events per business review period of 1 month, then
we use 10 000 000 out of 1 billion transactions, i.e. 1% of total data for supporting
the business review. In this example we would use 10 000 times more data for
supporting the business review compared to the previous situation with only the
KPI data. Based on these ideas we propose that organizations analyze as many
processes as possible and include as many events as possible in order to get a
wide view into the changes in business operations. We also suggest that the data
be prepared so that it covers end-to-end processes that are as long as possible, in order
to facilitate identifying root causes for the discovered process changes.</p>
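        <p>The data-volume comparison above can be checked with a few lines of arithmetic. The following sketch uses only the figures quoted in the text; the variable names are illustrative and not part of the methodology.</p>
        <preformat><![CDATA[
```python
# Figures quoted in the text above; variable names are illustrative only.
erp_rows_per_month = 1_000_000_000   # database level transactions per month

kpi_measures = 10 * 100              # 10 KPIs x 100 drill-down measures each
mined_events = 10 * 1_000_000        # 10 process mining models x 1 million events

kpi_share = kpi_measures / erp_rows_per_month
mining_share = mined_events / erp_rows_per_month
ratio = mined_events // kpi_measures

print(f"KPI-based review uses {kpi_share:.4%} of the data")
print(f"Event-level review uses {mining_share:.0%} of the data")
print(f"Event-level review uses {ratio} times more data")
```
]]></preformat>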
      </sec>
      <sec id="sec-3-2">
        <title>Collect event and case attribute information</title>
        <p>In this paper we propose the idea of collecting and analyzing the data on the event
level. In practice this means that data is not aggregated from individual events
to the case level; instead, the event level data is used as it is and case level
data is copied to each event.</p>
        <p>
          Since our goal is to create new insight for business people, we encourage the
use of all event and case attribute data that is available. Generation of
suitable log files with extended attributes is a well studied area [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. There also
exist methods for enriching and aggregating event logs to case logs [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. We
summarize a key method for constructing event and case logs for the event level
with the following steps:
          <list list-type="bullet">
            <list-item><p>The starting point is the set of events E. Each event is typically the result of one business transaction, for example a row in a relational database table whose rows correspond to transactions.</p></list-item>
            <list-item><p>Use the properties of each event e<sub>i</sub> in E as event attributes.</p></list-item>
            <list-item><p>Identify for each event e<sub>i</sub> a corresponding case c<sub>i</sub> and copy all case attributes as event attributes.</p></list-item>
            <list-item><p>Form an event path for each event e<sub>i</sub> by concatenating the event type names of events linked to the same case, sorted from oldest to newest. The event path can be expressed in many ways, for example as a single event attribute containing the full path or as several attributes containing single predecessor values.</p></list-item>
            <list-item><p>Identify for each event e<sub>i</sub> in E a set of objects O<sub>i</sub> such that every object o<sub>ij</sub> in O<sub>i</sub> is linked to e<sub>i</sub>. Use the properties of objects o<sub>ij</sub> as additional event attributes for events e<sub>i</sub>.</p></list-item>
            <list-item><p>Further augment every event e<sub>i</sub> by adding external events that have occurred at the same time. Examples of external events include machinebreak, weekend, strike, queuetoolong and badweather. Adding external events makes it possible to use this same approach for detecting changes in external circumstances as well.</p></list-item>
          </list>
        </p>
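        <p>The first four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary keys case_id, event_type and time, the case: prefix for copied case attributes and the function name build_event_rows are all hypothetical, and the path of an event is taken to contain the events of the same case up to and including that event. Linked objects and external events are left out for brevity.</p>
        <preformat><![CDATA[
```python
def build_event_rows(events, case_attributes):
    """Return one flat row per event: its own attributes, the copied case
    attributes, the immediate predecessor and the event path so far.

    `events` is a list of dicts with the hypothetical keys "case_id",
    "event_type" and "time"; `case_attributes` maps a case id to a dict.
    """
    paths = {}   # case id -> event type names seen so far, oldest first
    rows = []
    for event in sorted(events, key=lambda e: e["time"]):
        path = paths.setdefault(event["case_id"], [])
        row = dict(event)                                  # event attributes
        for name, value in case_attributes[event["case_id"]].items():
            row["case:" + name] = value                    # copied case attributes
        row["predecessor"] = path[-1] if path else None    # single predecessor value
        path.append(event["event_type"])
        row["path"] = " > ".join(path)                     # full path as one attribute
        rows.append(row)
    return rows


events = [
    {"case_id": "c1", "event_type": "Order", "time": 1},
    {"case_id": "c2", "event_type": "Order", "time": 2},
    {"case_id": "c1", "event_type": "Delivery", "time": 3},
]
cases = {"c1": {"Region": "Finland"}, "c2": {"Region": "Sweden"}}
rows = build_event_rows(events, cases)
print(rows[2])
```
]]></preformat>
        <p>Because every row carries its own copy of the case attributes and path, each event can later be assigned to a period and analyzed independently of whether its case has been completed.</p>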
      </sec>
      <sec id="sec-3-3">
        <title>Create new categorization dimensions</title>
        <p>In this paper we present new categorization dimensions specifically useful
for the event level analysis.</p>
        <p>The purpose of this step is to create new categorization dimensions for the
cases. All these dimensions will then be used for detecting the changes, so the
more dimensions we have the larger the coverage of our analysis will be. Table 1
shows examples of dimensions that can be created for every event log based on
the log itself.</p>
        <p>
          Categorization Dimensions form the basis for our Influence Analysis when
discovering the business process changes. Without any Categorization
Dimensions we could only discover that in the review period there are more, fewer
or an equal number of transactions compared to the comparison period. Having the
Event types dimension enables us to detect changes in the amounts of particular
event types, for example we could find out that there were more Ontime Delivery
events and fewer Customer Complaint events during the review
period as compared to the comparison period. The Case attributes dimension
in Table 1 is even more interesting since it allows us to detect changes in the
background data of active cases, for example in November there were more cases
from Region with value Finland compared to the previous 6 months. Case attribute
changes may be analyzed as specific to certain event types using the event type
name in the dimension identifier, or as global case attributes without the event
type name, or both. In a similar manner all the dimensions in Table 1 can
be added to the analysis. The total number of dimensions, i.e. feature vectors for
case analysis, can easily grow large if all the dimensions are taken into use. For
example with 30 event types, 50 case attributes and 10 event attributes the total
number of dimensions from Table 1 would be 1 + 50 + 1500 + 10 + 300 + 1 +
30 + 1 + 30 = 1 923. In order to handle this curse of dimensionality we suggest
three solutions in real life business review cases:
          <list list-type="order">
            <list-item><p>Use Influence Analysis as described in this paper, since it in effect only shows those dimensions where the changes are largest.</p></list-item>
            <list-item><p>Select only those dimensions that seem to be important for review purposes. The benefit of this is that business people are not overloaded with data that they cannot understand. The problem is that a dimension may at some point in time contain very useful information about process changes; if that dimension has been left out, the change is not reported to business people and may go unnoticed.</p></list-item>
            <list-item><p>Use advanced feature selection algorithms as presented in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].</p></list-item>
          </list>
        </p>
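        <p>The dimension count quoted above can be reproduced for other log sizes with a small helper. This is only a sketch: the meaning attached to each of the nine terms is an assumption inferred from the magnitudes in the example (30 event types, 50 case attributes, 10 event attributes), and the function name dimension_count is hypothetical.</p>
        <preformat><![CDATA[
```python
def dimension_count(event_types, case_attributes, event_attributes):
    """Number of categorization dimensions, following the nine-term sum in the text.

    The interpretation of each term is an assumption; only the numeric
    values for the 30 / 50 / 10 example come from the text.
    """
    terms = [
        1,                                # a single-dimension term
        case_attributes,                  # 50: one per case attribute
        event_types * case_attributes,    # 1500: case attribute per event type
        event_attributes,                 # 10: one per event attribute
        event_types * event_attributes,   # 300: event attribute per event type
        1,                                # a single-dimension term
        event_types,                      # 30: one per event type
        1,                                # a single-dimension term
        event_types,                      # 30: one per event type
    ]
    return sum(terms)


print(dimension_count(30, 50, 10))
```
]]></preformat>
        <p>The two product terms dominate the total, which is why restricting event-type-specific attribute dimensions is the most effective way to shrink the analysis.</p>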
      </sec>
      <sec id="sec-3-4">
        <title>Define data for review and comparison periods</title>
        <p>
          The original Influence Analysis that is presented in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] uses a binary classification
that specifies each case as either problematic or successful. In this paper
we alter this classification in three ways. First, instead of analyzing cases we run
the analysis on the transaction/event level. Second, instead of specifying cases as
problematic or successful we specify events as belonging to the review period or
as belonging to the comparison period. Third, in addition to the Binary
approach as presented in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] we also use an additional Continuous approach as
presented in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>Binary approach. Figure 1 shows how the analysis data is divided into four
different periods in order to identify process changes:
          <list list-type="bullet">
            <list-item><p>Review period (c). All events occurring during this period are taken into consideration when discovering changes. If these events, their quantities and event attributes are similar to the comparison period events, then there are no big changes. In real life something is always changing, so our target is to detect the most important changes. An example review period could be one month, like November.</p></list-item>
            <list-item><p>Comparison period (b). All events occurring during this period are also taken into consideration when discovering changes. A typical setup would be to use the 6 months prior to the review period as the comparison period, so as an example the comparison period could be May to October. If a business is very seasonal, then one option is to use a year-to-year comparison so that the comparison period is the same month last year.</p></list-item>
            <list-item><p>History period (a). The events that occurred during the History period are not used as separate events for the review or comparison sets. However, these events are used for constructing the business process path (trace) for each event in both review and comparison periods. For example, the review period and comparison period both contain OntimeDeliveryFailed events. In order to understand the root causes for these failures we want to include a full process path for each OntimeDeliveryFailed event so that we see the difference in how cases end up in the OntimeDeliveryFailed process step. For this reason we need to use the predecessor events from the History period as well when constructing this path for review and comparison period events.</p></list-item>
            <list-item><p>Most Recent Data period (d). All events occurring after the review period are excluded completely from the analysis. As an example, the typical business review for November is done in early December when the data from November is complete. We do not want to use the recent data from December as it becomes available, because that data will be analyzed in the next month's business review. Naturally it is possible to set up the review period as the last 30 days, so that all recent data from the last 30 days is regarded as the review data and the Most Recent Data period is then empty.</p></list-item>
          </list>
        </p>
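        <p>The division into the four periods can be sketched as a simple timestamp classifier. The function name assign_period, the period labels and the half-open boundaries (a period includes its start and excludes its end) are assumptions made for this example; the year 2016 is chosen arbitrarily to illustrate the November review with a May to October comparison period.</p>
        <preformat><![CDATA[
```python
from datetime import datetime


def assign_period(timestamp, comparison_start, review_start, review_end):
    """Classify an event timestamp into one of the four periods (a)-(d).

    Boundaries are half-open: each period includes its start and excludes
    its end. This convention is an assumption of the sketch.
    """
    if timestamp >= review_end:
        return "most_recent"   # (d) excluded from the analysis
    if timestamp >= review_start:
        return "review"        # (c)
    if timestamp >= comparison_start:
        return "comparison"    # (b)
    return "history"           # (a) used only for building process paths


# Review period November, comparison period May to October.
bounds = (datetime(2016, 5, 1), datetime(2016, 11, 1), datetime(2016, 12, 1))
for day in (datetime(2016, 3, 9), datetime(2016, 7, 4),
            datetime(2016, 11, 15), datetime(2016, 12, 2)):
    print(day.date(), assign_period(day, *bounds))
```
]]></preformat>
        <p>Only events labeled review or comparison enter the contribution calculations; history events are kept solely so that paths and predecessors of later events are complete.</p>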
        <p>The benefit of using a binary approach is that it is typically easy to use for
business people who have prior knowledge about the operations for both the review
period and the comparison period. Since the history period has no weight in the
analysis, they can fully ignore exceptional transactions, projects and cases that
have been completed during the history period. The binary approach also guarantees that
all discovered changes have indeed taken place exactly during the well-defined
review period.</p>
        <p>
          Continuous approach. Another approach for defining the review period and
comparison period is to use a continuous measure to determine to which period any
particular event belongs. Benefits of using the Continuous approach include: 1. If the
analysis is done by an analyst as part of a one-time process analysis, then there
is no continuous review process and it is easier to just let the system
divide events into review and comparison periods. 2. The Continuous approach gives
the analyst more freedom for setting weights for individual events, for example
events occurring during the past 6 months may have a certain weight and events
occurring 6-12 months ago could have an even higher weight. One straightforward
way to split events into Review and Comparison periods could be to use half of
the available data for the History period. This ensures that most of the events have
proper history, so that we do not discover changes that merely result from predecessor
events not being included in the dataset. The other half could then be
split into 50% Comparison period and 50% Review period. This gives
an even split, so that when nothing has changed each dimension value should
occur equally often in the Comparison and Review data. If we want
to specifically detect changes that have occurred over time, then we could
consider giving the oldest and newest events more weight than the events that
take place when the Comparison period ends and the Review period starts, since we
are not really reviewing a specific calendar month in business review style. One
way to achieve this is to calculate an Age attribute for each event. Age is
equal to the elapsed time between the actual time of the event and the current
time. We then use the Age attribute as the lead time measure for the continuous
contribution formulas as defined in Table 8 in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. In practice the continuous
approach using Age gives the largest weight to the events that take place at
the beginning of the comparison period and at the end of the review period.
Events that take place in the middle have a very small weight, so the analysis tells
how changes have taken place during the whole period. This is a particularly good
approach for analyzing small gradual changes that occur over a longer period of
time.
        </p>
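        <p>The Age attribute and the resulting weighting can be illustrated with a short sketch. It assumes that the weight of an event is its absolute distance from the average Age, which matches the description that events in the middle of the period receive very small weight; the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
from datetime import datetime, timedelta


def event_ages_in_days(timestamps, now):
    """Age of each event: elapsed time between the event and the current time."""
    return [(now - ts).total_seconds() / 86400 for ts in timestamps]


def age_weights(ages):
    """Weight of each event, assumed to be its distance from the average Age."""
    mean_age = sum(ages) / len(ages)
    return [abs(age - mean_age) for age in ages]


now = datetime(2016, 12, 1)
timestamps = [now - timedelta(days=d) for d in (0, 50, 100)]
ages = event_ages_in_days(timestamps, now)
weights = age_weights(ages)
print(ages, weights)
```
]]></preformat>
        <p>In this toy example the newest and the oldest event both receive the largest weight, while the event in the middle of the period receives none.</p>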
      </sec>
      <sec id="sec-3-5">
        <title>Detecting changes using Influence Analysis</title>
        <p>
          The Influence Analysis has been presented in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] for analyzing root
causes for process mining cases. In this paper we use the same formulas with
the exception that instead of using process mining cases we do the analysis on the
process mining event level. Another change is that the concept of comparing
problematic cases with normal cases is replaced with comparing
review period data with comparison period data in order to detect changes that
have occurred in the time dimension. To reflect these changes the Influence
Analysis definitions and equations are presented as follows.
        </p>
        <p>Definition 1. Let <inline-formula><tex-math>E = \{e_1, \ldots, e_N\}</tex-math></inline-formula> be a set of events in the process analysis.
Each event represents a single transaction that happened at a particular time
and is related to a single business process instance.</p>
        <p>Definition 2. Let <inline-formula><tex-math>E_a = \{e_{a_1}, \ldots, e_{a_N}\}</tex-math></inline-formula> be a set of events sharing the same
characteristics as defined in segment A, with <inline-formula><tex-math>E_a \subseteq E</tex-math></inline-formula>. These characteristics are derived
from different values for the Categorization Dimensions.</p>
        <p>Definition 3. Let <inline-formula><tex-math>E_p = \{e_{p_1}, \ldots, e_{p_N}\}</tex-math></inline-formula> be a set of Review Period events, with <inline-formula><tex-math>E_p \subseteq E</tex-math></inline-formula>,
to be used in the Binary Approach.</p>
        <p>Definition 4. Let <inline-formula><tex-math>d_{e_j}</tex-math></inline-formula> be the age of the event <inline-formula><tex-math>e_j</tex-math></inline-formula>, to be used in the Continuous
Approach.</p>
        <p>
          Definition 5. Let pr be the problem size in the original situation before any
business process improvement. According to the terminology in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] for Binary
Contribution (BiCo) the problem size is equal to the total number of events in the
review period, and for Continuous Contribution (CoCo) it is the sum of distances
between Age and average Age for each event separately. The practical meaning
of problem size is to define the number of events that belong to the review period.
        </p>
        <p>Binary Change Window. In Binary Change Window analysis each event is
either included in the set of Review Period events or the set of Comparison
Period events. Note that events belonging to the History period and the Most Recent
Data period have already been excluded from the analysis.
        </p>
        <p>
          Converting the formulas from Table 8 in [
          <xref ref-type="bibr" rid="ref12">12</xref>
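          ] from cases to events we get the following. The total problem size for BiCo is
the number of problematic events, which here are the Review Period events, as shown in equation 1:
          <disp-formula><tex-math>pr_{\mathit{BinaryChange}} = \sum_{e_j \in E_p} 1</tex-math></disp-formula>
The average function for BiCo is the average problem density, as shown in equation 2:
          <disp-formula><tex-math>\rho = \frac{|E_p|}{|E|} = \frac{\sum_{e_j \in E_p} 1}{\sum_{e_j \in E} 1}</tex-math></disp-formula>
Similarly, the average problem density for BiCo of subset <inline-formula><tex-math>E_a</tex-math></inline-formula> is the share of Review Period
events within the subset, <inline-formula><tex-math>\rho_a = |E_a \cap E_p| / |E_a|</tex-math></inline-formula>, and the Contribution% for BiCo of subset
<inline-formula><tex-math>E_a</tex-math></inline-formula> is <inline-formula><tex-math>con_{\mathit{BiCo}} = (\rho_a - \rho) \sum_{e_j \in E_a} 1 \,/\, pr_{\mathit{BinaryChange}}</tex-math></inline-formula>.
        </p>
        <p>
          Continuous Change Window. For the Continuous Change Window the
formulas from Table 8 in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] are written as follows. The total problem size for CoCo is
the sum of distances between Age and average Age for each event separately, as shown in equation 9:
          <disp-formula><tex-math>pr_{\mathit{ContinuousChange}} = \frac{1}{2} \sum_{e_j \in E} \left| d_{e_j} - \bar{d} \right|</tex-math></disp-formula>
The average function for CoCo is the average age, as shown in equation 10:
          <disp-formula><tex-math>\rho = \bar{d} = \frac{\sum_{e_j \in E} d_{e_j}}{\sum_{e_j \in E} 1}</tex-math></disp-formula>
Similarly, the average problem density for CoCo of subset <inline-formula><tex-math>E_a</tex-math></inline-formula> is the average age within the subset, as shown in equation 11:
          <disp-formula><tex-math>\rho_a = \bar{d}_a = \frac{\sum_{e_j \in E_a} d_{e_j}}{\sum_{e_j \in E_a} 1}</tex-math></disp-formula>
Finally, with <inline-formula><tex-math>pr_{\mathit{CoCo}} = pr_{\mathit{ContinuousChange}}</tex-math></inline-formula>, the Contribution% for CoCo of subset <inline-formula><tex-math>E_a</tex-math></inline-formula> is
          <disp-formula><tex-math>con_{\mathit{CoCo}} = \frac{(\bar{d}_a - \bar{d}) \sum_{e_j \in E_a} 1}{pr_{\mathit{CoCo}}}</tex-math></disp-formula>
as shown in equation 12 in Table 8 in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].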
        </p>
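        <p>As an illustration, the BiCo and CoCo contribution formulas can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function names and the list-based inputs are assumptions made for the example, and the BiCo subset density is assumed to be the share of Review Period events within the subset.</p>
        <preformat><![CDATA[
```python
def bico_contribution(segment_in_review, all_in_review):
    """Binary contribution of a segment E_a as a fraction of the problem size.

    Both arguments are lists of booleans, one per event: True when the event
    belongs to the Review Period, False when it belongs to the Comparison Period.
    """
    problem_size = sum(all_in_review)              # pr: number of Review Period events
    density = problem_size / len(all_in_review)    # average density over E
    segment_density = sum(segment_in_review) / len(segment_in_review)
    return (segment_density - density) * len(segment_in_review) / problem_size


def coco_contribution(segment_ages, all_ages):
    """Continuous contribution of a segment E_a, using event Age as the measure."""
    mean_age = sum(all_ages) / len(all_ages)
    problem_size = 0.5 * sum(abs(age - mean_age) for age in all_ages)
    segment_mean_age = sum(segment_ages) / len(segment_ages)
    return (segment_mean_age - mean_age) * len(segment_ages) / problem_size


# Toy data: 10 events, 2 of them in the Review Period.
all_flags = [True, True] + [False] * 8
segment_flags = [True, True, False, False]    # a segment holding both review events
print(bico_contribution(segment_flags, all_flags))

all_ages = [10, 20, 30, 40]                   # ages in days
print(coco_contribution([10, 20], all_ages))  # the two most recent events
```
]]></preformat>
        <p>A positive BiCo value means that the segment is over-represented in the review period. With Age as the measure, a negative CoCo value means that the events of the segment are more recent than average. In both cases the contributions of a segment and its complement sum to zero.</p>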
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Case Study: BPI Challenge 2017 Dataset</title>
      <p>
        In this section we show a real life example of using the presented methodology
on the loan applications process data from a Dutch Financial Institute. The data
is publicly available as BPI Challenge 2017 Dataset [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and contains 31 509 cases
and a total of 1 202 267 events. The original dataset has been prepared in a way
that it contains full cases. Since the purpose of our analysis is to show business
process changes within a continuous monitoring situation, we have taken the
following steps in preparing a setup for business review analysis.
        <list list-type="bullet">
          <list-item><p>November 2016 is selected as the business review month. The data contains 104 946 events whose time stamp belongs to November, so the Review period consists of these events.</p></list-item>
          <list-item><p>All events occurring later than November, 120 568 events in total, belong to the Most Recent Data period and are excluded from the analysis. These events would naturally be included in later business review periods.</p></list-item>
          <list-item><p>The Comparison period has been chosen to include the 6 months before the review period, i.e. from May 2016 to October 2016. The Comparison period contains 647 406 events.</p></list-item>
          <list-item><p>The History period contains 329 347 events occurring before May 2016. These events are used for constructing the process path and predecessor dimensions for Comparison and Review events, but they are not included in the analysis as actual events belonging to either the Comparison or the Review set.</p></list-item>
          <list-item><p>The total number of events in the analysis is 752 352, consisting of 104 946 events for the Review period (13.95% of all events) and 647 406 events for the Comparison period (86.05% of all events).</p></list-item>
        </list>
      </p>
      <sec id="sec-4-1">
        <title>Results using Binary Change Window</title>
        <p>Figure 2 shows the top-10 most important changes in the business process and related data for the review period.
We see many changes in the values of the event attribute org:resource, which
suggests that the set of active employees is changing considerably. User 133 has conducted 3 728 events
during the review period and only 4 267 in the comparison period, so 47% of this user's
events have taken place during the review period. This makes User 133 the
biggest increase in volume when we take into account both the size of the user's total activity (7 995
events) and the difference of 33 percentage points from the average share of 13.95% of events that would be expected to
take place in the review period.</p>
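        <p>The arithmetic behind the User 133 example can be checked with a short Python sketch. The function names are hypothetical, and the "excess events" score is only an assumption about how total activity and deviation from the average share might be combined into one number; it is not necessarily the measure used to rank the changes in Figure 2.</p>
        <preformat>
```python
# Event counts reported in the text.
REVIEW_TOTAL = 104946
COMPARISON_TOTAL = 647406


def review_share(review_count, comparison_count):
    """Fraction of the events that fall into the review period."""
    return review_count / (review_count + comparison_count)


def excess_review_events(review_count, comparison_count, baseline_share):
    """Assumed score: total activity times deviation from the baseline share.

    This equals the number of review events beyond what the baseline share
    would predict. It is an illustration, not the paper's stated formula.
    """
    total = review_count + comparison_count
    deviation = review_share(review_count, comparison_count) - baseline_share
    return total * deviation


baseline = review_share(REVIEW_TOTAL, COMPARISON_TOTAL)
share = review_share(3728, 4267)
deviation = share - baseline
excess = excess_review_events(3728, 4267, baseline)
print(f"share {share:.1%}, deviation {deviation:.1%}, excess {excess:.0f} events")
```
        </preformat>
        <p>Rounded to whole percentages, the computed share and deviation agree with the 47% and 33% quoted above.</p>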
        <p>Figure 3 shows the changes in the event type dimension only. The event types
W_Call incomplete files - suspend and W_Call after offers - ate_abort occur more
often during the Review period, whereas the event types W_Validate application
- resume and W_Call after offers - suspend occur less often during the Review
period than during the Comparison period.</p>
        <p>To consider process-related changes in which the order of activities
changes, we limit the analysis to predecessor changes, where a certain
event takes place immediately after another event, as shown in Figure 4. During
the review period the control flow transition from event type W_Call after offers -
ate_abort to W_Call after offers - schedule occurs more often, and the transition
from event type W_Validate application - suspend to W_Validate application -
resume occurs less often, than during the comparison period.</p>
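        <p>A predecessor feature of this kind can be derived per event by remembering the last event type seen in each case. The Python sketch below is an illustration under assumptions: the function name, the tuple layout and the case identifiers are hypothetical, and the event type labels are written in the form used by the BPI Challenge 2017 log. Because events are processed in time order across all periods, a Review event can have a predecessor from an earlier period.</p>
        <preformat>
```python
from datetime import datetime


def add_predecessors(events):
    """Attach to each event the type of the previous event in the same case.

    events: iterable of (case_id, timestamp, event_type) tuples.
    Returns (case_id, timestamp, event_type, predecessor) tuples in time
    order; predecessor is None for the first event of a case.
    """
    last_type_in_case = {}
    result = []
    for case_id, timestamp, event_type in sorted(events, key=lambda e: e[1]):
        predecessor = last_type_in_case.get(case_id)
        result.append((case_id, timestamp, event_type, predecessor))
        last_type_in_case[case_id] = event_type
    return result


# Toy events with hypothetical case identifiers.
events = [
    ("case-1", datetime(2016, 10, 30), "W_Validate application - suspend"),
    ("case-2", datetime(2016, 11, 2), "W_Call after offers - ate_abort"),
    ("case-1", datetime(2016, 11, 3), "W_Validate application - resume"),
    ("case-2", datetime(2016, 11, 4), "W_Call after offers - schedule"),
]

for case_id, timestamp, event_type, predecessor in add_predecessors(events):
    print(case_id, timestamp.date(), predecessor, "->", event_type)
```
        </preformat>
        <p>The resulting (predecessor, event type) pair can then be treated like any other event attribute when Review events are compared with Comparison events.</p>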
      </sec>
      <sec id="sec-4-2">
        <title>Results using Continuous Change Window</title>
        <p>Figure 5 shows the continuous approach version of the same overall analysis that Figure 2 shows for the binary approach.
The continuous analysis is configured to discover differences between events
from mid-August to November and events from May to mid-August. The results
of the continuous analysis are obtained by giving each event a weight based on the
Age of the event. The bigger the distance from the average Age, the bigger the weight
of that particular event. In our example data the average Age of events is 103.97
days, so the distance from the average Age lies between +103.97 days and
-103.97 days. An event taking place at either end (oldest or youngest) has
about 100 times the weight of an event taking place 1 day after or
before the average Age. Similarly, an event that takes place exactly at the average
Age has zero weight, as it belongs neither to the old period nor to the new period.
The continuous analysis results are well in line with the binary approach results; the
differences are due to the different setup of the Review and Comparison periods
and the different weighting approach described above. For example, User 133 as the new
value for org:resource is still the biggest change, and both User 67
and User 65 are included in the top-10 changes for both the Binary and the Continuous
approach, as is visible in Figures 2 and 5.</p>
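        <p>The weighting described above can be sketched as follows. This is a minimal Python illustration under assumptions: the weight is taken to be linear in the distance from the average Age, which is consistent with the factor of about 100 mentioned above, and the sign convention (positive for events newer than average) and the function name are hypothetical.</p>
        <preformat>
```python
def age_weights(ages_in_days):
    """Signed weight per event based on its distance from the mean Age.

    Assumption: linear weighting, positive for events newer than the
    average and negative for events older than the average.
    """
    mean_age = sum(ages_in_days) / len(ages_in_days)
    return [mean_age - age for age in ages_in_days]


# Toy ages chosen so that the mean Age equals the 103.97 days given in the text.
ages = [0.0, 102.97, 103.97, 104.97, 207.94]
weights = age_weights(ages)

# Ratio between the weight of the newest event and an event one day
# away from the average Age.
ratio = abs(weights[0]) / abs(weights[1])
print([round(w, 2) for w in weights], round(ratio))
```
        </preformat>
        <p>With this linear weighting the newest and oldest events get weights of about +103.97 and -103.97, an event at the average Age gets a weight of about zero, and the ratio to an event one day from the average is roughly 100.</p>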
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Summary and Conclusions</title>
      <p>In this paper we have presented a method for detecting business process changes.
The method is based on our previously published Influence Analysis and it uses
the conformance measure to scale different types of changes in order to present
various kinds of changes sorted by their significance. One novel idea in this paper
is to use Influence Analysis on the event level instead of the business process case
level. Operating on the event level makes it possible to use all available data
from the review period for detecting changes instead of having to wait until a
business process case is completed. Our key experiences from using
the analysis with real-life cases include:
- Changes in business operations can be analyzed by comparing Review period
events to the Comparison period events using influence analysis.
- Business people quickly learn to read the influence analysis results on a monthly
basis. Detecting the top-10 or top-50 changes gives a very good starting point
for a more detailed periodical analysis of business process changes.
- Detected changes may also be a result of incorrect data integration between
the process mining system and the actual ERP system(s). The method presented
in this paper serves as an easy-to-use quality assurance tool for evaluating
the correctness of periodical data loads and integrations. For example, after
each monthly, weekly or daily data import the system can notify a business
analyst about the top-10 changes so that a potential technical integration
problem is detected and corrected before other business users spend a lot of
time analyzing incorrect data.</p>
      <p>Acknowledgements. We thank QPR Software Plc for the practical experiences
from a wide variety of customer cases and for funding our research. The
algorithms presented in this paper have been implemented in a commercial process
mining tool QPR ProcessAnalyzer.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Barbon Junior</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tavares</surname>
            ,
            <given-names>G. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>da Costa</surname>
            ,
            <given-names>V. G. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceravolo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Damiani</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2018</year>
          , April).
          <article-title>A Framework for Human-in-the-loop Monitoring of Concept-drift Detection in Event Log Stream</article-title>
          .
          <source>The Web Conference</source>
          <year>2018</year>
          (pp.
          <fpage>319</fpage>
          -
          <lpage>326</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bose</surname>
          </string-name>
          , R. J. C.,
          <string-name>
            <surname>van der Aalst</surname>
          </string-name>
          , W. M.,
          <string-name>
            <surname>Žliobaitė</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pechenizkiy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2011</year>
          , June).
          <article-title>Handling concept drift in process mining</article-title>
          .
          <source>In International Conference on Advanced Information Systems Engineering</source>
          (pp.
          <fpage>391</fpage>
          -
          <lpage>405</lpage>
          ). Springer, Berlin, Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bose</surname>
            ,
            <given-names>R. J. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Der Aalst</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zliobaite</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pechenizkiy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Dealing with concept drifts in process mining</article-title>
          .
          <source>IEEE transactions on neural networks and learning systems</source>
          ,
          <volume>25</volume>
          (
          <issue>1</issue>
          ),
          <fpage>154</fpage>
          -
          <lpage>171</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Buijs</surname>
            ,
            <given-names>J. C. A. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W. M. P.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Enabling interactive process analysis with process mining and visual analytics</article-title>
          .
          <source>BIOSTEC</source>
          <year>2017</year>
          ,
          <volume>573</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Carmona</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gavalda</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2012</year>
          , October).
          <article-title>Online techniques for dealing with concept drift in process mining</article-title>
          .
          <source>In International Symposium on Intelligent Data Analysis</source>
          (pp.
          <fpage>90</fpage>
          -
          <lpage>102</lpage>
          ). Springer, Berlin, Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>van Dongen</surname>
            ,
            <given-names>B.F.</given-names>
          </string-name>
          (Boudewijn) (
          <year>2017</year>
          )
          <article-title>BPI Challenge 2017</article-title>
          . Eindhoven University of Technology. Dataset. https://doi.org/10.4121/uuid:5f3067df-f10b-45da-b98b-86ae4c7a310b
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Van Dongen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>A Cluster Algorithm for Graphs</article-title>
          .
          <source>Technical report, National Research Institute for Mathematics and Computer Science in the Netherlands.</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Žliobaitė</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pechenizkiy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gama</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>An overview of concept drift applications</article-title>
          .
          <source>In Big data analysis: new algorithms for a new society</source>
          (pp.
          <fpage>91</fpage>
          -
          <lpage>114</lpage>
          ). Springer, Cham.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hinkka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehto</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heljanko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2017</year>
          , September).
          <article-title>Structural Feature Selection for Event Logs</article-title>
          .
          <source>In International Conference on Business Process Management</source>
          (pp.
          <fpage>20</fpage>
          -
          <lpage>35</lpage>
          ). Springer, Cham.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Hompes</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buijs</surname>
          </string-name>
          , J. C.,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dixit</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Buurman</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Detecting Change in Processes Using Comparative Trace Clustering</article-title>
          . In SIMPDA (pp.
          <fpage>95</fpage>
          -
          <lpage>108</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lehto</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinkka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hollmen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Focusing Business Improvements Using Process Mining Based Influence Analysis</article-title>
          .
          <source>In International Conference on Business Process Management</source>
          . Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lehto</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinkka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hollmen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Focusing Business Process Lead Time Improvements Using Influence Analysis</article-title>
          .
          <source>In International Symposium on Data-Driven Process Discovery and Analysis. Rheinisch-Westfaelische Technische Hochschule Aachen.</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>de Leoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W. M. P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dees</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>A general process mining framework for correlating, predicting and clustering dynamic behavior based on event logs</article-title>
          .
          <source>Information Systems</source>
          , http://dx.doi.org/10.1016/j.is.2015.07.003
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Maaradji</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>La Rosa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ostovar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2015</year>
          , August).
          <article-title>Fast and accurate business process drift detection</article-title>
          .
          <source>In International Conference on Business Process Management</source>
          (pp.
          <fpage>406</fpage>
          -
          <lpage>422</lpage>
          ). Springer, Cham.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Maisenbacher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Weidlich</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2017</year>
          , June).
          <article-title>Handling concept drift in predictive process monitoring</article-title>
          .
          <source>In Services Computing (SCC), 2017 IEEE International Conference on</source>
          (pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Suriadi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ouyang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W. M. P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>ter Hofstede</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Root cause analysis with enriched process logs</article-title>
          .
          <source>Lecture Notes in Business Information Processing [Business Process Management Workshops: BPM 2012 International Workshops Revised Papers]</source>
          ,
          <volume>132</volume>
          , pp.
          <fpage>174</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Teinemaa</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tax</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>de Leoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Maggi</surname>
            ,
            <given-names>F. M.</given-names>
          </string-name>
          (
          <year>2018</year>
          , September).
          <article-title>Alarm-based prescriptive process monitoring</article-title>
          .
          <source>In International Conference on Business Process Management</source>
          (pp.
          <fpage>91</fpage>
          -
          <lpage>107</lpage>
          ). Springer, Cham.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Van Der Aalst</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , et al. (
          <year>2012</year>
          ).
          <article-title>Process mining manifesto</article-title>
          .
          <source>In Business Process Management Workshops</source>
          . Springer, Berlin, Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Webb</surname>
            ,
            <given-names>G. I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hyde</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>H. L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Petitjean</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Characterizing concept drift</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          ,
          <volume>30</volume>
          (
          <issue>4</issue>
          ),
          <fpage>964</fpage>
          -
          <lpage>994</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>