<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ProcessProfiler3D: A Tool for Visualising Performance Differences Between Process Cohorts and Process Instances</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>E. Poppe</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M.T. Wynn</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A.H.M. ter Hofstede</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R. Brown</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Pini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>W.M.P. van der Aalst</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DensityDesign Research Lab, Politecnico di Milano</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Eindhoven University of Technology</institution>
          ,
          <addr-line>Eindhoven</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Queensland University of Technology</institution>
          ,
          <addr-line>Queensland</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An organisation's event logs can give great insight into factors that affect the execution of its business processes by comparing different process cohorts. We have recently presented ProcessProfiler3D, a novel tool for such comparisons that supports interactive data exploration, automatic calculation of performance data and visual comparison of multiple cohorts. The approach enables the intuitive discovery of differences and trends in cohort performance. To better support the interpretation of these differences in the context of process execution, we have now extended the tool with a novel visualisation technique that shows case execution and timing in a way that provides context for such a performance analysis.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Analysing process data in event logs to identify problems and opportunities
with existing processes can be of great value for improving the processes of an
organisation. Process mining [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a specialised field of research in business
process management, develops tools and techniques to support this. By splitting
an event log into process cohorts, i.e. groups of process instances that have one
or more shared characteristics, one can analyse how different case
characteristics (often called context factors) affect the execution of a process. We have
recently identified that despite continued industry interest [
        <xref ref-type="bibr" rid="ref2 ref4 ref6">6,4,2</xref>
        ], there is a lack
of tools to support such analyses effectively [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. None of the existing academic or
commercial tools provided both interactive data exploration, by
supporting interactive splitting of the event log, and an integrated comparison
of multiple process cohorts, by supporting the visualisation of performance data
for more than two cohorts in one view. Consequently, we presented
ProcessProfiler3D, a framework to solve this issue [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We now present a complementary
novel visualisation technique that covers additional performance analysis
scenarios by providing additional context to the presented performance data.
ProcessProfiler3D enables comparing the performance of multiple process
cohorts by
        <list list-type="bullet">
          <list-item><p>aligning an event log with a process model;</p></list-item>
          <list-item><p>calculating common node-level process performance indicators such as activity duration, activity throughput time and waiting times between activities;</p></list-item>
          <list-item><p>storing performance data in a data cube;</p></list-item>
          <list-item><p>interactively splitting the event log by defining cohorts;</p></list-item>
          <list-item><p>visualising performance data in a third dimension on top of the process model at multiple levels of process abstraction;</p></list-item>
          <list-item><p>visualising data related to activities using one of three different types of bar chart or a triangle chart (see [<xref ref-type="bibr" rid="ref5">5</xref>]);</p></list-item>
          <list-item><p>visualising data related to activity pairs using coloured arcs between the two activities (see [<xref ref-type="bibr" rid="ref7">7</xref>]).</p></list-item>
        </list>
      </p>
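      <p>The indicator calculation and cohort split described above can be illustrated with a minimal sketch. It is not the tool's implementation: the event tuples, the cohort names and the function name cohort_indicators are hypothetical, and plain dictionaries keyed by cohort and activity stand in for the data cube.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from statistics import mean

# Hypothetical event records: (case id, cohort, activity, start, complete),
# with times in minutes. None of these values come from the paper.
EVENTS = [
    ("c1", "online", "A", 0, 10),
    ("c1", "online", "B", 25, 40),
    ("c2", "online", "A", 5, 25),
    ("c2", "online", "B", 30, 50),
    ("c3", "phone", "A", 0, 30),
    ("c3", "phone", "B", 90, 100),
]


def cohort_indicators(events):
    """Return mean activity duration and mean waiting time, both per cohort."""
    durations = defaultdict(list)  # (cohort, activity) -> list of durations
    waits = defaultdict(list)      # (cohort, from, to) -> list of waiting times
    by_case = defaultdict(list)    # (case, cohort) -> events of that case
    for case, cohort, activity, start, complete in events:
        durations[(cohort, activity)].append(complete - start)
        by_case[(case, cohort)].append((start, complete, activity))
    for (_, cohort), rows in by_case.items():
        rows.sort()  # order the events of a case by start time
        for (_, done, first), (begin, _, second) in zip(rows, rows[1:]):
            # waiting time: gap between completing one activity and starting the next
            waits[(cohort, first, second)].append(begin - done)
    return ({key: mean(vals) for key, vals in durations.items()},
            {key: mean(vals) for key, vals in waits.items()})


mean_duration, mean_wait = cohort_indicators(EVENTS)
```
]]></preformat>
      <p>Keying every value by cohort is what allows more than two cohorts to be compared side by side; in this toy log the waiting time between A and B differs markedly between the two hypothetical cohorts.</p>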
      <p>The framework was implemented in two plugins for the process mining
framework ProM. Figure 1 shows an example of comparative performance analysis
using this tool.</p>
      <p>However, we note that some scenarios are still not well covered by existing
performance analysis techniques and in the remainder of this paper we will
discuss one of these scenarios and present a novel visualisation technique that we
have added to ProcessProfiler3D to address this issue.</p>
    </sec>
    <sec id="sec-2">
      <title>Problem statement</title>
      <p>One issue with existing techniques for process performance analysis is the loss of
context that occurs when performance data are a) localised and b) summarised
as is usually the case with activity duration, throughput time and waiting time
calculations. Both problems have the potential to affect our understanding of
performance analysis results and can complicate finding root causes.</p>
      <p>Firstly, the analysis results are currently localised to one point in the process
model. For example, an activity C may be preceded by either activity A or
B. By looking at performance indicators of these activities we cannot tell if
cases that first executed A on average take longer to execute C than cases that
executed B. So by localising the analysis results per activity we lose the context
of how preceding activities affected the case and how subsequent activities were
impacted.</p>
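      <p>A small sketch makes the A/B/C example concrete. The traces, durations and the helper duration_by_predecessor are hypothetical and only illustrate the point: a single localised value for C hides a difference that becomes visible once the preceding activity is taken into account.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from statistics import mean

# Hypothetical traces: ordered (activity, duration in minutes) pairs per case.
TRACES = {
    "c1": [("A", 5), ("C", 30)],
    "c2": [("A", 6), ("C", 34)],
    "c3": [("B", 5), ("C", 10)],
    "c4": [("B", 7), ("C", 14)],
}


def duration_by_predecessor(traces, activity):
    """Mean duration of `activity`, grouped by the activity executed directly before it."""
    groups = defaultdict(list)
    for steps in traces.values():
        for (previous, _), (current, duration) in zip(steps, steps[1:]):
            if current == activity:
                groups[previous].append(duration)
    return {previous: mean(values) for previous, values in groups.items()}


# Localised view: one number for C, regardless of what happened before.
localised = mean(d for steps in TRACES.values() for a, d in steps if a == "C")

# Contextualised view: the same durations split by predecessor.
in_context = duration_by_predecessor(TRACES, "C")
```
]]></preformat>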
      <p>Secondly, the statistical summary of performance indicators by minimum,
median, mean and maximum also means that we are losing context in the
results. It is, for example, hard to tell whether a few extreme cases skewed the
results or what the general distribution of cases is. Furthermore, if the same
case executes an activity multiple times, it is impossible to identify differences
between the individual execution times (e.g. the activity took a long time on the
first execution, but finished quickly on every following execution). Some
absolute indicators, such as the average case runtime at an activity, also get
distorted by loops. Consequently, while existing process performance analysis
techniques already provide valuable insights into the execution of a process,
additional analysis techniques are required to add context to the results of existing
techniques.</p>
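      <p>The effect of summarising can be seen in a few lines of arithmetic. The durations below are invented for illustration; they show how one extreme case pulls the mean away from the median, and how a per-case average hides the difference between a slow first execution and fast repeats in a loop.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from statistics import mean, median

# Hypothetical durations (minutes) of one activity across four cases;
# the last case is an extreme one.
durations = [10, 12, 11, 200]
summary = {
    "min": min(durations),
    "median": median(durations),
    "mean": mean(durations),
    "max": max(durations),
}

# Hypothetical case that executes the activity three times in a loop:
# slow on the first execution, fast on the repeats.
loop_case = [60, 5, 5]
loop_mean = mean(loop_case)  # a single value that matches none of the executions
```
]]></preformat>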
    </sec>
    <sec id="sec-3">
      <title>Trajectory Visualisation</title>
      <p>
        We therefore propose a novel visualisation technique inspired by geo-spatial
data visualisations (e.g. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]) to present performance data in the context of both
history and future execution of a process instance. This visualisation presents
the path of individual cases through a process model, while showing timing
information in a third dimension, orthogonal to the process model. An example
of this technique can be seen in Figure 2.
      </p>
      <p>We construct this visualisation by replaying a token-game on a given Petri
net and recording each token move as a line in two dimensions. We then use the
time of each event that triggered the token move to calculate the height of the
start and end point of each line. Our implementation provides three different
configurations of the trajectory visualisation. The first variant visualises token
paths from one activity straight to the next activity. The second variant visualises
the token paths from the activity through the place to the next activity. The third
variant visualises the token path from the activity along the edge connecting it to
the place and then along the edge to the next activity. Each successive variant increases the
complexity of the visualisation, but lines that follow the model layout more
closely are often easier to relate back to the underlying process model and
therefore easier to understand. To further facilitate this, vertical support lines
can be displayed by selecting nodes in the process model, as shown in Figure 2.</p>
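      <p>The construction of the first variant can be sketched as follows. This is a simplification under stated assumptions, not the ProM plugin code: instead of replaying a token game on a Petri net, it walks a time-ordered trace directly, and the layout coordinates, the trajectory function and the scale parameter (height per time unit, measured from the first event of the case) are all hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]
Segment = Tuple[Point3D, Point3D]

# Hypothetical 2D layout positions of the activities in the process model.
LAYOUT: Dict[str, Tuple[float, float]] = {
    "A": (0.0, 0.0),
    "B": (4.0, 0.0),
    "C": (8.0, 2.0),
}


def trajectory(trace: List[Tuple[str, float]],
               layout: Dict[str, Tuple[float, float]],
               scale: float = 1.0) -> List[Segment]:
    """Turn a time-ordered trace of (activity, timestamp) pairs into 3D segments.

    Each segment runs straight from one activity to the next. The height of
    each end point is the time elapsed since the first event, times `scale`.
    """
    if not trace:
        return []
    t0 = trace[0][1]
    segments = []
    for (a1, t1), (a2, t2) in zip(trace, trace[1:]):
        x1, y1 = layout[a1]
        x2, y2 = layout[a2]
        segments.append(((x1, y1, (t1 - t0) * scale),
                         (x2, y2, (t2 - t0) * scale)))
    return segments


segments = trajectory([("A", 100.0), ("B", 130.0), ("C", 190.0)], LAYOUT, scale=0.5)
```
]]></preformat>
      <p>In such a sketch a steep segment corresponds to a long delay between two activities, and changing scale corresponds to stretching the vertical axis. The second and third variants would route each segment through the place or along the model's edges instead of straight between activities.</p>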
      <p>In addition to the shape of case trajectories, colours can encode additional
information in the visualisation. By default, case trajectories are coloured to
indicate the cohort a case belongs to (see Figure 3). However, our
implementation can also colour the trajectory to display relative completion of the case
as a colour gradient. This can facilitate finding bottlenecks in large event logs.
Furthermore, the cohort classification can be used to filter the visualisation, by
hiding trajectories belonging to a particular cohort. Lastly, the vertical scale of
the visualisation can be changed by clicking on the white frame surrounding the
trajectories and pulling it upwards or downwards. This can make it easier to see
differences between otherwise densely packed trajectories.</p>
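      <p>One possible way to derive such a gradient is to interpolate linearly between two colours according to how far a case has progressed in time. The sketch below assumes this linear mapping and a blue-to-red colour pair; neither is taken from the tool, and the function name completion_colour is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def completion_colour(t, start, end, low=(0, 0, 255), high=(255, 0, 0)):
    """Interpolate linearly between two RGB colours by relative case completion.

    `start` and `end` are the timestamps of the first and last event of the
    case; `low` and `high` are assumed colours for 0% and 100% completion.
    """
    fraction = 0.0 if end == start else (t - start) / (end - start)
    fraction = min(1.0, max(0.0, fraction))  # clamp to the range [0, 1]
    return tuple(round(a + (b - a) * fraction) for a, b in zip(low, high))


# Colours at the start, the midpoint and the end of a hypothetical case.
colours = [completion_colour(t, 100, 200) for t in (100, 150, 200)]
```
]]></preformat>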
      <p>
        Seeing both control-flow and time perspective in one view enables users to
identify interactions between control-flow constructs such as loops and process
execution times. Using this technique together with the previously presented
techniques for comparative performance visualisation (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]) therefore
facilitates the understanding of performance analysis results.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>
        We have presented ProcessProfiler3D, a framework that can be used to analyse
and compare the performance of multiple process cohorts. The usefulness of this
framework has previously been demonstrated by analysing two industry data
sets and evaluating the tool with two industry partners [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this paper we
have added a novel visualisation technique, the trajectory visualisation, to this
framework, to address the loss of context in the existing performance analysis
approaches.
      </p>
      <p>The framework is available as a package (called "ProcessProfiler3D") for the
process mining framework ProM. In addition, the complete source code for the
tool, including the trajectory visualisation, is available in the ProM repository
(https://svn.win.tue.nl/repos/prom/Packages/ProcessProfiler3D/).</p>
      <p>A screencast of the tool including the new technique is available at:
https://www.youtube.com/watch?v=CkgBTFk6MXY</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          : Process Mining: Data Science in Action. Springer (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bolt</surname>
          </string-name>
          , A.,
          <string-name>
            <surname>de Leoni</surname>
          </string-name>
          , M.,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gorissen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Exploiting process cubes, analytic workflows and process mining for business process reporting: A case study in education</article-title>
          .
          <source>In: International Symposium on Data-driven Process Discovery and Analysis</source>
          . pp.
          <fpage>33</fpage>
          –
          <lpage>47</lpage>
          .
          <publisher-name>CEUR-WS.org</publisher-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kraak</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          :
          <article-title>The space-time cube revisited from a geovisualization perspective</article-title>
          .
          <source>In: Proc. 21st International Cartographic Conference</source>
          . pp.
          <fpage>1988</fpage>
          –
          <lpage>1996</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Partington</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wynn</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suriadi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ouyang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karnon</surname>
          </string-name>
          , J.:
          <article-title>Process mining for clinical processes: A comparative analysis of four Australian hospitals</article-title>
          .
          <source>ACM Transactions on Management Information Systems</source>
          <volume>5</volume>
          (
          <issue>4</issue>
          ),
          <fpage>19:1</fpage>
          –
          <lpage>19:18</lpage>
          (Jan
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
          </string-name>
          , R.,
          <string-name>
            <surname>Wynn</surname>
          </string-name>
          , M.T.:
          <article-title>Process visualization techniques for multiperspective process comparisons</article-title>
          . In:
          <string-name>
            <surname>Bae</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suriadi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (eds.)
          <source>Asia Pacific Business Process Management. Lecture Notes in Business Information Processing</source>
          , vol.
          <volume>219</volume>
          , pp.
          <fpage>183</fpage>
          –
          <lpage>197</lpage>
          . Springer, Busan, Korea (March
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Suriadi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wynn</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ouyang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ter Hofstede</surname>
            ,
            <given-names>A.H.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Dijk</surname>
            ,
            <given-names>N.J.</given-names>
          </string-name>
          :
          <article-title>Understanding process behaviours in a large insurance company in Australia: A case study</article-title>
          . In:
          <string-name>
            <surname>Salinesi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Norrie</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pastor</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          (eds.)
          <source>Advanced Information Systems Engineering, Lecture Notes in Computer Science</source>
          , vol.
          <volume>7908</volume>
          , pp.
          <fpage>449</fpage>
          –
          <lpage>464</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Wynn</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poppe</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ter Hofstede</surname>
            ,
            <given-names>A.H.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
          </string-name>
          , R.A.,
          <string-name>
            <surname>Pini</surname>
          </string-name>
          , A.,
          <string-name>
            <surname>van der Aalst</surname>
          </string-name>
          , W.M.P.:
          <article-title>ProcessProfiler3D: A visualisation framework for log-based process performance comparison</article-title>
          .
          <source>Decision Support Systems</source>
          (
          <year>2017</year>
          , in press), https://doi.org/10.1016/j.dss.2017.04.004
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>