<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Directly Follows-Based Process Mining: a Tool</article-title>
      </title-group>
      <contrib-group>
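        <contrib contrib-type="author">
          <string-name>
            <given-names>Sander J.J.</given-names>
            <surname>Leemans</surname>
          </string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>
            <given-names>Erik</given-names>
            <surname>Poppe</surname>
          </string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>
            <given-names>Moe T.</given-names>
            <surname>Wynn</surname>
          </string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Queensland University of Technology</institution>
          , Brisbane,
          <country country="AU">Australia</country>
        </aff>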
      </contrib-group>
      <abstract>
        <p>In order to bridge the gap between well-founded academic and intuitive commercial tools, we introduce the Directly Follows visual Miner (DFvM), which takes an event log as input and allows users to explore it. That is, it automatically discovers a directly follows process model, applies conformance checking, provides performance measures and allows for filtering the log. DFvM uses directly follows models, which are also used by many commercial process mining tools; however, unlike such tools, it provides conformance checking and reliable performance measures.</p>
        <p>Index Terms: process discovery, conformance checking, process enhancement, performance mining, directly follows-based process mining</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Process mining aims to obtain insights from event logs that
contain recorded behaviour of executions of an organisation’s
business processes, in order to optimise the processes.
Typically, in a process mining project, first a process model that
describes the control flow of the process is discovered from an
event log. Second, the model should be evaluated against the
event log or a secondary test event log to verify that the model
represents the behaviour of the process well. This is typically
performed using a conformance checking technique. Third,
the performance of the process can be measured to identify
bottlenecks, central concepts and batching behaviour. These
steps can be repeated to, for instance, zoom in on areas of
particular interest, or to compare different groups of recorded
behaviour [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Many process mining techniques and tools have been
proposed, both commercially (e.g. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) and academically [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Academic tools typically use process models with
well-defined semantics that support advanced constructs such as
concurrency, interleaving and inclusive choices. This allows
such models, for instance Petri nets, process trees or BPMN
models, to be evaluated using conformance checking
techniques [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In contrast, many commercial process mining tools
use directly follows-based process models to convey business
processes to stakeholders, even though these models often
lack an executable semantics and support neither concurrency,
interleaving nor inclusive choices. As the executable semantics
of such models is not clear, obtained insights are difficult to
verify [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Nevertheless, such models are considered to be
easier to understand for users [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], we introduced concepts to bridge this gap between
academic and commercial process mining tools: we keep the
directly follows-based models, but use a proper semantics,
introduce soundness and describe how conformance checking
and performance measuring concepts can be applied to these
models.
      </p>
      <p>In this paper, we present the Directly Follows visual Miner
(DFvM), which applies both well-established and new
academic concepts to directly follows-based models. Our aim
is to illustrate that conformance checking and reliable
performance measures are possible within the limitations of intuitive
directly follows-based models.</p>
      <p>
        Relation with Inductive visual Miner: The DFvM is an
extension of the Inductive visual Miner (IvM) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and shares
its code base. Compared to the previously published version of
IvM [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], DFvM has a new architecture (Section III) and several
new features (Section IV). The architecture and many of the
new features of DFvM have been made available to users of
IvM as well, but have not been published before.
      </p>
      <p>In the remainder of this paper, we first introduce the
idea of the DFvM (Section II) and explain its architecture
(Section III). We describe the new features in Section IV
and describe its maturity in Section V. Section VI explains
how the tool can be accessed while the paper is concluded in
Section VII.</p>
    </sec>
    <sec id="sec-idea">
      <title>II. IDEA</title>
      <p>
        The Directly Follows visual Miner (DFvM) takes an event
log and automatically applies a series of steps, which allows
the user to perform process-based analyses on the log. That
is, first a Directly Follows Model (DFM) is discovered by a
DFM discovery algorithm. Second, the DFM is aligned with
the event log, and the results are shown. That is, deviations
between event log and DFM can be shown on both the model
(to show where deviations occur) and on the event log (to show
which events do not correspond to the model [log moves],
and where in the traces the model made a move that is not
represented in the trace [model moves]). Furthermore, several
performance measures, such as how often parts of the model
are executed and the time spent waiting for or executing these
parts, are computed and visualised. In addition, the event log is
animated as yellow tokens flowing through the model, which
allows the observation of bottlenecks, batching behaviour and
seasonality. In typical process mining projects, insights gained
using these techniques might highlight the need for drilling
further into the event log, for instance to compare different
subsets of the event log or to zoom in on a particular part of
the DFM [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. DFvM facilitates such inquiries through quick filtering
in several ways. First, clicking an activity or edge of the DFM
filters the log to contain only traces that use that activity or
edge. Second, filters based on trace and event attributes can
be applied before or after discovery of the DFM. Any such
filtering will update all performance measures accordingly.
      </p>
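      <p>The discovery and click-to-filter steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DFvM implementation: the function names, the START and END markers and the representation of traces as plain lists of activity names are hypothetical, and the filter inspects the trace itself, whereas DFvM filters according to the alignment with the model.</p>
      <preformat><![CDATA[
```python
from collections import Counter

# Hypothetical markers for the artificial start and end of every trace.
START, END = "START", "END"


def discover_dfm(log):
    """Count how often one activity directly follows another in the log."""
    edges = Counter()
    for trace in log:
        previous = START
        for activity in trace:
            edges[(previous, activity)] += 1
            previous = activity
        edges[(previous, END)] += 1  # an empty trace yields (START, END)
    return edges


def filter_by_activity(log, activity):
    """Keep only the traces that contain the given activity (simplified filter)."""
    return [trace for trace in log if activity in trace]


log = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"]]
dfm = discover_dfm(log)
# Re-discovering on the filtered log mimics the drill-down after a click on "b".
filtered_dfm = discover_dfm(filter_by_activity(log, "b"))
```
]]></preformat>
      <p>In this sketch, the edge from a to c is present in the model of the full log but disappears once the log is filtered on activity b, which is the kind of drill-down the filtering is meant to support.</p>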
      <p>To summarise, using DFvM a user can explore the
process iteratively to derive insights by repeatedly discovering a
process model using process discovery, evaluating the model
using conformance checking, assessing its performance and
drilling down by filtering the log.</p>
      <p>To enable this seamless exploration, a new architecture
underlying DFvM automates this process: when users change
a setting, DFvM recomputes results as necessary.</p>
    </sec>
    <sec id="sec-2">
      <title>III. ARCHITECTURE</title>
      <p>DFvM performs several steps fully automatically: after a
few log-related steps (filtering, applying the classifier, etc.), a
directly follows model is discovered, after which it is aligned,
the log is filtered and the results of that filtered alignment
are visualised (performance, histograms, animation, colouring,
etc.).</p>
      <p>The architecture of the DFvM is shown in Figure 1: the
graph shows the steps that are performed to visualise event
logs and directly follows models, and the dependencies
between steps. Each step is executed (in parallel) as soon as all
preceding steps (that is, all steps with arcs going into the
current step) have been completed. Similarly, when a user changes
a setting, only the relevant dependent steps are recomputed.
For instance, if a user changes the highlighting selection (e.g.
by clicking on an activity to filter the log to only include
traces going through that activity according to the model), then
the steps <monospace>highlight selection</monospace>, <monospace>measure performance</monospace> and
<monospace>compute histograms</monospace> are redone. This architecture is
supported by a flexible multi-threaded framework that uses
this graph to take care of starting steps, keeping track of their
progress, and cancelling steps and voiding their results if they
are no longer necessary (because the user changed something
upstream again).</p>
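      <p>The selective recomputation can be illustrated with a small dependency graph. The sketch below is an assumption-laden simplification: the graph is a hypothetical reduction of the actual step graph, it only determines which steps need to be redone and in which order, and it does not model the multi-threaded execution or the cancellation of running steps.</p>
      <preformat><![CDATA[
```python
from graphlib import TopologicalSorter

# Hypothetical step graph: every step maps to the steps it depends on.
DEPENDENCIES = {
    "filter log": set(),
    "discover model": {"filter log"},
    "align": {"discover model"},
    "highlight selection": {"align"},
    "measure performance": {"highlight selection"},
    "compute histograms": {"highlight selection"},
}


def steps_to_recompute(changed, dependencies):
    """Return the changed step and all steps downstream of it, in execution order."""
    order = list(TopologicalSorter(dependencies).static_order())
    dirty = {changed}
    for step in order:
        # A step is outdated as soon as one of its inputs is outdated.
        if not dependencies[step].isdisjoint(dirty):
            dirty.add(step)
    return [step for step in order if step in dirty]


redo = steps_to_recompute("highlight selection", DEPENDENCIES)
```
]]></preformat>
      <p>With this hypothetical graph, changing the highlighting selection marks only that step and its two dependants for recomputation, while discovery and alignment results are kept.</p>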
      <p>This framework has proven to be flexible: new steps can
be added to the framework by simply adding them and their
dependency edges to the graph.</p>
      <p>Compared to IvM, the framework has been improved:
previously, all steps were executed sequentially, and a concurrent
execution was started only if a step’s result would become
obsolete.</p>
    </sec>
    <sec id="sec-3">
      <title>IV. MAIN NEW FEATURES</title>
      <p>
        Directly Follows visual Miner (DFvM) is based on the
Inductive visual Miner (IvM) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and extends it
with the following new features, which apply exclusively to directly follows models:
      </p>
      <p>
        Process discovery. The new process discovery algorithm
is described in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and is available in DFvM. This
algorithm simplifies the model by filtering traces, based
on the paths slider, which adjusts the level of
infrequent behaviour filtering. The value
of the slider sets the minimum percentage of traces in
the event log that fit the model (guaranteed when only
completion events are considered [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]).
      </p>
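      <p>To give an intuition for the guarantee attached to the paths slider, the following sketch keeps the most frequent trace variants until the requested fraction of traces is covered and builds the edges from those variants only. This greedy strategy is an assumption made for illustration and is not claimed to be the algorithm of [4]; the function names are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def filter_traces(log, min_fraction):
    """Keep the most frequent variants until min_fraction of the traces is covered."""
    variants = Counter(tuple(trace) for trace in log)
    needed = min_fraction * len(log)
    kept, covered = [], 0
    for variant, count in variants.most_common():
        if covered >= needed:
            break
        kept.append(variant)
        covered += count
    return kept


def edges_of(variants):
    """Collect the directly follows pairs that occur in the kept variants."""
    return {(a, b) for variant in variants for a, b in zip(variant, variant[1:])}


log = [["a", "b", "c"]] * 5 + [["a", "c"]] * 3 + [["a", "c", "b"], ["a", "b"]]
kept = filter_traces(log, 0.8)
edges = edges_of(kept)
```
]]></preformat>
      <p>In this example, a slider value of 0.8 keeps the two most frequent variants, which together account for eight of the ten traces, so the two infrequent variants do not contribute edges to the model.</p>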
      <p>
        Conformance checking. Directly follows models are
automatically translated to Petri nets and aligned using [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
The results are used throughout the DFvM, for instance
to show deviations, animate the event log over the model
and to compute performance measures. DFvM shows
deviations using the concepts introduced in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which
illustrates how (commercial) directly follows-based tools
could implement conformance checking.
      </p>
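      <p>The notions of log moves and model moves can be made concrete with a small search for a cheapest alignment. The sketch below makes several assumptions: it searches the directly follows model directly with Dijkstra's algorithm instead of translating it to a Petri net and using the technique of [5], every log move and model move costs 1, and the model is given as hypothetical sets of start activities, end activities and edges.</p>
      <preformat><![CDATA[
```python
import heapq


def align(trace, starts, ends, edges, allow_empty=False):
    """Return (cost, moves) of a cheapest alignment, or None if none exists."""

    def successors(node):
        if node is None:  # None denotes the state before the first activity
            return sorted(starts)
        return sorted(b for a, b in edges if a == node)

    def is_final(node):
        return node in ends or (node is None and allow_empty)

    # Queue entries: cost, unique tie-breaker, position in trace, model node, moves.
    queue = [(0, 0, 0, None, ())]
    counter = 0
    seen = set()
    while queue:
        cost, _, pos, node, moves = heapq.heappop(queue)
        if pos == len(trace) and is_final(node):
            return cost, list(moves)
        if (pos, node) in seen:
            continue
        seen.add((pos, node))
        candidates = []
        if pos < len(trace):
            # Log move: the event is skipped because the model cannot explain it.
            candidates.append((cost + 1, pos + 1, node, ("log", trace[pos])))
        for nxt in successors(node):
            if pos < len(trace) and trace[pos] == nxt:
                # Synchronous move: event and model agree.
                candidates.append((cost, pos + 1, nxt, ("sync", nxt)))
            # Model move: the model advances without a matching event.
            candidates.append((cost + 1, pos, nxt, ("model", nxt)))
        for new_cost, new_pos, new_node, move in candidates:
            counter += 1
            heapq.heappush(queue, (new_cost, counter, new_pos, new_node, moves + (move,)))
    return None


starts, ends = {"a"}, {"c"}
edges = {("a", "b"), ("b", "c")}
cost, moves = align(["a", "c"], starts, ends, edges)
```
]]></preformat>
      <p>For the trace a, c and the model a, b, c, the cheapest alignment found by this sketch contains a single model move on b, which corresponds to a deviation that would be shown on the model.</p>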
      <p>
        Performance. Typically, existing DFM-based tools
compute performance measures based on DFM-edge
traversal. As shown in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], this might lead to counter-intuitive
results. DFvM, on the other hand, computes performance
measures based on the computed alignment. That is, the
traces are projected on the model and several performance
measures are computed based on this projection:
minimum, average and maximum sojourn, waiting and service
time, and elapsed and remaining trace time.
      </p>
      <p>
        Edit model. Users can manually edit the directly follows
model that was discovered, or import an existing stored
model to avoid discovery altogether. Editing the model
may involve changing the start and end activities, as
well as the edges that make up the model and selecting
whether the model should support empty traces. While
editing the model, users are shown an example of the
to-be model to help with editing and creating a sound model,
that is, a model in which every activity is reachable and
can proceed to the end of the model [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. If the model is
not sound, DFvM will highlight the identified issue (see
Figure 3). As soon as the edited model is sound, DFvM
will adopt the new model and continue computations
according to Figure 1.
      </p>
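      <p>The soundness condition stated above (every activity is reachable and can proceed to the end of the model) can be checked with two graph traversals. The following is a minimal sketch under the assumption that the model is given as hypothetical sets of activities, start activities, end activities and edges; it is not the check implemented in DFvM.</p>
      <preformat><![CDATA[
```python
def unsound_activities(activities, starts, ends, edges):
    """Return the activities that are unreachable or cannot reach an end activity."""

    def reachable(sources, forward):
        seen, stack = set(sources), list(sources)
        while stack:
            node = stack.pop()
            for a, b in edges:
                # Follow edges forwards from the start or backwards from the end.
                src, dst = (a, b) if forward else (b, a)
                if src == node and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    from_start = reachable(starts, forward=True)
    to_end = reachable(ends, forward=False)
    return {x for x in activities if x not in from_start or x not in to_end}


activities = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("b", "d")}
issues = unsound_activities(activities, {"a"}, {"c"}, edges)
```
]]></preformat>
      <p>In this example, activity d is reachable but has no path to an end activity, so it is reported; adding an edge from d to c would make the model sound in the sense used here.</p>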
      <p>Export model. The directly follows model can be
exported to ProM as a Petri net, an Accepting Petri net,
an Expanded Accepting Petri net (with explicit start and
completion activities) or as a DFM.</p>
      <p>New features of DFvM that also benefit users of IvM:</p>
      <p>Trace and token colouring. Using “trace colouring”, a
trace-level attribute can be chosen, and every trace will
be coloured according to the value of this attribute using a
continuous Viridis colour map. The tokens, as well as the
traces in the trace view, are highlighted with this colour.</p>
      <p>Hardware-accelerated drawing. The drawing of tokens
now makes use of hardware acceleration using OpenGL
for smoother animations.</p>
      <p>Edit model for process trees. A model editing facility has
been added for process trees as well.</p>
      <p>Additional performance measures: DFvM shows elapsed
and remaining time, in addition to waiting, service and
sojourn time. For an activity a, elapsed time is the time
from the first timestamp in a trace to the start of the
execution of a, or the completion if no start timestamp
is available. Similarly, remaining time is the time from
the completion (or start if that is not available) of a
to the last timestamp in a trace. The reported measures
are the minimum, average and maximum times over all
executions of an activity a.</p>
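      <p>The definitions of elapsed and remaining time can be written down directly. The sketch below assumes a hypothetical event representation (a dictionary with an activity name and optional numeric start and complete timestamps) and works on the raw traces, whereas DFvM computes its measures on the aligned traces.</p>
      <preformat><![CDATA[
```python
from statistics import mean


def first_known(*values):
    """Return the first value that is not None, or None if there is none."""
    return next((v for v in values if v is not None), None)


def elapsed_and_remaining(log, activity):
    """Return (min, mean, max) of elapsed and of remaining time for an activity."""
    elapsed, remaining = [], []
    for trace in log:
        stamps = [t for event in trace
                  for t in (event.get("start"), event.get("complete"))
                  if t is not None]
        if not stamps:
            continue
        first, last = min(stamps), max(stamps)
        for event in trace:
            if event["activity"] != activity:
                continue
            # Fall back to the other timestamp when one of them is missing.
            begin = first_known(event.get("start"), event.get("complete"))
            finish = first_known(event.get("complete"), event.get("start"))
            if begin is None:
                continue
            elapsed.append(begin - first)
            remaining.append(last - finish)
    if not elapsed:
        return None

    def summarise(values):
        return (min(values), mean(values), max(values))

    return summarise(elapsed), summarise(remaining)


log = [
    [{"activity": "a", "start": 0, "complete": 2},
     {"activity": "b", "start": 5, "complete": 9}],
    [{"activity": "a", "complete": 1},
     {"activity": "b", "complete": 4},
     {"activity": "c", "complete": 10}],
]
result = elapsed_and_remaining(log, "b")
```
]]></preformat>
      <p>In the second trace of this example, activity b has no start timestamp, so its completion timestamp is used for both measures, as described in the definition.</p>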
      <p>Import an existing process model and bypass the
discovery. Conformance checking and performance analysis
for existing models can be done using the ProM plug-in
“Visualise deviations on directly follows model (Directly
Follows visual Miner)”.</p>
      <p>Switch between DFvM and IvM. Users can easily switch
between DFvM and IvM by choosing a different “miner”:
even though DFvM and IvM use completely different
concepts, if advanced process discovery using process
trees with concurrency, interleaving or inclusive choices
is necessary, IvM is two clicks away.</p>
    </sec>
    <sec id="sec-4">
      <title>V. MATURITY</title>
      <p>
        The Inductive visual Miner was originally introduced in
2014, and has attracted 53 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and 41 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] citations (Google
Scholar, 18-03-2019) and has been used in several industry
projects within QUT’s BPM discipline.
      </p>
      <p>
        We have applied the Directly Follows visual Miner (DFvM)
to a case study in a Queensland Government department [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
in which it proved useful to perform process mining analyses
on models that were difficult to represent with Petri nets or
process trees. Furthermore, the department had 72 manually
created directly follows-based business process descriptions,
which were transformed to a DFvM format, after which we
used DFvM to apply conformance checking to these models
and corresponding event logs to gain insight into deviations,
frequencies and performance [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Finally, the new edit model feature has been used in
collaborations with several industry partners during workshops.
While the model is discussed with the participants, it is
modelled live in DFvM, after which DFvM quickly shows the
alignment results (deviations, frequencies and performance),
which then often leads to further discussions on the model
and on the process. For instance, one of the participants of a
workshop noted “Utilising the DFvM tool during workshops
has been beneficial. Being able to visually see the deviations,
frequencies and performance of the processes has facilitated
robust discussion. All workshop participants, whether process
focussed or not, are able to quickly comprehend the model
and leverage the findings to identify process inefficiencies and
drive potential process transformation.” (Janne Barnes, Project
Manager - Queensland University of Technology - Research
Management Systems Upgrade).</p>
    </sec>
    <sec id="sec-5">
      <title>VI. ACCESS &amp; DEMONSTRATION</title>
      <p>A user manual is available at http://leemans.ch/inductivevisualminer/,
and a screencast video is available at
https://youtu.be/xTKKqGwzh6I.</p>
    </sec>
    <sec id="sec-6">
      <title>VII. CONCLUSION</title>
      <p>Typical process mining projects involve repeated process
discovery, conformance checking, filtering and performance
measuring. Academic process discovery techniques might
return models that are difficult to understand or that underfit
or overfit the event log, while commercial tools typically do
not offer conformance checking capabilities (and thus, might
provide counter-intuitive performance measures).</p>
      <p>In this paper, we introduced the Directly Follows visual
Miner (DFvM): a new tool that automatically discovers a
directly follows model, applies conformance checking
(alignments), computes performance measures and supports filtering
to drill down into parts of the event log.</p>
      <p>DFvM extends and improves the Inductive visual Miner
with a new formalism (directly follows models rather than
process trees), a new more concurrent architecture using
a flexible framework, the ability to edit models and new
performance measures.</p>
      <p>We express the hope that the ideas illustrated in DFvM will
lower the bar for analysts to apply conformance checking and
that it will inspire commercial vendors and others to include
conformance checking capabilities.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , “
          <article-title>Process and deviation exploration with Inductive visual Miner,”</article-title>
          <source>in BPM Demo Sessions</source>
          <year>2014</year>
          ,
          <year>2014</year>
          , p.
          <fpage>46</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Günther</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rozinat</surname>
          </string-name>
          , “
          <article-title>Disco: Discover your processes,”</article-title>
          <source>in BPM demos</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <source>Process Mining - Data Science in Action, Second Edition</source>
          . Springer,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
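          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Poppe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Wynn</surname>
          </string-name>
          , “
          <article-title>Directly follows-based process mining: Exploration &amp; a case study,”</article-title>
          <source>in ICPM</source>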
          ,
          <year>2019</year>
          , in print.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Adriansyah</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B. F.</given-names>
            <surname>van Dongen</surname>
          </string-name>
          , “
          <article-title>Replaying history on process models for conformance checking and performance analysis,”</article-title>
          <source>WIRDMKD</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>182</fpage>
          -
          <lpage>192</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , “
          <article-title>Exploring processes and deviations,”</article-title>
          <source>in BPM Workshops</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>304</fpage>
          -
          <lpage>316</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>