<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>IM-viz: A Tool for the Step-by-step Visualization of the Inductive Miner</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Florian Lang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Din Hida</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yingjie Bian</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrian Rebmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Han van der Aa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Business Informatics and Mathematics, University of Mannheim</institution>
          ,
          <addr-line>B6 26, 68159 Mannheim</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Inductive Miner is a state-of-the-art algorithm for process discovery and a staple in process mining education, since its divide-and-conquer nature can teach students how to recognize behavioral relations in event data and to break up a discovery problem into smaller parts. However, a key problem from this educational perspective is that the algorithm's manual application is time-consuming, involving a considerable amount of drawing, whereas existing implementations of the algorithm only show the final outcome, not its intermediary steps. To overcome this, we present IM-viz, an educational process mining tool that visualizes the application of the Inductive Miner and the Inductive Miner infrequent in an iterative manner. IM-viz allows users to interactively explore how inductive mining works and how it deals with different kinds of event data, thus providing a convenient means for process mining students and educators to establish and analyze step-by-step process discovery examples.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Process discovery is a quintessential task in process mining, which takes exemplary process
executions, stored in an event log, and aims to establish a process model that accurately describes
the underlying process [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In this context, inductive mining refers to a family of discovery
algorithms that work in a top-down manner [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These algorithms apply a divide-and-conquer
strategy to recursively decompose the process discovery task into smaller parts. This is achieved
by establishing a directly-follows graph (DFG) of the event log and, subsequently, identifying
cuts that partition the activities in the DFG into smaller sets. These cuts identify behavioral
relations between the activity sets, indicating that they are in a sequential, parallel, exclusive,
or loop relation. The algorithm is recursively applied on the sub logs corresponding to these
smaller activity sets, until the problem cannot be further decomposed and a process tree or
workflow net is returned.
      </p>
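      <p>As a minimal sketch of the first step described above, the following Python snippet derives a DFG from an event log. It assumes that the log is given as a mapping from traces (tuples of activity names) to frequencies; the function name directly_follows_graph is illustrative and not part of IM-viz or PM4Py.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def directly_follows_graph(log):
    """Count how often one activity is directly followed by another.

    `log` maps each trace (a tuple of activity names) to its frequency.
    Returns three Counters: DFG edges, start activities, end activities.
    """
    edges, starts, ends = Counter(), Counter(), Counter()
    for trace, freq in log.items():
        if not trace:
            continue  # empty traces contribute no activities
        starts[trace[0]] += freq
        ends[trace[-1]] += freq
        # Every pair of neighbouring events yields one directly-follows edge.
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += freq
    return edges, starts, ends


# Hypothetical example log with two trace variants.
log = {("a", "c", "d"): 4, ("c", "a", "a", "d"): 5}
edges, starts, ends = directly_follows_graph(log)
```
]]></preformat>
      <p>Cut detection then operates on these edge counts together with the start and end activities.</p>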
      <p>Inductive mining is recognized as (part of the) state of the art for process discovery, as it
provides formal guarantees on its output (most importantly, output models are always sound),
is scalable, and provides flexibility to users, since inductive mining algorithms exist that can
deal with different quality issues, such as noise and incompleteness. Furthermore, inductive
mining is interesting from an educational perspective, as its divide-and-conquer nature provides
a convenient means for students to learn how to break up a discovery task into smaller problems
and to recognize behavioral relations in event data.</p>
      <p>A downside from this educational perspective is that students new to process mining may
find it difficult to understand exactly how the algorithms work and how they should be applied.
Additionally, the algorithms are time-consuming to apply manually, involving a
considerable amount of drawing (of the DFGs). Since existing implementations only provide the
output obtained by discovery (i.e., the final process tree or workflow net), they do not allow
users to understand the intermediary steps taken to reach this result. Consequently, it is tedious
for educators and difficult for students to gain insights into the application of inductive mining
algorithms in a step-by-step manner.</p>
      <p>
        To alleviate this issue, we present <italic>IM-viz</italic>, an educational process mining tool that
visualizes the application of the standard Inductive Miner and the Inductive Miner infrequent
algorithms in a step-by-step manner, showing the intermediary steps taken to go from input to
output. The creation of IM-viz is inspired by the VisuAlgo project [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] of the National University
of Singapore (NUS), which visualizes popular algorithms and data structures in an intuitive
manner.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. The IM-viz Tool</title>
      <p>IM-viz is available at https://github.com/badrecursionbrb/im-viz. This repository provides a
link to a deployed instance of IM-viz, the source code and instructions to run the tool locally,
documentation, and a video showing its usage. A video presenting the tool can also be viewed
at https://raw.githubusercontent.com/badrecursionbrb/im-viz/main/imdemo.mp4.</p>
      <sec id="sec-2-1">
        <title>2.1. Visualizer</title>
        <p>The main page of the IM-viz application, directly accessed when opening it, focuses on the
visualization of the inductive mining algorithms, as shown in Figure 1.</p>
        <p>Getting started. To start visualizing a discovery task, users first select the event data and
algorithm to apply. In terms of event data, IM-viz allows users to upload an XES file, simply
write down an event log using a string representation (e.g., “&lt;a,c,d&gt;4;&lt;c,a,a,d&gt;5;”), or select one
of the ready-made examples. In terms of algorithms, users can currently select the standard
Inductive Miner and the Inductive Miner infrequent. If the Inductive Miner infrequent is selected,
the user additionally needs to enter the desired noise threshold. After that, the user clicks the “Go!”
button to start the selected algorithm.</p>
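        <p>To illustrate the string representation mentioned above, the following sketch parses a string such as “&lt;a,c,d&gt;4;&lt;c,a,a,d&gt;5;” into a mapping from traces to frequencies. The exact grammar accepted by IM-viz is not specified here, so the pattern below, the default frequency of 1 for a missing count, and the name parse_log_string are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
import re
from collections import Counter

# Assumed entry shape: "<act1,act2,...>count", entries separated by ";".
TRACE_PATTERN = re.compile(r"<([^<>]*)>\s*(\d*)")


def parse_log_string(text):
    """Parse a compact event-log string into a Counter of traces."""
    log = Counter()
    for entry in text.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip the empty piece after the final ";"
        match = TRACE_PATTERN.fullmatch(entry)
        if match is None:
            raise ValueError(f"cannot parse trace entry: {entry!r}")
        activities = tuple(
            name.strip() for name in match.group(1).split(",") if name.strip()
        )
        # Assumption: a trace without an explicit count occurs once.
        count = int(match.group(2)) if match.group(2) else 1
        log[activities] += count
    return log


log = parse_log_string("<a,c,d>4;<c,a,a,d>5;")
```
]]></preformat>
        <p>The resulting mapping holds one entry per trace variant, which is the information needed to compute a DFG.</p>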
        <p>Algorithm visualization. When a user presses “Go!” (or, after the first step, “Next”),
IM-viz applies one step of the selected discovery algorithm to the selected event data. As
shown in Figure 1, the tool visualizes the application of this step through three complementary
components:</p>
        <list list-type="order">
          <list-item>
            <p>The top-right window shows the DFG of the (sub-)log that is currently being considered.
The DFG visualization, depicted in Figure 2a, uses different colors to indicate the different activity
sets that will be separated by the cut to be performed.</p>
          </list-item>
          <list-item>
            <p>The bottom-right part of the screen shows the inductive mining algorithm that is currently
being applied and which lines are currently being executed, showing whether the algorithm is
in a base case, whether a cut has been identified, or whether a fall-through scenario has been reached (where no
cut can be applied). The box next to the algorithm provides more information on the
current step, e.g., by indicating which cut has been found and why this cut can be applied to
the DFG at hand.</p>
          </list-item>
          <list-item>
            <p>The left-hand side of the screen shows the current state of the process tree, which is updated
after each step. The tree representation, depicted in more detail in Figure 2b, uses, e.g., [’e, f’]
to indicate an activity set that needs further exploration.</p>
          </list-item>
        </list>
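        <p>The activity sets highlighted in the DFG result from cut detection. As a simplified sketch of one such check, the snippet below looks for an exclusive-choice cut by computing the connected components of the DFG while ignoring edge directions. It covers only this one cut type and is not the code used by IM-viz or PM4Py; the function name exclusive_choice_cut is illustrative.</p>
        <preformat><![CDATA[
```python
def exclusive_choice_cut(activities, edges):
    """Return the activity sets of an exclusive-choice cut, or None.

    `edges` is an iterable of (source, target) pairs of the DFG. Activities
    that are connected (in either direction) end up in the same set; an
    exclusive-choice cut exists only if there is more than one such set.
    """
    neighbours = {a: set() for a in activities}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    components, seen = [], set()
    for start in sorted(activities):
        if start in seen:
            continue
        # Depth-first search collecting everything reachable from `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(sorted(component))
    return components if len(components) > 1 else None


# Hypothetical DFG with two unconnected parts.
cut = exclusive_choice_cut({"a", "b", "e", "f"},
                           [("a", "b"), ("e", "f"), ("f", "e")])
```
]]></preformat>
        <p>Here, cut contains the two sets ['a', 'b'] and ['e', 'f'], each of which would be assigned its own color and explored further in a later step.</p>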
        <p>After updating each of these components in a sequential manner, the application of the algorithm
is paused, so that users can take the time to explore the current, intermediary state of the
discovery task. When ready, users can press “Next” to apply the next step, which corresponds
to the left-most node in the tree that requires further exploration.</p>
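        <p>The selection of the next step can be sketched as a depth-first search for the left-most tree node that is still marked as unexplored. The Node class and the pending flag below are hypothetical and serve only to illustrate this traversal order; they do not reflect the internal data structures of IM-viz.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str  # operator name, activity name, or an unexplored activity set
    children: List["Node"] = field(default_factory=list)
    pending: bool = False  # True if this node still needs exploration


def leftmost_pending(node: Node) -> Optional[Node]:
    """Return the left-most pending node in depth-first order, or None."""
    if node.pending:
        return node
    for child in node.children:
        found = leftmost_pending(child)
        if found is not None:
            return found
    return None


# Hypothetical intermediate tree with two activity sets left to explore.
tree = Node("seq", [
    Node("a"),
    Node("xor", [Node("['b', 'c']", pending=True), Node("d")]),
    Node("['e', 'f']", pending=True),
])
next_node = leftmost_pending(tree)
```
]]></preformat>
        <p>In this example, the set ['b', 'c'] is selected before ['e', 'f'], since it appears further to the left in the tree.</p>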
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Education Section</title>
        <p>IM-viz also includes an education section (accessible via the top-right corner of the UI). It
briefly explains the most important concepts necessary for a general understanding of
the Inductive Miner, such as process trees or the so-called
flower model. A screenshot of the education section landing page is shown in Figure 3.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Tool Architecture</title>
        <p>
          As shown in Figure 4, the IM-viz application consists of a Python-based back-end and a
JavaScript-based front-end, which communicate via REST requests. The back-end uses Flask to
implement the REST API and is served using the Gunicorn server package, which needs to run
on a Linux operating system. Further, PM4Py [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] serves as the basis for the Inductive Miner algorithm
implementation. Because accessing intermediate results required code changes, PM4Py was
forked to keep IM-viz distinct from the original PM4Py framework. The front-end
is built with Vue.js and Node.js. For visualizations, we used the Data-Driven Documents (D3)
library, together with the Cola.js implementation for the network graph. The front-end can be
hosted by any HTTP server. Vue applications are generally rendered on the client side, so server
load is limited to the back-end computations.
        </p>
        <p>Figure 2: (a) DFG visualization; each color indicates an activity set identified by a cut. (b) Process tree under construction; brackets indicate activity sets to be explored.</p>
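        <p>To make the exchange between back-end and front-end more concrete, the following sketch bundles the state after one discovery step into a JSON document that a REST endpoint could return. The field names (step, dfg, tree, explanation) and the function step_payload are assumptions made for illustration and do not describe the actual IM-viz API.</p>
        <preformat><![CDATA[
```python
import json


def step_payload(step, dfg_edges, tree, explanation):
    """Serialize the state after one discovery step as a JSON string.

    All field names are hypothetical; they only illustrate the kind of
    information a front-end needs to draw the DFG, the tree, and the
    explanation of the current step.
    """
    return json.dumps({
        "step": step,
        "dfg": [
            {"source": a, "target": b, "frequency": n}
            for (a, b), n in sorted(dfg_edges.items())
        ],
        "tree": tree,
        "explanation": explanation,
    })


# Hypothetical state after the first step of a discovery task.
payload = step_payload(
    step=1,
    dfg_edges={("a", "c"): 4, ("c", "d"): 4},
    tree={"operator": "seq", "children": [["a", "c"], ["d"]]},
    explanation="Sequence cut found.",
)
```
]]></preformat>
        <p>A client-side renderer could decode such a document and update the three visual components without any further computation on the server.</p>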
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion and Future Work</title>
      <p>IM-viz provides students with step-by-step guidance through the application of inductive mining
algorithms using event data of their choice. The tool provides different kinds of information for
each step of an algorithm, showing the current step in the algorithm itself, the DFG, and the
process tree, allowing users to jump back and forth between steps.</p>
      <p>As next steps, we aim to extend the tool to include other process mining algorithms, such as
the alpha algorithm for process discovery or the A* search in alignment-based conformance
checking, and to develop the tool further towards maturity. Additionally, the visualizations
themselves may be improved in terms of comprehensibility and vividness. Finally, we aim to pilot
the usage of the tool in the exercise sessions of a process mining course at our university and
conduct user studies to assess the benefits of using algorithm visualization in process mining
education.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
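          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carmona</surname>
          </string-name>
          ,
          <source>Process Mining Handbook</source>
          , Springer Nature,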
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <article-title>Robust Process Mining with Guarantees: Process Discovery, Conformance Checking and Enhancement</article-title>
          ,
          <source>Ph.D. thesis</source>
          , Eindhoven University of Technology, Eindhoven,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Halim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Halim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Z.</given-names>
            <surname>Chun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Loh Bo Huai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Thi Quynh Trang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Phandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Millardo Tjindradinata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hoang Duy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Tan Zhao Yun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Reinaldo</surname>
          </string-name>
          ,
          <article-title>VisuAlgo - visualising data structures and algorithms through animation</article-title>
          ,
          <year>2011</year>
          . URL: https://visualgo.net/en.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>van Zelst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Process Mining for Python (PM4Py): Bridging the Gap Between Process- and Data Science</article-title>
          ,
          <source>in: ICPM Demos 2019</source>
          , volume
          <volume>2374</volume>
          ,
          <publisher-name>CEUR</publisher-name>
          ,
          <year>2019</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>