1. Introduction

IM-viz: A Tool for the Step-by-step Visualization of the Inductive Miner

Florian Lang

Din Hida

Yingjie Bian

Adrian Rebmann

Han van der Aa

0 0 School of Business Informatics and Mathematics, University of Mannheim , B6 26, 68159 Mannheim , Germany

The Inductive Miner is a state-of-the-art algorithm for process discovery and a staple in process mining education, since its divide-and-conquer nature can teach students how to recognize behavioral relations in event data and to break up a discovery problem into smaller parts. However, a key problem from this educational perspective is that the algorithm's manual application is time consuming, involving a considerable amount of drawing, whereas existing implementations of the algorithm only show the ifnal outcome, not its intermediary steps. To overcome this, we present IM-viz, an educational process mining tool that visualizes the application of the Inductive Miner and the Inductive Miner infrequent in an iterative manner. IM-viz allows users to interactively explore how inductive mining works and how it deals with diferent kinds of event data, thus providing a convenient means for process mining students and educators to establish and analyze step-by-step process discovery examples.

1. Introduction

Process discovery is a quintessential task in process mining, which takes exemplary process executions, stored in an event log, and aims to establish a process model that accurately describes the underlying process [ 1 ]. In this context, inductive mining refers to a family of discovery algorithms that work in a top-down manner [ 2 ]. These algorithms apply a divide-and-conquer strategy to recursively decompose the process discovery task into smaller parts. This is achieved by establishing a directly-follows graph (DFG) of the event log and, subsequently, identifying cuts that partition the activities in the DFG into smaller sets. These cuts identify behavioral relations between the activity sets, indicating that they are in a sequential, parallel, exclusive, or loop relation. The algorithm is recursively applied on the sub logs corresponding to these smaller activity sets, until the problem cannot be further decomposed and a process tree or workflow net is returned.

Inductive mining is recognized as (part of the) state of the art for process discovery, as it provides formal guarantees on its output (most importantly, output models are always sound), is scalable, and provides flexibility to users, since inductive mining algorithms exist that can deal with diferent quality issues, such as noise and incompleteness. Furthermore, inductive mining is interesting from an educational perspective, as its divide-and-conquer nature provides a convenient means for students to learn how to break up a discovery task into smaller problems and to recognize behavioral relations in event data.

A downside from this educational perspective is that students new to process mining may ifnd it dificult to understand how the algorithms exactly work and how they should be applied. Additionally, the algorithms are time consuming to apply in a manual manner, involving a considerable amount of drawing (of the DFGs). Since existing implementations only provide the output obtained by discovery (i.e., the final process tree or workflow net), these do not allow users to understand the intermediary steps taken to reach this result. Consequently, it is tedious for educators and dificult for students to gain insights into the application of inductive mining algorithms in a step-by-step manner.

To alleviate this issue, we present textitIM-viz, an educational process mining tool that visualizes the application of the standard Inductive Miner and the Inductive Miner infrequent algorithms in a step-by-step manner, showing the intermediary steps taken to go from input to output. The creation of IM-viz is inspired by the VisuAlgo project [ 3 ] of the National University of Singapore (NUS), which visualizes popular algorithms and data structures in an intuitive manner.

2. The IM-viz Tool

At https://github.com/badrecursionbrb/im-viz IM-viz is available. This repository provides a link to a deployed instance of IM-viz, the source code and instructions to run the tool locally, documentation, and a video showing its usage. A video presenting the tool can also be viewed at https://raw.githubusercontent.com/badrecursionbrb/im-viz/main/imdemo.mp4.

2.1. Visualizer

The main page of the IM-viz application, directly accessed when opening it, focuses on the visualization of the inductive mining algorithms, as shown in Figure 1.

Getting started. To start visualizing a discovery task, users first select the event data and algorithm to apply. In terms of event data, IM-viz allows users to upload an XES file, simply write down an event log using a string representation (e.g., “<a,c,d>4;<c,a,a,d>5;”), or select one of the ready-made examples. In terms of algorithms, users can currently select the standard Inductive Miner and the Inductive Miner infrequent. If the Inductive Miner infrequent is selected, one needs to additionally enter the desired noise threshold. After that, the user clicks the “Go!" button to start the selected algorithm.

Algorithm visualization. When a user presses “Go!" (or, after the first step, “Next”), IMviz will apply one step of the selected discovery algorithm on the selected event data. As shown in Figure 1, the tool visualizes the application of this step through three complementary components: 1. The top-right window shows the DFG of the (sub-) log that is currently being considered. The DFG visualization, depicted in Figure 2a, uses diferent colors to indicate diferent activity sets that will be separated by the cut to be performed. 2. The bottom-right part of the screen shows the inductive mining algorithm that is currently being applied and which lines are currently being executed, showing if the algorithm is currently in a base case, if a cut has been identified, or if a fall-through scenario is reached (where no cut can be applied). The box next to this algorithm, then, provides more information on this current step, e.g., by indicating which cut has been found and why this cut can be applied to the DFG at hand. 3. The left-hand side of the screen shows the current state of the process tree, which is updated after each step. The tree representation, depicted in more detail in Figure 2b, uses, e.g., [’e, f’] to indicate an activity set that needs further exploration.

After updating each of these components in a sequential manner, the application of the algorithm is paused, so that users can take the time to explore the current, intermediary state of the discovery task. When ready, users can press “Next” to apply the next step, which corresponds to the left-most node in the tree that requires further exploration.

2.2. Education Section

IM-viz also includes an education section (accessible via the top-right corner of the UI). It provides information on the most important concepts necessary for a general understanding of the Inductive Miner were selected and explained briefly such as process trees or the so called lfower model . A screenshots of the education section landing page is shown in Figure 3.

2.3. Tool Architecture

As shown in Figure 4, the IM-viz application consists of a Python-based back-end and a JavaScript-based front-end, which communicate via REST requests. The back-end uses Flask to implement the REST-API and is hosted using the gunicorn server package, which needs to run on a Linux operating system. Further, PM4Py [ 4 ] as a basis for the Inductive Miner algorithm (a) DFG visualization, each color indicates an ac- (b) Process tree under construction, brackets inditivity set identified by a cut cating activity sets to be explored. implementation. Because accessing intermediate results required code changes, PM4Py was forked to keep the distinction between IM-viz and the original PM4Py framework. The front-end is built with vue.js and node.js. For visualizations, we used the Data Driven Documents (D3) library, together with the Cola.js implementation for the network graph. The front-end can be hosted by any HTTP server. Vue applications generally are rendered at client side, so server load is minimized by only performing the back-end computations.

3. Conclusion and Future Work

IM-viz provides students with step-by-step guidance through the application of inductive mining algorithms using event data of their choice. The tool provides diferent kinds of information for each step of an algorithm, showing the current step in the algorithm itself, the DFG, and the process tree, allowing users to jump back and forth between steps.

As next steps, we aim to extend the tool to includes other process mining algorithms, such as the alpha algorithm for process discovery or the * search in alignment-based conformance checking and also to develop the tool further towards maturity. Additionally, the visualizations itself may be improved in terms of comprehensibility and vividness. Finally, we aim to pilot the usage of the tool in the exercise sessions of a process-mining course at our university and conduct user studies to assess the benefits of using algorithm visualization in process mining education.

[1] W. M. van der Aalst , J. Carmona, Process mining handbook, Springer Nature, 2022 .

[2]

S. J. J.

Leemans , Robust Process Mining with Guarantees: Process Discovery, Conformance Checking and Enhancement, Ph.D. thesis , Eindhoven University of Technology, Eindhoven, 2017 .

[3]

Halim ,

K. Z.

Chun ,

Loh Bo Huai ,

Thi Quynh Trang ,

Phandi , A. Millardo tijndradinata , N. Hoang Duy , R. M. Tan Zhao

Yun

, I. Reinaldo , VisuAlgo - visualising data structures and algorithms through animation , 2011 . URL: https://visualgo.net/en.

[4]

Berti , S. van Zelst, W. M. van der Aalst , Process Mining for Python (PM4Py): Bridging the Gap Between Process- and Data Science , in: ICPM Demos 2019 , volume 2374 , CEUR , 2019 , pp. 13 - 16 .