The Multi-perspective Process Explorer

Felix Mannhardt1,2, Massimiliano de Leoni1, Hajo A. Reijers3,1

1 Eindhoven University of Technology, Eindhoven, The Netherlands
2 Lexmark Enterprise Software, Naarden, The Netherlands
3 VU University Amsterdam, Amsterdam, The Netherlands
{f.mannhardt, m.d.leoni, h.a.reijers}@tue.nl

Abstract. Organizations use process mining techniques to analyze event data recorded by their information systems. Multi-perspective process mining techniques make use of data attributes attached to events to analyze processes from multiple perspectives. Applying those multi-perspective process mining techniques in practice is a laborious task when the event data contains a large number of attributes and many different trace variants. Tools that facilitate the usage of these techniques in practical settings are missing. We describe the Multi-perspective Process Explorer as a new tool that integrates current multi-perspective process mining techniques for discovery and conformance checking. It supports common tasks in multi-perspective process mining, and aims to reduce the time needed to explore event data.

Keywords: Process Mining, Process Analysis, Multi-perspective Process Mining, Interactive Visualization, Process Exploration

1 Multi-perspective Process Mining

Process mining techniques enable organizations to gain insights into their processes by using event data recorded by their information systems. The two major areas of process mining are process discovery, i.e., the discovery of process models based on sequences of events, and process conformance checking, i.e., revealing differences between recorded executions and behavior prescribed by existing process models [1].
Multi-perspective process mining techniques go beyond techniques that only use event sequences to analyze the control-flow of a process: data attributes attached to events are used to analyze processes from other perspectives, e.g., discovering and checking decision rules, analyzing resource behavior, and checking time-related rules [2,3]. We refer to those techniques as multi-perspective as they obtain information about the other perspectives from data encoded in event attributes. Applying multi-perspective process mining techniques in practice is a laborious task, especially when the data contains a large number of different attributes with high variability. A substantial amount of manual work by analysts is required, because they need to filter and transform event data as well as select relevant features. Also, results need to be explored and, if not satisfactory, these steps need to be repeated multiple times by hand.

Fig. 1. Screenshot of the MPE where path frequencies are projected onto the model

Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.

Existing commercial tools, such as Perceptive Process Mining and Fluxicon Disco, and academic tools, such as Inductive Visual Miner (IVM) [4], mainly focus on the control-flow perspective, and provide only very limited support for data-aware process exploration. In this paper, we describe the Multi-perspective Process Explorer (MPE), a new tool that is tailored towards multi-perspective process exploration for discovery and analysis. It integrates existing work on multi-perspective process mining [2,3] with new interactive visualizations and filtering facilities into a scalable and extensible tool. The main features are: integration of existing data-aware discovery, conformance checking, and performance analysis techniques; interactive and efficient exploration of data-aware processes; built-in filtering based on attributes and trace variants.
In the remainder of this paper we describe how to use the MPE for one of the possible use cases, the discovery of data-aware process models.

2 Walkthrough of The Multi-perspective Process Explorer

The MPE is available as a plug-in in the MultiPerspectiveProcessExplorer package in the open-source framework ProM¹. Here, we show the sequence of steps required to discover and evaluate a multi-perspective process model. In particular, we showcase the MPE on a publicly available event log containing events from more than 150,000 process instances of a road fines management process in an Italian local police force [3]. An application of the MPE on this event log has also been recorded as a video that is available at http://purl.tue.nl/899817766269492.

Required input. The starting point for using the MPE is an event log and a process model in the form of a Petri net. Petri nets can be discovered, e.g., by applying plug-ins implementing several process-discovery techniques. Alternatively, they can be created manually with an editor such as WoPeD². For our case study, we discovered a Petri net using the IVM [4]. Optionally, the model may already contain activity guards. In this case, Steps 1 and 2 can be skipped as they are not necessary. However, one may still want to discover additional activity guards, regardless of those which were already provided.

¹ Available in the ProM nightly builds under: http://promtools.org
² http://woped.org/

Step 1: Analysis of the Input Model. The first step is to analyze the model provided as input. Figure 1 shows a screenshot of the MPE where the input process model is shown, along with information about the frequencies of paths as observed in the event log. This information is projected onto the model: the thickness of an arc indicates the frequency of observing a path including that arc in the event log. Below the process model, there are three areas highlighted by a red rectangle and marked with a number.
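As a rough illustration of the frequency projection, the following Python sketch counts how often each pair of consecutive activities occurs in a small, made-up log and scales the counts to a line width. This is a simplification: the MPE projects frequencies onto the arcs of a Petri net, whereas the sketch works on directly-follows pairs. The function names, the example traces, and the width range are assumptions made for illustration only.

```python
from collections import Counter


def path_frequencies(traces):
    """Count how often each directly-follows pair (a, b) occurs in the traces."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def arc_widths(counts, min_width=1.0, max_width=8.0):
    """Scale counts linearly so that the most frequent pair gets max_width."""
    if not counts:
        return {}
    top = max(counts.values())
    return {
        pair: min_width + (max_width - min_width) * n / top
        for pair, n in counts.items()
    }


# Hypothetical traces; the activity sequences are illustrative only.
traces = [
    ["Create Fine", "Send Fine", "Payment"],
    ["Create Fine", "Send Fine", "Send for Credit Collection"],
    ["Create Fine", "Payment"],
]
freq = path_frequencies(traces)
widths = arc_widths(freq)
```

In this toy log the pair (Create Fine, Send Fine) occurs twice and therefore receives the widest line, while every other pair occurs once.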
The panel in area 3 gives general information about the fitness of the model with respect to the input event log. The fitness score measures whether the behavior observed in the event log is reflected in the model; for the model and the event log in question, the fitness score is 93.2%. This is computed by constructing an alignment between the model and the event log [3], which makes it possible to pinpoint the deviations that cause nonconformity. An alignment between a recorded process execution and a process model is a pairwise matching between activities recorded in the log and activities allowed by the model. Sometimes, activities as recorded in the event log (events) cannot be matched with any of the activities allowed by the model (process activities), thus resulting in so-called moves on log. In other cases, an activity should have been executed but is not observed in the event log, thus resulting in a so-called move on model. For our case study, there are 23,712 model moves, i.e., missing executions of activities, and 18,672 log moves, i.e., activity executions that occurred when not allowed by the model.

Step 2: Data-aware Discovery. This step is about discovering the guards associated with process activities. To do that, we need to switch the mode to data discovery in the Display panel (area 1 in Fig. 1). In the data discovery mode, the mode configuration panel looks like area 2 in Fig. 1. It allows users to configure which guard-discovery algorithm to use and which data attributes to consider, and to tune some algorithm-specific parameters: the minimum number of elements associated with decision-tree leaves (min instances) and the minimum control-flow fitness for each trace to be considered (min fitness). For further information, readers can refer to [2]. Once the parameters are chosen, the Discover button can be pressed.
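To illustrate how the two parameters interact, the Python sketch below learns a single threshold guard on one numeric attribute. It is neither the algorithm of [2] nor the MPE's implementation: a one-level decision stump stands in for the decision tree, and the observation format, the function name, and the example values are hypothetical.

```python
def discover_threshold_guard(observations, attribute, min_instances, min_fitness):
    """Return (threshold, errors) for the rule 'attribute > threshold', or None.

    Each observation is a dict with the keys 'fitness' (control-flow fitness of
    the trace), 'taken' (True if the guarded activity was executed at the
    decision point) and the numeric attribute itself. This format is assumed.
    """
    # Ignore traces whose control-flow fitness is below min_fitness.
    usable = [o for o in observations if o["fitness"] >= min_fitness]
    values = sorted({o[attribute] for o in usable})
    best = None
    # Candidate thresholds are the midpoints between consecutive observed values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        above = [o for o in usable if o[attribute] > threshold]
        below = [o for o in usable if o[attribute] <= threshold]
        # Both leaves must contain at least min_instances observations.
        if len(above) < min_instances or len(below) < min_instances:
            continue
        # Errors if 'above' predicts execution and 'below' predicts skipping.
        errors = sum(not o["taken"] for o in above) + sum(o["taken"] for o in below)
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best


# Hypothetical observations at one decision point.
observations = [
    {"amount": 30, "taken": False, "fitness": 1.0},
    {"amount": 40, "taken": False, "fitness": 1.0},
    {"amount": 60, "taken": False, "fitness": 0.9},
    {"amount": 80, "taken": True, "fitness": 1.0},
    {"amount": 90, "taken": True, "fitness": 1.0},
    {"amount": 120, "taken": True, "fitness": 0.4},  # dropped by min_fitness
]
guard = discover_threshold_guard(observations, "amount", 2, 0.8)
no_guard = discover_threshold_guard(observations, "amount", 3, 0.8)
```

With a minimum leaf size of 2, the sketch finds the split at 70.0 without misclassifications; raising the minimum leaf size to 3 leaves no admissible split on this small data set, so no guard is returned.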
In this case, we choose the standard decision tree classifier configured to use all attributes and 25% as the min instances parameter, i.e., the number of elements is set to 25% of the process instances reaching a decision point. The min instances parameter is important as it influences whether the discovered guards are over-fitting (value too low) or under-fitting (value too high).

Fig. 2. An excerpt of the visualization when in fitness mode.

Step 3: Fitness & Precision Computation. After discovering a multi-perspective process model, the plug-in can evaluate the quality of the discovered process model. This requires changing the mode of the MPE to fitness in the Display panel mentioned above. Please observe that there is a substantial difference compared with Step 1: now the model contains activity guards. Figure 2 shows an excerpt of the visualization when in fitness mode. In particular, the focus is on the activity Send for Credit Collection, for which the following decision rule has been discovered: amount > 71 (area 5). Each activity is colored according to the ratio between the number of compliant executions of the activity and the total moves for that activity (which also accounts for missing events and executions with incorrect data). The relatively dark color of the Send for Credit Collection activity indicates a considerable fraction of non-compliant executions. Moreover, the statistics show that, according to the new model, the data values observed in the event log are wrong 30,308 times (area 7). If we compare the average fitness of the model with and without activity guards (compare area 3 in Fig. 1 with area 6 in Fig. 2), it is clear that the presence of data guards has decreased the fitness level from 93.2% to 90.1%. However, this is not necessarily negative: fitness is only one of the measures to evaluate the quality of a model.
A second measure is precision, which is the ratio between the amount of behavior observed in the event log and the amount of behavior described by the model. Adding guards increases the precision because the added rules restrict the behavior allowed by the model. The MPE also allows for computing precision [5]. This is done by switching the display mode to precision. For the sake of space, the precision mode is only showcased in the screencast.

Step 4: Bottleneck & Performance Computation. After evaluating whether the discovered model is a suitable representation of the process behavior, performance information about the time perspective, such as average waiting times, can be obtained using the performance mode of the MPE. Figure 3 shows that the average waiting time between the activity Create Fine and the activity Send Fine is 7.4 hours (area 8). In this case, we have also used a filtering feature: amount > 71 (area 9). This filtering query restricts the analysis to only those traces for which attribute amount is larger than 71 at least once in the trace.

Fig. 3. An excerpt of the performance mode screen

Step 5: Detailed Analysis using Trace View & Chart View. After evaluating the fitness and precision of the model, and the presence of bottlenecks, the end user may want to explore in detail specific traces that she finds, e.g., problematic. Therefore, the MPE provides two complementary views on the process showing more details. Figure 4a shows the detailed trace view that opens on a second screen upon pressing the Toggle Traces button. For each log trace, the corresponding alignment is shown: log and model moves are highlighted in yellow and purple above the move, and wrong data is highlighted in white. Moves related to the same activity are painted with the same color. Traces are grouped based on similar executions. Individual traces and their data attributes can also be explored.
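The semantics of the filtering query used in the performance mode (a trace is kept if the condition holds for at least one of its events) and the grouping of traces by similar executions can be sketched in a few lines of Python. The event representation, the helper names, and the example data are assumptions for illustration; the MPE groups aligned traces, whereas the sketch groups by the plain sequence of activity names.

```python
from collections import defaultdict


def filter_traces(log, attribute, predicate):
    """Keep traces in which at least one event carries the attribute and satisfies the predicate."""
    return {
        case: events
        for case, events in log.items()
        if any(attribute in e and predicate(e[attribute]) for e in events)
    }


def group_by_variant(log):
    """Group case ids by their sequence of activity names (the trace variant)."""
    variants = defaultdict(list)
    for case, events in log.items():
        variants[tuple(e["activity"] for e in events)].append(case)
    return dict(variants)


# Hypothetical log: case id -> list of events with optional data attributes.
log = {
    "c1": [{"activity": "Create Fine", "amount": 35}, {"activity": "Send Fine"}],
    "c2": [{"activity": "Create Fine", "amount": 80}, {"activity": "Send Fine"}],
    "c3": [
        {"activity": "Create Fine", "amount": 90},
        {"activity": "Send Fine"},
        {"activity": "Send for Credit Collection"},
    ],
}
selected = filter_traces(log, "amount", lambda value: value > 71)
variants = group_by_variant(selected)
```

Here the query amount > 71 keeps cases c2 and c3, which fall into two different variants; case c1 is dropped because none of its events has an amount above 71.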
The second view, shown in Figure 4b, provides more details on the distribution of data attributes at certain states within the process model. Figure 4b shows two histograms with the distribution of the values of attribute amount before the occurrence of activity Send for Credit Collection and the invisible step tau from tree, which models the skipping of Send for Credit Collection. This allows end users to visually analyze whether certain ranges of values are usually observed together with the occurrence of given activities. For instance, for Send for Credit Collection, the observed values are usually high.

Fig. 4. Two additional views for detailed analysis: (a) Trace view, (b) Chart view

Conclusion. We presented the MPE as a novel tool for multi-perspective process exploration. For the sake of space, we only showed one iteration cycle. However, any of the steps can be repeated as many times as necessary. The tool has reached a high degree of maturity, which allows it to be used for real-life case studies. This is testified by its application on the analysis of a real-life event log with more than 150,000 traces and around 500,000 events. As next steps, we plan to evaluate the user interface with real process analysts and improve it when necessary. Also, we are currently working on novel data-discovery techniques, which we aim to incorporate by the end of 2015. Finally, we aim to work on improving the performance of the used alignment techniques to further speed up the analysis of large data sets and complex process models.

References

1. van der Aalst, W.M.P.: Process Mining - Discovery, Conformance and Enhancement of Business Processes. Springer (2011)
2. de Leoni, M., van der Aalst, W.M.P.: Data-Aware Process Mining: Discovering Decisions in Processes Using Alignments. In: SAC’13, ACM (2013) 1454–1461
3. Mannhardt, F., de Leoni, M., Reijers, H.A., van der Aalst, W.M.P.: Balanced multi-perspective checking of process conformance. Computing (2015) doi:10.1007/s00607-015-0441-1 (in press).
4. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Process and deviation exploration with inductive visual miner. In: Proceedings of the BPM Demo Sessions 2014. Volume 1295 of CEUR Workshop Proceedings, CEUR-WS.org (2014) 46
5. Mannhardt, F., de Leoni, M., Reijers, H.A., van der Aalst, W.M.P.: Measuring the Precision of Multi-perspective Process Models. In: BPI’15. (2015) (accepted).