ProcessExplorer: Interactive Visual Exploration of Event Logs with Analysis Guidance

Alexander Seeliger, Maximilian Ratzke, Timo Nolle, Max Mühlhäuser
Technische Universität Darmstadt, Telecooperation Lab, Darmstadt, Germany
Email: {seeliger, nolle, max}@tk.tu-darmstadt.de

Abstract: Process analysts use process mining techniques to obtain fact-based knowledge from event logs about how business processes are actually executed in organizations. Often process discovery is the first step in their analytical workflow. However, when working with large amounts of data and complex processes, exploring as-is process models to obtain interesting and insightful knowledge can be challenging. We propose ProcessExplorer, an interactive visual recommendation system for process discovery to facilitate event log exploration. ProcessExplorer automatically analyzes the event log to obtain promising subsets of cases, evaluates interesting process performance indicators, and recommends those that are most interesting and insightful. Our system uses multi-perspective trace clustering to identify candidate cases of interest and a deviation-based approach to assess the interestingness of process performance indicators. We implemented ProcessExplorer as a standalone desktop application that allows exploring any process and any event log. Our demo shows how the workflow of analysts is supported by the system through suggesting subset and insights recommendations.

Index Terms: process discovery, variants analysis, log preprocessing, trace clustering, statistical hypothesis testing

I. INTRODUCTION

Nowadays, information systems in organizations support and automate the processing of business transactions. These systems are typically integrated into companies' business processes and record the activities that have been executed in the form of an event log. Process mining aims at providing an accurate view of how processes are actually executed in organizations. In particular, process discovery reconstructs as-is process models from event logs which can be used for further analysis. A wide range of process mining tools has been established that implement process discovery and analysis methods to support analysts in obtaining valuable knowledge. With this knowledge, process issues can be identified and optimizations can be implemented.

In this paper, we introduce the ProcessExplorer system which provides recommendations to the analyst on how to select a subset of cases and what statistics may be interesting and insightful. Our system is inspired by the workflow that analysts typically perform when working with process mining tools. The visual inspection of the discovered process model is the initial starting point of any process mining project. Due to the massive growth of data, the increasing process complexity, and the flexible execution of business processes in organizations, visual exploration and analysis are getting more and more challenging. Often the analyst is confronted with a spaghetti-like process map which by itself does not necessarily lead to useful insights. Without extensive knowledge about the underlying process, selecting the right set of cases to find interesting and valuable insights or trends is non-trivial. In current process mining tools, most of these analysis steps are performed manually, leading to a lot of repetitive work which hampers efficient exploration and analysis.

ProcessExplorer extends the interactive visual exploration capabilities in today's process mining tools by providing automatic guidance to the analyst. Our tool integrates several recommendation suggestions in a user-friendly manner to improve overall process discovery exploration:

1) Subset Recommendation. ProcessExplorer recommends subsets of interesting cases to allow analysts to quickly inspect the different process behaviors observed in the event log. Different from manual filtering, which requires expert knowledge, subset recommendations are automatically derived by mining process behavior patterns from the dataset to simplify subset selection.
2) Insights Recommendation. After selecting a subset of cases, ProcessExplorer automatically computes a range of relevant process performance indicators to show interesting deviations. Analysts are guided towards interesting statistics that they usually would compute manually.
3) Recommendation Ranking. In order to prevent the analyst from inspecting only a limited subset of cases, ProcessExplorer provides the analyst with the most diversifying recommendations by applying diversifying top-k ranking [1].

ProcessExplorer is agnostic to the process and event log that is being analyzed. Any process and any event log in the standardized IEEE XES format can be used. Furthermore, the analyst does not need to set up any configuration or specify parameter values. Prior knowledge about the process or the event log is not required. ProcessExplorer obtains all the necessary information from the event log itself.

We used ProcessExplorer in a case study on the BPI Challenge 2019 event log collected from a large company to investigate the procurement handling process [2].

The rest of the paper is structured as follows. We provide a walk-through of ProcessExplorer using this event log, showing the different types of recommendations provided by ProcessExplorer and highlighting the maturity of the tool. Then, we present the architecture of ProcessExplorer to show extensibility.

II. RECOMMENDATION ENGINE

ProcessExplorer extends process mining tools by introducing a recommendation engine to support analysts in selecting interesting subsets of cases and generating insightful statistics. In particular, our system allows analysts to quickly scan unknown processes in event logs to obtain knowledge about how the process is actually executed and where potential issues can be found. ProcessExplorer provides two types of recommendations and a ranking mechanism.

A. Subset Recommendations

The first type of recommendation suggests subsets of cases that contain interesting process behavior patterns. We are particularly interested in patterns that combine the control flow and the data perspective. This is inspired by the manual work of analysts who not only filter cases by the sequence of activities but also by attributes. This is often used to compare different departments, products, or company locations. To support analysts during the selection of appropriate subsets of cases, ProcessExplorer automatically analyzes the given event log to find such patterns using trace clustering. Specifically, we apply multi-perspective trace clustering [3] to obtain subsets of cases that contain dependencies between the control flow and the case attributes. Resulting subsets of cases with similar behavior lead to process maps that are typically less complex and easier to understand visually.

B. Insights Recommendations

Another typical task in process mining is to investigate and compare a range of process performance indicators (PPIs), such as the number of activities, the total duration time, the duration time between activities, the directly followed-by relation, and the existence of activities. These are either directly visualized in the process map or separately displayed in the form of statistical charts or single values. Existing process mining tools provide assistance by offering the possibility to create dashboards with predefined PPIs which update immediately if a different case selection is made. Still, each PPI needs to be investigated one after another to identify deviations, which is time-consuming and error-prone. ProcessExplorer automatically computes these PPIs for a selected subset and identifies those that may be interesting to the user by performing statistical significance testing. Compared to dashboards that are static with respect to the computed PPIs, ProcessExplorer reevaluates the PPIs for each applied subset recommendation individually. Only PPIs that are significantly different from the rest of the cases in the event log are considered an interesting insight [4].

C. Ranking

Lastly, ProcessExplorer ranks the recommendations based on the interestingness score [4]. Each insights recommendation is assigned a score that is computed from how large the deviation is from the rest of the event log and the number of cases that are covered. We use Cohen's effect size [5] which uses a comprehensive scale to determine the maturity of the deviation. Insights recommendations are then ranked by their assigned scores.

During our experiments, we found that certain insights co-occur with each other, which unnecessarily increases the number of insights recommendations. ProcessExplorer therefore clusters similar insights recommendations using Spearman's rank-order correlation.

Subset recommendations are assigned a score based on the insights scores and the number of cases that are contained in the subset. We obtain the top-k subset recommendations using the top-k diversifying ranking algorithm [1] to broaden the analyst's perspective on the event log. Instead of showing very similar subset recommendations on top of the list, ProcessExplorer suggests the most diversifying subsets, which prevents the analyst from inspecting only a limited subset of cases. In ProcessExplorer, the top 10 most interesting and diversifying subset recommendations are shown to the user.

III. TOOL

ProcessExplorer is a standalone interactive process mining tool to demonstrate the proposed guidance capabilities. As mentioned earlier, it allows importing any standardized IEEE XES event log and works without specifying any additional parameter value. We give a walk-through of ProcessExplorer by inspecting the procurement handling process of the BPI Challenge 2019 event log [2]. Figure 1 shows the main screen of ProcessExplorer. The user interface consists of five different components:

a) Process Map: The most prominent component in ProcessExplorer is the process map. It visualizes the activities and transitions that have been observed in the event log. Activities and transitions can be filtered by their relative occurrence using the slider at the bottom right. Figure 1 shows the process map of a selected subset recommendation.

b) Subset Recommendations: On the top right side, the ranked list of subset recommendations is shown. Subset recommendations can be modified and adjusted by the user, enabling further interactive refinement of the selection of cases. Users can add a happy path filter, a variant filter, a start and end activity filter, and an activity occurrence filter. Figure 1 shows the 8 subset recommendations that are suggested for the currently selected subset of cases.

c) Subset Statistics: On the lower right side, basic statistics of the selected subset recommendation are shown which give an overview of the cases in the subset. The statistics show how the subset selection compares to the original event log and highlight the event distribution, the variant distribution, and the number of selected cases. Based on the statistics, the user can decide which subset recommendation to apply. In the example, the selected subset recommendation selects 6 events, and 1 out of 4 variants.

Fig. 1. User interface of ProcessExplorer showing the subset and insights recommendations, the process map of the selected subset, the stage view, and the subset statistics. The screenshot shows a selected subset recommendation of the BPI Challenge 2019 event log.

d) Insights Recommendations: On the left-hand side, ProcessExplorer shows the insights recommendations for the current subset. Insights recommendations are automatically updated each time the subset of cases is modified. The system computes a range of basic PPIs which are typically analyzed by users. We distinguish between case- and subset-based insights. Depending on the insight type, a different visualization is shown to the user. Figure 1 shows a portion of the obtained insights recommendations. For instance, the first insight refers to the directly followed-by relation between the "Record Invoice Receipt" and "Remove Payment Block" activities, which occurs more often in the applied subset. Furthermore, we can see that the activity "Receive Order Confirmation" is mostly executed by "user 029".

e) Stage Views: For easier navigation between the different subset recommendations, ProcessExplorer introduces stage views. Each time the user decides to apply a subset recommendation, a new stage view is generated. A stage view stores the selected cases and the computed insight recommendations. Stages are organized as a hierarchical structure such that each refinement of a selection results in a new hierarchy level. For each stage view, subset and insights recommendations are computed, so recommendations can be successively refined.

IV. ARCHITECTURE

ProcessExplorer consists of three main components: the event log manager (XLogManager), the stage manager (XStageManager), and the recommendation manager (RecommendationManager). All three components are open for extension, such that other event log formats, stage management capabilities, and subset and insights recommendation approaches can be integrated. Figure 2 shows the overall architecture of ProcessExplorer.

Fig. 2. Overview of the architecture of the ProcessExplorer tool [6].

Event logs are imported as an OpenXES XLog object and stored in-memory using the XESlite extension. Each loaded log is stored in the XLogData object structure which links to the XLog object and stores the basic statistics of the log. The XStageManager is responsible for managing the views of ProcessExplorer, storing a history of all stages visited by the user. For an active stage, the XStageManager retrieves the recommendations from the RecommendationManager which returns a set of Recommendation objects. If the recommendations have not yet been computed, the RecommendationManager calls the RecommendationFactory. Each Recommendation refers to the subset recommendations shown in ProcessExplorer which contain the Insight recommendations.

All visualization components, such as the XLogViewer, StageInfoViewer, StageInsightsViewer, and RecommendationViewers, are separated from the actual recommendation engine. This architecture allows the exploration of different types of visualizations, such as other types of charts, process model visualizations, etc., while keeping the actual computation of the recommendations.

In the current implementation of ProcessExplorer, we implemented a multi-perspective trace clustering recommendation engine for subset recommendations and a statistical significance testing approach for obtaining insights recommendations. However, other implementations are easy to add by extending the corresponding classes.

V. DOWNLOAD, SCREENCAST, AND LINKS

The ProcessExplorer demo tool can be found at our project page¹. On the project page, a demonstration video including a screencast, a reduced event log derived from the BPI Challenge 2019, and additional screenshots are provided. The demo tool requires Oracle Java 8 and was tested on Windows and Ubuntu.

VI. CONCLUSION

In this paper, we presented ProcessExplorer, an interactive visual recommendation system for process discovery inspired by the workflow typically performed by analysts. Our system suggests two types of recommendations that guide analysts towards interesting subsets of cases and show insightful statistics of relevant PPIs. Subset recommendations are computed using multi-perspective trace clustering to obtain process behavior patterns that are interesting to explore. Insights recommendations show interesting PPIs that significantly differ for an investigated subset compared to the rest of the event log. Furthermore, ProcessExplorer gives each recommendation a score based on interestingness and maturity. It applies top-k diversifying ranking to obtain the most different recommendations.

ACKNOWLEDGMENT

This work is funded by the German Federal Ministry of Education and Research (BMBF) Software Campus project "AI-PM" [01IS17050] and the research project "KI.RPA" [01IS18022D].

REFERENCES

[1] L. Qin, J. X. Yu, and L. Chang, "Diversifying top-k results," Proceedings of the VLDB Endowment, vol. 5, no. 11, pp. 1124–1135, Jul. 2012.
[2] B. F. van Dongen, "Dataset BPI Challenge 2019," 4TU.Centre for Research Data, 2019.
[3] A. Seeliger, T. Nolle, and M. Mühlhäuser, "Finding Structure in the Unstructured: Hybrid Feature Set Clustering for Process Discovery," in Proc. of the 16th BPM. Springer International Publishing, 2018, pp. 288–304.
[4] M. Vartak, S. Rahman, S. Madden, A. Parameswaran, and N. Polyzotis, "SeeDB," in Proc. of the VLDB Endowment, vol. 8, no. 13, 2015, pp. 2182–2193.
[5] J. Cohen, "Statistical Power Analysis," Current Directions in Psychological Science, vol. 1, no. 3, pp. 98–101, Jun. 1992.
[6] M. Ratzke, "Intelligent and Systematic Browsing through Process Mining Data," 2019.

¹ https://fileserver.tk.informatik.tu-darmstadt.de/AS/processexplorer
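
Sections II-B and II-C describe keeping only PPIs that differ significantly between a subset and the rest of the event log, and scoring them by Cohen's effect size and the number of covered cases. The paper does not name the significance test or give the score formula, so the following is only a minimal sketch under stated assumptions: a Monte Carlo permutation test on the difference of means stands in for the significance test, the score is taken as the absolute effect size weighted by the fraction of covered cases, and all function names and parameters (`permutation_p_value`, `insight_score`, `alpha`, `rounds`) are hypothetical.

```python
import math
import random
from statistics import mean, variance


def cohens_d(subset, rest):
    """Cohen's d with a pooled standard deviation (needs >= 2 values per group)."""
    n1, n2 = len(subset), len(rest)
    pooled_var = ((n1 - 1) * variance(subset) + (n2 - 1) * variance(rest)) / (n1 + n2 - 2)
    if pooled_var == 0:
        # No spread at all: the effect size is undefined; treated as 0 in this sketch.
        return 0.0
    return (mean(subset) - mean(rest)) / math.sqrt(pooled_var)


def permutation_p_value(subset, rest, rounds=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference of means.

    Assumption: the paper only says "statistical significance testing";
    this particular test is a stand-in chosen for illustration.
    """
    rng = random.Random(seed)
    observed = abs(mean(subset) - mean(rest))
    pooled = list(subset) + list(rest)
    n = len(subset)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n]) - mean(pooled[n:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return (hits + 1) / (rounds + 1)


def insight_score(subset, rest, alpha=0.05):
    """Hypothetical score: |d| weighted by coverage, 0 if not significant."""
    if permutation_p_value(subset, rest) >= alpha:
        return 0.0
    coverage = len(subset) / (len(subset) + len(rest))
    return abs(cohens_d(subset, rest)) * coverage
```

Here `subset` and `rest` would hold one numeric PPI value per case, for example the total case duration. Cohen's conventional reading of the result treats values of about 0.2, 0.5, and 0.8 as small, medium, and large effects.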
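
Section II-C states that subset recommendations are ranked with the diversified top-k algorithm of [1] so that near-duplicate subsets do not crowd the top of the list. The sketch below is not that algorithm. It is a simpler greedy stand-in that illustrates the idea: candidates are visited in descending score order and a candidate is skipped when it overlaps too much with one already selected. Measuring overlap by the Jaccard similarity of case identifiers, the `max_similarity` threshold, and the tuple layout `(name, score, case_ids)` are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of case identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diversified_top_k(candidates, k, max_similarity=0.5):
    """Greedy sketch of diversity-aware ranking (not the algorithm of [1]).

    candidates: iterable of (name, score, case_ids) with case_ids a set.
    Returns the names of up to k high-scoring, mutually dissimilar subsets.
    """
    selected = []
    for name, score, cases in sorted(candidates, key=lambda c: c[1], reverse=True):
        if len(selected) >= k:
            break
        # Keep the candidate only if it is dissimilar to everything chosen so far.
        if all(jaccard(cases, chosen) <= max_similarity for _, _, chosen in selected):
            selected.append((name, score, cases))
    return [name for name, _, _ in selected]
```

With a threshold of 0.5, two subsets that share three of five distinct cases (similarity 0.6) would not both appear in the result, and the lower-scored one is dropped in favor of the next dissimilar candidate. A greedy pass like this does not guarantee the maximum total score that an exact method aims for.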