Directly Follows-Based Process Mining: a Tool

Sander J.J. Leemans, Erik Poppe, Moe T. Wynn
Queensland University of Technology
Brisbane, Australia
{s.leemans, e.poppe, m.wynn}@qut.edu.au

Abstract—In order to bridge the gap between well-founded academic and intuitive commercial tools, we introduce the Directly Follows visual Miner (DFvM), which takes as input an event log and allows users to explore it. That is, it automatically discovers a directly follows process model, applies conformance checking, provides performance measures and allows for filtering the log. DFvM uses directly follows models, which are also used by many commercial process mining tools, however, unlike such tools, it provides conformance checking and reliable performance measures.

Index Terms—process discovery, conformance checking, process enhancement, performance mining, directly follows-based process mining

I. INTRODUCTION

Process mining aims to obtain insights from event logs that contain recorded behaviour of executions of an organisation’s business processes, in order to optimise the processes. Typically, in a process mining project, first a process model that describes the control flow of the process is discovered from an event log. Second, the model should be evaluated against the event log or a secondary test event log to verify that the model represents the behaviour of the process well. This is typically performed using a conformance checking technique. Third, the performance of the process can be measured to identify bottlenecks, central concepts and batching behaviour. These steps can be repeated to, for instance, zoom in on areas of particular interest, or to compare different groups of recorded behaviour [1].

Many process mining techniques and tools have been proposed, both commercially (e.g. [2]) and academically [3]. Academic tools typically use process models with a well-defined semantics that support advanced constructs such as concurrency, interleaving and inclusive choices. This allows such models, for instance Petri nets, process trees or BPMN models, to be evaluated using conformance checking techniques [4]. In contrast, many commercial process mining tools use directly follows-based process models to convey business processes to stakeholders, even though these models often lack an executable semantics and support neither concurrency, interleaving nor inclusive choices. As the executable semantics of such models is not clear, obtained insights are difficult to verify [4]. Nevertheless, such models are considered to be easier to understand for users [3].

In [4], we introduced concepts to bridge this gap between academic and commercial process mining tools: we keep the directly follows based models, but use a proper semantics, introduce soundness and describe how conformance checking and performance measuring concepts can be applied to these models.

In this paper, we present the Directly Follows visual Miner (DFvM), which applies both well-established and new academic concepts to directly follows based models. Our aim is to illustrate that conformance checking and reliable performance measures are possible within the limitations of intuitive directly follows-based models.

Relation with Inductive visual Miner: The DFvM is an extension of the Inductive visual Miner (IvM) [1] and shares its code base. Compared to the previously published version of IvM [1], DFvM has a new architecture (Section III) and several new features (Section IV). The architecture and many of the new features of DFvM have been made available to users of IvM as well, however have not been published before.

In the remainder of this paper, we first introduce the idea of the DFvM (Section II) and explain its architecture (Section III). We describe the new features in Section IV and describe its maturity in Section V. Section VI explains how the tool can be accessed while the paper is concluded in Section VII.

II. IDEA

The Directly Follows visual Miner (DFvM) takes an event log and automatically applies a series of steps, which allows the user to perform process-based analyses on the log. That is, first a Directly Follows Model (DFM) is discovered by a DFM discovery algorithm. Second, the DFM is aligned with the event log, and the results are shown. That is, deviations between event log and DFM can be shown on both the model (to show where deviations occur) and on the event log (to show which events do not correspond to the model [log moves], and where in the traces the model made a move that is not represented in the trace [model moves]). Furthermore, several performance measures, such as how often parts of the model are executed and the time spent waiting for or executing these parts, are computed and visualised. In addition, the event log is animated as yellow tokens flowing through the model, which allows the observation of bottlenecks, batching behaviour and seasonality. In typical process mining projects, insights gained using these techniques might highlight the need for drilling further into the event log, for instance to compare different subsets of the event log or to zoom in on a particular part of the DFM [1]. DFvM facilitates such inquiries by quick filtering in several ways: first, clicking an activity or edge of the DFM, filters the log to contain only traces that use that activity or edge. Second, filters based on trace and event attributes can be applied before or after discovery of the DFM. Any such filtering will update all performance measures accordingly.

To summarise, using DFvM a user can explore the process iteratively to derive insights by repeatedly discovering a process model using process discovery, evaluating the model using conformance checking, assessing its performance and drilling down by filtering the log.

To enable this seamless exploration, a new architecture underlying DFvM automates this process: when users change a setting DFvM recomputes results as necessary.

III. ARCHITECTURE

DFvM performs several steps fully automatically: after a few log-related steps (filtering, applying the classifier, etc.), a directly follows model is discovered, after which it is aligned, the log is filtered and the results of that filtered alignment are visualised (performance, histograms, animation, colouring, etc.).

The architecture of the DFvM is shown in Figure 1: the graph shows the steps that are performed to visualise event logs and directly follows models, and the dependencies between steps. Each step is executed (in parallel) as soon as all preceding steps (that is, all steps with arcs going into the current step) have been completed. Similarly, when a user changes a setting, only the relevant dependent steps are recomputed. For instance, if a user changes the highlighting selection (e.g. by clicking on an activity to filter the log to only include traces going through that activity according to the model) then highlight selection, measure performance and compute histograms are redone.

Fig. 1. Architecture of the Directly Follows visual Miner.

This architecture is supported by a flexible multi-threaded framework that uses this graph to take care of starting steps, keeping track of their progress, and cancelling steps and voiding their results if they are no longer necessary (because the user changed something upstream again). This framework has proven to be flexible: new steps can be added to the framework by simply adding them and their dependency edges to the graph. Compared to IvM, the framework has been improved: previously, all current steps were executed sequentially and only if a step’s result would become obsolete a concurrent execution was started.

IV. MAIN NEW FEATURES

Directly Follows visual Miner (DFvM) is based on the Inductive visual Miner (IvM) [1] and extends it as follows with new features exclusively for directly follows models:
• Process discovery. The new process discovery algorithm is described in [4] and is available in DFvM. This algorithm simplifies the model by filtering traces, based on the paths slider. Using the paths slider, the level of infrequent behaviour filtering can be adjusted. The value of the slider sets the minimum percentage of traces in the event log that fit the model (guaranteed when only completion events are considered [4]).
• Conformance checking. Directly follows models are automatically translated to Petri nets and aligned using [5]. The results are used throughout the DFvM, for instance to show deviations, animate the event log over the model and to compute performance measures. DFvM shows deviations using the concepts introduced in [4], which illustrates how (commercial) directly follows-based tools could implement conformance checking.
• Performance. Typically, existing DFM-based tools compute performance measures based on DFM-edge traversal. As shown in [4], this might lead to counter-intuitive results. DFvM, on the other hand, computes performance measures based on the computed alignment. That is, the traces are projected on the model and several performance measures are computed based on this projection: minimum, average and maximum sojourn, waiting and service time, and elapsed and remaining trace time.
• Edit model. Users can manually edit the directly follows model that was discovered, or import an existing stored model to avoid discovery altogether. Editing the model may involve changing the start and end activities, as well as the edges that make up the model and selecting whether the model should support empty traces. While editing the model, users are shown an example of the to-be-model to help with editing and creating a sound model, that is, a model in which every activity is reachable and can proceed to the end of the model [4]. If the model is not sound, DFvM will highlight the identified issue (see Figure 3). As soon as the edited model is sound, DFvM will adopt the new model and continue computations according to Figure 1.
• Export model. The directly follows model can be exported to ProM as a Petri net, an Accepting Petri net, an Expanded Accepting Petri net (with explicit start and completion activities) or as a DFM.

Fig. 2. A screenshot of the Directly Follows visual Miner, showing the controls on the right, the model on the left and the performance measures of an activity in a pop-up.

Fig. 3. A screenshot of DFvM’s model editing capabilities. A sound model is necessary for further processing. The shown model is not sound as from activity X, the end state cannot be reached. A preview of the model is shown in order to aid users to correct the issue.

New features of DFvM that also benefit users of IvM:
• Trace and token colouring. Using “trace colouring”, a trace-level attribute can be chosen, and every trace will be coloured according to the value of this attribute using a continuous Viridis colour map. The tokens, as well as the traces in the trace view, are highlighted with this colour.
• Hardware accelerated drawing. The drawing of tokens now makes use of hardware acceleration using OpenGL for smoother animations.
• Edit model for process trees. A model editing facility has been added for process trees as well.
• Additional performance measures: DFvM shows elapsed and remaining time, in addition to waiting, service and sojourn time. For an activity a, elapsed time is the time from the first timestamp in a trace to the start of the execution of a, or the completion if no start timestamp is available.
  Similarly, remaining time is the time of the completion (or start if that is not available) of a to the last timestamp in a trace. The reported measures are the minimum, average and maximum times over all executions of an activity a.
• Import an existing process model and bypass the discovery. Conformance checking and performance analysis for existing models can be done using the ProM plug-in “Visualise deviations on directly follows model (Directly Follows visual Miner)”.
• Switch between DFvM and IvM. Users can easily switch between DFvM and IvM by choosing a different “miner”: even though DFvM and IvM use completely different concepts, if advanced process discovery using process trees with concurrency, interleaving or inclusive choices is necessary, IvM is two clicks away.

V. MATURITY

The Inductive visual Miner was originally introduced in 2014, and has attracted 53 [1] and 41 [6] citations (Google Scholar, 18-03-2019) and has been used in several industry projects within QUT’s BPM discipline.

We have applied the Directly Follows visual Miner (DFvM) to a case study in a Queensland Government department [4], in which it proved useful to perform process mining analyses on models that were difficult to represent with Petri nets or process trees. Furthermore, the department had 72 manually created directly follows-based business process descriptions, which were transformed to a DFvM format, after which we used DFvM to apply conformance checking to these models and corresponding event logs to gain insight into deviations, frequencies and performance [4].

Finally, the new edit model feature has been used in collaborations with several industry partners during workshops. While discussing the model with the participants, it is modelled live in DFvM, after which DFvM quickly shows the alignment results (deviations, frequencies and performance), which then often leads to further discussions on the model and on the process. For instance, one of the participants of a workshop noted “Utilising the DFvM tool during workshops has been beneficial. Being able to visually see the deviations, frequencies and performance of the processes has facilitated robust discussion. All workshop participants, whether process focussed or not, are able to quickly comprehend the model and leverage the findings to identify process inefficiencies and drive potential process transformation.” (Janne Barnes, Project Manager - Queensland University of Technology - Research Management Systems Upgrade).

VI. ACCESS & DEMONSTRATION

The Directly Follows visual Miner (DFvM) is available both as a ProM plug-in (http://promtools.org, from version 6.9 or in Nightly Builds) and as a part of the ProM QuickVisualiser (http://leemans.ch/quickvisualiser/).

In both cases, an event log is required to run DFvM. Example event logs are available from https://data.4tu.nl/repository/collection:event_logs_real.

A user manual is available from http://leemans.ch/inductivevisualminer/, and a screencast video is available on https://youtu.be/xTKKqGwzh6I.

VII. CONCLUSION

Typical process mining projects involve repeated process discovery, conformance checking, filtering and performance measuring. Academic process discovery techniques might return models that are difficult to understand or that under- or overfit the event log, while commercial tools typically do not offer conformance checking capabilities (and thus, might provide counter-intuitive performance measures).

In this paper, we introduced the Directly Follows visual Miner (DFvM): a new tool that automatically discovers a directly follows model, applies conformance checking (alignments), computes performance measures and supports filtering to drill down into parts of the event log.

DFvM extends and improves the Inductive visual Miner with a new formalism (directly follows models rather than process trees), a new more concurrent architecture using a flexible framework, the ability to edit models and new performance measures.

We express the hope that the ideas illustrated in DFvM will lower the bar for analysts to apply conformance checking and that it will inspire commercial vendors and others to include conformance checking capabilities.

REFERENCES

[1] S. J. J. Leemans, D. Fahland, and W. M. P. van der Aalst, “Process and deviation exploration with Inductive visual Miner,” in BPM Demo Sessions 2014, 2014, p. 46.
[2] C. W. Günther and A. Rozinat, “Disco: Discover your processes,” in BPM demos, 2012, pp. 40–44.
[3] W. M. P. van der Aalst, Process Mining - Data Science in Action, Second Edition. Springer, 2016.
[4] S. J. J. Leemans, E. Poppe, and M. T. Wynn, “Directly follows-based process mining: Exploration & a case study,” in ICPM, 2019, in print.
[5] W. M. P. van der Aalst, A. Adriansyah, and B. F. van Dongen, “Replaying history on process models for conformance checking and performance analysis,” WIRDMKD, vol. 2, no. 2, pp. 182–192, 2012.
[6] S. J. J. Leemans, D. Fahland, and W. M. P. van der Aalst, “Exploring processes and deviations,” in BPM Workshops, 2014, pp. 304–316.
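
Illustrative sketch of the soundness notion. Section IV describes a sound model as one in which every activity is reachable and can proceed to the end of the model [4]. The Python sketch below shows one way such a check might look, and is not taken from the DFvM code base. It assumes a directly follows model is given as a set of activities, a set of directed edges, and sets of start and end activities; this representation and the names `reachable` and `unsound_activities` are hypothetical, and support for empty traces is ignored. The precise definition of soundness is given in [4].

```python
from collections import deque


def reachable(sources, successors):
    """Return every node reachable from `sources` by following `successors`."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unsound_activities(activities, edges, starts, ends):
    """Return the activities that violate the (assumed) soundness condition.

    An activity is reported if it cannot be reached from any start activity,
    or if no end activity can be reached from it.
    """
    forward, backward = {}, {}
    for a, b in edges:
        forward.setdefault(a, set()).add(b)
        backward.setdefault(b, set()).add(a)
    from_start = reachable(starts, forward)   # forward search from the starts
    to_end = reachable(ends, backward)        # backward search from the ends
    return {a for a in activities if a not in from_start or a not in to_end}


# Hypothetical model in the spirit of the Fig. 3 caption:
# from activity X, the end of the model cannot be reached.
activities = {"A", "B", "X"}
edges = {("A", "B"), ("A", "X")}
problems = unsound_activities(activities, edges, starts={"A"}, ends={"B"})
print(problems)  # {'X'}
```

An empty result means the model passes this check. Under these assumptions, an editor could use a non-empty result to decide which activities to highlight, which is the kind of feedback the Fig. 3 caption describes.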