
ProcessProfiler3D: A Tool for Visualising Performance Differences Between Process Cohorts and Process Instances

E. Poppe

M.T. Wynn

A.H.M. ter Hofstede

R. Brown

A. Pini

W.M.P. van der Aalst

DensityDesign Research Lab, Politecnico di Milano, Milan, Italy; Eindhoven University of Technology, Eindhoven, The Netherlands; Queensland University of Technology, Queensland, Australia

An organisation's event logs can give great insight into the factors that affect the execution of its business processes when different process cohorts are compared. We have recently presented ProcessProfiler3D, a novel tool for such comparisons that supports interactive data exploration, automatic calculation of performance data and visual comparison of multiple cohorts. The approach enables the intuitive discovery of differences and trends in cohort performance. To better support the interpretation of these differences in the context of process execution, we have now extended the tool with a novel visualisation technique that shows case execution and timing in a way that provides context for such a performance analysis.

Analysing process data in event logs to identify problems and opportunities with existing processes can be of great value for improving the processes of an organisation. Process mining [1], a specialised field of research in business process management, develops tools and techniques to support this. By splitting an event log into process cohorts, i.e. groups of process instances that share one or more characteristics, one can analyse how different case characteristics (often called context factors) affect the execution of a process. We have recently identified that, despite continued industry interest [6,4,2], there is a lack of tools to support such analyses effectively [7]. None of the existing academic or commercial tools provided both interactive data exploration, through interactive splitting of the event log, and an integrated comparison of multiple process cohorts, through the visualisation of performance data for more than two cohorts in one view. Consequently, we presented ProcessProfiler3D, a framework to solve this issue [7]. We now present a complementary novel visualisation technique that covers further performance analysis scenarios by adding context to the presented performance data.
ProcessProfiler3D enables comparing the performance of multiple process cohorts by
- aligning an event log with a process model
- calculating common node-level process performance indicators such as activity duration, activity throughput time and waiting times between activities
- storing performance data in a data cube
- interactively splitting the event log by defining cohorts
- visualising performance data in a third dimension on top of the process model at multiple levels of process abstraction
- visualising data related to activities using one of three different types of bar charts or a triangle chart (see [5])
- visualising data related to activity pairs using coloured arcs between the two activities (see [7])
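The indicator calculation and cohort split listed above can be illustrated with a minimal Python sketch. It is not the ProM implementation: the toy log, the attribute used as cohort, the function name `indicators_by_cohort`, and the definition of waiting time as the gap between the end of the previous activity and the start of the current one within a case are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy log: (case id, activity, start, end, cohort attribute).
EVENTS = [
    ("c1", "A", 0, 4, "gold"),
    ("c1", "C", 6, 9, "gold"),
    ("c2", "A", 1, 3, "silver"),
    ("c2", "C", 8, 14, "silver"),
]


def indicators_by_cohort(events):
    """Return {(cohort, activity): {"duration": mean, "waiting": mean or None}}."""
    by_case = defaultdict(list)
    for case, activity, start, end, cohort in events:
        by_case[case].append((start, end, activity, cohort))

    durations = defaultdict(list)
    waits = defaultdict(list)
    for rows in by_case.values():
        rows.sort()  # order the events of one case by start time
        previous_end = None
        for start, end, activity, cohort in rows:
            durations[(cohort, activity)].append(end - start)
            if previous_end is not None:
                # Assumed definition: idle time since the previous activity ended.
                waits[(cohort, activity)].append(start - previous_end)
            previous_end = end

    return {
        key: {
            "duration": mean(values),
            "waiting": mean(waits[key]) if waits[key] else None,
        }
        for key, values in durations.items()
    }


RESULT = indicators_by_cohort(EVENTS)
```

Keying the result by (cohort, activity) mirrors, in a very reduced form, the idea of storing indicators in a cube that can later be sliced by cohort.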

The framework was implemented in two plugins for the process mining framework ProM. Figure 1 shows an example of comparative performance analysis using this tool.

However, some scenarios are still not well covered by existing performance analysis techniques. In the remainder of this paper we discuss one of these scenarios and present a novel visualisation technique that we have added to ProcessProfiler3D to address it.

Problem statement

One issue with existing techniques for process performance analysis is the loss of context that occurs when performance data are a) localised and b) summarised, as is usually the case with activity duration, throughput time and waiting time calculations. Both problems have the potential to affect our understanding of performance analysis results and can complicate finding root causes.

Firstly, the analysis results are currently localised to one point in the process model. For example, an activity C may be preceded by either activity A or B. By looking at performance indicators of these activities we cannot tell if cases that first executed A on average take longer to execute C than cases that executed B. So by localising the analysis results per activity we lose the context of how preceding activities affected the case and how subsequent activities were impacted.
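The A/B/C example can be made concrete with a small Python sketch. The traces and durations are invented, and the helper names `overall_duration` and `duration_by_predecessor` are hypothetical; the point is only that a single per-activity mean for C hides a difference that appears once the preceding activity is taken into account.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical traces: each is a list of (activity, duration) in execution order.
TRACES = [
    [("A", 2), ("C", 10)],
    [("A", 3), ("C", 12)],
    [("B", 2), ("C", 4)],
    [("B", 1), ("C", 6)],
]


def overall_duration(traces, target):
    """Localised indicator: mean duration of `target` over all cases."""
    return mean(d for trace in traces for a, d in trace if a == target)


def duration_by_predecessor(traces, target):
    """Mean duration of `target`, grouped by the directly preceding activity."""
    groups = defaultdict(list)
    for trace in traces:
        for (prev, _), (act, dur) in zip(trace, trace[1:]):
            if act == target:
                groups[prev].append(dur)
    return {prev: mean(values) for prev, values in groups.items()}


OVERALL_C = overall_duration(TRACES, "C")
C_BY_PREDECESSOR = duration_by_predecessor(TRACES, "C")
```

In this invented data the localised mean for C is 8, while cases arriving from A average 11 and cases arriving from B average 5.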

Secondly, the statistical summary of performance indicators by minimum, median, mean and maximum also means that we are losing context in the results. It is, for example, hard to tell whether a few extreme cases skewed the results or what the general distribution of cases is. Furthermore, if the same case executes an activity multiple times, it is impossible to identify differences between the individual execution times (e.g. the activity took a long time on the first execution, but finished quickly on every following execution). Some absolute indicators, such as the average case runtime at an activity, also get distorted by loops. Consequently, while existing process performance analysis techniques already provide valuable insights into the execution of a process, additional analysis techniques are required to add context to the results of existing techniques.
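Both effects of summarisation can be reproduced on a few invented numbers. The following Python sketch assumes hypothetical durations and activity names ("Check", "Fix"); `summarise` and `executions_per_case` are illustrative helpers, not part of the tool.

```python
from collections import defaultdict
from statistics import mean, median


def summarise(values):
    """The four summary statistics named in the text."""
    return {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


def executions_per_case(events, activity):
    """Keep every individual execution of `activity`, in order, per case.

    events: iterable of (case, activity, duration) in execution order.
    """
    per_case = defaultdict(list)
    for case, act, duration in events:
        if act == activity:
            per_case[case].append(duration)
    return dict(per_case)


# Hypothetical durations: one extreme case pulls the mean far above the median.
DURATIONS = [5, 5, 6, 5, 6, 5, 60]
SUMMARY = summarise(DURATIONS)

# Hypothetical loop: "Check" is slow the first time and fast on every repeat.
LOOP_EVENTS = [
    ("c1", "Check", 30),
    ("c1", "Fix", 4),
    ("c1", "Check", 2),
    ("c1", "Check", 2),
]
CHECK_RUNS = executions_per_case(LOOP_EVENTS, "Check")
```

The summary alone cannot distinguish a single outlier from a generally slow activity, and a per-case mean over the three "Check" executions would hide that only the first one was slow.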

Trajectory Visualisation

We therefore propose a novel visualisation technique inspired by geo-spatial data visualisations (e.g. [3]) to present performance data in the context of both history and future execution of a process instance. This visualisation presents the path of individual cases through a process model, while showing timing information in a third dimension, orthogonal to the process model. An example of this technique can be seen in Figure 2.

We construct this visualisation by replaying a token-game on a given Petri net and recording each token move as a line in two dimensions. We then use the time of each event that triggered the token move to calculate the height of the start and end point of each line. Our implementation provides three different configurations of the trajectory visualisation. The first variant visualises token paths from one activity straight to the next activity. The second variant visualises the token paths from the activity through the place to the next activity. The third variant visualises the token path from the activity along the edge connecting it to the place and then along the edge to the next activity. Each variant increases the complexity of the visualisation, but often lines following the model layout more precisely make it easier to relate them back to the underlying process model and therefore easier to understand. To further facilitate this, vertical support lines can be displayed by selecting nodes in the process model, as shown in Figure 2.
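A simplified Python sketch of the first two variants is given below. It makes several assumptions that go beyond the text: the trace is strictly sequential, so consecutive events stand in for token moves instead of a full Petri net replay; the layout coordinates and the `places` mapping are hypothetical; height is measured as time since the first event of the case; and, in the second variant, the height at the place is linearly interpolated between the two events, which the description above does not specify.

```python
# Hypothetical 2D layout positions of activities (transitions) and places.
LAYOUT = {"A": (0.0, 0.0), "p1": (1.0, 0.0), "C": (2.0, 0.0)}

# Hypothetical mapping from an activity pair to the place connecting them.
PLACES = {("A", "C"): "p1"}


def trajectory(trace, layout, places=None):
    """Build 3D line segments for one case.

    trace:  list of (activity, timestamp), sorted by timestamp.
    places: optional {(source, target): place}; if given, the path is bent
            through the place (second variant), otherwise it runs straight
            from activity to activity (first variant).
    Returns a list of ((x, y, z), (x, y, z)) segments, z = time since case start.
    """
    if not trace:
        return []
    t0 = trace[0][1]
    segments = []
    for (src, t_src), (dst, t_dst) in zip(trace, trace[1:]):
        start = (*layout[src], t_src - t0)
        end = (*layout[dst], t_dst - t0)
        place = places.get((src, dst)) if places else None
        if place is None:
            segments.append((start, end))
        else:
            # Assumption: the height at the place is the midpoint in time.
            mid = (*layout[place], (start[2] + end[2]) / 2)
            segments.append((start, mid))
            segments.append((mid, end))
    return segments


TRACE = [("A", 10), ("C", 16)]
STRAIGHT = trajectory(TRACE, LAYOUT)
VIA_PLACE = trajectory(TRACE, LAYOUT, PLACES)
```

The third variant would follow the same pattern with additional intermediate points taken from the edge routing of the model layout.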

In addition to the shape of case trajectories, colours can encode additional information in the visualisation. By default, case trajectories are coloured to indicate the cohort a case belongs to (see Figure 3). However, our implementation can also colour the trajectory to display relative completion of the case as a colour gradient. This can facilitate finding bottlenecks in large event logs. Furthermore, the cohort classification can be used to filter the visualisation, by hiding trajectories belonging to a particular cohort. Lastly, the vertical scale of the visualisation can be changed by clicking on the white frame surrounding the trajectories and pulling it upwards or downwards. This can make it easier to see differences between otherwise densely packed trajectories.
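The colour gradient and the cohort filter can be sketched in a few lines of Python. This is an illustration under assumptions: relative completion is taken to be the elapsed fraction of the total case duration, the gradient is a linear RGB interpolation between two arbitrary end colours, and the dictionary structure of a trajectory is hypothetical.

```python
def completion_colour(t, t_start, t_end, low=(0, 0, 255), high=(255, 0, 0)):
    """Map a timestamp to an RGB colour by relative completion of the case.

    Assumption: completion = (t - t_start) / (t_end - t_start), clamped to [0, 1].
    """
    if t_end == t_start:
        fraction = 0.0
    else:
        fraction = (t - t_start) / (t_end - t_start)
    fraction = min(1.0, max(0.0, fraction))
    return tuple(round(a + (b - a) * fraction) for a, b in zip(low, high))


def visible(trajectories, hidden_cohorts):
    """Filter out trajectories whose cohort has been hidden by the user."""
    return [t for t in trajectories if t["cohort"] not in hidden_cohorts]


# Hypothetical trajectories, each tagged with the cohort of its case.
TRAJECTORIES = [
    {"case": "c1", "cohort": "gold"},
    {"case": "c2", "cohort": "silver"},
    {"case": "c3", "cohort": "gold"},
]
SHOWN = visible(TRAJECTORIES, {"silver"})
```

With such a gradient, segments of the same colour at very different heights indicate cases that reach the same stage of completion at different times.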

Seeing both the control-flow and the time perspective in one view enables users to identify interactions between control-flow constructs such as loops and process execution times. Using this technique together with the previously presented techniques for comparative performance visualisation (see [7]) therefore facilitates the understanding of performance analysis results.

Conclusion

We have presented ProcessProfiler3D, a framework that can be used to analyse and compare the performance of multiple process cohorts. The usefulness of this framework has previously been demonstrated by analysing two industry data sets and evaluating the tool with two industry partners [7]. In this paper we have added a novel visualisation technique, the trajectory visualisation, to this framework, to address the loss of context in the existing performance analysis approaches.

The framework is available as a package (called "ProcessProfiler3D") for the process mining framework ProM. In addition, the complete source code for the tool, including the trajectory visualisation, is available in the ProM repository at https://svn.win.tue.nl/repos/prom/Packages/ProcessProfiler3D/

A screencast of the tool including the new technique is available at: https://www.youtube.com/watch?v=CkgBTFk6MXY

References

1. van der Aalst, W.M.P.: Process Mining: Data Science in Action. Springer (2016)

2. Bolt, A., de Leoni, M., van der Aalst, W.M.P., Gorissen, P.: Exploiting process cubes, analytic workflows and process mining for business process reporting: A case study in education. In: International Symposium on Data-driven Process Discovery and Analysis. pp. 33–47. CEUR-WS.org (2015)

3. Kraak, M.J.: The space-time cube revisited from a geovisualization perspective. In: Proc. 21st International Cartographic Conference. pp. 1988–1996 (2003)

4. Partington, A., Wynn, M., Suriadi, S., Ouyang, C., Karnon, J.: Process mining for clinical processes: A comparative analysis of four Australian hospitals. ACM Transactions on Management Information Systems 5(4), 19:1–19:18 (Jan 2015)

5. Pini, A., Brown, R., Wynn, M.T.: Process visualization techniques for multiperspective process comparisons. In: Bae, J., Suriadi, S., Wen, L. (eds.) Asia Pacific Business Process Management. Lecture Notes in Business Information Processing, vol. 219, pp. 183–197. Springer, Busan, Korea (March 2015)

6. Suriadi, S., Wynn, M.T., Ouyang, C., ter Hofstede, A.H.M., van Dijk, N.J.: Understanding process behaviours in a large insurance company in Australia: A case study. In: Salinesi, C., Norrie, M.C., Pastor, O. (eds.) Advanced Information Systems Engineering, Lecture Notes in Computer Science, vol. 7908, pp. 449–464. Springer (2013)

7. Wynn, M.T., Poppe, E., Xu, J., ter Hofstede, A.H.M., Brown, R.A., Pini, A., van der Aalst, W.M.P.: ProcessProfiler3D: A visualisation framework for log-based process performance comparison. Decision Support Systems (2017, in press), https://doi.org/10.1016/j.dss.2017.04.004