Understanding and Improving Process Analysts’ Comprehension of Process Mining Visualizations

Femke Pieters¹,*

¹ UHasselt, Digital Future Lab, Agoralaan Building D, 3590 Diepenbeek, Belgium

Abstract
Process mining (PM) has become fundamental for data-driven process analysis. It enables organizations to identify inefficiencies and areas for improvement. Central to the effectiveness of PM is the use of visualizations and the ability of process analysts to comprehend the visualizations generated by PM algorithms. This doctoral research aims to address the challenge of enhancing the comprehensibility of PM visualizations for process analysts. The study has two key objectives: (1) to establish an improved understanding of how process analysts interpret PM visualizations and (2) to design and evaluate novel visualizations that improve comprehensibility. Through a combination of theoretical frameworks and empirical studies, this research develops an assessment framework to evaluate the comprehensibility of PM visualizations. Additionally, the research proposes novel visualization designs, which are assessed through a series of experiments to determine their impact on analysts’ performance. The findings contribute to the field of PM by providing insights that can guide the design of more user-friendly visualizations, which facilitates better decision-making and process improvement within organizations.

Keywords
Process Mining, Comprehensibility, Visualization, Visual Analytics, Process Analyst

1. Introduction

Organizations commonly conduct process analysis to pinpoint operational problems and areas for improvement [1]. These analyses are increasingly being performed in a data-driven way given the widespread presence of process execution data captured by information systems, i.e. fine-grained data reflecting how a process is being performed in reality.
Process mining (PM) enables such data-driven process analyses, which use process execution data to, e.g., discover the activity order in a process or expose bottlenecks [2]. The great majority of PM algorithms provide a visual representation as output, such as a process model showing the activity order [3]. To actually contribute to organisational value creation by, e.g., identifying root causes of operational problems or process improvement opportunities, process analysts who use PM in their profession need to make sense of the provided PM output [4, 5]. Hence, it is crucial that PM visualizations are comprehensible for process analysts such that the visualization can effectively and efficiently satisfy their information needs [6]. The incomprehensibility of current PM visualizations has emerged as an important challenge for the use of PM in organizations [7]. Strikingly, there is hardly any understanding of how process analysts comprehend PM visualizations [8], even though this is critical to guide the (re)design of PM visualizations.

In this doctoral research, we aim to gather an in-depth understanding of process analysts’ comprehension of PM visualizations to create novel PM visualizations with improved comprehensibility. Besides providing a highly relevant knowledge contribution, the novel visualizations enable process analysts to better comprehend the provided output, turning PM into a more powerful instrument to support data-driven process improvement. The two research objectives are to (RO1) establish an in-depth understanding of process analysts’ comprehension of PM visualizations and (RO2) create and evaluate novel PM visualizations designed to improve comprehensibility for process analysts.

ICPM 2024 Doctoral Consortium, October 14–18, 2024, Kongens Lyngby, Denmark
* Corresponding author.
Email: femke.pieters@uhasselt.be (F. Pieters)
ORCID: 0009-0006-5675-126X (F. Pieters)
© 2024 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

The remainder of this extended abstract is structured as follows. In Section 2, we describe our research methodology. An overview of related work is given in Section 3. Finally, we conclude our research intents in Section 4.

2. Methodology

This research will take both a theoretical and an empirical perspective. The proposed research objectives will be achieved following five iterative steps.

(1) Identify common PM visualizations and related process analysis tasks (preparatory to RO1 and RO2). To inform the selection of PM visualizations in subsequent research steps, the most common process analysis tasks and associated PM visualizations are identified. Published PM case studies will be collected from various sources. For each collected case study, the process analysis tasks that are conducted are recorded, together with the PM visualizations that have been used. This step results in a structured overview of common process analysis tasks and the PM visualizations used for each task. We will validate the overview by (a) consulting the main commercial tools and (b) conducting semi-structured interviews with academics and process analysts from industry.

(2) Create an assessment framework to assess the comprehensibility of PM visualizations, based on theory (RO1). A rapid review will be conducted to identify theories, guidelines and principles on (process) visualizations and comprehensibility. As relevant literature is scattered over various fields such as process modeling, information visualization, visual analytics and human-computer interaction, a systematic literature review is infeasible here. Before conducting the rapid review, we will develop an appropriate protocol building upon Klerings et al. [9]. The data will be coded using the Gioia method [10], using Atlas.ti as a tool [11].
Drawing upon the results of the rapid review, a framework for assessing the comprehensibility of PM visualizations from a theoretical perspective will be composed. An example of a theory that could be part of the assessment framework is cognitive load theory, which states that a visualization should help analysts process information without overloading their working memory. Cognitive load can be reduced by, for example, using color codes or showing the right amount of detail [8]. The goal of our assessment framework is to assess, from a theoretical perspective, the comprehensibility of a visualization and of its components.

(3) Exploratory study on process analysts’ comprehension of PM visualizations (RO1). To understand how process analysts comprehend PM visualizations, an exploratory study will be designed and conducted in which process analysts perform tasks that require them to locate particular information in a visualization. While they perform these tasks, data is collected using three complementary data collection methods: (a) clickstream traces, (b) eye-tracking and (c) concurrent think-aloud. Using these rich datasets, we will identify action patterns and intents of process analysts and relate these to their task performance. The assessment framework resulting from the previous step will be refined with the outcomes of step 3.

(4) Assess the comprehensibility of current PM visualizations (RO2). The comprehensibility of the most common PM visualizations identified in step 1 will be assessed using the assessment framework created in steps 2 and 3. To limit the impact of personal bias, dual data extraction will be used. Based on the assessment, various areas in which PM visualizations can be improved to enhance their comprehensibility will emerge. This constitutes the basis for a research agenda containing a structured overview of areas for future research.
This research agenda will not only inform step 5, but will also be highly relevant for the research community at large. While the importance of devoting more attention to the visualization of PM output is widely recognised at a conceptual level, this research agenda will provide specific directions to move forward. In this way, it has the potential to generate significant impact on the future development of the research field.

(5) Create and evaluate novel PM visualizations (RO2). Open challenges regarding the comprehensibility of current PM visualizations emerging from step 4 will be selected and tackled. In particular, the two visualizations that are commonly used and offer the largest potential for improved comprehensibility will be selected. To create the novel PM visualizations, the procedure to design cognitively efficient visualizations by Ware [12] will be used. An intermediate evaluation with a convenience sample of master's students will be performed. Concurrent think-aloud will be used, i.e. students and academics are requested to perform the PM task using the created prototype while talking aloud. This intermediate evaluation can reveal areas in which the visualizations can be further refined to obtain a mature prototype.

Afterwards, a between-subjects experimental design is set up, in which we evaluate the impact of the novel PM visualizations on their comprehensibility for process analysts. Participating analysts will be randomly assigned to one of two groups: the experimental or the control group. The visualization is the treatment variable, but both groups will first undergo a baseline assessment to measure their initial level of PM visualization comprehension. Following this baseline assessment, the control group will perform the experimental tasks using the original PM visualizations and the experimental group will perform the same tasks using the novel visualizations. Similar to step 3, the tasks will take the form of search and recognition tasks.
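The between-subjects design described above can be illustrated with a minimal Python sketch. It is not the study's analysis pipeline. The function names `assign_groups` and `one_way_anova_f`, the participant identifiers, the seed and the scores are all hypothetical and chosen for illustration only. The sketch randomly splits participants into a control and an experimental group and computes a one-way ANOVA F statistic by hand from its standard definition (between-group mean square divided by within-group mean square), which is one common way to compare group means in such a design.

```python
import random
from statistics import mean


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into two groups of (near-)equal size."""
    rng = random.Random(seed)          # seeded generator for reproducibility
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}


def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more groups of scores.

    Assumes every group is non-empty and the within-group variance is not zero.
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]
    k = len(groups)                    # number of groups
    n = len(all_scores)                # total number of observations

    ss_between = sum(
        len(group) * (m - grand_mean) ** 2
        for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        (score - m) ** 2
        for group, m in zip(groups, group_means)
        for score in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Hypothetical participant identifiers.
    participants = [f"P{i:02d}" for i in range(1, 11)]
    groups = assign_groups(participants, seed=2024)
    print(groups)

    # Made-up task scores, used only to exercise the function.
    control_scores = [1, 2, 3]
    experimental_scores = [2, 3, 4]
    print(one_way_anova_f(control_scores, experimental_scores))
```

With exactly two groups, this F statistic equals the square of the pooled two-sample t statistic. In practice, a statistics package would also report the associated p-value and check the assumptions of the test.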
Task performance metrics such as accuracy, speed, and subjective ratings of comprehension will be recorded or calculated. Clickstream traces and eye-tracking data will be collected to contextualise the analysis. A statistical analysis (ANOVA) will be performed to compare the performance of the experimental and control groups on the collected task performance metrics, thereby evaluating the impact of the novel PM visualizations on their comprehensibility for process analysts.

3. Related work

Until now, only a limited number of works in PM literature have considered the perspective of the user of PM algorithms. For instance, Zerbato et al. [13] identify behavioral patterns of PM users (amongst which one process analyst from industry) that emerge during the early stages of a PM analysis. While this paper takes the perspective of PM users, it does not provide specific insights into how process analysts comprehend PM visualizations.

Prior research from the process modeling domain has shown that a variety of factors have an impact on process model comprehension, such as the modeling language used, the visual layout and the size of the model [14]. However, it is not clear to what extent these findings can be transferred to PM, as PM visualizations differ from traditional process models. For example, PM visualizations are embedded in interfaces that offer interaction functionalities, whereas traditional process models are less dynamic [2, 15].

In more recent years, synergies between PM and the research fields of visual analytics and information visualization have been explored. For example, Rehse et al. [16] use a visual analytics framework to assess how current PM tools visualize the output of conformance checking, which is a very common class of PM. This assessment highlights the need to develop and evaluate novel PM visualization idioms (i.e. forms or types of visualizations) for conformance checking output [16].
Another recent paper in this stream of literature is Yeshchenko and Mendling [17], which reviews literature to map contributions on the visualization of event sequences from information visualization and PM. Future research can build upon their initial observations to create novel PM visualizations. Additionally, one systematic study of work practices in PM focuses on both visualizations and tasks. The researchers identified that problems beyond the control-flow perspective, particularly from the case perspective, are often analyzed using general-purpose visualization techniques. These techniques may lack the sophistication needed to support deeper analytical insights [18].

As mentioned in previous studies, analysts face a cognitive challenge in interpreting PM outputs. In this context, the work of Sorokina et al. [19] contributes by proposing the PEM4PPM model, a theoretical foundation that explains the cognitive processes of individual analysts during the process of process mining (PPM). Cognitive steps such as task understanding, goal refinement, data exploration and hypothesis testing play important roles in shaping how analysts approach PM tasks. As PM visualizations are central to this process, there is a growing recognition that improving their comprehensibility can significantly enhance analysts’ performance and the effectiveness of PM tools in organizational settings [19].

4. Conclusion

In this doctoral research, we aim to understand and improve the comprehensibility of PM visualizations. First, we provide a highly relevant knowledge contribution by proposing an assessment framework to assess the comprehensibility of PM visualizations. Having a better understanding of PM visualization comprehension allows for furthering the impact of PM in organizations as process analysts can make more sense of the output of PM algorithms.
Second, we design novel visualizations and empirically investigate whether they improve comprehensibility.

References

[1] M. Dumas, M. La Rosa, J. Mendling, H. Reijers, Fundamentals of Business Process Management, Springer, 2018. doi:10.1007/978-3-662-56509-4.
[2] W. Aalst, Process Mining: Data Science in Action, Springer, 2016. doi:10.1007/978-3-662-49851-4.
[3] T. Gschwandtner, Visual analytics meets process mining: Challenges and opportunities, Lecture Notes in Business Information Processing 244 (2017) 142–154. doi:10.1007/978-3-319-53435-0_7.
[4] P. Badakhshan, B. Wurm, T. Grisold, J. Geyer-Klingeberg, J. Mendling, J. v. Brocke, Creating business value with process mining, The Journal of Strategic Information Systems 31 (2022) 2023.
[5] M. Eck, X. Lu, S. Leemans, W. Aalst, PM²: A process mining project methodology, Lecture Notes in Computer Science (2015) 297–313. doi:10.1007/978-3-319-19069-3_19.
[6] D. Keim, F. Mansmann, J. Schneidewind, J. Thomas, H. Ziegler, Visual analytics: Scope and challenges, Lecture Notes in Computer Science 4404 (2008) 76–90. doi:10.1007/978-3-540-71080-6_6.
[7] N. Martin, D. Fischer, G. Kerpedzhiev, K. Goel, S. Leemans, M. Roeglinger, W. Aalst, M. Dumas, M. La Rosa, M. Wynn, Opportunities and challenges for process mining in organizations: Results of a Delphi study, Business & Information Systems Engineering 63 (2021) 1–17. doi:10.1007/s12599-021-00720-0.
[8] J. Mendling, D. Djurica, M. Malinova Mandelburger, Cognitive effectiveness of representations for process mining, Lecture Notes in Computer Science (2021) 17–22. doi:10.1007/978-3-030-85469-0_2.
[9] I. Klerings, S. Robalino, A. Booth, C. Escobar-Liquitay, I. Sommer, G. Gartlehner, D. Devane, S. Waffenschmidt, Rapid reviews methods series: Guidance on literature search, BMJ Evidence-Based Medicine 28 (2023). doi:10.1136/bmjebm-2022-112079.
[10] D. Gioia, K. Corley, A. Hamilton, Seeking qualitative rigor in inductive research, Organizational Research Methods 16 (2013) 15–31. doi:10.1177/1094428112452151.
[11] K. Charmaz, Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis, volume 1, SAGE Publications Ltd, 2006.
[12] C. Ware, Information Visualization: Perception for Design: Second Edition, Morgan Kaufmann, 2004.
[13] F. Zerbato, P. Soffer, B. Weber, Initial insights into exploratory process mining practices, Lecture Notes in Business Information Processing (2021) 145–161. doi:10.1007/978-3-030-85440-9_9.
[14] K. Figl, Comprehension of procedural visual business process models: a literature review, Business & Information Systems Engineering 59 (2017) 41–67.
[15] J. S. Yi, Y. a. Kang, J. Stasko, J. Jacko, Toward a deeper understanding of the role of interaction in information visualization, IEEE Transactions on Visualization and Computer Graphics 13 (2007) 1224–1231. doi:10.1109/TVCG.2007.70515.
[16] J.-R. Rehse, L. Pufahl, M. Grohs, L.-M. Klein, Process mining meets visual analytics: The case of conformance checking, Proceedings of the 56th Hawaii International Conference on System Sciences (2022). doi:10.48550/arXiv.2209.09712.
[17] A. Yeshchenko, J. Mendling, A survey of approaches for event sequence analysis and visualization, Information Systems 120 (2023) 102283.
[18] C. Klinkmüller, R. Müller, I. Weber, Mining process mining practices: An exploratory characterization of information needs in process analytics, Lecture Notes in Computer Science (2019) 322–337.
[19] E. Sorokina, P. Soffer, I. Hadar, U. Leron, F. Zerbato, B. Weber, PEM4PPM: A cognitive perspective on the process of process mining, Lecture Notes in Computer Science (2023) 465–481. doi:10.1007/978-3-031-41620-0_27.