Sub-Model Freezing During Incremental Process Discovery in Cortado (Extended Abstract)

Daniel Schuster
Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, Germany
RWTH Aachen University, Aachen, Germany
daniel.schuster@fit.fraunhofer.de

Sebastiaan J. van Zelst
Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, Germany
RWTH Aachen University, Aachen, Germany
sebastiaan.van.zelst@fit.fraunhofer.de

Wil M. P. van der Aalst
Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, Germany
RWTH Aachen University, Aachen, Germany
wvdaalst@pads.rwth-aachen.de

Abstract: Conventional process discovery algorithms are fully automated and work as a black box from the user's perspective. Event data is fed into the discovery algorithm, and a process model is returned. Interactive process discovery breaks this black-box approach by involving the user during the discovery, i.e., it adopts the principles of hybrid intelligence in process discovery. The central idea is to exploit the user's knowledge of the process to be discovered within the discovery phase to obtain better models. The software tool Cortado allows for the incremental discovery of a process model based on user-selected process behavior. In this paper, we present the implementation of sub-model freezing, i.e., a novel form of user interaction during incremental process discovery, in Cortado.

Index Terms: process mining, interactive process discovery, process models, hybrid intelligence

I. INTRODUCTION

Process discovery, a key discipline of process mining [1], comprises algorithms that (automatically) learn a process model from event data. Since event data often have quality issues and are incomplete, i.e., only a fragment of the possible process behavior is captured, conventional process discovery algorithms often yield low-quality process models. To address these challenges, the field of interactive process discovery has emerged. The key idea is to utilize domain knowledge about the process to be discovered, in addition to the available event data, to discover process models of superior quality.

In [2], we introduced the first version of the software tool Cortado. Following an incremental process discovery approach, Cortado enables the user to gradually discover a process model from user-selected process behavior, i.e., event data. This incremental approach to process discovery allows the user to influence the discovery of a process model interactively. For a detailed description of Cortado's functionality, we refer to [2]. In recent work [3], we presented a novel form of user interaction in the context of incremental process discovery: sub-model freezing. In this paper, we present the realization of sub-model freezing within Cortado.¹

¹ Sub-model freezing is available from version 1.3.0, downloadable from https://cortado.fit.fraunhofer.de/

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

II. SUB-MODEL FREEZING

In this section, we first outline the concept of sub-model freezing within incremental process discovery. Afterwards, we focus on the implementation of this technique in Cortado.

The theoretical foundations of sub-model freezing are introduced in [3]. Figure 1 visualizes the conceptual idea. Starting from an event log and an initial model M, which can also be discovered by Cortado, a user incrementally selects process behavior, i.e., trace variants, that is not yet described by the process model M. Additionally, the user has the option to freeze sub-models of M. For example, as indicated in Figure 1, the user freezes two sub-models of M, i.e., M1 and M2. By freezing sub-models of M, the freezing-enabled incremental discovery approach implemented in Cortado ensures that the incrementally discovered process model M′ contains M1 and M2. Without marking M1 and M2 as frozen, there is no guarantee that these sub-models will be present in the new model M′ in identical form. Note that the incrementally discovered process model M′ describes the selected trace variant plus all previously selected trace variants. After one iteration, the user can incrementally add further trace variants to the model under construction. The incrementally discovered model M′ is used as the input of the next iteration, visualized by the dotted arc from M′ to M in Figure 1. Further, the user can change which sub-models are frozen in each iteration.

[Figure 1: The user (process modeler) incrementally selects process behavior (a trace) from the event data that is not yet described by the process model M "under construction" and incrementally marks process model parts as "frozen" (sub-models M1 and M2). Cortado's freezing-enabled incremental process discovery returns a modified process model M′ that describes the selected process behavior and the previously added behavior and contains the frozen sub-models M1 and M2.]
Fig. 1. Conceptual idea of sub-model freezing during incremental process discovery. Figure adapted from [3].

[Figure 2: Four screenshots of Cortado.]
(a) Without freezing: Initial model that describes the first two variants (indicated by green check-marks, which are located to the left of each variant).
(b) Without freezing: Process model after the third variant has been incrementally added to the model shown in Figure 2a.
(c) With freezing: Initial model as shown in Figure 2a with frozen process model part (frozen subtree is marked blue).
(d) With freezing: Process model after the third variant has been incrementally added to the model shown in Figure 2c.
Fig. 2. Example of an incremental process discovery step, i.e., adding a non-fitting trace variant to a model under construction, with/without freezing.
In Figure 2, we present screenshots of Cortado that demonstrate the described incremental process discovery approach once without freezing (Figures 2a and 2b) and once with freezing (Figures 2c and 2d). In both cases, we use the same event data and the same initial model that describes the first two variants from the variant explorer, cf. Figures 2a and 2c. Note that Cortado uses process trees as a process model formalism. We refer to [4] for an introduction to process trees.

Figure 2b shows the process tree after adding the third variant from the variant explorer to the initial process tree. We observe that the algorithm added a loop on the activity W_Afhandelen leads. In Figure 2c, we see the same initial process tree where the user marked a subtree as frozen, highlighted in blue. After adding the third variant to the initial process tree with the frozen subtree, we observe that the resulting process tree (Figure 2d) differs from the one obtained without freezing. This time, the algorithm added an optional activity labeled W_Afhandelen leads that is executed before the frozen subtree. Note that the frozen subtree has not been altered by the algorithm, in contrast to the execution without freezing (Figure 2b). Further, note that both discovered process trees, i.e., with and without freezing, describe the three selected trace variants shown; however, they differ structurally.

III. CONCLUSION

In this paper, we presented the realization of sub-model freezing, a novel form of user interaction within incremental process discovery, in Cortado. Further, we highlighted the difference between using and not using the freezing option with an example.

REFERENCES

[1] W. M. P. van der Aalst, Process Mining - Data Science in Action, Second Edition. Springer, 2016.
[2] D. Schuster, S. J. van Zelst, and W. M. P. van der Aalst, "Cortado - an interactive tool for data-driven process discovery and modeling," in Application and Theory of Petri Nets and Concurrency, ser. Lecture Notes in Computer Science, vol. 12734. Springer, 2021.
[3] D. Schuster, S. J. van Zelst, and W. M. P. van der Aalst, "Freezing sub-models during incremental process discovery," in Conceptual Modeling, ser. Lecture Notes in Computer Science, vol. 13011. Springer, 2021.
[4] D. Schuster, S. J. van Zelst, and W. M. P. van der Aalst, "Incremental discovery of hierarchical process models," in Research Challenges in Information Science, ser. Lecture Notes in Business Information Processing, vol. 385. Springer, 2020.