=Paper= {{Paper |id=Vol-3098/demo_205 |storemode=property |title=Interactive Process Drift Detection: A Framework for Visual Analysis of Process Drifts (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-3098/demo_205.pdf |volume=Vol-3098 |authors=Denise Maria Vecino Sato,Rafaela Mantovani Fontana,Jean Paul Barddal,Edson Emilio Scalabrin }} ==Interactive Process Drift Detection: A Framework for Visual Analysis of Process Drifts (Extended Abstract)== https://ceur-ws.org/Vol-3098/demo_205.pdf
Interactive Process Drift Detection: A Framework for
Visual Analysis of Process Drifts (Extended Abstract)
Denise Maria Vecino Sato, Graduate Program in Informatics, Pontifícia Universidade Católica do Paraná and Instituto Federal do Paraná, Curitiba, Brazil, 0000-0003-1117-7082
Rafaela Mantovani Fontana, Department of Professional and Technological Education, Universidade Federal do Paraná, Curitiba, Brazil, 0000-0001-6350-4167
Jean Paul Barddal, Graduate Program in Informatics, Pontifícia Universidade Católica do Paraná, Curitiba, Brazil, 0000-0001-9928-854X
Edson Emilio Scalabrin, Graduate Program in Informatics, Pontifícia Universidade Católica do Paraná, Curitiba, Brazil, 0000-0002-3918-179



Abstract: Interactive Process Drift Detection (IPDD) is a framework for visual analysis of process drifts. A process drift indicates that a change in the process model occurred at some point in time. IPDD first generates process models for subparts of the event log using a sliding window approach. Then, it detects the drifts by evaluating similarity metrics calculated between adjacent process models; a difference in any of the metrics indicates a drift. The current implementation of IPDD generates the process models using the directly-follows graph (DFG) and applies two metrics: nodes similarity and edges similarity. The user interface shows the drifts in the process models over time, allowing the user to visually understand the model changes. The user can also easily change the hyperparameters for the analysis and verify the results on the interface. In addition, the user interface allows the user to evaluate the detected drifts by calculating the F-score metric, which is useful when using artificial datasets. The underlying idea is to ease the choice of a “good” value for the hyperparameter configuration, which is critical for almost any drift detection tool.

Keywords: process drift detection, visual process analysis, process drift, concept drift

I. INTRODUCTION

Process mining aims at creating valuable knowledge about business processes from the event data recorded by information systems. Usually, process mining techniques assume the processes to be in a steady state, i.e., the event data contains information from a single version of the process. However, this assumption does not reflect the reality of business processes, which constantly adapt to new regulations, improve performance, or enhance user experience. The situation where a process changes while being analyzed is named concept drift or process drift [1].

A change in the process can affect the ongoing instances suddenly or gradually. A sudden drift occurs when all the ongoing instances start to follow the new process model immediately. In a gradual drift, there is a period of time during which instances from both versions of the process model coexist. Process drifts can also follow recurrent or incremental patterns. A recurrent drift indicates that a replaced process model can occur again. In an incremental drift, minor changes to the process model are implemented over some time. Sudden, gradual, incremental, and recurring are considered process drift types [2]. A process drift can also affect one or more perspectives of the process model. However, the most common perspective considered in the available tools is the control flow. Identifying and understanding process drifts is relevant for business analysts because it improves their knowledge about the processes and enhances the quality of process mining analysis. Even when analysts perform offline process mining analysis, process drift detection can provide benefits, e.g., avoiding overly complex discovered process models, improving conformance checking, or enhancing processes based on their current state.

Different tools for detecting process drifts from event logs have been proposed, but the accuracy of the detection is usually related to the hyperparameter configuration [3]. The ProDrift plugin in Apromore [4], [5] and the ConceptDrift plugin in ProM [2] can detect different types of drifts (sudden and gradual); however, their focus is the change point and information about it. The user has to complement the drift analysis with more exploratory mining, slicing the event log based on the reported change points to understand the evolution of the process. A more recent tool, named VDD [6], detects the four types of drifts and allows the user to explore the drift using the process model. However, the tool is based on constraints mined over Declare models, and it mixes DFGs with the constraints to explain the dynamics of the process over time. None of the identified tools calculates an accuracy metric in the user interface.

Tuning the hyperparameter configuration to enhance the detection accuracy poses a challenge for the proposed tools because the different approaches are all affected by the hyperparameter configuration. IPDD aims to overcome this issue by providing an interactive user interface where the user can quickly change the parameters and visually evaluate the results. The tool provides visual process drift detection analysis by showing the distinct process models over time, in what we can consider a “replay” of the process models. For each process model, IPDD also provides information about the differences against the previous model, enhancing the analysis. IPDD’s current implementation detects sudden drifts in the control-flow perspective offline, which is a limitation.

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

II. IPDD MAIN FEATURES

The IPDD framework detects process drifts by analyzing the event log using a sliding window strategy. First, the user defines the window size based on the number of traces, and IPDD splits the log using tumbling windows. Then, it generates a model for each window and calculates the similarity metrics between adjacent models. The idea is to compare models mined from adjacent time slots using similarity metrics; when they are not similar, IPDD identifies a drift and characterizes the change based on the information provided by the metric.
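The windowed comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IPDD's implementation: the function names (`tumbling_windows`, `mine_dfg`, `similarity`, `detect_drifts`) are hypothetical, the DFG is built with plain sets rather than a process mining library, the similarity follows the form of Eq. 1 and Eq. 2 in this section, and any similarity below 1 is treated as "not similar".

```python
from typing import List, Sequence, Set, Tuple

Trace = Sequence[str]      # one trace = ordered list of activity names
Edge = Tuple[str, str]     # directly-follows pair (a, b)


def tumbling_windows(traces: List[Trace], size: int) -> List[List[Trace]]:
    """Split the log into consecutive, non-overlapping windows of `size` traces."""
    return [traces[i:i + size] for i in range(0, len(traces), size)]


def mine_dfg(window: List[Trace]) -> Tuple[Set[str], Set[Edge]]:
    """Collect the activities (nodes) and directly-follows pairs (edges) of a window."""
    nodes: Set[str] = set()
    edges: Set[Edge] = set()
    for trace in window:
        nodes.update(trace)
        edges.update(zip(trace, trace[1:]))
    return nodes, edges


def similarity(p: Set, q: Set) -> float:
    """2 * |common| / (|p| + |q|); defined as 1.0 when both sets are empty."""
    total = len(p) + len(q)
    return 2 * len(p & q) / total if total else 1.0


def detect_drifts(traces: List[Trace], size: int) -> List[int]:
    """Return the indices of windows whose DFG differs from the previous window."""
    models = [mine_dfg(w) for w in tumbling_windows(traces, size)]
    drifts = []
    for i in range(1, len(models)):
        ns = similarity(models[i - 1][0], models[i][0])  # nodes similarity
        es = similarity(models[i - 1][1], models[i][1])  # edges similarity
        # Assumption: any value below 1 counts as a difference between models.
        if ns < 1.0 or es < 1.0:
            drifts.append(i)
    return drifts


# Toy log: the order of activities "b" and "c" swaps after the fourth trace.
log = [["a", "b", "c"]] * 4 + [["a", "c", "b"]] * 4
print(detect_drifts(log, 2))  # window index 2 is the first one mined from the new behavior
```

In this toy log the set of activities never changes, so only the edges similarity drops. This shows why comparing both nodes and edges is useful.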
IPDD’s current implementation mines the DFGs (process maps) from the traces in the time slots using PM4Py [7]. Then, the adjacent derived graphs are compared using the nodes similarity (NS) and edges similarity (ES) metrics. NS is calculated using Eq. 1 [2], where $n_p$ and $n_q$ are the number of activities in the process maps $P$ and $Q$ (derived from adjacent windows), respectively, and $n_{cs}$ indicates the number of common activities between $P$ and $Q$. ES is calculated using Eq. 2, similar to NS: $e_p$ is the number of edges in $P$, $e_q$ is the number of edges in $Q$, and $e_{cs}$ indicates the number of common edges in both $P$ and $Q$.

$$NS = \frac{2 \cdot n_{cs}}{n_p + n_q} \tag{1}$$

$$ES = \frac{2 \cdot e_{cs}}{e_p + e_q} \tag{2}$$

IPDD calculates both metrics, and if one or both is less than 1, it marks the window as a drift. To evaluate the detected drifts, the F-score metric uses the True Positives (TP), False Positives (FP), and False Negatives (FN). A TP is a window reported as a drift that contains a trace the user supplied as a real drift; an FP is counted when a window reported as a drift does not contain any trace supplied as a real drift; and an FN is counted when a window not reported as a drift contains a trace supplied as a real drift.

Fig. 1 shows the tool’s main screen, which allows users to easily change parameters and visually check the results. The parameter configuration panel is on top, where users must define the hyperparameter configuration before starting the analysis. After clicking on “Analyze Process Drifts”, users can follow the current status in the “Status” area below the parameters panel. When the analysis finishes, IPDD shows the process drift analysis panel. There is a timeline of windows in the upper part of this panel, where users can click to inspect the process model of a specific window. The similarity metrics information (on the left side) is updated for each window selected, providing information about the differences between the current and the previous model. In the example, the ES indicates a drift that is characterized by two added edges. After IPDD finishes the analysis, the user can open the evaluation panel and calculate the F-score metric by clicking “Evaluate results”. The IPDD framework is described in more detail in [8]. Its source code is available in a public repository¹, the deployed application is available in a public node², and a demo video is available on YouTube³.

Fig. 1. Screenshot from the main window.

III. CASE STUDIES

Several authors have proposed tools for process drift detection. However, the methods are usually sensitive to the hyperparameter configuration. Moreover, almost all approaches apply windowing strategies, and defining a “good” value for the window size is still a challenge. The adaptive approaches also have some drawbacks, since other parameters affect the detected drifts [3]. Our IPDD approach addresses this challenge visually by giving users the freedom to check different hyperparameter configurations.

The tool was presented to our research group in Curitiba (Brazil), including researchers from three post-graduate programs (Informatics, Production and Systems Engineering, and Health Technology). First, we conducted a usability assessment for redesigning the user interface. Currently, we are working on a case study in a manufacturing scenario. The idea is to detect drifts in the temporal perspective of the process (sojourn time). The information about drifts will be used as input for planning the maintenance intervals on the production line.

ACKNOWLEDGMENT

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES) - Finance Code 001, Grant No.: 88887.321450/2019-00.

REFERENCES

[1] W. M. P. Van der Aalst et al., “Process Mining Manifesto,” in International Conference on Business Process Management BPM 2011: Business Process Management Workshops, 2011, vol. 99, pp. 169–194.
[2] R. P. J. C. Bose, W. M. P. van der Aalst, I. Zliobaite, and M. Pechenizkiy, “Dealing With Concept Drifts in Process Mining,” IEEE Trans. Neural Networks Learn. Syst., vol. 25, no. 1, pp. 154–171, Jan. 2014.
[3] S. M. Vecino, D. F. Cristiana, B. Paul, and S. Emilio, “A Survey on Concept Drift in Process Mining,” ACM Comput. Surv., vol. 54, no. 9, pp. 1–38, Oct. 2021.
[4] A. Maaradji, M. Dumas, M. La Rosa, and A. Ostovar, “Fast and Accurate Business Process Drift Detection,” in International Conference on Business Process Management BPM 2016: Business Process Management, 2015, pp. 406–422.
[5] A. Maaradji, M. Dumas, M. La Rosa, and A. Ostovar, “Detecting Sudden and Gradual Drifts in Business Processes from Execution Traces,” IEEE Trans. Knowl. Data Eng., vol. 29, no. 10, pp. 2140–2154, 2017.
[6] A. Yeshchenko, C. Di Ciccio, J. Mendling, and A. Polyvyanyy, “Visual Drift Detection for Sequence Data Analysis of Business Processes,” IEEE Trans. Vis. Comput. Graph., pp. 1–1, 2021.
[7] A. Berti and S. van Zelst, “Process Mining for Python (PM4Py): Bridging the Gap Between Process- and Data Science.” 2019.
[8] D. M. V. Sato, J. P. Barddal, and E. E. Scalabrin, “Interactive Process Drift Detection Framework,” in International Conference on Artificial Intelligence and Soft Computing (ICAISC), 2021, pp. 192–204.

¹ https://github.com/denisesato/InteractiveProcessDriftDetectionFW
² http://visual-pro-drift.com.br:8050/
³ Demonstration video at: https://youtu.be/8feKd6jr8Gs
