Conformance Checking Beyond Replay Methods: Closing The Gap To Real-World Adoption

Eduardo Goulart Rocha 1,2,*
1 Celonis Labs GmbH, Munich, Germany
2 Process And Data Science (PADS) Chair - RWTH Aachen University, Aachen, Germany

Abstract
Conformance checking is a key subfield of process mining. Despite its importance, advanced (procedural) conformance checking methods see relatively little adoption in industry, which can be attributed to two limitations of state-of-the-art techniques: first, the poor scalability resulting from their inherent worst-case exponential nature; second, the high cognitive load needed to interpret their results, which renders them inaccessible to non-experts. This Ph.D. project aims to close these gaps by developing conformance checking techniques suitable for industrial settings. Our research is structured into two workstreams. First, we investigate efficient techniques for the case where one only needs a single number to quantify the degree of discrepancy between an event log and a process model. In parallel, we investigate how to generate conformance diagnostics that can be easily understood without technical enablement. In both workstreams, progress has been made in the form of accepted papers that validate the problem and our proposed solutions.

Keywords
Process Mining, Conformance Checking, Conformance Diagnostics

1. Introduction

Conformance checking is concerned with identifying and quantifying discrepancies between event logs and process models. It is a central task in process mining, being an enabler for tasks such as process discovery (by measuring the degree of discrepancy between the log and the model) and enhancement (by correlating patterns of deviation with KPI performance). Despite that, state-of-the-art replay-based conformance checking techniques such as token-based replay [1] and trace alignments [2] are not widely adopted in industry, with most commercial tools lacking support for them. This Ph.D.
project is motivated by problems encountered in industry when trying to productize conformance checking techniques. In particular, we learn from two key limitations of existing methods (with a special focus on trace alignments) which, in our experience, explain their poor adoption: first, their poor scalability, and second, the hard-to-digest diagnostics they produce. Our research is split into two workstreams, depending on the conformance task to be solved.

ICPM 2024 Doctoral Consortium, October 14–18, 2024, Kongens Lyngby, Denmark
* Corresponding author.
e.goulartrocha@celonis.com (E. Goulart Rocha)
ORCID: 0009-0000-1184-1188 (E. Goulart Rocha)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Task 1 (Computing a Metric) The simplest conformance checking task is to quantify the degree of discrepancy between an event log and a process model as a single-number metric. On the one hand, conformance metrics are expected to offer a series of quality guarantees such as determinism, monotonicity, and robustness to partial mismatches. On the other hand, applications such as optimization-based process discovery [3] and online monitoring of the conformance rate require computationally efficient metrics. Unfortunately, none of the existing techniques meet both criteria. As shown in [4, 5], most existing techniques do not provide sufficient quality guarantees. More critically, in terms of scalability, existing techniques do not scale to the very large event logs commonly found in industry. For comparison, while the largest BPI Challenge dataset (2016) contains fewer than 20 million events (and this is arguably one of the least studied public datasets), in industry
one often encounters event logs with more than one billion (sometimes more than ten billion) events. Furthermore, most techniques do not offer runtime guarantees, meaning that it is hard to predict when a computation will succeed and that one must resort to timeouts instead, which leads to poor customer experience. This motivates our first research question:

RQ 1: How to efficiently measure the conformance rate of very large event logs while providing quality and runtime guarantees?

Table 1
Research agenda and open challenges

RQ   Task                                     Status / Open Challenges
1.1  Introduce method for process trees       Published [9]
1.2  Extend to free-choice Petri nets         Improve the approximation
1.3  Formalize the stochastic problem         Accepted (to appear) [10]
1.4  Extend to stochastic process trees       Decompose stochastic models
2.1  Define the diagnostics framework         Published [11]
2.2  Discover data and time patterns          Scalability and decidability
2.3  Discover arbitrary patterns              Discovery and verbalization
2.4  Minimize arbitrary patterns              Published [12]

Task 2 (Providing Diagnostics) A more advanced conformance checking task is to identify deviation patterns that provide insights into the nature of non-conformance. For this task, alignment-based techniques [2] are considered state-of-the-art. However, alignments suffer from multiple shortcomings:

1. The produced diagnostics are too low-level, in the form of insertion and deletion operations, which do not explain the nature of the deviation.
2. The same insertion/deletion operation admits different interpretations depending on the context.
3. Alignments are "non-deterministic": a trace often admits multiple optimal alignments, inducing different diagnostics.

Correctly interpreting the result of alignment techniques requires awareness of the above limitations, raising the entry bar for non-experts. This problem has been acknowledged in the literature [6, 7, 8], but not thoroughly researched.
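The non-determinism of optimal alignments (shortcoming 3) can be illustrated with a minimal, self-contained sketch. The two-run toy model and unit edit costs below are illustrative assumptions, not the alignment algorithm of [2]:

```python
def edit_distance(trace, run):
    """Levenshtein distance: minimal number of unit-cost insertions,
    deletions, and substitutions turning the trace into the model run."""
    d = [[0] * (len(run) + 1) for _ in range(len(trace) + 1)]
    for i in range(len(trace) + 1):
        d[i][0] = i
    for j in range(len(run) + 1):
        d[0][j] = j
    for i in range(1, len(trace) + 1):
        for j in range(1, len(run) + 1):
            cost = 0 if trace[i - 1] == run[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # delete from trace
                          d[i][j - 1] + 1,      # insert model activity
                          d[i - 1][j - 1] + cost)
    return d[len(trace)][len(run)]

# A toy model accepting exactly two runs, and an observed trace.
model_runs = [("a", "b", "c"), ("a", "c", "b")]
trace = ("a",)

costs = {run: edit_distance(trace, run) for run in model_runs}
best = min(costs.values())
optimal = [run for run, cost in costs.items() if cost == best]
```

Both runs tie at minimal cost 2, so an alignment engine may report either "b then c are missing" or "c then b are missing": two different diagnostics for the same trace.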
This motivates our second research question:

RQ 2: How to identify patterns of non-conformance that are easily interpretable by non-expert users?

2. Research Agenda and Methodology

Our research agenda is summarized in Table 1.

2.1. Efficiently Measuring Conformance Via Subtraces

To answer the first research question, we build on the technique introduced in [5]. The proposed conformance metrics are based on comparing the Markovian abstractions of the event log and the process model. The technique offers a series of quality guarantees (see [5]). Furthermore, it brings two advantages in terms of scalability. First, it avoids computing the synchronous product of both artifacts. Second, the Markovian abstraction of an event log can be computed with a linear pass over the log, which scales to very large event logs. However, the method proposed in [5] is worst-case exponential in the size of the process model. Our research therefore focuses on how to efficiently compute this abstraction for large process models. A first milestone (1.1), a compositional polynomial-time approach for process trees, has been reached [9]. Additionally, we extended (1.3) the abstraction to the stochastic perspective [10].

Extend to free-choice Petri nets (1.2) and stochastic process trees (1.4): For free-choice Petri nets, we can already provide a polynomial-time approximation method based on decomposing the net, and we are currently working on improving the approximation. For stochastic process trees, we might need different approaches, since compositional methods do not work well with the stochastic perspective.

Evaluation: This research focuses on a well-studied problem in process mining. We resort to standard evaluation setups consisting of real-world and synthetic datasets to measure the runtime of the approach and to compare the induced model ranks with existing state-of-the-art techniques.
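To make the linear-pass claim concrete, the log side of such an abstraction can be sketched as a k-gram (subtrace) frequency count. This is a simplified stand-in for the Markovian abstraction of [5], not its exact definition; the padding symbols are illustrative:

```python
from collections import Counter

def log_kgram_abstraction(log, k=2):
    """Single linear pass over the log: collect the multiset of k-grams
    (contiguous subtraces of length k) of each trace, padded with
    artificial start ('+') and end ('-') symbols."""
    grams = Counter()
    for trace in log:
        padded = ["+"] + list(trace) + ["-"]
        for i in range(len(padded) - k + 1):
            grams[tuple(padded[i:i + k])] += 1
    return grams

log = [("a", "b", "c"), ("a", "c")]
abstraction = log_kgram_abstraction(log, k=2)
```

The cost is linear in the total number of events, independent of the model, which is what makes the log side tractable even for billions of events.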
The scalability of the approach must be measured across multiple dimensions: the size of (the state space of) the model, the size of the log, the number of distinct event types, and the degree of discrepancy between the model and the log.

2.2. Mining Behavioral Patterns for Conformance Diagnostics

To answer the second research question, we first observe that the diagnostics provided by declarative methods are often more understandable to non-expert users; however, declarative models are unsuitable for modeling certain types of processes [13, 14]. Our proposed solution is to combine both paradigms. The approach is sketched in Figure 1. First, (control-flow) declarative constraints are discovered from a user-provided procedural process model (step A), producing a declarative model that is "equivalent" to the original model. Next, this model is minimized to remove redundancies (step B) and used to verify the event log (step C).

[Figure 1: Proposed diagnostics framework. Procedural model and constraint templates feed step A (discover declarative constraints); the resulting declarative model passes through step B (constraint minimization) and step C (verify log), yielding diagnostics. The set of constraint templates is provided by the tool vendor.]

Using the discovered constraints to verify the event log ensures that the diagnostics are on a higher level and, thus, more understandable. Furthermore, the approach enjoys good quality properties such as determinism and monotonicity of the generated diagnostics. Our first work [11] introduces the framework above (2.1) and demonstrates how to use it to derive understandable conformance diagnostics. The next steps in this research involve refining each framework step. One of these steps, extending the constraint minimization step (B) to arbitrary models (2.4), is already completed [12].

Discover Time and Data Patterns (2.2): A natural extension of the framework above is to consider the time and data perspectives [15]. However, this raises feasibility questions.
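As an illustration of what a time-perspective constraint looks like, consider a bounded-response template, "every occurrence of a must be followed by b within delta time units". The template and trace encoding below are illustrative assumptions, not part of the frameworks cited above:

```python
def check_bounded_response(trace, a, b, delta):
    """Time-aware response constraint: every occurrence of activity a
    must be followed by an occurrence of b at most delta time units
    later. A trace is a sequence of (activity, timestamp) pairs."""
    for i, (act, ts) in enumerate(trace):
        if act == a:
            satisfied = any(act2 == b and 0 <= ts2 - ts <= delta
                            for act2, ts2 in trace[i + 1:])
            if not satisfied:
                return False
    return True

trace_ok = [("a", 0), ("b", 3)]
trace_late = [("a", 0), ("b", 10)]
# With delta = 5, the first trace satisfies the constraint, the second does not.
```

Verifying one such constraint per trace is cheap; the feasibility concerns arise when reasoning about whole sets of such constraints, as discussed next.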
Reasoning tasks involving the data perspective are computationally expensive, and reasoning about time constraints can quickly become undecidable [16, 17].

Discover Arbitrary Patterns (2.3): Currently, the framework is limited to a set of constraint templates (such as the set of DECLARE patterns), which might not best capture a deviation. We would like to discover arbitrary behavioral patterns, i.e., arbitrary logical formulae beyond a fixed set of templates. For that, we must solve two challenges: first, we must devise an algorithm to discover arbitrary logical formulae (step A); second, we must devise a mechanism to verbalize the discovered formulae to the user (step C). We believe that the latter can be tackled to some extent using NLP techniques such as rule-based methods or large language models. Therefore, we focus our attention on the first problem, for which our current idea is to extend existing methods [18] to work in our setting. In parallel, we are also evaluating ad hoc approaches that work by analyzing the structure of the process models [19].

Evaluation: Our evaluation will be primarily qualitative. To validate that our diagnostics are understandable, we plan to conduct a user study. However, user studies suffer from reproducibility issues. Therefore, we also aim to develop a set of reference process models for public datasets and demonstrate our method on top of them. With that, we aim to develop a "benchmark" for conformance diagnostics.

Acknowledgments

This Ph.D. project is a cooperation between RWTH Aachen University and Celonis Labs GmbH. It is supervised by Prof. Dr. Ir. Wil van der Aalst and fully funded by Celonis.

References

[1] A. Rozinat, W. M. P. van der Aalst, Conformance checking of processes based on monitoring real behavior, Inf. Syst. 33 (2008) 64–95. doi:10.1016/J.IS.2007.07.001.
[2] A. Adriansyah, B. F. van Dongen, W. M. P.
van der Aalst, Conformance checking using cost-based fitness analysis, in: IEEE 15th International Enterprise Distributed Object Computing Conference, IEEE Computer Society, 2011, pp. 55–64. doi:10.1109/EDOC.2011.12.
[3] J. C. A. M. Buijs, Flexible evolutionary algorithms for mining structured process models, Technische Universiteit Eindhoven (2014).
[4] N. Tax, X. Lu, N. Sidorova, D. Fahland, W. M. P. van der Aalst, The imprecisions of precision measures in process mining, Inf. Proc. Lett. 135 (2018) 1–8. doi:10.1016/J.IPL.2018.01.013.
[5] A. Augusto, A. Armas-Cervantes, R. Conforti, M. Dumas, M. L. Rosa, Measuring fitness and precision of automatically discovered process models: A principled and scalable approach, IEEE Trans. Knowl. Data Eng. 34 (2022) 1870–1888. doi:10.1109/TKDE.2020.3003258.
[6] A. Adriansyah, B. F. van Dongen, N. Zannone, Controlling break-the-glass through alignment, in: International Conference on Social Computing, SocialCom 2013, IEEE Computer Society, 2013, pp. 606–611. doi:10.1109/SOCIALCOM.2013.91.
[7] A. S. M. Mehr, R. M. de Carvalho, B. F. van Dongen, Explainable conformance checking: Understanding patterns of anomalous behavior, Engineering Applications of Artificial Intelligence 126 (2023) 106827. doi:10.1016/J.ENGAPPAI.2023.106827.
[8] M. Grohs, H. van der Aa, J. Rehse, Beyond log and model moves in conformance checking: Discovering process-level deviation patterns, in: Business Process Management - 22nd International Conference, BPM 2024, Krakow, Poland, September 1-6, 2024, Proceedings, volume 14940 of Lecture Notes in Computer Science, Springer, 2024, pp. 381–399. doi:10.1007/978-3-031-70396-6_22.
[9] E. G. Rocha, W. M. P. van der Aalst, Polynomial-time conformance checking for process trees, in: Business Process Management - 21st International Conference, BPM 2023, Utrecht, The Netherlands, September 11-15, 2023, Proceedings, volume 14159 of Lecture Notes in Computer Science, Springer, 2023, pp. 109–125. doi:10.1007/978-3-031-41620-0_7.
[10] E. Goulart Rocha, S. J. J. Leemans, W. M. P. van der Aalst, Stochastic conformance checking based on expected sub-trace frequency, in: International Conference on Process Mining, 2024.
[11] E. G. Rocha, S. J. van Zelst, W. M. P. van der Aalst, Mining behavioral patterns for conformance diagnostics, in: Business Process Management - 22nd International Conference, BPM 2024, Krakow, Poland, September 1-6, 2024, Proceedings, volume 14940 of Lecture Notes in Computer Science, Springer, 2024, pp. 291–308. doi:10.1007/978-3-031-70396-6_17.
[12] E. G. Rocha, W. M. P. van der Aalst, Precision-guided minimization of arbitrary declarative process models, in: Enterprise, Business-Process and Information Systems Modeling - 25th International Conference, BPMDS 2024, and 29th International Conference, EMMSAD 2024, volume 511 of LNBIP, Springer, 2024, pp. 48–56. doi:10.1007/978-3-031-61007-3_5.
[13] H. A. Reijers, T. Slaats, C. Stahl, Declarative modeling - an academic dream or the future for BPM?, in: Business Process Management - 11th International Conference, BPM 2013, Beijing, China, August 26-30, 2013, Proceedings, volume 8094 of Lecture Notes in Computer Science, Springer, 2013, pp. 307–322. doi:10.1007/978-3-642-40176-3_26.
[14] D. Fahland, D. Lübke, J. Mendling, H. A. Reijers, B. Weber, M. Weidlich, S. Zugal, Declarative versus imperative process modeling languages: The issue of understandability, in: Enterprise, Business-Process and Information Systems Modeling, 10th International Workshop, BPMDS 2009, and 14th International Conference, EMMSAD 2009, volume 29 of LNBIP, Springer, 2009, pp. 353–366. doi:10.1007/978-3-642-01862-6_29.
[15] A. Burattin, F. M. Maggi, A. Sperduti, Conformance checking based on multi-perspective declarative process models, Expert Syst. Appl. 65 (2016) 194–211. doi:10.1016/J.ESWA.2016.08.040.
[16] R. Alur, D. L. Dill, A theory of timed automata, Theor. Comput. Sci. 126 (1994) 183–235. doi:10.1016/0304-3975(94)90010-8.
[17] J. Ouaknine, J. Worrell, On the language inclusion problem for timed automata: Closing a decidability gap, in: 19th IEEE Symposium on Logic in Computer Science (LICS 2004), 14-17 July 2004, Turku, Finland, Proceedings, IEEE Computer Society, 2004, pp. 54–63. doi:10.1109/LICS.2004.1319600.
[18] A. Camacho, S. A. McIlraith, Learning interpretable models expressed in linear temporal logic, in: Proceedings of the Twenty-Ninth International Conference on Automated Planning and Scheduling, ICAPS 2019, Berkeley, CA, USA, July 11-15, 2019, AAAI Press, 2019, pp. 621–630. doi:10.1609/icaps.v29i1.3529.
[19] M. Weidlich, A. Polyvyanyy, J. Mendling, M. Weske, Efficient computation of causal behavioural profiles using structural decomposition, in: Applications and Theory of Petri Nets, 31st International Conference, PETRI NETS 2010, Braga, Portugal, June 21-25, 2010, Proceedings, volume 6128 of LNCS, Springer, 2010, pp. 63–83. doi:10.1007/978-3-642-13675-7_6.