Light Methods for Conformance Checking of Business
                      Processes

                                    Farbod Taymouri

                  Universitat Politècnica de Catalunya, Barcelona (Spain)
                                taymouri@cs.upc.edu


    In Process Mining, Conformance Checking of business process models ad- dresses
the deviations happened in their real executions. Conformance checking is a very im-
portant dimension on its own, enabling the detection and visualization of deviations on
top of process models. Its consequences expand not only to detection of misalignment
between observed and modeled behavior, but also to improve the value of process mod-
els by projecting event data on them. For example, model repair techniques open the
door for enhancement of processes, by modifying those parts of the process model that
no longer represent the underlying process.
    Identifying deviations mathematically boils down to the notion of alignment con-
ceptually. An alignment quantifies to what degree a process model can imitate what
happened in its observed behavior, i.e., an event log. Formally, an alignment is a ma-
trix data structure with two rows, where each column represents whether an observed
event can be lined up with the corresponding activity of the given process model. With
that said, an optimal alignment is the best combination by which the process model can
imitate the corresponding observed behavior. Unfortunately, state of the art techniques
for alignment computation have exponential time and space complexity, hindering their
applicability for medium and large instances even for a single alignment. Additionally,
extra challenges will appear if other dimensions, like multi-perspective or online com-
putation, are considered.
    This thesis presents several approaches for alignment computation, not only to
tackle the mentioned challenges, but also, to provide new insights and facilities for con-
formance checking of large process models. Indeed, each of the proposed approaches
has a specific set of advantages, and globally are complementary to each other; for in-
stance, we propose techniques to handle very large process models in limited amount
of memory or time. In general, they have some goals in common, namely, proposing
light and efficient methods for alignment computation. Above that, one of the presented
contributions for alignment computation is an abstraction framework that can be used
in combination with any other techniques in the literature. Doing this not only alle-
viates the computational challenge, but also provides a greater logical perspective of
deviations.
    All methods presented in this thesis are kinds of combinatorial optimizations for
alignment computation. They make close approximation to an optimal solution. The
corresponding contributions represent novel ways in which they are proposed and their
effectiveness for the mentioned challenge with respect to state of the art approach in
different perspectives. Generally speaking, methods presented in this thesis can be cat-
egorized as:
 – Classical approaches [5, 8]: These techniques exploit Integer Linear Programming
   (ILP) as well as structural theory of Petri nets to formulate alignment computation
   as an optimization of a set of linear equations. In other words, these approaches
   translate the computation of an optimal alignment to an ILP instance. In general,
   providing an alignment over speci
   ed intervals, called Approximate Alignment, is an outcome of these approaches.
   Three different formulations with various objectives are pro- posed. In short, the
   first one is optimal in terms of Parikh vector but it does not scale [1]; the second
   technique follows a recursive approach, hence it is fast but its optimality is not
   guaranteed [1]. The last one computes an alignment in an incremental way, adapting
   the A∗ technique from [2], and a real solution (perhaps not optimal) is guaranteed
   [3].
 – Heuristic approaches [6, 7]: These techniques adopt different policy with respect
   to the previous approaches, and take advantages of heuristic functions to explore the
   search space of alignments to find the optimal one(s). This can be done by obtaining
   an initial solution, and iteratively improving it until saturation or by reaching a cer-
   tain criterion. In more details, the iterations must be light so that the complexity of
   the overall framework is not worsen, and more importantly, to guide the algorithm
   toward a better solution. The first technique in this class, called Local Search, poses
   an initial solution. The initial solution can be located somewhere in the correspond-
   ing search space. Then, tailored operators will be applied so as to improve the initial
   solution iteratively. Intuitively, the operators cause to examine neighbors of the cur-
   rent solution, and if a better solution is found, it will be considered as the current
   solution. The improvement loop continues until no better solution can be obtained
   [4]. The other technique in this class adopts a Genetic Algorithm, with well spe-
   cific designed operators, by which exploration of the corresponding search space
   can be guided toward the best solution(s). The main idea behind this technique is
   similar to the previous one, i.e., exploring the search space until finding the best
   solution. Although this approach is slower than the previous heuristic approach, it
   can generate more than one solution, and moreover due to the evolutionary setting,
   the probability of getting stuck in local optimum is much smaller [5].
 – Model reduction [4]: Regardless of the two main approaches just mentioned, one
   way to boost the effectiveness of alignment computation is reducing model and
   observed behavior without loosing alignment information. This structure reduction
   not only boosts the alignment computation, but also provides a big picture of de-
   tected deviations. Above that, a divide-and-conquer strategy will be provided for
   the first approach such that it breaks the original problem into a set of smaller in-
   dependent problems which can be solved independently. This technique in more
   details is a framework for the size reduction of a given problem, i.e., model and
   event log. In other words, this approach can be integrated by any approaches which
   compute alignments to boost its efficiency in different perspectives [6].

     Experiments witness the merit of proposed approaches with respect to state of the
art techniques in different perspectives, such as memory consumption, execution time,
quality and accuracy of solutions found. All methods have been implemented as a stand-

                                            2
alone tool box called ALI 1 [7]. The tool box and datasets used in this thesis are publicly
available at https://www.cs.upc.edu/ taymouri/.


References
1. F. Taymouri, J. Carmona, A recursive paradigm for aligning observed behavior of large struc-
   tured process models, in: 14th International Conference of Business Process Management
   (BPM), Rio de Janeiro, Brazil, September 18 - 22, 2016.
2. A. Adriansyah, Aligning observed and modeled behavior, Ph.D. thesis, Technische Univer-
   siteit Eindhoven (2014).
3. B. van Dongen, J. Carmona, Th. Chatain, F. Taymouri, Aligning modeled and observed be-
   havior: A compromise between complexity and quality, in: E. Dubois, K. Pohl (Eds.), Pro-
   ceedings of the 29th International Conference on Advanced Information Systems Engineering
   (CAiSE’17), Vol. 10253 of Lecture Notes in Computer Science, Springer, Essen, Germany,
   2017, to appear.
4. F. Taymouri, J. Carmona, Computing alignments of well-formed process models using local
   search, Submitted to ACM Transactions on Software Engineering and Methodology.
5. F. Taymouri, J. Carmona, An evolutionary technique to approximate multiple optimal align-
   ments, in: 16th International Conference of Business Process Management (BPM), Sydney,
   Australia. September 9-14, Brazil, September 18 - 22, 2018.
6. F. Taymouri, J. Carmona, Model and event log reductions to boost the computation of align-
   ments, in: Data-Driven Process Discovery and Analysis - 6th IFIP WG 2.6 International
   Symposium, SIMPDA 2016, Graz, Austria, December 15-16, 2016, Revised Selected Papers,
   2016, pp. 1–21.
7. F. Taymouri, ALI: Alignment for Large Instances (2017).
   URL https://www.cs.upc.edu/˜taymouri/tool.html


 1
     1Alignment of Large Instances


                                              3