=Paper= {{Paper |id=Vol-1592/paper10 |storemode=property |title=Calculating the Number of Unique Paths in a Block-Structured Process Model |pdfUrl=https://ceur-ws.org/Vol-1592/paper10.pdf |volume=Vol-1592 |authors=Gert Janssenswillen,Benoît Depaire,Toon Jouck |dblpUrl=https://dblp.org/rec/conf/apn/JanssenswillenD16 }} ==Calculating the Number of Unique Paths in a Block-Structured Process Model== https://ceur-ws.org/Vol-1592/paper10.pdf
    Calculating the Number of Unique Paths in a
          Block-Structured Process Model

           Gert Janssenswillen1,2 , Benoît Depaire1 , and Toon Jouck1

           Hasselt University, Agoralaan Bldg D, 3590 Diepenbeek, Belgium
    Research Foundation Flanders (FWO), Egmontstraat 5, 1060 Brussels, Belgium
         gert.janssenswillen, benoit.depaire, toon.jouck@uhasselt.be



       Abstract. Estimating the number of execution paths in a process model
       is a non-trivial task as one runs quickly into an combinatorial explosion
       of possible paths. This paper introduces a new algorithm to calculate the
       number of different execution paths for finite-behavior block-structured
       models in a computationally efficient way. Block functions are defined
       for the workflow constructs sequence, parallel, exclusive choice and finite
       loops, such that the amount of behavior in each block-construct can
       be computed efficiently. Subsequently, the block-structuredness of the
       model is exploited to efficiently calculate the number of unique paths
       in the model. The algorithm has been implemented for process trees,
       although the translation to other modeling notations is straightforward.
       An empirical analysis showed that the run-time of the algorithm is very
       low, and only slightly impacted by the complexity of the model.

       Keywords: Process Modeling, Process Mining, Process Trees, Process
       Model Complexity


1    Introduction
In the field of process mining and process modeling, there are certain situations
where it is desirable to quantify the amount of behavior of a process model.
For example, if one wants to quantify the variance of behavior present in a
process model. Related to this, the precision measure for discovered process
models, expresses to what extent the behavior in the model does not exceed
the behavior in the log [3]. Similarly, the implicit realism measure [4] uses the
number of unique paths in a model to calculate the probability that a certain
amount of behavior from the model did not show up in the log. The amount
of behavior in a model can moreover be used as a proxy for model complexity.
As it can be computationally hard to compute the amount of behavior, several
metrics to calculate model complexity use proxies instead. [6].
    Determining the amount of behavior in a process model, which we quantify
in this paper as the number of unique (execution) paths, is a challenging task.
One could naively traverse the process model recursively and count the number
of unique paths, but this quickly becomes computationally infeasible due to a
combinatorial explosion of different (parallel) paths.




                                       138
   The main idea presented in this paper is to compute the number of unique
paths in a block-structured finite-behavior process model in a more intelligent
and computationally efficient way. As we will show, this is possible by exploiting
the block-structuredness of the model. In this paper, we make the following
contributions:

 – We construct a block function, which calculates the number of unique paths,
   for each of the following process constructs: sequence (→), exclusive choice
   (×), parallelism (∧) and structured finite loops (k ).
 – We develop a generic approach to determine the total amount of behavior
   in a block-structured finite-behavior process model.
 – We provide an implementation for process trees.

   Section 2 describes the general approach used by the algorithm, while in
Section 3 the implementation is elaborated upon. The performance of the tech-
nique in terms of run-time is discussed in Section 4 while Section 5 concludes
the paper.


2     General approach

In this section, the formal approach of the calculation will be described. First,
some assumptions will be make regarding the types of models taken into account.
In the subsequent paragraphs, different block functions for each of the specific
operator type will be defined. Finally, some limitations to the formal approach
will be pointed out, together with work-arounds in order to solve them.


2.1   Assumptions and used notations

It is important to keep in mind that we impose two restrictions on the pro-
cess models. Firstly, we assume finite-behavior models, since it would otherwise
make no sense to determine the number of unique traces. Consequently, loops
in our models have a maximum number of repetitions. While this appears very
restrictive, this can be justified by accepting a so-called fairness assumption,
which states that a task of a process cannot be postponed indefinitely. This
assumption therefore rules out infinite behaviors that are considered unrealis-
tic [1]. Secondly, we assume that models are block-structured, i.e. they can be
decomposed in properly nested subprocesses.
    For the development and discussion of our approach, we will use the process
tree notation, because process trees are block-structured by definition. However,
the ideas in this paper are applicable to other notation languages as long as
the models are block-structured and finite in behavior. We formally define a
finite-behavior Process Tree, which is largely based on the definition in [2], as
follows:

Definition 1. Let A be the activity alphabet and A ⊆ A be a finite set of activ-
ities,then P T = (N, r, m, c) is a process tree such that:




                                    139
 – N is a non-empty finite set of nodes consisting of operator (NO ) and leaf
   nodes (NL ) such that: NO ∩ NL = ∅
 – r ∈ NO is the root node of the tree
 – O = {→, ×, ∧, k , ∨}, the set of operator types.
 – m : N → A ∪ O is a mapping function mapping each node to an operator or
   activity:
                                
                                  a ∈ A ∪ {τ }, if n ∈ NL .
                       m(n) =
                                  o ∈ O,         if n ∈ NO .
 – c : N → N ∗ is the direct-child-relation function:
   c(n) = if n ∈ NL
   c(n) ∈ N + if n ∈ NO
   such that

      • each node except the root node has exactly one parent:
        ∀n ∈ N \{r} : ∃p ∈ NO : n ∈ c(p) ∧ q ∈ NO : p = q ∧ n ∈ c(q);
      • the root node has no parent:
        n ∈ N : r ∈ c(n);
      • each node appears only once in the list of children of its parent:
        ∀n ∈ N : ∀1≤i