Resource Utilization Prediction in
        Decision-Intensive Business Processes*

    Simon Sperl (1), Giray Havur (1,2), Simon Steyskal (1,2), Cristina Cabanillas (2),
                     Axel Polleres (2), and Alois Haselböck (1)

         (1) Siemens AG Österreich, Corporate Technology, Vienna, Austria
                           {name.surname}@siemens.com
         (2) Vienna University of Economics and Business, Austria
                           {name.surname}@wu.ac.at


      Abstract. An appropriate resource utilization is crucial for organiza-
      tions in order to avoid, among other things, unnecessary costs (e.g. when
      resources are under-utilized) and too long execution times (e.g. due to
      excessive workloads, i.e. resource over-utilization). However, traditional
      process control and risk measurement approaches do not address resource
      utilization in processes. We studied an often-encountered industry case
      of providing large-scale technical infrastructure, which requires rigorous
      testing of the systems deployed, and identified the need for projecting
      resource utilization as a means of measuring the risk of resource under-
      and over-utilization. Consequently, this paper presents a novel predictive
      model for resource utilization in decision-intensive processes, present in
      many domains. In particular, we predict the utilization of resources for
      a desired period of time given a decision-intensive business process that
      may include nested loops, and historical data (i.e. order and duration
      of past activity executions, resource profiles and their experience etc.).
      We have applied our method to a real business process with multiple
      instances and present the outcome.

      Key words: decision-intensive business processes, prediction model, re-
      source utilization, risk management


1 Introduction

Human resource utilization in an organization can be seen as the proportion of
time a person spends working on allocated tasks. Poor utilization of human
resources (from now on simply resources, for the sake of brevity) means having
resources unnecessarily idle or overloaded. This has a very negative effect on
process performance measures such as process completion time, execution costs,
and quality [1]. Specifically, while under-utilization of resources leads to
higher process execution costs, over-utilization of resources may result in
process delays. Therefore, decision makers (typically process managers) should
be informed about the utilization of resources in their organizations in order
to enable appropriate controls that ensure a desired level of resource
utilization.

* This work has been funded by the Austrian Research Promotion Agency (FFG)
  under the project grant 845638 (SHAPE) and the Austrian Science Fund (FWF)
  under the project grant V 569-N31 (PRAIS).

    In a scenario where the process model has no decision nodes, given a base-
line schedule and resource allocation [2], deriving the utilization of resources
would be trivial. However, actual processes usually have decision points that
split the execution flow into different paths so that several cases are projected
in the same process model [3]. Decision-intensive processes may contain (nested)
loops [4], which makes both scheduling and resource allocation more
difficult due to the increasing uncertainty. Nonetheless, these kinds of processes
are common in many domains, e.g. engineering, healthcare, insurance handling,
and construction. An inadequate scheduling or allocation of resources may re-
sult in a poor resource utilization. While recent resource allocation approaches
have already addressed such processes [2], to the best of our
knowledge there is a lack of support for resource utilization prediction in such
complex scenarios, as most of the existing techniques tend to simplify the ap-
plication scope [5, 6, 7, 8]. This, in turn, negatively affects risk management
in organizations, since process managers miss input that helps to improve the
process models and hence, the execution performance of the processes.
    In this paper we address that problem and describe a mathematical method
for quantifying resource utilization with respect to the structural properties of
non-deterministic processes and the historical executions of these processes. The
input values that are used for our prediction model are intrinsically of a stochas-
tic nature, i.e. in the form of probability density functions (PDF). They include,
among others, the activity duration PDFs and the resource utilization PDFs.
We propagate these input values towards the accumulated utilization function.
This function provides a visual overview on the level of future resource uti-
lizations. Therefore, an upcoming over- or under-utilization of resources can be
observed. Moreover, we have defined two metrics that characterize resource
over-utilization and under-utilization. We have implemented our approach and
demonstrated it with a real process related to large-scale technical infrastructure
development and deployment.
    We believe that our approach enhances the assessment of process behaviour
with the resource perspective. It is especially useful for organizations that need
to evaluate the utilization of resources for which they are accountable. With the
help of our approach they can automatically get an answer to questions such as
whether they have enough resources for the robust execution of their processes,
in which periods of time they should expect a delay in the process execution,
and when they can safely grant vacations to particular resources, among others.
    The paper is structured as follows: Section 2 presents a scenario that mo-
tivates this work as well as related work. Section 3 formally defines the input
required for our approach. Section 4 describes our approach for predicting re-
source utilization in decision-intensive business processes. Section 5 applies our
method to a real process and presents the outcome, and Section 6 concludes the
paper and outlines the future work.


2 Background
In the following, we describe an example scenario that motivates this work and
shows the problems to be addressed, and then we outline related work.

2.1 Running Example
A company that provides large-scale technical infrastructure requires rigorous
testing for the systems deployed. Each system consists of different types and
numbers of components that are developed and tested in parallel. In order to pro-
vide concise and clear examples while describing our method, we use the simple
example process shown in Fig. 1. In this process, the Develop activity is followed
by a Test activity, and this may repeat if the test fails. There is also the activ-
ity Manage which abstracts the potential contractual work running in parallel.
There are two resources, namely Jack and Jill, who execute this process. A more
complex process from a real scenario is used later in Section 5 for demonstrating
the applicability of our method.

2.2 Related Work
The approach presented in this paper is mostly related to resource-related risk
monitoring and prediction. This risk occurs “due to the high variability that may
affect operational processes in real world scenarios” [9]. The risk of inappropriate
resource utilization has been identified in [10, 11]. Rosemann et al. [10] provide
a risk taxonomy from the project management perspective. They identify the
organizational risk in their taxonomy (e.g., when a resource does not possess the
required skills to carry out an activity, or when there are not enough resources
to carry out activities on time). Similarly, the process-related risk taxonomy of
zur Muehlen et al. [11] contains the resource perspective category about lack of
resources and/or their skills for activity executions.
    Information systems support processes by recording information about their
executions in event logs [12]. In order to manage time and resource related risks
in an informed fashion, a variety of event log mining and prediction mechanisms
have been devised: among others, process duration estimation [13, 14],
deadline violation detection [6], resource profiling [7], resource behaviour
measurement [5, 15], resource scheduling [16], resource recommendation [17],
work prioritization [18], and process performance forecasting [19]. There are
also several risk monitoring approaches [8, 20] that combine the scope of
several of the aforementioned mechanisms.
    Our method requires extracting durations of activities, and experience of
resources for each activity from the event logs for providing realistic resource
utilization predictions. Durations can be obtained in a way similar to that described
by van der Aalst et al. [13], who provide reliable predictions. On the other hand,
the extraction of resource profiles, including their experience, is delineated by Pika
et al. [7].
    Within the context of quantifying the resource perspective, our method draws
parallels with [15]. We further elaborate our mathematical model for refining our
results over the structural properties of running processes. Rather than providing
an overall view, Conforti et al. [8] allow users to take resource-informed decisions
at run-time. Our approach is similar to this work in the sense that it reports
on resource utilization abnormalities that may become a problem during the
executions of processes.
    Folino et al. [19] introduce a performance model for run-time process exe-
cutions with respect to process variants such as workload and seasonality. Our
prediction method can support such models by enriching the context from the
resource utilization point of view.


3 Input Data

The following input is required for resource utilization prediction in decision-
intensive business processes:
Input 1 (Business Process Model). A business process model P is repre-
sented as a directed, connected graph (N, E) with N = A ∪ G ∪ {n_start, n_end}
denoting a finite set of nodes consisting of activities A, gateways G, and a
start event n_start and an end event n_end, and E ⊆ N × N representing a
set of edges connecting the nodes.
    We assume that our model consists of activities, XOR-gateways (decision
points), AND-gateways (parallel execution gateways), a start-event, and an end-
event. The process always terminates (i.e., it contains no livelocks or deadlocks).
    In a real life application, an input process can be composed of the processes
that are planned to be executed in the future, and of the unexecuted fragments
of the already executing processes.
Input 2 (Edge Execution Probability). Given a process model
P = (A ∪ G ∪ {n_start, n_end}, E), each edge e ∈ E leaving an XOR-gateway
g ∈ G_XOR ⊆ G of P is annotated with an edge execution probability
p_e ∈ [0, 1]. Additionally, the edge execution probabilities p_e must satisfy

    \sum_{e \in E \cap (g \times N)} p_e = 1, \quad \forall g \in G_{XOR}.

   The outgoing edges of XOR-gateways are annotated with edge execution
probabilities (see Fig. 6).
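
The constraint of Input 2 can be checked mechanically before any prediction is computed. The following is a minimal sketch, assuming edges are encoded as (source, target) pairs; the function name, the gateway name `xor_after_test`, and the assignment of 0.2 to the loop edge and 0.8 to the exit edge are illustrative assumptions, not part of the method.

```python
from collections import defaultdict
from math import isclose


def gateways_violating_input2(edge_probs, xor_gateways, tol=1e-9):
    """Return the XOR-gateways whose outgoing p_e do not sum to 1.

    edge_probs maps (source, target) edges to p_e (hypothetical encoding);
    only edges leaving a node listed in xor_gateways are checked.
    """
    totals = defaultdict(float)
    for (source, _target), p in edge_probs.items():
        if source in xor_gateways:
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"p_e outside [0, 1] on an edge leaving {source}")
            totals[source] += p
    return sorted(g for g in xor_gateways
                  if not isclose(totals[g], 1.0, abs_tol=tol))


# Assumed encoding of a single decision point with two outgoing edges.
edges = {
    ("xor_after_test", "develop"): 0.2,  # assumption: repeat the loop
    ("xor_after_test", "end"): 0.8,      # assumption: leave the loop
}
violations = gateways_violating_input2(edges, {"xor_after_test"})
```

An empty result means every listed gateway satisfies the constraint.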
Input 3 (Activity Duration PDF). For each activity a ∈ A, D_a : R≥0 → R≥0
denotes the duration PDF of activity a.
Example 1. For clarification, we provide the process in Fig. 1. A single loop
of two consecutive activities Develop and Test is presented in parallel with the
activity Manage. The duration PDFs for Manage and Develop are normally
distributed while Test has a fixed duration represented as a solid dot (i.e., a
time-shifted Dirac delta function, see footnote 3). The duration of Test is
deterministically 3 time-units (TU) whereas the duration of Manage is on average 2 TU.


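[Figure: process diagram in which the activity Manage runs in parallel with a
loop of the activities Develop and Test; each activity is annotated with its
duration PDF over time t, and the edges carry the labels p = 1, p = 0.8, and
p = 0.2.]

Fig. 1: The running example process with activity duration PDFs D_a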

[Figure: for each activity (rows) and each of the resources Jack and Jill
(columns), a small plot of the resource utilization PDF over the range 0 to 1,
with the following descriptions.]

Activity   Description
Manage     The activity Manage utilizes Jack about 30%.
Develop    The activity Develop utilizes Jill about 100%, however there is also
           less of a chance that the same activity may utilize her about 50%.
Test       The activity Test utilizes Jack about 85% and Jill about 50%.

Fig. 2: Resource Utilization PDFs for the activities in Fig. 1

Input 4 (Resource Utilization PDF). Given a set of activities A of a
business process P and a set of resources R, the resource utilization PDF is
U_{a,r} : R≥0 → R≥0 for activity a ∈ A and resource r ∈ R. U defines the PDFs
of the probable additional utilization that the execution of an activity causes
for all resources.
    The resource utilization PDFs describe which resources are utilized to what
extent while executing an activity. Intuitively, one can think of Ua,r (x)dx as being
the probability of r’s utilization falling in the infinitesimal interval [x, x + dx].
We assume that utilization values are normalized so that Ua,r (0) represents the
probability of resource r being 0% utilized by the activity a. Ua,r (1) specifies
the probability of resource r’s being utilized 100% by the activity a. The notion
of utilization can be considered as “the percentage of the work day spent on a
task” in our running example.
Example 2. See Fig. 2 for an example visualizing the definition of resource
utilization PDF. For instance, U_{Develop,Jill}(0.5) = 0.3 means with 0.3 probability
Develop utilizes Jill 50%, and U_{Develop,Jack}(0) = ∞ means that Jack is never
occupied by Develop.

3 The Dirac delta function δ(x) is ∞ at x = 0, otherwise 0, and satisfies
  ∫_{−∞}^{∞} δ(x) dx = 1.

[Figure: dependency diagram with the columns Process, Time, and Resource and
the rows Input, Intermediary, and Output. Input row: Input 1 to Input 5.
Intermediary row: Def. 2,3,4,5, Def. 1, and Def. 6. Output row: Output 1,
Output 2, and Output 3.]

Fig. 3: Input, intermediary functions, and output of our method

Input 5 (Experience Matrix). Given a set of activities A of a process model,
and a set of resources R, the experience value X_{a,r} ∈ R>0, where a ∈ A and r ∈ R,
is a multiplication factor (a scalar) for activity durations.
    The experience of each resource r in every activity a is reflected in this matrix.
This value theoretically ranges from zero to infinity and is extracted
from the activity execution durations of resources. Resources that execute activities
faster than average have an experience value greater than 1.0.
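
The exact extraction procedure for the experience values is not spelled out here. As a rough sketch, one plausible reading is the ratio between the mean duration of an activity over all resources and the mean duration achieved by one resource, which yields values above 1.0 for faster-than-average resources. The log format, the function name, and the sample durations below are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def experience_matrix(log):
    """Estimate X[a, r] from (activity, resource, duration) records.

    Assumption (not taken from the paper): X[a, r] is the mean duration of
    activity a over all resources divided by the mean duration of a when
    executed by resource r.
    """
    by_activity = defaultdict(list)
    by_pair = defaultdict(list)
    for activity, resource, duration in log:
        by_activity[activity].append(duration)
        by_pair[(activity, resource)].append(duration)
    return {
        pair: mean(by_activity[pair[0]]) / mean(durations)
        for pair, durations in by_pair.items()
    }


# Hypothetical event log: Jack needs 2 TU for Test, Jill needs 4 TU.
sample_log = [("Test", "Jack", 2.0), ("Test", "Jill", 4.0)]
X = experience_matrix(sample_log)
```

With this sample log, Jack is faster than the average of 3 TU and therefore obtains a value above 1.0, while Jill obtains a value below 1.0.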
    We assume that the edge execution probabilities pe , activity duration PDFs
Da , resource utilization PDFs Ua,r , and experience matrix X (Input 2–5 ) are
extracted from the event logs obtained from the past executions of the process P
(Input 1), where the resources and the durations of the past activity executions
are recorded. A solution to this prediction problem consists of one total
utilization PDF over time for each resource. Each function provides a utilization
prediction for its respective resource.


4 Method
Following the problem definition, we first introduce intermediary functions that
allow us to compute the ultimate total utilization PDF, and afterwards
we introduce quality metrics that quantify the risk of abnormal resource
utilization that may occur in the future. Fig. 3 gives an overview of the “used by”
relation between the input values and the defined functions of our method (e.g., Input
3 is used by Definition 1). The background colors blue, green, and yellow indicate
process-related, time-related, and resource-related elements, respectively.
Definition 1 (Dependent Activity Duration PDF). Given an independent
activity duration PDF D_a, resource utilization PDFs U_{a,r}, and experience values
X_{a,r} for activities a and resources r, the (utilization-and-experience) dependent
activity duration PDF D_a^{dep} : R≥0 → R≥0 is defined as follows:


    x_a = \sum_{r \in R} X_{a,r} \int_0^{\infty} u \, U_{a,r}(u) \, du

    D_a^{dep}(t) = \frac{D_a(t \, x_a)}{\int_0^{\infty} D_a(t' x_a) \, dt'}

   For each activity, an experience value x_a is derived from the experiences of all
the resources that are potential participants in a (i.e., ∃x ∈ R>0 : U_{a,r}(x) > 0).
x_a is used as a division factor for D_a; therefore, experience values above 1.0
reduce the width of D_a (i.e., they act as a speed-up factor for execution times),
and the opposite holds for values smaller than 1.0.
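
Definition 1 can be evaluated numerically on a time grid. The sketch below assumes that the mean utilization of each resource (the integral of u U_{a,r}(u) du) has already been computed, approximates the normalizing integral by a Riemann sum, and uses made-up values for the experience, the mean utilization, and the normal duration PDF; all function names are hypothetical.

```python
import math


def experience_factor(experience, mean_utilization):
    """x_a = sum over r of X[a, r] * (mean of U[a, r])."""
    return sum(experience[r] * mean_utilization[r] for r in experience)


def dependent_duration_pdf(duration_pdf, x_a, t_max, dt):
    """Sample D_a^dep(t) = D_a(t * x_a) / integral of D_a(t' * x_a) dt'."""
    grid = [i * dt for i in range(int(round(t_max / dt)) + 1)]
    scaled = [duration_pdf(t * x_a) for t in grid]
    norm = sum(scaled) * dt  # Riemann-sum approximation of the denominator
    return grid, [value / norm for value in scaled]


def normal_pdf(mu, sigma):
    """Density of a normal distribution, used here as an assumed D_a."""
    return lambda t: (math.exp(-0.5 * ((t - mu) / sigma) ** 2)
                      / (sigma * math.sqrt(2.0 * math.pi)))


# Assumed inputs: one resource with experience 2.0 and mean utilization 1.0,
# and an independent duration that is normal with mean 4 TU.
dt = 0.01
x_a = experience_factor({"Jill": 2.0}, {"Jill": 1.0})
grid, d_dep = dependent_duration_pdf(normal_pdf(4.0, 0.5), x_a, t_max=10.0, dt=dt)
mean_duration = sum(t * d for t, d in zip(grid, d_dep)) * dt
```

With x_a = 2 the expected duration is halved from 4 TU to about 2 TU, which illustrates the speed-up effect of experience values above 1.0.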

4.1 Edge Transition Probability Function

In order to compute how often an activity is being executed, we need to extract
the probability values of edge traversals over time. Only activities can create
time delays via their duration PDFs. Edge transitions are always instantaneous.

Definition 2 (Edge transition probability function). f_e(t) : R≥0 → R≥0
denotes the probability that an edge e ∈ E is traversed at time t ∈ R≥0.

    P(\text{an edge is traversed between } t_1 \text{ and } t_2)
      = P(\text{edge is traversed before } t_2) - P(\text{edge is traversed before } t_1)
      = \int_0^{t_2} f_e(t') \, dt' - \int_0^{t_1} f_e(t') \, dt'

  Note that in general f_e is not a PDF. However, if the process contains no
XOR-gateways, all f_e are PDFs since ∀e ∈ E : ∫ f_e(t) dt = 1.

Definition 3 (f for AND-Gateway). Given g ∈ G_AND ⊆ G with
incoming edges {in_1, ..., in_n} = E ∩ (N × g) and outgoing edges
{out_1, ..., out_m} = E ∩ (g × N), the edge transition probability function f_e(t) for
outgoing edges e is defined as

    f_{out_j}(t) = \frac{d}{dt} \left( \prod_{i=1}^{n} \int_0^t f_{in_i}(t') \, dt' \right), \quad 1 \le j \le m

   Obtaining a generic analytical result about the edge transition behavior of the
process is difficult unless the input is in the form of time-shifted Dirac delta functions
or PDFs of an exponential distribution.
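
When no analytical result is available, Definition 3 can still be evaluated on a discrete time grid. The following sketch represents every f as a list of probability masses per time step (an assumed discretization), so that the inner integral becomes a running sum and the derivative becomes a difference of neighbouring steps; the function name and the two branch distributions are illustrative.

```python
from itertools import accumulate


def and_join(incoming):
    """Discrete counterpart of Definition 3.

    incoming holds one list of per-step probability masses for each incoming
    edge. The product of the cumulative sums is the probability that all
    branches have arrived; its step-wise difference is the outgoing mass.
    """
    joint = [1.0] * len(incoming[0])
    for masses in incoming:
        cdf = list(accumulate(masses))
        joint = [a * b for a, b in zip(joint, cdf)]
    return [joint[0]] + [joint[k] - joint[k - 1] for k in range(1, len(joint))]


# Assumed branches: A always arrives at step 2, B at step 1 or 3 (50% each).
branch_a = [0.0, 0.0, 1.0, 0.0, 0.0]
branch_b = [0.0, 0.5, 0.0, 0.5, 0.0]
f_out = and_join([branch_a, branch_b])
```

The gateway fires once the later branch has arrived, so the outgoing mass is split evenly between steps 2 and 3 in this example.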


Definition 4 (f for XOR-Gateway). Given g ∈ G_XOR ⊆ G
with incoming edges {in_1, ..., in_n} = E ∩ (N × g), outgoing edges
{out_1, ..., out_m} = E ∩ (g × N), and edge execution probabilities
{p_{out_1}, ..., p_{out_m}}, the edge transition probability function f_e(t) for outgoing
edges e is defined as

    f_{out_j}(t) = p_{out_j} \sum_{i=1}^{n} f_{in_i}(t), \quad 1 \le j \le m

Definition 5 (f for Activities). Given an activity a ∈ A with one incom-
ing edge e_in, one outgoing edge e_out, and activity duration PDF D_a^{dep}, the edge
transition probability function f_out(t) is defined as

    f_{out}(t) = f_{in}(t) * D_a^{dep}(t)

    Note that the convolution operator ∗ for two functions f and g is defined as
(f ∗ g)(t) = ∫ f(t') g(t − t') dt'.
    We compute edge transition probability functions directly on the process.
Another way of doing this is described in [21]. Their method requires decompo-
sition of the process into process blocks.
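
Definitions 4 and 5 can be sketched on a discrete time grid, where each f is a list of probability masses per time step and the convolution becomes a finite sum. The example below passes a start impulse through Develop and Test once and then splits the result at an XOR-gateway; the Develop duration masses and the assignment of 0.8 to the exit edge and 0.2 to the repeat edge are assumptions, and the loop itself is not iterated.

```python
def convolve(f, g):
    """Discrete convolution (f * g), truncated to the horizon len(f)."""
    n = len(f)
    out = [0.0] * n
    for i, fi in enumerate(f):
        if fi == 0.0:
            continue
        for j, gj in enumerate(g):
            if i + j >= n:
                break
            out[i + j] += fi * gj
    return out


def xor_split(incoming, p_out):
    """Definition 4: f_out_j = p_out_j * (sum of all incoming f)."""
    total = [sum(values) for values in zip(*incoming)]
    return [[p * value for value in total] for p in p_out]


horizon = 12
f_start = [1.0] + [0.0] * (horizon - 1)               # process starts at step 0
d_develop = [0.0, 0.0, 0.5, 0.5] + [0.0] * (horizon - 4)  # assumed: 2 or 3 TU
d_test = [0.0, 0.0, 0.0, 1.0] + [0.0] * (horizon - 4)     # fixed 3 TU

f_after_develop = convolve(f_start, d_develop)   # Definition 5 for Develop
f_after_test = convolve(f_after_develop, d_test)  # Definition 5 for Test
f_exit, f_repeat = xor_split([f_after_test], [0.8, 0.2])  # assumed p values
```

The masses of `f_exit` and `f_repeat` sum to 0.8 and 0.2 respectively, which illustrates why f_e is in general not a PDF once XOR-gateways are present.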

4.2 Estimating Total Resource Utilization

In order to be able to define the total resource utilization function, we need to
know the probability density of an activity a being executed at time t. From
edge transition probabilities for the incoming edge of a, the probability density
that an activity is executed for each point in time is represented by the activity
execution probability function Fa (t).

Definition 6 (Activity execution probability function). The function
F_a(t) : R≥0 → R≥0 denotes the probability density that an activity a ∈ A is
executed at time t ∈ R≥0.

    F_a(t) = P(\text{activity is currently being executed at } t)
           = P(\text{activity is entered before } t) - P(\text{activity is left before } t)
           = \int_0^t f_{in}(t') \, dt' - \int_0^t (f_{in} * D_a^{dep})(t') \, dt'
           = \int_0^t f_{in}(t') \, dt' - \int_0^t f_{out}(t') \, dt'
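
On a discrete time grid, Definition 6 reduces to the difference of two running sums. The sketch below assumes that f_in and f_out are given as per-step probability masses; the function name and the sample values are illustrative.

```python
from itertools import accumulate


def execution_probability(f_in, f_out):
    """F_a per time step: P(entered up to step k) - P(left up to step k)."""
    entered = accumulate(f_in)
    left = accumulate(f_out)
    return [e - l for e, l in zip(entered, left)]


# Assumed case: Test is entered at step 2 and, with its fixed duration of
# 3 TU, left at step 5.
f_in = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
f_out = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
F_test = execution_probability(f_in, f_out)
```

In this discretization the activity counts as executing in steps 2 to 4 and as finished from step 5 on.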

Example 3. Our running example with the activity duration PDFs D_a in Fig. 1
would then result in the activity execution probability functions F_a presented
in Fig. 4. Manage is immediately executed once. As Develop and Test repeat
(infinitely) in the loop, each repetition lowers the execution probability of the
next iteration.

[Figure: the process diagram of Fig. 1 (Manage in parallel with the loop of
Develop and Test, edge labels p = 1, p = 0.8, and p = 0.2), with each activity
annotated with its activity execution probability function over time t.]

Fig. 4: The running example process with activity execution prob. functions F_a

Output 1 (Total utilization PDF over time). O_r is the utilization PDF of a
resource r ∈ R at time t ∈ R≥0, where A = {a_1, ..., a_n}. A value O_r(t, u)
represents the probability density that r is utilized by a fraction u of
his/her time (e.g. u = 1 means full time, u > 1 means over-utilization) at time t.
The operator ∗_u is the convolution over the parameter u.

    O_r : (R_{\ge 0} \times R_{\ge 0}) \to R_{\ge 0}

    O_r(t, u) = \delta(u)
                *_u \left( U_{a_1,r}(u) F_{a_1}(t) + \delta(u) (1 - F_{a_1}(t)) \right)
                \dots
                *_u \left( U_{a_n,r}(u) F_{a_n}(t) + \delta(u) (1 - F_{a_n}(t)) \right)

   For each resource, there is one total utilization PDF over time. In these PDFs,
the activity execution probability functions and resource utilization PDFs of the
respective resources are combined.
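
For a fixed point in time t, Output 1 can be sketched on a discrete utilization grid, where δ(u) becomes a unit mass at index 0 and ∗_u becomes a discrete convolution. The grid step of 0.05, the treatment of "about 30%" and "about 85%" as point masses, and the values F_Manage(t) = 1.0 and F_Test(t) = 0.5 are assumptions made for illustration.

```python
def activity_contribution(util_mass, f_a):
    """U[a, r](u) * F_a(t) + delta(u) * (1 - F_a(t)) on a utilization grid."""
    out = [mass * f_a for mass in util_mass]
    out[0] += 1.0 - f_a
    return out


def total_utilization(contributions):
    """Convolve all contributions over u, starting from delta(u)."""
    total = [1.0]  # delta(u): zero utilization with certainty
    for contribution in contributions:
        nxt = [0.0] * (len(total) + len(contribution) - 1)
        for i, ti in enumerate(total):
            for j, cj in enumerate(contribution):
                nxt[i + j] += ti * cj
        total = nxt
    return total


U_STEP = 0.05                                  # assumed grid resolution
manage = [0.0] * 21
manage[6] = 1.0                                # point mass at u = 0.30
test = [0.0] * 21
test[17] = 1.0                                 # point mass at u = 0.85

o_jack = total_utilization([
    activity_contribution(manage, 1.0),        # assumed F_Manage(t) = 1.0
    activity_contribution(test, 0.5),          # assumed F_Test(t) = 0.5
])
p_over = sum(o_jack[21:])                      # mass at u > 1.0
```

Under these assumptions Jack is utilized 30% or 115% with equal probability at time t, so half of the probability mass lies in the over-utilization region.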
Example 4. In Fig. 5, the total utilization PDF over time for Jack, O_Jack, is
based on (i) our running example with the activity execution probability func-
tions F as seen in Fig. 4, and (ii) the resource utilization PDFs U as seen in Fig. 2.
O_{Jack,Test}(t, u) = (U_{Jack,Test}(u) F_Test(t) + δ(u)(1 − F_Test(t))) is the probable
utilization of Jack by the activity Test, shown in the top-left corner. In a similar
way, O_{Jack,Manage} is shown in the top-right corner. We do not show O_{Jack,Develop},
since it has no influence on Jack’s total utilization (see Fig. 2). We can clearly
observe in O_{Jack,Manage} that the activity Manage is non-repeating. The com-
bined total utilization PDF for Jack, O_Jack, is shown at the bottom of the figure. It
shows how the probable utilizations of Jack’s activities combine into a period
of over-utilization as shown in the red region. Such regions representing over-
utilization are of special interest for the decision makers (e.g., project managers)
who are also responsible for time and resource management.


[Figure: three surface plots over the axes time, utilization, and probability.
Top left: O_{Jack,Test}(t, u), composed of U_{Jack,Test} and F_Test. Top right:
O_{Jack,Manage}(t, u), composed of U_{Jack,Manage} and F_Manage. Bottom:
O_Jack(t, u), with the regions of no utilization and over-utilization marked.]

Fig. 5: Total Utilization PDF over time O_Jack of our running example and the
relevant pieces it is composed of. Directly above the plots, we provide a more
condensed visualization where the x-axis is time, and the y-axis is utilization.
The color intensity is directly proportional to the probability density value.
Note that the blue parts are of infinite slope caused by δ(u).

4.3 Quality Metrics
Based on the total utilization PDFs O_r, we can define various metrics for quan-
tifying abnormal resource utilization. These metrics can be used as optimization
criteria for managing organizations and their schedules.
Output 2 (Resource over-utilization metric). The resource over-utilization met-
ric m_r^+ ∈ R≥0 for a resource r ∈ R is the volume of utilization larger than 1.

    m_r^+ = \int_0^{\infty} \int_1^{\infty} o \, O_r(t, u) \, du \, dt

[Figure: process diagram with three parallel branches, one per component A, B,
and C. Each branch consists of Develop Component X followed by Test
Component X, with the edge labels p_X and p_join-X. The branches join into
Integrate Components followed by Test, with the edge labels p_end and p_*.]

Fig. 6: An often-encountered development process with multiple parallel “De-
velop and Test” cycles with a final integration step

The total over-utilization m^+ of a process is the sum of the over-utilization of all
resources:

    m^+ = \sum_{r \in R} m_r^+

  In other words, m+ is a value characterizing resource over-utilization for a
whole organization (i.e., for all resources and the processes involved).
Output 3 (Resource under-utilization metric). The resource under-utilization
metric m_r^- for a resource r ∈ R is the accumulated volume under the optimal
resource occupancy threshold on the occupancy axis (cf. 1.0).

    F_P(t) = \int_0^t f_{e_{start}}(t') \, dt' - \int_0^t f_{e_{end}}(t') \, dt'

    m_r^- = \int_0^{\infty} F_P(t) \int_0^1 |1 - u| \, O_r(t, u) \, du \, dt
   In other words, m− is a value characterizing resource under-utilization for a
whole organization (i.e., for all resources and the processes involved). FP in the
formula above is similar to Fa in Definition 6 in the sense that FP denotes the
probability that a process P is executed at time t ∈ R≥0 .
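
As a rough illustration of how m−r could be evaluated numerically, the following sketch applies the trapezoidal rule on a regular grid. The function names, the finite horizon that truncates the infinite upper integration limit, the grid sizes, and the toy inputs are assumptions made for this example; they are not taken from the method's implementation.

```python
from typing import Callable


def trapezoid(values: list[float], step: float) -> float:
    """Composite trapezoidal rule on an evenly spaced grid."""
    if len(values) < 2:
        return 0.0
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def under_utilization(
    f_p: Callable[[float], float],         # F_P(t): probability that P runs at t
    o_r: Callable[[float, float], float],  # O_r(t, u): occupancy density of r
    horizon: float,                        # assumed finite cut-off for t
    n_t: int = 200,
    n_u: int = 200,
) -> float:
    """Approximate the double integral of F_P(t) * |1 - u| * O_r(t, u)
    over u in [0, 1] and t in [0, horizon]."""
    dt = horizon / n_t
    du = 1.0 / n_u
    outer = []
    for i in range(n_t + 1):
        t = i * dt
        inner = [abs(1.0 - j * du) * o_r(t, j * du) for j in range(n_u + 1)]
        outer.append(f_p(t) * trapezoid(inner, du))
    return trapezoid(outer, dt)


# Toy input (hypothetical): the process is certainly running on [0, 10],
# and the occupancy of the resource is uniformly distributed on [0, 1].
m_minus = under_utilization(lambda t: 1.0, lambda t, u: 1.0, horizon=10.0)
print(m_minus)
```

For this toy input the inner integral of |1 − u| over [0, 1] is 0.5, so the result is close to 0.5 · 10 = 5, which gives a simple sanity check for the discretization.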

5 Application to a Real Process
We tested our method in a setting where the utilization of 10 resources is forecast
across 10 instances of the process shown in Fig. 6. In this realistic
process, each system consists of components of different types and numbers, which
are developed and tested in parallel. Each developed component must be tested
independently, and repeated development effort is expected. Additionally, a final
Integrate and Test phase is mandatory for the combined system, which may
cause additional repetitions of the entire process. Resources are expected to
work on multiple processes in parallel.
    Table 1 describes the properties of the 10 process instances, which differ in starting
time and resource utilizations. For each instance it gives the process id (id), the
starting time (ts ), the initial letter of the resource allocated to each activity (r), the
mean duration of that activity (∂), and the repetition probability of the “Develop and
Test” iteration for component X (pX ). The last column gives p∗ (the edge execution probability of


                     Table 1: Properties of process instances

        DevA    TestA         DevB    TestB         DevC    TestC         Int∗    Test∗
id  ts  r    ∂  r    ∂  pA    r    ∂  r    ∂  pB    r    ∂  r    ∂  pC    r    ∂  r    ∂  p∗
 1  40  B   50  G   20  0.6   E 12.5  H    5  0.3   I   25  J   10  0.3   J 22.5  C    9  0.1
 2  40  H   40  G   16  0.3   I   10  H    4  0.6   E   20  H    8  0.1   F   18  I  7.2  0.15
 3  40  F   70  E   28  0.2   C 17.5  J    7  0.6   C   35  F   14  0.2   B 31.5  I 12.6  0.1
 4  40  J   60  I   24  0.2   G   15  H    6  0.5   E   30  F   12  0.3   B   27  I 10.8  0.1
 5  40  B   40  C   16  0.6   B   10  B    4  0.3   I   20  B    8  0.3   B   18  I  7.2  0.1
 6  70  G   20  A    8  0.4   E    5  F    2  0.4   E   10  F    4  0.4   B    9  G  3.6  0.2
 7 100  B   80  E   32  0.4   E   20  H    8  0.8   C   40  B   16  0.1   D   36  A 14.4  0.1
 8 110  H   40  I   16  0.5   I   10  J    4  0.8   A   20  D    8  0.5   D   18  G  7.2  0.1
 9 120  H   60  C   24  0.6   G   15  J    6  0.3   A   30  H   12  0.5   J   27  C 10.8  0.1
10 170  B   30  G   12  0.7   E  7.5  F    3  0.6   E   15  D    6  0.5   D 13.5  E  5.4  0.3

restarting the process). The different components of each process have different
failure rates and duration distributions.
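
To give a feeling for what the repetition probabilities in Table 1 imply, the sketch below estimates the expected effort of one “Develop and Test” loop. It assumes that every pass through the loop is repeated independently with probability pX, so that the number of passes is geometrically distributed, and it ignores restarts of the whole process via p∗. The function names are hypothetical and this simplification is ours, not part of the method itself.

```python
def expected_passes(p_repeat: float) -> float:
    """Expected number of passes through a loop that is repeated
    independently with probability p_repeat after each pass."""
    if not 0.0 <= p_repeat < 1.0:
        raise ValueError("p_repeat must lie in [0, 1)")
    return 1.0 / (1.0 - p_repeat)


def expected_loop_effort(dev_mean: float, test_mean: float, p_repeat: float) -> float:
    """Expected total mean duration spent in one Develop and Test loop."""
    return expected_passes(p_repeat) * (dev_mean + test_mean)


# Instance 1, component A in Table 1: Develop 50, Test 20, pA = 0.6.
passes_a = expected_passes(0.6)
effort_a = expected_loop_effort(50.0, 20.0, 0.6)
print(passes_a, effort_a)
```

Under these assumptions, component A of instance 1 needs 2.5 passes on average, i.e. an expected loop effort of 2.5 · (50 + 20) = 175 time units instead of the 70 of a single pass.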
    In order to generate the utilization functions for each resource, we simulate
the process instances with respect to their properties (cf. Table 1) and apply
the methods described previously. Note that we leave out the process execution
paths with a probability lower than 0.1%.
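
The effect of such a cut-off can be sketched for a single loop in isolation. The example below assumes that a loop with repetition probability p is repeated exactly k times with probability p^k · (1 − p) and keeps only the repetition counts whose probability reaches the 0.1% threshold. Treating one loop on its own, as well as the function name, is a simplifying assumption for illustration; complete execution paths of the process combine several loops.

```python
def kept_repetition_counts(
    p_repeat: float, threshold: float = 0.001
) -> list[tuple[int, float]]:
    """Return (k, probability) for every repetition count k whose path
    probability p_repeat**k * (1 - p_repeat) is at least `threshold`."""
    if not 0.0 <= p_repeat < 1.0:
        raise ValueError("p_repeat must lie in [0, 1)")
    kept = []
    k = 0
    while True:
        prob = (p_repeat ** k) * (1.0 - p_repeat)
        if prob < threshold:
            # Probabilities only shrink as k grows, so we can stop here.
            break
        kept.append((k, prob))
        k += 1
    return kept


# Loop with repetition probability 0.6 (e.g. pA of instance 1 in Table 1).
paths = kept_repetition_counts(0.6)
covered = sum(prob for _, prob in paths)
print(len(paths), covered)
```

For p = 0.6 this keeps the repetition counts 0 to 11, which together cover more than 99.7% of the probability mass, so the pruning discards very little while bounding the number of paths.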
    Fig. 7 visualizes the utilization functions for each resource.
For instance, Boris is over-occupied most of the time because he is allocated
to 5 Develop, 3 Test, and 4 Integration activities, which overlap. Grace
becomes over-occupied due to the start of project 6 at day 70. Edgar, Florence,
and Henry are also expected to be visibly over-occupied in some periods during
the execution of these 10 projects, though their situations are not as critical
as Boris’s because their activities overlap less often.
David’s utilization looks exceptionally low. Moreover, the quality metrics
defined in Section 4.3 indicate that the risk of resource under-occupancy
(m− = 2665.45) is roughly 10 times greater than the risk of resource
over-occupancy (m+ = 273.44). This can be intuitively confirmed by comparing
the amount of blue and red areas in Fig. 7.
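
The allocation counts mentioned above can be checked directly against Table 1. The following sketch transcribes the resource initials of the table, maps them to the names in Fig. 7, and counts allocations per activity type; it also computes the ratio of the two reported metric values. The data layout and identifiers are our own choice for this illustration.

```python
from collections import Counter

# Resource initials as used in Table 1, mapped to the names shown in Fig. 7.
NAMES = {"A": "Amy", "B": "Boris", "C": "Casey", "D": "David", "E": "Edgar",
         "F": "Florence", "G": "Grace", "H": "Henry", "I": "Ingrid", "J": "Jack"}

# Activity type of each r column in Table 1, left to right:
# DevA, TestA, DevB, TestB, DevC, TestC, Int*, Test*.
ACTIVITY_TYPES = ("Develop", "Test", "Develop", "Test",
                  "Develop", "Test", "Integrate", "Test")

# One string of resource initials per process instance (rows 1 to 10).
ALLOCATIONS = ["BGEHIJJC", "HGIHEHFI", "FECJCFBI", "JIGHEFBI", "BCBBIBBI",
               "GAEFEFBG", "BEEHCBDA", "HIIJADDG", "HCGJAHJC", "BGEFEDDE"]


def count_allocations(rows: list[str]) -> dict[str, Counter]:
    """Count, per resource, how many activities of each type it is allocated to."""
    counts = {name: Counter() for name in NAMES.values()}
    for row in rows:
        for initial, activity in zip(row, ACTIVITY_TYPES):
            counts[NAMES[initial]][activity] += 1
    return counts


counts = count_allocations(ALLOCATIONS)
print(dict(counts["Boris"]))

# Ratio of the reported under- and over-occupancy metrics.
ratio = 2665.45 / 273.44
print(round(ratio, 2))
```

The counts for Boris come out as 5 Develop, 3 Test, and 4 Integrate allocations, in line with the text, and the metric ratio is about 9.75.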
    Fig. 7 and the quality metrics suggest that (i) there are more resources
than needed in this setting, and (ii) the demand for resources can still be balanced
better, especially for Boris, to achieve a more robust execution.
    The computational complexity of our method is equivalent to that
of numerical integration, which is P-complete [22]. Its implementations run
in quasilinear time. Therefore, our method handles problems of real-world
size in a few seconds, and it is suitable for both design-time and run-time
prediction tasks.

6 Conclusions and Future Work

In this paper we have introduced a novel method for predicting resource utilization
in decision-intensive business processes. Our approach makes it possible to obtain a
resource utilization overview for the whole organization where the processes have
a stochastic nature. As a result, decision makers are provided with actual


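[Figure 7: one row per resource (Amy, Boris, Casey, David, Edgar, Florence, Grace, Henry, Ingrid, Jack) over a time axis in days with ticks at 40, 70, 100, 120, and 170; colour encodes utilization in four bands: 0-50%, 50-100%, 100-150%, >150%.]

Fig. 7: Compact visualization of resource utilization forecast with 10 resources
and 10 projects of the type shown in Fig. 6 over one year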

insights about their organizational feasibility. One advantage of our approach
is that it can be readily used in practice, and it can be incorporated in BPM
systems as a supplementary risk monitoring element.
    Our future work primarily involves conducting exhaustive evaluations to assess
the applicability of the approach in more real-world settings and to compare the
performance results. We also aim to integrate the approach into a BPM system
and to adapt the current design-time method for use at run time,
i.e. to make it more dynamic such that the resource utilization predictions
are updated during the execution of the process instances.


References
 1. Michael Rosemann and Jan vom Brocke. The six core elements of business process
    management. In Handbook on business process management 1, pages 105–122.
    Springer, 2015.
 2. Giray Havur, Cristina Cabanillas, Jan Mendling, and Axel Polleres. Resource
    Allocation with Dependencies in Business Process Management Systems. In BPM
    (Forum), volume 260, pages 3–19, 2016.
 3. Marlon Dumas, Marcello La Rosa, Jan Mendling, Hajo A Reijers, et al. Funda-
    mentals of Business Process Management, volume 1. Springer, 2013.
 4. Qiang Liu and Jin He. Study of complex loop patterns based on time petri net.
    In Computer Supported Cooperative Work in Design, 2007. CSCWD 2007. 11th
    International Conference on, pages 801–805. IEEE, 2007.
 5. Zhengxing Huang, Xudong Lu, and Huilong Duan. Resource behavior measure and
    application in business process management. Expert Systems with Applications,
    39(7):6458–6468, 2012.


 6. Anastasiia Pika, Wil MP van der Aalst, Colin J Fidge, Arthur HM ter Hofstede,
    and Moe T Wynn. Predicting deadline transgressions using event logs. Lecture
    Notes in Business Information Processing, 132:211–216, 2012.
 7. Anastasiia Pika, Michael Leyer, Moe T Wynn, Colin J Fidge, Arthur HM Ter Hof-
    stede, and Wil MP Van Der Aalst. Mining resource profiles from event logs. ACM
    Transactions on Management Information Systems (TMIS), 8(1):1, 2017.
 8. Raffaele Conforti, Massimiliano de Leoni, Marcello La Rosa, Wil MP van der Aalst,
    and Arthur HM ter Hofstede. A recommendation system for predicting risks across
    multiple business process instances. Decision Support Systems, 69:1–19, 2015.
 9. Paolo Ceravolo, Antonia Azzini, Ernesto Damiani, Mariangela Lazoi, Manuela
    Marra, and Angelo Corallo. Translating process mining results into intelligible
    business information. In Proceedings of the 11th International Knowledge
    Management in Organizations Conference on The changing face of Knowledge
    Management Impacting Society, page 14. ACM, 2016.
10. Michael Rosemann and Michael Zur Muehlen. Integrating risks in business process
    models. ACIS 2005 Proceedings, page 50, 2005.
11. Michael Zur Muehlen and Danny Ting-Yi Ho. Risk management in the BPM lifecycle.
    In International Conference on Business Process Management, pages 454–466.
    Springer, 2005.
12. Wil MP van der Aalst. Process mining: data science in action. Springer, 2016.
13. Wil MP Van der Aalst, M Helen Schonenberg, and Minseok Song. Time prediction
    based on process mining. Information systems, 36(2):450–475, 2011.
14. Andreas Rogge-Solti and Mathias Weske. Prediction of business process durations
    using non-markovian stochastic petri nets. Information Systems, 54:1–14, 2015.
15. Marijke Swennen, Niels Martin, Gert Janssenswillen, Mieke J Jans, Benoît Depaire,
    An Caris, and Koen Vanhoof. Capturing resource behaviour from event logs. In
    SIMPDA, 2016.
16. Arik Senderovich, Matthias Weidlich, Avigdor Gal, and Avishai Mandelbaum. Min-
    ing resource scheduling protocols. In International Conference on Business Process
    Management, pages 200–216. Springer, 2014.
17. Michael Arias, Eric Rojas, Jorge Munoz-Gama, and Marcos Sepúlveda. A frame-
    work for recommending resource allocation based on process mining. In Inter-
    national Conference on Business Process Management, pages 458–470. Springer,
    2015.
18. Suriadi Suriadi, Moe T Wynn, Jingxin Xu, Wil MP van der Aalst, and Arthur HM
    ter Hofstede. Discovering work prioritisation patterns from event logs. Decision
    Support Systems, 2017.
19. Francesco Folino, Massimo Guarascio, and Luigi Pontieri. Discovering context-aware
    models for predicting business process performances. In OTM Confederated
    International Conferences “On the Move to Meaningful Internet Systems”, pages
    287–304. Springer, 2012.
20. Raffaele Conforti, Sven Fink, Jonas Manderscheid, and Maximilian Röglinger.
    Prism–a predictive risk monitoring approach for business processes. In Inter-
    national Conference on Business Process Management, pages 383–400. Springer,
    2016.
21. Wil MP Van der Aalst, Kees M Van Hee, and Hajo A Reijers. Analysis of discrete-
    time stochastic petri nets. Statistica Neerlandica, 54(2):237–255, 2000.
22. Akitoshi Kawamura. Computational complexity in analysis and geometry. Uni-
    versity of Toronto, Toronto, Ont., Canada, 2011.

