=Paper=
{{Paper
|id=Vol-1592/paper02
|storemode=property
|title=Process Mining in IT Service Management: A Case Study
|pdfUrl=https://ceur-ws.org/Vol-1592/paper02.pdf
|volume=Vol-1592
|authors=Borja Vázquez-Barreiros,David Chapela,Manuel Mucientes,Manuel Lama,Diego Berea
|dblpUrl=https://dblp.org/rec/conf/apn/Vazquez-Barreiros16
}}
==Process Mining in IT Service Management: A Case Study==
Borja Vázquez-Barreiros¹, David Chapela¹, Manuel Mucientes¹, Manuel Lama¹, and Diego Berea²

¹ Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
{borja.vazquez, david.chapela, manuel.mucientes, manuel.lama}@usc.es

² Ozona Consulting, Santiago de Compostela, Spain
diego.berea@ozonaconsulting.com

Abstract. The explosion of process-related data in today's organizations has raised interest in exploiting these data to understand in depth how business processes are actually carried out. To address this need, process mining has emerged as a way to analyze the behavior of an organization by extracting knowledge from process-related data. In this paper, we present a case study of process mining in a real IT service management scenario. We describe an exploratory analysis of real-life event logs generated between 2012 and 2015 in four different processes designed within an IT platform. More specifically, we analyze the way the different requests and incidents registered in an organization are handled.

1 Introduction

In recent years, there has been a huge investment in developing technologies to automate the different tasks carried out in an organization, and to store all the possible information generated during these tasks. In particular, regarding business processes, this has led to an incredible growth in the amount of process-related data, i.e., execution traces of business activities. Clearly, the explosion of this kind of information has opened a door to provide insights into the actual way of working in an organization, to predict performance using simulation, to detect deviations in the process, or to improve the way certain business activities are executed [1].
Within this context, a business process, henceforth a process model, is understood as a collection of related, structured activities that produce a specific outcome, e.g., a product or a service. Typically, process models have a detailed description prescribing how tasks must or should be done, i.e., they describe the way of working. Unfortunately, there might be differences between the designed process model and how the process is actually executed [1]. Hence, creating process models is a difficult and error-prone task that can lead to an evident gap between what we think is going on (the a-priori process model) and what is really happening (the real process model). With this in mind, process mining has emerged as a way to analyze the behavior of an organization by extracting knowledge from process-related data, offering techniques to discover, monitor and enhance real processes [1]. Nowadays we can find several academic (PMLAB, https://www.cs.upc.edu/~jcarmona/PMLAB/), open-source (ProM, http://www.promtools.org), and commercial (Disco, https://fluxicon.com/disco) tools featuring an extensive set of analysis techniques for process mining.

Concerning its real applicability, process mining has been widely applied in multiple domains, showing an incredible potential as a link between Business Process Management and all kinds of analytical techniques that are not necessarily process-aware. Examples of this success can be found in a wide variety of fields; the IEEE CIS Task Force on Process Mining maintains an extensive compilation of successful case studies (http://www.win.tue.nl/ieeetfpm/doku.php?id=shared:process_mining_case_studies). In the literature, we can find several case studies providing insights into hospital and health care processes [2,3,4,5]; education [6,7]; manufacturing processes [8]; invoice verification processes in SAP [9]; monitoring of sea regulations [10]; financial services [11]; or purchase processes in IBM [12]. In this paper, we present an experience of applying process mining in a real IT service management (ITSM) scenario.
The dataset used in this case study consists of a total of 2,004 different requests and incidents generated in an organization between 2012 and 2015, and four different process models designed to handle such requests and incidents recorded in the system. Within this scenario, we follow a data-driven approach, seeking new insights and generating ideas and hypotheses for further research, i.e., this analysis has an exploratory character. Thus, the main idea behind this study is to find the gap between the modeled behavior, i.e., the as-designed process model or what the organization thinks is happening, and the observed behavior, i.e., the as-is process model or what is really happening. In this analysis, we mainly use the academic ProDiGen platform [13] (https://tec.citius.usc.es/processmining) and the open-source tool ProM.

2 Case study

The case under study in this paper is the analysis of the services used to handle requests and incidents in a real organization. In particular, this organization has implemented a central point of contact for handling customers, users and other issues within the company, e.g., stock capacity, changes on a particular process, financial management, etc. In order to plan, deliver, operate and control the offered IT services, this organization relies on an ITSM software platform. This platform provides a wide range of management services. In particular, it implements the means for monitoring the progress of different events, i.e., it logs process-related data. However, the ITSM tool does not provide any kind of process mining analysis technique, that is, it does not fully exploit this kind of information.
Hence, although plenty of process-related information is available, this organization has no clear idea of how the incidents and requests are handled. In this system, requests and incidents, henceforth tickets, are usually mapped to one or more management processes, i.e., process models. Thus, when a ticket is registered in the system, the linked management process indicates the actions to be performed. Each one of these process models is generated through a graphical tool, in which the designer defines, configures and organizes the different steps of the process model. Hence, each of these management processes specifies a guideline of steps indicating who, how and when should intervene during the processing of a ticket. In other words, in order to be properly handled, each ticket has to go through a sequence of defined steps. From the point of view of process mining, each one of these steps can be considered as an event, i.e., a specific activity in the system, and each ticket as a trace, i.e., a sequence of events. Therefore, for each process model defined in this organization, it is possible to extract an event log, i.e., a group of traces that consist of process events.

Concerning this case study, we identify four major process models: Issues resolution, Orders resolution, Standard changes and Emergency changes. Henceforth we denote these processes as Workflow 1, Workflow 2, Workflow 3 and Workflow 4, respectively. In total, these four processes are mapped to 2,004 tickets, i.e., traces, registered between 2012 and 2015. Hence, we are dealing with historic data, i.e., traces that, in theory, have been completed. Within this scenario, the main motivation is to obtain, from an exploratory point of view, insights into how the different tickets are being handled in reality, and whether they conform with what was designed. In other words, we aim to check and compare reality with what was planned.
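The mapping from tickets to traces described above can be illustrated with a few lines of Python. This is a minimal sketch; the tuple layout and the ticket identifiers are assumptions for illustration, not the platform's actual schema:

```python
from collections import defaultdict

# Hypothetical records as they might come out of the ITSM database:
# (ticket_id, activity, start_timestamp). Field names are illustrative.
raw_events = [
    ("T-001", "notify opening", "2014-03-01T09:00:00"),
    ("T-002", "notify opening", "2014-03-01T09:05:00"),
    ("T-001", "analysis & resolution", "2014-03-01T10:30:00"),
    ("T-001", "user validation", "2014-03-02T08:00:00"),
    ("T-002", "analysis & resolution", "2014-03-01T11:00:00"),
]

def build_event_log(events):
    """Group events by ticket id and sort each trace by start timestamp."""
    log = defaultdict(list)
    for ticket, activity, ts in events:
        log[ticket].append((ts, activity))
    # ISO-8601 strings of equal length sort chronologically as plain strings
    return {t: [a for _, a in sorted(evts)] for t, evts in log.items()}

event_log = build_event_log(raw_events)
# event_log["T-001"] == ["notify opening", "analysis & resolution", "user validation"]
```

Each key of `event_log` is a trace (a ticket), and each value the ordered sequence of events, which is exactly the structure the rest of the analysis operates on.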
Note that this exploratory analysis does not have a specific goal, i.e., through this analysis we do not aim to intervene, adjust or redesign the possible deviations and differences between modeled behavior and reality, but rather to gain insights for a further, question-driven and deeper analysis. We have executed the following steps to perform this exploratory case study. First, we extracted and filtered the data from the ITSM tool database. Then, we checked the conformance of reality against the a-priori process models. After this, we applied discovery techniques to retrieve the real processes. Finally, we measured the performance of the processes to detect bottlenecks and performance issues.

2.1 Data preparation

The first phase of the presented analysis consists of the preparation and exploration of the available process data between 2012 and 2015. Hence, in this step, the data was extracted from the database of the platform and converted into a standard event log storage format, i.e., XES [14]. Furthermore, we also translated the a-priori process models into their equivalent Petri net representation.

During this preprocessing step, we detected several traces that, during their lifespan, were linked to different process models. In other words, a ticket was initially handled through the steps of a specific process model and, at some point, was transferred to a different process model, having to start again. The main problem behind this particular behavior is that there is no clear indicator in the database of when a change of this type took place: the only information is when a ticket fired a step on a different process model.

Table 1: Event logs characteristics. #activities, #traces and #events stand for the number of activities, process instances, and events, respectively, in each event log.

              Workflow 1   Workflow 2   Workflow 3   Workflow 4
 #activities           5            7           12            7
 #traces             158        1,151           84          611
 #events             886        7,242          696        3,580
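Filtering out the tickets that were transferred between process models amounts to keeping only the tickets whose events all reference a single model. A hedged sketch, assuming the database rows can be reduced to (ticket, process model, activity) tuples:

```python
def split_by_model(events):
    """Separate tickets handled by a single process model from those that
    were transferred between models during their lifespan.

    `events` is a list of (ticket_id, process_model, activity) tuples,
    an assumed simplification of the platform's database records.
    """
    models = {}
    for ticket, model, _ in events:
        models.setdefault(ticket, set()).add(model)
    single = {t for t, m in models.items() if len(m) == 1}
    transferred = {t for t, m in models.items() if len(m) > 1}
    return single, transferred

events = [
    ("T-001", "Workflow 2", "notify opening"),
    ("T-001", "Workflow 2", "analysis & resolution"),
    ("T-002", "Workflow 1", "notify opening"),
    ("T-002", "Workflow 4", "analysis & resolution"),  # transferred mid-life
]
single, transferred = split_by_model(events)
# single == {"T-001"}, transferred == {"T-002"}
```

The tickets in `transferred` are the ones excluded from the rest of this case study.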
Clearly, a deeper study would involve analysing these particular cases and checking, for instance, the impact on the average resolution time of the tickets, or the reason behind these transfers. However, in this case study, we filtered out this kind of behavior, focusing the analysis on those tickets handled through a single process model only.

Once all the different tickets, i.e., the traces, were correctly identified, we found a problem related to the ordering of the events. Specifically, some events in the system lacked the start timestamp attribute, precluding the creation of the actual trace of events for each ticket. Note that we are working with atomic activities, and we sort the events of a trace based on the start timestamp. In particular, we detected that this behavior was always related to automatic activities. Hence, in order to properly sort the events of a trace, we set the start timestamp of these activities equal to their end timestamp. Finally, we also added an artificial start and end activity to each trace. After this filtering process, we created four different event logs, one for each process model, in XES format. Table 1 shows, for each prepared event log, the number of activities (#activities), the number of process instances or tickets (#traces), and the total number of events (#events).

2.2 Conformance analysis

Conformance checking aims to find discrepancies between the modeled behavior, i.e., the process model, and the observed behavior, i.e., the event data. As we have access to the a-priori process models, i.e., the desired behavior, we can replay the recorded behavior of the tickets on the a-priori process models and see how many of the handled tickets actually followed the defined steps. The metric used to this end is the replay fitness, which measures the extent to which process models can reproduce the traces recorded in the event log.
Among the different approaches in the literature related to this particular dimension, we have selected the cost-based fitness metric, based on alignments [15]. An alignment between a trace and a process model is a pairwise comparison between the executed activities and the activities allowed by the model. Such pairs are called moves. Three different moves can be distinguished: i) moves only on the event log, i.e., the process model does not allow the execution of a recorded event; ii) moves only on the process model, i.e., the process model needs to execute an activity not recorded in the event log; and iii) moves on both (synchronous moves), i.e., an event in the event log can be correctly replayed through the model.

Table 2: Cost-based fitness and number of tickets for the a-priori process models.

                                Workflow 1   Workflow 2   Workflow 3   Workflow 4
 Cost-based fitness                   0.87         0.89         0.96         0.89
 #correctly replayed tickets      57 (36%)    594 (51%)     60 (71%)    111 (18%)
 #incorrectly replayed tickets   101 (64%)    557 (49%)     24 (29%)    500 (82%)

Table 2 shows the cost-based fitness after aligning each event log with its process model. For instance, the event log Workflow 2 has a fitness of 0.89, that is, the model can reproduce 89% of the recorded events or, in other words, 11% of the recorded events deviate from the a-priori process model. Additionally, Table 2 also shows the number of tickets that were correctly and incorrectly replayed through their respective process model. In other words, each time a ticket deviates from the a-priori process model, even in a single event, it counts as an incorrectly replayed ticket. As can be seen, although the cost-based fitness is relatively high in all cases, there is a significant number of tickets, in all four event logs, that deviate from the a-priori process model, e.g., 82% of the tickets in Workflow 4 present a deviation, i.e., an event log/model move.
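The three move types can be made concrete with a small sketch. Here an alignment is simply a list of (log activity, model activity) pairs, with ">>" marking the side that does not move; the symbol and data layout are illustrative, not tied to any specific tool:

```python
SKIP = ">>"  # conventional symbol for "no move" on one side of an alignment

def classify_moves(alignment):
    """Split an alignment (list of (log, model) pairs) into the three move types."""
    sync = [(l, m) for l, m in alignment if l != SKIP and m != SKIP]
    log_only = [l for l, m in alignment if m == SKIP]    # not allowed by the model
    model_only = [m for l, m in alignment if l == SKIP]  # required but not recorded
    return sync, log_only, model_only

def correctly_replayed(alignment):
    """A ticket counts as correctly replayed only if every move is synchronous."""
    sync, log_only, model_only = classify_moves(alignment)
    return not log_only and not model_only

# A ticket that repeated 'analysis & resolution' (log move) and skipped
# 'update impact' (model move); activity names echo Workflow 1:
aln = [("notify opening", "notify opening"),
       ("analysis & resolution", "analysis & resolution"),
       ("analysis & resolution", SKIP),       # move only on the event log
       (SKIP, "update impact"),               # move only on the process model
       ("user validation", "user validation")]
# correctly_replayed(aln) -> False: a single deviation is enough
```

This is also why the per-ticket counts in Table 2 are much harsher than the event-level fitness: one non-synchronous move anywhere in the trace flips the whole ticket to "incorrectly replayed".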
Through replay fitness techniques we can also obtain a more detailed local diagnostic, allowing us to detect exactly where the deviations took place. For example, projecting the alignments between all traces in the event log of Workflow 1 onto the a-priori process model yields a visualization that shows the location of the deviations, as in the Petri net of Figure 1. In this visualization, each transition shows (in parentheses) the ratio of synchronous moves (left part) to moves only on the process model (right part). Colored places represent errors where moves only on the event log occur, and the size of a colored place represents the frequency of such moves. In more detail, the deviations in this process model occur in different locations.

Fig. 1: Diagnostic information showing the deviations for Workflow 1 on the a-priori model.

Table 3: Most skipped and wrongly executed activities for each workflow. The most skipped activity corresponds to model moves, while the most wrongly executed activity corresponds to log moves.

                                  Workflow 1                   Workflow 2                   Workflow 3                    Workflow 4
 Most skipped activity            update impact (85)           warning (497)                email to the petitioner (9)   priority validation (421)
 Most wrongly executed activity   analysis & resolution (54)   analysis & resolution (404)  analyze change order (13)     analysis & resolution (120)
On the one hand, we detect different model moves: update impact was skipped 85 times; user validation was skipped 29 times; analysis & resolution was skipped 4 times; and notify opening was skipped 2 times. On the other hand, regarding log moves, the activities analysis & resolution and user validation were executed 54 and 2 times, respectively, when the a-priori process model did not allow them to happen. Note that, for instance, considering all the possible moves, user validation was executed more than 158 times, i.e., more than the total number of traces. This means that this activity was executed in a loop situation, i.e., it appears more than once within the same trace.

Based on these diagnostics for Workflow 1, some of the insights that can be extracted are that, for instance, in 53% of the handled tickets the activity update impact was skipped and, in 29%, the activity user validation was also not involved, although both activities were designed as required in the a-priori process model. Another deviation is that in 34% of the total tickets, analysis & resolution was executed multiple times within the same trace, when this activity was designed to happen only once per process instance. Furthermore, it was also designed to be executed just before user validation but, in the previous cases, it was freely executed without any type of restriction, e.g., just after user validation. Table 3 depicts the most skipped (model moves) and most wrongly executed (log moves) activities for each process model. Further research will involve finding the reason behind this behavior, following a more question-driven analysis, e.g., in which situations is it possible to skip these activities? Is it related to a certain type of ticket?
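Aggregated over a whole event log, Table-3-style statistics reduce to counting moves. A sketch, with toy input that loosely echoes the Workflow 1 findings rather than reproducing the real alignments:

```python
from collections import Counter

SKIP = ">>"  # "no move" marker on one side of an alignment pair

def deviation_summary(alignments):
    """Aggregate model moves (skipped activities) and log moves (wrongly
    executed activities) over all alignments of an event log."""
    skipped, wrong = Counter(), Counter()
    for aln in alignments:
        for log_act, model_act in aln:
            if log_act == SKIP:
                skipped[model_act] += 1   # model move: required step not recorded
            elif model_act == SKIP:
                wrong[log_act] += 1       # log move: step the model does not allow
    return skipped.most_common(1), wrong.most_common(1)

alignments = [
    [(SKIP, "update impact"), ("analysis & resolution", SKIP)],
    [(SKIP, "update impact")],
    [("user validation", "user validation")],
]
top_skipped, top_wrong = deviation_summary(alignments)
# -> [("update impact", 2)], [("analysis & resolution", 1)]
```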
2.3 Discovery analysis

Continuing our exploratory analysis of how this organization is actually handling the different tickets, the next step involves discovering, from the control-flow perspective, the real processes based on the recorded events. This analysis usually starts with the visualization of the underlying discovered process model. Based on the previous conformance checking analysis, all four a-priori models have a fitness over 0.85, i.e., more than 85% of the events happened as planned. Hence, among other indicators such as the number of different process instances, it is quite clear that, from the point of view of process discovery, we are dealing with Lasagna processes [1], i.e., the real processes are relatively structured and the cases flowing through them are handled in a controlled manner. It should therefore be possible to discover process models with high(er) values of replay fitness and with a clear structure.

Figures 2 and 3 demonstrate that this is indeed the case. More specifically, Figures 2a and 3a show the original process models discovered by ProDiGen for the event logs Workflow 2 and Workflow 4, respectively, in C-net format. A C-net is a graph where nodes represent activities and arcs represent causal dependencies. In this representation there are no places; the routing logic is represented solely by the possible input and output bindings. Both process models reproduce all the behavior of the tickets recorded in the event logs, i.e., they have a perfect replay fitness. Furthermore, in order to focus only on the main behavior of the process models, we also pruned the arcs used in less than 5% of the cases. Figures 2b and 3b show the resulting pruned models. Additionally, for each process model, we annotated each arc and transition with its frequency of use.
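The 5% pruning applied to obtain Figures 2b and 3b can be sketched as a simple frequency filter over the annotated arcs. This is a hypothetical illustration; ProDiGen's actual pruning may differ, and the arc counts below are taken loosely from Figure 2a:

```python
def prune_arcs(arc_freq, total_traces, threshold=0.05):
    """Drop arcs used by less than `threshold` of the traces, keeping only
    the main behavior of the discovered model.

    `arc_freq` maps (source_activity, target_activity) -> usage count.
    """
    return {arc: n for arc, n in arc_freq.items()
            if n / total_traces >= threshold}

arcs = {
    ("Start process", "analysis & resolution"): 1131,
    ("analysis & resolution", "user notification"): 933,
    ("warning", "analysis of rejection"): 2,   # rare path
}
main_behavior = prune_arcs(arcs, total_traces=1151)
# the rare ("warning", "analysis of rejection") arc (2/1151, about 0.2%) is removed
```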
The rather structured discovered process models, coupled with their perfect replay fitness, are of special interest to the stakeholders to detect both frequent and infrequent behavior. When these results were presented to the stakeholders, they confirmed that the models representing the most frequent behavior, e.g., the process models in Figures 2b and 3b, were, with slight differences, what they expected. However, they also detected unexpected behavior and deviations in the way certain tickets are handled. Table 4 summarizes the exceptional insights retrieved from this discovery analysis, i.e., loops that in theory are not allowed (unexpected loops), or activities that are mandatory but in reality can be skipped (unexpected skips).

Fig. 2: Annotated C-nets discovered by ProDiGen for the event log Workflow 2: (a) original discovered model; (b) filtered discovered model.

Table 4: Unexpected behavior for each discovered process model. Unexpected loops and unexpected skips stand for behavior that, in theory, is not allowed, but that the discovered process model can execute.

                     Workflow 1              Workflow 2                          Workflow 3             Workflow 4
 Unexpected loops    analysis & resolution   analysis & resolution               -                      analysis & resolution
 Unexpected skips    update impact           user validation, user notification  analyze change order   priority validation, user validation
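Checking for unexpected loops of this kind, i.e., activities designed to run once per instance that appear several times in a trace, is straightforward once the event log is available as traces. A minimal sketch with made-up ticket ids:

```python
from collections import Counter

def unexpected_loops(event_log, single_execution_activities):
    """Flag (ticket, activity) pairs where an activity that should run once
    per process instance appears multiple times in the trace."""
    offenders = set()
    for ticket, trace in event_log.items():
        counts = Counter(trace)
        for act in single_execution_activities:
            if counts[act] > 1:
                offenders.add((ticket, act))
    return offenders

log = {
    "T-001": ["analysis & resolution", "user validation"],
    "T-002": ["analysis & resolution", "analysis & resolution", "user validation"],
}
loops = unexpected_loops(log, {"analysis & resolution"})
# -> {("T-002", "analysis & resolution")}
```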
Among the deviations detected by the stakeholders, the high frequency of the self-loop on the activity analysis & resolution in both processes (Figures 2b and 3b) is noteworthy. Additionally, also in both processes, the activity user validation, in theory mandatory, can be skipped in the discovered model. In other words, this means that there were tickets that completed without any kind of validation.

Fig. 3: Annotated C-nets discovered by ProDiGen for the event log Workflow 4: (a) original discovered model; (b) filtered discovered model.

The ProDiGen platform provides a visual interface where stakeholders can reproduce, through the discovered process models, the actual path of each trace. This was crucial to retrieve valuable information about the behavior of the whole process. Figure 4 shows a snapshot of this process player on the event log Workflow 4. The process player provides a set of controls to replay the event log over the process model. In this example, we have grouped the traces that followed the same path in the process model. Hence, in this visualization we show the path (the dark gray activities and arcs) followed by the traces grouped in Group 2, i.e., 131 different traces that share the same sequence of activities.
Furthermore, the player also shows different performance statistics (this kind of analysis is covered in more detail in Section 2.4) in the left part of Figure 4, such as the average completion time of this group of traces, i.e., 13 days, or the average completion time of each activity considering only this group of traces, e.g., user validation took 5 days on average within these 131 traces. Additionally, the player also makes it possible to replay specific traces, allowing more fine-grained, visual insights into how, by whom and when a specific trace was executed through the discovered process model.

Fig. 4: Snapshot of the process player, in the ProDiGen platform, on the process model discovered for the event log Workflow 4.

Furthermore, through this discovery analysis we obtained valuable feedback from the people involved in the process on how to improve the mining of these kinds of processes in further analyses. Specifically, the timestamp is of particular value when mining these kinds of event logs. For instance, when the same activity is executed multiple times in a short period of time, it should be considered as a single execution of that activity. Additionally, when two different activities are executed sequentially in a short period of time, they should be considered to execute in parallel, regardless of whether they always appear as a sequence.

2.4 Performance analysis

The last part of this case study relies on performance analysis. Process mining provides a wide range of performance techniques [1]. Among them, the dotted chart is one of the most powerful tools to view a process from different angles. In this paper, we use the dotted chart to gain an overall view of the performance of the event log Workflow 2.
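Before moving on, the first of the two timestamp heuristics suggested by the stakeholders, collapsing rapid repetitions of the same activity into a single execution, could be sketched as follows; the 5-minute window is an arbitrary choice for illustration:

```python
from datetime import datetime, timedelta

def merge_quick_repeats(trace, window=timedelta(minutes=5)):
    """Collapse consecutive executions of the same activity that happen
    within `window`, treating them as a single execution.

    `trace` is a list of (activity, timestamp) tuples, already ordered.
    """
    merged = []
    for activity, ts in trace:
        if merged and merged[-1][0] == activity and ts - merged[-1][1] <= window:
            merged[-1] = (activity, ts)   # same activity: extend, do not repeat
        else:
            merged.append((activity, ts))
    return merged

t = datetime(2014, 9, 1, 9, 0)
trace = [("analysis & resolution", t),
         ("analysis & resolution", t + timedelta(minutes=2)),  # quick repeat
         ("user validation", t + timedelta(hours=3))]
merged = merge_quick_repeats(trace)
# -> two events: one 'analysis & resolution', one 'user validation'
```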
Figure 5 shows this dotted chart. In this chart, time is measured along the horizontal axis, and each trace is represented along the vertical axis, where each dot is an event. The color of each dot represents the activity of the process, e.g., the red dots represent the activity user validation. Note that, in this visualization, we omitted the artificial start and end activities.

Fig. 5: Dotted chart, retrieved with ProM, for the event log Workflow 2 using absolute (real) times for the horizontal axis. Each horizontal division represents a calendar month. The vertical axis represents the traces.

Based on this dotted chart of the event log, different observations can be made. At first glance, we can see that the process does not follow a constant arrival of tickets, i.e., the initial events of all traces do not form a straight line: there is a clear difference between the influx of tickets before and after 2014. More specifically, we can see a significant increase in the arrival rate of tickets in the last months of 2014. Moreover, this arrival is quite steady, i.e., we can draw a rather straight line alongside the initial events during this period of time (bottom right of Figure 5), i.e., from 2014 onwards. Additionally, it seems that, in some situations, certain sets of activities are executed in batches, i.e., the same activities are executed for different cases in the same interval of time. For instance, the activities inside the colored boxes (bottom right of Figure 5) seem to show this particular behavior. We can also notice periods of inactivity (some of them are marked with the colored circles in the middle of Figure 5), where no events were recorded. On the other hand, for some tickets, events are recorded a long time after their arrival, whereas for the majority of the tickets most events are observed within the first couple of days. Figure 6 shows a better view of this behavior.
In this figure, we use relative times, i.e., all the tickets start at time zero, and they are sorted in descending order by their real duration. As can be seen, most of the cases ended within the first days, or even on the same day, of being registered in the organization. However, we can find tickets that took much longer than expected, e.g., more than 10 days. Furthermore, we can see exceptional cases that took even more than a year (the bottom of Figure 6). In general, it seems that when a ticket goes through the activities user validation and warning, i.e., the events represented by a red and a pink dot, respectively, the time to handle the ticket increases.

Fig. 6: Dotted chart, retrieved with ProM, for the event log Workflow 2 using relative times for the horizontal axis, i.e., all traces start at time 0. Each horizontal division represents 30 days. The vertical axis represents the traces.

As shown, it is possible to divide the real process of Workflow 2 into two time intervals, based on the inflow of tickets: before and after the increase in the arrival rate of tickets in 2014. Furthermore, we detected the same pattern in the other three event logs within the organization. With this in mind, an interesting analysis would involve discovering the process models in these two time intervals to scrutinize how the process behaves under such conditions. Hence, we created two different logs with the tickets before and after 2014 for Workflow 2, more specifically, before and after September 2014. Figure 7 shows the discovered process models for these two time intervals. Again, both solutions have a perfect replay fitness, i.e., they reproduce all the recorded behavior in both event logs. As can be seen, the process model describing the behavior of the tickets recorded after 2014 (Figure 7b) is better structured than the process model discovered for the tickets recorded before 2014 (Figure 7a).
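Splitting an event log at a cutoff date, as done here around September 2014, only requires deciding on which side each ticket's arrival (its first event) falls. A sketch, with the log layout assumed from the earlier examples:

```python
from datetime import datetime

def split_log(event_log, cutoff=datetime(2014, 9, 1)):
    """Split an event log into before/after sublogs by the arrival time
    of each ticket, i.e., the timestamp of its first event."""
    before, after = {}, {}
    for ticket, events in event_log.items():
        arrival = min(ts for _, ts in events)
        (after if arrival >= cutoff else before)[ticket] = events
    return before, after

log = {
    "T-001": [("notify opening", datetime(2013, 5, 2))],
    "T-002": [("notify opening", datetime(2015, 1, 10))],
}
before, after = split_log(log)
# before holds T-001 (arrived 2013), after holds T-002 (arrived 2015)
```

Each sublog can then be fed to the discovery algorithm separately, which is how the two models of Figure 7 were obtained.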
Furthermore, the process model of Figure 7b, i.e., the modeled behavior after September 2014, describes, with slight differences, the expected behavior designed by the stakeholders. However, the discovered model of Figure 7a, i.e., the modeled behavior before September 2014, depicts more deviations and unexpected behavior. In other words, after September 2014 there was a significant improvement in the way the different tickets were handled through this particular workflow, with fewer deviations, i.e., fewer tickets that did not follow the rules defined in the desired process model. Note that, after splitting the behavior in all the remaining event logs, we found the same pattern in the discovered process models, i.e., after September 2014, the different tickets were handled in a stricter way, as stated in the different designed process models.

Fig. 7: Annotated C-nets discovered by ProDiGen for the event log Workflow 2: (a) discovered process model for the tickets recorded before September 2014; (b) discovered process model for the tickets recorded after September 2014.

Performance analysis can also be achieved by enhancing process models with, for instance, the time attributes of the events recorded in the event log.
In other words, using the process models previously discovered in Section 2.3 and the timestamps of the events, it is possible to detect bottlenecks and other types of behavior that could negatively affect the overall performance of the process. Figure 8 shows the throughput of the process model discovered in Section 2.3 for the event log Workflow 2. This model was extended with time information for both the activities and the layovers between them, i.e., the time between the completion of the preceding activity and the start of the next one. In this process model, each arc is annotated with the average time between activities; moreover, the darker an arc is in comparison with the rest, the longer the time between the two activities. The same annotation is used for the duration of the activities. As can be seen, most of the layovers between activities are almost automatic, taking a second or less. However, one layover stands out from the rest: the arc from analysis of rejection to user validation, which takes 92 days. Concerning the activities, it is possible to identify several that are automatic, i.e., they are instant. However, we also find an activity that took, on average, 90 days: analysis of rejection.

Fig. 8: C-net discovered for the event log Workflow 2, extended with the time perspective.
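The layover annotation of Figure 8 boils down to averaging, per arc, the gap between the end of one activity and the start of the next. A sketch; the single toy trace below is constructed to reproduce the 92-day layover, not taken from the real log:

```python
from datetime import datetime, timedelta
from collections import defaultdict

def average_layovers(traces):
    """Average layover per arc: time between the completion of one activity
    and the start of the next, used to spot bottlenecks.

    Each trace is a list of (activity, start, end) tuples, already ordered.
    """
    gaps = defaultdict(list)
    for trace in traces:
        for (a, _, a_end), (b, b_start, _) in zip(trace, trace[1:]):
            gaps[(a, b)].append((b_start - a_end).total_seconds())
    return {arc: sum(g) / len(g) for arc, g in gaps.items()}

t = datetime(2014, 9, 1, 9, 0)
traces = [[
    ("analysis of rejection", t, t + timedelta(days=90)),   # long-running step
    ("user validation", t + timedelta(days=182), t + timedelta(days=183)),
]]
layovers = average_layovers(traces)
# the ("analysis of rejection", "user validation") layover is 92 days
```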
Fig. 9: Most frequent pattern within the process model discovered for the event log Workflow 2.

From a global perspective, if we analyse the frequency of use of this part of the model (Figure 2a), analysis of rejection was executed only two times in the whole process, yet it was very time consuming for the overall process performance.

Within the visual interface provided by ProDiGen, it is also possible to detect the most frequent patterns in a process model, allowing the visualization of its critical parts. These patterns are extracted using both the information of the event log and the discovered process model. Hence, with this algorithm it is possible to detect subprocesses (both activities and control structures) that can be replaced with high-level activities to, for instance, reduce the complexity of a process model. For instance, Figure 9 highlights the most frequent pattern within the process model discovered for Workflow 2, with a threshold of 70%, i.e., the pattern must be fulfilled in more than 70% of the process executions. In this particular case, the most frequent pattern in the whole model is the sequence analysis & resolution → user notification → user validation, with a frequency of 81%, i.e., 81% of the behavior recorded in the event log went through this pattern. This visualization allows us to easily check which part of the model is the most congested.

3 Conclusions

We have presented a case study of applying process mining to a real IT service management scenario. The data set comprises 2,004 requests and incidents recorded between 2012 and 2015 within an organization. First, we extracted and prepared the data. Then, based on the a-priori models, we performed a conformance checking analysis, followed by the discovery of the real process models. Finally, we measured the actual performance of the processes.
In this paper, we focused on a data-driven project, providing valuable insights into the way the different tickets are handled: for instance, the change of behavior in the global processes before and after 2014. Other insights are related to how, for some tickets, it was necessary to redo certain activities due to wrongly assigned tickets, or how, for other tickets, there was no validation and/or notification to the user and/or staff. Regarding the latter, this behavior was due to anticipated cancellations that were not properly recorded in the system, leading to open tickets or unexpected endings. Further analysis will involve answering the remaining open questions, such as the analysis of the tickets that were involved in more than one process model, the handover of work, etc.

Acknowledgments. This research was supported by the Spanish Ministry of Economy and Competitiveness (grant TIN2014-56633-C3-1-R, co-funded by the European Regional Development Fund - FEDER program) and the Galician Ministry of Education under the projects EM2014/012, CN2012/151, and GRC2014/030.

References

1. van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of Business Processes. 1st edn. Springer (2011)
2. Mans, R.S., Schonenberg, M.H., Song, M., van der Aalst, W.M.P., Bakker, P.J.M.: Process mining in healthcare - A case study. In: Proceedings of the First International Conference on Health Informatics, HEALTHINF. (2008) 118–125
3. Partington, A., Wynn, M., Suriadi, S., Ouyang, C., Karnon, J.: Process mining for clinical processes: A comparative analysis of four Australian hospitals. ACM Transactions on Management Information Systems 5(4) (2015) 1–19
4. Forsberg, D., Rosipko, B., Sunshine, J.L.: Analyzing PACS usage patterns by means of process mining: Steps toward a more detailed workflow analysis in radiology. Journal of Digital Imaging 29(1) (2015) 47–58
5.
Sztyler, T., Völker, J., Carmona, J., Meier, O., Stuckenschmidt, H.: Discovery of personal processes from labeled sensor data – An application of process mining to personalized health care. In: Proceedings of the 2015 International Workshop on Algorithms & Theories for the Analysis of Event Data, ATAED. Volume 1371 of CEUR. (2015) 22–23
6. Sedrakyan, G., Weerdt, J.D., Snoeck, M.: Process-mining enabled feedback: “tell me what I did wrong” vs. “tell me how to do it right”. Computers in Human Behavior 57 (2016) 352–376
7. Vázquez-Barreiros, B., Lama, M., Mucientes, M., Vidal, J.: Softlearn: A process mining platform for the discovery of learning paths. In: Proceedings of the 14th International Conference on Advanced Learning Technologies, ICALT. (2014) 373–375
8. Park, M., Song, M., Baek, T.H., Son, S., Ha, S.J., Cho, S.: Workload and delay analysis in manufacturing process using process mining. In: Proceedings of the 3rd Asia Pacific Conference on Asia Pacific Business Process Management, AP-BPM. Volume 219. (2015) 138–151
9. Stolfa, J., Kopka, M., Stolfa, S., Kobersky, O., Snásel, V.: An application of process mining to invoice verification process in SAP. In: Proceedings of the 4th International Conference on Innovations in Bio-Inspired Computing and Applications, IBICA. Volume 237 of Adv. Intell. Syst. Comput. (2014) 61–74
10. Spagnolo, G.O., Marchetti, E., Coco, A., Scarpellini, P., Querci, A., Fabbrini, F., Gnesi, S.: An experience on applying process mining techniques to the Tuscan port community system. In: Proceedings of the 8th International Conference on Software Quality. The Future of Systems- and Software Development, SWQD. Volume 238 of LNBIP. (2016) 49–60
11. Weerdt, J.D., Schupp, A., Vanderloock, A., Baesens, B.: Process mining for the multi-faceted analysis of business processes – A case study in a financial services organization. Computers in Industry 64(1) (2013) 57–67
12.
van Eck, M.L., Lu, X., Leemans, S.J.J., van der Aalst, W.M.P.: PM2: A process mining project methodology. In: Proceedings of the 27th International Conference on Advanced Information Systems Engineering, CAiSE. Volume 9097 of LNCS. (2015) 297–313
13. Vázquez-Barreiros, B., Mucientes, M., Lama, M.: ProDiGen: Mining complete, precise and minimal structure process models with a genetic algorithm. Information Sciences 294 (2015) 315–333
14. Verbeek, H.M.W., Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P.: XES, XESame, and ProM 6. In: Information Systems Evolution. Volume 72 of LNBIP. (2011) 60–75
15. Adriansyah, A.A.: Aligning observed and modeled behavior. PhD thesis, Technische Universiteit Eindhoven (2014)