How Planning Techniques Can Help Process Mining: The Conformance-Checking Case Massimiliano de Leoni1 , Andrea Marrella2 1 Eindhoven University of Technology, The Netherlands m.d.leoni@tue.nl 2 Sapienza - University of Rome, Italy marrella@dis.uniroma1.it Abstract. Modern organizations execute processes to deliver product and ser- vices, whose enactment needs to adhere to laws, regulations and standards. Con- formance checking is the problem of pinpointing where deviations are observed in the process event data. Literature proposes solutions for the conformance- checking problem that, in fact, are implementations of planning algorithms built ad-hoc for the specific problem. Unfortunately, in the era of big data, these ad-hoc implementations do not scale sufficiently compared with robust, well-established planning systems. Furthermore, their ad-hoc nature does not allow for seamlessly plugging in new outperforming planning algorithms or heuristics, causing a mas- sive amount of work to be necessary to incorporate and evaluate alternatives. This paper summarizes our last results on how instances of the conformance checking problem can be represented as classical planning problems in PDDL for which planners can find a correct solution in a finite amount of time. If conformance checking problems are converted into planning problems, one can seamlessly update to the recent versions of the best performing automated planners, with evident advantages in term of versatility and customization. Experiments with three different planners highlight this versatility. Furthermore, by employing sev- eral processes and event logs of different sizes, we show how our planning-based approach outperforms existing approaches of several order of magnitude and, in certain cases, carries out the task while existing approaches run out of memory. 1 Introduction Process mining is a discipline that sits between data mining and process modeling and analysis and, hence, can be considered one of the links between data science and pro- cess science [1]. The idea of process mining is to discover, monitor and improve the processes by extracting knowledge from the data that are stored in information systems about how these systems are used to carry out processes. Differently from a-priori anal- ysis, the focus is not on the assumed processes but on real processes in the way that they are executed. Therefore, the starting point of process mining is event data, which is analyzed to extract useful insights and recurrent patterns about how processes are executed within organizations. Within Process Mining, conformance checking verifies whether the observed be- havior stored in an event log is compliant with a process model that encodes how the process is allowed to be executed to ensure that norms and regulations are not violated. As an example, suppose that a process model may indicate that orders for more than 100000 Euros require a double check: conformance checking will check whether the actual executions follows this rule or not. The notion of alignment [1, 2] provides a robust approach to conformance checking, which makes it possible to exactly pinpoint the deviations causing nonconformity with a high degree of detail. An alignment between a recorded process execution (log trace) and a process model is a pairwise matching between activities recorded in the log and activities allowed by the model. Sometimes, activities as recorded in the log (events) cannot be matched to any of the activities allowed by the model (process activities). In general, a large number of possible alignments exist between a process model and a log trace, since there may exist manifold explanations why a trace is not con- forming. It is clear that one is interested in finding the most probable explanation. In [1, 2], an approach is proposed that is based on the principle of the Occam’s razor: the most parsimonious explanation is preferable. Therefore, one should not aim to find any alignment but, precisely, one of the alignments with the least expensive deviations (one of the so-called optimal alignments). The work by Adriansyah et al. [2] provide a technique to compute these optimal alignments, based on the A* algorithm and ad-hoc heuristics. However, experiments (also reported in this paper) show that their technique does not scale sufficiently. In the era of big data, scalable approaches are desperately necessary, as also advocated by the IEEE Task Force in Process Mining [3]: “In some domains, mind-boggling quan- tities of events are recorded. [...] Therefore, additional efforts are needed to improve performance and scalability.”. We believe that re-implementing standard planning algorithms, such as A*, for solv- ing specific planning problems in an ad-hoc way is not ideal. First, ad-hoc implemen- tations cannot compare in scalability with well-established planning systems. Second, ad-hoc implementations prevent one from seamlessly plugging in new algorithms or evaluate alternative heuristics. As a consequence, the possibility of evaluating differ- ent alternatives is hampered; also, if the literature proposes new algorithms that clearly overtake the existing ones for solving instances of the conformance checking problem, a massive amount of work is needed to modify those ad-hoc implementations. Hence, in order to facilitate the integration of different planning algorithms, in [4], we illustrate how the problem of computing optimal alignments can be formulated as a planning problem, which can be solved through off-the-shelf PDDL planners. Here, we summarize the main ideas discussed in [4] and report on the most inter- esting results from the experiments. The results show that our planning-based approach is versatile, allowing multiple planners to be integrated, and scales better than existing approaches with larger models and event logs. 2 Petri Nets and Event Logs To model business processes, we opted for Petri nets (see, e.g., [1]), which have proven to be adequate [5] for this purpose. However, the approach described in this paper can be applied to other notations, such as BPMN. A Petri net N = (P, T, F ) is a directed graph with a set P of nodes called places and a set T of transitions. The nodes are connected via directed arcs F ⊆ (P × T ) ∪ (T × P ) . Connections between two nodes of the same type are not allowed. Places are represented by circles and transitions by rectangles. Fig. 1 illustrates an example of Petri nets. Given a transition t ∈ T , • t is used to indicate the set of input places of t, which are the places p with a directed arc from p to t (i.e., such that (p, t) ∈ F ). Similarly, t• indicates the set of output places, namely the places p with a direct arc from t to p. At any time, a place can contain zero or more tokens, drawn as black dots. The state of a Petri net, a.k.a. marking, is determined by the number of tokens in places. Therefore, a marking m is a function m : P → N. In any run of a Petri net, the num- ber of tokens in places may change, i.e., the Petri net marking. A transi- b tion t is enabled at a marking m iff start p1 p3 each input place contains at least one a τ d token, i.e. ∀ p ∈ • t, M (p) > 0. p2 end A transition t can fire at a mark- p4 c ing m iff it is enabled. As result of firing a transition t, one token is e “consumed” from each input place and one is “produced” in each output t place. This is denoted as m → − m0 . In Fig. 1. An example of Petri net. the remainder, given a sequence of transition firing σ = ht1 , . . . , tn i ∈ σ T ∗ , m0 − → mn is used to indicate t1 t2 tn m0 −→ m1 −→ . . . −→ mn . Event logs are the starting point for process mining. An event log L is is a multi-set of log traces σL ∈ T ∗ . A trace is a sequence of transition firings and describes the exection of a process instance in terms of the executed activities.3 Transition firings in an event log are known as events. Some transitions do not represent process activities but are necessary to correctly represent the business process through Petri nets. These transitions are invisible tran- sitions (similar to τ steps) and are not recorded as log events. For the sake of space, we assume here that each activity is mapped to one and only one Petri-net transition and that events are defined over the same alphabet as the Petri-net transitions. In [4], we show how these assumptions can be removed. 3 Alignment between Event Logs and Petri Nets As mentioned in Section 1, we perform conformance checking by constructing an align- ment of event log L and process model N [1, 2], which allows us to exactly pinpoint where deviations occur. On this aim, the events in the event log need to be related to transitions in the model, and vice versa. Building this alignment is far from trivial, since the log may deviate from the model at an arbitrary number of places. We need to relate “moves” in the log to “moves” in the model in order to establish an alignment between a process model and an event log. However, it may be that some of the moves in the log cannot be mimicked by the model and vice versa. We explicitly denote such “no moves” by . 3 We use multisets because the same trace can appear multiple times because different process executions can be identical in terms of activities being executed. Definition 1 (Alignment Moves). Let N = (P, T, F ) be a Petri net and L be an event log. A legal alignment move for N and L is represented by a pair (sL , sM ) ∈ (T ∪ { } × T ∪ {}) \ {(, )} such that: – (sL , sM ) is a move in log if sL 6=  and sM = , – (sL , sM ) is a move in model if sL = and sM ∈ T , – (sL , sM ) is a synchronous move if sL = sM . An alignment is a sequence of alignment moves: Definition 2 (Alignment). Let N = (P, T, F ) be a Petri net with an initial marking and final marking denoted with mi and mf Let also L be an event log. Let ΓN be the universe of all alignment moves for N and L. Let σL ∈ L be a log trace. Sequence γ ∈ ΓN∗ is an alignment of N and σL if, ignoring all occurrences of , the projection on the first element yields σL and the projection on the second yields a sequence σ 00 ∈ T ∗ σ 00 such that mi −−→ mf . A move in log for a transition t indicates that t occurred when not allowed; a move in model for a visible transition t indicates that t did not occur, when, conversely, expected. a d b c  Many alignments are possible for the same trace. For ex- γ1 = ample, Fig. 2 shows three possible alignments for a trace a  b c d a  d b c σ1 = ha, d, b, ci. Note how moves are represented verti- γ2 = cally. For example, as shown in Fig. 2, the first move of a τ d   a  d  b c  γ1 is (a, a), i.e., a synchronous move of a, while the the γ3 = second and fifth move of γ1 are a move in log and model, a τ  e b c d respectively. As discussed in Section 1, we aim at finding Fig. 2. Alignments of a complete alignment of σ and N with minimal number L trace ha, d, b, ci and the of deviations for visible transitions, also known in litera- process model in Fig. 1. ture as optimal alignment. Aligning an event log L means to compute an optimal alignment for each trace σL ∈ L. Clearly, different traces in L generally have different alignments, because they contain different events and deviations. With reference to the alignments in Fig. 2, γ1 and γ2 have two moves in model and/or log for visible transitions (γ1 has one move in model for an invisible transition but it does not count for computing optimal alignments). Con- versely, γ3 has three moves for visible transitions, specifically one log move for d (3rd move) and two model moves for e (4th move) and d (last move). Since no alignment exists with only one moves in model and/or log for visible transitions, both γ1 and γ2 are optimal alignments and any can equally be returned. For the sake of space, we as- sume here that all deviations (i.e., moves in model for visible transitions and moves in log) have the same severity. In [4], we show how this assumption can be removed. 4 Encoding the Alignment Problem in PDDL Planning systems are problem-solving algorithms that operate on explicit representa- tions of states and actions [6]. PDDL [7] is the standard Planning Domain Definition Language; it allows us to formulate a planning problem P = hI, G, PD i, where I is the description of the initial state of the world, G is the desired goal state, and PD is the planning domain. PD is built from a set of predicates and fluent describing the state of the world and a set of actions that can be executed in the domain. This section illustrates how, given an event log L and Petri net N = (P, T, F ) with given initial and final markings mi , mf , the problem of the alignment of N and σL = he1 , . . . , en i ∈ L can be encoded as a PDDL planning problem, which can be solved by state-of-the-art planners. The plan synthesized will consist of a sequence of alignment moves that establish an alignment between σL and N . As throughout discussed in [4], the proposed encoding of an alignment problem A as planning problem P is such that one can find a solution for A by solving P. Also, planning algorithms always terminate when the input is a planning problem P that encodes A according to the procedure proposed in this paper. In the remainder of this paper, we assume Petri nets to be 1-bounded. This is not a large limitation as the behavior allowed by most of business processes can be repre- sented as 1-bounded Petri nets (see discussion in [4]). 4.1 Predicates and fluents In the planning domain PD , we provide three abstract types called place, transition and event. The types place and transition represent respectively the places and the transitions of N . The type event is used to record the list of events that occur in the specific trace σL that must be checked for conformance. For trace σ1 = ha, d, b, ci and the Petri net in Fig. 1, the following objects are introduced in P: (:objects a b c d1 e inv - transition start p1 p2 p3 p4 end - place e1 e2 e3 e4 evEND - event) In addition to the four events, denoted as e1, ..., e4, we introduce the invisible transition inv and a fictitious evEN D, which identifies the end of trace. The introduction of this fiction evEN D is not specific of this example, but it is general for any instance of the alignment problem. To capture all possible markings of N and the evolution of σL during an alignment, we define four boolean predicates in PD , as follows: token. For each p ∈ P , (token ?p - place) holds iff p contains a token in the currently reached marking. succ. For each 1 ≤ n < |σL |, (succ ?e1 - event ?e2 - event) holds if e1 = ei and e2 = ei+1 . tracePointer. For each event e ∈ σL , (tracePointer ?e - event) holds when, during the computation of the alignment, e is the next trace event to align. associated. For each event e ∈ σL , (associated ?e - event ?t - transition) holds when t = e. In addition to the predicates above, we introduce a fluent total-cost to keep track of the number of deviations found in the alignment till that point. Synchronous moves are associated with no costs and, hence, no fluent needs to be introduced. For trace σ1 = ha, d, b, ci and the Petri net in Fig. 1, the initial state of P with the definition of predicates and fluents is as follows: (:init (token start) (tracePointer e1) (succ e1 e2) (succ e2 e3) (succ e3 e4) (succ e4 evEND) (associated e1 a) (associated e2 d) (associated e3 b) (associated e4 c) (= (total-cost) 0)) At the end, for the example in question, when the alignment is found, the Petri net needs to be in the final marking, i.e., with one token in place end and zero tokens in any other place, and the trace pointer has reached evEND: (:goal (and (tracePointer evEND) (token end) (not (token p1)) (not (token p2)) (not (token p3)) (not (token p4)) (not (token start)))) Readers should notice that, since our purpose is to minimize the total cost of the alignment, the planning problem also contains the following specification: (:metric minimize (total-cost)). 4.2 Planning Actions The plan to reach the final goal from the initial state is constituted by a sequence of alignment moves, each of which is a planning action. Therefore, three classes of actions exist in PD : synchronous moves, model moves and log moves. Synchronous moves. A separate action exists for each visible transition t ∈ T to rep- resent a synchronous move for t. For instance, let us consider transition a of the model in Fig. 1; synchronous move for a is associated with the following action: (:action moveSync-a :parameters (?e1 - event ?e2 - event) :precondition (and (token start) (tracePointer ?e1) (associated ?e1 a) (succ ?e1 ?e2)) :effect (and (not (token start)) (token p1) (token p2) (not (tracePointer ?e1)) (tracePointer ?e2))) The preconditions of the action are (i) that a is enabled (as defined in Section 2): each place p ∈ • a contains a token; (ii) e1 is the actual event of σL under analysis; (iii) e1 corresponds to the execution of a in σL ; (iv) e1 is succeeded by the event e2 in σL . The effect is such that the trace pointer moves from e1 to e2 and that the marking changes according to the firing rules: one token in place start is consumed and one token is produced in places p1 and p2. Model moves. A separate action exists for each transition t ∈ T . A model move focuses on making an enabled transition t fire without performing any move in σL . For instance, let us consider transition a of the model in Fig. 1; a model move for a is associated with the following action: (:action moveInTheModel-a :precondition (token start) :effect (and (not (token start)) (token p1) (token p2) (increase (total-cost) 1))) It is worthy observing how the execution of a model move for a makes total cost (of the alignment) increases of a value equal to 1, since a is a visible transition4 . The effects are accordant to the firing rules: The token in place start is consumed and one token is produced in places p1 and p2. The difference with synchronous moves is that the value of any predicate tracePointer does not change. For what concerns invisible transitions, readers should observe that since the ex- ecutions of invisible transitions are never recorded in event logs, invisible transitions are always involved in move in models, only. However, these model moves are actu- ally not deviations and, hence, the action moveInTheModel does not have (increase (total-cost) 1) among the effects. E.g., for transition τ in the model in Fig. 1, action moveInTheModel-tau will not have the total-cost increase as an effect. Log moves. All log moves can be represented by a single generic action, which is independent of N and σL : 4 For the sake of simplicity, we are assuming a unitary cost to express the severity of a move in model and/or in log. 10% noise 20% noise 30% noise Petri Net Existing Fast SymBA-2* SPM&S Existing Fast SymBA-2* SPM&S Existing Fast SymBA-2* SPM&S Size Approach Downward Approach Downward Approach Downward 25 25.85 84.79 685.35 812.44 42.8 86.21 692.97 814.04 56.87 88.87 699.42 816.05 36 11.71 94.13 767.05 950.25 19.24 92.78 785.14 956.19 28.6 92.67 793.13 959.25 68 112.85 130.68 1,413.85 1,722.13 164.92 134.57 1,418.51 1,779.17 278.31 144.99 1,419.5 1,808.61 95 77.56 123.65 2,048.84 2,431.35 119.89 125.78 2,049.14 2,445.66 168.32 126.82 2,053.92 2,483.98 115 638.52 205.47 2,289.84 2,787.31 966.34 384.26 2,291.91 2,817.29 1,987.12 716.68 2,352.81 2,868.77 136 304.62 178.66 2,752.28 3,282.23 465.97 182.88 2,765.14 3,300.87 578.63 187.17 2,775.46 3,317.48 175 – 18,759.91 5,909.46 6,546.87 – 84,195.41 6,092.93 6,837.57 – 1.71 · 105 6,259.99 7,182.13 263 – 2,343.92 11,960.84 13,289.56 – 13,524.63 12,044.67 13,744.92 – 30,582.11 12,194.09 14,075.09 Table 1. Comparison of the experimental results with different planners and with the existing approach of [2], using different models while varying the amount of noise. The values refers to the average time (in ms.) to compute the alignment of a log trace and the respective process model. (:action moveInTheLog :parameters (?e1 - event ?e2 - event) :precondition (and (tracePointer ?e1) (succ ?e1 ?e2)) :effect (and (not (tracePointer ?e1))(tracePointer ?e2) (increase (total-cost) 1 ))) A log move for any event ei ∈ σL is represented as (moveInTheLog ?e1 ?e2) where e1 = ei and e2 = ei+1 . The preconditions are that e2 follows e1, and that e1 is the actual event under analysis. The effect is to move the trace pointer from e1 to e2 and to increase the total cost of the alignment of a value equal to 1. 5 Evaluation and Conclusion The planning-based alignment tool is publicly available as Java application. Implemen- tation details and information about how to download it are available in [4]. To evaluate the versatility of approach to seamlessly integrate different planners, we employed three different planners: Fast Downward5 , SymBA-2* and SPM&S6 . These three planning systems were employed with searching algorithms that guarantee the optimality of the solution. Therefore, the returned alignments are always guaranteed to be optimal. To evaluate the scalability and feasibility of the approaches, we tested our imple- mentation with several combinations of real-life and synthetic event logs and processes. We refer to [4] for a thorough discussions of the experimental test bed and of the results. Here, for the sake of space, we focus on the results on the synthetic process models and event logs. Specifically, we artificially generated different process models of increas- ing sizes in form of Petri nets to evaluate the scalability of the approach and, for each Petri net, we generated artificial event logs with increasing amount of noise. Established techniques and tools exist for this purpose (see details in [4]). The entire set of artificial models is attached as appendix in [4]. In the remainder, an event log with X% of noise indicates that, starting from an event log where each trace is conforming, an event is swapped with the next event in the trace with a probability of X%. This swapping generally causes the trace to be no more conforming. Note also that multiple events may be swapped in a trace and, also, the same event can be swapped with consecutive events, moving it further apart from the original position in the trace. 5 http://www.fast-downward.org/ 6 SymBA-2* and SPM&S can be downloaded at: https://helios.hud.ac.uk/ scommv/IPC-14/planners_actual.html The experiments with these artificial process models are summarized in Table 1. Each row refers to a different Petri net. For each row, the first column indicates the size of the Petri net in term of number of transitions; the subsequent 12 columns refer to the experimental results when the corresponding event log contain 10%, 20% and 30% noise. For each amount of noise, we compared the result of the existing ad-hoc approach by Adriansyah et al. [2] with the results obtained by our approach when integrated with the three planners mentioned above. The values refer to the average time to alignment and event-log trace with the re- spective process model. The results show that the existing ad-hoc approach is faster with models of small sizes and/or when using event logs with smaller amount of noise. Con- versely, our approach is faster with larger models. In particular, it seems that, generally speaking, Fast Downward performs better with middle-size models whereas SymBA-2* works better with very large models. To conclude, the existing approach by Adriansyah et al. [2] is confirmed not to sufficiently scale when event logs and process models become very large. As a matter of fact, when testing with the two largest models, the existing approach was not able to align every event-log trace with the corresponding model and was running out of memory, even though we were using a machine with 16 GBs of RAM. Conversely, tracking the memory consumption, all of three employed planners were never using more than 4 GBs of RAM. Even when the existing approach was able to carry out the alignment tasks, it could do it significantly slower. As an example, compare the results for the model with 115 activities and an event log with 30%: On average, the existing approach requires 1987 ms per trace, versus 716 ms for Fast Downward. A saving of 1.2 seconds per trace means that, to align all traces of an event log with, e.g., 100000 traces (not uncommon in real-world case studies), the use of Fast Downward would allow us to save 120000 seconds: 83.3 hours. References 1. van der Aalst, W.M.P.: Data Science in Action. Springer Berlin Heidelberg, Berlin, Heidelberg (2016) 2. Adriansyah, A., Sidorova, N., van Dongen, B.F.: Cost-based fitness in conformance checking. In: Proc. of Int. Conf. on Application of Concurrency to System Design, IEEE (2011) 57–66 3. Van Der Aalst, W., et al.: Process Mining Manifesto. In: Proc. of the 9th Int. Conf. on Business Process Management (BPM 2011), Springer Berlin Heidelberg (2011) 169–194 4. de Leoni, M., Marrella, A.: Aligning Real Process Executions and Prescriptive Process through Automated Planning. Expert System with Applications 82 (2017) 162 – 183 5. van der Aalst, W.M.P.: The application of petri nets to workflow management. Journal of Circuits, Systems, and Computers 8(1) (1998) 21–66 6. Ghallab, M., Nau, D.S., Traverso, P.: Automated Planning: Theory and Practice. Morgan Kaufmann (2004) 7. McDermott, D., Ghallab, M., Howe, A., Knoblock, C.A., Ram, A., Veloso, M., Weld, D.S., Wilkins, D.E.: PDDL—The Planning Domain Definition Language. Technical Report DCS TR-1165, Yale Center for Computational Vision and Control, New Haven, Connecticut (1998)