Compositional Discovery of Workflow Nets from Event Logs Using Morphisms? Luca Bernardinello2 , Irina Lomazova1 , Roman Nesterov1,2 , and Lucia Pomello2 1 National Research University Higher School of Economics, 20 Myasnitskaya Ulitsa, 101000 Moscow, Russia 2 Dipartimento di Informatica, Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Viale Sarca 336 - Edificio U14, I-20126 Milano, Italia Abstract. This paper presents a modular approach to discover process models for multi-agent systems from event logs. System event logs are filtered according to individual agent behavior. We discover workflow nets for each agent using existing process discovery algorithms. We con- sider asynchronous interactions among agents. Given a specification of an interaction protocol, we propose a general scheme of workflow net composition. By using morphisms, we prove that this composition pre- serves soundness of components. A quality evaluation shows the increase in the precision of models discovered by the proposed approach. Keywords: Petri nets, workflow nets, multi-agent systems, morphisms, composition, process discovery 1 Introduction Process discovery focuses on the synthesis of process models from event logs con- taining the observed record of an information system behavior. Process models are usually developed at the design stage of the information system life-cycle. However, the real observed behavior of the information system can eventually differ from the designed one. In some cases, designers cannot develop precise models describing all possible scenarios. That is why, process discovery is a topic of great interest at the moment. Many process discovery algorithms have been proposed over recent years. They include Genetic algorithms, HeuristicMiner, Fuzzy miner, Inductive miner, the algorithms based on integer linear programming (ILP), and on the theory of regions (see [3] for a comprehensive review). They can be applied to solve typical problems of event logs, e.g. incompleteness and noise. Within multi-agent systems (MAS), models obtained by existing process dis- covery tools can be incomprehensible since concurrent interacting agents pro- duce rather sophisticated behavior. Process models of MAS, obtained by the ? This work is supported by MIUR, Basic Research Program at the National Research University Higher School of Economics, and Russian Foundation for Basic Research, project No.16-01-00546. 23 algorithms mentioned above, are not structured in such a way that it is possible to identify agent models as components as shown by the following example. In order to give an intuition behind our approach, consider the system model shown in Fig. 1(a). Two interacting agents can be clearly identified as well as the way they interact by looking at the small grey places and arcs. Now take the event log produced by this system as the input to a process discovery algorithm. If we try to discover a model from this log directly, we can obtain, for instance, the results shown in Fig. 1(b) by inductive miner and in Fig. 1(c) by ILP miner. Although equivalent to the original one in their ability to reproduce the same event sequences, their structure hides the fact the original system is made of two agents, communicating through channels. More technically, the two agents cor- respond to two S-invariants in the original model, while they are not “separable” by means of S-invariants in the discovered models. It is true that we can improve the overall structure of models by configuring algorithm parameters. However, they still will not reflect the underlying MAS organization. (a) an intitial system model (b) inductive miner (c) ILP miner Fig. 1. Discovering process models of MAS from event logs We propose a compositional approach to address the problem of discovering process models of MAS from event logs clearly expressing agents as compo- nents and their interactions. We assume that agents interact asynchronously 24 via message passing. System event logs are projected on each agent to discover component models in terms of workflow nets by using existing process discovery algorithms. By means of morphisms, we can construct abstract models of the agents and compose them by adding channels for message passing. This com- position models the protocol of agent interaction. In this way, it is possible to check soundness of this simplified model and to compose the discovered agent models on the basis of this protocol. We prove that this composition preserves soundness of the protocol and components by construction. In this paper, we start constructing a formal background to the general compositional approach to discovering process models of multi-agent systems from event logs. The problem of discovering structured models from logs is not new and has been studied in several works based on composition. In [18] the authors have designed a technique to discover readable models by decomposing transition systems. A special method to deal with process cancellations has been studied in [17]. Regular process behavior is composed with cancellations using reset arcs. A rather large amount of literature has been devoted to Petri net compo- sition (see, for example, [9, 13, 25]). In particular, regarding asynchronous com- munication, several approaches have been proposed. Among the others, in [14] asynchronous composition of Petri nets through channels has been studied con- sidering preservation of various channel properties. A general approach to asyn- chronous composition has also been discussed in [4], where open nets have been introduced. The core problem of composition lies in preserving component prop- erties. Open nets have been used by many researches especially focused on mod- eling composite services (see, for example, [15, 16]). They have proposed both structural and behavioral techniques assuring the correctness of composition. Using morphisms is another possible way to achieve inheritance of component properties. Composition of Petri nets via morphisms has been studied in several works [4, 5, 8, 12, 22, 23, 27]. We will use α-morphisms and the related composition presented in [7]. In general, α-morphisms allow one to preserve/reflect properties checking structural and only local behavioral constraints. In the particular case considered here, α-morphisms preserve/reflect reachable markings and transition firings as well as preserve soundness as it will be shown in Section 3. From a more practical point of view, many researchers have considered work- flow net composition. Workflow nets (WF-nets) form a class of Petri nets used to model processes and services. Among the others, composition of WF-nets via shared resources has been studied in [21, 26] with a concern about soundness preservation. In addition, other approaches have been used for modeling and composing interacting workflow nets [20, 24]. Many works have concerned the web service composition (see the survey [11]), where the authors have stressed that there is a lack of execution engines based on Petri nets. The paper is organized as follows. Section 2 gives preliminary definitions used in this paper as well as composition by channels and α-morphisms. Section 3 discusses properties preserved and reflected by α-morphisms, relevant to WF- nets. Section 4 introduces a modular approach to construct models of MAS from 25 event logs, using the same illustrative example shown in Fig. 1. In Section 5, we summarize the paper by discussing results and possible continuations. Detailed proofs of some propositions stated in Section 3 are given in the extended version of this paper [6], whereas here we provide sketches of proofs. 2 Preliminaries N denotes the set of non-negative integers. A+ denotes the set of all finite non- empty sequences over A, and A∗ = A+ ∪ {},  is the empty sequence. A multiset m over a set S is a function m : S → N. Let m1 , m2 be two multisets over S. Then m1 ⊆ m2 ⇔ m1 (s) ≤ m2 (s) for all s ∈ S. Also, m0 = m1 + m2 ⇔ m0 (s) = m1 (s) + m2 (s), m00 = m1 − m2 ⇔ m00 (s) = max(m1 (s) − m2 (s), 0) for all s ∈ S. A Petri net is a triple N = (P, T, F ), where P and T are two disjoint sets of places and transitions, i.e. P ∩ T = ∅, and F ⊆ (P × T ) ∪ (T × P ) is a flow relation, where dom(F )∪cod(F ) = P ∪T . We consider nets, s.t. ∀t ∈ T : |• t| ≥ 1 and |t• | ≥ 1. Let N = (P, T, F ) be a Petri net, and X = P ∪ T . The set • x = {y ∈ X|(y, x) ∈ F } denotes the preset of x ∈ X. The set x• = {y ∈ X|(x, y) ∈ F } • • • • denotes the postset of x ∈ X. The S set •x =• x ∪ Sx denotes the neighborhood • of x ∈ X. Let A ⊆ X, then A = x∈A x, A = x∈A x• , • A• = • A ∪ A• . By N (A) we denote a subnet of N generated by A, i.e. N (A) = (P ∩ A, T ∩ A, F ∩ (A × A)). Let N (A) be a subnet of N generated by A ⊆ X. The set N (A) = {y ∈ A|∃z ∈ X \A : (z, y) ∈ F or • y = ∅} denotes the input elements. The set N (A) = {y ∈ A|∃z ∈ X \ A : (y, z) ∈ F or y • = ∅} denotes the output elements. A marking (state) of a Petri net N = (P, T, F ) is a function m : P → N. It is a multiset over a set of places P . A marked Petri net (N, m0 ) is a Petri net together with its initial marking m0 . A marking m enables a transition t ∈ T , denoted m[ti, if • t ⊆ m. The firing of t at m leads to a new marking m0 = m − • t + t• , denoted m[tim0 . A sequence w ∈ T ∗ is a firing sequence of N = (P, T, F, m0 ) if w = t1 t2 . . . tn and m0 [t1 im1 [t2 i . . . mn−1 [tn imn . Then we can write m0 [wimn . The set of all firing sequences of N is denoted by F S(N ). A marking m of N = (P, T, F, m0 ) is reachable if ∃w ∈ F S(N ) : m0 [wim. The set of all markings of N reachable from m is denoted by [mi. A reachable marking is dead if it does not enable any transition. N is deadlock-free if no reachable marking is dead. N is safe if ∀p ∈ P ∀m ∈ [m0 i : m(p) ≤ 1. Then we will specify reachable markings as subsets of places. A state machine is a connected Petri net N = (P, T, F ), s.t. ∀t ∈ T : |• t| = • |t | = 1. A subnet of a marked Petri net N = (P, T, F, m0 ) identified by a subset of places A ⊆ P and its neighborhood, i.e. N (A∪ • A• ), is a sequential component of N if it is a state machine and has a single token in the initial marking. N is covered by sequential components if every place of N belongs to at least one sequential component. Then N is said to be state machine decomposable (SMD). Semantics of a marked Petri net is given by its unfolding as defined below. 26 Let N = (P, T, F ) be a Petri net, and F ∗ be the reflexive transitive closure of F. Then ∀x, y ∈ P ∪ T : (a) x and y are in causal relation, denoted x < y, if (x, y) ∈ F ∗ ; (b) x and y are in conflict relation, denoted x#y, if ∃tx , ty ∈ T , s.t. tx 6= ty , • tx ∩ • ty 6= ∅, and tx < x, ty < y. Definition 1. A Petri net O = (B, E, F ) is an occurrence net if: 1. ∀b ∈ B : |• b| ≤ 1. 2. F ∗ is a partial order. 3. ∀x ∈ B ∪ E : {y ∈ B ∪ E|y < x} is finite. 4. ∀x, y ∈ B ∪ E : x#y ⇒ x 6= y. By definition, O is acyclic. M in(O) denotes the set of minimal nodes of O w.r.t. F ∗ , i.e. the elements having the empty preset. We only consider nets having transitions with non-empty presets and postsets, then M in(O) ⊆ B. Definition 2. Let N = (P, T, F, m0 ) be a marked safe Petri net, O = (B, E, F ) be an occurrence net, and π : B ∪ E → P ∪ T be a map. (O, π) is a branching process of N if: 1. π(B) ⊆ P and π(E) ⊆ T . 2. π restricted to Min(O) is a bijection between Min(O) and m0 . 3. ∀t ∈ T : π restricted to • t is a bijection between • t and • π(t), and similarly • for t• and π(t) . 4. ∀t1 , t2 ∈ T : if • t1 = • t2 and π(t1 ) = π(t2 ), then t1 = t2 . The unfolding of N , denoted U(N ), is the maximal branching process of N , s.t. any other branching process of N is isomorphic to a subnet of U(N ) with the map π restricted to the elements of this subnet. The map associated with the unfolding is denoted u and called folding. Workflow nets form a subclass of Petri nets used for process modeling. We define generalized workflow nets having an initial state m0 and a final state mf . Definition 3. A marked Petri net N = (P, T, F, m0 , mf ) is a generalized work- flow net (GWF-net) if and only if: 1. m0 = {s ∈ P | • s = ∅} and m0 6= ∅. 2. mf = {f ∈ P | f • = ∅} and mf 6= ∅. 3. ∀x ∈ P ∪ T ∃s ∈ m0 ∃f ∈ mf : (s, x) ∈ F ∗ and (x, f ) ∈ F ∗ . If |m0 | = |mf | = 1, then a generalized workflow net is called just a workflow net (WF-net, for short). State machine decomposable GWF-nets are safe. The important correctness property of GWF-nets is soundness [2]. Definition 4. A GWF-net N = (P, T, F, m0 , mf ) is sound if and only if: 1. ∀m ∈ [m0 i : mf ∈ [mi. 2. ∀m ∈ [m0 i : mf ⊆ m ⇒ m = mf . 3. ∀t ∈ T ∃m ∈ [m0 i : m[ti. 27 We consider composition of two GWF-nets by adding channels. This ap- proach is similar to the one presented in [14]. A set of channel places (channels, for short) is denoted by Pc . They model asynchronous communication via mes- sage passing. Some transitions of two nets can either send a message by an incoming arc to a channel or receive a message by an outgoing arc from a chan- nel. We assume to know exactly which transitions send/receive messages to/from which channel places. In order to simplify the notation, we will not introduce special transition labels indicating sending/receiving. Two GWF-nets can be composed via a set of channels Pc iff any channel receiving a message from one GWF-net send it only to the other GWF-net. Definition 5. Let Ni = (Pi , Ti , Fi , mi0 , mif ) be a GWF-net for i = 1, 2, s.t. N1 and N2 are disjoint, where P1 ∩ P2 = ∅ and T1 ∩ T2 = ∅. Let Pc be a set of channels. A channel-composition of N1 and N2 , denoted N1 ⊕Pc N2 , is a Petri net N = (P, T, F, m0 , mf ), where: 1. P = P1 ∪ P2 ∪ Pc , where Pc ∩ (P1 ∪ P2 ) = ∅. 2. T = T1 ∪ T2 . 3. F = F1 ∪ F2 ∪ Fc , where Fc ⊆ (Pc × (T1 ∪ T2 )) ∪ ((T1 ∪ T2 ) × Pc ). 4. m0 = m10 ∪ m20 and mf = m1f ∪ m2f . 5. ∀p ∈ Pc : (a) (• p ⊆ T1 or • p ⊆ T2 ) and (p• ⊆ T1 or p• ⊆ T2 ), (b) • p ⊆ Ti ⇔ p• ⊆ T(i+1) mod 2 , (c) • p 6= ∅ and p• 6= ∅. Remark 1. The operation ⊕Pc is commutative, i.e. N1 ⊕Pc N2 = N2 ⊕Pc N1 . By the following proposition, the class of GWF-nets is closed under the channel-composition. Figure 2(a) provides an example of channel-composition of N1 and N2 , where channels are indicated by small gray places. Proposition 1. If N1 and N2 are GWF-nets, then N1 ⊕Pc N2 is a GWF-net. Proof. We show that each channel lies on a path from a place in mi0 to a place in mjf , where j = (i + 1) mod 2 and i = 1, 2. Take p ∈ Pc . By Def. 5.5, • p ⊆ Ti ⇔ p• ⊆ Tj . Take ti ∈ • p and tj ∈ p• . By Def. 3.3, ∃s ∈ mi0 , s.t. (s, ti ) ∈ Fi∗ . Then (s, p) ∈ F ∗ . By Def. 3.3, ∃f ∈ mjf , s.t. (tj , f ) ∈ Fj∗ . Then (p, f ) ∈ F ∗ . t u Further, we recall the definition of α-morphisms (see Definition 6 and Defi- nition 7 in [7]) supporting abstraction and refinement for Petri nets. An exam- ple of α-morphism is shown in Fig. 2(b), where the refinement is given by the shaded ovals and by the transition labeling explicitly. Refinement can also require splitting transitions of the abstract model. After giving the formal definition of α-morphisms, we will provide an intuition behind them. Definition 6. Let Ni = (Pi , Ti , Fi , mi0 ) be a marked SMD safe Petri net, Xi = Pi ∪Ti , i = 1, 2. An α-morphism from N1 to N2 is a total surjective map ϕ : X1 → X2 , also denoted ϕ : N1 → N2 , such that: 28 1. ϕ(P1 ) = P2 . 2. ϕ(m10 ) = m20 . • 3. ∀t ∈ T1 : if ϕ(t) ∈ T2 , then ϕ(• t) = • ϕ(t) and ϕ(t• ) = ϕ(t) . • • 4. ∀t ∈ T1 : if ϕ(t) ∈ P2 , then ϕ( t ) = {ϕ(t)}. 5. ∀p2 ∈ P2 : (a) N1 (ϕ−1 (p2 )) is an acyclic net; (b) ∀p ∈ N1 (ϕ−1 (p2 )) : ϕ(• p) ⊆ • p2 , and if • p2 6= ∅, then • p 6= ∅; (c) ∀p ∈ N1 (ϕ−1 (p2 )) : ϕ(p• ) = p2 • ; (d) ∀p ∈ P1 ∩ ϕ−1 (p2 ) : p ∈ / N1 (ϕ−1 (p2 )) ⇒ ϕ(• p) = p2 and p∈ / N1 (ϕ−1 (p2 )) ⇒ ϕ(p• ) = p2 ; (e) ∀p ∈ P1 ∩ ϕ−1 (p2 ) : there is a sequential component N 0 = (P 0 , T 0 , F 0 ) of N1 , s.t. p ∈ P 0 and ϕ−1 (• p2 • ) ⊆ T 0 . (a) N1 ⊕Pc N2 , Pc = {x, y, z} (b) α-morphism ϕ : N20 → N2 Fig. 2. Introductory examples By definition, α-morphisms allow us to refine places by replacing them with subnets. Thus, if a transition is mapped on a place, then its neighborhood should be mapped on the same place (4). If a transition is mapped on a transition, it should have the corresponding neighborhood (3). Indeed, α-morphisms are motivated by the attempt to ensure that properties of an abstract model hold in its refinement. Each output place of a subnet should have the same choices as its abstraction does (5c). However, input places do not need this constraint (5b), because the choice between them is done before, since there are no concurrent events in the neighborhood of the subnet (5e). Moreover, 5d guarantees that presets and postsets of places, internal to the subnet, are mapped to the same place as the subnet. Conditions 5a-5e together ensure the intuition behind α-morphisms. If a subnet of N1 refines a place in N2 , then it behaves “in the same way”. More precisely, by Lemma 1 of [7], (a) no input transition of the subnet is enabled whenever a token is inside the subnet; and (b) firing an output transition of the subnet empties it. 29 3 Properties Preserved and Reflected by α-Morphisms In this section, we study properties preserved and reflected by α-morphisms. In [7] several properties of α-morphisms have already been studied. Here we will mention some of them and consider properties of α-morphisms for GWF-nets. In the following propositions, we assume Ni = (Pi , Ti , Fi , mi0 ) to be a marked SMD safe Petri net for i = 1, 2, s.t. there is an α-morphism ϕ : X1 → X2 , where Xi = Pi ∪ Ti . To begin with, α-morphisms preserve the structure of GWF-nets. Proposition 2. If N1 is a GWF-net, then N2 is a GWF-net. Proof. We prove that N2 satisfies the three structural requirements of Def. 3. The conditions 1 and 2 on the initial and final state are assured by conditions 5b and 5c of α-morphisms (see Def. 6) respectively. The preservation of connectedness, i.e. condition 3, is proven by using the fact that α-morphisms are surjective maps preserving neighborhood of nodes (conditions 3 and 4 of Def. 6). t u Remark 2. It follows from Proposition 2 that ϕ(m1f ) = m2f . In the general case the converse of Proposition 2 is not true. In fact, α-morphisms do not reflect the initial state of GWF-nets properly (see Fig. 3(a)). (a) (b) Fig. 3. An α-morphism: two examples Recall from [7] that N1 is well marked w.r.t. ϕ if each input place of a subnet in N1 , refining a marked place in N2 , is marked. Consider the α-morphism shown in Fig. 3(a), the token of the shaded subnet must be placed into p to make N1 well marked w.r.t. to ϕ. The structure of GWF-nets is reflected under the well- markedness of N1 (see the following Proposition). However, if N1 is a GWF-net related to N2 by an α-morphism, then N1 is well marked w.r.t. ϕ. Proposition 3. If N2 is a GWF-net and N1 is well marked w.r.t. ϕ, then N1 is a GWF-net. 30 Proof. We prove that N1 satisfies the three structural conditions of Def. 3. The condition 1 on the initial state is given by the fact that N1 is well marked w.r.t. ϕ. The condition 2 on the final state is proven by contradiction using condition 4 of α-morphisms (see Def. 6). The connectedness of N1 is proven by contrandiction as well. If a node of N1 is not connected to at least one initial place, then there are two cases. In the first case, there should be a backward path to a place p, s.t. • p = ∅ and p ∈ / m10 which contradicts the well-markedness of N1 (a forward • path to place p , s.t. p0 = ∅ and by Def. 3.2, p0 ∈ m1f ). In the second case, 0 we are in a cycle in N1 . By Def. 6.5a, we deduce that there is a corresponding cycle in N2 . Knowing that N2 is connected (by Def. 3.3) and that α-morphisms preserve neighborhood of nodes, we get a contradiction in N1 , s.t. at least one node in this cycle is connected to p, s.t. p ∈ m10 (to p0 , s.t. p0 ∈ m1f ). t u We recall in the following proposition that α-morphisms preserve reachable markings and firing of transitions. Proposition 4 ([7]). Let m1 ∈ [m10 i. Then ϕ(m1 ) ∈ [m20 i. If m1 [tim01 , where t ∈ T1 , then: 1. ϕ(t) ∈ T2 ⇒ ϕ(m1 )[ϕ(t)iϕ(m01 ). 2. ϕ(t) ∈ P2 ⇒ ϕ(m1 ) = ϕ(m01 ). Remark 3 ([7]). In the general case α-morphisms do not reflect reachable mark- ings, i.e. m2 ∈ [m20 i and m2 [t2 i do not imply that there exists m1 = ϕ−1 (m2 ) ∈ [m10 i, s.t. ∀t ∈ ϕ−1 (t2 ) : m1 [ti. To reflect reachable markings, we need to check local conditions as shown in [7]. For any place p2 ∈ P2 , refined by a subnet of N1 , we construct a “local” net, denoted S2 (p2 ), of N2 by taking the neighborhood transitions of p2 plus an artificial input and output place if needed. At the same time, we construct the corresponding “local” net, denoted S1 (p2 ), of N1 by taking the subnet of N1 refining p2 , i.e. N1 (ϕ−1 (p2 )), and the transitions ϕ−1 (• p2 ), ϕ−1 (p2 • ) plus an artificial input and output place if needed. The details are given in [7] (see Definition 9 there). There is an α-morphism ϕS from S1 (p2 ) to S2 (p2 ) which is a restriction of ϕ on the places and transitions of S1 (p2 ). In the following Lemma, taking the unfolding of S1 (p2 ), we assure that the final marking of the subnet enables the same set of transitions which are enabled by its image. By Definition 6.5a, since S1 (p2 ) is acyclic, its unfolding is finite. Lemma 1. Let U(S1 (p2 )) be the unfolding of S1 (p2 ) with the folding function u, and ϕS be an α-morphism from S1 (p2 ) to S2 (p2 ). Let N1 be a sound GWF-net. Then, the map from U(S1 (p2 )) to S2 (p2 ) obtained as ϕS ◦ u is an α-morphism. Proof. Since N1 is a GWF-net, S1 (p2 ) is a GWF-net. By [7] (see Lemma 1 there), when a transition in ϕ−1 (p2 • ) fires, it empties the subnet N1 (ϕ−1 (p2 )). Then S1 (p2 ) is sound, and by Def. 4.3 ∀t ∈ T1 ∃m ∈ [m10 i : m[ti. So, each transition of S1 (p2 ) will occur at least once. Then, the folding u is a surjective function from U(S1 (p2 )) to S1 (p2 ), and ϕS ◦ u is an α-morphism from U(S1 (p2 )) to S2 (p2 ). t u 31 As for the main results, we obtain that under the assumption of soundness of the GWF-net N1 , α-morphisms not only preserve, but also reflect reachable markings and transition firings (see Proposition 5). Moreover, α-morphisms pre- serve soundness as shown in Proposition 6. Proposition 5. If N1 is a sound GWF-net, then ∀m2 ∈ [m20 i ∃m1 ∈ [m10 i : ϕ(m1 ) = m2 , and if ∃t2 ∈ T2 : m2 [t2 i, then ∀t1 ∈ ϕ−1 (t2 ) : m1 [t1 i. Proof. Follows from Lemma 1 and from [7] (see Proposition 5 there). Proposition 6. If N1 is a sound GWF-net, then N2 is a sound GWF-net. Proof. We prove that N2 satisfies the three behavioral requirements of Def. 4. The conditions 1 and 2 are assured by knowing the fact that, under the assump- tion of soundness of N1 , α-morphisms not only preserve reachable markings, but also reflect them (by using Prop. 4 and Prop. 5). The condition 3 is proven by the surjectivity of α-morphisms together with marking preservation. t u Remark 4. In the general case the converse of Proposition 6 is not true. Consider the example shown in Fig. 3(b), where N2 is sound and N1 is not sound, since transitions y1 and y2 are dead. Thus, α-morphisms do not reflect soundness. Note that reachable markings are also not reflected in this example. However, in the next section we will provide conditions under which soundness is reflected. 4 From Event Logs to Structured and Sound Models of Multi-Agent Systems In this section, we present our approach to process discovery by composing individual agent models through α-morphisms. An event log L is a finite multiset of finite non-empty sequences (traces) over a set of observable actions A. We assume to have an event log L of two interacting agents. For instance, we will further work with the same event log obtained from the MAS shown in Fig. 1(a) which we have used in Section 1. We assume to know what actions are executed by which agent, A = A1 ∪A2 , s.t. A1 ∩A2 = ∅. Also, we assume to know actions corresponding to their asynchronous “message-passing” interaction. Instead of discovering the model directly from L, we propose to filter the log according to the agent actions A1 and A2 producing two new logs L1 and L2 . Traces of L1 and L2 contain actions done only by a corresponding agent. By using, for example, inductive miner [19], from L1 and L2 we obtain two GWF- nets N10 and N20 modeling the two agents. By construction, N10 and N20 are well- structured and sound, which implies that they are state machine decomposable (see Corollary 4 in [1]). Well-structured models are recursively built from blocks representing basic control flow constructs, e.g. choice, concurrency or cycle. It is possible to compose N10 and N20 using the channel-composition as in Definition 5 obtaining a new GWF-net N10 ⊕Pc N20 , where channels are defined according to the specification of agent interaction we have assumed to know. 32 However, it is obvious that N10 ⊕Pc N20 might not be sound. In order to avoid the verification of N10 ⊕Pc N20 , we apply the following approach to get its soundness by construction. We can abstract the discovered nets N10 and N20 w.r.t. interacting actions, thus producing two GWF-nets N1 and N2 , s.t. there is an α-morphism ϕi : Ni0 → Ni for i = 1, 2. According to Proposition 6, Ni is sound. The abstract models can also be composed via the same channels obtaining N1 ⊕Pc N2 . This abstract model represents the interaction protocol between the agents. Obviously, N1 ⊕Pc N2 is less complex than N10 ⊕Pc N20 , and its soundness can be much easier verified. Given N1 ⊕Pc N2 and the two α-morphisms ϕi : Ni0 → Ni for i = 1, 2, we can construct two new GWF-nets: N10 ⊕Pc N2 and N1 ⊕Pc N20 , which actually represent different abstractions of the same MAS. This construction is formally defined in the following definition. We also show in Remark 5 that there is an α-morphism from N10 ⊕Pc N2 (by symmetry, from N1 ⊕Pc N20 ) towards N1 ⊕Pc N2 . Definition 7. Let Ni = (Pi , Ti , Fi , mi0 , mif ) be a GWF-net for i = 1, 2, and Pc be a set of channels. Let N1 ⊕Pc N2 = (P, T, F, m0 , mf ) be a channel-composition of N1 and N2 . Let N10 = (P10 , T10 , F10 , m001 , m0f 1 ) be a GWF-net, s.t. there is an α-morphism ϕ1 : N10 → N1 . Construct N10 ⊕Pc N2 = (P 0 , T 0 , F 0 , m00 , m0f ), where: 1. P 0 = P10 ∪ P2 ∪ Pc . 2. T 0 = T10 ∪ T2 . 3. F 0 = F10 ∪ FNc 2 ∪ FNc 0 , where: 1 (a) FNc 2 = F ∩ [((Pc ∪ P2 ) × T2 ) ∪ (T2 × (Pc ∪ P2 ))]; (b) FNc 0 ⊆ (Pc × T10 ) ∪ (T10 × Pc ). 1 4. m00 = m001 ∪ m20 and m0f = m0f 1 ∪ m2f . 5. ∀p ∈ Pc , ∀t ∈ T1 : ((t, p) ∈ F ⇒ ϕ−1 (t) × {p} ⊆ FNc 0 ) and ((p, t) ∈ F ⇒ {p} × ϕ−1 (t) ⊆ FNc 0 ). 1 1 Remark 5. Let N1 , N10 and N2 be GWF-nets, and ϕ1 : N10 → N1 be an α- morphism. Then there is an α-morphism ϕ01 : N10 ⊕Pc N2 → N1 ⊕Pc N2 . In fact, by construction, ϕ01 is given by ϕ1 plus the identity mapping of places and tran- sitions of N2 together with the identity mapping of channel places. Example 1. Here we consider as N1 ⊕Pc N2 the GWF-net shown in Fig.2(a). We can refine it by models N10 or N20 , discovered from filtered event logs L1 and L2 as shown in Fig. 4. The α-morphisms between the discovered models and the abstract ones are indicated by the shaded ovals and the transition labeling. The α-morphism between N20 and N2 has also been shown in Fig. 2(b). As for the main result, we will prove in Proposition 7 that an α-morphism from N10 ⊕Pc N2 (by symmetry, from N1 ⊕Pc N20 ) to N1 ⊕Pc N2 reflects the soundness of N1 ⊕Pc N2 . To prove this fact, we will use the property of reachable markings of a channel-composition stated in the following Lemma. Lemma 2. Let Ni = (Pi , Ti , Fi , mi0 , mif ) a GWF-net for i = 1, 2, N1 ⊕Pc N2 = (P, T, F, m0 , mf ). Then, ∀m ∈ [m0 i : m = m1 ∪ mc ∪ m2 , where m1 ∈ [m10 i, m2 ∈ [m20 i, and mc ⊆ Pc . 33 Proof. By Def. 5.4, m0 = m10 ∪ m20 and mf = m1f ∪ m2f . Take m ∈ [m0 i, then ∃w ∈ F S(N1 ⊕Pc N2 ) : m0 [wim, where w ∈ T ∗ . By Def. 5.2, T = T1 ∪ T2 . Restricting w to T1 and T2 produces two firing sequences of N1 and N2 leading from m10 and m20 to the reachable markings m1 and m2 which constitute m. t u (a) N10 ⊕Pc N2 (b) N1 ⊕Pc N20 Fig. 4. Refining N1 ⊕Pc N2 with agent nets N10 and N20 discovered from filtered logs Proposition 7. Let N1 , N10 and N2 be sound GWF-nets, and ϕ1 : N10 → N1 be an α-morphism. If N1 ⊕Pc N2 is sound, then N10 ⊕Pc N2 is sound. Proof. By Rem. 5, there is an α-morphism ϕ01 from N10 ⊕Pc N2 to N1 ⊕Pc N2 . Assume N10 ⊕Pc N2 = (P 0 , T 0 , F 0 , m00 , m0f ) and N1 ⊕Pc N2 = (P, T, F, m0 , mf ). We prove that N10 ⊕Pc N2 satisfies the three conditions of soundness of Def. 4. 1. Take m0 ∈ [m00 i. By Lemma 2, m0 = m01 ∪ m2 ∪ mc . By Prop. 4 for ϕ01 , ϕ01 (m0 ) = m ∈ [m0 i. By Lemma 2, m = m1 ∪ m2 ∪ mc , where m2 and mc are the same as in m0 and ϕ1 (m01 ) = m1 (by Prop. 4 for ϕ1 ). Since N1 ⊕Pc N2 is sound, ∃w ∈ F S(N1 ⊕Pc N2 ) : m[wimf , where w ∈ T ∗ . It is possible to write w = w21 v, where v =  or v = t11 w22 t21 . . . with w2i ∈ T2∗ and ti1 ∈ T1 , s.t. i ≥ 1. Then w2i can be obviously executed on the component N2 of N10 ⊕Pc N2 as well, because ϕ01 reflects the connection to channels (by Def. 7.5 and Rem. 5). Since N10 is sound, ϕ1 reflects reachable markings and firing of transitions (by Prop. 5) between N10 and N1 . Thus, there is a reachable marking m01i in N10 , belonging to ϕ−1 i i i −1 i 1 (m1 ) in N1 , s.t. if m1 [t1 i in N1 , then any transition in ϕ1 (t1 ) is enabled at m1i in N1 . Moreover, these transitions are enabled in N1 ⊕Pc N2 , because ϕ01 0 0 0 reflects connection to channels (by Def. 7.5 and Rem. 5). Hence, the sequence w ∈ F S(N1 ⊕Pc N2 ) is reflected in N10 ⊕Pc N2 reaching its final marking m0f . 2. Suppose by contradiction ∃m0 ∈ [m00 i : m0 ⊇ m0f and m0 6= m0f . By Def. 7, m0f = m0f 1 ∪ m2f . Then m0 = m0f 1 ∪ m2f ∪ m3 . By Prop. 4 for ϕ01 , ϕ01 (m0 ) ∈ [m0 i. In this way, ϕ01 (m0 ) = ϕ01 (m0f 1 ) ∪ ϕ01 (m2f ) ∪ ϕ01 (m3 ) = ϕ1 (m0f 1 ) ∪ m2f ∪ m3 = m1f ∪ m2f ∪ m3 = mf ∪ m3 (by Rem. 5 and Def. 7.4). Thus, this marking strictly covers the final marking mf of N1 ⊕Pc N2 which contradicts its soundness. 34 3. We prove that ∀t0 ∈ T 0 ∃m0 ∈ [m00 i : m0 [t0 i. By Lemma 2, m0 = m01 ∪ m2 ∪ mc . By Def. 7.2, ∀t0 ∈ T 0 : t0 ∈ T10 or t0 ∈ T2 . If t0 ∈ T2 , then since N1 ⊕Pc N2 is sound, ∃m ∈ [m0 i, s.t. m[t0 i. By Rem. 5 and by Def. 7.5, m2 ∪ mc enables t0 . If t0 ∈ T10 , then there are two cases. If ϕ01 (t0 ) ∈ P , then t0 is not connected to channels. Since N10 is sound, m01 ⊆ m0 enables t0 . If ϕ01 (t0 ) ∈ T , then take t ∈ T , s.t. ϕ01 (t0 ) = t. Since N1 ⊕Pc N2 is sound, ∃m ∈ [m0 i : m[ti. By Prop. 5, ϕ1 reflects reachable markings and firings of transitions on N10 . Moreover, ϕ01 reflects connection to channels (by Def. 7.5 and Rem. 5). Then m01 ∪ mc enables t0 . t u Since there are two α-morphisms from N1 ⊕Pc N20 and N10 ⊕Pc N2 towards N1 ⊕Pc N2 , we can compose them by using the composition defined in [7] (see Definition 12 there) obtaining as a result N with other two α-morphisms from N towards N1 ⊕Pc N20 and N10 ⊕Pc N2 , s.t. the diagram shown in Fig. 5(b) commutes. Alternatively, by applying a similar construction as the one given in Defi- nition 7, we can refine N2 in N10 ⊕Pc N2 by N20 obtaining N10 ⊕Pc N20 which is isomorphic to the previously obtained composition N . Symmetrically, it is pos- sible to refine N1 in N1 ⊕Pc N20 by N10 coming to the same result. According to Remark 5, there are α-morphisms from N10 ⊕Pc N20 to N10 ⊕Pc N2 and to N1 ⊕Pc N20 . According to Proposition 7, N10 ⊕Pc N2 (by symmetry, N1 ⊕Pc N20 ) is sound, and then N10 ⊕Pc N20 is also sound. Thus, we have shown that it is possible to even simultaneously refine N1 and N2 in the sound abstract model N1 ⊕Pc N2 by sound models N10 and N20 obtained from filtered event logs, s.t. the result N10 ⊕Pc N20 is also sound. Example 2. In Fig. 5(a), we show the result of composing, by means of α- morphisms, N10 ⊕Pc N2 and N1 ⊕Pc N20 constructed in Example 1. This com- position can also be obtained directly by the channel-composition of N10 and N20 . The obtained model meets the desired requirement stated in Section 1: we can identify the agents as subnets and explicitly see their asynchronous interaction. N1 ⊕Pc N2 N10 ⊕Pc N2 N1 ⊕Pc N20 N (a) composition N isomorphic to N10 ⊕Pc N20 (b) diagram Fig. 5. Composition of N10 ⊕Pc N2 and N1 ⊕Pc N20 based on α-morphisms 35 4.1 Comparing Quality of Presented MAS Process Models Process models of MAS presented in Fig. 1(b) and in Fig. 1(c) are discovered directly from the log produced by simulating the model shown in Fig. 1(a). The composition shown in Fig. 5(a), in fact, is obtained from the same log. That is why, we have compared their quality using the standard process discovery metrics [10]. Fitness measures how accurately a model can replay traces of an initial event log. Intuitively, precision indicates a ratio between the behavior given by the log and the one allowed by the model. If it is low, then a model allows for too much additional behavior. Table 1 provides the result of this quality analysis. Table 1. Fitness and precision of MAS models presented in the paper Model Algorithm Discovery Fitness Precision Figure 1(b) Inductive miner Direct 1,0000 0,1732 Figure 1(c) ILP miner Direct 1,0000 0,8516 Figure 5(a) Inductive miner Composed 1,0000 0,8690 The result of the compositional process discovery shows the increase in preci- sion resulting from the separate analysis of agent behavior. However, we can see that the precision of the composed model is close to that of the model obtained by ILP miner. Thus, the model can be appropriate in terms of precision, but not in terms of the MAS structure. 5 Conclusion and Future Work In this paper, we have proposed a compositional approach to discover process models of multi-agent systems from event logs. We have considered asynchronous agent interactions which are modeled by a channel-composition of two nets. We assume that events can be partitioned, s.t. system event logs are filtered accord- ing to agent behavior to discover their detailed models. In order to guarantee that their composition is sound, we have proposed to abstract agent models, by using α-morphisms, w.r.t. interacting actions thus constructing an abstract model of a communication protocol. The general algorithm of constructing α- morphisms is one of the open problems which is a subject for further research. We have proven that, when this abstract protocol is sound, the direct composi- tion of the detailed models is also sound. In proving this fact, we have used two intermediate models corresponding to a detailed view of one agent composed with the abstract view of the other. The obtained system model is structured in such a way that it is possible to identify agent models as components and their interactions are clearly expressed. We have compared the quality of directly obtained models with the quality of models obtained by the proposed approach. The quality of a composed model is seen to be at the appropriate level in terms of fitness and precision being the 36 general process discovery quality dimensions. In the future, we suggest intro- ducing a structural indicator showing the extent to which it is easy to identify agents as parts of the model as well as their interaction. We plan to explore more general asynchronous interactions, e.g. when the same channel can be used by several agents as well as the possibility to have both asynchronous and synchronous communication. Moreover, we would like to generalize the proposed approach to other classes of nets. Another line of research will be focused on dealing with dynamic changes. In particular, intermediate models, mentioned above, can be used to check if changes in the behavior of an agent affect the soundness of the whole system. References 1. van der Aalst, W.M.P.: Workflow verification: Finding control-flow errors using petri-net-based techniques. In: Business Process Management: Models, Techniques, and Empirical Studies. LNCS, vol. 1806, pp. 161–183. Springer, Heidelberg (2000) 2. van der Aalst, W.M.P., van Hee, K.M., ter Hofstede, A.H.M., Sidorova, N., Verbeek, H.M.W., Voorhoeve, M., Wynn, M.T.: Soundness of workflow nets: classification, decidability, and analysis. Formal Aspects of Computing 23(3), 333–363 (2011) 3. Augusto, A., Conforti, R., Dumas, M., Rosa, M.L., Maggi, F.M., Marrella, A., Mecella, M., Soo, A.: Automated discovery of process models from event logs: Review and benchmark. CoRR abs/1705.02288 (2017) 4. Baldan, P., Corradini, A., Ehrig, H., Heckel, R.: Compositional modeling of reactive systems using open nets. In: Larsen, K.G., Nielsen, M. (eds.) CONCUR 2001. LNCS, vol. 2154, pp. 502–518. Springer, Heidelberg (2001) 5. Bednarczyk, M.A., Bernardinello, L., Caillaud, B., Pawlowski, W., Pomello, L.: Modular system development with pullbacks. In: ICATPN 2003. LNCS, vol. 2679, pp. 140–160. Springer, Heidelberg (2003) 6. Bernardinello, L., Lomazova, I., Nesterov, R., Pomello, L.: Compositional discovery of workflow nets from event logs using morphisms. Technical Report. Available: https://arxiv.org/pdf/1806.04254.pdf (2018) 7. Bernardinello, L., Mangioni, E., Pomello, L.: Local state refinement and composi- tion of elementary net systems: An approach based on morphisms. In: Transactions on Petri Nets and Other Models of Concurrency VIII. LNCS, vol. 8100, pp. 48–70. Springer, Heidelberg (2013) 8. Bernardinello, L., Monticelli, E., Pomello, L.: On preserving structural and be- havioural properties by composing net systems on interfaces. Fundamenta Infor- maticae 80(1-3), 31–47 (2007) 9. Best, E., Devillers, R., Hall, J.G.: The box calculus: A new causal algebra with multi-label communication. In: APN 1992. LNCS, vol. 609, pp. 21–69. Springer, Heidelberg (1992) 10. Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P.: On the role of fitness, precision, generalization and simplicity in process discovery. In: OTM 2012. LNCS, vol. 7565, pp. 305–322. Springer, Heidelberg (2012) 11. Cardinale, Y., El Haddad, J., Manouvrier, M., Rukoz, M.: Web service composition based on petri nets: Review and contribution. In: RED 2012. LNCS, vol. 8194, pp. 83–122. Springer, Heidelberg (2013) 37 12. Fabre, E.: On the construction of pullbacks for safe petri nets. In: Donatelli, S., Thiagarajan, P.S. (eds.) ICATPN 2006. LNCS, vol. 4024, pp. 166–180. Springer, Heidelberg (2006) 13. Girault, C., Valk, R.: Petri Nets for Systems Engineering: A Guide to Modeling, Verification, and Applications. Springer, Heidelberg (2003) 14. Haddad, S., Hennicker, R., Møller, M.H.: Channel properties of asynchronously composed petri nets. In: ICATPN 2013. LNCS, vol. 7927, pp. 369–388. Springer, Heidelberg (2013) 15. van Hee, K.M., Mooij, A.J., Sidorova, N., van der Werf, J.M.: Soundness-preserving refinements of service compositions. In: Web Services and Formal Methods. LNCS, vol. 6551, pp. 131–145. Springer, Heidelberg (2011) 16. van Hee, K.M., Sidorova, N., van der Werf, J.M.: Construction of asynchronous communicating systems: Weak termination guaranteed! In: Software Composition. LNCS, vol. 6144, pp. 106–121. Springer, Heidelberg (2010) 17. Kalenkova, A.A., Lomazova, I.A.: Discovery of cancellation regions within process mining techniques. Fundamenta Informaticae 133(2-3), 192–209 (2014) 18. Kalenkova, A.A., Lomazova, I.A., van der Aalst, W.M.P.: Process model discovery: A method based on transition system decomposition. In: ICATPN 2014. LNCS, vol. 8489, pp. 71–90. Springer, Heidelberg (2014) 19. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs - a constructive approach. In: ICATPN 2013. LNCS, vol. 7927, pp. 311–329. Springer, Heidelberg (2013) 20. Lomazova, I.A.: Interacting workflow nets for workflow process re-engineering. Fun- damenta Informaticae 101(1-2), 59–70 (2010) 21. Lomazova, I.A., Romanov, I.V.: Analyzing compatibility of services via resource conformance. Fundamenta Informaticae 128(1-2), 129–141 (2013) 22. Nielsen, M., Rozenberg, G., Thiagarajan, P.: Elementary transition systems. The- oretical Computer Science 96(1), 3–33 (1992) 23. Padberg, J., Urbášek, M.: Rule-based refinement of petri nets: A survey. In: Petri Net Technology for Communication-Based Systems: Advances in Petri Nets. LNCS, vol. 2472, pp. 161–196. Springer, Heidelberg (2003) 24. Pankratius, V., Stucky, W.: A formal foundation for workflow composition, work- flow view definition, and workflow normalization based on petri nets. In: APCCM 2005, vol. 43. pp. 79–88. Australian Computer Society, Inc. (2005) 25. Reisig, W.: Understanding Petri Nets: Modeling Techniques, Analysis Methods, Case Studies. Springer, Heidelberg (2013) 26. Siegeris, J., Zimmermann, A.: Workflow model compositions preserving relaxed soundness. In: BPM 2006. LNCS, vol. 4102, pp. 177–192. Springer (2006) 27. Winskel, G.: Petri nets, morphisms and compositionality. In: APN 1985. LNCS, vol. 222, pp. 453–477. Springer, Heidelberg (1986) 38