=Paper= {{Paper |id=Vol-2625/paper05 |storemode=property |title=On Discovering Distributed Process Models the case of asynchronous communication |pdfUrl=https://ceur-ws.org/Vol-2625/paper-04.pdf |volume=Vol-2625 |authors=Pieter Kwantes,Jetty Kleijn |dblpUrl=https://dblp.org/rec/conf/apn/KwantesK20 }} ==On Discovering Distributed Process Models the case of asynchronous communication== https://ceur-ws.org/Vol-2625/paper-04.pdf
 On Discovering Distributed Process Models
 – the case of asynchronous communication –

                        Pieter Kwantes and Jetty Kleijn

                   LIACS, Leiden University, P.O.Box 9512
                    NL-2300 RA Leiden, The Netherlands
             {p.m.kwantes,h.c.m.kleijn}@liacs.leidenuniv.nl



       Abstract. This paper is concerned with the discovery of models
       of distributed processes from event logs under the assumption that
       an algorithm for the discovery of the component models is given.
       We use Enterprise nets to model local processes, while Industry
       nets are compositions of Enterprise nets which interact through
       asynchronous message passing. First, we investigate the relation
       between the behaviour of Enterprise nets and that of Industry nets
       and formalise the (causal) structure of global (Industry net) be-
       haviour in terms of a partial order derived from the message pass-
       ing. Next, we demonstrate how to leverage existing algorithms for
       the discovery of workflow nets to discover Enterprise nets and how
       to combine these into an Industry net. Using the results on the
       structure of the global behaviour, we relate the behaviour of the
       Industry net thus discovered to the behaviour of the Enterprise
       nets and show how fitness of the Enterprise nets (the event log pro-
       vided as input is included in the behaviour of the discovered net)
       is preserved as fitness of the Industry net.
       Keywords: process discovery, distributed process, event log, work-
       flow net, E-net, I-net, asynchronous channels, partial order


1    Introduction
Industry nets have been introduced in [18] as a framework to model global
communication between enterprises where the design of their internal op-
erations is left to the local level. In this set-up, operations at the enterprise
level are represented by Enterprise nets (Petri nets with input, output, and
internal transitions) and Industry nets are compositions of Enterprise nets
that interact by exchanging messages through channels. In [18], a method is
proposed to establish global compliance of an Industry net with a reference
model by local checks (of Enterprise nets) only. In this paper, we focus on
the discovery of Industry nets from a description of the (observed) global
behaviour of a collection of collaborating enterprises.



                                              49
                                                         Copyright © 2020 for this paper by its authors.
                 Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    Business process modelling and process mining are nowadays very active
research areas and there are many approaches both to process discovery and
conformance checking (see, eg., [6, 8, 11]). Here, we take advantage of this
and consider the discovery of distributed processes (modelled as Industry
nets) assuming the existence of an algorithm for the discovery of component
processes (in the form of Enterprise nets with their own activities). To be
precise, we first demonstrate how an algorithm that computes from an
event log a Petri net, say a workflow net [7] (the Inductive Miner [19] is an
example of such an algorithm), can be adapted to yield an Enterprise net.
Then, given some global behaviour and the local event logs determined by it,
an Industry net is constructed by combining the Enterprise nets discovered
from the respective local event logs.
    In order to compare the given global behaviour with the behaviour (the
firing sequences) of the Industry net, we first investigate, after a preliminary
section, in Sections 3, 4, and 5, the relation between the behaviour of an In-
dustry net and the firing sequences of its Enterprise nets. As a consequence
of the asynchronous set-up, every sequence that can be executed by an In-
dustry net is also locally executable (ignoring the actions of other compo-
nents) by each of its Enterprise nets. To describe the converse relationship,
we formulate a prefix property that identifies those action sequences that
can be executed by an Industry net whenever they are locally executable in
all component Enterprise nets. Furthermore, by explicitly assigning occur-
rences of output actions to (dependent) occurrences of input actions, we can
associate with each sequence with the prefix property, a labelled partial or-
der reflecting the necessary ordering of action occurrences in this sequence.
All linearisations of this labelled partial order are also firing sequences of
the Industry net. We show that, in case the input/output relation is based
on a FIFO relation, the set of these linearisations is maximal. In Section 6,
we demonstrate how to leverage existing process discovery algorithms to
the discovery of Industry nets. Then we argue, based on the results from
the earlier sections, how fitness of the nets delivered by the original algo-
rithm (the event log provided as input is included in the behaviour of the
discovered net) guarantees the fitness of the distributed model. Section 7
concludes the paper and briefly considers related work.
    Due to the page limit, this paper has no proofs. We provide, however,
examples, explanations, and lemmas to clarify our statements and results.


2    Preliminaries

N = {0, 1, 2, . . .} is the set of natural numbers including 0. For n ∈ N, we
set [n] = {1, 2, 3, . . . , n} and if n = 0, then [n] = ∅. By A ] B we denote the




                                       50
union of disjoint sets A and B. Given a partial order R ⊆ A × A, we refer
to a total order R0 ⊆ A × A such that R ⊆ R0 , as a linearisation of R.
    An alphabet is a finite, non-empty, set of symbols. Let Σ be an alphabet.
A word over Σ is a sequence w = a1 · · · an , with n ≥ 0 and ai ∈ Σ, for
all i ∈ [n]; we refer to n as the length of w, denoted by |w|. If n = 0
then w is the empty word denoted by λ. Note that |λ| = 0. The set of
all words over Σ is denoted as Σ ∗ . Any subset of Σ ∗ is a language (over
Σ). It is often convenient to consider a word w = a1 · · · an as a function
w : [|w|] → Σ, defined by w(i) = ai for all i ∈ [n]. The alphabet of w
is alph(w) S = {w(i) | i ∈ [n]}. Hence alph(λ) = ∅. For a language L,
Alph(L) = {alph(w) | w ∈ L} is the set of all symbols that occur in
the words of L. If a word w is a concatenation of words v1 and v2 , i.e.,
w = v1 v2 , then v1 is said to be a prefix of w. The number of occurrences
of symbol a ∈ Σ in a word w ∈ Σ ∗ is defined as #a (w) = |{i | w(i) = a})|
and the set of occurrences of w is occ(w) = {(a, i) | a ∈ alph(w) ∧ 1 ≤
i ≤ #a (w)}. In addition, we allocate a position with each occurrence of
w through the function posw : occ(w) → [|w|] as follows: for all (a, i) ∈
occ(w), posw ((a, i)) = k if w(k) = a and #a (w(1) · · · w(k)) = i. For a
subset ∆ of Σ, the projection of Σ ∗ on ∆∗ is a function projΣ,∆ : Σ ∗ → ∆∗ ,
defined by projΣ,∆ (a) = a if a ∈ ∆, and projΣ,∆ (a) = λ otherwise;
furthermore projΣ,∆ (wa) = projΣ,∆ (w)projΣ,∆ (a) whenever w ∈ Σ ∗ and
a ∈ Σ. We omit the subscript Σ if it is clear from the context and thus
write proj∆ (w) instead of projΣ,∆ (w).


Petri nets A Petri net is a triple N = (P, T, F ), where P is a finite set of
places, T is a finite non-empty set of transitions such that P ∩ T = ∅, and
F ⊆ (P × T ) ∪ (T × P ) is a set of arcs.
Let N = (P, T, F ) be a Petri net. If N 0 = (P 0 , T 0 , F 0 ) is a Petri net such that
P ∪T and P 0 ∪T 0 have no elements in common, then N and N 0 are disjoint.
Let x ∈ P ∪ T . Then • x = {y | (y, x) ∈ F } and x• = {y | (x, y) ∈ F } are
the preset and the postset, respectively, of x (in N ). A marking µ of N is a
function µ : P → N. Let t ∈ T and µ a marking of N . Then t is enabled at µ
if µ(p) > 0 for all p ∈ • t. If t is enabled at µ, it may occur, and thus lead to
                                         t
a new marking µ0 of N , denoted µ →      − N µ0 , with µ0 (p) = µ(p) − 1 whenever
p ∈ • t\t• ; µ0 (p) = µ(p)+1 whenever p ∈ t• \ • t; and µ0 (p) = µ(p) otherwise.
                               t                                                λ
We extend the notation µ → − N µ0 to sequences w ∈ T ∗ as follows1 : µ −
                                                                       →N µ
                  wt     0                      0              ∗
for all µ; and µ −→N µ for markings µ, µ of N , w ∈ T and t ∈ T ,
                                                  w                 t
whenever there is a marking µ00 such that µ −     →N µ00 and µ00 → − N µ0 . If
   w
µ− →N µ0 , for some w ∈ T ∗ , we say that µ0 is reachable from µ (in N ). The
1
    We thus view T as an alphabet of action symbols.




                                          51
                                     w
language L(N, µ) = {w ∈ T ∗ | µ −    →N µ0 for some marking µ0 } consists of
all possible firing sequences of N from µ. A place p ∈ P is a source place of
N if • p = ∅ and it is a sink place of N if p• = ∅. The marking µ of N such
that, for all p ∈ P , µ(p) = 1 if p is a source place and µ(p) = 0 otherwise,
is the default initial marking of N . If µ is the default initial marking of N ,
we may write L(N ) to denote L(N, µ) and refer to it as the language of N .
A workflow net (see [7]) is a Petri net with exactly one source and one sink
place such that every place and every transition are on a path from source
to sink.

Enterprise nets and Industry nets We recall the definitions of En-
terprise and Industry nets from [18]. Enterprise nets (or E-nets, for short)
are Petri nets equipped for asynchronous communication with other E-nets;
they have transitions for inputting and transitions for outputting. An inter-
action between E-nets is realised by an occurrence of an output transition
of one E-net and a (later) occurrence of an input transition of another E-net
with a matching message type.
We let M denote the set of message types fixed throughout this paper.
Definition 1. An Enterprise net is a tuple E = (P, [Tint , Tinp , Tout ], F, M )
such that Tint , Tinp , and Tout are mutually disjoint sets; Tint is the set
of internal transitions of E, Tinp its set of input transitions, and Tout
its set of output transitions; furthermore, the underlying Petri net of E,
und(E) = (P, Tint ] Tinp ] Tout , F ), is a Petri net with exactly one source
place; finally M : Tinp ∪ Tout → M is the communication function of E. t      u
Given an Enterprise net E, L(E) = L(und(E)) is the language of E.
Note that E-nets, like workflow nets, have a single source place; no further
restrictions are imposed on the underlying net.
    Composing E-nets yields an Industry-net (or I-net, for short) with mul-
tiple source places. The E-nets involved are pairwise disjoint, i.e., their
underlying Petri nets are disjoint. When combining them into an I-net,
output and input transitions are matched via their message types.
Definition 2. Let n ≥ 2. Let V = {Ei | i ∈ [n]} be a set of pairwise disjoint
E-nets with Ei = (Pi , [Ti,int , Ti,inp , Ti,out ],SFi , Mi ) for eachSi ∈ [n].
A matching over V is a bijection ϕ : i∈[n] Ti,out → j∈[n] Tj,inp such
that whenever t ∈ Ti,out and ϕ(t) ∈ Tj,inp , for some i, j, then i 6= j and
Mi (t) = Mj (ϕ(t)).                                                             t
                                                                                u
A set V of mutually disjoint E-nets is said to be composable if there exists
a matching over V . To construct an I-net from a composable set V and a
matching ϕ over V , matching output and input transitions of the E-nets in
V are connected through (new) channel places using channel arcs.



                                         52
Definition 3. Let n ≥ 2. Let V = {Ei : i ∈ [n]} be a composable set of
E-nets with Ei = (Pi , [Ti,int , Ti,inp , Ti,out ], Fi , Mi ) and Ti = Ti,int ∪ Ti,inp ∪
Ti,out for all i ∈ [n]. Let ϕ be a matching over V .
Then P (V, ϕ) = {[t, ϕ(t)] | t ∈ Ti,out , i ∈ [n]} is the set of channel places of
V and ϕ, and F (V, ϕ) = {(t, [t, ϕ(t)]) | t ∈ Ti,out , i ∈ [n]}∪{([t, ϕ(t)], ϕ(t)) |
t ∈ Ti,out , i ∈ [n]} is the set of channel arcs of V and ϕ. The sets P (V, ϕ),
F (V, ϕ), and Pi , Ti , Fi , where i ∈ [n], are all pairwise disjoint.
S Industry net over (V,Sϕ) is the Petri netSI(V, ϕ) = (P, T, F ) with P =
The
  i∈[n] Pi ∪ P (V, ϕ), T =      i∈[n] Ti , and F =        i∈[n] Fi ∪ F (V, ϕ).         t
                                                                                       u
    The notion of an E-net can be considered a generalisation of the notion
of a workflow net through its underlying net. This generalisation is moti-
vated, in particular, by the observation that workflows in enterprises often
interact with workflows in other enterprises by a bilateral, asynchronous
exchange of messages of a specified type (see eg., [20, 16, 24, 12]). The sim-
ilarity between E-nets and workflow nets, moreover, allows us to leverage
the large amount of research into the automated discovery of workflow nets,
for the purpose of automated discovery of E-nets. The term “Industry”, in
“Industry net”, is used loosely to refer to any group of collaborating enter-
prises.
Example 1. Consider two enterprises, an Investment Firm (IF) and an Ex-
change (EX), modelled by the E-nets EIF and EEX in Figs. 1a and 1b,
respectively. The Investment Firm sends order messages to the Exchange
(modelled by output transition so with message type mo ). The Exchange,
upon receiving a message (via input transition ro with message type mo ),
subsequently sends an order confirmation message to the Investment Firm
(output transition sc with message type mc ). Then I(V, ϕ), the I-net in
Fig.1c, with V = {EIF , EEX } and ϕ defined by ϕ(so) = ro and ϕ(sc) = rc,
models the collaboration between the Investment Firm and the Exchange.
Note that V allows only one matching.
                                                                       t
                                                                       u
    The ideas underlying the concepts of E-nets and I-nets belong to a well-
established branch of research concerned with composing systems modelled
as Petri nets, into larger correctly functioning (concurrent) systems (see
eg., [13, 14, 22, 23]). In [1] and also in [25] compositions of so-called open
workflow nets are considered. These are Petri nets with interface places to
communicate with other Petri nets. In [14], similar to the set-up of I-nets,
local Petri nets do not have interface places but they communicate via
distinguished input and output labels. As mentioned in [14], the advantage
of this approach is that it leads to a better separation of concerns: eg., the
designer of a local system does not have to consider how it will be used;
this is a concern for the designer of the global system.



                                           53
     (a) E-net EIF          (b) E-net EEX                (c) The I-net

                                     Fig. 1


3    E-net languages and I-net languages
Let V , ϕ, and I(V, ϕ) = (P, T, F ) as specified in Definition 3, be fixed for
this section, and for Section 4 and Section 5.
    Clearly, the construction of I(V, ϕ) does not affect the internal structure
of the E-nets in V and the set of source places of I(V, ϕ) consists of all source
places of the Ei . Moreover, removing channel places does not restrict the
behaviour of an E-net. Consequently, all firing sequences in L(I(V, ϕ)) are
combinations of firing sequences from the languages of the component nets.

Lemma 1. If w ∈ L(I(V, ϕ)), then projTi (w) ∈ L(Ei ) for all i ∈ [n].          t
                                                                               u

On the other hand, the composition of E-nets into an I-net adds a channel
place to the preset of each input transition. Consequently, the behaviour
of an E-net may become restricted in the I-net. The property defined next
describes how the occurrence of input transitions depends on the occurrence
of output transitions.

Definition 4. Let w ∈ T ∗ . Then w has the prefix property with respect to
ϕ if #a (u) ≤ #ϕ−1 (a) (u) for all prefixes u of w and all a ∈ Tinp .
A language L ⊆ T ∗ has the prefix property with respect to ϕ if all w ∈ L
have this property.                                                      t
                                                                         u

We will omit the reference to ϕ as it is fixed.
   The asynchronous communication between its component E-nets guar-
antees that the firing sequences of I(V, ϕ) have the prefix property.

Lemma 2. If w ∈ L(I(V, ϕ)), then w has the prefix property.                    t
                                                                               u




                                       54
Conversely, any sequence w ∈ T ∗ that can be (locally) executed by all
E-nets and that satisfies the prefix property, belongs to L(I(V, ϕ)).

Lemma 3. Let w ∈ T ∗ be such that projTi (w) ∈ L(Ei ) for all i ∈ [n]. If
w has the prefix property, then w ∈ L(I(V, ϕ)).                        t
                                                                       u

Example 2. (Ex. 1 ctd.) Let T = {so, rc, ro, sc}. Let w = hso, ro, sc, rci.2
Now projTIF (w) = hso, rci ∈ L(EIF ) and projTEX (w) = hro, sci ∈ L(EEX ),
where TIF and TEX are the sets of transitions of EIF and EEX , respectively.
Obviously, w has the prefix property. Thus w ∈ L(I(V, ϕ)) by Lemma 3. For
w0 = hso, ro, rc, sci, the prefix property does not hold: in u = hso, ro, rci, a
prefix of w0 , we have #rc (u) = 1, but #sc (u) = 0. Indeed, w0 6∈ L(I(V, ϕ))
although projTIF (w0 ) = projTIF (w) and projTEX (w0 ) = projTEX (w). t       u

Combining the above three lemmas yields

Theorem 1. Let L ⊆ T ∗ . Then L ⊆ L(I(V, ϕ)) if and only if L has the
prefix property and projTi (L) ⊆ L(Ei ) for all i ∈ [n].            t
                                                                    u

From Lemma 1 and Theorem 1, it follows that whenever there exists a
language L with the prefix property and for which projTi (L) = L(Ei ) for
all i ∈ [n], the I-net captures the full unrestricted behaviour of each E-net.

Corollary 1. Let L ⊆ T ∗ be such that L has the prefix property and
projTi (L) = L(Ei ) for all i ∈ [n]. Then projTi (L(I(V, ϕ))) = L(Ei ) for
all i ∈ [n].                                                             t
                                                                         u

It should however be noted, that the conditions on L in Corollary 1 do not
imply that L(I(V, ϕ)) = L. By Theorem 1, L ⊆ L(I(V, ϕ)) and as we argue
next, this inclusion may be strict.


4     Structuring words with the prefix property
Whereas the prefix property is based on a simple comparison of numbers of
occurrences, assignment functions, as defined next, relate each occurrence
of an input transition to a corresponding occurrence of an output transition.

Definition 5. An assignment function with respect to ϕ for a word w ∈ T ∗
is an injective function θ : (occ(w) ∩ (Tinp × N)) → occ(w) such that
for every occurrence (b, j) ∈ occ(w) with b ∈ Tinp , θ(b, j) = (a, i) where
a = ϕ−1 (b) and i is such that posw (a, i) < posw (b, j).                t
                                                                         u
2
    The elements of the alphabet T are symbols consisting of two letters. In order to
    avoid confusion, we denote in this and similar examples, any sequence a1 · · · an
    with ai ∈ T for each i ∈ [n] by ha1 , · · · , an i.



                                          55
Again, since ϕ is fixed, we will omit the reference to ϕ.

Remark 1. Clearly, every word for which an assignment function exists, has
the prefix property. Conversely, whenever a word has the prefix property it
has at least one (in general, more than one) assignment function.        t
                                                                         u

Example 3. Assume T = {a, b} and ϕ(a) = b. Let w = aabb. Then w has
two assignment functions: θ defined by θ(b, 1) = (a, 1) and θ(b, 2) = (a, 2);
and θ0 defined by θ0 (b, 1) = (a, 2) and θ0 (b, 2) = (a, 1).               t
                                                                           u

Each assignment function for w determines a partial order on occ(w).

Definition 6. Let w ∈ T ∗ and let θ be an assignment function for w. Let
(a, i), (b, j) ∈ occ(w), with a ∈ Tk and b ∈ Tl for some k, l ∈ [n].
Then (a, i) ≤w,θ (b, j) if
(1) k = l and posw (a, i) ≤ posw (b, j) or
(2) k 6= l and b ∈ Tinp and θ(b, j) = (a, i).                         t
                                                                      u

By condition (1), ≤w,θ incorporates the ordering of occurrences of transi-
tions in w that originate from the same E-net; by (2), it reflects the order
of occurrences of input transitions and their assigned occurrences of output
transitions.

Lemma 4. Let w ∈ T ∗ and let θ be an assignment function for w. Then
≤+
 w,θ , the transitive closure of ≤w,θ , is a partial order on occ(w). t
                                                                      u

The linearisations of ≤+
                       w,θ , where w is a word and θ an assignment function
for w, define a set of words that can be obtained from w by (repeatedly)
interchanging the positions of unrelated occurrences.

Definition 7. Let w ∈ T ∗ and let θ be an assignment function for w. Then
linθ (w) = {v ∈ T ∗ | occ(v) = occ(w) and for all x, y ∈ occ(v), x ≤+ w,θ
y implies posv (x) ≤ posv (y)} is the set of θ-linearisations of w.     t
                                                                        u

Clearly w ∈ linθ (w) whenever linθ (w) is defined.

Remark 2. Let w and θ as in Definition 7 and let v ∈ linθ (w). Then
(1) projTi (v) = projTi (w) for all i ∈ [n] and
(2) θ is an assignment function for v and linθ (v) = linθ (w).
From (2) and Remark 1, it follows that v has the prefix property.          t
                                                                           u

Example 4. (Ex. 3 ctd.) Let v = abab. Then occ(v) = occ(w) and linθ (w) =
{v, w}. Clearly, θ is an assignment function for v and linθ (v) = linθ (w).
Note that linθ0 (w) = {w} and θ0 is not an assignment function for v. t  u




                                     56
By combining Remark 2 with Lemma 3, the statement of Lemma 3 can be
strengthened. This shows that whenever a word with the prefix property
can be locally executed by all component E-nets, then, for all its assignment
functions θ, each of its θ-linearisations can be executed globally by the I-net.
Theorem 2. Let w ∈ T ∗ be such that projTi (w) ∈ L(Ei ) for all i ∈ [n]. If
θ is an assignment function for w, then linθ (w) ⊆ L(I(V, ϕ)).           t
                                                                         u
Example 5. The I-net in Fig. 2 is an extension of the I-net from Example 2.
The Exchange, after sending a confirmation (output transition sc) to the
Investment Firm, can send (output transition si) a settlement instruction
to the Central Securities Depositary (CD) to transfer ordered securities to
the Investment Firm. The Central Securities Depositary - after receiving
(input transition ri matching si) the settlement instruction - sends (output
transition ss) a confirmation of settlement to the Investment Firm (that
receives the message via input transition rs).




                Fig. 2: I-net composed of EIF , EEX and ECD



Let w = hso, ro, sc, rc, si, ri, ss, rsi. So, w can be (locally) executed by
each of the three enterprise nets in Fig. 2. Also, w has the prefix property
(w.r.t. ϕ displayed in Fig. 2). Hence w is in the language of the I-net in
Fig. 2. Define θ by θ(ro, 1) = (so, 1), θ(rc, 1) = (sc, 1), θ(ri, 1) = (si, 1),
and θ(rs, 1) = (ss, 1). This θ is an (the only) assignment function for w
(w.r.t. ϕ). Consider v = hso, ro, sc, si, rc, ri, ss, rsi, obtained from w by
exchanging (si, 1) and (rc, 1). These occurrences are not ordered by ≤+   w,θ .
Hence v ∈ linθ (w) and v is, like w, in the language of the I-net.           t
                                                                             u



                                       57
5   Assignment functions of type FIFO
By Theorem 2, every assignment function θ for a word w with the prefix
property determines a set of words linθ (w) that can be executed by the
I-net, provided w can be locally executed by all component E-nets. As
remarked earlier, a word with the prefix property may have more than one
assignment function. In this section we demonstrate that considering one
particular type of assignment function is sufficient to describe all possible
linearisations. Intuitively, in terms of the I-net, one could say that each
assignment function describes for each occurrence of an input transition,
which token it should take from its input channel place (namely the token
deposited by the assigned occurrence of its output transition). Based on
this point of view, the following two types of assignment functions represent
natural policies to deal with the messages (tokens) in the channel places.

Definition 8. Let w ∈ T ∗ and let θ be an assignment function for w. Then
(1) θ is of type FIFO with respect to ϕ if, θ(b, j) = (ϕ−1 (b), j) for every
(b, j) ∈ occ(w) such that b ∈ Tinp ;
(2) θ is of type Stack with respect to ϕ if, for every (b, j) ∈ occ(w) such that
b ∈ Tinp , θ(b, j) = (a, i) where a = ϕ−1 (b), w = xaybz, for some x, y, z ∈
occ(w) such that #b (y) = #a (y), and j = #b (xayb) and i = #a (xa).          t
                                                                              u

Again, since ϕ is fixed, we will omit the reference to ϕ.
   Every word with the prefix property has one assignment function of
type FIFO and one of type Stack (and these functions may be the same).

Example 6. (Ex. 3 and Ex. 4 ctd.) For w, the assignment function θ is of
type FIFO and θ0 is of type Stack. Moreover, θ is an assignment function
for v of type FIFO and of type Stack.                                  t
                                                                       u
                 fifo
In the sequel, θw     denotes the assignment function of type FIFO of any
word w ∈ T with the prefix property. In addition,Sif L ⊆ T ∗ is a language
             ∗

that has the prefix property, then linFIFO (L) = {linθwfifo (w) | w ∈ L}
consists of all FIFO-linearisations of L.

Lemma 5. Let v, w ∈ T ∗ be such that v and w have the prefix property
and occ(v) = occ(w). Then θvfifo = θw
                                    fifo
                                         .                         t
                                                                   u

This observation does not hold for all assignment functions as can be seen
in Example 6. Assignment functions of type FIFO are also special as they
impose the least restrictive ordering (with for each occurrence of an input
transition, only one occurrence of an output transition that precedes it).
and thus a maximal (w.r.t. set inclusion) set of linearisations.




                                       58
Lemma 6. Let w ∈ T ∗ have the prefix property. Then linθwfifo (w) = {v ∈
T ∗ | v has the prefix property and projTi (v) = projTi (w) for all i ∈ [n]}.
                                                                             t
                                                                             u
The inclusion from left to right is true for all assignment functions by
Remark 2. The converse follows from Lemma 5 and Definition 7.
Example 7. Let I(V, ϕ) be the I-net in Fig. 3 (extending Fig.1c). The In-
vestment Firm from Example 2 can now send multiple orders to the Ex-
change. This is modelled using internal transitions in the E-nets: ii1 and
ii2 for the Investment Firm and ei1 and ei2 for the Exchange.




                   Fig. 3: I-net with option to repeat orders



Let v = hei1, ii1, so, ii2, ro, ei2, so, ii2, ro, ei2, so, ro, sc, rci and let w =
hei1, ii1, so, ii2, so, ii2, so, ro, ei2, ro, ei2, ro, sc, rci. Clearly, both v and w
have the prefix property and projTi (v) = projTi (w) for i ∈ {1, 2} with
T1 = {ii1, ii2, so, rc} and T2 = {ei1, ei2, ro, sc}. Let θ be the assignment
function for w of type FIFO. Hence, by Lemma 6, v ∈ linθ (w). However, for
the assignment function θ0 of w of type Stack (and thus θ0 (ro, 1) = (so,3)),
v ∈ linθ0 (w) does not hold.                                                        t
                                                                                    u
For a language L ⊆ T ∗ that has the prefix property, we denote by lin(L) the
language consisting of all words that can be obtained as a θ-linearisation
          S w ∈ L in for whatever assignment function θ for w. Thus,
of any word
lin(L) = {linθ (w) | w ∈ L and θ an assignment function for w}.
   Using Lemma 6, we obtain the following extension of Theorem 2.
Theorem 3. Let L ⊆ T ∗ be a language with the prefix property such that
projTi (L) ⊆ L(Ei ) for all i ∈ [n].
Then L ⊆ lin(L) ⊆ linFIFO (L) ⊆ L(I(V, ϕ)).                          t
                                                                     u



                                         59
This theorem together with Theorem 1 shows that the languages of I-nets
are closed under exchanging independent occurrences of transitions.

Corollary 2. lin(L(I(V, ϕ))) = linFIFO (L(I(V, ϕ))) = L(I(V, ϕ)).           t
                                                                            u


6   Discovery of I-nets
We are now ready to address our initial question: given an event log (lan-
guage), representing the observed behaviour of a given number of distinct,
collaborating enterprises, construct an I-net that generates this observed
behaviour. In this section, we will demonstrate how we can leverage exist-
ing algorithms for the discovery of isolated (local) processes, to solve this
problem. A formal definition of such a process discovery algorithm, based
on [6], is given next.

Definition 9. Let L be a family of languages. A process discovery algo-
rithm A for L is an algorithm that computes for all L ∈ L, a Petri net
A(L) = (P, T, F ) with a single source place and such that T = Alph(L) and
L ⊆ L(A(L)).                                                             t
                                                                         u

To leverage such algorithms to a system of asynchronously communicating
nets, one needs to describe the distribution of actions over components in
combination with their role as internal, input, or output action. This leads
to the concept of a distributed communicating alphabet defined next.

Definition 10. Let n ≥ 1. An n-dimensional distributed communicating
alphabet (or n-DCA, for short) is a tuple
DA = ([Σ1 , . . . , Σn ], [Σint , Σinp , Σout ], mt, cp) such that
– Σ1 , . . . , Σn are non-empty, pairwise disjoint alphabets;
– Σint , Σinp , Σout are pairwise disjoint alphabets consisting of internal
actions,
   S        input actions and output actions, respectively;
– i∈[n] Σi = Σint ] Σinp ] Σout ;
– mt : Σinp ∪ Σout → M is a function that assigns message types to the
input and output actions;
– cp : Σinp ∪ Σout → Σinp ∪ Σout is a ( complementing) bijection, which is
not defined (cp = ∅) if n = 1, and otherwise (n ≥ 2), for all a ∈ Σinp ∪Σout :
if a ∈ Σinp , then cp(a) ∈ Σout and if a ∈ Σout , then cp(a) ∈ Σinp ;
if a ∈ Σi for some i ∈ [n], then cp(a) ∈ Σj where j ∈ [n] is such that i 6= j;
mt(cp(a)) = mt(a); and cp(cp(a)) = a.                                       t
                                                                            u
                  S
The alphabet i∈[n] Σi = Σint ] Σinp ] Σout in Definition 10, also referred
to as the underlying alphabet of the n-DCA DA, is intended to represent
the actions available to n interacting enterprises. DA describes both their



                                      60
distribution across the enterprises and their interaction capacities. A 1-DCA
consists of a single enterprise. When specifying a 1-DCA, we can omit its
complementing bijection cp as it is not defined (and not needed). For i ∈ [n],
we refer to the 1-DCA DAi = ([Σi ], [Σi,int , Σi,inp , Σi,out ], mti ) as the i-th
component of DA.
    For the rest of this section we assume a fixed family of languages L
and a fixed process discovery algorithm A for L. The following definition
describes how A can be used to discover E-nets.

Definition 11. The E-net discovery algorithm derived from A, is the al-
gorithm AE that computes, for all pairs (L, DA) such that L ∈ L with
A(L) = (P, T, F ), and DA = ([T ], [Tint , Tinp , Tout ], mt) is a 1-DCA with
Alph(L) = T as its underlying alphabet, the E-net AE (L, DA) = (P, [Tint ,
Tinp , Tout ], F, mt).                                                      t
                                                                            u

Thus AE adds the information from DA to the transitions of A(L). Clearly,
AE (L, DA) is an E-net. 3 From Definitions 9 and 11, we have L ⊆ L(A(L)) =
L(E). Now we move to the discovery of I-nets on basis of A.

Definition 12. The I-net discovery algorithm derived from A, is the al-
gorithm AI that computes, for all pairs (L, DA) such that
– DA = ([Σ1 ,S. . . , Σn ], [Σint , Σinp , Σout ], mt, cp) is an n-DCAfor an n ≥ 2,
– Alph(L) = i∈[n] Σi , the underlying alphabet of DA, and
– projΣi (L) ∈ L for all i ∈ [n],
the I-net AI (L, DA) = I(V, ϕ) with V = {AE (projΣi (L), DAi ) | i ∈ [n]}
and ϕ(a) = cp(a) for all a ∈ Σout .                                              t
                                                                                 u

Note that AI (L, DA) = I(V, ϕ) in Definition 12 is indeed an I-net, since,
by Definition 11, for all i ∈ [n], AE (projΣi (L), DAi ) is an E-net with set
of transitions Σi ; the Σi are mutually disjoint; and the properties of cp
described in Definition 10 guarantee that ϕ is a matching for V .
    By Definitions 9 and 12, we can transfer Theorem 1, Corollary 1, and
Theorem 3 to the setting of discovering an I-net from a given language L.

Theorem 4. Let L be a language with the prefix property. Let n ≥ 2 and
let DA be an n-DCA with Alph(L) as its underlying alphabet
Let AI (L, DA) = I(V, ϕ) with V = {Ei | i ∈ [n]}. Then
(1) L ⊆ lin(L) ⊆ linFIFO (L) ⊆ L(I(V, ϕ));
(2) projΣi (L(I(V, ϕ))) = projΣi (L) if projΣi (L) = L(Ei ) for all i ∈ [n],
where for each i ∈ [n], Σi is the underlying alphabet of DAi .            t
                                                                          u
3
    In case A is an algorithm for the discovery of workflow nets, like the Inductive
    Miner [19], AE (L, DA) would have the structure of a workflow net.




                                         61
Thus, given a language L with the prefix property representing an event
log of a system of n enterprises, and an n-dimensional distributed commu-
nicating alphabet, we can construct an I-net as described in Definition 12.
This I-net has all words in L as firing sequences together with all their lin-
earisations (obtained by exchanging occurrences of independent actions). In
fact, it is sufficient to consider only their FIFO-linearisations. Even then,
though, there may be more firing sequences than what can be deduced from
the description L of the observed behaviour. By Theorem 4.(2), a precise fit
of the enterprise nets is preserved in the I-net constructed. We are currently
working on a characterisation of properties of L that would guarantee that
the language of L is exactly lin(L).

Example 8. (Ex. 7 ctd.) Let w and v be as in Example 7 and let L = {w}.
Let DA = ([ΣIF , ΣEX ], [Σint , Σinp , Σout ], mt, cp) be the 2-DCA, with com-
ponents DAIF and DAEX , as given in Fig. 4, representing the actions avail-
able to the collaboration between the Investment firm and the Exchange
from Example 7. Let AE and AI be the E-net and I-net discovery algo-




                           Fig. 4: The 2-DCA DA



rithms respectively, derived from a process discovery algorithm A. Fur-
thermore, AE (projΣIF (L), DAIF ) = EIF and AE (projΣEX (L), DAEX ) =
EEX are as depicted in Figs. 5.(a) and (b) respectively. Then AI (L, DA) =
I(V, ϕ) is the I-net in Fig. 3. Since L satisfies the “if” conditions in Theorem
1, L ⊆ L(I(V, ϕ)). Moreover, as v ∈ linFIFO (w) we have from Theorem 4,
that also v ∈ L(I(V, ϕ))                                                       t
                                                                               u


7   Discussion
In this paper we have considered the problem of discovering a distributed
process model (in the form of an I-net) from an event log (given in the form
of a language). Moreover, the number of participating processes (modelled
as E-nets) is given together with their channels (in the form of matching
input and output actions). We have shown how an (existing) algorithm




                                       62
            (a) E-net EIF                          (b) E-net EEX

                                   Fig. 5


for the discovery of Petri net models from event logs of individual pro-
cesses could be used in a new, distributed, algorithm to discover an I-net.
Moreover, by Theorem 1, fitness of the I-net produced by the algorithm is
guaranteed if and only if fitness of the component nets is guaranteed and
the event log has the prefix property (a natural assumption for event logs
of distributed processes composed of components that interact via asyn-
chronous communication channels). It is interesting to reflect on the inclu-
sion of these linearisations (see Corollary 2 and Theorem 4.(1)) and the
role of the channels. The exchange of independent occurrences of actions
suggests a relationship to the well-known Mazurkiewicz traces (equivalence
classes of words, see, eg., [21]) and their dependence graphs (defining their
causal structure in the form of a labelled partial order, see [15]). There
is however an important difference: independence in Mazurkiewicz’ theory
is between actions rather than their occurrences. The independence be-
tween occurrences considered here can be seen as being context sensitive
and determined by the history leading to these occurrences which leads to
a theory of context dependent or local traces (see, eg., [10, 17]). In [14],
a model similar to our I-nets is considered. There channel properties are
investigated in terms of asynchronous I/O-transition systems rather than
languages. The asynchronously composed I/O-Petri nets considered, use
communication channels modelled by places of Petri nets rather than im-
posing the usual FIFO requirement of communication channels. However,
as can be seen from our results, choosing FIFO channels or the standard
unordered places, does not affect the language.
    Returning to the discovering of distributed process models, it was our
initial idea to investigate to what extent existing algorithms for the dis-




                                     63
covery of single (isolated) processes could be used to discover distributed
systems. The subject of process mining is an active research area, that has
resulted in many process discovery algorithms (see, eg., [6] for an overview).
Typically, these algorithms allow for the discovery of local processes in isola-
tion, i.e., interaction between processes is not taken into account. Note that
the approach as outlined here, is also different from the one followed in [2,
4] where the problem of process mining from large event logs is addressed.
There, different options to distribute the mining problem over sublogs (with
overlapping activities) are considered, whereas in the current paper we as-
sume the (disjoint) components given. Of special interest are the passages
from [3]. Provided a causal structure (a ‘skeleton’ of the process) is known,
a larger model can be decomposed into (synchronising) fragments mak-
ing a divide–and–conquer approach possible In [9] the event logs of Multi
Agent Systems are projected onto each agent to discover component models
in terms of workflow nets by using existing process discovery algorithms,
similar to our approach. By means of α-morphisms an abstraction of each
component model is derived. The aim of [9] however, is to show that if
the composition of these abstract models is sound, the composition of the
original component models is sound.


References
 1. Wil M. P. van der Aalst, Arjan J. Mooij, Christian Stahl and Karsten Wolf
    (2009) Service Interaction: Patterns, Formalization, and Analysis Lect. Notes
    in Comput. Sci. 5569, 42–88.
 2. Wil M. P. van der Aalst (2012) Distributed Process Discovery and Confor-
    mance Checking. Lect. Notes in Comput. Sci. 7212, 1–25.
 3. Wil M. P. van der Aalst (2012) Decomposing Process Mining Problems Using
    Passages. Lect. Notes in Comput. Sci. 7347, 72–91.
 4. Wil M. P. van der Aalst (2013) Decomposing Petri nets for Process Mining:
    A Generic Approach. Distr. and Parallel Databases 31(4), 471–507.
 5. Wil M. P. van der Aalst (2013) Service Mining: Using Process Mining to
    Discover, Check, and Improve Service Behavior IEEE Trans. Serv. Comput.
    6(4), 525–535.
 6. Wil M. P. van der Aalst (2016) Process Mining - Data Science in Action
    Second Edition, Springer.
 7. Wil M. P. van der Aalst and Christian Stahl (2011) Modeling Business Pro-
    cesses - A Petri Net-Oriented Approach, MIT Press.
 8. Adriano Augusto, Raffaele Conforti, Marlon Dumas, Marcello La Rosa, Fab-
    rizio Maria Maggi, Andrea Marrella, Massimo Mecella, and Allar Soo (2017)
    Automated Discovery of Process Models from Event Logs: Review and Bench-
    mark. arXiv:1705.02288v3[cs.SE].




                                       64
 9. Luca Bernardinello, Irina A. Lomazova, Roman Nesterov and Lucia Pomello
    (2018) Compositional Discovery of Workflow Nets from Event Logs Using
    Morphisms. ATAED 2018, CEUR Workshop Proceedings 2115, 39-55.
10. Isabelle Biermann and Brigitte Rozoy (1997) Reliable Generalized and Con-
    text Dependent Commutation relations. Lect. Notes in Comput. Sci. 1214,
    165–176.
11. Josep Carmona, Boudewijn F. van Dongen, Andreas Solti, and Matthias
    Weidlich (2018) Conformance Checking - Relating Processes and Models,
    Springer.
12. EDSN (2018) Marktfacilitering https://www.edsn.nl/
13. Luı́s Gomes and João Paulo Barros (2005) Structuring and composability
    issues in Petri nets modeling. IEEE Trans. Industrial Informatics 1(2), 112 –
    123.
14. Serge Haddad, Rolf Hennicker, and Mikael H. Møller (2013) Channel Prop-
    erties of Asynchronously Composed Petri Nets. Lect. Notes in Comput. Sci.
    7927, 368 – 388.
15. Hendrik Jan Hoogeboom and Grzegorz Rozenberg (1995) Dependence graphs.
    In: Diekert, V., Rozenberg, G. (eds.) The Book of Traces, World Scientific,
    Singapore, 43–67.
16. HL7 (2015) Health Level Seven INTERNATIONAL http://www.hl7.org/
17. P.W. Hoogers, H.C.M.Kleijn, and P.S. Thiagarajan (1995) A trace semantics
    for Petri nets. Inf. Comput. 117(1), 98–114.
18. Pieter M. Kwantes and Jetty Kleijn (2018) On the Synthesis of Industry Level
    Process Models from Enterprise Level Process Models. ATAED 2018, CEUR
    Workshop Proceedings 2115, 6–22.
19. Sander J. J. Leemans, Dirk Fahland, and Wil M. P. van der Aalst (2013) Dis-
    covering Block-Structured Process Models from Event Logs - A Constructive
    Approach. Lect. Notes in Comput. Sci. 7927, 311–329.
20. S.W.I.F.T. (2015) ISO20022 Universal financial industry message scheme
    http://www.iso20022.org
21. Antoni W. Mazurkiewicz (1987) Trace theory. Lect. Notes in Comput. Sci.
    255, 279–324.
22. Wolfgang Reisig (2018) Towards a conceptual foundation of service composi-
    tion. Comput. Sci. Res. Dev. 33(3-4): 281-289
23. Wolfgang Reisig (2019) Associative composition of components with double-
    sided interfaces. Acta Inf. 56(3): 229-253
24. GS1US (2018) RosettaNet http://www.rosettanet.org/
25. Karsten Wolf (2009) Does My Service Have Partners? Trans. Petri Nets Other
    Model. Concurr. 2: 152-171 (2009)




                                       65