=Paper=
{{Paper
|id=Vol-1372/paper13
|storemode=property
|title=Discovery of Functional Architectures From Event Logs
|pdfUrl=https://ceur-ws.org/Vol-1372/paper13.pdf
|volume=Vol-1372
|dblpUrl=https://dblp.org/rec/conf/apn/WerfK15
}}
==Discovery of Functional Architectures From Event Logs==
<pdf width="1500px">https://ceur-ws.org/Vol-1372/paper13.pdf</pdf>
<pre>
         Discovery of Functional Architectures
                   From Event Logs

               Jan Martijn E.M. van der Werf and Erwin Kaats

                Department of Information and Computing Science
                               Utrecht University
               P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
             j.m.e.m.vanderwerf@uu.nl, e.j.kaats@students.uu.nl


      Abstract. The functional architecture focuses on decomposing func-
      tionality into modules that offer certain features. These features require
      interactions in order to complete their functionality. However, functional
      architectures typically only focus on the static aspects of the system de-
      sign. Additional modeling techniques, such as message sequence charts
      are often used in the early phases of software design to indicate how the
      software should behave.
      In this paper we investigate the use of process discovery techniques to
      discover from these scenarios the internal behavior of individual compo-
      nents. Based on event logs, this paper presents an approach (1) to derive
      the information flows between features, (2) identify the internal behav-
      ior of features, and (3) to discover the order between features within a
      module. The approach results in a sound workflow model for each mod-
      ule. We illustrate the approach using a running example of a payment
      system.


1   Introduction

One of the principle tasks of a software architect is to design a software sys-
tem [17], i.e., to organize the software elements the system is composed of in
sets of structures, to allow reasoning about the system [4]. Many different Archi-
tectural Description Languages (ADLs) exist to document software architecture.
However, due to the large competitive market in the software product industry,
architecture is often neglected in software product organizations [14]. Hence, not
many ADLs are used in practice. As experienced in [14], in software product or-
ganizations, architects rather use informal architectural models as an instrument
of communication and discussion.
    An important aspect of software architecture is the functionality it offers.
To decompose and specify the functionality of software, the authors of [6] in-
troduced the Functional Architecture Model (FAM), which offers the desired
modeling technique used by many software architects in software product orga-
nizations [14]. The FAM separates the functionality into so-called features that
are offered by the different modules the system is decomposed into.
228    PNSE’15 – Petri Nets and Software Engineering


    Features interact with other features via information flows to offer their func-
tionality. However, FAM only offers a static view on this interaction, i.e., the
information flow only shows possible interactions, but imposes no order on or
dependencies between these flows. Thus, to show how functionality is offered
by the system, the architect requires additional models. One way is to define
scenarios on top of the models, in which the architect can specify which features
interact in which order. These scenario then result in event logs, that can be
analyzed using process mining techniques [2]. Another source for discovering the
possible interactions between features is the use of system execution data [21],
mapping events to the (partial) execution of features. In this way, execution data
can be used to reconstruct a software architecture.
    In software product organizations, time to market is often a more important
priority than having a properly documented software architecture. Consequently,
architecture documentation is often outdated or even missing [9]. Therefore, dis-
covering architectural models help such organizations in maintaining their soft-
ware products. In this paper, we investigate the possibility to use process mining
techniques to discover, the functional architecture of the system from an event
log. We thereby focus on three basic questions on the functional architecture:

 1. Which features interact?
 2. What is the internal behavior of features?
 3. What is the order in which features are executed within a module?

    The first question focuses on the discovery of information flows: given an
event log, is it possible to derive which features interact? Next, we investigate
whether it is possible to derive the internal behavior of features based on event
logs. In other words, we focus on the question how does a feature use its in-
formation flows to complete its functionality. The last question deals with the
high-level view of the functional architecture. To execute the system’s function-
ality, the features within a module are called in a certain order. Can process
discovery techniques be used to discover these orders?
    The remainder of this paper is structured as follows. To illustrate the ap-
proach, Sec. 2 presents a running example which we will use throughout the
paper. Next, Sec. 3 presents the basic notions used in the paper. Section 4 in-
troduces the functional architecture model in more detail, after which in Sec. 5
we will focus on solving the three questions posed in the introduction. Section 6
concludes the paper.


2     Running Example
As an running example, consider the Payment System as introduced in [10]. The
system consists of three modules, Debtor, Payment and Creditor. The payment
module serves as an intermediate between the Debtor and the Creditor. An
example of such a payment module is the european SEPA standard. The payment
module initiates a transaction, which the debtor needs to accept. If the debtor
accepts, the payment is continued, and the creditor is contacted to start the
           J. Van Der Werf, E. Kaats: Discovery of Functional Architectures    229


transaction. If for some reason the creditor rejects the transaction, the debtor
is notified, and the transaction is terminated. Similarly, if the creditor accepts,
the payment is passed to the debtor, and finally, the creditor receives the final
payment information.
    As the software evolved into the current system, no precise model exists that
specifies the behavior of this system. The system only recorded the order in which
the different features of the modules have been called in an event log, as shown
in Tbl. 1. Each pair in the table represents the feature and the module to which
that feature belongs. For readability, the features and modules are abbreviated
in this event log.
    The system is decomposed into three modules: the Debtor module (X), the
Payment (Y) module, and the Creditor (Z). Based on the event log, the software
architect finds the following features:

 – Receive transaction request (A);
 – Reject transaction (B);
 – Accept transaction (C);
 – Cancel transaction (D);
 – Initiate payment (E);
 – Send payment details (F);
 – Archive transaction request (G).
 – Send transaction request (H);
 – Reject transaction request (I);
 – Initiate creditor (J);
 – Cancel transaction (K);
 – Initiate payment (O);
 – Handle payment (M);
 – Archive transaction (N).
 – Start transaction (Q);
 – Handle transaction (S).


                             A                 H

                             B                 I

                    G        C                 J             Q

                            D                  K

                             E                 O             S

                             F                M

                         Debtor                N          Creditor

                                            Payment

        Fig. 1. Initial functional architecture model of the running example
230     PNSE’15 – Petri Nets and Software Engineering


Case                                                 Trace
  1 (H, Y), (A, X), (B, X), (G, X), (I, Y), (N, Y)
  2 (H, Y), (A, X), (B, X), (I, Y), (G, X), N, Y)
  3 (H, Y), (A, X), (B, X), (I, Y), (N, Y), (G, X)
  4 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (K, Y), (D, X), (G, X), (N, Y)
  5 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (K, Y), (D, X), (N, Y), (G, X)
  6 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (K, Y), (N, Y), (D, X), (G, X)
  7 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z),( O, Y), (E, X), (F, X), (G, X), (M, Y), (S, Z), (N, Y)
  8 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (G, X), (M, Y), (N, Y), (S, Z)
  9 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (M, Y), (G, X), (S, Z), (N, Y)
 10 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (M, Y), (G, X), (N, Y), (S, Y)
 11 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (M, Y), (S, Z), (G, X), (N, Y)
 12 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (M, Y), (S, Z), (N, Y), (G, X)
 13 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (M, Y), (N, Y), (G, X), (S, Z)
 14 (H, Y), (A, X), (C, X), (J, Y), (Q, Z), (S, Z), (O, Y), (E, X), (F, X), (M, Y), (N, Y), (S, Y), (G, X)

                 Table 1. System execution data of the payment system


    Based on this information, the architect can draw the modules with their
features, as shown in Fig. 1. In the remainder of this paper, we investigate a
method to use event logs, such as the one shown in Tbl. 1, to complete the
diagram and derive a behavioral specification of the system.


3     Preliminaries
Let S be a set. The powerset of S is denoted by PpSq “ tS 1 | S 1 Ď Su. We use |S|
for the number of elements in S. Two sets U and V are disjoint if U X V “ H.
Some set S with relation ď is a partial order, denoted by pS, ďq, iff ď is reflexive,
i.e. a ď a for all a P S, antisymmetric, i.e. a ď b and b ď a imply a “ b for all
a, b P S, and transitive, i.e. a ď b and b ď c imply a ď c for all a, b, c P S. Given
a relation R Ď S ˆ S for some set S, we denote its transitive closure by R` , and
the transitive and reflexive closure by R˚ .
     A bag m over S is a function m : S Ñ IN , where IN “ t0, 1, . . .u denotes the
set of natural numbers. We denote e.g. the bag m with an element a occurring
once, b occurring three times and c occurring twice by m “ ra, b3 , c2 s. The set
of all bags over S is denoted by IN S . Sets can be seen as a special kind of bag
where all elements occur only once; we interpret sets in this way whenever we
use them in operations on bags. We use ` and ´ for the sum and difference of
two bags, and “, ă, ą, ď, ě for the comparison of two bags, which are defined
in a standard way.
     A sequence over S of length n P IN is a function σ : t1, . . . , nu Ñ S. If
n ą 0 and σpiq “ ai for i P t1, . . . , nu, we write σ “ xa1 , . . . , an y. The length
of a sequence is denoted by |σ|. The sequence of length 0 is called the empty
sequence, and is denoted by . The set of all finite sequences over S is denoted
by S ˚ . We write a P σ if a 1 ď i ď |σ| exists such that σpiq “ a. Concatenation
of two sequences ν, γ P S ˚ , denoted by σ “ ν; γ, is a sequence defined by σ :
t1, . . . , |ν| ` |γ|u Ñ S, such that σpiq “ νpiq for 1 ď i ď |ν|, and σpiq “ γpi ´ |ν|q
for |ν| ` 1 ď i ď |ν| ` |γ|. A sequence σ can be projected over some set U ,
denoted by σ|U , and is inductively defined by |U “ , pxay; σq|U “ xay; σ|U if
a P U , and pxay; σq|U “ σ|U otherwise.
            J. Van Der Werf, E. Kaats: Discovery of Functional Architectures       231


Petri Nets A Petri net [16] is a tuple N “ xP, T, F y where (1) P and T are two
disjoint sets of places and transitions respectively; and (2) F Ď pP ˆT qYpT ˆP q
is a flow relation. The elements from the set P Y T are called the nodes of
N . Elements of F are called arcs. Places are depicted as circles, transitions as
squares. For each element pn1 , n2 q P F , an arc is drawn from n1 to n2 .
     Let N “ xP, T, F y be a Petri net. Given a node n P pP Y T q, we define
its preset N‚ n “ tn1 | pn1 , nq P F u, and its postset n‚N “ tn1 | pn, n1 q P F u.
We liftŤthe notation of preset
 ‚               ‚          ‚
                                Ť and ‚postset to sets. Given a set U Ď pP Y T q,
N  U  “    nPU N   n and  U N “    nPU nN . If the context is clear, we omit the N in
the subscript.
     A marking of N is a bag m P IN P , where mppq denotes the number of tokens
in place p P P . If mppq ą 0, place p is called marked in marking m. A Petri net
N with corresponding marking m is written as pN, mq and is called a marked
Petri net. Given a marked Petri net pN, mq, transition t is enabled, denoted by
pN, mqrty, if ‚ t ď m. If transition t is enabled in pN, mq, it can fire, resulting in
a new marking m1 , denoted by pN, mqrtypN, m1 q, such that m1 ` ‚ t “ m ` t‚ . We
lift the firing of transitions to the firing of sequences in a standard way, i.e., a
sequence σ P T ˚ of length n is enabled in pN, mq if markings m0 , . . . , mn exist,
such that m “ m0 and pN, mi´1 qrσpiqypN, mi q for all 1 ď i ď n. A marking
m1 is reachable from some marking m in N , denoted by pN, mqr˚ypN, m1 q, if a
firing sequence σ P T ˚ exists such that pN, mqrσypN, m1 q. A marking m1 is a
home marking of pN, mq, if for all markings m2 with pN, mqr˚ypN, m2 q, we have
pN, m2 qr˚ypN, m1 q.
     A special class of Petri nets are the workflow nets [1]. A workflow net is a
tuple xP, T, F, i, f y with xP, T, F y a Petri net, (2) i P P is the only place with no
incoming transitions, (3) f P P is the only place with no outgoing transitions,
i.e., ‚ i “ f ‚ “ H, and (4) all transitions have at least one incoming and one
outgoing arc, i.e., ‚ t ‰ H ‰ t‚ for all t P T .

Open Petri Nets Within a network of asynchronously communicating systems,
messages are passed between the elements within the network. The approach we
follow is based on Open Petri nets [5]. Communication in an open Petri net
(OPN) is represented by special places, called the interface places. An interface
place is either an input place, receiving messages from the outside, or an output
place that sends messages to the outside of the OPN. An input place is a place
that has only outgoing arcs, and an output place has no incoming arcs.
Definition 1. An Open Petri net is defined as an 7-tuple xP, I, O, T, F, i, Ωy
where (1) xP Y I Y O, T, F y is a Petri net; (2) P is a set of internal places;
(3) I is a set of input places, and ‚ I “ H; (4) O is a set of output places,
and O‚ “ H; (5) P , I and O are pairwise disjoint; (6) i P IN P is the initial
marking, and (7) Ω Ď IN P is the set of final markings. We call the set I Y O
the interface places of the OPN. An OPN is called closed if I “ O “ H.
   An important behavioral property for OPNs is termination: an OPN should
always have the possibility to terminate properly. We identify two termination
properties: weak termination and soundness.
232    PNSE’15 – Petri Nets and Software Engineering


Definition 2. Let xP, I, O, T, F, i, f y be an OPN. It is weakly terminating, if for
every reachable marking of the marked Petri net pxP Y I Y O, T, F y, iq a marking
f P Ω can be reached.
   It is sound, if for every reachable marking of the marked Petri net pxP, T, F y, iq
a marking f P Ω can be reached.
   Communication between OPNs is done via the interface places. Two OPNs
can only communicate if the input places of the one are the output places of the
other, and vice versa.
Definition 3. Two OPNs A and B are composable, denoted by A ‘ B, if and
only if pIA X OB q Y pOB X IA q “ pPA Y TA Y IA Y OA q X pPB Y TB Y IB Y OB q.
   If A and B are composable, they can be composed into a new OPN, denoted by
A‘B, with A‘B “ xP, I, O, T, F, i, Ωy where P “ PA YPB YG; I “ pIA YIB qzG;
O “ pOA Y OB qzG; T “ TA Y TB ; F “ FA Y FB ; i “ iA ` iB ; and f “ ΩA Y ΩB
with G “ pIA X OB q Y pOB X IA q.

Event Logs and Behavioral Profiles Although event logs are defined as a
tuple consisting of a set of case identifiers, events, and an attribute mapping [2],
it is in this paper sufficient to consider an event log, denoted by L, as a set of
sequences over some alphabet T , i.e., L Ď T ˚ . Given an event log L, we define
the successor relation [20] by a ăL b if a sequence σ P L and 1 ď i ď |σ| exist,
such that σpiq “ a and σpi ` 1q “ b. Using the successor relation, we define the
behavioral profile pÑc , kc , `c qL as three relations: (1) the causality relation Ñc
is defined by a Ñc b iff a ăL b and b ­ăL a, (2) the concurrency relation kc , which
is defined by a kc b iff both a ăL b and b ăL a, and (3) the exclusive relation `c
is defined by a `c b iff both a ­ăL b and b ­ăL a [20]. If the context is clear, we
omit the subscript.
    Given a marked Petri net pN, mq with N “ xP, T, F y, an event log L Ď T ˚
is called complete with respect to pN, mq iff traces σ1 , σ2 P T ˚ exist such that
pN, mqrσ1 ; xa, by; σ2 ypN,) implies a ăL b for all a, b P T .


4     Functional Architectures
To model the overview of a system, the modules it consists of, and the features
these modules offer, we propose the use of the functional architecture model
(FAM). The functional architecture of a system is “an architectural model which
represents at a high level the software products major functions from a usage
perspective, and specifies the interactions of functions, internally between each
other and externally with other products” [6]. It offers modules containing fea-
tures. Features of different modules interact via so-called information flows.
    An example is shown in Fig. 2(a). The FAM contains 1 context module, E,
7 modules, A, B, C, D, X, Y and Z. Modules have features, depicted by the
rounded rectangles. For example, module C contains two features, K and L. Be-
tween features of different modules, information flows exist, e.g., the information
flow pF, q, Lq between modules A and C.
             J. Van Der Werf, E. Kaats: Discovery of Functional Architectures                                 233


                                                              1                       2
    O    p                F       q             L     O   p                   F           q               L

                 G                      K                             G                           K
     E                A                     C         E                   A                           C
                                                                                      3
                                        r                         5                           3   r
                  t           s                                       t           s


                                        N                                             4           N
                                  u                                                   u
                 H                 v            M                     H                v                  M
                      B                     D                             B           4               D

                      X                     Y                             X                           Y
                                  Z                                                   Z


 (a) Example     Functional            Architecture       (b) Scenario as an overlay
 Model

Fig. 2. Example Functional Architecture Model and corresponding scenario as overlay


Definition 4. A Functional Architecture Model (FAM) is defined as a 6-tuple
xM, C, F, h, m, Ñy where

 – M is a finite set of modules;
 – C is a finite set of context modules;
 – F is a finite set of features;
 – h : M Ñ M is the hierarchy function, such that the transitive closure h ˚ is
   irreflexive;
 – m : F Ñ M Y C is a feature map that maps each feature to a module,
   possibly in the context, and this module does not have any children, i.e.
   h´1 pmpF qq “ H for all F P F;
 – Ñ Ď F ˆ Λ ˆ F is the information flow, with Λ the label universe, such
   that for pA, l, Bq PÑ we have mpAq ‰ mpBq. The labels for the information
   flows are unique per feature, i.e., pA, l, Bq and pA, l, Cq imply B “ C for all
   labels l P Λ and pA, l, Bq, pA, l, Cq P Ñ.

    Although the information flows define the possible interactions between mod-
ules, it remains a static overview of the system. Therefore, one can use scenarios
on top of the functional architecture, e.g. by creating an overlay, highlighting the
information flows that are executed and the order in which they should occur.
Formally, we represent a scenario as a partial order.

Definition 5. Let F “ xM, C, F, h, m, Ñy be a FAM. A scenario of F is a pair
pS, ăq with S ĎÑ, such that pS, ďq with ď “ ă˚ is a partial order.

   An example is shown in Fig. 2(b). The scenario implied by the overlay can
be represented by a partial order induced by pO, p, F q ă pF, q, Lq, pF, q, Lq ă
pK, s, Hq, pF, q, Lq ă pK, r, N q, pK, r, N q ă pN, u, Hq, pK, s, Hq ă pH, t, Gq,
pN, u, Hq ă pH, t, Gq, pK, r, N q ă pM, v, Hq, and pM, v, Hq ă pH, t, Gq.
234    PNSE’15 – Petri Nets and Software Engineering


    However, such scenarios are typically not specified. Another important draw-
back of such scenarios is their analyzability. Although each scenario can be
checked, the consistency between the different scenarios remains a difficult task.
Therefore, in the remainder of this paper, we search for a method to derive the
behavioral specification as a network of asynchronously communicating systems,
given the system execution data produced by the actual system in the form of
event logs.


5     Discovery of a Functional Architecture
In this section, we study the possibilities process mining [2] offers to generate
Petri nets for each of the different modules a system consists of. Event logs
describe the order in which features of a system have been executed. Such event
logs are system wide. Instead of each module having its own event log, only
global sequences exist, i.e., sequences concatenate the executed features over all
modules. As FAM only allows features to be contained in a single module, we
assume that each feature belongs to exactly one module. Also, FAM prescribes
communication to be one-directional, i.e., given two communicating features A
and B, we assume that either A sends a message to B, or vice versa, that B
sends a message to A, but not both.
    The behavioral specification of a system is three-fold: (1) communication
between modules via their features, (2) the internal behavior within each feature,
and (3) the order in which features are called within a module. In this section,
we explore all three types of behavioral specification to come to a composed
system of asynchronously communicating systems.
    In the remainder, let L be an event log over a set of features T , and let
R : T Ñ M , with M the set of modules, be a function that maps each feature
onto the module that contains that feature.

5.1   Communication between Features
Communication between modules within a system is asynchronous of nature:
messages are sent between features in order to complete their functionality.
Within an event log, we need to consider the order in which events or fea-
tures occur. For example, given some trace σ, if the resource is different for two
subsequent events, i.e., Rpσpiqq ‰ Rpσpi ` 1qq, then this might indicate that the
former sends a message to the latter. This is expressed by the communication
successor.

Definition 6 (Communication successor). Let L Ď T ˚ be an event log.
We define the communication successor relation ÎL Ď T ˆ T by A ÎL B iff
RpAq ‰ RpBq, σpiq “ A, and σpi ` 1q “ B for some σ P L and 1 ď i ă |σ|.

    Although at first sight the communication successors seem to work, we need
to remember the concurrent nature of asynchronous communication. Consider for
example the communication between modules M and N as depicted in Fig. 3, and
              J. Van Der Werf, E. Kaats: Discovery of Functional Architectures     235


          M                       N

               A              E


                                                      Case Trace
               B              F                        1 A, E, F, G, B, C
                                                       2 A, E, F, B, G, C
                                                       3 A, E, B, F, G, C
                                                       4 A, E, B, F, C, G
               C             G
                                                       5 A, E, F, B, C, G
                                                       6 A, B, E, F, G, C
                                                       7 A, B, E, F, C, G

        Fig. 3. Modules M and N                 Table 2. Corresponding event log

                                        E F G
                                      AÑ` `
                                      B k k k
                                      C ` Ð k

                      Table 3. Communication behavioral profile


the corresponding allowed sequences in Tbl. 4. We have A Î E, which is indeed
the communication as modeled in the composition M ‘ N . However, we also
find G Î B, indicating a possible communication between G and B. Listing all
communication successors, we get A Î E, G Î B, F Î B, B Î G, G Î C, E Î B,
B Î F , F Î C, C Î G, and B Î E. Observe that because of the asynchronous
nature of the communication, features B and E are concurrently enabled in
Fig. 3. Assuming the event log to be complete, this should become visible in the
communication successor relation, as for the normal successor relation on event
logs.
Definition 7 (Communication behavioral profile). Let L Ď T ˚ be an event
log, and ÎL Ď T ˆ T the corresponding communication successor relation.
    The communication behavioral profile is the 3-tuple pÑc , kc , `c qCom
                                                                       L   defined
by:
 – A Ñc B iff A ÎL B and B ­ÎL A;
 – A kc B iff both A ÎL B and B ÎL A; and
 – A `c B iff both A ­ÎL B and A ­ÎL B.
    Calculating the behavioral profile of the communicating transitions using
the communication successor relation, results in the communication behavioral
profile as shown in Tbl. 3. It shows that B and E are concurrently enabled.
Following the behavioral profile, we see that the causal relation of the behavioral
profile correctly identifies the feature communication.
    Using the communication behavioral profile, we can construct the informa-
tion flows from an event log as follows. If A Ñ B in the communication behavioral
236    PNSE’15 – Petri Nets and Software Engineering


                     Debtor        Payment   Creditor
                 A B C D E F G H I J K O MN Q S
               A               Ð` ` ` ` ` ` ` `
               B               ` Ñ` ` ` ` ` ` `
               C               ` ` Ñ` ` ` ` ` `
               D               ` ` ` Ð` ` k ` `
               E               ` ` ` ` Ð` ` ` `
               F               ` ` ` ` ` Ñ` ` `
               G               ` k ` ` ` k k ` k
               H Ñ` ` ` ` ` `                ` `
               I ` Ð` ` ` ` k                ` `
               J ` ` Ð` ` ` `                Ñ `
               K ` ` ` Ñ` ` `                ` Ð
               O ` ` ` ` Ñ` `                ` Ð
               M ` ` ` ` ` Ð k               ` Ñ
               N ` ` ` k ` ` k               ` k
               Q ` ` ` ` ` ` ` ` ` Ð` ` ` `
               S ` ` ` ` ` ` k ` ` ` ÑÑÐ k

        Table 4. Communication behavioral profile for the running example


profile of the event log, then an information flow pA, x, Bq exists, with x a fresh
label. This results in the following translation:

Definition 8 (Generated FAM). Let L be an event log, and pÑc , kc , `c qCom   L
be its communication behavioral profile. Its corresponding functional architecture
model xM, C, F, h, m, Ñy is defined by:

 – M “ RpLq;
 – C “ H;
 – F “ T;
 – h “ H;
 – m “ R; and
 – Ñ“ tpA, x, Bq | A Ñc B, and x P Λ a fresh labelu.

    After constructing the communication behavioral profile for the running ex-
ample, shown in Tbl. 4, we can complete the functional architecture model.
Based on the given system execution data, we see for example that feature H
communicates with feature A, and feature S sends messages to features K and
O, and receives messages from feature M . The complete functional architecture
of the running example is shown in Fig. 4.


5.2   Internal Behavior of Features

As can be seen in the running example, features can send and receive multiple
messages. For example, feature S sometimes sends a message to feature K and
sometimes to feature O. Therefore, the next step in discovering the functional
             J. Van Der Werf, E. Kaats: Discovery of Functional Architectures   237


                              A                       H

                              B                       I

                    G         C                       J            Q

                              D                       K

                              E                       O            S

                              F                      M

                         Debtor                       N         Creditor

                                                  Payment

             Fig. 4. Functional architecture model of the running example


architecture is to reconstruct the internal behavior of each of the features. For
this, we create for each of the features an event log, containing the features that
it communicates with. We call this the feature log.

Definition 9 (Feature log). Let L Ď T ˚ be an event log, and let F P T be
some feature. Let pÑc , kc , `c qCom
                                 L   be the corresponding communication behav-
ioral profile. The feature log LF is defined by LF “ tσ|CpF q | σ P L, F P σu
where CpF q “ tA | A Ñc F _ F Ñc Au.

   Consider for example feature S in the running example. This feature com-
municates with features K, O and M , i.e., CpSq “ tK, O, M u. Its feature log is
the projection of the log on these features, i.e., LS “ txKy, xO, M yu.
   On these feature logs, we apply the inductive miner [13], that always returns
a sound workflow net. Next, we transform the discovered workflow net into an
open Petri net, to visualize the messages sent and received by the feature. This
results in a feature net for each of the features present in the event log.

Definition 10 (Feature Net). Let L Ď T ˚ be an event log, and let F P T
be some feature. Let pÑc , kc , `c qCom
                                    L   be the corresponding communication behav-
ioral profile. The Feature net NF is the OPN xP, I, O, T, F, i, Ωy defined by
 – P “ P̄ , T “ T̄ , i “ r ī s, Ω “ t r f¯ s u;
 – I “ tpA´F | A Ñc F u;
 – O “ tpF ´A | F Ñc Au;
 – F “ F̄ Ytpt, pF ´A q | t P T, λptq “ A, F Ñc Au
           YtppA´F , tq | t P T, λptq “ A, A Ñc F qu.
where xP̄ , T̄ , F̄ , ī, f¯y is the discovered workflow net.

   In our running example, each of the 16 features are transformed into a feature
net. Most of the features are simple, like for feature H and A, consisting of
238     PNSE’15 – Petri Nets and Software Engineering


                O                                                      C               S
      pS-O                                                      pJ-C            pS-K

                          K
      pS-K
                                          A              H
                                   pH-A          pH-A
                                                                       Q               D
                                                                pJ-Q            pK-D
                M
      pM-S


             Feature S             Feature H     Feature A         Feature J    Feature K


                    Fig. 5. Some of the feature nets for the running example


a single transition sending a message to A, and receiving a message from H,
respectively. A more complex feature net is the net for feature S, which internally
decides whether it sends a message to K or to O. Figure 5 depicts some of the
feature nets generated using the inductive miner [13].


5.3   Feature Interaction within Modules

Now that each feature has its internal behavior defined by means of a feature
net, the next step is to determine the order in which features are executed within
each of the modules. As for the features, we first create event logs for each of
the modules, by filtering each trace on the features it contains. This results in a
module log for each of the modules.

Definition 11 (Module Log). Let L Ď T ˚ be an event log. Let M P RngpRq
be a module. Let pÑc , kc , `c q be the corresponding communication behavioral
profile. The Module log LM is defined by LM “ tσ|tF |RpF q“M u | σ P Lu.

   Within the running example, we obtain three module logs, one for each of
the modules. For example, module Debtor, has module log LDebtor “ txA, B, Gy,
xA, C, D, Gy, xA, C, E, F, Gyu, and for Creditor we have LCreditor “ txQ, Sy,


                         Fig. 6. Refinement of a transition by a workflow net
              J. Van Der Werf, E. Kaats: Discovery of Functional Architectures             239


                            A                          H


                 C                   B          I             J


          E            D                               K             O


                                                                               Q
          F                                                          M


                                                                               S
                       G                                      N


                     Module Debtor                  Module Payment       Module Creditor


                           Fig. 7. Module nets for the running example


xQ, S, Syu. Again applying the inductive miner results in the three workflow
nets as depicted in Fig. 7. Notice that, although feature S occurs twice in one
of the sequences, the algorithm only adds a single feature S in the resulting
workflow model.


5.4   Composition of Feature Nets and Module Nets


Last step in the process is to combine the feature nets generated for each of the
features with the generated module nets. This results in an open Petri net for
each of the modules, defining the interaction between the different modules.
    In the module net, each feature is represented by a single transition. Next
step is to refine each feature by its feature net. For this, we first define the
refinement of a transition by a workflow model on open Petri nets, as shown in
Fig. 6. This refinement connect each input place of the refined transition with
each of the transitions in the postset of the initial place of the workflow, and
similarly each output place of the refined transition with each of the transitions
in the preset of the final place of the refining workflow. It is straight-forward
to prove that if (1) the initial net is sound, (2) each input place of the refined
transition is 1-bounded, i.e., it can contain at most one token, and (3) workflow
net W is sound, then the refinement yields a sound result.
240    PNSE’15 – Petri Nets and Software Engineering


                                A   H
                         pH-A           pH-A

                                B   I
                         pB-I           pB-I
                                C   J                               Q
                         pC-J           pC-J                 pJ-Q       pJ-Q


                                                                    S
                                D   K
                         pK-D           pK-D                 pS-K       pS-K
                                E   O
                         pO-E           pO-E                 pS-O       pS-Q


                                F   M
                         pF-M           pF-M                 pM-S       pM-S


            G                                            N


                Debtor                         Payment                     creditor


             Fig. 8. Composed nets generated for the running example


    The result of refining each feature by its feature net is shown in Fig. 8. As
features G and N have no feature net defining communication, these transitions
are not refined.
    To verify whether the resulting open Petri nets are a true representation of
the system, one can compose the nets into a single Petri net, and execute each
of the sequences of the event log of Tbl. 1 on the resulting model, which in
this example is possible. Further analyzing the resulting model shows that its
only deadlocks are desirable markings: either all modules reach their final place,
without any pending tokens, or the Creditor module remains untouched, while
the Debtor and Payment module reach their final place.


6     Conclusions
Within this paper, we discussed a method to automatically generate a functional
architecture model from an event log together with a mapping of each feature to
the module that offers that functionality. We showed how the information flows
can be derived from the communication behavioral profile. This profile not only
identifies the information flow for the static structure of the functional archi-
tecture, but additionally offers sufficient information to construct the internal
behavior for each of the features, and between the features within a module.
Lastly, we showed how to compose feature and module nets into an open Petri
net.
            J. Van Der Werf, E. Kaats: Discovery of Functional Architectures    241


    Discovering the interaction between different modules is not new. Tchniques
like service mining [3], apply process mining on event logs to discover a process
model of how the services are orchestrated. In the approach presented in this
paper, we focus on the discovery of the behavior of each of the modules, rather
than a complete orchestration.
    In [15], the authors discover the internal behavior of services based on the
interaction between two services, guaranteeing deadlock freedom of the discov-
ered service. In the setting of this paper, the exact interaction between modules
is unknown, and needs to be discovered first.
    The core idea of this paper is twofold: firstly to derive the information flows
for a Functional Architecture Model, and secondly to derive the internal behavior
for each of the modules within the architecture. Within software architecture,
this is called Software Architecture Reconstruction [12]. Although some tech-
niques take the dynamic aspects of the software operation into account, most
techniques only focus on the static aspects of software architecture models, us-
ing solely the available source code [8]. For example, system execution data is
used to enrich architectures with performance data [11] or to visualize traces on
how the software is used [19]. In this paper, we propose a method to not only
visualize software usage, but to discover module communication and to generate
the internal behavior of modules within a software architecture.
    Although the approach presented in this paper shows an application of the
behavioral profile to discover feature interaction, additional research is required.
First, the current approach requires the event log to be complete, i.e., if the log
grows, the successor relation should not change. Further, for the generation of
the internal feature behavior, we assume that if the sending feature is present
in the event log, it enables all possible events, which is possibly a too strict
assumption that deserves further investigation.
    The approach in this paper is very flexible, as we derive individual models
for the features and modules. For this, we apply standard process discovery algo-
rithms returning sound workflow models. However, their composition in general
does not result in a sound system of asynchronously communicating systems.
Further research is required to study the conditions under which this can be
guaranteed. For this, we want to identify conditions which on the one hand re-
sult in correct models, and on the other hand have a positive effect on model
quality as described by [7].
    Not only does this approach provide useful insights for the software architect,
we expect the approach applicable to business process management as well, as
for the discovery of separate business processes, the Business Process Modelling
and Notation offers the swimlane notion. Therefore, we plan to implement the
approach in the Process Mining toolkit ProM [18] to experiment and apply the
approach on real-life examples.

Acknowledgements The authors would like to thank the anonymous reviewers
for their valuable feedback, and Sjaak Brinkkemper, Fabiano Dalpiaz, Garm
Lucassen, Leo Pruijt and Erik Jagroep for the fruitfull discussions and valuable
input on architecture and software products.
242    PNSE’15 – Petri Nets and Software Engineering


References
 1. W.M.P. van der Aalst. Verification of workflow nets. In Application and Theory
    of Petri Nets 1997, volume 1248 of Lecture Notes in Computer Science, pages 407
    – 426. Springer, Berlin, 1997.
 2. W.M.P. van der Aalst. Process Mining: Discovery, Conformance and Enhancement
    of Business Processes. Springer, Berlin, 2011.
 3. W.M.P. van der Aalst. Challenges in service mining: Record, check, discover. In
    Web Engineering, volume 7977 of Lecture Notes in Computer Science, pages 1–4.
    Springer, Berlin, 2013.
 4. L. Bass, P. Clements, and R. Kazman. Software Architecture in Practice. Series in
    Software Engineering. Addison Wesley, Reading, MA, USA, 2012.
 5. D. Bera, K. M. van Hee, and J.M.E.M. van der Werf. Designing weakly termi-
    nating ros systems. In Applications and Theory of Petri Nets, (33th International
    Conference, Petri Nets 2012), volume 7347 of Lecture Notes in Computer Science,
    pages 328 – 347. Springer, Berlin, 2012.
 6. S. Brinkkemper and S. Pachidi. Functional architecture modeling for the software
    product industry. In ECSA 2010, volume 6285 of Lecture Notes in Computer
    Science, pages 198 – 213. Springer, Berlin, 2010.
 7. J.C.A.M. Buijs, B.F. van Dongen, and W.M.P. van der Aalst. On the role of
    fitness, precision, generalization and simplicity in process discovery. In On the
    Move to Meaningful Internet Systems: OTM 2012, volume 7565 of Lecture Notes
    in Computer Science, pages 305–322. Springer, Berlin, 2012.
 8. S. Ducasse and D. Pollet. Software architecture reconstruction: A process-oriented
    taxonomy. Software Engineering, IEEE Transactions on, 35(4):573–591, July 2009.
 9. S.A. Fricker. Software product management. In Software for People, Management
    for Professionals, pages 53–81. Springer, 2012.
10. K.M. van Hee, N. Sidorova, and J.M.E.M. van der Werf. When can we trust a
    third party? In Transactions on Petri Nets and Other Models of Concurrency
    VIII, volume 8100 of Lecture Notes in Computer Science, pages 106–122. Springer,
    Berlin, 2013.
11. T. Israr, M. Woodside, and G. Franks. Interaction tree algorithms to extract
    effective architecture and layered performance models from traces. Journal of Sys-
    tems and Software, 80(4):474 – 492, 2007. Software Performance 5th International
    Workshop on Software and Performance.
12. R.L. Krikhaar. Software Architecture Reconstruction. PhD thesis, VU Amsterdam,
    1999.
13. S.J.J. Leemans, D. Fahland, and W.M.P. van der Aalst. Discovering block-
    structured process models from event logs - a constructive approach. In Appli-
    cation and Theory of Petri Nets and Concurrency, volume 7927 of Lecture Notes
    in Computer Science, pages 311–329. Springer, Berlin, 2013.
14. G. Lucassen, J.M.E.M. van der Werf, and S. Brinkkemper. Alignment of software
    product management and software architecture with discussion models. In IWSPM
    2014, pages 21–30. IEEE, Aug 2014.
15. R. Müller, C. Stahl, W.M.P. van der Aalst, and M Westergaard. Service discovery
    from observed behavior while guaranteeing deadlock freedom in collaborations. In
    Service-Oriented Computing, volume 8274 of Lecture Notes in Computer Science,
    pages 358–373. Springer, Berlin, 2013.
16. W. Reisig. Petri Nets: An Introduction, volume 4 of Monographs in Theoretical
    Computer Science: An EATCS Series. Springer, Berlin, 1985.
            J. Van Der Werf, E. Kaats: Discovery of Functional Architectures        243


17. R.N. Taylor, N. Medvidovic, and E.M. Dashofy. Software Architecture: Founda-
    tions, Theory, and Practice. John Wiley & Sons, 2010.
18. H.M.W. Verbeek, J.C.A.M. Buijs, B.F. van Dongen, and W.M.P. van der Aalst.
    XES, XESame, and ProM 6. In Information System Evolution, volume 72 of Lecture
    Notes in Business Information Processing, pages 60–75. Springer, Berlin, 2011.
19. R.J. Walker, G.C. Murphy, J. Steinbok, and M.P. Robillard. Efficient mapping of
    software system traces to architectural views. In Proceedings of the 2000 Conference
    of the Centre for Advanced Studies on Collaborative Research, CASCON ’00, pages
    12–. IBM Press, 2000.
20. M. Weidlich and J.M.E.M. van der Werf. On profiles and footprints – relational
    semantics for petri nets. In Applications and Theory of Petri Nets (ICATPN 2012),
    volume 7347 of Lecture Notes in Computer Science, pages 148 – 167. Springer,
    Berlin, 2012.
21. J.M.E.M. van der Werf and H.M.W. Verbeek. Online compliance monitoring of
    service landscapes. In BPM 2014 International Workshops, volume 202 of Lecture
    Notes in Business Information Processing, pages 89–95. Springer, Berlin, 2015.

</pre>