=Paper= {{Paper |id=None |storemode=property |title=From Code to Coloured Petri Nets: Modelling Guidelines |pdfUrl=https://ceur-ws.org/Vol-851/paper8.pdf |volume=Vol-851 |dblpUrl=https://dblp.org/rec/conf/apn/DedovaP12 }} ==From Code to Coloured Petri Nets: Modelling Guidelines== https://ceur-ws.org/Vol-851/paper8.pdf
            From Code to Coloured Petri Nets:
                  Modelling Guidelines

                        Anna Dedova and Laure Petrucci

                  LIPN, CNRS UMR 7030, Université Paris XIII
          99, avenue Jean-Baptiste Clément, F-93430 Villetaneuse, France
              {Anna.Dedova,Laure.Petrucci}@lipn.univ-paris13.fr



      Abstract. This paper presents a method for designing a coloured Petri
      net model of a system starting from its high-level object oriented source
      code. The entire process is divided into two parts: grounding and code
      analysis. For each part detailed step-by-step guidelines are given. The
      approach is illustrated with an industrial application case study, the
      NEO protocol.


1   Introduction

The modelling problem has been being under investigation for many years. It
has a lot of particular cases depending on 1) the nature of the description of the
system to be modelled and 2) which formalism is chosen for the final model. Ac-
cording to the first criteria there are three basic groups of modelling approaches:

 1. Starting from an informal description of a problem;
 2. Starting from a detailed specification of a system;
 3. Starting from the source code.

    Some recent works tackle the first group of approaches. For example, in [3] the
authors propose a modular design method and illustrate it on a model railway
case study. One of the main points of [3] is using properties of the system at the
modelling stage. In [4] an approach aggregating different views of the system is
given. This method assumes that the system can be observed from several points
of view: pre/post, process and lifeline views expressing respectively pre- and post-
conditions of events, sequences of events, and sequences of states. Thus, steps in
a process view correspond to system events and can be modelled by transitions
in a Petri nets formalism. Similarly, steps of a lifeline view correspond to the
states of the system and can be modelled by places of a Petri net. Then, by
identifying the elements of these different representations of systems, places and
transitions are glued together in order to get a complete Petri net.
    The second group of modelling approaches includes various attempts to de-
liver a formal model from UML diagrams [8, 6, 5]. The advantage of these meth-
ods is that most developers are familiar with the UML and an automatic trans-
formation of their diagrams into formal models and model-check them, would
110    PNSE’12 – Petri Nets and Software Engineering



greatly simplify the software quality control. The difficulty is that UML dia-
grams allow for much more freedom for the designer than formal models and the
automatic translation is not trivial.
    This paper addresses the third group of modelling approaches, which is not
covered by a wide range of methods in the literature [9]. Such approaches are ded-
icated to systems for which the source code already exists, in order to guarantee
it satisfies some requirements. They often do not support a complete language,
but are restricted to some subset of it. Moreover, to the best of our knowledge,
no work addressed a high level object oriented language, such as python.
    Hence, what are the particular difficulties encountered by reverse-engineering
from the source code? If a program is rather small (tens of lines) one can simply
suppose that the operators are the system events and correspond to transitions,
places between them model the intermediate states of the system, and some
additional places model the states of variables used. But this approach is no
more applicable when the system under consideration is as large as 3 MBytes
of object oriented code. Of course it is possible to model all operators as in the
previous case, but then the model becomes so huge that there is no means to
analyse it and it becomes useless. Thus, it is necessary to choose an appropriate
level of abstraction for the system. If it is too low and the model contains too
many details, the same problem as above arises. If the level of abstraction is too
high, there are too many hypotheses and assumptions and it may happen that
nothing is left worth checking. The model is then trivial and its behaviour is
completely correct while the system contains drawbacks that are hidden due to
the modelling assumptions.
    The paper is organised as follows. Section 2 gives detailed guidelines on how
to derive step-by-step a coloured Petri net from the source code. Then Section 3
shows how this method was applied in practice to the NEO protocol. Finally,
Section 4 draws some conclusions.
    We assume the reader is familiar with coloured Petri nets [7].


2     Modelling Guidelines

This section discusses the guidelines to follow in order to deliver a coloured Petri
net from high level object oriented source code.


2.1   Grounding

Before the start of the modelling process some preparation work is required. It
mainly concerns the deep understanding of project structure and expected prop-
erties of the system. This helps a lot during the modelling by saving the time
devoted to the consideration of unnecessary elements or restructuring model hi-
erarchy. It is always possible to skip this stage and proceed directly to modelling
the most interesting piece of code, but then the risk of choosing an inadequate
abstraction level is very high. The main steps of grounding are listed below. They
                  A. Dedova, L. Petrucci: From Code to Coloured Petri Nets          111



                            start
                      1. Understand structure     Scenarios




                      2. Choose key elements     Properties




    Interaction                                   Classes,             System
                        3. Find interactions
    mechanisms                                    objects            architecture




                      4. Secondary elements




                      5. Divide into modules
                            stop




                     Fig. 1. Schema of the grounding process


are depicted in Figure 1 together with the elements that are to be considered or
impacted.
1. Understand the structure. First of all, we should pay attention to the
   architecture of the project. The key elements (classes) should be found as
   well as their roles in the whole system. It can be very useful to find the
   most common scenarios of the system use (or maybe scenarios that should
   be verified later). We can look for the parts (classes or objects) of the sys-
   tem that are impacted by these scenarios. We also need to understand the
   class structure of the project (with particular attention to inheritance and
   polymorphism). During this step the most important result is to understand
   globally how the system works from the inside.
2. Choose key elements. The second step focusses on the system properties
   to be checked. Properties can be proposed by developers, clients or anyone
   else. Then, they must be considered one by one in order to choose those
   that are the most crucial for the system. Selecting them before starting the
   modelling process is very important since this choice can influence a lot the
   model structure that will not be so easy to change later on.
   Once the properties are selected, we look for the scenarios they concern.
   Moreover the classes and methods used within these scenarios are selected,
   according to the project structure from the previous step. Thus, the main
   pieces of code that are going to be modelled are defined.
3. Find interactions. We should keep in mind that objects of chosen classes
   can be verified separately from one another. But the ultimate goal is usually
112    PNSE’12 – Petri Nets and Software Engineering



    to model-check the whole system altogether. Separate parts can be subjected
    to traditional testing techniques while the complexity and the size of the
    system makes their application to the entire project impossible.
    So, when modelling something larger than an isolated object, the interac-
    tions between them must be identified. They can be of very different natures:
    message passing; shared or global variables (e.g. between different methods
    inside a class); sometimes a class is composed of other auxiliary classes; a
    method of an external class can be called. All interactions should be inves-
    tigated and the corresponding elements added to our modelling selection.
 4. Secondary elements. Here we need to look at auxiliary classes that are
    used by the selected key classes. They can be classes of data structures, or
    classes providing message exchange capabilities. On the one hand, operations
    and/or interconnections of key elements are impossible without them. On
    the other hand, if we model them in detail, the model will be too bulky to
    perform any analysis. Such elements usually describe the work of the system
    on a low level of abstraction and can be verified separately. So, the idea is
    to model them as simple as possible, but without loss of essence.
    In the end of this step we should know which abstractions are going to be
    used: some algorithms could be modelled as a single transition, some complex
    data structures encoded with natural numbers, etc.
 5. Divide into modules. All the scenarios and methods that have been chosen
    for modelling are used to design a modular structure for the future model.
    Of course, it can be changed later during the modelling process.
    It is rather natural to associate a submodule with a class or a method. It
    is also important to pay attention to auxiliary elements and decide whether
    they are worth a separate module or not.


2.2   Code analysis

During this stage, two processes are carried out in parallel: the analysis of the
source code and the construction of the model. In order to streamline these
processes it is proposed to divide them into five main activities, according to
the scheme on Figure 2. Each activity requires to look for some elements in the
source code as well as interpreting them in terms of the modelling formalism
(in our case coloured Petri nets). At each step, the source code is looked at
from different points of view in order to extract the different components of the
model. In practice, it is usually needed to go trough the cycle several times but
at the start it is hard to tell how many times it should be done. It is also possible
that some activities are skipped on later rounds, since a new element cannot be
extracted from the source code. From one round to another the understanding
of the chosen abstraction level is more and more accurate and the model is more
and more complete.
    Since the module hierarchy of the CPN can be different from the initial
structure of classes and methods, the work within the five activities can be
organised in different ways.
                  A. Dedova, L. Petrucci: From Code to Coloured Petri Nets    113



                                       start
                               Types ⇒ Colours


         next round             States ⇒ Places


                             Events ⇒ Transitions
                                         ∀ transition
                              Transforms ⇒ Arcs             next transition


                             Conditions ⇒ Guards
                                        stop


                      Fig. 2. Schema of the modelling process


 – Consider the modules of the future model (found at the fifth step of ground-
   ing) one by one. For each module examine scenarios, classes and methods it
   concerns and analyse them via all activities.
 – Consider scenarios or methods (found at the second step of grounding) one
   by one. For each scenario/method perform each activity that will give re-
   finement for different modules of the model.
 – Consider activities one by one and look at the system as a whole, analysing
   different parts of code and changing different models, but from the chosen
   point of view.

In practice third approach is difficult to apply unless the model is almost ready
and it can be grasped at a glance. The first approach is the most effective one,
but sometimes the second one may also prove useful by focusing on a partic-
ular behaviour. In this case, the behaviour is either described by an execution
scenario, or the details of a method are tracked step-by-step.


Data structures It is important to start from this activity because it forms
the basis of the future model. It is natural to start with colour domains in order
to use them (and may be supplemented later with new details) during further
activities.
    In general, data structures of the source code should be expressed in terms
of colour domains. However, it is often not that simple. In object oriented code,
data structures are usually integrated together with their storing, loading and
treatment methods. Colour domains syntax does not allow to do this, so, it may
be needed to model a “simple” object with a separate CPN. Such cases can be
left to further activities nevertheless providing basic types for future CPNs.
114     PNSE’12 – Petri Nets and Software Engineering



   This phase provides as a result a preliminary list of colour domains and
variables needed in the model.


States and conditions on objects This is the first activity that assumes the
modeller thinks in terms of parts of CPN that have no strict correspondence
in the source code, namely the places. It may be difficult to deliver them in
the situation of “blank sheet”, but the model with places make other activities
become much simpler or even possible (e.g. construction of arcs).
    Hence, this phase aims at creating the set of places of the CPN which usually
represent the states of the system or its parts (objects, variables, etc.). To begin
with, the system flow of operations can be represented as a finite state automa-
ton. The set of states of this automaton can be a first approximation of the set
of places of the CPN. Then conditions required to proceed from one state to an-
other are considered. These conditions often concern the states of some objects
or variables. They should also be added to the set of places. Finally, a colour
domain (defined during the previous activity) is associated with each place.


Events and actions This activity is in general simpler than the previous one.
Each operator or method call in the source code can be considered as an action
and thus be modelled as a transition. The main hindrance here is a tendency to
model every operator with a transition. To avoid this we can apply information
obtained during grounding (2nd and 4th steps).
   The purpose is to select actions, essential for the processes to be verified.
To start with, consider the changes of variables and data structures that are
implicitly mentioned in the properties. If the properties are not formalised yet,
main constructions of the system can serve as a basis. As for previous phases, on
the first round only a preliminary view of transitions in the model can be given.
After going through other activities it will be completed and refined.


Transformation of data During this phase, the modeller considers for each
transition the three following questions, and performs the corresponding net
construction:

 1. What is taken as input? (Connect corresponding places with input arcs);
 2. What is produced as output? (Connect corresponding places with output
    arcs);
 3. How are the tokens transformed? (Provide input and output arc expressions).

If there is a special input format, it can be reflected in the input arc inscription. If
the output is somehow calculated from input variables, the corresponding output
arc must be assigned with a formula, representing these calculations in terms of
CPN. Often, the formulae from the source code cannot be applied directly and
need to be adapted w.r.t. the chosen level of abstraction.
                  A. Dedova, L. Petrucci: From Code to Coloured Petri Nets      115



Conditions on events Here, as in the previous phase, we consider the set
of transitions. The focus this time is on the special conditions under which
a transition can be fired. In practice the conditions for most transitions are
modelled by the tokens in input places. In this case the transition has a guard
true that can be omitted in the model. But sometimes for better readability of
the model, and also to prevent having too large sets of places and transitions, it
can prove better to formulate such a condition as a guard of the transition.
    The goal of this activity is to find such cases and to figure out the guards. It
can happen that some condition is not possible to express on the selected level
of data abstraction. If so, the colour domains created in the first activity must
be revised, as well as their occurrences in parts of the CPN that have already
been built. Thus, a next round of activities is started.


3     Application of the Guidelines to the NEO Protocol

This section illustrates modelling guidelines with examples from modelling pro-
cess of the NEO protocol. The protocol, designed to handle a large distributed
database over a cluster of machines, was described in [1, 2]. Its main charac-
teristics are shortly summarised in Section 3.1. This specification was part of
an industrial project which aimed at validating the protocol and its prototype
implementation both designed and developed by the NEXEDI company. It was
implemented in Python, but our approach is not specific to this language.


3.1   Brief Description of the NEO Protocol

A more extensive description and analysis of the NEO protocol can be found
in [1] and [2].
    Different kinds of nodes play dedicated roles in the protocol, as depicted by
the architecture in Figure 3(a):

storage nodes handle the database itself. Since the database is distributed,
    each storage node cares for a part of the database, according to a partition
    table. To avoid data loss in case of a node failure, data is duplicated, and is
    thus handled by at least two storage nodes.
master nodes handle the transactions requested by the client nodes and for-
    ward them to the appropriate storage nodes. A distinguished master node,
    called primary master, handles the operations. Secondary masters (i.e. the
    other master nodes) are ready to become primary master in case of a failure
    of this node. They also inform other nodes of the identity of the primary
    master.
the administration node is used for manual setup if needed.
client nodes correspond to the machines running applications concerned with
    the database objects. Thus, they request either read or write operations.
    They first ask the primary master which storage nodes are concerned with
    their data, and can then contact them directly.
116     PNSE’12 – Petri Nets and Software Engineering



   The lifecycle of the NEO rotocol is depicted in Figure 3(b).
   At the system startup, the primary master is elected among all master nodes.
The primary master maintains the key information for the protocol to operate.
   After the election of a primary master, the system goes through various stages
with the purpose of checking that all transactions were completely processed, and
thus that the database is consistent across the different storage nodes (bootstrap
protocol).
   Finally, the system enters its operational state. Clients can then access the
database through the elected primary master.


             Secondary Masters




              Primary Master
                                                                 The primary master

                                                         recover verification service
                                           elected                                                time
   Client Nodes            Storage Nodes    conn_to_pm
                                                      verification        initializationservice

            Administration Node                                     A storage node

      (a) Protocol architecture                      (b) Phases of bootstrap process

                  Fig. 3. Architecture and lifecycle of the NEO Protocol




3.2   Grounding

Each step described in Section 2 is now applied.

Understand structure This step is difficult to illustrate on a real example since
it implies working on extensive code. The conclusions cannot be confirmed by
a small piece of code. Nevertheless, for the NEO protocol, at this stage we can
state the following, and confirm the brief description from Section 3.1.
    The main entities are nodes of the cluster, of four types: master, storage,
client and admin nodes. For each of these types there is a corresponding function
in the source code.
    The life cycle of nodes leads them through different phases implemented by an
auxiliary class (RecoveryManager, VerificationManager) or a method of the cor-
responding node class (ElectPrimay, VerifyData, Initialize, DoOperation). Also,
depending on the phase of the protocol, a node changes its message handlers.

Choose key elements Based on the conclusions of the previous step and on the
verification issues, we decided to focus on master and storage nodes. This paper
does not get into the details of the numerous properties to check, a good part of
which can be found in [1] and [2]. Many properties were provided as an informal
statement by the code developers. For example, only a single node is elected as
                             A. Dedova, L. Petrucci: From Code to Coloured Petri Nets   117



a primary master ; all shared information (partition tables, identifiers) has been
made consistent for the service phase to take place.
    Most attention is payed to the election of the primary master and to the
bootstrap process (everything between election and operational state). Later on
in this paper we focus on bootstrap phase.
    Therefore, we chose the following fragments of code for detailed analysis:
1. master node application
  1   def    __init__ ( s e l f , c o n f i g )
  2   def    run ( s e l f )
  3   def    playPrimaryRole ( s e l f )
  4   def    r u n M a n a ger ( s e l f , m a n a g e r _ k l a s s )
  5   def    changeClusterState ( self , state )

2. recovery manager class
  1   def __init__ ( s e l f , c o n f i g )
  2   def r u n ( s e l f )
  3   def b u i l d F r o m S c r a t c h ( s e l f )

3. verification manager class
  1   def    __init__ ( s e l f , app )
  2   def    _askStorageNodesAndWait ( s e l f , packet ,                node_list )
  3   def    run ( s e l f )
  4   def    verifyData ( s e l f )
  5   def    verifyTransaction ( self , tid )

4. storage node application
  1   def    __init__ ( s e l f , c o n f i g )
  2   def    run ( s e l f )
  3   def    connectToPrimary ( s e l f )
  4   def    verifyData ( s e l f )
  5   def    initialize ( self )



Find interactions The nodes in the cluster need to communicate with one an-
other. For this purpose they use a class EventManager. It describes the mecha-
nism for sending and receiving messages. To treat them, each node has its own
handlers, different for different phases of the protocol. Thus, they should be
added to our list of pieces of code.
     Another means for nodes collaboration in the cluster is a partition table. It
is implemented as a class that keeps the distribution of data between storage
nodes. This class is another key element and should also be added to the analysis
list.
     The master and storage applications also use some global variables to allow
their methods to know the state of the application (primary, operational, has_pt
— partition table). This information should be kept aside to be used during
modelling.

Secondary elements When this stage occurs, all significant parts of the project
and their communications are identified. It is then time to make rather crude
abstractions on objects that could not be eliminated from the model, but must
be simplified because of the abstraction level.
   For example, for the NEO protocol we made following abstractions:
118       PNSE’12 – Petri Nets and Software Engineering



 1. The complex message structure, defined in a class package is modelled as an
    integer number;
 2. Connection, described as a group of classes, is modelled as a pair of nodes,
    that are considered to be connected;
 3. Transaction and object, that have a lot of fields, such as serial number,
    history, data, etc., are modelled by their identification numbers.



              Storage Global                  Partition Table          Master Global


  Poll Verification        Poll Identification                         Run Primary


  Handle Mes-
  sage: DelT, ComT,        Handle Message:
  AskUT, AskTI, AskOP,     AnsPT, AnsLID, AnsNI,        Recover                        Verify Data
  AskLID, AskPT, NotPCh,   NotNI
  StartOp, StopOp
                                                        Poll Recover        Demand              Poll Verification

                                                                                                Handle Message:
                                                                                                AnsUT, AnsTI, AnsOP,
                                                                                                TNF, ONF




                                            Fig. 4. Hierarchy of models


Divide into modules Figure 4 presents the sub-module structure of the bootstrap
model. First of all, there are two main global level models: “Storage Global” and
“Master Global”. Aside, a “Partition Table” model is referenced by the both
storage and master nodes.
    Then, in “Storage Global” there are two substitution transitions, correspond-
ing, respectively, to “Poll Verification” and “Poll Identification” sub-modules, and
presented by two instances of “Poll Storage”. Each of them contains a “Handle
Message” sub)model, but with different messages inside.
    In the model “Master Global” there is only one sub-module “Run Primary”. It
itself has two sub-nets: “Recover” and “Verify Data”, that correspond to recovery
and verification managers run methods. The “Recover” module contains one
sub-module, “Poll Rec” together with the handlers of two messages. The “Verify
Data” module contains two sub-models occurring several times in the model :
“Demand” (twice) and “Poll Verification” (four times). “Poll Verification” includes
a “Handle Message” with different message handlers.


3.3     Code Analysis

In this subsection we will give some detailed examples of application the guide-
lines to the source code of the NEO protocol.

Data structures In order to show how the colour domains can be constructed
from data structures, let us take the piece of code in Figure 5. It is a fragment of
the partition table class definition where the internal fields are declared. First,
                           A. Dedova, L. Petrucci: From Code to Coloured Petri Nets                                    119



      1   class C e l l ( o b j e c t ) :
      2     def __init__ ( s e l f , node ,             s t a t e = C e l l S t a t e s .UP_TO_DATE) :
      3       s e l f . node = node
      4       self . state = state
      5   ...
      6
      7   class P a r t i t i o n T a b l e ( o b j e c t ) :
      8     def __init__ ( s e l f , n u m _ p a r t i t i o n s , n u m _ r e p l i c a s ) :
      9       s e l f . _id = None
    10        s e l f . np = n u m _ p a r t i t i o n s
    11        s e l f . nr = num_replicas
    12        s e l f . num_filled_rows = 0
    13        s e l f . p a r t i t i o n _ l i s t = [ [ ] f o r _ in x r a n g e ( n u m _ p a r t i t i o n s ) ]
    14    ...

          Fig. 5. Fragment of the source code declaration of partition table class



the class for a cell of the partition table is declared. It has two attributes: a
storage node and a state. Knowing that a partition cell has two possible states,
we can declare a colour domain PSTATE as a set of these two values and a colour
domain PT_CELL as a product of storage node type and cell state.
c o l s e t PSTATE = w i t h UTD | OOD; ( ∗ t h e s e t o f s t a t e s o f p a r t i t i o n ∗ )
c o l s e t PT_CELL = p r o d u c t SN∗PSTATE ; (∗ a c e l l o f p a r t i t i o n t a b l e ∗)

Now let us consider the beginning of the partition table class constructor. It
starts with assigning the values of variables for number of replicas and number
of partitions. Then it creates a two-dimension list. In one dimension its size
is equal to the number of partitions, in the other dimension the size is not
specified. So, we declare three auxiliary colour domains: a set of partitions, a
list of partition cells and a partition row, that is a product of a partition and a
list of cells. Finally, a partition table colour domain consists of a list of partition
rows.
colset    PART = i n d e x np w i t h 0 . . NP ( ∗ t h e s e t o f p a r t i t i o n s ∗ )
colset    PT_CELLlist = l i s t PT_CELL w i t h 0 . . NR+ 1 ) ; ( ∗ a c e l l l i s t ∗ )
colset    PT_ROW = p r o d u c t PART∗ PT_CELLlist ; ( ∗ a p a r t i t i o n t a b l e row ∗ )
colset    PT = l i s t PT_ROW w i t h 0 . . NP ; ( ∗ t h e p a r t i t i o n t a b l e t y p e ∗ )

    The following colour domain definitions will be used later on:
colset      SN = i n d e x s n w i t h 0 . . N ; ( ∗ t h e s e t o f s t o r a g e n o d e s ∗ )
colset      MN = i n d e x mn w i t h 0 . .M; ( ∗ t h e s e t o f s t o r a g e n o d e s ∗ )
colset      NODE = u n i o n s 1 : SN + m1 :MN; ( ∗ t h e s e t o f a l l n o d e s ∗ )
colset      CSTATE = w i t h VER | REC | RUN | STP ; ( ∗ t h e s e t o f c l u s t e r s t a t e s ∗ )
colset      MTYPE = w i t h StopOp | S t a r t O p | AskUT | AskPT | AskNI | AskLID |
                                  AskTI | AskOP | AnsUT | AnsNI | AnsPT | AnsLID |
                                  AnsTI | AnsOP | NotNI | NotPCh | DelT | ComT ;
                                  (∗ the s e t o f message t y p e s ∗)
c o l s e t MESS = p r o d u c t MTYPE∗NODE∗NODE∗INT ; ( ∗ t h e s e t o f m e s s a g e s ∗ )
c o l s e t S N l i s t l i s t SN w i t h 0 . . N ; ( ∗ a l i s t o f s t o r a g e n o d e s ∗ )
c o l s e t M E S S l i s t = l i s t MESS 0 . . 1 0 0 0 ; ( ∗ a l i s t o f m e s s a g e s ∗ )




States and conditions on objects As an illustration of the next four steps, let us
consider the beginning of the verification phase from the primary master point
of view. The corresponding source code is listed in Figure 6.
120         PNSE’12 – Petri Nets and Software Engineering



 1    def r u n ( s e l f ) :
 2      s e l f . app . c h a n g e C l u s t e r S t a t e ( C l u s t e r S t a t e s . VERIFYING )
 3      s e l f . verifyData ()
 4      ...
 5
 6    def v e r i f y D a t a ( s e l f ) :
 7      em , nm = s e l f . app . em , s e l f . app . nm
 8      neo . l i b . l o g g i n g . debug ( ’ w a i t i n g ␣ f o r ␣ t h e ␣ c l u s t e r ␣ t o ␣ be ␣ o p e r a t i o n a l ’ )
 9      while not s e l f . app . p t . o p e r a t i o n a l ( ) : em . p o l l ( 1 )
10      neo . l i b . l o g g i n g . i n f o ( ’ s t a r t ␣ t o ␣ v e r i f y ␣ d a t a ’ )
11      s e l f . _askStorageNodesAndWait ( P a c k e t s . A s k U n f i n i s h e d T r a n s a c t i o n s ( ) ,
12          [ x f o r x in s e l f . app . nm . g e t I d e n t i f i e d L i s t ( ) i f x . i s S t o r a g e ( ) ] )
13       ...
14
15    def _ a s k S t o r a g e N o d e s A n d W a i t ( s e l f , p a c k e t ,   node_list ):
16      p o l l = s e l f . app . em . p o l l
17      o p e r a t i o n a l = s e l f . app . p t . o p e r a t i o n a l
18      u u i d _ s e t = s e l f . _uuid_set
19      uuid_set . c l e a r ()
20      f o r node in n o d e _ l i s t :
21          u u i d _ s e t . add ( node . getUUID ( ) )
22          node . a s k ( p a c k e t )
23      while True :
24          p o l l (1)
25          i f not u u i d _ s e t :
26              break

                       Fig. 6. Fragment of the source code of verification phase


    First three lines come from the run method of the verification manager class.
We can see that the primary master changes cluster state to VERIFYING and
calls verifyData method. So, we can start by defining two places:

 –    start_verif with colour domain MN (the state of the primary master at
   the start of the verification manager);
 –    c_state with colour domain CSTATE (the current state of the cluster).
Then, the primary master waits until the partition table becomes operational
(line 9). We define a new place, corresponding to this state of the primary master.

 –          wait_pt with colour domain MN.
After that, it calls a method _askStorageNodesAndWait, where it sends requests
about unfinished transactions to storage nodes, and waits until the list uuid_set
becomes empty. This waiting period can be modelled as a new place. In order to
send messages to other nodes we need a channel place. According to the protocol,
it must be a FIFO list. Hence, two places are added:

 –          network with colour domain MESSlist;
 –          wait_ut with colour domain MN.
Finally, the primary master starts the verification of transactions one by one.
This code is out of scope of our example, but we can at least give the next state
of the primary master by adding a new place:

 –          verifying_trans with colour domain MN.
                  A. Dedova, L. Petrucci: From Code to Coloured Petri Nets    121



Events and actions Now, we need to extract the important actions from the
same piece of code. The first method call changeClusterState can be considered
as one of them. So, we add the first transition:
 –      change_c_state.
When getting to the next lines, we see that line 7 contains nothing but shortcuts
and line 8 writes the current state to the log. The next important action is
em.poll(1) that is executed while the primary master waits for the partition table
to be operational. Here, it is supposed to treat different messages. For the sake
of readability of the model, we decide to organise message handlers in separate
sub-nets. A new transition is added, coloured in black to symbolise there is a
net behind.
 –      poll _pm_verif .
Line 10 is not important since it writes a log. Then the _askStorageNodesAnd-
Wait method is called with a list of all identified storage nodes as an argument.
Inside this method some shortcuts occur (lines16–18) and the list is cleared
uuid_set. Then, considering the storage nodes from the input list one by one,
they are added to uuid_set and sent a request of unfinished transaction (which is
also given as input parameter, defined in line 11). An additional place is needed
to store the identified storage nodes. So, we go to the previous step, add this
place, and return to add a new transition:

 –     s_iden with colour domain SN.
 –      ask _ut.

Then the primary master is waiting once again, executing poll (line 24). So, we
duplicate the corresponding transition. Finally, we add a transition that models
the exit of this process.
 –      got_ut.

Transformations of data Let us consider the transitions we have up to now one
by one in order to build arcs and provide their expressions. To do so, some
variables should first be declared. Let pm: MN; cst: CSTATE. The whole net can
be seen in Figure 7. Transition change_c_state moves the primary master token
from place start_verif to wait_pt and replaces current cluster state token with
a VERIF one.
    Transition got_ut moves the primary master from place wait_ut to verifying_trans.
But it can fire only when all the answers of storage nodes are received. Here an
additional place, similar to the variable uuid_set (line 25), is created, that will
contain all storage nodes, answers from which the primary master is waiting.
Transition got_ut must fire if and only if this place is empty. However, it is
not possible to check if the multiset is empty without inhibitor arc. One of the
solutions is to change the colour domain to SNlist, since a list can be checked for
emptiness.
122      PNSE’12 – Petri Nets and Software Engineering



                                                start_verif
                                                MN
                                         hpmi
                                hcsti
              c_state                            change_c_state
                CSTATE         hVERIFi
                                         hpmi
                                                       hpmi
      partition_table              wait_pt                              poll_pm_verif
                   PT                    MN            hpmi
                             hptbli    hpmi
                                 hsli           ask _ut   hbroadcast(AskUT, sl, mli
               s_iden                                                                   network
                SNlist                          [operate(ptbl)]                         MESS
                               hsli      hpmi
                                                       hpmi
            wait_ans               wait_ut                              poll_pm_verif
                SNlist                  MN             hpmi
                              h[ ]i             hpmi

                                                 got_ut

                                         hpmi

                                                verif ying_trans
                                                MN



                         Fig. 7. Model of verification manager - part 1



 –       wait_ans with colour domain SNlist.

A new variable sl: SNlist is also declared.
     Transition poll _pm_verif is replaced by a sub-net. Here it simply takes the
primary master token and puts it back. Handlers of messages, that are hidden
behind them, can change the state of some variables, e.g. uuid_set, and, respec-
tively, the content of place wait_ans.
     Transition ask _ut moves the primary master token from place wait_pt to
wait_ut. It also sends messages to all storage nodes from place s_iden. Here we
see that it could be convenient to change the colour domain of s_iden to SNlist.
In this case we can directly put this list into place wait_ans. Also we can write
an SML function broadcast, that sends the same message to each node from the
list.
fun b r o a d c a s t ( msgType , l ) =
    L i s t . f o l d r ( fn ( sNode , t o k e n s ) =>
            1 ‘ ( msgType , s 1 ( sNode ) , pm , 0 ) ++ t o k e n s )   []   l

A new variable declaration is needed: ml: MESSlist.

Conditions on events In this example, there is only a single transition that
requires an auxiliary condition to fire. It is ask_ut, since it can fire only if the
partition table is operational (see lines 9, 11, 22 of figure 6). To make this check,
first of all, we need to add an additional place:
                           A. Dedova, L. Petrucci: From Code to Coloured Petri Nets               123



     –       partition_table with colour domain PT,

    together with a new variable ptbl: PT. A guard must also be added. The partition
    table is operational if and only if there is at least one up-to-date cell for each
    partition. So, we can write the following SML function.
1   fun o p e r a t i o n a l p t = L i s t . a l l ( fn (_, row ) =>
2                                   L i s t . e x i s t s ( fn (_, s t ) => s t = UP) row ) p t




    3.4    Analysis and feedback

    The properties the protocol should satisfy were model-checked. The outcome of
    this analysis was suspicious scenarios. The design approach allowed for tracing
    back the execution sequence in the source code, and thus the engineers could
    check their validity.
        Some scenarios were due to a too coarse abstraction level, but the assump-
    tions made during modelling did hold and guarantee the appropriate behaviour
    of the code. An interesting erroneous scenario pointed out the possibility of a
    livelock in the primary master election process. However, this never happened in
    practice, as the developers found out it was prevented by a side-effect of a Python
    function. Nevertheless, they could fix it, such a side effect being undesirable, in
    case it doesn’t happen in a future version of Python.


    4     Conclusion

    In this paper we gave detailed directions on how to construct a coloured Petri
    net model from a high level object oriented source code and illustrated it with a
    real case example. The modelling process is divided into 2 main parts: grounding
    and code analysis.
        Now the following questions could be raised. Can this process be automated?
    The most complicated part of the modelling process is to choose objects and ac-
    tions that are important for the goals of verification and separate them from
    those that are not as useful. If a programming language could provide some kind
    of priorities to data structures and methods, it may simplify the automation
    process. During the industrial project in which the NEO protocol was analysed,
    a code-tagging approach to facilitate both the modelling and the interpretation
    of the verification results was envisioned for future work. Moreover, part of our
    group works on a tool-supported Petri net model design method from a nat-
    ural language description. Further work could include the integration of both
    approaches.
        Another interesting question is could these modelling guidelines be applied
    elsewhere? Even if not directly, but with some refinements, they could be applied
    to any reverse-engineering process providing a coloured Petri net or a similar
    model of concurrency.
124     PNSE’12 – Petri Nets and Software Engineering



References
1. C. Choppy, A. Dedova, S. Evangelista, S. Hong, K. Klai, and L. Petrucci. The
   NEO protocol for large-scale distributed database systems: Modelling and initial
   verification. In Proc. 31st Int. Conf. Application and Theory of Petri Nets and Other
   Models of Concurrency (PetriNets’2010), Braga, Portugal, June 2010, volume 6128
   of Lecture Notes in Computer Science, pages 145–164. Springer Verlag, June 2010.
2. C. Choppy, A. Dedova, S. Evangelista, K. Klai, L. Petrucci, and S. Youcef. Modelling
   and formal verification of the NEO protocol. Transactions on Petri Nets and Other
   Models of Concurrency, 2012. To appear.
3. C. Choppy, L. Petrucci, and G. Reggio. A modelling approach with coloured Petri
   nets. In Proc. 13th International Conference on Reliable Software Technologies /
   ADA-Europe, Venice, Italy, LNCS 5026, pages 73–86. Springer-Verlag, 2008.
4. J. Desel and L. Petrucci. Aggregating views for Petri net model construction. In
   Proc. workshop on Petri Nets and Distributed Systems (PNDS08, associated with
   Petri Nets 2008), Xi’an, China, pages 17–31, 2008.
5. U. Farooq, C. P. Lam, and H. Li. Transformation methodology for uml 2.0 activity
   diagram into colored petri nets. In Proceedings of the third conference on IASTED
   International Conference: Advances in Computer Science and Technology, ACST’07,
   pages 128–133, Anaheim, CA, USA, 2007. ACTA Press.
6. Elhillali Kerkouche, Allaoua Chaoui, El-Bay Bourennane, and Ouassila Labbani. A
   uml and colored petri nets integrated modeling and analysis approach using graph
   transformation. Journal of Object Technology, 9(4):25–43, 2010.
7. Kurt Jensen. Coloured Petri Nets. Basic Concepts, Analysis Methods and Practical
   Use., volume Basic Concepts. Springer-Verlag, 1992.
8. John Anil Saldhana and Sol M. Shatz. Uml diagrams to object petri net models:
   An approach for modeling and analysis. In In International Conference on Software
   Engineering and Knowledge Engineering, pages 103–110, 2000.
9. JB. Voron and F. Kordon. Transforming Sources to Petri Nets : A Way to Analyze
   Execution of Parallel Programs. In International Workshop on Petri Nets Tools and
   APplications (PNTAP), pages 1–10. ACM, 2008.