=Paper= {{Paper |id=Vol-3618/forum_paper_18 |storemode=property |title=Assessing the value of incomplete deadlock verification in Model-Driven Engineering |pdfUrl=https://ceur-ws.org/Vol-3618/forum_paper_18.pdf |volume=Vol-3618 |authors=Felix Cammaerts,Monique Snoeck |dblpUrl=https://dblp.org/rec/conf/er/CammaertsS23 }} ==Assessing the value of incomplete deadlock verification in Model-Driven Engineering== https://ceur-ws.org/Vol-3618/forum_paper_18.pdf
                                Assessing the value of incomplete deadlock
                                verification in Model-Driven Engineering
                                Felix Cammaerts1,∗ , Monique Snoeck1
                                1
                                    LIRIS, Naamsestraat 69, 3000, Leuven, Belgium


                                                                         Abstract
                                                                         Model-Driven Engineering (MDE) involves automatically generating code from a conceptual model.
                                                                         However, despite accurate translation of requirements into models and correct transformation of models
                                                                         into code, ensuring internal consistency of the model remains a challenge. Inconsistencies can arise
                                                                         among various elements of a conceptual model. While certain errors can be identified through simple
                                                                         semantic checks (e.g., circular inheritance relationships, non-deterministic statecharts), other error
                                                                         detection problems encounter decidability issues or necessitate complex verification algorithms that may
                                                                         suffer from state explosions. One particularly arduous verification problem involves checking concurrent
                                                                         statecharts for deadlocks. To address this problem we proposes a divide and conquer algorithm that starts
                                                                         from two statecharts and incrementally adds more statecharts. Although the algorithm cannot address
                                                                         all possible deadlock issues, it demonstrates the capability to detect a significant number of them. This
                                                                         paper provides an initial evaluation of the algorithm, by evaluating the algorithm’s utility, conceptual
                                                                         models submitted as homework by students in previous years within an MDE course are employed. The
                                                                         results reveal that students’ conceptual models often contain deadlock situations. Consequently, the
                                                                         paper concludes that even an algorithm that cannot detect all deadlock situations, as presented in this
                                                                         research, can already detect a significant number of potential deadlock situations and thus can serve as
                                                                         an initial step in mitigating them while providing valuable feedback to students. The paper also describes
                                                                         the next steps to be undertaken to use the algorithm as part of an educational tool for novice modellers.

                                                                         Keywords
                                                                         Deadlock detection, Formal verification, Education, MERODE




                                1. Introduction
                                The cost of poor software quality in the United States in 2020 was estimated at a staggering
                                $2.08 trillion [1]. Operational failures accounted for a significant portion, reaching $1.56 trillion,
                                which increased from $1.275 trillion in 2018. Software bugs can have various consequences,
                                from frustrating customers to catastrophic outcomes. Reports on software bugs and faults
                                continue to emerge, with the latest example mentioned on the ”Risks to the Public” forum being
                                a bug in a life-sustaining heart pump [2].
                                   For the industry, adequate software testing is the most widely used technique ensuring quality
                                assurance, typically achieved through validation and verification processes.

                                ER2023: Companion Proceedings of the 42nd International Conference on Conceptual Modeling: ER Forum, 7th SCME,
                                Project Exhibitions, Posters and Demos, and Doctoral Consortium, November 06-09, 2023, Lisbon, Portugal
                                ∗
                                    Corresponding author.
                                Envelope-Open felix.cammaerts@kuleuven.be (F. Cammaerts); monique.snoeck@kuleuven.be (M. Snoeck)
                                Orcid 0000-0002-0037-3865 (F. Cammaerts); 0000-0002-3824-3214 (M. Snoeck)
                                                                       © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                    CEUR
                                    Workshop
                                    Proceedings
                                                  http://ceur-ws.org
                                                  ISSN 1613-0073
                                                                       CEUR Workshop Proceedings (CEUR-WS.org)




CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
   As low-code and no-code software development approaches gaining ground [3], more soft-
ware is developed according to Model-Driven Engineering (MDE) practices. In MDE code is
automatically generated from a model, verification entails checking the internal consistency of
the model. For example, statecharts should only have accessible states, and circular inheritance
relationships should be avoided. Many of these mistakes can be identified through simple
semantic checks that can be defined by means of Object Constrain Language (OCL) [4] in the
meta-model of the modelling approach. However, more complex semantic checks, such as
deadlock situations can not be modelled by OCL and require a procedural verification.
   The preferred language of MDE practitioners is UML [5]. This language is used for object-
oriented modelling, but can also be used for artefact-centric modelling [6][7]. In both approaches,
systems are designed as a collection of object types each of which is equipped with a statechart
defining the lifecycle of objects. While such systems are prone to deadlock errors, research on
deadlock verification is quite scarce in this domain. A possible reason could be that while there
have been attempts to provide UML with formal semantics, leading to the definition of fUML1 ,
these are not based on a formal process algebra, making it challenging to perform deadlock
detection in UML-based modelling approaches.
   The importance of algorithmic verification should not be underestimated. Previous research
has demonstrated that ”reading” models without actually running them does not allow to detect
errors easily, unless modellers have at least 10 to 20 years of modelling experience [8]. In the
absence of algorithmic verification for deadlock, software developers have to rely on (time
consuming) software testing to detect possible errors. Algorithmic verification would thus allow
for saving precious time in the software development process.
   In search for an algorithm, one can consider different formalisms to verify concurrent pro-
cesses for deadlocks. A prominent and well-researched formalism is Petri Nets, which is mostly
used in the domain of process modelling [9]. As statecharts can be translated into Petri nets, it
is worthwhile to consider the work that has been performed in this domain, such as the work of
Raedts et al. [10], Chu et al. [11], Huang et al. [12], and Jeng et al. [13]. Besides Petri Nets also
other or additional techniques are used such as decomposition [14], graph reduction [15], and
WF-nets [16]. However, all these proposed verification approaches apply to a single process. In
case of collaborative processes, such as is the case for concurrent statecharts, these approaches
require the separate processes to be integrated to a single process to allow for deadlock verifica-
tion. Moreover, these algorithms do not support the provision of feedback about the root cause
of the deadlocks in the individual processes in the collaboration, and especially not if those
originate from statecharts. There thus remains work to be done to develop a practical algorithm
that can verify a collection if interacting statecharts for deadlock errors while providing the
modelers with actionable feedback about the root causes in the source statecharts.
   In the past, research on MERODE has focused on creating a UML-based MDE approach that is
grounded in process algebra with the goal of facilitating model checking. MERODE makes use
of a combination of class diagrams and statecharts in line with UML, but formalises this with
the process algebra of Communicating Sequential Processes (CSP) [17]. While some theoretical
work has been performed to check MERODE models for deadlocks [18], this work is incomplete
and only reports whether a model contains a potential deadlock or not, instead of providing the

1
    More information can be found at: https://www.omg.org/spec/FUML/1.5/About-FUML/
specific conditions under which the deadlock would occur, making it of little practical use.
   This paper contributes to the state of the art in deadlock checking of UML-based models
by taking a next step towards addressing deadlock situations in Model-Driven Engineering
within the MERODE approach by presenting a practical algorithm capable of detecting deadlock
situations in conceptual models. These deadlock situations arise from the parallel execution of
separate artifacts within these models. The present paper introduces the Algorithm for Deadlock
Detection Leveraging Incomplete Feedback (ADD-LIFE), and explores the prevalence of deadlock
situations within MERODE models. Ultimately, the ADD-LIFE algorithm would be used to
provide feedback to students. Before implementing the algorithm in a didactic modelling tool,
the present research evaluates the feasibility of developing such an algorithm and assesses its
utility, thus addressing the following research questions:

    • RQ1: What is a feasible way to algorithmically perform (partial) checks on conceptual
      models for deadlock, livelock, deadlock imminent and livelock imminent situations?

    • RQ2: How many deadlock, livelock, deadlock imminent and livelock imminent situations
      can be identified using such an algorithm on conceptual models from students?

   The subsequent sections of this paper are organized as follows: section 2 presents an overview
of related work concerning deadlock detection in object-oriented conceptual models and dis-
cusses the prior work on the MERODE approach. Section 3 presents the algorithm that was
developed as answer to RQ1. Section 4 presents the answer to RQ2 by applying the algorithm
to student homeworks from the past 5 years. Section 5 offers a further discussion of the results,
providing insights and interpretations and also addresses limitations. Lastly, section 6 concludes
the paper by summarizing the findings. It also outlines the next steps to be taken in further
developing and refining a tool that would use the algorithm.


2. Related work
2.1. Deadlock detection in conceptual models
In Model-Driven Engineering (MDE), software development revolves around creating domain
models as abstract representations of the desired software behavior, rather than directly writing
source code. These domain models serve as conceptual models that capture the expected
behavior of the software system and are used to automatically generate runnable code.
   Roughly speaking, three types of errors can occur during the MDE process. First, the
conceptual models themselves may contain errors or misinterpretations of the requirements
(type 1 errors). Second, a model may contain internal inconsistencies, such as inaccessible
states in statecharts, unlinked classes in class diagrams, or deadlock situations resulting from
backward inaccessibility or interactions between different statecharts (type 2 errors). Finally,
transformation errors (type 3 errors), may result in the generated code being erroreous and not
correctly implementing the conceptual model is was generated from. The work presented in
this paper focuses on type 2 errors, and more specifically different types of deadlocks as type 2
errors in a set of concurrent and interacting state charts.
   Zhang et al. [19] have introduced a framework that ensures the absence of deadlock situations
when transforming SLCO models (which are based on state machines) to code. This framework
guarantees that the semantic behavior of the model is faithfully preserved in the generated
code. The work is part of the SLCO language framework, which focuses on ensuring error-free
translation from SLCO models to code [20]. SLCO thus focuses on type 3 errors and does not
guarantee internal consistency within the model itself (type 2 errors). While in the domain of
process modelling most of the work focuses on checking a single process for deadlock (e.g. using
Petri Nets as formalism), some of the work in this domain is related to checking state charts as
well. Takaki et al. [21] proposed the EVA algorithm, which verifies a workflow’s consistency
with the state charts that constitute the default lifecycles of the data artifacts manipulated by the
process. They also introduced an algorithm to transform cyclic workflow diagrams into acyclic
ones, enabling the application of the EVA algorithm. The paper demonstrates that EVA can
determine the consistency of lifecycles and identify defects in a real large enterprise application
system. However, these approaches primarily address the consistency of generated code and
the surrounding business processes, rather than the internal consistency of the conceptual
model of the data artefacts. Artefact-centric process models define systems as a collection of
business artefacts having lifecycles defined by means of statecharts[22]. Calvanese et al. [23]
investigated safety properties using Satisfiability-Modulo-Theories (SMT) agnostic to the initial
database structure, making it applicable to artifact-centric systems. While they address type 2
errors for concurrent state charts, their research identified three classes of systems for which
safety is decidable, limiting the applicability of their results to certain models. Deutsch et al.
[24] implemented WAVE, a tool for static analysis of a subset of data-aware process models.
However, their verification checks only apply to that specific subset of models, as other models
are not within the PSPACE complexity class. Thus, these approaches are limited and cannot be
universally applied to all conceptual models that define a series of concurrent state charts.
   While object interaction in UML is binary (one object sends a message to one other object),
in MERODE, object interaction has been defined by means of joint participation to common
events, thus allowing for multi-party interaction. The semantics of this interaction have been
formalised by means of Communicating Sequential Processes (CSP) [17]: when an event is
triggered, all processes that have this event in their input alphabet will react on it.
   Numerous studies have already investigated deadlock detection for CSP, including works
by Huang et al. [25], Lima et al. [26], Isobe et al. [27], and Ladkin [28]. These papers propose
algorithms that automatically detect deadlock situations in CSP-based systems. However, these
algorithms assume that the number of processes is known in advance (e.g. a system with 5
dining philosophers who share 5 forks). While addressing type 2 errors, the assumption of a
fixed number of objects does not hold in the broader context of information systems, where
objects can be created and ended, thus adding processes to a system or removing them (i.e.
philosophers and forks can be added and removed, thus resulting the need for checking systems
that may have an arbitrary number of philosophers and forks).
   Despite the many works on safety and deadlock checking either based on Petri Nets, artefact-
centric modelling or CSP, to the best of our knowledge, no prior research has introduced an
approach for static analysis of deadlock detection in conceptual models, particularly when
considering the time and space complexity challenges that arise with large-scale models.
Figure 1: Model with possible deadlock situation. The tokens show the system’s state after the sequence
crJob (President), crApplicant (Felix), crApplication (FelixForPresident)


2.2. Existing work on MERODE
The MERODE approach is an MDE approach that uses class diagrams and finite state machines.
The class diagram represents the structural aspects by means of object types and their asso-
ciations, while the finite state machines model the behavioral aspects. In MERODE, the class
diagram takes the form of an Existence-Dependency Graph (EDG), where object types are
connected through associations that express existence dependency. The EDG can be obtained
from a regular UML class diagram by reifying the associations that do not express existence
dependency (ED). Assume for example a many-to-many association ’applies for’ between jobs
and applicants: in MERODE, this results in a class ’Application’ that is ED on both ’Job’ and
’Applicant’, see Figure 1. In the EDG, all classes are thus arranged in a master-dependent order.
   Each class has a state chart that defines its lifecyle. Object interaction happens through joint
participation to events. Assume that in the example of Figure 1 three objects have already
been created: President is an instance of Job and is in the state open; Felix is an instance of
Applicant and is in the state applied, and FelixForPresident is an instance of Application and
is in the state exists. ReviewByMonique has not yet been created. Joint participation to events
means that each object that has been defined to participate in the event must validate the event
and possibly react to it. For example, when the send event occurs, the FelixForPresident
object will react to it and move from the exists to the sent state, but, since each application is
connected to a single applicant, the send event also appears in the lifecycle of the Felix object.
Hence, the occurrence of the send event will also transition Felix from the applied to the sent
state and President from the open to the open state. However, transitions only happen if the
objects are in the right state. In line with CSP’s definition of interaction, all participants to an
event must accept an event for it to be accepted overall. If Felix is not in the applied state but
in the exists state, the event would not be executed for either of the thee objects because the
state of Felix does not allow the event to be executed.
   Making use of the lifecycle implications of existence dependency, three approaches have
been defined to enhance the internal consistency between different views of a software model
[29, 30]. The consistency by construction approach infers specifications in one view based on user-
defined specifications in another view, for example by automatically adding default lifecycles
when creating a class (see the lifecycle of Review in Figure 1). The consistency by monitoring
approach checks newly defined specifications for consistency at the moment they are added,
preventing the creation of cyclic inheritance relationships, for example. This conforms to the
aforementioned use of OCL [4] to specify constraints on the meta-model to prevent model errors.
The consistency by analysis approach uses algorithms to check for inconsistencies between
different deliverables and reports any identified issues. These approaches enable modelers to
freely construct various artifacts while ensuring consistency.
   As the default statecharts do not impose any other sequence constraint than creation having
to happen first and ending last, in combination with the a-cyclicity of the EDG, such system
has been proven to be deadlock-free [18]. However, the default statecharts often fail to capture
the full complexity of the requirements, such as for example having a blocked state for a
bank account, thus requiring the definition of more detailed statecharts. The introduction of
additional sequence constraints however holds the risk of introducing deadlock problems. To
check for these errors, the consistency by construction approach cannot be applied. Therefore,
consistency by analysis and consistency by monitoring, both relying on algorithmic checks for
model consistency, are to be used. The consistency by analysis approach performs checks
when the user requests model verification on their conceptual model, while the consistency
by monitoring approach verifies the new conceptual model immediately upon user-initiated
changes. For example, when the user attempts to add an ending transition to a non-final state,
the consistency by monitoring approach will not allow this transition to be added. The addition
of a new state will cause a problem of inaccessibility and deadlock but is not prevented to
ensure usability of the tool. But when the user then requests a model check from the system,
the consistency by analysis will inform the user about the inaccessibility of a state.
   Several verification checks for the MERODE approach already exists in MERLIN [31], a
tool to develop MERODE models with. These checks include verifying the accessibility of
states, ensuring object types have attributes, confirming the determinism of statecharts, and
the absence of forward inaccessible states. When considering a single object type, the MERLIN
tool can detect potential deadlock situations by verifying the absence of backward inaccessible
states. However, MERLIN has no algorithm for checking a collection of statecharts for deadlock
problems. Previous research on MERODE has already proposed an algorithm to check a
complete model for deadlock freedom [18], be it with some simplifications to circumvent
problems arising from unbounded interleaving. Unbounded interleaving results from maximum
many cardinalities. Indeed, in the case of a 1-to-many association, an instance on the 1 side may
have an unbounded number of instances on the many side running concurrently. In our example,
an applicant may have an unbounded number of concurrent applications. When translating this
into automata theory, a full deadlock check in the MERODE approach would require automata
able to deal with unbounded interleaving semantics, i.e. push-down automaton conforming to
a context-free (CF) grammar. Such a pushdown automaton can be derived from on the class
diagram and state diagrams of the conceptual model. To find possible deadlock situations in the
conceptual model, one then needs to check whether all possible input sequences (strings) are
accepted by the automaton. For one input sequence of length 𝑛, a pushdown automaton has a
time complexity of 𝑛3 [32]. Since not just one input sequence has to be checked, but all possible
input sequences, this clearly turns into a computationally intractable approach.
   To avoid the complexities associated with pushdown automata, the algorithm therefore
replaces the 0/1..* cardinalities on the side of the dependent with 0/1..1. This forces the dependent
class to exist sequentially, thus exhibiting an iteration instead of unbounded interleaving
behaviour (for the given example: an application can have a sequence of applications). As
iteration belongs to regular expression grammars, this can be handled by means of an finite
state automaton. The algorithm works by calculating the parallel composition of the concurrent
statecharts of a master with the iteration of its dependents. Once the parallel composition has
been calculated, the algorithm verifies if all events remain fireable in the resulting finite state
machine. If all events are still firable, the system consisting of these concurrent statecharts, is
said to be deadlock-free. Thus, it does therefore not ensure the absence of potential deadlock
situations in a object-oriented conceptual model; it merely ensures the existence at least one
deadlock-free execution trace and the presence of all events in at least one of these traces.
Therefore, a more comprehensive approach is needed to detect deadlock situations that may
arise from the combined behavior of multiple object types in the system.


3. Defining a feasible algorithm
The parallel execution of multiple interacting statecharts may result in deadlocks. This means
they have reached a state where none of them can make further progress, rendering the system
unresponsive and incapable of executing any further actions. While some deadlock situations,
such as those resulting from backwards inaccessible states within a single statechart, are
relatively easy to detect, the complex deadlock scenarios that arise due to the interplay between
concurrent statecharts are more difficult to detect.
   An experienced modeller may notice, by merely inspecting the FSMs of Figure 1, that an
applicant will be able to have at most one application. Such problem identification requires the
calculation of the system behaviour as the parallel composition of the concurrent statecharts.
An experienced modeller might be able to do this mentally. Junior modelers, however, struggle
a lot more to do this type of verification by just viewing and mentally interpreting the models.
   Various types of deadlock situations can be identified within a model’s global state. The first
type is the deadlock situation, in which no further steps can be executed. A second type is
the deadlock imminent situation, where one or more steps can still be executed, but this will
eventually lead to a deadlock situation. The third type is the livelock situation, which allows
steps to be executed but lacks a sequence of steps that allows for progress towards a final state.
Lastly, the fourth type is the livelock imminent situation, in which all possible sequences of
steps will lead to a livelock situation.
   The given example contains deadlock and livelock situations. For instance, the sequence
of events MEcrApplicant, MEcrApplication, send, review, doNotHire would place the Applicant
object in the reviewed state and the Application object in the exists state. In this global state of
the model, for the Applicant object only the hire and doNotHire events can be executed, while
for the Application object only the send endApplication event are permitted. Consequently,
the system is in a deadlock state, as no further events can be executed. It’s worth noting that
neither Applicant nor Application contain a deadlock situation on their own.
   Additionally, the same example contains a livelock situation. The sequence of events MEcrAp-
plicant, MEcrApplication, send, doNotHire places the Applicant object in the applied state and
the Application object in the sent state. In the applied state, an Applicant can execute either
the send or editApplicationDetails event, while in the sent state, an Application can execute the
doNotHire, review, or editApplicationDetails events. It is still possible to execute the editApplica-
tionDetails event, but this doesn’t change the state for neither object. As it is the only remaining
executable event, there is no way to terminate both the Applicant and the Application , thus
resulting in a livelock situation. Again, neither the Applicant nor the Application object
type has a livelock situation on its own.
   To address RQ1, we develop an algorithm capable of detecting deadlock problems in MERODE
models with an acceptable running time. The algorithm exploits the fact that the class diagram
is a directed acyclic graph (DAG) organizing object types along ED thanks to the reification of
associations until they express existence dependency. In such DAG we can identify top and
bottom object types. A top object type is not dependent on any other object type, while a
bottom object type has no dependents. A formal definition of top and bottom object types is
given in Definition 18b of [18]. In Figure 1, the object types have been numbered from top to
bottom, starting with 0. To compute a parallel statechart based on the concurrent statecharts of
a subset of the EDG, starting from a top object type and ending at a bottom object type, all the
paths in the EDG hierarchy are computed. In our example, there are two top object types (Job
and Applicant ) and one bottom object type (Review ). There are two paths, one running from
Applicant through Application to Review and one that starts at Job and ends at Review .
   The proposed algorithm, calculates a parallel statechart that represents the global behaviour
in the following way. First, all cardinalities of ”many” are replaced by ”one” so as to replace
interleaving by iteration. Next, for each master object type, the parallel composition with
an iteration of each dependent object type is calculated. The FSM representing the iteration
of an object type is obtained by simply rerouting the transitions that end in a final state to
the initial state and relabelling this state as both initial and final. This process is repeated
iteratively until a complete path from top to bottom has been calculated. Finally, the parallel
combination of the paths is calculated. This is identical to the algorithm presented in [18].
However, the algorithm in [18] removes backwards inaccessible states after each calculation step.
This immediate pruning of backwards inaccessible states in each intermediate result helps to
avoid rising computational time due a state explosion. However, this pruning also removes the
useful information about the exact combination of states that are causing a deadlock situation.
   To provide modelers with detailed feedback, it is important to keep the backward inaccessible
states in the results as they represent the deadlock states in the global system behaviour. To
determine the type of deadlock, it can be checked whether other deadlock states are reachable
from the identified deadlock state. If no other (deadlock) states are reachable, it indicates a
deadlock situation. If another deadlock state is reachable, then we deal with a deadlock imminent
situation. If the identified deadlock state is still reachable from itself, this indicates a livelock
situation. If at least one (other) livelock state is reachable, it indicates a livelock imminent
situation. The transitions towards the problematic states provide useful information as to which
scenarios ultimately lead to deadlock/livelock.
   In summary, the ADD-LIFE algorithm follows the following steps:
    1. Calculate all paths from top object types to bottom object types.
    2. For each found path in step 1, apply Algorithm 1 (this algorithm is adjusted from algorithm
       A.1 as described in [18] and takes into account referential integrity contstraints) iteratively
       for each object type in the path to obtain the parallel behaviour of all the statecharts in
       the path.
    3. For all the calculated statecharts obtained from step 2, apply Algorithm 2 (this algorithm
       is adjusted from algorithm A.1 as described in [18]) iteratively for each statechart to
       obtain the parallel statechart that models the combined behaviour of the entire system.
    4. Identify the backwards inaccessible states in the resulting parallel statechart, these are
       the deadlock states.
    5. Classify the resulting deadlock states.

Algorithm 1 The algorithm to combine the statechart of a master with the statechart of a
dependent.
Require: objectType2 is existence dependent of objectType1. FSM M1 is the statechart modelling the behaviour of objectType1,
  FSM M2 is the statechart modelling the behaviour of objectType2.
Ensure: FSM M is the parallel statechart of FSM M1 = (Σ1 , 𝑄1 , Δ1 , 𝑞1 , 𝐹1 ) and FSM M2 = (Σ2 , 𝑄2 , Δ2 , 𝑞2 , 𝐹2 ).
  𝑀 ← (Σ1 ∪ Σ2 , 𝑄1 × 𝑄2 , Δ, 𝑞1 × 𝑞2 ) such that

  for 𝑎 ∈ Σ1 ∩ Σ2 ∧ 𝑎 ∉ 𝑑(Σ1 ) do                            ▷ 𝑑 contains the subset of deleting event of the alphabet of objectType1
      if Δ1 (𝑞1 , 𝑎) = 𝑞1′ ∧ Δ2 (𝑞2 , 𝑎) = 𝑞2′ then
          Δ((𝑞1 , 𝑞2 ), 𝑎) ← (𝑞1′ , 𝑞2′ )
      end if
  end for
  for 𝑎 ∈ Σ1 ⧵ Σ2 do
      if Δ1 (𝑞1 , 𝑎) = 𝑞1′ then
          Δ((𝑞1 , 𝑞2 ), 𝑎) ← (𝑞1′ , 𝑞2 )
      end if
  end for



Algorithm 2 The algorithm to combine the statechart of combined FSMs of two subparts of
the EDG.
Require: M1 is the parallel statechart of a path of the EDG, M2 is the parallel statechart of a path of the EDG
Ensure: FSM M is the parallel statechart of FSM M1 = (Σ1 , 𝑄1 , Δ1 , 𝑞1 , 𝐹1 ) and FSM M2 = (Σ2 , 𝑄2 , Δ2 , 𝑞2 , 𝐹2 ).
  𝑀 ← (Σ1 ∪ Σ2 , 𝑄1 × 𝑄2 , Δ, 𝑞1 × 𝑞2 ) such that

  for 𝑎 ∈ Σ1 ∩ Σ2 do
      if Δ1 (𝑞1 , 𝑎) = 𝑞1′ ∧ Δ2 (𝑞2 , 𝑎) = 𝑞2′ then
          Δ((𝑞1 , 𝑞2 ), 𝑎) ← (𝑞1′ , 𝑞2′ )
      end if
  end for
  for 𝑎 ∈ Σ1 ⧵ Σ2 do
      if Δ1 (𝑞1 , 𝑎) = 𝑞1′ then
          Δ((𝑞1 , 𝑞2 ), 𝑎) ← (𝑞1′ , 𝑞2 )
      end if
  end for
  for 𝑎 ∈ Σ2 ⧵ Σ1 do
      if Δ2 (𝑞2 , 𝑎) = 𝑞2′ then
          Δ((𝑞1 , 𝑞2 ), 𝑎) ← (𝑞1 , 𝑞2′ )
      end if
  end for


  Regarding computing time, the calculation of combined statecharts and the generation of
graphs were completed within a few seconds when just a single path is verified. However, if
multiple instantiations of an object type are allowed or the entire parallel statechart is calculated
(instead of just the paths in the EDG), the computing time grows exponentially. Estimations
indicate that calculating the combined statechart for a single model would take several days
under these conditions. This is in line with what was discussed in Section 2, concerning
computational time when using push-down automata for recognizing CF grammars.


4. Validation of utility
In order to answer RQ2, the algorithm was run against students’ homework solutions of the last
5 academic years. As we can see from Table 1, on average 45% of the student models contain at
least one deadlock situation.

Table 1
Total number of deadlock models per academic year.

      Academic year                    18-19     19-20   20-21   21-22   22-23    Total students solutions
      Deadlock-free models               10        10      17      9       5                 51
      Deadlock models                    8         12      10      7       4                 41
      Percentage of deadlock models     44%       55%     37%     44%     44%               45%




Table 2
Results of the Kruskal-Wallis test and Spearman correlation coefficient.

         Deadlock type                Deadlock      Livelock     Deadlock imminent       Livelock imminent
     Kruskal-Wallis (p-value)           0.73          0.59              0.65                    0.87
          Artefact type                  Nr. of Classes                          Nr. of States
  Student models (Spearman 𝜌)                 -0.10                                  -0.02



   As each year a different assignment was used, we also investigate whether the prevalence of
deadlock situations of a certain type depends on the provided exercise. A Kruskal-Wallis test
can be used for ordinal data (in this case count data) to compare multiple independent samples
(students from different academic years) when there are more than 2 samples (5 academic
years). This Kruskal-Wallis test is used to determine whether the means of all the different
academic years, in which a different exercise was given for the homework, are equal or not. A
Kruskal-Wallis test works under the null hypothesis that all group means are the same. The
results in Table 2 shows all p-values to be non-significant. Furthermore, some models might be
more prone to deadlock situations, due to them consisting of more object types or more states
than others. Therefore, a correlation between the number of classes and states in the models
compared to the total number of deadlock situations in the model has been calculated too. For
this we use the Spearman correlation coefficient, as the number of classes/states and deadlock
situations are ordinal data. Also here, we see no significant correlation.
5. Discussion
To achieve a feasible algorithm in response to RQ1, two concessions had to be made regarding
exhaustive verification for deadlock: interleaving resulting from cardinalities of many is replace
by iteration by changing the maximum cardinality to one and the verification is limited to
single paths from top to bottom. In [18] a different choice was made: backward inaccessible
states are removed from parallel statechart at the cost of losing detailed information as to how
many deadlock situations are present and the in which states of the concurrent statecharts they
appear. An important limitation of the ADD-LIFE algorithm is that the made concessions affect
the completeness deadlock state detection. This means that other deadlock situations may exist
beyond those identified by the algorithm. However, due to the state explosion when doing a full
calculation, it is practically impossible to know which percentage of deadlock situations have
been identified with the proposed algorithm. A second limitation is the limited external validity,
as the research focuses on conceptual models created within a specific MDE approach based on
CSP. For MDE approaches based on a different process algebra, the algorithm can be adjusted
to take into account the different language semantics. Herein, the general idea of allowing only
a limited amount of instantiations of the object types to avoid computational issues remains.
   For RQ2, the analysis of the results reveals that almost half of the student models contain
deadlock situations. While it is possible that the prevalence of deadlock situations may vary
across academic years, the Kruskal-Wallis test results in Table 2) did not indicate any significant
differences between different years. Additionally, no correlation was found between the number
of classes or states in a model and the occurrence of deadlock situations (Table 2), which shows
that the complexity of a model, as different assignments might come at different difficulty levels,
is not per se related to the prevalence of deadlock situations. While deadlock is mentioned and
illustrated when discussing homework results, the topic is not addressed in a systematic way
during the course. Students are expected to extrapolate the examples to their own solutions and
detect them through adequate testing. However, research has found that the students’ manual
testing efforts are often below par, as evidenced by their use of an insufficient number of test
cases to assess the accuracy of requirement modelling [33]. This leads us to believe that raising
students’ awareness via automated feedback of potential deadlock situations can be of help
in reducing the prevalence of such situations, since students do not bump into these deadlock
situations when testing their models on their own.
   To enhance internal validity, student models from different academic years were included,
to verify that the prevalence of deadlock situations cannot be attributed solely to one specific
exercise or year. Again, external validity is limited by the focus on MERODE models. Further
research could check the models made by students at other universities where the tool is also
used, and investigate other modelling approaches.


6. Conclusion and future work
In this paper, we presented the ADD-LIFE (Algorithm for Deadlock Detection Leveraging
Incomplete Feedback) algorithm, which aims to automatically detect deadlock situations in
conceptual models. The algorithm calculates the global behaviour resulting from individual
concurrent statecharts and addresses state explosion and time complexity by limiting the number
of instantiations per object type and peforming a partial check only.
   To evaluate the utility of the algorithm, we applied it to student models collected over multiple
years from an MDE course. We identified four types of deadlock situations: deadlock, livelock,
deadlock imminent, and livelock imminent. No significant differences were found in the number
of deadlock situations among student solutions from different years. Additionally, there was no
correlation between the number of classes or states and the occurrence of deadlock situations
in the models. This suggests that deadlock errors are quite generally present, thus motivating
the need for algorithmic assistance for detecting them.
   Two main lines of future research can be developed from this research. First, considering
that many student models contain deadlock situations, the ADD-LIFE algorithm could be
implemented in the modelling tool to provide automatic feedback to both lecturers and students
on the presence of deadlock in their models. Students can use this feedback to learn which
modelling choices introduce deadlock situations, while lecturers can use such a tool to aid them
in highlighting difficult to find deadlock situations in student homeworks.
   The algorithm can be easily integrated with the tooling support that already exists for
the MERODE approach. Since the consistency by construction, consistency by monitoring and
consistency by analysis approach are already present in the MERLIN tool, the algorithm’s
feedback can be added to these other checks. After successful integration with the MERLIN
tool, we can investigate the impact on students learning process. This can be researched by
means of small experiments, but since the tool allows for logging students’ interaction, we can
evaluate to what extent the new feature is use and whether or not it decreases the prevalance
of deadlock situations for the student groups who make use of ADD-LIFE’s feedback.
   A second line of research could further develop the approach so that it can also be used
for other MDE approaches. For instance, according to the UML standard for class diagrams,
additional constraints can be modelled, such as preconditions. Further research could thus
extend the evaluation by applying the algorithm to models from different courses or modeling
approaches. However, it is worth noting that the algorithm may not be suitable for evaluation
in MDE approaches based solely on UML without a formal process algebra. Finally, futher
research could explore the algorithm’s applicability in real-world MDE scenarios outside of
academic settings. Overall, while the algorithm’s evaluation provides valuable insights, further
research is needed to assess its effectiveness and generalizability in diverse MDE contexts.


Acknowledgments
This paper has been funded by the ENACTEST Erasmus+ project number 101055874.


References
 [1] H. Krasner, The cost of poor software quality in the US: A 2020 report, Proc. Consortium
     Inf. Softw. QualityTM (CISQTM) (2021).
 [2] The RISKS Digest Volume 33 Issue 01, https://catless.ncl.ac.uk/Risks/33/01#subj1, 2022.
     Accessed: 2023-07-06.
 [3] S. Käss, S. Strahringer, M. Westner, Practitioners’ perceptions on the adoption of low code
     development platforms, IEEE Access 11 (2023) 29009–29034. doi:10.1109/ACCESS.2023.
     3258539 .
 [4] OCL, http://www.omg.org/spec/OCL/, 2017. Accessed: 2023-09-07.
 [5] C. Verbruggen, M. Snoeck, Practitioners’ experiences with model-driven engineering: a
     meta-review., Softw Syst Model 22 (2023) 111–129. URL: https://link.springer.com/article/
     10.1007/s10270-022-01020-1. doi:https://doi.org/10.1007/s10270- 022- 01020- 1 .
 [6] M. Estañol, M.-R. Sancho, E. Teniente, Ensuring the semantic correctness of a bauml
     artifact-centric bpm, Information and Software Technology 93 (2018) 147–162. URL:
     https://www.sciencedirect.com/science/article/pii/S0950584917301404. doi:https://doi.
     org/10.1016/j.infsof.2017.09.003 .
 [7] M. Snoeck, C. Verbruggen, J. D. Smedt, J. D. Weerdt, Supporting data-aware processes
     with merode., Softw Syst Model 23 (2023) 111–129. URL: https://link.springer.com/article/
     10.1007/s10270-023-01095-4. doi:https://doi.org/10.1007/s10270- 023- 01095- 4 .
 [8] G. Sedrakyan, S. Poelmans, M. Snoeck, Assessing the influence of feedback-inclusive rapid
     prototyping on understanding the semantics of parallel uml statecharts by novice modellers,
     Information and Software Technology 82 (2017) 159–172. URL: https://www.sciencedirect.
     com/science/article/pii/S0950584916303044. doi:https://doi.org/10.1016/j.infsof.
     2016.11.001 .
 [9] R. M. Dijkman, M. Dumas, C. Ouyang, Formal semantics and analysis of BPMN process
     models using Petri nets, Queensland University of Technology, Tech. Rep (2007) 1–30.
[10] I. Raedts, M. Petkovic, Y. S. Usenko, J. M. E. van der Werf, J. F. Groote, L. J. Somers,
     Transformation of BPMN Models for Behaviour Analysis., MSVVEIS 2007 (2007) 126–137.
[11] F. Chu, X.-L. Xie, Deadlock analysis of Petri nets using siphons and mathematical pro-
     gramming, IEEE Transactions on Robotics and Automation 13 (1997) 793–804.
[12] Y. Huang, M. Jeng, X. Xie, S. Chung, Deadlock prevention policy based on Petri nets and
     siphons, International Journal of Production Research 39 (2001) 283–305.
[13] M. Jeng, X. Xie, M. Zhou, M. Fanti, Deadlock detection and prevention of automated
     manufacturing systems using Petri nets and siphons, Deadlock resolution in computer-
     integrated systems (2005) 233–281.
[14] Y. Choi, J. L. Zhao, Decomposition-based verification of cyclic workflows, in: International
     Symposium on Automated Technology for Verification and Analysis, Springer, 2005, pp.
     84–98.
[15] H. Lin, Z. Zhao, H. Li, Z. Chen, A novel graph reduction algorithm to identify structural
     conflicts, in: Proceedings of the 35th Annual Hawaii International Conference on System
     Sciences, IEEE, 2002, pp. 10–pp.
[16] W. M. van der Aalst, A. Hirnschall, H. Verbeek, An alternative way to analyze workflow
     graphs, in: Advanced Information Systems Engineering: 14th International Conference,
     CAiSE 2002 Toronto, Canada, May 27–31, 2002 Proceedings 14, Springer, 2002, pp. 535–552.
[17] C. A. R. Hoare, Communicating sequential processes, Communications of the ACM 21
     (1978) 666–677.
[18] G. Dedene, M. Snoeck, Formal deadlock elimination in an object oriented conceptual
     schema, Data & Knowledge Engineering 15 (1995) 1–30.
[19] D. Zhang, D. Bošnački, M. van den Brand, C. Huizing, B. Jacobs, R. Kuiper, A. Wijs, Verifying
     atomicity preservation and deadlock freedom of a generic shared variable mechanism
     used in model-to-code transformations, in: Model-Driven Engineering and Software
     Development: 4th International Conference, MODELSWARD 2016, Rome, Italy, February
     19-21, 2016, Revised Selected Papers 4, Springer, 2017, pp. 249–273.
[20] S. de Putter, A. Wijs, D. Zhang, The SLCO framework for verified, model-driven con-
     struction of component software, in: Formal Aspects of Component Software: 15th
     International Conference, FACS 2018, Pohang, South Korea, October 10–12, 2018, Proceed-
     ings 15, Springer, 2018, pp. 288–296.
[21] O. Takaki, T. Seino, I. Takeuti, N. Izumi, K. Takahashi, Verification of evidence life cycles in
     workflow diagrams with passback flows, International Journal On Advances in Software 1
     (2008).
[22] R. Hull, Artifact-centric business process models: Brief survey of research results and
     challenges, in: OTM Confederated International Conferences” On the Move to Meaningful
     Internet Systems”, Springer, 2008, pp. 1152–1163.
[23] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, From model completeness to
     verification of data aware processes, Description Logic, Theory Combination, and All That:
     Essays Dedicated to Franz Baader on the Occasion of His 60th Birthday (2019) 212–239.
[24] A. Deutsch, R. Hull, V. Vianu, Automatic verification of database-centric systems, ACM
     SIGMOD Record 43 (2014) 5–17.
[25] S.-T. Huang, A distributed deadlock detection algorithm for CSP-like communication,
     ACM Transactions on Programming Languages and Systems (TOPLAS) 12 (1990) 102–122.
[26] L. Lima, A. Tavares, S. C. Nogueira, A framework for verifying deadlock and nondeter-
     minism in UML activity diagrams based on CSP, Science of Computer Programming 197
     (2020) 102497.
[27] Y. Isobe, M. Roggenbach, CSP-Prover—A proof tool for the verification of scalable concur-
     rent systems, Information and Media Technologies 5 (2010) 32–39.
[28] P. B. Ladkin, B. B. Simons, Static deadlock analysis for CSP-type communications, in:
     Responsive Computer Systems: Steps Toward Fault-Tolerant Real-Time Systems, Springer,
     1995, pp. 89–102.
[29] M. Snoeck, C. Michiels, G. Dedene, Consistency by construction: the case of MERODE,
     in: Conceptual Modeling for Novel Application Domains: ER 2003 Workshops ECOMO,
     IWCMQ, AOIS, and XSDM, Chicago, IL, USA, October 13, 2003. Proceedings 22, Springer,
     2003, pp. 105–117.
[30] M. Snoeck, G. Dedene, Existence dependency: The key to semantic integrity between struc-
     tural and behavioral aspects of object types, IEEE Transactions on software engineering
     24 (1998) 233–251.
[31] M. Snoeck, MERLIN: An Intelligent Tool for Creating Domain Models, in: Research
     Challenges in Information Science: 14th International Conference, RCIS 2020, Limassol,
     Cyprus, September 23–25, 2020, Proceedings 14, Springer, 2020, pp. 549–555.
[32] A. V. Aho, J. E. Hopcroft, J. D. Ullman, Time and tape complexity of pushdown automaton
     languages, Information and Control 13 (1968) 186–206.
[33] G. Sedrakyan, M. Snoeck, S. Poelmans, Assessing the effectiveness of feedback enabled
     simulation in teaching conceptual modeling, Computers & Education 78 (2014) 367–382.