Studies on Declarative Process Modeling and Its
  Relation to Procedural Techniques (Extended
                    Abstract)

                            Johannes De Smedt


    The challenging task of managing business processes has become an even
more complex endeavor as companies are required to sustain flexible and agile
business practices. Business Process Management (BPM) research has proposed
numerous ways of accommodating this need for flexibility. The declarative
process modeling paradigm, with its shift towards a constraint-based way of
approaching process behavior, is a prime example. By capturing a process’
behavior loosely, demarcating what is and is not allowed, the
models produced by adhering to the paradigm are often more concise than their
procedural counterparts. The downside of numerous languages covered by this
paradigm, however, is the complex nature of their constraints, as well as their
interactions. Furthermore, declarative process modeling might not offer a single
silver-bullet approach to attaining flexibility, as it often makes delineating
structured process behavior harder. Rather, an intermediate solution in the form
of a mixed-paradigm model can provide a fitting answer that joins the best of
both worlds in areas with multiple layers of flexibility.
    This thesis offers numerous solutions in the form of approaches, frameworks,
and implementations to overcome these impediments and improve the usability of
declarative process modeling. It also compares both paradigms and studies
mixed-paradigm process models in depth. It does so in three major parts.

    In the first part, the backdrop in terms of literature and formalisms is pro-
vided. Chapter 2 reports a comprehensive literature study that encompasses all
works related to the declarative process modeling paradigm. It positions each
of the works along the business process management lifecycle, hence assessing
the maturity of this subfield of business process management. All papers
published on the topic in the last 10 years were reviewed and classified
according to the lifecycle phase they address. Findings suggest a strong skew
towards the discovery, implementation, and monitoring phases. This is mainly due
to the formal nature of many papers, which focus on execution semantics, formal
verification, and automated process discovery from event logs.
    The key takeaways are summarized in a roadmap for future declarative process
modeling literature. First of all, it is suggested that a framework for
comparing existing approaches on scalability, applicability, and functionality
can boost
the focus of the research area. While one of the key benefits of using declarative
process modeling is achieving flexibility, it remains to be seen how and how well
each approach is capable of doing so. For example, a study on how Declare, with
its larger but unwieldy constraint base, can compete with the more focused DCR
Graphs approach in a real-life setting would offer more insight into how to
advance the control-flow perspective. On the other hand, the integration of
artifact-centric approaches delivers yet another view that focuses on
integrating the lifecycle of objects. Secondly, modeling guidelines can make a
welcome addition to the field. Many studies have devoted their attention to the
understandability of declarative process models, often with mixed to negative
results. Modeling guidelines that are tailored towards the characteristics of
declarative process models might alleviate this impediment, and they are also an
important aspect of Part 2 of this thesis. Thirdly, given the recent gain in
popularity of decision modeling approaches, which are also declarative by
nature, it remains to be seen how control flow, data, decisions, and so on can
be united in a fully declarative solution in a process context.
    In Chapter 3, the formal layer of the thesis is provided. The concepts of
activities, constraints, and models are introduced, along with the execution
semantics of Petri nets and of the widely used Declare language in regular
expressions, LTL, and R/I-nets. In the remainder of the work, Declare will be
used to illustrate the concepts that are studied and proposed.
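
    To make the regular-expression view of Declare concrete, the following is a
minimal sketch and not the formalization used in the thesis. It assumes that
every activity is encoded as a single character and that a trace is a string of
such characters. The helper names response, precedence, and satisfies are
illustrative. The two patterns follow one common way of writing the Response
and Precedence templates as regular expressions.

```python
import re


def response(a, b):
    """Response(a, b): every occurrence of a is eventually followed by b."""
    a, b = re.escape(a), re.escape(b)
    return re.compile(f"[^{a}]*({a}.*{b})*[^{a}]*")


def precedence(a, b):
    """Precedence(a, b): b may only occur if a has occurred before it."""
    a, b = re.escape(a), re.escape(b)
    return re.compile(f"[^{b}]*({a}.*{b})*[^{b}]*")


def satisfies(trace, constraint):
    """A trace satisfies a constraint if the whole trace matches its pattern."""
    return constraint.fullmatch(trace) is not None


if __name__ == "__main__":
    for trace in ["abcab", "aba", "cab", "bab"]:
        print(trace,
              satisfies(trace, response("a", "b")),
              satisfies(trace, precedence("a", "b")))
```

    A model is then satisfied by a trace when the trace matches the patterns of
all of its constraints, which corresponds to intersecting the regular languages.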

    In the second part of the thesis, the focus is shifted towards the
understandability and usability of constraint-based declarative process models.
Since declarative process models are often based on a vast number of interacting
constraints, it becomes hard for modelers and users to grasp their interplay.
Since the patterns also vary widely in nature, the effect each construct has on
another is often obscured when creating bigger models. In particular, the
problem of hidden dependencies, i.e., interactions between constraints defined
over the same set of activities that result in behavioral outcomes not covered
by the separate semantics of the constraints themselves, poses a significant
threat to inexperienced and even experienced process modelers. Therefore,
in Chapter 4, an approach is devised that uncovers the interplay of Declare
constraints.
    Based on different types of constraints, it is possible to construct a
recursive backward- and forward-searching procedure to establish so-called
dependency structures. These structures visualize how constraints are related
within a model, and which violations and resolutions affect the other
constraints. The approach is based on propagating the upper and lower bounds of
the number of occurrences of each of the activities. Indeed, it is shown that
the hidden dependencies stem from the constraints that impose an upper bound on
activities, which consequently affect the activities connected to them.
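
    The idea of propagating occurrence bounds can be illustrated with a small
sketch. It is a simplification under stated assumptions and not the procedure
of Chapter 4: only three kinds of constraints are considered (a minimum number
of occurrences, a maximum number of occurrences, and responded existence), and
the function name propagate_bounds and its parameters are hypothetical.

```python
import math


def propagate_bounds(activities, at_least, at_most, responded_existence):
    """Propagate occurrence bounds until nothing changes.

    at_least / at_most map an activity to its minimum / maximum number of
    occurrences. responded_existence is a list of pairs (a, b) meaning
    "if a occurs, b has to occur as well".
    """
    lower = {a: at_least.get(a, 0) for a in activities}
    upper = {a: at_most.get(a, math.inf) for a in activities}

    changed = True
    while changed:
        changed = False
        for a, b in responded_existence:
            # Forward: a is mandatory, hence b becomes mandatory too.
            if lower[a] >= 1 and lower[b] < 1:
                lower[b] = 1
                changed = True
            # Backward: b can never occur, hence a can never occur either.
            if upper[b] == 0 and upper[a] != 0:
                upper[a] = 0
                changed = True

    # An activity whose lower bound exceeds its upper bound marks a conflict.
    conflicts = sorted(a for a in activities if lower[a] > upper[a])
    return lower, upper, conflicts


if __name__ == "__main__":
    # Forbidding c silently forbids b and a: a hidden dependency.
    print(propagate_bounds(["a", "b", "c"], {}, {"c": 0},
                           [("a", "b"), ("b", "c")]))
```

    In the usage example, no constraint mentions an upper bound for a, yet the
upper bound on c travels backwards over both responded existence constraints.
This is the kind of effect that the dependency structures make explicit.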
    In Chapter 5, the applications of revealing the dependencies are discussed.
Making the hidden dependencies explicit in dependency structures allows for
the construction of an extra annotation layer of declarative process models in
the form of textual descriptions of the dependencies. This was implemented
in the Declare Execution Environment and used in a user experiment with a
considerable number of students. It was tested whether this extra layer and its
descriptions can aid novice modelers and users in understanding the behavior of
declarative process models expressed in Declare. In addition, the annotation of
the enabledness of activities and the violation status of constraints was
tested. Results show that every layer, compared to a model with only the typical
graphical notation of Declare and no information other than the separate
constraint descriptions, significantly relieves mental effort as measured in the
scores of the users, as well as the time needed to answer the questions
regarding the models. In particular, the fully annotated models containing
descriptions and dependency structures of the hidden dependencies enable novice
users to read and explain the behavior that is present in the models.
    Following the user study, it is investigated how the insights on dependencies
can be used towards quantifying the complexity of declarative, constraint-based
process models. Traditionally, such measures in BPM focus on routing con-
structs and the number of paths that can be traversed in the process, i.e., the
size of the language. In declarative models, however, there is little
correspondence between the number of constraints and the number of paths,
because conjoining different constraints can lead to either an increase or a
decrease in the size of the language. By using the hierarchy that is inherent to
the dependency structures for declarative process models, it is possible to
follow a different route that is based on the fan-in, fan-out principle of
tracking and tracing and that also adheres to the open-world assumption of
declarative process models. The proposed complexity metric increases when there
are more, denser, and more convoluted dependency structures present, reflecting
the difficult interactions that take place between the constraints.
    Finally, Chapter 5 concludes with a preliminary approach to restructure
declarative process models according to the dependency structures. By focusing
on the constraints that cause hidden dependencies, and hence the structures, it
is possible to derive different stages of the model. This allows the model to be
split and presented without hidden dependencies altogether.

    In the third part, the comparison with the procedural process modeling
paradigm is made, both for manual modeling and for automated process discovery
and verification. In Chapter 6, an approach is devised for constructing
mixed-paradigm process models with intertwined state spaces, i.e., models
consisting of both procedural and declarative process modeling constructs that
share the same activities and are not just a collection of atomic subprocesses.
    First, the different types of mixed-paradigm models are discussed. Such
models can either incorporate declarative constraints for a certain smaller part
of the behavior to increase flexibility, or for an extensive part of the model
where declarative constructs can more concisely express flexible behavior.
Furthermore, declarative constraints do not only add flexibility, as they can
also be especially restrictive. Hence, they can also tie down the behavior
considerably with, e.g., constraints determining cardinalities for activities
and constraints imposing chaining behavior on them.
    Next, all constraints in the Declare template base are reviewed for their
impact on global concurrency, the timing and violation of constraints, and their
effect on permanently disabling activities. This is summarized in a scoring
table that can be used to quantify to what extent they have an impact on a
process model, and hence to what extent they might interfere with a procedural
counterpart of a model.
    Finally, the chapter concludes with a step-wise approach towards mixed-
paradigm process modeling. Four steps are put forward for users to construct
better models. The approach is illustrated on a well-known example in the BPM
sphere to display its usefulness and effect.
    Chapter 7 starts from the same evaluation of how mixed-paradigm models
can be useful, but additionally discusses the relation to the behavior in an
event log. Again, declarative process constructs can be incorporated into
procedural models either to be more restrictive, aiming for an even more precise
result, and/or to cover more behavior in the log, as they are better capable of
capturing the flexible behavior.
    The insights are condensed into the Fusion Miner framework, which offers
an approach for mining mixed-paradigm models by using various procedural
and declarative process mining algorithms. Its main idea starts from classifying
activities as either better suited to a procedural model, i.e., easy to capture
within a structured process sequence, or better suited to a declarative model,
i.e., having a place in a sequence that is hard to express because of looping
behavior, random occurrence, or duplication. After this classification, each set
of activities is mined with its respective paradigm. A user-defined parameter,
the entropy level, controls the sensitivity towards regarding certain behavior
as being fit for mining with declarative constraints. The two sets do overlap
nonetheless, and it is illustrated how the declarative part of the model is
better capable of explaining the more volatile behavior generated by flexible
processes.
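
    The classification step can be pictured with the following sketch. The
entropy measure shown here is an assumption made for illustration: it takes the
Shannon entropy of the distribution of direct successors of each activity in
the log, which may differ from the measure used by Fusion Miner. The names
successor_entropy, classify, and threshold are hypothetical, and a log is
assumed to be a list of traces, each a list of activity names.

```python
import math
from collections import Counter, defaultdict

END = "<end>"  # artificial marker for the end of a trace


def successor_entropy(log):
    """Shannon entropy (in bits) of the direct successors of each activity."""
    followers = defaultdict(Counter)
    for trace in log:
        for current, nxt in zip(trace, list(trace[1:]) + [END]):
            followers[current][nxt] += 1

    entropy = {}
    for activity, counts in followers.items():
        total = sum(counts.values())
        entropy[activity] = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
    return entropy


def classify(log, threshold):
    """Split activities into a procedural and a declarative candidate set."""
    entropy = successor_entropy(log)
    procedural = {a for a, h in entropy.items() if h <= threshold}
    declarative = set(entropy) - procedural
    return procedural, declarative


if __name__ == "__main__":
    log = [
        ["a", "b", "c"],
        ["a", "b", "x", "c"],
        ["a", "x", "b", "c"],
        ["a", "b", "c", "x"],
    ]
    print(classify(log, threshold=1.0))
```

    Lowering the threshold moves more activities into the declarative set, which
mirrors the role of the entropy level as a sensitivity parameter. Each set
would then be handed to a miner of the corresponding paradigm.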
    The approach is instantiated in this chapter as a combination of Heuristics
Miner and Declare Miner, and is shown to mine better models in terms of fitness
and precision.
    Chapter 8 concludes the part on mixed-paradigm modeling by providing a model
verification and conformance checking approach. The former can be done by
constructing the state space of the procedural part of the process, and
subsequently conjoining the separate declarative constraints. Ultimately, either
a global automaton is created containing all behavior that is modeled, or a
trade-off has to be achieved that preferably includes as many constraints in the
model as possible. Also, an improved version of the Fusion Miner implementation
is introduced, FusionMINERful, which uses both Inductive Miner and MINERful. The
algorithm is also capable of finding the right level of entropy itself. Starting
off by trying to build a reachability graph for the Petri net, it traverses the
entropy spectrum when no such construct can be obtained, i.e., the share of the
declarative activities in the overall model is increased. In case no
deterministic procedural model can be built from the behavior witnessed in the
event log, a fully declarative model is returned.
    To evaluate and compare procedural, declarative, and mixed-paradigm ap-
proaches, an alignment-based conformance checking technique was implemented
that uses the global automaton of the model checking approach to replay traces.
This allows for calculating fitness, precision, and generalization for the full spec-
trum of entropy.
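
    The global automaton mentioned above can be pictured as the product of the
automaton of the procedural part and the automaton of each declarative
constraint. The sketch below is a simplification under stated assumptions: both
parts are given directly as small deterministic finite automata, and traces are
replayed with a plain accept/reject check rather than with alignments. The
names DFA, product, accepts, PROCEDURAL, and RESPONSE_B_C are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    start: object
    accepting: frozenset
    transitions: dict  # (state, activity) -> state; missing entry = rejected


def product(left, right, alphabet):
    """Build the reachable part of the product (intersection) automaton."""
    start = (left.start, right.start)
    transitions, accepting = {}, set()
    seen, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        s1, s2 = state
        if s1 in left.accepting and s2 in right.accepting:
            accepting.add(state)
        for activity in alphabet:
            t1 = left.transitions.get((s1, activity))
            t2 = right.transitions.get((s2, activity))
            if t1 is None or t2 is None:
                continue  # one of the two parts forbids this activity here
            target = (t1, t2)
            transitions[(state, activity)] = target
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return DFA(start, frozenset(accepting), transitions)


def accepts(dfa, trace):
    """Replay a trace and report whether it ends in an accepting state."""
    state = dfa.start
    for activity in trace:
        state = dfa.transitions.get((state, activity))
        if state is None:
            return False
    return state in dfa.accepting


ALPHABET = ["a", "b", "c", "d"]

# Procedural part: a, then any number of b or c, then d.
PROCEDURAL = DFA(0, frozenset({2}), {
    (0, "a"): 1, (1, "b"): 1, (1, "c"): 1, (1, "d"): 2,
})

# Declarative part: Response(b, c), i.e., every b is eventually followed by c.
RESPONSE_B_C = DFA("ok", frozenset({"ok"}), {
    ("ok", "a"): "ok", ("ok", "b"): "wait", ("ok", "c"): "ok",
    ("ok", "d"): "ok", ("wait", "a"): "wait", ("wait", "b"): "wait",
    ("wait", "c"): "ok", ("wait", "d"): "wait",
})

GLOBAL = product(PROCEDURAL, RESPONSE_B_C, ALPHABET)
```

    In this sketch, an empty set of accepting states in GLOBAL would indicate
that the procedural part and the constraint cannot be satisfied together. A
trace such as a, b, d is accepted by the procedural part alone but rejected by
GLOBAL, because the pending b is never followed by c.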
    Experiments were done on two synthetic event logs and one real-life event
log. Results confirm the intuition that procedural process models are better
capable of representing more structured data than declarative models, and vice
versa. However, the gap in terms of performance is not considerable.
Furthermore, mixed-paradigm approaches are often an equally interesting, if not
better, solution, without needing many more constructs. Although the evaluation
of model understandability remains subjective, the models produced by the
algorithm seem to represent the behavior in an interesting way. Finally, the
self-learning capabilities of the entropy level render FusionMINERful a good
tool to scrutinize the level of flexibility in an event log.

    Finally, the last part provides an outlook for future work. Every part has
an extensive line of further investigation that can be pursued. The topic of un-
derstandability can be further enriched by empirically validating the complexity
metric. Also, a mature approach to restructure declarative process models along
the phases can provide models that are easier to integrate with procedural
models. The mixed-paradigm conformance checking study might benefit from
benchmarking all procedural and declarative process models and scoring them
along the conformance dimensions to find the best mining solution overall. These
are mere examples, and the range of future work is considerable.
    To summarize, the thesis provides a plethora of techniques, approaches, and
frameworks for improving the usability and understandability of constraint-
based declarative process models and their integration with procedural models.

