=Paper= {{Paper |id=Vol-1293/paper5 |storemode=property |title=Reasoning on Data-Aware Business Processes with Constraint Logic |pdfUrl=https://ceur-ws.org/Vol-1293/paper5.pdf |volume=Vol-1293 |dblpUrl=https://dblp.org/rec/conf/simpda/SmithP14 }} ==Reasoning on Data-Aware Business Processes with Constraint Logic== https://ceur-ws.org/Vol-1293/paper5.pdf
       Reasoning on Data-Aware Business Processes with
                      Constraint Logic

                             Maurizio Proietti and Fabrizio Smith

    National Research Council, IASI ”Antonio Ruberti” - Via dei Taurini 19, 00185 Roma, Italy
                         {maurizio.proietti, fabrizio.smith}@iasi.cnr.it



         Abstract. We propose a framework grounded in Constraint Logic Programming
         for representing and reasoning about business processes from both the workflow
         and data perspective. In particular, our goal is twofold: (1) define a logical lan-
         guage and a formal semantics for process models where data object manipulation
         and interactions with an underlying database are explicitly represented; (2) pro-
         vide an effective inference mechanism that supports the combination of reasoning
         services dealing with process behavior and data properties. To this end we define
         a rule-based process representation coping with a relevant fragment of the popular
         BPMN modeling notation, extended with annotations that model data manipula-
         tion. The behavioral semantics of a process is defined as a state transition system
         by following an approach similar to the Fluent Calculus, and allows us to specify
         state change in terms of preconditions and effects of the enactment of activities.
         Our framework provides a wide range of reasoning services, which can be per-
         formed by using standard Constraint Logic Programming inference engines.


Keywords: Business Process, Constraints, Logic Programming, Analysis, Verification


1     Introduction

The penetration of Business Process (BP) Management solutions into production re-
alities is constantly growing, due to its potential for an effective support to enterprise
actors and business stakeholders along the entire BP life-cycle. In this frame, model-
ing languages such as BPMN [1] are largely adopted by the stakeholders (designers,
analysts, business men) to develop conceptual models to be used for the design and
reengineering of BPs. One of the main advantages of having a machine-processable
representation of BPs available is that it enables the automation of tasks dealing with
process analysis, simulation and verification.
     However, standard process-centric approaches focus on the procedural representa-
tion of a BP as a workflow graph that specifies the planned order of operations, while
the interactions of individual operations with the underlying data layer is often left im-
plicit or abstracted away. Indeed, the automated analysis issue is addressed in the work-
flow community mainly from a control flow perspective (see, for instance, the notion
of soundness [2]), and most of the tools today available aim at verifying whether the
behavior of the modeled system enforces requirements specified without considering
the data perspective.
    In order to provide an integrated account of the workflow and data modeling, several
approaches have been proposed both in industrial realities (e.g., [3, 4]), as well as in the
database (e.g., [5]) and workflow (e.g., [6]) research communities. A data-aware BP
representation explicitly models the manipulation of data objects operated by individual
tasks and their interactions with databases, with the aim of enabling the automated
analysis of behavioral properties of the resulting system.
    In this paper we propose a logic-based framework for representing and reasoning
about data-aware BP models, where the workflow perspective, specified according to
BPMN, is enriched by annotations defining preconditions and effects of individual pro-
cess elements in terms of data objects, used to store information that is read and modi-
fied by the process enactment. We also consider the existence of an underlying database,
which can be queried during the enactment, retrieving values to be used for data object
manipulation. The behavioral semantics of a process is defined as a state transition sys-
tem by following an approach derived from the Fluent Calculus [7], and by integrating
into the framework a symbolic representation of data object values, given in terms of
arithmetic constraints over the real numbers. The proposed rule-based formalization
supports a relevant fragment of BPMN in addition to expressive data modeling, and
its grounding into Constraint Logic Programming (CLP) [8] provides a uniform and
formal framework that enables automated reasoning.
    In this work we do not propose yet another business process modeling language, but
we assume a pragmatic perspective aiming at supporting process-related knowledge ex-
pressed by means of de-facto standards for BP modeling, like BPMN. To the best of our
knowledge, this is one of the first attempts to provide a formal execution semantics for
expressive BPMN workflows in the presence of data and arithmetic constraints. No-
tably, the CLP formalization directly provides an executable semantics, which enables
the implementation of analysis and verification tasks relying on established automated
reasoning methods and tools. Due to the presence of data, the state space of the modeled
systems is potentially infinite and most verification problems are undecidable. However,
the symbolic representation of data values by means of constraints achieves the termi-
nation of a number of reasoning services in many cases of practical relevance also in
the presence of an infinite state space.
    The paper is organized as follows. After presenting a motivating scenario and intro-
ducing the modeling framework in Section 2, we provide a formal account of the behav-
ioral semantics in Section 3. In Section 4 we show how automated reasoning methods
developed in the field of CLP can be directly applied to perform analysis and verifi-
cation tasks. In Section 5 we present a proof-of-concept prototype, and, finally, in the
concluding section we give a critical discussion of our approach, along with directions
for future work.


2   Modeling Data-Aware Business Processes

A data-aware Business Process Schema (DAPS) is an activity workflow model, where
each task can additionally operate on data objects used to store information that is read
and modified by process enactment. In our approach, data objects are essentially re-
garded as variables, and hence at any time during execution there is a single instance
of a given data object that may be read or (over-)written by some activity. We consider
two main types of relationships between activities and data objects. Firstly, the enact-
ment of an activity may be guarded by a condition involving a number of data objects.
Secondly, the enactment of an activity can modify the value of a data object, hence
producing a new value, possibly related to other data objects’ values by an arithmetic
constraint over the real numbers R. We also consider the presence of a database (DB)
that can be queried during process enactment, hence retrieving values to be used for data
object manipulation. For the scope of this work, we assume that DB cannot be updated
by activity executions. Furthermore, we assume to deal with BPs whose state space,
considering the control flow only, is finite. However, since the data objects can assume
infinitely many values, the state space of the overall system is in general infinite.




                   Fig. 1: Example DAPS for an eProcurement Scenario

For the representation of the workflow-related perspective, we mainly refer to the BPMN
standard [1], which will be used throughout as a reference notation. We show how a
DAPS is specified by means of the example depicted in Figure 1, which deals with the
handling of a purchase order in an eProcurement scenario.
    A DAPS consists of a set of flow elements and relations between them, and it is
associated with a unique start event (e.g., s) and a unique end event (e.g., e), which are
flow elements that represent the entry point and the exit point, respectively, of the pro-
cess. An activity is a flow element that represents a unit of work performed within the
process. It can be modeled as a task, representing an atomic activity no further decom-
posable (e.g., choose prod), or as a compound activity, representing the execution of a
sub-processes (not exemplified here). The sequencing of flow elements is specified by
the sequence flow relation (corresponding to solid arrows), and the branching/merging
of the control flow is specified by using three types of gateways: exclusive (XOR, e.g.,
x1), inclusive (OR, e.g., o1), and parallel (AND, not exemplified here).
    The item flow relation (corresponding to dotted arrows) specifies that a flow ele-
ment uses as input or manipulates as output particular data objects. In our setting, the
input and output of a flow element can be enriched by declarative descriptions of pre-
conditions and effects, respectively, formulated in terms of arithmetic constraints, value
assignments (denoted by :=), and database queries (following the | symbol). For in-
stance, the task apply coupon requires (precondition) a value of item price greater than
0, while upon its execution (effect), the value of item price is decreased by the amount
of coupon. More complex effects can be specified, as in the case of choose prod, where
the values assigned to item price and item weight are retrieved by performing the con-
junctive database query product price(X, P), weight(X,W ). Uppercase letters denote
variables, representing values not known at design-time that are introduced during the
enactment, e.g., retrieved by database queries or produced after interactions with ex-
ternal systems or users. The values that variables are allowed to assume at run-time
can be characterized in terms of arithmetic constraints, as exemplified by the payment
effect where, due to the constraint 0 < X ≤ amount, the possible values of X after each
execution of payment range from 0 to the current value of amount (representing any
admissible paid amount). Similarly to activity preconditions, guards can be attached to
outgoing sequence flows of inclusive and exclusive gateways, as in the case of o1 and
x2. Whenever a guard is not defined, a non-deterministic behavior is assumed.
    The depicted BP is started by the user log-in, which triggers the execution of the
start event leading to the initialization of the involved data objects. The user can select
a number of products, by choosing for each of them a shipment compatible with the
item weight and optionally applying a coupon, which decreases the item price. When
the amount due exceeds the plafond associated with the user, a selected product cannot
be added to the order and it is refused. In order to proceed with the shipment, the full
amount due has to be paid, possibly through several subsequent payments.


3   Formal Semantics

Now we present a formal definition of the behavioral semantics, or enactment, of a
DAPS, by following an approach derived from the Fluent Calculus, a well-known rule-
based calculus for action and change (see [7] for an introduction), which is formalized
in Logic Programming (LP). In [9] we have proposed a specialized version of the Flu-
ent Calculus, developed to specifically deal with BPs. Here, in order to cope with the
data perspective, our formalization is enhanced by using Constraint Logic Programming
(CLP) [8], which extends LP with constraints over specific domains and, in particular,
over the domain of the real numbers with the usual arithmetic operations.

3.1   Constraint Logic Programming
We will consider CLP programs with constraints over the set R of the real numbers. We
will denote variables by upper case letters X1 , X2 , . . . , while we will denote constants,
predicate and function symbols by lower case letters, a, p, f , . . . Constraints are induc-
tively defined as follows. An atomic constraint is either a formula of the form p1 ≥ p2
or a formula of the form p1 > p2 , where p1 and p2 are polynomials with real variables.
We will also use the equality ‘=’ and the inequalities ‘≤’ and ‘<’ defined in terms of
‘≥’ and ‘>’ as usual. A constraint is either true, or false, or an atomic constraint, or a
conjunction of constraints.
     An atom is an atomic formula of the form p(t1 , . . . ,tm ), where p is a predicate sym-
bol not in {≥, >} and t1 , . . . ,tm , with m ≥ 0, are terms. A literal is either an atom A or
a negated atom ¬A. A goal is a (possibly empty) conjunction of atoms. A constrained
goal c ∧ G is a conjunction of a constraint c and a goal G. A CLP program is a finite
set of rules of the form A ← c ∧ G (to be understood as “A if c and G”), where A is an
atom and c ∧ G is a constrained goal. Given a rule A ← c ∧ G, A is the head of the rule
and c ∧ G is the body of the rule. A rule with empty body is called a fact. A term or a
formula is ground if no variable occurs in it.
     Let TR denote the set of ground terms built from R and from the set of function
symbols in the language. An R-interpretation is an interpretation that: (i) has universe
TR , (ii) assigns to +, ×, >, ≥ the usual meaning in R, and (iii) is the standard Herbrand
interpretation [10] for function and predicate symbols different from +, ×, >, ≥. We
can identify an R-interpretation I with the set of ground atoms (with arguments in TR )
which are true in I. We write R |= ϕ if ϕ is true in every R-interpretation. A constraint
c is satisfiable if R |= ∃X1 , . . . , Xn .c, where X1 , . . . , Xn are all variables occurring in c.
A constraint c entails a constraint d, denoted c v d, if R |= ∀X1 , . . . , Xn .c → d, where
X1 , . . . , Xn are all variables occurring in c or d. An R-model of a CLP program P is an
R-interpretation that makes true every rule of P. Every CLP program P has a least (with
respect to set inclusion) R-model, denoted M(P).
     We consider the standard operational semantics of CLP programs based on resolu-
tion extended with constraint solving [8]. For reasons of simplicity we assume that no
negated atom appears in the body of a CLP rule. However, we consider negated atoms
in queries. A query has the form ← c ∧C, where c is a constraint and C is a conjunction
of literals. A query ← c ∧C succeeds if it is possible to derive from it, possibly in many
steps, a query of the form ← c0 ∧ true, where c0 is a satisfiable constraint and true is
the empty conjunction of literals. The constraint c0 is also called an answer to the query
← c ∧C. As usual in CLP systems, we assume the left-to-right selection strategy of lit-
erals during resolution. A query ← c ∧C (finitely) fails if the set of derivations from it is
finite and no query of the form ← c0 ∧true, where c0 is satisfiable, is derivable. The oper-
ational semantics is sound with respect to the least model semantics in the sense that, if a
query ← Q succeeds with answer c0 then M(P) |= ∀X1 , . . . , Xn .c0 → Q, where X1 , . . . , Xn
are the variables occurring in c0 → Q. If ← Q fails then M(P) |= ∀X1 , . . . , Xn .¬Q, where
X1 , . . . , Xn are the variables occurring in Q.
3.2    Data-Aware BP Schema Representation

A DAPS is formally represented by a triple hWF, DC, DBSi, where WF is a workflow
specification, DC (data constraints) is a specification of the preconditions and effects
of activities on data objects, and DBS is a database schema.
    The workflow specification WF is a set of ground facts of the form p(a1 , . . . , an ),
where a1 , . . . , an are constants denoting individual flow elements and p is a predicate
symbol representing a BPMN construct (e.g., activity, gateway, sequence flow). For in-
stance, the formal specification of the eProcurement BP in Figure 1 will contain facts
such as: bp(eProc, s, e), stating that eProc is a process starting with s and ending with
e; task(choose prod), stating that choose prod is an atomic activity within the work-
flow; exc branch(x2), stating that x2 is an exclusive branch point, i.e., a decision point;
seq(choose prod, choose ship, eProc), stating that a sequence flow relation is defined
between choose prod and choose ship in eProc.
    The data constraints DC specify the way data objects are manipulated during pro-
cess enactment, by means of the following relations:
Precondition: pre(A,C, Proc), which specifies an enabling condition C that the data
objects must satisfy to enable the execution of an activity A in the process Proc;
Effect: eff(A, E, Proc), which specifies the effect expression E describing the effect on
the data objects of executing A in the process Proc;
Guard: guard(C, B,Y, Proc), which specifies a conditional sequence flow used to select
the set of successors of decision points, where the enabling condition C must hold in
order to enable the flow element Y , successor of B in the process Proc.
Enabling conditions and effect expressions are formally defined as follows. Let DO
denote the set of data objects occurring in the DAPS. An arithmetic expression is an
expression constructed from DO, (CLP) variables, real numbers, +, and ×. A constraint
expression is an expression of the form a1 rel a2 , where a1 , a2 are arithmetic expressions
and rel ∈ {≥, >} (i.e., a constraint expression is an atomic constraint on data objects
and CLP variables). A db-query is an atom whose predicate is defined in the database
schema DBS. A data update is an expression of the form o := a, where o ∈ DO and a is
an arithmetic expression. An enabling condition is the conjunction of n ≥ 0 constraint
expressions. An update condition is either a constraint expression or a db-query. An
effect expression is a pair data-updates 8 conds, where data-updates is a sequence of
data updates and conds is a conjunction of update conditions.
    Returning to the eProcurement example of Section 2, a precondition associated with
the payment activity is:
      pre(payment, [amount > 0]), eProc)
meaning that the task payment can be executed only if the data object amount has a
positive value. In the above example and in the sequel, both sequences and conjunctions
appearing as predicate arguments are represented using lists. The specification of the
effect associated with the choose ship activity is:
      eff(choose ship, [item price := item price + P] 8
                       [ship price(S, P), max weight(S,W ),W > item weight], eProc)
meaning that the effect of the execution of the task choose ship is that the value of the
data object item price is incremented by a quantity P, where P is the price of a shipment
type S and the value of item weight is below the maximum weight allowed for S.
    The specification of the guard associated with the flow from x2 to refuse item is:
    guard([amount + item price > plafond],x2, refuse item, eProc)
meaning that the control flow can proceed from the gateway x2 to the refuse item task
only if the sum of the values of amount and item price exceeds the value of plafond.
     The database schema DBS consists of a set of predicate symbols (with arity) repre-
senting the relations stored in the database, together with a set of formulas of the form
p(X1 , . . . , Xn ) → c, where p is a predicate symbol in the schema and c is a constraint
whose variables are among X1 , . . . , Xn , representing integrity constraints (for simplicity
we do not consider more complex integrity constraints). A database instance of DBS is
a finite set of ground facts p(a1 , . . . , an ) that satisfies all formulas in DBS.

3.3   Behavioral Semantics of Data-Aware Processes
Similarly to the Fluent Calculus, we represent the state of the world as a set of fluents,
i.e., terms denoting atomic properties that hold at a given instant of time. The execu-
tion of a flow element may cause a change of state, i.e., an update of the collection of
fluents associated with it. In particular, a change of state can be determined by the ef-
fects of activities specified by DC. A fluent is represented by an expression of the form
 f (a1 , . . . , an ), where f is a fluent symbol and a1 , . . . , an are constants or variables. We
take a closed-world interpretation of states, that is, we assume that a fluent F holds in
a state S iff F ∈ S. Our set-based representation of states relies on the assumption that
the DAPS is safe, that is, during its enactment there are no concurrent executions of the
same flow element [2]. This assumption enforces that, in absence of data objects, the
set of states reachable by a given DAPS is finite.
     We will consider the following three kinds of fluents:
   – cf(E1 , E2 , Proc), which means that the flow element E1 has been executed and the
      successor flow element E2 is waiting for execution, during the enactment of the
      process Proc (cf stands for control flow);
   – en(A, Proc), which means that the activity A is being executed during the enactment
      of the process Proc (en stands for enacting).
   – val(O,V ), which means that the data object O ∈ DO has value V .
     The truth value of a fluent or update condition depends on the state where it is
evaluated. For this reason we introduce the satisfaction relation holds(C, S), which holds
if C is true in the state S, where C is either a fluent or a conjunction of update conditions.
1. holds([ ], S)
2. holds([C|Cs], S) ← holds(C, S) ∧ holds(Cs, S)
3. holds(p(X1 , . . . , Xn ), S) ← p(X1 , . . . , Xn )                  for every p declared in DBS
4. holds(F, S) ← fluent(F) ∧ F ∈ S
5. holds(val(X,V ), S) ← constant(X) ∧V = X
6. holds(val(X,V ), S) ← variable(X) ∧V = X
7. holds(A1 > A2 , S) ← V1 > V2 ∧ holds(val(A1 ,V1 ), S) ∧ holds(val(A2 ,V2 ), S)
8. holds(val(A1 +A2 ,V ), S) ← V = V1 +V2 ∧holds(val(A1 ,V1 ), S)∧holds(val(A2 ,V2 ), S)
Recall that in this paper we assume that the database does not change during process
enactment, and hence in Rule 3 the evaluation of the db-query p(X1 , . . . , Xn ) does not
depend on the state S. Rule 4 has one instance for each kind of fluents, and a particular
instance is the following rule for retrieving the value of a data object O in a state S:
holds(val(O,V ), S) ← val(O,V ) ∈ S. Rules 5 and 6 express the fact that the value of
logical variables and constants is independent of the state. Rules 7 and 8 are needed to
define the value of constraint expressions by structural induction. We have omitted the
rules for ≥ and ×, which are similar to rules 7 and 8, respectively.
    The change of state determined by the execution of an action will be formalized
by a relation result(S1 , A, S2 ), which holds if action A can be executed in state S1 lead-
ing to state S2 . For the definition of result(S1 , A, S2 ), we assume that an instance of
the database schema DBS is provided. We also assume that the execution of an activ-
ity has a beginning and a completion (although we do not associate a duration with
activity execution), while the other flow elements execute instantaneously. Thus, we
will consider two kinds of actions: begin(A) which starts the execution of an activity A,
and complete(E), which represents the completion of the execution of a flow element E
(possibly, an activity). The following auxiliary predicate will be used: update(S1 , T,U, S2 ),
which holds if S2 = (S1 − T ) ∪U, where S1 , T,U, and S2 are sets of fluents.
    Let us now present some of the rules that define the behavioral semantics of a DAPS.
The state change determined by the execution of a task is defined by the following two
rules, corresponding to the start and the completion of the task, respectively:
result(S1 , begin(A), S2 ) ← task(A) ∧ holds(cf(X, A, P), S1 ) ∧ pre(A,C, P) ∧ holds(C, S1 )
        ∧ update(S1 , {cf(X, A, P)}, {en(A, P)}, S2 )
result(S1 , complete(A), S2 ) ← task(A) ∧ holds(en(A, P), S1 ) ∧ seq(A,Y, P)
        ∧ eff(A, DU 8C, P) ∧ holds(C, S) ∧ apply(DU, S1 , S0 )
        ∧ update(S0 , {en(A, P)}, {cf(A,Y, P)}, S2 )
    The first rule states that the execution of task A is started if the control flow has
reached it ( holds(cf(X, A, P), S1 ) ) and the enabling conditions associated with it hold
in the current state ( pre(A,C, P) ∧ holds(C, S1 ) ). The successor state is obtained by
asserting that the process is enacting A ( update(S1 , {cf(X, A, P)}, {en(A, P)}, S2 ) ). The
second rule states that the execution of task A can be completed if the update con-
ditions associated with A hold in the current state ( eff(A, DU 8 C, P) ∧ holds(C, S) ).
The successor state is obtained by applying the sequence DU of data updates, hence
updating the values of the data objects ( apply(DU, S1 , S0 ) ), and moving the control
flow to the next flow element Y ( update(S0 , {en(A, P)}, {cf(A,Y, P)}, S2 ) ). The rela-
tion apply(DU, S1 , S0 ), meaning that state S0 is obtained from state S1 by performing the
sequence DU of data updates is defined as follows:
apply([ ], S, S).
apply([DU|DUs], S, T ) ← apply(DU, S, S0 ) ∧ apply(DUs, S0 , T )
apply(O := A, S, S0 ) ← holds(val(A,V ), S) ∧ update(S, {val(O, X)}, {val(O,V ), S0 })
    The following two rules formalize the state changes determined by the execution of
an exclusive branch (with a guard associated with an outgoing flow) and an exclusive
merge, respectively.
result(S1 , complete(B), S2 ) ← exc branch(B) ∧ holds(cf(X, B, P), S1 ) ∧ seq(B,Y, P)
        ∧ guard(C, B,Y, P) ∧ holds(C, S1 ) ∧ update(S1 , {cf(X, B, P)}, {cf(B,Y, P)}, S2 )
result(S1 , complete(M), S2 ) ← exc merge(M) ∧ holds(cf(A, M, P), S1 ) ∧ seq(M,Y, P)
        ∧ update(S1 , {cf(A, M, P)}, {cf(M,Y, P)}, S2 )
    Note that, in particular, in order to proceed from the exclusive branch B to the next
flow element Y , the guard C associated with the flow from B to Y should hold in the
current state ( guard(C, B,Y, P) ∧ holds(C, S1 ) ).
    The behavioral semantics of other flow elements, e.g., parallel or inclusive gate-
ways, can be formalized by rules defined in a similar style. For lack of space we omit
those rules and we refer to [9] for more details in the simpler case where data object
manipulation (which is the main contribution of this paper) is not considered1 .
    The relation r(S1 , S2 ) holds if a state S2 is immediately reachable from a state S1 ,
that is, some action A can be executed in state S1 leading to state S2 :
r(S1 , S2 ) ← result(S1 , A, S2 )
We say that a state S2 is reachable from a state S1 if there is a finite, possibly empty, se-
quence of actions from S1 to S2 , that is, reachable state(S1 , S2 ) holds, where the relation
reachable state is the reflexive-transitive closure of r.


4     Reasoning Services
The formal semantics of data-aware BP schemas introduced in Section 3 is the basis for
developing automated reasoning techniques for the analysis and verification of business
processes that manipulate data objects. A major point is that our formal semantics is
a CLP program, and hence we can directly apply automated reasoning methods and
tools developed in the field of Constrained Logic Programming to perform analysis
and verification tasks. Indeed, by using standard CLP systems we are able to provide
a framework that supports several reasoning services and, in particular, in this section
we will demonstrate some of them and their use for analyzing process enactment, for
testing process executions, and for verifying behavioral properties.
    Given a DAPS hWF, DC, DBSi and a database instance D of the schema DBS, let
KB be the CLP program consisting of: (1) the ground facts WF specifying the work-
flow, (2) the pre, eff, and guard facts in DC specifying the enabling conditions, effects,
and guards associated with the workflow, (3) the ground facts in D, and (4) the rules
(introduced in Section 3) that define the behavioral semantics of the DAPS.
    Reasoning services will be realized by evaluating queries to the CLP program KB .
One major advantage of our approach is that query evaluation relies on a symbolic rep-
resentation of states, which often avoids the actual exploration of the whole, in general
infinite, state space, by covering that space by means of a finite set of constraints. More
specifically, a symbolic state is represented as a set of the form:
    { f1 , . . . , fk , val(o1 ,V1 ), . . . , val(om ,Vm )} satisfying a constraint c on V1 , . . . ,Vm .
In the above symbolic state f1 , . . . , fk are ground cf or en fluents and val(o1 ,V1 ), . . . ,
val(om ,Vm ) are fluents that associate data objects o1 , . . . , om with their values V1 , . . . ,Vm ,
respectively. Thus, a symbolic state represents the, possibly infinite, set of concrete
 1 The semantics presented in [9] supports the definition of unstructured workflows with arbitrary

    cycles, exceptional flows, and inclusive merge points, under the safeness assumption [2].
states that satisfy the given constraint. We say that a symbolic state S with associated
constraint c is subsumed by another symbolic state T with associated constraint d, if S
is equal to T , modulo variable renaming, and c v d. We can often reduce the state space
by avoiding to consider symbolic states that are subsumed by previously visited ones.


4.1   Enactment

We model the enactment of a DAPS as an execution trace (corresponding to a plan in
the Fluent Calculus), i.e., a sequence of actions of the form [act(a1 ), . . . , act(an )] where
act is either begin or complete.
    The predicate trace(S1 , T, S2 , N) defined below holds if T is a sequence of actions
of maximum length N that leads from state S1 to state S2 :
trace(S1 , [ ], S2 , N) ← N = 0 ∧ S1 = S2
trace(S1 , [A|T], S2 , N) ← N > 0 ∧ N1 = N − 1 ∧ result(S1 , A,U) ∧ trace(U, T, S2 , N1 ).
    In the following we use the abbreviation s0Proc to denote the initial state of a process
Proc, where the start event of Proc is enabled to fire. Furthermore, we introduce the
following rule to characterize the set of final states, where the end event of Proc is
enabled to fire
holds(final(Proc), S) ← bp(Proc, Estart , Eend ) ∧ holds(en(Eend , Proc), S).
    Our framework provides two services for analyzing process enactment, namely
trace compliance and simulation.
(1) Trace compliance is the task of verifying whether an execution trace of a process
is correct with respect to a given DAPS specification. Execution traces are commonly
stored by BP management systems as process logs, representing the evolution of the
process instances that have been enacted. Formally, a correct trace T of length N of a
process Proc is a trace that leads from the initial state to the final state of Proc, that is:
ctrace(T, Proc, N) ← trace(s0Proc , T, S f , N) ∧ holds(final(Proc), S f )
    The compliance of a trace with respect to a given DAPS can then be verified by
evaluating a query of the form ← ctrace(t, p, n) with respect to program KB , where t
is a ground list of length at most n and p is a process name. It is easy to see that such
query terminates for every ground t, as the length of the second argument of the trace
predicate decreases at each recursive call. An example of correct trace related to our
running example is reported below.
[comp(s),comp(x1),comp(o1),beg(choose prod),comp(choose prod),
beg(choose ship),comp(choose ship), comp(o2),comp(x2),beg(add item),
comp(add item),comp(x3),beg(payment),comp(payment),
comp(x4),beg(shipment),comp(shipment),comp(e)]
The trace is guaranteed to be compliant also with respect to the data constraints associ-
ated with the DAPS, even if the information about the actual values of the data objects
is not explicitly represented. A more complex representation of traces that also includes
information about the data object values can easily be defined, but we omit it for reasons
of simplicity.
(2) Simulation is the task of generating execution traces that represent possible process
enactments. In order to analyze the dynamic behavior of the process in various situa-
tions, the process designer can analyze test cases where: (i) data objects are initialized
to ground values in R, and (ii) a subset of the database instance D is selected. A query
of the form ← trace(s0Proc , T, S, n), where T is a free variable, can be used to generate
the execution traces T of length not larger than n. The query terminates for every fixed
integer n, as the last argument of trace decreases at each recursive call, and for each
successful derivation from the query, the unification mechanism employed by CLP will
bind T to a ground list of actions. Similarly, to the case of trace compliance, we can eas-
ily extend our definitions so as to generate traces that also contain explicit information
about data values.

4.2   Symbolic Testing
Simulation is performed by selecting a finite set of test cases, that is, by fixing values for
initializing the data objects and taking into consideration a specific database instance.
However, the generation of test cases is not always straightforward. The mechanisms
of symbolic computation provided by CLP (notably, unification and constraint solving)
enable us to generate execution traces by only specifying constraints that those val-
ues are required to satisfy. Thus, we can initialize a data object to a value in a range,
rather than to a concrete value. Furthermore, we can exploit the integrity constraints in
the database schema DBS to perform a symbolic evaluation of a trace query without
considering a fixed database instance.
     For instance, in our eProcurement example, we can evaluate a trace query by ini-
tializing the data object coupon to a value X with 0 < X < 10, as specified in Figure 1.
Similarly, we can initialize plafond to a value X with 0 < X < 1000. Furthermore, sup-
pose that in our running example the integrity constraints relative to product price and
ship price are:
product price(X, P) → P > 5 ∧ P < 100
ship price(X, P) → P > 0 ∧ P < 15.
Then we replace the database instance D in KB by the inverse implications of these
integrity constraints, that is, by the rules:
product price(X, P) ← P > 5 ∧ P < 100
ship price(X, P) ← P > 0 ∧ P < 15.
     In general, for performing a symbolic testing task, we replace the ground facts that
constitute the database instance D in KB , by the inverse implications p(X1 , . . . , Xn ) ←
c of all integrity constraints p(X1 , . . . , Xn ) → c specified by DBS, hence deriving a
new CLP program KB 0 . Due to the least model semantics of CLP, we will have that
M(KB 0 ) |= p(X1 , . . . , Xn ) ↔ c1 ∧ . . . ∧ ck , where c1 , . . . , ck are the constraints implied by
p(X1 , . . . , Xn ) in DBS. KB 0 is an over-approximation of KB , i.e., M(KB ) ⊆ M(KB 0 ).
Then we evaluate a query of the form ← trace(s0Proc , T, S, n), for a given integer n. This
query always terminates and, if it succeeds and returns an answer T = t, then there
exists a database instance of DBS such that the DAPS hWF, DC, DBSi has t as a possi-
ble execution trace. The converse in not necessarily true, i.e., there may exist database
instances that do not generate the trace t.
      KB 0 can be used to test reachability properties of the DAPS. For instance, the reach-
ability of a deadlock state in n steps can be verified trough a query of the form:
← trace(s0Proc , T, Sd , n) ∧ ¬r(Sd , Sn )
If the query succeeds, then the DAPS can reach, for some database instance, a potential
deadlock state Sd and T is bound to a trace that leads to that state. If the query fails
then, for any database instance satisfying the given database schema DBS, no deadlock
is reachable in at most n steps.
     In our example, two potential deadlock states are reachable by taking n = 20, both
occurring when the control flow reaches payment (i.e., holds(en(payment, eProc), Sd )).
In the first case the potential deadlock is due to a negative value of amount caused
by a coupon value higher than the price of the product chosen by choose prod, which
prevents the execution of payment. Indeed, the following constraint associated with Sd
is computed as an answer to the above query:
Vamount < 0 ∧ Vitem price < 0 ∧ Vcoupon > 5
where Vo is the logical variable associated to the data object o in the state Sd through
the fluent val (i.e., holds(val(o,Vo ), Sd )). In the second case we have that the following
constraint associated with Sd is computed as another answer to the above query:
Vamount = 0 ∧ Vitem price