=Paper=
{{Paper
|id=Vol-1882/paper14
|storemode=property
|title=Practical Verification of Hierarchical Artifact Systems
|pdfUrl=https://ceur-ws.org/Vol-1882/paper14.pdf
|volume=Vol-1882
|authors=Yuliang Li
|dblpUrl=https://dblp.org/rec/conf/vldb/Li17
}}
==Practical Verification of Hierarchical Artifact Systems==
Yuliang Li
Co-Advised by Alin Deutsch and Victor Vianu
UC San Diego, La Jolla, California
yul206@eng.ucsd.edu

ABSTRACT

Data-driven workflows, of which IBM's Business Artifacts are a prime exponent, have been successfully deployed in practice, adopted in industrial standards, and have spawned a rich body of research in academia, focused primarily on static analysis. The present research bridges the gap between the theory and practice of artifact verification by studying the implementation of a full-fledged and efficient artifact verifier for a variant of the Hierarchical Artifact System (HAS) model presented in [9]. With a family of specialized optimizations to the classic Karp-Miller algorithm, our verifier performs >10x faster than a nontrivial Spin-based baseline on real-world workflows and scales to large synthetic workflows.

1. INTRODUCTION

The past decade has witnessed the evolution of workflow specification frameworks from the traditional process-centric approach towards data-awareness. Process-centric formalisms focus on control flow while under-specifying the underlying data and its manipulation by the process tasks, often abstracting them away completely. In contrast, data-aware formalisms treat data as first-class citizens. A notable exponent of this class is IBM's business artifact model pioneered in [14], successfully deployed in practice [3, 5, 18] and adopted in industrial standards.

In a nutshell, business artifacts (or simply "artifacts") model key business-relevant entities, which are updated by a set of services that implement business process tasks, specified declaratively by pre- and post-conditions. A collection of artifacts and services is called an artifact system. IBM has developed several variants of artifacts, of which the most recent is Guard-Stage-Milestone (GSM) [7, 11]. The GSM approach provides rich structuring mechanisms for services, including parallelism, concurrency and hierarchy, and has been incorporated in the OMG standard for Case Management Model and Notation (CMMN) [15, 12].

Artifact systems deployed in industrial settings typically specify complex workflows prone to costly bugs, whence the need for verification of critical properties. Over the past few years, the verification problem for artifact systems has been intensively studied. Rather than relying on general-purpose software verification tools, which suffer from well-known limitations, the research community has focused on identifying practically relevant classes of artifact systems and properties for which fully automatic verification is possible. This is an ambitious goal, since artifacts are infinite-state systems due to the presence of unbounded data. Along this line, decidability and complexity results were shown for different versions of the verification problem and various levels of expressiveness of the artifact models, as reviewed in the next section.

The project described in this paper bridges the gap between the theory and practice of artifact verification by providing the first implementation of a full-fledged and efficient artifact verifier. The artifact model we use is a variant of the Hierarchical Artifact System (HAS) model of [9], which captures core elements of IBM's GSM model. Rather than building on top of an existing program verification tool such as Spin, which we have shown to have strong limitations, we implemented our verifier from scratch. The implementation is based on the classic Karp-Miller algorithm [16], with a family of specialized optimizations to boost performance. The experimental results show that our verifier performs an order of magnitude faster than a baseline implementation using Spin [10] on specifications based on real-world BPMN workflows [2], and scales well on large synthetic workflows. To the best of our knowledge, our artifact verifier is the first implementation with full support for unbounded data.

Proceedings of the VLDB 2017 PhD Workshop, August 28, 2017, Munich, Germany. Copyright (c) 2017 for this paper by its authors. Copying permitted for private and academic purposes.

2. BACKGROUND AND RELATED WORK

[8, 6] studied the verification problem for a bare-bones variant of artifact systems, in which each artifact consists of a flat tuple of evolving values and the services are specified by simple pre- and post-conditions on the artifact and database. The verification problem was to check statically whether all runs of an artifact system satisfy desirable properties expressed in LTL-FO, an extension of linear-time temporal logic in which propositions are interpreted as existential first-order logic sentences on the database and the current artifact tuple. In order to deal with the resulting infinite-state system, a symbolic approach was developed in [8] that allows a reduction to finite-state model checking, yielding a PSPACE verification algorithm for the simplest variant of the model (no database dependencies and an uninterpreted data domain). In [6] the approach was extended to allow for database dependencies and numeric data testable by arithmetic constraints.

In our previous work [9], we made significant progress on several fronts. We introduced the HAS model, a much richer and more realistic model abstracting the core elements of the GSM model. The model features task hierarchy, concurrency, and richer artifact data (including updatable artifact relations). In more detail, a HAS consists of a database and a hierarchy (rooted tree) of tasks.
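As a rough illustration (a sketch in Python; all class and field names are our own, not from [9]), a HAS can be represented as a database schema plus a rooted tree of tasks, each carrying artifact variables, an updatable artifact relation, and a set of services:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Service:
    name: str
    pre: str   # pre-condition on the database and local data (kept as text here)
    post: str  # post-condition constraining the next artifact tuple

@dataclass
class Task:
    name: str
    variables: list[str] = field(default_factory=list)    # tuple of artifact variables
    artifact_relation: Optional[str] = None               # updatable artifact relation
    services: list[Service] = field(default_factory=list)
    children: list["Task"] = field(default_factory=list)  # invocable child tasks

@dataclass
class HAS:
    """A HAS: a database schema plus a rooted tree of tasks."""
    schema: dict[str, list[str]]  # relation name -> attribute names
    root: Task
```

This is only a structural skeleton; the semantics of services and task invocation are described in the text.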
Each task has associated to it local evolving data, consisting of a tuple of artifact variables and an updatable artifact relation. It also has an associated set of services. Each application of a service is guarded by a pre-condition on the database and local data, and causes an update of the data, specified by a post-condition (constraining the next artifact tuple) and an insertion or retrieval of a tuple from the artifact relation. In addition, a task may invoke a child task with a tuple of input parameters, and receive back a result if the child task completes. To express properties of HAS we introduced hierarchical LTL-FO (HLTL-FO), which is similar to LTL-FO but adapted to the hierarchy. The main results of [9] establish the complexity of checking HLTL-FO properties for various classes of HAS, highlighting the impact of the various features on verification.

3. MODEL AND EXAMPLE

The artifact model used in our implementation is a variant of the HAS model of [9], denoted HAS*, which differs from the HAS model of [9] in two respects. On one hand, it restricts HAS by disallowing arithmetic in service pre- and post-conditions, and requires the underlying database schema to use an acyclic set of foreign keys, as in the widely used Star (or Snowflake) schemas [17]. On the other hand, HAS* extends HAS by allowing an arbitrary number of artifact relations in each task, arbitrary variable propagation, and more flexible interactions between tasks. As shown by our real-life examples, HAS* is powerful enough to model a wide variety of business processes, and so is a good vehicle for studying the implementation of a verifier. Moreover, despite the extensions, the complexity of verifying HLTL-FO properties of HAS* can be shown to remain EXPSPACE, by adapting the techniques of [9].

We illustrate the HAS* model with a simplified example of an order fulfillment business process based on a real-world BPMN workflow. The workflow allows customers to place orders and suppliers to process the orders. It has the following database schema:

• CUSTOMERS(id, name, address, record)
• ITEMS(id, item_name, price, in_stock)
• CREDIT_RECORD(id, status)

In the schema, the id's are key attributes, and record is a foreign key referencing CREDIT_RECORD. The CUSTOMERS table contains basic customer information and CREDIT_RECORD provides each customer's credit rating. The ITEMS table contains information on all the items. The artifact system has 4 tasks: T1: ProcessOrders, T2: TakeOrder, T3: CheckCredit and T4: ShipItem, which form the hierarchy in Figure 1.

Figure 1: Tasks Hierarchy (T1: ProcessOrders at the root, with children T2: TakeOrder, T3: CheckCredit and T4: ShipItem)

Intuitively, the root task ProcessOrders serves as a global coordinator which manages a pool of orders, and the child tasks TakeOrder, CheckCredit and ShipItem implement the 3 processing stages of an order. At each point in a run, ProcessOrders nondeterministically picks an order from its pool, triggers one processing stage, and places it back into the pool upon completion.

ProcessOrders: The task has the artifact variables cust_id, item_id and status, which store basic information of an order. It also has an artifact relation ORDERS(cust_id, item_id, status) storing the orders to be processed. The task has 3 internal services: Initialize, StoreOrder and RetrieveOrder. Intuitively, Initialize creates a new order with cust_id = item_id = null. When RetrieveOrder is called, an order is non-deterministically chosen and removed from ORDERS for processing, and (cust_id, item_id, status) is set to the chosen tuple. When StoreOrder is called, the current order (cust_id, item_id, status) is inserted into ORDERS. The latter two services are specified as follows (each specification consists of a pre-condition, a post-condition, and an update to the ORDERS artifact relation):

StoreOrder:
  Pre:    cust_id ≠ null ∧ item_id ≠ null ∧ status ≠ "Failed"
  Post:   cust_id = null ∧ item_id = null ∧ status = "Init"
  Update: {+ORDERS(cust_id, item_id, status)}

RetrieveOrder:
  Pre:    cust_id = null ∧ item_id = null
  Post:   true
  Update: {-ORDERS(cust_id, item_id, status)}

TakeOrder: When this task is called, the customer enters the information of the order (cust_id and item_id), and the status of the order is initialized to "OrderPlaced". The task has cust_id, record and status as variables, all of which are return variables to the parent task. Two services, EnterCustomer and EnterItem, allow the customer to enter her information and the item's information; the CUSTOMERS and ITEMS tables are queried to obtain the customer ID and item ID. These two services can be called multiple times, allowing the customer to modify previously entered data. The task's termination condition is cust_id ≠ null ∧ item_id ≠ null, at which time its variables are returned to its parent task ProcessOrders.

CheckCredit: This task checks the financial record of a customer and decides whether the supplier will go ahead with the sale. It is called when status = "OrderPlaced". It has artifact variables cust_id (an input variable), record and status. When the credit record is good, status is set to "Passed", and otherwise to "Failed". After status is set, the task terminates and returns status to the parent task. The task has a single service Check performing the credit check:

Check:
  Pre:  true
  Post: ∃n ∃a CUSTOMERS(cust_id, n, a, record) ∧
        (CREDIT_RECORD(record, "Good") → status = "Passed") ∧
        (¬CREDIT_RECORD(record, "Good") → status = "Failed")

Note that in a service we can also specify a set of propagated variables, whose values stay unchanged when the service is applied. In Check, only cust_id is a propagated variable; the others are assigned new values.

ShipItem: This task checks whether the desired item is in stock by looking up item_id in the ITEMS table to see whether the in_stock attribute equals "Yes". If so, the item is shipped to the customer (status is set to "Shipped"); otherwise the order fails (status is set to "Failed"). This task is specified similarly to CheckCredit (details omitted).
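To make the pre/post/update semantics concrete, here is a toy Python rendering of the StoreOrder and RetrieveOrder services specified earlier (our own illustrative code, not the verifier's actual input syntax): null is rendered as None, and the ORDERS artifact relation as a set of tuples.

```python
import random

def store_order(vars, orders):
    # Pre: cust_id != null and item_id != null and status != "Failed"
    assert vars["cust_id"] is not None and vars["item_id"] is not None
    assert vars["status"] != "Failed"
    # Update: +ORDERS(cust_id, item_id, status)
    orders.add((vars["cust_id"], vars["item_id"], vars["status"]))
    # Post: cust_id = null, item_id = null, status = "Init"
    return {"cust_id": None, "item_id": None, "status": "Init"}

def retrieve_order(vars, orders):
    # Pre: cust_id = null and item_id = null
    assert vars["cust_id"] is None and vars["item_id"] is None
    # Nondeterministically choose an order; Update: -ORDERS(cust_id, item_id, status)
    cust, item, status = random.choice(sorted(orders))
    orders.remove((cust, item, status))
    # Post: true (the chosen tuple becomes the current artifact tuple)
    return {"cust_id": cust, "item_id": item, "status": status}
```

In the actual model the nondeterministic choice ranges over all tuples of the artifact relation; `random.choice` merely simulates one possible run.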
Properties of HAS* can be specified in LTL-FO. In the above workflow, we can specify a temporal property saying "If an order is taken and the ordered item is out of stock, then the item must be restocked before it is shipped." It can be written in LTL-FO as:

∀i G((EnterItem ∧ item_id = i ∧ in_stock = "No") → (¬(ShipItem ∧ item_id = i) U (Restock ∧ item_id = i)))

4. VERIFIER IMPLEMENTATION

Although decidability of verification was shown in [9], a naive implementation of the EXPSPACE algorithm outlined there would be wholly impractical. Instead, our implementation brings to bear a battery of optimization techniques crucial to performance. The approach of [9] is based on developing a symbolic representation of the runs of a HAS*, in which each snapshot is summarized by:

(i) the isomorphism type of the artifact variables, describing symbolically the structure of the portion of the database reachable from the variables by navigating foreign keys;

(ii) for each artifact relation and isomorphism type, the number of tuples in the relation that share that isomorphism type.

The heart of the proof in [9] is showing that it is sufficient to verify symbolic runs rather than actual runs. Observe that because of (ii), the symbolic representation is not finite-state: (ii) requires maintaining a set of counters, which can grow unboundedly. The verification algorithm therefore relies on a reduction to state reachability in Vector Addition Systems with States (VASS) [4]. A VASS is a finite-state machine augmented with positive counters that can be incremented and decremented (but not tested for zero), and is essentially equivalent to a Petri net.
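As a simplified illustration of this machinery, the following sketch (ours; a full VASS also carries control states, and the actual verifier works over symbolic states) performs a Karp-Miller-style coverability search over counter vectors, abstracting unbounded counter growth with the symbol omega:

```python
OMEGA = float("inf")  # stands for omega: "arbitrarily large"

def karp_miller(initial, transitions):
    """initial: tuple of counter values; transitions: list of delta tuples.
    Returns the set of (generalized) markings found by the search."""
    seen = set()
    stack = [(initial, [])]  # (marking, ancestors on the current path)
    while stack:
        m, path = stack.pop()
        if m in seen:
            continue
        seen.add(m)
        for delta in transitions:
            n = tuple(c + d for c, d in zip(m, delta))
            if any(c < 0 for c in n):
                continue  # transition not enabled (a counter would go negative)
            # Acceleration: if n dominates an ancestor, the growing counters
            # can be pumped arbitrarily, so they are replaced by omega.
            for a in path + [m]:
                if n != a and all(x >= y for x, y in zip(n, a)):
                    n = tuple(OMEGA if x > y else x for x, y in zip(n, a))
            stack.append((n, path + [m]))
    return seen
```

The omega-acceleration is what makes the search terminate despite the counters being unbounded.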
A direct implementation of the above algorithm is impractical, because the resulting VASS can have exponentially many states and counters in the input size, and state-of-the-art VASS tools can only handle a small number of counters (<100) [1]. To mitigate the inefficiency, our implementation never generates the whole VASS but instead lazily computes the symbolic representations on the fly. Thus, it only generates reachable symbolic states, whose number is usually much smaller. In addition, isomorphism types in the symbolic representation are replaced by partial isomorphism types, which store only the subset of constraints on the variables imposed by the current run, leaving the rest unspecified. This representation is not only more compact, but also results in an exponentially smaller search space. Our verifier then performs the (repeated) state reachability search using the classic Karp-Miller algorithm [16], with three specialized optimizations to further accelerate the search. We discuss these next.

State Pruning: The classic Karp-Miller algorithm is well known to be inefficient, and pruning is a standard way to improve its performance [16]. We introduce a new pruning technique which can be viewed as a generalization of the strategies in [16]. The high-level idea is that when a new state I is found, if there exists a reached state I′ such that all states reachable from I are also reachable from I′, then we can stop exploring I immediately. In this case, we call I′ a superstate of I, and I a substate. Similarly, if there exists a reached state I′ which is a substate of I, then we can prune I′ and its successors. Compared to [16], our pruning is more aggressive, resulting in a much smaller search space. As shown in Section 5, the performance of the verifier is significantly improved.
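Restricted to the counter component of symbolic states, the substate/superstate relation can be sketched as componentwise domination (a simplification in Python; the actual check also compares the associated partial isomorphism types):

```python
def covers(sup, sub):
    """A reached counter vector `sup` covers a candidate `sub` when it is
    componentwise at least as large: everything reachable from `sub` is
    then also reachable from `sup`."""
    return all(x >= y for x, y in zip(sup, sub))

def prune(frontier, new_state):
    """frontier: list of counter vectors already reached."""
    if any(covers(old, new_state) for old in frontier):
        return frontier  # new_state is a substate: stop exploring it
    # Drop previously reached states that new_state covers, with their successors.
    kept = [old for old in frontier if not covers(new_state, old)]
    return kept + [new_state]
```

In the verifier, this test is the inner loop of the search, which is why the index structures described next matter.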
Data Structure Support: When the above optimization is applied, a frequent operation during the search is to find substates and superstates of a given candidate state in the current set of reached symbolic states. This operation becomes the performance bottleneck when there is a large number of reached states. We accelerate the superstate and substate queries with a Trie index and an Inverted-Lists index, respectively.

Static Analysis: The verifier statically analyzes and simplifies the input workflow in a preprocessing step. We notice that in real workflows, some constraints in the specification can never be violated in a symbolic run, and thus can be removed. For example, for a constraint x = y in the specification, where x and y are variables, if x ≠ y does not appear anywhere in the specification and is not implied by other constraints, then x = y can be safely removed without affecting the result of the verification algorithm.

5. EXPERIMENTAL RESULTS

We evaluated the performance of our verifier using both real-world and synthetic artifact specifications.

The Real Set: As the artifact approach is still new to industry, real-world processes available for evaluation are limited. We therefore built an artifact system benchmark specific to business processes by rewriting the more widely available process-centric BPMN workflows as HAS* specifications. There are numerous sources of BPMN workflows, including the official BPMN website [2], which provides 36 workflows of non-trivial size. To rewrite these workflows into HAS*, we manually added the database schema, artifact variables/relations, and services for updating the data. Among the 36 real-world BPMN workflows collected from bpmn.org, our model is sufficiently expressive to specify 32 of them in HAS*, which can thus be used for performance evaluation. The remaining ones cannot be expressed in HAS* because they involve computing aggregate functions or updating the artifact relations in ways that are not supported in the current model. We will consider these features in future work.

The Synthetic Set: The second benchmark we used for evaluation is a set of randomly generated HAS* specifications. All components of each specification, including the DB schema, task hierarchy and services, are generated fully at random up to a certain size. Specifications with an empty search space due to unsatisfiable conditions are removed from the benchmark. Table 1 shows some statistics of the two benchmarks.

Table 1: Statistics of the Two Sets of Workflows

Dataset   | Size | #Relations | #Tasks | #Variables | #Services
Real      | 32   | 3.563      | 3.219  | 20.63      | 11.59
Synthetic | 120  | 5          | 5      | 75         | 75

Baseline and Setup: We compare our verifier with a simpler implementation built on top of Spin, a widely used software verification tool [10]. Building such a verifier is by itself a challenging task, since Spin is incapable of handling data of unbounded size, which is present in the HAS* model. We managed to build a Spin-based verifier supporting a restricted version of our model, without updatable artifact relations. As the read-only database can still have unbounded size and domain, the verifier requires a set of nontrivial translations and optimizations; the details will be discussed in a separate paper.

We implemented both verifiers in C++, with Spin version 6.4.6 for the Spin-based verifier. All experiments were performed on a Linux server with a quad-core Intel i7-2600 CPU and 16G memory.
For each workflow in each dataset, we ran our verifiers to test a randomly generated liveness property. The time limit of each run was set to 10 minutes. For a fair comparison, since the Spin-based verifier (Spin-Opt) cannot handle artifact relations, we ran both the full Karp-Miller-based verifier (KM) and the Karp-Miller-based verifier with artifact relations ignored (KM-NoSet).

Performance: Table 2 shows the results on both sets of workflows. The Spin-based verifier achieves acceptable performance on the real set, with an average elapsed time of a few seconds and only 1 timeout instance. However, it failed on most runs (109/120) in the synthetic set of workflows. On the other hand, both KM and KM-NoSet achieve average running times below 1 second with no timeouts on the real set, and average running times of a few seconds on the synthetic set, with only 2-3 timeouts. The presence of artifact relations introduced only a negligible amount of overhead in the running time. Compared with the Spin-based verifier, the KM-based approach is >10x faster in average running time and more scalable to large workflows.

Table 2: Performance of the Two Verifiers

Mode     | Real Avg(Time) | Real #Timeout | Synthetic Avg(Time) | Synthetic #Timeout
Spin-Opt | 3.111s         | 1             | 67.01s              | 109
KM-NoSet | 0.2635s        | 0             | 3.214s              | 3
KM       | 0.2926s        | 0             | 3.355s              | 2

Cyclomatic Complexity: To better understand the scalability of the Karp-Miller-based approach, we measured the difficulty of verifying each workflow using a metric called cyclomatic complexity [13], which is widely used in software engineering to measure the complexity of program modules. Figure 2 shows that the elapsed time increases exponentially with the cyclomatic complexity. According to [13], it is recommended that any well-designed program have cyclomatic complexity at most 15 in order to be readable and testable. Our verifier successfully handled all workflows in both benchmarks with cyclomatic complexity less than or equal to 17, which is above the recommended level. For instances with cyclomatic complexity above 15, our verifier timed out on only 2/24 instances (8.33%).

Figure 2: Running Time vs. Cyclomatic Complexity (running time in seconds, log scale, plotted against cyclomatic complexity from 5 to 45 for the Real and Synthetic sets)

Comparing Different Optimizations: We next show the effect of our 3 optimization techniques, state pruning (SP), static analysis (SA) and data structure support (DSS), by rerunning the experiments with each optimization turned off and comparing the difference. Table 3 shows the average speedups of each optimization in both datasets. We also present the trimmed averages of the speedups (i.e., removing the top/bottom 5% of speedups before averaging), which are less sensitive to extreme values.

Table 3: Mean and Trimmed Mean (5%) of Speedups

Dataset   | SP Mean  | SP Trim. | SA Mean | SA Trim. | DSS Mean | DSS Trim.
Real      | 2943.58x | 55.31x   | 1.80x   | 1.66x    | 1.90x    | 1.24x
Synthetic | 494.57x  | 180.82x  | 17.92x  | 0.92x    | 1.45x    | 1.27x

Table 3 shows that the effect of state pruning is the most significant in both sets of workflows, with an average (trimmed) speedup of 55x and 180x in the real and synthetic set, respectively. The static analysis optimization is more effective in the real set (1.6x improvement), but its effect in the synthetic set is less obvious: it creates a small amount (8%) of overhead in most cases, yet significantly improves the running time of a single instance, resulting in the huge gap between the mean speedup and the trimmed mean speedup. Finally, the data structure support provides a consistent ~1.2x average speedup in both datasets.

Acknowledgement: This work was supported in part by the National Science Foundation under award IIS-1422375.

6. REFERENCES

[1] MIST - a safety checker for Petri nets and extensions. https://github.com/pierreganty/mist/wiki.
[2] Object Management Group. Business Process Model and Notation. http://www.bpmn.org/. Accessed: 2017-03-01.
[3] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu. Artifact-centered operational modeling: Lessons from customer engagements. IBM Systems Journal, 46(4):703-721, 2007.
[4] M. Blockelet and S. Schmitz. Model checking coverability graphs of vector addition systems. In Mathematical Foundations of Computer Science 2011, pages 108-119. Springer, 2011.
[5] T. Chao et al. Artifact-based transformation of IBM Global Financing: A case study. In BPM, 2009.
[6] E. Damaggio, A. Deutsch, and V. Vianu. Artifact systems with data dependencies and arithmetic. TODS, 37(3):22, 2012.
[7] E. Damaggio, R. Hull, and R. Vaculín. On the equivalence of incremental and fixpoint semantics for business artifacts with guard-stage-milestone lifecycles. Information Systems, 38:561-584, 2013.
[8] A. Deutsch, R. Hull, F. Patrizi, and V. Vianu. Automatic verification of data-centric business processes. In ICDT, pages 252-267, 2009.
[9] A. Deutsch, Y. Li, and V. Vianu. Verification of hierarchical artifact systems. In PODS, pages 179-194, 2016.
[10] G. Holzmann. The Spin Model Checker: Primer and Reference Manual. Addison-Wesley Professional, first edition, 2003.
[11] R. Hull et al. Business artifacts with guard-stage-milestone lifecycles: Managing artifact interactions with conditions and events. In ACM DEBS, 2011.
[12] M. Marin, R. Hull, and R. Vaculín. Data centric BPM and the emerging case management standard: A short survey. In BPM Workshops, 2012.
[13] T. J. McCabe. A complexity measure. IEEE Transactions on Software Engineering, (4):308-320, 1976.
[14] A. Nigam and N. S. Caswell. Business artifacts: An approach to operational specification. IBM Systems Journal, 42(3):428-445, 2003.
[15] Object Management Group. Case Management Model and Notation (CMMN), 2014.
[16] P.-A. Reynier and F. Servais. Minimal coverability set for Petri nets: Karp and Miller algorithm with pruning. In International Conference on Application and Theory of Petri Nets and Concurrency, pages 69-88. Springer, 2011.
[17] P. Vassiliadis and T. Sellis. A survey of logical models for OLAP databases. ACM SIGMOD Record, 28(4):64-69, 1999.
[18] W.-D. Zhu et al. Advanced Case Management with IBM Case Manager. IBM Redbooks, 2015.