Uniform Interpolation for the Automated Verification
of Data-Aware Business Processes
Alessandro Gianola1
1
    Free University of Bozen-Bolzano, Italy


                                         Abstract
                                         In the last two decades, Craig interpolation has emerged as a powerful tool in formal verification.
                                         Interpolants are largely exploited as an efficient method to approximate the reachable states and for
                                         invariant synthesis. In this survey, we report recent results on “stronger interpolants”, called uniform
                                         interpolants, and we discuss how they can be used to develop effective techniques for computing the
                                         reachable states in an exact way. Uniform interpolants can be efficiently computed when verifying
                                         data-aware processes, where the control flow of a (business) process can interact with a data storage.
                                         This is significant since integrating data and processes is a long-standing problem in business process
                                         management.

                                         Keywords
                                         Uniform Interpolation, Formal Verification, BPM, Data-Aware Processes, SMT


1. Overview
In this survey, we report recent results on the use of uniform interpolants (UIs) in automated
reasoning [1, 2] and in the context of formal verification of the so-called data-aware (business)
processes [3, 4, 5]. These results originate from the confluence of two well-established research
fields: model-theoretic algebra [6] in mathematical logic, where UIs were originally investigated
for non-classical logics, and Satisfiability Modulo Theories (SMT) [7], where UIs provide a
‘light form’ of quantifier elimination. We discuss how such apparently quite distant scientific
paradigms can cooperate in verification of infinite-state systems such as data-aware processes
[8, 4, 9, 10], where the control flow of a (business) process can interact with a data storage.

Verification of Data-Aware Business Processes. In recent years, a growing number of AI
application domains asks for process modeling languages paired with automated verification
techniques that do not just consider the control flow dimension of a process, but also take
into consideration the data dimension [11, 12, 13]. From a theoretical perspective, this has led
to the development of formal frameworks for attacking the problem of verifying data-aware
processes (DAPs) [14, 15, 16, 17, 18]. What all these frameworks have in common is that they
strive to focus on general DAP models that formalize abstract dynamic systems (i.e., the process
OVERLAY 2022: 4th Workshop on Artificial Intelligence and Formal Verification, Logic, Automata, and Synthesis,
November 28, 2022, Udine, Italy
$ gianola@inf.unibz.it (A. Gianola)
 https://gianola.people.unibz.it/ (A. Gianola)
 0000-0003-4216-5199 (A. Gianola)
                                       © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
component) interacting with data persistent storage (i.e., the data component). DAP verification
should reflect the possibility of expressing properties that simultaneously account for the data
and the process perspectives, and most importantly for their interaction. Because of data, DAP
models are intrinsically infinite-state: a database is finite but its size is unbounded and unknown a
priori (since new tuples can always be added to relations using elements from infinite domains).
   From a more applied perspective, a huge body of research has been dedicated to the prob-
lem of reconciling data and business process management (BPM [19]) within contemporary
organizations [20, 21, 22]. More specifically, in the BPM context it has become more and more
important to study multi-perspective models that do not just conceptually account for the
control-flow dimension of business processes, but also consider the interplay with data [23, 14]:
in contrast with abstract DAPs, these models are more focused on concrete processes as they
are interpreted by stakeholders and BPM practitioners. One important challenge that naturally
arises is how to formally verify such business processes enriched with data [24, 23]: since they
are complex infinite-state systems, this requires to develop sophisticated symbolic techniques.

The Problem of Quantifiers. Verification of infinite-state systems, and in particular of DAPs,
when studied in declarative terms, requires to symbolically represent transitions and reachable
states: contrary to finite-state model checking [25], an explicit and exhaustive exploration of
the state space is not possible, because of the presence of infinitely many states.
   Several techniques based on SMT have been studied to attack the general problem: for instance,
there are prominent methods that are based on forward reachability, such as K-induction [26],
or on backward reachability [27]. Methods based on backward reachability symbolically explore
the state space starting from ‘unsafe’ states via an iterative computation of predecessors (i.e., the
(pre-)image), until a fixpoint is reached or the initial state(s) are intersected. There exist various
SMT-based model checkers implementing these methods (e.g., Kind 2 [28] or mcmt [29]).
   In many first-order (FO) declarative formalisms (e.g., [27]), sets of states are represented using
quantifier-free logical formulae, called state formulae: the content of variables in a state formula
characterizes the global state of the system and may change during its evolution. The verification
procedure symbolically computes reachable sets of states, which should be represented again as
state formulae: indeed, images are intended to describe sets of states, and, as such, they should
be quantifier-free. Reachable sets of states need to be manipulated, because in general are not
described by state formulae: this is because, when computing images, FO (mostly, existential)
quantifiers may be introduced by the transitions, which are usually quantified formulae. For
instance, this is the case of DAP models, whose transitions contain as guards existential queries
over a relational database [8, 4]. These quantifiers appear in image computation, breaking the
quantifier-free format of state formulae. Hence, a sort of quantifier elimination (QE) is needed
to represent reachable states as state formulae. QE is also essential for efficiency reasons: e.g.,
using approaches à la Bounded Model Checking [30], where paths with incremental length are
encoded, QE guarantees that the size of formulae does not increase, avoiding their blow-up.
   Dealing with quantifiers is a genuine problem when using frameworks based on SMT. For
instance, declarative versions of backward reachability [31, 27] require discharging to SMT
solvers proof obligations in the form of satisfiability tests for quantified formulae with a restricted
shape. Although SMT solvers natively reason about the quantifier-free fragments, FO quantifiers
can be in some cases handled by instantiation [27], whereas in others [32], where quantifiers
range over specific real values involving light forms of arithmetic, proper QE can be used [33].
   To summmarize, the precise computation of the set of reachable states can be in principle
performed via proper quantifier elimination. However, quantifier elimination is not always
possible in generic FO theories, and, when available, is in many cases computationally intractable:
for instance, this is the case of arithmetical theories. In order to cope with this problem,
other methods for symbol elimination (e.g., predicate abstraction [34, 35] or ordinary (Craig)
interpolation [36, 37, 38]) have been investigated, which do not compute precise images, but
perform an approximation of the reachable states: this implies that images can contain ‘spurious
elements’, i.e., states that are not properly reachable in one step. Nevertheless, these methods are
quite successful and computationally efficient. The main limitation is that they usually require
refinement techniques to handle the spurious traces possibly produced by the approximation.


2. Model Completions and Uniform Interpolation
We now remark how in significant cases (e.g., data-aware business processes), approximating
methods can be abandoned in favor of exact methods that have the merits of both computing
precise images and remaining computationally tractable (i.e., in polytime). These methods are
based on the use of uniform interpolants, a strong form of interpolants having strict relationships
with quantifier elimination performed in richer theories called model completions. We clarify
why model completions come to the picture and how they are related to uniform interpolation.

Model-Theoretic Algebra and Model Completions An FO theory 𝑇 has quantifier elimi-
nation iff for every formula 𝜑 in the signature Σ of 𝑇 there is a quantifier-free formula 𝜑′ s.t. 𝜑
is 𝑇 -equivalent to 𝜑′ . Eliminating quantifiers from a generic FO formula can be equivalently
formulated as the problem of eliminating existential quantifiers from constraints. Elimination
of existentials has an interesting interpretation: it can be seen as the logical counterpart of
finding witnesses, i.e., solutions, to suitable systems of logical “equations and/or disequalities”.
Model-theoretic algebra, since the pioneering work by Robinson [39, 6], provides a powerful
setting where to formulate this problem in algebraic terms and solve it using model theory.
   The central notion is that of existentially closed model. A quantifier-free formula with param-
eters in a model 𝑀 is solvable if there is an extension 𝑀 ′ of 𝑀 where the formula is satisfied. A
model 𝑀 is existentially closed if any solvable formula already has a solution in 𝑀 itself. This
is not FO definable in general. However, in significant cases, the class of existentially closed
models of 𝑇 are exactly the models of another FO theory 𝑇 * , called model completion [40] of
𝑇 . Formally, a universal theory 𝑇 has a model completion iff there is a stronger theory 𝑇 * ⊇ 𝑇
(still within the same signature Σ of 𝑇 ) s. t.: (i) every Σ-constraint satisfiable in a model of 𝑇 is
satisfiable in a model of 𝑇 * ; (ii) 𝑇 * eliminates quantifiers. The theory 𝑇 * , if it exists, is unique.
In model completions, QE holds, even in case it does not in 𝑇 . The model completion of a theory
identifies the class of the models where all satisfiable existential statements can be satisfied.
   In declarative approaches to verification of infinite-state systems, the runs of a system that are
‘analyzed’ by image computation are identified with certain definable paths in the models of a
suitable theory 𝑇 : e.g., in the case of DAP models, the theory 𝑇 formalizes the relational database
that is queried and modified by the process [3]. As noticed, when performing these computations,
FO quantifiers are introduced and need to be eliminated, even in case of theories 𝑇 not admitting
QE. Nevertheless, in significant cases where QE is not available, model completions still exist.
This is the case of useful theories such as the ones used for DAP verification [3, 4].
   Model completions become the crucial tool to exploit: one of the main results from [3] is that
one can restrict the analysis to paths within existentially closed models, thus taking profit from
the properties of the model completion 𝑇 * , first of all QE. For instance, in [3] it is shown that,
in the case of safety verification, performing backward reachability for systems whose models
live in 𝑇 is equivalent to performing backward reachability for systems whose models live in
𝑇 * : favorably, in 𝑇 * quantifiers can be eliminated, even when QE is not available in 𝑇 .

UIs and QE in Model Completions. We give a general definition of UIs and we discuss
their strict relationship with model completions. Let 𝑇 be a logic or a theory and let 𝐿 be a
suitable fragment (propositional, FO quantifier-free, etc.) of its language. Given an 𝐿-formula
𝜑(𝑥, 𝑦) (𝑥, 𝑦 are the variables occurring in 𝜑), a (𝐿-)uniform interpolant (UI) of 𝜑 (w.r.t. 𝑦) is an
𝐿-formula 𝜑′ (𝑥) where only the 𝑥 occur, satisfying these two properties: (i) 𝜑(𝑥, 𝑦) ⊢𝑇 𝜑′ (𝑥);
(ii) for any further 𝐿-formula 𝜓(𝑥, 𝑧) such that 𝜑(𝑥, 𝑦) ⊢𝑇 𝜓(𝑥, 𝑧), we have 𝜑′ (𝑥) ⊢𝑇 𝜓(𝑥, 𝑧).
   For every pair of 𝐿-formulae 𝜑(𝑥, 𝑦) and 𝜓(𝑥, 𝑧) s.t. 𝜑 ⊢𝑇 𝜓, a UI of 𝜑 is in particular an
ordinary (Craig) interpolant for the pair (𝜑, 𝜓) [41]. Hence, whenever UIs exist, one can compute
an ordinary interpolant for 𝜑 ⊢𝑇 𝜓 in a way that is independent of 𝜓, i.e., uniformly.
   UIs were originally investigated in non-classical logics, since the work by Pitts [42]. They are
stronger than ordinary interpolants: even in case Craig interpolants exist, UIs may not exist.
Since the nineties, they have been extensively studied in a large literature (e.g., [43, 44, 45]).
   Recently, the automated reasoning community has developed an increasing interest in UIs,
where 𝐿 is the quantifier-free fragment of an FO theory 𝑇 : from now on, we restrict our
attention to this case. This is witnessed by various talks by Kapur (FLoC 2010, ISCAS 2013-14,
SCS 2017 [46]), as well as by Gulwani and Musuvathi in [47], where UIs are called covers.
The use of UIs in model checking to compute exact images of states was first shown in that
work, and then further motivated by DAP verification [8, 1]. The first formal proof about the
existence of UIs in ℰ𝒰 ℱ (Equality and Uninterpreted Functions) was published in [48, 1], where
was also proved that computing UIs in 𝑇 is equivalent to eliminating quantifiers in its model
completion 𝑇 * . Hence, instead of investigating paths in 𝑇 * and eliminating quantifiers in model
completions, reachability can be performed in 𝑇 itself by computing UIs there. In [48], a UI
algorithm for ℰ𝒰 ℱ relying on the constrained Superposition Calculus was proposed: in the case
that is used to verify DAP models, a quadratic bound in time for UI computation can be given [1].
Two simpler UI algorithms are in [49].
   DAP verification also suggests the study of UI transfer to combined theories: it is natural
to consider the combination of theories accounting for different datatypes contained in the
persistent storage [2, 8]. The UI transfer problem is: if UIs exist in theories 𝑇1 and 𝑇2 , under which
conditions do they exist also in the combined theory 𝑇1 ∪ 𝑇2 ? In [50, 2] combined UIs are shown
to exist in the disjoint-signatures convex case under the same hypothesis, i.e., the equality
interpolating condition [51], guaranteeing the transfer of quantifier-free ordinary interpolation;
in [50, 2], a combined UI algorithm, based on the use of Beth definability, is also provided.
Acknowledgments
This work is partially supported by the Italian Ministry of University and Research under the
PRIN program, grant B87G22000450001 (PINPOINT), and by the Free University of Bozen-
Bolzano with the SMART-APP and ADAPTERS projects.


References
 [1] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Model completeness, uniform
     interpolants and superposition calculus, Journal of Automated Reasoning 65 (2021) 941–
     969.
 [2] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Combination of uniform
     interpolants via Beth definability, Journal of Automated Reasoning 66 (2022) 409–435.
 [3] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, SMT-based verification of
     data-aware processes: a model-theoretic approach, Mathematical Structures in Computer
     Science 30 (2020) 271–313.
 [4] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Formal modeling and SMT-
     based parameterized verification of data-aware BPMN, in: Proceeding of BPM 2019, volume
     11675 of LNCS, Springer, 2019, pp. 157–175.
 [5] A. Gianola, SMT-based safety verification of data-aware processes: Foundations and
     applications, in: Proc. of the Best Dissertation Award, Doctoral Consortium, and Demon-
     stration & Resources Track at BPM 2022, volume 3216 of CEUR Workshop Proceedings,
     CEUR-WS.org, 2022, pp. 20–24.
 [6] A. Robinson, Introduction to model theory and to the metamathematics of algebra, Studies
     in logic and the foundations of mathematics, North-Holland, 1963.
 [7] C. W. Barrett, C. Tinelli, Satisfiability modulo theories, in: Handbook of Model Checking.,
     Springer, 2018, pp. 305–343.
 [8] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Verification of data-aware
     processes: Challenges and opportunities for automated reasoning, in: Proceedings of
     ARCADE 2019, volume 311, EPTCS, 2019.
 [9] S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Delta-BPMN: A concrete language and
     verifier for Data-Aware BPMN, in: Proceedings of BPM 2021, volume 12875 of LNCS,
     Springer, 2021, pp. 179–196.
[10] S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Petri net-based object-centric processes
     with read-only data, Information Systems 107 (2022).
[11] B. B. Hariri, D. Calvanese, G. De Giacomo, R. De Masellis, P. Felli, Foundations of relational
     artifacts verification, in: Proceedings of BPM 2011, volume 6896 of LNCS, Springer, 2011,
     pp. 379–395.
[12] G. D. Giacomo, R. D. Masellis, M. Grasso, F. M. Maggi, M. Montali, Monitoring business
     metaconstraints based on LTL and LDL for finite traces, in: Proc. of BPM 2014, volume
     8659 of Lecture Notes in Computer Science, Springer, 2014, pp. 1–17. URL: https://doi.org/10.
     1007/978-3-319-10172-9_1. doi:10.1007/978-3-319-10172-9\_1.
[13] G. D. Giacomo, X. Oriol, M. Estañol, E. Teniente, Linking data and BPMN processes to
     achieve executable models, in: Proc. of CAiSE, LNCS, Springer, 2017, pp. 612–628.
[14] D. Calvanese, G. De Giacomo, M. Montali, Foundations of data-aware process analysis: A
     database theory perspective, in: Proceedings of PODS 2013, 2013, pp. 1–12.
[15] B. Bagheri Hariri, D. Calvanese, G. De Giacomo, A. Deutsch, M. Montali, Verification of
     relational data-centric dynamic systems with external services, in: Proceedings of PODS
     2013, 2013, pp. 163–174.
[16] A. Deutsch, R. Hull, Y. Li, V. Vianu, Automatic verification of database-centric systems,
     ACM SIGLOG News 5 (2018) 37–56.
[17] A. Deutsch, Y. Li, V. Vianu, Verification of hierarchical artifact systems, in: Proceedings of
     PODS 2016, 2016, pp. 179–194.
[18] A. Deutsch, Y. Li, V. Vianu, Verification of hierarchical artifact systems, ACM Transactions
     on Database Systems 44 (2019) 12:1–12:68.
[19] M. Dumas, M. L. Rosa, J. Mendling, H. A. Reijers, Fundamentals of Business Process
     Management, Second Edition, Springer, 2018.
[20] C. Richardson, Warning: Don’t assume your business processes use master data, in:
     Proceedings of BPM 2010, volume 6336 of LNCS, Springer, 2010, pp. 11–12.
[21] M. Dumas, On the convergence of data and process engineering, in: Proceedings of ADBIS
     2011, volume 6909 of LNCS, Springer, 2011, pp. 19–26.
[22] M. Reichert, Process and data: Two sides of the same coin?, in: Proceedings of OTM 2012,
     volume 7565 of LNCS, Springer, 2012, pp. 2–19.
[23] E. Damaggio, A. Deutsch, R. Hull, V. Vianu, Automatic verification of data-centric business
     processes, in: Proceedings of BPM 2011, volume 6896 of LNCS, Springer, 2011, pp. 3–16.
[24] A. Deutsch, R. Hull, F. Patrizi, V. Vianu, Automatic verification of data-centric business
     processes, in: Proceedings of ICDT 2009, 2009, pp. 252–267.
[25] E. M. Clarke, O. Grumberg, D. A. Peled, Model checking, MIT Press, 2001.
[26] M. Sheeran, S. Singh, G. Stålmarck, Checking safety properties using induction and a
     SAT-solver, in: Proceedings of FMCAD 2000, volume 1954 of LNCS, Springer, 2000, pp.
     108–125.
[27] S. Ghilardi, S. Ranise, Backward reachability of array-based systems by SMT solving:
     Termination and invariant synthesis, Logical Methods in Computer Science 6 (2010).
[28] A. Champion, A. Mebsout, C. Sticksel, C. Tinelli, The Kind 2 model checker, in: Proceedings
     of CAV 2016, volume 9780 of LNCS, Springer, 2016, pp. 510–517.
[29] S. Ghilardi, S. Ranise, MCMT: A model checker modulo theories, in: Proceedings of IJCAR
     2010, volume 6173 of LNCS (LNAI), Springer, 2010, pp. 22–29.
[30] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu, Bounded model checking,
     Adv. Comput. 58 (2003) 117–148. URL: https://doi.org/10.1016/S0065-2458(03)58003-2.
     doi:10.1016/S0065-2458(03)58003-2.
[31] S. Ghilardi, E. Nicolini, S. Ranise, D. Zucchelli, Towards SMT model checking of array-
     based systems, in: Proceedings of IJCAR 2008, volume 5195 of LNCS (LNAI), Springer,
     2008, pp. 67–82.
[32] F. Alberti, R. Bruttomesso, S. Ghilardi, S. Ranise, N. Sharygina, An extension of lazy
     abstraction with interpolation for programs with arrays, Formal Methods in System
     Design 45 (2014) 63–109.
[33] D. C. Cooper, Theorem proving in arithmetic without multiplication., in: Machine
     Intelligence, volume 7, Edinburgh University Press, 1972, pp. 91–100.
[34] R. Jhala, R. Majumdar, Software model checking, ACM Comput. Surv. 41 (2009) 21:1–21:54.
[35] R. Jhala, A. Podelski, A. Rybalchenko, Predicate abstraction for program verification, in:
     Handbook of Model Checking, Springer, 2018, pp. 447–491.
[36] K. L. McMillan, Interpolation and SAT-Based Model Checking, in: Proceedings of CAV
     2003, volume 2725 of LNCS, Springer, 2003, pp. 1–13.
[37] K. L. McMillan, Lazy Abstraction with Interpolants, in: Proceedings of CAV 2006, volume
     4144 of LNCS, Springer, 2006, pp. 123–136.
[38] L. Kovács, A. Voronkov, Interpolation and symbol elimination, in: Proceedings of CADE
     2009, volume 5663 of LNCS (LNAI), Springer, 2009, pp. 199–213.
[39] A. Robinson, On the metamathematics of algebra, Studies in Logic and the Foundations of
     Mathematics, North-Holland Publishing Co., Amsterdam, 1951.
[40] C.-C. Chang, J. H. Keisler, Model Theory, third ed., North-Holland Publishing Co.,
     Amsterdam-London, 1990.
[41] W. Craig, Three uses of the Herbrand-Gentzen theorem in relating model theory and
     proof theory, Journal of Symbolic Logic 22 (1957) 269–285.
[42] A. M. Pitts, On an interpretation of second order quantification in first order intuitionistic
     propositional logic, Journal of Symbolic Logic 57 (1992) 33–52.
[43] S. Ghilardi, M. Zawadowski, Sheaves, games, and model completions, volume 14 of Trends
     in Logic—Studia Logica Library, Kluwer Academic Publishers, Dordrecht, 2002.
[44] T. Kowalski, G. Metcalfe, Uniform interpolation and coherence, Annals of Pure and
     Applied Logic 170 (2019) 825–841.
[45] G. Metcalfe, L. Reggio, Model completions for universal classes of algebras: necessary and
     sufficient conditions, Journal of Symbolic Logic (2022) 1–34.
[46] D. Kapur, Nonlinear polynomials, interpolants and invariant generation for system analysis,
     in: Proceedings of SC-Square 2017 (co-located with ISSAC), volume 1974, CEUR Workshop
     Proceedings, 2017.
[47] S. Gulwani, M. Musuvathi, Cover algorithms and their combination, in: Proceedings of
     ESOP 2008, volume 4960 of LNCS, Springer, 2008, pp. 193–207.
[48] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Model completeness, covers
     and superposition, in: Proceedings of CADE 2019, volume 11716 of LNCS (LNAI), Springer,
     2019, pp. 142–160.
[49] S. Ghilardi, A. Gianola, D. Kapur, Uniform interpolants in EUF: Algorithms using DAG-
     representations, Logical Methods in Computer Science 18 (2022).
[50] D. Calvanese, S. Ghilardi, A. Gianola, M. Montali, A. Rivkin, Combined Covers and Beth
     Definability, in: Proceedings of IJCAR 2020, volume 12166 of LNCS (LNAI), Springer, 2020,
     pp. 181–200.
[51] G. Yorsh, M. Musuvathi, A combination method for generating interpolants, in: Proceed-
     ings of CADE 2005, volume 3632 of LNCS, Springer, 2005, pp. 353–368.