<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Münster, Germany
* Corresponding author.
" gianola@inf.unibz.it (A. Gianola)
~ https://gianola.people.unibz.it/ (A. Gianola)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>SMT-based Safety Verification of Data-Aware Processes: Foundations and Applications (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alessandro Gianola</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Free University of Bozen-Bolzano</institution>
          ,
          <addr-line>Bolzano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>Integrating data and processes to understand their concrete interplay is a long-standing problem in business process management. We present a general framework for the formalization and verification of infinite-state transition systems that can interact with a persistent storage represented as relational databases. We develop sophisticated algorithmic techniques, based on automated reasoning and SMT solving, for the verification of safety properties. Building on top of these techniques, we apply our framework to business process management: we introduce general models for the safety verification of complex business processes enriched with real data. For those models, we devise formal and operational methods that are based on standard languages and/or can capture advanced modeling capabilities, considering in particular the BPMN standard and variants of Petri nets for the process control-flow, and SQL queries and updates for the interaction with data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Data-Aware Processes. In recent years, a growing number of application domains asks for
process modeling languages paired with automated verification techniques that do not just
consider the control flow dimension, but also take into consideration the data dimension. From
a theoretical perspective, the development of formal, abstract frameworks for attacking the
problem of verifying so-called Data-Aware Processes (DAPs) has consequently flourished [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ],
leading to a plethora of formal models that diferentiate depending on how the data and process
components, as well as their interplay, is actually represented. What all these frameworks have
in common is that they strive to focus on very general DAP models that formalize abstract
dynamic systems (i.e., the process component) interacting with data persistent storage (i.e., the
data component). In these models, the concept of “process” should be interpreted in an abstract
sense, as a (possibly undetermined) mechanism that guides the evolution of a system.
      </p>
      <p>
        One may wonder whether formal frameworks for verifying DAPs can employ the techniques
used in finite-state model checking. Verification of DAPs should reflect the possibility of
expressing properties that simultaneously account for the data and the process perspectives,
and most importantly for their interaction. However, due to the presence of data, DAP models are
intrinsically infinite-state : the content of a relational database is finite but its size is unbounded
and unknown a priory (since a new tuple can always be added to some relation using data
elements taken from infinite domains); in addition, many real-world scenarios often require
the presence of possibly fresh data values (e.g., string, new identifiers, integer or real numbers)
injected into the process by external users. In such sophisticated settings, explicit model
checking is not possible anymore: due to the infinity of the state space, exhaustive search
cannot be applied. Moreover, traditional approaches in data management do not help here,
since they typically investigate static, structural aspects of a domain of interest, disregarding
dynamics [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. The results obtained for the verification of DAPs are quite fragmented, since
they consider a variety of diferent assumptions on the model and on the analysis tasks [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ],
assumptions that are usually too restrictive when compared with both concrete business process
and data management models.
      </p>
      <p>
        Business Processes enriched with Data. While theoretical DAP frameworks were studied,
a huge body of research has been dedicated to the challenging problem of reconciling data and
business process management within contemporary organizations [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ]. More specifically,
in the BPM context it has become more and more important to study multi-perspective models
that do not just conceptually account for the control-flow dimension of business processes, but
also consider the interplay with data [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]: in contrast with abstract DAPs, these models are more
focused on concrete processes as they are interpreted by stakeholders and BPM practitioners.
Traditionally, BPM models are formalized using pure control-flow models, ignoring the efects
caused by data to the evolution of the process, i.e., neglecting how data are manipulated during
the tasks execution, and how the executability of tasks is afected by data-related conditions.
      </p>
      <p>
        To overcome this fundamental limitation, many concrete languages and corresponding
formalisms have been proposed to represent business processes where the data and the
controllfow dimension are both considered as first-class citizens [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. One important challenge that
naturally arises is how to formally verify such business processes enriched with data: in this
respect, it seems natural to employ the abstract frameworks developed for DAPs, and to try
to adapt these frameworks to the desired application domains (see for some attempts in this
direction, e.g., [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]). However, this adaptation is quite challenging because of the lack of
a comprehensive framework for the verification of DAPs that is able to capture most of the
essential modeling features of business process data-aware extensions. The solutions provided
in the literature consist either of abstracting from data to non-determinism or of bounding
data to only finitely many configurations, fixed apriori. Nonetheless, these solutions are partial
since only support very limited types of unboundedness, whereas real data, in general, account
for two more complex types: the one due to fresh values injected into the process from the
environment, and the one due to how relations (the most basic data structures) can yield new
configurations. These, in turn, can easily lead to Turing-complete mechanisms.
      </p>
      <p>Moreover, currently there are few theoretical frameworks that also come with a corresponding
implemented tool for concrete verification, which is one of the desiderata for every model
checking application domain. Finally, an orthogonal challenging dimension regards the abstract
lfavor of these frameworks and their modeling constructs, which often depart from those
ofered by process and data modeling standard languages used in practice, such as the Business
Process Model and Notation (BPMN) for processes, and the Structured Query Language (SQL)
for persistent data. In fact, it is essential to rely on standard languages when dealing with
real applications, considering that model checking should have a big impact in guaranteeing
trustworthiness of concretely deployed systems.</p>
      <p>
        Infinite-state Model Checking. In parallel, in the nineties, a completely separated line of
research started to investigate how to attack the problem of verifying infinite-state systems
in general, disregarding the data perspective. A particularly interesting class of infinite-state
systems is given by parameterized systems [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Such systems can be seen as transition systems
that are parametric in the number  of (possibly independent) components, called parameters.
Parameterized verification approaches provide automatic methods to prove trustworthiness of
these systems regardless of the number of their components.
      </p>
      <p>
        Another successful way for tackling infinite-state model checking consists of providing a
symbolic setting [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] for representing states. Thanks to this representation, the exploration of
the state space does not need to be explicit, but it can be performed symbolically. Satisfiability
Modulo Theories (SMT) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] is one of the most fruitful emerging areas in computational logic
that provides techniques for symbolic model checking: the SMT problem consists of deciding
the satisfiability of logical formulae with respect to combinations of background first-order
logic theories. SMT leverages highly eficient technologies, such as those stemming from
SMT solvers, and has been proved to be particularly successful for verifying infinite-state
systems. There exists a wide range of methods using SMT solvers as their underlying engine.
A successful framework based on SMT solving (and applied, e.g., to distributed systems) is
the one of array-based systems [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], a declarative formalism where arrays (as known from
programming languages) are the central notion.
2. Contributions
As we have seen, DAP verification has recently gained increasing attention, with strong
motivation from BPM applications. Several approaches have been proposed for attacking this problem,
but they present two significant limitations. First, the employed verification techniques are
developed ad hoc, and do not rely on well-established methods and implemented and
state-of-theart technologies. Second, these frameworks are usually studied at a quite abstract level, which
makes it challenging to connect them to the standard front-end languages used in practice. To
summarize, there is the lack of a universal and comprehensive approach to verify DAPs via a
suficiently expressive formalism that natively supports efective techniques and can encode
“front-end” languages and models.
      </p>
      <p>At the same time, many approaches to symbolic reasoning for infinite-state systems have
been successfully applied to diferent models, such as parameterized systems. Most of these
approaches provide not only a solid theoretical framework, but also the support of
computationally eficient and highly engineered tools. This is the case of SMT solvers and model checkers
based on them. Nevertheless, importing SMT-based model checking in the context of DAP
verification is extremely challenging and cannot be done as is: it requires carrying out genuinely
novel research for handling the data perspective, and to develop algorithmic techniques. In
this work, we bridge the gap between DAP verification and infinite-state model checking : (i) we
propose a theoretical SMT-based framework for DAP verification and, to make it efective
in practice, we devise eficient algorithmic techniques based on automated reasoning; (ii) we
apply the previous machinery to provide practical approaches for modeling and verifying
business processes enriched with real data.</p>
      <p>DAP Verification via SMT. We define a general model-theoretic/algebraic framework, called
(Universal) RASs, for automated verification of DAPs based on SMT solving, focusing in
particular on parameterized safety: this amounts to establishing whether a system can reach an
undesired, unsafe configuration, irrespective of the specific content of its background data
storage. Our approach relies on array-based systems: we incorporate the needed capabilities to
formalize and reason about relational databases. We exhibit how arrays can represent read-write
relations that can be queried and updated during the evolution of the system.</p>
      <p>For this purpose, we develop sophisticated algorithmic techniques building on top of
wellknown methods in automated reasoning. We also demonstrate the feasibility of our approach
by showing these techniques in action thanks to the implementation in the state-of-the-art
mcmt model checker, testing it against a concrete DAP benchmark: in this respect we provide a
ifrst, preliminary experimental evaluation of our verification machinery.</p>
      <p>Modeling and Verification of Data-Aware Business Processes. As a second, key
contribution, we exploit the developed machinery to provide BPM-oriented practical approaches, strictly
linked to other major research lines on modeling and verifying processes with data.</p>
      <p>First, within the general framework of RASs, we propose Data-Aware BPMN models, called
DABs, based on standard languages used in the BPM literature. These models natively combine
process modeling and querying languages used in practice (BPMN+SQL) in terms of a significant
subset of this combination, exhibiting how this subset can be both automatically translated
into the formal verification framework of RASs. These models are implemented and executed
in an operational framework called delta-BPMN, where a standard process execution engine
(provided by Camunda) is enriched with data management features, i.e., with the support of
relational databases and data manipulation capabilities.</p>
      <p>Then, we introduce a theoretical formalism, called COA-nets, that captures expressive
modeling capabilities in the recently born spectrum of so-called object-centric processes. COA-nets are
in line with process modeling principles studied in process research, specifically Petri nets and
their extensions, and represent an extension of Colored Petri nets that is able both to model data
lfow in tokens and to access a relational database. Once again, the translation to the verification
framework of Universal RASs is provided. We explicitly discuss ways of modeling process
and data interactions along a comprehensive setting, where expressive modeling patterns are
introduced and shown how to employed in practice without hampering verification guarantees.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Henzinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Veith</surname>
          </string-name>
          , R. Bloem (Eds.),
          <source>Handbook of Model Checking</source>
          , Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Burch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. L. McMillan</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          <string-name>
            <surname>Dill</surname>
            ,
            <given-names>L. J.</given-names>
          </string-name>
          <string-name>
            <surname>Hwang</surname>
          </string-name>
          ,
          <article-title>Symbolic model checking: 10ˆ20 states and beyond</article-title>
          ,
          <source>Inf. Comput</source>
          .
          <volume>98</volume>
          (
          <year>1992</year>
          )
          <fpage>142</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , G. De Giacomo,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montali</surname>
          </string-name>
          ,
          <article-title>Foundations of data-aware process analysis: A database theory perspective</article-title>
          ,
          <source>in: Proc. of PODS</source>
          <year>2013</year>
          ,
          <year>2013</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Deutsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          ,
          <article-title>Automatic verification of database-centric systems</article-title>
          ,
          <source>ACM SIGLOG News</source>
          <volume>5</volume>
          (
          <year>2018</year>
          )
          <fpage>37</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          , Foundations of Databases, Addison-Wesley,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          (Eds.),
          <source>The Description Logic Handbook: Theory</source>
          , Implementation, and
          <string-name>
            <surname>Applications</surname>
          </string-name>
          , Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Richardson</surname>
          </string-name>
          , Warning: Don'
          <article-title>t assume your business processes use master data</article-title>
          ,
          <source>in: Proceedings of BPM</source>
          <year>2010</year>
          , volume
          <volume>6336</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <article-title>On the convergence of data and process engineering</article-title>
          ,
          <source>in: Proc. of ADBIS</source>
          <year>2011</year>
          ,
          <year>2011</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Reichert</surname>
          </string-name>
          ,
          <article-title>Process and data: Two sides of the same coin?</article-title>
          ,
          <source>in: Proceedings of OTM</source>
          <year>2012</year>
          , volume
          <volume>7565</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2012</year>
          , pp.
          <fpage>2</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Meyer</surname>
          </string-name>
          , L. Pufahl,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weske</surname>
          </string-name>
          ,
          <article-title>Modeling and enacting complex data dependencies in business processes</article-title>
          ,
          <source>in: Proc. of BPM</source>
          <year>2013</year>
          ,
          <year>2013</year>
          , pp.
          <fpage>171</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Damaggio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Deutsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          ,
          <article-title>Automatic verification of data-centric business processes</article-title>
          ,
          <source>in: Proc. of BPM</source>
          <year>2011</year>
          ,
          <year>2011</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Solomakhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tessaris</surname>
          </string-name>
          , R. De Masellis,
          <article-title>Verification of artifact-centric systems: Decidability and modeling issues</article-title>
          ,
          <source>in: Proc. of ICSOC</source>
          <year>2013</year>
          ,
          <year>2013</year>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Abdulla</surname>
          </string-name>
          , G. Delzanno, Parameterized verification,
          <source>Int. J. Softw. Tools Technol. Transf</source>
          .
          <volume>18</volume>
          (
          <year>2016</year>
          )
          <fpage>469</fpage>
          -
          <lpage>473</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>K. L. McMillan</surname>
          </string-name>
          , Symbolic Model Checking, Kluwer Academic Publishers, Norwell, MA, USA,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Barrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tinelli</surname>
          </string-name>
          ,
          <article-title>Satisfiability modulo theories</article-title>
          ,
          <source>in: Handbook of Model Checking</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>305</fpage>
          -
          <lpage>343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghilardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ranise</surname>
          </string-name>
          ,
          <article-title>Backward reachability of array-based systems by SMT solving: Termination and invariant synthesis</article-title>
          ,
          <source>Log. Methods Comput. Sci. 6</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>