<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Enacting Complex Data Dependencies from Activity-Centric Business Process Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Meyer</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luise Pufahl</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dirk Fahland</string-name>
          <email>d.fahland@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mathias Weske</string-name>
          <email>Mathias.Weske@hpi.uni-potsdam.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eindhoven University of Technology</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Hasso Plattner Institute at the University of Potsdam</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Executing process models requires a process engine that handles both control flow and data dependencies. While available activity-oriented process engines support control flow well, data dependencies have to be specified manually, which is error-prone and time-consuming. In this paper, we present an extension to the process engine Activiti that automatically extracts complex data dependencies from process models and enacts the respective models. We also briefly explain the extensions to BPMN required for a model-driven approach to data dependency specification, which eases data modeling.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Modeling</kwd>
        <kwd>Data Modeling</kwd>
        <kwd>Process Enactment</kwd>
        <kwd>BPMN</kwd>
        <kwd>SQL</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Today, organizations use process-oriented systems to automate the enactment of their
business processes. To this end, processes are often captured in process models that focus
on the activities being performed. These models are executed by process engines such as
YAWL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], Activiti [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], jBPM [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Bonita [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], AristaFlow [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and Adept2 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Generally, a process engine has access to a process model repository as shown in Fig. 1.
As soon as a start event of a particular process occurs, the engine creates a new instance
of this process and enacts the control flow as specified by the process model. In doing so,
the process engine allocates user tasks to process participants via a
graphical user interface or invokes an application to execute service tasks.
      </p>
      <p>For the enactment of tasks, data plays an important role, because data specifies
pre- and postconditions of tasks. A precondition requires the availability of certain data in a
specified state, while a postcondition demands a certain manipulation of data. In modern
activity-oriented process engines such as those mentioned above, these and further complex data
dependencies (e.g., creating and updating multiplicity relations between data objects)
have to be implemented manually by a process engineer, who specifies the respective
data access statements (see shaded elements in Fig. 1 left); this is error-prone and
time-consuming work.</p>
      <p>[Fig. 1, left: Process Designer, Process Participant, Business Process Modeling, Business Process Model Repository, Graphical User Interface, Invoked Applications, Process Engine, Process Engineer, Data access statements, Database. Right: Business Process and Data Modeling, Business Process Model with Data Repository, Graphical User Interface, Invoked Applications, Process Engine, Database.]</p>
      <p>
        In this paper, we explain an approach to model data dependencies in the process
model itself and automatically derive data access statements from the process model as
shown in Fig. 1 right. Process data utilized during activity execution is out of scope in
this paper. We demonstrate the feasibility of this approach using the industry standard
for process modeling, the Business Process Model and Notation (BPMN) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and the
Activiti process engine [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Next, Section 2 shows how to model complex data dependencies in BPMN; Section 3
shows how three simple conservative extensions of the industrial process engine Activiti
suffice to enact complex data dependencies from a BPMN model. We conclude in
Section 4.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Modeling Complex Data Dependencies in BPMN</title>
      <p>
        BPMN provides the concept of data objects that are associated with tasks [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Roughly,
a task is only enabled when its associated input data objects are in a particular state.
Associated output data objects have to be in a specified state when the task completes.
However, for enactment more information is required.
      </p>
      <p>Figure 2 shows a standard BPMN model of a simplified build-to-order process of a
computer manufacturer (ignoring annotations set in italics). In this process, an Order that was
received from a customer is first Checked and either rejected or confirmed. If it is
confirmed, task Create component list creates several Components to be ordered;
based on these components the order is processed in a subprocess and, when completed,
the Order is sent to the customer.</p>
      <p>[Fig. 2. Build-to-order Process of a Computer Manufacturer. Pool: Computer Manufacturer; tasks: Check order, Create component list, Process order; case object: Order; data object Order (pk: o_id) in states received, rejected, confirmed, and sent; data object Components ([new], state created; pk: cp_id, fk: o_id).]</p>
      <p>In BPMN, each data object has a name and a set of attributes, one of which describes
the state of the data object. Data flow edges express pre- and postconditions of the
tasks, e.g., Check order is only enabled if object Order exists in state received in the
current process instance. However, when multiple orders are handled in different instances
in parallel, the process model does not express which order is correlated to which process
instance. Likewise, BPMN cannot describe create, read, update, and delete operations
on one or more objects of the same kind, possibly in a 1:n or m:n relationship to other
data objects. For instance, one cannot express that task Create component list of Fig. 2
creates several new Component objects and associates them with the Order handled in the
process. Such data dependencies would have to be implemented manually.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], we have shown that a few simple additional annotations to BPMN data objects
suffice to describe such complex data dependencies with operational semantics directly
in BPMN. First, borrowing an idea from business artifacts [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], we propose that for
each process instance (and each instance of a subprocess) there exists exactly one data object
instance driving and orchestrating the execution of the process instance. All other data
objects used in the instance depend on that case object. The case object of Fig. 2 is an
Order, as shown by the annotation. Dependencies between data objects are expressed via
primary and foreign key attributes in analogy to databases [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Each data object has a
primary key attribute that uniquely distinguishes different instances of this object, e.g.,
Order has the primary key attribute o_id and Component has cp_id. Foreign key attributes
link object instances, e.g., attribute o_id links Components to Orders. The primary key
of the case object implicitly links to the instance identifier of the (sub-)process. Read
and update of data objects are already provided through BPMN’s data flow edges. We
express create or delete through respective annotations in the BPMN data object, e.g.,
Create component list creates several new Components (annotation [new] and
multi-instance marker |||) and relates them to the current Order.
      </p>
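      <p>To illustrate the key annotations, the data objects of Fig. 2 could be stored in relational tables along the following lines. This is a minimal sketch, assuming Python with an in-memory SQLite database; the table layout, the column name state, and the helper create_component_list are hypothetical and are not taken from the implementation. The table name Order is quoted because ORDER is a reserved word in SQL.</p>
      <preformat><![CDATA[
```python
import sqlite3

# Hypothetical relational layout for the data objects of Fig. 2.
SCHEMA = """
CREATE TABLE "Order" (
    o_id  INTEGER PRIMARY KEY,  -- primary key of the case object
    state TEXT NOT NULL
);
CREATE TABLE Component (
    cp_id INTEGER PRIMARY KEY,  -- primary key of Component
    o_id  INTEGER NOT NULL REFERENCES "Order"(o_id),  -- foreign key to the Order
    state TEXT NOT NULL
);
"""


def create_component_list(conn, order_id, count):
    """Create `count` Components in state 'created', all linked to one Order (1:n)."""
    conn.executemany(
        "INSERT INTO Component (o_id, state) VALUES (?, 'created')",
        [(order_id,)] * count,
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO \"Order\" (o_id, state) VALUES (1, 'confirmed')")
create_component_list(conn, 1, 3)

# Count the Components that reference Order 1 through the foreign key.
linked = conn.execute(
    "SELECT COUNT(*) FROM Component WHERE o_id = 1"
).fetchone()[0]
print(linked)
```
]]></preformat>
      <p>In this sketch, the foreign key o_id in Component realizes the 1:n relationship between an Order and its Components.</p>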
      <p>
        Most importantly, these annotations have operational semantics. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], it is shown
how to derive SQL queries from annotated BPMN data objects that realize the specified
data operations. For example, for object Order in state rejected written by activity
Check order in Fig. 2, the following SQL query is derived: UPDATE Order SET
state = 'rejected' WHERE o_id = $ID, with $ID representing the identifier
of the (sub-)process the activity belongs to. See [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for full details.
      </p>
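      <p>The derivation of the update statement for the Order example can be sketched as follows. This is an illustrative sketch in Python, not the derivation procedure of the cited work: the class OutputObject and the function derive_update are hypothetical names, the table name is quoted because ORDER is a reserved word in SQL, and the $ID placeholder is replaced by a bound query parameter.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputObject:
    """Hypothetical in-memory form of an annotated BPMN output data object."""
    name: str         # data object name, e.g. "Order"
    state: str        # state annotated on the data object, e.g. "rejected"
    primary_key: str  # primary key attribute, e.g. "o_id"


def derive_update(obj, instance_id):
    """Build a parameterized UPDATE that moves the object into its output state.

    Table and column names come from the process model; the state and the
    (sub-)process identifier are passed as bound parameters.
    """
    sql = f'UPDATE "{obj.name}" SET state = ? WHERE {obj.primary_key} = ?'
    return sql, (obj.state, instance_id)


sql, params = derive_update(OutputObject("Order", "rejected", "o_id"), 42)
print(sql)
print(params)
```
]]></preformat>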
    </sec>
    <sec id="sec-3">
      <title>3 Tool Architecture and Implementation</title>
      <p>
        We implemented the approach of Section 2 to show its feasibility. In the spirit of adding
only a few data annotations to BPMN, we made an existing BPMN process engine
data-aware with only a few additions to its control structures. As a basis, we chose Activiti [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a
Java-based, lightweight, and open source process engine specifically tailored for a subset
of BPMN. Activiti enacts process models given in the BPMN XML format. Activiti
supports standard BPMN control flow constructs. Data dependencies are not enacted from
the process model, but are specified separately. We adapted the Camunda [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] modeler
to allow the creation of BPMN models with our proposed concepts; we extended the
Activiti engine to enact process models with these concepts.
      </p>
      <p>First, we extended the XML specification by utilizing extension elements, which
the BPMN specification explicitly provides for adding new attributes and properties to
existing constructs. We added the case object as an additional property to the (sub-)process
construct. The data object was extended with additional properties for the primary key
(exactly one) and foreign keys (arbitrary number), and with the data access type as an attribute.
The BPMN parser of Activiti was extended to read BPMN data objects with the new
attributes and properties, as well as data associations.</p>
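      <p>Reading such extension elements can be sketched as follows. The sketch uses Python's standard xml.etree.ElementTree module rather than the Activiti parser, and it assumes a made-up extension namespace (http://example.org/bpmn-data) with the hypothetical names caseObject, pk, fk, and accessType; the actual element and attribute names of the implementation are not given here. Only the BPMN model namespace and the elements process, dataObject, and extensionElements are taken from the BPMN 2.0 standard.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

BPMN = "http://www.omg.org/spec/BPMN/20100524/MODEL"
EXT = "http://example.org/bpmn-data"  # hypothetical extension namespace

# Hypothetical model fragment; the ext:* names are assumptions for illustration.
SNIPPET = f"""
<definitions xmlns="{BPMN}" xmlns:ext="{EXT}">
  <process id="buildToOrder">
    <extensionElements>
      <ext:caseObject>Order</ext:caseObject>
    </extensionElements>
    <dataObject id="do1" name="Component" ext:accessType="new">
      <extensionElements>
        <ext:pk>cp_id</ext:pk>
        <ext:fk>o_id</ext:fk>
      </extensionElements>
    </dataObject>
  </process>
</definitions>
""".strip()


def read_case_object(root):
    """Return the case object declared on the first process element."""
    path = f"{{{BPMN}}}process/{{{BPMN}}}extensionElements/{{{EXT}}}caseObject"
    return root.findtext(path)


def read_data_objects(root):
    """Map each data object name to its primary key, foreign keys, and access type."""
    result = {}
    for obj in root.iter(f"{{{BPMN}}}dataObject"):
        ext = obj.find(f"{{{BPMN}}}extensionElements")
        result[obj.get("name")] = {
            "pk": ext.findtext(f"{{{EXT}}}pk"),
            "fks": [fk.text for fk in ext.findall(f"{{{EXT}}}fk")],
            "access": obj.get(f"{{{EXT}}}accessType"),
        }
    return result


root = ET.fromstring(SNIPPET)
case_object = read_case_object(root)
objects = read_data_objects(root)
print(case_object, objects)
```
]]></preformat>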
      <p>[Fig. 3: the .bpmn file (with extension elements) is read by the Parser into the Internal Representation (BPMN Data Objects; Control Flow and Resource), which is used by the Engine. Activity Behavior of an activity A: pre: loop SELECT ... FROM PRE WHERE ... while resultSet.size() == 0; post: UPDATE POST ...]</p>
      <p>The actual execution engine was extended at two points: before invoking the
execution of an activity, to check its preconditions, and before completing an
activity, to realize its postconditions, both with respect to data objects. At either point,
the engine checks for patterns of data input and output objects and categorizes them.
For instance, in Fig. 2, Order is input and output to Check order in different states. The
engine classifies this as a “conditional update of case object Order”. The data operations
at task Create component list would be classified as “conditional creation of multiple
data objects that depend on the case object (1:n relationship)”. Classification proceeds
from most specific to most general patterns.</p>
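      <p>A classification that proceeds from most specific to most general patterns can be sketched as an ordered rule list in which the first match wins. The following Python sketch is an assumption about how such a classifier might look; the class DataAssoc, the pattern labels, and the predicates are hypothetical and cover only a small part of the patterns distinguished by the engine.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAssoc:
    """Hypothetical view of one data object attached to a task."""
    name: str
    state: str
    is_new: bool = False      # carries the [new] annotation
    is_multi: bool = False    # carries the multi-instance marker
    fk_to_case: bool = False  # has a foreign key to the case object


# Ordered from most specific to most general; the first match wins.
# Labels and predicates are illustrative assumptions.
PATTERNS = [
    ("create multiple dependent objects (1:n)",
     lambda out, case, read: out.is_new and out.is_multi and out.fk_to_case),
    ("create single dependent object",
     lambda out, case, read: out.is_new and out.fk_to_case),
    ("update case object",
     lambda out, case, read: out.name == case and case in read),
    ("update data object",
     lambda out, case, read: out.name in read),
]


def classify(case_object, inputs, outputs):
    """Return one pattern label per output data object of a task."""
    read = {assoc.name for assoc in inputs}
    labels = []
    for out in outputs:
        label = next(
            (name for name, matches in PATTERNS if matches(out, case_object, read)),
            "unclassified",
        )
        labels.append(label)
    return labels


check_order = classify(
    "Order",
    inputs=[DataAssoc("Order", "received")],
    outputs=[DataAssoc("Order", "rejected"), DataAssoc("Order", "confirmed")],
)
create_list = classify(
    "Order",
    inputs=[DataAssoc("Order", "confirmed")],
    outputs=[DataAssoc("Components", "created",
                       is_new=True, is_multi=True, fk_to_case=True)],
)
print(check_order, create_list)
```
]]></preformat>
      <p>Because the rule for multiple dependent objects precedes the rule for a single dependent object, an output that satisfies both is assigned the more specific label.</p>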
      <p>When invoking an activity, for each matching precondition pattern a corresponding
SQL select query is generated to check whether the required data objects are available.
The query assumes that for each data object of the process model there exists a table holding
all instances of this object and their attributes. If there is an object instance in the right
state, the SQL query returns the corresponding entry; otherwise, the result is empty. The engine
repeatedly queries the database until an entry is returned (i.e., the task is enabled), as
shown in Fig. 3. Then the activity is executed. Upon completion, an SQL insert, update,
or delete query is generated for each matching postcondition pattern and executed on
the database.</p>
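      <p>The enabling check and the completion step described above can be sketched as follows. This is a minimal sketch assuming Python with an in-memory SQLite database instead of the Java-based engine; the function names wait_until_enabled and complete, the polling interval, and the upper bound on polls are assumptions made so that the example terminates, whereas the engine is described as querying until an entry is returned.</p>
      <preformat><![CDATA[
```python
import sqlite3
import time


def wait_until_enabled(conn, instance_id, required_state,
                       poll_seconds=0.1, max_polls=50):
    """Poll until the Order of this instance is in the required state.

    Returns True once the precondition query yields a row, or False if
    max_polls is exhausted (the bound is an assumption of this sketch).
    """
    query = 'SELECT o_id FROM "Order" WHERE o_id = ? AND state = ?'
    for _ in range(max_polls):
        row = conn.execute(query, (instance_id, required_state)).fetchone()
        if row is not None:
            return True
        time.sleep(poll_seconds)
    return False


def complete(conn, instance_id, new_state):
    """Realize the postcondition by updating the state of the case object."""
    conn.execute(
        'UPDATE "Order" SET state = ? WHERE o_id = ?',
        (new_state, instance_id),
    )


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Order" (o_id INTEGER PRIMARY KEY, state TEXT)')
conn.execute("INSERT INTO \"Order\" VALUES (7, 'received')")

enabled = wait_until_enabled(conn, 7, "received")
if enabled:
    # The activity itself would run here; afterwards the postcondition is applied.
    complete(conn, 7, "rejected")

final_state = conn.execute(
    'SELECT state FROM "Order" WHERE o_id = 7'
).fetchone()[0]
print(enabled, final_state)
```
]]></preformat>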
      <p>Altogether, we had to extend Activiti at merely four points to realize our concept,
as illustrated in Fig. 3: (1) at the XML specification, (2) at the parser and internal representation,
(3) when checking for enabling of activities, and (4) when completing an activity.
The extensions required just over 1000 lines of code, with around 600 lines being
concerned with classifying data access patterns and generating and executing SQL queries.
The extended engine, a graphical modeling tool, example process models, a screencast,
and a complete setup in a virtual machine are available together with the source code at
http://bpt.hpi.uni-potsdam.de/Public/BPMNData.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Conclusion</title>
      <p>
        In this paper, we presented an approach to automatically enact complex data
dependencies from activity-centric process models. The key concepts required are
data objects associated with tasks; a few annotations suffice to express relations between
data objects and the particular data operation. Our modeling tool helps to easily specify
the required annotations in a graphical user interface. From these annotations, SQL
queries can be automatically generated and executed by a process engine, covering
all fundamental data access operations: create, read, update, and delete of single data
objects and of related data objects in 1:n and m:n relationships [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. We have shown on the
process engine Activiti that minimal extensions to existing execution engines suffice to
implement this concept.
      </p>
      <p>
        Compared to other techniques and engines for enacting data dependencies from
models, our approach is less intrusive. The object-centric modeling paradigm [
        <xref ref-type="bibr" rid="ref11 ref5">5, 11</xref>
        ]
requires substantial changes to the infrastructure as completely new engines have to
be used. Process engines for this paradigm exist, e.g., PhilharmonicFlows [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and
Corepro [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], but they are incompatible with activity-centric approaches as supported
by BPMN. In this respect, our work fills a critical gap by allowing owners of
activity-centric processes to adopt automated enactment of data dependencies without changing
paradigms and infrastructure.
      </p>
      <p>
        Acknowledgements. We thank Kimon Batoulis, Sebastian Kruse, Thorben Lindhauer, and Thomas
Stoff for extending the Camunda modeler [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in the course of their master project to support the
modeling of processes with respect to the concepts described in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ter Hofstede</surname>
            ,
            <given-names>A.H.M.:</given-names>
          </string-name>
          <article-title>YAWL: Yet Another Workflow Language</article-title>
          .
          <source>Information Systems</source>
          <volume>30</volume>
          (
          <issue>4</issue>
          ),
          <fpage>245</fpage>
          -
          <lpage>275</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Activiti: Activiti BPM Platform. https://www.activiti.org/</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Bonitasoft: Bonita Process Engine. https://www.bonitasoft.com/</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Camunda:
          <article-title>Camunda BPM platform</article-title>
          . https://www.camunda.org/
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cohn</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hull</surname>
          </string-name>
          , R.:
          <article-title>Business artifacts: A data-centric approach to modeling business operations and processes</article-title>
          .
          <source>IEEE Data Eng. Bull</source>
          .
          <volume>32</volume>
          (
          <issue>3</issue>
          ),
          <fpage>3</fpage>
          -
          <lpage>9</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. JBoss: jBPM Process Engine. https://www.jboss.org/jbpm/</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. Künzle, V.,
          <string-name>
            <surname>Reichert</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>PHILharmonicFlows: Towards a Framework for Object-aware Process Management</article-title>
          .
          <source>Journal of Software Maintenance and Evolution: Research and Practice</source>
          <volume>23</volume>
          (
          <issue>4</issue>
          ),
          <fpage>205</fpage>
          -
          <lpage>244</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lanz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reichert</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dadam</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Robust and flexible error handling in the AristaFlow BPM Suite</article-title>
          .
          <source>In: CAiSE Forum</source>
          <year>2010</year>
          . vol.
          <volume>72</volume>
          , pp.
          <fpage>174</fpage>
          -
          <lpage>189</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Meyer, A.,
          <string-name>
            <surname>Pufahl</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fahland</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weske</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Modeling and Enacting Complex Data Dependencies in Business Processes</article-title>
          .
          <source>In: Business Process Management. LNCS</source>
          , vol.
          <volume>8094</volume>
          , pp.
          <fpage>171</fpage>
          -
          <lpage>186</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reichert</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herbst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Data-driven modeling and coordination of large process structures</article-title>
          .
          <source>In: OTM 2007. LNCS</source>
          , vol.
          <volume>4803</volume>
          , pp.
          <fpage>131</fpage>
          -
          <lpage>149</lpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Nigam</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caswell</surname>
            ,
            <given-names>N.S.:</given-names>
          </string-name>
          <article-title>Business artifacts: An approach to operational specification</article-title>
          .
          <source>IBM Systems Journal</source>
          <volume>42</volume>
          (
          <issue>3</issue>
          ),
          <fpage>428</fpage>
          -
          <lpage>445</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. OMG:
          <article-title>Business Process Model and Notation (BPMN), Version 2.0</article-title>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Reichert</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rinderle-Ma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dadam</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Flexibility in process-aware information systems</article-title>
          .
          <source>ToPNoC 5460</source>
          ,
          <fpage>115</fpage>
          -
          <lpage>135</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Silberschatz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Korth</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sudarshan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>Database System Concepts</source>
          , 4th Edition. McGraw-Hill Book Company (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>