<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Integrated Data and Process Management: Finally?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marlon Dumas</string-name>
          <email>marlon.dumas@ut.ee</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tartu</institution>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Contemporary information systems are generally built on the principle of segregation of data and processes. Data are modeled in terms of entities and relationships while processes are modeled as chains of events and activities. This situation engenders an impedance mismatch between the process layer, the business logic layer and the data layer. We discuss some of the issues that this impedance mismatch raises and analyze how and to what extent these issues are addressed by emerging artifact-centric process management paradigms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Data management and process management are both well-trodden fields – but each in
its own way. Well-established data analysis and design methods allow data analysts to
identify and to capture domain entities and to refine these domain entities down to the
level of database schemas in a seamless and largely standardized manner.
Concomitantly, database systems and associated middleware enable the development of robust
and scalable data-driven applications, while contemporary packaged enterprise systems
support hundreds of business activities on top of shared databases.</p>
      <p>In a similar vein, well-documented and proven process analysis and design methods
allow process analysts to identify and to capture process models at different levels of
abstraction, ranging from high-level process models suitable for qualitative analysis and
organizational redesign down to the level of executable processes that can be deployed
in Business Process Management Systems (BPMS).</p>
      <p>But while data management and process management are each well supported by
their own body of mature methods and tools, these methods and tools are at best loosely
integrated. For example, when it comes to accessing data, BPMS typically rely on
request-response interactions with database applications or packaged enterprise
systems. Typically, data fetched from these systems are copied into the “working memory”
of the BPMS. The data in this working memory are then used to evaluate business rules
relevant to the execution of the process, and to orchestrate both manual and automated
work. But the burden of synchronizing the working data maintained by the BPMS with
the data maintained by the underlying systems is generally left with the developers.</p>
      <p>More generally, the “data vs. process” divide leads to an impedance mismatch
between the data layer, the business logic layer and the process layer, which in the long
run hinders the coherence and maintainability of information systems. In particular,
the data vs. process divide has the following effects:</p>
      <list list-type="bullet">
        <list-item>
          <p>Process-related and function-related data redundancy. The BPMS maintains data
about the state of the process, since these data are needed to enable the
system to schedule tasks, to react to events and to evaluate predicates attached to
decision points in the process. On the other hand, data entities manipulated by
the process are stored in the database(s) underpinning the applications with which
the BPMS interacts. Hence, the state of the entities is stored both by the BPMS
and by the underlying applications. In other words, data are managed redundantly
at the database layer and at the process layer, thereby adding development and
maintenance complexity.</p>
        </list-item>
        <list-item>
          <p>Business rules fragmentation and redundancy. Some business rules are encoded
at the level of the business process, others in the business logic layer (e.g. using a
business rules engine) and others in the database (in the form of triggers or integrity
constraints). Worse, some rules are encoded at different levels depending on the
type of rule and the data involved. This fragmentation and redundancy hampers
maintainability and potentially leads to inconsistencies.</p>
        </list-item>
      </list>
      <p>
        The effects of this mismatch are perhaps less apparent when a one-to-one mapping
exists between the instances of a given process and the entities of a given entity type.
This is the case, for example, in a typical invoice handling process where one process
instance (also called a “case”) corresponds exactly to one invoice. In this context, the
state of a process instance maps neatly to the state of an entity. Hence, the data required
by the process, for example when evaluating branching conditions, are restricted to the
data contained in the associated entity (i.e. the invoice in this example) and possibly to
the state of other entities within the logical horizon [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] of the said entity, e.g. the
Purchase Order (PO) associated with the invoice. Accordingly, collecting the data needed
to evaluate the business rules of this process is relatively simple, while
synchronizing the state of the process instance with the state of its associated entity (at the
business logic and data layers) does not pose a major burden.
      </p>
      <p>
        The impedance mismatch, however, becomes much more evident when this
one-to-one correspondence between processes and entities does not hold. Consider for example
a shipment process where a single shipment may contain products for multiple
customers, ordered by means of multiple purchase orders (POs) and invoiced by means of
multiple invoices, perhaps even multiple POs and multiple invoices per customer
involved. Furthermore, consider the case where the products requested in a given PO are
not necessarily sent all in a single shipment, but instead may be spread across multiple
shipments. In this setting, the effects of a customer canceling a PO are not confined
to a single instance of the shipment process. Similarly, the effects of a delayed
shipment are not restricted to a single PO. Consequently, business rules related for example
to cancellation penalties, compensation for delayed deliveries or prioritization of
shipments become considerably more difficult to capture, to maintain and to reason about,
as exemplified in numerous case studies [
        <xref ref-type="bibr" rid="ref1 ref3 ref8 ref9">1, 3, 8, 9</xref>
        ]. Traditional process management
approaches quickly hit their limit when dealing with such processes. The outcome of
this limitation is that a significant chunk of the “process logic” has to be pushed down to
the business logic layer (e.g. in the form of business rules), which essentially voids the
benefits of adopting a structured process management approach supported by a BPMS.
      </p>
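      <p>The shipment scenario above can be made concrete with a minimal sketch. The class and field names (OrderLine, OrderBook, po_id, shipment_id) and the sample identifiers are illustrative assumptions, not part of any system discussed here. The sketch only shows that, once POs and shipments are related N-to-M, a single cancellation or delay fans out across several instances of the other entity type.</p>
      <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    po_id: str        # purchase order the line belongs to (hypothetical field)
    shipment_id: str  # shipment that carries the line (hypothetical field)
    product: str


@dataclass
class OrderBook:
    lines: list[OrderLine] = field(default_factory=list)

    def shipments_affected_by_cancellation(self, po_id: str) -> set[str]:
        """Shipments carrying at least one line of the cancelled PO."""
        return {line.shipment_id for line in self.lines if line.po_id == po_id}

    def pos_affected_by_delay(self, shipment_id: str) -> set[str]:
        """POs with at least one line on the delayed shipment."""
        return {line.po_id for line in self.lines if line.shipment_id == shipment_id}


# One PO spread over two shipments, one shipment serving two POs.
book = OrderBook([
    OrderLine("PO-1", "SH-A", "widget"),
    OrderLine("PO-1", "SH-B", "gadget"),
    OrderLine("PO-2", "SH-A", "gizmo"),
])
print(sorted(book.shipments_affected_by_cancellation("PO-1")))  # ['SH-A', 'SH-B']
print(sorted(book.pos_affected_by_delay("SH-A")))  # ['PO-1', 'PO-2']
```
      </preformat>
      <p>In an activity-centric model where each shipment is one case, neither query can be answered from within a single process instance, which is why such rules tend to migrate to the business logic layer.</p>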
      <p>
        Service-oriented architectures (SOAs) facilitate the inter-connection of applications
and application components. Their emergence has greatly facilitated the integration of
data-driven and process-driven applications. SOAs have also enabled packaged
enterprise software vendors to “open the box” by providing standardized programmatic
access to the vast functionality of their systems. But per se, SOAs do not address the
problem of data and process integration, since data-centric services and process-centric
services are still developed separately using different methods. A case in point is Thomas
Erl’s service-oriented design method [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which advocates that process-centric services
should be strictly layered on top of data-centric (a.k.a. entity-centric) services. Erl’s
approach consists of two distinct methods for designing process-centric services and
entity-centric services. This same principle permeates many other service-oriented
design methods [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Such approaches do not address the issues listed above. Instead,
they merely reproduce the data versus process divide by segregating data-centric
services and process-centric services.
      </p>
    </sec>
    <sec id="sec-2">
      <title>The Artifact-Centric Process Management Paradigm</title>
      <p>
This talk discusses emerging approaches that aim at addressing the shortcomings of the
traditional data versus processes divide. In particular, the keynote discusses the
emerging artifact-centric process management paradigm [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] and how this paradigm, in
conjunction with service-oriented architectures and associated platforms, enables higher
levels of integration and higher responsiveness to process change.
      </p>
      <p>Mainstream process modeling notations such as BPMN can be thought of as
being activity-centric in the sense that process models are structured in terms of flows
of events and activities. Modularity is achieved by decomposing activities into
subprocesses. Data manipulation is captured either by means of global variables defined
within the scope of a process or subprocess, or by means of conceptually passive data
objects that are created, read and/or updated by the events and activities in the process.
In contrast, the database applications and/or enterprise systems on top of which these
processes execute are usually structured in terms of objects that encapsulate data and/or
behavior. This duality engenders the above-mentioned impedance mismatch between
the process layer and the business logic and data layers.</p>
      <p>In contrast, artifact-centric process modeling paradigms aim at conceptually
integrating the process layer, the business logic layer and the data layer. Their key tenet is that
business processes should be conceived in terms of collections of artifacts that
encapsulate data and have an associated lifecycle. Transitions between the states in this
lifecycle are triggered by events coming from human actors, modules of an enterprise
system (possibly exposed as services) and possibly other artifacts, thus implying that
artifacts are inter-linked. In this way, the state of the process and the state of the entities
are naturally maintained “in sync” and business processes are conceived as networks
of inter-connected artifacts that may be connected according to N-to-M relations, thus
allowing one to seamlessly capture rules that span what would traditionally be
perceived as multiple process instances.</p>
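      <p>A minimal sketch of this idea follows, assuming a deliberately simple representation: each artifact carries its data, its current state and a lifecycle given as a state-transition table, and holds links to other artifacts. The Artifact class, the two lifecycle tables, the event names and the cancel_order rule are all hypothetical and are not drawn from any of the cited approaches.</p>
      <preformat>
```python
from dataclasses import dataclass, field

# Hypothetical lifecycles: state -> {event: next state}
PO_LIFECYCLE = {
    "open": {"ship": "shipped", "cancel": "cancelled"},
    "shipped": {"deliver": "delivered"},
}
SHIPMENT_LIFECYCLE = {
    "planned": {"dispatch": "in_transit", "replan": "planned"},
    "in_transit": {"arrive": "arrived"},
}


@dataclass
class Artifact:
    name: str
    lifecycle: dict
    state: str
    data: dict = field(default_factory=dict)
    linked: list = field(default_factory=list)  # N-to-M links to other artifacts

    def handle(self, event: str) -> bool:
        """Apply an event; return False if the lifecycle does not allow it."""
        next_state = self.lifecycle.get(self.state, {}).get(event)
        if next_state is None:
            return False
        self.state = next_state
        return True


def cancel_order(po: Artifact) -> list[str]:
    """Cancel a PO and replan every linked shipment that has not left yet."""
    replanned = []
    if po.handle("cancel"):
        for shipment in po.linked:
            if shipment.handle("replan"):
                replanned.append(shipment.name)
    return replanned


po = Artifact("PO-1", PO_LIFECYCLE, "open")
sh_a = Artifact("SH-A", SHIPMENT_LIFECYCLE, "planned")
sh_b = Artifact("SH-B", SHIPMENT_LIFECYCLE, "in_transit")
po.linked += [sh_a, sh_b]
print(cancel_order(po), po.state)  # ['SH-A'] cancelled
```
      </preformat>
      <p>Because the state lives in the artifact itself, there is no separate process-side copy to synchronize, and the cancellation rule reaches every linked shipment regardless of which case it would have belonged to.</p>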
      <p>
        The talk also discusses ongoing efforts within the Artifact-Centric Service
Interoperation (ACSI) project (<ext-link ext-link-type="uri" xlink:href="http://www.acsi-project.eu/">http://www.acsi-project.eu/</ext-link>). This project aims at combining the artifact-centric process
management paradigm with SOAs in order to achieve higher levels of abstraction
during business process integration across organizational boundaries. The key principle of
the ACSI project is that processes should be conceived as systems of artifacts that are
bound to services. The binding between artifacts and services specifies where
the data of the artifact should be pushed to, or where they should be pulled from, and when. In
the ACSI approach, process developers do not reason in terms of tasks that are mapped
to request-response interactions between a process and the underlying systems. Instead,
they reason in terms of artifacts, their lifecycles, operations and associated data.
Artifact lifecycles are captured based on a meta-model – namely Guard-Stage-Milestone
(GSM) – that allows one to capture behavior, data querying and manipulation in a
unified framework [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
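      <p>To give a rough intuition for the guard-stage-milestone idea, the sketch below models a single stage that opens when its guard holds and closes when its milestone is achieved, with both conditions evaluated over the artifact data. This is a strong simplification made under our own assumptions: the Stage class, its step method and the sample conditions are hypothetical, and the sketch omits nested stages, multiple guards and milestones, milestone invalidation and the event-driven semantics of the actual meta-model.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import Callable

Condition = Callable[[dict], bool]  # predicate over the artifact's data


@dataclass
class Stage:
    name: str
    guard: Condition      # when true, the stage may open
    milestone: Condition  # when true, the stage's goal is achieved and it closes
    is_open: bool = False
    achieved: bool = False

    def step(self, data: dict) -> None:
        """Re-evaluate guard and milestone against the current data."""
        if not self.is_open and not self.achieved and self.guard(data):
            self.is_open = True
        if self.is_open and self.milestone(data):
            self.is_open = False
            self.achieved = True


ship = Stage(
    "ship",
    guard=lambda d: d["paid"],
    milestone=lambda d: d["delivered"],
)
data = {"paid": False, "delivered": False}
ship.step(data)            # guard is false: the stage stays closed
data["paid"] = True
ship.step(data)            # guard holds: the stage opens
data["delivered"] = True
ship.step(data)            # milestone achieved: the stage closes
print(ship.is_open, ship.achieved)  # False True
```
      </preformat>
      <p>Even in this reduced form, behavior (when a stage opens and closes) and data (the conditions it is evaluated against) are expressed in one place rather than split across layers.</p>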
      <p>Upon this foundation, the ACSI project is building a proof-of-concept platform that
supports the definition and execution of artifact-centric business processes. Challenges
addressed by ACSI include the problem of reverse-engineering artifact systems from
enterprise system logs – for the purpose of legacy systems migration – and the
verification of artifact-centric processes, which by nature are infinite-state systems due to the
tight integration of processes and data.</p>
      <p>Acknowledgments. This paper is the result of collective discussions within the ACSI
project team. Thanks especially to Rick Hull for numerous discussions on this topic.
The ACSI project is funded by the European Commission’s FP7 ICT Program.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Kamal</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          , Nathan S. Caswell, Santhosh Kumaran, Anil Nigam, and
          <string-name>
            <given-names>Frederick Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Artifact-centered operational modeling: Lessons from customer engagements</article-title>
          .
          <source>IBM Systems Journal</source>
          ,
          <volume>46</volume>
          (
          <issue>4</issue>
          ):
          <fpage>703</fpage>
          -
          <lpage>721</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. David Cohn and
          <string-name>
            <given-names>Richard</given-names>
            <surname>Hull</surname>
          </string-name>
          .
          <article-title>Business artifacts: A data-centric approach to modeling business operations and processes</article-title>
          .
          <source>IEEE Data Eng. Bull.</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          ):
          <fpage>3</fpage>
          -
          <lpage>9</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Marlon</given-names>
            <surname>Dumas</surname>
          </string-name>
          .
          <article-title>On the convergence of data and process engineering</article-title>
          .
          <source>In Proc. of the 15th International Conference on Advances in Databases and Information Systems (ADBIS)</source>
          , Vienna, Austria, pages
          <fpage>19</fpage>
          -
          <lpage>26</lpage>
          . Springer,
          <year>September 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Erl</surname>
          </string-name>
          .
          <source>Service-Oriented Architecture (SOA): Concepts, Technology, and Design</source>
          .
          <publisher-name>Prentice Hall</publisher-name>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>P.</given-names>
            <surname>Feldman</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Entity model clustering: Structuring a data model by abstraction</article-title>
          .
          <source>The Computer Journal</source>
          ,
          <volume>29</volume>
          (
          <issue>4</issue>
          ):
          <fpage>348</fpage>
          -
          <lpage>360</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Richard Hull, Elio Damaggio, Riccardo De Masellis, Fabiana Fournier, Manmohan Gupta, Fenno Terry Heath, Stacy Hobson,
          <string-name>
            <given-names>Mark H.</given-names>
            <surname>Linehan</surname>
          </string-name>
          , Sridhar Maradugu, Anil Nigam, Piyawadee Noi Sukaviriya, and Roman Vaculín.
          <article-title>Business artifacts with guard-stage-milestone lifecycles: managing artifact interactions with conditions and events</article-title>
          .
          <source>In Proceedings of the Fifth ACM International Conference on Distributed Event-Based Systems (DEBS)</source>
          , New York, NY, USA, pages
          <fpage>51</fpage>
          -
          <lpage>62</lpage>
          . ACM,
          <year>July 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Kohlborn</surname>
          </string-name>
          , Axel Korthaus,
          <string-name>
            <given-names>Taizan</given-names>
            <surname>Chan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Rosemann</surname>
          </string-name>
          .
          <article-title>Identification and analysis of business and software services - a consolidated approach</article-title>
          .
          <source>IEEE Transactions on Services Computing</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>50</fpage>
          -
          <lpage>64</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Vera</given-names>
            <surname>Künzle</surname>
          </string-name>
          and Manfred Reichert.
          <article-title>PHILharmonicFlows: towards a framework for object-aware process management</article-title>
          .
          <source>Journal of Software Maintenance</source>
          ,
          <volume>23</volume>
          (
          <issue>4</issue>
          ):
          <fpage>205</fpage>
          -
          <lpage>244</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Guy</given-names>
            <surname>Redding</surname>
          </string-name>
          , Marlon Dumas, Arthur H. M. ter Hofstede, and
          <string-name>
            <given-names>Adrian</given-names>
            <surname>Iordachescu</surname>
          </string-name>
          .
          <article-title>A flexible, object-centric approach for business process modelling</article-title>
          .
          <source>Service Oriented Computing and Applications</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>191</fpage>
          -
          <lpage>201</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>