=Paper= {{Paper |id=Vol-2575/paper1 |storemode=property |title=Towards the Discovery of Object-Aware Processes |pdfUrl=https://ceur-ws.org/Vol-2575/paper1.pdf |volume=Vol-2575 |authors=Marius Breitmayer,Manfred Reichert |dblpUrl=https://dblp.org/rec/conf/zeus/BreitmayerR20 }} ==Towards the Discovery of Object-Aware Processes== https://ceur-ws.org/Vol-2575/paper1.pdf
                       Towards the Discovery
                     of Object-Aware Processes

                     Marius Breitmayer and Manfred Reichert

     Institute of Databases and Information Systems, Ulm University, Germany
                {marius.breitmayer,manfred.reichert}@uni-ulm.de



       Abstract. A large body of research aims to reduce the manual effort of
       creating executable process models through the automated discovery of
       process models from the event logs created by information systems.
       Regarding activity-centric processes, such event logs
       comprise case ids and events related to the execution of process activities.
       However, there exist alternative process management paradigms, such as
       object-aware processes, for which existing algorithms fail to discover a
       sound model. These algorithms do not treat data as first-class citizens,
       but solely rely on the information from event logs. In consequence, ex-
       isting discovery algorithms are insufficient for discovering object-aware
       processes. To address this issue, discovery algorithms need to consider
       additional data sources (e.g., existing forms). This paper discusses the
       need for dedicated discovery techniques in object-aware processes.

       Keywords: object-aware processes, process mining, process discovery


1    Introduction
Despite the many mining approaches that exist for activity-centric processes, ad-
equate support for discovering data-centric process models, e.g., in the context
of artifact-centric processes [7], case handling [2], or object-aware processes [8],
is still lacking. While an activity-centric process model consists of a sequence of
activities that need to be executed in a defined order, a data-driven and -centric
process allows for greater flexibility through the use of declarative process rules
and generated forms [15,6]. Current process discovery algorithms are able to
discover the schema of an activity-centric process from an event log, whereas
information about the internal logic of the activities (e.g., user forms or data
required for an activity) is often neglected. As data is treated as a first-class
citizen in data-centric (e.g., object-aware) process management, the discovery of
corresponding models should consider this issue as well.
    To understand the nature of the problem at hand, a short introduction to
data-centric and object-aware process management is necessary. PHILharmonicFlows,
our approach to data-centric processes, introduces the concepts of
objects, object behavior, and object interaction. For each business object present
in a real-world business process, one such object exists. The latter comprises
data, represented by attributes, and a state-based process model describing the

   J. Manner, S. Haarmann, S. Kolb, O. Kopp (Eds.): 12th ZEUS Workshop, ZEUS 2020, Potsdam,
           Germany, 20-21 February 2020, published at http://ceur-ws.org/Vol-2575
 Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License
                           Attribution 4.0 International (CC BY 4.0).

object behavior in terms of an object lifecycle model. When data becomes avail-
able during runtime, this enables transitions between the various states of the
lifecycle process, i.e., execution is data-driven. In the e-learning system PHoo-
dle, a practical application of the PHILharmonicFlows system [5], examples of
business objects include Submission, Exercise, and Lecture (see Fig. 1 for the
respective data model). In turn, when the values of certain attributes, such as
Points or Feedback, become available at runtime, this enables the transition
between the states of a lifecycle process (see Fig. 2). Finally, the interactions
between object lifecycles are managed by coordination processes [14].
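
The data-driven execution described above can be illustrated with a minimal sketch. It is not the PHILharmonicFlows implementation; the state layout (`REQUIRED`), the class `Submission`, and the method `set_value` are hypothetical names, and the mapping of attributes to states is only loosely read off Fig. 2.

```python
from dataclasses import dataclass, field

# Assumption: each state lists the attributes that must be filled
# before the lifecycle may leave that state (loosely based on Fig. 2).
REQUIRED = {
    "Edit": ["Exercise", "Files"],
    "Submitted": ["Points", "Feedback"],
}


@dataclass
class Submission:
    values: dict = field(default_factory=dict)
    state: str = "Edit"

    def set_value(self, attribute, value):
        """Write an attribute value, then re-evaluate the lifecycle."""
        self.values[attribute] = value
        self._advance()

    def _advance(self):
        # Keep advancing while the data required by the current state is complete.
        while self.state in REQUIRED and all(
            a in self.values for a in REQUIRED[self.state]
        ):
            if self.state == "Edit":
                self.state = "Submitted"
            else:
                # Assumption: the Feedback value selects the end state.
                self.state = "Passed" if self.values["Feedback"] else "Failed"


s = Submission()
s.set_value("Exercise", "Exercise #1")
s.set_value("Files", "solution.pdf")
print(s.state)  # Submitted
s.set_value("Points", 8)
s.set_value("Feedback", True)
print(s.state)  # Passed
```

No activity is ever invoked explicitly in this sketch: a state change happens only as a side effect of data becoming available, which is the property that activity-oriented event logs do not capture.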

[Figure 1: data model with the objects Lecture, Exercise, Submission, Tutorial, Download, and Attendance, connected by 1:n relations.]

[Figure 2: Exercise objects #1 and #2 (lifecycle states Edit, Published, Past Due; attributes Description: String, Exercise Files: File, Submission: File, Due Date: Date) and Submission objects #1 to #3 (lifecycle states Edit, Submitted, Passed, Failed; attributes Exercise: String, Files: File, Points: Integer, Feedback: Bool; transition conditions Feedback == true and Feedback == false). States are assigned to the roles Supervisor, Student, and Tutor.]

     Fig. 1. Data Model             Fig. 2. Objects with Lifecycles and Interaction



2      Related Work

Process discovery summarizes techniques that leverage information from event
logs to discover process models [3]. For activity-centric processes, there exist
a variety of approaches (see [1,17] for an overview). Various algorithms use
event logs as input to discover an activity-centric process model. Regarding
data-centric processes [16], however, there exist only a few approaches for discovering
process models. [12] describes an approach for discovering artifacts and their
lifecycles from structured datasets as opposed to lifecycle-enabled objects in our
approach. Furthermore, [9] deals with methods for the discovery of artifacts and the
interactions between them; additionally, an evaluation based on real-life datasets
from ERP systems is provided. Moreover, [13] decomposes the problem of artifact
lifecycle discovery such that existing process mining algorithms can be applied.
The construction of data and object models from different data structures (e.g.,
databases, legacy systems) has been investigated in reverse engineering [4,10].
While database reverse engineering reconstructs logical or conceptual models,
other aspects of data-driven process management are neglected (e.g., lifecycles
or the interactions between object lifecycles). An approach to automatically
generate event logs from databases is described in [11]. Since data is treated as a
first-class citizen in object-aware process management, additional information
(i.e., data sources) needs to be considered to discover an object-aware process.

3    Research Direction
In our PHILharmonicFlows framework, an object-aware process consists of a data
model, one lifecycle model for each object, and a coordination process enforcing
constraints regarding object interactions [8]. In order to discover an executable
object-aware process, all three aspects need to be considered. For the discovery
of various aspects of object-aware processes (e.g., relations between objects or
states of a lifecycle), solely considering event logs is not sufficient and, hence, ad-
ditional data sources need to be taken into account. For example, the data model
underlying an object-aware process provides the foundation for both lifecycles
and object coordination [5].
    The first step during process discovery is to identify objects, including
their attributes and relations. Note that the structure of a normalized relational
database, to a certain degree, is comparable to a data model, which offers the
opportunity to discover the data model from the structure (i.e., the create table
statements) of a database. Each table in the database may, but does not have to,
correspond to an object in the data model, whereas the columns of a table represent
the attributes of an object. One-to-many relations between tables can be used
to identify relations between the objects of a data model. Additionally, relations
can be used as an indicator of whether a table corresponds to an actual object.
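
As a rough illustration of this first step, the following sketch reads candidate objects, attributes, and 1:n relations from the schema of an in-memory SQLite database. The schema itself (table and column names such as `lecture.title`) is a hypothetical example loosely modeled on Fig. 1, and the function `discover_data_model` is an assumed name, not part of any existing tool. Treating every table as an object is a simplification, since a table need not correspond to an object.

```python
import sqlite3

# Hypothetical schema, loosely modeled on the objects in Fig. 1.
SCHEMA = """
CREATE TABLE lecture (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE exercise (
    id INTEGER PRIMARY KEY,
    description TEXT,
    due_date TEXT,
    lecture_id INTEGER REFERENCES lecture(id)
);
CREATE TABLE submission (
    id INTEGER PRIMARY KEY,
    points INTEGER,
    feedback INTEGER,
    exercise_id INTEGER REFERENCES exercise(id)
);
"""


def discover_data_model(conn):
    """Return candidate objects (table -> attributes) and 1:n relations."""
    objects, relations = {}, []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        fks = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
        fk_columns = {fk[3] for fk in fks}  # fk[3]: referencing column
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        # Primary and foreign keys are structural; the remaining
        # columns are treated as candidate attributes (c[5]: pk flag).
        objects[table] = [c[1] for c in columns
                          if not c[5] and c[1] not in fk_columns]
        # fk[2] is the referenced (parent) table: parent 1:n child.
        relations += [(fk[2], table) for fk in fks]
    return objects, relations


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
objects, relations = discover_data_model(conn)
print(objects)
print(relations)
```

Tables that take part in no relation at all could then be flagged for manual review, in line with the idea of using relations as an indicator for genuine objects.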
    After discovering the data model, the object lifecycles need to be discovered
in the second step. Based on the attributes from the data model, lifecycle discovery
shall deliver object states as well as the transitions between them. In
general, a lifecycle process may enter another state if all necessary data (i.e.,
attribute values) are available. In particular, lifecycle states cannot be discovered
from event logs whose entries solely refer to activities, due to the mismatch between
states (i.e., defined by attributes) and cases (i.e., a collection of activities).
To tackle this mismatch, discovery algorithms for object lifecycles, suitable event
log preprocessing (e.g., splitting an event log into event logs for each object),
and additional data sources (e.g., forms of existing information systems) need
to be considered as well during the discovery process.
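
The preprocessing mentioned above, i.e., splitting an event log into event logs for each object, might look as follows. This is a sketch under a strong assumption: each log entry already carries a reference to the object it concerns, which a purely activity-oriented log does not guarantee. The log format and the names `EVENT_LOG` and `split_by_object` are hypothetical.

```python
from collections import defaultdict

# Hypothetical entries: (timestamp, object type, object id, attribute written).
EVENT_LOG = [
    (1, "Exercise", 1, "Description"),
    (2, "Exercise", 1, "Due Date"),
    (3, "Submission", 1, "Files"),
    (4, "Submission", 2, "Files"),
    (5, "Submission", 1, "Points"),
]


def split_by_object(log):
    """Group events into one time-ordered sub-log per object instance."""
    sub_logs = defaultdict(list)
    for timestamp, obj_type, obj_id, attribute in sorted(log):
        sub_logs[(obj_type, obj_id)].append((timestamp, attribute))
    return dict(sub_logs)


sub_logs = split_by_object(EVENT_LOG)
print(sub_logs[("Submission", 1)])  # [(3, 'Files'), (5, 'Points')]
```

Each sub-log then shows the order in which the attributes of a single object were written, which is closer to the notion of a lifecycle state than a case spanning several objects.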
    The third step in discovering an object-aware process is to unravel the
coordination of interactions between objects (e.g., a submission may only be
created if the corresponding exercise is in state Published). As object interactions
can only be discovered once the data model and the lifecycles are present, their
discovery is a secondary problem for now.
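
The example constraint can be written down as a simple check. The sketch below is not how coordination processes [14] are represented in PHILharmonicFlows; the table `CREATION_CONSTRAINTS` and the function `may_create` are hypothetical and only show the kind of rule a discovery technique would have to produce.

```python
# Hypothetical constraint table: creating a child object of the given type
# requires the related parent object to be in one of the listed states.
CREATION_CONSTRAINTS = {
    ("Exercise", "Submission"): {"Published"},
}


def may_create(parent_type, parent_state, child_type):
    """Check whether the coordination constraint permits creating the child."""
    allowed = CREATION_CONSTRAINTS.get((parent_type, child_type))
    # No registered constraint for this pair means no restriction in this sketch.
    return allowed is None or parent_state in allowed


print(may_create("Exercise", "Published", "Submission"))  # True
print(may_create("Exercise", "Edit", "Submission"))       # False
```

Evaluating such a rule needs both the relation between the two object types and the state of the parent object, which is why the data model and the lifecycles have to be discovered first.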

4    Conclusion
This paper discusses the need for spending research efforts on the discovery of
object-aware process models. As a major advantage, the discovery of an
object-aware process allows identifying the underlying logic of a process. Finally, due
to the strong linkage between process and data in object-aware processes, it is
possible that not every aspect of each element (i.e., data model, lifecycles, and
coordination) can be discovered from the presented data sources; therefore,
further research is needed.

References
 1. van der Aalst, W.M.P.: Process Mining: Data Science in Action. Springer, 2 edn.
    (2016)
 2. van der Aalst, W.M.P., Weske, M., Grünbauer, D.: Case handling: a new paradigm
    for business process support. DKE 53(2), 129–162 (2005)
 3. van der Aalst, W.M.P., et al.: Process mining manifesto. In: Int’l Conf on BPM’11.
    pp. 169–194 (2011)
 4. Alhajj, R.: Extracting the extended entity-relationship model from a legacy
    relational database. Information Systems 28(6), 597–618 (2003)
 5. Andrews, K., Steinau, S., Reichert, M.: Engineering a highly scalable object-
    aware process management engine using distributed microservices. In: Int’l Conf
    on CoopIS’18. pp. 80–97 (2018)
 6. Andrews, K., Steinau, S., Reichert, M.: Enabling runtime flexibility in data-centric
    and data-driven process execution engines. Information Systems p. 101447 (2019)
 7. Cohn, D., Hull, R.: Business artifacts: A data-centric approach to modeling busi-
    ness operations and processes. IEEE Data Eng. Bull. 32(3), 3–9 (2009)
 8. Künzle, V., Reichert, M.: PHILharmonicFlows: towards a framework for object-
    aware process management. J of Soft Maint & Evo 23(4), 205–244 (2011)
 9. Lu, X., Nagelkerke, M., van de Wiel, D., Fahland, D.: Discovering interacting
    artifacts from ERP systems. IEEE Trans Serv Com 8(6), 861–873 (2015)
10. Mfourga, N.: Extracting entity-relationship schemas from relational databases: a
    form-driven approach. In: WCRE’97. pp. 184–193 (1997)
11. de Murillas, et al.: Case notion discovery and recommendation: automated event
    log building on databases. Know & Inf Sys (2019)
12. Nooijen, E.H.J., van Dongen, B.F., Fahland, D.: Automatic discovery of data-
    centric and artifact-centric processes. In: Int’l Conf on BPM’12. pp. 316–327 (2012)
13. Popova, V., Fahland, D., Dumas, M.: Artifact lifecycle discovery. Int’l J of Coop
    Inf Sys 24(01), 1550001 (2013)
14. Steinau, S., Andrews, K., Reichert, M.: Modeling process interactions with coor-
    dination processes. In: CoopIS’18. pp. 21–39. LNCS, Springer (2018)
15. Steinau, S., Andrews, K., Reichert, M.: Executing lifecycle processes in object-
    aware process management. In: Data-Driven Process Discovery and Analysis. pp.
    25–44. Springer (2019)
16. Steinau, S., Marrella, A., Andrews, K., Leotta, F., Mecella, M., Reichert, M.:
    DALEC: A framework for the systematic evaluation of data-centric approaches
    to process management software. Softw & Sys Modeling 18(4), 2679–2716 (2019)
17. Weerdt, J.D., Backer, M.D., Vanthienen, J., Baesens, B.: A multi-dimensional qual-
    ity assessment of state-of-the-art process discovery algorithms using real-life event
    logs. Inf Sys 37(7), 654–676 (2012)