=Paper= {{Paper |id=Vol-1701/paper2 |storemode=property |title=Discovering Interacting Artifacts from ERP Systems (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-1701/paper2.pdf |volume=Vol-1701 |authors=Dirk Fahland,Xixi Lu,Marijn Nagelkerke,Dennis van de Wiel |dblpUrl=https://dblp.org/rec/conf/emisa/FahlandLNW16 }} ==Discovering Interacting Artifacts from ERP Systems (Extended Abstract)== https://ceur-ws.org/Vol-1701/paper2.pdf
                         Jan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016,
                                                           Gesellschaft für Informatik, Bonn 2016

Discovering Interacting Artifacts from ERP Systems
(Extended Abstract)3

D. Fahland1, X. Lu1, Marijn Nagelkerke2, Dennis van de Wiel2



Abstract: Enterprise Resource Planning (ERP) systems are widely used to manage business
documents along business processes and allow very detailed recording of event data of past process
executions and the documents involved. This recorded event data is the basis for auditing and
detecting unusual flows. Process mining techniques can analyze event data of processes stored in
linear event logs to discover a process model that reveals unusual executions. Existing techniques
assume a linear event log that uses a single case identifier to which all behavior can be related.
However, in ERP systems, processes such as Order to Cash operate on multiple interrelated business
objects, each of which has its own case identifier and its own behavior and interacts with the
others. Forcing these into a single case creates ambiguous dependencies caused by data convergence
and divergence, which obscure unusual flows in the resulting process model. We present a new
semi-automatic, end-to-end approach for analyzing event data in a plain database of an ERP system
for unusual executions. We identify an artifact-centric process model describing the business
objects, their life-cycles, and how the various objects interact along their life-cycles. The
technique was validated in two case studies and reliably revealed unusual flows later confirmed by
domain experts. The work summarized in this extended abstract has been published in [Lu15].

Keywords: Process Mining, ERP-System, Artifact-Centric Model, Object Life-Cycle, Interaction
Discovery



1    Introduction and Problem Description

Information systems (IS) not only store and process data in an organization but also record
event data about how and when information changed. This “historical event data” can be
used to analyze, for instance, whether information processing in the past conformed to the
prescribed processes or to compliance requirements. Process mining [Aa11] offers automated
techniques for this task. In particular, exploring visual models discovered from event
data makes it possible to identify unusual flows and their circumstances, based on which
concrete measures for process improvement can be devised [Ec15]. A prerequisite for this
analysis is a process event log that holds events about information changes, with the
assumption that each event belongs to one specific execution of a specific process.
In general, information access is not tied to a particular process execution; rather the same
information can be accessed and changed from various processes and applications. For
1 d.fahland@tue.nl, Eindhoven University of Technology, The Netherlands
2 KPMG IT Advisory N.V., Eindhoven, The Netherlands
3 This article summarizes problem, approach, and selected findings of a study published as Xixi Lu, Marijn
 Nagelkerke, Dennis van de Wiel, and Dirk Fahland. Discovering Interacting Artifacts from ERP Systems.
 Services Computing, IEEE Transactions on, 8(6), 2015 doi:10.1109/TSC.2015.2474358 [Lu15].


Sales documents (SD)
SD id   Date created   Reference id   Document type   Value   Last change
S1      16-5-2020      null           Sales Order     100     10-6-2020
S2      17-5-2020      null           Sales Order     200     31-5-2020
S3      10-6-2020      S1             Return Order    10      NULL

Delivery documents (DD)
DD id   Date created   Reference SD id   Reference BD   Document type     Picking date
D1      18-5-2020      S1                B1             Delivery          31-5-2020
D2      22-5-2020      S1                B2             Delivery          5-6-2020
D3      25-5-2020      S2                B2             Delivery          5-6-2020
D4      12-6-2020      S3                null           Return Delivery   NULL

Billing documents (BD)
BD id   Date created   Document type   Clearing date
B1      20-5-2020      Invoice         31-5-2020
B2      24-5-2020      Invoice         5-6-2020

Documents Changes
Change id   Date changed   Reference id   Table name   Change type               Old Value   New Value
1           17-5-2020      S1             SD           Price updated             100         80
2           19-5-2020      S1             SD           Delivery block released   X           -
3           19-5-2020      S1             SD           Billing block released    X           -
4           10-6-2020      B1             BD           Invoice date updated      20-6-2020   21-6-2020

[In the original figure, the labels F1 to F4 mark the relations drawn between the tables, and a
legend distinguishes parent table and child table.]

                                                Fig. 1: The tables of the simplified OTC example
[Figure 2 graphic. Panel (a) annotations: Divergence, 7 events “Created” related to S1;
Convergence, 3 events “Created” related to S2. Event types shown in panels (b) and (c): Sales order
created, Delivery created, Invoice created, Return order created, Return delivery created. Legend:
artifact; event type; causal relation or interaction; deviating interaction.]

Fig. 2: Creation of documents of Fig. 1 along time (a), a classic process model based on a single case
identifier (b), and an artifact-centric model based on 5 case identifiers (c).

instance, in Enterprise Resource Planning (ERP) systems (such as SAP and Oracle Enterprise),
information is stored in business objects (or documents) which are linked via
one-to-many and many-to-many relations, typically in a relational database. The objects
themselves are encapsulated in services [AMZ00] which are invoked by high-level end-to-end
business processes; each invocation is called a transaction, which is logged in the
data object itself.
Fig. 1 shows a simplified example of the transactional data of an Order to Cash (OTC)
process supported by SAP systems; Fig. 2(a) visualizes the events of Fig. 1 that are related
to document creation. There are two sales orders S1 and S2; creation of S1 is followed
by creation of a delivery document D1, an invoice B1, another delivery document D2, and
another invoice B2 which also contains billing information about S2. Creation of S2 is
also followed by creation of another delivery document D3. Further, there is a return order
S3 related to S1 with its own return delivery document D4. The many-to-many relations
between documents surface in the transactional data of Fig. 1: a sales document can be
related to multiple billing documents (S1 is related to B1 and B2) and a billing document
can be related to multiple sales documents (B2 is related to S1 and S2). This behavior
already contains an unusual flow: twice, the delivery document was created before the billing
document (main flow), but once the order was reversed (B2 before D3).
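The unusual flow described above can be checked directly against the creation dates of Fig. 1. The following is a minimal sketch, not the authors' implementation: it copies the dates and the "Reference BD" column by hand, and the names `delivery_created`, `billing_created`, `delivery_to_billing`, and `classify_flows` are assumptions made for illustration. D4 is left out because it references no billing document.

```python
from datetime import date

# Creation dates copied from Fig. 1 (written day-month-year in the figure).
delivery_created = {
    "D1": date(2020, 5, 18),
    "D2": date(2020, 5, 22),
    "D3": date(2020, 5, 25),
}
billing_created = {
    "B1": date(2020, 5, 20),
    "B2": date(2020, 5, 24),
}
# "Reference BD" column of the delivery table: delivery -> billing document.
delivery_to_billing = {"D1": "B1", "D2": "B2", "D3": "B2"}


def classify_flows():
    """Split delivery/billing pairs into main flow and reversed flow."""
    main_flow, reversed_flow = [], []
    for dd, bd in delivery_to_billing.items():
        if delivery_created[dd] < billing_created[bd]:
            main_flow.append((dd, bd))      # delivery first, then invoice
        else:
            reversed_flow.append((dd, bd))  # invoice first, then delivery
    return main_flow, reversed_flow


main_flow, reversed_flow = classify_flows()
print(main_flow)      # [('D1', 'B1'), ('D2', 'B2')]
print(reversed_flow)  # [('D3', 'B2')]
```

The comparison is made per related pair of documents, not per sales order, which is why the single reversed pair stands out.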
When applying classical process mining techniques, one first has to extract an event log
based on a single case identifier to which all event data can be related. Choosing SD id
in Fig. 1 leads to the two sequences of events shown in Fig. 2(a). Process discovery on
this log yields the model of Fig. 2(b), which is wrong: two invoices are created before
their deliveries instead of one, and three invoices are created instead of two (known as
divergence and convergence, respectively) [Pi11].
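The effect of flattening on SD id can be reproduced on the data of Fig. 1. The sketch below is an illustration under assumptions: the helper names (`case_documents`, `flatten_on_sd_id`) are hypothetical, the return order S3 is folded into the case of S1 as in Fig. 2(a), and a plain directly-follows count stands in for a process discovery algorithm.

```python
from collections import Counter

# Document id -> (document type, creation date as ISO string), from Fig. 1.
created = {
    "S1": ("Sales Order", "2020-05-16"), "S2": ("Sales Order", "2020-05-17"),
    "S3": ("Return Order", "2020-06-10"),
    "D1": ("Delivery", "2020-05-18"), "D2": ("Delivery", "2020-05-22"),
    "D3": ("Delivery", "2020-05-25"), "D4": ("Return Delivery", "2020-06-12"),
    "B1": ("Invoice", "2020-05-20"), "B2": ("Invoice", "2020-05-24"),
}
sd_reference = {"S1": None, "S2": None, "S3": "S1"}   # SD "Reference id"
dd_rows = [("D1", "S1", "B1"), ("D2", "S1", "B2"),     # (DD id, SD ref, BD ref)
           ("D3", "S2", "B2"), ("D4", "S3", None)]


def case_documents(root):
    """Collect every document reachable from one sales order."""
    orders = [root] + [sd for sd, ref in sd_reference.items() if ref == root]
    docs = list(orders)
    for dd, sd, bd in dd_rows:
        if sd in orders:
            docs.append(dd)
            if bd is not None and bd not in docs:
                docs.append(bd)
    return docs


def flatten_on_sd_id():
    """Build one trace of 'Created' events per top-level sales order."""
    return {
        sd: sorted(case_documents(sd), key=lambda d: created[d][1])
        for sd, ref in sd_reference.items() if ref is None
    }


log = flatten_on_sd_id()
invoice_events = sum(created[d][0] == "Invoice" for t in log.values() for d in t)
invoice_documents = sum(kind == "Invoice" for kind, _ in created.values())
follows = Counter((created[a][0], created[b][0])
                  for t in log.values() for a, b in zip(t, t[1:]))

print(len(log["S1"]), len(log["S2"]))      # 7 3
print(invoice_events, invoice_documents)   # 3 2
print(follows[("Invoice", "Delivery")])    # 2
```

Invoice B2 lands in both traces, so the flattened log holds three invoice events for two invoices, and the directly-follows count reports two invoice-before-delivery steps although only B2 before D3 concerns related documents.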



2     Approach: Discovering Artifact-Centric Models

We propose to approach the problem under the “conceptual lens” of artifact-centric
models [CH09]. An artifact is a data object over an information model; each artifact instance
exposes services that allow changing its informational contents; a life-cycle model
governs when which service of the artifact can be invoked; the invocation of a service in one
artifact may trigger the invocation of another service in another artifact. Information
models of different artifacts can be in one-to-many and many-to-many relations, which allows
describing behavior over complex data in terms of multiple objects interacting via service
invocations. Under this lens, each document of an ERP system can be seen as an
artifact; a transaction on a document is a service call on the artifact; behavioral dependencies
between transactions of documents can be seen as life-cycle behavior and dependencies
of service calls. Describing the transactional data of Fig. 1 with artifact-centric concepts
yields the model of Fig. 2(c); it visualizes the order in which objects are created and also
highlights the unusual flow of invoice B2 being created before delivery D3.
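The artifact concepts in this paragraph can be made concrete with a small data structure. This is a sketch under assumptions, not a model taken from the study: the class `Artifact`, the service names (`create`, `ship`, `pick`), and both life-cycles are hypothetical and only illustrate how a life-cycle restricts service invocations and how one invocation can trigger services in other artifacts.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """One artifact instance; lifecycle maps state -> {service: next state}."""
    kind: str
    ident: str
    lifecycle: dict
    state: str = "initial"
    history: list = field(default_factory=list)
    # service -> list of (other artifact, service to invoke on it)
    triggers: dict = field(default_factory=dict)

    def invoke(self, service):
        """Invoke a service if the life-cycle allows it in the current state."""
        allowed = self.lifecycle.get(self.state, {})
        if service not in allowed:
            raise ValueError(
                f"{service!r} not allowed for {self.ident} in state {self.state!r}")
        self.state = allowed[service]
        self.history.append(service)
        # An invocation in this artifact may trigger services in other artifacts.
        for other, other_service in self.triggers.get(service, []):
            other.invoke(other_service)


# Hypothetical life-cycles, chosen only for illustration.
ORDER_LIFECYCLE = {"initial": {"create": "open"}, "open": {"ship": "shipping"}}
DELIVERY_LIFECYCLE = {"initial": {"create": "created"}, "created": {"pick": "picked"}}

order = Artifact("Sales Order", "S1", ORDER_LIFECYCLE)
d1 = Artifact("Delivery", "D1", DELIVERY_LIFECYCLE)
d2 = Artifact("Delivery", "D2", DELIVERY_LIFECYCLE)
# One-to-many relation: shipping one order creates two deliveries.
order.triggers["ship"] = [(d1, "create"), (d2, "create")]

order.invoke("create")
order.invoke("ship")
print(order.history, d1.state, d2.state)
```

Calling `d1.invoke("create")` a second time raises `ValueError`, because the hypothetical delivery life-cycle allows creation only from the initial state.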
The problem of discovering an artifact-centric process model from relational ERP data
decomposes into two sub-problems. (1) Given a relational data source, identify a set of artifacts,
extract for each artifact an event log, and discover a model of its life-cycle. (2) Given a set
of artifacts and their data source, identify interactions between the artifacts, between their
instances, between their event types, and between their events. Figure 3 shows an overview of
our approach. (1.1) We use the data schema of the data source to discover artifact types which
detail all timestamped columns related to a particular business object. (1.2) For each artifact
we then extract a classical event log [Aa11] in which each case describes all events related to
one instance of the artifact. (1.3) Existing process discovery algorithms allow discovering a
life-cycle model of the artifact. In parallel, (2.1) we discover interactions between artifacts
from foreign key relations in the data source; (2.2) during log extraction, each case of an
artifact is annotated with references to cases of other artifacts this case interacts with.
(2.3) The case references are refined into interactions between activities of different artifact
life-cycles.

[Figure 3 graphic. Labels: Data Source; Database Schema; 1.1 Discover; Artifact Type; 2.1 Discover;
Type-Level Interaction; 1.2 Extract; 2.2 Add case references; Event Log with cases A1..An, B1..Bm,
C1..Ck; 1.3 Discover; 2.3 Discover; Life-Cycle Models + Activity-Level Interactions.]

                         Fig. 3: An overview on our approach.
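Steps 1.1, 1.2, 2.1, and 2.2 can be sketched on the tables of Fig. 1 using only the Python standard library. The sketch rests on loud assumptions: it uses an in-memory SQLite database, not an ERP system, the snake_case column names and all function names are invented here, and it takes one artifact type per table, which is coarser than the five case identifiers of Fig. 2(c). Steps 1.3 and 2.3 are omitted.

```python
import sqlite3

# Simplified tables of Fig. 1; snake_case column names are assumptions of this sketch.
SCHEMA = """
CREATE TABLE BD (bd_id TEXT PRIMARY KEY, date_created DATE, clearing_date DATE);
CREATE TABLE SD (sd_id TEXT PRIMARY KEY, date_created DATE, last_change DATE,
                 reference_id TEXT REFERENCES SD(sd_id));
CREATE TABLE DD (dd_id TEXT PRIMARY KEY, date_created DATE, picking_date DATE,
                 reference_sd_id TEXT REFERENCES SD(sd_id),
                 reference_bd_id TEXT REFERENCES BD(bd_id));
"""
ROWS = {
    "BD": [("B1", "2020-05-20", "2020-05-31"), ("B2", "2020-05-24", "2020-06-05")],
    "SD": [("S1", "2020-05-16", "2020-06-10", None),
           ("S2", "2020-05-17", "2020-05-31", None),
           ("S3", "2020-06-10", None, "S1")],
    "DD": [("D1", "2020-05-18", "2020-05-31", "S1", "B1"),
           ("D2", "2020-05-22", "2020-06-05", "S1", "B2"),
           ("D3", "2020-05-25", "2020-06-05", "S2", "B2"),
           ("D4", "2020-06-12", None, "S3", None)],
}


def discover_artifact_types(conn):
    """Step 1.1 (simplified): one type per table, with its key and DATE columns."""
    names = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    types = {}
    for table in names:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        key = next(c[1] for c in cols if c[5])
        stamps = [c[1] for c in cols if c[2].upper() == "DATE"]
        types[table] = {"key": key, "timestamps": stamps}
    return types


def extract_event_log(conn, table, artifact):
    """Step 1.2: one case per row, one event per non-null timestamp column."""
    cols = ", ".join([artifact["key"]] + artifact["timestamps"])
    log = {}
    for case_id, *stamps in conn.execute(f"SELECT {cols} FROM {table}"):
        events = [(ts, col) for col, ts in zip(artifact["timestamps"], stamps)
                  if ts is not None]
        log[case_id] = sorted(events)
    return log


def discover_type_interactions(conn, types):
    """Step 2.1: read type-level interactions off the foreign keys."""
    links = []
    for table in types:
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            links.append((table, fk[3], fk[2], fk[4]))
    return sorted(links)  # (child table, child column, parent table, parent column)


def add_case_references(conn, types, links):
    """Step 2.2: annotate each case with the cases of other artifacts it references."""
    refs = {}
    for child, column, parent, _ in links:
        key = types[child]["key"]
        query = f"SELECT {key}, {column} FROM {child} WHERE {column} IS NOT NULL"
        for case_id, parent_id in conn.execute(query):
            refs.setdefault(case_id, []).append((parent, parent_id))
    return refs


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for table, rows in ROWS.items():
    marks = ", ".join("?" * len(rows[0]))
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)

types = discover_artifact_types(conn)
logs = {t: extract_event_log(conn, t, a) for t, a in types.items()}
links = discover_type_interactions(conn, types)
refs = add_case_references(conn, types, links)
print(logs["DD"]["D3"])
print(refs["D3"])
```

Each delivery case ends up with its own short event log plus references to the sales order and billing document it interacts with, so no case has to absorb events of another object.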


3   Results
We implemented our approach based on [NvDF12] and conducted two case studies. By
separating data into artifacts along one-to-many relations, we eliminated divergence and
convergence; the interaction flows discovered from one-to-many relations were meaningful
to business users, and unusual flows were detected.

Fig. 4 shows models obtained from 2 months of data of an SAP Order-to-Cash process
(11 document header tables, 134,826 records of 5-49 attributes); the model at the
top was obtained with a classical approach (only 29 of 77 edges are correct); the
model at the bottom was obtained using our approach. In both case studies the
discovered process models were assessed as accurate graphical representations of the
source data by domain experts; all edges including outlier edges were assessed as
correct and traced back to the source data together with domain experts. These
insights could be obtained exploratively and much faster than with existing best practices.

[Figure 4 graphic. Node labels recoverable from the figure: Created, Delivery H_Created,
Payment05or15_Payment Received, Invoice H_Created, DebitMemoRequest H_Created, DebitMemo H_Created,
PostInAR_PostedInAR, Contract H_Created, ReturnDelivery H_Created, ProFormaInvoice H_Created,
CreditMemoRequest H_Created, InvoiceCancellation H_Created, CreditMemo H_Created,
ReturnOrder H_Created.]

Fig. 4: SAP OTC process, classical process model obtained from single event log (top) and
artifact-centric model highlighting outliers (bottom)


References
[Aa11]     Aalst, W.M.P. van der: Process Mining: Discovery, Conformance and Enhancement of
           Business Processes. Springer, 2011.
[AMZ00] Al-Mashari, Majed; Zairi, Mohamed: Supply-chain re-engineering using enterprise re-
        source planning (ERP) systems: an analysis of a SAP R/3 implementation case. IJPDLM,
        30(3/4):296–313, 2000.
[CH09]     Cohn, D.; Hull, R.: Business artifacts: A data-centric approach to modeling business
           operations and processes. Bulletin of the IEEE Computer Society TCDE, 32(3):3–9,
           2009.
[Ec15]     van Eck, Maikel L.; Lu, Xixi; Leemans, Sander J. J.; van der Aalst, Wil M. P.: PM²:
           A Process Mining Project Methodology. In: CAiSE 2015. volume 9097 of LNCS.
           Springer, pp. 297–313, 2015.
[Lu15]     Lu, Xixi; Nagelkerke, Marijn; van de Wiel, Dennis; Fahland, Dirk: Discovering Inter-
           acting Artifacts from ERP Systems. IEEE Trans. Services Computing, 8(6):861–873,
           2015.
[NvDF12] Nooijen, Erik H. J.; van Dongen, Boudewijn F.; Fahland, Dirk: Automatic Discovery
         of Data-Centric and Artifact-Centric Processes. In: DAB’12. volume 132 of LNBIP.
         Springer, pp. 316–327, 2012.
[Pi11]     Piessens, D.A.M.: Event Log Extraction from SAP ECC 6.0. Master’s thesis, Eindhoven
           University of Technology, 2011.


