=Paper=
{{Paper
|id=Vol-1701/paper2
|storemode=property
|title=Discovering Interacting Artifacts from ERP Systems (Extended Abstract)
|pdfUrl=https://ceur-ws.org/Vol-1701/paper2.pdf
|volume=Vol-1701
|authors=Dirk Fahland,Xixi Lu,Marijn Nagelkerke,Dennis van de Wiel
|dblpUrl=https://dblp.org/rec/conf/emisa/FahlandLNW16
}}
==Discovering Interacting Artifacts from ERP Systems (Extended Abstract)==
Jan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016, Gesellschaft für Informatik, Bonn 2016

D. Fahland (1), X. Lu (1), Marijn Nagelkerke (2), Dennis van de Wiel (2)

Abstract: Enterprise Resource Planning (ERP) systems are widely used to manage business documents along business processes and allow very detailed recording of event data of past process executions and involved documents. This recorded event data is the basis for auditing and detecting unusual flows. Process mining techniques can analyze event data of processes stored in linear event logs to discover a process model that reveals unusual executions. Existing techniques assume a linear event log that uses a single case identifier to which all behavior can be related. However, in ERP systems, processes such as Order to Cash operate on multiple interrelated business objects that each have their own case identifier and their own behavior and that interact with each other. Forcing these into a single case creates ambiguous dependencies caused by data convergence and divergence, which obscure unusual flows in the resulting process model. We present a new semi-automatic, end-to-end approach for analyzing event data in a plain database of an ERP system for unusual executions. We identify an artifact-centric process model describing the business objects, their life-cycles, and how the various objects interact along their life-cycles. The technique was validated in two case studies and reliably revealed unusual flows later confirmed by domain experts. The work summarized in this extended abstract has been published in [Lu15].

Keywords: Process Mining, ERP-System, Artifact-Centric Model, Object Life-Cycle, Interaction Discovery

===1 Introduction and Problem Description===
Information systems (IS) not only store and process data in an organization but also record event data about how and when information changed.
This “historical event data” can be used to analyze, for instance, whether information processing in the past conformed to the prescribed processes or to compliance requirements. Process mining [Aa11] offers automated techniques for this task. In particular, exploring visual models discovered from event data allows analysts to identify unusual flows and their circumstances, based on which concrete measures for process improvement can be devised [Ec15].

(1) d.fahland@tue.nl, Eindhoven University of Technology, The Netherlands

(2) KPMG IT Advisory N.V., Eindhoven, The Netherlands

(3) This article summarizes problem, approach, and selected findings of a study published as Xixi Lu, Marijn Nagelkerke, Dennis van de Wiel, and Dirk Fahland. Discovering Interacting Artifacts from ERP Systems. Services Computing, IEEE Transactions on, 8(6), 2015 doi:10.1109/TSC.2015.2474358 [Lu15].

{| class="wikitable"
|+ Sales documents (SD)
! SD id !! Date created !! Reference id !! Document type !! Value !! Last change
|-
| S1 || 16-5-2020 || null || Sales Order || 100 || 10-6-2020
|-
| S2 || 17-5-2020 || null || Sales Order || 200 || 31-5-2020
|-
| S3 || 10-6-2020 || S1 || Return Order || 10 || NULL
|}

{| class="wikitable"
|+ Delivery documents (DD)
! DD id !! Date created !! Reference SD id !! Reference BD !! Document type !! Picking date
|-
| D1 || 18-5-2020 || S1 || B1 || Delivery || 31-5-2020
|-
| D2 || 22-5-2020 || S1 || B2 || Delivery || 5-6-2020
|-
| D3 || 25-5-2020 || S2 || B2 || Delivery || 5-6-2020
|-
| D4 || 12-6-2020 || S3 || null || Return Delivery || NULL
|}

{| class="wikitable"
|+ Billing documents (BD)
! BD id !! Date created !! Document type !! Clearing date
|-
| B1 || 20-5-2020 || Invoice || 31-5-2020
|-
| B2 || 24-5-2020 || Invoice || 5-6-2020
|}

{| class="wikitable"
|+ Documents Changes
! Change id !! Date changed !! Reference id !! Table name !! Change type !! Old Value !! New Value
|-
| 1 || 17-5-2020 || S1 || SD || Price updated || 100 || 80
|-
| 2 || 19-5-2020 || S1 || SD || Delivery block released || X || -
|-
| 3 || 19-5-2020 || S1 || SD || Billing block released || X || -
|-
| 4 || 10-6-2020 || B1 || BD || Invoice date updated || 20-6-2020 || 21-6-2020
|}

Fig. 1: The tables of the simplified OTC example. The figure also marks four relations F1 to F4 between the tables, with a legend distinguishing parent and child tables.

Fig. 2: Creation of documents of Fig. 1 along time (a), a classic process model based on a single case identifier (b), and an artifact-centric model based on 5 case identifiers (c).

Prerequisite to this analysis is a process event log that holds events about information changes, with the assumption that each event belongs to one specific execution of a specific process. In general, information access is not tied to a particular process execution; rather, the same information can be accessed and changed from various processes and applications. For instance, in Enterprise Resource Planning (ERP) systems (such as SAP and Oracle Enterprise), information is stored in business objects (or documents) which are linked via one-to-many and many-to-many relations, typically in the relational database.
The objects themselves are encapsulated in services [AMZ00] which are invoked by high-level end-to-end business processes; each invocation is called a transaction and is logged in the data object itself. Fig. 1 shows a simplified example of the transactional data of an Order to Cash (OTC) process supported by SAP systems; Fig. 2(a) visualizes the events of Fig. 1 that are related to document creation. There are two sales orders S1 and S2; creation of S1 is followed by creation of a delivery document D1, an invoice B1, another delivery document D2, and another invoice B2 which also contains billing information about S2. Creation of S2 is also followed by creation of another delivery document D3. Further, there is a return order S3 related to S1 with its own return delivery document D4. The many-to-many relations between documents surface in the transactional data of Fig. 1: a sales document can be related to multiple billing documents (S1 is related to B1 and B2) and a billing document can be related to multiple sales documents (B2 is related to S1 and S2). This behavior already contains an unusual flow: delivery documents were created before the billing document twice (the main flow), but once the order was reversed (B2 before D3).

When applying classical process mining techniques, one first has to extract an event log based on a single case identifier to which all event data can be related. Choosing SD id in Fig. 1 leads to the two sequences of events shown in Fig. 2(a). Process discovery on this log yields the model of Fig. 2(b), which is wrong: two invoices are created before their deliveries instead of one, and three invoices are created instead of two (known as divergence and convergence, respectively) [Pi11].

===2 Approach: Discovering Artifact-Centric Models===
We propose to approach the problem under the “conceptual lens” of artifact-centric models [CH09].
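
Before turning to artifacts, the convergence and divergence effect described in Section 1 can be reproduced on the creation events of Fig. 1. The following Python sketch is an illustration only, not the extraction procedure of [Pi11] or [Lu15]. The names `root_order` and `flat_log` are hypothetical. The rule that a return order belongs to the case of the order it references, and that an invoice belongs to every case whose delivery references it, is an assumption chosen so that the event counts match the annotations of Fig. 2.

```python
from collections import Counter, defaultdict
from datetime import date

# Creation data transcribed from Fig. 1 (day-month-year in the figure).
SALES = {  # SD id -> (date created, reference id, document type)
    "S1": (date(2020, 5, 16), None, "Sales Order"),
    "S2": (date(2020, 5, 17), None, "Sales Order"),
    "S3": (date(2020, 6, 10), "S1", "Return Order"),
}
DELIVERIES = {  # DD id -> (date created, reference SD id, reference BD, type)
    "D1": (date(2020, 5, 18), "S1", "B1", "Delivery"),
    "D2": (date(2020, 5, 22), "S1", "B2", "Delivery"),
    "D3": (date(2020, 5, 25), "S2", "B2", "Delivery"),
    "D4": (date(2020, 6, 12), "S3", None, "Return Delivery"),
}
BILLING = {  # BD id -> (date created, document type)
    "B1": (date(2020, 5, 20), "Invoice"),
    "B2": (date(2020, 5, 24), "Invoice"),
}


def root_order(sd_id):
    """Assumed rule: follow 'Reference id' up to an order without a parent."""
    while SALES[sd_id][1] is not None:
        sd_id = SALES[sd_id][1]
    return sd_id


def flat_log():
    """Relate every 'created' event to one root sales order (single case id)."""
    cases = defaultdict(set)  # a set, so a document is logged once per case
    for sd_id, (created, _, doc_type) in SALES.items():
        cases[root_order(sd_id)].add((created, f"{doc_type} created", sd_id))
    for dd_id, (created, sd_id, bd_id, doc_type) in DELIVERIES.items():
        case = root_order(sd_id)
        cases[case].add((created, f"{doc_type} created", dd_id))
        if bd_id is not None:  # the invoice is related to this case as well
            bd_created, bd_type = BILLING[bd_id]
            cases[case].add((bd_created, f"{bd_type} created", bd_id))
    return {case: sorted(events) for case, events in cases.items()}


log = flat_log()
# Convergence: count how often an invoice creation appears in the flat log.
invoice_events = sum(
    1
    for trace in log.values()
    for _, activity, _ in trace
    if activity == "Invoice created"
)
# Divergence: count directly-follows pairs inside each trace.
directly_follows = Counter(
    (a[1], b[1]) for trace in log.values() for a, b in zip(trace, trace[1:])
)

for case, trace in log.items():
    print(case, [f"{activity} ({doc})" for _, activity, doc in trace])
print("invoice documents:", len(BILLING), "invoice events:", invoice_events)
print(
    "Invoice created -> Delivery created:",
    directly_follows[("Invoice created", "Delivery created")],
)
```

On this data the S1 case contains seven events and the S2 case three. Invoice B2 is logged in both cases, so the flat log holds three invoice creations for two invoices, and B1 directly precedes the unrelated delivery D2 in the S1 case, which adds a second invoice-before-delivery pair next to the real one (B2 before D3).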
An artifact is a data object over an information model; each artifact instance exposes services that allow changing its informational contents; a life-cycle model governs when which service of the artifact can be invoked; the invocation of a service in one artifact may trigger the invocation of another service in another artifact. Information models of different artifacts can be in one-to-many and many-to-many relations, which allows describing behavior over complex data in terms of multiple objects interacting via service invocations. Under this lens, each document of an ERP system can be seen as an artifact; a transaction on a document is a service call on the artifact; behavioral dependencies between transactions of documents can be seen as life-cycle behavior and dependencies of service calls. Describing the transactional data of Fig. 1 with artifact-centric concepts yields the model of Fig. 2(c); it visualizes the order in which objects are created and also highlights the unusual flow of invoice B2 being created before delivery D3.

The problem of discovering an artifact-centric process model from relational ERP data decomposes into two sub-problems. (1) Given a relational data source, identify a set of artifacts, extract for each artifact an event log, and discover a model of its life-cycle. (2) Given a set of artifacts and their data source, identify interactions between the artifacts, between their instances, between their event types and between their events. Fig. 3 shows an overview of our approach. (1.1) We use the data schema of the data source to discover artifact types which detail all timestamped columns related to a particular business object.
(1.2) For each artifact we then extract a classical event log [Aa11], in which each case describes all events related to one instance of the artifact. (1.3) Existing process discovery algorithms allow discovering a life-cycle model of the artifact. In parallel, (2.1) we discover interactions between artifacts from foreign key relations in the data source; (2.2) during log extraction, each case of an artifact is annotated with references to cases of other artifacts this case interacts with. (2.3) The case references are refined into interactions between activities of different artifact life-cycles.

Fig. 3: An overview on our approach.

===3 Results===
We implemented our approach based on [NvDF12] and conducted two case studies. By separating data into artifacts along one-to-many relations, we eliminated divergence and convergence; the interaction flows discovered from one-to-many relations were meaningful to business users; and unusual flows were detected. Fig. 4 shows models obtained from two months of data of an SAP Order-to-Cash process (11 document header tables, 134,826 records of 5-49 attributes); the model at the top was obtained with a classical approach (only 29 of 77 edges are correct); the model at the bottom was obtained using our approach.
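
As a rough illustration of steps (1.2), (2.1) and (2.2) of Section 2, the following Python sketch extracts one event log per table from the delivery and billing documents of Fig. 1 and annotates each case with the cases it references. It is not the implementation of [Lu15] or [NvDF12]. The `SCHEMA` and `ROWS` encodings, all column names, and the functions `extract_logs`, `created_at` and `creation_interactions` are hypothetical. Comparing creation times across references is only a simple stand-in for the activity-level refinement of step (2.3), and treating "delivery before invoice" as the main flow is an assumption taken from the example in Section 1.

```python
from datetime import date

# Hypothetical encoding of two tables of Fig. 1: id column, timestamped
# columns (one event type each), and foreign keys (column -> target table).
SCHEMA = {
    "DD": {"id": "dd_id", "timestamps": ["created", "picked"], "fks": {"bd_ref": "BD"}},
    "BD": {"id": "bd_id", "timestamps": ["created", "cleared"], "fks": {}},
}
ROWS = {
    "DD": [
        {"dd_id": "D1", "created": date(2020, 5, 18), "picked": date(2020, 5, 31), "bd_ref": "B1"},
        {"dd_id": "D2", "created": date(2020, 5, 22), "picked": date(2020, 6, 5), "bd_ref": "B2"},
        {"dd_id": "D3", "created": date(2020, 5, 25), "picked": date(2020, 6, 5), "bd_ref": "B2"},
        {"dd_id": "D4", "created": date(2020, 6, 12), "picked": None, "bd_ref": None},
    ],
    "BD": [
        {"bd_id": "B1", "created": date(2020, 5, 20), "cleared": date(2020, 5, 31)},
        {"bd_id": "B2", "created": date(2020, 5, 24), "cleared": date(2020, 6, 5)},
    ],
}


def extract_logs(schema, rows):
    """One event log per table; one case per row, annotated with references."""
    logs = {}
    for table, spec in schema.items():
        logs[table] = {}
        for row in rows[table]:
            events = sorted(
                (row[col], f"{table} {col}")
                for col in spec["timestamps"]
                if row[col] is not None  # NULL timestamps yield no event
            )
            refs = [
                (target, row[col])
                for col, target in spec["fks"].items()
                if row[col] is not None
            ]
            logs[table][row[spec["id"]]] = {"events": events, "refs": refs}
    return logs


def created_at(logs, table, case_id):
    """Timestamp of the 'created' event of one case."""
    events = logs[table][case_id]["events"]
    return next(ts for ts, activity in events if activity == f"{table} created")


def creation_interactions(logs):
    """For every case reference, order the two cases by creation time."""
    pairs = []
    for table, cases in logs.items():
        for case_id, case in cases.items():
            for target, ref_id in case["refs"]:
                own = created_at(logs, table, case_id)
                other = created_at(logs, target, ref_id)
                pairs.append((case_id, ref_id) if own <= other else (ref_id, case_id))
    return pairs


logs = extract_logs(SCHEMA, ROWS)
interactions = creation_interactions(logs)
# Assumed main flow: a delivery is created before the invoice it references.
deviating = [(first, second) for first, second in interactions if first in logs["BD"]]
print("interactions:", interactions)
print("deviating:", deviating)
```

Because every document is its own case, invoice B2 is recorded once and merely referenced by D2 and D3, so no event is duplicated. The only pair flagged as deviating on this data is B2 before D3, the unusual flow pointed out in Section 1.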
In both case studies, the discovered process models were assessed by domain experts as accurate graphical representations of the source data; all edges, including outlier edges, were assessed as correct and traced back to the source data together with domain experts. These insights could be obtained exploratively and much faster than with existing best practices.

Fig. 4: SAP OTC process, classical process model obtained from single event log (top) and artifact-centric model highlighting outliers (bottom).

===References===
[Aa11] Aalst, W.M.P. van der: Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer, 2011.

[AMZ00] Al-Mashari, Majed; Zairi, Mohamed: Supply-chain re-engineering using enterprise resource planning (ERP) systems: an analysis of a SAP R/3 implementation case. IJPDLM, 30(3/4):296–313, 2000.

[CH09] Cohn, D.; Hull, R.: Business artifacts: A data-centric approach to modeling business operations and processes. Bulletin of the IEEE Computer Society TCDE, 32(3):3–9, 2009.

[Ec15] van Eck, Maikel L.; Lu, Xixi; Leemans, Sander J. J.; van der Aalst, Wil M. P.: PM²: A Process Mining Project Methodology. In: CAiSE 2015. volume 9097 of LNCS. Springer, pp. 297–313, 2015.

[Lu15] Lu, Xixi; Nagelkerke, Marijn; van de Wiel, Dennis; Fahland, Dirk: Discovering Interacting Artifacts from ERP Systems. IEEE Trans. Services Computing, 8(6):861–873, 2015.

[NvDF12] Nooijen, Erik H. J.; van Dongen, Boudewijn F.; Fahland, Dirk: Automatic Discovery of Data-Centric and Artifact-Centric Processes. In: DAB’12. volume 132 of LNBIP. Springer, pp. 316–327, 2012.

[Pi11] Piessens, D.A.M.: Event Log Extraction from SAP ECC 6.0. Master’s thesis, Eindhoven University of Technology, 2011.