=Paper= {{Paper |id=Vol-3263/abstract-15 |storemode=property |title=Extraction of Object-Centric Event Logs through Virtual Knowledge Graphs (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-3263/abstract-15.pdf |volume=Vol-3263 |authors=Jing Xiong,Guohui Xiao,Tahir Emre Kalayci,Marco Montali,Zhenzhen Gu,Diego Calvanese |dblpUrl=https://dblp.org/rec/conf/dlog/Xiong0KMGC22 }} ==Extraction of Object-Centric Event Logs through Virtual Knowledge Graphs (Extended Abstract)== https://ceur-ws.org/Vol-3263/abstract-15.pdf
Extraction of Object-Centric Event Logs through
Virtual Knowledge Graphs
Extended Abstract

Jing Xiong1, Guohui Xiao2,3, Tahir Emre Kalayci4, Marco Montali1,3, Zhenzhen Gu1
and Diego Calvanese1,3,5
1 KRDB Research Centre for Knowledge and Data, Free University of Bozen-Bolzano, 39100 Bolzano, Italy
2 Department of Information Science and Media Studies, University of Bergen, 5007 Bergen, Norway
3 Ontopic S.R.L., 39100 Bolzano, Italy
4 Virtual Vehicle Research GmbH, 8010 Graz, Austria
5 Department of Computing Science, Umeå University, 901 87 Umeå, Sweden


Abstract
Process mining is a family of techniques that supports the analysis of operational processes based on
event logs. Among the existing event log formats, the IEEE standard eXtensible Event Stream (XES)
is the most widely adopted. In XES, each event must be related to a single case object, which may
lead to convergence and divergence problems. Object-centric approaches, in which objects are the
central notion and one event may refer to multiple objects, have emerged as a promising way to
address these issues. In particular, the Object-Centric Event Logs (OCEL) standard has been proposed
recently. However, the crucial problem of extracting OCEL logs from external sources is still largely
unexplored. In this paper, we try to fill this gap by leveraging the Virtual Knowledge Graph (VKG)
approach to access data in relational databases. We have implemented this approach in the OnProm
system, extending it from XES to OCEL support.
The full version of this article has been submitted to an international conference.

Keywords
Process mining, object-centric event logs, virtual knowledge graph, ontology-based data access




1. Introduction
Process mining [1, 2] is a family of techniques relating the fields of data science and process
management to support the analysis of operational processes based on event logs. Process mining
algorithms and tools normally expect event logs to follow certain standards. In reality, however,
most IT systems in companies and organizations do not directly produce such logs, and the relevant
information is spread across legacy systems, in particular relational databases. Hence, event log
extraction from legacy systems is a key enabler for process mining [3, 4, 5, 6].

   DL 2022: 35th International Workshop on Description Logics, August 7–10, 2022, Haifa, Israel
   Email: jing.xiong@unibz.it (J. Xiong); guohui.xiao@uib.no (G. Xiao); emre.kalayci@v2c2.at (T. E. Kalayci);
montali@inf.unibz.it (M. Montali); zhenzhen.gu@unibz.it (Z. Gu); calvanese@inf.unibz.it (D. Calvanese)
   ORCID: 0000-0002-3604-9645 (J. Xiong); 0000-0002-5115-4769 (G. Xiao); 0000-0001-6228-1221 (T. E. Kalayci);
0000-0002-8021-3430 (M. Montali); 0000-0002-7346-6093 (Z. Gu); 0000-0001-5174-9693 (D. Calvanese)
   © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
   CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org
   There have been several proposals for the representation of event logs, e.g., eXtensible Event
Stream (XES) [7], JSON Support for XES (JXES) [8], Open SQL Log Exchange (OpenSLEX) [9],
and eXtensible Object-Centric (XOC) [10]. Among these, XES is the most widely adopted, being the
IEEE standard for interoperability in event logs [11]. In XES (and other similar proposals), each event
is related to a single case object, which leads to problems with convergence (when an event is
related to multiple cases and occurs repetitively) and divergence (when multiple events are in
a single case and are hard to separate) [12]. Object-centric approaches, in which objects are the
central notion and one event may refer to multiple objects, have emerged as a promising way to
address these issues. In particular, along this direction, the Object-Centric Event Logs (OCEL)
standard [13] has been proposed recently.
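
The convergence and divergence problems can be made concrete with a small sketch. The events, activity names, and object identifiers below are hypothetical and are not taken from any real log; the `flatten` helper is only an illustration of what forcing a single case notion does to events that refer to several objects.

```python
from collections import defaultdict

# Hypothetical object-centric events: each event may refer to several objects.
events = [
    {"id": "e1", "activity": "place order", "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"id": "e2", "activity": "pick item", "objects": {"order": ["o1"], "item": ["i1"]}},
    {"id": "e3", "activity": "pick item", "objects": {"order": ["o1"], "item": ["i2"]}},
    {"id": "e4", "activity": "ship order", "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
]

def flatten(events, case_type):
    """Project the events onto one object type, as a single-case log requires."""
    cases = defaultdict(list)
    for event in events:
        for obj in event["objects"].get(case_type, []):
            cases[obj].append(event["activity"])
    return dict(cases)

# Case notion "item": the single "place order" event is copied into both
# item cases (convergence).
by_item = flatten(events, "item")

# Case notion "order": the two "pick item" events end up in the same case and
# can no longer be told apart by item (divergence).
by_order = flatten(events, "order")
```

With the item as case notion, "place order" appears twice although it happened once; with the order as case notion, the two "pick item" events are indistinguishable. An object-centric log avoids both effects by keeping the event-to-object references as they are.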
   To the best of our knowledge, the crucial problem of extracting OCEL logs from external
sources is still largely unexplored. The only exception is [14], where OCEL logs are extracted by
identifying the so-called master and relevant tables in the underlying database and building a
Graph of Relations (GoR). Though promising, this approach might be difficult to adopt when the
underlying tables are complex and the GoR is hard to model, because it does not separate the
storage level (i.e., the database) from the concept level (i.e., domain knowledge about events).
   In this work, we try to fill this gap by leveraging the OnProm (http://onprom.inf.unibz.it/)
framework [4, 5] for extracting event logs from legacy information systems. OnProm v1 already
relied on the technology of Virtual Knowledge Graphs (VKG) [15] to expose databases as
Knowledge Graphs that conform to a conceptual model, and to query this conceptual model
and eventually generate logs by using ontology and mapping-based query processing. It came
with a toolchain to process the conceptual model and to automatically extract XES event logs
by relying on the VKG system Ontop [16]. We present here OnProm v2, which we have
modularized so that it becomes easier to extend, and in which we have implemented OCEL-specific
features to extract OCEL logs.


2. The OnProm Framework for Event Log Extraction
We now describe the OnProm approach for event log extraction, as shown in Figure 1. To
extract from a legacy information system ℐ = ⟨ℛ, 𝒟⟩, with relational schema ℛ and database 𝒟,
event logs that conform to an event log standard 𝑋, OnProm works as follows:
(1) A domain ontology is a high-level abstraction of the business logic of a domain of
    interest. The user designs a domain ontology 𝒯 in the standard ontology language
    OWL 2 QL with any ontology editing tool, e.g., the Ontology Editor of the OnProm tool chain.
    Then the user creates a VKG mapping ℳ (using, e.g., the Ontop plugin for Protégé [17]) to
    declare how the instances of classes and properties in 𝒯 are populated from ℐ. This step
    is only concerned with modeling the domain of interest and is agnostic to the event log
    standard.
(2) OnProm assumes that for the event log standard 𝑋, a specific (domain-independent) event
    ontology ℰ𝑋 is available. The Annotation Editor of OnProm imports ℰ𝑋 and allows the
    user to create annotations ℒ𝑋, which are based on the classes in ℰ𝑋, over the classes in 𝒯.
(3) OnProm assumes that for the standard 𝑋 also a set of SPARQL queries for extracting the
    log information is defined. By relying on a conceptual schema transformation approach [6]
    and query reformulation of Ontop, using ℒ𝑋, 𝒯, ℳ, and ℛ, these SPARQL queries are
    internally translated to SQL queries over ℐ. OnProm evaluates the generated SQL queries
    to construct corresponding Java objects and serialize them into log files compliant with 𝑋.

Figure 1: OnProm event log extraction framework
(Diagram labels: XES event ontology ℰXES, XES annotation ℒXES, OCEL annotation ℒOCEL, OCEL event ontology ℰOCEL, "refers to"; VKG with domain ontology 𝒯 and mapping specification ℳ; XES log mapping specification, OCEL log mapping specification; information system ℐ with DB schema ℛ and RDB 𝒟.)

As mentioned, OnProm v1 only supported the XES standard. In this work, we have first
modularized the system by separating the above steps into different software components, so
as to make it more extensible. We have then introduced OCEL-specific features in Steps (2)
and (3). Hence, OnProm v2 is now able to extract OCEL logs from relational databases.
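
The three steps can be pictured with a deliberately simplified sketch. It is not the OnProm or Ontop API: the table `orders`, the terms `:Order` and `:creationDate`, and the dictionaries `MAPPING` and `ANNOTATION` are all hypothetical stand-ins, and plain Python lookups replace OWL 2 QL reasoning and SPARQL-to-SQL reformulation. The only point it makes is that the log is produced by running SQL over the source database, driven by a mapping and an annotation, without copying the data into a separate graph store.

```python
import sqlite3

# Legacy database D (hypothetical schema, not the real Dolibarr one).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT, customer TEXT);
    INSERT INTO orders VALUES (1, '2022-05-01T10:00:00', 'ACME');
    INSERT INTO orders VALUES (2, '2022-05-02T09:30:00', 'Initech');
""")

# Step (1), simplified: a mapping from ontology terms to SQL source queries.
# Every query returns (subject identifier, value) pairs.
MAPPING = {
    ":Order": "SELECT 'order/' || id, NULL FROM orders ORDER BY id",
    ":creationDate": "SELECT 'order/' || id, created FROM orders ORDER BY id",
}

# Step (2), simplified: an annotation stating which domain class yields events
# and which property provides the timestamp.
ANNOTATION = {"class": ":Order", "activity": "create order", "timestamp": ":creationDate"}

def unfold(term):
    """Step (3), simplified: answer a query over one term by running its SQL."""
    return db.execute(MAPPING[term]).fetchall()

def extract_events(annotation):
    """Build event records from the annotated class and its timestamp property."""
    timestamps = dict(unfold(annotation["timestamp"]))
    return [
        {
            "activity": annotation["activity"],
            "timestamp": timestamps[subject],
            "objects": [subject],
        }
        for subject, _ in unfold(annotation["class"])
    ]

log = extract_events(ANNOTATION)
```

Because the mapping and the annotation are kept apart, supporting another log standard in this toy setting would only require a different annotation and a different serializer, which mirrors the separation between domain modeling and the event log standard described above.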
   Next we illustrate the functionality of OnProm for extracting OCEL logs through an example.
The OCEL event ontology ℰOCEL is a very simple ontology with only three classes: Object,
Event, and Attribute. We consider Dolibarr [18] v14, a popular open source Enterprise Resource
Planning (ERP) system. We have designed a Sale Orders domain ontology and the mapping in
the Ontop system (Figure 2). We have then used the Annotation Editor of OnProm to annotate
this ontology with ℰOCEL classes (Figure 3). Based on the provided information, OnProm is able
to extract OCEL logs automatically. Figure 4 shows a fragment of the extracted log in XML, and
its graphical visualization.
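
To give an idea of the shape of such a log, the following sketch assembles a tiny object-centric log and serializes it as JSON rather than XML. The `ocel:`-prefixed key names reflect one reading of the OCEL JSON serialization and should be checked against the specification [13]; the event, the objects, and the helper `missing_objects` are hypothetical and are not output of OnProm.

```python
import json

# Objects with a type and their attribute values (hypothetical data).
objects = {
    "order/1": {"ocel:type": "order", "ocel:ovmap": {"customer": "ACME"}},
    "item/7": {"ocel:type": "item", "ocel:ovmap": {}},
}

# One event that refers to two objects of different types.
events = {
    "e1": {
        "ocel:activity": "create order",
        "ocel:timestamp": "2022-05-01T10:00:00",
        "ocel:omap": ["order/1", "item/7"],
        "ocel:vmap": {},
    },
}

def missing_objects(events, objects):
    """Return object ids that events refer to but the object table lacks."""
    referenced = {obj for event in events.values() for obj in event["ocel:omap"]}
    return sorted(referenced - set(objects))

log = {"ocel:events": events, "ocel:objects": objects}
missing = missing_objects(events, objects)
serialized = json.dumps(log, indent=2)
```

The event and object tables correspond to the Event and Object classes of the event ontology, and the attribute maps to its Attribute class; the reference check is a simple sanity test one might run before handing a log to a process mining tool.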


Acknowledgments
This research has been supported by the Wallenberg AI, Autonomous Systems and Software
Program (WASP) funded by the Knut and Alice Wallenberg Foundation, by the Italian PRIN
project HOPE, and by the EU H2020 project INODE, grant n. 863410.

Figure 2: Ontology and mappings shown in Ontop

Figure 3: OnProm Annotation Editor showing the annotated ontology of the Dolibarr ERP system

Figure 4: A fragment of the extracted OCEL log from the Dolibarr ERP system: (a) OCEL XML serialization; (b) OCEL graph

References
 [1] W. M. P. van der Aalst, et al., Process Mining Manifesto, in: Proc. of the Business Process
     Management Workshops (BPM-WS), volume 99 of Lecture Notes in Business Information
     Processing, Springer, 2011, pp. 169–194. doi:10.1007/978-3-642-28108-2_19.
 [2] W. van der Aalst, Process Mining – Data Science in Action, 2nd ed., Springer, 2016.
 [3] D. Calvanese, M. Montali, A. Syamsiyah, W. M. P. van der Aalst, Ontology-driven extraction
     of event logs from relational databases, in: Proc. of the 11th Int. Workshop on Business
     Process Intelligence (BPI), volume 256 of Lecture Notes in Business Information Processing,
     Springer, 2016, pp. 140–153. doi:10.1007/978-3-319-42887-1_12.
 [4] D. Calvanese, T. E. Kalayci, M. Montali, A. Santoso, The onprom toolchain for extracting
     business process logs using ontology-based data access, in: Proc. of the BPM Demo Track
     and BPM Dissertation Award (BPM-D&DA), volume 1920 of CEUR Workshop Proceedings,
     CEUR-WS.org, 2017. URL: http://ceur-ws.org/Vol-1920/BPM_2017_paper_207.pdf.
 [5] D. Calvanese, T. E. Kalayci, M. Montali, S. Tinella, Ontology-based data access for
     extracting event logs from legacy data: The onprom tool and methodology, in: Proc. of the
     20th Int. Conf. on Business Information Systems (BIS), volume 288 of Lecture Notes in Business
     Information Processing, Springer, 2017, pp. 220–236. doi:10.1007/978-3-319-59336-4_16.
 [6] D. Calvanese, T. E. Kalayci, M. Montali, A. Santoso, W. van der Aalst, Conceptual schema
     transformation in ontology-based data access, in: Proc. of the 21st Int. Conf. on Knowledge
     Engineering and Knowledge Management (EKAW), volume 11313 of Lecture Notes in
     Computer Science, Springer, 2018, pp. 50–67. doi:10.1007/978-3-030-03667-6_4.
 [7] H. M. W. Verbeek, J. C. A. M. Buijs, B. F. van Dongen, W. M. P. van der Aalst, XES, XESame,
     and ProM 6, in: Information Systems Evolution: Selected Extended Papers of CAiSE
     Forum 2010, volume 72 of Lecture Notes in Business Information Processing, Springer, 2010,
     pp. 60–75. doi:10.1007/978-3-642-17722-4_5.
 [8] M. B. Shankara Narayana, H. Khalifa, W. M. P. van der Aalst, JXES: JSON Support for
     the XES Event Log Standard, CoRR Technical Report arXiv:2009.06363, arXiv.org e-Print
     archive, 2020. URL: https://arxiv.org/abs/2009.06363.
 [9] E. G. L. de Murillas, H. A. Reijers, W. M. P. van der Aalst, Connecting databases with
     process mining: A meta model and toolset, Software and System Modeling 18 (2019)
     1209–1247. doi:10.1007/s10270-018-0664-7.
[10] G. Li, E. G. López de Murillas, R. M. de Carvalho, W. M. P. van der Aalst, Extracting object-
     centric event logs to support process mining on databases, in: J. Mendling, H. Mouratidis
     (Eds.), Proc. of CAiSE Forum, volume 317 of Lecture Notes in Business Information Processing,
     Springer, 2018, pp. 182–199. doi:10.1007/978-3-319-92901-9_16.
[11] XES, 1849-2016 - IEEE Standard for eXtensible Event Stream (XES) for Achieving
     Interoperability in Event Logs and Event Streams, IEEE Computer Society, 2016.
     doi:10.1109/IEEESTD.2016.7740858.
[12] W. M. P. van der Aalst, Object-centric process mining: Dealing with divergence and
     convergence in event data, in: Proc. of the 17th Int. Conf. on Software Engineering and
     Formal Methods (SEFM), volume 11724 of Lecture Notes in Computer Science, Springer,
     2019, pp. 3–25. doi:10.1007/978-3-030-30446-1_1.
[13] A. F. Ghahfarokhi, G. Park, A. Berti, W. M. P. van der Aalst, OCEL: A standard for object-
     centric event logs, in: ADBIS 2021 Short Papers, Doctoral Consortium and Workshops:
     DOING, SIMPDA, MADEISD, MegaData, CAoNS, volume 1450 of Communications in
     Computer and Information Science, Springer, 2021, pp. 169–175.
[14] A. Berti, G. Park, M. Rafiei, W. van der Aalst, An Event Data Extraction Approach from
     SAP ERP for Process Mining, CoRR Technical Report arXiv:2110.03467, arXiv.org e-Print
     archive, 2021. URL: https://arxiv.org/abs/2110.03467.
[15] G. Xiao, L. Ding, B. Cogrel, D. Calvanese, Virtual Knowledge Graphs: An overview of
     systems and use cases, Data Intelligence 1 (2019) 201–223. doi:10.1162/dint_a_00011.
[16] G. Xiao, D. Lanti, R. Kontchakov, S. Komla-Ebri, E. Güzel-Kalayci, L. Ding, J. Corman,
     B. Cogrel, D. Calvanese, E. Botoeva, The virtual knowledge graph system Ontop, in: Proc.
     of the 19th Int. Semantic Web Conf. (ISWC), volume 12507 of Lecture Notes in Computer
     Science, Springer, 2020, pp. 259–277. doi:10.1007/978-3-030-62466-8_17.
[17] A. Poggi, M. Rodriguez-Muro, M. Ruzzi, Ontology-based database access with DIG-Mastro
     and the OBDA Plugin for Protégé, in: K. Clark, P. F. Patel-Schneider (Eds.), Proc. of the
     4th Int. Workshop on OWL: Experiences and Directions (OWLED), 2008.
[18] Dolibarr Open Source ERP CRM, Web suite for business, https://www.dolibarr.org/, 2021.
     (Last accessed on 5 May 2022).