<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Object-Centric Process Mining (and More) Using a Graph-Based Approach With PromG</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Ava</forename><surname>Swevels</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Eindhoven University of Technology</orgName>
								<address>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Eva</forename><forename type="middle">L</forename><surname>Klijn</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Eindhoven University of Technology</orgName>
								<address>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dirk</forename><surname>Fahland</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Eindhoven University of Technology</orgName>
								<address>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Object-Centric Process Mining (and More) Using a Graph-Based Approach With PromG</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">B8CE8C5A993D63197197614FAA53D5C4</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T20:03+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Object-Centric Process Mining</term>
					<term>Object-Centric Event Data</term>
					<term>Event Knowledge Graphs</term>
					<term>Neo4j</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>PromG is an extensible Python library for managing and enriching object-centric event data (OCED) and for developing object-centric process mining (OCPM) techniques. It does so by using Event Knowledge Graphs, which model process-related concepts as a property graph in a Neo4j database. The library automatically generates Cypher queries to transform, enhance, and manipulate object-centric event data, giving analysts a straightforward way to explore and analyze object-centric processes. To enable others to develop OCPM techniques, the library is available as a Python package on PyPI and has been tested with real-life examples.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Analysis of real-life processes with multiple interrelated objects has revealed the limitations of traditional case-centric process mining techniques <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3]</ref>. As a result, classical process mining techniques such as control-flow discovery and conformance checking must be adapted, and new techniques must be developed addressing the multi-object interactions of the process. These techniques are collectively referred to as object-centric process mining (OCPM) <ref type="bibr" target="#b3">[4]</ref>. Some techniques have already been proposed by academia <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref> and process mining vendors (notably MyInvenio/IBM and Celonis).</p><p>However, an open-source ecosystem that enables development and application of OCPM in the broader process mining community has yet to form. It should offer extensible, easy-to-use functionality for (1) managing object-centric event data (OCED), e.g., import, storage, preprocessing, export, (2) exploring OCED from various angles, (3) routine analysis of OCED, e.g., discovery, performance, and (4) one-off analysis specific to a particular use case.</p><p>Toward this goal, we developed the open-source Python library PromG which uses the Neo4j graph DB system to store data and analysis in a multi-layered knowledge graph. PromG implements a recent community proposal for standard OCED 1 and provides standard functionality for importing, managing, and analyzing OCED (by automatically generating queries against Neo4j). 
Additionally, it allows users to script custom OCPM analyses and implement newly</p><note place="foot">ICPM 2023 Doctoral Consortium and Tool Demonstration Track. The research underlying this paper was supported by AutoTwin EU GA n. 101092021. Contact: a.j.e.swevels@tue.nl (Ava Swevels); e.l.klijn@tue.nl (Eva L. Klijn); d.fahland@tue.nl (Dirk Fahland).</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Overview and Design</head><p>PromG is a Python library that realizes OCPM by using a Neo4j graph database as its data store. Its architecture is illustrated in Fig. <ref type="figure" target="#fig_0">1</ref>. The Neo4j database stores OCED and process mining analysis results in multiple layers of an Event Knowledge Graph <ref type="bibr" target="#b2">[3]</ref>, a specific labeled property graph that describes (qualified) relations between events, objects, relations, and their attributes (over time).</p><p>PromG translates process mining tasks into Cypher queries that are run against the Neo4j instance. It consists of modules that capture the logic to store and analyze the data, and core functionalities that provide a query library, a database connection to the Neo4j instance, and the data schema. The latter is implemented in the core, as Neo4j (or any graph database) lacks a schema implementation.</p><p>Users can build a process mining analysis using existing modules. Additionally, since the data is stored in a Neo4j instance, it can be accessed through Cypher queries and industrial GUIs, allowing further processing, exploration, and analysis to be built on top of PromG. We therefore provide users with a template to create their own modules that interact with the core features, enabling them to realize their own OCPM analysis techniques.</p></div>
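<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate how a library can translate a task into a Cypher query, the following minimal Python sketch fills a query template from user-supplied names. It is not PromG's actual API: the template, the function build_event_query, the node labels Record and Event, and the relationship type EXTRACTED_FROM are all assumptions made for this example.</p>

```python
from string import Template

# Hypothetical query template (not taken from PromG): it turns raw record
# nodes into event nodes and links each event back to its source record.
CREATE_EVENTS = Template(
    "MATCH (r:$record_label) "
    "WHERE r.$timestamp_attr IS NOT NULL "
    "CREATE (e:Event {activity: r.$activity_attr, timestamp: r.$timestamp_attr}) "
    "CREATE (e)-[:EXTRACTED_FROM]->(r)"
)


def build_event_query(record_label: str, activity_attr: str, timestamp_attr: str) -> str:
    """Fill the template after checking that every name is a plain identifier."""
    for name in (record_label, activity_attr, timestamp_attr):
        # The names are placed directly into the query text,
        # so they are validated before substitution.
        if not name.isidentifier():
            raise ValueError(f"not a valid identifier: {name!r}")
    return CREATE_EVENTS.substitute(
        record_label=record_label,
        activity_attr=activity_attr,
        timestamp_attr=timestamp_attr,
    )


query = build_event_query("Record", "activity", "timestamp")
print(query)
```

<p>The resulting string could then be sent to a Neo4j instance through any database connection. Keeping query construction in one place is one way to let modules share a query library, as the architecture described above suggests.</p></div>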
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Functionalities Available</head><p>While PromG is designed to be easily extended with additional features, we discuss its current capabilities along the currently available layers, which users can take advantage of immediately.</p><p>(a) Raw Records to OCED Event Layer. The OCED-PG module enables the automatic import of legacy data records as nodes in a raw record layer (at the "bottom" of the graph). Based on a user-provided semantic header (a JSON document describing the data's domain semantics), OCED-PG generates queries that automatically transform the raw record nodes into nodes of related events, objects, and attributes, forming the domain-level event layer in OCED format <ref type="bibr" target="#b9">[10]</ref>; each node of the event layer is linked to the nodes of the raw record layer it originates from.</p><p>(b) Object-path inference. For each object chosen by the user, OCED-PG infers the directly-follows (df) path of its events (enhancing the event layer), resulting in a partial order over all events that can be analyzed <ref type="bibr" target="#b2">[3]</ref>.</p><p>(c) Event Layer to Process Model Layer. The process discovery module enables the automated discovery of object-centric process models as multi-object DFGs <ref type="bibr" target="#b2">[3]</ref>. The user specifies activity features and objects (or relations) for which the model should be discovered; PromG then generates queries that aggregate event nodes and df-relations of the event layer into activity nodes and flow relations per object, together forming a process model layer. Each activity node is linked to the event nodes in the event layer it models.</p><p>(d) Task Layer. PromG supports OCED analysis beyond classical OCPM use cases. The task identification module infers df-paths per resource and uses these to detect sub-graphs where a resource continuously worked on related objects. 
Queries then abstract the entire event layer into a task layer by aggregating sub-graphs into task execution nodes (linked to the underlying events), giving insights into how actors collaborate across executions <ref type="bibr" target="#b10">[11]</ref>. Fig. <ref type="figure" target="#fig_2">2</ref> visualizes the interconnected layers on BPIC'17: a task instance node (purple) linked to the underlying event nodes (green) along their df-paths, and how (some) events link to the multi-object DFG (blue/orange nodes) of BPIC'17.</p><p>(e) Custom Modules. We provide a template for users to create their own module that generates queries against Neo4j, enabling users to create custom routine and one-off analyses that enrich existing layers or introduce new layers. Through the template architecture, routine analysis modules can be included in PromG, facilitating open-source contributions.</p></div>
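<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ideas behind steps (b) and (c) can be sketched in plain Python, independent of Neo4j. The sketch below is an in-memory illustration, not PromG's query-based implementation: the Event type, the functions infer_df_edges and aggregate_dfg, and the toy order/item data are assumptions made for this example.</p>

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    event_id: str
    activity: str
    timestamp: int
    objects: tuple  # identifiers of the objects this event relates to


def infer_df_edges(events):
    """Return directly-follows edges as (object_id, source_event, target_event)."""
    per_object = defaultdict(list)
    for ev in events:
        for obj in ev.objects:
            per_object[obj].append(ev)
    edges = []
    for obj, evs in per_object.items():
        # Order the events of one object by time, then link each consecutive pair.
        evs.sort(key=lambda ev: ev.timestamp)
        for src, tgt in zip(evs, evs[1:]):
            edges.append((obj, src, tgt))
    return edges


def aggregate_dfg(edges, object_type):
    """Count activity-to-activity flows per object type (a multi-object DFG)."""
    dfg = Counter()
    for obj, src, tgt in edges:
        dfg[(object_type(obj), src.activity, tgt.activity)] += 1
    return dfg


# Toy data: one order (o1) and one item (i1) that share the event e2.
events = [
    Event("e1", "Create Order", 1, ("o1",)),
    Event("e2", "Pick Item", 2, ("o1", "i1")),
    Event("e3", "Pack Item", 3, ("i1",)),
    Event("e4", "Ship Order", 4, ("o1",)),
]
edges = infer_df_edges(events)
dfg = aggregate_dfg(edges, lambda obj: "Order" if obj.startswith("o") else "Item")
for key, count in sorted(dfg.items()):
    print(key, count)
```

<p>Because event e2 lies on the df-path of both objects, the resulting edges form a partial order over the events rather than a single sequence, which is the property step (b) describes.</p></div>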
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Installation, Usage, and Maturity</head><p>The PromG library is hosted on PyPI<ref type="foot" target="#foot_0">2</ref> and is open source <ref type="foot" target="#foot_1">3</ref>, with example analyses, a demo video, and documentation. PromG can be used in any Python project as long as a Neo4j instance <ref type="foot" target="#foot_2">4</ref> (with the APOC plugin<ref type="foot" target="#foot_3">5</ref> installed) is available. PromG provides example projects for constructing EKGs of 5 public real-life event logs of different sizes (BPIC14, BPIC15, BPIC16, BPIC17, BPIC19). Graph construction is a one-time operation whose running time depends on the number of relationships to construct <ref type="bibr" target="#b11">[12,</ref><ref type="bibr">Tab. 4]</ref>. Improving PromG query performance is planned future work.</p><p>PromG's approach and queries have been used in developing custom analyses in multiple industrial case studies on baggage handling systems <ref type="bibr" target="#b12">[13]</ref>, semiconductor <ref type="bibr" target="#b13">[14]</ref> and ship manufacturing <ref type="bibr" target="#b14">[15]</ref>, and configuration management <ref type="bibr" target="#b15">[16]</ref>, with consistently positive feedback that the graph-based approach enables insights and analytics not obtainable previously. Incorporating relevant analysis functions into PromG is planned future work.</p></div>
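<div xmlns="http://www.tei-c.org/ns/1.0"><p>Before running an analysis, it can help to verify that the prerequisites named above are in place. The following sketch uses the official Neo4j Python driver (installed separately, e.g. with pip install neo4j) rather than PromG itself; the function name check_neo4j and the default URI, user name, and password are placeholder assumptions to be replaced with the values of the local instance.</p>

```python
# Query that returns the installed APOC version; it is expected to fail
# with a driver error if the APOC plugin is not available on the server.
APOC_VERSION_QUERY = "RETURN apoc.version() AS version"


def check_neo4j(uri="bolt://localhost:7687", user="neo4j", password="password"):
    """Return the APOC version of a reachable Neo4j instance.

    The connection defaults are placeholders, not values prescribed by PromG.
    """
    # Imported lazily so that this module loads even without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        driver.verify_connectivity()  # raises if the server cannot be reached
        with driver.session() as session:
            record = session.run(APOC_VERSION_QUERY).single()
            return record["version"]
```

<p>Calling check_neo4j() with the credentials of a running instance returns the APOC version string. Any exception indicates that either the database or the plugin still needs to be set up.</p></div>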
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Comparison to Related Software</head><p>Next to closed-source implementations of OCPM, only the open-source Python library OCPA <ref type="bibr" target="#b8">[9]</ref> addresses the same objective as PromG. OCPA currently offers more analytics functionality than PromG, and serves as "backbone" for the GUI-based analysis tools OCPM <ref type="bibr" target="#b6">[7]</ref> and OC𝜋 <ref type="bibr" target="#b7">[8]</ref>.</p><p>PromG's strengths lie in the multi-layered Event Knowledge Graph (EKG) within a standardized data store (Neo4j): the EKG implements standard OCED with domain semantics; the extensible layers persist analysis results linked to the source data (see Fig. <ref type="figure" target="#fig_2">2</ref>); and Neo4j's query language Cypher and its GUIs enable advanced, interactive data exploration and visualization crucial for OCPM analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>PromG is an open-source Python library designed to manage and explore OCED and to perform OCPM analyses. Although its current functionality is limited compared to some academic counterparts, PromG's architecture prioritizes ease of extension and future development, positioning it as a valuable tool in the growing field of OCED and OCPM.</p><p>In particular, PromG's multi-layered knowledge graph supports the development of a number of extensions: next to realizing further OCPM capabilities <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b7">8]</ref>, these include an inference engine for inferring missing or latent information <ref type="bibr" target="#b13">[14]</ref>, building on an integration of event data with system design and context data <ref type="bibr" target="#b12">[13]</ref>; analysis of actor behavior and organizational routines <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b14">15]</ref>; and detection of emergent behavior and its propagation across cases <ref type="bibr" target="#b16">[17]</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: PromG architecture</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>(a) Raw Records to OCED Event Layer. The OCED-PG module enables the automatic import</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Multi-layered event knowledge graph of BPIC'17 generated by PromG and explored in Neo4j Bloom.</figDesc><graphic coords="3,89.29,84.20,416.64,227.96" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://pypi.org/project/promg/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">https://github.com/PromG-dev</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">https://neo4j.com/product/neo4j-graph-database/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">https://neo4j.com/labs/apoc/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Object-centric process mining: Dealing with divergence and convergence in event data</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SEFM 2019</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">11724</biblScope>
			<biblScope unit="page" from="3" to="25" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">AI-augmented business process management systems: A research manifesto</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dumas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Fournier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Limonad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Manag. Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">19</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Process mining over multiple behavioral dimensions with event knowledge graphs</title>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Process Mining Handbook</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">448</biblScope>
			<biblScope unit="page" from="274" to="319" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Twin transitions powered by event data -using object-centric process mining to make processes digital and sustainable</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">ATAED 2023</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3424</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Discovering interacting artifacts from ERP systems</title>
		<author>
			<persName><forename type="first">X</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nagelkerke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Van De Wiel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Serv. Comput</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="861" to="873" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Discovering object-centric Petri nets</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Berti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Fundam. Informaticae</title>
		<imprint>
			<biblScope unit="volume">175</biblScope>
			<biblScope unit="page" from="1" to="40" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">OC-PM: analyzing object-centric event logs and process models</title>
		<author>
			<persName><forename type="first">A</forename><surname>Berti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal on Software Tools for Technology Transfer</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">OC𝜋: Object-centric process insights</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Adams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Applications and Theory of Petri Nets</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">ocpa: A python library for object-centric process analysis</title>
	</analytic>
	<monogr>
		<title level="j">Software Impacts</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">100438</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Implementing object-centric event data models in event knowledge graphs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Swevels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Montali</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Process Mining Workshops. ICPM 2023</title>
		<title level="s">Lecture Notes in Business Information Processing</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>Accepted, to appear</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Classifying and detecting task executions and routines in processes using event graphs</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Klijn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Mannhardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">BPM&apos;21 Forum</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">427</biblScope>
			<biblScope unit="page" from="212" to="229" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Multi-dimensional event data in graph databases</title>
		<author>
			<persName><forename type="first">S</forename><surname>Esser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal on Data Semantics</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">V</forename><surname>Chu</surname></persName>
		</author>
		<title level="m">Using event knowledge graphs to model multi-dimensional dynamics in a baggage handling system</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Inferring missing entity identifiers from context using event knowledge graphs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Swevels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Dijkman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">BPM 2023</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">14159</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<title level="m">Event graph model discovery for waiting time and workflow analysis in Damen&apos;s process</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Capturing multi-dimensional dynamics in a configuration management process through event knowledge graphs</title>
		<author>
			<persName><forename type="first">K</forename><surname>Marangoz</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">The interplay between high-level problems and the process instances that give rise to them</title>
		<author>
			<persName><forename type="first">B</forename><surname>Bakullari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Van Thoor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fahland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">BPM 2023 Forum</title>
				<imprint>
			<publisher>LNBIP</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">490</biblScope>
			<biblScope unit="page" from="145" to="162" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
