<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Easy and Efficient Object-Centric Process Querying with the OCPQ Tool</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aaron Küsters</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wil M.P. van der Aalst</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chair of Process and Data Science (PADS), RWTH Aachen University</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>OCPQ is a framework and graphical tool for querying object-centric process data. Traditional process querying relies on case-centric event data, which often cannot accurately represent the multiple interacting objects and perspectives of real-life processes. For object-centric data, querying individual events or cases is no longer sufficient. The OCPQ tool allows more flexible queries over arbitrary combinations of objects and events. Queries are represented graphically as node trees, so complex queries can be constructed visually and without programming experience. The query backend of the tool is implemented in the Rust programming language with a focus on fast execution.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Mining</kwd>
        <kwd>Object-Centric Event Data</kwd>
        <kwd>Process Querying</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>combinations of objects and events, so-called bindings, by specifying their object and event
types, and any additional filter predicates, e.g., that there should be event-to-object (E2O) or
object-to-object (O2O) relationships between them. In this demo paper, we present the OCPQ
tool implementation in more detail, including new features such as general binding annotations,
OCEL filtering, and early stopping on overload. The OCPQ tool allows visually modeling and
evaluating object-centric process queries on a loaded object-centric dataset. The result of
each query or subquery is an output table that can be explored, where individual rows correspond to
bindings, i.e., combinations of event and object instances, as defined by the query. Each binding
row can additionally be annotated with labels or other information, for example a violation
indicator or a key performance indicator (KPI) value. Figure 1 shows an example OCPQ query
with integrated constraints. The graphical query is shown on the left and the output table of
the root node is shown on the right.</p>
      <p>The remainder of this paper is structured as follows: First, we present an overview of the
OCPQ tool and its capabilities in Section 2. Next, in Section 3, we provide a brief description of
the OCPQ tool’s implementation. Finally, we conclude this paper in Section 4, also giving an
outlook on future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. OCPQ Tool</title>
      <p>The OCPQ tool is available for download at https://ocpq.aarkue.eu, where the documentation can
also be found. The source code of OCPQ is publicly available at https://github.com/aarkue/ocpq.
A short demo video of the tool is available at https://github.com/aarkue/ocpq-demo.</p>
      <p>
        Initially, an object-centric event log has to be imported. OCPQ supports the OCEL 2.0
specification [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] in any of the three introduced exchange formats (JSON, XML, and SQLite).
After the dataset is imported, basic information about the OCEL is shown, including the
number of events and objects, as well as all object and event types together with their attributes.
      </p>
      <p>[Figure: OCPQ query nodes, each with sections for Object Variables, Event Variables, Filters, Labels, and Constraints; visible entries include o1 (orders, items, packages) and e1 (payment reminder).]</p>
      <p>2.1. Features</p>
      <p>
After importing an OCEL file, the dataset can be explored using queries, constraints, or graph
visualizations of objects, events, and their relationships.</p>
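      <p>As a rough illustration of the summary shown after import, the following Python sketch counts events and objects per type in an OCEL 2.0 JSON document. It is not taken from the OCPQ code base: the function summarize_ocel and the tiny inline log are hypothetical, and the key names (events, objects, type) are assumed to match the OCEL 2.0 JSON exchange format.</p>
      <preformat><![CDATA[
```python
import json
from collections import Counter

# Tiny hand-written log; the layout is ASSUMED to mirror OCEL 2.0 JSON.
EXAMPLE_OCEL = """
{
  "objectTypes": [{"name": "orders", "attributes": []}],
  "eventTypes": [{"name": "place order", "attributes": []}],
  "objects": [
    {"id": "o-1", "type": "orders"},
    {"id": "o-2", "type": "orders"}
  ],
  "events": [
    {"id": "e-1", "type": "place order", "time": "2024-01-01T10:00:00Z",
     "relationships": [{"objectId": "o-1", "qualifier": "order"}]}
  ]
}
"""


def summarize_ocel(raw):
    """Return total and per-type counts of events and objects (hypothetical helper)."""
    log = json.loads(raw)
    events = log.get("events", [])
    objects = log.get("objects", [])
    return {
        "num_events": len(events),
        "num_objects": len(objects),
        "events_per_type": dict(Counter(e["type"] for e in events)),
        "objects_per_type": dict(Counter(o["type"] for o in objects)),
    }


summary = summarize_ocel(EXAMPLE_OCEL)
```
]]></preformat>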
      <p>Visual Query and Constraint Tree Editor. The query editor appears after creating a new
query and can be used to create and link multiple query nodes to form a query tree. This tree
structure makes it easy to model and nest subqueries (e.g., a subquery for all items in
an order). In each node, event and object variables can be added, as well as filter predicates
that determine which combinations of object and event instances to consider. For example, the E2O
filter predicate specifies that there should be an event-to-object relationship between variable
values. Figure 1 shows the user interface of the query tree editor on the left. The top node
queries all combinations of orders objects and corresponding place order events. For that,
the node introduces two variables o1 (for the orders object) and e1 (for the place order
event), and also has one E2O filter predicate (visualized as a link icon), specifying that the values
of o1 and e1 should be in an E2O relationship. Queries can be evaluated using the play button
on the top right. After evaluation, the number of queried bindings is displayed in the top right
for every node, indicated with # as shown in Figure 1. Analogously to filters, predicates can also
be used as constraints. Bindings that fulfill all constraint predicates of a node are considered
satisfied; all other bindings are considered violated. If constraint predicates are used, the violation percentage of a
node determines its color. For instance, in Figure 1, the root node is colored yellow
because around 30% of its bindings are violated.</p>
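      <p>The notions of binding, filter predicate, and constraint can be sketched in a few lines of Python. This is a minimal sketch on hypothetical toy data, not the OCPQ implementation: the identifiers, the pay order event type, and the example constraint are assumptions made for illustration, and the naive enumeration of all pairs is used only for readability.</p>
      <preformat><![CDATA[
```python
from itertools import product

# Hypothetical toy data: object ids by type, event ids by type, and E2O pairs.
objects = {"orders": ["o1", "o2", "o3"]}
events = {"place order": ["ev1", "ev2"], "pay order": ["ev3"]}
e2o = {("ev1", "o1"), ("ev2", "o2"), ("ev2", "o3"), ("ev3", "o1")}


def query_bindings():
    """All bindings with o1: orders, e1: place order, and E2O(e1, o1)."""
    return [
        {"o1": o, "e1": e}
        for o, e in product(objects["orders"], events["place order"])
        if (e, o) in e2o  # E2O filter predicate
    ]


def is_satisfied(binding):
    """Illustrative constraint: the order is related to some 'pay order' event."""
    return any((e, binding["o1"]) in e2o for e in events["pay order"])


bindings = query_bindings()
violated = [b for b in bindings if not is_satisfied(b)]
# Share of violated bindings, which would drive the node color in the editor.
violation_share = len(violated) / len(bindings)
```
]]></preformat>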
      <p>General Annotations and KPIs. The violation status is not the only annotation that can
be added to output bindings. For example, KPIs, such as the total order volume or the number
of payment reminders per customer, can be added to each output binding row. To allow
general annotations, OCPQ uses the Common Expression Language<sup>1</sup>. Figure 2 shows an example
OCPQ query that calculates the number of payment reminders per customer.</p>
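      <p>The payment reminder KPI can be pictured as a label computed from the result of a subquery. The Python sketch below is an illustration under assumptions: the toy data, the helper names reminder_subquery and annotate, and the label name num_reminders are hypothetical, and plain Python is used in place of a Common Expression Language expression.</p>
      <preformat><![CDATA[
```python
# Hypothetical toy data: customers and 'payment reminder' events linked via E2O.
customers = ["c1", "c2", "c3"]
payment_reminders = ["r1", "r2", "r3"]
e2o = {("r1", "c1"), ("r2", "c1"), ("r3", "c2")}


def reminder_subquery(customer):
    """Child bindings: payment reminder events related to the given customer."""
    return [{"e1": r} for r in payment_reminders if (r, customer) in e2o]


def annotate(customer):
    """One output row: the parent binding plus a label derived from the subquery."""
    child_bindings = reminder_subquery(customer)
    return {"o1": customer, "num_reminders": len(child_bindings)}


# Output table of the root node, one row per customer binding.
table = [annotate(c) for c in customers]
```
]]></preformat>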
      <p>All output tables can be explored directly in the tool or exported as CSV or XLSX files
for use in other tools and applications, e.g., as input for machine learning techniques.</p>
      <p>OCEL Filtering. Filtering OCEL datasets and exporting the resulting subset is also
supported in OCPQ. The filtering is implemented using three different configuration modes
for each element (i.e., object, event, or relation): Included (green), which specifies that the
element should be included in the output; Excluded (red), which specifies that the element
should be explicitly excluded, even if it is included somewhere else; and Ignored (gray), which does
not influence the output. On the right of Figure 2, a filtering example of OCPQ is shown.</p>
      <p>Other Features. Apart from the previously mentioned functionality, OCPQ also has some
additional features. For example, it supports automatic discovery of some types of constraints
based on an input OCEL. Moreover, safeguards are in place to stop the execution of overloaded
queries early and inform the user. For more information and feature descriptions, we refer
interested readers to the website of the tool.</p>
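      <p>One possible reading of the three filtering modes is sketched below in Python. The rule representation and the function keep are hypothetical and not the OCPQ data model. The sketch assumes that an element is kept only if at least one Included rule matches it and no Excluded rule matches it, while Ignored rules are skipped entirely.</p>
      <preformat><![CDATA[
```python
from enum import Enum


class Mode(Enum):
    INCLUDED = "included"  # green
    EXCLUDED = "excluded"  # red
    IGNORED = "ignored"    # gray


def keep(element_id, rules):
    """Decide whether an element is part of the filtered output.

    rules is a list of (Mode, set of matched element ids); this shape is an
    assumption made for illustration only.
    """
    included = any(m is Mode.INCLUDED and element_id in ids for m, ids in rules)
    excluded = any(m is Mode.EXCLUDED and element_id in ids for m, ids in rules)
    # Exclusion wins over inclusion; IGNORED rules never influence the result.
    return included and not excluded


rules = [
    (Mode.INCLUDED, {"o1", "o2", "o3"}),
    (Mode.EXCLUDED, {"o2"}),
    (Mode.IGNORED, {"o3", "o4"}),
]
kept = [x for x in ["o1", "o2", "o3", "o4"] if keep(x, rules)]
```
]]></preformat>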
    </sec>
    <sec id="sec-3">
      <title>3. Implementation</title>
      <p>
        The OCPQ implementation is based on the Rust4PM software library presented in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], in
particular using its OCEL 2.0 data structures and importers. The backend and frontend of OCPQ
are implemented in a modular way, so the tool can be used both as a desktop application<sup>2</sup> and
as a hosted web application. After an OCEL is imported, it is processed to link object and event
references as indicated by the identifiers in their relationships. Several implementation details
support fast execution of queries. First, query execution is parallelized, for example by evaluating
subqueries or additional filter predicates in parallel across the considered bindings. Second, the
order and method for binding new object or event variables to values are optimized, reducing
the number of constructed bindings that are later discarded. For instance, when binding a customers and an
orders object with an O2O relationship, not all combinations of all customers and all orders
need to be considered. After constructing one binding for each customers object, the O2O
relationship can be used to directly construct only those binding extensions that also fulfill the
O2O filter predicate. While evaluating the query execution performance in detail is outside
the scope of this paper, we refer interested readers to the evaluation section in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. There, we
investigated the runtime of example queries on a real-life dataset with more than one million
events and found that every tested example query finished in less than 85 ms (0.085 seconds).
      </p>
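      <p>The customers and orders example can be made concrete with a small Python sketch that contrasts testing every pair with extending bindings along an O2O index. It is an illustration under assumptions, not the Rust implementation: the toy data, the function names, and the max_bindings limit (standing in for the early-stopping safeguard) are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Hypothetical toy data: O2O pairs are (customer, order).
customers = ["c1", "c2"]
orders = ["o1", "o2", "o3"]
o2o = [("c1", "o1"), ("c1", "o2"), ("c2", "o3")]


def naive_bindings():
    """Baseline: test every customer/order pair against the O2O relation."""
    related = set(o2o)
    checked = 0
    result = []
    for c in customers:
        for o in orders:
            checked += 1
            if (c, o) in related:
                result.append({"o1": c, "o2": o})
    return result, checked


def indexed_bindings(max_bindings=1000):
    """Bind customers first, then extend each binding only along O2O links."""
    index = defaultdict(list)
    for c, o in o2o:
        index[c].append(o)
    constructed = 0
    result = []
    for c in customers:
        for o in index[c]:
            constructed += 1
            if constructed > max_bindings:
                # Stand-in for a safeguard that stops overloaded queries early.
                raise RuntimeError("query overloaded, stopping early")
            result.append({"o1": c, "o2": o})
    return result, constructed


naive, naive_checked = naive_bindings()
fast, fast_constructed = indexed_bindings()
```
]]></preformat>
      <p>Both functions produce the same bindings on this data, but the indexed variant only constructs candidates that already fulfill the O2O predicate, so its work grows with the number of relationships rather than with the product of the two object sets.</p>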
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>
        In this paper, we presented the OCPQ tool for querying object-centric process data. As handling
object-centric processes requires more flexibility, OCPQ not only allows querying individual
events or objects, but also any combination of objects and events. Through a graphical constraint
editor, nested queries can be modeled in a tree structure without programming experience. The
results of queries can be explored inside the tool and can also be exported. Constraints and
other annotations can be used to attach additional information and labels to each output binding.
      </p>
      <p>
        Maturity. The first main version of the OCPQ tool was published in September 2024 as v0.5.0.
Since then, more than 15 version updates have been published, adding functionality
such as general annotation labels, filtering of OCEL files, and macOS support. The tool is
stable, and installers for Windows, Linux, and macOS are automatically built for every release.
As described in more detail in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the tool also supports larger, real-life datasets well.
      </p>
      <p>Future work. We want to open the OCPQ tool up for more use cases and backends, for
example by executing the modeled queries via SQL or by allowing custom extensions in the tool.
Moreover, a case study applying OCPQ to a real-life problem would be interesting, for example
using the situation table export functionality to derive custom features for process outcome
prediction.</p>
      <p><sup>1</sup> https://cel.dev/</p>
      <p><sup>2</sup> The desktop application uses the Tauri framework from https://github.com/tauri-apps/tauri.</p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Polyvyanyy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Process querying: Enabling business intelligence through query-based process analytics</article-title>
          ,
          <source>DSS</source>
          <volume>100</volume>
          (
          <year>2017</year>
          )
          <fpage>41</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Pérez-Álvarez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Parody</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M. R.</given-names>
            <surname>Quintero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Gómez-López</surname>
          </string-name>
          ,
          <article-title>Process Instance Query Language and the Process Querying Framework</article-title>
          , in: Process Querying Methods, Springer,
          <year>2022</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Vogelgesang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ambrosy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Becher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Seilbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Geyer-Klingeberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Klenk</surname>
          </string-name>
          ,
          <article-title>Celonis PQL: A Query Language for Process Mining</article-title>
          , in: Process Querying Methods, Springer,
          <year>2022</year>
          , pp.
          <fpage>377</fpage>
          -
          <lpage>408</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Esser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          ,
          <article-title>Multi-Dimensional Event Data in Graph Databases</article-title>
          ,
          <source>J. Data Semant.</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>109</fpage>
          -
          <lpage>141</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Koren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. N.</given-names>
            <surname>Adams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Knopp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Graves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rafiei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>T. genannt Unterberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. T.</given-names>
            <surname>Schwanen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pegoraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>OCEL (object-centric event log) 2.0 specification</article-title>
          ,
          <source>CoRR</source>
          abs/2403.01975 (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Küsters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>OCPQ: Object-Centric Process Querying and Constraints</article-title>
          , in:
          <source>RCIS (1)</source>
          , volume
          <volume>547</volume>
          of
          <source>LNBIP</source>
          , Springer,
          <year>2025</year>
          , pp.
          <fpage>383</fpage>
          -
          <lpage>400</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Bowen</surname>
          </string-name>
          ,
          <article-title>The Z notation: Whence the cause and whither the course?</article-title>
          ,
          in:
          <source>SETSS</source>
          , volume
          <volume>9506</volume>
          of
          <source>LNCS</source>
          , Springer,
          <year>2014</year>
          , pp.
          <fpage>103</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Küsters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Rust4PM: A Versatile Process Mining Library for When Performance Matters</article-title>
          , in:
          <source>BPM (Demos / Resources Forum)</source>
          , volume
          <volume>3758</volume>
          of
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS.org
          ,
          <year>2024</year>
          , pp.
          <fpage>91</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>