<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multidimensional Process Model Forecasting (MuDiPMF)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yongbo Yu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>KU Leuven</institution>
          ,
          <addr-line>Naamsestraat 69, 3000 Leuven</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Process analytics aims to improve processes based on event logs generated by information systems by, among others, automatically discovering models representing the current system. This discovery, however, typically yields static models that ignore the system's underlying trends and shifts. Recently, the modeling and prediction of the full system have been proposed as process model forecasting (PMF). However, the current state of the art cannot model intricate control-flow constructs and does not incorporate extra information, such as resources tied to the process. Moreover, by using univariate models, it ignores the underlying relations between the different elements of the system. This proposal addresses these issues first by extending PMF to richer control-flow models that are able to capture relationships between activities in workflows. Secondly, the current PMF techniques will be replaced by a multivariate framework based on state-of-the-art deep learning techniques such as graph neural networks, which form a natural fit for graph-based models such as workflow diagrams and capture temporal, structural, and multiscale patterns. Next, additional perspectives, such as resources, will be added to obtain fully object-centric process model forecasts, which can incorporate any data related to a process through the forecasting of event knowledge graphs. Finally, two industry cases in finance and logistics will be used to validate the findings in a real-life setting.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Model Forecasting</kwd>
        <kwd>Time Series Forecasting</kwd>
        <kwd>Graph Neural Networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Positioning and Motivation</title>
      <p>
        Within the field of Process Mining, Predictive Process Monitoring (PPM) entails forecasting future
elements of ongoing process instances or cases, including the most probable next activities, outcomes,
and remaining runtime. Notably, the integration of machine learning and deep learning solutions
into this domain has been extensively explored in academic literature [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. While real-time insights at
the individual case level allow process owners to intervene in specific instances, they often lack the
capacity to provide end-users with information regarding the future trajectory of the entire process.
Consequently, a new paradigm known as Process Model Forecasting (PMF) has emerged [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], focusing
on predicting future states of the overall process model over a long-term horizon, drawing information
from historical event data [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This Multidimensional Process Model Forecasting (MuDiPMF) project
aims to develop and validate a set of tailored and integrated multi-dimensional forecasting models using
multivariate predictive methods for multi-perspective business process models.
      </p>
      <p>
        The current state-of-the-art in PMF involves depicting the evolution of process behavior through
time series analysis of individual Directly-Follows relations (DFs), which track the frequency of one
activity following another within cases, over a predefined timeframe [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These DFs collectively form a
Directly-Follows Graph (DFG), a widely used process visualization that offers a clear representation
of the flow of the process. The individual DFs are forecasted using univariate time series forecasting
techniques, which overlook correlations between different DFs induced by the underlying relations between
process elements within the information system. Additionally, DFGs cannot model more
nuanced process constructs, such as parallel behavior, in contrast to semantically richer process
model notations such as Petri nets and BPMN models.
      </p>
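      <p>A minimal sketch of how such DF time series could be derived from an event log is given below, in Python. It assumes numeric timestamps, equal-width time intervals, and that each DF pair is counted in the interval of its second event; the function name df_time_series and the toy log are hypothetical illustrations and are not taken from [3].</p>
      <preformat preformat-type="code">
```python
from collections import defaultdict


def df_time_series(events, n_intervals):
    """Count directly-follows (DF) pairs per equal-width time interval.

    events: iterable of (case_id, activity, timestamp) tuples with
    numeric timestamps. Returns {(a, b): [count per interval]}.
    """
    # Order events by case, then by time within each case.
    events = sorted(events, key=lambda e: (e[0], e[2]))
    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    # Fall back to a width of 1.0 if all timestamps coincide.
    width = (t_max - t_min) / n_intervals or 1.0
    series = defaultdict(lambda: [0] * n_intervals)
    for prev, curr in zip(events, events[1:]):
        if prev[0] != curr[0]:
            continue  # consecutive events belong to different cases
        # Assumption: a pair is counted in the interval of its second event.
        idx = min(int((curr[2] - t_min) / width), n_intervals - 1)
        series[(prev[1], curr[1])][idx] += 1
    return dict(series)


# Toy event log (hypothetical): two cases, three activities each.
log = [
    ("c1", "register", 0), ("c1", "check", 2), ("c1", "pay", 5),
    ("c2", "register", 4), ("c2", "check", 9), ("c2", "pay", 10),
]
series = df_time_series(log, n_intervals=2)
```
      </preformat>
      <p>Each resulting series, one per DF relation, could then be passed to a univariate forecaster independently, which is the setting that ignores correlations between DFs.</p>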
      <p>
        Therefore, the first objective of MuDiPMF is to forecast more semantically rich process models. This
entails expanding the feature set beyond DFs, for example with the constructs used in advanced process
discovery techniques. Furthermore, multimodal predictive models can be used to forecast the various
time series of multidimensional feature sets simultaneously, thereby accounting for process-related
dependencies and correlations. As a second objective, MuDiPMF aims to extend the forecasted process models
beyond the control-flow aspect by incorporating additional dimensions such as resource occupation,
execution times, and decision point analyses. Among other things, this would allow process owners to
perform timely interventions on bottlenecks and optimize resource allocations. Furthermore,
in recent years Object-Centric Process Mining (OCPM) has emerged as a new family of approaches
tailored to handle event data from processes involving different interconnected objects such as orders,
items, and shipments, garnering widespread attention in both academia and industry [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Given the
rapid emergence and relevance of these object-centric process models, a third objective of MuDiPMF is
to develop a framework extending forecasting capabilities to such process models.
      </p>
      <p>In summary, the Multidimensional Process Model Forecasting (MuDiPMF) project will
contribute to the state of the art in Business Process Management and Process Mining by
extending the recently proposed PMF framework: it broadens the feature set to be predicted with
additional dimensions and explores more suitable multivariate predictive methods.
Finally, the project aims to demonstrate the practical utility of the different PMF enhancements using
real-life process data from two domains: financial services and logistics.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Current Solutions and Research Objectives</title>
      <p>
        Figure 1 illustrates the research gaps and research objectives (ROs). Given the recent inception of the
PMF paradigm, the state-of-the-art is currently confined to univariate forecasting of distinct DFs [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
More specifically, current solutions rely on autoregressive time series forecasting techniques such as
ARIMA and GARCH. The forecasted DFs can collectively represent a process model (DFG), but are
not sufficient to discover more complex process model structures such as parallelism. In contrast, the
literature on automated process discovery is more developed, with numerous approaches proposed over
the years to discover, e.g., Petri nets or BPMN models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Examples include Heuristics Miner and
its extension Fodina, which use various heuristics and formalisms to automatically discover, among
others, concurrency, exclusive choices, and loops in a process from event logs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Another approach
involves a top-down strategy, exemplified by methods like Inductive Miner, which partitions larger
event logs into more manageable segments for analysis [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. RO1 addresses the expansion of PMF towards
forecasting shifts in processes expressed by more semantically rich process model representations.
      </p>
      <p>
        Next, the growing literature on PPM remains relevant despite its focus on single objectives such as
suffix prediction [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], or case outcome prediction [10]. Many of these predictive approaches have adopted
deep learning models such as long short-term memory networks and even graph neural networks.
Many take a multivariate approach, but not the multi-target approach envisioned for RO2. Finally,
interest in the object-centric perspective of process mining has grown in recent years [11]. The
representation of object-centric event logs as an event knowledge graph is of particular interest
for this project, given the similarity between object-centric process models that change over time and a
temporal graph-based structure of the data [12]. This will be addressed in RO3.
      </p>
      <p>From an algorithmic perspective, various data-driven and deep
learning approaches for multivariate time series forecasting have been proposed [13]. Given the
graph-based structure of process models and the emergence of Graph Neural Networks (GNNs) as powerful
predictors in different tasks, GNNs are a natural match to learn both temporal and structural
properties of process models. In particular, work that incorporates the time dimension into GNNs to
model spatial and temporal dependencies jointly could provide PMF with more powerful and
flexible predictors capable of taking process-specific dependencies into account. Different approaches
to spatial-temporal graph neural networks (STGNNs), such as STGCN [14] and StemGNN [15], have
shown potential in domains such as traffic forecasting.</p>
      <p>[Figure labels: Multidimensional feature set; Multivariate predictive models; Multidimensional Process Model Forecasting (MuDiPMF). The research objectives (ROs) and work packages (WPs) from the diagram follow.]</p>
      <sec id="sec-2-1">
        <title>RO1: Semantically rich process model representations</title>
        <p>WP1.1: Engineer data structures for semantically rich control-flow process model forecasting.</p>
        <p>WP1.2: Develop advanced multivariate and multiscale forecasting algorithms for semantically rich process models.</p>
      </sec>
      <sec id="sec-2-2">
        <title>RO2: Extend process model forecasting to other dimensions</title>
        <p>WP2.1: Develop time series data transformation techniques for bottleneck, resource, and decision points.</p>
        <p>WP2.2: Design and implement a GNN-based multidimensional process model forecasting algorithm.</p>
      </sec>
      <sec id="sec-2-3">
        <title>RO3: Object-centric process model forecasting algorithm</title>
        <p>WP3.1: Construct event knowledge graphs of object-centric event logs tailored to process model forecasting.</p>
        <p>WP3.2: Develop heterogeneous graph-based predictive models to forecast object-centric process models.</p>
      </sec>
      <sec id="sec-2-4">
        <title>RO4: Case studies</title>
        <p>WP4.1: Case study in financial services.</p>
        <p>WP4.2: Case study in logistics industry.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Planned Research Methodology</title>
      <p>Figure 2 presents a schematic overview of the proposed work plan, designed based on [16], illustrating
the alignment of the different work packages (WPs) with the four research objectives (ROs).</p>
      <p>RO1 aims to extend the feature set of the forecasting techniques beyond DFs by incorporating process
representations and dependencies utilized in various widely used process discovery methodologies. For
example, we can forecast full dependency graphs, as this would allow us to discover parallel activities
or forecast the required metrics for the partition creation used by top-down discovery algorithms.
Correspondingly, existing predictive models will be replaced with multivariate models capable of
simultaneously forecasting all time series while accounting for cross-dependencies. One promising
avenue involves the exploration of multi-scale spatial-temporal graph neural networks.</p>
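      <p>As a hedged illustration of why dependency graphs are an attractive forecasting target, the Python sketch below computes the dependency measure commonly associated with Heuristics Miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), from DF counts for one interval. The counts are hypothetical stand-ins for forecasted values, the function name is an assumption, and the sketch does not represent the forecasting method that MuDiPMF will eventually adopt.</p>
      <preformat preformat-type="code">
```python
def dependency(df_counts, a, b):
    """Heuristics-Miner-style dependency measure between activities a and b.

    df_counts maps (x, y) to how often y directly follows x.
    """
    ab = df_counts.get((a, b), 0)
    ba = df_counts.get((b, a), 0)
    if a == b:
        return ab / (ab + 1)  # variant used for length-one loops
    return (ab - ba) / (ab + ba + 1)


# Hypothetical forecasted DF counts for a single future interval.
forecast = {("a", "b"): 9, ("b", "c"): 5, ("c", "b"): 5}

d_ab = dependency(forecast, "a", "b")  # strong dependency from a to b
d_bc = dependency(forecast, "b", "c")  # balanced counts in both directions
```
      </preformat>
      <p>In this toy example, b and c follow each other equally often, so their dependency value is zero; in heuristic discovery, such a pattern may hint at concurrency rather than a causal order, which a plain DFG cannot express.</p>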
      <p>RO2 extends process model forecasting to dimensions orthogonal to the
control flow, including bottlenecks, resource allocation, and decision points. We aim to integrate resource
information into the feature set, considering multiple granularity levels from overall resource occupancy
to allocations at specific activities. Finally, we will incorporate the attention mechanism in GNNs to
extract more reliable and efficient patterns and leverage the multitask learning (MTL) framework to
implement a multidimensional PMF algorithm.</p>
      <p>RO3 aims to design and implement a comprehensive object-centric process model forecasting
algorithm. To account for the complexity of object-centric event logs, we will develop an appropriate
event knowledge graph (EKG) structure on top of which a new process model forecasting algorithm can
be built. This would entail the construction of time series features for heterogeneous graph elements
within EKGs. In addition, we aim to explore further architectures tailored to heterogeneous graph
forecasting. The initial avenue that will be pursued focuses on the use of heterogeneous temporal graph
neural networks.</p>
      <p>RO4 aims to extend the impact of the techniques developed within MuDiPMF by deploying them
in practical applications across diverse industries, specifically targeting the financial services and
logistics sectors. The research group’s network will be leveraged to collaborate with two partnering
companies. Through these case studies, the objective is to demonstrate how the advancements made
can substantially improve the state of the art in Process Model Forecasting (PMF) and to highlight their
practical effectiveness. By validating our algorithms on real-world problems and data, we aim not
only to emphasize their added value but also to develop refinement and adaptation strategies that
make MuDiPMF algorithms extensible to other application domains.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This study was financed by the Research Foundation Flanders under grant number G039923N and
Internal Funds KU Leuven under grant number C14/23/031.</p>
      <p>This Ph.D. thesis is supervised by Prof. dr. Johannes De Smedt and Prof. dr. Jochen De Weerdt.</p>
      <p>[10] I. Teinemaa, M. Dumas, M. L. Rosa, F. M. Maggi, Outcome-oriented predictive process monitoring: Review and benchmark, ACM Transactions on Knowledge Discovery from Data (TKDD) 13 (2019) 1–57.</p>
      <p>[11] R. Galanti, M. De Leoni, N. Navarin, A. Marazzi, Object-centric process predictive analytics, Expert Systems with Applications 213 (2023) 119173.</p>
      <p>[12] D. Fahland, Process mining over multiple behavioral dimensions with event knowledge graphs, in: Process mining handbook, Springer, 2022, pp. 274–319.</p>
      <p>[13] B. Lim, S. Zohren, Time-series forecasting with deep learning: a survey, Philosophical Transactions of the Royal Society A 379 (2021) 20200209.</p>
      <p>[14] B. Yu, H. Yin, Z. Zhu, Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting, arXiv preprint arXiv:1709.04875 (2017).</p>
      <p>[15] D. Cao, Y. Wang, J. Duan, C. Zhang, X. Zhu, C. Huang, Y. Tong, B. Xu, J. Bai, J. Tong, et al., Spectral temporal graph neural network for multivariate time-series forecasting, Advances in neural information processing systems 33 (2020) 17766–17778.</p>
      <p>[16] J. Mendling, H. Leopold, H. Meyerhenke, B. Depaire, Methodology of algorithm engineering, arXiv preprint arXiv:2310.18979 (2023).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Maggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Di Francescomarino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ghidini</surname>
          </string-name>
          ,
          <article-title>Predictive monitoring of business processes</article-title>
          , in: Advanced Information Systems Engineering: 26th International Conference, CAiSE
          <year>2014</year>
          , Thessaloniki, Greece, June 16-20,
          <year>2014</year>
          . Proceedings 26, Springer,
          <year>2014</year>
          , pp.
          <fpage>457</fpage>
          -
          <lpage>472</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Poll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polyvyanyy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Röglinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rupprecht</surname>
          </string-name>
          ,
          <article-title>Process forecasting: Towards proactive business process management</article-title>
          ,
          <source>in: Business Process Management: 16th International Conference, BPM 2018</source>
          , Sydney, NSW, Australia, September 9-14,
          <year>2018</year>
          , Proceedings 16, Springer,
          <year>2018</year>
          , pp.
          <fpage>496</fpage>
          -
          <lpage>512</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>De Smedt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yeshchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polyvyanyy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De Weerdt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          ,
          <article-title>Process model forecasting and change exploration using time series analysis of event sequence data</article-title>
          ,
          <source>Data &amp; Knowledge Engineering</source>
          <volume>145</volume>
          (
          <year>2023</year>
          )
          <fpage>102145</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Object-centric process mining: Dealing with divergence and convergence in event data</article-title>
          ,
          <source>in: Software Engineering and Formal Methods: 17th International Conference, SEFM 2019</source>
          , Oslo, Norway, September 18-20,
          <year>2019</year>
          , Proceedings 17, Springer,
          <year>2019</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Augusto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Conforti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Maggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marrella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mecella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soo</surname>
          </string-name>
          ,
          <article-title>Automated discovery of process models from event logs: Review and benchmark</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>31</volume>
          (
          <year>2018</year>
          )
          <fpage>686</fpage>
          -
          <lpage>705</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Weijters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>De Medeiros</surname>
          </string-name>
          ,
          <article-title>Process mining with the heuristicsminer algorithm</article-title>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. K.</given-names>
            <surname>vanden Broucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De Weerdt</surname>
          </string-name>
          ,
          <article-title>Fodina: A robust and flexible heuristic process discovery technique</article-title>
          ,
          <source>Decision Support Systems</source>
          <volume>100</volume>
          (
          <year>2017</year>
          )
          <fpage>109</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Discovering block-structured process models from event logs-a constructive approach</article-title>
          ,
          <source>in: Application and Theory of Petri Nets and Concurrency: 34th International Conference, PETRI NETS</source>
          <year>2013</year>
          , Milan, Italy, June 24-28,
          <year>2013</year>
          . Proceedings 34, Springer,
          <year>2013</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>329</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Camargo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>González-Rojas</surname>
          </string-name>
          ,
          <article-title>Learning accurate lstm models of business processes</article-title>
          ,
          <source>in: Business Process Management: 17th International Conference, BPM 2019</source>
          , Vienna, Austria, September 1-6,
          <year>2019</year>
          , Proceedings 17, Springer,
          <year>2019</year>
          , pp.
          <fpage>286</fpage>
          -
          <lpage>302</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>