<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Process and Resource-Aware Responsible Recommender Systems (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alessandro Padella</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Padua, Dipartimento di Matematica "Tullio Levi-Civita"</institution>
          ,
          <addr-line>Via Trieste, 63, 35131 Padova PD</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years, predictive and prescriptive analytics have become increasingly valuable for optimizing business processes by enabling data-driven decision-making. This thesis focuses on enhancing these systems in terms of fairness, accuracy, and transparency. First, a fairness-aware predictive framework is proposed that leverages adversarial learning to ensure that protected attributes, such as gender or ethnicity, do not influence prediction outcomes. Experimental results demonstrate a significant reduction in biased predictions and recommendations. The thesis also addresses the issue of limited or imbalanced event logs, which can affect the training of reliable recommendation models. A comparative evaluation of current event-log augmentation methods is conducted, followed by the introduction of a novel augmentation approach based on statistical sampling. This method is shown to outperform state-of-the-art techniques in generating synthetic event logs that closely resemble real-world data distributions. Furthermore, the thesis presents two resource allocation frameworks to improve the global efficiency of business processes. The first generates recommendations that are globally optimal while leaving resources some flexibility to decide locally which task to perform next. The second framework ensures a balanced workload distribution among process participants, addressing practical constraints in real-world resource management. Lastly, to support transparency and user trust, an explainable recommender system is developed that recommends which task to perform next and accompanies each recommendation with an interpretable justification. By incorporating Shapley values into the recommendation model, the framework can provide meaningful insights into the rationale behind specific suggestions across different process domains.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Mining</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Resource Allocation</kwd>
        <kwd>Explainability</kwd>
        <kwd>Data Augmentation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        This PhD thesis focuses on the design and enhancement of Process-Aware Recommender Systems
(PARs), a class of information systems that aims to monitor business processes, predict their outcomes,
and eventually recommend corrective actions to improve performance. These systems are increasingly
data-driven and rely on machine learning and deep learning techniques to optimize the so-called Key
Performance Indicators (KPIs), such as cost, execution time, and customer satisfaction [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. PAR systems
are composed of three main components: Process Monitoring, Process Predictive Analytics, and Process
Prescriptive Analytics. Process Monitoring enables real-time tracking of ongoing processes, Process
Predictive Analytics leverages historical and real-time data to forecast future process behaviour, and
Process Prescriptive Analytics recommends feasible and effective
interventions.
      </p>
      <p>
        While these systems offer significant potential for process optimization, their development raises
critical challenges related to the ethical, accurate, and transparent use of data. To address these
challenges, this thesis focuses on the predictive and prescriptive components of a PAR system, following the
Responsible Data Science concepts pointed out by van der Aalst et al. in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Applied to PARs, these concepts
translate into four challenging research questions, three of which have been identified as the subject
of study in the thesis:
• RQ1 – Fairness: How can process predictions and recommendations avoid unfair conclusions, even
when such conclusions are supported by historical data?
This question explores the integration of fairness-aware techniques into predictive models to
mitigate bias toward protected attributes (e.g., gender, ethnicity). The goal is to prevent discriminatory
outcomes in recommender systems by ensuring that predictions are not influenced by variables
that should be ethically irrelevant. Recommender systems may inherit biases from training data,
especially when demographic attributes are involved, leading to unfair outcomes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Studies
have shown that such biases can reinforce discrimination, prompting legal and ethical concerns
(e.g., Article 2 of GDPR [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]). In Process-Aware Recommender Systems, these biases can influence
corrective actions, disproportionately affecting certain groups. To prevent this, models must be
trained and evaluated using fairness-aware techniques. Ensuring unbiased predictions is essential
for ethical and responsible decision support.
• RQ2 – Accuracy: How can we ensure process predictions and recommendations are accurate and
reliable, especially when data is imbalanced or when resources must be shared across concurrent
process instances?
Predictive models in business processes often struggle with rare but critical events due to
imbalanced datasets, leading to inaccurate forecasts and suboptimal interventions. In scenarios like
loan processing, this can affect the prediction of exceptional activities, which are
often related to problems and inefficiencies. Additionally, in a recommender system, resources
are shared among multiple process instances, each typically performing one activity at a time.
If resources are assigned to each instance independently, without considering the others, overall
efficiency suffers. For example, if resource R1 is given to instance P1, it cannot assist instance P2.
However, assigning R2 to P1, even if it is slightly less effective, could be better if R1 is uniquely
suitable for P2. This highlights the importance of evaluating resource allocation globally, not in
isolation. Effective intervention decisions must consider all running instances to optimize overall
outcomes.
• RQ3 – Transparency: How can we provide transparent and understandable explanations for
process predictions and recommendations to facilitate trust and human oversight?
Process Prescriptive Analytics often neglects interaction with human decision-makers, limiting its
practical adoption. Despite high predictive accuracy, users may distrust recommendations without
clear explanations. Explainable AI (XAI) is crucial for bridging this gap, enhancing transparency
and trust. Studies show that understanding the reasoning behind predictions increases user
confidence. Therefore, integrating interpretability into PAR systems is essential for effective
human-AI collaboration.
      </p>
      <p>The fourth question, i.e., Confidentiality, although relevant, is beyond the scope of this work and
has been identified as future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Research Approach and Outcomes</title>
      <p>This PhD thesis outlines frameworks that extend the state of the art in Predictive and Prescriptive
Analytics along the three research questions identified above.</p>
      <p>RQ1 – Fairness: How can process predictions and recommendations avoid unfair conclusions, even
when such conclusions are supported by historical data?</p>
      <p>
        To address this research question, the thesis develops a framework that tackles unfair
predictions in process analytics caused by bias from variables that should not influence the
outcome and should therefore be treated as protected (e.g., gender, citizenship). It is important to
note that merely removing the protected variables is insufficient, as bias can shift to correlated features,
which then act as hidden proxies that are even harder to handle. The framework introduced in the thesis
uses adversarial debiasing on a fully connected neural network [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]: the idea behind adversarial debiasing
is to train a predictive model based on a neural network while simultaneously training an adversary
network that prevents it from learning from protected variables. This encourages the model to focus on legitimate, fair
patterns. In the thesis, the protected variables have been defined on a case-by-case basis, resulting in 4
case studies on both real and synthetic datasets with 6 different protected variables. The experiments
showed that the proposed model not only significantly reduces the influence of the protected variables,
but also maintains the same accuracy as the model in which the protected variables influence
the KPI prediction. Furthermore, the predictive model is also shown to reduce the influence of the
variables that are correlated with the protected ones, supporting the initial claim. This work also
resulted in the publication of a paper at the International Conference on Cooperative Information
Systems [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
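      <p>As a rough sketch of the adversarial idea described above, one common way to write such a training objective is shown below. The notation is an assumption introduced here for illustration and is not taken from the thesis: a predictor with parameters θ maps the features x of a running case to a KPI estimate, an adversary with parameters φ tries to recover the protected variable z from that estimate, y is the observed KPI value, and λ is a non-negative weight trading accuracy against fairness. The thesis may feed the adversary a different input or use a different loss.</p>
      <preformat><![CDATA[
```latex
\min_{\theta}\,\max_{\phi}\;
\mathcal{L}_{\mathrm{pred}}\bigl(f_{\theta}(x),\, y\bigr)
\;-\; \lambda\, \mathcal{L}_{\mathrm{adv}}\bigl(g_{\phi}(f_{\theta}(x)),\, z\bigr)
```
]]></preformat>
      <p>Read this way, the adversary lowers its own loss by guessing z well, while the predictor is rewarded both for a small prediction loss and for making the adversary fail, which is what discourages it from encoding the protected variable or its correlates.</p>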
      <p>RQ2 – Accuracy: How can we ensure process predictions and recommendations are accurate and
reliable, especially when data is imbalanced or when resources must be shared across concurrent process
instances?</p>
      <p>
        This research question has been addressed from two points of view: i) Data Augmentation and ii)
Resource Allocation. i) From the point of view of data augmentation, a framework for event-log augmentation that improves predictions
of rare process behaviours has been developed. The approach addresses the challenge of imbalanced
data by generating synthetic traces that reflect rare but important events. The thesis provides two
contributions: (1) an independent evaluation of existing augmentation methods using consistent datasets
and criteria from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and (2) a novel technique that outperforms existing ones in quality and
speed. The evaluation incorporates both established and new metrics. Finally, the
Train-on-Synthetic-Test-on-Real (TSTR) [8] metric is used to assess how well synthetic logs replicate real patterns by measuring
the performance of a model trained on synthetic data and tested on real data. The development of this framework
led to a paper that is currently submitted to the Business &amp; Information Systems
Engineering journal.
      </p>
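      <p>The train-on-synthetic, test-on-real idea can be illustrated with a minimal Python sketch. Everything in it is a simplifying assumption made for illustration: traces are plain lists of activity labels, the "model" is a most-frequent-successor predictor rather than the predictive models used in the thesis, and the function names and toy logs are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter, defaultdict


def train_next_activity(traces):
    """Learn, for each activity, its most frequent successor in the traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return {a: c.most_common(1)[0][0] for a, c in counts.items()}


def accuracy(model, traces):
    """Share of next-activity predictions that match the given traces."""
    hits = 0
    total = 0
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            hits += int(model.get(current) == nxt)
            total += 1
    return hits / total if total else 0.0


def tstr(synthetic_traces, real_test_traces):
    """Train on synthetic traces, then score the model on real traces."""
    return accuracy(train_next_activity(synthetic_traces), real_test_traces)


# Hypothetical toy logs, invented for this example.
real_test = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"]]
synthetic_close = [["A", "B", "C"], ["A", "B", "C"]]
synthetic_far = [["A", "C", "B"], ["A", "C", "B"]]

score_close = tstr(synthetic_close, real_test)
score_far = tstr(synthetic_far, real_test)
```
]]></preformat>
      <p>On these toy logs, the synthetic log that preserves the real control flow scores 5/6, while the one that reorders activities scores 0.0, so a higher score indicates synthetic data that better replicates real patterns.</p>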
      <p>ii) From the point of view of resource allocation, this thesis puts forward two different frameworks
for Process Prescriptive Analytics. The first framework focuses on optimizing
recommendations by jointly assigning both the next activity and the most appropriate resource for all
ongoing process instances. Unlike traditional methods that make isolated decisions for each instance,
this approach adopts a global optimisation perspective, aiming to maximize key performance indicators
(KPIs) across the entire system. It takes into account real-world constraints, such as the fact that a
resource can only work on one activity at a time, and it allows locally sub-optimal resource
assignments if doing so improves the overall system outcome, while leaving resources
free to choose the next task to work on. The development of this framework led
to the publication of a paper at the Business Process Management conference [9].</p>
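      <p>The difference between instance-by-instance and global allocation can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the KPI gains are invented numbers that mirror the R1/R2 and P1/P2 example from the introduction, and the exhaustive enumeration stands in for whatever optimisation technique the thesis actually uses. All function names are hypothetical.</p>
      <preformat><![CDATA[
```python
from itertools import permutations

# Hypothetical KPI gain of giving a resource to an instance (higher is better).
score = {
    "P1": {"R1": 10, "R2": 9},
    "P2": {"R1": 8, "R2": 2},
}


def greedy_allocation(score):
    """Serve instances one by one; each takes its best still-free resource."""
    free = set(next(iter(score.values())))
    plan = {}
    for instance, gains in score.items():
        best = max(free, key=lambda r: gains[r])
        plan[instance] = best
        free.remove(best)
    return plan


def global_allocation(score):
    """Enumerate all one-to-one assignments and keep the best total gain."""
    instances = list(score)
    resources = list(next(iter(score.values())))
    best_plan, best_total = None, float("-inf")
    for perm in permutations(resources, len(instances)):
        total = sum(score[i][r] for i, r in zip(instances, perm))
        if total > best_total:
            best_plan, best_total = dict(zip(instances, perm)), total
    return best_plan


def total_gain(score, plan):
    """Sum of the gains of all assignments in a plan."""
    return sum(score[i][r] for i, r in plan.items())


local_plan = greedy_allocation(score)
global_plan = global_allocation(score)
```
]]></preformat>
      <p>With these numbers, the instance-by-instance plan gives R1 to P1 and reaches a total gain of 12, whereas the global plan gives R2 to P1 and R1 to P2 for a total of 17. Enumeration only scales to a handful of instances; it is used here purely to keep the sketch self-contained.</p>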
      <p>The second framework integrates the concept of worker experience into the allocation process.
Rather than defining experience based on resumes or generic seniority metrics, it uses historical process
execution data, namely the time each resource took to execute each task, to create a more
data-driven measure of expertise. The goal here is twofold: not only to maintain high performance in
terms of global KPIs, but also to ensure a fair distribution of workload and promote the development
of less experienced workers by gradually assigning them more complex tasks. This contributes to
long-term organizational learning and balances short-term efficiency with human-centric growth. The
development of this framework led to the publication of a paper at the Business Process Management
conference [10].</p>
      <p>RQ3 – Transparency: How can we provide transparent and understandable explanations for process
predictions and recommendations to facilitate trust and human oversight?</p>
      <p>The thesis also proposes a framework for adding explanations to Process-Aware Recommender
Systems (PAR systems) to improve trust and understanding of recommendations. While existing systems
focus on predicting and recommending activities for processes at risk, they often lack explanations for
their suggestions. The proposed framework uses Shapley values from game theory [11], a technique that
is independent of the machine-learning model used and decomposes the influence of the input variables
to explain the rationale behind predictions and recommendations. Applied to a Process
Prescriptive Analytics system using gradient boosting with decision trees, the framework was evaluated
on real-life datasets, demonstrating its effectiveness in both improving recommendations and providing
clear explanations. The development of this framework led to the publication of a paper [12].</p>
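      <p>To show what decomposing the influence of input variables means, the following Python sketch computes exact Shapley values for a tiny, hypothetical value function. The feature names, the numbers, and the value function itself are assumptions made for illustration; the thesis applies Shapley values to a trained gradient-boosting model, for which approximation methods are normally used instead of this exhaustive enumeration.</p>
      <preformat><![CDATA[
```python
from itertools import combinations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values of a set function `value` over `features`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(known):
    """Hypothetical predicted delay (hours) when only `known` features are used."""
    v = 0.0
    if "amount" in known:
        v += 4.0
    if "resource" in known:
        v += 2.0
    if "amount" in known and "channel" in known:
        v += 2.0  # interaction effect shared between the two features
    return v


features = ["amount", "resource", "channel"]
phi = shapley_values(toy_value, features)
```
]]></preformat>
      <p>Here the contributions come out as 5 for amount, 2 for resource, and 1 for channel. They sum to 8, the difference between the prediction with all features and the prediction with none, which is the property that makes Shapley values usable as an additive explanation of a single prediction or recommendation.</p>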
    </sec>
    <sec id="sec-3">
      <title>3. Conclusions and Post-doctoral Research Directions</title>
      <p>Process-Aware Recommender Systems analyse historical process execution data to evaluate
performance against predefined objectives for ongoing process instances. Their primary goal is to provide
real-time insights into process performance and offer actionable recommendations to correct instances
likely to produce suboptimal outcomes, based on KPIs. This thesis addresses three key research
questions on these systems within the field of Responsible Data Science. Despite their performance benefits, these
techniques face challenges such as explainability issues, the need for large datasets, and potential bias
in training processes. These limitations have been noted across various machine learning frameworks
used in Process Predictive Analytics. Each framework developed in this thesis has been
experimentally validated across multiple case studies and event logs, considering various KPIs. All the code is
publicly available for free use. Nevertheless, several future research paths remain, particularly those
that involve integrating the individual approaches examined in this study. These will be pursued during
the post-doctoral period. One challenge is developing a system that addresses the Confidentiality
research question: How can process predictions and recommendations be made while maintaining
confidentiality? Blockchain techniques [13, 14] could be a potential solution by ensuring data integrity
without a central authority. Next, integrating fairness, synthetic event log generation, and explainability
into a unified framework could improve predictive accuracy, fairness, and interpretability. This could
prevent unfair influences from protected variables and correct imbalances in underrepresented activities.
Extending explainability further involves using large language models (LLMs) to develop a dynamic
explanation engine that provides process actors with insights into factors contributing to suboptimal
outcomes and recommendations for corrective actions. This research path has already been taken up
in my post-doctoral period, leading to a publication in the Business Process Management Forum
2025. Additionally, future research could explore advanced graph-neural-network-based approaches to
dynamically adapt recommendations over time based on resource performance. This would create an
adaptive recommender system that continuously improves process and resource efficiency. Finally, A/B
testing and surveys could be used to assess the quality of recommendations provided by the system.</p>
    </sec>
    <sec id="sec-4">
      <title>Declaration on Generative AI</title>
      <p>The authors declare that Generative AI tools were used solely to assist with language refinement and
not for the generation of scientific content or analysis.</p>
      <p>[8] P. Conen, et al., Train on synthetic - test on real: Domain adaptation for strain-based damage
detection on an aircraft wing, in: ICAS 2024, International Council of the Aeronautical Sciences, liely
conference proceedings, 2024. URL: https://elib.dlr.de/204979/2/ICAS2024_Paper_PhilippConen.pdf,
available in DLR eLibrary.</p>
      <p>[9] A. Padella, M. de Leoni, Resource allocation in recommender systems for global kpi improvement,
in: C. Di Francescomarino, A. Burattin, C. Janiesch, S. Sadiq (Eds.), Business Process Management
Forum, Springer Nature Switzerland, Cham, 2023, pp. 249–266.</p>
      <p>[10] A. Padella, F. Mannhardt, F. Vinci, M. de Leoni, I. Vanderfeesten, Experience-based resource
allocation for remaining time optimization, in: A. Marrella, M. Resinas, M. Jans, M. Rosemann
(Eds.), Business Process Management - 22nd International Conference, BPM 2024, Krakow, Poland,
September 1-6, 2024, Proceedings, volume 14940 of Lecture Notes in Computer Science, Springer,
2024, pp. 345–362.</p>
      <p>[11] S. M. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, in: Advances in
neural information processing systems, 2017, pp. 4765–4774.</p>
      <p>[12] A. Padella, M. de Leoni, O. Dogan, R. Galanti, Explainable process prescriptive analytics, in:
2022 4th International Conference on Process Mining (ICPM), 2022, pp. 16–23. doi:10.1109/ICPM57379.2022.9980535.</p>
      <p>[13] I. Weber, X. Xu, R. Riveret, G. Governatori, A. Ponomarev, J. Mendling, Untrusted business process
monitoring and execution using blockchain, in: M. La Rosa, P. Loos, O. Pastor (Eds.), Business
Process Management, Springer International Publishing, Cham, 2016, pp. 329–347.</p>
      <p>[14] C. Cabanillas, C. Di Ciccio, J. Mendling, A. Baumgrass, Predictive task monitoring for business
processes, in: S. Sadiq, P. Soffer, H. Völzer (Eds.), Business Process Management, Springer
International Publishing, Cham, 2014, pp. 424–432.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ceravolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Comuzzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De Weerdt</surname>
          </string-name>
          , et al.,
          <article-title>Predictive process monitoring: Concepts, challenges, and future research directions</article-title>
          ,
          <source>Process Science</source>
          <volume>1</volume>
          (
          <year>2024</year>
          )
          2. URL: https://doi.org/10.1007/s44311-024-00002-4. doi:10.1007/s44311-024-00002-4.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bichler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Heinzl</surname>
          </string-name>
          ,
          <article-title>Responsible data science</article-title>
          ,
          <source>Business &amp; Information Systems Engineering</source>
          <volume>59</volume>
          (
          <year>2017</year>
          )
          <fpage>311</fpage>
          -
          <lpage>313</lpage>
          . doi:10.1007/s12599-017-0487-z.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Mannhardt</surname>
          </string-name>
          , Responsible Process Mining, Springer, Cham,
          <year>2022</year>
          , pp.
          <fpage>373</fpage>
          -
          <lpage>401</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <collab>European Union</collab>
          ,
          <article-title>General Data Protection Regulation (GDPR)</article-title>
          ,
          <year>2018</year>
          . URL: https://gdpr-info.eu/, accessed: 2024-11-26.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jarrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>van der Schaar</surname>
          </string-name>
          ,
          <article-title>Time-series generative adversarial networks</article-title>
          ,
          <publisher-name>Curran Associates Inc.</publisher-name>
          ,
          <publisher-loc>Red Hook, NY, USA</publisher-loc>
          ,
          <year>2019</year>
          , p.
          <fpage>11</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
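          <string-name>
            <given-names>M.</given-names>
            <surname>de Leoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Padella</surname>
          </string-name>
          ,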
          <article-title>Achieving Fairness in Predictive Process Analytics via Adversarial Learning</article-title>
          , volume
          <volume>15506</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2024</year>
          , pp.
          <fpage>346</fpage>
          -
          <lpage>354</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chapela-Campa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Benchekroun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Baron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Krass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Senderovich</surname>
          </string-name>
          ,
          <article-title>Can I trust my simulation model? Measuring the quality of business process simulation models</article-title>
          ,
          <source>in: 21st International Conference on Business Process Management, BPM</source>
          <year>2023</year>
          , Proceedings,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>