<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Intervening With Confidence: Conformal Prescriptive Monitoring of Business Processes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mahmoud Shoush</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marlon Dumas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tartu</institution>
          ,
          <addr-line>Narva mnt 18, 51009 Tartu</addr-line>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Prescriptive process monitoring methods seek to improve the performance of a process by selectively triggering interventions at runtime (e.g., offering a discount to a customer) to increase the probability of a desired case outcome (e.g., a customer making a purchase). The backbone of a prescriptive process monitoring method is an intervention policy, which determines for which cases and when an intervention should be executed. Existing methods rely on predictive models to define intervention policies; specifically, they consider policies that trigger an intervention when the probability of a negative outcome exceeds a threshold. However, the probabilities computed by a predictive model often come with low confidence, leading to unnecessary interventions and wasted effort, which is problematic when the resources available to execute interventions are limited. To tackle this shortcoming, this paper outlines an approach to extend existing prescriptive process monitoring methods with conformal predictions, i.e., predictions with confidence guarantees. A preliminary evaluation using real-life public datasets shows that conformal predictions enhance the net gain of prescriptive process monitoring methods under limited resources.</p>
      </abstract>
      <kwd-group>
        <kwd>Prescriptive Process Monitoring</kwd>
        <kwd>Conformal Prediction</kwd>
        <kwd>Causal Inference</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Existing prescriptive process monitoring (PrPM) methods trigger an intervention when the estimated probability of a negative outcome exceeds a threshold [
        <xref ref-type="bibr" rid="ref5 ref7">7, 5</xref>
        ]. A shortcoming of this approach
is that the probabilities computed by predictive models often come with low confidence. This
leads to unnecessary interventions and, thus, wasted effort. This wasted effort is particularly
problematic in settings where the resources available to execute interventions are limited, which
means that allocating a resource to intervene in a case (based on a low-confidence probability)
may result in this resource being unable to intervene in other cases.
      </p>
      <p>
        This paper addresses the above shortcoming by outlining an approach to integrate conformal
prediction methods [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] into a PrPM system. Conformal prediction methods allow us to
associate confidence guarantees with predictions, thus tackling the abovementioned shortcoming
regarding triggering unnecessary interventions. The paper reports on an empirical evaluation
to test the hypothesis that the use of conformal predictions leads to a higher net gain from
interventions in a resource-constrained PrPM system.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        PrPM techniques can be classified into three groups based on intervention policy and improving
business value [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The first group focuses on control flow for optimal action
recommendations [
        <xref ref-type="bibr" rid="ref10 ref11 ref4">10, 11, 4</xref>
        ]. The second group prioritizes resource allocation decisions [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ]. The third
group combines control flow and resources to mitigate undesired outcomes [
        <xref ref-type="bibr" rid="ref14 ref2 ref5 ref6">14, 6, 2, 5</xref>
        ]. This
paper falls into the third group.
      </p>
      <p>
        Studies in the third group use predictive models trained on historical process data (event logs)
to determine when and for which cases interventions should be triggered. Fahrenkrog-Petersen et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
propose a PrPM approach based on predictions from an outcome-oriented model [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. These
methods trigger interventions if the probability of an undesired outcome exceeds a threshold.
The threshold is determined through empirical thresholding, which explores multiple thresholds
over a subset of the event log to maximize a reward function. However, these techniques
overlook the inherent uncertainty in prediction models.
      </p>
      <p>
        Metzger et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] propose using reliability estimates, prediction scores, and other features
in an online reinforcement learning (RL) method. However, their reliability estimates lack confidence guarantees, and
their black-box policy learned through neural networks lacks explainability. Additionally, their
approach involves online RL, while we focus on offline policy discovery based on past data.
      </p>
      <p>
        In previous work [
        <xref ref-type="bibr" rid="ref14 ref6">14, 6</xref>
        ], we presented a PrPM technique that considers the tradeoff between
triggering an intervention now versus later when resources are limited. This technique relies on
estimates including the intervention effect (or conditional average treatment effect, i.e., CATE),
total uncertainty (determined as the entropy of the average prediction from an ensemble of
machine learning (ML) classifiers), and the probability of undesired outcomes. However, these
uncertainty estimates do not come with confidence guarantees. This latter approach is used as
a baseline in the empirical evaluation reported later in this paper.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Approach</title>
      <p>In line with existing ML approaches, the method consists of three phases, as illustrated in Fig. 1:
Training, Calibration, and Testing, which we discuss below in turn.</p>
      <sec id="sec-3-1">
        <title>3.1. Training phase</title>
        <p>
          During training, the event log is used to train predictive and causal models after data cleaning
and enrichment. The predictive model estimates the probability of an ongoing case resulting in
an undesired outcome (P(undesired)). In contrast, the causal model measures the effect of triggering
an intervention on the probability of a positive outcome (CATE). The training process is
described in our previous work [
          <xref ref-type="bibr" rid="ref14 ref6">14, 6</xref>
          ] and is summarized below.
        </p>
        <p>[Figure 1: the three phases of the method. Labels: Training phase (event log, enriching, predictive model, causal model); Calibration phase (conformal prediction); Testing phase (process events, prefix collator, event prefixes; complete and incomplete traces).]</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Event Log Enrichment</title>
          <p>
            This step includes data preparation, prefix extraction, enrichment, and encoding. In data
preparation, we clean the event log by removing incomplete traces and outliers (e.g., events with
abnormal timestamp values). We extract prefixes of a given length from each case to simulate real-life
scenarios. The prefixes are enriched with attributes related to temporal context and inter-case
information. Finally, the prefixes are encoded into a fixed-size feature vector using an aggregate
encoding method [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ] for training machine learning algorithms. The output is a preprocessed
dataset containing tuples (X, T, Y), each consisting of a feature vector X (original and
enriched features), an intervention T that can positively impact the outcome, and the outcome Y. The dataset
is then divided into three folds, D<sub>train</sub>, D<sub>cal</sub>, and D<sub>test</sub>, with n = n<sub>train</sub> + n<sub>cal</sub> + n<sub>test</sub> samples, which
are used in the training, calibration, and testing phases, respectively.
          </p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Predictive Model</title>
          <p>The predictive model aims to estimate the probability of an ongoing case ending in an undesired
outcome based on its corresponding prefix. To train the predictive model, a gradient-boosted
tree algorithm is applied to the training fold D<sub>train</sub>. The objective is to minimize a loss function
ℒ(y, ŷ), where y represents the actual outcome and ŷ represents the predicted outcome.
The result is a predictive model (f̂) that generates a prediction score (probability) for both the
undesired outcome, P(undesired), and the desired outcome, P(desired).</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>3.1.3. Causal Model</title>
          <p>The causal model determines the impact of the intervention, i.e., the CATE. It represents the
increase, in absolute terms, in the probability of achieving the desired outcome when the intervention
is applied. For example, in a lead-to-order process with an initial sales probability of 0.4, a
CATE of 0.3 indicates that the intervention would raise the sales probability to 0.7. To estimate
the CATE, a causal model is trained to predict the probabilities of undesired outcomes with
and without the intervention (T = 1 and T = 0). The difference between these probabilities,
given the current case state characterized by X, provides the CATE.</p>
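          <p>The lead-to-order arithmetic can be checked with a minimal sketch. The function name and the two undesired-outcome probabilities (0.6 without and 0.3 with the intervention) are illustrative assumptions chosen to match the example; they are not outputs of the causal model used in this paper.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def cate_from_probabilities(p_undesired_without: float, p_undesired_with: float) -> float:
    """CATE as the drop in the undesired-outcome probability caused by intervening."""
    return p_undesired_without - p_undesired_with


# Assumed values: a sales probability of 0.4 without intervention means
# P(undesired | T = 0) = 0.6; we assume P(undesired | T = 1) = 0.3.
p_desired_without = 0.4
cate = cate_from_probabilities(p_undesired_without=0.6, p_undesired_with=0.3)
p_desired_with = p_desired_without + cate

print(round(cate, 2), round(p_desired_with, 2))
```
]]></preformat>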
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Calibration phase</title>
        <p>
          In this phase (Fig. 2), we use an Inductive Conformal Prediction (ICP) algorithm [
          <xref ref-type="bibr" rid="ref16 ref17 ref8">16, 8, 17</xref>
          ]. ICP
methods can be applied as a post-processing step to any predictive model, such as random
forests or gradient-boosting, to provide predictions with confidence guarantees.
        </p>
        <p>The ICP method uses a user-defined significance level (α) and a predictive model (f̂) to create
a prediction set C(X) that contains the actual outcome with a confidence level of 1 − α. For
example, if the user desires a confidence level of 90%, then they would set α to 0.1. This α value
is not a hyperparameter to be tuned for total gain; rather, it reflects the user’s preferred level of
conservatism, indicating their willingness to act on less certain predictions.</p>
        <p>Reducing the significance level increases confidence but also enlarges the prediction set, which in the limit
encompasses all possible outcomes. In our context, we aim for prediction sets that contain
only the undesired outcome, so that we are highly certain that a case will end undesirably
before allocating costly resources. Accordingly, we adopt a conservative approach for risk-averse
users, triggering interventions only when we are highly confident of the undesired outcome, i.e.,
when the prediction set consists solely of the undesired outcome. In contrast, risk-prone
users may opt for intervention even when positive outcomes are possible without it.</p>
        <p>[Figure 2: the two steps of an ICP method. Labels: predictive modeling method, prediction scores, non-conformity scoring method.]</p>
        <p>An ICP method consists of two steps, as shown in Fig. 2. In the first step, non-conformity
scores (s) and a non-conformity quantile (q̂) are calculated. The predictive model assigns
outcome probabilities (prediction scores) to the calibration data, and a non-conformity scoring
method generates a non-conformity score s<sub>i</sub> for each of the n<sub>cal</sub> calibration samples. Higher non-conformity
scores indicate greater uncertainty. The non-conformity quantile q̂ is determined based on the
significance level (α) as per Eq. 1.</p>
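        <p><disp-formula><label>(1)</label><tex-math>\hat{q} = \mathrm{Quantile}\left( \{ s_i \}_{i=1}^{n_{cal}},\ \frac{\lceil (n_{cal}+1)(1-\alpha) \rceil}{n_{cal}} \right)</tex-math></disp-formula></p>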
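        <p>As a minimal sketch of the quantile computation in Eq. 1, the empirical quantile at level ⌈(n + 1)(1 − α)⌉ / n can be read off as the ⌈(n + 1)(1 − α)⌉-th smallest calibration score. The function name is hypothetical, and clamping the rank to n for very small calibration sets is a simplifying assumption of this sketch, not something the paper specifies.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    if not scores:
        raise ValueError("at least one calibration score is required")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    # Assumption: when the rank exceeds n (few samples, small alpha),
    # fall back to the largest observed score.
    rank = min(rank, n)
    return sorted(scores)[rank - 1]


calibration_scores = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]
q_hat = conformal_quantile(calibration_scores, alpha=0.25)
print(q_hat)
```
]]></preformat>
        <p>With nine scores and α = 0.25, the rank is ⌈10 × 0.75⌉ = 8, so the sketch returns the eighth smallest score.</p>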
        <p>
          In the second step, the value of q̂ determines the outcomes included in the prediction set.
Using the marginal coverage guarantee property [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], ICP generates a prediction set with
1 − α confidence. This property (Eq. 2) ensures that the actual outcome Y will be included in the
prediction set C(X) with 1 − α confidence. For example, if the desired outcome is the actual
outcome and α = 0.2, Eq. 2 guarantees that it will be included in C(X) with at least 80% confidence.
Lower α values indicate higher confidence but also result in larger prediction sets, up to the point
where the set accommodates all possible outcomes. In outcome-oriented PrPM tasks, the main goal is to
confidently identify C(X) = {undesired}, indicating a higher certainty of an undesirable outcome,
so that resources are allocated efficiently.
        </p>
        <p><disp-formula><label>(2)</label><tex-math>P(Y \in C(X)) \geq 1 - \alpha</tex-math></disp-formula></p>
        <p>ICP methods differ in how they calculate the non-conformity score and how they use the
non-conformity quantile q̂ to determine the prediction set. Below, we describe the specific ICP
methods we employ in our approach.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Naive method</title>
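          <p>Fundamentally, in the outcome-oriented task, the predictive model approximates P(Y = y | X = x)
for all y ∈ {desired, undesired}. For example, given a prefix x of a case, what is the
probability that the case ends in the undesired outcome? We then perform a naive calibration step by setting the
non-conformity score (s) to one minus the prediction score of the actual outcome, as shown
in Eq. 3, to obtain one score s<sub>i</sub> for each sample of the calibration fold D<sub>cal</sub>. We then calculate q̂ according to Eq. 1.</p>
          <p><disp-formula><label>(3)</label><tex-math>s_i = 1 - \hat{f}(X_i)_{y_i} \quad \forall\, (X_i, y_i) \in D_{cal}</tex-math></disp-formula></p>
          <p><disp-formula><label>(4)</label><tex-math>C(X) = \{ y : \hat{f}(X)_y \geq 1 - \hat{q} \}</tex-math></disp-formula></p>
          <p>Then, the prediction set is constructed based on Eq. 4, where the features X of the test prefix are known, but its outcome is
not. This means the prediction set will include only one outcome, desired or undesired, when
the prediction score for one outcome satisfies the condition in Eq. 4 and the score for the other outcome
does not. For example, when q̂ = 0.7 and the prediction scores of the two outcomes are 0.72 and 0.28,
only the outcome with score 0.72 satisfies the condition (0.72 ≥ 1 − 0.7 = 0.3), so the prediction set
contains that outcome alone. Otherwise, the level of certainty about the prediction is insufficient
to retain only one outcome, and the prediction set includes either both outcomes or none.</p>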
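          <p>The naive score and prediction set can be sketched in a few lines. The function names are hypothetical, prediction scores are passed as a plain dictionary, and assigning the score 0.72 to the undesired outcome is an assumption made only to give the worked example concrete labels.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def naive_score(probabilities, actual_outcome):
    """Non-conformity score: one minus the prediction score of the actual outcome."""
    return 1.0 - probabilities[actual_outcome]


def naive_prediction_set(probabilities, q_hat):
    """Keep every outcome whose prediction score is at least 1 - q_hat."""
    threshold = 1.0 - q_hat
    return {outcome for outcome, p in probabilities.items() if p >= threshold}


# Assumed labelling of the worked example: 0.72 for undesired, 0.28 for desired.
scores = {"undesired": 0.72, "desired": 0.28}
print(naive_prediction_set(scores, q_hat=0.7))  # one outcome retained
print(naive_prediction_set(scores, q_hat=0.9))  # both outcomes retained
print(naive_prediction_set(scores, q_hat=0.1))  # empty set
```
]]></preformat>
          <p>The three calls illustrate the three possible results for a binary outcome: a single outcome, both outcomes, or an empty set.</p>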
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Outcome-balanced method</title>
          <p>The principle of this scoring method is the same as that of the naive method; however, here we perform the
calibration step for each outcome separately to achieve outcome-balanced coverage, which matters especially
when the outcome of cases is imbalanced. It thus guarantees Eq. 5 instead of Eq. 2. Hence, we define
the non-conformity scores and the non-conformity quantile for each outcome, as shown in Eq. 6;
that is, we stratify by outcome.</p>
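          <p><disp-formula><label>(5)</label><tex-math>P(Y \in C(X) \mid Y = y) \geq 1 - \alpha, \quad \forall\, y \in \{\text{desired}, \text{undesired}\}</tex-math></disp-formula></p>
          <p><disp-formula><label>(6)</label><tex-math>\hat{q}^{(y)} = \mathrm{Quantile}\left( \{ s_i^{(y)} \}_{i=1}^{n^{(y)}},\ \frac{\lceil (n^{(y)}+1)(1-\alpha) \rceil}{n^{(y)}} \right)</tex-math></disp-formula></p>
          <p>Here, s<sub>i</sub><sup>(y)</sup> and n<sup>(y)</sup> denote the non-conformity scores and the number of calibration samples whose actual outcome is y.</p>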
          <p>According to the outcome-balanced scoring method, the prediction set is determined by Eq. 7,
where we iterate over the desired and undesired outcomes and retain or discard each outcome
according to its own quantile. For example, assume q̂<sup>(desired)</sup> = 0.7, q̂<sup>(undesired)</sup> = 0.4, P(undesired) = 0.3,
and P(desired) = 0.7. The prediction set examines each outcome with its prediction score
and its q̂. P(undesired) = 0.3 is not greater than 1 − 0.4 = 0.6; accordingly, the prediction set
discards the undesired outcome. Conversely, P(desired) = 0.7 is greater than 1 − 0.7 = 0.3;
thus, the prediction set retains the desired outcome, meaning C(X) = {desired} only.</p>
          <p><disp-formula><label>(7)</label><tex-math>C(X) = \{ y : \hat{f}(X)_y \geq 1 - \hat{q}^{(y)} \}</tex-math></disp-formula></p>
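          <p>A minimal sketch of the outcome-balanced variant follows: scores are grouped by actual outcome, one quantile is computed per group, and each outcome is tested against its own quantile. All function names and the tiny calibration set are hypothetical; the rank clamping in the quantile helper is an assumption of the sketch.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math


def conformal_quantile(scores, alpha):
    """ceil((n + 1) * (1 - alpha))-th smallest score, clamped to the largest (assumption)."""
    n = len(scores)
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)
    return sorted(scores)[rank - 1]


def outcome_balanced_quantiles(calibration, alpha):
    """calibration: list of (prediction scores dict, actual outcome) pairs."""
    scores_by_outcome = {}
    for probabilities, actual in calibration:
        scores_by_outcome.setdefault(actual, []).append(1.0 - probabilities[actual])
    return {o: conformal_quantile(s, alpha) for o, s in scores_by_outcome.items()}


def outcome_balanced_set(probabilities, q_hat_by_outcome):
    """Keep outcome y when its score is at least 1 minus the quantile of y."""
    return {o for o, p in probabilities.items() if p >= 1.0 - q_hat_by_outcome[o]}


# Hypothetical calibration data, stratified by actual outcome.
calibration = [
    ({"desired": 0.75, "undesired": 0.25}, "desired"),
    ({"desired": 0.5, "undesired": 0.5}, "desired"),
    ({"desired": 0.25, "undesired": 0.75}, "undesired"),
]
quantiles = outcome_balanced_quantiles(calibration, alpha=0.5)

# Worked example from the text, using its quantiles directly.
example_set = outcome_balanced_set(
    {"undesired": 0.3, "desired": 0.7},
    {"desired": 0.7, "undesired": 0.4},
)
print(quantiles, example_set)
```
]]></preformat>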
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Adaptive method</title>
          <p>Unlike the previous methods (naive and outcome-balanced), which consider only the prediction score for the actual
outcome, this scoring method considers all possible outcomes until the sum of their
prediction scores exceeds the 1 − α confidence. Eq. 8 shows how the non-conformity scores
are calculated, where π(X) is the permutation of all possible outcomes that orders f̂(X)
from the most likely outcome to the least likely. The next step is to compute q̂ as in Eq. 1, and the
prediction set is formed according to Eq. 9.</p>
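          <p><disp-formula><label>(8)</label><tex-math>s = \sum_{j=1}^{k} \hat{f}(X)_{\pi_j(X)}, \quad \text{where } y = \pi_k(X)</tex-math></disp-formula></p>
          <p><disp-formula><label>(9)</label><tex-math>C(X) = \{ y : s \geq \hat{q} \}</tex-math></disp-formula></p>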
          <p>Based on this scoring method, there is no empty prediction set because the prediction set
will retain only one outcome when the level of certainty about it is high. Otherwise, it will
retain both outcomes, ordered from the most to the least likely. Specifically, we add outcomes one by one to
the prediction set until the sum of their prediction scores exceeds q̂. For example, assume
q̂ = 0.8 and prediction scores of 0.45 and 0.55 for the two outcomes. We first sort the prediction scores from the
most likely to the least likely, i.e., 0.55 followed by 0.45. Then we add the most
likely outcome to the prediction set since its prediction score does not exceed q̂ = 0.8
(0.55 &lt; 0.8). Next, we add the score of the next outcome to the previous one; if the
sum does not exceed q̂ = 0.8, we include that outcome; otherwise, we stop without adding any further
outcomes to the prediction set. Since 0.45 + 0.55 is greater than q̂ = 0.8, the prediction set does
not include the second outcome; thus, C(X) contains only the most likely outcome.</p>
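          <p>The procedure walked through in the example can be sketched as follows. This follows the step-by-step description in the text rather than any library implementation; the function names are hypothetical, always keeping the most likely outcome is an assumption that reflects the statement that the set is never empty, and the outcome labels attached to 0.55 and 0.45 are assumed for illustration.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def adaptive_score(probabilities, actual_outcome):
    """Cumulative prediction score from the most likely outcome down to the actual one."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    cumulative = 0.0
    for outcome, p in ranked:
        cumulative += p
        if outcome == actual_outcome:
            return cumulative
    raise KeyError(actual_outcome)


def adaptive_prediction_set(probabilities, q_hat):
    """Add outcomes in decreasing score order while the running sum stays within q_hat."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    prediction_set = []
    cumulative = 0.0
    for outcome, p in ranked:
        cumulative += p
        # Assumption: the most likely outcome is always kept, so the set is never empty.
        if prediction_set and cumulative > q_hat:
            break
        prediction_set.append(outcome)
    return prediction_set


# Worked example from the text with assumed labels for the two scores.
example = adaptive_prediction_set({"undesired": 0.45, "desired": 0.55}, q_hat=0.8)
print(example)
```
]]></preformat>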
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Testing phase</title>
        <p>At runtime, the approach collates events for ongoing cases into case prefixes using a prefix
collator, resulting in a stream of trace prefixes. For each incoming trace prefix (X), estimates
of P(undesired), CATE, and C(X) are obtained. These estimates are then used to filter ongoing
cases, identify intervention candidates, and rank them based on a gain function that considers
the benefits of achieving desired outcomes and the costs of interventions.</p>
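        <p>To identify candidate cases for intervention, we check three conditions: (1) P(undesired) is above
a threshold, determined empirically, (2) C(X) = {undesired}, and (3) CATE &gt; 0. The candidate
case with the highest gain is chosen, where the gain is calculated as the benefit of avoiding an undesired outcome
(c<sub>undesired</sub>) multiplied by CATE, minus the intervention cost (c<sub>interv</sub>), see Eq. 10.</p>
        <p><disp-formula><label>(10)</label><tex-math>\mathit{gain} = \mathit{CATE} \times c_{\text{undesired}} - c_{\text{interv}}</tex-math></disp-formula></p>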
        <p>The parameters c<sub>undesired</sub> (the benefit of avoiding an undesired outcome) and c<sub>interv</sub> (the intervention cost) are user-defined and can vary between different processes.
Tab. 1 provides an example of costs and gains for six case prefixes in an unemployment benefits
process. The undesired outcome is when the customer lodges an appeal. Different decisions
are made depending on the cost of creating an appeal and the cost of giving a discount. For example, for
one case, giving a discount is preferred because its cost is lower than that of creating an appeal,
while for another case, accepting the appeal is preferred.</p>
        <p>Also, in Tab. 1, we have six cases with different P(undesired), CATE, and C(X). Two of them
are excluded due to their negative intervention effects and empty prediction
sets. With only one available resource for a phone call, we allocate it to the remaining case with the highest
gain.</p>
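        <p>The filtering and ranking logic of this section can be sketched as below. The class, function, and parameter names are hypothetical, the five cases are invented for illustration (they are not the values of Tab. 1), and the cost settings of 20 and 1 and the threshold of 0.5 are example values only.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class CaseEstimate:
    case_id: str
    p_undesired: float
    cate: float
    prediction_set: frozenset


def expected_gain(cate, c_undesired, c_intervention):
    """Gain of intervening: CATE times the benefit of avoiding the outcome, minus the cost."""
    return cate * c_undesired - c_intervention


def select_cases(cases, resources, c_undesired, c_intervention, threshold=0.5):
    """Filter by the three conditions, then give resources to the highest-gain cases."""
    candidates = [
        c for c in cases
        if c.p_undesired > threshold
        and c.prediction_set == frozenset({"undesired"})
        and c.cate > 0
    ]
    candidates.sort(
        key=lambda c: expected_gain(c.cate, c_undesired, c_intervention),
        reverse=True,
    )
    return [c.case_id for c in candidates[:resources]]


# Hypothetical ongoing cases (not the values of Tab. 1).
only_undesired = frozenset({"undesired"})
cases = [
    CaseEstimate("A", 0.9, 0.5, only_undesired),
    CaseEstimate("B", 0.8, 0.25, only_undesired),
    CaseEstimate("C", 0.7, -0.1, only_undesired),   # negative effect: excluded
    CaseEstimate("D", 0.95, 0.6, frozenset()),      # empty prediction set: excluded
    CaseEstimate("E", 0.4, 0.7, only_undesired),    # below threshold: excluded
]
print(select_cases(cases, resources=1, c_undesired=20, c_intervention=1))
```
]]></preformat>
        <p>With a single resource, the sketch allocates it to the one remaining candidate with the largest gain.</p>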
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <p>We report on an evaluation that addresses the following questions:
RQ1. What significance level (α) is appropriate for each non-conformity scoring method to
align with the preferences of a risk-averse user?
RQ2. To what extent does conformal prediction improve the total gain w.r.t. existing baselines?</p>
      <sec id="sec-4-1">
        <title>4.1. Datasets</title>
        <p>We experimented with two real-life event logs from the banking industry: BPIC2017 and
BPIC2012. These logs represent the loan origination process and provide clear definitions
for desired and undesired outcomes. They are large enough regarding the number of loan
applications and include interventions that can reduce the probability of undesired outcomes.
Table 2 provides an overview of the key characteristics of these logs.</p>
        <p>
          The logs contain diverse case and event attributes. We use them in our experiments in addition
to other extracted attributes, e.g., the number of sent offers, monthly loan interest, and temporal
features, to enrich the logs. Then, we define outcomes according to each case’s last activity
and determine the intervention according to the Creat_Offer activity for cases labeled with
undesired outcomes, as shown in Tab. 2. To avoid lengthy cases, we extract prefixes up to the
90th percentile. An aggregate encoding method is applied to capture maximum information from
the logs, as it outperformed other techniques in previous research [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The resulting
fixed-size feature vector serves as input for training the machine learning algorithms.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental Setup</title>
        <p>The experimental setup involves dividing the log into three categories: training (60%),
calibration (20%), and testing (20%). The training set is used for model training, the calibration set is
used to create the prediction set, and the testing set evaluates the intervention policy.</p>
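        <p>A 60/20/20 split can be sketched as follows. Splitting at the level of case identifiers, in the order given, is an assumption of this sketch (it keeps all prefixes of a case in the same fold); the function name and integer percentages are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def split_cases(case_ids, train_pct=60, cal_pct=20):
    """Split case identifiers into training, calibration, and testing folds."""
    n = len(case_ids)
    n_train = n * train_pct // 100
    n_cal = n * cal_pct // 100
    train = case_ids[:n_train]
    calibration = case_ids[n_train:n_train + n_cal]
    test = case_ids[n_train + n_cal:]  # remainder goes to the test fold
    return train, calibration, test


train, calibration, test = split_cases(list(range(10)))
print(len(train), len(calibration), len(test))
```
]]></preformat>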
        <p>
          We use CatBoost [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], a gradient-boosted decision tree (GBDT) algorithm, to train the predictive model for estimating the
probability of undesired outcomes (P(undesired)). For estimating CATE, we employ the Orthogonal
Random Forest (ORF) algorithm from EconML. Both methods have shown good accuracy in
predicting undesired outcomes and estimating intervention effects [
          <xref ref-type="bibr" rid="ref14 ref15">15, 14</xref>
          ].
        </p>
        <p>During runtime, ongoing cases are filtered to identify candidates based on P(undesired) &gt; 0.5,
CATE &gt; 0, and C(X) = {undesired}. These estimates help prioritize cases likely to have
undesired outcomes and be influenced by the intervention. We then set the cost of an undesired outcome to c<sub>undesired</sub> = 20,
which is high relative to the intervention cost c<sub>interv</sub> = 1, to estimate the expected gain from resource allocation.</p>
        <p>We evaluate the proposed approach using metrics such as AUC and F-score to assess
the ICP methods’ performance. These metrics are suitable for imbalanced data and provide an
unbiased evaluation. Additionally, we examine the number of cases whose prediction set C(X) contains
only the undesired outcome, targeting confident predictions of undesirable outcomes.</p>
        <p>We evaluate the intervention policy with limited resources based on the total gain and the
(accuracy/resource) ratio. The total gain represents the cumulative gains achieved per available
resource. In contrast, the accuracy per resource ratio indicates the proportion of correctly
allocated resources to undesired cases out of the total allocated cases.</p>
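        <p>The two policy-level measures can be sketched as below. Treating the total gain as the sum of per-case gains over the allocated cases is an assumption of this sketch, and the function names and sample values are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def total_gain(allocated_gains):
    """Cumulative gain over all cases that received a resource (assumed to be a plain sum)."""
    return sum(allocated_gains)


def accuracy_per_resource(actual_outcomes_of_allocated):
    """Share of allocated cases whose actual outcome was undesired."""
    if not actual_outcomes_of_allocated:
        return 0.0
    correct = sum(1 for o in actual_outcomes_of_allocated if o == "undesired")
    return correct / len(actual_outcomes_of_allocated)


# Hypothetical allocation of four resources.
gains = [9.0, 4.0, 2.5, 0.5]
outcomes = ["undesired", "desired", "undesired", "undesired"]
print(total_gain(gains), accuracy_per_resource(outcomes))
```
]]></preformat>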
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Results</title>
        <p>We analyze the impact of the user-defined significance level (α) on the prediction set (RQ1)
and examine the improvement in total gain with finite resources using the intervention policy
based on conformal prediction (RQ2).</p>
        <p>In Fig. 3, we present the impact of different non-conformity scoring methods on the retention
of an undesired outcome in the prediction set, addressing RQ1. In other words, for a risk-averse
user, we aim to find which significance level maximizes the number of cases in which the
prediction set contains only a negative outcome. Our findings indicate that for the naive and
outcome-balanced methods, the significance levels (α) that maximize the number
of cases whose prediction set retains only the undesired outcome are 0.4
for BPIC2012 and 0.2 for BPIC2017. Conversely, the adaptive method achieves the maximum
retention at α = 0.9 for both logs. This disparity can be attributed to the construction of the
prediction set in each method. The naive and outcome-balanced methods become less
conservative about including a specific outcome in the prediction set as q̂ approaches zero.
In contrast, the adaptive method exhibits the opposite behavior. Moreover, we observe that
these significance levels yield the highest F-score and AUC compared to other levels (detailed
in the supplementary material). As a result, for risk-averse users, an extreme α value must
be selected, and these levels are used to assess the enhancement of conformal methods in terms
of total gain and accuracy/resource.</p>
        <p>
          To investigate RQ2, we analyze different approaches for improving the intervention policy.
Firstly, we compare pure predictive methods targeting cases with P(undesired) &gt; 0.5 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], with and
without a threshold of total uncertainty &lt; 0.75 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. We then evaluate the performance of
predictive methods combined with inductive conformal prediction (ICP), specifically when
C(X) includes only undesired outcomes. This analysis is presented in Fig. 4, where the
gain from interventions using CATE is examined. Additionally, Fig. 5 compares the predictive
approach (P(undesired) &gt; 0.5 and total uncertainty &lt; 0.75) combined with CATE (when it is
above 0) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and conformal prediction.
        </p>
        <p>For the BPIC2012 log, the total gain (on the left-hand side) improves when we combine
any conformal method with the pure predictive approach in Fig. 4 and with CATE in Fig. 5. The improvement is most visible when
resources are minimal, where the conformal methods also achieve a remarkably higher accuracy/resource than non-conformal methods.
Also, the adaptive conformal method outperforms the other methods w.r.t. the total gain and performs similarly
to them w.r.t. accuracy/resource. This is because the q̂ determined by the adaptive method is
much higher than that of the naive and outcome-balanced methods; accordingly, the adaptive method is more conservative in
adding outcomes to the prediction set.</p>
        <p>Moreover, when resources are not restricted, which differs from the situation in practice,
we find that non-conformal methods achieve gains and accuracy per resource comparable to those of
conformal methods. However, the conformal methods are more conservative since they
constrain the allocation of resources.</p>
        <p>For the BPIC2017 log, the adaptive method significantly improves the total gain with high
accuracy in both limited and relaxed resource scenarios. In contrast, when resources are limited,
the naive and outcome-balanced methods achieve comparable gains to non-conformal methods.
Nevertheless, all conformal methods outperform non-conformal w.r.t accuracy per resource.</p>
        <p>
          In summary, the proposed PrPM approach demonstrates superior performance compared to
baselines regarding total gain and accuracy per resource, as shown in Fig. 4 and Fig. 5. Moreover,
our approach outperforms the previous work [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], as indicated in the supplementary material.
The use of conformal prediction to construct an intervention policy with limited resources
further enhances the performance of PrPM methods, benefiting business processes.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>We studied the hypothesis that the use of conformal predictions can enhance the effectiveness
of prescriptive process monitoring methods by preventing interventions from being triggered
unnecessarily when the level of confidence is insufficient.</p>
      <p>The empirical evaluation shows that intervention policies with conformal predictions
outperform classic non-conformal methods, particularly when the number of resources available for
performing the interventions is limited. The reported evaluation relied on two real-life event
logs from the same domain (banking). We acknowledge that further experiments with a larger
and more diverse array of datasets are required to achieve generalizability.</p>
      <p>The proposal assumes that only one type of intervention is available (e.g., giving a customer
discount). Also, it assumes that this intervention can only be triggered at most once in a case. In
practice, cases may be subject to multiple interventions of different types (e.g., giving a discount,
offering an upgrade, or a voucher for future purchases). Thus, a direction for future work is
to extend the current approach to a multi-intervention setting, for example, using multi-armed
bandit approaches. Another direction is to study the problem where the case outcome is not a
categorical variable (e.g., positive vs. negative) but a numerical variable (e.g., cost, time).</p>
      <p>In this paper, we’ve employed conformal prediction techniques to identify cases likely to
result in a negative outcome. However, an intriguing avenue for further exploration involves
applying conformal prediction to the CATE values themselves, thereby enhancing the overall
prediction and decision-making process. Furthermore, we could expand upon this work by
exploring the application of reinforcement learning, both with and without conformal methods,
as an alternative to the rule-based approach for learning intervention policies.</p>
      <p>Reproducibility. The source code required to reproduce the experiments can be found at:
https://github.com/mshoush/conformal-prescriptive-monitoring.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research is supported by the European Research Council (PIX Project).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Athey</surname>
          </string-name>
          ,
          <article-title>Beyond prediction: Using big data for policy problems</article-title>
          ,
          <source>Science</source>
          <volume>355</volume>
          (
          <year>2017</year>
          )
          <fpage>483</fpage>
          -
          <lpage>485</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Metzger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Palm</surname>
          </string-name>
          ,
          <article-title>Triggering proactive business process adaptations via online reinforcement learning</article-title>
          ,
          <source>in: BPM</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>273</fpage>
          -
          <lpage>290</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z. D.</given-names>
            <surname>Bozorgi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Teinemaa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <article-title>Prescriptive process monitoring for cost-aware cycle time reduction</article-title>
          , in: ICPM, IEEE,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>de Leoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Reulink</surname>
          </string-name>
          ,
          <article-title>Design and evaluation of a process-aware recommender system based on prescriptive analytics</article-title>
          , in: ICPM, IEEE,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Fahrenkrog-Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Teinemaa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Leoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Maggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weidlich</surname>
          </string-name>
          ,
          <article-title>Fire now, fire later: alarm-based systems for prescriptive process monitoring</article-title>
          ,
          <source>Knowl. Inf. Syst.</source>
          <volume>64</volume>
          (
          <year>2022</year>
          )
          <fpage>559</fpage>
          -
          <lpage>587</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shoush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <article-title>Prescriptive process monitoring under resource constraints: A causal inference approach</article-title>
          , in: ICPM Workshops, Lect. Notes Bus. Inf. Process., Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Teinemaa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Leoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Maggi</surname>
          </string-name>
          ,
          <article-title>Alarm-based prescriptive process monitoring</article-title>
          , in: BPM (Forum),
          <source>Lect. Notes Bus. Inf. Process.</source>
          , Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Shafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vovk</surname>
          </string-name>
          ,
          <article-title>A tutorial on conformal prediction</article-title>
          ,
          <source>J. Mach. Learn. Res.</source>
          <volume>9</volume>
          (
          <year>2008</year>
          )
          <fpage>371</fpage>
          -
          <lpage>421</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kubrak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Milani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nolte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <article-title>Prescriptive process monitoring: Quo vadis?</article-title>
          ,
          <source>PeerJ Comput. Sci.</source>
          <volume>8</volume>
          (
          <year>2022</year>
          )
          <elocation-id>e1097</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Weinzierl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dunzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zilker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matzner</surname>
          </string-name>
          ,
          <article-title>Prescriptive business process monitoring for recommending next best actions</article-title>
          ,
          <source>in: BPM (Forum)</source>
          , volume
          <volume>392</volume>
          of Lecture Notes in Business Information Processing, Springer,
          <year>2020</year>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sindhgatta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dechu</surname>
          </string-name>
          ,
          <article-title>Goal-oriented next best activity recommendation using reinforcement learning</article-title>
          ,
          <source>CoRR abs/2205.03219</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sindhgatta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Ghose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. K.</given-names>
            <surname>Dam</surname>
          </string-name>
          ,
          <article-title>Context-aware analysis of past process executions to aid resource allocation decisions</article-title>
          ,
          <source>in: CAiSE</source>
          , volume
          <volume>9694</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2016</year>
          , pp.
          <fpage>575</fpage>
          -
          <lpage>589</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>Prediction-based resource allocation using LSTM and minimum cost and maximum flow algorithm</article-title>
          , in: ICPM, IEEE,
          <year>2019</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shoush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <article-title>When to intervene? prescriptive process monitoring under uncertainty and resource constraints</article-title>
          ,
          <source>in: BPM (Forum)</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>I.</given-names>
            <surname>Teinemaa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Maggi</surname>
          </string-name>
          ,
          <article-title>Outcome-oriented predictive process monitoring: Review and benchmark</article-title>
          ,
          <source>ACM Trans. Knowl. Discov. Data</source>
          <volume>13</volume>
          (
          <year>2019</year>
          )
          <fpage>17:1</fpage>
          -
          <lpage>17:57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zeni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fontana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vantini</surname>
          </string-name>
          ,
          <article-title>Conformal prediction: a unified review of theory and new challenges</article-title>
          ,
          <source>CoRR abs/2005.07972</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Barber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Candès</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramdas</surname>
          </string-name>
          ,
          <article-title>Conformal prediction under covariate shift</article-title>
          , in: NeurIPS,
          <year>2019</year>
          , pp.
          <fpage>2526</fpage>
          -
          <lpage>2536</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Vovk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gammerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Saunders</surname>
          </string-name>
          ,
          <article-title>Machine-learning applications of algorithmic randomness</article-title>
          , in: ICML, Morgan Kaufmann,
          <year>1999</year>
          , pp.
          <fpage>444</fpage>
          -
          <lpage>453</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>L. O.</given-names>
            <surname>Prokhorenkova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gusev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vorobev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Dorogush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gulin</surname>
          </string-name>
          ,
          <article-title>CatBoost: unbiased boosting with categorical features</article-title>
          , in: NeurIPS,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>