<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CRISP-NAM: Competing Risks Interpretable Survival Prediction with Neural Additive Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dhanesh Ramachandram</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ananya Raval</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vector Institute</institution>
          ,
          <addr-line>108 College St W1140, Toronto, ON M5G 0C6</addr-line>
          ,
          <country country="CA">CANADA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Competing risks are crucial considerations in survival modelling, particularly in healthcare domains where patients may experience multiple distinct event types. We propose CRISP-NAM (Competing Risks Interpretable Survival Prediction with Neural Additive Models), an interpretable neural additive model for competing risks survival analysis which extends the neural additive architecture to model cause-specific hazards while preserving feature-level interpretability. Each feature contributes independently to risk estimation through dedicated neural networks, allowing for visualization of complex non-linear relationships between covariates and each competing risk. CRISP-NAM demonstrates competitive performance on multiple datasets compared to existing approaches.</p>
      </abstract>
      <kwd-group>
        <kwd>Survival Analysis</kwd>
        <kwd>Interpretable Models</kwd>
        <kwd>Neural Additive Models</kwd>
        <kwd>Competing Risks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>• The proposed model jointly learns all cause-specific hazards in a single model, rather than treating
competing events as censoring or requiring separate models.
• Our model generates shape functions and feature importance rankings for each
competing risk, allowing practitioners to understand how different covariates influence
each outcome.
• We incorporate risk-frequency weightings to address class imbalance in competing events,
a common challenge in real-world medical datasets.</p>
      <p>The next section reviews the relevant background and related work that motivates our proposed
approach.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Cox Proportional Hazards Model</title>
        <p>
          Historically, the Cox Proportional Hazards (Cox PH) model [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] has been a popular choice for survival
analysis. It is a semi-parametric, linear model that relates covariates to the hazard function, which
characterizes the instantaneous risk of an event occurring at time $t$, given survival up to that time. The
Cox PH model assumes that covariates have a multiplicative effect on the hazard and that their effects
are constant over time (proportional hazards assumption). Mathematically, the hazard function under
this model is expressed as:
$$h(t \mid \mathbf{x}) = h_0(t) \exp(\boldsymbol{\beta}^\top \mathbf{x}) \tag{1}$$
where $h_0(t)$ is the baseline hazard function, $\mathbf{x}$ is the vector of covariates, and $\boldsymbol{\beta}$ is the vector of
regression coefficients.
        </p>
        <p>
          Despite its widespread use and interpretability, the Cox PH model has several limitations. For example,
nonlinear relationships must be manually specified (e.g., using splines), or alternatively introduced
using nonlinear kernels [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], which can be challenging in high-dimensional settings. Additionally, this
model assumes that the effect of each covariate on the hazard is constant over time; violations of this
assumption can lead to biased estimates. Finally, in scenarios with many features or complex interactions,
prior feature engineering or dimensionality reduction may be necessary to avoid convergence issues or
unstable estimates.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Deep Learning for Survival Analysis</title>
        <p>To address the limitations of traditional statistical models, recent years have seen a shift towards
machine learning-based survival models, including neural networks, which can capture nonlinear
effects, interactions, and high-dimensional structure in the data.</p>
        <p>
          DeepSurv [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] is one of the earliest deep learning models designed for survival analysis. It consists of
a deep feedforward network with a single output node with a linear activation which estimates the
log-risk function in the Cox PH model. Kvamme et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] introduced a joint time–covariate network
$h(t, \mathbf{x})$, breaking the proportional hazards assumption by modelling the effect of $\mathbf{x}$ as varying with time.
This is conceptually closer to dynamic hazard models or time-dependent Cox PH models. With the
aim of increasing trust and adoption, aligning with medical knowledge, and supporting regulatory
requirements, researchers have proposed several interpretable survival models in the literature. Kovalev
et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] proposed SurvLIME, which incorporates the Local Interpretable Model-agnostic Explanation
(LIME) framework [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to approximate the survival model in the local neighbourhood of a test instance
in feature space.
        </p>
        <p>
          Neural Additive Models (NAMs) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], a neural-network extension of Generalized Additive Models
(GAMs) have been used in machine learning based survival models, examples of which are SurvNAM [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]
and CoxNAM [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. While both SurvNAM and CoxNAM employ NAMs [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to enhance interpretability
in survival analysis, they differ fundamentally in purpose and integration. CoxNAM is a fully trainable
survival model that embeds NAMs directly within the Cox proportional hazards framework, enabling
inherently interpretable, end-to-end learning of nonlinear feature effects from survival data. In contrast,
SurvNAM is a post-hoc explanation method that approximates the predictions of a pre-trained black-box
survival model such as a Random Survival Forest [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] by fitting a GAM-extended Cox PH model using
NAMs as surrogate learners.
        </p>
        <p>Notably, none of the survival models discussed thus far are capable of modelling competing risks,
which will be covered next.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Competing Risks in Survival Analysis</title>
        <p>While conventional survival models primarily address single outcomes, many real-world scenarios
involve competing risks that fundamentally alter the probability distribution of the primary event.
For instance, if a patient dies, the possibility of experiencing a subsequent heart-related complication
is removed, illustrating a typical competing-events situation. Two methodological frameworks have
emerged for analyzing competing risks:</p>
        <p>
          The Cause-Specific Hazard approach [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] models competing events separately, treating each outcome
as a distinct hazard function and censoring subjects who experience competing events from the risk
set, without requiring actual independence between event types. For each cause $k$, the cause-specific
hazard $\lambda_k(t \mid \mathbf{x})$ represents the instantaneous rate of occurrence of event type $k$ at time $t$ for subjects
who have not experienced any event prior to time $t$:
$$\lambda_k(t \mid \mathbf{x}) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t,\ E = k \mid T \ge t,\ \mathbf{x})}{\Delta t} \tag{2}$$
where $\mathbf{x}$ represents the covariates, $T$ represents the event time, and $E \in \{1, 2, \ldots, K\}$ denotes the
event type.
        </p>
        <p>
          In contrast, the Fine-Gray sub-distribution model [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] directly accounts for competing events by
maintaining subjects who experience competing risks within the risk set. Given the risk set for
competing event $k$:
        </p>
        <p>$$R_k(t) = \{\, i : T_i \ge t \ \text{or}\ (T_i < t \ \text{and}\ E_i \ne k) \,\} \tag{3}$$
This risk set includes subjects $i$ who have either not experienced any event by time $t$ or have experienced
a competing event (not event $k$) before time $t$.</p>
        <p>The sub-distribution hazard is then defined as:
$$\tilde{\lambda}_k(t \mid \mathbf{x}) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t,\ E = k \mid i \in R_k(t),\ \mathbf{x})}{\Delta t} \tag{4}$$</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Deep Survival Models for Competing Risks</title>
        <p>Deep Survival Models leverage deep learning techniques to address competing risks in survival analysis,
offering the ability to model complex non-linear patterns in risk prediction. DeepHit [16] is a joint
model for survival analysis with competing risks. It uses a shared representation network followed
by cause-specific sub-networks to model the joint distribution of the event time and event type. The
model is trained using a combination of the negative log-likelihood and a ranking loss to encourage
concordance between predicted risks and observed outcomes. Neural Fine Gray [17] extends the
Fine-Gray sub-distribution model using neural networks to capture non-linear relationships between
covariates and sub-distribution hazards. It allows for flexible modelling of competing risks while
maintaining the ability to directly estimate cumulative incidence functions. Despite these advances, a
key limitation of existing deep survival approaches for competing risks is their lack of interpretability,
especially at the feature level, making it difficult to understand how individual features contribute to risk
predictions for different competing events. In Fig. 1, we depict the current gap in the literature of deep
survival models. Specifically, we are interested in three criteria: Interpretability, Non-Linear Modelling,
and Competing Risks Capability. Models such as SurvNAM and CoxNAM are interpretable; however,
they have not been reported to be used in competing risks settings. In contrast, the Neural Fine-Gray
and DeepHit architectures can model competing risks, but are “black-box” models and can only be
explained using post-hoc explainability methods such as SHAP and Partial Dependence Plots. Post-hoc
methods are known to be imprecise and can generate misleading explanations [18, 19]. The original
formulations of the Fine-Gray and Cox PH models are incapable of modelling non-linear relationships.
Our proposed CRISP-NAM model addresses these gaps, fulfilling all three criteria while
providing competitive performance.</p>
        <p>To this end, an extension of the Neural Additive Model to competing risks settings is proposed in this
paper, resulting in an inherently interpretable survival model. Our approach retains the interpretability
and feature-wise transparency of NAMs while allowing flexible, non-linear modelling of cause-specific
or sub-distribution hazards in competing risks scenarios.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Model Architecture</title>
      <p>CRISP-NAM extends Neural Additive Models to the competing risks survival analysis setting while
preserving feature-level interpretability. The architecture consists of three primary components:
per-feature FeatureNets, risk-specific projections, and additive risk aggregation.</p>
      <p>[Figure: CRISP-NAM model architecture. Each feature $x_j$ is processed by its own FeatureNet $f_j$ to produce a hidden representation $\mathbf{h}_j$, which is then passed through risk-specific projections $\eta_{j,k}$.]</p>
      <sec id="sec-3-1">
        <title>3.1. Neural Additive Model (FeatureNet)</title>
        <p>In line with the Neural Additive Model framework, each input feature $x_j$ is processed by its own
dedicated neural network $f_j(\cdot)$, referred to here as a FeatureNet. These feature-specific sub-networks
are designed to learn the non-linear contribution of each individual feature to the overall risk score,
while preserving interpretability by isolating feature effects:
$$\mathbf{h}_j = f_j(x_j) \in \mathbb{R}^{d}$$
where $d$ is the dimension of the hidden representation. Each FeatureNet is a fully-connected feedforward
neural network with $L$ layers, taking the scalar input $x_j$ and producing a hidden representation $\mathbf{h}_j \in \mathbb{R}^{d}$.
The activations are computed recursively using the hyperbolic tangent function:
$$\mathbf{z}^{(l)} = \tanh\!\big(\mathbf{W}^{(l)} \mathbf{z}^{(l-1)} + \mathbf{b}^{(l)}\big), \quad l = 1, \ldots, L,$$
where $\mathbf{z}^{(0)} = x_j$, and $\mathbf{h}_j = \mathbf{z}^{(L)}$ denotes the output of the final layer. Each layer $l$ has weights
$\mathbf{W}^{(l)} \in \mathbb{R}^{d_l \times d_{l-1}}$ and biases $\mathbf{b}^{(l)} \in \mathbb{R}^{d_l}$, with $d_0 = 1$ and $d_L = d$.</p>
        <p>In this implementation, Dropout is applied with rate $p_{\text{dropout}}$ after each hidden layer, Feature Dropout
with rate $p_{\text{feature}}$ is applied during training to increase robustness, and an optional batch normalization layer after
each linear transformation stabilizes learning, especially for deeper FeatureNets.</p>
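        <p>As a concrete illustration, a per-feature subnetwork of this kind can be sketched in PyTorch (the framework used for the implementation); the layer sizes, dropout rate, and batch-normalization flag below are illustrative defaults, not the authors' exact configuration:</p>

```python
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Per-feature subnetwork: maps a scalar x_j to a hidden vector h_j."""

    def __init__(self, hidden_dims=(32, 32), out_dim=16,
                 p_dropout=0.1, use_batchnorm=False):
        super().__init__()
        layers, in_dim = [], 1  # d_0 = 1: each FeatureNet sees one feature
        for d_l in (*hidden_dims, out_dim):
            layers.append(nn.Linear(in_dim, d_l))
            if use_batchnorm:
                layers.append(nn.BatchNorm1d(d_l))
            layers.append(nn.Tanh())        # tanh activations, as above
            layers.append(nn.Dropout(p_dropout))
            in_dim = d_l
        self.net = nn.Sequential(*layers)

    def forward(self, x_j):           # x_j: (batch, 1)
        return self.net(x_j)          # h_j: (batch, out_dim)
```

In the full model, one such module is instantiated per input feature, so feature effects remain isolated.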
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Risk-Specific Projections</title>
        <p>For each feature $j$ and competing risk $k$, a separate linear projection transforms the feature representation
into its contribution to the log-hazard ratio. To address scale ambiguities across different competing risks
and ensure fair comparison of feature contributions, we constrain the projection vectors to have unit
L2 norm.</p>
        <p>The risk-specific projection is defined as:
$$\eta_{j,k}(\mathbf{h}_j) = \tilde{\mathbf{w}}_{j,k}^\top \mathbf{h}_j$$
where $\tilde{\mathbf{w}}_{j,k}$ is the L2-normalized projection vector:
$$\tilde{\mathbf{w}}_{j,k} = \frac{\mathbf{w}_{j,k}}{\|\mathbf{w}_{j,k}\|_2 + \epsilon}$$
with $\mathbf{w}_{j,k} \in \mathbb{R}^{d}$ being the learnable weight vector for feature $j \in \{1, 2, \ldots, p\}$ and risk $k \in
\{1, 2, \ldots, K\}$, and $\mathbf{h}_j \in \mathbb{R}^{d}$ being the feature representation.</p>
        <p>This normalization constraint ensures that $\|\tilde{\mathbf{w}}_{j,k}\|_2 = 1$ for all feature-risk pairs, which constrains
all projection vectors to operate on the same scale, enabling direct comparison of feature importance
across different competing risks.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Additive Risk Aggregation</title>
        <p>The cause-specific log-hazard ratio for risk $k$ given input features $\mathbf{x} = [x_1, x_2, \ldots, x_p]$ is computed as
the sum of individual feature contributions:
$$g_k(\mathbf{x}) = \sum_{j=1}^{p} \eta_{j,k}\big(f_j(x_j)\big)$$
This preserves the additive nature of the model while allowing for complex non-linear feature effects.</p>
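        <p>The normalized risk-specific projections and this additive aggregation can be sketched together in PyTorch (a sketch under the definitions above; the tensor shapes, parameter initialization, and $\epsilon$ value are illustrative):</p>

```python
import torch
import torch.nn as nn

class RiskProjections(nn.Module):
    """L2-normalized projections eta_{j,k} and the additive sum g_k(x)."""

    def __init__(self, n_features, n_risks, hidden_dim, eps=1e-8):
        super().__init__()
        # one learnable vector w_{j,k} per (feature, risk) pair
        self.w = nn.Parameter(torch.randn(n_features, n_risks, hidden_dim))
        self.eps = eps

    def forward(self, h):             # h: (batch, n_features, hidden_dim)
        # unit-norm constraint applied on every forward pass
        w_tilde = self.w / (self.w.norm(dim=-1, keepdim=True) + self.eps)
        # eta[i, j, k] = <w_tilde_{j,k}, h_{i,j}>
        eta = torch.einsum('bjd,jkd->bjk', h, w_tilde)
        return eta.sum(dim=1)         # g: (batch, n_risks) log-hazard ratios
```

Stacking the FeatureNet outputs into the `(batch, n_features, hidden_dim)` tensor `h` and summing over the feature axis yields the cause-specific risk scores $g_k(\mathbf{x})$ directly.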
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Cause-Specific Hazards Approach</title>
        <p>
          The cause-specific hazards framework [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] is adopted for the CRISP-NAM model. For a subject with
covariates $\mathbf{x}$, we parameterize each cause-specific hazard using a Cox-type model:
$$\lambda_k(t \mid \mathbf{x}) = \lambda_{k,0}(t) \exp\!\big(g_k(\mathbf{x})\big) \tag{10}$$
where $\lambda_{k,0}(t)$ is the baseline hazard for the $k$-th event and $g_k(\mathbf{x})$ is the risk score function for event
type $k$. Unlike traditional Cox models with linear risk functions, our approach uses FeatureNets within
a neural additive model (NAM), enabling it to model complex non-linear effects of individual features.
        </p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Partial Likelihood Loss Function</title>
        <p>
          To train the model, the standard Cox partial likelihood approach, adapted for competing risks [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ],
is implemented. For a dataset with $n$ subjects, the negative log partial likelihood for event type
$k \in \{1, \ldots, K\}$ is
$$\mathcal{L}_k = -\sum_{i:\, e_i = k} \Bigg[ g_k(\mathbf{x}_i) - \log \Bigg( \sum_{j:\, t_j \ge t_i} \exp\!\big(g_k(\mathbf{x}_j)\big) \Bigg) \Bigg]$$
where $e_i$ is the event type for subject $i$ and $t_i$ is their event or censoring time. The risk set at time
$t_i$ consists of all subjects $j$ who have not yet experienced any event ($t_j \ge t_i$), and $\mathbf{x}_j$ denotes the
feature vector for each subject $j$ in this risk set. The overall loss is the sum of the negative log partial
likelihoods across all event types:
$$\mathcal{L} = \sum_{k=1}^{K} \mathcal{L}_k + \lambda \|\Theta\|_2^2$$
where $\lambda$ is the $\ell_2$ regularization parameter and $\Theta$ represents all model parameters. Since many
real-world problems involving competing risks suffer from class imbalance, we adopt a
risk-frequency-weighted version of the partial likelihood. Specifically, we define:
$$\mathcal{L}_{w,k} = -\alpha_k \sum_{i:\, e_i = k} \Bigg[ g_k(\mathbf{x}_i) - \log \Bigg( \sum_{j:\, t_j \ge t_i} \exp\!\big(g_k(\mathbf{x}_j)\big) \Bigg) \Bigg]$$
where $\alpha_k$ is a weight inversely proportional to the frequency of event type $k$. The total loss is given by:
$$\mathcal{L}_{\text{weighted}} = \sum_{k=1}^{K} \mathcal{L}_{w,k}$$</p>
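        <p>A minimal PyTorch sketch of this weighted negative log partial likelihood (a simplified version without tie corrections; function and variable names are ours, for illustration):</p>

```python
import torch

def cox_partial_nll(g, times, events, weights=None):
    """Negative log partial likelihood summed over K competing risks.

    g:       (n, K) risk scores g_k(x_i)
    times:   (n,) event/censoring times t_i
    events:  (n,) event types e_i in {0, ..., K}, 0 = censored
    weights: optional (K,) inverse-frequency weights alpha_k
    """
    n, K = g.shape
    # at_risk[i, j] = 1 if t_j >= t_i (subject j still at risk at t_i)
    at_risk = (times.unsqueeze(0) >= times.unsqueeze(1)).float()
    loss = g.new_zeros(())
    for k in range(1, K + 1):
        mask = (events == k)
        if not mask.any():
            continue
        # log of the sum over the risk set of exp(g_k(x_j))
        log_risk = torch.log(
            (at_risk * torch.exp(g[:, k - 1]).unsqueeze(0)).sum(dim=1))
        nll_k = -(g[mask, k - 1] - log_risk[mask]).sum()
        if weights is not None:
            nll_k = weights[k - 1] * nll_k
        loss = loss + nll_k
    return loss
```

Passing `weights=None` recovers the unweighted loss $\sum_k \mathcal{L}_k$ (regularization is typically handled by the optimizer's weight decay).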
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Baseline Hazard Estimation</title>
        <p>In the cause-specific proportional hazards formulation, the hazard for event type $k \in \{1, \ldots, K\}$ at
time $t \ge 0$ for a subject with covariates $\mathbf{x} \in \mathbb{R}^{p}$ is expressed as
$$\lambda_k(t \mid \mathbf{x}) = \lambda_{k,0}(t) \cdot \exp\!\big(g_k(\mathbf{x})\big),$$
where
• $\lambda_{k,0}(t)$ is the baseline cause-specific hazard function for event type $k$, independent of covariates,
• $g_k(\mathbf{x})$ is the covariate-dependent log-risk function produced by the model.</p>
        <p>We do not directly parameterize $\lambda_{k,0}(t)$. Instead, we estimate the corresponding baseline cumulative
hazard function $\Lambda_{k,0}(t) = \int_0^t \lambda_{k,0}(s)\, ds$ after training using the Breslow estimator [20]:
$$\hat{\Lambda}_{k,0}(t) = \sum_{i:\, t_i \le t,\ e_i = k} \frac{1}{\sum_{j:\, t_j \ge t_i} \exp\!\big(g_k(\mathbf{x}_j)\big)},$$
where $t_i$ denotes the observed time for subject $i$, $e_i \in \{0, 1, \ldots, K\}$ is the event indicator ($e_i = 0$ if
censored), and the denominator represents the sum of relative risks for all subjects still at risk at time
$t_i$.</p>
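        <p>The Breslow estimator can be computed directly from the fitted risk scores; a NumPy sketch (function and variable names are ours, for illustration):</p>

```python
import numpy as np

def breslow_cumulative_hazard(times, events, risk_scores, k, eval_times):
    """Baseline cumulative hazard Lambda_{k,0}(t) via the Breslow estimator.

    times:       (n,) observed times t_i
    events:      (n,) event indicators e_i (0 = censored)
    risk_scores: (n,) fitted g_k(x_i) for cause k
    eval_times:  (m,) times t at which to evaluate Lambda_{k,0}
    """
    exp_g = np.exp(risk_scores)
    event_times = np.sort(times[events == k])
    # increment at each cause-k event time: 1 / sum of exp(g) over the risk set
    increments = np.array(
        [1.0 / exp_g[times >= t_i].sum() for t_i in event_times])
    return np.array(
        [increments[event_times <= t].sum() for t in eval_times])
```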
        <p>This estimated baseline cumulative hazard ^0() can then be used to compute the cumulative
incidence function (CIF) for each event type  at test time.</p>
        <p>Estimating baseline hazards is necessary for several reasons. First, while the neural additive
component of CRISP-NAM eficiently learns relative risks between subjects (hazard ratios), baseline hazard
estimation enables translation of these relative measures into absolute risk predictions. This is required
for clinical decision-making, where probabilities of events are needed. Second, in competing risks
settings, accurate baseline hazard estimation is required for proper calculation of CIFs, as shown in Eqs. (17)
and (18). Third, the baseline hazard captures the underlying temporal pattern of risk independent of
covariates, allowing CRISP-NAM to generate time-dependent predictions at clinically relevant horizons
(e.g., 1-year, 5-year risks). Finally, proper evaluation metrics such as Brier scores and time-dependent
AUCs at specific time points depend on accurate absolute risk estimation.</p>
      </sec>
      <sec id="sec-3-7">
        <title>3.7. Prediction of Absolute Risks</title>
        <p>To predict the cumulative incidence function (CIF) [21] for each competing event, we use the relationship
between cause-specific hazards and the CIF. For a subject with covariates $\mathbf{x}$, the CIF for event type $k$ at
time $t \ge 0$ is defined as
$$F_k(t \mid \mathbf{x}) = \int_0^t S(s \mid \mathbf{x})\, \lambda_k(s \mid \mathbf{x})\, ds,$$
where
• $S(s \mid \mathbf{x})$ is the overall survival function, i.e., the probability of not experiencing any event up to
time $s$,
• $s$ is the integration variable representing time between $0$ and $t$,
• $\lambda_k(s \mid \mathbf{x}) = \lambda_{k,0}(s) \exp\!\big(g_k(\mathbf{x})\big)$ is the cause-specific hazard for cause $k$.</p>
        <p>The survival function is given by
$$S(t \mid \mathbf{x}) = \exp\!\Bigg(-\sum_{k'=1}^{K} \int_0^t \lambda_{k'}(s \mid \mathbf{x})\, ds\Bigg).$$</p>
        <p>In practice, a discrete approximation is used to compute these integrals. Let $\{\tau_1, \tau_2, \ldots, \tau_M\}$ denote
a set of ordered discrete time points with $\tau_m \in [0, t]$. Then the CIF can be approximated as
$$\hat{F}_k(t \mid \mathbf{x}) \approx \sum_{m:\, \tau_m \le t} \hat{S}(\tau_{m-1} \mid \mathbf{x}) \cdot \hat{\lambda}_k(\tau_m \mid \mathbf{x}),$$
where
$$\hat{S}(\tau_{m-1} \mid \mathbf{x}) = \exp\!\Bigg(-\sum_{k'=1}^{K} \sum_{\ell < m} \hat{\lambda}_{k'}(\tau_\ell \mid \mathbf{x})\Bigg), \qquad
\hat{\lambda}_k(\tau_m \mid \mathbf{x}) = \hat{\lambda}_{k,0}(\tau_m) \exp\!\big(g_k(\mathbf{x})\big),$$
and $\hat{\lambda}_{k,0}(\tau_m)$ denotes the estimated baseline cause-specific hazard at time $\tau_m$ for event type $k$.</p>
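        <p>The discrete CIF approximation can be sketched in a few lines of NumPy, assuming the per-cause hazard increments $\hat{\lambda}_k(\tau_m \mid \mathbf{x})$ have already been computed on a shared time grid:</p>

```python
import numpy as np

def cumulative_incidence(hazards):
    """Discrete CIF for each cause from per-cause hazard increments.

    hazards: (K, M) hazard increments lambda_k(tau_m | x) on grid tau_1..tau_M
    returns: (K, M) CIF estimates F_k(tau_m | x)
    """
    total = hazards.sum(axis=0)                    # overall hazard per step
    # S(tau_{m-1}): survival just before each step, with S(tau_0) = 1
    surv_prev = np.exp(-np.concatenate([[0.0], np.cumsum(total)[:-1]]))
    return np.cumsum(hazards * surv_prev, axis=1)  # running sum of S * lambda_k
```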
      </sec>
      <sec id="sec-3-8">
        <title>3.8. Interpretability Mechanisms</title>
        <p>Given that CRISP-NAM is based on NAMs, as with all variants of Generalized Additive Models,
CRISP-NAM can generate shape function plots to visualize the (non-linear) contribution of each feature to the
prediction. Specifically, for each feature $j$ and risk $k$, we can extract a shape function that describes
how the feature affects the log-hazard ratio:
$$s_{j,k}(x_j) = \eta_{j,k}\big(f_j(x_j)\big) \tag{20}$$</p>
        <p>The importance of feature $j$ for risk $k$ is quantified by the mean absolute value of its contribution
across the dataset:
$$\mathcal{I}_{j,k} = \frac{1}{n} \sum_{i=1}^{n} \big|s_{j,k}(x_{ij})\big| \tag{21}$$</p>
        <p>This enables ranking features by their impact on each competing risk, providing valuable insights
into risk-specific predictor importance.</p>
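        <p>Computing these importance scores from stored per-feature contributions is straightforward; a NumPy sketch with hypothetical array shapes and feature names:</p>

```python
import numpy as np

def feature_importance(contributions):
    """Mean |s_{j,k}(x_ij)| over the dataset.

    contributions: (n, p, K) array of eta_{j,k}(f_j(x_ij))
    returns: (p, K) importance scores I_{j,k}
    """
    return np.abs(contributions).mean(axis=0)

def rank_features(importance, k, feature_names):
    """Feature names sorted by importance for risk k (descending)."""
    order = np.argsort(-importance[:, k])
    return [feature_names[j] for j in order]
```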
        <p>Notably, in this current implementation, CRISP-NAM does not capture feature interactions. This is
by design, to prioritize interpretability through independent feature-level shape functions, ensuring that
feature contributions can be visualized and understood in isolation. Adding separate FeatureNets to
model feature interactions adds to the model complexity and affects its interpretability, as visualization
beyond pairwise feature interactions is challenging. With $p$ features, there are $\binom{p}{2}$ possible pairwise
interactions, and deciding which interactions to model would also require domain knowledge.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Datasets</title>
        <p>In this section, the datasets used and the experimental procedure to evaluate the CRISP-NAM model
are described.</p>
        <p>In order to evaluate the proposed interpretable model for competing risks survival prediction, we used
the following three real-world medical datasets and a synthetic dataset. Table 1 provides a summary
and breakdown of the primary and competing risks for each of the datasets used in this work.
Primary Biliary Cholangitis (PBC). The PBC dataset originates from a randomized controlled
trial conducted at the Mayo Clinic between 1974 and 1984, involving 312 patients diagnosed with
primary biliary cholangitis. The study aimed to evaluate the efficacy of D-penicillamine in treating the
disease. Each patient record includes 25 covariates encompassing demographic, clinical, and laboratory
measurements. The primary endpoint was mortality while on the transplant waiting list, with liver
transplantation considered a competing risk [22].</p>
        <p>Framingham Heart Study. Initiated in 1948, the Framingham Heart Study is a longitudinal cohort
study designed to investigate cardiovascular disease (CVD) risk factors. For this analysis, data from
4,434 male participants were utilized, each with 18 baseline covariates collected over a 20-year follow-up
period. The study focuses on modelling the risk of developing CVD, treating mortality from non-CVD
causes as a competing event [23].</p>
        <p>SUPPORT2 Dataset. The SUPPORT2 dataset originates from the Study to Understand Prognoses
and Preferences for Outcomes and Risks of Treatments (SUPPORT2), a comprehensive investigation
conducted across five U.S. medical centers between 1989 and 1994. This dataset encompasses records
of 9,105 critically ill hospitalized adults, each characterized by 42 variables detailing demographic
information, physiological measurements, and disease severity indicators. The study was executed in
two phases: Phase I (1989–1991) was a prospective observational study aimed at assessing the care
and decision-making processes for seriously ill patients and Phase II (1992–1994) implemented an
intervention to enhance end-of-life care. The primary objective was to develop and validate prognostic
models estimating 2- and 6-month survival probabilities, thereby facilitating improved clinical
decision-making and patient-physician communication regarding treatment preferences and outcomes [24].</p>
        <sec id="sec-4-1-1">
          <title>Synthetic Dataset.</title>
          <p>We use the synthetic dataset introduced by Lee et al. [16], which models two
competing risks with distinct but overlapping covariate effects. Each patient $i$ is assigned a 12-dimensional
feature vector $\mathbf{x}^{(i)} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_{12})$, partitioned into three 4-dimensional subgroups: $\mathbf{x}_1^{(i)}, \mathbf{x}_2^{(i)}, \mathbf{x}_3^{(i)}$. The
event times are sampled from exponential distributions as:
$$T_1^{(i)} \sim \exp\!\Big(\gamma \,\big\|\mathbf{x}_3^{(i)}\big\|_2^2 + \gamma \,\mathbf{1}^\top \mathbf{x}_1^{(i)}\Big), \tag{22}$$
$$T_2^{(i)} \sim \exp\!\Big(\gamma \,\big\|\mathbf{x}_3^{(i)}\big\|_2^2 + \gamma \,\mathbf{1}^\top \mathbf{x}_2^{(i)}\Big), \tag{23}$$
where $\gamma = 10$. Covariates $\mathbf{x}_1$ and $\mathbf{x}_2$ influence only their respective event times, while
$\mathbf{x}_3$ affects both.</p>
          <p>The dataset consists of 30,000 rows of unique patient data with 50% random right-censoring, obtained by
drawing a censoring time $C^{(i)} \sim \mathcal{U}\big[0, \min\{T_1^{(i)}, T_2^{(i)}\}\big]$. The final observed data for each patient is
$(\mathbf{x}^{(i)}, t^{(i)}, e^{(i)})$, where $t^{(i)}$ is the observed time and $e^{(i)}$ is the event indicator ($\varnothing$ if censored).</p>
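          <p>A generator in this spirit can be sketched in NumPy. The rate clipping and the exponential-mean parameterization below are our illustrative choices where the extracted specification is ambiguous, not the exact generator of Lee et al.:</p>

```python
import numpy as np

def make_synthetic(n=30_000, gamma=10.0, p_censor=0.5, seed=0):
    """Two competing risks whose rates share a quadratic term in x3."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 12))
    x1, x2, x3 = x[:, :4], x[:, 4:8], x[:, 8:]
    quad = gamma * (x3 ** 2).sum(axis=1)          # shared ||x3||^2 effect
    # clip rates to stay positive (illustrative, not part of the original spec)
    rate1 = np.maximum(quad + gamma * x1.sum(axis=1), 1e-3)
    rate2 = np.maximum(quad + gamma * x2.sum(axis=1), 1e-3)
    t1 = rng.exponential(1.0 / rate1)             # numpy's scale = mean
    t2 = rng.exponential(1.0 / rate2)
    t_event = np.minimum(t1, t2)
    e = np.where(t1 <= t2, 1, 2)                  # which risk occurred first
    censor = rng.random(n) < p_censor             # ~50% randomly censored
    c = rng.uniform(0.0, t_event)                 # C ~ U[0, min(T1, T2)]
    t_obs = np.where(censor, c, t_event)
    e_obs = np.where(censor, 0, e)                # 0 encodes "censored"
    return x, t_obs, e_obs
```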
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental Setup</title>
        <p>The CRISP-NAM model is implemented using PyTorch and the code is available from https://github.
com/VectorInstitute/crisp-nam. We employed nested cross-validation to prevent data leakage during
hyperparameter optimization and model evaluation. The approach consists of an outer 5-fold stratified
cross-validation for performance evaluation and an inner 5-fold cross-validation for hyperparameter
tuning within each outer fold. For each outer fold, Optuna [25] is employed on the training partition to
systematically search for optimal model configurations using the inner 5-fold cross-validation, tuning
learning rate, $\ell_2$ regularization strength, dropout rates, network architecture (1–3 hidden layers with
8–128 units), and batch normalization settings, using validation loss as the objective. The best configuration
identified for each outer fold is then trained on the complete training partition and evaluated on the
corresponding held-out test partition. Continuous features were normalized using standard scaling
($\mu = 0$, $\sigma = 1$) and categorical features were one-hot encoded. Missing categorical values were imputed
Training employed the AdamW [26] optimizer to minimize the negative log-likelihood loss with a batch
size of 256 and early stopping (patience=10) to prevent over-fitting.</p>
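        <p>The nested cross-validation index structure described above can be sketched as follows (plain random folds for illustration; the paper uses stratified outer folds, with Optuna driving the inner tuning loop):</p>

```python
import numpy as np

def nested_cv_splits(n, outer_k=5, inner_k=5, seed=0):
    """Index structure for nested cross-validation.

    For each outer fold, the outer-train portion is re-split into inner
    folds for hyperparameter tuning, and the outer-test fold is held out
    strictly for evaluation, preventing leakage into model selection.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    outer_folds = np.array_split(idx, outer_k)
    splits = []
    for i, test_idx in enumerate(outer_folds):
        train_idx = np.concatenate(
            [f for j, f in enumerate(outer_folds) if j != i])
        inner = np.array_split(rng.permutation(train_idx), inner_k)
        splits.append((train_idx, test_idx, inner))
    return splits
```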
        <p>Model performance was assessed using complementary metrics [27] for discrimination and calibration.
For discrimination ability, the Time-Dependent Area-Under-the-Curve (TD-AUC) was used to quantify
how well the model ranks subjects by risk, with values ranging from 0.5 (no better than chance) to
1.0 (perfect discrimination). Additionally, the Time-Dependent Concordance Index (TD-CI), which
considers that the model’s performance can change over time was computed. This is crucial because the
risk of an event may evolve as time progresses, and a model’s predictive ability might not be constant.
TD-CI ranges from 0.5 to 1.0, with higher values indicating better discriminative ability. The third
metric was the Brier score (BS), which measures the accuracy of probabilistic predictions and penalizes
both discrimination and calibration errors, with lower values of this score indicating better performance.
All metrics were evaluated at multiple clinically relevant time horizons corresponding to the 25th, 50th,
and 75th percentiles of observed event times for each competing risk.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussion</title>
      <p>Notes: Dataset = (FHS: Framingham Heart Study, SUP: SUPPORT2, PBC: Primary Biliary Cirrhosis, SYN: Synthetic); Model =
(CRISP-NAM: CRISP-NAM, NFG: Neural Fine Gray, DEEPHIT: DeepHit); Risk = (1: Primary, 2: Competing)</p>
      <p>We compare CRISP-NAM against state-of-the-art neural baselines: DeepHit and Neural Fine Gray (NFG). While DeepHit generally achieves the
highest performance, CRISP-NAM demonstrates competitive discrimination with the added benefit of
interpretability through its additive structure.</p>
      <p>CRISP-NAM achieves discrimination metrics within 2-5% of DeepHit on clinical datasets,
demonstrating its competitiveness. On FHS, the TD-AUC gap at the 0.25 quantile is merely 0.011 (0.843 vs 0.854), and
CRISP-NAM performance is at parity with DeepHit at 0.50 for both risks. Similarly, on SUPPORT2
Risk 1, CRISP-NAM exceeds DeepHit’s TD-AUC by substantial margins at 0.25 (0.855 vs 0.779) and
0.50 (0.802 vs 0.640). For PBC, while DeepHit achieves marginally higher TD-AUC, both models reach
near-ceiling performance, making the practical difference negligible. The synthetic dataset represents
the primary challenge, where DeepHit’s flexible architecture captures complex non-linear patterns
that CRISP-NAM’s additive structure cannot fully represent. Our experimental results demonstrate a
systematic 15% performance deficit of CRISP-NAM relative to DeepHit across all metrics on the synthetic
dataset. The synthetic dataset generator incorporates quadratic univariate effects through $\|\mathbf{x}_3\|^2$, which
creates strong non-linearities in the hazard space that may be challenging for CRISP-NAM’s additive
architecture. The $\mathbf{x}_3$ components also affect both competing risks identically, creating dependencies
that are difficult for strictly additive models to capture.</p>
      <p>Notably, CRISP-NAM’s calibration performance (Brier score) also trails that of NFG and DeepHit.
This could be attributed to the loss function we implemented, which optimizes for discrimination (ranking)
rather than calibration (probability accuracy). In addition, the event weighting could exacerbate the
model’s lagging calibration. We will investigate improvements to the loss function to handle calibration
more effectively in our future work.</p>
      <sec id="sec-5-1">
        <title>5.1. Interpretability Analysis</title>
        <p>Here, we present the shape function plots for the top 10 features per risk as learned by the CRISP-NAM model
across three datasets: SUPPORT2, Framingham, and PBC. Each curve $s_{j,k}(x_j)$ represents the
marginal contribution of feature $j$ to the log cause-specific hazard for risk $k$. Rug plots beneath each
curve illustrate the empirical distribution of feature values, highlighting regions, particularly in the
tails, where data are sparse.</p>
        <p>Interpretation Note. It should be noted that these shape functions are associational, not causal, and
may obscure interactions between features. They are estimated under smoothness constraints and can
extrapolate in regions with low data density, leading to amplified or flattened effects. Apparent patterns
should be interpreted cautiously, corroborated on external cohorts, and discussed with domain experts
before drawing scientific or clinical conclusions. While our primary focus centres on analyzing trends
revealed through the shape function plots, we provide limited discussion of the associations between
covariates and predicted risks. These discussions serve primarily to demonstrate how our shape plot
findings align with or contrast against established medical literature.</p>
        <sec id="sec-5-1-1">
          <title>5.1.1. Framingham Heart Study Dataset</title>
          <p>Figure 3 illustrates shape functions from the CRISP-NAM model trained on the Framingham dataset,
with separate plots for two competing risks: cardiovascular disease (Risk 1) and non–cardiovascular
death (Risk 2). All functions are displayed on the log-hazard scale, where a unit increase of +0.69
corresponds to a doubling of the cause-specific hazard.</p>
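<p>The stated equivalence is just the arithmetic of the log link, which the following snippet verifies:</p>

```python
import math

base_log_hazard = -1.2                    # arbitrary baseline on the log scale
bumped = base_log_hazard + math.log(2)    # a +0.693 shift in a shape function

# Because hazard = exp(log-hazard), a +log(2) shift multiplies the
# cause-specific hazard by exactly two.
hazard_ratio = math.exp(bumped) / math.exp(base_log_hazard)
```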
          <p>For Risk 1, GLUCOSE and BMI show steep increases, indicating that higher blood glucose levels and
increasing BMI are associated with a higher risk of cardiovascular disease in this dataset. The shape plots
reveal that patients with diabetes, patients who are on medication for hypertension, and current
smokers (DIABETES, BPMEDS and CURSMOKE) have a higher risk of cardiovascular disease,
which correlates well with known risk factors. The shape plots also reveal that women have a lower risk of
cardiovascular disease compared to men, and that more education is correlated with a lower risk of
cardiovascular disease. The binary history flags PREVCHD and PREVAP show negative contributions to
the log-hazard. Three data characteristics, rather than a real reversal of risk, explain this result:
1. Sparsity. For binary variables with low prevalence rates (less than 10%) such as PREVAP and
PREVCHD, excessive smoothing regularization can distort their true impact on survival outcomes.
The shape functions for these rare categorical variables are vulnerable to being inappropriately
regressed toward the population mean, causing established cardiovascular risk factors to
paradoxically appear protective in the visualized contribution plots. This phenomenon occurs because
the limited number of positive cases provides insufficient signal to overcome the model’s
smoothing penalties, resulting in misleading shape functions that fail to capture the true elevated risk
associated with these clinical conditions.
2. Selection. The original study excluded most people with severe existing heart disease. The
retained group is therefore healthier or already under treatment, a “survival-selection” bias that
lowers their short-term risk estimates [28].
3. Competing-risk censoring. Deaths due to heart problems are counted under Risk 1. Removing
those events from the Risk 1 pool leaves a group that is, by definition, less likely to die from
non-heart causes. This negative effect can then bleed back into the Risk 1 estimate because the
true positive effect must be learned from very few events.</p>
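<p>The sparsity mechanism in point 1 can be illustrated with a closed-form ridge regression on a single binary predictor: under the same fixed penalty, a rare flag retains far less of its true effect than a common one. All numbers below are synthetic, and the setup is a deliberately simplified analogue of the model’s smoothing penalty:</p>

```python
import numpy as np

def ridge_coef(x, y, lam):
    """Closed-form ridge estimate for a single centred predictor:
    beta = <x, y> / (<x, x> + lam). The penalty shrinks beta toward 0."""
    x = x - x.mean()
    return float(x @ y / (x @ x + lam))

rng = np.random.default_rng(1)
n, beta_true, lam = 2000, 1.0, 50.0

def retained_effect(prevalence):
    """Fraction of the true effect retained after penalization, for a
    binary flag with the given prevalence (synthetic data)."""
    x = (rng.random(n) < prevalence).astype(float)
    y = beta_true * x + rng.normal(scale=0.1, size=n)
    return ridge_coef(x, y, lam) / beta_true

# A 50%-prevalence flag keeps most of its effect; a 2%-prevalence flag is
# shrunk much harder, mirroring the over-smoothing described above.
common, rare = retained_effect(0.5), retained_effect(0.02)
```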
          <p>For Risk 2, the shape functions for SYSBP and GLUCOSE follow J-shaped profiles, with minimal
contribution at lower values and a marked increase beyond the upper quantiles. educ decreases nearly
linearly. The binary variable PREVCHD is associated with a negative contribution, while DIABETES
presents a discrete step-wise increase. The BMI function shows a shallow U-shape. DIABP increases
monotonically throughout its observed range.</p>
          <p>Several observed patterns in the shape functions are consistent with known risk factors for
cardiovascular and all-cause mortality. Established studies have linked elevated systolic blood pressure and high
glucose levels with heightened cardiovascular risk [23, 28]. Cigarettes per day displayed a saturating
relationship, with diminishing marginal effects at higher consumption levels, reflecting saturation of
smoking-related harm pathways. The inverse trend for educational attainment aligns with literature on
socioeconomic disparities in cardiovascular outcomes [29]. The U-shaped relationship observed for BMI
(Risk 2) has been previously noted in older adults and is often described as the “obesity paradox” [30];
sparse data in the extreme ranges may also contribute. Additionally, previous cardiovascular conditions (angina
pectoris and coronary heart disease) showed positive linear associations with mortality risk.</p>
          <p>Figure 4 shows how features contribute positively or negatively to the prediction. While feature
importance plots are generally less informative than shape plots, which can reveal more detailed
relationships between covariates and risk, we provide them here for completeness.</p>
        </sec>
        <sec id="sec-5-1-2">
          <title>5.1.2. Primary Biliary Cholangitis (PBC) Dataset</title>
          <p>Figure 5 presents the feature importance and shape functions from the CRISP-NAM model trained on
the Primary Biliary Cholangitis (PBC) dataset. The plots show distinct patterns for Risk 1 (death on the
waiting list) and Risk 2 (transplantation). Additionally, Figure 6 shows the top 5 positive and top 5 negative
contributing features for the model’s prediction.</p>
          <p>For Risk 1, age exhibits a sigmoid-shaped curve. alkaline displays sharp increases followed by
plateaus at higher values. The biomarkers serBilir, serChol and prothrombin all rise
steeply at low values and plateau thereafter. Binary indicators such as hepatomegaly_Yes show
positive contributions. albumin demonstrates an inverted sigmoid-shaped curve: initially increasing,
crossing zero near the median, and declining at higher values.</p>
          <p>For Risk 2, the shape for age is inverse-sigmoidal, decreasing as age increases and signifying
that the transplantation hazard declines with advancing age. The shapes for serBilir,
alkaline and serChol rise steeply before plateauing. These three elevated biomarkers indicate
disease severity in PBC, which simultaneously increases both death risk and transplant priority.
Examining the contribution scales for ascites_Yes across both risks shows a moderate negative contribution
to Risk 1 (∼ −0.4) but a much stronger negative contribution to Risk 2 (∼ −1.0). From a model
validation perspective, this suggests the model has learned that the presence of ascites is associated with
a reduced likelihood of both outcomes, but particularly transplantation. The asymmetric magnitudes
indicate the model distinguishes between the two competing risks rather than simply treating ascites
as a general severity marker.</p>
          <p>The rug plots accompanying each shape function reflect the distribution of the feature values and
indicate regions with limited data support. These empirical patterns align with several well-established
clinical insights in the context of Primary Biliary Cholangitis (PBC). For instance, older age, elevated
liver enzymes such as alkaline phosphatase, and increased serum bilirubin are recognized markers of
disease severity and poorer prognosis [22, 31]. The decreasing transplant hazard with age may reflect
clinical prioritization criteria that favour younger candidates for organ allocation. The non-linear shape
for albumin aligns with its known role as a proxy for liver synthetic function, where low levels indicate
hepatic decompensation. Histologic stage (histologic) progression, from fibrosis to cirrhosis, is a
standard determinant in transplant eligibility, consistent with the monotonic rise observed in its shape
function. Additionally, ascites and hepatomegaly are classical signs of advanced liver disease, often
associated with higher mortality and reduced transplant suitability. Edema in PBC patients indicates
advanced liver disease. In the competing risks framework, patients with edema have higher disease
severity and thus receive higher priority for transplantation due to their urgent medical need. The
positive contribution to transplantation risk reflects the clinical reality that transplant allocation systems
prioritize sicker patients; those with edema are more likely to receive transplants because edema serves
as a marker of advanced disease requiring urgent intervention.</p>
        </sec>
        <sec id="sec-5-1-3">
          <title>5.1.3. SUPPORT2 Dataset</title>
          <p>Figure 7 presents feature importance and shape functions from the CRISP-NAM model trained on
the SUPPORT2 dataset, which distinguishes cancer-specific mortality (Risk 1) from death due to other
causes (Risk 2).</p>
          <p>We highlight several observations from the shape plots for both risks. For Risk 1, the shape function
for age shows a lower risk of death from cancer for younger patients, a steadily increasing risk of
cancer-related death up to approximately 65 years, and a declining risk for very old patients. The binary
indicator race_black exhibits a slight negative contribution to the log-hazard for Risk 1, suggesting that
Black patients appear to have lower cancer mortality. This apparent protective effect of Black
race contradicts established epidemiological evidence [32] demonstrating higher cancer mortality rates
in Black populations. It could indicate insufficient sample representation or selection biases inherent
to the SUPPORT2 dataset. Similarly, the inverse relationship between the number of comorbidities and
cancer death risk suggests competing mortality mechanisms, where patients with multiple comorbidities
may succumb to other medical conditions before cancer progression becomes the primary threat. The
shape for avtisst is low for shorter durations of mechanical ventilation but shows a steep increase beyond
a certain number of days. pafi (oxygen ratio) exhibits a spike at very low values (∼ 0.7), then drops
to a steady negative contribution (∼ −0.3). Other biomarkers such as sodium (sod) show an inverted
U-shape peaking near normal levels, suggesting that both low and very high sodium
levels increase risk.</p>
          <p>For Risk 2, the severity score aps shows sharp increases followed by plateaus. The avtisst variable
again displays a marked rise in hazard for durations exceeding 3 days. The cost-related variables charges
and totcst show decreasing trends. The binary indicator dnr_no_dnr shows that patients without
DNR (Do Not Resuscitate) orders (dnr_no_dnr = 1) demonstrated substantially lower contributions
to non-cancer death risk compared to those with DNR orders present. This pattern aligns with clinical
expectations, as DNR orders typically indicate patients with poor overall prognosis, advanced chronic
diseases, or end-stage conditions who are at elevated risk for cardiovascular, respiratory, or multi-organ
failure. Rising aps scores are consistent with the role of physiological instability and organ dysfunction
in predicting mortality [33]. The inverted-U for age in cancer mortality likely reflects competing risks
from other causes in older individuals [34]. The association between prolonged ventilation (avtisst
&gt; 3 days) and increased hazard aligns with the known severity of illness in patients requiring extended
respiratory support. Cost variables likely act as proxies for length of stay or illness trajectory rather
than direct predictors.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>We introduce CRISP-NAM, a deep survival model that simultaneously addresses competing risks
in survival analysis while remaining inherently interpretable. Our model demonstrates competitive
discriminative performance on real-world clinical data and uniquely reveals covariate effects through
intuitive shape function plots. CRISP-NAM is particularly valuable in high-stakes healthcare ML
applications requiring mechanistic understanding, such as investigating associational relationships,
assessing treatment efficacy, or designing targeted interventions, especially when competing events
represent distinct clinical processes [35].</p>
      <p>We acknowledge that CRISP-NAM inherits the proportional hazards assumption from the Cox
framework, requiring that covariate effects on each cause-specific hazard remain constant over time.
This assumption may be violated when covariate effects vary temporally, such as biomarkers having
different predictive power for early versus late events. There are two possible avenues for future work:
(i) exploration of temporal FeatureNets capable of learning how feature contributions evolve over time,
with the caveat that this approach would likely increase computational complexity during training
and potentially demand larger datasets; and (ii) a modified loss function
that takes calibration (e.g., the Brier score) into account.</p>
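<p>As a purely illustrative sketch of avenue (i), a temporal shape function could be represented as a function of both the feature value and time, from which time-slices are read off. The toy effect below is our invention, not part of CRISP-NAM: it decays with follow-up time, as a biomarker losing predictive power for late events might:</p>

```python
import numpy as np

def temporal_contribution(f_jt, x_value, times):
    """Evaluate a hypothetical time-varying shape function f_j(x, t) at a
    fixed covariate value across a grid of times. In a temporal FeatureNet,
    f_jt would be a small network taking (x, t) pairs instead of x alone."""
    return np.array([f_jt(x_value, t) for t in times])

# Toy biomarker effect on the log-hazard that decays with follow-up time:
f = lambda x, t: x * np.exp(-0.5 * t)
early = temporal_contribution(f, 2.0, [0.0])   # strong early contribution
late = temporal_contribution(f, 2.0, [4.0])    # attenuated late contribution
```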
    </sec>
    <sec id="sec-7">
      <title>7. Generative AI Declaration</title>
      <p>During the preparation of this work, the author(s) used generative AI tools such as ChatGPT and
Claude to correct grammatical and spelling errors, to paraphrase for better clarity, and
to generate diagrams. After using these tools/services, the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
      <p>[16] C. Lee, W. Zame, J. Yoon, M. van der Schaar, DeepHit: A deep learning approach to survival analysis
with competing risks, in: Proceedings of the 32nd AAAI Conference on Artificial Intelligence,
volume 32, 2018.
[17] V. Jeanselme, C. H. Yoon, B. Tom, J. Barrett, Neural Fine-Gray: Monotonic neural networks for
competing risks, in: Proceedings of the Conference on Health, Inference, and Learning, PMLR,
2023, pp. 379–392.
[18] X. Huang, J. Marques-Silva, On the failings of Shapley values for explainability, International
Journal of Approximate Reasoning 171 (2024) 109112. doi:10.1016/j.ijar.2023.109112.
[19] T. Laugel, M.-J. Lesot, C. Marsala, X. Renard, M. Detyniecki, The dangers of post-hoc
interpretability: unjustified counterfactual explanations, in: Proceedings of the 28th
International Joint Conference on Artificial Intelligence, IJCAI’19, AAAI Press, 2019, pp. 2801–2807.
doi:10.24963/ijcai.2019/388.
[20] N. E. Breslow, Analysis of survival data under the proportional hazards model, International
Statistical Review 43 (1975) 45–57. doi:10.2307/1402659.
[21] R. J. Gray, A class of k-sample tests for comparing the cumulative incidence of a competing risk,
The Annals of Statistics (1988) 1141–1154.
[22] E. R. Dickson, et al., Application of the Mayo primary biliary cirrhosis survival model to patients
awaiting liver transplantation, Hepatology 9 (1989) 216–221.
[23] W. B. Kannel, D. L. McGee, Diabetes and cardiovascular disease: The Framingham study, JAMA
241 (1979) 2035–2038. doi:10.1001/jama.241.19.2035.
[24] A controlled trial to improve care for seriously ill hospitalized patients: The Study to Understand
Prognoses and Preferences for Outcomes and Risks of Treatments (SUPPORT). The SUPPORT
principal investigators, JAMA 274 (1995) 1591–1598.
[25] T. Akiba, S. Sano, T. Yanase, T. Ohta, M. Koyama, Optuna: A next-generation hyperparameter
optimization framework, in: Proceedings of the 25th ACM SIGKDD International Conference on
Knowledge Discovery &amp; Data Mining, 2019, pp. 2623–2631. doi:10.1145/3292500.3330701.
[26] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, in: 7th International Conference
on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019.
[27] S. Y. Park, J. E. Park, H. Kim, S. H. Park, Review of statistical methods for evaluating the
performance of survival or other time-to-event prediction models (from conventional to deep learning
approaches), Korean Journal of Radiology 22 (2021) 1697–1707. doi:10.3348/kjr.2021.0223.
[28] R. B. D’Agostino, W. B. Kannel, Epidemiological background and design: The Framingham study,
in: Proceedings of the American Statistical Association Sesquicentennial Invited Paper Sessions,
American Statistical Association, Alexandria, VA, 1989, pp. 707–718.
[29] A. Rosengren, A. Smyth, S. Rangarajan, C. Ramasundarahettige, S. I. Bangdiwala, K. F. AlHabib,
A. Avezum, K. B. Boström, J. Chifamba, S. Gulec, et al., Socioeconomic status and risk of
cardiovascular disease in 20 low-income, middle-income, and high-income countries: the
Prospective Urban Rural Epidemiologic (PURE) study, The Lancet Global Health 7 (2019) e748–e760.
doi:10.1016/S2214-109X(19)30045-2.
[30] D. E. Amundson, S. Djurkovic, G. N. Matwiyoff, The obesity paradox, Critical Care Clinics 26
(2010) 583–596. doi:10.1016/j.ccc.2010.06.004.
[31] C. F. Murillo Perez, et al., Optimizing therapy in primary biliary cholangitis: alkaline phosphatase
at six months identifies one-year non-responders and predicts survival, Liver International:
official journal of the International Association for the Study of the Liver 43 (2023) 1497–1506.
doi:10.1111/liv.15592.
[32] A. H. Saka, A. N. Giaquinto, L. E. McCullough, K. Y. Tossas, J. Star, A. Jemal, R. L. Siegel, Cancer
statistics for African American and Black people, 2025, CA: A Cancer Journal for Clinicians 75 (2025)
111–140. doi:10.3322/caac.21874.
[33] W. A. Knaus, F. E. Harrell, J. Lynn, L. Goldman, R. S. Phillips, A. F. Connors, N. V. Dawson, W. J.
Fulkerson, R. M. Califf, N. Desbiens, et al., The SUPPORT prognostic model: Objective estimates of
survival for seriously ill hospitalized adults. Study to Understand Prognoses and Preferences for
Outcomes and Risks of Treatments, Annals of Internal Medicine 122 (1995) 191–203. doi:10.7326/
0003-4819-122-3-199502010-00007.
[34] E. Hayes-Larson, S. F. Ackley, S. C. Zimmerman, M. Ospina-Romero, M. M. Glymour, R. E. Graff,
J. S. Witte, L. C. Kobayashi, E. R. Mayeda, The competing risk of death and selective survival
cannot fully explain the inverse cancer-dementia association, Alzheimer’s &amp; Dementia 16 (2020)
1696–1703. doi:10.1002/alz.12168.
[35] P. C. Austin, D. S. Lee, J. P. Fine, Introduction to the analysis of survival data in the presence of
competing risks, Circulation 133 (2016) 601–609. doi:10.1161/CIRCULATIONAHA.115.017719.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Government of Canada, Artificial Intelligence and Data Act (AIDA), 2024. URL: https://ised-isde.canada.ca/site/innovation-better-canada/en/artificial-intelligence-and-data-act.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] U.S. Food and Drug Administration, Artificial intelligence/software as a medical device, 2024. URL: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-software-medical-device.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] Article 13, Transparency and provision of information to deployers, EU AI Act, 2024. URL: https://www.artificial-intelligence-act.com/Artificial_Intelligence_Act_Article_13.html.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] D. R. Cox, Regression models and life-tables, Journal of the Royal Statistical Society: Series B (Methodological) 34 (1972) 187–202.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] T. Cai, G. Tonini, X. Lin, Kernel machine approach to testing the significance of multiple genetic markers for risk prediction, Biometrics 67 (2011) 975–986. doi:10.1111/j.1541-0420.2010.01544.x.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] J. L. Katzman, U. Shaham, A. Cloninger, J. Bates, T. Jiang, Y. Kluger, DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network, BMC Medical Research Methodology 18 (2018) 1–12. doi:10.1186/s12874-018-0482-1.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] H. Kvamme, Ø. Borgan, I. Scheel, Time-to-event prediction with neural networks and Cox regression, Journal of Machine Learning Research 20 (2019) 1–30.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] M. S. Kovalev, L. V. Utkin, E. M. Kasimov, SurvLIME: A method for explaining machine learning survival models, Knowledge-Based Systems 203 (2020) 106164. doi:10.1016/j.knosys.2020.106164.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] M. T. Ribeiro, S. Singh, C. Guestrin, Model-agnostic interpretability of machine learning, arXiv preprint arXiv:1606.05386 (2016).</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] R. Agarwal, L. Melnick, N. Frosst, X. Zhang, B. Lengerich, R. Caruana, G. E. Hinton, Neural additive models: interpretable machine learning with neural nets, in: Advances in Neural Information Processing Systems, volume 34, 2021, pp. 4699–4711. doi:10.5555/3540261.3540620.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] L. V. Utkin, E. D. Satyukov, A. V. Konstantinov, SurvNAM: The machine learning survival model explanation, Neural Networks 147 (2022) 81–102. doi:10.1016/j.neunet.2021.12.015.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] L. Xu, C. Guo, CoxNAM: An interpretable deep survival analysis model, Expert Systems with Applications 227 (2023) 120218. doi:10.1016/j.eswa.2023.120218.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] H. Ishwaran, U. B. Kogalur, E. H. Blackstone, M. S. Lauer, Random survival forests, The Annals of Applied Statistics 2 (2008) 841–860. doi:10.1214/08-AOAS169.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] R. L. Prentice, J. D. Kalbfleisch, A. V. Peterson Jr., N. Flournoy, V. T. Farewell, N. E. Breslow, The analysis of failure times in the presence of competing risks, Biometrics (1978) 541–554. doi:10.2307/2530374.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] J. P. Fine, R. J. Gray, A proportional hazards model for the subdistribution of a competing risk, Journal of the American Statistical Association 94 (1999) 496–509. doi:10.2307/2670170.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>