<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Assessing reliability of explanations in unbalanced datasets: a use-case on the occurrence of frost events</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ilaria Vascotto</string-name>
          <email>ilaria.vascotto@phd.units.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valentina Blasone</string-name>
          <email>valentina.blasone@phd.units.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alex Rodriguez</string-name>
          <email>alejandro.rodriguezgarcia@units.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Bonaita</string-name>
          <email>alessandro.bonaita@generali.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Bortolussi</string-name>
          <email>lbortolussi@units.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Workshop</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Assicurazioni Generali Spa</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics, Informatics and Geosciences</institution>
          ,
          <institution>University of Trieste</institution>
          ,
          <addr-line>Trieste</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>The Abdus Salam International Center for Theoretical Physics</institution>
          ,
          <addr-line>Trieste</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>0</volume>
      <fpage>9</fpage>
      <lpage>11</lpage>
      <abstract>
        <p>The usage of eXplainable Artificial Intelligence (XAI) methods has become essential in practical applications, given the increasing deployment of Artificial Intelligence (AI) models and the legislative requirements put forward in recent years. A fundamental but often underestimated aspect of the explanations is their robustness, a key property that should be satisfied in order to trust the explanations. In this study, we provide some preliminary insights on evaluating the reliability of explanations in the specific case of unbalanced datasets, which are very frequent in high-risk use-cases, but at the same time considerably challenging for both AI models and XAI methods. We propose a simple evaluation focused on the minority class (i.e. the less frequent one) that leverages on-manifold generation of neighbours, explanation aggregation and a metric to test explanation consistency. We present a use-case based on a tabular dataset with numerical features focusing on the occurrence of frost events.</p>
      </abstract>
      <kwd-group>
        <kwd>XAI</kwd>
        <kwd>Unbalanced datasets</kwd>
        <kwd>Neural networks</kwd>
        <kwd>Trustworthiness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Nowadays, Artificial Intelligence (AI) and Machine Learning (ML) models have become essential tools
for practitioners tackling real and complex problems. These include high-risk applications such as
healthcare, climate, and fraud detection, which require highly reliable models due to the serious
consequences that incorrect or biased predictions may have. However, many real-world datasets in
these domains are intrinsically unbalanced as the critical events, e.g. rare diseases, natural catastrophes,
or fraudulent transactions, occur less frequently than the normal conditions. Dataset unbalance
introduces significant challenges for the AI and ML models, often leading to biased model predictions.
In this context, gaining insights on what lies behind a model’s prediction is particularly valuable for
enhancing its trustworthiness. Despite this, ML models are usually very complex and are often treated
by practitioners as black boxes.</p>
      <p>Explainable Artificial Intelligence (XAI) methods aim at improving the transparency of ML models
and offer a set of tools that can be used to open the black box, either via local or global explanations.</p>
      <p>
        In
practical applications, while efforts are made towards the development of highly accurate ML models,
the explainability aspect may at times be overlooked. This is in part an undesirable consequence
of the new legislative requirements that have been proposed for high-risk applications, both in the
GDPR [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and AI Act [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The transparency requirement can at times result in the incautious use of
XAI techniques to satisfy the legislative needs on such use-cases. As an example, practitioners may
apply frequently cited methodologies, such as LIME [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and SHAP [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], without fully understanding
their theoretical backgrounds and feature-wise requirements. Careless application of XAI approaches
to high-risk scenarios may reflect into unintended, yet harmful, consequences. In this respect, the
robustness of the explanations is often left unconsidered, whereas it is a decisive aspect to increase
model trustworthiness and reliability of the provided explanation. The robustness of explanations (also
referred to as stability) can be broadly defined as the ability of an explanation method (or explainer) to
produce similar explanations for similar inputs.</p>
      <p>When evaluating unbalanced datasets, assessing the reliability of explanations becomes even more
important. To better understand the problem, we can consider a simple example. Assume that a model
f(·) reaches an accuracy of 99% on the training dataset for a real-world problem. If the dataset is balanced
or with a limited factor of unbalance, we can conclude that the model is behaving appropriately. If
instead we assume to be working with a highly unbalanced dataset, e.g. P(y = 0) = 0.99 and
P(y = 1) = 0.01, we cannot come to the same conclusion. In fact, a 99% accuracy could be easily
obtained with a model that always predicts the class 0, which is obviously not the expected behaviour.
This example sparks an important consideration on explanation reliability: if the majority class, that is
the most frequently occurring one, is easily predicted by the model, the derived explanations cannot
be deemed trustworthy. Moreover, considering that most of the times the training of ML models with
unbalanced data is specifically adjusted to concentrate on the minority class, we cannot take for granted
that the model has correctly learned the specific facets of the majority class, even if the accuracy is high.</p>
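      <p>To make the example concrete, here is a minimal sketch (a synthetic 99:1 label vector and scikit-learn metrics, purely illustrative) showing that a trivial majority-class predictor attains roughly 99% accuracy while the minority-class F1-score collapses to zero:</p>
      <preformat>
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Synthetic labels matching the example: P(y = 0) = 0.99, P(y = 1) = 0.01.
rng = np.random.default_rng(0)
y_true = (rng.random(10_000) &lt; 0.01).astype(int)

# A "model" that always predicts the majority class 0.
y_pred = np.zeros_like(y_true)

print(accuracy_score(y_true, y_pred))         # ~0.99
print(f1_score(y_true, y_pred, pos_label=1))  # 0.0: the minority class is never detected
      </preformat>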
      <p>
        In this study, we address what has been discussed so far by presenting a preliminary evaluation of
if and how explanations can be trusted in the context of unbalanced datasets. For this purpose, we
take inspiration from the framework recently introduced in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which was proposed for evaluating
the trustworthiness of model explanations. We propose to focus on minority class explanations, as
accurately predicting rare events is crucial in high-risk applications. As a use-case, we take a real
scenario where a ML model is trained to predict the occurrence of extreme weather events, starting
from atmospheric tabular data. Specifically, we consider the occurrence of frost events, which are
inherently rare compared to normal weather conditions, leading to a highly unbalanced dataset where
instances of the extreme phenomenon are significantly fewer than non-event cases.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>An overview of the proposed method is illustrated in Figure 1. Let D = (X, y) be a dataset with n
points and m features, where y is the vector of class labels. Let y = 0 indicate the majority class and
y = 1 the minority class, such that P(y = 0) ≫ P(y = 1). Let D be divided into training, validation
and test sets and let f(·) be a neural network trained on the training set. Let e be an explanation method
and e(x) the explanation associated to the data point x and model f. If e is a feature importance method,
the explanation will also be referred to as feature attribution.</p>
      <p>Let x be a data point of interest and let x̃ be a perturbation such that d(x, x̃) &lt; ε with ε &gt; 0, where
d is a distance metric, e.g. the Euclidean distance. A local neighbourhood N of point x is defined as the
set of perturbed data points that are sufficiently close to the original data point and for which the model f
predicts the same class ŷ as for the original point.</p>
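      <p>As an illustration, this definition translates into a simple filter; the function and variable names below are our own, not taken from the paper’s code:</p>
      <preformat>
import numpy as np

def neighbourhood(x, candidates, model, eps):
    """Keep perturbations close to x that receive the same predicted class.

    x:          (m,) original data point
    candidates: (k, m) perturbed points
    model:      callable returning predicted class labels for a batch
    eps:        locality threshold on the distance d
    """
    dist = np.linalg.norm(candidates - x, axis=1)        # Euclidean d(x, x-tilde)
    same_class = model(candidates) == model(x[None, :])[0]
    return candidates[(dist &lt; eps) &amp; same_class]
      </preformat>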
      <sec id="sec-2-1">
        <title>2.1. Neighbourhood Generation</title>
        <p>In the context of XAI, a local explanation aims at understanding the reasoning behind a model’s
prediction for a specific input vector, say x. A key property of local explanations is their robustness, as
introduced in Section 1. By definition, locality plays a central role in the evaluation of local explanations
but it can lead to misleading results when the data manifold is not properly taken into account. A
neighbourhood N should not only be made of datapoints which are close to the original one, but also
be faithful to the observed data distribution: only in this scenario can the evaluation be deemed trustworthy.</p>
        <p>
          Leveraging the manifold hypothesis, we are able to construct neighbourhoods which are on-manifold
as in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Firstly, we apply the k-medoids clustering algorithm to the validation set, obtaining K
clusters that are, on average, of size s = 10. From each cluster C_k, we extract the medoid x_k as a
representative, compute its n_m = 5 nearest neighbours within the other (K − 1) cluster centres
and randomly select from this set a medoid x̄. For each datapoint in the test set, we predict the
corresponding medoid cluster and retrieve its medoid’s neighbours. If we assume that p represents the
probability of perturbing a numerical variable x_j, then a perturbation x̃ is computed feature-wise as:
        </p>
        <p>x̃_j = (1 − α) · x_j + α · x̄_j,  with α ∼ Beta(p · 100, (1 − p) · 100)  (1)</p>
        <p>A neighbourhood N created following this perturbation scheme should be at least of size n = 100. A
final filtering step is then performed to discard all perturbations that change the predicted class label.</p>
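        <p>A minimal sketch of this generation scheme, assuming scikit-learn-extra’s KMedoids and illustrative variable names (the authors’ actual implementation is in the linked GitHub repository and may differ, e.g. in how the neighbouring medoid x̄ is sampled):</p>
        <preformat>
import numpy as np
from sklearn_extra.cluster import KMedoids

# km = KMedoids(n_clusters=K).fit(X_val); medoids = km.cluster_centers_

def generate_neighbourhood(x, medoids, km, p=0.05, n_m=5, n=100, rng=None):
    """On-manifold perturbations of x via interpolation towards nearby medoids (Eq. 1)."""
    rng = rng or np.random.default_rng()
    k = km.predict(x[None, :])[0]                   # cluster of the test point
    centre = medoids[k]
    others = np.delete(np.arange(len(medoids)), k)  # the other (K - 1) cluster centres
    order = np.argsort(np.linalg.norm(medoids[others] - centre, axis=1))
    nearest = others[order[:n_m]]                   # n_m nearest neighbouring medoids
    neighbours = []
    for _ in range(n):
        x_bar = medoids[rng.choice(nearest)]        # randomly selected medoid x-bar
        alpha = rng.beta(p * 100, (1 - p) * 100, size=x.shape)  # mean p, feature-wise
        neighbours.append((1 - alpha) * x + alpha * x_bar)
    return np.stack(neighbours)                     # filter same-class points afterwards
        </preformat>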
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Local Averaging</title>
        <p>Given a neighbourhood N of size n, constructed via Equation 1, we apply an explanation method e to both
the original data point and its perturbations, retrieving the local explanations e(x) and e(x̃), with x̃ ∈ N.</p>
        <p>For points belonging to the minority class y = 1, we can compute a local weighted attribution in
order to retain local information. In particular, the averaged explanation is:</p>
        <p>ē(x) = ( ∑_{x̃∈N} e(x̃) · w(x, x̃) ) / ( ∑_{x̃∈N} w(x, x̃) ),  where w(x, x̃) = 1 / d(x, x̃)  (2)</p>
        <p>Using the aggregated explanation allows us to take into account the results of a data augmentation on
the minority class, which is ensured to be faithful as it lies on-manifold. Explanations are weighted
according to the distance between the original data point and the perturbed one: same-class perturbations
which are more distant are given a smaller weight in the aggregated explanation.</p>
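        <p>A sketch of the aggregation in Equation 2, again with hypothetical function and argument names of our choosing:</p>
        <preformat>
import numpy as np

def aggregate_explanations(x, neighbours, explanations, eps=1e-12):
    """Distance-weighted average of neighbour explanations (Eq. 2).

    neighbours:   (n, m) array of same-class perturbations x-tilde
    explanations: (n, m) array of feature attributions e(x-tilde)
    """
    # w(x, x-tilde) = 1 / d(x, x-tilde): closer perturbations weigh more.
    w = 1.0 / (np.linalg.norm(neighbours - x, axis=1) + eps)
    return (explanations * w[:, None]).sum(axis=0) / w.sum()
        </preformat>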
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Proposed Evaluation</title>
        <p>We assess the reliability of the minority class explanations by verifying if the locally weighted attribution
is robust. In particular, we compare two locally-computed scores on N.</p>
        <p>
          The first score ℛ̂(x) - local robustness [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] - is computed locally by considering the Spearman rank
correlation ρ between the explanation of the point of interest and those of its perturbed neighbours:
        </p>
        <p>ℛ̂(x) = (1 / |N|) ∑_{x̃∈N} ρ(e(x), e(x̃))  (3)</p>
        <p>The second score Ĉ(x), referred to as consistency, computes the rank correlation between the original
explanation and the locally weighted averaged one (Equation 2):</p>
        <p>Ĉ(x) = ρ(e(x), ē(x))  (4)</p>
        <p>The consistency score allows us to investigate if the aggregated explanation is locally faithful to the
original one, ensuring that it summarizes useful information on the model decision making.</p>
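        <p>Both scores reduce to Spearman rank correlations; a compact sketch, assuming SciPy as the implementation of ρ:</p>
        <preformat>
import numpy as np
from scipy.stats import spearmanr

def robustness(e_x, e_neigh):
    """Local robustness (Eq. 3): mean rank correlation with neighbour explanations."""
    return np.mean([spearmanr(e_x, e_t).correlation for e_t in e_neigh])

def consistency(e_x, e_bar):
    """Consistency (Eq. 4): rank correlation with the aggregated explanation (Eq. 2)."""
    return spearmanr(e_x, e_bar).correlation
        </preformat>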
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Use-case</title>
      <p>To get some preliminary insight on if and how explanations on unbalanced datasets can be trusted,
we apply the described methodology (Figure 1) to a real-world use-case. The case study was chosen
to reflect the needs of a practitioner who works with an unbalanced dataset, has to apply XAI
techniques, and wants to determine whether the explanations received can be trusted or not. The code
implementation is available on GitHub at https://github.com/ilariavascotto/Reliability_Unbalanced.</p>
      <sec id="sec-3-1">
        <title>3.1. Dataset Description</title>
        <p>
          We selected the problem of identifying the occurrence of frost events, using atmospheric data as
predictors. Frost is critical for agriculture, but rare and difficult to predict. A model that quantifies the
occurrence of frost can be the first step for practitioners to estimate agricultural business policies in
regions where data are limited. We used the publicly available ERA5 reanalysis data [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] from the
European Centre for Medium-Range Weather Forecasts (ECMWF) to construct the input features. We
selected 8 numerical variables, each standardized to zero mean and unit variance. Since all features are
numerical, we used the Euclidean distance to compute d in Equation 2. Data are aggregated by day and
by municipality and only the spring/summer months are retained. Target data were instead obtained
from proprietary datasets on insurance policies and claims, and take on a value of 0 or 1, depending on
whether the event occurred on that particular day/municipality. We considered a period of 15 years (2009-2024)
over the territory of Poland. The obtained dataset is highly unbalanced, with class frequencies
of 0.99 and 0.01 for the majority class 0 and minority class 1, respectively.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model and Training</title>
        <p>
          The model’s architecture is a simple fully connected neural network, which takes the input features
and processes them through five consecutive linear layers. The first four layers are followed by a ReLU
activation function, while the final output layer applies a sigmoid activation to produce a probability
score for classification. Data were split into three subsets: train (75%), validation (15%) and test (10%).
The split was performed using a stratified approach based on municipalities, ensuring that all days of
a municipality are assigned to the same set, either training or validation/test. We trained the model
using the focal loss (FL), first introduced by [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and shown in Equation 5. FL modifies the cross entropy
(CE) loss formulation by introducing a hyper-parameter γ which down-weights the contribution of easy
examples and guides the model to focus on hard examples. Additionally, the hyper-parameter α is used
to handle the class imbalance. We performed a random search to tune the loss hyper-parameters and
found that γ = 2.5 and α = 0.75 give the best performance on the validation set.
        </p>
        <p>FL(p_t) = −α_t · (1 − p_t)^γ · log(p_t)  (5)</p>
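        <p>A minimal PyTorch-style sketch of Equation 5 for the binary setting used here (our reimplementation, not the authors’ code; torchvision also ships an equivalent sigmoid_focal_loss):</p>
        <preformat>
import torch

def focal_loss(p, y, alpha=0.75, gamma=2.5, eps=1e-8):
    """Binary focal loss (Eq. 5): p = predicted probability of class 1, y = labels in {0, 1}."""
    y = y.float()
    p_t = p * y + (1 - p) * (1 - y)              # probability assigned to the true class
    alpha_t = alpha * y + (1 - alpha) * (1 - y)  # class-balancing weight
    return (-alpha_t * (1 - p_t) ** gamma * torch.log(p_t + eps)).mean()
        </preformat>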
        <p>
          We trained the model for 100 epochs with a batch size of 256, using the RAdam [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] optimizer with
learning rate 0.0001. As a metric to evaluate the model performance we used the F1-score, which is particularly
suited for the case of unbalanced datasets. The results are reported in Table 1. We observe that the
F1-score on the majority class 0 reaches its best value of 1.0 in all three sets, while on the minority class 1
the obtained values are 0.66, 0.50 and 0.51 on the train, validation and test sets, respectively. Although
the performance is not optimal, we consider the results achieved to be good given the imbalance ratio
and especially the limited number of data points available for the minority class.
        </p>
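        <p>For concreteness, a sketch of the described architecture and training configuration; the hidden layer widths are our assumption, as the paper fixes only the depth, activations, loss and optimizer settings:</p>
        <preformat>
import torch.nn as nn
import torch.optim as optim

# Five consecutive linear layers: ReLU after the first four, sigmoid on the output.
model = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),     # 8 standardized ERA5 input features
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1), nn.Sigmoid(),  # probability of a frost event
)

# RAdam optimizer with the reported learning rate; 100 epochs, batch size 256.
optimizer = optim.RAdam(model.parameters(), lr=1e-4)
        </preformat>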
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Preliminary Results</title>
      <p>
        As model-agnostic methods such as LIME and SHAP have been shown to exhibit poor robustness
[
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ], we focused on explanation methods which are specifically tailored to neural
networks, as this is the model chosen in the use-case. In particular, we considered four explainability
methods: Integrated gradients [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], DeepLIFT [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Layerwise Relevance Propagation (LRP) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and
the ensemble approach proposed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The first three methods are local post-hoc explanations that
make use of the backpropagation procedure inherent in neural network training to backpropagate a
signal (often referred to as relevance) from the output layer to the input one. The methods differ in the
backpropagation rules used and in the possible presence of a baseline. Finally, the ensemble method is
built from Integrated Gradients, DeepLIFT and LRP and aims at limiting the undesired effects of the
disagreement problem [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], by providing a weighted average of multiple explanations.
      </p>
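      <p>All four attribution methods are available off the shelf, e.g. in the Captum library for PyTorch; the following sketch shows how they might be invoked on the trained network (our illustration of the standard API, not necessarily the authors’ tooling):</p>
      <preformat>
import torch
from captum.attr import IntegratedGradients, DeepLift, LRP

x = torch.randn(1, 8, requires_grad=True)  # one standardized test point (8 features)

# Feature attributions for the single (frost probability) output of the model.
attr_ig = IntegratedGradients(model).attribute(x, baselines=torch.zeros_like(x))
attr_dl = DeepLift(model).attribute(x, baselines=torch.zeros_like(x))
attr_lrp = LRP(model).attribute(x)  # the propagation rule choice matters (see below)
      </preformat>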
      <p>We evaluated the data points belonging to the test set. We firstly investigated the impact of the
neighbourhood generation on both the majority and minority class. In particular, we considered both
a random neighbourhood generation, in which Gaussian noise ε ∼ N(0, σ²) is added to the numerical
variables, and the medoid-based approach presented in Subsection 2.1. By comparing the robustness of
Integrated Gradients (Figure 2) we show that there is indeed a difference between the two neighbourhood
generating approaches and that the minority class is the most affected one. In fact, the random
generating scheme is associated with a larger distribution shift in the minority class than in the majority
class. The negative effect of using a random neighbourhood is also evident if we consider the other
explainability methods analysed. These results support our hypothesis that, in highly unbalanced
datasets, the minority class is particularly vulnerable and requires careful analysis.</p>
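      <p>The random baseline amounts to the following sketch, with σ a free parameter of the baseline (our notation):</p>
      <preformat>
import numpy as np

def random_neighbourhood(x, n=100, sigma=0.1, rng=None):
    """Off-manifold baseline: add Gaussian noise from N(0, sigma^2) to every feature."""
    rng = rng or np.random.default_rng()
    return x + rng.normal(0.0, sigma, size=(n, x.shape[0]))
      </preformat>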
      <p>We then focused on the minority class data points (y = 1) and we built a neighbourhood of size
n = 100 for each point, following the medoid-based generation scheme with hyperparameters n_m = 5
and p = 0.05. The hyperparameters were set through a random search to ensure that, on the validation
set, at least 95% of the generated data points were classified as their corresponding original ones.</p>
      <p>Figure 3 shows that the explanations may be more or less locally robust, depending on the considered
method. We observe that the least stable method in this case-study is LRP, as confirmed by the lowest
overall mean value and large standard deviation presented in Table 2. This technique at times suffers
from the vanishing gradient problem, returning zero-vector attributions as a result. To limit the effect of
uninformative explanations, we selected the epsilon propagation rule, as it was the one minimizing the
vanishing gradient problem on this dataset. The choice of the propagation rule can significantly impact
the results and should be carefully selected as a dataset-specific feature. When comparing the ensemble
method robustness to the other two explainers, we need to take into account that its robustness is also
influenced by LRP.</p>
      <p>We showed that the individual explanations lie in a robust area of the data manifold, as indicated by the
robustness scores derived on the test set for the four explainers (the larger the score, the more robust
the area). However, this is not enough, as we also need to ensure that the explanations are meaningful.</p>
      <p>Our aim is to compute the locally weighted attribution ē(x) via Equation 2 and to test if it is consistent
with the original point’s explanation. This is done by computing the consistency score Ĉ(x) via
Equation 4. Since we showed that the data points lie in a robust area of the data space, we can leverage
on-manifold information to enrich the set of data points predicted to be in class y = 1 and produce an
individual explanation which takes into account also the neighbouring points. Increasing the amount
of information considered when providing an individual explanation for a point of the minority class
helps in understanding the decision making process of the model.</p>
      <p>Figure 4 presents the consistency scores of the four explainers on the test set: we observe that the
two methods which have higher robustness scores ℛ̂(x), namely Integrated Gradients and DeepLIFT,
are associated with larger consistency scores Ĉ(x). This was expected, as the local explanations are more
similar to one another and the averaged explanation remains highly correlated with the original one. In
contrast, it is interesting to note that both LRP and the ensemble aggregation, despite having a lower
robustness score due to the influence of the LRP method, are still able to produce satisfactorily consistent
aggregated explanations within the generated neighbourhood.</p>
      <p>Table 2 presents the mean robustness and consistency scores with the corresponding standard
deviation reported within brackets. It can be seen that DeepLIFT is the most robust and most consistent
method in the considered use-case, being associated with both the highest mean value and the lowest
standard deviation. Depending on the practitioner’s needs, the ensemble could also be deemed a
valid candidate, as it jointly considers the effects of multiple explanations, addressing possible
inconsistencies by construction, while offering satisfactory robustness and consistency scores.</p>
      <p>The consistency scores show that retaining local information from carefully crafted on-manifold
neighbourhoods can be beneficial for the minority class explanations. In particular, the locally averaged
explanation maintains more local information than the individual explanation on a given data point
and has been shown to encapsulate explanations of a robust area of the data manifold.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Developments</title>
      <p>Although this line of research is still in its early stages, we argue that it could bring to light
useful insights for dealing with unbalanced datasets in practical use-cases. We believe that the analysis
could be further enriched by defining a metric to quantify the quality of an explanation in such complex
use-cases, considering both the robustness and the consistency as defined in Section 2. Moreover, it
would be interesting to test how an uncertainty analysis on the minority class could be beneficial for
the correct evaluation of these datasets from both a technical and a practitioner point of view.</p>
      <p>As next steps, we plan to investigate how the results change as the performance of the neural
network model varies and, at the same time, investigate the effects of a varying imbalance ratio
between the two classes. The analysis will be extended to different datasets, both public and private,
in order to maintain the real-world aspect of interest to practitioners, while at the same time favouring the
reproducibility of the results. In future research stages, we also plan to investigate different machine
learning models for improving performance with unbalanced datasets (e.g. by explicitly taking into
account the spatio-temporal dimensions that often characterise many datasets) and to adapt the choice of
XAI methodologies and reliability reasoning accordingly.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We wish to thank Assicurazioni Generali Spa for their support and interest in our work.
The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] European Commission, Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance), 2016. https://eur-lex.europa.eu/eli/reg/2016/679/oj.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] European Commission, Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain union legislative acts (COM(2021) 206 final), 2021. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A52021PC0206.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] M. Ribeiro, S. Singh, C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier, in: J. DeNero, M. Finlayson, S. Reddy (Eds.), Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, Association for Computational Linguistics, San Diego, California, 2016, pp. 97-101. URL: https://aclanthology.org/N16-3020/. doi:10.18653/v1/N16-3020.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] S. M. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, Curran Associates Inc., 2017, pp. 4766-4775. https://dl.acm.org/doi/10.5555/3295222.3295230.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] I. Vascotto, A. Rodriguez, A. Bonaita, L. Bortolussi, When can you trust your explanations? A robustness analysis on feature importances, 2025. arXiv:2406.14349.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] H. Hersbach, B. Bell, P. Berrisford, S. Hirahara, A. Horányi, J. Muñoz-Sabater, J. Nicolas, C. Peubey, R. Radu, D. Schepers, et al., The ERA5 global reanalysis, Quarterly Journal of the Royal Meteorological Society 146 (2020) 1999-2049. doi:10.1002/qj.3803.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] T.-Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollár, Focal loss for dense object detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (2020) 318-327. doi:10.1109/TPAMI.2018.2858826.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] L. Liu, H. Jiang, P. He, W. Chen, X. Liu, J. Gao, J. Han, On the variance of the adaptive learning rate and beyond, 2021. arXiv:1908.03265.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] D. Slack, S. Hilgard, E. Jia, S. Singh, H. Lakkaraju, Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods, in: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, AIES '20, Association for Computing Machinery, New York, NY, USA, 2020, pp. 180-186. URL: https://doi.org/10.1145/3375627.3375830. doi:10.1145/3375627.3375830.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] A. Gosiewska, P. Biecek, Do not trust additive explanations, 2020. URL: https://doi.org/10.48550/arXiv.1903.11420. arXiv:1903.11420.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] M. Sundararajan, A. Taly, Q. Yan, Axiomatic attribution for deep networks, in: Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML'17, JMLR.org, 2017, pp. 3319-3328. https://dl.acm.org/doi/10.5555/3305890.3306024.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] A. Shrikumar, P. Greenside, A. Kundaje, Learning important features through propagating activation differences, in: Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML'17, JMLR.org, 2017, pp. 3145-3153. https://dl.acm.org/doi/10.5555/3305890.3306006.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, W. Samek, On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation, PLoS ONE 10(7) (2015). doi:10.1371/journal.pone.0130140.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] S. Krishna, T. Han, A. Gu, S. Wu, S. Jabbari, H. Lakkaraju, The disagreement problem in explainable machine learning: A practitioner’s perspective, Transactions on Machine Learning Research (2024). doi:10.21203/rs.3.rs-2963888/v1.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>