<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>HI-AI@KDD, Human-Interpretable AI Workshop at the KDD</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saif Anwar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nathan Griffiths</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Warwick</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Engineering, University of Warwick</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>26</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Existing local Explainable AI (XAI) methods, such as LIME, select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model. The size of this region is often controlled by a user-defined locality hyperparameter. In this paper, we demonstrate the difficulties associated with defining a suitable locality size to capture impactful model behaviour, as well as the inadequacy of using a single locality size to explain all predictions. We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained. MASALA approximates the local behaviour used by a complex model to make a prediction by fitting a linear surrogate model to a set of points which experience similar model behaviour. These points are found by clustering the input space into regions of linear behavioural trends exhibited by the model. We compare the fidelity and consistency of explanations generated by our method with existing local XAI methods, namely LIME and CHILLI. Experiments on the PHM08 and MIDAS datasets show that our method produces more faithful and consistent explanations than existing methods, without the need to define any sensitive locality hyperparameters.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI (XAI)</kwd>
        <kwd>Interpretable Machine Learning</kwd>
        <kwd>Explanation</kwd>
        <kwd>Model-Agnostic</kwd>
        <kwd>Post-Hoc</kwd>
        <kwd>Local Linear Modelling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Many Machine Learning (ML) methods are treated as black-boxes because of their complex
and often incomprehensible behaviour. As a result, there is tentative adoption in high-risk
domains, such as healthcare, finance, and defence. There is a demand from stakeholders to
establish trust in a model, since an incorrect decision may have serious consequences [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ].
Explainable AI (XAI) methods aim to provide explanations for the predictions produced by a
model and make transparent its behaviours [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In this paper, we define an explanation to be an
interpretable representation of a base model’s decision-making process, such as in the form of
feature importance scores or a set of decision rules.
      </p>
      <p>
        Generally, XAI techniques can be divided into inherently-interpretable models and
post-hoc methods. The former involves developing model architectures which are interpretable by
design and do not require extensive effort to understand the reasoning for a given output [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ].
However, it is generally agreed that limiting complexity, for the sake of interpretability, may
hinder performance when compared to more complex black-box models [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>
        Post-hoc XAI methods generate explanations for pre-trained black-box models, often in a
model-agnostic manner [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Popular methods typically explain predictions through feature
importance scores, either on a global or local scale [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ]. Global methods explain general
model behaviour for all datapoints, whereas local methods explain the model behaviour used to
make a specific prediction. A local explanation is constrained to a locality, namely a region of
the input space surrounding the instance for which the prediction is being explained, which we
call the target instance. Some local XAI methods fit an inherently interpretable surrogate model
in a locality around a target instance, where the interpretation of the surrogate model is the
explanation, such as the coefficients of a regression model or rules established in a decision tree.
      </p>
      <p>
        LIME [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is a popular approach that fits a surrogate model to perturbations of a target
instance that are generated by randomly sampling a Gaussian kernel centered around the
target instance. The width of the kernel, which is manually defined by the user and is fixed
for all explanations, controls the expanse of the perturbations and therefore the size of the
explanation locality. It has been shown that a surrogate model fit to perturbations sampled from
an inappropriately sized Gaussian kernel may not be representative of the base model's training
data and therefore does not represent the true base model behaviour [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. CHILLI [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is an
adaptation of LIME which addresses some of these issues. Perturbations are generated according
to the distribution of the model training data in the vicinity of the target instance. However,
CHILLI also requires the user to define the locality of each explanation. The locality size must
be defined in such a way that it is not too large, where the explanation ignores model intricacies,
and not too small, where the explanation focuses on anomalous fluctuations rather than more
dominant behavioural trends. This is illustrated in Figure 1, which shows how the locality size
affects the behaviour of the surrogate model and how a fixed locality may not be appropriate for
all instances. Since LIME and CHILLI use non-deterministic perturbation generation methods,
explanations for the same prediction may differ, leading to a lack of consistency that undermines
the trustworthiness of the explanation method [15, 16, 17].
      </p>
      <p>In this paper, we propose Model-Agnostic Surrogate explAnations by Locality Adaptation
(MASALA), a novel post-hoc XAI method that automatically finds the impactful local model
behaviour surrounding a target instance. MASALA fits a Multivariate Linear Regression (MLR)
surrogate model to a set of points that experience the same linear behaviour as the target
instance, which it obtains by automatically detecting the linear regions of model behaviour
across the input domain. The coefficients of the MLR represent the relationship of each feature
with the target distribution. Since MASALA generates explanations using a deterministic clustering,
explanations for the same instance are identical and therefore are consistent. Using the PHM08
and MIDAS datasets, we qualitatively and quantitatively demonstrate the ability of MASALA
to generate explanations with higher fidelity and consistency than those produced by LIME
and CHILLI. Our source code and data are available through the following repository:
https://github.com/saifanwar/MASALA</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Existing works have attempted to address locality issues by clustering similar points to fit linear
surrogate models. Zafar et al. [18] propose DLIME, which uses agglomerative Hierarchical
Clustering to divide the training data into groups of similar points according to their Euclidean
distance across all features. However, points that are clustered together may be close in some
feature dimensions but distant in others, and therefore may not experience similar model
behaviour across all features.</p>
      <p>It has been shown that LIME generates inconsistent explanations when using a locality size
that is too small, since explanations may focus on irregularities introduced by the randomly
sampled perturbations. Gaudel et al. propose s-LIME [19], which generates perturbations whose
distance is proportional to the magnitude of the selected kernel width. However, this still requires
the locality to be manually defined. Local Surrogate [17] avoids manual locality definition by
generating perturbations around the decision boundary closest to the instance being explained,
therefore approximating the model behaviour that led to the prediction. However, this may not
be applicable to a regression problem where there is no decision boundary. In ALIME [20], an
autoencoder is trained as a weighting function used to decide whether perturbations are local
to the target instance. Although this leads to more consistent explanations, the threshold for
discarding points must be manually defined and is effectively equivalent to the kernel width
hyperparameter in LIME.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The goal of MASALA is to fit a linear surrogate model to a set of points that experience similar
model behaviour to a specified target instance. We first formalise the problem and then describe
the details of the method.
</p>
      <sec id="sec-3-1">
        <title>3.1. Problem Definition</title>
        <p>Consider a black-box base model f, which maps an M-dimensional input vector x ∈ ℝᴹ to a
scalar output y ∈ ℝ, and is trained on a dataset X. The prediction for a given target instance x
is explained by training an MLR surrogate model g on a subset of the training data X′ ⊆ X, with
target values being the predictions on X′ from the base model f. The linear coefficients of the
MLR directly indicate the contribution of each feature towards the prediction. The selection of
instances to include in X′ defines the locality of the explanation, since only the model behaviour
used to make predictions for those instances will be approximated. Only instances that have
similar feature values and experience similar model behaviour to the target instance should
be included in X′. However, identifying a set X′ for a given x is non-trivial, since the model
behaviour may vary across the feature space. MASALA identifies an appropriate subset X′ for
training a surrogate MLR model, by finding all instances that share the same region of linearity
in the distribution of model predictions as the target instance x. Since we assume that the
base model behaviour is locally linear for some region around all data instances, we propose
exhaustively clustering the distribution of each input feature against the model predictions
into regions of linearity. The set of K identified linear regions, or clusters, for a given feature
dimension m, is denoted as Cᵐ, such that Cᵐ = {cₖᵐ | ∀k ∈ K}. Once a clustering has been
obtained, the cluster within which a target instance x falls, in each feature dimension, can be
identified. This is denoted as cᵐ(x), which indicates that in the distribution of feature m, the
target instance falls within linear region cᵐ. The subset of instances used to train the surrogate
g is then defined as the set of instances which share the same linear region in each feature
dimension as the instance x, as formulated in Equation 1.</p>
        <p>X′ = {xᵢ ∈ X | cᵐ(xᵢ) = cᵐ(x), ∀m ∈ M} (1)</p>
        <p>Figure 2 shows the distributions of two features from the same dataset, against the respective
model predictions, along with a target instance to be explained. Each feature has been clustered
into a different number of linear regions, with the target instance falling within the blue and
pink regions for features m = 1 and m = 2 respectively. The set X′ contains the points that
also fall within both the blue and pink clusters. Since X′ does not change for a given instance,
explanations generated for the same instance are identical, thus preserving consistency.</p>
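The subset selection in Equation 1 reduces to intersecting per-feature cluster memberships. Below is a minimal sketch, assuming per-feature cluster labels have already been computed, and using an ordinary least-squares fit as a stand-in for the paper's MLR surrogate (function names are illustrative, not from the released code):

```python
import numpy as np

def local_subset(cluster_labels, target_idx):
    """Indices of instances sharing the target's linear region in every
    feature dimension (Equation 1). cluster_labels[i, m] is the cluster id
    of instance i in the clustering of feature m."""
    target_regions = cluster_labels[target_idx]             # c^m(x) for each m
    same_region = np.all(cluster_labels == target_regions, axis=1)
    return np.flatnonzero(same_region)

def explain(X, f_preds, cluster_labels, target_idx):
    """Fit an MLR surrogate on the local subset X'; its coefficients are the
    per-feature contributions that form the explanation."""
    idx = local_subset(cluster_labels, target_idx)
    A = np.column_stack([X[idx], np.ones(len(idx))])        # intercept column
    coefs, *_ = np.linalg.lstsq(A, f_preds[idx], rcond=None)
    return coefs[:-1], coefs[-1]                            # feature weights, intercept
```

Because the subset is determined entirely by the precomputed clustering, repeated calls for the same target return identical explanations.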
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Local Linear K-Medoids Clustering</title>
        <p>We now present our method for identifying regions of linear model behaviour in the distribution
of an input feature and the predictions from the base model. We consider each feature dimension
individually, and will therefore omit the superscript m in the remainder of this section. For
example, Cᵐ will be denoted as C and cᵐ will be denoted as c.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Pairwise Dissimilarity</title>
          <p>We apply an adapted K-medoids algorithm [21] which clusters datapoints based on their pairwise
Euclidean distance. However, points that are close in the feature space do not necessarily
experience the same linear model behaviour. We introduce a custom distance measure in the
form of pairwise dissimilarity Δ, which compares the local linearity, feature value, and local
density of points. For each datapoint, i, a weighted Local Linear Regression (LLR) is performed
on all points within its neighbourhood, N(i), which is defined using a distance threshold
in the feature space. Although this may be seen as defining a similar threshold to the kernel
width locality parameter in LIME, the final explanation is much more robust to changes in the
threshold since the weighted LLR automatically considers closer points with greater importance.
The dissimilarity between two points i and j is calculated as</p>
          <p>Δ(i, j) = ‖wᵢ − wⱼ‖₂ + δ(xᵢ, xⱼ) + ||N(i)| − |N(j)||, (2)</p>
          <p>where wᵢ represents the vector of LLR model parameters for point i, and |N(i)| denotes
the number of points in the neighbourhood of i. The second term, δ(xᵢ, xⱼ), is the difference
in feature values of i and j, which may be a custom measure dependent on the type of
feature. Including this distance ensures that points with similar neighbourhood trends that are
in different regions of the feature space are not clustered together. The final term allows for the
inclusion of local data density, since two points may share a similar linear neighbourhood trend,
but one neighbourhood may exhibit more sparsity than the other. A linear model fit to a sparse
neighbourhood may be skewed by anomalous datapoints, and therefore should be considered
with more caution. It should be noted that all terms are normalised to the range [0, 1] such that
they have equal contribution towards the dissimilarity measure. Points with the smallest values
of Δ will be close together in the feature space, have similar LLR model parameters, and have
similar neighbourhood density.</p>
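The dissimilarity of Equation 2 can be sketched as follows for a single feature dimension. The triangular neighbourhood weighting, the slope-plus-intercept LLR parameterisation, and the min-max normalisation of each term are assumptions; the paper does not fix these details:

```python
import numpy as np

def llr_params(x, y, radius):
    """Weighted local linear fit (slope, intercept) around every point, plus
    neighbourhood sizes |N(i)|. Closer neighbours receive larger weights."""
    n = len(x)
    W = np.empty((n, 2))
    sizes = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        nbr = d <= radius
        w = 1.0 - d[nbr] / (radius + 1e-12)      # triangular weighting (assumed)
        A = np.column_stack([x[nbr], np.ones(nbr.sum())])
        W[i], *_ = np.linalg.lstsq(A * w[:, None], y[nbr] * w, rcond=None)
        sizes[i] = nbr.sum()
    return W, sizes

def _minmax(M):
    rng = M.max() - M.min()
    return (M - M.min()) / rng if rng > 0 else np.zeros_like(M)

def dissimilarity_matrix(x, y, radius):
    """Equation 2: LLR-parameter distance + feature-value distance +
    neighbourhood-size difference, each min-max normalised to [0, 1]."""
    W, sizes = llr_params(x, y, radius)
    d_w = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2)
    d_x = np.abs(x[:, None] - x[None, :])
    d_n = np.abs(sizes[:, None] - sizes[None, :])
    return _minmax(d_w) + _minmax(d_x) + _minmax(d_n)
```

On a piecewise-linear response, two points in the same linear segment end up far less dissimilar than two points straddling the breakpoint, even when the latter pair has comparable neighbourhood sizes.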
          <p>Points are clustered according to their pairwise dissimilarity using the K-medoids algorithm
[21]. To allow for a deterministic clustering, medoids are initialised by evenly distributing K
medoids across the sorted values from the data. A Linear Regression (LR) model is fit to the
points within each cluster, where aₖ and bₖ are the LR parameters within cluster cₖ. The cost, J,
for a clustering C is calculated according to Equation 3.</p>
          <p>J(C) = ∑_{k=1}^{K} RMSE({aₖ · xᵢ + bₖ, f(xᵢ) | ∀i ∈ cₖ}) (3)</p>
          <p>Thus, the cost is the sum of the RMSEs between predictions from the LR model and base model
predictions within each cluster. Rather than randomly assigning a new medoid for a cluster, the
clustering cost is calculated with each point as the medoid, and the medoid which gives the lowest
clustering cost is selected. This is repeated for all clusters, with the algorithm halting when the
clustering cost no longer changes after optimising all clusters; this too is a deterministic process.
A lower clustering cost reflects an ensemble of linear proxy models that is more faithful to the
behaviour of the base model.</p>
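The deterministic initialisation and the Equation 3 cost can be sketched as below; the helper names are illustrative, and a simple polynomial fit stands in for the per-cluster LR:

```python
import numpy as np

def init_medoids(x, K):
    """Deterministic initialisation: K medoid indices evenly spaced across
    the sorted feature values."""
    order = np.argsort(x)
    return order[np.linspace(0, len(x) - 1, K).round().astype(int)]

def clustering_cost(x, f_preds, labels):
    """Equation 3: sum over clusters of the RMSE between a per-cluster
    linear fit and the base-model predictions."""
    cost = 0.0
    for k in np.unique(labels):
        xs, ys = x[labels == k], f_preds[labels == k]
        a_k, b_k = np.polyfit(xs, ys, 1)          # cluster LR parameters
        cost += np.sqrt(np.mean((a_k * xs + b_k - ys) ** 2))
    return cost
```

On data with two linear regimes, a clustering that splits at the breakpoint drives the cost to (numerically) zero, while a single cluster forced through both regimes incurs a clearly positive cost.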
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Automatically Defining K</title>
          <p>It may seem that increasing the number of clusters, K, would generate a more faithful ensemble
of linear models. However, doing so leads to clusters with smaller coverage, which may be
overfit to erroneous behaviour rather than to more impactful general linear trends. This effect
can be worsened in sparse regions of the input space. The relationship between an input feature
and model predictions will vary across features and datasets, and therefore the appropriate
number of linear regions to cluster also varies. We propose an algorithm that automatically
finds a suitable value for K, given a set of constraints, such that clusters must not overlap in
the feature space, must contain a sufficient number of datapoints, and must cover a suitable
range of values in the feature space.</p>
          <p>If a cluster is wholly contained within another cluster, the larger cluster will adopt all
datapoints of the smaller cluster. If two clusters overlap each other, two new clusters replace
them by dividing the points at the midpoint of the overlapping clusters’ intersection. Once all
clusters have been separated, they each occupy a unique range of values in the feature space.</p>
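The containment and overlap rules can be sketched for 1-D clusters represented as a label array; the in-place repair loop below is an illustrative reading of the two rules, not the released implementation:

```python
import numpy as np

def resolve_overlaps(x, labels):
    """Enforce non-overlapping 1-D clusters: a cluster wholly inside another
    is absorbed by it; partially overlapping clusters are split at the
    midpoint of their intersection."""
    changed = True
    while changed:
        changed = False
        ks = np.unique(labels)
        bounds = {k: (x[labels == k].min(), x[labels == k].max()) for k in ks}
        for a in ks:
            for b in ks:
                if a == b:
                    continue
                lo_a, hi_a = bounds[a]
                lo_b, hi_b = bounds[b]
                if lo_a <= lo_b and hi_b <= hi_a:          # b contained in a
                    labels[labels == b] = a
                    changed = True
                elif lo_a < lo_b <= hi_a < hi_b:           # partial overlap
                    mid = (lo_b + hi_a) / 2                # midpoint of intersection
                    in_pair = (labels == a) | (labels == b)
                    labels[in_pair & (x <= mid)] = a
                    labels[in_pair & (x > mid)] = b
                    changed = True
                if changed:
                    break
            if changed:
                break
    return labels
```

Each pass either merges a contained cluster or separates an overlapping pair, so after the loop every surviving cluster occupies a unique interval of the feature.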
          <p>To avoid skewing to anomalous or non-significant behaviours, all clusters should contain
sufficient datapoints, and therefore the data sparsity of each cluster is checked. The sparsity of
a cluster is defined as the ratio between the number of points it contains and the number of
points in the largest cluster, c_max, leading to a dynamic sparsity measure that is relative to the
current clustering. If the current largest cluster contains a relatively small number of datapoints
from the entire dataset, the sparsity measure considers that all clusters are generally small and
there may be intricate relationships within the data. The criterion for deciding whether cluster
cₖ is sparse is shown in Equation 4,</p>
          <p>|cₖ| / |c_max| &lt; (1/N²) ∑_{i,j ∈ X} |xᵢ − xⱼ|, (4)</p>
          <p>where N is the total number of samples in the dataset. The right-hand side of the formulation
is the average pairwise distance between all points in the dataset and is used as a sparsity
threshold, since it provides a measure of the average density of the data. If the sparsity of a
cluster falls below the threshold, it is combined with a neighbouring cluster in the feature space.
The combination that gives the lowest clustering cost when merging the sparse cluster with
each of its neighbours is selected for the new clustering. Similarly, the coverage of each cluster
is also checked, where coverage is calculated as the percentage of the input space occupied. If
the coverage falls below the same threshold used for sparsity, the cluster is combined with a
neighbouring cluster using the same protocol as for a sparse cluster.</p>
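The sparsity test of Equation 4 can be sketched as below; for the size ratio and the average pairwise distance to be comparable, the feature is assumed to be scaled to [0, 1] (an assumption, in line with the normalisation used elsewhere in the method):

```python
import numpy as np

def sparsity_threshold(x):
    """RHS of Equation 4: the average pairwise distance between all points
    in the (scaled) feature, a proxy for average data density."""
    n = len(x)
    return np.abs(x[:, None] - x[None, :]).sum() / n**2

def sparse_clusters(x, labels):
    """Ids of clusters whose size, relative to the largest cluster, falls
    below the sparsity threshold; these are merged with a neighbour."""
    ks, counts = np.unique(labels, return_counts=True)
    ratio = counts / counts.max()
    return ks[ratio < sparsity_threshold(x)].tolist()
```

For a uniformly spread feature on [0, 1] the threshold sits near 1/3, so only clusters holding well under a third as many points as the largest cluster are flagged.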
          <p>To obtain a clustering, we start with some arbitrarily large K. An initial clustering is generated
by selecting K random medoids and assigning points to the cluster for which the medoid is
most similar according to Δ. The cost of this clustering is calculated using Equation 3. A
new clustering is generated by randomly selecting a new medoid for a random cluster and
reassigning all points. If the new clustering is of lower cost, it replaces the previous clustering.
This is repeated until the cost of the clustering is unchanged by selecting a new medoid. The
clustering is then checked against the constraints and modified if necessary, which may lead
to a change in the number of clusters. If so, K is redefined as the new number of clusters and
a new clustering is obtained in the same manner as outlined above, to find the lowest cost
clustering for the new value of K. This process is repeated until the number of clusters does
not change after satisfying the constraints. This algorithm is outlined in Appendix A.</p>
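The outer loop has a simple fixed-point structure. A skeleton, with `cluster_fn` and `apply_constraints` standing in for the K-medoids step and the overlap/sparsity/coverage checks described above (both names are illustrative):

```python
import numpy as np

def auto_k(x, f_preds, k_init, cluster_fn, apply_constraints):
    """Outer loop of Section 3.2.2: cluster with the current K, enforce the
    constraints, and repeat until the number of clusters stops changing."""
    k = k_init
    while True:
        labels = cluster_fn(x, f_preds, k)       # K-medoids step (stand-in)
        labels = apply_constraints(x, labels)    # overlap/sparsity/coverage repairs
        new_k = len(np.unique(labels))
        if new_k == k:
            return labels
        k = new_k
```

Because the constraint checks can only merge or re-split clusters, the cluster count settles, and the loop exits once a full pass leaves K unchanged.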
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Generating Explanations</title>
        <p>We cluster each feature dimension in the input space and, as discussed in Section 3.1, use this
as a foundation for generating explanations for any input instance. The computational cost of
MASALA scales linearly with the number of feature dimensions. A specified target instance
x will fall into a single linear region in each feature dimension. The set of instances X′ for
training the surrogate g is defined using Equation 1. The linear coefficients of the MLR indicate
the contribution of each feature towards the base model’s prediction for the target instance.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>To evaluate MASALA we use the following two combinations of datasets and base models.</p>
      <p>
        PHM08 Challenge [22] is a dataset used to predict the Remaining Useful Life (RUL) of a set
of turbofan engines. A Gradient Boosting Regressor (GBR) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] was trained using the lifetime
operations of 218 engines, containing almost 46,000 samples. GBRs provide global feature
importance scores; however, these describe general model behaviour and are not sufficient for
providing insights into the behaviour of the model at an instance level. The GBR achieved an
RMSE of 59.5 on the test set. Appendix C.2 compares the predictions made by the base model for
the PHM08 dataset to the true values. It can be noticed that the relationship between features
and predictions is not always linear, and therefore a single linear surrogate model may not be
able to capture the true model behaviour.
      </p>
      <p>MIDAS [23] is a dataset provided by the UK Meteorological Office which contains hourly
weather observations across the UK. We use data collected at Heathrow Airport over 3 years
(Jan. 2019 - Dec. 2022), which contains 19,138 observations of a number of weather-related
parameters. A Recurrent Neural Network (RNN) is trained to predict the air temperature at a
given time. RNNs are complex deep learning models that lack inherent interpretability. Many
state-of-the-art temporal models incorporate RNN architectures; therefore, being able to explain
the behaviour of such models is useful to ensure they can be trusted and applied safely. The
RNN achieved an RMSE of 3.04 on the test set. Appendix C.1 compares the predictions made
by the base model for the MIDAS dataset to the true values. Similar to the GBR, there is not a
direct linear relationship between the distributions of input features and predictions. For both
datasets, 75% of the data is used for training and 25% is reserved for testing.</p>
      <p>
        Local fidelity is a common evaluation metric that measures how well a surrogate model
approximates the behaviour of the base model, by comparing their respective predictions on
the local data used to fit the surrogate [24, 25]. This local data, as is the case for LIME and
CHILLI, may be perturbed samples of the instance being explained [
        <xref ref-type="bibr" rid="ref14">14, 26</xref>
        ], which may not
be appropriate if the perturbations are not representative of the original training data [27].
Furthermore, locality is ill-defined in these existing methods and so local fidelity cannot be
trusted. Instead, we measure explanation fidelity which is calculated as the absolute error
between the predictions from the base model and surrogate model for the target instance.
      </p>
      <p>Average Explanation Fidelity = (1/N) ∑_{i=1}^{N} |f(xᵢ) − g(xᵢ)| (5)</p>
      <p>The average explanation fidelity can be calculated over N instances, as shown
in Equation 5, to quantify the explanation method's performance. We calculate the average
explanation fidelity for 20 instances selected uniformly at random from the test set.</p>
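Equation 5 is a mean absolute error between base-model and surrogate predictions; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def avg_explanation_fidelity(f_preds, g_preds):
    """Equation 5: mean absolute error between base-model predictions f(x_i)
    and surrogate predictions g(x_i) over the explained instances."""
    return np.mean(np.abs(np.asarray(f_preds) - np.asarray(g_preds)))
```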
      <p>Prior works have highlighted that when repeatedly explaining the same instance, random
perturbation-based methods, such as LIME, produce inconsistent and differing explanations
[18, 28, 29]. To measure the consistency of repeated explanations, existing works use the Jaccard
distance [18, 30]. The Jaccard distance only considers explanations to be similar if their feature
importance scores are identical. We instead propose calculating the average standard deviation
of the normalised feature importance scores across 10 repeated explanations, which we subtract
from 1 to measure consistency. We compare explanations generated by MASALA to those
generated by LIME and CHILLI for a range of kernel width settings. Both the average consistency
and fidelity of the explanations are calculated for all methods across 5 independent runs.</p>
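The consistency measure can be sketched as follows; the per-repeat max-absolute normalisation of the importance scores is an assumption, as the paper does not specify the normalisation:

```python
import numpy as np

def consistency(importances):
    """1 minus the average standard deviation of normalised feature
    importance scores across repeated explanations.
    importances: array of shape (n_repeats, n_features)."""
    scale = np.abs(importances).max(axis=1, keepdims=True)  # max-abs per repeat (assumed)
    normalised = importances / scale
    return 1.0 - normalised.std(axis=0).mean()
```

Identical repeated explanations give a consistency of exactly 1; any variation across repeats pulls the score below 1.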
      <p>For illustration, the clustering used to generate explanations with MASALA for the MIDAS
dataset is shown in Appendix B.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Table 1 shows the average consistency and fidelity of all explanations obtained from the
experiments along with the standard deviation across the 5 runs. It can be noticed that on average,
higher kernel width settings for LIME and CHILLI (shown in parentheses in Table 1) produce
explanations with lower error for target instances from the PHM08 dataset. However, as the
explanation fidelity improves, the consistency decreases, which makes it difficult to trust the
more performant explanations. Figure 3 shows the absolute error of the explanations generated
using CHILLI for the individual random instances at each kernel width setting. It can be noticed
that the kernel width setting that produces the lowest error explanation varies. This highlights
that the optimal locality parameter for each instance may vary and cannot be universally defined.
For the PHM08 instances, explanations generated using MASALA always exhibited the lowest
average error, and therefore greatest fidelity, compared to those generated using LIME and
CHILLI across all kernel width values. This demonstrates the capability of MASALA to produce
explanations that exceed the performance of LIME and CHILLI without the requirement of
defining a kernel width value.</p>
      <sec id="sec-5-1">
        <title>MASALA</title>
        <p>Since the locality size of explanations generated using MASALA is dependent on local
trends, it is able to capture the appropriate locality for individual instances without the need
for a kernel width setting.</p>
        <p>For the MIDAS dataset, CHILLI and LIME at low kernel width settings achieved significantly
lower average error compared to MASALA. This can be attributed to the fact that at such small
localities, the perturbations used to train the surrogate model only occupy a minuscule region of
the input space, which the surrogate can model very accurately. It can also be noticed, however,
that at particularly small localities, LIME and CHILLI exhibit lower consistency than MASALA.
An example of repeated explanations for a single instance generated by each method is shown
in Figure 4 (LIME (0.1): consistency 0.53, average error 0.00; CHILLI (0.01): consistency 0.87,
average error 0.00; CHILLI (0.1): consistency 0.98, average error 0.82; MASALA: consistency
1.00, average error 0.76). There is significant variation in explanations generated at low kernel
width values, which may be attributed to LIME and CHILLI randomly sampling perturbations.
We see that for a kernel width of 0.1, CHILLI produces much more consistent explanations,
since the perturbation generation method considers the distribution of the original data and
this locality size may be appropriate for capturing the relevant trends. However, this comes at
the cost of fidelity, indicated by the higher average error, since the produced explanations may
describe more general trends rather than local ones. Consistency cannot be sacrificed for fidelity,
since if multiple explanations are presented for a single prediction, there is uncertainty regarding
their correctness and trustworthiness. Explanations generated using MASALA achieve perfect
consistency since they are generated using a predetermined clustering, leading to identical
repeated explanations.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper we propose MASALA, a novel method for generating explanations for black-box
model predictions using linear local surrogate models. We propose a clustering technique that
identifies a set of points which are similar to an instance for which an explanation is being
generated, and use this to fit a linear surrogate model to approximate the base model behaviour.
As a result, MASALA automatically detects the relevant and impactful model behaviour in an
appropriately sized region of the input space. We find that explanations generated using our
method produce more faithful and consistent explanations than those generated using LIME
and CHILLI, without the need to manually define a locality hyperparameter that may differ for
each instance being explained. Although a deterministic clustering ensures consistency, it is
not clear how a non-deterministic clustering method which generates explanations of equal
fidelity would compare. There may be explanations that are equally faithful yet present different
feature contributions, so the question arises as to which of the explanations, or both, are correct.
Future work would explore the possibilities of this and investigate whether such explanations
are equally valid and how they can be compared.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We gratefully acknowledge the funding provided by the UK Engineering and Physical Sciences
Research Council (grant ref. EP/T517641/1) and TRL Ltd to support this iCASE with project
code B.CSAA.0001.</p>
      <p>[15] X. Zhao, W. Huang, X. Huang, V. Robu, D. Flynn, BayLIME: Bayesian local
interpretable model-agnostic explanations, in: Proceedings of the Thirty-Seventh
Conference on Uncertainty in Artificial Intelligence, PMLR, 2021, pp. 887–896. URL: https:
//proceedings.mlr.press/v161/zhao21a.html. ISSN: 2640-3498.
[16] Z. Tan, Y. Tian, J. Li, GLIME: General, Stable and Local LIME Explanation, 2023. URL:
http://arxiv.org/abs/2311.15722, arXiv:2311.15722 [cs, stat].
[17] T. Laugel, X. Renard, M.-J. Lesot, C. Marsala, M. Detyniecki, Defining Locality
for Surrogates in Post-hoc Interpretability, 2018. URL: http://arxiv.org/abs/1806.07498,
arXiv:1806.07498 [cs, stat].
[18] M. R. Zafar, N. M. Khan, DLIME: A Deterministic Local Interpretable Model-Agnostic
Explanations Approach for Computer-Aided Diagnosis Systems, 2019. URL: http://arxiv.
org/abs/1906.10263, arXiv:1906.10263 [cs, stat].
[19] R. Gaudel, L. Galárraga, J. Delaunay, L. Rozé, V. Bhargava, s-LIME: Reconciling
Locality and Fidelity in Linear Explanations, 2022. URL: http://arxiv.org/abs/2208.01510,
arXiv:2208.01510 [cs].
[20] S. M. Shankaranarayana, D. Runje, ALIME: Autoencoder Based Approach for Local
Interpretability, in: H. Yin, D. Camacho, P. Tino, A. J. Tallón-Ballesteros, R. Menezes,
R. Allmendinger (Eds.), Intelligent Data Engineering and Automated Learning – IDEAL
2019, Lecture Notes in Computer Science, Springer International Publishing, Cham, 2019,
pp. 454–463. doi:10.1007/978-3-030-33607-3_49.
[21] H.-S. Park, C.-H. Jun, A simple and fast algorithm for K-medoids clustering, Expert
Systems with Applications 36 (2009) 3336–3341. URL: https://linkinghub.elsevier.com/
retrieve/pii/S095741740800081X. doi:10.1016/j.eswa.2008.01.039.
[22] A. Saxena, D. Simon, N. Eklund, Damage Propagation Modeling for Aircraft Engine
Prognostics (2008).</p>
      <p>[23] Met Office, Met Office MIDAS Open: UK Land Surface Stations Data (1853-current), Centre
for Environmental Data Analysis, 2019.
[24] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A Survey of
Methods for Explaining Black Box Models, ACM Computing Surveys 51 (2019) 1–42. URL:
https://dl.acm.org/doi/10.1145/3236009. doi:10.1145/3236009.
[25] A. A. Freitas, Comprehensible classification models: a position paper, ACM SIGKDD
Explorations Newsletter 15 (2014) 1–10. URL: https://dl.acm.org/doi/10.1145/2594473.2594475.
doi:10.1145/2594473.2594475.
[26] R. Guidotti, A. Monreale, S. Ruggieri, D. Pedreschi, F. Turini, F. Giannotti, Local Rule-Based
Explanations of Black Box Decision Systems, 2018. URL: http://arxiv.org/abs/1805.10820,
arXiv:1805.10820 [cs].
[27] C. Molnar, G. König, J. Herbinger, T. Freiesleben, S. Dandl, C. A. Scholbeck, G. Casalicchio,
M. Grosse-Wentrup, B. Bischl, General Pitfalls of Model-Agnostic Interpretation Methods
for Machine Learning Models, Technical Report arXiv:2007.04131, arXiv, 2021. URL: http:
//arxiv.org/abs/2007.04131, arXiv:2007.04131 [cs, stat] type: article.
[28] Y. Zhang, K. Song, Y. Sun, S. Tan, M. Udell, "Why Should You Trust My Explanation?"
Understanding Uncertainty in LIME Explanations, 2019. URL: http://arxiv.org/abs/1904.
12991, arXiv:1904.12991 [cs, stat].
[29] Z. C. Lipton, The Mythos of Model Interpretability, 2017. URL: http://arxiv.org/abs/1606.03490,
arXiv:1606.03490 [cs, stat].</p>
      <p>
[30] E. Amparore, A. Perotti, P. Bajardi, To trust or not to trust an explanation: using LEAF
to evaluate local linear XAI methods, PeerJ Computer Science 7 (2021) e479. URL: https:
//www.ncbi.nlm.nih.gov/pmc/articles/PMC8056245/. doi:10.7717/peerj-cs.479.</p>
    </sec>
    <sec id="sec-8">
      <title>A. Local Linear K-Medoids Clustering Algorithm</title>
      <p>Algorithm 1 Local Linear K-Medoids Clustering
Require: k &gt; 0, dissimilarity measure ∆, Previous C = 0, Clustering Cost = ∞
while Previous C ≠ C do
    Previous C = C
    Select k medoids which are evenly distributed across the distribution of X
    for each non-medoid x ∈ X do
        Find the least dissimilar medoid according to ∆ and assign x to
        the corresponding cluster to generate clustering C
    end for
    Fit LR model within each cluster c ∈ C
    repeat
        Calculate Clustering Cost J(C), using Equation 3
        Lowest Cost = J(C)
        for each cluster c ∈ C do
            for each x ∈ c do
                Change medoid for c to x
                Generate new clustering C with new medoids
                Fit LR models within each cluster c ∈ C
                Calculate J′(C) using Equation 3
                if J′(C) &lt; Lowest Cost then
                    Lowest Cost = J′(C)
                    Accept x as new medoid
                else</p>
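      <p>As an illustrative sketch of Algorithm 1 (not the authors' implementation), the medoid-swap search with per-cluster linear models can be written in Python as follows. Here Euclidean distance stands in for the dissimilarity measure ∆, random initialisation replaces the evenly distributed medoid selection, and the cost function is a simplified stand-in for Equation 3 (the sum of within-cluster linear-regression errors).</p>

```python
import numpy as np

def fit_lr(X, y):
    """Fit a least-squares linear model; return coefficients and residual sum of squares."""
    A = np.c_[X, np.ones(len(X))]               # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)

def clustering_cost(X, y, labels, k):
    """Sum of within-cluster linear-regression errors (stand-in for Equation 3)."""
    cost = 0.0
    for c in range(k):
        mask = labels == c
        if mask.sum() > X.shape[1]:             # need enough points to fit a line
            _, rss = fit_lr(X[mask], y[mask])
            cost += rss
    return cost

def local_linear_kmedoids(X, y, k, max_iter=20, seed=0):
    """Simplified local linear k-medoids: swap medoids while the cost improves."""
    rng = np.random.default_rng(seed)
    n = len(X)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        # assign each point to its least dissimilar medoid (Euclidean distance)
        d = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
        labels = d.argmin(axis=1)
        best = clustering_cost(X, y, labels, k)
        improved = False
        # try swapping each medoid with each member of its cluster
        for c in range(k):
            for i in np.where(labels == c)[0]:
                if i in medoids:
                    continue
                trial = medoids.copy()
                trial[c] = i
                td = np.linalg.norm(X[:, None, :] - X[trial][None, :, :], axis=2)
                tl = td.argmin(axis=1)
                tc = clustering_cost(X, y, tl, k)
                if tc < best:                   # accept the swap if cost decreases
                    best, medoids, labels, improved = tc, trial, tl, True
        if not improved:
            break
    return labels, medoids
```

      <p>Fitting one linear model per cluster means each explained instance inherits the surrogate of its cluster, rather than a surrogate fit to a fixed-radius neighbourhood as in LIME.</p>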
    </sec>
    <sec id="sec-9">
      <title>B. MIDAS Clustering</title>
      <p>Here we show the clustering obtained by applying Algorithm 1 to the MIDAS data containing
features describing weather characteristics at Heathrow Airport between 2019 and 2022. The
distribution of each feature is shown against the predictions obtained by the base model being
explained, in this case a Recurrent Neural Network. The clustered regions of linear behaviour
are shown as diferent colours with the linear regression model fit to the points within each
cluster also shown as a line of the same colour.</p>
      <p>We have not shown the clustering obtained for the PHM08 dataset because there are
too many features to be able to visualise the individual clusters efectively.</p>
      <p>[Figure: clustering of MIDAS features against the RNN's Predicted Air Temperature; each cluster and its fitted linear regression model are shown in the same colour.]</p>
    </sec>
    <sec id="sec-10">
      <title>C. Model Predictions</title>
      <p>The distribution of each feature from the MIDAS and PHM08 datasets against the true target
values and the RNN and GBR models respectively. It can be seen that there is not a clear linear
relationship present between many features and the target variable. Therefore, a single linear
model would not be appropriate for explaining the behaviour of the base model for all instances.</p>
      <sec id="sec-10-1">
        <title>C.1. MIDAS Recurrent Neural Network</title>
        <p>[Figures: feature distributions plotted against the true target values and model predictions (axes include RUL and sensor features s5, s13, s17, s18, s19, s21).]</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Devitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Scholz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bolia</surname>
          </string-name>
          ,
          <article-title>A method for ethical AI in Defence (</article-title>
          <year>2021</year>
          ). URL: https://apo.org.au/node/311150, publisher: Department of Defence (Australia).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Hom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Abramoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Campbell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Chiang</surname>
          </string-name>
          ,
          <article-title>on behalf of the AAO Task Force on Artificial Intelligence, Current Challenges and Barriers to Real-World Artificial Intelligence Adoption for the Healthcare System, Provider, and the Patient</article-title>
          ,
          <source>Translational Vision Science &amp; Technology</source>
          <volume>9</volume>
          (
          <year>2020</year>
          )
          <fpage>45</fpage>
          . URL: https://doi.org/10.1167/tvst.9.2.45. doi:10.1167/tvst.9.2.45.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Preece</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Harborne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tomsett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Stakeholders in Explainable AI</article-title>
          ,
          <year>2018</year>
          . URL: http://arxiv.org/abs/1810.00184, arXiv:1810.00184 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F. K.</given-names>
            <surname>Dosilovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brcic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hlupic</surname>
          </string-name>
          ,
          <article-title>Explainable artificial intelligence: A survey</article-title>
          ,
          <source>in: 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)</source>
          , IEEE, Opatija,
          <year>2018</year>
          , pp.
          <fpage>0210</fpage>
          -
          <lpage>0215</lpage>
          . URL: https://ieeexplore.ieee.org/document/8400040/. doi:10.23919/MIPRO.2018.8400040.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Explainable Spatio-Temporal Graph Neural Networks</article-title>
          ,
          <source>in: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management</source>
          ,
          ACM
          , Birmingham United Kingdom,
          <year>2023</year>
          , pp.
          <fpage>2432</fpage>
          -
          <lpage>2441</lpage>
          . URL: https://dl.acm.org/doi/10.1145/3583780.3614871. doi:10.1145/3583780.3614871.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Goerigk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hartisch</surname>
          </string-name>
          ,
          <article-title>A framework for inherently interpretable optimization models</article-title>
          ,
          <source>European Journal of Operational Research</source>
          <volume>310</volume>
          (
          <year>2023</year>
          )
          <fpage>1312</fpage>
          -
          <lpage>1324</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0377221723002953. doi:10.1016/j.ejor.2023.04.013.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>O.</given-names>
            <surname>Loyola-Gonzalez</surname>
          </string-name>
          ,
          <article-title>Black-box vs. white-box: Understanding their advantages and weaknesses from a practical point of view</article-title>
          ,
          <source>IEEE Access 7</source>
          (
          <year>2019</year>
          )
          <fpage>154096</fpage>
          -
          <lpage>154113</lpage>
          . doi:10.1109/ACCESS.2019.2949286.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wanner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-V.</given-names>
            <surname>Herm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Janiesch</surname>
          </string-name>
          ,
          <article-title>A social evaluation of the perceived goodness of explainability in machine learning</article-title>
          ,
          <source>Journal of Business Analytics</source>
          <volume>5</volume>
          (
          <year>2022</year>
          )
          <fpage>29</fpage>
          -
          <lpage>50</lpage>
          . doi:10.1080/2573234X.2021.1952913.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          ,
          <article-title>Interpretable Machine Learning: A Guide for Making Black Box Models Explainable</article-title>
          , 2nd ed.,
          <year>2022</year>
          . URL: https://christophm.github.io/interpretable-ml-book.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"Why Should I Trust You?": Explaining the Predictions of Any Classifier</article-title>
          ,
          <year>2016</year>
          . URL: http://arxiv.org/abs/1602.04938, arXiv:1602.04938 [cs, stat].
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A Unified Approach to Interpreting Model Predictions</article-title>
          ,
          <year>2017</year>
          . URL: http://arxiv.org/abs/1705.07874, arXiv:1705.07874 [cs, stat].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hastie</surname>
          </string-name>
          ,
          <source>The Elements of Statistical Learning: Data Mining, Inference, and Prediction</source>
          , Springer Series in Statistics, 2nd ed., Springer International Publishing,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Dieber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kirrane</surname>
          </string-name>
          ,
          <article-title>Why model why? Assessing the strengths and limitations of LIME</article-title>
          ,
          <year>2020</year>
          . URL: http://arxiv.org/abs/2012.00093, arXiv:2012.00093 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Anwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Griffiths</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhalerao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Popham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hellman</surname>
          </string-name>
          ,
          <article-title>CHILLI: A data context-aware perturbation method for XAI</article-title>
          , ICML 2023 Workshop on AI &amp; Human Computer Interaction (2023).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>