=Paper= {{Paper |id=Vol-3318/short20 |storemode=property |title=Want robust explanations? Get smoother predictions first |pdfUrl=https://ceur-ws.org/Vol-3318/short20.pdf |volume=Vol-3318 |authors=Deddy Jobson |dblpUrl=https://dblp.org/rec/conf/cikm/Jobson22 }} ==Want robust explanations? Get smoother predictions first== https://ceur-ws.org/Vol-3318/short20.pdf
Want robust explanations? Get smoother predictions first.
Deddy Jobson
Mercari Inc., Roppongi Hills Mori Tower, 6 Chome-10-1 Roppongi, Minato City, Tokyo 106-6118


Abstract
Model-agnostic machine learning interpretability methods like LIME, which explain the predictions of elaborate machine learning models, suffer from a lack of robustness in the explanations they provide. Small targeted changes to the input can cause large changes in the explanations even when the predictions of the machine learning model barely change. This is a serious problem because it undermines the trust one can place in the explanations. We propose to address the problem by smoothening the predictions of the machine learning model as a preprocessing step: we take multiple samples from the neighbourhood of each input data point and average the resulting predictions. Through preliminary experiments, we show that smoothening makes the explanations more robust and therefore more reliable.

Keywords
interpretable machine learning, model agnostic, interpretability, LIME, robustness



1. Introduction

The sudden improvement in the performance of machine learning brought about by deep learning and tree ensemble methods has led to an explosion in the adoption of machine learning for a wide variety of prediction tasks in domains such as image, text, and tabular data. While the increased performance has made machine learning models much more useful in practice, it has come at the cost of interpretability; one can no longer trivially explain the decisions made by machine learning models the way one could for statistical models like linear regression in the past. We can do without interpretability when the consequences of the downstream decisions are small, as in recommending movies, but interpretability becomes important in high-stakes situations such as predicting whether or not a person has cancer[1]. In such cases, it is not just important to know what the predictions of the model are, but also how the predictions were made.

A number of model-agnostic interpretability methods exist to help explain the predictions made by machine learning models. Partial Dependence Plots[2] show the marginal effect of a feature on the outcome. Individual Conditional Expectation plots[3] do the same by drawing a separate curve for each individual, thus allowing one to see the variance (and not just the mean) of the effect of each feature. Both methods share a problem: when the features in the dataset are strongly correlated, they consider the effect of very unlikely counterfactual scenarios.

Shapley values[4] take a game-theoretic approach and treat the features as players in a coalition that together produce the score for an instance. The Shapley value of a feature is the average increment in the score obtained by including that feature in the coalition. While Shapley values have a strong mathematical foundation, they have the downside that the computational cost of an exact calculation is exponential in the number of features. While methods like Tree SHAP[5] exist to calculate the values more efficiently, there are issues with the robustness[6] of Shapley values that have not yet been resolved.

Local Interpretable Model-Agnostic Explanations (LIME)[7] is a method that estimates a local surrogate model in the vicinity of each data point and uses the coefficients of the local model to interpret the decisions made by the model. It is related to SHAP through Kernel SHAP[8], a way to obtain approximate SHAP values. One advantage of LIME over Shapley values is that LIME can produce sparse explanations that do not rely on too many features, resulting in more human-friendly explanations. However, issues regarding the robustness[6] of the explanations provided by LIME have been raised. Our goal in this paper is to improve the robustness of the interpretations made by LIME, and thereby the reliability and trustworthiness of the provided explanations.


2. Problem Setup

Given a trained model and a target data point, the original LIME algorithm works as follows (a minimal code sketch of these steps is given after the list):

    1. Sample data around the neighbourhood of the data point.
    2. Get the predicted values for the sampled data points.
    3. Fit a surrogate model to the generated data, weighted by distance from the target data point.
    4. Explain the prediction of the main model with the coefficients of the surrogate model.
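
As a reference point for the rest of the paper, here is a minimal sketch of the four steps above for purely numeric features, assuming a fitted model with a scikit-learn-style predict method. It omits LIME's feature selection and discretization, and the function name, noise scale, and kernel width are illustrative choices of ours, not the library's defaults.

    import numpy as np
    from sklearn.linear_model import Ridge

    def explain_instance_sketch(model, x, num_samples=5000,
                                noise_scale=1.0, kernel_width=0.75, seed=0):
        """Illustrative LIME-style local surrogate for a single numeric data point x."""
        rng = np.random.default_rng(seed)
        # 1. Sample data around the neighbourhood of the data point.
        neighbours = x + rng.normal(scale=noise_scale, size=(num_samples, x.shape[0]))
        # 2. Get the predicted values for the sampled data points.
        predictions = model.predict(neighbours)
        # 3. Fit a surrogate model to the generated data, weighted by distance
        #    from the target data point (closer samples get larger weights).
        distances = np.linalg.norm(neighbours - x, axis=1)
        weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
        surrogate = Ridge(alpha=1.0)
        surrogate.fit(neighbours, predictions, sample_weight=weights)
        # 4. Explain the prediction of the main model with the surrogate's coefficients.
        return surrogate.coef_
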
The explanations generated by the above algorithm can be unstable for a number of reasons. One source of instability is the sampling of data points[9], which is done randomly and ignores any correlation between features. Methods have been developed to estimate the number of samples required for stable explanations[10] or to do away with randomness in the sampling altogether[11]. Another potential cause of instability, especially pertinent to tabular data, is the discretization of numerical features. While for the most part this yields more consistent explanations, target data points near the discretization boundaries can have unstable explanations even when the model predictions (which do not rely on discretization) in the vicinity are relatively stable.


3. Related Work

The measurement of the stability (or lack thereof) of LIME's explanations is not a new research problem. Alvarez-Melis et al.[6] have shown that small perturbations to the input can cause a large change in the explanations without much of a change in the predictions made by the model. They use the definition of Lipschitz continuity to compute the maximum possible difference in explanation within the neighbourhood of the data point to be explained. Their approach is similar to prior work that inspected the lack of robustness of the predictions made by neural networks[12].

Visani et al.[13] introduce two novel metrics grounded in statistics to measure the extent to which repeated sampling of the data leads to variance in the explanations. Their metrics quantify the variance of the selected features and of the coefficient values; the lower, the better.

Much more recently, Garreau et al.[14] performed a very deep analysis of the workings of LIME for tabular data and (among other things) found that when the surrogate model (the one trained for interpretability) uses ordinary least squares and the number of sampled data points is large, the estimates made by LIME are robust to mild perturbations. This suggests that the cause of instability could lie elsewhere.


4. Our Method

For our method, we smoothen the predictions of the model we want to explain with the help of Gaussian noise. We do so because we hypothesize that the lack of robustness in the explanations produced by LIME is caused not by LIME itself but by the jaggedness of the predictions made by the model.

We smoothen the predictions by averaging the predictions made on random perturbations of the data points. In this study we consider the case where all features of the data point are numeric and continuous. We perturb each feature by adding Gaussian noise with zero mean, and we refer to the standard deviation of this noise as the "strength" parameter: the greater the strength, the larger the perturbations, the smoother the averaged predictions (assuming enough samples), and so the "stronger" the smoothening effect. We choose a strength value of 0.1 for our experiments and take 100 random samples per data point for the smoothening process.


5. Experiments and Discussion

Our hypothesis is that smoothening the predictions will yield explanations that are more robust. To test this hypothesis, we look at the extent to which the variance of LIME's explanations changes before and after smoothening the predicting function. We define a metric called the Lipschitz Discontinuity Score (LDS), derived from the expression used in the definition of Lipschitz continuity; our approach is similar to the one used in [6]. LDS is defined as follows:

    LDS = \frac{1}{N} \sum_{i=1}^{N} \max_{j \neq i} \frac{\lVert f(x_i) - f(x_j) \rVert_2}{\lVert x_i - x_j \rVert_2}        (1)

In the above expression, N is the number of records in the dataset, i and j are indices denoting individual records and take values from 1 to N, and f(x_i) is the vector of coefficients we get from the LIME explanation of record x_i.

We perform preliminary experiments on the publicly available Boston dataset, a regression dataset with 12 covariates. We parameterize the LIME algorithm to explain with only 3 features. The base model is the random forest regressor from scikit-learn with its default parameters, which suffice for the purposes of this study. We estimate the LDS on the Boston dataset using 10-fold cross-validation. In Table 1, we compare the LDS of the explanations of LIME for two cases: with and without smoothening. We find that there is a substantial improvement in the LDS when smoothening the predictions, in line with our hypothesis.

Table 1
Preliminary experiments on the Boston dataset (the lower the score, the better)

    Algorithm          Lipschitz Discontinuity Score
    LIME                           2.78
    LIME smoothed                  2.60
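
For concreteness, the smoothening step of Section 4 can be implemented as a thin wrapper around the base model. The sketch below assumes numeric features and a scikit-learn-style regressor; the class name SmoothedModel is ours.

    import numpy as np

    class SmoothedModel:
        """Averages the base model's predictions over Gaussian perturbations of the input.
        'strength' is the standard deviation of the zero-mean noise (0.1 in our experiments)
        and n_samples is the number of perturbations averaged per data point (100 here)."""

        def __init__(self, base_model, strength=0.1, n_samples=100, seed=0):
            self.base_model = base_model
            self.strength = strength
            self.n_samples = n_samples
            self.rng = np.random.default_rng(seed)

        def predict(self, X):
            X = np.asarray(X, dtype=float)
            total = np.zeros(X.shape[0])
            for _ in range(self.n_samples):
                noise = self.rng.normal(scale=self.strength, size=X.shape)
                total += self.base_model.predict(X + noise)
            return total / self.n_samples

LIME is then pointed at the wrapper's predict function instead of the base model's, so the explainer itself is left untouched.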

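Likewise, Eq. (1) can be estimated directly from the matrix of explanation vectors. In the sketch below, X is the (N, p) array of inputs and explanations is the aligned (N, d) array of LIME coefficient vectors; both names are ours.

    import numpy as np

    def lipschitz_discontinuity_score(X, explanations):
        """LDS of Eq. (1): for each record, the largest ratio of explanation distance to
        input distance over all other records, averaged over the dataset.
        Assumes no duplicated rows in X (otherwise a denominator would be zero)."""
        X = np.asarray(X, dtype=float)
        E = np.asarray(explanations, dtype=float)
        n = X.shape[0]
        scores = np.empty(n)
        for i in range(n):
            expl_dist = np.linalg.norm(E - E[i], axis=1)   # ||f(x_i) - f(x_j)||_2
            input_dist = np.linalg.norm(X - X[i], axis=1)  # ||x_i - x_j||_2
            mask = np.arange(n) != i                       # exclude j == i
            scores[i] = np.max(expl_dist[mask] / input_dist[mask])
        return scores.mean()
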
6. Future Work

In this paper, we smoothen the predictions of the machine learning model by repeatedly sampling neighbouring points at random and averaging the outputs, in order to increase the robustness of the explanations produced by LIME. We chose white noise because the approach is similar to the original LIME algorithm, but since LIME's introduction, various improved sampling strategies have been proposed that result in more robust explanations[15, 16]. Trying those other sampling methods for the purpose of smoothening the predictions is beyond the scope of this extended abstract and can be considered one avenue for future research.

While we perform preliminary experiments with tabular data only, our hypothesis could also hold for other forms of data, all the more so because of the greater dimensionality of data such as images and text. To extend the idea to other forms of data, the key will be to find how best to perturb the input to obtain smooth predictions.

Lastly, we test our hypothesis with LIME and find promising results. Since the instability of the explanations of other interpretability methods can also be (at least partly) explained by unstable predictions of the machine learning model, we suspect our idea can be applied to improve other model interpretability methods too.

As we can see, there is a lot of scope for future work, and we are excited to see how research develops in this direction.


7. Conclusion

In this paper, we propose a way to improve the robustness of LIME, a model-agnostic explainer of the predictions of machine learning models. We propose smoothening the predictions made by the model to increase their consistency, thereby making the explanations more trustworthy. We explain how we smoothen predictions using random noise and perform preliminary experiments on a publicly available dataset, with promising results. We also outline future steps that can be taken to increase the scope of this research.


Acknowledgments

We would like to thank Mercari Inc. for supporting the research and also the anonymous reviewers who gave very helpful feedback to improve the quality of the paper. Any remaining deficiencies in the paper belong to the authors.


References

[1] P. Karatza, K. Dalakleidi, M. Athanasiou, K. Nikita, Interpretability methods of machine learning algorithms with applications in breast cancer diagnosis, in: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2021, pp. 2310–2313. doi:10.1109/EMBC46164.2021.9630556. ISSN: 2694-0604.
[2] B. M. Greenwell, B. C. Boehmke, A. J. McCarthy, A Simple and Effective Model-Based Variable Importance Measure, 2018. URL: http://arxiv.org/abs/1805.04755. doi:10.48550/arXiv.1805.04755. arXiv:1805.04755 [cs, stat].
[3] A. Goldstein, A. Kapelner, J. Bleich, E. Pitkin, Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation, 2014. URL: http://arxiv.org/abs/1309.6392. doi:10.48550/arXiv.1309.6392. arXiv:1309.6392 [stat].
[4] S. Lundberg, S.-I. Lee, A Unified Approach to Interpreting Model Predictions, 2017. URL: http://arxiv.org/abs/1705.07874. doi:10.48550/arXiv.1705.07874. arXiv:1705.07874 [cs, stat].
[5] S. M. Lundberg, G. G. Erion, S.-I. Lee, Consistent Individualized Feature Attribution for Tree Ensembles, 2019. URL: http://arxiv.org/abs/1802.03888. doi:10.48550/arXiv.1802.03888. arXiv:1802.03888 [cs, stat].
[6] D. Alvarez-Melis, T. S. Jaakkola, On the Robustness of Interpretability Methods, 2018. URL: http://arxiv.org/abs/1806.08049. doi:10.48550/arXiv.1806.08049. arXiv:1806.08049 [cs, stat].
[7] M. T. Ribeiro, S. Singh, C. Guestrin, "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016. URL: https://arxiv.org/abs/1602.04938v3. doi:10.48550/arXiv.1602.04938.
[8] I. Covert, S.-I. Lee, Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression, in: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR, 2021, pp. 3457–3465. URL: https://proceedings.mlr.press/v130/covert21a.html. ISSN: 2640-3498.
[9] Y. Zhang, K. Song, Y. Sun, S. Tan, M. Udell, "Why Should You Trust My Explanation?" Understanding Uncertainty in LIME Explanations, 2019. URL: http://arxiv.org/abs/1904.12991. doi:10.48550/arXiv.1904.12991. arXiv:1904.12991 [cs, stat].
[10] Z. Zhou, G. Hooker, F. Wang, S-LIME: Stabilized-LIME for Model Explanation, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 2429–2438. URL: https://doi.org/10.1145/3447548.3467274. doi:10.1145/3447548.3467274.
[11] M. R. Zafar, N. Khan, Deterministic Local Interpretable Model-Agnostic Explanations for Stable Explainability, Machine Learning and Knowledge Extraction 3 (2021) 525–541. URL: https://www.mdpi.com/2504-4990/3/3/27. doi:10.3390/make3030027. Number: 3. Publisher: Multidisciplinary Digital Publishing Institute.
[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, 2014. URL: http://arxiv.org/abs/1312.6199. doi:10.48550/arXiv.1312.6199. arXiv:1312.6199 [cs].
[13] G. Visani, E. Bagli, F. Chesani, A. Poluzzi, D. Capuzzo, Statistical stability indices for LIME: obtaining reliable explanations for Machine Learning models, Journal of the Operational Research Society 73 (2022) 91–101. URL: http://arxiv.org/abs/2001.11757. doi:10.1080/01605682.2020.1865846. arXiv:2001.11757 [cs, stat].
[14] D. Garreau, U. von Luxburg, Looking Deeper into Tabular LIME, 2022. URL: http://arxiv.org/abs/2008.11092. arXiv:2008.11092 [cs, stat].
[15] S. Saito, E. Chua, N. Capel, R. Hu, Improving LIME Robustness with Smarter Locality Sampling, 2021. URL: http://arxiv.org/abs/2006.12302. doi:10.48550/arXiv.2006.12302. arXiv:2006.12302 [cs, stat].
[16] S. Shi, X. Zhang, W. Fan, A Modified Perturbed Sampling Method for Local Interpretable Model-agnostic Explanation, 2020. URL: http://arxiv.org/abs/2002.07434. doi:10.48550/arXiv.2002.07434. arXiv:2002.07434 [cs, stat].