=Paper=
{{Paper
|id=Vol-3318/short20
|storemode=property
|title=Want robust explanations? Get smoother predictions first
|pdfUrl=https://ceur-ws.org/Vol-3318/short20.pdf
|volume=Vol-3318
|authors=Deddy Jobson
|dblpUrl=https://dblp.org/rec/conf/cikm/Jobson22
}}
==Want robust explanations? Get smoother predictions first==
Deddy Jobson, Mercari Inc., Roppongi Hills Mori Tower, 6 Chome-10-1 Roppongi, Minato City, Tokyo 106-6118 (deddy@mercari.com, ORCID: 0000-0003-1557-8131)

AIMLAI '22: Advances in Interpretable Machine Learning and Artificial Intelligence, October 21, 2022, Atlanta, GA

===Abstract===
Model-agnostic machine learning interpretability methods like LIME, which explain the predictions of elaborate machine learning models, suffer from a lack of robustness in the explanations they provide. Small targeted changes to the input can result in large changes in explanations even when there are no significant changes in the predictions made by the machine learning model. This is a serious problem, as it undermines the trust one has in the explanations. We propose to solve the problem by smoothening the predictions of the machine learning model as a preprocessing step. We smoothen the predictions by taking multiple samples from the neighbourhood of each input data point and averaging the output predictions. Through our preliminary experiments, we show that the explanations become more robust because of smoothening, making them more reliable.

Keywords: interpretable machine learning, model-agnostic, interpretability, LIME, robustness

===1. Introduction===
The sudden improvement in the performance of machine learning through deep learning and tree ensemble methods has led to an explosion in the adoption of machine learning for a wide variety of prediction tasks in multiple domains such as image, text, and tabular data. While the increased performance has made machine learning models much more useful in practice, it has come at the cost of interpretability; one can no longer trivially explain the decisions made by machine learning models the way one could for statistical models like linear regression in the past. While we can do without interpretability in cases where the consequences of the downstream decisions are minor, as in recommending movies, interpretability becomes important in high-stakes situations like predicting whether or not a person has cancer [1]. In such cases, it is important to know not just what the predictions of the model are, but also how the predictions were made.

A number of model-agnostic interpretability methods exist to help explain the predictions made by machine learning models. Partial Dependence Plots [2] show the marginal effect of a feature on the outcome. Individual Conditional Expectation plots [3] do the same by making separate plots for each individual, thus allowing one to see the variance (and not just the mean) of the effect of each feature. Both have the problem that they consider the effect of very unlikely counterfactual scenarios when the features in the dataset are strongly correlated.

Shapley values [4] take a game-theoretic approach and assume different features take part in a collaboration to assign a score to an instance. The Shapley value of a feature is the average increment in the score obtained by including that feature in the collaboration. While Shapley values have a strong mathematical foundation, they have the downside that the computational cost of calculating them is exponential in the number of features. While methods like Tree SHAP [5] exist to calculate the values more efficiently, there are issues with the robustness [6] of Shapley values which have not yet been resolved.

Local Interpretable Model-Agnostic Explanations (LIME) [7] is a method that estimates a local surrogate model in the vicinity of each data point and uses the coefficients of the local model to interpret the decisions made by the model. It is related to SHAP through Kernel SHAP [8], a way to get approximate SHAP values. One advantage of LIME over Shapley values is that LIME can produce sparse explanations which do not rely on too many features, resulting in more human-friendly explanations. However, issues regarding the robustness [6] of the explanations provided by LIME have been raised. Our goal in this paper is to find ways to improve the robustness of the interpretations made by LIME, and thereby the reliability and trustworthiness of the provided explanations.

===2. Problem Setup===
The original LIME algorithm works as follows, given a trained model and a target data point (see the code sketch at the end of this section):

1. Sample data around the neighbourhood of the data point.
2. Get the predicted values for the sampled data points.
3. Fit a surrogate model to the generated data, weighted by distance from the target data point.
4. Explain the prediction of the main model with the coefficients of the surrogate model.

The explanations generated by the above algorithm can be unstable for a number of reasons. One source of instability is the sampling of data points [9], which is done randomly, ignoring any correlation between features. Methods have been developed to estimate the number of samples required to get stable explanations [10] or to do away with randomness in the sampling altogether [11]. Another potential cause of instability in explanations, especially pertinent to tabular data, is the discretization of numerical features. While for the most part this yields more consistent explanations, target data points near the discretization boundaries can have unstable explanations even when the model predictions (which do not rely on discretization) in the vicinity are relatively stable.
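As an illustration of steps 1-4, the following minimal sketch runs LIME on a synthetic tabular regression problem. It assumes the reference `lime` Python package and scikit-learn; the synthetic data, feature names, and random seeds are stand-ins rather than the paper's actual setup.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in for a tabular regression dataset with 12 covariates.
X, y = make_regression(n_samples=500, n_features=12, noise=0.1, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    mode="regression",
    feature_names=[f"x{i}" for i in range(X.shape[1])],
)

# Steps 1-3 happen inside explain_instance (sample the neighbourhood,
# predict, fit a distance-weighted linear surrogate); step 4 reads the
# surrogate's coefficients back as the explanation.
exp = explainer.explain_instance(X[0], model.predict, num_features=3)
print(exp.as_list())  # [(feature condition, coefficient), ...]
```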
===3. Related Work===
The measurement of the stability (or lack thereof) of LIME's explanations is not a new research problem. Alvarez-Melis et al. [6] have shown that small perturbations to the input can cause a large change in the explanation without much of a change in the predictions made by the model. They use the definition of Lipschitz continuity to get the maximum possible difference in explanation within the neighbourhood of the data point to be explained. Their approach is similar to prior work on inspecting the lack of robustness of predictions made by neural networks [12].

Visani et al. [13] introduce two novel metrics grounded in statistics to measure the extent to which repeated sampling of the data leads to variance in the explanations. Their metrics quantify the variance of the selected features and of the coefficient values; the lower, the better.

More recently, Garreau et al. [14] performed a very deep analysis of the workings of LIME for tabular data and (among other things) found that when the surrogate model (the one trained for interpretability) uses ordinary least squares and the number of sampled data points is large, the estimates made by LIME are robust to mild perturbations. This suggests that the cause of instability could lie elsewhere.

===4. Our Method===
For our method, we smoothen the predictions of the model we want to explain with the help of Gaussian noise. We do so because we hypothesize that the lack of robustness in the explanations produced by LIME is caused not by LIME itself but rather by the jaggedness of the predictions made by the model.

We smoothen the predictions by averaging the predictions made on random perturbations of the data points. We consider the case where all features of the data point are numeric and continuous in this study. We perturb each feature by adding zero-mean Gaussian noise to it. We refer to the standard deviation of the Gaussian noise as the "strength" parameter: the greater the "strength", the larger the perturbations and the smoother the averaged predictions will be (assuming enough samples), and hence the "stronger" the smoothening effect. We choose a strength value of 0.1 for our experiments and take 100 random samples for each data point in the smoothening process.
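Below is a minimal sketch of the smoothening step described above; the wrapper structure, function name, and seed handling are our own choices, not the paper's implementation.

```python
import numpy as np

def smoothen_predictions(predict_fn, strength=0.1, n_samples=100, seed=0):
    """Return a prediction function whose output at each point is the
    average of predict_fn over Gaussian perturbations of that point
    (zero mean, standard deviation = strength)."""
    rng = np.random.default_rng(seed)

    def smoothed(X):
        X = np.asarray(X, dtype=float)
        total = np.zeros(len(X))
        for _ in range(n_samples):
            total += predict_fn(X + rng.normal(0.0, strength, size=X.shape))
        return total / n_samples

    return smoothed

# LIME then explains the smoothed function in place of the raw model, e.g.:
# exp = explainer.explain_instance(X[0], smoothen_predictions(model.predict), num_features=3)
```

With the paper's settings (strength 0.1 and 100 samples per point), each query LIME makes to the smoothed function costs 100 underlying model evaluations, which is the main overhead of this preprocessing step.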
===5. Experiments and Discussion===
Our hypothesis is that smoothening the predictions will yield explanations that are more robust. To test this hypothesis, we look at the extent to which the variance of LIME's explanations changes before and after smoothening the prediction function. We define a metric called the Lipschitz Discontinuity Score (LDS), derived from the expression used in the definition of Lipschitz continuity; our approach is similar to the one used in [6]. LDS is defined as follows (a code sketch of this computation appears at the end of this section):

\[ \mathrm{LDS} = \frac{1}{N} \sum_{i=1}^{N} \max_{j \neq i} \frac{\lVert f(x_i) - f(x_j) \rVert_2}{\lVert x_i - x_j \rVert_2} \tag{1} \]

In the above expression, N is the number of records in the dataset, i and j are indices denoting individual records and take values from 1 to N, and f(x_i) is the vector of coefficients we get from the explanations of the LIME algorithm.

We perform preliminary experiments on the publicly available Boston dataset, a regression dataset with 12 covariates. We parameterize the LIME algorithm to explain with only 3 features. The base model is the random forest regressor from scikit-learn; we use its default parameters, since they suffice for the purposes of this study. We estimate the LDS on the Boston dataset using 10-fold cross-validation. In Table 1, we compare the LDS of the explanations of LIME for two cases: with and without smoothening. We find that there is a substantial improvement in the LDS when smoothening the predictions, in line with our hypothesis.

Table 1: Preliminary experiments on the Boston dataset (the lower the score, the better)

  Algorithm        Lipschitz Discontinuity Score
  LIME             2.78
  LIME smoothed    2.60
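A possible way to compute Eq. (1) is sketched below. It assumes the LIME coefficient vectors f(x_i) have already been collected into a matrix aligned to a common feature order (zeros for features not selected in a given explanation); the function name and array layout are our own assumptions.

```python
import numpy as np

def lipschitz_discontinuity_score(X, F):
    """Eq. (1): X is an (N, d) array of input records and F an (N, k) array
    whose rows are the LIME coefficient vectors f(x_i). For each record,
    take the largest ratio of explanation distance to input distance over
    all other records, then average over records. Assumes X has no
    duplicate rows (otherwise the ratio is undefined)."""
    X = np.asarray(X, dtype=float)
    F = np.asarray(F, dtype=float)
    n = len(X)
    scores = np.empty(n)
    for i in range(n):
        dx = np.linalg.norm(X - X[i], axis=1)          # ||x_i - x_j||_2 for all j
        df = np.linalg.norm(F - F[i], axis=1)          # ||f(x_i) - f(x_j)||_2 for all j
        ratios = np.delete(df, i) / np.delete(dx, i)   # exclude j == i
        scores[i] = ratios.max()
    return scores.mean()
```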
===6. Future Work===
In this paper, we smoothen the predictions of the machine learning model by sampling neighbouring points randomly multiple times and taking the average of the outputs. We do this to increase the robustness of the explanations produced by LIME. We chose white noise because the approach is similar to the one in the original LIME algorithm, but since its introduction, various improved sampling strategies have been proposed that result in more robust explanations [15, 16]. Trying those other sampling methods for the purpose of smoothening the predictions is beyond the scope of this extended abstract and can be considered one avenue for future research.

While we perform preliminary experiments with tabular data, our hypothesis can potentially hold for other forms of data as well, all the more so given the greater dimensionality of data like images and text. In order to extend the idea to other forms of data, the key will be to find how best to perturb the input to get smooth predictions.

Lastly, we tested our hypothesis with LIME and found promising results. Since the instability of explanations from other interpretability methods can also be (at least partly) explained by unstable predictions of the machine learning model, we suspect our idea can be applied to improve other model interpretability methods too.

As we can see, there is a lot of scope for future work, and we are excited to see how research develops in this direction.

===7. Conclusion===
In this paper, we propose a way to improve the robustness of LIME, a model-agnostic explainer of the predictions of machine learning models. We propose smoothening the predictions made by the model to increase their consistency, thereby making the explanations more trustworthy. We explain how we smoothen predictions using random noise and perform preliminary experiments on publicly available datasets, with promising results. We also outline future steps that can be taken to increase the scope of the research.

===Acknowledgments===
We would like to thank Mercari Inc. for supporting the research, and also the anonymous reviewers who gave very helpful feedback to improve the quality of the paper. Any remaining deficiencies in the paper belong to the authors.
===References===
[1] P. Karatza, K. Dalakleidi, M. Athanasiou, K. Nikita, Interpretability methods of machine learning algorithms with applications in breast cancer diagnosis, in: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2021, pp. 2310–2313. doi:10.1109/EMBC46164.2021.9630556.
[2] B. M. Greenwell, B. C. Boehmke, A. J. McCarthy, A Simple and Effective Model-Based Variable Importance Measure, 2018. URL: http://arxiv.org/abs/1805.04755. doi:10.48550/arXiv.1805.04755.
[3] A. Goldstein, A. Kapelner, J. Bleich, E. Pitkin, Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation, 2014. URL: http://arxiv.org/abs/1309.6392. doi:10.48550/arXiv.1309.6392.
[4] S. Lundberg, S.-I. Lee, A Unified Approach to Interpreting Model Predictions, 2017. URL: http://arxiv.org/abs/1705.07874. doi:10.48550/arXiv.1705.07874.
[5] S. M. Lundberg, G. G. Erion, S.-I. Lee, Consistent Individualized Feature Attribution for Tree Ensembles, 2019. URL: http://arxiv.org/abs/1802.03888. doi:10.48550/arXiv.1802.03888.
[6] D. Alvarez-Melis, T. S. Jaakkola, On the Robustness of Interpretability Methods, 2018. URL: http://arxiv.org/abs/1806.08049. doi:10.48550/arXiv.1806.08049.
[7] M. T. Ribeiro, S. Singh, C. Guestrin, "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016. URL: https://arxiv.org/abs/1602.04938v3. doi:10.48550/arXiv.1602.04938.
[8] I. Covert, S.-I. Lee, Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression, in: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR, 2021, pp. 3457–3465. URL: https://proceedings.mlr.press/v130/covert21a.html.
[9] Y. Zhang, K. Song, Y. Sun, S. Tan, M. Udell, "Why Should You Trust My Explanation?" Understanding Uncertainty in LIME Explanations, 2019. URL: http://arxiv.org/abs/1904.12991. doi:10.48550/arXiv.1904.12991.
[10] Z. Zhou, G. Hooker, F. Wang, S-LIME: Stabilized-LIME for Model Explanation, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 2429–2438. URL: https://doi.org/10.1145/3447548.3467274. doi:10.1145/3447548.3467274.
[11] M. R. Zafar, N. Khan, Deterministic Local Interpretable Model-Agnostic Explanations for Stable Explainability, Machine Learning and Knowledge Extraction 3 (2021) 525–541. URL: https://www.mdpi.com/2504-4990/3/3/27. doi:10.3390/make3030027.
[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, 2014. URL: http://arxiv.org/abs/1312.6199. doi:10.48550/arXiv.1312.6199.
[13] G. Visani, E. Bagli, F. Chesani, A. Poluzzi, D. Capuzzo, Statistical stability indices for LIME: obtaining reliable explanations for Machine Learning models, Journal of the Operational Research Society 73 (2022) 91–101. URL: http://arxiv.org/abs/2001.11757. doi:10.1080/01605682.2020.1865846.
[14] D. Garreau, U. von Luxburg, Looking Deeper into Tabular LIME, 2022. URL: http://arxiv.org/abs/2008.11092.
[15] S. Saito, E. Chua, N. Capel, R. Hu, Improving LIME Robustness with Smarter Locality Sampling, 2021. URL: http://arxiv.org/abs/2006.12302. doi:10.48550/arXiv.2006.12302.
[16] S. Shi, X. Zhang, W. Fan, A Modified Perturbed Sampling Method for Local Interpretable Model-agnostic Explanation, 2020. URL: http://arxiv.org/abs/2002.07434. doi:10.48550/arXiv.2002.07434.