<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Guided-LIME: Structured Sampling based Hybrid Approach towards Explaining Blackbox Machine Learning Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amit Sangroya</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mouli Rastogi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>C. Anantaram</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lovekesh Vig</string-name>
        </contrib>
        <aff>TCS Innovation Labs, Tata Consultancy Services Ltd., Delhi, India</aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>70</volume>
      <fpage>123</fpage>
      <lpage>128</lpage>
      <abstract>
        <p>Many approaches to explain machine learning models and interpret their results have been proposed. These include shadow model approaches, like LIME and SHAP; model inspection approaches, like Grad-CAM; and data-based approaches, like Formal Concept Analysis (FCA). Explaining the decisions of a blackbox ML model using any one of these approaches has limitations, as the underlying model is rather complex, and running an explanation model for each sample is not cost-efficient. This motivates the design of a hybrid approach for evaluating the interpretability of blackbox ML models. One of the major limitations of the widely-used LIME explanation framework is the sampling criterion employed in the SP-LIME algorithm for generating a global explanation of the model. In this work, we investigate a hybrid approach based on LIME that uses FCA for structured sampling of instances. The approach combines the benefits of a data-based approach (FCA) and a proxy model-based approach (LIME). We evaluate these models on three real-world datasets: IRIS, Heart Disease and Adult Earning. We evaluate our approach on two parameters: 1) the prominent features in the explanations, and 2) the proximity of the proxy model to the original blackbox ML model. We use the calibration error metric to measure the closeness between the blackbox ML model and the proxy model.</p>
      </abstract>
      <kwd-group>
        <kwd>Interpretability</kwd>
        <kwd>Explainability</kwd>
        <kwd>Blackbox Models</kwd>
        <kwd>Deep Neural Network</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Formal Concept Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Explainability is an important aspect of an AI system in order to increase the trustworthiness of its decision-making process. Many blackbox deep learning models are being developed and deployed for real-world use (an example is Google’s Diabetic Retinopathy System [1]). For such blackbox systems, neither the model details nor the training dataset is made publicly available. Explaining the predictions made by such blackbox systems has been a great challenge.</p>
      <p>Apart from post-hoc visualization techniques [2] (e.g., feature dependency plots) and feature importance techniques based on sensitivity analysis, there have been three main approaches to the explainability of AI systems: i) proxy or shadow model approaches such as LIME and SHAP; ii) model inspection approaches such as Class Activation Maps (CAM), Grad-CAM, Smooth-Grad-CAM, etc.; and iii) data-based approaches such as decision sets and Formal Concept Analysis [3, 4, 5, 6, 7]. Most of the research work on explainability has followed one of the above approaches [8]. However, each of these approaches has limitations in the way the explanations are generated. In the proxy model approach, a data corpus needs to be created by perturbing the inputs of the target blackbox model before an interpretable shadow model is built; in the model inspection approach, the model architecture needs to be available for inspection to determine the activations; and in the data-based approach, the training data needs to be available.</p>
      <p>Local shadow models are interpretable models that are used to explain individual predictions of blackbox machine learning models. LIME (Local Interpretable Model-agnostic Explanations [9]) is a well-known approach where shadow models are trained to approximate the predictions of the underlying blackbox model. LIME focuses on training local shadow models to explain individual predictions: a prediction of interest of the target blackbox deep learning model is considered, and its related input features are perturbed within a neighborhood proximity to measure the changes in predictions. Based on a reasonable sample of such perturbations, a dataset is created and a locally linear explainable model is constructed. To cover the decision-making space of the target model, Submodular Pick-LIME (SP-LIME) [9] generates global explanations by finding a set of points whose explanations (generated by LIME) are varied in their selected features and in their dependence on those features. SP-LIME proposes a sampling strategy based on sub-modular picks to select instances such that the interpretable features have higher importance.</p>
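      <p>To make the LIME and SP-LIME workflow concrete, the following minimal sketch uses the open-source lime package with a scikit-learn random forest on the IRIS data; the parameter values (sample_size, num_features, num_exps_desired) are our illustrative choices rather than settings used in this paper.</p>
      <preformat><![CDATA[
# Minimal sketch: LIME local explanations plus SP-LIME submodular picks.
# Assumes the open-source `lime` package and a scikit-learn classifier.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer
from lime.submodular_pick import SubmodularPick

data = load_iris()
train_X, train_y = data.data, data.target
clf = RandomForestClassifier(random_state=0).fit(train_X, train_y)

explainer = LimeTabularExplainer(
    train_X,
    feature_names=data.feature_names,
    class_names=data.target_names,
    discretize_continuous=True,
)

# Local explanation for a single prediction of interest.
exp = explainer.explain_instance(train_X[0], clf.predict_proba, num_features=4)
print(exp.as_list())

# SP-LIME: pick a small, non-redundant set of explanations that together
# cover the model's decision-making space.
sp = SubmodularPick(explainer, train_X, clf.predict_proba,
                    sample_size=20, num_features=4, num_exps_desired=5)
for e in sp.sp_explanations:
    print(e.as_list())
]]></preformat>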
      <p>Figure 1: Example output of LIME after adding noisy features to the Heart Disease dataset.</p>
      <p>Figure 1 shows a sample explanation output of LIME for a binary classification problem on the Heart Disease dataset. The prediction probabilities are shown on the left using different colors, and the prominent features that are important for the classification decision are shown on the right. Important features are presented in sorted order based on their relevance. Note that some noisy features are also injected into the dataset and are therefore present in the explanation as well (af1, af2, af3 and af4). In an ideal scenario, noisy features should not be the most relevant features for any ML model and therefore should be least important from an explanation point of view. However, due to proxy model inaccuracies and unreliability, these noisy features can sometimes appear among the most relevant features in explanations. In Figure 2, we show an example scenario that compares the calibration level of two proxy models with a machine learning model. The x axis in this figure is the confidence of the model and the y axis is the accuracy. Assuming that we have a blackbox machine learning model and a proxy model that explains this model, we argue that these models should be close to each other in terms of their calibration levels.</p>
      <p>Ideally, a proxy model which is used for explaining a machine learning model should be as close as possible to the original model.</p>
      <p>Motivated by the design of an optimized explanation model, we design a hybrid approach where we combine the shadow model approach proposed by LIME with the data-based approach of Formal Concept Analysis to explain the outcomes of a machine learning model. We use LIME to interpret locally by means of a linear shadow model of the blackbox model, use Formal Concept Analysis to construct a concept lattice of the training dataset, and then extract implication rules among the features. Based on the implication rules, we select relevant samples as the global instances that we feed to SP-LIME. Therefore, rather than using all instances (which is very costly for deep networks) or random sampling (which never guarantees optimal behavior), we use an FCA-guided approach for selecting the instances. Accordingly, we call our framework Guided-LIME.</p>
      <p>We show that Guided-LIME results in better coverage of the explanation space as compared to SP-LIME. Our main contributions in this paper are as follows:</p>
      <p>• We propose a hybrid approach based on LIME and FCA for generating explanations by exploiting the structure in the training data. We demonstrate how FCA helps in structured sampling of instances for generating global explanations.</p>
      <p>• Using the structured sampling, we can choose optimal instances, both in terms of quantity and quality, to generate explanations and interpret the outcomes. Thereafter, using the calibration error metric, we show that Guided-LIME is a closer approximation of the original blackbox ML model.</p>
    </sec>
    <sec>
      <title>2. Background and Preliminaries</title>
      <sec>
        <title>2.1. Blackbox Model Outcome Explanation</title>
        <p>A blackbox is a model whose internals are either unknown to the observer or are known but uninterpretable by humans. Given a blackbox model solving a classification problem, the blackbox outcome explanation problem consists of providing an interpretable explanation for the outcome of the blackbox. In other words, the interpretable model must return the prediction together with an explanation of the reasons for that prediction. In this context, local interpretability refers to understanding only the reasons for a specific decision; in this case, only the single prediction/decision is interpretable. On the other hand, a model is completely interpretable when we are able to understand its global prediction behavior (the different possible outcomes of various test predictions).</p>
      </sec>
      <sec id="sec-1-1">
        <title>2.2. LIME Approach for Global</title>
      </sec>
      <sec id="sec-1-2">
        <title>Explanations</title>
        <p>The SP-LIME algorithm provides a global understanding of the machine learning model by explaining a set of individual instances. Ribeiro et al. [9] propose a budget that denotes the number of explanations to be generated. Thereafter, they use a pick step to select instances for the user to inspect. The aim is to obtain non-redundant explanations that represent how the model behaves globally, which is done by avoiding instances with similar explanations. However, this algorithm has some limitations [10]:</p>
        <sec id="sec-1-2-1">
          <title>Data points are sampled from a Gaussian distribu</title>
          <p>tion, ignoring the correlation between features. This
can lead to unlikely data points which can then be
used to learn local explanation models. In [11], au- 
thors study the stability of the explanations given by
LIME. They showed that the explanations of two very
close points varied greatly in a simulated setting. This
instability decreases the trust in the produced
explanations. The correct definition of the neighborhood
is also an unsolved problem when using LIME with
tabular data. Local surrogate models e.g. LIME is a
concrete and very promising implementation. But the
method is still in development phase and many
problems need to be solved before it can be safely applied.</p>
        <p>• The SP-LIME algorithm is based on a greedy approach which does not guarantee an optimal solution.</p>
        <p>• The algorithm runs the model on all instances to maximize the coverage function.</p>
      </sec>
      <sec id="sec-1-2">
        <title>2.3. Formal Concept Analysis</title>
        <p>Formal Concept Analysis (FCA) is a data mining model that presents the relations among attributes in a visual form. It was introduced in the early 1980s by Wille (1982) to study how objects can be hierarchically grouped together according to their common attributes. FCA deals with the formalization of concepts and has been applied in many disciplines, such as software engineering, machine learning, knowledge discovery and ontology construction, over the last 20-25 years. A formal context consists of a set of objects, a set of attributes, and a binary relation recording which objects have which attributes. A formal concept of a formal context is a pair of an object set and an attribute set such that the objects are exactly those sharing all the attributes, and the attributes are exactly those shared by all the objects. The set of all formal concepts of a context, together with the subconcept order relation, forms a complete lattice, called the concept lattice of the context.</p>
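        <p>The two derivation operators behind this definition are straightforward to compute on a small binary context. The toy sketch below is our own illustration (the objects and attributes are made up, not those of Figure 3): it enumerates the formal concepts of a context by closing each object set against the shared attributes.</p>
        <preformat><![CDATA[
# Toy formal-concept computation for a small binary context (illustrative
# objects/attributes). A concept is a pair (extent, intent) such that the
# shared attributes of the extent equal the intent and vice versa.
from itertools import combinations

context = {                      # object -> set of attributes it has
    "o1": {"petal_short", "sepal_wide"},
    "o2": {"petal_short"},
    "o3": {"petal_long", "sepal_wide"},
}
objects = set(context)
attributes = set().union(*context.values())

def common_attributes(objs):     # derivation: objects -> shared attributes
    return set(attributes) if not objs else set.intersection(*(context[o] for o in objs))

def common_objects(attrs):       # derivation: attributes -> objects having all of them
    return {o for o in objects if attrs <= context[o]}

concepts = set()
for r in range(len(objects) + 1):
    for objs in combinations(sorted(objects), r):
        intent = common_attributes(set(objs))
        extent = common_objects(intent)       # closure of the object set
        concepts.add((frozenset(extent), frozenset(intent)))

for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), "<->", sorted(intent))
]]></preformat>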
        <p>Figures 3 and 4 are examples from the IRIS dataset (more details in Section 4). In Figure 3, we show a collection of some objects and their attributes. For simplicity, we choose only those objects where a particular attribute is either present or absent; in real-world data, objects can have very complex relationships with fuzzy values. Figure 4 is an example concept lattice generated using this sample data.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Guided-LIME Framework:</title>
    </sec>
    <sec id="sec-3">
      <title>Guiding sampling in SP-LIME using FCA extracted concepts</title>
      <p>uses these instances to generate a set of local
explanation models and covers the overall decision-making
space. FCA provides a useful means for discovering
implicational dependencies in complex data [12, 13].</p>
      <p>In previous work, FCA-based mechanism has been
In [9] SP-LIME has been used to generate global ex- used as an approach to explain the outcome of a
blackplanations of a blackbox model. SP-LIME carries out box machine learning model through the construction
submodular picks from a set of explanations generated of lattice structure of the training data and then using
for a given set X of individual data instances. The SP- that lattice structure to explain the features of
predicLIME algorithm picks out explanations based on fea- tions made on test data [4]. In this proposed hybrid
ture importances across generated explanations. How- approach, we use the power of FCA to determine
imever, the data instances X from which explanations plication rules among features and using that to guide
are generated, are either the full dataset (called Full- the submodular picks for LIME in order to generate
LIME) or data points sampled from a Gaussian distri- local explanations. It provides the benefits of using
bution (SP-LIME random), and ignore the correlation data-based approach and proxy model based approach
between features in the dataset. Carrying out SP-LIME in a unified framework.
for the full dataset (Full-LIME) is very time consuming
especially when the dataset is large. Carrying out SP- 3.1. FCA-based selection of Instances
LIME random on the dataset may end up considering
data points that are implied by other data points in the The goal of our FCA-based instances selection is to
explanation space. Thus it is important to analyze the take advantage of the underlying structure of data to
full data set and choose only those points for SP-LIME build a concise and non-redundant set of instances.
such that the selected data points are representative of We hypothesize that most of the state-of-the-art
apthe data space. In this work, we propose a mechanism proaches do not consider this information (to the best
to determine the implication of features to guide the of our knowledge). We shortlist sample instances
usselection of the instances X from the training dataset. ing the following process:
We use Formal Concept Analysis (FCA) to analyze the
training data and discover feature implication rules. 1. We first binarize the training data in an ad-hoc
Using these feature implication rules, we pick appro- way. The binarization technique is applied to
priate instances to feed into SP-LIME. SP-LIME then discretize the continuous attribute values into
3.1.1. Generating Implication Rules from</p>
      <p>Training Data</p>
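        <p>A minimal sketch of the ad-hoc binarization in step 1 is shown below; thresholding each continuous attribute at its median is our illustrative choice, and a method such as ChiMerge [14] could be substituted.</p>
        <preformat><![CDATA[
# Ad-hoc binarization of continuous attributes (step 1 above): each numeric
# column is turned into a 0/1 feature by thresholding at its median.
# The threshold choice is an illustrative assumption.
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
X = iris.data                                    # continuous attributes

binary_context = (X > X.median()).astype(int)    # objects x binary attributes
binary_context.columns = [f"{c}_high" for c in X.columns]
print(binary_context.head())
]]></preformat>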
      <sec id="sec-3-1">
        <title>In order to find an optimal subset of samples, we gen</title>
        <p>erate implication rules from the given training data.
One of the challenge in generating implication rules is
that for a given domain and training data, the number
of rules can be very large. Therefore, we shortlist rules
based on their expressiveness e.g. we select the subset
of rules that have the highest coverage and lowest
redundancy.</p>
          <p>When we generate association rules from the dataset, the conclusion does not necessarily hold for all objects; it holds for some stated percentage of the objects that satisfy the premise of the rule. We sort the rules by this percentage and select the top-ranked rules; how many rules to keep is calculated empirically for a given domain.</p>
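          <p>A small sketch of this ranking step is given below; it is our own illustration, assuming rules are represented as premise/conclusion pairs over the binarized attributes and that the stated percentage is the usual rule confidence.</p>
          <preformat><![CDATA[
# Rank candidate implication/association rules by confidence, i.e. the
# percentage of objects satisfying the premise that also satisfy the
# conclusion, and keep the top-ranked rules. Rule mining itself (from the
# lattice or an association-rule miner) is assumed to have happened already.
import pandas as pd

def confidence(df, premise, conclusion):
    """df: binary object-attribute table; premise/conclusion: attribute lists."""
    covers_premise = df[premise].all(axis=1)
    if covers_premise.sum() == 0:
        return 0.0
    return float(df.loc[covers_premise, conclusion].all(axis=1).mean())

def top_rules(df, candidate_rules, k):
    """candidate_rules: list of (premise, conclusion) pairs; keep the k best."""
    scored = [(confidence(df, p, c), p, c) for p, c in candidate_rules]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]
]]></preformat>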
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Generating Lattice Structure and Selecting Instances</title>
          <p>Using the lattice structure and implication rules, we select instances for guiding SP-LIME. We identify all the instances that follow the implication rules. For each rule in the “implication rules list”, we check whether a given sample “passes” or “fails” the criterion, i.e. whether that sample satisfies the implication rule or not. Finally, we produce a sorted list of the instances that are deemed most likely to maximize coverage while remaining non-redundant.</p>
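          <p>Continuing under the same assumptions, the pass/fail check and the resulting ranking might be sketched as follows; scoring by the number of satisfied rules is our simplification of the coverage and redundancy criteria.</p>
          <preformat><![CDATA[
# Score every training sample against the shortlisted implication rules:
# a sample "passes" a rule if it satisfies both the premise and the
# conclusion. Samples are ranked by how many rules they pass, and exact
# duplicates are dropped as redundant.
import pandas as pd

def passes(row, rule):
    premise, conclusion = rule
    return all(row[a] == 1 for a in premise) and all(row[a] == 1 for a in conclusion)

def rank_samples(df, rules):
    """df: binarized training data; rules: list of (premise, conclusion) pairs."""
    scores = df.apply(lambda row: sum(passes(row, r) for r in rules), axis=1)
    ranked = df.assign(score=scores).sort_values("score", ascending=False)
    return ranked.drop_duplicates(subset=list(df.columns))   # non-redundancy
]]></preformat>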
        <sec id="sec-3-1-1">
          <title>3.2. Guided-LIME for Global</title>
        </sec>
        <sec id="sec-3-1-2">
          <title>Explanations</title>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>We propose structured data sampling based approach</title>
        <p>Guided-LIME towards a hybrid framework extending
SP-LIME. SP-LIME normally has two methods for
sampling: random and full. In the random approach,
samonly of two values, 0 or 1. The binarization pro- ples are chosen randomly using a Gaussian
distribucess can be done in a more formal manner e.g. tion. On the other hand, full approach make use of all
chiMerge algorithm [14] which ensures that bi- the instances. We extend the LIME implementation to
narization method does not corrupt the gener- integrate another method “FCA" that takes the samples
ated lattice. In the scope of current work, we generated using lattice and implication rules.
keep this process simple enough. Thereafter, we Algorithm 1 explains the steps to perform structured
generate concept lattice using standard FCA-based sampling using training data and pass to SP-LIME for
approach. Each concept in the lattice represents generating global explanations. The input to
Guidedthe objects sharing some set of properties; and LIME is training data used to train the blackbox ML
each sub-concept in the lattice represents a sub- model. Data processing for finding the best samples
set of the objects. for Guided-LIME involves binarization of data.
There2. We use ConExp concept explorer tool to gener- after, a concept lattice is created based on FCA
apate lattice from the training data [15]. proach [4]. Using the concept lattice, we derive
implication rules. These rules are then used to select test
instances for Guided-LIME.</p>
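        <p>The integration point with LIME is small: instead of letting SP-LIME draw random samples, the FCA-ranked instances are handed to the submodular pick step. The sketch below uses the open-source lime package; select_instances_with_fca is a placeholder name for the pipeline of Section 3.1, not an existing API.</p>
        <preformat><![CDATA[
# Guided-LIME sketch: SP-LIME is run only on the FCA-selected instances
# rather than on random draws or the full dataset. `select_instances_with_fca`
# stands in for the binarize -> lattice -> implication-rules -> ranking
# pipeline sketched in Section 3.1.
from lime.lime_tabular import LimeTabularExplainer
from lime.submodular_pick import SubmodularPick

def guided_lime(train_X, feature_names, class_names, predict_proba,
                select_instances_with_fca, num_exps_desired=5):
    explainer = LimeTabularExplainer(train_X,
                                     feature_names=feature_names,
                                     class_names=class_names)
    guided_X = select_instances_with_fca(train_X)      # structured sampling
    sp = SubmodularPick(explainer, guided_X, predict_proba,
                        method="full",                  # explain every guided sample
                        num_features=5,
                        num_exps_desired=num_exps_desired)
    return sp.sp_explanations
]]></preformat>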
      </sec>
      <sec id="sec-3-3">
        <title>Algorithm 1: Sample selection using FCA for Guided-LIME</title>
        <p>Require: the training dataset. Ensure: selected samples and their ranking.</p>
        <p>For the given training dataset consisting of data samples: binarize the numeric features; generate the concept lattice using FCA; find the implication rules; generate candidate samples and their ranking; and select the top samples for each rule.</p>
        <p>For all top samples from each rule: select the final samples using the redundancy and coverage criteria.</p>
        <p>As we mentioned previously, there are various examples of using a single approach for explanation. This can be done using any of the proposed techniques, i.e. a proxy model, activation-based or perturbation-based approach. However, we argue that none of these approaches provides a holistic view in terms of outcome explanation, whereas a hybrid approach, such as a combination of a proxy model and a data-based approach, can provide a better explanation at a much reduced cost.</p>
        <p>One question that arises for our hybrid approach is whether it is still model agnostic, like LIME. We argue that the sampling step does not affect model agnosticism in any manner; it merely adds a sampling step which helps in choosing the samples in a systematic manner.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments and Results</title>
      <sec id="sec-4-1">
        <title>4.1. Experimental Setup</title>
        <sec id="sec-4-1-1">
          <title>We use the following publicly available datasets to eval</title>
          <p>uate the proposed framework: IRIS, Heart Disease and
Adult Earning dataset (See Table 1). IRIS dataset
contains 3 classes of 50 instances each, where each class
refers to a type of iris plant [16]. There are a total
of 150 samples with 5 attributes each: sepal length,
sepal width, petal length, petal width, class (Iris
Setosa, Iris Versicolor, Iris Virginica). Similarly, Heart
Disease dataset contains 14 attributes; 303 samples and
two classes [17]. Adult Earning dataset contains 48000
samples, 14 features across two classes. The machine
learning task for all three datasets is classification. We
use random forest blackbox machine learning model
in all our experiments.</p>
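        <p>The blackbox setup can be reproduced along the following lines; this is a sketch for the IRIS dataset with scikit-learn defaults, the 25% fraction and the af naming of the injected noisy features follow the description in Section 4.2, and the random seed is our own choice.</p>
        <preformat><![CDATA[
# Experimental setup sketch: train a random forest blackbox on IRIS after
# appending 25% artificial "noisy" features (af1, ...) with random values.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
data = load_iris()
X, y = data.data, data.target

n_noisy = max(1, int(0.25 * X.shape[1]))             # 25% extra noisy features
noisy = rng.random((X.shape[0], n_noisy))
X_aug = np.hstack([X, noisy])
feature_names = list(data.feature_names) + [f"af{i+1}" for i in range(n_noisy)]

blackbox = RandomForestClassifier().fit(X_aug, y)     # default scikit-learn params
print(dict(zip(feature_names, blackbox.feature_importances_.round(3))))
]]></preformat>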
        <table-wrap id="tbl-1">
          <label>Table 1</label>
          <caption>
            <p>Datasets used for evaluation and their features.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Dataset</th>
                <th>Samples</th>
                <th>Classes</th>
                <th>Features</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>IRIS</td>
                <td>150</td>
                <td>3</td>
                <td>sepal length, sepal width, petal length, petal width</td>
              </tr>
              <tr>
                <td>Heart Disease</td>
                <td>303</td>
                <td>2</td>
                <td>age of patient, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting ECG, maximum heart rate achieved, exercise induced angina, ST depression induced by exercise relative to rest, peak exercise ST segment, number of major vessels colored by fluoroscopy, Thal, diagnosis of heart disease</td>
              </tr>
              <tr>
                <td>Adult Earning</td>
                <td>48000</td>
                <td>2</td>
                <td>age, workclass, fnlwgt, education, education-num, marital status, occupation, relationship, race, sex, capital-gain, capital-loss, hours-per-week, native-country</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec>
        <title>4.2. Results</title>
        <p>[Figures 6-8: count of artificial (noisy) features (y axis: # Artificial Feature Count) for SP-LIME vs. Guided-LIME.]</p>
        <p>The goal of this experiment is to compare the proposed Guided-LIME approach with the random sampling of SP-LIME. In the scope of this work, we do not compare the proposed hybrid approach with the full sampling of SP-LIME. We perform a case study to find out which approach is better at selecting important features for a given blackbox model. As shown in Table 1, we maintain a ground truth oracle of important features as domain knowledge [18, 19]. We train a random forest classifier with the default parameters of scikit-learn. In this experiment, we add 25% artificially “noisy” features to the training data. The values of these features are chosen randomly. In order to evaluate the effectiveness of the approach, we use the FDR (false discovery rate) metric, which is defined as the total number of noisy features selected as important features in the explanation.</p>
        <p>We calculate the occurrence of noisy features in the generated explanations. Ideally, the noisy features should not occur among the important features; therefore a lower FDR suggests a better approach to explanation. We present the number of noisy features discovered per explanation, averaged over 100 runs. Each explanation consists of a feature importance vector that shows the importance of each feature. As seen in Figures 6, 7, and 8, the y axis is the number of noisy features and the x axis is the index of the noisy feature. We include the cases where a noisy feature is in first or second place in the feature importance vector: AF-1_Imp-1 represents an artificial/noisy feature occurring in first place in the feature importance vector, whereas AF-1_Imp-2 represents an artificial/noisy feature occurring in second place. The Guided-LIME sampling approach is consistently better than basic SP-LIME.</p>
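        <p>The FDR-style bookkeeping can be expressed compactly. The helper below is our illustration: it counts how often the injected noisy features (af1, af2, ...) appear among the top entries of each LIME feature-importance vector, mirroring the AF-1_Imp-1 and AF-1_Imp-2 counts.</p>
        <preformat><![CDATA[
# Count how often injected noisy features (names containing "af") appear
# among the top positions of LIME feature-importance vectors; the total is
# the FDR-style score used to compare SP-LIME and Guided-LIME.
def noisy_feature_hits(explanations, top_k=2, noisy_marker="af"):
    """explanations: list of lime Explanation objects from one run."""
    hits = 0
    for exp in explanations:
        ranked = sorted(exp.as_list(), key=lambda kv: abs(kv[1]), reverse=True)
        top_features = [name for name, _ in ranked[:top_k]]
        hits += sum(noisy_marker in name for name in top_features)
    return hits
]]></preformat>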
      </sec>
      <sec id="sec-4-2">
        <title>4.3. Validating Guided-LIME using calibration level</title>
        <p>The objective of this experiment is to validate which proxy model is a closer approximation of the original blackbox model with respect to the prediction probabilities of each model. In order to measure this closeness, various distance metrics can be used, e.g. KL divergence, cross entropy, etc. We use the well-established ECE (expected calibration error) and MCE (maximum calibration error) as the underlying metrics to detect the calibration of both models [20]. Calibration errors provide a better estimate of the reliability of ML models [21, 22]. Moreover, the focus of our experiment is to estimate the proximity of the shadow model w.r.t. the original blackbox model. Calibration error values are therefore used to compare which model is the better approximation of the original model. We hypothesize that the proxy model with an ECE closer to that of the original blackbox ML model is the closer approximation.</p>
        <p>We perform the experiment in two settings: 1) with the original data and 2) after adding noisy features to the data. As shown in Tables 2 and 3, in both scenarios the ECE and MCE of Guided-LIME are closer to those of the original blackbox ML model in comparison to random SP-LIME. This justifies the benefit of structured sampling. We also run experiments with the full samples of LIME. Although this can be a better approximation of the original model, taking all the samples in the proxy model is not a practical and economical choice for huge real-world datasets. Guided-LIME has a closer ECE to the original blackbox model. Hence, Guided-LIME is a better choice as a proxy model to explain the original ML model.</p>
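        <p>For reference, the binned definitions of ECE and MCE used here can be computed as follows; this is a standard sketch, and the number of bins is our choice.</p>
        <preformat><![CDATA[
# Expected and maximum calibration error over equally spaced confidence bins:
# ECE is the accuracy-vs-confidence gap weighted by bin population, MCE is the
# largest gap over the bins. Both the blackbox and the proxy model are scored
# the same way and their values compared.
import numpy as np

def calibration_errors(confidences, correct, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece, mce = 0.0, 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        ece += in_bin.mean() * gap
        mce = max(mce, gap)
    return ece, mce
]]></preformat>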
      </sec>
    </sec>
    <sec>
      <title>5. Related Work</title>
      <p>Various approaches for the explainability of blackbox models have been proposed [8]. Broadly, the existing techniques can be classified into model explanation approaches, outcome explanation approaches, and model inspection approaches. There are also examples of works that focus on the transparent design of models.</p>
        <p>In this work, we focus only on the outcome
explanation approaches. In the category of outcome
explanation, CAM, Grad-CAM, Smooth Grad-CAM++, SHAP,
DeepLIFT, LRP and LIME are the main approaches [23,
24, 25, 9, 26, 27, 28]. These methods provide a locally
interpretable shadow model which is able to explain
the prediction of the blackbox in understandable terms
for humans.</p>
      <p>The most popular shadow model approaches for blackbox ML model explanations are Local Interpretable Model-Agnostic Explanations (LIME) and SHAP. LIME can explain the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction. In order to make the predictions easily interpretable, LIME has two design goals: ease of interpretation and local fidelity. This means that the outcomes of the shadow model are easily interpretable and that the explanations for individual predictions are locally faithful, i.e. they correspond to how the model behaves in the vicinity of the individual observation being predicted.</p>
      <p>Feature importance is a well-known approach to explain blackbox models. More recently, instance-wise feature selection methods have been proposed to extract a subset of features that are most informative for each given example in a deep learning network [29]. In [30], the authors make use of a combination of neural networks to identify prominent features that impact the model accuracy. These approaches are based on subset sampling through back-propagation.</p>
      <p>Ribeiro et al. [9] present the Local Interpretable Model-agnostic Explanations (LIME) approach, which does not depend on the type of data nor on the type of blackbox to be opened. In other words, LIME can return an understandable explanation for the prediction obtained by any blackbox. The main intuition of LIME is that the explanation may be derived locally from records generated randomly in the neighborhood of the record to be explained. As blackboxes, the following classifiers are tested: decision trees, logistic regression, nearest neighbors, SVM and random forest.</p>
      <p>In contrast, SHAP (SHapley Additive exPlanations) is distinctly built on the Shapley value. The Shapley value is the average of the marginal contributions across all permutations. The Shapley values consider all possible permutations; thus SHAP is a unified approach that provides global and local consistency and interpretability. However, its cost is time: it has to compute all permutations in order to give the results. The SHAP approach has speed limitations, as it has to compute all permutations globally to get local accuracy, whereas LIME perturbs data around an individual prediction to build a model. For generating a global explanation, SHAP needs to be run for every instance. This generates a matrix of Shapley values which has one row per data instance and one column per feature. We can interpret the entire model by analyzing the Shapley values in this matrix.</p>
      <p>In the CAM and Grad-CAM approaches, an explanation is provided by using a Saliency Mask (SM), i.e. a subset of the original record which is mainly responsible for the prediction. For example, as a saliency mask we can consider part of an image or a sentence in a text. A saliency image summarizes where a DNN looks in an image when making its predictions. Although these solutions are not limited/agnostic to blackbox neural networks, they require specific architectural modifications.</p>
      <p>In [31], the authors find the global importance introduced by Local Interpretable Model-agnostic Explanations (LIME) unreliable and present an approach based on global aggregations of local explanations, with the objective of providing insight into a model’s global decision-making process. This work reveals that the choice of aggregation matters for the ability to gain reliable and useful global insights on a blackbox model. We take this work as motivation to propose a hybrid approach where aggregations can be generated using knowledge of the data through an FCA-based system.</p>
      <p>In contrast to model explanation approaches such as LIME and SHAP [9, 26], our approach is complementary and can guide these approaches in selecting the optimal instances for explanation. Extracting rules from neural networks is also a well-studied problem [32]. These approaches depend on various factors such as the quality of the rules extracted, algorithmic complexity, the expressive power of the extracted rules, and the portability of the rule extraction technique. Our approach also uses knowledge of the structure in the data; however, it is not dependent on the blackbox model.</p>
      <p>Moreover, formal concept analysis gives this data analysis a solid theoretical basis.</p>
    </sec>
    <sec id="sec-5">
      <title>6. Conclusions and Future Work</title>
      <sec id="sec-5-1">
        <title>In this paper,we proposed a hybrid approach for eval</title>
        <p>uating interpretability of blackbox ML systems.
Although Guided-LIME do not guarantee an optimal
solution, yet we observe that a single approach like LIME
is not suficient to explain the AI system thoroughly.
There are limitations of deciding an optimal sampling
criteria in SP-Lime algorithm. Our approach combines
the benefits of using a data-based approach (FCA) and
proxy model based approach (LIME). Overall, our
approach is complementary to SP-LIME as we provided
a structured way of selecting right instances for global
explanations. Our results on real world datasets shows
that false discovery rate is much lower with
GuidedLIME in comparison to random SP-LIME. Moreover,
Guided-LIME has a closer ECE and MCE to the
original blackbox model. In future, we would like to
perform extensive experiments with diverse datasets and
complex deep learning models.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>