<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Impact of Weight Functions on Preferred Abductive Explanations for Decision Trees</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Louenas Bounia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthieu Goliot</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anasse Chafik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre de Recherche en Informatique de Lens (CRIL), Université d'Artois &amp; CNRS</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université d'Artois</institution>
          ,
          <addr-line>Lens</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this article, our main objective is to address the issue of diversity in abductive explanations for decision trees by studying the impact of different weight functions on preferred abductive explanations. We acknowledge that users may have specific preferences regarding the explanations they receive. Therefore, we propose several criteria to obtain high-quality subsets of abductive explanations that take these preferences into account. These criteria are defined by the users themselves by assigning weights to different preference criteria. To evaluate the impact of these preference criteria on abductive explanations and the relationships between the obtained subsets, we propose an approach based on SAT encoding. This allows us to enumerate more easily the different subsets of abductive explanations that meet the user-defined preference criteria. Additionally, we use measures based on the distance between two sets of explanations to assess the correlation between user preferences and the extent to which result sets differ from each other for different preferences. In summary, this study represents a first step towards providing a framework for selecting abductive explanations that cater to users' preferences in a diverse and high-quality manner. We aim to instill the necessary confidence in users to utilize these explanations in their decision-making process by offering explanations tailored to their individual preferences.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI</kwd>
        <kwd>Diversity of explanations</kwd>
        <kwd>Decision trees</kwd>
        <kwd>Weight functions</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Explaining Machine Learning (ML) models is an important challenge that has been a subject of
study in AI in recent years (see, for example, [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2, 3, 4</xref>
        ]). In this article, we focus on abductive
explanations for binary decision tree models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Abductive explanations aim to clarify why a
classifier classifies an instance as positive or negative. In contrast, contrastive explanations aim
to explain why the instance was not classified as expected (thus addressing the question "why
not the other classification?"). Several types of abductive explanations exist depending on the
classifier used. These include the direct reason [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the prime implicant [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], also known as the
sufficient reason [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The quality of an explanation relies not only on the reason itself but often
depends on the person being explained to and the domain involved.
      </p>
      <p>In this article, we focus on the diversity of abductive explanations, a crucial aspect when it
comes to user-guided explanations. When a user requests an explanation for the classification of
an example by a machine learning model, they may have specific preferences regarding the form
or content of that explanation. For instance, some users prefer concise and succinct explanations,
while others prioritize more detailed and comprehensive explanations. Our study primarily
centers on preferred abductive reasons, which are considered the most anticipated explanations
by users. We have chosen to investigate the diversity of preferred explanations within the context
of decision trees, which are widely used machine learning models. Diversity, in this context,
can be perceived as a means to account for different priorities among users. In other words, the
objective of this study is to consider user preferences, especially when they vary from one another.</p>
      <p>
        We first propose a SAT encoding based on the encoding proposed by Jabbour et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to
enumerate the preferred sufficient reasons. Several weight functions based on XAI methods
known in the literature have been considered to compute the preferred reasons based on the
weights provided by these functions. These weight functions allow us to compute the preferred
sufficient reasons for a given method (or a given user) using a gradual preference model expressed
by weights. Finally, we evaluate the impact of different weight functions on the preferred sufficient
reasons for a given decision tree, by first counting their number and then calculating the distance
between two sets of preferred explanations. This measure allows us to quantify the gap between
two subsets of explanations and thus measure the impact of user preference diversity on the
produced explanations.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Decision Trees and Abductive Explanations</title>
      <sec id="sec-2-1">
        <title>2.1. Preliminaries</title>
        <p>For an integer n, let [n] be the set {1, . . . , n}. We denote by ℱ_n the class of all Boolean
functions from {0, 1}^n to {0, 1}, and we use X_n = {x_1, . . . , x_n} to represent the set of Boolean
input variables. Any assignment x ∈ {0, 1}^n is called an instance. If f(x) = 1 for f ∈ ℱ_n,
then x is called a model of f. x is a positive instance if f(x) = 1, and a negative instance if f(x) = 0.</p>
        <p>We refer to f as a propositional formula when it is described using the Boolean connectives
∧ (conjunction), ∨ (disjunction), ¬ (negation), as well as the Boolean constants 1 (true) and
0 (false). Other connectives, such as implication →, may also be considered. As usual, a
literal ℓ is a variable x_i (a positive literal) or its negation ¬x_i, also denoted x̄_i (a negative
literal). x_i and x̄_i are complementary literals. A positive literal x_i is associated with a
positive feature (i.e., x_i is assigned 1), while a negative literal x̄_i is associated with a negative feature.</p>
        <p>A term t is a conjunction of literals, and a clause c is a disjunction of literals. Lit(t) denotes
the set of all literals of t. A DNF (Disjunctive Normal Form) formula is a disjunction of terms,
and a CNF (Conjunctive Normal Form) formula is a conjunction of clauses. The set of variables
appearing in a formula Φ is denoted by Var(Φ). A formula Φ is consistent if and only if it has a
model. A CNF formula is monotone when each literal of a given variable in the formula has the
same polarity (i.e., each time a literal appears in the formula, the complementary literal does not
appear in the formula). A formula Φ_1 implies a formula Φ_2, denoted Φ_1 |= Φ_2, if and only if every
model of Φ_1 is a model of Φ_2. Two formulas Φ_1 and Φ_2 are equivalent, denoted Φ_1 ≡ Φ_2, if and
only if they have the same models. Given an assignment x ∈ {0, 1}^n, the corresponding term is
defined as:</p>
        <p>t_x = ⋀_{i=1}^{n} ℓ_i, where ℓ_i = x̄_i if x_i = 0 and ℓ_i = x_i if x_i = 1.</p>
        <!-- Figure 1: the decision tree over x_1, x_2, x_3, x_4 used as the running example (not reproduced here). -->
        <p>A term t covers an assignment x if t ⊆ t_x. An implicant of a Boolean function f is a term that
implies f. A prime implicant of f is an implicant t of f such that no proper subset of t is an
implicant of f. Conversely, an implicate of a Boolean function f is a clause that is implied by f,
and a prime implicate of f is an implicate c of f such that no proper subset of c is an implicate of
f.</p>
        <p>Definition 1 (Boolean decision tree). A Boolean decision tree over X_n is a binary tree T,
where each internal node is labeled with one of the n Boolean input variables, and each leaf is labeled
with either 0 or 1. Each variable appears at most once along any path from the root to a leaf. The
value T(x) ∈ {0, 1} of T for the input instance x is determined by the label of the leaf reached from
the root as follows: at each node, we follow the left or right child depending on whether the input
value of the corresponding variable is 0 or 1. The size of T (denoted |T|) is its number of nodes.</p>
        <p>
          The class of decision trees over X_n is denoted DT_n. It is well known that any tree T ∈ DT_n
can be transformed into an equivalent disjunction of terms in linear time, denoted DNF(T),
where each term corresponds to a path from the root to a leaf labeled 1. Similarly, T can be
transformed in linear time into a conjunction of clauses, denoted CNF(T) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], where each clause
is the negation of a term corresponding to a path from the root to a leaf labeled 0.
        </p>
        <p>The tree shown in Figure 1 will be used as a running example in the rest of the paper.</p>
        <p>Example 1. The decision tree in Figure 1 classifies bank loans using the following attributes: x_1:
"does not have a permanent contract", x_2: "is over 50 years old", x_3: "has annual income below 35K",
and x_4: "has not repaid a previous loan".</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Abductive explanations</title>
        <p>We consider the concept of abductive explanation. Formally, for f ∈ ℱ_n and x ∈ {0, 1}^n, an
abductive explanation (reason) for x given f is an implicant t of f (or of ¬f in the case where
f(x) = 0) that covers x. There always exists an abductive explanation of x given f, because
t = t_x is such a trivial explanation. Therefore, in the remainder of this section, we will focus on
more concise forms of abductive explanations.</p>
        <p>
          Direct reasons [
          <xref ref-type="bibr" rid="ref10 ref6">10, 6</xref>
          ] are abductive explanations specific to decision trees and random forests
(see [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]). Other abductive explanations exist that are not specific to a particular classifier, such
as sufficient reasons [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. In the following, we will define sufficient reasons.
        </p>
        <p>Definition 2 (Sufficient reason). Let f ∈ ℱ_n and x ∈ {0, 1}^n such that f(x) = 1 (resp. f(x) = 0).
A sufficient reason for x given f is a prime implicant t of f (resp. ¬f) that covers x. sr(x, f)
denotes the set of all sufficient reasons for x given f.</p>
        <p>
          A sufficient reason [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] (or PI-explanation [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]) for an instance x given a Boolean function f is
a subset t of t_x that is minimal with respect to set inclusion, and such that any instance x′ that
is covered by t is classified by f in the same way as x. Thus, when t covers x and f(x) = 1, t is a sufficient
reason for x given f if and only if t is a prime implicant of f; and when f(x) = 0, t is a sufficient
reason for x given f if and only if t is a prime implicant of ¬f. Sufficient reasons do not contain
any redundant attributes. We refer to a minimum-size sufficient reason for x given f as a sufficient
reason for x given f that contains the minimum number of literals.
        </p>
        <p>Example 2. Going back to Example 1, we can observe that T(x) = 0 (bank loan rejected) for
the instance x = (1, 1, 1, 1). The direct reason for x is x_1 ∧ x_2 ∧ x_3 ∧ x_4; x_1 ∧ x_2 ∧ x_4,
x_1 ∧ x_3 ∧ x_4, and x_2 ∧ x_3 ∧ x_4 are the sufficient reasons for x given T. They are also the only
minimum-size sufficient reasons for x given T.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Computing All Abductive Explanations</title>
      <p>
        The number of sufficient reasons for an instance may be exponential [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In the following,
we recall that even for the restricted class of decision trees with logarithmic depth, an instance
x can have an exponential number of sufficient reasons. By definition, the number of minimum-size
sufficient reasons for x cannot be greater than the number of its sufficient reasons. However,
restricting ourselves to minimum-size sufficient reasons does not guarantee a significant reduction in
their number [
        <xref ref-type="bibr" rid="ref10 ref12">12, 10</xref>
        ] because an instance can have an exponential number of minimum-size sufficient
reasons. The following proposition, due to Audemard et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
          ], confirms the exponential nature of the number of minimum-size sufficient reasons.
      </p>
      <p>Proposition 1. For any n ∈ N such that n is odd, there exists a decision tree T ∈ DT_n with depth
(n+1)/2, containing 2n + 1 nodes, and an instance x ∈ {0, 1}^n such that the number of minimum-size
sufficient reasons for x given T is equal to 2^√n − 1.</p>
      <sec id="sec-3-1">
        <title>3.1. Computing all minimum-size sufficient reasons</title>
        <p>
          In order to synthesize the set of sufficient reasons, we first focus on the minimum-size sufficient
reasons. Although the set of minimum-size sufficient reasons for an instance given a decision
tree can be exponential, this number cannot exceed the total number of sufficient reasons, and
in practice it can be significantly smaller. However, unlike sufficient reasons, which can be
generated in polynomial time [
          <xref ref-type="bibr" rid="ref10 ref12">10, 12</xref>
          ], computing minimum-size reasons is not an easy task.
Proposition 2. Let T ∈ DT_n and x ∈ {0, 1}^n. Computing a minimum-size sufficient reason for x
given T is NP-hard.
        </p>
        <p>Despite this intractability result in the general case, computing a set of minimum-size
sufficient reasons is possible in many practical cases. For this purpose, we rely on recent
advances in combinatorial optimization related to SAT.</p>
        <p>First, let us recall that a Partial MaxSAT problem consists of a pair (soft, hard), where
soft and hard are (finite) sets of clauses. The objective is to determine, if it exists, an assignment
of the variables that maximizes the number of satisfied clauses of soft while satisfying all clauses
of hard. We can use a Partial MaxSAT solver to compute minimum-size sufficient reasons:
Proposition 3. Let T be a decision tree in DT_n and x ∈ {0, 1}^n an instance such that T(x) = 1. Let
(soft, hard) be the instance of the Partial MaxSAT problem such that:
soft = {x̄_i : x_i ∈ t_x} ∪ {x_i : x̄_i ∈ t_x}
and
hard = {c ∩ t_x : c ∈ CNF(T)}.
The intersection of t_x with ω*, where ω* is an optimal solution of (soft, hard), is a minimum-size
sufficient reason for x given T.</p>
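        <p>As an illustration, here is a minimal sketch of the encoding of Proposition 3 using the RC2 MaxSAT solver of the PySAT library (our choice of solver; any Partial MaxSAT solver would do). Clauses are lists of signed integers, with integer i standing for x_i and -i for its complement:</p>
        <preformat>
# Minimal sketch of Proposition 3 (assumptions: PySAT installed via
# "pip install python-sat"; cnf_T holds the clauses of CNF(T) when T(x) = 1).
from pysat.examples.rc2 import RC2
from pysat.formula import WCNF

def minimum_size_sufficient_reason(cnf_T, x):
    # t_x: the term corresponding to the instance x (1-indexed variables).
    t_x = {i if x[i - 1] == 1 else -i for i in range(1, len(x) + 1)}
    wcnf = WCNF()
    # hard: each clause of CNF(T) restricted to the literals of t_x must
    # keep at least one literal, so that the kept term still implies T.
    for c in cnf_T:
        wcnf.append([l for l in c if l in t_x])
    # soft: prefer dropping each literal of t_x (unit complementary clauses).
    for l in t_x:
        wcnf.append([-l], weight=1)
    with RC2(wcnf) as rc2:
        model = rc2.compute()
    # t_x ∩ ω*: the literals of t_x kept by the optimal solution.
    return sorted([l for l in t_x if model[abs(l) - 1] == l], key=abs)
        </preformat>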
        <p>
          A Partial MaxSAT solver can also be used to compute a predefined number of minimum-size
sufficient reasons. The process involves generating an initial reason t, adding the negation of t (¬t)
to hard, and including a cardinality constraint to ensure that the subsequently computed reasons
have the same size as t. This process is repeated until the desired number of reasons is reached
or no solution exists. Computing a single explanation is often insufficient to fully understand
the behavior of a classifier. On the other hand, providing millions of explanations would not be
practical for the user. Reasons can vary greatly from one another, and the quality of a reason also
depends on the person to whom it is explained. The authors of [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] propose leveraging
user preferences to select the most relevant reasons and thus reduce their number. This restricted
set of explanations has two advantages: it aligns as closely as possible with the user's preferences
and can drastically reduce the overall number of explanations. However, it is important to note
that even two experts in the same field may have different preferences. In our work, we focus
on the impact of different weight functions on the set of preferred sufficient reasons given a
decision tree T, in order to better understand the diversity of abductive explanations.
        </p>
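        <p>The enumeration loop described above can be sketched as follows, reusing the (soft, hard) instance of Proposition 3; as a simplification, the cardinality constraint is replaced here by a size check on each newly computed reason:</p>
        <preformat>
# Sketch of the enumeration process (PySAT, same conventions as above).
from pysat.examples.rc2 import RC2
from pysat.formula import WCNF

def enumerate_min_size_reasons(cnf_T, x, limit=10):
    t_x = {i if x[i - 1] == 1 else -i for i in range(1, len(x) + 1)}
    wcnf = WCNF()
    for c in cnf_T:
        wcnf.append([l for l in c if l in t_x])  # hard clauses
    for l in t_x:
        wcnf.append([-l], weight=1)              # soft clauses
    reasons = []
    while limit > len(reasons):
        with RC2(wcnf) as rc2:
            model = rc2.compute()
        if model is None:
            break                                # no solution exists
        reason = [l for l in t_x if model[abs(l) - 1] == l]
        if reasons and len(reason) > len(reasons[0]):
            break        # larger than the optimum: stop the enumeration
        reasons.append(reason)
        wcnf.append([-l for l in reason])        # add the negation of t to hard
    return reasons
        </preformat>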
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Preferred abductive explanations</title>
      <p>
        One rational way to address this question is to focus on a subset of explanations, referred to
as the preferred ones [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Defining what makes an explanation "preferred" or "good enough"
is challenging in general, and there is no consensus on this matter, as seen in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Preferred
explanations can be either the complete set of abductive explanations [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] or subsets thereof,
particularly those containing only sufficient reasons. Although the notion of preferred reasons
makes sense for any Boolean classifier, our results are specific to decision trees since they concern
sufficient reasons. The authors of [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] have defined several preference models, and in
the following, we focus on one of them: Maximum-Weight Explanations.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Maximum-Weight Explanations</title>
        <p>A standard way to model a preference relation on a combinatorial domain is to use a utility function (or cost
function). In our context, this involves assigning a utility value (weight) to each feature. This
approach leads to a total preorder on explanations, where the best explanations are those with
the highest weight.</p>
        <p>The idea behind a utility function is to measure the importance of each feature in the
explanation. For example, one can assign a weight to each feature corresponding to its
usefulness or relevance to the considered problem. The larger the utility value of a feature,
the more important it is in the explanation. By associating a utility value with each
feature, one can calculate an overall utility value for each explanation by summing the utility
values of its features. This allows ranking explanations based on their utility value and
determining the best explanations, those with the highest utility value. The advantage of
this approach is that it allows for more complex preferences to be taken into account than
simply ranking features in order of importance. Indeed, each user may have different
preferences, and a personalized utility function allows these preferences to be modeled more finely.</p>
        <p>
          In the general case, computing a maximum-weight sufficient reason is NP-hard in the broad
sense. This follows from the fact that a minimum-size sufficient reason for an instance given a
decision tree is a minimum-weight preferred reason for that instance and decision tree under the
weight mapping w_1 such that w_1(i) = 1 for each i ∈ [n]. Computing a maximum-weight
sufficient reason for a given instance of a decision tree is NP-hard [
          <xref ref-type="bibr" rid="ref11 ref16">11, 16</xref>
          ]. Nevertheless, the
approach presented in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] for computing minimum-size sufficient reasons can be generalized to the case
of maximum-weight sufficient reasons. This amounts to solving an instance of the Weighted
Partial MaxSAT problem.
        </p>
        <p>
          Definition 3. Let T ∈ DT_n and let w : X_n → N* be a weight function associating a weight with
each feature. A maximum-weight sufficient reason for x given T and w is a sufficient reason t for x
given T that maximizes Σ_{x_i ∈ Var(t)} w(i).
Proposition 4. Let T ∈ DT_n and an instance x ∈ {0, 1}^n such that T(x) = 1. Let w : X_n → N* be
a weight function. A maximum-weight sufficient reason for x given T and w is given by t_x ∩ ω*,
where ω* is an optimal solution of the instance (soft, hard) of the Weighted Partial MaxSAT
problem such that:
soft = {(x_i, w(i)) : x_i ∈ t_x} ∪ {(x̄_i, w(i)) : x̄_i ∈ t_x}
hard = {(c, ∞) : c ∈ CNF(T)}
where hard is obtained by applying the prime-implicant encoding proposed in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to the CNF encoding of the decision tree.
        </p>
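        <p>As a concrete illustration of Definition 3 (not the MaxSAT encoding of Proposition 4), the following brute-force sketch enumerates the prime implicants covering x and keeps one of maximum weight; it is only usable for small n, and the CNF used for ¬T is an assumed one, consistent with the sufficient reasons of Example 2 rather than with the exact encoding of Figure 1:</p>
        <preformat>
# Brute-force maximum-weight sufficient reason (illustration for small n).
from itertools import combinations

def implies_cnf(term, cnf):
    # A consistent term implies a CNF iff it shares a literal with each clause.
    return all(any(l in term for l in c) for c in cnf)

def max_weight_sufficient_reason(cnf, x, w):
    t_x = [i if x[i - 1] == 1 else -i for i in range(1, len(x) + 1)]
    best, best_w = None, -1
    for k in range(1, len(t_x) + 1):
        for term in combinations(t_x, k):
            s = set(term)
            # keep only prime implicants: no literal can be removed
            if implies_cnf(s, cnf) and all(
                    not implies_cnf(s - {l}, cnf) for l in s):
                weight = sum(w[abs(l) - 1] for l in term)
                if weight > best_w:
                    best, best_w = term, weight
    return best, best_w

cnf_not_T = [[4], [1, 2], [1, 3], [2, 3]]  # assumed CNF for the negation of T
print(max_weight_sufficient_reason(cnf_not_T, (1, 1, 1, 1), (5, 1, 8, 4)))
# ((1, 3, 4), 17): the reason x_1 ∧ x_3 ∧ x_4, anticipating Example 3 below
        </preformat>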
        <p>
          In the following, we will use "maximum-weight sufficient reason" for the explanation with
the highest weight and "preferred sufficient reason" for the preferred explanation.
Remark. We would like to clarify that the encoding proposed in this article (Proposition 3) is
different from the one proposed by the authors of [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], even though both are based on MaxSAT.
The aim of the encoding in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] is to minimize the sum of weights to obtain preferred reasons,
while our approach aims to maximize it. Another major difference is the exploitation of the
encoding of [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to compute the preferred sufficient reasons of the decision tree. This encoding allows for
an easier enumeration of the preferred sufficient reasons of a decision tree.
        </p>
        <p>Example 3. Let us consider the example of a banker (banker 1) using a decision tree to decide whether to
approve or reject a loan for a client. Suppose the decision tree is the one of Example 1, and
the banker wants to understand why a particular instance, x = (1, 1, 1, 1), was classified as a
rejection (T(x) = 0). In this case, there are multiple sufficient reasons to explain this classification.
These reasons are all combinations of attributes that, if true, result in a negative classification. For
x = (1, 1, 1, 1), the sufficient reasons are: x_1 ∧ x_2 ∧ x_4, x_1 ∧ x_3 ∧ x_4, and x_2 ∧ x_3 ∧ x_4. However,
the banker prefers an explanation without the attribute x_2 because it is a non-actionable attribute,
meaning the client cannot change it. In this case, we can use a weight function over the attributes
to find the best explanation. In this example, we use the weight function w_1 = (5, 1, 8, 4), which
assigns higher weights to attributes considered more important for the decision. Using this weight
function, the solver returns that the maximum-weight explanation is x_1 ∧ x_3 ∧ x_4, which
does not include the non-actionable attribute x_2.</p>
      </sec>
    </sec>
    <sec id="sec-4b">
      <title>5. Weight Functions and Distance Between Two Finite Subsets of Explanations</title>
      <p>The main idea of this section is to address the variations in user preference aggregation modalities
regarding preferred abductive reasons. It is acknowledged that even two experts in the same
domain can have different preferences. However, in the absence of a real-world application with
actual user preferences, the study focuses on exploring different weight measures, both local
and global. The weight functions used in this study are based on different approaches such as
Shapley values, Banzhaf values, LIME, Anchors, Explanatory, as well as Wordfreq and feature
importance. These weight functions allow quantifying the relative importance of different features
or attributes in explaining the results of the classification model. By using these weight measures,
it is possible to take into account user preferences when aggregating abductive explanations,
assigning different weights to features based on their perceived importance.</p>
      <sec id="sec-4-2">
        <title>5.1. Weight Functions</title>
        <p>
          Global Weight Measures: Global weight measures focus on the contribution of features by
considering all predictions over all instances. We present some of the global weight measures
used in the literature to aggregate user preferences regarding preferred sufficient reasons.
• Wordfreq: Zipf's law states that the frequency f of a word in a corpus is inversely
proportional to its rank r, i.e., f ∝ 1/r. This law is often used to model the distribution
of word frequencies in a linguistic corpus. The Zipf frequency z of a word is given by:
z = log10(N/r), where N is the total number of words in the corpus and r is the rank of
the word, i.e., its position in the ranking of most frequent words (more information at
https://pypi.org/project/wordfreq/; see the snippet after this list).
• Feature importance: The "Mean Decrease Impurity" (MDI) method is used to evaluate
the importance of attributes in a classification task by measuring the average decrease in
impurity (e.g., entropy or Gini index) in the decision tree when the attribute is used to
divide the data into subgroups. The importance of an attribute is then evaluated by taking
the average and standard deviation of this decrease in impurity over all splits of the
tree that use that attribute [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
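        <p>For illustration, here is a hypothetical use of the wordfreq package to turn feature names into global weights (the attribute words are ours):</p>
        <preformat>
# Hypothetical global weights from Zipf frequencies ("pip install wordfreq").
from wordfreq import zipf_frequency

features = ["contract", "age", "income", "loan"]
weights = {f: zipf_frequency(f, "en") for f in features}
print(weights)  # Zipf frequencies, typically between 0 and 8
        </preformat>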
        <p>
          Local Weight Measures: Local measures focus on the contribution of features to a specific
prediction, i.e., to an individual predicted instance. We now present some local weight measures:
• Local Surrogate Models (LIME): LIME allows for the explanation of individual predictions
made by non-interpretable machine learning models. This technique was proposed and
implemented by Ribeiro et al. in 2016 [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. LIME focuses on constructing local surrogate
models to explain individual predictions. The idea is to train an interpretable surrogate
model on a new dataset composed of locally perturbed samples.
• SHAP (SHapley Additive exPlanations): The Shapley value is based on cooperative
game theory. The goal of SHAP is to explain the prediction of an observation by calculating
the contribution of each variable to that prediction. We used the method proposed by [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
• Anchors: Anchors [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] is an interpretability technique that aims to find sets of rules that
best summarize the behavior of the model under study. The objective is to identify the
largest possible local regions where predictions are as consistent as possible.
• Explanatory: It involves calculating the number of models for each variable x_i given the
instance x and a decision tree T, using the D4 compiler [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
        <p>Example 4. Two other bankers have different preferences for explanations compared to the banker
of Example 3. The second banker believes that if the client has not repaid a previous loan, they will
never be able to repay a new loan, so they prefer an explanation containing attribute x_4. These preferences
are expressed with w_2 = (1, 1, 1, 10). On the other hand, a third banker thinks that if the client has
an annual income below 35K and is over 50 years old, it is preferable not to grant them a loan due
to their low salary relative to their age, so they prefer an explanation containing x_2 ∧ x_4.
• For w_2 = (1, 1, 1, 10), the reasons x_1 ∧ x_2 ∧ x_4, x_1 ∧ x_3 ∧ x_4, and x_2 ∧ x_3 ∧ x_4 are preferred
sufficient reasons based on the preferences of the second banker.
• The two reasons x_1 ∧ x_2 ∧ x_4 and x_2 ∧ x_3 ∧ x_4 are the two preferred sufficient reasons based
on the preferences of the third banker.</p>
        <p>Example 4 demonstrates that subsets of preferred reasons can be very different from each other.
For instance, the two subsets of preferred reasons based on the preferences of bankers 1 and 3
do not share any common reason.</p>
        <p>Monotone Transformation. The operation of MaxSAT solvers requires positive integer
weights, while the values of SHAP, LIME, etc., are not necessarily positive or
integer initially. In order to satisfy this constraint while maintaining the same
preference order as the SHAP, LIME, etc., values, we perform a monotonically increasing
transformation on the values of the different weight functions. The Explanatory method does not
require a monotone transformation, as the number of models for each literal is already a positive
integer. Given a weight vector w ∈ R^n, we first multiply w by 10^d, where d is the maximum
number of decimal places, and then apply w_i ← w_i − min_{j∈[n]}(w_j) + 1. This transformation
converts all the weights into positive integers while preserving the preference order.</p>
        <p>Example 5 (monotone transformation). Let T ∈ DT_4 be a decision tree and x ∈ {0, 1}^4 an
instance, and let SHAP(x, T) = (0.5, −0.2, 0.3, −0.1) be the Shapley values for the instance x given
T. Then, the monotone increasing transformation gives w(x) = (8, 1, 6, 2).</p>
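        <p>A small sketch of this transformation (assuming weights given with finitely many decimal places), reproducing the values of Example 5:</p>
        <preformat>
# Monotone transformation: scale to integers, then shift into positive values.
def monotone_transform(weights):
    # d: maximum number of decimal places over the weights.
    d = max(len(str(w).split(".")[1]) if "." in str(w) else 0
            for w in weights)
    scaled = [round(w * 10 ** d) for w in weights]  # multiply by 10^d
    m = min(scaled)
    return [s - m + 1 for s in scaled]              # shift into N*

print(monotone_transform([0.5, -0.2, 0.3, -0.1]))   # [8, 1, 6, 2]
        </preformat>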
      </sec>
      <sec id="sec-4-3">
        <title>5.2. Distance Between Two Finite Sets of Explanations</title>
        <p>When it comes to evaluating the impact of user preferences on preferred abductive explanations,
several evaluation criteria can be considered. One of these criteria is a distance measure based on
the symmetric difference between two explanations. This distance measure allows quantifying
the proximity between two explanations. The symmetric difference between two explanations
involves considering the literals that are present in one explanation but not in the other, that is,
the literals that are specific to each explanation. By comparing the cardinality of this symmetric
difference, we can assess the degree of similarity or difference between these two explanations.
Additionally, we consider the distance between two finite subsets of explanations to be the
minimum distance between the explanations within these two subsets.</p>
        <p>The idea behind this distance measure is to provide an estimation of the proximity between
sets of explanations, allowing us to understand how these sets come closer to or move away from
each other. This can be useful for evaluating the similarities or divergences in user preferences
regarding abductive explanations.</p>
        <p>Definition 4. The distance between two finite subsets of explanations E_1 and E_2 is defined as
d(E_1, E_2) = min_{t_1 ∈ E_1, t_2 ∈ E_2} |δ(t_1, t_2)|, where |·| represents the counting measure, and δ is the
symmetric difference between two explanations t_1 and t_2, given by the
formula δ(t_1, t_2) = {ℓ : ℓ ∈ Lit(t_1) ∪ Lit(t_2) ∧ ℓ ∉ Lit(t_1) ∩ Lit(t_2)} = Lit(t_1) △ Lit(t_2).</p>
        <p>Note that the larger the value of d(E_1, E_2), the farther apart the two sets E_1 and E_2 are from
each other. If E_1 ∩ E_2 ≠ ∅, then d(E_1, E_2) = 0. From a topological perspective, d expresses the
geometric distance between two finite subsets of explanations, taking into account the topological
nature of explanations, which are terms composed of literals.</p>
        <p>Lemma 1. The complexity of computing the distance between two subsets of explanations, E_1 and
E_2, is quadratic.</p>
        <p>The computational complexity of computing the distance between two sets of explanations, E_1
and E_2, depends on the sizes of these sets. Let us assume that n_1 is the size of E_1 and n_2
is the size of E_2. Each element of E_1 must be compared with each element of E_2
to compute the distance between them. This implies a comparison between the n_1 elements of E_1
and the n_2 elements of E_2, resulting in a complexity of the order of O(n_1 · n_2), which is quadratic
when n_1 and n_2 are sufficiently large.</p>
        <p>Example 6. Based on Example 4, let us denote E_1, E_2, and E_3 the subsets of preferred
explanations based on the preferences of bankers 1, 2, and 3, respectively. We have d(E_1, E_2) = 0 and
d(E_2, E_3) = 0 because E_1 ∩ E_2 ≠ ∅ and E_3 ∩ E_2 ≠ ∅, while d(E_1, E_3) = 2.</p>
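        <p>Definition 4 translates directly into code; the following sketch represents each explanation by the set of its literals and reproduces the value d(E_1, E_3) = 2 of Example 6:</p>
        <preformat>
# Distance between two finite sets of explanations (Definition 4).
from itertools import product

def distance(E1, E2):
    # min over all pairs of |Lit(t1) △ Lit(t2)|; O(|E1| · |E2|) comparisons,
    # as stated in Lemma 1.
    return min(len(t1 ^ t2) for t1, t2 in product(E1, E2))

# Example 6: E_1 (banker 1) and E_3 (banker 3) share no reason.
E1 = [frozenset({"x1", "x3", "x4"})]
E3 = [frozenset({"x1", "x2", "x4"}), frozenset({"x2", "x3", "x4"})]
print(distance(E1, E3))  # 2
        </preformat>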
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Experiments</title>
      <p>
        Experimental setup. We considered 18 well-known binary classification datasets available
on Kaggle, OpenML, and UCI. No data preprocessing was performed for numerical attributes;
the attributes were binarized on the fly by the decision tree learning algorithm used. For each
benchmark, we evaluated the classification performance using standard evaluation metrics. We
used the CART algorithm and its implementation in Scikit-Learn to learn decision trees, with
default parameter settings. For each benchmark, we selected a subset of up to 250 randomly chosen
instances from the test set; when a dataset contains fewer than 250 instances, the entire dataset
was used. We computed the number of sufficient reasons using the encoding
proposed by [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and the number of minimum-size sufficient reasons using the Partial MaxSAT
solver (with a 60-second timeout per instance). Finally, we computed the number of preferred
sufficient reasons using the encoding detailed in Section 4 and the Weighted Partial
MaxSAT solver from OpenWBO [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
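      <p>A minimal sketch of the learning step, on a stand-in dataset rather than one of the 18 benchmarks: a CART tree with default Scikit-Learn parameters, whose MDI feature importances also serve as one of the global weight functions of Section 5.1:</p>
      <preformat>
# Sketch of the experimental learning step (stand-in synthetic dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(clf.score(X_test, y_test))  # standard accuracy metric
print(clf.feature_importances_)   # MDI weights (cf. Section 5.1)
      </preformat>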
      <p>
        Regarding the weight functions, for each tree T, we used the exact method proposed by [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
to compute the SHAP score as well as the scores for LIME [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Anchors [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We also used
feature importance with Scikit-Learn [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the number of models "Explanatory" with [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], and
the Zipf frequency of each feature viewed as a word, via the wordfreq library. Two weight
functions (local and global random) based on random weight sampling were added to clarify the
nature of preferred explanations for different weight functions. We report the classical statistics:
the average number and variance of sufficient reasons, minimum-size sufficient reasons, and
preferred sufficient reasons for each weight function method. Finally, for the "placement" and
"compas" datasets, we report the distance between the different preferred subsets for the different
weight functions.
      </p>
      <sec id="sec-5-1">
        <title>6.1. Experimental results</title>
        <p>
          Tables 2 and 3 present an excerpt of the results. The tables report results on datasets, decision
trees, and weight measures, over the 18 datasets. For each benchmark, the table provides
the dataset name (name), the accuracy of the decision trees (in %), the number of binary
variables, and the number of instances. The next two columns respectively indicate the mean
and standard deviation (std) of the number of sufficient reasons and the number of preferred
sufficient reasons. Then, for each benchmark, the columns #wordf, #f_imp, R_[1,10], R_[1,100],
and R_[1,1000] correspondingly give the number of preferred sufficient reasons for wordfreq,
feature importance, and global random sampling over the intervals [1,10], [1,100], and [1,1000].
The columns of Table 3 give the mean and standard deviation (std) of the number of preferred
sufficient reasons for the local weight measures, in the following order: LIME, Shapley, Anchors,
Explanatory, and local random sampling over the intervals [1,10], [1,100], and [1,1000]. We
clarify that "local random sampling" consists of selecting integer weights for each instance, while
respecting a specified interval. Let us consider an illustrative example: suppose we have a dataset
with instances of size n = 5, meaning that there are five elements in each instance. The specified
interval is [1, 10], indicating that the chosen weights must be integer values ranging from 1 to 10.
For each individual instance, we perform a random draw to determine the corresponding weights.
In our example, the weight vector w = (9, 4, 7, 5) is generated from this random draw. Each
weight in the vector is an integer chosen randomly within the interval [1, 10].
        </p>
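        <p>A sketch of this local random sampling (hypothetical helper):</p>
        <preformat>
# One integer weight per feature, drawn uniformly in the given interval.
import random

def random_local_weights(n, lo=1, hi=10, seed=None):
    rng = random.Random(seed)
    return [rng.randint(lo, hi) for _ in range(n)]

print(random_local_weights(5))  # e.g. [9, 4, 7, 5, 2]
        </preformat>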
        <p>First. We would like to emphasize that computing preferred reasons given a decision tree and
instance is feasible in practice. In fact, for many datasets and instances, the computation of all
preferred reasons has been completed in less than 20 seconds, regardless of the type of weight
function used. It is evident that the use of different weight function types has a significant impact
on the number of reasons, making it easier to compute all preferred reasons by reducing their
quantity compared to sufficient reasons and minimum-size reasons.
        </p>
        <p>Furthermore, it is important to note that for each dataset, each instance in the benchmark
of that dataset, and each type of weight function, enumerating the preferred sufficient reasons was
feasible. Leveraging user preferences offers a significant advantage by substantially reducing the
number of generated explanations. By focusing solely on the explanations preferred by the user,
information overload is avoided, and attention is directed towards the most relevant and useful
explanations.</p>
        <p>Second. Tables 4 and 5 present matrices that visualize the average distances between different
subsets of explanations. These subsets of explanations are obtained using various methods of
local and global weight assignment. The values in the matrices correspond to the distances
between pairs of subsets, where the coordinates (i, j) represent the weight assignment methods
used. When examining the diagonal entries of the matrix, we observe that the distances are
zero. This is because a subset is identical to itself, so the distance between a subset and itself is
always 0. Additionally, it is important to note that the matrices are symmetric. This is because
the distance used is symmetric, as is the case for any distance.</p>
        <p>By observing the distances between the diferent subsets of explanations, we notice that they
are generally less than 1. This indicates that the explanations are relatively close to each other in
terms of distance. Topologically, this suggests that the set of suficient reasons forms a compact
structure, where the explanations are closely grouped and interconnected. This observation
represents an initial step in studying the diversity of formal explanations. It indicates that the
different methods of local and global weight assignment used to generate the explanations do
not result in explanations that are very distant from each other. This raises questions about the
variety and extent of possible explanations, as well as how local weight assignment methods can
influence the diversity of the obtained explanations.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7. Conclusion</title>
      <p>To summarize the contributions highlighted in this article, we first proposed a CNF-encoding
approach to compute preferred suficient reasons for decision trees. This approach involves
representing the reasons in a logical form that facilitates their calculation. Additionally, we
introduced the concept of distance between preferred explanations and examined the impact of
weight functions on preferred abductive explanations. Namely, we investigated how different
methods of assigning weights affect the proximity of preferred explanations to each other.
Our focus was on the quantity and diversity of these explanations. We found that a classified
instance, whether positive or negative, can have an exponential number of reasons, including
an exponential number of minimum-size reasons or preferred reasons. This means that there
can be numerous possible explanations for a single classified instance. However, despite this
potential diversity, the number of preferred reasons is significantly smaller than the number
of sufficient reasons, regardless of the weight function used. Generally, there is a restricted
selection of preferred explanations that are considered the most relevant or useful. Furthermore,
we observed that the distances between different sets of explanations are generally not large.
This indicates that abductive explanations for decision trees tend to be close to each other in
terms of similarity or proximity. In other words, the explanations often share similar features or
partially overlap. These findings suggest that despite the potential diversity of explanations, there
are commonalities and trends among preferred explanations for decision trees. This can be useful
in understanding how decisions are made by these models and in providing comprehensible
explanations to users.</p>
      <p>Studying the impact of weight functions on preferred abductive explanations for decision trees
is just the first step in our research on the diversity of abductive explanations. We intend to apply
a similar approach to other models, particularly random forests. Concurrently, we are developing
a SAT encoding to compute the SAT distance between preferred sets of sufficient reasons. The
aim of this endeavor is to provide users with a framework for selecting preferred explanations
that align with their personal preferences and are closer to the model’s output. In other words,
through this SAT encoding, users will be able to measure the proximity between diferent sets of
explanations and identify those that are most relevant and consistent with their expectations. This
will enhance their understanding of the model’s results and enable the provision of explanations
that are better suited to the users' needs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"why should I trust you?": Explaining the predictions of any classifier</article-title>
          ,
          <source>in: Proc. of SIGKDD'16</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>Anchors: High-precision model-agnostic explanations</article-title>
          ,
          <source>in: Proc. of AAAI'18</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1527</fpage>
          -
          <lpage>1535</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>in: Proc. of NIPS'17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>4765</fpage>
          -
          <lpage>4774</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          , Interpretable Machine Learning, Leanpub,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          , Random forests,
          <source>Machine Learning</source>
          <volume>45</volume>
          (
          <year>2001</year>
          )
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Izza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ignatiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marques-Silva</surname>
          </string-name>
          ,
          <article-title>On explaining decision trees</article-title>
          ,
          <source>CoRR abs/2010.11034</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <article-title>A symbolic approach to explaining bayesian network classifiers</article-title>
          ,
          <source>in: Proc. of IJCAI'18</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>5103</fpage>
          -
          <lpage>5111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hirth</surname>
          </string-name>
          ,
          <article-title>On the reasons behind decisions</article-title>
          ,
          <source>in: Proc. of ECAI'20</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jabbour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marques-Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Salhi</surname>
          </string-name>
          ,
          <article-title>Enumerating prime implicants of propositional formulae in conjunctive normal form</article-title>
          ,
          <source>in: Logics in Artificial Intelligence</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On the explanatory power of boolean decision trees</article-title>
          ,
          <source>Data &amp; Knowledge Engineering</source>
          <volume>142</volume>
          (
          <year>2022</year>
          )
          <fpage>102088</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0169023X22000799. doi:10.1016/j.datak.2022.102088.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>Trading complexity for sparsity in random forest explanations</article-title>
          ,
          <source>in: Proc. of AAAI'22</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>Sur le pouvoir explicatif des arbres de décision</article-title>
          ,
          <source>EGC'2022</source>
          <volume>38</volume>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On preferred abductive explanations for decision trees and random forests</article-title>
          ,
          <source>in: Proc. of IJCAI'22</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Towards a rigorous science of interpretable machine learning</article-title>
          ,
          <year>2017</year>
          . arXiv:1702.08608.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>Les raisons majoritaires: des explications abductives pour les forêts aléatoires</article-title>
          ,
          <source>EGC'2022</source>
          <volume>38</volume>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On the computational intelligibility of boolean classifiers</article-title>
          ,
          <source>in: Proc. of KR'21</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          ,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>An Improved Decision-DNNF Compiler</article-title>
          ,
          <source>in: Proc. of IJCAI'17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>667</fpage>
          -
          <lpage>673</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Manquinho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lynce</surname>
          </string-name>
          ,
          <article-title>Open-WBO: A modular MaxSAT solver</article-title>
          ,
          <source>in: International Conference on Theory and Applications of Satisfiability Testing</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Erion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>DeGrave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Prutkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Katz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Himmelfarb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Explainable ai for trees: From local explanations to global understanding</article-title>
          , arXiv preprint arXiv:1905.04610 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>