<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Impact of Weight Functions on Preferred Abductive Explanations for Decision Trees</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Louenas Bounia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthieu Goliot</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anasse Chafik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre de Recherche en Informatique de Lens (CRIL), Université d'Artois &amp; CNRS</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université d'Artois</institution>
          ,
          <addr-line>Lens</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this article, our main objective is to address the issue of diversity in abductive explanations for decision trees by studying the impact of different weight functions on preferred abductive explanations. We acknowledge that users may have specific preferences regarding the explanations they receive. Therefore, we propose several criteria to obtain high-quality subsets of abductive explanations that take these preferences into account. These criteria are defined by the users themselves by assigning weights to different preference criteria. To evaluate the impact of these preference criteria on abductive explanations and the relationships between the obtained subsets, we propose an approach based on SAT encoding. This allows us to enumerate more easily the different subsets of abductive explanations that meet the user-defined preference criteria. Additionally, we use measures based on the distance between two sets of explanations to assess the correlation between user preferences and the extent to which result sets differ from each other for different preferences. In summary, this study represents a first step towards providing a framework for selecting abductive explanations that cater to users' preferences in a diverse and high-quality manner. We aim to instill the necessary confidence in users to utilize these explanations in their decision-making process by offering explanations tailored to their individual preferences.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI</kwd>
        <kwd>Diversity of explanations</kwd>
        <kwd>Decision trees</kwd>
        <kwd>Weight functions</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Explaining Machine Learning (ML) models is an important challenge that has been a subject of
study in AI in recent years (see, for example, [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2, 3, 4</xref>
        ]). In this article, we focus on abductive
explanations for binary decision tree models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Abductive explanations aim to clarify why a
classifier classifies an instance as positive or negative. In contrast, contrastive explanations aim
to explain why the instance was not classified as expected (thus addressing the question "why
not the other classification?"). Several types of abductive explanations exist depending on the
classifier used. These include the direct reason [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the prime implicant [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], also known as the
sufficient reason [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The quality of an explanation relies not only on the reason itself but often
depends on the person being explained to and the domain involved.
      </p>
      <p>In this article, we focus on the diversity of abductive explanations, a crucial aspect when it
comes to user-guided explanations. When a user requests an explanation for the classification of
an example by a machine learning model, they may have specific preferences regarding the form
or content of that explanation. For instance, some users prefer concise and succinct explanations,
while others prioritize more detailed and comprehensive explanations. Our study primarily
centers on preferred abductive reasons, which are considered the most anticipated explanations
by users. We have chosen to investigate the diversity of preferred explanations within the context
of decision trees, which are widely used machine learning models. Diversity, in this context,
can be perceived as a means to account for different priorities among users. In other words, the
objective of this study is to consider user preferences, especially when they vary from one another.</p>
      <p>
        We first propose a SAT encoding based on the encoding proposed by Jabbour et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to
enumerate the preferred sufficient reasons. Several weight functions based on XAI methods
known in the literature have been considered to compute the preferred reasons based on the
weights provided by these functions. These weight functions allow us to compute the preferred
sufficient reasons for a given method (or a given user) using a gradual preference model expressed
by weights. Finally, we evaluate the impact of different weight functions on the preferred sufficient
reasons for a given decision tree, by first counting their number and then calculating the distance
between two sets of preferred explanations. This measure allows us to quantify the gap between
two subsets of explanations and thus measure the impact of user preference diversity on the
produced explanations.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Decision Trees and Abductive Explanations</title>
      <sec id="sec-2-1">
        <title>2.1. Preliminaries</title>
        <p>For an integer n, let [n] be the set {1, . . . , n}. We denote by ℱ_n the class of all Boolean
functions from {0, 1}^n to {0, 1}, and we use X_n = {x_1, . . . , x_n} to represent the set of Boolean
input variables. Any assignment x ∈ {0, 1}^n is called an instance. If f(x) = 1 for f ∈ ℱ_n,
then x is called a model of f. x is a positive instance if f(x) = 1, and a negative instance if f(x) = 0.</p>
        <p>We refer to f as a propositional formula when it is described using the Boolean connectives
∧ (conjunction), ∨ (disjunction), ¬ (negation), as well as the Boolean constants 1 (true) and
0 (false). Other connectives, such as implication →, may also be considered. As usual, a
literal ℓ is a variable x_i (a positive literal) or its negation ¬x_i, also denoted x̄_i (a negative
literal). x_i and x̄_i are complementary literals. A positive literal x_i is associated with a
positive feature (i.e., x_i is assigned 1), while a negative literal x̄_i is associated with a negative feature.</p>
        <p>A term t is a conjunction of literals, and a clause c is a disjunction of literals. Lit(t) denotes
the set of all literals of t. A DNF (Disjunctive Normal Form) formula is a disjunction of terms,
and a CNF (Conjunctive Normal Form) formula is a conjunction of clauses. The set of variables
appearing in a formula Φ is denoted by Var(Φ). A formula Φ is consistent if and only if it has a
model. A CNF formula is monotone when each literal of a given variable in the formula has the
same polarity (i.e., each time a literal appears in the formula, the complementary literal does not
appear in the formula). A formula Φ_1 implies a formula Φ_2, denoted Φ_1 |= Φ_2, if and only if every
model of Φ_1 is a model of Φ_2. Two formulas Φ_1 and Φ_2 are equivalent, denoted Φ_1 ≡ Φ_2, if and
only if they have the same models. Given an assignment x ∈ {0, 1}^n, the corresponding term is
defined as:</p>
        <p>t_x = ⋀_{i=1}^{n} ℓ_i, where ℓ_i = x̄_i if x_i = 0 and ℓ_i = x_i if x_i = 1.</p>
        <!-- Figure 1: the decision tree over x_1, x_2, x_3, x_4 used as the running example (not reproduced here). -->
        <p>A term t covers an assignment x if t ⊆ t_x. An implicant of a Boolean function f is a term that
implies f. A prime implicant of f is an implicant t of f such that no proper subset of t is an
implicant of f. Conversely, an implicate of a Boolean function f is a clause that is implied by f,
and a prime implicate of f is an implicate c of f such that no proper subset of c is an implicate of
f.</p>
        <p>Definition 1 (Boolean decision tree). A Boolean decision tree over X_n is a binary tree T,
where each internal node is labeled with one of the n Boolean input variables, and each leaf is labeled
with either 0 or 1. Each variable appears at most once along any path from the root to a leaf. The
value T(x) ∈ {0, 1} of T for the input instance x is determined by the label of the leaf reached from
the root as follows: at each node, we follow the left or right child depending on whether the input
value of the corresponding variable is 0 or 1. The size of T (denoted |T|) is its number of nodes.</p>
        <p>
          The class of decision trees over X_n is denoted DT_n. It is well known that any tree T ∈ DT_n
can be transformed into an equivalent disjunction of terms in linear time, denoted DNF(T),
where each term corresponds to a path from the root to a leaf labeled 1. Similarly, T can be
transformed in linear time into a conjunction of clauses, denoted CNF(T) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], where each clause
is the negation of a term corresponding to a path from the root to a leaf labeled 0.
        </p>
        <p>The tree shown in Figure 1 will be used as a running example in the rest of the paper.</p>
        <p>Example 1. The decision tree in Figure 1 classifies bank loans using the following attributes: x_1:
"does not have a permanent contract", x_2: "is over 50 years old", x_3: "has annual income below 35K",
and x_4: "has not repaid a previous loan".</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Abductive explanations</title>
        <p>We consider the concept of abductive explanation. Formally, for f ∈ ℱ_n and x ∈ {0, 1}^n, an
abductive explanation (reason) for x given f is an implicant t of f (or of ¬f in the case where
f(x) = 0) that covers x. There always exists an abductive explanation of x given f, because
t = t_x is such a trivial explanation. Therefore, in the remainder of this section, we will focus on
more concise forms of abductive explanations.</p>
        <p>
          Direct reasons [
          <xref ref-type="bibr" rid="ref10 ref6">10, 6</xref>
          ] are abductive explanations specific to decision trees and random forests
(see [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]). Other abductive explanations exist that are not specific to a particular classifier, such
as sufficient reasons [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. In the following, we will define sufficient reasons.
        </p>
        <p>Definition 2 (Sufficient reason). Let f ∈ ℱ_n and x ∈ {0, 1}^n such that f(x) = 1 (resp. f(x) = 0).
A sufficient reason for x given f is a prime implicant t of f (resp. ¬f) that covers x. sr(x, f)
denotes the set of all sufficient reasons for x given f.</p>
        <p>
          A sufficient reason [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] (or PI-explanation [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]) for an instance x given a Boolean function f is
a subset t of t_x that is minimal with respect to set inclusion, and such that any instance x′ that
is covered by t is classified by f in the same way as x. Thus, when t covers x and f(x) = 1, t is a sufficient
reason for x given f if and only if t is a prime implicant of f; and when f(x) = 0, t is a sufficient
reason for x given f if and only if t is a prime implicant of ¬f. Sufficient reasons do not contain
any redundant attributes. We refer to a minimum-size sufficient reason for x given f as a sufficient
reason for x given f that contains the minimum number of literals.
        </p>
        <p>Example 2. Going back to Example 1, we can observe that T(x) = 0 (bank loan rejected) for
the instance x = (1, 1, 1, 1). The direct reason for x is x_1 ∧ x_2 ∧ x_3 ∧ x_4; x_1 ∧ x_2 ∧ x_4,
x_1 ∧ x_3 ∧ x_4, and x_2 ∧ x_3 ∧ x_4 are the sufficient reasons for x given T. They are also the only
minimum-size sufficient reasons for x given T.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Computing All Abductive Explanations</title>
      <p>
        The number of sufficient reasons for an instance may be exponential [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In the following,
we recall that even for the restricted class of decision trees with logarithmic depth, an instance
x can have an exponential number of sufficient reasons. By definition, the number of minimum-size
sufficient reasons for x cannot be greater than the number of its sufficient reasons. However,
restricting ourselves to minimum-size sufficient reasons does not guarantee a significant reduction in
their number [
        <xref ref-type="bibr" rid="ref10 ref12">12, 10</xref>
        ] because an instance can have an exponential number of minimum-size sufficient
reasons. The following proposition, due to Audemard et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
          ], confirms the exponential nature of the number of minimum-size sufficient reasons.
      </p>
      <p>Proposition 1. For any n ∈ N such that n is odd, there exists a decision tree T ∈ DT_n with depth
(n+1)/2, containing 2n + 1 nodes, and an instance x ∈ {0, 1}^n such that the number of minimum-size
sufficient reasons for x given T is equal to 2^√n − 1.</p>
      <sec id="sec-3-1">
        <title>3.1. Computing all minimum-size sufficient reasons</title>
        <p>
          In order to synthesize the set of sufficient reasons, we first focus on the minimum-size sufficient
reasons. Although the set of minimum-size sufficient reasons for an instance given a decision
tree can be exponential, this number cannot exceed the total number of sufficient reasons, and
in practice it can be significantly smaller. However, unlike sufficient reasons, which can be
generated in polynomial time [
          <xref ref-type="bibr" rid="ref10 ref12">10, 12</xref>
          ], computing minimum-size reasons is not an easy task.
Proposition 2. Let T ∈ DT_n and x ∈ {0, 1}^n. Computing a minimum-size sufficient reason for x
given T is NP-hard.
        </p>
        <p>Despite this intractability result in the general case, computing a set of minimum-size
sufficient reasons is possible in many practical cases. For this purpose, we rely on recent
advances in combinatorial optimization related to SAT.</p>
        <p>First, let us recall that a Partial MaxSAT problem consists of a pair (soft, hard), where
soft and hard are (finite) sets of clauses. The objective is to determine, if it exists, an assignment
of the variables that maximizes the number of satisfied clauses of soft while satisfying all clauses
of hard. We can use a Partial MaxSAT solver to compute minimum-size sufficient reasons:
Proposition 3. Let T be a decision tree in DT_n and x ∈ {0, 1}^n an instance such that T(x) = 1. Let
(soft, hard) be the instance of the Partial MaxSAT problem such that:
soft = {x̄_i : x_i ∈ t_x} ∪ {x_i : x̄_i ∈ t_x}
and
hard = {c ∩ t_x : c ∈ CNF(T)}.
The intersection of t_x with ω*, where ω* is an optimal solution of (soft, hard), is a minimum-size
sufficient reason for x given T.</p>
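        <p>As an illustration, here is a minimal sketch of the encoding of Proposition 3 using the RC2 MaxSAT solver of the PySAT library (our choice of solver; any Partial MaxSAT solver would do). Clauses are lists of signed integers, with integer i standing for x_i and -i for its complement:</p>
        <preformat>
# Minimal sketch of Proposition 3 (assumptions: PySAT installed via
# "pip install python-sat"; cnf_T holds the clauses of CNF(T) when T(x) = 1).
from pysat.examples.rc2 import RC2
from pysat.formula import WCNF

def minimum_size_sufficient_reason(cnf_T, x):
    # t_x: the term corresponding to the instance x (1-indexed variables).
    t_x = {i if x[i - 1] == 1 else -i for i in range(1, len(x) + 1)}
    wcnf = WCNF()
    # hard: each clause of CNF(T) restricted to the literals of t_x must
    # keep at least one literal, so that the kept term still implies T.
    for c in cnf_T:
        wcnf.append([l for l in c if l in t_x])
    # soft: prefer dropping each literal of t_x (unit complementary clauses).
    for l in t_x:
        wcnf.append([-l], weight=1)
    with RC2(wcnf) as rc2:
        model = rc2.compute()
    # t_x ∩ ω*: the literals of t_x kept by the optimal solution.
    return sorted([l for l in t_x if model[abs(l) - 1] == l], key=abs)
        </preformat>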
        <p>
          A Partial MaxSAT solver can also be used to compute a predefined number of minimum-size
sufficient reasons. The process involves generating an initial reason t, adding the negation of t (¬t)
to hard, and including a cardinality constraint to ensure that the subsequently computed reasons
have the same size as t. This process is repeated until the desired number of reasons is reached
or no solution exists. Computing a single explanation is often insufficient to fully understand
the behavior of a classifier. On the other hand, providing millions of explanations would not be
practical for the user. Reasons can vary greatly from one another, and the quality of a reason also
depends on the person to whom it is explained. The authors of [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] propose leveraging
user preferences to select the most relevant reasons and thus reduce their number. This restricted
set of explanations has two advantages: it aligns as closely as possible with the user's preferences
and can drastically reduce the overall number of explanations. However, it is important to note
that even two experts in the same field may have different preferences. In our work, we focus
on the impact of different weight functions on the set of preferred sufficient reasons given a
decision tree T, in order to better understand the diversity of abductive explanations.
        </p>
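        <p>The enumeration loop described above can be sketched as follows, reusing the (soft, hard) instance of Proposition 3; as a simplification, the cardinality constraint is replaced here by a size check on each newly computed reason:</p>
        <preformat>
# Sketch of the enumeration process (PySAT, same conventions as above).
from pysat.examples.rc2 import RC2
from pysat.formula import WCNF

def enumerate_min_size_reasons(cnf_T, x, limit=10):
    t_x = {i if x[i - 1] == 1 else -i for i in range(1, len(x) + 1)}
    wcnf = WCNF()
    for c in cnf_T:
        wcnf.append([l for l in c if l in t_x])  # hard clauses
    for l in t_x:
        wcnf.append([-l], weight=1)              # soft clauses
    reasons = []
    while limit > len(reasons):
        with RC2(wcnf) as rc2:
            model = rc2.compute()
        if model is None:
            break                                # no solution exists
        reason = [l for l in t_x if model[abs(l) - 1] == l]
        if reasons and len(reason) > len(reasons[0]):
            break        # larger than the optimum: stop the enumeration
        reasons.append(reason)
        wcnf.append([-l for l in reason])        # add the negation of t to hard
    return reasons
        </preformat>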
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Preferred abductive explanations</title>
      <p>
        One rational way to address this question is to focus on a subset of explanations, referred to
as the preferred ones [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Defining what makes an explanation "preferred" or "good enough"
is challenging in general, and there is no consensus on this matter, as seen in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Preferred
explanations can be either the complete set of abductive explanations [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] or subsets thereof,
particularly those containing only sufficient reasons. Although the notion of preferred reasons
makes sense for any Boolean classifier, our results are specific to decision trees since they concern
sufficient reasons. The authors of [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] have defined several preference models, and in
the following, we focus on one of them: Maximum-Weight Explanations.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Maximum-Weight Explanations</title>
        <p>A standard way to model a preference relation on a combinatorial domain is to use a utility function (or cost
function). In our context, this involves assigning a utility value (weight) to each feature. This
approach leads to a total preorder on explanations, where the best explanations are those with
the highest weight.</p>
        <p>The idea behind a utility function is to measure the importance of each feature in the
explanation. For example, one can assign a weight to each feature corresponding to its
usefulness or relevance to the considered problem. The larger the utility value of a feature,
the more important it is in the explanation. By associating a utility value with each
feature, one can calculate an overall utility value for each explanation by summing the utility
values of its features. This allows ranking explanations based on their utility value and
determining the best explanations, those with the highest utility value. The advantage of
this approach is that it allows for more complex preferences to be taken into account than
simply ranking features in order of importance. Indeed, each user may have different
preferences, and a personalized utility function allows these preferences to be modeled more finely.</p>
        <p>
          In the general case, computing a maximum-weight sufficient reason is NP-hard in the broad
sense. This follows from the fact that a minimum-size sufficient reason for an instance given a
decision tree is a minimum-weight preferred reason for that instance and decision tree under the
weight mapping w_1 such that w_1(i) = 1 for each i ∈ [n]. Computing a maximum-weight
sufficient reason for a given instance of a decision tree is NP-hard [
          <xref ref-type="bibr" rid="ref11 ref16">11, 16</xref>
          ]. Nevertheless, the
approach presented in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] for computing minimum-size sufficient reasons can be generalized to the case
of maximum-weight sufficient reasons. This amounts to solving an instance of the Weighted
Partial MaxSAT problem.
        </p>
        <p>
          Definition 3. Let T ∈ DT_n and let w : X_n → N* be a weight function associating a weight with
each feature. A maximum-weight sufficient reason for x given T and w is a sufficient reason t for x
given T that maximizes Σ_{x_i ∈ Var(t)} w(i).
Proposition 4. Let T ∈ DT_n and an instance x ∈ {0, 1}^n such that T(x) = 1. Let w : X_n → N* be
a weight function. A maximum-weight sufficient reason for x given T and w is given by t_x ∩ ω*,
where ω* is an optimal solution of the instance (soft, hard) of the Weighted Partial MaxSAT
problem such that:
soft = {(x_i, w(i)) : x_i ∈ t_x} ∪ {(x̄_i, w(i)) : x̄_i ∈ t_x}
hard = {(c, ∞) : c ∈ CNF(T)}
where hard is obtained by applying the prime-implicant encoding proposed in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to the CNF encoding of the decision tree.
        </p>
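        <p>As a concrete illustration of Definition 3 (not the MaxSAT encoding of Proposition 4), the following brute-force sketch enumerates the prime implicants covering x and keeps one of maximum weight; it is only usable for small n, and the CNF used for ¬T is an assumed one, consistent with the sufficient reasons of Example 2 rather than with the exact encoding of Figure 1:</p>
        <preformat>
# Brute-force maximum-weight sufficient reason (illustration for small n).
from itertools import combinations

def implies_cnf(term, cnf):
    # A consistent term implies a CNF iff it shares a literal with each clause.
    return all(any(l in term for l in c) for c in cnf)

def max_weight_sufficient_reason(cnf, x, w):
    t_x = [i if x[i - 1] == 1 else -i for i in range(1, len(x) + 1)]
    best, best_w = None, -1
    for k in range(1, len(t_x) + 1):
        for term in combinations(t_x, k):
            s = set(term)
            # keep only prime implicants: no literal can be removed
            if implies_cnf(s, cnf) and all(
                    not implies_cnf(s - {l}, cnf) for l in s):
                weight = sum(w[abs(l) - 1] for l in term)
                if weight > best_w:
                    best, best_w = term, weight
    return best, best_w

cnf_not_T = [[4], [1, 2], [1, 3], [2, 3]]  # assumed CNF for the negation of T
print(max_weight_sufficient_reason(cnf_not_T, (1, 1, 1, 1), (5, 1, 8, 4)))
# ((1, 3, 4), 17): the reason x_1 ∧ x_3 ∧ x_4, anticipating Example 3 below
        </preformat>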
        <p>
          In the following, we will use "maximum-weight sufficient reason" for the explanation with
the highest weight and "preferred sufficient reason" for the preferred explanation.
Remark. We would like to clarify that the encoding proposed in this article (Proposition 3) is
different from the one proposed by the authors of [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], even though both are based on MaxSAT.
The aim of the encoding in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] is to minimize the sum of weights to obtain preferred reasons,
while our approach aims to maximize it. Another major difference is the exploitation of the
encoding of [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to compute the preferred sufficient reasons of the decision tree. This encoding allows for
an easier enumeration of the preferred sufficient reasons of a decision tree.
        </p>
        <p>Example 3. Let us consider the example of a banker (banker 1) using a decision tree to decide whether to
approve or reject a loan for a client. Suppose the decision tree is the one of Example 1, and
the banker wants to understand why a particular instance, x = (1, 1, 1, 1), was classified as a
rejection (T(x) = 0). In this case, there are multiple sufficient reasons to explain this classification.
These reasons are all combinations of attributes that, if true, result in a negative classification. For
x = (1, 1, 1, 1), the sufficient reasons are: x_1 ∧ x_2 ∧ x_4, x_1 ∧ x_3 ∧ x_4, and x_2 ∧ x_3 ∧ x_4. However,
the banker prefers an explanation without the attribute x_2 because it is a non-actionable attribute,
meaning the client cannot change it. In this case, we can use a weight function over the attributes
to find the best explanation. In this example, we use the weight function w_1 = (5, 1, 8, 4), which
assigns higher weights to attributes considered more important for the decision. Using this weight
function, the solver returns that the maximum-weight explanation is x_1 ∧ x_3 ∧ x_4, which
does not include the non-actionable attribute x_2.</p>
      </sec>
    </sec>
    <sec id="sec-4b">
      <title>5. Weight Functions and Distance Between Two Finite Subsets of Explanations</title>
      <p>The main idea of this section is to address the variations in user preference aggregation modalities
regarding preferred abductive reasons. It is acknowledged that even two experts in the same
domain can have different preferences. However, in the absence of a real-world application with
actual user preferences, the study focuses on exploring different weight measures, both local
and global. The weight functions used in this study are based on different approaches such as
Shapley values, Banzhaf values, LIME, Anchors, Explanatory, as well as Wordfreq and feature
importance. These weight functions allow quantifying the relative importance of different features
or attributes in explaining the results of the classification model. By using these weight measures,
it is possible to take into account user preferences when aggregating abductive explanations,
assigning different weights to features based on their perceived importance.</p>
      <sec id="sec-4-2">
        <title>5.1. Weight Functions</title>
        <p>
          Global Weight Measures: Global weight measures focus on the contribution of features by
considering all predictions over all instances. We present some of the global weight measures
used in the literature to aggregate user preferences regarding preferred sufficient reasons.
• Wordfreq: Zipf's law states that the frequency f of a word in a corpus is inversely
proportional to its rank r, i.e., f ∝ 1/r. This law is often used to model the distribution
of word frequencies in a linguistic corpus. The Zipf frequency z of a word is given by:
z = log10(N/r), where N is the total number of words in the corpus and r is the rank of
the word, i.e., its position in the ranking of most frequent words (more information at
https://pypi.org/project/wordfreq/; see the snippet after this list).
• Feature importance: The "Mean Decrease Impurity" (MDI) method is used to evaluate
the importance of attributes in a classification task by measuring the average decrease in
impurity (e.g., entropy or Gini index) in the decision tree when the attribute is used to
divide the data into subgroups. The importance of an attribute is then evaluated by taking
the average and standard deviation of this decrease in impurity over all splits of the
tree that use that attribute [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
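        <p>For illustration, here is a hypothetical use of the wordfreq package to turn feature names into global weights (the attribute words are ours):</p>
        <preformat>
# Hypothetical global weights from Zipf frequencies ("pip install wordfreq").
from wordfreq import zipf_frequency

features = ["contract", "age", "income", "loan"]
weights = {f: zipf_frequency(f, "en") for f in features}
print(weights)  # Zipf frequencies, typically between 0 and 8
        </preformat>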
        <p>
          Local Weight Measures: Local measures focus on the contribution of features to a specific
prediction, i.e., to an individual predicted instance. We now present some local weight measures:
• Local Surrogate Models (LIME): LIME allows for the explanation of individual predictions
made by non-interpretable machine learning models. This technique was proposed and
implemented by Ribeiro et al. in 2016 [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. LIME focuses on constructing local surrogate
models to explain individual predictions. The idea is to train an interpretable surrogate
model on a new dataset composed of locally perturbed samples.
• SHAP (SHapley Additive exPlanations): The Shapley value is based on cooperative
game theory. The goal of SHAP is to explain the prediction of an observation by calculating
the contribution of each variable to that prediction. We used the method proposed by [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
• Anchors: Anchors [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] is an interpretability technique that aims to find sets of rules that
best summarize the behavior of the model under study. The objective is to identify the
largest possible local regions where predictions are as consistent as possible.
• Explanatory: It involves calculating the number of models for each variable x_i given the
instance x and a decision tree T, using the D4 compiler [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
        <p>Example 4. Two other bankers have different preferences for explanations compared to the banker
of Example 3. The second banker believes that if the client has not repaid a previous loan, they will
never be able to repay a new loan, so they prefer an explanation containing attribute x_4. These preferences
are expressed with w_2 = (1, 1, 1, 10). On the other hand, a third banker thinks that if the client has
an annual income below 35K and is over 50 years old, it is preferable not to grant them a loan due
to their low salary relative to their age, so they prefer an explanation containing x_2 ∧ x_4.
• For w_2 = (1, 1, 1, 10), the reasons x_1 ∧ x_2 ∧ x_4, x_1 ∧ x_3 ∧ x_4, and x_2 ∧ x_3 ∧ x_4 are preferred
sufficient reasons based on the preferences of the second banker.
• The two reasons x_1 ∧ x_2 ∧ x_4 and x_2 ∧ x_3 ∧ x_4 are the two preferred sufficient reasons based
on the preferences of the third banker.</p>
        <p>Example 4 demonstrates that subsets of preferred reasons can be very different from each other.
For instance, the two subsets of preferred reasons based on the preferences of bankers 1 and 3
do not share any common reason.</p>
        <p>Monotone Transformation. The operation of MaxSAT solvers requires positive integer
weights, while the values of SHAP, LIME, etc., are not necessarily positive or
integer initially. In order to satisfy this constraint while maintaining the same
preference order as the SHAP, LIME, etc., values, we perform a monotonically increasing
transformation on the values of the different weight functions. The Explanatory method does not
require a monotone transformation, as the number of models for each literal is already a positive
integer. Given a weight vector w ∈ R^n, we first multiply w by 10^d, where d is the maximum
number of decimal places, and then apply w_i ← w_i − min_{j∈[n]}(w_j) + 1. This transformation
converts all the weights into positive integers while preserving the preference order.</p>
        <p>Example 5 (monotone transformation). Let T ∈ DT_4 be a decision tree and x ∈ {0, 1}^4 an
instance, and let SHAP(x, T) = (0.5, −0.2, 0.3, −0.1) be the Shapley values for the instance x given
T. Then, the monotone increasing transformation gives w(x) = (8, 1, 6, 2).</p>
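        <p>A small sketch of this transformation (assuming weights given with finitely many decimal places), reproducing the values of Example 5:</p>
        <preformat>
# Monotone transformation: scale to integers, then shift into positive values.
def monotone_transform(weights):
    # d: maximum number of decimal places over the weights.
    d = max(len(str(w).split(".")[1]) if "." in str(w) else 0
            for w in weights)
    scaled = [round(w * 10 ** d) for w in weights]  # multiply by 10^d
    m = min(scaled)
    return [s - m + 1 for s in scaled]              # shift into N*

print(monotone_transform([0.5, -0.2, 0.3, -0.1]))   # [8, 1, 6, 2]
        </preformat>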
      </sec>
      <sec id="sec-4-3">
        <title>5.2. Distance Between Two Finite Sets of Explanations</title>
        <p>When it comes to evaluating the impact of user preferences on preferred abductive explanations,
several evaluation criteria can be considered. One of these criteria is a distance measure based on
the symmetric difference between two explanations. This distance measure allows quantifying
the proximity between two explanations. The symmetric difference between two explanations
involves considering the literals that are present in one explanation but not in the other, that is,
the literals that are specific to each explanation. By comparing the cardinality of this symmetric
difference, we can assess the degree of similarity or difference between these two explanations.
Additionally, we consider the distance between two finite subsets of explanations to be the
minimum distance between the explanations within these two subsets.</p>
        <p>The idea behind this distance measure is to provide an estimation of the proximity between
sets of explanations, allowing us to understand how these sets come closer to or move away from
each other. This can be useful for evaluating the similarities or divergences in user preferences
regarding abductive explanations.</p>
        <p>Definition 4. The distance between two finite subsets of explanations E_1 and E_2 is defined as
d(E_1, E_2) = min_{t_1 ∈ E_1, t_2 ∈ E_2} |δ(t_1, t_2)|, where |·| represents the counting measure, and δ is the
symmetric difference between two explanations t_1 and t_2, given by the
formula δ(t_1, t_2) = {ℓ : ℓ ∈ Lit(t_1) ∪ Lit(t_2) ∧ ℓ ∉ Lit(t_1) ∩ Lit(t_2)} = Lit(t_1) △ Lit(t_2).</p>
        <p>Note that the larger the value of d(E_1, E_2), the farther apart the two sets E_1 and E_2 are from
each other. If E_1 ∩ E_2 ≠ ∅, then d(E_1, E_2) = 0. From a topological perspective, d expresses the
geometric distance between two finite subsets of explanations, taking into account the topological
nature of explanations, which are terms composed of literals.</p>
        <p>Lemma 1. The complexity of computing the distance between two subsets of explanations, E_1 and
E_2, is quadratic.</p>
        <p>The computational complexity of computing the distance between two sets of explanations, E_1
and E_2, depends on the sizes of these sets. Let us assume that n_1 is the size of E_1 and n_2
is the size of E_2. Each element of E_1 must be compared with each element of E_2
to compute the distance between them. This implies a comparison between the n_1 elements of E_1
and the n_2 elements of E_2, resulting in a complexity of the order of O(n_1 · n_2), which is quadratic
when n_1 and n_2 are sufficiently large.</p>
        <p>Example 6. Based on Example 4, let us denote E_1, E_2, and E_3 the subsets of preferred
explanations based on the preferences of bankers 1, 2, and 3, respectively. We have d(E_1, E_2) = 0 and
d(E_2, E_3) = 0 because E_1 ∩ E_2 ≠ ∅ and E_3 ∩ E_2 ≠ ∅, while d(E_1, E_3) = 2.</p>
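        <p>Definition 4 translates directly into code; the following sketch represents each explanation by the set of its literals and reproduces the value d(E_1, E_3) = 2 of Example 6:</p>
        <preformat>
# Distance between two finite sets of explanations (Definition 4).
from itertools import product

def distance(E1, E2):
    # min over all pairs of |Lit(t1) △ Lit(t2)|; O(|E1| · |E2|) comparisons,
    # as stated in Lemma 1.
    return min(len(t1 ^ t2) for t1, t2 in product(E1, E2))

# Example 6: E_1 (banker 1) and E_3 (banker 3) share no reason.
E1 = [frozenset({"x1", "x3", "x4"})]
E3 = [frozenset({"x1", "x2", "x4"}), frozenset({"x2", "x3", "x4"})]
print(distance(E1, E3))  # 2
        </preformat>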
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Experiments</title>
      <p>
        Experimental setup. We considered 18 well-known binary classification datasets available
on Kaggle, OpenML, and UCI. No data preprocessing was performed for numerical attributes;
the attributes were binarized on the fly by the decision tree learning algorithm used. For each
benchmark, we evaluated the classification performance using standard evaluation metrics. We
used the CART algorithm and its implementation in Scikit-Learn to learn decision trees, with
default parameter settings. For each benchmark, we selected a subset of up to 250 randomly chosen
instances from the test set; when a dataset contains fewer than 250 instances, the entire dataset
was used. We computed the number of sufficient reasons using the encoding
proposed by [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and the number of minimum-size sufficient reasons using the Partial MaxSAT
solver (with a 60-second timeout per instance). Finally, we computed the number of preferred
sufficient reasons using the encoding detailed in Section 4 and the Weighted Partial
MaxSAT solver from OpenWBO [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
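      <p>A minimal sketch of the learning step, on a stand-in dataset rather than one of the 18 benchmarks: a CART tree with default Scikit-Learn parameters, whose MDI feature importances also serve as one of the global weight functions of Section 5.1:</p>
      <preformat>
# Sketch of the experimental learning step (stand-in synthetic dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(clf.score(X_test, y_test))  # standard accuracy metric
print(clf.feature_importances_)   # MDI weights (cf. Section 5.1)
      </preformat>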
      <p>
        Regarding the weight functions, for each tree T, we used the exact method proposed by [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
to compute the SHAP score as well as the scores for LIME [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Anchors [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We also used
feature importance with Scikit-Learn [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the number of models "Explanatory" with [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], and
the Zipf frequency of each feature viewed as a word, via the wordfreq library. Two weight
functions (local and global random) based on random weight sampling were added to clarify the
nature of preferred explanations for different weight functions. We report the classical statistics:
the average number and variance of sufficient reasons, minimum-size sufficient reasons, and
preferred sufficient reasons for each weight function method. Finally, for the "placement" and
"compas" datasets, we report the distance between the different preferred subsets for the different
weight functions.
      </p>
      <sec id="sec-5-1">
        <title>6.1. Experimental results</title>
        <p>
          Tables 2 and 3 present an excerpt of the results. The tables report results on datasets, decision
trees, and weight measures, over the 18 datasets. For each benchmark, the table provides
the dataset name (name), the accuracy of the decision trees (in %), the number of binary
variables, and the number of instances. The next two columns respectively indicate the mean
and standard deviation (std) of the number of sufficient reasons and the number of preferred
sufficient reasons. Then, for each benchmark, the columns #wordf, #f_imp, R_[1,10], R_[1,100],
and R_[1,1000] correspondingly give the number of preferred sufficient reasons for wordfreq,
feature importance, and global random sampling over the intervals [1,10], [1,100], and [1,1000].
The columns of Table 3 give the mean and standard deviation (std) of the number of preferred
sufficient reasons for the local weight measures, in the following order: LIME, Shapley, Anchors,
Explanatory, and local random sampling over the intervals [1,10], [1,100], and [1,1000]. We
clarify that "local random sampling" consists of selecting integer weights for each instance, while
respecting a specified interval. Let us consider an illustrative example: suppose we have a dataset
with instances of size n = 5, meaning that there are five elements in each instance. The specified
interval is [1, 10], indicating that the chosen weights must be integer values ranging from 1 to 10.
For each individual instance, we perform a random draw to determine the corresponding weights.
In our example, the weight vector w = (9, 4, 7, 5) is generated from this random draw. Each
weight in the vector is an integer chosen randomly within the interval [1, 10].
        </p>
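        <p>A sketch of this local random sampling (hypothetical helper):</p>
        <preformat>
# One integer weight per feature, drawn uniformly in the given interval.
import random

def random_local_weights(n, lo=1, hi=10, seed=None):
    rng = random.Random(seed)
    return [rng.randint(lo, hi) for _ in range(n)]

print(random_local_weights(5))  # e.g. [9, 4, 7, 5, 2]
        </preformat>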
        <p>First. We would like to emphasize that computing preferred reasons given a decision tree and
instance is feasible in practice. In fact, for many datasets and instances, the computation of all
preferred reasons has been completed in less than 20 seconds, regardless of the type of weight
function used. It is evident that the use of different weight function types has a significant impact
on the number of reasons, making it easier to compute all preferred reasons by reducing their
quantity compared to sufficient reasons and minimum-size reasons.
        </p>
        <p>Furthermore, it is important to note that for each dataset, each instance in the benchmark
of that dataset, and each type of weight function, enumerating the preferred sufficient reasons was
feasible. Leveraging user preferences offers a significant advantage by substantially reducing the
number of generated explanations. By focusing solely on the explanations preferred by the user,
information overload is avoided, and attention is directed towards the most relevant and useful
explanations.</p>
        <p>Second. Tables 4 and 5 present matrices that visualize the average distances between different
subsets of explanations. These subsets of explanations are obtained using various methods of
local and global weight assignment. The values in the matrices correspond to the distances
between pairs of subsets, where the coordinates (i, j) represent the weight assignment methods
used. When examining the diagonal entries of the matrix, we observe that the distances are
zero. This is because a subset is identical to itself, so the distance between a subset and itself is
always 0. Additionally, it is important to note that the matrices are symmetric. This is because
the distance used is symmetric, as is the case for any distance.</p>
        <p>By observing the distances between the diferent subsets of explanations, we notice that they
are generally less than 1. This indicates that the explanations are relatively close to each other in
terms of distance. Topologically, this suggests that the set of suficient reasons forms a compact
structure, where the explanations are closely grouped and interconnected. This observation
represents an initial step in studying the diversity of formal explanations. It indicates that the
different methods of local and global weight assignment used to generate the explanations do
not result in explanations that are very distant from each other. This raises questions about the
variety and extent of possible explanations, as well as how local weight assignment methods can
influence the diversity of the obtained explanations.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7. Conclusion</title>
      <p>To summarize the contributions highlighted in this article, we first proposed a CNF-encoding
approach to compute preferred suficient reasons for decision trees. This approach involves
representing the reasons in a logical form that facilitates their calculation. Additionally, we
introduced the concept of distance between preferred explanations and examined the impact of
weight functions on preferred abductive explanations. Namely, we investigated how different
methods of assigning weights affect the proximity of preferred explanations to each other.
Our focus was on the quantity and diversity of these explanations. We found that a classified
instance, whether positive or negative, can have an exponential number of reasons, including
an exponential number of minimum-size reasons or preferred reasons. This means that there
can be numerous possible explanations for a single classified instance. However, despite this
potential diversity, the number of preferred reasons is significantly smaller than the number
of sufficient reasons, regardless of the weight function used. Generally, there is a restricted
selection of preferred explanations that are considered the most relevant or useful. Furthermore,
we observed that the distances between different sets of explanations are generally not large.
This indicates that abductive explanations for decision trees tend to be close to each other in
terms of similarity or proximity. In other words, the explanations often share similar features or
partially overlap. These findings suggest that despite the potential diversity of explanations, there
are commonalities and trends among preferred explanations for decision trees. This can be useful
in understanding how decisions are made by these models and in providing comprehensible
explanations to users.</p>
      <p>Studying the impact of weight functions on preferred abductive explanations for decision trees
is just the first step in our research on the diversity of abductive explanations. We intend to apply
a similar approach to other models, particularly random forests. Concurrently, we are developing
a SAT encoding to compute the SAT distance between preferred sets of sufficient reasons. The
aim of this endeavor is to provide users with a framework for selecting preferred explanations
that align with their personal preferences and are closer to the model’s output. In other words,
through this SAT encoding, users will be able to measure the proximity between diferent sets of
explanations and identify those that are most relevant and consistent with their expectations. This
will enhance their understanding of the model’s results and enable the provision of explanations
that are better suited to the users' needs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"why should I trust you?": Explaining the predictions of any classifier</article-title>
          ,
          <source>in: Proc. of SIGKDD'16</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>Anchors: High-precision model-agnostic explanations</article-title>
          ,
          <source>in: Proc. of AAAI'18</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1527</fpage>
          -
          <lpage>1535</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>in: Proc. of NIPS'17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>4765</fpage>
          -
          <lpage>4774</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          , Interpretable Machine Learning, Leanpub,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          , Random forests,
          <source>Machine Learning</source>
          <volume>45</volume>
          (
          <year>2001</year>
          )
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Izza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ignatiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marques-Silva</surname>
          </string-name>
          ,
          <article-title>On explaining decision trees</article-title>
          ,
          <source>CoRR abs/2010.11034</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <article-title>A symbolic approach to explaining bayesian network classifiers</article-title>
          ,
          <source>in: Proc. of IJCAI'18</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>5103</fpage>
          -
          <lpage>5111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hirth</surname>
          </string-name>
          ,
          <article-title>On the reasons behind decisions</article-title>
          ,
          <source>in: Proc. of ECAI'20</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jabbour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marques-Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Salhi</surname>
          </string-name>
          ,
          <article-title>Enumerating prime implicants of propositional formulae in conjunctive normal form</article-title>
          ,
          <source>in: Logics in Artificial Intelligence</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On the explanatory power of boolean decision trees</article-title>
          ,
          <source>Data &amp; Knowledge Engineering</source>
          <volume>142</volume>
          (
          <year>2022</year>
          )
          <fpage>102088</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0169023X22000799. doi:10.1016/j.datak.2022.102088.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>Trading complexity for sparsity in random forest explanations</article-title>
          ,
          <source>in: Proc. of AAAI'22</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>Sur le pouvoir explicatif des arbres de décision</article-title>
          ,
          <source>EGC'2022</source>
          <volume>38</volume>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On preferred abductive explanations for decision trees and random forests</article-title>
          ,
          <source>in: Proc. of IJCAI'22</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Towards a rigorous science of interpretable machine learning</article-title>
          ,
          <year>2017</year>
          . arXiv:1702.08608.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>Les raisons majoritaires: des explications abductives pour les forêts aléatoires</article-title>
          ,
          <source>EGC'2022</source>
          <volume>38</volume>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On the computational intelligibility of boolean classifiers</article-title>
          ,
          <source>in: Proc. of KR'21</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          ,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>An Improved Decision-DNNF Compiler</article-title>
          ,
          <source>in: Proc. of IJCAI'17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>667</fpage>
          -
          <lpage>673</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Manquinho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lynce</surname>
          </string-name>
          ,
          <article-title>Open-WBO: A modular MaxSAT solver</article-title>
          ,
          <source>in: International Conference on Theory and Applications of Satisfiability Testing</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Erion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>DeGrave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Prutkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Katz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Himmelfarb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Explainable ai for trees: From local explanations to global understanding</article-title>
          , arXiv preprint arXiv:1905.04610 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>