<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Axiomatic Approach to Linear Explanations in Data Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jakub Sliwinski</string-name>
          <email>jakvbs@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Strobel</string-name>
          <email>mstrobel@comp.nus.edu.sg</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yair Zick</string-name>
          <email>dcsyaz@nus.edu.sg</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ETH Zurich</institution>
          ,
          <addr-line>Zurich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Univ. of Singapore</institution>
          ,
          <country country="SG">Singapore</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this work, we focus on local explanations for data analytics; in other words: given a datapoint ~x, how important was the i-th feature in determining the outcome for ~x? The literature has seen a recent emergence of various analytical answers to this question. We argue for a linear influence measure explanation: given a datapoint ~x, assign a value fi(~x) to every feature i, which roughly corresponds to feature i's importance in determining the outcome for ~x. We present a family of measures called MIM (monotone influence measures) that are uniquely derived from a set of axioms: desirable properties that any reasonable influence measure should satisfy. Departing from prior work on influence measures, we assume no knowledge of, or access to, the underlying classifier labeling the dataset. In other words, our influence measures are based on the dataset alone and do not make any queries to the classifier. We compare MIM to other linear explanation models in the literature and discuss their underlying assumptions, merits, and limitations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>INTRODUCTION
An individual is denied a bank loan; knowing that they are in
good financial standing, they demand that the bank explain
its decision. However, the bank uses an ML algorithm that
automatically rejected the loan application. How should the
bank explain its decision? This example is more than
anecdotal; recent years have seen the widespread implementation
of data-driven algorithms making decisions in increasingly
high-stakes domains, such as healthcare, transportation, and
public safety. Using novel ML techniques, algorithms are able
to process massive amounts of data and make highly
accurate predictions; however, their inherent complexity makes it
increasingly difficult for humans to understand why certain
decisions were made. By obfuscating the underlying
decision-making processes, such algorithms potentially expose human
stakeholders to risks. These risks could include incorrect
decisions (e.g. Alice’s application was wrongly rejected due to a
system bug), information leaks (e.g. the algorithm was
inadvertently given information about Alice that it should not have
seen), or discrimination (e.g. the algorithm is biased against
female applicants). Indeed, government bodies and regulatory
authorities have recently begun calling for algorithmic
transparency: providing human-interpretable explanations of the
underlying reasoning behind large-scale decision-making
algorithms. Our work represents a first formal axiomatic analysis
of automatically generated explanations of black-box
classifiers.</p>
      <p>Our Proposal
We propose utilizing simple mathematical frameworks for
an explanation via influence measures: these are functions
that, given a dataset, assign a value to every feature; this
value should roughly correspond to the feature’s importance in
affecting the classification outcome for individual data points.
Slightly more formally, we are given a dataset X containing
n-dimensional vectors, whose data points are labeled by a binary
classifier c, such that c(~y) ∈ {−1, 1} for all ~y ∈ X; now, given a
point of interest ~x ∈ X, we wish to identify the features in ~x
that are ‘responsible’ for it being labeled the way it was. This
is done via a mapping f whose input is the dataset X, its
labels (given by c), and the point of interest ~x; its output is a
vector f(~x) ∈ R^n, where fi(~x) corresponds to the influence of
feature i on the label of ~x. Intuitively, a large positive value
of fi(~x) should mean that feature i was highly important in
determining the label of ~x; a large negative value for fi(~x)
should mean that despite the value of i at ~x, ~x was assigned this
label. This approach carries several important benefits. First of
all, it is completely generic, requiring no assumptions on the
underlying classification model; secondly, linear explanation
models are simple and straightforward, even for a layperson
to understand (e.g. ‘Alice was denied her loan because of
the high importance the algorithm placed on her low monthly
income, and despite her never having to file for bankruptcy’).
The appeal of linear explanations has been recognized by the
research community; recent years have seen a moderate boom
of papers proposing linear explanations in data-driven domains
(see Section 1.2). However, this poses a new problem for end
users that wish to apply these methodologies: which linear
explanation is the ‘right’ one to choose? In other words,
which linear explanations are guaranteed to satisfy
certain desirable properties?
We argue for an axiomatization of influence measures in
classification domains. The axiomatic approach is common in the
economics literature: first one reasons about simple,
reasonable properties (axioms) which should be satisfied by any
function (say, methods for dividing revenue amongst collaborators,
or agreeing on an election winner given voters’ preferences);
next, one should prove that there exists a unique function
satisfying these simple mathematical properties. The axiomatic
approach allows one to rigorously reason about the types of
influence measures one should use in a given setting: if the
axioms set forth make sense in this setting, there is but one
method of assigning influence in the given domain. It is, in
some sense, an explanation of an explanation method, a
provable guarantee that the method is sound; in fact, uniqueness
implies that it is the only sound method one can reasonably
use in a domain.</p>
      <p>
        In a recent line of work, we identify specific properties that
any reasonable influence measure should satisfy (Section 3);
using these axioms, we mathematically derive a class of
influence measures, dubbed monotone influence measures (MIM),
which uniquely satisfy these axioms (Section 4). Unlike most
existing influence measures in the literature, we assume
neither knowledge of the underlying decision-making algorithm,
nor of its behavior on points outside the dataset. Indeed, some
methodologies (see Related Work in Section 1.2) are heavily
reliant on having access to counterfactual information: what
would the classifier have done if some features were changed?
This is a rather strong assumption, as it assumes not only
access to the classifier but also the potential ability to use it on
nonsensical data points (for example, if the dataset consists of medical
records of men and women, the classifier might need to answer how it
would handle pregnant men). By making no such assumptions,
we are able to provide a far more general methodology for
measuring influence; indeed, many of the tools described in
Section 1.2 will simply not be usable when queries to the
classifier are not available, or when the underlying classification
algorithm is not known. Finally, grounding the measure in the
dataset ensures the distribution of data is accounted for, rather
than explaining the classification in terms of arbitrarily chosen
data points. The points can be very unlikely or impossible to
occur in practice, and using them can demonstrate a behavior
the algorithm will never exhibit in its actual domain. Despite
their rather limiting conceptual framework, our influence
measures do surprisingly well on a sparse image dataset. We show
that the outputs of our influence measure are comparable to
those of other measures, and provide interpretable results.
Related Work
Axiomatic approaches for influence measurement are
common in economic domains. Of particular note are axiomatic
approaches in cooperative game theory [
        <xref ref-type="bibr" rid="ref12 ref3 ref9">9, 12, 3</xref>
        ].
      </p>
      <p>
        The first axiomatic characterization of an influence measure
for datasets is provided in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]; however, they interpret
influence as a global measure (e.g., what is the overall importance
of gender for decision making). Moreover, one of the axioms
proposed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] turned out to be too strong, severely limiting
the explanation power of the resulting measure. Indeed, as
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] show, the measure proposed by [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] outputs undesirable
values (e.g. zero influence) in many real instances. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] propose
an empirical influence measure that relies on a potential-like
approach. However, as we show, their methodology fails to
satisfy reasonable properties even on simple datasets. Other
approaches in the literature either rely on black-box access to
the classifier [
        <xref ref-type="bibr" rid="ref6 ref8">6, 8</xref>
        ], or assume domain knowledge (e.g. that
the classifier is a neural network whose layers are
observable) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Another notable axiomatic treatment of influence
in data-driven domains appears in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]; in this work, it is shown
that a Shapley value based approach is the only way influence
can be measured when one assumes counterfactual access to
the black-box classifier. This result is confirmed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
THE FORMAL MODEL
A dataset X = ⟨~x1, …, ~xm⟩ is given as a list of vectors in R^n
(each dimension i ∈ [n] is a feature), where every ~xj ∈ X
has a label cj ∈ {−1, 1}; given a vector ~x ∈ X, we
often refer to the label of ~x as c(~x). For example, X can
be a dataset of bank loan applications, with ~x describing the
applicant profile (age, gender, credit score, etc.), and c(~x) being
a binary decision (accepted/rejected). An influence measure
is simply a function f whose input is a dataset X, the labels
of the vectors in X denoted by c, and a specific point ~x ∈ X;
its output is a value fi(~x; X; c) ∈ R; we often omit the inputs
X and c when they are clear from context. The value fi(~x)
should roughly correspond to the importance of the i-th feature
in determining the outcome c(~x) for ~x.
      </p>
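      <p>To make the formal model concrete, the following is a minimal sketch (our own illustration, assuming a numpy-based encoding; none of the names below come from the paper) of how a dataset, its labels, and the signature of an influence measure could be represented.</p>
      <preformat>
import numpy as np

# A dataset X is a list of m vectors in R^n; each row is one data point,
# each column i is a feature, and every point carries a label in {-1, +1}.
X = np.array([
    [35.0, 650.0],   # e.g. (age, credit score) of one applicant
    [52.0, 720.0],
    [29.0, 540.0],
    [61.0, 480.0],
])
c = np.array([1, 1, -1, -1])   # binary decisions for the rows of X

def influence(x, X, c):
    """Signature of an influence measure f: given the point of interest x,
    the dataset X and its labels c, return a vector in R^n whose i-th entry
    fi(x; X; c) is the influence of feature i on the label of x.
    (Placeholder body; a concrete instance appears in the MIM sketch below.)"""
    raise NotImplementedError
      </preformat>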
      <p>AXIOMS FOR EMPIRICAL INFLUENCE MEASUREMENT
We are now ready to define our axioms; these are simple
properties that we believe any reasonable influence measure should
satisfy. We take a geometric interpretation of the dataset X ;
thus, several of our axioms are phrased in terms of geometric
operations on X .
1. Shift Invariance: let X + ~b be the dataset resulting from
adding the vector ~b ∈ R^n to every vector in X (not changing
the labels). An influence measure f is said to be shift invariant
if for any vector ~b ∈ R^n, any i ∈ [n] and any ~x ∈ X,
fi(~x; X) = fi(~x + ~b; X + ~b).
In other words, shifting the entire dataset by some vector ~b
should not affect feature importance.
2. Rotation and Reflection Faithfulness: let A be a rotation
(or reflection) matrix, i.e. an n × n orthogonal matrix with det(A) ∈ {−1, 1};
let AX be the dataset resulting from taking every point ~x in
X and replacing it with A~x. An influence measure f is said to
be faithful to rotation and reflection if for any rotation matrix
A, and any point ~x ∈ X, we have Af(~x; X) = f(A~x; AX).
In other words, rotating or reflecting the entire dataset results
in the influence vector rotating in the same manner.
3. Continuity: an influence measure f is said to be continuous
if it is a continuous function of X.
4. Flip Invariance: let −c be the labeling resulting from
replacing every label c(~x) with −c(~x). An influence measure is
flip invariant if for every point ~x ∈ X and every i ∈ [n] we
have fi(~x; X; c) = fi(~x; X; −c).
5. Monotonicity: a point ~y ∈ R^n is said to strengthen the
influence of feature i with respect to ~x ∈ X if c(~x) = c(~y)
and yi &gt; xi; similarly, a point ~y ∈ R^n is said to weaken the
influence of i with respect to ~x ∈ X if yi &gt; xi and c(~x) ≠ c(~y).
An influence measure f is said to be monotonic if for any
dataset X, any feature i and any data point ~x ∈ X we have
fi(~x; X) ≤ fi(~x; X ∪ {~y}) whenever ~y strengthens i w.r.t. ~x,
and fi(~x; X) ≥ fi(~x; X ∪ {~y}) whenever ~y weakens i w.r.t. ~x.
6. Random Labels: an influence measure f is said to satisfy
the random labels axiom if the following holds: for any dataset X,
suppose all labels are assigned i.i.d. uniformly at random (i.e. for all ~x ∈ X,
Pr[c(~x) = 1] = Pr[c(~x) = −1]); we call this label distribution
U. Then, for all ~x ∈ X and all i we have</p>
      <p>E_{c∼U}[fi(~x; X; c) | c(~x) = 1] = E_{c∼U}[fi(~x; X; c) | c(~x) = −1] = 0.
In other words, when we fix the label of ~x and randomize all
other labels, the expected influence of all features is 0.
Let us briefly discuss the latter two axioms. Monotonicity is
key in defining what influence means: intuitively, if one is
to argue that Alice’s old age caused her loan rejection, then
finding older persons whose loans were similarly rejected
should strengthen this argument; however, finding older
persons whose loans were not rejected should weaken the
argument. The Random Labels axiom states that when labels are
randomly generated, no feature should have any influence in
expectation; any influence measure that fails this test is
inherently biased towards assigning influence to some features,
even when labels are completely unrelated to the data.
CHARACTERIZING MONOTONE INFLUENCE MEASURES
Influence measures satisfying the Axioms in Section 3 must
follow a simple formula, described in Theorem 4.1; the full
proof of Theorem 4.1 appears in a full version of this work
(currently under review). Below, 1(p) is a {−1, 1}-valued indicator (i.e. 1 if p is true
and −1 otherwise), and ‖~x‖2 is the Euclidean length of ~x; note
that we can admit other distances over R^n, but stick with ‖·‖2
for concreteness.</p>
      <p>THEOREM 4.1. Axioms 1 to 6 are satisfied iff f is of the
form
f(~x; X) = ∑_{~y ∈ X \ {~x}} (~y − ~x) α(‖~y − ~x‖2) 1(c(~x) = c(~y))    (1)
where α is any non-negative-valued function.</p>
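      <p>As a concrete illustration of Equation (1), here is a short numpy sketch (our own code, not from the paper), using the constant weighting α ≡ 1; any other non-negative function of the distance, e.g. a Gaussian kernel, drops into the same template.</p>
      <preformat>
import numpy as np

def mim_influence(x, cx, X, c, alpha=lambda d: 1.0):
    """Monotone influence measure of Equation (1).

    x     : point of interest, shape (n,)
    cx    : label of x, in {-1, +1}
    X, c  : dataset of shape (m, n) and its labels in {-1, +1}
    alpha : any non-negative function of the Euclidean distance
    """
    f = np.zeros_like(x, dtype=float)
    for y, cy in zip(X, c):
        if np.allclose(y, x):
            continue                       # the sum ranges over X \ {x}
        diff = y - x
        sign = 1.0 if cy == cx else -1.0   # {-1, 1}-valued indicator 1(c(x) = c(y))
        f += diff * alpha(np.linalg.norm(diff)) * sign
    return f
      </preformat>
      <p>With the constant α every point in the dataset is weighted equally; a choice such as alpha=lambda d: np.exp(-d**2) instead downweights far-away points while remaining within the family of Equation (1).</p>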
      <p>We refer to measures satisfying Equation (1) as monotone
influence measures (MIM). MIM uniquely satisfy a set of
reasonable axioms; moreover, they maximize the total cosine
similarity objective function. Intuitively, given a vector ~x ∈ X,
an MIM vector f(~x; X) will point in the direction that has the
‘most’ vectors in X sharing a label with ~x. The value ‖f‖2
can be thought of as one’s confidence in the direction: if ‖f‖2
is high, this means that one is fairly certain where other vectors
sharing a label with ~x are (and, correspondingly, this means
that there are at least some highly influential features identified
by f); a small value of ‖f‖2 implies low explanation strength.</p>
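      <p>Continuing the sketch above (again our own illustration), a toy two-dimensional dataset shows the directional reading of f and its length as explanation strength, and numerically spot-checks Shift Invariance (Axiom 1) and Flip Invariance (Axiom 4).</p>
      <preformat>
import numpy as np

# Toy data: points labeled +1 sit to the right of points labeled -1.
X = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
c = np.array([1, 1, -1, -1])
x, cx = X[0], c[0]

f = mim_influence(x, cx, X, c)            # uses the function sketched above
print("influence vector:", f)             # points towards the same-labeled mass
print("explanation strength:", np.linalg.norm(f))

# Axiom 1 (Shift Invariance): shifting the whole dataset leaves f unchanged.
b = np.array([10.0, -5.0])
assert np.allclose(f, mim_influence(x + b, cx, X + b, c))

# Axiom 4 (Flip Invariance): flipping every label leaves f unchanged.
assert np.allclose(f, mim_influence(x, -cx, X, -c))
      </preformat>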
      <p>EXISTING MEASURES
In this section, we provide an overview of some existing
methodologies for measuring influence in data domains and
compare them to MIM.</p>
      <p>
        Parzen
The main idea behind the approach followed by [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is to
approximate the labeled dataset with a potential function and
then use the derivative of this function to locally assign
influence to features. Parzen satisfies Axioms 1 to 4. However, it is
neither monotonic nor does it satisfy the Random Labels axiom.
LIME
The measure in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is based on the idea of finding a best local
fit for the classifier in a region around ~x. At its core, LIME fits
a classifier by minimizing the mean-squared error, whereas
MIM maximizes cosine similarity.
      </p>
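      <p>To make the contrast concrete, here is a bare-bones rendering of the local-fit idea (a generic distance-weighted least-squares surrogate, not the actual LIME implementation): perturb the point of interest, query a black-box classifier, and read the fitted linear coefficients as feature influences. The function black_box is a hypothetical stand-in for whatever model one has query access to.</p>
      <preformat>
import numpy as np

def local_linear_fit(x, black_box, n_samples=500, scale=0.5, seed=0):
    """Distance-weighted least-squares fit of a linear surrogate around x.
    black_box maps an array of shape (k, n) to labels in {-1, +1}; it is a
    hypothetical stand-in, since this approach needs query access to the model."""
    rng = np.random.default_rng(seed)
    Z = x + scale * rng.standard_normal((n_samples, x.shape[0]))  # local perturbations
    y = black_box(Z).astype(float)
    w = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2)               # proximity weights
    A = np.hstack([Z, np.ones((n_samples, 1))])                   # intercept column
    coef, *_ = np.linalg.lstsq(np.sqrt(w)[:, None] * A, np.sqrt(w) * y, rcond=None)
    return coef[:-1]                                              # per-feature weights

# Example: a black box that only looks at the first feature.
weights = local_linear_fit(np.array([1.0, 0.0]), lambda Z: np.sign(Z[:, 0]))
      </preformat>
      <p>The surrogate is chosen by minimizing a (weighted) mean-squared error, which is exactly where this style of explanation departs from MIM’s cosine-similarity objective.</p>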
      <p>
        The Counterfactual Influence Measure
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] initiated the axiomatic treatment of influence in data
analysis; they propose a counterfactual aggregate influence measure
for black-box data domains. Unlike other measures in this
section, [
        <xref ref-type="bibr" rid="ref4">4</xref>
] do not measure local feature influence; rather, they
measure the overall influence of a feature for a given dataset.
The measure proposed by [
        <xref ref-type="bibr" rid="ref4">4</xref>
] does the following: when
measuring the influence of the i-th feature, for every point ~x ∈ X
it counts the number of points in X that differ from ~x in only
the i-th feature and in their classification outcome. Given its
rather restrictive notion of influence, this methodology only
measures non-zero influence in very specific types of datasets:
it assigns zero influence to all features in datasets that do not
contain data points that differ from one another by only one
feature; moreover, it only measures influence when a change in
the state of a single feature changes the classification outcome.
Quantitative Input Influence
[
        <xref ref-type="bibr" rid="ref6">6</xref>
] propose a general framework for influence measurement in
datasets, generalizing counterfactual influence. Instead of
measuring the effect of changing a single feature on point
~x ∈ X, they examine the expected effect of changing a set
of features. The resulting measure, named QII (Quantitative
Input Influence), is based on the Shapley value [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a method
of measuring the importance of individuals in collaborative
environments. QII allows access to counterfactual information;
moreover, it is computationally intensive in practice, and under
its current implementation, will not scale to domains having
more than a few dozen features.
      </p>
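      <p>For intuition, the following is a generic Monte-Carlo sketch of the Shapley-value idea underlying QII (our own code, not the QII implementation itself): the contribution of feature i is its average marginal effect over random orderings, where an ‘absent’ feature is filled in from a randomly drawn data point, and black_box is again a hypothetical stand-in for a model with query access.</p>
      <preformat>
import numpy as np

def shapley_sampling(x, X, black_box, n_perms=200, seed=0):
    """Monte-Carlo estimate of the Shapley value of each feature for the
    outcome black_box(x). 'Absent' features are resampled from the dataset X.
    A generic sketch of the idea behind QII, not the paper's implementation."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    phi = np.zeros(n)
    for _ in range(n_perms):
        order = rng.permutation(n)
        z = X[rng.integers(len(X))].copy()   # start from a random reference point
        prev = black_box(z[None, :])[0]
        for i in order:                      # reveal features of x one at a time
            z[i] = x[i]
            curr = black_box(z[None, :])[0]
            phi[i] += curr - prev            # marginal contribution of feature i
            prev = curr
    return phi / n_perms
      </preformat>
      <p>Each sampled permutation issues on the order of n queries to the classifier, which is the source of both the counterfactual-access requirement and the scaling concerns noted above.</p>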
      <p>Black-Box Access Vs. Data-Driven Approaches
Some measures above assume black-box access to the
classifier (e.g. QII and LIME); others (e.g. Parzen and MIM)
make no such assumption. Is it valid to assume black-box
access to a classifier? This depends on the implementation
domain one has in mind and the strength of explanations that
one wishes to arrive at. On the one hand, having more access,
measures such as QII and LIME can offer better explanations
in a sparse data domain; however, they are essentially
unusable when one does not have access to the underlying classifier.
Data-driven approaches such as MIM, the counterfactual
measure, and Parzen are more generic and can be applied to any
given dataset; however, they will naturally not be particularly
informative in sparse regions of the dataset.</p>
      <p>
        DISCUSSION AND FUTURE WORK
In this paper, we argue for the axiomatic treatment of linear
influence measurement. We present a measure uniquely
derived from a set of reasonable properties which also optimizes
a natural objective function. Our characterization subsumes
known influence measures proposed in the literature. In
particular, MIM becomes the Banzhaf index in cooperative games
and is also related to formal models of causality. Furthermore,
MIM generalizes the measure proposed by [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for measuring
influence in a data-dependent cooperative game setting.
Taking a broader perspective, axiomatic influence analysis in data
domains is an important research direction: it allows us to
rigorously discuss the underlying desirable norms we’d like
to see in our explanations. Indeed, an alternative set of axioms
is likely to result in other novel measures, that satisfy other
desirable properties. Being able to mathematically justify one’s
choice of influence measures is important from a legal/ethical
perspective as well: when explaining the behavior of
classifiers in high-stakes domains, having provably sound measures
offers mathematical backing to those using them.
      </p>
      <p>
        While MIM offers an interesting perspective on influence
measurement, it is but a first step. There are several interesting
directions for future work; first, our analysis is currently
limited to binary classification domains. It is possible to naturally
extend our results to regression domains, e.g. by replacing
the value 1(c(~x) = c(~y)) with the product c(~x) · c(~y); however, it is not
entirely clear how one might define influence measures for
multiclass domains. It is still possible to retain 1(c(~x) = c(~y))
as the measure of ‘closeness’ between classification outputs
— i.e. all points that share ~x’s output offer positive influence,
and all those that do not offer negative influence — but we
believe that this may result in a somewhat coarse influence
analysis. This is especially true in cases where there is a large
number of possible output labels. One possible solution for
the multiclass case would be to define a distance metric over
output labels; however, the choice of metric would greatly
impact the outputs of MIM (or any other influence measure).
Another major issue with MIM (and several other measures)
is that their explanations are limited to the influence of
individual features; they do not capture joint effect, let alone more
complex synergistic effects of features on outputs (the only
exception to this is LIME, which, at least in theory, allows
fitting non-linear classifiers in the local region of the point of
interest). It would be a major theoretical challenge to
axiomatize and design ‘good’ methods for measuring the effect of
pairwise (or k-wise) interactions amongst features. This would
also allow for a natural tradeoff between the accuracy and
interpretability of a given explanation. A linear explanation
(e.g. LIME, QII, or this work) is easy to understand: each
feature is assigned a number that corresponds to its positive
or negative effect on the output of ~x; a measure that captures
k-wise interactions would be able to explain much more of the
underlying feature interactions, but would naturally be less
human interpretable. Indeed, a measure that captures all levels
of feature interactions would be equivalent to a local
approximation of the original classifier, which may not be feasible to
achieve, nor easy to interpret. A better understanding of this
behavior would be an important step in the design of influence
measures. Finally, it is important to translate our numerical
measure to an actual human-readable report. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] propose using
linear explanations as transparency reports; however, more
advanced methods which assume access to the classifier source
code propose mapping back to specific subroutines for
explanations [
        <xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>
        ]. Indeed, while the transition from data to
numerical explanations is an important step, mapping these to
actual human-interpretable explanations is an open problem.
      </p>
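      <p>As a small illustration of the regression extension suggested above (our reading, under the same assumptions as the earlier MIM sketch): for binary labels the {−1, 1}-valued indicator 1(c(~x) = c(~y)) equals the product c(~x) · c(~y), so substituting the product admits real-valued outputs directly.</p>
      <preformat>
import numpy as np

def mim_regression(x, cx, X, c, alpha=lambda d: 1.0):
    """MIM-style measure for real-valued outputs: the label-agreement
    indicator is replaced by the product c(x) * c(y), which coincides with
    the {-1, +1} indicator whenever the labels are binary."""
    f = np.zeros_like(x, dtype=float)
    for y, cy in zip(X, c):
        if np.allclose(y, x):
            continue
        diff = y - x
        f += diff * alpha(np.linalg.norm(diff)) * (cx * cy)
    return f
      </preformat>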
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>D.</given-names>
            <surname>Baehrens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schroeter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Harmeling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kawanabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hansen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Müller</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>How to explain individual classification decisions</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>11</volume>
          (
          <year>2010</year>
          ),
          <fpage>1803</fpage>
          -
          <lpage>1831</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>E.</given-names>
            <surname>Balkanski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Syed</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Vassilvitskii</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Statistical Cost Sharing</article-title>
          .
          <source>In Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NIPS)</source>
          .
          <fpage>6222</fpage>
          -
          <lpage>6231</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>J.F.</given-names>
            <surname>Banzhaf</surname>
          </string-name>
          .
          <year>1965</year>
          .
          <article-title>Weighted Voting Doesn't Work: a Mathematical Analysis</article-title>
          .
          <source>Rutgers Law Review</source>
          <volume>19</volume>
          (
          <year>1965</year>
          ),
          <fpage>317</fpage>
          -
          <lpage>343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Procaccia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zick</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Influence in Classification via Cooperative Game Theory</article-title>
          .
          <source>In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI).</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fredrikson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mardziel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Proxy Non-Discrimination in Data-Driven Systems</article-title>
          .
          <source>CoRR abs/1707.08120</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zick</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Algorithmic Transparency via Quantitative Input Influence</article-title>
          .
          <source>In Proceedings of the 37th IEEE Conference on Security and Privacy (Oakland).</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>S.M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A Unified Approach to Interpreting Model Predictions</article-title>
          .
          <source>In Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NIPS)</source>
          .
          <fpage>4768</fpage>
          -
          <lpage>4777</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>“Why Should I Trust You?”: Explaining the Predictions of Any Classifier</article-title>
          .
          <source>In Proceedings of the 22nd International Conference on Knowledge Discovery and Data Mining (KDD)</source>
          .
          <fpage>1513</fpage>
          -
          <lpage>1522</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>L.S.</given-names>
            <surname>Shapley</surname>
          </string-name>
          .
          <year>1953</year>
          .
          <article-title>A Value for n-Person Games</article-title>
          . In
          <source>Contributions to the Theory of Games</source>
          , vol.
          <volume>2</volume>
          . Princeton University Press,
          <fpage>307</fpage>
          -
          <lpage>317</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Programs as Black-Box Explanations</article-title>
          .
          <source>CoRR abs/1611.07579</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taly</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yan</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Axiomatic Attribution for Deep Networks</article-title>
          .
          <source>arXiv preprint arXiv:1703.01365</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>H.P.</given-names>
            <surname>Young</surname>
          </string-name>
          .
          <year>1985</year>
          .
          <article-title>Monotonic solutions of cooperative games</article-title>
          .
          <source>International Journal of Game Theory</source>
          <volume>14</volume>
          ,
          <issue>2</issue>
          (
          <year>1985</year>
          ),
          <fpage>65</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>