Futility of a Right to Explanation

Jarek Gryz
York University, Toronto
jarek@cse.yorku.ca

Nima Shahbazi
Mindle AI
nima@mindle.ai

ABSTRACT
In the last few years, interpretability of classification models has been a very active area of research. Recently, the concept of interpretability was given a more specific legal context. In 2018, the EU's General Data Protection Regulation came into effect, with a Right to Explanation for people subjected to automated decision making. The Regulation itself is very brief on what such a right might imply. In this paper, we attempt to explain what the Right to Explanation may involve. We then argue that this right would be very difficult to implement due to technical challenges. We also maintain that the Right to Explanation may not be needed and may sometimes even be harmful. We propose instead an external evaluation of classification models with respect to their correctness and fairness.

KEYWORDS
right to explanation, explainable AI, algorithmic transparency

© 2020 Copyright held by the owner/author(s). Published in Workshop Proceedings of the EDBT/ICDT 2020 Joint Conference, March 30-April 2, 2020 on CEUR-WS.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC BY 4.0.

1    INTRODUCTION
Recent advances in the development of machine learning algorithms, combined with the massive amounts of data available to train them, have dramatically changed their utility and scope of application. Software tools based on these algorithms are now routinely used in the criminal justice system, financial services, medicine, research, and even in small business. Many decisions affecting important aspects of our lives are now made by algorithms rather than humans. Clearly, there are many advantages to this transformation. Human decisions are often biased and sometimes simply incorrect. Algorithms are also cheaper and easier to adjust to changing circumstances.

   Yet there is a price to pay for these benefits. Despite promises to the contrary, several cases of bias and discrimination have been discovered in algorithmic decision-making. Of course, once discovered, these biases can be removed and algorithms can be validated to be non-discriminatory before they are deployed. But there is still widespread uneasiness – particularly among legal experts – about the use of these algorithms. Most of these algorithms are self-learning, and their designers have little control over the models generated from the training data. In fact, computer scientists were not really interested in studying the models, because they are often extraordinarily complex (hence they are often referred to as black boxes). The standard approach was that as long as an algorithm worked correctly, nobody bothered to analyze how it worked.

   This approach changed once tools based on machine learning algorithms became ubiquitous and began directly affecting the lives of ordinary people. If the decision on how many years you are going to spend in prison is made by an algorithm, you have the right to know how this decision was made. In other words, we need transparency and accountability of the decision-making algorithms.

   In recent years, multiple papers have been published to address the interpretability (variously defined) of models generated by machine learning algorithms. However, a recent publication [8] suggests that the concept of interpretability is not only muddled but also badly motivated. The approval of the General Data Protection Regulation (GDPR) in 2016 prompted a discussion¹ of a related legal concept, a right to explanation. If this right is indeed mandated by GDPR (which has been in effect since 2018), then software companies conducting business in Europe are immediately liable if they are not able to satisfy that right. A discussion of what it would take to comply with this new requirement is thus already overdue.

   In this paper, we discuss that very concept. Our conclusion is mostly negative; we do not believe that a right to explanation can be successfully implemented or that it is useful. In Section 2, we set the stage for the discussion by defining the context precisely. In Section 3, we review recent work on model explanation and show that it has little relevance for implementing the right to explanation. Section 4 presents a case study of a recommendation system we have developed recently. We show that the model generated by the algorithms of that system would be very difficult – if at all possible – to explain to an ordinary user. Thus, in Section 5, we contend that we do not need a right to explanation in the first place and show that it can in fact be harmful. We conclude the paper in Section 6.

   We should also point out that most of the diagnoses and opinions expressed in this paper apply as much to the wider concept of interpretability as they do to a right to explanation.

¹ The authors of [5] claim that this right is already mandated by GDPR. The authors of [23] disagree but believe that it should be there and show how to modify the language of GDPR to do so.

2    WHAT IS A RIGHT TO EXPLANATION
Articles 13 and 14 of GDPR state that a data subject has the right to “meaningful information about the logic involved”. In addition, Recital 71 states more clearly that a person who has been subject to automated decision-making:

      should be subject to suitable safeguards, which should include specific information to the data subject and the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision

   This requirement is clear in one aspect: the person has the right to seek an explanation of a specific decision and only ex post. This is important, as it does not require the controller of the software to reveal the complete functionality of the system. Still, GDPR does not elucidate anywhere what constitutes an “explanation”, and we will attempt to do just that.

   We make two, hopefully harmless, assumptions:
   (1) We only consider decision-making tools based on classification algorithms. Classification algorithms are “trained” on data obtained from past decisions to create a model which is then used to arrive at future decisions. It is this model that requires an explanation, not the algorithm itself (in fact, different algorithms may arrive at very similar models).
   (2) We assume that the output of the algorithm is a numerical value from the range 1 to n. This covers both yes/no answers (“yes” may then be represented by the first half of the numbers and “no” by the second half) as well as categorical values (each number represents one category and we no longer assume any ordering between the numbers).

   With these two assumptions, we can now fix the setting in which we expect the right to explanation to be exercised. A user submits information about herself to a decision-making tool and receives the answer X (X can be a number, a No, or a category such as “high risk”). From the wording of Recital 71 (the user has the right to challenge the decision), it is clear that the right to explanation is designed for cases where the answer the user received is different from what she expected or hoped for (say, she expected Y). The most straightforward question she may ask then is: “Why X?”. When the user asks “Why X?” while expecting Y as an answer, she in fact means to ask: “Why X rather than Y?”. This is the type of question that calls for a contrastive explanation [11]. The answer we provide to the user must then contain not only an explanation of why the information she provided about herself generated answer X, but also of what in her data would have to change to generate answer Y (the one she was expecting).
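To make the computational task concrete, below is a minimal sketch (our own illustration, not a method prescribed by GDPR or by any of the cited works) of answering “Why X rather than Y?” for a toy loan classifier: given the user's feature vector, search for a small change that flips the prediction from X to Y. The data, the model, and the helper find_contrastive_change are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical loan data: columns = [income (k$), outstanding_debt (k$)], label 1 = approve.
X_train = np.array([[60, 5], [80, 3], [30, 20], [40, 15], [90, 2], [35, 18]], dtype=float)
y_train = np.array([1, 1, 0, 0, 1, 0])
model = LogisticRegression().fit(X_train, y_train)

def find_contrastive_change(model, x, desired, feature_names, steps):
    """Try single-feature perturbations until one yields the desired answer."""
    for i, name in enumerate(feature_names):
        for delta in steps:
            x_new = x.copy()
            x_new[i] += delta
            if model.predict([x_new])[0] == desired:
                return f"change {name} by {delta:+g} (from {x[i]:g} to {x_new[i]:g})"
    return "no single-feature change found"

applicant = np.array([38.0, 16.0])
print(model.predict([applicant])[0])      # answer X: typically 0 (rejected) on this data
# "Why X rather than Y?": what would have to change to obtain Y = 1 (approved)?
print(find_contrastive_change(model, applicant, desired=1,
                              feature_names=["income", "debt"],
                              steps=[-15, -10, -5, 5, 10, 20, 40, 60]))
```

Even this toy version hints at the difficulties discussed below: there are many candidate changes, and nothing in the search knows which of them would be meaningful to a human.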
   When people ask “Why X?”, they are looking for a cause of X. Thus, if X is a negative decision on a loan application, we would need to specify what information in the application (the features used as input to the model) caused X. We also need to remember that the decision-making tool that has made a decision for the user is replacing a human being who used to make such decisions. In fact, a person who reports a decision to the user may not even clearly state that it is the verdict of an algorithm (judges in the US routinely use software-based risk assessment tools to help them in sentencing). The user may thus expect that an explanation provided to her uses the language of social attribution [11], that is, explains the behavior of the algorithm using folk psychology. This may seem to be an excessive requirement, but as we will show later in the paper, an explanation that does not take human psychology and social relations into account can be useless.

   Last but not least, we need to be able to evaluate the quality of an explanation. This is important because – as we show in Section 3 – there are usually multiple ways of explaining X (and there are always multiple ways of explaining Y). People prefer explanations that are simpler (cite fewer causes) and more general (they explain more events) [9].

3    STATE OF THE ART IN MODEL EXPLANATION
Three barriers to the transparency of algorithms in general are usually distinguished: (1) intentional concealment, whose objective is the protection of intellectual property; (2) lack of technical literacy on the part of the users; and (3) intrinsic opacity, which arises from the nature of machine learning methods. The right to explanation is probably void when trade secrets are at stake (the German commentary to GDPR states that explicitly [23]), but we are still left with the other two barriers. In fact, these two barriers are dependent upon each other: the complexity of machine learning methods is positively correlated with the level of technical literacy required to comprehend them. We will claim that, given the current level of educational attainment in the general population and the complexity of machine learning algorithms, these barriers are insurmountable.

   Let us start by putting to rest two “solutions” to this problem that have been proposed in the literature on the subject. Thus, [7] suggest the following to address barrier (2):

      This kind of opacity can be attenuated with stronger education programs in computational thinking and “algorithmic literacy” and by enabling independent experts to advice those affected by algorithmic decision-making

   First, even if we manage to strengthen technical literacy education (which is very unlikely, given how successful we have been so far in this area), we are still left with the 80% of the population which completed its education a long time ago but may still want to use the right to explanation. Second, the cost of employing independent experts would be prohibitive and is just not feasible. (It is also not clear how exactly these experts might be helpful.) As a solution to barrier (3), [23] suggest the following to be provided to a user as an explanation:

      Evidence regarding the weighting of features, decision tree or classification structure, and general logic of the decision-making system may be sufficient.

   Indeed, this type of evidence would certainly be sufficient to understand how the system arrived at a decision. But it is completely unrealistic to expect that a layperson would be able to grasp these concepts. Anyone who has taught a machine learning course at a university knows that the concepts of decision trees or neural networks are hard to grasp even for computer science majors.
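For illustration only (this is our sketch, not something proposed in [23]), the “evidence regarding the weighting of features” for a black-box classifier might look like the permutation importances printed below. The dataset, the model, and the feature names are hypothetical; the point is that even this compact form of evidence presupposes familiarity with trained models, held-out data, and importance scores.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical credit data: income, debt, years_employed; approve if income - 2*debt > 20.
X = rng.normal(loc=[50, 10, 5], scale=[15, 5, 3], size=(1000, 3))
y = (X[:, 0] - 2 * X[:, 1] > 20).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# "Evidence regarding the weighting of features": how much each feature matters.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, score in zip(["income", "debt", "years_employed"], result.importances_mean):
    print(f"{name:>15}: {score:.3f}")
```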
   In the last few years there has been very intensive work on “black-box” model explanation. Some of this work [1, 2, 10, 13, 19, 22] has been designed specifically for experts. Interpretability of a model is a key ingredient of a robust validation procedure in applications such as medicine or self-driving cars. But there has also been some innovative work on model explanation for its own sake: [3, 4, 6, 15, 16, 18, 20, 24, 25]. Most of these papers are still addressed to experts, with the aim of providing insights into the models they create or use. In fact, only in the last three papers mentioned above were explanations tested on human subjects, and even then a certain level of sophistication was expected on their part (from the ability to interpret a graph or a bar chart to having completed a graduate course on machine learning). Most importantly, though, all of these works explain only certain aspects of a model (for example, showing which features influence the decision of the algorithm the most). None of them even attempts to fully explain two contrasting paths in a model leading to distinct classification results (which, as we argued above, is required for a contrastive explanation).

4    MODEL EXPLANATION IS HARD
We believe that explaining the black-box model of a machine learning algorithm is much harder than is usually assumed. To make our case more vivid, we will describe our recent work [17] on designing a song recommendation system for KKBOX, Asia's leading music streaming service provider.
   KKBOX provided a training data set that consists of information about listening sessions for each unique user-song pair within a specific timeframe. The features available to the algorithm include information about the users, such as id, age, gender, etc., and about the songs, such as length, genre, singer, etc. The training and test data are selected from users' listening history in a given time period and contain around 7 and 2.5 million unique user-song pairs, respectively. Although the training and test sets are split based on time and are ordered chronologically, the timestamps for the train and test sets are not provided. It is worth mentioning that this structure also suffers from the cold start problem: 14.5% of the users and 26.6% of the songs in the test data do not appear in the training data.
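Cold-start rates of this kind are simple to measure from the outside. The sketch below is ours (the toy tables stand in for the actual KKBOX files, and the column names msno and song_id are only our assumption about the schema); it computes the fraction of test users and songs never seen during training.

```python
import pandas as pd

# Toy stand-ins for the train/test interaction tables (real files not reproduced here).
train = pd.DataFrame({"msno": ["u1", "u1", "u2", "u3"], "song_id": ["s1", "s2", "s2", "s3"]})
test = pd.DataFrame({"msno": ["u2", "u4", "u5"], "song_id": ["s3", "s4", "s1"]})

def cold_start_rate(train_ids, test_ids):
    """Fraction of distinct ids in the test set that never appear in the training set."""
    train_ids, test_ids = set(train_ids), set(test_ids)
    return len(test_ids - train_ids) / len(test_ids)

print("cold-start users:", cold_start_rate(train["msno"], test["msno"]))
print("cold-start songs:", cold_start_rate(train["song_id"], test["song_id"]))
```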
   The performance of any supervised learning model relies on two principal factors: predictive features and an effective learning algorithm. Very often, these features are only implicit in the training data and the algorithm is not able to extract them itself. Feature engineering is an approach that exploits the domain knowledge of an expert to extract from the data set features that should generalize well to the unseen data in the test set. The quality and quantity of the features have a direct impact on the overall quality of the model. In our case, we created (or extracted, because they were implicitly present in the data) certain statistical features such as: the number of sessions per user, the number of songs per session, or the length of time a user has been registered with KKBOX. We also tried to capture changes in user behavior over time with the following approach: for each user, we looked at how the number of songs he or she listened to per session changed over time. For that, we created two linear regression models: the first was fitted to the number of songs per user session and the second to the number of artists per user session. Finally, the following features were extracted from the linear models: the slope of the model, the first and last predicted values, and the difference between the first and last predicted values.
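A minimal sketch of such a trend feature (our illustration with made-up session counts; the actual pipeline in [17] is considerably more involved): fit a line to one user's per-session counts and keep the slope, the first and last fitted values, and their difference.

```python
import numpy as np

def trend_features(counts_per_session):
    """Fit a line to per-session counts; return (slope, first_fit, last_fit, difference)."""
    sessions = np.arange(len(counts_per_session))
    slope, intercept = np.polyfit(sessions, counts_per_session, deg=1)
    first_fit = intercept                         # fitted value at the first session
    last_fit = slope * sessions[-1] + intercept   # fitted value at the last session
    return slope, first_fit, last_fit, last_fit - first_fit

# Made-up example: one user's songs-per-session over ten sessions.
songs_per_session = [12, 14, 11, 15, 17, 16, 19, 18, 21, 22]
print(trend_features(songs_per_session))
```

Note that a feature like “slope of a line fitted to your recent listening counts” already has no obvious counterpart in the raw data, which is exactly the paradox described next.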
   As a result, we increased the number of features available to the algorithm by a factor of about 10, to 185. And here is the key point: some of these derived features turned out to be extremely important in determining a user's taste in songs and, as a result, the recommendation we provide for him or her. Yet none of these features were explicitly present in the original data! The paradox is that if someone asked us to explain how the model worked, we would have had to refer to features NOT present in the data.

   But this is only part of the story. We did not use a single algorithm to make a prediction. We used five different algorithms, all of them very complex (Figure 1 shows the complexity of one of these algorithms: a simplified neural net² structure). Thus, here is another key point: the final model was the weighted average of all five models' predictions. It was NOT the result of one, clean algorithm.

Figure 1: Structure of one of the algorithms used in the recommendation system.

² We do not explain each of these steps in detail, as our point is just to show the complexity of the entire prediction process and not its technical aspects.

   The model that was generated by these algorithms was extremely large and complex. Since we used gradient boosting decision tree algorithms, our model was a forest of such decision trees. The forest contained over 1000 trees, each with 10-20 children at each node and at least 16 nodes deep (it took almost 128GB of RAM to derive the gradient boosting decision tree model and around 28 hours on 4 Tesla T4 GPUs to create the deep neural network model).

   Now, how can a user possibly grasp this model? Assume that a user wants an explanation of why song X was recommended to her rather than song Y. There will be multiple trees supporting the X recommendation as well as the Y recommendation. Which one do we choose? These multiple trees cannot be generalized, as this has already been done by the algorithm (one of the most difficult aspects of algorithms based on decision trees is the optimization that generates the simplest, most general trees). Perhaps we could generalize by approximating the answer? This approach, sometimes advocated in the literature [12], can be harmful, however. Let us say your loan application has been rejected and you get an explanation based on an approximate model. Based on this model, you are told that if your debt goes down to $10,000 you will be approved. You pay off part of your debt to satisfy that condition, apply again, and are rejected again. The reason is that the correct model we have just approximated had a $9,000, not $10,000, outstanding debt condition.
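This failure mode is easy to reproduce. The sketch below is deliberately contrived and entirely ours (it is not taken from [12] or from any deployed system): the “true” model approves only debts of at most $9,000, while a surrogate decision stump fitted to a coarse sample of the black box's behavior learns a threshold of $10,000, so advice read off the surrogate misleads the applicant.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# "True" black-box rule: approve only if outstanding debt is at most $9,000.
def true_model(debt):
    return (debt <= 9000).astype(int)   # 1 = approve, 0 = reject

# Approximate surrogate fitted to a coarse sample of the black box's answers.
debt_grid = np.array([[0.0], [4000.0], [8000.0], [12000.0], [16000.0]])
surrogate = DecisionTreeClassifier(max_depth=1).fit(debt_grid, true_model(debt_grid.ravel()))

print("surrogate threshold:", surrogate.tree_.threshold[0])   # learned split: 10000.0

# Advice read off the surrogate: "get your debt below $10,000". The applicant complies.
applicant_debt = np.array([[9500.0]])
print("surrogate says:", surrogate.predict(applicant_debt)[0])     # 1 = approve
print("true model says:", true_model(applicant_debt.ravel())[0])   # 0 = reject
```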
   Let us summarize the obstacles to explaining the model generated by our system. A user expects a simple answer to the following question: Why did you recommend song X rather than Y?

      • There may be multiple trees (“reasons”) why X was recommended and, similarly, multiple trees why Y was not recommended. Generalizing or approximating these trees to provide a more general answer is not possible, for either technical or psychological reasons.
      • An explanation must refer to the actual features used by the model. Yet most of these features do not appear in the original data that describes user-song interactions. Even worse, many of them have no intuitive meaning, as they are machine-generated.
      • If we give up on model explanation and try instead to describe the algorithms that generated the model, our task is even harder due to the formidable complexity of the algorithmic design (as shown in Figure 1).

   One more comment is in order. Our system used decision trees to build a model. Decision trees are directly interpretable, as each path in a tree lists simple conditions that have to be satisfied to reach a specific decision. Deep neural nets, with weights attached to features and their complex interactions, are not directly interpretable.
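For a single tree, “directly interpretable” means something like the sketch below (our illustration on a toy tree; the feature names are invented): the conditions along one root-to-leaf path can be printed verbatim. The difficulty described above is that our model consisted of over a thousand such trees, built over machine-generated features.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data: [song_length_sec, times_heard_before]; label 1 = "will listen again".
X = np.array([[200, 0], [180, 5], [240, 1], [210, 7], [300, 0], [190, 3]], dtype=float)
y = np.array([0, 1, 0, 1, 0, 1])
feature_names = ["song_length_sec", "times_heard_before"]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

def explain_path(tree, x):
    """Print the conditions along the path taken by a single example."""
    node_ids = tree.decision_path([x]).indices          # nodes visited by this example
    t = tree.tree_
    for node in node_ids:
        if t.children_left[node] == -1:                  # leaf: report the prediction
            print(f"-> predict {tree.classes_[np.argmax(t.value[node])]}")
        else:                                            # internal node: report the test
            op = "<=" if x[t.feature[node]] <= t.threshold[node] else ">"
            print(f"{feature_names[t.feature[node]]} {op} {t.threshold[node]:.1f}")

explain_path(tree, np.array([220.0, 6.0]))
```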
   Complexity of machine learning models is actually worse than we have described above. Machine learning is heuristics-driven, and nobody expects rigorous mathematical proofs of correctness of its algorithms. What often happens is that if a model generated by some algorithm does not classify the test data correctly, a designer will stack another algorithmic layer on top of it in the hope that it improves the results. Sometimes it does, but at this point nobody bothers to explain why that happened. As Ali Rahimi put it in a recent keynote talk at NIPS [14]: “Machine learning has become alchemy (. . . ) many designers of neural nets use technology they do not really understand”. If the people who work with these algorithms do not understand them, how can anybody else?

5    EXPLANATIONS ARE UNNECESSARY AND CAN BE HARMFUL
We do not feel competent to answer the legal question of whether people should have a right to explanation. We do, however, have a few observations regarding the psychological question of whether people actually want that right and will use it. We came across three arguments routinely made to justify the requirement to explain the behavior of decision-making algorithms:

      • Continuity: these decisions were previously made by humans whom we could ask for an explanation. We want to keep this option.
      • Gravity: these decisions (judicial sentencing, hiring, college admission) have grave consequences for our lives and therefore must be justified.³
      • Trust: we trust humans more than machines; therefore, even if we do not always ask humans for explanations, we should be able to ask machines.

³ Gravity of a decision does not automatically give us a right to an explanation. In most legal systems, jury verdicts are neither explained nor justified.

   We will address all three arguments with an illustrative example. Imagine you have been diagnosed with cancer and your physician suggests chemotherapy treatment. You may ask whether there are other options available to you, and the doctor presents a few but still recommends chemotherapy. You may inquire further why chemotherapy is your best option, to which she answers that medical studies say so. Again, you may press and ask for details of these studies, but at some point (unless you are a health professional yourself) you will stop understanding her explanations. A decision – potentially a life or death decision – has been made on your behalf, yet you do not insist on a detailed explanation of its validity. In fact, most people will rely on the authority of the physician without asking for any explanation. One may still argue that we do not need explanations because we trust the physician (or trust her more than we trust algorithms or machines). But is it really the physician that we trust? The entire diagnostic process (MRI, X-ray, blood tests, etc.) is performed by machines, the drugs used in treatment are produced by machines, and surgical procedures are performed with significant technological support. Yet we almost never ask how this technology works.

   We believe that the need to get an explanation from decision-making algorithms is a simple consequence of their novelty. When we get an unexpected decision from such an algorithm, we suspect that the algorithm made a mistake and want to see the justification of the decision. In other words, we do not trust the algorithm. But we do not need an explanation to gain that trust. We believe that a much simpler and more convincing way of gaining trust is to show that the algorithm is correct and fair. Fairness and correctness can be easily verified by experts and reported back to the general population. Expert opinion is what we have relied on for at least 100 years in almost all of our technology, from bridge safety to GPS precision. There is no reason why it should not work here.

   But there is more to our skepticism about explanations of decision-making algorithms. We believe not only that they are unnecessary, but that they can also be harmful. One unintended consequence of revealing the mechanism of an algorithm is the ability to game the system. This is unfair both to the users who did not ask for an explanation (or do not have the necessary expertise to understand it) and to the controller of the algorithm. But there is yet another, more serious problem, one which has been overlooked by scholars. Algorithms are supposed to be blind to race, gender, religion, etc. This blindness, however, extends to everything that is not explicitly present in the data, in particular, social context. Imagine a middle-aged woman from a racial minority whose loan application has been rejected. She is told, by means of an explanation, that her application has been rejected primarily because of her address: she lives in a socioeconomically deprived neighborhood, such as Southeast LA, with a documented high loan default rate among its population. But that population also happens to be mostly of the same race as the applicant (which the algorithm, of course, does not know). Needless to say, the applicant would assume that race was the hidden factor behind the negative decision. But the explanation she gets can be even more damaging. Imagine further that – as part of the contrastive explanation – the applicant is told what she should do to get the loan approved. To that effect, the algorithm that runs the explanation module reviews profiles in the model (these could be paths in decision trees which keep information about applicants' features such as income, type of job, age, etc.) to find the ones most similar to the applicant's. In other words, the algorithm looks for a minimal change in the applicant's profile that will give her a positive loan decision. It finds three profiles that are identical to the applicant's except for one attribute. It then suggests that the applicant should either buy a house in Beverly Hills, or increase her income by $100,000, or lower her age.

   This example is not at all contrived. Every AI system is the fabled tabula rasa; it “knows” only as much as it has been told. A classification algorithm trained on banking data has no information about what it takes to buy a house in Beverly Hills or to get a salary increase of $100,000, and it does not know that one cannot lower one's age. It does not “understand” any of its own suggestions, because they are generated by purely syntactic manipulation. In fact, it does not understand anything.
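A toy sketch shows how easily such suggestions arise (this is our own code, not any deployed explanation module; the profiles and attributes are invented): a nearest-profile search that minimizes the number of changed attributes has no notion of which attributes a person can actually change, so “lower your age” is, to the search, as good an answer as any.

```python
# Hypothetical approved profiles "stored in the model" (e.g., paths in decision trees).
approved_profiles = [
    {"income": 160_000, "age": 48, "neighborhood": "Southeast LA"},
    {"income": 60_000,  "age": 48, "neighborhood": "Beverly Hills"},
    {"income": 60_000,  "age": 27, "neighborhood": "Southeast LA"},
]

applicant = {"income": 60_000, "age": 48, "neighborhood": "Southeast LA"}

def suggested_changes(applicant, approved_profiles):
    """For each closest approved profile, list the attribute changes it implies."""
    diffs = [{k: v for k, v in profile.items() if applicant[k] != v}
             for profile in approved_profiles]
    fewest = min(len(d) for d in diffs)
    return [d for d in diffs if len(d) == fewest]

# The search happily proposes raising income by $100,000, moving to Beverly Hills,
# or becoming 21 years younger; nothing tells it that the last one is impossible.
for change in suggested_changes(applicant, approved_profiles):
    print(change)
```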
   Of course, we can try to tweak the explanation module of a specific decision-making system to avoid preposterous and insulting explanations such as these. But we are rather pessimistic about the extent to which this can be done. It seems that we would need to introduce a tremendous amount of background knowledge about human behavior and social relations. This knowledge would then have to be properly organized so that the relevant part of it is easily available for the case at hand. This was tried in the context of knowledge-based systems in the 1980s, unfortunately without much success.
6    CONCLUSIONS
Black-box algorithms make decisions that affect our lives. We do not trust them because we do not know what is happening inside the black box. We want explanations. Yet, as we argued in this paper, explanations at a human level are, for technical reasons, very hard to get. More than that, they can be useless or even harmful. We suggest instead that the algorithms be analyzed from the outside, by looking at their performance. Performance can be evaluated by two fundamental criteria: correctness and fairness. The machine learning community has developed reliable tests to measure algorithm correctness, and we are making good progress in developing methods to test fairness [21]. If the conclusions of this paper are correct, then we need to convince policy makers (such as the authors of GDPR) that performance evaluation is all they can get and that it is also sufficient.
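As a final illustration (our sketch only; real fairness auditing, as surveyed in [21], involves many more metrics and a careful choice of protected attributes), this is the flavor of external check an expert could run without ever opening the black box: compare the model's approval rates and accuracy across groups.

```python
import numpy as np

def external_audit(y_true, y_pred, group):
    """Report per-group approval rate and accuracy for a black-box model's decisions."""
    for g in np.unique(group):
        mask = group == g
        approval = y_pred[mask].mean()
        accuracy = (y_pred[mask] == y_true[mask]).mean()
        print(f"group {g}: approval rate {approval:.2f}, accuracy {accuracy:.2f}")

# Made-up audit data: true outcomes, the black box's decisions, and a protected attribute.
rng = np.random.default_rng(2)
y_true = rng.integers(0, 2, size=1000)
group = rng.integers(0, 2, size=1000)
y_pred = np.where(group == 1, y_true, (rng.random(1000) < 0.4).astype(int))

external_audit(y_true, y_pred, group)   # a large gap between groups flags a problem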

REFERENCES
 [1] Philip Adler, Casey Falk, Sorelle A Friedler, Tionney Nix, Gabriel Rybeck, Carlos Scheidegger, Brandon Smith, and Suresh Venkatasubramanian. 2018. Auditing black-box models for indirect influence. Knowledge and Information Systems 54, 1 (2018), 95–122.
 [2] David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. 2010. How to explain individual classification decisions. Journal of Machine Learning Research 11, Jun (2010), 1803–1831.
 [3] Anupam Datta, Shayak Sen, and Yair Zick. 2016. Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In 2016 IEEE Symposium on Security and Privacy (SP). IEEE, 598–617.
 [4] Ruth C Fong and Andrea Vedaldi. 2017. Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the IEEE International Conference on Computer Vision. 3429–3437.
 [5] Bryce Goodman and Seth Flaxman. 2017. European Union regulations on algorithmic decision-making and a “right to explanation”. AI Magazine 38, 3 (2017), 50–57.
 [6] Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec. 2019. Faithful and customizable explanations of black box models. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. 131–138.
 [7] Bruno Lepri, Nuria Oliver, Emmanuel Letouzé, Alex Pentland, and Patrick Vinck. 2018. Fair, transparent, and accountable algorithmic decision-making processes. Philosophy & Technology 31, 4 (2018), 611–627.
 [8] Zachary C Lipton. 2018. The mythos of model interpretability. Queue 16, 3 (2018), 31–57.
 [9] Tania Lombrozo. 2007. Simplicity and probability in causal explanation. Cognitive Psychology 55, 3 (2007), 232–257.
[10] Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker. 2013. Accurate intelligible models with pairwise interactions. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 623–631.
[11] Tim Miller, Piers Howe, and Liz Sonenberg. 2017. Explainable AI: Beware of inmates running the asylum or: How I learnt to stop worrying and love the social and behavioural sciences. arXiv preprint arXiv:1712.00547 (2017).
[12] Brent Mittelstadt, Chris Russell, and Sandra Wachter. 2019. Explaining explanations in AI. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 279–288.
[13] Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. 2018. Methods for interpreting and understanding deep neural networks. Digital Signal Processing 73 (2018), 1–15.
[14] Ali Rahimi. 2017. NIPS 2017 Test-of-Time Award presentation. Retrieved Jan 11, 2019 from https://www.youtube.com/watch?v=ORHFOnaEzPc
[15] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2018. Anchors: High-precision model-agnostic explanations. In Thirty-Second AAAI Conference on Artificial Intelligence. 1527–1535.
[16] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1135–1144.
[17] Nima Shahbazi, Chahhou Mohammed, and Jarek Gryz. 2018. Truncated SVD-based feature engineering for music recommendation. In WSDM Cup 2018 Workshop, Los Angeles.
[18] Avanti Shrikumar, Peyton Greenside, Anna Shcherbina, and Anshul Kundaje. 2016. Not just a black box: Learning important features through propagating activation differences. arXiv preprint arXiv:1605.01713 (2016).
[19] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034 (2013).
[20] Paolo Tamagnini, Josua Krause, Aritra Dasgupta, and Enrico Bertini. 2017. Interpreting black-box classifiers using instance-level visual explanations. In Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics. 1–6.
[21] Suresh Venkatasubramanian. 2019. Algorithmic fairness: Measures, methods and representations. In Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 481–481.
[22] Marina M-C Vidovic, Nico Görnitz, Klaus-Robert Müller, Gunnar Rätsch, and Marius Kloft. 2015. Opening the black box: Revealing interpretable sequence motifs in kernel-based learning algorithms. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 137–153.
[23] Sandra Wachter, Brent Mittelstadt, and Luciano Floridi. 2017. Why a right to explanation of automated decision-making does not exist in the General Data Protection Regulation. International Data Privacy Law 7, 2 (2017), 76–99.
[24] Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. 2015. Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579 (2015).
[25] Luisa M Zintgraf, Taco S Cohen, Tameem Adel, and Max Welling. 2017. Visualizing deep neural network decisions: Prediction difference analysis. arXiv preprint arXiv:1702.04595 (2017).