Can we do better explanations? A proposal of User-Centered Explainable AI

Mireia Ribera, ribera@ub.edu, Universitat de Barcelona - Departament de Matemàtiques i Informàtica, Institut de Matemàtica de la Universitat de Barcelona, Barcelona, Spain
Agata Lapedriza, alapedriza@uoc.edu, Universitat Oberta de Catalunya, Barcelona, Spain

ABSTRACT
Artificial Intelligence systems are spreading to multiple applications and are used by an increasingly diverse audience. With this change of use scenario, AI users will increasingly require explanations. The first part of this paper reviews the state of the art of Explainable AI and highlights how current research pays too little attention to whom explanations are targeted. The second part of the paper proposes a new explainability pipeline in which users are classified into three main groups (developers or AI researchers, domain experts, and lay users). Building on the cooperative principles of conversation, we discuss how creating different explanations for each of the targeted groups can overcome some of the difficulties of creating good explanations and of evaluating them.

CCS CONCEPTS
• Computing methodologies → Artificial intelligence; • Human-centered computing → HCI theory, concepts and models.

KEYWORDS
Explainability; XAI; Conversational interfaces; User-centered design; HCI

ACM Reference Format:
Mireia Ribera and Agata Lapedriza. 2019. Can we do better explanations? A proposal of User-Centered Explainable AI. In Joint Proceedings of the ACM IUI 2019 Workshops, Los Angeles, USA, March 20, 2019. ACM, New York, NY, USA, 7 pages.

1 INTRODUCTION
Artificial Intelligence (AI) is increasingly being used in more contexts and by a more diverse audience. In the future, AI will be involved in many decision-making processes. For example, in the medical field there will be AI systems that help physicians make diagnoses, whereas companies will use AI support in the interviewing process of recruiting campaigns. In these cases, different types of users, most of them without a deep understanding of how AI is built, will directly interact with AI systems and will need to understand, verify, and trust their decisions. This change of use scenarios is similar to the one that occurred in the 1980s with the popularization of computers. When computers started to be produced massively and targeted at non-expert users, a need emerged to improve human-computer interaction so that technology would become accessible to less specialized users. In a similar way, a need to make AI understandable and trustworthy for general users is now emerging.

In this new, broad scenario of AI use contexts, explainability plays a key role for many reasons, since in many cases the user interacting with the AI needs more reasoned information than just the decision made by the system. Plenty of attention is being paid to the need for explainable AI. In the first part of this paper we review the five main aspects that are the focus of recent surveys and theoretical frameworks of explainability: (I) what an explanation is, (II) what the purposes and goals of explanations are, (III) what information explanations have to contain, (IV) what types of explanations a system can give, and (V) how the quality of explanations can be evaluated.
This review reveals, in our opinion, how the current theoretical approach to explainable AI does not pay enough attention to what we believe is a key component: to whom the explanations are targeted.

In the second part of this paper, we argue that explanations cannot be monolithic and that each stakeholder looks for explanations with different aims, different expectations, different background, and different needs. Building on the conversational nature of explanations, we outline how explanations could be created to fulfill the demands set on them.

2 HOW DO WE APPROACH EXPLAINABILITY?
Defining what an explanation is is the starting point for creating explainable models, and it allows us to set out the three pillars on which explanations are built: the goals of an explanation, the content of an explanation, and the types of explanations. The last key aspect reviewed in this section is how explanations can be evaluated, which is a critical point for the progress of explainable AI.

Definition of explanation
Explanations are "ill-defined" [17]. In the literature, the concept of explainability is related to transparency, interpretability, trust, fairness, and accountability, among others [1]. Interpretability, sometimes used as a synonym of explainability, is defined by Doshi-Velez and Kim [6] as "the ability to explain or to present in understandable terms to a human". Gilpin et al. [7], on the contrary, consider explainability a broader subject than interpretability; these authors state that a model is interpretable if it is "able to summarize the reasons for [system] behavior, gain the trust of users, or produce insights about the causes of decisions", whereas an explainable AI needs, in addition, "to be complete, with the capacity to defend [its] actions, provide relevant responses to questions, and be audited". Rudin [24] defines Interpretable Machine Learning in a more restricted sense, as "when you use a model that is not a black box", while Explainable Machine Learning is, for this author, "when you use a black box and explain it afterwards".

Miller [19] provides an interesting review of social science constructs to find the theoretical roots of the concept of explainability. For example, Lewis [15] states that "To explain an event is to provide some information about its causal history. In an act of explaining, someone who is in possession of some information about the causal history of some event – explanatory information – tries to convey it to someone else". Halpern and Pearl [12] define a good explanation as a response to a Why question that "(a) provides information that goes beyond the knowledge of the individual asking the question and (b) be such that the individual can see that it would, if true, be (or be very likely to be) a cause of". After the review, Miller [19] extracts four characteristics of explanations: "explanations are contrastive" (why this and not that), "explanations are selected in a biased manner (not everything shall be explained)", "probabilities don't matter", and finally "explanations are social".

From these definitions and the recent reviews of explainability [7, 10] we can conclude that there is no agreement on a specific definition of explanation. However, some relevant points are shared by almost every definition. For example, many definitions relate explanations to "why" questions or causal reasoning. Also, and more importantly, there is a key aspect when trying to define what an explanation is: there are two subjects involved in any explanation, the one who provides it (the system), or explainer, and the one who receives it (the human), or explainee. Thus, when providing AI with explainability capacity, one cannot forget to whom the explanation is targeted.
Goals of explanations (WHY)
According to Samek et al. [25], the need for explainable systems is rooted in four points: (a) Verification of the system: understand the rules governing the decision process in order to detect possible biases; (b) Improvement of the system: understand the model and the dataset in order to compare different models and to avoid failures; (c) Learning from the system: "extract the distilled knowledge from the AI system"; (d) Compliance with legislation (particularly with the "right to explanation" set by the European Union): find answers to legal questions and inform people affected by AI decisions.

Gilpin et al. [7] mostly agree with these goals, adding specific considerations on two of these points: (a) Verification of the system: explanations help to ensure that algorithms perform as expected; and (b) Improvement of the system: in terms of safety against attacks. Guidotti et al. [10] argue for (c) for "the sake of openness of scientific discovery and the progress of research", while Miller [19] directly considers "facilitating learning" the primary function of explanation. Wachter et al. [26] describe in more detail three aims behind the right to explanation: "to inform and help the subject understand why a particular decision was reached, to provide grounds to contest adverse decisions, and to understand what could be changed to receive a desired result in the future, based on the current decision-making model".

Lim et al. [16] add a new goal, relating explainability to (e) Adoption: acceptance of the technology. These authors state that "[the] lack of system intelligibility (in particular if a mismatch between user expectation and system behavior occurs) can lead users to mistrust the system, misuse it, or abandon it altogether".

Doshi-Velez and Kim [6] focus on (b) and (d) and see interpretability as a proxy to evaluate safety and nondiscrimination, which can be related to fairness in AI. They also argue that an explanation is only necessary when wrong results may have an important impact or when the problem is incompletely studied. Rudin [24] agrees with that last view, but also mentions troubleshooting (a) as an important goal. In a more theoretical framework, Wilkenfeld and Lombrozo [27], cited in [19], discuss other functions of explanations, such as persuasion or the assignment of blame, and they draw attention to the fact that the goals of the explainer and the explainee may be different.

Regarding the need for and utility of explanations, Abdul et al. [1] see explanations as a way for humans to remain in control. This view is questioned by Lipton [17], who warns against explanations that "simply be a concession to institutional biases against new methods", raising a deeper reflection on how AI fits into our society: to empower people or to surpass them. Finally, Rudin [24], in her controversial video seminar, questions the utility of explanations and states that they only "perpetuate the problem of bad stuff happening", because they act somewhat as a disclaimer. Furthermore, some authors agree that the explainee will only require explanations when the system decision does not match her expectations [8].

Despite the disagreement of some experts on the need for explanations, there are more reasons supporting that need than opposing it. In particular, it is very likely that users expect an explanation when an AI decision has important economic consequences or affects their rights. However, trying to cover all goals with a single explanation is overwhelming [7]. If we take the explainee into account, a practical solution could be to create several explanations, each serving only the specific goals related to a particular audience.
Content to include in the explanation (WHAT)
Lim et al. [16] say that an explanation should answer five questions: "(1) What did the system do?, (2) Why did the system do P?, (3) Why did the system not do X?, (4) What would the system do if Y happens?, (5) How can I get the system to do Z, given the current context?". These questions are very similar to the explanatory question classes introduced by Miller [19]. Gilpin et al. [7], on the other hand, add a new question related to the data stored by the system: (6) "What information does the system contain?"

Lim et al. [16] relate their five questions to Don Norman's gulfs of evaluation and execution: questions 1-3 address the separation between the perceived functionality of the system and the user's intentions and expectations, and questions 4-5 address the separation between what can be done with the system and the user's perception of its capacity. These authors tested the questions on an explanatory system with final users and concluded that "Why questions" (2) were the most important.

Some authors categorize explanations depending on whether they explain how the model works or the reason for a particular output [7, 10]. Although both aspects are connected, explanations can be more specific when focused on a local result. In the first case, the explanation is more global and can help users build a mental model of the system. This global explanation also includes the representation learned by the model (for example, in a neural network, the roles of layers or units), which allows users to understand the structures of the system. In the latter case, the explanation focuses on a specific output and allows users to better understand the reasons why that specific output occurred, or the relation between a specific input and its output.

Overall, there are multiple questions that good explanations should provide answers to. We observe, however, a quite consistent agreement on the importance of the "Why" questions. Furthermore, some explanation contents are more interesting or important for some users than for others. For example, researchers developing the AI system might be interested in technical explanations of how the system works in order to improve it, while lay users, with no technical background, would not be interested in this type of explanation at all.

Types of explanations (HOW)
In this section we review the different ways of classifying explanations according to how they are generated and delivered to the user.

In terms of generation, explanations can be an intrinsic part of the system, which becomes transparent and open to inspection (for some authors this is called interpretability). For example, CART (Classification and Regression Trees) [2] is a classical decision tree algorithm that functions as a white-box AI system. On the contrary, explanations can be post-hoc, built once the decision is already made [17, 20]. For instance, LIME by Ribeiro et al. [23] consists of a local surrogate model that reproduces the system behavior for a set of inputs. Detailed pros and cons of each of these two types are discussed in [20]. In particular, while intrinsic explanations need to impose restrictions on the design of the system, post-hoc explanations are usually unable to give information on the representation learned by the system or on how the system is internally working.
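To make this distinction concrete, the sketch below (ours, not taken from [2], [20], or [23]) contrasts the two generation strategies: a shallow CART-style decision tree whose printed structure is itself the explanation, and LIME used as a post-hoc local surrogate around a single prediction of a black-box model. It assumes scikit-learn and the third-party lime package are installed; the dataset and model choices are purely illustrative.

```python
# Illustrative sketch only: intrinsic (white-box) vs. post-hoc (LIME) explanations.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer  # third-party `lime` package

iris = load_iris()
X, y = iris.data, iris.target

# Intrinsic explanation: the model itself is transparent and open to inspection.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=iris.feature_names))

# Post-hoc explanation: a local surrogate approximates the black box
# around one specific input, without opening the model.
black_box = RandomForestClassifier(n_estimators=200).fit(X, y)
explainer = LimeTabularExplainer(X, feature_names=iris.feature_names,
                                 class_names=list(iris.target_names))
local_explanation = explainer.explain_instance(X[0], black_box.predict_proba,
                                               num_features=4)
print(local_explanation.as_list())  # per-feature weights for this one prediction
```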
These questions are system, post-hoc explanations are usually unable to give very similar to the explanatory question classes introduced information on the representation learned by the system or by Miller [19]. Gilpin et al. [7], on the contrary, add a new on how the system is internally working. question related to the data stored by the system: (6) "What Regarding to the explanation modality, we can find expla- information does the system contain?" nations in natural language with "analytic (didactic) state- Lim et al. [16] relate their five questions to Don Norman ments [...] that describe the elements and context that sup- gulfs of evaluation and execution, solving questions 1-3 the port a choice", as visualizations, "that directly highlight por- separation between perceived functionality of the system and tions of the raw data that support a choice and allow viewers the user’s intentions and expectations, and questions 4-5 the to form their own perceptual understanding", as cases or separation between what can be done with the system and "explanations by example", "that invoke specific examples or the user’s perception of its capacity. These authors tested the stories that support the choice", or as rejections of alternative questions on an explanatory system with final users and they choices or "counterfactuals" "that argue against less preferred concluded that "Why questions" (2) were the most important. answers based on analytics, cases, and data" [11, 17]. Cur- Some authors categorize the explanations depending on rently visualizations are probably the most common type whether they explain how the model works or the reason of of explanations (see [28] for a recent review), with a longer a particular output [7, 10]. Although both aspects are con- tradition of interaction and evaluation methods [13]. nected, explanations can be more specific when focused on a We can see there exist many types of explanations and, local result. In the first case, the explanation is more global, although visualizations are among the most adopted, it is and can help users to build a mental model of the system. This not clear when or why one type is better than another. In global explanation includes also the representation learned some cases the most suitable modality will depend on the by the model (for example, in a Neural Network, what are content of the explanation. Furthermore, the user should also IUI Workshops ’19, March 20, 2019, Los Angeles, USA Ribera and Lapedriza play an important role on deciding what type of explanation 3 CAN WE DO BETTER? is the most appropriate according to background, specific In this section we critically review the previous sections and expectations or needs. give insights on new directions to create better explanations. We build our proposal upon two main axes: (1) to provide Evaluation of explanations more than one explanation, each targeted to a different user Evaluating explanations is maybe the most immature aspect group, and (2) making explanations that follow cooperative on the research on explainable AI. Lipton [17] and Miller principles of human conversation. [19] openly question the existing practices for evaluating In order to better contextualize current developments in explanations. Lipton says that "the question of correctness explainability, we suggest to take into account the commu- has been dodged, and only subjective views are proposed". 
The DARPA Explainable AI (XAI) program [11], started in 2017, tries to cover current gaps in this topic and opens many scientific research lines to address them. The program conceptualizes the goals of explanation as generating trust and facilitating appropriate use of the technology (focusing mainly on adoption, goal (e) of explanations). The project relates the explanation goals to several elements to evaluate, each one linked to a corresponding indicator.

In the Open Learner Modelling domain, Conati et al. [3], building on previous experiments by Mabbott and Bull [18], point out some key considerations in designing explanations, such as considering the explainee, as we suggest, and also the reason for building the system, which aspects are made available to the user, and the degree to which they can be manipulated by the user.

In a more technical vein, Gilpin et al. [7], after a review of the literature, cite four evaluation methods: the first two are related to processing (completeness with respect to the model, completeness on a substitute task), while the others relate to representation (completeness on a substitute task, bias detection) and to explanation producing (human evaluation, bias detection).

Setting clear evaluation goals and metrics is critical in order to advance research on explainability, and more effort is needed in this area. How can we say that one system is better than another if we do not know why? The proposals of Doshi-Velez and Kim [6] and DARPA [11] have strong points, but they do not cover all the goals set for explainable systems, nor all the modalities and explanation contents.

3 CAN WE DO BETTER?
In this section we critically review the previous sections and give insights into new directions for creating better explanations. We build our proposal upon two main axes: (1) providing more than one explanation, each targeted to a different user group, and (2) making explanations that follow the cooperative principles of human conversation.

In order to better contextualize current developments in explainability, we suggest taking into account the communicative nature of explanations and categorizing explainees into three main groups, based on their goals, background, and relationship with the product [4, 5]:

• Developers and AI researchers: investigators in AI, software developers, or data analysts who create the AI system.
• Domain experts: specialists in the area of expertise to which the decisions made by the system belong. For example: physicians or lawyers.
• Lay users: the final recipients of the decisions. For example: a person accepted or rejected for a loan, or a patient who has been diagnosed.

Starting with explainability goals, if we take a closer look at the listed goals, we can detect different needs and explainee profiles for each of them. The (a) verification and (b) improvement goals clearly appeal to a developer or researcher profile, who wants to improve the algorithm's parameters or optimization. These goals can be attained with the help of the domain experts whom the tool is intended to help: they will be the ones who detect possible failures of the system. However, for domain experts the main goal can be to learn from the system (c), to understand the mechanisms of inference or correlation that the system uses, in order to improve their own decision methods or to hypothesize possible general rules. For domain experts, the explainer's goal in providing explanations is to support the adoption of the system (e). The last goal mentioned by Samek, the right to an explanation, is clearly targeted at lay users, because the system decisions may have economic or personal implications for them, although this goal can also be relevant for domain experts, who might bear the legal responsibility for the final decision.
Related to explanation content, Doshi-Velez and Kim [6] argue that different explanations are needed depending on global versus local scope, thematic area, severity of incompleteness, time constraints, and the nature of user expertise. We can delve a bit more into this idea, particularly the need to tailor explanations to user expertise, and exemplify it with the following scenario. Let us say we have a system that offers explanations at the representational level, describing data structures; these should clearly not be communicated in the same language for developers as for domain experts. Even domain experts from different areas will require different kinds of explanations [22].

In terms of types of explanations, Lipton [17] states that humans do not exhibit transparency, sustaining that human explanations are always post-hoc. On the other side, many authors are concerned about the high complexity of machine learning algorithms and the limits of human reasoning to understand them [26]. This relates to Nielsen's heuristic of progressive disclosure and Shneiderman's visual information-seeking mantra, "Overview first, zoom and filter, then details-on-demand", as techniques to cope with complex information or tasks. To make explanations more human, Naveed, Donkers and Ziegler [21] introduce an interesting framework of explanations based on Toulmin's argumentation model. This proposal communicates decisions by giving evidence, like facts or data, that supports the decision, and by relating both the evidence and the decision to contextual information. Other authors suggest interaction as a way to explore the explanation space: "allowing people to interactively explore explanations for algorithmic decision-making is a promising direction" [1]; "By providing interactive partial dependence diagnostics, data scientists can understand how features affect the prediction overall" [14].

Likewise, Miller [19] criticizes the explanations currently proposed as being too static; he describes them ideally as "an interaction between the explainer and explainee". Delving into the fourth feature he identified in social science theoretical constructs, "explanations are social", this author parallels explanations to conversations. Therefore explanations must follow the cooperative principles of Grice [9] and his four maxims:

1. Quality: make sure that the information is of high quality: (a) do not say things that you believe to be false; and (b) do not say things for which you do not have sufficient evidence.
2. Quantity: provide the right quantity of information: (a) make your contribution as informative as is required; and (b) do not make it more informative than is required.
3. Relation: only provide information that is related to the conversation: (a) be relevant. This maxim can be interpreted as a strategy for achieving the maxim of quantity.
4. Manner: relating to how one provides information, rather than what is provided. This consists of the 'supermaxim' of 'Be perspicuous' and, according to Grice, is broken into various maxims such as: "(a) avoid obscurity of expression; (b) avoid ambiguity; (c) be brief (avoid unnecessary prolixity); and (d) be orderly".

We observe that (1), (2), and (3) refer to the content of the explanation, while (4) refers to the type of explanation. Notice that these four cooperative principles can also be related to other desired properties of explanations [20], such as fidelity or comprehensibility. Our claim is that explainable AI for domain experts and lay users can benefit from the theoretical frameworks developed for human communication.

Finally, considering evaluation, we can also observe that different metrics appeal to different needs and audiences. For example, completeness testing or functionally-grounded evaluation is targeted at developers or AI scientists, task performance and mental model measures appeal to domain experts, whereas trust is intended for domain experts and lay users. If we deliver different explanations, each targeted to one of the above-mentioned groups, it will be easier to evaluate them, since we can use the most suitable metric for each case.
tively explore explanations for algorithmic decision-making is a promising direction" [1] "By providing interactive partial dependence diagnostics, data scientists can understand how features affect the prediction overall" [14]. Likewise, Miller [19] criticizes the current proposed expla- nations as being too static, he describes them ideally as "an interaction between the explainer and explainee". Delving on the fourth feature he identified in social science theoretical constructs: "explanations are social", this author parallels explanations to conversations . Therefore explanations must follow the cooperative principles of Grice [9] and its four maxims: 1. Quality: Make sure that the information is of high quality: (a) do not say things that you believe to be false; and (b) do not say things for which you do not have sufficient evidence; 2. Quantity: Provide the right quantity of informa- tion. (a) make your contribution as informative as is required; and (b) do not make it more informative than is required; 3. Relation: Only provide information that is related to the conversation. (a) Be relevant. This maxim can be interpreted as a strategy for achieving the maxim of quantity; 4. Man- ner: Relating to how one provides information, rather than what is provided. This consists of the ’supermaxim’ of ’Be perspicuous’, and according to Grice, is broken into various maxims such as: "(a) avoid obscurity of expression; (b) avoid ambiguity; (c) be brief (avoid unnecessary prolixity); and (d) be orderly". Figure 1: The system targets explanations to different types We observe that (1), (2), and (3) refer to the content of the of user, taking into account their different goals, and provid- ing relevant (Grice 3rd maxim) and customized information explanation, while (4) refers to the type of explanation. No- to them (Grice 2nd and 4th maxim), as described in section tice that these 4 cooperative principles can also be related to 2. Evaluation methods are also tailored to each explanation other wanted properties of explanations [20], such as fidelity or comprehensibility. Our claim is that Explainable AI for domain-experts and lay users can benefit from the theoretical As argued above, we suggest that AI explanations should frameworks developed for human communication. follow the 4 cooperative principles previously described. IUI Workshops ’19, March 20, 2019, Los Angeles, USA Ribera and Lapedriza In this context, if different explanations are specifically de- users (user-centered design) "information disclosures need signed for different audiences or users, we can design each to be tailored to their audience, with envisioned audiences one with a particular purpose, content, and present it in a including children and uneducated laypeople" , "the utility of specific way. This procedure makes it easier to follow the such approaches outside of model debugging by expert pro- principles of (2) quantity: deliver the right quantity of data grammers is unclear". They also emphasize the need to give and abstraction, and (3) relation: be relevant to each stake- a "minimal amount of information" (be relevant), "counter- holder. Concretely, taking into account the current research factual explanations are intentionally restricted". 
Concretely, taking into account the current research in explainability, we suggest the following three big families of explanations (see the sketch after the list):

- Developers and AI researchers: model inspection and simulation with proxy models. These two types of explanations are very well suited to verifying the system, detecting failures, and giving hints on how to improve it. The mode of communication fits the audience well, since they are able to understand code, data representation structures, and statistical deviations. Completeness tests covering different scenarios can be set up to evaluate the explanation.
- Domain experts: provide explanations through natural language conversations or interactive visualizations, letting the expert decide when and how to question the explanation and leading her to discovery by herself. Explanations must be customized to the discipline of the domain experts and to the context of their application, be it legal or medical decisions or any other, in order to be clear and to use the discipline's terminology. Comprehension tests, performance measures, and trust surveys can be set up to evaluate the explanation.
- Lay users: outcome explanations with several counterfactuals [26] with which users can interact to select the one most interesting for their particular case. This type of explanation parallels human modes of explanation and is very likely to generate trust. Satisfaction questionnaires can be set up to evaluate the explanation.
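The mapping from explainee group to explanation family and evaluation metric listed above can be read as a simple configuration. The code below is our own hypothetical illustration of that idea, not an API defined in this paper or elsewhere; all names are invented for the example.

```python
# Hypothetical sketch of the proposed user-centered explanation pipeline.
# All names are illustrative; nothing here is an existing API.
from dataclasses import dataclass
from enum import Enum, auto


class UserGroup(Enum):
    DEVELOPER = auto()      # developers and AI researchers
    DOMAIN_EXPERT = auto()  # e.g., physicians or lawyers
    LAY_USER = auto()       # final recipients of the decision


@dataclass
class ExplanationPlan:
    modality: str    # how the explanation is delivered (Grice: manner)
    content: str     # what it contains (Grice: quantity and relation)
    evaluation: str  # which metric is used to assess it


# One explanation family per explainee group, as proposed above.
PIPELINE = {
    UserGroup.DEVELOPER: ExplanationPlan(
        modality="model inspection and proxy-model simulation",
        content="parameters, learned representations, statistical deviations",
        evaluation="completeness tests over different scenarios",
    ),
    UserGroup.DOMAIN_EXPERT: ExplanationPlan(
        modality="natural-language dialogue or interactive visualization",
        content="case evidence phrased in the discipline's terminology",
        evaluation="comprehension tests, task performance, trust surveys",
    ),
    UserGroup.LAY_USER: ExplanationPlan(
        modality="interactive counterfactual outcomes",
        content="which minimal changes would alter the decision",
        evaluation="satisfaction questionnaires",
    ),
}


def plan_explanation(user: UserGroup) -> ExplanationPlan:
    """Select the explanation family for the targeted user, as in Figure 1."""
    return PIPELINE[user]
```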
Our proposal is that explanations need to be designed taking into account the type of user they are targeted to, as shown in the explanation pipeline of Figure 1. That means approaching explainable AI from a user-centered perspective, putting the user in a central position. Approaching explainability in this way has three main benefits. First, it makes the design and creation of explainable systems more affordable, because the purpose of the explanation is more concrete and can be defined more specifically than when we try to create a one-size-fits-all, all-audiences explanation. Second, it will increase satisfaction among developers or researchers, domain experts, and lay users, since each of them receives a more targeted explanation that is easier to understand than a general one. Finally, it will be easier to evaluate which explanation is better, because we have metrics that are specific to each case.

Wachter et al.'s [26] proposal of counterfactual explanations to fulfill the right to explanation is a good example that supports the implementation of these principles. In their paper they insist on the need to make explanations adapted to lay users (user-centered design): "information disclosures need to be tailored to their audience, with envisioned audiences including children and uneducated laypeople"; "the utility of such approaches outside of model debugging by expert programmers is unclear". They also emphasize the need to give a "minimal amount of information" (be relevant): "counterfactual explanations are intentionally restricted". Moreover, when the authors talk about the suitability of offering "multiple diverse counterfactual explanations to data subjects", they could benefit from a conversational approach.
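For intuition only, the following sketch shows the core idea behind such counterfactual explanations: find a small change to the input that flips the model's decision, which can then be verbalized as "you would have received outcome Y if feature F had taken value V". Wachter et al. formulate this as an optimization problem; the brute-force single-feature search below is merely our simplified illustration, and the model and features are hypothetical.

```python
# Simplified illustration of the counterfactual idea (not Wachter et al.'s
# actual optimization): search for the smallest single-feature change that
# flips the model's decision. The model passed in is hypothetical.
import numpy as np


def simple_counterfactual(model, x, desired_class, feature_ranges, steps=100):
    """Return (feature_index, new_value) for the cheapest single-feature change
    found that makes `model` predict `desired_class`, or None if none is found.
    `x` is a 1-D NumPy array of feature values."""
    best = None  # (feature_index, new_value, cost)
    for i, (low, high) in enumerate(feature_ranges):
        for value in np.linspace(low, high, steps):
            candidate = x.copy()
            candidate[i] = value
            if model.predict(candidate.reshape(1, -1))[0] == desired_class:
                cost = abs(value - x[i])
                if best is None or cost < best[2]:
                    best = (i, value, cost)
    return None if best is None else (best[0], best[1])


# Hypothetical usage with a loan model: the result can be verbalized as
# "the loan would have been granted if the yearly income were 52,000".
# change = simple_counterfactual(loan_model, applicant_features, desired_class=1,
#                                feature_ranges=[(0, 100_000), (18, 99)])
```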
While the proposed scheme of user-centered explainable AI particularly benefits the quantity and relation principles, the manner can also be chosen to be as appropriate as possible for the user. For example, although natural language descriptions can be a suitable modality for any of the three user groups, the specific vocabulary should be adapted to the user's background. In particular, technical terms are not a good choice for explanations targeted at lay users, and explanations for domain experts should use the terminology of their respective areas. Finally, regarding the quality principle, we think it has to be applied in the same way in all cases, and it is not necessary to take the specific user group into account.

5 CONCLUSION
While there has been great progress in some aspects of explainability techniques, we observe that a key aspect is overlooked in several current approaches: the user to whom the explanation is targeted. Putting explanations in the user's context makes explainability easier to approach than trying to create explainable systems that fulfill all the requirements of a general explanation. In addition, the user-centered framework gives clues on how to create more understandable and useful explanations for any user, because we can follow the thoroughly studied principles of human communication.

More generally, the increasing demand for explainable AI systems and the different backgrounds of the stakeholders of machine learning systems justify, in our view, revising the concept of explanations as unitary solutions and proposing the creation of different user-centered explainability solutions, simulating human conversations with interactive dialogues or visualizations that can be explored.

ACKNOWLEDGMENTS
This work has been partially supported by the Spanish project TIN2016-74946-P (MINECO/FEDER, UE) and the CERCA Programme / Generalitat de Catalunya. The icons used in Figure 1 are from Flaticon, made by Freepik and Smashicons. We thank Jordi Vitrià for his review and suggestions on the whole article.

REFERENCES
[1] Ashraf Abdul, Jo Vermeulen, Danding Wang, Brian Y. Lim, and Mohan Kankanhalli. 2018. Trends and Trajectories for Explainable, Accountable and Intelligible Systems: An HCI Research Agenda. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems - CHI '18. Association for Computing Machinery, Montreal, Canada, 1–18. https://doi.org/10.1145/3173574.3174156
[2] L. Breiman. 1984. Algorithm CART. Classification and Regression Trees. California Wadsworth International Group, Belmont, California (1984).
[3] Cristina Conati, Kaska Porayska-Pomsta, and Manolis Mavrikis. 2018. AI in Education needs interpretable machine learning: Lessons from Open Learner Modelling. arXiv preprint arXiv:1807.00154 (2018).
[4] Alan Cooper et al. 2004. The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity. Sams, Indianapolis.
[5] Alan Cooper, Robert Reimann, and David Cronin. 2007. About Face 3: The Essentials of Interaction Design. John Wiley & Sons.
[6] Finale Doshi-Velez and Been Kim. 2017. A Roadmap for a Rigorous Science of Interpretability. stat 1050 (2017), 28.
[7] Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. 2018. Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning. (2018). arXiv:1806.00069
[8] Shirley Gregor and Izak Benbasat. 1999. Explanations from Intelligent Systems: Theoretical Foundations and Implications for Practice. MIS Quarterly 23, 4 (Dec. 1999), 497. https://doi.org/10.2307/249487
[9] H. P. Grice. 1975. Logic and Conversation. In Syntax and Semantics 3: Speech Acts. 41–58.
[10] Riccardo Guidotti, Anna Monreale, and Salvatore Ruggieri. 2018. A Survey of Methods for Explaining Black Box Models. ACM Computing Surveys (CSUR) 51, 5 (2018), 42 pages.
[11] David Gunning. 2017. Explainable Artificial Intelligence (XAI). Technical Report. 1–18 pages.
[12] Joseph Y. Halpern and Judea Pearl. 2005. Causes and Explanations: A Structural-Model Approach. Part II: Explanations. The British Journal for the Philosophy of Science 56, 4 (2005), 889–911. https://doi.org/10.1093/bjps/axi148
[13] Jeffrey Heer and Ben Shneiderman. 2012. Interactive dynamics for visual analysis. Communications of the ACM 55, 4 (April 2012), 45–54. https://doi.org/10.1145/2133806.2133821
[14] Josua Krause, Adam Perer, and Kenney Ng. 2016. Interacting with Predictions: Visual Inspection of Black-box Machine Learning Models. In CHI '16. 5686–5697. https://doi.org/10.1145/2858036.2858529
[15] David Lewis. 1986. Causal Explanation. In Philosophical Papers, Vol. II. Oxford University Press, New York, Chapter 22, 214–240.
[16] Brian Y. Lim, Anind K. Dey, and Daniel Avrahami. 2009. Why and Why Not Explanations Improve the Intelligibility of Context-Aware Intelligent Systems. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2119–2128.
[17] Zachary C. Lipton. 2016. The Mythos of Model Interpretability. (2016). arXiv:1606.03490 http://arxiv.org/abs/1606.03490
[18] Andrew Mabbott and Susan Bull. 2006. Student preferences for editing, persuading, and negotiating the open learner model. In International Conference on Intelligent Tutoring Systems. Springer, 481–490.
[19] Tim Miller. 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267 (2019), 1–38. https://doi.org/10.1016/j.artint.2018.07.007
[20] Christoph Molnar. 2018. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/
[21] Sidra Naveed, Tim Donkers, and Jürgen Ziegler. 2018. Argumentation-Based Explanations in Recommender Systems. In Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization - UMAP '18. 293–298. https://doi.org/10.1145/3213586.3225240
[22] Forough Poursabzi-Sangdeh. 2018. Design and Empirical Evaluation of Interactive and Interpretable Machine Learning. Ph.D. Dissertation. University of Colorado, Boulder. https://scholar.colorado.edu/csci
[23] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135–1144.
[24] Cynthia Rudin. 2018. Please stop doing "explainable" ML. (2018). https://bit.ly/2QmYhaV
[25] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. 2017. Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models. ITU Journal: ICT Discoveries, Special Issue No. 1 (2017). https://www.itu.int/en/journal/001/Documents/itu2017-5.pdf
[26] Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2018. Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology 31, 2 (2018), 1–52. https://doi.org/10.2139/ssrn.3063289 arXiv:1711.00399
[27] Daniel A. Wilkenfeld and Tania Lombrozo. 2015. Inference to the Best Explanation (IBE) Versus Explaining for the Best Inference (EBI). Science & Education 24, 9-10 (2015), 1059–1077. https://doi.org/10.1007/s11191-015-9784-4
[28] Quanshi Zhang and Song-Chun Zhu. 2018. Visual Interpretability for Deep Learning: A Survey. Frontiers in Information Technology & Electronic Engineering 19 (2018), 27–39. https://doi.org/10.1631/fitee.1700808 arXiv:1802.00614