Do ML Experts Discuss Explainability for AI Systems? A discussion case in the industry for a domain-specific solution

Juliana Jansen Ferreira, IBM Research, Rio de Janeiro, Brazil, jjansen@br.ibm.com
Mateus de Souza Monteiro, IBM Research, Rio de Janeiro, Brazil, msmonteiro@ibm.com

ABSTRACT
The application of Artificial Intelligence (AI) tools in different domains is becoming mandatory for all companies wishing to excel in their industries. One major challenge for a successful application of AI is to combine machine learning (ML) expertise with domain knowledge to get the best results from AI tools. Domain specialists have an understanding of the data and of how it can impact their decisions. ML experts have the ability to use AI-based tools to deal with large amounts of data and generate insights for domain experts. But without a deep understanding of the data, ML experts are not able to tune their models to get optimal results for a specific domain. Therefore, domain experts are key users for ML tools, and the explainability of those AI tools becomes an essential feature in that context. There are many efforts to research AI explainability for different contexts, users, and goals. In this position paper, we discuss interesting findings about how ML experts can express concerns about AI explainability while defining features of an ML tool to be developed for a specific domain. We analyze data from two brainstorm sessions held to discuss the functionalities of an ML tool to support geoscientists (domain experts) in analyzing seismic data (domain-specific data) with ML resources.

CCS CONCEPTS
Human-centered computing → Empirical studies in HCI

KEYWORDS
Explainable AI, domain experts, ML experts, machine learning, AI development.

ACM Reference format:
Juliana Jansen Ferreira and Mateus Monteiro. 2020. Do ML Experts Discuss Explainability for AI Systems? A case in the industry for a domain-specific solution. In Proceedings of the IUI workshop on Explainable Smart Systems and Algorithmic Transparency in Emerging Technologies (ExSS-ATEC'20). Cagliari, Italy, 7 pages.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction
In the digital transformation era, AI technology is mandatory for companies that want to stand out in their industries. To achieve that goal, companies must make the most of domain data, but also combine it with domain expertise. Machine Learning (ML) techniques and methods are resourceful for dealing with a lot of data, but they need human input to add meaning and purpose to that data. AI technology must empower users [7]. In the first age of AI, research aimed to get away from studying human behavior and to consider the computer as a tool for solving certain classes of problems [19]. But now, the best results come from a partnership between AI and people in which they are coupled very tightly, and the result of this partnership presents new ways for the human brain to think and for computers to process data. The pairing, or the communication, of machines and people is the core material of Human-Computer Interaction (HCI) research. Recently, AI research has been recognizing the HCI view on its advances, since human behavior cannot be left out of the context if AI research is to have an impact on real problems [7][19][29].
The explainability dimension of AI, eXplainable AI (XAI), gains even more importance once people are a component of successful AI applications. While researching explainable AI, we observed that previous work often uses different terms that are sometimes treated as synonyms of explainability or as necessary dimensions to enable it. Interpretability and transparency are terms constantly associated with XAI, and they are usually related to algorithms or ML models. Although these keywords help us to search for relevant work in XAI, our goal was to verify whether the explanations of AI in those publications have a clear goal, rather than just presenting any explanation.

AI shows great results on problems that can be cast as classification problems, but AI systems lack the ability to explain their decisions in a way people can understand [21]. Most AI explainability research focuses on algorithmic explainability or transparency [1][7][30][34], aiming to make the algorithms more comprehensible. But this kind of explanation does not work for all people, purposes, or contexts. For those with expertise in ML, or maybe only with computer programming, this approach might be enough to build explanations, but not for people without that technical expertise, such as domain experts.

There is much less XAI research considering usability, practical interpretability, and efficacy on real users [12][34]. The mediation of professionals like designers and HCI practitioners seems even more critical for XAI design [28]. The presence and participation of designers in the early stages of ML models' development presents an interesting approach for XAI. Since designers are the professionals responsible for building the bridge between technology and users, they need to understand their working material; in the case of XAI, ML models are an essential part of this design material [17]. HCI offers many methods and approaches that are flexible enough to deal with different design scenarios. The co-design technique is being applied with domain experts [8][32] and also with ML experts or data scientists as users [13][27] to explore explainability functionalities. Explanation challenges are also being tackled in broader aspects that impact society, such as trust (e.g., [1][15][30]) and ethical and legal aspects [16].
visual analytics interactive system, named GAMUT, to support It is a challenge to combine ML expertise with domain data scientists with model interpretability. Similarly, to Hohman knowledge to tune ML models for a specific domain. Industries are et al. [13], the authors Di Castro and Bertini [11] explore the use housing their own AI experts and data scientists [33][35], which of visualization and model interpretability to promote model is an indicator of the importance of combining AI and domain verification and debugging methods using a visual analytics expertise. There are a set of new roles that AI technology system. Studies also highlight decision-making before the generates, and industries need to adapt and hire AI experts to keep developing process. One of the applications is to provide support their competitive edge. Some of those new roles created by AI are in the process of assertive choosing of the machine learning related to the ability to explain the AI technology in some matter model. In the work of Wang et al. [27], the authors offer a solution and considering some dimension [14]. One common characteristic named ATMSeer. Given the dataset, the solution automatically of all explanation skills is the contextualization of the AI tries different models and allows users to observe and analyze technology in the business, relate it to the domain. For that, the these models through interactive visualization. Lastly, concerning domain knowledge is the differentiator factor to make general AI ML experts, but with no visualization, Nguyen, Lease, and Wallace solutions tuned for a business needs in the industry. [4] present an approach to provide explanations regarding of Our research context is in the oil & gas industry. An essential annotator mistakes in Mechanical Turkey Tasks. part of this industry decision-making process relies on experts’ Concerning non-ML-experts, Kizilcec [30] presented a study prior knowledge and experiences from previous cases and on a MOOC platform. The authors show research on how projects. The seismic data is an important data source that experts transparency affects trust in a learning system. According to the interpret by searching for visual indicators of relevant geological authors [30], individuals whose expectations (on the grade) were characteristics in the seismic. It is a very time-consuming process. met, did not vary the trust by changing the "amount" of The application of ML on seismic data aims to augment experts’ transparency. Besides, individuals whose expectations were seismic interpretation abilities by processing large amounts of violated, trusted the system less, unless the grading algorithm was data and adding meaning to visual features in seismic. The ML transparent. Another context-aware example is the work of tool, in our case, aims to be a sandbox of ML models that can Smith-Renner, Rua, and Colony [2]. The authors present an handle seismic data in different ways for different tasks to enable explainable threat detection tool. Another work that supports seismic interpretation experts to have meaningful insights during decisions in high-risk, complex operating environments, such as their work. the military, is the work from Clewley et al. [25]. In this context, In this position paper, we discuss some findings about how ML such use improves the performance of trainees entering high-risk experts can express their concerns about AI explainability while operations [25]. developing an ML tool for supporting the seismic interpretation. 
2 Related Work
There are many research efforts regarding explainable artificial intelligence (XAI) in the literature. For this paper, we looked over previously published work from different venues (e.g., IUI, CHI, DIS, AAAI) and databases (e.g., Scopus, Web of Science, Google Scholar) to identify which people are considered in XAI research. Our review examines two types of people: 1) ML experts, who are people capable of building, training, and testing machine learning models with different datasets from different domains, and 2) non-ML-experts, who are people not skilled in ML concepts but who, in some dimension, use ML tools to perform tasks in different domains.

Considering ML experts, there is previous work on supporting the sense-making of the model and the data to enable explainability. These studies are often related to delivering explanations through images by showing the relevant pixels (e.g., [22,24]) or regions of pixels (e.g., [24]) from the classifier result. Other works, such as the one presented by Hohman et al. [13], use an interactive visual analytics system, named Gamut, to support data scientists with model interpretability. Similarly to Hohman et al. [13], Di Castro and Bertini [11] explore the use of visualization and model interpretability to promote model verification and debugging methods using a visual analytics system. Studies also highlight decision-making before the development process; one application is to support the assertive choice of a machine learning model. In the work of Wang et al. [27], the authors offer a solution named ATMSeer: given a dataset, the solution automatically tries different models and allows users to observe and analyze these models through interactive visualization. Lastly, concerning ML experts but with no visualization, Nguyen, Lease, and Wallace [4] present an approach to provide explanations regarding annotator mistakes in Mechanical Turk tasks.
Concerning non-ML-experts, Kizilcec [30] presented a study on a MOOC platform, showing how transparency affects trust in a learning system. According to [30], individuals whose expectations (about their grade) were met did not change their trust when the "amount" of transparency changed, while individuals whose expectations were violated trusted the system less, unless the grading algorithm was made transparent. Another context-aware example is the work of Smith-Renner, Rua, and Colony [2], who present an explainable threat detection tool. Another work that supports decisions in high-risk, complex operating environments, such as the military, is the work of Clewley et al. [25]; in this context, such use improves the performance of trainees entering high-risk operations [25].

Paudyal et al. [26], on the other hand, present a work in the context of Computer-Aided Language Learning, in which explanation is used to provide feedback on location, movement, and hand-shape to learners of American Sign Language. Lastly, in Escalante et al. [16], explanations happen in the area of human resources, in which decisions to evaluate candidates are routinely made by human resource departments. In ML, this task demands an explanation of the models as a means of identifying and understanding how they relate to the suggested decisions and of gaining insight into undesirable bias [16]. The authors [16] address this scenario by proposing a competition to reduce bias in this ML task.

Works that present explanations for non-experts with no specific context are not unusual. For example, Cheng et al. [15] present a visual analytics system to improve users' trust in and comprehension of the model. Another non-context work is from Rotsidis, Theodorou, and Wortham [1], in which the authors address explainability for human-robot interaction: the robot's decision process is exposed to the user in real time, through augmented reality, as a debugging functionality. The majority of the ML techniques and tools presented in the literature are designed to support expert users such as data scientists and ML practitioners [27], and visualization has been used widely to explain and inspect algorithms and models (e.g., [13,22,24,27]). However, the work of Kizilcec [30] shows the complexity of providing explanations or making the algorithm more transparent, especially to non-experts. This fact highlights that the transparency/explainability of models is not static; instead, it requires a deep understanding of the end-user and the context [32]. Besides, an intelligent system's acceptance and effectiveness depend on its ability to support decisions and actions interpretable by its users and by those affected by them [23]. Recent evidence [32] shows that misleading explanations promote conflicting reasoning. An explanation design should, therefore, offer cognitive value to the user and communicate the nature of an explanation relevant to their context [17][32].
On the other hand, some as an observer without any intervention or mediation and well-structured and complex contexts ask for more elaborate collected the data used to discuss in this paper. explanations techniques (e.g. [8]), i.e., intelligibility queries about the system state (e.g. [21]) or even inference mechanisms (see [8]) 3.1 Research Domain Context [9]. Other techniques include XAI elements, such as the feature that had a positive or negative influence on an outcome [9]. ExSS-ATEC'20, March 2020, Cagliari, Italy J. J. Ferreira et al. The ML tool of our case aims to aid seismic interpretation, facilitator (1) that facilitate the brainstorm session without which is a central process in the oil and gas exploration industry. influencing on the discussion content. This practice main goal is supporting decision-making processes As aforementioned, for this research, we have four (4) by reducing uncertainty. To achieve that goal, different people participants that already collaborated in a previous study [28]. work alone and engage in multiple informal interactions and Three (3) of them have more than seven years of experience with formal collaboration sessions, embedding biases, decisions, and ML development and research, and they have been working in the reputation. Seismic interpretation is the process of inferring the oil & gas industry for more than one year (1 of them for more than geology of a region at some depth from the processed seismic data four years). Those participants have been working with the survey 1 . Figure 1 shows an example of seismic data lines (or domain data in question (seismic data) for a while and have been slices), which is a portion of seismic data, with an interpretation exploring different aspects of it with ML technology about visual indicators of a possible geological structure called salt [5][6][10][31]. The other participants are also experienced ML diapir. developers or experts having at least three years of experience in the industry, plus academic experience. 3.3 Brainstorm sessions The data we collected for this paper analysis was produced during two brainstorm sessions for the development of a domain-specific ML tool. With participants' consent with the data collection before the sessions, and they were aware that it was going to be used for research publication. The ML tool under development is an asset from a larger project with industry clients; therefore, its development aims to Figure 1: Seismic image example (Netherland – Central support real domain practices. The brainstorm sessions were Graben – inline 474) organized by the ML tool’s development team from the laboratory. It was not scheduled to produce data for our study in particular In the same industry R&D laboratory, ML experts and but is presented as an enriching opportunity to investigate if and researchers are exploring the possibilities of combining ML for how ML Experts discuss AI explainability in while they are exploring seismic data. It is important to say that seismic data are building an ML tool. mainly examined visually. It commonly has other data to compose the seismic interpretation, but the domain expert analyzes, interprets the seismic imagens to identify significant geological characteristics. Therefore, there is research focusing on image analysis aspects rather than geophysical or geological discussions. [5][6]. Plus, there is research on exploring additional texture features that are prominent in other domains but have not received attention in the seismic domain yet. 
3 Our ML tool case
This paper's research grew out of the opportunity to observe and hear the discussions of a project development team regarding the features of an ML tool to be built. We observed and collected data from two brainstorm sessions where ML developers, ML researchers, and other stakeholders of the ML tool discussed features for that tool. The discussions had no orientation towards XAI aspects or any particular feature; they were proposed by the people involved in the project to build a better understanding of the ML tool's features.

The ML tool project is developed in an industry R&D laboratory, and the tool is already being used by oil & gas companies in research projects. We believe it is essential for the research to explain our setting. There was a previous study with some of the participants in the same laboratory, where they were invited to reflect on and discuss some ML development challenges, such as XAI [28]. One of the authors of this paper participated in that previous study as an HCI researcher and saw the discussions of the ML tool as an opportunity to reflect on and discuss ML challenges in a real project context. Therefore, she participated in the sessions as an observer, without any intervention or mediation, and collected the data discussed in this paper.

3.1 Research Domain Context
The ML tool of our case aims to aid seismic interpretation, which is a central process in the oil and gas exploration industry. The main goal of this practice is to support decision-making processes by reducing uncertainty. To achieve that goal, different people work alone and engage in multiple informal interactions and formal collaboration sessions, embedding biases, decisions, and reputation. Seismic interpretation is the process of inferring the geology of a region at some depth from processed seismic survey data¹. Figure 1 shows an example of a seismic line (or slice), which is a portion of seismic data, with an interpretation of visual indicators of a possible geological structure called a salt diapir.

¹ https://www.britannica.com/science/seismic-survey

Figure 1: Seismic image example (Netherlands – Central Graben – inline 474)

In the same industry R&D laboratory, ML experts and researchers are exploring the possibilities of combining ML with seismic data exploration. It is important to say that seismic data is mainly examined visually. Other data commonly compose the seismic interpretation, but the domain expert analyzes and interprets the seismic images to identify significant geological characteristics. Therefore, there is research focusing on image analysis aspects rather than on geophysical or geological discussions [5][6]. There is also research exploring texture features that are prominent in other domains but had not yet received attention in the seismic domain: namely, the ability of Gabor filters and LBP (Local Binary Patterns) – the latter widely used for face recognition – to retrieve similar regions of seismic data [6]. Still exploring the visual aspects of seismic data, there is research on generating synthetic seismic data from sketches [31] and on using ML to improve seismic image resolution [10].
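To make the texture-retrieval idea concrete, the minimal sketch below compares two image patches by their LBP histograms and probes a Gabor filter response with scikit-image. It is only an illustration of the kind of descriptor mentioned above, not the implementation used in [6]; the patch data is synthetic and the parameter values are arbitrary assumptions.

```python
# Illustrative sketch (not the implementation from [6]): comparing two patches
# by LBP texture histograms and probing a Gabor filter response.
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.filters import gabor

def lbp_histogram(patch, points=8, radius=1):
    """Normalized histogram of uniform LBP codes, a simple texture descriptor."""
    lbp = local_binary_pattern(patch, points, radius, method="uniform")
    n_bins = points + 2  # uniform patterns plus one "non-uniform" bin
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def texture_distance(patch_a, patch_b):
    """L1 distance between LBP histograms: smaller means more similar texture."""
    return np.abs(lbp_histogram(patch_a) - lbp_histogram(patch_b)).sum()

# Synthetic stand-ins for windows cut from a seismic slice.
rng = np.random.default_rng(seed=0)
patch_a = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
patch_b = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
print("LBP distance:", texture_distance(patch_a, patch_b))

# A Gabor filter response captures oriented, band-limited texture, such as the
# layered reflectors that dominate seismic images.
real, _ = gabor(patch_a, frequency=0.2, theta=0.0)
print("Mean Gabor response magnitude:", np.abs(real).mean())
```

In a retrieval setting such as the one described in [6], descriptors of this kind would presumably be computed for windows over the seismic data and ranked by distance to a query region; the exact descriptors and parameters used there differ from this toy sketch.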
3.2 About the ML professionals
In total, eleven (11) ML professionals participated in the ML tool discussions: seven (7) ML developers, who took part in the ML tool's discussions and were directly involved in its development; two (2) ML researchers, who were involved in the discussion about the ML tool but not directly involved in its development; one (1) domain expert, who is a member of the technical team (not an expert from the industry) but has a deep understanding of the domain data and of the domain practice with that data; and one (1) facilitator, who facilitated the brainstorm sessions without influencing the content of the discussion.

As aforementioned, four (4) of the participants had already collaborated in a previous study [28]. Three (3) of them have more than seven years of experience with ML development and research, and they have been working in the oil & gas industry for more than one year (one of them for more than four years). Those participants have been working with the domain data in question (seismic data) for a while and have been exploring different aspects of it with ML technology [5][6][10][31]. The other participants are also experienced ML developers or experts, with at least three years of experience in the industry plus academic experience.

3.3 Brainstorm sessions
The data we collected for this paper's analysis was produced during two brainstorm sessions for the development of a domain-specific ML tool. Participants consented to the data collection before the sessions and were aware that it was going to be used for a research publication.

The ML tool under development is an asset of a larger project with industry clients; therefore, its development aims to support real domain practices. The brainstorm sessions were organized by the ML tool's development team from the laboratory. They were not scheduled to produce data for our study in particular, but they presented an enriching opportunity to investigate if and how ML experts discuss AI explainability while they are building an ML tool.

Figure 2. Brainstorm Sessions plan

There were two brainstorm sessions organized to discuss the ML tool's features. The facilitator organized activities to support individual inputs and collaborative discussions (Figure 2). Between the sessions, there was a voting activity to prioritize the discussion for the second session. The sessions were performed in an online collaboration tool². The content of the collaboration tool was discarded as study data because one participant modified it without the facilitator's orientation; therefore, the study data consisted of the videos of the sessions. Some of the participants were not physically present, participating through a videoconferencing system and the online tool.

² https://mural.co/

4 Data Analysis
As aforementioned, we used the sessions' videos as our study data. We transcribed the audio from both videos (session 1: 2h and session 2: 1.5h, respectively) and tagged the quotes of every participant of the sessions. We wanted to identify the XAI aspects of the discourse and relate them to the participant who brought them to the discussion. We considered the data from both sessions as one dataset because we wanted to analyze the discourse of participants throughout the whole discussion about the ML tool's features.

For the data analysis, we used a qualitative approach, since we are still framing concerns about XAI in ML tools' development. Our goal was to identify the critical ideas that repeatedly arise during the ML professionals' discussion of an ML tool's features. We used the discourse analysis method, which considers written or spoken language in relation to its social context [18] (p. 221, [20]). We started by doing some content analysis (p. 301, [20]) to verify the frequency of terms, co-occurrences, and other structural markers. But since the topic of the discussion was broader – the ML tool's features – this did not provide relevant findings. Therefore, we changed to discourse analysis, which goes beyond looking at words and contents to examine the structure of the conversation, in search of cues that might provide further understanding (p. 221, [20]).
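As a concrete illustration of that first content-analysis pass, the sketch below counts term frequencies and within-quote co-occurrences over a handful of transcribed quotes. It is a minimal sketch under our own assumptions: the quotes and stopword list are invented placeholders, not material from the actual sessions or the tooling actually used in the analysis.

```python
# Minimal sketch of a content-analysis pass: term frequencies and within-quote
# co-occurrences. Quotes and stopwords are invented placeholders.
from collections import Counter
from itertools import combinations
import re

quotes = [
    "I want to see the trained model and visualize the metrics",
    "you need a grid there to compare the seismic data",
]
stopwords = {"the", "and", "to", "a", "i"}

def tokens(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]

term_freq = Counter()
cooccurrence = Counter()
for quote in quotes:
    words = tokens(quote)
    term_freq.update(words)
    # Count unordered pairs of distinct terms appearing in the same quote.
    cooccurrence.update(combinations(sorted(set(words)), 2))

print(term_freq.most_common(5))
print(cooccurrence.most_common(5))
```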
5 Discussion about AI Explainability
We started our data analysis trying to tag the participants' quotes with the codes "aid-XAI" or "harm-XAI" (aid or harm eXplainable AI). Then we noticed that such a categorization of the data was not possible without further feedback from the person who said the quote. Therefore, we decided to tag the quotes whose discourse contained features or concerns related to AI explainability. We selected a total of 25 quotes from approximately 3.5h of audio transcriptions. Considering that the brainstorm sessions had the broader goal of discussing the ML tool's features, we believe those quotes point to an exciting direction for our research to investigate "Do ML experts discuss explainability for AI systems?". The discussion did not have any intervention or bias towards explainability concerns, which allows us to see if and how AI explainability would be included in their development discussion.

From the 25 quotes, 13 came from the three ML professionals who have more experience with ML development and also experience working with the domain data (seismic data). We learned that professionals who combine ML and domain knowledge might be more capable of having an overall vision of how the AI system will impact the domain and its experts. The quotes indicate concerns about XAI without any mention of the specific topic. The theme was a genuine concern of those professionals, and it was present in their discourse while developing an AI system for geoscientists. In this position paper, we selected a few of those quotes to discuss the concerns ML developers express about AI explainability while thinking about features for an ML tool.

The discussion about the ML tool was sometimes conflicting about who was the user (or users) of that ML tool. In the quote below, one participant was considering two users: an ML expert and a data scientist. His discourse is aligned with previous research about ML models' interpretability [11][13] and about understanding the data that ML models handle [22,24]. The visualization of the trained model and the visualization of the data with its metrics could be a way to frame an XAI scenario for ML experts and data scientists. This kind of feature could be a pointer to further discussions on XAI:

[…] a visualization feature: "I'm a machine learning guy and I want to see the trained model"; "I'm the data guy and I want to see the data […] I want to correctly visualize the data […] how is this data spatially distributed […] visualize the metrics" […].

In the next quote, a participant comments on a new trend in oil & gas companies: training geoscientists on machine learning. This trend aims to combine the potential of ML tools to handle a lot of data with the domain expert's tacit knowledge and experience, tuning the model-data pair to get the best results with ML – not only quantitative results (the best ML model accuracy) but also qualitative results, when a domain expert with ML knowledge can make the best of model and data by understanding the meaning of the results. There are new roles of "explainers" in AI [34] that will make the technology fit the domain in which it is applied. By understanding the model and the domain data, they are equipped to define the necessary explanations in a domain:

[…] what happens in these companies now is that they are hiring geophysicists and giving a machine learning course, and I also think the same guy may be acting depending on the role he's playing at that time […].

The understanding of the algorithms and the ML workflows has been the focus of most XAI research [1][7][31][35]. Trails of what data went into which model and what the output result was can support decisions about how to fit model and data for a particular case. In the next quote, a participant raises a concern about the timeline and resolution of the seismic data. Those are parameters of the seismic data that could help to build a better ML tool. A comparison feature could be considered a way to explain what is available, what was in fact used by the ML tool, and why:

[…] you have to imagine that you have seismic data from 20 years ago, as usual, and you have a new seismic data that has a different resolution […] For you to be able to compare things, you need to have a grid there and start comparing things. All the information that goes in there needs to be useful […]

The participants were mostly ML developers; therefore, they are used to handling ML models and data, like one type of user considered for the ML tool under development. The quote below shows a participant proposing, for users like himself, a solution that he, as a user, considers good. This seems an interesting approach: to use existing tools that somehow explain the ML results and see if they work for other users. Combining this initial input with co-designing approaches [13][27], the investigation of what works as an explanation for each user could present promising research results:

[…] something like Jupyter does. You have a report that says, "For this data here I had this result," the views and the guy can follow more or less […]
6 Final Remarks and Future Work
In this position paper, we used the data collected from the brainstorm sessions of a real ML tool development project to discuss if and how ML experts express concerns about AI explainability while defining features of an ML tool to be developed. It was not a controlled study with users. We analyzed data from two brainstorm sessions held to discuss the functionalities of an ML tool to support geoscientists (domain experts) in analyzing seismic data (domain-specific data) with ML resources. It was serendipity that one of the authors became aware of the discussion and that all participants agreed that she could be present and collect the data for this research.

The collected data was tough to transcribe because the brainstorm sessions were used to structure all the participants' understanding of the ML tool, its users, and its features. Therefore, participants sometimes did not produce complete sentences, or the sentences were incomprehensible. As mentioned in the Data Analysis section of this paper, we started the data analysis with content analysis [20] but changed to discourse analysis [18]. While analyzing word frequency, however, we generated the word cloud presented in Figure 3. The most frequent word was "you", which participants used to present their ideas.

Figure 3. Word cloud from transcripts
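A word cloud like the one in Figure 3 could be produced from the transcripts with a few lines of Python; the hedged sketch below uses the open-source wordcloud package as an assumption, since we do not know the tooling actually used, and the transcript file name is a hypothetical placeholder.

```python
# Hedged sketch: a word cloud like Figure 3 could be produced with the
# open-source `wordcloud` package. The transcript file name is hypothetical.
from wordcloud import WordCloud, STOPWORDS

with open("session_transcripts.txt", encoding="utf-8") as f:
    text = f.read()

# Keep "you" in the vocabulary: dropping it as a stopword would hide the very
# observation discussed below (participants speaking in the second person).
stopwords = set(STOPWORDS) - {"you"}

cloud = WordCloud(width=800, height=400, background_color="white",
                  stopwords=stopwords).generate(text)
cloud.to_file("figure3_wordcloud.png")
print(cloud.words_.get("you"))  # relative weight of the most frequent word
```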
Considering that ML professionals were one of the potential users of the ML tool, it is interesting that the ML developers did not use the first person in their phrases, but the second person "you". An investigation path was to check with those ML professionals whether they thought of themselves as possible users of the ML tool and how that would affect the discussion about its features. Using design techniques, such as co-design [13][27], to explore those scenarios with ML professionals as users could open different discussion topics. Maybe concerns about explainability would appear more once developers are in the users' place.

In a previous study in the same R&D lab, mediation challenges were identified for the development of a deep learning model [28]. One exciting aspect of that earlier study was that once the ML professional considered his ML solution in a real context, new concerns about the impact on people and about explanations were identified. In the present study, the ML professionals have a real context where their ML tool will be applied, but we believe they are still very distant from the consequences the ML tool might have on the users' decision-making. In the study reported in [28], the context and its impacts were easier to relate (ML to support a hand-written voting process using the MNIST dataset). For the oil & gas domain, for example, the effect of a wrong decision cannot be so easily foreseen. This could be an approach for investigating the mediation challenges [28].

Explanations are social: they are a transfer of knowledge, presented as part of a conversation or interaction, and are thus shaped by the explainer's (explanation producer) beliefs about the explainee's (explanation consumer) beliefs [34]. XAI needs social mediation from technology builders to technology users and their practice [28]. We believe the explanation cannot be generic. The design of a "good" explanation needs to take into account who is receiving the explanation, what for, and in which context the explanation was requested.

This initial study opened paths to many exciting kinds of research, not only associated with XAI. For the XAI research, as future work, we intend to investigate AI explanations considering those three dimensions (who + why + context). The investigation of XAI considering those dimensions shows promising paths for designing AI systems for different scenarios. Industries are training their domain experts on ML tools, but what about building ML experts' capacity on the data and the domain practice before they build ML solutions? It might enable the ML expert to design the solution aware of how it will impact the domain and the people involved. Another promising research path is to address the XAI topic explicitly with ML professionals as part of the design material for developing AI systems. The mediation challenges identified by Brandão et al. [28] are an initial pointer for that XAI discussion with ML professionals. Following this first study, we plan to go back to the same group of participants and discuss AI explainability to verify what kinds of features and concerns are raised once we point to the specific topic.

REFERENCES
[1] Alexandros Rotsidis, Andreas Theodorou, and Robert H. Wortham. 2019. Robots That Make Sense: Transparent Intelligence Through Augmented Reality. In Intelligent User Interfaces for Algorithmic Transparency in Emerging Technologies (IUIATEC 2019).
[2] Alison Smith-Renner, Rob Rua, and Mike Colony. 2019. Towards an Explainable Threat Detection Tool. In Workshop on Explainable Smart Systems (ExSS 2019).
[3] Alison Smith-Renner, Rob Rua, and Mike Colony. 2019. Towards an Explainable Threat Detection Tool. In Workshop on Explainable Smart Systems (ExSS 2019).
[4] An T. Nguyen, Matthew Lease, and Byron C. Wallace. 2019. Explainable modeling of annotations in crowdsourcing. In Proceedings of the 24th International Conference on Intelligent User Interfaces (IUI '19). ACM, New York, NY, USA, 575-579. DOI: https://doi.org/10.1145/3301275.3302276
[5] Andrea Britto Mattos, Rodrigo S. Ferreira, Reinaldo M. Da Gama e Silva, Mateus Riva, and Emilio Vital Brazil. 2017. Assessing texture descriptors for seismic image retrieval. In 2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). IEEE, 292–299.
[6] Andrea Britto Mattos, Rodrigo S. Ferreira, Reinaldo M. Da Gama e Silva, Mateus Riva, and Emilio Vital Brazil. 2017. Assessing Texture Descriptors for Seismic Image Retrieval. In 2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). IEEE, 292–299.
[7] Ashraf Abdul, Jo Vermeulen, Danding Wang, Brian Y. Lim, and Mohan Kankanhalli. 2018. Trends and Trajectories for Explainable, Accountable and Intelligible Systems: An HCI Research Agenda. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18), 582:1–582:18. https://doi.org/10.1145/3173574.3174156
[8] Bum Chul Kwon, Min-Je Choi, Joanne Taery Kim, Edward Choi, Young Bin Kim, and Soonwook Kwon. 2018. RetainVis: Visual Analytics with Interpretable and Interactive Recurrent Neural Networks on Electronic Medical Records. IEEE Transactions on Visualization and Computer Graphics 25, 1: 299–309.
[9] Danding Wang, Qian Yang, Ashraf Abdul, and Brian Y. Lim. 2019. Designing Theory-Driven User-Centric Explainable AI. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, 601:1–601:15.
[10] Dario A. B. Oliveira, Rodrigo S. Ferreira, Reinaldo Silva, and Emilio Vital Brazil. 2019. Improving Seismic Data Resolution With Deep Generative Networks. IEEE Geoscience and Remote Sensing Letters 16, 12: 1929–1933.
[11] Federica Di Castro and Enrico Bertini. 2019. Surrogate decision tree visualization: interpreting and visualizing black-box classification models with surrogate decision tree. In Workshop on Explainable Smart Systems (ExSS 2019).
[12] Finale Doshi-Velez and Been Kim. 2017. Towards A Rigorous Science of Interpretable Machine Learning. arXiv:1702.08608 [cs, stat]. Retrieved December 18, 2019 from http://arxiv.org/abs/1702.08608
[13] Fred Hohman, Andrew Head, Rich Caruana, Robert DeLine, and Steven M. Drucker. 2019. Gamut: A Design Probe to Understand How Data Scientists Understand Machine Learning Models. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Paper 579, 13 pages. DOI: https://doi.org/10.1145/3290605.3300809
[14] H. James Wilson, Paul Daugherty, and Nicola Bianzino. 2017. The jobs that artificial intelligence will create. MIT Sloan Management Review 58, 4: 14.
[15] Hao-Fei Cheng, Ruotong Wang, Zheng Zhang, Fiona O'Connell, Terrance Gray, F. Maxwell Harper, and Haiyi Zhu. 2019. Explaining Decision-Making Algorithms through UI: Strategies to Help Non-Expert Stakeholders. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Paper 559, 12 pages. DOI: https://doi.org/10.1145/3290605.3300789
[16] Hugo Jair Escalante, Isabelle Guyon, Sergio Escalera, Julio Jacques, Meysam Madadi, Xavier Baró, Stephane Ayache, Evelyne Viegas, Yağmur Güçlütürk, Umut Güçlü, Marcel A. J. van Gerven, and Rob van Lier. 2017. Design of an explainable machine learning challenge for video interviews. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN '17), Anchorage, AK, 3688-3695. DOI: https://doi.org/10.1109/IJCNN.2017.7966320
[17] Jacob T. Browne. 2019. Wizard of Oz Prototyping for Machine Learning Experiences. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, LBW2621:1–LBW2621:6.
[18] James Paul Gee. 2004. An introduction to discourse analysis: Theory and method. Routledge.
[19] Jonathan Grudin. 2009. AI and HCI: Two fields divided by a common focus. AI Magazine 30, 4: 48–48.
[20] Jonathan Lazar, Jinjuan Heidi Feng, and Harry Hochheiser. 2017. Research methods in human-computer interaction. Morgan Kaufmann.
[21] Jordan Barria-Pineda and Peter Brusilovsky. 2019. Making Educational Recommendations Transparent through a Fine-Grained Open Learner Model. In IUI Workshops.
[22] Mandana Hamidi Haines, Zhongang Qi, Alan Fern, Fuxin Li, and Prasad Tadepalli. 2019. Interactive Naming for Explaining Deep Neural Networks: A Formative Study. In Workshop on Explainable Smart Systems (ExSS 2019).
[23] Michael Chromik, Malin Eiband, Sarah Theres Völkel, and Daniel Buschek. 2019. Dark Patterns of Explainability, Transparency, and User Control for Intelligent Systems. In Intelligent User Interfaces for Algorithmic Transparency in Emerging Technologies (IUIATEC 2019).
[24] Mukund Sundararajan, Jinhua Xu, Ankur Taly, Rory Sayres, and Amir Najmi. 2019. Exploring Principled Visualizations for Deep Network Attributions. In Workshop on Explainable Smart Systems (ExSS 2019).
[25] Natalie Clewley, Lorraine Dodd, Victoria Smy, Annamaria Witheridge, and Panos Louvieris. 2019. Eliciting Expert Knowledge to Inform Training Design. In Proceedings of the 31st European Conference on Cognitive Ergonomics (ECCE 2019), Maurice Mulvenna and Raymond Bond (Eds.). ACM, New York, NY, USA, 138-143. DOI: https://doi.org/10.1145/3335082.3335091
[26] Prajwal Paudyal, Junghyo Lee, Azamat Kamzin, Mohamad Soudki, Ayan Banerjee, and Sandeep Gupta. 2019. Learn2Sign: Explainable AI for Sign Language Learning. In Workshop on Explainable Smart Systems (ExSS 2019).
[27] Qianwen Wang, Yao Ming, Zhihua Jin, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, and Huamin Qu. 2019. ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Paper 681, 12 pages. DOI: https://doi.org/10.1145/3290605.3300911
[28] Rafael Brandão, Joel Carbonera, Clarisse de Souza, Juliana Ferreira, Bernardo Gonçalves, and Carla Leitão. 2019. Mediation Challenges and Socio-Technical Gaps for Explainable Deep Learning Applications. arXiv:1907.07178 [cs]. Retrieved December 2, 2019 from http://arxiv.org/abs/1907.07178
[29] Randy Goebel, Ajay Chander, Katharina Holzinger, Freddy Lecue, Zeynep Akata, Simone Stumpf, Peter Kieseberg, and Andreas Holzinger. 2018. Explainable AI: The New 42? In Machine Learning and Knowledge Extraction, Andreas Holzinger, Peter Kieseberg, A Min Tjoa, and Edgar Weippl (Eds.). Springer International Publishing, Cham, 295–303. https://doi.org/10.1007/978-3-319-99740-7_21
[30] René F. Kizilcec. 2016. How Much Information?: Effects of Transparency on Trust in an Algorithmic Interface. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16). ACM, New York, NY, USA, 2390-2395. DOI: https://doi.org/10.1145/2858036.2858402
[31] Rodrigo S. Ferreira, Julia Noce, Dario A. B. Oliveira, and Emilio Vital Brazil. 2019. Generating Sketch-Based Synthetic Seismic Images With Generative Adversarial Networks. IEEE Geoscience and Remote Sensing Letters.
[32] Simone Stumpf. 2019. Horses For Courses: Making The Case For Persuasive Engagement In Smart Systems. In Workshop on Explainable Smart Systems (ExSS 2019).
[33] Thomas H. Davenport and D. J. Patil. 2012. Data scientist. Harvard Business Review 90, 5: 70–76.
[34] Tim Miller. 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267: 1–38. https://doi.org/10.1016/j.artint.2018.07.007
[35] Yogesh K. Dwivedi, Laurie Hughes, Elvira Ismagilova, et al. 2019. Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. International Journal of Information Management: S026840121930917X.