Investigating Explanations that Target Training Data

Ariful Islam Anik and Andrea Bunt
Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada

Abstract
To promote transparency in black-box machine learning systems, different explanation approaches have been developed and discussed in the literature. However, training dataset information is rarely communicated in these explanations, despite the fundamental importance of training data to any system trained with machine learning techniques. In our work, we investigate explanations that focus on communicating training dataset information to end-users. In this position paper, we discuss our prototype explanations and highlight findings from our user studies. We also discuss open questions and interesting directions for future research.

Keywords
Explanations, Training Data, Machine Learning Systems, Transparency

Joint Proceedings of the ACM IUI 2021 Workshops, April 13-17, 2021, College Station, USA
EMAIL: aianik@cs.umanitoba.ca (A. 1); bunt@cs.umanitoba.ca (A. 2)
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

While machine learning (ML) and artificial intelligence (AI) are increasingly used in a range of automated systems, a lack of transparency in these black-box systems can be a barrier for end-users to interpret the systems' outcomes [28,32]. This lack of transparency can also negatively impact end-users' trust in and acceptance of the systems [13,36].

To increase system transparency, prior work has investigated a range of explanation approaches for machine learning systems [2,7,9,14,36,37]. These explanations provide users with information about the systems and their decisions, mostly by explaining the decision factors, the criteria, and the properties of the outcomes [2,7,9,14,36,37]. While evaluations of these approaches [4,7,9,16,23,35] have shown them to be valuable, previously studied explanations rarely communicate information about the training data or how the system was trained. Since machine learning algorithms rely on the underlying patterns and characteristics of the training data to decide on the outcomes, training data and training procedures can have a fundamental impact on the performance of machine learning systems [8]. For example, biased training data can lead to systematic discrimination by the systems [5,6,22].

Our work focuses on designing and studying data-centric explanations that provide end-users with information on the data used to train the system [1]. In this position paper, we first summarize how we designed and evaluated data-centric explanations that communicate information on the training data to end-users. We also discuss interesting and important future research directions that have arisen from our work.

2. Related Work

With the goal of increasing transparency in machine learning systems, prior work has investigated a range of explanation approaches that explain the outcomes and/or a system's rationale behind the outcomes. These explanations can be categorized into different groups based on the focus of the provided information. For example, input-influence explanations [4,14] describe the degree of influence of the inputs on the system output. In contrast, sensitivity-based explanations [4,36] describe how much the value of an input has to differ to change the output.
Other popular explanation approaches include demographic-based explanations [2,4], which describe aggregate statistics on the outcome classes for different demographic categories (e.g., gender, race), while case-based explanations [4,7] use example instances from the training data to explain the outcome. Prior work has also explored white-box explanations [9] that explain the internal workings of an algorithm, and visual explanations [25,39] that explain the outcomes or the model through a visual analytics interface. Most of these approaches focus on either the decision process or the factors in the decision process.

Prior work has also investigated the impact of different explanation approaches on end-users' perceptions of machine learning systems [4,7,9,16,23,35]. While increased transparency through explanations tends to universally increase users' acceptance of the systems [13,21,24], the impacts on trust have been mixed [9,13,23,26,30,33,34]. Prior work has also studied the impact of explanations on end-users' sense of fairness, finding that certain explanation styles impact fairness judgments more than others [4,16].

Given that training data is fundamental to the performance of machine learning systems, Gebru et al. advocated documenting important information about datasets (e.g., motivation, creation, composition, intended use, distribution) before releasing them, proposing a standard dataset documentation sheet for this purpose [17]. This documentation approach is receiving attention in the machine learning community [10,40] and in some organizations [3,31]. Our research focuses on investigating how such information could be communicated to end-users and how it might impact their perceptions of machine learning systems.

3. Data-centric Explanations

In this section, we present a high-level description of our approach to explanations that communicate the underlying training data, and we summarize our key evaluation results to date. A more detailed discussion of our work can be found in [1].

Our data-centric explanations focus on providing end-users with information on the training data used in machine learning systems. We leveraged Gebru et al.'s datasheets for datasets [17] as a starting point to design data-centric explanations, using an iterative process to transform this information into forms that were meaningful and understandable to end-users. Figure 1 provides an overview of one of our prototype data-centric explanations. Our iterative design and evaluation led us to include five categories of training data information (Figure 1: Left). Within each category, the prototype explains dataset information using a question-and-answer format (an example is given in Figure 1: A).

Figure 1: Overview of data-centric explanations as described in [1]. On the left, we can see the main screen with the five categories of information provided in the explanations. On the right (A), we see the expanded version of the collection category. (B), (C), (D), and (E) refer to the other categories (demographics, recommended usage, potential issues, and general information), respectively.
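To make the structure of such an explanation concrete, the sketch below shows one way the five categories and their question-and-answer entries could be represented. This is an illustrative sketch only, not the implementation of the prototype in [1]; the class names and the example questions and answers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QAEntry:
    """A single question-and-answer item shown within a category."""
    question: str
    answer: str

@dataclass
class DataCentricExplanation:
    """Training-data explanation organized into the five categories of Figure 1."""
    collection: List[QAEntry] = field(default_factory=list)
    demographics: List[QAEntry] = field(default_factory=list)
    recommended_usage: List[QAEntry] = field(default_factory=list)
    potential_issues: List[QAEntry] = field(default_factory=list)
    general_information: List[QAEntry] = field(default_factory=list)

# Hypothetical content for a loan-approval scenario (not taken from our study).
explanation = DataCentricExplanation(
    collection=[QAEntry("How was the data collected?",
                        "From loan applications filed between 2010 and 2018.")],
    demographics=[QAEntry("Who is represented in the data?",
                          "52% women and 48% men, aged 21-75.")],
    potential_issues=[QAEntry("Are there known gaps or errors?",
                              "Applicants under 21 are not represented.")],
)

for entry in explanation.demographics:
    print(f"{entry.question} {entry.answer}")
```

Populating such a structure directly from a completed datasheet [17] is one way the documentation effort of dataset creators could be reused for end-user explanations.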
We evaluated our prototype explanations in a mixed-method user study with 27 participants to assess their potential to impact end-users' perceptions of machine learning systems. Our evaluation used a scenario-based approach, where we presented participants with a set of scenarios describing potential real-world systems along with accompanying explanations. The scenarios varied in the perceived stakes of the systems (high stakes vs. low stakes) and in the characteristics of the training data revealed in the accompanying explanations (balanced training data vs. training data with red flags). Our study also included a semi-structured interview session with each participant, where we probed issues surrounding trust, fairness, and the characteristics of the system scenarios and training data.

We found in our evaluation that the data-centric explanations impacted participants' perceived level of trust in, and sense of fairness of, the machine learning systems. Participants had more trust in the system and thought the system was fair when the explanations revealed a balanced training dataset with no errors, compared to when the explanations pointed out issues in the training data. Our study also provided qualitative insights into the value end-users see in having training-data information available. For example, participants liked having access to the demographics information, as they felt it helped them identify biases. We also noticed initial indications of participant expertise affecting attitudes towards the explanations. Machine learning experts expected other users to have difficulty understanding the explanations; however, we did not see such concerns expressed by participants with less prior knowledge of machine learning. In fact, almost all participants reported that the explanations were easy to understand and expressed interest in having them available.
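As an illustration of how the demographics category can surface potential red flags, the sketch below computes the kind of aggregate composition statistics a data-centric explanation could display, flagging groups whose share of the training data falls below a chosen threshold. The records, attribute name, and the 20% threshold are hypothetical; this is not the analysis or prototype code from our study.

```python
from collections import Counter

def demographic_summary(records, attribute, flag_below=0.2):
    """Summarize each group's share of the training data for one demographic
    attribute, and flag groups whose share falls below `flag_below`."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    summary = {}
    for group, n in counts.items():
        share = n / total
        summary[group] = {"share": round(share, 2), "red_flag": share < flag_below}
    return summary

# Hypothetical training records for a hiring scenario.
records = [{"gender": "woman"}] * 1 + [{"gender": "man"}] * 9
print(demographic_summary(records, "gender"))
# {'woman': {'share': 0.1, 'red_flag': True}, 'man': {'share': 0.9, 'red_flag': False}}
```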
4. Opportunities and Challenges with Data-centric Explanations

Our initial evaluation of the data-centric explanation prototypes suggested that end-users are capable of and interested in understanding information about training datasets. Our results also point to interesting future research directions that we discuss in this section.

While our study findings suggest that participants positively receive data-centric explanations, some participants also wanted additional information about the systems and the decision factors, particularly to judge fairness. A significant body of research has investigated explanations that focus on the factors of a decision and the decision process (i.e., process-centric information) [9,14,25,36,39]. Since each explanation approach has its own benefits, it would be interesting to explore ways to combine explanations of training data with process-centric explanations. Doing so would also allow us to investigate how end-users might prioritize the different types of explanations, as well as how the different approaches might complement each other.

We also see opportunities for the community to study and discuss different evaluation methods. For example, a common method for evaluating explanations of machine learning systems is to use fictional system scenarios (which we also used in our study with data-centric explanations) [4,19,29,38,41]. A downside of this method is that it requires participants to role-play rather than experience the systems directly, which in turn limits the ecological validity of the study findings. There are a number of challenges with moving towards evaluations with real-life systems. For example, before we can evaluate our explanations in a real setting, we need more documented datasets available for real-world systems, and we need more machine learning specialists to buy into the idea of data-centric explanations and be open to incorporating them in real-life systems.

One of the goals of explanations, in general, is to help ensure fairness in machine learning systems by revealing more details about the systems and their decision processes. However, measuring users' perceptions of fairness is a challenging task. While a common approach is to adapt and use prior scales proposed for organizational justice [4,12,16] (which we also used in our study), these scales do not necessarily capture the fact that fairness is multi-dimensional and context-dependent [18,19]. A first necessary step in developing more robust study instruments is to develop a common definition of "fairness". There is existing work in this direction that we can build upon [11,20].
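As one concrete illustration of the formal fairness notions that this line of work defines (e.g., equality of opportunity [20]), the sketch below computes two common group-fairness gaps from a set of predictions. It is a simplified, hypothetical example for illustration, not a validated study instrument and not code from our work.

```python
def fairness_gaps(examples):
    """Compute two group-fairness gaps between groups 'a' and 'b'.

    Each example is a (group, true_label, predicted_label) tuple with binary labels.
    Demographic parity gap: difference in positive prediction rates.
    Equal opportunity gap: difference in true positive rates, in the spirit of [20].
    """
    def rate(values):
        return sum(values) / len(values) if values else 0.0

    pos_rate = {g: rate([p for grp, _, p in examples if grp == g]) for g in ("a", "b")}
    tpr = {g: rate([p for grp, y, p in examples if grp == g and y == 1]) for g in ("a", "b")}
    return {
        "demographic_parity_gap": abs(pos_rate["a"] - pos_rate["b"]),
        "equal_opportunity_gap": abs(tpr["a"] - tpr["b"]),
    }

# Hypothetical predictions: (group, true label, predicted label)
examples = [
    ("a", 1, 1), ("a", 1, 1), ("a", 0, 1), ("a", 0, 0),
    ("b", 1, 1), ("b", 1, 0), ("b", 0, 0), ("b", 0, 0),
]
print(fairness_gaps(examples))
# {'demographic_parity_gap': 0.5, 'equal_opportunity_gap': 0.5}
```

Such metrics complement, but do not replace, measures of users' perceived fairness, which is the focus of the scales discussed above.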
A second key evaluation challenge is having objective measures to complement the commonly collected questionnaire data (e.g., self-reported Likert scale values [4,7,9,16,19,29]). Developing such measures, particularly ones that can be feasibly collected, is an important area of future work.

Finally, we are interested in how explanations such as ours might influence the perceptions of stakeholders other than potential end-users, who are often the target pool in evaluations [4,7,16,23,35]. For example, for explanations of training data, one interesting audience could be companies and organizations that want to purchase machine learning systems, to see whether data-centric explanations might impact their purchasing decisions. Another potential audience for data-centric explanations is journalists, who play an important role in reporting on black-box systems and communicating them to the general public [15]. We know from prior work that journalists have criticized machine learning systems for their black-box nature [27].

5. Summary

Explaining the training data of machine learning systems has the potential to provide a range of benefits to end-users and other stakeholders in terms of increased transparency of the systems. Our study with data-centric explanations found some evidence that such explanations can impact people's trust in and fairness judgments of machine learning systems. We discussed some important directions for future work, which we hope will encourage discussion with researchers working on a variety of explanation styles and approaches.

6. References

[1] Ariful Islam Anik and Andrea Bunt. 2021. Data-Centric Explanations: Explaining Training Data of Machine Learning Systems to Promote Transparency. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (to appear).
[2] Liliana Ardissono, Anna Goy, Giovanna Petrone, Marino Segnan, and Pietro Torasso. 2003. Intrigue: Personalized Recommendation of Tourist Attractions. Applied Artificial Intelligence: Special Issue on Artificial Intelligence for Cultural Heritage and Digital Libraries 17, 8–9: 687–714.
[3] M. Arnold, D. Piorkowski, D. Reimer, J. Richards, J. Tsay, K. R. Varshney, R. K. E. Bellamy, M. Hind, S. Houde, S. Mehta, A. Mojsilovic, R. Nair, K. Natesan Ramamurthy, and A. Olteanu. 2019. FactSheets: Increasing trust in AI services through supplier's declarations of conformity. IBM Journal of Research and Development 63, 4–5. https://doi.org/10.1147/JRD.2019.2942288
[4] Reuben Binns, Max Van Kleek, Michael Veale, Ulrik Lyngs, Jun Zhao, and Nigel Shadbolt. 2018. "It's reducing a human being to a percentage": Perceptions of justice in algorithmic decisions. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3173574.3173951
[5] Tolga Bolukbasi, Kai Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems, 4356–4364.
[6] Joy Buolamwini and Timnit Gebru. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research), 77–91. Retrieved from http://proceedings.mlr.press/v81/buolamwini18a.html
[7] Carrie J. Cai, Jonas Jongejan, and Jess Holbrook. 2019. The effects of example-based explanations in a machine learning interface. In Proceedings of the 24th International Conference on Intelligent User Interfaces, 258–262. https://doi.org/10.1145/3301275.3302289
[8] Toon Calders and Indrė Žliobaitė. 2013. Why unbiased computational processes can lead to discriminative decision procedures. Studies in Applied Philosophy, Epistemology and Rational Ethics 3: 43–57. https://doi.org/10.1007/978-3-642-30487-3_3
[9] Hao Fei Cheng, Ruotong Wang, Zheng Zhang, Fiona O'Connell, Terrance Gray, F. Maxwell Harper, and Haiyi Zhu. 2019. Explaining decision-making algorithms through UI: Strategies to help non-expert stakeholders. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–12. https://doi.org/10.1145/3290605.3300789
[10] Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen Tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. 2020. QuAC: Question answering in context. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018, 2174–2184. https://doi.org/10.18653/v1/d18-1241
[11] Alexandra Chouldechova and Aaron Roth. 2018. The Frontiers of Fairness in Machine Learning. 1–13. Retrieved from http://arxiv.org/abs/1810.08810
[12] Jason A. Colquitt and Jessica B. Rodell. 2015. Measuring justice and fairness. In The Oxford Handbook of Justice in the Workplace. Oxford University Press, New York, NY, US, 187–202. https://doi.org/10.1093/oxfordhb/9780199981410.013.8
[13] Henriette Cramer, Vanessa Evers, Satyan Ramlal, Maarten Van Someren, Lloyd Rutledge, Natalia Stash, Lora Aroyo, and Bob Wielinga. 2008. The effects of transparency on trust in and acceptance of a content-based art recommender. User Modeling and User-Adapted Interaction, 455–496. https://doi.org/10.1007/s11257-008-9051-3
[14] Anupam Datta, Shayak Sen, and Yair Zick. 2017. Algorithmic Transparency via Quantitative Input Influence. Transparent Data Mining for Big and Small Data: 71–94. https://doi.org/10.1007/978-3-319-54024-5_4
[15] Nicholas Diakopoulos. 2015. Algorithmic Accountability: Journalistic investigation of computational power structures. Digital Journalism 3, 3: 398–415. https://doi.org/10.1080/21670811.2014.976411
[16] Jonathan Dodge, Q. Vera Liao, Yunfeng Zhang, Rachel K. E. Bellamy, and Casey Dugan. 2019. Explaining models: An empirical study of how explanations impact fairness judgment. In Proceedings of the 24th International Conference on Intelligent User Interfaces, 275–285. https://doi.org/10.1145/3301275.3302310
[17] Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2018. Datasheets for datasets. In 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning, 1–27. Retrieved from http://arxiv.org/abs/1803.09010
[18] Ben Green and Lily Hu. 2018. The Myth in the Methodology: Towards a Recontextualization of Fairness in Machine Learning. In Proceedings of the Machine Learning: The Debates Workshop.
[19] Nina Grgic-Hlaca, Elissa M. Redmiles, Krishna P. Gummadi, and Adrian Weller. 2018. Human perceptions of fairness in algorithmic decision making: A case study of criminal risk prediction. In The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018, 903–912. https://doi.org/10.1145/3178876.3186138
[20] Moritz Hardt, Eric Price, and Nathan Srebro. 2016. Equality of opportunity in supervised learning. Advances in Neural Information Processing Systems: 3323–3331.
[21] J. L. Herlocker, J. A. Konstan, and J. Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the ACM Conference on Computer Supported Cooperative Work, 241–250. https://doi.org/10.1145/358916.358995
[22] Lauren Kirchner, Surya Mattu, Jeff Larson, and Julia Angwin. 2016. Machine Bias. ProPublica 23: 1–26. Retrieved from https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
[23] Rene F. Kizilcec. 2016. How much information? Effects of transparency on trust in an algorithmic interface. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2390–2395. https://doi.org/10.1145/2858036.2858402
[24] Rafal Kocielnik, Saleema Amershi, and Paul N. Bennett. 2019. Will you accept an imperfect AI? Exploring Designs for Adjusting End-user Expectations of AI Systems. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3290605.3300641
[25] Josua Krause, Adam Perer, and Kenney Ng. 2016. Interacting with predictions: Visual inspection of black-box machine learning models. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 5686–5697. https://doi.org/10.1145/2858036.2858529
[26] Todd Kulesza, Simone Stumpf, Margaret Burnett, and Irwin Kwan. 2012. Tell me more? The effects of mental model soundness on personalizing an intelligent agent. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1–10. https://doi.org/10.1145/2207676.2207678
[27] Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin. 2020. How We Analyzed the COMPAS Recidivism Algorithm. ProPublica. Retrieved from https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
[28] Bruno Lepri, Nuria Oliver, Emmanuel Letouzé, Alex Pentland, and Patrick Vinck. 2018. Fair, Transparent, and Accountable Algorithmic Decision-making Processes. Philosophy & Technology 31, 4: 611–627. https://doi.org/10.1007/s13347-017-0279-x
[29] Brian Y. Lim and Anind K. Dey. 2009. Assessing demand for intelligibility in context-aware applications. In UbiComp 2009: Ubiquitous Computing, 195. https://doi.org/10.1145/1620545.1620576
[30] Brian Y. Lim, Anind K. Dey, and Daniel Avrahami. 2009. Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the 27th International Conference on Human Factors in Computing Systems - CHI '09, 2119. https://doi.org/10.1145/1518701.1519023
[31] Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model cards for model reporting. In FAT* 2019 - Proceedings of the 2019 Conference on Fairness, Accountability, and Transparency, 220–229. https://doi.org/10.1145/3287560.3287596
[32] Frank Pasquale. 2015. The Black Box Society. Harvard University Press. https://doi.org/10.4159/harvard.9780674736061
[33] Forough Poursabzi-Sangdeh, Daniel G. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, and Hanna Wallach. 2018. Manipulating and Measuring Model Interpretability. Retrieved from http://arxiv.org/abs/1802.07810
[34] Pearl Pu and Li Chen. 2006. Trust building with explanation interfaces. In Proceedings of the 11th International Conference on Intelligent User Interfaces, 93–100. https://doi.org/10.1145/1111449.1111475
[35] Emilee Rader, Kelley Cotter, and Janghee Cho. 2018. Explanations as Mechanisms for Supporting Algorithmic Transparency. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems - CHI '18, 1–13. https://doi.org/10.1145/3173574.3173677
[36] Marco Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 97–101. https://doi.org/10.18653/v1/n16-3020
[37] Wojciech Samek, Alexander Binder, Grégoire Montavon, Sebastian Lapuschkin, and Klaus Robert Müller. 2017. Evaluating the visualization of what a deep neural network has learned. IEEE Transactions on Neural Networks and Learning Systems 28, 11: 2660–2673. https://doi.org/10.1109/TNNLS.2016.2599820
[38] Megha Srivastava, Hoda Heidari, and Andreas Krause. 2019. Mathematical notions vs. human perception of fairness: A descriptive approach to fairness for machine learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2459–2468. https://doi.org/10.1145/3292500.3330664
[39] Paolo Tamagnini, Josua Krause, Aritra Dasgupta, and Enrico Bertini. 2017. Interpreting black-box classifiers using instance-level visual explanations. In Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, HILDA 2017, 1–6. https://doi.org/10.1145/3077257.3077260
[40] Semih Yagcioglu, Aykut Erdem, Erkut Erdem, and Nazli Ikizler-Cinbis. 2020. RecipeQA: A challenge dataset for multimodal comprehension of cooking recipes. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018, 1358–1368. https://doi.org/10.18653/v1/d18-1166
[41] Yunfeng Zhang, Q. Vera Liao, and Rachel K. E. Bellamy. 2020. Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 295–305. https://doi.org/10.1145/3351095.3372852