Effects of Algorithmic Decision-Making and Interpretability on Human Behavior: Experiments using Crowdsourcing

Avishek Anand1, Kilian Bizer2, Alexander Erlei2, Ujwal Gadiraju1, Christian Heinze3, Lukas Meub2, Wolfgang Nejdl1, and Björn Steinrötter3

1 L3S Research Center, Leibniz Universität Hannover (lastname@L3S.de)
2 Chair of Economic Policy and SME Research, Georg-August-Universität Göttingen (lukas.meub@wiwi.uni-goettingen.de)
3 Institute of Legal Informatics, Leibniz Universität Hannover (lastname@iri.uni-hannover.de)

Copyright © 2018 for this paper by its authors. Copying permitted for private and academic purposes.

Abstract

Today algorithmic decision-making (ADM) is prevalent in several fields, including medicine, the criminal justice system and financial markets. On the one hand, this is testament to the ever-improving performance and capabilities of complex machine learning models. On the other hand, the increased complexity has resulted in a lack of transparency and interpretability, which has led to critical decision-making models being deployed as functional black boxes. There is a general consensus that being able to explain the actions of such systems will help to address legal issues like transparency (ex ante) and compliance requirements (interim) as well as liability (ex post). Moreover, it may build trust, expose biases and in turn lead to improved models. This has most recently led to research on extracting post-hoc explanations from black-box classifiers and sequence generators in tasks like image captioning, text classification and machine translation. However, no work yet has investigated the impact of model explanations on the nature of human decision-making. We undertake a large-scale study that uses crowdsourcing to measure how interpretability affects human decision-making, drawing on well-understood principles of behavioral economics. To our knowledge, this is the first inter-disciplinary study of its kind on interpretability in ADM models.

Introduction

In the context of machine learning, and more generally of algorithmic decision-making systems (ADMs), interpretability can be defined as "the ability to explain or to present in understandable terms to a human" (Doshi-Velez and Kim 2017). In spite of the application of ADMs in a breadth of domains, for the most part they are still used as black boxes that output a prediction, score or ranking without offering even a partial understanding of how different features influence the model prediction. In such cases, when an algorithm prioritizes information to predict, classify or rank, algorithmic transparency becomes an important feature for restricting discrimination and enhancing explainability-based trust in the system.

Why Interpretability?

Interpretability is often deemed critical to enable effective real-world deployment of intelligent systems, albeit in a highly context-dependent manner (Weller 2017). For a researcher or developer, high interpretability is crucial to understand how their system or model is working, with the aim of debugging or improving it. For an end user, it provides a sense of what the system is doing and why, enabling prediction of what it might do in unforeseen circumstances and building trust in the technology. Additionally, adequate interpretability provides an expert (perhaps a regulator) the ability to audit a prediction or decision trail in detail and verify whether legal and regulatory standards have been complied with. Such an audit might, for example, reveal explicit content returned for innocuous queries (posed by children), or expose biases that are hard to spot with quantitative measures.

Recent work has highlighted the opportunities for computer scientists to take the lead in designing algorithms and evaluation frameworks which avoid discrimination and enable explanation (Goodman and Flaxman 2016). Also, many regulatory policies now require, or will require, algorithmic transparency. Take for example the European Union's new General Data Protection Regulation (GDPR), which takes effect from 25 May 2018 onwards and restricts automated individual decision-making that significantly affects users (Art. 22 GDPR). The law intends to create a right to explanation, whereby a user can ask for an explanation of an algorithmic decision that was made about them (Art. 12, 13(2) lit. f, 14(2) lit. g GDPR).

But how is human decision-making affected when ADMs are accompanied by explanations? How does it affect the acceptability of ADMs? Does it increase trust in ADMs? We intend to initiate large-scale studies using crowdsourcing, grounded in behavioral economics, in order to understand how and whether human decision-making is impacted when ADMs are accompanied by explanations.

Why Behavioral Economics?

Various external factors shape the design and effects of algorithmic decision-making systems and ultimately define the adequate implementation of interpretability measures. Besides being constrained by the institutional and regulatory framework, an optimal design further anticipates behavioral aspects of human-agent interaction (Mosier and Skitka 2018). We argue that only an interdisciplinary approach allows these factors to be analyzed comprehensively. Behavioral economics offers such an integrative approach, one that could substantially advance prevailing discussions along several dimensions. Over the last decades, behavioral economists have developed progressively detailed and sophisticated models of human behavior. This process has yielded a rich set of meticulous experimental methods and inherently diverse theoretical models (Kagel and Roth 2016; Camerer, Loewenstein, and Rabin 2011). While these models of human behavior need to account for the progress in artificial intelligence (Camerer 2017; Marwala and Hurwitz 2017), they enable a sound analysis of ADM systems as they increasingly penetrate society. Specifically, we aim to examine how human behavior changes in human-agent environments and whether these changes have repercussions for economic outcomes.
For instance, we are interested in total productive activity, the frequency of economically relevant interactions, cooperation and coordination activity, and changes in overall as well as individual welfare. The use of pertinent economic models enables us to generalize empirical findings and subsequently derive inferences about effects on our outcomes of interest. Consequently, certain ADM design and regulatory choices can be evaluated along relevant societal dimensions using straightforward counterfactuals (Kleinberg et al. 2017). Our approach therefore promises evidence that supports the design of economic policy measures with consequences for constructing machine-learning systems (Athey 2017; 2018).

To arrive at a suitable research design integrating behavioral economic science, our work in progress focuses on the effects of interpretability in human-agent interaction. For instance, explicitly quantifying the economic value of interpretability and identifying its beneficiaries has implications for both the design of ADM systems and regulatory choices. We rely on ultimatum bargaining, a prominent workhorse in experimental economics, to derive novel insights with respect to the influence of ADM systems and interpretability on human behavior. Overall, we ask: Does the introduction of ADM systems influence human decision-making in a straightforward bargaining context? How do ADM systems adapt to these presumably new behavioral patterns? Beyond those rather general considerations, we specifically focus on interpretability to examine, e.g.: Does increased interpretability influence established behavioral concepts such as acceptance, reciprocity or fairness concerns? Does it increase the quantity of economically relevant interactions and subsequently affect overall welfare?

Interpretability of ML Models

Interpretability has long been studied in classical machine learning as a desirable property, obtained by choosing a model family that is interpretable by design, such as decision trees or falling rule lists. However, the success of neural networks (NNs) and other expressive yet complex ML models has only intensified the discussion on post-hoc interpretability, i.e., interpreting already-built models. Consequently, the interpretability of these complex models has been studied in various domains to better understand the decisions made by the model: image classification and captioning (Xu et al. 2015; Dabkowski and Gal 2017; Simonyan, Vedaldi, and Zisserman 2013), sequence-to-sequence modeling (Alvarez-Melis and Jaakkola 2017; Li et al. 2015), recommender systems (Chang, Harper, and Terveen 2016), etc.

Interpretable models can be categorized into two broad classes: model introspective and model agnostic. Model introspection refers to inherently interpretable models, such as decision trees, rules (Letham et al. 2015), additive models (Caruana et al. 2015) and attention-based networks (Xu et al. 2015). Instead of supporting models that are functionally black boxes, such as an arbitrary neural network or a random forest with thousands of trees, these approaches use models whose components can be meaningfully inspected directly, e.g., a path in a decision tree, a single rule, or the weight of a specific feature in a linear model.

Model-agnostic approaches, on the other hand, extract post-hoc explanations by treating the original model as a black box, either by learning from the outputs of the black-box model, by perturbing its inputs, or both (Ribeiro, Singh, and Guestrin 2016; Koh and Liang 2017). Model-agnostic interpretability comes in two flavors: local and global. Local interpretability refers to explanations that describe a single decision of the model. There are also other notions of interpretability; for a more comprehensive description of the approaches we point the reader to (Lipton 2016).
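To make the two classes concrete, here is a minimal sketch in Python (assuming scikit-learn and NumPy; the data, feature names and kernel width are hypothetical placeholders, not part of the original paper). It contrasts reading the weight of a specific feature directly from a glass-box logistic regression (model introspection) with fitting a locally weighted linear surrogate around a single prediction of a black-box random forest by perturbing the input, in the spirit of LIME (Ribeiro, Singh, and Guestrin 2016).

```python
# Minimal sketch: model-introspective vs. model-agnostic interpretability.
# Hypothetical data: each row is an ultimatum-game offer, the label says
# whether the responder accepted it. Feature names are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 2))                   # [offer_share, response_time]
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0.35).astype(int)
feature_names = ["offer_share", "response_time"]

# --- Model introspection: the model itself is interpretable. ---
glass_box = LogisticRegression().fit(X, y)
for name, w in zip(feature_names, glass_box.coef_[0]):
    print(f"weight of {name}: {w:+.2f}")               # read feature weights directly

# --- Model-agnostic, local: explain one decision of a black box by
# perturbing the input and fitting a proximity-weighted linear surrogate.
black_box = RandomForestClassifier(n_estimators=100).fit(X, y)
x0 = np.array([0.30, 0.50])                            # instance to explain
Z = x0 + rng.normal(scale=0.05, size=(1000, 2))        # perturbed neighbors of x0
pz = black_box.predict_proba(Z)[:, 1]                  # black-box acceptance scores
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.01)      # closer neighbors weigh more
surrogate = LinearRegression().fit(Z, pz, sample_weight=w)
for name, c in zip(feature_names, surrogate.coef_):
    print(f"local effect of {name}: {c:+.2f}")
```

The surrogate's coefficients serve as the explanation of the single decision at x0, which is exactly the "local" notion of model-agnostic interpretability described above.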
Interpretability and Human Decision-Making

Interpretability is no end in itself. The effects of interpretability remain ambiguous even if one learns about the effectiveness of interpretability measures from studies like (Garcia et al. 2009; Gacto, Alcala, and Herrera 2011). Rather, to resolve this ambiguity, one needs to ask how far variation in interpretability translates into variation in behavior.

For instance, additional explanations could foster a more trustful environment that motivates fruitful human-agent interactions. However, providing additional information might conversely erode trust by inviting more thorough scrutiny of agent recommendations. Consider an agent supporting a physician (expert) in diagnosing a patient's (consumer) MRI scan. The physician might generally trust the agent based on positive experience and common knowledge about its superiority, thus reaching higher accuracy in her diagnosis. In contrast, learning about unfamiliar features used by the agent might cause distrust and have the physician stick to her own assessment. This hypothesis stems from evidence gathered by observing human interaction (Keller and Staelin 1987; Grimmelikhuijsen et al. 2013; Cramer et al. 2008; Ditto et al. 1998). Hence, increased interpretability might diminish the efficiency of such economically vital consumer-expert interactions.

The consideration above illustrates only one distinct case with inherent ambiguity regarding the effects of introducing increased interpretability. Besides trust, one might think of concepts established in behavioral economics like acceptance, accountability or social preferences. Further, to obtain a more thorough understanding of increased interpretability, one needs to evaluate its effects not only on the end user, but also on regulators, developers and consumers. Such a comprehensive approach poses several challenges to the design of experiments and the respective modeling of human behavior. Our work in progress relies on ultimatum bargaining to derive novel insights with respect to the considerations outlined above.

Crowdsourcing Methodology

Over the last decade, microtask crowdsourcing platforms such as Amazon's Mechanical Turk (https://www.mturk.com/) and CrowdFlower (https://www.crowdflower.com) have been used to support or replicate findings from psychology and behavioral research, and also to run human-centered experiments on a large scale (Mason and Suri 2012; Crump, McDonnell, and Gureckis 2013; Chandler, Mueller, and Paolacci 2014; Gadiraju et al. 2017). Previous works have established that crowdsourcing platforms can be reliably leveraged to conduct large-scale behavioral experiments that are ecologically valid.
Ultimatum Bargaining Experiment

Ultimatum bargaining represents one of the most prominent games researched in experimental economics (Gueth, Schmittberger, and Schwarze 1982). Although it seems quite simple, understanding behavior in this framework remains complex even after decades of research (Gueth and Kocher 2014; van Damme et al. 2014). However, there is a rich literature that allows us to integrate our findings and evaluate their relevance. The literature on automated, though not artificially intelligent, agents from computer science and economics makes the ultimatum game an optimal workhorse to test our hypotheses. Our basic framework replicates the simplest design of the ultimatum game. A proposer X decides on the distribution of a pie of size p. X receives x and the responder Y receives y, where x, y ≥ 0 and x + y = p. In a sequential process, the responder Y learns about the proposal (x, y) and either accepts, δ(x, y) = 1, or rejects, δ(x, y) = 0. Payoffs are given by δ(x, y)x and δ(x, y)y, i.e., if the responder Y rejects, both earn nothing.

A straightforward solution of the game, based merely on monetary outcomes, implies that responder Y should accept all positive offers, which gives δ(x, y) = 1 for y > 0. (While this represents the weakly dominant strategy for Y, all distributions (x, y) can be established as equilibrium outcomes. For multiple equilibria, consider a threshold ȳ for acceptance by the responder Y, such that δ(x̃, ỹ) = 1 if ỹ ≥ ȳ and δ(x̃, ỹ) = 0 otherwise.) This is anticipated by the proposer X, who therefore offers the minimal positive amount. In consequence, X receives almost the whole pie p and Y receives little more than nothing. However, actual behavior observed in prior experiments shows that offers by proposers typically amount to 40 to 50% of the pie. This might, for example, reflect fairness concerns, or merely strategic thinking that avoids punishment by a responder who rejects offers perceived as unfair (Camerer 2003).
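For concreteness, the payoff structure above can be restated as a short sketch (Python; the threshold-rule responder merely illustrates the multiple-equilibria note in parentheses above and is not a behavioral claim):

```python
# Payoffs in the ultimatum game as defined above: the proposer X offers the
# split (x, y) of a pie of size p, the responder Y accepts (delta = 1) or
# rejects (delta = 0); rejection leaves both players with nothing.
def payoffs(x: float, y: float, delta: int) -> tuple[float, float]:
    assert x >= 0 and y >= 0 and delta in (0, 1)
    return delta * x, delta * y

# A responder following a simple threshold rule (cf. the multiple-equilibria
# note): accept any offer y at or above some acceptance threshold y_bar.
def threshold_responder(y: float, y_bar: float) -> int:
    return 1 if y >= y_bar else 0

p = 10.0
x, y = 6.0, 4.0                      # a 60/40 proposal, with x + y = p
delta = threshold_responder(y, y_bar=3.0)
print(payoffs(x, y, delta))          # -> (6.0, 4.0); on rejection: (0.0, 0.0)
```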
Experimental Setup

We will carry out a large-scale ultimatum bargaining experiment by recruiting workers from a crowdsourcing platform. Workers will play the roles of proposers and responders under the following between-subjects treatment conditions, designed to reveal the effects of automated decision-making and interpretability on human behavior. We will follow guidelines from previous works to ensure reliable participation of crowd workers (Gadiraju et al. 2015).

I: Human-Human Interactions. This condition follows the simplest design of the ultimatum game as described earlier, consisting of a proposer and a responder (roles that will be fulfilled by randomly paired workers recruited from the crowdsourcing platform). We will record the interactions between N unique (proposer, responder) pairs, i.e., the offers made by the proposers and whether they are accepted or rejected by the responders. Following this, the proposers and responders will independently complete certain personality-related questionnaires.

II: Human-Machine Interactions. Using the N human-human interactions and features engineered from condition I, we will train a machine learning model that can classify whether a bid from a proposer is likely to be accepted. In this condition, proposers will be given the opportunity to use the machine learning model as an algorithmic decision-making system that can aid them in making a proposal. The proposers will be allowed to probe the ADM system with candidate proposals, and the system will report the likelihood of each proposal being accepted (the probing interface is sketched at the end of this section). The proposers may probe the ADM system any number of times, but can only make a proposal to the responder once. The responders will be made aware of the fact that the proposers have an ADM system at their disposal to help them make a proposal. Once again we will record the interactions between N unique and distinct (proposer, responder) pairs. These interactions of the proposers with the ADM system, as well as with the responders, will provide us valuable insights into the effects of ADM on human behavior and into how trust manifests and fluctuates through such interactions.

III: Human-Machine Interactions with Proposers as Observers. This condition is similar to II, except that the proposer will not be allowed to probe the ADM system, but will only observe the proposals made by the system on her behalf. The responders will be informed that the offer being made comes from an ADM acting on behalf of the proposer.

IV: Human-Machine Interactions with Explanations. This condition is virtually identical to II, except that proposers in this case will be aided with explanations alongside likelihood estimates to enhance interpretability when they probe the ADM system. Note that we consider model-introspective variants of interpretability, where access to an already-built model is provided. This will allow us to understand the role of interpretability in shaping human behavior during interactions with ADM systems.

V, VI, VII: ADM Learned from Human-Machine Interactions. To analyze the impact of the type of interactions that the ADM is learned from, we will train a similar machine learning model using the interactions from condition II, which can aid a proposer in making an offer to the responder. This will allow us to investigate the impact of the type of interaction data (human-human versus human-machine) that the ADM is learned from on the ensuing observations of human behavior. Thus, conditions V, VI and VII are repetitions of II, III and IV, except for the interactions that the ADM is learned from.
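The following sketch illustrates what the probing interface of conditions II and IV could look like under our assumptions: a classifier trained on (here: synthetic stand-in) condition-I data reports an acceptance likelihood for any candidate offer, and the explanation treatment additionally exposes a model-introspective readout of the fitted weights. All names, the feature set and the data are hypothetical placeholders rather than the final experimental design.

```python
# Sketch of the ADM aid in conditions II (likelihood only) and IV (likelihood
# plus a model-introspective explanation). Data and features are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the condition-I log: the share of the pie offered to
# the responder, and whether the responder accepted the offer.
rng = np.random.default_rng(1)
offer_share = rng.uniform(0, 1, size=300).reshape(-1, 1)
accepted = (offer_share[:, 0] > rng.uniform(0.2, 0.5, size=300)).astype(int)

adm = LogisticRegression().fit(offer_share, accepted)

def probe(share: float) -> float:
    """Condition II: report the likelihood that an offer is accepted."""
    return float(adm.predict_proba([[share]])[0, 1])

def probe_with_explanation(share: float) -> tuple[float, str]:
    """Condition IV: likelihood plus a weight-based explanation."""
    w, b = adm.coef_[0, 0], adm.intercept_[0]
    note = f"acceptance odds rise with offer share (weight {w:+.2f}, intercept {b:+.2f})"
    return probe(share), note

# A proposer may probe any number of times before committing to one offer.
for s in (0.1, 0.3, 0.5):
    print(s, round(probe(s), 2))
print(probe_with_explanation(0.4))
```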
References

Alvarez-Melis, D., and Jaakkola, T. S. 2017. A causal framework for explaining the predictions of black-box sequence-to-sequence models. arXiv preprint arXiv:1707.01943.

Athey, S. 2017. Beyond prediction: Using big data for policy problems. Science 355:483–485.

Athey, S. 2018. The impact of machine learning on economics. Economics of Artificial Intelligence.

Camerer, C.; Loewenstein, G.; and Rabin, M. 2011. Advances in Behavioral Economics. Princeton, NJ: Princeton University Press.

Camerer, C. 2003. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton, NJ: Princeton University Press.

Camerer, C. 2017. Artificial intelligence and behavioral economics. Economics of Artificial Intelligence.

Caruana, R.; Lou, Y.; Gehrke, J.; Koch, P.; Sturm, M.; and Elhadad, N. 2015. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1721–1730. ACM.

Chandler, J.; Mueller, P.; and Paolacci, G. 2014. Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods 46(1):112–130.

Chang, S.; Harper, F. M.; and Terveen, L. G. 2016. Crowd-based personalized natural language explanations for recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, RecSys '16, 175–182. New York, NY, USA: ACM.

Cramer, H.; Evers, V.; Ramlal, S.; van Someren, M.; Rutledge, L.; Stash, N.; Aroyo, L.; and Wielinga, B. 2008. The effects of transparency on trust in and acceptance of a content-based art recommender. User Modeling and User-Adapted Interaction 18(5):455–496.

Crump, M. J.; McDonnell, J. V.; and Gureckis, T. M. 2013. Evaluating Amazon's Mechanical Turk as a tool for experimental behavioral research. PLoS ONE 8(3):e57410.

Dabkowski, P., and Gal, Y. 2017. Real time image saliency for black box classifiers. arXiv preprint arXiv:1705.07857.

Ditto, P.; Scepansky, J.; Munro, G.; Apanovitch, A. M.; and Lockhart, L. 1998. Motivated sensitivity to preference-inconsistent information. Journal of Personality and Social Psychology 75(1):53–69.

Doshi-Velez, F., and Kim, B. 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

Gacto, M.; Alcala, R.; and Herrera, F. 2011. Interpretability of linguistic fuzzy rule-based systems: An overview of interpretability measures. Information Sciences 181:4340–4360.

Gadiraju, U.; Kawase, R.; Dietze, S.; and Demartini, G. 2015. Understanding malicious behavior in crowdsourcing platforms: The case of online surveys. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 1631–1640. ACM.

Gadiraju, U.; Möller, S.; Nöllenburg, M.; Saupe, D.; Egger-Lampl, S.; Archambault, D.; and Fisher, B. 2017. Crowdsourcing versus the laboratory: Towards human-centered experiments using the crowd. In Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments. Springer. 6–26.

Garcia, S.; Fernandez, A.; Luengo, J.; and Herrera, F. 2009. A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Computing 13:959–977.

Goodman, B., and Flaxman, S. 2016. European Union regulations on algorithmic decision-making and a "right to explanation". arXiv preprint arXiv:1606.08813.

Grimmelikhuijsen, S.; Porumbescu, G.; Hong, B.; and Im, T. 2013. The effect of transparency on trust in government: A cross-national comparative experiment. Public Administration Review 73(4):575–586.

Gueth, W., and Kocher, M. 2014. More than thirty years of ultimatum bargaining experiments: Motives, variations, and a survey of the recent literature. Journal of Economic Behavior & Organization 108:396–409.

Gueth, W.; Schmittberger, R.; and Schwarze, B. 1982. An experimental analysis of ultimatum bargaining. Journal of Economic Behavior & Organization 3(4):367–388.

Kagel, J., and Roth, A. 2016. The Handbook of Experimental Economics, Volume 2. Princeton, NJ: Princeton University Press.

Keller, K., and Staelin, R. 1987. Effects of quality and quantity of information on decision effectiveness. Journal of Consumer Research 14:200–213.

Kleinberg, J.; Lakkaraju, H.; Leskovec, J.; Ludwig, J.; and Mullainathan, S. 2017. Human decisions and machine predictions. The Quarterly Journal of Economics 133(1):237–293.

Koh, P. W., and Liang, P. 2017. Understanding black-box predictions via influence functions. arXiv preprint arXiv:1703.04730.

Letham, B.; Rudin, C.; McCormick, T. H.; and Madigan, D. 2015. Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics 9(3):1350–1371.

Li, J.; Chen, X.; Hovy, E.; and Jurafsky, D. 2015. Visualizing and understanding neural models in NLP. arXiv preprint arXiv:1506.01066.

Lipton, Z. C. 2016. The mythos of model interpretability. ICML Workshop on Human Interpretability in Machine Learning.

Marwala, T., and Hurwitz, E. 2017. Artificial intelligence and economic theories. arXiv preprint arXiv:1703.0659.

Mason, W., and Suri, S. 2012. Conducting behavioral research on Amazon's Mechanical Turk. Behavior Research Methods 44(1):1–23.

Mosier, K., and Skitka, L. 2018. Human decision makers and automated decision aids: Made for each other? In Automation and Human Performance: Theory and Applications.

Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. ACM.

Simonyan, K.; Vedaldi, A.; and Zisserman, A. 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.

van Damme, E.; Binmore, K.; Roth, A.; Samuelson, L.; Winter, E.; Bolton, G.; Ockenfels, A.; Dufwenberg, M.; Kirchsteiger, G.; Gneezy, U.; Kocher, M.; Sutter, M.; Sanfey, A.; Kliemt, H.; Selten, R.; Nagel, R.; and Azar, O. 2014. How Werner Güth's ultimatum game shaped our understanding of social behavior. Journal of Economic Behavior & Organization 108:292–318.

Weller, A. 2017. Challenges for transparency. arXiv preprint arXiv:1708.01870.

Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhudinov, R.; Zemel, R.; and Bengio, Y. 2015. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning, 2048–2057.