Explanation to Avert Surprise

Melinda Gervasio, Karen Myers, Eric Yeh, Boone Adkins
SRI International
333 Ravenswood Avenue, Menlo Park, California 94025, USA
{firstname.lastname}@sri.com

ABSTRACT
Most explanation schemes are reactive and informational: explanations are provided in response to specific user queries and focus on making the system's reasoning more transparent. In mixed autonomy settings that involve teams of humans and autonomous agents, proactive explanation that anticipates and preempts potential surprises can be particularly valuable. By providing timely, succinct, and context-sensitive explanations, autonomous agents can avoid perceived faulty behavior and the consequent erosion of trust, enabling more fluid collaboration. We present an explanation framework based on the notion of explanation drivers—i.e., the intent or purpose behind agent explanations. We focus on explanations meant to reconcile expectation violations and enumerate a set of triggers for proactive explanation. Most work on explainable AI focuses on intelligibility; investigating explanation in mixed autonomy settings helps illuminate other important explainability issues such as purpose, timing, and impact.

Author Keywords
Explainable autonomy; explainable AI; human-machine teams; collaborative AI; intelligibility

ACM Classification Keywords
H.5.m. Information interfaces and presentation: Miscellaneous; I.2.m. Artificial intelligence: Miscellaneous

INTRODUCTION
Humans judge mistakes by computer systems more harshly than mistakes by other humans, with errors having a disproportionately large impact on perceived reliability [1,2]. This negative impact on trust has particularly significant repercussions for human-machine teams, where humans' trust in the autonomous agents directly affects how well they utilize the agents. The effect is particularly unfortunate when the human perceives an agent to be misbehaving when in fact it is behaving appropriately but in response to conditions unknown to the user.

We propose that a primary motivation for explanation should be surprise. When an agent violates expectations—typically, not in a good way—a human collaborator will invariably want to know the reason why. Reacting to the human's surprise and explaining away the violation is a valid approach, but it would be even more effective if the agent could anticipate the surprise and proactively explain what it is about to do. This averts a potentially unpleasant surprise that distracts the user and erodes trust.

Our foray into explainable autonomy began a few years ago, when we were developing autonomous agents for a project on team autonomy for uncertain, dynamic, adversarial environments in mixed human-machine settings. As we observed the agents in action, we would sometimes see puzzling behavior—for example, an agent might suddenly change course away from its intended destination. Our first thought would almost invariably be that there was a problem with the agent but, upon further inspection, we would realize that the agent had good reason for its action. For example, it might be reacting to an unexpected event or diverting to a higher-priority task. A straightforward UI showing the agents' current plans helped somewhat to alleviate this problem, but this was a solution targeted at the autonomous agents' designers, not at the end users who would be teaming with these automated agents in the future.

The need for intelligent systems that could explain themselves was recognized early on with expert systems [8], with the desire of both system developers and end users to better understand the reasoning behind a computational system's conclusions to determine whether it could be trusted. More recently, the dominant work in explanation has been on explaining the decisions of learned classifiers, particularly in the context of interactive learning [6,9], recommender systems [3,10], and deep learning [5,11].
Explanation for autonomy differs in a number of ways. The decision to be explained is typically part of a larger, orchestrated sequence of actions to achieve some long-term goal. Decisions occur at different levels of granularity—from overarching policies and strategic decisions down to individual actions. Explanation is required for various reasons under different circumstances: before execution to explain planning decisions, during execution to explain deviations from planned or expected behavior, and afterwards to review agent actions.

In the collaborative human-machine team settings that we are primarily interested in, whether humans serve as supervisors or as teammates, explanation during execution presents the additional challenge of limited cognitive resources. With the human already engaged in a cognitively demanding task, system explanations must be succinct, timely, and context-sensitive. In particular, when a human asks, "Why are you doing that?" it will often be because the agent has done something unexpected and the agent's explanation must address that.

EXPLANATION DRIVERS
We have developed an explanation framework based on the concept of explanation drivers: the intent or purpose behind an agent's explanation. We distinguish between three classes of drivers: Inform, Reconcile, and Prime.

Explanations to Inform are what most people typically think of as explanations. They provide straightforward answers to basic wh-questions—for example, "What is your goal?" or "How do you plan to achieve that goal?" or "Where are you going?" In the mixed-team setting, Inform explanations are particularly useful early on, when the human is still trying to get an overall sense of an (unknown) agent's decision-making. However, even after some level of trust has been established, Inform explanations often still remain useful for maintaining that trust.

Explanations that Reconcile address expectation violations. They answer questions borne of surprise—e.g., "What are you doing!" or "Why aren't you doing X?" or "Why did you do Y [instead of Z]?" Reconcile explanations are most effective when presented before the consequences of the decision are apparent, to prevent the surprise in the first place. For example, a warning from a firefighting drone that it will be diverting to help extinguish a fire that is growing faster than expected avoids surprising the user and possibly causing concern. It also gives the user the opportunity to change the plan—for example, to send the drone to its original target and to co-opt a different one to help instead.

Finally, there are explanations that Prime the user for assistance. Just as in human teams, communication and coordination are critical in mixed teams. In human-supervised settings, an important part of this collaboration involves agents recognizing when they need help and providing humans with the information they need to provide appropriate guidance. Beyond simply asking for help, Prime explanations inform humans of why help is needed, enabling them to provide appropriate assistance. For example, if the agent has low confidence in its best action, it can let the human supervisor confirm or override.
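To make the taxonomy concrete, the following is a minimal sketch, in Python, of how explanations tagged with these drivers might be represented. The Driver and Explanation names and fields are illustrative assumptions rather than structures from the systems described in this paper.

from dataclasses import dataclass
from enum import Enum, auto

class Driver(Enum):
    INFORM = auto()     # answers to basic wh-questions
    RECONCILE = auto()  # addresses an (anticipated) expectation violation
    PRIME = auto()      # prepares the human to provide assistance

@dataclass
class Explanation:
    driver: Driver
    content: str
    awaits_response: bool = False  # e.g., a Prime explanation asking for confirmation

# Examples mirroring those in the text:
examples = [
    Explanation(Driver.INFORM, "My current goal is to extinguish Fire 47."),
    Explanation(Driver.RECONCILE,
                "Diverting to help extinguish a fire that is growing faster than expected."),
    Explanation(Driver.PRIME,
                "Low confidence in best action; please confirm or override.",
                awaits_response=True),
]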
Here we focus on Reconcile explanations—in particular, on proactive explanations designed to avoid unpleasant surprises for human collaborators. This decision to focus on proactive explanations was partially validated by the results of a small four-person user study we conducted in mid-2017. The study was set in a fictional domain of drone firefighting and rescue, and participants were given the task of understanding what the drones were doing, with the knowledge that the world was dynamic (e.g., fires could start and die out on their own) and that all information was uncertain (e.g., groups to be rescued could appear and disappear, fires could be larger or smaller than expected). Participants were presented with snapshots of an evolving scenario. In the baseline condition, they were provided with basic information about current drone assignments and the status of all known fires and groups, and they could ask basic questions about the drones' behavior. In the proactive condition, they were also given preemptive explanations (as textual pop-ups) of certain drone decisions.

The participants all found the proactive explanations to be useful. As one participant put it, "[Proactive explanations were] very helpful, particularly anything that was counterintuitive or represented a big change." Based on the questions participants asked, we observed that everyone wanted to know the big picture, both in terms of the overall plan and the agents' overall priorities. In addition, the participants expected the drones to address all the targets—fires extinguished and groups rescued—with a strong preference for saving people over extinguishing fires.

TRIGGERS FOR PROACTIVE EXPLANATION
Most explanation schemes are reactive: explanations about system decisions are generated on demand in response to specific user queries. While reactive explanations are useful in many situations, proactive explanations are sometimes called for, particularly in mixed autonomy settings where, for example, close coordination is required and humans are engaged in tasks of their own or are supervising large teams. Proactive explanations serve to keep the human's mental model of the agent's decisions aligned with the agent's actual decision process, minimizing surprises that can distract from and disrupt the team's activities. Used judiciously, they can also reduce the communication burden on the human, who will have less cause to question the agent's decisions.

We propose the use of surprise as the primary motivation for proactivity, with agents using potential expectation violations to trigger explanation. Identifying expectation violations requires having a model of the user's expectations. However, instead of relying on a comprehensive formal model of the human's expectations based on a representation of team and individual goals, communication patterns, etc., we identify classes of expectations based on the simpler idea of expectation norms. That is, given a cooperative team setting where the humans and the agents have the same objectives, we set out to determine expectations on agent behavior based on rational or commonsense reasoning. We enumerate a set of triggers for proactive explanation, discussing for each one the manifestation of surprise, the expectation violation underlying the surprise, and the information that the proactive explanation should impart (Table 1). The triggers are not an exhaustive list but include a broad range that we have found particularly useful in our work on explainable autonomy.
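As a rough illustration of this trigger-and-explain loop, the sketch below shows how an agent might check an impending decision against a set of expectation-norm triggers just before acting. The Trigger type, the proactive_explanations function, and the example plan-deviation check are hypothetical names introduced only for illustration; the trigger classes themselves are the ones enumerated in Table 1.

from typing import Callable, List, Optional, Sequence

# A trigger inspects the impending decision and the current situation and
# returns the content of a proactive explanation if it anticipates a
# surprise, or None otherwise.
Trigger = Callable[[dict, dict], Optional[str]]

def proactive_explanations(decision: dict, situation: dict,
                           triggers: Sequence[Trigger]) -> List[str]:
    # Collect an explanation for every expectation norm the decision violates.
    return [msg for check in triggers
            if (msg := check(decision, situation)) is not None]

# Example trigger: a plan deviation under an expectation of inertia.
def plan_deviation(decision: dict, situation: dict) -> Optional[str]:
    if decision.get("target") != situation.get("announced_target"):
        return (f"Change of plan: diverting from {situation.get('announced_target')} "
                f"to {decision.get('target')}.")
    return None

# Run just before executing the decision and surface whatever comes back
# to the human teammate (e.g., as a textual pop-up).
messages = proactive_explanations({"target": "Group 5"},
                                  {"announced_target": "Group 4"},
                                  [plan_deviation])

Each of the trigger classes discussed in the remainder of this section could, in principle, supply one such check.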
Lim & Dey's investigation of intelligibility demands is focused on context-aware applications [7]; however, some of their findings regarding the situations in which different explanations apply are relevant here. In particular, inappropriate actions, critical situations, situations involving user goals, and high external dependencies were all found to increase the need for intelligibility, particularly through why not and situation explanations.

Trigger | Surprise | Expectation | Explanation
Historical deviations | Action differs from past behavior in similar situations | Agent will behave as it has in the past | Acknowledgement of unexpected action
Unusual situations | Atypical action | Normal operation | Information about unusual situation
Human knowledge limitations | (Seemingly) incorrect action | Agent has the same information as the human | Indicate decision criteria
Preference violations | Non-preferred action | Agent will adhere to specified preferences | Acknowledgment of violation with rationale
Indistinguishable effects | Different action | Agent will perform 'obvious' action | Information about equivalent options
Plan deviations | Action contrary to plan | Actions according to plan | Change of plans and rationale
Indirect trajectories | (Seemingly) aimless behavior | Agent will move toward goal | Plan for getting to goal

Table 1. Triggers for proactive explanation and their surprise manifestations, underlying causes, and explanation content.

Historical Deviations
An important aspect of trust is predictability—a human will generally expect an agent to perform the same actions that it has performed in similar situations in the past. Thus, an agent suddenly executing a different action is likely to surprise the user. An agent can anticipate this situation through a combination of statistical analysis of performance logs and semantic models for situation similarity. Explanation involves an acknowledgment of the atypical behavior and the rationale behind it—for example, "Aborting rescue mission because of engine fire."
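One possible realization of the detection step is sketched below: an action is flagged as a historical deviation when it is rare among the actions recorded for similar past situations. The function name, log format, and thresholds are assumptions made for illustration, and the similar() test stands in for the semantic situation-similarity model mentioned above.

from collections import Counter
from typing import Callable, Sequence, Tuple

def is_historical_deviation(action: str,
                            situation: dict,
                            log: Sequence[Tuple[dict, str]],
                            similar: Callable[[dict, dict], bool],
                            min_support: int = 5,
                            rarity_threshold: float = 0.1) -> bool:
    # Count how often each action was taken in past situations judged similar.
    past = Counter(a for s, a in log if similar(s, situation))
    total = sum(past.values())
    if total < min_support:
        return False  # not enough history to call anything a deviation
    # Atypical if the chosen action accounts for a small fraction of past choices.
    return past[action] / total < rarity_threshold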
Unusual Situations
A human observer lacking detailed understanding of a domain may be aware of actions for normal operations but not of actions for more unusual situations. An agent's actions in these situations may thus surprise the user. The agent can identify these situations by their frequency of occurrence—for example, if the conditions that triggered the behavior are below some probability threshold. Explanation to avert this type of surprise involves describing the unusual situation to the user. For example, an agent might explain, "Normal operation is not to extinguish fires with civilians on board but fire is preventing egress of Drone 17 with a high-priority evacuation."

Human Knowledge Limitations
Sensing and computational capabilities, particularly in distributed settings, can enable autonomous platforms to have insights and knowledge that are unavailable to human collaborators. Through awareness of decisions based on this information, an agent can identify potential mismatches in situational understanding that can lead to surprising the user with seemingly incorrect decision-making. Explanation involves identifying the potential mismatch and surfacing that to the user. For example, Google Maps already does this to some extent when it suggests an unusual route along with the justification that it is currently the best option given current traffic conditions.

Preference Violations
Many formulations of autonomy incorporate preferences over desired behaviors, whether created by the system modeler at design time or imposed by a human supervisor later on. When making decisions, an agent will seek to satisfy these preferences; however, various factors (e.g., resource limitations, deadlines, physical restrictions) may require that they be violated, leading to the agent seemingly operating contrary to plan and surprising the user. Explanation in this case involves acknowledging the violated preference or directive and providing the reason why—for example, "Entering no-fly zone to avoid dangerously high winds."
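The sketch below illustrates how such acknowledgments might be generated, assuming preferences are available as named predicates over a candidate decision; the Preference class and the rationale field are illustrative assumptions rather than part of any particular autonomy framework.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Preference:
    name: str
    satisfied: Callable[[Dict], bool]  # True if the decision respects the preference

def preference_violation_explanations(decision: Dict,
                                      prefs: List[Preference]) -> List[str]:
    # Acknowledge each violated preference and attach the decision's rationale.
    rationale = decision.get("rationale", "no rationale recorded")
    return [f"Violating preference '{p.name}': {rationale}"
            for p in prefs if not p.satisfied(decision)]

# Example mirroring the no-fly-zone case in the text:
no_fly = Preference("avoid no-fly zones",
                    lambda d: not d.get("enters_no_fly_zone", False))
messages = preference_violation_explanations(
    {"enters_no_fly_zone": True,
     "rationale": "avoiding dangerously high winds"},
    [no_fly])
# messages == ["Violating preference 'avoid no-fly zones': avoiding dangerously high winds"]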
Indistinguishability of Effects
Two actions may be very different in practice but achieve similar effects—for example, different routes of similar duration to the same destination. This can surprise a human observer who may not have realized their comparable effects or even been aware of the other (chosen) option. Agents can anticipate this type of surprise by measuring the similarity of actions or trajectories and of outcomes. Explanation then involves making the human aware of different options with similar impact—for example, "I will extinguish Fire 47 before Fire 32 but extinguishing Fire 32 before Fire 47 would be just as effective."

Plan Deviations
Agents are expected to be executing a plan to achieve a goal. Inevitably, situations will arise that require a change of plans which, if initiated by the agent, can cause surprise. Absent an explicit shared understanding of the current plan, an agent can rely on an expectation of inertia—that is, that it will continue moving in the same direction, toward the same target. By characterizing this tendency and recognizing (significant) changes, the agent can anticipate potential surprise. Explanation involves informing the user of the plan change—for example, "Diverting to rescue newly detected group." This may be sufficient if it calls attention to a new goal or target previously unknown to the user, but if the change involves a reprioritization of existing goals, the explanation also needs to include the rationale—for example, "Diverting to rescue Group 5 before Group 4 because fire near Group 5 is growing faster than expected."

Indirect Trajectories
More generally, agents are expected to be engaged in purposeful behavior. In spatiotemporal domains, observers can typically infer from an agent's trajectory its destination and, based on that, its goal. For example, a drone heading toward a fire is likely to be planning to extinguish the fire. Surprises occur when the agent has to take an indirect route and appears to be headed nowhere meaningful. The agent can identify this situation by determining the difference between its actual destination and an apparent one, if any. Explanation then involves explicitly identifying the goal and the reason for the indirect action—for example, "New task to retrieve equipment from supply depot."

SUMMARY AND CONCLUSIONS
Prior work has noted the utility of surprise for driving intelligent system behavior. Recognizing that the most valuable information to users is information that complements what they already know, Horvitz et al. [4] focus on surprising predictions as the situations about which to alert the users in a traffic forecasting system. Wilson et al. [12] use surprise in an intelligent assistant for software engineering to entice users to discover and utilize programming assertions. Here, we use surprise to drive proactive explanations and help users understand decisions that might otherwise cause concern.

We are currently investigating our approach to proactive explanation in various explainable autonomy formulations. In one, where an autonomous controller selects, instantiates, and executes plays from a pre-determined mission playbook, we identify surprising role allocations based on assignment to suboptimal resources and use the degree of suboptimality to drive proactivity. In another, involving a reinforcement learner acquiring policies in a gridworld domain, we use sensitivity analyses that perturb an existing trajectory to identify points where relatively small changes in action lead to very different outcomes.
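The sketch below gives one possible form of this trajectory-perturbation analysis, under assumed interfaces: rollout_return is presumed to continue an episode from the given state after substituting a single action and then following the learned policy, and the names and threshold are illustrative rather than taken from our implementation.

from typing import Callable, List, Sequence, Tuple

State = Tuple[int, int]  # e.g., a gridworld cell
Action = int

def sensitive_steps(trajectory: Sequence[Tuple[State, Action]],
                    actions: Sequence[Action],
                    rollout_return: Callable[[State, Action], float],
                    baseline_return: float,
                    threshold: float) -> List[int]:
    # Indices where swapping in a single alternative action changes the
    # episode return by more than `threshold`: candidate points for
    # proactive explanation.
    flagged = []
    for i, (state, chosen) in enumerate(trajectory):
        for alt in actions:
            if alt == chosen:
                continue
            if abs(rollout_return(state, alt) - baseline_return) > threshold:
                flagged.append(i)
                break
    return flagged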
Focusing on the motivation behind explanations in collaborative autonomy settings helps bring to light issues not often addressed in work on explainable AI. We present a framework for explanation drivers, focusing in particular on explanations for reconciling expectation violations. We argue that averting surprise should be a primary motivation for explanation and enumerate a set of triggers for proactive explanations. While most current work on explanation focuses on opaque deep learning models and is thus primarily concerned with interpretability, mixed autonomy settings require additional metrics to capture the usefulness and significance of explanations in terms of their quality and impact. Ultimately, our objective is to provide evidence that explanations enable the appropriate and effective use of intelligent agents in mixed autonomy settings.

REFERENCES
1. Berkeley J. Dietvorst, Joseph P. Simmons, and Cade Massey. 2014. Algorithm aversion: People erroneously avoid algorithms after seeing them err. J. Experimental Psychology: General 144, 1.
2. Mary T. Dzindolet, Scott A. Peterson, Regina A. Pomranky, Linda G. Pierce, and Hall P. Beck. 2003. The role of trust in automation reliance. Int. J. Human-Computer Studies 58: 697–718.
3. Jonathan L. Herlocker, Joseph A. Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. Proc. CSCW 2000.
4. Eric Horvitz, Johnson Apacible, Raman Sarin, and Lin Liao. 2005. Prediction, expectation, and surprise: Methods, designs, and study of a deployed traffic forecasting service. Proc. UAI 2005.
5. Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. Proc. CVPR 2015.
6. Todd Kulesza, Margaret Burnett, Weng-Keen Wong, and Simone Stumpf. 2015. Principles of explanatory debugging to personalize interactive machine learning. Proc. IUI 2015.
7. Brian Y. Lim and Anind K. Dey. 2009. Assessing demand for intelligibility in context-aware applications. Proc. UbiComp 2009.
8. Edward H. Shortliffe, Randall Davis, Stanton G. Axline, Bruce G. Buchanan, C. Cordell Green, and Stanley N. Cohen. 1975. Computer-based consultations in clinical therapeutics: Explanation and rule acquisition capabilities of the MYCIN system. Computers and Biomedical Research 8, 4: 303–320.
9. Simone Stumpf, Vidya Rajaram, Lida Li, Weng-Keen Wong, Margaret Burnett, Thomas Dietterich, Erin Sullivan, and Jonathan Herlocker. 2009. Interacting meaningfully with machine learning systems: Three experiments. Int. J. Human-Computer Studies 67, 8: 639–662.
10. Nava Tintarev and Judith Masthoff. 2007. Effective explanations of recommendations: User-centered design. Proc. RecSys 2007.
11. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?" Explaining the predictions of any classifier. Proc. KDD 2016.
12. Aaron Wilson, Margaret Burnett, Laura Beckwith, Orion Granatir, Ledah Casburn, Curtis Cook, Mike Durham, and Greg Rothermel. 2003. Harnessing curiosity to increase correctness in end-user programming. Proc. CHI 2003.