Explanation to Avert Surprise

Melinda Gervasio, Karen Myers, Eric Yeh, Boone Adkins
SRI International
333 Ravenswood Avenue, Menlo Park, California 94025, USA
{firstname.lastname}@sri.com

ABSTRACT
Most explanation schemes are reactive and informational: explanations are provided in response to specific user queries and focus on making the system's reasoning more transparent. In mixed autonomy settings that involve teams of humans and autonomous agents, proactive explanation that anticipates and preempts potential surprises can be particularly valuable. By providing timely, succinct, and context-sensitive explanations, autonomous agents can avoid perceived faulty behavior and the consequent erosion of trust, enabling more fluid collaboration. We present an explanation framework based on the notion of explanation drivers—i.e., the intent or purpose behind agent explanations. We focus on explanations meant to reconcile expectation violations and enumerate a set of triggers for proactive explanation. Most work on explainable AI focuses on intelligibility; investigating explanation in mixed autonomy settings helps illuminate other important explainability issues such as purpose, timing, and impact.

Author Keywords
Explainable autonomy; explainable AI; human-machine teams; collaborative AI; intelligibility

ACM Classification Keywords
H.5.m. Information interfaces and presentation: Miscellaneous; I.2.m. Artificial intelligence: Miscellaneous

INTRODUCTION
Humans judge mistakes by computer systems more harshly than mistakes by other humans, with errors having a disproportionately large impact on perceived reliability [1,2]. This negative impact on trust has particularly significant repercussions for human-machine teams, where humans' trust in the autonomous agents directly affects how well they utilize the agents. The effect is particularly unfortunate when the human perceives an agent to be misbehaving when in fact it is behaving appropriately but in response to conditions unknown to the user.

We propose that a primary motivation for explanation should be surprise. When an agent violates expectations—typically, not in a good way—a human collaborator will invariably want to know the reason why. Reacting to the human's surprise and explaining away the violation is a valid approach, but it would be even more effective if the agent could anticipate the surprise and proactively explain what it is about to do. This averts a potentially unpleasant surprise that distracts the user and erodes trust.

Our foray into explainable autonomy began a few years ago, when we were developing autonomous agents for a project on team autonomy for uncertain, dynamic, adversarial environments in mixed human-machine settings. As we observed the agents in action, we would sometimes see puzzling behavior—for example, an agent might suddenly change course away from its intended destination. Our first thought would almost invariably be that there was a problem with the agent but, upon further inspection, we would realize that the agent had good reason for its action. For example, it might be reacting to an unexpected event or diverting to a higher-priority task. A straightforward UI showing the agents' current plans helped somewhat to alleviate this problem, but this was a solution targeted at the autonomous agents' designers, not at the end users who would be teaming with these automated agents in the future.

The need for intelligent systems that could explain themselves was recognized early on with expert systems [8], with the desire of both system developers and end users to better understand the reasoning behind a computational system's conclusions to determine whether it could be trusted. More recently, the dominant work in explanation has been on explaining the decisions of learned classifiers, particularly in the context of interactive learning [6,9], recommender systems [3,10], and deep learning [5,11].
Explanation for autonomy differs in a number of ways. The decision to be explained is typically part of a larger, orchestrated sequence of actions to achieve some long-term goal. Decisions occur at different levels of granularity—from overarching policies and strategic decisions down to individual actions. Explanation is required for various reasons under different circumstances: before execution to explain planning decisions, during execution to explain deviations from planned or expected behavior, and afterwards to review agent actions.

In the collaborative human-machine team settings that we are primarily interested in, whether humans serve as supervisors or as teammates, explanation during execution presents the additional challenge of limited cognitive resources. With the human already engaged in a cognitively demanding task, system explanations must be succinct, timely, and context-sensitive. In particular, when a human asks, "Why are you doing that?" it will often be because the agent has done something unexpected and the agent's explanation must address that.

EXPLANATION DRIVERS
We have developed an explanation framework based on the concept of explanation drivers: the intent or purpose behind an agent's explanation. We distinguish between three classes of drivers: Inform, Reconcile, and Prime.

Explanations to Inform are what most people typically think of as explanations. They provide straightforward answers to basic wh-questions—for example, "What is your goal?" or "How do you plan to achieve that goal?" or "Where are you going?" In the mixed-team setting, Inform explanations are particularly useful early on, when the human is still trying to get an overall sense of an (unknown) agent's decision-making. However, even after some level of trust has been established, Inform explanations often still remain useful for maintaining that trust.

Explanations that Reconcile address expectation violations. They answer questions borne of surprise—e.g., "What are you doing!" or "Why aren't you doing X?" or "Why did you do Y [instead of Z]?" Reconcile explanations are most effective when presented before the consequences of the decision are apparent, to prevent the surprise in the first place. For example, a warning from a firefighting drone that it will be diverting to help extinguish a fire that is growing faster than expected avoids surprising the user and possibly causing concern. It also gives the user the opportunity to change the plan—for example, to send the drone to its original target and to co-opt a different one to help instead.

Finally, there are explanations that Prime the user for assistance. Just as in human teams, communication and coordination are critical in mixed teams. In human-supervised settings, an important part of this collaboration involves agents recognizing when they need help and providing humans with the information they need to provide appropriate guidance. Beyond simply asking for help, Prime explanations inform humans of why help is needed, enabling them to provide appropriate assistance. For example, if the agent has low confidence in its best action, it can let the human supervisor confirm or override.
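To make the taxonomy concrete, the following is a minimal sketch, in Python, of how explanations tagged with these drivers might be represented. The Driver and Explanation names and fields are illustrative assumptions rather than structures from the systems described in this paper.

from dataclasses import dataclass
from enum import Enum, auto

class Driver(Enum):
    INFORM = auto()     # answers to basic wh-questions
    RECONCILE = auto()  # addresses an (anticipated) expectation violation
    PRIME = auto()      # prepares the human to provide assistance

@dataclass
class Explanation:
    driver: Driver
    content: str
    awaits_response: bool = False  # e.g., a Prime explanation asking for confirmation

# Examples mirroring those in the text:
examples = [
    Explanation(Driver.INFORM, "My current goal is to extinguish Fire 47."),
    Explanation(Driver.RECONCILE,
                "Diverting to help extinguish a fire that is growing faster than expected."),
    Explanation(Driver.PRIME,
                "Low confidence in best action; please confirm or override.",
                awaits_response=True),
]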
Here we focus on Reconcile explanations—in particular, on proactive explanations designed to avoid unpleasant surprises for human collaborators. This decision to focus on proactive explanations was partially validated by the results of a small four-person user study we conducted in mid-2017. The study was set in a fictional domain of drone firefighting and rescue, and participants were given the task of understanding what the drones were doing, with the knowledge that the world was dynamic (e.g., fires could start and die out on their own) and that all information was uncertain (e.g., groups to be rescued could appear and disappear, fires could be larger or smaller than expected). Participants were presented with snapshots of an evolving scenario. In the baseline condition, they were provided with basic information about current drone assignments and the status of all known fires and groups, and they could ask basic questions about the drones' behavior. In the proactive condition, they were also given preemptive explanations (as textual pop-ups) of certain drone decisions.

The participants all found the proactive explanations to be useful. As one participant put it, "[Proactive explanations were] very helpful, particularly anything that was counterintuitive or represented a big change." Based on the questions participants asked, we observed that everyone wanted to know the big picture, both in terms of the overall plan and the agents' overall priorities. In addition, the participants expected the drones to address all the targets—fires extinguished and groups rescued—with a strong preference for saving people over extinguishing fires.

TRIGGERS FOR PROACTIVE EXPLANATION
Most explanation schemes are reactive: explanations about system decisions are generated on demand in response to specific user queries. While reactive explanations are useful in many situations, proactive explanations are sometimes called for, particularly in mixed autonomy settings where, for example, close coordination is required and humans are engaged in tasks of their own or are supervising large teams. Proactive explanations serve to keep the human's mental model of the agent's decisions aligned with the agent's actual decision process, minimizing surprises that can distract from and disrupt the team's activities. Used judiciously, they can also reduce the communication burden on the human, who will have less cause to question the agent's decisions.

We propose the use of surprise as the primary motivation for proactivity, with agents using potential expectation violations to trigger explanation. Identifying expectation violations requires having a model of the user's expectations. However, instead of relying on a comprehensive formal model of the human's expectations based on a representation of team and individual goals, communication patterns, etc., we identify classes of expectations based on the simpler idea of expectation norms. That is, given a cooperative team setting where the humans and the agents have the same objectives, we set out to determine expectations on agent behavior based on rational or commonsense reasoning. We enumerate a set of triggers for proactive explanation, discussing for each one the manifestation of surprise, the expectation violation underlying the surprise, and the information that the proactive explanation should impart (Table 1). The triggers are not an exhaustive list but include a broad range that we have found particularly useful in our work on explainable autonomy.
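As a rough illustration of this trigger-and-explain loop, the sketch below shows how an agent might check an impending decision against a set of expectation-norm triggers just before acting. The Trigger type, the proactive_explanations function, and the example plan-deviation check are hypothetical names introduced only for illustration; the trigger classes themselves are the ones enumerated in Table 1.

from typing import Callable, List, Optional, Sequence

# A trigger inspects the impending decision and the current situation and
# returns the content of a proactive explanation if it anticipates a
# surprise, or None otherwise.
Trigger = Callable[[dict, dict], Optional[str]]

def proactive_explanations(decision: dict, situation: dict,
                           triggers: Sequence[Trigger]) -> List[str]:
    # Collect an explanation for every expectation norm the decision violates.
    return [msg for check in triggers
            if (msg := check(decision, situation)) is not None]

# Example trigger: a plan deviation under an expectation of inertia.
def plan_deviation(decision: dict, situation: dict) -> Optional[str]:
    if decision.get("target") != situation.get("announced_target"):
        return (f"Change of plan: diverting from {situation.get('announced_target')} "
                f"to {decision.get('target')}.")
    return None

# Run just before executing the decision and surface whatever comes back
# to the human teammate (e.g., as a textual pop-up).
messages = proactive_explanations({"target": "Group 5"},
                                  {"announced_target": "Group 4"},
                                  [plan_deviation])

Each of the trigger classes discussed in the remainder of this section could, in principle, supply one such check.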
Lim & Dey's investigation of intelligibility demands is focused on context-aware applications [7]; however, some of their findings regarding the situations in which different explanations apply are relevant here. In particular, inappropriate actions, critical situations, situations involving user goals, and high external dependencies were all found to increase the need for intelligibility, particularly through why not and situation explanations.

Trigger | Surprise | Expectation | Explanation
Historical deviations | Action differs from past behavior in similar situations | Agent will behave as it has in the past | Acknowledgement of unexpected action
Unusual situations | Atypical action | Normal operation | Information about unusual situation
Human knowledge limitations | (Seemingly) incorrect action | Agent has the same information as the human | Indicate decision criteria
Preference violations | Non-preferred action | Agent will adhere to specified preferences | Acknowledgment of violation with rationale
Indistinguishable effects | Different action | Agent will perform 'obvious' action | Information about equivalent options
Plan deviations | Action contrary to plan | Actions according to plan | Change of plans and rationale
Indirect trajectories | (Seemingly) aimless behavior | Agent will move toward goal | Plan for getting to goal

Table 1. Triggers for proactive explanation and their surprise manifestations, underlying causes, and explanation content.

Historical Deviations
An important aspect of trust is predictability—a human will generally expect an agent to perform the same actions that it has performed in similar situations in the past. Thus, an agent suddenly executing a different action is likely to surprise the user. An agent can anticipate this situation through a combination of statistical analysis of performance logs and semantic models for situation similarity. Explanation involves an acknowledgment of the atypical behavior and the rationale behind it—for example, "Aborting rescue mission because of engine fire."
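One possible realization of the detection step is sketched below: an action is flagged as a historical deviation when it is rare among the actions recorded for similar past situations. The function name, log format, and thresholds are assumptions made for illustration, and the similar() test stands in for the semantic situation-similarity model mentioned above.

from collections import Counter
from typing import Callable, Sequence, Tuple

def is_historical_deviation(action: str,
                            situation: dict,
                            log: Sequence[Tuple[dict, str]],
                            similar: Callable[[dict, dict], bool],
                            min_support: int = 5,
                            rarity_threshold: float = 0.1) -> bool:
    # Count how often each action was taken in past situations judged similar.
    past = Counter(a for s, a in log if similar(s, situation))
    total = sum(past.values())
    if total < min_support:
        return False  # not enough history to call anything a deviation
    # Atypical if the chosen action accounts for a small fraction of past choices.
    return past[action] / total < rarity_threshold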
Unusual Situations
A human observer lacking detailed understanding of a domain may be aware of actions for normal operations but not of actions for more unusual situations. An agent's actions in these situations may thus surprise the user. The agent can identify these situations by their frequency of occurrence—for example, if the conditions that triggered the behavior are below some probability threshold. Explanation to avert this type of surprise involves describing the unusual situation to the user. For example, an agent might explain, "Normal operation is not to extinguish fires with civilians on board but fire is preventing egress of Drone 17 with a high-priority evacuation."

Human Knowledge Limitations
Sensing and computational capabilities, particularly in distributed settings, can enable autonomous platforms to have insights and knowledge that are unavailable to human collaborators. Through awareness of decisions based on this information, an agent can identify potential mismatches in situational understanding that can lead to surprising the user with seemingly incorrect decision-making. Explanation involves identifying the potential mismatch and surfacing that to the user. For example, Google Maps already does this to some extent when it suggests an unusual route along with the justification that it is currently the best option given current traffic conditions.

Preference Violations
Many formulations of autonomy incorporate preferences over desired behaviors, whether created by the system modeler at design time or imposed by a human supervisor later on. When making decisions, an agent will seek to satisfy these preferences; however, various factors (e.g., resource limitations, deadlines, physical restrictions) may require that they be violated, leading to the agent seemingly operating contrary to plan and surprising the user. Explanation in this case involves acknowledging the violated preference or directive and providing the reason why—for example, "Entering no-fly zone to avoid dangerously high winds."
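The sketch below illustrates how such acknowledgments might be generated, assuming preferences are available as named predicates over a candidate decision; the Preference class and the rationale field are illustrative assumptions rather than part of any particular autonomy framework.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Preference:
    name: str
    satisfied: Callable[[Dict], bool]  # True if the decision respects the preference

def preference_violation_explanations(decision: Dict,
                                      prefs: List[Preference]) -> List[str]:
    # Acknowledge each violated preference and attach the decision's rationale.
    rationale = decision.get("rationale", "no rationale recorded")
    return [f"Violating preference '{p.name}': {rationale}"
            for p in prefs if not p.satisfied(decision)]

# Example mirroring the no-fly-zone case in the text:
no_fly = Preference("avoid no-fly zones",
                    lambda d: not d.get("enters_no_fly_zone", False))
messages = preference_violation_explanations(
    {"enters_no_fly_zone": True,
     "rationale": "avoiding dangerously high winds"},
    [no_fly])
# messages == ["Violating preference 'avoid no-fly zones': avoiding dangerously high winds"]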
Indistinguishability of Effects
Two actions may be very different in practice but achieve similar effects—for example, different routes of similar duration to the same destination. This can surprise a human observer who may not have realized their comparable effects or even been aware of the other (chosen) option. Agents can anticipate this type of surprise by measuring the similarity of actions or trajectories and of outcomes. Explanation then involves making the human aware of different options with similar impact—for example, "I will extinguish Fire 47 before Fire 32 but extinguishing Fire 32 before Fire 47 would be just as effective."

Plan Deviations
Agents are expected to be executing a plan to achieve a goal. Inevitably, situations will arise that require a change of plans which, if initiated by the agent, can cause surprise. Absent an explicit shared understanding of the current plan, an agent can rely on an expectation of inertia—that is, that it will continue moving in the same direction, toward the same target. By characterizing this tendency and recognizing (significant) changes, the agent can anticipate potential surprise. Explanation involves informing the user of the plan change—for example, "Diverting to rescue newly detected group." This may be sufficient if it calls attention to a new goal or target previously unknown to the user, but if the change involves a reprioritization of existing goals, the explanation also needs to include the rationale—for example, "Diverting to rescue Group 5 before Group 4 because fire near Group 5 is growing faster than expected."

Indirect Trajectories
More generally, agents are expected to be engaged in purposeful behavior. In spatiotemporal domains, observers can typically infer from an agent's trajectory its destination and, based on that, its goal. For example, a drone heading toward a fire is likely to be planning to extinguish the fire. Surprises occur when the agent has to take an indirect route and appears to be headed nowhere meaningful. The agent can identify this situation by determining the difference between its actual destination and an apparent one, if any. Explanation then involves explicitly identifying the goal and the reason for the indirect action—for example, "New task to retrieve equipment from supply depot."

SUMMARY AND CONCLUSIONS
Prior work has noted the utility of surprise for driving intelligent system behavior. Recognizing that the most valuable information to users is information that complements what they already know, Horvitz et al. [4] focus on surprising predictions as the situations about which to alert the users in a traffic forecasting system. Wilson et al. [12] use surprise in an intelligent assistant for software engineering to entice users to discover and utilize programming assertions. Here, we use surprise to drive proactive explanations and help users understand decisions that might otherwise cause concern.

We are currently investigating our approach to proactive explanation in various explainable autonomy formulations. In one, where an autonomous controller selects, instantiates, and executes plays from a pre-determined mission playbook, we identify surprising role allocations based on assignment to suboptimal resources and use the degree of suboptimality to drive proactivity. In another, involving a reinforcement learner acquiring policies in a gridworld domain, we use sensitivity analyses that perturb an existing trajectory to identify points where relatively small changes in action lead to very different outcomes.
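The sketch below gives one possible form of this trajectory-perturbation analysis, under assumed interfaces: rollout_return is presumed to continue an episode from the given state after substituting a single action and then following the learned policy, and the names and threshold are illustrative rather than taken from our implementation.

from typing import Callable, List, Sequence, Tuple

State = Tuple[int, int]  # e.g., a gridworld cell
Action = int

def sensitive_steps(trajectory: Sequence[Tuple[State, Action]],
                    actions: Sequence[Action],
                    rollout_return: Callable[[State, Action], float],
                    baseline_return: float,
                    threshold: float) -> List[int]:
    # Indices where swapping in a single alternative action changes the
    # episode return by more than `threshold`: candidate points for
    # proactive explanation.
    flagged = []
    for i, (state, chosen) in enumerate(trajectory):
        for alt in actions:
            if alt == chosen:
                continue
            if abs(rollout_return(state, alt) - baseline_return) > threshold:
                flagged.append(i)
                break
    return flagged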
Focusing on the motivation behind explanations in collaborative autonomy settings helps bring to light issues not often addressed in work on explainable AI. We present a framework for explanation drivers, focusing in particular on explanations for reconciling expectation violations. We argue that averting surprise should be a primary motivation for explanation and enumerate a set of triggers for proactive explanations. While most current work on explanation focuses on opaque deep learning models and is thus primarily concerned with interpretability, mixed autonomy settings require additional metrics to capture the usefulness and significance of explanations in terms of their quality and impact. Ultimately, our objective is to provide evidence that explanations enable the appropriate and effective use of intelligent agents in mixed autonomy settings.

REFERENCES
1. Berkeley J. Dietvorst, Joseph P. Simmons, and Cade Massey. 2014. Algorithm aversion: People erroneously avoid algorithms after seeing them err. J. Experimental Psychology: General 144, 1.
2. Mary T. Dzindolet, Scott A. Peterson, Regina A. Pomranky, Linda G. Pierce, and Hall P. Beck. 2003. The role of trust in automation reliance. Int. J. Human-Computer Studies 58: 697–718.
3. Jonathan L. Herlocker, Joseph A. Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. Proc. CSCW 2000.
4. Eric Horvitz, Johnson Apacible, Raman Sarin, and Lin Liao. 2005. Prediction, expectation, and surprise: Methods, designs, and study of a deployed traffic forecasting service. Proc. UAI 2005.
5. Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. Proc. CVPR 2015.
6. Todd Kulesza, Margaret Burnett, Weng-Keen Wong, and Simone Stumpf. 2015. Principles of explanatory debugging to personalize interactive machine learning. Proc. IUI 2015.
7. Brian Y. Lim and Anind K. Dey. 2009. Assessing demand for intelligibility in context-aware applications. Proc. UbiComp 2009.
8. Edward H. Shortliffe, Randall Davis, Stanton G. Axline, Bruce G. Buchanan, C. Cordell Green, and Stanley N. Cohen. 1975. Computer-based consultations in clinical therapeutics: Explanation and rule acquisition capabilities of the MYCIN system. Computers and Biomedical Research 8, 4: 303–320.
9. Simone Stumpf, Vidya Rajaram, Lida Li, Weng-Keen Wong, Margaret Burnett, Thomas Dietterich, Erin Sullivan, and Jonathan Herlocker. 2009. Interacting meaningfully with machine learning systems: Three experiments. Int. J. Human-Computer Studies 67, 8: 639–662.
10. Nava Tintarev and Judith Masthoff. 2007. Effective explanations of recommendations: User-centered design. Proc. RecSys 2007.
11. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?" Explaining the predictions of any classifier. Proc. KDD 2016.
12. Aaron Wilson, Margaret Burnett, Laura Beckwith, Orion Granatir, Ledah Casburn, Curtis Cook, Mike Durham, and Greg Rothermel. 2003. Harnessing curiosity to increase correctness in end-user programming. Proc. CHI 2003.