<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Output Uncertainty: Targeting the Actual End User Problem in Interactions with AI</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zelun Tony Zhang</string-name>
          <email>zhang@fortiss.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heinrich Hußmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>College Station, USA</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LMU Munich, Chair of Applied Informatics and Media Informatics</institution>
          ,
          <addr-line>Frauenlobstraße 7a, 80337 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>fortiss GmbH, Research Institute of the Free State of Bavaria</institution>
          ,
          <addr-line>Guerickestraße 25, 80805 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Given the opaqueness and complexity of modern AI algorithms, there is currently a strong focus on developing transparent and explainable AI, especially in high-stakes domains. We claim that opaqueness and complexity are not the core issues for end users when interacting with AI. Instead, we propose that the output uncertainty inherent to AI systems is the actual problem, with opaqueness and complexity as contributing factors. Transparency and explainability should therefore not be the end goals, as such a focus tends to place the human into a passive supervisory role in what is in reality an algorithm-centered system design. To enable effective management of output uncertainty, we believe it is necessary to focus on truly human-centered AI designs that keep the human in an active role of control. We discuss the conceptual implications of such a shift in focus and give examples from literature to illustrate the more holistic, interactive designs that we envision.</p>
      </abstract>
      <kwd-group>
        <kwd>output uncertainty</kwd>
        <kwd>human-AI interaction</kwd>
        <kwd>intelligent systems</kwd>
        <kwd>transparency</kwd>
        <kwd>explainability</kwd>
        <kwd>user control</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The field of artificial intelligence (AI) has witnessed impressive progress in recent years. Yet, in many critical, high-stakes domains such as aviation, medical technology or criminal justice, AI is not yet widely deployable due to challenges like brittleness of the algorithms [1] or algorithmic bias. The results are issues in terms of safety, ethics and social justice. The complexity and opaqueness of most modern AI algorithms are generally seen as the core of the problems, prompting widespread calls for AI transparency and explainability. However, despite the intensive research on transparent and explainable AI, the effectiveness of these efforts on the end user side remains unclear. This paper calls for a more holistic perspective on the issues in end user interactions with AI systems, especially in high-stakes domains.</p>
      <p>We propose to focus on output uncertainty, i.e. the
uncertainty of the user about the case-by-case correctness
of the algorithmic output, rather than complexity and
opaqueness.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <sec id="sec-2-1">
        <title>2.1. What is AI?</title>
        <sec id="sec-2-1-1">
          <title>2.1.1. Characterization of AI</title>
          <p>When thinking about human-AI interaction, it is useful to have a clear idea about what AI actually is, something that is anything but trivial given that there is no agreed-upon definition of AI [2, 4]. To this end, we suggest that AI is usually applied to complex problems which often cannot be fully specified with hard criteria and are therefore subject to uncertainty. Consider for instance recidivism prediction or making medical diagnoses, where humans at least partially rely on experience and gut feeling. It is in problems like these where AI can achieve what conventional programming cannot. We therefore draw on the definition of intelligence by Albus “as the ability of a system to act appropriately in an uncertain environment” [5]. For the purpose of this paper, AI is thus mainly characterized as computer systems that can appropriately handle problems that are subject to uncertainty. In particular, this implies that the resulting outputs of an AI system are also subject to uncertainty. Note that the purpose of this characterization is to capture what defines AI for the end user. As such, it does not necessarily address every understanding of AI. However, the vast majority of user-facing applications of AI should be covered by our characterization.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.2. Conceptualizations of AI Usage</title>
          <p>Not only is there no agreement on a clear definition of AI, but there are also diverging opinions on how AI should be used. The longstanding debate in HCI whether computer systems should be designed as tools or as agents [6] also translates to how designers and researchers conceptualize the usage of AI. In Shneiderman’s terms, the two extremes of possible conceptualizations are the emulation goal and the application goal [7]. The former refers to the intention to emulate human capabilities with AI through autonomous agents, the latter to the usage of AI to create tools that enhance human performance. The emulation goal tends to encourage algorithm-centered thinking, while pursuing the application goal promotes human-centered thinking. Völkel et al. found that elements of both conceptual extremes are represented in IUI publications [3].</p>
          <p>An algorithm-centered conceptualization of AI usage is problematic, especially for high-stakes applications. Automation is frequently implemented to supplant the human in various tasks, e.g. with automatic creditworthiness decisions, automatic recidivism predictions, or autopilots in planes. However, due to imperfect algorithms, human operators are appointed as supervisory instances, a task that is long known to be unsuitable for humans [8]. Aircraft pilots need to constantly monitor the automation at their disposal for correct functioning. Likewise, users of automatic decision aids are not involved in the decision-making process. They are confronted with a suggested result and expected to decide whether to accept or to reject it. Such systems are algorithm-centered in that the end users are ordered to compensate for the shortcomings of the algorithm. As a consequence, the automation ironically often needs to be cut back in its abilities to allow the human a chance to perform his or her supervisory task at all, resulting in lower performance than technically possible. Even worse in high-stakes applications are the costly and ethically problematic mistakes that occur when the human supervisor fails to detect, understand, or correct an algorithm error.</p>
          <p>Commonly, the fact that the human is out of the loop in these automated systems is identified as the core issue [9]. With the complexity and opacity of modern AI technologies like deep learning, the focus is therefore naturally on enabling AI transparency and explainability to keep the human in the loop. However, outfitting a system like those described above with a transparent and explainable interface does not change the algorithm-centered nature of the design. The human is still supposed to make up for the deficits of the algorithm; the task only gets more user-friendly, but is in its nature still unsuitable for humans. To prevent burdening users with such unsuitable tasks, there needs to be a more fundamental rethinking of how to deploy AI for the benefit of users. Truly human-centered AI designs should not strive for human-in-the-loop, but AI-in-the-loop [7].</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Transparency and Explainability</title>
        <p>The current wave of efforts towards more interpretable AI was initiated by the AI and machine learning communities, with an algorithmic focus on how to create transparent models or how to generate explanations. Usability and actual user needs only gained more importance later on as the HCI community shifted its attention towards the topic [10]. Still, the effectiveness of current approaches remains unclear. This already starts with the fact that the understanding of the concepts of AI transparency and explainability is similarly diffuse as the definition of AI itself. Especially explainability and related concepts like interpretability are not well defined [11], are used to describe differing ideas by different authors [12, 13], and lack agreed-upon ways to evaluate them [11].</p>
        <p>HCI researchers have resorted to various measures to evaluate the effectiveness of explanations. A range of user studies that evaluate the effect of explanations on trust [14, 15], system understanding [14, 16], or task performance [17] demonstrate the benefits of AI transparency and explainability. However, in contradiction to these results, many studies show no significant effect of providing explanations [18, 19, 20, 21]. On closer inspection, the effects of adding interpretability to AI systems can be counterintuitive or even negative. For instance, Poursabzi-Sangdeh et al. observed in their study that adding transparency can hinder users from detecting serious mistakes of the algorithm, likely due to information overload [22]. Furthermore, Bansal et al. showed that users were more likely to follow AI outputs when given explanations, regardless of the correctness of the outputs [23]. The results of Eiband et al. suggest that this overtrust can even occur with placebic explanations, i.e. explanations that contain no actual information [24].</p>
        <p>Taken together, there is no clear picture of when and how transparent and explainable AI can be achieved, despite the rich and rapidly growing body of research. The widely spread results hint at the complexity of the topic, with numerous contributing factors that might not be obvious and necessitate much further research.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Output Uncertainty</title>
      <sec id="sec-3-1">
        <title>3.1. A Provocative Question</title>
        <p>Given the complicated matter of making AI interpretable, one should be allowed to ask a provocative question: Why do we actually need transparency and explanations in AI? The common line of thinking is that modern AI systems are too complex and opaque to understand how specific outputs are generated, that they constitute black boxes. For AI engineers, this hinders their development work. For regulators, it complicates the evaluation of compliance with regulations. And also for end users, the black box property is commonly thought to be an issue. However, many systems in our lives are complex black boxes in the eyes of the user, and yet neither the average user nor the designers of these systems care about transparency to open up these black boxes. For instance, a car is sufficiently complex that the average driver has no good understanding of its inner workings. Yet, there is no need to make its complex engineering transparent or explainable, despite the high-stakes, safety-critical nature of the car. So what is the difference for the end user with AI?</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Output Uncertainty is the Actual Problem</title>
        <p>In the context of end user interactions, complexity and opaqueness appear not to be the problem per se. Instead, we argue that from an end user interaction perspective, the distinguishing factor of AI as characterized by us in Section 2.1.1 is what we refer to as output uncertainty. By output uncertainty we mean the uncertainty of the user about the case-by-case correctness of the algorithmic output. Note that we focus on end users here who directly interact and work with the AI systems, e.g. pilots flying with AI assistance, physicians making diagnoses with the help of AI, or police departments employing AI-enabled predictive policing systems. For other stakeholders like developers or regulators, different considerations might apply.</p>
        <p>Consider the braking system of a car, which could be arbitrarily complex and opaque, without the average driver ever caring about it. Since the driver can be sure that the car will slow down when stepping on the brake (and that the brake will not apply otherwise), there is no need for him or her to wonder about how the result came to be. On the other hand, even very simple rule-based systems could be problematic for the user, despite a high degree of transparency and explainability. Take for instance an agent that automatically categorizes emails according to a manageable set of simple and explicit rules based on factors like sender, keywords, or time. Since the real criteria for how to categorize the emails are unlikely to be captured fully by these simplistic rules, the user can never be certain that no emails have been misclassified without tedious manual verification and correction. The fact that the user can use the rules to reconstruct how the emails have been categorized does not resolve the issue.</p>
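        <p>To make the email example concrete, the following minimal sketch shows what such a rule-based agent could look like. The rules and names are invented purely for illustration and are not from any actual system. Every output is fully explainable by the rule that produced it, yet the user still cannot be sure that any individual email ends up in the right category.</p>
        <preformat>
# Illustrative sketch of the hypothetical rule-based email agent
# described above; rules and names are invented for illustration.

RULES = [
    # (category, predicate) -- applied in order, first match wins
    ("Work",    lambda mail: mail["sender"].endswith("@company.example")),
    ("Finance", lambda mail: "invoice" in mail["subject"].lower()),
    ("Social",  lambda mail: "party" in mail["subject"].lower()),
]

def categorize(mail):
    """Fully transparent: the matching rule explains every output."""
    for category, predicate in RULES:
        if predicate(mail):
            return category
    return "Other"

# The rules are simple and explicit, yet the user's real criterion
# ("is this mail about my finances?") is not fully captured:
mail = {"sender": "shop@store.example", "subject": "Your receipt"}
print(categorize(mail))  # prints "Other", although the user would expect
                         # "Finance"; without manually checking, the user
                         # cannot be certain that no mail is misfiled.
</preformat>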
        <p>These examples certainly do not cover the whole range of problems where transparency and explainability are demanded. However, they serve to illustrate that it is not complexity and opaqueness that are the root of the problem, but rather output uncertainty. This view is supported by studies showing that displaying the confidence of the model in its output is highly effective in calibrating users’ trust to appropriate levels (the user has appropriate trust in the model if the user follows the model output when it is correct and rejects it when it is wrong), while giving explanations has no significant effect [23, 21].</p>
        <p>We do regard complexity and opaqueness as highly important issues, as they constitute major contributing factors to output uncertainty. However, focusing on them as if they were the root problems can limit our thinking when searching for human-centered ways to deploy AI and reap its benefits in high-stakes scenarios. We propose that designers of AI systems should instead focus on managing output uncertainty, considering complexity and opaqueness as contributing factors. The currently popular practice is to present fully automatically generated outputs to users as a fait accompli. Users need to constantly reconstruct the reasons behind these outputs in order to reject them and to override the algorithm when necessary. The goal should be to come up with more holistic and effective designs than that. Now, this does not preclude current designs and efforts towards fully automatic and hopefully transparent decision aids. But the point is that there needs to be a better understanding of when and why such a design could be appropriate, rather than taking it as a default or a given.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Output Uncertainty and Related Constructs</title>
        <p>As stated in Section 3.2, we define output uncertainty as the uncertainty of the user about the correctness of the model output on a case-by-case basis. We acknowledge that several similar constructs have already been investigated in human-AI interaction research. In order to clarify our perspective, we briefly discuss how output uncertainty differs from these related constructs in this section.</p>
        <p>Most notably, managing output uncertainty appears to be very similar to calibrating trust in AI systems. However, output uncertainty management recognizes the inevitability of AI errors and is concerned with designs to manage these errors. As such, it is a wider problem than trust calibration, which relies mostly on explanations to help users recognize when to trust or to dismiss an algorithmic output. While this is a possible approach, there are more ways to manage output uncertainty, as we describe in Section 3.4.2. Furthermore, trust in AI is a highly convoluted construct with many different meanings [25]. Its conceptualization is also influenced by our human intuition about interpersonal trust, which can cause misleading conclusions [25]. In contrast, output uncertainty is a much simpler and more focused notion, which could be seen as one influencing factor in the complex of trust and trustworthiness.</p>
        <p>We also differentiate output uncertainty from unpredictability, as a system can behave predictably on some level while still inducing output uncertainty. This could for instance be the case for the exemplary email agent mentioned in Section 3.2. Due to its simple and explicit rules, the system behavior can be considered predictable. Yet still, the agent might unexpectedly miscategorize some emails due to the flexible nature of language, creating output uncertainty on the end user side. Another way unpredictability can differ from output uncertainty is when the latter is precisely quantifiable, i.e. the user knows the likelihood that the system is correct in any given situation. Hoping for a specific number while throwing a die would be the simplest example. In such a case, the global behavior of the system is predictable to the user, but the uncertainty about the case-by-case correctness remains.</p>
        <p>In the same vein, output uncertainty is not the same as the confidence of the model in the correctness of its outputs, or rather the lack thereof: confidence scores are supposed to reveal how (un)certain the model is about its outputs, whereas output uncertainty is the uncertainty of the user about the model outputs. While confidence scores might possibly be a viable method to manage output uncertainty in specific situations, both concepts can also be detached from each other. An algorithm can be correct despite low confidence and vice versa. Hence, an uncertainty on the user’s side about the case-by-case correctness of the model output can persist, even for very high or very low system confidence.</p>
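        <p>This decoupling of model confidence and case-by-case correctness can be illustrated with a small example; the numbers below are invented solely for illustration.</p>
        <preformat>
# Invented toy data: model confidence vs. actual correctness per case.
predictions = [
    (0.97, False),  # confidently wrong
    (0.93, True),
    (0.56, True),   # hesitantly right
    (0.52, False),
]

for confidence, correct in predictions:
    print(f"confidence={confidence:.2f} -> correct={correct}")

# Neither a high nor a low confidence score tells the user whether
# this particular output is correct; the user's output uncertainty
# about the individual case persists across all confidence levels.
</preformat>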
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Conceptual Implications</title>
        <p>A more holistic perspective on human-centered AI deployment with a focus on output uncertainty has conceptual implications in two distinct but closely related ways: in terms of how we conceptualize AI and its usage, and in terms of the design solutions we consider. We discuss both briefly in the following.</p>
        <sec id="sec-3-4-1">
          <title>3.4.1. Recalibrating conceptualizations of AI towards human-centered, user-empowering tools</title>
          <p>As discussed in Section 2.1.2, the currently predominant algorithm-centered conceptualization of AI is problematic. The emphasis on transparency and explainability tends to reinforce such a conceptualization. This is not because demands for these properties would be wrong, but because the isolated focus on them is based on the assumption that AI is necessarily implemented as fully automatic systems that need human supervision. Looking solely at transparency and explainability does not question whether this assumption is compatible with human cognitive limits; it is merely concerned with making this inherently algorithm-centered paradigm more user-friendly at best (see Fig. 1, left).</p>
          <p>Focusing on output uncertainty on the other hand has the potential to recalibrate the conceptualization of AI towards a more human-centered direction. As shown by decades of research in human-automation interaction, humans are inherently not well suited to deal with output uncertainty in a passive supervisory role [9]. Therefore, when considering effective ways to manage output uncertainty, it is necessary to consider how to actively engage the user in performing the task. In this way, addressing the problem of output uncertainty encourages a shift towards giving the user an active role at the center. AI systems would be designed around the user, as tools that enhance the user’s ability to perform the task.</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>3.4.2. Recalibrating design thinking with regard to AI towards more holistic, interactive designs</title>
          <p>Solely focusing on transparency and explainability is an example of jumping straight to the solution without proper search for the root problem first. The result is that a large part of the solution space is not considered at all. The proposed focus on output uncertainty would no longer regard complexity and opaqueness as the root problems, but as contributing factors to output uncertainty. This means that transparency and explainability are not seen as final goals, but as possible building blocks for effective human-AI interactions. By this reframing of the problem, the focus on output uncertainty can open up the ideation activities of a design thinking process for a much wider range of possible solutions. For instance, ideation does not always need to focus on how to make fully automatic decision aids more transparent and explainable. A solution could instead revolve more around how to allow users to steer the algorithm so that it enhances the user’s abilities while actively performing the task him- or herself (see Fig. 1, right). This could involve more exploration of effective input techniques and how transparency could be integrated with those to enable feedback in interactions with the system.</p>
          <fig id="fig-1">
            <caption>
              <p>Figure 1: Left: the human in a passive supervisory role over a fully automatic AI, supported by transparency and explanations. Right: the AI supports the user with feedback, while the user actively steers it through intentions, guidance, and modifications.</p>
            </caption>
          </fig>
          <p>The literature provides several promising examples of what such more holistic, interactive designs could look like. Cai et al. developed a deep learning-based image retrieval system for medical decision making with three different tools to help physicians refine the retrieved results [26]: cropping to indicate important regions of an image, pinning of examples that contain the searched-for concept, and sliders to (de-)emphasize certain clinical concepts. Weber et al. proposed an image restoration tool where the user can iteratively control and guide the inpainting algorithm by manually painting directly onto the image to be restored [27]. Heer presented three case studies in the domains of data cleansing and formatting, data exploration, and natural language translation [28]. In all of his case studies, the predictive models work on a task representation shared with the user, and are integrated into interactive systems such that they provide helpful assistance to the user.</p>
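          <p>As an illustration of this interaction pattern, the following sketch loosely mimics the slider-based refinement of Cai et al. [26]; the concept scores and weights are invented for illustration and do not reproduce their actual system. Instead of merely accepting or rejecting a finished result list, the user steers the ranking towards the searched-for concept.</p>
          <preformat>
# Minimal sketch of user-steerable retrieval in the spirit of the
# refinement sliders of Cai et al. [26]; scores and weights invented.

# Per-concept relevance scores produced by the underlying model:
results = [
    ("image_a", {"tumor": 0.9, "inflammation": 0.1}),
    ("image_b", {"tumor": 0.4, "inflammation": 0.8}),
    ("image_c", {"tumor": 0.7, "inflammation": 0.5}),
]

def rerank(weights):
    """Order results by the user's current (de-)emphasis sliders."""
    def score(item):
        _, concepts = item
        return sum(weights.get(c, 1.0) * v for c, v in concepts.items())
    return [name for name, _ in sorted(results, key=score, reverse=True)]

print(rerank({"tumor": 1.0, "inflammation": 1.0}))
# -> ['image_b', 'image_c', 'image_a']  (model's default view)
print(rerank({"tumor": 2.0, "inflammation": 0.2}))
# -> ['image_a', 'image_c', 'image_b']  (user emphasizes "tumor")
</preformat>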
          <p>All these exemplary designs allow users to manage the output uncertainty of the underlying AI algorithms. Instead of being an all-or-nothing affair depending on whether the algorithms are right or wrong, these systems provide helpful assistance to their users even in cases where their output is not entirely correct. Users can still work with imperfect outputs by manipulating the results. Furthermore, users can work forwards towards their goals, instead of being forced to work backwards from an automatic AI output. Transparency in these systems is therefore not achieved by providing explicit explanations, but by actively engaging the user and giving the user control in performing the task.</p>
          <p>Note that the presented examples bear a strong resemblance to techniques of interactive machine learning (iML) [29], where interactive user feedback is a key concern. However, our focus is on managing output uncertainty, while iML has the specific purpose to make machine learning more accessible to users who are not machine learning practitioners. The ultimate goal of iML is therefore to improve the performance of the algorithm through a well designed, usable training process. We regard output uncertainty as a more fundamental issue that plagues end user interactions with AI in general. Techniques from iML can be important contributions to designs that effectively manage output uncertainty, though.</p>
          <p>We reiterate that our point is not to rule out fully automatic systems with transparent and explainable interfaces. Instead, we call for a more complete view of the solution space by focusing on output uncertainty. We see two pillars to this: (1) We need a framework for when fully automatic systems are appropriate, and when more interactive solutions are necessary to manage output uncertainty. (2) There is currently little understanding of how to design more interactive AI systems like those mentioned above. We therefore see a need for more research into pertinent guidelines and techniques.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In our view, complexity and opaqueness are not the root problems for end users when interacting with AI, as is commonly assumed. Instead, these properties contribute to what we see as the actual problem that needs to be addressed: output uncertainty. We believe that effectively addressing output uncertainty requires more holistic, interactive designs than merely transparent and explainable interfaces. Such designs are not all-or-nothing affairs depending on the correctness of the algorithm output; they allow users to work forwards towards their goal instead of backwards from the AI output; and they allow the AI system to effectively support the user. Overall, such designs would be much more human-centered. However, we still need a much better understanding of how to achieve these designs and when it is appropriate to choose fully automatic system designs instead.</p>
      <p>We believe that such a human-centered approach that goes beyond transparency and explainability is necessary to overcome the barriers to AI deployment concerning safety, ethics and social justice. Therefore, we initially plan to develop our line of thinking concretely into a concept for assessing human factors in the certification of AI systems in the aviation domain. Our long-term goal is to extend the expected results of this project to other high-stakes domains as well.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was supported by the German Federal Ministry for Economic Affairs and Energy (BMWi) under the LuFo VI-1 program, project KIEZ4-0.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="ref1"><label>[1]</label><mixed-citation>D. Heaven, Deep trouble for deep learning, Nature 574 (2019) 163–166. doi:10.1038/d41586-019-03013-5.</mixed-citation></ref>
      <ref id="ref2"><label>[2]</label><mixed-citation>S. Legg, M. Hutter, A collection of definitions of intelligence, arXiv:0706.3639 [cs] (2007).</mixed-citation></ref>
      <ref id="ref3"><label>[3]</label><mixed-citation>S. T. Völkel, C. Schneegass, M. Eiband, D. Buschek, What is “intelligent” in intelligent user interfaces? A meta-analysis of 25 years of IUI, in: Proceedings of the 25th International Conference on Intelligent User Interfaces, IUI ’20, ACM, 2020, pp. 477–487. doi:10.1145/3377325.3377500.</mixed-citation></ref>
      <ref id="ref4"><label>[4]</label><mixed-citation>Q. Yang, A. Steinfeld, C. Rosé, J. Zimmerman, Re-examining whether, why, and how human-AI interaction is uniquely difficult to design, in: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, ACM, 2020, pp. 174:1–174:13. doi:10.1145/3313831.3376301.</mixed-citation></ref>
      <ref id="ref5"><label>[5]</label><mixed-citation>J. S. Albus, Outline for a theory of intelligence, IEEE Transactions on Systems, Man, and Cybernetics 21 (1991) 473–509. doi:10.1109/21.97471.</mixed-citation></ref>
      <ref id="ref6"><label>[6]</label><mixed-citation>B. Shneiderman, P. Maes, Direct manipulation vs. interface agents, Interactions 4 (1997) 42–61. doi:10.1145/267505.267514.</mixed-citation></ref>
      <ref id="ref7"><label>[7]</label><mixed-citation>B. Shneiderman, Human-centered artificial intelligence: Three fresh ideas, AIS Transactions on Human-Computer Interaction 12 (2020) 109–124. doi:10.17705/1thci.00131.</mixed-citation></ref>
      <ref id="ref8"><label>[8]</label><mixed-citation>L. Bainbridge, Ironies of automation, Automatica 19 (1983) 775–779. doi:10.1016/0005-1098(83)90046-8.</mixed-citation></ref>
      <ref id="ref9"><label>[9]</label><mixed-citation>M. R. Endsley, From here to autonomy: Lessons learned from human–automation research, Human Factors: The Journal of the Human Factors and Ergonomics Society 59 (2017) 5–27. doi:10.1177/0018720816681350.</mixed-citation></ref>
      <ref id="ref10"><label>[10]</label><mixed-citation>A. Abdul, J. Vermeulen, D. Wang, B. Y. Lim, M. Kankanhalli, Trends and trajectories for explainable, accountable and intelligible systems: An HCI research agenda, in: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI ’18, ACM, 2018, pp. 582:1–582:18. doi:10.1145/3173574.3174156.</mixed-citation></ref>
      <ref id="ref11"><label>[11]</label><mixed-citation>F. Doshi-Velez, B. Kim, Towards a rigorous science of interpretable machine learning, arXiv:1702.08608 [cs, stat] (2017).</mixed-citation></ref>
      <ref id="ref12"><label>[12]</label><mixed-citation>Z. C. Lipton, The mythos of model interpretability, Queue 16 (2018) 31–57.</mixed-citation></ref>
      <ref id="ref13"><label>[13]</label><mixed-citation>A. Adadi, M. Berrada, Peeking inside the black-box: A survey on explainable artificial intelligence (XAI), IEEE Access 6 (2018) 52138–52160. doi:10.1109/ACCESS.2018.2870052.</mixed-citation></ref>
      <ref id="ref14"><label>[14]</label><mixed-citation>C. J. Cai, J. Jongejan, J. Holbrook, The effects of example-based explanations in a machine learning interface, in: Proceedings of the 24th International Conference on Intelligent User Interfaces, IUI ’19, ACM, 2019, pp. 258–262. doi:10.1145/3301275.3302289.</mixed-citation></ref>
      <ref id="ref15"><label>[15]</label><mixed-citation>F. Yang, Z. Huang, J. Scholtz, D. L. Arendt, How do visual explanations foster end users’ appropriate trust in machine learning?, in: Proceedings of the 25th International Conference on Intelligent User Interfaces, IUI ’20, ACM, 2020, pp. 189–201. doi:10.1145/3377325.3377480.</mixed-citation></ref>
      <ref id="ref16"><label>[16]</label><mixed-citation>H.-F. Cheng, R. Wang, Z. Zhang, F. O’Connell, T. Gray, F. M. Harper, H. Zhu, Explaining decision-making algorithms through UI: Strategies to help non-expert stakeholders, in: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, ACM, 2019, pp. 559:1–559:12. doi:10.1145/3290605.3300789.</mixed-citation></ref>
      <ref id="ref17"><label>[17]</label><mixed-citation>V. Lai, C. Tan, On human predictions with explanations and predictions of machine learning models: A case study on deception detection, in: Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, ACM, 2019, pp. 29–38. doi:10.1145/3287560.3287590.</mixed-citation></ref>
      <ref id="ref18"><label>[18]</label><mixed-citation>E. Chu, D. Roy, J. Andreas, Are visual explanations useful? A case study in model-in-the-loop prediction, arXiv:2007.12248 [cs, stat] (2020).</mixed-citation></ref>
      <ref id="ref19"><label>[19]</label><mixed-citation>B. Green, Y. Chen, The principles and limits of algorithm-in-the-loop decision making, Proceedings of the ACM on Human-Computer Interaction 3 (2019) 50:1–50:24. doi:10.1145/3359152.</mixed-citation></ref>
      <ref id="ref20"><label>[20]</label><mixed-citation>Y. Alufaisan, L. R. Marusich, J. Z. Bakdash, Y. Zhou, M. Kantarcioglu, Does explainable artificial intelligence improve human decision-making?, arXiv:2006.11194 [cs, stat] (2020).</mixed-citation></ref>
      <ref id="ref21"><label>[21]</label><mixed-citation>Y. Zhang, Q. V. Liao, R. K. E. Bellamy, Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making, in: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* ’20, ACM, 2020, pp. 295–305. doi:10.1145/3351095.3372852.</mixed-citation></ref>
      <ref id="ref22"><label>[22]</label><mixed-citation>F. Poursabzi-Sangdeh, D. G. Goldstein, J. M. Hofman, J. W. Vaughan, H. Wallach, Manipulating and measuring model interpretability, arXiv:1802.07810 [cs] (2019).</mixed-citation></ref>
      <ref id="ref23"><label>[23]</label><mixed-citation>G. Bansal, T. Wu, J. Zhou, R. Fok, B. Nushi, E. Kamar, M. T. Ribeiro, D. S. Weld, Does the whole exceed its parts? The effect of AI explanations on complementary team performance, arXiv:2006.14779 [cs] (2020).</mixed-citation></ref>
      <ref id="ref24"><label>[24]</label><mixed-citation>M. Eiband, D. Buschek, A. Kremer, H. Hussmann, The impact of placebic explanations on trust in intelligent systems, in: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, CHI EA ’19, ACM, 2019, pp. LBW0243:1–LBW0243:6. doi:10.1145/3290607.3312787.</mixed-citation></ref>
      <ref id="ref25"><label>[25]</label><mixed-citation>R. R. Hoffman, M. Johnson, J. M. Bradshaw, A. Underbrink, Trust in automation, IEEE Intelligent Systems 28 (2013) 84–88. doi:10.1109/MIS.2013.24.</mixed-citation></ref>
      <ref id="ref26"><label>[26]</label><mixed-citation>C. J. Cai, E. Reif, N. Hegde, J. Hipp, B. Kim, D. Smilkov, M. Wattenberg, F. Viegas, G. S. Corrado, M. C. Stumpe, M. Terry, Human-centered tools for coping with imperfect algorithms during medical decision-making, in: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, ACM, 2019, pp. 4:1–4:14. doi:10.1145/3290605.3300234.</mixed-citation></ref>
      <ref id="ref27"><label>[27]</label><mixed-citation>T. Weber, H. Hußmann, Z. Han, S. Matthes, Y. Liu, Draw with me: Human-in-the-loop for image restoration, in: Proceedings of the 25th International Conference on Intelligent User Interfaces, IUI ’20, ACM, 2020, pp. 243–253. doi:10.1145/3377325.3377509.</mixed-citation></ref>
      <ref id="ref28"><label>[28]</label><mixed-citation>J. Heer, Agency plus automation: Designing artificial intelligence into interactive systems, Proceedings of the National Academy of Sciences 116 (2019) 1844–1850. doi:10.1073/pnas.1807184115.</mixed-citation></ref>
      <ref id="ref29"><label>[29]</label><mixed-citation>J. J. Dudley, P. O. Kristensson, A review of user interface design for interactive machine learning, ACM Transactions on Interactive Intelligent Systems 8 (2018) 8:1–8:37. doi:10.1145/3185517.</mixed-citation></ref>
    </ref-list>
  </back>
</article>