<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assessing the Value of Transparency in Recommender Systems: An End-User Perspective</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eric S. Vorm</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrew D. Miller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indiana University-Purdue University Indianapolis</institution>
          ,
          <addr-line>Indianapolis, IN</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <abstract>
        <p>Recommender systems, especially those built on machine learning, are increasing in popularity, as well as complexity and scope. Systems that cannot explain their reasoning to end-users risk losing trust with users and failing to achieve acceptance. Users demand interfaces that afford them insights into internal workings, allowing them to build appropriate mental models and calibrated trust. Building interfaces that provide this level of transparency, however, is a significant design challenge, with many design features that compete, and little empirical research to guide implementation. We investigated how end-users of recommender systems value different categories of information to help in determining what to do with computer-generated recommendations in contexts involving high risk to themselves or others. Findings will inform future design of decision support in high-criticality contexts.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>New machines are endowed with increasing levels of authority
and unprecedented scope. Decisions previously made by humans
are increasingly being made by computers, often with little or no
explanation, raising concerns over a host of social, legal, and
ethical issues such as privacy, bias, and safety.</p>
      <p>
        Transparency is often discussed in terms of back-end
programming or troubleshooting. End-users, especially novice
users interacting with recommender systems, are seldom
studied. Yet recent developments in AI suggest that automated
recommendations will be an increasingly common component of users'
daily lives as technologies such as self-driving cars and IoT-enabled
smart homes become commonplace. Developing methods to increase
the transparency of computer-generated recommendations, as well
as understanding user information needs as a means to increase
trust and engagement with recommendations, is therefore crucial.
Transparent interface design is often complicated by
a series of trade-offs that seek to balance and prioritize several
competing design principles. Striking the appropriate balance between
too much and not enough information is often more art than science,
and is becoming more difficult with the growing prevalence of
data-driven paradigms such as machine learning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Efforts towards improving the transparency of recommender
systems commonly involve programming system-generated
explanations that seek to justify the recommendation to users, often through
the use of system logic [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Providing explanations and
justifications of system behavior to users has proven to be a highly
effective means of increasing user acceptance and enhancing user attitudes
towards recommender systems [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Studies have shown that providing
explanations to users tends to increase trust [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], improves user
comprehension [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], calibrates appropriate reliance on decision aids [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
and enables better detection and correction of system errors [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
Generating explanations that users find both useful and satisfactory,
however, can be a complicated task, and much research has been
conducted to try to answer the question of "what makes a good
explanation" [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        While system-generated explanations represent the most common
approach to transparency in recommender systems, in many cases
simply providing users access to certain types of information can
also improve transparency, and can dramatically improve user
experience and the likelihood of further interaction [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In some contexts,
affording users the opportunity to see into the system’s dependencies,
policies, limitations, or information about how the user is modeled
and considered by the system can facilitate the same level of user
understanding (and subsequent trust) as an explicit explanation [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Providing targeted information as a means of improving a user’s
mental model and trust (i.e., transparency) has two potential benefits
over the building of explanation interfaces. First, it affords users an
opportunity to use deductive reasoning to determine the merit and
validity of system recommendations, which has been demonstrated
to improve usability and user trust in many contexts. For instance,
Swearingen and Sinha reported that recommender systems whose
interfaces provided information that could help users understand
the system were preferred over those that did not [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Research
in cognitive agents has also demonstrated that providing users
access to underlying system information, such as system dependencies
or provenance of data, can greatly improve human-machine
performance and reduce the likelihood of users acting on recommendations
that are erroneous, known as "errors of commission" [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. A second
benefit of affording users the opportunity to see into the system in
order to understand its processes is that it takes little to no additional
programming. This is often because much of the information that
could enhance user understanding of system functions and behaviors
is already present in the system, but is often hidden from front-end
interfaces in order to reduce clutter and streamline layouts.
      </p>
      <p>This trade-off between providing adequate information to
communicate a system’s intent and achieving a user-friendly interface
design is a common challenge, often resolved through iterative
design evaluations involving user testing. While research involving
transparency in system design frequently focuses on behavioral
outcomes, such as modeling the appropriateness of a user’s interaction
with a recommender system, little is known about what information
is most efficacious to users in terms of improving mental models,
resolving conflicts caused by unexpected or unanticipated system
behaviors, or improving user trust and technology acceptance.
Answering these questions requires an investigation into how users
subjectively value and prioritize different categories of information
when attempting to resolve conflicts between expected and observed
system behaviors, or when evaluating the validity or accuracy of a
recommendation in order to determine whether to accept or reject it.</p>
      <p>
        To accomplish this, we used an approach known as Q-Methodology,
commonly referred to as the systematic study of subjectivity [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
To constrain our work and prevent overgeneralization of findings,
we chose to investigate what information users value most when
engaged with recommender systems in a highly critical decision
scenario. We hypothesize that users involved in tasks that involve
a high degree of personal risk or risk to others are more likely to
critically interrogate computer-generated recommendations before
accepting and acting upon them. This suggests that systems
providing recommendations in highly critical decision contexts, such as
medical, legal, financial, or automotive domains, amongst others,
would benefit most from efforts to develop interfaces that enable
users to quickly and accurately discern whether or not to trust those
recommendations. Using the decision criticality framework as a
guide, we developed a hypothetical recommender system named
the Oncological Neural Network Prognosis and Recommendation
(ONNPAR) System. ONNPAR was modeled after modern
clinical decision support systems offering recommendations, and was
designed to serve as the highly-critical decision scenario for our
research.
      </p>
    </sec>
    <sec id="sec-2">
      <title>METHODS</title>
    </sec>
    <sec id="sec-3">
      <title>A brief introduction to Q-Methodology</title>
      <p>
        Q-methodology is distinctly different from "R" methodology,
and those distinctions should be addressed. R-methodology
samples respondents who are representative of a larger population,
and measures them for the presence or expression of certain
characteristics or traits. These measurements are made objectively, as the
opinions of respondents are seen as potentially confounding and are
therefore controlled. Using inferential statistics, findings are then
abstracted to predict prevalence and generalize findings to a larger
target population [
        <xref ref-type="bibr" rid="ref50">50</xref>
        ].
      </p>
      <p>Q-methodology, on the other hand, invites participants to directly
express their subjective opinions on a given topic by sorting
statements (or questions) into a hierarchy that represents what is most
or least important to them. Each participant’s arrangement of
statements or questions represents an individual person’s point of view
about a given topic, which ordinarily would not be of much value
beyond understanding the points of view present in that particular
group of individuals. Through the use of factor analysis, however,
patterns of subjective opinion are uncovered, which reveal a
structure of thoughts and beliefs surrounding a given topic and context.
We can use these findings to understand or model a phenomenon, or
in our example, infer the potential value of different design features
through user input that is both qualitatively rich, yet statistically
sound.</p>
      <p>
        In Q-methodology, participants are given a bank of statements,
each one on a separate card (or electronically using specialized
software), and asked to rank order them in a forced distribution
grid according to some measure of affinity or agreement,
depending on the context of the study [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. For our study, we employed
q-methodology as a design-elicitation tool, similar to traditional
iterative design strategies involving user evaluation of prototype
designs. In this way, we provided participants with questions, each
representing a design feature or suite of features that could be
provided through a user interface (UI). We asked participants to sort
these statements in a forced distribution, such as the one shown
in figure 2, ranking them from most important to least important
to them. Then, through the use of factor analysis, we analyzed the
different ways that users value and prioritize these questions, thus
inferring what design elements may add to or detract from an optimal
user experience [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and quantifying the potential value of different
categories of information to users in the context of improving the
transparency of recommender system interfaces.
      </p>
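      <p>The forced-distribution sort can be made concrete with a short sketch that checks a completed sort against fixed column quotas. The quasi-normal quotas below are illustrative assumptions for a 36-item, 11-column (-5 to +5) grid, not the exact grid used in the study, and the helper names are ours.</p>

```python
# Sketch of a forced-distribution Q-sort check. The column quotas are
# illustrative assumptions (a quasi-normal shape), not the study's grid.
from collections import Counter

# Columns -5..+5; quotas sum to 36, the size of the question bank.
QUOTAS = {-5: 1, -4: 2, -3: 3, -2: 4, -1: 5, 0: 6,
          1: 5, 2: 4, 3: 3, 4: 2, 5: 1}

def is_valid_sort(sort):
    """sort maps each question id to the column (-5..+5) it was placed in.
    A sort is valid only if every column quota is filled exactly."""
    return Counter(sort.values()) == Counter(QUOTAS)

# Build an example sort that fills each quota exactly.
example = {}
qid = 0
for column, quota in QUOTAS.items():
    for _ in range(quota):
        example[qid] = column
        qid += 1

print(is_valid_sort(example))  # → True
```

      <p>Rejecting any sort that does not fill every column exactly is what forces participants to prioritize a few questions over the rest.</p>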
    </sec>
    <sec id="sec-4">
      <title>Model Development</title>
      <p>
        The first step for our study was to ensure that our approach was
representative of the technical and theoretical issues related to
transparency in recommender systems (i.e., ontological homogeneity).
To accomplish this we used a combination of analytic and inductive
techniques, combining findings from a systematic literature review
with user input from a user-centered design workshop conducted for
a previous project [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>We also sought out the advice and guidance of subject matter
experts (SMEs) to ensure that all technical and theoretical aspects
of the concept of transparency in recommender systems had been
addressed. We conducted informal interviews with a combination
of academics who regularly conduct research in the fields of
machine learning and intelligent systems, as well as applied researchers
currently engaged in the development and design of recommender
systems for industry. In total, nine SMEs were consulted and asked
to review our preliminary categorization structure, and to offer
suggestions for other technical or theoretical issues not already captured
by our approach.</p>
      <p>The result was a five-factor model of transparency in
recommender systems. These categories consist of Data, Personal, System,
Options, and Social. We briefly describe and discuss the relevance
of these categories below.</p>
      <p>
        System Parameters and Logic: Understanding the perspective of
another in order to anticipate their actions or understand their
intentions is the process known as building a mental model [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
Information related to how a system works, including its policies, logic,
and limitations, can help users build an appropriate mental model of
the system. This is often critical, as many accidents, particularly in
high-risk domains such as aviation, have resulted from users having
an inappropriate or inaccurate mental model of system functionality
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]-[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>
        Having knowledge of how a system functions can also help in
determining when the system may be in error. Numerous studies
have demonstrated that providing information about how the
system processes information can improve the detection of system
errors and faults [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]-[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], and can thereby lower so-called ’errors of
commission’ [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. These studies indicate that providing users with
information that assists their understanding of system functionality
may be a viable way to improve the transparency of recommender
systems.
      </p>
      <p>
        Qualities of Data: In many instances, understanding the
relationship of dependencies present in a system can provide meaningful
insights into that system’s functionality. A computer program may
be functioning perfectly, but if the data on which it is operating is
exceedingly noisy or corrupt, its outputs may still be incorrect or
inappropriate. Numerous real-world examples from accidents such as
the Space Shuttle Challenger and the Navy warship USS Vincennes
serve as a testament to the importance of providing information on
the quality and provenance of the underlying data to decision makers
[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
      </p>
      <p>
        Efforts to make data-related information available to users of
machine learning applications have been shown to result in higher
user ratings of ease of understanding, meaningfulness, and
convincingness [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Advances in visual analytic approaches have also
improved the comprehensibility and intelligibility of data to users
by presenting it in a manner that is more readily understood [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
Different visualization techniques have also been demonstrated to
improve users' understanding of cause-and-effect relationships
between variables, even among users with little to no data analytical
background (i.e., data novices, [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]).
      </p>
      <p>Just as it is important to consider the source as well as the quality
of information, so too must users be able to see into the system and
understand the data on which it is operating. The current data-driven
paradigm of machine learning, therefore, necessitates information
that can help users answer questions about the qualities of the
system’s data. Affording users the ability to see this data may well
improve the transparency of a system’s interface from the user’s
perspective.</p>
      <p>
        User Representation: The concept of personalization is central
to the discussion of transparency in a variety of intelligent system
domains such as context-aware and automated-collaborative filtering
applications [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]-[
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. Users often want to understand how they
are modeled by a system, if at all, and to what extent system outputs
are personalized for them. While commercial applications such as
personalized targeted advertisement algorithms are an important
component of this category, the importance of user representation
extends well beyond the suitability of computer-generated
recommendations like movies or music titles.
      </p>
      <p>Future machine learning applications are expected to encompass
a variety of domains that may very well necessitate extensive
explanation of how users are represented by computer systems in order to
achieve user buy-in and acceptance. For example, in the domain of
personal financial trading, a machine learning algorithm may possess
a model of risk that is very different from its user, and may perhaps
prioritize one aspect of financial growth, such as diversification, over
other aspects that the user may prioritize more, such as long-term
stability. Understanding what a system knows about its user, and
how that information is subsequently used to derive
recommendations, is therefore of potential critical importance for applications to
achieve acceptable levels of user trust, engagement, and technology
acceptance.</p>
      <p>
        Social Influence: The power of social media has been displayed
in a variety of contexts over the past decade of its modern existence,
and has become a powerful tool for marketers and influencers. As
of August 2017, two thirds of Americans (67%) reported that they
received at least some of their news from social media [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. Systems
that group users according to online behavior in order to predict
future interests and purchases, such as automated collaborative
filtering algorithms, are abundant, and represent a foundational approach
to modern marketing and sales [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]. In many cases, a user’s
understanding of how they are grouped by a system using social media
information can provide meaningful insights into why a system
output, such as a targeted advertisement, was generated. This is most
important when conflicts arise between a user and an inappropriate
system output. These are often the result of loose affiliations on
social media with others who may hold radically opposing
philosophical or political viewpoints, which some recommender systems
incorrectly associate into their models. Providing users opportunities
to see into a system and understand how they, the user, are
categorized and represented in a social group, may improve user experience
and trust, leading some users to remain more willing to interact with
a system after such a conflict arises. There is also some evidence
that some decision making may be socially mediated.
      </p>
      <p>
        Scientists have long studied the broad range of effects that social
influences can have on decision making and behavior. These can include
various social biases [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] which can explain in limited cases how some
people sometimes defer their decision making to a group or other
individual, even when it would seem prudent not to do so [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ].
Additionally, many people express the importance of social relationships
in guiding and assisting in decision-making. In a 2017 Pew Research
Poll, 74% of American respondents reported that their social circles
played at least a small role in their decision making; 37% reported
it played a significant role [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. Systems that afford information
that connects a user’s system interaction with their social circles,
may well improve user satisfaction and usability. For example, if
we imagine a user attempting to determine whether or not to accept
or reject a recommendation, in some contexts, social information,
such as the prevalence of that recommendation to others in their
social circle, or a ratio of accept/reject decisions from their friends
or family, could prove to be valuable to some people, and could be
used as a decision heuristic.
      </p>
      <p>
        Options: People often express a preference for choice over no
choice in most decision-making contexts [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]. Accordingly, many
systems strive to offer choices to users as a means of increasing
engagement and satisfaction [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ]. There are times, however, when
providing multiple choices to a user may be undesirable.
      </p>
      <p>For example, most navigation systems output at most three route
choices to the user, and typically highlight the one recommended
by the system. There may be, of course, several hundreds or even
thousands more options available to the user, but displaying them all
would be unlikely to benefit the user, and may in fact lead them to discard
the technology due to its confusing and cluttered interface.</p>
      <p>
        This "tyranny of choices" [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ] is even more evident in light of the
size and scope of many machine learning models, especially those
involving deep learning. In these circumstances, it is practically
infeasible to display every possible optional output to the user.
      </p>
      <p>
        Common interface design strategies involve efforts that reduce
choices in order to lessen cognitive load and improve the speed
and efficiency of decision making [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ]. Determining the trade-offs
between interface aesthetics (i.e., clutter) and user preference for
options is often a challenge for engineers and designers alike.
Sometimes, these decisions are determined by external factors, such as
corporate policy, or mandated safety requirements [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ]. But in some
contexts, users may want more options than they are often provided,
or, at the very least, users may want to know whether or not other
options exist before engaging in a decision. Closely related to this is
the importance of providing some justification of why one option is
deemed better than another.
      </p>
      <p>
        Much has been written about the role that system explanations or
justifications can have on a person’s interaction with or sentiment
towards intelligent systems [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ], [
        <xref ref-type="bibr" rid="ref43">43</xref>
        ]. Users often demand some
form of justification from a system to help them determine the merit
of an output such as a recommendation [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. There are a variety of
sub-categories of this concept, too, such as why one option is NOT
the best (known as counterfactual explanation).
      </p>
      <p>
        The range of discussions over how precisely to engineer
explanation systems in a format that is meaningful and understood by the
user under different circumstances is the subject of much current
discussion in the intelligent systems communities of practice,
especially related to machine learning (for an exhaustive review, see
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]). Much of this is beyond the scope of the current paper, but
for the purposes of this discussion, suffice it to say that the ability
for systems to offer explanations of their outputs is central to the
concept of transparency in recommender systems.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Concourse and Q-sort development</title>
      <p>
        Having identified these five factors, we then created a bank of
questions for our participants to sort. This bank is known in
Q-methodology parlance as a 'concourse.' A goal of developing a
concourse is to create as many statements as possible to ensure a
comprehensive and saturated pool of opinions or sentiment from
which to sample. We used Ram’s taxonomy of question types as
an initial starting point to ensure that we used a variety of question
types [
        <xref ref-type="bibr" rid="ref44">44</xref>
        ]. This was then refined using Silveira et al.'s taxonomy
of user’s frequent doubts [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ]. The initial concourse consisted of
71 questions. We then refined this concourse down to a reasonable
bank of 36 questions with the help of five individuals who are
subject-matter experts in recommender systems (either professors in
Cognitive Psychology with experience with recommender systems,
or programmers of recommender systems). Questions that appeared
redundant were combined, and those that were deemed irrelevant
or unrelated were discarded. Each of the five factors had a roughly
equivalent number of representative questions.
      </p>
      <p>This final bank of 36 questions was randomized and assigned
numbers, then printed on 3x5 index cards. Each participant received
their own deck consisting of 36 individual questions. Participants
were given instructions for how to sort cards from most-to-least
valuable or important to them. Participants were then shown a
vignette on a computer screen or projector. The vignette described
an interaction with ONNPAR, and ended with the user being given a
recommendation which the user must determine whether to
act on or reject. Participants then sorted their cards, and recorded
their arrangement on a form, along with answers to two additional questions on
a questionnaire: "In a few words, please explain WHY you chose your
MOST/LEAST important question to ask."</p>
    </sec>
    <sec id="sec-6">
      <title>RESULTS</title>
      <p>Our participant sample comprised 22 individuals (16 males, 6 females),
aged 22-59, with an average age of 33 years. Expertise was evaluated by
self-report. Participants were classified as novices if they had no
knowledge of or personal use experience with recommender
systems, and experts if they had participated in either the design or
programming of recommender systems.</p>
      <p>In the following sections we briefly describe the methodological
analysis of q-methodology, and then present the findings from our
ONNPAR study. We will describe interpretations and insights from
each of the factor groups of our factor analysis in the discussion
section.
</p>
    </sec>
    <sec id="sec-7">
      <title>Q-method Analysis Overview</title>
      <p>The analysis of q-methodology is quite straightforward. Each
question from the set is assigned a numerical value according to the
column in which it was placed (-5 to +5 for our study). Each participant’s
arrangement of cards is then combined to create a by-person
correlation matrix. This matrix describes the relationship of each
participant’s arrangement of questions with every other participant’s
arrangement (NOT the relationship between items within each
participant). This matrix is then submitted for factor analysis, which
produces factors onto which participants load based on their
arrangements of questions. Two or more participants who load on the same
factor, therefore, will have arranged their questions in a very similar
manner, which represents similar reasoning styles or prioritization.
These factors, or clusters of participants, are then analyzed by
examining what questions were ranked highest and lowest by each group,
as well as examining the similarities and differences between each
factor group.</p>
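      <p>The by-person correlation step can be sketched directly. Assuming each participant's completed sort is recorded as a vector of column values (one per question), a minimal pure-Python version (function names are ours) is:</p>

```python
# Minimal sketch of the by-person correlation matrix: each participant's
# sort is a vector of column values, and we correlate participants with
# each other, not items within a participant.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def by_person_matrix(sorts):
    """sorts: one vector of column values per participant.
    Returns the participant-by-participant correlation matrix."""
    return [[pearson(p, q) for q in sorts] for p in sorts]

# Toy example with three participants and four questions:
sorts = [[-2, -1, 1, 2],
         [-2, 1, -1, 2],   # partially agrees with participant 0
         [2, 1, -1, -2]]   # exactly reverses participant 0's priorities
R = by_person_matrix(sorts)
print(round(R[0][2], 2))  # → -1.0 (perfectly opposed arrangements)
```

      <p>Factor analysis is then run on this matrix, so that participants with similar arrangements load onto the same factor.</p>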
      <p>For simplicity’s sake, we will henceforth refer to factors as
factor groups, since in the context of q-methodology, factor analysis
identifies groups of individuals. The term factor group is not to be
confused with the five-factor model of transparency, used to guide
our investigation.</p>
      <p>
        Several statistical packages are freely available to aid in the
analysis of q-methodology studies. We used a version known as Ken-Q
Analysis [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ].
      </p>
    </sec>
    <sec id="sec-8">
      <title>Factor Analysis</title>
      <p>
        Once all sorts had been entered into our database, they were factor
analyzed using the Ken-Q software. We used principal components
analysis (PCA) because it has been shown to better account for
random, specific, and common error variances [
        <xref ref-type="bibr" rid="ref47">47</xref>
        ]. The unrotated
factor matrix was then analyzed to determine how many factors
to retain for rotation. A significant factor loading at (P&lt;0.01) is
calculated using the equation 2.581√n where n = the number of
questions in our set (36). Individuals with factor loadings of ± .48
were considered to have loaded on a factor and were arranged into a
factor group.
      </p>
      <p>
        For factor extraction, we used the common practice of evaluating
only factors with an eigenvalue greater than one [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. We also
determined that only factors with three or more participants loading on
them would be retained. These steps resulted in four factors, which
were then submitted to rotation according to mathematical criteria
(e.g., varimax). With this four-factor solution, all but one participant
loaded clearly on at least one factor, resulting in four distinct
viewpoints of information priorities and preferences of 21 individuals.
      </p>
      <p>[Table omitted: factor characteristics for each factor group, including the number of defining variables, average reliability coefficient, composite reliability, and standard error of factor z-scores.]</p>
      <p>
Once factor extraction and rotation was complete, we analyzed each
factor group to interpret its meaning. This was first accomplished by
producing a weighted average of each participant’s arrangement of
cards from within their factor group, and combining those
arrangements into one exemplar composite arrangement, which serves as
the model arrangement of questions for that factor group. Once these
composite arrangements, or "factor arrays," have been developed
for each factor group, they can then be analyzed for deeper
interpretation. We next evaluated the questions that were ranked highest
and lowest for each factor array. This provides an early indication of
information priorities, and allows us to begin crafting a picture of
how participants in each factor group tend to think about the value
of each category of information.
</p>
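      <p>The factor-array construction described above can be sketched as follows; the weighting scheme (each defining sort weighted by its factor loading) and the example values are illustrative assumptions, not the study's data:</p>

```python
def factor_array_order(sorts, loadings):
    """Order questions by their loading-weighted average rank.

    sorts    -- one rank list per defining participant
                (sorts[p][q] is participant p's rank for question q)
    loadings -- each participant's loading on the factor, used as a weight
    """
    n_items = len(sorts[0])
    total = sum(loadings)
    # Loading-weighted average rank for every question.
    weighted = [
        sum(w * s[q] for w, s in zip(loadings, sorts)) / total
        for q in range(n_items)
    ]
    # Questions ordered lowest to highest; mapping this ordering back onto
    # the forced distribution (-5 .. +5) yields the composite factor array.
    return sorted(range(n_items), key=lambda q: weighted[q])

# Illustrative example: three defining sorts over four questions.
sorts = [[2, -1, 0, 1], [1, -2, 0, 2], [2, -1, 1, 0]]
loadings = [0.8, 0.6, 0.5]
print(factor_array_order(sorts, loadings))  # [1, 2, 3, 0]
```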
    </sec>
    <sec id="sec-9">
      <title>Factor Groups</title>
      <p>Here we will report the findings from the factor analysis. To do this
we will describe each factor group’s arrangements of the questions
in terms of their highest- and lowest-ranked questions, as well as
positive and negative distinguishing questions. Distinguishing questions
are those where the absolute differences between factor z-scores
are larger than the standard error of differences for a given pair of
factors. All distinguishing questions are significant at (p &lt; .01).</p>
      <p>Factor Group One was defined by eight participants and
explained 22% of the study variance with an eigenvalue of 6.7. Three
of the factor loading participants were females and five were males,
with an average age of 37.5 years. Knowledge of recommender
systems was split between five novices and three experts.</p>
      <p>The highest ranked question of this factor group was "Why is
this recommendation the best option?" (+5) The lowest ranked
question of this factor group was "Is there anyone in my social network
that has received a similar recommendation?" (-5) Other positive
distinguishing questions for the factor one group were (in
descending order): "What are all of the factors (or indicators) that were
considered in this recommendation, and how are they weighted?"
(+4) "Precisely what information about me does the system know?"
and "What does the system think is MY level of "acceptable risk?" (+1)
Negative distinguishing questions for Factor Group One were (in
ascending order): "How much data was used to train this system?"
(-4) "How many other people have received this recommendation
from this system?" (-2) and "What does the system think I want to
achieve?" (-1)</p>
      <p>Factor Group Two was defined by five participants and
explained 13% of the study variance with an eigenvalue of 2.8. All
of the factor loading participants were males, with an average age of
42 years. All but one of this factor group were considered experts
in recommender systems.</p>
      <sec id="sec-9-1">
        <title>Relative Rankings of Questions by Factor Group</title>
        <p>[Table: the highest- and lowest-ranked questions for each of the four factor groups.]</p>
      </sec>
      <sec id="sec-9-2">
        <p>The highest ranked question of this factor
group was "What are all of the factors (or indicators) that were
considered in this recommendation, and how are they weighted?" (+5)
The lowest ranked question of this factor group was "Was this
recommendation made specifically for ME (based on my profile/interests),
or was it made based on something else (based on some other model,
such as corporate profit, or my friend’s interests, etc.)?" (-5) Positive
distinguishing questions for the factor two group were (in
descending order): "How is this data weighted or what data does the system
prioritize?" (+4) "How much data was used to train this system?"
(+2) "Is my data uniquely different from the data on which the
system has been trained?" (+1) Negative distinguishing questions for
the factor two group were (in ascending order): "Is there anyone in
my social network that has received a similar recommendation?"
(-4) "What does the system think is MY level of "acceptable risk?"
(-2) "What if I decline? How will that decision be used in future
recommendations by this system?" (-1) "How is my information
measured and weighted in this recommendation?" (-1)</p>
        <p>Factor Group Three was defined by five participants and
explained 9% of the study variance with an eigenvalue of 1.9. Two of
the factor loading participants were females and three were males,
with an average age of 34 years. All but one of the participants for
this group were considered experts in recommender systems.</p>
        <p>The highest ranked question of this factor group was "Under
what circumstances has this system been wrong in the past?" (+5)
The lowest ranked question of this factor group was "What if I
decline? How will that decision be used in future recommendations
by this system?" (-5) Other positive distinguishing questions for
the factor three group were (in descending order): "What data does
the system depend on in order to work properly, and do we know
if those dependencies are functioning properly?" (+4) "Is my data
uniquely different from the data on which the system has been
trained?" (+3) "What have other people like me done in response
to this recommendation?" (+2) Negative distinguishing questions
for the factor three group were (in ascending order): "What is the
system’s level of confidence in this recommendation?" (-2) "Are
there any other options not presented here?" (-2) "How much data
was used to train this system?" (-1) "How does the system consider
risk, and what is its level of "acceptable risk?" (-1)</p>
        <p>Factor Group Four was defined by three participants and
explained 8% of the study variance with an eigenvalue of 1.7. There
were two males and one female, with an average age of 20 years.
Knowledge of recommender systems was split between two novices
and one expert.</p>
        <p>The highest ranked question of this factor group was "What is
the history of the reliability of this system?" (+5) The lowest ranked
question of this factor group was "What does the system THINK I
want to achieve? (How does the system represent my priorities and
goals?)" (-5) Positive distinguishing questions for the factor four
group were (in descending order): "How many other people have
accepted or rejected this recommendation from this system? (What
is the ratio of approve to disapprove?)" (+4) "Is the system working
with solid data, or is the system inferring or making assumptions on
’fuzzy’ information?" (+3) "How many other people have received
this recommendation from this system?" (+1) Negative
distinguishing questions for the factor four group were (in ascending order):
"Is my data uniquely different from the data on which the system
has been trained?" (-3) "What are all of the factors (or indicators)
that were considered in this recommendation, and how are they
weighted?" (-2) "What have other people like me done in response
to this recommendation?" (-1)
</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>DISCUSSION</title>
      <p>Findings from our factor analysis yielded several surprising insights.
We begin with a discussion of questions that produced a high degree
of either consensus or disagreement amongst factor groups, and then
conclude with a discussion of each factor group.
</p>
    </sec>
    <sec id="sec-12">
      <title>Analysis of Consensus vs. Disagreement Findings</title>
      <p>
        A common technique to examine these data is to explore questions
that created either consensus or a large amount of disagreement in
our sample. By examining the variance between all item rankings,
we can explore what questions were generally agreed on (i.e.,
consensus), and what items produced large disagreement. For instance,
all participants ranked "How clean or accurate is the data used in
making this recommendation?" as either 0 or -1, indicating that this
question was only moderately valuable to them in the context of a
clinical decision support system. This is potentially valuable
information for designers to consider, given that the fuzziness of data is
sometimes displayed to users as a method of enhancing system
transparency [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ]. Given these findings, it may be useful to reconsider
displaying information about the qualities of data to users in favor
of other types of information deemed more useful or valuable.
      </p>
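      <p>The variance screen described above can be sketched as follows, using one question's rankings across the four factor arrays (here, the rankings reported in this section for the "made specifically for ME" question):</p>

```python
def rank_variance(ranks):
    """Population variance of one question's ranks across factor arrays;
    low variance signals consensus, high variance signals disagreement."""
    mean = sum(ranks) / len(ranks)
    return sum((r - mean) ** 2 for r in ranks) / len(ranks)

# Factor groups one through four ranked the "made specifically for ME?"
# question +4, -5, +3, and -4 respectively.
print(rank_variance([4, -5, 3, -4]))  # 16.25
```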
      <p>Similarly, we can learn much from these data by evaluating
questions that produced a great deal of disagreement between factor
groups. For instance, the question "Was this recommendation made
specifically for ME (based on my profile/interests), or was it made
based on something else (based on some other model, such as
corporate profit, or my friend’s interests, etc.)?" had the largest variance,
with factor groups one and three assigning it a positive value (4 and
3), and factor groups two and four assigning it a negative value (-5
and -4). Interestingly, factor group two assigned this question as
the least valuable or important question of their q-set, while factor
group one assigned this question as their second most valuable or
important question.</p>
      <p>Interpreting these findings can, at first glance, appear confounding
to a designer looking for clear guidance. Clearly, some individuals
would prefer to have information that could indicate how they, as
a user, are modeled and considered (if at all) in system-generated
recommendations as a means of improving their trust, while others
clearly discount the value of this kind of information. These findings
suggest that information about how a user is modeled and represented
by the system may at times be valuable
to some users in helping determine whether to accept or reject
a recommendation.</p>
      <p>Two other questions also produced wide disagreement across
factor groups. "How many other people have accepted or rejected
this recommendation from this system? (What is the ratio of approve
to disapprove?)" and "Is there anyone in my social network that has
received a similar recommendation?" were ranked near the poles by
different factor groups. This indicates that social media-related
information in highly-critical contexts, while not important
to some users, is still considered by others to be a valuable
component for enhancing their
understanding of and trust in system-generated recommendations.
</p>
    </sec>
    <sec id="sec-13">
      <title>CONCLUSION</title>
      <p>We have illustrated our five-factor model of information categories
that can be used to increase the transparency of recommender
systems to end users. We developed a bank of 36 questions representing
information-gathering strategies that users could use to interrogate
system-generated recommendations in an effort to understand the
system's reasoning and decide whether to accept or reject a recommendation.</p>
      <sec id="sec-13-1">
        <title>Consensus Versus Disagreement</title>
        <p>[Table: questions producing the greatest consensus and the greatest disagreement, with factor z-scores and variances. Consensus questions: "Can I influence the system by providing feedback? Will it listen and consider my input?"; "How clean or accurate is the data used in making this recommendation?"; "How often is the system checked to make sure it is functioning as it was designed (i.e., for model accuracy)?" Disagreement question: "How many other people have accepted or rejected this recommendation from this system? (What is the ratio of approve to disapprove?)"]</p>
      </sec>
      <sec id="sec-13-5">
        <p>Using this bank of questions, participants sorted them according to
those they found most valuable or useful in helping them determine
whether to accept or reject a computer-generated recommendation.
We analyzed how participants arranged these questions using a
factor-analytic technique. Our findings support other studies' finding
that transparency is a multi-dimensional construct, and that achieving
it depends on multiple variables, including, to some extent, the
user’s preferences for and valuation of certain categories of
information. Our findings are intended to inform future interface design of
recommender systems, as well as to broaden the discussion of the
importance of building systems whose outputs and recommendations
are easily understood by their users.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>"Towards A Rigorous Science of Interpretable Machine Learning,"</article-title>
          <source>arXiv</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Buchanan</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Shortliffe</surname>
          </string-name>
          ,
          <article-title>Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project</article-title>
          . Reading, MA: Addison Wesley,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Ye</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <article-title>"The Impact of Explanation Facilities on User Acceptance of Expert Systems Advice,"</article-title>
          <source>MIS Quarterly</source>
          , vol.
          <volume>19</volume>
          , no.
          <issue>2</issue>
          , p.
          <fpage>157</fpage>
          ,
          Jun.
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Herlocker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          ,
          <article-title>"Explaining collaborative filtering recommendations," presented at the 2000 ACM conference</article-title>
          , New York, New York, USA,
          <year>2000</year>
          , pp.
          <fpage>241</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kulesza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Burnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-K.</given-names>
            <surname>Wong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Stumpf</surname>
          </string-name>
          ,
          <article-title>"Principles of Explanatory Debugging to Personalize Interactive Machine Learning," presented at the the 20th International Conference</article-title>
          , New York, New York, USA,
          <year>2015</year>
          , pp.
          <fpage>126</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Dzindolet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Peterson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Pomranky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Pierce</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Beck</surname>
          </string-name>
          ,
          <article-title>"The role of trust in automation reliance,"</article-title>
          <source>International Journal of Human-Computer Studies</source>
          , vol.
          <volume>58</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>697</fpage>
          -
          <lpage>718</lpage>
          , Jun.
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lorenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Di Nocera</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Parasuraman</surname>
          </string-name>
          ,
          <article-title>"Display Integration Enhances Information Sampling and Decision Making in Automated Fault Management in a Simulated Spaceflight Micro-World,"</article-title>
          <source>Proceedings of the Human Factors and Ergonomics Society 58th Annual Meeting</source>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>35</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>"Explanation in Artificial Intelligence: Insights from the Social Sciences,"</article-title>
          <source>arXiv</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>57</lpage>
          , Jun.
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G. B.</given-names>
            <surname>Duggan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Banbury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Howes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Patrick</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Waldron</surname>
          </string-name>
          ,
          <article-title>"Too Much, Too Little, or Just Right: Designing Data Fusion for Situation Awareness,"</article-title>
          <source>Proceedings of the Human Factors and Ergonomics Society 58th Annual Meeting</source>
          , pp.
          <fpage>528</fpage>
          -
          <lpage>532</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Swearingen</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Sinha</surname>
          </string-name>
          ,
          <article-title>"Beyond algorithms: An HCI perspective on recommender systems,"</article-title>
          <source>ACM SIGIR 2001 Workshop on Recommender Systems</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H. F.</given-names>
            <surname>Neyedli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Hollands</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Jamieson</surname>
          </string-name>
          ,
          <article-title>"Beyond Identity: Incorporating System Reliability Information Into an Automated Combat Identification System,"</article-title>
          <source>Human Factors</source>
          , vol.
          <volume>53</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>338</fpage>
          -
          <lpage>355</lpage>
          , Jul.
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>W.</given-names>
            <surname>Stephenson</surname>
          </string-name>
          ,
          <article-title>The study of behavior; Q-technique and its methodology</article-title>
          . Chicago, IL: University of Chicago Press,
          <year>1953</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <article-title>"A primer on Q methodology,"</article-title>
          <source>Operant Subjectivity</source>
          .
          <volume>16</volume>
          (
          <issue>3</issue>
          /4),
          <fpage>91</fpage>
          -
          <lpage>138</lpage>
          .
          <year>1993</year>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Watts</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Stenner</surname>
          </string-name>
          ,
          <article-title>"Doing Q Methodology: theory, method and interpretation," Qualitative Research in Psychology</article-title>
          , vol.
          <volume>2</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>67</fpage>
          -
          <lpage>91</lpage>
          , Jan.
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>K.</given-names>
            <surname>O'Leary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. O.</given-names>
            <surname>Wobbrock</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Riskin</surname>
          </string-name>
          ,
          <article-title>"Q-methodology as a research and design tool for HCI," presented at the CHI 2013</article-title>
          , Paris, France,
          <year>2013</year>
          , pp.
          <fpage>1941</fpage>
          -
          <lpage>1950</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Vorm</surname>
          </string-name>
          ,
          <article-title>"Assessing Demand for Transparency in Intelligent Systems Using Machine Learning," presented at the IEEE Innovations in Intelligent Systems and Applications (INISTA)</article-title>
          , Thessaloniki,
          <year>2018</year>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Rouse</surname>
          </string-name>
          and
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
          <article-title>"On looking into the black box: Prospects and limits in the search for mental models,"</article-title>
          <source>Psychological Bulletin</source>
          , vol.
          <volume>100</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>349</fpage>
          -
          <lpage>363</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>N. B.</given-names>
            <surname>Sarter</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Woods</surname>
          </string-name>
          ,
          <article-title>"How in the World Did We Ever Get into That Mode? Mode Error and Awareness in Supervisory Control,"</article-title>
          <source>Human Factors</source>
          , vol.
          <volume>37</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>5</fpage>
          -
          <lpage>19</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Zeller</surname>
          </string-name>
          ,
          <article-title>"Accidents and Safety," in Systems Psychology, K. B</article-title>
          . DeGreene, Ed. New York, NY,
          <year>1970</year>
          , pp.
          <fpage>131</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          National Transportation Safety Board
          ,
          <article-title>"Loss of Control on Approach, Colgan Air, Inc., Operating as Continental Connection Flight 3407, Bombardier DHC-8-400, N200WQ, Clarence Center, New York, February 12, 2009,"</article-title>
          <source>NTSB/AAR-10/01 PB2010-910401</source>
          , Feb.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G. G.</given-names>
            <surname>Sadler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Battiste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shively</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Lyons</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>"Effects of transparency on pilot trust and agreement in the autonomous constrained flight planner," presented at the 2016</article-title>
          <source>IEEE/AIAA 35th Digital Avionics Systems Conference (DASC)</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sebok</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Wickens</surname>
          </string-name>
          ,
          <article-title>"Implementing Lumberjacks and Black Swans Into Model-Based Tools to Support Human-Automation Interaction,"</article-title>
          <source>Human Factors</source>
          , vol.
          <volume>59</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>189</fpage>
          -
          <lpage>203</lpage>
          , Mar.
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J. Y. C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Procci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Boyce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garcia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Barnes</surname>
          </string-name>
          ,
          <article-title>"Situation Awareness-Based Transparency,"</article-title>
          <source>ARL-TR-6905, Apr</source>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Mosier</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Skitka</surname>
          </string-name>
          ,
          <article-title>"Human Decision Makers and Automated Decision Aids: Made for Each Other?,"</article-title>
          in
          <source>Automation and Human Performance: Theory and Applications</source>
          , R. Parasuraman and
          <string-name>
            <given-names>M.</given-names>
            <surname>Mouloua</surname>
          </string-name>
          , Eds. NJ: Lawrence Erlbaum,
          <year>1996</year>
          , pp.
          <fpage>201</fpage>
          -
          <lpage>220</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Fisher</surname>
          </string-name>
          and
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <article-title>"Criticality of data quality as exemplified in two disasters,"</article-title>
          <source>Information and Management</source>
          , vol.
          <volume>39</volume>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>116</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Khawaja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>"Making machine learning useable by revealing internal states update - a transparent approach,"</article-title>
          <source>International Journal of Computational Science and Engineering</source>
          , vol.
          <volume>13</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>378</fpage>
          -
          <lpage>389</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mühlbacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Piringer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gratzl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sedlmair</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Streit</surname>
          </string-name>
          ,
          <article-title>"Opening the Black Box: Strategies for Increased User Involvement in Existing Algorithm Implementations,"</article-title>
          <source>IEEE Trans. Visual. Comput. Graphics</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>12</issue>
          , pp.
          <fpage>1643</fpage>
          -
          <lpage>1652</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ventocilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Riveiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Helldin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Falkman</surname>
          </string-name>
          ,
          <article-title>"Evaluating Multi-Attributes on Cause and Effect Relationship Visualization,"</article-title>
          presented at the
          <source>International Conference on Information Visualization Theory and Applications</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>64</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>V.</given-names>
            <surname>Bellotti</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Edwards</surname>
          </string-name>
          ,
          <article-title>"Intelligibility and Accountability: Human Considerations in Context-Aware Systems,"</article-title>
          <source>Human-Computer Interaction</source>
          , vol.
          <volume>16</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>212</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>B. Y.</given-names>
            <surname>Lim</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <article-title>"Assessing demand for intelligibility in context-aware applications,"</article-title>
          presented at the
          <source>11th International Conference on Ubiquitous Computing (UbiComp '09)</source>
          , New York, New York, USA,
          <year>2009</year>
          , p.
          <fpage>195</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Clare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Cummings</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N. P.</given-names>
            <surname>Repenning</surname>
          </string-name>
          ,
          <article-title>"Influencing Trust for Human-Automation Collaborative Scheduling of Multiple Unmanned Vehicles,"</article-title>
          <source>Human Factors</source>
          , vol.
          <volume>57</volume>
          , no.
          <issue>7</issue>
          , pp.
          <fpage>1208</fpage>
          -
          <lpage>1218</lpage>
          , Oct.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>E.</given-names>
            <surname>Shearer</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Gottfried</surname>
          </string-name>
          ,
          <article-title>"News Use Across Social Media Platforms 2017,"</article-title>
          <source>Pew Research Center</source>
          , Sep.
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          Adobe Inc.,
          <article-title>"Digital Intelligence Briefing: 2018 Digital Trends,"</article-title>
          Adobe Inc., Feb.
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>A.</given-names>
            <surname>Tversky</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Kahneman</surname>
          </string-name>
          ,
          <article-title>"Judgment under Uncertainty: Heuristics and Biases,"</article-title>
          <source>Science</source>
          , vol.
          <volume>185</volume>
          , no.
          <issue>4157</issue>
          , pp.
          <fpage>1124</fpage>
          -
          <lpage>1131</lpage>
          , Sep.
          <year>1974</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>S.</given-names>
            <surname>Fiske</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <source>Social Cognition</source>
          . Reading, MA: Addison-Wesley,
          <year>1991</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Horrigan</surname>
          </string-name>
          ,
          <article-title>"How People Approach Facts and Information,"</article-title>
          <source>Pew Research Center</source>
          , Aug.
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Blume</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Easley</surname>
          </string-name>
          ,
          <article-title>"Rationality,"</article-title>
          in
          <source>The New Palgrave Dictionary of Economics</source>
          , S. Durlauf
          and
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Blume</surname>
          </string-name>
          , Eds.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>J.</given-names>
            <surname>Preece</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sharp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rogers</surname>
          </string-name>
          ,
          <source>Interaction Design: Beyond Human-Computer Interaction</source>
          , 4th ed. Wiley,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>551</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <source>The Paradox of Choice: Why More Is Less</source>
          . Harper Perennial,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Rose</surname>
          </string-name>
          ,
          <article-title>"Human-Centered Design Meets Cognitive Load Theory: Designing Interfaces that Help People Think,"</article-title>
          pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          , Oct.
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zahabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Kaber</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Swangnetr</surname>
          </string-name>
          ,
          <article-title>"Usability and Safety in Electronic Medical Records Interface Design: A Review of Recent Literature and Guideline Formulation,"</article-title>
          <source>Human Factors</source>
          , vol.
          <volume>57</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>805</fpage>
          -
          <lpage>834</lpage>
          , Aug.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gregor</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Benbasat</surname>
          </string-name>
          ,
          <article-title>"Explanations from Intelligent Systems: Theoretical Foundations and Implications for Practice,"</article-title>
          <source>MIS Quarterly</source>
          , vol.
          <volume>23</volume>
          , no.
          <issue>4</issue>
          , p.
          <fpage>497</fpage>
          , Dec.
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Glass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolverton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Da Silva</surname>
          </string-name>
          ,
          <article-title>"A Categorization of Explanation Questions for Task Processing Systems,"</article-title>
          presented at the
          <source>AAAI Workshop on Explanation-Aware Computing (ExaCt)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ram</surname>
          </string-name>
          ,
          <source>AQUA: Questions that Drive the Explanation Process</source>
          . Lawrence Erlbaum,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Silveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>de Souza</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. D. J.</given-names>
            <surname>Barbosa</surname>
          </string-name>
          ,
          <article-title>"Semiotic engineering contributions for designing online help systems,"</article-title>
          presented at the
          <source>19th Annual International Conference on Computer Documentation (SIGDOC '01)</source>
          , New York, New York, USA,
          <year>2001</year>
          , p.
          <fpage>31</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>S.</given-names>
            <surname>Banasick</surname>
          </string-name>
          ,
          <source>Ken-Q Analysis</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Ford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>MacCallum</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tait</surname>
          </string-name>
          ,
          <article-title>"The Application of Exploratory Factor Analysis in Applied Psychology: A critical review and analysis,"</article-title>
          <source>Personnel Psychology</source>
          , vol.
          <volume>39</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>291</fpage>
          -
          <lpage>314</lpage>
          , Jun.
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>W.</given-names>
            <surname>Yuji</surname>
          </string-name>
          ,
          <article-title>"The Trust Value Calculating for Social Network Based on Machine Learning,"</article-title>
          presented at the
          <source>2017 9th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC)</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>R.</given-names>
            <surname>St. Amant</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Young</surname>
          </string-name>
          ,
          <article-title>"Interface Agents in Model World Environments,"</article-title>
          <source>AI Magazine</source>
          , vol.
          <volume>22</volume>
          , no.
          <issue>4</issue>
          , p.
          <fpage>95</fpage>
          , Dec.
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devore</surname>
          </string-name>
          ,
          <source>Probability and Statistics for Engineering and the Sciences</source>
          , 4th ed. Brooks/Cole,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>