<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Effects of Algorithmic Decision-Making and Interpretability on Human Behavior: Experiments using Crowdsourcing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Avishek Anand</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kilian Bizer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Erlei</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ujwal Gadiraju</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Christian Heinze</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>L3S Research Center, Leibniz Universität Hannover</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Today algorithmic decision-making (ADM) is prevalent in several fields, including medicine, the criminal justice system and financial markets. On the one hand, this is a testament to the ever-improving performance and capabilities of complex machine learning models. On the other hand, the increased complexity has resulted in a lack of transparency and interpretability, which has led to critical decision-making models being deployed as functional black boxes. There is a general consensus that being able to explain the actions of such systems will help to address legal issues like transparency (ex ante) and compliance requirements (interim) as well as liability (ex post). Moreover, it may build trust, expose biases and in turn lead to improved models. This has most recently led to research on extracting post-hoc explanations from black-box classifiers and sequence generators in tasks like image captioning, text classification and machine translation. However, no work has yet investigated the impact of model explanations on the nature of human decision-making. We undertake a large-scale study using crowdsourcing to measure how interpretability affects human decision-making, drawing on well-understood principles of behavioral economics. To our knowledge, this is the first interdisciplinary study of its kind involving interpretability in ADM models.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In the context of machine learning and more generally in
algorithmic decision-making systems (ADMs)
interpretability can be defined as “the ability to explain or to present in
understandable terms to a human”
        <xref ref-type="bibr" rid="ref1 ref13 ref15 ref30 ref34 ref6">(Doshi-Velez and Kim
2017)</xref>
        . In spite of the application of ADMs in a breadth of
domains, for the most part they are still used as black boxes
that output a prediction, score or ranking, while their users
understand only partially, or not at all, how different
features influence the model prediction. When an algorithm
prioritizes information to predict, classify or rank,
algorithmic transparency therefore becomes an important
feature for limiting discrimination and enhancing
explainability-based trust in the system.
      </p>
      <p>Copyright © 2018 for this paper by its authors. Copying permitted
for private and academic purposes.</p>
      <sec id="sec-1-1">
        <title>Why Interpretability?</title>
        <p>
          Interpretability is often deemed critical to enable effective
real-world deployment of intelligent systems, albeit highly
context dependent
          <xref ref-type="bibr" rid="ref40">(Weller 2017)</xref>
          . For a researcher or
developer, high interpretability is crucial to understand how their
system/model is working, aiming to debug or improve it. For
an end user, it provides a sense of what the system is doing
and why, to enable prediction of what it might do in
unforeseen circumstances and build trust in the technology.
Additionally, adequate interpretability provides an expert
(perhaps a regulator) the ability to audit a prediction or decision
trail in detail and verify whether legal regulatory standards
have been complied with. Examples include detecting explicit content
returned for innocuous queries (for children) or exposing biases that may
be hard to spot with quantitative measures.
        </p>
        <p>
          Recent work has highlighted the opportunities for
computer scientists to take the lead in designing algorithms and
evaluation frameworks which avoid discrimination and
enable explanation
          <xref ref-type="bibr" rid="ref21 ref26">(Goodman and Flaxman 2016)</xref>
          . In addition, many
regulatory policies now require or will require algorithmic
transparency. Take, for example, the European Union’s new
General Data Protection Regulation (GDPR), which will
take effect on 25 May 2018 and restricts
automated individual decision-making that significantly
affects users (Art. 22 GDPR). The law intends to create a right
to explanation, whereby a user can ask for an explanation of
an algorithmic decision that was made about them (Art. 12,
13(2) lit. f, 14(2) lit. g GDPR).
        </p>
        <p>But how is human decision-making affected when ADMs
are accompanied by explanations? How do explanations affect the
acceptability of ADMs? Do they increase trust in the ADMs?
We intend to initiate large-scale crowdsourcing studies
based on behavioral economics in order to understand whether
and how human decision-making is affected when ADMs are
accompanied by explanations.</p>
      </sec>
      <sec id="sec-1-2">
        <title>Why Behavioral Economics?</title>
        <p>
          Various external factors shape the design and effects of
algorithmic decision-making systems and ultimately define
the adequate implementation of interpretability measures.
Besides being constrained by the institutional and
regulatory framework, an optimal design further anticipates
behavioral aspects of human-agent interaction
          <xref ref-type="bibr" rid="ref36">(Mosier and
Skitka 2018)</xref>
          . We argue that only an interdisciplinary
approach allows these factors to be analyzed comprehensively.
Behavioral economics offers such an integrative
approach, one that could substantially advance prevailing
discussions along many dimensions. Over the last decades,
behavioral economists have developed progressively detailed and
sophisticated models of human behavior. This process has
yielded a rich set of meticulous experimental methods and
inherently diverse theoretical models
          <xref ref-type="bibr" rid="ref21 ref26 ref4">(Kagel and Roth 2016;
Camerer, Loewenstein, and Rabin 2011)</xref>
          . While these
models of human behavior need to account for the progress in
artificial intelligence
          <xref ref-type="bibr" rid="ref1 ref13 ref15 ref30 ref34 ref6 ref6">(Camerer 2017; Marwala and Hurwitz
2017)</xref>
          , they enable a sound analysis of ADM systems
increasingly penetrating into society. Specifically, we aim to
examine how human behavior changes in human-agent
environments and whether these changes have repercussions
for economic outcomes. For instance, we are interested in
total productive activity, the frequency of economically
relevant interactions, cooperation and coordination activity or
changes in overall as well as individual welfare. The use of
pertinent economic models makes it possible to generalize empirical
findings and subsequently derive inferences about effects on
our outcomes of interest. Consequently, certain ADM
design and regulatory choices can be evaluated on relevant
societal dimensions using straightforward counterfactuals
          <xref ref-type="bibr" rid="ref28">(Kleinberg et al. 2017)</xref>
          . Our approach therefore promises
evidence that supports the design of economic policy
measures with consequences for constructing machine-learning
systems
          <xref ref-type="bibr" rid="ref2">(Athey 2017; 2018)</xref>
          .
        </p>
        <p>To arrive at a suitable research design integrating
behavioral economic science, our work in progress focuses on the
effects of interpretability in human-agent interaction. For
instance, explicitly quantifying the economic value of
interpretability and identifying beneficiaries has implications for
both the design of ADM systems and regulatory choices. We
rely on ultimatum bargaining, a prominent workhorse
in experimental economics, to derive novel insights with
respect to the influence of ADM systems and
interpretability on human behavior. Overall, we ask: Does the
introduction of ADM systems influence human decision-making in
a straightforward bargaining context? How do ADM
systems adapt to these presumably new behavioral patterns?
Beyond those rather general considerations, we specifically
focus on interpretability to examine, e.g.: Does increased
interpretability influence established behavioral concepts such
as acceptance, reciprocity or fairness concerns? Does it
increase the quantity of economically relevant interactions
and subsequently affect overall welfare?</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Interpretability of ML Models</title>
      <p>Interpretability has long been studied in classical
machine learning as a desirable property when choosing a
model family that is interpretable by design, such as decision
trees or falling rule lists. However, the success of neural
networks (NN) and other expressive yet complex ML models has
intensified the discussion on post-hoc interpretability, that is,
interpreting already built models.</p>
      <p>
        Consequently, interpretability of these complex models
has been studied in various other domains to better
understand decisions made by the network – image classification
and captioning
        <xref ref-type="bibr" rid="ref1 ref13 ref15 ref30 ref34 ref38 ref41 ref6">(Xu et al. 2015; Dabkowski and Gal 2017;
Simonyan, Vedaldi, and Zisserman 2013)</xref>
        , sequence to
sequence modeling
        <xref ref-type="bibr" rid="ref1 ref13 ref15 ref30 ref34 ref6">(Alvarez-Melis and Jaakkola 2017; Li et
al. 2015)</xref>
        , recommender systems
        <xref ref-type="bibr" rid="ref21 ref26 ref9">(Chang, Harper, and
Terveen 2016)</xref>
        etc. Interpretable models can be categorized into
two broad classes: model introspective and model
agnostic. Model introspection refers to interpretable models, such
as decision trees, rules (Letham et al. 2015), additive
models
        <xref ref-type="bibr" rid="ref7">(Caruana et al. 2015)</xref>
        and attention-based networks
        <xref ref-type="bibr" rid="ref41">(Xu
et al. 2015)</xref>
        . Rather than supporting models that are
functionally black boxes, such as an arbitrary neural network or
a random forest with thousands of trees, these approaches use
models whose components can be meaningfully
inspected directly, e.g. a path in a decision
tree, a single rule, or the weight of a specific feature in a
linear model.
      </p>
      <p>
        Model-agnostic approaches, on the other hand, extract
post-hoc explanations by treating the original model as a black
box, either by learning from the output of the black-box
model, by perturbing the inputs, or both
        <xref ref-type="bibr" rid="ref1 ref13 ref15 ref21 ref26 ref30 ref34 ref37 ref6">(Ribeiro, Singh, and
Guestrin 2016; Koh and Liang 2017)</xref>
        . Model-agnostic
interpretability is of two types: local and global. Local
interpretability refers to explanations that describe a single
decision of the model, whereas global interpretability refers to
explanations of the overall behavior of the model. There are also other notions of
interpretability; for a more comprehensive description of the
approaches we point the reader to
        <xref ref-type="bibr" rid="ref33">(Lipton 2016)</xref>
        .
      </p>
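      <p>The idea of perturbing inputs to a black box can be illustrated with a minimal Python sketch. It is a simple one-feature-at-a-time sensitivity probe and not the method of any of the cited works; the names local_sensitivity and toy_model, and the toy model itself, are hypothetical and chosen only for illustration.</p>
      <preformat>
```python
def local_sensitivity(black_box, instance, delta=1.0):
    """Probe a black-box model around one instance.

    Each feature is nudged by `delta` in turn, and the resulting change
    in the model output is recorded. The model is only ever called,
    never inspected, which is what makes the probe model-agnostic.
    """
    base = black_box(instance)
    effects = []
    for i in range(len(instance)):
        perturbed = list(instance)      # copy so the input stays untouched
        perturbed[i] += delta           # perturb a single feature
        effects.append(black_box(perturbed) - base)
    return effects


def toy_model(x):
    # Hypothetical stand-in for an opaque model (assumption for this sketch).
    return 3.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]


effects = local_sensitivity(toy_model, [1.0, 2.0, 3.0])
print(effects)
```
      </preformat>
      <p>The resulting list is a local explanation in the sense used above: it describes how the output reacts around a single decision, not how the model behaves overall.</p>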
    </sec>
    <sec id="sec-3">
      <title>Interpretability and Human Decision-Making</title>
      <p>
        Interpretability is no end in itself. The effects of
interpretability remain ambiguous even if one learns about the
effectiveness of interpretability measures as obtained by
studies like
        <xref ref-type="bibr" rid="ref16 ref19">(Garcia et al. 2009; Gacto, Alcala, and Herrera
2011)</xref>
        . Rather, to resolve this ambiguity, one needs to ask to
what extent variation in interpretability translates into variation
in behavior.
      </p>
      <p>
        For instance, additional explanations could foster a more
trustful environment that motivates fruitful human-agent
interactions. However, providing additional information might
conversely result in an erosion of trust due to a more
thorough scrutiny with respect to agent recommendations.
Consider an agent supporting a physician (expert) in diagnosing
a patient’s (consumer) MRI scan. The physician might
generally trust the agent based on positive experience and
common knowledge about its superiority, and thus reach higher
accuracy in her diagnosis. In contrast, learning about
unfamiliar features used by the agent might cause distrust and
lead the physician to stick to her own assessment. This
hypothesis stems from evidence gathered by observing human
interaction
        <xref ref-type="bibr" rid="ref11 ref14 ref22 ref27">(Keller and Staelin 1987; Grimmelikhuijsen et al.
2013; Cramer et al. 2008; Ditto et al. 1998)</xref>
        . Hence,
increased interpretability might diminish the efficiency of such
economically vital consumer-expert interactions.
      </p>
      <p>The consideration above illustrates only one distinct case
with inherent ambiguity regarding the effects of
increased interpretability. Besides trust, one might think
of concepts established in behavioral economics like
acceptance, accountability or social preferences. Further, to
obtain a more thorough understanding of increased
interpretability, one needs to evaluate its effects not only on the
end user, but also on regulators, developers and
consumers. Such a comprehensive approach poses several
challenges for the design of experiments and the respective
modeling of human behavior. Our work in progress relies on
ultimatum bargaining to derive novel insights with respect to the
considerations outlined above.</p>
    </sec>
    <sec id="sec-4">
      <title>Crowdsourcing Methodology</title>
      <p>
        Over the last decade, microtask crowdsourcing platforms
such as Amazon’s Mechanical Turk<sup>1</sup> and CrowdFlower<sup>2</sup>
have been used to support or replicate findings from
psychology and behavioral research, and also to run
human-centered experiments on a large scale
        <xref ref-type="bibr" rid="ref12 ref18 ref24 ref35 ref8">(Mason and Suri 2012;
Crump, McDonnell, and Gureckis 2013; Chandler, Mueller,
and Paolacci 2014; Gadiraju et al. 2017)</xref>
        . Previous works
have established that crowdsourcing platforms can be
reliably leveraged to conduct large scale behavioral experiments
that can be ecologically valid.
      </p>
      <sec id="sec-4-1">
        <title>Ultimatum Bargaining Experiment</title>
        <p>
          Ultimatum bargaining represents one of the most
prominent games researched in experimental economics
          <xref ref-type="bibr" rid="ref25">(Gueth,
Schmittberger, and Schwarze 1982)</xref>
          . Although it seems quite
simple, understanding behavior in this framework remains
complex even after decades of research
          <xref ref-type="bibr" rid="ref24 ref39">(Gueth and Kocher
2014; van Damme et al. 2014)</xref>
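          . However, a rich
literature allows us to integrate our findings and evaluate their
relevance. In particular, the literature on automated, though not
artificially intelligent, agents in computer science and economics makes
the ultimatum game a suitable workhorse for testing our
hypothesis. Our basic framework replicates the simplest design
of the ultimatum game. A proposer X decides on the
distribution of a pie of size p. X receives x and the responder
Y receives y, where <inline-formula><tex-math>x, y \geq 0</tex-math></inline-formula> and <inline-formula><tex-math>x + y = p</tex-math></inline-formula>. In a
sequential process, the responder Y learns about the proposal <inline-formula><tex-math>(x, y)</tex-math></inline-formula>
and either accepts, <inline-formula><tex-math>\delta(x, y) = 1</tex-math></inline-formula>, or rejects, <inline-formula><tex-math>\delta(x, y) = 0</tex-math></inline-formula>.
Payoffs are given by <inline-formula><tex-math>\delta(x, y)\,x</tex-math></inline-formula> for the proposer and <inline-formula><tex-math>\delta(x, y)\,y</tex-math></inline-formula> for the responder, i.e. if the responder
Y rejects, both earn nothing.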
        </p>
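        <p>The payoff rule of a single round can be illustrated with a minimal Python sketch. The function name payoffs and its argument names are illustrative assumptions and not part of the experimental software; amounts are treated as integers (e.g. cents) to avoid rounding issues.</p>
        <preformat>
```python
def payoffs(pie, offer, accepted):
    """Return (proposer_payoff, responder_payoff) for one ultimatum round.

    pie      -- total amount p to be split
    offer    -- amount y offered to the responder; the proposer keeps p - y
    accepted -- the responder's decision (True means the proposal is accepted)
    """
    if not (pie >= offer >= 0):
        raise ValueError("offer must lie between 0 and the pie size")
    accept_indicator = 1 if accepted else 0   # 1 = accept, 0 = reject
    keep = pie - offer                        # proposer's share x
    # A rejection multiplies both shares by zero, so both earn nothing.
    return (accept_indicator * keep, accept_indicator * offer)


print(payoffs(100, 40, True))
print(payoffs(100, 40, False))
```
        </preformat>
        <p>For example, payoffs(100, 40, True) returns (60, 40), whereas any rejected proposal returns (0, 0).</p>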
        <p>
          A straightforward solution of the game, based merely on
monetary outcomes, implies that responder Y should
accept all positive offers, which gives an acceptance decision <inline-formula><tex-math>\delta(x, y) = 1</tex-math></inline-formula> for <inline-formula><tex-math>y &gt; 0</tex-math></inline-formula>.<sup>3</sup>
This is anticipated by the proposer X, who therefore offers
the minimal positive amount. In consequence, X receives
almost the whole pie p and Y receives little more than
nothing. However, actual behavior observed in prior experiments
shows that the optimal offer by the proposer amounts to
40 to 50% of the pie. This might, for example, reflect
fairness concerns or merely strategic thinking that avoids
punishment by a responder who rejects offers perceived as unfair
          <xref ref-type="bibr" rid="ref5">(Camerer 2003)</xref>
          .
        </p>
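        <p>The contrast between the purely monetary prediction and a responder who rejects low offers can be sketched in a few lines of Python. This is an illustration under stated assumptions only: offers lie on an integer grid, the proposer knows the responder's acceptance rule, and the threshold of 40 is a hypothetical value chosen for the example, not an experimental result. The name best_offer is likewise hypothetical.</p>
        <preformat>
```python
def best_offer(pie, responder_accepts):
    """Search all integer offers and return (offer, proposer_payoff)
    for the offer that maximizes the proposer's payoff, given a known
    acceptance rule of the responder."""
    best, best_payoff = 0, 0
    for offer in range(pie + 1):
        payoff = (pie - offer) if responder_accepts(offer) else 0
        if payoff > best_payoff:          # keep the first strict improvement
            best, best_payoff = offer, payoff
    return best, best_payoff


# Responder who only cares about money: accepts every positive offer.
print(best_offer(100, lambda y: y > 0))

# Responder who rejects offers below a (hypothetical) threshold of 40.
print(best_offer(100, lambda y: y >= 40))
```
        </preformat>
        <p>Against the money-maximizing responder, the search returns the minimal positive offer, (1, 99). Against the threshold responder, it returns (40, 60): offering a substantial share is then the payoff-maximizing choice for the proposer.</p>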
      </sec>
      <sec id="sec-4-2">
        <title>Experimental Setup</title>
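        <p>We will carry out a large scale ultimatum bargaining
experiment by recruiting workers from a crowdsourcing platform.</p>
        <p><sup>1</sup> https://www.mturk.com/</p>
        <p><sup>2</sup> https://www.crowdflower.com</p>
        <p><sup>3</sup> While this represents the weakly dominant strategy for Y, all
distributions <inline-formula><tex-math>(x, y)</tex-math></inline-formula> can be established as equilibrium outcomes.
For multiple equilibria, consider a certain threshold <inline-formula><tex-math>\bar{y}</tex-math></inline-formula> for
acceptance by the responder Y, such that <inline-formula><tex-math>[(\bar{x}, \bar{y}), \delta(\tilde{x}, \tilde{y}) = 1]</tex-math></inline-formula> if <inline-formula><tex-math>\tilde{y} \geq \bar{y}</tex-math></inline-formula>
and <inline-formula><tex-math>\delta(\tilde{x}, \tilde{y}) = 0</tex-math></inline-formula> otherwise, where <inline-formula><tex-math>\delta</tex-math></inline-formula> denotes the acceptance decision of the responder.</p>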
        <p>
          Workers will play the roles of proposers and responders
under the following between-subjects treatment
conditions, which are designed to reveal the effects of automated
decision-making and interpretability on human behavior. We will
follow guidelines from previous work to ensure reliable
participation of crowd workers
          <xref ref-type="bibr" rid="ref17">(Gadiraju et al. 2015)</xref>
          .
        </p>
        <p>I: Human-Human Interactions. This condition follows the
simplest design of the ultimatum game as described earlier,
consisting of a proposer and responder (roles that will be
fulfilled by randomly paired workers recruited from the
crowdsourcing platform). We will record the interactions between
N unique (proposer, responder) pairs, i.e., the offers made
by the proposer and whether they are accepted or rejected
by the responder. Following this, the proposer and
responder will independently complete personality-related
questionnaires.</p>
        <p>II: Human-Machine Interactions. Using the N
human-human interactions and features engineered from condition
I, we will train a machine learning model that can classify
whether a bid from a proposer is likely to be accepted. In this
condition, proposers will be given the opportunity to use the
machine learning model as an algorithmic decision-making
system that can aid them in making a proposal. The
proposers will be allowed to probe the ADM system with
proposals, and the system will report the likelihood of each
proposal being accepted. The proposers may probe
the ADM system any number of times, but can make
a proposal to the responders only once. The responders will be
made aware of the fact that the proposers have an ADM system
at their disposal to help them in making a proposal. Once
again, we will record the interactions between N unique and
distinct (proposer, responder) pairs. The interactions
of the proposers with the ADM system, as well as with
the responders, will provide us with valuable insights into the
effects of ADM on human behavior and into how trust manifests
and fluctuates via such interactions.</p>
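        <p>Condition II presupposes a model that maps a proposal to an acceptance likelihood. The following Python code is only a minimal sketch of that idea. It assumes a single feature (the offered share of the pie), a plain logistic regression fitted by batch gradient descent, and a handful of made-up observations. The features, model family and data of the actual study are not specified here, and all function names are hypothetical.</p>
        <preformat>
```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(shares, accepted, lr=0.5, epochs=2000):
    """Fit P(accept) = sigmoid(w * share + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(shares)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for s, a in zip(shares, accepted):
            err = sigmoid(w * s + b) - a   # gradient of the log-loss
            grad_w += err * s
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def acceptance_likelihood(model, share):
    """What the ADM system would report when probed with a proposal."""
    w, b = model
    return sigmoid(w * share + b)


# Hypothetical toy data, NOT experimental results: offered share of the
# pie and whether the responder accepted (1) or rejected (0).
toy_shares = [0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50]
toy_accepted = [0, 0, 0, 1, 0, 1, 1, 1]

model = fit_logistic(toy_shares, toy_accepted)
for share in (0.1, 0.3, 0.5):
    print(share, round(acceptance_likelihood(model, share), 3))
```
        </preformat>
        <p>A proposer probing the system corresponds to repeated calls of acceptance_likelihood with different candidate shares. On this toy data the fitted weight is positive, so larger offers receive higher likelihood estimates.</p>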
        <p>III: Human-Machine Interactions with Proposers as
Observers. This condition is similar to II, except that the
proposers will not be allowed to probe the ADM system but
will only observe the proposals made by the system on their
behalf. The responders will be informed that the offer being
made comes from an ADM system acting on behalf of the proposer.</p>
        <p>IV: Human-Machine Interactions with Explanations. This
condition is virtually identical to II, except that proposers in
this case will be aided with explanations alongside
likelihood estimates to enhance interpretability when they probe
the ADM system. Note that we consider model-introspective
variants of interpretability where access to an already built
model is provided. This will allow us to understand the role
of interpretability in shaping human behavior while
interacting with ADM systems.</p>
        <p>V, VI, VII: ADM Learned from Human-Machine
Interactions. To analyze the impact of the type of interactions
that the ADM is learned from, we will train a similar
machine learning model on the interactions recorded in condition
II, which can again aid a proposer in making an offer to the
responder. This will allow us to investigate how the type
of interaction data (human-human versus human-machine)
that the ADM is learned from affects the resulting observations
of human behavior. Thus, the conditions V, VI and VII are
repetitions of II, III and IV except for the interactions that
the ADM is learned from.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Alvarez-Melis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Jaakkola</surname>
            ,
            <given-names>T. S.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>A causal framework for explaining the predictions of black-box sequence-to-sequence models</article-title>
          .
          <source>arXiv preprint arXiv:1707.01943</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Athey</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Beyond prediction: Using big data for policy problems</article-title>
          .
          <source>Science</source>
          <volume>355</volume>
          :
          <fpage>483</fpage>
          -
          <lpage>485</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Athey</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>The impact of machine learning on economics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Camerer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Loewenstein</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Rabin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2011</year>
          . Advances in Behavioral Economics. Princeton, NJ: Princeton University Press.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Camerer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2003</year>
          .
          <article-title>Behavioral game theory: Experiments in strategic interaction</article-title>
          . Princeton, NJ: Princeton University Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Camerer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Artificial intelligence and behavioral economics</article-title>
          .
          <source>Economics of Artificial Intelligence.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Caruana</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lou</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Gehrke</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Koch</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sturm</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Elhadad</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission</article-title>
          .
          <source>In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          ,
          <fpage>1721</fpage>
          -
          <lpage>1730</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Chandler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Mueller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Paolacci</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers</article-title>
          .
          <source>Behavior Research Methods</source>
          <volume>46</volume>
          (
          <issue>1</issue>
          )
          :
          <fpage>112</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Harper</surname>
            ,
            <given-names>F. M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Terveen</surname>
            ,
            <given-names>L. G.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Crowd-based personalized natural language explanations for recommendations</article-title>
          .
          <source>In Proceedings of the 10th ACM Conference on Recommender Systems</source>
          , RecSys '16,
          <fpage>175</fpage>
          -
          <lpage>182</lpage>
          . New York, NY, USA: ACM.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Cramer</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Evers</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ramlal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>van Someren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Rutledge</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Stash</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Aroyo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Wielinga</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>The effects of transparency on trust in and acceptance of a content-based art recommender</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          <volume>18</volume>
          (
          <issue>5</issue>
          ):
          <fpage>455</fpage>
          -
          <lpage>496</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Crump</surname>
            ,
            <given-names>M. J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>McDonnell</surname>
            ,
            <given-names>J. V.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Gureckis</surname>
            ,
            <given-names>T. M.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Evaluating Amazon's Mechanical Turk as a tool for experimental behavioral research</article-title>
          .
          <source>PLoS ONE</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ):
          <fpage>e57410</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Dabkowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Gal</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Real time image saliency for black box classifiers</article-title>
          .
          <source>arXiv preprint arXiv:1705.07857</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Ditto</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Scepansky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Munro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Apanovitch</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Lockhart</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <year>1998</year>
          .
          <article-title>Motivated sensitivity to preference-inconsistent information</article-title>
          .
          <source>Journal of Personality and Social Psychology</source>
          <volume>75</volume>
          (
          <issue>1</issue>
          ):
          <fpage>53</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Doshi-Velez</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Towards a rigorous science of interpretable machine learning</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Gacto</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Alcala</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Herrera</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Interpretability of linguistic fuzzy rule-based systems: An overview of interpretability measures</article-title>
          .
          <source>Information Sciences</source>
          <volume>181</volume>
          :
          <fpage>4340</fpage>
          -
          <lpage>4360</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Gadiraju</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kawase</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Dietze</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Demartini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Understanding malicious behavior in crowdsourcing platforms: The case of online surveys</article-title>
          .
          <source>In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems</source>
          ,
          <fpage>1631</fpage>
          -
          <lpage>1640</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Gadiraju</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
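          ;
          <string-name>
            <surname>Möller</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nöllenburg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;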
          <string-name>
            <surname>Saupe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Egger-Lampl</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Archambault</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Fisher</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Crowdsourcing versus the laboratory: Towards human-centered experiments using the crowd</article-title>
          .
          <source>In Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments</source>
          . Springer.
          <fpage>6</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Garcia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Luengo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Herrera</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <source>Soft Computing</source>
          <volume>13</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Goodman</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Flaxman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>European Union regulations on algorithmic decision-making and a "right to explanation"</article-title>
          .
          <source>arXiv preprint arXiv:1606.08813</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Grimmelikhuijsen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Porumbescu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Hong</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Im</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <article-title>The effect of transparency on trust in government: A cross-national comparative experiment</article-title>
          .
          <source>Public Administration Review</source>
          <volume>73</volume>
          (
          <issue>4</issue>
          ):
          <fpage>575</fpage>
          -
          <lpage>586</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Gueth</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kocher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>More than thirty years of ultimatum bargaining experiments: Motives, variations, and a survey of the recent literature</article-title>
          .
          <source>Journal of Economic Behavior &amp; Organization</source>
          <volume>108</volume>
          :
          <fpage>396</fpage>
          -
          <lpage>409</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Gueth</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Schmittberger</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Schwarze</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>1982</year>
          .
          <article-title>An experimental analysis of ultimatum bargaining</article-title>
          .
          <source>Journal of Economic Behavior &amp; Organization</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ):
          <fpage>367</fpage>
          -
          <lpage>388</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Kagel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>The Handbook of Experimental Economics</article-title>
          , Volume
          <volume>2</volume>
          . Princeton, NJ: Princeton University Press.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Keller</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Staelin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>1987</year>
          .
          <article-title>Effects of quality and quantity of information on decision effectiveness</article-title>
          .
          <source>Journal of Consumer Research</source>
          <volume>14</volume>
          :
          <fpage>200</fpage>
          -
          <lpage>213</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <surname>Kleinberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lakkaraju</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ludwig</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Mullainathan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Human decisions and machine predictions</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <source>The Quarterly Journal of Economics</source>
          <volume>133</volume>
          (
          <issue>1</issue>
          ):
          <fpage>237</fpage>
          -
          <lpage>293</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <surname>Koh</surname>
            ,
            <given-names>P. W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Understanding black-box predictions via influence functions</article-title>
          .
          <source>arXiv preprint arXiv:1703.04730</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <year>2015</year>
          .
          <article-title>Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model</article-title>
          .
          <source>The Annals of Applied Statistics</source>
          <volume>9</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1350</fpage>
          -
          <lpage>1371</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <source>arXiv:1506.01066</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          2015.
          <article-title>Visual</article-title>
          arXiv preprint.
          <string-name>
            <surname>Lipton</surname>
            ,
            <given-names>Z. C.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>The mythos of model interpretability</article-title>
          .
          <source>ICML Workshop on Human Interpretability of Machine Learning.</source>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <surname>Marwala</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Hurwitz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Artificial intelligence and economic theories</article-title>
          .
          <source>arXiv:1703</source>
          .
          <fpage>0659</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <surname>Mason</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Suri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Conducting behavioral research on Amazon's Mechanical Turk</article-title>
          .
          <source>Behavior Research Methods</source>
          <volume>44</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <surname>Mosier</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Skitka</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Human decision makers and automated decision aids: Made for each other?</article-title>
          <source>In Automation and Human Performance: Theory and Applications</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>M. T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Why should I trust you?: Explaining the predictions of any classifier</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          ,
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          <string-name>
            <surname>Simonyan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Vedaldi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zisserman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Deep inside convolutional networks: Visualising image classification models and saliency maps</article-title>
          .
          <source>arXiv preprint arXiv:1312.6034</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          <string-name>
            <surname>van Damme</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Binmore</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Samuelson</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Bolton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ockenfels</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Dufwenberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kirchsteiger</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Gneezy</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kocher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sutter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sanfey</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kliemt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Selten</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nagel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Azar</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>How Werner Gueth's ultimatum game shaped our understanding of social behavior</article-title>
          .
          <source>Journal of Economic Behavior &amp; Organization</source>
          <volume>108</volume>
          :
          <fpage>292</fpage>
          -
          <lpage>318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          <string-name>
            <surname>Weller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Challenges for transparency</article-title>
          .
          <source>arXiv preprint arXiv:1708.01870</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ba</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kiros</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Courville</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Salakhutdinov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zemel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Show, attend and tell: Neural image caption generation with visual attention</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          ,
          <fpage>2048</fpage>
          -
          <lpage>2057</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>