<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Recommender Systems and Learning Traps</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ido Erev</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Technion-Microsoft Electronic Commerce Research Center</institution>
          ,
          <addr-line>Technion</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Experimental studies of human choice behavior reveal two classes of deviations from optimal choice that should be considered by the designer of recommender systems. The first class of deviations can be described as "presentation effects." In many situations people exhibit high sensitivity to small changes in the presentation of the choice task that do not affect the final outcomes. The second class of deviations can be described as learning traps. In these situations people fail to learn to select the optimal strategies; they appear to converge to inefficient behavior.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Mainstream research in behavioral economics
        <xref ref-type="bibr" rid="ref1 ref7">(e.g., Kahneman &amp; Tversky, 1979;
Ariely, 2008)</xref>
        tends to focus on presentation effects. The current paper tries to clarify
the significance of learning traps. It focuses on two basic properties of decisions from
experience that trigger learning traps. The first property, referred to as
"underweighting of rare events," is illustrated by the experiments summarized in
Figures 1a (results) and 1b (experimental paradigm; see Erev &amp; Haruvy, 2014; Erev
&amp; Roth, 2014). In each trial of these experiments the participants are asked to select
one of two unmarked keys, and then receive feedback consisting of their obtained
payoff (the payoff from the selected key), and the forgone payoff (the payoff that the
participant could have received had he selected the other key).
      </p>
      <p>Each participant faced each of the two problems presented in Figure 1a for 100
trials. Both problems focus on choice between a status quo option (0 with certainty)
and an action that can lead to positive or negative outcomes. In Problem 1, the action
yielded the gamble (-10 with p = 0.1; +1 otherwise); this choice has negative expected
return (EV = -0.1), but it yields the best payoff in 90% of the trials. In Problem 2, the
action (+10 with p = 0.1; -1 otherwise) has positive expected return (EV = +0.1), but
it yields the worst payoff in 90% of the trials. The participants received a show-up fee
of 25 Israeli Shekels (1 Shekel ≈ $0.25) plus the payoff (in Shekels) from one
randomly selected trial.</p>
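      <p>The expected values and success rates quoted for the two problems can be checked directly. The following Python sketch is only an illustration: the payoffs and probabilities are the ones stated above, while the function and variable names are hypothetical and do not come from the cited studies.</p>
      <preformat><![CDATA[
```python
from fractions import Fraction

# Each gamble is a list of (payoff, probability) pairs, as stated in the text.
PROBLEM_1 = [(-10, Fraction(1, 10)), (1, Fraction(9, 10))]
PROBLEM_2 = [(10, Fraction(1, 10)), (-1, Fraction(9, 10))]
STATUS_QUO = 0  # the safe option pays 0 with certainty


def expected_value(gamble):
    """Probability-weighted mean payoff of a gamble."""
    return sum(payoff * prob for payoff, prob in gamble)


def prob_better_than(gamble, reference):
    """Probability that the gamble pays strictly more than the reference payoff."""
    return sum(prob for payoff, prob in gamble if payoff > reference)


for name, gamble in [("Problem 1", PROBLEM_1), ("Problem 2", PROBLEM_2)]:
    ev = float(expected_value(gamble))
    p_win = float(prob_better_than(gamble, STATUS_QUO))
    print(f"{name}: EV = {ev:+.1f}, beats status quo in {p_win:.0%} of trials")
```
]]></preformat>
      <p>Under these assumptions the risky action in Problem 1 has EV = -0.1 yet beats the status quo in 90% of the trials, and the risky action in Problem 2 has EV = +0.1 yet beats it in only 10% of the trials.</p>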
      <p>
        The two curves show the aggregated choice rate of the risky action in 5 blocks of
20 trials, over the 128 participants who took part in two studies
        <xref ref-type="bibr" rid="ref10">(Nevo &amp; Erev, 2012; Amir
et al., 2013)</xref>
        . The results reveal that the typical participant favored the risky prospect
when it impaired expected return (action rate of 58% in Problem 1, where the EV of
the risky prospect is -0.1), but not when it maximized expected return (action rate of
27% in Problem 2, where the EV of the risky prospect is +0.1). Thus, the typical results
in both problems reflect a deviation from maximization. That is, the typical participant
behaves "as if" they do not pay enough attention to the rare (10%) outcomes.<sup>1</sup>
      </p>
      <p>The current experiment includes many trials. Your task, in each trial, is to
click on one of the two keys presented on the screen. Each click will be
followed by the presentation of the keys’ payoffs. Your payoff for the trial is
the payoff of the selected key.</p>
      <p>Fig. 1b: The instructions screen in experimental studies that use the basic
version of the "clicking paradigm". The participants did not receive a
description of the payoff distributions. The feedback after each choice was a
draw from each of the two payoff distributions, one for each key.</p>
      <p>
        The studies summarized above focused on situations with complete feedback: the
feedback after each trial informed the decision makers of the payoff that they got, and
of the payoff that they would have received had they selected a different action. In
many natural settings the feedback is limited to the outcome of the selected action,
and decision makers have to explore to learn the incentive structure. Analysis of this
set of situations highlights the robustness of underweighting of rare events, and shows
the significance of a second phenomenon: "the hot stove effect"
        <xref ref-type="bibr" rid="ref2">(Denrell &amp; March,
2001)</xref>
        . When the feedback is limited to the obtained outcome, the effect of relatively
bad outcomes lasts longer than the effect of good outcomes. The explanation is
simple: bad outcomes decrease the probability of repeated choice and, for that reason,
they slow reevaluation of the disappointing option. As a result, experience with
limited feedback decreases the tendency to select the risky prospect.
        <fn id="fn1">
          <label>1</label>
          <p>
            Notice that this observation is inconsistent with prospect theory
            <xref ref-type="bibr" rid="ref7">(Kahneman &amp; Tversky,
1979)</xref>
            . Prospect theory summarizes the results of experiments in which people decide based
on a description of the incentive structure, and these studies reveal overweighting of rare
events. The current studies imply that the opposite bias emerges in decisions from
experience. That is, the results reflect an experience-description gap
            <xref ref-type="bibr" rid="ref6 ref9">(Hertwig &amp; Erev, 2009;
Lejarraga &amp; Gonzalez, 2011)</xref>
            .
          </p>
        </fn>
      </p>
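      <p>The hot stove mechanism described above can be illustrated with a small simulation. The Python sketch below is a toy model under stated assumptions, not the model used in the cited studies: the learner is purely greedy and never explores, it keeps a recency-weighted estimate of the risky payoff (the weight alpha = 0.3 is arbitrary), and the risky key pays the Problem 2 gamble (+10 with p = 0.1; -1 otherwise). All function names are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def draw_risky(rng):
    """One draw from the Problem 2 gamble in the text: +10 with p = 0.1, else -1."""
    return 10 if rng.random() < 0.1 else -1


def risky_rate(outcomes, full_feedback, alpha=0.3):
    """Share of trials on which a greedy learner picks the risky key.

    The learner keeps a recency-weighted estimate of the risky payoff and
    picks the risky key whenever that estimate is at least 0 (the safe payoff).
    With full feedback the estimate is updated on every trial; with limited
    feedback it is updated only on trials where the risky key was chosen.
    """
    estimate = 0.0
    risky_choices = 0
    for outcome in outcomes:
        chose_risky = estimate >= 0
        risky_choices += chose_risky
        if full_feedback or chose_risky:
            estimate += alpha * (outcome - estimate)
    return risky_choices / len(outcomes)


def simulate(n_agents=1000, n_trials=100, seed=0):
    """Average risky-choice rates under full and limited feedback."""
    rng = random.Random(seed)
    full_total = limited_total = 0.0
    for _ in range(n_agents):
        # Both conditions face the same payoff sequence, so any difference
        # comes from the feedback rule alone.
        outcomes = [draw_risky(rng) for _ in range(n_trials)]
        full_total += risky_rate(outcomes, full_feedback=True)
        limited_total += risky_rate(outcomes, full_feedback=False)
    return full_total / n_agents, limited_total / n_agents


full, limited = simulate()
print(f"risky-choice rate, full feedback:    {full:.2f}")
print(f"risky-choice rate, limited feedback: {limited:.2f}")
```
]]></preformat>
      <p>In this sketch the limited-feedback learner stops choosing the risky key as soon as its estimate falls below zero, because the estimate is then never revised. The full-feedback learner keeps observing forgone payoffs and can return to the risky key after a forgone +10. The risky-choice rate under limited feedback therefore cannot exceed the rate under full feedback for the same payoff sequence.</p>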
      <p>Many of the learning traps implied by underweighting of rare events and the hot
stove effect can be described as reflections of insufficient exploration.
Underweighting of rare events implies that insufficient exploration is particularly
likely when the probability of success given exploration is low. In these situations
people tend to "give up" too early, and exhibit learned helplessness (Teodorescu &amp;
Erev, 2014). For example, they do not learn to use software and applications in the
ways that will serve them best, and in some cases they are not aware of the fact
that they are likely to enjoy certain activities (e.g., watching certain types of movies).</p>
      <p>The hot stove effect implies that insufficient exploration can also be the product of
a random sequence of bad experiences. For example, a sequence of two bad
experiences with a particular product category can lead the agent to stop exploring
this category and remember it as unattractive.</p>
      <p>Related implications of underweighting of rare events involve a tendency to ignore
instructions, to sign contracts without reading them, and to skip questionnaires. These
behaviors are expected when the extra effort (reading instructions or contracts, and/or
filling questionnaires) may be effective in expectation but the common outcome is a
waste of time.</p>
      <p>Designers of recommender systems can address these and similar learning traps by
affecting the incentive structure. Specifically, it is important to understand that in
many cases giving users what they say they want may not be good enough. It
is possible that the users' behavior reflects a learning trap, and encouraging them to
explore and read can help them escape the trap.</p>
      <ref-list>
        <ref>
          <label>9</label>
          <mixed-citation>Teodorescu K, Amir M, Erev I (2013) The experience–description gap and the role of the inter decision interval. In C. Pammi and N. Srinivasan (Eds.), Decision making: neural and behavioural approaches. Elsevier.</mixed-citation>
        </ref>
        <ref>
          <label>10</label>
          <mixed-citation>Teodorescu K, Erev I (2014) Learned helplessness and learned prevalence: Exploring the causal relations among perceived controllability, reward prevalence, and exploration. Psychological Science. DOI: 10.1177/0956797614543022.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Ariely</surname>
            <given-names>D</given-names>
          </string-name>
          (
          <year>2008</year>
          )
          <source>Predictably irrational</source>
          . New York: HarperCollins.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Denrell</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>March</surname>
            <given-names>JG</given-names>
          </string-name>
          (
          <year>2001</year>
          )
          <article-title>Adaptation as information restriction: The hot stove effect</article-title>
          .
          <source>Organization Science</source>
          ,
          <volume>12</volume>
          (
          <issue>5</issue>
          ),
          <fpage>523</fpage>
          -
          <lpage>538</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <uri>http://www.utdallas.edu/~eeh017200/papers/LearningChapter.pdf</uri>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Erev</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            <given-names>AE</given-names>
          </string-name>
          (
          <year>2014</year>
          )
          <article-title>Maximization, learning and economic behavior</article-title>
          .
          <source>Proceedings of the National Academy of Sciences</source>
          ,
          <volume>111</volume>
          ,
          <fpage>10818</fpage>
          -
          <lpage>10825</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Hertwig</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erev</surname>
            <given-names>I</given-names>
          </string-name>
          (
          <year>2009</year>
          )
          <article-title>The description-experience gap in risky choice</article-title>
          .
          <source>Trends in Cognitive Sciences</source>
          ,
          <volume>13</volume>
          ,
          <fpage>517</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Kahneman</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tversky</surname>
            <given-names>A</given-names>
          </string-name>
          (
          <year>1979</year>
          )
          <article-title>Prospect theory: An analysis of decision under risk</article-title>
          .
          <source>Econometrica</source>
          ,
          <volume>47</volume>
          ,
          <fpage>263</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Lejarraga</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez</surname>
            <given-names>C</given-names>
          </string-name>
          (
          <year>2011</year>
          )
          <article-title>Effects of feedback and complexity on repeated decisions from description</article-title>
          .
          <source>Organizational Behavior and Human Decision Processes</source>
          ,
          <volume>116</volume>
          (
          <issue>2</issue>
          ),
          <fpage>286</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Nevo</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erev</surname>
            <given-names>I</given-names>
          </string-name>
          (
          <year>2012</year>
          )
          <article-title>On surprise, change, and the effect of recent outcomes</article-title>
          .
          <source>Frontiers in Cognitive Science</source>
          ,
          <volume>3</volume>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>