<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predicting actions using an adaptive probabilistic model of human decision behaviours</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A.H. Cruickshank</string-name>
          <email>A.H.Cruickshank@sms.ed.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R. Shillcock</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Ramamoorthy</string-name>
          <email>S.Ramamoorthy@ed.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Informatics, University of Edinburgh</institution>
          ,
          <addr-line>EH8 9AB</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Computer interfaces provide an environment that allows for multiple objectively optimal solutions, but individuals will, over time, use a smaller number of subjectively optimal solutions, developed as habits that have been formed and tuned by repetition. Designing an interface agent to provide assistance in this environment thus requires not only knowledge of the objectively optimal solutions, but also recognition that users act from habit and that adaptation to an individual's subjectively optimal solutions is required. We present a dynamic Bayesian network model for predicting a user's actions by inferring whether a decision is being made by deliberation or through habit. The model adapts to individuals in a principled manner by incorporating observed actions using Bayesian probabilistic techniques. We demonstrate the model's effectiveness using specific implementations of deliberative and habitual decision making that are simple enough to transparently expose the mechanisms of our estimation procedure. We show that this implementation achieves &gt; 90% prediction accuracy in a task with a large number of optimal solutions and a high degree of freedom in selecting actions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Computer interfaces provide an environment that allows for multiple
objectively optimal solutions, but individuals will, over time, use a smaller
number of subjectively optimal solutions, developed as habits that have
been formed and tuned by repetition. Thus an interface agent providing
assistance in this environment requires not only knowledge of the
objectively optimal solutions, but also the ability to adapt to an individual’s
subjectively optimal solutions. Drawing on findings in psychology and
neuroscience, we propose a general model that adapts to individuals using
Bayesian probability to infer the type of decision making behaviour that
will be used. We demonstrate the effectiveness of our approach using
simple implementations of two decision systems, the deliberative and
habitual systems. The deliberative system uses an internal model of the
environment for forward planning to reach a goal and selects actions
based on the calculated plan. The habitual system learns the utility of
actions in a situation based on previous experience and selects the one
that has proven to be most useful in the past. Both of these
systems have been recognised since early studies in psychology and
neuroscience ([
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) and are known to coexist ([
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). Our approach is related
to others that are derived from human decision making, such as that
used in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which proposes a Bayesian Theory of Mind, and [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which
extended the ACT-R cognitive architecture to create ACT-R/E. Other
approaches for plan and action recognition are derived from applying
automated planning and machine learning techniques. These generally fall
into one of two categories: a planner approach ([
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) or a historic approach
([
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]). Notably, each of these approaches broadly replicates one of the
two types of decision system used by people when selecting actions.
Planner predictors replicate the deliberative decision system, whereas historic
predictors replicate the habitual decision system. Our approach can thus
be viewed as integrating these two types of predictors using Bayesian
model combination.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Predictive Model</title>
      <p>Action prediction is performed using a dynamic Bayesian network model.
B is a decision behaviour, which we define as any process that provides
a distribution over actions given a state and previously observed
actions. That is, a decision behaviour can be represented as a distribution
p(a′ | B, s, a<sub>&lt;n</sub>). To predict the action a<sub>n</sub> for a specific state s (elided for
simplicity), given previously observed actions a<sub>&lt;n</sub>, we calculate
<disp-formula id="eq-1"><label>(1)</label><tex-math><![CDATA[p(a_n \mid a_{<n}) = \sum_{B_n} p(B_n \mid a_{<n}) \, p(a_n \mid B_n, a_{<n})]]></tex-math></disp-formula>
marginalising over the latent variable, B. A simple, recursive update
mechanism can be used for p(B<sub>n</sub> | a<sub>&lt;n</sub>), which significantly increases the
efficiency of the calculation:
<disp-formula id="eq-2"><label>(2)</label><tex-math><![CDATA[p(B_n \mid a_{<n}) \propto p(B_{n-1} \mid a_{<(n-1)}) \, p(a_{n-1} \mid a_{<(n-1)}, B_{n-1})]]></tex-math></disp-formula></p>
      <p>In our initial implementation we use relatively simple instantiations of
the deliberative and habitual behaviours.</p>
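      <p>As an illustration of Equations (1) and (2), the following minimal Python sketch mixes the predictions of several behaviours by their current weights and then reweights each behaviour by the probability it gave to the action that was actually observed. The function names, the two toy behaviours and their probabilities are hypothetical and are not taken from the experiment; keeping the previous belief when no behaviour assigns any probability to the observed action is likewise an assumption of this sketch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Dict, List

Distribution = Dict[str, float]
# A behaviour maps the action history to a distribution over the next action.
Behaviour = Callable[[List[str]], Distribution]


def predict(belief: Distribution, behaviours: Dict[str, Behaviour],
            history: List[str]) -> Distribution:
    """Equation (1): weight each behaviour's prediction by its current belief."""
    mixed: Distribution = {}
    for name, weight in belief.items():
        for action, prob in behaviours[name](history).items():
            mixed[action] = mixed.get(action, 0.0) + weight * prob
    return mixed


def update(belief: Distribution, behaviours: Dict[str, Behaviour],
           history: List[str], observed: str) -> Distribution:
    """Equation (2): reweight each behaviour by how well it predicted `observed`."""
    unnorm = {
        name: weight * behaviours[name](history).get(observed, 0.0)
        for name, weight in belief.items()
    }
    total = sum(unnorm.values())
    if total == 0.0:
        # Assumption: if no behaviour explains the action, keep the old belief.
        return dict(belief)
    return {name: value / total for name, value in unnorm.items()}


# Hypothetical toy behaviours over two actions, "a" and "b".
behaviours = {
    "D": lambda history: {"a": 0.5, "b": 0.5},
    "H": lambda history: {"a": 0.9, "b": 0.1},
}
belief = {"D": 0.5, "H": 0.5}
history: List[str] = []

prediction = predict(belief, behaviours, history)
for action in ["a", "a", "a"]:
    belief = update(belief, behaviours, history, action)
    history.append(action)
```
]]></preformat>
      <p>In this toy run the weight of the behaviour labelled H grows with each repeated action, because H assigns that action a higher probability than D does. Only the previous belief and the most recent observation are needed at each step, which is the efficiency gain of the recursive form.</p>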
      <p>Deliberative, D: The deliberative behaviour selects actions based on
solving the experimental task (described in Section 3) using the
smallest number of moves, which subjects find by planning over the available
actions. We use an abstraction of the planning process, effectively
precalculating the optimal actions that exist and modelling the deliberative
behaviour using a Multinomial distribution for each task state. As the
task allows for multiple optimal actions, the parameters of the
distribution are set such that all optimal actions have equal, high probability,
with a small probability that a non-optimal action is selected by
mistake.
<disp-formula id="eq-3"><label>(3)</label><tex-math><![CDATA[p(a_n \mid B_n = D, a_{<n}) = p(a_n \mid B_n = D) = \mathrm{Multinomial}(\psi_s)]]></tex-math></disp-formula></p>
      <p>Habitual, H: The habitual behaviour selects actions based on previous
experience. We again use an abstraction, this time of the habitual process, which
assumes that the more often an action has been selected before, the
more likely it is to be selected again. We model this using a Dirichlet prior
over a Multinomial distribution for each state. The hyper-parameters
of the Dirichlet are initialised to the same value, making each action
equally likely, and on observing an action the associated parameter is
incremented.
<disp-formula id="eq-4"><label>(4)</label><tex-math><![CDATA[p(a_n \mid B_n = H, a_{<n}) \propto \mathrm{Dirichlet}(\alpha_s)]]></tex-math></disp-formula></p>
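      <p>The two behaviours described by Equations (3) and (4) can be sketched in Python as follows. This is an illustrative reading rather than the implementation used in the experiment: the action names, the mistake probability <monospace>epsilon</monospace>, the initial pseudo-count <monospace>prior</monospace> and the example state are all hypothetical, and the habitual prediction is taken here to be the normalised Dirichlet counts, which is one common way to turn such counts into action probabilities.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

ACTIONS = ["left", "right", "up", "down"]  # hypothetical action set


def deliberative(optimal, actions=ACTIONS, epsilon=0.05):
    """Equal, high probability for optimal actions; epsilon shared by the rest.

    `epsilon` (total probability of a mistaken, non-optimal action) is an
    assumed value, not one reported in the text.
    """
    optimal = set(optimal)
    if not optimal:
        raise ValueError("at least one optimal action is required")
    others = [a for a in actions if a not in optimal]
    if not others:
        # Every action is optimal, so the distribution is uniform.
        return {a: 1.0 / len(actions) for a in actions}
    return {
        a: (1.0 - epsilon) / len(optimal) if a in optimal else epsilon / len(others)
        for a in actions
    }


class Habitual:
    """Dirichlet pseudo-counts over actions, kept separately for each state."""

    def __init__(self, actions=ACTIONS, prior=1.0):
        self.actions = list(actions)
        # Every state starts with the same pseudo-count for each action.
        self.alpha = defaultdict(lambda: {a: prior for a in self.actions})

    def observe(self, state, action):
        """Increment the parameter associated with the observed action."""
        self.alpha[state][action] += 1.0

    def predict(self, state):
        """Normalise the counts for this state into action probabilities."""
        counts = self.alpha[state]
        total = sum(counts.values())
        return {a: c / total for a, c in counts.items()}


d = deliberative({"left", "up"})
h = Habitual()
state = (1, (3, 2))  # hypothetical (layout, connection point) pair
before = h.predict(state)
for _ in range(3):
    h.observe(state, "left")
after = h.predict(state)
```
]]></preformat>
      <p>Before any observation the habitual prediction is uniform, matching the equal initialisation of the hyper-parameters. After three observations of the same action in the same state, that action's probability rises while the others fall.</p>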
    </sec>
    <sec id="sec-3">
      <title>Experimental Results</title>
      <p>To illustrate the utility of our proposed approach, we present results from
a human subject experiment that required subjects to complete a novel
task with multiple optimal solutions. The task used a
“construction” paradigm in which subjects had to join coloured connectors
using a limited selection of parts, shown in Figure 1. The five layouts
used in the experiment were designed such that they could be completed
using four parts to join each of the coloured connectors, giving an optimal
number of actions of 19 (4 part selection actions for each colour plus 3
colour selection actions). Ten subjects took part in the experiment (5
male, 5 female; a mix of staff and PhD students from the School of
Informatics, University of Edinburgh).</p>
      <p>A specific task state, as used in the models described above, is defined
by the layout being solved {1, ..., 5} and the (x, y) co-ordinate of the current
connection point. We compared the predictive accuracy of the individual
component models and of the combined model. A windowed online
accuracy metric was used to assess the predictive power of the models;
it allows for learning dynamics by considering only the most recent
predictions, up to the last 100.</p>
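      <p><disp-formula id="eq-5"><tex-math><![CDATA[\mathrm{accuracy}_{\text{window online}}(t) = \frac{\sum_{i=\max(0,\, t-100)}^{t} \delta_{a_O(i) = a_P(i)}}{\min(t, 100)}]]></tex-math></disp-formula>
where a<sub>O</sub>(i) is the observed action, a<sub>P</sub>(i) is the predicted action at time
i, and δ<sub>a<sub>O</sub>(i)=a<sub>P</sub>(i)</sub> is 1 if these match and 0 otherwise.</p>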
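      <p>A windowed online accuracy of this kind can be computed incrementally. The Python sketch below keeps only the most recent match indicators in a fixed-length buffer; the class name and the example data are hypothetical, and the exact treatment of the window boundary (here, precisely the last <monospace>window</monospace> predictions) is an assumption of the sketch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import deque


class WindowedAccuracy:
    """Fraction of correct predictions over the most recent `window` steps."""

    def __init__(self, window=100):
        # A bounded deque silently drops the oldest entry once it is full.
        self.hits = deque(maxlen=window)

    def record(self, observed, predicted):
        """Store whether the prediction matched and return the current accuracy."""
        self.hits.append(1 if observed == predicted else 0)
        return sum(self.hits) / len(self.hits)


# Hypothetical (observed, predicted) pairs with a short window for illustration.
metric = WindowedAccuracy(window=3)
pairs = [("a", "a"), ("a", "b"), ("b", "b"), ("b", "b")]
scores = [metric.record(observed, predicted) for observed, predicted in pairs]
```
]]></preformat>
      <p>Because old predictions fall out of the buffer, early mistakes made before a model has adapted to a subject stop affecting the score once enough later predictions have been recorded.</p>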
      <p>Figure 2 shows the mean accuracy of the three predictive models across
10 runs for a subset of subjects. The highest accuracy was 92%, the
lowest was 57%, and the mean was 71% with a standard deviation of 14%. Adaptation to the
subjects can be seen, and is clearest for Subjects 6 and 8.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>This work is supported by the University of Edinburgh Neuroinformatics
Doctoral Training Centre, funded by EPSRC, BBSRC and MRC.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>C.L.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.R.</given-names>
            <surname>Saxe</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.B.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          .
          <article-title>Bayesian theory of mind: Modeling joint belief-desire attribution</article-title>
          .
          <source>In Proceedings of the Thirty-Second Annual Conference of the Cognitive Science Society</source>
          , pages
          <fpage>2469</fpage>
          -
          <lpage>2474</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>C.L.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.B.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.R.</given-names>
            <surname>Saxe</surname>
          </string-name>
          <article-title>Bayesian models of human action understanding</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          <volume>18</volume>
          , pages
          <fpage>99</fpage>
          -
          <lpage>106</lpage>
          . MIT Press,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>A.</given-names>
            <surname>Dickinson</surname>
          </string-name>
          <article-title>Actions and habits: The development of behavioural autonomy</article-title>
          .
          <source>Philosophical Transactions of the Royal Society of London. B, Biological Sciences</source>
          ,
          <volume>308</volume>
          (
          <issue>1135</issue>
          ):
          <fpage>67</fpage>
          -
          <lpage>78</lpage>
          ,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>H.A.</given-names>
            <surname>Kautz</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.F.</given-names>
            <surname>Allen</surname>
          </string-name>
          <article-title>Generalized plan recognition</article-title>
          .
          <source>In AAAI</source>
          , volume
          <volume>86</volume>
          , pages
          <fpage>32</fpage>
          -
          <lpage>37</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>K.W.</given-names>
            <surname>Spence</surname>
          </string-name>
          .
          <article-title>Behaviour theory and conditioning</article-title>
          .
          <year>1956</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>E.C.</given-names>
            <surname>Tolman</surname>
          </string-name>
          .
          <article-title>Purposive behaviour in animals and men</article-title>
          . Appleton-Century-Crofts, New York, pages
          <fpage>209</fpage>
          -
          <lpage>211</lpage>
          ,
          <year>1932</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>G.</given-names>
            <surname>Trafton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hiatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harrison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tamborello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khemlani</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Schultz</surname>
          </string-name>
          .
          <article-title>ACT-R/E: An embodied cognitive architecture for human-robot interaction</article-title>
          .
          <source>Journal of Human-Robot Interaction</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>30</fpage>
          -
          <lpage>55</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>