<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Simulating Actions with the Associative Self-Organizing Map</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Miriam Buonamente</string-name>
          <email>miriam.buonamente@unipa.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haris Dindo</string-name>
          <email>haris.dindo@unipa.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Magnus Johnsson</string-name>
          <email>magnus@magnusjohnsson.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lund University Cognitive Science</institution>
          ,
          <addr-line>Lundagard, 222 22 Lund</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>RoboticsLab, DICGIM, University of Palermo</institution>
          ,
          <addr-line>Viale delle Scienze, Ed. 6, 90128 Palermo</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a system that can learn to represent actions as well as to internally simulate the likely continuation of their initial parts. The method we propose is based on the Associative Self-Organizing Map (A-SOM), a variant of the Self-Organizing Map. By emulating the way the human brain is thought to perform pattern recognition tasks, the A-SOM learns to associate its activity with different inputs over time, where inputs are observations of others' actions. Once the A-SOM has learnt to recognize actions, it uses this learning to predict the continuation of an observed initial movement of an agent, in this way reading its intentions. We evaluate the system's ability to simulate actions in an experiment with good results, and we provide a discussion about its generalization ability. The presented research is part of a bigger project aiming at endowing an agent with the ability to internally represent action patterns and to use these to recognize and simulate others' behaviour.</p>
      </abstract>
      <kwd-group>
        <kwd>Associative Self-Organizing Map</kwd>
        <kwd>Neural Network</kwd>
        <kwd>Action Recognition</kwd>
        <kwd>Internal Simulation</kwd>
        <kwd>Intention Understanding</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Robots are on the verge of becoming part of human society. The aim is
to augment human capabilities with automated and cooperative robotic devices
so that people can lead more convenient and safer lives. Robotic agents could be applied in
several fields, such as general assistance with everyday tasks for elderly and
disabled people, enabling them to live independent and comfortable lives.
To meet this demand, natural and intuitive
interfaces, which allow inexperienced users to employ their robots easily and
safely, have to be implemented.</p>
      <p>Efficient cooperation between humans and robots requires continuous and
complex intention recognition; agents have to understand and predict human
intentions and motion. In our daily interactions, we depend on the ability to
understand the intent of others, which allows us to read others' minds. In a simple
dance, two persons coordinate their steps and their movements by subliminally
predicting each other's intentions. In the same way, in multi-agent
environments, two or more agents that cooperate (or compete) to perform a certain
task have to mutually understand their intentions.</p>
      <p>
        Intention recognition can be defined as the problem of inferring an agent's
intention through the observation of its actions. This problem has been addressed in
several fields of human-robot collaboration [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In robotics, intention recognition
has been addressed in many contexts like social interaction [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and learning by
imitation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Intention recognition requires a wide range of evaluative processes including,
among others, the decoding of biological motion and the ability to recognize
tasks. This decoding is presumably based on the internal simulation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] of other
people's behaviour within our own nervous system. The visual perception of
motion is a particularly crucial source of sensory input. It is essential to be able
to pick out the motion in order to predict the actions of other individuals. Johansson's
experiment [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] showed that humans, just by observing points of light, were able
to perceive and understand movements. By looking at biological motion, such as
Johansson's walkers, humans attribute mental states such as intentions and
desires to the observed movements. Recent neurobiological studies [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] corroborate
Johansson's experiment by arguing that the human brain can perceive actions by
observing only the human body poses, called postures, during action execution.
Thus, actions can be described as sequences of consecutive human body poses,
in terms of human body silhouettes [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Many neuroscientists believe
that the ability to understand the intentions of other people just by observing
them depends on the so-called mirror-neuron system in the brain [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which
comes into play not only when an action is performed, but also when a similar
action is observed. It is believed that this mechanism is based on the internal
simulation of the observed action and the estimation of the actor's intentions on
the basis of a representation of one's own intentions [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Our long-term goal is to endow an agent with the ability to internally
represent motion patterns and to use these patterns to recognize and simulate others'
behaviour. The study presented here is part of a bigger project whose first step
was to efficiently represent and recognize human actions [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] by using the
Associative Self-Organizing Map (A-SOM) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In this paper we want to use the
same biologically-inspired model to predict an agent's intentions by internally
simulating the behaviour likely to follow initial movements. As humans do
effortlessly, agents have to be able to elicit the likely continuation of the observed
action even if an obstacle or other factors obscure their view. Indeed, as we will
see below, the A-SOM can remember perceptual sequences by associating the
current network activity with its own earlier activity. Due to this ability, the
A-SOM could receive an incomplete input pattern and continue to elicit the likely
continuation, i.e. to carry out sequence completion of perceptual activity over
time.
      </p>
      <p>We have tested the A-SOM on the simulation of observed actions using a suitable
dataset made of images depicting only the part of the person's body involved
in the movement. The images used to create this dataset were taken from the
“INRIA 4D repository”<sup>3</sup>, a publicly available dataset of movies representing 13
common actions: check watch, cross arms, scratch head, sit down, get up, turn
around, walk, wave, punch, kick, point, pick up, and throw (see Fig. 1).</p>
      <p>This paper is organized as follows: a short presentation of the A-SOM
network is given in Section 2. Section 3 presents the method and the experiments
for evaluating performance. Conclusions and future work are outlined in Section
4.
</p>
    </sec>
    <sec>
      <title>2 Associative Self-Organizing Map</title>
      <p>
The A-SOM is an extension of the Self-Organizing Map (SOM) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] which
learns to associate its activity with the activity of other neural networks. It can
be considered a SOM with additional (possibly delayed) ancillary input from
other networks (see Fig. 2).
      </p>
      <p>
        Ancillary connections can also be used to connect the A-SOM to itself, thus
associating its activity with its own earlier activity. This makes the A-SOM able
to remember and to complete perceptual sequences over time. Previous simulations
show that the A-SOM, once it has received some initial input, can continue to elicit
the likely following activity in the near future even though no further input
is received [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
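      <p>The A-SOM consists of an <inline-formula><tex-math>I \times J</tex-math></inline-formula> grid of neurons with a fixed number of
neurons and a fixed topology. Each neuron <inline-formula><tex-math>n_{ij}</tex-math></inline-formula> is associated with <inline-formula><tex-math>r + 1</tex-math></inline-formula> weight
vectors <inline-formula><tex-math>w^a_{ij} \in \mathbb{R}^n</tex-math></inline-formula> and <inline-formula><tex-math>w^1_{ij} \in \mathbb{R}^{m_1}, w^2_{ij} \in \mathbb{R}^{m_2}, \ldots, w^r_{ij} \in \mathbb{R}^{m_r}</tex-math></inline-formula>. All the elements
of all the weight vectors are initialized by real numbers randomly selected from
a uniform distribution between 0 and 1, after which all the weight vectors are
normalized, i.e. turned into unit vectors.</p>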
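      <p>As a rough illustration of the initialization just described, the following Python sketch draws every weight element from a uniform distribution on [0, 1) and then rescales each weight vector to unit length. The function names, the grid size and the fixed random seed are illustrative assumptions; they are not taken from the Ikaros implementation used for the experiments.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
import random


def unit_vector(v):
    """Scale v to Euclidean length 1 (v must not be the null vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def init_weights(rows, cols, dim, rng):
    """Return a rows x cols grid of unit weight vectors of length dim.

    Elements are drawn uniformly from [0, 1) and each vector is then normalized.
    """
    return [[unit_vector([rng.random() for _ in range(dim)])
             for _ in range(cols)]
            for _ in range(rows)]


# Hypothetical sizes: a 3 x 4 grid, 900-dimensional main input, and one
# ancillary input equal to the flattened grid activity (3 * 4 = 12 values).
rng = random.Random(0)
w_main = init_weights(3, 4, 900, rng)
w_anc = init_weights(3, 4, 3 * 4, rng)
```
]]></preformat>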
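      <p>At time <inline-formula><tex-math>t</tex-math></inline-formula> each neuron <inline-formula><tex-math>n_{ij}</tex-math></inline-formula> receives <inline-formula><tex-math>r + 1</tex-math></inline-formula> input vectors <inline-formula><tex-math>x^a(t) \in \mathbb{R}^n</tex-math></inline-formula> and
<inline-formula><tex-math>x^1(t - d_1) \in \mathbb{R}^{m_1}, x^2(t - d_2) \in \mathbb{R}^{m_2}, \ldots, x^r(t - d_r) \in \mathbb{R}^{m_r}</tex-math></inline-formula>, where <inline-formula><tex-math>d_p</tex-math></inline-formula> is the time
delay for input vector <inline-formula><tex-math>x^p</tex-math></inline-formula>, <inline-formula><tex-math>p = 1, 2, \ldots, r</tex-math></inline-formula>.</p>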
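      <p>The main net input <inline-formula><tex-math>s_{ij}</tex-math></inline-formula> is calculated using the standard cosine metric
        <disp-formula><label>(1)</label><tex-math>s_{ij}(t) = \frac{x^a(t) \cdot w^a_{ij}(t)}{\|x^a(t)\| \, \|w^a_{ij}(t)\|}</tex-math></disp-formula>
      </p>
      <p>
        The activity in the neuron <inline-formula><tex-math>n_{ij}</tex-math></inline-formula> is given by
        <disp-formula><label>(2)</label><tex-math>y_{ij} = \left[ y^a_{ij}(t) + y^1_{ij}(t) + y^2_{ij}(t) + \ldots + y^r_{ij}(t) \right] / (r + 1)</tex-math></disp-formula>
        where the main activity <inline-formula><tex-math>y^a_{ij}</tex-math></inline-formula> is calculated by using the softmax function [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]
        <disp-formula><tex-math>y^a_{ij}(t) = \frac{(s_{ij}(t))^m}{\max_{ij} (s_{ij}(t))^m}</tex-math></disp-formula>
        where <inline-formula><tex-math>m</tex-math></inline-formula> is the softmax exponent.
      </p>
      <p>The ancillary activity <inline-formula><tex-math>y^p_{ij}(t)</tex-math></inline-formula>, <inline-formula><tex-math>p = 1, 2, \ldots, r</tex-math></inline-formula>, is calculated by again using the
standard cosine metric
        <disp-formula><tex-math>y^p_{ij}(t) = \frac{x^p(t - d_p) \cdot w^p_{ij}(t)}{\|x^p(t - d_p)\| \, \|w^p_{ij}(t)\|}</tex-math></disp-formula>
      </p>
      <p>The weights <inline-formula><tex-math>w^a_{ijk}</tex-math></inline-formula> are adapted by
        <disp-formula><tex-math>w^a_{ijk}(t + 1) = w^a_{ijk}(t) + \alpha(t) G_{ijc}(t) \left[ x^a_k(t) - w^a_{ijk}(t) \right]</tex-math></disp-formula>
        where <inline-formula><tex-math>0 \le \alpha(t) \le 1</tex-math></inline-formula> is the adaptation strength with <inline-formula><tex-math>\alpha(t) \to 0</tex-math></inline-formula> when <inline-formula><tex-math>t \to \infty</tex-math></inline-formula>.
        The neighbourhood function <inline-formula><tex-math>G_{ijc}(t) = e^{-\frac{\|r_c - r_{ij}\|}{2\sigma^2(t)}}</tex-math></inline-formula> is a Gaussian function
decreasing with time, and <inline-formula><tex-math>r_c \in \mathbb{R}^2</tex-math></inline-formula> and <inline-formula><tex-math>r_{ij} \in \mathbb{R}^2</tex-math></inline-formula> are location vectors of neurons <inline-formula><tex-math>c</tex-math></inline-formula>
and <inline-formula><tex-math>n_{ij}</tex-math></inline-formula> respectively.</p>
      <fn-group>
        <fn>
          <label>3</label>
          <p>The repository is available at http://4drepository.inrialpes.fr. It offers several movies
representing sequences of actions. Each video is captured from 5 different cameras.
For the experiments in this paper we chose the movie “Julien1” with the frontal
camera view “cam0”.</p>
        </fn>
      </fn-group>
    </sec>
    <sec id="sec-2">
      <p>The weights <inline-formula><tex-math>w^p_{ijl}</tex-math></inline-formula>, <inline-formula><tex-math>p = 1, 2, \ldots, r</tex-math></inline-formula>, are adapted by
        <disp-formula><tex-math>w^p_{ijl}(t + 1) = w^p_{ijl}(t) + \beta x^p_l(t - d_p) \left[ y^a_{ij}(t) - y^p_{ij}(t) \right]</tex-math></disp-formula>
        where <inline-formula><tex-math>\beta</tex-math></inline-formula> is the adaptation strength.</p>
    </sec>
    <sec id="sec-3">
      <p>All weights <inline-formula><tex-math>w^a_{ijk}(t)</tex-math></inline-formula> and <inline-formula><tex-math>w^p_{ijl}(t)</tex-math></inline-formula> are normalized after each adaptation.</p>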
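      <p>The activity and adaptation rules above can be sketched in a few lines of Python for the case of a single ancillary input (r = 1). This is a minimal sketch under stated assumptions, not the Ikaros code used in the experiments: the function names and toy values are hypothetical, a null input vector is assumed to produce zero cosine, inputs and weights are assumed non-negative so that the softmax power is well defined, and the neuron c is assumed to be the one with the largest main activity.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine between u and v; defined here as 0.0 if either vector is null."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0  # assumption: a null input produces no activity
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def normalize(w):
    """Return w scaled to unit length (unchanged if it is the null vector)."""
    n = math.sqrt(sum(a * a for a in w))
    return [a / n for a in w] if n > 0.0 else list(w)


def activity(x_main, x_anc, w_main, w_anc, m):
    """Main, ancillary and total activity per neuron for r = 1."""
    s = [cosine(x_main, w) for w in w_main]              # main net input
    powered = [v ** m for v in s]
    peak = max(powered)
    y_main = [p / peak if peak > 0.0 else 0.0 for p in powered]
    y_anc = [cosine(x_anc, w) for w in w_anc]            # ancillary activity
    y = [(a + p) / 2.0 for a, p in zip(y_main, y_anc)]   # mean of r + 1 = 2 terms
    return y_main, y_anc, y


def adapt_main(w_main, x_main, positions, c, alpha, sigma):
    """Move each main weight vector towards x_main, scaled by its
    neighbourhood value with respect to neuron c, then renormalize."""
    updated = []
    for w, pos in zip(w_main, positions):
        g = math.exp(-math.dist(positions[c], pos) / (2.0 * sigma ** 2))
        updated.append(normalize([wk + alpha * g * (xk - wk)
                                  for wk, xk in zip(w, x_main)]))
    return updated


def adapt_ancillary(w_anc, x_anc, y_main, y_anc, beta):
    """Pull each ancillary activity towards the main activity of the
    same neuron, then renormalize."""
    return [normalize([wl + beta * xl * (ya - yp) for wl, xl in zip(w, x_anc)])
            for w, ya, yp in zip(w_anc, y_main, y_anc)]


# Toy example: two neurons on a 1 x 2 grid with two-dimensional inputs.
positions = [(0, 0), (0, 1)]
w_main = [[1.0, 0.0], [0.0, 1.0]]
w_anc = [[0.6, 0.8], [0.8, 0.6]]
x_main, x_anc = [0.6, 0.8], [1.0, 0.0]

y_main, y_anc, y = activity(x_main, x_anc, w_main, w_anc, m=2)
c = y_main.index(max(y_main))  # assumption: c has the largest main activity
w_main = adapt_main(w_main, x_main, positions, c, alpha=0.5, sigma=1.0)
w_anc = adapt_ancillary(w_anc, x_anc, y_main, y_anc, beta=0.1)
```
]]></preformat>
      <p>In the setting of this paper the ancillary input would be the flattened activity of the map from the previous iteration, so the same two adaptation steps would be applied once per posture vector.</p>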
      <p>
        In this paper the ancillary input vector <inline-formula><tex-math>x^1</tex-math></inline-formula> is the activity of the A-SOM from
the previous iteration rearranged into a vector with the time delay <inline-formula><tex-math>d_1 = 1</tex-math></inline-formula>.
      </p>
      <p>
        We want to evaluate whether the bio-inspired model, introduced and tested for the
action recognition task in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] (see Fig. 3), is also able to simulate the continuation
of the initial part of an action. To this end, we tested the simulation capabilities of
the A-SOM. The aim of the experiment is to verify whether the network is able to receive
an incomplete input pattern and continue to elicit the likely continuation of
recognized actions. Actions, defined as single motion patterns performed by a
single human [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], are described as sequences of body postures.
      </p>
      <p>
        The dataset of actions is the same as we used for the recognition experiment
in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. It consists of more than 700 postural images representing 13 different
actions. Since we want the agent to be able to simulate one action at a time,
we split the original movie into 13 different movies: one movie for each action
(see Fig. 1). Each frame is preprocessed to reduce the noise and to improve
its quality, and the posture vectors are extracted (see Section 3.1 below). The
posture vectors are used to create the training set required to train the A-SOM.
Our final training set is composed of about 20000 samples, where every sample
is a posture vector.
      </p>
      <p>
        The created input is used to train the A-SOM network. The training lasted
for about 90000 iterations. The generated weight file is used to execute tests.
All code for the experiments presented in this paper was
implemented in C++ using the neural modelling framework Ikaros [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The following
sections detail the preprocessing phase as well as the results obtained.
      </p>
      <p>
        To reduce the computational load and to improve the performance, movies
should have the same duration and images should depict only the part of the
body involved in the movement. Reducing the number of images for each
movie to 10 is a good compromise that keeps the actions seamless and fluid
while preserving the quality of the movie. As Fig. 4 shows, reducing the
number of images in the “walk action” movie does not affect the quality
of the action reproduction.
      </p>
      <p>Consecutive images were subtracted to depict only the part of the body
involved in the action, focusing in this way the attention on the movement
exclusively. This operation further reduced the number of frames for each movie
to 9, without affecting the quality of the video. As can be seen in Fig. 5, in the
“walk action” only the arm is involved in the movement.</p>
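      <p>The frame reduction and the subtraction of consecutive images can be sketched as follows. The text does not state how the 10 frames of each movie were selected or how the subtraction was computed, so evenly spaced sampling and the absolute pixel-wise difference are assumptions made only for this illustration; the function names and the toy frames are hypothetical as well.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def subsample(frames, count):
    """Keep `count` (>= 2) frames spread evenly over the sequence,
    always including the first and the last frame."""
    if count >= len(frames):
        return list(frames)
    step = (len(frames) - 1) / (count - 1)
    return [frames[round(i * step)] for i in range(count)]


def difference(frames):
    """Absolute pixel-wise difference of consecutive frames.

    A sequence of n frames yields n - 1 difference images.
    """
    return [[[abs(a - b) for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(prev, nxt)]
            for prev, nxt in zip(frames, frames[1:])]


# Toy movie: 55 frames of 2 x 2 pixels whose top-left pixel grows over time.
frames = [[[i, 0], [0, 0]] for i in range(55)]
kept = subsample(frames, 10)   # 10 frames per movie
diffs = difference(kept)       # 9 difference images per movie
```
]]></preformat>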
      <p>To further improve the system's performance, we need to produce binary
images of fixed and small size. By using a fixed bounding box, including the part
of the body performing the action, we cut out the images eliminating anything
not involved in the movement. In this way, we simulate an attentive process in
which the human eye observes and follows the salient parts of the action only.
To have smaller representations the binary images depicting the actions were
shrunk to <inline-formula><tex-math>30 \times 30</tex-math></inline-formula> matrices. Finally, the obtained matrix representations were
vectorized to produce 9 posture vectors <inline-formula><tex-math>p \in \mathbb{R}^D</tex-math></inline-formula>, where <inline-formula><tex-math>D = 900</tex-math></inline-formula>, for each action.
These posture vectors are used as input to the A-SOM.
</p>
    </sec>
    <sec>
      <title>3.2 Action Simulation</title>
      <p>
The objective was to verify whether the A-SOM is able to internally simulate
the likely continuation of initial actions. Thus, we fed the trained A-SOM with
incomplete input patterns and expected it to continue to elicit activity patterns
corresponding to the remaining part of the action. The action recognition task
has already been tested in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] with good results. The system we set up was the
same as the one used in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and consists of one A-SOM connected to itself with
time-delayed ancillary connections. To evaluate the A-SOM, 13 sequences, each
containing 9 posture vectors, were constructed as explained above. Each of these
sequences represents an action. The posture vectors represent the binary images
that form the videos and depict only the part of the human body involved in
the action (see Fig. 6).
      </p>
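      <p>To make the construction of a posture vector concrete, the sketch below crops a difference image with a fixed bounding box, binarizes it, shrinks it to 30 by 30 pixels and flattens it into a vector of length 900. The bounding-box coordinates, the threshold, the nearest-neighbour resampling and all function names are assumptions chosen for the example; the preprocessing actually used for the experiments may differ in these details.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def crop(image, top, left, height, width):
    """Cut a fixed bounding box out of an image given as a list of rows."""
    return [row[left:left + width] for row in image[top:top + height]]


def binarize(image, threshold):
    """Map every pixel to 1 if it exceeds the threshold, otherwise to 0."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def shrink(image, size):
    """Nearest-neighbour resampling to a size x size matrix."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def to_posture_vector(image):
    """Flatten a matrix row by row into a single vector."""
    return [px for row in image for px in row]


# Toy difference image of 120 x 90 pixels with one bright moving region.
image = [[200 if 40 <= r < 80 and 30 <= c < 60 else 0 for c in range(90)]
         for r in range(120)]
box = crop(image, top=30, left=20, height=60, width=60)
posture = to_posture_vector(shrink(binarize(box, threshold=128), 30))
```
]]></preformat>
      <p>Applying these steps to the 9 difference images of one action would give the 9 posture vectors of that action.</p>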
      <p>We fed the A-SOM with one sequence at a time, each time reducing the number of
posture vectors at the end of the sequence and replacing them with
null vectors (representing no input). In this way, we created the incomplete input
that the A-SOM has to complete. The experiment consisted of several
tests. The first one used the sequences consisting of all 9 frames,
with the aim of recording the coordinates of the activity centres generated by the
A-SOM and using these values as reference values for the further iterations.
Each subsequent test used sequences with one frame fewer (replaced by a null vector
representing no input), and the A-SOM had the task of completing the
frame sequence by eliciting activity corresponding to the activity representing
the remaining part of the sequence. The last test included only the sequences
made of one frame (followed by 8 null vectors representing no input).</p>
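      <p>The series of test inputs described above can be generated mechanically. The following sketch, with a hypothetical function name and toy four-dimensional posture vectors, produces for one action the full sequence followed by the sequences in which the last frames are replaced by null vectors.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def truncated_sequences(sequence):
    """Yield (kept, test_sequence) pairs for kept = len(sequence), ..., 1.

    The first `kept` posture vectors are left untouched and the remaining
    ones are replaced by null vectors, which stand for "no input".
    """
    null = [0] * len(sequence[0])
    total = len(sequence)
    for kept in range(total, 0, -1):
        yield kept, sequence[:kept] + [list(null) for _ in range(total - kept)]


# Toy action: 9 posture vectors of dimension 4 (900 in the experiment).
sequence = [[i + 1] * 4 for i in range(9)]
tests = list(truncated_sequences(sequence))
```
]]></preformat>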
      <p>The centres of activity generated by the A-SOM at each iteration were
collected in tables, and colour coding was used to indicate the ability (or the
inability) of the A-SOM to predict the action continuation. Dark green
indicates that the A-SOM predicted the right centre of activity, light green
indicates that the A-SOM predicted a value close to the expected centre of
activity, and red indicates that the A-SOM could not predict the right
value (see Fig. 7). The ability to predict varies with the type of action. For actions
like “sit down” and “punch”, the A-SOM needed 8 images to predict the rest of
the sequence, whereas for the “walk” action, the A-SOM needed only 4 images to
complete the sequence. In general the system needed between 4 and 9 inputs to
internally simulate the rest of the actions. This is a reasonable result, since even
humans cannot be expected to be able to predict the intended action of another
agent without a reasonable amount of initial information. For example, looking
at the initial part of an action like “punch”, we can hardly say what the person
is going to do. It could be “punch” or “point”; we need more frames to determine
exactly which action is being performed. In the same way, looking at a person starting
to walk, we cannot say in advance whether the person will walk, turn around or
even kick, because the initial postures are all similar to one another.</p>
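      <p>The colour coding can be expressed as a simple comparison between predicted and reference centres of activity. In the sketch below the meaning of a "close" centre (at most one grid cell away in each direction) and the rule that a run counts as successful when it contains no red cell are assumptions made for illustration, since the text does not quantify them; the function names and the toy coordinates are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def colour(predicted, reference, tolerance=1):
    """Classify a predicted activity centre (row, col) against the reference."""
    if predicted == reference:
        return "dark green"
    row_gap = abs(predicted[0] - reference[0])
    col_gap = abs(predicted[1] - reference[1])
    if max(row_gap, col_gap) <= tolerance:  # assumption: "close" = adjacent cell
        return "light green"
    return "red"


def frames_needed(reference, runs):
    """Smallest number of input frames whose run contains no red cell.

    `runs` maps the number of frames given as input to the list of centres
    elicited for the whole sequence. Returns None if every run has a red cell.
    """
    good = [kept for kept, centres in runs.items()
            if all(colour(p, r) != "red" for p, r in zip(centres, reference))]
    return min(good) if good else None


# Toy data: reference centres for a 3-frame action and three test runs.
reference = [(2, 3), (2, 4), (5, 5)]
runs = {
    3: [(2, 3), (2, 4), (5, 5)],
    2: [(2, 3), (2, 4), (5, 6)],
    1: [(2, 3), (9, 9), (0, 0)],
}
row = [colour(p, r) for p, r in zip(runs[2], reference)]
needed = frames_needed(reference, runs)
```
]]></preformat>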
      <p>
        The results obtained through this experiment allowed us to speculate about
the ability of the A-SOM to generalize. Generalization is the network's ability
to recognize inputs it has never seen before. Our idea is that if the A-SOM
is able to recognize images as similar by generating close or equal centres of
activity, then it will also be able to recognize an image it has never encountered
before if this is similar to a known image. We checked whether similar images had the
same centres of activity and whether similar centres of activity corresponded to similar
images. The results show that the A-SOM generated very close or equal values
for very similar images (see Fig. 8). Actions like “turn around”, “walk” and “get
up” present some frames very similar to each other, and for such frames the
A-SOM generates the same centres of activity. This ability is validated through the
selection of some centres of activity and the verification that they correspond to
similar images. The “check watch”, “get up”, “point” and “kick” actions include in
their sequences frames depicting a movement of the arm that can be attributed
to all of them. For these frames the A-SOM elicits the same centre of activity
(see Fig. 9). The results presented here support the belief that our system is also
able to generalize.
      </p>
      <p>
        In this paper, we proposed a new method for internally simulating behaviours of
observed agents. The experiment presented here is part of a bigger project whose
scope is to develop a cognitive system endowed with the ability to read others'
intentions. The method is based on the A-SOM, a novel variant of the SOM,
whose ability of recognition and classification has already been tested in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In
our experiment, we connected the A-SOM to itself with time-delayed ancillary
connections and the system was trained and tested with a set of images depicting
the part of the body performing the movement. The results presented here show
that the A-SOM can receive some initial sensory input and internally simulate
the rest of the action without any further input.
      </p>
      <p>Moreover, we verified the ability of the A-SOM to recognize input never
encountered before, with encouraging results. In fact, the A-SOM recognizes
similar actions by eliciting close or identical centres of activity.</p>
      <p>We are currently working on improving the system to increase the recognition
and simulation abilities.</p>
      <p>Acknowledgements. The authors gratefully acknowledge the support from the
Linnaeus Centre Thinking in Time: Cognition, Communication, and Learning,
financed by the Swedish Research Council, grant no. 349-2007-8695.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Awais</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henrich</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Human-robot collaboration by intention recognition using probabilistic state machines</article-title>
          .
          <source>In: Robotics in Alpe-Adria-Danube Region (RAAD)</source>
          ,
          <year>2010</year>
          IEEE 19th International Workshop on Robotics. (
          <year>2010</year>
          )
          <fpage>75</fpage>
          –
          <lpage>80</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Breazeal</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Designing sociable robots</article-title>
          . The MIT Press (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chella</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dindo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Infantino</surname>
            ,
            <given-names>I.:</given-names>
          </string-name>
          <article-title>A cognitive framework for imitation learning</article-title>
          .
          <source>Robotics and Autonomous Systems</source>
          <volume>54</volume>
          (
          <issue>5</issue>
          ) (
          <year>2006</year>
          )
          <fpage>403</fpage>
          –
          <lpage>408</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chella</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dindo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Infantino</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Imitation learning and anchoring through conceptual spaces</article-title>
          .
          <source>Applied Artificial Intelligence</source>
          <volume>21</volume>
          (
          <issue>4-5</issue>
          ) (
          <year>2007</year>
          )
          <fpage>343</fpage>
          –
          <lpage>359</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Argall</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chernova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veloso</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Browning</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A survey of robot learning from demonstration</article-title>
          .
          <source>Robotics and Autonomous Systems</source>
          <volume>57</volume>
          (
          <issue>5</issue>
          ) (
          <year>2009</year>
          )
          <fpage>396</fpage>
          –
          <lpage>483</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hesslow</surname>
          </string-name>
          , G.:
          <article-title>Conscious thought as simulation of behaviour and perception</article-title>
          .
          <source>Trends in Cognitive Sciences</source>
          <volume>6</volume>
          (
          <year>2002</year>
          )
          <fpage>242</fpage>
          –
          <lpage>247</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Johansson</surname>
          </string-name>
          , G.:
          <article-title>Visual perception of biological motion and a model for its analysis</article-title>
          .
          <source>Perception &amp; Psychophysics</source>
          <volume>14</volume>
          (
          <issue>2</issue>
          ) (
          <year>1973</year>
          )
          <fpage>201</fpage>
          –
          <lpage>211</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Giese</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poggio</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <source>Nat Rev Neurosci</source>
          <volume>4</volume>
          (
          <issue>3</issue>
          ) (
          <year>March 2003</year>
          )
          <fpage>179</fpage>
          –
          <lpage>192</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Gorelick</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blank</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shechtman</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Irani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basri</surname>
          </string-name>
          , R.:
          <article-title>Actions as space-time shapes</article-title>
          .
          <source>IEEE Trans. Pattern Anal. Mach. Intell</source>
          .
          <volume>29</volume>
          (
          <issue>12</issue>
          ) (
          <year>2007</year>
          )
          <fpage>2247</fpage>
          –
          <lpage>2253</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Iosifidis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tefas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitas</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>View-invariant action recognition based on artificial neural networks</article-title>
          .
          <source>IEEE Trans. Neural Netw. Learning Syst</source>
          .
          <volume>23</volume>
          (
          <issue>3</issue>
          ) (
          <year>2012</year>
          )
          <fpage>412</fpage>
          –
          <lpage>424</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Gkalelis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tefas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitas</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Combining fuzzy vector quantization with linear discriminant analysis for continuous human movement recognition</article-title>
          .
          <source>IEEE Transactions on Circuits Systems Video Technology</source>
          <volume>18</volume>
          (
          <issue>11</issue>
          ) (
          <year>2008</year>
          )
          <fpage>1511</fpage>
          –
          <lpage>1521</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Rizzolatti</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craighero</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The mirror-neuron system</article-title>
          .
          <source>Annual Review of Neuroscience</source>
          <volume>27</volume>
          (
          <year>2004</year>
          )
          <fpage>169</fpage>
          –
          <lpage>192</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Goldman</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          :
          <article-title>Simulating minds: The philosophy, psychology, and neuroscience of mindreading</article-title>
          . (2) (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Buonamente</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dindo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Recognizing actions with the associative self-organizing map</article-title>
          .
          <source>In: Proceedings of the XXIV International Conference on Information, Communication and Automation Technologies (ICAT 2013)</source>
          . (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Johnsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balkenius</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hesslow</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Associative self-organizing map</article-title>
          .
          <source>In: Proceedings of IJCCI</source>
          . (
          <year>2009</year>
          )
          <fpage>363</fpage>
          –
          <lpage>370</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Kohonen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Self-Organization and Associative Memory</article-title>
          . Springer-Verlag (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Johnsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gil</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balkenius</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hesslow</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Supervised architectures for internal simulation of perceptions and actions</article-title>
          .
          <source>In: Proceedings of BICS</source>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Johnsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendez</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hesslow</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balkenius</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Internal simulation in a bimodal system</article-title>
          .
          <source>In: Proceedings of SCAI</source>
          . (
          <year>2011</year>
          )
          <fpage>173</fpage>
          –
          <lpage>182</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Bishop</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          :
          <article-title>Neural Networks for Pattern Recognition</article-title>
          . Oxford University Press, Oxford (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Turaga</surname>
            ,
            <given-names>P.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chellappa</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subrahmanian</surname>
            ,
            <given-names>V.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Udrea</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Machine recognition of human activities: A survey</article-title>
          .
          <source>IEEE Transactions on Circuits and Systems for Video Technology</source>
          <volume>18</volume>
          (
          <issue>11</issue>
          ) (
          <year>2008</year>
          )
          <fpage>1473</fpage>
          –
          <lpage>1488</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Balkenius</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morén</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johansson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ikaros: Building cognitive models for robots</article-title>
          .
          <source>Advanced Engineering Informatics</source>
          <volume>24</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          )
          <fpage>40</fpage>
          –
          <lpage>48</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>