<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mental Simulation for Autonomous Learning and Planning Based on Triplet Ontological Semantic Model</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Yuri Goncalves Rocha and Tae-Yong Kuc College of Information and Communication Engineering, Sungkyunkwan University</institution>
          ,
          <country country="KR">South Korea</country>
        </aff>
      </contrib-group>
      <fpage>65</fpage>
      <lpage>73</lpage>
      <abstract>
        <p>Cognitive science findings have shown that humans are able to create simulated mental environments based on their episodic memory and use such environments for prospecting, planning, and learning. Such capabilities could enhance current robotic systems, allowing them to predict the outcome of a plan before actually performing the action in the real world. They also allow robots to use this simulated world to learn new tasks and improve their current ones using Reinforcement Learning approaches. In this work, we propose a semantic modeling framework which is able to express intrinsic semantic knowledge in order to better represent robots, places, and objects, while also being a memory-efficient alternative to classic mapping solutions. We show that such data can be used to automatically generate a complete mental simulation, allowing robots to simulate themselves and other modeled agents in known environments. These simulations allow robots to perform autonomous learning and planning without the need for human-tailored models.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>exploring the environment and giving (or removing) rewards, depending on how well the robot executed a given task.
More specifically, Deep Reinforcement Learning (deep-RL) has been used in several autonomous navigation applications
[Tai et al., 2017, Shah et al., 2018, Kahn et al., 2018].</p>
      <p>The contributions of this work are as follows:
• Expanding an Ontological Semantic Framework in order to automatically generate a full simulation environment,
including a simulated robot.</p>
      <p>• An end-to-end deep-RL model for autonomous navigation trained using the mentally simulated environment.
</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>In the past decades, several works proposed ways to incorporate knowledge into computers. CYC [Lenat, 1995] and SUMO
[Niles and Pease, 2001] gathered a large amount of encyclopedic knowledge into its database, however such knowledge
lacked the information necessary for mobile robot tasks. The OMICS [Gupta et al., 2004] project created a similar database
containing the necessary knowledge in order to a robot complete several indoor tasks. The RoboEarth [Waibel et al., 2011]
project tried to create a World Wide Web for robots, where they would be able to share and obtain knowledge in an
autonomous way. KnowRob [Tenorth and Beetz, 2009, Beetz et al., 2018] and OpenEASE [Beetz et al., 2015] created a
complete knowledge processing system capable of semantic reasoning and planning, and also performing mental
simulations (referred as Mind’s Eye). Most of those works, however, focused on manipulation tasks only.</p>
      <p>Despite being thoroughly studied by cognitive science researchers [Boyer, 2008, Burgess, 2008, Hesslow, 2012,
Kahneman and Tversky, 1981], the mental simulation concept only started to be applied to computational systems a few
decades ago. Most of the early works focused on the "putting yourself in someone else's shoes" approach, where an agent
would simulate itself in its counterpart's perceived state in order to infer that counterpart's feelings and intentions. Leonardo
[Gray and Breazeal, 2005] was developed to infer a human's intention and aid the execution of this predicted task. In
[Buchsbaumm et al., 2005], an animated mouse was able to imitate similar actors by inference using its own motor and
action representations. [Laird, 2001] created a Quake bot able to predict its opponent's next action by simulating itself in
the opponent's current state, while [Kennedy et al., 2009] used its own behavior model to predict another agent's actions.
Most of the recent works in the robotics field, however, focused on the application of mental simulation to manipulation task
planning and learning [Tenorth and Beetz, 2009, Beetz et al., 2015, Beetz et al., 2018, Kunze and Beetz, 2017], or to the
comprehension and expression of emotions when socializing with humans [De Carolis et al., 2017, Horii et al., 2016]. J. Hamrick
[Hamrick, 2019], however, showed that there are several similarities between mental simulation findings from cognitive
science and model-based deep-RL approaches.</p>
      <p>Deep Reinforcement Learning (deep-RL) has been applied to several different robot tasks, including but not
limited to Human-Robot Interaction [Christen et al., 2019, Qureshi et al., 2018], dexterous manipulation [Gu et al., 2017,
Rajeswaran et al., 2017], and autonomous map-less navigation [Kahn et al., 2018, Zhu et al., 2017]. RL methods can be
divided into model-based and model-free approaches. Model-based algorithms, such as [Zhu et al., 2017], use
a predictive function that receives the current state and a sequence of actions and outputs the future states. The policy then
selects the sequence of actions that maximizes the expected reward over the predicted states. Model-free approaches,
such as [Christen et al., 2019], approximate a function that receives the current state and an action and outputs the sum of
the expected future rewards. The policy then picks the action that maximizes this output. Generally, model-based
approaches are sample-efficient, while model-free methods are better at learning complex, high-dimensional tasks. Some
approaches [Qureshi et al., 2018, Kahn et al., 2018] also use hybrid methods that exploit the advantages
of both model-based and model-free approaches. Among value-based deep-RL methods, the Deep Q Network
(DQN) has been widely used by the research community [Qureshi et al., 2018], due to its good generalization capabilities
and relatively simple training method. DQN, however, can only approximate a discrete action space, requiring continuous
action spaces to be discretized beforehand. To address this issue, newer approaches such as Deep Deterministic
Policy Gradient (DDPG) have been used [Christen et al., 2019, Gu et al., 2017] due to their ability to handle continuous
action spaces.</p>
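      <p>The distinction between the two families can be sketched in a few lines of Python. In this illustrative fragment, all names are hypothetical and the learned Q-network and dynamics model are stubbed as random linear maps, not any cited implementation: a model-free policy picks the argmax of a learned Q-function over a discretized action set, while a model-based policy rolls candidate action sequences through a learned dynamics model and scores the predicted states.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned components, stubbed as random linear maps.
W_q = rng.normal(size=(4, 15))    # "Q-network": state -> value of each of 15 discrete actions
W_dyn = rng.normal(size=(5, 4))   # "dynamics model": (state, action) -> next state

def model_free_policy(state):
    """Model-free: pick the discrete action whose learned Q-value is highest."""
    q = state @ W_q                       # one value per discrete action
    return int(np.argmax(q))

def predict_next(state, action):
    """Model-based building block: predict the next state for one action."""
    return np.concatenate([state, [action]]) @ W_dyn

def model_based_policy(state, candidate_seqs, reward_fn, horizon=3):
    """Model-based: score each candidate action sequence by simulated return."""
    best_seq, best_ret = None, -np.inf
    for seq in candidate_seqs:
        s, ret = state, 0.0
        for a in seq[:horizon]:
            s = predict_next(s, a)
            ret += reward_fn(s)           # reward of each predicted state
        if ret > best_ret:
            best_seq, best_ret = seq, ret
    return best_seq

state = rng.normal(size=4)
actions = np.linspace(-0.5, 0.5, 15)      # a discretized 1-D action space
seqs = [rng.choice(actions, size=3) for _ in range(8)]
chosen = model_free_policy(state)
plan = model_based_policy(state, seqs, reward_fn=lambda s: -np.linalg.norm(s))
```

      <p>The model-free path needs one function evaluation per decision, while the model-based path pays for rollouts but can reuse the dynamics model across reward functions, which is the sample-efficiency trade-off described above.</p>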
    </sec>
    <sec id="sec-3">
      <title>Triplet Ontological Semantic Model</title>
      <p>Research in the cognitive science and neuroscience fields [Burgess, 2008] has shown that the human brain has its
own "GPS" mapping system. Every time we revisit a known environment, this GPS is responsible for navigating using
past known information and for updating itself with novel data. By relying on relational information instead of precise metric
positions, the human brain remains unparalleled in its spatial scalability and data efficiency. Robots, on the other hand,
still heavily rely on information-rich, yet memory-inefficient, maps in order to localize themselves and navigate through
known environments. Despite being precise, those maps require a large amount of data to be stored, which hinders the
robot's long-term autonomy in large-scale environments due to lack of storage space. Aiming to mimic the efficiency of the
brain's GPS model, the Triplet Ontological Semantic Model (TOSM) [Joo et al., 2019] was developed.</p>
      <p>The TOSM representation can be described as three interconnected models, as shown in Fig. 1. The explicit model
subsumes everything that can be measured or obtained through sensorial means. This includes data such as size, three-dimensional
pose, shape, color, texture, etc., which are already widely used in current robot applications. The implicit model, on the other
hand, contains intrinsic knowledge that cannot be obtained by sensors alone and thus needs to be inferred from the
available semantic information. The implicit model comprises a large variety of data, ranging from physical properties such as
mass and friction coefficients, through relational data (e.g., object A is inside object B), to more complex semantic information
such as "An automatic door opens if one waits in front of it". Finally, the symbolic model describes an element in a
language-oriented way, using the name, description, identification number, and symbols that can represent such an element.</p>
      <p>By creating an environment database using TOSM-encoded data, a hierarchical mapping system was created, based
on the findings of cognitive science. As shown in Fig. 2, different maps can be generated on demand according to the
specifications of the robot and the given task. This eliminates the need to store several different maps, since they can be
built only when needed, reducing data redundancy and improving storage efficiency. The TOSM can also
be used to model places and robots, which, combined with the object models, can be used to generate high-level semantic
maps.</p>
      <p>In this work, we also used the TOSM-encoded on-demand database to automatically generate a complete simulation
environment without the need for models tailored by domain experts. This allows the robot to update its mentally simulated world
automatically just by updating the on-demand database. In order to encode the TOSM data in a machine-readable format,
the Web Ontology Language (OWL) was used. OWL is widely used and has an active community, which has created several
openly available tools and applications. We used one of those tools, the Protégé framework, to manipulate and visualize
the OWL triples.</p>
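      <p>As an illustration of the kind of querying that OWL enables, the following SPARQL sketch (the namespace and property names are hypothetical, not the actual TOSM vocabulary) asks which doors a given corridor connects to and whether each one is automatic:</p>

```sparql
# Hypothetical TOSM triples: which doors does corridor7F connect to,
# and are they automatic?
PREFIX tosm: <http://example.org/tosm#>

SELECT ?door ?isAutomatic
WHERE {
  tosm:corridor7F tosm:connectedTo ?door .
  ?door a tosm:Door ;
        tosm:isAutomatic ?isAutomatic .
}
```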
      <sec id="sec-3-1">
        <title>Robot Description</title>
        <p>In order to describe a robot, it was divided into structural parts, sensors, wheels, and joints, each of them described by its
own explicit, implicit, and symbolic information. All categories contain similar explicit data, such as pose, shape, color, size,
and material. The symbolic data contains the part name and an identification number. On the other hand, the implicit data is
unique for each category. For structural parts, it contains the mass and the material, while wheels also store whether or not
they are active wheels. Each joint stores which two parts it connects. Moreover, the implicit information can be different for
each type of sensor. For example, cameras are described by image resolution, field of view, frames per second and, for
RGB-D cameras, range. A laser range finder can have data such as range, view angle, and number of samples.</p>
        <p>The environment can be modeled in a similar fashion to the robot. It is divided mainly into objects and places. Regarding
objects, the explicit model contains the same data as described above for the robot. The implicit model contains data such
as mass, material, and relational spatial information, such as "in front of", "on the left of", etc. With respect to places, on
the other hand, the explicit model contains their boundary points, while the implicit model stores which objects/places are
inside of them and which other places they are connected to. The symbolic information is the same for both, storing the name of the
place/object and an identification number.</p>
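        <p>As a concrete illustration, a single sensor description in this scheme could be encoded as OWL/Turtle triples along the following lines. The namespace, property names, and values here are hypothetical, shown only to make the three models tangible:</p>

```turtle
@prefix tosm: <http://example.org/tosm#> .   # hypothetical namespace

tosm:frontLidar a tosm:LaserRangeFinder ;
    # explicit model: properties obtainable through sensorial means
    tosm:pose  "0.2 0.0 0.15 0 0 0" ;
    tosm:shape tosm:Cylinder ;
    # implicit model: knowledge that must be stored or inferred, not sensed
    tosm:range      "0.1 12.0" ;
    tosm:viewAngle  "270" ;
    tosm:numSamples "811" ;
    # symbolic model: language-oriented identifiers
    tosm:name "front_lidar" ;
    tosm:id   "S-001" .
```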
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Mental Simulation</title>
      <p>By encoding the TOSM information using the OWL format, it is possible to do semantic reasoning and querying. Before
doing any task, the robot can reason about its feasibility by knowing about its surrounding environment’s characteristics
and its own structure, limitations and properties. For example, a robot only equipped with a laser scanner can reason
about its inability to navigate through a corridor made out of glass walls. We extended those reasoning capabilities by
automatically generating a complete mental simulation environment using only the on-demand database data.</p>
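      <p>The serialization step from database records to simulator input can be sketched as follows. This minimal Python fragment (the record layout and field names are hypothetical, not the framework's actual schema) emits a URDF &lt;link&gt; element from a TOSM-style structural-part record:</p>

```python
# Illustrative sketch: serialize one TOSM-style part record into a URDF <link>.
# The record layout and field names are hypothetical.
def link_to_urdf(part):
    return (
        f'<link name="{part["name"]}">\n'
        f'  <inertial><mass value="{part["mass"]}"/></inertial>\n'
        f'  <visual>\n'
        f'    <geometry><box size="{part["size"]}"/></geometry>\n'
        f'  </visual>\n'
        f'</link>'
    )

base = {"name": "base_link", "mass": 12.0, "size": "0.4 0.3 0.2"}
urdf_fragment = link_to_urdf(base)
```

      <p>A full generator would iterate over all parts, joints, and sensors queried from the database and wrap the fragments in a &lt;robot&gt; element before handing the file to ROS.</p>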
      <p>The data flow for the mental simulation can be seen in Fig. 3. Whenever it is needed, the robot requests the TOSM
data from the on-demand database and generates two different outputs. The first one is a Unified Robot Description
Format (URDF) file, which is then fed into the Robot Operating System (ROS) and the Gazebo simulator in order to control and simulate
the virtual robot. The second one is a Gazebo world file, which represents the whole environment simulation. Those files
are generated on demand and can be constantly updated whenever the real robot updates its database.</p>
      <p>In order to show one of the uses for the mental simulator, an autonomous navigation policy was trained using a DQN.
The training was performed using a Core i7 CPU and an Nvidia GTX 1060. The OpenAI ROS framework [ezq, ] was
used in order to abstract the layer between the reinforcement learning algorithm and the Gazebo/ROS structure. The
task learning architecture is shown in Fig. 4. The observation space is composed of the latest three sparse laser scans
concatenated with the last three relative distances between the robot and the target way-point. The action space consists of 15
different angular velocities equally distributed from −0.5 rad/s to 0.5 rad/s. The reward was defined as r_completion if
at the goal, r_closer if getting closer to the goal, and r_collision if too close to an obstacle, where r_completion, r_closer,
and r_collision were set to simple constant values.</p>
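      <p>The piecewise reward above can be written as a plain function. The threshold and reward constants below are illustrative assumptions, not the values used in the experiments:</p>

```python
def reward(dist_to_goal, prev_dist_to_goal, min_obstacle_dist,
           r_completion=200.0, r_closer=1.0, r_collision=-200.0,
           goal_radius=0.3, collision_radius=0.2):
    """Piecewise reward for way-point navigation (all constants illustrative)."""
    if dist_to_goal < goal_radius:            # at the goal
        return r_completion
    if min_obstacle_dist < collision_radius:  # too close to an obstacle
        return r_collision
    if dist_to_goal < prev_dist_to_goal:      # getting closer to the goal
        return r_closer
    return 0.0
```

      <p>At each step the agent compares the current and previous distances to the way-point, so r_closer acts as a dense shaping term between the sparse completion and collision events.</p>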
      <p>The training was done using an ε-greedy exploration approach, where ε started at 1.0 and decayed to 0.1. The DQN
was trained using mini-batches of 64, with learning rate α = 0.001 and discount factor γ = 0.996. The robot was trained for a
total of 2000 episodes, where each episode would end in case of completion or collision, or after 1000 steps.</p>
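      <p>The exploration schedule can be sketched as follows; the linear decay shape and the action-selection helper are assumptions for illustration, since the text specifies only the endpoints 1.0 and 0.1 and the episode count:</p>

```python
import random

EPS_START, EPS_END, TOTAL_EPISODES = 1.0, 0.1, 2000

def epsilon(episode):
    """Anneal epsilon from 1.0 to 0.1 over 2000 episodes (linear decay assumed)."""
    frac = min(episode / TOTAL_EPISODES, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)

def select_action(q_values, episode, rng=random):
    """Epsilon-greedy choice over the discrete action set (e.g., 15 velocities)."""
    if rng.random() < epsilon(episode):
        return rng.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=q_values.__getitem__)    # exploit
```

      <p>Early episodes are therefore almost entirely random, which matters here because the mental simulation absorbs the many collisions this exploration causes instead of the real robot.</p>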
    </sec>
    <sec id="sec-5">
      <title>Results and Discussion</title>
      <p>In order to show the usability of such a framework, the 7th floor of the Corporate Collaboration Center, Sungkyunkwan
University, was added to the on-demand database. Additionally, the differential-drive robot shown in Fig. 5a was modeled. The
comparison between the simulated world and the database data can be seen in Fig. 6, while the comparison between the
real and simulated robots is shown in Fig. 5.</p>
      <p>By automatically generating this simulation environment, we allow the robot to perform mental simulation without the
aid of domain experts, by reusing the same data it already uses for planning and navigation. This approach further improves
the robot's autonomous behavior by letting it simulate itself (or even other robots) in its own mind and use this simulated
environment to prospect about new actions. Currently, this can be done in two different ways:
• Learning: The robot can use the mental simulation to run reinforcement learning algorithms in order to train for and
learn the execution of new tasks. This is mainly done while the physical robot is idle (e.g., charging at night).
• Planning: The robot can simulate its current state and use it to test a plan generated by traditional planners,
checking whether it succeeds or not. In the case of failure, the robot can re-plan without having to fail in the real
environment, allowing for a more robust task execution.</p>
      <p>Figure 5: (a) Real robot; (b) Simulated robot. Figure 6: (a) Visualization of the data obtained from the on-demand DB
(objects are represented as bounding boxes, while places are represented as colored polygons on the floor); (b) Mental
simulation environment.</p>
      <p>The main advantage of such a simulation is that it removes the necessity of a tailor-made simulation environment,
allowing the robot to generate and update this environment automatically. It can be especially useful for reinforcement learning
approaches, which, in theory, get better the more experience the robot collects. The robot should be able to run learning
algorithms whenever it is idle, slowly improving itself. Naturally, a cluster running multiple CPUs and GPUs would learn
orders of magnitude faster; still, allowing the robot to run the learning algorithms itself brings the robotics field one step closer
to true robot autonomy. Finally, by uploading the on-demand database to a cloud infrastructure, robots should be able to
share their own models and environment maps, allowing robots to compare their performance on a given task with one
another and to provide this information to their operators automatically.</p>
      <p>By using the mental simulation, an autonomous navigation task was learned. The average reward graph can be seen
in Fig. 7. Despite being one of the simpler deep-RL approaches, DQN was shown to be good at generalizing a
high-dimensional task. However, the whole training took around 30 hours on a mid-range computer. If the same training were
performed on a mobile robot, the training time might be prohibitive. Thus, sample-efficient learning algorithms would be
more appropriate for this application.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>In this paper, we presented a method for automatically generating a mental simulation by using a TOSM on-demand database.
By allowing robots to create and update mental simulations in a completely autonomous way, we removed the necessity of
expert-tailored models, leading to more autonomous robotic systems. In order to show one of the possible applications
of this method, we trained the robot to autonomously navigate in a known environment by using a Deep Q Network.
We now plan to expand those applications by including behaviors in the on-demand DB, allowing robots to share and
configure RL policies by themselves. We also want to explore the usability of our framework when combined with classical
planners.</p>
      <sec id="sec-6-1">
        <title>Acknowledgment</title>
        <p>This research was supported by the Korea Evaluation Institute of Industrial Technology (KEIT), funded by the Ministry of
Trade, Industry &amp; Energy (MOTIE) (No. 1415162366 and No. 1415162820).</p>
        <p>[ezq, ] OpenAI ROS documentation. Date last accessed: 04-Aug-2019.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Beetz et al.,
          <year>2018</year>
          ] Beetz,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Beßler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Haidu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Pomarlan</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          , Bozcuoğlu,
          <string-name>
            <given-names>A. K.</given-names>
            , and
            <surname>Bartels</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>KnowRob 2.0: a 2nd-generation knowledge processing framework for cognition-enabled robotic agents</article-title>
          .
          <source>In 2018 IEEE International Conference on Robotics and Automation (ICRA)</source>
          , pages
          <fpage>512</fpage>
          -
          <lpage>519</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Beetz et al.,
          <year>2015</year>
          ] Beetz,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Tenorth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            , and
            <surname>Winkler</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Open-ease</article-title>
          .
          <source>In 2015 IEEE International Conference on Robotics and Automation (ICRA)</source>
          , pages
          <fpage>1983</fpage>
          -
          <lpage>1990</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>[Boyer</source>
          , 2008] Boyer,
          <string-name>
            <surname>P.</surname>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Evolutionary economics of mental time travel?</article-title>
          <source>Trends in Cognitive Sciences</source>
          ,
          <volume>12</volume>
          (
          <issue>6</issue>
          ):
          <fpage>219</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Buchsbaumm et al.,
          <year>2005</year>
          ] Buchsbaumm,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Blumberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Breazeal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            , and
            <surname>Meltzoff</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. N.</surname>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>A simulation-theory inspired social learning system for interactive characters</article-title>
          .
          <source>In ROMAN 2005. IEEE International Workshop on Robot and Human Interactive Communication</source>
          ,
          <year>2005</year>
          ., pages
          <fpage>85</fpage>
          -
          <lpage>90</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <source>[Burgess</source>
          , 2008] Burgess,
          <string-name>
            <surname>N.</surname>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Spatial cognition and the brain</article-title>
          .
          <source>Annals of the New York Academy of Sciences</source>
          ,
          <volume>1124</volume>
          (
          <issue>1</issue>
          ):
          <fpage>77</fpage>
          -
          <lpage>97</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Christen et al.,
          <year>2019</year>
          ] Christen,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Stevsic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            , and
            <surname>Hilliges</surname>
          </string-name>
          ,
          <string-name>
            <surname>O.</surname>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Guided deep reinforcement learning of control policies for dexterous human-robot interaction</article-title>
          . arXiv preprint arXiv:
          <year>1906</year>
          .11695.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>[Cosgun and Christensen</source>
          , 2018] Cosgun,
          <string-name>
            <given-names>A.</given-names>
            and
            <surname>Christensen</surname>
          </string-name>
          ,
          <string-name>
            <surname>H. I.</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Context-aware robot navigation using interactively built semantic maps</article-title>
          .
          <source>Paladyn, Journal of Behavioral Robotics</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>254</fpage>
          -
          <lpage>276</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>[De Carolis</surname>
          </string-name>
          et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>De</given-names>
            <surname>Carolis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Ferilli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            , and
            <surname>Palestra</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Simulating empathic behavior in a social assistive robot</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          ,
          <volume>76</volume>
          (
          <issue>4</issue>
          ):
          <fpage>5073</fpage>
          -
          <lpage>5094</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>[Gordon</source>
          , 1986] Gordon,
          <string-name>
            <surname>R. M.</surname>
          </string-name>
          (
          <year>1986</year>
          ).
          <article-title>Folk psychology as simulation</article-title>
          .
          <source>Mind &amp; Language</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>158</fpage>
          -
          <lpage>171</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>[Gray and Breazeal</source>
          , 2005] Gray,
          <string-name>
            <given-names>J.</given-names>
            and
            <surname>Breazeal</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Toward helpful robot teammates: A simulation-theoretic approach for inferring mental states of others</article-title>
          .
          <source>In Proceedings of the AAAI 2005 workshop on modular construction of human-like intelligence.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Gu et al.,
          <year>2017</year>
          ] Gu,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Holly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Lillicrap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            , and
            <surname>Levine</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates</article-title>
          .
          <source>In 2017 IEEE international conference on robotics and automation (ICRA)</source>
          , pages
          <fpage>3389</fpage>
          -
          <lpage>3396</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Gupta et al.,
          <year>2004</year>
          ] Gupta,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Kochenderfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            ,
            <surname>Mcguinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            , and
            <surname>Ferguson</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>Common sense data acquisition for indoor mobile robots</article-title>
          .
          <source>In AAAI</source>
          , pages
          <fpage>605</fpage>
          -
          <lpage>610</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>[Hamrick</source>
          , 2019] Hamrick,
          <string-name>
            <surname>J. B.</surname>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Analogues of mental simulation and imagination in deep learning</article-title>
          .
          <source>Current Opinion in Behavioral Sciences</source>
          ,
          <volume>29</volume>
          :
          <fpage>8</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <source>[Hesslow</source>
          , 2012] Hesslow,
          <string-name>
            <surname>G.</surname>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>The current status of the simulation theory of cognition</article-title>
          .
          <source>Brain research</source>
          ,
          <volume>1428</volume>
          :
          <fpage>71</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Horii et al.,
          <year>2016</year>
          ] Horii,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Nagai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            , and
            <surname>Asada</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Imitation of human expressions based on emotion estimation by mental simulation</article-title>
          .
          <source>Paladyn, Journal of Behavioral Robotics</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Joo et al.,
          <year>2019</year>
          ] Joo,
          <string-name>
            <given-names>S.-H.</given-names>
            ,
            <surname>Manzoor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Rocha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. G.</given-names>
            ,
            <surname>Lee</surname>
          </string-name>
          , H.-U., and
          <string-name>
            <surname>Kuc</surname>
          </string-name>
          , T.-Y. (
          <year>2019</year>
          ).
          <article-title>A realtime autonomous robot navigation framework for human like high-level interaction and task planning in global dynamic environment</article-title>
          .
          <source>arXiv preprint arXiv:1905.12942</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Kahn et al.,
          <year>2018</year>
          ] Kahn,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Villaflor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Abbeel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            , and
            <surname>Levine</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Self-supervised deep reinforcement learning with generalized computation graphs for robot navigation</article-title>
          .
          <source>In 2018 IEEE International Conference on Robotics and Automation (ICRA)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Kahneman and Tversky,
          <year>1981</year>
          ] Kahneman,
          <string-name>
            <given-names>D.</given-names>
            and
            <surname>Tversky</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          (
          <year>1981</year>
          ).
          <article-title>The simulation heuristic</article-title>
          .
          <source>Technical report</source>
          , Stanford University, Department of Psychology.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [Kennedy et al.,
          <year>2009</year>
          ] Kennedy,
          <string-name>
            <given-names>W. G.</given-names>
            ,
            <surname>Bugajska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            ,
            <surname>Harrison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            , and
            <surname>Trafton</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. G.</surname>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>“like-me” simulation as an effective and cognitively plausible basis for social robotics</article-title>
          .
          <source>International Journal of Social Robotics</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>181</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [Kostavelis et al.,
          <year>2016</year>
          ] Kostavelis,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Charalampous</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Gasteratos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            , and
            <surname>Tsotsos</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. K.</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Robot navigation via spatial and temporal coherent semantic maps</article-title>
          .
          <source>Engineering Applications of Artificial Intelligence</source>
          ,
          <volume>48</volume>
          :
          <fpage>173</fpage>
          -
          <lpage>187</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [Kunze and Beetz,
          <year>2017</year>
          ] Kunze,
          <string-name>
            <given-names>L.</given-names>
            and
            <surname>Beetz</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Envisioning the qualitative effects of robot manipulation actions using simulation-based projections</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>247</volume>
          :
          <fpage>352</fpage>
          -
          <lpage>380</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [Laird,
          <year>2001</year>
          ] Laird,
          <string-name>
            <surname>J. E.</surname>
          </string-name>
          (
          <year>2001</year>
          ).
          <article-title>It knows what you're going to do: adding anticipation to a Quakebot</article-title>
          .
          <source>In Proceedings of the fifth international conference on Autonomous agents</source>
          , pages
          <fpage>385</fpage>
          -
          <lpage>392</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [Lenat,
          <year>1995</year>
          ] Lenat,
          <string-name>
            <surname>D. B.</surname>
          </string-name>
          (
          <year>1995</year>
          ).
          <article-title>Cyc: A large-scale investment in knowledge infrastructure</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>38</volume>
          (
          <issue>11</issue>
          ):
          <fpage>33</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [Niles and Pease,
          <year>2001</year>
          ] Niles,
          <string-name>
            <given-names>I.</given-names>
            and
            <surname>Pease</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          (
          <year>2001</year>
          ).
          <article-title>Towards a standard upper ontology</article-title>
          .
          <source>In Proceedings of the international conference on Formal Ontology in Information Systems - Volume 2001</source>
          , pages
          <fpage>2</fpage>
          -
          <lpage>9</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [Polceanu and Buche,
          <year>2017</year>
          ] Polceanu,
          <string-name>
            <given-names>M.</given-names>
            and
            <surname>Buche</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Computational mental simulation: A review</article-title>
          .
          <source>Computer Animation and Virtual Worlds</source>
          ,
          <volume>28</volume>
          (
          <issue>5</issue>
          ):
          <fpage>e1732</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [Qureshi et al.,
          <year>2018</year>
          ] Qureshi,
          <string-name>
            <given-names>A. H.</given-names>
            ,
            <surname>Nakamura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Yoshikawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            , and
            <surname>Ishiguro</surname>
          </string-name>
          ,
          <string-name>
            <surname>H.</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Intrinsically motivated reinforcement learning for human-robot interaction in the real-world</article-title>
          .
          <source>Neural Networks</source>
          ,
          <volume>107</volume>
          :
          <fpage>23</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [Rajeswaran et al.,
          <year>2017</year>
          ] Rajeswaran,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Vezzani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Schulman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Todorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            , and
            <surname>Levine</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Learning complex dexterous manipulation with deep reinforcement learning and demonstrations</article-title>
          .
          <source>arXiv preprint arXiv:1709.10087</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [Shah et al.,
          <year>2018</year>
          ] Shah,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Fiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Faust</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Kew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            , and
            <surname>Hakkani-Tur</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Follownet: Robot navigation by following natural language directions with deep reinforcement learning</article-title>
          .
          <source>arXiv preprint arXiv:1805.06150</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [Tai et al.,
          <year>2017</year>
          ] Tai,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Paolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            , and
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation</article-title>
          .
          <source>In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>36</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [Tenorth and Beetz,
          <year>2009</year>
          ] Tenorth,
          <string-name>
            <given-names>M.</given-names>
            and
            <surname>Beetz</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>KnowRob: knowledge processing for autonomous personal robots</article-title>
          .
          <source>In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems</source>
          , pages
          <fpage>4261</fpage>
          -
          <lpage>4266</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [Waibel et al.,
          <year>2011</year>
          ] Waibel,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Beetz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Civera</surname>
          </string-name>
          , J.,
          <string-name>
            <surname>d'Andrea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elfring</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Galvez-Lopez</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , Häussermann,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Janssen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Montiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Perzylo</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , et al. (
          <year>2011</year>
          ).
          <article-title>RoboEarth: a world wide web for robots</article-title>
          .
          <source>IEEE Robotics and Automation Magazine (RAM), Special Issue Towards a WWW for Robots</source>
          ,
          <volume>18</volume>
          (
          <issue>2</issue>
          ):
          <fpage>69</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [Zhu et al.,
          <year>2017</year>
          ] Zhu,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Mottaghi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Kolve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            ,
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Fei-Fei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            , and
            <surname>Farhadi</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Target-driven visual navigation in indoor scenes using deep reinforcement learning</article-title>
          .
          <source>In 2017 IEEE international conference on robotics and automation (ICRA)</source>
          , pages
          <fpage>3357</fpage>
          -
          <lpage>3364</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>