<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SemNav: How Rich Semantic Knowledge Can Guide Robot Navigation In Indoor Spaces</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Snehasis Banerjee</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Balamurali Purushothaman</string-name>
          <email>balamurali.pg@tcs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TCS Research &amp; Innovation</institution>
          ,
          <addr-line>Tata Consultancy Services</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>We have developed SemNav, an ontological representation specific to indoor spaces and to the navigation and target-finding tasks of service robots. It has been tested in real-life settings. This paper positions semantic web technology as a key element of decision making in robotic tasks such as navigation and object finding.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology</kwd>
        <kwd>Cognitive Robotics</kwd>
        <kwd>Semantic Robot Navigation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Background</title>
      <p>
        A major challenge in robotics is the successful execution of complex tasks by a
robot in dynamic and uncertain environments. For example, it is difficult for
the robot to find objects that are partially observable, out of sight,
or not where they were expected to be found. Knowledge of the robot, the objects
and the environment, enriched with semantic relations, aids in such scenarios. Approaches
presented in related works [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] are enhanced and adapted for this problem.
SemNav was created by listing relevant competency questions specific to
the navigation and object finding problem. Starting from a seed ontology
of relations, image scenes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] were processed to extract objects and captions
in order to derive semantic relations. These were aligned with WordNet to
resolve ambiguity. Some of the prominent object relations are listed below
(an illustrative encoding sketch follows the list):
(1) Occlusion: a bigger object can occlude a smaller object (like cup and jug)
(2) Co-location: two objects are in close proximity (like water glass and jug)
(3) RCC-8 classes: region connection calculus based qualitative spatial relations
(4) Location: object is usually located in specific zones (like pillow in bedroom)
(5) Disjoint: two objects do not co-occur together in a scene (like jug and towel)
(6) Dimension: range of length, width, height of objects (mobile phones, remote)
(7) Shape: objects can be abstracted to forms (like TV, window as rectangles)
(8) Color and texture similarity: visually indistinguishable objects (wooden furniture)
(9) Attribute relations from SPARQL endpoints (DBpedia) and commonsense
ontologies: objects' shared resources (like electricity), inter-dependence, use, etc.
      </p>
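      <p>As a minimal illustrative sketch (not the reported implementation), relations such as those listed above can be materialised as RDF triples. The snippet below uses the Python rdflib library; the semnav namespace, class names and property names are assumptions introduced only for this example.</p>
      <preformat>
# Minimal sketch: encoding SemNav-style object relations as RDF triples.
# The namespace, classes and properties below are illustrative assumptions,
# not the published SemNav vocabulary.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import XSD

SEMNAV = Namespace("http://example.org/semnav#")  # hypothetical namespace

g = Graph()
g.bind("semnav", SEMNAV)

# (1) Occlusion: a jug (bigger) can occlude a cup (smaller).
g.add((SEMNAV.Jug, SEMNAV.canOcclude, SEMNAV.Cup))

# (2) Co-location: water glass and jug are usually in close proximity.
g.add((SEMNAV.WaterGlass, SEMNAV.coLocatedWith, SEMNAV.Jug))

# (4) Location: a pillow is usually located in the bedroom zone.
g.add((SEMNAV.Pillow, SEMNAV.usuallyLocatedIn, SEMNAV.Bedroom))

# (5) Disjoint: jug and towel rarely co-occur in one scene.
g.add((SEMNAV.Jug, SEMNAV.doesNotCoOccurWith, SEMNAV.Towel))

# (6) Dimension: typical maximum height (metres) of a mobile phone.
g.add((SEMNAV.MobilePhone, SEMNAV.maxHeight,
       Literal(0.17, datatype=XSD.double)))

print(g.serialize(format="turtle"))
      </preformat>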
    </sec>
    <sec id="sec-2">
      <title>Semantic Navigation</title>
      <p>Once the SemNav ontology (Fig. 1.a) is populated with relations and instance
data (RDF) obtained from parsing scenes, and human-validated on random
samples, the knowledge base for semantic navigation (Fig. 1.b) is ready. The
robot processes the current scene using computer vision techniques. Obstacles and
free spaces are understood from scene processing. Based on the given goal (such as
"find an object 'cup'"), the decision module consults the knowledge store and
matches it against the current and past history of scenes with reference to the GeoSem
Map (a geocentric navigation map storing object instances and their instance
relations). The semantic decision module instructs the navigation module to
derive a plan for moving to the next spot (the one with the highest probability
of reaching the target), which in turn instructs the robot's motor wheels to move in that
direction. The knowledge store's relation weights are updated through exploration.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Anagnostopoulos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , et al.:
          <article-title>Ontonav: A semantic indoor navigation system</article-title>
          .
          <source>In: 1st Workshop on Semantics in Mobile Environments (SME05)</source>
          ,
          <source>Ayia. Citeseer</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bruno</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , et al.:
          <article-title>Knowledge representation for culturally competent personal robots: requirements, design principles, implementation, and assessment</article-title>
          .
          <source>International Journal of Social Robotics</source>
          , Springer, pp.
          <fpage>1</fpage>
          –
          <lpage>24</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Krishna</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , et al.:
          <article-title>Visual genome: Connecting language and vision using crowdsourced dense image annotations</article-title>
          .
          <source>IJCV</source>
          , Springer,
          <volume>123</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>32</fpage>
          –
          <lpage>73</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>