<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Challenges in Using Semantic Knowledge for 3D Object Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Corina Gurău</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Nüchter</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Automation Group, Jacobs University Bremen gGmbH</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Robotics and Telematics, University of Würzburg</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <fpage>29</fpage>
      <lpage>35</lpage>
      <abstract>
        <p>To cope with a wide variety of tasks, robotic systems need to perceive and understand their environments. In particular, they need a representation of individual objects, as well as contextual relations between them. Visual information is the primary data source used to make predictions and inferences about the world. There exists, however, a growing tendency to introduce high-level semantic knowledge to enable robots to reason about objects. We use the Semantic Web framework to represent knowledge and make inferences about sensor data, in order to detect and classify objects in the environment. The contribution of this work is the identification of several challenges that co-occur when combining sensor data processing with such a reasoning method.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Autonomous recognition of structure in an indoor environment is a
challenging task for the robotics community. Relying on depth perception, prior
knowledge and logic, humans are particularly adroit at understanding their
surroundings. Robotic systems rely on imagery and sensor data to build and encode their
knowledge. Yet, we expect some systems to perform tasks such as navigation,
manipulation, or interaction, in cluttered environments, structured for humans.
To improve the way robots structure their knowledge of the world, we can have
them share a knowledge management system with humans. Robots could then adopt
our way of representing knowledge, making inferences and taking decisions. By finding a representation in
Description Logic for common-sense statements, and mapping them to ontological
concepts and relations between those concepts, information such as "the book is
on the shelf" or "the room is empty" can be shared between humans and robots. This
high-level semantic description through ontologies also permits logical
reasoning.</p>
      <p>In this paper we aim to verify whether the bottom-up, knowledge-based
interpretation of indoor scenes is a reliable approach for 3D object detection. This task
has mostly been addressed with statistical methods and pattern recognition.
Detecting and classifying objects by relying on a logical representation has received
less attention in recent years, because large amounts of data and
computational resources have become available for learning the structure of our visual world.</p>
      <p>Our proposed system is used for knowledge modeling and information
retrieval. We divide the task into three main components: 1) geometric analysis
and characterization of scanned environment data, 2) semantic description and
ontology mapping of geometric shapes, and 3) knowledge query and rule
evaluation.</p>
      <p>After scanning the environment, we use 3D point cloud segments to identify
predefined geometric primitives and formalize spatial relations between object
parts (cf. Fig. 1). We store the obtained geometric information and load it in a
knowledge management system to populate an ontology with class instances. To
answer queries over the computed spatial data, we write rules in the
Semantic Web Rule Language (SWRL) and evaluate them with a reasoner inside Protégé,
an ontology editor and knowledge-base framework.</p>
      <p>Space and spatial organization are among the most basic forms of common-sense knowledge for
humans. To describe them, we make use of the Web Ontology Language (OWL).
In our approach, we create an OWL ontology based on Description Logic (DL),
which permits defining instances (description logic individuals), classes
(description logic concepts), properties (binary relations specifying class
characteristics), and operations (union, intersection, complement, etc.). Our framework
relies on reasoning with the 3D geometric information to detect and classify
objects in a human environment. We consider properties such as size, orientation,
position of point cloud segments, as well as spatial relations between segments,
such as intersection, inclusion or parallelism. Our intuition in selecting the
features is that it is easier to compute spatial relations for simple planar primitives
of a complex object rather than computationally expensive ones for the whole
object.</p>
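      <p>The pairwise relations between planar primitives can be illustrated with a minimal Python sketch. It assumes that each segment is summarized by a plane normal and that a fixed angular tolerance decides between parallel and perpendicular; the function names and the 5 degree tolerance are illustrative assumptions and not part of the system described here.</p>
      <preformat><![CDATA[
```python
import math

def unit(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def angle_between_planes(n1, n2):
    """Angle in degrees (0 to 90) between two planes given by their normals."""
    a, b = unit(n1), unit(n2)
    # The sign of a normal is arbitrary, so only the absolute cosine matters.
    cos = min(1.0, abs(sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(cos))

def relation(n1, n2, tol_deg=5.0):
    """Classify a pair of planar segments (tolerance is an assumed value)."""
    ang = angle_between_planes(n1, n2)
    if ang <= tol_deg:
        return "parallel"
    if ang >= 90.0 - tol_deg:
        return "perpendicular"
    return "other"

# Two shelf boards have parallel normals; a board and a side wall do not.
print(relation((0, 1, 0), (0, 1, 0)))
print(relation((0, 1, 0), (1, 0, 0)))
```
]]></preformat>
      <p>Working on normals of simple planar patches keeps each comparison to a single dot product, which is the kind of inexpensive relation the feature selection above aims for.</p>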
      <p>
        Related work in the area of combining 3D point cloud processing with
knowledge-based reasoning is concerned with architectural reconstructions [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. A
similar 3D object classification approach was taken by [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]; however, that paper does not formulate solutions at critical
points. In this paper, we focus on
identifying the challenges in such an approach.
      </p>
      <p>
        For the preprocessing phase we use the Felzenszwalb and Huttenlocher
segmentation algorithm. Recently, we presented a segmentation method for 3D
point clouds acquired with state-of-the-art 3D laser scanners extending the
method of Felzenszwalb and Huttenlocher [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. From the 3D points an
undirected graph is constructed. The graph is then segmented by using a k-nearest
neighbor search and a similarity measure based on surface normals, resulting in
a segmentation of the point cloud into planar patches. Fig. 2 shows two examples.
      </p>
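      <p>The segmentation step can be sketched as follows. This is a simplified Python illustration of a Felzenszwalb-Huttenlocher-style merge criterion on a k-nearest-neighbor graph with edge weights derived from surface normals; it is not the implementation from [<xref ref-type="bibr" rid="ref1">1</xref>]. The brute-force neighbor search, the weight function, the assumption of unit-length normals and the parameter values are all chosen for illustration only.</p>
      <preformat><![CDATA[
```python
import math

def knn_edges(points, k):
    """Undirected k-nearest-neighbor edges (brute force, small inputs only)."""
    edges = set()
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

def normal_dissimilarity(n1, n2):
    """0 for parallel unit normals, 1 for perpendicular ones."""
    return 1.0 - abs(sum(a * b for a, b in zip(n1, n2)))

def segment(points, normals, k=3, threshold=0.5):
    """Return one component label per point."""
    parent = list(range(len(points)))
    size = [1] * len(points)
    internal = [0.0] * len(points)  # largest edge weight merged inside each component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    weighted = sorted((normal_dissimilarity(normals[i], normals[j]), i, j)
                      for i, j in knn_edges(points, k))
    for w, i, j in weighted:
        a, b = find(i), find(j)
        if a == b:
            continue
        # Merge only if the edge is no heavier than both components tolerate.
        if w <= min(internal[a] + threshold / size[a],
                    internal[b] + threshold / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = max(internal[a], internal[b], w)
    return [find(i) for i in range(len(points))]

# A floor patch and a wall patch with perpendicular normals.
pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0),
       (3, 0, 1), (3, 1, 1), (3, 0, 2), (3, 1, 2)]
nrm = [(0, 0, 1)] * 4 + [(1, 0, 0)] * 4
print(segment(pts, nrm, k=4))
```
]]></preformat>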
    </sec>
    <sec id="sec-2">
      <title>The Protégé Platform and the Web Ontology Language</title>
      <p>We model our prior knowledge of the environment in an ontology, making
use of the Protégé-OWL editor. Protégé-OWL is an extension of Protégé that
permits loading and saving ontologies, defining logical class characteristics as OWL
expressions, and most importantly, executing reasoners such as description logic
classifiers. To complete the modeling process we add semantic rules written
in the Semantic Web Rule Language (SWRL) and run Pellet, a description logic
reasoner designed to work with OWL. Pellet implements a full
decision procedure for OWL-DL and supports reasoning with
individuals (asserted or inferred) and user-defined datatypes, as well as debugging and comparing
ontologies.</p>
      <p>Objects of interest in the scene are modeled under the class BuildingObject,
while the rest map to geometries: either point cloud segments or pairs of point
cloud segments. We therefore restrict our definition of an object to anything
composed of them.</p>
      <p>Within the OWL ontology, we create not only appropriate object classes
but also class properties, through which we encode object geometry and
spatial relations between segments in the scene (cf. Fig. 3). To integrate 3D data
processing with Semantic Web technologies, we consider the following segment attributes.
Size: since we only consider planar surfaces, size refers to the area
of the segment; it is the most distinguishing segment property. Position: we
record minX, maxX, minY, maxY, minZ and maxZ, as some objects are expected
at a certain relative position inside a scene. Orientation: individuals of vertical
or horizontal segments are directly instantiated under the appropriate class.
Equally important are the spatial relations between
segments: connected, parallel and perpendicular. The pairs are instantiated under the
classes Pair or PairedObjectPart.</p>
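      <p>To make the attribute encoding concrete, the following Python sketch derives the attributes of one segment from its points and writes them as an individual in Turtle syntax. Several points are assumptions made for illustration: the example.org namespace, the property names other than hasSize and hasMaxY, the class name VerticalSegment, Y as the vertical axis, and the use of the bounding-box area as a stand-in for the segment area.</p>
      <preformat><![CDATA[
```python
def segment_attributes(points):
    """Bounding-box attributes of one planar segment given as (x, y, z) tuples."""
    xs, ys, zs = zip(*points)
    extent = {"X": max(xs) - min(xs), "Y": max(ys) - min(ys), "Z": max(zs) - min(zs)}
    thin_axis = min(extent, key=extent.get)       # axis along which the patch is flat
    side_a, side_b = sorted(extent.values())[1:]  # the two larger extents
    return {
        # Assumption: Y points upwards, so a patch that is flat in Y is horizontal.
        "class": "HorizontalSegment" if thin_axis == "Y" else "VerticalSegment",
        "hasSize": side_a * side_b,               # bounding-box area, an approximation
        "hasMinX": min(xs), "hasMaxX": max(xs),
        "hasMinY": min(ys), "hasMaxY": max(ys),
        "hasMinZ": min(zs), "hasMaxZ": max(zs),
    }

def to_turtle(name, attrs, prefix="ex"):
    """Serialize one segment as a Turtle individual with xsd:double values."""
    lines = [f"{prefix}:{name} a {prefix}:{attrs['class']} ;"]
    props = [(k, v) for k, v in attrs.items() if k != "class"]
    for i, (key, value) in enumerate(props):
        end = " ." if i == len(props) - 1 else " ;"
        lines.append(f'    {prefix}:{key} "{float(value)}"^^xsd:double{end}')
    return "\n".join(lines)

# A horizontal board one unit above the origin (hypothetical data).
shelf = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.5), (1.0, 1.0, 0.5)]
print("@prefix ex: <http://example.org/scene#> .")
print("@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .")
print(to_turtle("segment1", segment_attributes(shelf)))
```
]]></preformat>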
    </sec>
    <sec id="sec-3">
      <title>SWRL Rules</title>
      <p>The purpose of our semantic interpretation approach is to enable querying
the spatial knowledge base. After populating our Protégé classes with
individuals, we treat their properties and relationships as logical predicates (asserted
knowledge), and we use logical rules to derive new facts and instances (inferred
knowledge). The SWRL rules incorporate the restrictions that we impose on the
environment: our knowledge about the scene configuration and about the shape
of the objects. A rule takes the form of an implication between an antecedent
and a consequent, and supports either a final decision or an intermediate
decision in the interpretation process. For instance, we know that a bookshelf essentially
consists of a series of parallel segments at certain intervals. Similarly, we
judge that if two stairs occur in the same sequence of primitives, the
object is a staircase. Two example rules are as follows:</p>
      <sec id="sec-3-1">
        <title>LowShelf(?x) → HorizontalSegment(?x) ∧ hasSize(?x, ?size) ∧ swrlb:greaterThan(?size, 0.02) ∧ swrlb:lessThan(?size, 1.0) ∧ hasMaxY(?x, ?maxY) ∧ swrlb:greaterThan(?maxY, 0.6) ∧ swrlb:lessThan(?maxY, 1.5)</title>
      </sec>
      <sec id="sec-3-2">
        <title>Staircase(?x) → hasHVConnectedPair(?x, ?pair1) ∧ Stair(?pair1) ∧ hasHVConnectedPair(?x, ?pair2) ∧ Stair(?pair2) ∧ GeometricPrimitiveSequence(?x)</title>
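        <p>The conditions of the two example rules can also be checked outside the reasoner, as in the following minimal Python sketch. The dictionary keys are hypothetical stand-ins for the ontology classes and properties. Requiring two distinct stair pairs is an assumption that follows the textual description; SWRL variables on their own may bind to the same individual.</p>
        <preformat><![CDATA[
```python
def is_low_shelf(seg):
    """Conditions of the LowShelf rule: horizontal, bounded size, bounded maxY."""
    return (seg["horizontal"]
            and 0.02 < seg["size"] < 1.0
            and 0.6 < seg["maxY"] < 1.5)

def is_staircase(seq):
    """Conditions of the Staircase rule on a sequence of geometric primitives.

    seq["hv_connected_pairs"] lists the horizontal-vertical connected pairs of
    the sequence; each pair carries an "is_stair" flag (hypothetical layout).
    """
    stairs = [p for p in seq["hv_connected_pairs"] if p["is_stair"]]
    # Assumption: at least two distinct stair pairs are required.
    return seq["is_sequence"] and len(stairs) >= 2

board = {"horizontal": True, "size": 0.5, "maxY": 1.0}
steps = {"is_sequence": True,
         "hv_connected_pairs": [{"is_stair": True}, {"is_stair": True}]}
print(is_low_shelf(board), is_staircase(steps))
```
]]></preformat>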
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>To show the potential of our approach we exhibit three different simulations
in which we query the knowledge system for different building objects. Our
approach is also viable for different geometries, in particular after extending the
method to curved spaces by adding properties and rules accordingly.</p>
      <p>Our simulations concern half of an empty room, a staircase and a bookshelf.
For each scenario, a set of SWRL rules was designed that allows for labeling of
intermediate object parts such as a ceiling, shelf planes or stairs, as well as
labeling of the entire object of interest. Labels correspond to object categories. We
map the segmentation output to the ontology via a mapping language, and
obtain asserted instances. By running the reasoner, we further label the segments,
and create inferred instances. For the three examples, the results are shown in
Table 1. Not all mapped segments are labeled, which is due to the challenges
described next.</p>
      <p>Missing data. We observed that some mapped segments are not labeled because of
missing data. The laser scanner measures only visible surfaces, so multiple
3D scans and scan registration are necessary to digitize scenes completely.</p>
      <p>Efficiency for multi-valued predicates. To extract relations between
individuals, the individuals have to be compared with each other. Currently, we perform this comparison
while processing the point cloud in C++, exploiting spatial data structures
such as k-d trees.</p>
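      <p>The benefit of a spatial index for this comparison can be illustrated with a small sketch. The system described here performs this step in C++; the version below is written in Python for illustration only, uses a hand-written k-d tree, and assumes that segments are compared via their centroids and a fixed search radius. All names and values are hypothetical.</p>
      <preformat><![CDATA[
```python
import math

def build_kdtree(items, depth=0):
    """Build a k-d tree from a list of (point, label) items; returns dict nodes."""
    if not items:
        return None
    axis = depth % 3
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }

def within_radius(node, center, radius, found=None):
    """Collect labels of all stored points within radius of center."""
    if found is None:
        found = []
    if node is None:
        return found
    point, label = node["item"]
    if math.dist(point, center) <= radius:
        found.append(label)
    diff = center[node["axis"]] - point[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    within_radius(near, center, radius, found)
    # The far side can only contain hits if the search ball crosses the split.
    if abs(diff) <= radius:
        within_radius(far, center, radius, found)
    return found

def candidate_pairs(centroids, radius):
    """Index pairs (i, j), i < j, whose centroids lie within radius."""
    tree = build_kdtree([(p, i) for i, p in enumerate(centroids)])
    pairs = set()
    for i, p in enumerate(centroids):
        for j in within_radius(tree, p, radius):
            if i < j:
                pairs.add((i, j))
    return sorted(pairs)

# Three shelf boards stacked 0.3 apart and one distant segment.
boards = [(0, 0.0, 0), (0, 0.3, 0), (0, 0.6, 0), (5, 5, 5)]
print(candidate_pairs(boards, radius=0.35))
```
]]></preformat>
      <p>Only the pairs returned by the radius query need to be tested for relations such as connected or parallel, which avoids comparing every individual with every other one.</p>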
      <p>Memory efficiency. Because realistically sized
real-world scenes contain many segments, the Pellet reasoner tends to run out of memory, owing to the
complexity of the description logic used.</p>
      <p>Designing the data processing tool chain. It is not clear which parts of
the interpretation process should be implemented at the point cloud
processing level, i.e., in the C/C++ part that acquires the sensor data,
calculates the normals and performs the segmentation, and which parts should
be performed by description logic reasoning in the knowledge-based system.</p>
      <p>The question is when and where to call Pellet and the ontology in use.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We presented a framework for semantic interpretation of point clouds which
takes advantage of Semantic Web technologies. Built on the
Protégé-OWL platform, our alternative method of linking top-level semantic qualification with
low-level geometric calculations uses a connectivity-preserving segmentation
algorithm, an ontology structure and a reasoner. We believe that the logical
structure of an ontology is suitable for semantic knowledge representation and that,
under the Semantic Web framework, the Web Ontology Language is appropriate for
defining spatial knowledge. Such an approach provides a better understanding
of a 3D scene by facilitating detection and recognition in 3D point clouds.</p>
      <p>
        Needless to say, a lot of work remains to be done. To avoid the use of crisp
thresholds, we plan to add fuzziness to the system and/or use probabilistic
reasoning. A promising approach is given by Pu and Vosselman in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. They use
semantic building knowledge to reconstruct a polyhedron model of outdoor
terrestrial 3D scans. They also describe the uncertainty and make expected
decisions [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Further future work will aim at interpreting multiple registered 3D
scans. As our system relies on plane segmentation, this extension seems
straightforward. However, a combination with next-best-view planning is highly
desirable.
      </p>
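      <p>One possible way to soften crisp thresholds is sketched below in Python. It replaces the hard bounds of the LowShelf rule with trapezoidal membership functions and combines them with the minimum as fuzzy conjunction. This is an illustration of the idea only, not a design decision of the presented system; the widths of the soft margins are hypothetical values.</p>
      <preformat><![CDATA[
```python
def trapezoid(x, a, b, c, d):
    """Membership rising on [a, b], equal to 1 on [b, c], falling on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def low_shelf_degree(size, max_y):
    """Degree in [0, 1] to which a horizontal segment looks like a low shelf."""
    # Core intervals follow the crisp rule; the outer margins are assumed.
    size_ok = trapezoid(size, 0.0, 0.02, 1.0, 1.2)
    height_ok = trapezoid(max_y, 0.5, 0.6, 1.5, 1.6)
    return min(size_ok, height_ok)  # minimum as fuzzy AND

print(low_shelf_degree(0.5, 1.0))   # inside both core intervals
print(low_shelf_degree(0.5, 1.55))  # slightly above the crisp height bound
```
]]></preformat>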
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Sima</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nüchter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>An extension of the Felzenszwalb-Huttenlocher segmentation to 3D point clouds</article-title>
          .
          <source>In: International Conference on Machine Vision</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Duan</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cruz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nicolle</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Architectural reconstruction of 3D building objects through semantic knowledge management</article-title>
          .
          <source>In: 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hmida</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristophe</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Christophe</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Knowledge Base Approach for 3D Objects Detection in Point Clouds Using 3D Processing and Specialists Knowledge</article-title>
          .
          <source>International Journal On Advances in Intelligent Systems</source>
          , vol.
          <volume>5</volume>
          , pp.
          <fpage>114</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Günther</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiemann</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Albrecht</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hertzberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Model-based object recognition from 3D laser data</article-title>
          .
          <source>KI 2011: Advances in Artificial Intelligence</source>
          ,
          <source>Springer (LNAI 7006)</source>
          , pp.
          <fpage>99</fpage>
          -
          <lpage>110</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vosselman</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Knowledge based reconstruction of building models from terrestrial laser scanning data</article-title>
          .
          <source>ISPRS Journal of Photogrammetry and Remote Sensing</source>
          <volume>64</volume>
          (
          <issue>6</issue>
          ),
          <fpage>575</fpage>
          -
          <lpage>584</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Knowledge based building facade reconstruction from laser point clouds and images</article-title>
          .
          <source>PhD thesis</source>
          , University of Twente (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>