<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.18287/1613-0073-2016-1638-340-347</article-id>
      <title-group>
        <article-title>SEGMENTATION OF STEREO IMAGES WITH THE USE OF THE 3D HOUGH TRANSFORM</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ye.V. Goshin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G.E. Loshkareva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Image Processing Systems Institute - Branch of the Federal Scientific Research Centre "Crystallography and Photonics" of the Russian Academy of Sciences; Samara National Research University</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>1638</volume>
      <fpage>340</fpage>
      <lpage>347</lpage>
      <abstract>
        <p>A technology for the segmentation of stereo images using the Hough transform is proposed in this paper. In the first stage, a 3D model of the scene is formed in the form of a point cloud, for the case where information about the camera parameters is lacking. In the second stage, with the help of the Hough space, the most appropriate set of planes is found, and the segmentation of the generated scene is then carried out based on these planes. Experimental results for simulated scenes are given.</p>
      </abstract>
      <kwd-group>
        <kwd>image processing</kwd>
        <kwd>3D reconstruction</kwd>
        <kwd>segmentation</kwd>
        <kwd>Hough transform</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The problem of image analysis and, in particular, the problem of image segmentation
is common to many different research areas. However, information useful for a
reliable segmentation of images is often lacking. This may occur, for example, when the
texture of the scene objects consists of large areas of different colors. In this case, if
there is more than one image, it makes sense to analyze not the intensity distribution
of individual images but the three-dimensional structure of the scene instead.
There are a large number of algorithms which construct a three-dimensional model of
a scene from stereo images [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Typically, in the case where the images have been
obtained from arbitrary viewpoints, the camera parameters are unknown, and thus it is
necessary to determine these parameters. The problem of stereo image matching
under these conditions is considered in several articles [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ].
      </p>
      <p>
        One of the approaches to the segmentation of three-dimensional scenes consists, first,
in the detection of some specific objects in the scene [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. For example, [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
considers the scene segmentation based on the detection of planes in the point cloud,
obtained by scanning the surface with a laser ranger (LIDAR). The 3D Hough transform
is used for the detection of the planes [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>The aim of this work is to study a 3D scene segmentation method using a two-step
procedure. The first step is the formation of a 3D model of a scene in the form of a
point cloud - where the camera parameters are unknown. In the second step, with the
use of the Hough space, we find the most appropriate set of planes on which to base a
segmentation of the generated scene.</p>
      <p>
        The main difference between the proposed technology and the one described in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is
the use of stereo images. It should be emphasized that these images are not only used
as initial data for the 3D model construction (in the form of a point cloud), but also
act as a meaningful result in themselves, since the result of a 3D point cloud
segmentation can be used for the segmentation of the initial images.
      </p>
      <p>
        Generally, the proposed method can be used for the segmentation of the scene and the
images in relation to multiple objects which belong to different planes in
three-dimensional space, as is also the case with [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. However, the present paper considers a
restricted segmentation problem: the division of the image pixels and the
constructed 3D model points into two classes – background points and object points.
The experimental study presented below demonstrates the segmentation result in
relation to a simulated scene.
      </p>
    </sec>
    <sec id="sec-2">
      <title>The segmentation method</title>
      <p>The main stages of the stereo image segmentation method, using the 3D Hough
transform, are shown in Fig. 1.</p>
      <p>In accordance with the scheme, the 3D scene model is constructed from two images.
Then, a Hough transform is carried out on all points of the resulting 3D scene. Among
all the candidate planes, we search for the one with the maximum number of votes and select
it as the background plane. By calculating the distance from the points to the selected plane,
the model is divided into background points and object points. The initial images can then
be segmented with the use of the segmented scene obtained.</p>
      <p>The key stages of the process will be considered in detail further on in this paper.</p>
      <sec id="sec-2-1">
        <title>Camera parameters estimation</title>
        <p>
          In this paper, we use the algorithm for camera parameter determination described in
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. It is assumed that the matrices of the first and the second cameras are the same
and known. Also, the global system of coordinates has been made to coincide with the
coordinate system of the first camera: i.e., the camera view direction coincides with
the direction of the axis OZ. In addition, the set of corresponding points
{m_i | i = 1, ..., N} and {m'_i | i = 1, ..., N} is given, where each pair of points
m_i = (x, y)^T and m'_i = (x', y')^T are projections of the same point in
three-dimensional space onto the first and second images, respectively.
We denote the rotation matrices and translation vectors of the
first and second cameras as R, R', t and t'.
We obtain an expression correlating the corresponding points in the two images
through the parameters of the rotation matrix and translation vector, up to the accuracy
of an unknown parameter Z:

x' = (r11·x + r12·y + r13 + tx/Z) / (r31·x + r32·y + r33 + tz/Z),
y' = (r21·x + r22·y + r23 + ty/Z) / (r31·x + r32·y + r33 + tz/Z).

The resulting expression can be represented as a sequence of a rotation and a
translation:

m̃ = (x̃, ỹ, z̃)^T = R'·m(x, y),
x'·(z̃ + tz/Z) = x̃ + tx/Z,
y'·(z̃ + tz/Z) = ỹ + ty/Z,

where Z is the depth coordinate of the point in three-dimensional space.
        </p>
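As a quick numerical check, the correspondence equations above can be evaluated directly. The sketch below is an illustration rather than the authors' implementation; it assumes normalized image coordinates, and maps a point in the first image to its position in the second given R, t and the depth parameter Z:

```python
import numpy as np

def map_correspondence(R, t, Z, x, y):
    """Map a point (x, y) in the first image to (x', y') in the second,
    given rotation R, translation t and the unknown depth parameter Z
    (normalized image coordinates assumed)."""
    denom = R[2, 0] * x + R[2, 1] * y + R[2, 2] + t[2] / Z
    xp = (R[0, 0] * x + R[0, 1] * y + R[0, 2] + t[0] / Z) / denom
    yp = (R[1, 0] * x + R[1, 1] * y + R[1, 2] + t[1] / Z) / denom
    return xp, yp

# Identity rotation and a pure translation along x: the point shifts by tx/Z.
xp, yp = map_correspondence(np.eye(3), np.array([0.5, 0.0, 0.0]), Z=2.0, x=0.1, y=0.2)
# xp → 0.35, yp → 0.2
```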
        <p>The proposed decomposition enables an iterative procedure for determining the
parameters of the camera. This consists of two stages: the procedure for determining the
rotation, in the course of which the rotation matrix R is constructed, and the procedure
for determining the translation, in the course of which the translation vector
t = (tx, ty, tz)^T is constructed.</p>
        <p>
          Using the Lucas–Kanade method [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], we form an optical flow which matches points
between the first and second images. The point cloud based on the matches obtained
is formed by triangulation [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
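The triangulation step can be illustrated with a minimal linear (DLT) triangulation of one matched point from two views. The cited work [11] analyses more refined, optimal methods; the function below is only a standard-textbook sketch under assumed projection matrices, not the code used in the paper:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.
    P1, P2 : 3x4 camera projection matrices
    x1, x2 : (x, y) image coordinates of the matched point"""
    # Each view contributes two homogeneous linear constraints on X.
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]            # null vector of A = homogeneous 3D point
    return X[:3] / X[3]   # dehomogenize

# Two axis-aligned cameras, the second shifted along x by one unit.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X = triangulate(P1, P2, x1=(0.0, 0.0), x2=(-0.2, 0.0))
# X → approximately [0, 0, 5]
```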
      </sec>
      <sec id="sec-2-2">
        <title>Segmentation of a 3D scene model</title>
        <p>The aim of this stage is to separate the objects from the background.</p>
        <p>
          To detect planes we use the 3D Hough transform. The Hough transform is a method
for detecting parametrized objects which is generally used to detect circles and lines.
For example, paper [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] describes the detection of a variety of two-dimensional
objects with a reference contour using the generalized Hough transform.
In this paper, a set of points in the space R^3 is used as the input, and the output is a set
of parameterized planes. A plane is represented by the distance from the origin of
coordinates to the plane and a normal vector to this plane. Let p be a point in the plane,
n the unit normal vector of the plane and ρ the distance from the origin to the plane;
then

ρ = p · n = px·nx + py·ny + pz·nz.

After substituting the angles between the normal vector and the selected coordinate
system, the equation of the plane can be written as follows:

px·cos θ·sin φ + py·sin θ·sin φ + pz·cos φ = ρ,     (1)

where θ is the angle between the projection of the normal vector onto the plane xy and
the axis x, and φ is the angle between the normal vector and the axis z. The
coordinates θ, φ and ρ thus define a 3D Hough space, each point of which has
a corresponding plane in R^3. In turn, each point (x0, y0, z0) of the space R^3 has some
surface in the Hough space corresponding to it. At the same time, each point (θ, φ, ρ) of
this surface characterizes some plane passing through the given point
(x0, y0, z0).
        </p>
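Equation (1) can be checked pointwise: under the angle convention above, every point of a plane whose normal is the z axis yields the same distance ρ regardless of θ. A short sketch (an illustration, not the paper's code):

```python
import numpy as np

def point_rho(p, theta, phi):
    """Left-hand side of Eq. (1): the distance rho of the plane with
    normal angles (theta, phi) that passes through point p."""
    return (p[0] * np.cos(theta) * np.sin(phi)
            + p[1] * np.sin(theta) * np.sin(phi)
            + p[2] * np.cos(phi))

# Every point of the plane z = 2 maps to (phi = 0, rho = 2),
# whatever theta is: the normal is the z axis.
r1 = point_rho(np.array([0.3, -1.0, 2.0]), theta=1.1, phi=0.0)
r2 = point_rho(np.array([5.0, 4.0, 2.0]), theta=0.2, phi=0.0)
# r1 → 2.0, r2 → 2.0
```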
        <p>In this paper, we consider the problem of finding the background plane containing the
highest number of points from the point cloud. After determining the parameters
(θ̂, φ̂, ρ̂) of the background plane, for each point of the initial cloud we determine
whether the point belongs to this background plane. To do this, we substitute the point's
coordinates into the plane equation and then compare the resulting value with a certain
threshold ε:

|px·cos θ̂·sin φ̂ + py·sin θ̂·sin φ̂ + pz·cos φ̂ − ρ̂| ≤ ε.</p>
        <p>All the points that satisfy this inequality belong to the background plane, while the rest
are considered to belong to objects of the scene.
Since there remains a one-to-one correspondence between the points of the 3D model
and the image pixels, the results of the model segmentation can be used for the
segmentation of the initial images.
</p>
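The threshold test can be sketched in a few lines of vectorized code; the value of ε and the toy scene below are assumptions for illustration only:

```python
import numpy as np

def split_background(cloud, theta, phi, rho, eps=0.01):
    """Split a point cloud into background-plane points (True) and object
    points (False), given the dominant plane (theta, phi, rho)."""
    n = np.array([np.cos(theta) * np.sin(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(phi)])                 # unit normal from the two angles
    residual = cloud @ n - rho                  # signed distance to the plane
    return np.abs(residual) <= eps              # within threshold = background

# Toy scene: two points on the plane z = 0 (phi = 0, rho = 0) and one object point.
cloud = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 0.0], [0.5, 0.5, 3.0]])
mask = split_background(cloud, theta=0.0, phi=0.0, rho=0.0)
# mask → [True, True, False]
```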
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The implementation of the 3D Hough transform algorithm</title>
      <p>In order to implement the 3D Hough transform the following algorithm is used. As
the Hough accumulator space, we used a three-dimensional array of integers wherein
each cell of the array corresponds to some plane - with the parameters set by the
coordinates of this cell.</p>
      <p>Each point of the point cloud formed in the previous stage increments the value of
those cells of the accumulator space, which correspond to planes passing through the
given point or near it - an exact match is not possible due to the discrete nature of the
array indices.</p>
      <p>This algorithm can be written as pseudocode in the following way:
For each point p of the point cloud:
└ For each value of the angle coordinate θ:
  └ For each value of the angle coordinate φ:
    ├ Calculate ρ using formula (1)
    └ Increment A(θ, φ, ρ)
As a result of this algorithm, each cell of the array is assigned a value, which is the
number of points of the point cloud lying near the plane which is represented by this
cell. The cell of the array that contains the maximum value represents the required
background plane.</p>
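The accumulator loop above can be sketched as follows. This is a vectorized illustration, not the authors' implementation; the (deliberately coarse) step sizes, the toy cloud and the restriction to ρ ≥ 0 are assumptions:

```python
import numpy as np

def hough_background_plane(cloud, ang_step_deg=5, rho_step=0.01, rho_max=1.0):
    """Vote in a discretized (theta, phi, rho) accumulator and return the
    parameters of the cell with the most votes - the background plane."""
    thetas = np.deg2rad(np.arange(0, 180, ang_step_deg))
    phis = np.deg2rad(np.arange(0, 360, ang_step_deg))
    n_rho = int(round(rho_max / rho_step)) + 1
    acc = np.zeros((thetas.size, phis.size, n_rho), dtype=np.int32)
    for ti, th in enumerate(thetas):
        # Normals for every phi at this theta, shape (n_phi, 3).
        normals = np.stack([np.cos(th) * np.sin(phis),
                            np.sin(th) * np.sin(phis),
                            np.cos(phis)], axis=1)
        rho = cloud @ normals.T                      # (N, n_phi) distances, Eq. (1)
        k = np.rint(rho / rho_step).astype(int)      # quantized rho index
        for pi in range(phis.size):
            kk = k[:, pi]
            kk = kk[(kk >= 0) & (kk < n_rho)]        # keep in-range bins only
            np.add.at(acc[ti, pi], kk, 1)            # one vote per point
    ti, pi, ki = np.unravel_index(np.argmax(acc), acc.shape)
    return thetas[ti], phis[pi], ki * rho_step

# Toy cloud: 50 points on the plane z = 0.5 plus a few random outliers.
rng = np.random.default_rng(0)
plane = np.column_stack([rng.random((50, 2)), np.full(50, 0.5)])
cloud = np.vstack([plane, rng.random((5, 3))])
theta, phi, rho = hough_background_plane(cloud)
# phi → 0.0 (normal along z), rho → 0.5
```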
    </sec>
    <sec id="sec-4">
      <title>Experimental results</title>
      <p>The findings of the experimental study of the proposed technology are given below. For
these purposes, a simulated scene was set up and two images were obtained at various
angles (Fig. 2, 3).
To form the Hough space, the following parameters were selected:
θ = k°, k = 0, ..., 179,
φ = k°, k = 0, ..., 359,
ρ = 0.001k, k = 0, ..., 1000.</p>
      <p>With the use of these parameters and the generated point cloud, the background plane
was determined, and the points of the cloud were divided into those belonging (Fig. 5)
and those not belonging (Fig. 6) to this plane.
The initial images were segmented in accordance with the segmentation which had been
carried out on the 3D model. The resulting segmented images of the scene are shown in Figs.
7 and 8.
It is evident that the algorithm separated (though not perfectly) the background plane
from the objects not belonging to that plane.
</p>
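For reference, the Hough-space discretization listed above (1° steps for both angles, 0.001 steps for the distance) implies the following accumulator size; the int32 memory estimate in the comment is our own back-of-the-envelope figure:

```python
import numpy as np

# Hough-space axes at the resolution used in the experiment above.
thetas = np.arange(0, 180)           # theta = k degrees, k = 0..179
phis = np.arange(0, 360)             # phi = k degrees, k = 0..359
rhos = 0.001 * np.arange(0, 1001)    # rho = 0.001*k, k = 0..1000

shape = (thetas.size, phis.size, rhos.size)
cells = thetas.size * phis.size * rhos.size
# shape → (180, 360, 1001); cells → 64864800 (about 250 MB as int32)
```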
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>It has been demonstrated that the proposed information technology for stereo
image segmentation, based on the application of the 3D Hough transform to a
generated point cloud, provides a good quality of segmentation. The experimental
work on a simulated scene allowed us to separate the background plane of an image
from the other objects.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>The study was supported by RFBR, research project No. 16-07-00729 a.
We would like to thank our scientific supervisor, Prof. Vladimir Fursov, for his support,
continuous guidance and valuable suggestions during this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Pollefeys</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nistér</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frahm</surname>
            <given-names>JM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akbarzadeh</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>Detailed real-time urban 3d reconstruction from video</article-title>
          .
          <source>International Journal of Computer Vision</source>
          ,
          <year>2008</year>
          ;
          <volume>78</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>143</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baillard</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maître</surname>
            <given-names>H</given-names>
          </string-name>
          .
          <article-title>3-D reconstruction of urban scenes from aerial stereo imagery: a focusing strategy</article-title>
          .
          <source>Computer Vision and Image Understanding</source>
          ,
          <year>1999</year>
          ;
          <volume>76</volume>
          (
          <issue>3</issue>
          ):
          <fpage>244</fpage>
          -
          <lpage>258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Pollefeys</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koch</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Gool</surname>
            <given-names>L</given-names>
          </string-name>
          .
          <article-title>Self-calibration and metric reconstruction inspite of varying and unknown intrinsic camera parameters</article-title>
          .
          <source>International Journal of Computer Vision</source>
          ,
          <year>1999</year>
          ;
          <volume>32</volume>
          (
          <issue>1</issue>
          ):
          <fpage>7</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Eisert</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinbach</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girod</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Automatic reconstruction of stationary 3-D objects from multiple uncalibrated camera views</article-title>
          .
          <source>IEEE Transactions on Circuits and Systems for Video Technology</source>
          ,
          <year>2000</year>
          ;
          <volume>10</volume>
          (
          <issue>2</issue>
          ):
          <fpage>261</fpage>
          -
          <lpage>277</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Reitberger</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schnörr</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krzystek</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stilla</surname>
            <given-names>U.</given-names>
          </string-name>
          <article-title>3D segmentation of single trees exploiting full waveform LIDAR data</article-title>
          .
          <source>ISPRS Journal of Photogrammetry and Remote Sensing</source>
          ,
          <year>2009</year>
          ;
          <volume>64</volume>
          (
          <issue>6</issue>
          ):
          <fpage>561</fpage>
          -
          <lpage>574</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Tarsha-Kurdi</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Landes</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grussenmeyer</surname>
            <given-names>P</given-names>
          </string-name>
          .
          <article-title>Hough-transform and extended ransac algorithms for automatic detection of 3d building roof planes from lidar data</article-title>
          .
          <source>Proceedings of the ISPRS Workshop on Laser Scanning</source>
          ,
          <year>2007</year>
          ;
          <volume>36</volume>
          :
          <fpage>407</fpage>
          -
          <lpage>412</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Zhang</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ning</surname>
            <given-names>X</given-names>
          </string-name>
          .
          <article-title>SVM-based classification of segmented airborne LiDAR point clouds in urban areas</article-title>
          .
          <source>Remote Sensing</source>
          ,
          <year>2013</year>
          ;
          <volume>5</volume>
          (
          <issue>8</issue>
          ):
          <fpage>3749</fpage>
          -
          <lpage>3775</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Borrmann</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elseberg</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lingemann</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nüchter</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>The 3D Hough Transform for plane detection in point clouds: A review and a new accumulator design</article-title>
          .
          <source>3D Research</source>
          ,
          <year>2011</year>
          ;
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Goshin</surname>
            <given-names>YeV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fursov</surname>
            <given-names>VA</given-names>
          </string-name>
          .
          <article-title>3D scene reconstruction from stereo images with unknown extrinsic parameters</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2015</year>
          ;
          <volume>39</volume>
          (
          <issue>5</issue>
          ):
          <fpage>770</fpage>
          -
          <lpage>776</lpage>
          . DOI: 10.18287/0134-2452-2015-39-5-770-775.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Lucas</surname>
            <given-names>BD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanade</surname>
            <given-names>T.</given-names>
          </string-name>
          <article-title>An iterative image registration technique with an application to stereo vision</article-title>
          .
          <source>IJCAI</source>
          ,
          <year>1981</year>
          ;
          <volume>81</volume>
          :
          <fpage>674</fpage>
          -
          <lpage>679</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hartley</surname>
            <given-names>RI</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sturm</surname>
            <given-names>P.</given-names>
          </string-name>
          <article-title>Triangulation</article-title>
          .
          <source>Computer Vision and Image Understanding</source>
          ,
          <year>1997</year>
          ;
          <volume>68</volume>
          (
          <issue>2</issue>
          ):
          <fpage>146</fpage>
          -
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Fursov</surname>
            <given-names>VA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bibikov</surname>
            <given-names>SA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yakimov</surname>
            <given-names>PYu</given-names>
          </string-name>
          .
          <article-title>Localization of objects contours with different scales in images using Hough transform</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2013</year>
          ;
          <volume>37</volume>
          (
          <issue>4</issue>
          ):
          <fpage>496</fpage>
          -
          <lpage>502</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>