Image Processing and Geoinformatics


  3D scene stereo reconstruction with the use of epipolar
                       restrictions

                                  Fursov V.A., Goshin Y.V.

                             Samara State Aerospace University,
               Image Processing Systems Institute, Russian Academy of Sciences


        Abstract. In the present paper a new approach to scene digital model
        reconstruction from pair of stereo images is considered. We propose to perform
        image matching with use of weight coefficient as a penalty function for the
        distance from the point to the correspondent epipolar line. Technology
        implementation for unknown camera parameters is also considered. In this case,
        identification of the fundamental matrix from corresponding points is
        performed on the initial stage. The main advantage of this technology is the
        absence of the image rectification stage which causes additional distortions due
        to image interpolation. There is an example of scene reconstruction from pair of
        test images and a digital model reconstruction from satellite images.

        Keywords: stereo image processing, image matching, 3D reconstruction,
        projective geometry, epipolar geometry


        Citation: Fursov V.A., Goshin Y.V. 3D scene stereo reconstruction with the
        use of epipolar restrictions. Proceedings of Information Technology and
        Nanotechnology (ITNT-2015), CEUR Workshop Proceedings, 2015; 1490:
        268-276. DOI: 10.18287/1613-0073-2015-1490-268-276


1. Introduction
   The main problem with 3D-scene reconstruction is image matching. Solving this
problem presents some difficulties because projective distortions on stereo pairs are
usually significantly different. To overcome these difficulties, an image preprocessing
rectification technique is used. Rectification of stereo images is a transformation, in
which corresponding points on stereo images are arranged in the same rows.
   In [1], the theoretical basis for the method of projective rectification is provided,
and the basic algorithm for rectification is introduced. The algorithm involves
calculation of the fundamental matrix and projective transformation. The method of
projective rectification is also covered in [2]. The idea of this method is to decompose
the matrix of the projective transformation into several matrices. But a common
disadvantage of projective rectification is its inapplicability in case when the epipoles
are located on the images as it results in infinite resolution images. In addition,
images can become very large, for example, when the epipole is close to the image.


                                                                                           268
   Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics               Fursov V.A., Goshin Y.V. 3D Scene Stereo…


   In [3], polar rectification is introduced as an alternative method of rectification.
The method consists in polar scanning of images around the epipoles. Operation of
the proposed approach is illustrated on real pairs of images. This method has two
main advantages: an opportunity to operate with epipoles on images and guaranteed
minimum size of images without losing pixels.
   Though it solves many problems of projective rectification mentioned above, polar
rectification has a number of disadvantages too. For instance, it does not operate
correctly when an epipole is located at infinity. In particular, there may be cases
where one epipole is located at infinity, and another one is located on the image. In
papers [4] and [5], the idea of polar rectification is extended for these cases.
   The above methods of rectification do not take into account possible differences in
the internal parameters of the camera, such as focal lengths. This problem is described
in [6], in which an algorithm for rectification of heterogeneous and uncalibrated
image pairs is proposed, in particular, for the pairs of images obtained from static and
dynamic cameras of different focal lengths and/or different resolutions. Rectification
is performed in two steps. The first step is the correction of heterogeneity (different
scale) by the expansion, compression or shift of images. The second step is a standard
rectification of images. This approach avoids image distortions associated with
differences in scale (compression or tension as a result of rectification).
   At the same time, necessity for conversions directly on an image is a common
disadvantage of the rectification approach. The images and objects on them are
distorted considerably, owing to polar transformation. Feature points detection is
performed on the new interpolated image which causes additional errors. Although
polar rectification is more universal than the projective one, it still impairs some
problems connected with distortions of images during conversions.
   An alternative technique of matching is an approach of scene elements tracing, in
particular, using a method of optical flow [7], [8]. The method of an optical flow
operates efficiently with a sequence of images, for example, with sequences of video
frames. However, when the images are obtained from cameras which are located
relatively far apart, losses of objects during the tracing are possible.
   Earlier authors proposed a technology [9] in which a projective transform is
constructed from reliable corresponding points in an informative areas of images. And
then this transform is used to determine and adjust the corresponding points in
uninformative areas. Number of correspondence points on images given by this
technology is low and resulting 3D model is too rough.
    The technology of 3D scene reconstruction proposed in this paper to a
considerable degree allows to avoid the above mentioned disadvantages. The basic
idea is to take into account epipolar restrictions in the course of points matching.
   We consider the approach to the epipolar restrictions implementation in which the
most similar image fragments are selected in the neighborhood of the epipolar line.
Then, according to the given criterion, the best points with real coordinates belonging
to the epipolar line are selected.


                                                                                              269
   Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics                             Fursov V.A., Goshin Y.V. 3D Scene Stereo…


2. Problem definition
   In order to solve the problem of image matching, the camera obscura model is used
[9]. Let us consider the case when the parameters of the cameras, as well as their
position and orientation are known. To characterize them, camera parameters matrices
are defined:
      f 0 u0             f  0 u0 
 K   0 f v0  , K   0 f  v0  ,
                                                                               (1)
      0 0 1            0 0 1 
where f and f  is focal length of the camera, u0 , v0  , u0 , v0  are cameras’
principal points location in the coordinate system associated with these cameras [10].
   Let us introduce a global coordinate system and the coordinate systems of first and
second cameras with their centers at points c , c in the global coordinate system.
Neither of these two points, in general, coincides with the origin of the global
coordinate system.
   Suppose M is a coordinate vector of some point in the global coordinate system.
Coordinate vectors of this point in the coordinate systems of the first and second
cameras m and m are defined as
m  K R t  M ,                                                                    (2)
m  KR t M ,                                                         (3)
  R , R  are the matrices of 3  3 -dimension, describing the rotation of the
coordinate systems of the first and second cameras on the global coordinate system,
                           T                          T
and t  tx , t y , tz  , t  tx , t y , tz    are coordinates of the origin of the global
coordinate system in the coordinate systems of the first and second cameras
respectively, defined as
     t  Rc ,
     t  Rc ,
   If projections m and m (2), (3) on the first and second camera images are
known, point M coordinates in three-dimensional space can be calculated as the
intersection of rays  c, m  and  c, m  . Due to errors in the course of determining of
the corresponding points coordinates, these rays will not probably cross. The errors
occur in the internal and external calibration matrices and image distortions due to
rectification. Therefore, some inaccurate estimates of the point coordinates are usually
computed.
   A detailed review of methods for image matching is given in [11]. It presents a
classification of methods for stereo matching by the cost function, aggregation area,
and a disparity map construction approach. The paper [12] describes stereo matching
by use of so-called super pixels. Superpixel is a set of pixels that are homogeneous in
their brightness and texture. In [13], the method of matching in a rectangular area,
based on cross-correlation and the maximum-surface method is proposed. In [14] an
intensity-based method that takes into account discontinues and occlusions are
presented.


                                                                                                            270
    Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics                                     Fursov V.A., Goshin Y.V. 3D Scene Stereo…


   In our technology, rectified images are not generated. Matching points are searched
directly on the epipolar lines belonging to the same plane. Then, for each pair of
corresponding points, spatial coordinates of the scene are computed.


3. Points matching using weighting coefficients
   We assume that the fundamental matrix is calculated with use of cameras
parameters or estimated on the set of corresponding points (for example, with use of
the 8-th point algorithm [15]). We will designate the points coordinates on the first
image of stereo images as  u, v  , and the corresponding points coordinates on the
second stereo images as  u  u, v  v  , where u and v are relative shifts of
the coordinates respectively. Let I  u, v  and I   u  u, v  v  are values of the
brightness distribution functions of these images. We will use Euclidean norm as a
measure of proximity between values of brightness for point  u, v  and the
corresponding point  u  u, v  v  :
e u, v, u, v   I u, v   I  u  u, v  v 
   The problem of the determination of shifts values is formulated as a problem of the
following criterion minimization:
E  u0 , v0 , u, v            
                           u , v D u0 ,v0 
                                                  a  u, v  e  u, v, u, v 

where D u0 , v0  is the area around point  u0 , v0  , and a  u, v  is the weight function
defined as multiplication of three weighting coefficients:
a u, v   wc  wd  wf
where wc , wd are the coefficients which reduce projective distortions, and w f is the
coefficient providing the location of the point on the epipolar line. These coefficients
depend on point coordinates  u, v  in area D u0 , v0  .
   Coefficients wd , wc are determined similar to algorithm SimpleFlow [16] by the
following equations:
       
wd  exp  u0 , v0   u, v ,            2
                                                  u, v   D u , v  ,             0   0


w  exp I u , v   I  u, v   , u, v   D
                                                     2
  c                  0     0

      Coefficient wd increases the weight of the points values depending on the
proximity to the center of area D u0 , v0  . It reduces distortions influence, as the
distance from the center to the edges of area increases.
   Weighting coefficient wc performs the same function; however, in this case, the
values of the brightness distribution functions are used. This coefficient extends the
effective area of the fragments comparison in case of sufficiently different intensity
values in area D u0 , v0  .


                                                                                                                    271
      Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics               Fursov V.A., Goshin Y.V. 3D Scene Stereo…


   Along with the weight coefficients wd , wc , which were considered in paper [16],
we introduce weight coefficient w f , which characterizes the distance from a point to
an epipolar line. For each point u0 , v0  on the first image there is a corresponding
epipolar line l  : au  bv  c  0 on the second image:
            F11 F12 F13  u   a 
u, v    F21 F22 F23            
                                v    b 
            F F F  1  c 
            31 32          33        
a  u0 F11  u0 F12  F13 ,
b  u0 F21  u0 F22  F23 ,
c  u0 F31  u0 F32  F33 .
Coefficient w f is calculated as
wf  exp d   u, v , l 
where
                          au  bv  c
d   u , v , l  
                    a 2  b2
is a distance from point  u  u, v  v  on the second image to the epipolar line l  .
This coefficient is used as penalty function for the distance from the point to the
correspondent epipolar line.


4. Examples


4.1. Test scene reconstruction
   We compared the algorithm of the corresponding points determination using the
epipolar restrictions with similar SimpleFlow algorithm. For this experiment we use a
set of test stereo images "Tsukuba" (Figure 1) in the daylight and in the flashlight.
Figure 2 shows the results of the disparity map reconstruction using proposed method
and SimpleFlow. For quantitative assessment of quality we use the number of the
disparity map pixels, which are not occlusions and differ more than by 10\% from
their exact values. The number of such pixels is given in Table 1. This table contains
the results of processing by Simple Flow algorithm as well. The first number is the
absolute number of pixels. The relative number of pixels (to total number of pixels) is
given in brackets.


4.2. Digital terrain model reconstruction
   Stereo images obtained from a spacecraft and shown in Figures 2a and 2b were
used in the test. As parameters of the spacecraft at the moment of recording by an
optical device are unknown, the problem of the fundamental matrix determination
using the corresponding points of the stereo images shown in Figure 3 should be
solved. Let us refer to this stage as preliminary comparison of the stereo images.

                                                                                              272
   Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics                  Fursov V.A., Goshin Y.V. 3D Scene Stereo…


                    Fig. 1. – Scene stereo images: a) left view; b) right view

                          Table 1. Number of wrong correspondences

Images registration          Proposed method                            SimpleFlow
conditions
Daylight                     19 191 (~8,3%)                             21 187 (~9,1%)
Direct light                 15 645 (~6,7%)                             21 088 (~9,1%)


                  Fig. 2. – Disparity map: a) proposed method; b) SimpleFlow


                               Fig. 3. – Initial pair of stereoimages

    Figure 4 (a) shows the fragment of one (left) image of a stereo pair with white
lines illustrating the sizes and the directions of shifts between images (optical flow).
Let us note that the relative number of chaotically directed white lines is rather great.
When using a small number of the corresponding points these errors can lead to rough


                                                                                                 273
   Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics                Fursov V.A., Goshin Y.V. 3D Scene Stereo…


errors when calculating a fundamental matrix. Therefore we form a system of linear
equations and use conforming estimate method as it is described in [18], [19]. Thus,
the fundamental matrix is obtained:


 Fig. 4. – Images with the results of comparison of points: a) at a preliminary stage; b) taking
                               into account epipolar restrictions

      6.73 105 6.56 105 5.94 
                                         
 F   1, 22 104   6.25 106 0.65 
      5.98                           1 
                       0.62
   After the fundamental matrix is determined, it is possible to find the corresponding
points with using the epipolar restrictions. Figure 4 (b) gives a fragment of the same
image with the black lines showing sizes and directions of shifts errors under the
epipolar restrictions. It is easy to notice that the number of the chaotic black lines
characterizing errors in this case is significantly lower.


                         Fig. 5. – a) the disparity map, b) restored DTM


   In Figures 5 the disparity map (a) and the restored digital terrain model (b) of the
image segment, constructed with use of the corresponding points, are shown.


                                                                                               274
   Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics               Fursov V.A., Goshin Y.V. 3D Scene Stereo…


5. Conclusion
   When the fundamental matrix is not known precisely, high quality of the
reconstruction can be reached by the use of epipolar restrictions in the form of weight
coefficient in the penalty function. These coefficients correspond to the distance from
the point to the epipolar line. The results of the comparison with the most popular
SimpleFlow algorithm demonstrate the prospects of the proposed approach.


Acknowledgements
  Study was funded by RFBR according to the research project #13-07-12030 ofi_m,
13-07-97000-r_povolzhye_a.


References
 1. Hartley RI. Theory and Practice of Projective Rectification. International Journal of
    Computer Vision, 1999; 35: 115-127.
 2. Monasse P, Morel JM., Tang Z. Three-step image rectification. BMVC 2010-British
    Machine Vision Conference, 2010; 89.1-89.10.
 3. Pollefeys M. A simple and efficient rectification method for general motion. The
    Proceedings of the 7th IEEE International Conference on Computer Vision, 1999; 1: 496-
    501.
 4. Häming K, Peters G. Extension of the generalized image rectification. Catching the
    infinity cases. Proceedings 4th International Conference on Informatics in Control,
    Automation, and Robotics (ICINCO 2007), 2007; 2: 275-279.
 5. Oram D. Rectification for any epipolar geometry. British Machine Vision Conference,
    2001; 653-662.
 6. Kumar S, Micheloni C, Piciarelli C, Foresti GL. Stereo rectification of uncalibrated and
    heterogeneous images. Pattern Recognition Letters, 2010; 31(11): 1445-1452.
 7. Fleet D, Weiss Y. Optical flow estimation. Handbook of Mathematical Models in
    Computer Vision, 2006; 237-257.
 8. Barron JL, Fleet DJ, Beauchemin SS. Performance of optical flow techniques.
    International Journal of Computer Vision, 1994; 12(1): 43-77.
 9. Goshin YeV, Kotov AP, Fursov VA. Two-stage formation of a spatial transformation for
    image matching. Computer Optics, 2014; 38(4): 886-891. [in Russian]
10. Forsyth D, Ponce J. Computer Vision: A Modern Approach. Moscow: “Williams”
    Publisher, 2004. 928 p. [in Russian]
11. Hartley R, Zisserman A. Multiple view geometry in computer vision. Cambridge
    university press, 2003.
12. Scharstein D, Szeliski R. A taxonomy and evaluation of dense two-frame stereo
    correspondence algorithms. International journal of computer vision, 2002; 47(1-3): 7-42.
13. Mičušík B, Košecká J. Multi-view superpixel stereo in urban environments. International
    journal of computer vision, 2010; 89(1): 106-119.
14. Sun C. Fast stereo matching using rectangular subregioning and 3D maximum-surface
    techniques. International Journal of Computer Vision, 2002; 47(1-3): 99-117.
15. Luo A, Burkhardt H. An intensity-based cooperative bidirectional stereo matching with
    simultaneous detection of discontinuities and occlusions. International Journal of
    Computer Vision, 1995; 15(3): 171-188.
16. Hartley R. In defense of the eight-point algorithm. Pattern Analysis and Machine
    Intelligence, IEEE Transactions on, 1997; 19(6): 580-593.

                                                                                              275
   Information Technology and Nanotechnology (ITNT-2015)
Image Processing and Geoinformatics               Fursov V.A., Goshin Y.V. 3D Scene Stereo…


17. Tao M, Bai J, Kohli P, Paris S. SimpleFlow: A Non-iterative, Sublinear Optical Flow
    Algorithm. Computer Graphics Forum, 2012; 31(2): 345-353.
18. Fursov V, Goshin Ye. Conformed Identification of the Fundamental Matrix in the
    Problem of a Scene Reconstruction, using Stereo Images Image Mining. Theory and
    Applications. Proceedings of IMTA-4, 2013; 29-37.


                                                                                              276
   Information Technology and Nanotechnology (ITNT-2015)