Automatic search for vanishing points on mobile devices

E V Myasnikov1,2

1 Samara National Research University, Moskovskoe Shosse 34А, Samara, Russia, 443086
2 Image Processing Systems Institute of RAS - Branch of the FSRC "Crystallography and Photonics" RAS, Molodogvardejskaya street 151, Samara, Russia, 443001

e-mail: mevg@geosamara.ru

Abstract. A vanishing point is a point in the plane of a perspective image at which the projections of mutually parallel lines of three-dimensional space converge. The automatic search for vanishing points in images is a rather complicated task that currently cannot be solved on mobile devices without introducing additional restrictions. In this paper, a rather simple method for solving this problem is proposed, which operates under two main constraints. One of these constraints is the restriction to Manhattan scenes; the other is the fusion of information received from the sensors of the mobile device. Simulation on a public open data set was performed to evaluate the characteristics of the proposed solution.

1. Introduction
Vanishing points are among the most important objects in three-dimensional computer vision. A vanishing point is a point in the plane of a perspective image at which the projections of mutually parallel lines of three-dimensional space converge. The search for vanishing points can be one of the stages in estimating the orientation and position of a camera [1] or in reconstructing three-dimensional scenes [2], and it is used in many applications related to these tasks [3, 4]. Images with corrected geometric distortion are used to search for vanishing points.

As a rule, the automatic search for vanishing points is performed in several stages. At the first stage, line segments are detected in the image. At the second stage, the segments are divided into groups corresponding to different vanishing points. Finally, the positions of the corresponding points are calculated.

Unfortunately, the automatic search for vanishing points is quite a difficult task, the solution of which is currently not possible on mobile devices without introducing additional restrictions. In the present paper, a rather simple method for solving this problem is proposed, which operates under two main constraints. One of these constraints is the analysis of Manhattan scenes only. The other is the presence of an accelerometer on the device, which makes it possible to estimate the gravity vector.

The search for vanishing points in images of Manhattan scenes is performed in order to find three vanishing points: the vanishing point of the vertical lines of the facades, as well as those of the horizontal lines of the main and side facades. In this case, the vanishing points are selected in such a way that the vectors corresponding to them are mutually orthogonal. In general, the proposed method is based on the idea described in [5], according to which the search for the horizontal vanishing points can be performed along the horizon line, defined by a plane orthogonal to the direction to the vertical vanishing point. Simulation was performed using a well-known open data set [6-8] to evaluate the characteristics of the proposed solution.

The paper is organized as follows. Section 2 provides a description of the developed method of searching for vanishing points.
Section 3 describes the modeling methodology and the experimental studies, which were conducted using an open data set. The paper ends with a conclusion and a list of references.

2. Method
In general, the proposed method consists in the sequential search for three vanishing points, starting from the point corresponding to the vertical lines of the facades.

At the preparatory stage of the method, the image is pre-processed, including the elimination of geometric distortions. After that, a search for line fragments (segments) is performed on the image. For this purpose, the Hough transform can be used; however, in this work a rather simple approach based on the tracing of contours is used.

Let us assume that the camera is modeled as a regular pinhole camera (see figure 1), and the origin of the coordinate system is aligned with the camera. Let the OX and OY axes form a plane parallel to the image plane, and the OZ axis be orthogonal to it. In this case, the relationship between the coordinates of a point $(X, Y, Z)^T$ in the specified coordinate system and the point $(x, y)^T$ in the image plane is expressed in the form

$(x,\; y,\; 1)^T \sim K\,(X,\; Y,\; Z)^T,$

where K is the matrix of the internal parameters of the camera, which contains information on the focal length, pixel size, skew, and shift of the image center relative to the optical axis.

Figure 1. Camera model and coordinate system.

To find the first vanishing point, we use information from the gravity sensor of the mobile device. At this stage, we select the line segments that are in good agreement with the gravity vector. This selection is carried out by comparing the angle of deviation of each line $l_i$ from the gravity vector $g$ with a predetermined threshold value $t$:

$L_1 = \left\{ l_i : \arccos\!\left( \frac{(l_{i2} - l_{i1})^T g}{\left\| l_{i2} - l_{i1} \right\| \left\| g \right\|} \right) \le t \right\}.$

Here $l_{i1}$ and $l_{i2}$ are the coordinates of the ends of the corresponding line segment $l_i$. The selected set of lines $L_1$ determines the position of the first vanishing point $V_1$, corresponding to the vertical lines in the image. After finding the first vanishing point, we exclude all the lines corresponding to it from further consideration: $L_2 = L \setminus L_1$, where $L$ is the set of all detected segments.

At the second stage of the method, we determine the plane of the horizon line $\Gamma_h$ in three-dimensional space as a plane orthogonal to the refined gravity vector (and to the direction to the first vanishing point) and passing through the origin:

$V_{11} X + V_{12} Y + V_{13} Z = 0.$

Next, we define the horizon line $\gamma$ in the image plane as the projection of a line passing, for example, through the points with the following coordinates:

$P_1 = \left( 1,\; -\frac{V_{11} + V_{13}}{V_{12}},\; 1 \right)^T, \quad P_2 = \left( -1,\; \frac{V_{11} - V_{13}}{V_{12}},\; 1 \right)^T.$

To search for the second vanishing point $V_2$, we calculate the points $p_i$ of intersection of all the lines $l_i \in L_2$ selected in the image with the horizon line $\gamma$:

$p_i = (x_i, y_i), \quad p_i \in l_i, \; p_i \in \gamma.$

Then we build the histogram $h$ of the number of intersection points that fall into different ranges of angles in the polar coordinate system:

$h_n = \sum_i I\!\left( \frac{\pi (n-1)}{N} \le \varphi_i < \frac{\pi n}{N} \right), \quad n = 1 \ldots N.$

Here $\varphi_i$ is the angular coordinate of the intersection point $p_i$ in the plane $\Gamma_h$ (see figure 2), $\varphi_i = \arccos(\tilde{X}_i)$, $I(\cdot)$ is the indicator function, and the normalized coordinates $\tilde{P}_i$ are related to the coordinates in the image plane as follows:

$(\tilde{X}_i, \tilde{Y}_i, \tilde{Z}_i) = \frac{(X_i, Y_i, Z_i)}{\left\| (X_i, Y_i, Z_i) \right\|}, \quad (X_i,\; Y_i,\; Z_i)^T \sim K^{-1} (x_i,\; y_i,\; 1)^T.$
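To make these two stages concrete, the following Python sketch illustrates the selection of the near-vertical segments $L_1$ and the accumulation of the histogram $h_n$ for the remaining segments. It is a minimal illustration, not the implementation used in this work: the function names, the use of numpy, the comparison of the 2D segment direction with the in-plane components of $g$ (possible because OX and OY are parallel to the image plane), and the computation of the intersection points via homogeneous line coordinates are assumptions.

```python
# Minimal sketch of the segment selection and horizon-line histogram described
# above. Assumptions (not from the paper): segments are given as pairs of 2D
# image points, g is the gravity vector in camera coordinates, and V1 is the
# unit direction to the first (vertical) vanishing point.
import numpy as np


def select_vertical_segments(segments, g, t):
    """Split segments into L1 (consistent with gravity) and L2 (the rest)."""
    g2 = np.asarray(g[:2], dtype=float)      # in-plane components of g
    g2 /= np.linalg.norm(g2)
    L1, L2 = [], []
    for p1, p2 in segments:
        d = np.asarray(p2, dtype=float) - np.asarray(p1, dtype=float)
        d /= np.linalg.norm(d)
        # segments are undirected, so the sign of the direction is ignored
        angle = np.arccos(np.clip(abs(d @ g2), 0.0, 1.0))
        (L1 if angle <= t else L2).append((p1, p2))
    return L1, L2


def horizon_histogram(L2, K, V1, N):
    """Histogram h_n of horizon-line intersections over N angular bins in [0, pi]."""
    K_inv = np.linalg.inv(K)
    # image of the horizon plane V1 . (X, Y, Z) = 0: homogeneous line K^{-T} V1
    horizon = K_inv.T @ np.asarray(V1, dtype=float)
    h = np.zeros(N, dtype=int)
    for p1, p2 in L2:
        line = np.cross([p1[0], p1[1], 1.0], [p2[0], p2[1], 1.0])
        p = np.cross(line, horizon)          # intersection with the horizon line
        if abs(p[2]) < 1e-9:                 # supporting line parallel to the horizon
            continue
        ray = K_inv @ (p / p[2])             # (X_i, Y_i, Z_i)
        ray /= np.linalg.norm(ray)           # normalized coordinates
        phi = np.arccos(np.clip(ray[0], -1.0, 1.0))
        h[min(int(phi * N / np.pi), N - 1)] += 1
    return h


# Hypothetical usage with segments obtained from the contour-tracing step:
# L1, L2 = select_vertical_segments(segments, g, t=np.pi / 50)
# h = horizon_histogram(L2, K, V1=g / np.linalg.norm(g), N=50)
```

The peak of this histogram is then used to locate the second vanishing point, as described below.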
Figure 2. The plane of the horizon line.

An example result of the preparatory stage of the method and the found horizon line are shown in figure 3.

Figure 3. A result of the preparatory stage: the detected contours of the image are shown in white; the selected segments of straight lines are shown in blue; the found horizon line is shown in yellow.

We calculate the approximate position of the second vanishing point as the position of the histogram peak on the horizon line. We then refine the position of the second vanishing point using the set of line segments related to the peak of the histogram, and exclude the set of lines corresponding to the second vanishing point from further consideration.

The position of the third vanishing point can be found in the same way, as the position of a peak in the histogram with subsequent refinement. However, a faster way that also guarantees the orthogonality of the vectors corresponding to the vanishing points is to take the vector orthogonal to the vectors corresponding to the first and second vanishing points:

$V_3 = V_1 \times V_2.$

An example demonstrating the search for the second and third vanishing points is shown in figure 4.

Figure 4. An example of the search for vanishing points: the directions from the center to the true points are shown by thin dashed lines; the directions to the estimated vanishing points are shown by thin solid lines; the segments of lines selected for the second and third vanishing points are shown by bold red and blue lines, respectively.

3. Experiments
This section describes the results of an experimental study of the method described above. For the experiments, we used the public open data set "The York Urban Database" [6-8]. It contains the original images and information about the camera calibration parameters as well as the true positions of the vanishing points. The data set includes 102 images of urban areas, obtained mainly on the campus of York University and in the city of Toronto, Canada.

To estimate the optimal parameters of the developed method, we performed the simulation according to the following scheme (a sketch of the noise model and the error measure is given after the list):
- for each image of the above data set, we calculated the true gravity vector using the information about the true positions of the vanishing points;
- we added random vectors (noise) with specified characteristics (uniform distribution in the range [-r, r]) to the obtained gravity vector, and the resulting vector was used as the gravity vector obtained from the sensor of a mobile device;
- we estimated the positions of the vanishing points using the algorithm described above;
- for each vanishing point, we calculated the error as the norm of the vector difference between the estimated and true coordinates of the vanishing point.
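The following minimal Python sketch illustrates the noise model and the error measure of this scheme. It is illustrative only: the helper names are assumptions, and the error is assumed to be computed between unit direction vectors, which is consistent with the bounded deviations discussed below.

```python
# Minimal sketch of the noise model and error measure of the evaluation scheme
# above. Assumptions (not from the paper): the noise is uniform in [-r, r] per
# component, and the error is the norm of the difference of unit directions,
# so a reversed estimate gives an error close to 2.
import numpy as np


def perturb_gravity(g_true, r, rng):
    """Simulate a noisy sensor reading of the gravity vector."""
    return np.asarray(g_true, dtype=float) + rng.uniform(-r, r, size=3)


def vp_error(v_est, v_true):
    """Norm of the vector difference between the estimated and true directions."""
    v_est = np.asarray(v_est, dtype=float)
    v_true = np.asarray(v_true, dtype=float)
    return np.linalg.norm(v_est / np.linalg.norm(v_est)
                          - v_true / np.linalg.norm(v_true))


# Synthetic check of the two helpers; in the real experiment the noisy vector
# is fed to the vanishing-point estimator of section 2 for every image.
rng = np.random.default_rng(0)
g_true = np.array([0.0, -1.0, 0.0])            # true vertical direction
g_noisy = perturb_gravity(g_true, r=0.1, rng=rng)
print(vp_error(g_noisy, g_true))               # small deviation
print(vp_error(-g_true, g_true))               # reversed direction gives 2.0
```

Under this convention, an error value near 2 corresponds to the reversed-direction estimates discussed below.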
The results of the experiments are shown in figure 5. Each of the histograms in the figure shows the distribution of the norm of the vector difference between the estimated and true coordinates of a vanishing point. In the ideal case, the histogram has only one leftmost nonzero column, which means minimal deviations of the estimated vanishing points from the true values in all experiments. The histogram located in the left column of figure 5.a has a similar appearance.

Figure 5. The histograms of deviations of the vanishing points. The left column corresponds to the first vanishing point, the middle column to the second, and the right column to the third. The rows correspond to different ranges of the noise component: r = 0 (a), r = 0.1 (b), r = 0.2 (c, d). Figures (a)-(c) show the results for stricter selection conditions for line segments: t = π/50, N = 50. Figure (d) shows the results for softer conditions: t = π/25, N = 25.

The presence of a high rightmost column in the histograms in figures 5.b and 5.c (left) indicates an error in orientation: the direction to the estimated vanishing point is reversed relative to the direction to its true position. Such errors can occur because, for large random deviations of the gravity vector, the algorithm selects lines converging in the direction opposite to the true one.

As can be seen from the results of the experiments, the growth of the noise component causes the expected increase in errors, and this increase becomes unacceptable for rather significant random deviations (r = 0.2, see figure 5.c). Such growth can be partially compensated by relaxing the conditions for selecting line segments, as shown in figure 5.d. Here, milder filtering conditions were used: the allowed angular deviation was t = π/25 compared to the previously used t = π/50, and the number of histogram bins was N = 25 compared to the previously used N = 50. Thus, the accuracy of the algorithm in the case of noisy gravity vector readings can be improved by selecting appropriate parameters. Another way to improve the accuracy may consist in the use of previously obtained estimates when processing a video stream. This is a subject of future research.

4. Conclusion
The paper proposed and investigated a method for the automatic search for vanishing points. The method is based on the combination of information obtained from the sensors of mobile devices and images obtained from the camera in the visible range. The simulation was performed using "The York Urban Database" open data set, which allowed evaluating the performance of the method under random deviations in the values of the gravity vector. The method described in the paper is easy to implement and undemanding of computing resources, which allows its use on mobile devices. In the future, we plan to improve the described method for working with a video stream, which will allow tracking and refinement of the positions of the vanishing points with the aim of estimating the orientation and position of a camera.
5. References
[1] Caprile B and Torre V 1990 Using vanishing points for camera calibration International Journal of Computer Vision 4(2) 127-139
[2] Lee D C, Hebert M and Kanade T 2009 Geometric reasoning for single image structure recovery International Conference on Computer Vision and Pattern Recognition (CVPR) 2136-2143
[3] Xu H, Yang Z, Zhou Z, Shangguan L, Yi K and Liu Y 2016 Indoor localization via multi-modal sensing on smartphones International Joint Conference on Pervasive and Ubiquitous Computing, ACM 208-219
[4] Park S, Lee H, Lee S and Yang H S 2015 Line-based single view 3D reconstruction in Manhattan world for augmented reality International Conference on Virtual Reality Continuum and its Applications in Industry, ACM 89-92
[5] Angladon V, Gasparini S and Charvillat V 2015 The Toulouse vanishing points dataset Proceedings of the 6th ACM Multimedia Systems Conference (MMSys) (Portland, United States)
[6] Coughlan J M and Yuille A L 2003 Manhattan world: orientation and outlier detection by Bayesian inference Neural Computation 15(5) 1063-1088
[7] Denis P, Elder J H and Estrada F 2008 Efficient edge-based methods for estimating Manhattan frames in urban imagery Proc. European Conference on Computer Vision 5303 197-211
[8] Denis P 2008 Efficient Edge-Based Methods for Estimating Manhattan Frames in Urban Imagery M.Sc. Thesis (York University, Canada)

Acknowledgements
The work was partly funded by RFBR according to the research project 17-29-03190 in the parts «2. Method» and «3. Experiments», and by the Russian Federation Ministry of Science and Higher Education within a state contract with the "Crystallography and Photonics" Research Center of the RAS under agreement 007-ГЗ/Ч3363/26 in the part «1. Introduction».