Assessment of camera orientation in Manhattan scenes using information from optical and inertial sensors

Evgeny Myasnikov
Geoinformatics and Information Security department, Samara National Research University; Image Processing Systems Institute of RAS - Branch of the FSRC "Crystallography and Photonics" RAS, Samara, Russia
mevg@geosamara.ru

Abstract—In the present paper, the problem of assessing the orientation of a camera is solved under two main limitations. The first limitation is the analysis of Manhattan scenes only. The second one is the presence of an accelerometer in a mobile device. To assess the characteristics of the proposed solution, a data set was prepared containing both photos and accelerometer readings, as well as information about the true orientation of the device. Experimental studies were carried out using the prepared data set.

Keywords—camera orientation, vanishing point, Manhattan scenes, accelerometer, inertial sensor

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

I. INTRODUCTION

Assessing the camera orientation is one of the most important tasks in three-dimensional computer vision. Typically, camera orientation is estimated using calibration patterns, which requires human interaction. For this reason, automatic methods for assessing the orientation are of particular interest.

Despite the presence of various sensors in modern mobile devices, such as an accelerometer, compass, etc., their use for orientation estimation is limited by their low accuracy and susceptibility to noise [1]. For this reason, both optical information and information from the sensors of mobile devices are used to determine the orientation of the camera.

In this paper, we consider a method for assessing the orientation of a camera based on the analysis of the positions of vanishing points [2], i.e. the points in the plane of a perspective image at which the projections of mutually parallel lines of three-dimensional space converge. The problem is solved under two main limitations. The first limitation is the restriction of the class of analyzed scenes to Manhattan scenes [3], in which the lines are aligned along three mutually orthogonal directions. Vivid examples of such scenes are photographs of city buildings (the lines of building facades may possess these characteristics), road scenes (borders of the roadway, markings, poles), and indoor scenes (borders of rooms, furniture lines, decoration elements such as panels and tiles). The second limitation is the presence of an accelerometer in the mobile device.

The orientation of the camera in this paper is determined sequentially in several stages. At the first stage, using the inertial sensor readings, the direction to the first vanishing point, corresponding to the direction of gravity, is determined. After that, the position of the first vanishing point is refined along vertical lines in the optical image. At the second stage, the vanishing points of the horizontal lines of the main and side facades are determined. The found vanishing points, taking into account the data of the inertial sensor, determine the orientation of the camera. The method for determining vanishing points described in this paper is based on the idea described in [4], according to which the search for horizontal vanishing points can be performed along the horizon line defined by a plane orthogonal to the direction of the vertical vanishing point.

Unfortunately, common data sets for the evaluation of vanishing point estimation methods (see, for example, [5]) do not contain information from inertial sensors. For this reason, their use for evaluating methods similar to the one described in this paper is possible only in the mode of sensor emulation, as was done, for example, in [6]. Therefore, to evaluate the characteristics of the proposed solution, we prepared our own data set containing both photos and accelerometer readings, as well as information about the true camera orientation. Experimental studies were carried out using the prepared data set.

It should be noted that the initial implementation of the algorithm for determining vanishing points was previously described in [6]. Thus, in the present work, the previously proposed approach is further developed and studied using the data set prepared as part of the work.

The work is organized as follows. Section 2 describes the developed method for assessing camera orientation. Section 3 describes the modeling technique and presents the experimental studies. The work ends with a conclusion and a list of references.

II. METHOD

As mentioned in the introduction, the described method consists of sequentially determining three vanishing points, followed by finding the orientation of the camera. The general scheme of the method is presented in Fig. 1.

First, preliminary processing of the image received from the camera is performed. In particular, the image is scaled and rotated to within 90 degrees in accordance with the information received from the inertial sensor. If necessary, the vector received from the sensor is transformed so as to correspond to the direction of gravity for the rotated image.

After preliminary processing, contours are extracted from the image by one of the known methods, for example, the Canny method [7]. The extracted contours are traced, and segments of straight lines are searched for. The found segments form the set L, which is used subsequently to find the vanishing points.

Further, the information obtained from the inertial sensor is used for a preliminary assessment of the first vanishing point VP1.
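Under a pinhole camera model, this preliminary estimate amounts to projecting the gravity direction through the camera intrinsics. The following is a minimal sketch of that idea; the function name, the sign convention, and the assumption that the accelerometer axes coincide with the camera axes are ours, not the paper's:

```python
import numpy as np

def preliminary_vp1(K, gravity):
    """Project the (sign-normalized) gravity direction through the
    intrinsic matrix K to obtain a preliminary vertical vanishing point."""
    g = np.asarray(gravity, float)
    g = g / np.linalg.norm(g)
    if g[2] < 0:                      # resolve the sign ambiguity: keep the
        g = -g                        # component pointing away from the camera
    vp = K @ g                        # homogeneous image point of the direction g
    if abs(vp[2]) < 1e-9:             # gravity parallel to the image plane:
        return None                   # VP1 lies at infinity
    return vp[:2] / vp[2]
```

For a camera held level, gravity is nearly parallel to the image plane, so the preliminary VP1 lies far from the image center, which is the usual situation for photographs of building facades.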
It is assumed that the direction to the first vanishing point corresponds, up to a sign, to the gravity vector. In the direction obtained, a set L1 of segments is selected such that the lines corresponding to this set deviate from the direction to VP1 by no more than a predefined angle.

If there are enough selected segments, the first vanishing point is refined by a weighted summation of the points determined by all possible segments from L1; segments of greater length have greater weight. If the assessment of VP1 from L1 is not possible, the initial estimate of VP1 is used for further processing.

Fig. 1. General scheme of the method: pre-processing of the image and inertial sensor readings; extraction and tracing of contours, search for line segments L; preliminary assessment of the first vanishing point VP1 using information from the inertial sensor; formation of the set L1 of segments corresponding to VP1; refinement of the position of VP1 using L1 (if finding VP1 by L1 is possible); estimation of the horizon plane and horizon line Г in the image plane; search for the points pi as intersections of the extracted lines from L' = L \ L1 with the horizon line Г; search for the interval h on Г containing the maximum number of intersection points pi; formation of the set L2 of line segments corresponding to these intersection points; estimation of the second vanishing point VP2 from L2 (or, if that is not possible, from the intersection points pi ∈ h); calculation of the third vanishing point VP3; assessment of the camera orientation R.
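The selection of L1 and the length-weighted refinement can be sketched as follows. This is an illustrative fragment, not the author's implementation: the deviation threshold, the function names, and the use of pairwise segment-line intersections as the candidate points are assumptions:

```python
import numpy as np

def seg_line(p, q):
    """Homogeneous line through segment endpoints p, q (pixels)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def select_l1(segments, vp1, max_dev_deg=5.0):
    """Keep segments whose direction deviates from the direction
    towards the preliminary VP1 by less than max_dev_deg."""
    keep = []
    for p, q in segments:
        d = np.asarray(q, float) - np.asarray(p, float)          # segment direction
        m = (np.asarray(p, float) + np.asarray(q, float)) / 2.0  # segment midpoint
        to_vp = np.asarray(vp1, float) - m                       # direction to VP1
        c = abs(np.dot(d, to_vp)) / (np.linalg.norm(d) * np.linalg.norm(to_vp))
        if np.degrees(np.arccos(np.clip(c, -1.0, 1.0))) < max_dev_deg:
            keep.append((p, q))
    return keep

def refine_vp(segments):
    """Length-weighted average of all pairwise line intersections."""
    pts, w = [], []
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            li, lj = seg_line(*segments[i]), seg_line(*segments[j])
            x = np.cross(li, lj)                      # intersection (homogeneous)
            if abs(x[2]) < 1e-9:                      # near-parallel pair, skip
                continue
            wi = np.linalg.norm(np.subtract(*segments[i]))
            wj = np.linalg.norm(np.subtract(*segments[j]))
            pts.append(x[:2] / x[2])
            w.append(wi * wj)
    if not pts:
        return None                                   # fall back to the sensor estimate
    return np.average(pts, axis=0, weights=w)
```

Returning None when no valid intersections exist mirrors the branch in Fig. 1 where the initial sensor-based estimate of VP1 is kept.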
At the next stage, the direction to VP1 is used to determine the horizon plane as the plane passing through the origin of the modeled optical system and orthogonal to the direction to VP1 (the refined gravity vector). In addition to the plane, the horizon line Г is determined as the projection onto the image plane of a line that lies in the horizon plane and is not orthogonal to the image.

Further, for all the lines li ∈ L' extracted in the image, with the exception of the lines used earlier to find the vanishing point VP1 (L' = L \ L1), the intersection points with the horizon line Г are determined. A search is made for a segment h (of a predetermined angular size) on the horizon line into which the maximum number of intersection points pi falls. After this, we form the set L2 of line segments whose intersections pi with the horizon line Г fall into the interval h. If there are enough selected segments, the second vanishing point is estimated by a weighted summation of the intersection points determined by all possible segments from L2. If the estimation of VP2 from L2 is not possible, the weighted sum of the points of intersection of the corresponding lines with the horizon line Г is taken as the position of VP2. In both cases, segments of greater length have greater weight when determining VP2.
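The search for the densest angular interval on the horizon line can be sketched as below. The parametrization of the intersection points by their viewing angle about the principal direction, the window width, and the names are our assumptions for illustration:

```python
import numpy as np

def horizon_angles(xs, focal_px):
    """Viewing angles (degrees) of intersection points along the
    horizon line, measured from the principal direction."""
    return np.degrees(np.arctan(np.asarray(xs, float) / focal_px))

def densest_window(angles_deg, width_deg=4.0):
    """Indices of the largest subset of angles covered by one sliding
    window of the given angular width."""
    order = np.argsort(angles_deg)
    a = np.asarray(angles_deg, float)[order]
    best, lo = [], 0
    for hi in range(len(a)):
        while a[hi] - a[lo] > width_deg:
            lo += 1
        if hi - lo + 1 > len(best):
            best = [int(k) for k in order[lo:hi + 1]]
    return best
```

The segments whose indices are returned would form the set L2, and VP2 would then be a length-weighted average of their intersection points with Г, in the same way as the refinement of VP1.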
After determining two vanishing points, the third is found as a vector orthogonal to the vectors corresponding to the first and second points: V3 = V1 × V2.

After finding the vanishing points, the camera orientation can be found as R = [r1 r2 r3], where R is the rotation matrix and the vectors r1, r2, r3 are calculated as r1 = mK-1VP1, r2 = mK-1VP2, r3 = r1 × r2. Here m is a scale factor and K is the matrix of internal parameters of the camera [8], containing information on the focal length, pixel size, skew, and the shift of the image center relative to the optical axis.

In general, the proposed method is a development of the previously described method [6], the main idea of which [4] is to search for horizontal vanishing points along the horizon line defined by a plane orthogonal to the direction of the vertical vanishing point. Compared with the previous implementation, both the individual steps of the method were changed (the search for segments on the contours is now carried out according to the criterion of maximum deviation; the weighted summation takes into account the lengths of segments of lines that are separated from each other by a sufficient distance; the second vanishing point is refined without using histograms) and the general scheme of the method (it now contains branches that increase the reliability of determining the vanishing points, as well as the actual orientation estimation stage).

An example demonstrating the various stages of the proposed method is shown in Fig. 2.

Fig. 2. An example of the method: a) extracted contours (white) and the set of line segments corresponding to the first vanishing point (blue); b) the horizon line (red) and the set of lines defining the second vanishing point (red); c) directions to the true (dashed lines) and estimated (solid lines) vanishing points.

III. EXPERIMENTS

To study the method described above, we used our own specially prepared data set. This set was collected using the Huawei Honor 9 Lite smartphone [9]. Its camera has a CMOS BSI sensor with an f/2.2 aperture and a focal length of 3.46 mm, and produces a color image of 12.98 MP. To collect the images and inertial sensor data, we developed an Android application that stores both the captured images and a specified number of accelerometer readings recorded prior to the shot.

To obtain information about the true position of the camera, several (from 3 to 7 for each vanishing point) lines were manually selected that reliably determine the directions to the true vanishing points. This procedure was performed at 2x magnification, and the normalized vanishing points obtained using the selected lines were considered the true vanishing points. At the moment, the described data set consists of 40 images of buildings with the corresponding inertial sensor data and true orientation data.

To assess the quality of the developed method, modeling was performed according to the following scheme:
- for each image from the prepared data set, three vanishing points and the camera orientation relative to the building depicted in the photograph were determined;
- using the information about the true position, for each vanishing point the error was calculated as the angular deviation of the direction to the estimated vanishing point from the true direction;
- based on the data obtained for each vanishing point, a histogram of the angular deviations of the found points from their true values was constructed, and the average value of the deviation was also calculated.

The experimental results are shown in Fig. 3. Each histogram in the figure shows the angular deviation of the estimated vanishing point from its true position. In the ideal case, such a histogram should have a single column on the left (the first one), which means the minimum deviation of the vanishing point from the true values for all test images. As can be seen from the figures, in most cases the positions of the three vanishing points were estimated with a deviation of up to 2º, while a deviation exceeding 4º was observed for only 3 of the 40 images.
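The assembly of the orientation matrix from two vanishing points and the angular error used in this evaluation can be sketched as follows. This is an illustrative fragment under the pinhole model: the explicit re-orthogonalization of r2 is our addition (the paper does not describe one), and the names are assumptions:

```python
import numpy as np

def rotation_from_vps(K, vp1, vp2):
    """R = [r1 r2 r3]: back-project the two vanishing points through
    K^-1, normalize (absorbing the scale factor m), and complete the
    basis with a cross product."""
    Kinv = np.linalg.inv(K)
    r1 = Kinv @ np.array([vp1[0], vp1[1], 1.0])
    r2 = Kinv @ np.array([vp2[0], vp2[1], 1.0])
    r1 /= np.linalg.norm(r1)
    r2 -= r1 * np.dot(r1, r2)          # enforce orthogonality to r1
    r2 /= np.linalg.norm(r2)
    r3 = np.cross(r1, r2)
    return np.column_stack([r1, r2, r3])

def angular_error_deg(v_est, v_true):
    """Angle between two direction vectors, insensitive to sign."""
    c = abs(np.dot(v_est, v_true)) / (np.linalg.norm(v_est) * np.linalg.norm(v_true))
    return np.degrees(np.arccos(np.clip(c, 0.0, 1.0)))
```

The sign-insensitive form of the error matches the fact that a vanishing direction is defined only up to a sign.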
The average error values were 1.69º, 1.54º, and 1.88º for the first, second, and third points, respectively.

It should be noted that using only the information from the inertial sensor (see the histogram in Fig. 3(a)) leads to a greater level of errors in determining the direction to the first vanishing point: the average error was 3.7º when using only the inertial sensor versus 1.69º when the estimate was refined with the optical image. Thus, the accuracy of the algorithm can be improved under noisy readings of the gravity vector by tuning the parameters. Another way to increase the accuracy may be to use previously obtained estimates when processing a video stream, which is the subject of future research.

Fig. 3. Estimation of the method quality. Histograms of the angular deviations of the directions to the vanishing points from their true values (in degrees): a) the first vanishing point, estimated by the inertial sensor readings; b) the first vanishing point, refined by the optical image; c) the second vanishing point; d) the third vanishing point.

IV. CONCLUSION

A method for the automatic assessment of the orientation of a camera in Manhattan scenes using information from optical and inertial sensors is proposed and investigated. To study the developed technique, a data set was created containing digital images of buildings, readings of inertial sensors, and information about the true positions of the vanishing points obtained by careful manual marking of the source images. The described method is simple to implement and undemanding of computing resources. Its use reduces the average error in determining the orientation by more than 2 times compared with the inertial sensor alone. As a direction for further work, it is planned to extend the method to assessing the orientation and position of the camera when working with a video stream.

ACKNOWLEDGMENT

The work was partly funded by RFBR according to the research project 17-29-03190-ofi-m in parts «2. Method» - «3. Experiments», and by the Russian Federation Ministry of Science and Higher Education within a state contract with the «Crystallography and Photonics» Research Center of the RAS in parts «1. Introduction» and «4. Conclusion».

REFERENCES

[1] V.V. Myasnikov and E.A. Dmitriev, "The accuracy dependency investigation of simultaneous localization and mapping on the errors from mobile device sensors," Computer Optics, vol. 43, no. 3, pp. 492-503, 2019. DOI: 10.18287/2412-6179-2019-43-3-492-503.
[2] B. Caprile and V. Torre, "Using vanishing points for camera calibration," International Journal of Computer Vision, vol. 4, no. 2, pp. 127-139, 1990.
[3] J.M. Coughlan and A.L. Yuille, "Manhattan World: compass direction from a single image by Bayesian inference," Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 941-947, 1999.
[4] V. Angladon, S. Gasparini and V. Charvillat, "The Toulouse vanishing points dataset," Proceedings of the 6th ACM Multimedia Systems Conference (MMSys '15), Portland, United States, 2015.
[5] P. Denis, J.H. Elder and F. Estrada, "Efficient Edge-Based Methods for Estimating Manhattan Frames in Urban Imagery," Proc. European Conference on Computer Vision, vol. 5303, pp. 197-211, 2008.
[6] E. Myasnikov, "Automatic search for vanishing points on mobile devices," CEUR Workshop Proceedings, vol. 2391, pp. 216-221, 2019.
[7] J. Canny, "A Computational Approach to Edge Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679-698, 1986.
[8] R. Hartley and A. Zisserman, "Multiple View Geometry in Computer Vision," Cambridge: Cambridge University Press, 2004.
[9] Huawei.com, "HONOR 9 Lite," 2020. [Online]. URL: https://consumer.huawei.com/ru/support/phones/honor-9-lite.
Image Processing and Earth Remote Sensing
VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020)