Investigation of algorithms for generating surfaces of 3D models based on an unstructured point cloud

E.S. Glumova, A.D. Filinskikh
glumova.ek@yandex.ru | alexfil@yandex.ru
Nizhny Novgorod State Technical University n.a. R.E. Alekseev

Methods of creating a 3D object model on the basis of an unstructured (sparse) point cloud are considered in the paper. The issues of combining point cloud compaction methods with subsequent surface generation are described. A comparative analysis of surface generation algorithms is carried out in order to reveal the more effective method when depth maps obtained from the sparse point cloud are used as input data. The comparison is made by qualitative, quantitative and temporal criteria, and the optimal method of creating a 3D object model from an unstructured (sparse) point cloud and depth map data is chosen. A mathematical description of the point cloud compaction method based on stereo matching is provided, using a two-level view selection algorithm and depth map extraction by Multi-View Stereo for Community Photo Collections applied to the source image set. The method is implemented in practice with the open-source software Regard3D.

Key words: 3D model, photogrammetry, surface generation, point cloud, depth maps.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

Today, the development of computing and surface reconstruction technologies makes it possible to recreate 3D models of objects with high accuracy and quality. One such technology is laser scanning. Laser scanners can capture geometry with high accuracy, but the devices that achieve accuracy of hundredths of a millimeter cost tens and hundreds of millions of rubles. Another type of non-contact scanning of objects is photogrammetry. Here the cost of the equipment for obtaining geometric data about an object is hundreds of times lower than that of laser-based equipment, and the main burden of obtaining high-quality models falls on the software.

3D object models are widely used in parametric architecture [1], in the computer video game and animation industry [2], in the development of scenes for VR applications, as well as in mobile development [3]. The quality of the model plays an important role in all of these areas, so it is important to distribute the computational resources of the software correctly.

There is quite a lot of software on the market for processing images and obtaining 3D models from series of images, both paid packages costing about one hundred thousand rubles and free open-source tools. In both cases, different algorithms are used at all stages, from photo processing to obtaining the 3D model.

2. SfM Principles

One of the photogrammetry methods is building a 3D structure from a set of images - Structure from Motion (SfM). A feature of the method is the automatic determination of camera parameters [4]: it recovers both the extrinsic calibration (the orientation and position of the camera) and the intrinsic calibration (focal length, radial distortion of the lens).

The first step of an SfM implementation is to detect and match point features in the input images. Special points (the term varies between sources) are, informally, "well detectable" fragments of an image: points (pixels) with a characteristic (distinctive) neighborhood, i.e. different from all neighboring points. Examples of local features are corner points, isolated point features, contours, etc. The keypoints are described by descriptors - feature vectors computed from the intensity/gradients or other characteristics of the neighborhood points.

The most popular feature descriptors used in modern image processing systems are given in [5]. A-KAZE (nonlinear diffusion filtering for detecting and describing 2D features) is used here to solve the keypoint detection problem.

Then the camera positions are estimated and a low-density (sparse) point cloud is built. Keypoints in multiple images are matched using approximate nearest neighbor search and linked into 'tracks' connecting specific keypoints across a set of pictures. Tracks comprising a minimum of two keypoints and three images are used for point-cloud reconstruction, with those which fail to meet these criteria being automatically discarded [6]. After that, triangulation is used to estimate the three-dimensional positions of the points and to gradually reconstruct the scene geometry in a relative coordinate system.
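For illustration, this detection and matching step can be sketched in a few lines with OpenCV's A-KAZE implementation; the image file names and the ratio-test threshold below are illustrative assumptions, not settings taken from Regard3D.

```python
# Minimal sketch of A-KAZE keypoint detection and matching (OpenCV).
# File names and the ratio-test threshold are illustrative assumptions.
import cv2

img1 = cv2.imread("view_01.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_02.jpg", cv2.IMREAD_GRAYSCALE)

akaze = cv2.AKAZE_create()                       # nonlinear-diffusion detector/descriptor
kp1, desc1 = akaze.detectAndCompute(img1, None)  # keypoints + binary descriptors
kp2, desc2 = akaze.detectAndCompute(img2, None)

# Approximate the nearest-neighbour matching step with a brute-force Hamming
# matcher and Lowe's ratio test to discard ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
knn = matcher.knnMatch(desc1, desc2, k=2)
good = [p[0] for p in knn if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

print(f"{len(kp1)} / {len(kp2)} keypoints, {len(good)} tentative matches")
```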
An enhanced-density point cloud can then be derived by applying the Multi-View Stereo (MVS) algorithm [7], based on depth maps; the Clustering Views for Multi-View Stereo algorithm (CMVS) [8]; the Patch-based MVS algorithm (PMVS2) [9]; or the Shading-Aware Multi-View Stereo algorithm (SMVS) [10], which combines stereo and shape-from-shading energies in a single optimization scheme. The camera positions obtained together with the sparse point cloud are used here as input data, and the result of this additional processing is a significant increase in point density.

The color and texture information is then transferred to the point cloud, after which the final 3D model is rendered. The simplified process of obtaining a 3D model from a set of images is shown in Fig. 1.

The stage of surface generation from the unstructured point cloud produced by the 3D reconstruction method MVS (Multi-View Stereo) is considered separately [7].

MVS is based on reconstructing a depth map for each view (image). Despite the large redundancy of the output data, the method has proven to be well suited for recovering the detailed geometry of sufficiently large scenes. Another advantage of depth maps as an intermediate representation is that the geometry is parameterized in its natural domain, and per-view data (such as color) is directly available from the images. The excessive redundancy in the depth maps can cause problems, not so much in terms of storage as in terms of computational power [11].

MVS includes three stages:
˗ SfM, which reconstructs the parameters of the cameras;
˗ MVS proper, which establishes the dense geometry;
˗ surface generation (meshing), which merges the MVS geometry into a globally consistent, colored mesh.

Fig. 1. Simplified process of obtaining a 3D model based on a set of images

3. Point Cloud Compaction

In accordance with Fig. 1, once the camera parameters are known, dense geometry reconstruction is performed by Multi-View Stereo for Community Photo Collections (MVSCPC) [12], which reconstructs a depth map for each image. A depth map is a two-dimensional one-channel image containing information about the distance from the sensor plane to the scene objects [13].
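Since the depth map is the central intermediate representation in what follows, a minimal sketch of how such a one-channel depth image maps back to 3D points under a pinhole camera model may be useful; the intrinsic parameters and the synthetic depth values below are placeholders, not data produced by the SfM stage of this paper.

```python
# Sketch: back-project a one-channel depth map to 3D points (pinhole model).
# Intrinsics below are placeholders; real values come from the SfM stage.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Convert an HxW depth map (distance along the optical axis) into an
    (N, 3) array of points in the camera coordinate frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[z.reshape(-1) > 0]          # drop pixels with no depth estimate

depth = np.random.uniform(0.5, 2.0, (480, 640))   # synthetic stand-in for a real map
cloud = depth_to_points(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
print(cloud.shape)
```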
The method is based on the idea of selecting images from the collection so that they match both at the per-view and the per-pixel level. An appropriate choice of views ensures reliable matches even with strong differences between images. The stereo matching algorithm takes as input the sparse 3D points reconstructed by SfM and iteratively grows surfaces from these points. Optimizing for surface normals with a photoconsistency measure significantly improves the matching results. The quality of the depth map is also assessed.

Stereo matching is performed at each pixel by optimizing for both depth and normal, starting from an initial estimate provided by the sparse point cloud. During stereo optimization, poorly matching views can be discarded and new ones added according to the local view selection criteria. Pixels can be revisited and their depths updated if a more accurate match is found [11].

MVSCPC computes a depth map for each input image - each image serves as a reference view exactly once - using a two-level view selection algorithm. At the image level, global view selection determines for each reference view a set of good neighbor images to use for stereo matching; at the pixel level, local view selection chooses the views actually used for matching each pixel.

Global view selection. For each reference view R, global view selection seeks a set N of neighboring views that are good candidates for stereo matching in terms of scene content, appearance, and scale. In addition, the neighboring views should provide sufficient parallax (a change in the apparent position of an object relative to a distant background, depending on the position of the observer) with respect to R and to each other in order to enable a stable match. A scoring function is designed to measure the quality of each candidate neighboring view based on these desiderata: since the matches and the sparse point cloud extracted in the SfM phase are not sufficient indicators for accurate surface reconstruction (they are extracted based only on the similarity of the scene content), an additional assessment of the reliability of image matches is required.

A global score g_R for each view V within a candidate neighborhood N (which includes R) is computed as a weighted sum over the features shared with R:

    g_R(V) = \sum_{f \in F_V \cap F_R} w_N(f) \cdot w_s(f),    (1)

where F_X is the set of feature points observed in view X, and the weight functions are described below.

To encourage a good range of parallax within a neighborhood, the weight function w_N(f) is defined as a product over all pairs of views in N:

    w_N(f) = \prod_{V_i, V_j \in N,\; i \ne j,\; f \in F_{V_i} \cap F_{V_j}} w_\alpha(f, V_i, V_j),    (2)

where w_\alpha(f, V_i, V_j) = \min\left((\alpha / \alpha_{max})^2, 1\right) and α is the angle between the lines of sight from V_i and V_j to f. The function w_α downweights triangulation angles below α_max, which is usually set to 10 degrees. The quadratic form of the weight serves to counteract the tendency of views to share more features as the angle between them decreases.

The weighting function w_s(f) measures the similarity in resolution of images R and V at feature f. The diameter s_V(f) of a sphere centered at f, whose projected diameter in V equals the pixel spacing in V, is computed to estimate the 3D sampling rate of V in the vicinity of the feature f. Similarly, s_R(f) is calculated for R, and the scale weight w_s(f) is defined from the ratio r = s_R(f) / s_V(f) as

    w_s(f) = \begin{cases} 2/r, & 2 \le r \\ 1, & 1 \le r < 2 \\ r, & r < 1 \end{cases}    (3)

This weight function prefers views with a resolution equal to or higher than that of the reference view.

Having defined the global score of a view V given a neighborhood N, one can find the best N of a given size (usually |N| = 10) by maximizing the sum of view scores \sum_{V \in N} g_R(V). For efficiency, a greedy algorithm [14] is used: the neighborhood is grown incrementally by iteratively adding to N the highest-scoring view given the current N (which initially contains only R).
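A compact sketch of this global scoring and greedy neighborhood growth is given below. It assumes the caller supplies feats[V] (the set of feature ids seen in view V), angle(f, Vi, Vj) (the triangulation angle in radians) and s(V, f) (the 3D sampling rate); these helper names are assumptions for illustration, not part of the MVSCPC implementation.

```python
# Sketch of the global view selection score (Eqs. 1-3) with greedy growth of N.
import math

ALPHA_MAX = math.radians(10.0)

def w_alpha(angle_rad):
    return min((angle_rad / ALPHA_MAX) ** 2, 1.0)

def w_scale(r):
    # Eq. (3): prefer views with equal or higher resolution than the reference.
    if r >= 2.0:
        return 2.0 / r
    if r >= 1.0:
        return 1.0
    return r

def global_score(R, V, N, feats, angle, s):
    score = 0.0
    for f in feats[V] & feats[R]:                       # features shared with R
        w_n = 1.0
        views = [x for x in N if f in feats[x]]
        for i, Vi in enumerate(views):                  # product over view pairs (Eq. 2)
            for Vj in views[i + 1:]:
                w_n *= w_alpha(angle(f, Vi, Vj))
        score += w_n * w_scale(s(R, f) / s(V, f))       # Eq. (1)
    return score

def select_neighborhood(R, candidates, feats, angle, s, size=10):
    # Greedy growth of N, which initially contains only R.
    N = {R}
    while len(N) - 1 < size and candidates - N:
        best = max(candidates - N,
                   key=lambda V: global_score(R, V, N | {V}, feats, angle, s))
        N.add(best)
    return N - {R}
```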
Rescaling views. Although the global view selection algorithm tries to select neighboring views with compatible scale, some inconsistencies in scale are unavoidable due to differences in resolution within the collection of photos, and they may negatively affect stereo matching. There are ways to adapt the scale of all views, either by filtering to a common, narrow range or by a global, per-pixel adjustment. The first method is used in this research to avoid resizing the matching window in different areas of the depth map. This approach finds the lowest-resolution view V_min ∈ N relative to R, resamples R to approximately match that lower resolution, and then resamples higher-resolution views to match R.

In particular, the resolution scale of a view V relative to R is estimated from their common features as

    scale_R(V) = \frac{1}{|F_V \cap F_R|} \sum_{f \in F_V \cap F_R} \frac{s_R(f)}{s_V(f)}.    (4)

Then V_min simply equals \arg\min_{V \in N} scale_R(V). If scale_R(V_min) is less than the threshold value t (t = 1, which roughly corresponds to matching the 5×5 reference window against a 3×3 window in the neighboring view with the lowest relative scale), the reference view is rescaled so that after rescaling scale_R(V_min) = t. Then all neighboring views with scale_R(V) > 2 are resampled to match the scale of the reference view (which itself may have been changed in the previous step). It is important that all modified versions of the images are discarded when moving on to the depth map computation for the next reference view.
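Under the same assumed helpers as in the previous sketch, the rescaling test of Eq. (4) can be written as follows; the returned factor is only one possible way to express the decision, not the exact formulation used in [12].

```python
# Sketch of the view rescaling test (Eq. 4); feats and s are the assumed helpers above.
def scale_R(R, V, feats, s):
    shared = feats[V] & feats[R]
    return sum(s(R, f) / s(V, f) for f in shared) / max(len(shared), 1)

def rescaling_factor(R, N, feats, s, t=1.0):
    # Find the lowest-resolution neighbour and decide whether the reference view
    # has to be downsampled before stereo matching.
    v_min = min(N, key=lambda V: scale_R(R, V, feats, s))
    sc = scale_R(R, v_min, feats, s)
    return (sc / t) if sc < t else 1.0      # factor (<1 means downsample R)
```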
Local view selection. Global view selection determines a set N of well-suited candidates for a reference view and matches their scale. Instead of using all of these views for stereo matching at a specific location in the reference view, a smaller set A ⊂ N of active views is selected (usually |A| = 4). Using this subset naturally speeds up the computation of the depth map.

During stereo matching, A is iteratively updated using a set of local view selection criteria designed to select views that, given the current depth and normal estimates of a pixel, are photometrically consistent and provide a sufficiently wide range of observation directions. To measure photometric consistency, the mean-removed normalized cross correlation (NCC) between pixels within a window about the given pixel in R and the corresponding window in V is used. If the NCC score is above a fixed threshold, then V is a candidate for addition to A.

The angular distribution could be measured by looking at gaps in the directions from which the given scene point (based on the current depth estimate for the reference pixel) is observed. In practice, the angular spread of the epipolar lines [15] is considered instead, obtained by projecting each viewing ray passing through the reference point into the reference view. When deciding whether to add a view V to the active set A, the local score is calculated as

    l_R(V) = g_R(V) \cdot \prod_{V' \in A} w_e(V, V'),    (5)

where w_e(V, V') = \min(\gamma / \gamma_{max}, 1) and γ is the acute angle between the corresponding pair of epipolar lines in the reference view as described above; γ_max = 10 degrees is accepted.

The local view selection algorithm is then performed in the following way. Starting from the initial depth of the pixel, the view V with the highest l_R(V) value is found. If this view has a sufficiently high NCC score (threshold 5 is used), it is added to A; otherwise, the view is rejected. The process is repeated until either the set A reaches the desired size or no undecided views remain. During stereo matching, the depth (and normal) are optimized, and a view may be removed from A (and marked as rejected), after which a replacement view is added. The algorithm is guaranteed to complete, since rejected views are never reconsidered.
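The per-pixel selection loop can be sketched as follows; the ncc and epipolar_angle helpers, the g_R score table and the numeric acceptance threshold are assumptions for illustration only.

```python
# Sketch of local view selection (Eq. 5). ncc(R, V, pixel) and
# epipolar_angle(V, Vp, pixel) are assumed helpers returning the mean-removed
# NCC score and the acute angle between epipolar lines in the reference view.
import math

GAMMA_MAX = math.radians(10.0)
NCC_MIN = 0.5          # placeholder acceptance threshold, not the paper's value

def w_e(gamma):
    return min(gamma / GAMMA_MAX, 1.0)

def local_score(R, V, A, g_R, epipolar_angle, pixel):
    score = g_R[V]                          # global score of V with respect to R
    for Vp in A:
        score *= w_e(epipolar_angle(V, Vp, pixel))
    return score

def select_active_views(R, N, g_R, ncc, epipolar_angle, pixel, size=4):
    A, undecided = set(), set(N)
    while len(A) < size and undecided:
        V = max(undecided, key=lambda V: local_score(R, V, A, g_R, epipolar_angle, pixel))
        undecided.discard(V)
        if ncc(R, V, pixel) > NCC_MIN:      # photometric consistency check
            A.add(V)                        # rejected views are never reconsidered
    return A
```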
4. Surface Generation

After computing, for each image, the arrays containing the best matching candidates, one can move on to the surface generation step. Merging the individual depth maps into a single polygonal surface is a labor-intensive task. The depth maps inherit the multi-scale properties of the original images, which leads to vastly different sampling rates of the reconstructed surfaces.

Many approaches to depth map fusion have been proposed [16-20]. Among them, FSSR (Floating Scale Surface Reconstruction) [18] and SPSR (Screened Poisson Surface Reconstruction) [19] were considered as surface generation methods, as they provide high detail of the reconstructed 3D model.

FSSR is widely used for outdoor scene reconstruction, where data is often too sparse for a reliable reconstruction. In this case the method does not hallucinate geometry in incomplete regions, which would otherwise require manual intervention, but leaves holes in these areas (i.e. these areas contain gaps). The approach draws on a simple yet efficient mathematical formulation that constructs an implicit function as the sum of compactly supported basis functions. The implicit function has a spatially continuous "floating" scale and can be evaluated directly without any preprocessing. The final surface is extracted as the zero-level set of the implicit function. One of the key properties of the approach is that it is virtually parameter-free even for complex, mixed-scale datasets [18].

The FSSR method combines all depth maps into one large point cloud. At this stage a scale value is attached to each point, indicating the actual size of the surface area from which the point was measured; this value is derived from the size of the regions identified in the MVS phase. FSSR then computes a multi-scale 3D surface from this cloud.

SPSR is an improvement of the approach that treats surface reconstruction as a spatial Poisson problem [20]. The approach explicitly incorporates the points as interpolation constraints. Unlike other image processing and geometry processing methods, the screening term is defined over a sparse set of points rather than over the whole domain. These sparse constraints can nevertheless be integrated efficiently: since the modified linear system retains the same finite-element discretization, its sparse structure is unchanged and the system can still be solved with a multigrid approach. In addition, screened Poisson surface reconstruction introduces several algorithmic improvements that together reduce the time complexity of the solver to linear in the number of points, thus enabling faster and higher-quality surface reconstruction [19].
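To make the idea of a floating-scale implicit function concrete, the toy sketch below sums signed, compactly supported contributions with per-point support radii and extracts the zero-level set with marching cubes. It replaces FSSR's actual basis functions with a much simpler radial weight, so it illustrates only the principle, not the published algorithm, and is meant for small toy inputs (a few hundred oriented samples).

```python
# Toy sketch of a "floating scale" implicit function: each oriented sample
# (position, normal, per-point scale) contributes a signed, compactly
# supported term; the surface is the zero-level set of the weighted sum.
import numpy as np
from skimage.measure import marching_cubes   # scikit-image is assumed available

def implicit(samples, normals, scales, queries):
    """Signed implicit value at query points (M, 3); NaN where nothing has support."""
    d = queries[:, None, :] - samples[None, :, :]          # (M, N, 3) offsets
    dist = np.linalg.norm(d, axis=-1)                      # (M, N)
    support = 3.0 * scales[None, :]                        # support radius grows with point scale
    w = np.clip(1.0 - dist / support, 0.0, None) ** 2      # compactly supported weights
    signed = np.einsum("mnk,nk->mn", d, normals)           # signed offset along each normal
    wsum = w.sum(axis=1)
    vals = (w * signed).sum(axis=1) / np.maximum(wsum, 1e-9)
    return np.where(wsum > 0, vals, np.nan)

def reconstruct(samples, normals, scales, res=32, pad=0.1):
    lo, hi = samples.min(0) - pad, samples.max(0) + pad
    axes = [np.linspace(lo[i], hi[i], res) for i in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)
    vol = implicit(samples, normals, scales, grid).reshape(res, res, res)
    mask = ~np.isnan(vol)                                  # unsupported regions stay holes
    verts, faces, _, _ = marching_cubes(np.nan_to_num(vol), level=0.0, mask=mask)
    return verts, faces
```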
5. Algorithm Comparison

Let us consider the combinations MVS-FSSR and MVS-SPSR in more detail. The implementation is studied on the example of 21 photos of a statuette (Fig. 2), using the freely distributed software Regard3D and MeshLab.

Fig. 2. Set of original photos

Fig. 3 shows the keypoints detected by Regard3D; this image contains 14,486 keypoints. The image features require invariance to image scaling, rotation, noise and changes in illumination. These keypoints are then matched to establish sparse correspondences between images (Fig. 4).

Fig. 3. Object keypoints

Fig. 4. The result of keypoint matching for a pair of original images

The results of the pairwise matching are then combined and unfolded into multiple views, creating feature tracks. The next step in the SfM implementation is the incremental triangulation algorithm. It estimates the relative pose of a well-matched initial pair of images, and then all tracks visible in both images are triangulated. The next matching images are incrementally added to the reconstruction until all reconstructable views become part of the scene. The lens distortion parameters are estimated during the reconstruction; the performance of the subsequent algorithms is significantly improved by removing distortions from the original images. In Regard3D's "ideology", this method is called New Incremental. The result is a sparse point cloud (Fig. 5).

Fig. 5. The result of triangulated point cloud computing using the New Incremental method

21 cameras (according to the number of uploaded images) have been calibrated by the program, i.e. the 3D positions and parameters of all images have been found. A cloud of 13,583 points was obtained, covering not only the model but also some part of the environment. The computation time of the sparse point cloud was slightly less than 30 s.

We then proceed to the compaction of the sparse point cloud by the MVS method. The result is shown in Fig. 6. The computation time was 40.42 minutes; 4,756,185 points were created. As can be seen, the point cloud has holes on the side of the figure (Fig. 6). The corresponding depth map was obtained using the MeshLab program (Fig. 7).

Fig. 6. Dense point cloud using the MVS method

Fig. 7. Depth map

Surface generation by FSSR and SPSR. Fig. 8 presents the results of the calculations in the Regard3D command line, illustrating the iterative algorithm of finding the best candidates for matching described above. In total, 21 reports were produced - according to the number of uploaded images. Each report lists the views recommended for matching with the given reference view, as well as the number of optimized points, i.e. points whose depth map data and normals have been updated in accordance with the described algorithm.

Fig. 8. Regard3D command line. FSSR implementation

Fig. 9 shows the result of surface construction by the FSSR method. The calculation time was 17.02 min. The final surface contains 1,369,758 points. The model contains small amounts of noise and has gaps. In the right picture, one can see that the model has a large hole. This is due to the fact that a shadow falls on this area in the original images. The lighting change is interpreted by the program as a lack of data for point reconstruction, because the shaded area is visible in only 2-3 views out of 21, which led to its rejection from revision and surface reconstruction.

Fig. 10 shows the result of surface reconstruction using the SPSR method. The calculation time was 1.22 min. The final surface contains 301,497 points. The model also contains a little noise and has gaps. In the right picture, one can see that the model has an even larger gap than with the previous method.

Fig. 9. MVS model - Floating Scale Surface Reconstruction

Fig. 10. MVS model - Screened Poisson Surface Reconstruction

We compare the obtained models by several indicators (Table 1).

Table 1. Comparison of final models.

                               FSSR                                SPSR
Visual assessment of details   Denser mesh with smaller gap area   Less dense mesh with larger gap area
Calculation time               17.02 min                           1.22 min
Number of points               1,369,758                           301,497
Model size (.obj)              401 MB                              85.1 MB
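The quantitative rows of Table 1 can be reproduced for any pair of exported meshes with a few lines of Python; the file names below are placeholders, and trimesh is just one convenient loader, not a tool used in the experiment.

```python
# Sketch: report vertex/face counts and file size for two exported .obj meshes.
import os
import trimesh

for name in ("model_fssr.obj", "model_spsr.obj"):   # placeholder file names
    mesh = trimesh.load(name, force="mesh")
    size_mb = os.path.getsize(name) / 2**20
    print(f"{name}: {len(mesh.vertices)} vertices, {len(mesh.faces)} faces, {size_mb:.1f} MB")
```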
6. Conclusion

Two surface generation algorithms were considered during the research: the point-based Screened Poisson Surface Reconstruction approach and the Floating Scale Surface Reconstruction approach. Combined with the same point cloud compaction method, the considered algorithms showed different temporal and quantitative results. The comparison of the final 3D models generated by these methods shows that the reduction in computation time with the SPSR method does not yield a qualitative result: the MVS - Screened Poisson Surface Reconstruction model is a much less dense mesh than the MVS - Floating Scale Surface Reconstruction model. On the basis of the obtained data it can be concluded that, to obtain high-quality 3D models from an unstructured point cloud, a surface generation algorithm that accounts for the changing scale of the images should be used. The point-based surface generation algorithm can be used for small collections of photos that do not contain multi-scale images; the reduced computational load during model preparation, as well as the small model size, can be useful, for example, for low-polygonal modeling in mobile applications.

In the future, it is planned to conduct a comparative analysis of existing algorithms based on depth map data, as well as approaches that take into account changes in illumination in the photographs.

References

[1] Sosnina, O., Filinskikh, A., Lozhkina, N.: Analysis of the virtual nontrivial forms models creation methods (in Russian). Information Technologies, Vol. 25 (11), pp. 679-681 (2019).
[2] Sosnina, O., Filinskikh, A., Korotaeva, A.S.: Comparison of the low-polygonal 3D model creation methods (in Russian). Information Technologies, Vol. 23 (8), pp. 564-568 (2017).
[3] Malysheva, A., Tomchinskaya, T.: Features of the low-polygon modeling and texturing in the mobile applications (in Russian). Conference KOGRAF-2019, ISBN 978-5-502-01200-3, pp. 51-54.
[4] Westoby, M.J., et al.: 'Structure-from-Motion' photogrammetry: A low-cost, effective tool for geoscience applications. Geomorphology, 179, pp. 300-314 (2012).
[5] Kublanov, V., et al.: Biomedical signals and images in digital healthcare: storage, processing and analysis: a training manual (in Russian), pp. 193-195 (2020).
[6] Snavely, N.: Scene reconstruction and visualization from internet photo collections. University of Washington, USA (2008).
[7] Fuhrmann, S., Langguth, F., Goesele, M.: MVE - a multi-view reconstruction environment. GCH (2014).
[8] Clustering views for multi-view stereo (CMVS), https://www.di.ens.fr/cmvs/. Last accessed 10 May 2020.
[9] Furukawa, Y., Ponce, J.: Accurate, Dense, and Robust Multi-View Stereopsis (PMVS). IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2007).
[10] Langguth, F., et al.: Shading-aware multi-view stereo. European Conference on Computer Vision, Springer, Cham, pp. 469-485 (2016).
[11] Seitz, S.M., et al.: A comparison and evaluation of multi-view stereo reconstruction algorithms. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), Vol. 1, pp. 519-528 (2006).
[12] Goesele, M., et al.: Multi-view stereo for community photo collections. 2007 IEEE 11th International Conference on Computer Vision, pp. 1-8 (2007).
[13] Voronin, V., Fisunov, A., Marchuk, V., Svirin, I., Petrov, S.: Restoration of the depth map based on the combined processing of a multi-channel image (in Russian). Modern Problems of Science and Education, 6 (2014).
[14] Greedy algorithms, https://habr.com/ru/post/120343/. Last accessed 26 June 2020.
[15] Basics of stereo vision, https://habr.com/ru/post/130300/. Last accessed 26 May 2020.
[16] Curless, B., Levoy, M.: A volumetric method for building complex models from range images. Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 303-312 (1996).
[17] Fuhrmann, S., Goesele, M.: Fusion of depth maps with multiple scales. ACM Transactions on Graphics (TOG), Vol. 30 (6), pp. 1-8 (2011).
[18] Fuhrmann, S., Goesele, M.: Floating scale surface reconstruction. ACM Transactions on Graphics (TOG), Vol. 33 (4), pp. 1-11 (2014).
[19] Kazhdan, M., Hoppe, H.: Screened Poisson surface reconstruction. ACM Transactions on Graphics (TOG), Vol. 32 (3), pp. 1-13 (2013).
[20] Kazhdan, M., Bolitho, M., Hoppe, H.: Poisson surface reconstruction. Proceedings of the Fourth Eurographics Symposium on Geometry Processing, Vol. 7 (2006).

About the authors

Glumova Ekaterina S. - student of Nizhny Novgorod State Technical University n.a. R.E. Alekseev, e-mail: glumova.ek@yandex.ru.
Filinskikh Aleksandr D. - Ph.D. in Technology, Associate Professor, Nizhny Novgorod State Technical University n.a. R.E. Alekseev, e-mail: alexfil@yandex.ru.