<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Tatyana Demenkova, Leo Tverdokhlebov</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Moscow technological university (MIREA)</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>81</volume>
      <fpage>332</fpage>
      <lpage>338</lpage>
      <abstract>
        <p>This work presents a method of rendering 360° panoramas based on real-world DEM (Digital Elevation Model) data using the graphics processor (GPU). Given geographic coordinates as input, the method generates a panorama image together with a distance map holding the distance from the point of view to every pixel of the rendered panorama. Details on estimating these distances, along with the results of accuracy evaluation and error measurements, are provided.</p>
      </abstract>
      <kwd-group>
        <kwd>3D panorama rendering</kwd>
        <kwd>GPU rendering</kwd>
        <kwd>GPU z-buffer</kwd>
        <kwd>estimation of distances to relief objects</kwd>
        <kwd>digital elevation model</kwd>
        <kwd>geo-information systems</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <sec id="sec-2-1">
        <p>With the rapid development of digital cartography and GIS (geographic information systems), many related techniques and approaches aimed at the use of spatial data have emerged. A great contribution was made by NASA's Shuttle Radar Topography Mission (SRTM), which gave the public access to highly detailed geospatial data covering the whole surface of our planet. Among all the fields of application of digital elevation model (DEM) data [1], it seems rational to highlight the comparison between data collected in the real world and the model, which makes it possible to produce new, previously non-existent information and conclusions [2,3].</p>
      </sec>
      <sec id="sec-2-2">
        <p>This article presents an algorithm for estimating the distance to the terrain relief based on the DEM, which can be used to generate 360° panoramas together with a depth map for all pixels of the panorama. A comparison of the measurements against a reference model is provided as well.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>PROPOSED METHOD OF ESTIMATION</title>
      <sec id="sec-3-1">
        <p>Rendering of the panorama is possible using the graphics processor (GPU). In particular, it is interesting to consider simultaneous rendering of the panorama together with the depth map. Any modern GPU is equipped with a z-buffer, which stores information about the distance to all the objects of the 3D scene. In order to reconstruct linear depth values from the values obtained from the z-buffer, one needs to reverse engineer the pipeline [4] which the data goes through before it gets into the z-buffer.*</p>
        <p>Let us start from the basics and follow the pipeline. The OpenGL projection matrix (GL_PROJECTION) looks like: $$ P = \begin{pmatrix} n/r &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; n/t &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; -\frac{f+n}{f-n} &amp; -\frac{2fn}{f-n} \\ 0 &amp; 0 &amp; -1 &amp; 0 \end{pmatrix}, $$ where $n$ and $f$ are the distances to the near and far clipping planes and $r$ and $t$ are the right and top extents of the near plane. Multiplied by a homogeneous point $(x_e, y_e, z_e, w_e)$ in eye space it gives us this point in clip space; the depth-related components are $$ z_c = -\frac{f+n}{f-n}\,z_e - \frac{2fn}{f-n}\,w_e, \qquad w_c = -z_e. $$ Since one only cares about the depth component, and taking into account the fact that in eye space $w_e$ equals 1, the result is: $$ z_c = -\frac{f+n}{f-n}\,z_e - \frac{2fn}{f-n}, \qquad w_c = -z_e. $$</p>
        <p>* Proceedings of the XI International scientific-practical conference «Modern information technologies and IT-education» (SITITO'2016), Moscow, Russia, November 25-26, 2016.</p>
      </sec>
      <sec id="sec-3-2">
        <p>The next OpenGL operation is the division of $z_c$ by its $w$ component (perspective division): $$ z_n = \frac{z_c}{w_c} = \frac{f+n}{f-n} + \frac{2fn}{(f-n)\,z_e}, $$ after which the normalized depth is mapped from $[-1, 1]$ into the $[0, 1]$ range stored in the z-buffer: $$ z_b = \frac{z_n}{2} + \frac{1}{2}. $$</p>
        <p>Then we recover $z_e$: $$ z_e = \frac{2fn}{(f-n)\,z_n - (f+n)} = \frac{2fn}{(f-n)(2z_b - 1) - (f+n)}. $$</p>
      </sec>
      <sec id="sec-3-3">
        <p>With all these transformations we eventually obtain the linearized value $$ d = -z_e = \frac{2fn}{f + n - (2z_b - 1)(f-n)}, $$ where $n$ is the distance to the near plane of the frustum and $f$ is the distance to the far plane of the frustum.</p>
      </sec>
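      <sec>
        <p>As an illustration of the recovery formula above, the linearization can be sketched in a few lines of Java (a minimal sketch in the spirit of our implementation; the class and method names are ours, not taken from the system's code base):</p>
        <preformat>
```java
// Recover the linear eye-space distance from a depth value read out of the
// z-buffer, following the derivation above: zb in [0, 1] is mapped back to
// the NDC range [-1, 1], then inverted through the projection to a distance
// in scene units.
final class DepthLinearizer {
    private final float near; // n: distance to the near plane of the frustum
    private final float far;  // f: distance to the far plane of the frustum

    DepthLinearizer(float near, float far) {
        this.near = near;
        this.far = far;
    }

    /** zb is the window-space depth in [0, 1] as stored in the z-buffer. */
    float linearDistance(float zb) {
        float zn = 2f * zb - 1f; // back to NDC range [-1, 1]
        return 2f * far * near / (far + near - zn * (far - near));
    }
}
```
        </preformat>
        <p>A quick sanity check: a stored depth of 0 maps to the near-plane distance n, and a stored depth of 1 maps to the far-plane distance f.</p>
      </sec>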
    </sec>
    <sec id="sec-4">
      <title>IMPLEMENTATION</title>
      <p>The whole system is implemented in the Java programming language. Java gives us the advantage of using the code of the system in desktop applications, web services and on mobile devices. The rendering of the 3D scenes is performed through the jME3 game engine library, with some core classes rewritten in a way that provides access to the underlying layer of the LWJGL library. Access to the underlying layer is needed to directly receive the data from the z-buffer of the GPU hardware; such low-level functionality is normally not needed by game and conventional 3D software developers.</p>
      <p>jME3 itself uses LWJGL (Lightweight Java Game Library). The library accesses native C code through the Java Native Interface (JNI). LWJGL provides bindings to OpenGL: it acts as a wrapper over the OpenGL native libraries and provides an API similar to the OpenGL APIs of lower-level languages that run directly on the hardware (as opposed to Java running inside the Java Virtual Machine), such as C/C++. A platform-specific lwjgl.dll file should be accessible to the Java code, though this is not a big constraint, since libraries for the most popular operating systems (Linux/Windows) and platforms (x86/x64/ARM) are available. A first attempt was made to implement the renderer using the JOGL OpenGL Java library. Although the 360° panorama rendering was successful, it was not possible to access the z-buffer data, which forced us to switch to the jME3 and LWJGL libraries. Another advantage of jME3 over the JOGL library is the support of DEM height maps out of the box.</p>
      <sec id="sec-4-1">
        <p>No GUI is provided for the panorama generation functionality, since the program is supposed to be integrated into the pipeline of the comprising project. The code can nevertheless be started from the command line.</p>
      </sec>
      <sec id="sec-4-2">
        <title>The system can be split into several modules:</title>
      </sec>
      <sec id="sec-4-3">
        <title>1. DEM operating module.</title>
      </sec>
      <sec id="sec-4-4">
        <p>This module reads a DEM file and produces the 3D landscape model suitable for the renderer. The module takes the geographical coordinates of the centre of the desired area and the desired radius as input. Since the DEM data is stored in a large number of files, one square degree of latitude/longitude per file, the module can stitch together terrain data from up to 9 files, producing a square terrain.</p>
      </sec>
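      <sec>
        <p>The tile bookkeeping described above can be sketched as follows. This is a hypothetical helper with our own naming, not the module's actual code; it assumes the common SRTM convention of naming each 1°×1° file after the south-west corner of the tile it covers, e.g. N55E037.hgt:</p>
        <preformat>
```java
import java.util.LinkedHashSet;
import java.util.Set;

// Pick the SRTM tiles covering a square area around a centre coordinate.
// An area near a tile border can span up to 9 files: the centre tile plus
// its 8 neighbours.
final class SrtmTiles {
    static String tileName(int lat, int lon) {
        return String.format("%s%02d%s%03d.hgt",
                lat >= 0 ? "N" : "S", Math.abs(lat),
                lon >= 0 ? "E" : "W", Math.abs(lon));
    }

    /** File names for the centre tile and its 8 neighbours. */
    static Set<String> tilesAround(double latitude, double longitude) {
        int lat0 = (int) Math.floor(latitude);
        int lon0 = (int) Math.floor(longitude);
        Set<String> names = new LinkedHashSet<>();
        for (int dLat = 1; dLat >= -1; dLat--)      // north to south
            for (int dLon = -1; dLon <= 1; dLon++)  // west to east
                names.add(tileName(lat0 + dLat, lon0 + dLon));
        return names;
    }
}
```
        </preformat>
        <p>For a centre near Moscow (55.7, 37.6) this yields the 9 tiles surrounding N55E037.hgt, which are then stitched into one square terrain.</p>
      </sec>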
      <sec id="sec-4-5">
        <title>2. Renderer module.</title>
      </sec>
      <sec id="sec-4-6">
        <p>This module performs rendering of the 3D terrain surface. The module exposes two usable classes: OffscreenRenderer and OnscreenRenderer. OffscreenRenderer provides the 360° panorama generation functionality. OnscreenRenderer provides the ability to fly over the 3D landscape in a flight-simulator-like way using keyboard and mouse controls; a simple GUI is provided to choose the screen resolution and some other rendering parameters. OnscreenRenderer is extremely useful for debugging and for detecting anomalies in DEM data.</p>
      </sec>
      </sec>
      <sec id="sec-4-7">
        <p>In the beginning the program receives geographical coordinates (latitude and longitude) as input. Based on the coordinates, the set of files containing the relevant DEM data is determined and read into memory. These data are then concatenated in the proper order to form one continuous terrain surface. After this the terrain is positioned and cut so that the requested coordinate becomes the new centre of the terrain surface. Next, the DEM heightmap (a two-dimensional array of heights in metres) is translated into a 3D model (vertices, edges, faces, and polygons), and the altitude of the landscape at the desired coordinate is determined from the DEM data. Then the camera is positioned at the desired coordinate and at the determined altitude.</p>
      </sec>
      </sec>
      <sec id="sec-4-8">
        <p>Initially the camera is facing north with a clockwise shift of 22.5° (half of 45°). Then the camera is rotated by 45° eight times. After each rotation we take a snapshot of what the camera sees and a snapshot of the z-buffer content. At the end all the data is stitched together to form a 360° view panorama and a distance panorama, and this data is returned to the calling function to be used in subsequent processing, or simply saved to a file for later use.</p>
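        <p>The snapshot schedule just described can be written down explicitly (a sketch with our own naming, not the renderer's actual code):</p>
        <preformat>
```java
// Headings (degrees clockwise from north) at which the eight snapshots are
// taken: an initial 22.5° offset, then steps of 45° covering the full 360°.
final class PanoramaHeadings {
    static double[] snapshotHeadings() {
        double[] headings = new double[8];
        for (int i = 0; i < 8; i++) {
            headings[i] = (22.5 + 45.0 * i) % 360.0;
        }
        return headings;
    }
}
```
        </preformat>
        <p>Each heading is the centre of one 45°-wide snapshot, so the eight snapshots tile the full circle with no gaps.</p>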
      </sec>
      <sec id="sec-4-8-1">
        <title>EXPERIMENTAL EVALUATION OF THE PROPOSED METHOD</title>
      </sec>
      <sec id="sec-4-9">
        <p>In order to evaluate the accuracy of the distance predictions, this work evaluates the developed model against the mountain view generator of Dr. Ulrich Deuschle (hereinafter referred to as Udeuschle), which we take as a reference model.</p>
        <p>First, three datasets were collected by comparing three pairs of panoramas and handpicking the mountain peaks that were clearly visible and distinguishable both in the picture generated by the developed system and in the picture generated by the Udeuschle generator. After plotting all three datasets on one plot together with the line of perfect correlation, it became obvious that all the observations could be naturally divided into two groups: the first group includes the observations made with the camera located lower than the peak, the second group the observations made with the camera located higher than the peak (Figure 1).</p>
        <p>Figure 1. Correlation of distance prediction results between our model and the Udeuschle model for cases when: a) the camera is located higher than the peaks; b) the camera is located lower than the peaks.</p>
        <p>Figure 2. The histogram of the residuals on the evaluation data for cases when: a) the camera is located higher than the peaks; b) the camera is located lower than the peaks.</p>
        <p>As for the residuals: a residual for an observation in the evaluation data is the difference between the true target and the predicted target. Residuals represent the portion of the target that the model is unable to predict (Figure 2). A positive residual indicates that the model is underestimating the target (the actual target is larger than the predicted target); a negative residual indicates an overestimation (the actual target is smaller than the predicted target). When the histogram of the residuals on the evaluation data is bell-shaped and centred at zero, the model makes mistakes in a random manner and does not systematically over- or under-predict any particular range of target values (the histogram follows the form of the normal distribution).</p>
      </sec>
      <sec id="sec-4-10">
        <p>The histogram in Figure 3 confirms the predictive power of the model. For most of the cases the distance was predicted well by the model: most of the distances are predicted with a percentage error of no more than 10% (for most of the points the real distance lies in the interval [0.9 × predicted value, 1.1 × predicted value]).</p>
      </sec>
      <sec id="sec-4-11">
        <p>Table 1. Shortest intervals containing the real distance with a given probability, by camera position (columns: Higher, Lower, All).</p>
      </sec>
      <sec id="sec-4-12">
        <p>Table 1 could be used for practical applications of our distance prediction model. Imagine one has obtained an estimate L of the distance to the peak from the panorama. Based on this value, one should give the shortest interval of distances that includes the real distance with probability 80%. If one knows that the peak is higher than the camera, then one could say that the distance to the mountain is L ± 0.25L with probability 80%. If one has no idea about the height of the peak, one should provide the interval L ± 0.3L in order to be 80% sure. Now one could try to improve the accuracy of the predictions for the case when the camera is located lower than the peak by calculating the systematic error.</p>
        <p>The dataset was separated into two parts: one part is used as the training set (80%), the second as the test set (20%). A linear regression is built to find the systematic shift between the real and the predicted value of the distance. In other words, the best coefficients b and c are found such that: dist_real = b * dist_predicted + c + ε, where dist_real is the real distance from the peak to the camera, dist_predicted is the distance predicted by our model, and ε is the error.</p>
      </sec>
      <sec id="sec-4-13">
        <p>The best b and c are calculated by the least-squares estimation method on the training set, and the performance is tested on the test set. The best b and c are found using cross-validation in order to avoid overfitting: the dataset is divided into k folds (in our case k = 7), k − 1 folds are used as a training set and 1 fold as a test set. Among these k pairs of coefficients, the pair is chosen that gives the best value of the R-squared statistic. The best values are b = 0.898891146468 and c = −1015.66880806.</p>
      </sec>
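      <sec>
        <p>The correction step described above amounts to a one-variable least-squares fit, which can be sketched as follows (our class and method names, not the system's code; the cross-validation loop over the k folds is omitted for brevity):</p>
        <preformat>
```java
// Fit dist_real = b * dist_predicted + c by ordinary least squares on the
// training pairs, then use the fitted coefficients to de-bias new
// predictions.
final class DistanceCalibrator {
    double b, c;

    /** Closed-form least-squares fit of y = b*x + c. */
    void fit(double[] predicted, double[] real) {
        int n = predicted.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += predicted[i];
            sy += real[i];
            sxx += predicted[i] * predicted[i];
            sxy += predicted[i] * real[i];
        }
        b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        c = (sy - b * sx) / n;
    }

    /** Corrected distance for a new prediction of the model. */
    double correct(double predictedDistance) {
        return b * predictedDistance + c;
    }
}
```
        </preformat>
        <p>With coefficients close to the reported b ≈ 0.899 and c ≈ −1015.7, the correction shrinks the systematic overestimation observed when the camera is lower than the peak.</p>
      </sec>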
      <sec id="sec-4-14">
        <p>Table 2. Coefficient of determination (R-squared) and RMSE for the distance predicted by our model (camera positions: Higher, Lower) and for the distance predicted by our model with the linear regression correction (Lower + LR).</p>
      </sec>
      <sec id="sec-4-21">
        <p>In Figure 4 below one can see the deviation of the predicted values from the reference values. The closer a point is to the line of perfect correlation, the better the model (for a perfect model all the points lie on this line).</p>
      </sec>
      <sec id="sec-4-22">
        <p>Two conclusions can be drawn directly from the graph. First, as all the points are quite close to the line of perfect correlation, our model provides accurate predictions of the distances. Second, our model is likely to overestimate the distances to the mountains, as most of the points lie above the line.</p>
        <p>So building a linear regression can be a good way to increase the accuracy of the predictions in cases when it is impossible to place the camera higher than the peaks. But this method cannot be used without some knowledge of the real distances to the peaks and the distances measured by the algorithm, since enough data is needed to build the regression.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>CONCLUSION</title>
      <sec id="sec-5-1">
        <p>In this article a method of estimating the distance to the terrain relief based on a digital elevation model was proposed. The specific technique of computing the distance mask from the transformed z-buffer data was presented. The experimental results revealed a correlation between the camera-peak altitude gap and the distance error. An evaluation of the accuracy of the distance measurements against the reference model was provided. Tables with applied distance accuracy probabilities were computed and directions for their usage were given.</p>
      </sec>
      <sec id="sec-5-2">
        <p>The proposed method effectively estimates the distance to the terrain relief. For better precision the measured terrain area should be positioned lower than the camera; in other words, it should be directly observable by the camera. This logical requirement is stipulated by the method of estimation, which is based on the data obtained from the z-buffer of the GPU. Yet there is still great room for improvement in two areas: rendering optimization and accuracy enhancement.</p>
        <p>This work was supported by the Ministry of Education and Science of Russian Federation (project 2014/112 №35).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Farr</surname>
            ,
            <given-names>T.G.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kobrick</surname>
          </string-name>
          ,
          <article-title>Shuttle Radar Topography Mission produces a wealth of data</article-title>
          ,
          <source>Amer. Geophys. Union Eos</source>
          , v.
          <volume>81</volume>
          , p.
          <fpage>583</fpage>
          -
          <lpage>585</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.G.</given-names>
            <surname>Tragheim.</surname>
          </string-name>
          <article-title>The Accuracy of ASTER Digital Elevation Models, a Comparison to NEXTMap</article-title>
          .
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>SRTM Topography</source>
          . U.S. Geological Survey,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Ganovelli</surname>
          </string-name>
          , Massimiliano Corsini, Sumanta Pattanaik, Marco Di Benedetto.
          <article-title>Introduction to Computer Graphics: A Practical Learning Approach</article-title>
          . CRC Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>