  Effective technology to visualize virtual environment using 360-degree
                   video based on cubemap projection
                                             P.Y. Timokhin1, M.V. Mikhaylyuk1
                                           webpismo@yahoo.de | mix@niisi.ras.ru
   1
     Federal State Institution «Scientific Research Institute for System Analysis of the Russian Academy of Sciences»,
                                                      Moscow, Russia

    The paper deals with the task of increasing the efficiency of high-quality visualization of a virtual environment (VE) based on video
with a 360-degree view created using cubemap projection. Such visualization requires the VE images on the cube faces to be of high
resolution, which hinders a smooth change of frames. To solve this task, an effective technology to extract and visualize the visible faces
of the cube is proposed, which allows the amount of data sent to the graphics card to be significantly reduced without any loss of visual
quality. The paper proposes algorithms for extracting visible faces that take into account all possible cases of cube edges hitting or
missing the camera field of view. Based on the obtained technology and algorithms, a software complex was implemented and tested
on a 360-video of a virtual experiment on observing the Earth from space. Testing confirmed the effectiveness of the developed
technology and algorithms in solving the task. The results can be applied in various fields of scientific visualization, in the
construction of virtual environment systems, video simulators, virtual laboratories, in educational applications, etc.
    Keywords: scientific visualization, cubemap projection, 360-degree video, virtual environment.

1. Introduction

    An important part of much up-to-date research is the visualization of scientific data and experiments using a 3D virtual
environment (VE) that simulates the object of study [1]. This is especially in demand in fields where research involves high risk
and work in hard-to-reach environments: medicine [2], space [3], the oil and gas industry [4], etc. One of the effective forms of
sharing such visualization between researchers is video with a 360-degree view, while watching which one can rotate the camera
in an arbitrary direction and feel the effect of immersion in the VE. For instance, using 360-video, anyone can explore the virtual
model of the center of the galaxy [5].
    To create a 360-video, various methods are used to unwrap a spherical panorama onto a plane [6]: equidistant cylindrical
projection, cubemap projection, projection onto the faces of the viewer's frustum [7], etc. One of the most widespread is cubemap
projection, in which the panorama is mapped onto the 6 faces of a cube, where each face covers a viewing angle of 90 degrees.
When playing such a video, the viewer is inside the cube and looks at its faces carrying images of the virtual environment. To feel
the immersion effect, it is important that the images on the faces have a sufficiently high resolution, i.e. the viewer should not see
their discrete (pixel) structure. Increasing the resolution leads to a large stream of graphic data, which impedes the visualization
process. In this regard, the task arises of reducing the amount of streamed data without noticeable loss of visualization quality.
    In this paper, an effective technology for solving this task is proposed, which is based on extracting and visualizing the cube
faces that are visible to the viewer. The technology is implemented in C++ using the OpenGL graphics library.

2. The pipeline of 360-video visualization

    Consider the task of visualizing a 360-video whose frames comprise the images of cube faces (face textures), as shown in
Fig. 1. The faces are named as they are seen by the viewer inside the cube. To visualize the 360-video, a virtual 3D scene is created
containing a unit cube model centered at the origin of the World Coordinate System (WCS). The viewer's virtual camera CV is
placed in the cube center and is initially directed to the front face (Fig. 2), where v and u are the "view" and "up" vectors of
camera CV (in WCS), and r = v × u is the "right" vector. The pipeline of 360-video visualization includes reading a frame from the
video file with the frequency specified in the video; extracting face textures from the frame and applying them to the cube model;
and synthesizing the image of the textured cube model from camera CV. When watching a 360-video, we allow camera rotation
around the X and Y axes of its local coordinate system, which corresponds to tilting the head up/down and left/right.



                                     Fig. 1. The location of cube faces in the frame of 360-video


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY
4.0)
                                        Fig. 2. Viewer’s camera CV and visualization cube
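
    To make the pipeline of Section 2 concrete, its per-frame loop can be outlined as follows. This is an illustrative sketch only:
readNextFrame, uploadVisibleFaceTextures and renderCube are hypothetical helpers standing for the three pipeline steps, not
functions of the described software.

    // Structural sketch of the 360-video visualization pipeline.
    struct Camera360 { /* "view", "up", "right" vectors, FOV, aspect */ };

    bool readNextFrame(unsigned char* frame);                        // read a 360-frame (3d x 2d pixels)
    void uploadVisibleFaceTextures(const unsigned char* frame, const Camera360& cam);
    void renderCube(const Camera360& cam);                           // draw the textured unit cube from camera CV

    void play360Video(Camera360& cam, unsigned char* frameBuffer)
    {
        while (readNextFrame(frameBuffer))                           // with the frequency specified in the video
        {
            uploadVisibleFaceTextures(frameBuffer, cam);             // extract face textures, apply them to the cube model
            renderCube(cam);                                         // synthesize the image from camera CV
        }
    }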

    The bottleneck of the described pipeline is the transfer of face textures from RAM to video memory (VRAM). Transferring
all 6 face textures (the entire 360-video frame) to VRAM would therefore be extremely inefficient and would impede the
smoothness of visualization. To solve this problem, we propose a technology based on extracting and visualizing only those cube
faces that are seen by camera CV.

3. The technology to extract and visualize visible cube faces

    To identify visible cube faces, we introduce the term "face pair" - two cube faces with a common edge. We enumerate the
cube vertices as shown in Fig. 2 and specify through them the edges: {0, 1}, {1, 5}, {5, 4}, {4, 0}, {6, 4}, {7, 5}, {3, 1}, {2, 0},
{2, 3}, {3, 7}, {7, 6}, {6, 2}. These edges correspond to the following face pairs: {0, 1}, {0, 3}, {0, 2}, {4, 0}, {2, 4}, {2, 3},
{3, 1}, {1, 4}, {1, 5}, {3, 5}, {5, 2}, {4, 5}, where 0…5 are the face numbers from Fig. 1. Depending on the orientation and
projection parameters of camera CV, cube edges may hit the frustum of the camera or miss it. Every edge hitting the frustum
determines a face pair that needs to be rendered; no edges hitting the frustum means that camera CV captures some single cube
face.
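
    In code, these two correspondences can be kept as constant lookup arrays E (edges) and D (face pairs), the names used later
in algorithm A3; a possible C++ definition is:

    // Cube edges as pairs of vertex numbers (enumeration of Fig. 2), and the
    // pair of face numbers (enumeration of Fig. 1) sharing each edge.
    static const int E[12][2] = {
        {0, 1}, {1, 5}, {5, 4}, {4, 0}, {6, 4}, {7, 5},
        {3, 1}, {2, 0}, {2, 3}, {3, 7}, {7, 6}, {6, 2}
    };
    static const int D[12][2] = {
        {0, 1}, {0, 3}, {0, 2}, {4, 0}, {2, 4}, {2, 3},
        {3, 1}, {1, 4}, {1, 5}, {3, 5}, {5, 2}, {4, 5}
    };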
    The proposed technology includes five stages. At the first stage, the parameters of camera CV's frustum are determined. At
the second stage, a boolean table H of cube vertex visibility is created. At the third stage, visible face pairs are extracted using
table H. At the fourth stage, a single visible face is extracted (if necessary). At the fifth stage, the extracted cube faces are
visualized. Let us consider these stages in detail.

3.1. Frustum parameters

    To determine visible cube faces, we need the following parameters of camera CV's frustum: γhor and γvert - the horizontal and
vertical FOV (field of view) angles; dn and df - the distances to the near and far clipping planes. The angle γhor ∈ [δ, π − δ] is
user-defined, where δ is a small constant (δ = 1° in our work), and the angle γvert is determined by the ratio

        tg(γvert / 2) = tg(γhor / 2) / aspect,                                                        (1)

where aspect ≥ 1 is the aspect ratio of camera CV (the ratio of the frame's width to its height). The distance to the far plane should
not be less than half of the longest cube diagonal, so we take df = √3/2 + ε, where ε is the machine error of real number
representation. The near clipping plane should be located so that the near base of the frustum does not touch the cube faces from
the inside. Fig. 3 shows that such a contact point is the intersection of the side line of camera CV's FOV with the center of a cube
face.

        Fig. 3. Determining the distance to the near clipping plane

    Then, for the distance dn the following equation can be written (a and b are the half-width and half-height of the near base of
the frustum):

        (dn + ε)² = 0.5² − (a² + b²) = 0.5² − (dn + ε)² (tg²(γhor / 2) + tg²(γvert / 2)).              (2)

    Substitute Eq. (1) into Eq. (2) and find the distance dn:

        dn = 1 / (2 √(1 + tg²(γhor / 2)(1 + 1/aspect²))) − ε.                                         (3)

    The stage described is performed when starting the 360-video, as well as each time the user changes γhor or aspect.
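
    Under the formulas above, this stage reduces to a few lines of C++. The following is a sketch; the struct and function names
are illustrative and not part of the described software:

    #include <cmath>

    struct FrustumParams { double gammaHor, gammaVert, dNear, dFar; };

    // Compute the frustum parameters from the user-defined horizontal FOV and aspect ratio,
    // following Eqs. (1)-(3); eps is the machine error of real number representation.
    FrustumParams computeFrustum(double gammaHor, double aspect, double eps)
    {
        FrustumParams f;
        f.gammaHor  = gammaHor;
        f.gammaVert = 2.0 * std::atan(std::tan(gammaHor / 2.0) / aspect);      // Eq. (1)
        f.dFar      = std::sqrt(3.0) / 2.0 + eps;                              // half of the longest cube diagonal
        double t2   = std::tan(gammaHor / 2.0) * std::tan(gammaHor / 2.0);
        f.dNear     = 1.0 / (2.0 * std::sqrt(1.0 + t2 * (1.0 + 1.0 / (aspect * aspect)))) - eps;   // Eq. (3)
        return f;
    }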
3.2. Table H of cube vertices visibility

    To simplify checking the visibility of cube edges, we create a table H that stores boolean flags of each cube vertex being in
the "+" half-space (where the frustum is located) of each clipping plane of camera CV, except the far plane. Relative to this plane
all cube vertices obviously lie in the "+" half-space (see the determination of dn in Section 3.1). Table H consists of 8 rows (the
row index is the cube vertex number), and each row stores 6 flags b0,…,b5:
- in the subgroup b0,…,b4, a raised bi flag means that the vertex lies in the "+" half-space of the ith clipping plane or coincides
  with it (0 - near, 1 - left, 2 - right, 3 - bottom, 4 - top);
- flag b5 is raised if all the flags b0,…,b4 are raised, i.e. the cube vertex is inside the frustum of camera CV.
    Consider some cube vertex P. Denote by PWCS its coordinates in WCS, and by p the vector OWCSPWCS. The calculation of the
flags b0,…,b5 for the vertex P is done by the following

        Algorithm A1 to fill a row of the table H
1. Find the projection pv = (p, v) of the vector p on the "view" vector v of camera CV.
2. Find the projection pr of the vector p on the "right" vector r similarly to pv.
3. Find the projection pup of the vector p on the "up" vector u similarly to pv.
4. Calculate the size dhor of the horizontal FOV of camera CV on the line of the vertex P (see Fig. 4):
        dhor = 2 pv tg(γhor / 2).
5. Calculate the size dvert of the vertical FOV of camera CV on the line of the vertex P similarly to dhor.
6. b0 = (pv ≥ dn),  b1 = (pr ≥ −dhor/2),  b2 = (pr ≤ dhor/2),  b3 = (pup ≥ −dvert/2),  b4 = (pup ≤ dvert/2).
7. b5 = (b0 && b1 && b2 && b3 && b4).

    Having executed algorithm A1 for each cube vertex in order, we obtain table H.

        Fig. 4. Checking vertex P visibility
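
    A compact C++ sketch of algorithm A1 is given below; the Vec3 type, the dot function and the camera fields are assumptions
of this sketch, and the same routine is reused further for intersection points:

    #include <cmath>

    struct Vec3 { double x, y, z; };
    inline double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    struct CameraCV { Vec3 v, r, u; double gammaHor, gammaVert, dNear; };

    // Algorithm A1: fill one row of table H (flags b0..b5) for a point P given in WCS
    // (the camera is at the origin, so the vector p coincides with the coordinates of P).
    void fillTableHRow(const CameraCV& cam, const Vec3& P, bool b[6])
    {
        double pv  = dot(P, cam.v);                                   // projection on the "view" vector
        double pr  = dot(P, cam.r);                                   // projection on the "right" vector
        double pup = dot(P, cam.u);                                   // projection on the "up" vector
        double dhor  = 2.0 * pv * std::tan(cam.gammaHor  / 2.0);      // horizontal FOV size on the line of P
        double dvert = 2.0 * pv * std::tan(cam.gammaVert / 2.0);      // vertical FOV size on the line of P
        b[0] = (pv  >= cam.dNear);
        b[1] = (pr  >= -dhor  / 2.0);   b[2] = (pr  <= dhor  / 2.0);
        b[3] = (pup >= -dvert / 2.0);   b[4] = (pup <= dvert / 2.0);
        b[5] = b[0] && b[1] && b[2] && b[3] && b[4];
    }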
3.3. Visible face pairs extraction

    Every face pair whose common edge intersects the frustum of camera CV will be extracted for visualization. This is possible
in two cases:
    (a) at least one of the edge vertices falls into the frustum. This case is easily detected by checking the flags b5 of the edge
vertices in the table H (at least one vertex having a true b5 is enough);
    (b) both vertices lie outside the frustum, but the edge intersects at least one clipping plane of camera CV, and their
intersection point falls into the frustum. We divide this case into 3 steps: 1) establishing the fact of the intersection of the ith
clipping plane by the edge; 2) finding the coordinates PI of their intersection point; 3) checking whether the point PI falls into the
frustum.
    Step 1. The fact of an "edge - ith clipping plane" intersection is easily established using the table H: for this, the edge vertices
should have opposite flags bi. Note that if the edge lies in the ith clipping plane and the edge vertices are outside the frustum, the
intersection with this plane will not be established (both flags bi will be true), but it will surely be established with another
clipping plane, so case (b) will still be detected correctly. Another important point is that no cube edge can cross the near or far
base of the frustum (see Section 3.1), so there is no need to check these planes.
    Step 2. After establishing the fact that the cube edge intersects the ith clipping plane, for instance the right plane (for the
remaining planes the derivation is similar), we need to calculate the coordinates PI of their intersection point. For this, we
introduce the following denotations: PA, PB - the coordinates of the edge vertices in WCS; e - the unit vector PAPB; pA - the
vector OWCSPA; pI - the vector OWCSPI; pI,r - the projection of the vector pI on the vector r of camera CV; pI,v - the projection of
the vector pI on the vector v of camera CV. In this example, the point PI lies in the right clipping plane, therefore its projection
pI,r is equal to half the size of the horizontal FOV on the line of the point PI (similarly to the size dhor in Fig. 4):

        pI,r = pI,v tg(γhor / 2),  or  (pI, r) = (pI, v) tg(γhor / 2).                                (4)

    Using the distributive property of the dot product of vectors, rewrite Eq. (4) as

        (pI, χright) = 0,  where  χright = r − tg(γhor / 2) v.                                        (5)

    Write another expression for the vector pI using the vector parametric equation of the line PAPB:

        pI = pA + tI e,                                                                               (6)

where tI is the parameter determining the position of the point PI on the line PAPB. Substitute Eq. (6) into Eq. (5) and find tI:

        tI = −(pA, χright) / (e, χright).                                                             (7)

    Similarly to Eq. (6), write for the coordinates PI the expression PI = PA + tI e. Substituting tI from Eq. (7) into it, we find the
required coordinates PI:

        PI = PA − ((pA, χright) / (e, χright)) e.                                                     (8)
    As one can notice, the coordinates of the intersection points of the edge PAPB with the left, top and bottom clipping planes
differ from Eq. (8) only by the similar terms χleft, χtop and χbtm. The expressions for these terms are derived in the same way:

        χleft = r + tg(γhor / 2) v,   χtop = u − tg(γvert / 2) v,   χbtm = u + tg(γvert / 2) v.       (9)

    Step 3. Having the coordinates PI of the intersection point, we check whether the point PI falls into the frustum of camera CV.
To do this, we calculate the flag b5 for the point PI using algorithm A1 and check its value (true means the edge is visible). Note
that when calculating the flag b5, the calculation of the flag bi can be omitted, since the point PI lies in the ith plane and bi will
obviously be true.
    Based on the cases (a) and (b) considered, we define the algorithm to check the visibility of the edge {m, n}, where m, n are
the numbers of the edge vertices introduced at the beginning of Section 3. Denote by bedge the flag of the edge {m, n} visibility.
The calculation of bedge is done by the following

        Algorithm A2 to determine the edge {m, n} visibility
1. Check the visibility of the edge vertices (using the table H):
        If (Hm,5 || Hn,5) is true, then:
            bedge = true, exit the algorithm.
2. Check whether the edge lies in the "-" half-space of any clipping plane:
        Loop by i from 0 to 4, where i is the plane number
            If (!Hm,i && !Hn,i) is true, then:
                bedge = false, exit the algorithm.
        End Loop.
3. Check the presence of at least one visible point PI of the intersection of the edge with any clipping plane:
        Loop by i from 1 to 4
            If (Hm,i ^ Hn,i) is true, then:
                Calculate the coordinates PI by Eqs. (8) and (9).
                Calculate the flag b5 by algorithm A1.
                If b5 is true, then:
                    bedge = true, exit the algorithm.
            End If.
        End Loop.
4. bedge = false.
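
    A possible C++ rendering of algorithm A2 is shown below. It combines Eqs. (8)-(9) with the table H built above and reuses
the Vec3, dot, CameraCV and fillTableHRow definitions of the previous sketch; the chiTerm helper and the layout of the arrays V
and H are assumptions of this sketch:

    #include <cmath>

    // The chi terms of Eqs. (5) and (9) for the left(1), right(2), bottom(3) and top(4) planes.
    Vec3 chiTerm(const CameraCV& cam, int plane)
    {
        double th = std::tan(cam.gammaHor / 2.0), tv = std::tan(cam.gammaVert / 2.0);
        Vec3   a  = (plane == 1 || plane == 2) ? cam.r : cam.u;        // r for left/right, u for bottom/top
        double s  = (plane == 1 || plane == 3) ? +1.0 : -1.0;          // "+" for left/bottom, "-" for right/top
        double t  = (plane == 1 || plane == 2) ? th : tv;
        return Vec3{ a.x + s*t*cam.v.x, a.y + s*t*cam.v.y, a.z + s*t*cam.v.z };
    }

    // Algorithm A2: visibility of the edge {m, n}; V[8] are the cube vertices, H[8][6] is the table H.
    bool edgeVisible(const CameraCV& cam, const Vec3 V[8], const bool H[8][6], int m, int n)
    {
        if (H[m][5] || H[n][5]) return true;                           // case (a): a vertex is inside the frustum
        for (int i = 0; i < 5; ++i)                                    // edge entirely in the "-" half-space of a plane
            if (!H[m][i] && !H[n][i]) return false;
        Vec3 e{ V[n].x - V[m].x, V[n].y - V[m].y, V[n].z - V[m].z };
        for (int i = 1; i <= 4; ++i)                                   // case (b): check the side planes
            if (H[m][i] != H[n][i])
            {
                Vec3   chi = chiTerm(cam, i);
                double tI  = -dot(V[m], chi) / dot(e, chi);            // Eq. (7); e need not be unit here
                Vec3   PI{ V[m].x + tI*e.x, V[m].y + tI*e.y, V[m].z + tI*e.z };   // Eq. (8)
                bool   b[6];
                fillTableHRow(cam, PI, b);                             // Step 3 (the flag for plane i could be skipped)
                if (b[5]) return true;
            }
        return false;
    }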
                                                                         array T is a continuous area of VRAM, allocated for
    Next, using algorithm A2, visible face pairs are                     storing one face texture. It is important to note here that
extracted. Denote by Bfaces the boolean array of 6 flags of              in 360-frame each face texture is stored not in one
cube faces visibility (true/false - the face is visible/not              continuous line, but in a number of d substrings of d
visible), by D and E - the arrays of face pairs and cube                 pixels in length (see Fig. 5). Since transferring a large
edges, introduced at the beginning of the Section 3, and                 number of small data pieces into VRAM reduces the
by bpair - the flag of at least one visible face pair. Execute           GPU's performance, we tune video driver (using operator
the following                                                            glPixelStorei of the OpenGL library) so that the
          Algorithm A3 to extract visible face pairs                     substrings of face texture are automatically merged and
1. Clear array Bfaces with value false, bpair = false.                   transferred to VRAM as one continuous piece. This is
2. Loop by j from 0 to 11, where j is the edge index                     done in the following
          Calculate flag bedge of the edge E[j] by algorithm                        Algorithm A5 to visualize 360-frame
     A2.                                                                 1. Clear frame buffer, set the viewport, as well as
          If bedge is true, then:                                             projection and modelview matrices according to
              Bfaces[D[j][0]] = true.                                         camera CV’s params.
              Bfaces[D[j][1]] = true.                                    2. Loop by i from 0 to 5, where i is face number.
              bpair = true.                                                       If Bfaces[i] is true, then:
          End If.
     End Loop.
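
    With the arrays E and D defined earlier and the edgeVisible routine above, algorithm A3 becomes a short loop (a sketch
under the same assumptions):

    // Algorithm A3: mark the faces of every face pair whose common edge is visible.
    // Returns bpair; Bfaces[6] receives the visibility flags of the cube faces.
    bool extractVisibleFacePairs(const CameraCV& cam, const Vec3 V[8], const bool H[8][6], bool Bfaces[6])
    {
        bool bpair = false;
        for (int k = 0; k < 6; ++k) Bfaces[k] = false;
        for (int j = 0; j < 12; ++j)                                   // over the 12 cube edges
            if (edgeVisible(cam, V, H, E[j][0], E[j][1]))
            {
                Bfaces[D[j][0]] = true;                                // both faces of the pair will be rendered
                Bfaces[D[j][1]] = true;
                bpair = true;
            }
        return bpair;
    }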
    If algorithm A3 results in a true flag bpair, we proceed to the stage of visualization of the faces marked in Bfaces (see
Section 3.5). If bpair is false (no visible face pairs were found), this means that camera CV captures some single cube face, and we
need to extract it for visualization (see Section 3.4).

3.4. Extraction of single visible face

    In this case, the visible face is the cube face with the smallest angle between its external normal and the vector v of
camera CV. To determine the number of such a face, we calculate the cosines of the angles between the normals to the faces and
the vector v, and extract the face with the largest cosine. Denote by K the array of cosines for the faces 0-5, and by n2, n3 and n5
the normals to the back, right and top cube faces. Write the sequence of normals for the faces 0-5: {−n5, −n2, n2, n3, −n3, n5}.
Since the normals n2, n3, n5 coincide with the axes OZWCS, OXWCS, OYWCS, the calculation of the array K reduces to writing the
sequence of the vector v coordinates with the signs taken from the normals' sequence. Execute the following

        Algorithm A4 to extract the single visible kth face
1. K = {−vy, −vz, vz, vx, −vx, vy}.
2. k = 0. // by default the 0th face is supposed to be visible.
3. Loop by i from 1 to 5, where i is the face number (see Fig. 1).
        If K[i] > K[k], then k = i.
    End Loop.
4. Bfaces[k] = true.

    After executing algorithm A4, the array Bfaces will contain one true flag marking the single visible cube face. Visualization
of the faces marked in Bfaces is performed at the next stage.
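
    A direct C++ sketch of algorithm A4, reusing the CameraCV type from the sketches above:

    // Algorithm A4: mark the single cube face whose outward normal is best aligned with v.
    void extractSingleVisibleFace(const CameraCV& cam, bool Bfaces[6])
    {
        const Vec3& v = cam.v;
        double K[6] = { -v.y, -v.z, v.z, v.x, -v.x, v.y };             // cosines for the faces 0..5
        int k = 0;                                                     // face 0 is supposed visible by default
        for (int i = 1; i <= 5; ++i)
            if (K[i] > K[k]) k = i;
        Bfaces[k] = true;
    }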
3.5. Visualization of extracted faces

    To each element of the array Bfaces a face texture of d × d pixels corresponds, and all 6 face textures, as noted in Section 2,
are merged into a 360-frame of 3d × 2d pixels. At this stage, the face textures marked in Bfaces are extracted from the 360-frame
and applied to the cube model. The face textures are extracted into an array T of 6 texture objects (one object per cube face). Each
element of the array T is a continuous area of VRAM allocated for storing one face texture. It is important to note here that in the
360-frame each face texture is stored not as one continuous block, but as d substrings of d pixels in length (see Fig. 5). Since
transferring a large number of small data pieces to VRAM reduces the GPU's performance, we tune the video driver (using the
operator glPixelStorei of the OpenGL library) so that the substrings of a face texture are automatically merged and transferred to
VRAM as one continuous piece. This is done in the following

        Algorithm A5 to visualize 360-frame
1. Clear the frame buffer, set the viewport, as well as the projection and modelview matrices according to camera CV's
   parameters.
2. Loop by i from 0 to 5, where i is the face number.
        If Bfaces[i] is true, then:
            Set the row length nRL of the 360-frame, the number nSR of skipped rows and the number nSP of skipped pixels in a
            row (see Fig. 5):
                glPixelStorei (…_ROW_LENGTH, 3d),
                glPixelStorei (…_SKIP_ROWS, ⌊i / 3⌋ d),
                glPixelStorei (…_SKIP_PIXELS, (i % 3) d),
            where "…" is a shortening of GL_UNPACK.
            Load the ith face texture into the T[i]th texture object by means of the operator glTexSubImage2D.
            Render the T[i]th texture object on the ith face.
        End If.
    End Loop.

    Note that if the 360-frame is not changed during the visualization process (for example, the video is paused), then in step 2 of
algorithm A5 the same face textures are not loaded into VRAM again; the previously loaded ones are used.

        Fig. 5. Reading face texture from 360-frame
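
    As an illustration, the texture-loading part of step 2 could be written with OpenGL as follows. This is a sketch only: the
texture objects in T are assumed to be created beforehand (e.g. with glTexImage2D, size d × d, RGB format), frame is assumed to
point to the pixels of the current 360-frame, and rendering of the textured face is left out:

    #include <GL/gl.h>

    // Upload the ith face texture (d x d pixels) from the 3d x 2d 360-frame to the texture object T[i].
    void uploadFaceTexture(int i, int d, const unsigned char* frame, const GLuint T[6])
    {
        glPixelStorei(GL_UNPACK_ROW_LENGTH, 3 * d);                    // nRL: full row length of the 360-frame
        glPixelStorei(GL_UNPACK_SKIP_ROWS, (i / 3) * d);               // nSR: skip the rows of the other face band
        glPixelStorei(GL_UNPACK_SKIP_PIXELS, (i % 3) * d);             // nSP: skip the faces to the left in the band
        glBindTexture(GL_TEXTURE_2D, T[i]);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, d, d, GL_RGB, GL_UNSIGNED_BYTE, frame);
        glPixelStorei(GL_UNPACK_ROW_LENGTH, 0);                        // reset the unpack state so that
        glPixelStorei(GL_UNPACK_SKIP_ROWS, 0);                         // other uploads are not affected
        glPixelStorei(GL_UNPACK_SKIP_PIXELS, 0);
    }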




4. Results

    The proposed technology and algorithms were implemented in a software complex (a 360-player) written in C++ using the
OpenGL graphics library. The player performs high-quality visualization of a virtual environment from a 360-video based on
cubemap projection. During the visualization, the viewer can rotate the camera (corresponding to tilting the head up/down and
left/right), as well as change the camera FOV (viewing angle and aspect).
    The developed solution was tested on a 360-video with a resolution of 3000x2000 pixels, created in the visualization system
for virtual experiments on observing the Earth from the International Space Station (ISS) [3]. By means of the developed player,
an experiment was reproduced in which the researcher, rotating the observation tool, searches for and analyzes a number of Earth
objects along the ISS daily track. Fig. 6a shows an example of a 360-video frame, and Fig. 6b shows the visualization of this
frame in the 360-player.

        Fig. 6. Testing the solution developed: (a) a frame of source 360-video; (b) VE visualized from the frame

5. Conclusions

    The paper considers the task of increasing the efficiency of VE visualization using 360-degree video based on cubemap
projection. High-quality visualization (providing the effect of immersion into the VE) requires high-resolution snapshots of the
VE cubemap, which overloads the graphics card and impedes smooth changing of the frames. To solve this task, an effective
technology is proposed, based on the extraction and visualization of visible cube faces, which can significantly reduce the amount
of data sent to the graphics card without any loss of visual quality. The paper proposes algorithms to extract visible cube faces
both in the case when cube edges fall into the FOV and in the case when no edges are visible (the case of a single visible face).
The resulting technology and algorithms were implemented in software and tested on a 360-video containing the visualization of
a virtual experiment on observing the Earth from space. The testing of the software confirmed the correctness of the obtained
solution, as well as its applicability to virtual environment systems and scientific visualization, video simulators, virtual
laboratories, etc. In the future, we plan to extend the results to increase the efficiency of visualization of a VE projected onto a
dodecahedron.

Acknowledgements

    The publication is made within the state task on carrying out basic scientific research (GP 14) on the topic (project)
"34.9. Virtual environment systems: technologies, methods and algorithms of mathematical modeling and visualization"
(0065-2019-0012).
References:
[1] Bondarev A.E., Galaktionov V.A. Construction of a
    Generalized Computational Experiment and Visual
    Analysis of Multidimensional Data // CEUR
    Workshop Proceedings: Proc. 29th Int. Conf.
    Computer Graphics and Vision (GraphiCon 2019),
    Bryansk, 2019, vol. 2485, p. 117-121. http://ceur-ws.org/Vol-2485/paper27.pdf.
[2] Gavrilov N., Turlapov V. General implementation
    aspects of the GPU-based volume rendering
    algorithm // Scientific Visualization. - 2011. - Vol. 3,
    № 1. - p. 19-31.
[3] Mikhaylyuk, M.V., Timokhin, P.Y., Maltsev, A.V. A
    method of Earth terrain tessellation on the GPU for
    space simulators // Programming and Computer
    Software - 2017. - Vol. 43, p. 243-249. DOI:
    10.1134/S0361768817040065.
[4] Mikhaylyuk M.V., Timokhin P.Yu. Memory-
    effective methods and algorithms of shader
    visualization of digital core material model //
    Scientific Visualization - 2019. - Vol. 11, № 5. - p.
    1-11. DOI: 10.26583/sv.11.5.01.
[5] Porter M. Galactic Center Visualization Delivers
    Star                      Power                       //
    https://chandra.harvard.edu/photo/2019/gcenter/
    (review date 25.05.2020).
[6] El-Ganainy T., Hefeeda M. Streaming Virtual Reality Content //
    https://www.researchgate.net/publication/311925694_Streaming_Virtual_Reality_Content
    (review date 25.05.2020).
[7] Kuzyakov E., Pio D. Next-generation video encoding techniques for 360 video and VR //
    https://code.facebook.com/posts/1126354007399553/next-generation-video-encodin
    (review date 25.05.2020).

About the authors
    Timokhin Petr Yu., senior researcher of Federal State
Institution «Scientific Research Institute for System Analysis of
the      Russian      Academy      of      Sciences».    E-mail:
webpismo@yahoo.de.
    Mikhaylyuk Mikhail V., Dr. Sc. (Phys.-Math.), chief
researcher of Federal State Institution «Scientific Research
Institute for System Analysis of the Russian Academy of
Sciences». E-mail: mix@niisi.ras.ru.