Performance-Effective Algorithm for Solving
      Large-Scale Forward Gravity Problem for
                  Elliptical Objects

         Petr S. Martyshko1,2 , Igor V. Ladovskii1 , Denis D. Byzov1 , and
                         Alexander I. Chernoskutov1,2
1
    Bulashevich Institute of Geophysics, Ural branch of Russian Academy of Sciences,
                                Yekaterinburg, Russia
               2
                 Yeltsin Ural Federal University, Yekaterinburg, Russia


        Abstract. In this paper we purpose a performance-effective algorithm
        for solving forward gravity problem for elliptical objects. The algorithm
        is subject to parallel computations, it was implemented and tested with
        the CUDA technology. In general, the suggested method can be applied
        to density distribution models of arbitrary geometry that can be trian-
        gulated.

        Keywords: computational geophysics, potential fields, gravity field, spher-
        ical Earth crust models


1     Problem statement

Modern studies of the internal structure of the Earth require density model
reconstructions with high resolution and accuracy. This raises questions on error
estimation due to spherical shape of the Earth and creates the need of refinement
of the modelling results with sphericity taken into account. Thus, there is a
need for an effective tool that allows solving the forward gravity problem for
spherical and elliptical density models. In this paper, we propose an algorithm
based on the new representations of the triangular plate potential in space that
makes it possible to effectively use the capabilities of modern computational
parallelization technologies (in particular, CUDA) and, as a result, to obtain
high-performance gravity field solving software.
    Define the 3D density model as follows. Upper boundary S of the model
(ground-air interface) is a sector of a surface of an ellipsoid of revolution (for
example, Krasovsky Ellipsoid), all points located at a distance of no more than
H along the inner normal to S are forming a set D (see Fig. 1). In the domain
D ⊂ R3 , the density distribution ρ(p) is defined, p ∈ D.
    The vertical component of the gravitational field intensity ∆g generaded by
the domain D at the outer point q ∈ / D \ ∂D is determined by the integral
                                           Z
                                        ∂      ρ(p)dVp
                          ∆g(q) = −γ                   ,                       (1)
                                       ∂nq D |r − r0 |
                                                                                      97


                Fig. 1. Visual representation of the 3D density model


where γ is the gravitational constant, dVp is a volume element of integration, nq
is the outer normal to S in the orthogonal projection of the point q to S, r and
r0 are the radius vectors of the points p and q, respectively.
    Assume the ellipsoid of rotation for our spherical model D has equatorial
and polar radii equal to a and b. The position of the point on the surface of the
ellipsoid of rotation is uniquely defined by the geodesic latitude B ∈ [ −π/2; π/2]
and the longitude L ∈ ( −π; π] (except for the “poles”, where longitude is not
defined). As a third coordinate H we employ the distance between the given
point and the ellipsoid surface; it has a plus sign if the point lies outside the
ellipsoid, and the minus sign if the point lies inside it. Position of the given point
                                                                                   2
in space is uniquely defined by three values: (L, B, H), where H
     √                                                              p > −N (1 − e ),
        2     2                                                             2   2
e = a − b /a (the eccentricity of the ellipsoid), and N = a/ 1 − e sin B.


2    Gravity field of a spherical model

Assume Di,j,k is a descrete element of D and the density of a descrete element
be a constant ρi,j,k . Then the field ∆g at point q generated by D is equals to

                                   x −1 N
                                  NX      y −1 Nz −1
                                         X     X
                      ∆g(q) = γ                        ρi,j,k Gi,j,k (q),            (2)
                                   i=0   j=0   k=0


where Gi,j,k (q) is the field at the point q of Di,j,k up to a coefficient γ. In order to
calculate Gi,j,k (q), we introduce in the space of a spherical model a rectangular
Cartesian coordinate system O0 x0 y 0 z 0 . The center O0 coincides with the center of
the ellipsoid, O0 z 0 axis is ellipsoid’s axis of rotation and directed from the “south
pole” to the “north” (i.e. when B = π/2, then z 0 > 0), the O0 x0 axis points to
(0, 0, 0) in (L, B, H), the axis O0 y 0 complements the system to the right-handed
98

axis set. Equations for transition from (L, B, H) to O0 x0 y 0 z 0 are then as follows:
                         
                             0
                         x = (N + H) cos B cos L
                         
                           y 0 = (N + H) cos B sin L ,                              (3)
                          0
                         
                           z = (N (1 − e2 ) + H) sin B

and                                                 
                                        cos B cos L
                                   n =  cos B sin L  ,                                              (4)
                                            sin B
where n is a unit vector of the external normal to the ellipsoid at the point
(L, B, 0).
    Integral (1) for the Gi,j,k (q) field can not be expressed analytically. It is also
problematic to calculate in numerically (with the help of the cubature formulas),
since the boundaries of Di,j,k usually have a very complex description in the
(L, B, H) or (x0 , y 0 , z 0 ) coordinate system and, in order to achieve acceptable
accuracy, a large number of nodes is required [1, 6]. Therefore, we propose to
calculate integral (1) not for Di,j,k , but for the polyhedron of approximation
D̂i,j,k formed by the triangulation of Di,j,k . Thereby,
                                                          Z
                                                    ∂                 ρ(p)dVp
                      Gi,j,k (q) ≈ Ĝi,j,k (q) = −                              .                     (5)
                                                   ∂nq        D̂i,j,k |r − r0 |

   In formula (5), we proceed to integration over the surface, using the Ostro-
gradsky theorem

                           Z                          !              I                        !
                                         1                                        np
      Ĝi,j,k (q) =   nq ,         ∇p           dVp       =     nq ,                     dS       .
                           D̂i,j,k    |r − r0 |                      ∂ D̂i,j,k |r − r0 |


    Next, we divide the surface integral into parts along the faces of D̂i,j,k and
take into account that the outer normal np at the integration point is constant
for each face
                                                     Z
                                 X                           dS
                 Ĝi,j,k (q) =           (nq , ni1 )                ,           (6)
                                                      Si1 |r − r0 |
                                  Si1 ∈S(D̂i,j,k )

where ni1 is the outer normal to the face Si1 . Now we need to acquire a formula
                 R       dS
for the integral Si1            over a triangle (which is the gravitational potential
                      |r − r0 |
of a triangle with a unit surface density without γ coefficient). Let ri (i = 1, 2, 3)
be the radius vector of the vertices of the triangle hp1 , p2 , p3 i; q is the field
computation point; ai = ri − r0 ; a(j,i) = ai − aj = ri − rj ; N = a(i−1,i) × a(i,i+1)
is the normal to the plane of the triangle with a length equal to its doubled area;
n = N/|N| is the unit normal to the plane of the triangle; A(j,i) = a(j,i) /|a(j,i) |;
                                                                                          99

(ai · n) is the distance from the point q to the plane of the triangle (with the
appropriate sign). Then
  Z                           
                    dS                                        (A1 ; [A2 ; A3 ])
                           = 2 n; −a1 arctan                                              +
   hp1 ,p2 ,p3 i |r − r0 |                       1 + (A3 ; A1 ) + (A1 ; A2 ) + (A2 ; A3 )
                              3
                                                                      !
                             X   [ai−1 ; ai ]            |ai−1,i |
                           +                  arctanh                   .
                             i=1
                                  |ai−1,i |           |ai−1 | + |ai |
                                                                                          (7)
    Summarizing the process of calculating the vertical component of the field
∆g(q) from the model D, we have:
 1. for each element Di,j,k using coordinates transformation (3), obtain the set
    of faces S(D̂i,j,k ) of the approximating polyhedron D̂i,j,k (in the Cartesian
    system O0 x0 y 0 z 0 );
 2. with the help of formulas (6), (7) calculate the vertical component of the
    field Ĝi,j,k (q) for the polyhedron D̂i,j,k at the point q, the normal vector
    n(q) is calculated by formula (4);
 3. assume Gi,j,k (q) ≈ Ĝi,j,k (q) and by field summation of each D̂i,j,k (2) acquire
    ∆g(q).

3    Implementation
In the presented algorithm, computation of formula (7) is taking most of the
time of the running program. Indeed, in order to calculate the field of body that
consists of Nb discrete elements in Nf points of space, the expression should be
calculated N = kNb Nf times, where k is the amount of triangles in one element
of the body. In practice, N is of order of 1015 for high-resolution Earth crust
models (with surface area of order 1000×1000km and depth of 100km). With-
out using contemporary parallel technologies such computations would become
senseless since it would take months or even years to run. Hence, our goal is to
utilize most powerful computational devices available, which are GPUs.
    Here, we assume that all the preprocessing steps have been done and as our
input we have a set of triangles T (represented by triplets of points in space)
that has been obtained as a result of triangulation of the discrete elements of
the body. Additionally, we have a set of points Q, in which we want to calculate
the normal component (or a vector) of gravitational field of the body. We are
not going into details of this step because its computational complexity does not
depend on Nf ; thus, its becoming insignificant for large values of Nf .
    The field computation requires minimum two nested cycles through Q and T .
The iterations of the outer loop (over Q) are mutually independent and can be
split between different computing nodes or/and GPUs. The inner loop (over T )
can be expressed using map-reduce pattern: gq = reduce(mapq (T )), where g is
the resulting gravitational field vector the at the point q. The trick is that those
operations are implemented with high efficiency in the CUDA Thrust library [5]
in the form of a single function: thrust::transform reduce(). All we were left to
do is to implement formula (7) and pass it as a parameter.
100

4     Performance tests

All the tests have been performed using two regional Earth crust models that
have been previously reconstructed in Bulashevich Institute of Geophysics (Ural
Branch of Russian Academy of Sciences) as a result of solving inverse grav-
ity problem [2–4]. Test models characteristics are shown in Table 1. Krasovsky
Ellipsoid is taken as the reference one.


                      Table 1. Test models characteristics

                                     Discretization of the Discretization of the
                    Physical size   model (with respect to field (with respect to
      Name    (length×width×height) length, width, height)     length, width)
    Workload 1 793km×1057km×81km          256×256×81              256×256
    Workload 2 1336km×969km×81km          1336×969×81             334×243


   Comparison of our most resent CPU available (i7-6900K 8 physical cores @
3.2GHz) and our oldest GPU (GeForce GTX TITAN Black) showed that GPU
outperforms CPU from 15 to 20 times even for considerably small workloads
(Workload 1). So, future tests were conducted using only the GPUs.


                         Fig. 2. Performance test results
                                                                                101

    Figure 2 shows scalability potential of the algorithm. A slight drop of per-
formance after switching from 2 GPU to 4 and from 4 to 8 is mostly due to the
fact that the load has been split between GPUs evenly, when actualy it needs to
be carefully balanced (since our cluster consists of different models of GPUs3 ).
The resulting field for the second workload is shown on Fig. 3.


                 Fig. 3. The resulting field for the second workload


Acknowledgments

This work was partly supported by the Center of Excellence “Geoinformation
technologies and geophisical data complex interpretation” of the Ural Federal
University Program and by the The Russian Foundation for Fundamental Re-
search (project 17-05-00916 ).


References
 1. Heck, B., Seitz, K. A.: Comparison of the tesseroid, prism and point-mass ap-
    proaches for mass reductions in gravity field modelling. Journal of Geodesy. 81,
    121–136 (2007)
 2. Ladovskii, I. V., Martyshko, P. S., Byzov, D. D., Kolmogorova, V. V.: On Select-
    ing the Excess Density in Gravity Modeling of Inhomogeneous Media. Izvestiya,
    Physics of the Solid Earth, Vol. 53, No. 1, 130-139 (2017) https://doi.org/10.
    1134/S1069351316060057
3
    GPUs of the cluster: 2x Quadro M6000 24GB, 3x GeForce GTX TITAN Black, 3x
    GeForce GTX TITAN X
102

3. Martyshko, P. S., Ladovskii, I. V., Fedorova, N. V., Byzov, D. D., Tsidaev, A. G.:
   Theory and methods of complex interpretation of geophysical data. Yekaterinburg:
   Ural Department of Russian Academy of Sciences, 94p. (2016). ISBN 978-5-7691-
   2463-1.
4. Martyshko, P. S., Byzov, D. D., Ladovskii, I. V., Tsidaev, A. G.: 3D density
   models construction method for layered media. 15th International Multidisci-
   plinary Scientific GeoConference SGEM 2015, www.sgem.org, SGEM2015 Con-
   ference Proceedings, Albena. Bulgaria. Book 2 Vol. 1, 425–432 (2015) https:
   //doi.org/10.5593/SGEM2015/B21/S8.053
5. NVIDIA CUDA Toolkit Documentation. Trust.: http://docs.nvidia.com/cuda/
   thrust/index.html
6. Wild-Pfeiffer, F., Augustin, W. und Heck, B.: Optimierung der Rechenzeit bei der
   Berechnung der 2. Ableitungen des Gravitationspotentials von Massenelementen.
   Zeitschrift für Geodäsie. № 6 (132), 377–384 (2007)