=Paper= {{Paper |id=Vol-3041/241-245-paper-44 |storemode=property |title=Optimization of the Computation of Multidimensional Integrals for Estimating the Meson Lifetime |pdfUrl=https://ceur-ws.org/Vol-3041/241-245-paper-44.pdf |volume=Vol-3041 |authors=Daviti Goderidze,Yuri Kalinovsky,Alexandra Friesen }} ==Optimization of the Computation of Multidimensional Integrals for Estimating the Meson Lifetime== https://ceur-ws.org/Vol-3041/241-245-paper-44.pdf
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021



     OPTIMIZATION OF THE COMPUTATION OF
MULTIDIMENSIONAL INTEGRALS FOR ESTIMATING THE
               MESON LIFETIME
                    D. Goderidzea, Yu.L. Kalinovsky, A.V. Friesen
          Joint Institute for Nuclear Research, 6 Joliot-Curie, Dubna, 141980, Russia

                                       E-mail: a goderidze@jinr.ru


To calculate the lifetime of mesons in hot and dense nuclear matter, it is necessary to compute
multidimensional integrals with a complicated integrand function. This work presents an algorithm
and methods to calculate such integrals on the basis of the Monte-Carlo method. To optimize the
computation, the parallel calculation algorithm is implemented in the C++ programming language
using OpenMP and NVIDIA CUDA technologies. The calculations are performed on nodes with
multicore CPUs and Intel Xeon Phi coprocessors and the Nvidia Tesla K40 accelerator installed within
a heterogeneous cluster of the Meshcheryakov Laboratory of Information Technologies, Joint Institute
for Nuclear Research, Dubna. The code is used to calculate the pion lifetime using all possible pion-
pion scattering modes.

Keywords: meson lifetime, pion damping, Monte Carlo, parallelization



                                                  Daviti Goderidze, Yuri Kalinovsky, Alexandra Friesen



                                                             Copyright Β© 2021 for this paper by its authors.
                    Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).




                                                   241
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021



1. Meson lifetime
        The investigation of particles and their properties, including scattering and decay modes, the
lifetime and width of particles in critical conditions of heavy ion collisions, is an important problem in
modern physics. The theoretical description of strong-interaction matter under critical conditions is a
complicated task, and it is not always possible to find an analytical solution to the problem. Therefore,
the use of modern computer technologies plays a key role in theoretical research.
        This paper is devoted to the investigation of the pion damping width in a hot pion gas within
the NJL model. The main contribution to the width/lifetime comes from pion–pion collisions. To
calculate the meson lifetime in hot and dense nuclear matter, it is necessary to compute 5-dimensional
integrals with a complicated integrand function. This work presents an algorithm and methods for the
calculation of complicated integrals based on the Monte-Carlo method using the parallelization
algorithm to optimize the computation time.
         The meson lifetime 𝜏 is determined by the particle self-energy [1]:
                                              Π“ = 𝜏 βˆ’1 = 𝛴> (𝑝) Β± 𝛴< (𝑝),                                            (1)
where the sign depends on the type of the particles: Β«+Β» for fermions and Β«-Β» for bosons. The 𝛴 >
function is responsible for direct processes, when a particular state decays into other states. The 𝛴 <
function is responsible for inverse processes that can restore the decayed state due to the decay of
other particles to this state again. This process prolongs the lifetime of particles in the medium. The Ξ£
functions have a complicated form:
                              1             𝑑𝑝⃗1 1    1                           |𝑝⃗ |2
        𝛴>(<) (𝑝) = (2πœ‹)3 ∫ 𝑑𝛺 ∫                 ∫ 𝑑(π‘π‘œπ‘ π›Ό ) |𝑝⃗ |2( 𝑠 +π‘š 3)βˆ’|𝑝⃗ |𝐸 π‘π‘œπ‘ π›Ό |𝐴|2 𝐹>(<) ,                 (2)
                                            2𝐸 8πœ‹ βˆ’1
                                               1                        3   √ 2     1      1   3

where 𝑑𝛺 = 𝑑𝑠1 𝐴(𝑠1 )𝑑𝑠3 𝐴(𝑠3 )𝑑𝑠4 𝐴(𝑠4 ), the spectral function is calculated in the Breit-
Wigner form 𝐴 (𝑠𝑖 ) = 2𝑀𝑖 𝛀𝑖 ((𝑠𝑖 βˆ’ 𝑀𝑖2 )2 + 𝑀𝑖2 𝛀𝑖2 )βˆ’1 with 𝑠𝑖 = 𝑝𝑖2 . The factors 𝐹 > =
 𝑛1 (𝑛3 + 1)(𝑛4 + 1) and              𝐹< = (𝑛1 + 1)𝑛3 𝑛4 are introduced for brevity, and 𝑛𝑖 are
the particle distribution functions 𝑛𝑖 = (𝑒π‘₯𝑝(π›½βˆšπ‘ π‘– + 𝑝̅ 2 ) Β± 1)βˆ’1 according to the particle
statistics [1, 2]. The momenta 𝑝⃗3 are defined as:
                    |𝑝⃗1 |π‘Ž π‘π‘œπ‘ π›Ό Β± √|𝑝⃗1 |2 π‘Ž 2 π‘π‘œπ‘  2 𝛼 + ((βˆšπ‘ 2 + 𝐸1 )2 βˆ’ |𝑝⃗1 |2 π‘π‘œπ‘  2 𝛼)(π‘Ž2 βˆ’ 4𝑠3 (βˆšπ‘ 2 + 𝐸1 )2 )   (3)
      |𝑝⃗3 |1,2 =
                                                    2((βˆšπ‘ 2 + 𝐸1 )2 ) βˆ’ |𝑝⃗1 |2 π‘π‘œπ‘  2 𝛼
with π‘Ž = (βˆšπ‘ 2 + 𝑠1 )2 + 𝑠3 βˆ’ 𝑠4 + 𝑠1 βˆ’ 𝐸1 2 .
        To calculate the meson lifetime, the scattering amplitude |𝐴| of the process under investigation
should be calculated. The amplitude |𝐴| is generally considered as an invariant amplitude and depends
on the Mandelstam variables 𝑠, 𝑑, 𝑒, where 𝑠 + 𝑑 + 𝑒 = βˆ‘ 𝑠𝑖 .
         It is seen from Eq. (2) that the structure of the integrand function is complicated and depends
on many variables. The complex structure of the integrand is due to the kinematic conditions of
physical processes and is limited by the pole in the denominator and the negative value of the radical
expression. To show the 5-dimensional integration domain for the integral (2), we transfer it to the
plane (𝑝⃗1 , 𝑝⃗3 ), since by generating the arbitrary point (𝑠1 , 𝑠3 , 𝑠4 , 𝑝⃗1 , cos 𝛼) it is possible to determine 𝑝⃗3
for the fixed 𝑝⃗1 (Fig. 1).


2. Optimization of computing
        In this work, the Monte Carlo (MC) method [3] is used to calculate the integrals in Eq. (2). It
is well known that the accuracy of the MC method depends on the number of selected points, as well
as on the quality of the random number generator. Tina’s Random Number Generator (TRNG) [4] is
chosen for calculations as it has various distributions (uniform and non-uniform) and is also optimized
for using in parallel computations. Parallel calculations are required for the integrals in Eq. (2), and the
result and accuracy strongly depend on the number of selected points and the best convergence is

                                                              242
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021



achieved after the number of points reaches 2βˆ™107 (Fig. 2, left). The optimization of computations is
carried out in two directions: simplification of the functions and use of parallelization technology.
                                              0.8


                                              0.6



                                     | p3 |   0.4


                                              0.2


                                              0.0
                                                0.0              0.2            0.4                     0.6
                                                                           | p1 |
    Figure 1. Behavior of the function in the integration area using the (𝑝⃗1 , 𝑝⃗3 ) variables and fixing
                                                cos 𝛼=0.

        To obtain the selected point in the integration domain of Eq. (2), it is necessary to generate 5
random numbers. Taking into account the fact that cos(Ξ±) changes in [-1,1], the reflection of the
selected cos(Ξ±) from [0,1] to [-1,0] relative to zero, while saving the other variables, reduces the
required number of generations twice. Since 𝛴 > and 𝛴 < differ by the factor 𝐹 >(<), the positions of the
special points for these functions are the same, which allows calculating the functions Ξ£
simultaneously to reduce the total computation time twice. This method reduces the number of the
generated points and the total number of the required calculations by 4 times, thus accelerating the
computation process.
        According to the standard MC method, if the accuracy of the integral calculated over N
selected points is insufficient, the number of points should be increased. To optimize calculations, the
additional number of the required selected points is equal to the increase step M. The calculated results
for M points are added to the previous calculations to get the final answer, and the accuracy is checked
for N+M points.
        Parallelization is also used to accelerate the machine time for computations. For this purpose,
the number of generated points (N) is evenly distributed among all threads (P). The remainder of the
points (N mod P) is also additionally distributed over the required number of threads (instead of
sending it to one thread). The program is implemented for computations in both single-thread and
multi-thread versions. The multi-thread version uses such parallelization technologies as OpenMP and
NVIDIA CUDA [5, 6].

                                                                                                4
                       1.5
                                                                                                                       Error of S>
                                                                S>                              3                      Error of S<
                                                                                    Result, 10-10




                                                                S<
            Result, 10-8




                       1.0                                      G
                                                                                                2

                       0.5
                                                                                                1


                       0.0                                                                      0
                             0   2        4         6       8        10                             0   2     4    6         8       10
                                 Points number, 10      7
                                                                                                        Points number, 107

 Figure 2. Left panel: Π“, Ξ£>, Ξ£< as a function of the number of selected points. Right panel: upper error
                          bound as a function of the number of selected points.


                                                                          243
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021



         The tests are performed at a fixed temperature T = 0.15 GeV, π‘šπœ‹ = 0.143 GeV and a fixed
amplitude |A| = 1 for a different number of selected points. Fig. 2 (left) shows the results of
calculations of Π“, Ξ£>, Ξ£ as a function of the number of selected points. Fig. 2 (right) illustrates the
upper error bounds. As seen in Fig. 2 (left), after the limit of 3-4βˆ™107 points, the result becomes
insensitive to an increase in the number of calculations, which is caused by a decrease of the upper
error bounds.
        To speed up calculations, parallelization technologies are used. Table 1 shows the computation
time with the generation of 2*107 points for a different number of threads. The calculations are also
performed on the HybriLIT heterogeneous platform of the Meshcheryakov Laboratory of Information
Technologies, JINR, Dubna [7]. As demonstrated in Table 1, the use of HybriLIT parallelization
resources extremely reduces the computation time.
   Table 1. Comparison of the computation time using OpenMP, CUDA and hybrid CUDA+OpenMP
                                    parallelization. The number of selected points is equal to 2βˆ™10 7.




3. Pion damping
        The physical background of this work is the calculation of the pion lifetime in hot nuclear
matter. The amplitude of pion scattering is defined within the NJL model with meson-exchange
diagrams considered in the pole approximation, which leads to amplitudes similar to the linear sigma
model [8] with isospin amplitudes:
                                    2 𝑔2          3       1        1
                       𝐴0 = 5𝐢4πœ‹ + π‘”πœŽπ‘žπ‘ž πœŽπœ‹πœ‹ (          +      +        ),
                                               π‘€πœŽ2 βˆ’ 𝑠 π‘€πœŽ2 βˆ’ 𝑑 π‘€πœŽ2 βˆ’ 𝑒
                                       2   2       1       1
                                𝐴1 = π‘”πœŽπ‘žπ‘ž π‘”πœŽπœ‹πœ‹ ( 2      βˆ’       ),
                                                 π‘€πœŽ βˆ’ 𝑑 π‘€πœŽ2 βˆ’ 𝑒
                                          2    2       1       1
                             𝐴2 = 2𝐢4πœ‹ + π‘”πœŽπ‘žπ‘ž π‘”πœŽπœ‹πœ‹ ( 2    +        ),
                                                    π‘€πœŽ βˆ’ 𝑑 π‘€πœŽ2 βˆ’ 𝑒
where all parameters, the meson mass π‘€πœŽ and interaction constants π‘”πœŽπœ‹πœ‹ , π‘”πœŽqq, depend on temperature,
and 𝐢4πœ‹ is supposed to be a constant of the 4-pion interaction, 𝑠, 𝑑, 𝑒 are the Mandelstam variables.
The coupling strength π‘”πœŽπœ‹πœ‹ is defined as π‘”πœŽπœ‹πœ‹ (𝑇, πœ‡) = 2π‘”πœŽπ‘žπ‘ž π‘”πœ‹π‘žπ‘ž2 Ξ“
                                                                       πœŽπœ‹πœ‹ (𝑇, πœ‡), where Ξ“πœŽπœ‹πœ‹ (𝑇, πœ‡) is
the amplitude of the triangle vertex Οƒ β†’ ππ.




      Figure 3. Pion width (Ξ“) as a function of temperature. The results are shown for a different
    number of selected points and compared with the result of GSL-integration tools (VEGAS) [10].

        The temperature dependence for the masses and constants is chosen within the NJL model and
can be found, for example, in [9]. In the case of pion–pion collisions, the following processes
contribute to the total cross section:
                         (Ο€Ο€βŸΆΟ€Ο€) = 𝜎(Ο€0Ο€0βŸΆΟ€0Ο€0) + 𝜎(Ο€0Ο€0βŸΆΟ€+Ο€-) + 2𝜎(Ο€0Ο€+βŸΆΟ€0Ο€+).

                                                   244
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021



          The process Ο€0Ο€+βŸΆΟ€0Ο€+ occurs at the same rate as Ο€ 0Ο€-βŸΆΟ€0Ο€-, so we do not calculate them
separately; we just put the factor 2. The calculation result is given in Fig. 3. It is clearly seen, that in a
hot gas, the pion spectral function significantly broadens (due to an increase in the width) near the
critical temperature, which is about 0.17 GeV for this version of the NJL model.


4. Conclusion
        An algorithm for calculating the pion lifetime as a function of temperature is presented. Since
the integrals (2) have a complicated integrand function, it is necessary to select a large number of
points, which leads to an increase in the computation time.
         To minimize the error, tests with a constant amplitude and a fixed temperature are performed
(Fig. 2). The goal of the tests is to identify the minimal number of points for the initial step to plot the
pion width as a function of temperature (Fig. 3). However, it cannot be argued that the test results
obtained at a fixed temperature are relevant for all temperatures and a non-constant amplitude.
Therefore, for each subsequent temperature, an additional calculation is carried out. The calculation
with an increasing number of points (with the increase step M) is performed until the error obtained in
the current iteration becomes worse than in the previous one. Nevertheless, it is difficult to obtain a
smooth behavior of the function Ξ“(𝑇) near critical temperatures (Fig. 3). Fig. 3 shows a comparison
of our results with GSL-integration tools (VEGAS) [10].
       To optimize the computation time, integral optimization and parallelization technologies (such
as OpenMP, CUDA and their combination) are applied to the code. The use of the resources of the
HybriLIT cluster reduces the computation time and enhances code efficiency.


5. Acknowledgement
        The work was supported by the RFBR, grant No. 18-02-40137.

References
[1] L .P. Kadanoff and G. Baym, Quantum Statistical Mechanics //W. A. Benjamin, Icn., New York,
1962.
[2] D. Blaschke, M. K. Volkov, V. L. Yudichev, Pion damping width from SU (2) Γ— SU (2) NJL
model. // Phys.Atom.Nucl. vol. 66, p. 2233-2237, 2013 (nucl-th: 0303034).
[3] I. Sobol. Numerical Monte Carlo methods // Nauka Publishing. 1973, Moscow. (in Russian)
[4] H. Bauke. Tina’s Random Number Generator Library Version 4.24, March 27, 2021. Available
at: https://www.numbercrunch.de/trng/trng.pdf.
[5] A.S. Antonov. MPI and OpenMP parallel programming technologies: A textbook for universities.
Lomonosov Moscow State University; Supercomputer Consortium of Russian Universities. - Moscow:
Moscow University Press, 2012. – 344 p. (in Russian)
[6] CUDA C++ Programming Guide β€” Design Guide,                               May    2021.    Avaliable     at:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
[7] Heterogeneous Computing Cluster HybriLIT*. *Available at: //hybrilit.jinr.ru
[8] S. R. Colanch, P. Maris, QCD based quark description of pi-pi scattering up to the sigma and rho
region // Phys. Rev. D 66, 116010, 2002. (hep-ph: 0210151).
[9] Yu. L. Kalinovsky and A. V. Friesen, Properties of Mesons and Critical Points in the Nambu–
Jona-Lasinio Model with Different Regularizations// Phys. Part. Nucl. Lett., vol. 12, p. 737, 2015.
[10] GNU Scientific Library. Available at: //gnu.org/software/gsl/doc/html/montecarlo.html



                                                    245