<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Development of Parallel Software Code for Calculating the Problem of Radiation Magnetic Gas Dynamics and the Study of Plasma Dynamics in the Channel of Plasma Accelerator</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vladimir Bakhtin</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bauman Moscow State Technical University</institution>
          ,
          <addr-line>ul. Baumanskaya 2-ya, 5/1, 105005, Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Keldysh Institute of Applied Mathematics</institution>
          ,
          <addr-line>Miusskaya sq., 4, 125047, Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Lomonosov Moscow State University</institution>
          ,
          <addr-line>GSP-1, Leninskie Gory, 119991, Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <volume>1</volume>
      <issue>2</issue>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>The DVM-system is designed for the development of parallel programs for scientific and technical computations in the C-DVMH and Fortran-DVMH languages. These languages use a single parallel programming model (the DVMH model) and are extensions of the standard C and Fortran languages with parallelism specifications written in the form of directives to the compiler. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters, in the nodes of which accelerators (graphics processors or Intel Xeon Phi coprocessors) can be used as computing devices along with universal multi-core processors. The article describes the experience of the successful use of the DVM-system for the development of parallel software code for calculating the problem of radiation magnetic hydrodynamics and for the study of plasma dynamics in the channel of a plasma accelerator.</p>
      </abstract>
      <kwd-group>
        <kwd>Automation of Development of Parallel Programs</kwd>
        <kwd>DVM-system</kwd>
        <kwd>GPU</kwd>
        <kwd>Fortran</kwd>
        <kwd>Plasma Accelerator</kwd>
        <kwd>Radiation Magnetic Hydrodynamics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The current level of experimental and numerical studies makes it possible to determine
simultaneously the local values of macroscopic plasma parameters and the radiation
characteristics. This opens up new opportunities for comprehensive research, validation of models
and comparison of the results of calculations with the results of experimental
studies. It also determines the relevance of developing numerical models of
radiation transport for quasi-stationary plasma accelerators (hereinafter QSPA), which are
considered as injectors for thermonuclear installations. The study
of ionizing gas and plasma flows, corresponding respectively to the first and second stages of
the QSPA, is carried out on the basis of the
developed evolutionary models of radiation magnetic hydrodynamics, the effective
solution of which required the development of parallel software codes for calculations
on high-performance multiprocessor computer systems. Studies of plasma flows in
the second stage of the QSPA on the basis of the radiation magnetic
hydrodynamics model revealed the conditions that provide generation of deuterium-tritium
(D-T) plasma flows with thermonuclear parameters at discharge currents of up to 1 MA
in the second stage of the QSPA.</p>
    </sec>
    <sec id="sec-2">
      <title>Mathematical formulation of the problem</title>
      <p>
        The problem formulation includes the system of MHD equations taking into account the
finite conductivity of the medium, thermal conductivity and radiation transport. Under
the condition of quasineutrality $n_i = n_e = n$ and the equality $\mathbf{V}_i = \mathbf{V}_e = \mathbf{V}$ for a fully
ionized plasma we have:
$$
\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho \mathbf{V}) = 0, \qquad
\rho \frac{d \mathbf{V}}{d t} = -\nabla P + \frac{1}{c}\,\mathbf{j} \times \mathbf{H}, \qquad
\frac{d}{d t} = \frac{\partial}{\partial t} + (\mathbf{V} \cdot \nabla),
$$
$$
\frac{\partial (\rho \varepsilon)}{\partial t} + \operatorname{div}(\rho \varepsilon \mathbf{V}) + P \operatorname{div} \mathbf{V}
= \frac{j^{2}}{\sigma} - \operatorname{div} \mathbf{q} - \operatorname{div} \mathbf{W}, \qquad
\varepsilon = 2 c_{v} T, \qquad \mathbf{q} = -\kappa \nabla T,
$$
$$
\frac{\partial \mathbf{H}}{\partial t} = \operatorname{rot}(\mathbf{V} \times \mathbf{H}) - c \operatorname{rot}\left(\frac{\mathbf{j}}{\sigma}\right), \qquad
\mathbf{j} = \frac{c}{4 \pi} \operatorname{rot} \mathbf{H},
\qquad (1)
$$
      </p>
      <p>
        here $\rho = m_i n$ is the density, $P = P_i + P_e = 2 k_B n T$ is the total pressure, $\mathbf{q}$ is the heat
flux, $\kappa$ is the thermal conductivity, $\sigma = e^2 n_e / (m_e \nu_e)$ is the electrical conductivity, and $\mathbf{W}$
is the density of the radiation energy flux defined by (3) and (4).
      </p>
      <p>The radial plasma current flowing between the electrodes and the azimuthal magnetic
field provide the acceleration of the plasma behind the ionization front due to the Ampere
force $\frac{1}{c}\,\mathbf{j} \times \mathbf{H}$, where $\mathbf{j}$ is the current density in the plasma.</p>
      <p>As units of measurement we choose the length of the channel $L$, the characteristic
concentration or gas density at the inlet of the accelerator channel $n_o$ ($\rho_o = m_i n_o$) and the
temperature $T_o$. The characteristic value of the azimuthal magnetic field $H_o$ at the inlet
of the channel is defined by the discharge current in the device $J_p$, so that $H_o = 2 J_p / (c\, R_o)$,
where $R_o$ is the characteristic radius of the channel. These values allow us to form the units
of pressure $P_o = H_o^2 / 4\pi$, velocity $V_o = H_o / \sqrt{4\pi \rho_o}$, time $t_o = L / V_o$ and plasma
current $j_o = c H_o / (4\pi L)$. The set of MHD equations in the dimensionless variables
contains such dimensionless parameters as the ratio of the characteristic gas pressure
to the magnetic one, $\beta = 8\pi P_o / H_o^2$, and the magnetic Reynolds number $Re_m$
(the dimensionless conductivity entering it is $\sigma = \sigma_o T^{3/2}$), as well as the
dimensionless values of the thermal conductivity and the flux $\mathbf{W}$.</p>
      <p>
        The formulation of the problem includes the boundary conditions at the electrodes and
at the inlet and outlet of the accelerator channel. We assume that at the inlet of the
channel $z = 0$ the plasma is supplied with the known values of the density $\rho(r) = f_1(r)$
and the temperature $T(r) = f_2(r)$. Without solving an additional electric circuit equation it
is possible to assume that the current is kept constant and enters the system only
through the electrodes. Then at $z = 0$ we have $j_z = 0$, or $r H_{\varphi} = r_o = \mathrm{const}$, where
$r_o = R_o / L$. The boundary conditions at the electrodes $r = r_a(z)$ and $r = r_c(z)$ forming
the walls of the channel are based on the assumptions that the electrode surfaces are
equipotential ($E_{\tau} = 0$) and impermeable for the plasma ($V_n = 0$). At $r = 0$ we have
$V_r = 0$, $V_{\varphi} = 0$, $H_{\varphi} = 0$.  (2)
      </p>
      <p>
        The algorithm for the numerical solution of equations (1) assumes the mapping of the
computational domain onto a unit square in the $(y, z)$ plane using the relation
$r = (1 - y)\, r_c(z) + y\, r_a(z)$.
      </p>
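      <p>As a minimal illustration of this mapping (not taken from the authors' code; the routine
name and the wall arrays rc, ra are hypothetical, while RAD, NR and NZ follow the notation
used later in the program listings), the physical radius can be tabulated on a uniform grid of
the unit square:</p>
      <p>! Illustrative sketch: tabulate the physical radius r on the (y,z) unit square.
! rc(L) and ra(L) are the cathode and anode wall radii sampled at the z-nodes.
subroutine build_radius(NR, NZ, rc, ra, RAD)
  implicit none
  integer, intent(in)  :: NR, NZ
  real,    intent(in)  :: rc(NZ), ra(NZ)
  real,    intent(out) :: RAD(NR, NZ)
  integer :: M, L
  real    :: y
  do L = 1, NZ
    do M = 1, NR
      y = real(M - 1) / real(NR - 1)        ! unit-square coordinate y in [0,1]
      RAD(M, L) = (1.0 - y) * rc(L) + y * ra(L)
    end do
  end do
end subroutine build_radius</p>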
      <p>The finite-difference scheme with flux correction is used to calculate the
hyperbolic part of the MHD equations. The calculation of the magnetic viscosity and thermal
conductivity terms is based on the flux variant of the back-substitution (sweep) method.
Quasi-stationary flows are calculated by the time-relaxation (establishment) method. The
radiation transfer problem was solved within the framework of the developed 3D model.</p>
      <p>The rate of radiation propagation is significantly higher than the characteristic rates of
the plasmodynamic processes. In this case the radiation field adapts instantly to the
distribution of the flow parameters, and it is possible to reduce the problem to the solution
of the stationary equation of radiation transport:</p>
      <p>
        $$
(\mathbf{\Omega} \cdot \nabla)\, I_{\nu}(\mathbf{r}, \mathbf{\Omega})
= \eta_{\nu}(\mathbf{r}) - \kappa_{\nu}(\mathbf{r})\, I_{\nu}(\mathbf{r}, \mathbf{\Omega}),
\qquad (3)
$$
here $I_{\nu}(\mathbf{r}, \mathbf{\Omega})$ is the radiation intensity at the frequency $\nu$, emitted in the
direction of the spatial angle $\mathbf{\Omega}$ and corresponding to the point with the coordinate $\mathbf{r}$. The main
radiation characteristics, the density of the radiation energy $U$ and the density of the
radiation energy flux $\mathbf{W}$, are determined by the radiation intensity:
      </p>
      <p>
        $$
U(\mathbf{r}) = \frac{1}{c} \int_{0}^{\infty} d\nu \int_{4\pi} I_{\nu}(\mathbf{r}, \mathbf{\Omega})\, d\Omega, \qquad
\mathbf{W}(\mathbf{r}) = \int_{0}^{\infty} d\nu \int_{4\pi} I_{\nu}(\mathbf{r}, \mathbf{\Omega})\, \mathbf{\Omega}\, d\Omega.
\qquad (4)
$$
      </p>
      <p>The absorption coefficient $\kappa_{\nu}(\mathbf{r})$ and the emissivity $\eta_{\nu}(\mathbf{r})$ are known functions that
depend on the state of the medium, its density and temperature, and are sums of three
parts corresponding to (a) absorption and emission in lines; (b) photo-ionization and
photo-recombination; and (c) scattering.</p>
      <p>
        In accordance with equations (4) the problem of radiation transport in the flow of
ionizing gas and plasma should be solved in the three-dimensional formulation.
      </p>
      <p>
        A grid for the 3D problem of radiation transport is easily obtained by rotating
the initial grid in the plane of the variables $(z, r)$ by 360 degrees around the axis of the
channel with a certain step. The radiation intensity has to be determined in different
directions for the further calculation of the integral values in relations (4) in any node or
cell of the three-dimensional grid. For this purpose an additional angular grid is built in
the azimuthal and polar angles. The splitting of the complete spatial angle $\Omega = 4\pi$ into
elements of the angular grid is made by a method providing a uniform distribution
of rays in all directions. For each node in the calculation we use, as a rule, 440 rays for the
complete spatial angle.
      </p>
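      <p>One common way to build such a near-uniform angular grid (a sketch for illustration only,
not necessarily the method used by the authors) is to split the polar angle into bands and give
each band a number of azimuthal directions proportional to $\sin\theta$, so that all rays subtend
roughly equal solid angles:</p>
      <p>! Illustrative sketch of a near-uniform distribution of ray directions.
! The routine name and the arguments nTheta, dirs, nDirs are hypothetical.
subroutine build_angular_grid(nTheta, dirs, nDirs)
  implicit none
  integer, intent(in)  :: nTheta        ! number of polar bands
  real,    intent(out) :: dirs(3, *)    ! unit direction vectors Omega
  integer, intent(out) :: nDirs
  real, parameter :: pi = 3.14159265
  integer :: i, j, nPhi
  real    :: theta, phi
  nDirs = 0
  do i = 1, nTheta
    theta = (real(i) - 0.5) * pi / real(nTheta)
    ! more azimuthal directions near the equator, fewer near the poles
    nPhi = max(1, nint(2.0 * real(nTheta) * sin(theta)))
    do j = 1, nPhi
      phi = (real(j) - 0.5) * 2.0 * pi / real(nPhi)
      nDirs = nDirs + 1
      dirs(1, nDirs) = sin(theta) * cos(phi)
      dirs(2, nDirs) = sin(theta) * sin(phi)
      dirs(3, nDirs) = cos(theta)
    end do
  end do
end subroutine build_angular_grid</p>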
      <p>The ray tracing is carried out in accordance with the method of long
characteristics in order to determine the points of intersection of a ray with the faces of cells of
the three-dimensional grid and the position of the intersection of a ray with one of the
boundaries of the three-dimensional computational domain. In the process of tracing the
computational domain by rays emerging from a given grid node, the invisible shadow
regions are excluded from the calculation of the radiation energy flux for that node.
The coefficients $\kappa_{\nu}(\mathbf{r})$ and $\eta_{\nu}(\mathbf{r})$ are calculated from the average values of
density and temperature in the center of a cell. The intensity along rays or characteristics
passing through any number of homogeneous regions with known coefficients $\kappa_{\nu}$ and
$\eta_{\nu}$ is determined as a result of the addition of solutions for the homogeneous regions.</p>
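      <p>For clarity (a standard formal solution written out here, not quoted from the original text),
on a single homogeneous segment of length $\Delta l$ the transport equation (3) integrates in closed
form, so the intensity can be propagated along a ray segment by segment:
$$
I_{\nu}^{\mathrm{out}} = I_{\nu}^{\mathrm{in}}\, e^{-\kappa_{\nu} \Delta l}
+ \frac{\eta_{\nu}}{\kappa_{\nu}} \left( 1 - e^{-\kappa_{\nu} \Delta l} \right),
$$
where $I_{\nu}^{\mathrm{in}}$ is the intensity entering the segment and $\kappa_{\nu}$, $\eta_{\nu}$ are the
constant coefficients within it.</p>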
      <p>The more detailed formulation of the problem is presented in [1–3].</p>
    </sec>
    <sec id="sec-3">
      <title>Development of a parallel version of the program</title>
      <p>A parallel program code for the study of high-speed plasma flows in the channels of
quasi-stationary plasma accelerators is implemented using the DVM-system [4, 5] in
the Fortran-DVMH language.</p>
      <p>Let us consider the process of parallelizing this software package. The main
difficulty in developing a parallel program for a cluster is the need to make global
decisions on the distribution of data and computations taking into account the
properties of the entire program, and then to perform the complex work of modifying and
debugging the program. The large amount of program code, its multi-modularity and
multifunctionality make it difficult to make decisions on a consistent distribution of data
and computations.</p>
      <p>Incremental parallelization can be used to facilitate the development of a parallel
version of the program. The idea of this method is that not the whole program is
parallelized but only its parts (parallelization regions): additional instances of the required
data are created in them, and the distribution of this data and of the corresponding
computations is performed. The regions are selected based on the execution times obtained by
profiling the sequential program.</p>
      <p>To interact with those parts of the program that have not been parallelized, copy
operations from the original to the additional (distributed) data and back are used. In this
way we can isolate the areas of code we are interested in and parallelize each area
separately from the rest of the program. This makes it possible to test different approaches
to the distribution of data and computations within a region, which can
significantly change the structure of data storage, while any changes within the region do not
require any modifications of the code outside it. At the same time, it is much
easier to find the optimal distribution of data and computations in a small local
area than for the whole program.</p>
      <p>After individual regions have been parallelized, the parallel program is optimized to
minimize the amount of data copied. To do this, several regions can be combined into
one if an effective common distribution can be found for the combined region. It is also
possible to add less time-consuming fragments to a region if this reduces the number
of data copying operations. This minimizes the amount of program code that has to be
analyzed and parallelized: it is enough to consider only those fragments that
work with data already used within the parallelized region.</p>
      <p>Having optimal solutions for distributing data and calculations in local areas makes
it a little easier to find a common distribution for the combined area. However, in the
case where the optimal global distribution cannot be found for the combined domain,
incremental parallelization makes the process of finding local optimal distributions
much easier. This is very useful if the program consists of several parts, the data in
which differ significantly in structure.</p>
      <p>To solve this problem, several different mathematical models are used at once, so it
was decided to apply incremental parallelization, and it proved successful. The
most time-consuming parts of the program were determined using the performance
analyzer, which is part of the DVM-system. In terms of execution time, the most
significant part of the software package was the run_radiat function, which corresponds
to the calculation of radiation transfer in the 3D formulation
of the problem. In addition, considerable time was required to execute the iteration
loop responsible for the calculation of axisymmetric flows based on the MHD model.</p>
      <p>We describe the process of parallelizing the main iteration loop and the run_radiat
function.</p>
      <p>The iteration loop was a regular loop with a fixed number of iterations. The inner
loops were written using labels (Fig. 1):</p>
      <p>do 43 L = 1,NZ
do 43 M = 1,NR
  HFI(M,L) = HRFI(M,L) / RAD(M,L)
43 continue</p>
      <p>For convenience, all internal loops were rewritten without the use of labels, and all
arrays used in them were replaced (Fig. 2). The rejection of labels was a
purely technical transformation aimed at increasing the readability of the
program and facilitating parallelization.</p>
      <p>do L = 1,NZ
  do M = 1,NR
    new_HFI(M,L) = new_HRFI(M,L) / new_RAD(M,L)
  end do
end do</p>
      <p>The arrays were replaced in preparation for the further distribution of these arrays in
isolation from the rest of the program. Since the iteration loop is only a part of the
whole package, it was necessary to make sure that the distribution of the arrays would
not require any changes to the rest of the program. In total, about 45 arrays were
replaced in this way. All these arrays, both the originals and the copies, were made
dynamic, and their allocation took place just before the start of the iteration loop. At
that moment the original arrays, conversely, were deallocated so as not to consume
extra memory (Fig. 3):</p>
      <p>allocate(new_HRFI(NR,NZ))
new_HRFI = HRFI
deallocate(HRFI)
&lt;iteration loop&gt;
allocate(HRFI(NR,NZ))
HRFI = new_HRFI
deallocate(new_HRFI)</p>
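      <p>A minimal sketch (assumed directives, not quoted from the authors' code) of how such
dynamic copies can be declared as distributed data in Fortran-DVMH:</p>
      <p>! Hypothetical declarations for the copies from Figs. 2-3; the actual
! distribution used in the real code is not shown in the text.
real, allocatable :: new_HFI(:,:), new_HRFI(:,:), new_RAD(:,:)
!DVM$ DISTRIBUTE new_HRFI(*,BLOCK)
!DVM$ ALIGN new_HFI(M,L) WITH new_HRFI(M,L)
!DVM$ ALIGN new_RAD(M,L) WITH new_HRFI(M,L)
! ... later, just before the iteration loop:
allocate(new_HRFI(NR,NZ), new_HFI(NR,NZ), new_RAD(NR,NZ))
new_HRFI = HRFI
deallocate(HRFI)</p>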
      <p>The above transitions from one array to another were placed before and after the
iteration loop if the data stored in these arrays were used, respectively, inside or after
the iteration loop. Most of the loops had fairly simple array access patterns: they used either
the elements corresponding to the loop iteration or the nearest neighbors of those
elements (Fig. 4):</p>
      <p>do L = 2,NZM1
  do M = 2,NRM1
    FGB = (GAM - 1.) / BB * new_RDY(L) / new_TEM(M,L)
    DRHDY = (new_HRFI(M + 1,L) - new_HRFI(M - 1,L)) / DY2
    DRHDZ = (new_HRFI(M,L + 1) - new_HRFI(M,L - 1)) / DZ2
    ! more code
  end do
end do</p>
      <p>For arrays such as new_HRFI from the example above, the corresponding shadow
edges were specified during distribution (the SHADOW specification). The vast majority
of loops were not tightly nested, which did not allow them to be distributed
simultaneously in two dimensions. It was also impossible to perform the distribution
along any single dimension, since there were loops with direct dependencies in both one
and the other dimension (Fig. 5):</p>
      <p>Similar loops existed in the other dimension as well. A significant number of loops also
had a significant number of accesses to potentially remote data in one of the dimensions
(Fig. 6):</p>
      <p>do L = 2,NZM1
  ! more code
  PPL = VR1 * new_RAD(1,L) * new_RDY(L) * new_PLT(1,L)
  ! more code
end do</p>
      <p>Fig. 6. Potentially remote data</p>
      <p>To solve these problems, it was decided to use two data distribution templates
(the TEMPLATE specification), one for each dimension. In one case each
process received a part of the "rows" of a two-dimensional array and worked with them;
in the other, each process received a part of the "columns" of the two-dimensional array
and processed them. For each loop, a dimension without dependencies and without
accesses to remote data was selected, and parallelization was performed on it.
The previously created copies of arrays were distributed along the dimensions on which the
loops that used them were parallelized. If an array was used in loops that were
parallel in both one and the other dimension, two copies were
created for it at once. Between loops parallelized along different dimensions, switching
between the templates was performed when necessary (Fig. 7):</p>
      <p>!DVM$ PARALLEL(L) ON new_PLT(*,L), PRIVATE(M)
do L = 2,NZ
  do M = 1,NR
    ! more code
  end do
end do
new2_PLT = new_PLT
!DVM$ PARALLEL(M) ON new2_PLT(M,*), PRIVATE(L)
do M = 2,NR
  do L = 1,NZ
    ! more code
  end do
end do</p>
      <p>There were only 2 serious switches in the program - the transfer of 8 arrays from
the distribution on the second dimension to the distribution on the first, and the
transfer of the same 8 arrays back. The program required another similar translation back
and forth, but only for one array. About 30 arrays were distributed over the created
templates, of which about 20 had copies for both distribution templates. Also, more
than 50 different loops were parallelized.</p>
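      <p>A minimal sketch of what the two templates and the alignment of the "column-wise" and
"row-wise" copies could look like (the template names TCOL and TROW and the bounds
NRMAX and NZMAX are assumptions for illustration, not the authors' declarations):</p>
      <p>! Two distribution templates: TCOL cuts the grid into blocks of "columns" (L),
! TROW into blocks of "rows" (M).
!DVM$ TEMPLATE TCOL(NRMAX,NZMAX)
!DVM$ DISTRIBUTE TCOL(*,BLOCK)
!DVM$ TEMPLATE TROW(NRMAX,NZMAX)
!DVM$ DISTRIBUTE TROW(BLOCK,*)
real, allocatable :: new_PLT(:,:), new2_PLT(:,:)
! new_PLT is used in loops parallelized over L, new2_PLT in loops parallelized over M
!DVM$ ALIGN new_PLT(M,L) WITH TCOL(M,L)
!DVM$ ALIGN new2_PLT(M,L) WITH TROW(M,L)</p>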
      <p>In the run_radiat function, 5 loops were parallelized. These loops used a variety of
data structures and arrays of unique formats (Fig. 8):</p>
      <p>do inode = 1,mesh3d%NNodes
  den3d%fval(inode) = 0.e0_r8p
  tem3d%fval(inode) = 0.e0_r8p
  do i = 1,valsInterp%ielem(inode)%nSrc
    n1 = valsInterp%ielem(inode)%iSrc(i)
    eps = valsInterp%ielem(inode)%wSrc(i)
    den3d%fval(inode) = den3d%fval(inode) + den2d%fval(n1)*eps
    tem3d%fval(inode) = tem3d%fval(inode) + tem2d%fval(n1)*eps
  end do
end do</p>
      <p>Since the main arrays in these loops had unique structures that could not be
correlated well with each other, 4 different distribution templates were created for the 5
loops (2 of the 5 loops were parallelized using the same template). All
distributed arrays were replaced with ordinary arrays of standard types; the remaining arrays
and data structures were left in their original form. The final version of the loop from
the example above after the transformations and parallelization looked like this (Fig. 9):</p>
      <p>!DVM$ PARALLEL(inode) ON new_den3d(inode), PRIVATE(n1,eps,i)
do inode = 1,m3d
  new_den3d(inode) = 0.e0_r8p
  new_tem3d(inode) = 0.e0_r8p
  do i = 1,valsInterp%ielem(inode)%nSrc
    n1 = valsInterp%ielem(inode)%iSrc(i)
    eps = valsInterp%ielem(inode)%wSrc(i)
    new_den3d(inode) = new_den3d(inode) + new2_den2d(n1)*eps
    new_tem3d(inode) = new_tem3d(inode) + new2_tem2d(n1)*eps
  end do
end do</p>
      <p>The main difficulty in parallelizing this part of the program was the indirect
addressing highlighted in the examples above. During any iteration of the loop the
program can access an arbitrary number of different elements of the indirectly addressed array,
possibly even all of them. Moreover, the situation when each process requests almost all
the elements of an indirectly addressed array (that is, the values of n1 for the set of
iterations allocated to the process run through almost all possible elements of the den2d%fval
and tem2d%fval arrays) is normal for this task. Therefore, it was decided to
write data to distributed arrays, and if a subsequent loop addressed such an array
indirectly, it was replicated on all processes. Given that every process uses almost
every element of the array, the extra overhead of making a full copy is negligible. After
the loop shown above, for example, the redistribution took the form (Fig. 10):</p>
      <p>new2_den3d = new_den3d
new2_tem3d = new_tem3d</p>
      <p>Fig. 10. Copying data after loop execution</p>
      <p>The prefix new_ denotes distributed arrays and new2_ non-distributed (replicated) ones.
A separate problem was the presence of a sum reduction with indirect
addressing (Fig. 11):</p>
      <p>do idir = 1,NDirs
  do iEn = 1,NEnGroups
    do isegm = 1,NPoints
      inode = pTrace%rayTree(idir)%segm(isegm)%inode
      ! more code
      cUden%fval(inode) = cUden%fval(inode) + cU_ptX
    end do
  end do
end do</p>
      <p>In this fragment of the program, when any iteration of the loop is performed, an
arbitrary element of the array cUden%fval(inode) can be written to, and these
contributions need to be summed. To solve this problem, a special distributed array was created,
allowing each iteration of the loop to carry out the summation in its own "column".
After that, the summation over the last dimension of the array was performed
(Fig. 12):</p>
      <p>do idir = 1,NDirs
!DVM$ PARALLEL(iEn) ON new_cuden3d(*,iEn)
  do iEn = 1,NEnGroups
    do isegm = 1,NPoints
      inode = pTrace%rayTree(idir)%segm(isegm)%inode
      ! more code
      new_cuden3d(inode,iEn) = new_cuden3d(inode,iEn) + cU_ptX
    end do
  end do
end do
!DVM$ PARALLEL(iEn) ON new_cuden3d(*,iEn), PRIVATE(inode),
!DVM$* REDUCTION(SUM(new2_cuden3d))
do iEn = 1,NEnGroups
  do inode = 1,m3d
    new2_cuden3d(inode) = new2_cuden3d(inode) + new_cuden3d(inode,iEn)
  end do
end do</p>
      <p>As a result, each process received a correct copy of the reduction array.</p>
      <p>Based on the results of the parallelization, a version of the program was obtained that
can be run on multi-core clusters.</p>
    </sec>
    <sec id="sec-4">
      <title>Efficiency of parallel program</title>
      <p>Calculations using the parallel version of the program were performed on two
multiprocessor high-performance computing systems, K-100 (KIAM RAS) and MVS1P5
(JSCC RAS). The execution times of the program in seconds on different numbers of
cores are shown in Figures 13 and 14.</p>
      <p>The calculation of one time step is shown, determined by the Courant condition in
solving the evolutionary problem based on the MHD model. Curves 1 in the figures
correspond to calculations using the long characteristics method, which requires the
most computational resources, since the fully three-dimensional problem is solved
and the integral characteristics are calculated in all nodes of the three-dimensional
coordinate grid. Curves 2 and 3 correspond to the calculations of the integral
characteristics of radiation in one plane of variables taking into account the axial symmetry
of the flow. Along with the method of long characteristics, the method of short
characteristics was implemented, which corresponds to curves 2. The method of short
characteristics leads to numerical diffusion in the calculation of the radiation field,
and it requires more computation time than the method of long characteristics,
represented by curves 3 in the figures, provided that the integral
characteristics of radiation in one plane of variables are calculated taking into account the
axial symmetry of the flow.</p>
      <p>Fig. 15. Acceleration of the program execution process relative to the initial version on a
different number of K-100 cores</p>
      <p>Figures 15 and 16 show how the acceleration of the parallel version of the program
relative to the original serial version changes with the number of cores of the K-100
and MVS1P5 computer systems. Curves 1, 2 and 3 in these figures correspond to the same
calculation conditions specified above.</p>
      <p>Fig. 16. Acceleration of the program execution process relative to the initial version on a
different number of MVS1P5 cores</p>
      <p>Weaker acceleration on K-100 for curve 3 in Fig. 15 for 32 processors is explained
by the fact that the program execution time is already very short, less than 30 seconds,
and overhead costs begin to occupy a significant percentage of the program time.</p>
      <p>The acceleration of the parallel version of the program on a single processor can be
explained by more efficient memory accesses. Replacing some pointer-based
structures with a set of regular arrays reduced the total number of memory accesses. In
some cases it also made it possible to remove indirect addressing, which increased the
efficiency of the cache memory. The resulting acceleration depends significantly on the grid
used, the mode of operation of the program and the memory characteristics of a
particular machine. It can be enough to cover all the overhead arising from
parallelization and even to speed up the program on a single processor a little.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>This article describes the experience of the successful use of the DVM-system for the
development of parallel software code for calculating the problem of radiation magnetic
hydrodynamics and for the study of plasma dynamics in the channel of a plasma
accelerator.</p>
      <p>The use of the DVM-system made it possible to accelerate the development of the parallel
program. At the same time, the following accelerations of the program execution relative
to the original version were achieved: up to 24 times on 32
processors of the K-100 computer complex (KIAM RAS) and up to 22 times on 28
processors of the MVS1P5 computer system (JSCC RAS).</p>
      <p>Numerous calculations performed using the parallel version of the program have
shown that the value of the discharge current required to achieve a given ion energy increases
in proportion to the size of the installation. It was found that a decrease in the
characteristic plasma concentration at the inlet of the accelerator channel can
significantly reduce the required values of the discharge currents. The values of the discharge
current in the device were determined that provide the ion energy at the output at the level
necessary for the subsequent D-T fusion reaction in the magnetic traps confining the
plasma. The discharge currents found are quite acceptable for existing QSPA
installations.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konovalov</surname>
            ,
            <given-names>V.S.</given-names>
          </string-name>
          :
          <article-title>Numerical study of the ionization process and radiation transport in the channel of plasma accelerator</article-title>
          .
          <source>Communications in Nonlinear Science and Numerical Simulation</source>
          ,
          <volume>51</volume>
          ,
          <fpage>169</fpage>
          -
          <lpage>179</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          :
          <article-title>The study of plasma flows in accelerators with thermonuclear parameters</article-title>
          .
          <source>Plasma Physics and Controlled Fusion</source>
          ,
          <volume>59</volume>
          (
          <issue>11</issue>
          ), Art. 115004, pp. 1-7 (
          <year>2017</year>
          ), https://doi.org/10.1088/1361-6587/aa86be.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konovalov</surname>
            ,
            <given-names>V.S.</given-names>
          </string-name>
          :
          <article-title>Radiation transport in the ionizing gas flow in the quasisteady plasma accelerator</article-title>
          .
          <source>Journal of Physics: Conference Series</source>
          , vol.
          <volume>946</volume>
          , 012165 (
          <year>2018</year>
          ), https://doi.org/10.1088/1742-6596/946/1/012165.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. C-DVMH language, C-DVMH compiler, compilation, execution and debugging of DVMH programs, http://dvm-system.org/static_data/docs/CDVMH-reference-en.pdf, last accessed 2019/11/21.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Fortran DVMH language, Fortran DVMH compiler, compilation, execution and debugging of DVMH programs, http://dvm-system.org/static_data/docs/FDVMH-user-guide-en.pdf, last accessed 2019/11/21.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>