<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.18287/1613-0073-2016-1638-451</article-id>
      <title-group>
        <article-title>PARALLEL IMPLEMENTATIONS OF PARAMETRIC IDENTIFICATION ALGORITHMS FOR THREE- DIMENSIONAL CRYSTAL LATTICES</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>D.V. Kirsh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A.V. Kupriyanov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Image Processing Systems Institute - Branch of the Federal Scientific Research Centre “Crystallography and Photonics” of Russian Academy of Sciences</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Samara National Research University</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>1638</volume>
      <fpage>451</fpage>
      <lpage>459</lpage>
      <abstract>
        <p>The article describes parallel implementations of two parametric identification algorithms. The first algorithm is based on the estimation of Bravais unit cell parameters, and the second one estimates Wigner-Seitz cell volumes. The developed parallel implementations are based on the two-tier model of concurrency. An external tier of concurrency uses Message Passing Interface (MPI) technique to distribute parametric identification tasks between computing nodes. An internal tier uses Open Multi-Processing (OpenMP) technique to split single steps of the parametric identification algorithms into independent subtasks. Experimental results on multi-processor / multi-core systems have demonstrated almost linear speedup of the parallel implementations, confirmed the effectiveness of the two-tier concurrency model and proved the stability of the identification approach under lattice distortion.</p>
      </abstract>
      <kwd-group>
        <kwd>crystal lattice</kwd>
        <kwd>unit cell</kwd>
        <kwd>parametric identification</kwd>
        <kwd>parallel algorithm</kwd>
        <kwd>MPI</kwd>
        <kwd>OpenMP</kwd>
        <kwd>scalability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Structural identification of crystal lattices in three-dimensional space is one of the
basic problems of X-ray diffraction analysis [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The main difficulties are the
reconstruction of three-dimensional objects [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ] and the ambiguity of unit cell choice
from which the ambiguity of lattice system identification arises [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
To solve the problem of structural identification in three dimensional space, we
proposed two approaches: the first was based on parametric identification methods [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
and the second was based on application of fuzzy neural networks [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. According to
conducted experiments, the first approach proved to be significantly more universal
than the second one. In the beginning, it estimates a number of parameters: edge
lengths and angle values of Bravais unit cells [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and volumes of Wigner-Seitz cells
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Then the proposed measures calculate the similarity between estimated
parameters and the set of reference ones. However, thorough research showed that structural
identification of high accuracy (above 95%) requires a huge database of reference
lattices [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Therefore, the reverse side of the proposed approach is its high computational
complexity. To overcome this drawback, we decided to develop the parallel
implementations of parametric identification algorithms that would use the computing power of
modern multi-processor / multi-core systems with maximum efficiency.
The main requirement for the developed parallel implementations was scaling
efficiency. Consequently, it was necessary to provide a large number of independent
computing tasks that could be executed in parallel. To solve the problem of scaling,
we proposed a two-tier concurrency model, described in the article.</p>
    </sec>
    <sec id="sec-2">
      <title>External concurrency tier of parametric identification algorithms</title>
      <p>The developed methods of crystal lattice parametric identification estimate and
compare parameters of the unit cells that form the entire lattice. It is obvious that these
methods have a high degree of locality: the initial data they require are coordinates of
a few, the most closely located, lattice points (Fig. 1.).
It should be noted that the estimation of parameters for each unit cell can process
independently. The necessity of data exchange appears only at the final stage, when
the analysis of unit cell parameters and the calculation of the entire lattice parameters
take place.</p>
      <p>Hence, the proposed allocation of parameter estimation tasks called the external tier
of concurrency is universal for all parametric identification method. It can be
described by the following algorithm:
1. Find all lattice points that do not belong to any lattice face – the step executes on
the master node.
2. Scatter uniformly the found lattice points in conjunction with the nearest points
that lies inside the sphere of a given radius among the computing nodes.
3. Calculate parameters for each set of lattice points – the step executes by all
computing nodes in parallel.
4. Gather all calculated parameters and estimate final ones for the entire lattice – the
step executes one the master node.</p>
      <p>
        It is obvious that the described algorithm of parallel implementation corresponds to
the MPI technique [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]: computations are local, communications between tasks are
minimal and appear only at the initial and final stages.
      </p>
      <p>The proposed approach allows to avoid unbalanced load distribution (the number of
tasks on each computing nodes will not differ by more than one task).</p>
    </sec>
    <sec id="sec-3">
      <title>Internal concurrency tier of parametric identification algorithms</title>
      <p>However, in case of using a large amount of multi-core processors, the number of
independent tasks that calculate unit cell parameters may be insufficient to load all
available computing cores. Due to the lack of independent tasks on the external
concurrency tier, we decided to organize an additional internal tier of concurrency for
both parametric identification method by parallelizing the steps of parameter
estimations.</p>
      <p>
        Parallel implementation of the Bravais unit cell parameter estimation
An algorithm of the Bravais unit cell parameter estimation is a sequence of steps that
must be carried out strictly one by one [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Consequently, the concurrency can be
organized only within the steps.
      </p>
      <p>The computational complexity of the algorithm steps for a lattice with N lattice points
are as follows:
1. Select the first nearest node – O (N).
2. Exclude nodes that lie on the same line with the selected node – O (N).
3. Select the second nearest node – O (N).
4. Exclude nodes that lie in the same plane with the two selected nodes – O (N).
5. Select the third nearest node – O (N).
6. Calculate the parameters of the Bravais unit cell – O (1).</p>
      <p>
        The first five steps have the maximum computational complexity. Consequently, the
parallel execution of these steps will allow to achieve the maximum scalability.
Figure 2 demonstrates the scheme of the proposed parallel implementation.
Parallel implementation of the Wigner-Seitz unit cell volume estimation
An algorithm of the Wigner-Seitz unit cell volume estimation is a sequence of steps
that must be carried out strictly one by one [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. As for the previous algorithm, the
concurrency can be organized only within the steps. Figure 3 demonstrates the
scheme of the proposed parallel implementation.
The computational complexity of the algorithm steps for a lattice with N lattice points
and L scattered points that is used to calculate the cell volume are as follows
1. Calculate normal vectors to the boundary planes – O (N).
2. Generate L values of uniformly distributed vector in 3D – O (L).
3. Count the number of points that fell inside the boundary planes – O (L×N).
4. Calculate the volume of the Wigner-Seitz unit cell – O (1).
      </p>
      <p>Due to the fact that the second and the third steps have the greatly higher
computational complexity (the minimum number of scattered points to ensure a sufficient
level of volume calculation accuracy is 100 000 points), the parallel execution should
be organized within these steps, leaving other steps sequential.</p>
      <p>
        Common features of the developed parallel implementations of the both algorithms
are data dependence and a large number of communications at each step and between
the steps. Thus, the most appropriate model for the parallel implementation is an
OpenMP model [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>T
S p  1 ,</p>
      <p>Tp
(1)</p>
    </sec>
    <sec id="sec-4">
      <title>Experimental results</title>
      <p>The aim of this experiment was an investigation of the dependence between the
speedup of the developed parallel implementations and the number of computing
nodes / identification tasks.</p>
      <p>
        The speedup obtained using a parallel implementation for p cores compared with a
sequential algorithm was determined by the following equation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
where T1 – the execution time of a sequential algorithm;
Tp – the execution time of a parallel implementation on p cores.
      </p>
      <p>According to the equation (1), the best value of the speedup is a linear speedup when
the execution time of parallel implementation growths linearly with the increase of
core number.</p>
      <p>A two triclinic lattices were used as an analyzed lattices: small (s) and big (b). Lattice
sizes were chosen in order to provide the following time of parameter estimation
computing on a single core: about 1 second for the small lattice; about 5 minutes for
the big one. This selection allows to estimate speedup at a small number of concurrent
tasks when the overhead of parallel computing prevails, as well as predict the speedup
growth with the increase of the lattice size, the number of cores and the task
complexity.</p>
      <sec id="sec-4-1">
        <title>Investigation of the speedup on multi-core personal computer</title>
        <p>The experiment was conducted on a desktop computer based on a quad-core
processor Intel Core i5 3470 (base frequency 3.8 GHz).</p>
        <p>Lattice size for each parametric identification method were as follows:
 the method of Bravais unit cell parameters estimation – 24 and 51 translations in
each direction;
 the method of Wigner-Seitz unit cell volume estimation – 8 and 17 translations in
each direction.</p>
        <p>The obtained data are presented in the form of then diagram showing the speedup of
the parallel implementations for different numbers of cores (Fig. 4).
The results confirm that the parallel implementations of both parametric identification
methods provide almost linear speedup over the whole range of used lattice sizes and
core configurations. The difference between the obtained values of speedup for small
and big lattice is less than 5%.</p>
        <p>Investigation of the speedup on multiprocessor / multicore cluster system
The experiment was conducted on 16 cluster nodes of supercomputer “Sergey
Korolev”. Each node contained two quad-core Intel Xeon X5560 processor (base
frequency 2.80 GHz).</p>
        <p>Lattice size for each parametric identification method were as follows:
 the method of Bravais unit cell parameters estimation – 19 and 37 translations in
each direction;
 the method of Wigner-Seitz cell volume estimation – 4 and 13 translations in each
direction.</p>
        <p>It should be noted that the lattice of 4 translations cannot provide a sufficient number
of parameter calculation tasks that is necessary for the uniform loading of all
computing cores of the cluster (only 27 parallel tasks). In this case, the internal concurrency
tier begins to play the leading role balancing the processing load on each core.
The obtained data are presented in the form of the diagram showing the speedup of
parallel implementations for different numbers of computing nodes (Fig. 5).
From the presented diagram it can be seen, the parallel implementations provide
almost linear speedup on a single octa-core node of the cluster. These results indirectly
confirms the results from the previous section. As expected for the small lattices, the
worst speedup was shown by the parallel implementation of the Wigner-Seitz cell
volume estimation. However, even in the case of small lattice, the parallel
implementation provides the speedup significantly higher than the number of the cluster nodes.
This result confirms the effectiveness of the two-tier model of concurrency.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Investigation of the stability under lattice distortion</title>
        <p>An ideal lattice is only a mathematical simplification, since real lattices always
contains some dislocations and impurities in their structure. This fact can crucially affect
the accuracy of parameter estimation, especially if we analyze not the whole lattice
structure but only a single unit cell. The example of a distorted lattice and incorrectly
estimated parameters of Brave and Wigner-Seitz unit cells are shown in Figure 6.
a)
b)
c)
In the last experiment, we investigated the stability of the proposed parameter
estimation algorithms under lattice distortion implemented by randomly added/removed
lattice points in the lattice structure. A cubic lattice contained 1000 lattice points (10
translations in each direction) was a sample. The number of randomly added/removed
lattice points varied from 10 to 100 points. The proposed parallel algorithms was used
to estimate unit cell parameters whereas the similarity measures determined the
accuracy of corresponding parameter estimations. Results of the experiment are presented
in Figure 7.
According to the graph, the edge similarity showed the maximum stability. Its value
did not fall below 0.98 at 10 % of added/removed lattice points. On the contrary, the
volumes similarity measure demonstrated the lowest stability among other measures.
However, even at the maximum level of distortion, its value was more than 0.9. The
obtained results prove high stability of the developed parameter estimation algorithms
to random distortion of the lattice structure.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>The developed parallel implementations of parametric identification algorithms for
crystal lattices in three-dimensional space effectively combine the two basic
approaches to parallel programming: MPI at the external concurrency tier and OpenMP
at the internal concurrency tier.</p>
      <p>The experiments have shown that the developed parallel implementations achieves
almost linear speedup on both a desktop computer and a cluster. Therefore, the use of
a two-tier concurrency model allows to load the computing powers of
multiprocessor / multicore systems with maximum efficiency.</p>
      <p>In addition, the developed parallel implementation helped to prove the stability of the
proposed approach of crystal lattice parameter estimation to random distortion of the
lattice structure.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work was partially supported by the Ministry of education and science of the
Russian Federation in the framework of the implementation of the Program of
increasing the competitiveness of SSAU among the world’s leading scientific and
educational centers for 2013-2020 years; by the Russian Foundation for Basic Research
grants (# 14-07-97040, # 15-29-03823, # 15-29-07077, # 16-57-48006); by the ONIT
RAS program # 6 “Bioinformatics, modern information technologies and
mathematical methods in medicine” 2016.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Tilley</surname>
            <given-names>R</given-names>
          </string-name>
          .
          <article-title>Crystals and crystal structure</article-title>
          .
          <source>West Sussex: John Wiley &amp; Sons Ltd</source>
          ,
          <year>2006</year>
          :
          <fpage>17</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Fursov</surname>
            <given-names>VA</given-names>
          </string-name>
          ,
          <article-title>Goshin YeV</article-title>
          .
          <article-title>Information technology for digital terrain model reconstruction from stereo images</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2014</year>
          ;
          <volume>38</volume>
          (
          <issue>2</issue>
          ):
          <fpage>335</fpage>
          -
          <lpage>342</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kotov</surname>
            <given-names>AP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fursov</surname>
            <given-names>VA</given-names>
          </string-name>
          ,
          <article-title>Goshin YeV</article-title>
          .
          <article-title>Technology for fast 3d-scene reconstruction from stereo images</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2015</year>
          ;
          <volume>39</volume>
          (
          <issue>4</issue>
          ):
          <fpage>600</fpage>
          -
          <lpage>605</lpage>
          . [in Russian] DOI: 10.18287/
          <fpage>0134</fpage>
          - 2452-2015-39-4-
          <fpage>600</fpage>
          -605.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Shirokanev</surname>
            <given-names>AS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirsh</surname>
            <given-names>DV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          .
          <article-title>Researching methods of reconstruction of three-dimensional crystal lattice from images of projections</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          ,
          <year>2015</year>
          ;
          <volume>1490</volume>
          :
          <fpage>290</fpage>
          -
          <lpage>297</lpage>
          . DOI:
          <volume>10</volume>
          .18287/
          <fpage>1613</fpage>
          -0073-2015-1490-290-297.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kirsh</surname>
            <given-names>DV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          .
          <article-title>Modeling and Identification of Centered Crystal Lattices in Three-Dimensional Space</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          ,
          <year>2015</year>
          ;
          <volume>1490</volume>
          :
          <fpage>162</fpage>
          -
          <lpage>170</lpage>
          . DOI:
          <volume>10</volume>
          .18287/
          <fpage>1613</fpage>
          -0073-2015-1490-162-170.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirsh</surname>
            <given-names>DV</given-names>
          </string-name>
          .
          <article-title>Estimation of the Crystal Lattice Similarity Measure by Three-Dimensional Coordinates of Lattice Nodes</article-title>
          .
          <source>Optical Memory &amp; Neural Networks (Information Optics)</source>
          ,
          <year>2015</year>
          ;
          <volume>24</volume>
          (
          <issue>2</issue>
          ):
          <fpage>145</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Soldatova</surname>
            <given-names>OP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lezin</surname>
            <given-names>IA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lezin</surname>
            <given-names>IV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirsh</surname>
            <given-names>DV</given-names>
          </string-name>
          .
          <article-title>Application of fuzzy neural networks for defining crystal lattice types in nanoscale images</article-title>
          .
          <source>Computer Оptics</source>
          ,
          <year>2015</year>
          ;
          <volume>39</volume>
          (
          <issue>5</issue>
          ):
          <fpage>787</fpage>
          -
          <lpage>795</lpage>
          . DOI:
          <volume>10</volume>
          .18287/
          <fpage>0134</fpage>
          -2452-2015-39-5-
          <fpage>787</fpage>
          -794.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kirsh</surname>
            <given-names>DV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          .
          <article-title>Identification of Three-Dimensional Crystal Lattices by Estimation of Their Unit Cell Parameters</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          ,
          <year>2015</year>
          ;
          <volume>1452</volume>
          :
          <fpage>40</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kirsh</surname>
            <given-names>DV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          .
          <article-title>Crystal lattice identification by coordinates of their nodes in three dimensional space</article-title>
          .
          <source>Pattern recognition and image analysis</source>
          ,
          <year>2015</year>
          ;
          <volume>25</volume>
          (
          <issue>3</issue>
          ):
          <fpage>456</fpage>
          -
          <lpage>460</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Antonov</surname>
            <given-names>AS</given-names>
          </string-name>
          .
          <article-title>Techniques of parallel programming MPI and OpenMP</article-title>
          . Moscow: Publisher of Moscow State University,
          <year>2012</year>
          . 344 p. [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Gergel</surname>
            <given-names>VP</given-names>
          </string-name>
          .
          <article-title>Theory and Practice of Parallel Programming</article-title>
          . Moscow: Internet University of Information Technologies “BINOM”,
          <year>2007</year>
          : 423 p. [in Russian]
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>