<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>NEW FEATURES OF THE JINR CLOUD</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A.V. Baranov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>N.A. Balashov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. N. Makhalkin</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ye. M. Mazhitova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>N.A. Kutovskiy</string-name>
          <email>kut@jinr.ru</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R.N. Semenov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Nuclear Physics</institution>
          ,
          <addr-line>050032, 1 Ibragimova street, Almaty</addr-line>
          ,
          <country country="KZ">Kazakhstan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Laboratory of Information Technologies, Joint Institute for Nuclear Research</institution>
          ,
          <addr-line>6 Joliot-Curie, Dubna, Moscow region, 141980</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Plekhanov Russian University of Economics</institution>
          ,
          <addr-line>36 Stremyanny per., Moscow, 117997</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>2018 Alexandr V. Baranov, Nikita A. Balashov, Alexandr N. Makhalkin, Yelena M. Mazhitova, Nikolay A. Kutovskiy, Roman N. Semenov</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>257</fpage>
      <lpage>261</lpage>
      <abstract>
        <p>The report covers the following aspects of the JINR cloud development: the migration to a high availability setup based on the Raft consensus algorithm, a Ceph-based storage back-end for VM images, and a DIRAC-based grid platform for integrating external partner clouds into a distributed computational cloud environment.</p>
      </abstract>
      <kwd-group>
        <kwd>cloud computing</kwd>
        <kwd>OpenNebula</kwd>
        <kwd>clouds integration</kwd>
        <kwd>cloud bursting</kwd>
        <kwd>DIRAC</kwd>
        <kwd>ceph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. New high availability setup based on Raft consensus algorithm</title>
      <p>
        Since release 5.4, on which the JINR cloud runs, OpenNebula has provided a built-in
mechanism for a high availability (HA) setup based on the so-called Raft consensus algorithm [1].
According to the OpenNebula documentation [
        <xref ref-type="bibr" rid="ref1">2</xref>
        ], a consensus algorithm relies on two concepts:
      </p>
      <p>System State, which in the case of OpenNebula-based clouds means the data stored in
the database tables (users, ACLs, or the VMs in the system);</p>
      <p>Log, which is a sequence of SQL statements that are consistently applied to the
OpenNebula DB on all servers to evolve the system state.</p>
      <p>To preserve a consistent view of the system across servers, modifications to the system state are
performed through a special node called the “leader”. The OpenNebula cloud front-end nodes (CFNs)
elect a single node to be the leader. The leader periodically sends heartbeats to the other CFNs, called
followers, to retain its leadership. If the leader fails to send a heartbeat, the followers promote themselves to candidates
and start a new election.</p>
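      <p>The heartbeat rule described above can be illustrated with a minimal Python sketch. The class name FrontEnd, its fields and the fixed election timeout are hypothetical and are not taken from OpenNebula; an actual Raft implementation additionally uses terms, randomized timeouts and vote requests, which are omitted here.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass
class FrontEnd:
    """Hypothetical model of one cloud front-end node (not OpenNebula code)."""

    name: str
    role: str = "follower"
    last_heartbeat: float = 0.0

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the current leader."""
        self.last_heartbeat = now
        self.role = "follower"

    def tick(self, now: float, election_timeout: float) -> str:
        """Promote a follower to candidate once the leader has been silent too long."""
        silent_for = now - self.last_heartbeat
        if self.role == "follower" and silent_for >= election_timeout:
            self.role = "candidate"  # the node would now start a new election
        return self.role


node = FrontEnd("cfn-2")
node.on_heartbeat(now=0.0)
print(node.tick(now=1.0, election_timeout=5.0))  # follower
print(node.tick(now=6.0, election_timeout=5.0))  # candidate
```
      </preformat>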
      <p>Whenever the system is modified (e.g. a new VM is added to the cluster), the leader updates
the log and replicates the entry to a majority of followers before actually writing it to the database. This
increases the latency of DB operations but enables safe replication of the system state, so that the cluster
can continue operating if the leader fails.</p>
      <p>
        During the upgrade of the JINR cloud from OpenNebula 4.12 (see [
        <xref ref-type="bibr" rid="ref1">2</xref>
        ] for more
details on that architecture) to 5.4, the HA setup based on the Raft consensus algorithm was
implemented. Following the recommendations of the OpenNebula documentation, the JINR cloud has an odd
number of front-end nodes (three in our case). They are represented in the figure identically to the one marked by the black numeral “2” in a square of the same color.
      </p>
      <p>KVM-based virtual machines (VMs) and OpenVZ-based containers (CTs) run on
cloud worker nodes (CWNs) marked on the figure.</p>
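      <p>The majority rule and the recommendation of an odd number of front-end nodes discussed in this section can be checked with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and it assumes, as in the standard description of Raft, that the leader's own copy of a log entry counts toward the majority.</p>
      <preformat>
```python
def quorum_size(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """Servers that may fail while a majority can still be formed."""
    return servers - quorum_size(servers)


def is_committed(copies_including_leader: int, servers: int) -> bool:
    """An entry is treated as committed once a majority of servers holds it."""
    return copies_including_leader >= quorum_size(servers)


# Columns: cluster size, quorum size, tolerated failures.
# A fourth node raises the quorum without tolerating any additional failure.
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```
      </preformat>
      <p>For the three front-end nodes used in the JINR cloud this gives a quorum of two, so the failure of one node can be tolerated.</p>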
    </sec>
    <sec id="sec-2">
      <title>2. Ceph-based software defined storage</title>
      <p>VM and CT images, as well as user and scientific experiment data, are kept in Ceph-based
software defined storage (SDS). Its architecture is shown in Figure 2.</p>
      <p>
        The total amount of raw disk space in the SDS is about 1 PB. Because of triple replication, the
effective disk space available to users is about 330 TB. More details on the JINR Ceph-based SDS
can be found in [
        <xref ref-type="bibr" rid="ref2">3</xref>
        ].
      </p>
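      <p>The relation between raw and effective capacity stated above can be reproduced with a back-of-the-envelope calculation. The sketch below is an approximation under stated assumptions: 1 PB is taken as 1000 TB, the function name is hypothetical, and file system and Ceph overheads are ignored.</p>
      <preformat>
```python
def usable_capacity_tb(raw_tb: float, replicas: int) -> float:
    """Approximate usable capacity when every object is stored `replicas` times."""
    if replicas >= 1:
        return raw_tb / replicas
    raise ValueError("replicas must be at least 1")


# Assumption: 1 PB is taken as 1000 TB; storage overheads are ignored.
print(round(usable_capacity_tb(raw_tb=1000.0, replicas=3)))  # 333
```
      </preformat>
      <p>The result is consistent with the figure of about 330 TB quoted for the JINR storage.</p>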
    </sec>
    <sec id="sec-3">
      <title>3. Clouds integration</title>
      <p>Apart from increasing the JINR cloud resources by buying new servers and maintaining them
locally at JINR, there is another way of expanding the resources: integrating part of the computing
resources of the partner organizations’ clouds.</p>
      <p>
        Initially such integration was done with the help of a cloud bursting driver [
        <xref ref-type="bibr" rid="ref4">5</xref>
        ] developed by the
JINR cloud team. However, the growing number of participants in this distributed cloud-based infrastructure
significantly increases the complexity of its maintenance (every newly integrated cloud requires changes
to the configuration files of every integrated cloud as well as a restart of the corresponding services). For this reason,
research was started to evaluate possible alternatives. Among the existing software platforms for
distributed computing and data management, DIRAC (Distributed Infrastructure with Remote Agent
Control) [7] was chosen for the following reasons:
      </p>
      <p>it provides all the required functionality, including both job and data management;</p>
      <p>it supports clouds as a computational back-end (although the corresponding plugin required
some development);</p>
      <p>its services are easier to deploy and maintain than those of other platforms with
similar functionality (e.g. EMI).</p>
      <p>A scheme of cloud integration using the DIRAC grid middleware is shown in Figure 2.</p>
      <p>This approach also allows the resources of each cloud to be shared between external grid users and
local non-grid users.</p>
      <p>At the time of writing, the integration of the clouds of the JINR Member
State organizations into the DIRAC-based distributed platform is at different stages
(the locations of the participants of this distributed cloud infrastructure are shown on the map in the figure):</p>
      <p>Astana branch of the Institute of Nuclear Physics - Private establishment “Nazarbayev
University Library and IT services” (Astana, Kazakhstan, integrated);</p>
      <p>Scientific Research Institute of Nuclear Problems of the Belarusian State University
(Minsk, Belarus, integrated);</p>
      <p>Institute of Physics of the National Academy of Sciences of Azerbaijan (Baku, Azerbaijan,
integrated);</p>
      <p>Yerevan Physical Institute (Yerevan, Armenia, integrated);</p>
      <p>Plekhanov Russian Economic University (Moscow, Russia, integrated);</p>
      <p>Institute for Nuclear Research and Nuclear Energy (Sofia, Bulgaria, negotiations);</p>
      <p>Georgian Technical University (Tbilisi, Georgia, in the process of integration);</p>
      <p>St. Sophia University “St. Kliment Ohridski” (Sofia, Bulgaria, negotiations);</p>
      <p>Institute of Nuclear Physics (Tashkent, Uzbekistan, negotiations);</p>
      <p>University “St. Kliment Ohridski” (Bitola, Macedonia, negotiations).</p>
    </sec>
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <p>The JINR cloud is developing rapidly. The demand for its resources, as well as the range of
tasks it is used for, is growing steadily. Scientific experiments such as JUNO, Daya Bay and
BaikalGVD have started to use its resources.</p>
      <p>The JINR cloud front-end node configuration was migrated from a DRBD partition shared across two physical
servers to a front-end HA architecture based on the Raft consensus algorithm.</p>
      <p>The Ceph-based software defined storage was put into production and is now used for virtual
appliance images as well as for keeping user and scientific experiment data.</p>
      <p>[1] The Raft consensus algorithm web-site [Online]. Available: https://raft.github.io/. [Accessed on 06.11.2018]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Baranov</surname>
          </string-name>
          et al.
          <article-title>JINR cloud infrastructure development</article-title>
          // The 7th International Conference «Distributed Computing and Grid-technologies in Science and Education (Grid'2016)»,
          <source>CEUR Workshop Proceedings</source>
          , ISSN 1613-0073
          , vol.
          <volume>1787</volume>
          ,
          <year>2016</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.A.</given-names>
            <surname>Balashov</surname>
          </string-name>
          et al.
          <article-title>JINR cloud service for scientific and engineering computations</article-title>
          //
          <source>Modern Information Technologies and IT-Education</source>
          , ISSN 2411-1473, Vol.
          <volume>14</volume>
          , No.
          <issue>1</issue>
          ,
          <year>2018</year>
          , pp.
          <fpage>61</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Baranov</surname>
          </string-name>
          et al.
          <article-title>Approaches to cloud infrastructures integration</article-title>
          //
          <source>Computer Research and Modeling</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>3</issue>
          ,
          <year>2016</year>
          , pp.
          <fpage>583</fpage>
          -
          <lpage>590</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [5] DIRAC web-portal [Online]. Available: http://diracgrid.org [Accessed on 06.11.2018]
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>