<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Push-gossip protocol e ciency with network topology propagation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexey Vanin</string-name>
          <email>alexey@nspcc.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladimir Bogatyrev</string-name>
          <email>vabogatyrev@corp.ifmo.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ITMO University</institution>
          ,
          <addr-line>Saint Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>NEO Saint Petersburg Competence Center</institution>
          ,
          <addr-line>Saint Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>E ective data propagation between nodes is crucial factor in distributed systems. Reactive noti cations allow responding quickly to various system events: failures, topology changes, etc. The general approach is to use the gossip protocol. Data is distributed as a rumor; each node retransmits input messages to another nodes. In this paper, we consider a particular case of the gossip protocol usage. Distributed system informs included nodes about the participation in collective challenge, thereby forming a sub-network topology. We evaluate e ciency with the simulation model of topology propagation process.</p>
      </abstract>
      <kwd-group>
        <kwd>gossip protocol</kwd>
        <kwd>distributed system</kwd>
        <kwd>computer network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The distributed system e ciency can be evaluated by the speed of reaction to
internal or external events [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3,4,5</xref>
        ]. The system takes time to make a decision and
propagate instructions among nodes. Rapid data distribution preserves integrity,
which is one of the systems properties [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Gossip protocol is used to distribute alerts over the network in distributed
systems [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. There are di erent approaches for gossip protocol implementation.
First approach is the push-gossip protocol [
        <xref ref-type="bibr" rid="ref6 ref7">6,7</xref>
        ]. Nodes transfer data for some
amount of time. ime can be chosen su ciently high, so all participants with high
likelihood will receive the data. This is called rumor-mongering protocol. Also
nodes can send data on demand until it is made obsolete by newer information.
This is called anti-entropy protocol. It is useful for sharing information reliably
among a group of participants [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Second approach is the pull-gossip protocol [
        <xref ref-type="bibr" rid="ref6 ref7">6,7</xref>
        ]. Data is periodically
requested on demand by nodes. This approach allows to decrease number of
transmitted messages in the network. Pull-gossip protocol is synchronous, while
pushgossip protocol can work asynchronously. Also, there is combined approach,
where the system starts propagation as push-gossip and then uses pull-gossip
for decreasing the load on network [
        <xref ref-type="bibr" rid="ref6 ref9">6,9</xref>
        ]. Average consensus [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and averaging
gossip algorithms [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] also form an special case of usage.
      </p>
      <p>
        All this implementations transmit noti cations and the data within the frame
of established network topology [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Consider distributed system within transient
state. Nodes often connect and disconnect from the system. System chooses set
of nodes and propagate instructions among them, so they can know each other
to do the set of system tasks together. There is a connection on physical layer
between nodes, but gossip protocol de nes logic topology of the system. Such case
implies the use of push protocols. Nodes are waiting for input gossip messages
with instructions to execute. However, due to the decentralized nature of the
system and the lack of prior synchronization, there is an issue of determining
gossip transmission parameters. In this paper we will examine the simulation
model of push-gossip propagation process as a method to determine optimal
transmission parameters.
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>Simulation model</title>
      <p>Gossip propagation process can be de ned by several parameters: number of
participants in the process, fan-out (number of random data recipients),
recipients choosing algorithm, etc. Propagation time and nodes noti cation rate are
characteristics of the process.</p>
      <p>The simulation model is built with the usage of general-purpose programming
language (source code available at github.com/nspcc-dev/gossip-model). As an
input model takes number of nodes S, fan-out size f and number of experiments.
The simulation stages are schematically presented in Figure 1.</p>
      <sec id="sec-2-1">
        <title>Data</title>
        <p>(a) First iteration</p>
      </sec>
      <sec id="sec-2-2">
        <title>Data</title>
      </sec>
      <sec id="sec-2-3">
        <title>Data Data</title>
        <p>(b) Second iteration</p>
        <p>For every node the model generates structures and sets data ag in the rst
one. Simulation process consistently repeats three stages:
{ planning,
{ propagation,
{ evaluation.</p>
        <p>At the planning stage the model chooses data sender nodes. In this paper we
examine model, which chooses nodes with data ag that has not sent any data
yet. In this case, every node in the system will retransmit data only once. This
simpli es real implementation of algorithm and reduces number of transmitted
messages.</p>
        <p>At the propagation stage, data sender nodes independently choose recipients.
The number of recipients is de ned by fan-out parameter f . This operation is
based on a pseudo-random number generator with a uniform distribution. Model
sets data ag on every recipient.</p>
        <p>At the evaluation stage, the model collects statistics and checks stop
condition. Stop condition is met if all nodes have data ag. In this paper it is called
saturation. If stop condition is not met, stages are repeated again. In this paper
it is called iteration. Figure 1a represents rst iteration and Figure 1b represents
second iteration. Number of iterations considered as a data propagation time.</p>
        <p>The model outputs table with numbers of iterations and number of
experiments that has been nished with exact propagation time. With precision based
on number of experiments, we can determine saturation probability for constant
S, f and propagation time limit as a number of iterations.</p>
        <p>The model has a number of simpli cations. Iteration is a synchronous process,
but in the real system data propagates asynchronously. However, it a ects the
speed of propagation and does not a ect the order. There is no parameter for
data loss probability in data channels and the model considers only the mesh
topology in the system.
3
3.1</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Simulation results</title>
      <sec id="sec-3-1">
        <title>General interpretation</title>
        <p>As described in section 2, the model allows to determine saturation probability
for constant S, f and the number of iteration limit. Consider iteration limit as 3
iterations and vary fan-out parameter f for di erent network sizes. Results are
presented in Figure 2.</p>
        <p>Saturation probability has the same curve for di erent S. With a low fan-out
parameter f , the model has never met the iteration limit, thus the possibility of
saturation has the value of 0. Then there is the area, where saturation probability
grows up to the value of 1. After that, there is no point in increasing the fan-out
parameter.</p>
        <p>Data propagates faster if there is no collisions between random sets of
recipients. Consider fastest propagation time for xed fan-out size f and iteration limit
0.8
y
lit
i
b
ab 0.6
o
r
p
n
ito 0.4
a
r
u
t
a
-S 0.2
1
0
0</p>
        <p>Experiments with max. iteration = 3</p>
        <p>S=100
S=300
2
4
6
10
12
14
16</p>
        <p>18
8</p>
        <p>Fan-out
S =
h
X f i
i=0
f = S
1
(1)
(2)
h. First node sends f unique messages, then f nodes sends f 2 unique messages.
After h iteration, the number of all sent messages must be equal S:
Parameter f de ned in (1) actually is the minimal fan-out size, where saturation
is still possible, even with low likelihood.</p>
        <p>Consider the worst scenario when all data of the sender nodes choose the
same recipients. In this case, saturation is possible if fan-out has a size of:
Parameter f de ned in (2) actually is the maximum fan-out size, where
saturation is possible. But probability for all pseudo-random number generators to be
synchronized is extremely low.
3.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Optimal fan-out calculation</title>
        <p>Saturation probability function with any parameters can be represented as in
Figure 3. Set of fan-out sizes lay between fmin and fmax. In real distributed
system, it is reasonable to de ne saturation probability with expected reliability
parameters. In this case, we can determine optimal fan-out for that saturation
probability bound. In Figure 3 it is the intersection between probability
saturation function and boundary v.</p>
        <p>The model allows to nd relation between number of nodes S and optimal
fan-out fopt, so distributed system may increase the scale e ectiveness. We have
y
t
i
l
i
ab 1
b
o
r
p
n
o
i
t
a
r
u
t
aS v
1 0
f-min
f-max</p>
        <p>Fan-out
modeled distributed system with iteration limit h = 3, saturation probability at
least 0:99999 and found out optimal fan-out for di erent S. The obtained set of
values can be expressed as a function with the logarithmic regression:
fopt(S) =
(5:5 ln S 4:8; if S &lt; 30;</p>
        <p>1:4 ln S + 9; if S 30</p>
        <p>As soon as saturation probability function consists of two convex curves, fopt
is also set as a pair of logarithmic functions. Overall results are presented in
Figure 4.</p>
        <p>Optimal fan-out with max. iterations = 3
(3)
20
15
-tou 10
n
a
F
5
0
Logarithmic regression</p>
        <p>Rounded values
Experimental results
100
200
300
400
500
600
700
800
900</p>
        <p>1000</p>
        <p>Network size</p>
        <p>Fig. 4. Optimization task for nding optimal fan-out
The simulation model allows to evaluate the e ectiveness of the push-gossip
protocol for network topology propagation with di erent propagation
parameters. With the experiments, the saturation function was determined with fmin
and fmax values. In addition, the model allows to determine the optimal fan-out
value fopt with xed network size S, probability saturation boundary and
propagation time h. It allows to de ne optimal fan-out function for scalable distributed
systems. In this paper we de ned optimal fan-out function (3) for distributed
system with propagation time h = 3 iterations and saturation probability at
least 0:99999.</p>
        <p>Further research includes adding other physical topologies to the model. Also
adding data loss probability and examining saturation probability function with
new parameters.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aliev</surname>
            <given-names>T.I.</given-names>
          </string-name>
          <article-title>The basics of discrete systems modeling</article-title>
          .
          <source>St. Petersburg: ITMO</source>
          ,
          <year>2009</year>
          . (in Russian)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Renesse</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumitriu</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gough</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomas</surname>
            <given-names>C.</given-names>
          </string-name>
          <article-title>E cient Reconciliation and Flow Control for Anti-Entropy Protocols</article-title>
          .
          <source>LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware</source>
          , ACM,
          <year>2008</year>
          , No.
          <volume>6</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bogatyrev</surname>
            <given-names>V.A.</given-names>
          </string-name>
          <article-title>Exchange of Duplicated Computing Complexes in Fault tolerant Systems</article-title>
          .
          <source>Automatic Control and Computer Sciences</source>
          ,
          <year>2011</year>
          , Vol.
          <volume>45</volume>
          , No.
          <volume>5</volume>
          ,
          <fpage>268276</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bogatyrev</surname>
            <given-names>V.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parshutina</surname>
            <given-names>S.A.</given-names>
          </string-name>
          <string-name>
            <surname>Redundant</surname>
          </string-name>
          <article-title>Distribution of Requests Through the etwork by Transferring Them Over Multiple Paths</article-title>
          . Distributed Computer and Communication Networks,
          <year>2015</year>
          , CCIS, vol.
          <volume>601</volume>
          ,
          <fpage>199207</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Saidi</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mohtashemi</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>Minimum-cost rst-push-then-pull gossip algorithm</article-title>
          .
          <source>IEEE Wireless Communications and Networking Conference</source>
          , WCNC,
          <year>2012</year>
          ,
          <volume>25542559</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Demers</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greene</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hauser</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Irish</surname>
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larson</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shenker</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sturgis</surname>
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swinehart</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Terry</surname>
            <given-names>D</given-names>
          </string-name>
          .
          <article-title>Epidemic Algorithms for Replicated Database Maintenance</article-title>
          .
          <source>Proceedings of Sixth Symp. on Principles of Distributed Computing, ACM</source>
          ,
          <year>1987</year>
          ,
          <volume>112</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Apt</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grossi</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoek</surname>
            <given-names>W. When</given-names>
          </string-name>
          <article-title>Are Two Gossips the Same? Types of Communication in Epistemic Gossip Protocols</article-title>
          .
          <year>2018</year>
          , URL: https://arxiv.org/pdf/
          <year>1807</year>
          .05283.pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gupta</surname>
            , Indranil, Chandra,
            <given-names>Tushar D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldszmidt</surname>
          </string-name>
          , German S.
          <article-title>On scalable and e - cient distributed failure detectors</article-title>
          .
          <source>Proceedings of the Twentieth Annual ACM Symposium on Principles of Distributed Computing</source>
          ,
          <year>2001</year>
          ,
          <volume>170179</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Birman K. The Promise</surname>
          </string-name>
          , and Limitations, of Gossip Protocols.
          <source>SIGOPS Oper. Syst.Rev.</source>
          ,
          <year>2007</year>
          ,
          <volume>41</volume>
          (
          <issue>5</issue>
          ),
          <fpage>813</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Fagnani</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zampieri</surname>
            <given-names>S.</given-names>
          </string-name>
          <article-title>Randomized consensus algorithms over large scale networks</article-title>
          IEEE JSAC,
          <year>2008</year>
          , vol.
          <volume>26</volume>
          , no.
          <issue>4</issue>
          ,
          <fpage>634649</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Boyd</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prabhakar</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            <given-names>D.</given-names>
          </string-name>
          <article-title>Randomized gossip algorithms</article-title>
          .
          <source>IEEE Trans. Info. Theory</source>
          ,
          <year>2006</year>
          vol.
          <volume>52</volume>
          , no. 6.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>