Core-based Reconfiguration for Reliable Overlay Networks

                           Silvia Bonomi                                     Sara Tucci-Piergiovanni
                   Sapienza Univertità di Roma                              Sapienza Univertità di Roma
            Dipartimento di Informatica e Sistemistica               Dipartimento di Informatica e Sistemistica
               Via Ariosto 25, 00185 Roma, Italy                        Via Ariosto 25, 00185 Roma, Italy
                   bonomi@dis.uniroma1.it                                     tucci@dis.uniroma1.it


ABSTRACT                                                           correctly search for the matching results.
In P2P overlay networks, the continual arrival and depar-
ture of nodes leads to the necessity of running periodically       To face this problem many approaches have been proposed.
a specific reconfiguration algorithm able to renew the over-       In structured overlay networks, a dedicated routine is in
lay. This paper presents a new reconfiguration mechanism           charge of revealing node departures [8]. Upon the detection
for overlay networks. The algorithm exploits a character-          of a node departure, this routine starts an overlay reconfig-
istic observed in many overlay networks: the presence of a         uration devoted to renew the overlay network, in order to
core composed by reliable nodes (nodes that never leave the        come back to the operative state. In unstructured overlay
overlay). Even if members of the core are a-priori unknown,        networks, usually there is no dedicated routine to detect de-
the algorithm is able to eventually select one of them that        partures [9]. The departure detection happens only when
will carry out the periodic renewing of the overlay. Once the      a node wants to exchange information with an already de-
reconfiguration is performed by a core node, the overlay will      parted one. Usually, the detection does not lead to renew
never lose its reliability, letting any query, issued from this    the entire overlay, but who detected the departure only re-
time on, be ever satisfied.                                        moves its pending connection. Note that the approach used
                                                                   in unstructured overlays can be pursued because a departure
1.   INTRODUCTION                                                  affects only locally the overlay, in contrast with structured
Recently, Peer-to-Peer (P2P) systems attract attention. In         overlays where a departure may affect the whole overlay. Let
P2P systems, each user shares and exchanges information as         us note that if a departure affects the whole overlay, a possi-
equals by playing the role of both the server and the client.      ble burst of departures may significantly lengthens the time
Because nodes that are participating to the system are con-        for the reconfiguration to take effect. The longer the time
nected to each others to construct an overlay network, the         is for reconfiguration, the higher is the number of queries
p2p paradigm is superior on scalability compared to client-        that can cross a still damaged network with the possibility
sever model. However, P2P networks have technical issues           to return an empty (or partial) result. This could lead to a
that should be solved. Generally, P2P networks are catego-         very poor reliability.
rized into two types, structured [6, 7, 8] and unstructured
[1, 2, 3, 9]). Due to the absence of a server node that cen-       The approach proposed in this paper circumvents the draw-
trally stores and manages all of the information, a query          backs of a continual global reconfiguration starting from the
has to be routed inside the network to search the a-priori un-     observation that not all nodes in the network have the same
known set of nodes that maintain the matching information.         lifetime. It is widely recognized that overlay networks con-
Therefore, a lot of effort has been paid in order to realize ef-   tains, among a huge number of nodes with very short life-
ficient and reliable information search/retrieval mechanisms       times, a core, i.e. a small number of very reliable nodes
by properly constructing the overlay network with the aim          with unlimited lifetime [5]. However, the identities of the
of improving successiveness of the search, shorten of search       core members are not a priori known.
delay, or reducing a total resource consumption. The is-
sue of reliability is particularly challenging in P2P systems      In this paper we propose a core-based approach to overlay
due to the continuous arrival and departure of nodes. This         reconfiguration. The basic idea is to eventually elect a core
phenomenon, also called churn, may seriously compromise            node as a special node able to carry out the reconfiguration.
overlay functionalities. For instance, overlay may degrade         This node, also called supervisor, actually carries the recon-
until disconnection as time passes by, avoiding queries to         figuration in a proactive way: the node renews the overlay
                                                                   periodically, before it degrades to disconnection. This can be
                                                                   done deterministically, by knowing an upper bound on the
                                                                   departure-rate of nodes. More in details, the supervisor peri-
                                                                   odically gathers information about current alive nodes inside
                                                                   the network, computes for each node a new state (e.g. new
                                                                   connections to other nodes) and communicates it to these
                                                                   alive nodes, which update their state before the network is
                                                                   damaged; in such a way a query crossing the network at any
                                                                   time will be ever satisfied. Note that, at the beginning, the
algorithm is not able to give to a core node the supervisor               In our algorithm, a node, called supervisor, is elected and it
role, as core nodes are unknown to the algorithm. However,                will be in charge of running the reconfiguration procedure.
the proposed algorithm is able to converge to the election
of a core node as supervisor after a finite time. Before this             In a dynamic system, continuous failures of nodes make con-
time, queries can be lost, but after this time, queries will              nectivity of the ON decreasing. In our approach, the ON
be satisfied forever, contrarily to what happens in the struc-            graph is a k-connected graph so that it will be resilient to
tured approach.                                                           at least (k − 1) failures from its definition.

The paper is organized as follows: Section 2 defines the sys-             Fixing a desiderata degree of connectivity k and given the
tem model and Section 3 presents the reconfiguration algo-                failure rate fr , it is possible to know how long the ON graph
rithm along with its correctness proof.                                   will be connected.

2.    SYSTEM MODEL                                                           Property 2. Let k be the degree of connectivity and let t
We consider an infinite set of processes Π={p1 , p2 . . . , pn . . . }.
                                                                          be the time where the ON graph is defined. Let be fr the fail-
Each process has a unique identifier for all its life time in the
                                                                          ure rate of the system, then the ON graph will be connected
system. Processes are arranged in a logical network, called
                                                                          for a time period ∆T = (k − 1)/fr from t.
Overlay Network (ON), built on top of the physical one.
The ON can be seen as a graph where each node represents
a process pi ∈ Π and each edge represents a logical commu-                We call this period graph lifetime LG .
nication channel between two elements pi , pj ∈ Π such that
pi and pj can communicate. Every process is able to commu-                The idea of the protocol is to use the degree of connectivity
nicate with its direct neighbors in the ON by means of mes-               and the failure rate to know when the overlay is becoming
sages exchange on point-to-point reliable channels. There                 disconnected and renew it. This is the task of the supervisor
exists a known bound δ on the message transmission delay                  that recomputes a new k-connected graph Gnew and com-
on the considered channels, hence the system can be con-                  municates it to all the other nodes in the system before the
sidered as synchronous. We define as correct a process that               graph lifetime LG of the current graph G expires. The idea
never fails. A faulty process fails by crashing and if it recov-          of the reconfiguration protocol is the following: during the
ers from the crash then it is considered as new in the system             lifetime of G, the supervisor starts the reconfiguration pro-
(with a new identifier). System is dynamic, i.e., nodes may               cedure to compute Gnew . The reconfiguration is composed
join and leave system at any time. More formally, we assume               of three phases: (i) nodes health verification, (ii) graph com-
that:                                                                     putation, (iii) graph dissemination. The first phase consists
                                                                          of a verification of the alive nodes in the system. During
                                                                          this phase, the supervisor sends a “ping” message to all the
     • when a node leaves the system, it does not perform any             nodes included in the current graph G and to newly joined
       specific task and then a voluntary leave is considered             nodes, thus waits for their replies. Then the second phase
       as a failure. In the following we consider equally faults          starts and Gnew is computed including all nodes which have
       and voluntary leaves and we refer them as faults;                  replied. In the third phase, the supervisor starts the dis-
     • when a node fails, it is removed from the graph to-                semination of the Gnew to all the alive nodes included in
       gether with all its incident edges;                                the new graph.

     • a node joins the system through a join event. The over-            If the supervisor does not crash, connectivity is preserved as
       lay is actually composed by all and only those nodes               the dissemination of the new graph Gnew deterministically
       that have joined the system and have not left yet.                 terminates before LG expires.

     • there exists a known upper bound fr on the node fail-              If the supervisor crashes, a new election will start and a new
       ure rate (i.e. the number of nodes that fail/leave the             supervisor is selected. During the period when the supervi-
       system in each time unit);                                         sor is changing, connectivity cannot be assured. However,
                                                                          due to the core assumption, we have that in a finite num-
     • there exists a-priori unknown finite set of processes,             ber of elections, and then in a finite time, a core node will
       called core ∈ Π, that never crash or leave the system              become a supervisor and then connectivity will hold forever.
       [4, 5].
                                                                          The join to the overlay is also managed by a specific pro-
3. THE RECONFIGURATION ALGORITHM                                          cedure that allows a new node to become part of the ON
                                                                          by means of an access point node selected among the ones
3.1 Algorithm Overview                                                    already connected in the ON through a graph computed by
The proposed algorithm aims at keeping overlay connectiv-                 some supervisor.
ity most of the time. Formally the problem is defined as
follows:
                                                                          3.2     Algorithm pseudo-code
                                                                          3.2.1    Data Structures
  Property 1. Eventual Connectivity. Eventually and                       At the process start-up, all the data structures have to be
permanently, all processes in the ON will be included in a                initialized. In Figure 1 the pseudo-code of the initialization
connected graph GP .                                                      phase is shown.
Init
                                                                   We assume to have a bootstrap service that makes possible,
 1 supervisor = nil
 2 active = f alse                                                 for incoming nodes, to find an access point to the ON. In
 3 myLevel = 0                                                     Figure 2 is presented the Join protocol.
 4 joined = ∅
 5 myKnowledge[] = emptyArray                                      Join(pi )
 6 parentsLevel = 0                                                 1 if (pi == nil)
 7 parents = ∅                                                      2    then
 8 candidates = ∅                                                   3          knowledge[0] = myId
                                                                    4          supervisor = myId
                                                                    5          active = true
                                                                    6          set timerGraph = ((k − 1)/fr ) − 4δ
        Figure 1: Data Structure Initialization
                                                                    7    else send (“Join00 , myId) to pi

                                                                                          (a)
Data structures, maintained by a node pi , collect several         when (receive (“Join00 , i) from pi ) do
information, as described as follows:                              1 if (supervisor 6= myId)
                                                                   2    then joined = joined ∪ {pi }
The variable active is initially false and does not change         3 knowledge[myLevel + 1] = knowledge[myLevel + 1] ∪ {pi }
until pi does not install1 the first graph and remains true        4 send (“Ack00 , myLevel, supervisor, knowledge[]) to pi
until pi crashes or leaves the system.
                                                                                                 (b)
The variable myLevel represents a logical “distance” be-           when (receive (“Ack00 , level, s, knowledge[]) from pi ) do
tween pi and the supervisor and is set during the join pro-        1 myLevel = level + 1; parentsLevel = level
cedure depending from the access point used by pi to join          2 myKnowledge[] = knowledge[]
the ON.                                                            3 supervisor = s

                                                                                                (c)
The variable myKnowledge contains the information about
processes currently joined to the ON. This information is
actually gathered by the supervisor and it is the only one al-                      Figure 2: Join Protocol
lowed to communicate using connections to processes stored
in the knowledge variable. This communication happens              The joining node pj contacts the bootstrap service that re-
when the supervisor has to communicate the new graph to            turns an active node pi already member of ON if there exists,
current participants. This variable also maintains the in-         otherwise it returns nil.
formation about the distance that processes have from the
current supervisor. To this end the variable myKnowledge           If no process is returned by the bootstrap service, pj starts
has an array structure defined as follows: at the i-th entry       to build the knowledge inserting itself at the level 0 and
of the array it is stored the set of nodes having level i.         becoming active. At this point pj is the first node part of
                                                                   ON. Since pj is the only node part of the ON it becomes
The variable supervisor contains the identifier of the node        automatically the supervisor and then it starts to monitor
that is currently considered as supervisor for the ON by the       the ON. In particular it sets a timer, namely timerGraph,
process pi .                                                       equal to the graph lifetime LG (minus the time needed to
                                                                   verify the health of the ON participants and to communicate
The variable joining contains a set of nodes joined from the       the new graph before the current one lose the connectivity)
installation of the last graph and not yet active, including       and then starts the Reconfiguration procedure.
those that joined using pi as access point.
                                                                   If a process pi is returned, pj contacts pi sending a mes-
The information used to elect a supervisor is stored in the        sage of join request with attached its identifier. When pi
variables parents and candidates; parents is the set con-          receives the request of pj , it updates its knowledge (stor-
taining all the nodes alive at the level just before pi ’s level   ing pj identifier at its level plus one) and sends back to pj
and represents the nodes to be monitored in order to detect        an acknowledgment message containing pi ’s level, the cur-
a possible crash of the supervisor while candidates is the set     rent supervisor and its knowledge. Moreover, if pi is not the
containing the nodes at pi level that are possible candidates      current supervisor, it puts the identifier of pj in the list of
for the election of the new supervisor. Due to the failures,       joined nodes in order to let the supervisor aware of pj for
an entry of the myKnowledge array may become empty and             the next graph. When pj receives the ack of pi , it updates
then there exist some nodes inside the system that have to         its structure with the information received.
update the level of their parents in the myKnowledge struc-
ture. To this aim we store in the variable parentsLevel the        3.2.3    Reconfiguration Procedure
current level not empty where the node can find its parents        The Reconfiguration procedure is managed by the supervisor
in the myKnowledge structure.                                      node and it is triggered periodically.

                                                                   In Figure 3 it is shown the reconfiguration protocol.
3.2.2    Join Procedure
                                                                   This protocol is based on the usage of two timers, timerGraph
1                                                                  and timerM onitor, exploiting the synchrony of the system.
 A graph is installed when a node receives it and starts to
communicate using its links.                                       timerGraph measures the graph lifetime while timerM onitor
                                                                  measures the time needed to collect all the information needed
                                                                  for the detection. When the timerGraph expires, the super-
                                                                  visor sends a ping message to all the nodes in myKnowledge,
                                                                  resets its knowledge, sets the timerM onitor and then waits
                                                                  the replies for the maximum time needed to send and receive
                                                                  a message (2δ).

                                                                  When a node pi receives the ping, it replies with a “pong”
                                                                  message and attaches its level and the set joined of nodes
                                                                  that it knew have joined in the last graph lifetime.

                                                                  When the supervisor receives the pong message, it updates
1 when (timerGraph expired) do                                    its knowledge and when the timerM onitor expires it com-
2 for each (p ∈ memberOf (myKnowledge)) do                        putes the new graph containing all the nodes that have
3   send (“ping 00 ) to pfor each i do                            replied to the ping and the nodes that have completed their
4   myKnowledge[i] = ∅                                            join before the reconfiguration starts. The new graph is com-
5 Set timerM onitor = 2δ
                                                                  puted by means of the compute graph() function that has as
                          (a)                                     parameters the set of nodes to be connected and the degree
                                                                  of connectivity k and returns the new graph. This function
                                                                  can build whichever type of k-connected graph because the
1 when (receive (“ping 00 ) from pi ) do
2 send (“pong 00 , joined, myLevel) to pi                         graph topology is irrelevant for our algorithm since it uses
                                                                  this function as black box. Once the new graph is computed,
                    (b)                                           the supervisor sends it to all the nodes it knows and then
                                                                  sets again the timerGraph.
1 when (receive (“pong 00 , joined, level) from pi ) do
2 myKnowledge[level + 1] = myKnowledge[level + 1] ∪ joined        When a node pi receives the graph, it updates its data struc-
3 myKnowledge[level] = myKnowledge[level] ∪ pi                    tures; if pi was not active, it changes its state and from now
                                                                  on it becomes effectively part of the ON and it starts the
                                (c)                               (transitive) monitoring of the supervisor.

1 when (timerM onitor expired) do
2 if (|member(myKnowledge[])| > f (k))
                                                                  3.2.4    Supervisor Election Procedures
3    then G ← compute graph(to monitor ∪ myId, k)            The supervisor can crash and then it is necessary to elect
4          for each p ∈ member(myKnowledge[]) do             a new one. The monitoring procedure uses the particular
5            send (“newGraph00 , G, myKnowledge[]) to p      structure of the knowledge and avoids all the nodes to ping
6 Set timerGraph = ((k − 1)/fr ) − 4δ                        the supervisor delegating this task only to the nodes at the
                                                             supervisor level or at the subsequent level. Since every node
                       (d)                                   can crash this local monitoring is repeated for all the lev-
                                                             els of the myKnowledge structure and each node monitors
 1 when (receive (“newGraph00 , G, knowledge[]) from pi ) do transitively the supervisor by means of a local monitoring.
 2    if (¬active)
 3       then active = true
 4              parentsLevel = maxi {i < myLevel∧                 In Figure 4 it is shown the monitoring procedure.
 5              myKnowledge[i] 6= ∅}
 6              parents = parents ∪ knowledge[parentsLevel]       The monitor procedure is activated as soon as a node be-
 7              candidates = candidates ∪ knowledge[myLevel]      comes active and it is executed periodically. Let us consider
 8              trigger monitorSupervisor()                       a process pi , it sends a heartbeat message to all the nodes
 9    myKnowledge[] = knowledge[]; supervisor = pi ; joined = ∅
                                                                  contained in the parents and candidates lists and waits for
10    install(G)
                                                                  their replies setting a timer, namely timerElection.
                                  (e)
                                                                  When a node receives the heartbeat message, it sends back
                                                                  a heartbeat reply and, if it did not know the sender of the
     Figure 3: Graph Reconstruction Procedure                     message, it adds the sender to the list of candidates.

                                                                  Receiving the heartbeat reply, every process updates its data
                                                                  structure according to the level of the sender node.

                                                                  When the timerElection expires, the election procedure starts
                                                                  as shown in Figure 5.

                                                                  The election procedure selects a new supervisor when the
                                                                  crash of the current one is detected. The crash of the super-
                                                                  visor is detected by the nodes whose level is the supervisor
                                                                  one and from nodes at the first level not empty following the
                                                                  supervisor level; when these nodes do not receive any reply
                                                      by the “parents” nodes, they know that they are potentially
                                                      candidates to become supervisor and then decide with a de-
                                                      terministic rule the supervisor. The new supervisor is now
                                                      ready to build the new graph and starts the reconfiguration
                                                      procedure.

monitorSupervisor()
1 while (active every 2δ) do                          3.3    Correctness and Guarantees of the Algo-
2 for each (p ∈ parents ∪ candidates) do                     rithm
3    send (“HB Req, myLevel00 ) to p
                                                      In this section we show that our algorithm satisfies eventual
4 parents = ∅
5 candidates = ∅                                      connectivity.
6 set timerElection = 2δ

                   (a)                                  Theorem 1. The reconfiguration procedure satisfies Even-
                                                      tual Connectivity.
1 when (receive (“HB Req, level00 ) from pi ) do
2 send (“HB Rep00 , myLevel) to pi
3 if (pi ∈
         / knowledge[level])                            Proof. (Sketch) We first show that the reconfiguration
4    then joined = joined ∪ pi                        procedure maintains connectivity when the supervisor does
5           knowledge[level] = knowldge[level] ∪ pi   not crash and then we show that it works even with fault
6           if (level == myLevel)                     supervisor.
7              then candidates = candidates ∪ pi
                                                      At the beginning there is only one node pi that becomes
                           (b)
                                                      supervisor and activates the reconfiguration thread. pi is
                                                      the only node active in the ON and the graph G0 of the
1 when (receive (“HB Rep00 , level) from pi ) do      ON is composed only by pi ; moreover, both the knowledge
2 if (level < myLevel)
3    then parents = parents ∪ pi                      and the election information are composed only by pi . Since
4    else candidates = candidates ∪ pi                the supervisor is the only active node, the bootstrap service
                                                      returns always its identifier to the incoming nodes. When
                           (c)                        the reconfiguration procedure starts due to the expiration of
                                                      the timerGraph, the supervisor starts to ping all the nodes
   Figure 4: Monitoring Supervisor Procedure          who it knows about and waits for 2δ for the replies. We may
                                                      have two cases (i) no node has joined the ON in LG0 ; (ii)
                                                      some nodes have joined the ON in LG0 . In the first case in 2δ
                                                      pi will receive only its own reply, due to the synchrony of the
                                                      system, and will compute the graph containing again only
                                                      itself. In the second case, the nodes who have joined the ON
                                                      and are still alive will receive the ping message in δ time due
                                                      to the synchrony and the perfect link, and then will reply
                                                      to the supervisor that will receive the pong message after at
                                                      most δ time. After 2δ from the ping messages, the supervisor
                                                      will know exactly how many and who are the nodes alive to
                                                      be connected in the new graph G1 . If there is enough nodes
                                                      to build the k-connected graph then a graph including all
                                                      the nodes is computed otherwise G1 will include only pi .
when (timerElection expired)
1 if (|parents == 0|)                                 Let us consider a graph Gi with more than one node; when
2    then supervisor = min(candidates)                the timerGraph expires the supervisor repeats the proce-
3 if (supervisor = myId)                              dure described above. Since a k-connected graph is able to
4    then set timerGraph =                           tolerate k − 1 failures, the graph lifetime is LG = (k − 1)/fr
5    else knowledge[myLevel] = candidates             and the timerGraph expires after ((k − 1)/fr ) − 4δ we have
6           knowledge[parentsLevel] = parents         that the connectivity is maintained when the reconfiguration
7           parentsLevel = maxi {i < myLevel∧
8           knowledge[i] 6= ∅}                        starts and it is guaranteed for 4δ more time. To verify which
                                                      nodes are still alive between the known ones, the supervisor
                     (a)                              uses 2δ times and to spread the new graph the supervisor
                                                      uses one more δ time then when the new graph is installed,
                                                      the old one is still connected. Moreover the new graph is
           Figure 5: Election Procedure
                                                      k-connected again and then connectivity is preserved.

                                                      Consider now the case where the supervisor can fail. Let i be
                                                      the level of the supervisor. Due to the monitorSupervisor()
                                                      procedure, every 2δ time nodes at level i and i + 1 send a
                                                      heartbeat message to the supervisor. Let us suppose that at
                                                      some t the supervisor crashes. If there are no other nodes
at level i, when the timerElection expires, nodes at level             Distributed systems(OTMÕ06), October 2006., 2006.
i + 1 have the set parents empty and then will execute the         [6] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and
line 2 of Figure 5. Since the rule used to select the super-           S. Schenker. A scalable content-addressable network. In
visor is deterministic, all the nodes will recognize the same          SIGCOMM ’01: Proceedings of the 2001 conference on
new supervisor and the new supervisor knows that it is the             Applications, technologies, architectures, and protocols
new one. Similarly, a new supervisor can be selected deter-            for computer communications, pages 161–172, New
ministically even if other nodes are still at level i; the only        York, NY, USA, 2001. ACM Press.
difference is that now the supervisor will be chosen between       [7] A. I. T. Rowstron and P. Druschel. Pastry: Scalable,
the nodes still alive at level i. The new supervisor starts            Decentralized Object Location, and Routing for
now the reconfiguration procedure and then the new graph               Large-Scale Peer-to-Peer Systems. In Middleware ’01:
could be installed.                                                    Proceedings of the IFIP/ACM International Conference
                                                                       on Distributed Systems Platforms Heidelberg, pages
Due to the core assumption, we have that inside the sys-               329–350, London, UK, 2001. Springer-Verlag.
tem there exist stable nodes and eventually one of them will       [8] I. Stoica, R. Morris, D. Liben-Nowell, D. R. Karger,
be selected to become supervisor and then it never crashes.            M. F. Kaashoek, F. Dabek, and H. Balakrishnan.
When a stable node is selected, we return to the case de-              Chord: a scalable peer-to-peer lookup protocol for
scribed above where the supervisor does not crash and then             internet applications. IEEE/ACM Trans. Netw.,
from that point connectivity is guaranteed forever.                    11(1):17–32, 2003.
                                                                   [9] S. Voulgaris, D. Gavidia, and M. Steen. CYCLON:
                                                                       Inexpensive Membership Management for Unstructured
                                                                       P2P Overlays. Journal of Network and Systems
                                                                       Management, 13(2):197–217, June 2005.
4.   CONCLUSIONS
This paper presented a new core-based reconfiguration al-
gorithm able to build an overlay network highly reliable.
In terms of deterministic guarantees, the overlay is able to
let queries crossing the network be ever satisfied from some
point of time on. This limits the possible continual lost of re-
liability current reconfiguration approaches suffer from. In
fact, our approach can lose queries only for a finite time.
This finite-time unreliability comes from the fact that mem-
bers of the core are a-priori unknown to any process join-
ing the network. Nevertheless, all processes will be even-
tually able to select a core member that will carry out all
the reconfigurations. By assuming known an upper-bound
on the node failure-rate, successive reconfigurations done by
the same supervisor will take effect before the overlay de-
grades its functionalities.

5.   REFERENCES
[1] A. Allavena, A. Demers, and J. E. Hopcroft.
    Correctness of a gossip based membership protocol. In
    PODC ’05: Proceedings of the twenty-fourth annual
    ACM symposium on Principles of distributed
    computing, pages 292–301, New York, NY, USA, 2005.
    ACM Press.
[2] P. Eugster, S. Handurukande, R. Guerraoui,
    A. Kermarrec, and P. Kouznetsov. Lightweight
    probabilistic broadcast. In In Proceedings of The
    International Conference on Dependable Systems and
    Networks (DSN ’01), July 2001., 2001.
[3] A. J. Ganesh, A. Kermarrec, and L. Massoulié.
    Peer-to-Peer Membership Management for
    Gossip-Based Protocols. IEEE Trans. Comput.,
    52(2):139–149, 2003.
[4] P. B. Godfrey, S. Shenker, and I. Stoica. Minimizing
    churn in distributed systems. ACM SIGCOMM
    Computer Communication Review, 36(4):147–158, 2006.
[5] V. Gramoli, A. Kermarrec, A. Mostefaouii, M. Raynal,
    and B. Sericola. Core persistence in peer to peer
    systems: Relating size to lifetime. In On The Move
    International Workshop on Reliability in Decentralized