Proceedings of the
     1st Workshop on Artificial Intelligence and Formal Verification, Logics, Automata and Synthesis (OVERLAY),
                                          Rende, Italy, November 19–20, 2019


              Towards the Automated Verification of
                  Publish/Subscribe Networks

                                                Giorgio Delzanno1
                                    1
                                        DIBRIS, University of Genova
                                         1
                                             giorgio.delzanno@unige.it


                                                    Abstract


                We present a formal model of publish/subscribe network architectures in which
            a central communication broker is in charge of distributing messages to clients
            subscribed to certain topics. We consider different semantics for the internal
            structured of the server and for the notification phase. We discuss applicability of
            an SMT-based infinite-state model checker to the proposed model and decidability
            results for abstractions obtained hiding the internal structure of a server.


1    Introduction
Protocols designed to operate in distributed systems are often defined for an arbitrary number of
components and for asynchronous communication. Formal specification languages like Petri nets and
automata are often used to model skeletons of this kind of systems. The coverability decision problem
[1] is typically used to formulate reachability of bad configurations independently from the number of
components of a system. To express safety properties of distributed systems we can lift the coverability
decision problem, in which the initial configuration is fixed a priory, to a more general formulation
existentially quantified over an infinite set of initial configurations [6, 7]. The existentially quantified
coverability problem has been considered in [17, 18, 19, 5, 4] in order to reason on parameterized broadcast
protocols. Falsification of this decision problem provides a characterization of initial configurations from
which it is possible to reach a bad configuration, e.g., an anomaly in the protocol. Existentially quantified
coverability problem is undecidable for systems with a static communication topology and atomic broadcast
communication [1, 14, 17, 18, 6].
    In this work we focus our attention on the application of the SMT-based Infinite-state Model Checker
Cubicle [10] to formally specify and validate distributed protocols that combine asynchronous communi-
cation and synchronous operations on local data structures. Cubicle provides a specification language
that combines both local and global update rules. The combination is particularly useful when dealing
with specification of the internal behavior of protocols designed for client-server architectures. In the talk
we will first present a formal model that can be applied to Pub/Sub protocols such as Redis, MQTT,
etc. The model is based on a transition system in which clients are specified via labelled automata and
the behaviour of the server is described via a global subscriber list and a finite set of worker threads.
Spawning of worker threads accessing shared data is modelled using snapshot synchronisation [24] as
in copy-on-write data structures [30]. This strategy consists in creating snapshots of shared data that

                Copyright c 2020 for this paper by its authors.
                Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
36                                                                                                    G. Delzanno


are confined in the local data of worker threads. In our setting the shared data consists of the current
subscribers list. Copy-on-write data structures [30] are a possible implementation of this synchronisation
techniques that can be used to avoid race conditions on shared data structures. Copy-on-write data
structures are particularly useful when book-keeping session data in a distributed application. Indeed
they automatically produce a snapshot of the share data structures (lists, sets, etc) when combined with
iterators (i.e. to scan a set/list) . The semantics of copy-on-write concurrent data structures typically
provide a mechanism to generate snapshots on demand (e.g. when an iterator needs to scan the structure)
producing an exception (that will be handled by the user) in case of simultaneous read and write access
from multiple threads. In our model we will not model exception handling and assume that snapshots
are generated and confined in a worker thread when necessary. Extending the semantics towards a more
complex modeling of copy-on-write is an interesting direction for refining the model. The use of worker
threads allows us to model the publishing of a message asynchronously w.r.t. to other operations executed
on the server (e.g. registration of new clients, etc). Under the considered semantics for copy-on-write
data structures, we will then show that computations in the resulting model can be transformed into
round-based executions that are simpler to analyse (they reduce the number of possible interleavings)
with respect to a more liberal interleaving model.


2     How to Formally Model Publish/Subscribe Networks
We use Id to denote a denumerable set of identifiers of client instances. Furthermore, we define Q as the
finite set of client state labels, A as the finite set of action labels, and M as the finite set of message
labels. For brevity, we assume that action labels have one of the following form: local, that denotes a local
transition, subscribe, that denotes a subscription request, unsubscribe, that denotes an unsubscription
request, publish(m) with m ∈ M , that denotes a publish request for message m ∈ M .
    The above listed type of actions are strictly related to the communication model typical of pub-
lish/subscribe architecture based on a client-server architecture in which every message is delivered to
all subscribers via a central server (or, more in general, via a cluster/federation of servers). A client
specification P is a tuple hQ, q0 , Ri, where Q is a finite set of states, q0 ∈ Q is the initial state, and
R ⊆ Q × A × Q defines state transitions induced by action labels. In other words a client specification
can be viewed as a finite state automata with labelled transitions that statically define its behaviour.

Client Configuration
A client configuration is a tuple hi, s, b, f i, where i ∈ Id is the client identifier generated after a connection
request, s ∈ Q is the current client state, b ∈ 2M is the set of messages received so far, and f ∈ {>, ⊥}
is a flag that defines the connection status of the client with respect to the global network, namely >
corresponds to the normal operating status, whereas ⊥ corresponds to a disconnection event. We assume
that disconnected clients cannot roll back to a normal status, i.e., when they restart they will be assigned a
new identifier, their internal state being completely reset. The client specification hQ, q0 , Ri can naturally
be extended with enabling conditions for transitions in R based on the presence of certain messages in the
current message list (e.g. hq1 , local, q2 i only if message m1 , . . . , mr have already been received). We will
not discuss this extension in this paper.

Server Configuration
For a fixed n ≥ 1, a server configuration is defined by a tuple hL, W1 , . . . , Wn i, where L ∈ 2Id is a finite set
of identifers that represents the list of subscribed clients and, for each i : 1, . . . , n with n > 1, Wi represents
the current state of the i-th worker thread used by the server to deliver messages. More specifically, Wi
can be either ⊥ or hmi , Ri i with mi ∈ M and Ri ⊆ 2Id . In the former case it denotes an idle thread,
whereas in the latter it denotes initialized threads in which Ri is the current snapshot of the subscription
list, and mi is the message to be delivered. The use of a snapshot R of the current subscription list
of the server is based on implementations of read/write operations on shared data structures based on
snapshots/copy-on-write data structutes used in concurrent programming to avoid race conditions. The
initialisation of a worker thread with the snapshot of the subscriber list allows the server to proceed
Towards the Automated Verification of Publish/Subscribe Networks                                                            37


asynchronously with the delivery of a message to all subscribers. In our model we consider a fairly realistic
scenario in which the server employs a fixed number or worker threads for this kind of task. Furthermore,
we model the semantics of a publish message asynchronously: the server first spawns (when available) a
new task with a copy of the current subscriber list. The worker can then deliver the message to the active
clients. The initial server configuration is the tuple h∅, I1 , . . . , In i where Ii = ⊥ for i : 1, . . . , n.

Global Configuration
A global configuration is defined by a tuple hS, Ci, where S = hL, W1 , . . . , Wn i is a server configuration,
and C = {c1 , . . . , ck } is a finite set of client configurations such that cj = hij , sj , bj , fj i for j : 1, . . . , k. We
use N to denote the set of network configurations.

Pub/Sub Network
For fixed sets A, Q, M , and given a client specification P = hQ, q0 , Ri, and n > 1, a Pub/Sub Network
P S is defined via a transitions system defined through a binary relation over network configurations.
More precisely, the relation →⊆ N × N is defined as the least relation satisfying one of the conditions
listed below. We show next some example of transitions such as P ublish and N otif y.

    • Publish Transition:

                                  P ublish hS, {hi, s, b, f i} ∪ Ci → hS 0 , {hi, s0 , b, f i} ∪ Ci

       under the assumptions: f = >, hs, publish(m), s0 i ∈ R, S = hL, W1 , . . . , Wn i, q ∈ {1, . . . , n},
       Wq = ⊥, S 0 = hL, W1 , . . . , Wq0 , . . . , Wn i, and Wq0 = hm, Li, With this rule a client instance sends a
       publish request to the server. The server acknowledges the request passing the message to an idle
       worker thread together with a snapshot of the current subscriber list L. The client updates its local
       state according to R.
    • Notify Transition:

                                                   N otif y hS, Ci → hS 0 , C 0 i
       under the following assumptions C = {c1 , . . . , ck }, ci = hidi , si , bi , fi i for all i ∈ {1, . . . , k}, S =
       hL1 , W1 , . . . , Wn i, Wq = hm, L2 i for some q ∈ {1, . . . , n}, S 0 = hL1 , W1 , . . . , Wq0 , . . . , Wn i, Wq0 = ⊥,
       C 0 = {c01 , . . . , c0k } where for all j : 1, . . . , k, cj = hidj , sj , bj , fj i, if idj ∈ L2 and fj = >, then
       c0j = hidj , sj , b0j , fj i and b0j = bj ∪ {m}, c0j = cj , otherwise. With this rule we model notification
       of message m via a global action performed by worker thread Wq = hm, L2 i whose effect is to
       update the message list of each active client instance whose identifier is included in the list L2 . The
       remaining client instances (their identifier is not in the list or they are inactive) remain unchanged.
       After notification the worker thread resets its state to idle ⊥ and returns available for distributing
       other published message.
The semantics of copy-on-write concurrent data structures in concurrent programming languages such
as Java [30] is in general more complex than the model described above. Indeed, copy-on-write data
structures typically provide a mechanism to generate snapshots on demand (e.g. when an iterator needs
to scan the structure) producing an exception (that will be handled by the user) in case of simultaneous
read and write access from multiple threads. In our semantics we do not model exception handling and
focus instead on the confinement of snapshots of a shared data structure in worker threads.


3     Verification Problems and SMT Solvers
In our work we have applied the above described formal language to build a verifiable (parameterised)
model that can be validated via the SMT-based Infinite-state Model Checker Cubicle. In this setting we
use unbounded arrays to model arbitrary collections of publisher and subscriber processes as well as an
38                                                                                          G. Delzanno


unbounded shared data structure used as a communication media between processes. The evolution of
the shared memory through the different rounds is modelled using a bi-dimensional unbounded arrays
indexed both on round numbers and process identifiers. Each row of such a matrix can then be used to
model the different phases of a protocol round, e.g., the formation of a subscriber group.
   The resulting model can be validated through the SMT-based Infinite-state Model Checker Cubicle.
Cubicle implements a symbolic backward reachability algorithm in which sets of configurations are
represented via formulas in fragments of First Order Logic that combine Presburger Arithmetics and
the Theory of Arrays [10, 29, 3, 23]. By construction, the Cubicle verification algorithm ensures that,
upon termination, the resulting correctness proof is guaranteed to hold for any number of processes.
In other words Cubicle can be applied as an automated engine for solving parameterised verification
problems for the considered distributed protocol. In our experiments with the considered case-studies
we successfully validated different versions of the Redis Pub/Sub protocol and identify corner cases for
guards of transitions.


4    Decidability of Verification Problems
In [16] we have introduced a model of Publish Subscribe Networks inspired to Petri Nets in which
the internal structure of a server is abstracted away and its behaviour is included in the transition
system describing the interaction among clients. In this context we have considered different semantics
for the notification phase in order to take into consideration exceptions due to node crashes. For the
considered model, decidabilty of the coverability problem can be obtained via the application of Higman’s
and Dickson’s lemmas and the theory of well-structured transition systems. The results are based
on compositional properties of well-quasi orderings that can be applied in order to define verification
algorithms based on symbolic backward reachability in which sets of Petri Net markings are finitely
represented via constraint formulas. The extension of the above mentioned results to refined models of
Pub/Sub Networks is left as future research.


References
 [1] P. A. Abdulla and G. Delzanno. Parameterized verification. STTT, 18(5):469–473, 2016.
 [2] P. A. Abdulla, G. Delzanno, N. Ben Henda, and A. Rezine. Monotonic abstraction: on efficient
     verification of parameterized systems. Int. J. Found. Comput. Sci., 20(5):779–801, 2009.
 [3] F. Alberti, S. Ghilardi, and N. Sharygina. A framework for the verification of parameterized
     infinite-state systems. Fundam. Inform., 150(1):1–24, 2017.
 [4] N. Bertrand, G. Delzanno, B. König, A. Sangnier, and J. Stückrath. On the decidability status of
     reachability and coverability in graph transformation systems. In RTA’12, volume 15 of LIPIcs, pages
     101–116. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2012.
 [5] N. Bertrand, P. Fournier, and A. Sangnier. Distributed local strategies in broadcast networks. In
     26th International Conference on Concurrency Theory, CONCUR 2015, Madrid, Spain, September
     1.4, 2015, pages 44–57, 2015.
 [6] R. Bloem, S. Jacobs, A. Khalimov, I. Konnov, S. Rubin, H. Veith, and J. Widder. Decidability
     of Parameterized Verification. Synthesis Lectures on Distributed Computing Theory. Morgan &
     Claypool Publishers, 2015.
 [7] R. Bloem, S. Jacobs, A. Khalimov, I. Konnov, S. Rubin, H. Veith, and J. Widder. Decidability in
     parameterized verification. SIGACT News, 47(2):53–64, 2016.

 [8] B. Charron-Bost and A. Schiper. The heard-of model: computing in distributed systems with benign
     faults. Distributed Computing, 22(1):49–71, 2009.
Towards the Automated Verification of Publish/Subscribe Networks                                         39


 [9] S. Conchon, G. Delzanno, and A. Ferrando. Parameterized verification of topology-sensitive distributed
     protocols goes declarative. In Proceedings of NETYS 2018, to appear, 2018.

[10] S. Conchon, A. Goel, S. Krstic, A. Mebsout, and F. Zaïdi. Cubicle: A parallel smt-based model
     checker for parameterized systems - tool paper. In Computer Aided Verification - 24th International
     Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012 Proceedings, pages 718–724, 2012.
[11] S. Conchon, A. Goel, S. Krstic, A. Mebsout, and F. Zaïdi. Invariants for finite instances and beyond.
     In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23,
     2013, pages 61–68, 2013.
[12] H. Debrat and S. Merz. Verifying fault-tolerant distributed algorithms in the heard-of model. Archive
     of Formal Proofs, 2012.
[13] G. Delzanno. A logic-based approach to verify distributed protocols. In Proceedings of the 31st
     Italian Conference on Computational Logic, Milano, Italy, June 20-22, 2016., pages 86–101, 2016.

[14] G. Delzanno. A unified view of parameterized verification of abstract models of broadcast communi-
     cation. STTT, 18(5):475–493, 2016.
[15] G. Delzanno. Formal Verification of Internet of Things Protocols Invited talk at the 5th Workshop
     on Formal Reasoning in Distributed Algorithms (FRIDA), FLOC 2018 (draft available in the FLOC
     2018 workshop web page).

[16] G. Delzanno. Parameterised Verification of Publish/Subscribe Networks with Exception Handling.
     RP 2019.
[17] G. Delzanno, A. Sangnier, and G. Zavattaro. Parameterized verification of ad hoc networks. In
     CONCUR 2010 - Concurrency Theory, 21th International Conference, CONCUR 2010, Paris, France,
     August 31-September 3, 2010. Proceedings, pages 313–327, 2010.
[18] G. Delzanno, A. Sangnier, and G. Zavattaro. On the power of cliques in the parameterized verification
     of ad hoc networks. In Foundations of Software Science and Computational Structures - 14th
     International Conference, FOSSACS 2011, Held as Part of the Joint European Conferences on Theory
     and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings,
     pages 441–455, 2011.

[19] G. Delzanno, A. Sangnier, and G. Zavattaro. Verification of ad hoc networks with node and
     communication failures. In FORTE/FMOODS’12, volume 7273 of LNCS, pages 235–250. Springer,
     2012.
[20] C. Dragoi, T. A. Henzinger, H. Veith, J. Widder, and D. Zufferey. A logic-based framework for
     verifying consensus algorithms. In Verification, Model Checking, and Abstract Interpretation - 15th
     International Conference, VMCAI 2014, San Diego, CA, USA, January 19-21, 2014, Proceedings,
     pages 161–181, 2014.
[21] C. Dragoi, T. A. Henzinger, and D. Zufferey. The need for language support for fault-tolerant
     distributed systems. In 1st Summit on Advances in Programming Languages, SNAPL 2015, May 3-6,
     2015, Asilomar, California, USA, pages 90–102, 2015.
[22] C. Dragoi, T. A. Henzinger, and D. Zufferey. Psync: a partially synchronous language for fault-tolerant
     distributed algorithms. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on
     Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, January 20 - 22, 2016,
     pages 400–415, 2016.

[23] S. Ghilardi and S. Ranise. Backward reachability of array-based systems by SMT solving: Termination
     and invariant synthesis. Logical Methods in Computer Science, 6(4), 2010.
[24] M. Herlihy, N. Shavit. The art of multiprocessor programming. Morgan Kaufmann 2008.
40                                                                                               G. Delzanno


[25] A. Mebsout. Inférence d’invariants pour le model checking de systèmes paramétrés. (Invariants
     inference for model checking of parameterized systems). PhD thesis, University of Paris-Sud, Orsay,
     France, 2014.

[26] T. Tsuchiya and A. Schiper. Verification of consensus algorithms using satisfiability solving. Distributed
     Computing, 23(5-6):341–358, 2011.
[27] http://alt-ergo.lri.fr.
[28] http://functory.lri.fr/.

[29] http://users.mat.unimi.it/users/ghilardi/mcmt/.
[30] https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/CopyOnWriteArraySet.
     html .