Formalization and Verification of Go-based New Simple Queue System Danyang Wang1 , Jiaqi Yin2 , Sini Chen1 and Huibiao Zhu1 1 Shanghai Key Laboratory of Trustworthy Computing. East China Normal University, Shanghai, China 2 Northwestern Polytechnical University, Xi’an, China Abstract NSQ (New Simple Queue) is a real-time distributed messaging platform implemented by Go language. It’s designed to operate at scale stably and efficiently handle billions of messages per day. Its decentralized topology guarantees fault tolerance, high availability, and reliable message delivery. Operationally, NSQ is elastic to configure and deploy. With the broad application of the NSQ message system, its security and stability have attracted extensive concentration. Therefore, it is crucial to conduct a rigorous analysis and verification of NSQ’s properties. In this paper, we employ process algebra CSP (Communicating Sequential Processes) to model the core functional modules of the NSQ. In addition, we utilize the model checker PAT (Process Analysis Toolkit) to verify five properties of the model, including divergence freedom, reachability, scalability, availability, and flow controllability. The verification results demonstrate that the NSQ system satisfies all the above properties, proving that the system has high flexibility and robustness while providing credible and efficient message delivery. Keywords NSQ, Messaging System, Communicating Sequential Processes(CSP), Modeling, Verification 1. Introduction to ensure high availability and reachability of messages, which increases system complexity. Therefore, excellent In the rapidly evolving era of the Internet, the explosion message middleware should have high message process- of users and services creates severe challenges for net- ing efficiency, robustness, stability, and scalability. work applications. Conventional monolithic and vertical NSQ [6] has emerged from these excellent middlewares service architectures can no longer deal with such a vol- in recent years. It is a distributed messaging platform ume of data. Distributed services are gradually becoming based on Go language [9] with outstanding performance, the mainstream architecture. As a foundational segment robustness, and usability. This messaging platform is a in distributed message systems, middleware [1] is impor- user-friendly middleware for real-time messaging ser- tant in decoupling, asynchronous communication, traffic vices, capable of managing hundreds of millions of mes- clipping, and other issues. It can improve the perfor- sages. In addition, NSQ is fitted to the current concurrent mance and stability of applications. Therefore, message Internet ecosystem due to Go’s native strengths in concur- queue as a critical middleware acquires more attention rency. Go is a programming language with concurrency in the Internet field. features, and its concurrency model was developed based With the evolution of technology, message queues on the process communication concept of CSP (Com- are gradually maturing, resulting in a series of outstand- municating Sequential Processes) [10, 11]. This feature ing middleware, including ActiveMQ [2], RabbitMQ [3], makes the Go-based NSQ distributed system well-suited Kafka [4], and RocketMQ [5]. These services decouple to the producer-consumer concurrency problem. There- complex systems and enable asynchronous operations fore, it is becoming popular within businesses and has to reduce response times, providing a better user experi- also attracted the attention of researchers. ence. Although the introduction of middleware can sig- NSQ is suitable for distributed applications and sys- nificantly improve the performance of a system, we must tems that require asynchronous messaging, such as Social consider its potential problems and challenges, such as Media, Gaming, and other industries that require high reduced availability due to unstable message queues and concurrency. Until now, existing studies primarily fo- data inconsistencies due to concurrent communication. cus on comparing different message queues performance, The system needs to introduce additional mechanisms operability, and other characteristics [8] or delve into practical applications of NSQ [7]. To the best of our QuASoQ 2023: 11th International Workshop on Quantitative Approaches to Software Quality, December 04, 2023, Seoul, South knowledge, there has yet to be research about the verifi- Korea cation of its properties, which are significant for users. $ 51215902076@stu.ecnu.edu.cn (D. Wang); jqyin@nwpu.edu.cn And the fundamental attributes of the system still need (J. Yin); 52265902002@stu.ecnu.edu.cn (S. Chen); to be proven. hbzhu@sei.ecnu.edu.cn (H. Zhu) © 2023 Copyright for this paper by its authors. Use permitted under Creative Employing a formal verification approach to verify CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) NSQ’s fundamental properties offers rigorous proof and CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings 74 • Topic: It is a distinct stream of messages. An NSQ instance can have multiple Topics, each of which can have one or more Channels. • Channel: It is a logical grouping of consumers sub- scribed to a given Topic. Each Channel receives a copy of all the messages for that Topic. The Topic and Channel in the NSQ system are imple- mented by Go’s channel data type. Go-chan builds on the idea of channel in CSP, it allows data transfers and syn- chronization operations between concurrent processes. A channel with cache space are also the natural way to Figure 1: An Instance of the NSQ System express queue structure. Therefore, essentially NSQ’s Topic/Channel is a buffered queue for message. After learning the basic terms, we can introduce the assurance, ensuring the system’s correctness, reliability, NSQ system further from two core workflows: Message and stability. This approach enhances confidence and multi-cast and Message consumption. credibility in the design, implementation, and deploy- 2.1.1. Message Multi-cast ment of the system, which is paramount for developers and users. Consequently, this paper bridges this research NSQ designs nsqd to handle multiple data streams con- gap by adopting formal methods to analyze the NSQ sys- currently. Each Topic can have one or more Channels. tem. We utilize process algebra CSP to formally model Topics multicast the received messages to Channels, and the core functional modules and basic workflow of the each Channel receives copies of messages. In practice, NSQ such as message publishing, subscription, registra- Channels map to downstream services that subscribe the tion, and querying. Subsequently, leveraging the model Topics. Topics and Channels are not preconfigured but checker PAT [12], we verify five properties of the model, are created upon the first publication or subscription. including Divergence Freedom, Reachability, Scalability, Within nsqd, Topics and Channels independently buffer Availability, and Flow Controllability. Experimental re- data to prevent lagging consumers from affecting other sults demonstrate that the NSQ distributed message plat- Channels. Messages are delivered to a randomly client form can guarantee all these properties, proving that the when all clients are ready, achieving load balancing. system has outstanding flexibility and robustness. 2.1.2. Message Consumption The remainder of this paper is organized as follows. Unlike many conventional message queues, NSQ maxi- Section II briefly describes the NSQ system and process mizes performance and throughout by pushing data to algebra CSP. In Section III, we use CSP to model four the client instead of waiting for it to pull. This concept fundamental components in the NSQ message system. is called the RDY (Ready) state, constituting a form of Furthermore, in Section IV, we employ the model check- client-side flow control. This RDY state is a pivotal perfor- ing tool PAT to implement the constructed models and mance parameter, allowing clients to modulate message verify five properties we defined. Finally, we summarize by adjusting the RDY value. Once clients establish con- this paper and discuss future work in Section V. nections and subscriptions, they assert control over the flow of messages from nsqd by dynamically updating the 2. Background RDY value. In this section we give a brief description of the NSQ’s 2.2. CSP architecture and process algebraic language CSP. Process Algebra CSP [10, 11] is a formal mathematical 2.1. NSQ - New Simple Queue method that is widely applied in the design and verifi- A typical architecture of the NSQ system is displayed in cation of concurrent systems. This language has been Fig. 1. Before furthering into the transmission mechanism successfully applied in modeling and verifying various of the NSQ, we should familiar with the following terms: concurrent systems and protocols [13, 14]. Parts of the CSP syntax used in this paper is defined as follows: • nsqd: The NSQ daemon responsible for receiving and 𝑃 , 𝑄 ::= 𝑆𝐾𝐼𝑃 | 𝑐?𝑢 → 𝑃 | 𝑐!𝑣 → 𝑃 | 𝑃 □𝑄 | 𝑃 ||𝑄 delivering messages. nsqd instances manage the actual | 𝑃 |||𝑄 | 𝑃 [|𝑋|]𝑄 | 𝑃 ◁𝐵▷𝑄 | 𝑃 ; 𝑄 message storage and distribution. • nsqlookupd: The NSQ lookup daemon that manages • SKIP: The process terminates properly. topology information. It receives registration informa- • 𝑐?𝑢 → 𝑃 : The process receives a value from channel tion and provides service discovery. c and assigns it to variable u, then starts P. 75 • 𝑐!𝑣 → 𝑃 : The process sends value v to channel c and Table 1 shows the definitions we defined for the rele- then starts executing process P. vant sets employed in the modeling process. The Module • 𝑃 □𝑄: It depicts a general choice between process P set contains all modules of the NSQ messaging system. and process Q. The ID set consists of unique identifiers for each object within the system. Commands describe the instructions • 𝑃 |||𝑄: It illustrates interleaving. Processes P and Q managing interactions within the NSQ, such as message run simultaneously and do not share any operations publication (PUB) and subscription (SUB). The Data set or variables. indicates the topological information queried by compo- • 𝑃 ◁𝐵▷𝑄: It portrays the execution of process P if nents, and the Ack set is internal feedback. the boolean expression b is true; otherwise, process Based on the above collections, we give the definition Q will be executed. of the Message transferred between components: 𝑀 𝑆𝐺𝑟𝑒𝑞 = {𝑚𝑠𝑔𝑟𝑒𝑞 .𝐴.𝐵.𝐴𝑐𝑡𝑖𝑜𝑛.𝐶𝑜𝑛𝑡𝑒𝑛𝑡 | 3. Modeling 𝐴 ∈ 𝑀 𝑜𝑑𝑢𝑙𝑒, 𝐵 ∈ 𝑀 𝑜𝑑𝑢𝑙𝑒, In this section, we construct the model of NSQ distributed 𝐴𝑐𝑡𝑖𝑜𝑛 ∈ 𝐶𝑜𝑚𝑚𝑎𝑛𝑑, 𝐶𝑜𝑛𝑡𝑒𝑛𝑡 ∈ 𝐼𝐷} architecture as illustrated in Fig. 1. 𝑀 𝑆𝐺𝑟𝑒𝑝 = {𝑚𝑠𝑔𝑟𝑒𝑝 .𝐴.𝐵.𝐴𝑐𝑡𝑖𝑜𝑛.𝐶𝑜𝑛𝑡𝑒𝑛𝑡 | 3.1. Sets, Messages and Channels 𝐴 ∈ 𝑀 𝑜𝑑𝑢𝑙𝑒, 𝐵 ∈ 𝑀 𝑜𝑑𝑢𝑙𝑒, 𝐴𝑐𝑡𝑖𝑜𝑛 ∈ 𝐶𝑜𝑚𝑚𝑎𝑛𝑑, For a more detailed understanding of how the compo- nents within the NSQ system communicate and inter- 𝐶𝑜𝑛𝑡𝑒𝑛𝑡 ∈ {𝐼𝐷, 𝐴𝐶𝐾, 𝐷𝐴𝑇 𝐴}} act, we have laid out explanations for the fundamental 𝑀 𝑆𝐺𝑑𝑎𝑡𝑎 = {𝑚𝑠𝑔𝑑𝑎𝑡𝑎 .𝐶𝑜𝑛𝑡𝑒𝑛𝑡 | 𝐶𝑜𝑛𝑡𝑒𝑛𝑡 ∈ 𝐼𝐷} building blocks used in the model: Sets, Messages, and Communication Channels. 𝑀 𝑆𝐺𝑟𝑒𝑞 denotes the set of request messages, 𝑀 𝑆𝐺𝑟𝑒𝑝 means the set of responses, and 𝑀 𝑆𝐺𝑑𝑎𝑡𝑎 rep- Table 1 resents the set of transmitted data. The correspondence between sets and constants/variables Next, we define the Channels responsible for communi- Set Constant / Variable cation between the modules and refer to these Channels Module P(Producer), C(Consumer), D(nsqd),LD(nsqlookupd) pid(producer ID), cid(consumer ID), with the label 𝐶𝑂𝑀 _𝑃 𝐴𝑇 𝐻. ID did(nsqid), lid(nsqlookupd ID), • ComCL: channels between consumer and nsqlookupd. tid(Topic ID), chid(Channel ID), msgid(message ID) FIN(Finish), REQ(Requeue), SUB(Subscribe), PUB(Publish), • ComPD: channels between producer and nsqd. Command REGISTER(Register),MSG(Message),REP(Response), LOOKUPCHA(Lookup channel), LOOKUPD(Lookup nsqd) • ComDL: channels between nsqd and nsqlookupd. chList(registered channel list), Data dList(nsqd list with specific Topic) • ComCD: channels between consumer and nsqd. ACK OK, OUTTIME We also define the channels used internally by compo- nents with the label 𝑀 𝑆𝐺_𝑃 𝐴𝑇 𝐻. These channels have cache space and are responsible for caching mes- sages. Fig. 2 shows all the channels we have defined. • MsgTPC: message cache channels of Topics. • MsgCHA: message cache channels of Channels. • MsgCON: message cache channels of consumers. 3.2. Overall Modeling The NSQ system embodies an intricate workflow. Due to the page limit, we only present part of the core modeling codes in this section. The whole 𝑆𝑦𝑠𝑡𝑒𝑚() as below: 𝑆𝑦𝑠𝑡𝑒𝑚() =𝑑𝑓 |||𝑝𝑖𝑑∈𝑃 𝐼𝐷,𝑑𝑖𝑑∈𝐷𝐼𝐷,𝑙𝑖𝑑∈𝐿𝐼𝐷,𝑐𝑖𝑑∈𝐶𝐼𝐷 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟𝑝𝑖𝑑 [|𝐶𝑂𝑀 _𝑃 𝐴𝑇 𝐻|] 𝑛𝑠𝑞𝑑𝑑𝑖𝑑 ⎛ ⎞ ⎝ [|𝐶𝑂𝑀 _𝑃 𝐴𝑇 𝐻|] 𝑛𝑠𝑞𝑙𝑜𝑜𝑘𝑢𝑝𝑑𝑙𝑖𝑑 ⎠ Figure 2: Channels of the NSQ System [|𝐶𝑂𝑀 _𝑃 𝐴𝑇 𝐻|] 𝐶𝑜𝑛𝑠𝑢𝑚𝑒𝑟𝑐𝑖𝑑 76 It describes the concurrent model where producers, 𝑚𝑠𝑔𝑃 𝑢𝑠ℎ𝑑𝑖𝑑 () is responsible for pushing messages nsqds, nsqlookupds, and consumers run in parallel and to clients by load balancing strategy. In the NSQ mes- collaborate over the [|𝐶𝑂𝑀 _𝑃 𝐴𝑇 𝐻|] channels. The saging system, this strategy is achieved by employing 𝑝𝑖𝑑 denotes the producer ID, and 𝑃 𝐼𝐷 means the set of a random distribution strategy, wherein messages are 𝑝𝑖𝑑. Other characters such as 𝑑𝑖𝑑, 𝑙𝑖𝑑 are similar. randomly dispatched to clients subscribed to the same Channel. 𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] signifies the number of messages 3.3. Producer 𝑐𝑜𝑛𝑠𝑢𝑚𝑒𝑟𝑖𝑑 can process from a specific 𝑛𝑠𝑞𝑑𝑖𝑑 . We use The producer is responsible for generating and sending the 𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑 [𝑐𝑖𝑑] array to mark whether 𝑛𝑠𝑞𝑑𝑑𝑖𝑑 messages to corresponding Topics. It communicates with is in the state of pushing messages to 𝑐𝑜𝑛𝑠𝑢𝑚𝑒𝑟𝑐𝑖𝑑 . nsqd the nsqd directly and publishes messages to the nsqd only sends messages to clients who can process messages. module through the 𝐶𝑜𝑚𝑃 𝐷𝑖 channel. We model this process using the General Choice in CSP. 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟𝑝𝑖𝑑 () =𝑑𝑓 𝐸𝑥𝑒𝑐𝐿𝑜𝑜𝑝𝑑𝑖𝑑 () =𝑑𝑓 ⎛ ⎞ ⎛ 𝐶𝑜𝑚𝑃 𝐷𝑖 !𝑚𝑠𝑔𝑟𝑒𝑞 .𝑝𝑖𝑑.𝑑𝑖𝑑.𝑃 𝑈 𝐵.𝑡𝑖𝑑.𝑀 𝑆𝐺𝐼𝐷 → )︂ → 𝐶𝑜𝑚𝑃 𝐷𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑞 .𝑝𝑖𝑑.𝑑𝑖𝑑.𝑃 𝑈 𝐵.𝑡𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 ⎞ ⎜ 𝐶𝑜𝑚𝑃 𝐷𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑝 .𝑑𝑖𝑑.𝑝𝑖𝑑.𝑅𝐸𝑃.𝑚𝑠𝑔𝑖𝑑.𝑂𝐾 → ⎟ ⎜ (︂ 𝑐𝑟𝑒𝑎𝑡𝑇 𝑜𝑝𝑖𝑐(𝑑𝑖𝑑, 𝑡𝑖𝑑) ; ⎜ ⎟ ⎟ ⎜ 𝑢𝑝𝑑𝑎𝑡𝑒𝑀 𝑠𝑔𝑆𝑡𝑎𝑡𝑒𝑠 {𝑃 𝑚𝑠𝑔𝑆𝑡𝑎𝑡𝑒𝑠[𝑚𝑠𝑔𝑖𝑑] == 1; } ; ⎟ ⎜ ◁𝐷𝑇 𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑 [𝑡𝑖𝑑] == 0 ▷ 𝑆𝐾𝐼𝑃 ⎟ ⎜ (︂ )︂ ⎟ ⎜ ⎟ ⎝ 𝑆𝐾𝐼𝑃 ◁ 𝑚𝑠𝑔𝑖𝑑 == 1▷ ⎠ ⎜ 𝑀 𝑠𝑔𝑇 𝑃 𝐶𝑗 !𝑚𝑠𝑔𝑑𝑎𝑡𝑎 .𝑡𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 → ⎜ ⎟ ⎟ 𝑛𝑒𝑥𝑡𝑀 𝑠𝑔{𝑀 𝑆𝐺𝐼𝐷 + +; } → 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟𝑝𝑖𝑑 () ⎝ 𝐶𝑜𝑚𝑃 𝐷𝑖 !𝑚𝑠𝑔𝑟𝑒𝑝 .𝑑𝑖𝑑.𝑝𝑖𝑑.𝑅𝐸𝑃.𝑚𝑠𝑔𝑖𝑑.𝑂𝐾 → ⎠ ◁ 𝑝𝑖𝑑 == 0▷ 𝑆𝐾𝐼𝑃 ⎛ 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟 ⎞ □ 𝑝𝑖𝑑 () ⎛ ⎞ ⎜ ◁ 𝑃 𝑚𝑠𝑔𝑆𝑡𝑎𝑡𝑒𝑠[0] == 1&&𝑃 𝑚𝑠𝑔𝑆𝑡𝑎𝑡𝑒𝑠[1] == 1 ▷ ⎞ ⎟ 𝐶𝑜𝑚𝐶𝐷𝑘 ?𝑚𝑠𝑔𝑟𝑒𝑞 .𝑐𝑖𝑑.𝑑𝑖𝑑.𝐹 𝐼𝑁.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 → ⎜ ⎛ 𝐶𝑜𝑚𝑃 𝐷𝑖 !𝑚𝑠𝑔𝑟𝑒𝑞 .𝑝𝑖𝑑.𝑑𝑖𝑑.𝑃 𝑈 𝐵.𝑡𝑖𝑑.2 → 𝑢𝑝𝑑𝑎𝑡𝑒𝑀 𝑠𝑔{𝐷𝑀 𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑,𝑡𝑖𝑑,𝑐ℎ𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] = −1} → ⎠ ⎟ ⎝ ; ⎜ ⎟ ⎜ ⎜ 𝐶𝑜𝑃 𝐷𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑝 .𝑑𝑖𝑑.𝑝𝑖𝑑.𝑅𝐸𝑃.2.𝑂𝐾 → 𝑆𝐾𝐼𝑃 ⎜ ⎜ ⎟ ⎟⎟ ⎝ ⎝ 𝑢𝑝𝑑𝑎𝑡𝑒𝑀 𝑠𝑔𝑆𝑡𝑎𝑡𝑒𝑠 {𝑃 𝑚𝑠𝑔𝑆𝑡𝑎𝑡𝑒𝑠[2] == 1; } ⎟ ⎠ ⎠ □ → 𝑆𝐾𝐼𝑃 ⎛ ⎞ 𝐶𝑜𝑚𝐶𝐷𝑘 ?𝑚𝑠𝑔𝑟𝑒𝑞 .𝑐𝑖𝑑.𝑑𝑖𝑑.𝑅𝐸𝑄.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 → We define two type 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟𝑖𝑑 . 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟0 sends ⎜ 𝑢𝑝𝑑𝑎𝑡𝑒𝑀 𝑠𝑔{𝐷𝑀 𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑,𝑡𝑖𝑑,𝑐ℎ𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] + +} → ⎟ ⎜ messages with message id 0, 1 while 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟1 with ⎝ 𝑀 𝑠𝑔𝐶𝐻𝐴𝑙 !𝑚𝑠𝑔𝑑𝑎𝑡𝑎 .𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 → ⎟ ⎠ id 2. The producers publish three messages to simulate 𝑆𝐾𝐼𝑃 the practical operation of the NSQ. Furthermore, we re- □ strict that 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟1 must wait for 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟0 to finish ⎛ 𝐶𝑜𝑚𝐶𝐷 ?𝑚𝑠𝑔 .𝑐𝑖𝑑.𝑑𝑖𝑑.𝑆𝑈 𝐵.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑 → ⎞ 𝑘 𝑟𝑒𝑞 sending before it sends the message. ⎜ ⎛ 𝑐𝑟𝑒𝑎𝑡𝑒𝑇 𝑜𝑝𝑖𝑐(𝑑𝑖𝑑, 𝑡𝑖𝑑); ⎞ ⎟ ⎜ ⎜ 𝑢𝑝𝑑𝑎𝑡𝑒𝐶ℎ𝑎𝑛𝑛𝑒𝑙{ ⎜ ⎜ ⎟ ⎟ ⎟ ⎟ 3.4. nsqd 𝐷𝐶ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑,𝑡𝑖𝑑 [𝑐ℎ𝑖𝑑] = 1; } → ⎜ ⎝ ⎠ ⎟ ⎜ ⎟ 𝑁 𝑜𝑡𝑖𝑓 𝑦(𝑑𝑖𝑑, 𝑡𝑖𝑑, 𝑐𝑖𝑑) ⎜ ⎟ ⎜ ⎟ The nsqd is daemon that receives, queues, and delivers ⎜ ◁ ⎜ 𝐷𝑇 𝑆𝑡𝑎𝑡𝑒𝑠 𝑑𝑖𝑑 [𝑡𝑖𝑑] == 0 ▷ ⎟ ⎞ ⎞ ⎟ messages to clients. It handles multiple streams of data ⎜ ⎛ ⎛ 𝑢𝑝𝑑𝑎𝑡𝑒𝐶ℎ𝑎𝑛𝑛𝑒𝑙{ ⎟ ⎟; ⎜ ⎟ at once through the unique design of Topic and Channel. 𝐷𝐶ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑,𝑡𝑖𝑑 [𝑐ℎ𝑖𝑑] = 1; } → ⎜ ⎜ ⎝ ⎠ ⎟ ⎜ ⎜ ⎟ ⎟ We modeled three core functions of nsqd. 𝑁 𝑜𝑡𝑖𝑓 𝑦(𝑑𝑖𝑑, 𝑡𝑖𝑑, 𝑐𝑖𝑑) ⎜ ⎝ ⎠ ⎟ ⎜ ⎟ The entire nsqd process execute as flowing: ◁𝐷𝐶ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑,𝑡𝑖𝑑 [𝑐ℎ𝑖𝑑] == 0 ▷ 𝑆𝐾𝐼𝑃 ⎜ ⎟ ⎜ ⎟ ⎜ 𝑎𝑑𝑑𝐶𝑙𝑖𝑒𝑛𝑡{𝑇 𝐶ℎ2𝐶𝑑𝑖𝑑,𝑡𝑖𝑑,𝑐ℎ𝑖𝑑 [𝑐𝑖𝑑] = 1; ]} → ⎜ ⎟ ⎟ 𝑝𝑢𝑚𝑝𝑀 𝑠𝑔{𝑠𝑡𝑎𝑟𝑡𝑀 𝑠𝑔𝑃 𝑢𝑚𝑝 [𝑡𝑖𝑑] = 1; } → ⎜ ⎟ 𝑁 𝑠𝑞𝑑𝑑𝑖𝑑 () =𝑑𝑓 ⎜ 𝑑𝑖𝑑 ⎝ 𝐶𝑜𝑚𝐶𝐷 !𝑚𝑠𝑔 .𝑑𝑖𝑑.𝑐𝑖𝑑.𝑅𝐸𝑃.𝑆𝑈 𝐵.𝑂𝐾 → ⎟ ⎠ 𝑘 𝑟𝑒𝑝 𝐸𝑥𝑒𝑐𝐿𝑜𝑜𝑝𝑑𝑖𝑑 ()|||𝑚𝑠𝑔𝑃 𝑢𝑚𝑝𝑑𝑖𝑑 ()|||𝑚𝑠𝑔𝑃 𝑢𝑠ℎ𝑑𝑖𝑑 (); 𝑆𝐾𝐼𝑃 The 𝐸𝑥𝑒𝑐𝐿𝑜𝑜𝑝𝑑𝑖𝑑 () is the main execution loop that 𝐸𝑥𝑒𝑐𝐿𝑜𝑜𝑝𝑑𝑖𝑑 (); drives the core functions of the NSQ daemon. It is respon- 𝑚𝑠𝑔𝑃 𝑢𝑠ℎ𝑑𝑖𝑑 () =𝑑𝑓 sible for constantly listens requests from other compo- ⎛ ⎛ 𝑀 𝑠𝑔𝐶𝐻𝐴 ?𝑚𝑠𝑔 𝑑𝑎𝑡𝑎 .𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 ⎞ ⎞ 𝑖 nents and processes them according to predefined logic. {𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑 [𝑐𝑖𝑑] = 1; } → We model four basic command handling logics, including ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ 𝐶𝑜𝑚𝐶𝐷𝑗 !𝑚𝑠𝑔𝑟𝑒𝑝 .𝑑𝑖𝑑.𝑐𝑖𝑑. ⎜ ⎜ ⎟ ⎟ 𝑅𝐸𝑄, 𝑆𝑈 𝐵, 𝑃 𝑈 𝐵 and 𝐹 𝐼𝑁 . ⎟ ⎟ 𝑀 𝑆𝐺.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ ⎟ ⎟ Multicasting and delivery of messages is a core func- □ ⎜ ⎜ ⎝ {𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑑𝑖𝑑 [𝑐𝑖𝑑] = 0; } → ⎠ ⎟; ⎟ tion of nsqd. The relationship between Topics and 𝑐𝑖𝑑 ⎜ ⎜ 𝑆𝐾𝐼𝑃 ⎟ ⎟ Channels is established through multicast, ensuring that (︂ )︂ 𝑇 𝐶ℎ2𝐶𝑑𝑖𝑑,𝑡𝑖𝑑,𝑐ℎ𝑖𝑑 [𝑐𝑖𝑑] == 1 ⎜ ⎟ ⎝ ◁ ▷ ⎜ ⎟ each Channel receives a copy of all messages associated & 𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] > 0 ⎠ with a given Topic. This logic is implemented by the 𝑆𝐾𝐼𝑃 𝑚𝑠𝑔𝑃 𝑢𝑚𝑝𝑑𝑖𝑑 () process. 𝑚𝑠𝑔𝑃 𝑢𝑠ℎ𝑑𝑖𝑑 (); 77 3.5. nsqlookupd signal through 𝐶𝑜𝑚𝐷𝐿𝑖 and then remove all informa- tion associated with the corresponding nsqd from its The nsqlookupd daemon manages the system’s topology records. This process ensures that the information stored information. nsqlookupd provides discovery and registra- on nsqlookupd remains consistently available. tion services, which decouple consumers from producers. The formal modeling of nsqlookupd is as follows. 3.6. Consumer 𝑛𝑠𝑞𝑙𝑜𝑜𝑘𝑢𝑝𝑑𝑙𝑖𝑑 () =𝑑𝑓 When a consumer is initiated, it queries nsqlookupd for the addresses of nsqd instances associated with the target 𝐿𝑜𝑜𝑘𝑢𝑝𝑙𝑖𝑑 () ||| 𝑅𝑒𝑔𝑖𝑠𝑡𝑒𝑟𝑙𝑖𝑑 () ||| 𝐸𝑟𝑟𝑜𝑟𝐻𝑎𝑛𝑑𝑙𝑒𝑟𝑙𝑖𝑑 (); Topics. Upon receiving the addresses, it subscribes to all The 𝑅𝑒𝑔𝑖𝑠𝑡𝑒𝑟𝑙𝑖𝑑 () process handles the registration of these instances. Only after these can the consumer requests sent by nsqd through the 𝐶𝑜𝑚𝐶𝐷𝑖 chan- activate processes for message retrieval and processing. nel,and record nsqd instance by 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠𝑙𝑖𝑑 [𝑑𝑖𝑑]. Therefore, the modeling of consumer is as follows: 𝐿𝑇 𝑆𝑡𝑎𝑡𝑒𝑠𝑙𝑖𝑑 [𝑡𝑖𝑑] stores all the registered Topics on the nsqlookupd, and 𝑇 2𝐷𝑙𝑖𝑑 [𝑡𝑖𝑑][𝑑𝑖𝑑] holds the cor- 𝐶𝑜𝑛𝑠𝑢𝑚𝑒𝑟𝑐𝑖𝑑,𝑡𝑖𝑑,𝑐ℎ𝑖𝑑 () =𝑑𝑓 responding nsqd addresses for each Topic. Similarly, 𝐶𝑜𝑛𝑛𝑇 𝑜𝐿𝑜𝑜𝑘𝑢𝑝𝑑𝑠𝑐𝑖𝑑,𝑡𝑖𝑑 (); 𝐿𝐶ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑙𝑖𝑑,𝑡𝑖𝑑 [𝑐ℎ𝑖𝑑] and 𝑇 𝐶2𝐷𝑙𝑖𝑑 [𝑡𝑖𝑑][𝑐ℎ𝑖𝑑][𝑑𝑖𝑑] (𝐻𝑎𝑛𝑑𝑙𝑒𝑟𝑐𝑖𝑑 () ||| 𝑅𝑒𝑎𝑑𝐿𝑜𝑜𝑝𝑐𝑖𝑑 ()) ; serve same functions for Channels. 𝑅𝑒𝑔𝑖𝑠𝑡𝑒𝑟𝑙𝑖𝑑 () =𝑑𝑓 𝐶𝑜𝑛𝑛𝑇 𝑜𝐿𝑜𝑜𝑘𝑢𝑝𝑑𝑠𝑐𝑖𝑑,𝑡𝑖𝑑 () =𝑑𝑓 𝐶𝑜𝑚𝐷𝐿𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑞 .𝑑𝑖𝑑.𝑙𝑖𝑑.𝑅𝐸𝐺𝐼𝑆𝑇 𝐸𝑅.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑 → 𝐿𝑂𝑂𝑃 (𝑙𝑖𝑑 : 0..𝐿𝐷) : 𝑎𝑑𝑑𝑛𝑠𝑞𝑑{𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠𝑙𝑖𝑑 [𝑑𝑖𝑑] = 1; } → 𝑆𝐾𝐼𝑃 ◁ 𝑎𝑑𝑑𝑟𝐿𝑜𝑜𝑘𝑢𝑝𝑑[𝑙𝑖𝑑] == 0▷ ⎛ ⎛ 𝑟𝑒𝑔𝑖𝑠𝑡𝑒𝑟𝑇 𝑜𝑝𝑖𝑐{ ⎞ ⎞ ⎛ 𝑎𝑑𝑑𝐿𝐷{𝐶𝐿𝑆𝑡𝑎𝑡𝑒𝑠 [𝑙𝑖𝑑] = 1; } → ⎞ 𝑐𝑖𝑑 ⎜ ⎝ 𝐿𝑇 𝑆𝑡𝑎𝑡𝑒𝑠𝑙𝑖𝑑 [𝑡𝑖𝑑] = 1; ⎠ ⎟ ⎜ 𝑐𝑜𝑢𝑛𝑡{𝑡𝑜𝑡𝑎𝑙𝐿𝐷 = 𝑐𝑜𝑢𝑛𝑡𝐿𝐷(𝑙𝑖𝑑, 𝑐𝑖𝑑); }; ⎟ 𝐶𝑜𝑚𝐶𝐿𝑖 !𝑚𝑠𝑔𝑟𝑒𝑞 . ⎜ ⎛ ⎛ ⎞ ⎞ ⎟ 𝑇 2𝐷𝑙𝑖𝑑 [𝑡𝑖𝑑][𝑑𝑖𝑑] = 1; } → 𝑆𝐾𝐼𝑃 ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ 𝑐𝑖𝑑.𝑙𝑖𝑑.𝐿𝑂𝑂𝐾𝑈 𝑃 𝐷.𝑡𝑖𝑑 → ⎜ ⎟ ⎜ ◁𝑐ℎ𝑖𝑑 == −1▷ ⎜ ⎟ ⎜ ⎜ ⎜ ⎟ ⎟ ⎟ ⎟ 𝐶𝑜𝑚𝐶𝐿 ?𝑚𝑠𝑔 . ⎜ ⎜ ⎜ ⎟ ⎟ ⎟ 𝑟𝑒𝑔𝑖𝑠𝑡𝑒𝑟𝑇 𝑜𝑝𝑖𝑐𝐴𝑛𝑑𝐶ℎ𝑎𝑛{ 𝑖 𝑟𝑒𝑝 ⎜ ⎛ ⎞ ⎟ ⎟ ⎟ ⎟; ⎜ ⎜ ⎜ ⎟ ⎟ ⎟ ⎟; ⎜ ⎟ 𝑙𝑖𝑑.𝑐𝑖𝑑.𝑅𝐸𝑃.𝑡𝑖𝑑.𝑑𝑙𝑖𝑠𝑡 → ⎜ ⎜ ⎜ 𝐿𝑇 𝑆𝑡𝑎𝑡𝑒𝑠 [𝑑𝑖𝑑] = 1; ⎜ 𝑙𝑖𝑑 ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ ⎜ ⎟ ⎟ ⎟ 𝐿𝑂𝑂𝑃 (𝑑𝑖𝑑 : 0..𝐷) : ⎜ ⎜ ⎜ ⎟ ⎟ ⎟ 𝐿𝐶ℎ𝑆𝑡𝑎𝑡𝑒𝑠𝑙𝑖𝑑,𝑡𝑖𝑑 [𝑐ℎ𝑖𝑑] = 1; ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ ⎜ (︂ ⎟ ⎟ ⎟ ⎜ ⎜ ⎟ ⎟ )︂ 𝐶𝑜𝑛𝑛𝑇 𝑜𝑁 𝑠𝑞𝑑𝑐𝑖𝑑,𝑑𝑖𝑑,𝑡𝑖𝑑 () ⎜ ⎜ ⎝ ⎠ ⎟ ⎟ 𝑇 2𝐷𝑙𝑖𝑑 [𝑡𝑖𝑑][𝑑𝑖𝑑] = 1; ⎜ ⎜ ⎟ ⎟ ; ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ ⎟ ⎟ ⎝ ⎝ 𝑇 𝐶2𝐷𝑙𝑖𝑑 [𝑡𝑖𝑑][𝑐ℎ𝑖𝑑][𝑑𝑖𝑑] = 1; } → ⎠ ⎠ ⎝ ⎝ ◁𝑑𝑙𝑖𝑠𝑡[𝑑𝑖𝑑] == 1 ▷ 𝑆𝐾𝐼𝑃 ⎠ ⎠ 𝑆𝐾𝐼𝑃 ◁𝑡𝑜𝑡𝑎𝑙𝐿𝐷 == 1 ▷ 𝑆𝐾𝐼𝑃 𝑅𝑒𝑔𝑖𝑠𝑡𝑒𝑟𝑙𝑖𝑑 (); 𝐶𝑜𝑛𝑛𝑇 𝑜𝑁 𝑠𝑞𝑑𝑐𝑖𝑑,𝑑𝑖𝑑,𝑡𝑖𝑑 () =𝑑𝑓 𝐿𝑜𝑜𝑘𝑢𝑝𝑙𝑖𝑑 () formalizes nsqlookupd’s responses to 𝑆𝐾𝐼𝑃 ◁ 𝐶𝐷𝑆𝑡𝑎𝑡𝑒𝑠𝑐𝑖𝑑 [𝑑𝑖𝑑] == 1▷ queries from consumers and nsqd instances using Gen- ⎛ ⎞ 𝐶𝑜𝑚𝐶𝐷𝑖 !𝑚𝑠𝑔𝑟𝑒𝑞 .𝑐𝑖𝑑.𝑑𝑖𝑑.𝑆𝑈 𝐵.𝑡𝑖𝑑.𝑐2𝑐ℎ[𝑐𝑖𝑑] → eral Choice. 𝑙𝑜𝑜𝑘𝑝𝑛𝑠𝑞𝑑(𝑙𝑖𝑑, 𝑡𝑖𝑑) provides all the stored ⎜ 𝐶𝑜𝑚𝐶𝐷𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑝 .𝑑𝑖𝑑.𝑐𝑖𝑑.𝑅𝐸𝑃.𝑆𝑈 𝐵.𝑂𝐾 → ⎜ ⎟ ⎟; nsqd address information associated with a specific Topic ⎝ 𝑎𝑑𝑑𝑛𝑠𝑞𝑑{𝐶𝐷𝑆𝑡𝑎𝑡𝑒𝑠𝑐𝑖𝑑 [𝑑𝑖𝑑] = 1; } → ⎠ in nsqlookupd. Similarly, the 𝑙𝑜𝑜𝑘𝑢𝑝𝐶ℎ𝑎𝑛𝑛𝑒𝑙(𝑙𝑖𝑑, 𝑡𝑖𝑑) 𝑢𝑝𝑑𝑎𝑡𝑒𝑅𝐷𝑌 {𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] = 1} → 𝑆𝐾𝐼𝑃 returns Channels list under the specified Topic. The above formula models the process of a con- 𝐿𝑜𝑜𝑘𝑢𝑝𝑙𝑖𝑑 () =𝑑𝑓 sumer connecting to nsqdlookupds and nsqds. The ⎛ ⎞ consumer sends a 𝑆𝑈 𝐵 request to the nsqd through 𝐶𝑜𝑚𝐶𝐿𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑞 .𝑐𝑖𝑑.𝑙𝑖𝑑.𝐿𝑂𝑂𝐾𝑈 𝑃 𝐷.𝑡𝑖𝑑 → 𝐶𝑜𝑚𝐶𝐷𝑖 channel. It records connection information ⎜ 𝑙𝑜𝑜𝑘𝑢𝑝𝑛𝑠𝑞𝑑(𝑙𝑖𝑑, 𝑡𝑖𝑑); in 𝐶𝐷𝑆𝑡𝑎𝑡𝑒𝑠𝑐𝑖𝑑 [𝑑𝑖𝑑] and updates the 𝑅𝑑𝑦𝑐𝑖𝑑 value of ⎟ ⎜ ⎟ ⎝ 𝐶𝑜𝑚𝐶𝐿𝑖 !𝑚𝑠𝑔𝑟𝑒𝑝 .𝑙𝑖𝑑.𝑐𝑖𝑑.𝑅𝐸𝑃.𝑡𝑖𝑑.𝑑𝑙𝑖𝑠𝑡 → 𝑛𝑠𝑞𝑑𝑖𝑑 . In the formula, the value of 𝑅𝑑𝑦𝑐𝑖𝑑 is set to ⎠ 𝑆𝐾𝐼𝑃 1, indicating the consumer’s readiness to process one □ message from 𝑛𝑠𝑞𝑑𝑖 . ⎛ ⎞ 𝐶𝑜𝑚𝐷𝐿𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑞 .𝑑𝑖𝑑.𝑙𝑖𝑑.𝐿𝑂𝑂𝐾𝑈 𝑃 𝐶𝐻𝐴.𝑡𝑖𝑑 → 𝑟𝑒𝑎𝑑𝐿𝑜𝑜𝑝𝑐𝑖𝑑 () =𝑑𝑓 ⎜ 𝑙𝑜𝑜𝑘𝑢𝑝𝐶ℎ𝑎𝑛𝑛𝑒𝑙(𝑙𝑖𝑑, 𝑡𝑖𝑑); ⎟ ⎜ ⎟ ; 𝐶𝑜𝑚𝐶𝐷𝑖 ?𝑚𝑠𝑔𝑟𝑒𝑝 .𝑑𝑖𝑑.𝑐𝑖𝑑. ⎝ 𝐶𝑜𝑚𝐷𝐿𝑖!𝑚𝑠𝑔𝑟𝑒𝑝 .𝑙𝑖𝑑.𝑑𝑖𝑑.𝑅𝐸𝑃.𝑡𝑖𝑑.𝑐ℎ𝑙𝑖𝑠𝑡 → ⎠ 𝑆𝐾𝐼𝑃 𝑀 𝑆𝐺.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑{𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] − −; } → ⎛ (︂ 𝑢𝑝𝑑𝑎𝑡𝑒𝑅𝐷𝑌 {𝑅𝑑𝑦 [𝑑𝑖𝑑] + +; } → )︂ 𝐿𝑜𝑜𝑘𝑢𝑝𝑙𝑖𝑑 (); ⎞ 𝑐𝑖𝑑 ⎜ 𝑅𝑒𝑎𝑑𝐿𝑜𝑜𝑝𝑐𝑖𝑑,𝑑𝑖𝑑,𝑡𝑖𝑑 () ⎟ We also modeled the response of nsqlookupd to con- ⎜ ◁ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] == −1▷ ⎜ ⎟ ⎟; ⎟ nection errors. When nsqlookupd encounters connec- ⎜ ⎛ ⎞ 𝑀 𝑠𝑔𝐶𝑂𝑁𝑗 !𝑚𝑠𝑔𝑑𝑎𝑡𝑎 .𝑑𝑖𝑑.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑{ ⎜ ⎟ tion timeouts with nsqd, it will receive 𝑂𝑈 𝑇 𝑇 𝐼𝑀 𝐸 𝑚𝑠𝑔2𝑑𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] = 𝑑𝑖𝑑; } → ⎝ ⎝ ⎠ ⎠ 𝑅𝑒𝑎𝑑𝐿𝑜𝑜𝑝𝑐𝑖𝑑,𝑑𝑖𝑑,𝑡𝑖𝑑 () 78 After completing the subscription, the consumer main- Topic and associated two Channels. 𝑀 𝑠𝑔𝑁 𝑢𝑚 defines tains a TCP connection with nsqd to be ready to receive the number of messages. messages. The diminishing of 𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] value implies a decrease in the amount of messages consumers can 𝑣𝑎𝑟 𝑅𝑑𝑦[𝐶][𝐷] = [0, 0, 0, 0]; handle. 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] keeps track of the mes- 𝑣𝑎𝑟 𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠[𝐷][𝐶] = [0, 0, 0, 0]; sage attempts number. −1 signifies successful processing 𝑣𝑎𝑟 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[𝐿𝐷][𝐷] = [0, 0, 0, 0]; and will release 𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑]. Otherwise, the message is 𝑣𝑎𝑟 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[𝐶][𝑀 𝑠𝑔𝑁 𝑢𝑚] = [0, 0, 0, 0, 0, 0]; cached in the 𝑀 𝑠𝑔𝐶𝑂𝑁𝑖 channel for further processing. 𝐻𝑎𝑛𝑑𝑙𝑒𝑟𝑐𝑖𝑑 () is the message-handling module of the In addition, We define some arrays to store system consumer process. In our experiment, we use nonde- information, which assists us in confirming the status of terministic to model the message-processing behavior. processes. 𝑅𝑑𝑦[𝐶][𝐷] is used to record the number of We also model aborting re-queuing when the message messages the consumer can process. 𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠[𝐷][𝐶] attempts exceed the maximum value. 𝑀 𝑎𝑥𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠 de- marks whether the nsqd is in the state of pushing mes- fines the maximum number of message attempts allowed sages to the consumer. 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[𝐿𝐷][𝐷] logs informa- by the system. tion about registered instances of nsqd on nsqlookupd. 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[𝐶][𝑀 𝑠𝑔𝑁 𝑢𝑚] tracks the status of messages 𝐻𝑎𝑛𝑑𝑙𝑒𝑟𝑐𝑖𝑑 () =𝑑𝑓 processed on the consumer. 𝑀 𝑠𝑔𝐶𝑂𝑁𝑖 ?𝑚𝑠𝑔𝑑𝑎𝑡𝑎 .𝑑𝑖𝑑.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑 → Furthermore, we have implemented the relevant chan- nels in PAT based on the definitions provided earlier. We ⎛ ⎞ 𝐶𝑜𝑚𝐶𝐷𝑗 !𝑚𝑠𝑔𝑟𝑒𝑞 .𝑐𝑖𝑑.𝑑𝑖𝑑.𝐹 𝐼𝑁.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑{ ⎝ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] = −1; } → ⎠ use multidimensional arrays to store channels between 𝑢𝑝𝑑𝑎𝑡𝑒𝑅𝐷𝑌 {𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] + +; } → 𝑆𝐾𝐼𝑃 different entities is to avoid resource contention. ◁ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] > 𝑀 𝑎𝑥𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠▷ ⎛ ⎛ 𝐶𝑜𝑚𝐶𝐷 !𝑚𝑠𝑔 . ⎞ ⎞ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙 𝐶𝑜𝑚𝑃 𝐷[𝑃 ][𝐷] 0; 𝑗 𝑟𝑒𝑞 ⎜ ⎜ 𝑐𝑖𝑑.𝑑𝑖𝑑.𝐹 𝐼𝑁.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑{ ⎟ ⎟ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙 𝐶𝑜𝑚𝐶𝐷[𝐶][𝐷] 0; ⎜ ⎟ ⎟ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] = −1; } → ⎜ ⎝ ⎜ ⎠ ⎟ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙 𝑀 𝑠𝑔𝑇 𝑃 𝐶[𝐷][𝑇 ] 𝑀 𝑠𝑔𝑁 𝑢𝑚; 𝑢𝑝𝑑𝑎𝑡𝑒𝑅𝐷𝑌 {𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] + +; } → 𝑆𝐾𝐼𝑃 ⎜ ⎟ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙 𝑀 𝑠𝑔𝐶𝐻𝐴[𝐷][𝑇 ][𝐶𝐻𝐴] 𝑀 𝑠𝑔𝑁 𝑢𝑚; ⎜ ⎟ ⎜ ⊓ ⎜ ⎟ ⎜ ⎛ ⎞ ⎟ 𝐶𝑜𝑚𝐶𝐷𝑗 !𝑚𝑠𝑔𝑟𝑒𝑞 . ⎟ The channel definitions can be categorized into two ⎜ ⎟ 𝑐𝑖𝑑.𝑑𝑖𝑑.𝑅𝐸𝑄.𝑡𝑖𝑑.𝑐ℎ𝑖𝑑.𝑚𝑠𝑔𝑖𝑑{ ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ ⎟ ⎟ ⎝ ⎝ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠𝑐𝑖𝑑 [𝑚𝑠𝑔𝑖𝑑] + +; } → ⎠ ⎠ types: 𝐶𝑂𝑀 _𝑃 𝐴𝑇 𝐻 are used for inter-component 𝑢𝑝𝑑𝑎𝑡𝑒𝑅𝐷𝑌 {𝑅𝑑𝑦𝑐𝑖𝑑 [𝑑𝑖𝑑] + +; } → 𝑆𝐾𝐼𝑃 communication, where the channel size is set to 0 𝐻𝑎𝑛𝑑𝑙𝑒𝑟𝑐𝑖𝑑(); to achieve process synchronization. Cache channels 𝑀 𝑆𝐺_𝑃 𝐴𝑇 𝐻 are used within components, where the channel size is set to 𝑀 𝑠𝑔𝑁 𝑢𝑚. These channels are uti- 4. Verification lized for process synchronization and message buffering. Given that the NSQ message system operates with mul- In this section, we use the model-checking tool PAT to tiple producers, nsqds, nsqlookupds, and consumers, we realize the formal model constructed in section III, and employ a combination of interleaving and loop functions verify its properties. At the same time, the results of to realize the system’s implementation. The comprehen- properties verification are also shown at the end. sive definition of the NSQ system is presented as follows. |||𝑖 : {0..𝑁 }@𝑃 (𝑖); statement means that multiple pro- 4.1. Implementation cesses 𝑃 (𝑖) run interspersed in the PAT. This part presents details of the modeling implementation 𝑆𝑦𝑠𝑡𝑒𝑚() = with the PAT tool, mainly concerning the definition of |||𝑝𝑖𝑑 : {0..𝑃 − 1}; 𝑑𝑖𝑑 : {0..𝐷 − 1}; 𝑙𝑑𝑖𝑑 : {0..𝐿𝐷 − 1}; constants, array variables and channels. 𝑐𝑖𝑑 : {0..𝐶 − 1}; 𝑡𝑖𝑑 : {0..𝑇 − 1}; 𝑐ℎ𝑖𝑑 : {0..𝐶𝐻𝐴 − 1} (︂ )︂ #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑃 2; #𝑑𝑒𝑓 𝑖𝑛𝑒 𝐶 2; 𝑃 𝑟𝑜𝑑𝑢𝑐𝑒𝑟(𝑝𝑖𝑑, 𝑡𝑖𝑑) ‖ 𝑛𝑠𝑞𝑑(𝑑𝑖𝑑) ‖ @ #𝑑𝑒𝑓 𝑖𝑛𝑒 𝐷 2; #𝑑𝑒𝑓 𝑖𝑛𝑒 𝐿𝐷 2; 𝑛𝑠𝑞𝑙𝑜𝑜𝑘𝑢𝑝𝑑(𝑙𝑖𝑑) ‖ 𝐶𝑜𝑛𝑠𝑢𝑚𝑒𝑟(𝑐𝑖𝑑, 𝑡𝑖𝑑, 𝑐ℎ𝑖𝑑) #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑇 1; #𝑑𝑒𝑓 𝑖𝑛𝑒 𝐶𝐻𝐴 2; 4.2. Properties Verification #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑀 𝑠𝑔𝑁 𝑢𝑚 3; In this section, we verify the properties of the constructed We define constants as above to materialize the archi- model with the model checker PAT. These properties tecture of the NSQ system in Fig. 1. 𝑃 , 𝐷, 𝐿𝐷, and 𝐶 present the flexibility and robustness of NSQ distributed represent the number of producer, nsqd, nsqlookupd, and messaging platform. consumer. 𝑇 and 𝐶𝐻𝐴 denote that each nsqd has one 79 4.2.1. Divergence Freedom information about the available components of the sys- tem forever, and when instances of nsqd are abnormal, it In NSQ system, if messages can always flow and be han- deletes all information about the corresponding instances. dled in the correct way as they should, avoiding invalid We defined a new system to verify the high availability or infinite loops, then we think the system is divergence of NSQ. An 𝑂𝑈 𝑇 𝑇 𝐼𝑀 𝐸 event of 𝑛𝑠𝑞𝑑0 is added to the free. It is crucial for message systems because the correct- original system, which will be triggered when all mes- ness and stability of the system depends on the correct sages are finished. handling and delivery of messages. 𝑆𝑦𝑠𝑡𝑒𝑚2() = 𝑆𝑦𝑠𝑡𝑒𝑚()||| PAT provides the primitive to verify the divergence freedom of the system: ⎛ ⎞ |||𝑙𝑖𝑑 ⎛ : {0..𝐿𝐷 − 1} ⎞ ⎜ [𝑟𝑒𝑎𝑐ℎ𝑎𝑏𝑖𝑙𝑖𝑡𝑦] ⎟ ⎝ @ ⎝ 𝐶𝑜𝑚𝐷𝐿𝑖 !𝑚𝑠𝑔𝑟𝑒𝑞 .0.𝑙𝑖𝑑. ⎜ ⎟ #𝑎𝑠𝑠𝑒𝑟𝑡 𝑆𝑦𝑠𝑡𝑒𝑚() 𝑑𝑖𝑣𝑒𝑟𝑔𝑒𝑛𝑐𝑒𝑓 𝑟𝑒𝑒; ⎠ ⎠ 𝐸𝑅𝑅𝑂𝑅.𝑂𝑈 𝑇 𝑇 𝐼𝑀 𝐸 → 𝑆𝐾𝐼𝑃 ; 4.2.2. Reachability The above formula describes the new system, and we Data reachability is the basic property of message queue. verify in the PAT whether nsqlookupd maintains the list NSQ ensures at least one delivery of a message using the of available nsqd. The definition and assertion are as 𝐹 𝐼𝑁 and 𝑅𝐸𝑄, but it does not guarantee data order. In follows: our experiment, we track the attempts of messages with #𝑑𝑒𝑓 𝑖𝑛𝑒 𝐴𝑣𝑎𝑖𝑙𝑎𝑏𝑖𝑙𝑖𝑡𝑦{ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[𝐶][𝑀 𝑠𝑔𝑁 𝑢𝑚], where the value of −1 indi- cates the message is finished. Therefor, the definitions of 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[0][0] == 0 && 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[0][1] == 1 reachability and assertions are as follows: && 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[1][0] == 0 && 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[1][1] == 1}; #𝑎𝑠𝑠𝑒𝑟𝑡 𝑆𝑦𝑠𝑡𝑒𝑚2() |=<> 𝐴𝑣𝑎𝑖𝑙𝑎𝑏𝑖𝑙𝑖𝑡𝑦; #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑅𝑒𝑎𝑐ℎ𝑎𝑏𝑖𝑙𝑖𝑡𝑦{ 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[0][0] == −1 && 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[1][0] == −1 && 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[0][1] == −1 && 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[1][1] == −1 4.2.5. Flow Controllability && 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[0][2] == −1 && 𝐴𝑡𝑡𝑒𝑚𝑝𝑡𝑠[1][2] == −1}; NSQ can dynamically adjust messages’ processing rate #𝑎𝑠𝑠𝑒𝑟𝑡 𝑆𝑦𝑠𝑡𝑒𝑚() |=<> 𝑅𝑒𝑎𝑐ℎ𝑎𝑏𝑖𝑙𝑖𝑡𝑦; by changing the consumer’s RDY value. To verify this property, we need to demonstrate that nsqd can push As the model we constructed has two consumers sub- messages only if the consumer’s 𝑅𝐷𝑌 is greater than scribing to different Channels under the same Topic, each 0. Therefore, we introduce the 𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠[𝐷][𝐶] array consumer will receive a copy of all messages sent by pro- to store nsqds’ status, which indicate whether 𝑛𝑠𝑞𝑑𝑑𝑖𝑑 ducers and finish them all eventually. Symbol <> means is pushing data to 𝐶𝑜𝑛𝑠𝑢𝑚𝑒𝑟𝑐𝑖𝑑 . Combined with the that the system can finally reach 𝑅𝑒𝑎𝑐ℎ𝑎𝑏𝑖𝑙𝑖𝑡𝑦 state. 𝑅𝑑𝑦[𝐶][𝐷] array, we give the following definition and assertion. 4.2.3. Scalability #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑅𝑒𝑎𝑑𝑦00 {𝑅𝑑𝑦[0][0] > 0}; The NSQ system realizes a distributed decentralized ar- #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑅𝑒𝑎𝑑𝑦 · · · chitecture with nsqlookupd, which shows scalability. #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝00 {𝑝𝑢𝑠ℎ𝑆𝑡𝑎𝑡𝑒𝑠[0][0] = 0}; nsqlookupd manages the topological information of the #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝 · · · system and allows nsqd instances to be added for horizon- #𝑎𝑠𝑠𝑒𝑟𝑡 𝑆𝑦𝑠𝑡𝑒𝑚() |= tal scaling. In our experiments, the 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[𝐿𝐷][𝐷] is initially set to 0, denoting that no nsqd instances are (𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝00 U 𝑅𝑒𝑎𝑑𝑦00) available. When the value changes to 1, it indicates that &&(𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝01 U 𝑅𝑒𝑎𝑑𝑦01) nsqd instances were dynamically added, demonstrating &&(𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝10 U 𝑅𝑒𝑎𝑑𝑦10) the system’s scalability. &&(𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝11 U 𝑅𝑒𝑎𝑑𝑦11); #𝑑𝑒𝑓 𝑖𝑛𝑒 𝑆𝑐𝑎𝑙𝑎𝑏𝑖𝑙𝑖𝑡𝑦{ Our model has four message subscription connections 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[0][0] == 1 && 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[0][1] == 1 as show in Fig.2. 𝑝𝑢𝑠ℎ𝑆𝑡𝑜𝑝00 defines the state when && 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[1][0] == 1 && 𝐿𝐷𝑆𝑡𝑎𝑡𝑒𝑠[1][1] == 1}; 𝑛𝑠𝑞𝑑0 stops pushing messages to the 𝑐𝑜𝑛𝑠𝑢𝑚𝑒𝑟0 , and 𝑅𝑒𝑎𝑑𝑦00 defines the state in which the 𝑐𝑜𝑛𝑠𝑢𝑚𝑒𝑟0 is #𝑎𝑠𝑠𝑒𝑟𝑡 𝑆𝑦𝑠𝑡𝑒𝑚() |=<> 𝑆𝑐𝑎𝑙𝑎𝑏𝑖𝑙𝑖𝑡𝑦; ready to receive messages from 𝑛𝑠𝑞𝑑0 . The rest of defini- tions are similar. We use the Untill(U) syntax from Linear 4.2.4. Availability Timing Logic (LTL) to describe the event that the nsqd nsqlookupd serves as a distributed directory service that stops pushing messages until the Rdy of the corresponding supports fault tolerance and redundancy. It maintains consumer is larger than zero. This formula verifies if the system can realize flow control. 80 References [1] Bernstein, P. A. (1996). Middleware: a model for distributed system services. Communications of the ACM, 39(2), 86-98. [2] Snyder, B., Bosnanac, D., & Davies, R. (2011). Ac- tiveMQ in action (Vol. 47). Greenwich Conn.: Man- ning. [3] Rostanski, M., Grochla, K., & Seman, A. (2014, Figure 3: Verification Results of the NSQ System September). Evaluation of highly available and fault- tolerant middleware clustered architectures using RabbitMQ. In 2014 federated conference on computer science and information systems (pp. 879-884). IEEE. 4.3. Verification and Results [4] Wang, G., Koshy, J., Subramanian, S., Paramasivam, Depending on the definitions and assertions provided K., Zadeh, M., Narkhede, N., ... & Stein, J. (2015). Build- above, we use model checker PAT to verify five proper- ing a replicated logging system with Apache Kafka. ties of the NSQ system, including Divergence Freedom, Proceedings of the VLDB Endowment, 8(12), 1654- Reachability, Scalability, Availability, and Flow Control- 1655. lability. The model checker PAT verifies properties by [5] Yue, M., Ruiyang, Y., Jianwei, S., & Kaifeng, Y. (2017, searching for counterexamples in the system’s state space October). A MQTT protocol message push server or reaching the limits of state exploration. based on RocketMQ. In 2017 10th International Con- We present a summary of the verification statistics ference on Intelligent Computation Technology and in Fig. 3, including Visited States, Total Transitions, Time Automation (ICICTA) (pp. 295-298). IEEE. Used, and Estimated Memory Used. [6] NSQ: A realtime distributed messaging platform, The verification results of all five properties indicate https://nsq.io/ that the NSQ message queue satisfies all the above prop- [7] Lai, X., Wang, H., Zhao, J., Zhang, F., Zhao, C., erties, proving that the system has high flexibility and & Wu, G. (2020, May). HBase Connection Dynamic robustness while providing credible delivery of messages. Keeping Method Based on Reactor Pattern. In Jour- nal of Physics: Conference Series (Vol. 1544, No. 1, p. 012122). IOP Publishing. [8] Raje, S. N. (2019). Performance Comparison of Mes- 5. Conclusion and Future Work sage Queue Methods (Doctoral dissertation, Univer- sity of Nevada, Las Vegas). In this paper, we focus on the core functionalities of the [9] Togashi, N., & Klyuev, V. (2014, April). Concurrency NSQ message platform, including message publishing, in Go and Java: performance analysis. In 2014 4th registration, subscription, and querying. With CSP, we IEEE international conference on information science formalized critical components of the NSQ architecture, and technology (pp. 213-216). IEEE. such as producers, consumers, nsqd, and nsqlookupd. [10] Brookes, S. D., Hoare, C. A. R, & Roscoe, A. W. Using the model checker PAT, we conducted a rigorous (1984). A theory of communicating sequential pro- analysis of the constructed NSQ model, verifying five fun- cesses. Journal of the ACM (JACM), 31(3), 560-599. damental properties: Divergence Freedom, Reachability, [11] Hoare, C. A. R. (1985). Communicating sequential Scalability, Availability, and Flow Controllability. These processes (Vol. 178). Englewood Cliffs: Prentice-hall. properties underscore NSQ’s capacity to handle real-time [12] PAT: Process Analysis Toolkit. An Model Checker distributed message delivery at scale, confirming its high and Refinement Checker for Concurrent and Real- flexibility and robustness while ensuring dependable mes- time System. https://pat.comp.nus.edu.sg/ sage transmission. [13] Xiao, L., Zhu, H., Xu, Q., & Vinh, P. C. (2022). Mod- Nonetheless, besides the robustness of message queues, eling and verifying pso memory model using CSP. the security of data is extremely important for users. In Mobile Networks and Applications, 27(5), 2068-2083. the future, we will continue to enhance the formalized [14] Xu, J., Yin, J., Zhu, H., & Xiao, L. (2023). Formaliza- modeling and verification of NSQ by refining workflows. tion and verification of Kafka messaging mechanism We will also delving into the system’s security aspects to using CSP. Computer Science and Information Sys- advance our research outcomes continually. tems, 20(1), 277-306. 81