1 MAS-DRiVe: a Practical Approach to Decentralized Runtime Verification of Agent Interaction Protocols Davide Ancona, Daniela Briola, Angelo Ferrando, and Viviana Mascardi Abstract—We address the problem of decentralized runtime Once the formal specification of the global pattern of events verification of interaction protocols in multiagent systems by is given, however, distributing the monitoring activity can be means of MAS-DRiVe, an algorithm for partitioning a multi- resorted to decomposing the global specification into “sub- agent system (MAS) into sub-MASs which can be monitored independently. Given a global interaction protocol named AIP specifications”, involving less events than the global one, (for “Agent Interaction Protocol”) describing all the interactions which can be monitored in an independent way from each which can take place in the MAS, the MAS-DRiVe algorithm other and such that the union of their independent monitoring extracts the interaction graph from AIP , identifies the clusters of gives the same guarantees as the monitoring of the whole agents which cannot be split during the decentralized monitoring system w.r.t. the original specification. as the interactions they are involved in are not independent, col- lapses each of those clusters into a single node in the interaction Given a global specification of the expected system behavior graph, and finally partitions the collapsed graph obtained so far. S, the challenge to face is then to compute a set of n sub- Although the “unsplittable agents identification” stage is still in specifications S1 , . . . , Sn derived from S in such a way that its early design and prototyping phases and requires a better the following properties hold: formalization and a deeper analysis, the MAS-DRiVe algorithm • decentralized monitoring against the sub-specifications pipeline has been fully implemented and demonstrated on two simple MASs. S1 , . . . , Sn is equivalent to centralized monitoring against Once the independently monitorable sub-MASs have been S; identified by MAS-DRiVe, the global interaction protocol AIP • the decomposition of S into S1 , . . . , Sn is balanced can be projected onto the subsets of agents belonging to each according to some weighting criterion, to get, ideally, the graph partition, thus obtaining local versions of AIP which can best monitoring performances. be monitored in a decentralized way. These sub-specifications do not necessarily induce a partition Index Terms—Agent Interaction Protocol; Decentralized Run- of the monitored events: indeed, it might be necessary that time Verification; Decentralized Monitoring; Graph Partitioning some monitored events belong to more than one specification, and hence must be checked by more monitors. Let us suppose for example that the formal specification of the event pattern I. I NTRODUCTION AND M OTIVATION states that when the alarm rings, Mary stops it and either she Runtime verification (RV) is a software verification tech- wakes up, washes, has breakfast and goes to work (then the nique that complements formal static verification and testing. monitoring is allowed to stop), or she continues to sleep. In RV, dynamic checking of the correct behavior of a system The events “wake up”, “wash”, “have breakfast”, “go to is performed by a monitor which is generated from a formal work” are independent from the event “sleep”, so a good way specification of the properties to be verified. to decentralize the monitoring activity between two monitors Distributed runtime verification (DRV), as described by S. m1 and m2 could be to have m1 in charge of the “active” Rajsbaum in his keynote speech at the SSS 2015 Symposium behavior (m1 would check that “when the alarm rings, Mary [25], tackles the problem of building a decentralized runtime stops it and then either she wakes up, washes, has breakfast, monitor for a distributed system and involves designing a goes to work – then the monitoring can stop –, or the mon- distributed algorithm that coordinates the monitors in order itoring can stop immediately”) and m2 in charge of the lazy for them to correctly verify the dynamic behavior of the whole one (m2 would check that “when the alarm rings, Mary stops system. it and then either she continues to sleep, or the monitoring can With respect to RV, DRV techniques are more complex and stop immediately”). less developed, since they involve designing a distributed al- It would be convenient that the events “the alarm rings” and gorithm that monitors another distributed algorithm. Neverthe- “Mary stops it” were monitored by both m1 and m2, in order less, they are gaining more and more attention, as shown by the to avoid a further monitor ensuring the correct sequence of recent Distributed Runtime Verification Workshop organized initial events, whatever Mary’s attitude towards the new day. in Bertinoro in May 20161 . Due to events which may be monitored by more than one monitor, the total amount of monitored events in the D. Ancona, A. Ferrando, and V. Mascardi are with DIBRIS, University of Genova, Italy. davide.ancona@unige.it, decentralized setting may be larger than in the centralized angelo.ferrando@dibris.unige.it, case. However, decentralized monitors can be run in parallel viviana.mascardi@unige.it on different machines, so the workload is more balanced D. Briola is with DISCO, University of Milano-Bicocca, Italy. daniela.briola@unimib.it than in the centralized setting. Also, sometimes a centralized 1 http://www.labri.fr/perso/travers/DRV2016/index.html. approach is unfeasible, as in the motivating scenario discussed 35 2 in our previous work presented at EMAS 2014 [2]. There, we of agents in a MAS which provide formal guarantees that supposed that the safety of a humanitarian convoy was guar- verification through projected types and decentralized monitors anteed by unmanned aerial vehicles whose behavior is critical is equivalent to verification performed by a single centralized for the success of the mission and hence subject to runtime monitor with a single global type. verification, but which might not be monitorable by a single This paper addresses that problem by presenting MAS- monitor for network connectivity reasons and for failures of DRiVe, an algorithm for partitioning a MAS in such a way some computational entities. In cases like that, decentralizing that the identified subsystems can be safely monitored in the runtime verification activity is no longer an option, but a decentralized way. The paper is structured as follows: the only viable solution. The possibility to exchange protocol Section II discusses the related work, Section III introduces specifications, dealt with as first class entities that can be sent background knowledge, Section IV presents the design of and received [3], allows us to apply our approach to oppor- MAS-DRiVe, Section V discusses its implementation and tunistic networks (Wireless Sensor Networks, drone ferries, experiments, Section VI concludes. and the like) where the actual communication protocols are unknown until deployment, and mobility dynamically changes II. R ELATED W ORK them. Finding a set of sub-specifications S1 , . . . , Sn of S DRV lies at the crossroad where decentralized algorithms meeting the requirements for decentralizing the monitoring and formal methods meet [25]. It must cope both with the activity heavily depends on the way S has been specified. problems intrinsic to distributed systems, and with those RV specifications are typically expressed using trace predicate related to formalization of properties to be verified at runtime. formalisms. Many of them, such as finite state machines, reg- Furthermore, the distribution of the monitoring activity itself ular expressions, context free grammars, linear-time temporal poses new problems and raises new challenges. (LTL) logic, have been originally introduced for other aims One of the most severe and most studied problems which and then exploited for RV; others, such as trace expressions characterize decentralized systems is the lack of a global clock [6], have been expressly devised for RV. Different formalisms and the need to tag events with a timestamp which, although may have different expressive power, which impacts on the local to the node where the event took place or was observed, types of patterns that can be described and on the meaning of can be compared with the timestamps of the other events to “independent” sub-specifications. check their relative order. In this paper we assume that each Although the MAS-DRiVe algorithm proposed in this paper monitor observes the events it is in charge of, in the same for decentralizing the monitoring activity is general enough to order as they actually took place in the environment. This be adopted whatever the formalism and the kind of monitored condition is enough for our approach to work. As mentioned events are, it has only been experimented for the specific in the introduction, some events may be observed by more formalism of trace expressions, for specific decentralized than one monitor so the condition we pose implicitly means systems, that is, multiagent systems (MASs), and for specific that event ordering is always preserved in all monitors. A large events, namely interactions between pairs of agents. Also, as body of literature addresses this problem; the two most widely we will see later on, the “unsplittable agents identification” is used approaches are to connect the different physical clocks almost naif and requires a refinement and a better theoretical of the decentralized nodes to one common logical clock, by support. exploiting causality in communication between nodes (Lam- Trace expressions are an evolution of global types [5], which ports “happened before” relation [22]), and to synchronize the have been initially proposed for RV of interactions in MASs. local clocks to provide a full ordering of events as proposed Trace expressions are an expressive formalism based on a set for example by M. Maróti et al. [23]. of operators (including prefixing, concatenation, shuffle, union, Formalizing the system properties in order to dynamically and intersection) to denote finite and infinite traces of events. verify them requires that a suitable formal language is avail- Their semantics is based on a labeled transition system defined able. Among the first efforts in this direction we may cite one by a simple set of rewriting rules which directly drive the paper by A. Bauer, M. Leucker, and C. Schallhart [7] where behavior of monitors generated from them. They have been the use of LTL in the context of RV was explored. When used used to model and monitor (in a simulated setting) both fail- for RV, the expressive power of LTL is reduced, because at uncontrolled and ambient intelligence systems [4], as well as runtime only finite traces can be checked. To provide a formal medical guidelines for remote patient monitoring [16]. account for this limitation, a three-valued semantics for LTL, In the already cited EMAS 2014 paper, we tackled the called LTL3 , has been proposed later on [8]. A third truth problem of projecting a global specification of an interaction value “?” is introduced to specify that after a finite trace of protocol expressed in the global types formalism onto subsets events has been occurred, the outcome of a monitor can be of agents in the MAS. The projection function is a key element inconclusive. for moving from a global specification of the protocol to a The generation of efficient monitors for properties specified localized one, which involves less agents and hence hopefully in a variety of formal languages was firstly addressed by less interaction events. However, in our previous work we M. Kim et al. [21, 18] and many proposals exist for RV faced only one of the issues involved in distributing the of object-oriented systems (jassda [11], PQL [24], JavaMOP RV activity, that of projecting the interaction protocol, and [13], LARVA [14], SAGA [15], just to cite a few) and MASs did not consider the problem of finding suitable partitions [1, 17]. A large community of researchers works on RV 36 3 and in 2001 the Runtime Verification Workshop/Conference • τ1 ·τ2 (concatenation), denoting the set of all traces ob- series was initiated (http://www.runtime-verification.org), and tained by concatenating the traces of τ1 with those of workshops/conferences have occurred each year since then. τ2 . When moving from RV to DRV, however, the situation • τ1 ∧τ2 (intersection), denoting the intersection of the dramatically changes: no introductory papers and surveys on traces of τ1 and τ2 . the topic exist, which demonstrates the youth of this research • τ1 ∨τ2 (union), denoting the union of the traces of τ1 and field. W.r.t. DRV of MASs, we were not able to find any related τ2 . work except for one paper of ours [10] where the projection • τ1 |τ2 (shuffle), denoting the set obtained by shuffling the mechanism introduced in our previous EMAS 2014 work had traces in τ1 with the traces in τ2 . been integrated with JADE [9]. To support recursion without introducing an explicit construct, trace expressions are regular (a.k.a. rational or cyclic) terms. III. BACKGROUND A regular term can be represented by a finite set of syntactic In this section we introduce the three pillars of our work: equations, as happens, for instance, in most modern Prolog trace expressions, projection, and graph partitioning. implementations where unification supports cyclic terms. A trace expression τ is contractive if all its infinite paths from A. Trace Expressions the root contain the prefix operator. From the operators above we can derive the filter opera- Trace expressions are a specification formalism expressly tor, useful for making trace expressions more compact and designed for RV. They are based on the notions of event and readable. The expression ϑτ denotes the set of all traces its abstraction, event type. contained in τ , when deprived of all events that do not match ϑ. Assuming that event types are closed by complementation, Events. In the following we denote by E a fixed universe of the expression above is a convenient syntactic shortcut for T |τ , events. An event trace over E is a possibly infinite sequence where T = ∨ϑ:T , and ϑ is the complement event type of ϑ, of events in E. As an example, we might have E = { that is, JϑK = E \ JϑK. msg(bob, alice, tell, m1), msg(alice, bob, tell, a1), msg(bob, In the context of RV, where the three-valued semantics carol, tell, m2), msg(carol, bob, tell, a2), msg(bob, dave, tell, is considered for LTL, trace expressions are strictly more m3), msg(dave, bob, tell, a3) } where the interaction event expressive than LTL [6]: every LTL formula can be encoded msg(S, R, P, C) corresponds to the interaction between agent into a trace expression with an equivalent three-valued seman- S and agent R, with S sending a message with performative tics, whereas the opposite property does not hold, since trace P and content C to R. In this example, contents m1, m2 and expressions are also able to specify all context-free languages, m3 correspond to actual messages, and contents a1, a2 and and also some non context-free ones. a3 correspond to acknowledges of reception. Although trace expressions can define protocols involving events of any kind, in the sequel we will limit our investigation Event types. To be more general, trace expressions are to interaction events like msg(S, R, P, C) and interaction types built on top of event types (chosen from a set ET), rather that we will identify with α. than of single events; an event type denotes a subset of E. For example, if we were interested only in the type of the message content (actual message or acknowledge), we might B. Projection define the two event types Msg= { msg(bob, alice, tell, The projection function was first introduced in [2]. Given m1), msg(bob, carol, tell, m2), msg(bob, dave, tell, m3) } the finite set AGS of all the agents that could play a role in and Ack= { msg(alice, bob, tell, a1), msg(carol, bob, tell, the MAS and an interaction type α, senders(α) is the set of a2), msg(dave, bob, tell, a3) }, and use them for describing all the agents in AGS that could play the role of sender in interaction patterns. actual interactions having type α, and receivers(α) is the set of all the agents in AGS that could play the role of receiver Trace expressions. A trace expression τ represents a set of in interactions of type α. The involves predicate holds on one possibly infinite event traces, and is defined on top of the interaction type α and one set of agents Ags, involves(α, Ags), following operators (binary operators associate from left, and iff (senders(α) ∩ Ags 6= ∅) ∨ (receivers(α) ∩ Ags 6= ∅). are listed in decreasing order of precedence, that is, the first Projection can be described as a function Π : T × operator has the highest precedence): P(AGS) → T where T is the set of trace expressions. Π is •  (empty trace), denoting the singleton set {} containing driven by the syntax of the trace expression to project; since the empty event trace . Π is defined on cyclic terms, the simplest way to define it • ϑ:τ (prefix), denoting the set of all traces whose first event would be by coinduction2 as follows: e matches the event type ϑ (e ∈ ϑ), and the remaining (i) Π(, Ags) =  part is a trace of τ . For example, if our communication protocol just required that an interaction from the set 2 Coinduction is a technique for defining and proving properties of systems Msg must take place, no matter which one, and then an of concurrent interacting objects. It is the mathematical dual to structural induction. Coinductively defined types are typically infinite data structures, interaction from Ack must occur, no matter which one, such as streams. See http://cseweb.ucsd.edu/groups/tatami/handdemos/doc/ we could express it as Msg:Ack:. coind.htm for a more technical formalization. 37 4 (ii) Π(α : τ, Ags) = α : Π(τ, Ags) if involves(α, Ags) all edges in the CIG have weight 1, but the approach (iii) Π(α : τ, Ags) = Π(τ, Ags) if ¬involves(α, Ags) works also in case edge weights are different: a higher (iv) Π(τ 0 op τ 00 , Ags) = Π(τ 0 , Ags) op Π(τ 00 , Ags), where edge weight would model a preferential communication op ∈ {·, ∧, ∨, |}. channel; This definition works properly only on all non cyclic terms 4) partition the new graph obtained so far using some and on some, but not all, cyclic terms. To guarantee that suitable graph partition method; the projection function always returns a contractive type and 5) project the interaction protocol onto these agents’ parti- that the correct coinductive definition is implemented, it is tions. necessary to keep track of all types visited by Π along a path; each type is associated with its depth in the path, and with The algorithm is explained by means of two running exam- a fresh variable which will be unified with the corresponding ples introduced in [2]. computed projection. During the visit, the depth DeepestSeq of the deepest visited sequence operator is kept. If a type τ has been already visited, then a cycle is detected: if its depth is less than DeepestSeq then the cycle contains an occurrence of the A. Examples sequence constructor, therefore the projected type associated with τ is contractive and, hence, is returned; otherwise, the projection would not be contractive, therefore  is returned. C. Graph Partitioning The graph partitioning problem (GPP) is defined in the following way [12]: given a number k ∈ N>1 and an undirected graph G = (V, E) with non-negative node and edge weights, GPP asks for a partition of V , that is, with blocks of nodes (V1 , ..., Vk ) such that 1) V1 ∪ V2 ∪ ... ∪ Vk = V 2) Vi ∩ Vj = ∅ ∀i 6= j A balance constraint may be required, demanding that all blocks have about equal weights. To be more precise, a good Fig. 1. The “Socks and Shoes” MAS partition is the one where the sum of the node weights in each Vi is “about the same” and the sum of all edge a) Socks and Shoes (SaS): Let us consider the simple weights of edges connecting all different pairs Vi and Vj is scenario illustrated in Figure 1 where two robots (right minimized. Applications of graph partitioning include parallel and left), two monitors (right and left) associated with processing, road networks, image processing, VLSI design, each robot, and a plan monitor which supervises them, social networks, and bioinformatics. Among the many existing must reach the goal to put the right and left socks and tools for graph partitioning we opted for using METIS [20], shoes in the correct way (socks first!). As robots are a set of serial programs for partitioning graphs, partitioning autonomous, they could perform the two actions in the wrong finite element meshes, and producing fill reducing orderings order: monitors are there to ensure that wrong actions are for sparse matrices (http://glaros.dtc.umn.edu/gkhome/views/ immediately rolled back. Robots communicate their actions metis, accessed on June 2016). The choice of METIS was due to their corresponding monitors, which, in turn, notify the to the efficiency of the multi-level Kernighan/Lin approach it plan monitor when the robots accomplish their goal. The builds upon [19] and to the ease of its installation and usage. interaction types involving the right robot and the right node monitor are defined as msg(right robot, right node monitor, IV. MAS-DR I V E : D ESIGN tell, put sock) ∈ put right sock; msg(right robot, Given an interaction protocol expressed in some suitable right node monitor, tell, put shoe) ∈ put right shoe; formalism, the MAS-DRiVe algorithm for partitioning a MAS msg(right robot, right node monitor, tell, removed shoe) ∈ in such a way that the identified subsystems can be safely removed right shoe; msg(right node monitor, right robot, monitored in a decentralized way is the following: tell, oblige remove shoe) ∈ oblige remove right shoe; 1) extract the interaction graph IG from the interaction msg(right node monitor, plan monitor, tell, ok) ∈ ok right. protocol; Those for the left robot and left node monitor are symmetrical. 2) identify the sets of agents which cannot be split during The right robot (RIGHT branch in the protocol description the decentralized monitoring (unsplittable agents), since below) can start by putting the sock, which is the correct action the interactions they are involved in are not independent; to do, or (∨ operator) by putting the shoe, which requires 3) compute the collapsed interaction graph CIG by col- a recovery by the right robot monitor and looping back to lapsing each of those sets into a single node with weight the RIGHT branch. The left robot has the same behavior equal to the cardinality of the collapsed set, and by com- (LEFT branch in the protocol description). The SaS protocol puting the collapsed edges consistently; for simplicity, is described by the shuffle of the actions of the right and left 38 5 robots and monitors (RIGHT | LEFT): verify MA2 , and if an event has type msg3 or ack3 , then it must verify MA3 . RIGHT = (put right sock:put right shoe:ok right:)∨ (put right shoe:oblige remove right shoe: removed right shoe:RIGHT) B. Algorithm Steps LEFT = (put left sock:put left shoe:ok left:)∨ (put left shoe:oblige remove left shoe: As already introduced in the beginning of this section, the removed left shoe:LEFT) MAS-DRiVe algorithm consists of five steps. SaS = RIGHT|LEFT. 1) Interaction graph extraction. The interaction graph IG of b) Alternating Bit Protocol with four participants the agents interaction protocol AIP expressed using formal- (ABP3): As a second example we consider the Alternating ism F is an undirected and unweighted graph with one node Bit Protocol ABP. Ai for each agent Ai involved in AIP . An edge between Ai and Aj exists iff an interaction between Ai and Aj is foreseen by AIP . By “interaction between Ai and Aj ” we mean that either Ai sends a message to Aj , or viceversa. As an example, the nodes extracted from the SaS protocol are right robot, right node monitor, plan monitor, left robot, left node monitor and the edges are (right robot, right node monitor), (right node monitor, plan monitor), (left robot, left node monitor), (left node monitor, plan monitor). The interaction graph algorithm explores the trace expres- Fig. 2. The ABP MAS with one “manager” (Bob) and three participants. sion to find pairs of agents involved in the interaction types. We consider the instance of ABP that we name ABP3, where Since the trace expression may be cyclic, loops must be six different interactions may occur: Bob sends m1 to Alice detected and dealt with properly. This goal is achieved in (interaction type msg1 ), Alice sends a1 to Bob (interaction a simple way: trace expressions met during exploration are type ack1 ), Bob sends m2 to Carol (interaction type msg2 ), recorded, and when the exploration comes back to an already Carol sends a2 to Bob (interaction type ack2 ), Bob sends recorded trace, it stops. m3 to Dave (interaction type msg3 ), Dave sends a3 to Bob To identify the sets of unsplittable agents during the second (interaction type ack3 ). The ABP3 is an infinite iteration, step of the algorithm, during this stage an auxiliary directed where the following constraints have to be satisfied for all multigraph IGext is generated, where each edge connecting occurrences of the sending actions: two nodes is labeled with a string that encodes the path 1. The n-th occurrence of an interaction of type msg1 followed to reach that interaction in the protocol. The nodes must precede the n-th occurrence of an interaction of type and edges in IGext are obtained by means of the extract(ID, msg2 which in turn must precede the n-th occurrence of an AlreadyMetTrExp, Str, Level, τ ) function where interaction of type msg3 which in turn must precede the n+1- – ID is the protocol identifier; th occurrence of an interaction of type msg1 . – AlreadyMetTrExp is the set of trace expressions which 2. For k ∈ {1, 2, 3}, the n-th occurrence of msgk must were already met during the protocol analysis, to take care of precede the n-th occurrence of the acknowledge ackk , which, cycles; in turn, must precede the (n + 1)-th occurrence of msgk . – Str is the current path encoding; ABP3 can be defined by the following set of equations: – Level is an encoding of the current outermost operator ABP3 = (msgMM )∧(msg ack (1)MA1 )∧ level and position (left or right) in the trace expression parse (msg ack (2)MA2 )∧(msg ack (3)MA3 ) tree; MM = msg1 :msg2 :msg3 :MM – τ is the trace expression currently under consideration. MA1 = msg1 :ack1 :MA1 MA2 = msg2 :ack2 :MA2 extract(ID, AlreadyMetTrExp, Str, Level, τ ) is defined by cases: MA3 = msg3 :ack3 :MA2 1) if TrExpr belongs to AlreadyMetTrExp, then return; where msg ack (i), i ∈ {1, 2, 3}, and msg denote the event 2) if TrExpr is , then return; types s.t. Jmsg ack (i)K = Jmsg i K ∪ Jack i K, i ∈ {1, 2, 3}, and 3) if τ is α uop τ1 , where uop is an operator in {:, }, JmsgK = Jmsg1 K ∪ Jmsg2 K ∪ Jmsg3 K. then The trace expression defined by MM corresponds to the a) NewStr = Str • (uop, Level+1) first constraint informally stated above, while MA1 , MA2 , b) extract all the possible pairs of senders and re- and MA3 formalize the second constraint. The main trace ceivers (S, R) such that S and R are involved in expression ABP3 can be easily read as follows: if an event α has type msg1 or msg2 or msg3 , then it must verify MM , c) for each of such pairs, generate and assert edge(ID, and if an event has type msg1 or ack1 , then it must verify S, R, NewStr) S MA1 , and if an event has type msg2 or ack2 , then it must d) NewAlreadyMetTrExp = {τ } AlreadyMetTrExp 39 6 e) call extract(ID, NewAlreadyMetTrExp, NewStr, As an example, alice, bob, carol and dave in the ABP3 Level+1, τ1 ) represent an unsplittable set of agents. In fact, if we monitored 4) if τ is τ1 ∧τ2 , then interactions involving alice and bob only, we could find that a) NewStr = Str • (∧, Level+1) S they respect the second constraint of the protocol, but we could b) NewAlreadyMetTrExp = {τ } AlreadyMetTrExp never ensure that the first constraint – which also depends on c) call extract(ID, NewAlreadyMetTrExp, NewStr, the interactions of bob with carol and dave – is respected as Level+1, τ1 ) well. The notions of dependence and independence are related d) call extract(ID, NewAlreadyMetTrExp, NewStr, to the protocol but also to the formalism used for modeling Level+1, τ2 ) it, so no general rule for this step can be devised. When the interaction formalism F is that of trace expressions, however, 5) if τ is τ1 bop τ2 , where bop is an operator in {·, ∨, |}, dependencies among interactions are naturally modeled using then the intersection operator. Thus, although it may be an over- a) NewStr1 = Str • (bop, Level+1) cautious approach, when using trace expressions we define b) NewStr2 = Str • (bop, 2*Level+1) S “unsplittable” the agents involved in interactions connected c) NewAlreadyMetTrExp = {τ } AlreadyMetTrExp by an intersection operator. d) call extract(ID, NewAlreadyMetTrExp, NewStr1, The algorithm for performing the identification of Level+1, τ1 ) unsplittable agents looks at the labels in IGext and creates e) call extract(ID, NewAlreadyMetTrExp, NewStr2, sets of agents involved in interactions whose label has the 2*Level+1, τ2 ) same prefix before the first ∧ operator in the label, if any. Given a protocol identified by ID and represented by the For example, in SaS no label contains ∧ and hence the set trace expression τ , extract(ID, {}, Λ, 1, τ ) generates and of unsplittable agents is empty, whereas in ABP3 all the asserts all the edges in IGext . The generation of IGext nodes edge labels share the (∧, 2) prefix, meaning that all the is straightforward. The idea behind this function is that edges interactions in the protocol belong to branches connected whose labels share the same prefix have a strong dependency by the same intersection operator. As a consequence, all the as they belong to the same intersection. In fact, the only case agents involved in the ABP3 protocol interactions belong to where labels of edges share the same prefix is that of ∧ (case the same set of unsplittable agents, which coincides with all 4), where the same string and level are passed to the recursive the agents in the MAS. calls. If the operator is not an intersection (case 5), the strings and levels passed to the recursive calls are different, meaning 3) Graph collapse. In this stage, we transform the unweighted that the two operands are independent from one another and undirected interaction graph IG into a weighted undirected so will be the edges generated during their exploration. graph CIG, obtained by collapsing each node in IG corre- For example, the edges in the auxiliary IGext graph ex- sponding to an agent belonging to unsplittable set U , into a tracted from the ABP3 protocol are: single node with label U and weight equal to U cardinality. (bob, alice, ((∧, 2) (∧, 3) (∧, 4) (, 5)) ) Nodes in CIG are labeled with sets of agents, which can be (bob, carol, ((∧, 2) (∧, 3) (∧, 4) (, 5)) ) the singleton set for agents belonging to no unsplittable set. (bob, dave, ((∧, 2) (∧, 3) (∧, 4) (, 5)) ) Edges exiting from (resp. entering into) a node are all those (bob, alice, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6)) ) which, in IG, exited from (resp. entered into) one of the nodes (bob, carol, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6) (:, 7)) ) in the label. (bob, dave, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6) (:, 7) (:, 8)) ) Considering the running examples above, the (alice, bob, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6) (:, 7)) ) weighted nodes in the SaS CIG are ({right robot},1) (bob, carol, ((∧, 2) (∧, 3) (, 4)) ) ({right node monitor},1) ({plan monitor},1) ({left robot},1) (carol, bob, ((∧, 2) (∧, 3) (, 4)) ) ({left node monitor},1) namely, no collapse took place, (bob, carol, ((∧, 2) (∧, 3) (, 4) (:, 5)) ) while the only weighted node in the ABP3 CIG is (carol, bob, ((∧, 2) (∧, 3) (, 4) (:, 5) (:, 6)) ) ({alice,bob,carol,dave},4) labeled with the set of unsplittable (bob, dave, ((∧, 2) (, 3)) ) agents. (dave, bob, ((∧, 2) (, 3)) ) The edges in the SaS CIG are the same as those in the (bob, dave, ((∧, 2) (, 3) (:, 4)) ) SaS IG, while there are no edges in the ABP3 CIG due to (dave, bob, ((∧, 2) (, 3) (:, 4) (:, 5)) ) the presence of only one node. The IG nodes are alice, bob, dave, carol and the edges are 4) Graph partitioning. This stage of the algorithm consists in (alice, bob), (dave, bob), (carol, bob), which are extracted in partitioning the CIG graph obtained by the collapse stage. a trivial way from IGext . The number of expected partitions is given as an input by the user and must be grater than one. Any suitable partitioning 2) Identification of unsplittable agents. Not all the agents algorithm could be used during this stage. We will discuss can be monitored independently from one another, as some our implementative choice in Section V. interaction patterns that must be respected by a set of agents The partition of the original, not collapsed graph IG, is seen as a whole, could be lost when looking at subsets of the trivially derived by that of CIG by making the union of agents. the agents in the labels of the nodes in each CIG partition. 40 7 By using a graph partitioning algorithm where the sum of Unsplittable list: [] the node weights in each partition is about the same and The trace expressions the number of edges connecting nodes in two different parti- @((S_1|S_2), tions is minimized (by associating the unit weight with each [S_1= edge), we can obtain the following partition for the agents (put_right_sock:put_right_shoe:ok_right:lambda)\/ (put_right_shoe:oblige_remove_right_shoe: involved in the SaS protocol: {{left node monitor, left robot}, removed_right_shoe:S_1), {plan monitor, right node monitor, right robot} } . S_2= The agents involved in the ABP3 protocol cannot be (put_left_sock:put_left_shoe:ok_left:lambda)\/ (put_left_shoe:oblige_remove_left_shoe: partitioned because the collapsed graph consists of only one removed_left_shoe:S_2)]) node. can be partitioned into 5) Projection. Once the agents belonging to the same partition {[left_node_monitor],[left_robot]} have been devised, the projection algorithm presented in {[plan_monitor],[right_node_monitor],[right_robot]} Section III-B can be used to project the global interaction protocol onto each of them. Nodes node(socks,right_robot) V. I MPLEMENTATION AND E XPERIMENTS node(socks,right_node_monitor) node(socks,plan_monitor) The MAS-DRiVe algorithm is fully implemented in SWI- node(socks,left_robot) Prolog (http://www.swi-prolog.org/, accessed on June 2016), node(socks,left_node_monitor) apart from the graph partitioning stage which exploits the Extended edges METIS tool described in Section III. The extraction of the edge(socks,right_robot,right_node_monitor, IG from the protocol, the identification of the unsplittable [((|),2), ((\/),3), ((:),4)]) agents sets, and the generation of the collapsed graph CIG edge(socks,right_robot,right_node_monitor, are performed by Prolog predicates. After them, a predicate for [((|),2), ((\/),3), ((:),4), ((:),5)]) encoding CIG into a format suitable for METIS and saving ........ it into a file is called. The command line for running METIS Collapsed nodes on that file is launched from inside Prolog, and the result is cnode(socks,[right_robot],1) read and decoded. cnode(socks,[right_node_monitor],1) In order to run the algorithm, the user must call the cnode(socks,[plan_monitor],1) cnode(socks,[left_robot],1) partition/3 predicate with three arguments: the identifier cnode(socks,[left_node_monitor],1) of the agent interaction protocol to distribute, the number of partitions to be obtained, and the name of the file where Collapsed edges cedge(socks,[right_robot],[right_node_monitor]) Prolog will write the result of the partition. For example, cedge(socks,[right_node_monitor],[plan_monitor]) given that abp3 identifies the ABP3 protocol, and that the cedge(socks,[left_robot],[left_node_monitor]) trace expression associated with this identifier has been cedge(socks,[left_node_monitor],[plan_monitor]) read into the Prolog knowledge base, the user should call We run experiments with protocols that involve distinct partition(abp3, 2, ’./abp3Out.txt’) obtaining groups of agents each following the ABP3 protocol. For the following content for the abp3Out.txt file: example, in the DoubleABP3, alice, bob, carol and dave follow the ABP3 protocol, alice2, bob2, carol2 and dave2 follow the Unsplittable list: ABP3 protocol as well, and both bob and bob2 interact with [((/\),2)],[alice,bob,carol,dave] boss. Since alice, bob, carol and dave have no interactions Partition failed with alice2, bob2, carol2 and dave2, the two groups are independent from each other and can be safely partitioned Nodes for the purpose of decentralized runtime verification. In such node(abp3,bob) node(abp3,alice) a situation, the partition computed by MAS-DRiVe is the node(abp3,carol) following: node(abp3,dave) Unsplittable list: Extended edges [ ((*),3), ((\/),7), ((:),8), ((:),9), ((/\),10)], edge(abp3,bob,alice,[ ((/\),2), ((/\),3), [alice2,bob2,carol2,dave2] ((/\),4), ((>>),5)]) [ ((*),3), ((\/),4), ((:),5), ((:),6), ((/\),7)], edge(abp3,bob,carol,[ ((/\),2), ((/\),3), [alice,bob,carol,dave] ((/\),4), ((>>),5)]) .... The trace expressions @((bobasks:lambda|bob2asks:lambda)* Collapsed nodes ((okbob:nobob2:msg3>>S_1/\msg_ack(1)>>S_2/\ cnode(abp3,[alice,bob,carol,dave],4) msg_ack(2)>>S_3/\msg_ack(3)>>S_4)\/ (okbob2:nobob:mmsg3>>S_5/\mmsg_ack(1)>>S_6/\ The content of the socksOut.txt file that we obtain by mmsg_ack(2)>>S_7/\mmsg_ack(3)>>S_8)), [S_1=m1:m2:m3:S_1,S_2=m1:a1:S_2, calling partition(socks, 2, ’./socksOut.txt’) S_3=m2:a2:S_3,S_4=m3:a3:S_4,S_5=mm1:mm2:mm3:S_5, is: S_6=mm1:aa1:S_6,S_7=mm2:aa2:S_7,S_8=mm3:aa3:S_8]) 41 events. For example, an interaction between the plan monitor can be partitioned into and the left node monitor, corresponding to a (plan monitor, {[boss],[alice2,bob2,carol2,dave2]} left node monitor) edge in the graph, will be monitored by {[alice,bob,carol,dave]} both the monitor in charge of {left node monitor, left robot} and that in charge of {plan monitor, right node monitor, The algorithm correctly recognizes that alice, bob, carol and right robot}, as it involves agents in both of them. This dave must be monitored all together, and that the same holds redundancy cannot be avoided, but can be kept as small as for alice2, bob2, carol2 and dave2. No other constraints on possible by selecting a partitioning algorithm that minimizes the partition are found. We got the expected correct results the number of edges across partitions. Graph partitioning is also with the TripleABP3 protocol, with three distinct groups another area of improvement of our work: strategies could of agents following the ABP3. be employed to automatically detect the optimal number of The code of the MAS-DRiVe algorithm can be downloaded monitors to deploy, or at least guidelines may be provided in from http://www.disi.unige.it/person/MascardiV/Software/ the future. masdrive.html. It requires SWI Prolog and the METIS Finally, as part of our future work, we plan to make an ex- software installed and accessible via the command line from haustive experimentation of MAS-DRiVe on existing complex everywhere. We tested it on Mageia Linux. protocols. METIS can easily partition graphs with hundreds of nodes so we will model protocols involving hundreds of VI. C ONCLUSIONS AND F UTURE W ORK agents and test the scalability of the approach. Studying its This paper addresses the problem of how to partition a computational properties from a theoretical viewpoint is on MAS into agents’ subsets that can be verified at runtime, our agenda as well. independently from one another, in such a way that the results obtained by the decentralized runtime verification are the ACKNOWLEDGMENTS same as those obtained by monitoring the whole MAS in a We are grateful to the anonymous reviewers for their useful centralized way. remarks and constructive comments. We designed and implemented the MAS-DRiVe algorithm, whose results are consistent with those discussed in [2]. In that R EFERENCES paper, the feasibility of distributing the runtime monitoring [1] M. Alberti, M. Gavanelli, E. Lamma, P. Mello, and P. Torroni. over user-defined sets of agents was empirically validated by The SCIFF abductive proof-procedure. In AI*IA, pages 135– checking the traces of events compliant with the decentralized 147, 2005. subprotocols up of a given length, and verifying whether they [2] D. Ancona, D. Briola, A. El Fallah Seghrouchni, V. Mascardi, were compliant with the original global protocol as well. That and P. Taillibert. Efficient verification of MASs with projections. approach gave a semi-decidable information: if all traces up In EMAS 2014. Revised Selected Papers, volume 8758 of LNCS, to a given length l were found to be compliant with the pages 246–270. Springer, 2014. protocol, this did not ensure that problems could not arise [3] D. Ancona, D. Briola, A. Ferrando, and V. Mascardi. Global with traces of length l + 1. With MAS-DRiVe we propose a protocols as first class entities for self-adaptive agents. In constructive way to partition agents in the MAS, rather than G. Weiss, P. Yolum, R. H. Bordini, and E. Elkind, editors, Pro- letting the user decide how to partition them, and we overcome ceedings of the 2015 International Conference on Autonomous the problems due to the empirical validation approach by Agents and Multiagent Systems, AAMAS 2015, pages 1019– proposing partitions that are “safe” w.r.t. the trace expression 1029. ACM, 2015. operators. In particular, we assume that agents involved in [4] D. Ancona, D. Briola, A. Ferrando, and V. Mascardi. Runtime an intersection can never be partitioned. This is an over- verification of fail-uncontrolled and ambient intelligence sys- cautious approach, but since intersection is the construct used tems: A uniform approach. Intelligenza Artificiale, 9(2):131– to state constraints across different independent branches of the 148, 2015. protocol, we assume that those branches cannot be monitored [5] D. Ancona, S. Drossopoulou, and V. Mascardi. Automatic independently. Generation of Self-Monitoring MASs from Multiparty Global Session Types in Jason. In M. Baldoni, L. A. Dennis, In its current version, then, the MAS can not get split if V. Mascardi, and W. Vasconcelos, editors, DALT 2012. Revised no semi-independent parts of the system can get identified. Selected Papers, volume 7784 of LNCS. Springer, 2013. Although this is a limitation of our approach, this could have [6] D. Ancona, A. Ferrando, and V. Mascardi. Comparing trace an interesting side effect that we aim to explore in the close expressions and linear temporal logic for runtime verification. future, namely the possibility to come up with suggestions In E. Ábrahám, M. M. Bonsangue, and E. B. Johnsen, editors, for MAS designers in terms of good practices for better sub- Theory and Practice of Formal Methods, volume 9660 of LNCS, systems encapsulation based on interaction protocol analysis. pages 47–64. Springer, 2016. Such suggestions could be applied to distributed systems in [7] A. Bauer, M. Leucker, and C. Schallhart. Comparing LTL general, and not only to MASs, and could prove useful in semantics for runtime verification. J. Log. and Comput., those approaches where there is an embedded notion of some 20(3):651–674, June 2010. sort of encapsulation. [8] A. Bauer, M. Leucker, and C. Schallhart. Runtime verification As far as graph partitioning is concerned, we point out for LTL and TLTL. ACM Trans. Softw. Eng. Methodol., that the obtained partitions do not correspond to partitions of 20(4):14:1–14:64, Sept. 2011. 8 42 [9] F. L. Bellifemine, G. Caire, and D. Greenwood. Developing Multi-Agent Systems with JADE. Wiley, 2007. [10] D. Briola, V. Mascardi, and D. Ancona. Distributed runtime verification of JADE multiagent systems. In D. Camacho, L. Braubach, S. Venticinque, and C. Badica, editors, IDC 2014, volume 570 of Studies in Computational Intelligence, pages 81– 91. Springer, 2014. [11] M. Brörkens and M. Möller. Dynamic event generation for runtime checking using the JDI. Electr. Notes Theor. Comput. Sci., 70(4):21–35, 2002. [12] A. Buluç, H. Meyerhenke, I. Safro, P. Sanders, and C. Schulz. Recent advances in graph partitioning. In Algorithm Engi- neering: Selected Results and Surveys, volume 9220 of LNCS. Springer, 2015 (in press). [13] F. Chen and G. Rosu. Mop: an efficient and generic runtime verification framework. In OOPSLA 2007, pages 569–588, 2007. [14] C. Colombo, G. J. Pace, and G. Schneider. LARVA — Safer Monitoring of Real-Time Java Programs (Tool Paper). In SEFM 2009, pages 33–37, 2009. [15] F. S. de Boer and C. P. T. de Gouw. Combining Monitoring With Run-Time Assertion Checking. In SFM 14, pages 217 – 262. Springer, 2014. [16] A. Ferrando, V. Mascardi, and D. Ancona. Monitoring patients with hypoglycemia using self-adaptive protocol-driven agents: a case study. In Workshop Proceedings of EMAS 2016, 2016. [17] L. Giordano, A. Martelli, and C. Schwind. Specifying and ver- ifying interaction protocols in a temporal action logic. Journal of Applied Logic, 5(2):214 – 234, 2007. [18] K. Havelund and G. Rosu. Synthesizing monitors for safety properties. In J. Katoen and P. Stevens, editors, TACAS 2002, volume 2280 of LNCS, pages 342–356. Springer, 2002. [19] G. Karypis and V. Kumar. Analysis of multilevel graph partitioning. In Proceedings of the 1995 ACM/IEEE Conference on Supercomputing, Supercomputing ’95, New York, NY, USA, 1995. ACM. [20] G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Scientific Computing, 20(1):359–392, 1998. [21] M. Kim, S. Kannan, I. Lee, O. Sokolsky, and M. Viswanathan. Java-mac: a run-time assurance tool for java programs. Electr. Notes Theor. Comput. Sci., 55(2):218–235, 2001. [22] L. Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7):558–565, July 1978. [23] M. Maróti, B. Kusy, G. Simon, and A. Lédeczi. The flooding time synchronization protocol. In SenSys ’04, pages 39–49. ACM, 2004. [24] M. C. Martin, V. B. Livshits, and M. S. Lam. Finding application errors and security flaws using PQL: a program query language. In OOPSLA 2005, pages 365–383, 2005. [25] S. Rajsbaum. Distributed runtime verification – where combina- torics, fault-tolerance and formal methods meet. Keynote Talk at the SSS 2015, August 2015, 2015. 9 43