1


MAS-DRiVe: a Practical Approach to Decentralized
Runtime Verification of Agent Interaction Protocols
                            Davide Ancona, Daniela Briola, Angelo Ferrando, and Viviana Mascardi


   Abstract—We address the problem of decentralized runtime               Once the formal specification of the global pattern of events
verification of interaction protocols in multiagent systems by         is given, however, distributing the monitoring activity can be
means of MAS-DRiVe, an algorithm for partitioning a multi-             resorted to decomposing the global specification into “sub-
agent system (MAS) into sub-MASs which can be monitored
independently. Given a global interaction protocol named AIP           specifications”, involving less events than the global one,
(for “Agent Interaction Protocol”) describing all the interactions     which can be monitored in an independent way from each
which can take place in the MAS, the MAS-DRiVe algorithm               other and such that the union of their independent monitoring
extracts the interaction graph from AIP , identifies the clusters of   gives the same guarantees as the monitoring of the whole
agents which cannot be split during the decentralized monitoring       system w.r.t. the original specification.
as the interactions they are involved in are not independent, col-
lapses each of those clusters into a single node in the interaction       Given a global specification of the expected system behavior
graph, and finally partitions the collapsed graph obtained so far.     S, the challenge to face is then to compute a set of n sub-
Although the “unsplittable agents identification” stage is still in    specifications S1 , . . . , Sn derived from S in such a way that
its early design and prototyping phases and requires a better          the following properties hold:
formalization and a deeper analysis, the MAS-DRiVe algorithm              • decentralized monitoring against the sub-specifications
pipeline has been fully implemented and demonstrated on two
simple MASs.                                                                 S1 , . . . , Sn is equivalent to centralized monitoring against
   Once the independently monitorable sub-MASs have been                     S;
identified by MAS-DRiVe, the global interaction protocol AIP              • the decomposition of S into S1 , . . . , Sn is balanced
can be projected onto the subsets of agents belonging to each                according to some weighting criterion, to get, ideally, the
graph partition, thus obtaining local versions of AIP which can
                                                                             best monitoring performances.
be monitored in a decentralized way.
                                                                       These sub-specifications do not necessarily induce a partition
   Index Terms—Agent Interaction Protocol; Decentralized Run-          of the monitored events: indeed, it might be necessary that
time Verification; Decentralized Monitoring; Graph Partitioning
                                                                       some monitored events belong to more than one specification,
                                                                       and hence must be checked by more monitors. Let us suppose
                                                                       for example that the formal specification of the event pattern
              I. I NTRODUCTION AND M OTIVATION
                                                                       states that when the alarm rings, Mary stops it and either she
   Runtime verification (RV) is a software verification tech-          wakes up, washes, has breakfast and goes to work (then the
nique that complements formal static verification and testing.         monitoring is allowed to stop), or she continues to sleep.
In RV, dynamic checking of the correct behavior of a system               The events “wake up”, “wash”, “have breakfast”, “go to
is performed by a monitor which is generated from a formal             work” are independent from the event “sleep”, so a good way
specification of the properties to be verified.                        to decentralize the monitoring activity between two monitors
   Distributed runtime verification (DRV), as described by S.          m1 and m2 could be to have m1 in charge of the “active”
Rajsbaum in his keynote speech at the SSS 2015 Symposium               behavior (m1 would check that “when the alarm rings, Mary
[25], tackles the problem of building a decentralized runtime          stops it and then either she wakes up, washes, has breakfast,
monitor for a distributed system and involves designing a              goes to work – then the monitoring can stop –, or the mon-
distributed algorithm that coordinates the monitors in order           itoring can stop immediately”) and m2 in charge of the lazy
for them to correctly verify the dynamic behavior of the whole         one (m2 would check that “when the alarm rings, Mary stops
system.                                                                it and then either she continues to sleep, or the monitoring can
   With respect to RV, DRV techniques are more complex and             stop immediately”).
less developed, since they involve designing a distributed al-            It would be convenient that the events “the alarm rings” and
gorithm that monitors another distributed algorithm. Neverthe-         “Mary stops it” were monitored by both m1 and m2, in order
less, they are gaining more and more attention, as shown by the        to avoid a further monitor ensuring the correct sequence of
recent Distributed Runtime Verification Workshop organized             initial events, whatever Mary’s attitude towards the new day.
in Bertinoro in May 20161 .                                               Due to events which may be monitored by more than
                                                                       one monitor, the total amount of monitored events in the
  D. Ancona, A. Ferrando, and V. Mascardi are with DIBRIS,
University of Genova, Italy. davide.ancona@unige.it,                   decentralized setting may be larger than in the centralized
angelo.ferrando@dibris.unige.it,                                       case. However, decentralized monitors can be run in parallel
viviana.mascardi@unige.it                                              on different machines, so the workload is more balanced
  D. Briola is with DISCO, University of Milano-Bicocca, Italy.
daniela.briola@unimib.it                                               than in the centralized setting. Also, sometimes a centralized
  1 http://www.labri.fr/perso/travers/DRV2016/index.html.              approach is unfeasible, as in the motivating scenario discussed


                                                                   35
                                                                                                                                      2


in our previous work presented at EMAS 2014 [2]. There, we            of agents in a MAS which provide formal guarantees that
supposed that the safety of a humanitarian convoy was guar-           verification through projected types and decentralized monitors
anteed by unmanned aerial vehicles whose behavior is critical         is equivalent to verification performed by a single centralized
for the success of the mission and hence subject to runtime           monitor with a single global type.
verification, but which might not be monitorable by a single             This paper addresses that problem by presenting MAS-
monitor for network connectivity reasons and for failures of          DRiVe, an algorithm for partitioning a MAS in such a way
some computational entities. In cases like that, decentralizing       that the identified subsystems can be safely monitored in
the runtime verification activity is no longer an option, but         a decentralized way. The paper is structured as follows:
the only viable solution. The possibility to exchange protocol        Section II discusses the related work, Section III introduces
specifications, dealt with as first class entities that can be sent   background knowledge, Section IV presents the design of
and received [3], allows us to apply our approach to oppor-           MAS-DRiVe, Section V discusses its implementation and
tunistic networks (Wireless Sensor Networks, drone ferries,           experiments, Section VI concludes.
and the like) where the actual communication protocols are
unknown until deployment, and mobility dynamically changes
                                                                                           II. R ELATED W ORK
them.
   Finding a set of sub-specifications S1 , . . . , Sn of S              DRV lies at the crossroad where decentralized algorithms
meeting the requirements for decentralizing the monitoring            and formal methods meet [25]. It must cope both with the
activity heavily depends on the way S has been specified.             problems intrinsic to distributed systems, and with those
RV specifications are typically expressed using trace predicate       related to formalization of properties to be verified at runtime.
formalisms. Many of them, such as finite state machines, reg-         Furthermore, the distribution of the monitoring activity itself
ular expressions, context free grammars, linear-time temporal         poses new problems and raises new challenges.
(LTL) logic, have been originally introduced for other aims              One of the most severe and most studied problems which
and then exploited for RV; others, such as trace expressions          characterize decentralized systems is the lack of a global clock
[6], have been expressly devised for RV. Different formalisms         and the need to tag events with a timestamp which, although
may have different expressive power, which impacts on the             local to the node where the event took place or was observed,
types of patterns that can be described and on the meaning of         can be compared with the timestamps of the other events to
“independent” sub-specifications.                                     check their relative order. In this paper we assume that each
   Although the MAS-DRiVe algorithm proposed in this paper            monitor observes the events it is in charge of, in the same
for decentralizing the monitoring activity is general enough to       order as they actually took place in the environment. This
be adopted whatever the formalism and the kind of monitored           condition is enough for our approach to work. As mentioned
events are, it has only been experimented for the specific            in the introduction, some events may be observed by more
formalism of trace expressions, for specific decentralized            than one monitor so the condition we pose implicitly means
systems, that is, multiagent systems (MASs), and for specific         that event ordering is always preserved in all monitors. A large
events, namely interactions between pairs of agents. Also, as         body of literature addresses this problem; the two most widely
we will see later on, the “unsplittable agents identification” is     used approaches are to connect the different physical clocks
almost naif and requires a refinement and a better theoretical        of the decentralized nodes to one common logical clock, by
support.                                                              exploiting causality in communication between nodes (Lam-
   Trace expressions are an evolution of global types [5], which      ports “happened before” relation [22]), and to synchronize the
have been initially proposed for RV of interactions in MASs.          local clocks to provide a full ordering of events as proposed
Trace expressions are an expressive formalism based on a set          for example by M. Maróti et al. [23].
of operators (including prefixing, concatenation, shuffle, union,        Formalizing the system properties in order to dynamically
and intersection) to denote finite and infinite traces of events.     verify them requires that a suitable formal language is avail-
Their semantics is based on a labeled transition system defined       able. Among the first efforts in this direction we may cite one
by a simple set of rewriting rules which directly drive the           paper by A. Bauer, M. Leucker, and C. Schallhart [7] where
behavior of monitors generated from them. They have been              the use of LTL in the context of RV was explored. When used
used to model and monitor (in a simulated setting) both fail-         for RV, the expressive power of LTL is reduced, because at
uncontrolled and ambient intelligence systems [4], as well as         runtime only finite traces can be checked. To provide a formal
medical guidelines for remote patient monitoring [16].                account for this limitation, a three-valued semantics for LTL,
   In the already cited EMAS 2014 paper, we tackled the               called LTL3 , has been proposed later on [8]. A third truth
problem of projecting a global specification of an interaction        value “?” is introduced to specify that after a finite trace of
protocol expressed in the global types formalism onto subsets         events has been occurred, the outcome of a monitor can be
of agents in the MAS. The projection function is a key element        inconclusive.
for moving from a global specification of the protocol to a              The generation of efficient monitors for properties specified
localized one, which involves less agents and hence hopefully         in a variety of formal languages was firstly addressed by
less interaction events. However, in our previous work we             M. Kim et al. [21, 18] and many proposals exist for RV
faced only one of the issues involved in distributing the             of object-oriented systems (jassda [11], PQL [24], JavaMOP
RV activity, that of projecting the interaction protocol, and         [13], LARVA [14], SAGA [15], just to cite a few) and MASs
did not consider the problem of finding suitable partitions           [1, 17]. A large community of researchers works on RV


                                                                  36
                                                                                                                                                  3


and in 2001 the Runtime Verification Workshop/Conference                 •  τ1 ·τ2 (concatenation), denoting the set of all traces ob-
series was initiated (http://www.runtime-verification.org), and             tained by concatenating the traces of τ1 with those of
workshops/conferences have occurred each year since then.                   τ2 .
   When moving from RV to DRV, however, the situation                    • τ1 ∧τ2 (intersection), denoting the intersection of the
dramatically changes: no introductory papers and surveys on                 traces of τ1 and τ2 .
the topic exist, which demonstrates the youth of this research           • τ1 ∨τ2 (union), denoting the union of the traces of τ1 and
field. W.r.t. DRV of MASs, we were not able to find any related             τ2 .
work except for one paper of ours [10] where the projection              • τ1 |τ2 (shuffle), denoting the set obtained by shuffling the
mechanism introduced in our previous EMAS 2014 work had                     traces in τ1 with the traces in τ2 .
been integrated with JADE [9].                                        To support recursion without introducing an explicit construct,
                                                                      trace expressions are regular (a.k.a. rational or cyclic) terms.
                      III. BACKGROUND                                 A regular term can be represented by a finite set of syntactic
   In this section we introduce the three pillars of our work:        equations, as happens, for instance, in most modern Prolog
trace expressions, projection, and graph partitioning.                implementations where unification supports cyclic terms. A
                                                                      trace expression τ is contractive if all its infinite paths from
A. Trace Expressions                                                  the root contain the prefix operator.
                                                                         From the operators above we can derive the filter opera-
   Trace expressions are a specification formalism expressly          tor, useful for making trace expressions more compact and
designed for RV. They are based on the notions of event and           readable. The expression ϑτ denotes the set of all traces
its abstraction, event type.                                          contained in τ , when deprived of all events that do not match
                                                                      ϑ. Assuming that event types are closed by complementation,
  Events. In the following we denote by E a fixed universe of         the expression above is a convenient syntactic shortcut for T |τ ,
events. An event trace over E is a possibly infinite sequence         where T = ∨ϑ:T , and ϑ is the complement event type of ϑ,
of events in E. As an example, we might have E = {                    that is, JϑK = E \ JϑK.
msg(bob, alice, tell, m1), msg(alice, bob, tell, a1), msg(bob,           In the context of RV, where the three-valued semantics
carol, tell, m2), msg(carol, bob, tell, a2), msg(bob, dave, tell,     is considered for LTL, trace expressions are strictly more
m3), msg(dave, bob, tell, a3) } where the interaction event           expressive than LTL [6]: every LTL formula can be encoded
msg(S, R, P, C) corresponds to the interaction between agent          into a trace expression with an equivalent three-valued seman-
S and agent R, with S sending a message with performative             tics, whereas the opposite property does not hold, since trace
P and content C to R. In this example, contents m1, m2 and            expressions are also able to specify all context-free languages,
m3 correspond to actual messages, and contents a1, a2 and             and also some non context-free ones.
a3 correspond to acknowledges of reception.
                                                                         Although trace expressions can define protocols involving
                                                                      events of any kind, in the sequel we will limit our investigation
   Event types. To be more general, trace expressions are
                                                                      to interaction events like msg(S, R, P, C) and interaction types
built on top of event types (chosen from a set ET), rather
                                                                      that we will identify with α.
than of single events; an event type denotes a subset of E.
For example, if we were interested only in the type of the
message content (actual message or acknowledge), we might             B. Projection
define the two event types Msg= { msg(bob, alice, tell,                  The projection function was first introduced in [2]. Given
m1), msg(bob, carol, tell, m2), msg(bob, dave, tell, m3) }            the finite set AGS of all the agents that could play a role in
and Ack= { msg(alice, bob, tell, a1), msg(carol, bob, tell,           the MAS and an interaction type α, senders(α) is the set of
a2), msg(dave, bob, tell, a3) }, and use them for describing          all the agents in AGS that could play the role of sender in
interaction patterns.                                                 actual interactions having type α, and receivers(α) is the set
                                                                      of all the agents in AGS that could play the role of receiver
   Trace expressions. A trace expression τ represents a set of        in interactions of type α. The involves predicate holds on one
possibly infinite event traces, and is defined on top of the          interaction type α and one set of agents Ags, involves(α, Ags),
following operators (binary operators associate from left, and        iff (senders(α) ∩ Ags 6= ∅) ∨ (receivers(α) ∩ Ags 6= ∅).
are listed in decreasing order of precedence, that is, the first         Projection can be described as a function Π : T ×
operator has the highest precedence):                                 P(AGS) → T where T is the set of trace expressions. Π is
   •  (empty trace), denoting the singleton set {} containing       driven by the syntax of the trace expression to project; since
     the empty event trace .                                         Π is defined on cyclic terms, the simplest way to define it
   • ϑ:τ (prefix), denoting the set of all traces whose first event   would be by coinduction2 as follows:
     e matches the event type ϑ (e ∈ ϑ), and the remaining               (i) Π(, Ags) = 
     part is a trace of τ . For example, if our communication
     protocol just required that an interaction from the set            2 Coinduction is a technique for defining and proving properties of systems

     Msg must take place, no matter which one, and then an            of concurrent interacting objects. It is the mathematical dual to structural
                                                                      induction. Coinductively defined types are typically infinite data structures,
     interaction from Ack must occur, no matter which one,            such as streams. See http://cseweb.ucsd.edu/groups/tatami/handdemos/doc/
     we could express it as Msg:Ack:.                                coind.htm for a more technical formalization.


                                                                  37
                                                                                                                                       4


   (ii) Π(α : τ, Ags) = α : Π(τ, Ags) if involves(α, Ags)                   all edges in the CIG have weight 1, but the approach
   (iii) Π(α : τ, Ags) = Π(τ, Ags) if ¬involves(α, Ags)                     works also in case edge weights are different: a higher
   (iv) Π(τ 0 op τ 00 , Ags) = Π(τ 0 , Ags) op Π(τ 00 , Ags), where         edge weight would model a preferential communication
op ∈ {·, ∧, ∨, |}.                                                          channel;
   This definition works properly only on all non cyclic terms           4) partition the new graph obtained so far using some
and on some, but not all, cyclic terms. To guarantee that                   suitable graph partition method;
the projection function always returns a contractive type and            5) project the interaction protocol onto these agents’ parti-
that the correct coinductive definition is implemented, it is               tions.
necessary to keep track of all types visited by Π along a path;
each type is associated with its depth in the path, and with              The algorithm is explained by means of two running exam-
a fresh variable which will be unified with the corresponding          ples introduced in [2].
computed projection. During the visit, the depth DeepestSeq
of the deepest visited sequence operator is kept. If a type τ has
been already visited, then a cycle is detected: if its depth is less
than DeepestSeq then the cycle contains an occurrence of the           A. Examples
sequence constructor, therefore the projected type associated
with τ is contractive and, hence, is returned; otherwise, the
projection would not be contractive, therefore  is returned.

C. Graph Partitioning
   The graph partitioning problem (GPP) is defined in the
following way [12]: given a number k ∈ N>1 and an
undirected graph G = (V, E) with non-negative node and edge
weights, GPP asks for a partition of V , that is, with blocks of
nodes (V1 , ..., Vk ) such that
   1) V1 ∪ V2 ∪ ... ∪ Vk = V
   2) Vi ∩ Vj = ∅ ∀i 6= j
   A balance constraint may be required, demanding that all
blocks have about equal weights. To be more precise, a good            Fig. 1. The “Socks and Shoes” MAS
partition is the one where the sum of the node weights
in each Vi is “about the same” and the sum of all edge                       a) Socks and Shoes (SaS): Let us consider the simple
weights of edges connecting all different pairs Vi and Vj is           scenario illustrated in Figure 1 where two robots (right
minimized. Applications of graph partitioning include parallel         and left), two monitors (right and left) associated with
processing, road networks, image processing, VLSI design,              each robot, and a plan monitor which supervises them,
social networks, and bioinformatics. Among the many existing           must reach the goal to put the right and left socks and
tools for graph partitioning we opted for using METIS [20],            shoes in the correct way (socks first!). As robots are
a set of serial programs for partitioning graphs, partitioning         autonomous, they could perform the two actions in the wrong
finite element meshes, and producing fill reducing orderings           order: monitors are there to ensure that wrong actions are
for sparse matrices (http://glaros.dtc.umn.edu/gkhome/views/           immediately rolled back. Robots communicate their actions
metis, accessed on June 2016). The choice of METIS was due             to their corresponding monitors, which, in turn, notify the
to the efficiency of the multi-level Kernighan/Lin approach it         plan monitor when the robots accomplish their goal. The
builds upon [19] and to the ease of its installation and usage.        interaction types involving the right robot and the right node
                                                                       monitor are defined as msg(right robot, right node monitor,
                  IV. MAS-DR I V E : D ESIGN                           tell,   put sock)     ∈    put right sock;      msg(right robot,
   Given an interaction protocol expressed in some suitable            right node monitor, tell, put shoe) ∈ put right shoe;
formalism, the MAS-DRiVe algorithm for partitioning a MAS              msg(right robot, right node monitor, tell, removed shoe) ∈
in such a way that the identified subsystems can be safely             removed right shoe; msg(right node monitor, right robot,
monitored in a decentralized way is the following:                     tell, oblige remove shoe) ∈ oblige remove right shoe;
   1) extract the interaction graph IG from the interaction            msg(right node monitor, plan monitor, tell, ok) ∈ ok right.
      protocol;                                                        Those for the left robot and left node monitor are symmetrical.
   2) identify the sets of agents which cannot be split during            The right robot (RIGHT branch in the protocol description
      the decentralized monitoring (unsplittable agents), since        below) can start by putting the sock, which is the correct action
      the interactions they are involved in are not independent;       to do, or (∨ operator) by putting the shoe, which requires
   3) compute the collapsed interaction graph CIG by col-              a recovery by the right robot monitor and looping back to
      lapsing each of those sets into a single node with weight        the RIGHT branch. The left robot has the same behavior
      equal to the cardinality of the collapsed set, and by com-       (LEFT branch in the protocol description). The SaS protocol
      puting the collapsed edges consistently; for simplicity,         is described by the shuffle of the actions of the right and left


                                                                   38
                                                                                                                                        5


robots and monitors (RIGHT | LEFT):                                     verify MA2 , and if an event has type msg3 or ack3 , then it
                                                                        must verify MA3 .
 RIGHT  = (put right sock:put right shoe:ok right:)∨
          (put right shoe:oblige remove right shoe:
          removed right shoe:RIGHT)                                     B. Algorithm Steps
   LEFT = (put left sock:put left shoe:ok left:)∨
          (put left shoe:oblige remove left shoe:                        As already introduced in the beginning of this section, the
          removed left shoe:LEFT)                                       MAS-DRiVe algorithm consists of five steps.
    SaS = RIGHT|LEFT.
                                                                        1) Interaction graph extraction. The interaction graph IG of
     b) Alternating Bit Protocol with four participants                 the agents interaction protocol AIP expressed using formal-
(ABP3): As a second example we consider the Alternating                 ism F is an undirected and unweighted graph with one node
Bit Protocol ABP.                                                       Ai for each agent Ai involved in AIP . An edge between Ai
                                                                        and Aj exists iff an interaction between Ai and Aj is foreseen
                                                                        by AIP . By “interaction between Ai and Aj ” we mean that
                                                                        either Ai sends a message to Aj , or viceversa.
                                                                           As an example, the nodes extracted from the SaS
                                                                        protocol are right robot, right node monitor, plan monitor,
                                                                        left robot, left node monitor and the edges are (right robot,
                                                                        right node monitor), (right node monitor, plan monitor),
                                                                        (left robot,      left node monitor),        (left node monitor,
                                                                        plan monitor).
                                                                           The interaction graph algorithm explores the trace expres-
Fig. 2. The ABP MAS with one “manager” (Bob) and three participants.
                                                                        sion to find pairs of agents involved in the interaction types.
   We consider the instance of ABP that we name ABP3, where             Since the trace expression may be cyclic, loops must be
six different interactions may occur: Bob sends m1 to Alice             detected and dealt with properly. This goal is achieved in
(interaction type msg1 ), Alice sends a1 to Bob (interaction            a simple way: trace expressions met during exploration are
type ack1 ), Bob sends m2 to Carol (interaction type msg2 ),            recorded, and when the exploration comes back to an already
Carol sends a2 to Bob (interaction type ack2 ), Bob sends               recorded trace, it stops.
m3 to Dave (interaction type msg3 ), Dave sends a3 to Bob                  To identify the sets of unsplittable agents during the second
(interaction type ack3 ). The ABP3 is an infinite iteration,            step of the algorithm, during this stage an auxiliary directed
where the following constraints have to be satisfied for all            multigraph IGext is generated, where each edge connecting
occurrences of the sending actions:                                     two nodes is labeled with a string that encodes the path
   1. The n-th occurrence of an interaction of type msg1                followed to reach that interaction in the protocol. The nodes
must precede the n-th occurrence of an interaction of type              and edges in IGext are obtained by means of the extract(ID,
msg2 which in turn must precede the n-th occurrence of an               AlreadyMetTrExp, Str, Level, τ ) function where
interaction of type msg3 which in turn must precede the n+1-               – ID is the protocol identifier;
th occurrence of an interaction of type msg1 .                             – AlreadyMetTrExp is the set of trace expressions which
   2. For k ∈ {1, 2, 3}, the n-th occurrence of msgk must               were already met during the protocol analysis, to take care of
precede the n-th occurrence of the acknowledge ackk , which,            cycles;
in turn, must precede the (n + 1)-th occurrence of msgk .                  – Str is the current path encoding;
ABP3 can be defined by the following set of equations:                     – Level is an encoding of the current outermost operator
  ABP3 = (msgMM )∧(msg ack (1)MA1 )∧                                  level and position (left or right) in the trace expression parse
         (msg ack (2)MA2 )∧(msg ack (3)MA3 )                          tree;
   MM = msg1 :msg2 :msg3 :MM                                               – τ is the trace expression currently under consideration.
  MA1 = msg1 :ack1 :MA1
  MA2 = msg2 :ack2 :MA2                                                 extract(ID, AlreadyMetTrExp, Str, Level, τ ) is defined by cases:
  MA3 = msg3 :ack3 :MA2                                                     1) if TrExpr belongs to AlreadyMetTrExp, then return;
where msg ack (i), i ∈ {1, 2, 3}, and msg denote the event                  2) if TrExpr is , then return;
types s.t. Jmsg ack (i)K = Jmsg i K ∪ Jack i K, i ∈ {1, 2, 3}, and          3) if τ is α uop τ1 , where uop is an operator in {:, },
JmsgK = Jmsg1 K ∪ Jmsg2 K ∪ Jmsg3 K.                                           then
   The trace expression defined by MM corresponds to the                          a) NewStr = Str • (uop, Level+1)
first constraint informally stated above, while MA1 , MA2 ,                      b) extract all the possible pairs of senders and re-
and MA3 formalize the second constraint. The main trace                              ceivers (S, R) such that S and R are involved in
expression ABP3 can be easily read as follows: if an event                           α
has type msg1 or msg2 or msg3 , then it must verify MM ,                          c) for each of such pairs, generate and assert edge(ID,
and if an event has type msg1 or ack1 , then it must verify                          S, R, NewStr)                  S
MA1 , and if an event has type msg2 or ack2 , then it must                       d) NewAlreadyMetTrExp = {τ }          AlreadyMetTrExp


                                                                       39
                                                                                                                                      6


          e) call extract(ID, NewAlreadyMetTrExp, NewStr,                 As an example, alice, bob, carol and dave in the ABP3
             Level+1, τ1 )                                            represent an unsplittable set of agents. In fact, if we monitored
   4) if τ is τ1 ∧τ2 , then                                           interactions involving alice and bob only, we could find that
          a) NewStr = Str • (∧, Level+1) S                            they respect the second constraint of the protocol, but we could
         b) NewAlreadyMetTrExp = {τ }           AlreadyMetTrExp       never ensure that the first constraint – which also depends on
          c) call extract(ID, NewAlreadyMetTrExp, NewStr,             the interactions of bob with carol and dave – is respected as
             Level+1, τ1 )                                            well. The notions of dependence and independence are related
         d) call extract(ID, NewAlreadyMetTrExp, NewStr,              to the protocol but also to the formalism used for modeling
             Level+1, τ2 )                                            it, so no general rule for this step can be devised. When the
                                                                      interaction formalism F is that of trace expressions, however,
   5) if τ is τ1 bop τ2 , where bop is an operator in {·, ∨, |},
                                                                      dependencies among interactions are naturally modeled using
       then
                                                                      the intersection operator. Thus, although it may be an over-
          a) NewStr1 = Str • (bop, Level+1)                           cautious approach, when using trace expressions we define
         b) NewStr2 = Str • (bop, 2*Level+1) S                        “unsplittable” the agents involved in interactions connected
          c) NewAlreadyMetTrExp = {τ }          AlreadyMetTrExp       by an intersection operator.
         d) call extract(ID, NewAlreadyMetTrExp, NewStr1,                 The algorithm for performing the identification of
             Level+1, τ1 )                                            unsplittable agents looks at the labels in IGext and creates
          e) call extract(ID, NewAlreadyMetTrExp, NewStr2,            sets of agents involved in interactions whose label has the
             2*Level+1, τ2 )                                          same prefix before the first ∧ operator in the label, if any.
   Given a protocol identified by ID and represented by the           For example, in SaS no label contains ∧ and hence the set
trace expression τ , extract(ID, {}, Λ, 1, τ ) generates and          of unsplittable agents is empty, whereas in ABP3 all the
asserts all the edges in IGext . The generation of IGext nodes        edge labels share the (∧, 2) prefix, meaning that all the
is straightforward. The idea behind this function is that edges       interactions in the protocol belong to branches connected
whose labels share the same prefix have a strong dependency           by the same intersection operator. As a consequence, all the
as they belong to the same intersection. In fact, the only case       agents involved in the ABP3 protocol interactions belong to
where labels of edges share the same prefix is that of ∧ (case        the same set of unsplittable agents, which coincides with all
4), where the same string and level are passed to the recursive       the agents in the MAS.
calls. If the operator is not an intersection (case 5), the strings
and levels passed to the recursive calls are different, meaning       3) Graph collapse. In this stage, we transform the unweighted
that the two operands are independent from one another and            undirected interaction graph IG into a weighted undirected
so will be the edges generated during their exploration.              graph CIG, obtained by collapsing each node in IG corre-
   For example, the edges in the auxiliary IGext graph ex-            sponding to an agent belonging to unsplittable set U , into a
tracted from the ABP3 protocol are:                                   single node with label U and weight equal to U cardinality.
   (bob, alice, ((∧, 2) (∧, 3) (∧, 4) (, 5)) )                       Nodes in CIG are labeled with sets of agents, which can be
   (bob, carol, ((∧, 2) (∧, 3) (∧, 4) (, 5)) )                       the singleton set for agents belonging to no unsplittable set.
   (bob, dave, ((∧, 2) (∧, 3) (∧, 4) (, 5)) )                        Edges exiting from (resp. entering into) a node are all those
   (bob, alice, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6)) )                which, in IG, exited from (resp. entered into) one of the nodes
   (bob, carol, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6) (:, 7)) )         in the label.
   (bob, dave, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6) (:, 7) (:, 8)) )      Considering the running examples above, the
   (alice, bob, ((∧, 2) (∧, 3) (∧, 4) (, 5) (:, 6) (:, 7)) )         weighted nodes in the SaS CIG are ({right robot},1)
   (bob, carol, ((∧, 2) (∧, 3) (, 4)) )                              ({right node monitor},1) ({plan monitor},1) ({left robot},1)
   (carol, bob, ((∧, 2) (∧, 3) (, 4)) )                              ({left node monitor},1) namely, no collapse took place,
   (bob, carol, ((∧, 2) (∧, 3) (, 4) (:, 5)) )                       while the only weighted node in the ABP3 CIG is
   (carol, bob, ((∧, 2) (∧, 3) (, 4) (:, 5) (:, 6)) )                ({alice,bob,carol,dave},4) labeled with the set of unsplittable
   (bob, dave, ((∧, 2) (, 3)) )                                      agents.
   (dave, bob, ((∧, 2) (, 3)) )                                         The edges in the SaS CIG are the same as those in the
   (bob, dave, ((∧, 2) (, 3) (:, 4)) )                               SaS IG, while there are no edges in the ABP3 CIG due to
   (dave, bob, ((∧, 2) (, 3) (:, 4) (:, 5)) )                        the presence of only one node.

The IG nodes are alice, bob, dave, carol and the edges are            4) Graph partitioning. This stage of the algorithm consists in
(alice, bob), (dave, bob), (carol, bob), which are extracted in       partitioning the CIG graph obtained by the collapse stage.
a trivial way from IGext .                                            The number of expected partitions is given as an input by the
                                                                      user and must be grater than one. Any suitable partitioning
2) Identification of unsplittable agents. Not all the agents          algorithm could be used during this stage. We will discuss
can be monitored independently from one another, as some              our implementative choice in Section V.
interaction patterns that must be respected by a set of agents           The partition of the original, not collapsed graph IG, is
seen as a whole, could be lost when looking at subsets of the         trivially derived by that of CIG by making the union of
agents.                                                               the agents in the labels of the nodes in each CIG partition.


                                                                  40
                                                                                                                               7


By using a graph partitioning algorithm where the sum of          Unsplittable list: []
the node weights in each partition is about the same and
                                                                  The trace expressions
the number of edges connecting nodes in two different parti-      @((S_1|S_2),
tions is minimized (by associating the unit weight with each      [S_1=
edge), we can obtain the following partition for the agents       (put_right_sock:put_right_shoe:ok_right:lambda)\/
                                                                  (put_right_shoe:oblige_remove_right_shoe:
involved in the SaS protocol: {{left node monitor, left robot},                          removed_right_shoe:S_1),
{plan monitor, right node monitor, right robot} } .               S_2=
   The agents involved in the ABP3 protocol cannot be             (put_left_sock:put_left_shoe:ok_left:lambda)\/
                                                                  (put_left_shoe:oblige_remove_left_shoe:
partitioned because the collapsed graph consists of only one                            removed_left_shoe:S_2)])
node.
                                                                  can be partitioned into
5) Projection. Once the agents belonging to the same partition    {[left_node_monitor],[left_robot]}
have been devised, the projection algorithm presented in          {[plan_monitor],[right_node_monitor],[right_robot]}
Section III-B can be used to project the global interaction
protocol onto each of them.                                       Nodes
                                                                  node(socks,right_robot)
           V. I MPLEMENTATION AND E XPERIMENTS                    node(socks,right_node_monitor)
                                                                  node(socks,plan_monitor)
   The MAS-DRiVe algorithm is fully implemented in SWI-           node(socks,left_robot)
Prolog (http://www.swi-prolog.org/, accessed on June 2016),       node(socks,left_node_monitor)
apart from the graph partitioning stage which exploits the
                                                                  Extended edges
METIS tool described in Section III. The extraction of the        edge(socks,right_robot,right_node_monitor,
IG from the protocol, the identification of the unsplittable      [((|),2), ((\/),3), ((:),4)])
agents sets, and the generation of the collapsed graph CIG
                                                                  edge(socks,right_robot,right_node_monitor,
are performed by Prolog predicates. After them, a predicate for   [((|),2), ((\/),3), ((:),4), ((:),5)])
encoding CIG into a format suitable for METIS and saving          ........
it into a file is called. The command line for running METIS
                                                                  Collapsed nodes
on that file is launched from inside Prolog, and the result is    cnode(socks,[right_robot],1)
read and decoded.                                                 cnode(socks,[right_node_monitor],1)
   In order to run the algorithm, the user must call the          cnode(socks,[plan_monitor],1)
                                                                  cnode(socks,[left_robot],1)
partition/3 predicate with three arguments: the identifier        cnode(socks,[left_node_monitor],1)
of the agent interaction protocol to distribute, the number
of partitions to be obtained, and the name of the file where      Collapsed edges
                                                                  cedge(socks,[right_robot],[right_node_monitor])
Prolog will write the result of the partition. For example,       cedge(socks,[right_node_monitor],[plan_monitor])
given that abp3 identifies the ABP3 protocol, and that the        cedge(socks,[left_robot],[left_node_monitor])
trace expression associated with this identifier has been         cedge(socks,[left_node_monitor],[plan_monitor])
read into the Prolog knowledge base, the user should call            We run experiments with protocols that involve distinct
partition(abp3, 2, ’./abp3Out.txt’) obtaining                     groups of agents each following the ABP3 protocol. For
the following content for the abp3Out.txt file:                   example, in the DoubleABP3, alice, bob, carol and dave follow
                                                                  the ABP3 protocol, alice2, bob2, carol2 and dave2 follow the
Unsplittable list:                                                ABP3 protocol as well, and both bob and bob2 interact with
[((/\),2)],[alice,bob,carol,dave]                                 boss. Since alice, bob, carol and dave have no interactions
Partition failed                                                  with alice2, bob2, carol2 and dave2, the two groups are
                                                                  independent from each other and can be safely partitioned
Nodes                                                             for the purpose of decentralized runtime verification. In such
node(abp3,bob)
node(abp3,alice)                                                  a situation, the partition computed by MAS-DRiVe is the
node(abp3,carol)                                                  following:
node(abp3,dave)
                                                                  Unsplittable list:
Extended edges                                                    [ ((*),3), ((\/),7), ((:),8), ((:),9), ((/\),10)],
edge(abp3,bob,alice,[ ((/\),2), ((/\),3),                                                [alice2,bob2,carol2,dave2]
                      ((/\),4), ((>>),5)])                        [ ((*),3), ((\/),4), ((:),5), ((:),6), ((/\),7)],
edge(abp3,bob,carol,[ ((/\),2), ((/\),3),                                                    [alice,bob,carol,dave]
                      ((/\),4), ((>>),5)])
....                                                              The trace expressions
                                                                  @((bobasks:lambda|bob2asks:lambda)*
Collapsed nodes                                                   ((okbob:nobob2:msg3>>S_1/\msg_ack(1)>>S_2/\
cnode(abp3,[alice,bob,carol,dave],4)                              msg_ack(2)>>S_3/\msg_ack(3)>>S_4)\/
                                                                  (okbob2:nobob:mmsg3>>S_5/\mmsg_ack(1)>>S_6/\
The content of the socksOut.txt file that we obtain by            mmsg_ack(2)>>S_7/\mmsg_ack(3)>>S_8)),
                                                                  [S_1=m1:m2:m3:S_1,S_2=m1:a1:S_2,
calling partition(socks, 2, ’./socksOut.txt’)                     S_3=m2:a2:S_3,S_4=m3:a3:S_4,S_5=mm1:mm2:mm3:S_5,
is:                                                               S_6=mm1:aa1:S_6,S_7=mm2:aa2:S_7,S_8=mm3:aa3:S_8])


                                                              41
                                                                    events. For example, an interaction between the plan monitor
can be partitioned into                                             and the left node monitor, corresponding to a (plan monitor,
{[boss],[alice2,bob2,carol2,dave2]}                                 left node monitor) edge in the graph, will be monitored by
{[alice,bob,carol,dave]}                                            both the monitor in charge of {left node monitor, left robot}
                                                                    and that in charge of {plan monitor, right node monitor,
   The algorithm correctly recognizes that alice, bob, carol and
                                                                    right robot}, as it involves agents in both of them. This
dave must be monitored all together, and that the same holds
                                                                    redundancy cannot be avoided, but can be kept as small as
for alice2, bob2, carol2 and dave2. No other constraints on
                                                                    possible by selecting a partitioning algorithm that minimizes
the partition are found. We got the expected correct results
                                                                    the number of edges across partitions. Graph partitioning is
also with the TripleABP3 protocol, with three distinct groups
                                                                    another area of improvement of our work: strategies could
of agents following the ABP3.
                                                                    be employed to automatically detect the optimal number of
   The code of the MAS-DRiVe algorithm can be downloaded
                                                                    monitors to deploy, or at least guidelines may be provided in
from      http://www.disi.unige.it/person/MascardiV/Software/
                                                                    the future.
masdrive.html. It requires SWI Prolog and the METIS
                                                                       Finally, as part of our future work, we plan to make an ex-
software installed and accessible via the command line from
                                                                    haustive experimentation of MAS-DRiVe on existing complex
everywhere. We tested it on Mageia Linux.
                                                                    protocols. METIS can easily partition graphs with hundreds
                                                                    of nodes so we will model protocols involving hundreds of
          VI. C ONCLUSIONS AND F UTURE W ORK                        agents and test the scalability of the approach. Studying its
   This paper addresses the problem of how to partition a           computational properties from a theoretical viewpoint is on
MAS into agents’ subsets that can be verified at runtime,           our agenda as well.
independently from one another, in such a way that the results
obtained by the decentralized runtime verification are the                                ACKNOWLEDGMENTS
same as those obtained by monitoring the whole MAS in a               We are grateful to the anonymous reviewers for their useful
centralized way.                                                    remarks and constructive comments.
   We designed and implemented the MAS-DRiVe algorithm,
whose results are consistent with those discussed in [2]. In that                              R EFERENCES
paper, the feasibility of distributing the runtime monitoring
                                                                     [1] M. Alberti, M. Gavanelli, E. Lamma, P. Mello, and P. Torroni.
over user-defined sets of agents was empirically validated by
                                                                         The SCIFF abductive proof-procedure. In AI*IA, pages 135–
checking the traces of events compliant with the decentralized           147, 2005.
subprotocols up of a given length, and verifying whether they
                                                                     [2] D. Ancona, D. Briola, A. El Fallah Seghrouchni, V. Mascardi,
were compliant with the original global protocol as well. That           and P. Taillibert. Efficient verification of MASs with projections.
approach gave a semi-decidable information: if all traces up             In EMAS 2014. Revised Selected Papers, volume 8758 of LNCS,
to a given length l were found to be compliant with the                  pages 246–270. Springer, 2014.
protocol, this did not ensure that problems could not arise          [3] D. Ancona, D. Briola, A. Ferrando, and V. Mascardi. Global
with traces of length l + 1. With MAS-DRiVe we propose a                 protocols as first class entities for self-adaptive agents. In
constructive way to partition agents in the MAS, rather than             G. Weiss, P. Yolum, R. H. Bordini, and E. Elkind, editors, Pro-
letting the user decide how to partition them, and we overcome           ceedings of the 2015 International Conference on Autonomous
the problems due to the empirical validation approach by                 Agents and Multiagent Systems, AAMAS 2015, pages 1019–
proposing partitions that are “safe” w.r.t. the trace expression         1029. ACM, 2015.
operators. In particular, we assume that agents involved in          [4] D. Ancona, D. Briola, A. Ferrando, and V. Mascardi. Runtime
an intersection can never be partitioned. This is an over-               verification of fail-uncontrolled and ambient intelligence sys-
cautious approach, but since intersection is the construct used          tems: A uniform approach. Intelligenza Artificiale, 9(2):131–
to state constraints across different independent branches of the        148, 2015.
protocol, we assume that those branches cannot be monitored          [5] D. Ancona, S. Drossopoulou, and V. Mascardi. Automatic
independently.                                                           Generation of Self-Monitoring MASs from Multiparty Global
                                                                         Session Types in Jason. In M. Baldoni, L. A. Dennis,
   In its current version, then, the MAS can not get split if
                                                                         V. Mascardi, and W. Vasconcelos, editors, DALT 2012. Revised
no semi-independent parts of the system can get identified.
                                                                         Selected Papers, volume 7784 of LNCS. Springer, 2013.
Although this is a limitation of our approach, this could have
                                                                     [6] D. Ancona, A. Ferrando, and V. Mascardi. Comparing trace
an interesting side effect that we aim to explore in the close
                                                                         expressions and linear temporal logic for runtime verification.
future, namely the possibility to come up with suggestions               In E. Ábrahám, M. M. Bonsangue, and E. B. Johnsen, editors,
for MAS designers in terms of good practices for better sub-             Theory and Practice of Formal Methods, volume 9660 of LNCS,
systems encapsulation based on interaction protocol analysis.            pages 47–64. Springer, 2016.
Such suggestions could be applied to distributed systems in          [7] A. Bauer, M. Leucker, and C. Schallhart. Comparing LTL
general, and not only to MASs, and could prove useful in                 semantics for runtime verification. J. Log. and Comput.,
those approaches where there is an embedded notion of some               20(3):651–674, June 2010.
sort of encapsulation.                                               [8] A. Bauer, M. Leucker, and C. Schallhart. Runtime verification
   As far as graph partitioning is concerned, we point out               for LTL and TLTL. ACM Trans. Softw. Eng. Methodol.,
that the obtained partitions do not correspond to partitions of          20(4):14:1–14:64, Sept. 2011.


                                                                 8
                                                                42
 [9] F. L. Bellifemine, G. Caire, and D. Greenwood. Developing
     Multi-Agent Systems with JADE. Wiley, 2007.
[10] D. Briola, V. Mascardi, and D. Ancona. Distributed runtime
     verification of JADE multiagent systems. In D. Camacho,
     L. Braubach, S. Venticinque, and C. Badica, editors, IDC 2014,
     volume 570 of Studies in Computational Intelligence, pages 81–
     91. Springer, 2014.
[11] M. Brörkens and M. Möller. Dynamic event generation for
     runtime checking using the JDI. Electr. Notes Theor. Comput.
     Sci., 70(4):21–35, 2002.
[12] A. Buluç, H. Meyerhenke, I. Safro, P. Sanders, and C. Schulz.
     Recent advances in graph partitioning. In Algorithm Engi-
     neering: Selected Results and Surveys, volume 9220 of LNCS.
     Springer, 2015 (in press).
[13] F. Chen and G. Rosu. Mop: an efficient and generic runtime
     verification framework. In OOPSLA 2007, pages 569–588,
     2007.
[14] C. Colombo, G. J. Pace, and G. Schneider. LARVA — Safer
     Monitoring of Real-Time Java Programs (Tool Paper). In SEFM
     2009, pages 33–37, 2009.
[15] F. S. de Boer and C. P. T. de Gouw. Combining Monitoring
     With Run-Time Assertion Checking. In SFM 14, pages 217 –
     262. Springer, 2014.
[16] A. Ferrando, V. Mascardi, and D. Ancona. Monitoring patients
     with hypoglycemia using self-adaptive protocol-driven agents:
     a case study. In Workshop Proceedings of EMAS 2016, 2016.
[17] L. Giordano, A. Martelli, and C. Schwind. Specifying and ver-
     ifying interaction protocols in a temporal action logic. Journal
     of Applied Logic, 5(2):214 – 234, 2007.
[18] K. Havelund and G. Rosu. Synthesizing monitors for safety
     properties. In J. Katoen and P. Stevens, editors, TACAS 2002,
     volume 2280 of LNCS, pages 342–356. Springer, 2002.
[19] G. Karypis and V. Kumar. Analysis of multilevel graph
     partitioning. In Proceedings of the 1995 ACM/IEEE Conference
     on Supercomputing, Supercomputing ’95, New York, NY, USA,
     1995. ACM.
[20] G. Karypis and V. Kumar. A fast and high quality multilevel
     scheme for partitioning irregular graphs. SIAM J. Scientific
     Computing, 20(1):359–392, 1998.
[21] M. Kim, S. Kannan, I. Lee, O. Sokolsky, and M. Viswanathan.
     Java-mac: a run-time assurance tool for java programs. Electr.
     Notes Theor. Comput. Sci., 55(2):218–235, 2001.
[22] L. Lamport. Time, clocks, and the ordering of events in a
     distributed system. Commun. ACM, 21(7):558–565, July 1978.
[23] M. Maróti, B. Kusy, G. Simon, and A. Lédeczi. The flooding
     time synchronization protocol. In SenSys ’04, pages 39–49.
     ACM, 2004.
[24] M. C. Martin, V. B. Livshits, and M. S. Lam. Finding
     application errors and security flaws using PQL: a program
     query language. In OOPSLA 2005, pages 365–383, 2005.
[25] S. Rajsbaum. Distributed runtime verification – where combina-
     torics, fault-tolerance and formal methods meet. Keynote Talk
     at the SSS 2015, August 2015, 2015.


                                                                     9
                                                                    43