=Paper= {{Paper |id=Vol-2651/paper3 |storemode=property |title=Models for Dynamic Exploration of the Statespaces of Autonomous Vehicles |pdfUrl=https://ceur-ws.org/Vol-2651/paper3.pdf |volume=Vol-2651 |authors=Johan Arcile,Raymond Devillers,Hanna Klaudel |dblpUrl=https://dblp.org/rec/conf/apn/ArcileDK20 }} ==Models for Dynamic Exploration of the Statespaces of Autonomous Vehicles== https://ceur-ws.org/Vol-2651/paper3.pdf
         Models for the dynamic exploration
     of the state spaces of autonomous vehicles

              Johan Arcile1 , Raymond Devillers2 , and Hanna Klaudel1
          1
           IBISC, Univ Evry, Université Paris-Saclay, 91025 Evry, France,
          johan.arcile@gmail.com, hanna.klaudel@ibisc.univ-evry.fr
                 2
                   ULB, Bruxelles, Belgium, rdevil@ulb.ac.be



       Abstract. We present multi-agent timed models, called MAPTs, where
       each agent is associated with a regular timed schema upon which all
       possible actions of the agent rely. MAPTs allow for a layered structure
       of the state space, so that it is possible to explore the latter dynamically
       and use heuristics to greatly reduce the computation time needed to
       address reachability problems.
       We then use an available tool for the Petri net implementation of MAPTs,
       to explore the state space of autonomous vehicle systems and compare
       this exploration with timed automata-based approaches in terms of ex-
       pressiveness of available queries and computation time.


1    Introduction

In this paper we propose models and methods tailored for efficient model-checking
of multi-agent systems composed of periodic communicating autonomous agents
evolving in a constrained environment. Typical examples of such systems are
systems of Communicating Autonomous Vehicles (CAVs), where each vehicle
(agent) observes its environment (the road, the other vehicles) and periodically
evaluates its environment in order to make a decision on the action to perform.
The decision policies of CAVs often rely on trajectory planning algorithms stud-
ied in 3D simulation, sometimes in conjunction with on-road experiments [18,
19, 14, 20, 15]. While it is possible to plan a collision-free trajectory for a vehicle
with this technique, this does not guarantee the efficiency or robustness of a
given decision policy.
     Therefore, various properties of such systems are of interest to study safety,
efficiency and robustness of the decision choices when some variability is intro-
duced in the reaction times, transmission delays, etc. They can be either Boolean
ones (e.g. Does some vehicle overtake another one before some milestone? Is a
collision possible for a given safety distance?) or numerical ones (e.g. What is
the minimal distance between two vehicles? What is the biggest deceleration for
    Copyright c 2020 for the individual papers by the papers’ authors. Copyright
 c 2020 for the volume as a collection by its editors. This volume and its papers are
published under the Creative Commons License Attribution 4.0 International (CC BY
4.0).
30      Johan Arcile et al.

some vehicle?). This motivates the use of formal methods and several approaches
exist focusing on the properties of vehicles (functional layer [13] or soundness of
vehicle platooning [17]) or a system of vehicles in their environment. Their draw-
backs are twofold: some only allow for basic interactions [26] while others lead
to unsatisfactory computation times because of their high-level of realism [22].
    The framework VerifCar [5] allows to answer such questions, with a specific
focus on the impact of latencies and communication delays on the behaviour
of CAVs. Each vehicle is modelled by an agent which performs time restricted
actions that impact the valuation of shared variables describing the agents’ states
and the environment. The system is highly non-deterministic due to overlaps of
timed intervals in which the actions of various agents can occur. This model is
based on timed automata with an interleaving semantics [2], and is implemented
in Uppaal [27], a state of the art tool for real time systems with an efficient state
space reduction for model checking. Furthermore, Uppaal allows to consider
complex shared data structures and update functions, which are needed to model
CAV systems. However, it is limited in terms of the kind of queries that may be
asked on the models and deals only with discrete values for the shared variables,
which is not always convenient and may lead to imprecise computations.
    A closer look at the state spaces in the applications studied with VerifCar
shows that they take the form of a directed acyclic graph (DAG). Each agent
also has syntactically the form of a DAG between clock resets, and various forms
of agents’ communications are modelled through shared variables. Our goal is
then to exploit these peculiarities to build a dedicated checking environment for
reachability problems, when the models satisfy the kind of restricted forms we
observed for CAV systems (as well as some other application domains). The hope
is that this will allow to get rid of the uncomfortable restrictions we observed on
the kind of queries that may be solved, while allowing to do it efficiently. Our
aim is thus to change the balance between the expressiveness of the models and
the richness of the queries while keeping a reasonable efficiency.
    Concretely, we want to explore the state graph dynamically (checking tem-
poral logic properties and computing numerical indicators directly as we explore
states) to avoid constructing the full state space, and therefore not to loose time
and memory space storing and comparing all previously reached states. The ob-
jective is to be able to tune the verification algorithm with heuristics that will
choose which path to explore in priority, which might significantly speed up the
computation time if the searched state exists. That implies that our algorithms
should explore the graph depth-first, since breadth-first algorithms cannot ex-
plore paths freely and are restricted to fully explore all the states at some depth
before exploring the next one.
    For systems featuring a high level of concurrency between actions, such as the
CAV systems, most of the non-determinism results from possibly having several
actions of different agents available from a given state, that can occur in different
orders and which often lead eventually to the same state. This corresponds in
the state space to what is sometimes called diamonds. Breadth-first exploration
allows to compare states at a given depth and therefore remove duplicates. On
                                            Models for dynamic exploration      31

the other hand, depth-first exploring such a state space with diamonds, leads to
examine possibly several times the same states or paths, which is not efficient. In
this context, our aim is to detect and merge most identical states coming from
diamonds while continuing to explore the state space mainly depth-firstly. This
diamond detection will consist in a breadth-first exploration in a certain layer of
the state space, each layer corresponding to some states at a given depth having
common characteristics. It turns out that such layers are typically observed in
the state spaces of CAV systems. This allows for a depth-first exploration from
layer to layer, while greatly reducing the chances of exploring several times the
same states. The class of models on which this kind of algorithm can be applied
will be referred to as Multi-Agent with timed Periodic Tasks (MAPTs).
    In order to analyse a model, it is useful to translate it into formalisms
for which we have existing tools. Like in [5], we may use networks of Timed
Automata [2] and the tool Uppaal [27] or Time Petri net analyzers, such as
Romeo [21] or Tina [7]. However, to keep a better control on the models and
queries, we chose to implement our algorithms with Zinc [25], a compiler for
high-level Petri nets that generates a library of functions (in Python) allowing
to easily explore the state space. In particular, this allows to apply heuristics
leading to faster computation times. Another gain when comparing to Uppaal
is that it is easy to represent complex data structures, possibly using non-integer
variables, thus limiting the loss of information.
    The existing works on state space reduction such as [28, 12] use partial-order
reductions to explore a subset of the graph, and [8] adapts this methods to
Time Petri nets. In those approaches, some independent transitions are merged
together to build a single transition in the state space. This requires to know
beforehand those different paths leading to a common state, which cannot be
done in the models we address. Indeed, diamonds may appear or not depending
on the values of the shared variables.
    Other approaches, using bounded model checking [11, 10, 23], where a Boolean
formula describes an execution of a given length, have been used for studying
temporal logic properties. The size of the Boolean formula being exponential in
function of the number of variables, this method is effective in terms of compu-
tation time only for a restricted length. This is particularly problematic for the
systems we study, where the state of the system is described by a large number
of scalar variables. In addition, this kind of approaches does not generally allow
to prove safety properties, hence to be able to conclude on the non-reachability
of a state.
    Concerning the exploration algorithms, since the systems we aim to model
with MAPTs produce acyclic state spaces, this allows us to use more efficient
on-the-fly algorithms for CTL than those proposed in the literature for general
cases, such as [9, 6].
    In this paper, we start with a formal definition of MAPT, give some examples
and provide a translation to high-level Petri nets. Then, we present the explo-
ration algorithms taking advantage of the layered structure, weak and strong
variables, and heuristics. Finally, we use MAPT in experiments that highlight
32       Johan Arcile et al.

the benefits of our approach in terms of expressiveness of available queries and
computation time, and compare them with VerifCar when possible. The con-
tributions of the paper are the introduction and analysis of the MAPTs, as well
as the efficient exploration method dedicated to this new formalism.


2    Syntax and semantics of MAPTs

We first define G-MAPTs, composed of several agents that may interact through
shared variables. Each agent is associated with a unique local clock and performs
actions occurring in some given time intervals. There is no competition between
agents in the sense that no agent will ever have to wait for another one’s action
in order to perform its own actions. However, there may be non-determinism
when several actions are available at the same time, as well as choices between
actions and time passing.
    A G-MAPT is a tuple (V, F, A, Init) where:

 – V is a set of values;
 – F is a (finite) set of calculable functions from V to V;
                                                                             df
 – A is a set of n agents such that ∀i ∈ [1, n], agent Ai = (Li , Ci , Ti , Ei ) with:
     • Li is a set of localities denoted as a list Li = (li1 , . . . , limi ) with mi > 0,
                                                                     df


        such that ∀i 6= j, Li ∩ Lj = ∅;
     • Ci is the unique clock of agent Ai , with values in N (we assume Ci 6= Cj
        if i 6= j, but they may have the same value);
     • Ti is a finite set of transitions, forming a directed acyclic graph between
        localities, with a unique initial locality li1 and a unique final locality
        limi , each transition being of the form (l, f, I, l0 ) where l, l0 ∈ Li are the
        source and destination localities, f ∈ F is a transformation function and
           df
        I = [a, b] is an interval with a, b ∈ N and a ≤ b;
     • Ei ∈ N \ {0} is the reset period of agent Ai .
                     −−→ −−→
 – Init is a triple (initL , initC , initV ) = ((l1 , · · · , ln ), (init1 , · · · , initn ), initV ) where
   ∀i ∈ [1, n], li ∈ Li , initi ∈ N and initV ∈ V.

For each agent Ai and each locality l ∈ Li , we shall define by l • = {(l, f, I, l0 ) ∈ Ti }
the set of transitions originated from l, and by • l = {(l0 , f, I, l) ∈ Ti } the set of
                                                              •
transitions leading to l. We assume that • li1 = ∅ and (limi ) = ∅. Moreover, when
i 6= j, since Li ∩ Lj = ∅, Ti ∩ Tj = ∅ too, so that each transition belongs to a
single agent, avoiding confusions in the model. A simple example of a G-MAPT
M with two non-deterministic agents is represented in Ex. 1.
    In the semantics, for each agent Ai , we emulate a transition ri from limi to li1
that resets clock Ci every Ei time units. As such, each agent in the network cycles
over a fixed period. There can be several possible cycles though, since a given
locality may be the source of several transitions, so that there may be several
paths from li1 to limi . The behaviour of the system is defined as a transition
system where Init is the initial state. A state of a G-MAPT composed of n
agents is a tuple denoted by s = (l, c, v) where l = (l1 , · · · , ln ) with li ∈ Li the
                                              Models for dynamic exploration         33

current locality of agent Ai , c = (c1 , · · · , cn ) where ci ∈ N is the value of clock
Ci , and v ∈ V. We shall sometimes consider that v is the value of some shared
variable V used to model the evolution of the system.
    There are three possible kinds of state changes: a firing of a transition, a
clock reset and a time increase. From the state s = (l, c, v):
 – A transition (l, f, [a, b], l0 ) ∈ Ti can be fired if li = l and a ≤ ci ≤ b. Then,
   in the new state, li ← l0 and V ← f (v).
 – For any agent Ai , clock Ci can be reset if li = limi and ci = Ei . Then, Ci ← 0
   and li ← li1 .
 – Time can increase if ∀i ∈ [1, n], either there exist at least one transition
   (l, f, [a, b], l0 ) ∈ Ti with li = l and ci < b, or li = limi and ci < Ei . A time
   increase means that ∀i ∈ [1, n], Ci ← ci + 1.
    The transition system associated with a G-MAPT is driven by these state
changes, from the initial state. It may be observed that there is a single shared
element in such a system: variable V ; all the other ones are local to an agent.
It is unique but its values may have the form of a vector, and an agent may
modify several components of this vector through the functions of F used in its
transitions, thus emulating the presence of several shared variables. The values
of V are not restricted to the integer domain, but there is only a countable set
of values that may be reached: the ones that may be obtained from initV by a
recursive application of functions from F (V is not modified by the resets nor the
time increases). However this domain may be dense inside the reals, for instance.
Example 1. Let M = (V, F, A, Init) where:
 – V = R × N;
                                 f1 (x, y) → (2x, y + 1) f2 (x, y) → (x + 1.3, y + 1)
 – F = {f1 , f2 , f3 , f4 } with
                                 f3 (x, y) → ( x2 , y)          f4 (x, y) → (2x, y)
 – A = {(L1 , C1 , T1 , E1 ), (L2 , C2 , T2 , E2 )} with
    L1 = {1, 2} T1 = {(1, f1 , [1, 2], 2), (l, f2 , [3, 3], 2)} E1 = 5
    L2 = {3, 4} T2 = {(3, f3 , [1, 2], 4), (3, f4 , [3, 3], 4)} E2 = 5
 – Init = ((1, 3), (0, 0), (0.5, 0)).
   A visual representation of M and the initial fragment of its dynamics is
depicted in Fig. 1.                                                      ♦
   G-MAPT semantics when considered in a naive way is prone to a state
space explosion phenomenon. Motivated by our target applications, we restrict
G-MAPTs with Assumption 1 to produce state spaces being DAGs.
Assumption 1 (acyclicity of the state space)
1. V may be promoted into a partially ordered set (V, <) and there exist an
   agent Ai such that, in all paths from li1 to limi in the local graph of Ai (see
                                                     df
   for example Fig. 1), there exists a transition t = (l, f, [a, b], l0 ) such that for
   all v ∈ V, v < f (v); in general, V = W × X, where (X, <) is a known
   ordered set and the order is extended in a natural way to V by stating that
   (w, x) < (w0 , x0 ) iff x < x0 ;
34       Johan Arcile et al.


                  [1, 2]                     (1, 3; 0, 0)                     t2
                    f1                                     (2, 3; 1, 1)                  (2, 4; 1, 1)
     1                            2           +1
                                                     t1                            t1
                  [3, 3]                                             +1
                                             (1, 3; 1, 1)                 (1, 4; 1, 1)          +1
                    f2                                               t2            t2
                 E1 = 5
                                              +1           (2, 3; 2, 2)                  (2, 4; 2, 2)
                                                                           +1
                                                     t1                            t1
                                                                     +1
                  [1, 2]
                                             (1, 3; 2, 2)                 (1; 4; 2, 2)          +1
                    f3                                               t2            t02
     3                            4
                  [3, 3]                      +1           (2, 3; 3, 3)                  (2, 4; 3, 3)
                                                                           +1
                    f4                               t01                           t01
                                                               t02                              +1
                                             (1, 3; 3, 3)                 (1, 4; 3, 3)
                 E2 = 5

Fig. 1. Left: Visual representation of the two ’local’ graphs of G-MAPT from Ex. 1.
Right: Initial fragment of the dynamics (without shared variable values).


2. and there is no f ∈ F such that for some v ∈ V, we have f (v) < v.

    Item 1 ensures that at least one agent increments the variable V at least once
between two of its resets.
    Item 2 ensures that no function can decrease the variable V . The result is an
absence of cycles in the whole state space of the G-MAPT.
    Furthermore, in order to shorten the computation time of the queries, we also
consider an accelerated semantics that often allows to increase time by several
time units in one step. We shall not detail it here (see [3, 4]) but we shall use it
in the experiments shown later in this paper. It needs the following assumptions:

Assumption 2 (strong liveness)
1. If the initial locality of some agent Ai is the terminal one (li = limi ), then
   the initial value of clock Ci satisfies initi ≤ Ei ;
2. otherwise (when li 6= limi ), we have initi ≤ max{b | (li , f, [a, b], l0 ) ∈ li • });
3. moreover, for each agent Ai , if l ∈ Li \ {li1 , limi }, then
   max{b | (l0 , f, [a, b], l) ∈ • l} ≤ min{b0 | (l, f 0 , [a0 , b0 ], l00 ) ∈ l • };
4. and we have max{b | (l0 , f, [a, b], limi ) ∈ • limi } ≤ Ei .

   Items 1 and 2 ensure that, when the system is started, either in the terminal
locality or in a non terminal one of some agent, the time is not blocked and we
shall have the possibility to perform an action in some future.
   Item 3 ensures that whenever a non terminal locality is entered, any (and
not only some) transition originated from that locality will have the possibility
                                                Models for dynamic exploration        35

to occur in some future (the case when we enter an initial locality is irrelevant
since resets reinitialise the corresponding clock to 0).
    Item 4 captures similar features in the case when we enter a terminal locality.
In other words, each transition or reset in l • is enabled when entering l or will
be enabled in the future (after possibly some time passings in order to reach the
lower bound a).
Definition 2. A MAPT is a G-MAPT satisfying Assumptions 1 and 2.

3     Translation into high-level Petri nets
A high-level Petri net [16] can be viewed as an abbreviation of a low-level one
[24] where tokens are elements of some set of values that can be checked and
updated when transitions are fired3 . Here, we express a G-MAPT as a high-level
Petri net to be implemented with Zinc [25].
    Formally, a high-level Petri net is a tuple (S, T, λ, M0 ) where:
 – S is a finite set of places;
 – T is a finite set of transitions;
 – λ is a labelling function on places, transitions and arcs such that
     • for each place s ∈ S, λ(s) is a set of values defining the type of s,
     • for each transition t ∈ T , λ(t) is a Boolean expression with variables and
       constants defining the guard of t and
     • for each arc (x, y) ∈ (S × T ) ∪ (T × S), λ(x, y) is the annotation of the
       arc from x to y, driving the production (on y, when x ∈ T and y ∈ S) or
       consumption (from x, when x ∈ S and y ∈ T ) of tokens (see an example
       below).
 – M0 is an initial marking associating tokens to places, according to their
   types: for each s ∈ S, M0 (s) is a finite multiset of values in λ(s).
    The semantics of a high-level Petri net is captured by a transition system
containing as states all the markings which are reachable from the initial marking
M0 . A marking M 0 is directly reachable from a marking M if there is a transition
t enabled at M , whose firing leads to M 0 ; it is reachable from M if there is a
sequence of such firings leading to it. A transition t is enabled at some marking
M if the tokens in all the input places of t allow to satisfy the flow expressed
by the annotations of input arcs and the guard of t, through a valuation of the
variables involved in the latter. The firing of t consumes the concerned tokens
in input places of t and produces tokens on output places of t, according to the
annotations of the output arcs and the same valuation, as shown below for a
transition before and after the firing.

     s1 1      x x > 0 (w, z)                   s1        x x > 0 (w, z)
                   t             (0, 0)                       t             (1, 2)
     s2 2      y                                s2        y
                   (w + x, z + y) s3                          (w + x, z + y) s3
3
    We shall allow infinite value sets, but there will never be infinitely many tokens in
    the system
36      Johan Arcile et al.

Definition 3. Given a G-MAPT Q = (V, F, A, Init) with |A| = n, its trans-
lation to a high-level Petri net N = translate(Q) = (S, T, λ, M0 ) is defined as
follows:
                                       df
 – S = {sA , sC , sV } with λ(sA ) = L1 × · · · × Ln where Li is the set of localities
   of agent Ai (its jth element is denoted lij ), λ(sC ) = Nn and λ(sV ) = V. For
                                                                df                     df


   any token x of the type λ(sA ) or λ(sC ), x[i] is the ith element of the list.
      df
 – T = Ttrans ∪ Treset ∪ {ttime } where
     • Ttrans is the smallest set of transitions such that, for each agent Ai =
         (Li , Ci , Ti , Ei ) in A and for each transition (l, f, [a, b], l0 ) ∈ Ti , there is a
                                                         df               df               df
         transition t ∈ Ttrans such that λ(sA , t) = x, λ(sC , t) = y, λ(sV , t) = z,
                     df                                                                    df
         λ(t, sA ) = x0 where x0 [i] ← l0 and ∀j 6= i, x0 [j] ← x[j], λ(t, sC ) = y,
                     df                  df
         λ(t, sV ) = f (z) and λ(t) = (x[i] = l) ∧ (a ≤ y[i] ≤ b). This is equivalent
         to the set of transitions of the G-MAPT.
     • Treset is the smallest set of transitions such that, for each agent Ai =
                                                                                           df
         (Li , Ci , Ti , Ei ) in A, there is a transition t ∈ Treset such as λ(sA , t) = x,
                     df               df
         λ(sC , t) = y, λ(t, sA ) = x0 where x0 [i] ← li1 and ∀j 6= i, x0 [j] ← x[j],
                     df                                                               df
         λ(t, sC ) = y 0 where y 0 [i] ← 0 and ∀j 6= i, y 0 [j] ← y[j], and λ(t) = (x[i] =
         limi ) ∧ (y[i] = Ei ) where mi = |Li |. This is equivalent to the set of clock
                                            df


         resets of the G-MAPT.
                          df                   df                  df                     df
     • λ(sA , ttime ) = x, λ(sC , ttime ) = y, λ(ttime , sA ) = x, λ(ttime , sC ) = y 0 ,
                                                                   df
         where ∀i ∈ [1, n], y 0 [i] ← y[i] + 1, and λ(ttime ) = G1 ∧ · · · ∧ Gn where
         Gi acts as the ”upper bound guard” for all the transitions in agent Ai ,
                     df                                                                       df
         i.e., Gi = (g1 ∨ · · · ∨ gmi ) with mi = |Li | and ∀j ∈ [1, mi − 1], gj =
                                                                                       •
         (x[i] = lij ) ∧ (y[i] < B), where B = max{b|(lij , f, [a, b], l0 ) ∈ lij } is the
         highest upper bound of the intervals from all outgoing transitions of lij
         and gmi = (x[i] = limi )∧(y[i] < Ei ). This is equivalent to a time increase.
                     df


 – (M0 (sA ), M0 (sC ), M0 (sV )) = Init is the initial marking

    The translation associates singletons as arc annotations for all arcs. As a
consequence, during the execution, starting from the initial marking which as-
sociates one token to each place, there will always be exactly one token in each
of the three places. Each reachable marking, where sA contains l, sC contains c
and sV contains v, encodes a state (l, c, v) of the considered G-MAPT.
    It is easy to see [3] that the transition system of a G-MAPT Q is isomorphic
to the one of its translated Petri net translate(Q). It is easily possible to modify
this translation to correspond to the accelerated semantics (mentioned, but not
detailed, above).


4    Dynamic exploration of a MAPT
When model checking a system, one usually has the choice between a depth-first
and a breadth-first exploration of the state space. For reachability properties
(where one searches if some state satisfying a specific property may be reached),
                                              Models for dynamic exploration        37

depth-first (directed and limited by the query) is usually considered more ef-
fective. However, the majority of the non-determinism in systems featuring a
high level of concurrency (such as CAV systems) leads to diamonds. Indeed, if
transitions on different agents are available at a state then they may occur in
several possible orders, all of them converging most of the time to the same state
(if the applied functions commute). In order to avoid exploring again and again
the same states, a depth-first exploration needs to store all the states already
visited up to now, which is usually impossible to do in case of large systems. For
example, if states s1 and s2 share a common successor s3, when the algorithm
will compute the successors of s1, it will classically remove s1 from the stack and
continue with its successors, until reaching s3 and exploring all paths from s3,
forgetting each time the nodes already visited. That way, when the algorithm
has explored all paths from s1 and starts exploring from s2, there is no memory
of s3 having been explored already, and thus all paths starting from it will be
explored again.
     On the contrary, using breadth-first algorithms would partly allow to avoid
that issue, because duplicate states obtained at a given depth can be removed.
However, this would also imply exploring all reachable states at a given depth
and forbid using heuristics to guide and limit the exploration.
     An idea is then to combine both approaches, which is possible for MAPTs,
using the layered approach described next. A MAPT may be non-deterministic
but it is strongly live and has a DAG state space. For instance, the G-MAPT
M from Ex. 1 satisfies both assumptions (acyclicity is satisfied due to y being
incremented in all cycles of A1 ) and so is actually a MAPT.

4.1   Layered state space
The state space of a MAPT shows an interesting characteristics: apart from
having no cycles, its structure can often be divided in layers such that all states
on the border of a layer share the same vectors of localities and clocks (and
thus, the same set of enabled transitions) and are situated at the same time
distance from the initial state. The only difference concerns the value of the
variable, due to the non-determinism and the concurrency inherent to this kind
of models. Non-determinism means that an agent has the choice between several
transitions at some location; concurrency means that at least two agents may
perform transitions at some point. In the first case, several paths may be followed
by the agent to reach some point, leading to different values of the variable; in
the second case, transitions of the two agents may be commuted, leading again
to different values of the variable if the corresponding functions do not commute.
    For instance, if agent Ai in a MAPT starts at li1 with a null clock, after Ei
time units and before (Ei + 1) time units, it shall necessarily pass through its
reset (it is possible that it performs other transitions before and/or after this
reset without modifying its clock, but it is sure the agent will go through this
reset before performing a new time passing). Hence, if each agent starts from its
initial locality with a null clock, after lcm{E1 , . . . , En } (i.e., the least common
multiple of the various reset periods; denoted in the following by lcm(E)) time
38      Johan Arcile et al.

units, it is sure we shall revisit the initial localities with null clocks, but possibly
with other values of the variable V , yielding the border of a layer. From this
border the same sequences of transitions/resets/time-passings as initially will
occur periodically (with a period of lcm(E)), leading to new borders, with the
initial localities and the null clocks.

Exploring layered state space The function N extBorder(state) takes a state
(l, c, v) and computes, through a breadth-first exploration, the set of successors
up to the next border. To do so we introduce a function N extState(s), which
returns the set of all successors of state s. This is used iteratively in a depth-first
exploration to jump from a state to one of its successors belonging to the next
border. During this exploration, an additional function may be used to check
if a visited state satisfies some condition. Such a use of layers allows to reduce
the number of explored paths by detecting diamonds caused by the order of
transitions of concurrent agents.

Exploration using strong and weak variables The approach presented above does
not deal with diamonds spreading on a time distance longer than the one between
two adjacent borders. For example, it may still happen that two different states
s1 and s2 belonging to the same border have a common successor s in the
future. To cope with this issue, it is more interesting to perform the breadth-
first exploration that computes successors at the next border for the set {s1 , s2 }
instead than taking them separately. In general, it is not obvious to know or
guess which states should be kept together in the computation of the next border.
Indeed, one should be able to determine when sets of states should be split in
sub-sets and when they should be kept together. To perform such a clustering,
it may be interesting to exploit the properties of target applications, such as
CAVs.
                                           df
    A possible solution is to assume V = Vw × Vs , where Vw (weak) is a ”less
important” part of V and Vs (strong) a ”more important” one, such that states
differing in the valuation of Vs are unlikely to have a common successor, while
this is not the case for Vw . Symmetrically, states with the same valuation of Vs
are more likely to have a common successor. This may give us a criterion to
cluster states and jump from a set of states to the set of their successors at the
next border. The choice of Vs and Vw is of course system-dependent and should
be defined by an expert, or with the help of a simulation tool. As an example,
elements that can be assigned a new value independently of their previous one
might be considered as weak, while elements whose value changes depend on their
present value (for instance the position of a moving object) might be considered
as strong.
    The function ClusteredN extBorder(state set) is then an easy variant of
N extBorder(state), taking a set of states and producing a set of clusters, i.e.,
sets of states having identical values of variables in Vs . It is used in a similar way
as N extBorder() to explore in a depth-first manner the layered state space, the
only difference being that it jumps from a cluster belonging to some border to
a cluster belonging to the next one, based on the choice of Vs .
                                              Models for dynamic exploration        39

4.2   Exploration algorithms

This section is dedicated to exploration algorithms of finite prefixes of MAPTs:
states that do not have successors in the considered prefix will be called final.
The algorithms are denoted with the CTL temporal logic syntax. Since this
temporal logic is meant to explore infinite paths, we shall consider that each
final state has a self loop.
    Our algorithms have two main characteristics: they operate ”on-the-fly”,
which means that they do not store the entire visited state space (but only
a cut of it), and they can be tuned with heuristics defining a priority on paths
to be explored, that might significantly speed up the computation time if the
searched states exist. To do so we rely on the procedure ClusteredN extBorder()
mentioned above. Since our algorithms do not store all the states that have been
explored, we do not return traces of execution, unlike what is usually proposed
by standard temporal logic model checking tools.
    Although the general case for checking reachability CTL formulas has not
yet been implemented (left for a future work), it can be done easily thanks to the
absence of cycles in the state space. Here, we shall present the algorithms used
in our experiments. The atomic formulas p can be any Boolean formula using
variables or current localities of the state where p is checked. It is also possible to
use clock values, but only with the original semantics: the accelerated semantics
reduces the state space in a way that preserves the sequences of transitions, but
not all the clock values.
    The algorithm for EF p (meaning a reachable state satisfies p) consists, start-
ing from a stack containing the initial state, in taking the first element s of
the stack, returning it if p is true on s, and otherwise adding the result of
ClusteredN extBorder() to the stack. The algorithm continues recursively until
reaching p or there is no more states to explore in the considered finite prefix.
Additionally, we return true if p is satisfied by a state between two borders, i.e.,
during an application of ClusteredN extBorder().
    The algorithm for EGp (meaning there exists a path where p is always true)
works in a similar way, but the state s is returned if p is true on s and if s
is final, and ClusteredN extBorder(s) is added to the stack only if p is true
on s. Additionally, states where p is not true are dropped when explored in
ClusteredN extBorder(). That way, only states where p is true are explored.
    We shall also describe how to check nested CTL queries, of the kind EF (p ∧
EF q), meaning that a reachable state satisfies p and from that state a reachable
state satisfies q, and EF (p∧EGq), meaning that a reachable state satisfies p and
from that state there exist a path where q is always true. Those nested queries are
implemented using a marking function (i.e., a Boolean indicator). EF (p ∧ EF q)
is implemented as follows. Whenever p is true on a state, the state is marked.
Whenever a state is marked, all its successors are marked. Starting from a stack
containing the initial state, the first element s of the stack is returned if q is
true on s and s is marked. Otherwise, the result of ClusteredN extBorder(s)
is added to the stack. The same marking process is performed between two
borders, i.e., in ClusteredN extBorder(). We continue recursively until a state
40        Johan Arcile et al.

validates the property or there is no more state to explore. As for EF (p ∧ EGq),
states are marked whenever both p and q are true or the state is a successor
of a marked state and q is true. If a marked final state is reached, it validates
the property and is returned. Again, the same marking process is performed in
ClusteredN extBorder().
    To have a better idea of how those algorithms are implemented, we provide
Algorithm 1 computing EF (p ∧ EGq).


Algorithm 1 EF (p ∧ EGq)
    exploring.add(init) {add initial state to the stack of states to explore}
    while exploring do
      successors ← N extBorder(exploring.pop()) {removes the state at the end of the
      stack and store all its successors}
      for all s ∈ successors do
         if (check(p, s) or parent marked(s)) and check(q, s) then
            mark(s) {s is marked if either p is true on s or q is true on s and its prede-
            cessor is marked}
         end if
         if is marked(s) and check(f inal, s) then
            return true {returns true if s is a marked final state}
         else if not check(f inal, s) then
            exploring.add(s) {otherwise and if s is not a final state, add it to the stack}
         end if
      end for
    end while
    return false




5     Experiments
In this section we illustrate the performances of our exploration algorithms ap-
plied to three models used in [5], featuring various state space sizes. Those mod-
els represent systems of autonomous vehicles circulating on a portion of highway
where each vehicle communicates with the other ones to make decisions about
its behaviour.
    More specifically, we assume that the road section is composed of three uni-
directional lanes the vehicles move on, and we store at any time the coordinates
(position) of each vehicle on it. Each vehicle is represented by a rectangle with
a given length and width, centered on its position. Its state is thus described by
a record containing its position, current longitudinal and lateral speeds, longi-
tudinal acceleration (in a given range from negative to positive values), current
direction and knowledge about intentions of other vehicles (for example in the
form of timed trajectories).
    Since we are focusing on the efficiency and robustness of decision choices
and not on the control of vehicles (i.e., the module responsible for producing
                                             Models for dynamic exploration       41

a trajectory according to the decision choices), we abstract from the physical
laws involved in a maneuver such as rotation or inertia. In other words, we
assume that turning the steering wheel impacts the speed value that will make
the rectangle move on the x-axis, while its longitudinal speed still makes it move
on the y-axis.
    At a constant frequency of 10 Hz, the speed and position of each vehicle is
updated by a transition of an agent environment update.
    We assume that the vehicles (1) observe the behavior of their neighbors and
that the information obtained from the perception sensors is always accurate
and immediate and (2) communicate and receive pieces of information, which
are not directly observable, such as the planned lane change of the other vehicles;
these communications are timed in order to realistically represent the delays
between emissions and receptions of data. The behavior of each vehicle consists
in repeating endlessly, at its own constant frequency:

 – a computation allowing to make a decision on the immediate action to be
   performed (i.e., execution of the decision algorithm) in the form of an accel-
   eration (speed increase, braking, no change) and a direction (left, right, no
   change), followed by
 – the communication of its own intention (in the form of the trajectory it
   wishes to follow) to all vehicles (close enough to be) able to receive this
   information. Received information coming from the other vehicles is stored
   in the database of the vehicle and used when running the decision algorithm.

    A decision algorithm has been implemented for the experiments [3] but should
not be considered as one of our contributions: our goal here is not to promote
a clever decision making algorithm but to illustrate how MAPTs and their
exploration algorithms can be used to efficiently check temporal logic queries in
complex multi-agent systems. As such, this section focuses on the comparison
of execution times rather than on the actual results, which are detailed and
discussed in [5]. Sources for the models and algorithms used in this section are
available in [1].
    In the following, we first discuss the advantages and drawbacks of using
various types of layer-based explorations. Then, we provide some heuristics, and
experiment them in order to (hopefully) observe the gain that can be achieved
with them. Finally, we compare this method with the framework VerifCar [5],
which uses Uppaal, and we provide a verification method for the analysis of
such systems that is more efficient and less constrained than the one proposed
in [5]. Note that Uppaal uses real clocks but in our context (resets only occur at
periodic integer times and functions do not rely on clocks) there is no obstacle
in replacing them by integer ones.


5.1   Efficiency of the layer-based exploration.

Here, we compare, for several exploration algorithms, the full exploration time
and the reachability time of the first occurrence of a final state. They are explored
42       Johan Arcile et al.

in breadth-first, depth-first without layers and depth-first with layers (with and
without the use of strong/weak variables). The sizes of the state spaces for
Models 1, 2 and 3 are respectively of 7751, 52732 and 285944 states.


 Exploration algorithm              Full exploration time (s)   First occurrence of a final state (s)
                                   Model 1 Model 2 Model 3      Model 1 Model 2          Model 3
 Breadth first                      10.8      52.1        420     10.7        52          419.9
 Depth-first without layers          ∞          ∞          ∞       3.3        4.6           3.9
 Depth-first layered (Vw = ∅)        11         ∞          ∞       4.5        6.6           4.1
 Depth-first layered (small Vw )     11        250       2015      4.5        7.4            6
 Depth-first layered (large Vw )     11         71        667      4.5       14.4           7.2

Table 1. Comparison of full exploration time and time to reach the first occurrence of a
final state state for exploration algorithms in breadth-first, depth-first with and without
layers and with or without the use of weak variables. ∞ means that the exploration
was stopped after 50 hours of computation without a result.




    As shown in Table 1, one can see that the breadth-first algorithm has the
best full exploration time in any case, but the time before reaching any final
state is close to the full exploration one, which makes it the worst technique in
this case. On the other hand, the standard depth-first algorithm is the fastest
for reaching a final state, but it does not fully explore the state space even after
50 hours of computation.
    Results for Model 1 show that as long as the layer based approach is used, the
full exploration time is very close to that of the breadth-first one. This indicates
that there is almost no diamonds covering several layers, meaning that different
states belonging to the border of a layer almost never share a common successor.
For this reason the use of weak variables has no effect. Although this case is rather
simple, it clearly highlights the advantage of layer-based exploration: with almost
no increase in full exploration time, it is able to reach a final state much faster.
    Model 2 and 3 have much more complex state spaces and, in both cases, the
layer-based algorithm that does not rely on weak variables to aggregate states is
not able to explore the full state space even after 50 hours of computation. On
the contrary, using even a small number but well chosen weak variables (6 out
of 39), it is possible to fully explore the state space. In both cases, exploration
is about five times longer than the exploration time of breadth-first algorithm.
When using a large number of weak variables (30 out of 39), the exploration is
much shorter (about 1.5 times the time of breadth-first algorithm). One can note
however that the larger Vw , the longer it takes to reach a final state. Indeed, as
states are aggregated layer by layer, a too large Vw would result in an exploration
similar to a breadth-first one, where all states are kept together and final states
are only reached at the end of the computation. With the weak variables chosen,
the time to reach a final state remains however reasonable.
    In the next experiments, the depth-first algorithms always use layers and a
fixed non-empty Vw .
                                             Models for dynamic exploration        43

5.2    Heuristics

Exploration algorithms based on layers allow the use of heuristics. These heuris-
tics guide the exploration, choosing among all the unexplored states the one that
will most likely lead to a state that satisfies the checked property. The heuristics
we use associate a weight to each state, such that when a new state is discov-
ered, it is placed in a list ordered by weight of states to explore. The weight is
a prediction of the distance between the current state and a state satisfying the
property.
    Therefore, a reachability property may be associated with a heuristics that
takes a state as an input and returns a weight as an output. Below is a list of
heuristics that we used for experiment purposes together with the property they
are associated with:

 1. distance vh1 vh2 : returns the longitudinal position of vehicle vh1 minus that
    of vehicle vh2 . It is used with property EF arrival vh1 bef ore vh2 , where
    arrival vh1 before vh2 is true in a state if vehicle vh1 reaches the end of the
    road portion before vehicle vh2 does. The idea behind is to check in priority
    states where vh1 is the most ahead of vh2 .
 2. estimated travel time vh: returns the time traveled since the initial state4
    plus the estimated time to reach the end of the road portion, assuming the
    current speed is maintained. It is used with property EF travel time vh sup n,
    where travel time vh ≥ n is true in a state if vh reaches the end of the road
    portion in n time units or more. The idea is to check in priority states where
    vh is predicted to reach the end of the road with the longest time.
 3. time to overtake vh1 vh2 : is the time before both vehicles arrive at the same
    longitudinal position if they keep their current speed. It is used with property
    EF ttc vh1 vh2 ≤ n, where ttc vh1 vh2 is the value of the time to collision
    indicator between vh1 and vh2 (i.e., the delay before there is a collision
    between the two vehicles if they keep their current speed), and n is a time
    to collision value. The idea is to check in priority states where one of the
    vehicles is getting closer to the other one with the higher speed.

    These heuristics have been used on Model 3, with results given in Table 2.
The scenario in Model 3 considers three vehicles positioned as depicted in Fig. 2
on a two lane road portion that is 500 m long, with one additional junction lane.
Initially, vehicle A is on the right lane at position 0 m with a speed of 30 m/s,
vehicle B is on the left lane at position 30 m with a speed of 15 m/s and vehicle
C is on the junction lane at position 40 m with a speed of 20 m/s. They all aim
at being on the right lane at the end of the road portion.
    The first two queries (EF arrival B bef ore A and EF travel time A ≥
15.9) can only be true in a final state (the deepest layer). As such, the reachability
time with the breadth-first algorithm is close to the full exploration time with
the same algorithm. In general, the breadth-first reachablity time depends on
4
    Elapsed time since initial state can be estimated by counting how many times an
    agent has returned to its initial locality.
44         Johan Arcile et al.




Fig. 2. Initial positions and possible trajectories of autonomous vehicles for the scenario
in Model 3.
 Exploration algorithm            EF arrival B bef ore A   EF travel time A ≥ 15.9   EF ttc A C ≤ 1.14   EF ttc A B ≤ 0
 Breadth-first                             416                       427                    292                95
 Depth-first without heuristics         234—357                   167—340                247—547           277—483
 Depth-first with heuristics               131                       149                    103                13


Table 2. Comparison of reachability time for exploration algorithms in breadth-first
and depth-first with and without heuristics. As depth-first without heuristics is non
deterministic, the two values correspond to the fastest and the slowest runs obtained
for each query (five runs where performed each time).



the depth of the first state that satisfies the property. One can observe that for
the fourth query (EF ttc A B ≤ 0), the state is actually reached at a lower
depth, which is reflected by the reachability time.
    As the depth-first algorithm without heuristics randomly chooses which paths
to explore first, the reachability time varies. The number of states in the whole
state space that satisfies the property thus impacts the mean reachability time
with this algorithm, i.e., when there is more possibility to verify the property,
the average time needed is shorter. As we do not want to rely on luck, this is
not satisfying.
    On the other hand, depth-first algorithm with heuristics explores states in
a given order (depending on their weights) and therefore the reachablity time
is always the same. The heuristics we used could of course be modified and
improved, but they are enough to show a significant decrease of the reachability
time. Even for the fourth query, where the breadth-first is faster than the depth-
first algorithm, the heuristics allows to quickly identify the state that satisfies
the property.

5.3     Comparison with VerifCar
We will now compare the reachability time obtained with Uppaal with the ones
obtained with the depth-first exploration algorithms with heuristics, on Model 3.
We observed that Uppaal first constructs the state space in about 106 s, then
is able to answer almost instantly if a searched state exists. It can therefore
answer several queries after constructing the state space, unlike our heuristics-
based dynamic exploration algorithms, which have to explore the state space
                                            Models for dynamic exploration       45

from scratch for each query. Yet, most of the states we aimed for can be reached
easily, and the computation took only about 4 seconds. Queries depicted in
Table 2 are those where states were harder to reach. Compared to the ones
we obtained in [5], these results indicate that, when a reachability property is
satisfied, our algorithms have the same kind of execution time than the ones
observed with Uppaal. On the other hand, if the reachability property is not
true, they are slower than Uppaal, which depending on the kind of query takes
between 34 and 370 seconds, which equals the full exploration time with this
tool for Model 3. As mentioned previously, the full state space exploration time
with depth-first algorithms on this Model, is in our case, of 667 seconds. This is
not a surprise since Uppaal is a mature tool using many efficient optimisations.
    However, Uppaal is restricted in terms of query language, at least in two
ways interesting for us. First, it is not possible to directly check bounds of a
given numerical indicator, and such bounds should be obtained by dichotomy,
requiring several runs for each indicator, such as proposed in the methodology
of [5]. Second, it is limited to a subset of CTL (accepting mainly non nested
queries). Our algorithms do not have such restrictions.
    Indeed it is possible to do a full exploration of the state space while keeping,
for each state, the lower and higher values reached on the paths leading to
the state, for a given set of indicators. That way, all the information needed
to analyse the behaviour of the system, can be obtained after only one full
exploration. This is performed as a standard breadth-first exploration (storing
states in a file) with the difference that each state is associated with a set of
pairs (min, max), being the (temporary) bounds of the considered indicators.
Each time a state is explored, the value for each indicator is computed, and it
overwrites min if it is smaller, and max if it is greater. That way, each state
s contains, for each indicator, the smallest and highest values that exist on the
paths from the initial state to s. As several paths can lead to s, we will consider
that s reached from path P 1 and s reached from path P 2 are equivalent only if
the set of their indicators are equal. Therefore, some diamonds might be detected
(i.e., two identical states coming from different paths) but not merged together
in order to keep information about their respective paths. That way it is possible
to have several versions of the same state, but with different indicator values.
If one is interested in the reachability of states (for instance, if an indicator
is equal to some value), this can easily be done in the same way, by adding
Boolean variables to the set of indicators. At the end of the exploration, we get
this way the set of all final states, together with all the information that has
been carried on their respective paths. It therefore contains all the information
needed to analyse finely the system. For the case of Model 3, getting the arrival
order together with the bounds for travel time and worst time-to-collision takes
708 seconds. In comparison, the time needed by Uppaal to obtain the same
information with the dichotomy procedure is 3553 seconds.
    Also, the DAG shape of the state space allows us to implement nested CTL
queries. For the experiments, we used a query of the kind EF (p ∧ EGq), which
is the negation of the ”leads to” operator p −−> ¬q (the only nested operator
46      Johan Arcile et al.

available in Uppaal, in addition to deadlock tests) and two of the kind EF (p ∧
EF q), which cannot be expressed in Uppaal.
    In [5], arrival C bef ore A −−> arrival B bef ore A was used and reached
a state invaliding the property in 110 seconds. Its negation can be expressed here
as EF (arrival C bef ore A ∧ EG¬arrival B bef ore A) and our algorithm finds
the state validating the property in about 10 seconds. The properties
EF (ttc A B ≤ 1 ∧ EF arrival A bef ore C) and
EF (ttc A B ≤ 1 ∧ EF arrival A bef ore B),
that cannot be checked in Uppaal, can be expressed here. The first one expresses
the possibility for vehicle A to arrive ahead of vehicle C after a dangerous situa-
tion has occurred, involving a time to collision of less than 1 second. The second
is similar for vehicle A and B in the same conditions. The first query is false
and needs to explore the whole state space to give an answer (in 680 seconds),
while the second one is true and finds a state satisfying the property in about
10 seconds.
    Finally, it is worth mentioning that discretisation is needed for Uppaal, and
therefore approximations may be mandatory in some cases, leading to a loss in
precision and realism. In addition to a better expressiveness, the model checking
process presented in this paper also ensures that no approximation is needed,
hence a higher level of realism is achieved.


6    Conclusion

We introduced G-MAPTs, multi-agent timed models where each agent is as-
sociated with a regular timed schema upon which all possibles actions of the
agent rely. The formalism allows to easily model systems featuring a high level
of concurrency between actions, where actions are not temporally deterministic,
such as the CAVs. We have then formalised MAPTs (Multi-Agent with timed
Periodic Tasks), by soundly constraining G-MAPT ones. We also presented how
to extract a layered structure out of a MAPT, that allows to detect diamonds
while exploring the system depth-first. A translation from G-MAPT to high-
level Petri nets allowed us to implement a dedicated checking environment for
this formalism with the (free) academic tool Zinc. Algorithms implemented in
such environments explore state spaces dynamically and can be used together
with heuristics that allow to reduce the computation time needed to reach some
states in the model. Finally, experiments highlighted the efficiency of our abstrac-
tions, and a comparison of model checking CAVs systems with the framework
VerifCar has been performed. Although our checking environment does not
return traces of executions and is not better for full exploration times than the
state of the art tool Uppaal used in VerifCar, it has a better expressiveness
both on the model, since we can compute with non-integer numbers, and on the
queries since nested CTL ones can be checked. The heuristics performed well
for reachability problems and we also provided an exploration algorithm that
allows to gather all information needed to analyse the system in one run, which
greatly decreased the time needed to gather the same amount of information
                                              Models for dynamic exploration        47

when using VerifCar. Although we developed this method with the case study
of autonomous vehicles in mind, this formalism and all the abstractions and al-
gorithms presented in this paper can be easily applied to any multi-agent real
time systems where agents adopt a cyclic behaviour, such as mobile robots com-
pleting cyclically tasks according to their own objectives, flying drone squadrons,
etc.


References

 1. Sources of the MAPTs models and exploration algorithms. https://forge.ibisc.univ-
    evry.fr/jarcile/MAPTs/. Accessed: 2019-10-11.
 2. R. Alur and D. Dill. Automata for modelling real-time systems. In Proceedings
    ICALP’90, volume 443 of LNCS, pages 322–335. Springer-Verlag, 1990.
 3. J. Arcile.     Conception, modélisation et vérification formelle d’un système
    temps-réel d’agents coopératifs : application aux véhicules autonomes commu-
    nicants. PhD thesis, Université Paris-Saclay, France; https://www.biblio.univ-
    evry.fr/theses/2019/2019SACLE029.pdf, 2019.
 4. J. Arcile, R. Devillers, and H. Klaudel. Dynamic exploration of multi-agent systems
    with timed periodic tasks. Technical Report: https://hal.archives-ouvertes.fr/hal-
    02365415, 2019.
 5. J. Arcile, R. Devillers, and H. Klaudel. Verifcar: a framework for modeling and
    model checking communicating autonomous vehicles. Autonomous Agents and
    Multi-Agent Systems, 33(3):353–381, 2019.
 6. S. Ben-David, T. Heyman, O. Grumberg, and A. Schuster. Scalable distributed on-
    the-fly symbolic model checking. In Formal Methods in Computer-Aided Design,
    pages 427–441. Springer Berlin Heidelberg, 2000.
 7. B. Berthomieu, P.-O. Ribet, and F. Vernadat. The tool TINA – construction of
    abstract state spaces for Petri nets and time Petri nets. International Journal of
    Production Research, 42(14):2741–2756, 2004.
 8. B. Berthomieu and F. Vernadat. State class constructions for branching analysis
    of time Petri nets. In In TACAS’03, page 442–457. Springer-Verlag, 2003.
 9. G. Bhat, R. Cleaveland, and O. Grumberg. Efficient on-the-fly model checking for
    CTL. In Proceedings of LiCS, pages 388–397, June 1995.
10. A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu, et al. Bounded model
    checking. Advances in computers, 58(11):117–148, 2003.
11. E. Clarke, A. Biere, R. Raimi, and Y. Zhu. Bounded model checking using satisfi-
    ability solving. Formal Methods in System Design, 19(1):7–34, Jul 2001.
12. E.M. Clarke, O. Grumberg, M. Minea, and D. Peled. State space reduction using
    partial order reduction. International Journal on Software Tools for Technology
    Transfer, 2:279–287, 1999.
13. M. Foughali, B. Berthomieu, S. Dal Zilio, F. Ingrand, and A. Mallet. Model Check-
    ing Real-Time Properties on the Functional Layer of Autonomous Robots. In
    International Conference on Formal Engineering Methods (ICFEM 2016), Tokyo,
    Japan, November 2016.
14. A. Furda and L. Vlacic. Enabling safe autonomous driving in real-world city traffic
    using multiple criteria decision making. IEEE Intelligent Transportation Systems
    Magazine, 3(1):4–17, Spring 2011.
48      Johan Arcile et al.

15. S. Glaser, B. Vanholme, S. Mammar, D. Gruyer, and L. Nouveliere. Maneuver-
    based trajectory planning for highly autonomous vehicles on real road with traffic
    and driver interaction. IEEE Transactions on Intelligent Transportation Systems,
    11(3):589–606, 2010.
16. K. Jensen. Coloured Petri Nets - Basic Concepts, Analysis Methods and Practical
    Use - Volume 1. EATCS Monographs on TCS. Springer, 1992.
17. M. Kamali, L. A. Dennis, O. McAree, M. Fisher, and S. M. Veres. Formal verifica-
    tion of autonomous vehicle platooning. Science of Computer Programming, 148:88
    – 106, 2017. Special issue on Automated Verification of Critical Systems (AVoCS
    2015).
18. C. Katrakazas, M. Quddus, W.-H. Chen, and L. Deka. Real-time motion planning
    methods for autonomous on-road driving: State-of-the-art and future research di-
    rections. Transportation Research Part C: Emerging Technologies, 60:416 – 442,
    2015.
19. Y. Kuwata, J. Teo, G. Fiore, S. Karaman, E. Frazzoli, and J. P. How. Real-time
    motion planning with applications to autonomous urban driving. IEEE Transac-
    tions on Control Systems Technology, 17(5):1105–1118, 2009.
20. M. Likhachev and D. Ferguson. Planning long dynamically feasible maneuvers for
    autonomous vehicles. The International Journal of Robotics Research, 28(8):933–
    945, 2009.
21. D. Lime, O. H. Roux, C. Seidner, and L.-M. Traonouez. Romeo: A parametric
    model-checker for Petri nets with stopwatches. In TACAS, pages 54–57. Springer
    Berlin Heidelberg, 2009.
22. M. O’Kelly, H. Abbas, and R. Mangharam. APEX : Autonomous vehicle plan
    verification and execution. In SAE World Congress, 2016.
23. W. Penczek and A. Lomuscio. Verifying epistemic properties of multi-agent systems
    via model checking. Fundamenta Informaticae, 2(52):167–185, 2003.
24. J. L. Peterson. Petri Net Theory and the Modelling of Systems. Prentice Hall,
    1981.
25. F. Pommereau. ZINC: a compiler for “any language”-coloured Petri nets. Technical
    report, IBISC, university of Evry / Paris-Saclay, 2018.
26. M. M. Quottrup, T. Bak, and R. I. Zamanabadi. Multi-robot planning : a timed
    automata approach. In IEEE International Conference on Robotics and Automa-
    tion, 2004. Proceedings. ICRA ’04. 2004, volume 5, pages 4417–4422 Vol.5, April
    2004.
27. Uppaal. http://www.uppaal.org/.
28. F. Vernadat, P. Azéma, and F. Michel. Covering step graph. In Application and
    Theory of Petri Nets 1996, pages 516–535. Springer Berlin Heidelberg, 1996.