Implementable strategies for a two-player asynchronous game on Petri nets Federica Adobbati1 , Luca Bernardinello1 , Lucia Pomello1 and Riccardo Stramare1 1 Università degli studi di Milano–Bicocca, DISCo Abstract We consider a two-player game on Petri nets, in which each player controls a subset of transitions. The players are called ‘user’ and ‘environment’; we assume that the environment must guarantee progress on its transitions. A play of this game is a run in the unfolding of the net, satisfying the progress assumption. In general, we define a strategy for the user as a map from the set of ‘observations’ to subsets of transitions owned by the user. Different restrictions on strategies can be used to encode observability assumptions. We say that a given strategy is implementable if the net can be endowed with new places so that the runs of the new net coincide with the plays of the original net, complying with the strategy. We propose an algorithm based on the search of regions to decide whether a strategy is implementable. 1. The game We recall the definition of a two-player, asynchronous game on 1-safe Petri nets (see [1]), define a notion of implementable strategy for one of the players, and give a simple algorithm to decide whether a given, general, strategy is implementable. Let us introduce the main notions related to the game through an example. Consider the net in Fig. 1. Two players interact on it, the user, by controlling the light grey transitions, and the environment, by controlling the white ones. The game is asynchronous: the players can concurrently fire their transitions. The game is asymmetric: the user has the right to keep his transitions blocked when they are enabled, whereas the environment must guarantee progress of its transitions. In this example, we suppose that the user has full knowledge of the current marking, and that he has the goal of marking place 𝑞 infinitely often. In order to win, the user must wait for the environment to choose between firing 𝑡1 or 𝑡2 . He can do it, since the environment cannot delay this choice forever. In the former case, the user chooses 𝑢1 , otherwise 𝑢2 . The environment is then forced to fire either 𝑣1 or 𝑣2 , with the effect of marking 𝑞, and then to fire 𝑧, reproducing the initial marking. Formally, a Petri net is a tuple Σ = (𝑃, 𝑇, 𝐹, 𝑚0 ), where 𝑃 is the set of places, 𝑇 the set of transitions, 𝐹 ⊆ (𝑃 × 𝑇 ) ∪ (𝑇 × 𝑃 ) the flow relation, 𝑚0 : 𝑃 → N the initial marking. A marking is a map 𝑚 : 𝑃 → N. We suppose the reader knows the definition of the firing rule, denoted 𝑚[𝑡⟩𝑚′ , and how to compute the set of reachable markings. Let 𝑀 be the set of reachable markings in Σ; then Σ is 1-safe iff ∀𝑚 ∈ 𝑀 , ∀𝑝 ∈ 𝑃 , 𝑚(𝑝) ≤ 1. We assume that the game is played on a 1-safe Petri net. In a 1-safe Petri net a marking can be interpreted as the characteristic function of a subset of places. Hence, in the rest of the paper we will denote ATAED’22, June 20-21, 2022, Bergen, Norway © 2022 CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) Figure 1: A game net Figure 2: A full play (left) and a partial play (right) markings as sets of places. We denote with MG(Σ) = (𝑀, 𝑇, 𝐴, 𝑚0 ) the sequential marking graph of Σ, where 𝐴 = {(𝑚, 𝑡, 𝑚′ ) : 𝑚, 𝑚′ ∈ 𝑀, 𝑡 ∈ 𝑇 𝑎𝑛𝑑 𝑚[𝑡⟩𝑚′ } is the set of labelled arcs. We denote with 𝑇𝑢 the set of transitions controlled by the user, and with 𝑇𝑒 the set of transitions uncontrollable for him. We assume that 𝑇𝑒 ∪ 𝑇𝑢 = 𝑇 , 𝑇𝑒 ∩ 𝑇𝑢 = ∅. We will also refer to the unfolding of the net (see [2] for the formal definitions). The unfolding is an acyclic, possibly infinite, net representing all the possible histories of the executions of a net system. In the unfolding of a net system, only forward conflicts are allowed: two events can share a precondition, but no postcondition. Since the unfolding is acyclic, the reflexive and transitive closure of the flow relation is a partial order. If two events 𝑒1 and 𝑒2 are in conflict, then we say that every descendant of 𝑒1 is in conflict with every descendant of 𝑒2 . Two elements are concurrent if they are neither ordered nor in conflict. The events of the unfolding are partitioned into controllable and uncontrollable events depending on their correspondence to occurrences of controllable or uncontrollable transitions, respectively. A run is a subnet of the unfolding representing an execution, i.e. it is a net without conflicts and close with respect to the past of its elements. A run is maximal if no event can be added without creating a conflict. A play in the game is formally defined as a run in the unfolding of the net, maximal with respect to uncontrollable transitions. The winning condition for the user is a set of plays. The user wins a play if the play belongs to the winning condition. In the example discussed above, the winning condition is formed by all the maximal runs with an infinite number of occurrences of the place 𝑞. Figure 2 shows, on the left, a play ending in a deadlock (the environment wins) and, on the right, the initial segment of an infinite play (the user wins). 2. Strategies In order to reach his goal, the user can apply a strategy. We assume that the user has no memory and only a partial knowledge of the current state of the net. This is formalized by assuming an equivalence relation, denoted by ≡, on the set of reachable markings. The equivalence classes of this relation will be called observations, and a strategy is defined as a map from observations to sets of controllable transitions. For example, if we assume the user may observe only some places, two markings are considered equivalent if they share the same observable places. An example of equivalence relation between partially observable markings has been considered in the game presented in [3]. Let Obs be the set of observations. Then a strategy is a map 𝛼 : Obs → 2𝑇𝑢 . The notion of observation can be extended to unfoldings: a maximal set of pairwise concurrent places in the unfolding is called a B-cut. Every B-cut corresponds to a reachable marking. We say that two B-cuts are equivalent if their corresponding markings are equivalent. Two B-cuts of a run 𝜋, 𝛾 and 𝛾 ′ , are ordered, denoted 𝛾 < 𝛾 ′ , iff 𝛾 ̸= 𝛾 ′ and ∀𝑏 ∈ 𝛾, ∃𝑏′ ∈ 𝛾 ′ such that: 𝑏 ≤ 𝑏′ . A sequence 𝛾1 𝛾2 ...𝛾𝑛 ... of B-cuts is increasing if 𝛾𝑖 < 𝛾𝑗 for each 𝑖, 𝑗 ∈ N with 𝑖 < 𝑗. A play 𝜋 complies with a strategy 𝛼 if (1) there is a sequence 𝜇 of observations which is the projection of an increasing sequence of B-cuts in 𝜋, and (2) every controllable event in 𝜋 is chosen by the strategy in one of the observations in 𝜇. Given a strategy 𝛼, we can construct the reduced version of MG(Σ), by removing arcs corresponding to controllable transitions not chosen by the strategy, and then removing all states and transitions unreachable from the initial marking. The reduced version will be denoted by MG𝛼 (Σ), and the operation will be referred to as the 𝛼-reduction of MG(Σ). Fig. 3 shows the reduced version of the marking graph of the net in Fig. 1, when 𝛼 is the strategy described above. 2.1. Implementable strategies As defined above, the notion of strategy is not encoded in the net system itself. We will now investigate the possibility of defining a strategy within the net. The idea we pursue is this: a strategy is encodable within a net if its decisions can be represented as extra places, corresponding to causal relations from uncontrollable transitions to controllable transitions. We will then say that a strategy is implementable if we can add new places to the given net, so that the runs of the augmented nets are exactly the plays complying with the strategy in the original net. Formally, we need a notion of “augmented” net system, in which we add places that restrict the possible behaviours. For Σ𝑖 = (𝑃𝑖 , 𝑇𝑖 , 𝐹𝑖 , 𝑚0𝑖 ), with 𝑖 = 1, 2, we write Σ1 ⊑ Σ2 if Σ2 is obtained by adding new places to Σ1 , hence 𝑃1 ⊆ 𝑃2 , 𝑇1 = 𝑇2 , 𝐹1 ⊆ 𝐹2 and new arcs in 𝐹2 have one end in 𝑃2 ∖ 𝑃1 , 𝑚01 ⊆ 𝑚02 and 𝑚02 ∖ 𝑚01 ⊆ 𝑃2 ∖ 𝑃1 . A given strategy 𝛼 might prohibit any occurrence of a given controllable transition, and might make some uncontrollable transition unreachable in complying plays. Let 𝑇𝛼 be the set of reachable transitions in complying plays, and let Σ𝛼 be the net system whose underlying net is the subnet generated by 𝑇𝛼 , i.e.: Σ𝛼 = (𝑃𝛼 , 𝑇𝛼 , 𝐹𝛼 , 𝑚0𝛼 ), where 𝑃𝛼 = ∙ 𝑇𝛼 ∪ 𝑇𝛼∙ , 𝐹𝛼 is 𝐹 Figure 3: The marking graph of the net in Fig. 1 and its reduced version MG𝛼 (Σ) restricted to (𝑃𝛼 × 𝑇𝛼 ) ∪ (𝑇𝛼 × 𝑃𝛼 ) and 𝑚0𝛼 = 𝑚0 ∩ 𝑃𝛼 . Definition 1. A strategy 𝛼 for a game on Σ is implementable if there exists a 1-safe net system Σ′ = (𝑃 ′ , 𝑇𝛼 , 𝐹 ′ , 𝑚′0 ) such that (1) Σ𝛼 ⊑ Σ′ ; (2) the marking graph of Σ′ is isomorphic to MG𝛼 (Σ). Consider again the net system in Fig. 1. Suppose that the user can observe all the places in the net, so that it has full knowledge of the current marking. If his goal is to reach infinitely often the place 𝑞, then a winning strategy is defined by the following clauses: (1) 𝛼({𝑝0 , 𝑠0 }) = ∅; (2) 𝛼({𝑝0 , 𝑠1 }) = {𝑢1 }; (3) 𝛼({𝑝0 , 𝑠2 }) = {𝑢2 }. In this case, the strategy 𝛼 does not prevent the occurrence of any controllable transition, it only selects which one between 𝑢1 and 𝑢2 to choose, depending on the observation of {𝑝0 , 𝑠1 } or {𝑝0 , 𝑠2 }; therefore the set of reachable transitions in plays complying with 𝛼, 𝑇𝛼 , is 𝑇 itself, and then Σ𝛼 = Σ. The marking graph of Σ, MG(Σ), is shown on the left side of Fig. 3, whereas the reduced version MG𝛼 (Σ) is on the right side of the same figure. The information needed to take the right decision in accordance with 𝛼, in this simple case, can be encoded by adding to Σ a couple of places, one going from 𝑡1 to 𝑢1 , the other from 𝑡2 to 𝑢2 , as shown in Fig. 4, on the right, where the new places are drawn in dark grey. It is easy to see that the obtained system has a marking graph isomorphic to MG𝛼 (Σ). 3. An algorithm to decide if a strategy is implementable In this section, we propose a simple algorithm to decide whether a given strategy is imple- mentable. The algorithm checks whether the reduced marking graph is isomorphic to the marking graph of a 1-safe Petri net, and relies on the theory of regions [4]. We first recall some basic notions related to regions of transition systems. Region theory and separation problems As remarked above, each node of the marking graph of a 1-safe Petri net is a set of places. Hence, we can associate to each place its extension, namely the set of reachable markings to which it belongs: ∀𝑝 ∈ 𝑃 𝑟(𝑝) = {𝑚 ∈ 𝑀 | 𝑝 ∈ 𝑚}. The extension of a place satisfies the uniform crossing property: for any given transition 𝑡, if Figure 4: A game net with an implemented strategy one occurrence of 𝑡 in the marking graph enters 𝑟(𝑝), then all occurrences of 𝑡 do the same, and analogously when an occurrence leaves 𝑟(𝑝). This property can be defined more abstractly for general labelled transition systems, giving the notion of a region of a transition system. The synthesis problem consists in deciding if a given labelled transition system is isomorphic to the marking graph of a net system, and, if so, in constructing such a net. The problem admits a solution if the set of regions satisfies the so-called state separation problems and the event-state separation problems. In this case, we say that the transition system is separated. Consider now a marking graph MG(Σ), a strategy 𝛼, and the reduced transition system MG𝛼 (Σ) = (𝑄, 𝑇𝛼 , 𝐴, 𝑞0 ). Then, it is easy to prove that, for each region 𝑟 of MG(Σ), the set 𝑟 ∩ 𝑄 is a region of MG𝛼 (Σ). This implies that all state separation problems of MG𝛼 (Σ) are solved by those regions, since MG(Σ) is separated by definition. This is not the case for the event-state separation problems: for each edge labelled with 𝑡 removed from MG(Σ) in the state 𝑠, we need to check if there is a region without state 𝑠 from which 𝑡 leaves. The algorithm Given a strategy 𝛼 for a game on Σ, we can now describe an algorithm to decide whether 𝛼 is implementable. The algorithm first computes MG𝛼 (Σ), and then checks if it is separated by trying to solve each new event-state separation problem. The new regions solving those problems, if any, encode the flow of information from observations to the controllable part of the system. 4. Conclusion In this work we presented a notion of implementable strategy on 1-safe nets, and we provided an algorithm to decide whether the strategy is implementable. In future works, we plan to broad the definition of implementable strategy by allowing for strategies modelled by adding general bounded places. As an example, consider the net on the left of Fig. 5 where the user can observe the places {3, 4, 5, 6, 10}, and his goal is to reach place 12. A winning strategy is: 𝛼({4, 6, 10}) =H, 𝛼({3, 5, 10}) = 𝛼({3, 6, 10}) = 𝛼({4, 5, 10}) = I. The choice of H can be represented by adding a place from B to H, and a place from D to H. Instead, the choice of I Figure 5: An example of non-implementable strategy depends on the occurrence of either A or C, hence it may be represented with a place allowing for two tokens, making the net not 1-safe. Although the strategy is not implementable according to the given definition, the net on the right of Fig. 5 realizes the strategy. Furthermore, we plan to relax the definition of implementable strategy, by allowing for implementations in which only parts of the behaviours in the reduced marking graph are reproducible on the net. Specifically, a strategy is implementable if we can add places to the initial net so that there is at least a maximal run in the unfolding of the augmented net, and all its maximal runs are associated to maximal paths of the reduced marking graph. A similar use of regions can be found in applications to problems of control: in [5] and [6] regions are used to synthesize maximally permissive controllers avoiding unwanted states; [7] proposes an implementation based on region theory of a controller, applied to flexible manufacturing systems. A general overview of works and approaches to automatic control using Petri nets can be found in [8]. Acknowledgments This work is supported by the Italian MUR. References [1] F. Adobbati, L. Bernardinello, L. Pomello, A two-player asynchronous game on fully observable Petri nets., Trans. Petri Nets Other Model. Concurr. 15 (2021) 126–149. [2] J. Engelfriet, Branching processes of Petri nets, Acta Inf. 28 (1991) 575–591. URL: https: //doi.org/10.1007/BF01463946. doi:10.1007/BF01463946. [3] F. Adobbati, L. Bernardinello, L. Pomello, Looking for winning strategies in two-player games on Petri nets with partial observability, 2022. URL: https://arxiv.org/abs/2204.01603. doi:10.48550/ARXIV.2204.01603. [4] E. Badouel, L. Bernardinello, P. Darondeau, Petri Net Synthesis, Texts in Theoretical Computer Science. An EATCS Series, Springer, 2015. URL: http://dx.doi.org/10.1007/ 978-3-662-47967-4. doi:10.1007/978-3-662-47967-4. [5] A. Ghaffari, N. Rezg, X. Xie, Design of a live and maximally permissive Petri net controller using the theory of regions, IEEE Transactions on Robotics and Automation 19 (2003) 137–141. doi:10.1109/TRA.2002.807555. [6] Z. Li, M. Zhou, M. Jeng, A maximally permissive deadlock prevention policy for fms based on petri net siphon control and the theory of regions, IEEE Transactions on Automation Science and Engineering 5 (2008) 182–188. [7] S. Rezig, C. Ghorbel, Z. Achour, N. Rezg, PLC-based implementation of supervisory con- trol for flexible manufacturing systems using theory of regions, International Journal of Automation and Control 13 (2019) 619–640. [8] A. Giua, M. Silva, Petri nets and automatic control: A historical perspective, Annual Reviews in Control 45 (2018) 223–239.