12 Fault diagnosis of discrete event systems using Petri nets Carla Seatzu Abstract— This talk focuses on on-line fault diagnosis of states modeling different degrees of alarm and it applies to discrete event systems based on Petri nets. In particular a all systems whose unobservable subnet is acyclic. diagnosis approach based on the notion of basis markings and Note that the approach here presented, as most of the justifications is discussed. This concept allows us to represent the reachability space in a compact manner, i.e., it requires to approaches dealing with diagnosis of discrete event systems, enumerate only a subset of the reachability space. Arbitrary assumes that the faulty behavior is completely known, thus labeled Petri nets are considered as the reference model. Some a fault model is available. results on diagnosability analysis are finally recalled. The last part of the talk is devoted to briefly recall some results on diagnosability analysis that may also be applied I. I NTRODUCTION to unbounded nets. Failure detection and isolation in industrial systems is a subject that has received a lot of attention in the past II. BACKGROUND ON LABELED P ETRI NETS few decades. A failure is defined to be any deviation of a A Place/Transition net (P/T net) is a structure N = system from its normal or intended behavior. Diagnosis is the (P, T, P re, P ost), where P is a set of m places; T is a set process of detecting an abnormality in the system behavior of n transitions; P re : P × T → N and P ost : P × T → N and isolating the cause or the source of this abnormality. are the pre– and post– incidence functions that specify the The first contributions in the discrete event systems frame- arcs; C = P ost − P re is the incidence matrix. work have been proposed in the context of automata by A marking is a vector M : P → N that assigns to each Sampath et al. [1], [2] who proposed an approach to failure place of a P/T net a nonnegative integer number of tokens, diagnosis where the system is modeled as a nondeterministic represented by black dots. The marking of place p is denoted automaton in which failures are treated as unobservable as M (p). A P/T system or net system ⟨N, M0 ⟩ is a net events. N with an initial marking M0 . A transition t is enabled at More recently, Petri net (PN) models have been used in M iff M ≥ P re(· , t) and may fire yielding the marking the context of diagnosis [3], [4], [5], [6]. Indeed, the use of M ′ = M + C(· , t). One writes M [σ⟩ to denote that the PNs offers significant advantages because of their twofold sequence of transitions σ = tj1 · · · tjk is enabled at M , and representation: graphical and mathematical. Moreover, the M [σ⟩ M ′ to denote that the firing of σ yields M ′ . One intrinsically distributed nature of PNs where the notion of writes t ∈ σ to denote that a transition t is contained in σ. state (i.e., marking) and action (i.e., transition) is local The set of all sequences that are enabled at the initial reduces the computational complexity involved in solving marking M0 is denoted L(N, M0 ), i.e., L(N, M0 ) = {σ ∈ a diagnosis problem. T ∗ | M0 [σ⟩}. In this talk we recall an approach we proposed in [7], that Given a sequence σ ∈ T ∗ , let π : T ∗ → Nn be the function is a generalization of [8]. The main difference between the that associates to σ a vector y ∈ Nn , called the firing vector diagnosis approach presented here and the approaches cited of σ. In particular, y = π(σ) is such that y(t) = k if the above is the concept of basis marking. This concept allows transition t is contained k times in σ. us to represent the reachability space in a compact manner, A marking M is reachable in ⟨N, M0 ⟩ iff there exists i.e., it requires to enumerate only a subset of the reachability a firing sequence σ such that M0 [σ⟩ M . The set of all space. In particular, we deal with arbitrary labeled PNs markings reachable from M0 defines the reachability set of where there is an association between sensors and observable ⟨N, M0 ⟩ and is denoted R(N, M0 ). events, while no sensor is available for certain activities — A PN having no directed circuits is called acyclic. A net such as faults or other unobservable but regular transitions system ⟨N, M0 ⟩ is bounded if there exists a positive constant — due to budget constraints or technology limitations. It k such that, for M ∈ R(N, M0 ), M (p) ≤ k. is assumed that several different transitions might share the The association between sensors and transitions can be same sensor in order to reduce cost or power consumption. If captured by a labeling function L : T → L ∪ {ε} assigns to two transitions are simultaneously enabled and one of them each transition t ∈ T either a symbol from a given alphabet fires, it is impossible to distinguish which one has fired, thus L or the empty string ε. they are called undistinguishable. The diagnosis approach The set of transitions whose label is ε is denoted as here presented is based on the definition of four diagnosis Tu , i.e., Tu = {t ∈ T | L(t) = ε}. Transitions in Tu are called unobservable or silent. To denotes the set of C. Seatzu is with the Department of Electrical and Electronic En- gineering, University of Cagliari, Piazza D’Armi, 09123 Cagliari, Italy. transitions labeled with a symbol in L. Transitions in To are E-mail:seatzu@diee.unica.it. called observable because when they fire their label can be 13 observed. Note that we assume that the same label l ∈ L can be associated to more than one transition. In particular, b ε10 p5 two transitions t1 , t2 ∈ To are called undistinguishable if ε9 b c a ε8 p4 they share the same label, i.e., L(t1 ) = L(t2 ). The set of p2 t2 p3 transitions sharing the same label l are denoted as Tl . c d ε11 p6 t3 p7 t4 p1 t1 In the following let Cu (Co ) be the restriction of the p8 incidence matrix to Tu (To ) and nu and no , respectively, be t5 p9 ε12 t6 ε13 the cardinality of the above sets. Moreover, given a sequence σ ∈ T ∗ , Pu (σ), resp., Po (σ), denotes the projection of σ over d p10 p11 Tu , resp., To . t7 The word w of events associated to sequence σ is w = Po (σ). Fig. 1. The PN system considered in Sections II to IV. Definition 2.1: [7] Let ⟨N, M0 ⟩ be a labeled net system with labeling function L : T → L ∪ {ε}, where N = (P, T, P re, P ost) and T = To ∪ Tu . Let w ∈ L∗ be an Y (M, t) = π(Σ(M, t)) be the e-vectors (or explanation observed word. Let S(w) = {σ ∈ L(N, M0 ) | Po (σ) = w} vectors), i.e., firing vectors associated to the explanations. be the set of firing sequences consistent with w ∈ L∗ , and  C(w) = {M ∈ Nm | ∃σ ∈ T ∗ : Po (σ) = w ∧ M0 [σ⟩M } Thus Σ(M, t) is the set of unobservable sequences whose be the set of reachable markings consistent with w ∈ L∗ .  firing at M enables t. Among the above sequences select those whose firing vector is minimal. The firing vector of In plain words, given an observation w, S(w) is the set these sequences are called minimal e-vectors. of sequences that may have fired, while C(w) is the set of Definition 3.2: [8] Given a marking M and a transition markings in which the system may actually be. t ∈ To , let us define Example 2.2: Consider the PN in Fig. 1. Assume To = {t1 , t2 , t3 , t4 , t5 , t6 , t7 } and Tu = {ε8 , ε9 , ε10 , ε11 , ε12 , ε13 }, Σmin (M, t) = {σ ∈ Σ(M, t) | @ σ ′ ∈ Σ(M, t) : where for a better understanding unobservable transitions π(σ ′ )  π(σ)} have been denoted εi rather than ti . The labeling function the set of minimal explanations of t at M , and let us is defined as follows: L(t1 ) = a, L(t2 ) = L(t3 ) = b, define Ymin (M, t) = π(Σmin (M, t)) the corresponding set L(t4 ) = L(t5 ) = c, L(t6 ) = L(t7 ) = d. of minimal e-vectors.  First consider w = ab. The set of firing sequences Example 3.3: Consider the PN in Fig. 1 previously that is consistent with w is S(w) = {t1 t2 , introduced in Example 2.2. It holds that Σ(M0 , t1 ) = t1 t2 ε8 , t1 t2 ε8 ε9 , t1 t2 ε8 ε9 ε10 , t1 t2 ε8 ε11 }, and the {ε}. Then Σ(M0 , t2 ) = ∅. Finally, let M = set of markings consistent with w is C(w) = [ 0 0 1 0 0 0 0 1 0 0 0 ]T , it holds that Σ(M, t5 ) = {[0 0 1 0 0 0 0 1 0 0 0]T , [0 0 0 1 0 0 0 1 0 0 0]T , {ε, ε8 , ε8 ε9 , ε8 ε11 , ε8 ε9 ε10 }, while Σmin (M, t5 ) = {ε}. It [0 0 0 0 1 0 0 1 0 0 0]T , [0 1 0 0 0 0 0 1 0 0 0]T , follows that Y (M, t5 ) = {[0 0 0 0 0 0]T , [1 0 0 0 0 0]T , [0 0 0 0 0 1 0 1 0 0 0]T }. [1 1 0 0 0 0]T , [1 0 0 1 0 0]T , [1 1 1 0 0 0]T }, and If w = acd is considered the set of firing sequences that Ymin (M, t5 ) = {[0 0 0 0 0 0]T }.  are consistent with w is S(w) = {t1 t5 t6 , t1 t5 ε12 ε13 t7 }, and the set of markings consistent with w is C(w) = B. Basis markings and j-vectors {[0 1 0 0 0 0 0 1 0 0 0]T }. Thus two different firing Definition 3.4: [7] Let ⟨N, M0 ⟩ be a net system with sequences may have fired (the second one also involving labeling function L : T → L ∪ {ε}, where N = silent transitions), but they both lead to the same marking. (P, T, P re, P ost) and T = To ∪ Tu . Let w ∈ L∗ be a given  observation. Let Definition 2.3: Given a net N = (P, T, P re, P ost), and a subset T ′ ⊆ T of its transitions, let us define the T ′ −induced Jˆ(w) = { (σo , σu ), σo ∈ To∗ , L(σo ) = w, σu ∈ Tu∗ | subnet of N as the new net N ′ = (P, T ′ , P re′ , P ost′ ) where [∃σ ∈ S(w) : σo = Po (σ), σu = Pu (σ)]∧ P re′ , P ost′ are the restrictions of P re, P ost to T ′ . The [̸ ∃σ ′ ∈ S(w) : σo = Po (σ ′ ), σu′ = Pu (σ ′ )∧ net N ′ can be thought as obtained from N removing all π(σu′ ) π(σu )]} transitions in T \ T ′ . Let us also write N ′ ≺T ′ N .  be the set of pairs (sequence σo ∈ To∗ with L(σo ) = w, III. C HARACTERIZATION OF THE SET OF CONSISTENT corresponding justification of w).  MARKINGS In simple words, Jˆ(w) is the set of pairs whose first element is the sequence σo ∈ To∗ labeled w and whose sec- A. Minimal explanations and minimal e-vectors ond element is the corresponding sequence of unobservable Definition 3.1: [8] Given a marking M and an observable transitions interleaved with σo whose firing enables σo and transition t ∈ To , let Σ(M, t) = {σ ∈ Tu∗ | M [σ⟩M ′ , M ′ ≥ whose firing vector is minimal. The firing vectors of these P re(·, t)} be the set of explanations of t at M , and let sequences are called j-vectors. 2 14 Example 3.5: Consider the PN in Fig. 1 previously in- unobservable subnet is acyclic. For any w ∈ L∗ it holds that troduced in Example 2.2. It is Jˆ(ab) = {(t1 t2 , ε)} and C(w) = {M ∈ Nm | M = Mb + Cu · y : y ≥ ⃗0 and Jˆ(acd) = {(t1 t5 t6 , ε), (t1 t5 t7 , ε12 ε13 )}.  Mb ∈ Mbasis (w)}. The main difference among minimal explanations and The above result shows that the set C(w) can be charac- justifications is that the first ones are functions of a generic terized in linear algebraic terms given the set Mbasis (w), marking M and transition t, while justifications are functions thus not requiring exhaustive enumeration. This is the main of the initial marking M0 and w. Moreover, as shown in [7], advantage of the approach here presented. in the case of acyclic unobservable subnets, justifications can be computed recursively summing up minimal explanations. IV. O N - LINE DIAGNOSIS USING P ETRI NETS Definition 3.6: [7] Let ⟨N, M0 ⟩ be a net system with Assume that the set of unobservable transitions is parti- labeling function L : T → L ∪ {ε}, where N = tioned into two subsets, namely Tu = Tf ∪ Treg where Tf (P, T, P re, P ost) and T = To ∪ Tu . Let w be a given obser- includes all fault transitions (modeling anomalous or fault vation and (σo , σu ) ∈ Jˆ(w) be a generic pair (sequence of behavior), while Treg includes all transitions relative to unob- observable transitions labeled w; corresponding justification). servable but regular events. The set Tf is further partitioned The marking Mb = M0 +Cu ·y+Co ·y ′ , where y = π(σu ) and into r different subsets Tfi , where i = 1, . . . , r, that model the y ′ = π(σo ), i.e., the marking reached firing σo interleaved different fault classes. Usually, fault transitions that belong with the justification σu , is called basis marking and y is to the same fault class are transitions that represent similar called its j-vector (or justification-vector).  physical faulty behavior. Obviously, because in general more than one justification Definition 4.1: [7] A diagnoser is a function ∆ : L∗ × exists for a word w (the set Jˆ(w) is generally not a {Tf1 , Tf2 , . . . , Tfr } → {0, 1, 2, 3} that associates to each singleton), the basis marking may be not unique as well. observation w ∈ L∗ and to each fault class Tfi , i = 1, . . . , r, Definition 3.7: [7] Let ⟨N, M0 ⟩ be a net system with a diagnosis state. labeling function L : T → L ∪ {ε}, where N = • ∆(w, Tfi ) = 0 if for all σ ∈ S(w) and for all tf ∈ Tfi (P, T, P re, P ost) and T = To ∪ Tu . Let w ∈ L∗ be an it holds tf ̸∈ σ. observed word. Let In such a case the ith fault cannot have occurred, because none of the firing sequences consistent with the observation M(w) = {(M, y) | (∃σ ∈ S(w) : M0 [σ⟩M ) ∧ contains fault transitions of class i. (∃(σo , σu ) ∈ Jˆ(w) : σo = Po (σ), • ∆(w, Tfi ) = 1 if: σu = Pu (σ), y = π(σu ))} (i) there exist σ ∈ S(w) and tf ∈ Tfi such that tf ∈ σ but be the set of pairs (basis marking, relative j-vector) that are (ii) for all (σo , σu ) ∈ Jˆ(w) and for all tf ∈ Tfi it holds consistent with w ∈ L . ∗  that tf ̸∈ σu . Note that the set M(w) does not keep into account the In such a case a fault transition of class i may have sequences of observable transitions that may have actually occurred but is not contained in any justification of w. fired. It only keeps track of the basis markings that can be • ∆(w, Tfi ) = 2 if there exist (σo , σu ), (σo′ , σu′ ) ∈ Jˆ(w) reached and of the firing vectors relative to sequences of un- such that observable transitions that have fired to reach them. Indeed, (i) there exists tf ∈ Tfi such that tf ∈ σu ; this is the information really significant when performing (ii) for all tf ∈ Tfi , tf ̸∈ σu′ . diagnosis. The notion of M(w) is fundamental to provide a In such a case a fault transition of class i is contained in recursive way to compute the set of minimal explanations. one (but not in all) justification of w. Under the assumption of acyclicity of the unobservable • ∆(w, Tfi ) = 3 if for all σ ∈ S(w) there exists tf ∈ Tfi subnet, the set M(w) can be easily constructed following such that tf ∈ σ. a simple algorithm in [7]. In such a case the ith fault must have occurred, because Definition 3.8: [7] Let ⟨N, M0 ⟩ be a net system where all firable sequences consistent with the observation contain N = (P, T, P re, P ost) and T = To ∪ Tu . Assume that the at least one fault in Tf . i  unobservable subnet is acyclic. Let w ∈ To∗ be an observed Example 4.2: Consider the PN in Fig. 1 previously intro- word. Let duced in Example 2.2. Let Tf = {ε11 , ε12 }. Assume that the two fault transitions belong to different fault classes, i.e., Mbasis (w) = {M ∈ Nm | ∃y ∈ Nnu and (M, y) ∈ M(w)} Tf = {ε11 } and Tf = {ε12 }. 1 2 Let us observe w = a. Then ∆(w, Tf1 ) = ∆(w, Tf2 ) = 0, be the set ∪of basis markings at w. Moreover, denote as being Jˆ(w) = {(t1 , ε)} and S(w) = {t1 }. In simple words Mbasis = w∈To∗ Mbasis (w) the set of all basis markings no fault of both fault classes may have occurred. for any observation w.  Let us observe w = ab. Then ∆(w, Tf1 ) = 1 and Note that if the net system is bounded then the set Mbasis ∆(w, Tf2 ) = 0, being Jˆ(w) = {(t1 t2 , ε)} and S(w) = is finite being the set of basis markings a subset of the {t1 t2 , t1 t2 ε8 , t1 t2 ε8 ε9 , t1 t2 ε8 ε9 ε10 , t1 t2 ε8 ε11 }. This means reachability set. that a fault of the first fault class may have occurred (firing Theorem 3.9: [7] Consider a net system ⟨N, M0 ⟩ whose the sequence t1 t2 ε8 ε11 ) but it is not contained in any 3 15 justification of ab, while no fault of the second fault class with two different notions of diagnosability: diagnosability can have occurred. and diagnosability in K steps. Diagnosability in K steps is Now, consider w = abb. In this case stronger than diagnosability and implies not only that the ∆(w, Tf1 ) = 2 and ∆(w, Tf2 ) = 0, being system is diagnosable, i.e., when the fault occurs we are Jˆ(w) = {(t1 t2 t2 , ε8 ε9 ε10 ), (t1 t2 t3 , ε8 ε11 )} and able to detect it in a finite number of transition firings, but S(w)={t1 t2 ε8 ε9 ε10 t2 , t1 t2 ε8 ε9 ε10 t2 ε8 , t1 t2 ε8 ε9 ε10 t2 ε8 ε9 , also that if the fault occurs we are able to detect it in at t1 t2 ε8 ε9 ε10 t2 ε8 ε9 ε10 , t1 t2 ε8 ε9 ε10 t2 ε8 ε11 , t1 t2 ε8 ε11 t3 }. most K steps. We gave necessary and sufficient conditions This means that no fault of the second fault class can have for both notions of diagnosability and we presented a test occurred, while a fault of the first fault class may have to study both diagnosability and K-diagnosability based on occurred since one justification does not contain ε11 and the analysis of the coverability graph of a special PN, called one justification contains it. Verifier Net, that is built starting from the initial system. Finally, consider w = abbccc. In this case Moreover, we gave a procedure to compute the bound K ∆(w, Tf1 ) = 3 and ∆(w, Tf2 ) = 1. In fact since in the case of K-diagnosable systems. Then, we showed Jˆ(w) = {(t1 t2 t3 t5 t4 t4 , ε8 ε11 ), (t1 t2 t3 t4 t5 t4 , ε8 ε11 ), how sufficient conditions on diagnosability can be computed (t1 t2 t3 t4 t4 t5 , ε8 ε11 ), (t1 t2 t3 t4 t4 t4 , ε8 ε11 )} a fault of using linear programming techniques. the first fault class must have occurred, while a fault To the best of our knowledge, this is the first time that of the second fault class may have occurred (e.g. necessary and sufficient conditions for diagnosability and K- t1 t2 ε8 ε11 t3 t4 t4 t5 ε12 ) but it is not contained in any diagnosability of labeled unbounded Petri nets are presented. justification of w.  VI. C ONCLUSION AND FUTURE WORK Proposition 4.3: [7] Consider an observed word w ∈ L∗ . In this talk an on-line approach for fault diagnosis of • ∆(w, Tfi ) ∈ {0, 1} iff for all (M, y) ∈ M(w) and for discrete event systems based on labeled Petri nets has been all tf ∈ Tfi it holds y(tf ) = 0. discussed. Some results on diagnosability analysis have also • ∆(w, Tfi ) = 2 iff there exist (M, y) ∈ M(w) and been recalled, that also apply to unbounded nets. (M ′ , y ′ ) ∈ M(w) such that: Several are the future lines of research in this framework. (i) there exists tf ∈ Tfi such that y(tf ) > 0, One of the most important is the relaxation of the assumption (ii) for all tf ∈ Tfi , y ′ (tf ) = 0. that the fault model is perfectly known, as well as the • ∆(w, Tfi ) = 3 iff for all (M, y) ∈ M(w) there exists initial state of the system. Moreover, it is interesting to tf ∈ Tfi such that y(tf ) > 0. investigate how the problems of on-line fault diagnosis and In [7] it has been shown that it is possible to distinguish diagnosability analysis can take advantage from information between diagnosis states 0 and 1 by simply checking if a coming timing associated with transitions. Finally, due to given constraint set is feasible or not. the complexity of current systems, it is important to develop Remark 4.4: As proved in [7], if the considered net sys- efficient distributed and/or decentralized approaches. tem is bounded, the most burdensome part of the procedure R EFERENCES can be moved off-line defining a graph called Basis Reach- [1] M. Sampath, R. Sengupta, S. Lafortune, K. Sinnamohideen, and ability Graph (BRG). The BRG is a deterministic graph that D. Teneketzis, “Diagnosability of discrete-event systems,” IEEE Trans. has as many nodes as the number of possible basis markings, on Automatic Control, vol. 40 (9), pp. 1555–1575, 1995. [2] ——, “Failure diagnosis using discrete-event models,” IEEE Trans. thus it is always finite being the set of basis markings a subset Control Systems Technology, vol. 4, no. 2, pp. 105–124, 1996. of the set of reachable markings.  [3] S. Genc and S. Lafortune, “Distributed diagnosis of place-bordered Petri nets,” IEEE Trans. on Automation Science and Engineering, V. D IAGNOSABILITY ANALYSIS vol. 4, no. 2, pp. 206–219, 2007. [4] A. Benveniste, E. Fabre, S. Haar, and C. Jard, “Diagnosis of asyn- In the diagnosis framework two different problems can be chronous discrete event systems: A net unfolding approach,” IEEE Trans. on Automatic Control, vol. 48, no. 5, pp. 714–727, 2003. solved: the problem of diagnosis and the problem of diagnos- [5] F. Basile, P. Chiacchio, and G. D. Tommasi, “An efficient approach for ability. As explained in detail in the above sections, solving online diagnosis of discrete event systems,” IEEE Trans. on Automatic a problem of diagnosis means that to each observed string Control, vol. 54, no. 4, pp. 748–759, 2009. [6] M. Dotoli, M. P. Fanti, A. Mangini, and W. Ukovich, “On-line fault of events is associated a diagnosis state, such as “normal” or detection in discrete event systems by Petri nets and integer linear “faulty” or “uncertain”. Solving a problem of diagnosability programming,” Automatica, vol. 45, no. 11, pp. 2665–2672, 2009. is equivalent to determine if the system is diagnosable, [7] M. Cabasino, A. Giua, M. Pocci, and C. Seatzu, “Discrete event diagnosis using labeled Petri nets. An application to manufacturing i.e., to determine if, once a fault has occurred, the system systems,” Control Engineering Practice, 2011, (in press, published can detect its occurrence in a finite number of steps. We on-line with DOI 10.1016/j.conengprac.2010.12.010). addressed this problem in two different settings, depending [8] M. Cabasino, A. Giua, and C. Seatzu, “Fault detection for discrete event systems using Petri nets with unobservable transitions,” Auto- on the boundedness of the net system. In particular, in [9] matica, vol. 46, no. 9, pp. 1531–1539, 2010. we proposed a solution that only applies to bounded nets, [9] ——, “Diagnosability of discrete event systems using labeled Petri where the major feature is to allow to perform, using the nets,” IEEE Trans. on Automation Science and Engineering, vol. 11, no. 1, pp. 144–153, 2014. same framework, both diagnosis and diagnosibility analysis. [10] M. Cabasino, A. Giua, S. Lafortune, and C. Seatzu, “A new approach Moreover, in [10] we proposed an approach that also for diagnosability analysis of Petri nets using Verifier Nets,” IEEE applies to unbounded PNs. In particular in [10] we dealt Trans. on Automatic Control, vol. 57, no. 12, pp. 3104–3117, 2012. 4