=Paper=
{{Paper
|id=Vol-1689/paper8
|storemode=property
|title=Using Model-Checking Techniques for Diagnosability Analysis of Intermittent Faults - A Railway Case Study
|pdfUrl=https://ceur-ws.org/Vol-1689/paper8.pdf
|volume=Vol-1689
|authors=Abderraouf Boussif,Mohamed Ghazel
|dblpUrl=https://dblp.org/rec/conf/vecos/BoussifG16
}}
==Using Model-Checking Techniques for Diagnosability Analysis of Intermittent Faults - A Railway Case Study ==
Using Model-Checking Techniques for Diagnosability Analysis of Intermittent Faults - A Railway Case-Study Abderraouf Boussif Mohamed Ghazel Univ. Lille Nord de France, F-59000 Lille, France IFSTTAR, Cosys/Estas, F-59650 Villenveuve d’Ascq, France {abderraouf.boussif, momahed.ghazel}@ifsttar.fr This paper addresses formal verification of intermittent fault diagnosability in Discrete Event Systems (DESs). The system is modeled by a Finite State Automaton and intermittent faults are defined as faults that can automatically recover once they have occurred. Two definitions of diagnosability, regarding the detection of fault occurrences within a finite delay and the detection of fault occurrences before their recovery, are discussed. The diagnosability is analyzed on the basis of the twin-plant structure, which is formally modeled as a Kripke structure, while diagnosability conditions are formulated using LTL temporal logic. We focus on a practical application of this approach, namely a case-study from the railway control field, will serve as a benchmark to illustrate the various developed mechanisms and to assess the scalability of the technique. Discrete Event Systems, Formal verification, Diagnosability analysis, Intermittent faults, Model-Checking. 1. INTRODUCTION analyzing every pair of executions that share the same observation. Fault diagnosis is a crucial and challenging task in large and complex dynamic systems. This problem has been Model-Checking techniques (Clarke et al. 1999), which extensively studied by both Artificial Intelligence (AI) have been developed for efficiently verifying complex and Control Engineering communities. In particular, an dynamic systems, have been exploited to deal with increasing amount of work has been devoted to fault diagnosis issues. For instance, (Cimatti et al. 2003) diagnosis of DES over the last two decades as witnessed addressed the formal verification of diagnosability using by the survey work in (Zaytoon and Lafortune 2013). CTL symbolic Model-Checking. In this work, verifying diagnosability is reduced to a reachability analysis One of the main issues in fault diagnosis of DES, is problem in the twin-plant structure where the condition diagnosability analysis. In simple terms, diagnosability of diagnosability is expressed as a CTL formula. Some refers to the ability to infer accurately, from partially further reformulations were also given in (Bourgne et al. observed executions, about the faulty behavior within a 2009; Boussif and Ghazel 2015). Several algorithms finite delay after a possible occurrence of a fault. The based on symbolic techniques, have been proposed formal definition of diagnosability was first introduced in (Grastien 2009) to test diagnosability with fairness in the seminal work (Sampath et al. 1995), where a properties. In (Peres and Ghazel 2014) a novel approach systematic method to check diagnosability based on to deal with diagnosability using a µ-calculus logical a diagnoser construction in an event-based context framework was proposed. The Boolean Satisfiability was developed. A similar work based on a diagnoser Problem (SAT), which is a dual technique of Model- construction in a state-based context was proposed in Checking has been also used to deal with some fault (Zad et al. 2003). diagnosis issues (Grastien and Anbulagan 2007; Grastien 2008). Improvements in terms of complexity, based on the verifier and the twin-plant structures have been All the DES framework discussed above assume that introduced in (Jiang et al. 2001; Schumann and Pencolé the failures are permanent, which means once a fault 2007; Yoo and Lafortune 2002), where the basis idea occurs, the system remains indefinitely faulty; hence the was to build an intermediate structure by performing terminology ”failure” is often used for permanent faults. a parallel composition of the system model with itself. Then diagnosability problem can be addressed by In many systems, faulty behavior often occurs intermittently, which can be depicted as a failure 1 event followed by its corresponding ”reset” event, (Guanqian et al.). Diagnosability of both permanent and followed by new occurrences of failure events, and so intermittent failures were revisited, and an approach to forth (Contant 2005). Indeed, intermittent faults are discriminate between these fault types was discussed. defined as faults that can automatically recover once Illustrative examples to demonstrate the proposed they have occurred. Such faults are predominant in approach and analysis results were presented. many real life systems, like, for example, electrical contacts, overheating of chips, noise measurements in In this paper, we propose an approach for diagnosability hardware systems, exceptions, interrupts, and bugs in analysis of intermittent fault using model-checking software systems. techniques. The approach is based on the twin-plant structure (Jiang et al. 2001), and the reformulation of The methodologies referenced above for permanent the diagnosability issues as temporal logic formulas that faults are no longer adequate in the context of are workable with Model-Checking. intermittent faults. Since the case of intermittent faults shows some subtle configurations compared to the case We first discuss two definitions of diagnosability of of permanent failures. Consequently, some DES based intermittent faults, regarding the detection of fault frameworks have been proposed to handle intermittent occurrences within a finite delay and the detection of faults. One of the first contributions was made by fault occurrences before their recovery. Then, necessary (Jiang et al. 2003), where a state-based DES modeling and sufficient conditions for each notion are developed for the so-called ”repeated faults” was introduced. based on the twin-plant construction, and reformulated Various notions of diagnsability were discussed and as linear temporal logic (LTL) formulas in order to use polynomial algorithms for checking these properties were model-checking for actual verification. A railway case- provided. Some improvements have been introduced study is used to illustrate the various concepts discussed in (Yoo and Garcia 2004; Zhou and Kumar 2009). and also to assess the efficiency and the scalability of the These works focus on determining how many times a approach. failure has occurred, but do not address the system status determination. Dealing with diagnosability of The paper is organized as follows: Section 2 introduces intermittent faults in this sense was first discussed in the considered system model and the modeling of series of works (Contant 2002, 2005). In these studies, intermittent faults as well as some related notions an FSA event-based approach is used, i.e. the faults and and notations. In section 3, different definitions of their recovering are considered as unobservable events. diagnosability are discussed. Section 4 discusses the The purpose of these works is to determine which twin-plant construction and gives the necessary and failures are present in the system and which failures sufficient conditions for each definition. Formulation of have occurred and been recovered, This work represents diagnosability of intermittent faults as a model-checking an extension of the seminal work on diagnosability of issue, and necessary and sufficient conditions as LTL permanent failures (Sampath et al. 1995) with some formulas are established in Section 5. We illustrate the modifications regarding the failure modeling and the discussed concepts through a railway case-study (a level diagnoser construction in order to cater for intermittent crossing benchmark) in Section 6. Finally, Section 7 failures. A similar work is reported in (Correcher et al. draws some concluding remarks and points some future 2003) with an illustration through an industrial process. directions. In (Soldani et al. 2007), intermittent fault diagnosis in an FSA framework was reported. Particularly, only the 2. PRELIMINARIES normal behavior of the system is considered and failure 2.1. System Model are modeled as the occurrence of an extra event or as the lack of a specific event. A diagnoser is then established Discrete models are quite convenient to perform safety for each event type. An extension to Petri Net framework analysis of industrial systems in a sufficiently high was given in (Soldani et al. 2006). abstraction level (Cassandras and Lafortune 2008). When systems are abstracted as DESs for diagnosis An extension of the state-based DES framework, purposes, the model used is often a finite state introduced in (Zad et al. 2003), was proposed in (Biswas automaton (FSA) G = hX, Σ, δ, x0 i where, X is a finite 2012) to deal with intermittent failures. Two notions set of states, Σ is a finite set of events, δ : X × Σ → 2X of diagnosability were introduced, one for detecting the is the partial transition function, and x0 ∈ X is the occurrence of a fault, and the other for detecting its initial state. A triple (x, σ, x0 ) ∈ X × Σ × X is called recovery. The diagnoser is constructed in the same way a transition if x0 ∈ δ(x, σ). The model G accounts as in (Zad et al. 2003) with the same time-complexity. for the normal and faulty behaviors of the system. Necessary and sufficient conditions for each notion The system behaviors are then described by the prefix- have been developed, and an algorithm to verify the closed language L ⊆ Σ∗ generated by G, where Σ∗ diagnosability conditions has been provided. denotes the Kleene-closure of set Σ. A new way for modeling failures, which includes Partial observability is a key issue in fault diagnosis. permanent and intermittent faults, was proposed in In this regard, some events in Σ are observable, i.e., their occurrence can be observed, while the others are Let G1 = hX1 , Σ1 , δG1 , x01 i and G2 = unobservable. U Thus, event set Σ can be partitioned as hX2 , Σ2 , δG2 , x02 i denote two finite state automata. Σ = Σo Σu , where Σo denotes the set of observable The strict synchronous composition of G1 events and Σu the set of unobservable events. and G2 produces an automaton GG1 kG2 = hX1 × X2 , Σ1 ∩ Σ2 , δG1 kG2 , (x01 , x02 )i, where In the context of diagnosis of intermittent faults, let δG1 kG2 ⊆ (X1 × X2 ) × (Σ1 ∩ Σ2 ) × (X1 × X2 ) Σf ⊆ Σu denotes the set of fault events and let and (x01 , x02 ) ∈ δG1 ||G2 ((x1 , x2 ), σ) if x01 ∈ δG1 (x1 , σ) Σr ⊆ Σu denotes the set of fault reset events. and x02 ∈ δG2 (x2 , σ). Failures and their recovery are basically represented using unobservable events, since their detection and Finally, we define the non-deterministic automaton diagnosis would be trivial if they were observable. G0 = hXo , Σo , δG0 , x0 i as the generator of language Thus, the set of fault events (resp. the set of L(G0 ) = P (L(G)). Thus, G0 is called “the constructed reset events) can Ube partitionedU as disjoint failure generator.” of G. Elements Σo and x0 are as defined before. Xo = {x0 } ∪ {x ∈ X : ∃x0 ∈ X, ∃σ ∈ Σo : U classes Σf = Σf1 Σf2 · · · Σfm , where Σfi (i = 1, 2,U . . . , m) U denotes U the ith fault class (resp. Σr = x ∈ δ(x0 , σ)}. The transition relation of G0 is given Σr1 Σr2 · · · Σrm , where Σri (i = 1, 2, . . . , m) by δG0 ⊆ (Xo × Σo × Xo ) and is defined as follows: denotes the recovering class of faults in Σfi ). (x, σ, x0 ) ∈ δG0 if ∃s = (σ1 , σ2 , . . . , σn = σ) ∈ Σ∗ such that x0 ∈ δ(x, s) and ∀1 ≤ i ≤ n−1, σi ∈ Σu , σn ∈ Σo . An event-trace s = (σ1 , σ2 , . . . , σn ) is said to be associated with state-trace π = (x1 , x2 , . . . , xn+1 ) if In the remainder of this paper, we consider a finite state ∀1 ≤ i ≤ n, xi+1 ∈ δ(xi , σi ). We write si to indicate the automaton G = hX, Σ, δ, x0 i as the model of the system ith event in s and sf the last event in s (i.e., sf = s|s| ). to be analyzed. For the sake of clarity, here only one class We denote by L/s the post-language of L upon s, i.e., of fault event Σf and its corresponding class Σr of reset L/s := {t ∈ Σ∗ |s.t ∈ L}. We write s ≤ t to denote the events will be considered. fact that s is a prefix of t. 2.2. Modeling of Intermittent Faults For convenience, we introduce the following particular sets of event-traces: In the literature pertaining to diagnosis of DES, faults are said to be intermittent when they are non- • ψ(Σfi ) = {s = (σ1 , σ2 , . . . , σn ) ∈ L | σn ∈ Σfi } is permanent, in the sense that each occurrence of fault the set of event-traces in L that end with a faulty is followed by its reset to the recovery behavior of event in Σfi . the system within a finite delay. Such faults may be activated or deactivated by some external disturbance. • ψ(Σri ) = {s = (σ1 , σ2 , . . . , σn ) ∈ L | σn ∈ Σri } Regarding the system status, an intermittent failure is the set of event-traces in L that end with a reset takes the system from a normal state to a faulty event in Σri . state (by the occurrence of the corresponding fault • ψ(Σfi ) = {s = (σ1 , σ2 , . . . , σn ) ∈ L | ∀1 ≤ i < n : event), and then the system is taken again to a recover σi ∈ / Σfi ∧ σn ∈ Σfi } is the set of event-traces in L state within a finite delay (by the occurrence of the that have only the last event in Σfi . corresponding reset event). • ψ(Σri ) = {s = (σ1 , σ2 , . . . , σn ) ∈ L | ∀1 ≤ i < n : In order to capture these changes in the system status, σi ∈ / Σri ∧ σn ∈ Σri } is the set of event-traces in L due to the various types of events, we use the supervision that have only the last event in Σri . pattern Ω (Carvalho et al. 2012), shown in Figure 1, which is a label automaton that models the dynamic Let us consider σ ∈ Σ and s ∈ Σ∗ , we write σ ∈ s to behavior of the system regarding intermittent faults. denote the fact that ∃ 1 ≤ i ≤ |s| such that si = σ. One can note that automaton Ω plays the role of the By abuse of notation, we write Σf ∈ s to denote that label function, which is usually used in fault diagnosis a fault event from Σf is an event in event-trace s (i.e., (Sampath et al. 1995). ∃f ∈ Σfi such that f ∈ s). Σ \ Σf Σ \ Σr Σ \ Σf To capture the observed behavior of the system, we Σr define the projection operator as a mapping P : Σ∗ → Σf start N F R Σ∗o . In the usual manner, P (σ) = σ for σ ∈ Σo ; P (σ) = for σ ∈ Σu , and P (sσ) = P (s)P (σ), where s ∈ Σ∗ Σf and σ ∈ Σ. That is, P simply erases the unobservable events in any event-trace. The inverse projection PL−1 Figure 1: The label automaton Ω is defined by PL−1 (y) = {s ∈ L(G) : P (s) = y}. The projection operator can then be extended to language L Actually, when label automaton Ω is in state N (N for by applying the projection to all traces of L. Therefore, if normal status), this means that the system executes a L ⊆ Σ∗ , then P (L) = {t ∈ Σ∗o : (∃ s ∈ L) [P (s) = t]}. normal behavior, which indicates that no event from Σf has occurred yet. However, when a fault event occurs, the label automaton Ω moves to state F (F for faulty 3. DIAGNOSABILITY OF INTERMITTENT status), and remains in that state for as long as the FAULTS system executes a faulty behavior. When the fault is recovered, by the occurrence of a reset event, Ω switches 3.1. Assumptions to state R (for recovery status), where it stays as long as the system continues to execute a non-faulty behavior. Besides the well-known assumptions considered in the As we deal with intermittent faults, the system can diagnosis of permanent faults (Sampath et al. 1995), execute again a fault event. Then the label automaton i.e., that language L(G) is live and no cycle of Ω can return to state F . only unobservable events exists in G, the following assumptions are considered: In order to keep track of the occurrence of faults and their corresponding resets along the system’s evolution, 1. Each fault event σf has its corresponding we compute automaton G` as the parallel composition reset event σr . Recall that both events are of automata G and Ω (G` = G k Ω). In fact, the states unobservable. of G` are the states of automaton G enriched with labels 2. At least, one observable event exists between N , F , or R. The following example illustrates these the occurrence of any fault event σf and its notions. corresponding reset event σr and between the occurrence of any reset event σr and a new Example 1 Consider automaton G, shown in Figure occurrence of fault event σf . 2(a) and taken from (Contant et al. 2004). The sets of observable and unobservable events are Σo = {a, b, c, d} 3. Each occurrence of fault event σf is followed by and Σu = {f, r}, respectively. In addition, Σf = {f } the occurrence of its corresponding reset event σr and Σr = {r}. Automaton G` = G k Ω is depicted in within a finite delay, and each occurrence of a Figure 2 (b). reset event σr is followed by a new occurrence of the fault event σf within a finite delay. This start 1 start 1,N assumption implies that fault and reset events f f occur with some regularity (pseudo-periodicity). These notions are called the Σf − recurrence 2 2,F and Σr − recurrence, as introduced in (Contant a a 2005). 3 3,F r r 3.2. Diagnosability Definitions b b 10 4 10,F 4,R r b r b Intermittent failures are dynamic (Contant 2005), that is, they can repeatedly occur and reset, and thus, the 11 5 11,R 5,R e e fault status evolves along with the system evolution. c f c f Consequently, several notions of diagnosability can e 12 6 9 e 12,R 6,F 9,R be introduced, according to the properties and the d c d c specifications one may need to investigate. For example, d one may want to ensure the detection of any fault 13 7 d 13,R 7,F occurrence or its corresponding recovery. Another f r f r definition would require checking the presence of each 14 8 14,F 8,R fault before its recovery or checking the recovery of any fault before a new occurrence of this fault. (a) (b) Determining accurately the finite delays in which the fault or its recovery can be diagnosed can also be Figure 2: (a) Automaton G, (b) automaton G` of of interest in practice. Obviously, the choice between Example 1. these considerations greatly depends on the application nature of the system and the objectives assigned to the Regarding the diagnosis of intermittent faults, one can diagnosis activity. infer that the states of automaton G` can be partitioned into three subsets: ‘Normal’, ‘Faulty’ and ‘Recovered’, In this section, we discuss two definitions of which can be identified using fault-assignment function: diagnosability: F -diagnosability, which ensures the detection of fault occurrence within a finite delay after Ψ : X → {N, F, R}. its occurrence, and Fr -diagnosability, which ensures the detection of fault occurrences before their recovery. Note that, these definitions do not take into account the identification of the system status (i.e., whether the faulty status of the system is precisely known or not, when the fault is diagnosed). Definition 1 (F -diagnosability) event. Condition DFr then requires that any finite event- trace that shares the same observation with s.t, shall An FSA G is said to be F -diagnosable w.r.t. projection experiment a faulty behavior between the moment of function P , fault class Σf and reset event class Σr , if the fault occurrence (at the end of s) and its recovery the following holds: (at the end of t), which ensures the fault detection. (∃ n ∈ N) [∀s ∈ ψ(Σf )] (∀t ∈ L/s) [k t k≥ n ⇒ DF ] Example 3 Consider again, automaton G of Example where diagnosability condition DF is: 1 (Figure 2). Let us take the finite event-trace s = f arbf associated with finite execution ρ = ∀ω ∈ [PL−1 (P (s.t))] ⇒ (Σf ∈ ω) 1, f, 2, a, 3, r, 4, b, 5, f . It is clear that s ends with faulty event f . Let event-trace t = crd, be the continuation F -diagnosability, where ‘F ’ stands for fault of s until the reset of f . There exists, in automaton G, occurrences, has the following meaning: for any one event-trace ω = f abrcd associated to the finite event-trace s ending with a fault event in Σf , execution ρ0 = 1, f, 2, a, 3, b, 10, r, 11, c, 12, d which and t any continuation of s, then, n ∈ N exists such shares the same observed event-trace with s.t, i.e., that, after the occurrence of at most n events, it P (ω) = P (s.t) = ab(cd). However, according to is possible to detect the occurrence of the fault Definition 2, the occurrence of f cannot be detected, based on the captured observation. This implies that in this case, before its recovery (i.e., no faulty state in all the event-traces indistinguishable from s.t have ρ0 is reached, between the occurrence of f and its reset experimented, at least one fault from Σf . r). Therefore, G is non-Fr -diagnosable. Example 2 Let us take automaton G of Example 1 It should be noticed that Fr -diagnosability is stronger (Figure 2). G contains one fault event f with its than F -diagnosability. Indeed, Fr -diagnosability requires corresponding reset event r. Let us consider execution the detection of any fault before its reset. However, ρ = 1, f, 2, a, 3, r, 4, b (5, f, 6, c, 7, r, 8, d, 9, e)∗ , the F -diagnosability only requires the detection of the infinite event-trace, corresponding to this execution, is fault within a finite delay, without considering the noted s.t with, s = f arbf (one can see that s ∈ different occurrences or resets of the fault. Thereby, it ψ(Σf )), and t = (crdef )∗ . The resulting observed is straightforward to infer the following, event-trace is then P (s.t) = ab(cde)∗ . The only event- trace in G which shares the same observable event-trace Proposition 1 (Relation between definitions) with ρ is ω = f ab(rcdf e)∗ . One can see that 4 events after executing the faulty event-trace s (2 observable • Fr -diagnosability ⇒ F -diagnosability events), it is possible to infer accurately the occurrence of fault f (since f occurs in all the event-traces which • non-F -diagnosability ⇒ non-Fr -diagnosability share the same observation with s.t). Thus, according to Definition 1, G is F -diagnosable (for n ≥ 4). Remark: As we deal with intermittent faults, each fault The above-mentioned definition ensures the detection occurrence is followed later on by its corresponding of intermittent fault occurrence within a finite delay. reset event. Then, it is also interesting to discuss the However, it does take into account the detection of diagnosability of the recovery occurrence. Namely, this each occurrence of the intermittent fault before its consists in checking whether we can detect that the recovery. Hereafter, we introduce a strong version of system has moved to its recovery behavior after the fault diagnosability that deals with this property. has been recovered (and before a new occurrence of a fault event). These properties will not be discussed in Definition 2 (Fr -diagnosability) this paper. An FSA G is said to be Fr -diagnosable w.r.t. projection function P , fault class Σf and reset event class Σr , if 4. VERIFICATION OF DIAGNOSABILITY OF the following holds: INTERMITTENT FAULTS [∀s ∈ ψ(Σf )] (∀t ∈ L/s ∧ t ∈ ψ(Σr )) ⇒ DFr where diagnosability condition DFr is: The procedure we suggest for analyzing diagnosability of intermittent failures will be carried out by combining the (∀ωω ω0 ∈ L : ωω ∈ [PL−1 (P (s))] ∧ ω 00 ∈ [PL−1 (P (t))]) 00 0 twin-plant construction method (Jiang et al. 2001), and / ω 0 ] ∨ [Σf ∈ ω 00 ] ⇒ [ω ∈ ψ(Σf ) ∧ Σr ∈ some extensions, we develop, of the Model-Checking reformulation of diagnosability in the case of permanent with F stands for ‘fault occurrences’, and (r ) for failures as in (Cimatti et al. 2003; Boussif and Ghazel ‘detection before the fault recovery’. 2015). In this section, we first recall the twin-plant The above definition means the following: let s be any construction, and later we develop the necessary and finite event-trace in L that ends with a faulty event, sufficient conditions for each notion of diagnosability t be any finite continuation that ends with a reset introduced in the previous section. 4.1. Twin-Plant Construction Definition 4 (Twin-plant state types) The twin-plant, firstly introduced in (Jiang et al. 2001), We define the following state types, simply consists of two synchronized copies of generator G0 of system model G, while the synchronization • N -state (resp. F -state, R-state): is a state q = (x, x0 ) is performed. Thus, any event-trace in the twin- ∈ Q, such that Ψ(q) = (N, N ) (resp. Ψ(q) = (F, F ), plant corresponds to a pair of event-traces in the Ψ(q) = (R, R)). system model that share the same observation. More • N F -state (resp: is a state q = (x, x0 ) ∈ Q, such that precisely, a path in the twin-plant corresponds to two Ψ(q) = (N, F ). F N -state is defined similarly. indistinguishable traces in the system model. • F F -state: is a state q = (x, x0 ) ∈ Q, such that To preserve the label tracking, we use the constructed Ψ(q) = (F, F ). generator G0` , instead of the constructed generator G0 . Then, in order to generate a reduced state-space • F 1-state (resp. R1-state, N 1-state): is a state q = of twin-plant (by generating only the behavior of (x, x0 ) ∈ Q, such that Ψ(q) = (F, 4) (resp. Ψ(q) = interest for fault diagnosis) we perform the synchronous (R, 4), Ψ(q) = (N, 4)),with 4 ∈ {N, F, R}. composition G0` ||G0`F , which is different from that in • F 1-state: is a state q = (x, x0 ) ∈ Q, such that (Jiang et al. 2001). In fact, G0`F depicts only the co- Ψ(q) = (F , 4). accessible part of generator G0` from faulty states, i.e. it only contains the faulty event-traces. Thus, G0`F = One can underline that the twin-plant has an interesting (XoF , Σo , δoF , x0 ), where XoF is the set of states in feature, which is the symmetric property. It means that G0` that are reachable by event-traces which contain at each path in the twin-plant has its symmetric path least one fault event. For more details about generating (e.g., a path containing F F -states has its symmetric G0`F , the reader can refer to (Moreira et al. 2011). path which contains the symmetric F F -states, and vice versa). In the following section, we take into account Definition 3 (The reduced twin-plant) this property for developing the necessary and sufficient conditions. A reduced twin-plant of model G is an FSA P = hQ, Σo , Γ, q0 i, where: 4.2. Necessary and sufficient conditions for F -diagnosability • Q ⊆ {(x, x0 ) | x ∈ Xo , x0 ∈ XoF } is the set of states. In a previous work (Boussif and Ghazel 2015), we • Σo the set of the (observable) events. have dealt with diagnosability of permanent faults using a twin-plant-based structure in model-checking • Γ ⊆ Q × Σo × Q is the partial transition relation s.t. framework. The necessary and sufficient condition for (q, σ, q 0 ) ∈ Γ, with q = (x1 , x2 ), and q 0 = (x01 , x02 ) if diagnosability was the absence of “infinite critical pairs” and only if (x1 , σ, x01 ), (x2 , σ, x02 ) ∈ δoF . in the constructed twin-plant. This means the absence of • q0 = (x0 × x0 ) ∈ Q is the initial state. cycles which are composed only of F N (or N F )-states. In the same way, we formalize a necessary and sufficient It is worthwhile recalling that constructing the twin- condition for the diagnosability of intermittent faults. plant can be performed in (O(|X|4 ×|Σo |)) (Jiang et al. In order to do so, we need to introduce the following 2001). definition, As the reduced twin-plant is performed directly on Definition 5 (F -confused cycle) constructed generator G0` , then label tracking is preserved and therefore, the fault-assignment function It is a cycle π = (q1 , q2 , . . . , qn , qn+1 = q1 ), in the is extended as follows: twin-plant, s.t. ∀ 1 ≤ i ≤ n, qi is an N 1-state, and ∃ 1 ≤ j ≤ n, s.t. qj is an N F -state. Ψ : (Xo , Xo ) → ({N, F, R} × {N, F, R}) An F -confused cycle in twin-plant corresponds to two Hence, different types of states can be distinguished cycles on the system model (automaton G) which in the reduced twin-plant. Hereafter, only state types, generate the same observed event-trace, such that the which will be used in the sequel, for developing necessary first one has no fault event (a fault-free cycle) and the and sufficient conditions for diagnosability, are defined. second one contains, at least, one fault event (which is depicted by the existence of an N F -state). In order to simplify the notations, we introduce labels N , F and R which mean respectively that the label is Figure 3 shows a path that contains a configuration of different from N (i.e., it can be F or R), different from an F -confused cycle represented by states q2 , q3 , q4 , q5 . F (i.e., it can be N or R) and different from R (i.e., it can be N or F ). After having set up the necessary notions, we now establish the necessary and sufficient conditions for F - diagnosability. Since Σf ∈ / ω, then ∀q ∈ c`, q is an N 1-state. Moreover, Σf ∈ s (since s ∈ ψ(Σf )). According to assumption 3, the fault event occurs and reset regularly. Then, ∃ i ≤ k 0 s.t. qi is an N F -state. Thus, c` is an F -confused cycle, according to Definition 3, which contradicts our assumption. Figure 3: An F -confused cycle 4.3. Necessary and Sufficient Conditions for Fr -diagnosability Theorem 1 (F -diagnosability) It should be noticed that, as stated in Proposition 1, the non-F -diagnosability implies the non-Fr -diagnosability. A system model G is F -diagnosable, w.r.t projection Then, it is straightforward that the necessary and function P , class of fault events Σf and its sufficient condition for F -diagnosability (i.e, the absence corresponding class of reset events Σr , if and only if no of an F -confused cycle in the twin-plant) is also a F -confused cycle exists in its corresponding twin-plant. necessary condition for Fr -diagnosability. Hereafter, the notion of F -indicating sequence is introduced, such Proof 1 (⇒) Assume that L(G) is F -diagnosable a notion will be used for stating the necessary and but there exists an F -confused cycle in its sufficient condition of Fr -diagnosability. corresponding twin-plant: q1 , σ1 , q2 , . . . , qn , σn , q1 , n ≥ 1. Such a cycle corresponds to two Definition 6 (F -indicating sequence) cycles in G0` : c` = x11 , σ1 x12 , . . . , x1n , σn , x11 and c` 0 = x1 , σ1 , x2 , . . . , x2n , σn , x21 . 2 2 Let It is a finite path π = (q1 , q2 , . . . , qn ), such that q1 is an t = v1 , σ1 , v2 , σ2 , . . . , vn , σn and t0 = F 1-state, ∀ 1 < i < n, qi is F F -state, and qn (n > 2) v10 , σ1 , v20 , σ2 , . . . , vn0 , σn be the event-traces is an R1-state. that correspond to cyclic executions c` and c`0 in G s.t. ∀ i ≤ n, vi , vi0 ∈ Σ∗u . (i.e., Figure 4 shows a path that contains a configuration of an P (t) = P (t0 ) = σ1 , σ2 , . . . , σn ). F -indicating sequence represented by states q1 , q2 , q3 , and q4 . By construction of the twin-plant, ∃ s0 , s00 ∈ L(G), s.t. [P (s0 ) = P (s00 )] ∧ [δ(x0 , s0 ) = x11 ] ∧ [δ(x0 , s00 ) = x21 ] ∧ [Σf ∈ / s0 ]. (the last condition Σf ∈ / s0 is due to the fact that x1i is an N -state ∀1 ≤ i ≤ n). Also, according to Figure 4: An F -indicating sequence Definition 5, Σf ∈ t0 and Σf ∈ / t. Thus, one can 0|t0 | consider t0 = t01 .t02 such that t1 1 ∈ Σf (i.e., t01 ends Theorem 2 (Fr -diagnosability) with a fault event). Now, let us consider s = s00 .t01 , A system model G is Fr -diagnosable, w.r.t a projection then s ∈ ψ(Σf ). Thus, for any n ∈ N let us take function P , a class of fault events Σf and its t00n = t02 t0n ∈ L/s, then (|t00n | ≥ n) and (∃ ωn = s0 .tn+1 ) corresponding class of reset events Σr , if and only if no such that [ωn ∈ P −1 P (st00n )] ∧ [Σf ∈ / ωn ]. F -indicating sequence exists in its corresponding twin- plant. Therefore, F -diagnosability definition is violated. Proof 2 The proof of Theorem 2 is not given, but it is (⇐) Assume that twin-plant P is F -confused-cycle-free developed in the same manner as for F -diagnosability and suppose that automaton G is non-F -diagnosable. by adopting an absurd reasoning. Then, (∀n ∈ N)(∃ s ∈ ψ(Σf )) (∃ t ∈ L(G)/s) such that: [k t k≥ n] ∧ [(∃ ω ∈ P −1 P (s.t)) ∧ [Σf ∈ / ω]] 5. DIAGNOSABILITY VERIFICATION USING Let us pick any n ≥ |X|2 , and ω ∈ L(G) such MODEL-CHECKING that P (ω) = P (s.t) = σ1 , σ2 , . . . , σk , with k ∈ N. By constructing twin-plant P of G, we have a path 5.1. Model-Checking π = q0 , σ1 , q1 , . . . , σk , qk+1 , k ≤ |s.t| that corresponds to executions ω and s.t. Model-Checking is an automatic formal verification technique that is widely applied for the design of As |t| ≥ n ≥ |X|2 , it is clear that executions complex dynamic systems (Clarke et al. 1999). It allows corresponding to s.t and ω contain cycles (Jiang et al. for verifying whether the system behavior (modeled by 2001). Thus, ∃0 ≤ i ≤ k 0 , with k 0 ≤ k such that c` ∈ π, a Kripke Structure) satisfies a given property expressed with c` = qi , σi+1 , qi , . . . , qk0 −1 , σk0 , qi (i.e., a cycle c` as a temporal logic formula or not, using efficient exists in π). algorithms based on exhaustive exploration of the system state-space. A counter-example is generated if The specification can be read as follows: “ for the the system does not satisfy the property, which is an considered infinite path in the twin-plant, all states, interesting feature namely for debugging. from the current state, are N 1-states and at least one state is an N F -state”. Therefore, F -diagnosability can In order to use Model-Checking for verifying diagnosabil- be expressed as follows: ity of permanent faults, (Cimatti et al. 2003) proposed a method to formulate the diagnosability issue as a Model- KP , SP |= ¬ F G( N1 ∧ F NF ) Checking problem using a CTL/LTL temporal logic formulas and the Kripke structure that corresponds to where KP is the Kripke structure corresponding to the the twin-plant of the system. Hereafter, we reformulate twin-plant P of G, and SP is the initial state in KP . the necessary and sufficient conditions for the different definitions in the same way as in (Cimatti et al. 2003). 5.4.2. Fr -diagnosability as a Model-Checking problem: The LTL formula which characterizes the first state of 5.2. The Kripke Structure any F -indicating sequence in the twin-plant is, In simple terms, a Kripke structure is a non- deterministic state/transition system with atomic φ1 : F 1 ∧ X F F ∧ X [F F U R1] propositions assigned to the states (or the actions). Each state of the Kripke structure represents some possible The specification can be read as follows: “for the configuration of the system, while a labeling function considered path in the twin-plant, the current state is associates with each state the properties holding in it. an F 1-state, the successor state is an F F -state and Thus, in order to formulate a twin-plant as a Kripke the successors states are F F -states until an R1-state is structure, one can simply encode states (of the two reached”. copies of the system) and the observed events of the The Model-Checking problem which expresses Fr - twin-plant in the state space of the Kripke structure, diagnosability is: i.e., a state in the Kripke structure is defined as a vector (x1 , x2 , σ), where x1 , x2 are the states of the system copies and σ is a feasible (observable) event from both KP , SP |= ¬ F ( F 1 ∧ X F F ∧ X [F F U R1] ) x1 and x2 . The labeling function associated with each state in the Kripke structure takes one proposition from where KP is the Kripke structure corresponding to the {N, F, R} × {N, F, R}. twin-plant P of G, and SP is the initial state in KP . 5.3. Linear Temporal Logic (LTL) 6. A RAILWAY CASE-STUDY Temporal logics allow one to formally express properties to be satisfied by a model. LTL is a temporal logic that In order to evaluate the efficiency and the scalability reasons over linear traces of Kripke structure through of the proposed approach, we experiment a railway time. At each time instant, there is only one real future case-study: a Level Crossing benchmark (Liu 2014). timeline that is considered. Conventionally, that timeline For verification, we use the symbolic model-checker is defined as starting “now”, in the current time step, nuXmv (version 1.0) (Bozzano et al. 2014), which is and progressing infinitely in the future. LTL formulas a symbolic model-checker for analyzing of synchronous may contain a finite number of atomic propositions, finite-state and infinite-state systems. It is an extension Boolean connectives ¬, ∧, ∨, and temporal connectives: of the existing NuSMV model-checker with some new ‘X’ that means ‘neXt’, ‘G: Globally’,‘R: Release’, ‘F : interesting features. Both are originated from the re- in the Future’,’U : U ntil’. engineering, re-implementation and extension of the CMU SMV Tool. Its main advantage is the integration of 5.4. Diagnosability Conditions as LTL Formulas techniques based on the ”Satisfiability Modulo Theory (SMT)”, implemented through a tight integration with In order to formulate the two notions of diagnosability MathSAT5. as Model-Checking problems, we first formulate each diagnosability condition as an LTL formula. For the sake 6.1. Level Crossing Benchmark of simplicity, we introduce these atomic propositions: N 1, N F , F 1, F F , and R1, which mean respectively: A Level Crossing (LC) is an intersection where a railway the state q is an N 1-state, N F -state, F 1-state, F F - line intersects with a road or path at the same level. state, and R1-state. It is composed of three subsystems: the railway traffic, the LC controller and the barriers, these subsystems are 5.4.1. F -diagnosability as a Model-Checking problem: modeled by Labeled Petri Net (LPN). The global system The LTL formula which characterizes each state of an is established using some shared places and transitions F -confused cycle in the twin-plant is, between these sub-models. φ1 : G( N1 ∧ F NF ) An interesting feature of this benchmark is that it can be extended to n railway tracks in order to obtain larger models and assess the scalability of the used The diagnosability verification will be then performed techniques. For more details about modeling, function on the obtained reachability graph. Before proceeding and the development of the benchmark, the reader can with tests, we construct the twin-plant as a Kripke refer to (Ghazel and Liu 2016). structure. It is described as a synchronous composition of two copies of a system modules instead of The global single-line LC model is depicted in Figure 5. enumerating the whole model. The verification task is Transitions in green squares are observable and the conducted as follows, we first check F -diagnosability. others are unobservable, where the red one is a fault If the specification is satisfied by the Kripke structure event from the fault class Σf , the yellow one is a reset corresponding to the twin-plant model (which means, event from Σr corresponding to fault class Σf , and that an F -confused cycle exists in the twin-plant and will the grey one is a normal unobservable event. Actually, be directly generated by the model-checker as a counter- the fault event is pertaining to train-sensors along the example), then the system is not F -diagnosable. In this track and may cause the arrival of the train into the LC case, we can directly decide about the Fr -diagnosability intersection zone before the barriers are ensured to be (the system is not Fr -diagnosable), without proceeding lowered. to check its corresponding specifications, since the non-F -diagnosability implies the non-Fr -diagnosability railway traffic t01,4 , ig as stated in proposition 1. In the other case, we Faulty track proceed by checking Fr -diagnosability as done with F - p01,1 t01,1 , ap’1 p01,2 t01,2 , en’1 p01,3 t01,3 , lv’1 p01,4 diagnosability. 6.2. Results and Discussion Normal track tu , All experiments were conducted on a 64-bit PC, Ubuntu p1,1 t1,1 , ap1 p1,2 t1,2 , en1 p1,3 t1,3 , lv1 14.04 operating system, an Intel Core i5, 2.5 GHz Processor with 4 cores and 6 GB RAM. p1 p2 p3 p4 Table 1 shows the obtained results. Columns from left to right correspond to: n: the numbers of railway tracks, GS : the obtained number of states in automaton G t1 , cr p9 t2 , or (i.e., the reachability graph of the LPN model); GT : the number of transitions in G; PS : the number of LC controller p5 p6 obtained states in twin-plant G at the end of the analysis; tP : the time needed to generate twin-plant barriers p7 , up (as a Kripke structure); DiagF : the F -diagnosability verdict; tDiagF : the time needed to conclude about F -diagnosability; DiagFr : the Fr -diagnosability verdict t3 , kd t4 , lw t5 , rs and finally, tDiagfr : the time needed to conclude about Fr -diagnosability. p8 , down The analysis of the different notions of diagnosability observable transition unobservable fault transition shows that diagnosability properties are violated by the model whatever the number of tracks, which means that the LC benchmark is neither F -diagnosable nor Fr - Figure 5: Level Crossing Benchmark diagnosable. Based on the generated counter-examples, one can conclude that this is due to the fact that two scenarios exists in the model that generate the same As the LC system is modeled by an LPN, we first observation (i.e, the same subsequent firing sequence). generate its reachability graph with the help of TINA these two scenarios correspond to : (1) a train-sensing Tool (which represents our input automaton G) and failure occurs and then the train does never leave the then, perform our technique based on the generated intersection zone, and (2) the train stops indefinitely reachability graph. In order to assess the scalability, we before reaching the intersection zone (See (Ghazel and increase the number of railway track k progressively. Liu 2016)). n GS GT PS tP DiagF tDiagF DiagFr tdiagFr 1 26 39 123 0.1 s non 0.27 s non 0.17 s 2 271 683 3536 1.42 s non 5.80 s non 11.23 s 3 1938 6771 144210 28,63 s non 51.29 s non 83.50 s 4 2542 9293 3053060 40,29 s non 1274.59 s non 1846 s Table 1: Experimental results for n-LC models. Regarding the scalability of the approach, one can the diagnosability analysis process, in such a way as to observe that the size of the marking graph (GS and GT ) be able to deal with large systems. significantly increases with the dimension of the net, namely with the number of tracks (n) in the benchmark. It is not surprising that the number of the reachable REFERENCES states (PS ) of the Kripke structure highly increases with the size of the reachability graph, since it is very J. Zaytoon and S. Lafortune. Overview of fault diagnosis sensitive to combinatorial explosion. Indeed, as we use a methods for discrete event systems. Annual Reviews symbolic Model-Checker, it is difficult to conclude about in Control, 37:308–320, 2013. the evolution of the state-space since it depends on M. Sampath, R. Sengupta, and S. Lafortune. variable and transition orderings and clustering (which Diagnosability of discrete-event systems. IEEE is not taken into account in our experiments). It should Transactions on Automatic Control 40(9), pages be stressed that no reduction or optimization techniques 1555–1575, 1995. have been used to perform the analysis. S. H. Zad, R .H. Kwong, and W. M. Wonham. Fault Finally, three remarks relatively to the elapsed times for diagnosis in discrete-event systems: Framework and generating the twin plant and verifying diagnosability, model reduction. IEEE Transactions on Automatic can be emphasized: Control, 48(7):1199–1212, 2003. 1. The Model-Checker spends more time in S. Jiang, Z. Huang, V. Chandra, and R. Kumar. verification than in generating the twin plant. A polynomial algorithm for testing diagnosability of discrete-event systems. IEEE Transactions on Automatic Control, 46(8):1318–1321, 2001. 2. Elapsed times for generating the twin plant and verifying diagnosability stay in the order A. Schumann and Y. Pencolé. Scalable diagnosability of seconds until 3-tracks, then it increases checking of event-driven system. 20th International significantly. Joint Conference on Artificial Intelligence, pages 575– 580, 2007. 3. More time elapsed for verifying Fr -diagnosability T. S. Yoo and S. Lafortune. Polynomial-time verification than F -diagnosability. This result is logical, since of diagnosability of partially observed discrete-event the LTL formula that expresses Fr -diagnosability systems. IEEE Transactions on automatic control, 47 contains more temporal connectives than the (9):1491–1495, 2002. LTL formula expressing F -diagnosability (i.e., the model-checking techniques are sensitive to the E. M . Clarke, O. Grumberg, and D. Peled. Model size of the properties). Checking, The MIT Press Cambridge, MA, 1999. A. Cimatti, C. Pecheur, and R. Cavada. Formal 7. CONCLUSION verification of diagnosability via symbolic model checking. Int. Conference on Artificial Intelligence, In this paper, we propose an approach to analyze pages 363–369, 2003. diagnosability of intermittent faults in DESs using a G. Bourgne, P. Dague, F. Nouioua, and N. Rapin. model-checking framework. System modeling, faults Diagnosability of input output symbolic transition modeling, and two notions of diagnosability with systems. 1st Int. Conference on Advances in System their corresponding necessary and sufficient conditions Testing and Validation Lifecycle, pages 147–154, have been discussed. Diagnosability issues were then 2009. formulated as LTL Model-Checking problems based on the twin-plant construction. The effectiveness and A. Boussif and M. Ghazel. Diagnosability analysis scalability of the proposed approach are experimentally of input/output discrete event system using model evaluated through a railway case-study. checking. The 5th International Workshop on Dependable Control of Discrete Systems (DCDS’15), This work falls within the scope of our activities on 48(7):71–78, 2015. reformulating issues related to fault diagnosis in a model-checking framework. We have already studied A. Grastien. Symbolic testing of diagnosability. 20th the case of permanent failures and we wish, on International Workshop on Principles of Diagnosis one hand, to extend our study to deal with more (DX-09), 2009. complex classes of faults such as repeated failures, supervision patterns in untimed and timed domains. On F. Peres and M. Ghazel. An operative formulation of the other hand, we intend to evaluate the efficiency the diagnosability of discrete event systems using a of advanced formal verification techniques such as single logical framework. The 8th Int. Workshop abstraction techniques, SAT analysis, on-the-fly model- on Verification and Evaluation of Computer and checking and also develop optimization techniques for communication Systems, pages 1–11, 2014. A. Grastien and A. Anbulagan. Diagnosis of O. Contant, S. Lafortune, and D. Teneketzis. Diagnosis discrete-event systems using satisfiability algorithms. of intermittent faults. Discrete Event Dynamic Proceedings of the National Conference on Artificial Systems: Theory and Applications, 14(2):171–202, Intelligence, 2007. 2004. A. Grastien. Incremental diagnosis of des by M.V. Moreira, T.C. Jesus, and J.C. Basilio. Polynomial satisfiability. Proceedings of the conference on ECAI, time verification of decentralized diagnosability of 2008. discrete event systems. IEEE Transactions on Automatic Control, 56(7):1679–1684, 2011. S. Jiang, R. Kumar, and H. E. Garcia. Diagnosis of repeated / intermittent failures in discrete event B. Liu. An efficient approach for diagnosability and systems. IEEE Transactions on Robotics and diagnosis of DES based on LPN. PhD thesis, Univ. Automation, 19(2):310–323, 2003. de Lille1, 2014. T. S. Yoo and H. E. Garcia. Event diagnosis of discrete- M. Bozzano, R. Cavada, A. Cimatti, M. Dorigatti, event systems with uniformly and nonuniformly A. Griggio, and A. et al. Mariotti. nuXmv 1.0 User bounded diagnosis delays. American Control manual. 2014. Conference, 6:5102–5107, 2004. M. Ghazel and B. Liu. A customizable benchmark C. Zhou and R. Kumar. Computation of diagnosable to deal with fault diagnosis issues in DES. 13th fault-Occurrence indices for systems with repeatable- International Workshop on Discrete Event System faults. IEEE Transactions on automatic control, 54 (WODES 2016), 2016. (7):1477–1490, 2009. O. Contant. Failure diagnosis of discrete event system: the case of intermittent faults. International conference on decision and control, 4:4006–4017, 2002. O. Contant. On monitoring and diagnosing classes of discrete event systems. PhD Thesis, University of Michigan, 2005. A. Correcher, E. Garcia, F. Morant, and E. Quiles. Intermittent failure diagnosis in industrial processes. 2:723–728, 2003. S. Soldani, M. Combacau, A. Subias, and J. Thomas. Intermittent fault diagnosis: a diagnoser derived from the normal behavior. International workshop principles diagnosis, pages 391–399, 2007. S. Soldani, M. Combacau, A. Subias, and J. Thomas. Intermittent fault detection through message ex- changes: a coherence based approach. International workshop principles diagnosis, pages 251–257, 2006. S. Biswas. Diagnosability of discrete event systems for temporary failures. Computers & Electrical Engineering, 38(6):1534–1549, 2012. D. Guanqian, Q. Jing, L. Guanjun, and L. Kehong. A discrete event systems approach to discriminating intermittent from permanent faults. C.G. Cassandras and S. Lafortune. Introduction to discrete event systems. 2008. L.K. Carvalho, J.C. Basilio, and M. V. Moreira. Robust diagnosis of discrete event systems against intermittent loss of observations. Automatica, 48(9): 2068–2078, 2012.