                          Proceedings of the 26th International Workshop on Principles of Diagnosis




                     A Framework For Assessing Diagnostics Model Fidelity

                                     Gregory Provan1 and Alex Feldman2
                    1
                        Computer Science Department, University College Cork, Cork, Ireland
                                           e-mail: g.provan@cs.ucc.ie
                                     2
                                       PARC Inc., Palo Alto, CA 94304, USA
                                          e-mail: afeldman@parc.com


Abstract

"All models are wrong but some are useful" [1]. We address the problem of identifying which diagnosis models are more useful than others. Models are critical to diagnostics inference, yet little work exists to compare models. We define the role of models in diagnostics inference, propose metrics for models, and apply these metrics to a tank benchmark system. Given the many approaches possible for model metrics, we argue that only information-theoretic methods address how well a model mimics real-world data. We focus on some well-known information-theoretic modelling metrics, demonstrating the trade-offs that can be made between different models for a tank benchmark system.

1 Introduction

A core goal of Model-Based Diagnostics (MBD) is to accurately diagnose a range of systems in real-world applications. There has been significant progress in developing algorithms for systems of increasing complexity. A key area where further work is needed is scaling up to real-world models, as multiple-fault diagnostics algorithms are currently limited by the size and complexity of the models to which they can be applied. In addition, there is still a great need for defining metrics to measure diagnostics accuracy, the computational complexity of inference, and the models' contribution to inference complexity.

This article addresses the modeling side of MBD: we focus on methods for measuring the size and complexity of MBD models. We explore the role that diagnostics model fidelity can play in being able to generate accurate diagnostics. We characterise model fidelity and examine the trade-offs of fidelity and inference complexity within the overall MBD inference task.

Model fidelity is a crucial issue in diagnostics [2]: models that are too simple can be inaccurate, yet highly detailed and complex models are expensive to create, have many parameters that require significant amounts of data to estimate, and are computationally intensive to perform inference on. There is an urgent need to incorporate inference complexity within modelling, since even relatively simple models, such as some of the combinational ISCAS-85 benchmark models, pose computational challenges to even the most advanced solvers for multiple-fault tasks. In addition, higher-fidelity models can actually perform worse than lower-fidelity models on real-world data, as can be explained using over-fitting arguments within a machine learning framework.

To our knowledge, there is no theory within Model-Based Diagnostics that relates the notions of model complexity, model accuracy, and inference complexity. To address these issues, we explore several of the factors that contribute to model complexity, as well as a theoretically sound approach for selecting models based on their complexity and diagnostics performance, i.e., their accuracy in diagnosing faults.

Our contributions are as follows:
• We characterise the task of selecting a diagnosis model of appropriate fidelity as an information-theoretic model selection task.
• We propose several metrics for assessing the quality of a diagnosis model, and derive approximation versions of a subset of these metrics.
• We use a dynamical-systems benchmark model to demonstrate how the metrics assess models relative to the accuracy of the diagnostics output produced using those models.

2 Related Work

This section reviews work related to our proposed approach.

Model-Based Diagnostics: There is some seminal work on modelling principles within the Model-Based Diagnosis (MBD) community, e.g., [2; 3]; this early work adopts an approach based on logic or qualitative physics for model specification. However, this work provides no means for comparing models in terms of diagnostics accuracy. More recent work ([4]) provides a logic-based specification of model fidelity. There is also work specifying metrics for diagnostics accuracy, e.g., [5].

However, none of this work defines precise metrics for computing both diagnostics accuracy and model complexity, and their trade-offs. This article adopts a theoretically well-founded approach for integrating multiple MBD metrics.

Multiple-Fidelity Modeling: There is limited work describing the use of models of multiple levels of fidelity. Examples of such work include [6; 7; 8]. In this article we focus on methods for evaluating multi-fidelity models and their impact on diagnostics accuracy, as opposed to developing methodologies for modelling at multiple levels of fidelity.

Multiple-Mode Modeling: One approach to MBD is to use a separate model for every failure mode, rather than to
define a model containing all failure modes. Examples of this approach include [9; 10; 11; 12]. Note that this work does not specify metrics for computing both diagnostics accuracy and model complexity, or their trade-offs.

Model Selection: The metrics that we adopt and extend have been used extensively to compare different models, e.g., [13]. These metrics are used to compare the simulation performance of models only. In contrast, we extend this framework to examine diagnostics performance. In the process, we explore the use of multiple loss functions for penalising models, in addition to the standard penalty functions based on the number of model parameters.

Model-Order Reduction: Model-order reduction [14] aims to reduce the complexity of a model while limiting the performance losses of the reduced model. The reduction methods are theoretically well-founded, although they are highly domain-specific. In contrast to this approach, we assume a model-composition approach from a component library containing hand-constructed models of multiple levels of fidelity.

3 Diagnostics Modeling and Inference

This section formalises the notion of a diagnostics model within the process of diagnostics inference. We first introduce the task, and then define it more precisely.

3.1 Diagnosis Task

Assume that we have a system S that can operate in a nominal state, ξN, or a faulty state, ξF, where Ξ is the set of possible states of S. We further assume that we have a discrete vector of measurements, Ỹ = {ỹ1, ..., ỹn}, observed at times t = {1, ..., n}, that summarizes the response of the system S to control variables U = {u1, ..., un}. Let Yφ = {y1, ..., yn} denote the corresponding predictions from a dynamic (nonlinear) model, φ, with parameter values θ: this can be represented by Yφ = φ(x0, θ, ξ, Ũ), where x0 signifies the initial state of the system at t0.

We assume that we have a prior probability distribution P(Ξ) over the states Ξ of the system. This distribution denotes the likelihood of the failure states of the system.

We define a residual vector R(Ỹ, Yφ) to capture the difference between the actual and the model-simulated system behaviour. An example of a residual vector is the mean-squared error (MSE). We assume a fixed diagnosis task T throughout this article, e.g., computing the most likely diagnosis, or a deterministic multiple-fault diagnosis.

The classical definition of diagnosis is as a state-estimation task, whose objective is to identify the system state that minimises the residual vector:

    ξ* = argmin_{ξ∈Ξ} R(Ỹ, Yφ).    (1)

Since this is a minimisation task, we typically need to run multiple simulations over the space of parameters and modes to compute ξ*. We can abstract this process as performing model inversion, i.e., computing some ξ* = φ⁻¹(x0, θ, ξ, Ũ) that minimises R(Ỹ, Yφ).

During this diagnostics inference task, a model φ can play two roles: (a) simulating a behaviour to estimate R(Ỹ, Yφ); (b) enabling the computation of ξ* = φ⁻¹(x0, θ, ξ, Ũ). It is clear that diagnostics inference requires a model that has good fidelity and is computationally efficient at performing these two roles.

We generalise that notion to incorporate inference efficiency as well as accuracy. We can define an inference complexity measure as C(Ỹ, φ). We can then define our diagnosis task as jointly minimising a function g that incorporates the accuracy (based on the residual function) and the inference complexity:

    ξ* = argmin_{ξ∈Ξ} g(R(Ỹ, Yφ), C(Ỹ, φ)).    (2)

Here g specifies a loss or penalty function that induces a non-negative real-valued penalty based on the lack of accuracy and the computational cost.

In forward simulation, a model φ, with parameters θ, can generate multiple observations Ỹ = {ỹ1, ..., ỹn}. The diagnostics task involves performing the inverse operation on these observations. Our objective thus involves optimising the state-estimation task over a future set of observations, Ỹ = {Ỹ1, ..., Ỹn}. Our model φ and inference algorithm A have different performance based on Ỹi, i = 1, ..., n: for example, [15] shows that both inference accuracy and inference time vary based on the fault cardinality. As a consequence, to compute ξ* we want to optimise the mean performance over future observations. This notion of mean-performance optimisation has been characterised using the Bayesian model selection approach, which we examine in the following section.

3.2 Diagnosis Model

We specify a diagnosis model as follows:

Definition 1 (Diagnosis Model). We characterise a Diagnosis Model φ using the tuple ⟨V, θ, Ξ, E⟩, where
• V is a set of variables, consisting of variables denoting the system state (X), control (U), and observations (Y).
• θ is a set of parameters.
• Ξ is a set of system modes.
• E is a set of equations, with a subset Eξ ⊆ E for each mode ξ ∈ Ξ.

We will assume that we can use a physics-based approach to hand-generate a set E of equations to specify a model. Obtaining good diagnostics accuracy, given a fixed E, entails estimating the parameters θ to optimise that accuracy.

3.3 Running Example: Three-Tank Benchmark

In this paper, we use the three-tank system shown in Fig. 1 to illustrate our approach. The three tanks are denoted T1, T2, and T3. Each tank has the same area, A1 = A2 = A3. For i = 1, 2, 3, tank Ti has height hi, a pressure sensor pi, and a valve Vi that controls the flow of liquid out of Ti. We assume that gravity g = 10 and the liquid has density ρ = 1.

Tank T1 gets filled from a pipe, with measured flow q0. Using Torricelli's law, the model can be described by the following non-linear equations:

    dh1/dt = (1/A1) [−κ1 √(h1 − h2) + q0],    (3)
    dh2/dt = (1/A2) [κ1 √(h1 − h2) − κ2 √(h2 − h3)],    (4)
    dh3/dt = (1/A3) [κ2 √(h2 − h3) − κ3 √h3].    (5)
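Equations (3)-(5) are simple enough to integrate directly. The following Python sketch steps them forward with explicit Euler and returns the simulated pressure observations Yφ; the function name and the parameter values in the usage example (Ai = 1, κi = 1, q0 = 0.5) are our own illustrative assumptions, not the settings used in the paper's experiments.

```python
import math

def simulate_three_tank(theta, q0, dt=0.01, steps=20000, g=10.0):
    """Forward-Euler integration of the nominal three-tank model (eqs. 3-5).

    theta = ((A1, A2, A3), (kappa1, kappa2, kappa3)); returns the list of
    simulated pressure observations (p1, p2, p3) = (g*h1, g*h2, g*h3).
    """
    (A1, A2, A3), (k1, k2, k3) = theta
    h1 = h2 = h3 = 0.0                      # tanks start empty
    trajectory = []
    for _ in range(steps):
        # Torricelli outflow terms; max(., 0) guards the square root
        # against tiny negative arguments from discretisation error.
        f12 = k1 * math.sqrt(max(h1 - h2, 0.0))
        f23 = k2 * math.sqrt(max(h2 - h3, 0.0))
        f3 = k3 * math.sqrt(max(h3, 0.0))
        h1 += dt * (-f12 + q0) / A1         # eq. (3)
        h2 += dt * (f12 - f23) / A2         # eq. (4)
        h3 += dt * (f23 - f3) / A3          # eq. (5)
        trajectory.append((g * h1, g * h2, g * h3))
    return trajectory
```

With Ai = 1, κi = 1, and q0 = 0.5, the simulated pressures settle onto the analytic steady state, q0 = κ1√(h1 − h2) = κ2√(h2 − h3) = κ3√h3, i.e., h = (0.75, 0.5, 0.25) and hence p ≈ (7.5, 5.0, 2.5).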


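The model-inversion reading of eq. (1) can also be made concrete: simulate the model under each candidate fault hypothesis and keep the hypothesis with the smallest residual. The sketch below recovers a multiplicative fault factor β1 on κ1 (mirroring the valve-fault parameterisation κi(1 + βi) used later for the fault model) by exhaustive enumeration over a candidate grid; the helper names, simulator settings, and the grid itself are illustrative assumptions rather than the paper's experimental setup.

```python
import math

def simulate(kappa, q0=0.5, A=1.0, g=10.0, dt=0.01, steps=2000):
    """Euler simulation of eqs. (3)-(5) with unit areas; returns pressure traces."""
    h = [0.0, 0.0, 0.0]
    ys = []
    for _ in range(steps):
        f12 = kappa[0] * math.sqrt(max(h[0] - h[1], 0.0))
        f23 = kappa[1] * math.sqrt(max(h[1] - h[2], 0.0))
        f3 = kappa[2] * math.sqrt(max(h[2], 0.0))
        h[0] += dt * (-f12 + q0) / A
        h[1] += dt * (f12 - f23) / A
        h[2] += dt * (f23 - f3) / A
        ys.append([g * hi for hi in h])
    return ys

def mse_residual(y_obs, y_sim):
    """Mean-squared-error residual R(Y~, Y_phi) over all sensors and times."""
    total = sum((a - b) ** 2
                for yo, ysim in zip(y_obs, y_sim)
                for a, b in zip(yo, ysim))
    return total / (3 * len(y_obs))

def diagnose_beta1(y_obs, kappa_nom, candidates):
    """Eq. (1) by enumeration: the beta1 whose faulty model kappa1*(1+beta1)
    best explains the observed pressures."""
    return min(candidates,
               key=lambda b: mse_residual(
                   y_obs,
                   simulate((kappa_nom[0] * (1 + b),
                             kappa_nom[1], kappa_nom[2]))))
```

For example, pressure traces generated with κ1 halved (a 50% blockage, β1 = −0.5) are recovered from the grid {−0.75, −0.5, −0.25, 0, 0.25}, since the matching hypothesis drives the residual to zero.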

Figure 1: Diagram of the three-tank system (inflow q0 into T1; valves V1, V2, V3; pressure sensors p1, p2, p3).

In eq. 3, the coefficient κ1 denotes a parameter that captures the product of the cross-sectional area of the tank A1, the area of the drainage hole, a gravity-based constant (√(2g)), and the friction/contraction factor of the hole. κ2 and κ3 can be defined analogously.

Finally, the pressure at the bottom of each tank is obtained from the height: pi = g hi, where i is the tank index (i ∈ {1, 2, 3}).

We emphasize the use of the κi, i = 1, 2, 3, because we will use these parameter values as a means for "diagnosing" our system in terms of changes in κi. Consider a physical valve R1 between T1 and T2 that constrains the flow between the two tanks. We can say that the valve changes proportionally the cross-sectional drainage area of q1, and hence κ1. The diagnostic task will be to compute the true value of κ1, given p1, and from κ1 we can compute the actual position of the valve R1.

We now characterise our nominal model in terms of Definition 1:
• The variables V consist of variables denoting the system state (X = {h1, h2, h3}), control (U = {q0, V1, V2, V3}), and observations (Y = {p1, p2, p3}).
• θ = {{A1, A2, A3}, {κ1, κ2, κ3}} is the set of parameters.
• Ξ consists of a single nominal mode.
• E is the set of equations given by equations 3 through 5.

Note that this model has a total of 6 parameters.

Fault Model: In this article we focus on valve faults, where a valve can have a blockage or a leak. We model this class of faults by including in equations 3 to 5 an additive parameter β, which is applied to the parameter κ, i.e., as κi(1 + βi), i = 1, 2, 3, where −1 ≤ βi ≤ 1/κi − 1. β > 0 corresponds to a leak, such that β ∈ (0, 1/κ − 1]; β < 0 corresponds to a blockage, such that β ∈ [−1, 0). The fault equations can be written as:

    dh1/dt = (1/A1) [−κ1(1 + β1) √(h1 − h2) + q0],    (6)
    dh2/dt = (1/A2) [κ1(1 + β1) √(h1 − h2) − κ2(1 + β2) √(h2 − h3)],
    dh3/dt = (1/A3) [κ2(1 + β2) √(h2 − h3) − κ3(1 + β3) √h3].

The fault equations allow faults for any combination of the valves {V1, V2, V3}, resulting in the system modes Ξ = {ξN, ξ1, ξ2, ξ3, ξ12, ξ13, ξ23, ξ123}, where ξN is the nominal mode, and ξ· is the mode in which the valves in the subscript (a combination drawn from {1, 2, 3}) are faulty. This fault model has 9 parameters.

4 Modelling Metrics

This section describes the metrics that can be applied to estimate properties of a diagnosis model. We describe two types of metrics, dealing with accuracy (fidelity) and complexity.

4.1 Model Accuracy

Model accuracy concerns the ability of a model to mimic a real system. From a diagnostics perspective, this translates to the use of a model to simulate behaviours that distinguish nominal and faulty behaviours sufficiently well that appropriate fault-isolation algorithms can identify the correct type of fault when it occurs. As such, a diagnostics model needs to be able to simulate behaviours for multiple modes with "appropriate" fidelity.

Note that we distinguish model accuracy from diagnosis inference accuracy. As noted above, model accuracy concerns the ability of a model to mimic a real system through simulation, and to assist in diagnostics isolation. Diagnosis inference accuracy concerns being able to isolate the true fault given an observation and the simulation output of a model.

A significant challenge for a diagnosis model is the need to simulate behaviours for multiple modes. Two approaches that have been taken are to use a single model with multiple modes explicitly defined (a multi-mode approach), or to use multiple models [9; 16; 17], each of which is optimised for a single mode or a small set of modes (a multi-model approach).

The AI-based MBD approach typically uses a single model φ with multiple modes explicitly defined [18], or a single model with just the nominal behaviour [19]. From a diagnostics perspective, accuracy must be defined with respect to the task T. We adopt here the task of computing the most likely diagnosis.

Given evidence suggesting that model fidelity for a multi-mode approach varies depending on the mode, it is important to explicitly consider the mean performance of φ over the entire observation space Y (the space of possible observations of the system).

In this article we adopt the expected-residual approach: given a space Y = {Ỹ1, ..., Ỹn} of observations, the expected residual is the average over the n observations, as given by R̄ = (1/n) Σ_{i=1}^{n} R(Ỹi, Yφ).

4.2 Model Complexity

At present, there is no commonly accepted definition of model complexity, whether the model is used purely for simulation or for diagnostics or control. Defining the complexity of a model is inherently tricky, due to the number of factors involved.

Less complex models are often preferred either due to their low computational simulation costs [20], or to minimise model over-fitting given the observed data [21; 22]. Given the task of simulating a variable of interest conditioned by certain future values of the input (control) variables, over-fitting can lead to high uncertainty in creating accurate simulations. Over-fitting is especially severe when we have limited observation variables for generating a model representing the underlying process dynamics. In contrast, models with low
parameter dimensionality (i.e., fewer parameters) are considered less complex and hence are associated with low prediction uncertainty [23].

Several approaches have been used, based on factors such as (a) the number of variables [24], (b) model structure [25], (c) the number of free parameters [23], (d) the number of parameters that the data can constrain [26], (e) a notion of model weight [27], or (f) the type and order of the equations for a non-linear dynamical model [14], where type corresponds to non-linear, linear, etc., and the order is such that a k-th-order system has k-th derivatives in E.

Factors that contribute to the true cost of a model include: (a) model generation; (b) parameter estimation; and (c) simulation complexity, i.e., the computational expense (in terms of CPU-time and memory) needed to simulate the model given a set of initial conditions. Rather than try to formulate this notion in terms of the number of model variables or parameters, or a notion of model structural complexity, we specify model complexity in terms of a measure based on parameter estimation and inference complexity, assuming a construction cost of zero.

A thorough analysis of model complexity will need to take into consideration the model equation class, since model complexity is class-specific. For example, for non-linear dynamical models, complexity is governed by the type and order of the equations [14]. In contrast, for linear dynamical models, which have only matrices and variables in the equations (no derivatives), it is the order of the matrices that determines complexity. In this article, we assume that models are of appropriate complexity, and hence do not address model-order reduction techniques [14], which aim to generate lower-dimensional systems that trade off fidelity for reduced model complexity.

4.3 Diagnostics Model Selection Task

The model in this model-selection problem corresponds to a system with a single mode. Given a space Φ of possible models, we can define this model-selection task as follows:

    φ* = argmin_{φ∈Φ} [g1(R(Ỹ, Yφ)) + g2(C(Ỹ, φ))],    (7)

adopting the simplifying assumption that our loss function g is additively decomposable.

4.4 Information-Theoretic Model Complexity

The Information-Theoretic (or Bayesian) model-complexity approach, which is based on the model likelihood, measures whether the increased "complexity" of a model with more parameters is justified by the data. The Information-Theoretic approach chooses a model (and a model structure) from a set of competing models (from the set of corresponding model structures, respectively) such that the value of a Bayesian criterion is maximized (or the prediction uncertainty in choosing a model structure is minimized).

The Information-Theoretic approach addresses prediction uncertainty by specifying an appropriate likelihood function. In other words, it specifies the probability with which the observed values of a variable of interest are generated by a model. The marginal likelihood of a model structure, which represents a class of models capturing the same processes (and hence having the same parameter dimensionality), is obtained by integrating over the prior distribution of the model parameters; this measures the prediction uncertainty of the model structure [28].

Statistical model selection is commonly based on Occam's parsimony principle (ca. 1320), namely that hypotheses should be kept as simple as possible. In statistical terms, this is a trade-off between bias (the distance between the average estimate and the truth) and variance (the spread of the estimates around the truth).

The idea is that by adding parameters to a model we obtain an improvement in fit, but at the expense of making the parameter estimates "worse", because we have less data (i.e., information) per parameter. In addition, the computations typically require more time. So the key question is how to identify how complex a model should be to work best for a given problem.

If the goal is to compute the likelihood of a given model φ(x0, θ, ξ, U), then θ and U are nuisance parameters. These parameters affect the likelihood calculation but are not what we want to infer. Consequently, these parameters should be eliminated from the inference. We can remove nuisance parameters by assigning them prior probabilities and integrating them out to obtain the marginal probability of the data given only the model, that is, the model likelihood (also called the integrative, marginal, or predictive likelihood). In equational form, this looks like:

    P(Y|φ) = ∫θ ∫U P(Y|φ, θ, U) P(θ, U|φ) dθ dU.

However, this multi-dimensional integral can be very difficult to compute, and it is typically approximated using computationally intensive techniques like Markov chain Monte Carlo (MCMC).

Rather than try to solve such a computationally challenging task, we adopt an approximation to the multi-dimensional integral. In the statistics literature, several decomposable approximations have been proposed.

Spiegelhalter et al. [26] have proposed a well-known such decomposable framework, termed the Deviance Information Criterion (DIC), which measures the number of model parameters that the data can constrain: DIC = D̄ + pD, where D̄ is a measure of fit (the expected deviance), and pD is a complexity measure, the effective number of parameters. The Akaike Information Criterion (AIC) [29; 30] is another well-known measure: AIC = −2L(θ̂) + 2k, where θ̂ is the Maximum-Likelihood Estimate (MLE) of θ and k is the number of parameters.

To compensate for a small sample size n, a variant of AIC, termed AICc, is typically used:

    AICc = −2L(θ̂) + 2k + 2k(k + 1)/(n − k − 1).    (8)

Another, computationally more tractable, approach is the Bayesian Information Criterion (BIC) [31]: BIC = −2L(θ̂) + k log n, where k is the number of estimable parameters, and n is the sample size (the number of observations). BIC was developed as an approximation to the log marginal likelihood of a model, and therefore the difference between two BIC estimates may be a good approximation to the natural log of the Bayes factor. Given equal priors for all competing models, choosing the model with the smallest BIC is equivalent to selecting the model with the maximum posterior probability. BIC assumes that the (parameters') prior is the unit-information prior (i.e., a multivariate normal prior with mean at the maximum-likelihood estimate and variance equal to the expected information matrix for one observation).

Wagenmakers [32] shows that one can convert the BIC
metric to

        BIC = n log(SSE / SStotal) + k log n,

where SSE is the sum of squares for the error term. In our experiments, we assume that the non-linear model is the "correct" model (or the null hypothesis H0), and either the linear or qualitative models are the competing model (or alternative hypothesis H1). Hence we use BIC to compare the non-linear model to each of the competing models.
   Suppose that we obtain the BIC values for the alternative and the correct models, using the relevant SS terms. When computing ∆BIC = BIC(H1) − BIC(H0), note that both the null (H0) and the alternative hypothesis (H1) models share the same SStotal term (both models attempt to explain the same collection of scores), although they differ with respect to SSE. The SStotal term common to both BIC values cancels out in computing ∆BIC, producing

        ∆BIC = n log(SSE1 / SSE0) + (k1 − k0) log n,        (9)

where SSE1 and SSE0 are the sums of squares for the error terms in the alternative and the null hypothesis models, respectively.

5 Experimental Design
This section compares three tank benchmark models according to various model-selection measures. We adopt the non-linear model as our "correct" model. We will examine the fidelity and complexity tradeoffs of two simpler models over a selection of failure scenarios.
   The diagnostic task will be to compute the fault state of the system, given an injected fault, which is one of (ξN, ξB, ξP), denoting nominal, blocked and passing valves, respectively. This translates to different tasks given the different models.

non-linear model: estimate the true value of κ1 given p1, which corresponds to a most-likely failure mode assignment of one of (ξN, ξB, ξP).

linear model: estimate the true value of κ1 given p1, which corresponds to a most-likely failure mode assignment of one of (ξN, ξB, ξP).

qualitative model: estimate the failure mode assignment of one of (ξN, ξB, ξP).

5.1 Alternative Models
This section describes the two alternative models that we compare to the non-linear model: a linear and a qualitative model.

Linear Model
We compare the non-linear model with a linearised version. We can perform this linearisation in a variety of ways [33]. In this simple tank example, we can perform the linearisation directly, by replacing non-linear operators with linear ones, as shown below.
   Nominal Model  We can linearise the non-linear 3-tank model by replacing the non-linear sub-function √(hi − hj) with the linear sub-function γij(hi − hj), where γij is a parameter (to be estimated) governing the flow between tanks i and j. The linear model has 4 parameters, γ12, γ12, γ23, γ3.
   Fault Model  The fault model introduces a parameter βi associated with κi, i.e., we replace κi with κi(1 + βi), i = 1, 2, 3, where −1 ≤ βi ≤ 1/κi − 1, i = 1, 2, 3. This model has 7 parameters, adding parameters β1, β2, β3.

Qualitative Model
Nominal Model  For this model we replace the non-linear sub-function √(hi − hj) with the qualitative sub-function M+(hi − hj), where M+ is the set of reasonable functions f such that f′ > 0 on the interior of its domain [34].
   The tank-heights are constrained to be non-negative, as are the parameters κi. As a consequence, we can discretize the hi to take on values {+, 0}, which means that M+(hi − hj) can take on values {+, 0, −}. The domain for dh1/dt must be {+, 0, −}, since the qualitative version of q0, Q, is non-negative (domain of {+, 0}) and each M+(hi − hj) can take on values {+, 0, −}. We see that this model has no parameters to estimate.
   Fault Model  The qualitative fault model has different M+ functions for the modes where the valve is passing and blocked. We derive these functions as follows. From a qualitative perspective, the domain of βi is {0, +} for a passing valve, and {−, 0} for a blocked valve. To create a new M+ function for the cases of passing and blocked valves, we qualitatively apply these corresponding domains to the standard M+ function with domain {−, 0, +} to obtain fault-based M+ functions: M+_P(hi − hj) denotes the M+ function when the valve is passing, and M+_B(hi − hj) denotes the M+ function when the valve is blocked.

5.2 Simulation Results
We have compared the simulation performance of the models under nominal and faulty conditions, considering faults to individual valves V1, V2 and V3, as well as double-fault combinations of the valves. In the following we present some plots of simulations of faults and fault isolation for different model types.
   Figure 2 shows the results from a single-fault scenario, where valve V1 is stuck at 50% at t = 250 s, based on the non-linear model. The plot from this simulation shows that at the time of the fault injection, the water level in tank T1 starts increasing while the water levels in tanks T2 and T3 start decreasing due to the lower inflow.

[Figure 2 (plot of p_1, p_2 and p_3 vs. time): Simulation with non-linear model for the scenario of a fault in valve 1 at t = 250 s]

   Table 1 shows the simulation error-difference between the non-linear and linear models, for the nominal case and the faulty case (where valve 1 is faulted). Given that we measure the pressure levels p1, p2 and p3 every second, we use the difference in these outputs to identify the sum-of-squared-error (SSE) values for the simulations.




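One way to operationalise the qualitative M+ abstraction described in Section 5.1 is as a sign algebra over the discretized tank levels. This is our own illustrative sketch, not the paper's implementation; in particular it assumes f(0) = 0 for the functions in M+, which goes beyond the stated f′ > 0 condition:

```python
# Qualitative sign algebra for the M+ abstraction. A monotonically
# increasing function (f' > 0) with f(0) = 0 preserves the sign of its
# argument, so M+(h_i - h_j) reduces to sign(h_i - h_j). Since each
# qualitative level is only known to be in {'0', '+'}, the difference of
# two levels is ambiguous in general, so we return the SET of possible
# qualitative values.

def m_plus(hi: str, hj: str) -> set:
    """Possible qualitative values of M+(hi - hj), for hi, hj in {'0', '+'}."""
    if hi == '0' and hj == '0':
        return {'0'}            # both tanks empty: no flow
    if hi == '+' and hj == '0':
        return {'+'}            # flow from tank i to tank j
    if hi == '0' and hj == '+':
        return {'-'}            # flow from tank j to tank i
    # hi == '+' and hj == '+': both positive, the difference is unknown
    return {'-', '0', '+'}
```

The ambiguity in the last case is exactly why the qualitative model needs no parameters: precision is traded away for a parameter-free abstraction.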


              p1        p2        p3       Total
Nominal     2600.3     316.2     118.1    3034.6
V1-fault    2583.1     347.5     137.2    3067.8

Table 1: Data for SSE values for simulations using Non-linear and Linear representations, given two scenarios: nominal and faulty (valve V1 at 50% after 250 s)

   Figure 3 shows the results for diagnosing the V1-fault using the non-linear model. We can see that the diagnostic accuracy is high, as P(V1) converges to almost 1 with little time lag.

[Figure 3 (plot of R_1 vs. time): Simulation of fault isolation of fault in valve 1 with non-linear model. The figure depicts the probability of valve 1 being faulty.]

   In contrast, Figure 4 shows the diagnostic accuracy and isolation time with a linear model. First, note that a false positive is identified early in the simulation, and the model incorrectly identifies both valves 2 and 3 as being faulty. This linear model thus delivers both poor diagnostic accuracy (classification errors) and poor isolation time (there is a lag between when the fault occurs and when the model identifies the fault). After the fault injection at t = 250 s, the predictive accuracy improves and the correct fault becomes the most likely fault.

[Figure 4 (plot of R_1, R_2 and R_3 vs. time): Simulation of fault isolation of fault in valve 1 with linear model. The figure depicts the probability of valves 1, 2 and 3 being faulty.]

   Figure 5 depicts the diagnostic performance with a mixed linear/non-linear model (T1 is non-linear, while T2 and T3 are linear). The diagnostic accuracy is almost the same as that of the non-linear model (cf. Figure 3), except for a false-positive detection at the beginning of the scenario.

[Figure 5 (plot of R_1, R_2 and R_3 vs. time): Simulation of fault isolation of fault in valve 1 with mixed non-linear/linear model (T1 non-linear and both T2 and T3 linear). The figure depicts the probability of valves 1, 2 and 3 being faulty.]

6 Experimental Results
This section describes our experimental results, summarising the data first and then discussing the implications of the results.

6.1 Model Comparisons
We have empirically compared the diagnostics performance of several multi-tank models. In our first set of experiments, we ran a simulation over 500 seconds, and induced a fault (valve V1 at 50%) after 250 s. The model combinations involved a non-linear (NL) model, a model (denoted M) with tank T1 being linear (and other tanks non-linear), a fully linear model (denoted L), and a qualitative model (denoted Q).
   To compare the relative performance of the models, we compute a measure of diagnostics error (or loss), using the difference between the true fault (which is known for each simulation) and the computed fault. We denote the true fault existing at time t using the pair (ω, t); the computed fault at time t is denoted using the pair (ω̂, t̂). The inference system that we use, LNG [35], computes an uncertainty measure associated with each computed fault, denoted P(ω̂). Hence, we define a measure of diagnostics error over a time window [0, T] using

        γ1 = Σ_{t=0}^{T} Σ_{ξ∈Ξ} |P(ω̂t) − ωt|,        (10)

where Ξ is the set of failure modes for the model, and ωt denotes ω at time t.
   Our second metric covers the fault latency, i.e., how quickly the model identifies the true fault (ω, t): γ2 = t − t̂.
   Table 2 summarises our results. The first columns compare the number of parameters for the different models, followed by comparisons of the error (γ1) and the CPU-time (γ2). The data show that the error (γ1) does not grow very much as we increase model size, but it increases as we decrease model fidelity from non-linear through to qualitative models. In contrast, the CPU-time (a) increases as we increase model size, and (b) is proportional to model fidelity, i.e., it decreases as we decrease model fidelity from non-linear through to qualitative models.
   In a second set of experiments, we focused on multiple model types for a 3-tank system, with simulations running over 50 s, and we induced a fault (valve V1 at 50%) after 25 s. The model combinations involved a non-linear (NL) model, a model with tank 3 linear (and other tanks non-linear), a model with tanks 2 and 3 linear and tank 1 non-linear, a fully linear model, and a qualitative model. Table 3 summarises our results.
   The data show that, as model fidelity decreases, the error γ1 increases significantly and the inference times γ2 decrease modestly. If we examine the outputs from AICc, we see that the best model is the mixed model (T3-linear). BIC






indicates the qualitative model as the best; it is worth noting that BIC typically will choose the simplest model.

        Tanks           2        3        4
 # Parameters  NL       7        9       11
               M        6        8       10
               L        5        7        9
               Q        2        3        4
      γ1       NL     242      242      242
               M      997     1076     1192
               L     1236     1288     1342
               Q     3859     3994     4261
      γ2       NL   10.59     23.7     39.5
               M     8.52    17.96     34.6
               L     6.11    10.57     32.0
               Q     4.64     7.31     26.4

Table 2: Data for 2-, 3-, and 4-tank models using Non-linear (NL), Mixed (M), Linear (L) and Qualitative (Q) representations

                      γ1       γ2     AICc      BIC
   Non-Linear        0.97     23.7    29.45     43.7
   T3-linear         3.12    17.96    26.77     42.9
   T2, T3-linear    21.96    13.21    31.12    39.56
   Linear           77.43    10.57    35.76    37.55
   Qualitative     304.41     9.74    43.01    29.13

Table 3: Data for 3-tank model, using Non-linear, Mixed, Linear and Qualitative representations, given a fault (valve V1 at 50%) after 25 s

6.2 Discussion
Our results show that MBD is a complex task with several conflicting factors.

  • The diagnosis error γ1 is inversely proportional to model fidelity, given a fixed diagnosis task.
  • The error γ1 increases with fault cardinality.
  • The CPU-time γ2 increases with model size (i.e., number of tanks).

   This article has introduced a framework that can be used to trade off the different factors governing MBD "accuracy". We have shown how one can extend a set of information-theoretic metrics to combine these competing factors in diagnostics model selection. Further work is necessary to identify how best to extend the existing information-theoretic metrics to suit the needs of different diagnostics applications, as it is likely that the "best" model may be domain- and task-specific.
   It is important to note that we conducted experiments with un-calibrated models, and we have ignored the cost of calibration in this article. The literature suggests that linear models can be calibrated to achieve good performance, although performance inferior to that of calibrated non-linear models. This class of qualitative models does not possess calibration factors, so calibration will not improve their performance.

7 Conclusions
This article has presented a framework for evaluating the competing properties of models, namely fidelity and computational complexity. We have argued that model performance needs to be evaluated over a range of future observations, and hence we need a framework that considers the expected performance. As such, information-theoretic methods are well suited.
   We have proposed some information-theoretic metrics for MBD model evaluation, and conducted some preliminary experiments to show how these metrics may be applied. This work thus constitutes a start to a full analysis of model performance. Our intention is to initiate a more formal analysis of modeling and model evaluation, since there is no framework in existence for this task. Further, the experiments are only preliminary, and are meant to demonstrate how a framework can be applied to model comparison and evaluation.
   Significant work remains to be done, on a range of fronts. First, a thorough empirical investigation of diagnostics modeling is needed. Second, the real-world utility of our proposed framework needs to be determined. Third, a theoretical study of the issues of mode-based parameter estimation and its use for MBD is necessary.

References
[1] George E. P. Box. Science and statistics. Journal of the American Statistical Association, 71:791–799, 1976.
[2] Peter Struss. What's in SD? Towards a theory of modeling for diagnosis. Readings in Model-Based Diagnosis, pages 419–449, 1992.
[3] Peter Struss. Qualitative modeling of physical systems in AI research. In Artificial Intelligence and Symbolic Mathematical Computing, pages 20–49. Springer, 1993.
[4] Nuno Belard, Yannick Pencolé, and Michel Combacau. Defining and exploring properties in diagnostic systems. System, 1:R2, 2010.
[5] Alexander Feldman, Tolga Kurtoglu, Sriram Narasimhan, Scott Poll, and David Garcia. Empirical evaluation of diagnostic algorithm performance using a generic framework. International Journal of Prognostics and Health Management, 1:24, 2010.
[6] Steven D. Eppinger, Nitin R. Joglekar, Alison Olechowski, and Terence Teo. Improving the systems engineering process with multilevel analysis of interactions. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 28(04):323–337, 2014.
[7] Sanjay S. Joshi and Gregory W. Neat. Lessons learned from multiple fidelity modeling of ground interferometer testbeds. In Astronomical Telescopes & Instrumentation, pages 128–138. International Society for Optics and Photonics, 1998.
[8] Roxanne A. Moore, David A. Romero, and Christiaan J. J. Paredis. A rational design approach to Gaussian process modeling for variable fidelity models. In ASME 2011 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, pages 727–740. American Society of Mechanical Engineers, 2011.


[9] Peter D. Hanlon and Peter S. Maybeck. Multiple-model adaptive estimation using a residual correlation Kalman filter bank. IEEE Transactions on Aerospace and Electronic Systems, 36(2):393–406, 2000.
[10] Redouane Hallouzi, Michel Verhaegen, Robert Babuška, and Stoyan Kanev. Model weight and state estimation for multiple model systems applied to fault detection and identification. In IFAC Symposium on System Identification (SYSID), Newcastle, Australia, 2006.
[11] Amardeep Singh, Afshin Izadian, and Sohel Anwar. Fault diagnosis of Li-Ion batteries using multiple-model adaptive estimation. In Industrial Electronics Society, IECON 2013 – 39th Annual Conference of the IEEE, pages 3524–3529. IEEE, 2013.
[12] Amardeep Singh Sidhu, Afshin Izadian, and Sohel Anwar. Nonlinear model based fault detection of lithium ion battery using multiple model adaptive estimation. In World Congress, volume 19, pages 8546–8551, 2014.
[13] Aki Vehtari, Janne Ojanen, et al. A survey of Bayesian predictive methods for model assessment, selection and comparison. Statistics Surveys, 6:142–228, 2012.
[14] Athanasios C. Antoulas, Danny C. Sorensen, and Serkan Gugercin. A survey of model reduction methods for large-scale systems. Contemporary Mathematics, 280:193–220, 2001.
[15] Alexander Feldman, Gregory M. Provan, and Arjan J. C. van Gemund. Computing observation vectors for max-fault min-cardinality diagnoses. In AAAI, pages 919–924, 2008.
[16] Amardeep Singh, Afshin Izadian, and Sohel Anwar. Nonlinear model based fault detection of lithium ion battery using multiple model adaptive estimation. In 19th IFAC World Congress, Cape Town, South Africa, 2014.
[17] Youmin Zhan and Jin Jiang. An interacting multiple-model based fault detection, diagnosis and fault-tolerant control approach. In Decision and Control, 1999. Proceedings of the 38th IEEE Conference on, volume 4, pages 3593–3598. IEEE, 1999.
[18] Peter Struss and Oskar Dressler. "Physical negation": integrating fault models into the general diagnostic engine. In IJCAI, volume 89, pages 1318–1323, 1989.
[19] Johan de Kleer, Alan K. Mackworth, and Raymond Reiter. Characterizing diagnoses and systems. Artificial Intelligence, 56(2):197–222, 1992.
[20] Elizabeth H. Keating, John Doherty, Jasper A. Vrugt, and Qinjun Kang. Optimization and uncertainty assessment of strongly nonlinear groundwater models with high parameter dimensionality. Water Resources Research, 46(10), 2010.
[21] Saket Pande, Mac McKee, and Luis A. Bastidas. Complexity-based robust hydrologic prediction. Water Resources Research, 45(10), 2009.
[22] G. Schoups, N. C. van de Giesen, and H. H. G. Savenije. Model complexity control for hydrologic prediction. Water Resources Research, 44(12), 2008.
[23] S. Pande, L. Arkesteijn, H. H. G. Savenije, and L. A. Bastidas. Hydrological model parameter dimensionality is a weak measure of prediction uncertainty. Natural Hazards and Earth System Sciences Discussions, 11, 2014.
[24] Martin Kunz, Roberto Trotta, and David R. Parkinson. Measuring the effective complexity of cosmological models. Physical Review D, 74(2):023503, 2006.
[25] Gregory M. Provan and Jun Wang. Automated benchmark model generators for model-based diagnostic inference. In IJCAI, pages 513–518, 2007.
[26] David J. Spiegelhalter, Nicola G. Best, Bradley P. Carlin, and Angelika van der Linde. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4):583–639, 2002.
[27] Jing Du. The "weight" of models and complexity. Complexity, 2014.
[28] Jasper A. Vrugt and Bruce A. Robinson. Treatment of uncertainty using ensemble methods: Comparison of sequential data assimilation and Bayesian model averaging. Water Resources Research, 43(1), 2007.
[29] Hirotugu Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723, 1974.
[30] Hirotugu Akaike. Likelihood of a model and information criteria. Journal of Econometrics, 16(1):3–14, 1981.
[31] G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461–466, 1978.
[32] Eric-Jan Wagenmakers. A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14(5):779–804, 2007.
[33] Pol D. Spanos. Linearization techniques for non-linear dynamical systems. PhD thesis, California Institute of Technology, 1977.
[34] Benjamin Kuipers and Karl Åström. The composition and validation of heterogeneous control laws. Automatica, 30(2):233–249, 1994.
[35] Alexander Feldman, Helena Vicente de Castro, Arjan van Gemund, and Gregory Provan. Model-based diagnostic decision-support system for satellites. In Proceedings of the IEEE Aerospace Conference, Big Sky, Montana, USA, pages 1–14, March 2013.



