=Paper=
{{Paper
|id=Vol-1507/dx15paper17
|storemode=property
|title=A Framework For Assessing Diagnostics Model Fidelity
|pdfUrl=https://ceur-ws.org/Vol-1507/dx15paper17.pdf
|volume=Vol-1507
|dblpUrl=https://dblp.org/rec/conf/safeprocess/ProvanF15
}}
==A Framework For Assessing Diagnostics Model Fidelity==
Proceedings of the 26th International Workshop on Principles of Diagnosis
Gregory Provan (1) and Alex Feldman (2)

(1) Computer Science Department, University College Cork, Cork, Ireland. E-mail: g.provan@cs.ucc.ie
(2) PARC Inc., Palo Alto, CA 94304, USA. E-mail: afeldman@parc.com
Abstract

"All models are wrong but some are useful" [1]. We address the problem of identifying which diagnosis models are more useful than others. Models are critical to diagnostics inference, yet little work exists on comparing models. We define the role of models in diagnostics inference, propose metrics for models, and apply these metrics to a tank benchmark system. Given the many approaches possible for model metrics, we argue that only information-theoretic methods address how well a model mimics real-world data. We focus on some well-known information-theoretic modelling metrics, demonstrating the trade-offs that can be made among different models for a tank benchmark system.

1 Introduction

A core goal of Model-Based Diagnostics (MBD) is to accurately diagnose a range of systems in real-world applications. There has been significant progress in developing algorithms for systems of increasing complexity. A key area where further work is needed is scaling up to real-world models, as multiple-fault diagnostics algorithms are currently limited by the size and complexity of the models to which they can be applied. In addition, there is still a great need for metrics that measure diagnostics accuracy, the computational complexity of inference, and the models' contribution to inference complexity.

This article addresses the modeling side of MBD: we focus on methods for measuring the size and complexity of MBD models. We explore the role that diagnostics model fidelity can play in generating accurate diagnostics. We characterise model fidelity and examine the trade-offs between fidelity and inference complexity within the overall MBD inference task.

Model fidelity is a crucial issue in diagnostics [2]: models that are too simple can be inaccurate, yet highly detailed and complex models are expensive to create, have many parameters that require significant amounts of data to estimate, and are computationally intensive to perform inference on. There is an urgent need to incorporate inference complexity within modelling, since even relatively simple models, such as some of the combinational ISCAS-85 benchmark models, pose computational challenges to even the most advanced solvers for multiple-fault tasks. In addition, higher-fidelity models can actually perform worse than lower-fidelity models on real-world data, as can be explained using over-fitting arguments within a machine-learning framework.

To our knowledge, there is no theory within Model-Based Diagnostics that relates notions of model complexity, model accuracy, and inference complexity. To address these issues, we explore several of the factors that contribute to model complexity, as well as a theoretically sound approach for selecting models based on their complexity and diagnostics performance, i.e., their accuracy in diagnosing faults.

Our contributions are as follows:
• We characterise the task of selecting a diagnosis model of appropriate fidelity as an information-theoretic model selection task.
• We propose several metrics for assessing the quality of a diagnosis model, and derive approximate versions of a subset of these metrics.
• We use a dynamical-systems benchmark model to demonstrate how the metrics assess models relative to the accuracy of the diagnostics output obtained using those models.

2 Related Work

This section reviews work related to our proposed approach.

Model-Based Diagnostics: There is some seminal work on modelling principles within the Model-Based Diagnosis (MBD) community, e.g., [2; 3]; this early work adopts an approach based on logic or qualitative physics for model specification. However, it provides no means for comparing models in terms of diagnostics accuracy. More recent work ([4]) provides a logic-based specification of model fidelity. There is also work specifying metrics for diagnostics accuracy, e.g., [5]. However, none of this work defines precise metrics for computing both diagnostics accuracy and model complexity, and their trade-offs. This article adopts a theoretically well-founded approach for integrating multiple MBD metrics.

Multiple-Fidelity Modeling: There is limited work describing the use of models at multiple levels of fidelity; examples include [6; 7; 8]. In this article we focus on methods for evaluating multi-fidelity models and their impact on diagnostics accuracy, as opposed to developing methodologies for modelling at multiple levels of fidelity.

Multiple-Mode Modeling: One approach to MBD is to use a separate model for every failure mode, rather than to define a model containing all failure modes.
Examples of this approach include [9; 10; 11; 12]. Note that this work does not specify metrics for computing both diagnostics accuracy and model complexity, or their trade-offs.

Model Selection: The metrics that we adopt and extend have been used extensively to compare different models, e.g., [13]. These metrics have been used only to compare the simulation performance of models; in contrast, we extend this framework to examine diagnostics performance. In the process, we explore the use of multiple loss functions for penalising models, in addition to the standard penalty functions based on the number of model parameters.

Model-Order Reduction: Model-order reduction [14] aims to reduce the complexity of a model while limiting the performance losses of the reduced model. The reduction methods are theoretically well-founded, although they are highly domain-specific. In contrast to this approach, we assume a model-composition approach based on a component library containing hand-constructed models at multiple levels of fidelity.

3 Diagnostics Modeling and Inference

This section formalises the notion of a diagnostics model within the process of diagnostics inference. We first introduce the task, and then define it more precisely.

3.1 Diagnosis Task

Assume that we have a system S that can operate in a nominal state, ξN, or a faulty state, ξF, where Ξ is the set of possible states of S. We further assume that we have a discrete vector of measurements, Ỹ = {ỹ1, ..., ỹn}, observed at times t = {1, ..., n}, that summarises the response of the system S to control variables U = {u1, ..., un}. Let Yφ = {y1, ..., yn} denote the corresponding predictions from a dynamic (nonlinear) model, φ, with parameter values θ: this can be represented by Yφ = φ(x0, θ, ξ, Ũ), where x0 denotes the initial state of the system at t0.

We assume that we have a prior probability distribution P(Ξ) over the states Ξ of the system. This distribution denotes the likelihood of the failure states of the system.

We define a residual vector R(Ỹ, Yφ) to capture the difference between the actual and model-simulated system behaviour. An example of a residual is the mean squared error (MSE). We assume a fixed diagnosis task T throughout this article, e.g., computing the most likely diagnosis, or a deterministic multiple-fault diagnosis.

The classical definition of diagnosis is as a state estimation task, whose objective is to identify the system state that minimises the residual vector:

    ξ* = argmin_{ξ∈Ξ} R(Ỹ, Yφ).   (1)

Since this is a minimisation task, we typically need to run multiple simulations over the space of parameters and modes to compute ξ*. We can abstract this process as performing model inversion, i.e., computing some ξ* = φ⁻¹(x0, θ, ξ, Ũ) that minimises R(Ỹ, Yφ).

During this diagnostics inference task, a model φ can play two roles: (a) simulating a behaviour to estimate R(Ỹ, Yφ); (b) enabling the computation of ξ* = φ⁻¹(x0, θ, ξ, Ũ). It is clear that diagnostics inference requires a model that has good fidelity and is computationally efficient for performing these two roles.

We generalise that notion to incorporate inference efficiency as well as accuracy. We can define an inference complexity measure as C(Ỹ, φ). We can then define our diagnosis task as jointly minimising a function g that incorporates the accuracy (based on the residual function) and the inference complexity:

    ξ* = argmin_{ξ∈Ξ} g( R(Ỹ, Yφ), C(Ỹ, φ) ).   (2)

Here g specifies a loss or penalty function that induces a non-negative real-valued penalty based on the lack of accuracy and the computational cost.
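To make the selection rules of Eqs. (1) and (2) concrete, the following sketch (Python, illustrative only) scores a set of candidate modes by simulating each one, computing a mean-squared-error residual against the observed data, and optionally adding a weighted inference-cost term. The callbacks simulate and inference_cost, the candidate set, and the weight are assumptions introduced here for illustration; they are not part of the formalism above.

    import numpy as np

    def mse_residual(y_obs, y_sim):
        # Residual R(Y~, Y_phi): mean squared error between observed and simulated outputs.
        return float(np.mean((np.asarray(y_obs) - np.asarray(y_sim)) ** 2))

    def select_mode(y_obs, candidate_modes, simulate, inference_cost=None, weight=0.0):
        # Return the mode minimising g = R + weight * C (Eq. 2).
        # With weight = 0 this reduces to the pure residual minimisation of Eq. (1).
        # simulate(mode) -> simulated output trajectory; inference_cost(mode) -> scalar C.
        scores = {}
        for mode in candidate_modes:
            penalty = weight * inference_cost(mode) if inference_cost else 0.0
            scores[mode] = mse_residual(y_obs, simulate(mode)) + penalty
        return min(scores, key=scores.get), scores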
In forward simulation, a model φ, with parameters θ, can generate multiple observations Ỹ = {ỹ1, ..., ỹn}. The diagnostics task involves performing the inverse operation on these observations. Our objective thus involves optimising the state estimation task over a future set of observations, Ỹ = {Ỹ1, ..., Ỹn}. Our model φ and inference algorithm A perform differently depending on Ỹi, i = 1, ..., n: for example, [15] shows that both inference accuracy and inference time vary with the fault cardinality. As a consequence, to compute ξ* we want to optimise the mean performance over future observations. This notion of mean-performance optimisation has been characterised using the Bayesian model selection approach, which we examine in the following section.

3.2 Diagnosis Model

We specify a diagnosis model as follows:

Definition 1 (Diagnosis Model). We characterise a Diagnosis Model φ using the tuple ⟨V, θ, Ξ, E⟩, where
• V is a set of variables, consisting of variables denoting the system state (X), control (U), and observations (Y).
• θ is a set of parameters.
• Ξ is a set of system modes.
• E is a set of equations, with a subset Eξ ⊆ E for each mode ξ ∈ Ξ.

We will assume that we can use a physics-based approach to hand-generate a set E of equations to specify a model. Obtaining good diagnostics accuracy, given a fixed E, entails estimating the parameters θ to optimise that accuracy.

3.3 Running Example: Three-Tank Benchmark

In this paper, we use the three-tank system shown in Fig. 1 to illustrate our approach. The three tanks are denoted T1, T2, and T3. Each tank has the same area, A1 = A2 = A3. For i = 1, 2, 3, tank Ti has height hi, a pressure sensor pi, and a valve Vi that controls the flow of liquid out of Ti. We assume that gravity g = 10 and the liquid has density ρ = 1.

Tank T1 is filled from a pipe, with measured flow q0. Using Torricelli's law, the model can be described by the following non-linear equations:

    dh1/dt = (1/A1) [ −κ1 √(h1 − h2) + q0 ],   (3)
    dh2/dt = (1/A2) [ κ1 √(h1 − h2) − κ2 √(h2 − h3) ],   (4)
    dh3/dt = (1/A3) [ κ2 √(h2 − h3) − κ3 √h3 ].   (5)
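As an illustration, the sketch below integrates Eqs. (3)-(5) with SciPy. The paper fixes g = 10; the tank areas, drainage coefficients κi, inflow q0 and initial heights used here are illustrative values only, and clipping negative height differences before taking square roots is a numerical safeguard added for this sketch, not part of the model above.

    import numpy as np
    from scipy.integrate import solve_ivp

    def three_tank_rhs(t, h, A, kappa, q0):
        # Right-hand side of Eqs. (3)-(5); square-root arguments are clipped at zero.
        h1, h2, h3 = np.maximum(h, 0.0)
        f12 = kappa[0] * np.sqrt(max(h1 - h2, 0.0))   # flow from T1 to T2
        f23 = kappa[1] * np.sqrt(max(h2 - h3, 0.0))   # flow from T2 to T3
        f3 = kappa[2] * np.sqrt(h3)                   # outflow from T3
        return [(-f12 + q0) / A[0], (f12 - f23) / A[1], (f23 - f3) / A[2]]

    g = 10.0                         # gravity constant used in the paper
    A = [1.0, 1.0, 1.0]              # equal tank areas (illustrative values)
    kappa = [0.5, 0.5, 0.5]          # drainage coefficients (illustrative values)
    q0 = 0.25                        # measured inflow into T1 (illustrative value)

    sol = solve_ivp(three_tank_rhs, (0.0, 500.0), [2.0, 1.0, 0.5],
                    args=(A, kappa, q0), max_step=1.0)
    pressures = g * sol.y            # p_i = g * h_i, one row per tank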
[Figure 1: Diagram of the three-tank system, showing the inflow q0, the three valves V1, V2, V3, and the pressure sensors p1, p2, p3.]

In Eq. (3), the coefficient κ1 denotes a parameter that captures the product of the cross-sectional area of the tank A1, the area of the drainage hole, a gravity-based constant (√(2g)), and the friction/contraction factor of the hole. κ2 and κ3 are defined analogously.

Finally, the pressure at the bottom of each tank is obtained from the height: pi = g hi, where i is the tank index (i ∈ {1, 2, 3}).

We emphasise the κi, i = 1, 2, 3, because we will use these parameter values as the means for "diagnosing" our system, in terms of changes in κi. Consider a physical valve R1 between T1 and T2 that constrains the flow between the two tanks. The valve changes proportionally the cross-sectional drainage area of the flow q1, and hence κ1. The diagnostic task will be to compute the true value of κ1, given p1, and from κ1 we can compute the actual position of the valve R1.

We now characterise our nominal model in terms of Definition 1:
• The variables V consist of variables denoting the system state (X = {h1, h2, h3}), control (U = {q0, V1, V2, V3}), and observations (Y = {p1, p2, p3}).
• θ = {{A1, A2, A3}, {κ1, κ2, κ3}} is the set of parameters.
• Ξ consists of a single nominal mode.
• E is the set of equations given by Eqs. (3) through (5).
Note that this model has a total of 6 parameters.

Fault Model: In this article we focus on valve faults, where a valve can have a blockage or a leak. We model this class of faults by including in Eqs. (3) to (5) an additive parameter βi, which is applied to the parameter κi, i.e., as κi(1 + βi), i = 1, 2, 3, where −1 ≤ βi ≤ 1/κi − 1. βi > 0 corresponds to a leak, such that βi ∈ (0, 1/κi − 1]; βi < 0 corresponds to a blockage, such that βi ∈ [−1, 0). The fault equations can be written as:

    dh1/dt = (1/A1) [ −κ1(1 + β1) √(h1 − h2) + q0 ],   (6)
    dh2/dt = (1/A2) [ κ1(1 + β1) √(h1 − h2) − κ2(1 + β2) √(h2 − h3) ],
    dh3/dt = (1/A3) [ κ2(1 + β2) √(h2 − h3) − κ3(1 + β3) √h3 ].

The fault equations allow faults for any combination of the valves {V1, V2, V3}, resulting in system modes Ξ = {ξN, ξ1, ξ2, ξ3, ξ12, ξ13, ξ23, ξ123}, where ξN is the nominal mode, and ξ· is the mode in which the valves indexed by · (a subset of {1, 2, 3}) are faulty. This fault model has 9 parameters.

4 Modelling Metrics

This section describes the metrics that can be applied to estimate properties of a diagnosis model. We describe two types of metrics, dealing with accuracy (fidelity) and complexity.

4.1 Model Accuracy

Model accuracy concerns the ability of a model to mimic a real system. From a diagnostics perspective, this translates to the use of a model to simulate behaviours that distinguish nominal and faulty behaviours sufficiently well that appropriate fault isolation algorithms can identify the correct type of fault when it occurs. As such, a diagnostics model needs to be able to simulate behaviours for multiple modes with "appropriate" fidelity.

Note that we distinguish model accuracy from diagnosis inference accuracy. As noted above, model accuracy concerns the ability of a model to mimic a real system through simulation, and to assist in diagnostics isolation. Diagnosis inference accuracy concerns being able to isolate the true fault given an observation and the simulation output of a model.

A significant challenge for a diagnosis model is the need to simulate behaviours for multiple modes. Two approaches that have been taken are to use a single model with multiple modes explicitly defined (a multi-mode approach), or to use multiple models [9; 16; 17], each of which is optimised for a single mode or a small set of modes (a multi-model approach).

The AI-based MBD approach typically uses a single model φ with multiple modes explicitly defined [18], or a single model with just nominal behaviour [19]. From a diagnostics perspective, accuracy must be defined with respect to the task T. We adopt here the task of computing the most-likely diagnosis.

Given evidence suggesting that model fidelity for a multi-mode approach varies depending on the mode, it is important to explicitly consider the mean performance of φ over the entire observation space Y (the space of possible observations of the system).

In this article we adopt the expected-residual approach: given a space Y = {Ỹ1, ..., Ỹn} of observations, the expected residual is the average over the n observations, i.e., R̄ = (1/n) Σ_{i=1}^{n} R(Ỹi, Yφ).
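A minimal sketch of this expected residual, under the assumption that the observation space is represented as a list of recorded output sequences and that the per-sequence residual is the mean squared error used earlier:

    import numpy as np

    def expected_residual(observation_sequences, y_sim):
        # R-bar = (1/n) * sum_i R(Y~_i, Y_phi), averaged over n observation sequences.
        residuals = [np.mean((np.asarray(y_obs) - np.asarray(y_sim)) ** 2)
                     for y_obs in observation_sequences]
        return float(np.mean(residuals))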
4.2 Model Complexity

At present, there is no commonly accepted definition of model complexity, whether the model is used purely for simulation or for diagnostics or control. Defining the complexity of a model is inherently tricky, due to the number of factors involved.

Less complex models are often preferred either due to their low computational simulation costs [20], or to minimise model over-fitting given observed data [21; 22]. Given the task of simulating a variable of interest conditioned on certain future values of the input (control) variables, over-fitting can lead to high uncertainty in creating accurate simulations. Over-fitting is especially severe when we have limited observation variables for generating a model representing the underlying process dynamics.
In contrast, models with low parameter dimensionality (i.e., fewer parameters) are considered less complex and hence are associated with low prediction uncertainty [23].

Several approaches have been used, based on factors such as (a) the number of variables [24], (b) model structure [25], (c) the number of free parameters [23], (d) the number of parameters that the data can constrain [26], (e) a notion of model weight [27], or (f) the type and order of equations for a non-linear dynamical model [14], where type corresponds to non-linear, linear, etc.; the order of a non-linear model is such that a k-th order system has k-th derivatives in E.

Factors that contribute to the true cost of a model include: (a) model generation; (b) parameter estimation; and (c) simulation complexity, i.e., the computational expense (in terms of CPU-time and memory) needed to simulate the model given a set of initial conditions. Rather than try to formulate this notion in terms of the number of model variables or parameters, or a notion of model structural complexity, we specify model complexity in terms of a measure based on parameter estimation and inference complexity, assuming a construction cost of zero.

A thorough analysis of model complexity will need to take into consideration the model equation class, since model complexity is class-specific. For example, for non-linear dynamical models, complexity is governed by the type and order of the equations [14]. In contrast, for linear dynamical models, which have only matrices and variables in the equations (no derivatives), it is the order of the matrices that determines complexity. In this article, we assume that models are of appropriate complexity, and hence do not address model-order reduction techniques [14], which aim to generate lower-dimensional systems that trade off fidelity for reduced model complexity.

4.3 Diagnostics Model Selection Task

The model in this model selection problem corresponds to a system with a single mode. Given a space Φ of possible models, we can define this model selection task as follows:

    φ* = argmin_{φ∈Φ} [ g1( R(Ỹ, Yφ) ) + g2( C(Ỹ, φ) ) ],   (7)

adopting the simplifying assumption that our loss function g is additively decomposable.

4.4 Information-Theoretic Model Complexity

The information-theoretic (or Bayesian) model complexity approach, which is based on the model likelihood, measures whether the increased "complexity" of a model with more parameters is justified by the data. The information-theoretic approach chooses a model (and a model structure) from a set of competing models (respectively, from the set of corresponding model structures) such that the value of a Bayesian criterion is maximised (or the prediction uncertainty in choosing a model structure is minimised).

The information-theoretic approach addresses prediction uncertainty by specifying an appropriate likelihood function. In other words, it specifies the probability with which the observed values of a variable of interest are generated by a model. The marginal likelihood of a model structure, which represents a class of models capturing the same processes (and hence having the same parameter dimensionality), is obtained by integrating over the prior distribution of the model parameters; this measures the prediction uncertainty of the model structure [28].

Statistical model selection is commonly based on Occam's parsimony principle (ca. 1320), namely that hypotheses should be kept as simple as possible. In statistical terms, this is a trade-off between bias (the distance between the average estimate and the truth) and variance (the spread of the estimates around the truth).

The idea is that by adding parameters to a model we obtain an improvement in fit, but at the expense of making parameter estimates "worse", because we have less data (i.e., information) per parameter. In addition, the computations typically require more time. So the key question is how to identify how complex a model should be for a given problem.

If the goal is to compute the likelihood of a given model φ(x0, θ, ξ, U), then θ and U are nuisance parameters. These parameters affect the likelihood calculation but are not what we want to infer. Consequently, they should be eliminated from the inference. We can remove nuisance parameters by assigning them prior probabilities and integrating them out to obtain the marginal probability of the data given only the model, that is, the model likelihood (also called the integrative, marginal, or predictive likelihood). In equational form: P(Y | φ) = ∫θ ∫U P(Y | θ, U, φ) P(θ, U | φ) dθ dU. However, this multi-dimensional integral can be very difficult to compute, and it is typically approximated using computationally intensive techniques such as Markov chain Monte Carlo (MCMC).

Rather than try to solve such a computationally challenging task, we adopt an approximation to the multi-dimensional integral. In the statistics literature several decomposable approximations have been proposed.

Spiegelhalter et al. [26] have proposed a well-known decomposable framework, termed the Deviance Information Criterion (DIC), which measures the number of model parameters that the data can constrain: DIC = D̄ + pD, where D̄ is a measure of fit (the expected deviance), and pD is a complexity measure, the effective number of parameters. The Akaike Information Criterion (AIC) [29; 30] is another well-known measure: AIC = −2L(θ̂) + 2k, where θ̂ is the Maximum Likelihood Estimate (MLE) of θ and k is the number of parameters.

To compensate for a small sample size n, a variant of AIC, termed AICc, is typically used:

    AICc = −2L(θ̂) + 2k + 2k(k + 1)/(n − k − 1).   (8)

Another, computationally more tractable, approach is the Bayesian Information Criterion (BIC) [31]: BIC = −2L(θ̂) + k log n, where k is the number of estimable parameters and n is the sample size (number of observations). BIC was developed as an approximation to the log marginal likelihood of a model, and therefore the difference between two BIC estimates may be a good approximation to the natural log of the Bayes factor. Given equal priors for all competing models, choosing the model with the smallest BIC is equivalent to selecting the model with the maximum posterior probability. BIC assumes that the (parameters') prior is the unit information prior (i.e., a multivariate normal prior with mean at the maximum likelihood estimate and variance equal to the expected information matrix for one observation).
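These criteria are straightforward to compute once a maximised log-likelihood L(θ̂), a parameter count k, and a sample size n are available; the sketch below simply transcribes the formulas quoted above (AIC, the AICc of Eq. (8), and BIC) and makes no assumptions beyond n > k + 1 for AICc.

    import math

    def aic(loglik, k):
        # AIC = -2 L(theta-hat) + 2k
        return -2.0 * loglik + 2.0 * k

    def aicc(loglik, k, n):
        # Small-sample correction of Eq. (8); requires n > k + 1.
        return aic(loglik, k) + (2.0 * k * (k + 1)) / (n - k - 1)

    def bic(loglik, k, n):
        # BIC = -2 L(theta-hat) + k log n
        return -2.0 * loglik + k * math.log(n)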
of the model structure [28]. Wagenmakers [32] shows that one can convert the BIC
    BIC = n log( SSE / SStotal ) + k log n,

where SSE is the sum of squares for the error term. In our experiments, we assume that the non-linear model is the "correct" model (or the null hypothesis H0), and either the linear or the qualitative model is the competing model (or alternative hypothesis H1). Hence we use BIC to compare the non-linear model to each of the competing models. Suppose that we obtain the BIC values for the alternative and the correct models, using the relevant SS terms. When computing ∆BIC = BIC(H1) − BIC(H0), note that both the null (H0) and the alternative hypothesis (H1) models share the same SStotal term (both models attempt to explain the same collection of scores), although they differ with respect to SSE. The SStotal term common to both BIC values cancels out in computing ∆BIC, producing

    ∆BIC = n log( SSE1 / SSE0 ) + (k1 − k0) log n,   (9)

where SSE1 and SSE0 are the sums of squares for the error terms in the alternative and the null hypothesis models, respectively.
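A small sketch of the SSE-based comparison of Eq. (9); the SSE values, parameter counts and sample size below are placeholders, not results from our experiments. Because a smaller BIC is better, a negative ∆BIC favours the alternative (competing) model.

    import math

    def delta_bic(sse_alt, sse_null, k_alt, k_null, n):
        # Delta-BIC = n * log(SSE_1 / SSE_0) + (k_1 - k_0) * log(n), cf. Eq. (9).
        return n * math.log(sse_alt / sse_null) + (k_alt - k_null) * math.log(n)

    # Placeholder values: a competing model with fewer parameters but a larger SSE.
    print(delta_bic(sse_alt=3100.0, sse_null=2900.0, k_alt=4, k_null=6, n=500))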
5 Experimental Design

This section compares three tank benchmark models according to various model-selection measures. We adopt the non-linear model as our "correct" model. We will examine the fidelity and complexity trade-offs of two simpler models over a selection of failure scenarios.

The diagnostic task will be to compute the fault state of the system, given an injected fault, which is one of (ξN, ξB, ξP), denoting nominal, blocked and passing valves, respectively. This translates to different tasks given the different models:

non-linear model: estimate the true value of κ1 given p1, which corresponds to a most-likely failure mode assignment of one of (ξN, ξB, ξP).
linear model: estimate the true value of κ1 given p1, which corresponds to a most-likely failure mode assignment of one of (ξN, ξB, ξP).
qualitative model: estimate the failure mode assignment of one of (ξN, ξB, ξP).

5.1 Alternative Models

This section describes the two alternative models that we compare to the non-linear model: a linear and a qualitative model.

Linear Model

We compare the non-linear model with a linearised version. This linearisation can be performed in a variety of ways [33]. In this simple tank example, we can perform the linearisation directly through replacement of non-linear by linear operators, as shown below.

Nominal Model: We can linearise the non-linear 3-tank model by replacing the non-linear sub-function √(hi − hj) with the linear sub-function γij(hi − hj), where γij is a parameter (to be estimated) governing the flow between tanks i and j. The linear model has 4 parameters, γ12, γ12, γ23, γ3.

Fault Model: The fault model introduces a parameter βi associated with κi, i.e., we replace κi with κi(1 + βi), i = 1, 2, 3, where −1 ≤ βi ≤ 1/κi − 1. This model has 7 parameters, adding the parameters β1, β2, β3.

Qualitative Model

Nominal Model: For this model we replace the non-linear sub-function √(hi − hj) with the qualitative sub-function M+(hi − hj), where M+ is the set of reasonable functions f such that f′ > 0 on the interior of their domain [34].

The tank heights are constrained to be non-negative, as are the parameters κi. As a consequence, we can discretise the hi to take on values {+, 0}, which means that M+(hi − hj) can take on values {+, 0, −}. The domain of dh1/dt must be {+, 0, −}, since the qualitative version of q0, Q, is non-negative (domain {+, 0}) and each M+(hi − hj) can take on values {+, 0, −}. This model has no parameters to estimate.

Fault Model: The qualitative fault model has different M+ functions for the modes where the valve is passing and blocked. We derive these functions as follows. From a qualitative perspective, the domain of βi is {0, +} for a passing valve, and {−, 0} for a blocked valve. To create a new M+ function for the cases of a passing and a blocked valve, we qualitatively apply these corresponding domains to the standard M+ function with domain {−, 0, +}, obtaining fault-based M+ functions: M+_P(hi − hj) denotes the M+ function when the valve is passing, and M+_B(hi − hj) denotes the M+ function when the valve is blocked.

5.2 Simulation Results

We have compared the simulation performance of the models under nominal and faulty conditions, considering faults in the individual valves V1, V2 and V3, as well as double-fault combinations of the valves. In the following we present some plots of simulations of faults and of fault isolation for different model types.

Figure 2 shows the results for a single-fault scenario, where valve V1 is stuck at 50% at t = 250, based on the non-linear model. The plot from this simulation shows that at the time of the fault injection, the water level in tank T1 starts increasing while the water levels in tanks T2 and T3 start decreasing due to the lower inflow.

[Figure 2: Simulation with the non-linear model for the scenario of a fault in valve 1 at t = 250 s; the plot shows the pressures p1, p2, p3 over time.]

Table 1 shows the simulation error-difference between the non-linear and linear models, for the nominal case and the faulty case (where valve 1 is faulted). Given that we measure the pressure levels p1, p2 and p3 every second, we use the difference in these outputs to compute the sum-of-squared-error (SSE) values for the simulations.
              p1       p2      p3      Total
Nominal       2600.3   316.2   118.1   3034.6
V1-fault      2583.1   347.5   137.2   3067.8

Table 1: SSE values for simulations using the non-linear and linear representations, given two scenarios: nominal and faulty (valve V1 at 50% after 250 s).

Figure 3 shows the results for diagnosing the V1-fault using the non-linear model. We can see that the diagnostic accuracy is high, as P(V1) converges to almost 1 with little time lag.

[Figure 3: Simulation of fault isolation of a fault in valve 1 with the non-linear model. The figure depicts the probability of valve 1 being faulty over time.]

In contrast, Figure 4 shows the diagnostic accuracy and isolation time with a linear model. First, note that there is a false positive identified early in the simulation, and the model incorrectly identifies both valves 2 and 3 as being faulty. This linear model thus delivers both poor diagnostic accuracy (classification errors) and poor isolation time (there is a lag between when the fault occurs and when the model identifies the fault). After the fault injection at t = 250 s, the predictive accuracy improves and the correct fault becomes the most likely fault.

[Figure 4: Simulation of fault isolation of a fault in valve 1 with the linear model. The figure depicts the probability of valves 1, 2 and 3 being faulty over time.]

Figure 5 depicts the diagnostic performance with a mixed linear/non-linear model (T1 is non-linear, while T2 and T3 are linear). The diagnostic accuracy is almost the same as that of the non-linear model (cf. Figure 3), except for a false-positive detection at the beginning of the scenario.

[Figure 5: Simulation of fault isolation of a fault in valve 1 with the mixed non-linear/linear model (T1 non-linear, T2 and T3 linear). The figure depicts the probability of valves 1, 2 and 3 being faulty over time.]

6 Experimental Results

This section describes our experimental results, summarising the data first and then discussing the implications of the results.

6.1 Model Comparisons

We have empirically compared the diagnostics performance of several multi-tank models. In our first set of experiments, we ran a simulation over 500 seconds, and induced a fault (valve V1 at 50%) after 250 s. The model combinations involved a non-linear (NL) model, a model (denoted M) with tank T1 being linear (and the other tanks non-linear), a fully linear model (denoted L), and a qualitative model (denoted Q).

To compare the relative performance of the models, we compute a measure of diagnostics error (or loss), using the difference between the true fault (which is known for each simulation) and the computed fault. We denote the true fault existing at time t using the pair (ω, t); the computed fault at time t is denoted using the pair (ω̂, t̂). The inference system that we use, LNG [35], computes an uncertainty measure associated with each computed fault, denoted P(ω̂). Hence, we define a measure of diagnostics error over a time window [0, T] as

    γ1 = Σ_{t=0}^{T} Σ_{ξ∈Ξ} |P(ω̂t) − ωt|,   (10)

where Ξ is the set of failure modes for the model, and ωt denotes ω at time t.

Our second metric covers the fault latency, i.e., how quickly the model identifies the true fault (ω, t): γ2 = t − t̂.
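The sketch below implements one reading of these two metrics: Eq. (10) as a sum of absolute differences between the estimated fault probabilities and 0/1 ground-truth mode indicators, and the latency as the time between fault injection and the first step at which the true mode's probability crosses a threshold. The array layout and the 0.5 detection threshold are assumptions made here for illustration, not choices documented for the LNG system.

    import numpy as np

    def diagnosis_error(p_hat, omega_true):
        # gamma_1 (Eq. 10): sum over time steps and modes of |P(omega-hat_t) - omega_t|.
        # p_hat and omega_true are arrays of shape (T+1, |Xi|).
        return float(np.sum(np.abs(np.asarray(p_hat) - np.asarray(omega_true))))

    def fault_latency(t_inject, p_true_mode, threshold=0.5):
        # gamma_2: detection lag |t - t-hat|, with t-hat taken as the first time step
        # at which the probability assigned to the true mode reaches the threshold.
        above = np.flatnonzero(np.asarray(p_true_mode) >= threshold)
        return abs(int(above[0]) - t_inject) if above.size else None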
Table 2 summarises our results. The table first compares the number of parameters for the different models, followed by comparisons of the error (γ1) and the CPU-time (γ2). The data show that the error (γ1) does not grow very much as we increase model size, but it increases as we decrease model fidelity from the non-linear through to the qualitative models. In contrast, the CPU-time (a) increases as we increase model size, and (b) is proportional to model fidelity, i.e., it decreases as we decrease model fidelity from the non-linear through to the qualitative models.

In a second set of experiments, we focused on multiple model types for a 3-tank system, with simulations running over 50 s, and we induced a fault (valve V1 at 50%) after 25 s. The model combinations involved a non-linear (NL) model, a model with tank 3 linear (and the other tanks non-linear), a model with tanks 2 and 3 linear and tank 1 non-linear, a fully linear model, and a qualitative model. Table 3 summarises our results.

The data show that, as model fidelity decreases, the error γ1 increases significantly and the inference times γ2 decrease modestly. If we examine the outputs from AICc, we see that the best model is the mixed model (T3-linear). BIC indicates the qualitative model as the best; it is worth noting that BIC typically will choose the simplest model.
              Tanks    2       3       4
# Parameters  NL       7       9       11
              M        6       8       10
              L        5       7       9
              Q        2       3       4
γ1            NL       242     242     242
              M        997     1076    1192
              L        1236    1288    1342
              Q        3859    3994    4261
γ2            NL       10.59   23.7    39.5
              M        8.52    17.96   34.6
              L        6.11    10.57   32.0
              Q        4.64    7.31    26.4

Table 2: Data for 2-, 3-, and 4-tank models using Non-linear (NL), Mixed (M), Linear (L) and Qualitative (Q) representations.

                 γ1       γ2      AICc    BIC
Non-Linear       0.97     23.7    29.45   43.7
T3-linear        3.12     17.96   26.77   42.9
T2, T3-linear    21.96    13.21   31.12   39.56
Linear           77.43    10.57   35.76   37.55
Qualitative      304.41   9.74    43.01   29.13

Table 3: Data for the 3-tank model, using Non-linear, Mixed, Linear and Qualitative representations, given a fault (valve V1 at 50%) after 25 s.

6.2 Discussion

Our results show that MBD is a complex task with several conflicting factors:
• The diagnosis error γ1 is inversely proportional to model fidelity, given a fixed diagnosis task.
• The error γ1 increases with fault cardinality.
• The CPU-time γ2 increases with model size (i.e., the number of tanks).

This article has introduced a framework that can be used to trade off the different factors governing MBD "accuracy". We have shown how one can extend a set of information-theoretic metrics to combine these competing factors in diagnostics model selection. Further work is necessary to identify how best to extend the existing information-theoretic metrics to suit the needs of different diagnostics applications, as it is likely that the "best" model may be domain- and task-specific.

It is important to note that we conducted our experiments with un-calibrated models, and we have ignored the cost of calibration in this article. The literature suggests that linear models can be calibrated to achieve good performance, although performance inferior to that of calibrated non-linear models. The class of qualitative models considered here does not possess calibration factors, so calibration will not improve their performance.

7 Conclusions

This article has presented a framework for evaluating the competing properties of models, namely fidelity and computational complexity. We have argued that model performance needs to be evaluated over a range of future observations, and hence we need a framework that considers the expected performance. As such, information-theoretic methods are well suited.

We have proposed some information-theoretic metrics for MBD model evaluation, and conducted some preliminary experiments to show how these metrics may be applied. This work thus constitutes a start towards a full analysis of model performance. Our intention is to initiate a more formal analysis of modeling and model evaluation, since there is no existing framework for this task. Further, the experiments are only preliminary, and are meant to demonstrate how a framework can be applied to model comparison and evaluation.

Significant work remains to be done on a range of fronts. In particular, a thorough empirical investigation of diagnostics modeling is needed. Second, the real-world utility of our proposed framework needs to be determined. Third, a theoretical study of the issues of mode-based parameter estimation and its use for MBD is necessary.

References

[1] George EP Box. Statistics and science. J Am Stat Assoc, 71:791-799, 1976.
[2] Peter Struss. What's in SD? Towards a theory of modeling for diagnosis. Readings in Model-Based Diagnosis, pages 419-449, 1992.
[3] Peter Struss. Qualitative modeling of physical systems in AI research. In Artificial Intelligence and Symbolic Mathematical Computing, pages 20-49. Springer, 1993.
[4] Nuno Belard, Yannick Pencolé, and Michel Combacau. Defining and exploring properties in diagnostic systems. System, 1:R2, 2010.
[5] Alexander Feldman, Tolga Kurtoglu, Sriram Narasimhan, Scott Poll, and David Garcia. Empirical evaluation of diagnostic algorithm performance using a generic framework. International Journal of Prognostics and Health Management, 1:24, 2010.
[6] Steven D Eppinger, Nitin R Joglekar, Alison Olechowski, and Terence Teo. Improving the systems engineering process with multilevel analysis of interactions. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 28(04):323-337, 2014.
[7] Sanjay S Joshi and Gregory W Neat. Lessons learned from multiple fidelity modeling of ground interferometer testbeds. In Astronomical Telescopes & Instrumentation, pages 128-138. International Society for Optics and Photonics, 1998.
[8] Roxanne A Moore, David A Romero, and Christiaan JJ Paredis. A rational design approach to Gaussian process modeling for variable fidelity models. In ASME 2011 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, pages 727-740. American Society of Mechanical Engineers, 2011.
[9] Peter D Hanlon and Peter S Maybeck. Multiple-model adaptive estimation using a residual correlation Kalman filter bank. Aerospace and Electronic Systems, IEEE Transactions on, 36(2):393-406, 2000.
[10] Redouane Hallouzi, Michel Verhaegen, Robert Babuška, and Stoyan Kanev. Model weight and state estimation for multiple model systems applied to fault detection and identification. In IFAC Symposium on System Identification (SYSID), Newcastle, Australia, 2006.
[11] Amardeep Singh, Afshin Izadian, and Sohel Anwar. Fault diagnosis of Li-Ion batteries using multiple-model adaptive estimation. In Industrial Electronics Society, IECON 2013 - 39th Annual Conference of the IEEE, pages 3524-3529. IEEE, 2013.
[12] Amardeep Singh Sidhu, Afshin Izadian, and Sohel Anwar. Nonlinear model based fault detection of lithium ion battery using multiple model adaptive estimation. In World Congress, volume 19, pages 8546-8551, 2014.
[13] Aki Vehtari, Janne Ojanen, et al. A survey of Bayesian predictive methods for model assessment, selection and comparison. Statistics Surveys, 6:142-228, 2012.
[14] Athanasios C Antoulas, Danny C Sorensen, and Serkan Gugercin. A survey of model reduction methods for large-scale systems. Contemporary Mathematics, 280:193-220, 2001.
[15] Alexander Feldman, Gregory M Provan, and Arjan JC van Gemund. Computing observation vectors for max-fault min-cardinality diagnoses. In AAAI, pages 919-924, 2008.
[16] Amardeep Singh, Afshin Izadian, and Sohel Anwar. Nonlinear model based fault detection of lithium ion battery using multiple model adaptive estimation. In 19th IFAC World Congress, Cape Town, South Africa, 2014.
[17] Youmin Zhan and Jin Jiang. An interacting multiple-model based fault detection, diagnosis and fault-tolerant control approach. In Decision and Control, 1999. Proceedings of the 38th IEEE Conference on, volume 4, pages 3593-3598. IEEE, 1999.
[18] Peter Struss and Oskar Dressler. "Physical negation": integrating fault models into the general diagnostic engine. In IJCAI, volume 89, pages 1318-1323, 1989.
[19] Johan De Kleer, Alan K Mackworth, and Raymond Reiter. Characterizing diagnoses and systems. Artificial Intelligence, 56(2):197-222, 1992.
[20] Elizabeth H Keating, John Doherty, Jasper A Vrugt, and Qinjun Kang. Optimization and uncertainty assessment of strongly nonlinear groundwater models with high parameter dimensionality. Water Resources Research, 46(10), 2010.
[21] Saket Pande, Mac McKee, and Luis A Bastidas. Complexity-based robust hydrologic prediction. Water Resources Research, 45(10), 2009.
[22] G Schoups, NC Van de Giesen, and HHG Savenije. Model complexity control for hydrologic prediction. Water Resources Research, 44(12), 2008.
[23] S Pande, L Arkesteijn, HHG Savenije, and LA Bastidas. Hydrological model parameter dimensionality is a weak measure of prediction uncertainty. Natural Hazards and Earth System Sciences Discussions, 11, 2014.
[24] Martin Kunz, Roberto Trotta, and David R Parkinson. Measuring the effective complexity of cosmological models. Physical Review D, 74(2):023503, 2006.
[25] Gregory M Provan and Jun Wang. Automated benchmark model generators for model-based diagnostic inference. In IJCAI, pages 513-518, 2007.
[26] David J Spiegelhalter, Nicola G Best, Bradley P Carlin, and Angelika Van Der Linde. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4):583-639, 2002.
[27] Jing Du. The "weight" of models and complexity. Complexity, 2014.
[28] Jasper A Vrugt and Bruce A Robinson. Treatment of uncertainty using ensemble methods: Comparison of sequential data assimilation and Bayesian model averaging. Water Resources Research, 43(1), 2007.
[29] Hirotugu Akaike. A new look at the statistical model identification. Automatic Control, IEEE Transactions on, 19(6):716-723, 1974.
[30] Hirotugu Akaike. Likelihood of a model and information criteria. Journal of Econometrics, 16(1):3-14, 1981.
[31] G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461-466, 1978.
[32] Eric-Jan Wagenmakers. A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14(5):779-804, 2007.
[33] Pol D Spanos. Linearization techniques for non-linear dynamical systems. PhD thesis, California Institute of Technology, 1977.
[34] Benjamin Kuipers and Karl Åström. The composition and validation of heterogeneous control laws. Automatica, 30(2):233-249, 1994.
[35] Alexander Feldman, Helena Vicente de Castro, Arjan van Gemund, and Gregory Provan. Model-based diagnostic decision-support system for satellites. In Proceedings of the IEEE Aerospace Conference, Big Sky, Montana, USA, pages 1-14, March 2013.