<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Robust Alternative to Correlation Networks for Identifying Faulty Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Patrick Traxler</string-name>
          <email>patrick.traxler@scch.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pablo Gómez</string-name>
          <email>pablo.gomez@faw.jku.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tanja Grill</string-name>
          <email>tanja.grill@scch.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Applied Knowledge Processing, Johannes Kepler University</institution>
          ,
          <addr-line>Linz</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Software Competence Center Hagenberg</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <fpage>11</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>We study the situation in which many systems relate to each other. We show how to robustly learn relations between systems to conduct fault detection and identification (FDI), i.e. the goal is to identify the faulty systems. Towards this, we present a robust alternative to the sample correlation matrix and show how to randomly search in it for a structure appropriate for FDI. Our method applies to situations in which many systems can be faulty simultaneously and thus our method requires an appropriate degree of redundancy. We present experimental results with data arising in photovoltaics and supporting theoretical results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The increasing number of technical systems connected to
the Internet raises new challenges and possibilities in
diagnosis. Large amounts of data need to be processed and
analyzed. Faults need to be detected and identified.
Systems exist in different configurations, e.g. two systems of
the same type that have different sets of sensors.
Knowledge about the system design is often incomplete. Data is
often unavailable due to unreliable data connections.
Besides these and other difficulties, the large amount of data
also opens new possibilities for diagnosis based on machine
learning.</p>
      <p>The idea of our approach is to conduct fault detection and
identification (FDI) by comparing data of similar systems.
We assume we have data from machines, devices, or systems of a
similar type and want to know if some system is faulty and, if
so, to identify the faulty systems. This situation may deviate
from classic diagnosis problems in that we just have limited
information (e.g. sensor or control information) of system
internals. Moreover, we may have incomplete knowledge
about the system design. This makes manual system
modeling hard or even impossible. The problem is then to
compare the limited information of the working systems
(perhaps only input-output information) to identify faulty
systems.</p>
      <p>In this work we tackle one concrete problem of this kind.
It is motivated by photovoltaics. We describe it in more
detail below. The problem that arises in our and other
applications is that not every two systems can be compared. We
thus need to learn relations between systems.</p>
      <p>There are different approaches to learn structure, e.g.
learning Bayesian networks, Markov random fields, or
similar concepts. The concept that fits our needs is correlation
networks. A correlation network is some structure in the
correlation matrix, e.g. a minimum spanning tree or a
clustering. In our application we have n variables which
represent the produced energy per photovoltaic system. Given
that a single system correlates strongly with enough other
systems, we use this information for FDI via applying a
median.</p>
      <p>[Figure 1: (a) Fitness matrix for systems 1–6. (b) Digraph.]</p>
      <p>
        We can also think of correlation networks as a method
for knowledge discovery. They have been applied in areas such
as biology [18; 10] and finance [
        <xref ref-type="bibr" rid="ref16">12</xref>
        ] to analyze gene
coexpression and financial markets. In our situation, the first
step is to learn linear relations between systems. For
learning we need historical data. A sample result of this step is
depicted in Fig. 1. In Fig. 1(a) the fitness matrix, our robust
alternative to the correlation matrix, is shown. It represents
the degree of linearity between any two systems. For FDI,
the second step of our method, we work with the result as
depicted in Fig. 1(b) and current data. In the example, we derive
for each of the six systems an estimate m̂i of its current
value yi from its neighbors' current values, e.g. for system
1 we get an estimate from the current values of the systems
2, 3, and 4, and for system 5 from system 6. Finally, we test for a
fault by checking if |m̂i − yi| is large.
      </p>
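      <p>The median test sketched above can be illustrated as follows (a minimal sketch with hypothetical values; the function and variable names are ours, not part of the method's notation):</p>
      <preformat><![CDATA[
```python
from statistics import median

def is_faulty(y_i, neighbor_estimates, s):
    """Flag system i as faulty if its current value deviates from the
    median of the neighbor-derived estimates by more than s."""
    m_hat = median(neighbor_estimates)
    return abs(m_hat - y_i) > s

# Hypothetical example: system 1 is estimated from systems 2, 3, 4.
estimates = [10.2, 9.8, 10.0]              # neighbor-derived estimates of y_1
print(is_faulty(10.1, estimates, s=1.0))   # small deviation -> False
print(is_faulty(6.0, estimates, s=1.0))    # large deviation -> True
```
]]></preformat>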
      <p>
        The major difficulty we try to tackle with this approach
is the presence of many faults. Faults influence both the
learning problem and the FDI problem. Robustness is thus an
essential property of our algorithms. Our result can be seen
as a robust structure learning algorithm for the purpose of
FDI. Robustness is a desirable property of many learning
and estimation algorithms. However, the underlying
optimization problems, unlike their non-robust variants, are often
NP-hard. This is for example the case for computing robust
versus non-robust estimators for linear regression, e.g. Least
Median of Squares versus Ordinary Least Squares [
        <xref ref-type="bibr" rid="ref20">16</xref>
        ]. We
avoid NP-hardness by careful modeling of our problem.
In particular, our algorithms are computationally efficient.
Under some conditions, FDI can be done in (almost) linear
time in the number of systems n.
      </p>
      <p>To summarize our contributions, we introduce a novel
alternative to the sample correlation matrix and present a first
use of it to discover structure appropriate for FDI in general
and for identifying faulty photovoltaic (PV) systems in
particular. Our method works in the presence of many faults. Our
algorithms are computationally efficient. Our method
incorporates several techniques from machine learning and
statistics: (repeated) Theil-Sen estimation for robust
simple linear regression, trimming to obtain a robust fitness
measure, randomized subset selection for improved
running time, and a median mechanism to conduct FDI.</p>
      <p>In Sec. 2 we discuss our method. In Sec. 3 we present
experimental and theoretical results.</p>
      <sec id="sec-1-1">
        <title>Motivating Application: Identifying Faulty Photovoltaic Systems</title>
        <p>Faults influence the performance of photovoltaic (PV)
systems: PV systems produce less energy than possible if faults
occur. We can distinguish between two kinds of faults:
faults caused by exogenous events such as shading,
(melting) snow, and tree leaves covering solar modules, and faults
caused by endogenous events such as module defects and
degradation, defects at the power inverter, and string
disconnections.</p>
        <p>We are going to detect faults by estimating the drop in
produced energy. Most of the common faults result in such
a drop. The particular problem is given by the sensor setup:
we only assume knowledge of the produced energy and possibly,
but not necessarily, the area (e.g. the zip code) where the PV
system is located.</p>
        <p>We apply our method to PV system data. Difficulties in
the application are the different system types and deployments,
for example, different numbers of strings and
modules per string and differing orientations (north, west,
south, east) of the modules; see Fig. 2. Moreover, information
may be lacking due to missing sensors, and data may be
incomplete due to unreliable data connections. Faults occur
frequently, in particular exogenous faults during winter.</p>
        <p>The novelty of our work in the context of photovoltaics
is that it works in an extremely restrictive (only power
measurement) sensor setting. To the best of our knowledge, we
are the first to consider this restrictive sensor setting. We
only need to know the produced energy of a PV system.
There is also the implicit assumption, which is tested by
the learning algorithm, that the systems are not too far from
each other so that we can observe them in similar working
(environmental) conditions. Distances of a couple of
kilometers are possible. Systems which are very close to each
other and have the same orientation such as systems in a
solar power plant yield the best results. Other approaches
assume the presence of a plane-of-array irradiance sensor,
which is mostly deployed for solar power plants.
Irradiance estimations via satellite imaging are usually not
accurate enough.</p>
      </sec>
      <sec id="sec-1-3">
        <title>Related Work</title>
        <p>Correlation networks have applications in biology and
finance. See e.g. [12; 18; 10] and the references therein. In
biology [18; 10], they are applied to study gene interactions.
The correlation matrix is the basis for clustering genes and
the identification of biologically significant clusters. In [18;
10], a scale-free network is derived via the concept of
topological overlap. Scale-free networks tend to have few nodes
(genes) with many neighbors, so called hubs.</p>
        <p>Correlation networks are primarily used for knowledge
discovery. In particular, concepts such as clusters, hubs, and
spanning trees are interpreted in the context of biology and
finance. In our work, we introduce a robust alternative to
correlation networks.</p>
        <p>
          Other structural approaches, i.e. approaches based on
graphical models, use Bayesian networks, Markov
random fields, and similar concepts. Gaussian Markov
random fields are loosely related to correlation networks. Their
structure is described by the precision matrix, the inverse
covariance matrix (ch. 17.3 of [
          <xref ref-type="bibr" rid="ref13">9</xref>
          ]).
        </p>
        <p>Another structural approach is FDI in sensor networks [7;
4; 19; 20]. Current approaches [7; 4; 19] mainly deal with
wireless sensor networks. The algorithms usually use the
median for FDI, as we do. The difference is that FDI
in wireless sensor networks uses a geometric model similar
to interpolation methods. It requires the geographic location
of the sensors. It is assumed that two sensors close to each
other have a similar value. This cannot be assumed in
general. To overcome these problems of manual modeling, we
apply machine learning techniques.</p>
        <p>
          Models for PV systems are compared in [
          <xref ref-type="bibr" rid="ref18">14</xref>
          ]. All these
models require the plane-of-array irradiance. Fault
detection of PV systems is the topic of e.g. [3; 8; 5; 2;
17]. Firth et al. [
          <xref ref-type="bibr" rid="ref12">8</xref>
          ] consider faults if the PV system
generates no energy. Another type of fault occurs if the
panels are covered by snow, tree leaves, or something else.
In this case, we can observe a drop in energy. It is
considered e.g. in [5]. The fraction of panel area covered
is a crucial parameter. All these approaches [3; 8; 5; 2;
17] require at least knowledge of the plane-of-array
irradiance, i.e. they require an installed irradiance sensor. We do
not make this assumption.
        </p>
        <p>
          The median is common in fault detection and
identification. One reason for this is its optimal
breakdown point [
          <xref ref-type="bibr" rid="ref20">16</xref>
          ]. We also make use of (repeated)
Theil-Sen algorithms [6; 15] for learning. Ingredients of our
fault identification algorithm are the algorithm for median
selection [1] and an algorithm for generating uniform random
subsets efficiently (see e.g. the Fisher-Yates or Knuth shuffle
in [
          <xref ref-type="bibr" rid="ref17">13</xref>
          ]). In our algorithm analysis we derive bounds for a
partial Cauchy-Vandermonde identity (pg. 20 in [
          <xref ref-type="bibr" rid="ref15">11</xref>
          ]).
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Method</title>
      <sec id="sec-2-1">
        <title>Data Model for Incomplete Data</title>
        <p>We have data from n systems and one data stream per
system. A data stream for system i ∈ {1, . . . , n} is given by a
set Ni ⊆ {1, . . . , N } of available data and values xi,t ∈ R
with t ∈ Ni. We can think of the parameter t as discrete
time. With Ni, we explicitly model data availability.
Incomplete data is a common problem in our situation. Causes in
practice are unreliable data connections or unreliable
sensors. We call D := {(xi,t)t∈Ni : i ∈ {1, . . . , n}} a data
set. Sets of historical and current data are the inputs to our
algorithms.</p>
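        <p>For concreteness, the data model can be sketched as follows (a minimal illustration with hypothetical values; the dictionary layout and names are our own, chosen so that the key set of each stream plays the role of Ni):</p>
        <preformat><![CDATA[
```python
# One stream per system, stored as {t: x_{i,t}}; the key set is N_i,
# i.e. the time indices for which data is actually available.
streams = {
    1: {0: 5.1, 1: 5.3, 3: 5.0},           # system 1: t = 2 is missing
    2: {0: 2.4, 1: 2.6, 2: 2.5, 3: 2.4},   # system 2: complete
}

def common_times(D, i, j):
    """N_i ∩ N_j: time indices where both systems have data."""
    return sorted(set(D[i]) & set(D[j]))

print(common_times(streams, 1, 2))  # -> [0, 1, 3]
```
]]></preformat>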
      </sec>
      <sec id="sec-2-2">
        <title>Fitness Matrix – Definition and Robustness</title>
        <p>
          The fitness matrix is intended as a robust replacement for
the sample correlation matrix. The sample correlation
coefficient, like the sample covariance, is well known to be
sensitive to faults (outliers) [
          <xref ref-type="bibr" rid="ref20">16</xref>
          ]. As an example, we
generated the data for Fig. 1 with faults. The non-robust sample
correlation matrix would have yielded a digraph without edges
instead of the digraph in Fig. 1(b).
        </p>
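        <p>This sensitivity can be illustrated with hypothetical data (the helper below is our own textbook implementation of the sample correlation coefficient; a single corrupted point destroys an otherwise perfect correlation):</p>
        <preformat><![CDATA[
```python
import statistics

def pearson(xs, ys):
    """Sample correlation coefficient (textbook formula)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
print(round(pearson(xs, ys), 2))  # -> 1.0 (perfect linear relation)
ys[0] = 1000.0                    # a single fault (outlier)
print(pearson(xs, ys) < 0.5)      # the correlation collapses -> True
```
]]></preformat>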
        <p>A fault can be an arbitrary corruption of a single data item
xi,t. That is, xi,t = x̃i,t + Δ, Δ ≠ 0, where Δ is the fault.
We think of x̃i,t as the actual or true but unobserved value.</p>
        <p>We do not make any assumptions on the faults themselves
but only on their number. This is at the core of the definition of
the breakdown point. This statistical concept is defined for
a particular estimation or learning problem; in our case,
simple linear regression.</p>
        <p>
          Linear regression is closely related to the correlation
coefficient. For simple linear regression – a regression model
with one independent and one dependent variable – the
correlation coefficient can be seen as a fitness measure of
the line which fits the data best w.r.t. vertical squared
distances. See e.g. [
          <xref ref-type="bibr" rid="ref20">16</xref>
          ]. However, the corresponding estimator,
namely ℓ2-regression a.k.a. ordinary least squares, is known
to be sensitive to outliers [
          <xref ref-type="bibr" rid="ref20">16</xref>
          ]. On the other hand, there are
estimators for simple linear regression which are robust to
a large number of faults, i.e. they have a large breakdown
point.
        </p>
        <p>
          The idea underlying the fitness matrix is thus to replace
the correlation coefficient (and ℓ2-regression) by a robust
notion of fitness based on robust linear regression. In the
remainder of this section we recall the definition of the
breakdown point following [
          <xref ref-type="bibr" rid="ref20">16</xref>
          ], pg. 9, and formalize the notion of fitness matrices.
        </p>
        <p>We define the breakdown point only for simple linear
regression. We fix two systems i, j ∈ {1, . . . , n} and define
Z := Zi,j := {(xi,t, xj,t) : t ∈ Ni ∩ Nj}. Let T be a
regression estimator, i.e. T(Zi,j) = θ̂ ∈ R² is the intercept
and slope for the data set Zi,j. For Z, we define Z′ as Z
with m data points arbitrarily corrupted. Define</p>
        <p>bias(m; T, Z) := sup_{Z′} ‖T(Z) − T(Z′)‖.</p>
        <p>If bias(m; T, Z) is infinite, then m faults (outliers) have an
arbitrarily large effect on the estimate T(Z′). The (finite
sample) breakdown point of T and Z is defined as</p>
        <p>ε∗(T, Z) := min { m/|Z| : bias(m; T, Z) = ∞ }.</p>
        <p>To explain this notion, we consider four typical examples.
The breakdown point ε∗(Tℓ2, Z) is 1/n for ℓ2-regression.
This holds for any Z. The situation is different for
ℓ1-regression in that ε∗(Tℓ1, Z) = 1/n for some Z.</p>
        <p>In this work we are going to use the Theil-Sen estimator¹
T^TS, a.k.a. median slope selection. The reasons are its
breakdown point of at least 1 − 1/√2 ≥ 0.292 (see e.g. [6]) and the
wide availability of efficient implementations of near-linear
time algorithms. There is also a variant of T^TS, called the
repeated Theil-Sen estimator, which has a breakdown point
of 0.5, but less efficient implementations. The concrete
definition of T^TS can be found e.g. in [6]. It is however not
important for our application; only its robustness property
and the existence of efficient implementations are.</p>
        <p>¹There is a subtle issue here we have to deal with.
Regression problems are optimization problems. The solution to the
concrete optimization problem need not be unique. In our
situation, intercept and slope are unique for ℓ2-regression but not
for ℓ1-regression. The estimator Tℓ1 is however unique for a
(deterministic) algorithm solving the optimization problem. We thus
think of Tℓ1 as the output of a particular (deterministic) algorithm.</p>
        <p>To define the breakdown point of a fitness matrix, let f
be a real-valued function defined on any finite data set. We
define the fitness matrix as</p>
        <p>Fji := f(Zi,j)</p>
        <p>and its breakdown point as</p>
        <p>ε∗(F) := min_{i,j} ε∗(f, Zi,j).</p>
        <p>Next, we provide the fitness matrix we are going to use. It
has the property that Fji is close to zero if xi and xj are
strongly linearly related, and it has a high breakdown point.</p>
        <p>Let yt := xi,t and ŷt := xj,t · θ̂2 + θ̂1, t ∈ Ni ∩ Nj, for
the Theil-Sen estimate θ̂ of Zi,j. Let rt := ŷt − yt be the
residuals, and let i1, . . . , ik ∈ Ni ∩ Nj with k := |Ni ∩ Nj|
be such that |ri1| ≤ · · · ≤ |rik|. We define</p>
        <p>f^TS(Zi,j) := ( Σ_{t=1}^{⌊k/√2⌋} |r_{i_t}| ) / ( Σ_{t=1}^{⌊k/√2⌋} |y_{i_t}| ).   (1)</p>
        <p>We define F^TS w.r.t. f^TS, i.e. F^TS_ji := f^TS(Zi,j).</p>
        <p>Note that the sums go from 1 up to ⌊k/√2⌋. This
trimming together with the high breakdown point of Theil-Sen
directly implies the following result.</p>
        <p>Theorem 1. It holds that ε∗(F^TS) ≥ 1 − 1/√2.</p>
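        <p>Eq. 1 can be sketched as follows (with a naive quadratic Theil-Sen fit for the slope and intercept; the function names, the (x, y) pair convention, and the toy data are ours):</p>
        <preformat><![CDATA[
```python
from math import floor, sqrt
from statistics import median

def theil_sen(points):
    """Naive Theil-Sen fit for pairs (x, y): slope and intercept."""
    slopes = [(y2 - y1) / (x2 - x1)
              for i, (x1, y1) in enumerate(points)
              for (x2, y2) in points[i + 1:] if x2 != x1]
    b = median(slopes)
    a = median(y - b * x for x, y in points)
    return a, b

def fitness(pairs):
    """Trimmed fitness f^TS of Eq. 1 for pairs (x_t, y_t), where y is
    the dependent variable. Small values mean a strong robust linear
    relation: sum the smallest floor(k/sqrt(2)) absolute residuals and
    normalize by the sum of |y| over the same trimmed index set."""
    a, b = theil_sen(pairs)
    order = sorted(range(len(pairs)),
                   key=lambda t: abs(b * pairs[t][0] + a - pairs[t][1]))
    h = floor(len(pairs) / sqrt(2))
    kept = order[:h]
    num = sum(abs(b * pairs[t][0] + a - pairs[t][1]) for t in kept)
    den = sum(abs(pairs[t][1]) for t in kept)
    return num / den

# Strong linear relation with one corrupted data point -> fitness near 0.
Z = [(x, 2.0 * x + 1.0) for x in range(1, 11)]
Z[4] = (5, 50.0)  # fault
print(fitness(Z) < 0.05)  # -> True (the fault is trimmed away)
```
]]></preformat>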
        <p>Finally, we compare the sample correlation matrix and the
fitness matrix. Let C denote the sample correlation matrix
and define C′ji = 1 − |Cji|. Both matrices have the property
that if some entry is close to 0 then xi and xj have a strong
linear relation. It is guaranteed that C′ji is at most 1. A value
close to 1 means a weak linear relation. For F^TS_ji, it is not
guaranteed that F^TS_ji ≤ 1, but experimental results suggest
that this is usually the case. We also note that both matrices
obey a weak form of the triangle inequality, since if xi and
xj correlate strongly and xj and xk correlate strongly, then
xi and xk also correlate.</p>
        <p>There are two important benefits of fitness matrices over
correlation matrices: they are robust and they are also defined
for incomplete data. On the negative side, the fitness matrix
is not positive semi-definite; in particular, it is not symmetric.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Structure in Fitness Matrices – Algorithms LEARN and IDENTIFY</title>
        <p>
We want to identify faulty systems. In a first step, we learn
a structure appropriate for FDI; see algorithm LEARN. We
obtain it via thresholding the fitness matrix. Most of the
correlation networks, i.e. structures arising from the sample
correlation matrix, are obtained in this way [12; 18; 10].
We denote the threshold by θ ≥ 0 and the threshold fitness
matrix as Fji;θ := Fji if Fji ≤ θ, and Fji;θ := 0 if Fji &gt; θ.</p>
        <p>Algorithm 1 Algorithm LEARN with input D (data set) and
parameter θ (fitness threshold). Output is a digraph G with
edge labels (intercept, slope) representing the threshold
fitness matrix.</p>
        <p>Let G = (V, E) be a digraph with V = {1, . . . , n} and E = {}.
for all i ∈ V and j ∈ V \ {i} do
    Learn (Theil-Sen) the intercept aj,i and slope bj,i
    between xi (dependent variable) and xj (independent variable).
end for
for all i ∈ V and j ∈ V \ {i} do
    Compute the trimmed fitness f = f^TS (Eq. 1) of Zi,j.
    if f ≤ θ then
        Add to G the directed edge from j to i with edge labels (aj,i, bj,i).
    end if
end for</p>
        <p>The input to algorithm LEARN is a data set D as described
in Sec. 2.1. It outputs a digraph G = (V, E), i.e. the
(possibly sparse) threshold fitness matrix F^TS_θ. Additionally,
intercept and slope of the simple linear regressions are added
as edge labels.</p>
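        <p>LEARN can be sketched compactly as follows (a naive quadratic Theil-Sen fit, our own data-structure choices, and hypothetical toy streams; the near-linear algorithms of the text would replace the inner fit):</p>
        <preformat><![CDATA[
```python
from math import floor, sqrt
from statistics import median

def theil_sen(points):
    slopes = [(y2 - y1) / (x2 - x1)
              for i, (x1, y1) in enumerate(points)
              for (x2, y2) in points[i + 1:] if x2 != x1]
    b = median(slopes)
    a = median(y - b * x for x, y in points)
    return a, b

def trimmed_fitness(points, a, b):
    """f^TS of Eq. 1 given a fitted line y = b*x + a."""
    r = sorted((abs(b * x + a - y), abs(y)) for x, y in points)
    h = floor(len(points) / sqrt(2))
    return sum(rt for rt, _ in r[:h]) / sum(ay for _, ay in r[:h])

def learn(D, theta):
    """LEARN sketch: add edge j -> i with label (intercept, slope)
    whenever the trimmed fitness of regressing x_i on x_j is <= theta."""
    G = {i: {} for i in D}
    for i in D:
        for j in D:
            if i == j:
                continue
            ts = sorted(set(D[i]) & set(D[j]))    # N_i ∩ N_j
            Z = [(D[j][t], D[i][t]) for t in ts]  # (independent, dependent)
            if len(Z) < 3:
                continue                           # too little common data
            a, b = theil_sen(Z)
            if trimmed_fitness(Z, a, b) <= theta:
                G[i][j] = (a, b)                   # edge j -> i
    return G

# Two hypothetical streams related by x_1 = 2 x_2 + 0.5.
D = {1: {t: 2.0 * t + 0.5 for t in range(12)},
     2: {t: float(t) for t in range(12)}}
G = learn(D, theta=0.1)
print(2 in G[1])  # -> True: system 2 is a neighbor of system 1
```
]]></preformat>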
        <p>Algorithm 2 Algorithm IDENTIFY with input G (digraph
with edge labels), current data yi for the i-th system, and
parameters k and s (deviation). It outputs the set of all faulty
systems H.</p>
        <p>Set H = {}.
for all i ∈ V = {1, . . . , n} do
    Let Ni− := {j ∈ V : (j, i) ∈ E}.
    if |Ni−| = 0 then
        Continue with the next (system) i.
    end if
    if |Ni−| ≥ k then
        Select uniformly at random a k-element subset S from Ni−.
    else
        Set S := Ni−.
    end if
    Compute Mi := {ŷj = bj,i · yj + aj,i : j ∈ S}.
    Compute the median m̂i of Mi.
    Add i to H if |m̂i − yi| &gt; s.
end for
Output H.</p>
        <p>In the second step, we identify the faulty systems; see
algorithm IDENTIFY. Its input is the result of algorithm
LEARN. Algorithm IDENTIFY constructs a random
digraph of in-degree at most k for FDI. It works as follows.
Independently for every system, we choose uniformly at
random at most k of its neighbors in the digraph G and compute
the median m̂i of the estimated values derived from the selected
neighbors' values. We compare the median m̂i to the current
system value and decide whether it has a fault or not via the
deviation parameter s.</p>
        <p>We discuss the threshold parameter θ and the deviation
parameter s in Sec. 3.1. They essentially depend on the
variance in the data set D. Parameter k in algorithm IDENTIFY
has the purpose of improving running time efficiency. In
particular, we have the following result.</p>
        <p>Theorem 2. Let D be a data set with n systems and let
m := maxi |Ni|. The running time of LEARN is O(n² · m ·
log(m)). The running time of IDENTIFY is O(k · n).</p>
        <p>Proof. LEARN. There are O(n²) pairs of systems. The
Theil-Sen estimator can be computed in time O(m log(m))
[6]. The computation of f^TS, Eq. 1, is done via sorting and
thus takes time O(m log(m)).</p>
        <p>
          IDENTIFY. Assume |Ni−| ≥ k. We choose a k-element
subset out of Ni− uniformly at random and compute
the median. For random selection we can use for example
the Fisher-Yates (or Knuth) shuffle [
          <xref ref-type="bibr" rid="ref17">13</xref>
          ] which runs in time
O(k), and for median selection the algorithm in [1] which
also runs in time O(k). The second case, 1 ≤ |Ni−| ≤ k − 1,
is analogous. This shows that the overall running time of
IDENTIFY is O(kn).
        </p>
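        <p>For illustration, the selection step can be sketched with a randomized quickselect, which runs in expected linear time (the algorithm in [1] is deterministic; the names here are ours):</p>
        <preformat><![CDATA[
```python
import random

def quickselect(a, idx, rng=random):
    """Return the idx-th smallest element of a (0-based) in expected
    O(len(a)) time via random pivoting and three-way partitioning."""
    a = list(a)
    while True:
        pivot = rng.choice(a)
        lo = [x for x in a if x < pivot]
        hi = [x for x in a if x > pivot]
        n_eq = len(a) - len(lo) - len(hi)
        if idx < len(lo):
            a = lo                     # answer lies among smaller elements
        elif idx < len(lo) + n_eq:
            return pivot               # answer equals the pivot
        else:
            idx -= len(lo) + n_eq
            a = hi                     # answer lies among larger elements

vals = [9.8, 10.2, 10.0, 50.0, 9.9]
print(quickselect(vals, len(vals) // 2))  # -> 10.0 (the median)
```
]]></preformat>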
        <p>In Sec. 3.2, we provide sufficient conditions under which
IDENTIFY works correctly even if k = O(log(n)). This
is a strong running time improvement from O(n²) to O(n ·
log(n)).</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <sec id="sec-3-1">
        <title>Experimental</title>
        <p>In this section we discuss how to apply our
method, Sec. 2, to photovoltaic data. In particular, it
remains to discuss how the use-case fits the model, more
precisely, why there is a strong correlation between PV
systems. Finally, we present experimental results to verify the
estimation and fault identification quality of our algorithms.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Use-Case Photovoltaics</title>
        <p>A simple system model for PV systems is as follows:</p>
        <p>Pi = ci · Ii.</p>
        <p>Here, Pi is the power, Ii the plane-of-array (POA) irradiance
of the i-th system, and ci a constant of the system which can
be interpreted as the efficiency of converting solar energy
into electrical energy. More complex physical models
include system variables such as the module temperature [14;
17]. Our considerations translate to the more complex
models as long as they are time-independent. We also note that
these models are more accurate, but only slightly, since the
POA-irradiance has the most critical influence on the
produced energy.</p>
        <p>We get from the above considerations that Pi = c′ij · Pj
given that Ii = Ij. In our situation we cannot test the
condition Ii = Ij since we do not know the POA-irradiance, but
Ii ≈ Ij holds if the systems operate under similar weather
conditions and have a similar orientation. The former holds
if the systems are close to each other. To reduce the effect of
different orientations, see Fig. 2, we consider the following
model: PiΔ = uij · PjΔ + vij. The variable PiΔ is the power
within a time interval Δ, usually one hour. The variables
uij and vij are the unknowns.</p>
        <p>In more general words, let Yi be the output of the i-th
system and let Xi describe the system input and system
internals. Our model assumption is that for a reasonable
number of system pairs (i, j), the system outputs Yi and Yj are
linearly related given that Xi ≈ Xj. By the above
considerations, it is plausible that PV systems fulfill these
requirements.</p>
        <p>We next describe our experimental setup to verify it by
real data.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Experimental Setup</title>
        <p>To demonstrate our method, we use two data sets DA and
DK. DA arises from 13 systems from a solar park located in
Arizona2. The PV systems there are geographically close.
We use data for one year. DK arises from 40 systems spread
across a typical municipality located in Austria, i.e. the
systems can be up to some kilometers apart. Their orientation
can differ significantly. Some systems may be orientated to
the west, others to the east. We have data for almost a year.</p>
        <p>A system is faulty if it produces considerably less energy
than estimated; see Fig. 3. This definition is motivated by
the fact that most faults imply a drop in energy. The
difficulty in setting up an experiment is that we do not know
in advance if a PV system is faulty, i.e. we do not have
labeled data. We thus design our experiment as follows: We
verify the accuracy of the energy estimation, namely the
relative deviation |m̂i − yi|/|m̂i| for every system i and over
the period of a week, with m̂i and yi as in algorithm IDENTIFY.</p>
        <p>This relative deviation is noted in column Hour of Table
1 for the time period 12:00 to 13:00. In column Day of
Table 1 we note the same but for a whole day, i.e. m̂i is the
estimated energy (power) for the whole day calculated from
the hourly estimates and yi the actual energy for the whole
day. For the whole day we consider the time period from
9:00 to 16:00.</p>
        <p>The number |m̂i − yi|/|m̂i| can be read as a relative
deviation, i.e. the estimation is 100 · x% away from the true
value, where x is some entry in the columns Hour and Day.</p>
        <p>[Figure 3: POA-irradiance, estimated, and real produced energy.]</p>
        <p>The entry x is an average over all systems and 7 days. The
first day is noted in column Start.</p>
        <p>Algorithm LEARN is executed once for every week,
with θ = 0.8 and roughly three months of historical data,
e.g. for the months January, February, and March to get
estimates for the days April, 1. to April, 7. Algorithm
IDENTIFY is executed with s = 0.25 · |m̂i| and k = 11 for both
data sets. The choice of parameters θ and s depends on the
variance of the input data; they were chosen manually so as to
get a reasonable number of good estimates, and similarly for k.
The difficulty in choosing the parameters is that decreasing θ
will usually reduce the number of neighbors. For a
reasonable number of good estimates we need both a strong
linear relation of a system to its neighbors and enough
neighbors. The parameters were chosen accordingly. For
parameter k, we derive a theoretical result in Sec. 3.2 which says
that k = O(log(n)) is a good choice, for n the number of
systems.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Experimental Results</title>
        <p>The false positive rate (FPR), the false negative rate (FNR),
and the estimation accuracy are the most interesting
numbers for us. As remarked above, we do not have labeled
data. The faults as recorded in Table 1 are faults as detected
by our algorithm.</p>
        <p>We make a worst case assumption, namely that all
detected faults are false positives. This yields an FPR of
0% to 5% per 7-day period (rows in the table). To get an
understanding of the FNR, we simulated faults by subtracting
33% of the energy. The FNR in this case is at most 10% per
7-day period. In the rows Sum and Sum−33% in Table 1 we
summed up the faults to get the FPR and FNR for the whole
data sets.</p>
        <p>[Table 1: average relative deviation per 7-day period (columns Start, Hour, Day); (a) results for DA with start dates April, 1. to Dec., 1.; (b) results for DK with start dates June, 1. to Sept., 15.; each panel with rows Sum and Sum−33%.]</p>
        <sec id="sec-3-4-2">
          <p>The interpretation of these results is as follows. Setting
the parameter s to 0.25 · |m̂i| means that we define a fault
as a 25% relative deviation of the observed produced energy
from its true value. Setting s to this value yields the above
mentioned FPR. Simulating a 33% drop in energy, which
corresponds naturally to the faults we want to detect, yields
the above FNR.</p>
          <p>For the data set DA we have knowledge about the
POA-irradiance. We can thus cross-check with the irradiance
whether faulty systems were identified correctly; see Fig.
3. This manual inspection suggests that the FPR is much
smaller than 5%, close to 1%. Furthermore, increasing the
drop implies a decreasing FNR, i.e. stronger energy drops
are easier to identify.</p>
          <p>Depending on the application, these rates may or may not be considered appropriate. In some applications, we may want to detect faults which yield a drop in energy of less than 25%. This worsens the FPR and FNR. On the other hand, if we want to improve the FPR and FNR, we may have to specify a fault as a drop in energy of 50%. In other words, our parameter setting is one out of many reasonable parameter settings.</p>
          <p>We argued in Sec. 3.1 that algorithm LEARN yields good estimates for the system's current value. For an estimate to be good, the neighboring system j of system i in G needs to work correctly. Moreover, the regression estimates, the intercept and slope, need to be accurate enough. In this section, we provide a supporting theoretical result which says that, if enough estimates are good, algorithm IDENTIFY correctly identifies all faulty systems.</p>
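          <p>The identification step can be sketched in a few lines. The following is a minimal Python sketch, not the authors' implementation; in particular, representing the edge labels of G as coef[(j, i)] = (intercept, slope) pairs produced by LEARN is our assumption.</p>

```python
import random
import statistics

def identify(y, in_neighbors, coef, s, k, seed=0):
    """Flag system i as faulty when its value y[i] deviates by more than s
    from the median of regression-based estimates computed from a random
    k-subset of its in-neighbors (a sketch of algorithm IDENTIFY)."""
    rng = random.Random(seed)
    faulty = []
    for i, yi in enumerate(y):
        nbrs = in_neighbors[i]
        if not nbrs:
            continue  # no redundancy for this system, no decision possible
        S = rng.sample(nbrs, min(k, len(nbrs)))  # random k-element subset
        # estimate y_i from each sampled neighbor j via the edge label (a, b)
        estimates = [coef[(j, i)][0] + coef[(j, i)][1] * y[j] for j in S]
        m_hat = statistics.median(estimates)
        if abs(m_hat - yi) > s:
            faulty.append(i)
    return faulty
```

          <p>The median makes a single iteration robust: as long as more than half of the sampled estimates are s-good, the decision for system i is correct.</p>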
          <p>The input to IDENTIFY is a digraph G = (V, E) with edge labels. Let yi be the current value of system i and write yi = ỹi + Δi, where we think of ỹi as the true value. We say that system i is correct if Δi = 0 and faulty otherwise.</p>
          <p>The input to IDENTIFY has to satisfy two conditions, Eq. 2 and 3, for the algorithm to work correctly. These conditions state that there are more good than bad estimates. We formulate them below.</p>
          <p>Theorem 3. Let 0 &lt; p &lt; 1 and s &gt; 0. Let H := {i ∈
{1, . . . , n} : |Δi| &gt; 2s}. Assume that the input digraph G
satisfies Eq. 2 and 3. Then, algorithm IDENTIFY outputs
H with probability at least 1 − p.</p>
          <p>Let ŷj be the estimates as computed in IDENTIFY. Fix a system i and let j ∈ Ni−. We say that ŷj is s-good for system i if |ỹi − ŷj| ≤ s. Let Ai := {j ∈ Ni− : |ỹi − ŷj| ≤ s} be the s-good estimates for system i. Condition (2) is as follows: For every system i with 1 ≤ |Ni−| ≤ k − 1 it holds that
|Ai| &gt; |Ni−| / 2, (2)
i.e. there are more good than bad estimates. For the case that |Ni−| ≥ k we assume
|Ai| &gt; (1 − 1/cn,p,k) · |Ni−|, (3)
with cn,p,k := ((n/p) · 18^k)^(2/(k−1)).</p>
          <p>Setting k = Ω(log(n/p)) makes cn,p,k a constant independent of n and p. This is the most reasonable setting, as it implies that a constant fraction of estimates can be bad while IDENTIFY still identifies the faulty systems correctly. We remark that the asymptotic analysis which yields cn,p,k is not optimal. In particular, it seems that the factor 18^k is not optimal and may be improved to a factor as small as 2^(k/2). For practical applications, the following heuristic seems reasonable: For n systems and a failure probability p of IDENTIFY, set k to 10 · log(n/p).</p>
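          <p>As a small illustration, the constant cn,p,k and the heuristic choice of k can be computed directly. This is our sketch; the formulas are taken from condition (3) and the heuristic above, and we assume the natural logarithm.</p>

```python
import math

def c_npk(n, p, k):
    """The constant c_{n,p,k} = ((n/p) * 18**k) ** (2/(k-1)) from condition (3)."""
    return ((n / p) * 18.0 ** k) ** (2.0 / (k - 1))

def heuristic_k(n, p):
    """The paper's rule of thumb k = 10 * log(n/p), rounded up
    (natural logarithm assumed)."""
    return math.ceil(10 * math.log(n / p))

# For k on the order of log(n/p), c_{n,p,k} settles near a constant
# (roughly 18**2), so a constant fraction 1/c of the estimates may be bad.
```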
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>Proof of Theorem 3</title>
        <p>We apply the following lemma with A = Ai and M = Ni−. It directly gives us the probability that IDENTIFY correctly identifies the faulty systems, since the median works correctly if |S ∩ Ai| &gt; |S ∩ (Ni− \ Ai)|, where S is the (random) set chosen in IDENTIFY.</p>
        <p>Lemma 1. Let M be a finite set and A ⊆ M. Let k ≥ 2 be an integer. Let S ⊆ M be a k-element subset selected uniformly at random. Then
Pr_S(|S ∩ A| &gt; |S ∩ (M \ A)|) ≥ 1 − 18^k · (|M \ A| / |M|)^⌊k/2⌋.</p>
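        <p>The lemma can be checked empirically by sampling random k-subsets. The sketch below is ours; note that the bound is only non-trivial when |M \ A|/|M| is small compared to 1/18².</p>

```python
import random

def majority_prob(m, r, k, trials=5000, seed=1):
    """Estimate Pr(|S ∩ A| > |S ∩ (M \\ A)|) for a uniformly random
    k-subset S of an m-element set M, where |M \\ A| = r."""
    rng = random.Random(seed)
    M = list(range(m))  # elements 0..r-1 form the bad set F = M \ A
    hits = 0
    for _ in range(trials):
        S = rng.sample(M, k)
        bad = sum(1 for x in S if r > x)
        if k - bad > bad:  # good elements form a strict majority of S
            hits += 1
    return hits / trials

def lemma_bound(m, r, k):
    # Lemma 1's lower bound, with exponent floor(k/2)
    return 1 - 18 ** k * (r / m) ** (k // 2)
```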
        <p>Proof. Let M := {1, . . . , m}, F := M \ A, and r := |F|. First, we are going to bound the number of k-element subsets S ⊆ M for which |S ∩ A| ≤ k0 with k0 := ⌊k/2⌋. The exact number of these sets is
Σ_{i=0..k0} C(m − r, i) · C(r, k − i), (4)
where C(a, b) denotes the binomial coefficient, since there are C(|A|, i) = C(m − r, i) ways to choose an i-element subset from A and C(r, k − i) ways to choose the remaining k − i elements from F.</p>
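        <p>The counting formula in Eq. 4 is easy to validate against brute-force enumeration for small parameters. This is our sketch; Python's math.comb returns 0 when the lower index exceeds the upper one, which matches the convention used here.</p>

```python
from itertools import combinations
from math import comb

def bad_subset_count(m, r, k):
    """Eq. 4: number of k-subsets S of an m-set with |S ∩ A| ≤ floor(k/2),
    where |A| = m - r and the remaining r elements form F = M \\ A."""
    k0 = k // 2
    return sum(comb(m - r, i) * comb(r, k - i) for i in range(k0 + 1))

def brute_force(m, r, k):
    A = set(range(m - r))  # the good elements; the last r elements are F
    k0 = k // 2
    return sum(1 for S in combinations(range(m), k)
               if k0 >= len(A.intersection(S)))
```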
        <p>
          Note that |S ∩ A| &gt; |S ∩ F| iff |S ∩ A| &gt; ⌊|S|/2⌋ = k0. Moreover, we can assume that r = |F| ≥ 1 since the claim holds for r = 0. To provide a lower bound for the probability of this event, we show an upper bound on the complementary event, i.e. |S ∩ A| ≤ k0. First, we derive an upper bound for Eq. 4 using
(m/k)^k ≤ C(m, k) ≤ (me/k)^k (5)
for e = 2.718… and 1 ≤ k ≤ m. (See e.g. pg. 12 in [
          <xref ref-type="bibr" rid="ref15">11</xref>
          ].) Since this inequality holds just for k ≥ 1, we rewrite Eq. 4 as
C(r, k) + Σ_{i=1..k0} C(m − r, i) · C(r, k − i). (6)
        </p>
        <p>It holds that C(r, k) ≤ (re/k)^k and, for the second term in Eq. 6,
Σ_{i=1..k0} C(m − r, i) · C(r, k − i) ≤ Σ_{i=1..k0} ((m − r)e/i)^i · (re/(k − i))^(k−i).</p>
        <p>Next, we prove the upper bound on the probability p that |S ∩ A| ≤ k0. We select uniformly at random a k-element subset of M; each such subset has probability C(m, k)^(−1). We multiply Eq. 6 by C(m, k)^(−1) and get two parts with p ≤ p1 + p2. For the first part, p1 ≤ (re/m)^k since C(m, k)^(−1) ≤ (k/m)^k. For the second part p2, we use m − r ≤ m, ((k − i)/i)^i ≤ 2^k, and (k/(k − i))^k ≤ 2^k, the latter since i ≤ k0. We get an upper bound for the second part:
p2 ≤ (12r/m)^k · Σ_{i=1..k0} (m/r)^i.</p>
        <p>An upper bound for the geometric sum is k0 · (m/r)^k0. In total,
p ≤ p1 + p2 ≤ (re/m)^k + k0 · (12r/m)^k · (m/r)^k0.
Substituting (k − 1)/2 for k0 and further simplification yields
p ≤ ((k + 2)/2) · 12^k · (r/m)^((k−1)/2) ≤ 18^k · (r/m)^((k−1)/2),
the latter since ((k + 2)/2)^(1/k) ≤ 1.5 for k ≥ 3. We thus have the claimed lower bound 1 − p on the probability.</p>
        <p>Proof of Theorem 3. We show that the success probability of IDENTIFY is at least 1 − p. Let p0 := p/n. We show that for every i ∈ V, G = (V, E), the success probability of a single iteration in the loop of IDENTIFY is at least 1 − p0. This implies the above claim since (1 − p0)^n ≥ 1 − p0 · n by e.g. the Binomial Theorem.</p>
        <p>Fix some i ∈ V, i.e. we consider one iteration in the loop of IDENTIFY. We apply Lemma 1. Let us assume that |S ∩ Ai| &gt; |S ∩ (Ni− \ Ai)|, with S the random k-element subset as in IDENTIFY and Ai the good estimates as defined above. Since |S ∩ Ai| &gt; |S ∩ (Ni− \ Ai)|, it follows that |m̂i − ỹi| ≤ s for the median m̂i as computed in IDENTIFY, where yi = ỹi + Δi.</p>
        <p>Assume Δi = 0, i.e. system i works correctly. Then |m̂i − yi| = |m̂i − ỹi| ≤ s. Thus, i is not output.</p>
        <p>Assume Δi ≠ 0, i.e. system i is faulty. Here, |m̂i − yi| = |m̂i − ỹi − Δi|. It follows from |m̂i − ỹi| ≤ s and |Δi| &gt; 2s that |m̂i − yi| &gt; s. Thus, i is output.</p>
        <p>Finally, we want the probability of failure for a single iteration to be at most p/n. By Lemma 1, 18^k · α^⌊k/2⌋ ≤ 18^k · α^((k−1)/2) ≤ p/n with α := |Ni− \ Ai| / |Ni−|. With c = cn,p,k := ((n/p) · 18^k)^(2/(k−1)), the latter inequality holds since c · |Ni− \ Ai| ≤ |Ni−| and thus (1 − 1/c) · |Ni−| ≤ |Ai|.</p>
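        <p>The closing computation can be checked numerically: whenever the fraction α of bad estimates is at most 1/cn,p,k, the per-iteration failure bound from Lemma 1 stays below p/n. This is a sketch with our own function names.</p>

```python
def c_npk(n, p, k):
    # c_{n,p,k} = ((n/p) * 18**k) ** (2/(k-1)), as in condition (3)
    return ((n / p) * 18.0 ** k) ** (2.0 / (k - 1))

def step_ok(n, p, k, alpha):
    """Final step of the proof: the per-iteration failure probability
    18^k * alpha^((k-1)/2) is at most p/n."""
    return p / n >= 18.0 ** k * alpha ** ((k - 1) / 2.0)
```

        <p>At α = 1/cn,p,k the two sides coincide exactly, which is how the constant was derived; doubling α already violates the bound.</p>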
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions and Open Problems</title>
      <p>We presented a method for learning structure to identify faulty systems. The basic method of correlation networks has found many applications in biology and finance. In our application, the presence of many faults required the design and analysis of robust algorithms. We provided an experimental analysis of our algorithms to verify their estimation and fault identification quality. We also provided a supporting theoretical result which allowed us to considerably improve the running time of algorithm IDENTIFY.</p>
      <p>Improving the running time of LEARN remains an open problem. It is not immediately clear that it is necessary to compare every two systems: if systems (i, j) and (j, k) correlate strongly, then (i, k) also correlate, but not necessarily strongly. Thus, it may not be necessary to solve a simple linear regression problem for every pair of systems.</p>
      <p>
        In other applications it may be useful to solve a general linear regression problem instead of a simple linear regression, e.g. if our model depends on more than one variable per system. The corresponding correlation networks are based on the partial correlation coefficient [
        <xref ref-type="bibr" rid="ref16">12</xref>
        ]. Since robust estimators for general linear regression are based on regression problems which are NP-hard, it remains an open problem to find a robust alternative to partial correlation networks that can be computed efficiently.
      </p>
      <p>Finally, to put our method and results into a broader context, we approached the problem of FDI via learning graphical models. It seems to be a challenge to learn classical component-models of technical systems to conduct diagnosis. In this work, we were able to close the gap between (structure) learning on the one side and FDI on the other side for a concrete problem setting.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Traxler</surname>
          </string-name>
          .
          <article-title>Fault detection of large amounts of photovoltaic systems</article-title>
          .
          <source>In Online Proc. of the ECML PKDD 2013 Workshop on Data Analytics for Renewable Energy Integration (DARE'13)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Bin</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steve</given-names>
            <surname>Horvath</surname>
          </string-name>
          .
          <article-title>A general framework for weighted gene co-expression network analysis</article-title>
          .
          <source>Statistical Applications in Genetics and Molecular Biology</source>
          ,
          <volume>4</volume>
          (
          <issue>17</issue>
          ),
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Chongming</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Jiuchun Ren,
          <string-name>
            <given-names>Chuanshan</given-names>
            <surname>Gao</surname>
          </string-name>
          , Zhonglin Yan, and
          <string-name>
            <surname>Li</surname>
            <given-names>Li</given-names>
          </string-name>
          .
          <article-title>Sensor fault detection in wireless sensor networks</article-title>
          .
          <source>In Proc. of the IET International Communication Conference on Wireless Mobile and Computing</source>
          , pages
          <fpage>66</fpage>
          -
          <lpage>69</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Yang</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Meratnia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Havinga</surname>
          </string-name>
          .
          <article-title>Outlier detection techniques for wireless sensor networks: a survey</article-title>
          .
          <source>Communications Surveys and Tutorials</source>
          , IEEE,
          <volume>12</volume>
          (
          <issue>2</issue>
          ):
          <fpage>159</fpage>
          -
          <lpage>170</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Manuel</given-names>
            <surname>Blum</surname>
          </string-name>
          , Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan.
          <article-title>Linear time bounds for median computations</article-title>
          .
          <source>In Proc. of the 4th Annual ACM Symposium on Theory of Computing</source>
          , pages
          <fpage>119</fpage>
          -
          <lpage>124</lpage>
          ,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <source>In 37th IEEE International Conference on Acoustics, Speech and Signal Processing</source>
          , pages
          <fpage>1681</fpage>
          -
          <lpage>1684</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Ho</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>Modeling and fault diagnosis of a photovoltaic system</article-title>
          .
          <source>Electric Power Systems Research</source>
          ,
          <volume>78</volume>
          (
          <issue>1</issue>
          ):
          <fpage>97</fpage>
          -
          <lpage>105</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Jinran</given-names>
            <surname>Chen</surname>
          </string-name>
          , Shubha Kher, and
          <string-name>
            <given-names>Arun</given-names>
            <surname>Somani</surname>
          </string-name>
          .
          <article-title>Distributed fault detection of wireless sensor networks</article-title>
          .
          <source>In Proc. of the 2006 Workshop on Dependability Issues in Wireless Ad Hoc Networks and Sensor Networks</source>
          , pages
          <fpage>65</fpage>
          -
          <lpage>72</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>Energy Conversion and Management</source>
          ,
          <volume>51</volume>
          :
          <fpage>1929</fpage>
          -
          <lpage>1937</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <article-title>An optimal-time algorithm for slope selection</article-title>
          .
          <source>SIAM Journal on Computing</source>
          ,
          <volume>18</volume>
          (
          <issue>4</issue>
          ):
          <fpage>792</fpage>
          -
          <lpage>810</lpage>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Ding</surname>
          </string-name>
          , Dechang Chen, Kai Xing, and Xiuzhen Cheng.
          <article-title>Localized fault-tolerant event boundary detection in sensor networks</article-title>
          .
          <source>In Proc. of the 24th Annual Joint Conference of the IEEE Computer and Communications Societies</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>902</fpage>
          -
          <lpage>913</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.K.</given-names>
            <surname>Firth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.J.</given-names>
            <surname>Lomas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.J.</given-names>
            <surname>Rees</surname>
          </string-name>
          .
          <article-title>A simple model of PV system performance and its use in fault detection</article-title>
          .
          <source>Solar Energy</source>
          ,
          <volume>84</volume>
          :
          <fpage>624</fpage>
          -
          <lpage>635</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Trevor</given-names>
            <surname>Hastie</surname>
          </string-name>
          , Robert Tibshirani, and
          <string-name>
            <given-names>Jerome</given-names>
            <surname>Friedman</surname>
          </string-name>
          .
          <article-title>The elements of statistical learning</article-title>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Steve</given-names>
            <surname>Horvath</surname>
          </string-name>
          .
          <article-title>Weighted network analysis: applications in genomics and systems biology</article-title>
          . Springer Science &amp; Business Media
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jukna</surname>
          </string-name>
          .
          <article-title>Extremal combinatorics: with applications in computer science</article-title>
          .
          <source>Springer, 2nd edition</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Dror Y.</given-names>
            <surname>Kenett</surname>
          </string-name>
          , Michele Tumminello, Asaf Madi, Gitit Gur-Gershgoren,
          <string-name>
            <given-names>Rosario N.</given-names>
            <surname>Mantegna</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Eshel</given-names>
            <surname>Ben-Jacob</surname>
          </string-name>
          .
          <article-title>Dominating clasp of the financial sector revealed by partial correlation analysis of the stock market</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <volume>5</volume>
          (
          <issue>12</issue>
          ):
          e15032,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Donald E.</given-names>
            <surname>Knuth</surname>
          </string-name>
          .
          <article-title>The art of computer programming: seminumerical algorithms</article-title>
          , volume
          <volume>2</volume>
          .
          Addison-Wesley Longman Publishing Co., Inc.,
          <source>3rd edition</source>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Marion</surname>
          </string-name>
          .
          <article-title>Comparison of predictive models for PV module performance</article-title>
          .
          <source>In 33rd IEEE Photovoltaic Specialist Conference</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Matousek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Mount</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N. S.</given-names>
            <surname>Netanyahu</surname>
          </string-name>
          .
          <article-title>Efficient randomized algorithms for the repeated median line estimator</article-title>
          .
          <source>Algorithmica</source>
          ,
          <volume>20</volume>
          (
          <issue>2</issue>
          ):
          <fpage>136</fpage>
          -
          <lpage>150</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Peter J.</given-names>
            <surname>Rousseeuw</surname>
          </string-name>
          and
          <string-name>
            <given-names>Annick M.</given-names>
            <surname>Leroy</surname>
          </string-name>
          .
          <article-title>Robust regression and outlier detection</article-title>
          , volume
          <volume>589</volume>
          . John Wiley &amp; Sons,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>