<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Unsupervised Subsequence Anomaly Detection in Large Sequences</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paul Boniol Supervised by: Themis Palpanas</string-name>
          <email>paul.boniol@edf.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammed Meftah</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emmanuel Remy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EDF R&amp;D, LIPADE, Université de Paris</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Subsequence anomaly detection in long sequences is an important problem with applications in a wide range of domains. However, the approaches that have been proposed so far in the literature have severe limitations: they either require prior domain knowledge that is used to design the anomaly discovery algorithms, or become cumbersome and expensive to use in situations with recurrent anomalies of the same type. In this Ph.D. work, we address these problems, and present unsupervised methods suitable for domain-agnostic subsequence anomaly detection. We explore two possible ways to represent the normal behavior of a long data series that lead to fast and accurate identification of abnormal subsequences. These normal representations are either based on subsequences (using a data structure called the normal model), or on graphs, by taking advantage of graph properties to encode the normal transitions between neighboring subsequences of a long series. The experimental results, on a large set of synthetic and real datasets, demonstrate that the proposed approaches correctly identify single and recurrent anomalies of various types, without any prior knowledge of the characteristics of these anomalies. Our approaches outperform by a large margin several competing approaches in accuracy, while being up to orders of magnitude faster.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Time series1 anomaly detection is a crucial problem with
applications in a wide range of domains. Examples of such
applications can be found in manufacturing, astronomy,
engineering, and other domains [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. This implies a real
need by relevant applications for developing methods that
can accurately and efficiently achieve this goal.
1A time series, or data series, or sequence, is an ordered
sequence of real-valued points. If the dimension that imposes
the ordering is time then we talk about time series, but
it could also be mass, angle, position, etc. We will use the
terms time series, data series, and sequence interchangeably.
Proceedings of the VLDB 2020 PhD Workshop, August 31st, 2020. Tokyo,
Japan. Copyright (C) 2020 for this paper by its authors. Copying permitted
for private and academic purposes.
      </p>
      <p>
        Anomaly detection is a well studied task [
        <xref ref-type="bibr" rid="ref13 ref16 ref2 ref8">2, 13, 16, 8</xref>
        ]
that can be tackled by either examining single values, or
sequences of points. In the specific context of sequences,
which is the focus of this paper, we are interested in
identifying anomalous subsequences [
        <xref ref-type="bibr" rid="ref12 ref16 ref8">16, 12, 8</xref>
        ], which are not
single abnormal values, but rather an abnormal sequence of
values. In real-world applications, this distinction becomes
crucial: in certain cases, even though every individual point
may be normal, the trend exhibited by the sequence of these
same values may be anomalous. Evidently, failing to
identify such situations could lead to severe problems that are
only detected when it is too late.
      </p>
      <p>
        Some existing techniques explicitly look for a set of
predetermined types of anomalies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These are techniques that
have been specifically designed to operate in a particular
setting; they require domain expertise, and cannot
generalize. Other techniques identify as anomalies the subsequences
with the largest distances to their nearest neighbors (termed
discords) [
        <xref ref-type="bibr" rid="ref12 ref16">16, 12</xref>
        ]. The assumption is that the most distant
subsequence is completely isolated from the "normal"
subsequences.
      </p>
      <p>
        However, this definition fails in the case where an anomaly
repeats itself (approximately the same). In this situation,
anomalies will have other anomalies as close neighbors, and
will not be identified as discords. In order to remedy this
situation, the m-th discord approach has been proposed [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ],
which takes into account the multiplicity m of the anomalous
subsequences that are similar to one another, and marks as
anomalies all the subsequences in the same group. However,
this approach assumes that we know the cardinality of the
anomalies, which is not true in practice (otherwise, we need
to try several different m values, drastically increasing the
execution time). Furthermore, the majority of the previous
approaches require prior knowledge of the anomaly length,
and their performance deteriorates significantly when the
correct length value is not used.
      </p>
      <p>Our work addresses the aforementioned problems. We
propose two approaches that build a normal
representation of the data series, which enables the identification
of anomalous subsequences. The two approaches use different data
structures to represent the normal behavior of the sequences; they
are summarized below:</p>
      <p>
        NormA [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], a normal-model based subsequence
anomaly detection method in large data series, for
which the normal data structure is a set of
subsequences paired with a weight. The higher those
weights are, the more "normal" their paired
subsequences are.
      </p>
      <p>
        Series2Graph [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], an unsupervised graph-based
approach for subsequence anomaly detection in large
data series, for which the data structure is modeled
as a directed graph. Nodes of this graph can be seen
as subsequences of the data series.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. NORMAL BEHAVIOR</title>
      <p>In this section we describe our proposed approaches to the
aforementioned problems. In the next part we describe the
notions and the elements used in our solutions.</p>
      <p>We begin by introducing some formal notations useful for
the rest of the paper. A data series T ∈ R^n is a sequence
of real-valued numbers T_i ∈ R, T = [T_1, T_2, ..., T_n], where n = |T|
is the length of T, and T_i is the i-th point of T. We are
typically interested in local regions of the data series, known
as subsequences. A subsequence T_{i,ℓ} ∈ R^ℓ of a data series
T is a continuous subset of the values of T of length ℓ
starting at position i. Formally, T_{i,ℓ} = [T_i, T_{i+1}, ..., T_{i+ℓ−1}].
Given two sequences, A and B, of the same length ℓ, we
can calculate their Z-normalized Euclidean distance, dist, as follows:
dist(A, B) = sqrt( Σ_{i=1}^{ℓ} ((A_i − μ_A)/σ_A − (B_i − μ_B)/σ_B)^2 ),
where μ and σ represent respectively the mean and standard deviation
of the sequences. For the remainder of this paper, we will
simply use the term distance.</p>
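      <p>To make the distance concrete, here is a minimal sketch in Python (assuming NumPy; the function names are ours, not from the paper):
```python
import numpy as np

def znorm(x):
    # Z-normalize a sequence: subtract its mean, divide by its std. dev.
    return (x - np.mean(x)) / np.std(x)

def dist(a, b):
    # Z-normalized Euclidean distance between equal-length sequences A, B:
    # sqrt(sum_i ((A_i - mu_A)/sigma_A - (B_i - mu_B)/sigma_B)^2)
    return np.sqrt(np.sum((znorm(a) - znorm(b)) ** 2))
```
Note that Z-normalization makes the distance invariant to offset and scaling, so two subsequences with the same shape but different amplitudes are considered close.</p>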
      <p>Given a subsequence T_{i,ℓ}, we say that its m-th Nearest
Neighbor (m-th NN) is T_{j,ℓ} if T_{j,ℓ} has the m-th shortest
distance to T_{i,ℓ} among all the subsequences of length ℓ in T,
excluding trivial matches; a trivial match of T_{i,ℓ} is a
subsequence T_{a,ℓ}, where |i − a| &lt; ℓ/2 (i.e., the two subsequences
overlap by more than half their length). Since we are
interested in subsequence anomalies, we first define the set
of all subsequences of length ℓ in a given data series T:
T_ℓ = {T_{i,ℓ} | ∀i, 0 ≤ i ≤ |T| − ℓ + 1}. In general, we assume
that T_ℓ contains both normal and anomalous subsequences.
We then need a way to characterize normal behavior. For
the purpose of the paper, we abstractly define the normal
behavior of the data series as follows:</p>
      <p>Definition 1 (Normal Behavior, NB). Given a
data series T , NB is a model that represents the normal
(i.e., not anomalous) trends and patterns in T .</p>
      <p>The above definition is not precise on purpose: it allows
several interpretations, which can lead to different kinds of
models. Nevertheless, subsequence anomalies can then be
defined in a uniform way: anomalies are the subsequences
that have the largest distances to the expected, normal
behavior, NB (or their distance is above a set threshold). Both
of the proposed approaches will define their own data
structure to represent NB, as well as their own distance measure.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 The NormA Approach</title>
      <p>
        We describe a new approach for subsequence anomaly
detection. We propose a formalization for NB,
called the Normal Model, denoted NM, and defined
as follows [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]: NM is a set of sequences, NM =
{(NM_0, w_0), (NM_1, w_1), ..., (NM_n, w_n)}, where NM_i is a
subsequence of length ℓ_NM (the same for all NM_i) that corresponds
to a recurring behavior in the data series T, and w_i is its
normality score (as we explain later, the higher this score is,
the more usual the behavior represented by NM_i is). In other
words, this model averages (with proper weights) the different
recurrent behaviors observed in the data, such that all
the normal behaviors of the data series will be represented
in the normal model, while unusual behaviors will not (or
will have a very low weight).
      </p>
      <p>[Figure 1: (a) d defined as the distance to the Normal
Model NM = {(NM_0, w_0), ..., (NM_n, w_n)}; B is the weighted
barycenter of NM. (b) Normal representation of the data series
as a graph: the path to reach the red dots has smaller
weight than those to reach the blue dots.]</p>
      <p>Figure 1(a) is an illustration of a Normal Model. As
depicted, the Normal Model NM is a weighted combination of
a set of subsequences (points within the dotted circles). The
combination of these subsequences and their related weights
returns distances d_i, d_j, d_k that are high enough to be
differentiated from the normal points/subsequences. These
distances can be seen as the distance between subsequences
and a weighted barycenter B (in green) that represents NM.
Note that we do not actually compute this barycenter; we
illustrate it in Figure 1(a) for visualization purposes.</p>
      <p>
        We choose ℓ_NM &gt; ℓ in order to make sure that we do
not miss useful subsequences, i.e., subsequences with a large
overlap with an anomalous subsequence. For instance, for
a given subsequence of length ℓ, a normal model of length
ℓ_NM = 2ℓ will also contain the subsequences overlapping
with the first and last half of the anomalous subsequence.
We can now define the Abnormal subsequences as follows [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>Definition 2 (Subsequence Anomaly). Assume a
data series T, the set T_ℓ of all its subsequences of length
ℓ, and the Normal Model NM of T. Then, the subsequence
T_{j,ℓ} ∈ T_ℓ with anomaly score, i.e., distance to NM,
d_j = Σ_{NM_i ∈ NM} w_i · min_{x ∈ [0, ℓ_NM − ℓ]} dist(T_{j,ℓ}, (NM_i)_{x,ℓ}), is an
anomaly if d_j is in the Top-k largest distances among all
subsequences in T_ℓ, or d_j &gt; τ, where τ ∈ R_{&gt;0} is a threshold.</p>
      <p>
        Note that the only essential input parameter is the length
` of the anomaly (which is also one of the inputs in all
relevant algorithms in the literature [
        <xref ref-type="bibr" rid="ref12 ref16 ref7 ref8 ref9">12, 16, 7, 9, 8</xref>
        ]). The
parameter k (or the threshold τ) is not essential, as long as the algorithm
can rank the anomalies.
      </p>
      <p>
        We stress that in practice, experts start by examining the
most anomalous pattern, and then move down in the ranked
list, since there is (in general) no rigid threshold separating
anomalous from non-anomalous behavior [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>As we mentioned above and will detail later on, we choose
to define NM as a set of sequences that summarizes
normality in T , by representing the average behavior of a set of
normal sequences. Intuitively, NM is the set of data series,
which tries to minimize the sum of Z-normalized Euclidean
distances between itself and some of the subsequences in T .
Last but not least, we need to compute NM in an
unsupervised way, i.e., without having normal/abnormal labels for
the subsequences in T`.</p>
      <p>Observe that this definition of NM implies the following
challenge: even though NM summarizes the normal behavior
only, it needs to be computed based on T, which may include
(several) anomalies. We address this challenge by taking
advantage of the fact that anomalies are a minority class.</p>
      <sec id="sec-3-1">
        <title>2.1.1 NormA Framework</title>
        <p>
          We now briefly describe NormA [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], our unsupervised
solution to the subsequence anomaly detection problem.
[Computing the Normal Model] Recall that NM should
capture (summarize) the normal behavior of the data. This
may not be very hard to do for a sequence T that does not
contain any anomalous subsequences. In practice though,
we want to apply the NormA approach in an unsupervised
way on any sequence, which may contain several anomalies.
        </p>
        <p>We compute the NM in three steps. First, we extract
the subsequences that can serve as candidates for building
the NM. These candidates are either randomly selected
from T (NormA-smpl), or correspond to motifs. Then,
we group these subsequences according to their similarity
in a set of clusters C (we use hierarchical clustering and
Minimum Description Length to identify the right
number of clusters). The last step consists of scoring each
cluster, and selecting the cluster that best represents
normal behavior. Formally, for a given cluster c ∈ C, we
set a weight w_c = Norm(c, C) with the following formula:
Norm(c, C) = (Frequency(c)^2 · Coverage(c)) / Σ_{x ∈ C} dist(Center(c), Center(x)),
where Frequency(c) is the number of subsequences in c, and
Coverage(c) is the time interval between the first and the
last occurrence of a subsequence in c. We thus set NM =
{(NM_0, w_0), (NM_1, w_1), ..., (NM_n, w_n)}, with NM_i the centroid
of the i-th cluster in C.
[Normal Model Based Anomaly Detection]
Intuitively, the anomalous subsequences of the long series T are
the ones that are far away from NM. We thus compute the
sequence S = [d_0, d_1, ..., d_{|T|−ℓ_NM}], with d_j the distance of the
subsequence T_{j,ℓ} to the Normal Model NM as defined in
Definition 2. These distances correspond to the degree of
abnormality: the larger the distance to NM is, the more abnormal
the subsequence is. We then extract the k subsequences of
length ℓ with the highest distances to NM, and rank them
according to their distances. Alternatively, we can extract
all subsequences with distance larger than a threshold.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2 The Series2Graph Approach</title>
      <p>
        We now present an alternative data structure to represent
the normal behavior of the subsequences of the data series. The
previous normal model was a set of subsequences that aimed
to store both normal and abnormal subsequences into the
same set, associated with weights that rank them based on
their normality. One can argue that ordering information
is missing from this data structure representation. We thus
formulate an approach for subsequence anomaly detection
based on representing the data series as a graph, in
which edges encode the ordering information [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Figure 1(b)
illustrates the Graph data structure.
      </p>
      <p>We rst de ne several basic elements related to graphs.
We de ne a Node Set N as a set of unique integers. Given a
Node Set N , an Edge Set E is then a set composed of tuples
(xi; xj), where xi; xj 2 N . w(xi; xj) is the weight of that
edge. Given a Node Set N , an Edge Set E (pairs of nodes in
N ), a Graph G is an ordered pair G = (N ; E). A directed
graph or digraph G is an ordered pair G = (N ; E) where N
is a Node Set, and E is an ordered Edge Set.</p>
      <p>
        We now provide a new formulation for subsequence
anomaly detection. The idea is that a data series is
transformed into a sequence of abstract states (corresponding to
different subsequence patterns), represented by nodes N in a
directed graph, G(N, E), where the edges E encode the
number of times one state occurred after another. Under this
formulation, paths in the graph composed of high-weight
edges and high-degree nodes correspond to normal behavior.
Then, the Normality of a data series is defined as follows [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Definition (θ-Normality). Given a graph G(N, E), with
E = {(N^{(i)}, N^{(i+1)})}_{i ∈ [0, n−1]}, and θ ∈ R_{&gt;0}, the subgraph
G_θ = (N_θ, E_θ), with N_θ ⊆ N and E_θ ⊆ E, is θ-Normal if
∀(N^{(i)}, N^{(i+1)}) ∈ E_θ, w((N^{(i)}, N^{(i+1)})) · (deg(N^{(i)}) − 1) ≥ θ.</p>
      <p>Using θ-Normality subgraphs naturally leads to a ranking
of subsequences based on their "normality". For practical
reasons, this ranking can be transformed into a score, where
each rank can be seen as a threshold in that score. We used
such a score in GraphAn to detect abnormal subsequences.</p>
      <p>Note that given the existence of graph G, the above
definitions imply a way of identifying the anomalous
subsequences. The problem is now how to construct this graph.</p>
      <sec id="sec-4-1">
        <title>2.2.1 Series2Graph Framework</title>
        <p>
          We now briefly describe Series2Graph [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ], our
unsupervised solution to the subsequence anomaly detection
problem. For a given data series T , the overall Series2Graph
process is divided into four main steps as follows:
        </p>
        <p>[Subsequence Embedding] We project all the
subsequences (of a given length ℓ) of T into a three-dimensional
space that corresponds to the three most important
components of the Principal Component Analysis (PCA). This
space is subsequently transformed into a two-dimensional
space, composed of two components r_y, r_z corresponding to
two vectors orthogonal to v_ref, the axis along which all
flat sequences (a · 1_ℓ with a ∈ R) project. In the latter space, the
shape similarity is preserved [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
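        <p>A rough sketch of the embedding step, using SVD-based PCA; the reduction to the plane orthogonal to the flat-sequence axis v_ref is our simplified reading of the construction, not the exact Series2Graph procedure:
```python
import numpy as np

def embed_subsequences(T, l):
    # Project all length-l subsequences of T on the three leading
    # principal components (PCA via SVD on the centered matrix), then
    # onto the 2-D plane orthogonal to v_ref, the direction along which
    # flat sequences (a * 1_l) project.
    X = np.array([T[i:i + l] for i in range(len(T) - l + 1)])
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P3 = Xc @ Vt[:3].T                    # 3-D PCA projection
    v_ref = Vt[:3] @ np.ones(l)           # flat-sequence axis in PCA space
    n = np.linalg.norm(v_ref)
    v_ref = v_ref / n if n > 0 else np.array([1.0, 0.0, 0.0])
    # Two orthonormal vectors r_y, r_z spanning the plane orthogonal
    # to v_ref (columns 1 and 2 of Q are orthogonal to column 0).
    Q, _ = np.linalg.qr(np.column_stack([v_ref, np.eye(3)[:, :2]]))
    r_y, r_z = Q[:, 1], Q[:, 2]
    return np.column_stack([P3 @ r_y, P3 @ r_z])
```
On a periodic series, subsequences of the same phase land on the same point of the 2-D embedding, so recurring patterns trace dense closed trajectories.</p>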
        <p>
          [Node Creation] We then create a node for each one of
the densest parts of the above two-dimensional space. The
space is first discretized by a set of radii separated by a
fixed angle. We then estimate the density of subsequences along
each one of these radii using Gaussian kernels. The maximal values of
the estimated densities are assigned to nodes and form our
Node Set N. These nodes summarize all major patterns of
length ℓ that occurred in T [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>[Edge Creation] We then retrieve all transitions between
pairs of subsequences represented by two different nodes:
each transition corresponds to a pair of subsequences, where
one occurs immediately after the other in the input data
series T. We represent transitions with an edge between the
corresponding nodes, and we thus form the Edge Set E. The
edge weights are set to the number of times the
corresponding pair of subsequences was observed in T. Finally, we build
our graph G_ℓ(N, E).</p>
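        <p>A simplified sketch of the node and edge creation steps, replacing the kernel-density node extraction with plain angular binning (so the nodes here are only a crude stand-in for the paper's density maxima):
```python
import numpy as np
from collections import Counter

def build_graph(embedding, n_angles=36):
    # Map each embedded subsequence to an angular bin (a crude stand-in
    # for the density-based node extraction), then count transitions
    # between consecutive subsequences to weight the edges.
    angles = np.arctan2(embedding[:, 1], embedding[:, 0])
    nodes = ((angles + np.pi) / (2 * np.pi) * n_angles).astype(int) % n_angles
    edges = Counter()
    for u, v in zip(nodes[:-1], nodes[1:]):
        if u != v:                 # keep transitions between distinct nodes
            edges[(u, v)] += 1
    return set(nodes.tolist()), dict(edges)
```
Because consecutive subsequences overlap heavily, most transitions stay in the same bin; the counted edges capture how the series moves from one pattern to the next.</p>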
        <p>[Subsequence Scoring] We compute the normality
(or anomaly) score of a subsequence of length ℓ_q ≥ ℓ
(within or outside of T), based on the previously
computed edges/nodes and their weights/degrees. Formally, for
a subsequence T_{i,ℓ_q} of T, represented in G_ℓ(N, E) by a path
Pth = ⟨N^{(i)}, N^{(i+1)}, ..., N^{(i+ℓ_q)}⟩, the normality score is
defined as:
Norm(Pth) = (1/ℓ_q) · Σ_{j=i}^{i+ℓ_q−1} w(N^{(j)}, N^{(j+1)}) · (deg(N^{(j)}) − 1),
where w(e) and deg(n) are the weight of edge e and the
degree of node n, respectively.</p>
        <p>[Figure 2: Comparison of six state-of-the-art methods
to Series2Graph and NormA: (a) Precision@k with a critical
diagram over 21 datasets; (b) execution time vs. varying number
of points; (c) execution time vs. varying anomaly length.]</p>
        <p>In our current work, we focus on the following directions.
[Normal Behavior Model] We will study in depth
different models for Normal Behavior, and perform analytical
and experimental comparisons among them. The goal will
be to also support data series with several variables. The
challenging problems are how to create a multivariate
Normal Behavior model and corresponding data structure, and
how to define and formalize suitable distance measures for this case.
[Online Operation] In several situations, subsequence
anomaly detection applications need to operate in an
online fashion. In this context, we will develop extensions of
our methods for real-time applications. We will analyze the
impact of data characteristics changes on the Normal
Behavior models, and explore solutions for updating the Normal
Behavior data structures in real-time.</p>
        <p>Acknowledgments. Work partially supported by EDF
R&amp;D and ANRT French program.</p>
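        <p>The path score can be transcribed directly; `edges` and `degree` are assumed to be the weight and degree maps of the constructed graph:
```python
def path_normality(path, edges, degree, lq):
    # Norm(Pth) = (1 / l_q) * sum over consecutive node pairs of
    # w(N(j), N(j+1)) * (deg(N(j)) - 1).
    total = 0.0
    for u, v in zip(path[:-1], path[1:]):
        # Unseen transitions contribute weight 0.
        total += edges.get((u, v), 0) * (degree[u] - 1)
    return total / lq
```
Paths through frequent transitions and well-connected nodes score high (normal); anomalous subsequences follow rare edges and receive low scores.</p>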
      </sec>
    </sec>
    <sec id="sec-5">
      <title>2.3 Experimental Analysis</title>
      <p>
        In our preliminary experimental evaluations [
        <xref ref-type="bibr" rid="ref3 ref5">3, 5</xref>
        ], we used
21 synthetic and real datasets from diverse domains, and
measured Precision@k (i.e., accuracy in the Top-k answers)
and execution time. (Due to lack of space, we only report
here aggregated results.)
      </p>
      <p>
        After rejecting the null hypothesis using the
Friedman test, we used the pairwise post-hoc analysis with a
Wilcoxon signed-rank test (α = 0.05) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and we depict the
results in Figure 2(a). The results show that the NormA and
Series2Graph approaches are the overall winners,
performing statistically significantly better than the state-of-the-art
algorithms from both the data series and the
multidimensional data communities. (Even though Series2Graph seems
to be slightly better than both variants of NormA, this
difference is not statistically significant.)
      </p>
      <p>Moreover, Figures 2(b) and (c) demonstrate that
NormA-smpl and Series2Graph are considerably faster than the
other methods (note the log-scale y-axis). In particular,
NormA-smpl is one to three orders of magnitude faster than
the rest, as we increase the length of the input sequence, or
the length of the anomalies.</p>
      <p>We note that further experiments are needed in order to
compare in detail our proposed methods.</p>
    </sec>
    <sec id="sec-6">
      <title>3. CONCLUSIONS AND FUTURE WORK</title>
      <p>Our studies on NormA and Series2Graph, described
above, constitute a first attempt to formalize the problem of
Normal Behavior representation in the context of data
series. The two proposed approaches describe data structures
based on subsequence sets and graphs, respectively. The
experimental results show that both NormA and Series2Graph
outperform the current state-of-the-art techniques
over a large set of diverse datasets. This indicates that the
proposed approach of using a Normal Behavior model for the
subsequence anomaly detection problem is very promising.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Abboud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Elbadaoui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Smith</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Randall</surname>
          </string-name>
          .
          <article-title>Advanced bearing diagnostics: A comparative study of two powerful approaches</article-title>
          .
          <source>MSSP</source>
          ,
          <volume>114</volume>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Barnett</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Lewis</surname>
          </string-name>
          .
          <article-title>Outliers in Statistical Data</article-title>
          . John Wiley and Sons, Inc.,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Boniol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Linardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Roncallo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>Automated Anomaly Detection in Large Sequences</article-title>
          .
          <source>ICDE</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Boniol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Linardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Roncallo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>SAD: An Unsupervised System for Subsequence Anomaly Detection</article-title>
          .
          <source>ICDE</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Boniol</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          . Series2graph:
          <article-title>Graph-based subsequence anomaly detection for time series</article-title>
          . PVLDB,
          <volume>13</volume>
          (
          <issue>11</issue>
          ),
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Boniol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Meftah</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Remy</surname>
          </string-name>
          . Graphan:
          <article-title>Graph-based subsequence anomaly detection</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>13</volume>
          (
          <issue>11</issue>
          ),
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. T.</given-names>
            <surname>Leung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Meshkin</surname>
          </string-name>
          .
          <article-title>WAT: finding top-k discords in time series database</article-title>
          .
          <source>In SIAM</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Linardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          .
          <article-title>Matrix Profile Goes MAD: Variable-Length Motif And Discord Discovery in Data Series</article-title>
          .
          <source>In DAMI</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W.</given-names>
            <surname>Luo</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Gallagher</surname>
          </string-name>
          .
          <article-title>Faster and parameter-free discord search in quasi-periodic time series</article-title>
          .
          <source>In Advances in Knowledge Discovery and Data Mining</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>Data series management: The road to big sequence analytics</article-title>
          .
          <source>SIGMOD Rec.</source>
          ,
          <volume>44</volume>
          (
          <issue>2</issue>
          ):
          <fpage>47</fpage>
          –
          <lpage>52</lpage>
          , Aug.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Beckmann</surname>
          </string-name>
          .
          <source>Report on the First and Second Interdisciplinary Time Series Analysis Workshop (ITISA)</source>
          .
          <source>ACM SIGMOD Record</source>
          ,
          <volume>48</volume>
          (
          <issue>3</issue>
          ),
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Senin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Oates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gandhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Boedihardjo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Frankenstein</surname>
          </string-name>
          .
          <article-title>Time series anomaly discovery with grammar-based compression</article-title>
          .
          <source>In EDBT</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Subramaniam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kalogeraki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Gunopulos</surname>
          </string-name>
          .
          <article-title>Online outlier detection in sensor data using non-parametric models</article-title>
          .
          <source>In VLDB</source>
          , pages
          <fpage>187</fpage>
          –
          <lpage>198</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Wilcoxon</surname>
          </string-name>
          .
          <article-title>Individual comparisons by ranking methods</article-title>
          .
          <source>Biometrics Bulletin</source>
          ,
          <volume>1</volume>
          (
          <issue>6</issue>
          ):
          <fpage>80</fpage>
          –
          <lpage>83</lpage>
          ,
          <year>1945</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Yankov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Rebbapragada</surname>
          </string-name>
          .
          <article-title>Disk aware discord discovery: finding unusual time series in terabyte sized datasets</article-title>
          .
          <source>Knowl. Inf. Syst.</source>
          ,
          <volume>17</volume>
          (
          <issue>2</issue>
          ):
          <fpage>241</fpage>
          –
          <lpage>262</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Yeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ulanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Begum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Dau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mueen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          .
          <article-title>Matrix profile I: all pairs similarity joins for time series: A unifying view that includes motifs, discords and shapelets</article-title>
          .
          <source>In ICDM</source>
          , pages
          <fpage>1317</fpage>
          –
          <lpage>1322</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>