<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a Hierarchical Approach for Outlier Detection in Industrial Production Settings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Burkhard Hoppenstedt</string-name>
          <email>burkhard.hoppenstedt@uni-ulm.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manfred Reichert</string-name>
          <email>manfred.reichert@uni-ulm.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Klaus Kammerer</string-name>
          <email>klaus.kammerer@uni-ulm.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Myra Spiliopoulou</string-name>
          <email>myra@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rüdiger Pryss</string-name>
          <email>ruediger.pryss@uni-ulm.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Otto-von-Guericke-University</institution>
          ,
          <addr-line>Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ulm University</institution>
          ,
          <addr-line>Ulm</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the context of Industry 4.0, the degree of cross-linking between machines, sensors, and production lines increases rapidly. This trend offers the potential to improve outlier scores, especially by combining outlier detection information across different production levels. These levels, in turn, offer further useful aspects such as different time series resolutions or context variables. When utilizing these aspects, valuable outlier information can be extracted, which can then be used for condition-based monitoring, alert management, or predictive maintenance. In this work, we compare different types of outlier detection methods and scores in the light of the aforementioned production levels, with the goal of developing a model for outlier detection that incorporates these levels. The proposed model is inspired by a use case from the field of additive manufacturing, which is also known as industrial 3D-printing. Altogether, our model shall improve the detection of outliers through a hierarchical structure that utilizes production levels in industrial scenarios.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        In general, outlier detection can be used in the context of
production control to provide Condition Monitoring, generate Alerts,
discover Concept Shifts, or serve as an indicator for Predictive
Maintenance. In the context of the latter, the degree of deviation
from an expected value represents the urgency to maintain a
system. In this work, we focus on the detection of anomalies
in temporal data. In general, outliers can be seen as changes,
sequences, or temporal patterns [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Furthermore, there exist
various anomaly types (see Fig. 1, [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]). The most
common techniques used for outlier detection are
classification and clustering. Moreover, the field of outlier
detection is related to forecasting, as deviations from expected
values might indicate an unexpected change in the behavior of a
machine. Nowadays, industrial production generates data in
various resolutions and formats. Usually, the obtained sensor values
have a very high resolution. In this context, data is assigned by
a computer-aided quality assurance (CAQ) to a higher hierarchy
level if it has a lower resolution and vice versa.
      </p>
      <p>First International Workshop on Data Science for Industry 4.0. Copyright ©2019 for the individual papers by the papers’ authors. Copying
permitted for private and academic purposes. This volume is published and copyrighted
by its editors.</p>
      <p>Therefore,
outliers can be detected and utilized at different hierarchy
levels, while these levels, in turn, place different
requirements on the algorithms used, e.g., in terms of data types,
calculation speed, and dimensionality. In this work, we provide a
short overview of outlier detection methods and their purpose.
Furthermore, we suggest a data structure for outlier detection
that is based on the following idea: Machines are often equipped
with redundant sensors, e.g., to measure the temperature of the
same machine at different places. Sensors measuring
the same information allow for the calculation of a support value
for outliers. Hereby, an outlier is more valuable if it is also found
in the supporting sensor at the same time. Based on this idea, the
suggested data structure shall be able to represent the supporting
as well as the hierarchy value for an outlier.</p>
      <p>The remainder of this paper is structured as follows. In Section 2,
we briefly illustrate the hierarchical structure. Section 3 presents
the categories of outliers that can be found in the literature, while
Section 4 sketches an algorithm which incorporates the hierarchy.
Related work is discussed in Section 5. Finally, a summary and
an outlook are provided in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>HIERARCHICAL STRUCTURE</title>
      <p>The production layers used in this work (see Fig. 2) contain
different types of data. Therefore, a framework is introduced
that can handle several types of outlier detection approaches and
combine their advantages with respect to specific data
types. The first layer is denoted as phase level (1).
The production process is usually split into several phases, e.g.,
preparation, warm-up, and calibration. In the proposed model,
this layer provides the most detailed view on the production.
It comprises multi-dimensional, high-resolution sensor values
that deliver either time series data or discrete value sequences
during the corresponding phase. Time series data corresponds
to numeric data over time, while discrete sequences are made of
labels. In the job level (2), a whole production process is displayed.
A job may consist of several phases; it starts with a setup
and ends with a computer-aided quality assurance (CAQ) check. The setup
and quality tests are not time series, but nevertheless provide
high-dimensional data.</p>
      <p>[Fig. 1: anomaly types (Additive Outlier, Innovative Outlier, Temporary Change, Level Shift).]</p>
      <p>[Fig. 2: production levels (Phase 1 to Phase 3 within a job; Job Level; Environment Level; Production Line Level with Job 1 to Job 3; Production Level), with markers for Job Configuration, CAQ, and Machine Configuration.]</p>
      <p>During the setup, parameters are selected
and the job is prepared. When considering the environment level
(3), a new time series is introduced, which does not correspond
directly to the production process, but is measured in the same
period. An example of such a time series would be the room
temperature. If jobs are investigated over time (4), the
high-dimensional setup also provides a time series. This layer, in turn,
is denoted as production line level. Finally, the production level (5)
includes data from different machines and therefore represents
the most complex scenario. The aim of future work will be to
combine outlier information from the different levels in a valuable
manner.</p>
    </sec>
    <sec id="sec-4">
      <title>CATEGORIZATION OF LITERATURE ON OUTLIERS</title>
      <p>Due to the various scenarios in a production environment,
different outlier detection algorithms should be kept in mind (see
Table 1). In general, production levels with high-resolution values
should use sequences rather than points to represent outliers, since
individual points are vulnerable to measurement errors. In contrast, for
aggregated values, points can be used to represent outliers. In general,
anomalies in time series can be extracted by a straightforward
computation or by using overlapping fixed size windows, which,
in turn, are aggregated. The first technique in this
context is called the discriminative approach (DA). Here, a similarity
function compares sequences, which are then clustered, and the distance of
a time series to the centroid of the nearest cluster denotes the
anomaly score. In unsupervised parametric approaches (UPA), an
anomaly is discovered if a sequence is unlikely to be generated
from a specified summary model. In case of multidimensional
data, an Online Analytical Processing (OLAP) cube can be
analyzed, using an unsupervised approach (UOA) with each cell as
a measure. When labeled training data is available, supervised
approaches (SA) can be applied. Window-based detection is
another type of outlier detection. Here, outlier scores are
calculated for overlapping windows, with the fixed window length as a
parameter. This class of outlier detection is well suited for detecting the exact
positions of anomalies. The normal pattern database (NPD), in
turn, is a representative of a window-based approach. Regarding
the latter, the frequencies of overlapping windows are stored
in a database. If a new subsequence has many mismatches, it
is considered an anomaly. This procedure can be extended
by computing soft mismatch scores instead of counting only exact
matches. In contrast to an NPD approach, the negative and
mixed pattern database (NMD) is based on anomaly dictionaries.
Here, test sequences are classified as anomalies if they match a
sequence from the database. Next, to find outlier subsequences
(OS), patterns are compared to their expected frequency in the
database. The main problem is to preserve computational
efficiency, as the calculation of a match score and its permutations is
very costly. Prediction models (PM) define the outlier score based
on the deviation of the observed value from the predicted value. In addition, prediction
models are suitable for multi-variate time series. Another way
to detect outliers is to compare a normal profile with new time
points. This procedure is denoted as profile similarity (PS).
Moreover, an information-theoretic model (ITM) detects outlier points
by removing points from a sequence and measuring the
improvement in a histogram-based representation. In this context, outlier
points are denoted as deviants.</p>
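      <p>The window-based normal pattern database (NPD) idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of a specific cited work: the function names windows, build_npd, and npd_score, the window length, and the use of the fraction of unseen windows as the anomaly score are all choices made for this example.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def windows(sequence, length):
    """Return all overlapping windows of a fixed length as tuples."""
    return [tuple(sequence[i:i + length])
            for i in range(len(sequence) - length + 1)]


def build_npd(normal_sequences, length):
    """Store the frequency of every window observed in normal data."""
    database = Counter()
    for seq in normal_sequences:
        database.update(windows(seq, length))
    return database


def npd_score(test_sequence, database, length):
    """Fraction of windows of the test sequence that are missing from the database.

    Assumption for this sketch: only exact matches are counted; a soft
    mismatch score would replace the membership test below.
    """
    test_windows = windows(test_sequence, length)
    if not test_windows:
        return 0.0
    mismatches = sum(1 for w in test_windows if w not in database)
    return mismatches / len(test_windows)


# Hypothetical discrete label sequences.
normal_db = build_npd([list("abcabcabc")], 3)
score_normal = npd_score(list("abcabc"), normal_db, 3)
score_anomalous = npd_score(list("abxabc"), normal_db, 3)
```
]]></preformat>
      <p>In this sketch, a sequence whose windows all occur in the database receives a score of 0, while a sequence with many unseen windows receives a score close to 1 and would be flagged as an anomaly once it exceeds a user-chosen threshold.</p>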
      <p>Note that different types of outliers must be identified for each
hierarchy level, distinguishing between methods that find outlier
points (pts), sub-sequences (ssq), or time series (tss).</p>
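      <p>As an illustration of the prediction model (PM) category discussed in this section, the following sketch scores each point by its deviation from a predicted value. The predictor used here, a moving average over the preceding values, is an assumption chosen for brevity; the function name pm_scores and the parameter history are hypothetical, and any forecasting model could take the predictor's place.</p>
      <preformat><![CDATA[
```python
def pm_scores(series, history=3):
    """Outlier score per point: absolute deviation from a moving-average forecast.

    The first `history` points receive no score because no forecast exists
    for them yet.
    """
    scores = []
    for t in range(history, len(series)):
        predicted = sum(series[t - history:t]) / history
        scores.append(abs(series[t] - predicted))
    return scores


# Hypothetical sensor readings with one spike at index 4.
readings = [1.0, 1.0, 1.0, 1.0, 5.0, 1.0]
scores = pm_scores(readings, history=3)
```
]]></preformat>
      <p>Note that the spike also raises the score of the following point, because it contaminates the forecast window. This is one reason why a robust predictor may be preferable in practice.</p>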
    </sec>
    <sec id="sec-5">
      <title>ALGORITHM</title>
      <p>The work at hand proposes an algorithm (see Algorithm 1) for the
utilization of outliers in a hierarchical production system. The
result of the algorithm is represented by the triple global score,
outlierness, and support (i.e., the data structure). First, the global
score denotes in which of the five proposed levels the outlier was
noticed. For example, if it was only recognized in the phase level,
the global score value is low. Consequently, the higher the global
score is, the more obvious the outlier was. Note that if outliers
are identified in a high production level, it is assumed that these
outliers can also be identified in a lower level. Conversely,
if an outlier is found at a higher level but not at a lower level,
a measurement error must be assumed. Second, the outlierness
denotes the significance of the outlier as computed by the
algorithm actually used. Third, the support value is increased
if the outlier can be found in the same level for corresponding
sensors, e.g., when the room temperature measurement supports
another sensor measurement. In general, support values reduce
the probability of finding a measurement error.</p>
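      <p>The result triple and the support value described above can be sketched as a small data structure. This is only an illustration under assumptions: the class name HierarchicalOutlier, the function support_value, the input format (a mapping from sensor name to the time indices at which that sensor reported an outlier), and the tolerance parameter are hypothetical and not part of the proposed model.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class HierarchicalOutlier:
    """Result triple: global score, outlierness, and support."""
    global_score: int    # assumption: count of further levels confirming the outlier
    outlierness: float   # significance reported by the detection algorithm used
    support: float       # share of corresponding sensors confirming the outlier


def support_value(outlier_time, corresponding_detections, tolerance=0):
    """Share of corresponding sensors that report an outlier near outlier_time.

    corresponding_detections maps a sensor name to the time indices at which
    that sensor reported an outlier (hypothetical input format).
    """
    if not corresponding_detections:
        return 0.0
    confirming = sum(
        1 for times in corresponding_detections.values()
        if any(abs(t - outlier_time) <= tolerance for t in times)
    )
    return confirming / len(corresponding_detections)


# Hypothetical example: two of three corresponding sensors confirm the outlier.
detections = {"temp_a": [10, 42], "temp_b": [3], "room_temp": [11]}
support = support_value(10, detections, tolerance=1)
result = HierarchicalOutlier(global_score=1, outlierness=0.8, support=support)
```
]]></preformat>
      <p>Normalizing by the number of corresponding sensors keeps the support value between 0 and 1, which mirrors the division step in Algorithm 1.</p>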
      <sec id="sec-5-1">
        <title>FindHierarchicalOutlier(TS, LV)</title>
        <preformat>inputs : startLevel (LV) and timeSeries (TS)
output : &lt;global score, outlierness, support&gt;
algorithm := ChooseAlgorithm(startLevel);
List&lt;Sensors&gt; correspondingSensors;
List&lt;Outlier&gt; outlierList := CalculateOutlier(algorithm, startLevel, TS);
foreach outlier ∈ outlierList do
    foreach sensor ∈ correspondingSensors do
        if sensor supports outlier then
            support++;
        end
    end
end
support /= Number of Corresponding Sensors;
outlierness := CalcOutlierness(algorithm);
globalScore := CalcGlobalScore(level++, true);
CalcGlobalScore(level--, false);</preformat>
      </sec>
      <sec id="sec-5-2">
        <title>CalcGlobalScore(level, up)</title>
        <preformat>algorithm := ChooseAlgorithm(level);
CalculateOutlier(algorithm, level);
if up then
    if Outlier Detected in Level then
        globalScore++; CalcGlobalScore(level++, true);
    end
else
    if No Outlier Detected in Level then
        Warning for Wrong Measurement;
        CalcGlobalScore(level--, false);
    end
end</preformat>
        <p>Algorithm 1: Outlier Hierarchical Algorithm</p>
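        <p>A minimal Python sketch of the traversal in Algorithm 1 is given below. It rests on several assumptions that the pseudocode leaves open: levels are numbered 1 to 5, the recursion stops at the borders of the hierarchy, the starting level itself is not counted in the global score, and detection at a level is abstracted into a hypothetical callback outlier_detected(level). The function names and the returned warnings list are likewise illustrative.</p>
        <preformat><![CDATA[
```python
LEVELS = (1, 2, 3, 4, 5)  # phase, job, environment, production line, production


def calc_global_score(level, up, outlier_detected, warnings):
    """Recursive walk over the hierarchy, following CalcGlobalScore.

    outlier_detected(level) is a hypothetical callback that runs the detector
    chosen for that level and returns True if it finds the outlier.
    Upwards, the return value counts the consecutive levels that confirm the
    outlier. Downwards, measurement warnings are collected and 0 is returned.
    """
    if level not in LEVELS:  # assumption: stop at the borders of the hierarchy
        return 0
    found = outlier_detected(level)
    if up:
        if found:
            return 1 + calc_global_score(level + 1, True, outlier_detected, warnings)
        return 0
    if not found:
        warnings.append(f"possible wrong measurement: no outlier at level {level}")
        calc_global_score(level - 1, False, outlier_detected, warnings)
    return 0


def find_hierarchical_outlier(start_level, outlierness, sensor_confirms, outlier_detected):
    """Return (global_score, outlierness, support, warnings) for one outlier.

    sensor_confirms is a list of booleans, one per corresponding sensor.
    """
    support = sum(sensor_confirms) / len(sensor_confirms) if sensor_confirms else 0.0
    warnings = []
    global_score = calc_global_score(start_level + 1, True, outlier_detected, warnings)
    calc_global_score(start_level - 1, False, outlier_detected, warnings)
    return global_score, outlierness, support, warnings


# Hypothetical detection results per level for one outlier found at level 3.
confirmed = {1: False, 2: True, 3: True, 4: True, 5: False}
res_confirmed = find_hierarchical_outlier(3, 0.9, [True, False], confirmed.get)

isolated = {1: False, 2: False, 3: True, 4: False, 5: False}
res_isolated = find_hierarchical_outlier(3, 0.9, [], isolated.get)
```
]]></preformat>
        <p>In the first example, the outlier is confirmed one level above the starting level and also at the level below, so no warning is raised. In the second example, neither lower level shows the outlier, so a warning is collected for each of them.</p>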
      </sec>
    </sec>
    <sec id="sec-6">
      <title>RELATED WORK</title>
      <p>
        Outlier detection is also known as anomaly detection, event
detection, novelty detection, deviant discovery, change point detection,
fault detection, or intrusion detection. Based on an extensive
literature study, Fig. 3 shows corresponding numbers of papers
from each of these categories extracted from the search engine
Web of Science. Note that each term was filtered with the word
time series and afterwards limited to those items that are
connected to the category automation control systems. In general,
methods for outlier detection have been presented as general
frameworks [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ] as well as features for process control systems
(PCS) [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ]. Moreover, another challenge for outlier detection is
related to the calculation speed. To tackle the latter, the authors
of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] used the MapReduce pattern to speed up the calculation for
distance-based outliers. A further challenge in the field of outlier
detection is the complexity of time series. Hereby, an approach
for multivariate time series is introduced by [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. To tackle the
problem of large, noisy features, [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] used an outlier thresholding
function for outlier selection, whose results are further on used
as target feature. Another approach to deal with high dimensions
constitutes the combination of outlier detection and dimension
reduction. In this context, [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] used the principal component
analysis (PCA) and the local outlier factor (LOF) for a robust detection
of noisy variables. In contrast, [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] extended the PCA with a
factor leverage, which measures the influence of each data point of
the PCA. A further way to reduce the dimension constitutes the
use of intrinsic dimensions (ID). In [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ], for example, the PCA is
combined with a randomized approach for subspace recovery.
Again, the dimension reduction method is combined with a local
outlier score [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ]. Due to the strong connection of outlier
detection and the k-nearest neighbor method (kNN), the effect of hubness
needs to be considered (e.g., [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]). Note that hubness denotes
the tendency of high-dimensional data to contain points that occur
frequently in the kNN lists of other points. To summarize, all presented approaches help to
tackle complex and large production data.
      </p>
      <p>
        Another important part of related work concerns
outlierness scores. For the production scenario used in this paper, flexible
and adaptive outlier scores are needed, which can be expressed
by the degree of outlierness. These scores allow for a ranking of
outliers, which cannot be done using a binary outlier score, as
the latter yields only a true/false decision. In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
for example, an interval-based approach is presented, in which
the outlierness score is defined as the resulting distance after the
clustering process. Hereby, it is possible to define a pattern as the
ground truth prototype and all outlierness scores are relative to
this selected pattern. A similar definition of outlierness score is
presented by [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], in which it is denoted as the distance between
a normal and the outlier class. The distance, in turn, is measured
by a Support Vector Machine. Next, [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] enriches the outlierness
score by including different context levels. For the levels local,
global, and ensemble, an expected behavior is modeled and the
outlierness refers to the difference between the expected and the
measured value. Another approach uses the impact of outliers on
the clustering objective, where the sensitivity denotes the
worst-case impact of a point on the clustering solution [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Moreover,
outlierness scores can be combined to outlier vectors, as, for
example, pursued by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This is especially helpful in the context of
online outlier detection. Another way of expressing the degree of
outlierness is to evaluate all distances to elements
in the neighborhood and to use the percentage of distances
higher than the mean distance [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]. This concept is designed to
work for dependent elements, as they can be found in graphs.
The last presented outlierness approach [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] uses the imbalance
between densities of all objects. Finally, sensors can be simulated
using software, which is denoted as soft sensor modeling. A fusion
of outlier detection and soft sensor modeling, for example, is
presented by [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ].
      </p>
      <p>In the light of the presented approach, to the best of our
knowledge, none of the evaluated related works deals with outlier
detection in different hierarchy levels in an industrial production
setting as we do.</p>
    </sec>
    <sec id="sec-7">
      <title>SUMMARY AND OUTLOOK</title>
      <p>We proposed a novel algorithm that includes three characteristics
of outliers in a production environment, namely the global score,
the outlierness, and the support. These values are calculated
using different algorithms, whereby the algorithm should be
selected with respect to the resolution that best fits a production
layer. This representation helps to express
the importance of an outlier and to classify outliers by several
criteria for a more transparent production. The review of various
outlier methods has shown possible algorithm candidates that can
be used for the corresponding layers. Some of these algorithms
fit better on time series, some on sequences, and others
on outlier points. In future work, the approach will be evaluated
based on real-life data of a company that produces machines in
an industrial large-scale production setting.</p>
      <p>[Fig. 3: number of papers per search term (Outlier Detection, Anomaly Detection, Event Detection, Novelty Detection, Deviant Discovery, Change Point Detection, Fault Detection, Intrusion Detection), shown for Automation Control Systems and Time Series.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Fabrizio</given-names>
            <surname>Angiulli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Clara</given-names>
            <surname>Pizzuti</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Fast Outlier Detection in High Dimensional Spaces</article-title>
          .
          <source>In Principles of Data Mining and Knowledge Discovery. Lecture Notes in Computer Science, Lecture Notes in Artificial Intelligence</source>
          , Vol.
          <volume>2431</volume>
          . Springer, Berlin and Heidelberg,
          <fpage>15</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Suratna</given-names>
            <surname>Budalakoti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ashok N</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ram</given-names>
            <surname>Akella</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Eugene</given-names>
            <surname>Turkov</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Anomaly detection in large sets of high-dimensional symbol sequences</article-title>
          . (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>João B. D.</given-names>
            <surname>Cabrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lundy</given-names>
            <surname>Lewis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Raman K.</given-names>
            <surname>Mehra</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Detection and classification of intrusions and faults using sequences of system calls</article-title>
          .
          <source>ACM SIGMOD Record 30</source>
          ,
          <issue>4</issue>
          (
          <year>2001</year>
          ),
          <fpage>25</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Sorin N.</given-names>
            <surname>Ciolofan</surname>
          </string-name>
          et al.
          <year>2016</year>
          .
          <article-title>Rapid Parallel Detection of Distance-based Outliers in Time Series using MapReduce</article-title>
          .
          <source>Journal of Control Engineering and Applied Informatics</source>
          <volume>18</volume>
          ,
          <issue>3</issue>
          (
          <year>2016</year>
          ),
          <fpage>63</fpage>
          -
          <lpage>71</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Domenico</given-names>
            <surname>Cucina</surname>
          </string-name>
          et al.
          <year>2014</year>
          .
          <article-title>Outliers detection in multivariate time series using genetic algorithms</article-title>
          .
          <source>Chemometrics and Intelligent Laboratory Systems</source>
          <volume>132</volume>
          (
          <year>2014</year>
          ),
          <fpage>103</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Eleazar</given-names>
            <surname>Eskin</surname>
          </string-name>
          et al.
          <year>2002</year>
          .
          <article-title>A Geometric Framework for Unsupervised Anomaly Detection</article-title>
          .
          <source>In Applications of Data Mining in Computer Security. Advances in Information Security</source>
          , Vol.
          <volume>6</volume>
          . Springer, Boston, MA,
          <fpage>77</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>German</given-names>
            <surname>Florez-Larrahondo</surname>
          </string-name>
          et al.
          <year>2005</year>
          .
          <article-title>Efficient modeling of discrete events for anomaly detection using hidden Markov models</article-title>
          .
          <source>In International Conference on Information Security</source>
          . Springer,
          <fpage>506</fpage>
          -
          <lpage>514</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Pedro A</given-names>
            <surname>Forero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Scott</given-names>
            <surname>Shafer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Josh</given-names>
            <surname>Harguess</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Online robust dictionary learning with density-based outlier weighing</article-title>
          .
          <source>In OCEANS 2016 MTS/IEEE Monterey. IEEE</source>
          , 1-
          <fpage>5</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Fox</surname>
          </string-name>
          .
          <year>1972</year>
          .
          <article-title>Outliers in Time Series</article-title>
          .
          <source>Journal of the Royal Statistical Society. Series B (Methodological) 34</source>
          ,
          <issue>3</issue>
          (
          <year>1972</year>
          ),
          <fpage>350</fpage>
          -
          <lpage>363</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Anup K</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Schwartzbard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Schatz</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Learning Program Behavior Profiles for Intrusion Detection</article-title>
          .
          <source>In Workshop on Intrusion Detection and Network Monitoring</source>
          , Vol.
          <volume>51462</volume>
          .
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Fabio A.</given-names>
            <surname>González</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dipankar</given-names>
            <surname>Dasgupta</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Anomaly Detection Using Real-Valued Negative Selection</article-title>
          .
          <source>Genetic Programming and Evolvable Machines</source>
          <volume>4</volume>
          ,
          <issue>4</issue>
          (
          <year>2003</year>
          ),
          <fpage>383</fpage>
          -
          <lpage>403</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Manish</given-names>
            <surname>Gupta</surname>
          </string-name>
          et al.
          <year>2014</year>
          .
          <article-title>Outlier Detection for Temporal Data: A Survey</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>26</volume>
          ,
          <issue>9</issue>
          (
          <year>2014</year>
          ),
          <fpage>2250</fpage>
          -
          <lpage>2267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Manish</given-names>
            <surname>Gupta</surname>
          </string-name>
          and
          <string-name>
            <given-names>Abhishek</given-names>
            <surname>Singh</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Context-Aware Time Series Anomaly Detection for Complex Systems</article-title>
          ,
          <source>In Proc. of the SDM Workshop on Data Mining for Service and Maintenance.</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Marwan</given-names>
            <surname>Hassani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yifeng</given-names>
            <surname>Lu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Seidl</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Towards an Efficient Ranking of Interval-Based Patterns</article-title>
          .
          <source>In EDBT</source>
          .
          <fpage>688</fpage>
          -
          <lpage>689</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>David J.</given-names>
            <surname>Hill</surname>
          </string-name>
          and
          <string-name>
            <given-names>Barbara S.</given-names>
            <surname>Minsker</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Anomaly detection in streaming environmental sensor data: A data-driven modeling approach</article-title>
          .
          <source>Environmental Modelling &amp; Software</source>
          <volume>25</volume>
          ,
          <issue>9</issue>
          (
          <year>2010</year>
          ),
          <fpage>1014</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Terran</given-names>
            <surname>Lane</surname>
          </string-name>
          and
          <string-name>
            <given-names>Carla</given-names>
            <surname>Brodley</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Sequence Matching and Learning in Anomaly Detection for Computer Security</article-title>
          . (05
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Terran</given-names>
            <surname>Lane</surname>
          </string-name>
          and
          <string-name>
            <given-names>Carla E</given-names>
            <surname>Brodley</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>An application of machine learning to anomaly detection</article-title>
          .
          <source>In Proceedings of the 20th National Information Systems Security Conference</source>
          , Vol.
          <volume>377</volume>
          .
          Baltimore, USA,
          <fpage>366</fpage>
          -
          <lpage>380</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Wenke</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Salvatore J</given-names>
            <surname>Stolfo</surname>
          </string-name>
          , et al.
          <year>1998</year>
          .
          <article-title>Data mining approaches for intrusion detection</article-title>
          .
          <source>In USENIX Security Symposium</source>
          . San Antonio, TX,
          <fpage>79</fpage>
          -
          <lpage>93</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Xiaolei</given-names>
            <surname>Li</surname>
          </string-name>
          et al.
          <year>2007</year>
          .
          <article-title>ROAM: Rule- and Motif-Based Anomaly Detection in Massive Moving Object Data Sets</article-title>
          .
          <source>In Proceedings of the Seventh SIAM International Conference on Data Mining. Soc. for Industrial and Applied Mathematics</source>
          , Philadelphia, Pa.,
          <fpage>273</fpage>
          -
          <lpage>284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Xiaolei</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jiawei</given-names>
            <surname>Han</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Mining approximate top-k subspace anomalies in multi-dimensional time-series data</article-title>
          .
          <source>In Proceedings of the 33rd international conference on Very large data bases. VLDB Endowment</source>
          ,
          <fpage>447</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Jiongqian</given-names>
            <surname>Liang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Srinivasan</given-names>
            <surname>Parthasarathy</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Robust contextual outlier detection: Where context meets sparsity</article-title>
          .
          <source>In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. ACM</source>
          ,
          <fpage>2167</fpage>
          -
          <lpage>2172</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Jessica</given-names>
            <surname>Lin</surname>
          </string-name>
          et al.
          <year>2003</year>
          .
          <article-title>A symbolic representation of time series, with implications for streaming algorithms</article-title>
          .
          <source>In Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery. ACM</source>
          ,
          <fpage>2</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Ninghao</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Donghwa</given-names>
            <surname>Shin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Xia</given-names>
            <surname>Hu</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Contextual Outlier Interpretation</article-title>
          .
          <source>arXiv preprint arXiv:1711.10589</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Mario</given-names>
            <surname>Lucic</surname>
          </string-name>
          , Olivier Bachem, and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Krause</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Linear-time outlier detection via sensitivity</article-title>
          .
          <source>arXiv preprint arXiv:1605.00519</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Carla</given-names>
            <surname>Marceau</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Characterizing the behavior of a program using multiple-length n-grams</article-title>
          .
          <source>Technical Report</source>
          . Odyssey Research Associates Inc., Ithaca, NY.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Amanda F</given-names>
            <surname>Mejia</surname>
          </string-name>
          et al.
          <year>2017</year>
          .
          <article-title>PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data</article-title>
          .
          <source>Biostatistics</source>
          <volume>18</volume>
          ,
          <issue>3</issue>
          (
          <year>2017</year>
          ),
          <fpage>521</fpage>
          -
          <lpage>536</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Muthukrishnan</surname>
          </string-name>
          et al.
          <year>2004</year>
          .
          <article-title>Mining deviants in time series data streams</article-title>
          .
          <source>In SSDBM 2004. IEEE Computer Society</source>
          , Los Alamitos, Calif,
          <fpage>41</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Alexandre</given-names>
            <surname>Nairac</surname>
          </string-name>
          et al.
          <year>1999</year>
          .
          <article-title>A System for the Analysis of Jet Engine Vibration Data</article-title>
          .
          <source>Integrated Computer-Aided Engineering</source>
          <volume>6</volume>
          ,
          <issue>1</issue>
          (
          <year>1999</year>
          ),
          <fpage>53</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Ortner</surname>
          </string-name>
          et al.
          <year>2017</year>
          .
          <article-title>Local projections for high-dimensional outlier detection</article-title>
          .
          <source>arXiv preprint arXiv:1708.01550</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Xinghao</given-names>
            <surname>Pan</surname>
          </string-name>
          , Jiaqi Tan, Soila Kavulya, Rajeev Gandhi, and
          <string-name>
            <given-names>Priya</given-names>
            <surname>Narasimhan</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Ganesha: Black-Box Fault Diagnosis for MapReduce Systems</article-title>
          (CMU-PDL-08-112).
          <source>Parallel Data Laboratory</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Guansong</given-names>
            <surname>Pang</surname>
          </string-name>
          et al.
          <year>2018</year>
          .
          <article-title>Sparse Modeling-based Sequential Ensemble Learning for Effective Outlier Detection in High-dimensional Numeric Data</article-title>
          . AAAI.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Leonid</given-names>
            <surname>Portnoy</surname>
          </string-name>
          et al.
          <year>2001</year>
          .
          <article-title>Intrusion Detection with Unlabeled Data Using Clustering</article-title>
          . (11
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Mario Alfonso</given-names>
            <surname>Prado-Romero</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrés</given-names>
            <surname>Gago-Alonso</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Community Feature Selection for Anomaly Detection in Attributed Graphs</article-title>
          .
          <source>In Iberoamerican Congress on Pattern Recognition</source>
          . Springer,
          <fpage>109</fpage>
          -
          <lpage>116</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Miloš</given-names>
            <surname>Radovanović</surname>
          </string-name>
          , Alexandros Nanopoulos, and
          <string-name>
            <given-names>Mirjana</given-names>
            <surname>Ivanović</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Reverse nearest neighbors in unsupervised distance-based outlier detection</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>27</volume>
          ,
          <issue>5</issue>
          (
          <year>2015</year>
          ),
          <fpage>1369</fpage>
          -
          <lpage>1382</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Mostafa</given-names>
            <surname>Rahmani</surname>
          </string-name>
          and
          <string-name>
            <given-names>George K</given-names>
            <surname>Atia</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Randomized robust subspace recovery and outlier detection for high dimensional data matrices</article-title>
          .
          <source>IEEE Transactions on Signal Processing</source>
          <volume>65</volume>
          ,
          <issue>6</issue>
          (
          <year>2017</year>
          ),
          <fpage>1580</fpage>
          -
          <lpage>1594</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Umaa</given-names>
            <surname>Rebbapragada</surname>
          </string-name>
          , Pavlos Protopapas,
          <string-name>
            <given-names>Carla E.</given-names>
            <surname>Brodley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Charles</given-names>
            <surname>Alcock</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Finding anomalous periodic time series</article-title>
          .
          <source>Machine Learning</source>
          <volume>74</volume>
          ,
          <issue>3</issue>
          (
          <year>2009</year>
          ),
          <fpage>281</fpage>
          -
          <lpage>313</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Karlton</given-names>
            <surname>Sequeira</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mohammed</given-names>
            <surname>Zaki</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>ADMIT: anomaly-based data mining for intrusions</article-title>
          .
          <source>In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM</source>
          ,
          <fpage>386</fpage>
          -
          <lpage>395</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Weixing</given-names>
            <surname>Su</surname>
          </string-name>
          et al.
          <year>2013</year>
          .
          <article-title>An online outlier detection method based on wavelet technique and robust RBF network</article-title>
          .
          <source>Transactions of the Institute of Measurement and Control</source>
          <volume>35</volume>
          ,
          <issue>8</issue>
          (
          <year>2013</year>
          ),
          <fpage>1046</fpage>
          -
          <lpage>1057</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>J.</given-names>
            <surname>Takeuchi</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Yamanishi</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>A unifying framework for detecting outliers and change points from time series</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>18</volume>
          ,
          <issue>4</issue>
          (
          <year>2006</year>
          ),
          <fpage>482</fpage>
          -
          <lpage>492</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>Hui-xin</given-names>
            <surname>Tian</surname>
          </string-name>
          et al.
          <year>2016</year>
          .
          <article-title>An outliers detection method of time series data for soft sensor modeling</article-title>
          .
          <source>In Proceedings of the 28th Chinese Control and Decision Conference (2016 CCDC)</source>
          . IEEE, Piscataway, NJ,
          <fpage>3918</fpage>
          -
          <lpage>3922</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Jonathan</given-names>
            <surname>Von Brünken</surname>
          </string-name>
          , Michael E Houle, and
          <string-name>
            <given-names>Arthur</given-names>
            <surname>Zimek</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Intrinsic Dimensional Outlier Detection in High-Dimensional Data</article-title>
          .
          <source>Technical Report</source>
          . National Institute of Informatics, Tokyo.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>