<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>STAD: State-Transition-Aware Anomaly Detection Under Concept Drifts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bin Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emmanuel Müller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TU Dortmund</institution>
          ,
          <addr-line>Otto-Hahn-Straße 14, 44227 Dortmund</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The detection of temporal abnormal patterns over streaming data is challenging due to volatile data properties and the lack of real-time labels. The abnormal patterns are usually hidden in the temporal context and cannot be detected by evaluating single points. Furthermore, the normal state evolves over time due to concept drift. A single model does not fit all data over time. Autoencoders have recently been applied for unsupervised anomaly detection. However, they usually become stale and invalid after distributional drifts in the data stream. In this paper, we propose an autoencoder-based approach (STAD) for anomaly detection under concept drift. In particular, we use a state-transition-based model to map the different data distributions in each period of the data stream into states, thereby addressing the model adaptation problem in an interpretable way. We empirically demonstrate the state transition process and evaluate the anomaly detection performance on the Covid-19 dataset of Germany.</p>
      </abstract>
      <kwd-group>
        <kwd>State transition</kwd>
        <kwd>Anomaly detection</kwd>
        <kwd>Concept drift</kwd>
        <kwd>Autoencoder</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Anomaly detection in streaming data is gaining traction in current big data research. Despite
the high demand in a variety of real-world applications [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] (e.g., health care, device monitoring
and predictive maintenance), few existing models show convincing performance in real-time
deployment. The detection of abnormal patterns in streaming data is challenging. On the
one hand, labels are unavailable or expensive to acquire in real-time, such that supervised
approaches usually fail. On the other hand, conventional batch models quickly become outdated,
and a single stationary model does not fit the ever-changing data stream.
      </p>
      <p>
        Recently, autoencoders have been employed for anomaly detection in an unsupervised manner
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Autoencoders are trained to reconstruct the normal data, such that for any unknown data
instance, a high reconstruction error indicates an anomaly. Specifically, for time series data, the
temporal dependencies between data points can be captured by constructing autoencoders using
Recurrent Neural Networks (RNNs) and their variants [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Although such methods show
impressive performance on time series data, they usually ignore that such data are commonly
collected in a streaming fashion and cannot be fully accessed during the training phase. Therefore,
an adaptive autoencoder is desired, which can be initialized with a small amount of normal data
and updated according to real-time changes in the data distribution.
      </p>
      <p>Another major challenge of anomaly detection in streaming data is distinguishing between
abnormal patterns and concept drifts. Once the data stream drifts to a novel distribution,
a stationary model trained only on outdated data may undesirably flag most of the upcoming
data as anomalies.</p>
      <p>
        Given these challenges, our goal is to treat concept drift detection and anomaly
detection as a whole, adapt the model to the latest data distribution, and detect anomalies only
with respect to the temporal context in which they are located. Previous concept drift research
focuses on detecting changes of the joint probability P(X, y) under the supervised setting, namely,
the decision boundary changing along with the distributional changes in the input data [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
However, for anomaly detection, the class distribution between normal and abnormal is extremely
unbalanced, and labels are usually missing, so it is impractical to use traditional supervised
approaches [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], e.g., detecting drifts based on changes of the real-time prediction error rate. Instead,
adaptation based on changes of the prior probability P(X) ensures that the autoencoder
reconstructs the normal data from the current concept. Statistical tests are commonly used for
unsupervised drift detection [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. For instance, the two-sample tests examine whether samples
from two collections are generated from the same data distribution. However, existing methods
conduct tests mostly in the original input space, which only works for linearly detectable drifts.
Ceci et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] introduce both PCA and autoencoder to embed features into a latent space for
the change detection. However, their change detector is distance-based and highly depends on
a user-defined threshold.
      </p>
      <p>In this paper, we propose STAD (State-Transition-aware Anomaly Detection). In STAD, the data
distribution in a time period is defined as a state. We use state transitions to model the concept
drifts between periods. As autoencoders are well studied for non-linear time series anomaly
detection, we extend the state transition paradigm to autoencoders. We follow the standard
usage of autoencoders for anomaly detection and, as a novelty, couple the detection of concept
drifts and anomalies with the informative latent representation of autoencoders. An existing
autoencoder can be reused when a data concept reappears in the stream. A state transition
is triggered by the detection of a concept drift, which further guides the reuse or adaptation
of autoencoders for the next period. The states quantify the uncertainty caused by concept
drifts and improve interpretability in understanding the decisions of the autoencoders and the
changes in the data stream.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem definition</title>
      <sec id="sec-2-1">
        <title>2.1. Terminology</title>
        <sec id="sec-2-1-1">
          <title>2.1.1. Data Stream</title>
          <p>
            Let X = {x_t}, t ∈ ℕ, be a D-dimensional data stream, where x_t denotes the observation at
timestamp t. The data stream contains unlabeled anomalies as well as distributional changes
caused by concept drifts. Instead of explicitly categorizing different concept drift types [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ], we
uniformly consider that a concept drift occurs in the data stream between timestamps t and
t + d if the prior probability P&lt;t(X) ≠ P&gt;t+d(X), where P&lt;t and P&gt;t+d are respectively the
data distributions from the last concept drift to t and from t + d to the next concept drift. The
period [t, t + d] is the drift period, defined as the minimum period that covers the whole
distributional change. The data distribution outside drift periods is assumed to be stable.
Due to the lack of labels under the unsupervised setting, we only consider the prior (virtual)
shifts [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] in the data stream.
          </p>
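          <p>As an illustration of the prior shift defined above, the following sketch (our own example, not part of STAD) generates a one-dimensional stream whose generating distribution P(X) changes at a fixed timestamp:</p>
          <preformat>
```python
import random

# Illustrative sketch (our own example): a 1-D stream whose prior
# distribution P(X) shifts at t = 500, so the distribution before the
# drift differs from the distribution after it.
random.seed(0)

def stream(n=1000, drift_at=500):
    """Yield observations; the generating mean changes at drift_at."""
    for t in range(n):
        mean = 0.0 if drift_at > t else 3.0  # concept drift: mean 0 to 3
        yield random.gauss(mean, 1.0)

xs = list(stream())
before = sum(xs[:500]) / 500
after = sum(xs[500:]) / 500
# The empirical means differ clearly across the drift point.
print(round(before, 2), round(after, 2))
```
          </preformat>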
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.2. State transition</title>
          <p>Imitating automata theory, we formulate concept drifts in streaming data with a state
transition model ℳ = ⟨X, S, T⟩, where X is a multivariate data stream, S = {s_1, s_2, ..., s_N} is
a set of states (N is the user-defined maximum number of states that can be maintained), and T is a
set of transition functions T : {s_i ⇒ s_j} (s_i, s_j ∈ S, i ≠ j). Each state s_i = ⟨P_i, AE_i⟩ (i =
1, ..., N), where P_i is the empirically estimated distribution in the latent space and AE_i is the
autoencoder trained on the data of the corresponding concept. In this work, we assume that sufficient
data after a concept drift is available to learn P_i and AE_i.</p>
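          <p>A minimal data-structure sketch of the model ℳ = ⟨X, S, T⟩ could look as follows (the class and attribute names are our own; only the tuple structure comes from the definition above):</p>
          <preformat>
```python
from collections import deque
from dataclasses import dataclass, field

# Sketch of the state-transition model (names are our assumptions).
@dataclass
class State:
    dist: dict           # P_i: empirically estimated latent distribution
    autoencoder: object  # AE_i: model trained on this concept's data

@dataclass
class TransitionModel:
    max_states: int                  # N: maximum number of maintained states
    states: deque = None             # S: bounded queue of states
    transitions: list = field(default_factory=list)  # T: observed transitions

    def __post_init__(self):
        if self.states is None:
            # deque(maxlen=N) drops the oldest state when a new one arrives
            self.states = deque(maxlen=self.max_states)

m = TransitionModel(max_states=3)
for i in range(5):
    m.states.append(State(dist={}, autoencoder=None))
print(len(m.states))  # never exceeds N = 3
```
          </preformat>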
          <p>Considering that no information about the upcoming new concept is accessible, we keep
using the previous model for anomaly detection, despite a potentially high error rate, until the
model adaptation is finished. In other words, the previous model is used during the upcoming
drift period. For distributionally stationary data streams where no concept drift occurs, there
is only a single state without transitions, and the model reduces to a single conventional
autoencoder.</p>
        </sec>
        <sec id="sec-2-1-3">
          <title>2.1.3. Anomaly</title>
          <p>An observed data snippet X_t:t+l = {x_t, ..., x_t+l} (t, l ∈ ℕ) is abnormal if it deviates
significantly from its temporal neighbors (data snippets in the same state). The significance of
the deviation can be determined by thresholding or statistical techniques. Both concept drifts
and anomalous snippets deviate distributionally from their temporal neighbors. In our study,
we distinguish them in terms of length. After a concept drift, we assume that the data
distribution stays stationary in the new concept for a significantly longer period. In contrast,
the data stream returns to the previous distribution after a short anomalous snippet.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Problem statement</title>
        <p>Given a D-dimensional data stream X = {x_t}, t ∈ ℕ, we aim to identify any period [t, t + l]
whose corresponding data snippet X_t:t+l is abnormal. The detection process should be
unsupervised and run in real-time. We also detect concept drifts in the data stream and switch to an
existing autoencoder or train a new one on the newly arrived data.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. State-transition-aware anomaly detection</title>
      <p>In this section, we propose STAD, a state-transition-aware anomaly detection model, which
employs autoencoders as the base model. The latent representations of autoencoders are used
to detect concept drifts, which consequently trigger state transitions. An overview of STAD is
shown in Figure 1.</p>
      <sec id="sec-3-1">
        <title>3.1. Reconstruction and latent representation learning</title>
        <p>Let f : ℝ^(L×D) → ℝ^H and g : ℝ^H → ℝ^(L×D) be the encoder and decoder of an autoencoder,
where L is the snippet length, H is the latent dimension, and H, L ∈ ℕ. The
encoder maps a snippet X_t:t+l of the multivariate streaming data into an H-dimensional latent
representation z_t ∈ ℝ^H, while the decoder reconstructs the same-format snippet X′_t:t+l from
z_t. A common assumption for anomaly detection
using autoencoders is that pure normal data are available for the initial model training. The
reconstruction error e_t:t+l = |X_t:t+l − X′_t:t+l| indicates the goodness of fit to the normal data.</p>
        <p>
          In the test phase, abnormal snippets will cause larger reconstruction errors than normal data
such that they are separable. The encoder and decoder can be implemented with a variety of
deep models [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ]. Considering the temporal dependencies in streaming data, Recurrent
Neural Networks (RNNs) and their variants [
          <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
          ] are naturally suitable for the target. In the
following illustration, as an example, we take the LSTM-Autoencoder [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], which takes data
snippets as input and produces a single latent representation for each snippet. To map the
multivariate reconstruction error to the likelihood of anomalies, a commonly used approach is
to estimate a multivariate Gaussian distribution from the reconstruction error of normal data
and measure the Mahalanobis distance from the reconstruction error of an unknown data
point to the estimated distribution [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Moreover, the Gaussian Mixture Model (GMM) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]
and energy-based model [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] can also be used for likelihood estimation. Thresholding the estimated anomaly likelihood
in an unsupervised manner is challenging, especially in the
real-time prediction scenario. A possible non-parametric dynamic thresholding technique is
proposed in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. An unsupervised approach for adapting the threshold across different periods
is not the main focus of this paper and will be addressed in future work. In the following
sections, we focus on adapting autoencoders based on the state transitions.
        </p>
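        <p>The scoring step described above can be sketched as follows (following the idea in [4]; the autoencoder itself is omitted here, synthetic reconstruction errors stand in for |X − X′|, and all variable names are our own):</p>
        <preformat>
```python
import numpy as np

# Sketch: fit a multivariate Gaussian to the reconstruction errors of
# normal data, then score unknown errors by squared Mahalanobis distance.
rng = np.random.default_rng(0)
errors_normal = rng.normal(0.1, 0.02, size=(200, 3))  # per-dimension errors

mu = errors_normal.mean(axis=0)
cov = np.cov(errors_normal, rowvar=False)
cov_inv = np.linalg.inv(cov)

def anomaly_score(err):
    """Squared Mahalanobis distance of a reconstruction-error vector."""
    d = err - mu
    return float(d @ cov_inv @ d)

normal_score = anomaly_score(np.array([0.1, 0.1, 0.1]))
abnormal_score = anomaly_score(np.array([0.9, 0.9, 0.9]))
print(normal_score, abnormal_score)  # the abnormal error scores far higher
```
        </preformat>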
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Drift detection in the latent space</title>
        <p>
          In real-time, the latent representations of the autoencoder are accumulated for concept drift
detection. Existing concept drift detection approaches mostly work in the original space,
targeting linearly separable concept drifts. Considering the complex concept drifts in multivariate
streaming data, even non-linear distributional changes can be observed in the latent space of the
autoencoder. We perform a dimension-wise two-sample Kolmogorov–Smirnov test (KS-test) [
          <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
          ] as a
non-parametric and distribution-free statistical test to check whether two latent representations
are drawn from the same continuous distribution. Algorithm 1 shows the online concept drift
detection process. Formally, let ℒ_h = {z_(t−n_h−n_l+1), ..., z_(t−n_l)} be the latent representations
accumulated since the last concept drift and ℒ_l = {z_(t−n_l+1), ..., z_t} be the
latest latent representations. F_h and F_l are the empirical cumulative distribution
functions estimated from the two latent representation sets. The null hypothesis (i.e., the observations in
ℒ_h and ℒ_l are from the same distribution) is rejected if
sup_z |F_h(z) − F_l(z)| &gt; c(α) · √((n_h + n_l) / (n_h · n_l))
(1)
where sup denotes the supremum, α is the significance level, n_h and n_l are the sizes of ℒ_h
and ℒ_l, and c(α) = √(−ln(α/2) · 1/2). Since the KS-test is designed for univariate data, we conduct
parallel tests in each latent dimension and report a concept drift if the null hypothesis is rejected
in at least one of the dimensions. Once a concept drift is detected, the historical and latest
sample sets are emptied, and we further collect samples from the new data distribution.
        </p>
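        <p>The test in Equation (1) can be sketched per latent dimension with the standard library (the sample sets below are synthetic stand-ins; in STAD they would be one dimension of the accumulated latent representations ℒ_h and ℒ_l):</p>
        <preformat>
```python
import math
import random

def ecdf(sample, x):
    # Empirical CDF: fraction of sample values not exceeding x.
    return sum(1 for v in sample if x >= v) / len(sample)

def ks_drift(sample_h, sample_l, alpha=0.05):
    """Two-sample KS test: reject H0 (same distribution) if the supremum
    of |F_h - F_l| exceeds c(alpha) * sqrt((n_h + n_l) / (n_h * n_l))."""
    n_h, n_l = len(sample_h), len(sample_l)
    d = max(abs(ecdf(sample_h, x) - ecdf(sample_l, x))
            for x in sample_h + sample_l)
    c = math.sqrt(-math.log(alpha / 2.0) * 0.5)
    return d > c * math.sqrt((n_h + n_l) / (n_h * n_l))

random.seed(1)
base = [random.gauss(0, 1) for _ in range(200)]
same = [random.gauss(0, 1) for _ in range(200)]
shifted = [random.gauss(2, 1) for _ in range(200)]
# Drift is typically reported only for the shifted pair.
print(ks_drift(base, same), ks_drift(base, shifted))
```
        </preformat>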
      </sec>
      <sec id="sec-3-3">
        <title>3.3. State transition model</title>
        <p>Modeling reoccurring data distributions (e.g., seasonal changes), coupling autoencoders with
drift detection, and reusing models based on distributional features can increase the efficiency
of updating a deep model in real-time. In STAD, for each period between two concept drifts in
the data stream, the data distribution as well as the corresponding autoencoder are represented
as a state in a fixed-length queue S. The first state s_0 ∈ S represents the beginning period of the data
stream before the first concept drift. After every new concept drift, a new autoencoder is
trained from scratch if no existing element in the queue fits the current data distribution;
otherwise, the state transits to the existing one and the corresponding autoencoder is reused. In
our study, we assume that sufficient data after a concept drift can be accumulated to initialize
a new autoencoder. In future work, we plan to discover state transitions with limited data (e.g.,
tolerantly reusing existing autoencoders).</p>
        <p>To compare the distribution P_new estimated from the newly arrived latent representations ℒ_new
with the distributions of the existing states {P_i | i = 1, ..., N}, we employ the symmetrized
Kullback–Leibler divergence. The similarity between P_new and an existing state distribution P_i is
defined as
d(P_new, P_i) = D_KL(P_new ‖ P_i) + D_KL(P_i ‖ P_new) = ∑_(z∈ℒ) [ P_new(z) ln(P_new(z)/P_i(z)) + P_i(z) ln(P_i(z)/P_new(z)) ]
(2)</p>
        <p>
          The next step is to estimate the corresponding probability distributions from the sequences of
latent representations. In [
          <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
          ], the probability distribution of categorical data is estimated
by the number of object appearances in each category. In our case, the target is to estimate
the probability distribution of fixed-length real-valued latent representations. In previous
research, one possibility for density estimation over streaming data is to maintain histograms of
the raw data stream [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. In STAD, we take advantage of the fixed-size latent representation
of the autoencoder and maintain histograms of each period in the latent space for density
estimation. Let ℒ = {z_1, z_2, ..., z_n} be a sequence of observed latent representations, where
z_i = ⟨h_i1, h_i2, ..., h_iH⟩ and H is the latent space size. The histogram of ℒ is
Hist(j) = (1/n) ∑_(z_i∈ℒ) h_ij / ∑_(k=1..H) h_ik    (j = 1, ..., H)
(3)
and the density of a given period is estimated by P(j) = Hist(j). Hence, Equation 2 can be
converted to
d(P_new, P_i) = ∑_(j=1...H) [ Hist_new(j) ln(Hist_new(j)/Hist_i(j)) + Hist_i(j) ln(Hist_i(j)/Hist_new(j)) ]
(4)</p>
        <p>Algorithm 2 State Transition Procedure
1: function StateTransition(s_hist, ℒ_new, S, τ)
2:   P_new = DensityEstimation(ℒ_new)   ◁ Equation 4
3:   if min over s_i = ⟨P_i, AE_i⟩ ∈ S of d(P_new, P_i) ≤ τ then
4:     T ← T ∪ (s_hist ⇒ s_i)
5:     return AE_i   ◁ Reuse the autoencoder of the matching state
6:   end if
7:   return AE_new   ◁ AE_new: trained on new concept data</p>
        <p>For a newly detected concept with distribution P_new, if there exists a state s_i (i ∈ [1, N]) whose
probability distribution P_i satisfies d(P_new, P_i) ≤ τ, where τ is a tolerance factor,
and s_i is not the immediately preceding state, the concept drift can be treated as a reoccurrence of an
existing concept; therefore the corresponding autoencoder can be reused, and the state transfers
to the existing state s_i. If no autoencoder is reusable, a new one is trained on the data arriving
after the concept drift. To prevent an explosion in the number of states, the state transition
model ℳ = ⟨X, S, T⟩ only maintains the N latest states. The state transition procedure is
described in Algorithm 2.</p>
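        <p>The density estimation and reuse decision above can be sketched as follows (the binning range, the smoothing constant eps, and all function names are our assumptions; the paper does not specify these details):</p>
        <preformat>
```python
import math
import random

def histogram_density(samples, bins=10, lo=-3.0, hi=3.0, eps=1e-6):
    """Normalized histogram over a fixed range as a discrete density.
    eps smoothing avoids zero probabilities in the KL divergence."""
    counts = [eps] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = min(bins - 1, max(0, int((x - lo) / width)))
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def sym_kl(p, q):
    # Symmetrized Kullback-Leibler divergence, cf. Eq. (2)/(4).
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))

def state_transition(p_new, state_dists, tau):
    """Return the index of a reusable state, or None (train a new AE)."""
    best = min(range(len(state_dists)),
               key=lambda i: sym_kl(p_new, state_dists[i]))
    if tau >= sym_kl(p_new, state_dists[best]):
        return best   # reoccurring concept: reuse that state's autoencoder
    return None       # novel concept: train a new autoencoder

random.seed(2)
s0 = histogram_density([random.gauss(-1, 0.5) for _ in range(300)])
s1 = histogram_density([random.gauss(1, 0.5) for _ in range(300)])
p_new = histogram_density([random.gauss(1, 0.5) for _ in range(300)])
print(state_transition(p_new, [s0, s1], tau=0.5))  # matches state 1
```
        </preformat>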
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Case study</title>
      <p>
        We carry out a case study using the Covid-19 daily infection case dataset of Germany, where
the waves of the epidemic can be considered as human-interpretable concept drifts and the
public holidays with abnormal statistic numbers are the anomalies. The Covid dataset (Figure 2)
contains daily new infection cases and death cases in Germany from March 2020 to April
2021. The data stream follows a 7-day period and fluctuates with the trend depending on the
development of the epidemic, seasons, and local prevention policies. The LSTM-autoencoder
and scoring function in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] are applied as the base model. Both the encoder and decoder consist of
a single LSTM unit, and the latent representation is three-dimensional. We use the data between
March and May 2020 to initialize the autoencoder and let the remaining data arrive in a streaming
fashion. The model takes non-overlapping sliding windows (snippets) of length 7 (one week)
as inputs for the autoencoder. For both the initial and the real-time training, the autoencoders
are trained for 50 epochs with a dropout rate of 0.4. For the KS-test, the sizes of the two sample
windows ℒ_h and ℒ_l are set to 3 and 2, and the
significance level is set to α = 0.05. The real-time processing starts from June 2020. The dashed
lines in Figure 2 mark the positions where concept drifts are detected in the latent space.
      </p>
      <p>All four detected drifts are near significant changes in the evolution of the epidemic. The threshold τ of
KL-divergence is set to 0.0025. The size of the new buffer ℒ_new is 14. As shown in Figure 3,
no reusable autoencoder is found for the first three concept drifts, such that three new states
with corresponding new autoencoders are created. After the concept drift near March 2021,
the upcoming data in ℒ_new has a KL-divergence below τ with state s_2 (end of September to early
December), therefore triggering a backward state transition to s_2.</p>
      <p>In the test phase, we manually labeled 11 weeks containing public holidays in Germany as
abnormal snippets and ranked the anomaly scores in the periods corresponding to each state.
Evaluating recall in the ranking list, we obtain 18% for Recall@1, 54% for Recall@5, and 90%
for Recall@10. A major reason is that some data points from the beginning of concept drifts are
mistakenly flagged as anomalies before the model update. In follow-up work, we aim to
reduce the false positive detection of anomalies by distinguishing concept drifts and abnormal
snippets by their length.</p>
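      <p>The Recall@k evaluation used above can be computed as in this sketch (the scores and labels below are synthetic, not the actual Covid-19 results):</p>
      <preformat>
```python
def recall_at_k(scores, labels, k):
    """Fraction of true anomalies found among the k highest-scoring snippets."""
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = sum(labels[i] for i in ranked[:k])
    return hits / sum(labels)

scores = [0.9, 0.2, 0.8, 0.1, 0.4, 0.7]
labels = [1,   0,   0,   0,   1,   1]   # three labeled abnormal snippets
print(recall_at_k(scores, labels, 1))  # 1 of 3 anomalies found at rank 1
print(recall_at_k(scores, labels, 3))  # top-3 covers 2 of 3 anomalies
```
      </preformat>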
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We have proposed STAD, an autoencoder-based anomaly detection approach for streaming data,
which uses the latent representation to detect concept drifts and models state transitions between
different data distributions in the data stream. With a demo experiment, we showed the
state-transition-aware anomaly detection process during the stream evolution. However, there are
still open challenges. In the current work, we assume that sufficient data are available for online
training. However, the states of some periods in real data streams are too short, such that
the data for training a new model in real-time are not sufficient. One line of future work is to discover
efficient strategies for reusing autoencoders in such cases. Another research direction
is to discover semantic explanations for each state, which would help humans better understand
the model as well as the changes in the data.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sipple</surname>
          </string-name>
          ,
          <article-title>Interpretable, multidimensional, multimodal anomaly detection with negative sampling for detection of device failure</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>9016</fpage>
          -
          <lpage>9025</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sakurada</surname>
          </string-name>
          , T. Yairi,
          <article-title>Anomaly detection using autoencoders with nonlinear dimensionality reduction</article-title>
          ,
          <source>in: Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>4</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Marchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vesperini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Weninger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Eyben</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Squartini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schuller</surname>
          </string-name>
          ,
          <article-title>Non-linear prediction with lstm recurrent neural networks for acoustic novelty detection</article-title>
          , in: 2015
          <source>International Joint Conference on Neural Networks (IJCNN)</source>
          , IEEE,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Malhotra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Vig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          , G. Shroff,
          <article-title>Lstm-based encoder-decoder for multi-sensor anomaly detection</article-title>
          ,
          <source>arXiv preprint arXiv:1607.00148</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Learning under concept drift: A review</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>31</volume>
          (
          <year>2018</year>
          )
          <fpage>2346</fpage>
          -
          <lpage>2363</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Medas</surname>
          </string-name>
          , G. Castillo,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rodrigues</surname>
          </string-name>
          ,
          <article-title>Learning with drift detection</article-title>
          ,
          <source>in: Brazilian symposium on artificial intelligence</source>
          , Springer,
          <year>2004</year>
          , pp.
          <fpage>286</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Baena-García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>del Campo-Ávila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fidalgo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bifet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Gavalda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morales-Bueno</surname>
          </string-name>
          ,
          <article-title>Early drift detection method</article-title>
          ,
          <source>in: Fourth international workshop on knowledge discovery from data streams</source>
          , volume
          <volume>6</volume>
          ,
          <year>2006</year>
          , pp.
          <fpage>77</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. U.</given-names>
            <surname>Togbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chabchoub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Boly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Barry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chiky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bahri</surname>
          </string-name>
          ,
          <article-title>Anomalies detection using isolation in concept-drifting data streams</article-title>
          ,
          <source>Computers</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>13</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Corizzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Japkowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mignone</surname>
          </string-name>
          , G. Pio,
          <article-title>Echad: embedding-based change detection from multivariate time series in smart grids</article-title>
          ,
          <source>IEEE Access 8</source>
          (
          <year>2020</year>
          )
          <fpage>156053</fpage>
          -
          <lpage>156066</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lumezanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Deep autoencoding gaussian mixture model for unsupervised anomaly detection</article-title>
          ,
          <source>in: International Conference on Learning Representations</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Deep structured energy based models for anomaly detection</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1100</fpage>
          -
          <lpage>1109</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Hundman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Constantinou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Laporte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Colwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Soderstrom</surname>
          </string-name>
          ,
          <article-title>Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding</article-title>
          ,
          <source>in: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery &amp; data mining</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>387</fpage>
          -
          <lpage>395</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <article-title>Statistical features-based real-time detection of drifted twitter spam</article-title>
          ,
          <source>IEEE Transactions on Information Forensics and Security</source>
          <volume>12</volume>
          (
          <year>2016</year>
          )
          <fpage>914</fpage>
          -
          <lpage>925</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Dasu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Venkatasubramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <article-title>An information-theoretic approach to detecting changes in multi-dimensional data streams</article-title>
          ,
          <source>in: Proc. Symp. on the Interface of Statistics, Computing Science, and Applications</source>
          , Citeseer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sebastião</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <article-title>Change detection in learning histograms from data streams</article-title>
          ,
          <source>in: Portuguese Conference on Artificial Intelligence</source>
          , Springer,
          <year>2007</year>
          , pp.
          <fpage>112</fpage>
          -
          <lpage>123</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>