<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Neuroevolution methods for organizing the search for anomalies in time series</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Serhii Leoshchenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrii Oliinyk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergey Subbotin</string-name>
          <email>subbotin@zntu.edu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matviy Ilyashenko</string-name>
          <email>matviy.ilyashenko@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tetiana Kolpakova</string-name>
          <email>t.o.kolpakova@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National University “Zaporizhzhia Polytechnic”</institution>
          ,
          <addr-line>Zhukovskogo street 64, Zaporizhzhia, 69063</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper is devoted to the problem of detecting and classifying anomalies in time series data. Important applications of time series anomaly detection include healthcare, fraud detection, and system failure recognition. Despite scientists' extensive experience in detecting anomalies in time series, most methods look for individual objects that differ from ordinary objects, but do not take into account the specifics of the data sequence [1]. In this paper, a method for detecting anomalies and clustering time series based on neuroevolutionary approaches is proposed. Prediction-based methods, both statistical models and deep neural networks, are used to detect anomalies. Classical clustering methods that accept statistical parameters of series were used for clustering.</p>
      </abstract>
      <kwd-group>
        <kwd>Forecasting</kwd>
        <kwd>time series</kwd>
        <kwd>clustering</kwd>
        <kwd>neuroevolution</kwd>
        <kwd>genetic method</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction </title>
      <p>[Figure: types of analytics: prescriptive (decision support), predictive (forecast), diagnostic, and descriptive; axes: complexity of the model and human participation rate.]</p>
      <p>
        collection, processing, and preliminary analysis. The listed types of analytics differ both in the
complexity of the models used and in the degree of human participation [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5 ref6">2-6</xref>
        ].
      </p>
      <p>
        Analytics tools are applied in many areas: information security, the banking sector,
public administration, medicine, and many other subject areas. Often, the same method works
effectively for different subject areas, so developers of analytics systems create universal modules
containing different algorithms [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        For many technological systems, monitoring results can be represented as time series. The
properties of the time series are [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]:
• linking each measurement (sample, discrete value) to the time of its occurrence;
• equal time distance between measurements;
• the ability to restore the behavior of the process in the current and subsequent periods from data
from the previous period.
      </p>
      <p>
        Time series can describe more than just numerically measurable processes. The use of various
methods and model architectures, including deep neural networks, makes it possible to work with data from
natural language processing, computer vision tasks, etc. For example, a chat message can be
converted into numeric vectors (embeddings) that appear sequentially at certain times, and a video is
nothing more than a matrix of numbers that changes over time [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        So, time series are very useful for describing the operation of complex devices and are often used
for typical tasks: modeling, forecasting, feature selection, classification, clustering, pattern search,
anomaly search [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Examples of such usage include an electrocardiogram, changes in the value of
shares or currency, weather forecast values, changes in network traffic, engine operation parameters,
and much more.
      </p>
      <p>
        Time series have typical characteristics that accurately describe the nature of the time series [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5 ref6">2-6</xref>
        ]:
• period: a time interval of constant length for the entire series, at the ends of which the series
takes similar values;
• seasonality: the property of periodicity (a season is the same as a period);
• cycle: characteristic changes in a series related to global causes (for example, cycles in the
economy) that have no constant period;
• trend: a tendency towards a long-term increase or decrease in the values of a series.
      </p>
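      <p>As an illustration of the "period" characteristic, the following minimal sketch estimates the period of a series from its sample autocorrelation. It is not part of the method discussed in this paper; the function names and the rule "take the highest autocorrelation peak after the first negative value" are assumptions made only for this example.</p>
      <preformat>
```python
import math


def autocorrelation(series, lag):
    """Biased sample autocovariance at the given lag, normalised by lag 0."""
    n = len(series)
    mean = sum(series) / n
    centred = [v - mean for v in series]
    denom = sum(c * c for c in centred)
    num = sum(centred[t] * centred[t + lag] for t in range(n - lag))
    return num / denom


def estimate_period(series):
    """Return the lag of the highest autocorrelation peak after the first negative value."""
    max_lag = len(series) // 2
    acf = [autocorrelation(series, lag) for lag in range(max_lag + 1)]
    # Skip the trivial peak around lag 0 by waiting until the ACF turns negative.
    start = next((lag for lag in range(1, max_lag + 1) if 0 > acf[lag]), None)
    if start is None:
        return None  # the autocorrelation never becomes negative
    return max(range(start, max_lag + 1), key=lambda lag: acf[lag])


# A sinusoid sampled with 20 points per oscillation.
signal = [math.sin(2 * math.pi * t / 20) for t in range(200)]
period = estimate_period(signal)
print(period)
```
      </preformat>
      <p>For real data with a trend, the trend would normally be removed before such an estimate is made.</p>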
      <p>
        Time series may contain anomalies. An anomaly is a deviation in the standard behavior of a
process. The method of machine search for anomalies uses data about the operation of the process
(data sets) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Depending on the subject area, there may be different types of anomalies in the dataset.
It is customary to distinguish between several types of anomalies [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>[Figure 3: Examples of various anomalies in time series: a) point O1 and O2 and collective O3. Plot labels: N1, N2, O1, O2, O3; axes x and y.]</p>
      <p>
1. Point anomalies. They occur in situations where a single instance of data can be considered as
absolutely abnormal in relation to others [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
2. Contextual anomalies. They are observed if the instance is abnormal in a certain context, or
when a certain condition is met (therefore also called conditional) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
3. Collective anomalies. Occur when a sequence of related data instances (for example, a time
series graph) is abnormal relative to the rest of the data [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>An individual instance may not be a deviation, but the joint appearance of such instances will be a
collective anomaly.</p>
    </sec>
    <sec id="sec-2">
      <title>1.1. Anomaly detection strategies</title>
      <p>
        Detecting point anomalies often requires some kind of system model. If the system does not have a
deterministic mathematical model or such a model is too difficult to build, then a statistical model
must be available. Depending on the method of constructing a statistical model, the following
approaches are distinguished [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8-10</xref>
        ].
      </p>
      <p>1. Anomaly recognition with supervised learning. This technique requires a full-fledged
training sample that includes enough representatives of both the normal and abnormal classes of values.</p>
      <p>
        The technique is applied in two stages: first, the model is trained on data in which normal and
abnormal points are labeled manually. Then recognition takes place: new data is classified based on the
constructed model [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8-10</xref>
        ].
      </p>
      <p>It is usually assumed that the statistical properties of the model do not change over time, and such
a change often requires repeated training.</p>
      <p>
        The main difficulty of such methods is the formation of training data. In addition to the
obvious labor costs, the abnormal class is often represented worse than the normal one, which
can lead to inaccuracies in the resulting model [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8-10</xref>
        ].
      </p>
      <p>
        2. Anomaly recognition with partially supervised learning. This is similar to the previous approach, but the
training data represents only the normal class. A system trained on the normal class can determine
whether data belongs to it, thus identifying abnormal data by exclusion.
3. Anomaly recognition with unsupervised learning. In the absence of a priori information, this is the
only possible option. Recognition in the unsupervised mode is based on the assumption
that abnormal data is quite rare. Therefore, only the values that are farthest from the average are
indicated as anomalous [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8-10</xref>
        ]. Applying this technique to streaming data is difficult because it is
necessary to have an idea of the entire data array to have a good estimate of the average and
expected deviations.
      </p>
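      <p>The unsupervised strategy, which marks the values farthest from the average as anomalous, can be sketched as a simple z-score rule. This is a minimal illustration rather than the method proposed in this paper; the function name, the threshold of three standard deviations, and the sample readings are assumptions.</p>
      <preformat>
```python
import statistics


def zscore_anomalies(values, threshold=3.0):
    """Return indices whose distance from the mean exceeds threshold * stdev."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # a constant series has no outliers under this rule
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Hypothetical sensor readings with one obvious spike at index 7.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 25.0, 10.1, 9.9, 10.0, 10.2]
flagged = zscore_anomalies(readings, threshold=3.0)
print(flagged)
```
      </preformat>
      <p>Note that the mean and the standard deviation are computed over the whole array, which is exactly why the rule is hard to apply to streaming data.</p>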
    </sec>
    <sec id="sec-3">
      <title>1.2. Anomaly recognition methods</title>
      <p>
        Classification. This method is based on the fact that the normal behavior of a system can be
described by one or more classes. An instance that does not belong to any of the classes is then
anomalous. This method usually uses the “partially supervised learning” approach. Basic
mechanisms: neural networks, Bayesian networks, rule-based methods, support vector machines [
        <xref ref-type="bibr" rid="ref7 ref9">7, 9</xref>
        ].
      </p>
      <p>
        Clustering. This method is based on grouping similar values into clusters, and does not require
knowledge of the properties of possible deviations. Anomaly detection is based on the following
assumptions [
        <xref ref-type="bibr" rid="ref10 ref8">8, 10</xref>
        ]:
• normal data instances belong to a cluster;
• normal data is closer to the center of the cluster, abnormal data is further away;
• normal data forms large dense clusters, while abnormal data forms small and scattered ones.
One of the simplest clustering methods is the k-means algorithm.
      </p>
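      <p>The second assumption (abnormal data lies far from the cluster center) can be illustrated with a small k-means sketch on scalar data. The initialization scheme, the distance threshold, and all names below are assumptions chosen for the example; a production system would typically rely on a library implementation.</p>
      <preformat>
```python
def kmeans_1d(values, k, iterations=20):
    """Plain Lloyd iterations on scalar data; initial centres are spread over the sorted values."""
    ordered = sorted(values)
    centres = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)]
    return centres


def distance_anomalies(values, centres, max_distance):
    """Flag indices that lie farther than max_distance from every cluster centre."""
    return [i for i, v in enumerate(values)
            if min(abs(v - c) for c in centres) > max_distance]


# Two dense groups around 1 and 5, plus one distant value.
data = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2, 9.0]
centres = kmeans_1d(data, k=2)
outliers = distance_anomalies(data, centres, max_distance=1.0)
print(centres, outliers)
```
      </preformat>
      <p>In this toy run the distant value is assigned to the second cluster and pulls its center upwards, which shows why the distance threshold has to be chosen with care.</p>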
      <p>
        Statistical analysis. Using this approach, a statistical model of the process is constructed, which is
then compared with the actual behavior. If the actual behavior differs from the model by more than a
certain threshold, it is concluded that there are anomalies. Statistical analysis methods are divided into
two groups [
        <xref ref-type="bibr" rid="ref7">7</xref>
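        ]:
• parametric methods. Normal data is assumed to have a probability density function <inline-formula><tex-math>f(x, \theta)</tex-math></inline-formula>,
where <inline-formula><tex-math>\theta</tex-math></inline-formula> is the parameter vector and <inline-formula><tex-math>x</tex-math></inline-formula> is the data instance (observation);
• nonparametric methods. The model structure is not defined a priori, but is determined from the
data provided.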
      </p>
      <p>
        Nearest neighbor method. When using this technique, a metric is introduced (a measure of
similarity between objects). Then two approaches are possible [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]:
• distance to the k-th nearest neighbor. Abnormal data is the most distant from all other data;
• relative density. Samples in areas with low relative density are evaluated as abnormal
(e.g., the local outlier factor method).
      </p>
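      <p>The first approach, scoring each sample by the distance to its k-th nearest neighbor, can be sketched as follows. The absolute difference is used as the metric and the data are scalar; both choices, as well as the function name, are assumptions made for brevity.</p>
      <preformat>
```python
def kth_neighbor_distances(values, k):
    """For each scalar sample, return the distance to its k-th nearest other sample."""
    scores = []
    for i, v in enumerate(values):
        distances = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(distances[k - 1])
    return scores


samples = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0]
scores = kth_neighbor_distances(samples, k=2)
# The sample with the largest score is the strongest anomaly candidate.
most_anomalous = max(range(len(samples)), key=lambda i: scores[i])
print(scores, most_anomalous)
```
      </preformat>
      <p>This brute-force version compares every pair of samples, so its cost grows quadratically with the number of samples.</p>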
      <p>Spectral methods. Based on the spectral (frequency) characteristics of the data, a model is
constructed that is designed to take into account most of the variability in the data.</p>
      <p>Let's compare the existing methods and present the results as a table.</p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <title>Comparison of anomaly recognition methods</title>
        </caption>
        <table>
          <thead>
            <tr><th>Method</th><th>Result</th><th>Strategy</th><th>Classification of anomalies</th></tr>
          </thead>
          <tbody>
            <tr><td>Classification</td><td>Tag</td><td>Supervised learning, partially supervised learning</td><td>Yes</td></tr>
            <tr><td>Clustering</td><td>Tag</td><td>Supervised learning, partially supervised learning</td><td>No</td></tr>
            <tr><td>Statistical analysis</td><td>Degree</td><td>Partially supervised learning</td><td>No</td></tr>
            <tr><td>Nearest neighbor</td><td>Degree</td><td>Unsupervised learning</td><td>No</td></tr>
            <tr><td>Spectral methods</td><td>Tag</td><td>Unsupervised learning, partially supervised learning</td><td>No</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In general, the comparison shows that the existing methods have an insufficient
qualitative level: most of them cannot use all machine learning strategies
(approaches), and their results do not give an unambiguous
assessment of the detected anomalies. Moreover, most papers note an insufficient level
of accuracy when these methods are used with newer model topologies. That is why
developing new approaches and methods for detecting anomalies in time series remains an urgent
task.</p>
    </sec>
    <sec id="sec-4">
      <title>2. Related Works </title>
      <p>
        Point anomalies are the easiest to recognize: these are individual points where the behavior of the
process differs dramatically from other points. For example, you can observe a sharp deviation of
parameter values at a particular point [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ].
      </p>
      <p>Such values are called “outliers”; they strongly influence the statistical indicators of the process
and are easily detected by setting thresholds for the observed value.</p>
      <p>
        It is more difficult to detect an anomaly in a situation where the process behaves “normally” at
each point, but collectively the values at several points behave “strangely”. Such abnormal behavior
can include, for example, a change in the waveform, a change in statistical indicators (mean, mode,
median, variance), the appearance of a mutual correlation between two parameters, small or
short-term abnormal changes in the amplitude, and so on. In this case, the task is to recognize
abnormal behavior of parameters that cannot be detected by conventional statistical methods [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ].
      </p>
      <p>Finding anomalies is very important. In one situation, data should be cleared of anomalies in order
to get a more realistic picture, while in another situation, anomalies should be carefully examined, as
they may indicate a possible rapid transition of the device to emergency mode.</p>
      <p>
        Finding anomalies in time series is not easy (unclear definition of anomalies, lack of labels,
non-obvious correlations). So far, state-of-the-art (SOTA) algorithms for searching for
anomalies in time series have a high false positive rate [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ].
      </p>
      <p>
        Only a small number of anomalies, mostly point-based, can be detected manually if good data
visualization is available. Group anomalies are harder to detect manually, especially when it comes to
large amounts of data and analyzing information about multiple devices. Also difficult to detect is the
case of an “anomaly in time”, when a normal signal appears at the “wrong” time. Therefore, when
searching for anomalies in time series, it is advisable to use automation methods [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ].
      </p>
      <p>A big problem in finding anomalies in real data is that the data is usually not labeled, so
initially it is not strictly defined what an anomaly is, and there are no rules for searching. In such
situations, it is necessary to apply unsupervised learning methods,
in which models independently determine the relationships and characteristic patterns in the data.</p>
      <p>
        The methods used to search for anomalies in time series are usually divided into groups [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ]:
• proximity-based: anomaly detection based on information about the proximity of parameters or of
fixed-length parameter sequences; suitable for detecting point anomalies and outliers, but unable to detect
changes in the waveform;
• prediction-based: a forecast model is built and the forecast is compared with the actual value; best
applied to time series with pronounced periods, cycles, or seasonality;
• reconstruction-based: methods based on the recovery (reconstruction) of data fragments, which can
detect both point anomalies and group anomalies, including
changes in the waveform.
      </p>
      <p>
        Proximity-based methods are focused on finding values that significantly deviate from the
behavior of all other points. The simplest and most obvious example of implementing this method is
monitoring whether a given threshold of values is exceeded [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        In prediction-based methods, the main task is to build a high-quality process model in order to
simulate the signal and compare the simulated values with the original (true) ones. If the
predicted and true signals are close, the behavior is considered “normal”; if the values in the
model are very different from the true ones, the behavior of the system in this area is declared
abnormal [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
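      <p>The prediction-based idea can be sketched with the simplest possible forecast model, a seasonal naive forecast that predicts each value by the value one period earlier. This stands in for a fitted model only for the sake of illustration; the period, the threshold, and the function names are assumptions.</p>
      <preformat>
```python
import math


def seasonal_naive_residuals(series, period):
    """Absolute error of the forecast y_hat[t] = y[t - period]."""
    return [abs(series[t] - series[t - period]) for t in range(period, len(series))]


def forecast_anomalies(series, period, threshold):
    """Return time indices where the forecast error exceeds the threshold."""
    residuals = seasonal_naive_residuals(series, period)
    return [t + period for t, r in enumerate(residuals) if r > threshold]


period = 20
series = [math.sin(2 * math.pi * t / period) for t in range(100)]
series[57] += 2.0  # injected point anomaly
flagged = forecast_anomalies(series, period, threshold=1.0)
print(flagged)
```
      </preformat>
      <p>The naive forecast flags index 57 and also index 77, because the corrupted value is reused as the prediction one period later. A model that is fitted on clean data does not produce such an echo.</p>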
      <p>
        The most common time series modeling methods are SARIMA [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and recurrent neural networks.
      </p>
      <p>
        An original approach is used in reconstruction-based models: first, the model is trained to encode
and decode signals from an existing sample, while the encoded signal has a much smaller dimension
than the original one, so the model has to learn to “compress” information. An example of such
compression for 32-by-32-pixel images is given in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>After training, the model is given input signals that are segments of the time series under study. If
encoding and decoding are successful, the behavior of the process is considered “normal”;
otherwise, the behavior is declared abnormal.</p>
      <p>
        One of the newly developed reconstruction-based methods that show good results in detecting
anomalies is TadGAN [
        <xref ref-type="bibr" rid="ref13 ref14">13-15</xref>
        ], developed by MIT researchers in late 2020. The TadGAN
architecture contains elements of an autoencoder and generative adversarial networks.
      </p>
      <p>
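        The network <inline-formula><tex-math>\mathcal{E}</tex-math></inline-formula> acts as an encoder that translates time series segments x into hidden space vectors
z, and <inline-formula><tex-math>\mathcal{G}</tex-math></inline-formula> is a decoder that recovers time series segments from the hidden representation z. <inline-formula><tex-math>\mathcal{C}_x</tex-math></inline-formula> is the
critic that evaluates the recovery quality of <inline-formula><tex-math>\mathcal{G}(\mathcal{E}(x))</tex-math></inline-formula>, and <inline-formula><tex-math>\mathcal{C}_z</tex-math></inline-formula> is the critic that evaluates the similarity
of the hidden representation <inline-formula><tex-math>z = \mathcal{E}(x)</tex-math></inline-formula> to white noise. In addition, the
“similarity” of the original and restored samples is controlled using the L2 measure according to the “cycle
consistency loss” ideology (which ensures the overall similarity of the generated samples to the
original samples in GAN) [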
        <xref ref-type="bibr" rid="ref13">13</xref>
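        ]. The final objective function is a combination of all metrics for
evaluating the quality of work of the critics <inline-formula><tex-math>\mathcal{C}_x</tex-math></inline-formula> and <inline-formula><tex-math>\mathcal{C}_z</tex-math></inline-formula> and a measure of similarity between the original and
restored signals.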
      </p>
      <p>
        To create and train a neural network, you can use various standard packages (for example,
TensorFlow or PyTorch) that have a high-level API. An example of implementing a TadGAN-like
architecture using the TensorFlow package for weight training can be found in the repository [
        <xref ref-type="bibr" rid="ref14">14, 15</xref>
        ].
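When training this model, five metrics were optimized:
<disp-formula><label>(1)</label><tex-math>\min_{\{\mathcal{E},\,\mathcal{G}\}} \; \max_{\{\mathcal{C}_x \in \mathbf{C}_x,\; \mathcal{C}_z \in \mathbf{C}_z\}} V_X(\mathcal{C}_x, \mathcal{G}) + V_Z(\mathcal{C}_z, \mathcal{E}) + V_{L2}(\mathcal{E}, \mathcal{G})</tex-math></disp-formula>
• aeLoss: the root-mean-square deviation between the original and restored time series, i.e.
the difference between x and <inline-formula><tex-math>\mathcal{G}(\mathcal{E}(x))</tex-math></inline-formula>;
• cxLoss: the binary cross-entropy of the critic <inline-formula><tex-math>\mathcal{C}_x</tex-math></inline-formula>, which determines the difference between a true
time series segment and an artificially generated one;
• cx_g_loss: the binary cross-entropy error of the generator <inline-formula><tex-math>\mathcal{G}(\mathcal{E}(x))</tex-math></inline-formula>, which characterizes its inability to
“deceive” the critic <inline-formula><tex-math>\mathcal{C}_x</tex-math></inline-formula>;
• czloss: the binary cross-entropy of the critic <inline-formula><tex-math>\mathcal{C}_z</tex-math></inline-formula>, which determines the difference between the
hidden vector generated by the encoder and white noise; it makes the hidden vector
<inline-formula><tex-math>\mathcal{E}(x)</tex-math></inline-formula> similar to a random vector, preventing the model from “memorizing” individual patterns in the source
data;
• cz_g_Loss: the binary cross-entropy error of the generator <inline-formula><tex-math>\mathcal{E}(x)</tex-math></inline-formula>, which characterizes its
inability to create hidden vectors similar to random ones and thus “deceive” the critic <inline-formula><tex-math>\mathcal{C}_z</tex-math></inline-formula>.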
      </p>
      <p>
        After training the model, individual segments of the time series under study are reconstructed and
the original and reconstructed series are compared, which can be performed using one of the
following methods [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]:
• point-wise comparison;
• comparison of curve areas in a given area around each sample (the area length is a
hyperparameter);
• Dynamic Time Warping.
      </p>
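      <p>To show how a point-by-point comparison differs from Dynamic Time Warping (DTW), the sketch below implements the textbook DTW recurrence with an absolute-difference cost. It is a generic illustration and not the implementation used in TadGAN or Orion; the function names and the toy sequences are assumptions.</p>
      <preformat>
```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def pointwise_distance(a, b):
    """Sum of absolute differences between samples at the same position."""
    return sum(abs(x - y) for x, y in zip(a, b))


original = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape, delayed by one sample
print(pointwise_distance(original, shifted), dtw_distance(original, shifted))
```
      </preformat>
      <p>The point-by-point distance penalizes the one-sample delay, whereas DTW aligns the two shapes and reports no difference, so DTW is less sensitive to small time shifts in the reconstruction.</p>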
      <p>Quality is evaluated using the F1 metric for the binary classification problem, where “positive” (null
hypothesis) means that there is an anomaly and “negative” (alternative hypothesis) means that there is no anomaly.</p>
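      <p>For reference, the F1 metric is the harmonic mean of precision and recall, both of which follow from the counts of true positives (TP), false positives (FP), and false negatives (FN). The sketch below uses the standard definitions; the function name and the example counts are assumptions and do not come from the experiments in this paper.</p>
      <preformat>
```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts; the anomaly is the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 anomalies found, 2 false alarms, 8 anomalies missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(round(p, 3), round(r, 3), round(f1, 3))
```
      </preformat>
      <p>True negatives do not enter the formula, which makes F1 suitable for data in which anomalies are rare.</p>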
      <sec id="sec-4-1">
        <title>Table 2 </title>
        <table-wrap>
          <caption>
            <title>Quality is evaluated using the F1 metric for the binary classification problem</title>
          </caption>
          <table>
            <thead>
              <tr><th /><th>There is an anomaly</th><th>There is no anomaly</th></tr>
            </thead>
            <tbody>
              <tr><td>The anomaly is predicted by the model</td><td>TP: correctly predicted anomaly</td><td>FP: predicted an anomaly where it doesn't exist</td></tr>
              <tr><td>The model predicted the absence of an anomaly</td><td>FN: there is an anomaly, but it was not found</td><td>TN: there is no anomaly and the model does not see it</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-4-1-5">
          <p>To demonstrate the operation of the method, we use a synthetic (artificially generated) series
without anomalies, which is the sum of two sinusoids, the values of which vary in the range from -1
to 1: <inline-formula><tex-math>y(x) = \frac{1}{2}\sin(x) + \frac{1}{2}\sin(0.8x)</tex-math></inline-formula>.</p>
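          <p>A series of this kind can be generated as in the sketch below, which assumes the form y(x) = 0.5 sin(x) + 0.5 sin(0.8x), consistent with the stated range from -1 to 1. The sampling step, the series length, and the positions and magnitude of the injected spikes are assumptions made for the example.</p>
          <preformat>
```python
import math


def synthetic_series(n_points, step=0.1):
    """Sample y(x) = 0.5*sin(x) + 0.5*sin(0.8*x) on a regular grid."""
    return [0.5 * math.sin(i * step) + 0.5 * math.sin(0.8 * i * step)
            for i in range(n_points)]


def inject_point_anomalies(series, positions, magnitude):
    """Return a copy of the series with a spike added at each position."""
    corrupted = list(series)
    for pos in positions:
        corrupted[pos] += magnitude
    return corrupted


clean = synthetic_series(1000)
noisy = inject_point_anomalies(clean, positions=[200, 640], magnitude=1.5)
print(min(clean), max(clean), max(noisy))
```
          </preformat>
          <p>The clean series can serve as training data, and the corrupted copy as test data with known anomaly positions.</p>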
          <p>
            It can be seen that the model has quite accurately learned to predict the main patterns in the data.
Let's try adding various anomalies to the data and then detecting them using the TadGAN model. First, let's
add a few point anomalies [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ].
          </p>
          <p>
            The graph of the original and predicted signals shows that the model cannot restore the “peaks” of
anomalous values, which can be used to determine point anomalies with high accuracy. However, in
such a situation, the benefit of the complex TadGAN model is not obvious: such anomalies can also be
detected by checking whether thresholds are exceeded [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ].
          </p>
          <p>
            Now consider a signal with a different type of anomaly: a periodic signal with an abnormal
frequency change. In this case, the threshold is not exceeded: from the point of view of
amplitude, all elements of the series are “normal” values, and the anomaly is detected only in the
group behavior of several points. In this case, TadGAN also cannot restore the signal (as can be seen
in the figure), and this failure can be used as a sign of the presence of a group anomaly [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ].
          </p>
        </sec>
        <sec id="sec-4-1-6">
          <title>Figure 6: The result of the TadGAN operation on a dataset with an abnormal frequency change. </title>
          <p>These two examples illustrate the work of the method. The reader can also try to create their own
data sets and test the model's capabilities in various situations.</p>
          <p>
            More complex examples of datasets can be found in the article by the authors of the TadGAN
method [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]. There is also a link to the Orion library, developed by MIT specialists, which
uses machine learning to recognize rare anomalies in time series with the unsupervised learning
approach [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ].
          </p>
          <p>However, most scientists agree that each case requires its own method of signal reconstruction and
a corresponding method of training the model, which significantly slows down practical
implementation.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>3. Proposed method </title>
      <p>The prototype of our solution, shown in Fig. 7, consists of two separate groups of the same
population. The first is a set of models, and the second is a set of data individuals.</p>
      <p>Each model is a sequence of encoder and decoder levels that describe the input multidimensional
signal [16]. The framework allows you to create an ensemble model using an approach similar to
bagging. A whole ensemble model is a set of sub-models that work
with subgroups of input signals. Ensemble models form a set.</p>
      <p>The work of the solution begins with generating initial groups of input signals using correlation;
the system then updates the models and groups using a genetic algorithm [17-19]. In parallel, genetic
operators optimize individual models in each subgroup by making changes to the topology of neural
models (for example, model length and layer parameters) [17-19]. The end result of these actions is an
ensemble model optimized for anomaly detection. The ensemble model is defined as follows.</p>
      <p>A number of papers note that almost all models detect
a similar set of anomalies, despite differences between the models and changes in their hyperparameters [16]. Of course, the results
differed depending on the hyperparameters, but none of the changes significantly affected detection.
Because of this, an ensemble model is proposed, based on the simultaneous division of the search
space into subgroups and the evolution of models for such subgroups. As a result, the models were
able to identify more specific dependencies and relationships between signals [20-22].</p>
      <p>Models within each subgroup are optimized by changing their internal structure. The evolution of
a single model is carried out in five main stages:
1. clustering the search space [23-25];
2. crossing, mutating, and selecting the best models for individual clusters;
3. syncing solutions;
4. multi-parent crossover [17-19];
5. evaluation of the ensemble solution.</p>
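      <p>The crossing, mutating, and selecting stage can be illustrated with a toy genetic loop. The sketch below is not the proposed method: a bit string stands in for a model topology, the fitness is a simple sum of genes, and the clustering, syncing, and multi-parent crossover stages are omitted. The function name, the number of generations, and the rule that parents are drawn from the better half of the population are assumptions; the default population size, elite share, mutation probability, and uniform crossover follow the meta-parameters reported in the experimental section.</p>
      <preformat>
```python
import random


def evolve(fitness, genome_length, population_size=100, elite_fraction=0.05,
           mutation_probability=0.25, generations=30, seed=0):
    """Toy genetic loop: elitism, uniform crossover, and one-gene mutation on bit strings."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    elite_size = max(1, int(population_size * elite_fraction))
    best_history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        # Elites survive unchanged, so the best fitness never decreases.
        next_generation = [list(g) for g in population[:elite_size]]
        parents = population[:population_size // 2]
        while population_size > len(next_generation):
            mother, father = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from either parent with equal probability.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            if mutation_probability > rng.random():
                position = rng.randrange(genome_length)
                child[position] = 1 - child[position]
            next_generation.append(child)
        population = next_generation
    population.sort(key=fitness, reverse=True)
    best_history.append(fitness(population[0]))
    return population[0], best_history


best, history = evolve(fitness=sum, genome_length=20)
print(sum(best), history[0], history[-1])
```
      </preformat>
      <p>In a neuroevolution setting, the genome would encode the layers and connections of a model, and the mutation operator would add or remove neurons and connections instead of flipping a bit.</p>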
    </sec>
    <sec id="sec-6">
      <title>4. Experimental research </title>
      <p>For the experimental study of the proposed method, the following datasets were used for training and
testing:
• The Secure Water Treatment (SWaT) Dataset [26], [16]. This dataset contains data gathered
from a scaled-down version of a real water treatment plant. The data were collected for 11 days in two
modes: 7 days of normal operation of the plant and 4 days during which cyber and physical
attacks were executed;
• The Water Distribution (WADI) Dataset [27], [16]. This dataset contains data from a
scaled-down version of a water distribution network in a city. The collected data contain 14 days of normal
operation and 2 days during which 15 attacks were executed.</p>
      <p>General information about the datasets is presented in Table 3.</p>
      <table-wrap>
        <label>Table 3</label>
        <caption>
          <title>General information about datasets</title>
        </caption>
        <table>
          <thead>
            <tr><th>Datasets</th><th>SWaT</th><th>WADI-2017</th><th>WADI-2019</th></tr>
          </thead>
          <tbody>
            <tr><td>Number of input signals</td><td>51</td><td>123</td><td>123</td></tr>
            <tr><td>Number of training samples</td><td>49,668</td><td>1,048,571</td><td>784,571</td></tr>
            <tr><td>Number of test samples</td><td>44,981</td><td>172,801</td><td>172,801</td></tr>
            <tr><td>Share of anomalies</td><td>11.97%</td><td>5.99%</td><td>5.77%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The meta-parameters for the neuroevolutionary synthesis of models are shown in Table 4.</p>
      <table-wrap>
        <label>Table 4</label>
        <caption>
          <title>The meta-parameters for neuroevolution synthesis</title>
        </caption>
        <table>
          <thead>
            <tr><th>Metaparameter</th><th>Value</th></tr>
          </thead>
          <tbody>
            <tr><td>Population size</td><td>100</td></tr>
            <tr><td>Elite size</td><td>5%</td></tr>
            <tr><td>Activation function (fitness functions)</td><td>hyperbolic tangent</td></tr>
            <tr><td>Mutation probability</td><td>25%</td></tr>
            <tr><td>Crossover type</td><td>uniform</td></tr>
            <tr><td>Types of mutation</td><td>deleting an interneuronal connection; removing a neuron; adding an interneuronal connection; adding a neuron; changing the activation function</td></tr>
            <tr><td>Clustering method</td><td>k-nearest neighbors</td></tr>
            <tr><td>Neighbors number</td><td>7</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The results of the work are presented in Table 5.</p>
      <table-wrap>
        <label>Table 5</label>
        <caption>
          <title>The work results of the proposed method</title>
        </caption>
        <table>
          <thead>
            <tr><th>Datasets</th><th>SWaT</th><th>WADI-2017</th><th>WADI-2019</th></tr>
          </thead>
          <tbody>
            <tr><td>Precision, %</td><td>94.41</td><td>90.28</td><td>89.53</td></tr>
            <tr><td>Recall, %</td><td>55.35</td><td>70.64</td><td>71.47</td></tr>
            <tr><td>F1</td><td>0.74</td><td>0.82</td><td>0.83</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-7">
      <title>5. Discussions of results </title>
      <p>The experimental results show that clustering the data in parallel with synthesizing a model
on the processed data, within an ensemble system, can significantly increase the efficiency of
anomaly detection. Neuroevolution synthesizes the models and develops them in stages as updated
information about the input data arrives, so data preprocessing does not have to be a separate
step. The tests were carried out on samples of the WADI and SWaT data. In both cases, anomaly
detection reached satisfactory values, raising the quality of detection. The improvement on the
WADI dataset was more pronounced than on the SWaT dataset, which we attribute to the larger number
of sensors and samples in WADI.</p>
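      <p>For readers who want to reproduce quality figures of the kind reported in Table 5, the following minimal sketch shows one common way to compute precision, recall and F1. It assumes point-wise binary labels (1 for an anomalous sample, 0 for a normal one); the function name and the toy labels are illustrative and are not taken from the experiments.</p>
      <preformat preformat-type="code">
```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Point-wise precision, recall and F1 for binary labels (1 = anomaly)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against empty denominators (no predicted or no real anomalies).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up for illustration, not data from the paper).
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```
      </preformat>
      <p>The function returns fractions in the range from 0 to 1; multiply by 100 to express precision and recall in percent. Whether the metrics are computed per sample or per anomalous segment is a design choice that affects the reported values.</p>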
      <p>The results indicate that the neuroevolution approach can have a positive impact on anomaly
detection. Our future work will focus on further improvements to the method; in particular,
combining different solution topologies may significantly improve the results.</p>
    </sec>
    <sec id="sec-8">
      <title>6. Conclusion </title>
      <p>In this paper, existing strategies and methods for detecting and classifying anomalies were
studied and compared, and a detection method based on neuroevolution was proposed. The results of
the study show that the neuroevolution-based method of detecting anomalies is markedly more
effective. The proposed method has proved viable and leaves room for further improvement.</p>
      <p>The result of the work is not only a comprehensive study and theoretical justification of
the theory associated with the analysis of time series, but also the proposed solution. The method
operates in two stages: separation of the search space and synthesis of models. At the training
stage, the method processes and separates data about the behavior of the system. At the synthesis
stage, it gradually adjusts the models so that the final solution can later be obtained from them.
The resulting model is synthesized using uniform crossover, which makes it possible to enlarge the
parent pool from two individuals to a much larger number.</p>
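      <p>The idea of uniform crossover over an enlarged parent pool can be illustrated with a short sketch. It assumes, purely for illustration, that every individual is encoded as a fixed-length list of numeric genes; the function name and this encoding are hypothetical and do not reproduce the encoding used in the proposed method, where network topologies may differ between individuals.</p>
      <preformat preformat-type="code">
```python
import random
from typing import List, Optional, Sequence


def uniform_crossover(
    parents: Sequence[Sequence[float]],
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Build one child by taking each gene from a randomly chosen parent.

    Works for any number of parents, not only two.
    """
    if len(parents) == 0:
        raise ValueError("at least one parent is required")
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("all parent genomes must have the same length")

    rng = rng or random.Random()
    # For every gene position, pick the donor parent uniformly at random.
    return [rng.choice(parents)[i] for i in range(length)]


# Three hypothetical parents with easily recognizable genes.
parents = [[0.0] * 6, [1.0] * 6, [2.0] * 6]
child = uniform_crossover(parents, random.Random(0))
print(child)
```
      </preformat>
      <p>Because each gene is drawn independently, adding more parents to the pool requires no change to the operator: the child simply samples from a wider set of donors at every position.</p>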
    </sec>
    <sec id="sec-9">
      <title>7. Acknowledgements </title>
      <p>The work was carried out with the support of the state budget research projects of the
National University “Zaporizhzhia Polytechnic”: “Development of methods and tools for analysis and
prediction of dynamic behavior of nonlinear objects” (state registration number 0121U107499) and
“Intelligent methods and tools for diagnosing and predicting the state of complex objects” (state
registration number 0122U000972).</p>
      <ref-list>
        <title>8. References</title>
        <ref id="ref15">
          <mixed-citation>[15] Examples. TadGAN, 2022. URL: https://github.com/CyberLympha/Examples/tree/main/%D0%A0%D0%B0%D0%B7%D0%B1%D0%BE%D1%80%20%D1%81%D1%82%D0%B0%D1%82%D0%B5%D0%B9/TadGAN</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>[16] K. Faber, M. Pietron, D. Zurek, Ensemble Neuroevolution-Based Approach for Multivariate Time Series Anomaly Detection, Entropy, 23 (2021). doi: 10.3390/e23111466.</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] S. Leoshchenko, S. Subbotin, A. Oliinyk, V. Lytvyn, M. Ilyashenko, Smart crossover mechanism for parallel neuroevolution method of medical diagnostic models synthesis, in: Proceedings of the Third International Workshop on Computer Modeling and Intelligent Systems, CMIS-2020, CEUR-WS, Zaporizhzhia, 2020, pp. 57-69.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] S. Leoshchenko, A. Oliinyk, S. Subbotin, Adaptive Mechanisms for Parallelization of the Genetic Method of Neural Network Synthesis, in: Proceedings of the 10th International Conference on Advanced Computer Information Technologies, ACIT 2020, Deggendorf, Ternopil, IEEE, 2020, pp. 446-450. doi: 10.1109/ACIT49673.2020.9208905.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] S. Leoshchenko, A. Oliinyk, S. Subbotin, T. Zaiko, S. Shylo, V. Lytvyn, Sequencing for encoding in neuroevolutionary synthesis of neural network models for medical diagnosis, in: Proceedings of the 3rd International Conference on Informatics &amp; Data-Driven Medicine, IDDM 2020, Växjö, Lviv, CEUR-WS, 2020, pp. 62-71.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] J.A.J. Alsayaydeh, Irianto, M. Zainon, H. Baskaran, S. G. Herawan, Intelligent Interfaces for Assisting Blind People using Object Recognition Methods, International Journal of Advanced Computer Science and Applications (IJACSA), 13(5) (2022) 734-741. doi: 10.14569/IJACSA.2022.0130584.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] J.A.J. Alsayaydeh, Irianto, A. Aziz, C.K. Xin, A. K. M. Zakir Hossain, S.G. Herawan, Face Recognition System Design and Implementation using Neural Networks, International Journal of Advanced Computer Science and Applications (IJACSA), 13(6) (2022) 519-526. doi: 10.14569/IJACSA.2022.0130663.</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] V. Shkarupylo, I. Blinov, A. Chemeris, J.A.J. Alsayaydeh, A. Oliinyk, Iterative approach to TLC model checker application, in: Proceedings of the 2nd KhPI Week on Advanced Technology, KhPI Week 2021, IEEE, Kyiv, 2021, pp. 283-287. doi: 10.1109/KhPIWeek53812.2021.9570055.</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>[23] R. Pawar, k-NN based Time Series Classification, 2021. URL: https://towardsdatascience.com/knn-based-time-series-classification-e5d761d01ea2</mixed-citation>
        </ref>
        <ref id="ref24">
          <mixed-citation>[24] S. Tajmouati, B. Wahbi, A. Bedoui, A. Abarda, M. Dakkon, Applying k-nearest neighbors to time series forecasting: two new approaches, 2021. URL: https://arxiv.org/abs/2103.14200</mixed-citation>
        </ref>
        <ref id="ref25">
          <mixed-citation>[25] F. Martínez, M.P. Frías, M. Pérez, A. J. Rivera Rivas, A methodology for applying k-nearest neighbor to time series forecasting, Artificial Intelligence Review, 52 (2019) 2019–2037. doi: 10.1007/s10462-017-9593-z.</mixed-citation>
        </ref>
        <ref id="ref26">
          <mixed-citation>[26] A. P. Mathur, N. O. Tippenhauer, SWaT: a water treatment testbed for research and training on ICS security, in: Proceedings of the International Workshop on Cyber-physical Systems for Smart Water Networks, CySWater, Vienna, IEEE, 2016, pp. 31-36. doi: 10.1109/CySWater.2016.7469060.</mixed-citation>
        </ref>
        <ref id="ref27">
          <mixed-citation>[27] C. M. Ahmed, V. R. Palleti, A. P. Mathur, WADI: a water distribution testbed for research in the design of secure cyber physical systems, in: Proceedings of the 3rd International Workshop on Cyber-Physical Systems for Smart Water Networks, CySWATER '17, Pittsburgh, PA, ACM, 2017, pp. 25-28. doi: 10.1145/3055366.3055375.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Chandola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Anomaly detection: A survey</article-title>
          ,
          <source>ACM Computing Surveys</source>
          ,
          <volume>41</volume>
          (
          <issue>3</issue>
          ) (
          <year>2009</year>
          )
          <fpage>1</fpage>
          -
          <lpage>58</lpage>
          . doi: 10.1145/1541880.1541882.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <article-title>What Is Data Analytics?</article-title>
          ,
          <year>2020</year>
          . URL: https://www.intel.com/content/www/us/en/artificialintelligence/what-is-dataanalytics.html#:~:text=Data%20analytics%20is%20the%20process,data%20for%20practically%20any%20purpose
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <article-title>Cycle Consistency Loss</article-title>
          ,
          <year>2017</year>
          . URL: https://paperswithcode.com/method/cycle-consistency-loss
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <article-title>Search for anomalies in time series based on the estimation of their parameters</article-title>
          ,
          <year>2021</year>
          . URL: https://openarchive.nure.ua/items/7c8e2e76-10fa-4044-907b-e51d05bd7cbf [In Ukrainian]
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Perkins</surname>
          </string-name>
          ,
          <article-title>Time-series novelty detection using one-class support vector machines</article-title>
          ,
          <source>in: Proceedings of the International Joint Conference on Neural Networks</source>
          , Portland, OR, IEEE,
          <year>2003</year>
          , pp.
          <fpage>1741</fpage>
          -
          <lpage>1745</lpage>
          . doi: 10.1109/IJCNN.2003.1223670.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] MIT - Data to AI Lab,
          <article-title>Time series anomaly detection - in the era of deep learning</article-title>
          ,
          <year>2020</year>
          . URL: https://medium.com/mit-data-to-ai-lab/time-series-anomaly-detection-in-the-era-of-deeplearning-dccb2fb58fd
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <article-title>Anomaly Detection in Time Series</article-title>
          ,
          <year>2023</year>
          . URL: https://neptune.ai/blog/anomaly-detection-intime-series
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <article-title>Effective Approaches for Time Series Anomaly Detection</article-title>
          ,
          <year>2020</year>
          . URL: https://towardsdatascience.com/effective-approaches-for-time-series-anomaly-detection9485b40077f1
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schmidl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wenig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Papenbrock</surname>
          </string-name>
          ,
          <article-title>Anomaly detection in time series: a comprehensive evaluation</article-title>
          ,
          <source>in: Proceedings of the VLDB Endowment</source>
          , Vol.
          <volume>15</volume>
          (
          <issue>9</issue>
          ), Sydney, ACM,
          <year>2022</year>
          , pp.
          <fpage>1779</fpage>
          -
          <lpage>1797</lpage>
          . doi: 10.14778/3538598.3538602.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>Anomaly detection and forecasting in Azure Data Explorer</article-title>
          ,
          <year>2023</year>
          . URL: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/anomaly-detection
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B.</given-names>
            <surname>Artley</surname>
          </string-name>
          ,
          <article-title>Time Series Forecasting with ARIMA, SARIMA and SARIMAX</article-title>
          ,
          <year>2022</year>
          . URL: https://towardsdatascience.com/time-series-forecasting-with-arima-sarima-and-sarimaxee61099e78f6
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Mandal</surname>
          </string-name>
          ,
          <article-title>Implementing PCA, Feedforward and Convolutional Autoencoders and using it for Image Reconstruction, Retrieval &amp; Compression</article-title>
          ,
          <year>2018</year>
          . URL: https://blog.manash.io/implementing-pca-feedforward-and-convolutional-autoencoders-andusing-it-for-image-reconstruction-8ee44198ea55
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Geiger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Alnegheimish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cuesta-Infante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Veeramachaneni</surname>
          </string-name>
          ,
          <article-title>TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks</article-title>
          ,
          <source>in: Proceedings of the IEEE International Conference on Big Data (Big Data)</source>
          , IEEE,
          <year>2020</year>
          . doi: 10.1109/BigData50022.2020.9378139.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <article-title>TadGAN</article-title>
          ,
          <year>2021</year>
          . URL: https://github.com/gusty1g/TadGAN
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>