<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1145/3292500.3330871</article-id>
      <title-group>
        <article-title>On the Environmental Impact of the Algorithm LatentOut for Unsupervised Anomaly Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabrizio Angiulli</string-name>
          <email>f.angiulli@dimes.unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Fassetti</string-name>
          <email>f.fassetti@dimes.unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Ferragina</string-name>
          <email>luca.ferragina@unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIMES Dept., University of Calabria</institution>
          ,
          <addr-line>87036 Rende (CS)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>13601</volume>
      <fpage>10</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>Because of their astonishing performances, Deep Neural Network-based approaches have become pervasive in many human activities. However, they often require a long, energy-intensive training phase, which has a huge environmental impact. In recent years, there has been a significant increase in the emphasis placed on environmental themes across various sectors, driven by growing concerns over climate change and sustainability. This heightened focus has led to many initiatives, policies and discussions aimed at addressing ecological challenges and promoting a more sustainable future. For the reasons stated above, Deep Learning cannot be exempted from such initiatives, and the literature is starting to pay attention to these issues. This paper aims at contributing to this field, in particular concerning the Anomaly Detection task, whose environmental impact, due to its widespread employment, deserves to be addressed.</p>
      </abstract>
      <kwd-group>
        <kwd>Anomaly Detection</kwd>
        <kwd>Variational Autoencoder</kwd>
        <kwd>Carbon Footprint</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <p>In this work, we compare LatentOut with other Anomaly Detection Neural Network-based methods, and we highlight that it is the one that obtains the best results in terms of a balance between high accuracy performance and low carbon footprint.</p>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Anomalies can be defined as examples that deviate from the majority of the data so significantly as to raise the suspicion that they were generated by a different mechanism. Anomaly Detection represents a fundamental task in many human activities, including Healthcare, Cyber-security, Industrial Monitoring, Fraud Detection, and many others.</p>
      <p>
        It is possible to identify three different types of settings for Anomaly Detection [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In the Supervised setting, a dataset whose items are labeled as normal or abnormal is available to build a classifier; typically, the dataset is highly unbalanced and the anomalies form a rare class. The Semi-supervised setting, also called one-class, is characterized by the presence in input of only examples from the normal class, which are used to train the detector. In the Unsupervised setting, the goal is to assign an anomaly score to each object of the input dataset in order to find the anomalies in it.
      </p>
      <p>
        Classical data mining and machine learning algorithms performing the task of detecting outliers
include statistical-based [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], distance-based [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3, 4, 5, 6</xref>
        ], density-based [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ], reverse nearest-neighbor-based [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ], SVM-based [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ], and many others [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Recently, the approaches that have achieved the most success have been those based on deep learning
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], which can be divided into three main families: reconstruction error-based methods employing
Autoencoders (AE), models based on Generative Adversarial Networks (GAN), and SVM-like neural
architectures.
      </p>
      <p>
        At the basis of the application of Autoencoders (AE) and Variational Autoencoders (VAE) [
        <xref ref-type="bibr" rid="ref14 ref15 ref16">15, 16, 14</xref>
        ] to
Anomaly Detection lies the concept of reconstruction error. More in detail, (Variational) Autoencoders
are trained to map data into a low dimensional latent space and then map them back into the original
space generating in output a reconstruction of the input as similar as possible to it. Since the majority
of the data used for training models belongs to the normal class, it is assumed that these networks are
able to reconstruct the inliers better than the outliers and, thus, the reconstruction error can be adopted
as an anomaly score.
      </p>
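<p>The reconstruction-error principle described above can be illustrated with a linear autoencoder obtained via PCA; this is a sketch under our own assumptions (the paper's methods use deep (V)AEs, and all names and data below are illustrative).</p>
<preformat>
```python
import numpy as np

def fit_linear_autoencoder(X, latent_dim):
    """Fit a linear 'autoencoder' via PCA: the encoder projects onto
    the top principal axes, the (tied) decoder maps back."""
    mu = X.mean(axis=0)
    # SVD of the centered data yields the principal directions
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:latent_dim]

def reconstruction_scores(X, mu, W):
    """Anomaly score = squared L2 reconstruction error per point."""
    Z = (X - mu) @ W.T          # encode into the latent space
    X_hat = mu + Z @ W          # decode back into the input space
    return ((X - X_hat) ** 2).sum(axis=1)

# toy data: inliers lie close to a 2-D plane embedded in 5-D space
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 5))
inliers = rng.normal(size=(200, 2)) @ A + 0.01 * rng.normal(size=(200, 5))
outlier = 3.0 * rng.normal(size=(1, 5))   # generic point, off the plane
X = np.vstack([inliers, outlier])

mu, W = fit_linear_autoencoder(X, latent_dim=2)
scores = reconstruction_scores(X, mu, W)
# the off-plane point (index 200) reconstructs worst
```
</preformat>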
      <p>
        GAN-based models [
        <xref ref-type="bibr" rid="ref17 ref18 ref19 ref20">17, 18, 19, 20</xref>
        ] basically consist in the combined, adversarial training of two
sub-architectures, the generator and the discriminator. Specifically, the generator network produces
artificial anomalies as realistic as possible, and the discriminator assigns an anomaly score to each item.
      </p>
      <p>
        SVM-like methods [
        <xref ref-type="bibr" rid="ref21 ref22 ref23">21, 22, 23</xref>
        ] leverage the idea of enclosing normal data into a hypersphere employing
a One-Class SVM-like loss function combined with a deep neural architecture. A slightly different approach, which can be included in this family, is introduced in [24], where the architecture presents an additional final layer composed of just one neuron producing an anomaly score that, for anomalies, is as far as possible from a reference value obtained as the average of the anomaly scores of randomly sampled normal items.
      </p>
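<p>The hypersphere idea underlying these methods can be sketched as follows; here the neural embedding is omitted (the data are used as their own features), so this is only an illustration of a One-Class-style score under our own assumptions, not of any specific method above.</p>
<preformat>
```python
import numpy as np

def hypersphere_scores(train_embed, test_embed):
    """One-Class-style anomaly score: squared distance from the center
    c of the (normal) training embeddings; points far from c fall
    outside the hypersphere enclosing the normal data."""
    c = train_embed.mean(axis=0)   # hypersphere center
    return ((test_embed - c) ** 2).sum(axis=1)

rng = np.random.default_rng(2)
normal = rng.normal(0.0, 1.0, size=(500, 3))
queries = np.array([[0.0, 0.0, 0.0],    # near the center: low score
                    [6.0, 6.0, 6.0]])   # far away: high score
scores = hypersphere_scores(normal, queries)
```
</preformat>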
      <p>Moreover, in [25] Deep Isolation Forest (DIF) has been introduced, a novel methodology that utilizes randomly initialized neural networks to map the original data into random representation ensembles, where random axis-parallel cuts are subsequently applied to partition the data.</p>
      <p>Nevertheless, the high accuracy and training speed of Deep Learning models come at the cost of large power and energy consumption. This is leading researchers to become aware of the environmental impact of deep neural architectures, by trading off accuracy against energy consumption and also by performing characterizations in terms of performance, power, and energy to guide the architecture design of DNN models [26, 27, 28, 29].</p>
      <p>This paper aims to provide a contribution in this direction and, in particular, to the field of Anomaly Detection, by analyzing the behaviour of recent methods from the point of view of detection performance as well as from the point of view of their carbon footprint. Specifically, we focus on the LatentOut algorithm [30, 31, 32, 33], an anomaly detection framework that can be applied on top of any deep neural architecture, used as a baseline, to obtain a refined score; we compare it with the baseline architecture on which it is applied and with deep learning-based competitors from the other families.</p>
    </sec>
    <sec id="sec-2b">
      <title>2. The LatentOut algorithm for Unsupervised Anomaly Detection</title>
      <p>Due to the quite good performances they obtain, as well as their versatility, the approaches based on (Variational) Autoencoders have become the most widespread Anomaly Detection methods relying on Deep Neural Networks.</p>
      <p>The main issue with these models is that they often generalize so well that they accurately reconstruct anomalies too [30], thus worsening the capability of the reconstruction error to detect anomalies.</p>
      <p>In [31] LatentOut is introduced: a methodology that combines the reconstruction error and the latent space distribution of the Variational Autoencoder in order to obtain a refined anomaly score. Specifically, the first variant of the LatentOut algorithm (Figure 1) considers the enlarged feature space ℱ = ℒ × E, where ℒ represents the latent space and E is the reconstruction error space (usually E ⊆ ℝ), and performs a k-NN density estimation in the space ℱ.</p>
      <p>In Figure 1 the complete workflow of LatentOut is shown. Each point x ∈ X of the dataset is mapped into the latent space ℒ of the VAE (blue points represent inliers, red ones represent anomalies) by means of the encoder, and then reconstructed back into the original space as x̂ ∈ X by means of the decoder. Then, the reconstruction error E(x) = ‖x − x̂‖²₂ is computed, the feature space ℱ = ℒ × E is created, and the k-NN density estimation is performed in it to compute the LatentOut anomaly score.</p>
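<p>Once the latent coordinates and the reconstruction errors have been computed by the network, the workflow above reduces to a density estimation in the augmented space. The following sketch is our own illustrative code, with the distance to the k-th nearest neighbor standing in for the k-NN density estimate:</p>
<preformat>
```python
import numpy as np

def latentout_like_scores(Z, errors, k):
    """Build the augmented feature space F = latent x error and score
    each point by its distance to the k-th nearest neighbor in F
    (a simple inverse-density estimate; sparser regions score higher)."""
    F = np.hstack([Z, errors.reshape(-1, 1)])
    # pairwise squared Euclidean distances in the augmented space
    D = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=-1)
    # column 0 of each sorted row is the distance of a point to itself
    return np.sort(D, axis=1)[:, k]

rng = np.random.default_rng(1)
Z = rng.normal(0.0, 0.05, size=(100, 2))      # inlier latent coords
errors = rng.normal(0.10, 0.01, size=100)     # inlier recon. errors
# an anomaly: ordinary latent position but noticeably larger error
Z_all = np.vstack([Z, [[0.0, 0.0]]])
errors_all = np.append(errors, 0.5)
scores = latentout_like_scores(Z_all, errors_all, k=10)
# the anomaly (index 100) lies in the sparsest region of F
```
</preformat>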
      <p>The motivation behind this procedure is based on the observation that anomalies tend to lie in the sparsest regions of the augmented feature space ℱ. This happens because, even when their reconstruction error is not exceptionally large, it is still significantly larger than that of their most similar normal items.</p>
      <p>In [32] LatentOut has been extended so that it can potentially be applied to any neural architecture that has three fundamental properties:
• it outputs an anomaly score,
• it has a latent space ℒ,
• it performs a mapping from the original data space X to ℒ through an encoder-shaped module.
In particular, the neural models on which LatentOut has actually been tested are AE, VAE, GANomaly, Fast-AnoGAN, SO-GAAL, and MO-GAAL.</p>
      <p>Moreover, in [33] it has been shown that the separation properties of the enlarged space ℱ allow any generic anomaly score (not only the k-NN) to perform better when applied on it than on the input data space X.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Experimental results</title>
      <sec id="sec-3-1">
        <title>3.1. Experimental setup</title>
        <p>In our experiments we consider the tabular datasets cardio, letter, lympho, mammography, pendigits,
pima, satellite, satimage-2, speech, thyroid, from the ODDS repository [34] as well as the image datasets
MNIST [35], Fashion-MNIST [36], and CIFAR10 [37].</p>
        <p>The last three datasets (differently from the ones from the ODDS repository) are multi-class; thus, to make them suitable for the anomaly detection task, we adopt a one-vs-all strategy, meaning that we consider one class as normal and we randomly sample a fixed number m of items from each other class. If not otherwise stated, we set m = 10. Specifically, we select the class “0” as normal for the MNIST dataset, the class “Sandal” for Fashion-MNIST, and the class “deer” for CIFAR-10.</p>
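<p>The one-vs-all construction just described can be sketched as follows (the function name and the symbol m are ours):</p>
<preformat>
```python
import numpy as np

def one_vs_all(X, y, normal_class, m, seed=0):
    """Keep every item of `normal_class` as normal (label 0) and draw
    m random items from each other class as anomalies (label 1)."""
    rng = np.random.default_rng(seed)
    mask = y == normal_class
    parts, labels = [X[mask]], [np.zeros(int(mask.sum()), dtype=int)]
    for c in np.unique(y):
        if c == normal_class:
            continue
        idx = rng.choice(np.flatnonzero(y == c), size=m, replace=False)
        parts.append(X[idx])
        labels.append(np.ones(m, dtype=int))
    return np.vstack(parts), np.concatenate(labels)

# toy example: 3 classes of 10 items each, class 0 normal, m = 3
X = np.arange(30.0).reshape(30, 1)
y = np.repeat([0, 1, 2], 10)
X_ad, y_ad = one_vs_all(X, y, normal_class=0, m=3)
# X_ad holds 10 normal items plus 3 anomalies from each other class
```
</preformat>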
        <p>As for the implementation details of the algorithm, we consider the original version of LatentOut with the VAE as baseline architecture, and the k-NN with k = 50 as estimator of the density of the feature space ℱ. The latent space dimension ℓ of the VAE is set to ℓ = 2 for the tabular ODDS datasets and to ℓ = 32 for the image datasets. As for the encoder structure (the decoder is symmetric to it), we adopt the same strategy used in [33], i.e. we insert hidden layers of dimension ⌊d/4<sup>i</sup>⌋ between the input d-dimensional space and the ℓ-dimensional latent space, for each i ∈ ℕ⁺ such that ⌊d/4<sup>i</sup>⌋ &gt; ℓ.</p>
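<p>Assuming the rule reads "insert a hidden layer of dimension ⌊d/4<sup>i</sup>⌋ for each i ∈ ℕ⁺ such that ⌊d/4<sup>i</sup>⌋ &gt; ℓ", the encoder layout can be computed as follows (the function name is ours):</p>
<preformat>
```python
def encoder_dims(d, latent_dim):
    """Hidden-layer sizes between the d-dimensional input and the
    latent space: floor(d / 4**i) for i = 1, 2, ... while the value
    stays strictly above the latent dimension (decoder is symmetric)."""
    dims, i = [], 1
    while d // 4 ** i > latent_dim:
        dims.append(d // 4 ** i)
        i += 1
    return dims

# e.g. an MNIST-like input (d = 784) with latent dimension 32
print(encoder_dims(784, 32))   # -> [196, 49]
# a tabular dataset with d = 21 attributes and latent dimension 2
print(encoder_dims(21, 2))     # -> [5]
```
</preformat>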
        <p>The CO₂ emissions are estimated by means of the Python library CodeCarbon [38], which bases its tracking on the power consumption and on the geographic location where the code is executed.</p>
      </sec>
      <sec id="sec-3-2a">
        <title>3.2. Evolution of performance and emissions of LatentOut and VAE during training</title>
        <p>The energy consumption of any Deep Learning model is related to the training phase and, in particular, to the number of training epochs.</p>
        <p>Therefore, it is of crucial importance to understand the behavior of these algorithms as the training proceeds, in order to optimize the trade-off between maximizing the detection performance and minimizing the energy consumption.</p>
        <p>The quantity of CO₂ produced by LatentOut, which we denote by ℰ<sub>LatentOut</sub>, is fundamentally constituted by two terms:
• the emissions ℰ<sub>VAE</sub> needed for the training of the architecture and the computation of its score, which is shared with the Variational Autoencoder,
• the emissions ℰ<sub>k-NN</sub> needed for the building of the feature space ℱ and the computation of the k-NN algorithm in it.</p>
        <p>Since the two operations are carried out in sequence and independently of each other, we have that ℰ<sub>LatentOut</sub> = ℰ<sub>VAE</sub> + ℰ<sub>k-NN</sub>, which means that, for an equal number of training epochs, the carbon footprint of LatentOut is always greater than the one of the Variational Autoencoder. Thus, for a fair comparison, we train the Variational Autoencoder for 100 epochs and we stop the training earlier when evaluating the LatentOut score.</p>
        <p>In Figures 2, 3, 4, we show the performances of both LatentOut (in orange) and the standard Variational Autoencoder (in blue) in terms of Area Under the ROC Curve (AUC) as the training proceeds. Observe that the horizontal axis reports the CO₂ emissions, which means that, for the reasons stated above, each value of the AUC of LatentOut is obtained with fewer epochs than the corresponding value of the VAE.</p>
        <p>As we can see, in almost every plot the curve of LatentOut is placed above the curve of the VAE. Moreover, the trend of LatentOut is much more regular than the one of the VAE (see in particular the plots of the datasets cardio, mammography, satellite, satimage-2, mnist, cifar). This implies that, if we fix a threshold on the amount of CO₂ we want to emit, the score of LatentOut always outperforms the standard score of the VAE. In other words, LatentOut is able to exploit the emissions produced better than the standard architecture on which it is applied.</p>
        <p>This happens because, as the training proceeds, the reconstruction capabilities of the VAE improve so much that at some point it becomes able to reconstruct also the outliers, thus lowering the anomaly detection performances of the model. On the other hand, LatentOut benefits from the organization of the latent space, which produces a progressively better separation between normal examples and anomalies in the feature space ℱ.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.3. Comparison with competitors</title>
        <p>
          We consider as competitors some of the neural network algorithms implemented in the Python library
PyOD [39], namely Deep-SVDD [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], from the SVM-like family, AnoGAN [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] and ALAD [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], from
the GAN family, and DIF [25]. For the implementation details (number of layers and neurons, training epochs, learning rate, potential hyperparameters), we refer to the default values fixed in PyOD. As for LatentOut, we consider again the setup described in Section 3.1 and we perform a few-epochs training, motivated by the good convergence properties observed in the previous section. Specifically, the VAE is trained for 15 epochs.
        </p>
        <p>As evaluation metrics we adopt the standard Area Under the ROC Curve (AUC) and the ratio CO₂/AUC between the emissions of CO₂ produced for the training and the inference of a model and the obtained AUC. This last value is a measure combining both performance and energy consumption; indeed, it indicates how much CO₂ is needed (on average) to obtain a single percentage point of AUC.</p>
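<p>As a sketch, the combined metric amounts to the following (the function name is ours; the emission units are those reported by the tracker):</p>
<preformat>
```python
def co2_per_auc_point(emissions, auc):
    """Combined metric: emitted CO2 divided by the AUC expressed in
    percentage points, i.e. the average emission cost of one AUC
    point (lower is better)."""
    return emissions / (100.0 * auc)

# two models with the same AUC: halving the emissions halves the cost
a = co2_per_auc_point(2.0, 0.90)
b = co2_per_auc_point(1.0, 0.90)
```
</preformat>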
        <p>Table 1 shows the results in terms of AUC. As we can see, LatentOut is the best method on half of the datasets, achieving performances close to the best also on the other half. In particular, confirming the observation made in [31], LatentOut is especially effective on higher dimensional, structured data (for example speech and the image datasets). Table 2 shows the results of the experiment in terms of the ratio CO₂/AUC. Here, LatentOut outperforms its competitors on all but one dataset, exhibiting the best trade-off between the performances obtained and the emissions of CO₂ produced.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this paper, we have focused on the algorithm LatentOut for unsupervised anomaly detection, in order to evaluate its performances and measure the environmental impact of its executions. When compared to the standard architecture on which it is applied, i.e. the Variational Autoencoder, LatentOut shows that a low-energy-consumption training can lead it to conspicuously better results. Moreover, in comparison with other neural network-based anomaly detection approaches, it has shown superior performances both in terms of absolute AUC and, most importantly, in terms of the ratio between the emitted CO₂ and the AUC obtained.</p>
      <p>As future development, we intend to expand the discussion about the environmental impact of LatentOut by including a deeper analysis of all its variants and an investigation specialized on the hardware type (e.g., CPU vs. GPU), as well as to propose novel measures to better capture the trade-off between emissions and performances. Finally, as a more ambitious goal, we aim at introducing a mechanism enabling LatentOut to take the green-aware aspect into account at training time.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We acknowledge the support of the PNRR project FAIR - Future AI Research (PE00000013), Spoke 9
Green-aware AI, under the NRRP MUR program funded by the NextGenerationEU.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ruff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Kaufmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Vandermeulen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Montavon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Samek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kloft</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <article-title>A unifying review of deep and shallow anomaly detection</article-title>
          ,
          <source>Proc. IEEE</source>
          <volume>109</volume>
          (
          <year>2021</year>
          )
          <fpage>756</fpage>
          -
          <lpage>795</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Davies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Gather</surname>
          </string-name>
          ,
          <article-title>The identification of multiple outliers</article-title>
          ,
          <source>Journal of the American Statistical Association</source>
          <volume>88</volume>
          (
          <year>1993</year>
          )
          <fpage>782</fpage>
          -
          <lpage>792</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Knorr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tucakov</surname>
          </string-name>
          ,
          <article-title>Distance-based outlier: algorithms and applications</article-title>
          ,
          <source>VLDB Journal 8</source>
          (
          <year>2000</year>
          )
          <fpage>237</fpage>
          -
          <lpage>253</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pizzuti</surname>
          </string-name>
          ,
          <article-title>Outlier mining in large high-dimensional data sets</article-title>
          ,
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>2</volume>
          (
          <year>2005</year>
          )
          <fpage>203</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Basta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pizzuti</surname>
          </string-name>
          ,
          <article-title>Distance-based detection and prediction of outliers</article-title>
          ,
          <source>IEEE Trans. on Knowledge and Data Engineering</source>
          <volume>2</volume>
          (
          <year>2006</year>
          )
          <fpage>145</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fassetti</surname>
          </string-name>
          ,
          <article-title>DOLPHIN: an efficient algorithm for mining distance-based outliers in very large datasets</article-title>
          ,
          <source>ACM Trans. Knowl. Disc. Data (TKDD) 3</source>
          (
          <issue>1</issue>
          ) (
          <year>2009</year>
          )
          <article-title>Article 4</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Breunig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Kriegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sander</surname>
          </string-name>
          ,
          <article-title>LOF: Identifying density-based local outliers</article-title>
          ,
          <source>in: Proc. Int. Conf. on Management of Data (SIGMOD)</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <article-title>Mining top-n local outliers in large databases</article-title>
          ,
          <source>in: Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD)</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>V.</given-names>
            <surname>Hautamäki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kärkkäinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fränti</surname>
          </string-name>
          ,
          <article-title>Outlier detection using k-nearest neighbour graph</article-title>
          ,
          <source>in: International Conference on Pattern Recognition (ICPR)</source>
          , Cambridge, UK, August 23-26,
          <year>2004</year>
          , pp.
          <fpage>430</fpage>
          -
          <lpage>433</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Radovanović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nanopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ivanović</surname>
          </string-name>
          ,
          <article-title>Reverse nearest neighbors in unsupervised distance-based outlier detection</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>27</volume>
          (
          <year>2015</year>
          )
          <fpage>1369</fpage>
          -
          <lpage>1382</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          ,
          <article-title>CFOF: A concentration free measure for anomaly detection</article-title>
          ,
          <source>ACM Transactions on Knowledge Discovery from Data (TKDD)</source>
          <volume>14</volume>
          (
          <year>2020</year>
          )
          4:1-4:53.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Platt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shawe-Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Smola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Williamson</surname>
          </string-name>
          ,
          <article-title>Estimating the support of a high-dimensional distribution</article-title>
          ,
          <source>Neural Computation</source>
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D. M. J.</given-names>
            <surname>Tax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P. W.</given-names>
            <surname>Duin</surname>
          </string-name>
          ,
          <article-title>Support vector data description</article-title>
          ,
          <source>Mach. Learn.</source>
          (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Chalapathy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chawla</surname>
          </string-name>
          ,
          <article-title>Deep learning for anomaly detection: A survey</article-title>
          ,
          <year>2019</year>
          . arXiv:1901.03407.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hawkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Baxter</surname>
          </string-name>
          ,
          <article-title>Outlier detection using replicator neural networks</article-title>
          ,
          <source>in: International Conference on Data Warehousing and Knowledge Discovery (DAWAK)</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>170</fpage>
          -
          <lpage>180</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>An</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <article-title>Variational autoencoder based anomaly detection using reconstruction probability</article-title>
          ,
          <source>Technical Report 3, SNU Data Mining Center</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Schlegl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Seeböck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Waldstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Schmidt-Erfurth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Langs</surname>
          </string-name>
          ,
          <article-title>Unsupervised anomaly detection with generative adversarial networks to guide marker discovery</article-title>
          ,
          <year>2017</year>
          . arXiv:1703.05921
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Akcay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Atapour-Abarghouei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. P.</given-names>
            <surname>Breckon</surname>
          </string-name>
          ,
          <article-title>GANomaly: Semi-supervised anomaly detection via adversarial training</article-title>
          ,
          <year>2018</year>
          . arXiv:1805.06725.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Generative adversarial active learning for unsupervised outlier detection</article-title>
          ,
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>32</volume>
          (
          <year>2020</year>
          )
          <fpage>1517</fpage>
          -
          <lpage>1528</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zenati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Romain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-S.</given-names>
            <surname>Foo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lecouat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chandrasekhar</surname>
          </string-name>
          ,
          <article-title>Adversarially learned anomaly detection</article-title>
          ,
          <source>in: 2018 IEEE International Conference on Data Mining (ICDM)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>727</fpage>
          -
          <lpage>736</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ruff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Görnitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Deecke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Siddiqui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Vandermeulen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Binder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kloft</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>