<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>October</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Latent Representations of Terrain in Aerial Image Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pylyp Prystavka</string-name>
          <email>prystavka@nau.edu.ua</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Serge Dolgikh</string-name>
          <email>sdolgikh@nau.edu.ua</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olga Cholyshkina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksandr Kozachuk</string-name>
          <email>kozachukk@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Interregional Academy of Personnel Management</institution>
          ,
          <addr-line>2/16 Frometivska St., Kyiv, 03039</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Aviation University</institution>
          ,
          <addr-line>1 Lubomyra Huzara Ave, Kyiv, 03058</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Solana Networks</institution>
          ,
          <addr-line>301 Moodie Drive, Ottawa, K2H 9C4</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>2</volume>
      <issue>2021</issue>
      <abstract>
        <p>Investigation of informative representations of complex data is a rapidly developing field of research in machine learning. In this work we present a process of production and analysis of informative low-dimensional latent representations of real-world image data with neural network models of unsupervised generative learning. A model of convolutional autoencoder based on the VGG-16 architecture was used to produce low-dimensional latent representations of aerial image data, and the characteristics of the distributions of several higher-level classes of terrain types were studied. The analysis of the distributions demonstrated a landscape of compact concept clusters for most studied types of terrain, with good separation between concept regions. The results of this work can be used in developing methods of effective learning with minimal labeled data based on the emergent concept-sensitive structure in the latent representations.</p>
      </abstract>
      <kwd-group>
        <kwd>Artificial Intelligence</kwd>
        <kwd>unsupervised machine learning</kwd>
        <kwd>clustering</kwd>
        <kwd>image recognition</kwd>
        <kwd>image classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Classification of complex data such as images represents a significant challenge in areas with a
severe deficit of labels. In these problems and applications, unsupervised machine learning methods,
such as processing data with models of unsupervised generative self-learning, have been shown to be effective
in the identification and selection of informative latent representations that can simplify subsequent
classification and significantly reduce the label requirement. The motivation of this work was to apply
these methods to the practical task of aerial image recognition, where a specific set of classes combined
with a strong deficit of labels makes application of standard methods of supervised classification
challenging.</p>
    </sec>
    <sec id="sec-1-1">
      <title>1.1. Related Research</title>
      <p>
        Informative representations obtained with models of unsupervised generative self-learning were used
in a number of applications to identify the concepts or classes of interest in the observable data.
Artificial neural networks have strong potential in such problems and applications due to their capability
of universal approximation [
        <xref ref-type="bibr" rid="ref1 ref2">1,2</xref>
        ], making them suitable for processing data of virtually any type and
complexity, including live image data recorded in aerial surveillance.
      </p>
      <p>
        The task of aerial image classification is further complicated by formalization of classes that is poor
or inconsistent between different studies, and by a severe deficit of labelled samples [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3-5</xref>
        ].
      </p>
      <p>
        Hierarchical representations of observable data were obtained in a completely unsupervised training
process with Restricted Boltzmann Machines (RBM) and Deep Belief Networks (DBN) [
        <xref ref-type="bibr" rid="ref6 ref7">6,7</xref>
        ] offering
a noticeable improvement in the quality of subsequent supervised learning. Different types,
architectures and flavors of generative models have since been investigated, including autoencoder neural
networks and Generative Adversarial Networks (GAN) [
        <xref ref-type="bibr" rid="ref8 ref9">8,9</xref>
        ], to name only a few in a rapidly expanding
field, resulting in improved accuracy and versatility of the models with a virtually unlimited range of
applications. The relation between learning and statistical thermodynamics was studied in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] leading
to understanding of a deep connection between learning processes in artificial neural models and
principles of information theory and statistical thermodynamics.
      </p>
      <p>
        Previous results in unsupervised representations with generative self-learning neural network
models include applications of deep autoencoder models of different architectures such as sparse,
variational, convolutional and others to create informative representations of image [
        <xref ref-type="bibr" rid="ref11 ref12">11,12</xref>
        ] and other
types of data [
        <xref ref-type="bibr" rid="ref13 ref14">13,14</xref>
        ]. These results have demonstrated that categorization of data by common
higher-level concepts in the latent representations, under certain constraints imposed in training, can be
considered a general effect of information processing in such models [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. An unsupervised structure
of this type, which does not require massive amounts of labeled data to identify, can be harnessed for more
effective learning in environments with a strong deficit of labels and/or in new and unknown
environments where labeled data is scarce. Given the constraints of the problem and the results
discussed earlier, it was hypothesized that applying the methods and models of unsupervised generative
learning to this problem may allow informative representations of image data to be obtained and reduce
the requirement for labeled data needed for successful learning.
      </p>
    </sec>
    <sec id="sec-3">
      <title>1.2. Motivation</title>
      <p>
        The motivation of this work suggested by earlier results [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8">5-8</xref>
        ] was to investigate the structure that
emerges in unsupervised representations of generative models trained on real-world image data and to introduce
methods of production, evaluation and analysis of informative latent representations that can be used in
developing effective learning models with reduced requirements for labeled training data.
      </p>
      <p>To address the problem of a strong label deficit, methods of creating informative representations with
deep neural network models of unsupervised generative learning were applied; such models are capable of
learning essential patterns in the observed data in the unsupervised mode, without any labeled data. The
novelty of the proposed approach is, first, a successful application of generative models to real-world data
such as aerial images, eventually allowing processing to be performed on board an autonomous vehicle;
and second, the design and demonstration of methods of evaluation and measurement of latent
representations, including entirely unsupervised ones.</p>
      <p>The dataset of images recorded in real surveillance of terrain from an aerial vehicle was chosen for
two main reasons: first, to demonstrate that the methods developed in this work can be applied to realistic,
complex data types; and second, because the tasks of aerial image classification and interpretation
are becoming increasingly common in many practical applications.</p>
    </sec>
    <sec id="sec-4">
      <title>2. Methods and Data</title>
      <p>This section contains the description of the model, data and methods used to produce and analyze
the distributions of the characteristic classes of data in the input image dataset.
</p>
    </sec>
    <sec id="sec-5">
      <title>2.1. Methodology</title>
      <p>
        The models were of the type of deep convolutional autoencoder neural networks [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] based on VGG
architecture that achieved good results in supervised learning of image data [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The models were
implemented in TensorFlow and Keras [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] with a number of common machine learning packages and
libraries.
      </p>
      <p>Generative neural network models were used, as described further in this section, to produce
compressed latent representations of the aerial terrain observation images in the dataset and to
evaluate the distributions of classes identified by type of terrain in the latent space. Methods of geometrical
analysis were applied to samples transformed to the latent representation to identify characteristic
parameters of the distributions, such as compactness and separation between the distribution regions of
different classes.</p>
    </sec>
    <sec id="sec-6">
      <title>2.2. Generative Model Architecture</title>
      <p>The architecture diagram of the models used in this work is given in Figure 1. It can be described as
a deep convolutional autoencoder neural network containing an encoder model, with several
convolutional blocks and activation and normalization layers, producing a flattened numerical
representation with dimensionality 8 x 8 x 256; and a generator model with up-sampling blocks and
a resulting output layer of the same dimensionality as the input layer.</p>
      <p>The models were trained in the unsupervised mode, with unlabeled raw image data, to reduce the
generative error, i.e. the mean deviation of the input images in the training dataset from their regeneration
by the model:</p>
      <p>E = M(|G(S) − S|) → min (1)
where S is the training sample, G(S) is the output of the model on the training sample, and M(·) denotes
the mean over the sample.</p>
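      <p>As a minimal NumPy sketch, the generative error of Eq. (1) can be computed for a batch of images and their regenerations as follows (the function name is illustrative):</p>

```python
import numpy as np

def generative_error(original, regenerated):
    """Mean absolute deviation between a batch of input images S and
    their regenerations G(S), as in Eq. (1)."""
    return float(np.mean(np.abs(regenerated - original)))
```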
      <p>A trained model can produce an encoded representation of an input sample X in the observable space
by transforming it to the latent representation space defined by the activations of the neurons in the latent
layer of the model:</p>
      <p>Z = E(X) (2)
where E(X) is the encoding stage of the model, from the observable input to the latent representation
layer (Figure 1).</p>
      <p>It needs to be noted that in a trained generative model the encoding transformation (2) is defined in a
completely unsupervised process and does not require any samples labeled with known categories of the
observable data for training.</p>
    </sec>
    <sec id="sec-7">
      <title>2.3. Data</title>
      <p>The dataset of images was obtained in live aerial surveillance of the terrain with preprocessing of
scaling to the standard size (64 x 64) and augmentation by rotation. Images in the dataset were classified
in a semi-automatic process into classes representing characteristic types of terrain with significant
representation in the dataset. In the rest of the study, classes or categories of images are denoted with
a symbol, such as “T” for transport tracks, “W” for wooded areas and so on.</p>
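      <p>A minimal NumPy sketch of the preprocessing described above (scaling to 64 x 64 and augmentation by rotation); the sampling method and rotation angles are assumptions, as the paper does not specify them:</p>

```python
import numpy as np

def preprocess(image, size=64):
    """Scale an HxWxC image to size x size by nearest-neighbour
    sampling (the interpolation method is an assumption)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

def augment_by_rotation(image):
    """Augmentation by rotation: the four 90-degree rotations
    (the exact angles used in the paper are not stated)."""
    return [np.rot90(image, k) for k in range(4)]
```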
      <p>The detailed composition of the dataset is described in Table 1.</p>
    </sec>
    <sec id="sec-8">
      <title>2.4. Training</title>
      <sec id="sec-8-1">
        <title>Terrain class symbols (Table 1): T, N, B, W, H, F, D, O</title>
        <p>The models were trained in an unsupervised process with minimization of the deviation between the
training set of images and their regenerations by the model. The cost functions used for unsupervised
training were Mean Squared Error (MSE) and binary cross-entropy (BCE), both showing strong
improvement in the process of training (Figure 2).</p>
        <p>The strong reduction of the generative error in the process of unsupervised training indicated that the latent
representations created by the models contained sufficient information to regenerate the observable data,
because in a feedforward neural network of the type used in this study the output has to be generated
entirely from the information contained in the latent representation.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>3. Results</title>
      <p>In this section we present the results of measurement and analysis of distributions of higher-level
concepts in the original (observable) dataset in the latent representations produced with generative
training of unsupervised autoencoder models as described in the previous sections.
</p>
    </sec>
    <sec id="sec-10">
      <title>3.1. Overall Characteristics</title>
      <p>The characteristics of the general representative set of samples in the latent representation, without a
breakdown by higher-level concepts, were as follows:</p>
      <p>
        The analysis of principal components [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] produced the following results:
first three components: 54.4% of overall variation;
first 10 components: 72.7%;
first 100 components: 98.2% of overall variation.
      </p>
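      <p>Figures of this kind can be computed for any matrix of latent vectors with a standard principal component computation; a minimal NumPy sketch, via the singular value decomposition of the centered data, is:</p>

```python
import numpy as np

def explained_variance_ratios(latent):
    """Fraction of overall variation captured by each principal
    component of a (samples x features) matrix of latent vectors."""
    centered = latent - latent.mean(axis=0)
    # Squared singular values of the centered data are proportional
    # to the per-component variances
    s = np.linalg.svd(centered, compute_uv=False)
    var = s ** 2
    return var / var.sum()

# e.g. ratios[:3].sum() and ratios[:10].sum() give the cumulative
# figures of the kind quoted in the text.
```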
      <p>These results indicated the possibility of strong redundancy reduction in the latent representation
without significant loss of information. Based on these results, the three principal dimensions with the
highest variation were used in the analysis of class distributions, which allowed direct visualizations of
concept distributions in the latent representation to be produced.</p>
    </sec>
    <sec id="sec-11">
      <title>3.2. Latent Concept Distributions</title>
      <p>In this section the results of the measurement and analysis of concept distributions in the latent
representation of generative models are presented. The parameters of distributions such as the
characteristic size, standard deviation and density of the concept distribution regions in the latent
representation coordinates are given relative to the maximum dimension of the overall latent dataset,
and the uniform density.</p>
      <p>For most concepts with significant representation in the dataset, a compact and well-defined
character of latent concept distributions was observed with the density of the concept region (i.e. the
region of distribution of samples associated with the studied concept in the latent representation space)
significantly higher than uniform.</p>
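      <p>The relative density of a concept region can be estimated, for example, as the fraction of samples falling in the region divided by the fraction of the latent volume it occupies. The bounding-box estimator below is an assumption; the paper does not give the exact procedure:</p>

```python
import numpy as np

def relative_density(class_points, all_points):
    """Density of a concept region relative to the uniform density of
    the whole latent dataset, using axis-aligned bounding-box volumes
    (one possible formalization; an assumption, not the paper's own
    estimator).  Values well above 1 indicate a compact region."""
    def box_volume(p):
        extent = p.max(axis=0) - p.min(axis=0)
        return float(np.prod(extent))
    frac_samples = len(class_points) / len(all_points)
    frac_volume = box_volume(class_points) / box_volume(all_points)
    return frac_samples / frac_volume
```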
      <p>
        These results confirm the earlier observed effect of correlation between unsupervised latent
distributions and higher-level concepts with strong representation in the observable dataset [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. An in-depth
analysis of the distributions will be attempted in a future study.
      </p>
    </sec>
    <sec id="sec-12">
      <title>3.3. Unsupervised Categorization</title>
      <p>In this section, measurements related to the categorization capacity of the models are presented and
discussed. Shown in Table 3 are visualizations of the intersections of concept regions with the highest
relative volume. A cross-concept intersection matrix can be defined to indicate the degree of
disentanglement of concept regions in the representation, as the ratio of the latent volume of the
overlapping region between concepts A and B, O(a,b), to the volume of the concept region A, H(a):</p>
      <p>C(a,b) = V(O(a,b)) / V(H(a)) (3)</p>
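      <p>The cross-concept intersection ratio can be approximated, for example, with axis-aligned bounding boxes around the latent samples of each concept; this box estimator is an assumption, as the exact volume computation is not specified in the text:</p>

```python
import numpy as np

def intersection_ratio(points_a, points_b):
    """Approximate C(a,b) = V(O(a,b)) / V(H(a)): the volume of the
    overlap of the bounding boxes of concepts A and B, relative to
    the box volume of concept region A (axis-aligned boxes are an
    assumption).  0 means disjoint regions; 1 means A lies entirely
    inside the overlap."""
    lo_a, hi_a = points_a.min(axis=0), points_a.max(axis=0)
    lo_b, hi_b = points_b.min(axis=0), points_b.max(axis=0)
    overlap = np.clip(np.minimum(hi_a, hi_b) - np.maximum(lo_a, lo_b),
                      0.0, None)
    vol_a = float(np.prod(hi_a - lo_a))
    return float(np.prod(overlap)) / vol_a
```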
      <sec id="sec-12-3">
        <title>Table 3: concept “Built area”, original and generated</title>
        <p>As can be seen from the results in Table 3, a good separation of concept regions was observed for
most categories of images with significant representation in the dataset, indicating strong categorization
achieved by the models in the process of unsupervised generative learning.</p>
        <p>
          Unsupervised categorization or decoupling of higher-level concepts in the unsupervised latent
representations is the effect observed in a number of experiments with unsupervised self-learning
models [
          <xref ref-type="bibr" rid="ref6 ref7 ref8">6-8</xref>
          ] that is evident as compact and well-separated concept distributions in the latent space.
        </p>
        <p>Accordingly, categorized distributions tend to minimize the volume and, consequently, maximize
the relative density of latent concept distributions while minimizing the overlap between the
distributions of different concepts.</p>
      </sec>
    </sec>
    <sec id="sec-13">
      <title>3.4. Generative Ability</title>
      <p>Models that were successful in identifying characteristic patterns or concepts in the observable data
as a result of generative self-learning can be expected to be able to regenerate input samples with close
resemblance / low deviation from original input samples. To evaluate generative ability of the models,
experiments were performed with samples of the concepts present in the training dataset, as well as
those that were not classified into concepts (i.e. general background). Examples of generative results of
the models are shown in Table 4.</p>
      <sec id="sec-13-1">
        <title>Table 4. Generative ability of self-learning models</title>
        <p>A successful generative ability across multiple classes of data indicates that the distributions in the
latent representation were correlated with characteristic patterns in the original (i.e. observable) data. It
follows from the architecture of feed-forward artificial neural networks that all the information
necessary for generation of the output of the model must be contained in the latent representation layer
and therefore the process of unsupervised generative training was able to produce an informative
representation, with substantially reduced redundancy, of the observable data represented in the training
dataset.</p>
        <p>The generative ability of self-learning models such as those studied in this work can be used for
augmentation of training datasets for models of supervised learning, to improve accuracy and extend
classification ability to classes of data underrepresented in the datasets. This will be investigated in more
detail in another study.</p>
      </sec>
    </sec>
    <sec id="sec-14">
      <title>4. Discussion</title>
      <p>
        The results presented in this work are in agreement with the growing number of reports with
observations of the effects of concept-correlated latent representations emerging in unsupervised
generative learning and provide strong additional arguments in support of the general character of this
effect. Given the wide range of models and types of data, from lower complexity [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to massive and
complex architectures [
        <xref ref-type="bibr" rid="ref11 ref12">11,12</xref>
        ] where unsupervised categorization in the latent representations in
generative self-learning has been observed, this conclusion appears to be well substantiated.
      </p>
      <p>It was demonstrated that under certain constraints such as generative accuracy and information
compression or redundancy reduction, low-dimensional latent representations of models learning to
generate input distributions in a completely unsupervised learning process can produce distinct structure
that is correlated with principal higher-level concepts in the observable data. The results in Sections
3.2, 3.3 on measurement of latent distributions of main concept regions appear to support this
conclusion.</p>
      <p>Identification and description of unsupervised latent structure that emerges in generative learning
can be a valuable instrument in the analysis of general data, in particular, of less known origin where
massive prior knowledge such as large labeled datasets used in supervised machine learning may not
be available.</p>
      <p>
        A number of recent results indicated that similar low-dimensional representations can play an
important role in processing of sensory data by humans [
        <xref ref-type="bibr" rid="ref20 ref21">20,21</xref>
        ]. The demonstrated success of
low-dimensional representations obtained with models of generative self-learning supports the conclusion
about the general nature of the observed effect in the learning systems of biological and artificial origin
and provides an intriguing possibility of a connection to bioinformatics [22] with learning systems able
to learn intuitively, incrementally and with minimal prior knowledge of the environment.
      </p>
    </sec>
    <sec id="sec-15">
      <title>5. Conclusions</title>
      <p>
        Based on the results reported in this work, several essential observations can be made on the process
of unsupervised generative learning with real-world image data and the conceptual structure in the latent
representations of data produced by such models:
1. The models of generative unsupervised learning used in the study were capable of producing
well-defined categorized representations correlated with the principal (i.e. strongly represented)
higher-level concepts in the training dataset.
2. The observed latent representations showed good categorization and separation of principal
concepts and appear to support the hypothesis [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] of a correlation between the unsupervised
representation structure emergent in unsupervised generative learning and higher-level concepts
with significant representation in the training data.
3. Methods of measurement of categorization capacity of unsupervised generative models in the
latent representations were defined and validated.
4. The models showed good generative capacity for some principal concepts in the training data.
Optimization of models for generative ability will be further investigated in a future study.
5. Methods of evaluation of categorization ability of models are general and can be applied to
different types of data and model architectures.
      </p>
      <p>
        Overall, the observed latent representations showed good categorization and separation of principal
concepts and appear to support the hypothesis [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] of a correlation between the categorization
performance and architecture of the model.
      </p>
      <p>The methods of evaluation of latent distributions of data classes, demonstrated and verified in this
work are of general nature not limited to a specific type of data and can be instrumental in evaluation
of the learning capacity and performance of generative models. The unsupervised latent structure
demonstrated in this and other works can be used to enhance learning ability of the models in the
environments with strong deficit of labels. These findings can therefore be instrumental in development
of learning models and methods that are capable of acquiring knowledge in a flexible and
environment-driven process that is closer to the learning of biological systems.</p>
      <p>[22] Hassabis D., Kumaran D., Summerfield C. and Botvinick M., “Neuroscience-inspired Artificial
Intelligence”, Neuron, vol. 95, pp. 245-258, 2017.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Coates</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          , “
          <article-title>An analysis of single-layer networks in unsupervised feature learning”</article-title>
          ,
          <source>in: Proceedings of 14th International Conference on Artificial Intelligence and Statistics</source>
          ,
          <volume>15</volume>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>223</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Hornik</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stinchcombe</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>White</surname>
            <given-names>H.</given-names>
          </string-name>
          , “
          <article-title>Multilayer feedforward neural networks are universal approximators”</article-title>
          ,
          <source>Neural Networks</source>
          , vol.
          <volume>2</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>359</fpage>
          -
          <lpage>366</lpage>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Marfil</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molina-Tanco</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bandera</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodriguez</surname>
            <given-names>J.A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sandoval</surname>
            <given-names>F.</given-names>
          </string-name>
          , “
          <article-title>Pyramid segmentation algorithms revisited”</article-title>
          ,
          <source>Pattern Recognition</source>
          , vol.
          <volume>39</volume>
          (
          <issue>8</issue>
          ), pp.
          <fpage>1430</fpage>
          -
          <lpage>1451</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Chyrkov</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prystavka</surname>
            <given-names>P.</given-names>
          </string-name>
          , “
          <article-title>Suspicious Object Search in Airborne Camera Video Stream”</article-title>
          , in: Hu Z.,
          <string-name>
            <surname>Petoukhov</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dychka</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>(eds) Advances in Computer Science for Engineering and Education</article-title>
          ,
          <source>ICCSEEA 2018, Advances in Intelligent Systems and Computing</source>
          , vol.
          <volume>754</volume>
          , Springer, Cham, pp.
          <fpage>340</fpage>
          -
          <lpage>348</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Prystavka</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cholyshkina</surname>
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dolgikh</surname>
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Karpenko</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <article-title>"Automated object recognition system based on convolutional autoencoder," 10th International Conference on Advanced Computer Information Technologies (ACIT-</article-title>
          <year>2020</year>
          ), Deggendorf, Germany, pp.
          <fpage>830</fpage>
          -
          <lpage>833</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Fischer</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Igel</surname>
            <given-names>C.</given-names>
          </string-name>
          , “
          <article-title>Training restricted Boltzmann machines: an introduction”</article-title>
          ,
          <source>Pattern Recognition</source>
          , vol.
          <volume>47</volume>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>39</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Hinton</surname>
            <given-names>G. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Osindero</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teh</surname>
            <given-names>Y.W.</given-names>
          </string-name>
          , “
          <article-title>A fast learning algorithm for deep belief nets”</article-title>
          ,
          <source>Neural Computation</source>
          , vol.
          <volume>18</volume>
          (
          <issue>7</issue>
          ), pp.
          <fpage>1527</fpage>
          -
          <lpage>1554</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Welling</surname>
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kingma</surname>
            <given-names>D.P.</given-names>
          </string-name>
          , “
          <article-title>An introduction to variational autoencoders”</article-title>
          ,
          <source>Foundations and Trends in Machine Learning</source>
          , vol.
          <volume>12</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>307</fpage>
          -
          <lpage>392</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Creswell</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>White</surname>
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumoulin</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arulkumaran</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sengupta</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bharath</surname>
            <given-names>A.A.</given-names>
          </string-name>
          , “
          <article-title>Generative adversarial networks: an overview”</article-title>
          ,
          <source>IEEE Signal Processing Magazine</source>
          , vol.
          <volume>35</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>53</fpage>
          -
          <lpage>65</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Ranzato</surname>
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boureau</surname>
            <given-names>Y.-L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chopra</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>LeCun</surname>
            <given-names>Y.</given-names>
          </string-name>
          , “
          <article-title>A unified energy-based framework for unsupervised learning”</article-title>
          ,
          <source>in: 11th International Conference on Artificial Intelligence and Statistics (AISTATS)</source>
          , San Juan, Puerto Rico,
          <year>2007</year>
          , vol.
          <volume>2</volume>
          , pp.
          <fpage>371</fpage>
          -
          <lpage>379</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Le</surname>
            <given-names>Q.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ranzato</surname>
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monga</surname>
            <given-names>R.</given-names>
          </string-name>
          , et al., “
          <article-title>Building high-level features using large scale unsupervised learning”</article-title>
          ,
          <source>arXiv:1112.6209 [cs.LG]</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Higgins</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matthey</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glorot</surname>
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pal</surname>
            <given-names>A.</given-names>
          </string-name>
          , et al., “
          <article-title>Early visual concept learning with unsupervised deep learning”</article-title>
          ,
          <source>arXiv:1606.05579 [cs.LG]</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Shi</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            <given-names>Y.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Xu</surname>
            <given-names>B.</given-names>
          </string-name>
          , “
          <article-title>Concept learning through deep reinforcement learning with memory-augmented neural networks”</article-title>
          ,
          <source>Neural Networks</source>
          , vol.
          <volume>110</volume>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>54</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Dolgikh</surname>
            <given-names>S.</given-names>
          </string-name>
          , “
          <article-title>Categorized representations and general learning”</article-title>
          ,
          <source>in: Proceedings of 10th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions</source>
          , vol.
          <volume>1095</volume>
          , pp.
          <fpage>93</fpage>
          -
          <lpage>100</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Tishby</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pereira</surname>
            <given-names>F. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bialek</surname>
            <given-names>W.</given-names>
          </string-name>
          , “
          <article-title>The Information Bottleneck method”</article-title>
          ,
          <source>arXiv:physics/0004057</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Kavukcuoglu</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sermanet</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boureau</surname>
            <given-names>Y. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gregor</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mathieu</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>LeCun</surname>
            <given-names>Y.</given-names>
          </string-name>
          , “
          <article-title>Learning convolutional feature hierarchies for visual recognition”</article-title>
          ,
          <source>Proceedings of the 23rd International Conference on Neural Information Processing Systems</source>
          , vol.
          <volume>1</volume>
          , pp.
          <fpage>1090</fpage>
          -
          <lpage>1098</lpage>
          , Vancouver, Canada,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Simonyan</surname>
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zisserman</surname>
            <given-names>A.</given-names>
          </string-name>
          , “
          <article-title>Very deep convolutional networks for large-scale image recognition”</article-title>
          ,
          <source>arXiv:1409.1556 [cs.LG]</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <article-title>Keras: The Python Deep Learning library</article-title>
          , URL: https://keras.io.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Jolliffe</surname>
            <given-names>I.T.</given-names>
          </string-name>
          , “Principal Component Analysis”, Springer Series in Statistics, 2nd ed., Springer, New York,
          <year>2002</year>
          , XXIX, 487 p.
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Yoshida</surname>
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ohki</surname>
            <given-names>K.</given-names>
          </string-name>
          , “
          <article-title>Natural images are reliably represented by sparse and variable populations of neurons in visual cortex”</article-title>
          ,
          <source>Nature Communications</source>
          , vol.
          <volume>11</volume>
          , article no.
          <fpage>872</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Bao</surname>
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gjorgieva</surname>
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shanahan</surname>
            <given-names>L.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howard</surname>
            <given-names>J. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kahnt</surname>
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottfried</surname>
            <given-names>J. A.</given-names>
          </string-name>
          , “
          <article-title>Grid-like neural representations support olfactory navigation of a two-dimensional odor space”</article-title>
          ,
          <source>Neuron</source>
          , vol.
          <volume>102</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>1066</fpage>
          -
          <lpage>1075</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>