<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Convolutional neural networks for image classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrii O. Tarasenko</string-name>
          <email>andrejtarasenko97@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuriy V. Yakimov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladimir N. Soloviev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kryvyi Rih State Pedagogical University</institution>
          ,
          <addr-line>54, Gagarina Ave, Kryvyi Rih 50086</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>101</fpage>
      <lpage>114</lpage>
      <abstract>
        <p>This paper presents the theoretical basis for the creation of convolutional neural networks for image classification and their application in practice. To achieve this goal, the main types of neural networks are considered, from the structure of a simple neuron to the convolutional multilayer network necessary for solving the problem. The paper shows the structure of the training data, the training cycle of the network, and the calculation of recognition errors at the training and verification stages. Finally, the results of network training, the calculated recognition error, and the training accuracy are presented.</p>
      </abstract>
      <kwd-group>
        <kwd>machine learning</kwd>
        <kwd>deep learning</kwd>
        <kwd>neural network</kwd>
        <kwd>recognition</kwd>
        <kwd>convolutional neural network</kwd>
        <kwd>artificial intelligence</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Today, a long-standing field of research called artificial intelligence deals with the ability of machines to think like a human. Leading IT companies and university researchers have made breakthroughs in artificial intelligence research, and we now have software that can “see” (recognize objects in images, restore video), make predictions based on certain data (forecasting stock market indices), and make decisions (the ability to play games, sometimes better than a person).</p>
      <p>
        The idea of artificial intelligence originated in the 1940s, when the question first arose whether it was possible to make a computer “think” [
        <xref ref-type="bibr" rid="ref12">10</xref>
        ]. Briefly, this sphere can be described as the automation of intellectual tasks that are usually performed by people [
        <xref ref-type="bibr" rid="ref1 ref3">1</xref>
        ]. In time, researchers were able to create models that are capable of learning and performing tasks without being explicitly programmed. Such models are called neural networks. Their peculiarity lies in their ability to learn without the networks themselves being programmed.
      </p>
      <p>Artificial intelligence has grown into a field of research that today includes the machine learning paradigm [12; 13]. Machine learning differs from the earlier approach, known as “symbolic artificial intelligence”, in that in machine learning the programmer feeds the program data together with the answers that correspond to that data and receives the rules as output, whereas symbolic artificial intelligence, in turn, executed rules set by the programmer.</p>
      <p>
        In recent years, much attention has been paid to deep learning, which is successfully used in classification and recognition problems. The key place here is occupied by neural networks, namely convolutional neural networks, which embody the meaning of “depth” [
        <xref ref-type="bibr" rid="ref2 ref4">2</xref>
        ].
      </p>
      <p>Depth in deep learning does not mean a deeper understanding achieved by this approach; the idea is a multi-layered representation. The number of layers into which the data model is divided is called the depth of the model. Other fitting names for this area of machine learning would be layered learning or hierarchical learning.</p>
      <p>Modern deep learning often involves tens or even hundreds of successive layers of representation, all of which are determined automatically from the training data.</p>
    </sec>
    <sec id="sec-2">
      <title>Analysis of previous studies</title>
      <p>
        Deep neural networks have taken up a very large part of the world’s research and development over the past 20 years, but the first mention of deep network algorithms appeared in the mid-1960s in the book of Aleksey G. Ivakhnenko and Valentin G. Lapa [
        <xref ref-type="bibr" rid="ref8">6</xref>
        ]. The concept of “deep learning” then emerged in the scientific community through the work of Rina Dechter in 1986 [
        <xref ref-type="bibr" rid="ref5">3</xref>
        ].
      </p>
      <p>
        The first models of convolutional neural networks were called the “neocognitron” and were proposed in 1980 by Kunihiko Fukushima [
        <xref ref-type="bibr" rid="ref6">4</xref>
        ]. Fukushima proposed several algorithms for supervised and unsupervised learning, and the neocognitron itself was a multilayered deep structure.
      </p>
      <p>
        Although the recognition of medical images has recently attracted the attention of researchers, there are several difficulties in studying this area, namely: 1) a small number of training samples, and 2) differences in scale and fuzzy boundaries of images. These disadvantages were taken into account when creating a network model that offers a full-scale convolutional layer extracting patterns of different receptive fields with a common set of convolutional kernels, so that scale-invariant patterns are captured by this compact set of kernels [
        <xref ref-type="bibr" rid="ref15">13</xref>
        ].
      </p>
      <p>
        Over the last decade, convolutional networks have gained a great deal of attention from researchers and developers. ImageNet is one of the largest competitions dedicated to artificial intelligence and computer vision. Among the varieties of artificial networks, the prizes were taken by convolutional network structures such as AlexNet [
        <xref ref-type="bibr" rid="ref9">7</xref>
        ], VGG [
        <xref ref-type="bibr" rid="ref16">14</xref>
        ], GoogleNet [
        <xref ref-type="bibr" rid="ref17">15</xref>
        ] and ResNet [
        <xref ref-type="bibr" rid="ref7">5</xref>
        ]. These networks do an excellent job, with recognition rates of more than 90%.
      </p>
      <p>
        New developments in the recognition of biomedical images were presented by Shuchao Pang, Anan Du, Mehmet A. Orgun and Zhezhou Yu [
        <xref ref-type="bibr" rid="ref11">9</xref>
        ]. Their new neural network, called the “fused” convolutional neural network (FCNN), has a precise and highly efficient classifier that combines features from shallow layers with features from deep layers.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Basic information</title>
      <p>Neurons transmit electrical impulses. When an impulse is transmitted (through an axon), the signal can be amplified or attenuated before being passed to the next neuron (through dendrites). These basic functions are implemented in the model of artificial neural networks:
1. Data. As signals, the artificial network uses input data (photos, audio files), reduced to a certain form that the network is able to read;
2. Weights, the parameters of each layer of neurons. They act as the signal strength, by analogy with natural neurons: where a natural neuron has impulse strength, the artificial network has a numeric value;
3. Next comes the input function. Here the data and the numeric values of the weights are processed according to the type of the convolutional layer;
4. This is followed by the activation function, which processes the input value; its value is the output of the neuron. The data are then transferred through the weights to the next layer of neurons (Fig. 1).</p>
      <p>As mentioned earlier, an artificial neural network consists of a large number of artificial neurons, which are combined into layers forming a network of neurons. There are the following types of layers of neurons:
– Input layer. Here the network receives input data that has been converted in advance to the desired format (lists, arrays);
– Output layer. A layer that converts the incoming data into output (arrays of numbers) that satisfies the task;
– Convolutional layers. Models of deep neural networks use convolutional layers, which perform various kinds of “learning” operations on the input data on its way to the output layer.</p>
      <p>The simplest models of neural networks do not use convolutional layers; they are also called “single-layer” (Fig. 2). The input data described above are (x1, ..., xn), the corresponding learning weights (wnm), the summation and activation function (∑), and the output data (y1, ..., ym).</p>
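      <p>A single neuron of such a network can be sketched in a few lines of NumPy; the input values, weights, and threshold below are made-up numbers chosen purely for illustration:</p>
      <preformat>
```python
import numpy as np

# One artificial neuron: the input function is the weighted sum of the
# inputs, and a unit (step) activation with threshold 0.5 turns that sum
# into the neuron's output. All numbers here are illustrative.
x = np.array([0.5, 0.2, 0.8])   # input data x1..x3
w = np.array([0.4, 0.9, 0.3])   # weights w1..w3
s = float(np.dot(x, w))         # input function: weighted sum = 0.62
y = 1.0 if s >= 0.5 else 0.0    # step activation: answer "Yes" (1) or "No" (0)
print(s, y)
```
      </preformat>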
      <p>Deep learning uses convolutional neural networks, which can contain many
convolutional layers of neurons. The task of such layers is to process the input data in
such a way that the output receives data that satisfies the problem condition. As with a
single-layer network, these are lists of numbers.</p>
      <p>Deep learning uses the tensor as a container for data. In essence, a tensor is a matrix of numbers; a single number is also called a tensor, or a scalar.</p>
      <p>Tensors are characterized by the following key characteristics:
– Number of axes (rank, dimension). For example, a matrix is a two-dimensional tensor;
– Form (shape). A list of integers describing the number of dimensions along each axis of the tensor;
– Data type. The type of the data belonging to the tensor, for example float32, float64, uint8, etc.</p>
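      <p>These three characteristics can be inspected directly in NumPy, for example on a rank-2 tensor (a matrix):</p>
      <preformat>
```python
import numpy as np

# The three key characteristics of a tensor, shown on a matrix.
m = np.zeros((3, 4), dtype=np.float32)
print(m.ndim)   # number of axes (rank): 2
print(m.shape)  # form: 3 rows, 4 columns
print(m.dtype)  # data type: float32
```
      </preformat>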
      <p>It should also be noted that, although the data is stored in the tensor as an array, the neural network does not process the entire data set at once but divides it into so-called “packets” of data. That is, a data packet represents several separate samples together with their characteristics.</p>
      <p>There are the following main categories of data:
– Vector data: two-dimensional tensors with shape (samples, features);
– Time series: three-dimensional tensors with shape (samples, timesteps, features);
– Images: four-dimensional tensors with shape (samples, height, width, color);
– Video: five-dimensional tensors with shape (samples, frames, height, width, color).
Also, returning to the structure of the neuron, it is necessary to mention the activation function. It has been said that the function takes some value as a parameter. This value is the data that has gone through the operations on tensors (each convolutional layer performs its own operations), after which the data becomes the output of that convolutional layer.</p>
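      <p>The image category and the packet mechanism above can be illustrated together; the packet size of 8 and the batch of 32 grayscale 28x28 images below are made-up example values:</p>
      <preformat>
```python
import numpy as np

# A set of image samples is a rank-4 tensor with shape
# (samples, height, width, color channels): here 32 grayscale
# 28x28 images with a single color channel.
batch = np.zeros((32, 28, 28, 1), dtype=np.float32)
packet = batch[:8]    # the network processes packets, not the whole set
print(batch.ndim)     # 4 axes
print(packet.shape)   # (8, 28, 28, 1): 8 samples per packet
```
      </preformat>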
      <p>Depending on the complexity of the task set for the neural network, activation
functions can be different (table 1):</p>
      <p>Unit (step) function: f(s) = 1 if s ≥ b, otherwise 0, where b is the adder output (threshold). As the graph shows, the function outputs only 0 or 1, so it is usually used in tasks where it is enough to give the answer “Yes” or “No”.</p>
      <p>Logistic function: f(s) = 1 / (1 + e^(−a·s)), where a is the coefficient that characterizes the curvature of the graph. This is the most commonly used function, because between the two states 0 and 1 there are many intermediate values, for example 0.234532 or 0.7. This makes it possible to get from the neural network not just one answer but, for example, 10 or 1000.</p>
      <p>Hyperbolic tangent: f(s) = tanh(s). It is used for a more realistic model of a neural network, which can produce output values that are not only positive but also negative, that is, from -1 to 1.</p>
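      <p>The logistic and hyperbolic tangent functions from table 1 are available directly in Python, as a brief sketch shows:</p>
      <preformat>
```python
import math

# The logistic (sigmoid) function maps any input into (0, 1);
# the hyperbolic tangent maps any input into (-1, 1).
def sigmoid(s, a=1.0):
    return 1.0 / (1.0 + math.exp(-a * s))

print(sigmoid(0.0))              # 0.5, halfway between the two states
print(round(math.tanh(1.0), 3))  # 0.762, a value between -1 and 1
```
      </preformat>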
      <p>The following steps are performed in the training cycle:
– The neural network receives a data packet with training instances and the corresponding verification data (which must be different);
– The network performs data processing (this step is called a forward pass) and produces a packet of predictions;
– An estimate of the discrepancy between the network’s predictions and the verification data is calculated; that is, there must be a function to estimate this discrepancy;
– The parameters are adjusted to reduce the discrepancy on this data packet.
The training cycle is repeated as many times as the problem requires. After completing the training, we obtain a network with a low discrepancy estimate.</p>
      <p>Parameter correction occurs through calculating the gradient of the network’s discrepancy with respect to its parameters. An offset is then added to the trainable parameters in the direction opposite to the network’s discrepancy gradient.</p>
      <p>Then, the training cycle looks like this:
1. The neural network receives a data packet with training instances and the corresponding verification data (which must be different);
2. The network performs data processing (this step is called a forward pass) and produces a packet of predictions;
3. An estimate of the discrepancy between the network’s predictions and the verification data is calculated; that is, there must be a function to estimate this discrepancy;
4. The discrepancy gradient with respect to the network parameters is calculated (a reverse pass);
5. The parameters are adjusted by a small value in the direction opposite to the gradient, to reduce the discrepancy on this data packet.</p>
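      <p>The five steps above can be sketched in NumPy for a single linear neuron; the toy data (fitting y = 2x) and the learning rate of 0.5 are made-up values chosen only to make the cycle visible:</p>
      <preformat>
```python
import numpy as np

# Minimal sketch of the five-step training cycle for one linear neuron.
rng = np.random.default_rng(0)
x = rng.random((16, 1)).astype(np.float32)   # step 1: training packet
y = 2.0 * x                                  # step 1: verification targets
w = np.zeros((1, 1), dtype=np.float32)       # the trainable parameter

for _ in range(200):
    pred = x @ w                      # step 2: forward pass
    diff = pred - y
    loss = float(np.mean(diff ** 2))  # step 3: discrepancy estimate
    grad = 2.0 * x.T @ diff / len(x)  # step 4: gradient (reverse pass)
    w -= 0.5 * grad                   # step 5: small step against the gradient

print(round(float(w[0, 0]), 3))  # close to the true weight 2.0
```
      </preformat>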
      <p>To achieve the result, the gradient descent method is applied to the gradient of the discrepancy on the selected data packet. Before the training cycle, the loss gradient is calculated on a certain data packet, after which the global minimum of this function is sought by the gradient descent method.</p>
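      <p>Gradient descent itself can be demonstrated on a one-parameter toy loss; the function loss(w) = (w - 3)² and the learning rate below are illustrative choices, not part of the network described here:</p>
      <preformat>
```python
# Gradient descent on loss(w) = (w - 3)**2, whose gradient is 2*(w - 3);
# the true minimum sits at w = 3.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)       # loss gradient at the current point
    w -= learning_rate * grad    # small step opposite to the gradient
print(round(w, 4))
```
      </preformat>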
      <p>Fig. 3 presents images of the gradient descent operation on a function:</p>
      <p>The architecture of neural networks is determined by the tasks for which neural
networks are designed.</p>
      <p>The task of this work is the recognition of objects in images. Therefore, it is necessary to use an appropriate network architecture, one that highlights the features in the images and forms from them a representation of the object in the photo.</p>
      <p>We review the principle of convolutional neural networks for image classification.</p>
      <p>First, you need to determine the type of architecture. As described earlier, convolutional neural networks in most cases have a sequential architecture: the neural network receives certain data (tensors) at the input, after which the data is sequentially processed layer by layer up to the output layer.</p>
      <p>Convolutional layers are used for recognition; in them, the features of each object are extracted. For example, an image arrives at the input layer, and the neural network tries to extract one common image (in simple networks, usually a single object). Object extraction is performed by means of so-called layer filters. A filter is a window that is small relative to the image and reads a certain area of it. Usually the window size is 3x3 or 5x5 pixels, for various image sizes. To compute the image values, this filter must pass over the image. Usually the filter starts recognizing from the upper left corner of the image and, moving 1 pixel to the side and then downward, passes over the entire image. At the initial stages of research, it is advisable to use 28x28 pixel images (Fig. 4), because larger images take more time to traverse, and therefore more time to learn.</p>
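      <p>The sliding 3x3 filter described above can be sketched in NumPy; the 6x6 random pixel values and the simple averaging filter below are stand-ins chosen only to show the window traversal:</p>
      <preformat>
```python
import numpy as np

# A 3x3 filter window sliding over a 6x6 "image" one pixel at a time,
# left to right and top to bottom.
rng = np.random.default_rng(1)
img = rng.random((6, 6)).astype(np.float32)
kernel = np.ones((3, 3), dtype=np.float32) / 9.0  # averaging filter

out = np.zeros((4, 4), dtype=np.float32)  # 6 - 3 + 1 = 4 positions per axis
for i in range(4):
    for j in range(4):
        window = img[i:i+3, j:j+3]               # region under the filter
        out[i, j] = float((window * kernel).sum())
print(out.shape)
```
      </preformat>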
      <p>So, at the first convolutional stage, the neural network uses filters to determine the main
object in the image.</p>
      <p>The next step in network training is to highlight the spatial hierarchy of features. That is, having identified, for example, a cat in the picture, the network must divide the cat into part-objects, which it must also “remember” (Fig. 5). After training, the network forms from such part-objects a representation of the object as a whole, that is, of a cat or another object of recognition.</p>
      <p>So, we have determined that convolutional layers use filters of a certain size to extract the primary features of the object in the image.</p>
      <p>However, with the help of convolutional layers alone, it is impossible to achieve recognition of the spatial hierarchy of objects. To achieve this goal, certain operations should be performed on the image. One of the operations most often used in practice is the “selection of the largest of the neighboring” (Max Pooling).</p>
      <p>The reason for using Max Pooling is that the neural network must determine the
spatial hierarchy of objects, and for this we need to reduce the image by 2 times.</p>
      <p>The Max Pooling operation is similar to the convolution operation. We take a window, now 2x2 pixels, and perform one simple action with the four elements taken: choose the largest number. Thus, the output data is filled only with the largest number from each window, and the resulting image is reduced by half.</p>
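      <p>The Max Pooling operation just described can be sketched directly; the 4x4 pixel values below are arbitrary illustrative numbers:</p>
      <preformat>
```python
import numpy as np

# 2x2 Max Pooling on a 4x4 "image": each window keeps only its largest
# number, so the result is half the size in each direction.
img = np.array([[1, 3, 2, 4],
                [5, 6, 1, 2],
                [7, 2, 9, 1],
                [3, 4, 6, 8]], dtype=np.float32)

pooled = np.zeros((2, 2), dtype=np.float32)
for i in range(2):
    for j in range(2):
        window = img[2*i:2*i+2, 2*j:2*j+2]  # one 2x2 window
        pooled[i, j] = window.max()         # select the largest neighbor
print(pooled)
```
      </preformat>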
      <p>The third step is to define a data set for neural network training.</p>
      <p>Since classification tasks are primarily supervised learning (teacher-assisted learning), we will use a special data package that contains the training data and the class labels that refer to the recognized data.</p>
      <p>The sigmoid activation function is suitable for classification problems, but in our case we will use the ReLU activation function (Fig. 6). The advantage of this function is the sparsity of activity, that is, the involvement of only a part of the neurons, which accordingly reduces the computational load.</p>
      <p>
        Although the use of this activation function makes part of the network passive (some
neurons are not activated), it produces good results on testing [
        <xref ref-type="bibr" rid="ref2 ref4">2</xref>
        ].
      </p>
      <p>f(x) = max(0, x)</p>
      <p>To train the network, the corresponding weights of each neuron are changed, namely with the help of an optimizer. In this case, the optimizer uses gradient descent.</p>
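      <p>The ReLU formula f(x) = max(0, x) translates to a one-line NumPy function:</p>
      <preformat>
```python
import numpy as np

# ReLU: negative inputs are zeroed out (those neurons stay passive),
# positive inputs pass through unchanged.
def relu(x):
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))
```
      </preformat>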
    </sec>
    <sec id="sec-4">
      <title>Development tool</title>
      <p>An important stage in the development of the program is the choice of development tools, because it affects the complexity and quality of the result. In fact, any programming language has tools for the development of neural networks; the difference lies only in the complexity of their implementation in the chosen language. There are even many implementations of neural networks in web programming languages. Today Python is quite a popular language among researchers: it is quite easy to understand and has a large community of developers. Python has many libraries for research and data visualization, so it is advisable to choose Python as the implementation language for our own software tool.</p>
      <p>To control the versions of the libraries and the development environment, it is
advisable to choose the Anaconda tool. The choice of Anaconda lies in the convenience
of version control of programming environments and libraries. Anaconda has a base of
libraries for Python, which are quite convenient to install, and supports several
programming environments, one of which is Jupyter Notebook, which was originally
developed for the convenience of research, but eventually gained popularity among the
community of developers. It will also be used as a programming environment.</p>
      <p>To achieve this goal, we will use a number of additional libraries, among which Keras and TensorFlow should be highlighted in particular.</p>
      <p>Keras is an open source deep learning library written in the Python programming
language. The library was created to improve research and design of neural networks.
The library has a lot of implementations of convolutional layers, optimizers, activation
functions, work with images and text.</p>
      <p>TensorFlow is an open-source software library for machine learning developed by
Google to solve the problems of building and training a neural network in order to
automatically find and classify objects, achieving the quality of human perception. It is
used both for research and for the development of Google’s own products. The main
API for working with the library is implemented for Python.</p>
      <p>NumPy is an open-source module for Python that provides general mathematical and numerical operations in the form of pre-compiled, fast functions, combined into high-level packages. They provide functionality comparable to that of MatLab. NumPy (Numeric Python) provides the base methods for handling large arrays and matrices. SciPy (Scientific Python), which we will also use, extends NumPy functionality with a huge collection of useful algorithms such as minimization, Fourier transforms, regression, and other applied mathematical techniques.</p>
      <p>Matplotlib is a Python library for the visualization of two-dimensional (2D) graphics data (3D graphics is also supported). The resulting images can be used as illustrations in publications.</p>
      <p>OpenCV is a computer vision and machine learning library. We use the OpenCV
library to take a picture from a webcam.</p>
      <p>PIL (python image library) – a set of tools for working with images in Python.</p>
      <p>Dlib is an open-source machine learning library. Dlib includes a pretrained neural network that recognizes the descriptors of a person’s face from a photo, which we intend to use in future work.</p>
      <p>Tkinter is a cross-platform library for building window interfaces, included in the
standard set of Python modules.</p>
      <p>As the data set for training and testing, we used X-ray images of fractures of human body parts. The images are taken from the following datasets:</p>
      <p>The learning process is shown in Fig. 7. In the figure, each iteration of learning (epoch) is accompanied by calculations of the error and accuracy on the training data set and on the test data set. The verification data contain images that are not present in the training set, so this is the accuracy to be guided by. The number of epochs, 25, was chosen arbitrarily for testing. However, analyzing the recognition accuracy over the epochs, it becomes clear that at first the neural network increases its accuracy, after which it fluctuates by several percent. Starting from the 8th epoch, the accuracy falls, then increases, and this repeats until the end. This is a clear sign of overfitting the network.</p>
      <p>The plot in Fig. 8 shows that the accuracy increases up to the tenth epoch, then decreases, then increases again. The number of learning epochs should be reduced.</p>
      <p>Fig. 9 shows the errors on the same data. The smallest error is recorded on the tenth
epoch.</p>
      <p>During training, we obtain the following data after the completion of each training epoch: loss – the error on the training data; acc – the accuracy on the training data; val_loss – the error on the verification data; val_acc – the accuracy on the verification data.</p>
      <p>As a result of the training, we received an accuracy of 73.13% on the test data.</p>
      <p>We applied the convolutional neural network to the recognition of medical images. The neural network was trained on a data set of broken and whole human bones (Fig. 10). The neural network is able to identify obvious bone fractures on X-rays (Fig. 11).</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>Consequently, this work presented the essential theory of neural networks and the structure and training cycle of convolutional neural networks. As shown at the end, the result of training neural networks allows us, without explicit programming, to “teach” the program to recognize objects in images.</p>
      <p>This paper provides examples of databases, both existing and self-assembled, on
which the neural network learns to distinguish objects. The assessment of accuracy and
errors at the stages of training and verification was also carried out.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Brownlee</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A Gentle Introduction to the Rectified Linear Unit (ReLU). Machine Learning Mastery</article-title>
          . https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/ (
          <year>2019</year>
          ). Accessed 25 Oct 2019
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Brownlee</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A Tour of Machine Learning Algorithms</article-title>
          . Machine Learning Mastery. https://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/ (
          <year>2019</year>
          ). Accessed 25 Oct 2019
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          1.
          <string-name>
            <surname>Chollet</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Deep Learning with Python</article-title>
          . Manning, Shelter Island
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          2.
          <string-name>
            <surname>Courville</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Deep Learning</article-title>
          . MIT Press, Cambridge (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dechter</surname>
          </string-name>
          , R.:
          <article-title>Learning while searching in constraint-satisfaction-problems</article-title>
          .
          <source>In: AAAI-86 Proceedings The Fifth National Conference on Artificial Intelligence, August 11-15</source>
          ,
          <year>1986</year>
          , in Philadelphia, Pennsylvania., pp.
          <fpage>178</fpage>
          -
          <lpage>183</lpage>
          . https://aaai.org/Papers/AAAI/1986/AAAI86-029.pdf (
          <year>1986</year>
          ).
          <source>Accessed 17 Aug 2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fukushima</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position</article-title>
          .
          <source>Biological Cybernetics</source>
          <volume>36</volume>
          ,
          <fpage>193</fpage>
          -
          <lpage>202</lpage>
          (
          <year>1980</year>
          ). doi:10.1007/BF00344251
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          5.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
          </string-name>
          , J.:
          <article-title>Deep Residual Learning for Image Recognition</article-title>
          .
          <source>In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <source>Las Vegas</source>
          ,
          <fpage>27</fpage>
          -
          <lpage>30</lpage>
          June 2016, pp.
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          . IEEE (
          <year>2015</year>
          ). doi:10.1109/CVPR.2016.90
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ivakhnenko</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapa</surname>
            ,
            <given-names>V.G.</given-names>
          </string-name>
          :
          <article-title>Cybernetics and forecasting techniques</article-title>
          . American Elsevier Publ. Co., New York (
          <year>1967</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          7.
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
          </string-name>
          , G.E.:
          <article-title>ImageNet classification with deep convolutional neural networks</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>60</volume>
          (
          <issue>6</issue>
          ),
          <fpage>84</fpage>
          -
          <lpage>90</lpage>
          (
          <year>2017</year>
          ).
          <source>doi:10.1145/3065386</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          8.
          <string-name>
            <surname>Laves</surname>
            ,
            <given-names>M.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ihler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ortmaier</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Deformable Medical Image Registration Using a Randomly-Initialized CNN as Regularization Prior</article-title>
          . In:
          <source>Medical Imaging with Deep Learning 2019</source>
          . https://openreview.net/pdf?id=S1ehZFQ15E
          (
          <year>2019</year>
          ). Accessed 25 Oct 2019
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orgun</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>A novel fused convolutional neural network for biomedical image classification</article-title>
          .
          <source>Medical &amp; Biological Engineering &amp; Computing</source>
          <volume>57</volume>
          ,
          <fpage>107</fpage>
          -
          <lpage>121</lpage>
          (
          <year>2019</year>
          ). doi:10.1007/s11517-018-1819-y
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          10.
          <string-name>
            <surname>Semerikov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teplytskyi</surname>
            ,
            <given-names>I.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yechkalo</surname>
            ,
            <given-names>Yu.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiv</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          :
          <article-title>Computer Simulation of Neural Networks Using Spreadsheets: The Dawn of the Age of Camelot</article-title>
          . In:
          <string-name>
            <surname>Kiv</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soloviev</surname>
            ,
            <given-names>V.N.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the 1st International Workshop on Augmented Reality in Education (AREdu 2018)</source>
          , Kryvyi Rih, Ukraine, October 2,
          <year>2018</year>
          .
          <source>CEUR Workshop Proceedings</source>
          <volume>2257</volume>
          ,
          <fpage>122</fpage>
          -
          <lpage>147</lpage>
          . http://ceur-ws.org/Vol-2257/paper14.pdf (
          <year>2018</year>
          ). Accessed 30 Nov 2018
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          11.
          <string-name>
            <surname>Semerikov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teplytskyi</surname>
            ,
            <given-names>I.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yechkalo</surname>
            ,
            <given-names>Yu.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiv</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          :
          <article-title>Computer Simulation of Neural Networks Using Spreadsheets: The Dawn of the Age of Camelot</article-title>
          . In:
          <string-name>
            <surname>Kiv</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soloviev</surname>
            ,
            <given-names>V.N.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the 1st International Workshop on Augmented Reality in Education (AREdu 2018)</source>
          , Kryvyi Rih, Ukraine, October 2,
          <year>2018</year>
          .
          <source>CEUR Workshop Proceedings</source>
          <volume>2257</volume>
          ,
          <fpage>122</fpage>
          -
          <lpage>147</lpage>
          . http://ceur-ws.org/Vol-2257/paper14.pdf (
          <year>2018</year>
          ). Accessed 30 Nov 2018
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          12.
          <string-name>
            <surname>Semerikov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teplytskyi</surname>
            ,
            <given-names>I.O.</given-names>
          </string-name>
          :
          <article-title>Metodyka uvedennia osnov Machine learning u shkilnomu kursi informatyky (Methods of introducing the basics of Machine learning in the school course of informatics)</article-title>
          . In:
          <source>Problems of informatization of the educational process in institutions of general secondary and higher education, Ukrainian scientific and practical conference</source>
          , Kyiv, October 9,
          <year>2018</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>20</lpage>
          . Vyd-vo NPU imeni M. P. Drahomanova, Kyiv (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          13.
          <string-name>
            <surname>Semerikov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          :
          <article-title>Zastosuvannia metodiv mashynnoho navchannia u navchanni modeliuvannia maibutnikh uchyteliv khimii (The use of machine learning methods in teaching modeling future chemistry teachers)</article-title>
          . In:
          <string-name>
            <surname>Starova</surname>
            ,
            <given-names>T.V.</given-names>
          </string-name>
          (ed.)
          <source>Technologies of teaching chemistry at school and university, Ukrainian Scientific and Practical Internet Conference</source>
          , Kryvyi Rih, November
          <year>2018</year>
          , pp.
          <fpage>10</fpage>
          -
          <lpage>19</lpage>
          . KDPU, Kryvyi Rih (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          14.
          <string-name>
            <surname>Simonyan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zisserman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Very Deep Convolutional Networks for Large-Scale Image Recognition</article-title>
          . In: International Conference on Learning Representations. https://www.robots.ox.ac.uk/~vgg/publications/2015/Simonyan15/simonyan15.pdf (
          <year>2015</year>
          ). Accessed 25 Oct 2019
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          15.
          <string-name>
            <surname>Szegedy</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jia</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sermanet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reed</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anguelov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erhan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanhoucke</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rabinovich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Going Deeper with Convolutions</article-title>
          . In:
          <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          , Boston, 7-12 July 2015, pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . IEEE (
          <year>2015</year>
          ). doi:10.1109/CVPR.2015.7298594
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          16.
          <string-name>
            <surname>Theart</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Getting started with PyTorch for Deep Learning (Part 3: Neural Network basics)</article-title>
          . Code to Light. https://codetolight.wordpress.com/2017/11/29/getting-started-with-pytorch-for-deep-learning-part-3-neural-network-basics/ (
          <year>2017</year>
          ). Accessed 25 Oct 2019
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>