<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Introduction to Image Classification and Object Detection using YOLO Detector</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Technical University of Košice</institution>
          ,
          <addr-line>Košice</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Artificial neural networks have proved to be the best and most widely used solution for image classification and object detection tasks. This paper analyzes them as a tool that significantly improves these very demanding computations. We briefly review the history of their development and describe the object detector selected for the introductory experiment presented later in the paper. We also outline future research that will build on this experiment and will involve a new methodology for the automated generation of domain-specific datasets, which are essential in the training phase of neural networks.</p>
      </abstract>
      <kwd-group>
        <kwd>Artificial neural network</kwd>
        <kwd>Image classification</kwd>
        <kwd>Object detection</kwd>
        <kwd>Dataset</kwd>
        <kwd>Pattern recognition</kwd>
        <kwd>Computer vision</kwd>
        <kwd>Machine learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In the last two decades, scientists and researchers in the fields of computer vision,
machine learning and neural networks have seen a surge in the popularity of these
areas of computer science, because both the hardware and the software components of
today's computers have advanced significantly. This progress allows us to perform
extensive algorithmic operations and to work with huge amounts of data.</p>
      <p>We analyzed artificial neural networks (neural networks for short), a subarea of
machine learning, which are the most suitable method for image classification and
object detection tasks.</p>
      <p>Neural networks draw on methodologies from machine learning and computer vision.
Computer vision handles image processing, including noise reduction, brightness
adjustment and image enhancement by various techniques. Machine learning, on the
other hand, is very flexible: it can be used in computer vision and image processing
as well as in other areas of computer science.</p>
      <p>The paper also describes the history of neural networks, focusing on the
convolutional neural network, which has become the most popular method for image
classification and object detection tasks.</p>
      <p>Based on the analyzed facts and the results of our empirical tests, we would like
to design and implement an optimized method for the automated generation of
domain-specific datasets. Such datasets are essential in the training phase, since
neural networks need them to learn to detect objects in series of new images.</p>
    </sec>
    <sec id="sec-2">
      <title>Neural Networks</title>
      <p>
        There are many general methods that solve problems in unique ways within an
acceptable amount of time, and neural networks have become one of them, commercially
popularized by the fact that both hardware and software advance significantly on a
daily basis. Today, they are widely used in many areas of computer science, from
Arduino microcontroller interfaces [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] through authentication [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to the image classification and object detection researched here.
      </p>
      <p>Neural networks consist of many interconnected groups of nodes called neurons.
Input variables from the data are fed to these neurons as a multivariable linear
combination, in which each input is multiplied by a weight. A non-linearity is then
applied to this linear combination, which gives neural networks the ability to model
complex non-linear relations. Neural networks can have multiple layers, where the
output of one layer is the input of the next. For learning and detection, neural
networks use training datasets (section 2.2).</p>
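      <p>As an illustrative sketch (not code from the paper), the layer computation
described above, a weighted linear combination followed by a non-linearity, with the
output of one layer feeding the next, can be written in a few lines of NumPy:</p>
      <preformat>
```python
import numpy as np

def dense_layer(x, W, b):
    """One layer: a multivariable linear combination of the inputs
    (weights W, bias b) followed by a non-linearity (here ReLU)."""
    z = W @ x + b              # linear combination with the weights
    return np.maximum(z, 0.0)  # non-linearity enables non-linear modelling

# Stacked layers: the output of one layer is the input of the next.
rng = np.random.default_rng(0)
x = rng.normal(size=4)                                    # input variables
h = dense_layer(x, rng.normal(size=(8, 4)), np.zeros(8))  # hidden layer
y = dense_layer(h, rng.normal(size=(2, 8)), np.zeros(2))  # output layer
```
      </preformat>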
      <p>Nowadays, there are many algorithms using various types of neural networks.
Their historical development is described in section 2.1.</p>
      <sec id="sec-2-1">
        <title>History of the Neural Networks</title>
        <p>
          For several decades there have been simple approaches to creating some of the
first neural networks; the very first was begun by Frank Rosenblatt in 1958 [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], who researched how information from the physical world is stored in a
biological system so that it can later be used for detection or to influence
behavior.
        </p>
        <p>
          Later, models with several successive non-linear layers of neurons were
developed, dating back to the sixties [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] and the seventies [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The gradient descent method used in supervised learning [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] was first applied to a neural network in 1981, in discrete, differentiable
networks of arbitrary depth, under the name backpropagation [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>
          With a huge number of diverse layers, neural networks were too hard to develop
at that time, so their development stagnated until the beginning of the nineties
[
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], when the unsupervised learning [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] method was introduced.
        </p>
        <p>
          In the nineties and the following decade there were significant improvements
in this field. A new method, reinforcement learning
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], was developed: an agent placed in an unknown environment learns about its
surroundings by trial and error, getting better every time it tries a new approach
with its actions [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          In the third millennium, neural networks attracted a large number of
researchers thanks to their applications in many different sectors [
          <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
          ], where they rank among the best algorithms. Since 2009 neural networks have
won many competitions, especially in pattern recognition.
        </p>
        <p>
          Pattern recognition was significantly improved when Alex Krizhevsky et al.
developed a convolutional neural network for the image classification task of the
2012 ImageNet challenge [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. He and his team won the challenge and created a state-of-the-art image
classification method that is still used today.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Datasets</title>
        <p>Today, there are many datasets for machine learning, but we will take a
closer look at image datasets, which are essential for image classification and
object detection tasks.</p>
        <p>Creating image datasets is a relatively time-consuming task, since they only
become useful when they contain a huge amount of data. The image datasets used in
image classification and object detection are created by labeling objects and
accurately locating them with bounding boxes. At present there are no tools that can
perform fully automated object labeling and locating.</p>
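        <p>As an illustration of such labels, the Darknet-style format used with YOLO
stores one text line per object: the class index followed by the bounding box
center, width and height, normalized to the image size (the helper name below is
ours):</p>
        <preformat>
```python
def to_darknet_label(class_id, box, img_w, img_h):
    """Turn a pixel-space box (x_min, y_min, x_max, y_max) into a
    Darknet-style label line with coordinates normalized to [0, 1]."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# A 100x60 box centered at (200, 150) in a 400x300 image:
label = to_darknet_label(0, (150, 120, 250, 180), 400, 300)
# label == "0 0.500000 0.500000 0.250000 0.200000"
```
        </preformat>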
        <p>We want to direct our research at domain-specific environments, so a method
that automates the generation of such datasets is desired in the community.</p>
        <p>We assume that it will be based on a convolutional neural network and an image
object detector with the YOLO architecture, which we empirically tested in our two
experiments (section 3.2). Our idea is to collect images online that contain various
types and colors of the same object classes, have a transparent or single-colored
background and an accurate name. We could then extract the individual objects from
the images and programmatically adjust their brightness, lighting, shadows, etc. to
obtain even more images for the training phase.</p>
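        <p>A minimal sketch of one such programmatic adjustment (brightness scaling,
assuming the images are 8-bit RGB NumPy arrays):</p>
        <preformat>
```python
import numpy as np

def adjust_brightness(img, factor):
    """Scale pixel intensities by `factor` (above 1 brightens, below 1
    darkens), clipping the result to the valid 8-bit range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

patch = np.full((2, 2, 3), 100, dtype=np.uint8)  # flat grey object crop
brighter = adjust_brightness(patch, 1.5)         # every pixel becomes 150
darker = adjust_brightness(patch, 0.5)           # every pixel becomes 50
```
        </preformat>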
        <p>Our idea is to place these objects onto randomly generated backgrounds at
random locations and with random overlapping, as shown in Fig. 1.</p>
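        <p>The random-placement idea can be sketched as follows (illustrative only;
synthetic arrays stand in for real object crops and backgrounds):</p>
        <preformat>
```python
import numpy as np

rng = np.random.default_rng(42)

def paste_random(background, obj):
    """Paste an object crop at a uniformly random location inside the
    background; return the composite and the resulting bounding box,
    which doubles as the automatically generated label."""
    bh, bw, _ = background.shape
    oh, ow, _ = obj.shape
    y = rng.integers(0, bh - oh + 1)  # random top-left corner
    x = rng.integers(0, bw - ow + 1)
    out = background.copy()
    out[y:y + oh, x:x + ow] = obj
    return out, (x, y, x + ow, y + oh)

bg = np.zeros((64, 64, 3), dtype=np.uint8)       # synthetic background
obj = np.full((16, 16, 3), 255, dtype=np.uint8)  # synthetic object crop
img, box = paste_random(bg, obj)
```
        </preformat>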
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Detector YOLO and the Experiments</title>
      <p>
        YOLO is an object detector created by Redmon, J., et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The YOLO authors state [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] that it is a state-of-the-art image object detector achieving the best results
in terms of accuracy and speed, which is why we used it in our research along with
its neural network, called Darknet.
      </p>
      <sec id="sec-3-1">
        <title>Detector</title>
        <p>YOLO divides each image into a grid of size S x S, and each cell of the grid
predicts B bounding boxes together with their confidence. The confidence reflects
how reliable and accurate the bounding box that locates and classifies an object is.
The confidence of an object is defined as follows:

Confidence = Pr(Object) ∗ IOU(pred, truth)    (1)

which means that the probability of the detected object is multiplied by the
intersection over union (the intersection area divided by the union area of two
bounding boxes) between the predicted bounding box and the ground truth box (i.e.
the hand-labeled bounding box in the training data).</p>
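        <p>Read directly, the confidence of Eq. (1) multiplies the object probability by
the intersection over union of the predicted and ground-truth boxes. An illustrative
sketch (the function names are ours; boxes are given as (x_min, y_min, x_max,
y_max)):</p>
        <preformat>
```python
def iou(box_a, box_b):
    """Intersection over union: intersection area / union area."""
    ix0 = max(box_a[0], box_b[0])
    iy0 = max(box_a[1], box_b[1])
    ix1 = min(box_a[2], box_b[2])
    iy1 = min(box_a[3], box_b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def confidence(p_object, pred_box, truth_box):
    """Eq. (1): Pr(Object) multiplied by the IOU between the predicted
    box and the ground-truth box."""
    return p_object * iou(pred_box, truth_box)

# Two unit-overlap boxes: intersection 1, union 7, IOU = 1/7.
score = confidence(0.7, (0, 0, 2, 2), (1, 1, 3, 3))
```
        </preformat>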
      </sec>
      <sec id="sec-3-2">
        <title>Experiments</title>
        <p>
          With the YOLO detector we conducted two experiments using a model pre-trained
on the COCO dataset [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. In the first, we showed how the detector works on the image shown below
(Fig. 2); in the second, we tested the detector on a series of 500 images to
empirically confirm its functionality.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Using the Detector on an Image at Various Resolutions</title>
        <p>In this experiment we compared image classification and object detection
performance on an Intel Core i7-7700K processor (Table 1) and on a GeForce GTX 1070
graphics card (Table 2), using the same image for both components.</p>
        <p>[Tables 1 and 2 report, for the processor and the graphics card respectively,
the detection results for the test image at resolutions 378x284, 756x567, 1008x756,
2016x1512 and 4032x3024.]</p>
        <p>Comparing the two tables, we can see that data processing, image
classification and object detection are noticeably slower on the processor than on
the graphics card (approximately 8x slower). Also, the number of detected objects
increases with the resolution, because the image is of better quality and clearer.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Using the Detector on a Series of 500 Images</title>
        <p>We extended our first experiment to detect objects on a series of 500 images.
Based on the results of the previous experiment, we no longer varied the resolution,
since resolution has no effect on detection time and the original resolution yields
the most detected objects. For comparison we selected the images with the fastest
and the slowest detection times and the images with the most and the fewest detected
objects, and we also report the average time and the average number of detected
objects over the whole series. As in the first experiment, we ran the detection on
both the processor and the graphics card. The results are shown in Table 3 and
Table 4.</p>
        <p>On the series of 500 images, detection time ranges from 1401.273 ms to
2188.822 ms with an average of 1563.503 ms on the processor, and from 187.003 ms to
220.742 ms with an average of 192.299 ms on the graphics card.</p>
        <p>From the results of this experiment we can conclude that the number of
detected objects does not affect the detection speed: the fastest and the slowest
processed images contain the same number of objects, and the detection times for the
images with the most and the fewest detected objects are almost identical.</p>
        <p>[Tables 3 and 4 report, for the processor and the graphics card respectively,
the detection time and the number of detected objects for the fastest detection
(20 objects), the slowest detection (20 objects), the image with the most objects
(44), the image with the fewest objects (2), and the series average (13.23 objects
detected), at 65.864 BFLOPs.]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Future Research</title>
      <p>In the future, we would like to use the YOLO detector to process huge numbers
of images for the training phase of automated domain-specific dataset generation.
Based on our results, we will target processing on a graphics card; the card we used
achieved 5 FPS.</p>
      <p>Future research will also aim to design a completely new methodology for the
automated generation of domain-specific datasets. We expect the method to be of
great importance in reducing the time cost of creating new datasets, especially in
the labeling phase, where each object in an image must be precisely enclosed in a
bounding box. Nowadays this task is done by hand, and our approach should completely
remove human intervention from the labeling process. The method could also be
applied to real-time detection and many other tasks, such as determining specific
species of a certain kind, or in education, to learn specific objects in the same
way that children learn from their very first moments of life.</p>
      <p>
        Finally, we would like to point out that creating such datasets is a serious
problem, since labeling and locating the objects in images is mostly manual work.
Our method would help researchers in many different areas achieve significantly
better results because, as noted in these papers [
        <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
        ], their datasets are often very limited, which can affect the accuracy of the
results.
      </p>
      <p>With our approach to the automated generation of domain-specific datasets, we
could train neural networks on specific environments, which would significantly help
determine not only the class of an object but also its kinds and subclasses: e.g. a
detected flower could be identified more precisely as a forget-me-not, or a detected
tree as a baobab.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgement</title>
      <p>This work was supported by the Faculty of Electrical Engineering and Informatics,
Technical University of Košice under the contract No. FEI-2018-59: Semantic
Machine of Source-Oriented Transparent Intensional Logic.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Madoš</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ádám</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hurtuk</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Čopjak</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Brain-computer interface and Arduino microcontroller family software interconnection solution</article-title>
          .
          <source>In: Proc. of the IEEE 14th International Symposium on Applied Machine Intelligence and Informatics</source>
          (
          <year>2016</year>
          ), pp.
          <fpage>217</fpage>
          -
          <lpage>221</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Vokorokos</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Danková</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ádám</surname>
          </string-name>
          , N.:
          <article-title>Task scheduling in distributed system for photorealistic rendering</article-title>
          .
          <source>In: Proc. of the IEEE 8th International Symposium on Applied Machine Intelligence and Informatics</source>
          (
          <year>2010</year>
          ), pp.
          <fpage>43</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Rosenblatt</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain</article-title>
          . In: Psychological Review, USA,
          <year>1958</year>
          , vol.
          <volume>65</volume>
          ,
          <issue>iss</issue>
          . 6, pp.
          <fpage>386</fpage>
          -
          <lpage>407</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ivakhnenko</surname>
            ,
            <given-names>G. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapa</surname>
            ,
            <given-names>G. V.</given-names>
          </string-name>
          :
          <article-title>Cybernetic predicting devices</article-title>
          .
          <source>USA. CCM Information Corp</source>
          ,
          <year>1965</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Werbos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Beyond regression: new tools for prediction and analysis in the behavioral sciences</article-title>
          .
          <year>1974</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hardt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Price</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srebro</surname>
          </string-name>
          , N.:
          <article-title>Equality of Opportunity in Supervised Learning</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          (
          <year>2016</year>
          ), vol.
          <volume>29</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zengya</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Back propagation neural network with adaptive differential evolution algorithm for time series forecasting</article-title>
          .
          <source>In: Expert Systems with Applications</source>
          .
          <year>2015</year>
          , vol.
          <volume>42</volume>
          ,
          <issue>iss</issue>
          . 2, pp.
          <fpage>855</fpage>
          -
          <lpage>863</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simard</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frasconi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Learning long-term dependencies with gradient descent is difficult</article-title>
          .
          <source>In: IEEE Transactions on Neural Networks</source>
          (
          <year>1994</year>
          ), vol.
          <volume>5</volume>
          ,
          <issue>iss</issue>
          . 2, pp.
          <fpage>157</fpage>
          -
          <lpage>166</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Metz</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chintala</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks</article-title>
          .
          <source>In: International Conference on Learning Representations (ICLR)</source>
          .
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Wiering</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Otterlo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          : Reinforcement Learning.
          <source>2012. ISBN 978-3-642-27645-3.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Chovanec</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chovancová</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dufala</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>DIDS based on hybrid detection</article-title>
          .
          <source>In: IEEE International Conference on Emerging eLearning Technologies and Applications</source>
          (ICETA), Slovakia, pp.
          <fpage>79</fpage>
          -
          <lpage>83</lpage>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Vokorokos</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pekár</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ádám</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daranyi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Yet Another Attempt in User Authentication</article-title>
          .
          <year>2013</year>
          , vol.
          <volume>10</volume>
          ,
          <issue>iss</issue>
          . 3, pp
          <fpage>37</fpage>
          -
          <lpage>50</lpage>
          . Acta Polytechnica Hungarica.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Hurtuk</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baláž</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ádám</surname>
          </string-name>
          , N.:
          <article-title>Security sandbox based on RBAC model</article-title>
          .
          <source>In: Proc. of the 11th International Symposium on Applied Computational Intelligence and Informatics</source>
          (
          <year>2016</year>
          ). pp.
          <fpage>75</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
          </string-name>
          , G.:
          <article-title>ImageNet Classification with Deep Convolutional Neural Networks</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          (
          <year>2012</year>
          ), vol.
          <volume>25</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lecun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
          </string-name>
          , G.:
          <article-title>Deep learning</article-title>
          .
          <source>In: Nature</source>
          .
          <year>2015</year>
          , pp.
          <fpage>436</fpage>
          -
          <lpage>444</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Redmon</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Divvala</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girshick</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farhadi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>You Only Look Once: Unified, Real-Time Object Detection</article-title>
          .
          <source>In: IEEE Conference on Computer Vision and Pattern Recognition</source>
          (
          <year>2016</year>
          ).
          <source>IEEE Xplore</source>
          , pp.
          <fpage>779</fpage>
          -
          <lpage>788</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Redmon</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farhadi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>YOLOv3: An Incremental Improvement</article-title>
          .
          <source>Tech Report. arXiv:1804.02767</source>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>T.-Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maire</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Belongie</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bourdev</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girshick</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hays</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perona</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramanan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitnick</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dollár</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Microsoft COCO: Common Objects in Context</article-title>
          .
          <source>In: European Conference on Computer Vision (ECCV)</source>
          , pp.
          <fpage>740</fpage>
          -
          <lpage>755</lpage>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Garcia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barbedo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Impact of dataset size and variety on the effectiveness of deep learning and transfer learning for plant disease classification</article-title>
          .
          <source>In: Computers and Electronics in Agriculture</source>
          , vol.
          <volume>153</volume>
          , pp.
          <fpage>46</fpage>
          -
          <lpage>53</lpage>
          . Elsevier.
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Vokorokos</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ennert</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Čajkovský</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radušovský</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A Survey of parallel intrusion detection on graphical processors</article-title>
          .
          <source>In: Central European Journal of Computer Science</source>
          , vol.
          <volume>4</volume>
          ,
          <issue>iss</issue>
          . 4, pp.
          <fpage>222</fpage>
          -
          <lpage>230</lpage>
          . Open Computer Science.
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>