<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Vision Inspection with Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Arnaud Nguembang Fadja</string-name>
          <email>arnaud.nguembafadja@unife.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evelina Lamma</string-name>
          <email>evelina.lamma@unife.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabrizio Riguzzi</string-name>
          <email>fabrizio.riguzzi@unife.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Ingegneria</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Matematica e Informatica</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Ferrara</institution>
          <addr-line>Via Saragat 1, I-44122, Ferrara</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This work describes a system for extracting and classifying defects inside bottles for cosmetic and pharmaceutical use. The system integrates various defect identification and automatic classification algorithms based on neural networks (NN). The aim is to identify defective bottles at the end of the production chain. From a set of 60 bottles, 3600 images were taken, 60 for each bottle. We extracted 4161 defects, of which 70% were used for training and 30% for testing the neural network. We considered five defect classes (rubber, aluminum, glass, hair and tissue) and obtained an accuracy of more than 90% on the test set.</p>
      </abstract>
      <kwd-group>
        <kwd>Vision Inspection</kwd>
        <kwd>Computer Vision</kwd>
        <kwd>Neural Networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Computer vision (CV) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] aims at understanding information in images, for
example by extracting patterns. In manufacturing, CV plays a major role in various
applications of measurement, location, identification and inspection (or control)
thanks to increasingly sophisticated automatic classification and real-time vision
algorithms. The industrial problems faced by CV are heterogeneous. In
measurement applications, the aim is to measure physical features of the objects.
Examples of features are diameter, area, volume and height. In detection applications,
the purpose is to locate the object in an area by reporting its position and its
orientation, for example its center of gravity or corners, and then to send this information to a
robot for picking up the object. In identification applications, the vision system
reads various codes and alphanumeric characters, for example barcodes, machine
plates and matrix codes. In inspection or control applications, the aim is to
validate certain features, for example the presence or absence of a correct label on
a bottle or the classification of defects on the surface or inside a product.
      </p>
      <p>Automated visual inspection of industrial products for quality control plays
an important role in production processes. Manufacturing firms try to automate
the production process as much as possible in order to decrease the production
cost. However, at the end of the production chain, products can be affected by
many defects due to:
1. Degradation of machines used in the production chain,
2. Degradation of machines not directly involved in the production chain,
3. Contamination from the environment.</p>
      <p>Therefore, at the end of the production chain, products have to be inspected in
order to guarantee good quality. In most cases, this quality inspection is carried
out visually by humans. In the cosmetic and pharmaceutical industry,
each product is inspected by many employees, and the product is considered to contain a certain
defect if the majority of them detect it. However, the reliability of manual inspection
is limited by fatigue and inattentiveness. Therefore, firms are working
hard towards the automation of the visual inspection process in order to obtain
high-quality products with high-speed production. In this paper we present a
vision system for inspecting products for pharmaceutical and cosmetic use. The
system identifies and classifies defects inside bottles. These defects can be of
various types: rubber, aluminum or glass particles, or hair and tissue fibers. Figure 1
shows examples of defects in water bottles.</p>
      <p>Computer Vision is the process that aims at creating an approximate model
of the real world (in 3D) from two-dimensional (2D) images. CV can replace or
complement manual inspections and measurements with digital cameras and
image processing. The technology is used in a variety of different industries to
automate inspection, increase production speed and yield, and improve product
quality. The main purpose of CV is to reproduce human vision in analyzing and
understanding the content of the acquired image. Information is understood in
this case as something that permits a classification, for example recognizing an
object in an image.</p>
      <p>Unlike CV, Image Processing (IP) receives an image as input and outputs a
processed image, for example with modified contrast or brightness, rather
than an understanding of the image content.</p>
      <p>CV uses some techniques from IP, such as filters, before applying proper CV
techniques. The different steps of CV can be summarized as follows.</p>
      <p>1. Acquiring the image. In CV systems, acquiring or grabbing images of a scene is the first and one
of the most important steps. Good digital cameras, lenses and lighting are
needed to obtain high-quality digital images. The type (black and white,
gray-level or color) and the quality of the image are important decisions to
make during acquisition. While black and white images take less space and
processing time, color images require more space and processing time. Gray-level
images often provide a fair compromise. The choice of the type of image is based
on various criteria, such as performance and the information content to be extracted.</p>
      <p>2. Processing the images (IP). The second step in a CV system is to improve the quality of images through
IP. Usually, we are interested only in specific regions of the image, called
Regions of Interest (ROI). A ROI contains the information to analyse. For
example, if we aim at identifying particles in bottles, after acquiring the
whole image of the bottle, the ROI is the part of the image that contains
particles. We then use image processing in order to increase its quality. In
this step we can change the contrast, the brightness or the rotation.</p>
      <p>3. Extracting information from the image. In this step the system analyses the ROI. For example, if the ROI
contains different particles, each can be extracted using a segmentation process.
Segmentation is performed by selecting the pixels of the relevant object (the
foreground). In particle identification, for example, particles can be highlighted
in the ROI by extracting the brighter pixels. After segmentation, the system
extracts features of the ROI. These features, which form a feature vector, depend on
the application and are then used for classification or identification purposes.</p>
      <p>4. Taking action. In the last step we use different techniques, such as deep learning (see <ext-link ext-link-type="uri" xlink:href="http://cs231n.github.io/convolutional-networks/">http://cs231n.github.io/convolutional-networks/</ext-link>), for
classifying or clustering the ROI. A decision can then be taken, for example
deciding whether the region contains a cat or a dog.</p>
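      <p>As a rough illustration of the extraction step, the following sketch segments the brighter pixels of a gray-level ROI with a fixed threshold and computes two simple features. The threshold value, the toy image and the function names are assumptions made for this example; they are not taken from the system described in this paper.</p>
      <preformat>
```python
import numpy as np

def segment_bright(roi: np.ndarray, threshold: int) -> np.ndarray:
    """Boolean foreground mask: True where a pixel is brighter than threshold."""
    return roi > threshold

def bounding_box(mask: np.ndarray):
    """Smallest rectangle (row0, col0, row1, col1) containing the mask, or None if empty."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

# Hypothetical 6x6 gray-level ROI with one bright 2x3 "particle".
roi = np.full((6, 6), 20, dtype=np.uint8)
roi[2:4, 1:4] = 200

mask = segment_bright(roi, threshold=128)
box = bounding_box(mask)
# A tiny feature vector: area in pixels and mean gray level of the foreground.
features = {"area": int(mask.sum()), "mean_gray": float(roi[mask].mean())}
```
      </preformat>
      <p>A real ROI may contain several particles, in which case the foreground mask would first have to be split into connected regions before features are computed for each of them.</p>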
    </sec>
    <sec id="sec-2">
      <title>Neural Networks</title>
      <p>
        Machine learning [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] is the ability to extract knowledge from data. For years,
constructing a pattern-recognition [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] or a machine learning system was a
difficult and complicated task. Researchers had to design feature extractors that
transformed the raw data into an internal representation, a feature vector, to
which a learning subsystem, for example a classifier, could be applied. Rather
than building two subsystems, raw data can be fed to a single system that is
able to discover the representation needed for detection or classification. Deep
learning [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] uses this technique and feeds raw data to a sequence of layers.
Each layer is composed of different non-linear elements. The elements of a
layer provide the input of the next layer, and learning consists of training from
raw data.
      </p>
      <p>
        An Artificial Neural Network (ANN) [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is an information processing paradigm
that is inspired by the way a biological nervous system, such as the brain, processes
information. It is composed of a large number of highly interconnected processing
elements (neurons) working together to solve specific problems. It can be used for
classification or regression. For classification, the network is organized
as follows:
1. The input layer is composed of one non-processing neuron for each input
feature.
2. One or more hidden layers follow. Each layer has one or more processing neurons.
3. The output layer is composed of processing neurons, one for each
class.
4. The output of each neuron in a layer is connected to the input of all the
neurons in the next layer (in fully connected networks). Each connection is
associated with a weight.
      </p>
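      <p>The layered organization listed above can be sketched as a forward pass through a small fully connected network. The layer sizes, the random weights and the choice of a logistic activation for every processing neuron are assumptions made only for illustration.</p>
      <preformat>
```python
import numpy as np

def sigmoid(z):
    """Logistic activation, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Propagate an input vector through fully connected layers, one after another."""
    a = x  # the input layer does no processing
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)  # every neuron sees all outputs of the previous layer
    return a

rng = np.random.default_rng(0)
sizes = [4, 3, 2]  # hypothetical: 4 input features, 3 hidden neurons, 2 classes
weights = [rng.normal(size=(n_out, n_in)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

x = np.array([0.5, -1.0, 2.0, 0.0])
scores = forward(x, weights, biases)      # one score per class
predicted_class = int(np.argmax(scores))  # class of the most active output neuron
```
      </preformat>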
      <p>In classification, the aim of training is to modify the weights in order to
improve the classification of the training set. The intuition is that if the network
can correctly classify all the examples of the training set, or most of them,
it will be able to classify unseen data well. This means that it has acquired
generalisation capacity. Depending on the information flow among the different
layers, we have two topologies of ANN.</p>
      <p>
        Feedforward ANN (FANN) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]: this is the simplest type of ANN, in which
information moves in only one direction, forward: from the input nodes, data
goes through the hidden nodes (if any) to the output nodes, see Figure 2.
      </p>
      <p>
        Recurrent ANN (RANN) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]: while a feedforward network propagates data
from input to output, RANNs also propagate data from later processing stages
to earlier stages.
      </p>
      <p>
        The most used FANN is the MultiLayer Perceptron (MLP) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The nodes
of MLP neural networks are perceptrons [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. A perceptron is a single neuron,
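having n inputs and one output. It is composed of:
1. n inputs, <inline-formula><tex-math>x_1, \dots, x_n</tex-math></inline-formula>.
2. A weight for each input, <inline-formula><tex-math>w_1, \dots, w_n</tex-math></inline-formula>.
3. A dummy input <inline-formula><tex-math>x_0</tex-math></inline-formula> (constant and equal to 1) and its associated weight <inline-formula><tex-math>w_0</tex-math></inline-formula>, the bias.
4. The weighted sum of the inputs:
        <disp-formula><tex-math><![CDATA[net = \sum_{i=0}^{n} w_i x_i]]></tex-math></disp-formula>
5. The transfer (or activation) function, which computes the value of the output signal
from the weighted sum:
        <disp-formula><tex-math><![CDATA[y = T(net), \qquad T(net) = \begin{cases} -1, & \text{if } net < 0, \\ +1, & \text{if } net \geq 0. \end{cases}]]></tex-math></disp-formula>
      </p>
      <p>In order to introduce non-linearity in our system, we consider the following
activation function (called the sigmoid function):
        <disp-formula><tex-math><![CDATA[T(net) = \sigma(net) = \frac{1}{1 + e^{-net}}]]></tex-math></disp-formula>
      </p>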
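      <p>A minimal sketch of a single perceptron may help to fix the notation. It computes the weighted sum with the dummy input prepended and applies either a step function or the sigmoid. The weights are hypothetical values chosen so that the step output reproduces a logical AND, and the step function is assumed to switch at a weighted sum of zero.</p>
      <preformat>
```python
import math

def net_input(x, w):
    """Weighted sum over i = 0..n, where the dummy input x0 = 1 is prepended to x."""
    return sum(wi * xi for wi, xi in zip(w, [1.0] + list(x)))

def step(net):
    """Threshold transfer function: +1 for a non-negative weighted sum, -1 otherwise."""
    return 1 if net >= 0 else -1

def sigmoid(net):
    """Smooth transfer function with values between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-net))

# Hypothetical weights (w0, w1, w2); w0 is the weight of the dummy input.
w = [-1.5, 1.0, 1.0]

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    net = net_input(x, w)
    print(x, step(net), round(sigmoid(net), 3))
```
      </preformat>
      <p>Only the input (1, 1) yields a positive weighted sum, so it is the only case for which the step function returns +1. The sigmoid gives a graded value for the same weighted sums, which is what makes gradient-based training possible.</p>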
      <p>To train an MLP, data is usually divided into two sets: the training set and the test
set. The training set is used to train the system and the test set is used to test its
ability to generalize to unseen data. Depending on whether the data is labeled
or not, we can distinguish supervised learning, in which input objects (typically
vectors) are labeled with a desired output value, and unsupervised learning, which
discovers hidden structures in unlabeled data.</p>
      <p>
        An MLP can be trained using techniques such as backpropagation [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. To
update the weights, errors are propagated back from the output layer to the input layer
in order to minimize the error of the output. Backpropagation applies gradient
descent [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to reach a local minimum in the space of parameters. Various
techniques, such as learning rate adaptation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and momentum [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], can be used to avoid bad
local minima.
      </p>
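      <p>The following sketch shows one possible form of backpropagation with gradient descent and momentum for an MLP with one hidden layer. The toy data (a logical OR), the mean squared error, the layer sizes, the learning rate and the momentum value are all assumptions for illustration and do not describe the training procedure of any particular library.</p>
      <preformat>
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: logical OR of two inputs.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [1.]])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.5, size=(2, 3)), np.zeros(3)  # input to hidden
W2, b2 = rng.normal(scale=0.5, size=(3, 1)), np.zeros(1)  # hidden to output
params = [W1, b1, W2, b2]
velocity = [np.zeros_like(p) for p in params]
learning_rate, momentum = 0.1, 0.9  # illustrative values
losses = []

for _ in range(2000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    # Backward pass: the output error is propagated back to the hidden layer.
    d_out = 2.0 * (out - y) / len(X) * out * (1.0 - out)
    d_h = (d_out @ W2.T) * h * (1.0 - h)
    grads = [X.T @ d_h, d_h.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
    # Gradient descent step with momentum; arrays are updated in place.
    for p, v, g in zip(params, velocity, grads):
        v *= momentum
        v -= learning_rate * g
        p += v
```
      </preformat>
      <p>The velocity term accumulates past gradients, so the update keeps moving through flat regions and shallow local minima where plain gradient descent would slow down.</p>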
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        Several systems are related to ours. The ImageNet classifier of [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] uses a Convolutional Neural
Network (CNN, see <ext-link ext-link-type="uri" xlink:href="http://cs231n.github.io/convolutional-networks/">http://cs231n.github.io/convolutional-networks/</ext-link>) for training and classification. CNNs are neural networks designed
to process data that comes in the form of arrays, such as 1D arrays for signals, 2D arrays
for images and 3D arrays for videos. They have two main types of layers: convolutional
layers and fully connected layers. Convolutional layers extract conjunctions of
features from the previous layer and the fully connected layers classify them.
In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] a deep convolutional neural network is used to classify the 1.2 million
high-resolution images in the LSVRC-2010 ImageNet training set into 1000 different
classes.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the authors apply a CNN to the MNIST dataset, a database of handwritten
digits with a training set of 60,000 examples and a test set of 10,000 examples.
      </p>
      <p>
        Recurrent neural networks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] can also be applied to image classification.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>We used Halcon 12 (<ext-link ext-link-type="uri" xlink:href="http://www.mvtec.com/products/halcon/product-information/version12/">http://www.mvtec.com/products/halcon/product-information/version12/</ext-link>) for acquiring and processing images and for identifying ROIs. It also
provides procedures and functions for training and testing MLPs with one hidden
layer. The system is composed of different subsystems, described in the following
subsections.</p>
      <sec id="sec-4-1">
        <title>Acquisition and Defects Identification</title>
        <p>This subsystem extracts defects from images of the bottles. Images are acquired
with a mechanical device that rotates the bottle around its vertical axis in
front of a digital camera. When the rotation stops, the liquid inside the bottle
continues to rotate and the camera acquires a set of 20 successive images, see
Figure 3. This procedure is repeated 3 times for the same bottle, acquiring a
total of 60 images per bottle. We acquired images from a set of 60 bottles with
the following distribution of particles: 10 bottles contain rubber particles, 10
aluminum particles, 20 glass particles, 10 hair fibers and 10 tissue fibers. Overall
we collected 3600 images, divided into 180 sets of 20 images.</p>
        <p>In each image in Figure 3 we are interested in the part showing the liquid: the
region under the meniscus. This region is called the Region of Interest (ROI), see Figure 4. We
extracted 3600 ROIs overall.</p>
        <p>In order to identify defects, the ROIs in the same set of 20 are compared in
pairs, each ROI with the next one, and the differences, in terms of pixels, represent defects
that are moving in the bottle. The minimum rectangle that contains each defect
is extracted and saved as a small image, see Figure 1. From the 3600 ROIs
we extracted 4161 defect images, distributed as shown in Table 1.</p>
        <p>The next subsystem extracts the features of each defect image. We considered the 121
default features provided by Halcon. These features are divided into three groups:
region, gray-level and contour features. The 63 region features describe characteristics
of the region such as its area (the number of pixels), its circularity (how closely the
region resembles a circle) and its position in the image. The 19 gray-level
features describe gray-value properties of the region, such as the minimum or maximum pixel value
and the standard deviation. The 39 contour features describe the contour, such
as maximum diameter, orientation, compactness and convexity.</p>
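        <p>The pairwise comparison of consecutive ROIs can be approximated by simple frame differencing, as in the sketch below. The difference threshold, the toy images and the function names are assumptions for illustration; the sketch does not use the Halcon operators employed in the actual system.</p>
        <preformat>
```python
import numpy as np

def moving_pixels(roi_a, roi_b, min_diff=30):
    """Mask of pixels whose gray level changes by more than min_diff between two ROIs."""
    # Signed 16-bit arithmetic avoids wrap-around when subtracting 8-bit images.
    diff = np.abs(roi_a.astype(np.int16) - roi_b.astype(np.int16))
    return diff > min_diff

def crop_to_mask(image, mask):
    """Crop image to the smallest rectangle containing the mask (None if the mask is empty)."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Two hypothetical 8x8 ROIs: a bright particle moves one pixel to the right.
first = np.full((8, 8), 10, dtype=np.uint8)
second = first.copy()
first[3:5, 2:4] = 220
second[3:5, 3:5] = 220

mask = moving_pixels(first, second)
defect = crop_to_mask(second, mask)  # small image around the moving defect
```
        </preformat>
        <p>Static structures, such as marks on the glass, cancel out in the difference, which is why only objects moving with the liquid are picked up. Several defects in the same ROI would require a connected-component step before cropping.</p>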
        <p>
          From this set of features, we removed 8 features that we deemed not useful,
for example the position (row and column) and orientation of the region and of its
contour in the small rectangular image. After the extraction of the features, to
speed up training as well as classification, we used Principal Component
Analysis (PCA) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to preprocess the feature vectors.
        </p>
        <p>
We built an MLP with the parameters in Table 2. The parameter NumInput
specifies the dimensionality of the feature vectors. NumHidden defines the number
of units of the hidden layer of the MLP. Since it significantly influences the
result of the classification, it should be adjusted very carefully. Its value should be
between NumInput and NumOutput. Smaller values lead to a less complex
separating hyperplane, while larger values run the risk of overfitting. NumOutput specifies
the number of classes. The parameter OutputFunction determines the function
used by the output units. In almost all classification applications,
OutputFunction should be set to softmax. Preprocessing defines the type of preprocessing
applied to the feature vectors for training as well as for testing. The parameter
NumComponents defines the number of components to which the feature vector
is reduced if preprocessing is applied. In particular, NumComponents has to
be adjusted only if PCA is performed. RandSeed is the seed for the random
number generator.
        </p>
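        <p>To make the role of PCA preprocessing concrete, the sketch below reduces 113-dimensional feature vectors (121 features minus the 8 removed ones) to a smaller number of components. The random data and the choice of 10 components are purely illustrative assumptions, and the implementation via a singular value decomposition is only one possible way to compute the projection.</p>
        <preformat>
```python
import numpy as np

def pca_fit(features, num_components):
    """Return the mean and the leading principal directions of row-wise feature vectors."""
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:num_components]

def pca_transform(features, mean, components):
    """Project feature vectors onto the retained principal directions."""
    return (features - mean) @ components.T

rng = np.random.default_rng(0)
train = rng.normal(size=(50, 113))  # 50 hypothetical defects with 113 features each
mean, components = pca_fit(train, num_components=10)  # 10 is an illustrative value
reduced = pca_transform(train, mean, components)      # shape (50, 10)
```
        </preformat>
        <p>The mean and the directions estimated on the training set would be reused unchanged for the test set, so that both sets are projected in the same way.</p>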
        <p>Table 3 shows the training and classification parameters. The training
parameters MaxIterations, WeightTolerance and ErrorTolerance control the nonlinear
optimization algorithm. MaxIterations specifies the number of iterations of the
optimization algorithm. The optimization terminates if the weight change is
smaller than WeightTolerance and the change of the error is smaller than
ErrorTolerance. In any case, the optimization terminates after at most MaxIterations
iterations.</p>
        <table-wrap id="tab3">
          <label>Table 3</label>
          <caption>
            <p>Training parameters</p>
          </caption>
          <table>
            <thead>
              <tr><th>Parameter</th><th>Value</th></tr>
            </thead>
            <tbody>
              <tr><td>MaxIterations</td><td>400</td></tr>
              <tr><td>WeightTolerance</td><td>0.0001</td></tr>
              <tr><td>ErrorTolerance</td><td>0.01</td></tr>
            </tbody>
          </table>
        </table-wrap>
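        <p>The termination rule described above can be written as a small predicate. The sketch below uses the values of Table 3 as defaults and applies the rule to a toy one-dimensional minimization. Whether the library measures absolute or relative changes is not stated in the text, so absolute changes are assumed here, and the toy objective and learning rate are hypothetical.</p>
        <preformat>
```python
def should_stop(iteration, weight_change, error_change,
                max_iterations=400, weight_tolerance=0.0001, error_tolerance=0.01):
    """Stop when both changes are below their tolerances, or when the iteration budget is used up."""
    converged = weight_tolerance > weight_change and error_tolerance > error_change
    return converged or iteration >= max_iterations

# Toy problem (hypothetical): minimize (w - 3)^2 by gradient descent.
w, learning_rate = 0.0, 0.1
error = (w - 3.0) ** 2
iteration = 0
while True:
    iteration += 1
    new_w = w - learning_rate * 2.0 * (w - 3.0)
    new_error = (new_w - 3.0) ** 2
    weight_change = abs(new_w - w)
    error_change = abs(error - new_error)
    w, error = new_w, new_error
    if should_stop(iteration, weight_change, error_change):
        break
```
        </preformat>
        <p>On this toy problem both tolerances are met long before the iteration limit, so the limit acts only as a safeguard.</p>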
        <p>According to the number of classes (2, 3 or 5), we implemented 3 architectures,
one for each experiment.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Results</title>
        <p>We divided the data into two sets: the training set with 70% of the examples
(2909) and the test set with 30% of the examples (1252).</p>
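        <p>A split with these sizes can be sketched as follows. The text does not say how the examples were assigned to the two sets, so a random shuffle with a fixed seed is assumed here purely for illustration.</p>
        <preformat>
```python
import numpy as np

def split_indices(num_examples, num_train, seed=42):
    """Shuffle the example indices and cut them into a training and a test part."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_examples)
    return order[:num_train], order[num_train:]

# Sizes reported in the text: 4161 defects, 2909 for training, 1252 for testing.
train_idx, test_idx = split_indices(4161, 2909)
```
        </preformat>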
        <p>In the first experiment we considered two classes: particles (rubber,
aluminum, glass) and fibers (hair, tissue). We obtained 99% precision for particles
and 93% precision for fibers, for a total accuracy of 97%. The lower precision on
fibers is due to the fact that small fibers are sometimes classified as particles.
In the second experiment we considered three classes: black particles (rubber,
aluminum), light particles (glass) and fibers (hair, tissue), and we obtained 95%
precision for black particles, 91% precision for light particles and 94% precision
for fibers, for a total accuracy of 93%. We then considered five classes (rubber,
aluminum, glass, hair, tissue) in the last experiment. We obtained 89% precision
for rubber, 86% precision for aluminum, 91% precision for glass, 85% precision
for hair and 97% precision for tissue, for a total accuracy of 91%.</p>
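        <p>For reference, per-class precision and total accuracy can be computed from a confusion matrix as in the sketch below. The matrix is a made-up two-class example and does not reproduce the data of the experiments above.</p>
        <preformat>
```python
import numpy as np

def precision_per_class(confusion):
    """confusion[i, j] counts examples of true class i that were predicted as class j."""
    predicted_totals = confusion.sum(axis=0)  # how often each class was predicted
    return np.diag(confusion) / predicted_totals

def accuracy(confusion):
    """Fraction of all examples that lie on the diagonal (correct predictions)."""
    return np.trace(confusion) / confusion.sum()

# Hypothetical confusion matrix for two classes (particles, fibers).
confusion = np.array([[90, 10],
                      [5, 45]])

precisions = precision_per_class(confusion)
total_accuracy = accuracy(confusion)
```
        </preformat>
        <p>Precision is computed column-wise, over everything predicted as a class, so a class can have high precision even when some of its true examples are assigned elsewhere.</p>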
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>We have presented a system for inspecting defects in products for pharmaceutical
and cosmetic use. The algorithm identifies and classifies defects using an MLP
architecture with one hidden layer. Experiments show that the MLP can achieve
good results. We obtained an accuracy of more than 90% on five different classes
(rubber, aluminum, glass, hair, tissue). In the future we plan to add other defect
classes, such as plastic particles or bubbles, and to perform experiments with deep
architectures such as convolutional neural networks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baldi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Gradient descent learning algorithm overview: A general dynamical systems perspective</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          <volume>6</volume>
          (
          <issue>1</issue>
          ),
          <fpage>182</fpage>
          –
          <lpage>195</lpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bebis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgiopoulos</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Feed-forward neural networks</article-title>
          .
          <source>IEEE Potentials</source>
          <volume>13</volume>
          (
          <issue>4</issue>
          ),
          <fpage>27</fpage>
          –
          <lpage>31</lpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Everitt</surname>
            ,
            <given-names>B.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunn</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Principal components analysis</article-title>
          .
          <source>Applied Multivariate Data Analysis, Second Edition</source>
          pp.
          <fpage>48</fpage>
          –
          <lpage>73</lpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Flusser</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suk</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Pattern recognition by affine moment invariants</article-title>
          .
          <source>Pattern Recognition</source>
          <volume>26</volume>
          (
          <issue>1</issue>
          ),
          <fpage>167</fpage>
          –
          <lpage>174</lpage>
          (
          <year>1993</year>
          ),
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/0031-3203(93)90098-H">https://doi.org/10.1016/0031-3203(93)90098-H</ext-link>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Grossberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Recurrent neural networks</article-title>
          .
          <source>Scholarpedia</source>
          <volume>8</volume>
          (
          <issue>2</issue>
          ),
          <elocation-id>1888</elocation-id>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Haykin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Network</surname>
            ,
            <given-names>N.:</given-names>
          </string-name>
          <article-title>A comprehensive foundation</article-title>
          .
          <source>Neural Networks</source>
          <volume>2</volume>
          (
          <year>2004</year>
          ),
          <volume>41</volume>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jacobs</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          :
          <article-title>Increased rates of convergence through learning rate adaptation</article-title>
          .
          <source>Neural Networks</source>
          <volume>1</volume>
          (
          <issue>4</issue>
          ),
          <fpage>295</fpage>
          –
          <lpage>307</lpage>
          (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>ImageNet classification with deep convolutional neural networks</article-title>
          .
          In:
          <source>Advances in Neural Information Processing Systems</source>
          . pp.
          <fpage>1097</fpage>
          –
          <lpage>1105</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
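          <string-name>
            <surname>LeCun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>The MNIST database of handwritten digits</article-title>
          .
          <ext-link ext-link-type="uri" xlink:href="http://yann.lecun.com/exdb/mnist/">http://yann.lecun.com/exdb/mnist/</ext-link>
          (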
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>LeCun</surname>
          </string-name>
          , Y.,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
          </string-name>
          , G.:
          <article-title>Deep learning</article-title>
          .
          <source>Nature</source>
          <volume>521</volume>
          (
          <issue>7553</issue>
          ),
          <fpage>436</fpage>
          –
          <lpage>444</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karafiát</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burget</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cernocky</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khudanpur</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Recurrent neural network based language model</article-title>
          .
          <source>In: Interspeech</source>
          . vol.
          <volume>2</volume>
          , p.
          <volume>3</volume>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>T.M.</given-names>
          </string-name>
          :
          <article-title>Machine learning</article-title>
          .
          <source>McGraw Hill series in computer science</source>
          ,
          <source>McGraw-Hill</source>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Phansalkar</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Analysis of the back-propagation algorithm with momentum</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <fpage>505</fpage>
          –
          <lpage>506</lpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Plagianakos</surname>
            ,
            <given-names>V.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magoulas</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vrahatis</surname>
            ,
            <given-names>M.N.</given-names>
          </string-name>
          :
          <article-title>Deterministic nonmonotone strategies for effective training of multilayer perceptrons</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          <volume>13</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1268</fpage>
          –
          <lpage>1284</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rosenblatt</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>The perceptron: A probabilistic model for information storage and organization in the brain</article-title>
          .
          <source>Psychological Review</source>
          <volume>65</volume>
          (
          <issue>6</issue>
          ),
          <fpage>386</fpage>
          (
          <year>1958</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Umbaugh</surname>
            ,
            <given-names>S.E.</given-names>
          </string-name>
          :
          <article-title>Computer vision and image processing: A practical approach using CViptools with Cdrom</article-title>
          .
          <source>Prentice Hall PTR</source>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Yegnanarayana</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Artificial neural networks</article-title>
          .
          <source>PHI Learning Pvt. Ltd</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>