<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Multiple Classifier System for fast and accurate learning in Neural Network context</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>E. F. Romero</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R.M. Valdovinos</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R. Alejo</string-name>
          <email>ralejoll@hotmail.com</email>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>J. R. Marcial-Romero</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>J. A. Carrasco-Ochoa</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>3, Col. Ma. Isabel</institution>
          ,
          <addr-line>Valle de Chalco</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Centro Universitario Valle de Chalco</institution>
          ,
          <addr-line>Hermenegildo Galena</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Facultad de Ingeniería</institution>
          ,
          <addr-line>Ciudad Universitaria, Cerro de Coatepec s/n, Toluca</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Instituto Nacional de Astrofísica Óptica y Electrónica</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Tecnológico de Estudios Superiores de Jocotitlán</institution>
          ,
          <addr-line>Carretera Toluca-Atlacomulco km 44.8</addr-line>
          ,
          <institution>Col. Ejido de San Juan y San Agustín</institution>
          ,
          <addr-line>Jocotitlán</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Universidad Autónoma del Estado de Mexico</institution>
        </aff>
      </contrib-group>
      <fpage>50</fpage>
      <lpage>57</lpage>
      <abstract>
        <p>Nowadays, Multiple Classifier Systems (MCS) (also known as ensembles of classifiers, committees of learners and mixtures of experts) constitute a well-established research field in Pattern Recognition and Machine Learning. An MCS divides the whole problem with resampling methods, or uses different models constructed over a single data set. A similar approach is studied in the Neural Network context with the Modular Neural Network. The main difference between these approaches is the processing cost associated with the training step of the Modular Neural Network (in its classical form), since each module has to be trained with the whole data set. In this paper, we analyze the performance of a Modular Neural Network and of a Multiple Classifier System whose individual members are small Modular Neural Networks, in order to identify the convenience of each one. The experiments were carried out on datasets from real problems, showing the effectiveness of the Multiple Classifier System in terms of overall accuracy and processing time with respect to using a single Modular Neural Network.</p>
      </abstract>
      <kwd-group>
        <kwd>Artificial Neural Networks</kwd>
        <kwd>Modular Neural Networks</kwd>
        <kwd>Mixture of Experts</kwd>
        <kwd>Linear Perceptron</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Modular Neural Networks (MNN) represent a recent trend in Neural Network (NN)
architectural design. They are motivated by the highly modular nature of biological
networks and based on the “divide and conquer” approach [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The MNN bases its
structure on the idea of cooperative or competitive working, fragmenting the
problem into modules where each module handles part of the whole problem [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Some
advantages of this network with respect to other models are:
1. Learning speed. The number of iterations needed to train the individual
modules is smaller than the number of iterations needed to train a non-modular NN for
the same task [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
2. Data processing. An MNN is useful when working with different data sources
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], or when the data have been preprocessed with different techniques.
3. Knowledge distribution. In an MNN, the network modules tend to specialize by
learning from different regions of the input space [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and the modules can be
trained independently and in parallel.
      </p>
      <p>
        There exist several implementations of the MNN, although the most important
difference among them refers to the nature of the gating network. In some cases, this
corresponds to a single neuron evaluating the performance of the other expert
modules [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; others are based on a NN trained with a data set different from the one used
for training the expert networks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]; finally, some implementations train all modules, including the
integrator module, with the same dataset [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>On the other hand, the multiple classifier system (MCS) (also known as
ensemble of classifiers, committee of learners, etc.) is a set of individual classifiers
whose decisions are combined when classifying new patterns. There are several reasons for
combining multiple classifiers to solve a given learning problem. First, an MCS tries to
exploit the different local behavior of the individual classifiers to improve the
accuracy of the overall system. Second, in some cases an MCS might not be better than the
single best classifier, but it can diminish or eliminate the risk of picking an inadequate
single classifier. Finally, given the limited representational capability of learning
algorithms, it is possible that the classifier space considered for the problem does not
contain the optimal classifier.</p>
      <p>To ensure a high performance of the MCS, it is necessary to have enough diversity
in the individual decisions and an acceptable individual accuracy for each
member that constitutes the MCS.</p>
      <p>
        Some of the limitations that the MCS aims to overcome with respect to a single
classifier are [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]: the MCS takes advantage of the combined decision over the
individual classifier decisions; the correlated errors of the individual components can be
eliminated when the global decision is considered; the training patterns may not
provide enough information to select the best classifier; the learning algorithm may be
unsuitable to solve the problem; and, finally, the individual search space may not contain
the objective function.
      </p>
      <p>In this paper, a comparative study is presented that aims to show the advantages of
two methods, the MNN and the MCS, when used for classification tasks. In the first method
(MNN), each member corresponds to a linear perceptron, and in the MCS each
individual classifier corresponds to a single MNN; that is to say, the MCS is a neural
network ensemble made of MNNs.</p>
    </sec>
    <sec id="sec-2">
      <title>Modular Neural Network</title>
      <p>
        The MNN, also known as committee of systems, Hierarchical Mixture of Experts or Hybrid
System [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], bases its (modular) structure on the modularity of the human nervous
system, in which each brain region has a specific function but, in turn, the regions are
interconnected. Therefore, we can say that an ANN is modular if the computation
performed by the network can be decomposed into two or more modules or
subsystems that work independently on the whole problem or on part of it. Each module
corresponds to a feed-forward artificial neural network, and the modules can be considered as
neurons of the network as a whole.
      </p>
      <p>
        In its most basic implementation, all modules are of the same type [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], but
different schemes can be used. In the classical architecture, all modules, including the
gating module, have n input units, that is, the number of features in the sample. The
number of output neurons in the expert networks is equal to the number of classes c,
whereas in the gating network it is equal to the number of experts r [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (Fig. 1).
      </p>
      <p>In the learning process, the network minimizes the stochastic error function:</p>
      <p>$E = -\ln\left( \sum_{i=1}^{r} g_i \exp\left( -\tfrac{1}{2}\,\|s - Z_i\|^2 \right) \right)$ (eq. 1)</p>
      <p>where s is the desired output for the input x, Z_i is the output vector of the i-th
expert network, g_i is the output of the gating network for expert i, and u_i is the total
weighted input received by output neuron i of the gating network.</p>
      <p>Given an n-dimensional pattern x as input, the overall learning process of the MNN
considers the following steps:
1. Random initialization of the synaptic weights of the different networks with
small, uniformly distributed values. Henceforth, we will denote by w_{ji} the weights
of the expert networks and by w_{ti} those of the integrating (gating) network.
2. The pattern x is presented to each and every one of the networks (experts and
integrating network), so the output of expert network m is given by:</p>
      <p>$Z_i^m = x \ast w_{ji}^m$ (eq. 2)</p>
      <p>where x is the input vector and the superscript m indicates the module.
Similarly, the output of the gating network is obtained, with $u_i = x \ast w_{ti}$, by:</p>
      <p>$g_i = \frac{\exp(u_i)}{\sum_{j=1}^{r} \exp(u_j)}$ (eq. 3)</p>
      <p>3. Adjusting the weights of the expert networks and the gating network. To adjust
the weights, two update rules are applied.
For the expert networks: $w_{ji}^m(I+1) = w_{ji}^m(I) + \eta \, h_i \, (s - Z_i^m) \, x$ (eq. 4)
For the gating network: $w_{ti}(I+1) = w_{ti}(I) + \eta \, (h_i(I) - g_i(I)) \, x$ (eq. 5)
where
$h_i = \frac{g_i \exp\left( -\tfrac{1}{2}\,\|s - Z_i^m\|^2 \right)}{\sum_{j=1}^{r} g_j \exp\left( -\tfrac{1}{2}\,\|s - Z_j^m\|^2 \right)}$ (eq. 6)
4. Finally, the network combines the module outputs to obtain the final output of
the MNN by:
$Z = \sum_{i=1}^{r} g_i \ast Z_i$ (eq. 7)</p>
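      <p>The update rules above can be condensed into a short training loop. The following is a minimal sketch, assuming linear experts and a softmax gating network as described; the function and variable names (train_mnn, predict, W_exp, W_gate) are illustrative, not from the original work.</p>
      <preformat>
import numpy as np

def train_mnn(X, S, r, eta=0.01, epochs=100, rng=None):
    """Mixture-of-experts sketch: r linear experts plus a softmax gate.
    X: (N, n) input patterns; S: (N, c) desired outputs (one-hot)."""
    rng = np.random.default_rng(rng)
    n, c = X.shape[1], S.shape[1]
    W_exp = rng.uniform(-0.5, 0.5, (r, n, c))    # expert weights w_ji^m
    W_gate = rng.uniform(-0.5, 0.5, (n, r))      # gating weights w_ti
    for _ in range(epochs):
        for x, s in zip(X, S):
            Z = np.einsum('rnc,n->rc', W_exp, x)       # expert outputs (eq. 2)
            u = x @ W_gate
            g = np.exp(u - u.max()); g /= g.sum()      # gating outputs (eq. 3)
            e = np.exp(-0.5 * np.sum((s - Z) ** 2, axis=1))
            h = g * e / np.sum(g * e)                  # posteriors h_i (eq. 6)
            for i in range(r):
                W_exp[i] += eta * h[i] * np.outer(x, s - Z[i])  # eq. 4
            W_gate += eta * np.outer(x, h - g)                  # eq. 5
    return W_exp, W_gate

def predict(W_exp, W_gate, x):
    """Combined MNN output: gate-weighted sum of expert outputs (eq. 7)."""
    Z = np.einsum('rnc,n->rc', W_exp, x)
    u = x @ W_gate
    g = np.exp(u - u.max()); g /= g.sum()
    return g @ Z
      </preformat>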
    </sec>
    <sec id="sec-3">
      <title>Multiple Classifier System</title>
      <p>
        Let D = {D1,..., Dh} be a set of h classifiers. Each classifier Di (i = 1,..., h) gets as
input a feature vector x ∈ Rn and assigns it to one of the c problem classes. The
output of the MCS is an h-dimensional vector [D1(x), . . . , Dh(x)]T containing the
decisions of the h individual classifiers. After that, the individual decisions are combined
by some strategy [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] in order to obtain a final decision.
      </p>
      <p>
        The construction of an MCS is based on two aspects: the diversity of the individual
decisions and the accuracy of the single classifiers. The methods used to achieve
diversity can be grouped into five categories [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]: pattern manipulation, attribute
manipulation, class-label manipulation, use of different classification algorithms, and injection of randomness.
      </p>
      <p>
        To build the MCS, in this study we use subsamples obtained by pattern
manipulation, such that the resulting subsets have a size proportional to the number of
classifiers that integrate the MCS. Thus, in the experiments reported here the MCS
was integrated with 7 and 9 classifiers, following [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]; this means that each
subsample only includes one seventh or one ninth of the samples included in the original
training dataset.
      </p>
      <p>
        To obtain the subsamples, we use random selection without replacement of
patterns [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and Bagging [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In the first method, patterns are selected at random without
replacement, so a given pattern cannot be selected more
than once, thereby reducing pattern redundancy. On the other hand, Bagging
produces subsamples called bootstrap samples, where each subsample has the same size as
the original dataset. For each subsample obtained with Bagging, each pattern has a
probability of 1 - (1 - 1/m)^m of being selected at least once among the m draws,
that is to say, each pattern has approximately a 63% chance of appearing
in the subsample.
      </p>
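      <p>Both resampling methods can be sketched as follows; this is a minimal illustration assuming NumPy arrays, and the function names are our own, not from [12] or [3].</p>
      <preformat>
import numpy as np

def disjoint_subsamples(X, y, L, rng=None):
    """Random selection without replacement: split the m training patterns
    into L disjoint subsamples of size about m/L (no pattern repeated)."""
    rng = np.random.default_rng(rng)
    idx = rng.permutation(len(X))
    return [(X[p], y[p]) for p in np.array_split(idx, L)]

def bagging_subsamples(X, y, L, rng=None):
    """Bagging: L bootstrap samples, each of the same size m as the original
    set; a pattern appears in a given subsample with probability
    1 - (1 - 1/m)**m, about 63% for large m."""
    rng = np.random.default_rng(rng)
    m = len(X)
    return [(X[b], y[b]) for b in (rng.integers(0, m, size=m) for _ in range(L))]
      </preformat>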
      <p>Once the subsamples have been obtained with some resampling method, each one is
used to train a member of the MCS (Fig. 2). After that, two strategies have been proposed
in the literature for combining the individual classifier decisions: fusion and selection.
In classifier selection, each individual classifier is assumed to be an expert in a part of the feature
space and, correspondingly, only one classifier is selected to label the input vector. In
classifier fusion, each component is assumed to have knowledge of the whole feature
space and thus all individual classifiers are taken into account to decide the label for
the input vector.</p>
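      <p>As a small illustration of the fusion strategy, the following is a minimal sketch of simple majority voting, the combiner used later in our experiments; the function name majority_vote is illustrative.</p>
      <preformat>
from collections import Counter

def majority_vote(decisions):
    """Classifier fusion by simple majority voting: decisions is the vector
    [D1(x), ..., Dh(x)] of class labels assigned by the h individual
    classifiers to a pattern x; the most frequent label wins."""
    return Counter(decisions).most_common(1)[0][0]

# Example: majority_vote([2, 0, 2, 2, 1]) returns 2.
      </preformat>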
    </sec>
    <sec id="sec-4">
      <title>Experimental Results</title>
      <p>The results correspond to the experiments carried out over 12 real data sets taken
from the UCI Machine Learning Database Repository (http://archive.ics.uci.edu/ml/).</p>
      <p>For each database, we estimate the average predictive accuracy and processing
time by 5-fold cross-validation, using 80% of the data as the training set and the
remaining 20% as the test set. According to the scheme of the MNN and the MCS, some
specifications are as follows (a usage sketch follows the list):
1. Topology. Each expert in the MNN corresponds to a linear perceptron, in which
the number of nodes in the input layer corresponds to the number of attributes in
the input pattern. For the expert networks, the number of neurons in the output
layer is equal to the number of categories in the problem, while for the
integrating network it is equal to the number of experts used.
2. Connection weights. The connection weights were initialized to random values
in the range between -0.5 and 0.5.
3. Each MNN consists of 5 modules and a gating network.
4. For the final decision of the MCS, simple majority voting was used.</p>
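      <p>Putting the earlier sketches together, a hypothetical construction of this setup could look as follows; X_train, S_train (one-hot targets) and all helper names come from the illustrative code above, not from the original experiments.</p>
      <preformat>
# MCS of 7 small MNNs (5 experts each), trained on disjoint subsamples;
# weight initialization in [-0.5, 0.5] is handled inside train_mnn.
subsets = disjoint_subsamples(X_train, S_train, L=7, rng=0)
ensemble = [train_mnn(Xs, Ss, r=5, eta=0.01, epochs=100, rng=k)
            for k, (Xs, Ss) in enumerate(subsets)]

# Final decision for a test pattern x: simple majority voting over members.
label = majority_vote([int(np.argmax(predict(We, Wg, x)))
                       for We, Wg in ensemble])
      </preformat>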
      <p>Only the result of the best technique on each database is presented.
Analogously, for each database, regarding the number of subsamples used to induce the
individual classifiers, that is, the number of classifiers in the MCS, we have experimented
with 7 and 9 elements, and the best results have been included in Table 2.
Besides, the classification accuracy of a single MNN trained on each original training set is
also reported as the baseline classifier.</p>
      <p>Since the accuracies are very different for the distinct data sets, comparing these raw
results across the data sets would be inadequate. Instead, we calculate ranks for the methods.
For each data set, the method with the best accuracy receives rank 1, and the worst
receives rank 5. If there is a tie, the ranks are shared. Thus, the overall rank of a
method is its rank averaged across the 12 data sets; a smaller rank
indicates a better method.</p>
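      <p>As an illustration, this ranking scheme can be computed as in the following sketch; the function name and the use of SciPy are our own choices, not part of the original study.</p>
      <preformat>
import numpy as np
from scipy.stats import rankdata

def average_ranks(acc):
    """acc: (n_datasets, n_methods) accuracy table. Per dataset, rank 1 goes
    to the best accuracy and ties share the averaged rank (rankdata default);
    returns the mean rank of each method across all datasets."""
    ranks = np.vstack([rankdata(-row) for row in acc])  # negate: higher is better
    return ranks.mean(axis=0)
      </preformat>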
      <p>Table 2 has two sections: the first one includes the MNN results, and the second
section shows the results when the MCS is used with 7 and 9 classifiers. In this case,
the corresponding capital letter identifies the resampling method used for obtaining
the subsamples: random selection without replacement (A) and Bagging (B). The results
correspond to the overall accuracy, with the standard deviation in parentheses;
values in bold type indicate the highest accuracy for each database.</p>
      <sec id="sec-4-1">
        <title>Dataset</title>
      </sec>
      <sec id="sec-4-2">
        <title>Cancer</title>
      </sec>
      <sec id="sec-4-3">
        <title>Heart</title>
      </sec>
      <sec id="sec-4-4">
        <title>Liver</title>
      </sec>
      <sec id="sec-4-5">
        <title>Pima</title>
      </sec>
      <sec id="sec-4-6">
        <title>Sonar</title>
      </sec>
      <sec id="sec-4-7">
        <title>Iris</title>
      </sec>
      <sec id="sec-4-8">
        <title>Vehicle</title>
      </sec>
      <sec id="sec-4-9">
        <title>German</title>
      </sec>
      <sec id="sec-4-10">
        <title>Phoneme</title>
      </sec>
      <sec id="sec-4-11">
        <title>Waveform</title>
      </sec>
      <sec id="sec-4-12">
        <title>Segment</title>
      </sec>
      <sec id="sec-4-13">
        <title>Overall Rank MNN</title>
        <p>88.4 (4.6)
73.7 (8.6)
63.5 (5.4)
66.5 (1.6)
65.9 (6.2)
80.7 (11.4)
36.4 (7.1)
61.8 (18.0)
67.9 (4.5)
77.2 (2.7)
78.2 (5.6)
46.5</p>
        <p>From the results shown in Table 2, some comments may be drawn. First, except on the
Cancer and Segment data sets, it is clear that some MCS schemes lead to better
performance than the MNN. This is confirmed by the overall rank of the MNN, which
is clearly the poorest. Second, comparing the MCS using 7 or 9
classifiers, it is possible to observe that with 7 classifiers we can find
some results with an accuracy greater than (or equivalent to) that obtained when nine
classifiers are used. Finally, comparing the different resampling methods, method A (random
selection without replacement) generally performs better than method B
(Bagging), winning with the 7-classifier MCS on 5 datasets; even where it does not win, its
results remain very close to the winner's.</p>
        <p>The Vehicle data set is a special case due to its poor performance, regardless of
the scheme used. In this case, a thorough analysis of the data distribution is necessary
in order to identify the reason why the MNN and the MCS are not able to recognize
the classes of the problem as required.</p>
        <p>Another aspect to be analyzed is the computational cost associated with each
model. To this end, Table 3 shows the time, in minutes, required by each classifier model
during the training and classification processes.</p>
      </sec>
      <sec id="sec-4-14">
        <title>Dataset</title>
      </sec>
      <sec id="sec-4-15">
        <title>Cancer</title>
      </sec>
      <sec id="sec-4-16">
        <title>Heart</title>
      </sec>
      <sec id="sec-4-17">
        <title>Liver</title>
      </sec>
      <sec id="sec-4-18">
        <title>Pima</title>
      </sec>
      <sec id="sec-4-19">
        <title>Sonar</title>
      </sec>
      <sec id="sec-4-20">
        <title>Iris</title>
      </sec>
      <sec id="sec-4-21">
        <title>Vehicle</title>
      </sec>
      <sec id="sec-4-22">
        <title>German</title>
      </sec>
      <sec id="sec-4-23">
        <title>Phoneme</title>
      </sec>
      <sec id="sec-4-24">
        <title>Waveform</title>
      </sec>
      <sec id="sec-4-25">
        <title>Segment</title>
        <p>
          Results in Table 3 clearly show large differences between the processing times
obtained by the three models. It is interesting to note that in the majority of
cases, the time required by the MNN is almost twice that required by any
MCS; for example, with the Sonar dataset the MNN requires nine times more time than the
MCS. These differences arise because the MCS uses small subsamples of size m/L in the
training process, where m is the number of training patterns and L the number
of subsamples [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], reducing the computational cost in terms of runtime. In fact,
using 9 classifiers requires less time in most cases, because the subsamples are smaller.
        </p>
        <p>Finally, regarding the performance of the schemes used, we can note that the best
classification results were obtained with an MCS with 7 classifiers, which requires less
processing time than the single MNN and only slightly more than an MCS with 9
members.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Concluding Remarks and Future work</title>
      <p>Designing an MCS with MNNs as individual classifiers has been analyzed here. Two
MCS were used, one with 7 and one with 9 classifiers. For the single MNN architecture, we
have employed five expert networks and one gating network. The experimental
results allow comparing these models in terms of processing time and predictive
accuracy. From this, it has been possible to corroborate that, in general, the MCS clearly
outperforms the single MNN classifier.</p>
      <p>In addition, when comparing the behavior of the resampling methods, it has been
empirically shown that random selection without replacement offers
the best performance: greater accuracy and lower computational cost.</p>
      <p>Finally, by comparing the classification results and the processing time
required by each model, the use of the MCS provides the best performance, being the
best option to improve the time-accuracy trade-off.</p>
      <p>Future work to expand this research will aim mainly at improving the
performance of the single MNN. In this context, other architectures with different
parameters, and mechanisms such as regularization or cross-validation, must be
analyzed. Also, the relationship between the individual classifiers and the resampling
methods should be further investigated in order to determine the “optimal” scenario.
Acknowledgements. This work has been partially supported by grants of the
3834/2014/CIA project, from the Mexican UAEM.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alejo</surname>
          </string-name>
          , R. “
          <article-title>Análisis del Error en Redes Neuronales: Corrección del error de los datos y distribuciones no balanceadas”</article-title>
          . Tesis Doctoral,
          <institution>Universitat Jaume I</institution>
          , Castelló de la Plana, España
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bauckhage</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Thurau</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>"Towards a Fair'n Square aimbot Using Mixture of Experts to Learn Context Aware Weapon Handling"</article-title>
          ,
          <source>in Proceedings of GAME-ON'04</source>
          , Ghent, Belgium, pp.
          <fpage>20</fpage>
          -
          <lpage>24</lpage>
          (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>"Bagging predictors"</article-title>
          ,
          <source>Machine Learning 24 (2)</source>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>140</lpage>
          , (
          <year>1996</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dietterich</surname>
          </string-name>
          , T. G. “
          <article-title>Ensemble methods in machine learning</article-title>
          ,
          <source>” Lecture Notes in Computer Science</source>
          , vol.
          <volume>1857</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hartono</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hashimoto</surname>
            <given-names>S.:</given-names>
          </string-name>
          <article-title>"Ensemble of Linear Perceptrons with Confidence Level Output"</article-title>
          ,
          <source>in Proceedings of the 4th International. (USA)</source>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>106</lpage>
          , (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kadlec</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gabrys</surname>
            <given-names>B.</given-names>
          </string-name>
          “
          <source>Learnt Topology Gating Artificial Neural Networks”. IJCNN</source>
          , pp.
          <fpage>2604</fpage>
          -
          <lpage>2611</lpage>
          . IEEE (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kuncheva</surname>
            ,
            <given-names>L. I.</given-names>
          </string-name>
          :
          <article-title>"Using measures of similarity and inclusion of multiple classifier fusion by decision templates"</article-title>
          ,
          <source>Fuzzy Sets and Systems</source>
          ,
          <volume>122</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>401</fpage>
          -
          <lpage>407</lpage>
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kuncheva</surname>
            ,
            <given-names>L.I.</given-names>
          </string-name>
          <string-name>
            <surname>Bezdek</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duin</surname>
            ,
            <given-names>R.P.W.</given-names>
          </string-name>
          :
          <article-title>“Decision templates for multiple classifier fusion”</article-title>
          .
          <source>Pattern Recognition</source>
          ,
          <volume>34</volume>
          , pp.
          <fpage>299</fpage>
          -
          <lpage>314</lpage>
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kuncheva</surname>
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kountchev</surname>
            <given-names>R. K.</given-names>
          </string-name>
          :
          <article-title>"Generating classifier outputs of fixed accuracy and diversity"</article-title>
          ,
          <source>Pattern Recognition Letters, 23</source>
          , pp.
          <fpage>593</fpage>
          -
          <lpage>600</lpage>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Martínez</surname>
            <given-names>L. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez</surname>
            <given-names>P. A.</given-names>
          </string-name>
          “
          <article-title>Modelado de sus funciones cognitivas para entidades artificiales mediante redes neuronales modulares”</article-title>
          . Tesis doctoral, Universidad Politécnica de Madrid. España (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Valdovinos</surname>
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez</surname>
            <given-names>J.S.</given-names>
          </string-name>
          : “Sistemas Múltiples de Clasificación, preprocesado, construcción, fusión y evaluación”.
          <source>Académica Española: Alemania</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Valdovinos</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez</surname>
            <given-names>J.S.</given-names>
          </string-name>
          : “
          <article-title>Class-dependant resampling for medical applications”</article-title>
          ,
          <source>In: Proc. 4th Intl. Conf. on Machine Learning and Applications</source>
          , pp.
          <fpage>351</fpage>
          -
          <lpage>356</lpage>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>