<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bagging-based instance selection for instance-based classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmytro K</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The task of reducing large labeled samples for building diagnostic and recognition models by precedents is considered. A method is proposed that allows essentially reducing the size of a training sample while increasing its efficiency, through the removal of irrelevant and redundant instances. The method makes it possible to estimate each instance of a training sample by synthesizing an ensemble of weak classifiers using a bagging model, and to create a reduced sample of the most significant instances according to these estimates. Software implementing the proposed method has been developed and experimentally investigated in solving the task of reducing synthetic and real-world data. The results of the conducted experiments allow recommending the use of the developed method and its software implementation for solving such tasks in the sphere of technical diagnostics.</p>
      </abstract>
      <kwd-group>
        <kwd>base classifier</kwd>
        <kwd>class</kwd>
        <kwd>classification</kwd>
        <kwd>ensemble of classifiers</kwd>
        <kwd>instance</kwd>
        <kwd>meta-estimator</kwd>
        <kwd>metric</kwd>
        <kwd>sampling</kwd>
        <kwd>training sample</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The constantly increasing volume of available information requires significant
computing resources for successful data processing, so the task of reducing the
dimensionality of data for further building models on it is urgent [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1-4</xref>
        ]. It is especially
important when solving practical tasks of industrial diagnostics, where the diagnostic
system must respond immediately to any deviation in the operation of equipment. In
the conditions of continuous production, nondestructive-testing diagnostic systems
that allow performing diagnostics in real time are especially important. The main
component of such diagnostic systems is a model of pattern classification by
precedents (a classifier) [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8">5-8</xref>
        ]. The classifier is a set of rules that determine whether new
observations (instances) belong to one of the existing classes. To build a recognition
model by precedents, it is necessary to have a set of instances (precedents) with known
class values (a training sample) and a classification method that forms the rules of
recognition using the training sample (training) [
        <xref ref-type="bibr" rid="ref10 ref9">9-10</xref>
        ].
      </p>
      <p>
        There are many classification methods with different principles and approaches
[
        <xref ref-type="bibr" rid="ref11 ref12">11-12</xref>
        ]. The family of metric classification methods based on precedents is used quite
effectively in building diagnostic models. Metric training methods belong to the
geometrical paradigm of machine learning, which assumes that instances have a
geometric structure: each instance is described by numerical features and is considered
as a point in a multidimensional feature space. These methods are based on the
assumption of local compactness of classes, from which it follows that the similarity
of two instances on N independent features also implies their similarity on the
dependent feature N + 1. Thus, only instances of the same class are expected in the
neighborhood of instances of a given class, and the closer a control instance is to the
neighborhood of a class, the more likely it belongs to this class [
        <xref ref-type="bibr" rid="ref13 ref14">13-14</xref>
        ]. Metric
methods have such advantages as simplicity of implementation, clear logic of
operation, a geometrical nature, simple interpretation of model results, a developed
theoretical base, and adaptability to the task at hand through metric selection. The
disadvantage of metric recognition methods is the necessity to store the whole training
sample in the computer memory. Thus, the use of large training samples may
require significant computing resources and classification time. In technical
diagnostics systems, where the speed of decision making is a priority, models based on large
training samples may be ineffective. Reducing the size of training samples will help
reduce both the classification time and the computational complexity of diagnostic
models.
      </p>
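      <p>For illustration only (this sketch is ours, not part of the original method), the following minimal Python fragment shows a one-nearest-neighbor metric classifier whose training consists solely of storing the sample, which is exactly the source of the memory and time costs discussed above.</p>
      <preformat>
# Illustrative sketch (not the authors' implementation): a minimal 1-NN metric
# classifier, showing that the whole training sample must be kept in memory.
import numpy as np

class OneNN:
    def fit(self, x, y):
        # "Training" is just storing the sample: the source of the memory cost.
        self.x, self.y = np.asarray(x, float), np.asarray(y)
        return self

    def predict(self, queries):
        queries = np.asarray(queries, float)
        # Euclidean distances from every query to every stored instance.
        d = np.linalg.norm(queries[:, None, :] - self.x[None, :, :], axis=2)
        return self.y[d.argmin(axis=1)]

clf = OneNN().fit([[0, 0], [1, 1], [5, 5]], [1, 1, 2])
print(clf.predict([[0.2, 0.1], [4.8, 5.2]]))  # -> [1 2]
      </preformat>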
      <p>
        The most widely used approach to reducing the data dimension is the selection of
informative features [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15-17</xref>
        ], which implies selecting from the initial set of
features a smaller subset that is sufficient to solve the problem with the required
accuracy or that meets some criterion. However, if the feature space is small,
or the features individually are of low informativeness but together contain enough
information for building a model, the selection of informative features does not allow
an effective reduction of the data dimensionality.
      </p>
      <p>
        Another approach to reducing the data dimensionality is the selection of instances [
        <xref ref-type="bibr" rid="ref1">1,
1822</xref>
        ]. To date, the instance selection has been considered a necessary preliminary data
processing procedure [
        <xref ref-type="bibr" rid="ref1 ref23">1, 23</xref>
        ]. Successful use of instance selection methods allows
selecting a sample of small size, independently of the model in which it will be
used in the future, without performance loss [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In the process of selecting
instances, irrelevant and redundant instances are removed from the sample, so in some
cases the performance of models that are built on processed samples may be higher
than on the initial data.
      </p>
      <p>
        Many instance selection methods have been proposed in the past few decades, each
with weaknesses and advantages [
        <xref ref-type="bibr" rid="ref23 ref24 ref25 ref26">23-26</xref>
        ]. However, there is no universal method
that achieves equally high results on various data samples. In general, the task
of instance selection is to select the most relevant instances of a training sample.
In effect, instance selection requires solving a binary classification problem in which
each instance of the training sample is classified as selected or unselected. Therefore,
it can be assumed that the approaches and methods used to solve the classification
problem can also be applied to the instance selection task.
      </p>
      <p>
        One of the successful directions for increasing model performance in classification
tasks is the use of ensembles of classifiers [
        <xref ref-type="bibr" rid="ref27 ref28 ref29">27-29</xref>
        ].
      </p>
      <p>In this paper we study the possibility of using ensembles of metric classifiers to
select the most relevant instances in solving the problem of reducing training samples
and increasing their representativeness.
</p>
    </sec>
    <sec id="sec-2">
      <title>Formal problem statement</title>
      <p>
        The task of instance selection for building a classification model by precedents is
to generate, from the most relevant instances of the initial training sample, a training
subsample of minimum size that allows classifying new unlabeled data with an
accuracy no lower than when using the initial training sample [
        <xref ref-type="bibr" rid="ref2 ref30">2, 30</xref>
        ].
Formally, the task can be written in the following way.
      </p>
      <p>Let the initial training sample be presented as a set of $S$ precedents of the dependence $y(x)$ and be defined by the expression $X = \langle x, y \rangle$, where $x$ is the matrix of input features and $y$ is the vector of output features. The set of input features $x$ is defined by the standard object-feature matrix:
$x = [x_{ij}]_{S \times N}$, (1)
where $S$ is the number of instances, $N$ is the number of input features, and $x_{ij}$ is the value of the $j$-th feature of the $i$-th instance. The set of output features is defined by the vector:
$y = (y_1, \ldots, y_S)$, $y_i \in \{1, 2, \ldots, K\}$, (2)
where $K$ is the number of classes in the sample ($K > 1$). Each $i$-th instance is represented as $X_i = \langle x_i, y_i \rangle$. Then the task of instance selection is to select from the initial sample $X = \langle x, y \rangle$ such a subsample $X' = \langle x', y' \rangle$ that the following conditions hold:
$x' \subset x$, $y' = \{ y_i \mid x_i \in x' \}$, $S' &lt; S$, $f(\langle x, y \rangle, \langle x', y' \rangle) \to \mathrm{opt}$, (3)
where $S'$ is the number of instances of the resulting subsample, $x'$ is the set of input features of the resulting subsample, and $y'$ is the set of output features of the resulting subsample.</p>
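      <p>As an illustration of conditions (3), the following sketch (our assumptions: a 1-NN classifier from scikit-learn and validation accuracy as the optimized functional $f$) checks whether a candidate subsample is smaller than the initial sample and does not lose validation accuracy.</p>
      <preformat>
# Illustrative check of conditions (3) on a candidate subsample X' given by idx_sub.
# sklearn is used here only for illustration; the criterion f is taken to be
# validation accuracy, one possible choice of the optimized functional.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def satisfies_conditions(x, y, idx_sub, x_val, y_val):
    x, y = np.asarray(x), np.asarray(y)
    x_sub, y_sub = x[idx_sub], y[idx_sub]
    acc_full = KNeighborsClassifier(n_neighbors=1).fit(x, y).score(x_val, y_val)
    acc_sub = KNeighborsClassifier(n_neighbors=1).fit(x_sub, y_sub).score(x_val, y_val)
    # S' must be smaller than S and the accuracy must not degrade.
    return len(x) > len(x_sub) and acc_sub >= acc_full
      </preformat>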
    </sec>
    <sec id="sec-3">
      <title>Review of the literature</title>
      <p>
        When solving complex practical problems of pattern recognition by precedents, there
may be cases when no single classifier provides the necessary accuracy of recognizing
patterns (classes). The accuracy can be increased by creating a model composed of a set of
classifiers (an ensemble). The strategy of ensembles is that a set of independent
classification models (base classifiers) is created and the results of their work are
combined. Thus, the performance of the base classifiers increases, because the
errors of some classifiers are compensated by the correct work of other ones [
        <xref ref-type="bibr" rid="ref27 ref28 ref29">27-29</xref>
        ]. Generally, the
ensemble of classifiers can be described as follows:
yx  b1x,..., bT x ,
(4)
where bt is a base classifier,  is a meta-estimator that creates a decisive rule that
the recognized instance belongs to a certain class yx .
      </p>
      <p>The basic properties of base classifiers are the ability of each of them to
independently solve the initial classification task and the possibility of using existing
standard classifier training methods.</p>
      <p>
        One of the most important conditions for the efficiency of classifier ensembles in
pattern recognition problems is the requirement of a sufficient variety of base
classifiers [
        <xref ref-type="bibr" rid="ref27 ref28">27-28</xref>
        ]. In this case, the classification errors of some base
classifiers are compensated by the work of other ones. Therefore, it is necessary to combine the results of
the base classifiers so as to increase the influence of correct decisions and minimize the
influence of wrong decisions on the response of the ensemble. The basic strategies for
building ensembles are the synthesis of independent classifiers and decision making
on a bagging basis [
        <xref ref-type="bibr" rid="ref31 ref32 ref33">31-33</xref>
        ], special coding of target values that reduces the task
to solving several subtasks (error-correcting output codes) [34], building
meta-features from the responses of base classifiers on subsets of samples and
training meta-functions on them (stacking) [
        <xref ref-type="bibr" rid="ref28">28, 35</xref>
        ], sequential combination of
several classifiers, with each next classifier being trained taking into account the
errors of the previous ones (boosting) [
        <xref ref-type="bibr" rid="ref27 ref28 ref29 ref32 ref33">27-29, 32-33, 36-37</xref>
        ], heuristic methods of
combining the answers of base classifiers by training in special subspaces and
visualizations (mixture-of-experts) [
        <xref ref-type="bibr" rid="ref28">28, 38-39</xref>
        ], recursive synthesis of homogeneous ensembles
(neural networks) [40].
      </p>
      <p>Bagging-based ensembles are the most common when it comes to solving real-world
tasks due to their simplicity of implementation and high generalization ability.
The main advantage of bagging is the ability to perform parallel computations while
maintaining high classification accuracy.</p>
      <p>
        During ensemble formation by the bagging method, each base classifier is trained on
a random subset of the training sample. With this approach, a variety of base classifiers is
achieved even when one classification method is used for all of them [
        <xref ref-type="bibr" rid="ref27 ref33">27,
33, 38</xref>
        ]. In the traditional bagging model, the bootstrap technique [
        <xref ref-type="bibr" rid="ref27 ref29 ref31 ref33">27, 29, 31, 33, 41</xref>
        ]
is used to select random subsets of a training sample, which implies forming the
subsets by random selection with replacement. The basic bagging method works well on
small samples, where the exclusion of even a small number of instances leads to a significant
transformation of the distribution in the sample. For larger training samples it is possible to
use other sampling methods; in this case the task of selecting the optimal size of the
extracted subsamples arises.
      </p>
      <p>
        There are many different ways to extract subsets (resampling) of a training sample
for the synthesis of bagging ensembles [
        <xref ref-type="bibr" rid="ref11 ref31">11, 31, 41-46</xref>
        ]. The most known methods are
bootstrap [
        <xref ref-type="bibr" rid="ref31">31, 41-46</xref>
        ], pasting [47], random subspaces [
        <xref ref-type="bibr" rid="ref28 ref29">28-29</xref>
        ], random patches [48]
and cross-validation [41].
      </p>
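      <p>The following sketch (illustrative, in Python with NumPy; the function names are ours) contrasts the classic bootstrap with the random-selection-with-replacement scheme of randomly chosen subsample length that is used later in this paper.</p>
      <preformat>
# Sketch of two resampling schemes mentioned above (illustration only):
# classic bootstrap (subsample size equals the sample size) and random selection
# with replacement where the subsample length is drawn from a predefined range.
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_indices(n):
    # Bootstrap: n draws with replacement from n instances.
    return rng.integers(0, n, size=n)

def random_size_indices(n, low_frac=0.01, high_frac=1.0):
    # Random selection with replacement, the subsample length chosen randomly
    # in the range [low_frac*n, high_frac*n].
    size = rng.integers(max(1, int(low_frac * n)), int(high_frac * n) + 1)
    return rng.integers(0, n, size=size)

train_idx = random_size_indices(1000)
oob_idx = np.setdiff1d(np.arange(1000), train_idx)  # out-of-bag instances
      </preformat>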
      <p>The base classifiers forming an ensemble by the bagging strategy do not need a
separate test subsample to evaluate the accuracy of the generated model. This is possible
because each base classifier is trained on a subsample containing only part of the
training sample, which allows estimating the accuracy of each base classifier on the
instances not included in the selected subsample. Provided that the number of base
classifiers is sufficiently large, such an evaluation will be performed for almost every
instance of the training sample. Moreover, this evaluation will be independent, because
the accuracy of each base classifier is evaluated on a subsample of instances whose
dependent variable values are unknown to it. Hereinafter, the training and test samples of
the base classifiers will be called local for definiteness.</p>
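      <p>A minimal sketch of such a local (out-of-bag) estimate is given below; the use of scikit-learn's 1-NN classifier and the synthetic data are our illustrative assumptions.</p>
      <preformat>
# Sketch: estimating a base classifier's accuracy on its "local" test sample,
# i.e. the instances that were not drawn into its local training sample,
# so that no separate test set is needed.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

train_idx = rng.integers(0, len(X), size=len(X))      # selection with replacement
oob_idx = np.setdiff1d(np.arange(len(X)), train_idx)  # local test sample

base = KNeighborsClassifier(n_neighbors=1, metric="manhattan")
base.fit(X[train_idx], y[train_idx])
local_accuracy = base.score(X[oob_idx], y[oob_idx])   # independent estimate E
print(f"local (out-of-bag) accuracy: {local_accuracy:.3f}")
      </preformat>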
      <p>
        When constructing ensembles of classifiers, it is important to take into account that
there are classification methods that are stable with respect to the selection of random subsets
(for example, the SVM (support vector machine) method, or the kNN (k-nearest
neighbors) method at $k \geq 3$) [
        <xref ref-type="bibr" rid="ref27">27, 49</xref>
        ]. Application of such methods in ensembles
based on bagging is ineffective because the required diversity of base classifiers is not achieved.
      </p>
      <p>The result of the ensemble work depends on the choice of the meta-estimator. In the most common case, the meta-estimator is a majority vote function for the classification task:
$y(x) = \Phi(b_1(x), \ldots, b_T(x)) = \arg\max_{k=1 \ldots K} \sum_{t=1}^{T} b_{tk}$. (5)
In more complex cases, weighted voting can be used, where each base classifier has a weighting characteristic:
$y(x) = \Phi(b_1(x), \ldots, b_T(x)) = \arg\max_{k=1 \ldots K} \sum_{t=1}^{T} w_t b_{tk}$, (6)
where $w_t$ is the weight characteristic of the base classifier. The weighting characteristic primarily depends on how accurately the classifier recognizes new instances. Such a characteristic can be the relative number of correctly recognized instances of the test sample (relative accuracy):
$E = \frac{1}{S^b} \sum_{i=1}^{S^b} [\, b(x_i) = y_i \,]$, (7)
where $E$ is the relative accuracy of the classifier, $S^b$ is the number of instances in the local test sample of the current classifier, and $b$ is the approximating function of the classifier.</p>
      <p>Another important parameter depends on the number of instances in the selected
training sample. According to the size minimization task, in order to calculate the
weighting characteristics of each base classifier it is possible to combine the
characteristics of classification accuracy within the local test sample and the share of
instances reduced from the initial training sample:
w  E  (1  )r ,
(8)
where   0...1 is a coefficient indicating the degree to which factors affect the value
of the overall score, r is a share of reduced instances r  S b S , since the local test
sample is formed from instances not included in the local training sample.</p>
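      <p>The following short sketch (illustrative; the parameter values and toy votes are arbitrary) computes the weight characteristic (8) and applies it in the weighted vote (6).</p>
      <preformat>
# Sketch of the weighting scheme (8) and weighted voting (6), under the notation
# reconstructed above; alpha, the class count K and the example votes are illustrative.
import numpy as np

def classifier_weight(E, S_b, S, alpha=0.75):
    # w = alpha * E + (1 - alpha) * r, with r = S_b / S the share of reduced instances.
    return alpha * E + (1 - alpha) * (S_b / S)

def weighted_vote(votes, weights, K):
    # votes[t] is the class predicted by base classifier t for one instance;
    # the ensemble answer is the argmax over the summed weights of the voters.
    scores = np.zeros(K)
    for cls, w in zip(votes, weights):
        scores[cls] += w
    return int(np.argmax(scores))

w = [classifier_weight(E, S_b, S=1000, alpha=0.75)
     for E, S_b in [(0.9, 700), (0.8, 900), (0.6, 950)]]
print(weighted_vote(votes=[1, 1, 0], weights=w, K=2))  # -> 1
      </preformat>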
      <p>The relative accuracy of classification (7) gives an objective assessment of the
classifier provided that the test sample is sufficiently balanced by classes. If the test
sample has an imbalance of classes, for example, a minority class is 1% of the
sample, it is possible that a classifier that incorrectly classified all minority instances and
correctly classified the majority class will have an abnormally high relative accuracy
(E = 99%). Selection of instances when constructing a bagging ensemble is carried
out randomly, so it is impossible to guarantee the balancing of the training and test local
samples. Stratified instance selection for the local training sample can be one
solution to the problem, but if the initial sample is imbalanced by classes, the local test
sample will also have class imbalance. Therefore, for imbalanced samples, an
assessment based on a confusion matrix may be more appropriate [50]. The confusion
matrix is a way of grouping the instances depending on the combination of the true
answer and the classifier's answer and allows obtaining a set of different metrics. In the case
of binary classification, instances can be divided into four categories (Table 1): true
positives (TP) and false positives (FP) for b(x) = 1, and false negatives (FN) and true
negatives (TN) for b(x) = 0.</p>
      <p>The instances of the class of greater interest are called positive instances, and those of the
other class are called negative. When dealing with imbalanced data, the minority class
is usually taken as positive. Using the confusion matrix, it is possible to obtain the
precision and recall metrics. The precision:
$P = \frac{TP}{TP + FP}$, (9)
where $TP$ is the number of correctly classified positive instances and $FP$ is the number of negative instances incorrectly classified as positive, shows the share of correctly predicted positive instances among all instances predicted as positive. The recall:
$R = \frac{TP}{TP + FN}$, (10)
where $FN$ is the number of positive instances incorrectly classified as negative, shows the share of correctly predicted positive instances among all actual positive instances.
Obviously, the higher the values of these metrics, the better the classifier. However, in practice it
is impossible to reach the maximum values of precision and recall
simultaneously, so it is necessary to choose which characteristic is more important for a
particular task or to search for a balance between these values. The harmonic mean of
precision and recall (the F-measure) allows combining these parameters [51]:
$F = \frac{2PR}{P + R}$. (11)</p>
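      <p>A compact sketch of computing the metrics (9)-(11) from a binary confusion matrix is given below; the example label vectors are illustrative.</p>
      <preformat>
# Sketch of the metrics (9)-(11) computed from a binary confusion matrix;
# the prediction/label arrays are illustrative.
import numpy as np

def precision_recall_f1(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum(np.logical_and(y_pred == 1, y_true == 1))
    fp = np.sum(np.logical_and(y_pred == 1, y_true == 0))
    fn = np.sum(np.logical_and(y_pred == 0, y_true == 1))
    P = tp / (tp + fp) if tp + fp else 0.0   # precision (9)
    R = tp / (tp + fn) if tp + fn else 0.0   # recall (10)
    F = 2 * P * R / (P + R) if P + R else 0.0  # F-measure (11)
    return P, R, F

print(precision_recall_f1([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
      </preformat>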
    </sec>
    <sec id="sec-4">
      <title>Materials and methods</title>
      <p>To select the most relevant instances of a training sample using an ensemble of
classifiers based on a bagging model, it is necessary to solve a binary classification
problem for each instance of the training sample: each instance is assigned either to
the class of selected instances or to the class of instances that do not meet the
selection condition.</p>
      <p>The classification of instances from the initial sample is based on the voting results
of base classifiers. The base classifier is a model trained on the marked sample, in
which the value of the output feature y(x) for each instance is known. To synthesize
the base classifier of a bagging ensemble, a random subset of instances is selected
from the initial sample by the bootstrap method. The resulting local sample is used to
train the base classifier. The kNN method with one nearest neighbor and the Manhattan
distance metric was chosen as the training method [52]. This choice is
conditioned by the necessity of obtaining less stable classifiers and of increasing the rate
of synthesis of a large number of base classifiers. This model uses a passive learning
strategy, in which there is no training phase of the classifier; instead, the training
sample is stored in memory and used to classify new data. The main advantage
of this model is the ability to use new data without retraining, simply adding new
significant instances to the sample. However, in such a model, large size training
samples will require significant memory resources for storage. Using a basic
bootstrap method for building an ensemble of classifiers implies retrieving subsamples of
the same size as the initial sample. Thus, each ensemble classifier must store almost
the entire training sample in memory. Since this research is aimed at reducing the size
of large training samples, the extraction of random subsamples was performed by random
selection with replacement, but in contrast to bootstrap, the length of the subsamples was
determined randomly within a predefined range. When creating each classifier, the
unselected instances were used as a local test sample for estimating that particular classifier.
The weighting parameter w of each base classifier was obtained using
equation (8).</p>
      <p>The primary aim of the study is to select the most representative data from the
training sample, so the task of the selection method is to investigate each instance of
the sample and assess its relevance. Random selection methods do not guarantee that
every instance is examined, so at the preliminary stage of creating an ensemble it is
proposed to divide the initial sample into some number of subsamples of approximately
equal size and then to classify each subsample using an ensemble built on the
remaining training subset. With this approach, it is possible to ensure that every instance of
the initial training sample is examined. The number of subsamples can take any value
$M > 1$; increasing the number of subsamples leads to more stable and accurate
results but, on the other hand, increases the computational complexity of the model
and the processing time.</p>
      <p>Formally, the proposed instance selection method can be presented as follows:
1. Set the initial training sample $X = \langle x, y \rangle$ and initialize the resulting sample $X' = \langle x', y' \rangle = \varnothing$. Set the number of subsamples $M > 1$, the number of base classifiers $T$, the value of the coefficient $\alpha \in [0, 1]$, and the threshold $\beta \in [0, 1]$ for selecting instances into the new training sample.
2. Split the initial sample $X$ into $M$ subsamples of approximately equal size: $X = \bigcup_{m=1}^{M} X_m$, $X_m = \langle x^m, y^m \rangle$, $m = 1 \ldots M$.
3. Set the number of the current subsample $m = 1$.
4. Set the number of the current base classifier $t = 1$.
5. Using simple random selection with replacement, take a local training sample $\tilde{X}$ from the subsample $X \setminus X_m$. Define as the local test sample $\hat{X}$ the set of instances not selected into $\tilde{X}$.
6. Train the base classifier on the local training sample: $b = \mathrm{fit}(\tilde{X})$.
7. Calculate the harmonic mean value $F$ (11) for the current base classifier. If $F &lt; 0.5$, go to step 5.
8. Calculate the weight characteristic $w$ of the current base classifier by equation (8).
9. Classify the subsample $X_m$ and calculate the weight of each of its instances, taking into account the weight characteristic of the base classifier:
$\omega_{ti} = w \, [\, b(x_i^m) = y_i^m \,], \quad i = 1 \ldots S_m$. (12)
10. Set $t = t + 1$. If $t \leq T$, go to step 5.
11. For each instance of the subsample, calculate the value of the meta-estimator:
$\omega_i^m = \sum_{t=1}^{T} \omega_{ti}, \quad i = 1 \ldots S_m$. (13)
12. Set $m = m + 1$. If $m \leq M$, go to step 4.
13. Merge the $M$ vectors of meta-estimators and normalize the values to the unit segment:
$\Omega_i = \omega_i \,/\, \max_{j=1 \ldots S} \omega_j, \quad i = 1 \ldots S$. (14)
14. Form a new training sample:
$X' = \{ \langle x_i, y_i \rangle \mid \Omega_i \geq \beta,\ i = 1 \ldots S \}$. (15)</p>
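      <p>The following Python sketch summarizes steps 1-14 under the notation reconstructed above. It is an illustration, not the authors' original software: scikit-learn's 1-NN with the Manhattan metric is assumed as the base learner, and its macro-averaged F-measure is used in step 7; the default parameter values are arbitrary.</p>
      <preformat>
# Illustrative sketch of the proposed bagging-based instance selection (steps 1-14).
import numpy as np
from sklearn.metrics import f1_score
from sklearn.neighbors import KNeighborsClassifier

def bagging_instance_selection(x, y, M=2, T=50, alpha=0.75, beta=0.5, seed=0):
    x, y = np.asarray(x, float), np.asarray(y)
    S = len(x)
    rng = np.random.default_rng(seed)
    omega = np.zeros(S)                               # accumulated instance weights
    folds = np.array_split(rng.permutation(S), M)     # step 2: M subsamples

    for fold in folds:                                # steps 3, 12
        rest = np.setdiff1d(np.arange(S), fold)       # X \ X_m
        accepted, attempts = 0, 0
        while accepted != T and attempts != 50 * T:   # steps 4, 10 (guarded loop)
            attempts += 1
            # Step 5: local training sample of random length, drawn with replacement.
            size = rng.integers(max(1, int(0.01 * len(rest))), len(rest) + 1)
            local = rng.choice(rest, size=size, replace=True)
            test = np.setdiff1d(rest, local)          # local test sample
            if len(test) == 0 or len(np.unique(y[local])) == 1:
                continue
            b = KNeighborsClassifier(n_neighbors=1, metric="manhattan")
            b.fit(x[local], y[local])                 # step 6
            F = f1_score(y[test], b.predict(x[test]), average="macro")
            if F >= 0.5:                              # step 7
                E = b.score(x[test], y[test])
                w = alpha * E + (1 - alpha) * len(test) / S   # step 8, eq. (8)
                hits = (b.predict(x[fold]) == y[fold])        # step 9, eq. (12)
                omega[fold] += w * hits                       # step 11, eq. (13)
                accepted += 1

    Omega = omega / max(omega.max(), 1e-12)           # step 13, eq. (14)
    keep = Omega >= beta                              # step 14, eq. (15)
    return x[keep], y[keep]

# Example: x_red, y_red = bagging_instance_selection(x, y, M=2, T=100)
      </preformat>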
    </sec>
    <sec id="sec-5">
      <title>Experiments and Results</title>
      <p>To obtain a summary evaluation of the method, the experiments were conducted on
two different samples, which differed in the number of instances, features and classes
(Table 2). To evaluate the obtained training samples, at the first stage of the
experiment the initial data set X_0 was divided by the stratification method [53] into
training (X) and validation (X_V) samples in a ratio of 75/25. The training sample obtained
by the stratification method was subsequently considered as the initial sample. Classifiers were
built on the basis of the initial and the resulting samples and tested with the validation
sample. Then values of relative accuracy of classification and number of instances of
samples were compared. The nearest neighbor method with the Euclidean distance
metric was used as the classification method of the recognition model. The resulting
sample was formed using an ensemble of base classifiers. All base classifiers used the
nearest neighbor method with one neighbor and the Manhattan distance metric. Using the
Manhattan distance metric reduced both the stability of the base classifiers and the
computational complexity. A variety of base classifiers was achieved by forming the
local training sample of each base classifier using simple random selection with
replacement; the local sample size was determined randomly in the
range of 1%...100% of the initial subsample size. For each base classifier the
F-measure value F was calculated using the local test sample, consisting of the instances
not included in the local training sample. If the F-measure value F of the
base classifier was less than 50%, the synthesis procedure for this classifier was
repeated. Then the weight of each base classifier was calculated according to
equation (8) with parameter α = 0.75. The examined subsample was classified by each base
classifier, taking into account its weight. Thus, a vector of weights corresponding to all the
base classifiers was formed for each instance of the initial sample. The resulting
training sample X' was formed from the instances having the highest total weights.</p>
      <p>At the next stage of the study, the validation sample X_V was classified by the
nearest neighbor method with one nearest neighbor and the Euclidean distance
metric. The initial sample X and the resulting sample X' were used
as training samples. The relative accuracy of the models was calculated according to
equation (7). Using the obtained data, the dependencies of the relative accuracy and the
sample length on the number of base classifiers were plotted (Fig. 2-5).</p>
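      <p>The evaluation protocol described above can be summarized by the following sketch (our reading, with scikit-learn assumed as the tooling; reduce_sample stands for any instance selection routine, for example the sketch in the previous section).</p>
      <preformat>
# Sketch of the evaluation protocol: stratified 75/25 split, then 1-NN/Euclidean
# models trained on the initial sample X and on the reduced sample X' and
# compared on the validation sample X_V.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def compare_samples(x0, y0, reduce_sample, seed=0):
    x_train, x_val, y_train, y_val = train_test_split(
        x0, y0, test_size=0.25, stratify=y0, random_state=seed)  # stratified split [53]
    x_red, y_red = reduce_sample(x_train, y_train)                # resulting sample X'

    def accuracy(xs, ys):
        clf = KNeighborsClassifier(n_neighbors=1, metric="euclidean")
        return clf.fit(xs, ys).score(x_val, y_val)                # relative accuracy (7)

    return {"S": len(x_train), "S'": len(x_red),
            "E(X)": accuracy(x_train, y_train), "E(X')": accuracy(x_red, y_red)}
      </preformat>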
      <p>[Fig. 2-5 show the dependence of the relative accuracy E (%) and of the number of instances S on the number of base classifiers for the initial sample X and the resulting sample X'. Fig. 2. Dependence of model accuracy on the number of base classifiers for the Pulsar dataset.]</p>
      <p>For clarity, the figures show the local areas around the critical values of model
accuracy, at which the classification accuracy for the resulting sample became lower
than the accuracy for the initial training sample. Thus, it was possible to estimate the
critical number of instances of the resulting sample, below which the relative
accuracy of the model based on the resulting sample became lower than the relative
accuracy of the model based on the initial sample.</p>
    </sec>
    <sec id="sec-6">
      <title>Discussion</title>
      <p>The proposed method showed high efficiency on all investigated datasets. All models
built on the obtained training samples had a higher relative accuracy than the models
built on the initial samples. At the same time, the obtained samples were less than half
the size of the initial samples, even with a minimum number of classifiers. The
increase in the number of base classifiers led to a decrease in the size of the resulting
sample and a decrease in the relative accuracy of the model. Such results are due to
the fact that, as the number of base classifiers grows, the number of instances whose
total weight reaches the given threshold decreases, and significant instances are
probably removed, which reduces the effectiveness of the model.
Despite the disadvantages of the proposed method, in practical application there is a
range of the number of classifiers in which the relative accuracy of the method will
be higher than that of the classifier based on the initial sample. Also, the proposed method of
instance selection requires quite a large number of initial parameters. Therefore, there
is a need to create a method capable of independently estimating the initial parameters
of the model. For this purpose, it is necessary to develop mechanisms of model
evaluation and determination of the initial parameters of the method.</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>The task of reducing labeled data samples of large size for building diagnostic and
recognition models by precedents has been considered. The results of the experiments have
shown the efficiency of the proposed method on all investigated samples.</p>
      <p>The scientific novelty of the obtained results is that a new method has been
created which reduces the size of labeled samples, keeping the most significant instances
and removing the less informative ones. Thus, the proposed method allows solving
the data reduction problem and increasing the efficiency of the training sample by
removing irrelevant and redundant instances.</p>
      <p>The practical significance of the obtained results is that software implementing
the proposed method has been developed. This software has been experimentally
investigated on the problems of reducing synthetic and real-world data. The conducted
experiments have confirmed the working capacity of the developed software. The
results of the performed experiments allow recommending the use of the developed
method and its software for solving problems of technical diagnostics.</p>
      <p>Further research in the field of reducing training samples by building ensembles of
classifiers can be conducted in the following directions:
─ development of adaptive ensembles of classifiers with a minimum number of
initial parameters;
─ use of ensembles of classifiers of different types;
─ use of different approaches to the formation of local samples for base
classifiers, to create balanced data sets;
─ search for optimal classification methods for the synthesis of base classifiers;
─ implementation of the proposed method for multiprocessor systems operating in
parallel modes.</p>
      <p>
34. Dietterich, T.G., Bakiri, G.: Solving multiclass learning problems via error-correcting
output codes. In: Journal of Artificial Intelligence Research, vol. 2, pp. 263–286. AI Access
Foundation, USA (1995)
35. Wolpert, D., H.: Stacked generalization. In: Neural Networks, vol. 5(2), pp. 241–259.
Elsevier (1992). doi: 10.1016/S0893-6080(05)80023-1
36. Yoav, F., Schapire, R., Abe, N.: A Short Introduction to Boosting. In: Journal of Japanese</p>
      <p>Society For Artificial Intelligence, vol. 14(5), pp. 771-780. (1999)
37. Blachnik, M.: Instance Selection for Classifier Performance Estimation in Meta Learning.</p>
      <p>In: Entropy, vol. 19(11) (2017). doi:10.3390/e19110583
38. Rokach, L.: Ensemble-based classifiers. In: Artificial Intelligence Review, vol. 33, pp. 1-39.
Springer (2010). doi:10.1007/s10462-009-9124-7
39. Jordan, M., I., Jacobs, R., A.: Hierarchical mixtures of experts and the EM algorithm. In:</p>
      <p>Neural computation, vol. 6(2), pp. 181-214. IEEE (1994). doi: 10.1162/neco.1994.6.2.181
40. Haykin, S., O.: Neural Networks and Learning Machines. Pearson (2008)
41. Kuhn, M., Johnson, K.: Applied Predictive Modeling. Springer (2018)
42. Thompson, S.K.: Sampling. John Wiley &amp; Sons, Hoboken (2012)
43. Cochran, W.G.: Sampling Techniques. John Wiley &amp; Sons, New York (1977)
44. Chaudhuri, A., Stenger, H.: Survey sampling theory and method. Chapman &amp; Hall, New</p>
      <p>York (2005)
45. Good, P., I.: Resampling methods: a practical guide to data analysis. Birkhäuser (2005)
46. Scheaffer, L., Mendenhall, W., Lyman Ott, R. at. al.: Elementary Survey Sampling.
Cengage Learning (2011)
47. Breiman, L.: Pasting Small Votes for Classification in Large Databases and On-Line. In:</p>
      <p>Machine Learning, vol. 36, pp. 85–103. Springer (1999)
48. Louppe, G., Geurts, P.: Ensembles on Random Patches. In: Lecture Notes in Computer</p>
      <p>Science, pp. 346–361. Springer (2012). doi:10.1007/978-3-642-33460-3_28
49. Wu, X., Kumar, V., Quinlan, J. et al.: Top 10 algorithms in data mining. In: Knowledge
and Information Systems, vol. 14, pp. 1-37. Springer (2008). doi:
10.1007/s10115-007-0114-2
50. Elkan, C.: The foundations of cost-sensitive learning. In: 17th international joint
conference on Artificial intelligence 2001, vol. 2, pp. 973-978. Morgan Kaufmann Publishers
Inc. (2001)
51. Fawcett T.: An Introduction to ROC Analysis. In: Pattern Recognition Letters, vol. 27(8),
pp. 861-874. Elsevier (2005). doi: 10.1016/j.patrec.2005.10.010
52. Zhang, S., Cheng, D., Deng, Z. at. al.: A novel KNN algorithm with data-driven k
parameter computation. In: Pattern Recognition Letters, vol. 109, pp. 44-54. Elsevier (2018). doi:
10.1016/j.patrec.2017.09.036
53. Parsons, V., L.: Stratified Sampling. In: Wiley StatsRef: Statistics Reference Online. John</p>
      <p>Wiley &amp; Sons (2017). doi: 10.1002/9781118445112.stat05999.pub2
54. Lyon, R., J., Stappers, B., W., Cooper, S. at. al.: Fifty Years of Pulsar Candidate Selection:
From simple filters to a new principled real-time classification approach. In: Monthly
Notices of the Royal Astronomical Society, vol. 459(1), pp. 1104-1123. Oxford (2016). doi:
10.1093/mnras/stw656
55. Alcalá-Fdez J., Fernandez A., Luengo J. at. al.: KEEL Data-Mining Software Tool: Data
Set Repository, Integration of Algorithms and Experimental Analysis Framework. In:
Journal of multiple-valued logic and soft computing, vol. 17(4), pp. 255-287. Old city
publishing (2010)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>García</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luengo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herrera</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Data Preprocessing in Data Mining</article-title>
          . Springer, Switzerland (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>García</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramírez-Gallego</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luengo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          et al.:
          <article-title>Big data preprocessing: methods and prospects</article-title>
          .
          <source>In: Big Data Anal</source>
          , vol.
          <volume>1</volume>
          (
          <issue>9</issue>
          ). Springer (
          <year>2016</year>
          ).
          <source>doi: 10.1186/s41044-016-0014-0</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mao</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Big Data: A Survey</article-title>
          .
          <source>In: Mobile Networks and Applications</source>
          , vol.
          <volume>19</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>171</fpage>
          -
          <lpage>209</lpage>
          . Springer (
          <year>2014</year>
          ).
          <source>doi:10.1007/s11036-013-0489-0</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Han,
          <string-name>
            <given-names>J</given-names>
            .,
            <surname>Kamber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          :
          <article-title>Data Mining: Concepts and Techniques</article-title>
          . Morgan Kaufmann, USA (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The sample properties evaluation for pattern recognition and intelligent diagnosis</article-title>
          .
          <source>In: The 10th International Conference on Digital Technologies</source>
          <year>2014</year>
          , Zilina,
          <fpage>9</fpage>
          -
          <issue>11</issue>
          <year>July 2014</year>
          , pp.
          <fpage>332</fpage>
          -
          <lpage>343</lpage>
          . IEEE, Los Alamitos (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliinyk</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levashenko</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaitseva</surname>
          </string-name>
          , E.:
          <article-title>Diagnostic rule mining based on artificial immune system for a case of uneven distribution of classes in sample</article-title>
          . In: Communications - Scientific Letters of the University of Zilina, vol.
          <volume>18</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>3</fpage>
          -
          <lpage>11</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Boguslayev</surname>
            ,
            <given-names>A. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oleynik</surname>
            , Al.
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oleynik</surname>
          </string-name>
          , An. A. at. al.:
          <article-title>Progressivnyye tekhnologii modelirovaniya, optimizatsii i intellektual'noy avtomatizatsii etapov zhiznennogo tsikla aviatsionnykh dvigateley: monografiya [Progressive technologies for modeling, optimization, and intelligent automation of aircraft engine life cycle stages: monograph]. OAO «Motor Sich», Zaporozh'ye (</article-title>
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Harley</surname>
            ,
            <given-names>J. B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sparkman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>: Machine learning and NDE: Past, present, and future</article-title>
          .
          <source>In: AIP Conference Proceedings</source>
          , vol.
          <volume>2102</volume>
          (
          <issue>1</issue>
          ) (
          <year>2019</year>
          ).
          <source>doi:10.1063/1</source>
          .5099819
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Murphy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Machine Learning: A Probabilistics Perspective</article-title>
          . The MIT Press, Cambridge, Massachusetts (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Bishop</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          :
          <article-title>Pattern recognition and machine learning</article-title>
          . Springer, New York (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lavrakas</surname>
            ,
            <given-names>P.J.:</given-names>
          </string-name>
          <article-title>Encyclopedia of survey research methods</article-title>
          .
          <source>Sage Publications</source>
          , Thousand
          <string-name>
            <surname>Oaks</surname>
          </string-name>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Fernández-Delgado</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cernadas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al.:
          <article-title>Do we need hundreds of classifiers to solve real world classification problems?</article-title>
          <source>In: Journal of Machine Learning Research</source>
          , vol.
          <volume>15</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>3133</fpage>
          -
          <lpage>3181</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Samarev</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vasnetsov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smelkova</surname>
          </string-name>
          , E.:
          <article-title>Generalization of metric classification algorithms for sequences classification and labelling</article-title>
          .
          <source>In: ArXiv: 1610.04718</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Abu</given-names>
            <surname>Alfeilat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            ,
            <surname>Hassanat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B. A.</given-names>
            ,
            <surname>Lasassmeh</surname>
          </string-name>
          ,
          <string-name>
            <surname>O. at.</surname>
          </string-name>
          :
          <article-title>Effects of Distance Measure Choice on K-Nearest Neighbor Classifier Performance: A Review</article-title>
          .
          <source>In: Big Data</source>
          , vol.
          <volume>7</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>221</fpage>
          -
          <lpage>248</lpage>
          . Mary Ann Liebert, Inc. (
          <year>2019</year>
          ). doi:
          <volume>10</volume>
          .1089/big.
          <year>2018</year>
          .0175
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The neuro-fuzzy network synthesis and simplification on precedents in problems of diagnosis and pattern recognition</article-title>
          .
          <source>In: Optical Memory and Neural Networks (Information Optics)</source>
          , vol.
          <volume>22</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>97</fpage>
          -
          <lpage>103</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Quasi-relief method of informative features selection for classification</article-title>
          .
          <source>In: 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT</source>
          <year>2018</year>
          ), Lviv,
          <fpage>11</fpage>
          -14
          <source>September</source>
          <year>2018</year>
          , pp.
          <fpage>318</fpage>
          -
          <lpage>321</lpage>
          . Vezha i Ko,
          <source>Lviv</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Methods of data sample metrics evaluation based on fractal dimension for computational intelligence model buiding</article-title>
          .
          <source>In 4th International Scientific-Practical Conference. Problems of Infocommunications. Science and Technology (PICS&amp;T)</source>
          ,
          <year>Kharkov</year>
          ,
          <fpage>10</fpage>
          -
          <lpage>13</lpage>
          Oct.
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . IEEE, Los Alamitos (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Haro-García</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cerruela-García</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Pedrajas</surname>
            ,
            <given-names>N.:</given-names>
          </string-name>
          <article-title>Instance selection based on boosting for instance-based learners</article-title>
          .
          <source>In: Pattern Recognition</source>
          , vol.
          <volume>96</volume>
          .
          <string-name>
            <surname>Elsevier</surname>
          </string-name>
          (
          <year>2019</year>
          ). doi:
          <volume>10</volume>
          .1016/j.patcog.
          <year>2019</year>
          .
          <volume>07</volume>
          .004
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Hamidzadeh</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monsefi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sadoghi</given-names>
            <surname>Yazdi</surname>
          </string-name>
          , H.:
          <article-title>IRAHC: Instance Reduction Algorithm using Hyperrectangle Clustering</article-title>
          .
          <source>In: Pattern Recognition</source>
          , vol.
          <volume>48</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>1878</fpage>
          -
          <lpage>1889</lpage>
          . Elsevier (
          <year>2015</year>
          ). doi:
          <volume>10</volume>
          .1016/j.patcog.
          <year>2014</year>
          .
          <volume>11</volume>
          .005
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliinyk</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The sample and instance selection for data dimensionality reduction</article-title>
          . In: Szewczyk,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Kaliczyńska</surname>
          </string-name>
          , M. (eds.)
          <source>Advances in Intelligent Systems and Computing</source>
          , vol.
          <volume>543</volume>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>103</lpage>
          . Springer, Cham (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The instance and feature selection for neural network based diagnosis of chronic obstructive bronchitis</article-title>
          . In: Bris,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Majernik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Pancerz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Zaitseva</surname>
          </string-name>
          , E. (eds.)
          <source>Studies in Computational Intelligence</source>
          , vol
          <volume>606</volume>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>228</lpage>
          . Springer, Cham (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Methods of sampling based on exhaustive and evolutionary search</article-title>
          .
          <source>In: Automatic Control and Computer Sciences</source>
          , vol.
          <volume>47</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>113</fpage>
          -
          <lpage>121</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motoda</surname>
          </string-name>
          , H.:
          <article-title>On issues of instance selection</article-title>
          .
          <source>In: Data Mining and Knowledge Discovery</source>
          , vol.
          <volume>6</volume>
          , pp.
          <fpage>115</fpage>
          -
          <lpage>130</lpage>
          . Springer (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Olvera-López</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrasco-Ochoa</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martínez-Trinidad</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          et. al.:
          <article-title>A review of instance selection methods</article-title>
          .
          <source>In: Artificial Intelligence Review</source>
          , vol.
          <volume>34</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>133</fpage>
          -
          <lpage>143</lpage>
          . Springer (
          <year>2010</year>
          ). doi:
          <volume>10</volume>
          .1007/s10462-010-9165-y
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Garcia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Derrac</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cano</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          et. al.:
          <article-title>Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study</article-title>
          .
          <source>In: IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          , vol.
          <volume>34</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>417</fpage>
          -
          <lpage>435</lpage>
          . IEEE (
          <year>2012</year>
          ). doi:
          <volume>10</volume>
          .1109/tpami.
          <year>2011</year>
          .142
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Kavrin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subbotin</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The sampling method preserving interclass boundaries</article-title>
          .
          <source>In: CEUR Workshop Proceedings of the Second International Workshop on Computer Modeling and Intelligent Systems (CMIS-2019)</source>
          , vol.
          <volume>2353</volume>
          , pp.
          <fpage>664</fpage>
          -
          <lpage>673</lpage>
          . CEUR-WS (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Kuncheva</surname>
            ,
            <given-names>L. I.</given-names>
          </string-name>
          :
          <article-title>Combining pattern classifiers: methods and algorithms</article-title>
          . John Wiley &amp; Sons, Inc.,
          <string-name>
            <surname>Hoboken</surname>
          </string-name>
          , New Jersey (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Polikar</surname>
          </string-name>
          , R.:
          <article-title>Ensemble based systems in decision making</article-title>
          .
          <source>In: IEEE Circuits and Systems Magazine</source>
          ,
          <volume>6</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>21</fpage>
          -
          <lpage>45</lpage>
          . IEEE (
          <year>2006</year>
          ). doi:
          <volume>10</volume>
          .1109/
          <string-name>
            <surname>mcas</surname>
          </string-name>
          .
          <year>2006</year>
          .1688199
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Valentini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , Re.,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Ensemble methods: a review</article-title>
          .
          <source>In: Advances in Machine Learning and Data Mining for Astronomy</source>
          , pp.
          <fpage>563</fpage>
          -
          <lpage>594</lpage>
          . Chapman &amp;
          <string-name>
            <surname>Hall</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Sammut</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Webb</surname>
          </string-name>
          , G.:
          <article-title>Encyclopedia of Machine Learning</article-title>
          . Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Bagging predictors</article-title>
          .
          <source>In: Machine Learning</source>
          , vol.
          <volume>24</volume>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>140</lpage>
          . Springer (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Kotsiantis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          :
          <article-title>Bagging and boosting variants for handling classifications problems: a survey</article-title>
          .
          <source>In: The Knowledge Engineering Review</source>
          ,
          <volume>29</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>78</fpage>
          -
          <lpage>100</lpage>
          . Cambridge University Press (
          <year>2013</year>
          ). doi:
          <volume>10</volume>
          .1017/s0269888913000313
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , H.:
          <article-title>Ensemble Methods Foundations and Algorithms</article-title>
          . Chapman &amp;
          <string-name>
            <surname>Hall</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>