<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Selection of aggregated classifiers for the prediction of the state of technical objects</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>D A Zhukov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V N Klyachkin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V R Krasheninnikov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yu E Kuvayskova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ulyanovsk State Technical University</institution>
          ,
          <addr-line>Severny Venets street, 32, Ulyanovsk, Russia, 432027</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>361</fpage>
      <lpage>367</lpage>
      <abstract>
        <p>The basic data in the problem of predicting the health of a technical object from known indicators of its operation are the results of earlier state estimates obtained during previous service. The problem can be solved by machine learning methods and reduces to a binary classification of the object's states. The research was conducted in the Matlab environment using ten different basic machine learning methods: the naive Bayes classifier, neural networks, bagging of decision trees and others. To improve the quality of health-state identification, we suggest using aggregated methods that combine several basic classifiers. This paper addresses the selection of the best aggregated classifier. The effectiveness of the approach has been confirmed by numerous tests on real-world objects.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The state of a technical object can be forecast by various methods. The most commonly used approach is realistic simulation with time-series models [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1-4</xref>
        ]. However, forecasting often reduces to dividing the object states over the target horizon into operating ones, i.e. capable of fulfilling the intended functions, and faulty ones. In this case the diagnostics is based on the object's operation record and on measurements of indirect indicators of its functioning.
      </p>
      <p>
        For example, engine performance is diagnosed by the fuel consumption rate, the gas temperature, the noise and vibration level, the exhaust gas composition, the clearance between the cylinder and the piston, the clearance between the crankshaft journals and bearings, and some other indicators [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. There is a risk of a false alarm (when an operating object is classified as faulty) and, vice versa, of a missed fault (when a faulty object is classified as operating).
      </p>
      <p>
        The basic data are a priori information about the state of the object collected during previous operation: for given values of the controlled indicators, the technical system is known to be either operating or faulty. It is assumed that there is some unknown dependence between the indicators of the object's functioning and its states. Based on the basic data, this dependence must be restored, i.e. an algorithm must be built that gives a sufficiently reliable answer about the state of the object for a given set of functioning indicators. This is a task of machine learning, or learning from examples (supervised learning). Binary classification, i.e. the division of object states into two categories [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6-8</xref>
        ], is a special case of this task.
      </p>
      <p>To assess the quality of the built algorithm in terms of its forecasting ability, the original sample is divided into two disjoint subsets. The first subset is the learning sample used for the learning task itself (which usually comes down to estimating the parameters of the chosen algorithm's model). The second subset is the control (test) sample, which is not used for learning; it serves to estimate the forecast error, which characterizes the quality of learning. With cross-validation, the sample is divided into N parts (in practice, usually N = 5 or N = 10): N - 1 parts are used for learning and the remaining part for control, and all possible options are tried in turn.</p>
      <p>
        Machine learning methods are actively used in all kinds of activities, and many different approaches to classification exist: classical statistical methods (Bayesian classifiers, discriminant analysis, logistic regression), methods developed specifically within machine learning (the support vector machine, neural networks), compositional methods (bagging, boosting) and others. The issue is that it is impossible to determine in advance which of the selected methods will provide the best solution of the task. That is why many different methods, or their combinations, are usually tried, and the decision on which to apply is made from the quality functional evaluated on the control sample. In works [
        <xref ref-type="bibr" rid="ref10 ref9">9-10</xref>
        ], an aggregated approach, applying a combination of several classification methods, is suggested to improve the forecasting quality. These results were confirmed experimentally, including for technical diagnostics tasks [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11-13</xref>
        ].
      </p>
      <p>The purpose of this study is to build algorithms for selecting the best aggregated classifier.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Using basic classifiers</title>
      <p>The most widely known indicator for assessing the quality of binary classification is the proportion of correct answers in the control sample, Q/N, where Q is the number of correctly classified objects from the control sample and N is the overall control sample size. The opposite characteristic, the proportion (or percentage) of errors in the control sample, is used more often.</p>
      <p>Sometimes the error dispersion (the mean square deviation of the true probability of the operating state in the r-th test, P(Y_r), from its forecast value P̂(X_r)) is used to assess the quality of the classification: σ² = (1/N) ∑_{r=1}^{N} (P(Y_r) − P̂(X_r))².</p>
      <p>
        If the classes are unbalanced (when there are far more operating states than faulty ones), the
proportion of errors cannot be used for a reliable quality assessment of the classification [
        <xref ref-type="bibr" rid="ref15 ref16">15-16</xref>
        ]. Precision and recall are far more informative:
P = tp / (tp + fp),  R = tp / (tp + fn),
where tp is the number of properly classified operating states, fp is the number of faulty states misclassified as operating, and fn is the number of operating states misclassified as faulty. Based on these two indicators a uniform criterion can be formed.
      </p>
      <p>F = 2PR / (P + R)
is the harmonic mean of precision and recall (the F-measure): the closer the value of F is to one, the higher the quality of the classification.</p>
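These formulas are straightforward to compute; a minimal sketch with illustrative counts (not data from the experiment below):

```python
def precision_recall(tp, fp, fn):
    """P = tp / (tp + fp), R = tp / (tp + fn)."""
    return tp / (tp + fp), tp / (tp + fn)

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (the F-measure)."""
    p, r = precision_recall(tp, fp, fn)
    return 2 * p * r / (p + r)

# Illustrative counts: 80 correctly recognised operating states,
# 10 false alarms, 10 missed operating states -> P = R = F = 8/9.
f = f_measure(80, 10, 10)
```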
      <p>
        The area under the ROC curve (receiver operating characteristic), AUC, can also be selected as the quality functional [
        <xref ref-type="bibr" rid="ref16 ref17 ref18 ref19 ref20">16-20</xref>
        ]. The ROC curve is formed by taking the values fp(c) on the x-axis and the values tp(c) on the y-axis, where c is the threshold. The area under the ROC curve makes it possible to assess the model as a whole, without reference to a particular threshold. The AUC-ROC criterion is resistant to the influence of unbalanced classes. It can be interpreted as the probability that the predicted value for a randomly selected object from class 1 will be closer to 1 than that for a randomly selected object from class 0. Such curves, plotted in the Matlab system for the diagnostics example considered below, are shown in Fig. 1. In this case three methods of binary classification were used: logistic regression, the support vector machine and the naïve Bayesian classifier.
      </p>
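The probabilistic interpretation above translates directly into a (quadratic-time, illustrative) computation of AUC: count the pairs in which an object of class 1 is scored higher than an object of class 0, with ties counted as one half:

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a randomly selected object of class 1
    receives a higher predicted probability than one of class 0 (ties = 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Illustrative scores: 5 of the 6 (positive, negative) pairs are ordered
# correctly, so AUC = 5/6.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3])
```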
      <p>As a numerical illustration we considered a water treatment system. We had the results of 348 tests on eight quality indicators of drinking water treatment. The system was faulty in 47 cases (when at least one water quality indicator was beyond the limits). Since the division of the basic data into the learning sample and the control sample was carried out randomly, we repeated the tests 50 times.</p>
      <p>We used the Matlab-package for tests. In Table 1 there are averaged values of the F-criterion and
the area under the ROC-curve AUC for those five methods of the computer-aided learning where these
values were maximum. Estimates suggest that the correlation between these two indicators was
nonsignificant at the significance level 0.05. If the F-criterion is the same for selected classifiers, AUC
values can be used for selection of the best classification method.</p>
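This selection rule (maximise the F-criterion, break ties by AUC) can be expressed as a comparison of (F, AUC) pairs; the method names and numbers below are illustrative, not the values of Table 1:

```python
# Hypothetical (F, AUC) pairs for three classifiers; tuples compare
# element-wise, so F decides first and AUC breaks ties.
results = {
    "decision tree bagging": (0.87, 0.95),
    "gradient boosting":     (0.87, 0.93),
    "RUSBoost":              (0.80, 0.78),
}
best = max(results, key=lambda m: results[m])
```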
      <p>It is apparent that decision tree bagging showed the best results in the considered example. The F-criterion discrepancy between the best and the worst (0.801 for the RUSBoost method) results was 8.7%; for AUC it was 21.5%.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Aggregated classifiers</title>
      <p>
        The aggregated approach was suggested for handling credit scoring tasks [
        <xref ref-type="bibr" rid="ref10 ref9">9-10</xref>
        ]. Later it was used for the technical diagnostics of system states. Compositional approaches (bagging, boosting) build an ensemble from one and the same classification method, trained either on various sample subsets or so as to compensate the errors of the previous iteration. Combining several different classification methods trained on the learning sample is also of interest. In this case, to achieve the best result, the following issues must be resolved: which learning methods should be used? How can these methods be combined? How should a consistent decision about the operating state of the object be made from the decisions of the individual methods?
      </p>
      <p>We will use exhaustive enumeration of the sets formed from H basic methods. For example, if H = 2, we get three sets: the two basic ones and one aggregated; if H = 3, there will be 7 sets: three basic ones, three aggregates of pairs of basic methods, and one aggregate of all three basic methods. It is easy to see that in the general case the number of sets is 2^H - 1. To make a consistent decision about the operating state of the object from the decisions of the individual classification methods, we will consider aggregation of the results by the average value, by the median, and by a voting procedure.</p>
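The enumeration of all non-empty sets of basic methods can be sketched with the standard library; with H methods the count is 2^H - 1, as stated above:

```python
from itertools import combinations

def all_method_sets(methods):
    """All non-empty subsets of the basic methods: 2**H - 1 sets in total."""
    sets = []
    for k in range(1, len(methods) + 1):
        sets.extend(combinations(methods, k))
    return sets

# H = 3 gives the 7 sets described in the text: three single methods,
# three pairs, and one set of all three.
sets3 = all_method_sets(["A", "B", "C"])
```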
      <p>Suppose P̂_K(X_r) is the probability that the r-th object is operating, determined by the K-th basic method, K = 1, ..., H. When aggregating by the average value,
P̂_AK avg(X_r) = (1/H) ∑_{K=1}^{H} P̂_K(X_r),
where P̂_AK avg(X_r) is the aggregated probability that the r-th object is operating.</p>
      <p>When aggregating by the median, the results of the basic methods in the set must first be ranked. If the number of basic methods H is odd, the probability that the r-th object is operating is
P̂_AK med(X_r) = P̂_((H+1)/2)(X_r).</p>
      <p>If the number of basic methods is even, the relevant probability is calculated as the half-sum of the two middle values.</p>
      <p>The result of the aggregated classification by the voting procedure is the average of the votes of the basic methods, where a basic method votes for the operating state if it determines it with a probability of, for example, not lower than 0.1 (P̂_K(X_r) ≥ 0.1); otherwise its vote is zero. Here P̂_K(X_r) is the probability that the r-th object is operating for the given values X_r of the functioning indicators. In other words, classification probabilities lower than 0.1 are treated as 0 and the rest are treated as 1, and the aggregated classification is built from these very values.</p>
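The three aggregation rules can be sketched as follows, applied to the probabilities produced by the H basic classifiers for one object (the probabilities below are illustrative):

```python
def aggregate_mean(probs):
    """Aggregation by the average value of the basic-method probabilities."""
    return sum(probs) / len(probs)

def aggregate_median(probs):
    """Aggregation by the median: the middle ranked value, or the half-sum
    of the two middle values when the number of methods is even."""
    ranked = sorted(probs)
    h = len(ranked)
    if h % 2 == 1:
        return ranked[h // 2]
    return (ranked[h // 2 - 1] + ranked[h // 2]) / 2

def aggregate_vote(probs, threshold=0.1):
    """Voting: probabilities below the threshold count as 0, the rest as 1,
    and the votes are averaged."""
    votes = [1.0 if p >= threshold else 0.0 for p in probs]
    return sum(votes) / len(votes)

probs = [0.9, 0.7, 0.05]  # three hypothetical basic classifiers
```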
      <p>In this case, as mentioned above, the division of the basic data into the learning sample and the control sample is carried out randomly. That is why the structures of the aggregated classifiers turn out to be different. The question to be answered is which structure to select for making the final decision about the operating state of the object.</p>
      <p>As before, we repeated the tests 50 times. The control sample volume was kept the same (25%) and all eight functioning indicators were used. The corresponding values of the F-criterion for five options of every aggregation type are shown in Table 2.</p>
      <p>For example, the entry GrB+DTB+AB in the first line means that the aggregate of three basic classifiers, the gradient boosting (GrB), the decision tree bagging (DTB) and the AdaBoost boosting method (AB), was the best aggregation option by the average value under the F-criterion in this experiment. The number of classifiers included in the aggregates (Table 2) varies from two to six; in the general case an aggregate can include all the basic classifiers.</p>
      <p>Firstly, let us note that by the F-criterion every aggregated method turned out to be better than any basic one. Secondly, the values of the F-criterion for the aggregated methods do not diverge widely. Finally, it is worth noting that the best of the basic methods (the decision tree bagging) is included in the structure of all the aggregated classifiers.</p>
      <p>It is of interest to study the distribution of the F-criterion values. For the aggregation by the voting procedure we applied the following sequence of steps. We used the Statistica system to plot the normal probability curve and superimposed it on the histogram of the values of this criterion (Fig. 2). To check normality, we used the Shapiro-Wilk criterion, recommended for small samples (up to 50 tests). It is apparent that the distribution can be considered normal at the significance level 0.05. Similar results were obtained for the other classifiers (both basic and aggregated).</p>
      <p>The normality of the distribution makes it possible to use the standard approach to test the hypothesis that in the given example the aggregation does lead to an improvement of the diagnostics quality.</p>
      <p>We checked the null hypothesis of equality of the average F-criterion values for the aggregated and the basic classification methods (taking the decision tree bagging, the best basic method, as the comparison baseline). The alternative hypothesis was that the average value is higher when aggregating. Firstly, we compared the dispersions of the two samples by the Fisher criterion (the difference turned out to be statistically non-significant). Then we applied the Student criterion for equal dispersions. It was concluded that the null hypothesis should be rejected: the average value of the F-criterion when aggregating is higher than when using basic classifiers.</p>
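The comparison of means with pooled (equal) dispersions can be sketched as follows; the two samples of F-criterion values below are illustrative, not the data of Table 2:

```python
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Two-sample Student t statistic under the equal-dispersions assumption
    (pooled variance), for H0: mean(a) = mean(b)."""
    na, nb = len(sample_a), len(sample_b)
    sp2 = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    return (mean(sample_a) - mean(sample_b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# Hypothetical F-criterion samples for an aggregated classifier and the best
# basic one; a large positive t rejects H0 in favour of the aggregation.
aggregated = [0.90, 0.91, 0.89, 0.92, 0.90]
basic      = [0.87, 0.88, 0.86, 0.88, 0.87]
t = pooled_t_statistic(aggregated, basic)
```

With 8 degrees of freedom the one-sided 5% critical value is about 1.86, so a t above that rejects the null hypothesis.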
      <p>As has already been noted, the values of the F-criterion in Table 2 do not differ much. We checked the hypothesis that increasing the number of basic classifiers in the aggregate structure beyond two does not significantly influence the value of the F-criterion. We divided the whole sample into two subsets: the first contains the data on aggregates consisting of only two components, and the second contains all the remaining values.</p>
      <p>Checking the hypothesis of equality of the average values in these subsets confirms it: the average value of the F-criterion does not change when the number of basic classifiers in the aggregate structure is expanded.</p>
      <p>A consequence of this result is that the computation time can be reduced dramatically. Instead of enumerating all aggregation options when searching for the maximum value of the F-criterion (with three aggregation methods and the 11 basic classification methods available in the Matlab package, 3·(2^11 - 1) = 6141 options), it is enough to enumerate only the options including two basic methods (3·11!/(2!·9!) = 165).</p>
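The two option counts quoted above follow directly from the combinatorics and are easy to verify:

```python
from math import comb

# 3 aggregation methods over every non-empty subset of 11 basic classifiers
# versus 3 aggregation methods over pairs of basic classifiers only.
full_search = 3 * (2 ** 11 - 1)   # all non-empty subsets
pairs_only = 3 * comb(11, 2)      # subsets of exactly two basic methods
```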
      <p>It is necessary to take into consideration one more circumstance. During all tests the aggregate
included the best basic method (the decision tree bagging). Taking this fact into consideration gives an
opportunity to scale back the number of options being enumerated by 30.</p>
      <p>However, it is necessary to bear in mind that the given results are obtained in tests of only one
technical object. Nevertheless, this experiment shows that the suggested approach shall be approbated
for the diagnostics of any other system being studied.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>To assess the operating state of the object it is recommended to select the simplest aggregated classifier with a sufficiently high value of the F-criterion. In the given example this is the aggregation by the average value of the decision tree bagging and AdaBoost, or of the decision tree bagging and the gradient boosting (besides their sufficiently high values of the F-criterion, these combinations occur most often in Table 2).</p>
      <p>Besides the water treatment system, the considered approach was also used to assess the faulty state of a hydroelectric installation by its vibration level and to monitor a mechanical processing operation, where it showed similar results.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was carried out with the financial support of the Russian Foundation for Basic Research (RFBR) and the Government of Ulyanovsk region, project 18-48-730001.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Gaskarov</surname>
            <given-names>D V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Golinkevich</surname>
            <given-names>T A</given-names>
          </string-name>
          and
          <string-name>
            <surname>Mozgalevskij</surname>
            <given-names>A V</given-names>
          </string-name>
          <year>1974</year>
          <article-title>Technical condition and reliability prediction of electronic equipment</article-title>
          (Moscow: Soviet radio) p
          <fpage>224</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Klyachkin</surname>
            <given-names>V N</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bubyr'</surname>
            <given-names>D S</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Forecasting of technical object state based on piecewise linear regressions</article-title>
          <source>Radioengineering</source>
          <volume>7</volume>
          <fpage>137</fpage>
          -
          <lpage>140</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Krasheninnikov</surname>
            <given-names>V R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuvayskova</surname>
            <given-names>Yu E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shunina</surname>
            <given-names>Yu S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Klyachkin</surname>
            <given-names>V N</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Updating of models predicting objects' state as time series systems and multivariate classifier</article-title>
          <source>Herald of Computer and Information Technologies</source>
          <volume>6</volume>
          <fpage>11</fpage>
          -
          <lpage>16</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Krasheninnikov</surname>
            <given-names>V R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klyachkin</surname>
            <given-names>V N</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kuvayskova</surname>
            <given-names>Yu E</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Models updating for technical objects state forecasting</article-title>
          <source>3rd Russian-Pacific Conf. on Computer Technology and Applications (RPC), IEEE Xplore 1-4</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Birger</surname>
            <given-names>I A</given-names>
          </string-name>
          <year>1978</year>
          <source>Technical Diagnostics</source>
          (Moscow: Engineering) p
          <fpage>240</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Witten</surname>
            <given-names>I H</given-names>
          </string-name>
          and
          <string-name>
            <surname>Frank</surname>
            <given-names>E</given-names>
          </string-name>
          <year>2005</year>
          <source>Data mining: practical machine learning tools and techniques</source>
          (San Francisco: Morgan Kaufmann Publishers) p
          <fpage>525</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Merkov</surname>
            <given-names>A B</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Pattern recognition. Introduction to statistical learning methods</article-title>
          (Moscow: Editorial URSS) p
          <fpage>256</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Voronina</surname>
            <given-names>V V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miheev</surname>
            <given-names>A V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yarushkina</surname>
            <given-names>N G</given-names>
          </string-name>
          and
          <string-name>
            <surname>Svyatov</surname>
            <given-names>K V</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Machine learning: theory and practice</article-title>
          (Ulyanovsk: UlSTU) p
          <fpage>290</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Yumaganov</surname>
            <given-names>A S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>V V</given-names>
          </string-name>
          <year>2017</year>
          <article-title>A method of searching for similar code sequences in executable binary files using a featureless approach</article-title>
          <source>Computer Optics</source>
          <volume>41</volume>
          (
          <issue>5</issue>
          )
          <fpage>756</fpage>
          -
          <lpage>764</lpage>
          DOI: 10.18287/2412-6179-2017-41-5-756-764
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Kropotov</surname>
            <given-names>Yu A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Proskuryakov</surname>
            <given-names>A Yu</given-names>
          </string-name>
          and
          <string-name>
            <surname>Belov</surname>
            <given-names>A A</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Method for forecasting changes in time series parameters in digital information management systems</article-title>
          <source>Computer Optics</source>
          <volume>42</volume>
          (
          <issue>6</issue>
          )
          <fpage>1093</fpage>
          -
          <lpage>1100</lpage>
          DOI: 10.18287/2412-6179-2018-42-6-1093-1100
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Klyachkin</surname>
            <given-names>V N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuvayskova</surname>
            <given-names>Yu E</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zhukov</surname>
            <given-names>D A</given-names>
          </string-name>
          <year>2017</year>
          <article-title>The use of aggregate classifiers in technical diagnostics, based on machine learning</article-title>
          <source>CEUR Workshop Proc.</source>
          <volume>1903</volume>
          <fpage>32</fpage>
          -
          <lpage>35</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Maksimov</surname>
            <given-names>A I</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gashnikov</surname>
            <given-names>M V</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Adaptive interpolation of multidimensional signals for differential compression</article-title>
          <source>Computer Optics</source>
          <volume>42</volume>
          (
          <issue>4</issue>
          )
          <fpage>679</fpage>
          -
          <lpage>687</lpage>
          DOI: 10.18287/2412-6179-2018-42-4-679-687
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Kuvayskova</surname>
            <given-names>Yu E</given-names>
          </string-name>
          <year>2017</year>
          <article-title>The prediction algorithm of the technical state of an object by means of fuzzy logic inference models</article-title>
          <source>Procedia Engineering</source>
          <volume>201</volume>
          <fpage>767</fpage>
          -
          <lpage>772</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Voroncov</surname>
            <given-names>K V</given-names>
          </string-name>
          URL: https://yadi.sk/i/FItIu6V0beBmF
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Sokolov</surname>
            <given-names>E A</given-names>
          </string-name>
          URL: https://github.com/esokolov/ml-course-hse/blob/master/2018-fall/lecturenotes/lecture04-linclass.pdf
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Davis</surname>
            <given-names>J</given-names>
          </string-name>
          and
          <string-name>
            <surname>Goadrich</surname>
            <given-names>M</given-names>
          </string-name>
          <year>2006</year>
          <article-title>The relationship between Precision-Recall and ROC curves</article-title>
          <source>Proc. of the 23rd int. conf. on Machine learning (Pittsburgh)</source>
          <fpage>233</fpage>
          -
          <lpage>240</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Klyachkin</surname>
            <given-names>V N</given-names>
          </string-name>
          and
          <string-name>
            <surname>Shunina</surname>
            <given-names>Yu S</given-names>
          </string-name>
          <year>2015</year>
          <article-title>System for borrowers' creditworthiness assessment and repayment of loans forecasting</article-title>
          <source>Herald of Computer and Information Technologies</source>
          <volume>11</volume>
          <fpage>45</fpage>
          -
          <lpage>51</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Neykov</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            <given-names>Jun S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Cai</surname>
            <given-names>Tianxi</given-names>
          </string-name>
          <year>2016</year>
          <article-title>On the Characterization of a Class of Fisher-Consistent Loss Functions and its Application to Boosting</article-title>
          <source>J. of Machine Learning Research</source>
          <volume>17</volume>
          (
          <issue>70</issue>
          )
          <fpage>1</fpage>
          -
          <lpage>32</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Wyner</surname>
            <given-names>A J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olson</surname>
            <given-names>Matthew</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bleich</surname>
            <given-names>Justin</given-names>
          </string-name>
          and
          <string-name>
            <surname>Mease</surname>
            <given-names>David</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers</article-title>
          <source>J. of Machine Learning Research</source>
          <volume>18</volume>
          (
          <issue>48</issue>
          )
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Chen</surname>
            <given-names>T</given-names>
          </string-name>
          and
          <string-name>
            <surname>Guestrin</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2016</year>
          <article-title>XGBoost: A Scalable Tree Boosting System</article-title>
          <source>Proc. of the 22nd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining 765-794</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>