<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Boosted surrogate models in evolutionary optimization?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin Holen</string-name>
          <email>martin@cs.cas.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computer Science, Academy of Sciences of the Czech Republic</institution>
          ,
          <addr-line>Pod Vod</addr-line>
        </aff>
      </contrib-group>
      <fpage>15</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>The paper deals with surrogate modelling, Indeed, the impossibility to compute analytically the a modern approach to the optimization of empirical ob- function values of such a function makes also an anajective functions. The approach leads to a substantial de- lytical computation of its gradient and second-order crease of time and costs of evaluation of the objective func- derivatives impossible, whereas measurement errors tion, a property that is particularly attractive in evolution- usually hinder obtaining su±ciently accurate estimaary optimization. In the paper, an extension of surrogate tes of the derivatives. modelling with regression boosting is proposed. Such an extension increases the accuracy of surrogate models, thus Like other methods relying solely on function valalso the agreement between results of surrogate modelling ues, evolutionary algorithms need the objective funcand results of the intended optimization of the original ob- tion to be evaluated in quite a large number of points. jective function. The proposed extension is illustrated on In the context of optimization of empirical objective a case study in the area of searching catalytic materials op- functions, this can be quite disadvantageous because timal with respect to their behaviour in a particular chem- the evaluation of such a function in the points formical reaction. A genetic algorithm developed speci¯cally for ing one generation of an evolutionary algorithm is ofthis application area is employed for optimization, multi- ten costly and time-consuming. Hence, the above menlayer perceptrons serve as surrogate models, and a method tioned advantages of using evolutionary algorithms for called AdaBoost.R2 is used for boosting. Results of the case the optimization of empirical objective functions are sgtautdeymcoledaerllliyngc.on¯rm the usefulness of boosting for surro- frequently counterbalanced by considerably high costs and time needed for the evaluation of such functions. An area, where the trade-o® between successful op1 Introduction timization and costly objective function evaluations plays a crucial role, is the computer-aided search for For more than two decades, evolutionary algorithms, new materials and chemicals optimal with respect to especially their most frequently encountered represen- certain properties [2]. Here, evolutionary algorithms tative { genetic algorithms, belong to the most suc- are used in more than 90 % of optimization tasks, cessful methods for solving di±cult optimization tasks and the rarely encountered alternatives are simulated [3, 11, 31, 32, 42]. The popularity of evolutionary algo- annealing [9, 22, 23], simplex method [17], and holorithms is to some extent due to their biological inspi- graphic search strategy [37, 38, 41], which also use soleration, which increases their comprehensibility out- ly function values, therefore needing a similarly high side computer science. Nevertheless, they share sev- number of objective function evaluations as evolutioneral purely mathematical properties of all stochastic ary algorithms. Testing a generation of materials or optimization methods, most importantly, the valuable chemicals typically needs hours to days of time and ability of to escape a local optimum and continue the costs hundreds to thousands euros. Therefore, the evosearch for a global one, and the restriction of the infor- lutionary optimization rarely runs for more than ten mation on which they rely to function values only. generations. Consequently, they do not need information about gra- The usual approach to decreasing the cost and dients or second-order partial derivatives, di®erently time of optimization of empirical objective functions to smooth optimization methods (such as steepest de- is to evaluate the objective function only sometimes scent, conjugate gradient methods, the popular Le- and to evaluate a suitable regression model of that venberg-Marquardt method, etc. ). This makes them function otherwise. The employed model is termed particularly attractive for the optimization of empiri- surrogate model of the empirical objective function, cal objective functions, the values of which cannot be and the approach is referred to as surrogate modelling. analytically computed, but have to be obtained ex- Needless to say, the time and costs needed to evaluperimentally, through some measurement or testing. ate a regression model are negligible compared to an ? The research reported in this paper has been supported empirical objective function. However, it must not be by the grant No. 201/08/1744 of the Grant Agency of forgotten that the ¯nal optimized function coincides the Czech Republic and partially supported by the Insti- with the original empirical objective function only in tutional Research Plan AV0Z10300504. some points, whereas in the remaining points it coin-</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        cides only with its surrogate model. Consequently, the that is known to be useful in general [
        <xref ref-type="bibr" rid="ref19 ref20">19, 20, 29</xref>
        ]. In
evoagreement between the results of surrogate modelling lutionary algorithms, most important for the progress
and the results of the intended optimization of the of the method are on the one hand points that best
inoriginal objective function depends on the accuracy of dicate the global optimum (typically through highest
the approximation of the original objective function values of the ¯tness function), on the other hand points
by the surrogate model. that most contribute to the diversity of the population.
      </p>
      <p>This paper suggests to increase the accuracy of sur- In the context of evolutionary optimization,
surrorogate models by means of boosting. Boosting is a pop- gate modelling has the following main steps:
ular approach to increasing the accuracy of classi¯ca- (i) Collecting an initial set of points in which the
obtion, and due to the success of classi¯cation boosting, jective function has already been empirically
evalalso several methods of regression boosting have al- uated. This can be the ¯rst generation or several
ready been proposed. However, so far no attempt has ¯rst generations of the evolutionary algorithm, but
been reported to combine regression boosting with sur- such points are frequently available in advance.
rogate modelling. Hence, the purpose of the research (ii) Approximating the objective function by a
surroreported in the paper is basically a proof of concept: to gate model, with the use of the set of all points in
extend surrogate modelling through the incorporation which it has been empirically evaluated.
of regression boosting, and to validate that extension (iii) Running the evolutionary algorithm for a
popuon several su±ciently complex case studies. One of lation considerably larger than is the desired
popthose case studies is described in the paper. ulation size, with the empirical objective function</p>
      <p>In the following section, basic principles of surro- replaced by the surrogate model.
gate modelling and its strategies in evolutionary opti- (iv) Forming the next generation of the desired size
mization are recalled, and important surrogate models as a subset of the large population obtained in
are listed. Section 3 recalls the principles of boosting the preceding step that includes points most
imand explains a particular method of regression boost- portant according to considered criteria for the
ing that will be employed later in a case study in mate- progress of optimization (such as indication of
glorials science. That case study is sketched and its main bal optimum, diversity).
results are presented in Section 4. (v) Empirically evaluating the objective function in
all points that belong to the next generation of
the desired size, and returning to step (ii).
2 Surrogate modelling Actually, the above steps (ii){(v) correspond to
only one possible strategy of surrogate modelling in
evolutionary optimization: the individual-based
control, sometimes also referred to as pre-selection [40].</p>
      <p>An alternative strategy to the steps (ii){(v) is to run
the algorithm for only the desired population size,
interleaving one generation/several generations in which
the original objective function is empirically evaluated
with a certain number of generations in which the
surrogate model is evaluated. This is the generation-based
control of surrogate modelling in evolutionary
optimization.</p>
      <p>For empirical objective functions, it is typical to be
highly nonlinear. Therefore, nonlinear regression
models should be used as surrogate models. They can be
basically divided into two large groups according to
whether the set of functions among which the
surrogate model has to be chosen has an explicit ¯nite
parametrization.</p>
      <p>
        Surrogate modelling is a general approach to the
optimization of costly objective functions in which the
evaluation of the objective function is restricted to
points that are considered to be most important for
the progress of the employed optimization method [
        <xref ref-type="bibr" rid="ref25 ref27 ref5">5,
25, 27, 30, 39, 40</xref>
        ]. It is most frequently encountered in
connection with the optimization of empirical
objective functions, but has been equally successfully
applied also to expensive optimization tasks in
engineering design in which the objective function is not
empirical, but its evaluation is connected with intensive
computations [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. In the context of computer-aided
search for new materials and chemicals optimal with
respect to certain properties, surrogate modelling can
be viewed as replacing real experiments with simulated
virtual experiments in a computer: such virtual
experiments are sometimes referred to as virtual
screening [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Although surrogate modelling is a general
optimization approach (cf. its application in the context
of conventional optimization in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]), it is most
frequently encountered in connection with evolutionary
algorithms. The reason is that in evolutionary
optimization, the approach leads to the approximation of
the landscape of the ¯tness function, i.e., to a method
1. So far, mostly parametric models have been used
for surrogate modelling. From the point of view of
their role in this context and/or their overall
importance, the following kinds of parametric
nonlinear regression models are most worth mentioning:
(i) Multilayer feed-forward neural networks, more
precisely, the nonlinear mappings computed
by such networks. Their attractiveness for
nonlinear regression in general and for surrogate
modelling in particular [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] is due to their
universal approximation capability, which
actually means that linear spaces of functions
computed by certain families of multilayer
feedforward neural networks are dense in some
general function spaces [
        <xref ref-type="bibr" rid="ref18 ref21 ref26">18, 21, 26</xref>
        ]. For
example, considering the most common
representative of such networks { multilayer
perceptrons, the linear space formed by all functions
computed by the family of perceptrons with
one hidden layer and in¯nitely smooth
activation functions is dense in the space Lp(¹)
of functions with the p-th power of absolute
value ¯nitely integrable with respect to a
¯nite measure ¹, in the space C(X) of functions
continuous on a compact X, and in Sobolev
spaces generalizing Lp(¹) to functions that are
di®erentiable up to a given order. In the
application domain of catalytic materials, from
which the case study presented in Section 4 is
taken, nearly all examples of regression
analysis published since mid 1990s rely on
multilayer feed-forward neural networks, typically
on multilayer perceptrons (Figure 1). In the
last edition of \Handbook of heterogeneous
catalysis", more than 20 such examples are
listed, as well as several additional, based on
other kinds of such networks { radial basis
function networks and piecewise-linear neural
networks [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Therefore, these three kinds of
neural networks are now brie°y recalled:
{ Multilayer perceptrons (MLPs) can have
an arbitrary number of hidden layers, and
the basis functions of their linear space
of computed functions are constructed by
means of sigmoidal activation functions,
such as logistic sigmoid, hyperbolic
tangent, or arctangent [
        <xref ref-type="bibr" rid="ref13">13, 43</xref>
        ].
{ Radial basis function (RBF) networks
always have only one hidden layer, and the
basis functions of their space of computed
functions are radial, i.e., the function value
depends only on the distance of the vector
of input values from some centre, speci¯c
to the function [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
{ Piecewise-linear neural networks are
simply MLPs with piecewise-linear activation
functions. Their linear space of computed
functions is dense only in C(X), but on
the other hand, they allow a
straightforward extraction of logical rules describing
the relationships between input and
output values of the network [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>
        (ii) Support vector regression based on positive
semi-de¯nite kernels [34, 36]. It is worth
mentioning that they generalize the above recalled
RBF networks, and also the historically ¯rst
kind of nonlinear regression { polynomial
regression.
(iii) Gaussian process regression [28] is listed here
also due to a relationship to radial basis
function networks, but most importantly due to
the fact that it has already been successfully
employed in surrogate modelling [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
2. Nonparametric regression models are, in general,
more °exible than parametric models, but the
°exibility is typically paid for by more extensive
computations. Therefore, their importance has been
increasing only during the last two decades,
following the increasing power of available
computers [
        <xref ref-type="bibr" rid="ref12 ref14">12, 14</xref>
        ]. Nevertheless, there is one noteworthy
exception:
(v) Regression trees have been successfully used
already since the early 1980s [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. They are
actually a modi¯cation of a classi¯cation
method, therefore the regression function is
piecewise-constant. That property accounts for
relatively low computational requirements of
regression trees, but also decreases their
°exibility, otherwise the main advantage of
nonparametric methods.
3
      </p>
    </sec>
    <sec id="sec-2">
      <title>Boosting regression models</title>
      <p>
        Boosting is a method of improving classi¯cation
accuracy that consists in developing the classi¯er
iteratively, and increasing the relative in°uence of the
training data that most contributed to errors in the
previous iterations on its development in the
subsequent iterations [33]. The usefulness of boosting for
classi¯cation has incited its extension to regression [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>Both for classi¯cation and for regression, the basic
approach to increasing the relative in°uence of particular
training data is re-sampling the training data accord- The errors used to asses the quality of the
boosting to a distribution that gives them a higher probabil- ing approximation are then called boosting errors, e.g.,
ity of occurrence. This is equivalent to re-weighting the boosting MSE, or boosting MAE, where MSE refers to
contributions of the individual training pairs (xj ; yj ), the mean squared error between the computed and
with higher weights corresponding to higher values of measured values, whereas MAE refers to the mean
the error measure. absolute error, i.e., to the mean Euclidean distance</p>
      <p>
        Since surrogate models are regression models, any between them. For simplicity, also the approximation
method for regression boosting (such as [
        <xref ref-type="bibr" rid="ref10 ref8">8, 10, 35</xref>
        ]) in the ¯rst iteration, F1, is called boosting
approximais suitable for them. In the following, the met- tion if boosting is performed, and the respective errors
hod AdaBoost.R2 will be explained in detail, pro- are then called boosting errors, although boosting
acposed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. tually does not introduce any modi¯cations in the ¯rst
      </p>
      <p>
        Similarly to other adaptive boosting methods, each iteration.
of the available pairs (x1; y1); : : : : : : ; (xp; yp) of input The above formulation of the method deals only
and output data is in the ¯rst iteration of with the case E¹i &lt; 0:5. For E¹i ¸ 0:5, the original
AdaBoost.R2 used exactly once. This corresponds to formulation of the method in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposes to stop the
re-sampling them according to the uniform probabil- boosting. However, that is not allowed if the stopping
ity distribution P1 with P1(x1) = p1 for j = 1; : : : ; p. criterion should be based on an independent set of
In addition, the weighted average error of the 1st iter- validation data. Indeed, the calculation of E¹i does not
ation is set to zero, E¹1 = 0. rely on any such independent data set, but it relies
      </p>
      <p>
        In the subsequent iterations (i ¸ 2), the following solely on the data employed to construct the
regressequence of steps is performed: sion model. A possible alternative for the case E¹i ¸ 0:5
is reinitialization, i.e., proceeding as in the 1st
itera1. A sample (»1; ´1); : : : ; (»p; ´p) is obtained through tion [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        re-sampling (x1; y1); : : : ; (xp; yp) according to the In connection with using feed-forward neural
netdistribution Pi¡1. works as surrogate models, it is important to be aware
2. Using (»1; ´1); : : : ; (»p; ´p) as training data, a re- of the di®erence between the iterations of boosting
gression model Fi is constructed. and the iterations of neural network training.
Boost3. A [
        <xref ref-type="bibr" rid="ref1">0,1</xref>
        ]-valued squared error vector Ei of Fi with ing iterates on a higher level, one iteration of boosting
respect to (x1; y1); : : : ; (xp; yp) is calculated as includes a complete training of an ANN, which can
proceed for many hundreds of iterations. Nevertheless,
Ei = (Ei(1); : : : ; Ei(p)) = both kinds of iterations are similar in the sense that
= ((Fi(x1) ¡ y1)2; : : : ; (Fi(xp) ¡ yp)2) : (1) starting with some iteration, over-training is present.
maxk=1;:::;p(Fi(xk) ¡ yk)2 Therefore, also over-training due to boosting can be
reduced through stopping in the iteration after which
4. The weighted average error of the i-th iteration is the error for an independent set of data ¯rst time
incalculated as creases. Moreover, cross-validation can be used to ¯nd
the iteration most appropriate for stopping.
      </p>
      <p>(2)
5. Provided E¹i &lt; 0:5 , the probability distribution for
re-sampling (x1; y1); : : : ; (xp; yp) is for k = 1; : : : ; p
updated according to</p>
      <p>The extension of surrogate modelling with boosting
will now be illustrated on a case study using data from
the investigation of catalytic materials for the
highPi(xk; yk) = temperature synthesis of hydrocyanic acid. That
investigation and its results have been recently described
= Pi¡1(xk; yk) ³ 1¡E³¹Ei¹iE¹´i(1´¡(E1i¡(kE)i)(k)) : (3) ienxp[2er4i]m.Ietnhtsasinbeaenciprceurfloarrm4e8d-cthharnonueglhrheaigcht-otrh.rIonugmhpoustt
Pip=1 Pi¡1(xk; yk) 1¡E¹i of those experiments, the composition of the materials
was designed by means of a genetic algorithm
devel6. The boosting approximation in the i-th iteration is oped speci¯cally for heterogeneous catalysis [44]. More
set to the median of the approximations F1; : : : ; Fi precisely, the algorithm was running for 7 generations
with respect to the probability distribution of population size 92, and in addition 52 other
cataµ 1 ¡E¹1E¹1 ; : : : ; 1 ¡E¹iE¹i ¶ : (4)
ltayilgysattsticewdmi.taChtoemnriasaenlqsuuawelelnyrteldyg,easditaghtneaerdeadbc.ooumtpaolstiotgioenthwerer6e96invcaest-k=1</p>
      <p>p
E¹i = 1 X Pi(xk; yk)Ei(k):
p
4</p>
    </sec>
    <sec id="sec-3">
      <title>Case study in materials science</title>
      <p>
        The composition and preparation of the investi- as test data was calculated, and averaged over all the
gated catalytic materials and the conditions in which 604 folds. The criterion according to which boosting is
they had been tested have been in detail described considered useful to an architecture was: the average
in [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Here, only the independent and dependent boosting MSE in the 2nd iteration has to be lower than
variables are recalled, the latter corresponding to the in the 1st iteration. The iteration till which the
averconsidered possible objective functions: age boosting MSE continuously decreased was then
taken as the ¯nal iteration of boosting.
{ independent variables : material used as support, According to that criterion, boosting was useful
and proportions of the 10 metal additives Y, La, to 9 from the 12 considered architectures with one
Mo, Re, Ir, Ni, Pt, Zn, Ag, Au (an 11th metal, Zr, hidden layer and to 65 from the 78 considered
archiwas left out due to the fact that the proportions tectures with two hidden layers. To validate the most
of all active compounds sum up to 100 %); promising results of the investigation of the
useful{ dependent variables, i.e., objective functions: con- ness of boosting in our case study, the data from the
versions of CH4 and NH3, and yield of HCN. 7th generation of the genetic algorithm were used. The
validation included the 5 architectures that were most
promising for boosting from the point of view of the
lowest boosting MSE on test data in the ¯nal iteration.
      </p>
      <p>These were the architectures (14,10,6,3), (14,14,8,3),
(14,13,5,3), (14,10,4,3) and (14,11,3), for which the
¯nal iterations of boosting were 32, 29, 31, 19 and 3,
respectively. For each of them, the validation proceeded
as follows:</p>
      <p>
        As the surrogate model, MLPs were employed, in
accordance with their leading role among nonlinear
regression models in the area of catalytic
materials [
        <xref ref-type="bibr" rid="ref16 ref2">2, 16</xref>
        ]. Each considered neural network had
14 input neurons : 4 of them coding the material used
as support, the other 10 corresponding to the
proportions of the 10 metal additives belonging to
independent variables; output neurons were 3, corresponding
to the possible objective functions (Figure 1). 1. In each iteration up to the ¯nal boosting iteration
      </p>
      <p>The most appropriate MLP architectures were corresponding to the respective architecture, a
sinsearched by means of cross-validation, using only data gle MLP was trained with data about the 604
catabout catalysts from the 1.{6. generation of the ge- alytic materials considered during the architecture
netic algorithm and about the 52 catalysts with manu- search.
ally designed composition, thus altogether data about 2. Each of those MLPs was employed to approximate
604 catalytic materials. Data about catalysts from the the conversions of CH4 and NH3 and the yield of
7. generation were completely excluded and left out HCN for the 92 materials from the 7. generation
for validating the search results. To use as much in- of the genetic algorithm.
formation as possible from the available data, cross- 3. In each iteration, the medians with respect to the
validation was applied as the extreme 604-fold vari- probability distribution (4) of the approximations
ant, i.e., leave-1-out validation. The set of architec- of the two conversions and of the HCN yield
obtures within which the search was performed was de- tained up to that iteration were used as the
boostlimited by means of the heuristic pyramidal condition: ing approximations.
the number of neurons in a subsequent layer must not 4. From the conversions and the yield predicted by
increase the number of neurons in a previous layer. the boosting approximations, and from the
meaDenote nI , nh and nO the numbers of input, hidden sured values, the boosting MSE and MAE were
and output neurons, respectively, and nH1 and nH2 calculated for each MLP.
the numbers of neurons in the ¯rst and second
hidden layer, respectively. Then the pyramidal condition The boosting errors (MSE and MAE) are
sumreads: marized in Figure 2, whereas Figure 3 compares the
(i) for MLPs with 1 hidden layer: nI ¸ nH ¸ nO, in boosting approximations of the conversions of CH4
our case 14 ¸ nH ¸ 3 (12 architectures); and NH3 and of the yield of HCN in the 1st and ¯nal
(ii) for MLPs with 2 hidden layers: nI ¸ nH1 ¸ iteration with their measured values. The presented
nH2 ¸ nO, in our case 14 ¸ nH1 ¸ nH2 ¸ 3 results clearly con¯rm the usefulness of boosting for
(78 architectures). the ¯ve considered architectures. For each of them,
To investigate the usefulness of boosting in our boosting led to an overall decrease of both considered
case study, the same data were used and the same error measures, the MSE and MAE, on new data from
set of architectures was considered as for architecture the 7th generation of the genetic algorithm. Moreover,
search. In each iteration, a leave-1-out validation was the decrease of the MSE (which is the measure
emperformed, in the way brie°y outlined in the preceding ployed during the investigation of the usefulness of
section: The mean squared error of the performance of boosting) is uninterrupted or nearly uninterrupted till
the catalytic materials serving in the individual folds the ¯nal boosting iteration. On the other hand, the
scatter plots in Figure 3 do not indicate any apparent
di®erence between the e®ect of boosting on the three
properties employed as catalyst performance measures
in our case study { conversion of CH4, conversion of
NH3, and yield of HCN. Hence, the performed
validation con¯rms the usefulness of boosting irrespectively
of which of those performance measures is considered.
5</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>The paper dealt with surrogate modelling, a
modern approach to the optimization of empirical
objective functions, which is particularly attractive in
evolutionary optimization. It proposed to extend
surrogate modelling with regression boosting, to increase
the accuracy of surrogate models, thus also the
agreement between results of surrogate modelling and
results of the intended optimization of the original
objective function. Needless to say, regression boosting
is not new, though it is less common than the popular
classi¯cation boosting. However, novel is its
combination with surrogate models, which adds the advantage
of increased accuracy to the main advantage of
surrogate modelling { decreasing the time and costs of
optimization of empirical objective functions.</p>
      <p>Theoretical principles of both surrogate modelling
and boosting are known, therefore the main purpose of
the reported research was to validate the feasibility of
the proposed extension of surrogate modelling on
several su±ciently complex case studies, one of which was
sketched in this paper. The presented case study
results clearly con¯rm the usefulness of boosting. For the
¯ve most promising architectures, boosting leads to
an overall decrease of both considered error measures,
MSE and MAE, on new data from the 7th generation
of the genetic algorithm. Moreover, the decrease of the
MSE (which is the boosting error employed during the
investigation of the usefulness of boosting) is
uninterrupted or nearly uninterrupted till the ¯nal boosting
iteration. On the other hand, the scatter plots in
Figure 3 do not indicate any apparent di®erence between
the e®ect of boosting on the three catalyst properties
considered as possible objective functions in our case
study { conversion of CH4, conversion of NH3, and
yield of HCN. Hence, the performed validation
con¯rms the usefulness of boosting irrespectively of which
of these objective functions is selected.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>H.</given-names>
            <surname>Altin</surname>
          </string-name>
          <article-title>»cay: Optimal resampling and classi¯er prototype selection in classi¯er ensembles using genetic algorithms</article-title>
          .
          <source>Pattern Analysis and Applications 7</source>
          ,
          <year>2004</year>
          ,
          <volume>285</volume>
          {
          <fpage>295</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M.</given-names>
            <surname>Baerns</surname>
          </string-name>
          and
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Holen·a: Combinatorial Development of Solid Catalytic Materials</article-title>
          .
          <article-title>Design of HighThroughput Experiments, Data Analysis, Data Mining</article-title>
          . World Scienti¯c, Singapore,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>T.</surname>
          </string-name>
          Bartz-Beielstein: Experimental Research in Evolutionary Computation. Springer Verlag, Berlin,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.H.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.A.</given-names>
            <surname>Olshen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.J.</given-names>
            <surname>Stone</surname>
          </string-name>
          :
          <article-title>Classi¯cation and Regression Trees</article-title>
          . Wadsworth, Belmont,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Brooker</surname>
          </string-name>
          , J. Dennis,
          <string-name>
            <given-names>P.D.</given-names>
            <surname>Frank</surname>
          </string-name>
          , D.B. Sera¯ni,
          <string-name>
            <given-names>V.</given-names>
            <surname>Torczon</surname>
          </string-name>
          , and
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Trosset: A rigorous framework for optimization by surrogates</article-title>
          .
          <source>Structural and Multidisciplinary Optimization</source>
          ,
          <volume>17</volume>
          ,
          <year>1998</year>
          ,
          <volume>1</volume>
          {
          <fpage>13</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>D. BuÄche</surname>
            ,
            <given-names>N.N.</given-names>
          </string-name>
          <string-name>
            <surname>Schraudolph</surname>
          </string-name>
          , and P. Koumoutsakos:
          <article-title>Accelerating evolutionary algorithms with gaussian process ¯tness function models</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics</source>
          , Part C:
          <article-title>Applications</article-title>
          and Reviews 35,
          <year>2005</year>
          ,
          <volume>183</volume>
          {
          <fpage>194</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>M.D.</surname>
          </string-name>
          <article-title>Buhmann: Radial Basis Functions: Theory and Implementations</article-title>
          . Cambridge University Press, Cambridge,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>H.</given-names>
            <surname>Drucker</surname>
          </string-name>
          :
          <article-title>Improving regression using boosting techniques</article-title>
          . In A.J.C. Sharkey, editor,
          <source>Proceedings of the 14th International Conference on Machine Learning</source>
          , Springer Verlag, London,
          <year>1997</year>
          ,
          <volume>107</volume>
          {
          <fpage>115</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Eftaxias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Font</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fortuny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Giralt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fabregat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Stber</surname>
          </string-name>
          <article-title>: Kinetic modelling of catalytic wet air oxidation of phenol by simulated annealing</article-title>
          .
          <source>Applied Catalysis B: Environmental</source>
          <volume>33</volume>
          ,
          <year>2001</year>
          ,
          <volume>175</volume>
          {
          <fpage>190</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. J. Friedman:
          <article-title>Greedy function approximation: A gradient boosting machine</article-title>
          .
          <source>Annals of Statistics 29</source>
          ,
          <year>2001</year>
          ,
          <volume>1189</volume>
          {
          <fpage>1232</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. D. Goldberg:
          <article-title>Genetic Algorithms in Search, Optimization and Machine Learning</article-title>
          .
          <source>Addison-Wesley</source>
          , Reading,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. L. GyÄor¯,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kohler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krzyzak</surname>
          </string-name>
          , and H. Walk: 28. E. Rasmussen and
          <string-name>
            <given-names>C.</given-names>
            <surname>Williams</surname>
          </string-name>
          <article-title>: Gaussian Process for A Distribution-Free Theory of Nonparametric Regres- Machine Learning</article-title>
          . MIT Press, Cambridge,
          <year>2006</year>
          . sion. Springer Verlag, Berlin,
          <year>2002</year>
          . 29. A.
          <article-title>Ratle: Accelerating the convergence of evolution-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>M.T. Hagan</surname>
            ,
            <given-names>H.B.</given-names>
          </string-name>
          <string-name>
            <surname>Demuth</surname>
            , and
            <given-names>M.H.</given-names>
          </string-name>
          <string-name>
            <surname>Beale</surname>
          </string-name>
          :
          <article-title>Neural ary algorithms by ¯tness landscape approximation</article-title>
          .
          <source>In Network Design. PWS Publishing</source>
          , Boston,
          <year>1996</year>
          .
          <string-name>
            <given-names>A.E.</given-names>
            <surname>Eiben</surname>
          </string-name>
          , T. BÄack, M. Schoenauer, and H.
          <string-name>
            <surname>-P.</surname>
          </string-name>
          Schwe-
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>T.J. Hastie</surname>
            and
            <given-names>R.J.</given-names>
          </string-name>
          <string-name>
            <surname>Tibshirani</surname>
          </string-name>
          : Generalized Additive fel, editors,
          <source>Parallel Problem Solving from Nature</source>
          , Models. Chapman &amp; Hall, Boca Raton,
          <year>1990</year>
          . Springer Verlag, Berlin,
          <year>1998</year>
          ,
          <volume>87</volume>
          {
          <fpage>96</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. M.
          <article-title>Holen·a: Piecewise-linear neural networks and their 30. A. Ratle: Kriging as a surrogate ¯tness landscape in relationship to rule extraction from data</article-title>
          .
          <source>Neural Com- evolutionary optimization. Arti¯cial Intelligence for putation 18</source>
          ,
          <year>2006</year>
          ,
          <volume>2813</volume>
          {
          <fpage>2853</fpage>
          .
          <string-name>
            <surname>Engineering</surname>
            <given-names>Design</given-names>
          </string-name>
          ,
          <source>Analysis and Manufacturing</source>
          <volume>15</volume>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. M.
          <article-title>Holen·a and</article-title>
          <string-name>
            <surname>M. Baerns:</surname>
          </string-name>
          Computer-aided strategies
          <year>2001</year>
          ,
          <volume>37</volume>
          {
          <fpage>49</fpage>
          .
          <article-title>for catalyst development</article-title>
          . In G. Ertl, H. KnÄozinger, 31.
          <string-name>
            <given-names>C.R.</given-names>
            <surname>Reeves</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.E.</given-names>
            <surname>Rowe</surname>
          </string-name>
          <article-title>: Genetic Algorithms: PrinF</article-title>
          . SchuÄth, and J. Eitkamp, editors,
          <source>Handbook of Het- ciples and Perspectives</source>
          . Kluwer Academic Publishers, erogeneous Catalysis, Wiley-VCH,
          <year>Weinheim</year>
          ,
          <year>2008</year>
          . Boston,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>A.</given-names>
            <surname>Holzwarth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Denton</surname>
          </string-name>
          , H. Zantho®, and 32. R. Schaefer:
          <article-title>Foundation of Global Genetic OptimizaC</article-title>
          . Mirodatos: Combinatorial approaches to het- tion. Springer Verlag, Berlin,
          <year>2007</year>
          .
          <article-title>erogeneous catalysis: Strategies and perspectives for 33</article-title>
          . R. Schapire:
          <article-title>The strength of weak learnability</article-title>
          .
          <source>Maacademic research. Catalysis Today 67</source>
          ,
          <year>2001</year>
          ,
          <volume>309</volume>
          {
          <fpage>318</fpage>
          .
          <source>chine Learning 5</source>
          ,
          <year>1990</year>
          ,
          <volume>197</volume>
          {
          <fpage>227</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. K. Hornik:
          <article-title>Approximation capabilities of multilayer 34. B. SchoÄlkopf and A</article-title>
          .
          <string-name>
            <surname>J. Smola</surname>
          </string-name>
          :
          <article-title>Learning with Kernels. neural networks</article-title>
          .
          <source>Neural Networks</source>
          <volume>4</volume>
          ,
          <year>1991</year>
          ,
          <volume>251</volume>
          {
          <fpage>257</fpage>
          . MIT Press, Cambridge,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          :
          <article-title>A comprehensive survery of ¯tness approxima-</article-title>
          35. D.L. Shrestha:
          <article-title>Experiments with AdaBoost.RT, an imtion in evolutionary computation. Soft Computing 9, proved boosting scheme for regression</article-title>
          .
          <source>Neural Compu2005</source>
          ,
          <volume>3</volume>
          {
          <fpage>12</fpage>
          . tation 18,
          <year>2006</year>
          ,
          <volume>1678</volume>
          {
          <fpage>1710</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          , M. HuÄsken, M. Olhofer, and
          <string-name>
            <given-names>B.</given-names>
            <surname>Sendho</surname>
          </string-name>
          ®:
          <fpage>36</fpage>
          . I.
          <article-title>Steinwart and A. Christmann: Support Vector MaNeural networks for ¯tness approximation in evolu- chines</article-title>
          . Springer Verlag, New York,
          <year>2008</year>
          .
          <article-title>tionary optimization</article-title>
          . In Y. Jin, editor,
          <source>Knowledge 37. A</source>
          .
          <string-name>
            <surname>Tompos</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          <string-name>
            <surname>Margitfalvi</surname>
          </string-name>
          , E. T¯rst, L. V¶egva¶ri, Incorporation in Evolutionary Computation, Springer M.A.
          <string-name>
            <surname>Jaloull</surname>
            ,
            <given-names>H.A.</given-names>
          </string-name>
          <string-name>
            <surname>Khalfalla</surname>
          </string-name>
          , and
          <string-name>
            <surname>M.M. Elgarni</surname>
          </string-name>
          : DeVerlag, Berlin,
          <year>2005</year>
          ,
          <volume>281</volume>
          {
          <fpage>306</fpage>
          .
          <article-title>velopment of catalyst libraries for total oxidation of</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>P.C. Kainen</surname>
            , V. Kºurkova¶, and
            <given-names>M.</given-names>
          </string-name>
          <article-title>Sanguineti: Esti- methane: A case study for combined application of mates of approximation rates by gaussian radial-basis "holographic research strategy and arti¯cial neural netfunctions. In Adaptive and Natural Computing Algo- works" in catalyst library design</article-title>
          .
          <source>Applied Catalysis A: rithms</source>
          , Springer Verlag, Berlin,
          <year>2007</year>
          ,
          <volume>11</volume>
          {
          <fpage>18</fpage>
          . General 285,
          <year>2005</year>
          ,
          <volume>65</volume>
          {
          <fpage>78</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Ding</surname>
          </string-name>
          : A 38. A.
          <string-name>
            <surname>Tompos</surname>
            , L. V¶gva¶ri, E. T¯rst, and
            <given-names>J.L.</given-names>
          </string-name>
          <article-title>Margitsimulated annealing study of Si, Al distribution in the falvi: Assessment of predictive ability of arti¯cial omega framework</article-title>
          .
          <article-title>Journal of Molecular Catalysis A: neural networks using holographic mapping</article-title>
          .
          <source>CombiChemical 148</source>
          ,
          <year>1999</year>
          ,
          <volume>189</volume>
          {
          <fpage>195</fpage>
          . natorial Chemistry and High Throughput Screening
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>A.S. McLeod</surname>
            and
            <given-names>L.F.</given-names>
          </string-name>
          <string-name>
            <surname>Gladden</surname>
          </string-name>
          : Heterogeneous cat-
          <volume>10</volume>
          ,
          <year>2007</year>
          ,
          <volume>121</volume>
          {
          <fpage>134</fpage>
          .
          <article-title>alyst design using stochastic optimization algorithms</article-title>
          . 39. H.
          <string-name>
            <surname>Ulmer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Streichert</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Zell</surname>
          </string-name>
          <article-title>: Model-assisted m Journal of Chemical Information and Computer Sci- steady state evolution strategies</article-title>
          .
          <source>In GECCO 2003: Geence 40</source>
          ,
          <year>2000</year>
          ,
          <volume>981</volume>
          {
          <fpage>987</fpage>
          . netic and Evolutionary Computation, Springer Verlag,
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24. S. MÄohmel,
          <string-name>
            <given-names>N.</given-names>
            <surname>Steinfeldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Endgelschalt</surname>
          </string-name>
          , M. Holen·a, Berlin,
          <year>2003</year>
          ,
          <volume>610</volume>
          {621.
          <string-name>
            <given-names>S.</given-names>
            <surname>Kolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Dingerdissen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weber</surname>
          </string-name>
          , and 40. H.
          <string-name>
            <surname>Ulmer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Streichert</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Zell: Model assisted M. Bewersdorf</surname>
          </string-name>
          :
          <article-title>New catalytic materials for the evolution strategies</article-title>
          . In Y. Jin, editor, Knowledge
          <article-title>Inhigh-temperature synthesis of hydrocyanic acid from corporation in Evolutionary Computation, Springer methane and ammonia by high-throughput approach</article-title>
          . Verlag, Berlin,
          <year>2005</year>
          ,
          <volume>333</volume>
          {
          <fpage>355</fpage>
          . Applied Catalysis A: General 334,
          <year>2008</year>
          ,
          <volume>73</volume>
          {
          <fpage>83</fpage>
          . 41. L. V¶
          <article-title>egva¶ri, A. Tompos, S. GÄoboÄloÄs, and</article-title>
          <string-name>
            <given-names>J.F.</given-names>
            <surname>Mar</surname>
          </string-name>
          -
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>Y.S.</given-names>
            <surname>Ong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.B.</given-names>
            <surname>Nair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Keane</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.W.</given-names>
            <surname>Wong</surname>
          </string-name>
          <article-title>: gitfalvi: Holographic research strategy for catalyst liSurrogate-assisted evolutionary optimization frame- brary design: Description of a new powerful optimisaworks for high-¯delity engineering design problems</article-title>
          . In tion method.
          <source>Catalysis Today 81</source>
          ,
          <year>2003</year>
          ,
          <volume>517</volume>
          {527. Y. Jin, editor,
          <source>Knowledge Incorporation in Evolution- 42</source>
          . M.D. Vose:
          <article-title>The Simple Genetic Algorithm</article-title>
          .
          <source>Foundaary Computation</source>
          , Springer Verlag, Berlin,
          <year>2005</year>
          ,
          <volume>307</volume>
          { tions and Theory. MIT Press, Cambridge,
          <year>1999</year>
          .
          <volume>331</volume>
          . 43. H. White: Arti¯cial Neural Networks: Approxima-
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26. A.
          <article-title>Pinkus: Approximation theory of the MPL model tion and Learning Theory</article-title>
          . Blackwell Publishers,
          <article-title>Camin neural networks</article-title>
          .
          <source>Acta Numerica 8</source>
          ,
          <year>1998</year>
          ,
          <volume>277</volume>
          {
          <fpage>283</fpage>
          . bridge,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>K.</given-names>
            <surname>Rasheed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ni</surname>
          </string-name>
          , and
          <string-name>
            <surname>S.</surname>
          </string-name>
          <article-title>Vattam: Methods for using 44</article-title>
          . D.
          <string-name>
            <surname>Wolf</surname>
            ,
            <given-names>O.V.</given-names>
          </string-name>
          <string-name>
            <surname>Buyevskaya</surname>
            , and
            <given-names>M.</given-names>
          </string-name>
          <article-title>Baerns: An evosurrogate modesl to speed up genetic algorithm oprim- lutionary approach in the combinatorial selection and ization: Informed operators and genetic engineering. optimization of catalytic materials</article-title>
          . Applied Catalyst In Y. Jin, editor,
          <source>Knowledge Incorporation in Evolu- A: General</source>
          <volume>200</volume>
          ,
          <year>2000</year>
          ,
          <volume>63</volume>
          {
          <fpage>77</fpage>
          . tionary Computation, Springer Verlag, Berlin,
          <year>2005</year>
          ,
          <volume>103</volume>
          {
          <fpage>123</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>