-

Evolutionary optimization with active learning of surrogate models and xed evaluation batch size?

Viktor Charypar

Martin Holena

martin@cs.cas.cz 1 0 Czech Technical University Faculty of Nuclear Sciences and Physical Engineering Brehova 7 , 115 19 Praha 1 , Czech Republic 1 Institute of Computer Science Academy of Sciences of the Czech Republic Pod vodarenskou vez 2 , 182 07 Praha , Czech Republic

33 40

Evolutionary optimization is often applied to lation used as the objective function takes minutes to problems, where simulations or experiments used as the t- nish, the traditional approach becomes impractical. ness function are expensive to run. In such cases, surro- When the objective function is evaluated using a physgate models are used to reduce the number of tness eval- ical experiment, in the evolutionary optimization of uations. Some of the problems also require a xed size catalytic materials [1] for example, an evaluation for batch of solutions to be evaluated at a time. Traditional one generation of the algorithm takes between several pmreotvheodthseofsusrerleocgtaitneg minoddievlideuitahlserforreqturuiree eivnadliuvaidtiuoanl tpooiinmts- days and several weeks and costs thousands of euros. to be evaluated, or couple the batch size with the EA gener- The typical solution to this problem is performation size. We propose a queue based method for individual ing only a part of all evaluations using the true tselection based on active learning of a kriging model. Indi- ness function and using a response-surface model as viduals are selected using the con dence intervals predicted its replacement for the rest. This approach is called by the model, added to a queue and evaluated once the surrogate modeling. When using a surrogate model, queue length reaches the batch size. The method was tested only a small portion of all the points that need to be on several standard benchmark problems. Results show that evaluated is evaluated using the true objective functhe proposed algorithm is able to achieve a solution using tion (simulation or experiment) and for the rest, the signi cantly less evaluations of the true tness function. model prediction is assigned as the tness value. The dTihsceuesseedc.t of the batch size as well as other parameters is model is built using the information from the true tness evaluations. Since the tness function is assumed to be highly 1 Introduction non-linear the modeling methods used are non-linear as well. Some of the commonly used methods include Evolutionary optimization algorithms are a popular arti cial neural networks, radial basis functions, reclass of optimization techniques suitable for various gression trees, support vector machines or Gaussian optimization problems. One of their main advantages processes [3]. is the ability to nd optima of black-box functions { Furthermore, some experiments require a xed functions that are not explicitly de ned and only their number of samples to be processed at one time. This input/output behavior is known from previous evalu- presents its own set of challenges for adaptive sampling ations of a nite number of points in the input space. and is the main concern of this paper. We present an This is typical for applications in engineering, chem- evolutionary optimization method assisted by a variistry or biology, where the evaluation is performed in ant of a Gaussian-process-based interpolating model a form of computer simulation or physical experiment. called kriging. In order to best use the evaluation budThe main disadvantage for such applications is the get, our approach uses active learning methods in severy high number of evaluations of the objective func- lecting individuals to evaluate using the true tness tion (called tness function in the evolutionary op- function. A key feature of the approach is support timization context) needed for an evolutionary algo- for online and o ine batch evaluation with arbitrary rithm (EA) to reach the optimum. Even if the simu- batch size independent of the generation size of ? This work was supported by the Grant Agency of the EA. the Czech Technical University in Prague, grant No. The rest of the paper is organized as follows: in the SGS12/196/OHK3/3T/14 as well as the Czech Science following section we introduce the kriging surrogate Foundation grant 201/08/0802. model and its properties, in section 2 the methods of

coupling a model to the evolutionary optimization are discussed, section 4 provides details of the proposed method and nally, the results of testing the method are presented and discussed in section 5. 2

Model-assisted evolutionary optimization 1. An initial Ni generations of the EA is performed, yielding sets G1; : : : ; GNi of individuals (x; ft(x)), ft being the true tness function. 2. The model M is trained on the individuals

(x; ft(x)) 2 SiN=i1 Gi. 3. The tness function ft is replaced by a model pre

diction fM . 4. T generations are performed evaluating fM as the

tness function. 5. One generation is performed using ft yielding

a set Gj of individuals. (initially j = Ni + 1) 6. The model is retrained on the individuals (x; ft(x)) 2 Sj

i=1 Gi 7. Steps 4{6 are repeated until the optimum is reached.

Since the surrogate model used as a replacement for the tness function in the EA is built using the results of the true tness function evaluations, there are two competing objectives. First, we need to get the most information about the underlying relations in the data, in order to build a precise model of the tness function. If the model does not capture the features of the tness function correctly, the optimization can get The amount of true tness evaluations in this apstuck in a fake optimum or generally fail to converge to proach is dependent on the population size of the EA a global one. Second, we have a limited budget for the and the frequency of control generations T , which can true tness function evaluations. Using many points be xed or adaptively changed during the course of from the input space to build a perfect model can re- the optimization [ 6 ]. For problems requiring batched quire more true tness evaluations than not employing evaluation this approach has the advantage of evalua model at all. ating the whole generation, the size of which can be

In the general use of surrogate modeling, such as set to the size of the evaluation batch. The main disdesign space exploration, the process of select- advantage of the generation-based strategy is that not ing points from the input space to evaluate and build all individuals in the control generation are necessarily the model upon is called sampling [ 3 ]. Traditionally, bene cial to the model quality and the expensive true the points to sample are selected upfront. Upfront tness evaluations are wasted. sampling schemes are based on the theory of design of experiments (DoE), e.g. a Latin hypercube design.

When we don't know anything about the function we 2.2 Individual-based approach are trying to model, it is better to use a small set of As opposed to the generation-based approach, in the points as a base for an initial model, which is then it- individual-based strategy, the decision whether to eratively improved using new samples, selected based evaluate a given point using the true tness function on the information from previous function evaluations or the surrogate model is made for each individual and the model itself. This approach is called adaptive separately. sampling [ 3 ]. In model-based optimization in general, there are

Using the surrogate model in an evolutionary op- several possible approaches to individual-based samtimization algorithm, the adaptive sampling decisions pling. The most used approach in the evolutionary change from selecting which points of the input space optimization is pre-selection. In each generation of the to evaluate in order to improve the model to whether EA, number of points, which is a multiple of the poputo evaluate a given point (selected by the EA) with lation size, is generated and evaluated using the model the true tness function or not. There are two gen- prediction. The best of these individuals form the next eral approaches to this choice: the generation-based generation of the algorithm. The optimization is perapproach and the individual-based approach. We will formed as follows. discuss both, with emphasis on the latter, a variant of which is used in the method we propose in section 4. 2.1

Generation-based approach

In the generation-based approach the decision whether to evaluate an individual point with the true tness function is made for the whole generation of the evolutionary algorithm. The optimization takes the following steps. 1. An initial set of points S is chosen and evaluated

using the true tness function ft. 2. Model M is trained using the pairs (x; ft(x)) 2 S 3. A generation of the EA is run with the tness function replaced by the model prediction fM and a population Oi of size qp is generated and evaluated with fM , where p is the desired population size for the EA and q is the pre-screening ratio.

Initially, i = 1. 4. A subset P O is selected according to a selection method [ 2 ], most of them depend on the model used.

criterion. The kriging model used in our proposed method o ers 5. Individuals from P are evaluated using the true a good estimate of the local model accuracy by givtness function ft. ing an error estimate of its prediction. It is possible 6. The model M is retrained using S [ P, the set S to use the estimate itself as a measure of the model's is replaced with S [ P, and the EA resumes from con dence in the prediction, or base a more complex step 3. measure on the variance estimate. The measures that were tested for use in our method will be described in detail in section 4.1.

Another possibility, called the best strategy [ 5 ], is to replace S with S [ O instead of just P in step 6 after re-evaluating the set O n P with fM (after the model M has been re-trained). This also means using 3 Kriging meta-models the population size qp in the EA. The kriging method is an interpolation method origi

The key piece of this approach is the selection crite- nating in geostatistics [ 9 ], based on modeling the funcrion (or criteria) used to determine which individuals tion as a realization of a stochastic process [ 11 ]. from set O should be used in the following generation In the ordinary kriging, which we use, the function of the algorithm. There are a number of possibilities, is modeled as a realization of a stochastic process let us discuss the most common.

An obvious choice is selecting the best individuals Y (x) = 0 + Z(x) (1) based on the tness value. This results in the region of the optimum being sampled thoroughly, which helps where Z(x) is a stochastic process with mean 0 and nding the true optimum. On the other hand, the re- covariance function 2 given by agiopnosssfiabrlefrbometttehreocputrimreunmtocpatnimbuemmairsesende.gTleoctseadmapnlde covfY (x + h); Y (x)g = 2 (h); (2) the areas of the tness landscape that were not ex- where 2 is the process variance for all x. The correplored yet, space- lling criteria are used, either alone lation function (h) is then assumed to have the form or in combination with the best tness selection or otheArllcrtihteeripar.evious criteria have the fact that they (h) = exp " Xd ljhljpl # ; (3) are concerned with the optimization itself in common. l=1 A di erent approach is to use the information about the model, most importantly its accuracy, to decide which points of the input space to evaluate with the true tness function in order to most improve it. This approach is sometimes called active learning. where l; l = 1; : : : ; d, where d is the number of dimensions, are the correlation parameters. The correlation function depends on the di erence of the two points and has the intuitive property of being equal to 1 if h = 0 and tending to 0 when h ! 1. The l parameters determine how fast the correlation tends to zero 2.3 Active learning in each coordinate direction and the pl determines the smoothness of the function.

Active learning is an approach that tries to maxi- The ordinary kriging predictor based on n sample mize the amount of insight about the modeled function points fx1; : : : ; xng with values y = (y1; : : : ; yn)0 is gained from its evaluation while minimizing the num- then given by ibnerthoef egveanleuraatlionesldneocfessusarrryo.gaTtheemmoedtehliondgs aasreanuseefd- y^(x) = ^0 + (x)0 1(y ^01); (4) cient adaptive sampling strategy. The terms adap- where (x)0 = ( (x x1); : : : ; (x xn)), is an tive sampling and active learning are often used inter- n n matrix with elements (xi xj ), and changeably. We will use the term active learning for the methods based on the characteristics of the sur- 10 1y rogate model itself, such as accuracy, with the goal of ^0 = 10 11 : (5) minimizing the model prediction error either globally An important feature of the kriging model is that or, more importantly, in the area of the input space apart from the prediction value it can estimate the the EA is exploring. prediction error as well. The kriging predictor error in

The active learning methods are most often based point x is given by on the local model prediction error, such as crossvpaelniddaetnitonof etrhreorm. oAdeltlh,ofourghexasmompele mtheethLoOdLs Aa-rVeorinodneo-i s2(x) = ^2 1 0 1 + (1 0 1 )2 (6) where the kriging variance is estimated as 6. Steps 4 and 5 are repeated until the goal is

reached, or a stall condition is ful lled.

In this section we will describe the proposed method for kriging-model-assisted evolutionary optimization with batch tness evaluation. Our main goal was to decouple the true tness function sampling from the EA iterations based on an assumption that requiring a speci c number of true tness evaluations in every generations of the EA forces unnecessary sampling.

In the generation-based approach, some of the 4.1 Measures of estimated improvement points may be unnecessary to evaluate, as they will To estimate the improvement, which evaluation of not bring any new information to the surrogate model. a given point will bring, we can use several measures. The individual-based approach is better suited for the The three measures introduced here are all based on task, as it chooses those points from each generation, the prediction error estimate of the kriging model. The which are estimated to be the most valuable for the goal of these measures is to prefer the points that help model. There is still the problem of performing a given improve the model in regions explored by the EA. number of evaluations in every generation, although Each of the measure's results for a given point are there might not be enough valuable points to select compared with a threshold and when the estimated from. improvement is above the threshold, the point is eval

The method we propose achieves the desired de- uated using the true tness function. coupling by introducing an evaluation queue. The evolutionary algorithm uses the model prediction at all Standard deviation (STD) is the simplest measure we times and when a point, in which the model's con- tested. It is computed directly from the error as its dence in its prediction is low, is encountered, it is square root points in the queue, all the points in it are evaluated ST D(x) = qs^2M (x): (8) added to the evaluation queue. Once there are enough and the model is re-trained using the results. The optimization takes the following course. 1. Initial set S of b samples is selected using a chosen initial design strategy and evaluated using the true

tness function ft. 2. An initial kriging model M is trained using pairs

(x; ft(x)) 2 S. 3. The evolutionary algorithm is started, with the

model prediction fM as the tness function. 4. For every prediction fM (x) = y^M (x), an estimated improvement measure c(s2M (x)) is computed from the error estimate s2M (x). If c(s2M (x)) > t, an improvement threshold, the point is added to the evaluation queue Q. 5. If the queue size jQj b, the batch size, all points x 2 Q are evaluated, the set S is replaced by S [ f(x; ft(x)g and the EA is resumed.

The STD captures only the model's estimate of the error of its own prediction (based on the distance from the known samples). As such, it does not take into account the value of the prediction itself and can be considered a measure of the model accuracy.

Probability of improvement (POI) [ 7 ] uses the fact, that the kriging prediction is a Gaussian process and the prediction in a single point is therefore a normally distributed random variable Y (x) with a mean and variance given by the kriging predictor. If we choose a target T (based on the goal of the optimization), we can estimate the probability that a given point will have a value y(x) T as a probability that Y (x) T .

The probability of improvement is therefore de ned as Expected improvement (EI) [ 7, 8 ] is based on estimating, as the name suggests, the improvement we expect to achieve over the current minimum fmin, if a given point is evaluated. As before, we assume the model prediction in point x to be a normally distributed random variable Y (x) with a mean and variance given by the kriging predictor. We achieve an improvement I over fmin if Y (x) = fmin I. As shown in [ 7 ] the expected value of I can be obtained using the likelihood of achieving the improvement where is the cumulative distribution function of the e ect of the parameters and also investigated the the standard normal distribution. As opposed to the optimal choice of batch size for problems where an STD, the POI takes into account the prediction mean upfront choice is possible. In this section we discuss (value) as well as its variance (error estimate). The the tests performed and their results. area of the current optimum is therefore preferred over For testing, we used the genetic algorithm implethe rest of the input space. When the area of the mentation from the global optimization toolbox for the current optimum is sampled enough, the variance be- Matlab environment and the implementation of an orcomes very small and the term T s2My^M(x()x) becomes ex- dinary kriging model from the SUMO Toolbox [ 4 ]. The tremely negative, encouraging the sampling of less ex- parameters of the supporting methods, e.g. the genetic plored areas. algorithm itself, were kept on their default values provided by the implementation.

Because the EA itself is not deterministic, each test was performed 20 times and the results we present are statistical measures of this sample. As a performance measure we use the number of true tness evaluations used to reach a set goal in all tests. The main reason to use this measure is that in model-assisted optimization the computational cost of everything except the true

tness evaluation is minimal in comparison. We also track the proportion of the 20 runs that reached the goal before various limits (time, stall, etc.) took e ect.

The domain is restricted to a hypercube 10 xi 10; i = 1; : : : ; n. The function has one global optimum f (x) = 0 in point x = 0. The De Jong's function was 5 Results and discussion primarily used as a proof of concept test. As a second benchmark, we used the Rosenbrock's The proposed method was tested using simulations on function, also called Rosenbrock's valley. The global three standard benchmark functions. We studied the optimum is inside a long parabolic shaped valley, model evolution during the course of the optimization, which is easy to nd. Finding the global optimum in p 2 s2M (x) I=0 1

Z I=1 exp (fmin

y^M (x)2 2s2M (x) Expected improvement is the expected value of the improvement found by integrating over this density.

The resulting measure EI is de ned as

EI(x) = E(I) = s2M (x)[u (u) + (u)]; (u)]; (11) where u = fmin and and are the cumulative distribution function and the probability distribution functions of the normal distribution respectively. The expected improvement has an important advantage over the POI: it does not require a preset target T , which can be detrimental to the POI's successful sample selection when set too high or too low.

All three measures have an important weakness of being based on the model prediction. If the modeled function is deceptive, the model can be very inaccurate while estimating a low variance. A good initial sampling of the tness function is therefore very important. The success of the whole method is dependent on the model's ability to capture the response surface correctly and thus on the function itself. Since the evolutionary algorithms and optimization heuristics in general are often used on black-box optimization, where the properties of the objective function are unknown, it is not straightforward to asses their quality on real world problems. It has therefore become a standard practice to test optimization algorithms and their modi cations on specially designed testing problems.

These benchmark functions are explicitly de ned and their properties and optima are known. They are often designed to exploit typical weaknesses of optimization algorithms in nding the global optimum.

We used three functions found in literature [ 10 ]. Although we performed our tests in two dimensions we give general multi-dimensional de nitions of the functions.

First of the functions used is the De Jong's function. It is one of the simplest benchmarks, it is continuous, convex and unimodal and is de ned as f (x) = n X xi2 i=1 (13)

true fitness 2 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2

initial model and samples Th0e.2 domain of the function is restricted to a hypercube 2 2 xi 2; i = 14; : : : ; n. It has o6ne globagelneration 8 optimum f (x) = 0 in x = 1.

Finally, the third function used as a benchmark is 5.3 the Rastrigin's function. It is based on the De Jong's function with addition of cosine modulation, which produces a high number of regularly distributed local minima and makes the function highly multimodal.

The function is de ned as In order to compare the measures of estimated improvement, we performed simulations on each benchmark using each improvement estimate measure with di erent values of the threshold. The batch size was n set to 40 { generally found to be the ideal batch size f (x) = 10n + X[xi2 10 cos(2 xi)] (15) { for these experiments. For comparison, we also peri=1 formed tests with the standard genetic algorithm with1T;h:e: : d;nom.Taihne gislorbeasltroipctteimd utmo f (5x) = 0xiis in x 5=, 1i. = tohuet taambloed1e.l. Results of these simulations are shown in The De Jong's function proved to be simple to optimize and the threshold setting did not have almost 5.2 Model evolution any e ect. Only when using the standard deviation, setting the threshold too low lead to an increase in the As the basic illustration of how the model evolves dur- number of evaluations, as too many points were evaling the course of the EA, let us consider an example uated, although the model prediction in those points test run using the Rosenbrock's function. For this ex- was accurate enough. periment we set the batch size of 15, used the STD The same is true for the STD measure used on the measure of estimated improvement with a threshold Rosenbrock's function, where setting the threshold too of 0:001 and set the target tness value of 0:001 as well. low leads to a big increase in variance of the results. The target was reached at the point (0:9909; 0:9824) Interestingly, setting the threshold too high leads to using 90 true tness evaluations. A genetic algorithm a decrease in the number of evaluations, but also in without a surrogate model needed approximately the success rate of reaching the goal. The POI and EI 3000 evaluations to reach the goal in several test runs. are more stable in terms of true tness evaluations,

The model evolution is shown in gure 1. The true but have worse overall success rate. The results are tness function is shown on the left, the initial model shown in gure 2 is in the middle and the nal model on the right. The The Rastrigin's function proved di cult to optipoints where the true tness function was sampled are mize. This is probably due to the locality of the krigdenoted with circles an the optimum is marked with ing model and the high number of local minima of the a star. function. Overall the STD measure is the most suc400 350 300 cessful. POI and EI lead to bad sampling of the model and failure to reach the optimum.

An interesting general result is that the more complex measures of estimated improvement perform worse than the simple standard deviation estimate.

This indicates that the goal of active learning selection criteria in the evolutionary optimization should be the best possible sampling for overall model accuracy, as opposed to trying to improve the accuracy in the best regions of the input space. Both the POI and EI are design to select next best points to reach the optimum. Since in our case, this is handled by the EA itself, the measures bring an unnecessary noise to the estimate of the model accuracy. The results also show that the best measure selection is dependent on the optimized function. 5.4

Batch size

In order to study the batch size e ect on the optimization, a number of experiments were performed with di erent batch sizes. The only option to achieve 2000 1800 1600 1400 s itno1200 a lau1000 v e rue800 t 600 400 200 1 0.9 0.8 0.7 0.6 deh

c 0.5 rae

l 0.4 oga 0.3 0.2 0.1 0 05 10 15 20 25 30 40 ba5tc0h size60 70 80 90 100

(a) No model 160 140 120 itsn100 o a lau80 v e e tru60 160 140 120 itsn100 o a lau80 v e e tru60 1 0.9 0.8 0.7 0.6 deh

c 0.5 rae

l 0.4 gao 0.3 40 0.2 20 0.1

0 05 10 15 20 25 30 40 ba5tc0h size60 70 80 90 100 (b) STD 1 0.9 0.8 0.7 0.6 deh

c 0.5 rae

l 0.4 oga 0.3 40 0.2 20 0.1

0 05 10 15 20 25 30 40 ba5tc0h size60 70 80 90 100

goal reached Fig. 3. Batch size e ect on Rosenbrock's function optimization - true tness evaluations and proportion of runs raching the goal. a given batch size is to set the population size in a standard GA, in our method however, the settings are independent so a population size of 30, which proved e cient, was used in all of the tests.

The results on the De Jong's functions show that apart from small batch sizes (up to 10), the optimization is successful in all runs. Our method helps stabilize the EA for small batch sizes and for batch sizes above 15 the algorithm nds the optimum using 6

Conclusions a single batch. For a standard GA this strong depen- ture information about the model accuracy improvedence arises for batch sizes above 40 and the algorithm ment expected by sampling a given point. In the exreaches the goal in the second generation, evaluating periments with the batch size we found that small twice as many points. batch sizes perform better when the objective function

For the Rosenbrock's function we get the intuitive is simple, while causing bad initial sampling of more result that setting the batch size too low leads to more complex functions, suggesting using a larger initial evaluations or a failure to reach the goal, while large sample. The future development of this work should batch size do not improve the results and waste true include experiments using di erent batch sizes for initness evaluations. For this function the POI proved tial sampling and comparison of the method with other to be the most e cient measure. The comparison is ways of employing a surrogate model in the optimizashown in gure 3. Overall the method reduces the tion as well as other model-assisted optimization number of true evaluations from hundreds to tens for methods. the Rosenbrock's function, while slightly reducing the The method brings promising results, reducing the success rate of the computation. number of true tness evaluations to a large degree

The Rastrigin's function proved di cult to opti- for some of the benchmark functions. On the other mize even without a surrogate model. With the model, hand, its success is highly dependent on the optimized the STD achieved the best results reducing the number function and its initial sampling. of true tness evaluations approximately three times in the area of the highest success rate with batch size References of 70. The other two measures were ine ective. We attribute the method's di culty optimizing the Rastrigin's function to the fact that the kriging model is local and thus it requires a large number of samples to capture the function's complicated behavior in the whole input space. When the initial sampling is misleading, which is more likely for the Rastrigin's function, both the model prediction and estimated improvement are wrong.

The results suggest that best batch size and best estimated improvement measure are highly problemdependent. The proposed method is also very sensitive to good initial sample selection, which is the most usual reason for it to fail to nd the optimum. The experimental results support the intuition that batches too small are bad for the initial sampling of the model and batches too large slow down the model improvement by evaluating points that it would not be necessary to evaluate with smaller batches. This suggests using a larger initial sample and a small batch for the rest of the optimization.

In this paper we presented a method for model-assisted evolutionary optimization with a xed batch size requirement. To decouple the sampling from the EA iterations and support an individual-based approach while keeping a xed evaluation batch size, the method uses an evaluation queue. The candidates for true tness evaluations are selected by an active learning method using a measure of estimated improvement of the model quality based on the model prediction error estimate.

The results suggest using simple methods for improvement estimate in active learning, which only cap

Baerns , M. Holena: Combinatorial development of solid catalytic materials: design of high-throughput experiments, data analysis, data mining . Catalytic Science Series . Imperial College Press, 2009 .

Crombecq , L. De Tommasi,

Gorissen , T. Dhaene: A novel sequential design strategy for global surrogate modeling . In Winter Simulation Conference, WSC '09, Winter Simulation Conference , 2009 , 731 { 742 .

Gorissen : Grid-enabled adaptive surrogate modeling for computer aided engineering . PhD Thesis , Ghent University, University of Antwerp, 2009 .

Gorissen , I. Couckuyt ,

Demeester ,

Dhaene , K. Crombecq: A surrogate modeling and adaptive sampling toolbox for computer based design . The Journal of Machine Learning Research 11 , 2010 , 2051 { 2055 .

5. L. Graning,

Jin ,

Sendho : E cient evolutionary optimization using individual-based evolution control and neural networks: A comparative study . In ESANN , 2005 , 273 { 278 .

Jin ,

Olhofer ,

Sendho : Managing approximate models in evolutionary aerodynamic design optimization . In Evolutionary Computation , 2001 . Proceedings of the 2001 Congress on , vol. 1 , IEEE, 2001 , 592 { 599 .

D.R.

Jones . A taxonomy of global optimization methods based on response surfaces . Journal of Global Optimization , 21 : 345 { 383 , 2001 .

D. R.

Jones ,

Schonlau ,

W.J.

Welch : E cient global optimization of expensive black-box functions . Journal of Global Optimization 13 , 1998 , 455 { 492 .

9. G. Matheron: Principles of geostatistics . Economic Geology 58 ( 8 ), 1963 , 1246 { 1266 .

10. M. Molga , C. Smutnicki: Test functions for optimization needs . Test Functions for Optimization Needs , 2005 .

11. J. Sacks , W. J.

Welch , T. J. Mitchell, H. P.

Wynn

: Design and analysis of computer experiments . Statistical Science 4 ( 4 ), 1989 , 409 { 423 .