J. Yaghob (Ed.): ITAT 2015, pp. 159–166, Charles University in Prague, Prague, 2015

Investigation of Gaussian Processes in the Context of Black-Box Evolutionary Optimization

Andrej Kudinov (1), Lukáš Bajer (2), Zbyněk Pitra (3), and Martin Holeňa (4)

(1) Faculty of Information Technology, Czech Technical University, Prague, kudinand@fit.cvut.cz
(2) Faculty of Mathematics and Physics, Charles University, Prague, bajeluk@gmail.com
(3) Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University, Prague, z.pitra@gmail.com
(4) Czech Academy of Sciences, Prague, martin@cs.cas.cz

Abstract: Minimizing the number of function evaluations has become a very challenging problem in the field of black-box optimization, where one evaluation of the objective function may be very expensive or time-consuming. Gaussian processes (GPs) are one of the approaches suggested to this end, already nearly 20 years ago, in the area of general global optimization. So far, however, they have received only little attention in the area of evolutionary black-box optimization.

This work investigates the performance of GPs in the context of black-box continuous optimization on multimodal functions from the CEC 2013 competition. It reports the performance of two GP-based methods, Model Guided Sampling Optimization (MGSO) and GPs used as a surrogate model for CMA-ES, and compares the speed-up of both methods under different settings, measured in the number of function evaluations, with respect to CMA-ES with no surrogate model.

1 Introduction

Evolutionary computation has become very successful during the past few decades in continuous black-box optimization. In such optimization, no mathematical definition of the optimized function is available; thus we can calculate analytically neither that function itself nor its derivatives. In such cases, there is no option but to evaluate the objective function empirically, through measurements, tests or simulations.

In various real-world optimization problems, the evaluation of the objective function is very expensive or time-consuming, e.g., optimization of protein folding stability [6], computer-assisted design [1] and job allocation in a computational grid [13]. In such cases, we need to keep the number of function evaluations as low as possible, without impairing the quality of the expected results.

In this paper, we employ two approaches addressing that task. The first, called Model Guided Sampling Optimization (MGSO) [2], is one of the recent implementations of GPs. The second employed approach is surrogate modeling, recalled in Subsection 2.2, which we use in conjunction with CMA-ES.

This work investigates the performance of both methods on the set of niching functions from the CEC 2013 competition [10], characterized by a high number of local optima, which makes evolutionary search for the global optimum difficult because evolutionary algorithms (EAs) tend to get trapped in one of the local optima.

The following section describes the theoretical fundamentals of GPs and introduces the MGSO method. It also explains surrogate modeling and the use of GPs as a surrogate model for CMA-ES. Sections 3 and 4 present the setup and the results of the experimental evaluation of the considered methods. Section 5 summarizes the results and concludes the paper.

2 Gaussian Processes in Optimization

A GP is a random process such that any finite sequence X_1, ..., X_k of the involved random variables has a multivariate Gaussian distribution. A GP is defined by its mean value and a covariance matrix described by a function with a relatively small number of hyper-parameters, which are usually fitted by the maximum likelihood method.

First, the GP is trained with N data points from the input space X,

    X_N = \{ x_i \mid x_i \in \mathbb{R}^D \}_{i=1}^{N},

with known input-output values (X_N, y_N); then it is used for predicting the (N+1)-st point. The conditional density of the extended vector y_{N+1} = (y_1, ..., y_N, y_{N+1}), conditioned on X_{N+1} = X_N \cup \{x_{N+1}\}, is

    p(\mathbf{y}_{N+1} \mid X_{N+1}) = \frac{\exp\left(-\frac{1}{2}\mathbf{y}_{N+1}^{\top} C_{N+1}^{-1} \mathbf{y}_{N+1}\right)}{\sqrt{(2\pi)^{N+1}\det(C_{N+1})}},

where C_{N+1} is the covariance matrix of an (N+1)-dimensional Gaussian distribution.
The covariance matrix can be expressed as

    C_{N+1} = \begin{pmatrix} C_N & \mathbf{k} \\ \mathbf{k}^{\top} & \kappa \end{pmatrix},

where κ is the variance of the new point itself, k is the vector of covariances between the new point and the training data, and C_N is the covariance matrix of the Gaussian distribution corresponding to the N training data points [5].

Covariance functions provide prior information about the objective function and express the covariance between the function values of each two data points x_i, x_j as cov(f(x_i), f(x_j)) = k(x_i, x_j). Because the matrix of the values k(x_i, x_j) for the points x_1, ..., x_N serves as a covariance matrix, it has to be positive semidefinite.

2.1 Model Guided Sampling Optimization

MGSO uses a regression model for prediction and error estimation in order to obtain the probability of reaching a better solution. It was inspired by two previously proposed methods in the field of black-box optimization. The first, Estimation of Distribution Algorithms [9], creates a new set of solutions for the next generation using a probability distribution estimated from previously selected candidate solutions. The second is surrogate modeling, described in Subsection 2.2.

MGSO was proposed as an alternative to Jones' Efficient Global Optimization (EGO) [8]. Unlike EGO, MGSO produces not a single solution, but a whole population of solutions. The selection of candidate solutions is performed by sampling the probability of improvement (PoI) of the GP model, which serves as a measure of how promising the chosen point is for locating the optimum. PoI is determined by means of a chosen threshold T and the estimation of the shape of the objective function by the current GP model.

The crucial step in the MGSO algorithm is the sampling of the PoI, which is determined by the predicted mean f̂(x) = ŷ and the standard deviation ŝ(x) = s_y of the GP model f̂ at any point x of the input space:

    \mathrm{PoI}_T(\mathbf{x}) = \Phi\left(\frac{T - \hat{f}(\mathbf{x})}{\hat{s}(\mathbf{x})}\right) = P\left(\hat{f}(\mathbf{x}) \le T\right),

which corresponds to the value of the cumulative distribution function of the Gaussian at the threshold T. Although all the variables are sampled from a Gaussian distribution, PoI(x) is not Gaussian-shaped, as it depends on the threshold T and the modelled function f.

2.2 GP as Surrogate Model for CMA-ES

Surrogate modeling is a technique used in optimization in order to decrease the number of expensive function evaluations. A surrogate model, which is a regression model of a suitable kind (in our case a GP), is first constructed by training with known values of the objective function for some inputs, and then it is used by the employed evolutionary optimization algorithm instead of the original objective function (in evolutionary optimization usually called fitness) during the search for the global optimum. Although creating and training a model increases the time complexity of the optimization algorithm, using the model instead of the original fitness function decreases the number of its evaluations, which is a crucial objective in expensive optimization.

Every regression model approximates the original fitness function with some error. To prevent the optimization from being misled by such an erroneous approximation, it is necessary to use the original fitness function for some subset of evaluations. That subset is determined by the evolution control (EC) strategy [5].

An individual-based EC strategy consists in determining the subset of individuals evaluated by the original fitness function in each generation. The following description is illustrated by Figure 1; the numbers in parentheses refer to its steps. Denote λ the size of the population provided by CMA-ES. First, λ″ = αλ points are sampled from N(m, σ²C), where α ∈ [0, 1], m is the mean, σ is the step-size and C stands for the covariance matrix (step 1). These λ″ points are evaluated by the original fitness function and included in training the model. Then, an extended population of size λ′ = β(λ − λ″), where β ∈ [1, ∞), is sampled from the model using the same distribution (step 2). The extended population is required by the model for choosing promising points for re-evaluation by the original fitness function. Subsequently, γ(λ − λ″) points, where γ ∈ [0, 1], are chosen from the extended population according to some criterion, e.g. fitness value, and evaluated by the original fitness function (step 3). The complement to λ points is gathered from the rest of the extended population by dividing it into k = (1 − γ)(λ − λ″) clusters and selecting the best point from each cluster; these points are also evaluated by the original fitness function and added to the final population (step 4) [3].

[Figure 1: Individual-based EC. The schematic shows the pre-sampled points (λ″ = αλ), the extended population (β(λ − λ″)), the best points from the extended population (γ(λ − λ″)), and the best points from each of the k clusters.]

A generation-based EC strategy determines the number of model-evaluated generations between two generations evaluated by the original function. After a generation is evaluated by the original fitness function, the model is trained using the obtained values. The number of consecutive model-evaluated generations can also be determined dynamically, as introduced in the so-called adaptive EC strategy [11], where the deviation between the original and the model fitness is assessed, and it is then decided whether to evaluate using the original fitness or the model.

Determining the most suitable EC parameters, however, is an open problem, which depends on the properties of the fitness function and the current performance of the surrogate model. Moreover, the most suitable parameters change during the optimization process.
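To make the preceding formulas concrete, the following sketch derives the predictive mean f̂(x) and standard deviation ŝ(x) from the block structure of C_{N+1} and then evaluates PoI_T(x). This is a minimal Python/NumPy illustration under our own simplifying assumptions (zero-mean GP, a fixed isotropic squared exponential covariance standing in for the fitted one, helper names of our choosing); it is a sketch, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

def cov(a, b, sigma_f2=1.0, ell=1.0):
    # Isotropic squared exponential covariance; stands in for the
    # fitted covariance function k(x_i, x_j).
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return sigma_f2 * np.exp(-0.5 * d.dot(d) / ell**2)

def gp_predict(X, y, x_new, sigma_n2=0.01):
    """Predictive mean and standard deviation of a zero-mean GP at x_new.

    Uses the block structure C_{N+1} = [[C_N, k], [k^T, kappa]]:
    mean = k^T C_N^{-1} y,  variance = kappa - k^T C_N^{-1} k.
    """
    N = len(X)
    C = np.array([[cov(X[i], X[j]) for j in range(N)] for i in range(N)])
    C += sigma_n2 * np.eye(N)                 # noise variance on the diagonal
    k = np.array([cov(x, x_new) for x in X])  # covariances to the new point
    kappa = cov(x_new, x_new)                 # variance of the new point itself
    mean = k.dot(np.linalg.solve(C, np.asarray(y, dtype=float)))
    var = kappa - k.dot(np.linalg.solve(C, k))
    return mean, np.sqrt(max(var, 1e-12))

def poi(X, y, x_new, T):
    # Probability of improvement: PoI_T(x) = Phi((T - f_hat(x)) / s_hat(x)).
    mean, std = gp_predict(X, y, x_new)
    return norm.cdf((T - mean) / std)
```

MGSO then obtains its candidate solutions by sampling points according to this PoI landscape, rather than returning a single maximizer as EGO does.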
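The assembly of one population under the individual-based EC strategy described above can also be sketched in code. Everything here is schematic and under our own assumptions: sample, fitness, train_model and predict are placeholders for the CMA-ES sampler, the expensive objective, GP training and GP prediction; k-means merely stands in for whatever clustering the implementation [3] uses, and minimization is assumed.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def individual_ec_population(sample, fitness, train_model, predict,
                             lam, alpha, beta, gamma):
    # Step 1: pre-sample lambda'' = alpha * lambda points from N(m, sigma^2 C)
    # and evaluate them with the original fitness.
    n_pre = int(round(alpha * lam))
    evaluated = [(x, fitness(x)) for x in sample(n_pre)]
    model = train_model(evaluated)

    # Step 2: sample the extended population of size beta * (lambda - lambda'').
    n_rest = lam - n_pre
    ext = np.asarray(sample(beta * n_rest))
    pred = np.asarray([predict(model, x) for x in ext])

    # Step 3: choose the gamma * (lambda - lambda'') points with the best
    # (lowest) model fitness for re-evaluation by the original fitness.
    n_best = int(round(gamma * n_rest))
    order = np.argsort(pred)
    chosen = list(order[:n_best])

    # Step 4: split the remaining points into k = (1 - gamma)(lambda - lambda'')
    # clusters and take the best point (by model fitness, our assumption) of each.
    rest = order[n_best:]
    k = n_rest - n_best
    _, labels = kmeans2(ext[rest], k, minit='points')
    for c in range(k):
        members = rest[labels == c]
        if members.size:
            chosen.append(members[np.argmin(pred[members])])

    # The final population: lambda points evaluated by the original fitness.
    evaluated += [(ext[i], fitness(ext[i])) for i in chosen]
    return evaluated
```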
Table 1: Model parameter settings for S-CMA-ES and MGSO performance testing. The symbols K_SE^iso, K_SE^ard, K_exp, K_Matérn^{ν=5/2} denote, respectively, the isotropic squared exponential, the squared exponential with automatic relevance determination, the exponential, and the Matérn with parameter ν = 5/2 covariance functions. J_{1,D} denotes the vector of ones of length equal to the dimension D of the input space.

    S-CMA-ES
      covariance functions           cov ∈ {K_Matérn^{ν=5/2}, K_exp, K_SE^iso, K_SE^ard}
      starting values of (σ_f², ℓ)   (0.1, 10) for K_SE^iso; (0.05 × J_{1,D}, 0.1) for K_SE^ard; (0.5, 2) otherwise
      starting value of σ_n²         0.01

    MGSO
      covariance functions           cov ∈ {K_SE^iso, K_SE^ard}
      starting values of (σ_f², ℓ)   (0.1, 10) for K_SE^iso; (0.05 × J_{1,D}, 0.1) for K_SE^ard
      starting value of σ_n²         0.01

3 Experimental Evaluation

Previous investigations compared the performance of MGSO [2] and of CMA-ES with a GP surrogate model [4] (denoted hereafter S-CMA-ES) with CMA-ES without a model, using the standard black-box optimization benchmarks [7]. In this work, we compare those methods using an additional set of 12 multimodal fitness functions from the CEC 2013 competition on niching methods for multimodal function optimization [10]:

• f1: Five-Uneven-Peak Trap (1D),
• f2: Equal Maxima (1D),
• f3: Uneven Decreasing Maxima (1D),
• f4: Himmelblau (2D),
• f5: Six-Hump Camel Back (2D),
• f6: Shubert (2D, 3D),
• f7: Vincent (2D, 3D),
• f8: Modified Rastrigin – All Global Optima (2D),
• f9: Composition Function 1 (2D),
• f10: Composition Function 2 (2D),
• f11: Composition Function 3 (2D, 3D, 5D, 10D),
• f12: Composition Function 4 (3D, 5D, 10D, 20D).

3.1 MGSO Performance

MGSO performance was examined using two covariance functions, the isotropic squared exponential (K_SE^iso) and the squared exponential with automatic relevance determination (K_SE^ard), with the parameter settings shown in Table 1. The results in Tables 2 and 3 show the speed-up of MGSO with respect to CMA-ES. As can be seen, the K_SE^iso covariance function performed better of the two in more than half of the cases.

3.2 S-CMA-ES Performance

The speed-up results are shown in Tables 2 and 3. In the performed evaluations, four covariance functions were used in the GP surrogate model. The first two are types of the squared exponential covariance function: the isotropic version

    K_{SE}^{iso}(\mathbf{x}_i, \mathbf{x}_j) = \sigma_f^2 \exp\left(-\frac{1}{2\ell^2}(\mathbf{x}_i - \mathbf{x}_j)^{\top}(\mathbf{x}_i - \mathbf{x}_j)\right),    (1)

and the version using automatic relevance determination

    K_{SE}^{ard}(\mathbf{x}_i, \mathbf{x}_j) = \sigma_f^2 \exp\left(-\frac{1}{2}(\mathbf{x}_i - \mathbf{x}_j)^{\top} \lambda^{-2} (\mathbf{x}_i - \mathbf{x}_j)\right),    (2)

where λ stands for the characteristic length scale, which measures the distance at which function values become uncorrelated along x_i. The covariance functions (1) and (2) differ only in that λ is a diagonal matrix instead of a scalar. Two types of the Matérn covariance function were used,

    K_{\mathrm{Matérn}}^{\nu=1/2}(r) = \exp\left(-\frac{r}{\ell}\right),    (3)

which is better known as the exponential covariance function (K_exp), and

    K_{\mathrm{Matérn}}^{\nu=5/2}(r) = \sigma_f^2 \left(1 + \frac{\sqrt{5}r}{\ell} + \frac{5r^2}{3\ell^2}\right) \exp\left(-\frac{\sqrt{5}r}{\ell}\right),    (4)

where r = |x_i − x_j| is the distance of the two considered data points, ℓ is the characteristic length-scale with which that distance is compared, and σ_f² is the signal variance. The description of the listed covariance functions can be found in [12]. The considered covariance function parameters are shown in Table 1.

In the performed experiments, different variants of the chosen EC strategies described in Section 2.2, generation-based and individual-based, were examined. The results are discussed in the following sections.
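The four covariance functions (1)–(4) translate directly into code. The following is a minimal NumPy sketch with helper names of our own; the hyper-parameters σ_f², ℓ and the ARD length-scale vector, which the GP fits by maximum likelihood, are passed in explicitly here.

```python
import numpy as np

def k_se_iso(xi, xj, sigma_f2, ell):
    # (1) Isotropic squared exponential.
    d = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return sigma_f2 * np.exp(-0.5 * d.dot(d) / ell**2)

def k_se_ard(xi, xj, sigma_f2, lam):
    # (2) Squared exponential with automatic relevance determination;
    # lam is the vector of per-dimension characteristic length scales.
    d = (np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)) / lam
    return sigma_f2 * np.exp(-0.5 * d.dot(d))

def k_exp(xi, xj, ell):
    # (3) Exponential covariance, i.e. Matern with nu = 1/2.
    r = np.linalg.norm(np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float))
    return np.exp(-r / ell)

def k_matern52(xi, xj, sigma_f2, ell):
    # (4) Matern covariance with nu = 5/2.
    r = np.linalg.norm(np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float))
    s = np.sqrt(5.0) * r / ell
    return sigma_f2 * (1.0 + s + 5.0 * r**2 / (3.0 * ell**2)) * np.exp(-s)
```

For instance, with the Table 1 settings, K_SE^iso would start the maximum likelihood fit from σ_f² = 0.1 and ℓ = 10.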
3.3 Generation-Based EC Strategy

Apart from the covariance function selection, the generation-based EC strategy was determined by two other parameters: the number of model-evaluated generations, and the multiplication factor of the CMA-ES step size σ, which is used in the original-evaluated generations in order to provide points for model training from a broader region of the input space. In the implementation, the first parameter was varied among the values 1, 2, 4 and 8 consecutive model-evaluated generations, and the second parameter was varied among the values 1 and 2.

3.4 Individual-Based EC Strategy

Apart from the covariance function selection, three other parameters, described in Section 2.2, were examined in the case of the individual-based EC strategy. The first parameter, α ∈ [0, 1], determines the proportion of the original population to be pre-sampled and evaluated by the original fitness function. The second parameter, β ∈ [1, ∞), is a multiplicator determining the size of the extended population. The third parameter, γ ∈ [0, 1], determines the number of points with the best model fitness chosen from the extended population to be re-evaluated by the original fitness function and become a part of the final population. This parameter also determines the number of clusters, from each of which the best point is chosen and added to the final population. In the performed evaluations, the parameter α was varied among the values 0, 0.0625, 0.125 and 0.25, the parameter β was varied among the values 5 and 10, and γ was varied among the values 0, 0.1 and 0.2.

4 Results and Their Assessment

4.1 Result Tables

Tables 2 and 3 show the speed-up of S-CMA-ES and MGSO compared to CMA-ES without a surrogate model. For the respective targets (distances to the true optimum ∆f_opt), the speed-up of the expected running time (ERT) is shown. ERT is the number of function evaluations needed to reach the target, divided by the ratio of the successful runs which reached the target. The stopping criteria were the distance 10^{-8} to the true optimum and 100D original fitness function evaluations.

The first column in each box corresponds to the overall best settings (described in Section 4.2) of the covariance function and EC settings of S-CMA-ES using the generation-based EC strategy, in terms of the average speed-up. The second column corresponds to the best covariance function and generation-based EC settings for the respective function-dimension combination, if any were better than the overall best observed settings. Analogously, the third and fourth columns show results for S-CMA-ES using the individual-based EC strategy. Similarly, the last two columns in each box show the speed-up of MGSO.

Signs "-" instead of the speed-up values mean that, unlike CMA-ES, no run of the considered method (S-CMA-ES or MGSO) was able to reach that target. Signs "+" mean that, unlike the employed method, no CMA-ES run was able to reach the target. Signs "*" mean that neither the considered method nor CMA-ES was able to reach the target. Speed-ups written in bold mark cases where the median of the ERT of S-CMA-ES or MGSO is significantly lower than the median of CMA-ES according to the one-sided Wilcoxon test at the significance level α = 0.05.

4.2 Observations

The MGSO method brought the highest speed-up in the case of f2, f4, the 3D version of f7, the 2D version of f11 and the 5D version of f12. The worst results were observed in the case of f1, the 2D version of f10 and the 3D and 10D versions of f12. The best results were achieved using the K_SE^iso covariance function; however, in the case of f4 and the 2D version of f7, the K_SE^ard covariance function brought much better results.

In the case of the generation-based EC, the overall best settings with respect to the median values are (K_SE^iso, 8, 1), i.e., 8 consecutive model-evaluated generations with unmodified step size in combination with the K_SE^iso covariance function. The overall best generation-based EC settings turned out to be also the best generation-based EC settings for the respective functions, except for the 3D version of f12, where S-CMA-ES performed better using a larger step size. Using different covariance functions did not bring much better results than the overall best covariance function.

The individual-based EC strategy achieved the best results with the overall settings (K_SE^ard, 0, 5, 0.1), i.e., the squared exponential covariance function with automatic relevance determination, no pre-sampling before training the model, 5 as the multiplicator determining the size of the extended population, and 0.1 as the multiplicator determining the number of best points chosen from the extended population. The best results with the described parameters were achieved in the case of function f3 and the 2D and 3D versions of f6. However, the overall performance of the individual-based EC strategy lags far behind the generation-based EC strategy, MGSO, and even CMA-ES itself.

4.3 Best-Fitness Progress Diagrams

Figure 2 shows examples of the best-fitness progress with the best observed settings (see Table 2 for details). Medians and the first and third quartiles of the best fitness reached are shown; the medians and quartiles were measured on 15 independent runs for MGSO and on 10 independent runs for S-CMA-ES (for both EC strategies).

The optimization progress of the individual-based EC strategy turned out to be the slowest in comparison with the other methods. MGSO outperformed CMA-ES in most cases, and its highest speed-up was achieved in the later phase of the optimization process. The generation-based EC strategy achieved its highest speed-up in the middle phase of the optimization process. Although the generation-based EC strategy generally brought the best results, it was outperformed by MGSO in the case of the 2D version of f8 and the 5D version of f12 in the later phase of the optimization process.
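The ERT and speed-up values in Tables 2 and 3 can be computed from raw run records. The following is a small sketch under an assumed record format of our own choosing (each run reports the number of evaluations it used and the best distance to the true optimum it reached); it is not the authors' evaluation code.

```python
def ert(runs, target):
    """Expected running time for one target.

    runs: list of (evaluations_used, best_delta_f_reached) tuples, one per run.
    ERT = number of evaluations divided by the ratio of successful runs, which
    equals the total number of evaluations over the number of successful runs.
    """
    successes = sum(1 for _, delta in runs if delta <= target)
    if successes == 0:
        return float('inf')      # no run reached the target ("-" or "*")
    total_evals = sum(evals for evals, _ in runs)
    return total_evals / successes

def speedup(runs_cmaes, runs_method, target):
    # Speed-up of S-CMA-ES or MGSO with respect to plain CMA-ES:
    # how many times smaller the method's ERT is at the given target.
    return ert(runs_cmaes, target) / ert(runs_method, target)
```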
f1 (1-D)   settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 8, 1); ind (K_SE^ard, 0, 5, 0.1); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   gen-2   ind-1   MGSO-1  MGSO-2
1e1      0.90    1.00    1.00    0.17    0.14
1e0      0.90    1.00    1.00    0.06    0.14
1e-1     0.90    1.00    1.00    0.06    0.08
1e-2     0.90    1.00    1.00    -       0.05
1e-3     0.90    1.00    1.00    -       0.05
1e-4     0.90    1.00    1.00    -       -
1e-5     0.90    1.00    1.00    -       -
1e-6     0.90    1.00    1.00    -       -
1e-7     0.90    1.00    1.00    -       -
1e-8     0.90    1.00    1.00    -       -

f2 (1-D)   settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_Matérn^{ν=5/2}, 2^-2, 10, 0.2); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e-1     1.00    1.44    0.76    0.13    1.30
1e-2     1.41    2.54    0.30    0.10    3.30
1e-3     1.85    2.54    0.12    0.86    1.08
1e-4     3.11    3.75    -       1.73    1.73
1e-5     4.04    4.49    -       -       2.25
1e-6     4.34    4.83    -       -       2.37
1e-7     12.95   13.57   -       -       8.05
1e-8     12.21   15.23   -       -       16.78

f3 (1-D)   settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_exp, 0, 5, 0.2); MGSO K_SE^iso
∆f_opt   gen-1   ind-1   ind-2   MGSO-1
1e-1     2.44    0.52    0.45    1.83
1e-2     5.71    1.44    1.51    2.13
1e-3     21.36   7.12    17.80   9.49
1e-4     18.41   7.12    -       8.63
1e-5     20.07   7.76    -       9.41
1e-6     20.07   7.76    -       9.41
1e-7     *       *       *       *
1e-8     *       *       *       *

f4 (2-D)   settings: gen (K_SE^iso, 8, 1) | (K_exp, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_exp, 2^-2, 5, 0.1); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1  MGSO-2
1e2      0.54    1.00    0.16    0.11    0.35    0.35
1e1      1.24    1.63    0.17    0.44    1.55    1.55
1e0      2.59    3.52    -       1.26    2.20    4.00
1e-1     2.74    3.47    -       -       1.97    2.62
1e-2     2.99    3.66    -       -       2.44    3.25
1e-3     3.91    5.00    -       -       3.58    4.57
1e-4     4.45    5.22    -       -       4.52    4.68
1e-5     13.44   14.83   -       -       0.67    13.23
1e-6     17.63   20.10   -       -       -       18.12
1e-7     +       +       *       *       *       +
1e-8     +       +       *       *       *       +

f5 (2-D)   settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_Matérn^{ν=5/2}, 0, 10, 0.2); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   ind-1   ind-2   MGSO-1  MGSO-2
1e0      1.00    0.12    0.09    0.65    0.65
1e-1     1.72    0.05    0.08    0.80    0.80
1e-2     3.28    -       0.70    1.52    1.52
1e-3     2.95    -       -       1.51    1.92
1e-4     3.27    -       -       1.96    2.48
1e-5     3.74    -       -       2.55    3.23
1e-6     *       *       *       *       *
1e-7     *       *       *       *       *
1e-8     *       *       *       *       *

f6 (2-D)   settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 4, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^ard, 0, 10, 0); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1  MGSO-2
1e2      1.18    1.18    -       1.74    3.04    2.03
1e1      4.97    4.20    -       -       0.36    2.21
1e0      7.16    8.40    -       -       -       1.80
1e-1     +       +       *       *       *       +
1e-2     +       +       *       *       *       +
1e-3     +       +       *       *       *       +
1e-4     +       +       *       *       *       +
1e-5     *       *       *       *       *       *
1e-6     *       *       *       *       *       *
1e-7     *       *       *       *       *       *
1e-8     *       *       *       *       *       *

f6 (3-D)   settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^ard, 0, 10, 0.1); MGSO K_SE^iso
∆f_opt   gen-1   ind-1   ind-2   MGSO-1
1e4      1.00    1.00    1.00    0.27
1e3      17.55   -       7.42    4.53
1e2      7.72    -       -       4.06
1e1      5.96    -       -       -
1e0      +       *       *       *
1e-1     +       *       *       *
1e-2     +       *       *       *
1e-3     +       *       *       *
1e-4     +       *       *       *
1e-5     +       *       *       *
1e-6     +       *       *       *
1e-7     +       *       *       *

f7 (2-D)   settings: gen (K_SE^iso, 8, 1) | (K_exp, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^ard, 2^-4, 10, 0); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1  MGSO-2
1e-2     3.28    2.93    0.05    0.43    0.14    1.02
1e-3     4.42    4.81    -       -       0.27    0.54
1e-4     5.63    6.09    -       -       -       0.52
1e-5     7.91    7.91    -       -       -       1.45
1e-6     16.24   16.24   -       -       -       3.39
1e-7     65.25   68.62   -       -       -       11.29
1e-8     +       +       *       *       *       +

f7 (3-D)   settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_Matérn^{ν=5/2}, 0, 10, 0); MGSO K_SE^iso
∆f_opt   gen-1   ind-1   ind-2   MGSO-1
1e-1     3.15    0.03    1.28    1.14
1e-2     5.26    -       -       1.56
1e-3     10.36   -       -       3.16
1e-4     13.33   -       -       4.95
1e-5     64.47   -       -       22.83
1e-6     +       *       *       +
1e-7     +       *       *       +
1e-8     +       *       *       +

f8 (2-D)   settings: gen (K_SE^iso, 8, 1) | (K_Matérn^{ν=5/2}, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_exp, 2^-4, 5, 0); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1  MGSO-2
1e0      2.68    2.68    -       0.96    1.68    1.68
1e-1     3.91    3.91    -       -       2.22    2.22
1e-2     4.46    4.79    -       -       3.20    3.20
1e-3     7.15    7.65    -       -       3.29    3.29
1e-4     12.95   13.74   -       -       5.61    6.73
1e-5     22.40   24.85   -       -       11.39   13.53
1e-6     +       +       *       *       +       +
1e-7     +       +       *       *       +       +
1e-8     +       +       *       *       +       +

Table 2: Speed-up of S-CMA-ES using the individual- and generation-based strategies and of MGSO, compared to CMA-ES without a surrogate model, for functions f1–f8 (see Section 4.1 for details). The columns gen-1, ind-1 and MGSO-1 correspond to the overall best observed settings, listed next to each function; a missing gen-2, ind-2 or MGSO-2 column signifies that the best observed settings for the respective function-dimension combination are identical with the overall best observed settings.

f9 (2-D)   settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^ard, 2^-4, 5, 0); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e1      3.40    2.74    -       0.10    1.06
1e0      3.38    3.11    -       0.07    1.15
1e-1     3.82    3.82    -       -       1.53
1e-2     4.35    4.35    -       -       1.52
1e-3     7.80    7.80    -       -       2.68
1e-4     22.40   23.56   -       -       8.54
1e-5     +       +       *       *       +
1e-6     +       +       *       *       +
1e-7     +       +       *       *       +
1e-8     +       +       *       *       *

f10 (2-D)  settings: gen (K_SE^iso, 8, 1) | (K_Matérn^{ν=5/2}, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^ard, 0, 10, 0); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e2      1.98    1.76    0.03    0.26    0.39
1e1      2.29    2.62    -       -       0.46
1e0      3.01    3.21    -       -       0.07
1e-1     3.71    4.00    -       -       -
1e-2     6.15    7.02    -       -       -
1e-3     18.28   21.88   -       -       -
1e-4     +       +       *       *       *
1e-5     +       +       *       *       *
1e-6     +       +       *       *       *
1e-7     +       +       *       *       *
1e-8     +       +       *       *       *

f11 (2-D)  settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^iso, 2^-4, 10, 0); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e2      3.92    3.38    0.04    1.06    1.24
1e1      3.69    3.69    -       -       1.43
1e0      4.63    5.01    -       -       2.32
1e-1     5.19    5.19    -       -       3.18
1e-2     8.70    9.23    -       -       2.51
1e-3     12.06   14.04   -       -       3.72
1e-4     59.40   65.25   -       -       18.46
1e-5     +       +       *       *       +
1e-6     +       +       *       *       *
1e-7     +       +       *       *       *
1e-8     +       +       *       *       *

f11 (3-D)  settings: gen (K_SE^iso, 8, 1) | (K_exp, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^ard, 0, 5, 0); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e3      0.53    1.00    1.00    1.00    0.27
1e2      2.51    3.04    -       -       1.33
1e1      3.04    3.38    -       -       1.41
1e0      3.12    3.47    -       -       1.85
1e-1     3.90    4.33    -       -       1.71
1e-2     5.95    7.21    -       -       2.19
1e-3     27.83   32.15   -       -       11.06
1e-4     +       +       *       *       +
1e-5     +       +       *       *       +
1e-6     +       +       *       *       +
1e-7     +       +       *       *       *
1e-8     +       +       *       *       *

f12 (3-D)  settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 8, 2); ind (K_SE^ard, 0, 5, 0.1) | (K_Matérn^{ν=5/2}, 0, 10, 0.1); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1  MGSO-2
1e2      2.05    1.72    0.37    0.27    0.91    0.46
1e1      1.42    2.27    -       0.67    0.31    0.63
1e0      -       2.28    -       -       0.48    0.65
1e-1     -       2.76    -       -       -       -
1e-2     -       3.78    -       -       -       -
1e-3     -       19.56   -       -       -       -
1e-4     *       +       *       *       *       *
1e-5     *       +       *       *       *       *
1e-6     *       +       *       *       *       *
1e-7     *       +       *       *       *       *
1e-8     *       +       *       *       *       *

f11 (5-D)  settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1); MGSO K_SE^iso
∆f_opt   gen-1   ind-1   MGSO-1
1e2      3.02    -       1.36
1e1      3.24    -       1.51
1e0      4.37    -       1.96
1e-1     9.79    -       3.12
1e-2     27.12   -       7.51
1e-3     47.82   -       13.17
1e-4     +       *       +
1e-5     +       *       *
1e-6     +       *       *
1e-7     +       *       *
1e-8     +       *       *

f11 (10-D) settings: gen (K_SE^iso, 8, 1) | (K_Matérn^{ν=5/2}, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^iso, 2^-4, 5, 0); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e3      0.89    1.14    0.17    0.64    0.27
1e2      3.65    4.11    -       -       1.05
1e1      8.51    9.35    -       -       2.61
1e0      9.78    10.59   -       -       2.10
1e-1     20.02   21.50   -       -       -
1e-2     +       +       *       *       *
1e-3     +       +       *       *       *
1e-4     +       +       *       *       *
1e-5     +       +       *       *       *
1e-6     +       +       *       *       *
1e-7     *       +       *       *       *
1e-8     *       *       *       *       *

f12 (5-D)  settings: gen (K_SE^iso, 8, 1) | (K_SE^ard, 4, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^iso, 2^-2, 5, 0); MGSO K_SE^iso
∆f_opt   gen-1   gen-2   ind-1   ind-2   MGSO-1
1e2      1.64    1.70    -       0.04    0.78
1e1      5.60    6.61    -       -       4.73
1e0      19.24   33.01   -       -       28.86
1e-1     +       +       *       *       +
1e-2     +       +       *       *       +
1e-3     +       +       *       *       +
1e-4     +       +       *       *       *
1e-5     +       +       *       *       *
1e-6     +       +       *       *       *
1e-7     +       *       *       *       *
1e-8     *       *       *       *       *

f12 (10-D) settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1); MGSO K_SE^iso | K_SE^ard
∆f_opt   gen-1   ind-1   MGSO-1  MGSO-2
1e2      8.98    -       1.88    2.02
1e1      6.83    -       -       1.05
1e0      17.61   -       -       -
1e-1     +       *       *       *
1e-2     +       *       *       *
1e-3     +       *       *       *
1e-4     +       *       *       *
1e-5     +       *       *       *
1e-6     *       *       *       *
1e-7     *       *       *       *
1e-8     *       *       *       *

f12 (20-D) settings: gen (K_SE^iso, 8, 1); ind (K_SE^ard, 0, 5, 0.1) | (K_SE^iso, 0, 10, 0); MGSO K_SE^iso
∆f_opt   gen-1   ind-1   ind-2   MGSO-1
1e3      1.39    0.05    0.18    0.42
1e2      2.73    -       -       0.03
1e1      7.24    -       -       -
1e0      59.06   -       -       -
1e-1     +       *       *       *
1e-2     +       *       *       *
1e-3     +       *       *       *
1e-4     +       *       *       *
1e-5     +       *       *       *
1e-6     *       *       *       *
1e-7     *       *       *       *
1e-8     *       *       *       *

Table 3: Speed-up of S-CMA-ES using the individual- and generation-based strategies and of MGSO, compared to CMA-ES without a surrogate model, for functions f9–f12 (see Section 4.1 for details). The column layout is the same as in Table 2.

[Figure 2: Examples of the best-fitness progress with the best observed settings; solid line: median, dashed lines: the first and third quartiles.]

5 Conclusion

In this paper, two optimization approaches based on Gaussian processes were tested on the set of multimodal fitness functions from the CEC 2013 competition [10] and compared to the state-of-the-art evolutionary approach in black-box optimization, CMA-ES. One of them is Model Guided Sampling Optimization [2]; the other, S-CMA-ES [5], consists in using a GP as a surrogate model for CMA-ES. The performance of the methods was compared with respect to the number of function evaluations. In the case of S-CMA-ES, two evolution control strategies were used, the individual- and the generation-based.
Al- black-box optimization benchmarking 2012: experimental though S-CMA-ES using generation-based EC strategy setup. Technical Report, INRIA, 2012 outperformed MGSO, both methods showed the perfor- [8] Jones, D. R., Schonlau, M., Welch, W. J.: Efficient mance improvement in most cases. On the other hand, Global Optimization of expensive black-box functions. J. the individual-based EC strategy brought the worst re- of Global Optimization 13(4) (Dec. 1998), 455–492 sults of all considered methods. We also observed, that S- [9] Larranaga, P., Lozano, J.: Estimation of distribution algo- CMA-ES performs better using generation-based EC set- rithms: a new tool for evolutionary computation. Kluwer ting with more consecutive model-evaluated generations. Academic Pub, 2002 Isotropic squared exponential covariance function showed [10] Li, X., Engelbrecht, A., Epitropakis, M. G.: Benchmark to be the most suitable for the optimization from all tested functions for CEC’2013 special session and competition covariance functions. on niching methods for multimodal function optimization’. This is a work in progress that is a part of a broader 2013. ongoing research. Terefore, it would be premature to draw [11] Loshchilov, I., Schoenauer, M., Sebag, M.: Self-adaptive deeper conclusions at this stage. We hope to be able to surrogate-assisted covariance matrix adaptation evolution draw such conclusions after further investigations will be strategy. In: Proceedings of the Fourteenth International performed in the future. Conference On Genetic And Evolutionary Computation Conference, GECCO’12, 321–328, New York, NY, USA, 2012, ACM Acknowledgement [12] Rasmussen, C., Williams, C.: Gaussian processes for ma- chine learning. Adaptive Computation and Machine Learn- The research reported in this paper has been supported by ing, MIT Press, Cambridge, MA, USA, Jan. 2006 the Czech Science Foundation (GAČR) grant 13-17187S. [13] Tesauro, G., Jong, N. K., Das, R., Bennani, M. N.: On the use of hybrid reinforcement learning for autonomic resource allocation. Cluster Computing 10(3) (2007), References 287–299 [1] Nik, M. A., Fayazbakhsh, K., Pasini, D., Lessard, L.: A comparative study of metamodeling methods for the de- sign optimization of variable stiffness composites. Com- posite Structures 107 (2014), 494–501