<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving Optimization With Gaussian Processes in the Covariance Matrix Adaptation Evolution Strategy</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jiří Tumpach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Koza</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Holeňa</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Charles University, Faculty of Mathematics and Physics</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Czech Academy of Sciences, Institute of Computer Science</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Czech Technical University, Faculty of Information Technology</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper explores the use of Gaussian processes (GPs) in the covariance matrix adaptation evolution strategy (CMA-ES) for black-box optimization. GPs are powerful probabilistic models that capture complex relationships, making them suitable for modeling uncertain objective functions. Integrating GPs into the CMA-ES improves exploration and adaptation in the search space, enhancing convergence speed and solution quality. The paper describes a novel implementation framework that allows GPs to be used as surrogate models for the CMA-ES. That framework is meant to encourage further research advancing the application of GPs in black-box optimization.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Black-box optimization is the optimization of objective functions for which no analytical description is provided. It employs optimization methods that need as input only points in the search space paired with the respective values of the objective function obtained in a non-analytical way, e.g. from sensors, in experiments or through numerical simulations. The most frequently used approaches are evolutionary optimization, such as evolution strategies, genetic algorithms, and differential evolution, or other metaheuristics, such as particle swarm optimization.</p>
      <p>
        Because black-box optimization methods receive only information about values of the objective function, they typically need many such values. This is a problem in situations when evaluating the black-box objective function is time-consuming and/or expensive. That is frequently the case if it is evaluated empirically in experiments. For example, for the evolutionary optimization tasks described in the book [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the evaluation of a comparatively small generation of a genetic algorithm can sometimes take more than a week and cost more than 10000 €. To tackle such problems, an approach called surrogate modeling emerged more than 20 years ago.
      </p>
      <p>In particular in continuous optimization, surrogate modeling consists in evaluating the true, black-box objective function only in some points and evaluating a suitable regression model in all remaining points. Such a regression model is called a surrogate model or metamodel of the objective function. It is trained on points where the true objective function has been evaluated and approximates it in the search space.</p>
      <p>The earliest kinds of surrogate models in continuous black-box optimization were low-order polynomials and artificial neural networks (ANNs), specifically multilayer perceptrons (MLPs). The former have always remained a suitable choice in situations when enough evaluations of the true, black-box objective function are affordable for the approximation properties of polynomials to take effect. On the other hand, surrogate modeling for substantially fewer evaluations of the true objective function has undergone further development during the last two decades. MLPs were soon replaced with another kind of ANNs, radial basis function networks (RBFs), which better fit the local peculiarities of an objective function landscape. Those networks, however, have since the late 2000s been superseded by other kinds of surrogate models, primarily Gaussian processes (GPs), but also ranking support vector machines (RSVMs) and random forests (RFs). GPs are currently the most successful kind of surrogate models for black-box optimization with a small evaluation budget of functions with complicated multimodal landscapes, mainly due to their ability to estimate the probability distribution of the true objective function in a given point.</p>
    </sec>
    <sec id="sec-1b">
      <title>2. Surrogate Modeling in Black-Box Optimization</title>
      <p>
        Surrogate modeling for black-box optimization relies on the combination and interaction of three components: a regression model serving as a surrogate of the true, black-box objective function, a black-box optimization method seeking the optimum of that objective function, and a strategy determining when to evaluate the true objective function and when its surrogate model. In the context of evolutionary black-box optimization, that strategy is usually called evolution control [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4, 5, 6</xref>
        ].
      </p>
      <p>
        The regression models that are the most suitable kind of surrogate models if sufficiently many evaluations of the true, black-box objective function are affordable are low-order polynomials, typically quadratic functions [7, 8, 9, 10, 11]. The sufficient number of evaluations depends, according to these cited research works, on the black-box function and on the dimension. For substantially fewer evaluations, the most traditional kind of surrogate models were MLPs [5, 12], soon replaced with RBFs [13, 14, 11, 15, 10], and since the late 2000s with Gaussian processes (GPs) a.k.a. kriging [2, 4, 16, 17, 18, 19, 20, 21, 22]. Occasionally, RBFs were used as local models in combination with GP-based global models [23]. Other kinds of surrogate models employed during the last decade include decision trees [24], random forests [25, 26, 24] and ranking support vector machines [27, 28]. The last one has an exceptional property of invariance with respect to order-preserving transformations of the objective function. This is important in situations when the black-box optimization algorithm possesses such invariance, a frequently encountered property of evolutionary algorithms. On the other hand, the surrogate modeling methods proposed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and [22] use GPs to perform preselection based on a partial ordering that is also invariant with respect to order-preserving transformations. More importantly, the adaptive function value warping approach recently proposed in [29] aims to provide such invariance to any surrogate model.
      </p>
      <p>As to the black-box optimization methods, surrogate models are most often combined with evolutionary optimizers. Their combinations with the most important among them, the state-of-the-art black-box optimization algorithm CMA-ES, will be surveyed in some detail below, in Subsection 2.1. GPs were combined also with other evolutionary optimization methods [18, 30], and GPs, polynomials and RBFs were combined with particle swarm optimization [11] and with memetic optimization [14].</p>
      <p>Moreover, GPs are used in black-box optimization in two different ways. In connection with evolutionary and similar black-box optimization methods, they serve as a regression model evaluated instead of the true objective function. In addition, they also play a key role in Bayesian optimization. That kind of optimization relies on GP-estimates of probability distributions of values of the true objective function. Those probability distributions enable several ways of searching for optima of that objective function, each of them governed by a specific acquisition function [31, 32, 33]. The surrogate-assisted black-box optimization methods constructing several surrogate models simultaneously either aggregate them into a team [14, 11] or complement the evolution control by a classifier selecting the most appropriate among those models. Important examples of classifiers used in this context are ANNs [34, 35, 36] and classification trees [37, 20]. Their learning can be viewed as metalearning because it is based on metafeatures, i.e. properties empirically characterizing the objective function landscape and the black-box optimization method [35, 24, 38, 10]. Apart from classification according to the appropriateness of the surrogate model for the considered data, metalearning can also be used for regression of model error on the combination of values of metafeatures [39].</p>
      <p>Finally, evolution control has, since the first surrogate-assisted black-box optimization methods, been performed basically in two ways, generation-based and individual-based. In generation-based evolution control, all points are in some generations evaluated with the true objective function and in the remaining generations with the model. On the other hand, in every generation of the individual-based evolution control, based on the evaluation of all points with the model, a preselection of points to be evaluated with the true objective function is performed [5]. In most of the surrogate-assisted methods, however, the evolution control is specifically tailored to the respective method.</p>
      <sec id="sec-1-1">
        <title>2.1. Surrogate Modeling in Connection With the CMA-ES</title>
        <p>Not only the two most important kinds of surrogate models, i.e. low-order polynomials [7, 8, 9] and GPs [13, 4, 17, 21, 22], but also the less common RBFs, RFs and RSVMs [25, 27, 26, 15] are most often combined with the covariance matrix adaptation evolution strategy (CMA-ES). That is not surprising because CMA-ES has already in the 2000s become a state-of-the-art approach to single-objective unconstrained continuous black-box optimization [40, 41]. Occasionally, also Bayesian optimization is combined with CMA-ES. For example in [42], optimization switches from the most traditional Bayesian optimization method, EGO (Efficient Global Optimization) [32], to CMA-ES. Finally, CMA-ES has also been combined with a team of surrogate models and the choice of the most appropriate among them based on landscape analysis [37, 20].</p>
        <p>
          As to the evolution control of surrogate-assisted variants of CMA-ES, the authors of the present paper have been involved in an investigation of the evolution control of two important polynomial-assisted CMA-ES variants, lmm-CMA [7, 9] and lq-CMA-ES [8], and of two variants of the GP-assisted variant DTS-CMA-ES [
          <xref ref-type="bibr" rid="ref2">2, 19</xref>
          ]. Notably, that investigation included mutually replacing the evolution control of each variant with the evolution control of the others. According to its findings, the success of those important surrogate-assisted CMA-ES variants is definitely not limited to using the respective specifically tailored evolution control [6].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. New Framework for a Surrogate-Assisted CMA-ES</title>
      <p>The most widely used implementation of the CMA-ES algorithm is the official code written by the author of the algorithm, Nikolaus Hansen, and his team [43]. It is available in multiple programming languages, including C, C++, Matlab, R, Python, and others. It is being actively developed, and it contains various versions and extensions of the algorithm and extensive parameterization options. While the C and C++ versions are the most performant for solving real problems in practice, the most suitable for experimentation with the algorithm itself is nowadays the Python version. However, the Python CMA-ES version is still based on the original Matlab legacy code rewritten into Python. It contains very long function definitions with multiple nested if statements for different algorithm variants and parameter handling, which makes it highly inconvenient to experiment with modifications of the core parts of the algorithm.</p>
      <p>Therefore, we decided to base our code on a different implementation by Jacob de Nobel and his colleagues called Modular CMA-ES [44], which is written in a modern, modular, object-oriented way that makes it easy to create different variants of the CMA-ES algorithm.</p>
      <sec id="sec-3-1">
        <title>3.1. Modular CMA-ES</title>
        <p>The starting point of our implementation is the library Modular CMA-ES. Each optimization technique is encapsulated within a modular component, providing independence and flexibility in selecting and combining different modules. This modularity enables users to construct tailored optimization strategies by combining multiple modules, thereby expanding the exploration space and enhancing the search capabilities of the CMA-ES algorithm. By integrating previously distant optimization techniques, the library enables combinatorial exploration of different strategies within the CMA-ES framework. Users can effortlessly combine modules representing various optimization methods such as population sampling techniques, surrogate modeling, elitism, step size adaptation, restart strategies, and constraint handling mechanisms. This combinatorial exploration empowers researchers to exploit the strengths of different techniques, leading to more effective and efficient optimization processes. The Modular CMA-ES library prioritizes ease of use and customization. Moreover, the modular architecture allows for the activation and deactivation of modules during runtime, facilitating dynamic exploration and adaptation during the optimization process.</p>
        <p>A general scheme of an evolution strategy can be expressed in the following steps:</p>
        <p>1. Generate a new population</p>
        <p>For the CMA-ES algorithm in particular, the steps are these:
1. Sample λ points x
2. Evaluate the objective function f(x)
3. Select the μ points with the lowest f(x)
4. Update the population mean and covariance matrix
5. Repeat until the optimum is reached</p>
        <p>These steps correspond to the methods implemented in the main class of the framework, ModularCMAES, as shown in the diagram in Figure 1. It also depicts the so-called ask-and-tell interface provided by the framework.</p>
        <p>However, this library does not provide support for surrogate models on its own. That is why we have been developing the framework described in this paper.</p>
      </sec>
      <sec id="sec-3-4">
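        <p>The CMA-ES steps listed in Subsection 3.1, together with an ask-and-tell interface, can be illustrated by the following minimal sketch. It is not the ModularCMAES API: the class SimpleES, its parameters and the fixed step-size decay are assumptions made purely for illustration, and the covariance matrix update of the real CMA-ES is deliberately omitted, so this is a plain (μ, λ) evolution strategy.</p>
        <preformat preformat-type="code">
```python
import random


class SimpleES:
    """Toy (mu, lambda) evolution strategy with an ask-and-tell interface.

    Simplification: only the mean is adapted and the step size decays by a
    fixed factor; the covariance matrix update of CMA-ES is omitted.
    """

    def __init__(self, mean, sigma=1.0, lam=12, mu=3, seed=0):
        self.mean = list(mean)
        self.sigma = sigma
        self.lam = lam
        self.mu = mu
        self.rng = random.Random(seed)

    def ask(self):
        # Step 1: sample lambda points around the current mean.
        return [[m + self.sigma * self.rng.gauss(0.0, 1.0) for m in self.mean]
                for _ in range(self.lam)]

    def tell(self, points, values):
        # Step 3: select the mu points with the lowest objective values.
        ranked = sorted(zip(values, points), key=lambda pair: pair[0])
        best = [p for _, p in ranked[:self.mu]]
        # Step 4 (simplified): move the mean to the centroid of the selection.
        dim = len(self.mean)
        self.mean = [sum(p[i] for p in best) / self.mu for i in range(dim)]
        # Hypothetical fixed decay, not the CMA-ES step-size control.
        self.sigma *= 0.9


def sphere(x):
    return sum(xi * xi for xi in x)


def minimize(f, start, generations=60):
    es = SimpleES(start)
    for _ in range(generations):         # Step 5: repeat
        points = es.ask()
        values = [f(p) for p in points]  # Step 2: evaluate f(x)
        es.tell(points, values)
    return es.mean


result = minimize(sphere, [3.0, -2.0])
```
        </preformat>
        <p>Separating ask from tell is what makes surrogate assistance easy to plug in: the values passed to tell may come from the true objective function, from a model, or from a mixture of both.</p>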
        <sec id="sec-3-4-1">
          <title>3.2. Incorporating Gaussian Processes</title>
          <p>We added to the Modular CMA-ES package popular covariance functions such as Matérn, RBF, periodic, and many others [45]. In addition to these individual kernels, the package also provides the flexibility to explore additive and multiplicative combinations of them, cf. Subsection 3.3. This allows users to create more complex and customized GP-based surrogate models by combining multiple kernels. Furthermore, the framework offers a search within these kernels. The Gaussian process covariance functions available in the framework are listed below.</p>
          <p>Included covariance functions [45]
• Polynomial kernels
• Parabolic
• RBF
• Exponential curve
• Periodic kernel
• Matérn 1/2, Matérn 3/2 and Matérn 5/2</p>
          <p>Included covariance function modifications
• Learnable scaling of features
• Exponential mapping</p>
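          <p>For orientation, the RBF and Matérn covariance functions named above can be written down in a few lines. The sketch below uses their textbook forms as functions of the distance r between two points; the function names and the single length_scale parameter are illustrative assumptions and do not reflect how the framework itself parameterizes its kernels.</p>
          <preformat preformat-type="code">
```python
import math


def rbf(r, length_scale=1.0):
    """Squared-exponential (RBF) kernel as a function of the distance r."""
    return math.exp(-0.5 * (r / length_scale) ** 2)


def matern12(r, length_scale=1.0):
    """Matern kernel with nu = 1/2 (the exponential kernel)."""
    return math.exp(-r / length_scale)


def matern32(r, length_scale=1.0):
    """Matern kernel with nu = 3/2."""
    s = math.sqrt(3.0) * r / length_scale
    return (1.0 + s) * math.exp(-s)


def matern52(r, length_scale=1.0):
    """Matern kernel with nu = 5/2."""
    s = math.sqrt(5.0) * r / length_scale
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def gram_matrix(kernel, points):
    """Covariance matrix of the given points under a distance-based kernel."""
    return [[kernel(distance(x, y)) for y in points] for x in points]


points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(matern52, points)
```
          </preformat>
          <p>The smoothness of the modeled function grows from Matérn 1/2 over 3/2 and 5/2 to the RBF kernel, which is why at a fixed distance the correlation they assign also grows in that order.</p>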
        </sec>
        <sec id="sec-3-4-3">
          <title>3.3. A Systematic Approach to Combining Incorporated Covariance Functions</title>
          <p>The works [46] and, in more detail, [47] present a systematic approach to automating the construction of GP covariance functions. Compositional kernels enable flexible and automatic discovery of the appropriate structure and complexity of a model by composing multiple simpler kernels. By combining these kernels, the model can capture a wide range of patterns and structures, adapting to the complexity of the underlying data. Our framework evaluates the performance of each kernel through cross-validated regression, which checks how well it captures the underlying data patterns. Additionally, a complexity-based penalization is employed to account for the complexity of each kernel. Together, these two evaluation methods enable the automatic selection of the most suitable kernels for the optimization problem at hand.</p>
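          <p>The idea of composing kernels by sums and products and penalizing their complexity can be sketched as follows. Everything here is a hypothetical illustration: the Kernel wrapper, the count of base kernels as the complexity measure, and the additive penalty in penalized_score are assumptions, not the criterion actually used by the framework; the cross-validated error is taken as a given number.</p>
          <preformat preformat-type="code">
```python
import math


def rbf(x, y, length_scale=1.0):
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def periodic(x, y, period=1.0, length_scale=1.0):
    s = math.sin(math.pi * abs(x - y) / period)
    return math.exp(-2.0 * (s / length_scale) ** 2)


def linear(x, y):
    return x * y


class Kernel:
    """A kernel expression that tracks how many base kernels it contains."""

    def __init__(self, fn, name, size=1):
        self.fn, self.name, self.size = fn, name, size

    def __call__(self, x, y):
        return self.fn(x, y)

    def __add__(self, other):
        # The sum of two valid covariance functions is a valid covariance function.
        return Kernel(lambda x, y: self(x, y) + other(x, y),
                      "(" + self.name + " + " + other.name + ")",
                      self.size + other.size)

    def __mul__(self, other):
        # The product of two valid covariance functions is valid as well.
        return Kernel(lambda x, y: self(x, y) * other(x, y),
                      "(" + self.name + " * " + other.name + ")",
                      self.size + other.size)


def penalized_score(cv_error, kernel, penalty=0.05):
    # Hypothetical criterion: cross-validated error plus a cost per base kernel.
    return cv_error + penalty * kernel.size


RBF = Kernel(rbf, "RBF")
PER = Kernel(periodic, "PER")
LIN = Kernel(linear, "LIN")

candidate = LIN + RBF * PER
```
          </preformat>
          <p>A search over kernel structures would generate such candidates, estimate the cross-validated error of each, and keep the one with the lowest penalized score.</p>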
        </sec>
        <sec id="sec-3-4-4">
          <title>3.4. Included Evolution Control</title>
          <p>
            Evolution control in a surrogate-assisted CMA-ES manages the surrogate model and decides how to update it. The key idea is to balance the exploration of the search space and the exploitation of promising regions guided by the surrogate model’s predictions. We briefly outline the two different evolution controls implemented in the framework.
          </p>
          <p>Doubly Trained S-CMA-ES</p>
          <p>
            The DTS-CMA-ES published in [
            <xref ref-type="bibr" rid="ref2">2, 19</xref>
            ] is a successor
to the S-CMA-ES algorithm, which it extends with a
second round of surrogate model training. The
algorithm involves sampling a new population, training a
surrogate model on original-evaluated points, selecting
points based on the model’s prediction, evaluating those
points, retraining the model, and predicting fitness for
non-original evaluated points. The key features include
sampling from the CMA-ES distribution, utilizing
Gaussian process uncertainty estimation for point selection,
using recent points for fitness prediction, and
maintaining a training set near the CMA-ES distribution mean.
          </p>
          <p>Each generation of this EC can be summarized in the
following steps:</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <p>1. sample a new population of size λ (standard CMA-ES offspring),
2. train the first surrogate model on the original-evaluated points from the archive A,
3. select ⌈αλ⌉ point(s) wrt. a criterion C, which is based on the first model’s prediction,
4. evaluate these point(s) with the original fitness,
5. retrain the surrogate model also using these new point(s), and
6. predict the fitness of the non-original evaluated points with this second model.</p>
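        <p>The six steps of a doubly trained generation can be traced in the sketch below. To stay self-contained it swaps the Gaussian process for a nearest-neighbour stand-in whose "uncertainty" is simply the distance to the closest archive point, and it uses that uncertainty as the selection criterion. The names dts_generation, NearestNeighbourModel, lam and alpha are assumptions for illustration only and are not taken from the DTS-CMA-ES code.</p>
        <preformat preformat-type="code">
```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


class NearestNeighbourModel:
    """Stand-in for a GP: predicts the value of the closest archive point and
    reports the distance to it as a crude uncertainty estimate."""

    def fit(self, archive):
        self.archive = list(archive)
        return self

    def predict(self, x):
        d, y = min((dist(x, p), y) for p, y in self.archive)
        return y, d


def dts_generation(f, mean, sigma, archive, lam=8, alpha=0.25, rng=random):
    # 1. sample a new population of size lam
    population = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                  for _ in range(lam)]
    # 2. train the first model on the original-evaluated archive
    model = NearestNeighbourModel().fit(archive)
    # 3. select ceil(alpha * lam) points by the criterion (here: uncertainty)
    n_orig = math.ceil(alpha * lam)
    order = sorted(range(lam), key=lambda i: -model.predict(population[i])[1])
    chosen = set(order[:n_orig])
    # 4. evaluate the chosen points with the original fitness
    fitness = {i: f(population[i]) for i in chosen}
    archive = archive + [(population[i], fitness[i]) for i in chosen]
    # 5. retrain the model, now including the new points
    model = NearestNeighbourModel().fit(archive)
    # 6. predict the fitness of the remaining points with the second model
    for i in range(lam):
        if i not in chosen:
            fitness[i] = model.predict(population[i])[0]
    return population, [fitness[i] for i in range(lam)], archive


def sphere(x):
    return sum(v * v for v in x)


rng = random.Random(1)
start = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(5)]
archive0 = [(p, sphere(p)) for p in start]
pop, fit, archive1 = dts_generation(sphere, [1.0, 1.0], 0.5, archive0, rng=rng)
```
        </preformat>
        <p>The returned fitness list mixes true and predicted values; it is what would be handed to the CMA-ES update in place of a fully evaluated population.</p>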
      </sec>
      <sec id="sec-3-6">
        <title>Kendall-τ Rank Test Strategy From lq-CMA-ES</title>
        <p>In this evolution control strategy, developed for the surrogate-assisted CMA-ES variant lq-CMA-ES [8], which is based on quadratic polynomials, a queue is used to store all evaluated solutions for model building. During each iteration, a limited number of the best solutions according to the model’s predictions are chosen from the population. These selected solutions are then evaluated using the true objective function f, sorted, and added to the end of the queue (with the best solution being enqueued last). To maintain the queue’s size, the oldest elements are dropped when the maximum capacity is reached. This process continues until the Kendall-τ rank correlation coefficient between the ranking by the function f and the ranking by the model exceeds a threshold of 0.85, or until the entire population has been evaluated. At the end of the process, the population is ranked based on the surrogate fitness unless all population members have been evaluated using the function f, in which case the ranking based on f is used. By relying on the rank correlation coefficient, this approach avoids a direct comparison of the values of the model and of the true objective function.</p>
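        <p>The stopping rule based on the Kendall rank correlation can be sketched as follows. This is a simplified illustration under stated assumptions: the model is a fixed callable that is not retrained between batches, the queue of evaluated solutions is left out, and the batch size step is a hypothetical parameter; only the threshold of 0.85 comes from the description above.</p>
        <preformat preformat-type="code">
```python
import random


def kendall_tau(a, b):
    """Kendall tau-a rank correlation of two equally long sequences."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            if prod > 0:
                score += 1      # concordant pair
            elif prod != 0:
                score -= 1      # discordant pair
    return 2.0 * score / (n * (n - 1))


def rank_test_generation(f, model, population, step=2, threshold=0.85):
    """Evaluate the model-best points in batches until the model ranking
    agrees with the true ranking, or the whole population is evaluated."""
    predicted = [model(x) for x in population]
    order = sorted(range(len(population)), key=lambda i: predicted[i])
    evaluated = {}
    k = 0
    while len(population) > k:
        for i in order[k:k + step]:
            evaluated[i] = f(population[i])
        k = min(k + step, len(population))
        idx = list(evaluated)
        if len(idx) > 1:
            tau = kendall_tau([evaluated[i] for i in idx],
                              [predicted[i] for i in idx])
            if tau > threshold:
                break
    if len(evaluated) == len(population):
        # Everything was evaluated: rank by the true objective function.
        return [evaluated[i] for i in range(len(population))], len(evaluated)
    # Otherwise rank the population by the surrogate fitness.
    return predicted, len(evaluated)


def sphere(x):
    return sum(v * v for v in x)


rng = random.Random(0)
population = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(10)]
values, n_true = rank_test_generation(sphere, sphere, population)
```
        </preformat>
        <p>With a model that ranks perfectly, the loop stops after the first batch; with a misleading model, it falls back to evaluating the whole population.</p>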
        <sec id="sec-3-6-1">
          <title>3.5. IOHprofiler Integration</title>
          <p>The use of Modular CMA-ES in conjunction with IOHprofiler [48, 49] offers a powerful approach for analyzing and comparing iterative optimization heuristics. IOHprofiler, a versatile tool for evaluating algorithm performance, provides statistical assessments by analyzing the distribution of fixed-target running time and fixed-budget function values. By integrating Modular CMA-ES with IOHprofiler, researchers can gain insights into the algorithm’s behavior, assess its adaptability, and compare its performance against other optimization heuristics. The combination allows for tracking the evolution of algorithm parameters, facilitating the analysis, comparison, and design of self-adaptive algorithms. With IOHprofiler’s experimental and post-processing capabilities, researchers can generate and evaluate running time data for benchmark problems, adjust the precision and range of displayed data, and make informed decisions based on the statistical evaluations produced.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>This paper presented a new framework supporting the state-of-the-art black-box optimization method CMA-ES through GP-based modeling. It is a work-in-progress paper: not all intended functionality described in Section 3 has been implemented yet, and some of what has been implemented does not yet work properly. However, we hope that the situation will be much better at the time of the workshop. Still, we are not aware of any other system that provides such comprehensive functionality for combining CMA-ES with Gaussian processes.</p>
      <p>We have concentrated on Gaussian processes because we consider them to be the most suitable kind of surrogate models for difficult multimodal black-box functions if only a small number of evaluations of the true objective function is available. In the future, however, we intend to extend the developed framework also to other kinds of surrogate models, most importantly to low-order polynomials, which are a surrogate-modeling continuation of traditional response surface models [50], and which have always been the most successful kind of surrogate models if a large number of evaluations of the true objective function is available or if that function is easy to fit. In addition, we intend to include also some other of the models recalled above in Section 2, as well as several models that have not yet been employed for surrogate modeling, but that we believe are worth investigating to this end. For various time horizons, we think altogether of the following models:
• Deep Gaussian processes, in which an ANN architecture connects individual GPs, similarly to connecting individual recurrence cells in a long short-term memory [51, 52].
• MLPs in the neural tangent kernel parametrization [53, 54, 55], which at a sufficient width are able to mimic GP sampling and to replace traditional acquisition functions in Bayesian optimization. Such behaviour of this kind of ANNs is, according to [53] and [55], a consequence of their asymptotic properties if the number of hidden neurons increases to infinity [56, 54, 57].
• Variational autoencoders, which allow optimization to be performed on a latent space of a substantially lower dimension. Such use of a low-dimensional latent space has already been investigated in the case of Bayesian optimization [58, 59].
• Generative adversarial networks (GANs), a paradigm recently shown to be applicable to black-box optimization. More precisely, a generator has to propose samples compatible with the distribution of low values or directly with the distribution of the optimum of the considered black-box function, whereas one or more discriminators have to classify samples according to whether they are governed by that distribution [60, 61].</p>
      <sec id="sec-4-1">
        <title>Acknowledgments</title>
        <sec id="sec-4-1-1">
          <p>This work was supported by the Czech Technical University grant SGS23/205/OHK3/3T/18 and by the SVV project number 260 575 of the Charles University. Computational resources were supplied by the project "eInfrastruktura CZ" (e-INFRA LM2018140).
          </p>
          <p>proximate ranking, Applied Intelligence 48 (2018) 4288–4204.</p>
          <p>[5] Y. Jin, M. Olhofer, B. Sendhoff, A framework for evolutionary optimization with approximate fitness functions, IEEE Transactions on Evolutionary Computation 6 (2002) 481–494.</p>
          <p>[6] Z. Pitra, M. Hanuš, J. Koza, J. Tumpach, M. Holeňa, Interaction between model and its evolution control in surrogate-assisted CMA evolution strategy, in: GECCO, 2021, p. 358 (paper no.).</p>
          <p>[7] A. Auger, D. Brockhoff, N. Hansen, Benchmarking the local metamodel CMA-ES on the noiseless BBOB’2013 test bed, in: GECCO, 2013, pp. 1225–1232.</p>
          <p>[8] N. Hansen, A global surrogate assisted CMA-ES, in: GECCO, 2019, pp. 664–672.</p>
          <p>[9] S. Kern, N. Hansen, P. Koumoutsakos, Local metamodels for optimization using evolution strategies, in: PPSN, 2006, pp. 939–948.</p>
          <p>[10] H. Yu, C. Sun, Y. Tan, J. Zeng, Y. Jin, An adaptive model selection strategy for surrogate-assisted particle swarm optimization algorithm, in: IEEE SCI, 2016, pp. 1–8.</p>
          <p>[11] H. Wang, Y. Jin, J. Doherty, Committee-based active learning for surrogate-assisted particle swarm optimization of expensive problems, IEEE Transactions on Cybernetics 47 (2017) 2664–2677.</p>
          <p>[12] M. Papadrakakis, N. Lagaros, Y. Tsompanakis, Structural optimization using evolution strategies and neural networks, Computer Methods in Applied Mechanics and Engineering 156 (1998) 309–333.</p>
          <p>[13] L. Bajer, M. Holeňa, Surrogate model for continuous and discrete genetic optimization based on RBF networks, in: Intelligent Data Engineering and Automated Learning, Springer, 2010, pp. 251–258.</p>
          <p>[14] D. Lim, Y. Ong, Y. Jin, B. Sendhoff, A study on metamodeling techniques, ensembles, and multi-surrogates in evolutionary computation, in: GECCO, 2007, pp. 1288–1295.</p>
          <p>[15] H. Ulmer, F. Streichert, A. Zell, Model-assisted steady state evolution strategies, in: GECCO, Springer, 2003, pp. 610–621.</p>
          <p>[16] O. Krause, Recombination weight based selection in the DTS-CMA-ES, in: PPSN, 2022, pp. 295–308.</p>
          <p>[17] Z. Li, T. Gao, B. Wang, Elite-driven surrogate assisted CMA-ES algorithm by improved lower confidence bound method, in: Engineering with Computers, Springer, 2022, pp. 10.1007/s00366-022-01642-5 (doi).</p>
          <p>[18] L. Na, Q. Feng, Z. Liang, W. Zhong, Gaussian process assisted coevolutionary estimation of distribution algorithm for computationally expensive problems, Journal of Central South University of Technology 19 (2012) 443–452.</p>
          <p>[19] Z. Pitra, L. Bajer, M. Holeňa, Doubly trained evolution control for the surrogate CMA-ES, in: PPSN, 2016, pp. 59–68.</p>
          <p>[20] Z. Pitra, J. Repický, M. Holeňa, Landscape analysis of Gaussian process surrogates for the covariance matrix adaptation evolution strategy, in: GECCO, ACM, 2019, pp. 691–699.</p>
          <p>[21] L. Toal, D. Arnold, Simple surrogate model assisted optimization with covariance matrix adaptation, in: PPSN, 2020, pp. 184–197.</p>
          <p>[22] V. Volz, G. Rudolph, B. Naujoks, Investigating uncertainty propagation in surrogate-assisted evolutionary algorithms, in: GECCO, 2017, pp. 881–888.</p>
          <p>[23] Z. Zhou, Y. Ong, P. Nair, A. Keane, K. Lum, Combining global and local surrogate models to accelerate evolutionary optimization, IEEE Transactions on Systems, Man and Cybernetics. Part C: Applications and Reviews 37 (2007) 66–76.</p>
          <p>[24] B. Saini, M. López-Ibáñez, K. Miettinen, Automatic surrogate modelling technique selection based on features of optimization problems, in: GECCO, 2019, pp. 1765–1772.</p>
          <p>[25] N. Belkhir, J. Dréo, P. Savéant, M. Schoenauer, Per instance algorithm configuration of CMA-ES with limited budget, in: GECCO, 2017, pp. 681–688.</p>
          <p>[26] Z. Pitra, J. Repický, M. Holeňa, Boosted regression forest for the doubly trained surrogate covariance matrix adaptation evolution strategy, in: ITAT 2018, 2018, pp. 72–79.</p>
          <p>[27] I. Loshchilov, M. Schoenauer, M. Sebag, Comparison-based optimizers need comparison-based surrogates, in: PPSN, 2010, pp. 364–373.</p>
          <p>[28] T. Runarsson, Ordinal regression in evolutionary computation, in: PPSN, 2006, pp. 1048–1057.</p>
          <p>[29] A. Abbasnejad, D. V. Arnold, Adaptive function value warping for surrogate model assisted evolutionary optimization, in: Parallel Problem Solving from Nature – PPSN XVII: 17th International Conference, PPSN 2022, Dortmund, Germany, 2022, pp. 76–89.</p>
          <p>[30] M. Wu, A. Karkar, B. Liu, A. Yakovlev, G. Gielen, V. Grout, Network on chip optimization based on surrogate model assisted evolutionary algorithms, in: IEEE CEC, 2014, pp. 3266–3271.</p>
          <p>[31] Y. Diouane, V. Picheny, R. Le Riche, A. Di Perrotolo, TREGO: a trust-region framework for efficient global optimization, Journal of Global Optimization 85 (2022) 10.1007/s10898-022-01245-w (doi).</p>
          <p>[32] D. Jones, M. Schonlau, W. Welch, Efficient global optimization of expensive black-box functions, Journal of Global Optimization 13 (1998) 455–492.</p>
          <p>[33] J. Knowles, ParEGO: a hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems, IEEE Transactions on Evolutionary Computation 10 (2006) 50–66.</p>
          <p>[34] Y. He, Y. Yuen, Black box algorithm selection by convolutional neural network, in: LOD, 2020, pp. 264–280.</p>
          <p>[35] M. Pikalov, V. Mironovich, Automated parameter choice with exploratory landscape analysis and machine learning, in: GECCO, 2021, pp. 1982–1985.</p>
          <p>[36] R. Prager, M. Seiler, H. Trautmann, P. Kerschke, Towards feature-free automated algorithm selection for single-objective continuous black box optimization, in: IEEE SCI, 2021, pp. 1–8.</p>
          <p>[37] Z. Pitra, L. Bajer, M. Holeňa, Knowledge-based selection of Gaussian process surrogates, in: ECML Workshop IAL, 2019, pp. 48–63.</p>
          <p>[38] M. V. Seiler, R. Prager, P. Kerschke, H. Trautmann, A collection of deep learning-based feature-free approaches for characterizing single-objective continuous fitness landscapes, in: GECCO, 2022, pp. 657–665.</p>
          <p>[39] A. Jankovic, G. Popovski, T. Eftimov, C. Doerr, The impact of hyper-parameter tuning for landscape-aware performance regression and algorithm selection, in: GECCO, 2021, pp. 687–696.</p>
          <p>[40] N. Hansen, A. Ostermeier, Completely derandomized self-adaptation in evolution strategies, Evolutionary Computation 9 (2001) 159–195.</p>
          <p>[41] N. Hansen, The CMA evolution strategy: A comparing review, in: Towards a New Evolutionary Computation, Springer, 2006, pp. 75–102.</p>
          <p>[42] H. Mohammadi, R. Riche, E. Touboul, Making EGO and CMA-ES complementary for global optimization, in: Learning and Intelligent Optimization, Springer, 2015, pp. 287–292.</p>
          <p>[43] N. Hansen, CMA-ES source code, 2016. https://cma-es.github.io.</p>
          <p>[44] J. de Nobel, D. Vermetten, The source code of the modular Python version of CMA-ES, 2020. https://github.com/IOHprofiler/ModularCMAES.</p>
          <p>[45] E. Rasmussen, C. Williams, Gaussian Processes for Machine Learning, MIT Press, Cambridge, 2006.</p>
          <p>sponse Surface Methodology: Process and Product Optimization Using Designed Experiments, John Wiley and Sons, Hoboken, 2009.</p>
          <p>[51] T. Bui, D. Hernandez-Lobato, J. Hernandez-Lobato, Y. Li, R. Turner, Deep Gaussian processes for regression using approximate expectation propagation, in: ICML, 2016, pp. 1472–1481.</p>
          <p>[52] G. Hernández-Muñoz, C. Villacampa-Calvo, D. Hernández Lobato, Deep Gaussian processes using expectation propagation and Monte Carlo methods, in: ECML PKDD 2020, 2021, pp. 479–494.</p>
          <p>[53] B. He, B. Lakshminarayanan, Y. Teh, Bayesian deep ensembles via the neural tangent kernel, in: NIPS, 2020, pp. 1–13.</p>
          <p>[54] A. Jacot, F. Gabriel, C. Hongler, Neural tangent kernel: Convergence and generalization in neural networks, in: NIPS, 2018, pp. 1–10.</p>
          <p>[55] B. Paria, B. Pòczos, K. Ravikumar, S. J., A. Suggala, et al., Be greedy – a simple algorithm for blackbox optimization using neural networks, in: ICML Workshop on Adaptive Experimental Design and Active Learning in the Real World, 2022, pp. 1–27.</p>
          <p>[56] S. Arora, S. Du, W. Hu, Z. Li, R. Salakhutdinov, et al., On exact computation with an infinitely wide neural net, in: NIPS, 2019, pp. 1–10.</p>
          <p>[57] J. Lee, L. Xiao, S. Schoenholz, Y. Bahri, R. Novak, et al., Wide neural networks of any depth evolve as linear models under gradient descent, in: NIPS, 2019, pp. 1–10.</p>
          <p>[58] S. Kim, P. Lu, C. Lob, J. Smith, J. Snoek, et al., Deep learning for Bayesian optimization of scientific problems with high-dimensional structure, Transactions on Machine Learning Research 1 (2023) openreview tPMQ6Je2rB.</p>
          <p>[59] A. Tripp, E. Daxberger, J. Hernández-Lobato, Sample-efficient optimization in the latent space of deep generative models via weighted retraining, in: NIPS, 2020, pp. 1–14.
[46] D. Duvenaud, J. Lloyd, R. Grosse, J. Tenebaum, [60] M. Gillhofer, H. Ramsauer, J. Brandstetter, B. Schäfl,
G. Zoubin, Structure discovery in nonparametric S. Hochreiter, A GAN based solver of black-box
regression through compositional kernel search, in: inverse problems, in: NIPS, 2019, pp. 1–5.
30th International Conference on Machine Learn- [61] M. Lu, S. Ning, S. Liu, F. Sun, B. Zhang, et al.,
OPTing, 2013, pp. 1166–1174. GAN: A broad-spectrum global optimizer for
black[47] D. Duvenaud, Automatic Model Construction with box problems by learning distribution, 2022. Arxiv
Gaussian Processes, Ph.D. thesis, University of Cam- 2102.03888v5.</p>
          <p>bridge, 2014.
[48] C. Doerr, H. Wang, F. Ye, S. van Rijn, T. Bäck,</p>
          <p>IOHprofiler: A benchmarking and profiling tool
for iterative optimization heuristics, 2018. Arxiv
1810.05281.
[49] C. Doerr, F. Ye, N. Horesh, H. Wnag, O. Shir, et al.,</p>
          <p>Benchmarking discrete optimization heuristics with
IOHprofiler, Applied Soft Computing Journal 88
(2020) 106027 (paper no.).
[50] R. Myers, D. Montgomery, C. Anderson-Cook,
Re</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Baerns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Holeňa</surname>
          </string-name>
          ,
          <article-title>Combinatorial Development of Solid Catalytic Materials</article-title>
          .
          <article-title>Design of High-Throughput Experiments, Data Analysis, Data Mining</article-title>
          , Imperial College Press / World Scientific, London,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bajer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pitra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Repický</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Holeňa</surname>
          </string-name>
          ,
          <article-title>Gaussian process surrogate models for the CMA evolution strategy</article-title>
          ,
          <source>Evolutionary Computation</source>
          <volume>27</volume>
          (
          <year>2019</year>
          )
          <fpage>665</fpage>
          -
          <lpage>697</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Büche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schraudolph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Koumoutsakos</surname>
          </string-name>
          ,
          <article-title>Accelerating evolutionary algorithms with Gaussian process fitness function models</article-title>
          ,
          <source>IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews</source>
          <volume>35</volume>
          (
          <year>2005</year>
          )
          <fpage>183</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Radi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>El Hami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bai</surname>
          </string-name>
          ,
          <article-title>CMA evolution strategy assisted by kriging model and ap-</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>