<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Suitability of Modern Neural Networks for Active and Transfer Learning in Surrogate-Assisted Black-Box Optimization</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Martin</forename><surname>Holeňa</surname></persName>
							<email>martin@cs.cas.cz</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Czech Academy of Sciences</orgName>
								<orgName type="department" key="dep2">Institute of Computer Science</orgName>
								<address>
									<settlement>Prague</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Czech Technical University</orgName>
								<address>
									<settlement>Prague</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jan</forename><surname>Koza</surname></persName>
							<email>kozajan@fit.cvut.cz</email>
							<affiliation key="aff1">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Czech Technical University</orgName>
								<address>
									<settlement>Prague</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Suitability of Modern Neural Networks for Active and Transfer Learning in Surrogate-Assisted Black-Box Optimization</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">6C30A2C47C606570F7BCA77DFF9D1E5E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:23+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Active learning plays a crucial role in black-box optimization, especially for objective functions that are expensive to evaluate. Continuous black-box optimization has adopted an approach called surrogate modelling, in which the original black-box objective is approximated with a regression model. An active learning task in this context is to decide which points should be evaluated with the original objective in order to update the surrogate model. Apart from low-order polynomials, the first surrogate models were artificial neural networks of the kinds multilayer perceptron and radial basis function network. In the late 2000s, neural networks were superseded by other kinds of surrogate models, primarily Gaussian processes. However, over the last 15 years, neural networks have seen significant and successful development, suggesting that they once again have the potential to serve as promising surrogate models. This paper reviews possible research directions concerning that potential and recalls initial results from investigations in some of these directions. Finally, it contributes to those results by investigating the state-of-the-art black-box optimizer CMA-ES surrogate-assisted by two variants of random-activation-function neural network ensembles.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>One area where active learning plays a particularly important role is black-box optimization (BBO), i.e., optimization of objective functions for which no analytical description is provided. It employs optimization methods that need as input only points in the search space paired with the respective values of the objective function obtained in a non-analytical way, e.g. from sensors, in experiments, or through numerical simulations. Most frequently used are evolutionary optimization approaches, such as evolution strategies, genetic algorithms, and differential evolution, or other metaheuristics, such as particle swarm optimization.</p><p>Because BBO methods receive only information about values of the objective function, they typically need many such values. This is a problem in situations when evaluating the black-box objective function is time-consuming and/or expensive. That is frequently the case if it is evaluated empirically in experiments. For example, for the evolutionary optimization tasks described in the book <ref type="bibr" target="#b0">[1]</ref>, the evaluation of a comparatively small generation of a genetic algorithm can sometimes take more than a week and cost more than 10,000 €. To deal with expensive evaluations, continuous BBO adopted in the late 1990s and early 2000s an approach called surrogate modelling or metamodelling <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>. 
In principle, a surrogate model is any regression model that approximates the original black-box objective function with sufficient fidelity, restricting the necessity of evaluating the original objective to only a small proportion of points; everywhere else, only the surrogate model is used.</p><p>Selecting the points at which the original objective function should be evaluated is the step in which active learning is involved. However, it is not active learning of a regression model, although the surrogate model itself is a regression model. The reason is that its utility functions are not based on the model, unlike the commonly used utility functions of uncertainty decrease, model performance, diversity, or surprise-novelty. Instead, they are based on the BBO task, the most common being minimizing the objective function for a given evaluation budget, and minimizing the evaluation budget for a given objective-function threshold. Nevertheless, even active learning in surrogate-assisted BBO follows the basic principle of active learning: to actively select the next model inputs according to the considered utility function.</p><p>The earliest kinds of surrogate models in continuous BBO were low-order polynomials and artificial neural networks (ANNs) of the kind multilayer perceptron (MLP). The former have always remained a suitable choice in situations when enough evaluations of the original black-box objective function are affordable for the approximation properties of polynomials to take effect. On the other hand, surrogate modelling for substantially fewer evaluations of the original objective has undergone further development during the last two decades. MLPs were soon replaced with another kind of ANNs, radial basis function networks (RBFs), which better fit local peculiarities of an objective function landscape. 
Those networks, however, have since the late 2000s been superseded by other kinds of surrogate models, primarily Gaussian processes (GPs), but also ranking support vector machines (RSVMs) and random forests (RFs). GPs are currently the most successful kind of surrogate models for BBO with a small evaluation budget on functions with complicated multimodal landscapes, mainly due to their ability to assess the uncertainty of the estimate of the original objective function at a given point, more precisely, to provide the probability distribution of this estimate. That property of GPs makes it possible to combine the original BBO method, e.g. an evolutionary one, with Bayesian optimization.</p><p>Consequently, little attention has been paid to ANN-based surrogate models in continuous BBO during the last 15 years. This contrasts with the intense and successful development of the ANN area during that time, which suggests that ANNs again have the potential to serve as promising surrogate models. This paper attempts to make a small contribution to research into that potential, additionally presenting a review of possible directions for such research, connected with different classes of neural networks. Moreover, it also points out that ANNs can serve as the basis for transfer learning between surrogate-assisted BBO of different functions.</p><p>The next section surveys important aspects and key methods concerning surrogate-assisted continuous BBO. The review of possible research directions concerning the usability of modern neural networks in surrogate-assisted BBO is presented in Section 3. Finally, Section 4 reports an experimental contribution to one of those research directions.</p></div>
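To make the surrogate-modelling principle outlined above concrete, the following minimal sketch actively selects only the most promising candidate for each expensive evaluation, using a low-order polynomial surrogate. All names are illustrative and of our own choosing; this corresponds to none of the methods cited in this paper.

```python
import numpy as np

def surrogate_assisted_minimize(objective, dim, budget, seed=0):
    """Minimal sketch of surrogate-assisted BBO: fit a cheap surrogate on the
    points evaluated so far, and actively select only the most promising
    candidate for the next expensive evaluation."""
    rng = np.random.default_rng(seed)
    # initial design: a few random points evaluated with the true objective
    X = rng.uniform(-5.0, 5.0, size=(5, dim))
    y = np.array([objective(x) for x in X])
    while len(y) < budget:
        # surrogate: low-order polynomial (quadratic features + least squares)
        Phi = np.hstack([X, X**2, np.ones((len(X), 1))])
        w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        # sample candidates around the incumbent and rank them by the surrogate
        best = X[np.argmin(y)]
        C = best + rng.normal(scale=1.0, size=(50, dim))
        Pc = np.hstack([C, C**2, np.ones((50, 1))])
        # only the candidate with the lowest surrogate prediction is evaluated
        # with the expensive true objective
        x_next = C[np.argmin(Pc @ w)]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()
```

On a simple quadratic test function, this loop spends only `budget` true evaluations, while the surrogate ranks all remaining candidates.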
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Surrogate-Assisted Continuous BBO</head><p>Surrogate modelling for continuous BBO relies on the combination and interaction of three components: a regression model serving as a surrogate of the original black-box objective function, a BBO method seeking the optimum of that objective function, and a strategy determining when to evaluate the original objective function and when its surrogate model. In the context of evolutionary BBO, that strategy is usually called evolution control <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13]</ref>. There are two further aspects, namely observing constraints on the feasible set of the black-box objective function (cf. e.g. <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>) and generalizing surrogate modelling from a single objective to multiple objectives (cf. e.g. <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17]</ref>); however, we will restrict our attention to single-objective unconstrained optimization.</p><p>As already mentioned in the introduction, if sufficiently many evaluations of the original black-box objective function are affordable, the most suitable kind of surrogate models are low-order polynomials, typically quadratic functions <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b20">20,</ref><ref type="bibr" target="#b21">21,</ref><ref type="bibr" target="#b22">22]</ref>. 
For substantially fewer evaluations, the most traditional kind has been MLPs <ref type="bibr" target="#b23">[23,</ref><ref type="bibr" target="#b8">9]</ref>, soon replaced with RBFs <ref type="bibr" target="#b24">[24,</ref><ref type="bibr" target="#b25">25,</ref><ref type="bibr" target="#b26">26,</ref><ref type="bibr" target="#b21">21,</ref><ref type="bibr" target="#b22">22]</ref>, and since the late 2000s with GPs, a.k.a. kriging <ref type="bibr" target="#b27">[27,</ref><ref type="bibr" target="#b28">28,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b29">29,</ref><ref type="bibr" target="#b30">30]</ref>. Occasionally, RBFs were used as local models in combination with GP-based global models <ref type="bibr" target="#b31">[31]</ref>. Other kinds of surrogate models employed during the last decade include decision trees <ref type="bibr" target="#b32">[32]</ref>, RFs <ref type="bibr" target="#b33">[33,</ref><ref type="bibr" target="#b34">34,</ref><ref type="bibr" target="#b32">32]</ref>, and RSVMs <ref type="bibr" target="#b35">[35,</ref><ref type="bibr" target="#b36">36]</ref>. The last kind has the exceptional property of invariance with respect to order-preserving transformations of the objective function. This is important in situations when the BBO algorithm possesses such invariance, a property frequently encountered in evolutionary algorithms. On the other hand, the surrogate modelling methods proposed in <ref type="bibr" target="#b10">[11]</ref> and <ref type="bibr" target="#b28">[28]</ref> use GPs to perform preselection based on a partial ordering that is also invariant with respect to order-preserving transformations. More importantly, the adaptive function-value warping approach recently proposed in <ref type="bibr" target="#b37">[37]</ref> aims at providing such invariance to any surrogate model. 
As a final remark on different kinds of surrogate models, important works on that topic typically consider several kinds <ref type="bibr" target="#b38">[38,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b39">39,</ref><ref type="bibr" target="#b20">20,</ref><ref type="bibr" target="#b32">32]</ref>, to compare them and select the best among them, and in <ref type="bibr" target="#b22">[22,</ref><ref type="bibr" target="#b39">39]</ref> also to aggregate their results, thus providing a team of surrogate models.</p><p>As to the BBO methods, not only the two most important kinds of surrogate models, i.e. low-order polynomials <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b20">20]</ref> and GPs <ref type="bibr" target="#b26">[26,</ref><ref type="bibr" target="#b28">28,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b29">29,</ref><ref type="bibr" target="#b30">30]</ref>, but also the less common RBFs, RFs, and RSVMs <ref type="bibr" target="#b24">[24,</ref><ref type="bibr" target="#b36">36,</ref><ref type="bibr" target="#b33">33,</ref><ref type="bibr" target="#b34">34]</ref> are most often combined with the covariance matrix adaptation evolution strategy (CMA-ES). That is not surprising, because CMA-ES became a state-of-the-art approach to single-objective unconstrained continuous BBO already in the 2000s. Basically, CMA-ES evolves a Gaussian estimate of the position of the minimum of the original objective function. That evolution relies on the simultaneous adaptation of the mean vector of the Gaussian estimate, of the scalar step size, and of the covariance matrix. For more details of this sophisticated evolution strategy, the reader is referred to the journal papers <ref type="bibr" target="#b40">[40,</ref><ref type="bibr" target="#b41">41]</ref>. 
GPs have also been combined with other evolutionary optimization methods <ref type="bibr" target="#b27">[27,</ref><ref type="bibr" target="#b42">42]</ref>, and GPs, polynomials, and RBFs have been combined with particle swarm optimization <ref type="bibr" target="#b22">[22]</ref> and with memetic optimization <ref type="bibr" target="#b25">[25]</ref>. Moreover, GPs are used in black-box optimization in two different ways. In connection with evolutionary and similar BBO methods, they serve as a regression model evaluated instead of the original objective function. In addition, they also play a key role in Bayesian optimization, which relies on GP estimates of probability distributions of values of the original objective. Those probability distributions enable several ways of searching for optima of that objective function, each of them governed by a specific assessment of the uncertainty of the objective-function estimate, commonly called an acquisition function <ref type="bibr" target="#b43">[43,</ref><ref type="bibr" target="#b44">44,</ref><ref type="bibr" target="#b45">45]</ref>. Occasionally, Bayesian optimization is combined with CMA-ES. For example, in <ref type="bibr" target="#b46">[46]</ref>, optimization switches from the most traditional Bayesian optimization method, EGO (Efficient Global Optimization) <ref type="bibr" target="#b43">[43]</ref>, to CMA-ES.</p><p>Finally, since the first surrogate-assisted BBO methods, evolution control has been performed in basically two ways: generation-based and individual-based. In generation-based evolution control, all points are evaluated with the true objective function in some generations, and with the model in the remaining generations. In individual-based evolution control, on the other hand, all points of each generation are first evaluated with the model, and based on those evaluations, a preselection of points to be evaluated with the true objective function is performed <ref type="bibr" target="#b8">[9]</ref>. 
In most surrogate-assisted methods, however, the evolution control is specifically tailored to the respective method. Notably, the authors of <ref type="bibr" target="#b12">[13]</ref> investigated mutually replacing the evolution control of two important polynomial-assisted methods, lmm-CMA <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19]</ref> and lq-CMA-ES <ref type="bibr" target="#b20">[20]</ref>, and of two variants of the GP-assisted method DTS-CMA-ES <ref type="bibr" target="#b47">[47,</ref><ref type="bibr" target="#b11">12]</ref> with each other's evolution control. According to their findings, the success of those important methods is definitely not limited to using the respective specifically tailored evolution control. Surrogate-assisted black-box optimization methods constructing several surrogate models simultaneously either aggregate them into a team <ref type="bibr" target="#b25">[25,</ref><ref type="bibr" target="#b22">22]</ref> or complement the evolution control with a classifier selecting the most appropriate among those models. Important examples of classifiers used in this context are ANNs <ref type="bibr" target="#b48">[48,</ref><ref type="bibr" target="#b49">49,</ref><ref type="bibr" target="#b50">50]</ref> and classification trees <ref type="bibr" target="#b51">[51,</ref><ref type="bibr" target="#b52">52]</ref>. Their learning can be viewed as metalearning because it is based on metafeatures, i.e. properties empirically characterizing the objective function landscape and the BBO method <ref type="bibr" target="#b21">[21,</ref><ref type="bibr" target="#b32">32,</ref><ref type="bibr" target="#b49">49,</ref><ref type="bibr" target="#b53">53]</ref>. Apart from classification according to the appropriateness of a surrogate model for the considered data, metalearning can also be used for regression of the model error on the combination of values of metafeatures <ref type="bibr" target="#b54">[54]</ref>.</p></div>
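The two basic kinds of evolution control described above can be illustrated by the following schematic sketch. The function names and the selection criterion are illustrative assumptions of ours, not taken from any cited method.

```python
import numpy as np

def generation_based_control(gen_index, period=3):
    """Generation-based evolution control: every `period`-th generation is
    evaluated with the true objective, the remaining ones with the surrogate."""
    return "true_objective" if gen_index % period == 0 else "surrogate"

def individual_based_control(population, surrogate, n_true):
    """Individual-based evolution control: evaluate the whole population with
    the surrogate, then preselect the `n_true` most promising individuals for
    evaluation with the true objective."""
    predictions = np.array([surrogate(x) for x in population])
    order = np.argsort(predictions)    # ascending: best predicted first
    return population[order[:n_true]]  # points sent to the expensive objective
```

In practice, the preselection criterion may combine the predicted value with the model's uncertainty, which is exactly where the distribution-estimating models of Section 3 come into play.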
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Usability of Modern Neural Networks in Surrogate-Assisted BBO</head><p>This section primarily reviews eight kinds of modern neural networks that we consider worth research into their ability to serve as surrogate models in BBO. A high-level overview of those kinds of ANNs is given in Table <ref type="table" target="#tab_0">1</ref>, which for each of them mentions whether such research has already started. In Subsection 3.1, two kinds integrating GPs into ANNs are recalled. Subsection 3.2 recalls three kinds of ANNs providing the most advantageous property of GPs, their ability to estimate the distribution of black-box objective function values. Finally, in Subsection 3.3, three well-known kinds of modern neural networks, namely variational autoencoders, transformers, and generative adversarial networks, are recalled because they have already proven useful in the related area of Bayesian optimization. In addition, Subsection 3.4 is devoted to knowledge transfer in surrogate-assisted BBO, which relates to the usability of modern neural networks through their important role in transfer learning.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1: Considered kinds of modern ANNs, with their main references and the state of research into their ability to serve as surrogate models in BBO</head><table><row><cell>ANNs + main references</cell><cell>Research into the ability to serve as surrogate model in BBO</cell></row><row><cell>MLPs with a GP as the final layer <ref type="bibr" target="#b55">[55,</ref><ref type="bibr" target="#b56">56]</ref></cell><cell>First investigations <ref type="bibr" target="#b57">[57,</ref><ref type="bibr" target="#b58">58]</ref></cell></row><row><cell>Deep GP networks <ref type="bibr" target="#b59">[59,</ref><ref type="bibr" target="#b60">60,</ref><ref type="bibr" target="#b61">61,</ref><ref type="bibr" target="#b62">62,</ref><ref type="bibr" target="#b63">63]</ref></cell><cell>Not yet</cell></row><row><cell>Tangent kernel networks <ref type="bibr" target="#b64">[64,</ref><ref type="bibr" target="#b65">65]</ref></cell><cell>Not yet</cell></row><row><cell>Prior networks <ref type="bibr" target="#b66">[66,</ref><ref type="bibr" target="#b67">67,</ref><ref type="bibr" target="#b68">68,</ref><ref type="bibr" target="#b69">69,</ref><ref type="bibr" target="#b70">70]</ref></cell><cell>First investigations <ref type="bibr" target="#b71">[71]</ref></cell></row><row><cell>Ensembles of neural networks <ref type="bibr" target="#b72">[72,</ref><ref type="bibr" target="#b73">73,</ref><ref type="bibr" target="#b74">74,</ref><ref type="bibr" target="#b75">75,</ref><ref type="bibr" target="#b76">76]</ref></cell><cell>First investigations [this paper]</cell></row><row><cell>Variational autoencoders <ref type="bibr" target="#b77">[77,</ref><ref type="bibr" target="#b78">78]</ref></cell><cell>Not yet</cell></row><row><cell>Generative adversarial networks <ref type="bibr" target="#b79">[79,</ref><ref type="bibr" target="#b80">80]</ref></cell><cell>Not yet</cell></row><row><cell>Transformers <ref type="bibr" target="#b81">[81,</ref><ref type="bibr" target="#b82">82]</ref></cell><cell>Not yet</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Integration of GPs into ANNs</head><p>The integration of GPs into ANNs has been proposed on two different levels:</p><p>1. At the layer level -a GP serves as the final layer of an MLP <ref type="bibr" target="#b55">[55,</ref><ref type="bibr" target="#b56">56]</ref>. Integration on that level is based on the following two assumptions: (i) If 𝑛 𝐼 denotes the number of ANN input neurons, then the ANN computes a mapping net of 𝑛 𝐼 -dimensional input values into the set 𝒳 on which the GP is defined. Consequently, the number 𝑛 𝑂 of neurons in the last hidden layer fulfills 𝒳 ⊂ R 𝑛 𝑂 , and the ANN maps an input 𝑣 into a point 𝑥 = net(𝑣) ∈ 𝒳 , corresponding to an observation 𝑓 (𝑥 + 𝜀) governed by the GP, where 𝜀 is a zero-mean Gaussian noise. From the point of view of the ANN inputs, the GP is now 𝒢𝒫(𝑚 GP (net(•)), 𝜅(net(•), net(•))), where 𝑚 GP is the mean function and 𝜅 is the covariance function of the GP <ref type="bibr" target="#b83">[83]</ref>. (ii) The GP mean 𝜇 is assumed to be a known constant, thus not contributing to the GP hyperparameters, and independent of net. 2. At the level of individual neurons -GPs can replace all hidden and output neurons of an MLP. This kind of neural networks is commonly called deep Gaussian process <ref type="bibr" target="#b59">[59,</ref><ref type="bibr" target="#b60">60,</ref><ref type="bibr" target="#b61">61,</ref><ref type="bibr" target="#b62">62,</ref><ref type="bibr" target="#b63">63,</ref><ref type="bibr" target="#b84">84,</ref><ref type="bibr" target="#b85">85,</ref><ref type="bibr" target="#b86">86,</ref><ref type="bibr" target="#b87">87,</ref><ref type="bibr" target="#b88">88,</ref><ref type="bibr" target="#b89">89,</ref><ref type="bibr" target="#b90">90]</ref>.</p><p>Integration on both levels has been developed primarily for Bayesian modelling and optimization. 
Nevertheless, GPs integrated as the last layer of MLPs have been used as surrogate models in a CMA-ES-driven BBO <ref type="bibr" target="#b57">[57,</ref><ref type="bibr" target="#b58">58]</ref>. In particular, those surrogate models incorporate GPs with five commonly employed covariance functions: linear, quadratic, rational quadratic, squared exponential, and Matérn 5/2, as well as with one composite covariance function superposing the quadratic and squared exponential ones. Those six models were compared in <ref type="bibr" target="#b57">[57]</ref> from the point of view of regression accuracy, evaluated on a large dataset collected during many previous runs of DTS-CMA-ES on the collection of 24 noiseless benchmarks from the Comparing Continuous Optimizers platform <ref type="bibr" target="#b91">[91,</ref><ref type="bibr" target="#b92">92]</ref> (cf. Section 4) in dimensions 2, 3, 5, 10, and 20. Then, in <ref type="bibr" target="#b58">[58]</ref>, they were compared on the same benchmarks in the same dimensions from the point of view of the success of surrogate-assisted optimization with CMA-ES. Unfortunately, neither of those comparisons included more traditional surrogate models, nor the CMA-ES without surrogate assistance. To our knowledge, the only comparison that included both a GP integrated as the last layer of an MLP and more traditional surrogate models was the comparison from the point of view of regression accuracy in <ref type="bibr" target="#b93">[93]</ref>. However, it included only one such integrated surrogate model, with the GP using the simplest covariance function, the linear one, in addition to the traditional GP-based surrogate models with eight different covariance functions, including the five listed above.</p></div>
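The layer-level integration recalled above can be illustrated by a simplified sketch of GP prediction on top of a fixed MLP feature map. This is our own illustrative construction (untrained weights, zero constant mean, squared-exponential covariance), not the implementation used in the cited works.

```python
import numpy as np

def net(V, W1, W2):
    """A tiny fixed MLP feature map; its last hidden layer plays the role of
    the set X on which the GP is defined (weights are illustrative, not trained)."""
    return np.tanh(V @ W1) @ W2

def sq_exp_kernel(A, B, length=1.0):
    """Squared-exponential covariance kappa, evaluated on network outputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_last_layer_predict(V_train, y_train, V_test, W1, W2, noise=1e-6):
    """Posterior mean and variance of the composite model
    GP(m(net(.)), kappa(net(.), net(.))) with a constant zero mean."""
    X, Xs = net(V_train, W1, W2), net(V_test, W1, W2)
    K = sq_exp_kernel(X, X) + noise * np.eye(len(X))
    Ks = sq_exp_kernel(Xs, X)
    mean = Ks @ np.linalg.solve(K, y_train)
    # prior variance kappa(x, x) = 1, reduced by what the data explain
    var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mean, var
```

In the cited approaches, the network weights and GP hyperparameters are trained jointly; here they are fixed only to keep the sketch short.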
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">ANNs Estimating the Distribution of Black-Box Objective Function Values</head><p>In our opinion, the property of GPs most advantageous from the point of view of surrogate modelling is that they estimate the whole distribution of a predicted value of the original black-box objective function. Recall from Section 2 that, due to that property, ensembles of regression trees (RFs) are also used as surrogate models <ref type="bibr" target="#b33">[33,</ref><ref type="bibr" target="#b34">34,</ref><ref type="bibr" target="#b32">32]</ref>. This draws attention to those modern neural networks that also allow estimation of such a distribution. Basically, there are three classes of them, differing in how that estimate can be obtained.</p><p>1. The multivariate normal distribution underlying GPs is actually the asymptotic distribution for the network width increasing to infinity. Such results have been established for several kinds of ANNs <ref type="bibr" target="#b94">[94,</ref><ref type="bibr" target="#b88">88,</ref><ref type="bibr" target="#b95">95,</ref><ref type="bibr" target="#b96">96,</ref><ref type="bibr" target="#b97">97]</ref>. In addition, closely related is the infinite width limit of the neural tangent kernel, which governs the kernel gradient of the functional cost used in MLP regression <ref type="bibr" target="#b64">[64,</ref><ref type="bibr" target="#b65">65]</ref>.</p><p>Although those results have great theoretical value, there can be a serious disparity between the infinite width results and their finite width counterparts <ref type="bibr" target="#b77">[77]</ref>. Therefore, it is unclear whether they can be applied to surrogate modelling. 2. The distribution of a predicted value, or more precisely the parameters of such a distribution, can be directly learned by an ANN. 
The best-known kind of such neural networks are prior networks, which learn the parameters of a normal-inverse-Wishart distribution, the conjugate prior to a multivariate normal distribution <ref type="bibr" target="#b66">[66,</ref><ref type="bibr" target="#b67">67,</ref><ref type="bibr" target="#b68">68,</ref><ref type="bibr" target="#b69">69,</ref><ref type="bibr" target="#b70">70,</ref><ref type="bibr" target="#b98">98]</ref>. Prior networks belong to a broader class of evidential neural networks <ref type="bibr" target="#b99">[99,</ref><ref type="bibr" target="#b100">100,</ref><ref type="bibr" target="#b101">101,</ref><ref type="bibr" target="#b102">102,</ref><ref type="bibr" target="#b103">103]</ref>. Their name refers to the fact that they follow the basic principle of the Dempster-Shafer theory of evidence <ref type="bibr" target="#b104">[104]</ref>: to fall back onto prior belief for unfamiliar data. 3. An estimate of the distribution of a predicted value is produced by an ensemble of neural networks.</p><p>Important kinds of such ensembles are ensembles obtained through diversification of training data <ref type="bibr" target="#b105">[105,</ref><ref type="bibr" target="#b106">106]</ref>; ensembles obtained through diversification of network properties <ref type="bibr" target="#b107">[107,</ref><ref type="bibr" target="#b108">108,</ref><ref type="bibr" target="#b109">109]</ref>, a specific subgroup of which are ensembles in which the diversification is achieved through diverse activation functions <ref type="bibr" target="#b76">[76]</ref>; ensembles obtained through negative correlation learning <ref type="bibr" target="#b110">[110,</ref><ref type="bibr" target="#b111">111,</ref><ref type="bibr" target="#b112">112]</ref>; bagging ensembles <ref type="bibr" target="#b72">[72,</ref><ref type="bibr" target="#b113">113]</ref>; boosting ensembles <ref type="bibr" target="#b114">[114,</ref><ref type="bibr" target="#b115">115]</ref>; deep ensembles <ref type="bibr" target="#b73">[73,</ref><ref type="bibr" target="#b74">74,</ref><ref type="bibr" target="#b116">116]</ref>, including deep echo-state network ensembles <ref type="bibr" target="#b117">[117]</ref>; and anchored ensembles <ref type="bibr" target="#b75">[75]</ref>, with a later modification, random activation function (RAF) ensembles <ref type="bibr" target="#b76">[76]</ref>. RAF ensembles take over the principle of anchored ensembles that regularization is performed not with respect to zero, but with respect to the initialization values of the parameters, which are assumed normally distributed. Differently from an anchored ensemble, however, an RAF ensemble uses varied activation functions from an a priori specified set of size 𝑛 AF . From that set, the activation function is chosen randomly, apart from the first 𝑛 AF members of the ensemble, among which each activation function occurs exactly once. We consider this last kind of ensembles to be the state of the art.</p><p>To our knowledge, the only ANNs estimating the distribution of function values that have already been used as surrogate models in BBO are prior networks. In <ref type="bibr" target="#b71">[71]</ref>, the prediction accuracy of four versions was evaluated on the above-mentioned dataset from previous runs of DTS-CMA-ES. This direction of research is continued by the present paper: Section 4 reports results for CMA-ES surrogate-assisted by two variants of RAF ensembles.</p></div>
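A minimal sketch of the RAF-ensemble idea may help to fix it. Under simplifying assumptions of our own (only the output layer is fitted, by anchored ridge regression; all names and the activation set are illustrative), it could look as follows:

```python
import numpy as np

# the a priori specified set of activation functions, here of size n_AF = 3
ACTIVATIONS = [np.tanh, lambda z: np.maximum(z, 0.0), np.sin]

def init_member(rng, dim, hidden=16):
    """Draw initial parameters from a normal prior; they also serve as anchors."""
    W = rng.normal(size=(dim, hidden))
    v = rng.normal(size=hidden)
    return {"W": W, "v": v, "anchor_v": v.copy()}

def fit_member(m, act, X, y, lam=1e-2):
    """Anchored ridge regression on the output layer: the penalty pulls the
    weights toward their initialization values, not toward zero."""
    H = act(X @ m["W"])
    A = H.T @ H + lam * np.eye(H.shape[1])
    m["v"] = np.linalg.solve(A, H.T @ y + lam * m["anchor_v"])
    return m

def raf_ensemble_predict(X_train, y_train, X_test, n_members=6, seed=0):
    """RAF ensemble sketch: each of the first n_AF members gets a distinct
    activation function, the remaining members draw one at random; the spread
    of member predictions estimates the predictive distribution."""
    rng = np.random.default_rng(seed)
    preds = []
    for i in range(n_members):
        act = ACTIVATIONS[i if i < len(ACTIVATIONS)
                          else rng.integers(len(ACTIVATIONS))]
        m = fit_member(init_member(rng, X_train.shape[1]), act, X_train, y_train)
        preds.append(act(X_test @ m["W"]) @ m["v"])
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)
```

The returned per-point standard deviation is what a surrogate-assisted optimizer can use in place of the GP posterior variance.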
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">ANNs Found Useful in Bayesian Optimization</head><p>Recall from Section 2 that GPs, alongside their importance as surrogate models in BBO with non-Bayesian methods such as CMA-ES, also play a crucial role in Bayesian optimization. That is why this subsection lists three well-known kinds of modern neural networks that have recently been found useful in Bayesian optimization. In our opinion, this indicates that it is worth investigating whether they could also be used in surrogate-assisted BBO.</p><p>1. Variational autoencoders have been utilized in Bayesian optimization because they allow for optimization in a lower-dimensional latent space <ref type="bibr" target="#b77">[77,</ref><ref type="bibr" target="#b78">78]</ref>. 2. The generative adversarial networks (GANs) paradigm has recently been shown to be applicable to BBO: a generator proposes samples that align with the distribution of low values (or even the optimal value) of the black-box function, while one or more discriminators classify samples based on whether they belong to that distribution <ref type="bibr" target="#b79">[79,</ref><ref type="bibr" target="#b80">80]</ref>. 3. Transformers have proven effective in estimating complex prior distributions for Bayesian optimization <ref type="bibr" target="#b81">[81,</ref><ref type="bibr" target="#b82">82]</ref>. Notably, the OptFormer transformer, trained on Google Vizier <ref type="bibr" target="#b118">[118]</ref>, the largest hyperparameter optimization (HPO) database, achieved superior HPO outcomes compared to GP-based Bayesian optimization <ref type="bibr" target="#b81">[81]</ref>. Furthermore, the recently introduced transformer-based prior-data fitted networks <ref type="bibr" target="#b82">[82]</ref> can mimic GPs and Bayesian networks, while also incorporating additional information into the prior.</p></div>
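As a concrete example of the acquisition functions mentioned in Section 2, which any of the distribution-estimating models above could feed, the expected improvement criterion underlying EGO can be computed from a surrogate's predictive mean and standard deviation. This is the standard textbook formula for minimization, not code from any cited work.

```python
import math

def expected_improvement(mu, sigma, best_f):
    """Expected improvement (EI) at a candidate point, for minimization:
    `mu` and `sigma` are the surrogate's predictive mean and standard
    deviation there, `best_f` the best objective value found so far."""
    if sigma <= 0.0:
        return 0.0  # no predictive uncertainty, no expected improvement
    z = (best_f - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # exploitation term (low predicted value) + exploration term (high sigma)
    return (best_f - mu) * cdf + sigma * pdf
```

The point maximizing EI over the search space is the one actively selected for the next expensive evaluation.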
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">ANN-Based Transfer Learning for Surrogate-Assisted Black-Box Optimization</head><p>Obtaining accurate surrogate models in the initial stages of BBO is challenging due to the scarcity of data points with evaluated objective function values. That can be mitigated by leveraging knowledgetransfer learning. And a connection of modern kinds of neural networks with transfer learning is even more obvious than with active learning. Indeed, transfer learning is nowadays one of the areas where ANNs play most important role <ref type="bibr" target="#b119">[119,</ref><ref type="bibr" target="#b120">120,</ref><ref type="bibr" target="#b121">121]</ref>. Different types of ANNs have been utilized to this end, including convolutional <ref type="bibr" target="#b122">[122,</ref><ref type="bibr" target="#b123">123]</ref>, recurrent <ref type="bibr" target="#b124">[124]</ref>, autoencoder <ref type="bibr" target="#b125">[125,</ref><ref type="bibr" target="#b126">126]</ref>, GAN <ref type="bibr" target="#b127">[127,</ref><ref type="bibr" target="#b128">128,</ref><ref type="bibr" target="#b129">129]</ref>, and transformer <ref type="bibr" target="#b81">[81]</ref>. In the context of the research direction pursued in this paper, most interesting are those that also have connections to BBO: (i) Four ANN-based transfer learning approaches draw inspiration from the GAN paradigm. CoGAN trains two GANs to generate the source and target, respectively, achieves a domain invariant feature space by tying the high layers parameters of the two GANs, and performs domain adaptation by training a classifier on the discriminator output <ref type="bibr" target="#b130">[130]</ref>. 
Adversarial discriminative domain adaptation first learns a discriminative representation using the labels in the source domain and then, using a domain-adversarial loss, learns a separate encoding that maps the target data to the same space through an asymmetric mapping <ref type="bibr" target="#b127">[127]</ref>. Minimax-game-based selective transfer learning employs a selector and a discriminator to identify source domain data resembling the target domain's distribution, and to distinguish genuine target domain data from selected source domain data, respectively <ref type="bibr" target="#b129">[129]</ref>. Selective adversarial network addresses negative transfer by excluding outlier classes from the source domain selection and maximizing the similarity between source and target domain data distributions <ref type="bibr" target="#b128">[128]</ref>. (ii) An autoencoder for transfer learning, described in <ref type="bibr" target="#b125">[125,</ref><ref type="bibr" target="#b126">126]</ref>, incorporates embedding and label encoding layers. The embedding layer reduces the disparity between instance distributions from the source and target domains, while the label encoding layer utilizes a softmax regression model to encode label information from the source domain.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>(iii)</head><p>The transformer OptFormer has demonstrated competitiveness with specific transfer learning methods, although its usage leans more toward metalearning than traditional transfer learning <ref type="bibr" target="#b81">[81]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experimental Evaluation of RAF Ensembles</head><p>This section describes a small experimental contribution to one of the above surveyed possible research directions: RAF ensembles are experimentally evaluated as surrogate models for CMA-ES. The experiments were performed on the probably most commonly used platform for experimenting in continuous optimization -COCO ( Comparing Continuous Optimizers) <ref type="bibr" target="#b92">[92]</ref>. COCO contains severeal suites of benchmark functions, our evaluation was performed with the most traditional suite, which is the bbob suite <ref type="bibr" target="#b92">[92]</ref>. It consists of 24 dimension-scalable noiseless benchmark functions, the definitions of which have been given in <ref type="bibr" target="#b91">[91]</ref>. Each function is used in 15 differently rotated and/or translated instances. The employed benchmarks forming the bobo suite are surveyed in Appendix A.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Considered Variants of RAF Ensembles</head><p>As activation functions forming an RAF ensemble, we employed those included in the implementation <ref type="bibr" target="#b131">[131]</ref>, to which the RAF paper refers <ref type="bibr" target="#b76">[76]</ref>. They are listed in Appendix B. We used them in two variants of RAF ensembles:</p><p>1. An RAF ensemble of size 5 trained directly using the above mentioned implementation <ref type="bibr" target="#b131">[131]</ref>, and aggregated by the empirical mean. In the results, it will be denoted simply RAF. 2. An ensemble of size 5, in which the differences of values of the original black-box objective function with respect to its median are first transformed to their logarithms before using <ref type="bibr" target="#b131">[131]</ref> in the logarithmic scale to train the ensemble. This transformation attempts to deal with situations when the function returns in many points values close to the median. The aggregation function is again the empirical mean, which in terms of the data before the logarithmic transformation actually corresponds to the empirical geometric mean. That version will be in the results denoted RAF-log.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Considered CMA-ES Variants for Comparison</head><p>CMA-ES surrogate-assisted by the above mentioned two variants of RAF ensembles was compared with CMA-ES without surrogate modelling, as well as with two earlier surrogate-assisted variants of CMA-ES:</p><p>3. CMA-ES without surrogate modelling was used in an implementation that is in the COCO data archive <ref type="bibr" target="#b132">[132]</ref> called default-CMA-ES, and described as "default CMA-ES from the pycma module, version 3.3.0". Here, it will be in the results denoted simply default. 4. DTS-CMA-ES <ref type="bibr" target="#b11">[12]</ref>, using a surrogate GP with the covariance function Matérn 5  2 . In the results, it will be denoted simply DTS. <ref type="bibr" target="#b20">[20]</ref>, which will be in the results denoted simply lq.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">lq-CMA-ES</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Evolution Control</head><p>Whereas DTS-CMA-ES and lq-CMA-ES have each their own evolution control, for the two variants of RAF ensembles was necessary to propose when to evaluate a given point 𝑥 by the original black-box objective function 𝐹 bb , and when by its surrogate model 𝐹 sm . We decided to use a modification of the lq-CMA-ES evolution control. That modification is described below in Algorithm 1 using the notation 𝜏 ((𝑦 1 , . . . , 𝑦 𝑘 ), (𝑧 1 , . . . , 𝑧 𝑘 )) for the Kendall correlation coefficient between the sequences (𝑦 1 , . . . , 𝑦 𝑘 ) and (𝑧 1 , . . . , 𝑧 𝑘 ), and the notation 𝜌 for the ranking function on R 𝑑 , i.e., 𝜌 : R 𝑑 → Π(𝑑) with Π(𝑑) denoting the set of permutations of {1, . . . , 𝑑}</p><formula xml:id="formula_0">such that ∀𝑦 ∈ R 𝑑 : (𝜌(𝑦)) 𝑖 &lt; (𝜌(𝑦)) 𝑗 ⇒ 𝑦 𝑖 ≤ 𝑦 𝑗 . (1)</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Results</head><p>In Tables <ref type="table">2-3</ref>, the two considered variants of RAF ensembles, and three considered other CMA-ES variants, are compared based on the difference between the optimal value of the objective function, and its value achieved for a given evaluation budget. The achieved values were averaged over the 15 instances provided by the COCO benchmark suite in each dimension for each of the 24 noiseless functions listed in Appendix A. The comparisons were performed separately for each of the five above described groups of those functions, and subsequently also for all 24 of them, each time including the instances in dimensions 2, 3, 5, 10, and 20. For each evaluation budget, hence, six evaluations were</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Comparison of CMA-ES surrogate-assisted by RAF, and by RAF-log, with CMA-ES without surrogate modelling, with lq-CMA-ES, and with DTS-CMA-ES, for evaluation budget 3×dimension. Each cell of each sub-table records the number of function-dimension combinations, for which the method in the row achieved with the evaluation budget a lower value, averaged over the 15 COCO instances, than the method in the column. Ties within the considered precision are halved between both methods. If the Friedman test rejected the hypothesis of equivalence of all methods, and according to the subsequent Wilcoxon signed-rank test with Holm correction, the method in the row is significantly better than the method in the column, the number in the cell is in bold with * for the familywise level 5 %, and with ** for the familywise level 1 %. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Comparison of CMA-ES surrogate-assisted by RAF, and by RAF-log, with CMA-ES without surrogate modelling, with lq-CMA-ES, and with DTS-CMA-ES, for evaluation budget 50×dimension. Each cell of each sub-table records the number of function-dimension combinations, for which the method in the row achieved with the evaluation budget a lower value, averaged over the 15 COCO instances, than the method in the column. Ties within the considered precision are halved between both methods. If the Friedman test rejected the hypothesis of equivalence of all methods, and according to the subsequent Wilcoxon signed-rank test with Holm correction, the method in the row is significantly better than the method in the column, the number in the cell is in bold with * for the familywise level 5 %, and with ** for the familywise level 1 %.  <ref type="table">2</ref> were conducted for the evaluation budget 3×dimension, while the comparisons in Table <ref type="table">3</ref> were conducted for the evaluation budget 50×dimension.</p><p>The results of each of those 12 comparisons were subsequently assessed for statistical significance. First, the hypothesis that all five considered methods are equivalent was tested by the Friedman test. With the exception of both comparisons for multi-modal functions with adequate global structure, the test rejected that hypothesis on the familywise significance level 5%, using the Holm procedure for multiple-hypothesis correction <ref type="bibr" target="#b133">[133]</ref>. This rejection justified testing the equivalence of any two among the five methods. We adopted the arguments of <ref type="bibr" target="#b134">[134]</ref> that, in machine learning, the Wilcoxon signed-rank test is more appropriate for this purpose than the post-hoc tests presented in <ref type="bibr" target="#b135">[135]</ref> and <ref type="bibr" target="#b133">[133]</ref>. 
If, for two particular methods, the Wilcoxon signed-rank test rejected the hypothesis that they are equivalent, then in the respective table, their comparison in the row corresponding to the method that was more frequently better is shown in bold italics.</p><p>The results in Tables 2-3 primarily confirm the superior performance of the methods lq-CMA-ES and DTS-CMA-ES. In the two comparisons based on all 120 noiseless benchmark functions, each of them is, for both considered budgets, significantly better not only than default CMA-ES, but also than CMA-ES surrogate-assisted by the two variants of RAF ensembles. Moreover, among the 10 comparisons based on individual groups of functions, lq-CMA-ES is 6 times significantly better than default CMA-ES, and 7 times and 5 times significantly better than CMA-ES surrogate-assisted by RAF and by RAF-log, respectively. For DTS-CMA-ES, the results of the 10 comparisons based on individual groups of functions are less convincing: it is significantly better 3 times than default CMA-ES, 3 times than CMA-ES surrogate-assisted by RAF, and only once than CMA-ES assisted by RAF-log. As to a comparison between the two variants of RAF ensembles, the differences between them were not significant, apart from unimodal functions with high conditioning, for which CMA-ES achieves significantly better results when assisted by RAF than when assisted by RAF-log.</p><p>The different progress of optimization performed by each of the compared methods is illustrated, always in three particular dimensions, by means of optimization-progress plots. They show the average difference Δ 𝑓 between the optimal and achieved value of the objective function over the 15 COCO instances. For that illustration, we have chosen the functions 𝑓 9 (Figure <ref type="figure" target="#fig_0">1</ref>), 𝑓 18 (Figure <ref type="figure" target="#fig_1">2</ref>), and 𝑓 20 (Figure <ref type="figure" target="#fig_2">3</ref>). 
We can see that optimization using CMA-ES surrogate-assisted by RAF or RAF-log sometimes leads to a similarly fast decrease of the objective function as, or even a faster one than, optimization using the state-of-the-art methods DTS-CMA-ES or lq-CMA-ES. In Figure <ref type="figure" target="#fig_0">1</ref>, this is the case for RAF-log in dimension 2. In Figure <ref type="figure" target="#fig_1">2</ref>, dimension  </p></div>
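The Holm correction used for the pairwise Wilcoxon tests above is a simple step-down procedure and can be sketched in plain Python. This is an illustrative sketch of the standard procedure, not the code used to produce the tables.

```python
# Holm step-down adjustment of a family of p-values: sort them ascending,
# multiply the k-th smallest by (m - k), and enforce monotonicity so that a
# hypothesis can only be rejected if all those with smaller p-values are too.
def holm_adjust(p_values):
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for k, i in enumerate(order):
        running_max = max(running_max, (m - k) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

# Three hypothetical pairwise p-values; a method pair is significantly
# different at the familywise level alpha iff its adjusted p-value < alpha.
adj = holm_adjust([0.01, 0.04, 0.03])
```

Comparing each adjusted p-value against 0.05 or 0.01 then yields the * and ** markers of the kind shown in Tables 2-3.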
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>The paper was motivated by our opinion that the intense and successful development of artificial neural networks during the last 15 years suggests that they again have the potential to be important for active learning in surrogate-assisted BBO. It surveyed possible directions of research into that potential, including closely connected research into neural-network-based transfer learning for surrogate modelling. Moreover, it recalled the first published investigations in some of those directions, and added a new contribution to the emerging mosaic of those investigations. The fact that the main purpose of the experimental section of the paper is to contribute to the mosaic of emerging investigations should be epmhasized especially in context of the obtained experimental results. It justifiess that there is no significant difference between using CMA-ES surrogate-assisted by RAF ensembles and using it alone, as well as that results with RAF-ensemble-based surrogate models are significantly worse than results with the state-of-the-art surrogate-assisted CMA-ES variants, lq-CMA-ES, and DTS-CMA-ES. This is an obvious limitation not only of RAF ensembles, but of all above surveyed kinds of neural networks that have been so far investigated as surrogate models for CMA-ES. On the other hand, as the survey has shown, there are many more other possibilities for such investigations within future research.    </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Progress of optimization by the compared methods up to the budget 250×dimension for the benchmark function 𝑓 9 -Rosenbrock rotated. Each curve is the average of the 15 COCO instances of this function.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Progress of optimization by the compared methods up to the budget 250×dimension for the benchmark function 𝑓 18 -Schaffers F7 function, moderately ill-conditioned. Each curve is the average of the 15 COCO instances of this function.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Progress of optimization by the compared methods up to the budget 250×dimension for the benchmark function 𝑓 20 -Schwefel. Each curve is the average of the 15 COCO instances of this function.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Separable functions. From left to right: sphere, ellipsoidal, Rastrigin, Büche-Rastrigin, maximized linear slope.</figDesc><graphic coords="20,94.57,65.60,406.14,82.28" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Functions with low or moderate conditioning. From left to right: attractive sector, step ellipsoidal, Rosenbrock, Rosenbrock rotated.</figDesc><graphic coords="20,94.57,334.53,406.13,83.93" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Unimodal functions with high conditioning. From left to right: ellipsoidal, discus, bent cigar, sharp ridge, different powers.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Multi-modal functions with adequate global structure. From left to right: Rastrigin, Weierstrass, Schaffers F7 function, moderately ill-conditioned Schaffers F7 function, composite Griewank-Rosenbrock function F8F2.</figDesc><graphic coords="20,94.57,619.81,406.13,82.21" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>A high-level overview of kinds of ANNs that we consider worth a research with respect to surrogate modelling for BBO</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc>Evolution control used for RAF and RAF-log ensembles.Require: points 𝑥 1 , . . . , 𝑥 𝜆 ∈ R 𝑑 , in which the surrogate model 𝐹 sm trained on some archive 𝐴 has been evaluated; thus 𝜆 is the population size 1: Set 𝑘 = ⌊1 + max(0.02𝜆, 4)⌋; the number of 𝐹 bb evaluations 2: Set 𝑄 = {𝑥 𝑗 |(𝜌(𝐹 sm (𝑥 1 ), . . . , 𝐹 sm (𝑥 𝜆 ))) 𝑗 ≤ 𝑘}; points with the 𝑘 smallest 𝐹 sm values 3: In 𝑥 ∈ 𝑄 for which 𝐹 bb (𝑥) is not yet known, evaluate 𝐹 bb (𝑥) 4: Order the elements of 𝑄 as (𝑥 1 𝑄 , . . . , 𝑥 𝑘 𝑄 ) decreasingly with respect to their 𝐹 bb (𝑥) values 5: Set ℓ = max(1, ⌊𝑘 + 1 − max(15, 0.75𝜆)⌋); the lower index for computing 𝜏 between 𝐹 bb and 𝐹 sm 6: while 𝑘 &lt; 𝜆 &amp; 𝜏 ((𝐹 bb (𝑥 ℓ ), . . . , 𝐹 bb (𝑥 𝑘 )), (𝐹 sm (𝑥 ℓ ), . . . , 𝐹 sm (𝑥 𝑘 ))) &lt; 0.85 do 𝑄 ∪ {𝑥 𝑗 |(𝜌(𝐹 sm (𝑥 1 ), . . . , 𝐹 sm (𝑥 𝜆 ))) 𝑗 ≤ ⌈1.5𝑘⌉} 𝑄 for which 𝐹 bb (𝑥) is not yet known, evaluate 𝐹 bb (𝑥) Order the elements of 𝑄 as (𝑥 1 𝑄 , . . . , 𝑥 𝑘 𝑄 ) decreasingly with respect to their 𝐹 bb (𝑥) values 11: end while 12: Update 𝐴 = 𝐴 ∪ {𝑥 𝑗 |𝐹 bb (𝑥) has been evaluated in 𝑥 𝑗 } 13: if 𝑘 = 𝜆 then Return 𝐴 and {(𝑥 1 , 𝐹 bb (𝑥 1 )), . . . , (𝑥 𝜆 ), 𝐹 bb (𝑥 𝜆 ))} Return 𝐴 and {(𝑥 𝑖 , 𝐹 bb (𝑥 𝑖 ))|𝑥 𝑖 ∈ 𝑄} ∪ {(𝑥 𝑖 , 𝐹 sm (𝑥 𝑖 ))|𝑖 = 1, . . . , 𝜆, 𝑥 𝑖 ̸ ∈ 𝑄} 17: end if performed. The comparisons in Table</figDesc><table><row><cell cols="3">Martin Holeňa et al. 
CEUR Workshop Proceedings</cell><cell></cell><cell></cell><cell>47-67</cell></row><row><cell cols="4">Separable functions RAF-log DTS-CMA-ES 7 4 -6 Update 𝑘 = ⌈1.5𝑘⌉, ℓ = max(1, ⌊𝑘 + 1 − max(15, 0.75𝜆)⌋) RAF -RAF-log RAF 18 Algorithm 1 7: Update 𝑄 = 8: In 𝑥 ∈ 9:</cell><cell>lq-CMA-ES 1.5 0.5</cell><cell>default CMA-ES 11 12</cell></row><row><cell>DTS-CMA-ES 10:</cell><cell>21**</cell><cell>19</cell><cell>-</cell><cell>7.5</cell><cell>19</cell></row><row><cell>lq-CMA-ES</cell><cell>23.5**</cell><cell>24.5**</cell><cell>17.5</cell><cell>-</cell><cell>22.5**</cell></row><row><cell>default CMA-ES</cell><cell>14</cell><cell>13</cell><cell>6</cell><cell>2.5</cell><cell>-</cell></row><row><cell>14: RAF 15: else</cell><cell>RAF -</cell><cell cols="3">Functions with low or moderate conditioning RAF-log DTS-CMA-ES lq-CMA-ES 15.5 6 0.5</cell><cell>default CMA-ES 4.5</cell></row><row><cell>RAF-log 16:</cell><cell>4.5</cell><cell>-</cell><cell>6</cell><cell>0</cell><cell>2</cell></row><row><cell>DTS-CMA-ES</cell><cell>14</cell><cell>14</cell><cell>-</cell><cell>11.5</cell><cell>14</cell></row><row><cell>lq-CMA-ES</cell><cell>19.5**</cell><cell>20**</cell><cell>8.5</cell><cell>-</cell><cell>18.5*</cell></row><row><cell>default CMA-ES</cell><cell>15.5</cell><cell>18**</cell><cell>6</cell><cell>1.5</cell><cell>-</cell></row><row><cell></cell><cell></cell><cell cols="3">Unimodal functions with high conditioning</cell><cell></cell></row><row><cell></cell><cell>RAF</cell><cell>RAF-log</cell><cell>DTS-CMA-ES</cell><cell>lq-CMA-ES</cell><cell>default 
CMA-ES</cell></row><row><cell>RAF</cell><cell>-</cell><cell>21*</cell><cell>0</cell><cell>0</cell><cell>4</cell></row><row><cell>RAF-log</cell><cell>4</cell><cell>-</cell><cell>0</cell><cell>0</cell><cell>1</cell></row><row><cell>DTS-CMA-ES</cell><cell>25**</cell><cell>25**</cell><cell>-</cell><cell>7.5</cell><cell>25**</cell></row><row><cell>lq-CMA-ES</cell><cell>25**</cell><cell>25**</cell><cell>17.5*</cell><cell>-</cell><cell>25**</cell></row><row><cell>default CMA-ES</cell><cell>21</cell><cell>24**</cell><cell>0</cell><cell>0</cell><cell>-</cell></row><row><cell></cell><cell cols="4">Multi-modal functions with adequate global structure</cell><cell></cell></row><row><cell></cell><cell>RAF</cell><cell>RAF-log</cell><cell>DTS-CMA-ES</cell><cell>lq-CMA-ES</cell><cell>CMA-ES alone</cell></row><row><cell>RAF</cell><cell>-</cell><cell>15</cell><cell>13</cell><cell>10</cell><cell>11.5</cell></row><row><cell>RAF-log</cell><cell>10</cell><cell>-</cell><cell>11.5</cell><cell>9</cell><cell>12</cell></row><row><cell>DTS-CMA-ES</cell><cell>12</cell><cell>13.5</cell><cell>-</cell><cell>13.5</cell><cell>15</cell></row><row><cell>lq-CMA-ES</cell><cell>15</cell><cell>16</cell><cell>11.5</cell><cell>-</cell><cell>13</cell></row><row><cell>default CMA-ES</cell><cell>13.5</cell><cell>13</cell><cell>10</cell><cell>12</cell><cell>-</cell></row><row><cell></cell><cell cols="4">Multi-modal functions with weak global structure</cell><cell></cell></row><row><cell></cell><cell>RAF</cell><cell>RAF-log</cell><cell>DTS-CMA-ES</cell><cell>lq-CMA-ES</cell><cell>default 
CMA-ES</cell></row><row><cell>RAF</cell><cell>-</cell><cell>9</cell><cell>3.5</cell><cell>8</cell><cell>14</cell></row><row><cell>RAF-log</cell><cell>16</cell><cell>-</cell><cell>6.5</cell><cell>12</cell><cell>18.5</cell></row><row><cell>DTS-CMA-ES</cell><cell>21.5</cell><cell>18.5</cell><cell>-</cell><cell>20</cell><cell>23**</cell></row><row><cell>lq-CMA-ES</cell><cell>17</cell><cell>13</cell><cell>5</cell><cell>-</cell><cell>20*</cell></row><row><cell>default CMA-ES</cell><cell>11</cell><cell>6.5</cell><cell>2</cell><cell>5</cell><cell>-</cell></row><row><cell></cell><cell></cell><cell cols="2">All noiseless benchmark functions</cell><cell></cell><cell></cell></row><row><cell></cell><cell>RAF</cell><cell>RAF-log</cell><cell>DTS-CMA-ES</cell><cell>lq-CMA-ES</cell><cell>default CMA-ES</cell></row><row><cell>RAF</cell><cell>-</cell><cell>67.5</cell><cell>26.5</cell><cell>20</cell><cell>45</cell></row><row><cell>RAF-log</cell><cell>52.5</cell><cell>-</cell><cell>30</cell><cell>21.5</cell><cell>45.5</cell></row><row><cell>DTS-CMA-ES</cell><cell>93.5**</cell><cell>90**</cell><cell>-</cell><cell>60</cell><cell>96**</cell></row><row><cell>lq-CMA-ES</cell><cell>100**</cell><cell>98.5**</cell><cell>60</cell><cell>-</cell><cell>99**</cell></row><row><cell>default CMA-ES</cell><cell>75</cell><cell>74.5</cell><cell>24</cell><cell>21</cell><cell>-</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head></head><label></label><figDesc>3, CMA-ES surrogate-assisted by RAF reaches lower values of the objective function than any other of the compared methods, whereas in dimension 2, CMA-ES surrogate-assisted by any of RAF or RAF-log leads to a similarly fast decrease of 𝑓 18 as DTS-CMA-ES but slower than lq-CMA-ES. Finally, in Figure3, dimensions 3 and 5, CMA-ES surrogate-assisted by any of RAF or RAF-log leads to a similarly fast decrease of 𝑓 18 as lq-CMA-ES, but slower than DTS-CMA-ES.</figDesc><table><row><cell>Martin Holeňa et al. CEUR Workshop Proceedings</cell><cell>47-67</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Martin Holeňa et al. CEUR Workshop Proceedings 47-67</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgemengt</head><p>The research reported in this paper has been supported by the German Research Foundation (DFG) funded project 467401796, and by the Czech Technical University grant SGS 23/205/OHK3/3T/18. The authors are very grateful to Jaroslav Langer for his crucial contribution to the RAF experiments.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Employed Benchmarks</head><p>The functions in the bbob suite are divided into five groups:</p><p>1. Separable functions (Figure <ref type="figure">4</ref>).</p><p>• 𝑓 1 : sphere; • 𝑓 2 : ellipsoidal; • 𝑓 3 : Rastrigin; • 𝑓 4 : Büche-Rastrigin; • 𝑓 5 : linear slope.</p><p>2. Functions with low or moderate conditioning (Figure <ref type="figure">5</ref>).</p><p>• 𝑓 6 : attractive sector; • 𝑓 7 : step ellipsoidal; • 𝑓 8 : Rosenbrock; • 𝑓 8 : Rosenbrock rotated.</p><p>3. Unimodal functions with high conditioning (Figure <ref type="figure">6</ref>).</p><p>• 𝑓 10 : ellipsoidal; • 𝑓 11 : discus; • 𝑓 12 : bent cigar; 5. Multi-modal functions with weak global structure (Figure <ref type="figure">8</ref>).  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Activation Functions Employed to Form an RAF Ensemble</head><p>• Gauss error function</p><p>• Gaussian error linear unit</p><p>• Scaled exponential linear unit</p><p>where 𝑐, 𝛼 &gt; 0. In the employed Tensorflow implementation, 𝑐 = 1.05070098, 𝛼 = 1.67326324. </p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Combinatorial Development of Solid Catalytic Materials</title>
		<author>
			<persName><forename type="first">M</forename><surname>Baerns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Design of High-Throughput Experiments, Data Analysis, Data Mining</title>
				<meeting><address><addrLine>London</addrLine></address></meeting>
		<imprint>
			<publisher>Imperial College Press / World Scientific</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A rigorous framework for optimization by surrogates</title>
		<author>
			<persName><forename type="first">A</forename><surname>Booker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dennis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Frank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Serafini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">V</forename></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Trosset</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Structural and Multidisciplinary Optimization</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="1" to="13" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Metamodeling techniques for evolutionary optimization of computaitonally expensive problems: Promises and limitations</title>
		<author>
			<persName><forename type="first">M</forename><surname>El-Beltagy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Keane</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Genetic and Evolutionary Computation Conference</title>
				<meeting>the Genetic and Evolutionary Computation Conference</meeting>
		<imprint>
			<publisher>Morgan Kaufmann Publishers</publisher>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="196" to="203" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Kriging as a surrogate fitness landscape in evolutionary optimization</title>
		<author>
			<persName><forename type="first">A</forename><surname>Ratle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence for Engineering Design, Analysis and Manufacturing</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="37" to="49" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Metamodel-assisted evolution strategies</title>
		<author>
			<persName><forename type="first">M</forename><surname>Emmerich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Giotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Özdemir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bäck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Giannakoglou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">PPSN, ACM</title>
				<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="361" to="370" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A derivative based surrogate model for approximating and optimizing the output of an expensive computer simulation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Leary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bhaskar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Keane</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Global Optimization</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="39" to="58" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Surrogate-assisted evolutionary optimization frameworks for high-fidelity engineering design problems</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Ong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Keane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Incorporation in Evolutionary Computation</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="307" to="331" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Methods for using surrogate modesl to speed up genetic algorithm oprimization: Informed operators and genetic engineering</title>
		<author>
			<persName><forename type="first">K</forename><surname>Rasheed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vattam</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Incorporation in Evolutionary Computation</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="103" to="123" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A framework for evolutionary optimization with approximate fitness functions</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Olhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sendhoff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Evolutionary Computation</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="481" to="494" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Accelerating evolutionary algorithms with Gaussian process fitness function models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Büche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Schraudolph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Koumoutsakos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="183" to="194" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">CMA evolution strategy assisted by kriging model and approximate ranking</title>
		<author>
			<persName><forename type="first">C</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Radi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">El</forename><surname>Hami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Bai</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Applied Intelligence</title>
		<imprint>
			<biblScope unit="volume">48</biblScope>
			<biblScope unit="page" from="4288" to="4304" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Gaussian process surrogate models for the CMA evolution strategy</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bajer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Repický</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Evolutionary Computation</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="665" to="697" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Interaction between model and its evolution control in surrogate-assisted CMA evolution strategy</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hanuš</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Koza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tumpach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page">358</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>paper no</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Augmented Lagrangian, penalty techniques and surrogate modeling for constrained optimization with CMA-ES</title>
		<author>
			<persName><forename type="first">P</forename><surname>Dufossé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="519" to="527" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Adaptive ranking-based constraint handling for explicitly constrained black-box optimization</title>
		<author>
			<persName><forename type="first">N</forename><surname>Sakamoto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Akimoto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Evolutionary Computation</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="503" to="529" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A mono surrogate for multiobjective optimization</title>
		<author>
			<persName><forename type="first">I</forename><surname>Loshchilov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schoenauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sebag</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="471" to="478" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Guiding surrogate-assisted multi-objective optimisation with decision maker preferences</title>
		<author>
			<persName><forename type="first">F</forename><surname>Gibson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Everson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fieldsend</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="786" to="795" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Local metamodels for optimization using evolution strategies</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kern</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Koumoutsakos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PPSN</title>
		<imprint>
			<biblScope unit="page" from="939" to="948" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Benchmarking the local metamodel CMA-ES on the noiseless BBOB&apos;2013 test bed</title>
		<author>
			<persName><forename type="first">A</forename><surname>Auger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Brockhoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="1225" to="1232" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">Martin</forename><surname>Holeňa</surname></persName>
		</author>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<biblScope unit="page" from="47" to="67" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">A global surrogate assisted CMA-ES</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="664" to="672" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">An adaptive model selection strategy for surrogate-assisted particle swarm optimization algorithm</title>
		<author>
			<persName><forename type="first">H</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE SSCI</title>
		<imprint>
			<biblScope unit="page" from="1" to="8" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Committee-based active learning for surrogate-assisted particle swarm optimization of expensive problems</title>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Doherty</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Cybernetics</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="page" from="2664" to="2677" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Structural optimization using evolution strategies and neural networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Papadrakakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lagaros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tsompanakis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Methods in Applied Mechanics and Engineering</title>
		<imprint>
			<biblScope unit="volume">156</biblScope>
			<biblScope unit="page" from="309" to="333" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Model-assisted steady state evolution strategies</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ulmer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Streichert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="610" to="621" />
			<date type="published" when="2003">2003</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">A study on metamodeling techniques, ensembles, and multi-surrogates in evolutionary computation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sendhoff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="1288" to="1295" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Surrogate model for continuous and discrete genetic optimization based on RBF networks</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bajer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Intelligent Data Engineering and Automated Learning</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="251" to="258" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Gaussian process assisted coevolutionary estimation of distribution algorithm for computationally expensive problems</title>
		<author>
			<persName><forename type="first">L</forename><surname>Na</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zhong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Central South University of Technology</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="443" to="452" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Investigating uncertainty propagation in surrogate-assisted evolutionary algorithms</title>
		<author>
			<persName><forename type="first">V</forename><surname>Volz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rudolph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Naujoks</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="881" to="888" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Simple surrogate model assisted optimization with covariance matrix adaptation</title>
		<author>
			<persName><forename type="first">L</forename><surname>Toal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Arnold</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PPSN</title>
		<imprint>
			<biblScope unit="page" from="184" to="197" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Elite-driven surrogate assisted CMA-ES algorithm by improved lower confidence bound method</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00366-022-01642-5</idno>
	</analytic>
	<monogr>
		<title level="j">Engineering with Computers</title>
		<imprint>
			<biblScope unit="page">10</biblScope>
			<date type="published" when="2022">2022</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Combining global and local surrogate models to accelerate evolutionary optimization</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Keane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Systems, Man and Cybernetics. Part C: Applications and Reviews</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="page" from="66" to="76" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Automatic surrogate modelling technique selection based on features of optimization problems</title>
		<author>
			<persName><forename type="first">B</forename><surname>Saini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>López-Ibáñez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Miettinen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="1765" to="1772" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Per instance algorithm configuration of CMA-ES with limited budget</title>
		<author>
			<persName><forename type="first">N</forename><surname>Belkhir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dréo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Savéant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schoenauer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="681" to="688" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Boosted regression forest for the doubly trained surrogate covariance matrix adaptation evolution strategy</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Repický</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ITAT</title>
		<imprint>
			<biblScope unit="page" from="72" to="79" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Ordinal regression in evolutionary computation</title>
		<author>
			<persName><forename type="first">T</forename><surname>Runarsson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PPSN</title>
		<imprint>
			<biblScope unit="page" from="1048" to="1057" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Comparison-based optimizers need comparison-based surrogates</title>
		<author>
			<persName><forename type="first">I</forename><surname>Loshchilov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schoenauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sebag</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PPSN</title>
		<imprint>
			<biblScope unit="page" from="364" to="373" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Adaptive function value warping for surrogate model assisted evolutionary optimization</title>
		<author>
			<persName><forename type="first">A</forename><surname>Abbasnejad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Arnold</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PPSN</title>
		<imprint>
			<biblScope unit="page" from="76" to="89" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Automatic surrogate model type selection during the optimization of expensive black-box problems</title>
		<author>
			<persName><forename type="first">I</forename><surname>Couckuyt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Gorissen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Winter Simulation Conference</title>
				<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="4285" to="4293" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<analytic>
		<title level="a" type="main">Multi-surrogate-based global optimization using a score-based infill criterion</title>
		<author>
			<persName><forename type="first">H</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Structural and Multidisciplinary Optimization</title>
		<imprint>
			<biblScope unit="volume">59</biblScope>
			<biblScope unit="page" from="485" to="506" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">Completely derandomized self-adaptation in evolution strategies</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ostermeier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Evolutionary Computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="159" to="195" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b41">
	<analytic>
		<title level="a" type="main">The CMA evolution strategy: A comparing review</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Towards a New Evolutionary Computation</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="75" to="102" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b42">
	<analytic>
		<title level="a" type="main">Network on chip optimization based on surrogate model assisted evolutionary algorithms</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Karkar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yakovlev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gielen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Grout</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE CEC</title>
		<imprint>
			<biblScope unit="page" from="3266" to="3271" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b43">
	<analytic>
		<title level="a" type="main">Efficient global optimization of expensive black-box functions</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schonlau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Welch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Global Optimization</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="455" to="492" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b44">
	<analytic>
		<title level="a" type="main">ParEGO: a hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems</title>
		<author>
			<persName><forename type="first">J</forename><surname>Knowles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Evolutionary Computation</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="50" to="66" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b45">
	<analytic>
		<title level="a" type="main">TREGO: a trust-region framework for efficient global optimization</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Diouane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Picheny</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Le Riche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Scotto Di Perrotolo</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10898-022-01245-w</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Global Optimization</title>
		<imprint>
			<biblScope unit="volume">85</biblScope>
			<biblScope unit="page">10</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b46">
	<analytic>
		<title level="a" type="main">Making EGO and CMA-ES complementary for global optimization</title>
		<author>
			<persName><forename type="first">H</forename><surname>Mohammadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Le Riche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Touboul</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Learning and Intelligent Optimization</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="287" to="292" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b47">
	<analytic>
		<title level="a" type="main">Doubly trained evolution control for the surrogate CMA-ES</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bajer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PPSN</title>
		<imprint>
			<biblScope unit="page" from="59" to="68" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b48">
	<analytic>
		<title level="a" type="main">Black box algorithm selection by convolutional neural network</title>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Yuen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LOD</title>
		<imprint>
			<biblScope unit="page" from="264" to="280" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b49">
	<analytic>
		<title level="a" type="main">Automated parameter choice with exploratory landscape analysis and machine learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Pikalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mironovich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="1982" to="1985" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b50">
	<analytic>
		<title level="a" type="main">Towards feature-free automated algorithm selection for single-objective continuous black box optimization</title>
		<author>
			<persName><forename type="first">R</forename><surname>Prager</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Seiler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Trautmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kerschke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE SSCI</title>
		<imprint>
			<biblScope unit="page" from="1" to="8" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b51">
	<analytic>
		<title level="a" type="main">Knowledge-based selection of gaussian process surrogates</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bajer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ECML Workshop IAL</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="48" to="63" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b52">
	<analytic>
		<title level="a" type="main">Landscape analysis of Gaussian process surrogates for the covariance matrix adaptation evolution strategy</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Repický</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">GECCO, ACM</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="691" to="699" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b53">
	<analytic>
		<title level="a" type="main">A collection of deep learning-based feature-free approaches for characterizing single-objective continuous fitness landscapes</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">V</forename><surname>Seiler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Prager</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kerschke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Trautmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="657" to="665" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b54">
	<analytic>
		<title level="a" type="main">The impact of hyper-parameter tuning for landscape-aware performance regression and algorithm selection</title>
		<author>
			<persName><forename type="first">A</forename><surname>Jankovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Popovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Eftimov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Doerr</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GECCO</title>
		<imprint>
			<biblScope unit="page" from="687" to="696" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b55">
	<analytic>
		<title level="a" type="main">Manifold Gaussian processes for regression</title>
		<author>
			<persName><forename type="first">R</forename><surname>Calandra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rasmussen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Deisenroth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IJCNN</title>
		<imprint>
			<biblScope unit="page" from="3338" to="3345" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b56">
	<analytic>
		<title level="a" type="main">Deep kernel learning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Wilson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Xing</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICAIS</title>
		<imprint>
			<biblScope unit="page" from="370" to="378" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b57">
	<analytic>
		<title level="a" type="main">Combining Gaussian processes and neural networks in surrogate modeling for covariance matrix adaptation evolution strategy</title>
		<author>
			<persName><forename type="first">J</forename><surname>Koza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tumpach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IAL Workshop, ECML PKDD</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b58">
	<analytic>
		<title level="a" type="main">Combining Gaussian processes with neural networks for active learning in optimization</title>
		<author>
			<persName><forename type="first">J</forename><surname>Růžička</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Koza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tumpach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ECML Workshop IAL</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="105" to="120" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b59">
	<analytic>
		<title level="a" type="main">Doubly stochastic variational inference for deep Gaussian processes</title>
		<author>
			<persName><forename type="first">H</forename><surname>Salimbeni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Deisenroth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b60">
	<analytic>
		<title level="a" type="main">Deep convolutional Gaussian processes</title>
		<author>
			<persName><forename type="first">K</forename><surname>Blomqvist</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kaski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Heinonen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Joint European Conference on Machine Learning and Knowledge Discovery in Databases</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="582" to="597" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b61">
	<analytic>
		<title level="a" type="main">Deep Gaussian processes using expectation propagation and Monte Carlo methods</title>
		<author>
			<persName><forename type="first">G</forename><surname>Hernández-Muñoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Villacampa-Calvo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hernández-Lobato</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ECML PKDD</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="479" to="494" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b62">
	<analytic>
		<title level="a" type="main">Deep Gaussian process emulation using stochastic imputation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ming</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Williamson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Guillas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Technometrics</title>
		<imprint>
			<biblScope unit="volume">65</biblScope>
			<biblScope unit="page" from="150" to="161" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b63">
	<analytic>
		<title level="a" type="main">Active learning for deep Gaussian process surrogates</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gramacy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Higdon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Technometrics</title>
		<imprint>
			<biblScope unit="volume">65</biblScope>
			<biblScope unit="page" from="4" to="18" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b64">
	<analytic>
		<title level="a" type="main">Neural tangent kernel: Convergence and generalization in neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Jacot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Gabriel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hongler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b65">
	<analytic>
		<title level="a" type="main">Neural tangents: Fast and easy infinite neural networks in python</title>
		<author>
			<persName><forename type="first">R</forename><surname>Novak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alemi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICLR</title>
		<imprint>
			<biblScope unit="page" from="1" to="19" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b66">
	<analytic>
		<title level="a" type="main">Predictive uncertainty estimation via prior networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Malinin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gales</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b67">
	<analytic>
		<title level="a" type="main">Uncertainty on asynchronous time event prediction</title>
		<author>
			<persName><forename type="first">M</forename><surname>Biloš</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Charpentier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Günnemann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b68">
	<analytic>
		<title level="a" type="main">Reverse KL-divergence training of prior networks: Improved uncertainty and adversarial robustness</title>
		<author>
			<persName><forename type="first">A</forename><surname>Malinin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gales</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b69">
	<analytic>
		<title level="a" type="main">Towards maximizing the representation gap between in-domain and out-of-distribution examples</title>
		<author>
			<persName><forename type="first">J</forename><surname>Nandy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hsu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b70">
	<analytic>
		<title level="a" type="main">Uncertainty aware semi-supervised learning on graph data</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b71">
	<analytic>
		<title level="a" type="main">Neural-network-based estimation of normal distributions in black-box optimization</title>
		<author>
			<persName><forename type="first">J</forename><surname>Tumpach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Koza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ESANN</title>
		<imprint>
			<biblScope unit="page" from="1" to="6" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b72">
	<analytic>
		<title level="a" type="main">Parallel approach for ensemble learning with locally coupled neural networks</title>
		<author>
			<persName><forename type="first">C</forename><surname>Valle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Saravia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Allende</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Monge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fernández</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural Processing Letters</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="277" to="291" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b73">
	<analytic>
		<title level="a" type="main">Simple and scalable predictive uncertainty estimation using deep ensembles</title>
		<author>
			<persName><forename type="first">B</forename><surname>Lakshminarayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pritzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Blundell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b74">
	<analytic>
		<title level="a" type="main">The MBPEP: A deep ensemble pruning algorithm providing high quality uncertainty prediction</title>
		<author>
			<persName><forename type="first">R</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Applied Intelligence</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="page" from="2942" to="2955" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b75">
	<analytic>
		<title level="a" type="main">Uncertainty in neural networks: Approximately Bayesian ensembling</title>
		<author>
			<persName><forename type="first">T</forename><surname>Pearce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Leibfried</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Brintrup</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neely</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AISTATS</title>
		<imprint>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b76">
	<analytic>
		<title level="a" type="main">Toward robust uncertainty estimation with random activation functions</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Stoyanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghandi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tavakol</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1" to="13" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b77">
	<analytic>
		<title level="a" type="main">Deep learning for Bayesian optimization of scientific problems with high-dimensional structure</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Loh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Snoek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions on Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>openreview tPMQ6Je2rB</note>
</biblStruct>

<biblStruct xml:id="b78">
	<analytic>
		<title level="a" type="main">Sample-efficient optimization in the latent space of deep generative models via weighted retraining</title>
		<author>
			<persName><forename type="first">A</forename><surname>Tripp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Daxberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hernández-Lobato</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="14" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b79">
	<analytic>
		<title level="a" type="main">A GAN based solver of black-box inverse problems</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gillhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ramsauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Brandstetter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Schäfl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="5" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b80">
	<monogr>
		<title level="m" type="main">OPT-GAN: A broad-spectrum global optimizer for black-box problems by learning distribution</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhang</surname></persName>
		</author>
		<idno>arXiv:2102.03888v5</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b81">
	<analytic>
		<title level="a" type="main">Towards learning universal hyperparameter optimizers with transformers</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b82">
	<monogr>
		<title level="m" type="main">PFNs4BO: In-context learning for Bayesian optimization</title>
		<author>
			<persName><forename type="first">S</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Feurer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hollmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hutter</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
			<publisher>ICML</publisher>
			<biblScope unit="page" from="1" to="27" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b83">
	<monogr>
		<title level="m" type="main">Gaussian Processes for Machine Learning</title>
		<author>
			<persName><forename type="first">C</forename><surname>Rasmussen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Williams</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>MIT Press</publisher>
			<pubPlace>Cambridge</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b84">
	<analytic>
		<title level="a" type="main">Deep Gaussian processes</title>
		<author>
			<persName><forename type="first">A</forename><surname>Damianou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lawrence</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AISTATS</title>
		<imprint>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b85">
	<monogr>
		<title level="m" type="main">Deep Gaussian processes for regression using approximate expectation propagation</title>
		<author>
			<persName><forename type="first">T</forename><surname>Bui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hernandez-Lobato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hernandez-Lobato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turner</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>ICML</publisher>
			<biblScope unit="page" from="1472" to="1481" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b86">
	<monogr>
		<title level="m" type="main">Random feature expansions for deep Gaussian processes</title>
		<author>
			<persName><forename type="first">K</forename><surname>Cutajar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Bonilla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Michiardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Filippone</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>ICML</publisher>
			<biblScope unit="page" from="884" to="893" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b87">
	<analytic>
		<title level="a" type="main">Efficient global optimization using deep Gaussian processes</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hebbal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Brevault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Balesdent</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Talbi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Melab</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE CEC</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="12" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b88">
	<analytic>
		<title level="a" type="main">Gaussian process behaviour in wide deep neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Matthews</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rowland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICLR</title>
		<imprint>
			<biblScope unit="page" from="1" to="15" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b89">
	<monogr>
		<title level="m" type="main">Bayesian optimization using deep Gaussian processes</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hebbal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Brevault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Balesdent</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Talbi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Melab</surname></persName>
		</author>
		<idno>arXiv:1905.03350v1</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b90">
	<analytic>
		<title level="a" type="main">Convolutional normalizing flows for deep Gaussian processes</title>
		<author>
			<persName><forename type="first">H</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Low</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Jaillet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IJCNN</title>
		<imprint>
			<biblScope unit="page" from="1" to="5" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b91">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Finck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Auger</surname></persName>
		</author>
		<title level="m">Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions</title>
				<meeting><address><addrLine>Paris Saclay</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>INRIA</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b92">
	<analytic>
		<title level="a" type="main">COCO: a platform for comparing continuous optimizers in a black box setting</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Auger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Mersmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tušar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Brockhoff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Optimization Methods and Software</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="114" to="144" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b93">
	<analytic>
		<author>
			<persName><forename type="first">J</forename><surname>Koza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tumpach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Using past experience for configuration of Gaussian processes in black-box optimization</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="167" to="182" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b94">
	<analytic>
		<title level="a" type="main">Deep neural networks as Gaussian processes</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bahri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Novak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schoenholz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICLR</title>
		<imprint>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b95">
	<analytic>
		<title level="a" type="main">Bayesian deep convolutional networks with many channels are Gaussian processes</title>
		<author>
			<persName><forename type="first">R</forename><surname>Novak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bahri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICLR</title>
		<imprint>
			<biblScope unit="page" from="1" to="35" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b96">
	<analytic>
		<title level="a" type="main">Bayesian deep ensembles via the neural tangent kernel</title>
		<author>
			<persName><forename type="first">B</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lakshminarayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Teh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="13" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b97">
	<analytic>
		<title level="a" type="main">Be greedy -- a simple algorithm for blackbox optimization using neural networks</title>
		<author>
			<persName><forename type="first">B</forename><surname>Paria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Póczos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ravikumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Suggala</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICML Workshop on Adaptive Experimental Design and Active Learning in the Real World</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1" to="27" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b98">
	<monogr>
		<title level="m" type="main">Regression prior networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Malinin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chervontsev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Provilkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gales</surname></persName>
		</author>
		<idno>ArXiv: 2006.11590v2</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b99">
	<monogr>
		<title level="m" type="main">Deep evidential regression</title>
		<author>
			<persName><forename type="first">A</forename><surname>Amini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Schwarting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Soleimany</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Rus</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2020">2020</date>
			<publisher>NeurIPS</publisher>
			<biblScope unit="page" from="1" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b100">
	<analytic>
		<title level="a" type="main">Evidential deep learning to quantify classification uncertainty</title>
		<author>
			<persName><forename type="first">M</forename><surname>Sensoy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kandemir</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="11" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b101">
	<analytic>
		<title level="a" type="main">Improving evidential deep learning via multi-task learning</title>
		<author>
			<persName><forename type="first">D</forename><surname>Oh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Shin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1" to="14" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b102">
	<analytic>
		<title level="a" type="main">An evidential classifier based on Dempster-Shafer theory and deep learning</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Tong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Denoeux</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neurocomputing</title>
		<imprint>
			<biblScope unit="volume">450</biblScope>
			<biblScope unit="page" from="275" to="293" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b103">
	<monogr>
		<title level="m" type="main">A survey on evidential deep learning for single-pass uncertainty estimation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ulmer</surname></persName>
		</author>
		<idno>ArXiv: 2110.03051v2</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b104">
	<monogr>
		<title level="m" type="main">A Mathematical Theory of Evidence</title>
		<author>
			<persName><forename type="first">G</forename><surname>Shafer</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1976">1976</date>
			<publisher>Princeton University Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b105">
	<analytic>
		<title level="a" type="main">Causal discovery based on neural network ensemble method</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Software</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="1479" to="1484" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b106">
	<analytic>
		<title level="a" type="main">Wrapper approach for learning neural network ensemble by feature selection</title>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Networks -ISNN 2005</title>
				<imprint>
			<publisher>Springer</publisher>
			<biblScope unit="volume">202</biblScope>
			<biblScope unit="page" from="526" to="531" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b107">
	<analytic>
		<title level="a" type="main">Network generalization differences quantified</title>
		<author>
			<persName><forename type="first">D</forename><surname>Partridge</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural Networks</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="263" to="271" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b108">
	<analytic>
		<title level="a" type="main">Observational learning algorithm for an ensemble of neural networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Jang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Analysis and Applications</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="154" to="167" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b109">
	<analytic>
		<title level="a" type="main">An active learning approach for neural network ensemble</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Computer Research and Development</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="page" from="375" to="380" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b110">
	<analytic>
		<title level="a" type="main">A constructive algorithm for training cooperative neural network ensembles</title>
		<author>
			<persName><forename type="first">M</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Murase</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Neural Networks</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="820" to="834" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b111">
	<analytic>
		<title level="a" type="main">Fast decorrelated neural network ensembles with random weights</title>
		<author>
			<persName><forename type="first">M</forename><surname>Alhamdoosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">264</biblScope>
			<biblScope unit="page" from="104" to="117" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b112">
	<analytic>
		<title level="a" type="main">A novel decorrelated neural network ensemble algorithm for face recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Cao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge Based Systems</title>
		<imprint>
			<biblScope unit="volume">89</biblScope>
			<biblScope unit="page" from="541" to="552" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b113">
	<analytic>
		<title level="a" type="main">Feature selection based neural network ensemble method</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Fudan University (Natural Sciences)</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="page" from="685" to="688" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b114">
	<analytic>
		<title level="a" type="main">Freeway incident detection based on Adaboost RBF neural network</title>
		<author>
			<persName><forename type="first">T</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Engineering and Applications</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="223" to="225" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b115">
	<analytic>
		<title level="a" type="main">AdaBoost based ensemble of neural networks in analog circuit fault diagnosis</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Han</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Chinese Journal of Scientific Instrument</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="851" to="856" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b116">
	<analytic>
		<title level="a" type="main">Pitfalls of in-domain uncertainty estimation and ensembling in deep learning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Ashukha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lyzhov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Molchanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vetrov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICLR</title>
		<imprint>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b117">
	<analytic>
		<title level="a" type="main">Deep echo state networks with uncertainty quantification for spatiotemporal forecasting</title>
		<author>
			<persName><forename type="first">P</forename><surname>McDermott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wikle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Environmetrics</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page">e2553</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note>article no. e2553</note>
</biblStruct>

<biblStruct xml:id="b118">
	<analytic>
		<title level="a" type="main">Google Vizier: A service for black-box optimization</title>
		<author>
			<persName><forename type="first">D</forename><surname>Golovin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Solnik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Moitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Kochanski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Karro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Discovery and Data Mining</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1487" to="1496" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b119">
	<analytic>
		<title level="a" type="main">How transferable are features in deep neural networks?</title>
		<author>
			<persName><forename type="first">J</forename><surname>Yosinski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clune</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lipson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b120">
	<analytic>
		<title level="a" type="main">Simultaneous deep transfer across domains and tasks</title>
		<author>
			<persName><forename type="first">E</forename><surname>Tzeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hoffman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Saenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICCV</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="4068" to="4076" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b121">
	<monogr>
		<title level="m" type="main">Domain separation networks</title>
		<author>
			<persName><forename type="first">K</forename><surname>Bousmalis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Trigeorgis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Silberman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Krishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>NeurIPS</publisher>
			<biblScope unit="page" from="1" to="9" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b122">
	<analytic>
		<title level="a" type="main">Learning and transferring mid-level image representations using convolutional neural networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Oquab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Laptev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sivic</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1717" to="1724" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b123">
	<monogr>
		<title level="m" type="main">Deep transfer learning with joint adaptation networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jordan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>ICML</publisher>
			<biblScope unit="page" from="3470" to="3479" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b124">
	<analytic>
		<title level="a" type="main">Transfer learning for sequences via learning to collocate</title>
		<author>
			<persName><forename type="first">W</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICLR</title>
		<imprint>
			<biblScope unit="page" from="1487" to="1496" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b125">
	<analytic>
		<title level="a" type="main">Supervised representation learning: Transfer learning with deep autoencoders</title>
		<author>
			<persName><forename type="first">F</forename><surname>Zhuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IJCAI</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="4119" to="4125" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b126">
	<analytic>
		<title level="a" type="main">Supervised representation learning with double encoding-layer autoencoder for transfer learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Zhuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Intelligent Systems and Technology</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b127">
	<analytic>
		<title level="a" type="main">Adversarial discriminative domain adaptation</title>
		<author>
			<persName><forename type="first">E</forename><surname>Tzeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hoffman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Saenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CVPR</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b128">
	<analytic>
		<title level="a" type="main">Partial transfer learning with selective adversarial networks</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2724" to="2732" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b129">
	<analytic>
		<title level="a" type="main">A minimax game for instance based selective transfer learning</title>
		<author>
			<persName><forename type="first">B</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">KDD</title>
		<imprint>
			<biblScope unit="page" from="34" to="43" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b130">
	<analytic>
		<title level="a" type="main">Coupled generative adversarial networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b131">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Stoyanova</surname></persName>
		</author>
		<ptr target="https://github.com/YanasGH/RAFs" />
		<title level="m">YanasGH/RAFs</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b132">
	<monogr>
		<title level="m" type="main">Algorithm data sets for the bbob test suite</title>
		<author>
			<orgName>COCO Data Archive</orgName>
		</author>
		<ptr target="https://numbbo.github.io/data-archive/bbob/" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b133">
	<analytic>
		<title level="a" type="main">An extension on &quot;Statistical Comparisons of Classifiers over Multiple Data Sets&quot; for all pairwise comparisons</title>
		<author>
			<persName><forename type="first">S</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Herrera</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="2677" to="2694" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b134">
	<analytic>
		<title level="a" type="main">Should we really use post-hoc tests based on mean-ranks?</title>
		<author>
			<persName><forename type="first">A</forename><surname>Benavoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Corani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Mangili</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b135">
	<analytic>
		<title level="a" type="main">Statistical comparisons of classifiers over multiple data sets</title>
		<author>
			<persName><forename type="first">J</forename><surname>Demšar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
