<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Meta-learning Recommendation of Default Hyper-parameter Values for SVMs in Classification Tasks</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Rafael</forename><forename type="middle">G</forename><surname>Mantovani</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Universidade de São Paulo (USP)</orgName>
								<address>
									<settlement>São Carlos</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">André</forename><forename type="middle">L D</forename><surname>Rossi</surname></persName>
							<email>alrossi@itapeva.unesp.br</email>
							<affiliation key="aff1">
								<orgName type="institution">Universidade Estadual Paulista (UNESP)</orgName>
								<address>
									<settlement>Itapeva - SP</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Joaquin</forename><surname>Vanschoren</surname></persName>
							<email>j.vanschoren@tue.nl</email>
							<affiliation key="aff2">
								<orgName type="institution">Eindhoven University of Technology (TU/e)</orgName>
								<address>
									<settlement>Eindhoven</settlement>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">André</forename><forename type="middle">C P L F</forename><surname>Carvalho</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Universidade de São Paulo (USP)</orgName>
								<address>
									<settlement>São Carlos</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Meta-learning Recommendation of Default Hyper-parameter Values for SVMs in Classification Tasks</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">1FACE1D18CB76B648DF053A74696E8EE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T02:32+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Meta-learning</term>
					<term>Hyper-parameter tuning</term>
					<term>Default Values</term>
					<term>Support Vector Machines</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Machine learning algorithms have been investigated in several scenarios, one of them being data classification. The predictive performance of the models induced by these algorithms is usually strongly affected by the values used for their hyper-parameters. Different approaches to define these values have been proposed, such as the use of default values and of optimization techniques. Although default values can result in models with good predictive performance, different implementations of the same machine learning algorithm use different default values, leading to models with clearly different predictive performance on the same dataset. Optimization techniques have been used to search for hyper-parameter values that maximize the predictive performance of induced models for a given dataset, but at a high computational cost. A compromise is to use an optimization technique to search for values that are suitable for a wide spectrum of datasets. This paper investigates the use of meta-learning to recommend default values for the induction of Support Vector Machine models for a new classification dataset. We compare the default values suggested by the Weka and LibSVM tools with default values optimized by meta-heuristics on a large range of datasets. This study covers only classification tasks, but we believe that similar ideas could be used in other related tasks. According to the experimental results, meta-models can accurately predict whether tool-suggested or optimized default values should be used.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Support Vector Machines (SVMs) have been successfully used for classification tasks <ref type="bibr" target="#b29">[21]</ref>. However, their predictive performance for a given dataset is affected by their hyper-parameter values. Several approaches have been proposed to choose these values. Some machine learning tools suggest hyper-parameter values for SVMs regardless of the dataset analyzed, or employ simple heuristics <ref type="bibr">[8]</ref>. Although these values can induce models with good predictive performance <ref type="bibr">[6]</ref>, this does not occur in many situations, requiring a fine-tuning process <ref type="bibr">[4,</ref><ref type="bibr" target="#b21">13,</ref><ref type="bibr" target="#b33">25]</ref>.</p><p>However, the optimization of these hyper-parameters usually has a high computational cost, since a large number of candidate solutions needs to be evaluated. An alternative is to generate a new set of default values by optimizing these hyper-parameter values over several datasets rather than for each one. The optimized common values may improve model accuracy, when compared with the use of the default values, and reduce the computational cost of inducing models, when compared with an optimization for each dataset.</p><p>This study proposes a recommendation system able to indicate which default hyper-parameter values should be used in SVMs when applied to new datasets. 
This recommendation is based on Meta-learning (MTL) <ref type="bibr">[7]</ref> ideas to induce a classification model that, based on some features of a dataset, indicates which default hyper-parameter values should be used: those proposed by ML tools or those obtained by an optimization technique considering a set of prior datasets.</p><p>The proposed recommendation system is evaluated experimentally using a large number of classification datasets and considering three sets of hyper-parameter values for SVMs: default values from LibSVM <ref type="bibr">[9]</ref>, default values from Weka <ref type="bibr" target="#b24">[16]</ref>, and those obtained from a pre-optimization process with prior datasets, from here on referred to as "Optimized". We employed a Particle Swarm Optimization (PSO) <ref type="bibr" target="#b28">[20]</ref> algorithm to perform the optimization. This study covers only classification tasks, but we believe that similar ideas could be used in other related tasks.</p><p>This paper is structured as follows: Section 2 contextualizes the hyper-parameter tuning problem and cites some techniques explored by related work. Section 3 presents our experimental methodology and the steps taken to evaluate the approaches. The results are discussed in Section 4. The last section presents our conclusions and future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Hyper-parameter tuning</head><p>Hyperparameter optimization is a crucial step in the process of applying ML in practice <ref type="bibr" target="#b20">[12]</ref>. Setting a suitable configuration for the hyperparameters of a ML algorithm requires specific knowledge, intuition and, often, trial and error. Depending on the training time of the algorithm at hand, finding good hyperparameter values manually is time-consuming and tedious. As a result, much recent work in ML has focused on the study of methods able to find the best hyper-parameter values <ref type="bibr" target="#b27">[19]</ref>.</p><p>The tuning of these hyperparameters is usually treated as an optimization problem, whose objective function captures the predictive performance of the model induced by the algorithm. As reported in <ref type="bibr" target="#b32">[24]</ref>, this tuning task presents several aspects that make it difficult: i) hyperparameter values that lead to a model with high predictive performance for a given dataset may not lead to good results for other datasets; ii) the hyperparameter values often depend on each other, and this must be considered in the optimization; and iii) the evaluation of a specific hyperparameter configuration, let alone many, can be very time consuming.</p><p>Many approaches have been proposed for the optimization of hyperparameters of classification algorithms. Some studies used Grid Search (GS), a simple deterministic approach that provides good results in low-dimensional problems <ref type="bibr">[6]</ref>. For the optimization of many hyperparameters on large datasets, however, GS becomes computationally infeasible due to the combinatorial explosion. In these scenarios, probabilistic approaches, such as Genetic Algorithms (GA), are generally recommended <ref type="bibr" target="#b21">[13]</ref>. 
Other authors explored the use of Pattern Search (PS) <ref type="bibr" target="#b19">[11]</ref> or techniques based on gradient descent <ref type="bibr" target="#b18">[10]</ref>. Many automated tools are also available in the literature, such as methods based on local search (ParamILS <ref type="bibr" target="#b26">[18]</ref>), estimation of distributions (REVAC <ref type="bibr" target="#b31">[23]</ref>) and Bayesian optimization (Auto-Weka <ref type="bibr" target="#b36">[28]</ref>).</p><p>Recent studies have shown the effectiveness of Random Sampling (RS) methods <ref type="bibr">[1]</ref> for hyper-parameter fine-tuning. In <ref type="bibr">[5]</ref>, the authors used RS to tune Deep Belief Networks (DBNs), comparing its performance with grid methods, and showed empirically and theoretically that RS is more efficient for hyperparameter optimization than trials on a grid. Other recent works use a collaborative filtering solution <ref type="bibr">[3]</ref>, or combine optimization techniques for tuning algorithms in computer vision problems <ref type="bibr">[4]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Materials and methods</head><p>In addition to the default values suggested by LibSVM and Weka, an optimization technique was used to search for a new set of values suitable for a group of datasets. To do so, a PSO algorithm was used to tune the SVM hyper-parameters, and the predictive performance of the models induced by SVMs was evaluated on public datasets.</p><p>In the PSO optimization, each particle encodes one hyper-parameter setting composed of a pair of real values representing the SVM hyper-parameter C (cost) and the width of the Gaussian kernel γ. The former is an SVM parameter and the latter is the Gaussian kernel parameter <ref type="bibr" target="#b25">[17]</ref>. Table <ref type="table">1</ref> shows the range of values for C and γ <ref type="bibr" target="#b34">[26]</ref> used in the optimization. The default values provided by the Weka <ref type="bibr" target="#b24">[16]</ref> and LibSVM <ref type="bibr">[9]</ref> tools, and the optimized values obtained, are listed in Table <ref type="table">2</ref>.</p><p>Table <ref type="table">1</ref>. SVM hyper-parameter value ranges investigated during optimization <ref type="bibr" target="#b34">[26]</ref>.</p><formula xml:id="formula_0">cost (C): minimum 2^−2, maximum 2^15; gamma (γ): minimum 2^−15, maximum 2^3</formula><p>Table <ref type="table">2</ref>. Default values tested on the datasets and used to generate meta-labels.</p><formula xml:id="formula_1">DF-Weka: C = 1, γ = 0.1; DF-LibSVM⁴: C = 1, γ = 1/attrs; DF-Optimized⁵: C = 2^5.6376, γ = 2^−8.2269</formula></div>
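The three default settings in Table 2 can be written down programmatically. The sketch below is illustrative (the function name and structure are ours, not from the paper); note that DF-LibSVM's γ depends on the number of attributes, while DF-Optimized uses the base-2 exponents found by the PSO search:

```python
def default_hyperparameters(approach, n_attributes=None):
    """Return the (C, gamma) pair for a given default-value approach.

    Values follow Table 2 of the text; the function itself is illustrative.
    """
    if approach == "DF-Weka":
        return 1.0, 0.1
    if approach == "DF-LibSVM":
        # LibSVM's default gamma depends on the dataset dimensionality.
        return 1.0, 1.0 / n_attributes
    if approach == "DF-Optimized":
        # Values found by PSO over 21 training datasets (base-2 exponents).
        return 2 ** 5.6376, 2 ** -8.2269
    raise ValueError(f"unknown approach: {approach}")

C, gamma = default_hyperparameters("DF-Optimized")
```

For a dataset with 20 attributes, DF-LibSVM would therefore use γ = 0.05, while DF-Optimized uses the same (C, γ) pair for every dataset.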
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Datasets</head><p>For the experiments, 145 classification datasets with different characteristics were collected from the UCI repository <ref type="bibr">[2]</ref> and OpenML <ref type="bibr" target="#b37">[29]</ref>. These datasets were split into two groups:</p><p>-One group contains 21 datasets that were used in the optimization process to find common values of the hyper-parameters. These datasets were randomly selected from the total of 145; -The second group, containing the 124 remaining datasets, was used to test the models induced with the hyper-parameter values found by the optimization process. These 124 datasets and the corresponding model results were used in the meta-learning system.</p><p>Only a few datasets were selected for the optimization, to limit its computational cost and because the remaining datasets were needed for meta-learning. All datasets were standardized with µ = 0 and σ = 1 internally by the package 'e1071' (the R interface to the 'LibSVM' library), employed here to train the SVMs.</p></div>
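The standardization step can be illustrated with a small z-score sketch. This is a hedged stand-in for what 'e1071' does internally; here we mirror R's scale(), which uses the sample standard deviation (n − 1 denominator):

```python
import math

def standardize(column):
    """Z-score a numeric column so it has mean 0 and (sample) sd 1."""
    n = len(column)
    mean = sum(column) / n
    # Sample standard deviation (n - 1 denominator), as in R's scale().
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]

z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Each attribute column of a dataset would be transformed this way before SVM training, so that the C and γ ranges are comparable across datasets.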
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Optimization process</head><p>Figure <ref type="figure" target="#fig_0">1</ref> illustrates the optimization process. The PSO algorithm is run with the 21 training datasets. The evaluation of the hyper-parameters uses 10-fold cross-validation (CV). Whenever a pair of SVM hyper-parameter values is generated by the tuning technique, one model is induced for each dataset using 8 partitions (training folds). One of the remaining partitions is used to validate the induced models and guide the search for the best hyper-parameter values (validation fold). The final one is used only to assess the predictive performance of the induced models (test fold), not for hyper-parameter selection. This way, each dataset has validation and testing accuracies averaged over the 10-fold CV. The fitness criterion was defined as the median validation accuracy.</p><p>The PSO algorithm was implemented in R using the "pso" package, available on CRAN <ref type="foot" target="#foot_2">6</ref> . Since PSO is a stochastic method, the technique was run 30 times, yielding 30 solutions. The hyper-parameter values that resulted in the best median testing accuracy, considering the training datasets and executions, are defined as the "Optimized Default" values found by the optimization technique. Those values will be compared to the default ones provided by ML tools in Section 4.</p></div>
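The search loop above can be sketched with a minimal PSO in plain Python. The fitness below is a stand-in (a simple quadratic with a known optimum at the DF-Optimized exponents) for the real objective, which in the paper is the median 10-fold CV validation accuracy over the 21 training datasets; particles encode (log2 C, log2 γ) within the ranges of Table 1:

```python
import random

BOUNDS = [(-2.0, 15.0), (-15.0, 3.0)]  # (log2 C, log2 gamma), as in Table 1

def fitness(p):
    # Stand-in objective, minimized at (5.6376, -8.2269); the real fitness
    # would be 1 - median CV validation accuracy over the training datasets.
    return (p[0] - 5.6376) ** 2 + (p[1] + 8.2269) ** 2

def pso(n_particles=20, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(BOUNDS)
    pos = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = BOUNDS[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, best_val = pso()
```

In the paper, 30 independent runs were performed and the pair with the best median testing accuracy across datasets and runs was kept as the "Optimized Default".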
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Meta-learning system</head><p>The problem of choosing one of the default values shown in Table <ref type="table">2</ref> can be viewed as a classification task, and solved using a meta-learning approach. A meta-dataset is created by extracting characteristics from the datasets and used to induce a meta-model that predicts the best set of hyper-parameters based on these data characteristics. Then, this meta-model can be applied to predict which default values are more likely to lead to good predictive SVM models for a new dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Meta-data set</head><p>Each meta-example of the meta-data set is composed of meta-features and a target feature. The meta-features are extracted from 124 of the 145 datasets (Sec. 3.1). The other 21 datasets were used to find the DF-Optimized parameter settings, and are therefore excluded from the meta-learning system. Since each dataset results in one meta-example, the meta-data set contains 124 meta-examples, each composed of 81 meta-features. Table <ref type="table" target="#tab_0">3</ref> shows an overview of the meta-features obtained from these datasets, subdivided into 7 subsets. These meta-features have been used before in many similar studies <ref type="bibr" target="#b22">[14,</ref><ref type="bibr" target="#b23">15,</ref><ref type="bibr" target="#b33">25,</ref><ref type="bibr" target="#b35">27]</ref>.</p><p>For each subset of these meta-features, a different meta-data set was created to explore their utility for this task. Furthermore, we built a meta-data set merging all meta-features, referred to as ALL, and another one, referred to as FEAT.SELEC., obtained by applying a meta-feature selection method to the subset ALL. Specifically, we employed the correlation rank method from the R package 'FSelector', selecting the 25% most correlated meta-features.</p><p>Besides the meta-features, each meta-example has a target, whose label indicates which default hyper-parameter values should be used on the corresponding dataset. To define the label of the meta-examples, we ran the three sets of default values (DF-LibSVM, DF-Weka, and DF-Optimized) on the 124 test datasets. The hyper-parameter values that yielded the median accuracy value over 30 executions were selected.</p><p>All default approaches were evaluated using a 10-fold CV strategy on the test datasets. This procedure was repeated 30 times and the predictive performance of the models was assessed by the mean balanced accuracy. 
The Wilcoxon sign-test was applied to each pair of alternatives for the default values, to assess the significance of the per-dataset differences in accuracy. Table <ref type="table" target="#tab_1">4</ref> shows the win-tie-loss results based on this significance test with a confidence level of 95%. In these initial experiments, we considered the problem as binary, especially due to the small number of DF-Weka and DF-LibSVM wins and the occasional ties. Thus, if the best mean accuracy for a dataset was obtained by DF-Optimized with statistical significance (Wilcoxon test) compared to both other approaches, the meta-example receives the label "OPTM". Otherwise, it is labeled "DF".</p><p>According to this criterion, 84 of the 124 datasets were labeled with the OPTM class: the induced models presented the best predictive performance when using the hyper-parameter values obtained by the optimization process. The other 40 meta-examples were labeled with the DF class: the default values provided by the tools were sufficient. Due to the small number of meta-examples, the Leave-One-Out Cross-Validation (LOO-CV) methodology was adopted to evaluate the predictive performance of the meta-learners.</p></div>
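The labeling rule can be sketched as follows. This sketch uses a Wilcoxon signed-rank test with a normal approximation (a simplification of the test used in the paper: zero differences are dropped and tied absolute differences get average ranks), and labels a dataset OPTM only when DF-Optimized significantly beats both tool defaults:

```python
import math

def wilcoxon_p(a, b):
    """Two-sided Wilcoxon signed-rank p-value via the normal approximation."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    n = len(diffs)
    ranked = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign average ranks over ties in |diff|
        j = i
        while j + 1 < n and abs(diffs[ranked[j + 1]]) == abs(diffs[ranked[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ranked[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

def meta_label(acc_optimized, acc_weka, acc_libsvm, alpha=0.05):
    """OPTM iff the optimized values significantly beat both tool defaults."""
    beats_weka = (sum(acc_optimized) > sum(acc_weka)
                  and wilcoxon_p(acc_optimized, acc_weka) < alpha)
    beats_libsvm = (sum(acc_optimized) > sum(acc_libsvm)
                    and wilcoxon_p(acc_optimized, acc_libsvm) < alpha)
    return "OPTM" if beats_weka and beats_libsvm else "DF"
```

In the paper this comparison is made per dataset over the 30 repeated accuracy estimates, producing the 84 OPTM / 40 DF split reported above.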
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Meta-learner</head><p>Six ML classification algorithms were used as meta-learners: J48 Decision Tree (J48), Naïve Bayes (NB), k-Nearest Neighbors (k-NN) with k = 3, Multilayer Perceptron (MLP), Random Forest (RF) and Support Vector Machines (SVM). These algorithms follow different learning paradigms, each with a distinct bias, and may result in different predictions. An ensemble (ENS) of these classifiers was also used, with the prediction defined by majority voting.</p><p>The predictive performance of each meta-learner, including the ensemble, was averaged over all LOO-CV iterations/executions for four performance measures. Each meta-learner was evaluated with meta-data sets composed of meta-features extracted by different approaches, described in Table <ref type="table" target="#tab_0">3</ref>, and the meta-feature sets ALL, which combines all meta-features, and FEAT.SELEC., which applies feature selection to ALL.</p></div>
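The meta-level evaluation loop can be sketched generically. Below, a leave-one-out harness and a majority-vote combiner (the names and the trivial base learner are ours; any classifier exposed as a train-and-predict callable could be plugged in):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine base-learner predictions by majority voting."""
    return Counter(predictions).most_common(1)[0][0]

def loo_cv(train_predict, X, y):
    """Leave-one-out CV: one held-out prediction per meta-example.

    train_predict(X_train, y_train, x_test) -> predicted label.
    """
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(train_predict(X_train, y_train, X[i]))
    return preds

# A trivial base learner: always predicts the majority class of the
# training portion (the MAJ.CLASS baseline of Section 4.1).
def majority_class(X_train, y_train, x_test):
    return majority_vote(y_train)
```

With 124 meta-examples, `loo_cv` trains 124 models per meta-learner; the ENS meta-learner applies `majority_vote` to the six base predictions instead of to the labels.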
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental results</head><p>The predictive performance of models induced using the optimized default values for SVMs were compared with hyper-parameter values provided by SVMs tools. This comparison was performed by applying the Friedman statistical test and the Nemenyi post-hoc test with a confidence level of 95%. According to the test, the hyper-parameter values optimized by the PSO technique for several datasets (DF-Optimized) led to SVMs models with significantly better predictive performance than the default values provided by both SVMs tools (DF-Weka and DF-LibSVM) (see Table <ref type="table" target="#tab_1">4</ref>). Moreover, the test showed that there is no significance difference between the performance of DF-Weka and DF-LibSVM values.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">MTL predictive performance</head><p>Table <ref type="table">5</ref> summarizes the predictive performance of the meta-learners for different sets of meta-features. The first column identifies the meta-learning algorithm. The second column shows the meta-feature set used. The other columns present the predictive performance of the meta-learner according to different measures: balanced accuracy, precision, recall, and F-Score. A trivial classifier would have a mean balanced accuracy equal to 0.500. The performance measures of this baseline method (MAJ.CLASS) and of a RANDOM method, which selects labels randomly, are included at the bottom of Table <ref type="table">5</ref>. The best results for each meta-feature set according to the F-score measure are highlighted.</p><p>A general picture of the predictive performance of the meta-learners is provided by the F-Score measure, which balances precision and recall, and by the mean balanced classification accuracy. According to these values, the J48 algorithm using all the meta-features was the best meta-learner overall, with an F-Score of 0.821 and a balanced accuracy of 0.847. The same combination of meta-learner and meta-features also achieved the best results according to the precision measure. For the recall measure, the best result was also obtained by the J48 algorithm, but using the Statlog meta-features subset. The J48 algorithm appears four times in the list, while RF and ENS appear three times each. These results indicate the superiority of J48 for this task, unlike in other similar meta-learning studies, such as <ref type="bibr" target="#b30">[22]</ref>. The intrinsic feature selection mechanism of J48 performed slightly better than the rank correlation based method (FEAT.SELEC.), since the meta-model J48-ALL is first in the ranking, followed by "J48.FSELEC". 
Another feature selection method may further improve the meta-learners' predictive performance. Figure <ref type="figure">2</ref> illustrates that few meta-examples were misclassified by all meta-models. In these cases, the meta-examples are all labeled as DF.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Hits and Misses</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Tree Analysis</head><p>The decision tree in Figure <ref type="figure" target="#fig_2">3</ref> was the most frequently induced model during the meta-level learning using the J48 algorithm with all meta-features and performing LOO-CV. This pruned tree was obtained in most of the experiments and kept basically the same structure, with 19 nodes, of which 10 are leaf nodes, and 10 rules. The meta-features selected by J48 as the most relevant ones were: dim, the problem dimensionality (Statlog); ktsP, the pre-processed kurtosis (Statistical); f3, the maximum individual feature efficiency (Complexity); stSd, the standard deviation of stump time (Landmarking); bMin, the minimum level of tree branches (Model-based); lSd, the standard deviation of the number of leaves (Model-based); eAttr, the attribute entropy (Information); staTime, the execution time of a statistical model (Time); and attr, the number of attributes (Statlog). It is interesting to observe that about one meta-feature from each subset was used to generate the tree. The meta-feature most frequently selected as the root node was dim, the problem dimensionality, i.e., dim = attributes/samples. The LibSVM library considers the dimensionality of the dataset (Table <ref type="table">2</ref>) to assign the γ hyper-parameter value, whereas the meta-feature dim is the ratio between the number of attributes and the number of examples.</p><p>According to the tree, if this ratio is close to zero, the DF hyper-parameter values are already good solutions, and the pre-optimized values do not improve the model's accuracy. However, if the execution time of a statistical model (staTime) is greater than 68.25, the optimized hyper-parameter values should be used. The pre-optimized values are also recommended if the standard deviation of the number of leaves generated by model-based DTs is higher than 1.</p></div>
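The root split of the tree can be expressed as a simple rule. In the sketch below, the staTime threshold (68.25) comes from the text, while the dim cutoff is a hypothetical placeholder, since the exact split value is only described as "close to zero"; the two-level nesting is likewise a simplified reading of the tree:

```python
def dim_meta_feature(n_attributes, n_examples):
    """Problem dimensionality: ratio of attributes to examples."""
    return n_attributes / n_examples

def recommend_defaults(dim, sta_time, dim_cutoff=0.01):
    """Simplified two-level reading of the most common J48 tree.

    dim_cutoff is a hypothetical stand-in for the learned split point.
    """
    if dim <= dim_cutoff:
        # Dimensionality ratio close to zero: tool defaults already work well.
        return "DF"
    # Long statistical-model execution time favors the optimized values.
    return "OPTM" if sta_time > 68.25 else "DF"
```

A dataset with 10 attributes and 1000 examples has dim = 0.01 and would fall in the "defaults suffice" branch under this reading.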
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>Many experiments with SVMs use default values for the hyper-parameters. Thus, a good set of default values allows non-expert users to obtain good models with low computational costs. This study investigated the development of a meta-learning system to recommend hyper-parameter values for Support Vector Machines (SVMs) from a set of predefined default values. The meta-learning system was experimentally evaluated using 124 datasets from UCI and OpenML.</p><p>Besides the default values proposed by ML tools, we used an optimization technique to define new default hyper-parameter values based on a group of datasets. The use of this new set of hyper-parameter values, referred to as optimized default values, produced significantly better models than the default values suggested by ML tools.</p><p>According to the experiments assessing the performance of the meta-learning system, it is possible to create a recommendation system able to select which default values should be used for SVM hyper-parameters in classification tasks. Observing the most frequent decision tree, a small number of simple meta-features was sufficient to characterize the datasets. According to this decision tree, default values proposed by ML tools are suitable for problems with a dimensionality ratio close to zero.</p><p>As future work, we intend to expand the experiments by increasing the number of datasets and meta-features and exploring other ML algorithms. We also plan to cluster datasets according to their similarities to generate better optimized hyper-parameter values. The fitness value used in the experiments is an aggregate measure of performance across different datasets. It would be interesting to explore other measures, such as average ranks. 
We intend to build on this work and make all our experiments available in OpenML <ref type="bibr" target="#b37">[29]</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. SVM hyper-parameter tuning process with multiple datasets.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 Fig. 2 .</head><label>22</label><figDesc>Figure2depicts the hits and misses of the top-10 meta-models analyzing the F-score measure. The y-axis represents the meta-models: the algorithm and the set of meta-features used in the experiments. The x-axis represents all the 124 meta-examples of the meta-data set. In the figure, a hit is represented by a light gray square, and a miss by a black one.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Most common J48 DT with all meta-features.</figDesc><graphic coords="9,214.80,244.07,185.76,164.52" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 3 .</head><label>3</label><figDesc>Classes and number of meta-features used in experiments.</figDesc><table><row><cell cols="3">Meta-features Num. Description</cell></row><row><cell>Statlog</cell><cell>17</cell><cell>Simple measures, such as the number of attributes and classes.</cell></row><row><cell>Statistical</cell><cell>7</cell><cell>Statistical measures, such as skewness and kurtosis.</cell></row><row><cell>Information</cell><cell>7</cell><cell>Information-theoretic measures, such as the attributes' entropy.</cell></row><row><cell>Landmarking</cell><cell>10</cell><cell>The performance of some ML algorithms on the datasets.</cell></row><row><cell>Model</cell><cell>18</cell><cell>Features extracted from DT models, such as the number of leaves, nodes and rules.</cell></row><row><cell>Time</cell><cell>9</cell><cell>The execution time of some ML algorithms on these datasets.</cell></row><row><cell>Complexity</cell><cell>13</cell><cell>Measures that analyze the complexity of a classification problem.</cell></row><row><cell>Total</cell><cell>81</cell><cell>All meta-features</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 4 .</head><label>4</label><figDesc>Win-tie-loss of the approaches for 124 datasets.</figDesc><table><row><cell>Technique</cell><cell cols="3">Win Tie Loss</cell></row><row><cell>DF-Weka</cell><cell>13</cell><cell>21</cell><cell>90</cell></row><row><cell>DF-LibSVM</cell><cell>6</cell><cell>20</cell><cell>98</cell></row><row><cell>DF-Optimized</cell><cell>84</cell><cell>6</cell><cell>34</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_0">attrs: the number of attributes in the dataset (except the target attribute)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">Those are the values that presented the median accuracy over the 30 solutions found in the optimization process. See Section 3.2.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2">http://cran.r-project.org/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgments. The authors would like to thank CAPES and CNPq (Brazilian agencies) for their financial support. This project is supported by the São Paulo Research Foundation (FAPESP) under grant #2012/23114-9.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0" />			</div>
			<div type="references">

				<listBibl>


<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A review of random search methods</title>
		<author>
			<persName><forename type="first">S</forename><surname>Andradottir</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Handbook of Simulation Optimization</title>
		<title level="s">International Series in Operations Research &amp; Management Science</title>
		<editor>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Fu</surname></persName>
		</editor>
		<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">216</biblScope>
			<biblScope unit="page" from="277" to="292" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Bache</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lichman</surname></persName>
		</author>
		<ptr target="http://archive.ics.uci.edu/ml" />
		<title level="m">UCI machine learning repository</title>
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Collaborative hyperparameter tuning</title>
		<author>
			<persName><forename type="first">R</forename><surname>Bardenet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brendel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kégl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sebag</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 30th International Conference on Machine Learning (ICML-13)</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Dasgupta</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Mcallester</surname></persName>
		</editor>
		<meeting>the 30th International Conference on Machine Learning (ICML-13)</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="199" to="207" />
		</imprint>
	</monogr>
	<note>JMLR Workshop and Conference Proceedings</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bergstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yamins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">D</forename><surname>Cox</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 30th Intern. Conf. on Machine Learning</title>
				<meeting>30th Intern. Conf. on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="1" to="9" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Random search for hyper-parameter optimization</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bergstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Mach. Learn. Res</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="281" to="305" />
			<date type="published" when="2012-03">Mar 2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A note on parameter selection for support vector machines</title>
		<author>
			<persName><forename type="first">I</forename><surname>Braga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">P</forename><surname>Do Carmo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Benatti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Monard</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Soft Computing and Its Applications</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Castro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Gelbukh</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>González</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">8266</biblScope>
			<biblScope unit="page" from="233" to="244" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Metalearning: Applications to Data Mining</title>
		<author>
			<persName><forename type="first">P</forename><surname>Brazdil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Giraud-Carrier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Soares</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vilalta</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>Springer Verlag</publisher>
		</imprint>
	</monogr>
	<note>2nd edn</note>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Lin</surname></persName>
		</author>
		<ptr target="http://www.csie.ntu.edu.tw/~cjlin/libsvm" />
		<title level="m">LIBSVM: a Library for Support Vector Machines</title>
				<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">LIBSVM: A library for support vector machines</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Lin</surname></persName>
		</author>
		<ptr target="http://www.csie.ntu.edu.tw/~cjlin/libsvm" />
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Intelligent Systems and Technology</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page">27</biblScope>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Choosing multiple parameters for support vector machines</title>
		<author>
			<persName><forename type="first">O</forename><surname>Chapelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vapnik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bousquet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mukherjee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="issue">1-3</biblScope>
			<biblScope unit="page" from="131" to="159" />
			<date type="published" when="2002-03">Mar 2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Efficient optimization of support vector machine learning parameters for unbalanced datasets</title>
		<author>
			<persName><forename type="first">T</forename><surname>Eitrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Computational and Applied Mathematics</title>
		<imprint>
			<biblScope unit="volume">196</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="425" to="436" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Initializing Bayesian hyperparameter optimization via meta-learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Feurer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Springenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hutter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence</title>
				<meeting>the Twenty-Ninth AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2015-01">Jan 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Evolutionary tuning of multiple SVM parameters</title>
		<author>
			<persName><forename type="first">F</forename><surname>Friedrichs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Igel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neurocomput</title>
		<imprint>
			<biblScope unit="volume">64</biblScope>
			<biblScope unit="page" from="107" to="117" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Noisy data set identification</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">P F</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>De Carvalho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Lorena</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Hybrid Artificial Intelligent Systems</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Pan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><forename type="middle">M</forename><surname>Polycarpou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Woźniak</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>De Carvalho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Quintián</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Corchado</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">8073</biblScope>
			<biblScope unit="page" from="629" to="638" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Combining meta-learning and search techniques to select parameters for support vector machines</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">A F</forename><surname>Gomes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">B C</forename><surname>Prudêncio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Soares</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L D</forename><surname>Rossi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C P L F</forename><surname>De Carvalho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neurocomput</title>
		<imprint>
			<biblScope unit="volume">75</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="3" to="13" />
			<date type="published" when="2012-01">Jan 2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">The WEKA data mining software: An update</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Frank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Holmes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Pfahringer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Reutemann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">H</forename><surname>Witten</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGKDD Explor. Newsl</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="10" to="18" />
			<date type="published" when="2009-11">Nov 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">A Practical Guide to Support Vector Classification</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">W</forename><surname>Hsu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Lin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<pubPlace>Taipei, Taiwan</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Department of Computer Science, National Taiwan University</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">ParamILS: an automatic algorithm configuration framework</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hutter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hoos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Leyton-Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Stützle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Artificial Intelligence Research</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="267" to="306" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Automatic algorithm configuration based on local search</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hutter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">H</forename><surname>Hoos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Stützle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 2</title>
				<meeting>the 22nd National Conference on Artificial Intelligence - Volume 2</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="1152" to="1157" />
		</imprint>
	</monogr>
	<note>AAAI&apos;07</note>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Particle swarms: optimization based on sociocognition</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kennedy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Recent Development in Biologically Inspired Computing</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Castro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><forename type="middle">V</forename><surname>Zuben</surname></persName>
		</editor>
		<imprint>
			<publisher>Idea Group</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="235" to="269" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Tuning and evolution of support vector kernels</title>
		<author>
			<persName><forename type="first">P</forename><surname>Koch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bischl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Flasch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bartz-Beielstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Weihs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Konen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Evolutionary Intelligence</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="153" to="170" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">To tune or not to tune: recommending when to adjust SVM hyper-parameters via meta-learning</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">G</forename><surname>Mantovani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L D</forename><surname>Rossi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bischl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vanschoren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C P L F</forename><surname>Carvalho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2015 International Joint Conference on Neural Networks</title>
				<meeting>the 2015 International Joint Conference on Neural Networks</meeting>
		<imprint>
			<date type="published" when="2015-07">Jul 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Relevance estimation and value calibration of evolutionary algorithm parameters</title>
		<author>
			<persName><forename type="first">V</forename><surname>Nannen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">E</forename><surname>Eiben</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 20th Intern. Joint Conf. on Art. Intelligence</title>
				<meeting>of the 20th Intern. Joint Conf. on Art. Intelligence</meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="975" to="980" />
		</imprint>
	</monogr>
	<note>IJCAI&apos;07</note>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Meta-learning for evolutionary parameter optimization of classifiers</title>
		<author>
			<persName><forename type="first">M</forename><surname>Reif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Shafait</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dengel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">87</biblScope>
			<biblScope unit="page" from="357" to="380" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Automatic classifier selection for non-experts</title>
		<author>
			<persName><forename type="first">M</forename><surname>Reif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Shafait</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Goldstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Breuel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dengel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Analysis and Applications</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="83" to="96" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Bio-inspired optimization techniques for SVM parameter tuning</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L D</forename><surname>Rossi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C P L F</forename><surname>Carvalho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th Brazilian Symposium on Neural Networks</title>
				<meeting>the 10th Brazilian Symposium on Neural Networks</meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="435" to="440" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">A meta-learning method to select the kernel width in support vector regression</title>
		<author>
			<persName><forename type="first">C</forename><surname>Soares</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">B</forename><surname>Brazdil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kuba</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="195" to="209" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms</title>
		<author>
			<persName><forename type="first">C</forename><surname>Thornton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hutter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">H</forename><surname>Hoos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Leyton-Brown</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of KDD-2013</title>
				<meeting>of KDD-2013</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="847" to="855" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">OpenML: Networked science in machine learning</title>
		<author>
			<persName><forename type="first">J</forename><surname>Vanschoren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Van Rijn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bischl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Torgo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGKDD Explorations</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="49" to="60" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
