<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Two ways of using artificial neural networks in knowledge discovery from chemical materials data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin Holeňa</string-name>
          <email>martin@cs.cas.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2</institution>
          ,
          <addr-line>18207 Prague</addr-line>
        </aff>
      </contrib-group>
      <issue>201</issue>
      <fpage>17</fpage>
      <lpage>24</lpage>
      <abstract>
        <p>In the application area of chemical materials, data mining methods have been used for more than a decade. By far the most popular have from the very beginning been methods based on artificial neural networks. However, they are frequently used without awareness of the difference between the numeric nature of knowledge obtained by neural network regression, and the symbolic nature of knowledge obtained by some other data mining methods. This paper explains that within the surrogate modelling approach, which plays an important role in this area, using numeric knowledge is justified. At the same time, it recalls the possibility to obtain symbolic knowledge from neural networks in the form of logical rules and describes a recently proposed method for the extraction of Boolean rules in disjunctive normal form. Both ways of using neural networks are illustrated on examples from this application area.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The search for new chemical materials, e.g., catalytic
materials for a plethora of chemical reactions,
produces large amounts of data. To discover useful
knowledge from those data, statistical as well as
machine-learning data mining methods have been used in this
area since the late 1990s, the former represented in
particular by the analysis of variance, decision trees
and support vector regression, the latter by main
variants of feed-forward neural networks.</p>
      <p>This paper summarizes experience from nearly ten
years of using and developing neural-network based data
mining methods for catalytic data. Artificial neural
networks are the most popular regression model in this
application area. In the survey [10], more than 20
published applications of multilayer perceptrons (MLPs)
to catalytic data have been listed, as well as several
applications of radial basis function networks. The role of
feed-forward neural nets as a regression model predicting
the catalytic performance of materials (such as yield,
conversion, selectivity) is due partially to their preceding
success in other areas, but mostly to their ability to serve
as universal approximators in very general function
spaces [12, 14, 18]. This ability is particularly valuable in
the context of the highly nonlinear nature of the
dependencies encountered in catalysis (cf. Figure 1).
(Fig. 1. A 3-dimensional cut of a neural-network regression
of the yield of a reaction product on the composition of
the catalytic material.)
However, there seems to be little awareness, among
researchers using artificial neural networks in catalysis,
of the difference between the symbolic nature of the
knowledge obtained from data by analysis of variance and
decision trees, and the numeric nature of the knowledge
obtained by neural network regression. One way of using
neural networks is connected with the optimization of
materials performance in an approach called surrogate
modelling. In that context, also the possibility to increase
the accuracy of neural network regression by means of
boosting is mentioned. The other strategy, on the other
hand, relies on employing rules-extraction methods to
obtain symbolic knowledge from trained neural
networks.</p>
      <p>These two strategies determine also the structure
of the paper. In Section 2, the surrogate modelling
approach is described. Section 3 then explains a method
for the extraction of logical rules from trained neural
networks. Both strategies are illustrated in the
respective sections using real-world examples.
</p>
    </sec>
    <sec id="sec-2">
      <title>Neural networks used as surrogate models</title>
      <p>From the point of view of theoretical computer
science, the search for the most suitable chemical materials
entails complex optimization tasks. As objective
functions, those tasks use various properties of the
materials, e.g. in the case of catalytic materials, properties
quantifying their catalytic performance, such as yield,
conversion, or selectivity. A crucial feature of such
objective functions is that they cannot be expressed
analytically, their values must be obtained empirically.</p>
      <p>For their optimization, it is not possible to employ
most common optimization methods, such as steepest
descent, conjugate gradient methods or the
Levenberg-Marquardt method. Indeed, to obtain sufficiently
precise numerical estimates of gradients or second order
derivatives of the empirical objective function, those
methods need to evaluate the function in points, some
of which would lie closer to each other than the
empirical error of catalytic measurements. That is why
methods not requiring any derivatives have been used
to solve such optimization tasks, such as the simplex
method, and most frequently genetic and other
evolutionary algorithms [2]. To compensate for missing
information about derivatives, these methods need quite
a large number of objective function evaluations. In the
context of catalysis, this is quite disadvantageous
because the evaluation of the empirical objective
functions used in the search for optimal catalysts is
often costly and time-consuming. Testing a generation
of catalytic materials proposed by an evolutionary
algorithm typically needs several days of time and costs
thousands of euros.</p>
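As a toy illustration of this trade-off, the following sketch runs a simple (1+1) evolution strategy, a derivative-free method, against a hypothetical stand-in for an empirical objective function, and counts how many measurements it consumes; the objective, the mutation width and the budget are illustrative assumptions, not taken from the paper.

```python
import random

# A (1+1) evolution strategy: no derivatives are used, but many objective
# evaluations are needed.  The objective is a made-up stand-in for a costly
# catalytic measurement.

random.seed(0)
evaluations = 0

def empirical_objective(x):
    """Hypothetical expensive measurement; maximum at x = 2."""
    global evaluations
    evaluations += 1
    return -(x - 2.0) ** 2

x, fx = 0.0, empirical_objective(0.0)
for _ in range(200):                    # 200 further measurements
    cand = x + random.gauss(0.0, 0.5)   # mutate, derivative-free
    fcand = empirical_objective(cand)
    if fcand > fx:                      # elitist: keep the better individual
        x, fx = cand, fcand
# x approaches the optimum only at the price of many evaluations
```

Each accepted step improves the fitness, but every trial step, accepted or not, costs one full empirical evaluation.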
      <p>
        The usual approach to decreasing the cost and time
of optimization of empirical objective functions is to
evaluate the function only in points considered to be
most important for the progress of the employed
optimization method, and to evaluate its suitable
regression model otherwise. That model is termed surrogate
model of the function, and the approach is referred to
as surrogate modelling [
        <xref ref-type="bibr" rid="ref10 ref13 ref17 ref7">17, 20, 23, 27</xref>
        ]. Needless to say,
the time and costs needed to evaluate a regression
model are negligible compared to time and costs needed
to evaluate empirical functions such as yield or
conversion. However, it must not be forgotten that the
agreement between the results obtained with a
surrogate model and those obtained with the original
function depends on the accuracy of the model.
      </p>
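The division of labour just described can be sketched as follows; the expensive objective and the 1-nearest-neighbour predictor standing in for the MLP regression used in this area are illustrative assumptions.

```python
# Surrogate-modelling idea: the empirical objective is evaluated only on a
# small archive of points; a cheap regression model is queried elsewhere.
# A 1-nearest-neighbour predictor stands in for a trained regression model.

def expensive_objective(x):
    """Hypothetical stand-in for a costly catalytic measurement."""
    return -(x - 0.3) ** 2

archive = {}  # points where the empirical objective was actually evaluated

def evaluate_empirically(x):
    archive[x] = expensive_objective(x)
    return archive[x]

def surrogate(x):
    """Predict the objective from the archive (1-nearest neighbour)."""
    nearest = min(archive, key=lambda p: abs(p - x))
    return archive[nearest]

for x in (0.0, 0.25, 0.5, 0.75, 1.0):   # a handful of expensive evaluations
    evaluate_empirically(x)

cheap_estimate = surrogate(0.26)         # no expensive measurement needed
```

The accuracy of `cheap_estimate` is bounded by the quality of the model, which is exactly the caveat stated in the text.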
      <p>
        The fact that feed-forward neural networks are the
most frequent regression models in catalysis suggests
them as the most natural candidate for surrogate
models in this area. Indeed, several nice examples of the
application of neural-network based surrogate
modelling to the optimization of performance of catalytic
materials have been published during the last five
years [
        <xref ref-type="bibr" rid="ref11 ref15">3, 6, 21, 25</xref>
        ]. Within the overall context of the
application of artificial neural networks to mining
catalytic data, however, they are still rare.
      </p>
      <p>
        Although surrogate modelling has also been
applied to conventional optimization [5], it is most
frequently encountered in connection with evolutionary
algorithms because for them, the approach leads to the
approximation of the fitness function, whose usefulness
in evolutionary computation is already known [
        <xref ref-type="bibr" rid="ref3 ref9">13, 19</xref>
        ].
      </p>
      <p>For the progress of evolutionary optimization, the most
important individuals are, on the one hand, points that
indicate closeness to the global optimum (through the
highest values of the fitness function), and on the other
hand, points that contribute most to the diversity of the
population.</p>
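A minimal sketch of such a selection, assuming a toy 1-dimensional population and an illustrative split between "fittest" and "most diverse" picks (the function name and the weighting are hypothetical):

```python
# Select individuals that are either promising (high model-predicted fitness)
# or add diversity (far from the individuals already selected).

def select_for_evaluation(candidates, fitness, n_best, n_diverse):
    """candidates: 1-D points; fitness: dict point -> predicted fitness."""
    ranked = sorted(candidates, key=lambda c: fitness[c], reverse=True)
    chosen = ranked[:n_best]           # closeness to the global optimum
    rest = ranked[n_best:]
    for _ in range(n_diverse):         # contribution to diversity
        far = max(rest, key=lambda c: min(abs(c - s) for s in chosen))
        chosen.append(far)
        rest.remove(far)
    return chosen

cands = [0.0, 0.1, 0.2, 0.8, 0.9]
fit = {0.0: 5.0, 0.1: 4.5, 0.2: 4.0, 0.8: 1.0, 0.9: 0.5}
picked = select_for_evaluation(cands, fit, n_best=2, n_diverse=1)
# picked: the two fittest points plus the point farthest from them
```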
      <p>
        In the literature, various possibilities of combining
evolutionary optimization with surrogate modelling have
been discussed [
        <xref ref-type="bibr" rid="ref14 ref17 ref7">17, 24, 27</xref>
        ]. Nevertheless, all of them are controlled by one of
two basic approaches:
A. The individual-based control consists in choosing
between the evaluation of the empirical objective function
and the evaluation of its surrogate model individual-wise,
basically in the following steps:
(i) An initial set E of individuals is collected, in which
the considered empirical fitness η was evaluated (for
example, the population of the several first generations of
the evolutionary algorithm).
(ii) The surrogate model is constructed using the set of
pairs {(x, η(x)) : x ∈ E}.
(iii) The evolutionary algorithm is run with the fitness η
replaced by the model for one generation with a
population Q of size qP, where P is the desired population
size for the optimization of η, and q is a prescribed ratio
(e.g., q = 10 or q = 100).
(iv) A subset P ⊂ Q of size P is selected so as to contain
those individuals from Q that are most important
according to the considered criteria for the progress of
optimization.
(v) For x ∈ P, the empirical fitness is evaluated.
(vi) The set E is replaced by E ∪ P and the algorithm
returns to the step (ii).
B. The generation-based control consists in choosing
between both kinds of evaluation generation-wise,
basically in the following steps:
(i) An initial set E of individuals in which the considered
empirical fitness η was evaluated is collected like with the
individual-based control.
(ii) The surrogate model is constructed using the set of
pairs {(x, η(x)) : x ∈ E}.
(iii) Relying on the error of the surrogate model,
measured with a prescribed error measure (e.g., mean
squared error, MSE, or mean absolute error, MAE), an
appropriate number gm of generations is chosen, during
which η should be replaced by the model.
(iv) The evolutionary algorithm is run with the fitness η
replaced by the model for gm generations with populations
P1, . . . , Pgm of size P.
(v) The evolutionary algorithm is run with the empirical
fitness η for a prescribed number ge of generations
(frequently, ge = 1) with populations Pgm+1, . . . , Pgm+ge.
(vi) The set E is replaced by E ∪ Pgm+1 ∪ · · · ∪ Pgm+ge
and the algorithm returns to the step (ii).
      </p>
      <p>The agreement between the results that are obtained
with a surrogate model and those that would be obtained
if the empirical objective function were evaluated depends
on the accuracy of the model. A popular approach to
increasing the accuracy of learning methods is boosting,
i.e., the construction of a strong learner through
combining weak learners. It is important to realize that
boosted surrogate models are only particular kinds of
surrogate models, and their interaction with optimization
algorithms in optimization tasks follows the same rules as
the interaction of surrogate models in general. In
particular, in the above outlines of the individual-based
and generation-based control, boosting is always
performed in the step (ii), which has to be replaced with:
(ii'a) The set {(x, η(x)) : x ∈ E} is divided into k disjoint
subsets of size ⌊|E|/k⌋ or ⌈|E|/k⌉, where | | denotes the
cardinality of a set, ⌊ ⌋ the lower integer bound of a real
number, and ⌈ ⌉ its upper integer bound.
(ii'b) For each j = 1, . . . , k, a surrogate model F1j is
constructed, using only data not belonging to the j-th
subset.
(ii'c) A k-fold crossvalidation of regression boosting is
performed, and the error of the boosting approximation is
in each iteration measured with the prescribed error
measure on the validation data.
(ii'd) The first iteration i in which the average error of
the boosting approximation on the validation data is
lower than in the i + 1-th iteration is taken as the final
iteration of boosting.
(ii'e) Boosting using the complete set {(x, η(x)) : x ∈ E}
is performed up to the final iteration found in step (ii'd),
and the result of the application of the employed boosting
method in each such iteration of boosting is taken as the
boosted surrogate model in that iteration.</p>
      <p>2.1 An illustration</p>
      <p>
        A particular method for MLP boosting has been
presented in [11]. That method will now be employed in
surrogate modelling with data from the investigation of
catalytic materials for the high-temperature synthesis of
hydrocyanic acid (HCN) [
        <xref ref-type="bibr" rid="ref6">16</xref>
        ]. The composition of most of those materials was
designed by means of a specific genetic algorithm (GA)
for heterogeneous catalysis [
        <xref ref-type="bibr" rid="ref16">26</xref>
        ]. As usual in the evolutionary optimization of catalytic
materials, the GA configuration was determined by the
experimental conditions in which the optimization was
performed: the number of channels of the reactor in
which the materials were tested, as well as the time and
financial resources available for those expensive tests. In
the reported investigation, the algorithm was running for
7 generations of population size 92, and in addition 52
other catalysts with manually designed composition were
investigated. Consequently, data about 696 catalytic
materials were available. The considered MLPs had 14
input neurons: 4 of them coding the catalyst support, the
other 10 corresponding to the proportions of 10 metal
additives forming the active shell, and 3 output neurons,
corresponding to 3 kinds of catalytic activity considered
as fitness functions.
      </p>
      <p>For boosting, only data about catalysts from the
1st-6th generation of the GA and about the 52 catalysts
with manually designed composition were used, thus
altogether data about 604 catalytic materials. Data about
catalysts from the 7th generation were completely
excluded and left out for testing. The set of architectures
to which boosting was applied was restricted to MLPs
with 1 and 2 hidden layers and was delimited by means of
the heuristic pyramidal condition: the number of neurons
in a subsequent layer must not exceed the number of
neurons in a previous layer. Let nI, nH and nO denote the
numbers of input, hidden and output neurons,
respectively, and nH1 and nH2 denote the numbers of
neurons in the first and second hidden layer, respectively.
Then the pyramidal condition entails the following
90 architectures:
(i) one hidden layer and 3 ≤ nH ≤ 14 (12 architectures);
(ii) two hidden layers and 3 ≤ nH2 ≤ nH1 ≤ 14
(78 architectures).</p>
      <p>As was mentioned above, boosting can be combined
both with the individual-based and with the
generation-based control of surrogate modelling. In the
reported investigation of catalytic materials for HCN
synthesis, the individual-based control was employed.</p>
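The individual-based control of steps (i)-(vi) can be sketched as follows; the fitness η, the nearest-neighbour surrogate and the population sizes are illustrative stand-ins, not the boosted MLPs of the reported investigation.

```python
import random

# Individual-based control: the population evolved on the surrogate is
# q times larger than the population evaluated empirically.

random.seed(1)

def eta(x):                          # hypothetical empirical fitness
    return -(x - 0.7) ** 2

E = {x: eta(x) for x in [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]}   # step (i)

P_SIZE, q = 4, 10
for generation in range(3):
    def model(x):                    # step (ii): surrogate from (x, eta(x))
        nearest = min(E, key=lambda p: abs(p - x))
        return E[nearest]
    Q = [random.uniform(0.0, 1.0) for _ in range(q * P_SIZE)]  # step (iii)
    Q.sort(key=model, reverse=True)
    P = Q[:P_SIZE]                   # step (iv): most promising individuals
    for x in P:                      # step (v): empirical evaluation
        E[x] = eta(x)                # step (vi): E := E ∪ P, back to (ii)

best = max(E, key=E.get)             # best empirically evaluated point
```

Only 4 points per generation are measured empirically, while 40 candidates per generation are screened on the surrogate.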
      <p>The error measure employed in the crossvalidation
in the step (ii'c) was the MSE. The distribution of the
final iterations of boosting, found for MLPs with the
90 considered architectures in the step (ii'd), is
depicted in Figure 2. We can see that for only 16 MLPs
the 1st iteration was already the final one. For the
remaining 74 MLPs, boosting improved the average MSE
on the validation data for at least 1 iteration. The mean
and median of the distribution of the final iterations
were 6.6 and 5, respectively.</p>
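The stopping rule of step (ii'd) is easily made precise in code; the error sequence below is made up for illustration.

```python
# Step (ii'd): the final boosting iteration is the first iteration i whose
# average validation error is lower than that of iteration i + 1
# (iterations numbered from 1).

def final_boosting_iteration(avg_errors):
    """avg_errors[i-1] = average crossvalidation error in iteration i."""
    for i in range(1, len(avg_errors)):
        if avg_errors[i - 1] < avg_errors[i]:
            return i
    return len(avg_errors)           # errors kept improving throughout

errors = [0.9, 0.7, 0.5, 0.55, 0.6]  # hypothetical averaged MSE per iteration
final = final_boosting_iteration(errors)   # iteration 3: 0.5 < 0.55
```

If the first iteration already satisfies the condition, boosting stops immediately, which is the case observed for 16 of the 90 MLPs.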
      <sec id="sec-2-1">
        <title>Testing on the 7th generation</title>
        <p>For testing with the data from the 7th generation of
the evolutionary algorithm, we used only the five MLPs
most promising from the point of view of the average
MSE on the validation data in the final iteration of
boosting. These were the following MLPs:
– a 1-hidden-layer MLP, with nH = 11 and the 3rd
iteration of boosting being the final iteration,
– four 2-hidden-layer MLPs, with (nH1, nH2) = (10, 4),
(10, 6), (13, 5), (14, 8) and the final iterations of boosting
19, 32, 31 and 29, respectively.
For each of them, the validation proceeded as follows:
1. In each iteration up to the final, a single MLP was
trained with data about all the 604 catalytic materials
used for boosting.
2. In each iteration up to the final iteration of boosting,
the boosted surrogate model was constructed for the
trained MLP, according to the step (ii'e).
3. From the values predicted by the boosted surrogate
model for the 92 materials from the 7th generation of the
GA, and from the measured values, the boosting MSE
was calculated.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Results</title>
        <p>The results are summarized in Figure 3,
decomposed to the properties corresponding to the MLP
outputs – the conversions of CH4 and NH3 and the yield
of HCN. They clearly confirm the usefulness of boosting
for the five considered architectures. For each of them,
boosting leads to an overall decrease of the MSE of the
conversion of CH4 and of the HCN yield on new data
from the 7th generation of the GA, which is
uninterrupted or nearly uninterrupted till the final
boosting iteration. On the other hand, boosting did not
lead to any decrease of the error of the conversion of
NH3, which, however, is already from the beginning
much lower than for the two other performance measures
(notice that the scale of the y-axis is 10-times finer for
the conversion of NH3 than for the conversion of CH4
and the HCN yield). The explanation for the different
behavior of the conversion of NH3 is the substantially
lower variability of its values in the seventh generation of
the GA, used for validating the usefulness of boosting
(standard deviation, SD: 2.8, interquartile range,
IQR: 1.6), compared to the conversion of CH4 (SD: 26.1,
IQR: 45.0) and the HCN yield (SD: 20.1, IQR: 35.9).
Due to such low variability, the conversion of NH3
appears effectively as nearly constant during the
validation of boosting, which in turn accounts for a
nearly constant MSE.</p>
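The role of variability can be reproduced with the standard SD and IQR computations; the two samples below are hypothetical, only the low-spread versus high-spread comparison mirrors the reported one.

```python
import statistics

# A low-variability target (like the NH3 conversion) leaves little room for
# the MSE to change; a high-variability target (like the CH4 conversion)
# does.  The samples are made up for illustration.

def iqr(values):
    """Interquartile range via the sample quartiles."""
    q = statistics.quantiles(values, n=4)
    return q[2] - q[0]

nh3_conv = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5]   # hypothetical, low spread
ch4_conv = [5.0, 20.0, 45.0, 60.0, 80.0, 95.0]    # hypothetical, high spread

sd_nh3, sd_ch4 = statistics.stdev(nh3_conv), statistics.stdev(ch4_conv)
# sd_ch4 and iqr(ch4_conv) far exceed sd_nh3 and iqr(nh3_conv)
```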
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Neural-network based rules extraction from data</title>
      <p>The architecture of a trained neural network and the
weights and biases that determine the regression
model computed by the network inherently represent the
knowledge contained in the data used to train the
network. As was already mentioned in the introduction,
such a representation is not comprehensible to
humans, being very far from the symbolic, modular and
often vague way in which they represent knowledge
themselves. Therefore, methods for the extraction of
symbolic knowledge from trained neural networks have
been investigated since the late 1980s. Most frequently,
the extracted knowledge has the form of a Boolean
implication:</p>
      <sec id="sec-3-1">
        <p>IF the input variables fulfil an input condition CI
THEN the output variables are likely to fulfil an output
condition CO. (1)</p>
      </sec>
      <sec id="sec-3-2">
        <p>
          In addition, also implications and equivalences of
important kinds of fuzzy logic are frequently
extracted [
          <xref ref-type="bibr" rid="ref5">8, 15</xref>
          ]. In general, extracted formulas of a formal logic are
called rules. Over the last two decades, various rules
extraction methods have been proposed for neural
networks, but so far none of them has become a common
standard (cf. the survey papers [
          <xref ref-type="bibr" rid="ref1 ref12 ref5">1, 15, 22</xref>
          ] and the monograph [7]). Here, a method for the
extraction of Boolean implications from multilayer
perceptrons with n inputs and m outputs will be sketched
that finds to each output condition of the form:
        </p>
      </sec>
      <sec id="sec-3-3">
        <p>CO : the value y of the output variables lies in a
rectangular area R ⊂ Rm (2)</p>
        <p>one or more input conditions of the form</p>
      </sec>
      <sec id="sec-3-4">
        <p>CI : the value x of the input variables lies in a
polyhedron P ⊂ Rn. (3)</p>
      </sec>
      <sec id="sec-3-5">
        <p>Hence, this method extracts rules of the form:</p>
        <p>IF x ∈ P THEN y ∈ R. (4)</p>
        <p>A detailed explanation of the method can be found
in [9]. Its main principles can be summarized as follows:
– An m-dimensional rectangular area R with borders
perpendicular to the m coordinate axes has to be chosen
in advance in the output space of a trained MLP with
sigmoid activation functions. The reason for choosing
such an area is that in the space of evaluations of m free
variables, each m-dimensional rectangular area is the
validity set of the conjunction of some m univariate
Boolean predicates. That conjunction then serves as the
consequent of the rule to extract.
– The activation functions in the hidden neurons are
approximated with piecewise-linear sigmoid activation
functions. This can be done with an arbitrary precision.
– The products of the individual linearity intervals of all
the activation functions determine areas in the input
space in which the final approximating mapping
computed by the multilayer perceptron is linear.
– In each such area, all points mapped to R form a
polyhedron, which may eventually be empty, or may be
concatenated with polyhedra from some of the
neighboring areas to a larger polyhedron.
– The union of all the nonempty concatenated polyhedra
P1, . . . , Pq defines the antecedent of a rule in a combined
form</p>
        <p>IF x ∈ P1 ∪ · · · ∪ Pq THEN y ∈ R, (5)</p>
        <p>which is equivalent to a logical disjunction of q rules
of the simple form (4):</p>
        <p>IF x ∈ P1 THEN y ∈ R
. . .
IF x ∈ Pq THEN y ∈ R. (6)</p>
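The piecewise-linear approximation step underlying these rules can be sketched as follows, assuming the logistic sigmoid and illustrative breakpoint grids; refining the grid shrinks the approximation error, in line with the "arbitrary precision" claim above.

```python
import math

# Replace the logistic sigmoid by a function that is linear between
# breakpoints; a finer breakpoint grid gives a smaller maximal error.

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def piecewise_linear_sigmoid(breakpoints):
    """Return a function linear on each interval between the breakpoints."""
    ys = [sigmoid(b) for b in breakpoints]
    def f(t):
        if t <= breakpoints[0]:
            return ys[0]
        if t >= breakpoints[-1]:
            return ys[-1]
        for b0, b1, y0, y1 in zip(breakpoints, breakpoints[1:], ys, ys[1:]):
            if t <= b1:                      # linear interpolation on [b0, b1]
                return y0 + (y1 - y0) * (t - b0) / (b1 - b0)
    return f

coarse = piecewise_linear_sigmoid([-6 + 2 * k for k in range(7)])     # 7 pts
fine = piecewise_linear_sigmoid([-6 + 0.25 * k for k in range(49)])   # 49 pts
err = lambda f: max(abs(f(0.01 * k - 6) - sigmoid(0.01 * k - 6))
                    for k in range(1201))
# err(fine) is much smaller than err(coarse)
```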
      </sec>
      <sec id="sec-3-6">
        <p>Rules of the form (7) are also very convenient from
the visualization point of view: since cuts of rectangular
areas coincide with the corresponding projections of those
areas, the values of no variables need to be fixed.</p>
      </sec>
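Once each polyhedron is represented by linear inequalities, checking the antecedent of a rule of the form (5) is mechanical; the polyhedra below are illustrative, not extracted from a trained network.

```python
# Rule (5) in executable form: the antecedent holds when x lies in at least
# one polyhedron, each given by linear inequalities  sum(a * x) <= b.

def in_polyhedron(x, inequalities):
    """inequalities: list of (a, b) encoding the constraint sum(a*x) <= b."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) <= b
               for a, b in inequalities)

def antecedent_holds(x, polyhedra):
    """True iff x lies in P1 ∪ ... ∪ Pq."""
    return any(in_polyhedron(x, p) for p in polyhedra)

# P1: unit square 0 <= x1, x2 <= 1;  P2: triangle x1 + x2 <= -1, x1, x2 <= 0
P1 = [((1, 0), 1), ((-1, 0), 0), ((0, 1), 1), ((0, -1), 0)]
P2 = [((1, 1), -1), ((1, 0), 0), ((0, 1), 0)]

inside = antecedent_holds((0.5, 0.5), [P1, P2])    # lies in P1
outside = antecedent_holds((2.0, 2.0), [P1, P2])   # lies in neither
```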
      <sec id="sec-3-7">
        <p>To increase the comprehensibility of the extracted
rules, visualization by means of 2- or 3-dimensional cuts
of the set P1 ∪ · · · ∪ Pq can be used (Figure 4).</p>
        <p>Usually, logical rules of the form (4) are the final
results of this rule-extraction method. Nonetheless, there
is one exception – when the polyhedron P is also
rectangular with borders perpendicular to the axes, or
more generally, when P can be approximately replaced
with such a rectangular area RI in the input space. Then
the above rule (4) can be approximately expressed in the
conjunctive form</p>
        <p>IF x1 ∈ I1 &amp; . . . &amp; xnI ∈ InI THEN y ∈ R. (7)</p>
        <p>Here, I1, . . . , InI are intervals that constitute the
projections of RI into the nI input dimensions. Each such
interval can be restricted both from below and from
above, restricted only from below or only from above, or
finally can even be the complete set of real numbers.
However, dimensions for which the corresponding
projection of RI equals the complete real axis are usually
not included in (7), since they would not provide any new
knowledge. Finally, observe that due to (5) and (7), the
final extracted rule is in the disjunctive normal form.</p>
        <p>In the rule-extraction method outlined above, the
possibility of replacing a polyhedron P with a rectangular
area RI is assessed according to the following principles:
1. The resulting dissatisfaction with points that either
belong to P but do not belong to RI, or belong to RI but
do not belong to P (i.e., with points from the symmetric
difference RI ∆ P), has to remain within a prescribed
tolerance ε, and RI has to be minimal in the input space
among rectangular areas of some specified kind with
dissatisfaction within that tolerance.
2. The dissatisfaction with points from RI ∆ P depends
solely on those points and is increasing with respect to
inclusion. Consequently, it can be measured using some
monotone measure on the input space, possibly
depending on P.
3. To be eligible for replacement, P has to cover at least
one point of the available data.</p>
        <p>3.1 An illustration</p>
        <p>As an example, Figure 5 shows three-dimensional
cuts determining the antecedents of conjunctive-form
rules extracted from a trained MLP with 5 input neurons
and 1 output neuron such that:
(i) the input neurons correspond to variables that record
the molar proportions of the oxides of Fe, Ga, Mg, Mn
and Mo in the catalytic material;
(ii) the output neuron corresponds to a variable recording
the propene yield.
The extracted rules are listed in Table 1.</p>
      </sec>
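A conjunctive-form rule (7) reduces to interval checks on the individual inputs; the intervals below are illustrative, not the rules of Table 1.

```python
# Rule (7) in executable form: a conjunction of interval conditions on the
# inputs, where None marks a side on which an interval is unrestricted
# (such dimensions would usually be omitted from the written rule).

def conjunctive_rule_fires(x, intervals):
    """intervals: per input a pair (lower, upper); None = unbounded side."""
    for xi, (lo, hi) in zip(x, intervals):
        if lo is not None and xi < lo:
            return False
        if hi is not None and xi > hi:
            return False
    return True

# e.g. IF x1 in [0.1, 0.4] & x2 >= 0.2 THEN y in R  (x3 unrestricted)
rule = [(0.1, 0.4), (0.2, None), (None, None)]

fires = conjunctive_rule_fires((0.3, 0.5, 9.9), rule)     # True
blocked = conjunctive_rule_fires((0.5, 0.5, 0.0), rule)   # False: x1 > 0.4
```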
      <sec id="sec-3-8">
        <title>For 2., the most attractive monotone measures, due to their straightforward interpretability, are:</title>
        <p>– The joint empirical distribution of the input
variables in the available data.
– The conditional empirical distribution of the input
variables in the available data, conditioned by P.</p>
      </sec>
      <sec id="sec-3-9">
        <title>Conclusion</title>
        <p>The paper dealt with employing feed-forward neural
networks for knowledge discovery from data about
chemical materials. It has shown that in this application
area, obtaining numeric knowledge by neural-network
regression is justified, in spite of the fact that numeric
knowledge is substantially less human-understandable
than symbolic knowledge. Its justification consists in the
possibility to use such knowledge in the optimization
tasks entailed by the search for new materials in the
surrogate modelling approach.</p>
        <p>In addition to justifying the specific need for numeric
knowledge from neural network regression in this
application area, the paper recalled the possibility to
obtain symbolic knowledge in the form of logical rules
from trained neural networks. It explained a recently
proposed method for the extraction of Boolean rules in
disjunctive normal form, and illustrated it on data about
catalytic materials.</p>
        <p>9. M. Holeňa: Piecewise-linear neural networks and
their relationship to rule extraction from data. Neural
Computation, 18, 2006, 2813–2853.
10. M. Holeňa and M. Baerns: Computer-aided strategies
for catalyst development. In: G. Ertl, H. Knözinger,
F. Schüth, and J. Weitkamp (eds), Handbook of
Heterogeneous Catalysis, Wiley-VCH, Weinheim, 2008,
66–81.
11. M. Holeňa, D. Linke, and N. Steinfeldt: Boosted
neural networks in evolutionary computation. In: Neural
Information Processing, Lecture Notes in Computer
Science 5864, Springer Verlag, Berlin, 2009, 131–140.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>R.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Diederich</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.B.</given-names>
            <surname>Tickle</surname>
          </string-name>
          :
          <article-title>Survey and critique of techniques for extracting rules from trained artificial neural networks</article-title>
          .
          <source>Knowledge Based Systems</source>
          ,
          <volume>8</volume>
          ,
          <year>1995</year>
          ,
          <fpage>378</fpage>
          -
          <lpage>389</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          12. K. Hornik:
          <article-title>Approximation capabilities of multilayer neural networks</article-title>
          .
          <source>Neural Networks</source>
          ,
          <volume>4</volume>
          ,
          <year>1991</year>
          ,
          <fpage>251</fpage>
          -
          <lpage>257</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          , M. Hüsken, M. Olhofer, and
          <string-name>
            <given-names>B.</given-names>
            <surname>Sendhoff</surname>
          </string-name>
          :
          <article-title>Neural networks for fitness approximation in evolutionary optimization</article-title>
          . In Y. Jin, (ed.),
          <source>Knowledge Incorporation in Evolutionary Computation</source>
          , Springer Verlag, Berlin,
          <year>2005</year>
          ,
          <fpage>281</fpage>
          -
          <lpage>306</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          14. V.
          <article-title>Kůrková: Neural networks as universal approximators</article-title>
          . In: M. Arbib, (ed.),
          <source>Handbook of Brain Theory and Neural Networks</source>
          , MIT Press, Cambridge,
          <year>2002</year>
          ,
          <fpage>1180</fpage>
          -
          <lpage>1183</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          15.
          <string-name>
            <given-names>S.</given-names>
            <surname>Mitra</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hayashi</surname>
          </string-name>
          <article-title>: Neuro-fuzzy rule generation: Survey in soft computing framework</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          ,
          <volume>11</volume>
          ,
          <year>2000</year>
          ,
          <fpage>748</fpage>
          -
          <lpage>768</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          16.
          <string-name>
            <given-names>S.</given-names>
            <surname>Möhmel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Steinfeldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Endgelschalt</surname>
          </string-name>
          , M. Holeňa, S. Kolf,
          <string-name>
            <given-names>U.</given-names>
            <surname>Dingerdissen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weber</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Bewersdorf</surname>
          </string-name>
          :
          <article-title>New catalytic materials for the high-temperature synthesis of hydrocyanic acid from methane and ammonia by high-throughput approach</article-title>
          .
          <source>Applied Catalysis A: General</source>
          ,
          <volume>334</volume>
          ,
          <year>2008</year>
          ,
          <fpage>73</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Y.S.</given-names>
            <surname>Ong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.B.</given-names>
            <surname>Nair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Keane</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>K.W.</given-names>
            <surname>Wong</surname>
          </string-name>
          :
          <article-title>Surrogate-assisted evolutionary optimization frameworks for high-fidelity engineering design problems</article-title>
          . In: Y. Jin, (ed.),
          <source>Knowledge Incorporation in Evolutionary Computation</source>
          , Springer Verlag, Berlin,
          <year>2005</year>
          ,
          <fpage>307</fpage>
          -
          <lpage>331</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          18.
          <string-name>
            <given-names>A.</given-names>
            <surname>Pinkus</surname>
          </string-name>
          :
          <article-title>Approximation theory of the MLP model in neural networks</article-title>
          .
          <source>Acta Numerica</source>
          ,
          <volume>8</volume>
          ,
          <year>1998</year>
          ,
          <fpage>277</fpage>
          -
          <lpage>283</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          19.
          <string-name>
            <given-names>A.</given-names>
            <surname>Ratle</surname>
          </string-name>
          :
          <article-title>Accelerating the convergence of evolutionary algorithms by fitness landscape approximation</article-title>
          . In: A.E. Eiben, T. Bäck, M. Schoenauer, and
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Schwefel</surname>
          </string-name>
          , (eds),
          <source>Parallel Problem Solving from Nature</source>
          , Springer Verlag, Berlin,
          <year>1998</year>
          ,
          <fpage>87</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          20.
          <string-name>
            <given-names>A.</given-names>
            <surname>Ratle</surname>
          </string-name>
          :
          <article-title>Kriging as a surrogate fitness landscape in evolutionary optimization</article-title>
          .
          <source>Artificial Intelligence for Engineering Design, Analysis and Manufacturing</source>
          ,
          <volume>15</volume>
          ,
          <year>2001</year>
          ,
          <fpage>37</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          21.
          <string-name>
            <given-names>U.</given-names>
            <surname>Rodemerck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Baerns</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Holeňa</surname>
          </string-name>
          :
          <article-title>Application of a genetic algorithm and a neural network for the discovery and optimization of new solid catalytic materials</article-title>
          .
          <source>Applied Surface Science</source>
          ,
          <volume>223</volume>
          ,
          <year>2004</year>
          ,
          <fpage>168</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          22.
          <string-name>
            <given-names>A.B.</given-names>
            <surname>Tickle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Golea</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Diederich</surname>
          </string-name>
          :
          <article-title>The truth will come to light: Directions and challenges in extracting rules from trained artificial neural networks</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          ,
          <volume>9</volume>
          ,
          <year>1998</year>
          ,
          <fpage>1057</fpage>
          -
          <lpage>1068</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          23.
          <string-name>
            <given-names>H.</given-names>
            <surname>Ulmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Streichert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zell</surname>
          </string-name>
          :
          <article-title>Model-assisted steady state evolution strategies</article-title>
          . In:
          <source>GECCO 2003: Genetic and Evolutionary Computation</source>
          , Springer Verlag, Berlin,
          <year>2003</year>
          ,
          <fpage>610</fpage>
          -
          <lpage>621</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          24.
          <string-name>
            <given-names>H.</given-names>
            <surname>Ulmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Streichert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zell</surname>
          </string-name>
          :
          <article-title>Model assisted evolution strategies</article-title>
          . In: Y. Jin, (ed.),
          <source>Knowledge Incorporation in Evolutionary Computation</source>
          , Springer Verlag, Berlin,
          <year>2005</year>
          ,
          <fpage>333</fpage>
          -
          <lpage>355</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          25.
          <string-name>
            <given-names>S.</given-names>
            <surname>Valero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Argente</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Botti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.M.</given-names>
            <surname>Serra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Serna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moliner</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Corma</surname>
          </string-name>
          :
          <article-title>DoE framework for catalyst development based on soft computing techniques</article-title>
          .
          <source>Computers and Chemical Engineering</source>
          ,
          <volume>33</volume>
          ,
          <year>2009</year>
          ,
          <fpage>225</fpage>
          -
          <lpage>238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          26.
          <string-name>
            <given-names>D.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.V.</given-names>
            <surname>Buyevskaya</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Baerns</surname>
          </string-name>
          :
          <article-title>An evolutionary approach in the combinatorial selection and optimization of catalytic materials</article-title>
          .
          <source>Applied Catalysis A: General</source>
          ,
          <volume>200</volume>
          ,
          <year>2000</year>
          ,
          <fpage>63</fpage>
          -
          <lpage>77</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          27.
          <string-name>
            <given-names>Z.Z.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.S.</given-names>
            <surname>Ong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.B.</given-names>
            <surname>Nair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Keane</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.Y.</given-names>
            <surname>Lum</surname>
          </string-name>
          :
          <article-title>Combining global and local surrogate models to accelerate evolutionary optimization</article-title>
          .
          <source>IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews</source>
          ,
          <volume>37</volume>
          ,
          <year>2007</year>
          ,
          <fpage>66</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>