<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Architecture</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Tool for Evolving Artificial Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Efstratios F. Georgopoulos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam V. Adamopoulos</string-name>
          <email>adam@med.duth.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Spiridon D. Likothanassis</string-name>
          <email>likothan@cti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Engineering &amp; Informatics, University of Patras</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Medicine, Democritus University of Thrace</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technological Educational Institute of Kalamata</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2081</year>
      </pub-date>
      <volume>1</volume>
      <fpage>2</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>A hybrid evolutionary algorithm that combines the genetic programming philosophy with a localized Extended Kalman Filter (EKF) training method is presented here. The algorithm is used for the topological evolution and training of multilayer neural networks, and is implemented as a visual software tool in the C++ programming language. The proposed hybrid evolutionary algorithm is applied to two bio-signal modeling tasks, the Magneto Encephalogram (MEG) of epileptic patients and the Magneto Cardiogram (MCG) of normal subjects, exhibiting very satisfactory results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>One of the main problems faced when Artificial Neural
Networks (ANNs), and especially Multilayer Perceptrons, are applied
to a task is finding the network architecture, or topology, that
is best suited for the task at hand. A network that is too small for the problem
may cause poor learning ability, while one that is too large may cause
poor generalization. Until now the most common way to determine
the architecture of a neural network has been trial and error. In
recent years, however, there have been many attempts to design
the architecture of a neural network automatically, and these
have led to a variety of methods.</p>
      <p>
        Two subcategories of such methods are a) the constructive and
b) pruning (destructive) algorithms, [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Roughly speaking, a
constructive algorithm starts with a minimal network, that is, an
ANN with a minimal number of hidden layers, hidden neurons and
connections, and adds new layers, neurons or connections during
the training phase, as necessary. A pruning
(destructive) algorithm does the opposite: it starts with a maximal
network and deletes the unnecessary layers, nodes and connections
during training.
      </p>
      <p>
        Another approach to this problem is by using Genetic
Algorithms. Genetic Algorithms are a class of optimization
algorithms, which are good in exploring a large and complex space
in an intelligent way in order to find values close to the global
optimum (see [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] for details). The design of a
near-optimal topology can be formulated as a search problem in the
architecture space, where each point in the space represents a network
architecture, and the training can be formulated as a search problem in
the weight space. Since the end of the last decade, there have been
several attempts to combine the technology of neural networks with
that of genetic algorithms. Given some performance criteria for
the architecture, for example error, generalization ability, learning
time or architectural complexity, the performance levels of all
architectures form a surface in that space, and the optimal architecture
design amounts to finding the optimum point on this surface.
      </p>
      <p>
        The first attempts, described in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], focused
mainly on the problem of training the networks and not on
topology design. They used neural networks with a fixed architecture
and genetic algorithms in order to search the weight space for a
near-optimum weight vector that solves the problem of network
training; that is, they used genetic algorithms in place of a
classical training algorithm. Soon the main research interest moved
from training to the search for the optimal architectural (or
topological) design of a neural network. Some early works used
genetic algorithms in order to imitate the pruning algorithms: they
start with a network larger than necessary for the task and then use
a specially designed genetic algorithm to determine which combination
of connections is sufficient to quickly and accurately learn to
perform the target task using back propagation. Miller et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]
did that for some small nets. The same problem, but for larger
networks, was faced by Whitley and Bogard in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Bornholdt and
Graudenz in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], used a modified GA in order to evolve a simplified
model of a biological neural network and then applied the algorithm
to some toy Boolean functions. A different approach to the design
and evolution of modular neural network architectures is presented
in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Billings and Zheng in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] used a GA for the architectural
evolution of radial basis function (RBF) networks. The most recent
approach and maybe the most successful one, to the problem of
finding the near optimum architecture is presented in [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. There,
Yao and Liu propose a new evolutionary system, the EPNet, for
evolving artificial neural networks’ behavior.
      </p>
      <p>
        In the last couple of years there has been increasing interest in the use
of multi-objective optimization methods, and especially
evolutionary multi-objective techniques, for neural network training
and structure optimization. Two very interesting approaches are
presented in [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] and [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ].
      </p>
      <p>
        The present work is the continuation of a series of efforts
concerning the application of evolutionary algorithms to the
optimization of neural networks. In [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] a neural network model
with binary neurons was evolved by a modified genetic algorithm in
order to learn some Boolean functions. In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] genetically evolved artificial neural
networks were successfully used for a variety of problems.
      </p>
      <p>
        In this paper we present a hybrid evolutionary method, closer
to a genetic programming technique, for the evolution of a
population of feed-forward Multilayer Perceptrons [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. This
hybrid algorithm combines a genetic programming technique (for
details see [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]) for the evolution of the architecture of a neural
network with a training method based on the localized Extended
Kalman Filter (EKF), known as the Multiple Extended Kalman
Algorithm (MEKA). The MEKA is described in detail in [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. The
novelty of this effort lies, apart from the combination of
evolution techniques with MEKA, in the capability of the proposed
method to search not only for the optimal number of hidden units
but also for the number of inputs needed for the problem at hand;
of course, this holds only for time series prediction problems, where
the number of needed past values, which represent the network’s
inputs, is unknown. This hybrid algorithm is an evolved and heavily
enriched version of an older algorithm that was developed by the
authors and presented in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Furthermore, this
evolutionary neural network system has been implemented as a
visual tool in C++ with a graphical user interface. In order to test
the ability of the algorithm to produce networks that perform well,
we apply the system to two biosignals, namely the Magneto
Encephalogram (MEG) recordings of epileptic patients and the
Magneto Cardiogram (MCG) recordings of normal subjects. The algorithm
produces networks of small size that perform well.
      </p>
      <p>The rest of the paper is organized as follows. Section 2 describes
the hybrid evolutionary algorithm, while the numerical experiments
are presented in Section 3. Finally, Section 4 presents the
concluding remarks.</p>
    </sec>
    <sec id="sec-2">
      <title>2 THE HYBRID EVOLUTIONARY ALGORITHM</title>
      <p>2.1 THE MULTIPLE EXTENDED KALMAN ALGORITHM - MEKA</p>
      <p>Consider a network characterized by a weight vector w. The
average cost function that should be minimized during the training
phase is defined in terms of N input-output patterns as follows:</p>
      <p>E_av(w) = (1 / 2N) Σ_{n=1..N} Σ_{j∈C} [d_j(n) − y_j(n)]²   (1)</p>
      <p>where d_j(n) is the desired response and y_j(n) the actual
response of output neuron j when input pattern n is presented, while
the set C includes all the output neurons of the network. The cost
function E_av(w) depends on the weight vector w because
y_j(n) itself depends on w.</p>
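<p>To make equation (1) concrete, the average cost can be computed directly from the stored desired and actual responses. The following C++ sketch (function and variable names are ours, purely illustrative) implements E_av(w):</p>

```cpp
#include <cstddef>
#include <vector>

// Average cost E_av(w) of equation (1): squared error between desired and
// actual outputs, summed over all N patterns and all output neurons,
// divided by 2N.  d[n][j] and y[n][j] hold the desired and actual
// responses of output neuron j when pattern n is presented.
double average_cost(const std::vector<std::vector<double>>& d,
                    const std::vector<std::vector<double>>& y) {
    const std::size_t N = d.size();
    double sum = 0.0;
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t j = 0; j < d[n].size(); ++j) {
            const double err = d[n][j] - y[n][j];
            sum += err * err;
        }
    return sum / (2.0 * N);
}
```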
      <p>
        Concentrating on an arbitrary neuron i, which might be located
anywhere in the network, its behavior during the training phase may
be viewed as a non-linear dynamic system, which in the context of
Kalman filter theory may be described by the following
state-measurement equations [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]:
      </p>
      <p>w_i(n + 1) = w_i(n)   (2)
y_i(n) = φ(x_i^T(n) w_i(n))   (3)
d_i(n) = y_i(n) + e_i(n)   (4)
where the iteration n corresponds to the presentation of the n-th
input pattern, x_i(n) and y_i(n) are the input and output vector of
neuron i respectively, and e_i(n) is the measurement error at the
output of neuron i, the instantaneous estimate of which is given by:
e_i(n) = −∂E(n)/∂y_i(n)   (5)</p>
      <p>
        The differentiation in equation (5) corresponds to the
backpropagation of the global error to the output of neuron i. The
activation function φ(·) is responsible for the non-linearity in the
neuron. The weight vector w_i of the optimum model for neuron i is
to be “estimated” through training with examples. The activation
function is assumed to be differentiable. Accordingly, we can use a
Taylor series to expand equation (3) about the current estimate ŵ_i(n) of
the weight vector and thereby linearize the equation as follows [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]:
φ(x_i^T(n) w_i(n)) ≅ q_i^T(n) w_i(n) + [φ(x_i^T(n) ŵ_i(n)) − q_i^T(n) ŵ_i(n)]   (7)
where
      </p>
      <p>q_i(n) = [∂φ(x_i^T(n) w_i(n)) / ∂w_i(n)]_{w_i(n) = ŵ_i(n)}   (6)</p>
      <p>which, for the logistic activation function, evaluates to:
q_i(n) = ŷ_i(n) [1 − ŷ_i(n)] x_i(n)   (8)
Here ŷ_i(n) is the output of neuron i that results from the use of the
weight estimate ŵ_i(n). In equation (8) we have assumed the use of the
logistic function; other sigmoid functions, like the hyperbolic
tangent, can be used as well. The first term on the right-hand side of
equation (7) is the desired linear term, while the remaining term
represents a modeling error. Thus, substituting equations (7) and (4)
in (3) and ignoring the modeling error, we obtain:
d_i(n) = q_i^T(n) w_i(n) + e_i(n)   (9)</p>
      <p>where n = 1, …, N is the iteration number, N is the total
number of examples, and e_i(n) and q_i(n) are defined in equations (5)
and (8) respectively.</p>
      <p>
        Equations (2) and (9) describe the linearized behavior of neuron
i. Given this pair of equations, we can make use of the
standard Recursive Least Squares (RLS) algorithm equations [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
which is a special case of the Kalman filter, to estimate
the weight vector w_i(n) of neuron i. The resulting solution is
defined by the following system of recursive equations [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] that
describe the Multiple Extended Kalman Algorithm (MEKA) [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]:
r_i(n) = λ⁻¹ · P_i(n − 1) · q_i(n)   (10)
k_i(n) = r_i(n) · (1 + r_i^T(n) · q_i(n))⁻¹   (11)
w_i(n + 1) = w_i(n) + e_i(n) · k_i(n)   (12)
P_i(n) = λ⁻¹ · P_i(n − 1) − k_i(n) · r_i^T(n)   (13)
      </p>
      <p>The vector q_i(n) represents the linearized neuron activation
function given in equation (6), P_i(n) is the current estimate of the
inverse of the covariance matrix of q_i(n), and k_i(n) is the Kalman
gain. The parameter λ is a forgetting factor which takes values in
the range (0,1], and e_i(n) is the localized measure of the global
error. Equation (13) is called the Riccati difference equation.</p>
      <p>Each neuron in the network perceives its own effective input
q_i(n); hence it has to maintain its own copy of P_i(n), even in the
case in which it may share some of its inputs with other neurons in
the network.</p>
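<p>The MEKA recursions run independently for every neuron, each maintaining its own weight vector and P matrix. The following C++ sketch (illustrative names; this is not the paper's own code) performs one update step for a single logistic neuron. For an output neuron with squared error, the localized error of equation (5) reduces to e_i(n) = d_i(n) − y_i(n), which is what the sketch assumes:</p>

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One localized EKF (MEKA) update for a single logistic neuron, following
// equations (6)-(13).  P is the neuron's own inverse-covariance estimate
// and lambda the forgetting factor in (0,1].  Names are illustrative.
struct MekaNeuron {
    std::vector<double> w;               // weight vector w_i
    std::vector<std::vector<double>> P;  // matrix P_i
    double lambda;                       // forgetting factor

    static double logistic(double v) { return 1.0 / (1.0 + std::exp(-v)); }

    // Present one input pattern x with desired response d; returns the
    // neuron output y computed before the weight update.
    double update(const std::vector<double>& x, double d) {
        const std::size_t m = w.size();
        double v = 0.0;
        for (std::size_t k = 0; k < m; ++k) v += w[k] * x[k];
        const double y = logistic(v);

        // (8): linearized activation q = y(1-y)x for the logistic function
        std::vector<double> q(m);
        for (std::size_t k = 0; k < m; ++k) q[k] = y * (1.0 - y) * x[k];

        // (10): r = lambda^{-1} P q
        std::vector<double> r(m, 0.0);
        for (std::size_t k = 0; k < m; ++k)
            for (std::size_t l = 0; l < m; ++l)
                r[k] += P[k][l] * q[l] / lambda;

        // (11): k = r (1 + r^T q)^{-1}
        double rtq = 0.0;
        for (std::size_t k = 0; k < m; ++k) rtq += r[k] * q[k];
        std::vector<double> kgain(m);
        for (std::size_t k = 0; k < m; ++k) kgain[k] = r[k] / (1.0 + rtq);

        // localized error (5) for an output neuron with squared error
        const double e = d - y;

        // (12): w <- w + e k   and   (13): P <- lambda^{-1} P - k r^T
        for (std::size_t k = 0; k < m; ++k) w[k] += e * kgain[k];
        for (std::size_t k = 0; k < m; ++k)
            for (std::size_t l = 0; l < m; ++l)
                P[k][l] = P[k][l] / lambda - kgain[k] * r[l];
        return y;
    }
};
```

Repeatedly presenting the same pattern drives the neuron output toward the desired response while the gain shrinks as P contracts, which is why a couple of epochs already yield useful training inside the evolutionary loop.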
    </sec>
    <sec id="sec-3">
      <title>2.2 THE EVOLUTIONARY ALGORITHM</title>
      <p>
        The proposed evolutionary algorithm is an improved version of a
modified genetic algorithm that was previously used by the authors.
It maintains the basic working philosophy of evolutionary
algorithms and resembles genetic programming (see [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] for
details) since it evolves complicated structures like linked lists and
not simple bit strings as genetic algorithms do.
      </p>
      <p>The algorithm evolves, using a number of genetic operators, a
population of artificial neural networks (multilayer perceptrons)
that are represented as linked lists of network layers and neurons;
thus the direct encoding scheme is used. The basic steps of the
algorithm are as follows:
1. Initialization: An initial population of neural networks (called
individuals) is created. Every individual has a random number
of neurons (or nodes) and connections (synapses). The
connection weights are initialized to random values
within a specific range.</p>
      <p>2. Training: Every individual (neural network) in the population
is trained using MEKA for a small number of training epochs.
For populations other than the initial one, training occurs only for
those networks that have been changed by the application of
genetic operators.</p>
      <p>3. Fitness Evaluation: The fitness function combines the
performance of the network on the training
and/or validation set with the size of the network. The
performance is evaluated using the Mean Squared Error (MSE)
or the Mean Relative Error (MRE), while the size is the
number of neurons and/or the number of active synapses in the
network. For the case of MRE, the fitness function has the
following form:</p>
      <p>Fitness(i) = 1 / (1 + MRE(i) + sp · MRE(i) · SIZE(i))   (14)
where sp is a parameter that controls the weight of the
network size in the evaluation of fitness, MRE(i) is the value
of MRE of individual i, SIZE(i) is the size of individual i,
which can be calculated as the number of active connections
or the number of neurons, and i is an index taking values in the
range 1 to the population size.</p>
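<p>Equation (14) rewards accuracy first and then parsimony: a zero-error network has fitness 1 regardless of size, while among networks with equal MRE the smaller one is fitter. A minimal C++ sketch (names are ours):</p>

```cpp
// Fitness of equation (14): performance (MRE) penalized by network size,
// with sp controlling the weight of the size term.
double fitness(double mre, double size, double sp) {
    return 1.0 / (1.0 + mre + sp * mre * size);
}
```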
      <p>
        4. Selection: The selection operator is used to create a
new, intermediate population from the old one by selecting
individuals based on their fitness. This can be done using any
of the following three selection schemes that have
been implemented, namely:
 The Elitism Roulette Wheel Selection Operator, with
variable elitist pressure (for more details see [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
and [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]).
 The Rank Based Selection (for more details see [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ],
[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]).
 The Tournament Selection with variable tournament size
(for more details see [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]).
      </p>
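<p>As an illustration of the third scheme, tournament selection with a variable tournament size can be sketched as follows (a generic textbook form, not the tool's own code):</p>

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Tournament selection with variable tournament size: draw tsize random
// individuals (with replacement) and return the index of the fittest
// among them.  Larger tsize means stronger selection pressure.
std::size_t tournament_select(const std::vector<double>& fit,
                              std::size_t tsize, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, fit.size() - 1);
    std::size_t best = pick(rng);
    for (std::size_t t = 1; t < tsize; ++t) {
        const std::size_t cand = pick(rng);
        if (fit[cand] > fit[best]) best = cand;
    }
    return best;
}
```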
      <p>
        5. Mutation: It works on the members of the intermediate
population. Several different mutation operators are implemented:
 Input Mutation: it randomly selects a neural network from
the population and changes its number of inputs. This
operator works only on time series modeling and prediction
problems, where the number of past values (network inputs)
needed to predict future values is usually not known a
priori.
 Hidden mutation: it randomly selects a neural network from
the population and changes the structure of its hidden
region by adding or deleting a random number (selected
uniformly from a given interval) of hidden neurons.
 Non-Uniform Weight mutation: it is responsible for the fine-tuning
capabilities of the system. It randomly selects a
number of connection weights and changes their values to
new ones as follows: if w(n) is the old weight
value, the new one is given by the formula:
w(n + 1) = w(n) + Δ(t, ub − w(n))  or  w(n + 1) = w(n) − Δ(t, w(n) − lb)   (15)
where lb and ub are the lower and upper bounds of the
weight values, the sign of the change is chosen at random,
t is the generation number, and Δ(t, y) is a
function that returns a value in the range [0, y], such that the
probability of Δ(t, y) being close to 0 increases as t
increases. This property causes the operator to search the
solution space uniformly at first (while t is small) and very
locally at the later stages. In our experiments the following
function [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] was used:
      </p>
      <p>
        Δ(t, y) = y · (1 − r^((1 − t/T)^b))   (16)
where r is a random number in [0, 1], T is the maximal
generation number (a parameter of the algorithm), and b is a
system parameter determining the degree of non-uniformity.
 Gaussian weight mutation: it works like the Non Uniform
Weight mutation operator with the difference that the new
weight value is calculated by the formula:
      </p>
      <p>w(n + 1) = w(n) + Δw(n)   (17)</p>
      <p>where Δw is a small random number following a Gaussian
distribution.
 Uniform weight mutation: it works like the Gaussian
mutation operator with the difference that Δw is a small
random number following a uniform distribution.</p>
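<p>The non-uniform weight mutation of equations (15) and (16) can be sketched as follows. The function names are ours, and the random choice of mutation direction (toward ub or toward lb) is an assumption consistent with the text's mention of both bounds:</p>

```cpp
#include <cmath>
#include <random>

// Delta function of equation (16): returns a value in [0, y] whose
// probability of being near 0 grows with the generation number t, so the
// search is broad early and local late.  T is the maximal generation
// number and b the non-uniformity parameter.
double delta(double t, double y, double T, double b, std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    const double r = uni(rng);
    return y * (1.0 - std::pow(r, std::pow(1.0 - t / T, b)));
}

// Non-uniform weight mutation of equation (15): push w toward the upper
// bound ub or the lower bound lb with equal probability.  The result
// always stays inside [lb, ub].
double mutate_weight(double w, double lb, double ub,
                     double t, double T, double b, std::mt19937& rng) {
    if (rng() & 1u)
        return w + delta(t, ub - w, T, b, rng);
    return w - delta(t, w - lb, T, b, rng);
}
```

Note that at t = T the exponent in (16) vanishes, so Δ(T, y) = 0 and mutation stops perturbing weights entirely, which is the extreme case of the late-stage local search described above.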
      <p>6. Crossover: It selects two parents (neural networks) and
generates one or two offspring by recombining parts of them.
The offspring take the place of their parents in the new
population. In the presented algorithm crossover recombines
whole neurons together with their incoming connections. Since we
have to deal with networks with different structures, any new
connections that have to be produced are initialized with
random weight values, as in the initialization phase. Here
crossover works more like a mutation operator, as in most
genetic programming systems, than like the recombination
operator of genetic algorithms.</p>
      <p>In brief, the presented hybrid evolutionary algorithm works
as follows. It starts with a population of randomly constructed
neural networks (step 1). The networks undergo some training for a
couple of epochs with MEKA, using the training set (step 2).
Performance is measured with the fitness function (step 3) on the
validation set, in order to improve generalization. Then a new,
intermediate population is created by selecting the fitter
individuals according to their fitness (step 4), using any of the three
selection schemes. Some members of this intermediate population
undergo transformations by means of genetic operators, mutation (step 5)
and crossover (step 6), to form the members (individuals) of the new
population. The new population is then
trained again (step 2); new members are trained for a couple of
epochs, while the members that have survived from the
old population may be trained with MEKA for some more epochs,
or may not be trained at all. This is the new generation. The whole
process continues until a predefined termination condition is
fulfilled; the termination condition might be a maximum number of
generations or a minimum error (MSE or MRE) value. Once
terminated, the algorithm is expected to have reached a
near-optimum solution, i.e. a trained network with a near-optimum
architecture.</p>
    </sec>
    <sec id="sec-4">
      <title>2.3 THE TOOL</title>
      <p>This hybrid evolutionary algorithm has been implemented as a
visual tool in the C++ programming language, with a graphical user
interface (GUI). Specifically, the Borland C++ 6.0 IDE for Windows
was used. Figures 1 and 2 depict two of the basic forms
of the program, for the two main categories of problems it can
be used for: classification and time series prediction. Figure 3 is the
“statistics” form, which illustrates the evolutionary process and prints
useful information about it.</p>
      <p>Figure 1: The main form for classification problems of the
evolutionary neural network system.</p>
      <p>Figure 2: The main form for prediction problems of the
evolutionary neural network system.</p>
      <p>The user can select between these two problem categories. Then
he/she can insert the values of the various genetic parameters; the
training, validation and test files; and the output log files. The
user can observe the evolutionary process through real-time
graphical displays of the error, the performance of the best-ever
network and other parameters.</p>
    </sec>
    <sec id="sec-5">
      <title>3 NUMERICAL EXPERIMENTS</title>
      <p>In order to examine the ability of the algorithm to produce networks
that learn and generalize well we have tested it on two real world
problems: the modeling of the MEG recordings of epileptic patients
and the modeling of the MCG recordings of normal subjects.</p>
      <p>
        Brain dynamics can be evaluated by recording the changes of the
neuronal electric voltage, either by the electroencephalogram
(EEG), or by the MEG. The EEG recordings represent time
series that correspond to neurological activity as a function of time.
On the other hand the MEG is generated due to the time varying
nature of the neuronal electric activity, since time-varying electric
currents generate magnetic fields. EEG and MEG are considered to
be complementary, each one carrying a part but not the whole of the
information related to the underlying neural activity. Thus, it has
been suggested that the EEG is mostly related to the inter-neural
electric activity, whereas the MEG is mostly related to the
intraneural activity. The MEG recordings of epileptic patients were
obtained using a Super-conductive QUantum Interference Device
(SQUID) and were digitized with a sampling frequency of 256Hz
using a 12-bit A/D converter. The SQUID is a very sensitive
magnetometer, capable of detecting and recording the bio-magnetic fields
produced in the human brain due to the generation of electrical
micro-currents at the neural cellular level [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ].
      </p>
      <p>
        The same holds for the MCG recordings, which are magnetic
recordings of the heart function of normal subjects. MEG and
MCG data were provided by the Laboratory of Medical Physics of
the Democritus University of Thrace, Greece, where a one-channel
DC SQUID is operable. Both biosignals were normalized to the
interval [0, 1] in order to be processed by the neural networks.
      </p>
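<p>The normalization mentioned above can be realized as a standard min-max rescaling; the text does not spell out the exact scheme, so the following C++ sketch assumes plain min-max normalization:</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Min-max normalization of a biosignal to the interval [0, 1], as applied
// to the MEG and MCG recordings before feeding them to the networks.
// A constant signal is mapped to all zeros to avoid division by zero.
std::vector<double> normalize01(const std::vector<double>& x) {
    const auto [mn, mx] = std::minmax_element(x.begin(), x.end());
    const double range = *mx - *mn;
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = range > 0.0 ? (x[i] - *mn) / range : 0.0;
    return out;
}
```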
      <p>In all the experiments we used, for comparison purposes, the same
parameter values, which are depicted in Table 1. For the case of
MEG modeling, 1024 data samples (corresponding to a four-second
epoch of the MEG) were used as the training set, while 512 data
samples (corresponding to a two-second epoch of the MEG) were used
for testing. For the case of MCG modeling, 1024 data samples were
used as the training set and 1024 data samples as the test set.
The algorithm was left to run for 1000 generations.</p>
      <p>In order to evaluate the forecasting capability of the produced
networks we used three well-known error measures: the Normalized
Root Mean Squared Error (NRMSE), the Correlation Coefficient
(CC) and the Mean Relative Error (MRE). The performance of the
hybrid algorithm for the case of MEG modeling is depicted in
Tables 2 and 3 and in Figure 4, while for the case of MCG in Tables 4
and 5 and in Figure 5.</p>
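<p>The text does not give the exact formulas used for the three error measures, so the C++ sketch below uses their usual textbook definitions: NRMSE normalized by the variance of the desired signal, CC as the Pearson correlation coefficient, and MRE as the mean absolute relative error:</p>

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalized Root Mean Squared Error: prediction error scaled by the
// spread of the desired signal (0 = perfect, 1 = as bad as the mean).
double nrmse(const std::vector<double>& d, const std::vector<double>& y) {
    double md = 0.0;
    for (double v : d) md += v;
    md /= d.size();
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < d.size(); ++i) {
        num += (d[i] - y[i]) * (d[i] - y[i]);
        den += (d[i] - md) * (d[i] - md);
    }
    return std::sqrt(num / den);
}

// Pearson correlation coefficient between desired and predicted series.
double correlation(const std::vector<double>& d, const std::vector<double>& y) {
    const std::size_t n = d.size();
    double md = 0.0, my = 0.0;
    for (std::size_t i = 0; i < n; ++i) { md += d[i]; my += y[i]; }
    md /= n; my /= n;
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sxy += (d[i] - md) * (y[i] - my);
        sxx += (d[i] - md) * (d[i] - md);
        syy += (y[i] - my) * (y[i] - my);
    }
    return sxy / std::sqrt(sxx * syy);
}

// Mean Relative Error: mean absolute error relative to the desired value.
double mre(const std::vector<double>& d, const std::vector<double>& y) {
    double sum = 0.0;
    for (std::size_t i = 0; i < d.size(); ++i)
        sum += std::fabs(d[i] - y[i]) / std::fabs(d[i]);
    return sum / d.size();
}
```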
      <p>[Figure: Actual (-) vs. Predicted (:) signal values.]</p>
    </sec>
    <sec id="sec-6">
      <title>4 CONCLUSIONS</title>
      <p>In the current paper a hybrid, biologically inspired
evolutionary algorithm was presented that combines a genetic
programming technique with a training method based on the Multiple
Extended Kalman Algorithm. This hybrid algorithm is implemented in
C++ as a software system with a graphical user interface.</p>
      <p>The main novelties of the proposed hybrid algorithm are the
combination of a genetic programming technique with MEKA, the
use of a fitness function that combines performance with network
size, the ability to evolve not only the structure of the hidden layers
but the number of inputs as well, and the large number of different
genetic operators, especially mutation operators, that have been
implemented. Another novelty is the representation used for neural
networks. As said before, every network in the population is
represented as a linked list of layers and neurons, using the direct
encoding scheme. The use of linked lists has certain advantages
that have to do mainly with memory management: only the memory
that is actually needed at any time is used, and there is no need to
allocate memory for a maximum network size, as in other
representation schemes. Moreover, linked lists are dynamic data
structures, which means that the neural network architecture can
change dynamically at run time, in contrast with other data
structures, such as matrices, which in C++ cannot change size during
run time.</p>
      <p>This hybrid algorithm was used for the modeling of two
biological time series, namely the Magneto Encephalogram (MEG)
recordings of epileptic patients and Magneto Cardiogram (MCG) of
normal subjects. All the reported cases refer to predictions on
recordings of the dynamics of nonlinear systems. In all the
performed experiments the algorithm was able to find a near
optimum network architecture that gave small prediction errors.
Therefore we can conclude that the algorithm is able to produce
small and compact networks that learn and generalize well.</p>
      <p>The algorithm has only been tested on time series prediction
problems; it is our intention to test it on some difficult
classification problems as well.</p>
      <p>One of the main drawbacks of this kind of algorithm, namely
evolutionary algorithms, hybrid or not, is that they are
computationally expensive in terms of computer memory and CPU
time. Even though the proposed algorithm belongs to this category,
two choices make it less computationally expensive than other
approaches to the same problem of neural network evolution: MEKA
is used for just a couple of epochs in the training phase of the
neural networks, and each member of the population is represented
directly as a linked-list network, so there is no need for encoding
and decoding functions in the calculation of the network's
performance.</p>
      <p>The algorithm could be further improved by adding some more
genetic operators for better and faster local search both to the
architecture and weight space and this is going to be one of our
future research targets. Furthermore, in the integrated software
system a large number of genetic operators is already implemented,
whose influence on the performance of the hybrid
algorithm needs to be appraised; we need to determine which of the
three selection schemes, or which of the many mutation operators, give better
results. Another future research direction will be the combination
of MEKA with other evolutionary techniques like Particle Swarm
Optimization and Differential Evolution for neural network
evolution.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andreou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ioannou</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , “
          <article-title>Currency Forecasting Using Recurrently RBF Networks Optimized by Genetic Algorithms”</article-title>
          ,
          <source>Computational Finance 1997 (CF'97)</source>
          , London Business School,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anninos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , “
          <article-title>On the Predictability of MEG of Epileptic Patients Using RBF Networks Evolved with Genetic Algorithms”</article-title>
          ,
          <source>BIOSIGNAL'98</source>
          , Brno, Czech Republic, June 23-25,
          <year>1998a</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manioudakis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , “
          <article-title>An Evolutionary Method for System Structure Identification Using Neural Networks</article-title>
          ”,
          <source>Neural Computation '98</source>
          ,
          <year>1998b</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Georgopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Likothanassis</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Anninos</surname>
          </string-name>
          , “
          <article-title>Forecasting the MagnetoEncephaloGram (MEG) of Epileptic Patient Using Genetically Optimized Neural Networks”</article-title>
          ,
          <source>Genetic and Evolutionary Computation Conference (GECCO'99)</source>
          , Orlando, Florida, USA, July 14-17,
          <year>1999</year>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Andreou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Polidoropoulos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , “
          <article-title>Is the Greek foreign exchange-rate market predictable? A comparative study using chaotic dynamics and neural networks”</article-title>
          ,
          <source>Proceedings of the Fourth International Conference on Forecasting Financial Markets</source>
          , Banque Nationale de Paris and Imperial College, London,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Andreou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zombanakis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , “
          <article-title>Testing Currency Predictability Using An Evolutionary Neural Network Model”</article-title>
          ,
          <source>Proceedings of the fifth International Conference on Forecasting Financial Markets</source>
          , Banque Nationale de Paris and Imperial College, London,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Andreou</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , “
          <article-title>Exchange Rates Forecasting: A Hybrid Algorithm Based On Genetically Optimized Adaptive Neural Networks”</article-title>
          ,
          <source>Computational Economics</source>
          , Kluwer Academic Publishers, vol.
          <volume>20</volume>
          , issue
          <issue>3</issue>
          , pp.
          <fpage>191</fpage>
          -
          <lpage>210</lpage>
          , December
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Billings</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>G. L.</given-names>
          </string-name>
          <article-title>Radial basis function network configuration using genetic algorithms</article-title>
          .
          <source>Neural Networks</source>
          , Vol.
          <volume>8</volume>
          , pp.
          <fpage>877</fpage>
          -
          <lpage>890</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Bornholdt</surname>
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Graudenz</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>General asymmetric neural networks and structure design by genetic algorithms</article-title>
          .
          <source>Neural Networks</source>
          , Vol.
          <volume>5</volume>
          , pp.
          <fpage>327</fpage>
          -
          <lpage>334</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Mapping classifier systems into neural networks</article-title>
          .
          <source>Proceedings of the 1988 Conference on Neural Information Processing Systems</source>
          , Morgan Kaufmann,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Georgopoulos</surname>
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Likothanassis</surname>
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Adamopoulos</surname>
            <given-names>A.</given-names>
          </string-name>
          , “
          <article-title>Evolving Artificial Neural Networks Using Genetic Algorithms”</article-title>
          ,
          <source>Neural Network World</source>
          , 4/00
          , pp.
          <fpage>565</fpage>
          -
          <lpage>574</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>Genetic Algorithms in Search Optimization &amp; Machine Learning</article-title>
          , Addison-Wesley
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Happel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , et al.
          <article-title>Design and evolution of modular neural network architectures</article-title>
          .
          <source>Neural Networks</source>
          , Vol.
          <volume>7</volume>
          , pp.
          <fpage>985</fpage>
          -
          <lpage>1004</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Haykin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , “
          <source>Neural Networks - A Comprehensive Foundation</source>
          ”, Macmillan College Publishing Company, ch. 6
          , p.
          <fpage>213</fpage>
          , New York,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Holland</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <source>Adaptation in Natural and Artificial Systems</source>
          , MIT Press,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Koza</surname>
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <article-title>Genetic programming: on the programming of computers by means of natural selection</article-title>
          , MIT Press, Cambridge, MA,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Fotakis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>1997</year>
          ).
          <article-title>Optimizing the Structure of Neural Networks Using Evolution Techniques</article-title>
          .
          <source>5th International Conference on Applications of High - Performance Computers in Engineering</source>
          , pp.
          <fpage>157</fpage>
          -
          <lpage>168</lpage>
          , Santiago de Compostela, Spain, July.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manioudakis</surname>
            ,
            <given-names>G. D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          , “
          <article-title>Currency Forecasting Using Genetically Optimized Neural Networks”</article-title>
          ,
          <source>HERCMA</source>
          , Athens, September
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Likothanassis</surname>
            ,
            <given-names>S. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgopoulos</surname>
            ,
            <given-names>E. F.</given-names>
          </string-name>
          “
          <article-title>A Novel Method for the Evolution of Neural Networks”</article-title>
          ,
          <source>3rd IMACS/IEEE International Conference on Circuits Systems and Computers (CSC'99)</source>
          ,
          <year>July 1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Michalewicz</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <source>“Genetic Algorithms + Data Structures = Evolution Programs”</source>
          , Springer-Verlag,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , et al.
          <article-title>Designing neural networks using genetic algorithms</article-title>
          .
          <source>Proceedings of the 3rd International Conference on Genetic Algorithms</source>
          , Morgan Kaufmann,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Mitchell</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>An Introduction to Genetic Algorithms</article-title>
          , MIT Press,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Montana</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Training feedforward neural networks using genetic algorithms</article-title>
          .
          <source>BBN Systems and Technologies</source>
          , Cambridge, MA,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmieri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Datum</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , “
          <article-title>Optimal Filtering Algorithms for Fast Learning in Feed-Forward Neural Networks”</article-title>
          ,
          <source>Neural Networks</source>
          , Vol.
          <volume>5</volume>
          , pp.
          <fpage>779</fpage>
          -
          <lpage>787</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Whitley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>Applying genetic algorithms to neural network problems</article-title>
          ,
          <source>International Neural Networks Society</source>
          , p.
          <fpage>230</fpage>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Whitley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Bogart</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>The evolution of connectivity: Pruning neural networks using genetic algorithms</article-title>
          .
          <source>International Joint Conference on Neural Networks</source>
          , Washington D.C., 1. Hillsdale, NJ: Lawrence Erlbaum, pp.
          <fpage>134</fpage>
          -
          <lpage>137</lpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Whitley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Hanson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>Optimizing neural networks using faster, more accurate genetic search</article-title>
          .
          <source>3rd Intern. Conference on Genetic Algorithms</source>
          , Washington D.C., Morgan Kaufmann, pp.
          <fpage>391</fpage>
          -
          <lpage>396</lpage>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <article-title>A New Evolutionary System for Evolving Artificial Neural Networks</article-title>
          ,
          <source>IEEE Transactions on Neural Networks</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>3</issue>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          “
          <article-title>Evolving Artificial Neural Networks”</article-title>
          ,
          <source>Proceedings of the IEEE</source>
          ,
          <volume>87</volume>
          (
          <issue>9</issue>
          ):
          <fpage>1423</fpage>
          -
          <lpage>1447</lpage>
          ,
          <year>September 1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Anninos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Jacobson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Tsagas</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          “
          <article-title>Spatiotemporal Stationarity of Epileptic Focal Activity Evaluated by Analyzing Magneto Encephalographic (MEG) data and the Theoretical Implications”</article-title>
          .
          <source>Panminerva Med</source>
          .
          <volume>39</volume>
          ,
          <fpage>189</fpage>
          -
          <lpage>201</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Graning</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sendhoff</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <article-title>Generalization Improvement in Multi-Objective Learning</article-title>
          ,
          <source>International Joint Conference on Neural Networks (IJCNN '06)</source>
          , pp.
          <fpage>4839</fpage>
          -
          <lpage>4846</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Vieira</surname>
            ,
            <given-names>D. A. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vasconcelos</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Caminhas</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <article-title>Controlling the parallel layer perceptron complexity using a multiobjective learning algorithm</article-title>
          ,
          <source>Neural Computing and Applications</source>
          , vol.
          <volume>16</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>317</fpage>
          -
          <lpage>325</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>