<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Learning Central Pattern Generator Network with Back-Propagation Algorithm</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rudolf J. Szadkowski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petr Čížek</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Faigl</string-name>
          <email>faiglj@fel.cvut.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Czech Technical University in Prague</institution>
          ,
          <addr-line>Technicka 2, 16627 Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>2203</volume>
      <fpage>116</fpage>
      <lpage>123</lpage>
      <abstract>
        <p>An adaptable central pattern generator (CPG) that directly controls the rhythmic motion of a multi-legged robot must combine plasticity with sustainable periodicity. This combination requires an algorithm that searches the parametric space of the CPG and yields a non-stationary and non-divergent solution. We model the CPG with the pioneering Matsuoka's neural oscillator, which is (mostly) non-divergent and provides constraints ensuring non-stationarity. We embed these constraints into the CPG formulation, which we further implement as a layer of an artificial neural network. This makes the CPG learnable by the back-propagation algorithm while sustaining the desirable properties. Moreover, the proposed CPG can be integrated into more complex networks and trained under different optimization objectives. In addition to the theoretical properties of the developed system, its flexibility is demonstrated by the successful learning of the tripod motion gait and its practical deployment on a real hexapod walking robot.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The movement of legged robots relies on the synchronized
control of all of their joints. Since the joints are part of
the same body, the velocity of each joint depends on
the positions of all the robot's joints. The problem of
generating such synchronized control signals gets harder with
an increasing number of legs (or joints per leg).
A widely used generator of such signals is a system of
interconnected Central Pattern Generators (CPGs), which
can be described as two or more coupled oscillators.
CPGs appear in many vertebrates and
insects, where they are responsible for controlling rhythmic
motions such as swimming, walking, or respiration [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
They also appear in biologically inspired robotics, where
CPGs are used for the locomotion control of legged robots [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>A CPG network can be modeled as a non-linear
dynamic system with coupled variables. Such a system must be
parameterized so that it contains a stable limit cycle, but finding such a
parametrization is difficult because an analytical description of a
high-dimensional non-linear dynamic system is hard or
impossible to obtain. Moreover, even a small change in the
parameters can cause a sudden change in the system's
qualitative properties, which can range from chaotic to stationary,
with the desired periodic behavior lying somewhere in between.</p>
      <p>
        Parameters of CPG networks can be found
experimentally (i.e., tuned manually or automatically by
evolutionary algorithms [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) or they can be heuristically
designed. Such design-dependent methods make CPG
networks difficult to scale to other robotic bodies or to
adapt to locomotion control in different environments.
The scaling problem can be partially bypassed by
precomputing a trajectory for each foot tip and employing
inverse kinematics to determine the control signals for the
particular leg's joints [
        <xref ref-type="bibr" rid="ref5">5, 6</xref>
        ]. However, the inverse
kinematics depends on the robot's body and on the identification of
parameters that have to be manually fine-tuned to
ensure proper behavior.
      </p>
      <p>The motivation for the presented approach is to develop
fully automatic CPG learning. This paper explores the
possibility of learning a CPG network modeled by
Matsuoka's neural oscillators [7] with the back-propagation
algorithm (BP). To boost the BP algorithm that learns the
desired locomotion control for our multi-legged walking
robot, we propose two methods for pruning the parameter
space of the CPG network.</p>
      <p>The particular contributions presented in this paper are
the following.</p>
      <p>• A normalization layer that prunes the parameter
space of parametrizations with stable stationary
solutions.
• An inductive learning method that exploits the
structure of the robot's body and further reduces the searched
parametric space.
• An experimental evaluation of the proposed learning
using a real hexapod walking robot, for which the
CPG network learned by the designed
algorithm exhibits successful locomotion control
following the tripod gait, with the CPG network
directly producing the control signal for each of the 18
actuators of the robot.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        Different biomimetic approaches to producing rhythmic patterns, including CPGs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
Recurrent Neural Networks [8], and Self-Adjusting Ring
Modules [9], have been studied
and deployed for the locomotion control of robots [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in recent
years. These approaches differ mainly in the complexity of
the underlying model and span different levels of
abstraction, ranging from biomechanical models [10] simulating
membrane potentials and ion flows inside neurons down
to a model of two coupled neurons in mutual
inhibition [11]. Amongst them, CPGs based on Matsuoka's
neural oscillator [7] are the prevalent model.
Further details on Matsuoka's model are given in Section 3,
as we build on its properties [7, 12, 13] in our work.
      </p>
      <p>
        Deployment of CPG oscillators on legged robots is
also particularly difficult because of the different kinematics
and dynamics of each robot. Different amounts of
post-processing are used to translate the CPG outputs to joint
coordinates. Namely, approaches using inverse
kinematics [
        <xref ref-type="bibr" rid="ref5">5, 6</xref>
        ] suffer from the necessary hand fine-tuning of both
the CPG parameters and the kinematics. Besides,
existing approaches use a separate neural network as
a motor control unit [11] or use the CPG outputs directly as joint
angles [14]. Furthermore, CPGs can seamlessly switch
between different output patterns, and thus different gaits [15],
which further supports direct joint control. In our
work, we use a dedicated output layer to shape the
outputs of the CPGs, as we assume simple transformations of the
output signal are easier to learn by changing the parameters of
the output layer, while the gait change remains in charge of the
CPG.
      </p>
      <p>
        A parametrization of the oscillator can be found
experimentally, e.g., using evolutionary algorithms with a fitness
function minimizing energy consumption [11] or
maximizing velocity [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], or using parameter optimization [16].
Besides, a modified back-propagation algorithm has been
used on an adaptive neural oscillator in [17] to imitate an
external periodic signal with its output signal, but it fails
to sustain oscillations for complex waveforms. Further
works on constraining the parameters of CPGs to maintain
stable oscillations have been published [7, 12, 13, 16];
however, to the best of our knowledge, we are the first to teach
a network of CPGs to perform a locomotion gait of a
hexapod walking robot using back-propagation. Furthermore,
we propose two methods to prune the space of possible
CPG parameters.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Central Pattern Generator Network</title>
      <p>The CPG network used in this paper is based on
Matsuoka's neural oscillator [7], a pair of symmetrically connected adaptive neurons,
the extensor and the flexor, that imitate the behavior of biological
neurons, where, after peaking, the neuron starts to
repolarize until its activation drops to the resting potential. The features
of Matsuoka's neurons have been extensively studied; hence, the
necessary conditions under which the neural network
enters a stable stationary state [7], the effects of a time-variant
tonic input [12], and approximations of the oscillator's
fundamental frequency and amplitude [13] are well documented
in the literature. The description of the particular CPG
model used in this work is as follows.</p>
      <p>The i-th CPG unit is governed by the equations

Tr du_i^e/dt = c_i^e − u_i^e − β v_i^e − w_fe g(u_i^f) − ∑_j w_ij g(u_j^e), (1)
Ta dv_i^e/dt = g(u_i^e) − v_i^e, (2)
Tr du_i^f/dt = c_i^f − u_i^f − β v_i^f − w_fe g(u_i^e) − ∑_j w_ij g(u_j^f), (3)
Ta dv_i^f/dt = g(u_i^f) − v_i^f, (4)

where the subscript i ∈ N denotes the particular CPG and
the superscript μ ∈ {e, f} distinguishes the extensor and
flexor neurons, respectively. The tuple of variables
u_i^e, v_i^e describes the dynamics of the extensor neuron: the
variable u_i^e represents the activation of the neuron, and v_i^e
represents its self-inhibitory input, which makes the neuron
adaptive. Similarly, u_i^f, v_i^f describe the dynamics of the
flexor neuron. The function g is the rectifier</p>
      <p>g(x) = max(0, x), (5)

which is the activation function that adds non-linearity to the
system. Each neuron (i, μ) inhibits itself through the
variable v_i^μ scaled by the parameter β &gt; 0. The extensor-flexor
pair (i.e., the CPG unit) mutually inhibits itself through
the symmetric connection with the weight w_fe &gt; 0.
Finally, the CPG units are inter-connected with symmetric
inhibiting connections w_ij ∈ W with w_ij ≥ 0 and w_ii = 0,
where W is a symmetric matrix. The only source of
excitation for this CPG network is the tonic input c_i^e, c_i^f (≥ 0),
which is given externally. In general, the tonic input may
be time-dependent and can be used to regulate the output
of the CPG network [12]. Tr and Ta (both &gt; 0) are the
reaction times of their respective variables. The structure of
the CPG unit is visualized in Fig. 1.</p>
      <p>All the equations (1), (2), (3), and (4) are differentiable
except in the cases when u_i^μ = 0, since the rectifier is used as
the activation function. However, we assume this will not
cause any problems, because the same rectifier is used inside the
Rectified Linear Units (ReLU) that are widely used in
deep neural networks.</p>
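      <p>To make the dynamics (1)-(4) concrete, the following sketch integrates a single extensor-flexor pair with the forward Euler method. It is a minimal illustration, not code from the paper; the parameter values are assumed, chosen only to satisfy the oscillation condition w_fe &gt; 1 + Tr/Ta.</p>
      <preformat>
```python
import numpy as np

def g(x):
    # Rectifier (5), the activation function of the CPG neurons.
    return np.maximum(0.0, x)

def simulate_cpg_unit(Tr=0.25, Ta=0.5, beta=2.5, w_fe=2.5, c=1.0,
                      dt=0.002, steps=10000):
    """Forward-Euler integration of one Matsuoka extensor-flexor pair.

    The parameters are assumed example values; they satisfy the
    oscillation condition w_fe > 1 + Tr/Ta (here 2.5 > 1.5).
    """
    u_e, v_e, u_f, v_f = 0.1, 0.0, -0.1, 0.0  # asymmetric start breaks symmetry
    ys = np.empty(steps)
    for k in range(steps):
        # Activation dynamics (1), (3): tonic input c, self-inhibition beta*v,
        # and mutual inhibition w_fe between extensor and flexor.
        du_e = (c - u_e - beta * v_e - w_fe * g(u_f)) / Tr
        du_f = (c - u_f - beta * v_f - w_fe * g(u_e)) / Tr
        # Adaptation dynamics (2), (4).
        dv_e = (g(u_e) - v_e) / Ta
        dv_f = (g(u_f) - v_f) / Ta
        u_e += dt * du_e
        u_f += dt * du_f
        v_e += dt * dv_e
        v_f += dt * dv_f
        ys[k] = g(u_e) - g(u_f)  # one common choice of unit output
    return ys

y = simulate_cpg_unit()
```
</preformat>
      <p>With these assumed parameters, the pair settles into sustained antiphase oscillations after a short transient.</p>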
      <p>Note that, except for the tonic inputs c_i^e, c_i^f, only
inhibiting connections are used, because such a system is less prone
to becoming chaotic or divergent [13].
In this work, we consider the self-inhibitory inputs v^e, v^f
as hidden variables; we do not work with them outside of
the CPG network. The output layer combines the
activation variables u^e, u^f with the affine transformation

y = W_out u + b_out, (6)

where u = (u^e, u^f) and W_out ∈ R^{N×2N}, b_out ∈ R^{N×1} are the
learnable parameters. The connection of the CPG network
and the output layer is illustrated in Fig. 2.</p>
      <p>The main advantage of having W_out and b_out as learnable
parameters is that the BP algorithm can scale and
translate the limit cycle formed by the CPG network. Here, we
assume that these transformations are easier to learn by
changing the parameters of the output layer than by
changing the parameters of the CPG network, because a change
of any parameter of the CPG network can generally cause
a non-linear change in the amplitude, frequency, and shift
of the generated signals [6]. Another advantage of the
proposed output layer is that it can develop complex signals,
as it can combine the outputs of different CPGs.</p>
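      <p>The affine output layer (6) is a single matrix-vector product; the sketch below is an illustrative assumption (N = 3 units and random parameter values stand in for learned ones):</p>
      <preformat>
```python
import numpy as np

rng = np.random.default_rng(0)
N = 3  # number of CPG units (illustrative choice)

# Activations of the extensor and flexor neurons, u = (u^e, u^f).
u = np.concatenate([rng.uniform(0, 1, N), rng.uniform(0, 1, N)])  # shape (2N,)

# Learnable output-layer parameters: W_out in R^{N x 2N}, b_out in R^{N}.
W_out = rng.normal(size=(N, 2 * N))
b_out = rng.normal(size=N)

# Output layer (6): y = W_out u + b_out, one signal per joint.
y = W_out @ u + b_out
assert y.shape == (N,)
```
</preformat>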
    </sec>
    <sec id="sec-4">
      <title>4 Proposed Locomotion Control Learning</title>
      <p>In this section, we propose the normalization layer and
the inductive learning method adapted to learning a CPG
network for a hexapod walking robot, see Fig. 3a. Each leg
of the robot has three joints, called coxa, femur, and tibia
(see Fig. 3b), for which an appropriate control signal has
to be generated to control the locomotion of the robot. In
total, the robot has 18 controllable joints, and
depending on the control signals, the robot can move with various
motion gaits [18], e.g., tripod, quadruped, wave, and
pentapod. During locomotion, each leg is either in a swing
phase, to reach a new foothold, or in a stance phase, in
which it supports the body. The motion gait prescribes the
order in which the swing and stance phases alternate for the
individual legs; hence, all the legs must work in
coordination to achieve the desired behavior. The
hexapod walking robot is thus used for benchmarking the
proposed learning method, where the CPG network has to
learn to generate control signals that realize the
locomotion control of the robot with the tripod motion gait.
The proposed normalization layer is based on early
experiments with randomly parametrized CPG networks, which
in most cases either end up oscillating or converge to a static
behavior. The static behavior is caused by stable fixed
points that may appear in the corresponding dynamic
system. Therefore, we propose to employ a sufficient
condition for the CPG network to be free of stable fixed points.</p>
      <sec id="sec-4-1">
        <title>4.1 Normalization Layer</title>
        <p>Condition. For a CPG network of N units, if all the values
of the tonic input c_i^μ, where i ∈ N and μ ∈ {e, f}, are from
the range [c_min, c_max] and

w_fe &lt; (c_min/c_max)(1 + β) − max_{i∈N} (∑_j w_ij), (7)
w_fe &gt; 1 + Tr/Ta, (8)

then the CPG network has no stable fixed point.</p>
        <p>Proof. First, we state an adapted theorem from [7].</p>
        <p>Theorem. Assume that for some i and k (i ≠ k)

c_i(1 + β) − ∑_j^{2N} a_ij c_j &gt; 0, (9)
c_k(1 + β) − ∑_j^{2N} a_kj c_j &gt; 0, (10)
a_ik &gt; 1 + Tr/Ta, (11)

then the CPG network has no stable fixed point. The term
{a_ij} = A is the 2N×2N matrix of the form

A = [ W, w_fe I; w_fe I, W ], (12)

and c = (c^e, c^f), where I is the identity matrix of the same
dimensions as W.</p>
        <p>Since the CPGs should act as independent units, it is
intuitive that each extensor-flexor neuron pair (a CPG) is
able to oscillate on its own. Thus, a weaker form of the
theorem is used, where the following conditions must hold
for each i-th CPG:

(c_i^e/c_i^f)(1 + β) − (1/c_i^f) ∑_j w_ij c_j^e &gt; w_fe, (13)
(c_i^f/c_i^e)(1 + β) − (1/c_i^e) ∑_j w_ij c_j^f &gt; w_fe, (14)
w_fe &gt; 1 + Tr/Ta. (15)</p>
        <p>Now, we can focus on the effect of the tonic input c. For
any parametrization W, β, Tr, Ta, w_fe, we can find a vector
c that breaks these conditions. Let us relax the
problem by clipping the values of c into the range [c_min, c_max],
where c_min &gt; 0. Then, the system of conditions must be made
independent of the mutable vector c. This
can be done by substituting c with the c_i^− that minimizes
the left side of (13) or (14) for the i-th CPG.
W.l.o.g., we consider finding c_i^− just for (13), as

c_i^− = argmin_{c ∈ [c_min, c_max]^{2N}} (c_i^e/c_i^f)(1 + β) − (1/c_i^f) ∑_j w_ij c_j^e. (16)

Since all the parameters are positive and w_ii = 0, the min
argument in (16) decreases monotonically with decreasing
c_i^e and increasing c_j^e values. Thus, we can substitute these
variables with their respective extremes,

c_i^e = c_min, c_j^e = c_max, (18)

which leaves just c_i^f = c as the variable to minimize:

F(c) = (c_min/c)(1 + β) − (c_max/c) ∑_j w_ij, (17)
c_i^0 = argmin_{c_i^f ∈ [c_min, c_max]} F(c_i^f). (19)

Notice that now we are searching for a scalar value c_i^0 that
minimizes the given expression.</p>
        <p>The equation dF(c)/dc = 0 has a solution only if F has such
parameters β, W, c_min, and c_max that make the function F
constant. Since it is unlikely that such a parametrization
will emerge during the learning, we consider that F does not
have any local extremes in the range [c_min, c_max].
Therefore, the minimization (19) can be simplified to

c_i^0 = argmin{F(c_min), F(c_max)}. (20)</p>
        <p>The condition (13) implies F &gt; 0, because w_fe must be
greater than zero, and the following condition must hold
too:

1 + β &gt; (c_max/c_min) ∑_j w_ij. (21)

Now, we define a variable ε &gt; 0 such that

1 + β = (c_max/c_min) ∑_j w_ij + ε, (22)

and substitute the right side of (22) into F(c_min) and
F(c_max):

F(c_max) = (c_min/c_max) ε, (23)
F(c_min) = ε. (24)

Since c_min/c_max ∈ (0, 1] and ε &gt; 0, the expression F(c_max)
always minimizes (20). Therefore,

c_i^0 = c_max. (25)

After substituting c_i^0 into (17) and then c_i^− into (13), we
get

(c_min/c_max)(1 + β) − ∑_j w_ij &gt; w_fe. (26)

Finally, to make this condition independent of the i-th
CPG, we can choose the inequality (26) that has the
largest value of the ∑_j w_ij expression:

w_fe &lt; (c_min/c_max)(1 + β) − max_{i∈N} (∑_j w_ij). (27)

Combining (15) and (27), we get the desired (8) and (7).</p>
        <p>We integrate the conditions (7) and (8) into the BP
framework by redefining the variables w_fe and β as the
functions

w_fe(ŵ_fe, Tr, Ta) = 1 + Tr/Ta + exp(ŵ_fe), (28)
β(β̂, w_fe, w*) = (w_fe + w*) c_max/c_min + exp(β̂) − 1, (29)

where ŵ_fe, β̂ ∈ R are new independent parameters and w*
is defined as

w* = max_{i∈N} (∑_j w_ij). (30)

Then, the max operator is approximated by the
differentiable smoothmax defined via

softmax(x) = exp(x) / ∑ exp(x). (31)

Since all the parameters must be positive, the other parameters
are defined as exponents of underlying parameters:

Ta = exp(T̂a), Tr = exp(T̂r), w_ij = exp(ŵ_ij), i ≠ j, (33)

where T̂a, T̂r, ŵ_ij ∈ R. The weights w_ij, i ≠ j cannot reach
zero during learning, but they can approach it.</p>
        <p>The BP algorithm learns the proposed new
parameters T̂a, T̂r, ŵ_ij, ŵ_fe, and β̂, which are then normalized
by (28), (29), and (33).</p>
      </sec>
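      <p>The reparametrization (28)-(30) and (33) can be checked numerically; the sketch below (with illustrative input values, not taken from the paper) maps unconstrained parameters to constrained CPG parameters and verifies that conditions (7) and (8) then hold by construction:</p>
      <preformat>
```python
import math

def normalize(w_fe_hat, beta_hat, Tr_hat, Ta_hat, w_star, c_min, c_max):
    """Map unconstrained parameters to CPG parameters satisfying (7) and (8).

    w_star stands for max_i sum_j w_ij as in (30); here it is given directly.
    """
    Ta = math.exp(Ta_hat)                      # (33): positivity by construction
    Tr = math.exp(Tr_hat)                      # (33)
    w_fe = 1 + Tr / Ta + math.exp(w_fe_hat)    # (28): guarantees w_fe > 1 + Tr/Ta
    beta = (w_fe + w_star) * c_max / c_min + math.exp(beta_hat) - 1  # (29)
    return w_fe, beta, Tr, Ta

c_min, c_max, w_star = 1.0, 1.0, 0.5  # constant tonic input, as in the experiments
w_fe, beta, Tr, Ta = normalize(0.3, -0.2, -1.0, -0.5, w_star, c_min, c_max)

# Condition (8): w_fe exceeds 1 + Tr/Ta.
assert w_fe > 1 + Tr / Ta
# Condition (7): (c_min/c_max)(1 + beta) - w_star stays above w_fe.
assert (c_min / c_max) * (1 + beta) - w_star > w_fe
```
</preformat>
      <p>By design, substituting (29) into (7) leaves a strictly positive margin of (c_min/c_max) exp(β̂), so the check succeeds for any real-valued inputs.</p>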
      <sec id="sec-4-2">
        <title>4.2 Proposed Architecture and Inductive Learning</title>
        <p>We propose to divide the CPG network into smaller
sub-networks to reduce the searched parameter space. These
sub-networks are learned independently and then merged
into larger sub-networks until a single final network
remains. The proposed learning of the CPG network is
performed in three phases. First, we learn a single CPG
to generate a signal for one joint, which gives us the
shared parameters (w_fe, Ta, Tr, β). Then, six triplets of
CPGs are learned to generate the control signal for the
particular leg. Therefore, for each leg k ∈ [1, . . . , 6], we
get the parameters W^k and W_out^k, b_out^k. In the final phase,
we connect all six CPG sub-networks into one. We
choose to connect the CPG sub-networks only by the coxa-CPGs,
as it is assumed this is enough for each CPG
sub-network to synchronize. Therefore, for the subspace u^e =
(u^e_coxa,1, . . . , u^e_coxa,6, u^e_femur,1, . . . , u^e_tibia,6) (and similarly for
u^f), W ∈ R^{18×18} is organized as follows.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Coupling and Output Matrices</title>
        <p>The coupling matrix has the block structure

W = [ W_coxa,coxa, W_coxa,femur, W_coxa,tibia;
      W_femur,coxa, 0, W_femur,tibia;
      W_tibia,coxa, W_tibia,femur, 0 ],

where W_ij, i ≠ j, is the matrix of the connections between
the i-th and j-th joints, expressed as the diagonal matrix

W_ij = diag(w_ij^1, . . . , w_ij^6),

where the weights {w_ij} = W^k are taken from the matrices
parametrizing the previously learned CPG sub-networks.</p>
        <p>For the rearranged vector u = (u_1^e, u_1^f, . . . , u_6^e, u_6^f), the
term W_out ∈ R^{18×36} is composed of the matrices W_out^k of
the previously learned CPG networks that control the k-th
leg:

W_out = diag(W_out^1, . . . , W_out^6).</p>
      </sec>
      <sec id="sec-4-7">
        <title>Learning Objective and Frequency Regularization</title>
        <p>All the zeroes in the W and W_out matrices are unlearnable
constants imposing a structure onto the CPG network. The
objective function (34) fits the network output to the target
signal d(t) ∈ [0, 1]^{18} for each of the 18 robot's actuators at
time t.</p>
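      <p>A sketch of how the block matrices described above can be assembled (the per-leg weight values are random placeholders standing in for the pretrained sub-network parameters; the block layout follows the text):</p>
      <preformat>
```python
import numpy as np

rng = np.random.default_rng(1)
LEGS = 6
# One non-negative weight w_ij^k per leg k for each joint pair i != j
# (joints: 0 = coxa, 1 = femur, 2 = tibia); placeholder values.
w = {(i, j): rng.uniform(0.1, 0.5, LEGS)
     for i in range(3) for j in range(3) if i != j}

def block(i, j):
    # Inter-joint blocks W_ij = diag(w_ij^1, ..., w_ij^6); the shared key
    # (min, max) keeps W_ij = W_ji, so W stays symmetric.
    return np.diag(w[(min(i, j), max(i, j))])

W_coxa_coxa = np.full((LEGS, LEGS), 0.5)   # inter-leg coupling via coxa-CPGs
np.fill_diagonal(W_coxa_coxa, 0.0)         # w_ii = 0

zero = np.zeros((LEGS, LEGS))
W = np.block([
    [W_coxa_coxa, block(0, 1), block(0, 2)],
    [block(1, 0), zero,        block(1, 2)],
    [block(2, 0), block(2, 1), zero],
])  # 18 x 18, ordering (coxa_1..6, femur_1..6, tibia_1..6)

# W must be a symmetric, non-negative matrix with zero diagonal.
assert W.shape == (18, 18)
assert np.allclose(W, W.T)
assert np.all(np.diag(W) == 0)

# W_out in R^{18x36}: block-diagonal in the per-leg readouts W_out^k (3 x 6).
W_out = np.zeros((18, 36))
for k in range(LEGS):
    W_out[3 * k:3 * k + 3, 6 * k:6 * k + 6] = rng.normal(size=(3, 6))
```
</preformat>
      <p>The zero blocks remain unlearnable constants; only the non-zero entries would be exposed to the BP algorithm.</p>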
        <p>During early evaluations of the proposed learning, we
observed that in many cases the output signal has
undesired lower-frequency harmonics. This caused the output
signal to fit the target signal only for a couple of the first
periods. We propose to address this issue by an additional
term in the objective function (34),

+ ||r − ω||, (35)

where r ∈ R+ is a new hyperparameter and ω is an
approximation of the fundamental frequency of the CPG
oscillations that can be expressed as [13]

ω = (1/Ta) √( ((Tr + Ta)β − Tr w_fe) / (Tr w_fe) ). (36)

The hyperparameter r should be equal to the
fundamental frequency of the desired signal. However, since (36)
is just an approximation, it might lead to undesired local
minima. Therefore, we propose to switch off the
regularization once the term (35) drops below a predefined
threshold.</p>
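        <p>The frequency approximation (36) is a closed-form expression in the normalized parameters; a small sketch computing it (the parameter values are assumed for illustration):</p>
      <preformat>
```python
import math

def fundamental_frequency(Tr, Ta, beta, w_fe):
    """Approximate fundamental frequency (36) of the Matsuoka oscillations [13].

    The square-root argument is positive whenever (Tr + Ta) * beta
    exceeds Tr * w_fe, which holds for the example values below.
    """
    return math.sqrt(((Tr + Ta) * beta - Tr * w_fe) / (Tr * w_fe)) / Ta

# Assumed example parameters; the regularizer (35) would penalize |r - omega|.
omega = fundamental_frequency(Tr=0.25, Ta=0.5, beta=2.5, w_fe=2.5)
```
</preformat>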
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Experimental Evaluation</title>
      <p>
        The proposed learning method has been experimentally
verified using the rmsprop [19] algorithm, which is
commonly used to learn recurrent neural networks. Since the
following experiments are meant to benchmark and map
the problems of CPG network learning, we use a constant
tonic input c = 1; therefore, c_min = c_max = 1. The initial
state (u_init^e, v_init^e, u_init^f, v_init^f) is set to u_init^e = 0.1, u_init^f = −0.1,
and v_init^e = v_init^f = 0. The target signal is formed of eighteen
sequences of joint angles that were recorded over a course
of five tripod gait cycles. The hexapod robot was driven by
a default regular gait based on [20], which is suitable for
traversing flat terrains and uses inverse kinematics
to follow the prescribed triangular leg foot-tip
trajectory. This 4.7-second-long record of all joint signals is
sampled to 2350 equidistant data points, and each signal
is further normalized to the range [0, 1], smoothed using
a Gaussian convolution to filter out signal peaks, and finally
downsampled by a factor of 3.
      </p>
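      <p>The preprocessing of the recorded target signals can be sketched as follows. This is a reconstruction under stated assumptions: the smoothing width sigma is a guess, scipy's gaussian_filter1d stands in for the paper's Gaussian convolution, and the input here is a synthetic stand-in for the recorded joint angles.</p>
      <preformat>
```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def preprocess(signals, sigma=3.0, factor=3):
    """signals: (18, 2350) raw joint angles sampled over 4.7 s.

    Normalize each signal to [0, 1], smooth it with a Gaussian kernel
    to filter out peaks, and downsample by the given factor.
    """
    lo = signals.min(axis=1, keepdims=True)
    hi = signals.max(axis=1, keepdims=True)
    normed = (signals - lo) / (hi - lo)            # per-signal range [0, 1]
    smoothed = gaussian_filter1d(normed, sigma=sigma, axis=1)
    return smoothed[:, ::factor]                   # keep every 3rd sample

# Synthetic stand-in for the recorded 18-channel, 2350-sample signal.
raw = np.sin(np.linspace(0, 10 * np.pi, 2350))[None, :] * np.arange(1, 19)[:, None]
target = preprocess(raw)
assert target.shape == (18, 784)
```
</preformat>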
      <p>Preliminary experiments have shown that the process of
learning profoundly depends on the initial parameters, and in
some runs, the BP algorithm seems to get stuck in local
minima, from which the learning becomes very slow. This
observation is consistent with [17]. The performance of the
BP algorithm has been improved by adding the
regularization term (35). After that, the learning is performed in the
three following consecutive steps.</p>
      <p>First, each single CPG unit is learned to generate the
sinusoid sin(t/2), which has the same frequency as the
fundamental frequency of the desired control signal,
deterministically set to 3 Hz. The CPG is learned in
2000 epochs, each back-propagating a batch of 50
data points. Note that the number of needed epochs
depends on the initial random parametrization.</p>
      <p>Next, the parameters of the sinusoid generator are
retrained to generate the desired joint control signals. The
generator of each joint control is learned over 2000
epochs. We experimented with the stability of the learned
limit cycle of the first leg by perturbing it, see Fig. 4a.
Finally, the joint CPGs are connected as described in Section 4,
with the non-diagonal values of W_coxa,coxa initialized to 0.5,
and learned over 4000 epochs. We experimented with the
stability of this final CPG network, and the results are depicted
in Fig. 4b.</p>
      <p>A comparison of the desired control signal of the first
leg and the learned signal is depicted in Fig. 5. The learned
signal has a similar shape and the same frequency as the
original signal. The binding between the different leg triplets,
the most difficult part, is shown in Fig. 6. We can
see that the learned trajectory has a structure similar to the
desired limit cycles. The trajectory also stays within its
limit cycle; the trajectory was generated over six gait cycles
and therefore traveled the limit cycle multiple times.</p>
      <p>We deployed the resulting CPG locomotion controller
on the real hexapod (see Fig. 3a) and compared it with the
original controller [20] in 10 trials. The robot was
requested to crawl on a flat surface for 10 s and then stop.
The velocity of the robot was estimated using an
external visual localization system based on the tracking of a visual
marker [21] running at 25 Hz. Moreover, the robot's
stability was measured as the smoothness of the locomotion,
using an XSens MTi-30 inertial measurement unit (IMU)
attached to the robot trunk. The variances of the vertical
acceleration (Accz) and the orientation (pitch and roll angles)
of the robot's body are the selected indicators of
locomotion stability.</p>
      <p>The recorded robot trajectories visualized in Fig. 7 show
that there is a transition effect for our CPG locomotion
controller at the beginning of the trajectory, where the
CPG network starts to oscillate, which makes the robot's
initial acceleration lower; however, the overall locomotion is
smoother, as the velocity deviation is smaller.</p>
      <p>The quantitative results are listed in Table 1 as
the average values of the indicators. The results indicate that the
performance of the CPG locomotion controller is similar
to that of the implementation [20] based on inverse kinematics
(IKT).</p>
      <sec id="sec-5-2">
        <title>Discussion</title>
        <p>Table 1: Experimental results, reporting the average velocity, Accz variance, pitch variance, and roll variance for both controllers.</p>
        <p>During the experimental evaluation of the proposed
learning of the CPG network, a couple of good practices for
learning the sinusoid generator emerged.
1. It is better to learn the network in batches containing
at most two periods.
2. If the CPG network is restarted to the initial state, it
is good to ignore the transient states.
3. Since it is not important where the system
enters the limit cycle, it is suitable to phase-shift the
target signal so as to minimize its distance from the
output signal.</p>
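        <p>The third practice, aligning the target to the point where the system enters the limit cycle, can be sketched as a search over circular shifts (an illustrative reconstruction, not the paper's exact procedure):</p>
      <preformat>
```python
import numpy as np

def best_phase_shift(target, output):
    """Return the circular shift of `target` minimizing the L2 distance to `output`.

    Both signals are assumed to be one period long and equally sampled.
    """
    errors = [np.linalg.norm(np.roll(target, s) - output)
              for s in range(len(target))]
    return int(np.argmin(errors))

t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
target = np.sin(t)
output = np.sin(t - 2 * np.pi * 25 / 100)   # output lags the target by 25 samples
shift = best_phase_shift(target, output)
assert shift == 25
```
</preformat>
      <p>Training against the shifted target then penalizes only the waveform mismatch, not the irrelevant entry phase.</p>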
        <p>The combination of the sub-networks into one network has two
difficulties. The parameters (w_fe, Ta, Tr, β) must be the
same for the whole CPG network, but the sub-networks
are trained independently, so they can end up with
different parameters. In our case, the parameters are similar
because all the CPG sub-networks are based on one CPG
sub-network; thus, the BP algorithm is able to adjust them
during the learning of the complete network. Another
difficulty is the choice of the initial W_coxa,coxa weights. The
higher the weights, the stronger the coupling
between the legs. However, if the weight values are too high,
the constraint (7) would be violated. Therefore, we used
(7) to choose the initial W_coxa,coxa weights.</p>
        <p>Even though robustness is not an objective
of the learning algorithm, it is a property of the single
Matsuoka oscillator [22]. This property translated well into
our 3-unit CPG network (see Fig. 4a), where the network
can recover from perturbations. In the real world,
robustness helps to react quickly to simple temporal events, e.g.,
servo errors, or to feedback from the environment.</p>
        <p>In this work, we chose a simple model with c_min =
c_max = 1, i.e., a constant tonic input. A
time-variant tonic input, however, introduces dynamic changes,
as we can see in Fig. 8. In future work, we would
like to use the tonic input to dynamically control the output of the CPG
network.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6 Conclusion</title>
      <p>In this paper, we propose a new methodology for
learning a CPG network modeled by symmetrically connected
neural oscillators. The method is based on a combination
of the back-propagation learning algorithm, a normalization
layer, and a regularization term, where the normalization
layer prunes the parameter space of the CPG network
of the undesired non-periodic solutions and thus helps to
speed up the learning process. The advantage of the
proposed solution over the previous work on CPG-based
locomotion control is the scalability of the method, which
enables creating a CPG network that can directly
control each actuator without the need to employ
inverse kinematics. The proposed method has been
successfully deployed in the locomotion control of a real
hexapod walking robot.</p>
      <p>The main properties of the proposed methodology arise
from the idea that the proposed CPG network for
hexapod locomotion control is based on an architecture of
CPG connections that imitates the structure of the robot.
The CPG is inductively learned by learning its parts and
merging them. Therefore, the proposed method promises
to be easily extendable to other multi-legged robot
bodies. Furthermore, since the proposed CPG network
is learnable by the back-propagation algorithm, it can be
integrated into more complex neural networks supporting
back-propagation, which is a subject of our future work.</p>
      <p>Acknowledgments – This work was supported by the
Czech Science Foundation (GAČR) under research project
No. 18-18858S. The support of the Grant Agency of the
CTU in Prague under grant No. SGS16/235/OHK3/3T/13
to Rudolf Szadkowski is also gratefully acknowledged.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Marder</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Bucher</surname>
          </string-name>
          , “
          <article-title>Central pattern generators and the control of rhythmic movements</article-title>
          ,
          <source>” Current Biology</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>23</issue>
          , pp.
          <fpage>R986</fpage>
          -
          <lpage>R996</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Marder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Schulz</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Taylor</surname>
          </string-name>
          , “
          <article-title>Invertebrate central pattern generation moves along</article-title>
          ,
          <source>” Current Biology</source>
          , vol.
          <volume>15</volume>
          , no.
          <issue>17</issue>
          , pp.
          <fpage>R685</fpage>
          -
          <lpage>R699</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Ijspeert</surname>
          </string-name>
          , “
          <article-title>Central pattern generators for locomotion control in animals and robots: A review</article-title>
          ,
          <source>” Neural Networks</source>
          , vol.
          <volume>21</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>642</fpage>
          -
          <lpage>653</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Beer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Chiel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Gallagher</surname>
          </string-name>
          , “
          <article-title>Evolution and analysis of model CPGs for walking: II. General principles and individual variability</article-title>
          ,
          <source>” Journal of Computational Neuroscience</source>
          , vol.
          <volume>7</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>119</fpage>
          -
          <lpage>147</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Deng</surname>
          </string-name>
          , and G. Liu, “
          <article-title>Gait Generation With Smooth Transition Using CPG-Based Locomotion Control for Hexapod Walking Robot</article-title>
          ,
          <source>” IEEE Transactions on Industrial Electronics</source>
          , vol.
          <volume>63</volume>
          , no.
          <issue>9</issue>
          , pp.
          <fpage>5488</fpage>
          -
          <lpage>5500</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>