=Paper= {{Paper |id=Vol-1419/section0009 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-1419/section0009.pdf |volume=Vol-1419 }} ==None== https://ceur-ws.org/Vol-1419/section0009.pdf
                         Brain-Supported Learning Algorithms for Robots

                                                   Chairperson
                                  Florian Röhrbein (florian.roehrbein@in.tum.de)
                              Department of Informatics VI, Technische Universität München
                                      Boltzmannstr. 3, 85748 Garching, Germany


                                                        Discussants

                                        Cecilia Laschi (cecilia.laschi@sssup.it)
                                   The BioRobotics Institute, Scuola Superiore Sant'Anna
                                    Viale Rinaldo Piaggio, 34, 56026 Pontedera (PI), Italy

                                       Florian Walter (florian.walter@tum.de)
                              Department of Informatics VI, Technische Universität München
                                      Boltzmannstr. 3, 85748 Garching, Germany


                                                          Speakers

                                          Sander Bohte (S.M.Bohte@cwi.nl)
                                         CWI - Centrum Wiskunde & Informatica
                                    Science Park 123, 1098 XG Amsterdam, Netherlands

                                          Egidio Falotico (e.falotico@sssup.it)
                                   The BioRobotics Institute, Scuola Superiore Sant'Anna
                                    Viale Rinaldo Piaggio, 34, 56026 Pontedera (PI), Italy

                                          Silvia Tolu (silvia.tolu@gmail.com)
                          Technical University of Denmark, Department of Electrical Engineering
                                  Elektrovej Building 326, 2800 Kgs. Lyngby, Denmark

                                        Stefan Ulbrich (Stefan.Ulbrich@fzi.de)
                                           FZI - Forschungszentrum Informatik
                                    Haid-und-Neu-Str. 10-14, 76131 Karlsruhe, Germany


Roboticists recognized early on the high potential of neurobiological control structures for robotic applications. However, limited processing power and the lack of appropriate models and tools shifted the focus of research far away from biological neural networks. Today, combined efforts in neuroscience, computer science and many other fields, within interdisciplinary research projects like the Human Brain Project, enable the simulation of spiking biological neural networks with millions of neurons. The computational power of these networks makes them a very promising tool for the development of brain-controlled neurorobots. Major challenges towards this goal include finding a meaningful mapping between tasks and neural structures and making the simulated brain exhibit the desired behavior. The large size and complexity of biological neural networks make the development of learning algorithms a huge challenge. A main prerequisite for implementing learning algorithms for brain-controlled robots is the availability of appropriate tools such as the SpiNNaker board, which is able to simulate the neural network in real time. On the software side, these tools should ease development and support the researcher during evaluation by offering a toolchain for the implementation and simulation of new algorithms (i.e. simulators able to connect and synchronize simulated brain-supported algorithms and simulated robotic platforms).

This symposium presents current brain-supported learning techniques for robots as well as support tools for the implementation of these algorithms. In particular, the papers in this symposium provide evidence of the advantages of the proposed brain-supported learning solutions and of the effectiveness of tools for the evaluation and implementation of these algorithms.



Continuous-time neural reinforcement learning of working memory tasks

Sander Bohte

As living organisms, one of our primary characteristics is the ability to rapidly process and react to unknown and unexpected events. To this end, we are able to recognize an event or a sequence of events and learn to respond properly. Despite advances in machine learning, current cognitive robotic systems are not able to respond rapidly and efficiently in the real world: the challenge is to learn to recognize both what is important and when to act. Reinforcement Learning (RL) is typically used to solve complex tasks: to learn the how. To respond quickly - to learn when - the environment has to be sampled often enough. For "enough", a programmer has to decide on the step size as a time representation, choosing between a fine-grained representation of time (many state transitions; difficult to learn with RL) and a coarse temporal resolution (easier to learn with RL but lacking precise timing). Here, we derive a continuous-time version of on-policy SARSA learning in a working-memory neural network model, AuGMEnT. Using a neural working memory network resolves the “what” problem; our “when” solution is built on the notion that in the real world, instantaneous actions of a certain duration are actually impossible. We demonstrate how we can decouple action duration from the internal time steps in the neural RL model using an action selection system. The resultant CT-AuGMEnT successfully learns to react to the events of a continuous-time task, without any pre-imposed specifications about the duration of the events or the delays between them.


Bio-inspired learning mechanisms and anticipation in humanoid robotics

Egidio Falotico

Nowadays, increasingly complex robots are being designed. As the complexity of robots increases, traditional methods for robotic control may become complex to handle. For this reason, the use of neuro-controllers, i.e. controllers based on biological learning mechanisms, has risen at a rapid pace. Controllers of this kind are especially useful in the field of humanoid robotics, where it is common for the robot to perform difficult tasks (e.g. visual tracking, gaze-guided locomotion) in a complex unstructured environment. In order to perform these tasks, motor control cannot be based solely on sensory feedback, which would be too slow. Indeed, in humans, perceptual activity is not confined to the interpretation of sensory information; it also anticipates the consequences of action. In robotics too, anticipatory control, generated by internal models built through experience and properly combined with reactive behaviours, can greatly improve the effectiveness of perception-action loops and the overall behaviour in real-world environments. This talk will present the latest results on bio-inspired learning mechanisms integrated within anticipative control architectures implemented on humanoid robots.


Cerebellar internal models for a modular robot

Silvia Tolu

The problem to solve in controlling a dynamical system is to find the input to the system that will achieve the desired behavior as output, even under disturbances or in changing environments. The cerebellum plays this role: it adapts its output to every condition by acquiring intrinsic models through experience, using perceptual feedback that allows motor learning to proceed. Each internal model (IM) is then instantiated based on what has been learned about a specific motor control for a specific machine. Apart from adaptation, another issue is the capability of the Central Nervous System (CNS) to recall the appropriate IM and use it to make predictions during a movement. Therefore, after training and adaptation the IM becomes encoded into a long-term memory. Neurorobots have proved useful for investigating motor control, and for designing robot controllers as well. Furthermore, they can generate hypotheses and test theories of brain functions. In this work, we have designed a brain-inspired control system that can operate in an unknown or changing environment when the dynamic model of the robot is unknown (e.g. for a modular robot). Furthermore, a cerebellar model has been developed with the aim of implementing model extraction schemes for the acquisition of knowledge (forward and inverse IMs). A modular robot (Fable robot) benefits from the organization and adaptivity of IMs that are embedded into its control system architecture. Finally, we have tested the adaptation of the IMs under a given task and the robustness of the whole control system.


Sensorimotor Learning for Neural Robot Control based on the Kinematic Bézier Maps and Spiking Neural Networks

Stefan Ulbrich

Kinematic Bézier Maps are a highly specialized model representation of robot kinematics and dynamics with related, optimal learning algorithms. By means of complex basis transformations embedding prior knowledge, these complex functions are transformed into a high-dimensional space where they can be represented in a linear form and thus be learned efficiently. In this work, we present our ongoing research on how this model representation and its learning algorithms can be translated into a novel form based on spiking neural networks, exploiting their high degree of parallelism in order to benefit from the increased performance in robotic applications when run on neuromorphic hardware.
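
The idea of transforming a nonlinear kinematic map into a space where it is linear in its parameters can be illustrated with a much simpler stand-in. The sketch below does not implement Kinematic Bézier Maps. It assumes a planar two-link arm with hypothetical link lengths (LINK_1, LINK_2) and replaces the Bézier basis with a plain trigonometric tensor-product basis, [1, cos q, sin q] per joint. In that nine-dimensional feature space the forward kinematics is exactly linear, so ordinary least squares recovers it from sampled postures. All function names are illustrative.

```python
import math
import random

# Hypothetical link lengths in metres (illustrative values, not from the source).
LINK_1, LINK_2 = 0.30, 0.20


def forward_kinematics(q1, q2):
    """Ground-truth end-effector position of a planar two-link arm."""
    x = LINK_1 * math.cos(q1) + LINK_2 * math.cos(q1 + q2)
    y = LINK_1 * math.sin(q1) + LINK_2 * math.sin(q1 + q2)
    return x, y


def features(q1, q2):
    """Tensor product of [1, cos q, sin q] per joint: 9 fixed basis functions."""
    b1 = (1.0, math.cos(q1), math.sin(q1))
    b2 = (1.0, math.cos(q2), math.sin(q2))
    return [u * v for u in b1 for v in b2]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(samples):
    """Least-squares weights for the x and y coordinates via the normal equations."""
    phis = [features(q1, q2) for q1, q2 in samples]
    targets = [forward_kinematics(q1, q2) for q1, q2 in samples]
    n = len(phis[0])
    gram = [[sum(p[i] * p[j] for p in phis) for j in range(n)] for i in range(n)]
    weights = []
    for axis in range(2):
        rhs = [sum(p[i] * t[axis] for p, t in zip(phis, targets)) for i in range(n)]
        weights.append(solve(gram, rhs))
    return weights


def predict(weights, q1, q2):
    """Evaluate the learned linear model at a joint configuration."""
    phi = features(q1, q2)
    return tuple(sum(w * f for w, f in zip(ws, phi)) for ws in weights)


# Sample random postures, observe the end effector, and fit the linear model.
rng = random.Random(0)
train = [(rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi))
         for _ in range(200)]
weights = fit(train)
```

Calling `predict(weights, q1, q2)` on an unseen posture should agree with `forward_kinematics(q1, q2)` up to floating-point error, because the target function lies exactly in the span of the chosen basis. A batch least-squares solve is used here for brevity; an incremental rule would be closer to an online learning setting.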


