=Paper=
{{Paper
|id=Vol-1419/section0009
|storemode=property
|title=None
|pdfUrl=https://ceur-ws.org/Vol-1419/section0009.pdf
|volume=Vol-1419
}}
==None==
Brain-Supported Learning Algorithms for Robots

Chairperson
* Florian Röhrbein (florian.roehrbein@in.tum.de), Department of Informatics VI, Technische Universität München, Boltzmannstr. 3, 85748 Garching, Germany

Discussants
* Cecilia Laschi (cecilia.laschi@sssup.it), The BioRobotics Institute, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio, 34, 56026 Pontedera (PI), Italy
* Florian Walter (florian.walter@tum.de), Department of Informatics VI, Technische Universität München, Boltzmannstr. 3, 85748 Garching, Germany

Speakers
* Sander Bohte (S.M.Bohte@cwi.nl), CWI - Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, Netherlands
* Egidio Falotico (e.falotico@sssup.it), The BioRobotics Institute, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio, 34, 56026 Pontedera (PI), Italy
* Silvia Tolu (silvia.tolu@gmail.com), Technical University of Denmark, Department of Electrical Engineering, Elektrovej Building 326, 2800 Kgs. Lyngby, Denmark
* Stefan Ulbrich (Stefan.Ulbrich@fzi.de), FZI - Forschungszentrum Informatik, Haid-und-Neu-Str. 10-14, 76131 Karlsruhe, Germany

Roboticists recognized early on the high potential of neuro-biological control structures for robotic applications. However, limited processing power and the lack of appropriate models and tools shifted the focus of research far away from biological neural networks. Today, combined efforts in neuroscience, computer science and many other areas, in interdisciplinary research projects like the Human Brain Project, enable the simulation of spiking biological neural networks with millions of neurons. The computational power of these networks makes them a very promising tool for the development of brain-controlled neurorobots. Major challenges towards this goal include finding a meaningful mapping between tasks and neural structures as well as making the simulated brain exhibit the desired behavior. The large size and the complexity of biological neural networks make the development of learning algorithms a huge challenge. A main prerequisite for the implementation of learning algorithms for brain-controlled robots is the availability of appropriate tools such as the SpiNNaker board, which is able to simulate the neural network in real time. On the software side, these tools should ease development and support the researcher during evaluation by offering a toolchain for the implementation and simulation of new algorithms (i.e., simulators able to connect and synchronize simulated brain-supported algorithms and simulated robotic platforms).

This symposium presents current brain-supported learning techniques for robots as well as support tools for the implementation of these algorithms. In particular, the papers in this symposium provide evidence of the advantages of the proposed brain-supported learning solutions and of the effectiveness of tools for the evaluation and implementation of these algorithms.

===Continuous-time neural reinforcement learning of working memory tasks===
Sander Bohte

As living organisms, one of our primary characteristics is the ability to rapidly process and react to unknown and unexpected events. To this end, we are able to recognize an event or a sequence of events and learn to respond properly. Despite advances in machine learning, current cognitive robotic systems are not able to respond rapidly and efficiently in the real world: the challenge is to learn to recognize both what is important and when to act. Reinforcement Learning (RL) is typically used to solve complex tasks: to learn the how. To respond quickly (to learn the when), the environment has to be sampled often enough. For "enough", a programmer has to decide on the step size as a representation of time, choosing between a fine-grained representation of time (many state transitions; difficult to learn with RL) and a coarse temporal resolution (easier to learn with RL but lacking precise timing). Here, we derive a continuous-time version of on-policy SARSA learning in a working-memory neural network model, AuGMEnT. Using a neural working memory network resolves the "what" problem; our "when" solution is built on the notion that, in the real world, truly instantaneous actions are impossible, since every action has a certain duration. We demonstrate how action duration can be decoupled from the internal time steps of the neural RL model using an action selection system. The resultant CT-AuGMEnT successfully learns to react to the events of a continuous-time task, without any pre-imposed specifications about the duration of the events or the delays between them.

===Bio-inspired learning mechanisms and anticipation in humanoid robotics===
Egidio Falotico

Increasingly complex robots are being designed. As the complexity of robots increases, traditional methods for robotic control may become difficult to handle. For this reason, the use of neuro-controllers, i.e., controllers based on biological learning mechanisms, has risen at a rapid pace. Controllers of this kind are especially useful in humanoid robotics, where it is common for the robot to perform difficult tasks (e.g., visual tracking, gaze-guided locomotion) in a complex unstructured environment. In order to perform these tasks, motor control cannot be based on sensory feedback, which would be too slow. Indeed, in humans, perceptual activity is not confined to the interpretation of sensory information; it also anticipates the consequences of action. In robotics as well, anticipatory control, generated by internal models built from experience and properly combined with reactive behaviours, can greatly improve the effectiveness of perception-action loops and the overall behaviour in real-world environments. This talk will present the latest results on bio-inspired learning mechanisms integrated within anticipative control architectures implemented on humanoid robots.

===Cerebellar internal models for a modular robot===
Silvia Tolu

The problem to solve in controlling a dynamical system is to find the input to the system that achieves the desired behavior as output, even under disturbances or in changing environments. The cerebellum acts in this sense because it adapts its output in every condition by acquiring intrinsic models through experience, using perceptual feedback that allows motor learning to proceed. Each internal model (IM) is then instantiated based on what has been learned about a specific motor control for a specific machine. Apart from adaptation, another issue is the capability of the Central Nervous System (CNS) to recall the appropriate IM and use it to make predictions during a movement. Therefore, after training and adaptation, the IM becomes encoded into long-term memory. Neurorobots have proved useful for investigating motor control and for designing robot controllers. Furthermore, they can generate hypotheses and test theories of brain function. In this work, inspired by how the brain works, we have designed a control system that can operate in an unknown or changing environment when the dynamical robot model is unknown (e.g., for a modular robot). Furthermore, a cerebellar model has been developed with the aim of implementing model extraction schemes for the acquisition of knowledge (forward and inverse IMs). A modular robot (the Fable robot) benefits from the organization and adaptivity of IMs that are embedded into its control system architecture. Finally, we have tested the adaptation of the IMs under a given task and the robustness of the whole control system.

===Sensorimotor Learning for Neural Robot Control based on the Kinematic Bézier Maps and Spiking Neural Networks===
Stefan Ulbrich

The Kinematic Bézier Maps are a highly specialized model representation of robot kinematics and dynamics with related, optimal learning algorithms. By means of complex basis transformations embedding prior knowledge, these complex functions are transformed into a high-dimensional space where they can be represented in a linear form and thus be learned efficiently. In this work, we present our ongoing research on how this model representation and its learning algorithms can be translated into a novel form based on spiking neural networks, exploiting the high degree of parallelism in order to benefit from the increased performance in robotic applications when run on neuromorphic hardware.
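
The idea in the last abstract, that a suitable basis transformation makes a nonlinear map linear in its parameters and therefore easy to learn, can be illustrated with a small sketch. This is not the Kinematic Bézier Maps formulation itself: it assumes a toy single-input curve, uses a plain (non-rational) Bernstein basis, and fits the control points by ordinary least squares. The function names `bernstein_features`, `fit_control_points` and `predict` are hypothetical and chosen only for this example.

```python
from math import comb

import numpy as np


def bernstein_features(t, degree):
    """Map inputs t in [0, 1] to Bernstein basis values, shape (len(t), degree + 1)."""
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    k = np.arange(degree + 1)
    binom = np.array([comb(degree, int(i)) for i in k], dtype=float)
    return binom * t**k * (1.0 - t) ** (degree - k)


def fit_control_points(t, targets, degree):
    """Least-squares estimate of the control points (the linear weights)."""
    features = bernstein_features(t, degree)
    control_points, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return control_points


def predict(t, control_points):
    """Evaluate the learned curve at inputs t."""
    degree = control_points.shape[0] - 1
    return bernstein_features(t, degree) @ control_points


# Toy stand-in for a kinematic map (an assumption of this sketch):
# a planar cubic Bezier curve with known control points.
rng = np.random.default_rng(0)
true_points = np.array([[0.0, 0.0], [0.3, 0.8], [0.7, 0.9], [1.0, 0.2]])
t_train = rng.uniform(0.0, 1.0, size=50)
xy_train = predict(t_train, true_points)  # noise-free training samples

# The curve is nonlinear in t but linear in the control points,
# so a single linear solve recovers them.
learned = fit_control_points(t_train, xy_train, degree=3)
max_error = np.max(np.abs(learned - true_points))
```

Because the model is linear in the control points once the inputs are mapped through the basis, learning reduces to one linear solve here. Incremental or spiking-network variants, as discussed in the abstract, would replace that solve with an online update.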