Interactive Learning of Grounded Concepts

Jens Nevens, Paul Van Eecke, and Katrien Beuls

Artificial Intelligence Lab, Vrije Universiteit Brussel
Pleinlaan 2, B-1050 Brussels, Belgium
{jens|paul|katrien}@ai.vub.ac.be

Autonomous agents perceive the world through streams of continuous sensori-motor data. Yet, in order to reason and communicate about their environment, agents need to be able to distil meaningful symbolic concepts from their raw observations. Without such a repertoire of concepts, communication would need to happen by directly sending sensori-motor values. Such a system easily leads to miscommunication when perfect calibration is not possible.

Existing approaches to bridging the continuous and symbolic domains include deep learning techniques (e.g. [1]) and version space learning [2]. Deep learning techniques generally achieve high levels of accuracy. However, they rely on very large amounts of training data, they often fail to adapt to unseen scenarios, and the resulting concepts lack transparency. Version space learning, on the other hand, can yield human-interpretable concept representations, but is notoriously brittle when faced with noisy training data.

In this interactive demo, we introduce a novel approach to grounded concept learning. Using the language game methodology [3], we set up a tutor-learner scenario where the learner is an autonomous agent, grounded in the world using a Nao humanoid robot, and the participant is its tutor. Using blocks of various shapes, sizes and colours, the tutor first creates a scene. In this scene, the tutor chooses a topic and describes it to the learner using an informative concept, such as ‘red’ or ‘cube’. The learner robot observes the world through human-interpretable streams of numeric data, such as ‘area’, ‘colour’ and ‘XY-coordinates’. These are obtained through standard computer vision techniques. The robot tries to find the object described by the tutor.
After each interaction, whether it was successful or not, the tutor provides feedback to the learner by showing the intended topic object. The robot makes maximal use of this feedback to create a new representation of the concept or to extend an existing one. For each concept, the robot has to find out which data streams are important and what the typical values for each data stream within a concept are. To make these decisions, the learner makes use of the notion of discrimination, i.e. separating one particular object from the other objects in the scene. Over the course of many such interactions, the learner incrementally and in real time builds a complete repertoire of concepts that is functional in the world. An overview of the experimental setup is shown in Figure 1. A video of the demonstration can be found at https://ehai.ai.vub.ac.be/demos/interactive-concept-learning.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

Fig. 1. Overview of the experimental setup. The Nao humanoid robot observes a scene of blocks with different shapes, colours and sizes.

During the demonstration, participants are able to inspect the conceptual system of the agent and follow its evolution. The acquired concepts are completely transparent and human-interpretable. We show that our approach does not rely on huge amounts of training data, since forming a repertoire of concepts only requires a few interactions. Additionally, the resulting concepts are general enough to be applied to previously unseen objects and can be learned in an incremental manner. The whole system is adaptive, as it does not require us to specify the number of concepts that should be learned. This number depends entirely on the objects observed by the agent, hence there is no need for complete or even partial retraining when the environment changes.
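
The feedback step described above (deciding which data streams matter for a concept and what their typical values are, based on discrimination) can be illustrated with a minimal sketch. The abstract does not specify the concept representation or the update rule, so everything here is an assumption made for illustration: the names `Stream`, `Concept`, `similarity`, `interpret` and `learn`, the running mean as the "typical value", the per-stream weight in [0, 1], the fixed `step`, and the premise that every stream is a single scalar scaled to [0, 1].

```python
from dataclasses import dataclass, field


# Hypothetical representation; the paper's actual data structures are not given.
@dataclass
class Stream:
    mean: float          # running average of observed values (the "typical value")
    count: int = 1       # number of observations folded into the mean
    weight: float = 0.5  # assumed importance of this stream for the concept


@dataclass
class Concept:
    name: str
    streams: dict = field(default_factory=dict)  # stream name -> Stream


def similarity(concept, obj):
    """Weighted closeness of an object to a concept (values assumed in [0, 1])."""
    total = sum(s.weight for s in concept.streams.values())
    if total == 0:
        return 0.0
    score = sum(s.weight * (1.0 - abs(obj[k] - s.mean))
                for k, s in concept.streams.items())
    return score / total


def interpret(concept, scene):
    """Learner's guess: index of the object in the scene closest to the concept."""
    return max(range(len(scene)), key=lambda i: similarity(concept, scene[i]))


def learn(concepts, word, scene, topic_index, step=0.1):
    """Create or update the concept for `word` once the tutor reveals the topic."""
    topic = scene[topic_index]
    others = [o for i, o in enumerate(scene) if i != topic_index]
    concept = concepts.get(word)
    if concept is None:
        # First encounter: adopt the topic's values as the initial prototype.
        concept = Concept(word, {k: Stream(mean=v) for k, v in topic.items()})
        concepts[word] = concept
        return concept
    for k, s in concept.streams.items():
        # A stream is discriminating if the topic is strictly closer to the
        # current prototype than every other object in the scene.
        d_topic = abs(topic[k] - s.mean)
        discriminating = all(d_topic < abs(o[k] - s.mean) for o in others)
        if discriminating:
            s.weight = min(1.0, s.weight + step)
        else:
            s.weight = max(0.0, s.weight - step)
        # Fold the topic's value into the running mean.
        s.count += 1
        s.mean += (topic[k] - s.mean) / s.count
    return concept
```

In this sketch, repeatedly calling `learn(concepts, "red", scene, topic_index)` on scenes where the topic's `colour` value stays similar while its `area` varies raises the weight of `colour` and lowers the weight of `area`, so `interpret` comes to rely mostly on colour. New words simply add entries to `concepts`, so the number of concepts need not be fixed in advance.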
These properties make the approach well suited for use in robotic agents as the module that maps from continuous sensori-motor input to grounded, symbolic concepts that can then be used for higher-level reasoning tasks such as planning, explanation or communication.

References

1. Higgins, I., Matthey, L., Glorot, X., Pal, A., Uria, B., Blundell, C., Mohamed, S., Lerchner, A.: Early visual concept learning with unsupervised deep learning. arXiv preprint arXiv:1606.05579 (2016)
2. Mitchell, T.M.: Generalization as search. Artificial Intelligence 18(2), 203–226 (1982)
3. Steels, L.: Language games for autonomous robots. IEEE Intelligent Systems 16(5), 16–22 (2001)