Gabriel J. Ferrer MAICS 2016 pp. 47–53

Real-time Unsupervised Clustering

Gabriel J. Ferrer
Department of Mathematics and Computer Science
Hendrix College
1600 Washington Ave.
Conway, Arkansas 72032

Abstract

In our research program, we are developing machine learning algorithms to enable a mobile robot to build a compact representation of its environment. This requires the processing of each new input to terminate in constant time. Existing machine learning algorithms are either incapable of meeting this constraint or deliver problematic results. In this paper, we describe a new algorithm for real-time unsupervised clustering, Bounded Self-Organizing Clustering. It executes in constant time for each input, and it produces clusterings that are significantly better than those created by the Self-Organizing Map, its closest competitor, on sensor data acquired from a physically embodied mobile robot.

Clustering algorithms are unsupervised learning algorithms that employ a distance metric to organize their training inputs into clusters. The classification of a previously unseen input is determined by the cluster to which it is closest, again based on the distance metric. This enables a large input space to be compressed into a more compact representation. A useful application of this technique is to create models of mobile robot environments based on the robot's sensor inputs.

In both training and classification, the distances between the input and the representative example of each cluster must be calculated. By placing a fixed upper bound on the number of clusters, we limit the total number of calculations required by each training and classification operation. This, in turn, enables us to predict the cycle time of the algorithm, a necessity for the practical deployment of such an algorithm aboard a mobile robot.

In this paper, we introduce a novel clustering algorithm, Bounded Self-Organizing Clusters (BSOC), that is designed to meet these constraints. In the robotics literature, the Kohonen Self-Organizing Map (SOM) has been used for this purpose as well. We argue that the design of the SOM leads to a relatively poor clustering, and that BSOC represents a significant improvement. We demonstrate this experimentally using both sonar and image data captured from a mobile robot.

We begin with some background and describe related work. Next, we describe our physical system hardware, with an eye towards making clear the technical limitations to which any solution must conform. We then describe our learning algorithm, followed by our experiments and results. We then give our conclusions and speculations on future work.

Background

Clustering algorithms have long been an appealing machine learning technique for robotics applications, due to their potential for incremental machine learning as well as their ability to reduce complex continuous sensor input values to a smaller number of discrete values. Several examples of this approach employ the Kohonen Self-Organizing Map (SOM) (Kohonen 2001) to reduce the sensor inputs into a finite number of discrete states to facilitate the Q-learning algorithm.

Each SOM is a neural network consisting of a single two-dimensional layer of nodes arranged in a rectangular grid. The number of nodes and their arrangement is fixed. Each node is an object of the same specifications as a network input, which represents the reference input for that node. A distance metric is defined for the space of possible inputs. Typically, this distance metric will be the sum-of-squared-differences. The output node whose reference input is closest in distance to an input presented to the network is considered the winning node; its coordinates are the output of the SOM.

To train a SOM, a training input is presented to the network. The winning output node's reference input, along with those of some of its neighbors, is then combined with the input value. The difference vector between the values is calculated and multiplied by a learning rate. The resulting value is then added to the reference input of the learning node. For each affected SOM node i and input vector element j, the update equation for the weight vector w of the reference input (given a learning rate α, 0 < α < 1) is:

    w_ij = w_ij + α * (input_ij − w_ij)

The α parameter is determined by the neighborhood scheme of the specific SOM implementation. Many implementations (e.g. (Smith 2002)) make multiple passes over the input, slowly reducing α on each pass. (Touzet 1997) trains the winning node with α = 0.9, and the four cardinal neighbors are trained with α = 0.5. As α remains constant throughout, this formulation of the SOM lends itself well to real-time implementation aboard a robot. Since the number of nodes and topology are fixed, one can define a SOM output map that guarantees that the robot's cycle time will reliably meet its timing requirements. This is the variation of the SOM that will serve as a baseline in our experiments for assessing our own algorithm.

The earliest application of the SOM to robotics was that of (Touzet 1997), in which a tiny (6 cm diameter) Khepera mobile robot equipped with eight infrared sensors for object detection learned to move forward while avoiding obstacles. The continuous values from the IR sensors were routed into a 4x4 SOM. The SOM output node was then employed as a state index for Q-learning. Touzet later scaled up this work to a much larger robot (Touzet 2006) employing a ring of 16 sonars. In this later work, Q-learning was abandoned; instead of using reinforcement rewards, the desired behavior was specified in terms of an acceptable range of sensor values.
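To make this constant-rate baseline concrete, the winner computation and update rule can be sketched as follows. This is a Python/NumPy sketch for exposition only; the systems discussed in this paper are implemented in Java, and the function names and array layout here are our own choices.

```python
import numpy as np

def som_winner(weights, input_vec):
    """Coordinates of the node whose reference input has the smallest
    sum-of-squared-differences distance to input_vec."""
    ssd = ((weights - input_vec) ** 2).sum(axis=2)
    return np.unravel_index(np.argmin(ssd), ssd.shape)

def som_update(weights, winner, input_vec, alpha_winner=0.9, alpha_neighbor=0.5):
    """One training step, in place: w_ij <- w_ij + alpha * (input_ij - w_ij).

    weights: (rows, cols, d) array of reference inputs.
    Following (Touzet 1997), the winner is trained with alpha = 0.9 and
    its four cardinal neighbors with alpha = 0.5. Because alpha is
    constant, every step costs the same, which is what makes this
    formulation usable in real time.
    """
    rows, cols, _ = weights.shape
    r, c = winner
    weights[r, c] += alpha_winner * (input_vec - weights[r, c])
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            weights[nr, nc] += alpha_neighbor * (input_vec - weights[nr, nc])
```

Since the grid size is fixed, each call touches at most five nodes, so the per-input cost is bounded regardless of how many inputs have been seen.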
(Smith 2002) did similar work, with two major differences: a different definition of the SOM learning neighborhood than that of Touzet, and implementation in simulation rather than on a physical robot. Instead of using immediate neighbors, all nodes in the network were affected, with the learning rate combined with a Gaussian smoothing function. (Ferrer 2010) implemented both approaches on a mobile robot and compared the results. He found that both approaches worked respectably well, but that both were surpassed by Q-Learning without the use of clustering.

(Provost, Kuipers, and Miikkulainen 2006) undertook similar work in simulation, but replaced the SOM with the closely related Growing Neural Gas (GNG) algorithm (Fritzke 1995). The GNG is appealing in this context because of its flexible topology. It starts with two nodes. Additional nodes are incorporated into the network as needed to represent inputs that correspond to significant previously-unseen aspects of the input space.

(Ferrer 2014a) also applied the GNG algorithm to robot learning. A robot built a GNG as it drove through its environment. A human then examined each GNG node, specifying an appropriate action based on a visual inspection of the node's reference input. This work was reasonably successful for behavior specification on a physical robot. However, the manual specification of an action for each node was found to be laborious. (Ferrer 2014b) further advanced the concept of employing a GNG for learning an environment. A human pilots a robot through an environment, learning a GNG representation of the environment and associating actions from the human with the GNG nodes. When running autonomously, the robot selects its action based on which GNG node is the winner for the current input image.

Because the GNG adapts the number of clusters based on the inputs it receives, it is not possible to guarantee that the robot will complete each round of processing within a specified target time. While one appealing aspect of the GNG algorithm is the elimination of underutilized nodes, the specifics of this elimination process are too unpredictable to enable us to make guarantees about timing.

The problem of learning from a real-time input stream is also of interest outside the robotics community. (Hapfelmeier et al. 2012) mention application domains such as real-time scientific data processing, finance, web applications, and telecommunications.

The most prominent clustering algorithm in the literature is arguably k-means (Forgey 1965) (MacQueen 1967). For our application domain, we did not consider this approach at all, as it requires multiple passes through the input data. As each of these multiple passes requires visiting every input sample, our target constraint of constant-time execution cannot be satisfied. Also related to our approach is bottom-up agglomeration as described, for example, by (Koivistoinen, Ruuska, and Elomaa 2006). We do not seek to perform a full bottom-up agglomeration; instead, we use agglomeration as a strategy to help us execute in constant time while retaining good relationships with the inputs.

Hardware

Our robot is a Lego Mindstorms EV3 equipped with a Logitech C270 webcam and three sonar sensors. The EV3 has a 300 MHz ARM9 CPU and 64 MB of RAM. Its OS is a version of Linux. We implemented our software in Java using LeJOS 0.9.0. The webcam is connected to the EV3 via its USB 1.1 port. As the webcam is designed to be used with a USB 2.0 port or better, it is limited to grabbing 160x120 YUYV images at about 14 frames per second. We configured the robot as shown in Figure 1.

Figure 1: Robot

Bounded Self-Organizing Clusters

Basic Version

We introduce a novel clustering algorithm called Bounded Self-Organizing Clusters (BSOC) (see Figure 2). Each BSOC is a complete undirected weighted graph. Each node in the graph contains the reference input for a cluster (comparable to the nodes in a Self-Organizing Map (Kohonen 2001)). Each edge weight denotes the distance between the reference inputs of the clusters.

Each BSOC is given a maximum number of clusters it can represent. When training, each training input is added to the set of clusters as a reference input. If adding the training input causes the number of nodes to exceed the maximum, the two nodes in the graph with the shortest edge are merged by computing a weighted average of their numerical representations.

The edges are stored in a red-black tree. They are primarily ordered by their distances, and secondarily ordered by the indices of their node endpoints. This enables us to quickly find the smallest edge when needed in the train method. When two nodes are merged (and removed from the main graph), their edges are also removed from the red-black tree.

    setup(max-nodes)
        edges = new red-black-tree
        nodes = map of node-indexes to inputs

    lookup(input)
        for each node
            find distance from input to node
        return (node-index, distance) of closest node

    train(input)
        insert(input)
        if number of nodes exceeds max-nodes
            edge = edges.remove(smallest edge)
            (n1, n2) = endpoints of edge
            Remove n1 and n2 and their edges
            insert(merged(n1, n2))

    insert(image)
        add image to nodes
        for each existing node n
            Create and insert a new edge:
            - First vertex is image
            - Second vertex is n
            - Weight = distance from image to n

    merged(n1, n2)
        img = (n1.image + n2.image) / 2
        return img

Figure 2: Pseudocode for Basic BSOC Training

Weighted-Input Variation

A potential drawback of the above algorithm is that clusters that are the result of repeated averages are given the same importance in the merging stage as clusters that are the result of single inputs. We find it intuitive that clusters that are the result of repeated merges would be closer in distance to a larger number of inputs from the input space. The following variation is an attempt to codify this intuition.

In addition to a reference input, each node in the graph contains a counter, initially set to one. Each counter denotes the number of input samples that contributed to the current node. These counters serve two major roles. First, they will be used as weights when merging nodes. Second, each edge weight (i.e., the distance between the reference inputs of the connected nodes) will be multiplied by the larger of the two counters for the connected nodes.

We expect these roles to achieve two things. First, by employing the larger counter as an edge weight multiplier, nodes that are derived from larger numbers of input samples will be less likely to be selected for merging. Second, by weighting the merge based on the counters, when they are selected they will preserve more of the broad influence of the inputs from which they were generated.

The revised BSOC algorithm is given in Figure 3. Note that setup and lookup are unchanged (and hence omitted). The additional arithmetic calculations have no asymptotic impact on performance.

    train(input)
        insert(input, 1)
        if number of nodes exceeds max-nodes
            edge = edges.remove(smallest edge)
            (n1, n2) = endpoints of edge
            Remove n1 and n2 and their edges
            insert(merged(n1, n2), n1.count + n2.count)

    insert(image, count)
        add (image, count) to nodes
        for each existing node n
            d = distance from image to n
            Create and insert a new edge:
            - First vertex is image
            - Second vertex is n
            - Weight is d * max(count(image), count(n))

    merged(n1, n2)
        w1 = n1.count / (n1.count + n2.count)
        w2 = n2.count / (n1.count + n2.count)
        img = w1 * n1.image + w2 * n2.image
        return img

Figure 3: Pseudocode for Weighted BSOC Training

Let M be the maximum number of nodes for a BSOC. Each call to lookup or insert makes no more than M calls to the distance function. Each call to insert performs up to M log M² = 2M log M numerical comparisons when inserting an edge connecting the input to each of up to M existing nodes. Each call to train makes up to two calls to insert, along with a comparison, 2M edge removals (each requiring 2 log M comparisons in a red-black tree), and two hash table removals. Consequently, given that M can be fixed as a constant, we can say that each call to train or lookup takes constant time.
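The pseudocode in Figures 2 and 3 can be condensed into a short executable sketch. This is Python for brevity rather than our Java implementation, and a linear scan over node pairs stands in for the red-black tree of edges, so each training step here does more work than the actual system; the clustering behavior, however, follows the weighted variant of Figure 3 (with every count fixed at one, the merge reduces to the plain average of Figure 2).

```python
import itertools

def ssd(a, b):
    """Sum-of-squared-differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class BSOC:
    """Sketch of Bounded Self-Organizing Clusters (weighted variant).

    Keeps at most max_nodes (reference input, count) pairs; when the
    bound is exceeded, the two clusters joined by the smallest weighted
    edge are merged.
    """

    def __init__(self, max_nodes):
        self.max_nodes = max_nodes
        self.nodes = []  # list of (reference_input, count)

    def lookup(self, x):
        """Return (index, distance) of the closest cluster to x."""
        return min(((i, ssd(ref, x)) for i, (ref, _) in enumerate(self.nodes)),
                   key=lambda pair: pair[1])

    def train(self, x):
        self.nodes.append((list(x), 1))
        if len(self.nodes) > self.max_nodes:
            # Edge weight = distance * larger of the two counts (Figure 3).
            i, j = min(
                itertools.combinations(range(len(self.nodes)), 2),
                key=lambda p: ssd(self.nodes[p[0]][0], self.nodes[p[1]][0])
                              * max(self.nodes[p[0]][1], self.nodes[p[1]][1]))
            (r1, c1), (r2, c2) = self.nodes[i], self.nodes[j]
            # Merge by count-weighted average of the reference inputs.
            merged = [(c1 * a + c2 * b) / (c1 + c2) for a, b in zip(r1, r2)]
            self.nodes = [n for k, n in enumerate(self.nodes) if k not in (i, j)]
            self.nodes.append((merged, c1 + c2))
```

Because the number of nodes never exceeds max_nodes + 1, every call to train and lookup touches a bounded number of nodes, which is the property the analysis above depends on.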
Each call to train makes up to Figure 3: Pseudocode for Weighted BSOC Training two calls to insert, along with a comparison, 2M edge re- movals (each requiring 2 log M comparisons in a red-black tree), and two hash table removals. Consequently, given that M can be fixed as a constant, we can say that each call to Experiments and Evaluation train or lookup takes constant time. Robot Environment Weighted-Input Variation We gathered two different types of sensor data from our A potential drawback of the above algorithm is that clusters robot: triples of distance values from the sonars, and images that are the result of repeated averages are given the same from the webcam. To gather the sonar data set and the first 49 Gabriel J. Ferrer MAICS 2016 pp. 47–53 Sonar (m2 ) Nodes SOM BSOC1 BSOC2 16 1.812 ⇥ 10 3 2.310 ⇥ 10 1 2.451 ⇥ 101 32 1.523 ⇥ 103 4.978 2.998 64 7.361 ⇥ 101 9.120 ⇥ 10 1 5.473 ⇥ 10 1 Figure 6: Sonar: SSD ing sonar position are computed. For the YUYV images, the differences between each corresponding byte are computed. When creating clusters, the merging process requires av- eraging the inputs. In both cases, the averaging is analogous to the computation of the distance. For the sonar triples, the weighted mean values are computed for each sonar position. For the images, the weighted mean values are computed for each byte. Since the byte values are 8-bit unsigned integers, Figure 4: Office, Carpeted Floor the result of each division is truncated. Experiments In this section, the basic version of BSOC is denoted BSOC1, and the BSOC variation weighted by input sample counts is BSOC2. We constructed our experiments to test the following hy- potheses: 1. Given the same number of clusters, BSOC1 will provide closer matches to its source inputs than the SOM as for- mulated by (Touzet 1997), and BSOC2 will provide closer matches than BSOC1. 
We define “closer matches” to be the sum-of-squared-differences (SSD) between each input and the reference input for the closest matching cluster. 2. Given the same number of clusters, BSOC2 will exhibit Figure 5: Office, Tile Floor more consistent performance than the SOM or BSOC1 regardless of the ordering of the input data. We measure “consistent performance” as the ratio of the standard devi- video data set, we navigated the robot through the cluttered ation to the mean. Smaller values indicate more consistent office environment depicted in Figure 4. performance. For the second video data set, we navigated the robot We tested each of our input sequences with clustering al- through a different office environment. While the first en- gorithms configured with 16, 32, and 64 nodes. For testing vironment has a carpeted floor, the second environment has the SOMs, we used 4x4, 4x8, and 8x8 grids. The SSD val- a tile floor of alternating colors, as we can see in Figure 5. ues for the sonar experiments are in units of meters squared. Each video sequence is 700 frames. Each frame is a For the video experiments, the units are pixel intensity val- 160x120 image in YUYV format. As YUYV uses 2 bytes ues squared. For presentation purposes, all entries have been per pixel, the total size of each frame is 38400 bytes. The rounded to four significant figures. sonar data set was created from a sequence of 1304 sonar triples. The sonars are configured as follows: One points about 45 degrees to the left of the robot, one points forward, Results and one points about 45 degrees to the right. The effective Hypothesis 1 As we can see from Figures 6 through range of each sonar is from 0 to 2.5 meters. In all cases, the 11, Hypothesis 1 was confirmed, with some qualifications. 
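The distance and averaging computations described above can be sketched as follows. This is a Python sketch for exposition; our implementation is in Java and operates on raw byte buffers, so the function names here are ours.

```python
def ssd(a, b):
    """Sum-of-squared-differences over corresponding elements."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def weighted_mean_sonar(t1, c1, t2, c2):
    """Count-weighted mean of two sonar triples (meters, kept as floats)."""
    return [(c1 * a + c2 * b) / (c1 + c2) for a, b in zip(t1, t2)]

def weighted_mean_bytes(img1, c1, img2, c2):
    """Count-weighted mean of two YUYV byte arrays; the values are 8-bit
    unsigned integers, so each division truncates, as noted in the text."""
    return bytes((c1 * a + c2 * b) // (c1 + c2) for a, b in zip(img1, img2))
```

The same ssd function serves both sensor types: a sonar triple is three floats, and a YUYV frame is simply a 38400-element sequence of bytes.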
Results

Hypothesis 1  As we can see from Figures 6 through 11, Hypothesis 1 was confirmed, with some qualifications. BSOC1 consistently outperformed the SOM in the sonar experiments by multiple orders of magnitude, a compelling improvement. In the first video experiments, it outperformed the SOM by factors ranging from slightly over one to slightly over two, a much less compelling improvement, but still consistently superior. The second video was more favorable, with BSOC1 performing 3-4 times better.

    Nodes   SOM/BSOC1    SOM/BSOC2
    16      7.844×10¹    7.393×10¹
    32      3.059×10²    5.080×10²
    64      8.071×10¹    1.345×10²

Figure 7: Sonar: SOM SSD / BSOC SSD

    Nodes   SOM          BSOC1        BSOC2
    16      5.465×10¹⁸   4.749×10¹⁸   1.754×10¹⁸
    32      4.257×10¹⁸   2.524×10¹⁸   1.113×10¹⁸
    64      2.703×10¹⁸   1.125×10¹⁸   6.765×10¹⁷

Figure 8: Video 1: SSD

    Nodes   SOM/BSOC1    SOM/BSOC2
    16      1.151        3.116
    32      1.687        3.825
    64      2.403        3.996

Figure 9: Video 1: SOM SSD / BSOC SSD

    Nodes   SOM          BSOC1        BSOC2
    16      1.404×10¹⁹   3.739×10¹⁸   1.199×10¹⁸
    32      5.390×10¹⁸   1.806×10¹⁸   6.745×10¹⁷
    64      4.554×10¹⁸   1.058×10¹⁸   2.838×10¹⁷

Figure 10: Video 2: SSD

    Nodes   SOM/BSOC1    SOM/BSOC2
    16      3.75         1.17×10¹
    32      2.98         7.99
    64      4.30         1.60×10¹

Figure 11: Video 2: SOM SSD / BSOC SSD

BSOC2 outperformed both of the other algorithms very consistently, with the exception of the 16-node sonar experiment, in which its performance was very close to BSOC1. In the other two sonar experiments, it outperformed BSOC1 by about 65%. In the video experiments, BSOC2 consistently outperformed both the SOM and BSOC1. In the first video, it was about 3-4 times better than the SOM, while with the second video it was 8-16 times better.

All of these clustering algorithms show significant improvement in performance as additional nodes are added. This demonstrates that those engineering a robotic system incorporating a clustering algorithm do have a clearly defined trade-off between cycle time and clustering accuracy. This improvement was less consistent with the second video stream, as the 32-node solution was more favorable to BSOC than the 64-node solution.

Hypothesis 2  We ran a separate set of experiments to test Hypothesis 2, as all of the previous experiments trained the learner by showing the inputs in the order that they were acquired by the robot. While we consider that to be the expected deployment situation of these algorithms, the sensitivity to ordering is also worthwhile to measure. To this end, we trained the 64-node version of each algorithm on six different randomly generated permutations of the input samples.
We employ here the same performance metric as our earlier experiments, namely, the sum-of-squared-differences between each input and its closest matching cluster. Figure 12 shows the mean, standard deviation, and ratio for the sonar experiments, and Figures 13 and 14 show the same values for the video experiments.

    Algorithm   x̄            σ            σ/x̄
    SOM         4.585×10¹    1.008×10¹    2.200×10⁻¹
    BSOC1       6.695×10⁻¹   2.043×10⁻¹   3.051×10⁻¹
    BSOC2       2.477×10⁻¹   2.717×10⁻²   1.097×10⁻¹

Figure 12: Sonar, with permuted inputs

    Algorithm   x̄            σ            σ/x̄
    SOM         1.27×10¹⁸    1.455×10¹⁷   1.15×10⁻¹
    BSOC1       9.82×10¹⁷    1.990×10¹⁷   2.03×10⁻¹
    BSOC2       2.84×10¹⁷    6.421×10¹⁵   2.26×10⁻²

Figure 13: Video 1, with permuted inputs

    Algorithm   x̄            σ            σ/x̄
    SOM         1.13×10¹⁸    1.455×10¹⁷   1.29×10⁻¹
    BSOC1       3.74×10¹⁷    4.835×10¹⁶   1.29×10⁻¹
    BSOC2       1.41×10¹⁷    8.661×10¹⁵   6.12×10⁻²

Figure 14: Video 2, with permuted inputs

In all three cases, BSOC2 had a much tighter standard deviation than the other algorithms, showing a more consistent performance regardless of input ordering. In spite of this clear result, in all cases the performance on the non-permuted input is much worse. Figures 15, 16, and 17 highlight this phenomenon. For all of these algorithms, processing the inputs in the order they were acquired creates a highly idiosyncratic statistical outlier.

    Algorithm   Original     Permuted x̄   Original/Permuted
    SOM         7.361×10¹    4.585×10¹    1.606
    BSOC1       9.120×10⁻¹   6.695×10⁻¹   1.362
    BSOC2       5.473×10⁻¹   2.477×10⁻¹   2.209

Figure 15: Sonar: Comparisons of SSD

    Algorithm   Original     Permuted x̄   Original/Permuted
    SOM         2.702×10¹⁸   1.27×10¹⁸    2.12
    BSOC1       1.125×10¹⁸   9.82×10¹⁷    1.14
    BSOC2       6.765×10¹⁷   2.84×10¹⁷    2.38

Figure 16: Video 1: Comparisons of SSD

    Algorithm   Original     Permuted x̄   Original/Permuted
    SOM         4.553×10¹⁸   1.13×10¹⁸    4.02
    BSOC1       1.058×10¹⁸   3.74×10¹⁷    2.83
    BSOC2       2.838×10¹⁷   1.41×10¹⁷    2.01

Figure 17: Video 2: Comparisons of SSD

Potential Applications

We are in the process of developing several different robot learning systems that would employ BSOC as a component. Our first goal is to replicate the learning scheme described by (Touzet 2006). He employed two different SOMs to represent the robot's environment as sensed by a ring of sonars. One SOM was used to represent the environment itself, while the second SOM was used for modeling the result of action selection. The robot's goals are specified in terms of sensor values. The robot then attempts to move itself to a SOM state that is compatible with those values. We have found that while the underlying concept is compelling, there is not sufficient detail to represent the role of the second SOM. We believe that replacing the first SOM with a BSOC will result in improved modeling accuracy, while replacing the second SOM with a Markov chain should prove effective. We plan to employ Dijkstra's Algorithm to find paths in the Markov chain through the BSOC states that it discovers.

Once the above system proves workable with sonars, we plan to employ video as the primary input. We believe this will enable us to do two things. First, we hope to be able to define robot behaviors in terms of video input, potentially opening up a wider array of possibilities than those given by sonar. Second, we would like to investigate the possibility of using the nodes for small-scale navigation of an environment.

We are also interested in using BSOC as a component of a supervised-learning system. In this way, BSOC would enable a scheme along the lines of k-nearest-neighbor (Saunders, Nehaniv, and Dautenhahn 2006) to be implemented in constant time.

We speculate that this algorithm could be useful for informing human decision-makers in situations requiring rapid responses where sensors are widely distributed and submitting information. Examples of such scenarios could include environmental assessment in the wake of a disaster or any number of military or law-enforcement scenarios.¹

¹ We would like to thank one of our anonymous reviewers for suggesting these possibilities.

Conclusion

For the purpose of modeling a robot environment in constant time using a sequence of sensor inputs, we have shown that Bounded Self-Organizing Clusters (BSOC) is a compelling alternative to the Kohonen Self-Organizing Map, the existing approach, for both sonar and video inputs. The variation of BSOC that weights the results based on the input sources is a particularly clear winner in our experiments; we consider this to be the definitive variation of the algorithm.

We believe that improved clustering has great potential to improve robot learning as a component of several possible larger systems. We would also be interested in finding applications for this algorithm in other domains beyond robotics that would benefit from real-time clustering.

Significant additional work remains in the development of this algorithm. First, many robot vision systems employ preprocessed features rather than raw pixel data, with the goal of improving recognition robustness across many different robot poses. We plan to investigate using ORB (Rublee et al. 2011) for this purpose. Second, many practical learning situations involve supervised learning. We believe that BSOC could be used as a component of a real-time supervised learning system designed along the lines of k-nearest-neighbor. Third, a system that utilizes BSOC clusters and maintains references to those clusters will need to preserve information as clusters merge. We currently have a prototype system in place to address this employing the Observer design pattern, but have not yet conducted an assessment of its utility.

Anyone who would like to experiment with our code and data sets is welcome to do so. It is archived at https://github.com/gjf2a/maics2016. The code is in the public domain.

Acknowledgements

We would like to thank the three anonymous reviewers for many helpful comments on our initial submission. We would also like to thank Mark Goadrich for proofreading this paper and giving many helpful suggestions, most specifically the idea to assess performance based on random permutations of the inputs.

References

Ferrer, G. J. 2010. Encoding robotic sensor states for Q-Learning using the self-organizing map. Journal of Computing Sciences in Colleges 25(5).

Ferrer, G. J. 2014a. Creating visual reactive robot behaviors using growing neural gas. In Proceedings of the 25th Modern Artificial Intelligence and Cognitive Science Conference, 39–44.

Ferrer, G. J. 2014b. Towards human-induced vision-guided robot behavior. In Proceedings of the 2014 AAAI Fall Symposium: Knowledge, Skill, and Behavior Transfer in Autonomous Robots.

Forgey, E. 1965. Cluster analysis of multivariate data: Efficiency vs. interpretability of classification. Biometrics 21(768).

Fritzke, B. 1995. A growing neural gas network learns topologies. In Advances in Neural Information Processing Systems 7, 625–632. MIT Press.

Hapfelmeier, A.; Mertes, C.; Schmidt, J.; and Kramer, S. 2012. Towards real-time machine learning. In Proceedings of the ECML PKDD 2012 Workshop on Instant Interactive Data Mining.

Kohonen, T. 2001. Self-Organizing Maps. Springer, 3rd edition.

Koivistoinen, H.; Ruuska, M.; and Elomaa, T. 2006. A Voronoi diagram approach to autonomous clustering. In Proceedings of the 9th International Conference on Discovery Science, 149–160.

MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. In Proc. of the Fifth Berkeley Symposium on Math. Stat. and Prob., 281–296.

Provost, J.; Kuipers, B. J.; and Miikkulainen, R. 2006. Developing navigation behavior through self-organizing distinctive state abstraction. Connection Science 18(2):159–172.

Rublee, E.; Rabaud, V.; Konolige, K.; and Bradski, G. 2011. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the IEEE International Conference on Computer Vision.

Saunders, J.; Nehaniv, C. L.; and Dautenhahn, K. 2006. Teaching robots by moulding behavior and scaffolding the environment. In Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, HRI '06, 118–125. New York, NY, USA: ACM.

Smith, A. 2002. Applications of the self-organizing map to reinforcement learning. Neural Networks 15:1107–1124.

Touzet, C. 1997. Neural reinforcement learning for behavior synthesis. Robotics and Autonomous Systems 22(3-4).

Touzet, C. 2006. Modeling and simulation of elementary robot behaviors using associative memories. International Journal of Advanced Robotic Systems 3(2):165–170.