Focused Exploration for Cooperative Robotic Watercraft Andrea Jeradi? , Masoume M. Raeissi? , Alessandro Farinelli? , Nathan Brooks† , Paul Scerri† ? Computer Science Department, University of Verona, Italy † Platypus LLC, Pittsburgh, U.S.A. Environmental monitoring is one of the main societal challenge of the last century. This is particularly important when considering the increasing popula- tion growth and the industrialization of developing countries. Information and Communication Technology (ICT) and robotics can play a significant role in this perspective as they offer invaluable tools to continuously gather relevant infor- mation over wide, possibly dangerous areas. In particular autonomous boats hold great promises for monitoring water bodies (lakes, rivers, fishfarms) or providing first aid in floods. However, a key feature for such systems to have a practical impact on society is to provide accurate measures over large areas while being cost effective. In this work, we focus on a water monitoring system, based on Cooperative Robotic Watercrafts (CRW), commercialised by Platypus LLC 1 . The CRW approach is based on the use of low cost, small, robotics plat- forms, which are able to interact with each other and make use of various ar- tificial intelligence methods to ensure a significant level of autonomy for navi- gation and information gathering tasks. Figure 1(a) shows a differential drive propeller versions. In addition to a battery based propulsion mechanism, each boat is equipped with an Android OS smartphone, custom electronics board, and sensor payload. The Android smartphone provides communication, GPS, compass, and multi-core processor. The Arduino Mega based electronics board receives commands from the Android phone and interfaces with the propulsion mechanism and sensor payload, as shown in Figure 1(b). The electronics board supports a wide variety of devices including acoustic doppler current profilers and sensors that measure temperature, Dissolved Oxygen (DO), and pH level. A crucial aspect for the CRW is the high level of autonomy of the platforms, which can be controlled by few human operators that provide high level instruc- tions to the system. In particular, Figure 1(c) shows the Graphical User Interface that the human operators can use to control the robotic platforms. Boats be- haviors are expressed in the form of high level plan encoded as Colored Petri Nets [1]. In this example, the human operator activated an explore area plan providing the shape of the area that must be explored. The platforms will then automatically divide the area in different sections and execute a pre-specified strategy to monitor the area. Figure 1(d) shows an example of the standard lawnmower strategy employed by the boats. Notice that such strategy is decided off-line, i.e., before the system acquires any readings, and as such it does not depend on the data that the platforms are collecting. 1 see http://senseplatypus.com/ for more info (a) (b) (c) (d) Fig. 1. 1(a) Twin propeller platform; 1(b) System architecture; 1(c) GUI for controlling the boat; 1(d) Lawnmower strategy for monitoring Now, in many situations it is important to perform an on-line monitoring strategy to focus on relevant parts of the area depending on reading acquired while executing the mission. For example, we might want to provide accurate readings on parts of the areas that exhibit a critical/interesting pattern (i.e., a high DO value or a high PH level). The contribution of this paper is to propose a new method for cooperative exploration, providing an approach to collect relevant information while mini- mizing costs. More in detail, this paper provides the following contributions to the state of art: i) we use a Gaussian Process (GP) to model the spatial and temporal dynamics of the target phenomena. GPs are powerful approaches for Bayesian inference [2, 3], and here we use a GP for predicting the possible values of unvisited locations given the values we measured. If the predicted value for an area falls within a specified range, we consider such area to be of interest for the exploration strategy; ii) we use the Marching Square algorithm to cluster lo- cations that have an interesting predicted value, by building a “border” around such area (called an isoline); iii) we cast the exploration process as a task assign- ment problem, where interesting areas (i.e., isolines) are allocated to boats that must acquire readings to increase the prediction accuracy of the GPs. We define several cost functions to drive the task allocation process and we use two task assignment methods: Greedy, which provides a basic approach for comparison and the Hungarian method (proposed by H. W. Kuhn in 1955 [4] and refined by J. Munkres in 1957 [5]), which provides the optimal allocation; iv) finally, we per- form an extensive empirical evaluation with the CRW simulating environment. Our empirical evaluation compares the various cost functions we devised for task assignment and shows that our approach significantly outperforms the previous lawnmower method in terms of prediction accuracy over interesting areas. In more detail, our approach is based on the following main ingredients: Environment Representation and Problem Formulation: Our goal is to minimize the predicted error on relevant points. These points are defined based on a-priori knowledge on the environment (i.e. DO > 7.0). To model the environment we use a regular grid, and we assign to each cell the average of all the readings in that area. Figure 1(d) shows the grid used in our experiments which represent the DO values obtained by a robotic boat operating in a coastal areas in Doha, Qatar. Black cells represent relevant readings (i.e. DO > 7.0). Gaussian Process for prediction: we use GP to predict value of cells. Specif- ically, the operation is divided in a learning phase and a prediction phase. In the learning phase the platforms acquire measures populating the training set. In contrast, in the prediction phase the GP is used to predict values for all unknown locations (i.e, the testing set). Such predicted values, are processed by the Marching Square Algorithm to cluster cells with the same value into isolines. When the isolines are created, they are assigned to platforms. Then platform move towards their assigned isolines and acquire new measures effec- tively executing another learning phase (and populating the training set with new data). The process is repeated until the mission is over. As the training set is populated the prediction accuracy of the Gaussian Process is increased and the goal of our approach is to assign boats to isoline so to minimize the prediction uncertainty of the GP for the regions of interest, in particular in each iteration we want to minimize the root qmean Pn square2 error for such regions, formally we (ŷt −yt ) want to minimize RM SE = t=1 n , where yt is the real value of a cell that belongs to the interesting regions (obtained from the entries of the log-file populated by the robotic boat operating in Qatar) and ŷt is the predicted value of the same cell (computed by the GP)2 . Task Assignment: the task assignment process operates over the isolines built by the marching square algorithm. Specifically, each isoline is a task that must be assigned to a single boat. To perform the task assignment we use two centralized approaches: the Hungarian method (which computes the optimal allocation for our task assignment problem) and a greedy method, which simply iterates over all tasks and all boats in sequence allocating the current task to the boat that has the minimum cost. The cost of assigning a boat i to a task j is defined by a cost function Ci (xj ). We used four types of cost functions: distance Ci (xj ) = di (xj ) which directly consider the travel distance from the current position of the boat to to closest point on the isoline ; variance Ci (xj ) = kdi (xj )k − kv(xj )k, where the variance of an isoline is the value obtained from the sum of the variances of each point (taken from the Gaussian process) composing the isoline related to task xj . This gives an indication of the uncertainty of values in that isoline 2 The GP has been implemented in matlab and after a tuning phase we used the covMaterniso (with d = 3) as covariance function. and hence it indicates how valuable it is for a boat to consider that task. Both distance and variance are normalized with respect to all isolines; length Ci (xj ) = kdi (xj )k−kl(xj )k, where l(xj ) is the normalised length of the isoline. By looking at isolines with a greater length, the boats should gather more information on the environment; Entropy Ci (xj ) = kdi (xj )k − kρ(xj )k, where kρ(xj )k is the normalized reduction of entropy of an isoline. The reduction of entropy can be computed by using the predicted data from the GP and provides an indication on the value of information for the cells associated to the isoline [6]. (a) (b) Fig. 2. 2(a) Comparison with standard explore area and new approach; 2(b) Results for the Hungarian Method with different cost functions We performed experiments as follows: i) we read the log file, and we map this on the area where boats can operate; ii) we create three boats which always start from the same point on the map; iii) the boats are sent to three specific points, to collect the inintial training set; iv) our cooperative exploration procedure is then performed for 30 iterations. Figure 2(b) reports the RMSE at the last iteration when varying the cost function for the Hungarian Method. Results shows that the entropy cost function is the ones that minimizes the RMSE, moreover (as expected) the Hungarian method performed better than the greedy approach (data not reported here). Figure 2(a) compares three methods: i) the standard lawnmower approach, ii) the Greedy method using distance as cost function (worst performing among our approach) and iii) the Hungarian method with entropy reduction as cost function (best performing). Our approach (even in the worst case) significantly outperforms the previous lawnmower strategy and the Hungarian method with entropy cost function further improves the RMSE. Overall, the proposed approach provides a promising method for monitoring water bodies when there is a-priori information on interesting range of values for the data to be acquired. More in general, this is a first important step to devise on-line autonomous water monitoring strategies for low-cost, robotics platforms to provide more accurate data. Future work includes investigating methods to explore new areas of the environment (instead of focusing only on existing iso- lines) to avoid the initial pre-determined training phase and fully address the exploration-exploitation trade-off typical of learning approaches. References 1. Farinelli, A., Marchi, N., Raeissi, M.M., Brooks, N., Scerri, P.: A mechanism for smoothly handling human interrupts in team oriented plans. In: Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems. AAMAS ’15, Richland, SC, International Foundation for Autonomous Agents and Multiagent Systems (2015) 377–385 2. Bishop, C.M., et al.: Pattern recognition and machine learning. Volume 4. springer New York (2006) 3. Rasmussen, C.E.: Gaussian processes for machine learning. (2006) 4. Kuhn, H.W.: The hungarian method for the assignment problem. Naval research logistics quarterly 2(1-2) (1955) 83–97 5. Munkres, J.: Algorithms for the assignment and transportation problems. Journal of the Society for Industrial & Applied Mathematics 5(1) (1957) 32–38 6. Ruben, S.: Decentralised Coordination of Information Gathering Agents. PhD thesis, University of Southampton - Faculty of Engineering and Applied Science and School of Electronics and Computer Science (2010)