Deep Transfer Learning of Traversability Assessment for Heterogeneous Robots

Josef Zelinka, Miloš Prágr, Rudolf Szadkowski, Jan Bayer and Jan Faigl

Czech Technical University in Prague, Faculty of Electrical Engineering, Technická 2, 166 27, Prague, Czech Republic

Abstract

For autonomous robots operating in an unknown environment, it is important to assess the traversability of the surrounding terrain to improve path planning and decision-making on where to navigate next in a cost-efficient way. Specifically, in mobile robot exploration, terrains and their traversability are unknown prior to the deployment. The robot needs to use its limited resources to learn its terrain traversability model on the go; however, reusing a provided model is still a desirable option. In a team of heterogeneous robots, the models assessing traversability cannot be reused directly since robots might possess different morphology or sensory equipment and thus experience the terrain differently. In this paper, we propose a transfer learning approach for convolutional neural networks assessing the traversability between heterogeneous robots, where the transferred network is retrained using data available for the target robot to accommodate itself to the robot’s traversability. The proposed method is verified in real-world experiments, where the proposed approach provides faster learning convergence and better traversal cost predictions than the baseline.

Keywords

heterogeneous robots, transfer learning, neural networks, traversability assessment

1. Introduction

Our work is motivated by autonomous tasks such as mobile robot exploration, where robots encounter terrains that might impede their movement but have unknown properties due to the nature of the mission.
In such deployments, the robots can improve the efficiency of their navigation by learning the terrain properties incrementally during the mission. Further, we can reason about distributing the exploration to multiple robots to finish the mission faster. For a heterogeneous team of robots, each robot can be assigned to a suitable part of the environment, such as a small crawler exploring tight spaces, while larger and faster robots can be assigned to open areas. However, robotic platforms with varying builds and sensory equipment have different terrain perceptions; hence, each platform needs to learn standalone terrain assessment models. The knowledge transfer approach can reduce the complexity of training and maintaining multiple standalone models.

Transfer learning is a part of the machine learning principles that aim to improve the performance in the target domain by the experience in the source domain [1]¹.

ITAT’22: Information technologies – Applications and Theory, September 23–27, 2022, Zuberec, Slovakia

Figure 1: Assuming that the cost assessment model for the teacher is available, the teacher’s model is transferred to the student and modified to accept the student’s observation format indicated by the different colormaps. Then, the transferred model continues learning using the student’s observations to be informed about the student’s experiences. (Diagram labels: teacher’s model, transfer of model, transferred student’s model, terrain observation, terrain observation using different sensors, continuous learning, predicted cost.)

In this paper, we propose to utilize transfer learning to share terrain traversability assessment models between heterogeneous robotic platforms, as illustrated in Figure 1. The individual models are neural networks that predict a continuous score describing the difficulty of the terrain traversal. The student’s neural network is initialized by weights transferred from the teacher, and then the neural network is tuned using the data obtained in the student’s target domain.
If the dimensions of observations are unequal due to different sensors being used by the teacher and student, the input dimensionality is reduced (or increased) by additional neural network layers. The proposed approach is compared to the baseline, learned only in the target domain, using a robot with heterogeneous terrain experience simulated by various traversability assessment methods. Besides, the feasibility of the approach is validated in the experimental scenario with real heterogeneous robots.

The rest of the paper is organized as follows. The related work on traversability assessment and transfer learning is briefly reviewed in Section 2. In Section 3, the problem of traversability transfer between heterogeneous robots is presented. Section 4 describes the proposed approach for the transfer of traversability assessment experience. The evaluation results of the proposed approach are presented in Section 5. Section 6 concludes the paper.

¹ The source domain (teacher) denotes the entity providing knowledge to the individual in the target domain (student).

Email: zelinjo1@fel.cvut.cz (J. Zelinka); pragrmi1@fel.cvut.cz (M. Prágr); szadkrud@fel.cvut.cz (R. Szadkowski); bayerja1@fel.cvut.cz (J. Bayer); faiglj@fel.cvut.cz (J. Faigl)
ORCID: 0000-0003-2015-2338 (J. Zelinka); 0000-0002-8213-893X (M. Prágr); 0000-0003-4075-116X (R. Szadkowski); 0000-0003-1190-1085 (J. Bayer); 0000-0002-6193-0792 (J. Faigl)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

2. Related work

The traversability assessment is to support path planning and decisions, such as avoiding impassable terrain or optimizing the path for the specific needs of the robot [2]. Correct traversability estimation is essential for applications where the robot encounters different terrains, including dangerous environments. Such fields are represented by extra-terrestrial exploration [3, 4, 5], search and rescue missions [6], and agriculture or off-road driving [7]. In [2] and [8], the authors provide a thorough overview of traversability assessment methods suitable for mobile robots. Therefore, we focus our brief review on recent neural network approaches, which provide appearance-based traversability predictions and utilize image-processing and classification methods.

In [5], a fully convolutional neural network is utilized to locate the best possible place for the rover to land by classifying multiple terrain types and has been used for the Mars rover mission. An alternative architecture is proposed in [9], where traversability predictions on future paths are achieved using Generative Adversarial Networks (GANs) [10] that create virtual images from the already traversed path. However, a vast dataset is needed to train a neural network from scratch, fully highlighting the need for an approach capable of enhancing performance with a smaller dataset.

For such cases, transfer learning can be employed, which is an approach capable of improving knowledge in the target domain by the transfer from the source domain [1]. In [11], the authors utilize transfer learning in the form of the weight transfer from remotely similar tasks to reduce the dataset size in the target domain necessary to train the convolutional neural network and thus shorten the training time. After transferring weights between tasks, it is desirable to fine-tune them to suit the task’s needs in the target domain. The classification feature extractor output layers are reinitialized in [12] to comply with the possibly different classes in the target domain. If the source and target tasks vary greatly, an entire redesign of the classifier’s architecture may be employed as in [13], where it is advocated that the learning rate of the neural network is set lower when applying only slight corrections to weights during the fine-tuning. While fine-tuning, it can be beneficial to refrain from updating the weights in some layers (denoted as layer freezing) [14].

The costly data collection and various tasks and robots’ bodies make robotics an interesting field for deploying transfer learning techniques. Although testing only in simulators, the authors of [15] utilize transfer learning to propagate experience in different scenarios of robotic soccer, where they propose a solution for transferring neural networks between tasks with different inputs and action spaces. In [16], humanoid robots observe human gestures and motions to replicate them later. Another method to transfer human experience to neural networks utilized by robots is presented in [17], where humans provide knowledge directly to the network. A robotic arm is trained to reach a destination of a colored block in [18]. The transfer is carried out between robotic arms with a different number of joints.

However, since the traversability over a single terrain may greatly vary between robotic platforms, such as wheeled and legged ground vehicles [19], the obtained knowledge cannot be shared directly between different robots. Therefore, we aim to utilize transfer learning to distribute a traversability assessment model consisting of a neural network.

3. Problem Statement

We examine various robots $R^i$ perceiving diverse terrains $T$ during operational usage. Let the robots be deployed in an environment modeled as the 2.5D grid, where each cell $\upsilon$ can be labeled by a number, and thus $\upsilon \in \mathbb{N}$, and the cell size $d_\upsilon$ corresponds to the footprint of the smallest robot in the team. The center of each robot’s footprint is discretized as the cell $\upsilon_{\mathrm{robot}} \in \mathbb{N}$. Robots move through the environment along paths $\psi$ that are represented as sequences of neighboring cells $\upsilon_1, \ldots, \upsilon_n$ corresponding to the robot’s discretized positions.

The robot $R^i$’s path-planning decisions are made with respect to (w.r.t.) the particular robot’s cost $c^i$ by finding a path with the minimal expected cost

$$\psi^{*,i} = \operatorname*{argmin}_{\psi \in \Psi(\upsilon, \upsilon')} \sum_{\upsilon_j \in \psi} c^i(\upsilon_j), \tag{1}$$

where $\Psi(\upsilon, \upsilon')$ is a set of all paths from $\upsilon$ to $\upsilon'$. However, the cost function $c^i$ is not known a priori; thus, the robot has to learn it to estimate the cost by $\hat{c}^i$. Hence, a traversability assessment model $r^i$ is needed that assigns the predicted cost $\hat{c}^i$ for a terrain $T$ observed using exteroceptive sensors as

$$r^i : A \to \hat{c}^i, \tag{2}$$

where $A$ is the observed terrain appearance.

Since the mobile robot’s traversability is considered too complex to be assessed using a handcrafted function, a model $r^i$ is trained from the robot’s experience to predict its future cost $\hat{c}^i$. The costs utilized for training are computed using proprioceptive sensors because of their ability to measure how the environment influences the robot’s body.

which each robot can compute using its cost computation method. All the cost computation methods return strictly positive values because zero and negative values would incentivize infinite paths, preventing the robot from reaching its goal. We consider the following cost computation methods.
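The cost computation methods of Section 4.1 and the adjustment of Equation (4) can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors’ implementation: the function names and the value `C_MAX = 10.0` are hypothetical (the text does not give $c_{\max}$), the velocity cost is read as the ratio of the commanded to the achieved speed, and a stuck robot with zero achieved speed is given an unbounded raw cost.

```python
import math

C_MAX = 10.0  # hypothetical saturation value c_max; not given in the text


def velocity_cost(v_achieved: float, v_commanded: float) -> float:
    """c_v, read here as commanded speed divided by achieved speed (assumption)."""
    if v_achieved <= 0.0:
        return math.inf  # assumption: a stuck robot gets an unbounded raw cost
    return v_commanded / v_achieved


def slope_cost(theta_deg: float) -> float:
    """c_s = 1 + theta, theta being the angular distance in degrees from the straight pose."""
    return 1.0 + theta_deg


def difference_cost(gamma_deg: float) -> float:
    """c_d = 1 + gamma, gamma being the maximal angular distance in degrees of subsequent positions."""
    return 1.0 + gamma_deg


def adjust(cost: float, c_max: float = C_MAX) -> float:
    """Equation (4): c_adjusted = c_max * tanh(c / c_max)."""
    return c_max * math.tanh(cost / c_max)


if __name__ == "__main__":
    examples = [
        ("velocity, half of commanded speed", velocity_cost(0.05, 0.10)),
        ("velocity, robot stuck", velocity_cost(0.0, 0.10)),
        ("slope, flat terrain", slope_cost(0.0)),
        ("difference, 25 degrees", difference_cost(25.0)),
    ]
    for name, raw in examples:
        print(f"{name}: raw={raw:.2f} adjusted={adjust(raw):.2f}")
```

Since tanh is positive for positive arguments and never exceeds one, every adjusted cost in this sketch stays strictly positive and at most `C_MAX`, which is consistent with the requirement that costs must not be zero or negative.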
The training of each traversability model $r^i$ aims to minimize the Root Mean Square Error (RMSE) between the model’s cost assessments and the cost $C^i$ measured using proprioceptive sensors as

$$\mathrm{RMSE}^i = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left( r^i(A(T_j)) - c^i_j \right)^2}. \tag{3}$$

We address the traversability assessment for heterogeneous robotic platforms $R^1 \neq R^2$, which possess different capabilities. Therefore, we assume that differences between the platforms can result in unequal cost measurements $c^1_j \neq c^2_j$ over some terrain $T_j$. On the other hand, we assume that there are underlying similarities between how the robots interact with the terrain. The problem being addressed is to improve the performance of traversability assessment by transferring the cost assessment model $r^1$ from the robot $R^1$ before learning on the robot $R^2$, and thus learning its cost assessments relatively sooner while achieving similar or better predictions than the regressor $r^2$ trained using only the $R^2$’s data.

4. Method

In the proposed approach for transferring mobile robot terrain traversal experience between the heterogeneous robots, the traversability experience is denoted as the traversal cost $c$. Then, for each robot, the costs are predicted using the robot’s regressor $r$. Each regressor is a neural network trained using the robot’s prior traversal costs associated with the description of the particular terrain where the cost was experienced. The teacher’s terrain experience, represented as the teacher’s learned network, is transferred to the student who has no prior terrain experience by using the teacher’s weights to initialize the student’s network. After the transfer, the student’s network is further trained to adapt to the student’s domain fully.

In the rest of this section, we describe in detail the traversal costs used by the individual robots, the regressor, its learning process, and the knowledge transfer.

4.1. Robots’ Traversal Costs

The traversability assessment regressor is trained on the trails that include observations of the traversed terrain paired with the cost perceived over the terrain,

Velocity $c_v$ - monitors the relation of the achieved speed $v$ and the commanded velocity $v_{\mathrm{cmd}}$ using the equation $c_v = v_{\mathrm{cmd}} / v$.

Slope $c_s$ - computes the cost $c_s = 1 + \theta$ as an angular distance $\theta$ in degrees from the straight pose. The offset by 1 is to accent the energy expenditure even on flat terrains.

Difference $c_d$ - is defined similarly to $c_s$ as $c_d = 1 + \gamma$, where $\gamma$ expresses the maximal angular distance in degrees of the subsequent robot’s positions.

Further, the costs are adjusted as

$$c_{\mathrm{adjusted}} = c_{\max} \cdot \tanh \frac{c}{c_{\max}}, \tag{4}$$

where $c$ stands for $c_v$, $c_s$, and $c_d$. The adjustment is to lower the high-cost values in cases where the achieved velocity is negligible when compared with the commanded velocity because the robot gets stuck.

Tilt $c_a$ - is computed as $c_a = \alpha = \left| \tan \frac{d_z}{d_{xy}} \right|$, where $\alpha$ is the absolute angle of the two opposing footholds (for the legged robots) from the flat surface. For example, the left front and right rear legs are considered opposing. The values $d_z$ and $d_{xy}$ measure a difference in elevation and flat-plane distance of the footholds, respectively.

4.2. Traversal Cost Regressor

The cost regressor $r$, based on the regressor proposed in [20], uses the terrain appearance and geometry to assess the robot’s traversal cost. The regressor is a neural network that uses the image processing-like architecture shown in Figure 2 and operates as follows.

During the deployment, the robot uses its range measurements to build a height map $\mathbb{N}$ of the mission environment in the form of an elevation grid map with the square cell size of 7.5 cm [21]. Depending on the carried sensory equipment, the grid map may also include the terrain color in addition to the elevation information, which can be further utilized in regression and extrapolation of the learned traversal experience.

The regressor is learned from the cost measurements localized in $\mathbb{N}$ paired with the terrain observations at the respective locations. Each terrain observation is in the form of a $w \times w \times n$ segment centered at the location, where $w$ is the observation window size selected so that the entire robot’s footprint is covered. The dimensionality $n$ is either 1 when only range measurements are available, or 3 in the case range measurements are accompanied by the a and b channels of the Lab color space. The network is learned using the Adam optimizer w.r.t. the mean absolute percentage error

$$\Lambda = 100 \left| (y_t - y_p) \frac{1}{y_t} \right|, \tag{5}$$

where $y_t$ is the expected output of the neural network and $y_p$ the prediction.

Figure 2: Architecture of the regressor, where $n$ is the number of terrain observation’s channels entering the neural network and the observation window width is $w = 8$. (Diagram labels: terrain observation of size 8 × 8 × n; layers conv1, conv2, conv3, flatten, fc1, fc2, fc3; annotated sizes 3 × 3 × 8, 1 × 1 × 16, and 32; predicted cost. Legend: convolutional + PReLU + batch norm; max pooling; global average pooling; fully connected + PReLU + dropout + batch norm; fully connected.)

4.3. Knowledge transfer

During the knowledge transfer, the weights from the source domain are utilized in the target domain. Besides, when needed, the transferred model is adjusted as shown in Figure 3. The width $w$ of the input window is prepared separately since it is selected to fit the robot receiving the model. However, the observation dimensionality can differ between teachers’ and students’ exteroceptive sensors. In such a case, an additional convolutional layer is used to reshape the input and accommodate the transferred regressor to the student’s perceived data. The layer comprises a 1 × 1 convolutional kernel with the input and output channels corresponding to the perceived number of channels and the number of transferred regressor’s input channels, respectively.

The new model is retrained using the dataset collected by the student. This is because the teacher’s and student’s costs are heterogeneous, although it is assumed that the network captures underlying terrain properties that can be transferred. Additionally, during the retraining of the regressor, $l$ layers can be frozen by fixing their weights since it is assumed that the initial layers extract general features that are primarily similar between various data.

Figure 3: Setup used during the transfer of knowledge, where $l$ denotes the frozen layers during the training. (Diagram labels: terrain observation, reshape, input, layers to freeze, transferred regressor, predicted cost.)

5. Results

The proposed knowledge transfer method has been examined in several experimental scenarios. First, we simulate the heterogeneity of the robots using a small hexapod crawler with varying cost perception, which provides an easy way to verify the feasibility of the proposed approach. Then, we display the proposed knowledge transfer using two different real robots.

Figure 4: (a) SCARAB II hexapod crawler and (b) Spot with sensor payload.

The proposed method can be used with any set of ground vehicles. However, we focus on multi-legged robots since their traversal capabilities permit deployment in a wide range of terrains. The utilized robots are the small hexapod crawling robot SCARAB II [22] and the four-legged Spot that are depicted in Figure 4. Besides their morphology, the robots also differ in size. Spot’s footprint is larger than that of SCARAB II, which occupies a disk with a 25 cm radius, while Spot’s footprint fits into a 1.1 m × 0.5 m rectangle. Furthermore, SCARAB II carried the Intel RealSense Tracking Camera T265 and the RGB-D Intel RealSense Depth Camera D435 providing depth and color appearance exteroceptive data. Spot perceived only range measurements using the Ouster OS0-128 LiDAR and does not perceive color.

5.1. Cost Assessment Methods Examination

The feasibility of the proposed method is first verified in a scenario where the difference in perception of heterogeneous robots is simulated using various cost assessment methods of SCARAB II. The robot collected the datasets in the Bull Rock Cave near Brno, Czech Republic, as described in [20]. The datasets were collected in various parts of the cave system, and each set is a result of the robot walking over one of the particular terrains, whose selection is shown in Figure 5. Each dataset collection included approximately 5 minutes of the navigation, enabling the robot to observe 6 × 6 m of the environment.

Figure 5: Test terrains in the Bull Rock Cave in the (a) Chiffon, (b) Hall, and (c) Room areas.

The transfer learning approach is examined using five scenarios for each pair of the cost assessment methods. The scenarios are prepared by randomly choosing five datasets from the collected dataset pool as testing data for the direct and transferred model. The testing datasets are removed from the datasets available to train the teacher’s and the student’s cost assessment models. From the remaining datasets, twelve are randomly drawn for the teacher and five for the student to create training datasets for their cost assessment models. All regressors are trained for 300 epochs with a training-validation split of 9-to-1, and the width of the observation window is $w = 8$.

Table 1: Mean (std) of the RMSE for 5 randomly generated scenarios for each pair of cost assessment methods. For the transfer regressors, the number denotes the frozen layers $l$ during the retraining of the transferred model.

Scenario         Direct        Transfer 0    Transfer 4    Transfer 8
$c_d \to c_s$    2.10 (0.27)   2.08 (0.27)   2.27 (0.58)   2.00 (0.19)
$c_d \to c_v$    3.25 (1.37)   1.68 (0.57)   1.43 (0.32)   1.95 (1.32)
$c_s \to c_d$    2.49 (1.33)   1.43 (0.16)   1.34 (0.22)   1.40 (0.16)
$c_s \to c_v$    2.83 (0.95)   2.19 (0.11)   1.98 (0.40)   2.01 (0.45)
$c_v \to c_d$    3.41 (1.41)   1.60 (0.15)   1.52 (0.23)   1.34 (0.08)
$c_v \to c_s$    2.58 (0.57)   2.36 (0.65)   2.27 (0.28)   1.98 (0.09)
Overall          2.78 (0.98)   1.89 (0.32)   1.80 (0.34)   1.78 (0.38)

The results presented in Table 1 show that the model transfer lowered the RMSE of predictions on the testing datasets. The best results are achieved with $l = 8$ frozen layers.

In Table 2, the transfer from $c_v$ to $c_s$ is chosen to examine the influence of training for an increased number of training epochs. The same randomly generated transfer scenarios are utilized as in Table 1; however, the models are trained for 600 epochs instead of 300. The results show that the direct student’s model has improved with the increased number of epochs, while the RMSE of the transferred model slightly increased during the prolonged training, likely because of overfitting the training data. The transfer model learned with 300 epochs has achieved sufficient performance, and the results suggest that the transfer helps reduce the necessary number of training epochs.

Table 2: RMSE’s mean (std) values between 300 and 600 epochs of the $c_v \to c_s$ transfer.

Epochs   Direct        Transfer 0    Transfer 4    Transfer 8
300      2.58 (0.57)   2.36 (0.65)   2.27 (0.28)   1.98 (0.09)
600      2.13 (0.28)   2.31 (0.38)   2.25 (0.30)   2.25 (0.21)

5.1.1. Transfer between Slope $c_s$ and Velocity $c_v$ Cost Assessment Method

We further examine the transfer between the student’s slope $c_s$ and teacher’s velocity $c_v$ cost assessment method in detail as those methods compute cost using dissimilar approaches. The dataset is split so that the student’s dataset is overall a third of the size of the teacher’s, hence suitable to showcase the knowledge transfer as there is much information to be received by the student. Teachers’ and students’ direct baseline cost assessment models were trained for 300 epochs. After the teacher’s model was transferred to the student, it was tweaked using 300 epochs and $l = 0$ frozen layers to create the transferred model. Figure 6a summarizes the first 300 epochs in the student’s domain, where an initial boost can be observed for the transferred model. The prolonged training for 1000 epochs, shown in Figure 6b, results in the improved validation loss achieved by the student’s model. After training for 300 epochs, the RMSE of the regressors’ predictions against the collected ground truth is 3.69 and 1.84 for the direct and transferred models, respectively. Hence, we can conclude that the transferred model improved performance faster.

Figure 6: Progress of the cost assessment model’s neural network training for (a) 300 and (b) 1000 epochs. (Plots: loss over epochs for the Direct and Transfer + Direct models.)

with artificial grass and spikes in the form of soundproofing material. Outdoors, the robot traversed various surfaces such as hard-packed soil, cobbles, and sloped grass. Spot can move faster than SCARAB II; thus, the Spot’s datasets are longer, as Spot traverses more terrain in a similar 5-minute long deployment, where Spot is capable of traveling through a 15 × 15 m environment. Therefore, using fewer datasets to train the cost assessment model is sufficient.
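The scenario construction described in Section 5.1 (five testing datasets removed from the pool, then twelve datasets drawn for the teacher and five for the student, with a 9-to-1 training-validation split) can be sketched as follows. This is a minimal illustration under assumptions: the function names and the pool of 30 dataset identifiers are hypothetical, the teacher and student draws are made independently so they may overlap (the text does not say whether they are disjoint), and the split is applied to individual samples.

```python
import random


def make_scenario(dataset_ids, n_test=5, n_teacher=12, n_student=5, seed=None):
    """Draw one evaluation scenario from a pool of dataset identifiers."""
    rng = random.Random(seed)
    pool = list(dataset_ids)
    test = rng.sample(pool, n_test)
    held_out = set(test)
    # Testing datasets are removed before the training datasets are drawn.
    remaining = [d for d in pool if d not in held_out]
    # Assumption: teacher and student sets are drawn independently and may overlap.
    teacher = rng.sample(remaining, n_teacher)
    student = rng.sample(remaining, n_student)
    return {"test": test, "teacher": teacher, "student": student}


def train_validation_split(samples, validation_fraction=0.1, seed=None):
    """Shuffle the samples and split them 9-to-1 into training and validation parts."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    # Five scenarios per pair of cost assessment methods; the pool size is hypothetical.
    for k in range(5):
        scenario = make_scenario(range(30), seed=k)
        print(k, sorted(scenario["test"]), len(scenario["teacher"]), len(scenario["student"]))
```

Seeding each scenario makes it possible to reuse the same scenarios when the number of training epochs is changed, as is done for Table 2.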
Figure 7: Predicted costs by the student’s and transferred neural network compared to the ground truth after training for 300 epochs. (Plot: cost over steps for the Direct and Transfer + Direct models and the ground truth.)

Moreover, we examined the traversal of a single cave trail. Figure 7 shows the predicted costs by the student’s direct and transferred models compared to the collected ground truth. We can observe that the transferred model follows the ground truth better than the direct model.

Figure 8: (a) Direct and (b) transferred model’s cost assessments of the perceived environment after training for 300 epochs, and (c) the environment’s height map, where the path of the robot is in red. The shown maps have square cells with the size of 7.5 cm. (Panels: (a) Direct model, (b) Transfer model, (c) Height map; axes in cells, color scales for cost and height [m].)

Figure 8 shows the cost prediction for the entire environment observed from the trail. Note that the ground truth for the whole view is unavailable as the robot traversed just a single path (the trail). Thus, we manually evaluate the cost assessments’ feasibility for path planning. Compared to the direct model, the transferred model returns higher costs in locations where the elevation of the height map changes, suggesting that the transferred model produces improved assessments.

5.2. Transfer between SCARAB II and Spot

The dataset comprising heterogeneous robots is created by adding the Spot’s datasets to the SCARAB II’s data.

In the following paragraphs, we examine the performance when transferring both from Spot to SCARAB II, and in the opposite direction.

5.2.1. Spot Knowledge Transfer to SCARAB II

The transfer from Spot to SCARAB II is achieved using observation windows with the width of $w = 8$ cells, which is suitable for the smaller hexapod crawler. Each transfer scenario, comprising transfer from Spot to one of the hexapod’s cost models, is evaluated in 5 setups. For each setup, 5 datasets are randomly chosen to train the Spot’s teacher model, while SCARAB II receives 6 randomly chosen datasets. The trained models are examined on 5 randomly chosen datasets, which differ from the training sets. The regressors are trained for 300 epochs with a 9-to-1 training-validation split.

SCARAB II models the environment as a colored height map, while Spot uses only a height map. Thus, we consider the student’s model input with both $n \in \{1, 3\}$ channels in each transfer scenario. When using the three-channel version, which perceives both the elevation and the a and b channels of the Lab color space, a convolutional layer reshaping the input is added to accommodate the teacher’s (Spot’s) model that has only one input channel.

Table 3: Mean (std) of the RMSE for knowledge transfer from Spot to SCARAB II. Transfer $x$ denotes $x$ frozen layers during retraining the regressor, $x \in \{0, 4\}$. Depth denotes $n = 1$ input channel for the resulting model perceiving just the height map, Depth + ab denotes $n = 3$ where SCARAB II utilizes colored observations.

Observ.      Scenario             Direct        Transfer 0    Transfer 4
Depth        Spot $\to c_d$       2.13 (0.59)   1.45 (0.26)   1.45 (0.27)
Depth        Spot $\to c_s$       2.50 (0.72)   2.29 (0.35)   2.28 (0.53)
Depth        Spot $\to c_v$       3.80 (1.52)   2.51 (1.04)   2.31 (0.85)
Depth        Overall              2.81 (0.94)   2.08 (0.55)   2.01 (0.55)
Depth + ab   Spot $\to c_d$       2.44 (1.19)   1.60 (0.30)   1.53 (0.60)
Depth + ab   Spot $\to c_s$       2.80 (1.49)   2.15 (0.21)   2.05 (0.13)
Depth + ab   Spot $\to c_v$       2.37 (0.86)   2.34 (1.06)   1.87 (0.87)
Depth + ab   Overall              2.54 (1.18)   2.03 (0.52)   1.81 (0.54)
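The input-reshaping layer mentioned for the three-channel observations, a 1 × 1 convolution mapping the perceived channels to the input channels of the transferred regressor, can be illustrated without any deep learning library. The sketch below is illustrative only: `conv1x1` and the example weights are hypothetical, and in the described method the layer’s weights would be learned during retraining rather than set by hand.

```python
def conv1x1(observation, weights, bias):
    """Apply a 1 x 1 convolution to an observation stored as rows x columns x channels.

    weights has shape [out_channels][in_channels]; bias has length out_channels.
    Every cell is mapped independently, so the window size w is unchanged.
    """
    return [
        [
            [sum(w * x for w, x in zip(w_row, cell)) + b for w_row, b in zip(weights, bias)]
            for cell in row
        ]
        for row in observation
    ]


if __name__ == "__main__":
    w = 8
    # Hypothetical student observation: height plus the a and b color channels (n = 3).
    observation = [[[0.01 * (r + c), 0.2, -0.1] for c in range(w)] for r in range(w)]
    # Hypothetical weights that pass only the height channel to a one-channel teacher model.
    weights = [[1.0, 0.0, 0.0]]
    bias = [0.0]
    adapted = conv1x1(observation, weights, bias)
    print(len(adapted), len(adapted[0]), len(adapted[0][0]))  # rows, columns, channels
```

Because the kernel covers a single cell, the layer changes only the number of channels, which is why the observation window width can be chosen separately to fit the receiving robot.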
The datasets were collected in indoor and outdoor locations of the Czech Technical University in Prague campus at Charles Square, Prague, Czech Republic. Indoors, see Figure 4b, Spot moved over surfaces partially covered

Table 3 shows the performance of the trained models. All transferred models perform overall better than the direct model. However, the performance of the transferred model has not improved when modifying the teacher’s model to accept the colored height map collected by SCARAB II, although the direct model has improved when using $n = 3$ input channels. In the authors’ opinion, the added convolutional layer could not sufficiently modify the input observation to achieve good performance in combination with the underlying transferred model.

5.2.2. SCARAB II Knowledge Transfer to Spot

For the transfer from SCARAB II to Spot, the scenarios are adjusted by using $w = 16$ to match Spot’s body size, and the regressors are trained for 100 epochs in Spot’s target domain. Besides, the reshaping convolutional layer is added during the transfer to Spot to utilize the SCARAB II’s model with the three input channels.

Table 4: Mean (std) of the RMSE for knowledge transfer from SCARAB II to Spot. Transfer $x$ denotes $x$ frozen layers during retraining the regressor, $x \in \{0, 4\}$. Depth denotes $n = 1$ input channel for the resulting model perceiving just the height map, Depth + ab denotes $n = 3$ where SCARAB II utilizes colored observations. Note that only one direct model is created for both the Depth and Depth + ab observation setup, since for the student, both scenarios possess the same number of input channels $n$.

Observ.      Scenario             Direct        Transfer 0    Transfer 4
Depth        $c_d \to$ Spot       0.37 (0.14)   0.28 (0.14)   0.52 (0.32)
Depth        $c_s \to$ Spot       0.32 (0.14)   1.72 (2.12)   0.29 (0.23)
Depth        $c_v \to$ Spot       0.92 (0.82)   0.31 (0.19)   0.24 (0.19)
Depth        Overall              0.54 (0.37)   0.77 (0.82)   0.35 (0.25)
Depth + ab   $c_d \to$ Spot       /             0.11 (0.01)   0.11 (0.01)
Depth + ab   $c_s \to$ Spot       /             0.12 (0.00)   0.11 (0.01)
Depth + ab   $c_v \to$ Spot       /             0.11 (0.01)   0.11 (0.00)
Depth + ab   Overall              /             0.11 (0.01)   0.11 (0.01)

The results in Table 4 indicate improvements when utilizing transfer learning. The transfer of the SCARAB II’s model perceiving color has achieved the best performance on the testing dataset. The authors suppose that the added convolutional layer and increased number of channels help the model better grasp the underlying terramechanical properties.

5.3. Individual Transfer between Spot and SCARAB II

We further present a detailed overview of the knowledge transfer between Spot and SCARAB II.

Figure 9: (a) Progress of the cost assessment model’s neural network training for 1000 epochs; (b) student’s and teacher’s predicted costs and the ground truth after training for 300 epochs. (Plots: (a) loss over epochs and (b) cost over steps for the Direct and Transfer + Direct models, with the ground truth in (b).)

The boost caused by the transfer can be observed in the initial epochs. After training for more than 400 epochs, the improvement in loss stops; particular losses do not change significantly until the end of the training. The transferred model produces lower validation losses, and the RMSE after training for 1000 epochs is 1.64 and 1.38 for the direct and transferred model, respectively. The observation suggests that both models are overfitted after such prolonged training. Figure 9b shows the measured and predicted costs on a particular trail where we can observe that the transferred model follows the ground truth closely, avoiding spikes in the assessments observed in the direct model. However, even the transferred model cannot closely follow the oscillations of the ground truth.
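Two error measures appear in the evaluation: the percentage error of Equation (5), with respect to which the network is trained, and the RMSE of Equation (3), which is reported in the tables. A small sketch contrasts them. The function names and sample values are hypothetical, and averaging the per-sample percentage errors over a batch is an assumption, since Equation (5) is written for a single prediction.

```python
import math


def absolute_percentage_error(y_true: float, y_pred: float) -> float:
    """Equation (5) for one sample: 100 * |(y_t - y_p) / y_t|."""
    return 100.0 * abs((y_true - y_pred) / y_true)


def mean_absolute_percentage_error(targets, predictions):
    """Average of the per-sample percentage errors (the averaging is an assumption)."""
    errors = [absolute_percentage_error(t, p) for t, p in zip(targets, predictions)]
    return sum(errors) / len(errors)


def rmse(targets, predictions):
    """Equation (3): root mean square error between predicted and measured costs."""
    squared = [(p - t) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    measured = [2.0, 4.0, 5.0]   # hypothetical ground-truth costs, strictly positive
    predicted = [1.0, 5.0, 5.0]  # hypothetical regressor outputs
    print(f"MAPE: {mean_absolute_percentage_error(measured, predicted):.2f} %")
    print(f"RMSE: {rmse(measured, predicted):.3f}")
```

The percentage error divides by the measured cost, so it relies on the costs being strictly positive, and it weights an error of one cost unit more heavily on low-cost terrain than the RMSE does.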
SCARAB II uti- (c) Height map lizes the difference cost computation method 𝑐𝑑 , and the Figure 10: (a) Direct and (b) transferred model’s cost as- width of the observation window is set to 𝑤 = 8. Af- sessments of the perceived environment after training for ter 300 training epochs, the student’s direct transferred 300 epochs; (c) and the environment’s height map with the and fine-tuned models achieved RMSE of 1.79 and 1.09, squared cell of the size 7.5 cm. respectively. The progress of the regressors’ training after a pro- Figure 10 illustrates the cost assessments and the longed training for 1000 epochs is depicted in Figure 9a. height map with the marked robot trail in the Room part planning for a planetary rover, Autonomous Robots of the cave. Since the robot traversed only a tiny part of 6 (1999) 131–146. doi:1 0 . 1 0 2 3 / A : 1 0 0 8 8 3 1 4 2 6 9 6 6 . the observed environment, we present a manual evalu- [5] B. Rothrock, R. Kennedy, C. Cunningham, J. Pa- ation of the cost assessments as the cost measurements pon, M. Heverly, M. Ono, SPOC: Deep Learning- are not presented in the observed area. The transferred based Terrain Classification for Mars Rover Mis- model assigns a higher cost to the terrain edge, while the sions, in: AIAA SPACE, 2016, pp. 1–12. doi:1 0 . student’s model underestimates the difficulty. Addition- 2514/6.2016- 5539. ally, the transferred model suggests a more challenging [6] B. Cafaro, M. Gianni, F. Pirri, M. Ruiz, A. Sinha, cost in all areas of the environment, which resembles the Terrain traversability in rescue environments, in: actual perceived cost more closely. Thus, we conclude IEEE Safety Security and Rescue Robotics (SSRR), that the transfer can correct the student’s direct model 2013, pp. 1–8. doi:1 0 . 1 1 0 9 / S S R R . 2 0 1 3 . 6 7 1 9 3 5 8 . predictions. [7] A. Huertas, L. Matthies, A. Rankin, Stereo-based tree traversability analysis for autonomous off-road navigation, in: IEEE Workshops on Applications 6. 
6. Conclusion

In this paper, we present an approach for sharing knowledge about traversability between heterogeneous robots. Traversal cost predictors are created using neural networks processing observations from exteroceptive sensors. The knowledge transfer is implemented as the transfer of neural network weights, and the transferred networks are fine-tuned to adapt to the receiving robot's terrain perception. The proposed method is verified using a small hexapod crawler and a large quadruped walker, where it lowers the traversability prediction error. Next, we aim to deploy the proposed method in path planning tasks, with the final goal of simultaneous online learning on multiple robots.

Acknowledgments

This work was supported by the Czech Science Foundation (GAČR) under research project No. 21-33041J.