A Real-time Approach System for Vineyards Intra-row Weed Detection

Vasileios Moysiadis 1,2, Dimitrios Kateris 1, Dimitrios Katikaridis 1,2, Giorgos Vasileiadis 1,3, Vasileios Kolorizos 1,4, Aristotelis C. Tagarakis 1 and Dionysis Bochtis 1

1 Institute for Bio-Economy and Agri-Technology (iBO), Centre for Research and Technology-Hellas (CERTH), 6th km Charilaou-Thermi Rd, 57001, Thessaloniki, Greece
2 Department of Computer Science & Telecommunications, University of Thessaly, 35131 Lamia, Greece
3 Laboratory of Agricultural Engineering, School of Agriculture, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
4 Department of Energy Systems, University of Thessaly, Gaiopolis Campus, 41500, Larisa, Greece

Proceedings of HAICTA 2022, September 22–25, 2022, Athens, Greece
EMAIL: v.moisiadis@certh.gr (A. 1); d.kateris@certh.gr (A. 2); d.katikaridis@certh.gr (A. 3); g.vasileiadis@certh.gr (A. 4); v.kolorizos@certh.gr (A. 5); a.tagarakis@certh.gr (A. 6); d.bochtis@certh.gr (A. 7)
ORCID: 0000-0001-5772-1392 (A. 1); 0000-0002-5731-9472 (A. 2); 0000-0002-6075-8150 (A. 4); 0000-0001-6478-7381 (A. 5); 0000-0001-5743-625X (A. 6); 0000-0002-7058-5986 (A. 7)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

Abstract

With the incorporation of autonomous robotic platforms in various areas (industry, agriculture, etc.), numerous mundane operations have become fully automated. The highly demanding working environment of agriculture has driven the development of techniques and machinery that can cope with each case, and new technologies (from high-performance motors to optimization algorithms) have been implemented and tested in this field. Every cultivation season, several operations that contribute to crop development have to take place at least once. One of these operations is weeding. In every crop there are plants that are not part of it; in most cases these plants have a negative impact on the crop and have to be removed. In the past, weeding was carried out either by hand (smaller fields) or with herbicides (larger fields). In the latter case, the dosage and the timing are pre-defined and do not take into consideration the growth stage and the allocation of weeds within the field. In this work, a novel approach for intra-row weed detection in vineyards is developed and presented. All the experiments, both for data collection and algorithm testing, took place in a high-value vineyard that produces numerous wine varieties. The aim of this work is to implement an accurate real-time robotic system for weed detection and segmentation using a deep learning algorithm, in order to optimize the weeding procedure. This approach consists of two essential sub-systems: the robotic platform, which embeds all the necessary sensors and the required computational power for the detection algorithm, and the developed algorithm itself. Of all the developed models, the selected one performed accurately both in the training procedure and on unknown datasets. In order to properly validate the algorithm, the unknown datasets were acquired in different time periods, with variations in both camera angle and wine variety.

Keywords

Weed detection, RGB camera, vineyard, UGV, deep learning, Mask R-CNN

1. Introduction

Object detection and localization are essential procedures in state-of-the-art robotic operations. As the available computational power increases, the implementation of numerous algorithms related to the robots' environmental awareness has been documented. In this direction, two main types of algorithms could be developed in order to identify and localize weeds.
Gonzalez-de-Santos et al., in their work, propose a method incorporating a fleet of UAVs and UGVs carrying both RGB (high spatial resolution) and multi-spectral cameras for weed detection [1]. Along with the aggregated images, various orthomosaics were produced to facilitate the weed detection process. A similar methodology using an object-based image analysis (OBIA) algorithm was proposed by Peña et al. [2]. In the work of de Castro et al., a custom OBIA algorithm classifying four distinct classes (vine, cover crop, Cynodon dactylon and bare soil) is implemented [3]. With the use of a UAV, several orthomosaics, both 2D and 3D, were generated for the validation of the proposed methodology. Furthermore, to achieve better results, a vegetation color index was used to filter the available data.

On the other hand, in the last few years, numerous implementations of machine learning and deep learning algorithms in various scientific fields have been documented. The main prerequisite for training and embedding an algorithm in any system is the availability of training and validation datasets. To that end, in the work of Olsen et al., an open multiclass weed species image dataset for machine learning applications is proposed [4]. It contains more than 17,000 labeled images, and the authors also propose a deep learning model based on Inception-v3 and ResNet-50. Building on the above-mentioned work, dos Santos Ferreira et al. evaluated two unsupervised deep clustering algorithms on two weed datasets [5]. The first algorithm is Joint Unsupervised Learning of Deep Representations and Image Clusters (JULE), proposed by Yang et al. [6]. The second algorithm is Deep Clustering for Unsupervised Learning of Visual Features (DeepCluster), which jointly learns the parameters of a neural network and the cluster assignments of the resulting features, as proposed by Caron et al. [7]. The datasets used to train the algorithms were the Grass-Broadleaf dataset from a previous work of the same authors [8] and the DeepWeeds dataset. Going further, Hu et al. developed a novel graph-based deep learning architecture to recognize multiple types of weeds from RGB images [9]; they also used the DeepWeeds dataset to train the algorithm. Each input image was treated as a multi-scale graph whose vertex patterns are associated with image sub-patches at multiple scales, from local to global scopes. Finally, based on the review paper of Wu et al. [10], some of the most common deep learning architectures have been implemented on different weed datasets.

2. Materials and Methods

Considering all the above, several methods were tested against the requirements of this work. The disadvantages of the OBIA method are related to the constantly changing operational environment of vineyards, in tandem with the available computational power; in particular, the constraint on available CUDA cores led to the selection of deep learning methods. The second essential decision concerned the type of deep learning approach, for which there were three essential options: 1. learning "from scratch", 2. using a pre-trained model through transfer learning, and 3. fine-tuning.

In the "learn from scratch" method, the learning process starts from random weights; this method requires a large number of images for the training procedure. In the fine-tuning method, the weights initialized from a pre-trained model of the same domain are unfrozen, and the model is re-trained on the new data with a very low learning rate. Transfer learning is similar to fine-tuning, with the notable difference that the weights of the pre-trained model are not necessarily from the same domain. The last two methods are commonly used when the available datasets consist of a small number of images. The advantages of transfer learning and fine-tuning led to the selection of a combination of the two methods, as sketched below.
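The paper does not name the exact implementation of this combined scheme, so the following is only a minimal sketch of how it could look, assuming the open-source Matterport Mask R-CNN package (`mrcnn`) for TensorFlow 1.x with COCO pre-trained weights; `dataset_train` and `dataset_val` stand for already-prepared `mrcnn.utils.Dataset` objects, and the epoch counts are placeholders.

```python
from mrcnn.config import Config
import mrcnn.model as modellib

class WeedConfig(Config):
    """Hypothetical training configuration for a single 'weed' class."""
    NAME = "weed"
    NUM_CLASSES = 1 + 1  # background + weed
    IMAGES_PER_GPU = 1
    STEPS_PER_EPOCH = 100

config = WeedConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")

# Transfer learning: initialize from COCO weights (a different domain),
# skipping the head layers whose shapes depend on the number of classes.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# Stage 1 (transfer learning): train only the randomly initialized heads.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE, epochs=20, layers="heads")

# Stage 2 (fine-tuning): re-train all layers with a much lower learning rate.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10, epochs=40, layers="all")
```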
The proposed system consists of two essential sub-systems. The first is the autonomous robotic platform (Thorvald, by SAGA Robotics), which embeds all the necessary sensors along with the required computational power (Figure 1a). The second is the developed weed detection algorithm. Four sensors were mounted on the robotic platform: a laser scanner (LiDAR), an RTK-GPS, two IMU sensors and an RGB camera. The first three sensors facilitate autonomous navigation, while the fourth is used for weed detection. All the sensors were placed on the robotic platform so as to minimize the input data error, as depicted in Figure 1b. The communication between the robotic platform and the sensors was handled via the ROS framework, along the lines of the sketch below.

Figure 1: a) The weed detection system mounted on the Thorvald robotic platform and b) location of the sensors in the relative coordinate system using the IMU.
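As an illustration of this communication layer, here is a minimal sketch of how the camera stream might be consumed on the platform, assuming the standard rospy/cv_bridge stack; the topic name follows the usual zed-ros-wrapper convention and is an assumption, since the paper does not list the actual topic names.

```python
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def image_callback(msg):
    # Convert the ROS image message into an OpenCV BGR array.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    # ... feed `frame` to the weed detection model here ...

rospy.init_node("weed_detection")
# Topic name assumed from the standard zed-ros-wrapper; adjust to the
# camera topic actually published on the platform.
rospy.Subscriber("/zed2/zed_node/rgb/image_rect_color", Image,
                 image_callback, queue_size=1, buff_size=2**24)
rospy.spin()
```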
Two algorithms were developed and tested for weed detection in the intra-row path. The first algorithm was a Single Shot Detector (SSD) [11] with ResNet-101 [12] as a backbone, while the second was Mask R-CNN [13]. Python 3.7, OpenCV and TensorFlow 1.5 were used for both algorithms. Based on the early-stopping results, the final implementation of intra-row weed detection was carried out with Mask R-CNN, which is based on a feature pyramid network (FPN) with a ResNet-101 architecture.

All the measurements were taken at the Ktima Gerovassiliou vineyard via the autonomous robotic platform. Ktima Gerovassiliou is one of the largest wine producers in the country, cultivating several varieties of both white and red wine. All the RGB images were acquired with a ZED2 RGB-D camera by Stereolabs, embedded on the Thorvald robotic platform. The resolution of the images was 2208x1242, with f/1.8 and a field of view of 110° (H) x 70° (V). The camera was mounted on the side of the robotic platform at a 45° angle. The dataset consists of 1326 RGB images from four vineyard varieties at three stages of the cultivation season; the experiment days were selected according to weed growth, specifically in early spring, early summer and late summer.

3. Results

The proposed model was trained on an NVIDIA 1080 Ti with 64 GB of RAM. The dataset was split into 1066 images for training and 260 images for testing. The training procedure took place in three distinct stages. In stage one, a region proposal network (RPN) was used. In stage two, a proposal classifier takes the region proposals from the RPN and classifies them. In stage three, the masks are generated: this stage takes the detections from the previous layer and runs the mask head to generate segmentation masks for every instance. Furthermore, images from a standalone RGB camera were acquired in order to validate the model on unknown images. In Figure 2, six such images, captured from different angles and in different cultivation periods, are presented. The results for each variety, along with a summary, are presented in Tables 1-5, and a sketch of how such metrics are derived follows the tables.

Figure 2: Validation on unknown images.

Table 1: Sauvignon Blanc variety, number of samples: 52
Accuracy      Precision     Recall        F1
0.912280702   0.961904762   0.912280702   0.936435767

Table 2: Chardonnay variety, number of samples: 48
Accuracy      Precision     Recall        F1
0.931624      0.889831      0.931624      0.910248

Table 3: Malagouzia variety, number of samples: 76
Accuracy      Precision     Recall        F1
0.767442      0.956522      0.767442      0.851613

Table 4: Syrah variety, number of samples: 84
Accuracy      Precision     Recall        F1
0.745928      0.933884      0.745928      0.829391

Table 5: Summary, number of samples: 260
Accuracy       Precision      Recall         F1
0.8393186755   0.9355354405   0.8393186755   0.8695288045
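The scores above are the standard detection metrics; for reference, a minimal sketch of how precision, recall and the F1 score are derived from true-positive, false-positive and false-negative counts is given below. The counts in the usage example are hypothetical and are not the paper's raw data.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts, for illustration only.
precision, recall, f1 = detection_metrics(tp=52, fp=4, fn=5)
print(f"Precision={precision:.3f}  Recall={recall:.3f}  F1={f1:.3f}")
```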
4. Conclusion

In conclusion, this work intends to provide a solution to weed management, one of the major challenges in viticulture, since weeds can cause significant yield losses and severe competition with the cultivation. Accordingly, the development of a fully automated procedure for weed monitoring will provide useful data for understanding weed management practices. This work also presents a new image-based technique developed to detect weeds in the intra-row path in vineyards. The developed model was tested under different vineyard field conditions with different levels of weed growth, and it performed accurately in cases where the weeds had distinct borders. On the other hand, the results show that the proposed model still gives promising results in various field conditions where there are no distinct boundaries between the weeds.

5. Acknowledgements

This work was supported by the HORIZON 2020 project "BACCHUS: Mobile Robotic Platforms for Active Inspection and Harvesting in Agricultural Areas" (project code: 871704), financed by the European Union under the call H2020-ICT-2018-2020.

6. References

[1] P. Gonzalez-de-Santos et al., Fleets of robots for environmentally-safe pest control in agriculture, Precision Agriculture 18(4) (2017) 574–614. doi: 10.1007/s11119-016-9476-3.
[2] J. M. Peña, J. Torres-Sánchez, A. I. de Castro, M. Kelly, and F. López-Granados, Weed Mapping in Early-Season Maize Fields Using Object-Based Analysis of Unmanned Aerial Vehicle (UAV) Images, PLOS ONE 8(10) (2013). doi: 10.1371/journal.pone.0077151.
[3] A. I. de Castro, J. M. Peña, J. Torres-Sánchez, F. Jiménez-Brenes, and F. López-Granados, Mapping Cynodon dactylon in vineyards using UAV images for site-specific weed control, Advances in Animal Biosciences 8(2) (2017) 267–271. doi: 10.1017/S2040470017000826.
[4] A. Olsen et al., DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning, Scientific Reports 9(1) (2019) 1–12. doi: 10.1038/s41598-018-38343-3.
[5] A. dos Santos Ferreira, D. M. Freitas, G. G. da Silva, H. Pistori, and M. T. Folhes, Unsupervised deep learning and semi-automatic data labeling in weed discrimination, Computers and Electronics in Agriculture 165 (2019) 104963. doi: 10.1016/j.compag.2019.104963.
[6] J. Yang, D. Parikh, and D. Batra, Joint unsupervised learning of deep representations and image clusters, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016) 5147–5156. doi: 10.1109/CVPR.2016.556.
[7] M. Caron, P. Bojanowski, A. Joulin, and M. Douze, Deep Clustering for Unsupervised Learning of Visual Features, Lecture Notes in Computer Science 11218 (2018) 139–156. doi: 10.1007/978-3-030-01264-9_9.
[8] A. dos Santos Ferreira, D. Matte Freitas, G. Gonçalves da Silva, H. Pistori, and M. Theophilo Folhes, Weed detection in soybean crops using ConvNets, Computers and Electronics in Agriculture 143 (2017) 314–324. doi: 10.1016/j.compag.2017.10.027.
[9] K. Hu, G. Coleman, S. Zeng, Z. Wang, and M. Walsh, Graph weeds net: A graph-based deep learning method for weed recognition, Computers and Electronics in Agriculture 174 (2020) 105520. doi: 10.1016/j.compag.2020.105520.
[10] Z. Wu, Y. Chen, B. Zhao, X. Kang, and Y. Ding, Review of Weed Detection Methods Based on Computer Vision, Sensors 21(11) (2021) 3647. doi: 10.3390/s21113647.
[11] X. Lu, X. Kang, S. Nishide, and F. Ren, Object detection based on SSD-ResNet, Proceedings of the 6th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS) (2019) 89–92.
[12] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016) 770–778.
[13] K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask R-CNN, IEEE Transactions on Pattern Analysis and Machine Intelligence 42(2) (2017) 386–397.