A Real-time Approach System for Vineyards Intra-row Weed Detection

Vasileios Moysiadis 1,2, Dimitrios Kateris 1, Dimitrios Katikaridis 1,2, Giorgos Vasileiadis 1,3, Vasileios Kolorizos 1,4, Aristotelis C. Tagarakis 1 and Dionysis Bochtis 1

1 Institute for Bio-Economy and Agri-Technology (iBO), Centre for Research and Technology-Hellas (CERTH), 6th km Charilaou-Thermi Rd, 57001, Thessaloniki, Greece
2 Department of Computer Science & Telecommunications, University of Thessaly, 35131 Lamia, Greece
3 Laboratory of Agricultural Engineering, School of Agriculture, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
4 Department of Energy Systems, University of Thessaly, Gaiopolis Campus, 41500, Larisa, Greece

Proceedings of HAICTA 2022, September 22–25, 2022, Athens, Greece
EMAIL: v.moisiadis@certh.gr (A. 1); d.kateris@certh.gr (A. 2); d.katikaridis@certh.gr (A. 3); g.vasileiadis@certh.gr (A. 4); v.kolorizos@certh.gr (A. 5); a.tagarakis@certh.gr (A. 6); d.bochtis@certh.gr (A. 7)
ORCID: 0000-0001-5772-1392 (A. 1); 0000-0002-5731-9472 (A. 2); 0000-0002-6075-8150 (A. 4); 0000-0001-6478-7381 (A. 5); 0000-0001-5743-625X (A. 6); 0000-0002-7058-5986 (A. 7)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

Abstract

With the incorporation of autonomous robotic platforms in various areas (industry, agriculture, etc.), numerous mundane operations have become fully automated. The highly demanding working environment of agriculture has driven the development of techniques and machinery that can cope with each case, and new technologies (from high-performance motors to optimization algorithms) have been implemented and tested in this field. Every cultivation season, several operations that contribute to crop development have to take place at least once. One of these operations is weeding. In every crop there are plants that are not part of it; in most cases these plants have a negative impact on the crop and have to be removed. In the past, weeding was carried out either by hand (smaller fields) or with herbicides (larger fields). In the latter case, the dosage and the timing are pre-defined and do not take into consideration the growth stage and the allocation of weeds within the field. In this work, a novel approach for intra-row weed detection in vineyards is developed and presented. All the experiments, both for data collection and algorithm testing, took place in a high-value vineyard that produces numerous wine varieties. The aim of this work is to implement an accurate real-time robotic system for weed detection and segmentation using a deep learning algorithm, in order to optimize the weeding procedure. This approach consists of two essential sub-systems: the robotic platform, which embeds all the necessary sensors and the required computational power for the detection algorithm, and the developed algorithm itself. Of all the developed models, the selected one performed accurately both in the training procedure and on unknown datasets. In order to properly validate the algorithm, the unknown datasets were acquired in different time periods, with variations in both camera angle and wine variety.

Keywords

Weed detection, RGB camera, vineyard, UGV, deep learning, Mask R-CNN

1. Introduction

Object detection and localization are essential procedures in state-of-the-art robotic operations. As the available computational power increases, the implementation of numerous algorithms related to the robots' environmental awareness has been documented. In this direction, two main types of algorithms could be developed in order to identify and localize weeds.
Gonzalez-de-Santos et al., in their work, propose a method incorporating a fleet of UAVs and UGVs carrying both RGB (high spatial resolution) and multi-spectral cameras for weed detection [1]. Along with the aggregated images, various orthomosaics were produced to facilitate the weed detection process. A similar methodology using an object-based image analysis (OBIA) algorithm was proposed by Peña et al. [2]. In the work of de Castro et al., a custom OBIA algorithm classifying four distinct classes (vine, cover crop, Cynodon dactylon and bare soil) is implemented [3]. With the use of a UAV, several orthomosaics, both 2D and 3D, were generated for the validation of the proposed methodology. Furthermore, to achieve better results, a vegetation color index was used to filter the available data.

On the other hand, in the last few years, numerous implementations of machine learning and deep learning algorithms in various scientific fields have been documented. The main prerequisite for training and embedding an algorithm in any system is the availability of training and validation datasets. To that end, in the work of Olsen et al., an open multiclass weed species image dataset for machine learning applications is proposed [4]. It contains more than 17,000 labeled images, and the authors also propose a deep learning model based on Inception-v3 and ResNet-50. Building on the above-mentioned work, dos Santos Ferreira et al. evaluated two unsupervised deep clustering algorithms on two weed datasets [5]. The first algorithm is Joint Unsupervised Learning of Deep Representations and Image Clusters (JULE), proposed by Yang et al. [6]. The second algorithm is Deep Clustering for Unsupervised Learning of Visual Features (DeepCluster), which jointly learns the parameters of a neural network and the cluster assignments of the resulting features, as proposed by Caron et al. [7]. The datasets used to train the algorithms were the Grass-Broadleaf dataset from a previous work of the same authors [8] and the DeepWeeds dataset. Going further, Hu et al. developed a novel graph-based deep learning architecture to recognize multiple types of weeds from RGB images [9]; they also used the DeepWeeds dataset to train the algorithm. Each input image was treated as a multi-scale graph whose vertex patterns are associated with image sub-patches at multiple scales, from local to global scopes. Finally, based on the review paper of Wu et al. [10], some of the most common deep learning architectures have been implemented on different weed datasets.

2. Materials and Methods

Considering all the above, several methods were tested against the requirements of this work. The disadvantages of the OBIA method are related to the constantly changing operational environment of vineyards, in tandem with the available computational power; in particular, the constraint on available CUDA cores led to the selection of deep learning methods. The second essential decision concerned the type of deep learning approach, for which there were three essential options: 1. learning "from scratch", 2. using a pre-trained model through transfer learning, and 3. fine-tuning.

In the "learn from scratch" method, the learning process starts from random weights; this method requires a large number of images for the training procedure. In the fine-tuning method, the weights initialized from a pre-trained model of the same domain are unfrozen, and the model is re-trained on the new data with a very low learning rate. Transfer learning is similar to fine-tuning, with the notable difference that the weights of the pre-trained model are not necessarily from the same domain. The last two methods are commonly used when the available datasets consist of a small number of images. The advantages of transfer learning and fine-tuning led to the selection of a combination of the two methods, as sketched below.
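The paper does not name the exact implementation of this combined scheme, so the following is only a minimal sketch of how it could look, assuming the open-source Matterport Mask R-CNN package (`mrcnn`) for TensorFlow 1.x with COCO pre-trained weights; `dataset_train` and `dataset_val` stand for already-prepared `mrcnn.utils.Dataset` objects, and the epoch counts are placeholders.

```python
from mrcnn.config import Config
import mrcnn.model as modellib

class WeedConfig(Config):
    """Hypothetical training configuration for a single 'weed' class."""
    NAME = "weed"
    NUM_CLASSES = 1 + 1  # background + weed
    IMAGES_PER_GPU = 1
    STEPS_PER_EPOCH = 100

config = WeedConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")

# Transfer learning: initialize from COCO weights (a different domain),
# skipping the head layers whose shapes depend on the number of classes.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# Stage 1 (transfer learning): train only the randomly initialized heads.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE, epochs=20, layers="heads")

# Stage 2 (fine-tuning): re-train all layers with a much lower learning rate.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10, epochs=40, layers="all")
```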
The proposed system consists of two essential sub-systems. The first is the autonomous robotic platform (Thorvald, by SAGA Robotics), which embeds all the necessary sensors along with the required computational power (Figure 1a). The second is the developed weed detection algorithm. Four sensors were mounted on the robotic platform: a laser scanner (LiDAR), an RTK-GPS, two IMU sensors and an RGB camera. The first three sensors facilitate autonomous navigation, while the fourth is used for weed detection. All the sensors were placed on the robotic platform so as to minimize the input data error, as depicted in Figure 1b. The communication between the robotic platform and the sensors was handled via the ROS framework, along the lines of the sketch below.

Figure 1: a) The weed detection system mounted on the Thorvald robotic platform and b) location of the sensors in the relative coordinate system using the IMU.
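As an illustration of this communication layer, here is a minimal sketch of how the camera stream might be consumed on the platform, assuming the standard rospy/cv_bridge stack; the topic name follows the usual zed-ros-wrapper convention and is an assumption, since the paper does not list the actual topic names.

```python
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def image_callback(msg):
    # Convert the ROS image message into an OpenCV BGR array.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    # ... feed `frame` to the weed detection model here ...

rospy.init_node("weed_detection")
# Topic name assumed from the standard zed-ros-wrapper; adjust to the
# camera topic actually published on the platform.
rospy.Subscriber("/zed2/zed_node/rgb/image_rect_color", Image,
                 image_callback, queue_size=1, buff_size=2**24)
rospy.spin()
```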
Two algorithms were developed and tested for weed detection in the intra-row path. The first algorithm was a Single Shot Detector (SSD) [11] with ResNet-101 [12] as a backbone, while the second was Mask R-CNN [13]. Python 3.7, OpenCV and TensorFlow 1.5 were used for both algorithms. Based on the early-stopping results, the final implementation of intra-row weed detection was carried out with Mask R-CNN, which is based on a feature pyramid network (FPN) with a ResNet-101 architecture.

All the measurements were taken at the Ktima Gerovassiliou vineyard via the autonomous robotic platform. Ktima Gerovassiliou is one of the largest wine producers in the country, cultivating several varieties of both white and red wine. All the RGB images were acquired with a ZED2 RGB-D camera by Stereolabs, embedded on the Thorvald robotic platform. The resolution of the images was 2208x1242, with f/1.8 and a field of view of 110° (H) x 70° (V). The camera was mounted on the side of the robotic platform at a 45° angle. The dataset consists of 1326 RGB images from four vineyard varieties at three stages of the cultivation season; the experiment days were selected according to weed growth, specifically in early spring, early summer and late summer.

3. Results

The proposed model was trained on an NVIDIA 1080 Ti with 64 GB of RAM. The dataset was split into 1066 images for training and 260 images for testing. The training procedure took place in three distinct stages. In stage one, a region proposal network (RPN) was used. In stage two, a proposal classifier takes the region proposals from the RPN and classifies them. In stage three, the masks are generated: this stage takes the detections from the previous layer and runs the mask head to generate segmentation masks for every instance. Furthermore, images from a standalone RGB camera were acquired in order to validate the model on unknown images. In Figure 2, six such images, captured from different angles and in different cultivation periods, are presented. The results for each variety, along with a summary, are presented in Tables 1-5, and a sketch of how such metrics are derived follows the tables.

Figure 2: Validation on unknown images.

Table 1: Sauvignon Blanc variety, number of samples: 52
Accuracy      Precision     Recall        F1
0.912280702   0.961904762   0.912280702   0.936435767

Table 2: Chardonnay variety, number of samples: 48
Accuracy      Precision     Recall        F1
0.931624      0.889831      0.931624      0.910248

Table 3: Malagouzia variety, number of samples: 76
Accuracy      Precision     Recall        F1
0.767442      0.956522      0.767442      0.851613

Table 4: Syrah variety, number of samples: 84
Accuracy      Precision     Recall        F1
0.745928      0.933884      0.745928      0.829391

Table 5: Summary, number of samples: 260
Accuracy       Precision      Recall         F1
0.8393186755   0.9355354405   0.8393186755   0.8695288045
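The scores above are the standard detection metrics; for reference, a minimal sketch of how precision, recall and the F1 score are derived from true-positive, false-positive and false-negative counts is given below. The counts in the usage example are hypothetical and are not the paper's raw data.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts, for illustration only.
precision, recall, f1 = detection_metrics(tp=52, fp=4, fn=5)
print(f"Precision={precision:.3f}  Recall={recall:.3f}  F1={f1:.3f}")
```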
4. Conclusion

In conclusion, this work intends to provide a solution to weed management, one of the major challenges in viticulture, since weeds can cause significant yield losses and severe competition with the cultivation. Accordingly, the development of a fully automated procedure for weed monitoring will provide useful data for understanding weed management practices. This work also presents a new image-based technique developed to detect weeds in the intra-row path in vineyards. The developed model was tested under different vineyard field conditions with different levels of weed growth, and it performed accurately in cases where the weeds had distinct borders. On the other hand, the results show that the proposed model still gives promising results in various field conditions where there are no distinct boundaries between the weeds.

5. Acknowledgements

This work was supported by the HORIZON 2020 project "BACCHUS: Mobile Robotic Platforms for Active Inspection and Harvesting in Agricultural Areas" (project code: 871704), financed by the European Union under the call H2020-ICT-2018-2020.

6. References

[1] P. Gonzalez-de-Santos et al., Fleets of robots for environmentally-safe pest control in agriculture, Precision Agriculture 18(4) (2017) 574–614. doi: 10.1007/s11119-016-9476-3.
[2] J. M. Peña, J. Torres-Sánchez, A. I. de Castro, M. Kelly, and F. López-Granados, Weed Mapping in Early-Season Maize Fields Using Object-Based Analysis of Unmanned Aerial Vehicle (UAV) Images, PLOS ONE 8(10) (2013). doi: 10.1371/journal.pone.0077151.
[3] A. I. de Castro, J. M. Peña, J. Torres-Sánchez, F. Jiménez-Brenes, and F. López-Granados, Mapping Cynodon dactylon in vineyards using UAV images for site-specific weed control, Advances in Animal Biosciences 8(2) (2017) 267–271. doi: 10.1017/S2040470017000826.
[4] A. Olsen et al., DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning, Scientific Reports 9(1) (2019) 1–12. doi: 10.1038/s41598-018-38343-3.
[5] A. dos Santos Ferreira, D. M. Freitas, G. G. da Silva, H. Pistori, and M. T. Folhes, Unsupervised deep learning and semi-automatic data labeling in weed discrimination, Computers and Electronics in Agriculture 165 (2019) 104963. doi: 10.1016/j.compag.2019.104963.
[6] J. Yang, D. Parikh, and D. Batra, Joint unsupervised learning of deep representations and image clusters, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016) 5147–5156. doi: 10.1109/CVPR.2016.556.
[7] M. Caron, P. Bojanowski, A. Joulin, and M. Douze, Deep Clustering for Unsupervised Learning of Visual Features, Lecture Notes in Computer Science 11218 (2018) 139–156. doi: 10.1007/978-3-030-01264-9_9.
[8] A. dos Santos Ferreira, D. Matte Freitas, G. Gonçalves da Silva, H. Pistori, and M. Theophilo Folhes, Weed detection in soybean crops using ConvNets, Computers and Electronics in Agriculture 143 (2017) 314–324. doi: 10.1016/j.compag.2017.10.027.
[9] K. Hu, G. Coleman, S. Zeng, Z. Wang, and M. Walsh, Graph weeds net: A graph-based deep learning method for weed recognition, Computers and Electronics in Agriculture 174 (2020) 105520. doi: 10.1016/j.compag.2020.105520.
[10] Z. Wu, Y. Chen, B. Zhao, X. Kang, and Y. Ding, Review of Weed Detection Methods Based on Computer Vision, Sensors 21(11) (2021) 3647. doi: 10.3390/s21113647.
[11] X. Lu, X. Kang, S. Nishide, and F. Ren, Object detection based on SSD-ResNet, Proceedings of the 6th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS) (2019) 89–92.
[12] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016) 770–778.
[13] K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask R-CNN, IEEE Transactions on Pattern Analysis and Machine Intelligence 42(2) (2017) 386–397.