Multichannel Image Segmentation Algorithm for Agricultural Crop Detection

Anna Denisova, Andrey Kuznetzov, Nikolay Glumov

Samara National Research University, 34 Moskovskoe shosse, Samara, 443086, Russian Federation
http://www.ssau.ru

Abstract. In this article we propose an algorithm for multichannel image segmentation within a predefined region of interest. The algorithm determines whether the region of interest is homogeneous, partitions it into homogeneous subsets, and refines object borders. The proposed method has been applied to remote sensing images of agricultural fields as a preliminary step of a crop detection pipeline. Segmentation quality has been estimated by means of a specific measure proposed in this article.

1 Introduction

Many applications require automatic segmentation as an initial step [1]. For example, in the case of automatic crop detection on remote sensing images, the initial agricultural field borders usually correspond to a particular farmer's field on which several crops are planted. To delineate the area occupied by each crop, automatic segmentation should be performed first. The recognition process can then be carried out, and vegetation stages and planting conditions can be determined. Another potential use of automatic segmentation in agriculture is acreage control. Farmers often do not trace all the processing steps for each field, and a worker may slightly deviate from the field borders while sowing. An automatic segmentation technique could therefore help farmers control acreage independently using remote sensing data.

Existing image segmentation methods can be considered within the classification of popular segmentation techniques given in [2]. Edge-based segmentation applies contour detection [3], [4]. For remote sensing images it is a supplementary mechanism, because the most important information lies in the color and texture properties of the objects, and it is rather difficult to apply edge-based segmentation to natural objects with complex border shapes. Region-based segmentation [4] applies clustering techniques in the first step and then produces the segmentation by optimizing a homogeneity functional. These algorithms usually exploit only texture features and are essentially pixel-based methods. Due to the per-pixel character of the segmentation, tiny over-segmented contours are likely to appear at the final stage. More reliable segmentation methods are per-field and multi-resolution segmentation [5]. The former assumes that the initial object borders are known and can assist the segmentation. The latter analyzes the image at several scales simultaneously; the key idea behind the multi-scale approach is to take into account the properties of data at different resolutions. There are also segmentation algorithms based on neural networks [6], [7]. Their common disadvantage is the huge training data set that must represent all possible variation of the data. Creating such a data set for any particular case is a big challenge. Therefore, neural network methods are not flexible with respect to novel types of data and features and cannot be applied to refine the internal borders of a particular object.

In our article we assume that a database of object contours is available.
This database may be presented in the form of a vector map from a geoinformation system (GIS). We explore the case of automatic analysis of the internal structure of a map object and refinement of its border; thus we consider only the explored object and its surroundings as input data. The aim of the segmentation in our case is to find the significant parts of the object that differ from each other in their spectral-spatial characteristics, i.e. to clarify the internal object structure. The resulting contours should separate the initial object into meaningful parts. The outside object border is supposed to be known from the vector map and is referred to as the region of interest (ROI). It is assumed that multiple features are available.

Since we consider it more important to identify borders inside the ROI than to refine its outer contour, we have based our algorithm on a split-and-merge approach [8]. This approach includes two stages: splitting the image into numerous tiny, locally homogeneous areas and merging them into more complex ones. As a result, the outside ROI contour is initially represented with the precision of the tiny homogeneous regions, and the inside borders are defined with the precision of the merging rule. Managing the splitting and merging process allows our algorithm to adapt to the particular contrast and texture, which are largely determined by the image resolution. This is very important for agricultural fields, because images usually differ considerably not only in the planted crops but also as a result of relief and image acquisition properties. Relief changes lead to texture variations within fields of the same crop class, and acquisition parameters largely determine the contrast properties of a particular image.

The proposed algorithm has been evaluated by means of a segmentation quality measure on a set of remote sensing images. Experiments have shown successful results in most of the cases, and the best pair of algorithm parameters has been recommended.

The article includes three main parts: problem statement, algorithm description and experimental evaluation of the method on remote sensing images. The article ends with conclusion and acknowledgements sections.

2 Problem statement

In our notation the initial image of $L$ features with size $N_1 \times N_2$ is denoted by $X_l(n_1, n_2)$, $0 \le n_1 \le N_1 - 1$, $0 \le n_2 \le N_2 - 1$, $0 \le l \le L - 1$, where $n_1$ and $n_2$ are the coordinates of an image pixel. The general segmentation problem is formulated as an optimization problem:

$$S = \{S_k\},\ 0 \le k \le K_S - 1,\qquad \Omega = \{\Omega_i\},\ 0 \le i \le K_\Omega - 1,\qquad \bigcup_k S_k \subset \Omega, \quad (1)$$

$$Q(S, \Omega) \to \max, \quad (2)$$

where $Q(S, \Omega)$ is the segmentation quality measure, $S = \{S_k\}$, $0 \le k \le K_S - 1$, is a set of pixel subsets defining the result of segmentation, $K_S$ is the number of subsets, $\Omega = \{\Omega_i\}$, $0 \le i \le K_\Omega - 1$, is the ideal segmentation and $K_\Omega$ is the ideal number of subsets. We propose the following measure for segmentation quality evaluation:

$$Q(S, \Omega) = \frac{1}{2}\left( \frac{1}{K_S}\sum_{k}\frac{\max_i |\Omega_i \cap S_k|}{|S_k|} + \frac{1}{K_\Omega}\sum_{i}\frac{\max_k |\Omega_i \cap S_k|}{|\Omega_i|} \right), \quad (3)$$

where $|\cdot|$ is the number of pixels in a subset. The segmentation quality measure is based on the quantities introduced in [9]. It takes values from 0.5 to 1. The lower bound is achieved when one of the values $K_S$ or $K_\Omega$ is equal to one and the other is equal to the total number of image pixels. The upper bound of $Q$ corresponds to the precise segmentation.
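To make the measure of Eq. (3) concrete, the following sketch computes $Q$ from two integer label images: the algorithm output $S$ and the manually prepared ideal segmentation $\Omega$. It is a minimal illustration under our own assumptions (both label images have equal shape and cover only the ROI); the function name and its interface are ours, not part of the proposed method.

```python
import numpy as np

def segmentation_quality(seg, ideal):
    """Quality measure Q of Eq. (3) for two integer label images.

    seg   -- labels produced by the algorithm (the subsets S_k)
    ideal -- manually prepared reference labels (the subsets Omega_i)
    Both arrays must have the same shape; Q takes values in [0.5, 1].
    """
    seg = np.asarray(seg).ravel()
    ideal = np.asarray(ideal).ravel()

    seg_labels, seg_idx = np.unique(seg, return_inverse=True)
    ideal_labels, ideal_idx = np.unique(ideal, return_inverse=True)

    # Contingency table: overlap[i, k] = |Omega_i ∩ S_k|
    overlap = np.zeros((ideal_labels.size, seg_labels.size), dtype=np.int64)
    np.add.at(overlap, (ideal_idx, seg_idx), 1)

    seg_sizes = overlap.sum(axis=0)      # |S_k|
    ideal_sizes = overlap.sum(axis=1)    # |Omega_i|

    # First term: (1/K_S) * sum_k max_i |Omega_i ∩ S_k| / |S_k|
    term_s = np.mean(overlap.max(axis=0) / seg_sizes)
    # Second term: (1/K_Omega) * sum_i max_k |Omega_i ∩ S_k| / |Omega_i|
    term_o = np.mean(overlap.max(axis=1) / ideal_sizes)
    return 0.5 * (term_s + term_o)
```

Because the whole computation reduces to one contingency table, the measure can be evaluated cheaply for every $(W, \varepsilon)$ pair of the parameter sweep described in Section 4.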
3 Algorithm description

Let us denote the source remote sensing image as $F_m(n_1, n_2)$, $0 \le m \le M - 1$, $0 \le n_1 \le N_1 - 1$, $0 \le n_2 \le N_2 - 1$. Below we describe all processing steps of our algorithm, including preprocessing and feature calculation.

Step 1. Input data preparation. We clip the initial remote sensing image to the bounding box of the vector map object to be analyzed, using standard GIS tools. The resulting piece of the image represents the region of interest and contains all pixels corresponding to the analyzed object and its surroundings.

Step 2. Feature extraction. This stage includes calculation of the initial feature image $X_l(n_1, n_2)$, $0 \le n_1 \le N_1 - 1$, $0 \le n_2 \le N_2 - 1$, $0 \le l \le L - 1$, using local means and variances computed over a $W \times W$ block around each pixel of the source image. The number of features $L = 2M$ is twice the number of original image channels, because both features are computed separately for each image component. (A code sketch of Steps 2 and 3 is given after Step 5.)

Step 3. Feature variance normalization. To exclude the impact of the different dynamic ranges of the computed features, we normalize each component by its global standard deviation $\sigma_l$:

$$x_l(n_1, n_2) = \frac{X_l(n_1, n_2)}{\sigma_l}. \quad (4)$$

Step 4. Splitting the image into tiny regions.

4.1. Before splitting, the image of segment indexes $I(n_1, n_2)$, $0 \le n_1 \le N_1 - 1$, $0 \le n_2 \le N_2 - 1$, is initialized with zeros, and the current number of segments is set to zero as well. Further analysis is performed for a sequence of grids defined by the shifting parameter, initially $a = 1$.

4.2. For each feature vector $x(n_1, n_2)$, $(n_1, n_2) \in T$, on the grid with step $aW$,

$$T = \{(n_1, n_2) : n_1 \bmod aW = 0,\ n_2 \bmod aW = 0\}, \quad (5)$$

where $\bmod$ denotes the remainder after division, the neighborhood is determined. This neighborhood contains the pixels located at distance $aW$ along each spatial coordinate from the current pixel. There are three possible cases. In the first case the neighborhood pixels have different segment indexes. In the second case only one pixel in the current neighborhood has a non-zero segment index $s$. In the last case $aW = 1$ and all neighborhood pixels have the same index $s$. In the second and third cases, the similarity condition between feature vectors is checked:

$$\rho\left(x(n_1, n_2),\ x(n_1 \pm aW, n_2 \pm aW)\right) < \varepsilon,\quad (n_1, n_2) \in T, \quad (6)$$

where $\rho$ is some distance metric, e.g. the Euclidean distance. If condition (6) is fulfilled, the segment index $I(n_1, n_2)$ is set to $s$. Otherwise, it is assigned the value $s_{\max} + 1$, where $s_{\max}$ is the biggest segment index at the current time. In the first case, when there are several different segment index values in the neighborhood, the current feature vector has to be compared with the average feature vector of each of the present segments:

$$\rho\left(x(n_1, n_2),\ E(s)\right) < \varepsilon,\quad (n_1, n_2) \in T, \quad (7)$$

$$s = \arg\min_{s \in \{I(n_1 \pm aW,\ n_2 \pm aW)\}} \rho\left(x(n_1, n_2),\ E(s)\right), \quad (8)$$

where $E(s)$ is the average feature vector of the segment with index $s$. If this condition is fulfilled for some index $s$, it is stored in $I(n_1, n_2)$ as the current estimate of the segment number.

4.3. The grid step is halved, $a = a/2$, and steps 4.2–4.3 are repeated while the condition $aW \ge 1$ holds.

Step 5. Merging. Each pair of adjacent segments $s_1$ and $s_2$ is merged if their average feature vectors are closer than the threshold value:

$$\rho\left(E(s_1), E(s_2)\right) < \varepsilon. \quad (9)$$
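As referenced in Step 2, the following sketch illustrates Steps 2 and 3: local means and variances are computed over a $W \times W$ window for every spectral channel, and each resulting feature is divided by its global standard deviation as in Eq. (4). It is only an approximation under our assumptions (a sliding window centered on each pixel and the filter's default boundary handling); the function name is ours.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def extract_features(image, window):
    """Steps 2-3: local mean/variance features, normalized as in Eq. (4).

    image  -- array of shape (M, N1, N2), one slice per spectral channel
    window -- side W of the local window
    Returns an array of shape (2*M, N1, N2) with unit global std per feature.
    """
    image = np.asarray(image, dtype=np.float64)
    features = []
    for channel in image:
        local_mean = uniform_filter(channel, size=window)
        # Var = E[x^2] - (E[x])^2 within the window
        local_var = uniform_filter(channel ** 2, size=window) - local_mean ** 2
        features.extend([local_mean, local_var])
    features = np.stack(features)

    # Step 3: divide each feature by its global standard deviation (Eq. (4))
    sigma = features.reshape(features.shape[0], -1).std(axis=1)
    return features / np.maximum(sigma, 1e-12)[:, None, None]
```

As noted at the end of Section 3, any other features can be substituted at this point without changing the rest of the algorithm.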
Step 6. ROI border refinement. This step is intended for the analysis of the non-zero points on the segment borders. The analysis includes the three steps listed below (see the sketch at the end of this section).

6.1. Distance map calculation for each feature vector. This map contains the distances $\rho$ between the current feature vector and the average feature vectors of the segments corresponding to the pixels within the local $W \times W$ window.

6.2. Parabolic filtering of the distance map. Each distance value $\rho$ within the window is multiplied by the coefficients of a parabolic filter of the same $W \times W$ size, and the minimum is taken as the filtering result for the current window position. The coefficients of the parabolic filter are defined by the following equation:

$$P(w_1, w_2) = w_1^2 + w_2^2, \quad (10)$$

where $w_1$ and $w_2$ take values from the range $\left[-\frac{W}{2}, \frac{W}{2}\right]$.

6.3. The final segment index is defined as the index of the segment with the minimum filtered distance, provided that this distance is also less than $\varepsilon$.

The algorithm produces an image with the segment indexes. Step 2 can be modified if it is necessary to apply any other features.
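One possible reading of sub-steps 6.1–6.3 is sketched below: for every pixel, the distances $\rho(x, E(s))$ to the segments present in the surrounding $W \times W$ window are weighted by the parabolic coefficients of Eq. (10), and the pixel is assigned to the segment attaining the minimum weighted value, provided its distance is below $\varepsilon$. The description above leaves some details open, so the pixel-by-pixel loop, the treatment of unassigned (zero-index) pixels and the skipped image margin are assumptions of this sketch rather than part of the original method.

```python
import numpy as np

def refine_borders(labels, features, seg_means, window, eps):
    """Step 6 (sketch): re-assign pixels near segment borders.

    labels    -- segment index image I(n1, n2) after merging
    features  -- normalized feature image x of shape (L, N1, N2)
    seg_means -- mapping: segment index s -> average feature vector E(s)
    window    -- side W of the local window
    eps       -- similarity threshold epsilon
    """
    half = window // 2
    w1, w2 = np.mgrid[-half:half + 1, -half:half + 1]
    parabola = (w1 ** 2 + w2 ** 2).astype(np.float64)   # Eq. (10)

    refined = labels.copy()
    n1_max, n2_max = labels.shape
    for n1 in range(half, n1_max - half):               # image margin skipped
        for n2 in range(half, n2_max - half):
            x = features[:, n1, n2]
            patch = labels[n1 - half:n1 + half + 1, n2 - half:n2 + half + 1]
            valid = patch != 0
            if not valid.any():
                continue
            # 6.1: distance map rho(x, E(s)) for the segments seen in the window
            dist_map = np.zeros(patch.shape)
            for s in np.unique(patch[valid]):
                dist_map[patch == s] = np.linalg.norm(x - seg_means[s])
            # 6.2: parabolic weighting, minimum over the window positions
            weighted = np.where(valid, dist_map * parabola, np.inf)
            best = np.unravel_index(np.argmin(weighted), weighted.shape)
            # 6.3: accept the winning segment only if it is closer than eps
            if dist_map[best] < eps:
                refined[n1, n2] = patch[best]
    return refined
```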
4 Experimental evaluation

The proposed algorithm has been used for the analysis of agricultural fields. The input images were acquired by the UK DMC sensor over the Samara region with a resolution of 22 meters per pixel. The test images were obtained during the period from the 1st of April till the 15th of June 2012.

The aim of our experimental research was to determine which parameter values deliver the best performance of the method. The following groups of factors may significantly affect segmentation quality. The first group comprises images that are partially covered by clouds and cloud shadows. In these images the clouds and shadows are always distinguished as separate segments, which leads to over-segmentation. This error is irremovable, therefore only cloudless images should be used. The second group comprises images of fields with rough relief. If there are ditches, channels or ravines, or the slope and curvature are high, it is difficult to perform the segmentation properly. The last group includes rough-textured fields. A non-hybrid field usually has a rough surface texture because of agricultural management activities; in this case there can be regular lines or spots in the images inside the field borders. If the texture elements are big enough, they may also lead to over-segmentation. Thus, the results of the algorithm are highly dependent on the input data, and the algorithm cannot be applied without calibration.

For agricultural field border monitoring we have used the infrared, red and green spectral channels. The features were the local means and variances of each spectral channel. We have used a sample set of 120 images covering the typical problem cases listed above (30 images per category) and 30 images with a simple non-hybrid structure (as an example of good conditions for the algorithm). The ideal segmentation has been prepared manually. Examples of test images are presented in Fig. 1.

Fig. 1. Examples of test images for the different categories: a) hybrid field, b) non-hybrid field with complex relief, c) simple non-hybrid field, d) textured non-hybrid field. The blue border is the ideal segmentation contour.

The quality of segmentation for each image has been estimated by means of the Q criterion described above. We have tested the algorithm parameters in the following ranges: ε from 0.1 to 2 with step 0.1, and W from 3 to 15 with step 2. To inspect the relationship between each category and the quality of segmentation, we have assigned an index g to each of the four image categories: g = 1 for the hybrid field, g = 2 for the non-hybrid field with complex relief, g = 3 for the simple non-hybrid field, g = 4 for the textured non-hybrid field.

The index of an image within a particular group is denoted as j, and there are J = 30 images per category. For all pairs of parameters W and ε, the mean and the standard deviation of Q per group and for the whole image set have been calculated using the formulas:

$$m_{Q_g}(W, \varepsilon) = \frac{1}{J}\sum_{j=1}^{J} Q_{gj}(W, \varepsilon), \quad (11)$$

$$\sigma_{Q_g}(W, \varepsilon) = \sqrt{\frac{1}{J}\sum_{j=1}^{J}\left(Q_{gj}(W, \varepsilon) - m_{Q_g}(W, \varepsilon)\right)^2}, \quad (12)$$

$$m_Q(W, \varepsilon) = \frac{1}{GJ}\sum_{g=1}^{G}\sum_{j=1}^{J} Q_{gj}(W, \varepsilon), \quad (13)$$

$$\sigma_Q(W, \varepsilon) = \sqrt{\frac{1}{GJ}\sum_{g=1}^{G}\sum_{j=1}^{J}\left(Q_{gj}(W, \varepsilon) - m_Q(W, \varepsilon)\right)^2}. \quad (14)$$

For each category we have selected the set of optimal parameter pairs $P_{opt_g}$ as the pairs that fulfill the following condition:

$$P_{opt_g} = \{(W, \varepsilon) : m_{Q_g}(W, \varepsilon) \ge m_g - \sigma_g\}, \quad (15)$$

where $m_g = \max_{(W, \varepsilon)} m_{Q_g}(W, \varepsilon)$, $\sigma_g = \max_{(W, \varepsilon) \in P_g} \sigma_{Q_g}(W, \varepsilon)$ and $P_g = \{(W, \varepsilon) : m_{Q_g}(W, \varepsilon) = m_g\}$. (A code sketch of this selection rule is given at the end of this section.) The values of $m_g$ and $\sigma_g$ for each category are shown in Table 1.

Table 1. Values of m_g and σ_g for each image category

Category                               g   m_g    σ_g
hybrid field                           1   0.945  0.031
non-hybrid field with complex relief   2   0.990  0.012
simple non-hybrid field                3   0.986  0.014
textured non-hybrid field              4   0.989  0.007

The optimal parameter pairs for the whole image set have been found as the intersection of the optimal parameter sets $P_{opt_g}$ of all categories. Table 2 contains the parameters that are optimal for the whole sample set together with the average segmentation quality measure $m_Q(W, \varepsilon)$ for them; $m_{Q_1}(W, \varepsilon)$ is the average value of the segmentation quality measure for the hybrid field samples only.

Table 2. Optimal pairs of parameters

W   ε    m_Q(W, ε)   m_Q1(W, ε)
5   0.5  0.969       0.931
5   0.6  0.971       0.931
5   0.8  0.967       0.914

Fig. 2 illustrates the segmentation results with the best pair of parameters W = 5 and ε = 0.6 for the images from Fig. 1.

Fig. 2. Segmentation results with the best pair of parameters W = 5 and ε = 0.6 for the images from Fig. 1.
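As referenced after Eq. (15), a possible implementation of the per-category statistics of Eqs. (11) and (12) and of the selection rule of Eq. (15), including the final intersection over categories, is sketched below. The layout of the precomputed quality values $Q_{gj}(W, \varepsilon)$ as a dense array and the function name are our assumptions.

```python
import numpy as np

def optimal_parameters(Q, params):
    """Parameter selection following Eqs. (11), (12) and (15).

    Q      -- array of shape (G, J, P) with the quality values Q_gj(W, eps)
              for G categories, J images per category and P parameter pairs
    params -- list of the P parameter pairs (W, eps) in the same order
    Returns the set of pairs that are optimal for every category at once.
    """
    m_Qg = Q.mean(axis=1)      # Eq. (11), shape (G, P)
    s_Qg = Q.std(axis=1)       # Eq. (12), shape (G, P)

    common = None
    for g in range(Q.shape[0]):
        m_g = m_Qg[g].max()                        # best per-category mean
        sigma_g = s_Qg[g][m_Qg[g] == m_g].max()    # std at the best pair(s)
        # Eq. (15): pairs whose mean is within sigma_g of the best mean
        P_opt_g = {p for p, m in zip(params, m_Qg[g]) if m >= m_g - sigma_g}
        common = P_opt_g if common is None else common & P_opt_g
    return common
```

Evaluating $m_Q(W, \varepsilon)$ of Eq. (13) for the returned pairs yields the values reported in Table 2.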
5 Conclusion

In this article we have proposed an algorithm for the segmentation of particular objects. It is based on two steps: over-segmentation and merging. The quality of the segmentation depends on the similarity threshold and on the initial degree of over-segmentation, which is controlled by the window size. The algorithm has been tested on agricultural field images, and the best pair of parameters has been selected. The control parameters allow the algorithm to be applied to other kinds of data with different resolution and contrast properties, but the calibration must be carried out in advance.

Acknowledgements

This study was financially supported by RFBR projects 16-37-00043 mol_a and 16-29-09494 ofi_m.

References

1. Narkhede, H.P.: Review of image segmentation techniques. International Journal of Science and Modern Engineering 1(8), 54–61 (2013)
2. Blaschke, T., et al.: Object-oriented image processing in an integrated GIS/remote sensing environment and perspectives for environmental applications. Environmental Information for Planning, Politics and the Public 2, 555–570 (2000)
3. Rydberg, A., Borgefors, G.: Integrated method for boundary delineation of agricultural fields in multispectral satellite images. IEEE Transactions on Geoscience and Remote Sensing 39(11), 2514–2520 (2001)
4. Mueller, M., Segl, K., Kaufmann, H.: Edge- and region-based segmentation technique for the extraction of large, man-made objects in high-resolution satellite imagery. Pattern Recognition 37(8), 1619–1628 (2004)
5. Blaschke, T., Burnett, C., Pekkarinen, A.: Image segmentation methods for object-based analysis and classification. In: Remote Sensing Image Analysis: Including the Spatial Domain, pp. 211–236 (2006)
6. Basu, S., Ganguly, S., Mukhopadhyay, S., et al.: DeepSat: a learning framework for satellite imagery. In: Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems (2015)
7. Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv preprint arXiv:1606.00915 (2016)
8. Sonka, M., Hlavac, V., Boyle, R.: Image Processing, Analysis, and Machine Vision. Cengage Learning (2014)
9. Weidner, U.: Contribution to the assessment of segmentation quality for remote sensing applications. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 37(B7), 479–484 (2008)