                 Land Cover Semantic Segmentation Using ResUNet
                 Vasilis Pollatos                                       Loukas Kouvaras                                 Eleni Charou
               vaspoll97@gmail.com                                      Harokopio University                       exarou@iit.demokritos.gr
                       NTUA                                               Athens, Greece                              NCSR Demokritos
                  Athens, Greece                                                                                        Athens, Greece
ABSTRACT
In this paper we present our work on developing an automated system for land cover
classification. The system takes a multiband satellite image of an area as input and outputs
the land cover map of the area at the same resolution as the input. For this purpose,
convolutional machine learning models were trained on the task of predicting the land cover
semantic segmentation of satellite images. This is a case of supervised learning. The land
cover label data were taken from the CORINE Land Cover (CLC) inventory and the satellite
images were taken from the Copernicus hub. As for the model, U-Net architecture variations
were applied. Our area of interest is the Ionian islands (Greece). We created a dataset from
scratch covering this particular area. In addition, transfer learning from the BigEarthNet
dataset [1] was performed. In [1], simple classification of satellite images into the classes
of CLC is performed, but not segmentation as we do. However, their models have been trained
on a dataset much bigger than ours, so we applied transfer learning using their pretrained
models as the first part of our network, utilizing the ability these networks have developed
to extract useful features from the satellite images (we transferred a pretrained ResNet50
into a U-Res-Net). Apart from transfer learning, other techniques were applied in order to
overcome the limitations set by the small size of our area of interest. We used data
augmentation (cutting images into overlapping patches, applying random transformations such
as rotations and flips) and cross validation. The results are tested on the 3 CLC class
hierarchy levels and a comparative study is made on the results of different approaches.

KEYWORDS
LULC, U-NET, deep learning, transfer learning, Ionio

1    INTRODUCTION
Modern AI technologies, such as deep learning, can be utilized in various fields of natural
science to automate and underpin procedures traditionally carried out by humans. Remote
sensing nowadays provides a great amount of high quality data which are updated on a daily
basis. Equally important, these data are easily produced and are open to the public, in
contrast to other sources such as aerial photography, which are of higher quality but are
more expensive and less massively produced. For some problems (in our case land cover
recognition) the resolution of the open remote sensing data (10m for Sentinel-2) is adequate.
The big data of remote sensing can be fed into machine learning models to develop automated
systems that analyse this data and carry out useful tasks. Labeled data are the most useful
ones, as they can be utilised for the purposes of supervised learning, which solves a great
range of problems.
    CLC provides a huge labeled dataset. It contains maps for most of Europe for the last
three decades. Our goal is to train models to predict the labels of the CLC dataset. Most
research done in this field is about assigning one or more land cover labels to a whole
satellite image patch (which can cover an area of several square kilometres). Our approach to
the problem is more general: we try to construct a semantic segmentation of the satellite
image into the full range of the land cover classes provided by the Corine Land Cover
inventory, at the maximal resolution provided by Sentinel-2 satellite images, which is 10m.
The classes of CLC are hierarchical. We are testing the ability of the models to predict the
classes on each one of the hierarchical levels. As expected, we see that the superclasses on
the higher levels are discriminated with greater accuracy than the subclasses on the lower
levels.
    Corine Land Cover has a wide variety of applications, underpinning various Community
policies in the domains of environment, but also agriculture, transport and spatial planning.
Developing a system that automates the production of CLC maps to some extent is important
because CLC needs to be updated every few years. Creating these maps is a burdensome and
time-consuming job for humans, and even so the accuracy of the produced maps isn't perfect.
An automatic land cover classification system could help develop such maps in the future and
track down sudden or short term changes that happen to the land cover (for example due to
natural disasters or due to fast track rural and urban development). It could also be applied
to areas that are not included in the CLC.
    State of the art deep learning models were used and the training and testing were done
in the area of Ionio. This is a case of work on a relatively small area with special
geological and natural features. It is also an area of varying morphology and landscapes and
of small scale land cover characteristics that can hardly be detected at the resolution
provided by Sentinel-2 images. Similar approaches can be used for training and testing in
other areas covered by the Sentinel-2 satellites. As a first step we trained a simple U-Net
from scratch in the area of interest. Recently, similar research was carried out at TU Berlin,
developing the BigEarthNet. They perform simple classification of satellite images into the
classes of CLC, but not segmentation as we do. However, their models have been trained on a
dataset much bigger than ours, so we applied transfer learning using their pretrained models
as the first part of our network, utilizing the ability these networks have developed to
extract useful features from the satellite images (we transferred a pretrained ResNet50 into
a U-Res-Net). Apart from transfer learning, other techniques were applied in order to
overcome the limitations set by the small size of our area of interest. We used data
augmentation (cutting images
into overlapping patches, applying random transformations such as rotations and flips) and
cross validation.

Land Cover Recognition gathers a lot of interest in the research community. In our work we
apply transfer learning from the models trained on BigEarthNet [1]. The BigEarthNet dataset
contains 590,326 non-overlapping image patches of size 1200m × 1200m distributed over 10
European countries (Austria, Belgium, Finland, Ireland, Kosovo, Lithuania, Luxembourg,
Portugal, Serbia, Switzerland). Each image patch is annotated by multiple land-cover classes
(i.e., multi-labels) that are provided from the CORINE Land Cover database of the year 2018
(CLC 2018). They train models that take each patch as input and predict the classes appearing
in this patch. They solve a simpler problem than ours, because the resolution of the output
of their models is 1200m, while the resolution of our predicted maps is 10m. However, their
models have been trained on a dataset much bigger than ours and have learned to extract
useful features from the images (encoding) that are later on decoded to solve their task. We
are using the pretrained encoder of a ResNet-50 trained on BigEarthNet as the encoder part of
a UNet-like architecture to solve our semantic segmentation problem. This approach has also
been adopted by [3]. The UNet architecture was introduced in [21]. ResNetUnet, the
architecture we are using, is commonly used for such problems. In [5] a sophisticated
ResNetUnet that performs multitasking achieved state of the art results for the ISPRS 2D
Potsdam dataset. One of the subproblems solved in this multitasking is finding the class
boundaries, which is also proposed in [10]. However, these methods are applied on high
resolution images of urban areas and may be of little use for our problem. In order to
overcome the limitations set by our small dataset, data augmentation is applied as in [12],
[13], [22]. In our work we used Sentinel-2 bands with 10m resolution and bands with 20m
resolution. Others have used multisource data including optical data and Sentinel-1 radar
measurements [14], [15], [16]. Multi-temporal data viewing the same area on different
timestamps is another approach taken in [16], [17], [18], [19]. In order to deal with missing
labels, active learning [19], [20], self-learning [18] and weakly supervised learning [6],
[7], [8] are performed.

3.1 Dataset
Our dataset was created from multispectral satellite images of the Ionian Islands downloaded
from Copernicus for the period of 2018 and the part of CLC 2018 that covers the Ionian
Islands. CLC vector files were georeferenced together with the Copernicus images, turned into
raster with 10m resolution, and all were clipped to the same bounds, creating tiffs for each
one of the islands. These tiffs were cut into patches of size 1.28 km x 1.28 km (128x128
pixels) with a high degree of overlap. Xdata consists of these patches having the satellite
image bands as features for each pixel and Ydata consists of the corresponding CLC patches.
Our networks are trained to solve the task of predicting the CLC label for each pixel of the
input patch, given the band measurements for each pixel of the input patch. So we are trying
to find a function f such that Ydata = f(Xdata). This is a case of supervised learning. Our
area of interest is distributed over 6 Ionian islands (Corfu, Paxi, Lefkada, Kalamos,
Kefalonia, Zante) and the coast of Parga.

Figure 1: Satellite images on the area of interest (panels: Kalamos, South Corfu, Parga,
North Corfu, Kefalonia, Lefkada, North Zante, Paxi)

    For each area we have the Sentinel-2 10m resolution bands (R, G, B, infrared), the
Sentinel-2 20m resolution bands (b05, b06, b07, b8A, b11 and b12) and the Corine Land Cover
class label for each pixel. In our problem the satellite image bands are the inputs to our
network and the CLC classes the expected output.
    Corine Land Cover classes are organised hierarchically into three levels. Our approach is
to train the models on the full range of the Corine Land Cover classes and then test them on
each level separately.
    The area of interest has to be split into training and test sets. Due to the small size
of our dataset we chose not to use a validation
Figure 2: Structure of our dataset (panels: Kefalonia RGB bands, 10m; Kefalonia infrared
band, 10m; Kefalonia CLC classes, 10m)

Figure 3: CLC color legend

set for the fine tuning of hyperparameters such as the number of epochs. The training process
was stopped when the loss function started to converge and not when it was minimal for the
validation set. We are performing cross validation, so the area of interest has to be divided
into a number of subsets of approximately the same size. The area of interest was partitioned
into the following 6 subsets: 1. north Corfu, 2. south Corfu, 3. west Kefalonia, 4. east
Kefalonia, 5. Lefkada, 6. Paxi+North Zante+Kalamos+Parga. The splitting into training and
validation sets is done 6 times, so that each time a different subset is the validation set
and the remaining 5 are the training set.
    Each area is cut into overlapping patches. The overlaps are a form of data augmentation.
Patch size is 1.28 km x 1.28 km and the hop between adjacent patches is 0.64 km in each
direction (longitude and latitude). Two memory optimisations were applied. Firstly, patches
are stored by defining only their limits in the original satellite image and the cutting is
only performed on dataloading. Secondly, patches containing only sea are discarded (e.g. the
blue square in the right image below). This is a good practice because it turns out that the
models are able to learn to recognise the sea almost perfectly even without those patches. It
also reduces class imbalance, as sea patches are the most frequent ones in our area.
    For the transfer learning experiments the data needed to be standardised using the same
mean and std values as the base model. On dataloading, random flips and rotations were
applied for the purposes of data augmentation.

Figure 4: Cutting the original satellite images into patches. (labels: 1.28 km x 1.28 km
patch; overlapping patches)

3.2     Models
Two different approaches were followed. The first approach was to train a baseline UNet from
scratch on the area of interest. The second approach was to perform transfer learning. The
transfer learning UNet model has a ResNet-50 architecture on the encoder part and the weights
of the encoder are initialised to the values of the weights of a ResNet-50 trained on
BigEarthNet. Figure 5 shows the exact architecture of the transfer learning model. There are
approximately 66,000,000 trainable parameters in this model. A more complex version of this
model, which applied no compression on the outputs of the encoder that were passed to the
decoder through shortcuts, had 91,000,000 trainable parameters and improved fitting on the
training set but didn't seem to generalise better than the model presented in Figure 5.
    The baseline UNet model that was trained from scratch solved an easier problem, as the
output and the ground truth land cover images had a resolution of 100m.

3.3     Training
We are trying to solve a semantic segmentation problem, and a composite criterion of a dice
loss and a binary cross entropy (BCE) loss with logits is used. The two loss criteria are
summed, each one with a weight factor of 0.5. We experimented with positive weights p_c in
the BCE:

\[ \ell_c(x, y) = L_c = \{ l_{1,c}, \dots, l_{N,c} \}^\top, \]
\[ l_{n,c} = -w_{n,c} \left[ p_c \, y_{n,c} \cdot \log \sigma(x_{n,c}) + (1 - y_{n,c}) \cdot \log(1 - \sigma(x_{n,c})) \right] \]

where c is the class number. Setting

\[ p_c = \left( \frac{\text{number of negative samples of class } c}{\text{number of positive samples of class } c} \right)^{a} \]

for different values of a in (0, 1] for class balancing deteriorated our results. The Adam
optimiser is used to fit the training data. The initial learning rate is LR = 5 · 10^-4 and
it gradually decreases with the use of a scheduler. The complexity of our model requires the
use of regularization techniques. We applied dropout, with rate 0-0.2 for the outer layers
and 0.3-0.4 for the inner hidden layers. For the first epochs of the training, the weights of
the base transfer learning model remain frozen. We unfreeze them when the learning process
starts to converge, dropping at the same time the learning rate. As the learning curve
(Figure 7) shows, unfreezing the base model on epoch 80 causes some instability. However,
after some epochs the loss returns to the low values it had before the unfreezing. The
pretrained encoder seems to work properly without further training, but the unfreezing brings
some
slight improvements so we perform it. Training was executed on Google Colab.

Figure 5: Architecture of the transfer learning model

Figure 6: Architecture of the baseline UNet model

Figure 7: Learning curve

3.4    Experiments
Several versions of the problem are being examined. Firstly, training from scratch was done
on the area of interest. The baseline model shown in Figure 6 was used. The produced maps had
a resolution of 100m. The visual results and the metrics for the validation set are presented
below:

Figure 8: Validation results for the model that was trained from scratch (panels: Kefalonia
(target), Kefalonia (prediction))

    The pixel level classification metrics below are measured on the third level of the CLC
class hierarchy for the classes that were found on the validation set.
    accuracy = 0.787, f1_macro = 0.160, f1_micro = 0.787, f1_weighted =
    precision score = [1. 0.198 1. 1. 0. 1. 1. 0.0086 1. 0.2435. 0.32 0.347
0.321 0.727 0.435 0.479 0. 0. 1. 0.0259 1. 0. 0.992 ]
    recall score = [0. 0.304 0. 0. 0. 0. 0. 0.0013 0. 0.1885 0. 0.194 0.4066
0.4337 0.1948 0.665 0.708 0. 1. 0. 0.0025 0. 1. 0.997 ]
    For the transfer learning model a more systematic testing was performed. As mentioned
above, 6-fold cross validation was applied. Metrics and maps are calculated for each one of
the validation folds. The metrics are taken with respect to each one of the 3 CLC class
hierarchical levels separately over all pixels (pixel level).
    The results for each one of the 6 folds are presented below. For the first three we give
the validation scores and the map visualisation and for the other three just the visual
result, for brevity.
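The 6-fold scheme used for these results holds out one geographic subset at a time. A minimal sketch of such a split is given below, assuming the six subsets listed in Section 3.1; the names `REGIONS` and `region_folds` are hypothetical, and the fold order here is arbitrary and need not match the fold numbering of the reports.

```python
# Region names follow the six subsets listed in Section 3.1.
# The list order is an assumption and does not encode the paper's fold numbers.
REGIONS = [
    "north Corfu",
    "south Corfu",
    "west Kefalonia",
    "east Kefalonia",
    "Lefkada",
    "Paxi+North Zante+Kalamos+Parga",
]


def region_folds(regions):
    """Yield (training_regions, validation_region), holding out one region per fold."""
    for i, held_out in enumerate(regions):
        training = regions[:i] + regions[i + 1:]
        yield training, held_out


folds = list(region_folds(REGIONS))
for k, (training, validation) in enumerate(folds, start=1):
    print(f"split {k}: validate on {validation!r}, train on {len(training)} regions")
```

Splitting by region rather than by patch keeps overlapping patches of the same area from appearing in both the training and the validation set.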
Fold 1:

Classification Report (CLC level 2 classes)
    class                                                 support    precision   recall
    1.1 Urban fabric                                      119963     0.5082      0.4122
    1.2 Industrial, commercial and transport units        40513      1           2.468e-05
    1.3 Mine, dump and construction sites                 2591       1           0.0003858
    1.4 Artificial, non-agricultural vegetated            12026      0.0002674   8.315e-05
    2.1 Arable land                                       129947     0.134       0.02698
    2.2 Permanent crops                                   217427     0.123       0.186
    2.3 Pastures                                          234101     0.4023      0.081
    2.4 Heterogeneous agricultural areas                  1341412    0.5679      0.6165
    3.1 Forest                                            590332     0.6192      0.5724
    3.2 Shrub and/or herbaceous vegetation associations   1654002    0.6504      0.7213
    3.3 Open spaces with little or no vegetation          87859      0.2044      0.1991
    4.1 Inland wetlands                                   5416       1           0.0001846
    5.2 Marine waters                                     8933755    0.9974      0.9985
  accuracy = 0.85329
  f1_macro = 0.4124
  f1_micro = 0.85329
  f1_weighted = 0.8522
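The summary scores reported with each classification report (accuracy, f1 macro, f1 micro, f1 weighted) can all be derived from a pixel-level confusion matrix. The sketch below is one way to do so and is not taken from the paper: the function names are hypothetical, the data is a toy example, and undefined ratios are scored as 0, which may differ from the convention behind the reported values (the reports show precision 1 for classes that are almost never recovered).

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def summary_scores(cm):
    """Return accuracy and macro/micro/weighted F1 from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(n))
    f1_scores, supports = [], []
    for c in range(n):
        tp = cm[c][c]
        support = sum(cm[c])                          # pixels whose true class is c
        predicted = sum(cm[r][c] for r in range(n))   # pixels predicted as class c
        # Assumption: a ratio with an empty denominator is scored as 0.
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        supports.append(support)
    # Each misclassified pixel is one false positive and one false negative.
    errors = total - correct
    return {
        "accuracy": correct / total,
        "f1_macro": sum(f1_scores) / n,
        "f1_micro": 2 * correct / (2 * correct + 2 * errors),
        "f1_weighted": sum(s * f for s, f in zip(supports, f1_scores)) / total,
    }


# Toy example with three classes (hypothetical labels, not data from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = summary_scores(confusion_matrix(y_true, y_pred, 3))
print(scores)
```

With one label per pixel, micro-averaged F1 reduces to accuracy, which is why the two values coincide in every report (for example accuracy = 0.85329 and f1 micro = 0.85329). The macro average weights every class equally, so rare classes with poor scores pull it well below the weighted average.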

Figure 9: Validation results for the transfer learning model, west Kefalonia (panels: target,
prediction)

Classification Report (CLC level 1 classes)
    class                              support    precision   recall
    1. Artificial Surfaces             175093     0.52        0.3001
    2. Agricultural areas              1922887    0.7729      0.747
    3. Forest and seminatural areas    2332193    0.8075      0.8537
    4. Wetlands                        5416       1           0.0001846
    5. Water bodies                    8933755    0.9974      0.9985
  accuracy = 0.92753
  f1_macro = 0.67921
  f1_micro = 0.92753
  f1_weighted = 0.92662

CORINE CLASS LEVEL 3 :

Classification Report
    class                                                 support    precision   recall
    1.1.1 Continuous urban                                3443       1           0.0002904
    1.1.2 Discontinuous urban fabric                      116520     0.4734      0.3953
    1.2.1 Industrial or commercial units                  28859      1           3.465e-05
    1.2.3 Port areas                                      2606       1           0.0003836
    1.2.4 Airports                                        9048       1           0.0001105
    1.3.1 Mineral extraction                              2591       1           0.0003858
    1.4.2 Sport and leisure                               12026      0.0002674   8.315e-05
    2.1.1 Non-irrigated arable land                       129947     0.134       0.02698
    2.2.1 Vineyards                                       18644      1           5.363e-05
    2.2.3 Olive groves                                    198783     0.123       0.2034
    2.3.1 Pastures                                        234101     0.4023      0.081
    2.4.2 Complex cultivation patterns                    536157     0.3528      0.4831
    2.4.3 Land principally occupied by agriculture,       805255     0.4042      0.3624
          with significant areas of natural vegetation
    3.1.2 Coniferous forest                               74557      0.002304    1.341e-05
    3.1.3 Mixed forest                                    515775     0.5519      0.5836
    3.2.1 Natural grassland                               401405     0.4262      0.5664
    3.2.3 Sclerophyllous vegetation                       1248545    0.5923      0.6059
    3.2.4 Transitional woodland/shrub                     4052       4.215e-05   0.0002467
    3.3.2 Bare rock                                       27305      1           3.662e-05
    3.3.3 Sparsely vegetated areas                        60554      0.1394      0.1944
    4.1.1 Inland marshes                                  5416       1           0.0001846
    5.2.3 Sea and ocean                                   8933755    0.9974      0.9984
  accuracy = 0.81346
  f1_macro = 0.37624
  f1_micro = 0.81346
  f1_weighted = 0.81504

Fold 2:

CLASS LEVEL 1 :

Classification Report
    class                              support    precision   recall
    1. Artificial Surfaces             70026      0.823       0.4614
    2. Agricultural areas              1004333    0.7099      0.8556
    3. Forest and seminatural areas    3066378    0.9494      0.8895
    4. Wetlands                        -          -           -
    5. Water bodies                    9949503    0.9975      0.9993
  Results for the Wetlands class are omitted due to zero support.
  accuracy = 0.96251
  f1_macro = 0.83431
  f1_micro = 0.96251
  f1_weighted = 0.96409

Figure 10: Validation results for the transfer learning model, east Kefalonia (panel:
prediction)

CORINE CLASS LEVEL 2 :

Classification Report
    class                                                 support    precision   recall
    1.1 Urban fabric                                      56640      0.8264      0.5547
    1.3 Mine, dump and construction sites                 2552       1           0.0003917
    1.4 Artificial, non-agricultural vegetated areas      10834      0.3934      0.04513
    2.1 Arable land                                       21436      0.6883      0.5296
    2.2 Permanent crops                                   209584     0.6226      0.654
    2.3 Pastures                                          37121      0.3445      0.6391
    2.4 Heterogeneous agricultural areas                  736192     0.611       0.7511
    3.1 Forest                                            1116452    0.7122      0.7187
    3.2 Shrub and/or herbaceous vegetation associations   1878253    0.7847      0.7261
    3.3 Open spaces with little or no vegetation          71673      0.8346      0.1002
    5.2 Marine waters                                     9949503    0.9975      0.9993
  accuracy = 0.91362
  f1_macro = 0.6004
  f1_micro = 0.91362
  f1_weighted = 0.91511
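Because CLC codes are hierarchical, level 1 and level 2 scores can be obtained from level 3 labels by collapsing each code to its parent class. The sketch below assumes that this is done by truncating the dotted code (for example 2.4.3 to 2.4 to 2); the paper does not spell out the mapping it used, and the helper names and toy labels are hypothetical.

```python
def to_level(code, level):
    """Truncate a dotted CLC code such as '2.4.3' to its first `level` parts."""
    parts = code.split(".")
    if not 1 <= level <= len(parts):
        raise ValueError(f"cannot reduce {code!r} to level {level}")
    return ".".join(parts[:level])


def collapse(labels, level):
    """Map every label in a sequence to its ancestor at the given level."""
    return [to_level(code, level) for code in labels]


# Toy per-pixel labels (hypothetical, not data from the paper).
y_true = ["2.4.2", "2.4.3", "3.1.3", "5.2.3"]
y_pred = ["2.4.3", "2.2.3", "3.2.3", "5.2.3"]

acc_by_level = {}
for level in (3, 2, 1):
    t, p = collapse(y_true, level), collapse(y_pred, level)
    acc_by_level[level] = sum(a == b for a, b in zip(t, p)) / len(t)

print(acc_by_level)
```

A confusion between sibling classes (here 2.4.2 and 2.4.3) stops counting as an error once both collapse to the same parent, so accuracy cannot decrease when moving to a coarser level. This is consistent with the higher scores reported for the superclasses.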
CORINE CLASS LEVEL 3 :

                          Classification Report
    class                                                 support    precision   recall
    1.1.2 Discontinuous urban fabric                      56640      0.8264      0.5547
    1.3.1 Mineral extraction sites                        2552       1           0.0003917
    1.4.2 Sport and leisure facilities                    10834      0.3934      0.04513
    2.1.1 Non-irrigated arable land                       21436      0.6883      0.5296
    2.2.3 Olive groves                                    209584     0.6226      0.654
    2.3.1 Pastures                                        37121      0.3445      0.6391
    2.4.2 Complex cultivation patterns                    210391     0.704       0.552
    2.4.3 Land principally occupied by agriculture,       525801     0.4826      0.6791
          with significant areas of natural vegetation
    3.1.2 Coniferous forest                               419030     1           2.386e-06
    3.1.3 Mixed forest                                    697422     0.5146      0.8313
    3.2.1 Natural grassland                               252838     0.7181      0.5809
    3.2.3 Sclerophyllous vegetation                       1401252    0.6837      0.7481
    3.2.4 Transitional woodland/shrub                     224163     1           4.461e-06
    3.3.2 Bare rock                                       13180      1           7.587e-05
    3.3.3 Sparsely vegetated areas                        58493      0.86        0.1228
    5.2.3 Sea and ocean                                   9949503    0.9975      0.9993
  accuracy = 0.88019
  f1 macro = 0.559
  f1 micro = 0.88019
  f1 weighted = 0.89214

Fold 3:

CLASS LEVEL 1 :

                          Classification Report
    class                                                 support    precision   recall
    1. Artificial surfaces                                176535     0.7811      0.3617
    2. Agricultural areas                                 1869077    0.8004      0.6425
    3. Forest and seminatural areas                       1861405    0.7135      0.8799
    4. Wetlands                                           12270      0.005529    0.0006519
    5. Water bodies                                       5354057    0.9907      0.9982
  accuracy = 0.88929
  f1 macro = 0.61471
  f1 micro = 0.88929
  f1 weighted = 0.89035

CORINE CLASS LEVEL 2 :

                          Classification Report
    class                                                 support    precision   recall
    1.1 Urban fabric                                      108422     0.6693      0.5016
    1.2 Industrial, commercial and transport units        3147       1           0.0003177
    1.3 Mine, dump and construction sites                 4267       1           0.0002343
    1.4 Artificial, non-agricultural vegetated areas      60699      0.002079    1.647e-05
    2.1 Arable land                                       108101     0.2607      0.003996
    2.2 Permanent crops                                   552365     0.7296      0.0475
    2.3 Pastures                                          166819     0.001908    5.994e-06
    2.4 Heterogeneous agricultural areas                  1041792    0.4505      0.6323
    3.1 Forest                                            393277     0.2725      0.4209
    3.2 Shrub and/or herbaceous vegetation associations   1417396    0.6113      0.7278
    3.3 Open spaces with little or no vegetation          50732      0.2842      0.002089
    4.2 Coastal wetlands                                  12270      0.005533    0.0006519
    5.2 Marine waters                                     5354057    0.9907      0.9982
  accuracy = 0.78517
  f1 macro = 0.37775
  f1 micro = 0.78517
  f1 weighted = 0.78474
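The reports list macro, micro and weighted F1 scores without stating how the averages are formed. The sketch below recomputes them from the per-class precision, recall and support of the CLASS LEVEL 1 report above whose accuracy is 0.88929. It compares two common conventions: taking the harmonic mean of the averaged precision and averaged recall, or averaging the per-class F1 scores. That the reported values follow the first convention is an inference from the numbers, not something stated in the text, and the helper names are illustrative. In single-label classification the micro-averaged F1 coincides with accuracy, which is why those two figures are identical in every report.

```python
# Per-class values copied from the CLASS LEVEL 1 report (accuracy = 0.88929).
precision = [0.7811, 0.8004, 0.7135, 0.005529, 0.9907]
recall = [0.3617, 0.6425, 0.8799, 0.0006519, 0.9982]
support = [176535, 1869077, 1861405, 12270, 5354057]


def harmonic_mean(p, r):
    """F1 of a single precision/recall pair (0 when both are 0)."""
    return 2 * p * r / (p + r) if p + r > 0 else 0.0


def average(values, weights=None):
    """Plain mean, or support-weighted mean when weights are given."""
    if weights is None:
        return sum(values) / len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Convention A (assumed to match the report): average precision and recall
# first, then take the harmonic mean of the two averages.
f1_macro_a = harmonic_mean(average(precision), average(recall))
f1_weighted_a = harmonic_mean(average(precision, support),
                              average(recall, support))

# Convention B: compute F1 per class, then average the per-class scores.
per_class_f1 = [harmonic_mean(p, r) for p, r in zip(precision, recall)]
f1_macro_b = average(per_class_f1)

# Support-weighted recall is the overall pixel accuracy (and the micro F1).
accuracy = average(recall, support)

print(f"macro F1, convention A:    {f1_macro_a:.4f}")
print(f"macro F1, convention B:    {f1_macro_b:.4f}")
print(f"weighted F1, convention A: {f1_weighted_a:.4f}")
print(f"accuracy (= micro F1):     {accuracy:.4f}")
```

Convention B penalises the near-zero recall of the Wetlands class more strongly, so it gives a lower macro score than the reported one. Small differences in the last digits are expected because the inputs are rounded to four significant figures.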

             target                         prediction

Figure 11: Validation results for the transfer learning model,

CORINE CLASS LEVEL 3 :

                          Classification Report
    class                                                 support    precision   recall
    1.1.2 Discontinuous urban fabric                      108422     0.6693      0.5016
    1.2.3 Port areas                                      3147       1           0.0003177
    1.3.1 Mineral extraction sites                        4267       1           0.0002343
    1.4.2 Sport and leisure facilities                    60699      0.002079    1.647e-05
    2.1.1 Non-irrigated arable land                       108101     0.2607      0.003996
    2.2.1 Vineyards                                       7541       1           0.0001326
    2.2.3 Olive groves                                    544824     0.7296      0.04818
    2.3.1 Pastures                                        166819     0.001908    5.994e-06
    2.4.2 Complex cultivation patterns                    244647     0.4023      0.2524
    2.4.3 Land principally occupied by agriculture,       797145     0.3345      0.5491
          with significant areas of natural vegetation
    3.1.1 Broad-leaved forest                             20546      1           4.867e-05
    3.1.2 Coniferous forest                               137230     0.000408    7.287e-06
    3.1.3 Mixed forest                                    235501     0.1906      0.4898
    3.2.1 Natural grassland                               150340     0.49        0.5003
    3.2.3 Sclerophyllous vegetation                       1197070    0.5248      0.6725
    3.2.4 Transitional woodland/shrub                     69986      1           1.429e-05
    3.3.1 Beaches, dunes, sands                           12323      0.2842      0.008601
    3.3.3 Sparsely vegetated areas                        38409      1           2.603e-05
    4.2.1 Salt marshes                                    2929       0.003458    0.001706
    4.2.2 Salines                                         9341       1           0.000107
    5.2.1 Coastal lagoons                                 85666      0.7482      0.2571
    5.2.3 Sea and ocean                                   5268391    0.98        0.998
  accuracy = 0.73935
  f1 macro = 0.32759
  f1 micro = 0.73935
  f1 weighted = 0.74674

Fold 4:

             target                         prediction

Figure 12: Validation results for the transfer learning model,
North Corfu

Fold 5:

             target                         prediction

Figure 13: Validation results for the transfer learning model,
South Corfu

Fold 6:

             target                         prediction

Figure 14: Validation results for the transfer learning model
(north Zante, Parga, Paxoi, Kalamos)

   In the experiments presented above, we held out a continuous area,
for example a whole island, as the validation set. We now present a
different approach in which the validation patches are randomly
distributed over the area of interest. This is also a realistic
scenario: experts sparsely assign land cover labels within the area of
interest, and the remaining unlabeled areas are predicted by a model
trained on the neighbouring labeled ones. To make sure that the
training and validation sets share no elements, we skipped data
augmentation via overlapping patches, but flips and rotations are still
used. We split the area of interest into training and validation sets
with a 70/30 ratio.
   The validation metrics are presented below.

CLASS LEVEL 1 :

                          Classification Report
    class                                                 support    precision   recall
    1. Artificial surfaces                                113717     0.8639      0.06163
    2. Agricultural areas                                 1349974    0.8457      0.8162
    3. Forest and seminatural areas                       1291539    0.8078      0.8998
    4. Wetlands                                           7743       1           0.0001291
    5. Water bodies                                       743203     0.9744      0.9916
  accuracy = 0.85792
  f1 macro = 0.68526
  f1 micro = 0.85792
  f1 weighted = 0.85891

CORINE CLASS LEVEL 2 :

                          Classification Report
    class                                                 support    precision   recall
    1.1 Urban fabric                                      77380      0.855       0.08965
    1.2 Industrial, commercial and transport units        4145       1           0.0002412
    1.3 Mine, dump and construction sites                 1825       1           0.0005476
    1.4 Artificial, non-agricultural vegetated areas      30367      1           3.293e-05
    2.1 Arable land                                       55779      1           0.00285
    2.2 Permanent crops                                   445942     0.523       0.6887
    2.3 Pastures                                          53487      1           1.87e-05
    2.4 Heterogeneous agricultural areas                  794766     0.5563      0.5009
    3.1 Forest                                            270234     0.6143      0.5493
    3.2 Shrub and/or herbaceous vegetation associations   949394     0.65        0.8134
    3.3 Open spaces with little or no vegetation          71911      0.5789      0.07298
    4.1 Inland wetlands                                   4506       1           0.0002219
    4.2 Coastal wetlands                                  3237       1           0.0003088
    5.2 Marine waters                                     743203     0.9744      0.9916
  accuracy = 0.67743
  f1 macro = 0.40289
  f1 micro = 0.67743
  f1 weighted = 0.68708

CORINE CLASS LEVEL 3 :

                          Classification Report
    class                                                 support    precision   recall
    1.1.1 Continuous urban fabric                         7396       1           0.0001352
    1.1.2 Discontinuous urban fabric                      69984      0.855       0.09912
    1.2.1 Industrial or commercial units                  2855       1           0.0003501
    1.2.3 Port areas                                      669        1           0.001493
    1.2.4 Airports                                        621        1           0.001608
    1.3.1 Mineral extraction sites                        1825       1           0.0005476
    1.4.1 Green urban areas                               207        1           0.004808
    1.4.2 Sport and leisure facilities                    30160      1           3.316e-05
    2.1.1 Non-irrigated arable land                       55779      1           0.00285
    2.2.1 Vineyards                                       10122      1           9.878e-05
    2.2.2 Fruit trees and berry plantations               9040       1           0.0001106
    2.2.3 Olive groves                                    426780     0.5224      0.7187
    2.3.1 Pastures                                        53487      1           1.87e-05
    2.4.2 Complex cultivation patterns                    267127     0.3909      0.5091
    2.4.3 Land principally occupied by agriculture,       527639     0.471       0.3283
          with significant areas of natural vegetation
    3.1.1 Broad-leaved forest                             12029      1           8.313e-05
    3.1.2 Coniferous forest                               76451      1           0.09269
    3.1.3 Mixed forest                                    181754     0.4287      0.5532
    3.2.1 Natural grassland                               174214     0.4769      0.741
    3.2.3 Sclerophyllous vegetation                       701163     0.5691      0.7445
    3.2.4 Transitional woodland/shrub                     74017      1           1.351e-05
    3.3.1 Beaches, dunes, sands                           7253       1           0.0001379
    3.3.2 Bare rock                                       7519       1           0.000133
    3.3.3 Sparsely vegetated areas                        57139      0.5789      0.09184
    4.1.1 Inland marshes                                  4506       1           0.0002219
    4.2.1 Salt marshes                                    1788       1           0.000559
    4.2.2 Salines                                         1449       1           0.0006897
    5.2.1 Coastal lagoons                                 25642      1           3.9e-05
    5.2.3 Sea and ocean                                   717561     0.9416      0.9925
  accuracy = 0.59871
  f1 macro = 0.28225
  f1 micro = 0.59871
  f1 weighted = 0.62438

   Finally, we present some examples that show the performance of our
model. All the predictions presented are on validation data. The number
at the top of each image on the left indicates the fold number (6-fold
cross validation).
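The random 70/30 patch split with flip and rotation augmentation, used for the last set of metrics above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `split_patches` and `dihedral_augment` are hypothetical, the patches are assumed to be non-overlapping square tiles, and "flips and rotations" is read as 90-degree rotations combined with a horizontal flip (the text does not specify which transforms were used).

```python
import numpy as np


def split_patches(n_patches, val_fraction=0.3, seed=0):
    """Randomly assign patch indices to train and validation sets.

    Patches are assumed to be non-overlapping tiles, so no pixel can end
    up in both sets.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_patches)
    n_val = int(round(val_fraction * n_patches))
    return order[n_val:], order[:n_val]  # (train indices, validation indices)


def dihedral_augment(image, label):
    """Yield the 8 rotation/flip variants of a square patch.

    image: array of shape (H, W, C); label: array of shape (H, W).
    The same transform is applied to both so pixels stay aligned.
    """
    for k in range(4):
        img_rot = np.rot90(image, k, axes=(0, 1))
        lab_rot = np.rot90(label, k, axes=(0, 1))
        yield img_rot, lab_rot
        yield np.flip(img_rot, axis=1), np.flip(lab_rot, axis=1)


# Toy usage: 100 patches, one tiny 2x2 patch with 3 bands.
train_idx, val_idx = split_patches(100)
patch = np.arange(12).reshape(2, 2, 3)
mask = np.arange(4).reshape(2, 2)
variants = list(dihedral_augment(patch, mask))
print(len(train_idx), len(val_idx), len(variants))
```

Augmentation would be applied to the training patches only, which keeps the validation set free of transformed copies of training data.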
Figure 15: Finding uncommon classes (villages and marshes)

Figure 16: Examples of correct prediction of land cover classes and of
agreement between the predicted and the labeled class boundaries.

Figure 17: Examples that show inaccuracies of the CORINE land cover
that were corrected by our model.

   In some of the above examples we see the difficulty of our problem,
which derives from the low resolution of the input images and the
ambiguity of the CORINE labels. In some cases the model made the right
predictions, even though the task is difficult even for a human
observing the RGB input image.

4   CONCLUSION
     • Our models provide a basis for the creation of land cover
       maps based on the CLC nomenclature. The visual results
       show the ability of our models to find the boundaries
       between classes, and the accuracy on the higher levels of the
       class hierarchy is good (for example, a Level 1 accuracy of
       0.89 on fold 3 and 0.86 on the random split). The accuracy on
       common subclasses is also good. However, performance in
       predicting uncommon classes and in discriminating between
       subclasses of the same superclass at the lower levels of the
       CLC class hierarchy is not adequate, and human supervision
       may be needed for this task.
     • The CLC dataset contains imperfections, which limit the
       accuracy of our models. However, where the quality of the
       dataset is below its average, the model can in some cases be
       more accurate than the labels themselves.
     • The land cover is usually mixed or cannot be described
       accurately by the existing CLC classes. This leads to
       disagreement between the labeled data and the predictions,
       even for kinds of land cover that have been seen in the
       training set. We also observe that sometimes several class
       labels could describe the land cover, so that the model output
       and the labels are close to each other despite their apparent
       disagreement. This indicates the need for a more
       sophisticated loss criterion and for performance metrics that
       give different penalties to different types of confusion
       between classes, taking into account the hierarchical
       structure of the classes and the similarities and overlaps
       between them.
     • Increasing the output resolution from 100 m to 10 m can
       give better results, but requires bigger models (more
       parameters).
     • The main contribution of transfer learning was speeding up
       the training process and possibly improving the results.
       The encoder part of the network did not have to be trained,
       at least for the first epochs of training, which decreased
       the epoch duration.
     • Using a bigger dataset could boost the performance of our
       models in the area of interest, especially in the task of
       predicting uncommon classes.
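One possible reading of the hierarchy-aware penalty suggested in the conclusions is sketched below. It scores a confusion between two Level 3 CLC codes by how far up the class hierarchy one has to go before they share an ancestor. This is only an illustration of the idea, not a method used in this work: the function names and the linear 0, 1/3, 2/3, 1 scale are assumptions.

```python
def shared_levels(code_a, code_b):
    """Number of leading hierarchy levels two CLC codes have in common."""
    shared = 0
    for part_a, part_b in zip(code_a.split("."), code_b.split(".")):
        if part_a != part_b:
            break
        shared += 1
    return shared


def hierarchical_penalty(target, predicted, depth=3):
    """Penalty in [0, 1]: 0 for a correct label, 1 for a different Level 1 class.

    Assumes both codes are Level 3 codes such as "2.4.3".
    """
    return (depth - shared_levels(target, predicted)) / depth


def mean_penalty(targets, predictions):
    """Average hierarchical penalty over paired target/prediction codes."""
    pairs = list(zip(targets, predictions))
    return sum(hierarchical_penalty(t, p) for t, p in pairs) / len(pairs)


# Confusing two heterogeneous agricultural subclasses costs less than
# confusing an agricultural class with a shrub class.
print(hierarchical_penalty("2.4.2", "2.4.3"))  # same Level 2 class
print(hierarchical_penalty("2.4.2", "2.2.3"))  # same Level 1 class only
print(hierarchical_penalty("2.4.2", "3.2.3"))  # different Level 1 class
```

The same penalties could be arranged as a cost matrix and used to weight either an evaluation metric or the training loss. Modelling similarities between classes that sit in different branches of the hierarchy would need additional, hand-specified costs.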

