<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Segmentation Using Deep Learning: Insights from the LICAID Dataset</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Girma Tariku</string-name>
          <email>g.tariku@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Isabella Ghiglieno</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna Simonetto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianni Gilioli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Serina</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Agrofood Research Hub, University of Brescia - Department of Civil, Environmental, Architectural Engineering and Mathematics</institution>
          ,
          <addr-line>via Branze 43, 25123, Brescia</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Brescia - Department of Information Engineering (DII)</institution>
          ,
          <addr-line>via Branze 38, 25123, Brescia</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Land cover mapping is critical for monitoring global land use patterns, assessing ecosystem health, and supporting conservation efforts. However, challenges persist in handling large satellite imagery datasets and acquiring specialized aerial datasets for deep-learning models. To address these challenges, this study introduces a methodology for semantic segmentation of land cover in agricultural regions, specifically tailored to the wine-growing region of Franciacorta, Italy. We present the "Land Cover Aerial Imagery" (LICAID) dataset and employ the advanced deep learning model DeepLabV3 with various pre-trained backbones (ResNet, DenseNet, and EfficientNet) for comparative analysis. The dataset comprises eleven land cover classes: grasslands, arable land, herb-dominated habitats, hedgerows, vineyards, tree-dominated man-made habitats, olive groves, wetlands, lines of planted trees, small anthropogenic forests, and others. Results demonstrate significant performance improvements in land cover classification using deep learning with pre-trained networks, providing a scalable and cost-effective approach to land cover mapping that supports environmental monitoring and conservation.</p>
      </abstract>
      <kwd-group>
        <kwd>Land cover mapping</kwd>
        <kwd>Semantic segmentation</kwd>
        <kwd>Deep learning</kwd>
        <kwd>Satellite imagery</kwd>
        <kwd>Pre-trained backbone</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Accurate and timely land cover mapping is essential for understanding the complex interactions
of land use patterns and their profound impact on global ecosystems [2]. This is particularly
important in agricultural regions, where land management practices directly influence biodiversity,
ecosystem services, and the overall sustainability of food production systems [3, 4, 5, 6, 7]. Inaccurate
or outdated land cover data can have widespread impacts, affecting policy decisions related to
resource allocation, environmental protection, and agricultural planning. Effective land management
strategies rely on accurate land cover assessments to maximize resource utilization, minimize
environmental risks, and promote sustainable agricultural practices. For example, accurate
identification of various crop types enables targeted interventions to improve yields and reduce the
need for chemical inputs. Similarly, accurate mapping of natural habitats within agricultural
landscapes is crucial for biodiversity conservation and the maintenance of ecosystem services such
as pollination and pest control.</p>
      <p>Traditional land cover classification methods, such as support vector machines and random
forests [8, 9, 10, 11, 12, 13], often struggle to effectively process the high-resolution images readily
available from modern remote sensing platforms. These methods face challenges in handling the
large datasets generated by modern sensors, leading to computational limitations and increased
processing times [14, 15, 16, 17]. Furthermore, the inherent complexity of agricultural landscapes
characterized by significant spatial diversity and subtle spectral variations between land types often
compromises the accuracy and reliability of traditional classification approaches. The limitations are
particularly evident in areas with complex land use patterns and overlapping spectral signatures,
resulting in misclassifications and inaccurate maps. This highlights the need for more advanced
technologies capable of efficiently processing high-resolution images and large-scale datasets while
maintaining high accuracy.</p>
      <p>
        Recent advances in deep learning offer significant potential for improving the accuracy and
efficiency of land cover mapping [
        <xref ref-type="bibr" rid="ref12 ref2 ref3 ref4 ref9">18, 19, 20, 21, 22, 27, 30</xref>
        ]. However, a major bottleneck remains:
the scarcity of readily available, high-quality, and cost-effective datasets tailored to specific
agricultural contexts [
        <xref ref-type="bibr" rid="ref13 ref14 ref15 ref5 ref6 ref7">23, 24, 25, 31, 32, 33</xref>
        ]. The development of robust and accurate deep learning
models for land cover classification relies heavily on large, well-annotated datasets that accurately
reflect the diversity and complexity of the target environment. The high costs and time investment
associated with creating such datasets often hinder research and limit the widespread adoption of
deep learning in agricultural applications. This data shortage has impeded progress in accurate
agricultural practices and sustainable land management, underscoring the urgent need for a readily
accessible and representative dataset.
      </p>
      <p>This paper presents an approach to address the problem of data scarcity in land cover
classification. We introduce a new semantic segmentation dataset, Land Cover Aerial Imagery
(LICAID), specifically designed for the wine-growing region of Franciacorta, Italy. LICAID includes
eleven land cover classes representing the diverse landscapes surrounding vineyards: grasslands,
arable land, herb-dominated habitats, hedgerows, vineyards, tree-dominated man-made habitats,
olive groves, wetlands, lines of planted trees, small anthropogenic forests, and others. This detailed
classification allows for a more precise understanding of the complex interactions within the
vineyard ecosystem and its surrounding environment.</p>
      <p>Our methodology focuses on cost-effective data acquisition and processing, making it a replicable
approach for other agricultural regions. We evaluate the performance of the state-of-the-art deep
learning model DeepLabV3, utilizing ResNet, DenseNet, and EfficientNet backbones, on the LICAID
dataset. This comparative analysis demonstrates the potential of our approach to enhance the
accuracy and efficiency of land cover mapping in agricultural settings, ultimately supporting
sustainable land management and environmental monitoring efforts.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Materials and Methods</title>
      <p>
        This study applies the DeepLabV3 model within a four-step semantic segmentation workflow for the
Franciacorta wine region. The process comprises remote sensing data collection, rigorous image
preprocessing, expert validation, and training and evaluation of DeepLabV3 models. The focus on
DeepLabV3 allows for a deeper exploration of its implementation and optimization in this specific
application context [
        <xref ref-type="bibr" rid="ref16">34</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>2.1. Study Area</title>
        <p>The study focuses on the famous Italian wine-growing region of Franciacorta in Lombardy
(Figure 1). Located in the picturesque province of Brescia, Franciacorta is renowned for its stunning
landscapes, rich history, and world-class wine production.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Image Preprocessing, Segmentation, and Expert Validation</title>
        <p>Satellite imagery for the study was acquired through Google Earth Pro, which provided
high-resolution imagery of the area. The images were downloaded in .kmz format, which includes both
the visual data and spatial attributes, forming the foundation for further analysis. Following
acquisition, ArcGIS was used to georeference each image, ensuring alignment with the
geographical coordinates of the region and enabling precise spatial analyses. The study area's
heterogeneous landscape posed a challenge for segmentation, as different cover types required
fine-tuned segmentation techniques.</p>
        <sec id="sec-2-2-0">
          <title>1. Image Tile Segmentation</title>
          <p>Multiresolution segmentation (MRS) was performed on the georeferenced images using
eCognition software. This approach divides the imagery into smaller, homogeneous
regions based on spectral and spatial characteristics, optimizing the image for more
accurate classification. Key segmentation parameters included:</p>
          <p>• Scale parameter (100): Balances segment size to reflect distinct regions in the
landscape.</p>
          <p>• Compactness (0.5): Adjusts the segment shape for clarity without excessive
fragmentation.</p>
          <p>• Shape (0.4): Ensures the segments preserve geographic coherence while isolating
major landscape features.</p>
        </sec>
        <sec id="sec-2-2-1">
          <title>2. Expert Validation and Classification</title>
          <p>The segmented images were subsequently validated and classified by a plant expert using
QGIS software. Each polygon within the shapefile was analyzed and assigned one of
several classes, such as grasslands, vineyards, arboreal land, and olive groves. This
validation ensured that each segment's classification was botanically accurate, allowing for
reliable LULC representation.</p>
          <p>The expert identified eleven cover types essential to Franciacorta’s biodiversity and agricultural
ecosystem, including:</p>
          <p>1. Grassland: Supports biodiversity through pollination and pest control [3].</p>
          <p>2. Arable Land: Cultivated land used for crop production.</p>
          <p>3. Herb-Dominated Habitats: Supports herbivores, pollinators, and decomposers, contributing
to soil health.</p>
          <p>The remaining classes are hedgerows, vineyards, tree-dominated man-made habitats, olive groves,
wetlands, lines of planted trees, small anthropogenic forests, and others.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>3. Patch Extraction and Dataset Preparation</title>
          <p>Given the size of the original images, preprocessing is required to optimize
deep-learning model training. A Python script was developed to divide large images into smaller,
manageable regions while maintaining spatial coherence between each region and its
corresponding class labels. As illustrated in Figure 2, semantic segmentation was performed
in three steps: 1) Images and corresponding masks were acquired from QGIS software; 2)
Large training images and masks were divided into 128x128 pixel segments for training the
DeepLabv3 semantic segmentation model; and 3) The trained DeepLabv3 model (saved for
future use) was then used to predict and map the image.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Image and Mask Processing</title>
        <p>A Python script processed the image and mask files (Figure 3), resizing them to the nearest
dimension divisible by the chosen patch size. This ensured the creation of non-overlapping patches
using the patchify library, resulting in 921 training image-mask pairs and 230 validation image-mask
pairs. This preprocessing step was crucial for efficient model training and prevented artifacts that
could arise from overlapping patches.</p>
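        <p>The resize-then-patch step can be sketched in plain NumPy. This is a minimal illustration of the same non-overlapping tiling, not the study's actual script; the function name and the choice to crop (rather than resize) to the nearest divisible dimension are our own assumptions.</p>

```python
import numpy as np

def to_patches(image, patch=128):
    """Crop to the nearest size divisible by `patch`, then cut non-overlapping tiles."""
    h, w = image.shape[:2]
    h2, w2 = (h // patch) * patch, (w // patch) * patch  # nearest divisible dimensions
    image = image[:h2, :w2]                              # drop the remainder border
    tiles = [image[y:y + patch, x:x + patch]
             for y in range(0, h2, patch)
             for x in range(0, w2, patch)]
    return np.stack(tiles)
```

        <p>Applying the same function to an image and to its mask keeps tile i of each pair spatially aligned, which is what preserves the image-mask correspondence described above.</p>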
        <p>The non-overlapping patch strategy, facilitated by the patchify library, provided a clean and
consistent dataset for training the DeepLabv3 model. The resulting datasets (921 training and 230
validation image-mask pairs) were balanced and suitable for effective model training and subsequent
performance evaluation. This careful dataset preparation was a key factor in achieving the high
accuracy reported in the results.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Data Split and Organization</title>
        <p>Using the split-folders Python library, the patches were split into training and validation
datasets with an 80-20 split ratio. These splits facilitated model training and performance evaluation,
allowing for robust testing across both seen and unseen data.</p>
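        <p>The 80-20 split can also be sketched with Python's standard library alone; directory handling is omitted here, and the fixed seed and function name are illustrative choices, not details from the study.</p>

```python
import random

def split_files(names, train_frac=0.8, seed=42):
    """Shuffle deterministically, then split file names into train/validation lists."""
    names = sorted(names)               # stable order before shuffling
    random.Random(seed).shuffle(names)  # seeded shuffle keeps the split reproducible
    cut = int(len(names) * train_frac)
    return names[:cut], names[cut:]
```

        <p>A deterministic seed matters here: it guarantees that every training run sees the same train/validation partition, so metric differences reflect the model rather than the split.</p>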
      </sec>
      <sec id="sec-2-5">
        <title>2.5. Semantic Segmentation Models and Backbone Architecture</title>
        <p>
          The DeepLabv3 model was implemented using TensorFlow/Keras. The training involved
minimizing loss functions (measuring the difference between predicted and ground truth masks) and
evaluating performance using IoU (Intersection over Union), accuracy, and mean IoU. We compared
semantic segmentation models with and without backbone architectures. The "no backbone"
approach trained DeepLabv3 from scratch, relying solely on its intrinsic architecture for feature
extraction. This contrasts with models incorporating four commonly used backbone architectures:
ResNet-34 [
          <xref ref-type="bibr" rid="ref17">35</xref>
          ], a deep convolutional network with residual connections to mitigate vanishing
gradients; InceptionV3 [
          <xref ref-type="bibr" rid="ref18">36</xref>
          ], a computationally efficient architecture using inception modules for
multi-scale feature extraction; EfficientNet [37], distinguished by its compound scaling method for
optimal resource utilization; and DenseNet [38], characterized by dense connections promoting
feature reuse and efficient gradient propagation.
        </p>
        <p>
          DeepLabV3 [
          <xref ref-type="bibr" rid="ref16">34</xref>
          ] is an advanced convolutional neural network architecture designed for
semantic image segmentation. It uses atrous convolution to efficiently capture multi-scale
context information. Its key features include a feature pyramid for hierarchical feature
extraction, spatial pyramid pooling for multi-scale feature aggregation, and efficient
upsampling for high-resolution segmentation maps. Our DeepLabV3+ model, shown in
Figure 4, uses the EfficientNetB0 network, without its fully connected layers,
as the encoder for feature extraction. The encoder processes the input image, extracts features,
and passes them to the decoder. The decoder consists of an Atrous Spatial Pyramid Pooling (ASPP)
module followed by global average pooling and low-level feature concatenation. The ASPP module
captures multi-scale context information by applying convolutional layers with different dilation
rates. In addition, global average pooling is performed to capture global context information. The
decoder then combines the ASPP output, global context, and low-level features through concatenation.
Further convolutional layers refine the features, and upsampling layers restore spatial
resolution. Finally, a 1x1 convolutional layer with SoftMax activation generates pixel-wise
predictions for semantic segmentation. The model was compiled with the Adam optimizer, using
categorical cross-entropy loss and accuracy as the evaluation metric.
        </p>
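        <p>The atrous convolution underpinning ASPP can be illustrated in plain NumPy. This toy single-channel, loop-based version is our own sketch (not the paper's implementation); it shows how spacing the kernel taps "rate" pixels apart enlarges the receptive field without adding parameters.</p>

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """'Same'-padded 2-D convolution whose kernel entries are spaced `rate` apart."""
    k = kernel.shape[0]
    eff = k + (k - 1) * (rate - 1)  # effective receptive field of the dilated kernel
    pad = eff // 2
    xp = np.pad(x, pad)             # zero-pad so the output keeps the input size
    out = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # sample every `rate`-th pixel inside the enlarged window
            window = xp[i:i + eff:rate, j:j + eff:rate]
            out[i, j] = np.sum(window * kernel)
    return out
```

        <p>With rate 1 this reduces to an ordinary convolution; ASPP runs several such branches with different rates in parallel and concatenates their outputs.</p>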
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results and Discussion</title>
      <p>Our DeepLabv3 semantic segmentation model was trained on a dataset comprising 1152 image
patches. These patches were derived from 15 satellite images by dividing them into smaller 128x128
pixel sections. To ensure rigorous evaluation, the dataset was divided into a training set (80%) and a
test set (20%).</p>
      <p>The training process began by loading and preprocessing the image-mask pairs from specified
directories. Each image was annotated with one of eleven land cover classes, and the corresponding
masks underwent label encoding before being split into training and testing datasets (80/20 split).
The DeepLabv3 model was then configured, experimenting with four different backbone
architectures (DenseNet121, ResNet34, InceptionV3, and EfficientNet) and employing a combined
Dice and Categorical Focal loss function for optimization. Training proceeded for 100 epochs, with
progress visualized through plots of training loss and Intersection over Union (IoU) scores. Upon
completion, the trained model was saved.</p>
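        <p>As a concrete reference for the combined loss, here is a plain-NumPy sketch of the Dice and categorical focal terms. The study's exact weighting and implementation are not specified, so an unweighted sum and a focal exponent of 2 are assumed here.</p>

```python
import numpy as np

def dice_loss(y_true, y_pred, eps=1e-7):
    """1 minus the Dice coefficient; 0 for a perfect overlap."""
    inter = np.sum(y_true * y_pred)
    return 1.0 - (2.0 * inter + eps) / (np.sum(y_true) + np.sum(y_pred) + eps)

def categorical_focal_loss(y_true, y_pred, gamma=2.0, eps=1e-7):
    """Cross-entropy down-weighted for well-classified pixels by (1 - p)^gamma."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(np.sum(y_true * (1.0 - p) ** gamma * np.log(p), axis=-1))

def combined_loss(y_true, y_pred):
    """Unweighted sum of the two terms (assumed weighting)."""
    return dice_loss(y_true, y_pred) + categorical_focal_loss(y_true, y_pred)
```

        <p>The Dice term directly optimizes region overlap, while the focal term keeps rare-class pixels from being drowned out by abundant easy pixels, which is why the two are often combined for imbalanced segmentation datasets.</p>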
      <p>Following training, the model's performance was rigorously evaluated on the held-out test
dataset. A comprehensive suite of performance metrics, including accuracy, precision, recall,
F1score, Jaccard index, and mean IoU, were calculated and reported to provide a thorough assessment
of the model's ability to accurately segment the eleven land cover classes. These results provided a
quantitative measure of the model's effectiveness and informed the subsequent analysis and
discussion.</p>
      <p>Our methodology proves to be both scalable and cost-effective, with potential applications in
other agricultural regions. The comparative analysis highlights the importance of model selection
and backbone configuration in optimizing performance for specific land cover mapping tasks.</p>
      <p>To address the class imbalance problem, where certain land cover classes were significantly
under-represented in the dataset compared to others, a class-balanced data augmentation strategy
was implemented. This involved selectively enhancing the training data by generating new image
patches that focused specifically on the under-represented classes.</p>
      <p>This augmentation technique modified existing masks to temporarily exclude the
overrepresented classes, effectively creating new masks where the under-represented classes became the
dominant features. New image patches were then generated from these modified masks, increasing
the number of training samples for the under-represented classes without altering the original
dataset. This targeted approach ensured that the model received sufficient training examples for all
classes, improving its ability to accurately segment even the less frequent land cover types.</p>
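      <p>The targeted rebalancing idea can be sketched as a simple oversampling pass. This simplified stand-in duplicates whole patches that contain rare classes rather than regenerating masks as described above; the function name, rare-class set, and duplication factor are illustrative.</p>

```python
import numpy as np

def oversample_rare(patches, masks, rare_classes, factor=2):
    """Duplicate image/mask pairs whose masks contain any under-represented class."""
    extra_p, extra_m = [], []
    for p, m in zip(patches, masks):
        if np.isin(m, list(rare_classes)).any():
            extra_p += [p] * (factor - 1)   # add (factor - 1) copies of the pair
            extra_m += [m] * (factor - 1)
    return patches + extra_p, masks + extra_m
```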
      <p>To reduce overfitting, we implemented early stopping: training was halted when the validation
loss failed to decrease for a set number of epochs (e.g., 10), so that training continued only while
the model kept improving on unseen data. In addition, batch normalization was used
to stabilize and accelerate training, and class weights in the loss function helped to counter class
imbalance. These strategies collectively contributed to preventing overfitting and improving
the generalization capacity of the models.</p>
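      <p>The early-stopping rule (stop once the validation loss has not improved for a fixed number of epochs) can be sketched in a few lines; the function name and return convention are our own.</p>

```python
def early_stopping(val_losses, patience=10):
    """Return the 1-indexed epoch at which training stops, or None if it never triggers."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch     # new best: reset the patience window
        elif epoch - best_epoch >= patience:
            return epoch                       # no improvement for `patience` epochs
    return None
```

      <p>In a Keras training loop the same behavior is typically obtained with an EarlyStopping callback monitoring the validation loss with the chosen patience.</p>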
      <p>The comparison of semantic segmentation models with different backbone architectures reveals
nuanced differences in performance metrics (Table 1). Across all models, including ResNet34,
EfficientNet, InceptionV3, DenseNet, and a model without a specified backbone, there is a notable
consistency in accuracy, precision, recall, and F1 score, with variations typically within a range of
1–2 percentage points. However, when assessing metrics more tailored to semantic segmentation tasks,
such as mean IOU and Jaccard score, subtle disparities emerge. EfficientNet and DenseNet exhibit
slightly higher mean IOU and Jaccard scores compared to ResNet34 and InceptionV3, highlighting
their marginally superior ability to accurately segment objects in images. For instance, EfficientNet
achieves an accuracy of 85.6%, while DenseNet reaches 84.2%, both with corresponding mean IOU
scores of 59.0% and 56.2%, respectively. These results underscore the importance of selecting an
appropriate backbone architecture, as models with dedicated architectures designed for image
segmentation tasks demonstrate enhanced performance, particularly in terms of mean IOU and
Jaccard score, compared to models without a specified backbone.</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption>
          <p>Performance of DeepLabv3 with different backbone architectures (all values in %).</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Backbone</th>
              <th>Accuracy</th>
              <th>Precision</th>
              <th>Recall</th>
              <th>F1 Score</th>
              <th>Mean IoU</th>
              <th>Jaccard Score</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>ResNet34</td><td>79.0</td><td>78.8</td><td>78.3</td><td>71.1</td><td>49.5</td><td>62.1</td></tr>
            <tr><td>EfficientNetB0</td><td>85.61</td><td>85.48</td><td>85.5</td><td>85.28</td><td>59.0</td><td>75.68</td></tr>
            <tr><td>InceptionV3</td><td>80.6</td><td>80.1</td><td>80.4</td><td>80.5</td><td>55.2</td><td>69.4</td></tr>
            <tr><td>DenseNet</td><td>84.2</td><td>84.7</td><td>84.1</td><td>84.2</td><td>56.2</td><td>70.2</td></tr>
            <tr><td>Without backbone</td><td>75.0</td><td>75.9</td><td>75.8</td><td>75.5</td><td>49.9</td><td>63.2</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Accuracy: This metric measures the overall correctness of the segmentation by
calculating the ratio of correctly predicted pixels to the total number of pixels.</p>
      <p>Precision: Precision quantifies the model's ability to correctly identify positive
predictions among all predicted positives. It's calculated as the ratio of true positives
to the sum of true positives and false positives.</p>
      <p>Recall: Recall, also known as sensitivity, measures the ability of the model to detect
all relevant instances of the class in the image. It's calculated as the ratio of true
positives to the sum of true positives and false negatives.</p>
      <sec id="sec-3-1">
        <title>Evaluation Metrics and Model Assessment</title>
        <p>F1 Score: The F1 score is the harmonic mean of precision and recall. It provides a
balanced measure between precision and recall and is calculated as 2 * (precision *
recall) / (precision + recall).</p>
        <p>Mean IoU: Mean IoU calculates the average IoU across all classes. It's a popular
metric for semantic segmentation tasks as it provides an overall measure of
segmentation accuracy across different classes.</p>
        <p>Jaccard Score (IoU): The Jaccard score, or Intersection over Union (IoU), measures
the ratio of the intersection of the predicted and ground truth segmentation masks to
their union. It evaluates the overlap between the predicted and ground truth regions.</p>
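        <p>The per-class IoU and its mean over classes can be computed directly from integer label maps; this NumPy sketch (our own, with classes absent from both maps skipped) makes the definitions above concrete.</p>

```python
import numpy as np

def mean_iou(y_true, y_pred, n_classes):
    """Average per-class IoU over the classes present in either label map."""
    ious = []
    for c in range(n_classes):
        t, p = (y_true == c), (y_pred == c)
        union = np.logical_or(t, p).sum()
        if union == 0:
            continue  # class absent from both maps; excluded from the mean
        ious.append(np.logical_and(t, p).sum() / union)
    return float(np.mean(ious))
```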
        <p>Following the training phase, the optimized DeepLabv3 model was loaded for evaluation. A batch
of test images and their corresponding ground truth masks were generated using the previously
prepared validation data generator. This ensured that the model's performance was assessed on
unseen data, providing a more robust and unbiased evaluation of its generalization capabilities. The
loaded model then processed this batch of test images, generating predictions in the form of predicted
segmentation masks. These predicted masks, initially represented in a categorical format, were
subsequently converted to an integer format for easier visualization and compatibility with the
Intersection over Union (IoU) calculation. This conversion simplifies the comparison between the
predicted and ground truth masks, facilitating both qualitative and quantitative analysis of the
model's performance.</p>
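        <p>The categorical-to-integer conversion mentioned above amounts to an argmax over the class axis; the tiny array below is illustrative, not study data.</p>

```python
import numpy as np

# softmax/one-hot output of shape (H, W, n_classes) -> integer label map (H, W)
pred = np.array([[[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]]])   # a single 1x2 "image" with 3 classes
labels = np.argmax(pred, axis=-1)      # pick the most probable class per pixel
```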
        <p>To provide a qualitative assessment of the model's performance, a randomly selected test image
was chosen for visualization alongside its corresponding ground truth mask and the model's
predicted mask. This visual comparison, presented in Figure 5, allowed for a direct assessment of the
model's ability to accurately segment the different land cover classes. The visual representation
provides valuable insights into the model's strengths and weaknesses, highlighting areas where the
model performed exceptionally well and areas where further improvement might be needed. This
qualitative assessment complements the quantitative evaluation provided by the calculated
performance metrics, offering a more comprehensive understanding of the model's overall
performance. Visual comparison helps to identify potential biases or limitations in the model’s
predictions, informing future model improvements and dataset refinements. The selection of a
random image ensures that the visualization is representative of the model's performance across the
entire test dataset, avoiding potential bias towards specific image characteristics.</p>
        <p>After training, the optimized DeepLabv3 model was loaded to process a significantly larger image
than those used during training and validation. Because the model was trained on smaller image
patches, this large image was first segmented into a series of overlapping patches of the same size as
those used during training. This tiling strategy ensured compatibility with the model's input
requirements. The model then processed each patch individually, generating a predicted
segmentation mask for each. A crucial step followed: these individual, patch-level predictions were
carefully stitched together using an appropriate image stitching algorithm to reconstruct a complete,
seamless predicted mask for the entire large input image. This process effectively scaled the model's
application to images exceeding the size constraints of the training data, allowing for the analysis of
larger scenes and demonstrating the model's ability to generalize to larger-scale applications (as
shown in Figure 6). The overlapping patches helped to mitigate boundary artifacts that can
sometimes occur during this stitching process, ensuring a more coherent and accurate final
segmentation mask.</p>
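        <p>The predict-and-stitch pass can be sketched as follows. For brevity this is the non-overlapping variant (row-major tile order assumed); the study used overlapping patches precisely to suppress the seam artifacts this simple version would show.</p>

```python
import numpy as np

def stitch(tiles, grid_w, patch=128):
    """Reassemble row-major, non-overlapping patch predictions into one mask."""
    grid_h = len(tiles) // grid_w
    out = np.zeros((grid_h * patch, grid_w * patch), dtype=tiles[0].dtype)
    for idx, tile in enumerate(tiles):
        y, x = divmod(idx, grid_w)  # tile's row/column in the grid
        out[y * patch:(y + 1) * patch, x * patch:(x + 1) * patch] = tile
    return out
```

        <p>An overlapping version would instead accumulate per-pixel class scores from every covering patch and take the argmax, which blends predictions across tile boundaries.</p>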
        <p>This study demonstrates land cover mapping in agricultural areas using semantic segmentation
with the DeepLabv3 deep learning model and transfer learning. A new dataset of eleven agricultural
land cover classes was created, addressing a critical gap in existing research. This dataset facilitates
the development of mapping methods for the main land cover classes in Italy's Franciacorta region
and provides a valuable resource for future research on biodiversity and sustainability. The study
overcomes limitations of traditional methods that struggle with spectral similarities and class
heterogeneity. Results show superior performance for DeepLabv3 with an EfficientNetB0 backbone,
achieving significantly higher accuracy (0.866) and improved performance across other key metrics.
DeepLabv3's advanced architecture, including atrous spatial pyramid pooling (ASPP), facilitates
multi-scale context integration for improved detail capture. EfficientNetB0's efficient compound
scaling method optimizes the balance between model complexity and accuracy. Future work should
include expanding the dataset to incorporate more land cover classes and integrating ground truth
data from field observations to further improve accuracy and reliability, particularly in complex
agricultural landscapes.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>This study successfully demonstrates the application of deep learning, using the DeepLabv3
model, for accurate land cover mapping in the Franciacorta wine-growing region of Italy. Leveraging
high-resolution satellite imagery, the study produced detailed land cover maps, providing valuable
insights into the spatial distribution of eleven distinct land cover classes within this complex
agricultural landscape. The creation of a novel, manually annotated dataset represents a significant
contribution, offering a valuable resource for future research in agricultural applications and
precision land management.</p>
      <p>The results clearly show the effectiveness of DeepLabv3 in accurately segmenting these diverse
land cover classes from satellite imagery. This success highlights the potential of advanced deep
learning techniques for improving the accuracy and efficiency of land cover mapping, thereby
supporting sustainable land management practices, environmental monitoring initiatives, and
evidence-based agricultural decision-making. The detailed analysis of the various land cover types
underscores the importance of understanding the intricate relationships between vineyard
ecosystems and their surrounding habitats, emphasizing the contribution of these surrounding areas
to overall vineyard sustainability, biodiversity, and the provision of essential ecosystem services.</p>
      <p>Future research should focus on enhancing the generalizability and robustness of the findings.
This can be achieved through expanding the dataset to include a wider range of land cover classes,
incorporating ground-truth data collected through field surveys to further validate the model's
accuracy, and exploring more advanced deep learning architectures or model optimization
techniques to improve performance and broaden applicability across diverse agricultural settings.
These enhancements will strengthen the model's capability for wider application and contribute to
more comprehensive and reliable land cover mapping in various agricultural contexts.
</p>
      <p>[2] Olofsson, P.; Foody, G.M.; Herold, M.; Stehman, S.V.; Woodcock, C.E.; Wulder, M.A. Good practices for estimating area and assessing accuracy of land change. Remote Sens. Environ. 2014, 148, 42–57. https://doi.org/10.1016/j.rse.2014.02.015.</p>
      <p>[3] Williams, J.N.; Morandé, J.A.; Vaghti, M.G.; Medellín-Azuara, J.; Viers, J.H. Ecosystem Services in Vineyard Landscapes: A Focus on Aboveground Carbon Storage and Accumulation. Carbon Balance Manag. 2020, 15, 23. https://doi.org/10.1186/s13021-020-00158-z.</p>
      <p>[4] Giffard, B.; et al. Vineyard Management and Its Impacts on Soil Biodiversity, Functions, and Ecosystem Services. Front. Ecol. Evol. 2022, 10. https://doi.org/10.3389/fevo.2022.850272.</p>
      <p>[5] The Regenerative Viticulture Foundation. Biodiversity. Available online: https://www.regenerativeviticulture.org/toolkit/biodiversity/ (accessed on 9 October 2024).</p>
      <p>[6] Abad, J.; de Mendoza, I.H.; Marín, D.; Orcaray, L.; Santesteban, L.G. Cover crops in viticulture. A systematic review (1): Implications on soil characteristics and biodiversity in vineyard. OENO One 2021, 55, 1. https://doi.org/10.20870/oeno-one.2021.55.1.3599.</p>
      <p>[7] Hurajová, E.; et al. Biodiversity and Vegetation Succession in Vineyards, Moravia (Czech Republic). Agriculture 2024, 14, 1036. https://doi.org/10.3390/agriculture14071036.</p>
      <p>[8] Pal, M.; Mather, P.M. Support vector machines for classification in remote sensing. Int. J. Remote Sens. 2005, 26, 1007–1011. https://doi.org/10.1080/01431160512331314083.</p>
      <p>[9] Pal, M.; Mather, P.M. An assessment of the effectiveness of decision tree methods for land cover classification. Remote Sens. Environ. 2003, 86, 554–565. https://doi.org/10.1016/S0034-4257(03)00132-9.</p>
      <p>[10] Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792. https://doi.org/10.1890/07-0539.1.</p>
      <p>[11] Laban, N.; Abdellatif, B.; Ebeid, H.M.; Shedeed, H.A.; Tolba, M.F. Machine Learning for Enhancement Land Cover and Crop Types Classification. In Machine Learning Paradigms: Theory and Application; Hassanien, A.E., Ed.; Springer International Publishing: Cham, Switzerland, 2019; pp. 71–87. https://doi.org/10.1007/978-3-030-02357-7_4.</p>
      <p>[12] Phiri, D.; Simwanda, M.; Salekin, S.; Nyirenda, V.R.; Murayama, Y.; Ranagalage, M. Sentinel-2 Data for Land Cover/Use Mapping: A Review. Remote Sens. 2020, 12, 2291. https://doi.org/10.3390/rs12142291.</p>
      <p>[13] Gauci, A.; Abela, J.; Austad, M.; Cassar, L.F.; Zarb Adami, K. A Machine Learning approach for automatic land cover mapping from DSLR images over the Maltese Islands. Environ. Model. Softw. 2018, 99, 1–10. https://doi.org/10.1016/j.envsoft.2017.09.014.</p>
      <p>[14] Mardani, M.; Mardani, H.; De Simone, L.; Varas, S.; Kita, N.; Saito, T. Integration of Machine Learning and Open Access Geospatial Data for Land Cover Mapping. Remote Sens. 2019, 11, 1907. https://doi.org/10.3390/rs11161907.</p>
      <p>[15] Pelletier, C.; Valero, S.; Inglada, J.; Champion, N.; Dedieu, G. Assessing the robustness of Random Forests to map land cover with high resolution satellite image time series over large areas. Remote Sens. Environ. 2016, 187, 156–168. https://doi.org/10.1016/j.rse.2016.10.010.</p>
      <p>[16] Machine Learning Algorithms for Satellite Image Classification Using Google Earth Engine and Landsat Satellite Data: Morocco Case Study. IEEE Journals &amp; Magazine. Accessed: April 2, 2024. [Online]. Available: https://ieeexplore.ieee.org/document/10177754.</p>
      <p>[17] Han, R.; Liu, P.; Wang, G.; Zhang, H.; Wu, X. Advantage of Combining OBIA and Classifier Ensemble Method for Very High-Resolution Satellite Imagery Classification. J. Sens. 2020, 2020, e8855509. https://doi.org/10.1155/2020/8855509.</p>
      <p>[18] Ma, L.; Liu, Y.; Zhang, X.; Ye, Y.; Yin, G.; Johnson, B.A. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS J. Photogramm. Remote Sens. 2019, 152, 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015.</p>
      <p>[19] Zaabar, N.; Niculescu, S.; Kamel, M.M. Application of Convolutional Neural Networks With Object-Based Image Analysis for Land Cover and Land Use Mapping in Coastal Areas: A Case Study in Ain Témouchent, Algeria. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 5177–5189. https://doi.org/10.1109/JSTARS.2022.3185185.</p>
      <p>[37] Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2020, arXiv:1905.11946. https://doi.org/10.48550/arXiv.1905.11946.</p>
      <p>[38] Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. arXiv 2018, arXiv:1608.06993. https://doi.org/10.48550/arXiv.1608.06993.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Aineto, D.; De Benedictis, R.; Maratea, M.; Mittelmann, M.; Monaco, G.; Scala, E.; Serafini, L.; Serina, I.; Spegni, F.; Tosello, E.; Umbrico, A.; Vallati, M. (Eds.) Proceedings of the International Workshop on Artificial Intelligence for Climate Change, the Italian Workshop on Planning and Scheduling, the RCRA Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion, and the Workshop on Strategies, Prediction, Interaction, and Reasoning in Italy (AI4CC-IPS-RCRA-SPIRIT 2024), co-located with the 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024); CEUR Workshop Proceedings; CEUR-WS.org, 2024.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[20] Zhang, X.; Han, L.; Han, L.; Zhu, L. How Well Do Deep Learning-Based Methods for Land Cover Classification and Object Detection Perform on High Resolution Remote Sensing Imagery? Remote Sens. 2020, 12, 417. https://doi.org/10.3390/rs12030417.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[21] Big Data for Remote Sensing: Challenges and Opportunities. IEEE Xplore. Accessed: July 24, 2024. [Online]. Available: https://ieeexplore.ieee.org/document/7565634.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[22] Ienco, D.; Gbodjo, Y.J.E.; Gaetano, R.; Interdonato, R. Weakly Supervised Learning for Land Cover Mapping of Satellite Image Time Series via Attention-Based CNN. IEEE Access 2020, 8, 179547–179560. https://doi.org/10.1109/ACCESS.2020.3024133.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[23] Yang, N.; Tang, H. Semantic Segmentation of Satellite Images: A Deep Learning Approach Integrated with Geospatial Hash Codes. Remote Sens. 2021, 13, 2723. https://doi.org/10.3390/rs13142723.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[24] Yuan, X.; Shi, J.; Gu, L. A review of deep learning methods for semantic segmentation of remote sensing imagery. Expert Syst. Appl. 2021, 169, 114417. https://doi.org/10.1016/j.eswa.2020.114417.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[25] Garcia-Garcia, A.; Orts-Escolano, S.; Oprea, S.; Villena-Martinez, V.; Garcia-Rodriguez, J. A Review on Deep Learning Techniques Applied to Semantic Segmentation. arXiv 2017, arXiv:1704.06857. https://doi.org/10.48550/arXiv.1704.06857.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[26] Marmanis, D.; Wegner, J.D.; Galliani, S.; Schindler, K.; Datcu, M.; Stilla, U. Semantic Segmentation of Aerial Images with an Ensemble of CNNs. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. 2016, III-3, 473–480. https://doi.org/10.5194/isprs-annals-III-3-473-2016.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[27] Li, R.; Zheng, S.; Duan, C.; Wang, L.; Zhang, C. Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network. Geo-Spat. Inform. Sci. 2022, 25, 278–294. https://doi.org/10.1080/10095020.2021.2017237.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[28] Tzepkenlis, A.; Marthoglou, K.; Grammalidis, N. Efficient Deep Semantic Segmentation for Land Cover Classification Using Sentinel Imagery. Remote Sens. 2023, 15, 2027. https://doi.org/10.3390/rs15082027.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[29] Xu, R.; Wang, C.; Zhang, J.; Xu, S.; Meng, W.; Zhang, X. RSSFormer: Foreground Saliency Enhancement for Remote Sensing Land-Cover Segmentation. IEEE Trans. Image Process. 2023, 32, 1052–1064. https://doi.org/10.1109/TIP.2023.3238648.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[30] Längkvist, M.; Kiselev, A.; Alirezaie, M.; Loutfi, A. Classification and Segmentation of Satellite Orthoimagery Using Convolutional Neural Networks. Remote Sens. 2016, 8, 329. https://doi.org/10.3390/rs8040329.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[31] Vali, A.; Comai, S.; Matteucci, M. Deep Learning for Land Use and Land Cover Classification Based on Hyperspectral and Multispectral Earth Observation Data: A Review. Remote Sens. 2020, 12, 2495. https://doi.org/10.3390/rs12152495.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[32] Digra, M.; Dhir, R.; Sharma, N. Land Use Land Cover Classification of Remote Sensing Images Based on Deep Learning Approaches: A Statistical Analysis and Review. Arab. J. Geosci. 2022, 15, 1003. https://doi.org/10.1007/s12517-022-10246-8.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[33] Zhao, S.; Tu, K.; Ye, S.; Tang, H.; Hu, Y.; Xie, C. Land Use and Land Cover Classification Meets Deep Learning: A Review. Sensors 2023, 23, 8966. https://doi.org/10.3390/s23218966.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[34] Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848. https://doi.org/10.1109/TPAMI.2017.2699184.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[35] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Las Vegas, NV, USA, 2016; pp. 770–778. Accessed: April 22, 2023. [Online]. Available: https://openaccess.thecvf.com/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[36] Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Las Vegas, NV, USA, 2016; pp. 2818–2826. Accessed: April 22, 2023. [Online]. Available: https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.html.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>