<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>I2C-UHU-PERSEUS at PlantCLEF 2025: Multi-Label Identification and Classification of Plant Species in Images Using Object Detection Techniques</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jesús Tejón Carrillo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Prieto Araujo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victoria Pachón Álvarez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacinto Mata Vázquez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>I2C Research Group, University of Huelva</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This work presents a hybrid pipeline for the automatic identification and classification of plant species using object detection and deep learning techniques. Specifically, the approach combines YOLOv11 for the detection of relevant regions in images and InceptionV3 for multi-label classification of the detected species. The methodology was evaluated within the context of the PlantCLEF 2025 challenge, which involves multi-species classification from natural vegetation quadrat images. To address the high class imbalance, random undersampling and data augmentation techniques were applied. Despite computational constraints that required dataset reduction, the proposed pipeline achieved an F1-score of 0.02203, ranking 28th out of 38 participating teams. In comparison, the top-performing team achieved a score of 0.38132. These results, although modest, highlight the potential of integrating object detection as a pre-classification step in complex natural scenarios.</p>
      </abstract>
      <kwd-group>
        <kwd>Deep Learning</kwd>
        <kwd>Python</kwd>
        <kwd>Convolutional Neural Networks</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Image Detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Nowadays, precise plant species classification tasks are carried out in the fields of botany, environmental
conservation, and biodiversity monitoring. Traditionally, this work was performed by specialists in the
area, which implies high time and resource costs, in addition to the possibility of human error. Given
these limitations, the use of techniques that combine computer vision with deep learning becomes
essential to automate and scale the recognition of different plant species.</p>
      <p>
        Recent advances in the field of Convolutional Neural Networks (CNNs) have demonstrated that this
type of network has a remarkable ability to extract and learn features from plant images, including
variable backgrounds, scales, and positions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Examples of this type of network include ResNet[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
DenseNet[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and more recent ones such as Vision Transformers[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which, together with datasets that are rich and diversified in terms of classes and number of
images, such as ImageNet, can significantly improve performance in multiclass classification tasks in
the field of computer vision, including the automatic classification of plant species, flowers, or fruits.
Although architectures like ResNet or Vision Transformers have shown good results, InceptionV3 was
chosen due to its balance between performance and efficiency, which is particularly useful in training
environments with limited resources.
      </p>
      <p>
        However, multiclass classification of plant species presents significant challenges, such as visual
similarity between species, proximity between species that can lead to confusion, class imbalance in
the data, and differences between training and test images [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>In this work, a solution was proposed based on a hybrid detection and classification pipeline,
combining the YOLOv11 model for localization of relevant regions in images with an InceptionV3
network for multi-label classification of detected species. This approach allows us to simultaneously
address the detection of multiple species within the same image, as well as mitigate problems arising
from class imbalance and variability between training and test domains. Through this approach, the
objective was to improve accuracy and robustness in plant species classification scenarios in real and
uncontrolled images.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Context</title>
      <p>
        This work is focused on providing a solution to the problem posed in the PlantCLEF 2025 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] competition,
which provides a dataset composed of a total of 7806 distinct plant classes that must be classified. The
dataset presents a marked class imbalance, with some species represented by only a single image, while
others have up to 823 samples.
      </p>
      <p>Another important challenge lies in the difference between training and test images. Due to
computational resource constraints, the original training dataset had to be reduced in size. This selection was
carried out with the goal of preserving class diversity while ensuring feasible training times and memory
usage. Although this reduction may limit overall model generalization, it was necessary to adapt the
approach to the available hardware. Training images typically show a single plant, generally taken
from a vertical or lateral perspective, with the object centered. In contrast, test images are captured
from a zenithal (top-down) perspective and usually include multiple plants in the same frame, which
introduces greater complexity to the inference process.</p>
      <p>
        CLEF competitions of this type, specifically those of LifeCLEF [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], support different comparative evaluations in the field of computer vision applied to biodiversity.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Related Work</title>
      <p>
        Previous studies have investigated plant species detection through the use of computer vision techniques.
Among these, we can see studies related to fruit detection [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], disease detection in leaves [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] as well as
in complete plants [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and detection of medicinal plants [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Furthermore, these technologies have
demonstrated great importance in the field of agriculture, where they contribute to optimizing crop
monitoring, pest control, and management of different resources [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>However, most of these works focus on controlled scenarios or direct classification tasks, with
good-quality images and homogeneous conditions. Few address the problem through a prior detection
stage or multi-label classification in unstructured natural environments, which limits their applicability
in real situations.</p>
      <p>
        In the context of competitions like PlantCLEF, which poses realistic challenges from images collected
by citizens (with noise, variable lighting, partially visible species, etc.), some recent works have explored
more robust solutions. For example, in PlantCLEF 2023, the NEUON team employed architectures
such as Inception-v4 and Inception-ResNet-v2, along with data augmentation strategies and
organ-specific training (leaves, flowers, fruits, etc.), achieving outstanding performance [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Additionally, in
PlantCLEF 2024, approaches based on Vision Transformers for multi-label classification in vegetation
plot images were explored, addressing the challenges of variability in capture conditions and the
presence of multiple species per image [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>Despite these advances, many previous approaches remain limited in terms of their generalization,
especially when facing poorly represented species or when a single image contains multiple relevant
labels. In this work, we propose a hybrid pipeline based on prior detection with YOLOv11 and multi-label
classification with InceptionV3, specifically designed to address these challenges. Unlike traditional
direct classification approaches, the proposed methodology attempts to detect regions of interest
before classification, which may help improve accuracy in complex scenarios such as those posed by
PlantCLEF 2025.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
        <title>4.1. Models Used</title>
        <p>• InceptionV3</p>
        <p>
          In this project, this model was used for the multiclass classification of plant species in different
images. InceptionV3 is a convolutional network architecture, developed by Google, that stands
out for its efficiency and high performance in tasks related to computer vision [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. This model
has been previously employed in plant species classification tasks, such as flower classification,
obtaining very good results [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. More recent architectures, such as EfficientNet, were not chosen
due to hardware limitations and training time constraints. Additionally, although domain-specific
models like BioCLIP have shown promise in plant identification tasks, their relative novelty,
lack of extensive documentation, and limited availability of open-source implementations and
pretrained weights made them less practical for this study. In contrast, InceptionV3 provided a
balanced trade-off between performance, computational efficiency, and support within the deep
learning community, making it a more feasible and robust option under the constraints of this
project.
        </p>
        <p>To leverage the pre-trained weights of the model with the ImageNet dataset [17], the last layer
of the model was replaced with a global average pooling layer, followed by a dense layer with
ReLU activation, and finally a softmax output layer was added to adapt the model to multiclass
classification. The final architecture was trained using the Adam optimizer while freezing some
upper layers, which allowed the model to learn complex and discriminative relationships, enabling
it to show good capacity for extracting visual features in the botanical domain. Figure 3
illustrates the complete InceptionV3 architecture used, highlighting the modular structure of
the network—including the Inception modules A, B, and C—and the progressive reduction of
spatial dimensions through grid reduction layers. The diagram also emphasizes the parallel
convolutional paths within each Inception module, which contribute to the model’s efficiency
and ability to capture multi-scale features.</p>
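        <p>The following minimal Keras sketch illustrates the classification head described above (global average pooling, a dense ReLU layer, and a softmax output on top of the ImageNet-pretrained backbone). The number of frozen layers, the width of the dense layer, and the learning rate are illustrative assumptions rather than the exact values used in the experiments.</p>
        <preformat>
# Minimal sketch of the InceptionV3 transfer-learning head described above (Keras/TensorFlow).
# NUM_CLASSES, the frozen-layer cutoff, and the optimizer settings are assumptions for illustration.
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models, optimizers

NUM_CLASSES = 4950  # classes remaining after random under-sampling

# ImageNet-pretrained backbone without the original classification head
base = InceptionV3(weights="imagenet", include_top=False, input_shape=(299, 299, 3))

# Keep most of the backbone frozen and fine-tune only the remaining layers (assumed cutoff)
for layer in base.layers[:-30]:
    layer.trainable = False

# Global average pooling, dense ReLU layer, and softmax output for multiclass prediction
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(1024, activation="relu")(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs=base.input, outputs=outputs)
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
        </preformat>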
        <p>• YOLOv11</p>
        <p>YOLOv11 is a model that belongs to the YOLO family, which has a series of models specialized in
object detection with high speed and good accuracy even in real-time systems [18]. YOLOv11
is nothing more than an evolution within this family that incorporates improvements to the
architecture, such as new prediction modules or more efficient training strategies, which enable
it to obtain better results [19].</p>
        <p>The motivation for using YOLOv11 in this context was twofold. On one hand, it allows automatic
detection of regions of interest within images, such as plants, leaves, or flowers, so that non-relevant
areas can be filtered out to favor the performance of the classifier model (InceptionV3). On the other
hand, it addresses the problem of multiple plants appearing in one image, since only individual crops
are passed to the classifier, which avoids the confusion and errors that can arise when a single input
contains more than one plant. Figure 4 shows an example of an image after plant species detection
with the YOLO model, which returns the same image with bounding boxes and their corresponding
confidence scores added.</p>
        <p>In case the YOLO model fails, or does not find any plant, the complete image is passed to the
classifier model for classification.</p>
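        <p>A hypothetical sketch of the resulting detect-then-classify pipeline is shown below, assuming the Ultralytics YOLO API and the Keras classifier described earlier; the weights path, the class_names list, and the confidence threshold are placeholders rather than the actual configuration.</p>
        <preformat>
# Hypothetical sketch of the detect-then-classify pipeline (Ultralytics YOLO + Keras classifier).
# The weights path, class_names, and confidence threshold are assumptions for illustration.
import numpy as np
from PIL import Image
from ultralytics import YOLO

detector = YOLO("yolo11_plants.pt")  # fine-tuned plant detector (assumed path)

def predict_species(img_path, classifier, class_names, conf=0.25):
    img = Image.open(img_path).convert("RGB")
    result = detector(img_path, conf=conf)[0]
    boxes = result.boxes.xyxy.cpu().numpy() if result.boxes is not None else []

    # Crop each detected region; fall back to the whole image if no plant is found
    regions = [img.crop(tuple(int(v) for v in box[:4])) for box in boxes]
    if not regions:
        regions = [img]

    labels = set()
    for region in regions:
        arr = np.array(region.resize((299, 299)), dtype=np.float32) / 255.0
        probs = classifier.predict(arr[np.newaxis, ...], verbose=0)[0]
        labels.add(class_names[int(np.argmax(probs))])
    return sorted(labels)
        </preformat>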
        <p>With this combination between detection and classification, a more robust pipeline could be built,
where predictions are not based on the complete image, but on key fragments that are identified by the
detector. This proved useful when classifying images taken in different natural environments, where
the position and framing of photos vary significantly, such as those offered in the PlantCLEF challenge.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Balancing Techniques</title>
        <sec id="sec-4-2-1">
          <p>Given such an imbalanced dataset, the following techniques were chosen:</p>
          <p>• Random Under-Sampling</p>
          <p>To reduce class imbalance and thus improve training performance, a random under-sampling process was chosen to eliminate those classes that contained an insignificant number of examples.</p>
          <p>Specifically, classes that had only a single data point were eliminated, as they did not provide
sufficient information for the model to predict correctly [20], and to alleviate the storage constraints
of the equipment used for training.</p>
          <p>This technique allowed reducing the total number of classes, which in turn improved overall
training by allowing the model to focus on classes with more information.</p>
          <p>Furthermore, this technique helps prevent overfitting, while improving training stability [21].</p>
          <p>• Data Augmentation</p>
          <p>As mentioned before, the dataset presented a clearly imbalanced distribution. To mitigate this
problem, data augmentation techniques were applied, which have proven to be really effective in
image classification tasks by generating new images from transformations of the originals [22]. These
transformations included rotations, scaling, random zoom, and horizontal flipping, among others. This
way, the training set could be artificially expanded without the need to collect new data, which was
not feasible in this context.</p>
          <p>For the practical implementation of this technique, the ImageDataGenerator method offered by
the Keras library (https://keras.io) was used, which allows applying these transformations at runtime
during training. This strategy avoids having to store augmented images on the hard drive, reduces
memory usage, and allows greater data variability in each epoch.</p>
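          <p>A minimal sketch of this runtime augmentation, assuming Keras' ImageDataGenerator and a directory layout with one sub-folder per species, is shown below; the specific parameter values are assumptions.</p>
          <preformat>
# Minimal sketch of the runtime data augmentation described above (Keras ImageDataGenerator).
# Directory layout and parameter values are assumptions for illustration.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=30,      # random rotations
    width_shift_range=0.1,  # random positional shifts
    height_shift_range=0.1,
    zoom_range=0.2,         # random zoom
    horizontal_flip=True,   # horizontal flipping
)

# Batches are augmented on the fly during training, so no extra images are written to disk
train_generator = train_datagen.flow_from_directory(
    "data/train",           # assumed layout: one sub-folder per species
    target_size=(299, 299),
    batch_size=32,
    class_mode="categorical",
)
          </preformat>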
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Setup</title>
      <p>For the experimental setup, several Python libraries for machine learning were employed, including
Keras, TensorFlow [23], Pandas [24], and YOLO [25], among others.</p>
      <p>Regarding the data, first the random under-sampling previously described was performed, reducing
the classes from 7806 to 4950. After this, a stratified split of the original dataset was applied—allocating
70% of the samples for training, 20% for validation, and 10% for testing—while preserving the original
class imbalance. Finally, data augmentation was applied at runtime during training.</p>
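      <p>An illustrative sketch of the class filtering and the 70/20/10 stratified split is given below, assuming pandas and scikit-learn and a metadata table with a "species" column; the file and column names are assumptions, and the sketch presumes every remaining species has enough images for both splits.</p>
      <preformat>
# Illustrative sketch of the class filtering and 70/20/10 stratified split (pandas + scikit-learn).
# File name, column names, and random_state are assumptions for illustration.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("metadata.csv")  # assumed metadata: one row per image with a "species" label

# Random under-sampling step: drop species represented by a single image
counts = df["species"].value_counts()
df = df[df["species"].isin(counts[counts > 1].index)]

# 70% training, then the remaining 30% split into 20% validation and 10% test,
# stratifying by species to preserve the original class imbalance
train_df, rest_df = train_test_split(
    df, train_size=0.70, stratify=df["species"], random_state=42)
val_df, test_df = train_test_split(
    rest_df, train_size=2 / 3, stratify=rest_df["species"], random_state=42)
      </preformat>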
      <p>On the other hand, to train the YOLO model, a sample of 30 images from the test dataset provided by
PlantCLEF had to be taken and manually labeled using the Roboflow tool (https://roboflow.com/). After
this, data augmentation was performed, going from 30 to 61 images, divided into 52 for training, 6 for
validation, and 3 for testing, and the YOLO model was retrained for plant detection in images.</p>
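      <p>A short sketch of this retraining step, assuming the Ultralytics training API, is shown below; the starting weights, dataset YAML, and hyperparameters are assumptions rather than the exact configuration used.</p>
      <preformat>
# Illustrative sketch of fine-tuning YOLOv11 on the small hand-labeled quadrat set (Ultralytics).
# The starting checkpoint, dataset YAML, and hyperparameters are assumptions for illustration.
from ultralytics import YOLO

model = YOLO("yolo11n.pt")            # pretrained YOLOv11 starting weights (assumed variant)
model.train(
    data="plantclef_quadrats.yaml",   # assumed export of the 52/6/3 Roboflow split
    epochs=100,
    imgsz=640,
    batch=8,
)
metrics = model.val()                 # evaluate on the validation images
      </preformat>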
      <p>The complete training and implementation notebooks, along with code and documentation, are
publicly available at https://github.com/JesusTejon/PlantCLEF-2025-TFG</p>
      <sec id="sec-5-1">
        <title>1https://keras.io 2https://roboflow.com/</title>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Results</title>
      <p>Regarding performance evaluation, the F1-score has been taken as the main metric, as it offers a
balance between precision and recall, which is especially relevant in contexts with imbalanced classes
or when false positives and false negatives are equally costly. Unlike metrics such as precision or
recall separately, which focus solely on one of these aspects, the F1-score provides a more complete
view of the model’s behavior. On the other hand, the use of the confusion matrix has been discarded
due to its high dimensionality in this multiclass problem, which would hinder its visual interpretation
and detailed analysis.</p>
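      <p>For reference, the F1-score used throughout is the standard harmonic mean of precision and recall:</p>
      <disp-formula>
        <tex-math><![CDATA[F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}]]></tex-math>
      </disp-formula>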
      <p>After experimenting with different configurations, the results shown in Table 1 were obtained.</p>
      <p>The results clearly show the progressive improvement in performance as additional techniques are
applied to the base model. The use of data augmentation produces a significant improvement compared
to direct training with the original dataset, which suggests that the introduced variability contributes
positively to the model’s generalization capacity.</p>
      <p>Likewise, the integration of a prior detection stage with YOLOv11 followed by classification with
InceptionV3 notably improves the F1-score, tripling the value obtained with the exclusive use of
data augmentation. This demonstrates that first localizing the relevant regions of the image before
classifying them helps the model focus on areas with significant information, thus reducing the impact
of background noise and improving labeling precision.</p>
      <p>Despite the absolute F1-score values still being low, these initial experiments show a clear trend of
improvement that validates the usefulness of the proposed hybrid approach. They lay a promising
foundation for future iterations of the system, in which both the architecture and the quality of the
input data could be further optimized to achieve more competitive performance in scenarios such as
those posed by PlantCLEF.</p>
      <p>After a manual review, it was observed that the main failure in the system occurs when the YOLO
model is unable to detect any plant species or detects an incorrect one, which subsequently causes
confusion for the classification model. An example of this can be seen in Figure 5, where the model fails
to detect any plant species, likely because it is somewhat camouflaged with the environment. On the
other hand, Figure 6 shows a case where the same plant is detected twice, with one of the detections
including a rock. This may lead the classification model to classify two different species where there is
only one.</p>
      <p>As shown in the images, both types of YOLO model errors can also lead to incorrect classification by
the InceptionV3 model on the extracted regions.</p>
      <p>Additionally, another expected error occurs with the classes that were removed during Random
Undersampling, which for obvious reasons cannot be classified by the model.</p>
      <p>As shown in Table 2, the top-performing teams achieved F1-scores close to 0.38, while the overall
average performance remained low (mean = 0.2008), reflecting the difficulty of the task. The
I2C-UHU-PERSEUS team obtained an F1-score of 0.02203, ranking 28th out of 38, which, although modest in
absolute terms, still positioned the team above 26% of the participants. These results underscore both
the technical complexity of the challenge and the value of the proposed approach as a baseline for
future improvements.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>First, the critical importance of having a high-quality dataset that is representative of the real testing
environment is evident. The substantial differences between the training and test sets likely negatively
affected the model’s performance, compromising its generalization capacity. Additionally, the marked
imbalance between classes—including a considerable number of categories with only one available
sample—represented a significant limitation during the training process.</p>
      <p>Regarding the results, while they did not achieve the expected performance, they clearly identify
several improvement factors. One of the most relevant lines to explore is the class reduction strategy:
eliminating those with few samples may harm the system’s ability to identify less frequent, but equally
important species. Alternatively, the use of few-shot learning techniques or synthetic sample generation
could be considered. Likewise, the adoption of classification models more specialized in the botanical
domain, as well as the use of more recent architectures such as Vision Transformers, could provide
significant improvements.</p>
      <p>In terms of detection, the use of YOLO as a prior detector presents advantages, but also introduces
errors that are transferred to the classification system. It would be pertinent to evaluate other finer
detection techniques or attention-based ones, with the objective of improving the quality of the
extracted regions of interest.</p>
      <p>Overall, this work highlights both the challenges and potential of automated plant species recognition
in complex environments. Despite the limitations encountered, the proposed pipeline establishes a solid
foundation for future research, and offers a clear direction toward more robust, adaptive, and scalable
solutions in real contexts such as those posed by PlantCLEF.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used GPT-4 and Claude Sonnet 4 in order to:
grammar and spelling check. Further, the authors used GPT-4 for Figure 3 in order to: generate images.
After using these services, the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Mayo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Remagnino</surname>
          </string-name>
          ,
          <article-title>How deep learning extracts and learns leaf features for plant classification</article-title>
          ,
          <source>Pattern recognition 71</source>
          (
          <year>2017</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Ren,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Deep residual learning for image recognition</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Van Der</given-names>
            <surname>Maaten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          ,
          <article-title>Densely connected convolutional networks</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>4700</fpage>
          -
          <lpage>4708</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dosovitskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Beyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kolesnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weissenborn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Unterthiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dehghani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Minderer</surname>
          </string-name>
          , G. Heigold,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gelly</surname>
          </string-name>
          , et al.,
          <article-title>An image is worth 16x16 words: Transformers for image recognition at scale</article-title>
          , arXiv preprint arXiv:2010.11929 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Barré</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Stöver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. F.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Steinhage</surname>
          </string-name>
          ,
          <article-title>Leafnet: A computer vision system for automatic plant species identification</article-title>
          ,
          <source>Ecological Informatics</source>
          <volume>40</volume>
          (
          <year>2017</year>
          )
          <fpage>50</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Martellucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Goëau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bonnet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vinatier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          ,
          <article-title>Overview of PlantCLEF 2025: Multi-species plant identification in vegetation quadrat images</article-title>
          ,
          <source>in: Working Notes of CLEF 2025 - Conference and Labs of the Evaluation Forum</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Picek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Goëau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Adam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Larcher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leblanc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Servajean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Janoušková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Matas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Čermák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Papafitsoros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Planqué</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-P.</given-names>
            <surname>Vellinga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Klinck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Denton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Cañas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Martellucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vinatier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bonnet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          , Overview of lifeclef 2025:
          <article-title>Challenges on species presence prediction and identification, and individual animal identification</article-title>
          ,
          <source>in: International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Koirala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. B.</given-names>
            <surname>Walsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>McCarthy</surname>
          </string-name>
          ,
          <article-title>Deep learning-method overview and review of use for fruit detection and yield estimation</article-title>
          ,
          <source>Computers and Electronics in Agriculture</source>
          <volume>162</volume>
          (
          <year>2019</year>
          )
          <fpage>219</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Sarkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. B.</given-names>
            <surname>Hazarika</surname>
          </string-name>
          ,
          <article-title>Leaf disease detection using machine learning and deep learning: Review and challenges</article-title>
          ,
          <source>Applied Soft Computing</source>
          <volume>145</volume>
          (
          <year>2023</year>
          )
          <fpage>110534</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Saleem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Potgieter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Arif</surname>
          </string-name>
          ,
          <article-title>Plant disease detection and classification by deep learning</article-title>
          ,
          <source>Plants</source>
          <volume>8</volume>
          (
          <year>2019</year>
          )
          <fpage>468</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ravikumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. Eugene</given-names>
            <surname>Berna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Babu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Arockia Raj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vijay</surname>
          </string-name>
          ,
          <article-title>Detection of medicinal plants using machine learning</article-title>
          ,
          <source>in: International Conference on Recent Trends in Computing</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>199</fpage>
          -
          <lpage>208</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>U.</given-names>
            <surname>Barman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sarma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Deka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lahkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Saikia</surname>
          </string-name>
          ,
          <article-title>Vit-smartagri: vision transformer and smartphone-based plant disease detection for smart agriculture</article-title>
          ,
          <source>Agronomy</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>327</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chulif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. L.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Deep learning for large-scale plant classification: Neuon submission to plantclef 2023</article-title>
          , in: CLEF (Working Notes),
          <year>2023</year>
          , pp.
          <fpage>2035</fpage>
          -
          <lpage>2042</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Goëau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Espitalier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bonnet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joly</surname>
          </string-name>
          , Overview of plantclef
          <year>2024</year>
          <article-title>: multi-species plant identification in vegetation plot images</article-title>
          ,
          <source>CEUR-WS</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ioffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wojna</surname>
          </string-name>
          ,
          <article-title>Rethinking the inception architecture for computer vision</article-title>
          , in
          <source>: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>2818</fpage>
          -
          <lpage>2826</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>X.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nan</surname>
          </string-name>
          ,
          <article-title>Inception-v3 for flower classification</article-title>
          , in:
          <year>2017</year>
          <article-title>2nd international conference on image, vision and computing (ICIVC)</article-title>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>783</fpage>
          -
          <lpage>787</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, Imagenet: A large-scale hierarchical image database, in: 2009 IEEE conference on computer vision and pattern recognition, IEEE, 2009, pp. 248-255.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 779-788.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] G. Jocher, J. Qiu, Ultralytics YOLO11, 2024. URL: https://github.com/ultralytics/ultralytics.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] M. Buda, A. Maki, M. A. Mazurowski, A systematic study of the class imbalance problem in convolutional neural networks, Neural Networks 106 (2018) 249-259.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] H. He, E. A. Garcia, Learning from imbalanced data, IEEE Transactions on Knowledge and Data Engineering 21 (2009) 1263-1284.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] C. Shorten, T. M. Khoshgoftaar, A survey on image data augmentation for deep learning, Journal of Big Data 6 (2019) 1-48.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] F. J. J. Joseph, S. Nonsiri, A. Monsakul, Keras and TensorFlow: A hands-on experience, Advanced Deep Learning for Engineers and Scientists: A Practical Approach (2021) 85-111.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] W. McKinney, et al., pandas: a foundational python library for data analysis and statistics, Python for High Performance and Scientific Computing 14 (2011) 1-9.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[25] P. Jiang, D. Ergu, F. Liu, Y. Cai, B. Ma, A review of YOLO algorithm developments, Procedia Computer Science 199 (2022) 1066-1073.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>