<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of LifeCLEF Plant Identification task 2020</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Herve Goeau</string-name>
          <email>herve.goeau@cirad.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierre Bonnet</string-name>
          <email>pierre.bonnet@cirad.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexis Joly</string-name>
          <email>alexis.joly@inria.fr</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AMAP, Univ Montpellier</institution>
          ,
          <addr-line>CIRAD, CNRS, INRAE, IRD, Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>CIRAD, UMR AMAP</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Inria ZENITH team</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>LIRMM</institution>
          ,
          <addr-line>Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Automated identification of plants has improved considerably thanks to the recent progress in deep learning and the availability of training data with more and more photos in the field. However, this profusion of data only concerns a few tens of thousands of species, mostly located in North America and Western Europe, and much less the richest regions in terms of biodiversity such as tropical countries. On the other hand, for several centuries, botanists have collected, catalogued and systematically stored plant specimens in herbaria, particularly in tropical regions, and the recent efforts by the biodiversity informatics community have made it possible to put millions of digitized sheets online. The LifeCLEF 2020 Plant Identification challenge (or "PlantCLEF 2020") was designed to evaluate to what extent automated identification on the flora of data-deficient regions can be improved by the use of herbarium collections. It is based on a dataset of about 1,000 species mainly focused on the Guiana Shield of South America, an area known to have one of the greatest diversities of plants in the world. The challenge was evaluated as a cross-domain classification task where the training set consists of several hundred thousand herbarium sheets and a few thousand photos to enable learning a mapping between the two domains. The test set was exclusively composed of photos in the field. This paper presents the resources and assessments of the conducted evaluation, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.</p>
      </abstract>
      <kwd-group>
        <kwd>LifeCLEF</kwd>
        <kwd>PlantCLEF</kwd>
        <kwd>plant</kwd>
        <kwd>domain adaptation</kwd>
        <kwd>cross-domain classification</kwd>
        <kwd>tropical flora</kwd>
        <kwd>Amazon rainforest</kwd>
        <kwd>Guiana Shield</kwd>
        <kwd>species identification</kwd>
        <kwd>fine-grained classification</kwd>
        <kwd>evaluation</kwd>
        <kwd>benchmark</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Automated identification of plants and animals has improved considerably in
the last few years. In the scope of the LifeCLEF 2017 Plant Identification
challenge [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] in particular, impressive identification performance was measured
thanks to recent deep learning models (e.g. up to 90% classification accuracy
over 10K species), and it was shown in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] that automated systems are
nowadays not so far from human expertise. However, these conclusions are valid
for species that mostly live in Europe and North America. Therefore, the
LifeCLEF 2019 Plant Identification challenge focused on tropical countries,
where there are typically far fewer collected observations and images and
where the flora is much more difficult for human experts to identify [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
In the meantime, biodiversity informatics initiatives such as iDigBio or
e-ReColNat have made available online millions of digitized herbarium sheets collected
over several centuries and conserved in natural history museums around the
world. For centuries, botanists have collected, catalogued and systematically
stored plant specimens in herbaria. These physical specimens are used to study
the variability of species, their phylogenetic relationships, their evolution, or
phenological trends. One of the key steps in the workflow of botanists and taxonomists
is to find the herbarium sheets that correspond to a new specimen observed in
the field. This task requires a high level of expertise and can be very tedious.
Developing automated tools to facilitate this work is thus of crucial importance.
In the continuity of the PlantCLEF challenges organized in previous years [4-12],
the LifeCLEF 2020 Plant Identification challenge presented in this paper was
designed to evaluate to what extent automated identification on the flora of
data-deficient regions can be improved by the use of natural history collections of
herbarium sheets. Many species in tropical countries are not easily accessible,
resulting in a very limited number of photos collected in the field, while
several hundred or even several thousand herbarium sheets have been collected
over the centuries. Herbarium collections potentially represent a large pool of data
for training species prediction models, but they also induce a much more difficult
problem, usually referred to as a cross-domain classification task. Indeed, a plant
photographed in the field may have a very different visual appearance from its
dried version placed on a herbarium sheet (as illustrated in Figure 1).
      </p>
    </sec>
    <sec id="sec-2">
      <title>Datasets and task description</title>
      <sec id="sec-2-1">
        <title>Training set</title>
        <p>The conducted study was based on a newly created dataset of 997 species mainly
focused on the Guiana Shield and the Northern Amazon rainforest (see Figure
2), an area known to have one of the greatest diversities of plants and animals
in the world. The dataset contains 321,270 herbarium sheets (see Table 1 for
detailed information). About 12% were collected in French Guiana and are hosted
in the "Herbier IRD de Guyane" (IRD Herbarium of French Guiana). These
herbarium sheets were digitized in the context of the e-ReColNat project
(https://explore.recolnat.org/search/botanique/type=index). The
remaining herbarium sheets come from the iDigBio portal
(http://portal.idigbio.org/portal/search), the US National
Resource for Advancing Digitization of Biodiversity Collections.</p>
        <p>
          In order to enable learning a mapping between the two domains (i.e. between
the "source" domain of herbarium sheets and the "target" domain of field
photos), a relatively smaller set of 6,316 photos in the field was provided in addition
to the large herbarium sheets dataset. About 62% of them also come from the
iDigBio portal and were acquired by various photographers related to
numerous institutes and national museums that share their data in iDigBio. In addition,
two highly trusted experts of the French Guiana flora, Marie-Françoise Prévost
"Fanchon" [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and Jean-François Molino (https://scholar.google.fr/citations?user=xZXYc4kAAAAJ&amp;hl=fr),
provided the remaining field photos, which were divided between the training set and the test set.
        </p>
        <p>A valuable asset of the training set is that a set of 354 plant observations is
provided with both herbarium sheets and field photos for the same individual
plant. This potentially allows a more precise mapping between the two domains
(see Figure 1 for an example).</p>
        <p>It should also be noted that about half of the species in the training set (495
to be precise) are only represented by herbarium sheets, and it is therefore not
possible to learn to recognize them directly from field photos.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Test set</title>
        <p>The test set was composed of 3,186 photos in the field related to 638 plant
observations (about 5 pictures per plant on average). To avoid bias related to
similar pictures coming from neighboring plants in the same observation site, we
ensured that all observations of a given species by a given collector were either
in the training set or in the test set but never spread over the two sets. For
instance, for the observations of J.F. Molino, the 166 species in the test set are
different from the 125 species in the training set.</p>
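        <p>The constraint described above (all observations of a given species by a given collector fall entirely in one set) can be sketched as a grouped split. This is only an illustration and not the organizers' actual procedure: the record keys obs_id, species and collector, the function name grouped_split, the test_fraction parameter and the hash-based assignment are all assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import hashlib
from collections import defaultdict


def grouped_split(observations, test_fraction=0.3):
    """Assign every (species, collector) group wholly to 'train' or 'test'.

    `observations` is a list of dicts with the hypothetical keys
    'obs_id', 'species' and 'collector'.
    """
    groups = defaultdict(list)
    for obs in observations:
        groups[(obs["species"], obs["collector"])].append(obs["obs_id"])

    split = {"train": [], "test": []}
    for key, obs_ids in groups.items():
        # Deterministic pseudo-random value in [0, 1) derived from the group key,
        # so that a group is never spread over the two sets.
        digest = hashlib.sha256("|".join(key).encode("utf-8")).hexdigest()
        bucket = int(digest[:8], 16) / 16**8
        target = "test" if bucket < test_fraction else "train"
        split[target].extend(obs_ids)
    return split
```
]]></preformat>
        <p>Hashing the group key rather than shuffling individual observations keeps the assignment reproducible; the resulting test share is only approximately equal to test_fraction.</p>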
        <p>Most importantly, plant species in the test set were selected according to
the number of field photos illustrating them in the training set. As can be
observed in Figure 3 (a), priority was given to species with few or no field
pictures at all. Such a choice may seem drastic, making the task extremely
difficult, but the underlying idea was to encourage and promote methods that
are as generic as possible, capable of transferring knowledge between the two
domains, even without any examples in the target domain for some classes. The
second motivation for this choice was to impose a mapping between herbarium
and field photos and to prevent classical CNN-based methods from performing well
because of an abundance of field photos in the training set rather than through the use
of herbarium sheets.</p>
      </sec>
      <sec id="sec-2-3">
        <title>External training sets</title>
        <p>Participants in the evaluation were allowed to use complementary training data
(e.g. for pre-training purposes) on the condition that (i) the experiment is
entirely reproducible, i.e. that the external resource used is clearly referenced
and accessible to any other research group, (ii) the use or not of external training data
is clearly mentioned for each evaluated method, and (iii) the additional
resource does not contain any of the test observations. External training data
was thus allowed, but participants had to provide at least one submission that
used only the training data provided this year.</p>
        <p>The goal of the task was to identify the correct species of the 638 plants of the
test set. For every plant, the evaluated systems had to return a list of species,
ranked without ties. Each participating group was allowed to submit up to
10 run files built from different methods or systems (a run file is a formatted
text file containing the species predictions for all test items).</p>
        <p>The main evaluation measure for the challenge was the Mean Reciprocal
Rank (MRR), which is defined as the mean of the multiplicative inverse of the
rank of the correct answer:
<disp-formula><tex-math><![CDATA[\mathrm{MRR} = \frac{1}{Q} \sum_{q=1}^{Q} \frac{1}{\mathrm{rank}_q}]]></tex-math></disp-formula>
where Q is the number of plant observations and rank_q is the predicted rank of
the true label for the q-th plant observation.</p>
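        <p>As a minimal sketch of this measure, the function below averages the reciprocal rank of the true species over all observations. The data layout (dictionaries keyed by observation id) and the choice to count a missing true species as 0 are assumptions made for illustration; the official evaluation script may differ.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def mean_reciprocal_rank(ranked_predictions, true_labels):
    """Mean Reciprocal Rank over plant observations.

    ranked_predictions: dict mapping an observation id to a list of species
        ids ordered from most to least likely (no ties).
    true_labels: dict mapping each observation id to its true species id.
    A true species absent from the list contributes 0 (an assumption here).
    """
    total = 0.0
    for obs_id, truth in true_labels.items():
        ranking = ranked_predictions.get(obs_id, [])
        if truth in ranking:
            total += 1.0 / (ranking.index(truth) + 1)  # ranks start at 1
    return total / len(true_labels)


# Toy example: the true species "a" is ranked 1st, 2nd and 3rd respectively.
preds = {"obs1": ["a", "b", "c"], "obs2": ["b", "a", "c"], "obs3": ["c", "b", "a"]}
truth = {"obs1": "a", "obs2": "a", "obs3": "a"}
score = mean_reciprocal_rank(preds, truth)  # (1 + 1/2 + 1/3) / 3, about 0.611
```
]]></preformat>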
        <p>A second evaluation measure was again the MRR but computed on a subset
of observations of difficult species that are rarely photographed in the field. The
species were chosen based on the most comprehensive estimates possible from
different data sources (iDigBio, GBIF, Encyclopedia of Life, Bing and Google
Image search engines, previously provided by the organizers or some
participants of previous editions of the PlantCLEF and ExpertCLEF challenges). It
is therefore a more challenging metric because it focuses on the species which
impose a mapping between herbarium and field photos. Figure 3 (b) revises
Figure 3 (a) according to the considered external data sources and
shows that many plant observations in the difficult test subset are related to
species estimated to have fewer than 10 field photos.</p>
        <p>Course of the challenge: The training data was publicly shared in early
February 2020 through the AICrowd platform
(https://www.aicrowd.com/challenges/lifeclef-2020-plant). Any research team wishing to
participate in the evaluation could register on the platform and download the
data. The test data was shared in mid-April but without the species labels, which
were kept secret. Each team could then submit up to 10 submissions
corresponding to different methods or different settings of the same method. A submission
(also called a run) takes the form of a CSV file containing the predictions of the
method being evaluated for all observations in the test set. For each submission,
the evaluation metrics were then computed automatically and made visible
to the participant. Once the submission phase was closed (mid-June), the
participants could also see the evaluation metric values of the other participants. As
a last important step, each participant was asked to provide a working note, i.e.
a detailed technical report containing all technical information required to
reproduce the results of all submissions. All LifeCLEF working notes are reviewed by
at least two members of the LifeCLEF organizing committee to ensure a sufficient
level of quality and reproducibility.</p>
        <p>Participants and methods</p>
        <p>
          A total of 68 participants registered for the PlantCLEF challenge 2020 and downloaded the
data set, and 7 research groups succeeded in submitting a total of 49 runs, i.e.
files containing the predictions of the system(s) they ran. Details of the
methods are developed in the individual working notes of most of the participants
(LU [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], ITCR PlantNet [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], Neuon AI [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and SSN [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]). The other teams did
not provide a detailed description of their systems, but some informal
descriptions were sometimes provided in the metadata associated with the submissions
and partially contributed to the comments below.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>LU, Lehigh University, USA, 10 runs [21]</title>
        <p>
          This team used a Partial Domain Adaptation (PDA) approach, corresponding
to the scenario where target categories are only a subset of source categories,
to promote positive transfer. They
first extracted deep features from a pre-trained NASNetLarge model [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] and
found the categories shared between the two domains. They then developed a novel
Adversarial Consistent Learning (ACL) approach through a unified deep
architecture which combines a source domain classification loss, an adversarial loss
and a feature consistent loss. The adversarial loss helps to learn domain-invariant
features while the feature consistent loss aims to preserve the fine-grained
feature transition between the two domains.
        </p>
        <p>
          Neuon AI, Malaysia, 7 runs [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]: this team developed a triplet loss
network [
          <xref ref-type="bibr" rid="ref1 ref18">1, 18</xref>
          ] between herbarium sheets and field images, trained to maximize the
embedding distances between different species while minimizing the embedding
distances within the same species. First, two InceptionV4 CNNs [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] were fine-tuned, one
exclusively with the herbarium sheets related to the 997 classes, and another
one with more than 1 million pictures related to 10k classes from the PlantCLEF
2017 dataset [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The networks are then fused in a final embedding layer trained
to optimize the distances between two embeddings based on triplet loss, but
only on the subset of 435 classes that contain both herbarium sheets and field
photos in the PlantCLEF 2020 training set. Then, each class is associated with a
single embedding computed as the average of embeddings from random
herbarium sheets of the class. For inference, a plant observation is then associated with
a single embedding computed as the average of the embeddings from all field
photos of the observation. The cosine similarity is used as a distance metric
between the embeddings of all the herbarium classes and the embedding of the
tested field observation. It is then transformed with Inverse Distance Weighting
into probabilities for ranking the classes. The best result was reached with an
ensemble of 3 triplet loss networks, without any frozen layers and with many data
augmentation techniques on both sides (training and test pictures).
        </p>
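        <p>The inference steps described above (class embeddings averaged over herbarium sheets, observation embeddings averaged over field photos, cosine-based distance, inverse distance weighting) can be sketched as follows. This is a simplified illustration and not the Neuon AI implementation: the embeddings are assumed to be precomputed lists of floats, and the exact weighting (1 / distance**power, normalised to sum to 1), the power and eps parameters and all function names are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def mean_vector(vectors):
    """Element-wise average of equally sized embedding vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors, clamped at 0."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return max(0.0, 1.0 - dot / (norm_u * norm_v))


def rank_species(field_embeddings, herbarium_embeddings_by_class, power=1.0, eps=1e-9):
    """Rank classes for one plant observation.

    The inverse distance weighting used here (weight = 1 / distance**power,
    normalised to sum to 1) is an assumed form, not taken from the source.
    """
    observation = mean_vector(field_embeddings)
    weights = {}
    for species, sheets in herbarium_embeddings_by_class.items():
        distance = cosine_distance(observation, mean_vector(sheets))
        weights[species] = 1.0 / (distance + eps) ** power
    total = sum(weights.values())
    probabilities = {s: w / total for s, w in weights.items()}
    # Most probable species first.
    return sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
```
]]></preformat>
        <p>Averaging the field photo embeddings before comparison means that one ranking is produced per observation rather than per image, which matches the observation-level evaluation of the task.</p>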
      </sec>
      <sec id="sec-2-5">
        <title>ITCR - PlantNet, Costa Rica &amp; France, 10 runs [20]</title>
        <p>
          This team based most of its runs on a Few Shot Adversarial Domain Adaptation approach [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
(FSADA) where the purpose is to learn a domain-agnostic feature space while
preserving the discriminative ability of the features for performing the species
classification task. First, a ResNet50 is fine-tuned on the herbarium sheets only.
Then, the encoder part of the ResNet50, without the final classifier layers, is
frozen and used to extract features from herbarium sheets or field photos. Then,
given random pairs of extracted features, a discriminator is trained to distinguish
4 categories: (1) different domains and different classes, (2) different domains and
same class, (3) same domain and different classes, (4) same domain and same
class. Finally, during a last stage, the encoder, the discriminator and the
classifier are trained together. Domain adaptation is achieved once the discriminator
is not able to distinguish samples from categories (1) and (2) and categories
(3) and (4), i.e. when the discriminator is no longer able to tell which was the original
domain. This participant attempted several improvements based on a jigsaw
self-supervision technique and/or the use of the taxonomic information provided
in the metadata (with multi-task classifiers and multiple discriminators). The
best result was obtained with an ensemble of several variations of FSADA
models, while the best single model used 3 taxonomic levels (species, genus,
family) and external datasets (PlantCLEF 2019 [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and GBIF [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]).
        </p>
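        <p>The four pair categories used to train the discriminator can be illustrated with a small labelling helper. This sketch only covers how pairs are labelled, following the numbering given in the text; it does not reproduce the FSADA training itself. Representing a sample as a (domain, species) tuple and the function names pair_category and sample_pairs are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def pair_category(sample_a, sample_b):
    """Return the discriminator target (1 to 4) for a pair of samples.

    Each sample is a (domain, species) tuple, with domain being
    'herbarium' or 'field'. Numbering follows the text:
      1: different domains, different classes
      2: different domains, same class
      3: same domain, different classes
      4: same domain, same class
    """
    same_domain = sample_a[0] == sample_b[0]
    same_class = sample_a[1] == sample_b[1]
    if not same_domain:
        return 2 if same_class else 1
    return 4 if same_class else 3


def sample_pairs(samples, n_pairs, seed=0):
    """Draw random pairs of distinct samples and label them."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        a, b = rng.sample(samples, 2)
        pairs.append((a, b, pair_category(a, b)))
    return pairs
```
]]></preformat>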
      </sec>
      <sec id="sec-2-6">
        <title>SSN College of Engineering, India, 2 runs [16]</title>
        <p>This team used a classical CNN approach based on a ResNet50, which did not
give very good results given the limited number of field photos in the training set.
It seems that they did not use any external data.</p>
        <p>
          The 3 other teams did not provide an extended description of their systems.
According to the description provided in the submission system, the UWB team
(3 runs) used a classical CNN approach based on ResNet18 with various
combinations of external data [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Unfortunately, at the time of writing, there is not
enough information about the submissions from the "To Be" (10 runs) and the
"Domain" (7 runs) teams.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>We report in Figure 4 and Table 4 the performances achieved by the 49
evaluated runs. Figure 5 reorganizes the results according to the second MRR metric
focusing on the most difficult species.</p>
      <p>The main outcomes we can derive from these results are the following:</p>
      <sec id="sec-3-1">
        <title>A very difficult task even with the most advanced deep learning techniques</title>
        <p>
          The best MRR value obtained across all evaluated methods was equal to
0.18. This is quite low compared to the MRR values measured within more
classical plant identification benchmarks, e.g. MRR=0.92 in the LifeCLEF 2017 Plant
Identification challenge [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] and MRR=0.36 in the LifeCLEF 2019 Plant Identification
challenge focused on tropical flora [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. As already noticed, tropical flora is
inherently more difficult than generalist flora (even for experts), which partially
explains the low performance achieved. The asymmetry between training data
based on herbarium sheets and test data based on field photos adds
considerable difficulty. It is important to note that the low scores achieved do not mean
that the use of herbarium sheets does not improve the identification. The species
in the test set were actually selected as the ones having very few field pictures in
the training set. The performance that would have been obtained on these species
without using any herbarium sheet would have been considerably weaker.
        </p>
        <p>
          Traditional CNNs performed poorly. Figure 4 shows a great disparity
between the performances obtained by the different evaluated methods. To explain
this, we first have to distinguish between approaches based on classical CNNs
alone (typically pre-trained on ImageNet and fine-tuned with the provided
training data) and approaches that additionally incorporate an explicit and formal
domain adaptation technique between the herbarium and field domains. As
expected given the low number of field photos in the training set for numerous
species, CNNs directly fine-tuned on the PlantCLEF 2020 training set
obtained the lowest scores (ITCR PlantNet Run 1, SSN Run 1&amp;2, UWB Run 1).
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>The use of external training data with classical CNNs did not greatly improve performance</title>
        <p>
          It provides some improvement on the main
evaluation metric, as demonstrated by UWB Runs 2 &amp; 3 and ITCR PlantNet
Run 2. All these runs extended the training data with the PlantCLEF 2019
training data [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and the GBIF training data provided by [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. ITCR PlantNet Run
2 made a greater improvement on the main MRR metric, probably because it
used a two-stage training strategy: a pre-trained ResNet50 was first fine-tuned
with all the herbarium sheets from PlantCLEF 2020, and then fine-tuned again
with all the field photos from PlantCLEF 2020 and the external training data
(PlantCLEF 2019 and GBIF). This two-stage strategy can arguably be seen as a first
naive domain adaptation technique because the second stage shifts the features
learned in an initial herbarium feature space to a field feature space. However,
regarding the second MRR metric focusing on the most difficult species with few
field photos in the training set, performance for all the aforementioned runs is
still quite low. This means that the performance of a classical CNN approach,
without a formal domain adaptation technique, is too dependent on the
number of field photos available in the training data, and that such an approach is not able
to efficiently transfer visual knowledge from the herbarium domain to the field photos domain.
        </p>
        <p>
          An adversarial domain adaptation technique performed the best. Among
the other submissions, two participants stood out from the crowd with two
different domain adaptation techniques. The ITCR PlantNet team based all its remaining
runs on a Few Shot Adversarial Domain Adaptation approach [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] (FSADA),
directly applied in ITCR PlantNet Run 3. The FSADA approach uses a
discriminator that helps the initial encoder trained on herbarium sheets to shift the
learned feature representations to a domain-agnostic feature space where the
discriminator is no longer able to distinguish whether a picture comes from the
herbarium or the field domain, while maintaining the discriminative power regarding
the final species classification task. The basic FSADA approach (ITCR
PlantNet Run 3) clearly outperformed the traditional CNN approach (ITCR PlantNet
Run 1), although both approaches are based on the same initial ResNet50
model fine-tuned on the herbarium training data. It should be noted that the LU team also
used an adversarial approach but with less success.
        </p>
        <p>A mapping domain adaptation technique reached impressive
genericity on difficult species. While the adversarial domain adaptation technique
used by the ITCR PlantNet team obtained the best results on the overall MRR
metric, the Neuon AI team obtained the best results on the second MRR metric
focusing on the most difficult species in the test set. Contrary to the approach
used by ITCR, which tries to learn a common feature space agnostic to both
domains, the Neuon AI team fine-tunes two networks, each dedicated
to one of the domains, and optimizes a distance that maps the two domains
for the purpose of classifying species. Neuon AI Run 5, which is an ensemble
of 3 instances of their approach, gave particularly impressive results with fairly
high and, more importantly, equivalent values for both MRR measures. This means
that Neuon AI's approach is very robust to the lack of training field photos and
able to generalize to rare, difficult species in the test set. In other words, their
approach is able to transfer knowledge to rare species, which was the underlying
objective of the challenge.</p>
      </sec>
      <sec id="sec-3-3">
        <title>External data improved domain adaptation approaches</title>
        <p>ITCR PlantNet Run 4 shows a significant impact on the main MRR metric from using
external training data compared to the same adversarial domain adaptation
approach without it (ITCR PlantNet Run 3), while maintaining the same level of genericity
on rare species, with similar MRR values on the second metric. Unfortunately, it
is not possible to measure this impact on the Neuon AI method because they
did not provide a run using only this year's training data.</p>
        <p>Multi-task approaches have a positive impact on performance. Some
teams implemented multi-task approaches, i.e. they added tasks other than
the main species identification task to the global optimization problem. Such
approaches are known to potentially improve the performance of the main task
by providing additional knowledge to the model and helping the extraction of
potentially useful common features. The use of upper taxon level information, in
particular, was successful in ITCR PlantNet Run 6 (using a multi-classification
task integrated into the FSADA approach) compared to ITCR PlantNet Run 4,
which used only the species classification task. Notably, this is the first time across
all LifeCLEF plant identification challenges that we clearly observe an
important gain from the use of genus and family information to improve species
identification. Many species with little training data have apparently been able
to benefit indirectly from a "sibling" species with more data belonging to the same
genus or family. The impact is probably enhanced this year because of the lack
of visual data on many species. To a lesser extent, a self-supervision auxiliary task
such as a jigsaw solving prediction task (ITCR PlantNet Run 5) slightly improved
the baseline of this team (ITCR PlantNet Run 4), and the best submission overall
in this year's challenge is an ensemble of all FSADA approaches, with or without
self-supervision and with or without higher taxonomic levels.</p>
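        <p>One common way to add taxonomic levels as auxiliary tasks is to sum one classification loss per level. The sketch below illustrates that idea only; it is not the ITCR PlantNet implementation, and the equal default weights, the dictionary-based data layout and the function names are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def cross_entropy(probabilities, target):
    """Negative log-likelihood of the target label."""
    return -math.log(probabilities[target])


def multitask_loss(predictions, targets, weights=None):
    """Weighted sum of per-level cross-entropies.

    predictions: dict level -> dict label -> predicted probability
    targets: dict level -> true label
    weights: dict level -> loss weight (hypothetical; defaults to 1.0 each)
    """
    weights = weights or {level: 1.0 for level in targets}
    return sum(
        weights[level] * cross_entropy(predictions[level], targets[level])
        for level in targets
    )


# Toy example with the three levels mentioned in the text.
example_predictions = {
    "species": {"s1": 0.5, "s2": 0.5},
    "genus": {"g1": 0.25, "g2": 0.75},
    "family": {"f1": 1.0},
}
example_targets = {"species": "s1", "genus": "g1", "family": "f1"}
example_loss = multitask_loss(example_predictions, example_targets)
```
]]></preformat>
        <p>With such a loss, a species with few images still receives a training signal through its genus and family labels, which are shared with better-documented species.</p>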
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>This paper presented the overview and the results of the LifeCLEF 2020 plant
identification challenge, following the 9 previous editions conducted within the CLEF
evaluation forum. This year's task was particularly challenging, focusing on
species rarely photographed in the field in the northern tropical Amazon. The
results revealed that the latest advances in domain adaptation enable the use of
herbarium data to facilitate the identification of rare tropical species for which
no or very few other training photos are available. A mapping domain
adaptation technique based on a two-streamed Herbarium-Field triplet loss network
reached an impressive genericity by obtaining quite high and similar results
regardless of whether the species have many or very few field photos in the training
set. On the other hand, a Few Shot Adversarial Domain Adaptation technique
outperformed all the other approaches according to the main metric but not
with the same genericity according to the second metric, even if the use of
taxonomic information can improve the genericity. The results are thus mixed
and leave room for improvement in the near future on both aspects: raw
performance and genericity. We believe that the proposed task may become
a new baseline dataset in the field of domain adaptation, and motivate
new contributions through a realistic and crucial use case for the plant biology
research community.</p>
      <p>Acknowledgements This project has received funding from the French
National Research Agency under the Investments for the Future Program, referred
to as ANR-16-CONV-0004, and from the European Union's Horizon 2020 research
and innovation program under grant agreement No 863463 (Cos4Cloud project).
This work was supported in part by the Microsoft AI for Earth program.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Argueso,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Picon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Irusta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            ,
            <surname>Medela</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , San-Emeterio,
          <string-name>
            <given-names>M.G.</given-names>
            ,
            <surname>Bereciartua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Alvarez-Gila</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Few-shot learning approach for plant disease classi cation using images taken in the eld</article-title>
          .
          <source>Computers and Electronics in Agriculture 175</source>
          ,
          <issue>105542</issue>
          (
          <year>2020</year>
          ). https://doi.org/https://doi.org/10.1016/j.compag.
          <year>2020</year>
          .
          <volume>105542</volume>
          , http:// www.sciencedirect.com/science/article/pii/S0168169920302544
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chulif</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>Y.L.</given-names>
          </string-name>
          :
          <article-title>Herbarium-field triplets network for cross-domain plant identification - neuon submission to lifeclef 2020 plant</article-title>
          . In: CLEF working notes
          <year>2020</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2020</year>
          , Thessaloniki, Greece.
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Delprete</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feuillet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Marie-Francoise Prevost "Fanchon" (1941–2013)</article-title>
          .
          <source>Taxon</source>
          <volume>62</volume>
          (
          <issue>2</issue>
          ),
          <fpage>419</fpage>
          –
          <lpage>419</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Plant identification in an open-world (lifeclef 2016)</article-title>
          . In: CLEF task overview
          <year>2016</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2016</year>
          , Evora, Portugal.
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Plant identification based on noisy web data: the amazing performance of deep learning (lifeclef 2017)</article-title>
          . In: CLEF task overview
          <year>2017</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2017</year>
          , Dublin, Ireland. (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of expertlifeclef 2018: how far automated identification systems are from the best experts?</article-title>
          In: CLEF task overview 2018, CLEF: Conference and Labs of the Evaluation Forum
          , Sep.
          <year>2018</year>
          , Avignon, France. (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of lifeclef plant identification task 2019: diving into data deficient tropical countries</article-title>
          .
          <source>In: CLEF task overview</source>
          <year>2019</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2019</year>
          , Lugano, Switzerland.
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bakic</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>The imageclef 2013 plant identification task</article-title>
          .
          <source>In: CLEF task overview</source>
          <year>2013</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2013</year>
          , Valencia, Spain.
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Birnbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouysset</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Picard</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The imageclef 2011 plant images classification task</article-title>
          .
          <source>In: CLEF task overview</source>
          <year>2011</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2011</year>
          , Amsterdam, Netherlands. (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yahiaoui</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>Imageclef2012 plant images identification task</article-title>
          .
          <source>In: CLEF task overview</source>
          <year>2012</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2012</year>
          , Rome, Italy. (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Goeau, H.,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Lifeclef plant identification task 2015</article-title>
          .
          <article-title>In: CLEF task overview 2015, CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2015</year>
          , Toulouse, France. (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. Goeau, H.,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selmi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>The lifeclef 2014 plant images identification task</article-title>
          .
          <source>In: CLEF task overview</source>
          <year>2014</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2014</year>
          , Sheffield, United Kingdom.
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Goeau, H.,
          <string-name>
            <surname>Botella</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glotin</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vellinga</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Overview of lifeclef 2018: a large-scale evaluation of species identification and recommendation algorithms in the era of AI</article-title>
          . In:
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>G.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lawless</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mandl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (eds.) CLEF:
          <article-title>Cross-Language Evaluation Forum for European Languages</article-title>
          .
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction</source>
          , vol.
          <source>LNCS</source>
          . Springer, Avignon, France (Sep
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Goeau, H.,
          <string-name>
            <surname>Glotin</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spampinato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vellinga</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lombardo</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Planque</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palazzo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Lifeclef 2017 lab overview: multimedia species identification challenges</article-title>
          .
          <source>In: International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          . pp.
          <fpage>255</fpage>
          –
          <lpage>274</lpage>
          . Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Motiian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iranmanesh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doretto</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Few-shot adversarial domain adaptation</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          . pp.
          <fpage>6670</fpage>
          –
          <lpage>6680</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          Nanda H Krishna, Ram Kaushik R, R.M.:
          <article-title>Plant species identification using transfer learning - plantclef 2020</article-title>
          . In: CLEF working notes
          <year>2020</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2020</year>
          , Thessaloniki, Greece.
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Picek</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sulc</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Recognition of the amazonian flora by inception networks with test-time class prior estimation</article-title>
          .
          <source>In: CLEF working notes</source>
          <year>2019</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2019</year>
          , Lugano, Switzerland.
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Schroff</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalenichenko</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Philbin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Facenet: A unified embedding for face recognition and clustering</article-title>
          .
          <source>2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          (Jun
          <year>2015</year>
          ). https://doi.org/10.1109/cvpr.2015.7298682, http://dx.doi.org/10.1109/CVPR.2015.7298682
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Szegedy</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Ioffe, S.,
          <string-name>
            <surname>Vanhoucke</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Inception-v4, inception-resnet and the impact of residual connections on learning</article-title>
          .
          <source>arXiv preprint arXiv:1602.07261</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Villacis</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mata-Montero</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Domain adaptation in the context of herbarium collections: a submission to plantclef 2020</article-title>
          . In: CLEF working notes
          <year>2020</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2020</year>
          , Thessaloniki, Greece.
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davison</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          :
          <article-title>Adversarial consistent learning on partial domain adaptation of plantclef 2020 challenge</article-title>
          . In: CLEF working notes
          <year>2020</year>
          ,
          <article-title>CLEF: Conference and Labs of the Evaluation Forum</article-title>
          , Sep.
          <year>2020</year>
          , Thessaloniki, Greece.
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Zoph</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vasudevan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shlens</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>Q.V.</given-names>
          </string-name>
          :
          <article-title>Learning transferable architectures for scalable image recognition</article-title>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>