<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of ExpertLifeCLEF 2018: how far automated identification systems are from the best experts?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Herve Goeau</string-name>
          <email>herve.goeau@cirad.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierre Bonnet</string-name>
          <email>pierre.bonnet@cirad.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexis Joly</string-name>
          <email>alexis.joly@inria.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRAD, UMR AMAP</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Inria ZENITH team</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LIRMM</institution>
          ,
          <addr-line>Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Automated identification of plants and animals has improved considerably in the last few years, in particular thanks to the recent advances in deep learning. The next big question is how far such automated systems are from human expertise. Indeed, even the best experts are sometimes confused and/or disagree with each other when validating visual or audio observations of living organisms. A picture actually contains only partial information, which is usually not sufficient to determine the right species with certainty. Quantifying this uncertainty and comparing it to the performance of automated systems is of high interest for both computer scientists and expert naturalists. The LifeCLEF 2018 ExpertCLEF challenge presented in this paper was designed to allow this comparison between human experts and automated systems. In total, 19 deep-learning systems implemented by 4 different research teams were evaluated with regard to 9 expert botanists of the French flora. The main outcome of this work is that the performance of state-of-the-art deep learning models is now close to the most advanced human expertise. This paper presents in more detail the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.</p>
      </abstract>
      <kwd-group>
        <kwd>LifeCLEF</kwd>
        <kwd>ExpertCLEF</kwd>
        <kwd>plant</kwd>
        <kwd>expert</kwd>
        <kwd>leaves</kwd>
        <kwd>leaf</kwd>
        <kwd>flower</kwd>
        <kwd>fruit</kwd>
        <kwd>bark</kwd>
        <kwd>stem</kwd>
        <kwd>branch</kwd>
        <kwd>species</kwd>
        <kwd>retrieval</kwd>
        <kwd>images</kwd>
        <kwd>collection</kwd>
        <kwd>species identification</kwd>
        <kwd>citizen-science</kwd>
        <kwd>fine-grained classification</kwd>
        <kwd>evaluation</kwd>
        <kwd>benchmark</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Automated identification of plants and animals has improved considerably in the last few years. In the scope of LifeCLEF 2017 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] in particular, we measured impressive identification performance achieved thanks to recent deep learning models (e.g. up to 90% classification accuracy over 10K species). This raises the question of how far automated systems are from human expertise and of whether there is an upper bound that cannot be exceeded. A picture actually contains only partial information about the observed plant, and it is often not sufficient to determine the right species with certainty. For instance, a decisive organ such as the flower or the fruit might not be visible at the time the plant was observed, or some of the discriminant patterns might be very hard or unlikely to observe in a picture, such as the presence of hairs or latex, or the morphology of the root. As a consequence, even the best experts can be confused and/or disagree with each other when attempting to identify a plant from a set of pictures. Similar issues arise for most living organisms, including fishes, birds, insects, etc. Quantifying this intrinsic data uncertainty and comparing it to the performance of the best automated systems is of high interest for both computer scientists and expert naturalists. This was the goal of the ExpertCLEF challenge, organized as part of the LifeCLEF 2018 campaign [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>In the following sections, we synthesize the resources and assessments of the challenge, summarize the approaches and systems employed by the participating research groups, and provide an analysis of the main outcomes.</p>
    </sec>
    <sec id="sec-2">
      <title>Dataset</title>
      <p>
        To evaluate the above-mentioned scenario at a large scale and in realistic conditions, we built and shared several different datasets coming from different sources. As training data, we provided all the datasets used during the previous PlantCLEF challenge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The test set was built with the best experts in the plant domain in western Europe. For that test set we created sets of observations that had been identified in the field by other experts (in order to have a near-perfect gold standard). These pictures were immersed in a much larger test set that had to be processed by the participating systems.
      </p>
      <p>Trusted and noisy data in the training set: a trusted sub-training set was based on the online collaborative Encyclopedia Of Life (EoL). A list of 10K species was selected as the most populated species in EoL data after a curation pipeline (taxonomic alignment, duplicates removal, herbarium sheets removal, etc.). This training set contains 256,287 pictures in total but has a strong class imbalance, with a minimum of 1 picture for Achillea filipendulina and a maximum of 1,245 pictures for Taraxacum laeticolor. A noisy sub-training set was built through web crawlers (Google and Bing image search engines) and contains about 1.2 million images. This training set is also imbalanced, with a minimum of 4 pictures for Plectranthus sanguineus and a maximum of 1,732 pictures for Fagus grandifolia.</p>
      <p>The main objective of providing these 2 sub-datasets was to offer the participants the opportunity to evaluate to what extent machine learning techniques can learn from noisy data compared to trusted data. Pictures of EoL themselves come from different sources, including institutional databases as well as public data sources such as Wikimedia, iNaturalist, Flickr or various websites dedicated to botany. This aggregated data is continuously revised and rated by the EoL community, so that the quality of the species labels is globally very good. On the other hand, the noisy web dataset contains more images but with several types and levels of noise: some images are labeled with the wrong species name (but sometimes with the correct genus or family), some are portraits of a botanist specializing in the targeted species, some are labeled with the correct species name but are drawings or herbarium sheets, etc.</p>
      <p>Pl@ntNet test set: the test data to be analyzed within the challenge is a large sample of the query images submitted by the users of the Pl@ntNet mobile application (iPhone and Android). It contains a large number of wild plant species mostly coming from the Western European flora and the North American flora, but also plant species used all around the world as cultivated or ornamental plants, including some endangered species. This test set was obtained after a curation pipeline (collaborative species identification evaluation, author reputation, visual quality evaluation, etc.). It was extended with expert observations, according to the following procedure. First, 125 plants were photographed between May and June 2017 in a botanical garden called the "Parc floral de Paris" and in a natural area located north of Montpellier (southern France, close to the Mediterranean sea). The photos were taken with two smartphone models, an iPhone 5 and a Samsung S5 G930F, by a botanist and an amateur under his supervision. The selection of the species was motivated by several criteria, including (i) their membership in a difficult plant group (i.e. a group known to be the source of many confusions), (ii) the availability of well-developed specimens with clearly visible organs on the spot, and (iii) the diversity of the selected set of species in terms of taxonomy and morphology. About fifteen pictures of each specimen were acquired in order to cover all the informative parts of the plant. However, not all pictures were included in the final test set, in order to deliberately hide a part of the information and increase the difficulty of the identification. Therefore, a random selection of only 1 to 5 pictures was made for each specimen. In the end, a subset of 75 plants illustrated by a total of 216 images covering 33 families and 58 genera was selected.</p>
    </sec>
    <sec id="sec-3">
      <title>Task Description</title>
      <p>Based on the previously described testbed, we conducted a system-oriented evaluation involving different research groups who downloaded the data and ran their systems. Each participating group was allowed to submit up to 5 run files built from different methods (a run file is a formatted text file containing the species predictions for all test items). Semi-supervised, interactive or crowdsourced approaches were allowed but had to be clearly signaled within the submission system; however, none of the participants employed such methods. The main evaluation metric was the top-1 accuracy. The Pl@ntNet applications are available at https://itunes.apple.com/fr/app/plantnet/id600547573?mt=8 (iOS) and https://play.google.com/store/apps/details?id=org.plantnet (Android).</p>
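The top-1 accuracy metric used above can be made concrete with a minimal sketch. This is an illustration, not the challenge's official scoring code; the dictionary layout of `run` and `truth` is an assumption about how a run file's ranked predictions might be represented in memory.

```python
def top1_accuracy(predictions, ground_truth):
    """Fraction of test items whose top-ranked predicted species
    matches the expert-validated label."""
    correct = sum(
        1 for item, ranked in predictions.items()
        if ranked and ranked[0] == ground_truth[item]
    )
    return correct / len(ground_truth)

# Toy run file: test item id -> species names ranked by decreasing confidence.
run = {
    "obs1": ["Taraxacum laeticolor", "Achillea filipendulina"],
    "obs2": ["Fagus grandifolia"],
    "obs3": ["Galium aparine", "Galium verum"],
}
truth = {
    "obs1": "Taraxacum laeticolor",
    "obs2": "Fagus grandifolia",
    "obs3": "Galium verum",
}
```

Here only the first prediction of each ranked list counts, so `obs3` (correct species at rank 2) is scored as a miss.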
    </sec>
    <sec id="sec-4">
      <title>Participants and methods</title>
      <p>
        28 participants registered to the ExpertCLEF 2018 challenge. Among this large raw audience, 4 research groups finally succeeded in submitting run files. Details of the methods and evaluated systems are synthesized below and further developed in the working notes of the participants (CMP [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], MfN [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], SabanciU-GTU [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], TUC [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]). The following paragraphs give a few more details about the methods and the overall strategy employed by each participant.</p>
      <p>CMP, Dept. of Cybernetics, Czech Technical University in Prague, Czech Republic, 5 runs, [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]: used an ensemble of a dozen Convolutional Neural Networks (CNNs) based on 2 state-of-the-art architectures (Inception-ResNet-v2 and Inception-v4). The CNNs were initialized with weights pre-trained on ImageNet, then fine-tuned with different hyper-parameters and with the use of data augmentation (random horizontal flips, color distortions, and random crops for some models). Each single test image is also augmented with 14 transformations (central/corner crops, horizontal flips, none) to combine and improve the predictions. Still at test time, the predictions are computed using the Exponential Moving Average feature of TensorFlow, i.e. by averaging the predictions of the set of models trained during the last iterations of the training phase (with an exponential decay). This popular procedure is inspired by the Polyak averaging method [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and is known to sometimes produce significantly better results than using the last trained model alone. As a last step in their system, assuming that the distribution of the classes is strongly unbalanced between the test and the training sets, the outputs of the CNNs are adjusted according to an estimate of the class prior probabilities in the test set, computed with an Expectation-Maximization algorithm. The best score of the challenge, 88.4% top-1 accuracy, was obtained by this team with the largest ensemble (CMP Run 3). With half as many combined models, CMP Run 4 reached a close top-1 accuracy and even obtained a slightly better accuracy on the smaller test subset identified by human experts. This can be explained by the training strategy regarding the trusted and noisy sets: a comparison between CMP Runs 1 and 4 clearly illustrates that further refining a model on only the trusted training set, after learning it on the whole noisy training set, is not relevant. CMP Run 3, which combines all the models, seems to have its performance degraded by the inclusion of the models refined on the trusted training set when compared with CMP Run 4 on the test subset identified by human experts.</p>
      <p>MfN, Museum fuer Naturkunde Berlin, Leibniz Institute for Evolution and Biodiversity Science, Germany, 4 runs, [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]: followed approaches quite similar to those used the previous year during the PlantCLEF 2017 challenge [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This participant used an ensemble of fine-tuned CNNs pre-trained on ImageNet, based on 4 architectures (GoogLeNet, ResNet-152, ResNeXT, DualPathNet92), each trained with bagging techniques. Data augmentation was used systematically for each training run, in particular random cropping, horizontal flipping, and variations of saturation, lightness and rotation. For the three last transformations, the intensity of the transformation is correlated with the decrease of the learning rate during training, to let the CNNs see patches progressively closer to the original image towards the end of the training. Test images followed similar transformations for combining and boosting the accuracy of the predictions. MfN Run 1 basically used the best and winning approach of PlantCLEF 2017, averaging the predictions of 11 models based on 3 architectures (GoogLeNet, ResNet-152, ResNeXT). However, surprisingly, MfN Runs 2 and 3, which are based on only one architecture each (respectively ResNet-152 and DualPathNet92), both performed better than Run 1 combining several architectures and models. The combination of all the approaches in MfN Run 4 even seems to be penalized by the winning approach of PlantCLEF 2017.
      </p>
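CMP's last step, adjusting the CNN outputs to an EM estimate of the test-set class priors, can be sketched as follows. This is a simplified illustration in the spirit of the classic Saerens-style prior-correction procedure, not the team's actual code; the array shapes, the function name `em_prior_correction` and the iteration count are assumptions.

```python
import numpy as np

def em_prior_correction(probs, train_priors, n_iter=50):
    """Re-estimate class priors on an unlabeled test set with EM and
    rescale the classifier's posterior probabilities accordingly.

    probs:        (n_samples, n_classes) posteriors from a model
                  trained under class frequencies `train_priors`.
    train_priors: (n_classes,) class priors of the training set.
    """
    test_priors = train_priors.copy()
    for _ in range(n_iter):
        # E-step: rescale each posterior by the ratio of priors,
        # then renormalize every row to sum to one.
        adjusted = probs * (test_priors / train_priors)
        adjusted /= adjusted.sum(axis=1, keepdims=True)
        # M-step: new prior estimate = mean adjusted posterior.
        test_priors = adjusted.mean(axis=0)
    return adjusted, test_priors

# Toy example: 3 test samples, 2 classes, uniform training priors.
probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
train_priors = np.array([0.5, 0.5])
adjusted, test_priors = em_prior_correction(probs, train_priors)
```

When the test distribution is skewed towards one class, the estimated `test_priors` drift in that direction and the adjusted posteriors shift with them, which is the effect the CMP runs exploited.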
      <p>
        SabanciU-GTU, Sabanci University, Turkey, 5 runs, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]: fine-tuned and combined two recent successful CNN architectures: DenseNet (Densely Connected Convolutional Networks) and SENet (Squeeze-and-Excitation Networks), more precisely a SENet-ResNet-50. Indeed, SENet introduces building blocks that can be integrated into any modern CNN such as ResNet-50 and that are designed to improve channel interdependencies by adding parameters to each channel of a convolutional block, so that the network can adaptively adjust the weighting of each feature map. For its part, a DenseNet is composed of dense blocks in which each unit is connected to every unit before it. DenseNet has the counter-intuitive property of requiring fewer parameters than a traditional CNN while lessening the vanishing-gradient problem. For the challenge, SabanciU-GTU fine-tuned three pre-trained SENet-ResNet-50 models and one DenseNet. The first two SENet-ResNet-50 models were trained only on the trusted dataset, while the third one and the DenseNet were fine-tuned on all the available training data. Saliency detection, flips, and several rotation angles were used as data augmentation. SabanciU-GTU Runs 1, 3, 4 and 5 are various weighted combinations of the outputs of the four fine-tuned models. The best result was obtained by Run 5, which weighted the outputs of the CNNs according to the "quality" and "organ" tags provided in the XML metadata files. Run 3 also used the organ tag, with manually fixed weights giving more weight to pictures showing "sexual" organs (flower, fruit) or an entire view of the plant. Run 2 applied an Error-Correcting Output Codes (ECOC) approach, expressing the 10K-class problem through an n-bit (n = 200 here) error-correcting output code. Each bit is related to a binary classifier splitting the 10K species arbitrarily and randomly into two sets. Each binary classifier was a 2-hidden-layer shallow network (500 hidden nodes at each layer) taking as input the features from the last layer of the first trained SENet-ResNet-50 model. Unfortunately, this approach performed the worst during the challenge.
      </p>
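The ECOC idea used in Run 2, encoding each class as a bit string and decoding a test item to the class whose codeword is nearest in Hamming distance, can be sketched as follows. This is a scaled-down illustration (16 classes, 12-bit codes instead of 10K classes and 200 bits), and for determinism the codewords here are simply the binary representations of the class indices, whereas the participants drew random species splits; all names (`codebook`, `ecoc_decode`) are illustrative.

```python
import numpy as np

# Scaled-down codebook: one 12-bit codeword per class.
n_classes, n_bits = 16, 12
codebook = np.array(
    [[int(b) for b in format(i, f"0{n_bits}b")] for i in range(n_classes)]
)

def ecoc_decode(bit_predictions, codebook):
    """Return the class whose codeword has the smallest Hamming
    distance to the vector of per-bit binary-classifier outputs."""
    dists = np.abs(codebook - np.asarray(bit_predictions)).sum(axis=1)
    return int(np.argmin(dists))
```

With well-separated random codewords, a few binary classifiers can be wrong and the nearest-codeword decoding still recovers the right class, which is what makes the scheme error-correcting.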
      <p>
        TUC MI, Technische Universität Chemnitz, Germany, 5 runs, [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]: this team based their system on three architectures (ResNet-50, Inception-v3 and DenseNet-201) fine-tuned on the noisy or trusted dataset with various data augmentations (horizontal and vertical flips, zooming, rotation, shearing and shifting). The DenseNet-201 models were fine-tuned with adjusted class weights over multiple iterations in an attempt to balance the classes. The best results were obtained by Run 1 and Run 5, which are ensemble classifiers. Run 1 is based on one ResNet-50, one Inception-v3 and three DenseNet-201 models, all fine-tuned on the noisy training dataset and weighted according to their validation accuracy. Run 5 performed slightly better on the whole test set by using only 3 fine-tuned models (2 ResNet-50 and 1 DenseNet-201) instead of the 5 in Run 1, and without a specific weighting rule.
      </p>
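Weighting ensemble members by their validation accuracy, as in TUC MI Run 1, can be sketched as follows. This is a minimal illustration of the general technique, not the team's code; the function name `weighted_ensemble` and the toy probability matrices are assumptions.

```python
import numpy as np

def weighted_ensemble(prob_list, val_accuracies):
    """Combine per-model class-probability matrices, each weighted
    in proportion to the model's validation accuracy."""
    w = np.asarray(val_accuracies, dtype=float)
    w = w / w.sum()  # normalize weights to sum to 1
    combined = sum(wi * p for wi, p in zip(w, prob_list))
    # Renormalize each row so the result is again a distribution.
    return combined / combined.sum(axis=1, keepdims=True)

# Two toy models, 2 test items x 2 classes.
p1 = np.array([[0.9, 0.1], [0.4, 0.6]])
p2 = np.array([[0.7, 0.3], [0.2, 0.8]])
ensemble = weighted_ensemble([p1, p2], val_accuracies=[0.8, 0.6])
```

A uniform-weight variant of the same function (equal `val_accuracies`) corresponds to the plain averaging used in Run 5.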
    </sec>
    <sec id="sec-5">
      <title>Results and "Experts vs. Machines" evaluation</title>
      <p>Considering the automated approaches alone, without comparison with the experts, we can quickly confirm the same conclusions as in the last PlantCLEF 2017 challenge: (i) the measured performances are very high despite the difficulty of the task; (ii) the best results were obtained mostly by systems that were learned on both the trusted and the noisy datasets; (iii) all teams used and fine-tuned popular Convolutional Neural Networks, definitively confirming the supremacy of this kind of approach over previous methods; (iv) the best results were obtained by ensemble classifiers of ConvNets with many data augmentations.</p>
      <p>A difficult task, even for experts: as a first noticeable outcome, none of the botanists correctly identified all observations. The top-1 accuracy of the experts is in the range 0.613-0.96, with a median value of 0.8. This illustrates the difficulty of the task, especially considering that the experts were authorized to use any external resource to complete the task, Flora books in particular. It shows that a large part of the observations in the test set do not contain enough information to be identified with confidence when using classical identification keys. Only the four experts with exceptional field expertise were able to correctly identify more than 80% of the observations.</p>
      <p>
        Deep learning algorithms were defeated by the best experts, but the margin of progression is becoming tighter and tighter. The top-1 accuracy of the evaluated systems is in the range 0.32-0.84, with a median value of 0.64. This is globally lower than the experts, but it is noticeable that the best systems were able to perform better than 5 of the highly skilled participating experts. Moreover, we can compare these results with a previous Man vs. Machine evaluation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Table 1 lists the evaluated runs: CMP Run 4, CMP Run 3, MfN Run 2, MfN Run 4, CMP Run 2, MfN Run 3, CMP Run 5, CMP Run 1, MfN Run 1, TUC MI Run 5, TUC MI Run 1, TUC MI Run 2, SabanciU-GTU Run 5, SabanciU-GTU Run 3, TUC MI Run 3, SabanciU-GTU Run 1, SabanciU-GTU Run 4, TUC MI Run 4, SabanciU-GTU Run 2.</p>
      <p>Some participants succeeded in improving their system in one short year on the same dataset: the best top-1 accuracy was 0.733 in the previous experiment, against 0.84 during this ExpertCLEF 2018 challenge. We can assume that there is still room for improvement and that the machines will probably be able to compete with the 3 best human experts next year, when the challenge is re-opened on the crowdAI platform.</p>
      <p>Identification failures (machines): looking at the results in detail, we can notice that some of the best automated systems can perform as well as the experts for about 86% of the observations. This is the case for the best evaluated system, CMP Run 4, for which 65 of the 75 test observations had the right species ranked at a rank lower than or equal to that of the best expert. Among the 10 remaining observations, 5 were correctly identified in the top-2 predictions, 2 in the top-3, and only 3 observations (2792091, 2791146 and 2791317) were completely missed (see Table 2). The causes of the identification failures differ from one observation to another. For one observation (2792091) it is probably due to a mismatch between the training data and the test sample: the training samples of the correct species usually contain visible open yellow flowers, whereas only beige buds are visible in the test sample. For the second missed observation (2791146), it is more likely that the failure is due to the intrinsic difficulty of the associated genus Lathyrus, within which many species are visually very similar (most of the proposals in the machine runs are nevertheless within the Lathyrus genus). The same holds for the last missed observation (2791317), related to the genus Galium, with the additional difficulty that the observation contains only one entire view.</p>
      <p>Identification failures (experts): on the other hand, it is important to notice that in some cases automated systems can perform better than the experts. If we again compare the best automated system, CMP Run 4, with the best expert, we can notice that three observations were better identified by the automated approach (see Table 3). For one observation (2792706) the best system gave the correct species at rank 1 while it was at rank 2 for the best expert. For the two observations 2790900 and 2791110, the best automated system gave the correct species at rank 3 while there was no species proposition at all from the best human expert. These two observations are actually cultivated plants, probably varieties visually different from the "original" species, and relatively far from the core expertise of the human experts.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>This paper presented the overview and the results of the LifeCLEF 2018 expert identification challenge, following the seven previous LifeCLEF plant identification challenges conducted within the CLEF evaluation forum. The task was performed again on the biggest plant image dataset ever published in the literature, but focused on an expert vs. machine evaluation. The main goal was to answer the question of whether automated plant identification systems still have a margin of progression or whether they already perform as well as experts in identifying plants in images. We showed that identifying plants from images alone is a difficult task, even for some of the highly skilled specialists who accepted to participate in the experiment. This confirms that pictures of plants contain only partial information, which is often not sufficient to determine the right species with certainty. Regarding the performance of the automated approaches, we showed that there is still a margin of progression but that it is becoming tighter and tighter. The best system was able to correctly classify 84% of the test samples, including some belonging to very difficult taxonomic groups. This performance is still below that of the best expert, who correctly identified 96.7% of the test samples. However, a strength of the automated systems is that they can quickly return an exhaustive list of all the possible species, whereas this is a very difficult task for humans. We believe that this already makes them highly powerful tools for modern botany. Furthermore, the performance of automated systems will continue to improve in the coming years thanks to the quick progress of deep learning technologies. They have the potential to become essential tools for teachers and students, but they should not replace an in-depth understanding of botany.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Atito, S., Yanikoglu, B., Aptoula, E., Ganiyusufoglu, I., Yildiz, A., Yildirir, K., Baris, S.: Plant identification with deep learning ensembles. In: Working Notes of CLEF 2018 (Cross Language Evaluation Forum) (2018)</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Bonnet, P., Goeau, H., Hang, S.T., Lasseck, M., Sulc, M., Malecot, V., Jauzein, P., Melet, J.C., You, C., Joly, A.: Plant Identification: Experts vs. Machines in the Era of Deep Learning, pp. 131-149. Springer International Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-76445-0_8</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Goeau, H., Bonnet, P., Joly, A.: Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017). In: CLEF 2017 - Conference and Labs of the Evaluation Forum, pp. 1-13 (2017)</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Haupt, J., Kahl, S., Kowerko, D., Eibl, M.: Large-scale plant classification using deep convolutional neural networks. In: Working Notes of CLEF 2018 (Cross Language Evaluation Forum) (2018)</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Goeau, H.,
          <string-name>
            <surname>Botella</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glotin</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vellinga</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Overview of lifeclef 2018: a large-scale evaluation of species identi cation and recommendation algorithms in the era of ai</article-title>
. In:
<string-name>
  <surname>Jones</surname>
  ,
  <given-names>G.J.</given-names>
</string-name>
,
<string-name>
  <surname>Lawless</surname>
  ,
  <given-names>S.</given-names>
</string-name>
,
<string-name>
  <surname>Gonzalo</surname>
  ,
  <given-names>J.</given-names>
</string-name>
,
<string-name>
  <surname>Kelly</surname>
  ,
  <given-names>L.</given-names>
</string-name>
,
<string-name>
  <surname>Goeuriot</surname>
  ,
  <given-names>L.</given-names>
</string-name>
,
<string-name>
  <surname>Mandl</surname>
  ,
  <given-names>T.</given-names>
</string-name>
,
<string-name>
  <surname>Cappellato</surname>
  ,
  <given-names>L.</given-names>
</string-name>
,
<string-name>
  <surname>Ferro</surname>
  ,
  <given-names>N.</given-names>
</string-name>
(eds.) CLEF:
<article-title>Cross-Language Evaluation Forum for European Languages</article-title>
          .
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction</source>
          , vol.
          <source>LNCS</source>
          . Springer, Avignon, France (
          <year>Sep 2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Goeau, H.,
          <string-name>
            <surname>Glotin</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spampinato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vellinga</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lombardo</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Planque</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palazzo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Lifeclef 2017 lab overview: multimedia species identi cation challenges</article-title>
          .
<source>In: International Conference of the Cross-Language Evaluation Forum for European Languages</source>
. pp.
<fpage>255</fpage>
–
<lpage>274</lpage>
. Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lasseck</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
<article-title>Image-based plant species identification with deep convolutional neural networks</article-title>
          .
<source>In: Working Notes of CLEF 2017 conference</source>
(
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lasseck</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
<article-title>Machines vs. experts: working note on the ExpertLifeCLEF 2018 plant identification task</article-title>
          .
<source>In: Working Notes of CLEF 2018 (Cross Language Evaluation Forum)</source>
(
<year>2018</year>
)
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Polyak</surname>
            ,
            <given-names>B.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juditsky</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          :
          <article-title>Acceleration of stochastic approximation by averaging</article-title>
          .
          <source>SIAM Journal on Control and Optimization</source>
          <volume>30</volume>
          (
          <issue>4</issue>
          ),
<fpage>838</fpage>
–
<lpage>855</lpage>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Sulc</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Picek</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
<string-name>
  <surname>Matas</surname>
  ,
  <given-names>J.</given-names>
</string-name>
:
<article-title>Plant recognition by Inception networks with test-time class prior estimation</article-title>
          .
<source>In: Working Notes of CLEF 2018 (Cross Language Evaluation Forum)</source>
(
<year>2018</year>
)
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>