<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Plant Identification with Deep Learning Ensembles in ExpertLifeCLEF 2018</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sara Atito</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Berrin Yanikoglu</string-name>
          <email>berring@sabanciuniv.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Erchan Aptoula</string-name>
          <email>eaptoula@gtu.edu.tr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>İpek Ganiyusufoglu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aras Yıldız</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kerem Yıldırır</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Barış Sevilmis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M. Umut Sen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Engineering and Natural Sciences, Sabanci University</institution>
          ,
          <addr-line>Istanbul</addr-line>
          ,
          <country country="TR">Turkey</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Information Technologies, Gebze Technical University</institution>
          ,
          <addr-line>Kocaeli</addr-line>
          ,
          <country country="TR">Turkey</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This work describes the plant identification system that we submitted to the ExpertLifeCLEF plant identification campaign in 2018. We fine-tuned two pre-trained deep learning architectures (SENet and DenseNet) using images shared by the CLEF organizers in 2017. Our main runs are four ensembles obtained with different weighted combinations of the four deep learning systems. The fifth ensemble is based on deep learning features but uses Error Correcting Output Codes (ECOC) as the ensemble method. Our best system achieved a classification accuracy of 74.4% on the whole official test data, while the best system overall obtained 86.7% accuracy. Our system ranked 4th among all the teams, but matched the accuracy of one of the human experts.</p>
      </abstract>
      <kwd-group>
        <kwd>plant identification</kwd>
        <kwd>deep learning</kwd>
        <kwd>convolutional neural networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Automatic plant identification is the problem of identifying the plant
species in a given photograph. The plant identification challenge of the Conference
and Labs of the Evaluation Forum (CLEF) [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1,2,3,4,5,6,7,8</xref>
        ] is the most
well-known annual event that benchmarks progress in the identification of plant
species. The campaign has been running since 2011, with the number of plant
species reaching 10,000 classes in the 2017 evaluation.
      </p>
      <p>While the core of the campaign is to benchmark plant identification progress,
its emphasis changes slightly from year to year. This year's emphasis was on
comparing automatic systems' performances with those of human experts. For that
reason, a subset of the test data was labelled by human experts, and the systems
were evaluated on their accuracy on the whole test set, as well as on their
performance on this subset. The details of the plant identification and the
overall LifeCLEF campaigns are described in [8] and [9], respectively.</p>
      <p>We have been participating in this campaign since 2011, first with
traditional approaches and carefully selected features [10,11,12] and then with deep
learning approaches [13]. While the traditional approaches worked well on the
simpler problem of leaf-based identification (leaf images on simple backgrounds),
deep learning approaches brought a significant increase in accuracy despite the
much increased problem complexity (unrestricted photographs and 10,000 classes).</p>
      <p>This year our team participated in the ExpertLifeCLEF 2018 challenge under
the name SabanciU-GTU. In our four main runs (Runs 1, 3, 4, 5), we used
an ensemble of four convolutional networks combined with different combination
weights. The networks were pre-trained deep convolutional neural networks of
the SENet [14] and DenseNet [15] architectures that were fine-tuned with plant
images. In the fifth system, we took the deep learning features (last convolutional
layer activations) of our SENet system and trained 200 different binary classifiers
to form an Error Correcting Output Codes (ECOC) ensemble.</p>
      <p>The training data was obtained from CLEF, as a combination of data
collected from the Encyclopedia of Life (EOL) and images collected from the web
and shared by CLEF in 2017. The latter set is noisy, as it was not verified by
experts for correctness. The submitted systems were different combination schemes
applied to the four models.</p>
      <p>The rest of this paper is organized as follows. Section 2 describes the proposed
methods based on the fine-tuning of the SENet and DenseNet models for plant
identification, data augmentation, and classifier fusion. Section 3 describes the
ECOC ensemble. Section 4 is dedicated to the description of the utilized dataset
and the presentation of the designed experiments and their results. The paper
concludes in Section 5 with a summary and discussion of the utilized methods
and obtained results.</p>
    </sec>
    <sec id="sec-2">
      <title>Core System</title>
      <p>Our approach was based on fine-tuning and fusing two successful deep learning
models, namely SENet [14] and DenseNet [15], both pre-trained on the ImageNet
Large-Scale Visual Recognition Challenge (ILSVRC) 2012 dataset with 1.2 million
labeled images of 1,000 object classes.</p>
      <p>SENet [14], winner of the ImageNet 2017 classification task [16], introduces
a building block for convolutional neural networks that models channel
interdependencies. The main idea is to weight each channel adaptively based on its
importance. The SE-block is flexible, which means that it can be integrated into any
modern deep learning architecture. In this work, we utilized SE-blocks with the
ResNet-50 [17] architecture.</p>
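      <p>The squeeze-excite-rescale idea above can be illustrated with a minimal
NumPy sketch (our actual models were Caffe networks; the weights and sizes here
are illustrative, not the ones used in the submitted systems):</p>
      <preformat>
```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-Excitation on a feature map x of shape (C, H, W).

    Squeeze: global average pooling collapses each channel to a scalar.
    Excitation: a two-layer bottleneck MLP (ReLU, then sigmoid) produces a
    weight in (0, 1) per channel, which rescales the original channels.
    """
    z = x.mean(axis=(1, 2))                    # squeeze: (C,)
    h = np.maximum(w1 @ z + b1, 0.0)           # reduction FC + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # expansion FC + sigmoid: (C,)
    return x * s[:, None, None]                # channel-wise rescaling

# Toy example: 8 channels, reduction ratio 4 (bottleneck of 2 units).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
w1, b1 = rng.standard_normal((2, 8)), np.zeros(2)
w2, b2 = rng.standard_normal((8, 2)), np.zeros(8)
y = se_block(x, w1, b1, w2, b2)
```
      </preformat>
      <p>Since each channel is multiplied by a weight in (0, 1), the output never
exceeds the input in magnitude, which is what makes the block a soft channel
attention mechanism.</p>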
      <p>DenseNet [15] is built from dense blocks and pooling operations, where
each layer within a dense block is connected to every preceding layer. Thus, with
n layers, there are n(n + 1)/2 direct connections. The input of each layer is
the concatenation of all previous feature maps. One of the advantages of
DenseNet is that it lessens the vanishing-gradient problem, which makes it
easier to train.</p>
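      <p>The dense connectivity pattern can be sketched as follows (the layers here
are stand-in random channel-mixing maps, not the actual DenseNet layers, which
also include batch normalization and convolutions):</p>
      <preformat>
```python
import numpy as np

def dense_block(x, layers):
    """Dense connectivity: each layer receives the concatenation (along the
    channel axis) of the block input and all preceding layers' outputs."""
    features = [x]
    for layer in layers:
        inp = np.concatenate(features, axis=0)  # (C_total, H, W)
        features.append(layer(inp))
    return np.concatenate(features, axis=0)

# Toy layers: each maps its input to a fixed "growth rate" of 4 channels
# via a random channel-mixing matrix, for illustration only.
rng = np.random.default_rng(1)
growth, in_ch, n_layers = 4, 8, 3

def make_layer(in_channels):
    w = rng.standard_normal((growth, in_channels))
    return lambda z: np.einsum('oc,chw->ohw', w, z)

# Layer i sees the block input plus i earlier outputs of `growth` channels.
layers = [make_layer(in_ch + i * growth) for i in range(n_layers)]
out = dense_block(rng.standard_normal((in_ch, 5, 5)), layers)
```
      </preformat>
      <p>With 3 layers and a growth rate of 4, the block output has 8 + 3 × 4 = 20
channels, reflecting the iterative concatenation of feature maps.</p>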
      <p>Score-level averaging is applied to combine the prediction scores assigned
to each class: first across all the augmented patches within a single network, and
then across the scores obtained for different images of the same unique plant
(called an "observation" in the campaign terminology).</p>
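      <p>The two-stage averaging reduces to a few lines; this sketch assumes
softmax scores per patch (the function names and shapes are illustrative):</p>
      <preformat>
```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def observation_scores(patch_logits_per_image):
    """Two-stage score-level averaging: average class scores over the
    augmented patches of each image, then over the images of one observation.

    patch_logits_per_image: list of arrays, one per image, each of shape
    (n_patches, n_classes)."""
    image_scores = [softmax(p).mean(axis=0) for p in patch_logits_per_image]
    return np.mean(image_scores, axis=0)   # (n_classes,)

# Toy observation: two images with 5 and 3 augmented patches, 10 classes.
rng = np.random.default_rng(2)
obs = [rng.standard_normal((5, 10)), rng.standard_normal((3, 10))]
scores = observation_scores(obs)
pred = int(scores.argmax())
```
      </preformat>
      <p>Because each stage averages proper probability distributions, the final
score vector still sums to one and can be ranked directly for top-1 prediction.</p>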
      <p>All training and tests were run on a Linux system with a Titan X Pascal GPU
with 12GB of video memory.</p>
    </sec>
    <sec id="sec-3">
      <title>Error-Correcting Output Codes</title>
      <p>As a second ensemble approach, we tried the Error Correcting Output Codes
(ECOC) approach [18]. In ECOC, a number of binary classifiers are trained
such that each one is assigned a separate dichotomy of the classes, which is
defined by a given ECOC matrix. In the ECOC matrix M, the jth column
indicates the dichotomy assigned to base classifier hj. That is, a particular element
Mij ∈ {+1, −1} indicates the desired label for class ci to be used in training the
base classifier hj. The ith row of M, denoted as Mi, is the codeword for class ci,
indicating the desired output for that class.</p>
      <p>A given test instance x is first classified by each base classifier, obtaining the
output vector y = [y1, ..., yL], where yj is the output of the classifier hj for the
given input x. Then, the distance between y and the codeword Mi of class ci is
computed using a distance metric such as the Hamming distance. The class
ck for which this distance is minimum is chosen as the estimated class label:
k = argmin_{i=1,...,K} d(y, Mi).</p>
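      <p>The decoding step above is straightforward to express in code; this small
example uses a made-up 4-class, 5-classifier matrix purely for illustration:</p>
      <preformat>
```python
import numpy as np

def ecoc_decode(y, M):
    """ECOC decoding: pick the class whose codeword (row of M) has the
    smallest Hamming distance to the vector of base-classifier outputs y.

    M: (K, L) matrix with entries in {+1, -1}; y: length-L output vector."""
    distances = (M != y).sum(axis=1)   # Hamming distance to each codeword
    return int(distances.argmin())

# Toy example: 4 classes, 5 base classifiers.
M = np.array([[+1, +1, +1, -1, -1],
              [+1, -1, -1, +1, -1],
              [-1, +1, -1, -1, +1],
              [-1, -1, +1, +1, +1]])
y = np.array([+1, -1, -1, +1, +1])  # last classifier disagrees with row 1
```
      </preformat>
      <p>Here y differs from the codeword of class 1 in a single bit, so class 1 is
recovered despite one base classifier erring; this redundancy is what gives ECOC
its error-correcting ability.</p>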
      <p>We took the deep learning features (last convolutional layer activations) of
our SENet system (System 2) and trained 200 different binary classifiers according
to the predetermined ECOC matrix.</p>
    </sec>
    <sec id="sec-4">
      <title>Experiments and Results</title>
      <p>The first three systems are trained using the SENet-ResNet-50 architecture. For
training the first system, we only used the EOL data consisting of 256,203
images of different plant organs, belonging to 10,000 species. Internal
augmentation was applied during training (at each iteration, a random crop of the image
is used and randomly mirrored horizontally). For validation, we used the plant
test dataset of LifeCLEF 2017 consisting of 25,170 images.</p>
      <p>For the second system, several data augmentation techniques were applied to the
training images, such as saliency detection [19], flipping, and several rotation
angles. In total, the number of images in the training dataset after augmentation
is around 4,500,000, and the system was trained over 10 epochs. For the third
system, we trained using all of the available data with augmentation (EOL data,
web-collected noisy data, and the test set of LifeCLEF 2017), excluding 1,000
images from the 2017 test set for validation. This system was trained over 25
epochs. The fourth system was trained using DenseNet on the same training data as
System 3. Training DenseNet was quite slow; therefore, we trained System 4 over
only 5 epochs.</p>
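      <p>A minimal sketch of the flip-and-rotate part of the offline augmentation
(our actual pipeline also used arbitrary rotation angles and saliency-based
crops; right-angle np.rot90 turns are used here only to keep the example
self-contained):</p>
      <preformat>
```python
import numpy as np

def augment(img, angles=(90, 180, 270)):
    """Return the original image, its horizontal flip, and several
    rotations of each, as one augmented set."""
    variants = [img, np.fliplr(img)]
    for base in list(variants):          # iterate over the two base images
        for a in angles:
            variants.append(np.rot90(base, k=a // 90))
    return variants

img = np.arange(12).reshape(3, 4)
out = augment(img)   # 2 bases * (1 original + 3 rotations) = 8 variants
```
      </preformat>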
      <p>We implemented the SENet and DenseNet models using the Caffe deep
learning framework [20]. All the weights were fine-tuned, while the last layer was
learned from scratch. We used the same learning rate of 0.01 for all of the
systems.</p>
      <p>Runs 1, 3, 4, 5. Different weighted combinations of the same basic four deep
learning systems described in Section 2. In System 5, which was the best performing
system by a 0.001 margin, we used the image quality information that is given
inside the metadata in the XML files. The score of each image is weighted using
the quality information. In the absence of quality information, no weighting is
applied.</p>
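      <p>The quality weighting can be sketched as follows; the function name, the
neutral default of 1 for unrated images, and the example ratings are our
illustrative assumptions, not the exact scheme from the metadata:</p>
      <preformat>
```python
import numpy as np

def quality_weighted_scores(image_scores, qualities):
    """Weight each image's class-score vector by its metadata quality
    rating before averaging over the observation. Images with no rating
    (None) get a neutral weight of 1, i.e. no weighting applied.

    image_scores: (N, C) array; qualities: list of floats or None."""
    w = np.array([1.0 if q is None else float(q) for q in qualities])
    w = w / w.sum()                      # normalize to a convex combination
    return w @ image_scores

# Two images of one observation; the first is rated higher quality.
scores = np.array([[0.7, 0.3],
                   [0.2, 0.8]])
combined = quality_weighted_scores(scores, [3.0, 1.0])
```
      </preformat>
      <p>The higher-quality image dominates the combination, so the observation is
assigned to its preferred class even though the two images disagree.</p>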
      <p>Run 2. The ECOC ensemble, where 200 base classifiers were trained on binary
classification tasks set forth according to a predetermined, random ECOC
matrix. The ECOC matrix was initialized randomly, and then simulated annealing
was used to increase the Hamming distance between rows. As features, we used
the deep learning features obtained from the last convolutional layer of the first
system described above, and trained 2-hidden-layer shallow networks (500 hidden
nodes at each layer) as base classifiers.</p>
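      <p>The matrix-design step can be sketched as a simple simulated annealing
loop over single bit flips; the temperature schedule, step count, and matrix size
below are illustrative assumptions, not the settings of the submitted system:</p>
      <preformat>
```python
import numpy as np

def min_row_distance(M):
    """Smallest pairwise Hamming distance between codewords (rows)."""
    K = M.shape[0]
    return min((M[i] != M[j]).sum() for i in range(K) for j in range(i + 1, K))

def anneal_ecoc(K, L, steps=2000, t0=2.0, seed=0):
    """Start from a random {+1, -1} matrix and use simulated annealing
    (single bit flips) to increase the minimum Hamming distance between rows."""
    rng = np.random.default_rng(seed)
    M = rng.choice([-1, 1], size=(K, L))
    cur = min_row_distance(M)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9   # linear cooling schedule
        i, j = rng.integers(K), rng.integers(L)
        M[i, j] *= -1                        # propose a bit flip
        d = min_row_distance(M)
        if d >= cur or rng.random() < np.exp((d - cur) / t):
            cur = d                          # accept the move
        else:
            M[i, j] *= -1                    # reject: undo the flip
    return M, cur

M, dmin = anneal_ecoc(K=8, L=16)
```
      </preformat>
      <p>Accepting occasional distance-decreasing flips at high temperature lets the
search escape local optima before the cooling schedule locks in a well-separated
codebook.</p>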
      <p>While the accuracy of this system fell short of the performance of the deep
learning architectures, the system shows promise in that the accuracy increases
as we increase the number of base classifiers: from 51% with 100 base
classifiers, to 59% and 61% with 200 and 300 base classifiers, on the LifeCLEF 2017
test data. The training times are also less than one tenth of that of one deep
architecture (around 2-3 hours per 100 base classifiers on an iMac).</p>
      <p>As a promising and fast alternative, we plan to work on improvements of the
ECOC ensemble, as proposed in [21] and [22].</p>
      <p>Test Results. We submitted the classification results of the aforementioned
systems on the official test set of ExpertLifeCLEF 2018. The official
metric for evaluation was the average accuracy on a small subset of the test
data that was also identified by human experts. Results on the whole test set
were also provided. The results released by the challenge organizers are shown
in Figure 1 and given in [9].</p>
      <p>Our best system achieved a top-1 classification accuracy of 74.4%, while
the best system overall obtained 86.7% accuracy on the whole official test data.
Our system ranked 4th among all the teams, but matched the accuracy of one
of the human experts.</p>
      <p>Our result on the small subset that was also labelled by human experts is
61.3%, while the nine human experts' scores range from 96% down to 61.3% on this
subset. In other words, our best system reached the top-1 identification
accuracy of one of the human experts.</p>
      <p>The competition, which has been running for several years now, has seen a shift
from hand-crafted features to deep learning classifiers in recent years. Our
goal this year was to use the best performing pre-trained architectures while
diversifying the base classifiers within the ensemble. Considering the fact that
we only had one machine with a GPU, we consider the performance of our system
(74.4% accuracy) satisfactory on such a complex problem (10,000 classes). In
the future, we plan to work on better ensemble techniques with deep
architectures, including improvements of the ECOC ensemble.</p>
      <p>Acknowledgments. We gratefully acknowledge NVIDIA Corporation for the
donation of the Titan X Pascal GPU used in this research.</p>
      <p>5. Goeau, H., Bonnet, P., Joly, A.: LifeCLEF plant identification task 2015. In: CLEF
(Working Notes). (2015)
6. Goeau, H., Bonnet, P., Joly, A.: Plant identification in an open-world (LifeCLEF
2016). In: CLEF Working Notes 2016. (2016)
7. Goeau, H., Bonnet, P., Joly, A.: Plant identification based on noisy web data: the
amazing performance of deep learning (LifeCLEF 2017). CEUR Workshop Proceedings
(2017)
8. Goeau, H., Bonnet, P., Joly, A.: Overview of ExpertLifeCLEF 2018: how far automated
identification systems are from the best experts? In: CLEF Working Notes 2018
9. Joly, A., Goeau, H., Glotin, H., Spampinato, C., Bonnet, P., Vellinga, W.P.,
Lombardo, J.C., Planque, R., Palazzo, S., Muller, H.: Overview of LifeCLEF 2018: a
large-scale evaluation of species identification and recommendation algorithms in
the era of AI. In: Proceedings of CLEF 2018
10. Yanikoglu, B., Aptoula, E., Tirkaz, C.: Sabanci-Okan system at ImageCLEF 2011:
Plant identification task. In: CLEF (Working Notes). (2011)
11. Yanikoglu, B., Aptoula, E., Tirkaz, C.: Sabanci-Okan system at ImageCLEF 2012:
Combining features and classifiers for plant identification. In: CLEF (Working
Notes). (2012)
12. Mehdipour-Ghazi, M., Yanikoglu, B., Aptoula, E.: Plant identification using deep
neural networks via optimization of transfer learning parameters. Neurocomputing
235 (2017) 228-235
13. Yanikoglu, B., Aptoula, E., Tirkaz, C.: Open-set plant identification using an
ensemble of deep convolutional neural networks. In: Working Notes of CLEF 2016
- Conference and Labs of the Evaluation Forum, Evora, Portugal, 5-8 September,
2016. (2016) 518-524
14. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. arXiv preprint
arXiv:1709.01507 (2017)
15. Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected
convolutional networks. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition. (2017)
16. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z.,
Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large
Scale Visual Recognition Challenge. International Journal of Computer Vision
(IJCV) 115(3) (2015) 211-252
17. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
(2016) 770-778
18. Dietterich, T.G., Bakiri, G.: Solving multiclass learning problems via
error-correcting output codes. Journal of Artificial Intelligence Research (1995) 263-286
19. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: Advances in Neural
Information Processing Systems. (2007) 545-552
20. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R.,
Guadarrama, S., Darrell, T.: Caffe: Convolutional architecture for fast feature embedding.
In: Proceedings of the 22nd ACM. (2014) 675-678
21. Zor, C., Yanikoglu, B., Windeatt, T., Alpaydin, E.: FLIP-ECOC: a greedy
optimization of the ECOC matrix. In: Proceedings of the 25th International
Symposium on Computer and Information Sciences, Springer (2010) 149-154
22. Zor, C., Yanikoglu, B., Merdivan, E., Windeatt, T., Kittler, J., Alpaydin, E.:
BeamECOC: A local search for the optimization of the ECOC matrix. In: 23rd
International Conference on Pattern Recognition, ICPR 2016, Cancun, Mexico,
December 4-8, 2016. (2016) 198-203</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Birnbaum</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouysset</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Picard</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The CLEF 2011 plant images classification task</article-title>
          . In: CLEF (Notebook Papers/Labs/Workshop). (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yahiaoui</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>The ImageCLEF 2012 plant identification task</article-title>
          . In: CLEF (Online Working Notes/Labs/Workshop). (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Goeau, H.,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bakic</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>The ImageCLEF 2013 plant identification task</article-title>
          .
          <source>In: CLEF (Working Notes)</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Goeau, H.,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selmi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molino</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthelemy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boujemaa</surname>
          </string-name>
          , N.:
          <article-title>LifeCLEF plant identification task 2014</article-title>
          . In: CLEF (Working Notes). (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>