<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The CLEF 2011 plant images classification task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hervé Goëau</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierre Bonnet</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexis Joly</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nozha Boujemaa</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Barthélémy</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jean-François Molino</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Philippe Birnbaum</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elise Mouysset</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marie Picard</string-name>
        </contrib>
        <aff>INRIA, IMEDIA team, France (name.surname@inria.fr, http://www-rocq.inria.fr/imedia/)</aff>
        <aff>UMR AMAP, France (pierre.bonnet@cirad.fr, http://amap.cirad.fr/fr/index.php)</aff>
        <aff>CIRAD, Direction, UMR AMAP, France (daniel.barthelemy@cirad.fr, http://amap.cirad.fr/fr/index.php)</aff>
        <aff>UMR AMAP, France (jean-francois.molino@ird.fr, http://amap.cirad.fr/fr/index.php)</aff>
        <aff>CIRAD, UMR AMAP, France (philippe.birnbaum@cirad.fr, http://amap.cirad.fr/fr/index.php)</aff>
        <aff>Tela Botanica, France (name@tela-botanica.org, http://www.tela-botanica.org/)</aff>
      </contrib-group>
      <abstract>
        <p>ImageCLEF's plant identification task provides a testbed for the system-oriented evaluation of tree species identification based on leaf images. The aim is to investigate image retrieval approaches in the context of crowdsourced images of leaves collected in a collaborative manner. This paper presents an overview of the resources and assessments of the plant identification task at ImageCLEF 2011, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results.</p>
      </abstract>
      <kwd-group>
        <kwd>ImageCLEF</kwd>
        <kwd>plant</kwd>
        <kwd>leaves</kwd>
        <kwd>images</kwd>
        <kwd>collection</kwd>
        <kwd>identification</kwd>
        <kwd>classification</kwd>
        <kwd>evaluation</kwd>
        <kwd>benchmark</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Convergence of multidisciplinary research is increasingly considered the next big step towards answering profound challenges of humanity related to health, biodiversity or sustainable energy. The integration of life sciences and computer sciences has a major role to play in managing and analyzing cross-disciplinary scientific data at a global scale. More specifically, building accurate knowledge of the identity, geographic distribution and uses of plants is essential if agricultural development is to be successful and biodiversity is to be conserved. Unfortunately, such basic information is often only partially available to professional stakeholders, teachers, scientists and citizens, and often incomplete for the ecosystems that possess the highest plant diversity. A noticeable consequence, expressed as the taxonomic gap, is that identifying plant species is usually impossible for the general public, often a difficult task for professionals such as farmers or wood exploiters, and even for botanists themselves. The only way to overcome this problem is to speed up the collection and integration of raw observation data, while simultaneously providing potential users with easy and efficient access to this botanical knowledge. In this context, content-based visual identification of plant images is considered one of the most promising solutions to help bridge the taxonomic gap. Evaluating recent advances of the IR community on this challenging task is therefore an important issue.</p>
      <p>This paper presents the plant identification task that was organized within ImageCLEF 2011 (http://www.imageclef.org/2011) for the system-oriented evaluation of visual-based plant identification. This first-year pilot task was more precisely focused on tree species identification based on leaf images. Leaves are far from being the only discriminant visual key between tree species, but they have the advantage of being easily observable, and they are the most studied organ in the computer vision community. The task was organized as a classification task over 70 tree species, with visual content being the main available information. Additional information only included contextual metadata (author, date, locality name) and some EXIF data. Three types of image content were considered: leaf scans, leaf photographs with a white uniform background (referred to as scan-like pictures) and unconstrained leaf photographs acquired on trees with natural background. The main originality of this data is that it was specifically built through a citizen science initiative conducted by Tela Botanica (http://www.tela-botanica.org/), a French social network of amateur and expert botanists. This makes the task closer to the conditions of a real-world application: (i) leaves of the same species come from distinct trees living in distinct areas; (ii) pictures and scans are taken by different users who might not use the same protocol to collect the leaves and/or acquire the images; (iii) pictures and scans are taken at different periods of the year.</p>
    </sec>
    <sec id="sec-2">
      <title>Task resources</title>
      <p>
        Building effective computer vision and machine learning techniques is not the only side of the taxonomic gap problem. Speeding up the collection of raw observation data is clearly another crucial one. The most promising approach in that direction is to build real-world collaborative systems allowing any user to enrich the global visual botanical knowledge [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. To build the evaluation data of the ImageCLEF plant identification task, we therefore set up a citizen science project around the identification of common woody species covering the metropolitan French territory. This was done in collaboration with the Tela Botanica social network and with researchers specialized in computational botany.
      </p>
      <p>
        Technically, images and associated tags were collected through a crowd-sourcing
web application [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and were all validated by expert botanists. Several cycles of such collaborative data collection and taxonomic validation occurred. Scans of leaves were first collected over two seasons, between July and September 2009
      </p>
      <sec id="sec-2-1">
        <title>Data collection</title>
        <p>Collection continued between June and September 2010, thanks to the work of active contributors from the Tela Botanica social network. The idea of collecting only scans during this first period was to initialize the training data with limited noisy background and to focus on plant variability rather than on mixed plant and view-condition variability. This allowed the collection of 2228 scans over 55 species. A public version of the application (http://combraille.cirad.fr:8080/demo plantscan/) was then opened in October 2010, and additional data were collected up to March 2011. The newly collected images were either scans, photographs with uniform background (referred to as scan-like photos), or unconstrained photographs with natural background. They involved 15 new species in addition to the previous set of 55 species. The Pl@ntLeaves dataset used within ImageCLEF finally contained 5436 images: 3070 scans, 897 scan-like photos and 1469 photographs. Figure 2 displays samples of these 3 image types for 4 distinct tree species. The full list of species is provided in Figure 1.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Image metadata</title>
        <p>Each image comes with the following metadata fields:
- Date: upload date of the image
- Type: acquisition type (scan, scan-like or photograph)
- Content: content type (single leaf, single dead leaf, or foliage, i.e. several leaves on the tree visible in the picture)
- Taxon: full taxon name (sub-regnum, regnum, class, division, order, family, genus, species)
- VernacularNames: French or English vernacular names
- Author: name of the author of the picture
- Organization: name of the organization of the author
- Locality: locality name (a district, a country division or a region)
- GPSLocality: GPS coordinates of the observation
These metadata are stored in independent XML files, one for each image. Figure 3 displays an example image with its associated XML data.</p>
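        <p>To make the per-image metadata files concrete, the following minimal Python sketch collects the fields listed above from one parsed metadata document. The element names simply mirror the field list; the actual Pl@ntLeaves XML schema and file layout are assumptions made for illustration.

```python
# Minimal sketch: collect the Pl@ntLeaves-style metadata fields from one
# parsed per-image XML document. Element names are assumed to match the
# field list in the text; the real schema may differ.
import xml.etree.ElementTree as ET

FIELDS = ["Date", "Type", "Content", "Taxon", "VernacularNames",
          "Author", "Organization", "Locality", "GPSLocality"]

def read_leaf_metadata(root):
    """Return the known metadata fields of a parsed document as a dict."""
    return {name: root.findtext(name) for name in FIELDS}

# Build a small document in memory; in practice one would use
# ET.parse("some_image.xml").getroot() on a per-image metadata file.
doc = ET.Element("Image")
for tag, text in [("Date", "2010-07-15"), ("Type", "scan"),
                  ("Content", "single leaf"), ("Taxon", "Acer campestre")]:
    ET.SubElement(doc, tag).text = text

meta = read_leaf_metadata(doc)
# meta["Type"] is "scan"; fields absent from the document come back as None.
```

Fields missing from a given file (contributors only partially fill the metadata) simply come back as None, which a downstream system can treat as absent.</p>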
        <p>
          Additional but partial metadata can be found in the images' EXIF data, and might include the camera or scanner model, the image resolution and dimensions, the optical parameters, the white balance, the light measures, etc.
        </p>
        <p>
The main originality of Pl@ntLeaves compared to previous leaf datasets, such
as the Swedish dataset [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] or the Smithsonian one [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], is that it was built in a collaborative manner through a citizen science initiative. This makes it closer to the conditions of a real-world application: (i) leaves of the same species come from distinct trees living in distinct areas; (ii) pictures and scans are taken by different users who might not use the same protocol to collect the leaves and/or acquire the images; (iii) pictures and scans are taken at different periods of the year. Intra-species visual variability and view-condition variability are therefore more pronounced, which makes the identification more realistic but more complex. Figures 4 to 9 provide illustrations of the intra-species visual variability over several criteria, including leaf color, leaf global shape, leaf margin appearance, number and relative positions of leaflets, and number of lobes. On the other side, Figure 10 illustrates the light reflection and shadow variations of scan-like photos. It shows that this acquisition protocol is actually very different from pure scans. Both share the property of a limited noisy background, but scan-like photos are much more complex due to the variability of lighting conditions (flash, sunny weather, etc.) and the unflatness of leaves. Finally, the variability of unconstrained photographs acquired on the tree and with natural background is definitely a much more challenging issue, as illustrated in Figure 11.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Task description</title>
      <p>The task was evaluated as a supervised classification problem with tree species used as class labels.</p>
      <sec id="sec-3-1">
        <title>Training and Test data</title>
        <p>A part of the Pl@ntLeaves dataset was provided as training data, whereas the remaining part was used later as test data. The training subset was built by randomly selecting 2/3 of the individual plants of each species (and not by randomly splitting the images themselves), so that pictures of leaves belonging to the same individual tree cannot be split across training and test data. This prevents identifying the species of a given tree thanks to its own leaves, which makes the task more realistic: in a real-world application, it is indeed very unlikely that a user would try to identify a tree that is already present in the training data. Detailed statistics of the composition of the training and test data are provided in Table 1.</p>
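        <p>The plant-level split described above can be sketched as follows; the field names (species, plant_id) and the exact 2/3 rounding rule are illustrative assumptions, not the organizers' precise procedure.

```python
# Sketch of a train/test split at the level of individual plants: roughly 2/3
# of the plants of each species go to training, so no plant contributes images
# to both sides. Assumes plant identifiers are unique across the dataset.
import random

def split_by_plant(images, train_fraction=2/3, seed=0):
    """images: list of dicts with 'species' and 'plant_id' keys."""
    rng = random.Random(seed)
    plants_per_species = {}
    for im in images:
        plants_per_species.setdefault(im["species"], set()).add(im["plant_id"])
    train_plants = set()
    for species in sorted(plants_per_species):
        plants = sorted(plants_per_species[species])
        rng.shuffle(plants)
        keep = max(1, round(train_fraction * len(plants)))
        train_plants.update(plants[:keep])
    train = [im for im in images if im["plant_id"] in train_plants]
    test = [im for im in images if im["plant_id"] not in train_plants]
    return train, test
```

Because membership is decided per plant rather than per image, every picture of a given tree lands entirely on one side of the split, which is exactly what prevents a system from recognizing a test tree by its own training leaves.</p>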
      </sec>
      <sec id="sec-3-2">
        <title>Task objective and evaluation metric</title>
        <p>The goal of the task was to associate the correct tree species with each test image. Each participant was allowed to submit up to 3 runs built from different methods. As many species as desired could be associated with each test image, sorted by decreasing confidence score. Only the most confident species was however used in the primary evaluation metric described below, but providing an extended ranked list of species was encouraged in order to derive complementary statistics (e.g. recognition rate at other taxonomic levels, suggestion rate on top-k species, etc.).</p>
        <p>The primary metric used to evaluate the submitted runs was a normalized classification rate evaluated on the first species returned for each test image. Each test image is attributed a score of 1 if the first returned species is correct and 0 if it is wrong. An average normalized score is then computed on all test images. A simple mean over all test images would indeed introduce some bias with regard to a real-world identification system: we remind that the Pl@ntLeaves dataset was built in a collaborative manner, so that a few contributors might have provided many more pictures than the many other contributors who provided few. Since we want to evaluate the ability of a system to provide correct answers to all users, we rather measure the mean of the average classification rate per author. Furthermore, some authors sometimes provided many pictures of the same individual plant (to enrich the training data with less effort). Since we want to evaluate the ability of a system to provide the correct answer based on a single plant observation, we also decided to average the classification rate on each individual plant. Finally, our primary metric was defined as the following average classification score S:</p>
        <p>
          <disp-formula id="eq1"><tex-math><![CDATA[S = \frac{1}{U}\sum_{u=1}^{U}\frac{1}{P_u}\sum_{p=1}^{P_u}\frac{1}{N_{u,p}}\sum_{n=1}^{N_{u,p}} s_{u,p,n} \qquad (1)]]></tex-math></disp-formula>
        </p>
        <p>where U is the number of users (who have at least one image in the test data), P_u is the number of individual plants observed by the u-th user, N_{u,p} is the number of pictures taken from the p-th plant observed by the u-th user, and s_{u,p,n} is the classification score (1 or 0) for the n-th picture taken from the p-th plant observed by the u-th user.</p>
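        <p>The score S can be transcribed directly: average the per-image 0/1 scores per plant, then per user, then over users. The (user, plant, correct) field layout below is illustrative.

```python
# Direct transcription of the normalized classification score S: average the
# 0/1 per-image scores per plant, then per user, then over users.
def normalized_score(results):
    """results: list of (user, plant, correct) tuples, correct in {0, 1}."""
    per_plant = {}  # (user, plant) -> list of 0/1 image scores
    for user, plant, correct in results:
        per_plant.setdefault((user, plant), []).append(correct)
    per_user = {}   # user -> list of per-plant mean scores
    for (user, plant), scores in per_plant.items():
        per_user.setdefault(user, []).append(sum(scores) / len(scores))
    user_means = [sum(v) / len(v) for v in per_user.values()]
    return sum(user_means) / len(user_means)

# User A: one plant with three correctly identified images (mean 1.0).
# User B: one plant with a single misidentified image (mean 0.0).
# S = (1.0 + 0.0) / 2 = 0.5, whereas a plain image mean would give 3/4.
S = normalized_score([("A", "p1", 1), ("A", "p1", 1), ("A", "p1", 1),
                      ("B", "p2", 0)])
```

The toy example shows the intended effect of the normalization: a prolific contributor cannot dominate the score through sheer image count.</p>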
        <p>It is important to notice that, while making the task more realistic, the normalized classification score also makes it more difficult. Indeed, it works as if a bias was introduced between the statistics of the training data and those of the test data. It highlights the fact that bias-robust machine learning and computer vision methods should be preferred for training on such real-world collaborative data. Finally, to isolate and evaluate the impact of the image acquisition type (scan, scan-like, photograph), a normalized classification score S was computed for each type separately. Participants were therefore allowed to train distinct classifiers, use different training subsets or use distinct methods for each data type.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Participants and techniques</title>
      <p>A total of 8 groups submitted 20 runs, which is a successful participation rate for a first-year pilot task on a new topic. Participants were mainly academics specialized in computer vision and multimedia information retrieval, coming from all around the world: Australia (1), Brazil (1), France (2), Romania (1), Spain (1), Turkey (1) and the UK (1). We list below the 8 participants and give a brief overview of the techniques they used to run the plant identification task. We remind here that the ImageCLEF benchmark is a system-oriented evaluation and not a formal evaluation of the underlying methods. Readers interested in the scientific and technical details of any of these methods should refer to the CLEF 2011 working notes of each participant.</p>
      <p>
        IFSC (3 runs) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] The two best runs obtained by this participant (IFSC
USP run2 &amp; IFSC USP run1) are mainly based on a new shape boundary
analysis method they introduced recently [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It is based on the complex network
theory [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A shape is modeled as a small-world complex network, and degree and joint-degree measurements in a dynamically evolving network are used to compose a set of shape descriptors. This method is claimed to be robust, noise tolerant, scale invariant and rotation invariant, and proved to provide better performance than Fourier shape descriptors, curvature-based descriptors, Zernike moments and multiscale fractal dimensions.
      </p>
      <p>
        LIRIS (4 runs) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] This participant also used a classification scheme based on shape boundary analysis. The main originality, however, is that they used a model-driven approach for the segmentation and shape estimation. Their four runs differ in the parameters of the method.
      </p>
      <p>UAIC (3 runs) This participant was the only one trying to benefit from the metadata associated with the images (location, date, author, etc.). They therefore submitted 3 runs to evaluate the contribution of metadata compared to using visual content only. Their first run (UAIC2011 Run01) is based on visual content only, the second one (UAIC2011 Run02) uses only metadata-based features in the classification process, and the third one uses both (UAIC2011 Run03).</p>
      <p>
        SABANCI-OKAN (1 run) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] The system consists of two separate subparts for: i) scan and pseudo-scan images and ii) photos. The features used for the scan categories are computed using basic color and shape descriptors such as color moments and convexity, as well as more complex ones such as the Fourier descriptors of the contour and several morphological texture descriptors based on covariance extensions. Since some of these features are not meaningful for the photo category, a subset with color and texture features was used there. For training, Support Vector Machines and classifier combination were used, with only image content and none of the metadata. All of the scan and pseudo-scan images were used for part i), while for part ii), all of the images were used. The system was fully automatic.
      </p>
      <p>
        INRIA (2 runs) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] This participant submitted two runs based on two radically different methods. Their second run (inria imedia plantnet run2) is based on a shape boundary feature, called DFH [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], that they introduced in 2006. Their first run (inria imedia plantnet run1) is more surprising for a supervised classification task of leaves, since it is based on local feature matching with rigid geometrical models. Such a generalist method is usually more dedicated to large-scale retrieval of rigid objects, and this is the only participant who used such an approach.
      </p>
      <p>
        RMIT (2 runs) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] RMIT mainly focused on comparing two distinct machine learning algorithms: instance-based learning, implemented in Weka as IB1 (a nearest-neighbor classifier) (RMIT run1), and a decision tree technique, implemented in Weka as J48 (RMIT run2). For both, all training data were used, without complementary data. The features used were GIFT 166 colour histograms.
      </p>
      <p>
        DAEDALUS (1 run) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] This participant used a generalist image retrieval framework based on SIFT features and a nearest-neighbor classifier.
      </p>
      <p>KMIMMIS (4 runs) This participant also used a generalist image retrieval framework based on local features and a nearest-neighbor classifier. They compared different configurations: basic clustered SIFT with 1-NN label transfer (kmimmis run1 and kmimmis run4), simple edge and corner point detection with 1-NN vote label transfer (kmimmis run2), and simple edge and corner point detection with 10-NN vote label transfer (kmimmis run3).</p>
      <p>
        Over all runs, the most frequently used class of methods is shape boundary analysis: 8 runs among 20 are based on some boundary shape features. This is not surprising, since state-of-the-art methods addressing leaf-based identification in the literature are mostly based on leaf segmentation and shape boundary features [
        <xref ref-type="bibr" rid="ref12 ref15 ref3 ref4 ref5">4, 12, 5, 15, 3</xref>
        ]. On the other side, it is good news that the majority of the runs were based on a variety of other approaches, so that more relevant conclusions can be expected.
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <sec id="sec-5-1">
        <title>Global analysis</title>
        <p>Figures 12, 13 and 14 present the normalized classification scores of the 20 submitted runs for each of the three image types. Alternatively, Figure 15 presents the overall performances averaged over the 3 image types. Finally, Table 2 presents the same results with detailed numerical values.</p>
        <p>A first global remark is that, as expected, the performances degrade with the complexity of the image acquisition type: scans are easier to identify than scan-like photos, and unconstrained photos are much more difficult. This can easily be seen in Figure 15, where the relative scores of each image type are highlighted by distinct colors.</p>
        <p>A second global remark is that no method provides the best score for all image types. None of the runs even belongs to the top-3 runs of all image types (as shown in Table 2). This is somewhat disappointing from the genericity point of view, but not surprising given the nature of the different image types. One could expect scans and scan-like photos to lead to similar conclusions, but this is actually not the case. The only runs that give quite stable and good performances over the three image types are the two runs of IFSC based on the complex network shape boundary analysis method (IFSC USP run1 &amp; IFSC USP run2). This justifies their excellent ranking when averaging the classification scores over the three image types. All other methods fail to provide as good results for the unconstrained photographs, as shown in Figure 14. This score gap between the IFSC runs and the others has, however, to be mitigated by a bias introduced by the author normalization of the classification score: their high score is mostly due to excellent performances on the images of one of the 3 contributors. All these images are very similar and less cluttered than the average of the unconstrained photos (actually they are all close-ups of a Judas tree leaf). But still, it seems that this is the only method that has at least performed very well on these images.</p>
        <p>A third important remark is that shape boundary analysis methods do not provide the best results on the scan images, whereas they are usually considered state-of-the-art on such data. They all provide good classification scores between 48% and 56%, but they are consistently outperformed by two more generic image retrieval approaches (as shown in Figure 12). The best score is achieved by INRIA's run using large-scale matching of local features with rigid geometrical models (inria imedia plantnet run1, 68% classification rate). This suggests that modeling leaves as part-based rigid and textured objects might be an interesting alternative to shape boundary approaches, which do not characterize well margin details, ribs or limb texture. The second best score on scans is obtained by the run of SABANCI-OKAN (Sabanci-okan-run1), which uses a supervised classification approach based on a support vector machine (SVM) and a combination of 3 global visual features. This suggests that combining shape boundary features with other color and texture features is also a promising direction.</p>
        <p>
          Global conclusions on the scan-like photos are quite different (Figure 13). The best score is obtained by INRIA's run (inria imedia plantnet run2), purely based on a global shape boundary feature (DFH [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]). It is followed closely by the four runs of LIRIS, also based on boundary shape features but using a model-driven approach for the segmentation and shape estimation. Then come the two runs of INRIA and SABANCI-OKAN that ranked first on the scan images (the first based on rigid object matching and the second on training combined features), and finally the shape boundary method of IFSC. One conclusion is that shape boundary methods appear to provide more stable results across scans and scan-like photos. On the other side, the rigid object matching method of INRIA degrades much more from scans to scan-like pictures. This can be explained by the fact that it is more discriminant regarding the leaf's morphology but less robust to light reflections and shadows. These lighting variations might also explain the degrading performances of the combined features used by SABANCI-OKAN.
        </p>
        <p>Using metadata to help the identification, and more particularly geo-tags, is definitely something that has to be studied. We were therefore very enthusiastic to see the runs of UAIC, aimed at evaluating the potential contributions of Pl@ntLeaves metadata. Unfortunately, the results clearly show that adding metadata degrades their identification performances; metadata alone even give a null classification success rate. Beyond technical details that could probably slightly improve such metadata-based species filtering, this still suggests that Pl@ntLeaves metadata might be intrinsically not very useful for identification purposes. The main reason is probably that the geographic spread of the data is limited (the French Mediterranean area), so that most species of the dataset might be identically and uniformly distributed over the covered area. Geo-tags would certainly be more useful at a global scale (continents, countries). But at a local scale, the geographical distribution of plants is much more complex: it usually depends on localized environmental factors, such as sun exposure or water proximity, that would require much more data to be modeled.
        </p>
        <p>To evaluate which species are more difficult to identify than others, we averaged the performances over the runs of all participants (for each species). It is however difficult to interpret the score variations precisely: they can be due to morphological variations, but also to different view conditions or to other statistical biases in the data, such as the number of training images. Figure 16 presents the obtained graph for the scan images only (in order to limit view-condition bias). The only global trend we have discovered so far is that simple leaves are on average easier to identify than compound leaves.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>Performances per plant morphology</title>
        <p>
          In botany, species are described and categorized by morphological features,
frequently on leaves [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. These features are very numerous and concern, for instance, the leaf organization, the margin type, the shape, the venation, the presence of spines, etc. We attempt here to evaluate which kinds of morphological features make species identification more difficult, through three essential kinds of features: the leaf organization, the global shape (for simple leaves only) and the margin type. Table 17 gives a detailed description of these features for each species involved in the test image dataset.
        </p>
        <p>To evaluate which leaf organization makes species identification more difficult, we averaged the performances over the runs of all participants for each kind of organization: simple or compound. Figure 18 presents the obtained scores for each kind of image. This graph confirms that species with simple leaves are on average easier. However, compound leaves can be subdivided into sub-categories describing how the leaflets are organized: pinnately compound (with leaflets arranged along the main axis, called the rachis) or palmately compound (with leaflets attached at a single basal point). Figure 19 presents the obtained scores and illustrates that species with palmately compound leaves can be easier than species with simple leaves, at least for the two tested species "Aesculus hippocastanum" and "Vitex agnus-castus". Another interesting point is the difference in scores between scans and scan-like photos for pinnately compound leaves (there were no scan-like images of palmately compound leaves in the test image dataset). One explanation is that scan-like photos of compound leaves are often in relief, involving disturbing shadows and a distorted global shape.</p>
        <p>In the case of simple leaves, to evaluate which kind of shape makes species identification more difficult, we averaged the performances over the runs of all participants for each category of shape identified in the test image dataset: asymmetrical, elliptic, lanceolate, linear, lobate, obovate and orbicular. Figure 20 presents the obtained scores for each kind of shape and image. Results show that species with an orbicular shape are easier to identify. This is confirmed in the graph of Figure 16, where three of the six orbicular species in the test image dataset ("Acer campestre", "Cercis siliquastrum", "Corylus avellana") give good results (the best being "Corylus avellana").</p>
        <p>One last important morphological feature studied by botanists is the
margin type. In this case, to evaluate which type of margin makes the species
identification more difficult, we averaged the performances over the runs of all
participants for each category of margin identified in the test image dataset:
"untoothed", "dentate", "serrate" and "crenate". Figure 21 presents the
obtained scores for each kind of margin and image. Results show that the margin
type weakly affects species identification, except maybe for species with a crenate
margin.
To qualitatively assess which kinds of images lead to good results and which ones
make all methods fail, we sorted all test pictures by the number of runs in
which they were correctly identified. The obtained ranking confirms that scan
images are much easier for the identification: 99% of the top-100 ranked images are
actually scans (with 11 to 17 successful runs). Figure 22 displays the 4 best
identified images (with 17/20 successful runs). They are all very standard leaf
images, similar to those found in book illustrations. On the other side, 260
images were not identified by any run, with a majority of unconstrained photos
(63 scans, 27 scan-like photos, 168 unconstrained photographs). The scans and
scan-like photos belonging to this category of most difficult images are very
interesting. As illustrated in Figure 23, most of them correspond to outlier leaves
with defects or unusual morphology (diseased or torn leaves, missing leaflets, etc.).
Figure 23 displays 8 of them.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>This paper presented the overview and the results of the ImageCLEF 2011 plant
identification testbed. A challenging collaborative dataset of tree leaf images
was specifically built for this evaluation, and 8 participants coming from different
countries submitted a total of 20 runs. A first conclusion is that identification
performances are close to mature when using scans or photos with a uniform
background, but that unconstrained photos are still much more challenging. More
data and evaluation are clearly required to progress on such data. Another
important conclusion is that state-of-the-art methods based on shape boundary
analysis were not the best ones on leaf scans. Better performances were
notably obtained with a local features matching technique usually more dedicated
to the large-scale retrieval of rigid objects. On the other side, shape boundary
analysis methods remain better on scan-like photos due to their better
robustness to light reflections and shadow effects. This suggests that combining shape
boundary features with part-based rigid object models might be an interesting
direction. Adding texture and color information also showed some improvements.
On the contrary, using additional metadata such as geo-tags was not conclusive
on the evaluated dataset, probably because the geographic spread of the data
was limited.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>This work was funded by the Agropolis foundation through the project Pl@ntNet
(http://www.plantnet-project.org/) and the EU through the CHORUS+
Coordination action (http://avmediasearch.eu/).
Fig. 20. Mean classification score per margin type averaged over all participant runs</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><surname>Agarwal</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Belhumeur</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Feiner</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Jacobs</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Kress</surname>, <given-names>J.W.</given-names></string-name>,
          <string-name><surname>Ramamoorthi</surname>, <given-names>R.</given-names></string-name>,
          <string-name><surname>Bourg</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Dixit</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Ling</surname>, <given-names>H.</given-names></string-name>,
          <string-name><surname>Mahajan</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Russell</surname>, <given-names>R.</given-names></string-name>,
          <string-name><surname>Shirdhonkar</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Sunkavalli</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>White</surname>, <given-names>S.</given-names></string-name>:
          <article-title>First steps toward an electronic field guide for plants</article-title>.
          <source>Taxon</source>
          <volume>55</volume>,
          <fpage>597</fpage>-<lpage>610</lpage>
          (<year>2006</year>)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Albert</surname>
            ,
            <given-names>R.Z.</given-names>
          </string-name>
          :
          <article-title>Statistical mechanics of complex networks</article-title>
          .
          <source>Ph.D. thesis</source>
          , Notre Dame, IN, USA (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><surname>Backes</surname>, <given-names>A.R.</given-names></string-name>,
          <string-name><surname>Casanova</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Bruno</surname>, <given-names>O.M.</given-names></string-name>:
          <article-title>A complex network-based approach for boundary shape analysis</article-title>.
          <source>Pattern Recognition</source>
          <volume>42</volume>(<issue>1</issue>),
          <fpage>54</fpage>-<lpage>67</lpage>
          (<year>2009</year>)
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>Belhumeur</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Chen</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Feiner</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Jacobs</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Kress</surname>, <given-names>W.</given-names></string-name>,
          <string-name><surname>Ling</surname>, <given-names>H.</given-names></string-name>,
          <string-name><surname>Lopez</surname>, <given-names>I.</given-names></string-name>,
          <string-name><surname>Ramamoorthi</surname>, <given-names>R.</given-names></string-name>,
          <string-name><surname>Sheorey</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>White</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Zhang</surname>, <given-names>L.</given-names></string-name>:
          <article-title>Searching the world's herbaria: A system for visual identification of plant species</article-title>.
          <source>In: ECCV</source>, pp.
          <fpage>116</fpage>-<lpage>129</lpage>
          (<year>2008</year>)
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><surname>Bruno</surname>, <given-names>O.M.</given-names></string-name>,
          <string-name><surname>de Oliveira Plotze</surname>, <given-names>R.</given-names></string-name>,
          <string-name><surname>Falvo</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>de Castro</surname>, <given-names>M.</given-names></string-name>:
          <article-title>Fractal dimension applied to plant identification</article-title>.
          <source>Information Sciences</source>
          <volume>178</volume>(<issue>12</issue>),
          <fpage>2722</fpage>-<lpage>2733</lpage>
          (<year>2008</year>)
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><surname>Casanova</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Batista Florindo</surname>, <given-names>J.</given-names></string-name>,
          <string-name><surname>Bruno</surname>, <given-names>O.M.</given-names></string-name>:
          <article-title>Ifsc/usp at imageclef 2011: Plant identification task</article-title>.
          <source>In: Working notes of CLEF 2011 conference</source>
          (<year>2011</year>)
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cerutti</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tougne</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mille</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vacavant</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coquin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Guiding active contours for tree leaf segmentation and identification</article-title>
          .
          <source>In: Working notes of CLEF 2011 conference</source>
          (<year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><surname>Ellis</surname>, <given-names>B.</given-names></string-name>,
          <string-name><surname>Daly</surname>, <given-names>D.C.</given-names></string-name>,
          <string-name><surname>Hickey</surname>, <given-names>L.J.</given-names></string-name>,
          <string-name><surname>Johnson</surname>, <given-names>K.R.</given-names></string-name>,
          <string-name><surname>Mitchell</surname>, <given-names>J.D.</given-names></string-name>,
          <string-name><surname>Wilf</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Wing</surname>, <given-names>S.L.</given-names></string-name>:
          <article-title>Manual of Leaf Architecture</article-title>.
          Comstock Publishing Associates (<year>2009</year>)
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Goeau</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selmi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouysset</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joyeux</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Visual-based plant species identification from crowdsourced data</article-title>
          .
          <source>In: Proceedings of ACM Multimedia 2011</source>
          (<year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Goeau</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yahiaoui</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name><surname>Mouysset</surname>, <given-names>E.</given-names></string-name>:
          <article-title>Participation of inria &amp; pl@ntnet to imageclef 2011 plant images classification task</article-title>
          .
          <source>In: Working notes of CLEF 2011 conference</source>
          (<year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hamid</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thom</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>Rmit at imageclef 2011 plant identification</article-title>
          .
          <source>In: Working notes of CLEF 2011 conference</source>
          (<year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name><surname>Neto</surname>, <given-names>J.C.</given-names></string-name>,
          <string-name><surname>Meyer</surname>, <given-names>G.E.</given-names></string-name>,
          <string-name><surname>Jones</surname>, <given-names>D.D.</given-names></string-name>,
          <string-name><surname>Samal</surname>, <given-names>A.K.</given-names></string-name>:
          <article-title>Plant species identification using elliptic fourier leaf shape analysis</article-title>.
          <source>Computers and Electronics in Agriculture</source>
          <volume>50</volume>(<issue>2</issue>),
          <fpage>121</fpage>-<lpage>134</lpage>
          (<year>2006</year>)
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. Soderkvist,
          <string-name>
            <surname>O.J.O.</surname>
          </string-name>
          :
          <article-title>Computer Vision Classi cation of Leaves from Swedish Trees</article-title>
          .
          <source>Master's thesis</source>
          , Linkoping University, SE-581 83 Linkoping,
          <string-name>
            <surname>Sweden</surname>
          </string-name>
          (
          <year>September 2001</year>
          ),
          <article-title>liTH-ISY-EX-3132</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Villena-Roman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lana-Serrano</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez-Cristobal</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          :
          <article-title>Daedalus at imageclef 2011 plant identification task: Using sift keypoints for object detection</article-title>
          .
          <source>In: Working notes of CLEF 2011 conference</source>
          (<year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name><surname>Yahiaoui</surname>, <given-names>I.</given-names></string-name>,
          <string-name><surname>Herve</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Boujemaa</surname>, <given-names>N.</given-names></string-name>:
          <article-title>Shape-based image retrieval in botanical collections</article-title>
          .
          <source>In: Advances in Multimedia Information Processing - PCM 2006</source>
          , vol.
          <volume>4261</volume>
          , pp.
          <fpage>357</fpage>-<lpage>364</lpage>
          (<year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Yanikoglu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aptoula</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tirkaz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Sabanci-okan system at imageclef 2011: Plant identification task</article-title>
          .
          <source>In: Working notes of CLEF 2011 conference</source>
          (<year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>