=Paper=
{{Paper
|id=Vol-1222/paper2
|storemode=property
|title=LifeCLEF: Multimedia life species identification
|pdfUrl=https://ceur-ws.org/Vol-1222/paper2.pdf
|volume=Vol-1222
|dblpUrl=https://dblp.org/rec/conf/mir/JolyMGGSRBVFP14
}}
==LifeCLEF: Multimedia life species identification==
LifeCLEF: Multimedia Life Species Identification

Alexis Joly (INRIA, France, alexis.joly@inria.fr), Henning Müller (HES-SO, Switzerland, Henning.Mueller@hevs.ch), Hervé Goëau (INRIA, France, herve.goeau@inria.fr), Hervé Glotin (IUF & Univ. de Toulon, France, glotin@univ-tln.fr), Concetto Spampinato (University of Catania, Italy, cspampin@dieei.unict.it), Andreas Rauber (Vienna Univ. of Tech., Austria, rauber@ifs.tuwien.ac.at), Pierre Bonnet (CIRAD, France, pierre.bonnet@cirad.fr), Willem-Pier Vellinga (Xeno-canto foundation, The Netherlands, wp@xeno-canto.org), Bob Fisher (Edinburgh Univ., UK, rbf@inf.ed.ac.uk)

ABSTRACT

Building accurate knowledge of the identity, the geographic distribution and the evolution of living species is essential for a sustainable development of humanity as well as for biodiversity conservation. In this context, using multimedia identification tools is considered one of the most promising solutions to help bridge the taxonomic gap. With the recent advances in digital devices, network bandwidth and information storage capacities, the production of multimedia big data has indeed become an easy task. In parallel, the emergence of citizen science and social networking tools has fostered the creation of large and structured communities of nature observers (e.g. eBird, Xeno-canto, Tela Botanica, etc.) that have started to produce outstanding collections of multimedia records. Unfortunately, the performance of state-of-the-art multimedia analysis techniques on such data is still not well understood and is far from reaching the real world's requirements in terms of identification tools. The LifeCLEF lab proposes to evaluate these challenges around 3 tasks related to multimedia information retrieval and fine-grained classification problems in 3 living worlds. Each task is based on large and real-world data, and the measured challenges are defined in collaboration with biologists and environmental stakeholders in order to reflect realistic usage scenarios.

1. LifeCLEF LAB OVERVIEW

1.1 Motivations

Building accurate knowledge of the identity, the geographic distribution and the evolution of living species is essential for a sustainable development of humanity as well as for biodiversity conservation. Unfortunately, such basic information is often only partially available for professional stakeholders, teachers, scientists and citizens, and is most often incomplete for the ecosystems that possess the highest diversity, such as tropical regions. A noticeable cause and consequence of this sparse knowledge is that identifying living plants or animals is usually impossible for the general public, and often a difficult task for professionals, such as farmers, fish farmers or foresters, and even for the naturalists and specialists themselves. This taxonomic gap [32] was identified as one of the main ecological challenges to be solved during the Rio United Nations Conference in 1992.

In this context, using multimedia identification tools is considered one of the most promising solutions to help bridge the taxonomic gap [23, 11, 8, 31, 28, 1, 30, 18]. With the recent advances in digital devices, network bandwidth and information storage capacities, the production of multimedia data has indeed become an easy task. In parallel, the emergence of citizen science and social networking tools has fostered the creation of large and structured communities of nature observers (e.g. eBird (http://ebird.org/), Xeno-canto (http://www.xeno-canto.org/), Tela Botanica (http://www.tela-botanica.org/), etc.) that have started to produce outstanding collections of multimedia records. Unfortunately, the performance of state-of-the-art multimedia analysis techniques on such data is still not well understood and is far from reaching the real world's requirements in terms of identification tools [18]. Most existing studies or available tools typically identify a few tens or hundreds of species with moderate accuracy, whereas they should be scaled up to handle one, two or three orders of magnitude more species (the total number of living species on earth is estimated at around 10K for birds, 30K for fishes, 300K for plants and more than 1.2M for invertebrates [6]).
Copyright © by the paper's authors. Copying permitted only for private and academic purposes. In: S. Vrochidis, K. Karatzas, A. Karpinnen, A. Joly (eds.): Proceedings of the International Workshop on Environmental Multimedia Retrieval (EMR 2014), Glasgow, UK, April 1, 2014, published at http://ceur-ws.org

1.2 Evaluated Tasks

The LifeCLEF lab proposes to evaluate these challenges in the continuity of the image-based plant identification task [19] that was run within the ImageCLEF lab during the last three years with an increasing number of participants. It however radically enlarges the evaluated challenge towards multimodal data by (i) considering birds and fish in addition to plants, (ii) considering audio and video content in addition to images, and (iii) scaling up the evaluation data to hundreds of thousands of life media records and thousands of living species. More concretely, the lab is organized around three tasks:

PlantCLEF: an image-based plant identification task
BirdCLEF: an audio-based bird identification task
FishCLEF: a video-based fish identification task

As described in more detail in the following sections, each task is based on big and real-world data, and the measured challenges are defined in collaboration with biologists and environmental stakeholders so as to reflect realistic usage scenarios. For this pilot year, the three tasks are mainly concerned with species identification, i.e., helping users to retrieve the taxonomic name of an observed living plant or animal. Taxonomic names are actually the primary key used to organize life species and to access all available information about them, either on the web, in herbariums, or in scientific literature, books and magazines. Identifying the taxon observed in a given multimedia record and aligning its name with a taxonomic reference is therefore a key step before any other indexing or information retrieval task. More focused or complex challenges (such as detecting species duplicates or ambiguous species) could be evaluated in coming years.

The three tasks are primarily focused on content-based approaches (i.e. on the automatic analysis of the audio and visual signals) rather than on interactive information retrieval approaches involving textual or graphical morphological attributes. The content-based approach to life species identification has several advantages. It is first intrinsically language-independent and solves many of the multilingual issues related to the use of classical text-based morphological keys, which are strongly language-dependent and understandable by only a few experts in the world. Furthermore, an expert of one region or of a specific taxonomic group does not necessarily know the vocabulary dedicated to another group of living organisms. A content-based approach can therefore be generalized much more easily to new floras or faunas, contrary to knowledge-based approaches that require building complex models manually (ontologies with rich descriptions, graphical illustrations of morphological attributes, etc.). On the other hand, the LifeCLEF lab is inherently cross-modal through the presence of contextual and social data associated with the visual and audio content. This includes geo-tags or location names, time information, author names, collaborative ratings or comments, vernacular names (common names of plants or animals), organ or picture type tags, etc. The rules regarding the use of these meta-data in the evaluated identification methods will be specified in the description of each task. Overall, these rules are always designed so as to reflect realistic usage scenarios while allowing the largest diversity of approaches.

1.3 Expected Outcomes

The main expected outcomes of the LifeCLEF evaluation campaign are the following:

- give a snapshot of the performance of state-of-the-art multimedia techniques towards building real-world life species identification systems
- provide large and original data sets of biological records, and thereby allow the comparison of multimedia-based identification techniques
- boost research and innovation on this topic in the next few years and encourage multimedia researchers to work on trans-disciplinary challenges involving ecological and environmental data
- foster technological transfer from one domain to another and exchanges between the different communities (information retrieval, computer vision, bio-acoustics, machine learning, etc.)
- promote citizen science and nature observation as a way to describe, analyse and preserve biodiversity

2. TASK1: PlantCLEF

2.1 Context

Content-based image retrieval approaches are nowadays considered one of the most promising solutions to help bridge the botanical taxonomic gap, as discussed in [14] or [22] for instance. We therefore see an increasing interest in this trans-disciplinary challenge in the multimedia community (e.g. in [16, 9, 21, 24, 17, 4]). Beyond the raw identification performance achievable by state-of-the-art computer vision algorithms, the visual search approach offers much more efficient and interactive ways of browsing large floras than standard field guides or online web catalogs. Smartphone applications relying on such image-based identification services are particularly promising for setting up massive ecological monitoring systems, involving hundreds of thousands of contributors at a very low cost.

The first noticeable progress in this direction was achieved by the US consortium at the origin of LeafSnap (http://leafsnap.com/). This popular iPhone application allows a fair identification of 185 common American plant species by simply shooting a cut leaf on a uniform background (see [22] for more details). A step beyond was achieved recently by the Pl@ntNet project [18], which released a cross-platform application (iPhone [13], Android (https://play.google.com/store/apps/details?id=org.plantnet) and web (http://identify.plantnet-project.org/)) allowing users (i) to query the system with pictures of plants in their natural environment and (ii) to contribute to the dataset thanks to a collaborative data validation workflow involving Tela Botanica (http://www.tela-botanica.org/), the largest botanical social network in Europe.

As promising as these applications are, their performance is however still far from the requirements of a real-world social-based ecological surveillance scenario. Allowing the mass of citizens to produce accurate plant observations requires equipping them with much more accurate identification tools. Measuring and boosting the performance of content-based identification tools is therefore crucial.
This was precisely the goal of the plant identification task organized since 2011 within ImageCLEF (http://www.imageclef.org/), in the context of the worldwide evaluation forum CLEF (http://www.clef-initiative.eu/). In 2011, 2012 and 2013, respectively 8, 10 and 12 international research groups crossed the finish line of this large collaborative evaluation by benchmarking their image-based plant identification systems (see [14], [15] and [19] for more details). Data mobilised during these first 3 years can be consulted at http://publish.plantnet-project.org/project/plantclef; the geographic distribution of these botanical records is shown in Figure 1.

Figure 1: Distribution map of botanical records of the Plant task 2013.

Contrary to previous evaluations reported in the literature, the key objective was to build a realistic task closer to real-world conditions (different users, cameras, areas, periods of the year, individual plants, etc.). This was initially achieved through a citizen science initiative started 4 years ago in the context of the Pl@ntNet project in order to boost the image production of the Tela Botanica social network. The evaluation data was enriched each year with the new contributions and progressively diversified with other input feeds (annotation and cleaning of older data, contributions made through the Pl@ntNet mobile applications). The plant task of LifeCLEF 2014 is directly in the continuity of this effort.
The main novelties compared to the previous years are the following: (i) an explicit multi-image query scenario, (ii) the supply of user ratings on image quality in the meta-data, (iii) a new type of view called "Branch" in addition to the 6 previous ones, and (iv) more species (about 500, which is an important step towards covering the entire flora of a given region).

2.2 Dataset

More precisely, the PlantCLEF 2014 dataset is composed of 60,962 pictures belonging to 19,504 observations of 500 species of trees, herbs and ferns living in a European region centered around France. This data was collected by 1,608 distinct contributors. Each picture belongs to one and only one of the 7 types of view reported in the meta-data (entire plant, fruit, leaf, flower, stem, branch, leaf scan) and is associated with a single plant observation identifier allowing to link it with the other pictures of the same individual plant (observed the same day by the same person).

It is noticeable that most image-based identification methods and evaluation data proposed in the past were based on leaf images (e.g. in [22, 5, 9] or in the more recent methods evaluated in [15]). Only a few of them focused on flower images, as in [25] or [3]. Leaves are far from being the only discriminant visual key between species but, due to their shape and size, they have the advantage of being easily observed, captured and described. More diverse parts of the plants however have to be considered for accurate identification. As an example, the 6 species depicted in Figure 2 share the same French common name of "laurier" even though they belong to different taxonomic groups (4 families, 6 genera).

Figure 2: 6 plant species sharing the same common name for laurel in French, while belonging to distinct species.

The main reason is that these shrubs, often used in hedges, share leaves with more or less the same-sized elliptic shape. Identifying a laurel can be very difficult for a novice by just observing leaves, while it is undisputably easier with flowers. Beyond identification performance, the use of leaves alone also has some practical and botanical limitations. Leaves are not visible throughout the year for a large fraction of plant species. Deciduous species, distributed from temperate to tropical regions, cannot be identified from their leaves during several periods of the year. Leaves can be absent (i.e. leafless species), too young or too degraded (by pathogen or insect attacks) to be exploited efficiently. Moreover, the leaves of many species are intrinsically not informative enough or are very difficult to capture (needles of pines, thin leaves of grasses, huge leaves of palms, etc.).

Another originality of the PlantCLEF dataset is that its social nature makes it closer to the conditions of a real-world identification scenario: (i) images of the same species come from distinct plants living in distinct areas, (ii) pictures are taken by different users who might not have used the same protocol to acquire the images, and (iii) pictures are taken at different periods of the year. Each image of the dataset is associated with contextual meta-data (author, date, locality name, plant id) and social data (user ratings on image quality, collaboratively validated taxon names, vernacular names) provided in a structured xml file. The gps geo-localization and the device settings are available only for some of the images. Figure 3 gives some examples of pictures with decreasing averaged user ratings for the different types of views. Note that the users of the specialized social network creating these ratings (Tela Botanica) are explicitly asked to rate the images according to their usefulness for plant identification and their accordance with the pre-defined acquisition protocol for each view type. This is not an aesthetic or general interest judgement as on most social image sharing sites.

Figure 3: Examples of PlantCLEF pictures with decreasing averaged user ratings for the different types of views.

2.3 Task Description

The task will be evaluated as a plant species retrieval task based on multi-image plant observation queries. The goal is to retrieve the correct plant species among the top results of a ranked list of species returned by the evaluated system. Contrary to previous plant identification benchmarks, queries are not defined as single images but as plant observations, meaning a set of one to several images depicting the same individual plant, observed by the same person on the same day. Each image of a query observation is associated with a single view type (entire plant, branch, leaf, fruit, flower, stem or leaf scan) and with contextual meta-data (date, location, author). Each participating group is allowed to submit up to 4 runs built from different methods. Semi-supervised and interactive approaches (particularly for segmenting parts of the plant from the background) are allowed but will be compared independently from fully automatic methods. Any human assistance in the processing of the test queries therefore has to be signaled in the submitted runs' meta-data.

In practice, the whole PlantCLEF dataset is split in two parts, one for training (and/or indexing) and one for testing. The test set was built by randomly choosing 1/3 of the observations of each species, whereas the remaining observations were kept in the reference training set. The xml files containing the meta-data of the query images were purged so as to erase the taxon name (the ground truth), the vernacular name (the common name of the plant) and the image quality ratings (which would not be available at query stage in a real-world mobile application). The meta-data of the observations in the training set are kept unaltered.

The metric used to evaluate the submitted runs will be a score related to the rank of the correct species in the returned list. Each query observation will be attributed a score between 0 and 1 equal to the inverse of the rank of the correct species (equal to 1 if the correct species is the top-1, and decreasing quickly as the rank of the correct species increases). An average score will then be computed across all plant observation queries. A simple mean over all plant observation tests would however introduce some bias. Indeed, the PlantCLEF dataset was built in a collaborative manner, so that a few contributors might have provided many more observations and pictures than the many other contributors who provided few. Since we want to evaluate the ability of a system to provide the correct answers to all users, we rather measure the mean of the average classification rate per author. Finally, our primary metric is defined as the following average classification score S:

S = (1/U) Σ_{u=1}^{U} (1/P_u) Σ_{p=1}^{P_u} (1/N_{u,p}) s_{u,p}

where U is the number of users, P_u the number of individual plants observed by the u-th user, N_{u,p} the number of pictures of the p-th plant observation of the u-th user, and s_{u,p} the score between 0 and 1 equal to the inverse of the rank of the correct species.
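The per-author averaging above can be sketched in a few lines. The following is a minimal illustration, not the official evaluation tool; the input format (one record per observation query, carrying the author id, the number of pictures N_{u,p} of the observation and the rank of the correct species in the returned list) is an assumption made for the example.

```python
from collections import defaultdict

def primary_score(results):
    """Average classification score S, averaged per author first.

    `results` is an iterable of (user_id, n_pictures, rank) records, one per
    plant observation query. `rank` is the position of the correct species in
    the ranked list returned by the system (1 = top), or None if the correct
    species was not returned at all (score 0).
    """
    per_user = defaultdict(list)
    for user_id, n_pictures, rank in results:
        s = 0.0 if rank is None else 1.0 / rank   # s_{u,p}: inverse rank
        per_user[user_id].append(s / n_pictures)  # 1/N_{u,p} factor
    # Mean over each user's observations, then mean over users.
    return sum(sum(v) / len(v) for v in per_user.values()) / len(per_user)
```

Averaging per author first means that a single prolific contributor cannot dominate the final score, which is precisely the bias the metric is designed to avoid.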
3. TASK2: BirdCLEF

3.1 Context

The bird and the plant identification tasks share similar usage scenarios. The general public, as well as professionals like park rangers, ecology consultants and, of course, the ornithologists themselves, might actually be users of an automated bird identification system, typically in the context of wider initiatives related to ecological surveillance or biodiversity conservation. Using audio records rather than bird pictures is justified by current practices [8, 31, 30, 7]. Birds are actually not easy to photograph, as they are most of the time hidden, perched high in a tree or frightened by human presence, and they can fly very quickly, whereas audio calls and songs have proved to be easier to collect and very discriminant.

Only three noticeable previous initiatives on bird species identification based on songs or calls took place in the context of a worldwide evaluation, all in 2013. The first one was the ICML4B bird challenge, joint to the International Conference on Machine Learning in Atlanta, June 2013. It was initiated by the SABIOD MASTODONS CNRS group (http://sabiod.org), the University of Toulon and the National Natural History Museum of Paris [12]. It included 35 species, and 76 participants submitted their 400 runs on the Kaggle interface. The second challenge was conducted by F. Briggs at the MLSP 2013 workshop, with 15 species and 79 participants, in August 2013. The third challenge, the biggest in 2013, was organised by the University of Toulon, SABIOD and Biotope, with 80 species from the Provence, France. More than thirty teams participated, reaching 92% average AUC. The descriptions of the best ICML4B systems are given in the on-line book [2], including for some of them references to some useful scripts.

In collaboration with the organizers of these previous challenges, BirdCLEF 2014 goes one step further by (i) significantly increasing the number of species by almost an order of magnitude, (ii) working on real-world social data built from the contributions of hundreds of recordists, and (iii) moving to a more usage-driven and system-oriented benchmark by allowing the use of meta-data and defining information-retrieval-oriented metrics. Overall, the task is expected to be much more difficult than previous benchmarks because of the higher confusion risk between the classes, the higher background noise and the higher diversity in the acquisition conditions (devices, recordists' practices, contexts, etc.). It will therefore probably produce substantially lower scores and offer a better progression margin towards building real-world generalist identification tools.

3.2 Dataset

The training and test data of the bird task are composed of audio recordings collected by Xeno-canto (XC, http://www.xeno-canto.org/). Xeno-canto is a web-based community of bird sound recordists worldwide, with about 1,500 active contributors who have already collected more than 150,000 recordings of about 9,000 species. Nearly 500 species from Brazilian forests are used in the BirdCLEF dataset, representing the 500 species of that region with the highest number of recordings, totalling about 14,000 recordings produced by hundreds of users. Figure 4 illustrates the geographical distribution of the dataset samples.

Figure 4: Xeno-canto audio recordings distribution, centered around the Brazil area.

To avoid any bias in the evaluation related to the audio devices used, each audio file has been normalized to a constant bandwidth of 44.1 kHz and coded over 16 bits in wav mono format (the right channel is selected by default). The conversion from the original Xeno-canto data set was done using ffmpeg, sox and matlab scripts. The 16 Mel Filter Cepstrum Coefficients optimized for bird identification (according to an extended benchmark [10]) have been computed with their first and second temporal derivatives on the whole set. They were used in the best systems run in the ICML4B and NIPS4B challenges.

Audio records are associated with various meta-data including the species of the most active singing bird, the species of the other birds audible in the background, the type of sound (call, song, alarm, flight, etc.), the date and location of the observations (from which rich statistics on species distribution can be derived), some textual comments of the authors, multilingual common names and collaborative quality ratings. All of them were produced collaboratively by the Xeno-canto community.

3.3 Task Description

Participants are asked to determine the species of the most active singing birds in each query file. The background noise can be used like any other meta-data, but it is forbidden to correlate the test set of the challenge with the original annotated Xeno-canto data base (or with any external content, as many of these recordings are circulating on the web). More precisely, and similarly to the plant task, the whole BirdCLEF dataset has been split in two parts, one for training (and/or indexing) and one for testing. The test set was built by randomly choosing 1/3 of the observations of each species, whereas the remaining observations were kept in the reference training set. Recordings of the same species made by the same person on the same day are considered as part of the same observation and cannot be split across the test and training sets. The xml files containing the meta-data of the query recordings were purged so as to erase the taxon name (the ground truth), the vernacular name (the common name of the bird) and the collaborative quality ratings (which would not be available at query stage in a real-world mobile application). The meta-data of the recordings in the training set are kept unaltered.

The groups participating in the task will be asked to produce up to 4 runs containing a ranked list of the most probable species for each query record of the test set. Each species will have to be associated with a normalized score in the range [0, 1] reflecting the likelihood that this species is singing in the sample. The primary metric used to compare the runs will be the Mean Average Precision averaged across all queries. Additionally, to allow easy comparison with the previous Kaggle ICML4B and NIPS4B benchmarks, the area under the ROC curve (AUC) will be computed for each species and averaged over all species.
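To make the ranking metric concrete, here is a minimal, stdlib-only sketch of Mean Average Precision as it is commonly computed for such retrieval benchmarks. It is an illustration under assumed input conventions (one ranked species list and one set of relevant species per query), not the official scoring code.

```python
def average_precision(ranked_species, relevant):
    """AP of one query: mean of precision@k over the ranks k at which a
    relevant species appears in the returned list."""
    relevant = set(relevant)
    hits, precisions = 0, []
    for k, species in enumerate(ranked_species, start=1):
        if species in relevant:
            hits += 1
            precisions.append(hits / k)
    # Divide by the number of relevant species, so that relevant species
    # missing from the returned list lower the score.
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """`queries` is a list of (ranked_species, relevant_species) pairs."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)
```

For example, a query whose ranked list is ["a", "b", "c"] with relevant species {"a", "c"} gets AP = (1/1 + 2/3)/2.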
4. TASK3: FishCLEF

4.1 Context

Underwater video monitoring has been widely used in recent years for marine surveillance, as opposed to human-operated photography or net-casting methods, since it does not influence fish behavior and provides a large amount of material at the same time. However, it is impractical for humans to manually analyze the massive quantity of video data generated daily, because it requires much time and concentration and is also error prone. Automatic fish identification in videos is therefore of crucial importance, in order to estimate fish existence and quantity [29, 28, 26]. Moreover, it would help marine biologists to understand the natural underwater environment, promote its preservation, and study the behaviors and interactions between the marine animals that are part of it. Beyond this, video-based fish species identification finds applications in many other contexts, from education (e.g. primary/high schools) to the entertainment industry (e.g. aquariums). To the best of our knowledge, this is the first worldwide initiative on automatic image- and video-based fish species identification.

4.2 Dataset

The underwater video dataset used for FishCLEF is derived from the Fish4Knowledge (www.fish4knowledge.eu) video repository, which contains about 700k 10-minute video clips that were taken in the past five years to monitor Taiwan coral reefs. The Taiwan area is particularly interesting for studying the marine ecosystem, as it holds one of the largest fish biodiversities of the world, with more than 3,000 different fish species whose taxonomy is available at http://fishdb.sinica.edu.tw/. The dataset contains videos recorded from sunrise to sunset showing several phenomena, e.g. murky water, algae on the camera lens, etc., which make the fish identification task more complex. Each video has a resolution of 320x240 at 8 fps and comes with additional metadata including the date and localization of the recordings. Figure 5 shows 4 snapshots from 4 cameras monitoring the coral reef at Taiwan's Kenting site; it illustrates the complexity of automatic fish detection and recognition in real-life settings.

Figure 5: 4 snapshots of 4 cameras monitoring the Taiwan's Kenting site.

More specifically, the FishCLEF dataset consists of about 3,000 videos with several thousands of detected fish. The fish detections were obtained by processing these underwater videos with video analysis tools [27] and then manually labeled using the system described in [20].

4.3 Task Description

The dataset for the video-based fish identification task will be released in two stages: the participants will first have access to the training set, and a few months later they will be provided with the test set. The goal is to automatically detect fish and identify their species. The task comprises three subtasks: 1) identifying moving objects in videos by either background modeling or object detection methods, 2) detecting fish instances in video frames, and 3) identifying the species (taken from a subset of the most seen fish species) of the fish detected in video frames. Participants can decide to compete in only one subtask or in all subtasks. Although subtasks 2 and 3 are based on still images, participants are invited to exploit motion information extracted from the videos to support their strategies.

As scoring functions, the participants are asked to produce:

- ROC curves for subtask 1. In particular, precision, recall and F-measures are measured when comparing, on a pixel basis, the ground truth binary masks and the output masks of the object detection methods;
- Recall for fish detection in still images as a function of the bounding box overlap percentage: a detection is considered a true positive if the PASCAL score between it and the corresponding object in the ground truth is over 0.7;
- Average recall and recall per fish species for the fish recognition subtask.

The participants in the above subtasks will be asked to produce several runs containing a list of detected fish together with their species (only for subtask 3). For fish species identification, a ranked list of the most probable species (and the related likelihood values) must be provided for each detected fish.
5. SCHEDULE AND PERSPECTIVES

LifeCLEF 2014 registration opened in December 2013 and will close at the end of April 2014. At the time of writing, 61 research groups have already registered for at least one of the three tasks, and this number will continue growing. As in any evaluation campaign, many of the registered groups won't cross the finish line by submitting official runs, but this reflects at least their interest in the LifeCLEF data and the related challenges. The schedule of the ongoing and remaining steps of the LifeCLEF 2014 campaign is the following:

31.01.2014: training data ready and shared
15.03.2014: test data ready and shared
01.05.2014: deadline for submission of runs
15.05.2014: release of raw results
15.06.2014: submission of working notes describing each participant's systems and runs
15.07.2014: overall task reports including results analysis and main findings
16.09.2014: 1-day LifeCLEF workshop at the CLEF 2014 conference (Sheffield, UK)

The organisation of a new campaign in 2015, as well as its precise content, will depend on the outcomes of the 2014 edition and on the feedback received from the registered groups.

6. ADDITIONAL AUTHORS

Robert Planqué (Xeno-canto, The Netherlands, email: r.planque@vu.nl)

7. REFERENCES

[1] MAED '12: Proceedings of the 1st ACM International Workshop on Multimedia Analysis for Ecological Data, New York, NY, USA, 2012. ACM.
[2] Proc. of the First Workshop on Machine Learning for Bioacoustics, 2013.
[3] A. Angelova, S. Zhu, Y. Lin, J. Wong, and C. Shpecht. Development and deployment of a large-scale flower recognition mobile app, December 2012.
[4] E. Aptoula and B. Yanikoglu. Morphological features for leaf based plant recognition. In Proc. IEEE Int. Conf. Image Process., Melbourne, Australia, page 7, 2013.
[5] A. R. Backes, D. Casanova, and O. M. Bruno. Plant leaf identification based on volumetric fractal dimension. International Journal of Pattern Recognition and Artificial Intelligence, 23(6):1145–1160, 2009.
[6] J. E. M. Baillie, C. Hilton-Taylor, and S. Stuart. 2004 IUCN Red List of Threatened Species: A Global Species Assessment. IUCN, Gland, Switzerland and Cambridge, UK, 2004.
[7] F. Briggs, B. Lakshminarayanan, L. Neal, X. Z. Fern, R. Raich, S. J. Hadley, A. S. Hadley, and M. G. Betts. Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach. The Journal of the Acoustical Society of America, 131:4640, 2012.
[8] J. Cai, D. Ee, B. Pham, P. Roe, and J. Zhang. Sensor network for the monitoring of ecosystem: Bird species recognition. In Intelligent Sensors, Sensor Networks and Information (ISSNIP 2007), 3rd International Conference on, pages 293–298, Dec 2007.
[9] G. Cerutti, L. Tougne, A. Vacavant, and D. Coquin. A parametric active polygon for leaf segmentation and shape estimation. In International Symposium on Visual Computing, pages 202–213, 2011.
[10] O. Dufour, T. Artieres, H. Glotin, and P. Giraudet. Clusterized mel filter cepstral coefficients and support vector machines for bird song identification. 2013.
[11] K. J. Gaston and M. A. O'Neill. Automated species identification: why not? Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 359(1444):655–667, 2004.
[12] H. Glotin and J. Sueur. Overview of the first international challenge on bird classification. 2013.
[13] H. Goëau, P. Bonnet, A. Joly, V. Bakić, J. Barbe, I. Yahiaoui, S. Selmi, J. Carré, D. Barthélémy, N. Boujemaa, et al. Pl@ntNet mobile app. In Proceedings of the 21st ACM International Conference on Multimedia, pages 423–424. ACM, 2013.
[14] H. Goëau, P. Bonnet, A. Joly, N. Boujemaa, D. Barthélémy, J.-F. Molino, P. Birnbaum, E. Mouysset, and M. Picard. The ImageCLEF 2011 plant images classification task. In CLEF Working Notes, 2011.
[15] H. Goëau, P. Bonnet, A. Joly, I. Yahiaoui, D. Barthelemy, N. Boujemaa, and J.-F. Molino. The ImageCLEF 2012 plant identification task. In CLEF Working Notes, 2012.
[16] H. Goëau, A. Joly, S. Selmi, P. Bonnet, E. Mouysset, J. L. Joyeux, J.-F. Molino, P. Birnbaum, D. Bathelemy, and N. Boujemaa. Visual-based plant species identification from crowdsourced data. In ACM Conference on Multimedia, pages 813–814, 2011.
[17] A. Hazra, K. Deb, S. Kundu, P. Hazra, et al. Shape oriented feature selection for tomato plant identification. International Journal of Computer Applications Technology and Research, 2(4), 2013.
[18] A. Joly, H. Goeau, P. Bonnet, V. Bakić, J. …
[19] … D. Barthélémy, and N. Boujemaa. The ImageCLEF Plant Identification Task 2013. In International Workshop on Multimedia Analysis for Ecological Data, Barcelona, Spain, Oct. 2013.
[20] I. Kavasidis, S. Palazzo, R. Salvo, D. Giordano, and C. Spampinato. An innovative web-based collaborative platform for video annotation. Multimedia Tools and Applications, pages 1–20, 2013.
[21] H. Kebapci, B. Yanikoglu, and G. Unal. Plant image retrieval using color, shape and texture features. The Computer Journal, 54(9):1475–1490, 2011.
[22] N. Kumar, P. N. Belhumeur, A. Biswas, D. W. Jacobs, W. J. Kress, I. C. Lopez, and J. V. B. Soares. Leafsnap: A computer vision system for automatic plant species identification. In European Conference on Computer Vision, pages 502–516, 2012.
[23] D.-J. Lee, R. B. Schoenberger, D. Shiozawa, X. Xu, and P. Zhan. Contour matching for a fish recognition and migration-monitoring system. In Optics East, pages 37–48. International Society for Optics and Photonics, 2004.
[24] S. Mouine, I. Yahiaoui, and A. Verroust-Blondet. Advanced shape context for plant species identification using leaf image retrieval. In ACM International Conference on Multimedia Retrieval, pages 49:1–49:8, 2012.
[25] M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In Indian Conference on Computer Vision, Graphics and Image Processing, pages 722–729, 2008.
[26] M. R. Shortis, M. Ravanbakskh, F. Shaifat, E. S. Harvey, A. Mian, J. W. Seager, P. F. Culverhouse, D. E. Cline, and D. R. Edgington. A review of techniques for the identification and measurement of fish in underwater stereo-video image sequences. In SPIE Optical Metrology 2013, pages 87910G–87910G. International Society for Optics and Photonics, 2013.
[27] C. Spampinato, E. Beauxis-Aussalet, S. Palazzo, C. Beyan, J. Ossenbruggen, J. He, B. Boom, and X. Huang. A rule-based event detection system for real-life underwater domain. Machine Vision and Applications, 25(1):99–117, 2014.
[28] C. Spampinato, Y.-H. Chen-Burger, G. Nadarajan, and R. B. Fisher. Detecting, tracking and counting fish in low quality unconstrained underwater videos. In VISAPP (2), pages 514–519, 2008.
[29] C. Spampinato, D. Giordano, R. Di Salvo, Y.-H. Chen-Burger, R. B. Fisher, and G. Nadarajan. Automatic fish classification for underwater species behavior understanding. In Proceedings of ACM ARTEMIS 2010, pages 45–50. ACM, 2010.
[30] M. Towsey, B. Planitz, A. Nantes, J. Wimmer, and P. Roe. A toolbox for animal call recognition. Bioacoustics, 21(2):107–125, 2012.
[31] V. M. Trifa, A. N. Kirschel, C. E. Taylor, and E. E. Vallejo. Automated species recognition of antbirds in a …
Barbe, mexican rainforest using hidden markov models. The S. Selmi, I. Yahiaoui, J. Carré, E. Mouysset, J.-F. Journal of the Acoustical Society of America, Molino, N. Boujemaa, and D. Barthélémy. Interactive 123:2424, 2008. plant identification based on social image data. [32] Q. D. Wheeler, P. H. Raven, and E. O. Wilson. Ecological Informatics, 2013. Taxonomy: Impediment or expedient? Science, [19] A. Joly, H. Goëau, P. Bonnet, V. Bakic, J.-F. Molino, 303(5656):285, 2004. 13
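The pixel-basis scoring described for sub-task one of the fish task (precision, recall and F-measure between ground-truth and detected binary masks) can be sketched as follows. This is only an illustrative NumPy sketch under stated assumptions, not the official FishCLEF evaluation tool; the function name `pixel_prf` and the 2-D binary-array mask format are assumptions for the example.

```python
import numpy as np

def pixel_prf(gt_mask, det_mask):
    """Pixel-basis precision, recall and F-measure between a
    ground-truth binary mask and a detected binary mask."""
    gt = np.asarray(gt_mask, dtype=bool)
    det = np.asarray(det_mask, dtype=bool)
    tp = np.logical_and(gt, det).sum()  # pixels detected AND in the ground truth
    precision = tp / det.sum() if det.sum() else 0.0
    recall = tp / gt.sum() if gt.sum() else 0.0
    f = (2 * precision * recall / (precision + recall)
         if (precision + recall) else 0.0)
    return precision, recall, f
```

An ROC curve for the sub-task would then be traced by sweeping the detector's decision threshold and recomputing these quantities at each operating point.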