A Mission-Oriented Citizen Science Platform for Efficient Flower Classification Based on Combination of Feature Descriptors

Andréa Britto Mattos, Ricardo Guimarães Herrmann, Kelly Kiyumi Shigeno
IBM Research - Brazil
{abritto,rhermann,kshigeno}@br.ibm.com

Rogerio Schmidt Feris
IBM T.J. Watson Research
rsferis@us.ibm.com

ABSTRACT

This paper describes a citizen science system for flora monitoring that employs a concept of missions, as well as an automatic approach for flower species classification. The proposed method is fast and suitable for use in mobile devices, as a means to achieve and maintain high user engagement. Besides providing a web-based interface for visualization, the system allows the volunteers to use their smartphones as powerful sensors for collecting biodiversity data in a fast and easy way.

The classification accuracy is increased by a preliminary segmentation step that requires simple user interaction, using a modified version of the GrabCut algorithm. The proposed classification method obtains good performance and accuracy by combining traditional color and texture features together with carefully designed features, including a robust shape descriptor that captures fine morphological structures of the objects to be classified. A novel weighting technique assigns different costs to each feature, taking into account the inter-class and intra-class variation between the considered species.

The method is tested on the popular Oxford Flower Dataset, containing 102 categories, and we achieve state-of-the-art accuracy while proposing a more efficient approach than previous methods described in the literature.

Categories and Subject Descriptors

H.3 [Information Storage and Retrieval]: Online Information Services; I.4 [Image Processing and Computer Vision]: Scene Analysis; I.5 [Pattern Recognition]: Applications

Keywords

Computer vision, fine-grained classification, flora classification, citizen science

1. INTRODUCTION

Citizen science is not a new concept: the idea of conducting research by citizens, gathering crowdsourced data that is analyzed for scientific purposes, has been active for generations, dating back to 1900, when the Christmas Bird Count (CBC, http://birds.audubon.org/christmas-bird-count) project started. It was a form of mass-collaboration citizen science project that used paper forms sent by post to the responsible society or research group of scientists who requested the data.

Nevertheless, it is noticeable that the power of crowdsourced science is growing intensely nowadays and such projects are becoming increasingly popular, even gaining the attention of major news media. For instance, projects such as eBird (http://www.ebird.org/) and Galaxy Zoo (http://www.galaxyzoo.org/) are able to engage hundreds of thousands of volunteers and are being broadly used for scientific and educational purposes.

Citizen science's boost in popularity is explained by the fact that data collection and analysis tasks became much easier to address. Even simple and low-cost smartphones are equipped with GPS and high-resolution cameras that allow huge amounts of data to be collected with small effort. Also, the Internet brings together a high number of volunteers to work on this data remotely and simultaneously. However, there are still difficulties that prevent the further growth of this methodology.

One challenge of citizen science projects is how to obtain high and constant engagement of the users. Although some volunteers are motivated by the scientific contribution on its own, it is possible to resort to "gamification", the use of game elements in non-game contexts [7], to get others to further engage with the community and contribute more enthusiastically [8].

Copyright (c) by the paper's authors. Copying permitted only for private and academic purposes. In: S. Vrochidis, K. Karatzas, A. Karpinnen, A. Joly (eds.): Proceedings of the International Workshop on Environmental Multimedia Retrieval (EMR 2014), Glasgow, UK, April 1, 2014, published at http://ceur-ws.org
Project Noah (http://www.projectnoah.org/), for instance, is a large-scale project in which every report is assigned to a specific mission, whose goal is to monitor certain types of flora or fauna classes in a specific location. There are also many aspects that can increase user motivation, as observed in [28]. For instance, the volunteers must have confidence that the collected data is being used: therefore, they should have easy access to the visualization of the collected data. Also, it is important to notice that some users are not willing to modify their daily activities to report data, and providing training and mentoring for volunteers to increase their skills can be fundamental for valid data registration.

This last point highlights an additional drawback of projects monitoring biodiversity: relying on the users having the knowledge to classify, without further assistance, the specimens being reported. This can be a difficult task when one considers the large number of species that share similar morphology. Misclassified inputs can compromise the environmental research, and the impossibility of assuring the quality of the analysis made by non-experts may be the bottleneck of such projects. In this regard, an algorithm to assist the user would be very useful. A system that could automatically recognize an input image with small effort, or at least retrieve a list of the most similar species, could make the user feel more confident, perhaps even engaging a larger number of volunteers, preventing errors and accelerating the identification process. If such a system could be deployed on a mobile device, the benefits would be even greater, since the classification could be executed at collection time.

To assist non-expert classification, automatic works in the flora field (which is the domain of focus of this paper) are getting good attention, and recent studies were able to achieve good results in leaf and flower classification [1, 3, 18, 19]. This popularity is evident as we see efforts such as the recent Plant Identification Task, promoted by the widely popular ImageCLEF challenge [9].

Besides robustness, automatic or semi-automatic methods can reduce data collection time. For instance, as indicated by Zou and Nagy [33], for a dataset of 102 flower species, the time for a semi-automatic classification is much lower than that needed by humans alone.

1.1 Related Work

1.1.1 Citizen Science in the Flora Domain

Table 1 shows a list of various citizen science projects in the botanical field. To the best of our knowledge, these are the most representative projects in this area. Their limitations highlight the previously discussed issues regarding the difficulty of species identification or the adaptation of the volunteers' routine, since the data upload requires a computer and/or Internet connection.

Table 1: Review of citizen science projects in the botanical field.

- Conker Tree Science. Goal: plague monitoring. Reported data: presence or absence of affected plants and severity of plague damage. Platform: Web, Android, iPhone. Scope: UK. Limitations: requires training to differentiate natural aging from plague effects.
- Plant Tracker. Goal: monitor invasive (non-native) plants. Reported data: presence of 14 types of invasive plants. Platform: Web, Android, iPhone. Scope: UK. Limitations: requires training to differentiate native and invasive plants.
- The Great Sunflower Project. Goal: monitor pollinators. Reported data: presence and quantity of pollinators (bees and hummingbirds), plant type. Platform: Web. Scope: US. Limitations: requires training to differentiate bees from wasps and flies.
- Melibee Project. Goal: monitor the effect of sweetclover (invasive) on the pollination of blueberry and cranberry (native). Reported data: phenology of blueberry, cranberry and sweetclover, monitoring 5 plants during a certain period. Platform: Web. Scope: Alaska. Limitations: requires repetitive work and training to differentiate the monitored species.
- Wildflowers Count. Goal: identify 99 species in a randomly chosen 1 km² area. Reported data: annual wildflowers count. Platform: Web. Scope: UK. Limitations: requires training to identify 99 types of flowers; location is chosen by the system.
- Viburnum Leaf Beetle. Goal: monitor plagues in Viburnum species. Reported data: presence of infected plants. Platform: Web. Scope: NY. Limitations: requires training to identify Viburnum species.
- BudBurst. Goal: study effects of climate change on plant phenology. Reported data: presence and quantity of flowers and fruits (single or regular reports). Platform: Web, mobile. Scope: US. Limitations: requires training to identify the monitored flowers.
- E-Flora BC. Goal: build flower catalog. Reported data: geo-located rare species. Platform: Web. Scope: British Columbia. Limitations: only monitors significant species.
- Project Noah. Goal: build fauna and flora catalog. Reported data: geo-located pictures within a mission. Platform: Web, Android, iPhone. Scope: worldwide. Limitations: not all species are identified.
- LeafSnap. Goal: build plant catalog. Reported data: geo-located leaf pictures. Platform: iPhone. Scope: NY / Washington DC. Limitations: requires a picture of a single leaf on a white background.
project5 knowledge to classify, without further assistance, the speci- offers a good differential with respect to other citizen science mens being reported. This can be a difficult task when one projects, since it runs a shape-based algorithm to classify considers the large number of species that share similar mor- plant species from their leaves automatically [15]. However, phology. Misclassified inputs can compromise the environ- the user needs to extract (i.e., cut) the leaf and place it on a mental research, and the impossibility to assure the quality white background for the segmentation algorithm to work. of the analysis made by non-experts may be the bottleneck The segmentation method proposed by LeafSnap might of such projects. In this regard, an algorithm to assist the be considered harmful in an environmental aspect, once it user would be very useful. A system that could automati- requires the extraction of the leaves from their natural sur- cally recognize an input image with small effort, or at least roundings. Also, lighting variations can affect the back- retrieve a list of the most similar species, could make the user ground color and interfere in the segmentation result. Fi- feel more confident, perhaps even engaging a larger number nally, it is hard to extend LeafSnap for additional domains of volunteers, preventing errors and accelerating the iden- other than leaf classification: besides taking shape infor- tification process. If such a system could be deployed in a mation as the only feature for classification, a segmenta- mobile device, the benefits would be even greater, once the tion method based on a fixed background is only feasible classification could be executed at collection time. for static species, once it is clear that, when photographing To assist non-expert classification, automatic works in the specimens that can move, there is no guarantee that they flora field – which is the domain of focus of this paper – will remain in place. are getting good attention, and recent studies were able to Table 1 shows a list of various citizen science projects in achieve good results in leaf and flower classification [1, 3, the botanical field. To the best of our knowledge, these are 18, 19]. This popularity is evident as we see efforts such as the most representative projects in this area. Their limita- the recent Plant Identification Task, promoted by the widely tions highlights the previously discussed issues regarding the popular ImageCLEF challenge [9]. difficulty of species identification or adaptation of the vol- Besides robustness, automatic or semi-automatic methods unteers routine, once the data upload requires a computer can reduce data collection time. For instance, as indicated and/or Internet connection. by Zou and Nagy [33], for a dataset of 102 flower species, Our goal is to develop a system similar to LeafSnap, but the time for a semi-automatic classification is much lower focusing on flowers, and replacing the user effort for seg- than that made by humans alone. mentation by a simpler and less intrusive action taken in his own device. Since our study is part of the citizen science context, user interaction is accepted, once we assume there 1.1 Related Work is always a volunteer operating the system in his smartphone or tablet. 
Our platform is developed with the concern of being attractive to users so it can reach the largest possible number of volunteers. Therefore, our flower classification algorithm must have high accuracy and must also be fast enough to run in the web browser and on mobile devices. Besides, we use missions to engage and entertain the volunteers, following gamification principles. In order to compare our classification results with a benchmark, we use the challenging Oxford Flower Dataset (http://www.robots.ox.ac.uk/~vgg/data/flowers/).

1.1.2 Flower Classification

In the flower classification field, previous studies worked on small datasets of 10 to 17 flower species [10, 14, 20, 25], achieving accuracy rates of 81%-96%. Most of these works rely on contour analysis and SIFT features [17], which is known to be a robust descriptor, but computationally inefficient.

Using larger datasets, consisting of 30 to 79 flower species, accuracy rates vary from 63% to 94%, combining a variety of approaches such as histograms in HSV color space, contour features, co-occurrence matrices (GLCM) for texture features, and RBF and probabilistic classification [6, 29, 30]. Qi [26] focuses on a fast environmental classification for flowers and leaves, using spatial co-occurrence features and a linear SVM, but accuracy is not reported.

Works on the Oxford Flower Dataset, which contains 102 flower species, reported accuracy rates no greater than 80%. The dataset was introduced by Nilsback and Zisserman [21] and is used by many authors. It is considered an extremely challenging dataset, because it contains pictures with severe variations in illumination and viewpoint, and also due to small inter-class variations and large intra-class variations between the considered flower species. See some examples of such cases in Fig. 1.

Using the Oxford Dataset, studies working with unsegmented images demonstrate lower classification accuracy and require powerful classifiers that are computationally expensive [2, 11]. Khan's work [13] relies on color and shape, using a SIFT-based approach. Kanan [12] applies salience maps to extract a high-dimensional image representation that is used in a probabilistic classifier. Both works require high computation time due to their feature extraction methods.

Using segmented images, the authors are able to achieve a more robust classification for this dataset, though none of the considered works is suitable for real time. Nilsback [21, 22] applies a multi-kernel SVM classifier using four different features (color in HSV space, HOG, and SIFT in foreground and boundary regions). Chai [5] proposes a robust approach for co-segmentation (segmenting images with similar background color distributions) and applies an SVM classifier based on color features in the Lab space and SIFT in the foreground region. Angelova [1, 16] proposes a segmentation approach, followed by the extraction of HOG features at four different levels, encoded using LLC; the encoded features are subject to max pooling and an SVM classifier is used. They mention future efforts toward a real time approach.

1.2 Contributions and Organization

Our main goal is to develop a citizen science application for flower monitoring whose collection is oriented by missions. Also, to assist the volunteers, we propose an algorithm for classification of flower species that meets the following requirements:

(i) Has high accuracy: We propose a novel approach that relies on efficient histogram-based feature descriptors that capture both global properties and fine shape information of objects. This is done while leveraging a learning-based distance measure to properly weight the feature contributions, which is a critical step for increasing accuracy, as demonstrated in previous studies [4, 31, 32].
(ii) Is fast: Mobile-based applications have a challenge regarding connectivity: users can only submit requests and receive answers if their devices have access to the Internet. There are several locations in which this kind of service is not affordable or has poor quality, making image transfer prohibitive. Our algorithm is fast enough to run directly on the devices (without transferring information to a server), so the user does not have to rely upon a network connection and wait too long for a response, considering the additional latency of uploading the image and retrieving the classification results.

This paper is structured as follows. In Section 2, we describe the overall structure of our citizen science platform and, in Section 3, we describe the core approach for fine-grained classification. In Section 4, we show and discuss the obtained classification results on the considered dataset, and our conclusions and future work are described in Section 5.

[Figure 1: Samples of the Oxford Flower Dataset. Top row, samples of different classes with small inter-class variations: (a) Spear Thistle (left) and Artichoke (right); (b) English Marigold (left) and Barbeton Daisy (right). Bottom row, samples of the same class with large variations in color, texture and shape: (c) Spring Crocus; (d) Bougainvillea.]

2. SYSTEM

The system is comprised of two main parts: a web portal, organized toward data visualization and community organization, and a mobile application, aimed at data collection. Both components arrange user-contributed data around the concept of missions, which aggregate users with the common goal of collecting structured and unstructured data about a certain class of observations.

The collected data is structured in a set of attributes, whose pre-established values are filled in by the user and refer to the domain being registered. This structure allows us to provide query-by-example features, which helps to avoid common input errors and establishes a common vocabulary for user-contributed data. The unstructured data, on the other hand, is comprised of images only, since we are dealing with plants.

[Figure 2: System mobile interface for classification: (a) image upload; (b) free-hand user marker (in white); (c) classification of the segmented input; (d) other results; (e) screen for geo-analysis. The user uploads an image, inserts a marker for semi-automatic segmentation of the input image and retrieves a scrollable screen with the top similar classes for the uploaded image, sorted by highest probability. The observation is inserted in a map using the device's location.]

Missions are created by users and can be set as public or private, as determined by the mission's owner, since some of them may contain sensitive geo-spatial data about observations belonging to endangered species. As the number of missions in the system may be rather large, the portal ranks them according to their activity, to make it easier for users to engage with the rest of the community.

It is currently possible to collect geo-referenced photos with consumer-grade mobile devices, using their cameras and GPS receivers, as mentioned in Section 1. The steps the user has to go through in order to submit a contribution are highlighted in Fig. 2. For the purpose of submitting a report, the user first needs to join one mission. After the picture is taken, he must loosely delineate the region of interest in the image. This is used for a segmentation process that will be described in the next section. This step can require user confirmation before proceeding with the next task, which is the automatic classification of the segmented input. The system retrieves the top 5 matches that are most similar to the uploaded image and the user can select the correct one or browse within a list of all the registered classes.

In situations where there is no connectivity, the algorithms can be executed directly on the devices, as a significant range of current mobile processors have multiple cores and some of them have vector processing instructions, which are already supported by popular libraries for mobile platforms, such as OpenCV. After the report is submitted (and synchronized with the server, in situations of intermittent or no connectivity), other users can view and bookmark this report. The feedback is provided to the users as bookmarked reports. Additionally, the user may fill in a form containing values for the observable raw morphological attributes of the sample. The set of supported attributes is defined by the mission's owner at the moment it is created, where he can define the required fields according to the mission's needs. For the botanic field, some examples of attributes are stem diameter, leaf thickness and so on.

In order to obtain more accurate classification results, we let the user choose the category of the image he is uploading. We currently only have trained classifiers for flowers, but the overall concept of the system is generic enough to accept data belonging to other categories. Since observations can be arbitrary, we need to load the corresponding classifier that was trained with the relevant data for the current category.
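To make the mission and observation structure described in this section concrete, the sketch below shows one plausible representation in Python. It is a minimal illustration under our own assumptions; the field names are hypothetical and do not reflect the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Mission:
    """A data-collection mission (illustrative fields, not the real schema)."""
    name: str
    owner: str
    public: bool = True              # private missions protect sensitive locations
    category: str = "flower"         # selects which trained classifier to load
    attributes: list = field(default_factory=list)  # e.g. ["stem diameter (mm)"]

@dataclass
class Observation:
    """A single user report submitted within a mission."""
    mission: str
    photo_path: str                  # the unstructured part: the image itself
    latitude: float                  # read from the device's GPS receiver
    longitude: float
    species_guess: Optional[str] = None  # filled in by the automatic classifier
    attribute_values: dict = field(default_factory=dict)  # structured attributes
```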
3. ALGORITHM

This section describes the proposed algorithm for flower recognition, which is divided in segmentation and classification steps. Both tasks are designed to be fast and precise and are generic enough to be used for classifying categories other than flowers.

3.1 Segmentation

We use the GrabCut algorithm [27] for a semi-manual segmentation with little user interaction. This method is also used in Qi's work [26] due to its good performance. In our method, instead of defining a bounding box and control points, the user draws a free-hand marker, which is more intuitive for a general user. The marker replaces the control points for a more refined boundary in a faster interaction. It must enclose the whole object but does not have to be precise.

A Gaussian Mixture Model is used to learn the pixel distribution in the background (outside the marker) and the possible foreground regions, and a graph is built from this pixel distribution. Edge weights are given according to pixel similarity (a large difference in pixel color generates a low weight). Finally, graph and image are segmented by a min-cut algorithm. See an example of a user marker in Fig. 2(b) and the corresponding segmented output in Fig. 2(c).

The advantages of this approach are its simplicity from a user perspective, the generation of good segmentations, and the fact that it is faster than the segmentation approaches discussed in Section 1.1.2, requiring an average time of 1.5 seconds.
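A minimal sketch of this step, using OpenCV's mask-initialized GrabCut, is shown below. The mapping from the free-hand marker to the GrabCut labels is our assumption about one reasonable implementation, not the authors' published code.

```python
import cv2
import numpy as np

def segment_with_marker(image_bgr, marker_points, iterations=5):
    """Segment the object enclosed by a closed free-hand marker.

    image_bgr must be an 8-bit 3-channel image; marker_points is a list of
    (x, y) vertices of the user's stroke. Pixels outside the marker are
    labeled sure background, pixels inside probable foreground, and
    GrabCut's GMM/min-cut iterations refine the boundary.
    """
    mask = np.full(image_bgr.shape[:2], cv2.GC_BGD, dtype=np.uint8)
    contour = np.asarray(marker_points, dtype=np.int32).reshape(-1, 1, 2)
    cv2.fillPoly(mask, [contour], int(cv2.GC_PR_FGD))

    bgd_model = np.zeros((1, 65), np.float64)  # GMM parameters, background
    fgd_model = np.zeros((1, 65), np.float64)  # GMM parameters, foreground
    cv2.grabCut(image_bgr, mask, None, bgd_model, fgd_model,
                iterations, cv2.GC_INIT_WITH_MASK)

    # Keep pixels labeled as sure or probable foreground.
    fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
    return image_bgr * fg[:, :, None], fg
```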
3.2 Classification

We compare multiple features (i.e., color, texture and shape-based features) with histogram matching, which is fast and invariant to rotation, scale and partial occlusion. For each class, a weight is assigned to each feature and the segmented images are matched by comparing their histograms, using a kNN classifier. The difference between two histograms is computed by a metric based on the Bhattacharyya distance, described as follows.

Consider histograms H1 and H2, with n bins. Let H1(i) denote the i-th bin element of H1, i ∈ 1...n; H2(i) is defined analogously. The distance between H1 and H2 is measured as:

d(H_1, H_2) = \sqrt{1 - \frac{\sum_{i=1}^{n} \sqrt{H_1(i) \times H_2(i)}}{\sqrt{\sum_{i=1}^{n} H_1(i) \times \sum_{i=1}^{n} H_2(i)}}}

Note that low scores indicate best matches. Our algorithm compares histograms built on three feature cues.

3.2.1 Features

Color. We build a histogram of the segmented image's colors in the HSV space, which is known to be less sensitive to lighting variations, using 30 bins for hue and 32 bins for saturation. Image quantization is applied before computation and the histograms are normalized in order to be comparable with the proposed distance metric.

Texture. The texture operator applied to the segmented image is LBP (Local Binary Patterns) [23], which is commonly used in real time applications due to its computational simplicity. In LBP, each pixel is represented as a binary number produced by thresholding its neighbors against it. We use the extended version of LBP [24], which considers a circular neighborhood with variable radius and is able to detect details in images at different scales. Our method computes a single histogram with 255 bins, since the considered textures are mainly uniform and spatial texture information did not improve the classification in our tests.

Shape. We propose the use of two simple and fast descriptors that, when combined, are able to represent different shape characteristics. The shape contour is partitioned into 72 bins using fixed angles, resulting in m contour points. For each point p_i, i ∈ 1...m, we compute modular and angular descriptors, and both vectors are later represented as separate histograms.

The modular descriptor is obtained by computing each point's distance from the shape centroid, normalized by the largest distance d_max computed in this process. It measures the contour's relation to the shape's center of mass, so elongated shapes are easily distinguished from round shapes. The angular descriptor computes the angles between p_i and its neighbors p_{i-1} and p_{i+1}, measuring the smoothness of the contour. The proposed descriptor becomes powerful by merging two simple features, being able to represent, for instance, petal length, density and symmetry.
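The sketch below illustrates the distance metric and the feature cues in Python with OpenCV, scikit-image and NumPy. The joint binning of hue and saturation, the LBP parameters, and the contour handling are our assumptions; in particular, the paper partitions the contour into 72 fixed-angle sectors, which we approximate here with 72-bin histograms of the descriptor values.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def histogram_distance(h1, h2):
    """Bhattacharyya-based distance from Section 3.2 (low = better match)."""
    num = np.sum(np.sqrt(h1 * h2))
    den = np.sqrt(np.sum(h1) * np.sum(h2)) + 1e-12
    return np.sqrt(max(0.0, 1.0 - num / den))

def color_histogram(segmented_bgr, mask):
    """HSV histogram over the segmented region: 30 hue x 32 saturation bins."""
    hsv = cv2.cvtColor(segmented_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], mask, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, alpha=1.0, norm_type=cv2.NORM_L1)
    return hist.flatten()

def texture_histogram(gray, mask):
    """Single 255-bin histogram of circular-neighborhood LBP codes."""
    lbp = local_binary_pattern(gray, P=8, R=2)  # radius is a tunable assumption
    hist, _ = np.histogram(lbp[mask > 0], bins=255, range=(0, 255))
    return hist / (hist.sum() + 1e-12)

def shape_histograms(contour, n_bins=72):
    """Modular and angular descriptors of the segmentation contour."""
    pts = contour.reshape(-1, 2).astype(np.float64)
    centroid = pts.mean(axis=0)

    # Modular: distance of each contour point to the centroid,
    # normalized by the largest such distance (d_max).
    dist = np.linalg.norm(pts - centroid, axis=1)
    modular, _ = np.histogram(dist / (dist.max() + 1e-12),
                              bins=n_bins, range=(0, 1))

    # Angular: angle at each point p_i between its neighbors p_{i-1}
    # and p_{i+1}, measuring the smoothness of the contour.
    v_prev = pts - np.roll(pts, 1, axis=0)
    v_next = np.roll(pts, -1, axis=0) - pts
    cos = np.sum(v_prev * v_next, axis=1) / (
        np.linalg.norm(v_prev, axis=1) * np.linalg.norm(v_next, axis=1) + 1e-12)
    angular, _ = np.histogram(np.arccos(np.clip(cos, -1, 1)),
                              bins=n_bins, range=(0, np.pi))

    as_prob = lambda h: h / (h.sum() + 1e-12)
    return as_prob(modular.astype(float)), as_prob(angular.astype(float))
```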
3.2.2 Metric Learning

In order to achieve higher accuracy, we assign different weights to the features when matching each class, so as to contemplate the situations described in Fig. 3. Our idea is to find which features are more discriminative, taking into account each feature's variation inside a same class, but also its variation with respect to all classes globally. We learn, for each class, one weight for each of the four feature descriptors. We estimate few weights due to the small number of training samples per class.

[Figure 3: A few examples of intra-class and inter-class variation between features of different training samples of six of the considered classes: (a) two classes that have similar color for all samples, but variations in shape; (b) species that can have many different colors, while having a similar shape; (c) two classes with similar shape and color, but with large variation in texture.]

We consider N classes, each containing M training samples, evaluated according to P features (in our tests, N = 102, M = 20 and P = 4, since we consider two histograms for shape information).

Let H^p_{i,j} denote, for a class i, the histogram of a sample j regarding feature p. We compute the mean histogram of a feature p for a class i as:

\bar{H}_i^p = \frac{\sum_{j=1}^{M} H_{i,j}^p}{M}

Likewise, we define ε^p_i as the mean distance per class i, regarding feature p, computed from the distance of all the training samples to the mean histogram. This helps us to evaluate intra-class variations, since we are considering the difference between all samples and a histogram that estimates the overall structure of the class:

\varepsilon_i^p = \frac{\sum_{j=1}^{M} d(H_{i,j}^p, \bar{H}_i^p)}{M}

Also, we compute the mean distance per feature p (ε^p) as follows:

\varepsilon^p = \frac{\sum_{i=1}^{N} \varepsilon_i^p}{N}

and we use this information to estimate inter-class variations between all types of species regarding each considered feature. Finally, the weight λ attributed to each of the considered features p in class i is given by:

\lambda(i, p) = \frac{1 - x(i, p)}{\sum_{p=1}^{P} x(i, p)}, \quad \text{where} \quad x(i, p) = \varepsilon_i^p - \varepsilon^p + \min_p(\varepsilon^p)

In this way we are able to take into account intra-class and inter-class variations between all training samples, estimating how much each feature should be considered when evaluating each class.

After the weights computation, we employ a kNN classifier in the test phase, where the cost of matching an image I with an image I_C of class C is computed as:

C(I, I_C) = \sum_{p=1}^{P} \frac{\lambda(C, p) \times d(H_I^p, H_{I_C}^p)}{\max_i(\varepsilon_i^p)}

and we select the classes with the lowest costs. We choose the kNN classifier because it is robust and performs well with training sets whose dimension is similar to the ones we are dealing with.
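Putting the formulas above together, a compact sketch of the weight learning and matching cost might look as follows. The nested-list layout of the training histograms (histograms[i][p] holding the M samples of class i for feature p) and the reuse of histogram_distance from the previous sketch are our assumptions.

```python
import numpy as np

def learn_weights(histograms, dist):
    """Compute lambda(i, p) and eps_i^p for N classes and P features."""
    N, P = len(histograms), len(histograms[0])
    eps_i = np.zeros((N, P))
    for i in range(N):
        for p in range(P):
            samples = np.asarray(histograms[i][p])   # shape (M, n_bins_p)
            mean_hist = samples.mean(axis=0)          # H-bar_i^p
            eps_i[i, p] = np.mean([dist(h, mean_hist) for h in samples])

    eps = eps_i.mean(axis=0)                          # eps^p, shape (P,)
    x = eps_i - eps + eps.min()                       # x(i, p)
    lam = (1.0 - x) / x.sum(axis=1, keepdims=True)    # lambda(i, p)
    return lam, eps_i

def matching_cost(query_hists, train_hists, class_id, lam, eps_i, dist):
    """Weighted cost C(I, I_C) of matching a query to one training image."""
    return sum(lam[class_id, p] * dist(query_hists[p], train_hists[p])
               / eps_i[:, p].max()
               for p in range(len(query_hists)))
```

A top-5 prediction then evaluates this cost against every training image, as in a standard kNN, and returns the classes with the lowest costs.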
4. RESULTS

Our tests followed the specifications for using the Oxford Dataset as a benchmark, considering 20 training samples per class and computing the mean per-class accuracy. In this dataset, the number of images in each class varies from 40 to 258. Fig. 4 shows the top 5 results for 4 different inputs. Every image is segmented as described in Section 3.1 and the weights are computed as described in our metric learning method. Note how the weights affect the classification results.

[Figure 4: Results for 4 different inputs: (a) Bearded Iris (BI), a class with large variation in color; (b) Bishop of Llandaff (BoL), with small variation in color; (c) Spring Crocus (SC), with large variation in texture; (d) Blackberry Lily (BL), with small variation in texture. In each column, from left to right: input with user marker; segmented input and top 5 matches sorted by highest probability; respective weights λ(i, p) for the corresponding classes; 8 training samples of each class. In the graphics, columns indicate angular, modular, color and texture features.]

Table 2 describes the algorithm's accuracy for a variable number of top matches, with and without metric learning. Since our platform returns the top 5 most likely classes, we can see that the system is able to achieve a very high accuracy rate for a very challenging dataset.

Table 2: Algorithm's accuracy when considering the top n matches for the test inputs.

Metric Learning | n = 1   | n = 3   | n = 5   | n = 10
No              | 65.58%  | 83.13%  | 88.72%  | 94.11%
Yes             | 80.88%  | 92.45%  | 96.07%  | 98.33%

The results also show that weighting features can improve classification significantly (+15.3% for n = 1) and that we are able to achieve the best accuracy reported in the state of the art for this dataset, as seen in Table 3.

Table 3: Accuracy comparison with previous works using the Oxford Dataset.

Ito and Kubota [11]         | 53.9%
Nilsback and Zisserman [21] | 72.8%
Khan et al. [13]            | 73.3%
Kanan and Cottrell [12]     | 75.2%
Nilsback [22]               | 76.3%
Angelova et al. [2]         | 76.7%
Chai et al. [5]             | 80.0%
Angelova and Shenghuo [16]  | 80.6%
Ours                        | 80.8%

Efficiency cannot be compared precisely, because this information is not reported in all the previous works. However, our average time for extracting and matching all features proved to be 4 times faster than running SIFT on the images' foreground region. The baseline work of Angelova and Shenghuo [16] takes about 5 seconds for segmentation and 2 seconds for classification. Our classification runs in less than a second, and there is a tradeoff for segmentation: ours is semi-automatic, but requires 1.5 seconds on average.

Finally, we replicated the training instances, and Table 4 shows that our method is still efficient for much larger training sets. Approximate nearest-neighbor methods based on hashing or kd-trees could also be used to obtain even higher efficiency, as sketched below.

Table 4: Elapsed time (in seconds) for classifying a single input, with increasing number of training samples per class.

# samples    | 20   | 40   | 60   | 100  | 200  | 1000
Elapsed time | 0.07 | 0.09 | 0.13 | 0.20 | 0.35 | 1.3
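As a sketch of that idea (our extrapolation, not part of the paper): for L1-normalized histograms, the distance of Section 3.2 is a monotone function of the Euclidean distance between the element-wise square roots of the histograms (the Hellinger distance), so a kd-tree built over square-rooted histograms of a single feature retrieves the same nearest neighbors in sub-linear time; the class-specific weights could then re-rank the resulting shortlist.

```python
import numpy as np
from sklearn.neighbors import KDTree

def build_index(train_histograms):
    """Index sqrt-transformed, L1-normalized histograms, shape (n, n_bins)."""
    return KDTree(np.sqrt(np.asarray(train_histograms)))

def top_k(tree, query_histogram, k=5):
    """Indices of the k best matches under the Hellinger/Bhattacharyya order."""
    _, idx = tree.query(np.sqrt(np.asarray(query_histogram))[None, :], k=k)
    return idx[0]
```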
5. CONCLUSIONS

This paper proposes a citizen science platform based on a novel approach for flower recognition. We introduce a strategy for comparing feature histograms for fine-grained classification, a robust shape descriptor, and a metric learning approach that assigns different weights to each feature, which can improve classification accuracy significantly. Our algorithm is extremely fast, being suitable for offline mobile applications, and was able to outperform previous works using the popular Oxford Dataset.

Our system is organized around missions, which in general will help us acquire more data due to gamification aspects. Also, besides engaging users, missions provide additional data external to the image, which may be useful for aiding classification in the future.

Other future work includes testing our algorithm on other species that contain large variations in the considered features (color, texture and shape), such as butterflies, fishes, birds and so on, and evaluating more efficient classifiers. We will also evaluate our system with real users, analyzing how their behavior is affected by gamification and automatic classification techniques. Finally, we aim to use crowdsourcing for labeling training images, in order to build various datasets that represent the local flora and fauna of diverse locations.

6. REFERENCES

[1] A. Angelova and S. Zhu. Efficient object detection and segmentation for fine-grained recognition. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR '13, pages 811-818, 2013.
[2] A. Angelova, S. Zhu, Y. Lin, J. Wong, and C. Shpecht. Development and deployment of a large-scale flower recognition mobile app. NEC Labs America Technical Report, 2012.
[3] E. Aptoula and B. Yanikoglu. Morphological features for leaf based plant recognition. In IEEE International Conference on Image Processing, ICIP '13, pages 1496-1499, 2013.
[4] H. Cai, F. Yan, and K. Mikolajczyk. Learning weights for codebook in image classification and retrieval. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR '10, pages 2320-2327, June 2010.
[5] Y. Chai, V. Lempitsky, and A. Zisserman. BiCoS: A bi-level co-segmentation method for image classification. In IEEE International Conference on Computer Vision, ICCV '11, pages 2579-2586, 2011.
[6] S.-Y. Cho. Content-based structural recognition for flower image classification. In IEEE Conference on Industrial Electronics and Applications, ICIEA '12, pages 541-546, 2012.
[7] S. Deterding, D. Dixon, R. Khaled, and L. Nacke. From game design elements to gamefulness: Defining "gamification". In Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, MindTrek '11, pages 9-15, New York, NY, USA, 2011. ACM.
[8] S. Deterding, M. Sicart, L. Nacke, K. O'Hara, and D. Dixon. Gamification: using game-design elements in non-gaming contexts. In CHI '11 Extended Abstracts on Human Factors in Computing Systems, CHI EA '11, pages 2425-2428, New York, NY, USA, 2011. ACM.
[9] H. Goëau, A. Joly, P. Bonnet, V. Bakic, D. Barthélémy, N. Boujemaa, and J.-F. Molino. The ImageCLEF plant identification task 2013. In Proceedings of the 2nd ACM International Workshop on Multimedia Analysis for Ecological Data, MAED '13, pages 23-28, New York, NY, USA, 2013. ACM.
[10] R.-G. Huang, S.-H. Jin, Y.-L. Han, and K.-S. Hong. Flower image recognition based on image rotation and DIE. In International Conference on Digital Content, Multimedia Technology and its Applications, IDC '10, pages 225-228, 2010.
[11] S. Ito and S. Kubota. Object classification using heterogeneous co-occurrence features. In European Conference on Computer Vision, ECCV '10, pages 27-30, 2010.
[12] C. Kanan and G. Cottrell. Robust classification of objects, faces, and flowers using natural image statistics. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR '10, pages 2472-2479, 2010.
[13] F. S. Khan, J. van de Weijer, A. D. Bagdanov, and M. Vanrell. Portmanteau vocabularies for multi-cue image representation. In International Conference on Neural Information Processing Systems, 2011.
[14] J.-H. Kim, R.-G. Huang, S.-H. Jin, and K.-S. Hong. Mobile-based flower recognition system. In International Conference on Intelligent Information Technology Application, IITA '09, pages 580-583, 2009.
[15] N. Kumar, P. N. Belhumeur, A. Biswas, D. W. Jacobs, W. J. Kress, I. C. Lopez, and J. V. B. Soares. Leafsnap: A computer vision system for automatic plant species identification. In European Conference on Computer Vision, ECCV '12, pages 502-516, 2012.
[16] Y. Lin, S. Zhu, and A. Angelova. Image segmentation for large-scale subcategory flower recognition. In IEEE Workshop on Applications of Computer Vision, WACV '13, pages 39-45, 2013.
[17] D. Lowe. Object recognition from local scale-invariant features. In International Conference on Computer Vision, ICCV '99, pages 1150-1157, 1999.
[18] S. Mouine, I. Yahiaoui, and A. Verroust-Blondet. Plant species recognition using spatial correlation between the leaf margin and the leaf salient points. In IEEE International Conference on Image Processing, ICIP '13, 2013.
[19] S. Mouine, I. Yahiaoui, and A. Verroust-Blondet. A shape-based approach for leaf classification using multiscale triangular representation. In ACM International Conference on Multimedia Retrieval, ICMR '13, pages 127-134, 2013.
[20] M.-E. Nilsback and A. Zisserman. A visual vocabulary for flower classification. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR '06, pages 1447-1454, 2006.
[21] M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In Indian Conference on Computer Vision, Graphics & Image Processing, ICVGIP '08, pages 722-729, 2008.
[22] M.-E. Nilsback. An automatic visual Flora: segmentation and classification of flower images. PhD thesis, University of Oxford, UK, 2009.
[23] T. Ojala, M. Pietikäinen, and D. Harwood. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition, 29(1):51-59, 1996.
[24] T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):971-987, 2002.
[25] W. Qi, X. Liu, and J. Zhao. Flower classification based on local and spatial visual cues. In IEEE International Conference on Computer Science and Automation Engineering, CSAE '12, pages 670-674, 2012.
[26] X. Qi, R. Xiao, L. Zhang, C.-G. Li, and J. Guo. A rapid flower/leaf recognition system. In ACM International Conference on Multimedia, MM '12, pages 1257-1258, 2012.
[27] C. Rother, V. Kolmogorov, and A. Blake. "GrabCut": Interactive foreground extraction using iterated graph cuts. In ACM SIGGRAPH, SIGGRAPH '04, pages 309-314, 2004.
[28] H. Roy, M. Pocock, C. Preston, D. Roy, J. Savage, J. Tweddle, and L. Robinson. Understanding citizen science and environmental monitoring: final report on behalf of UK Environmental Observation Framework. Technical report, NERC/Centre for Ecology & Hydrology, Wallingford, 2012.
[29] T. Saitoh, K. Aoki, and T. Kaneko. Automatic recognition of blooming flowers. In International Conference on Pattern Recognition, ICPR '04, pages 27-30, 2004.
[30] F. Siraj, M. Salahuddin, and S. Yusof. Digital image classification for Malaysian blooming flower. In International Conference on Computational Intelligence, Modelling and Simulation, pages 33-38, 2010.
[31] J. Wang, A. Kalousis, and A. Woznica. Parametric local metric learning for nearest neighbor classification. In Neural Information Processing Systems (NIPS), pages 1610-1618, 2012.
[32] K. Q. Weinberger and L. K. Saul. Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10:207-244, June 2009.
[33] J. Zou and G. Nagy. Evaluation of model-based interactive flower recognition. In International Conference on Pattern Recognition, ICPR '04, 2004.