Proceedings of the 9th International Conference on Biological Ontology (ICBO 2018), August 7-10, 2018, Corvallis, Oregon, USA

Planteome & BisQue: Automating Image Annotation with Ontologies using Deep-Learning Networks

Dimitrios Trigkakis, Sinisa Todorovic
Dept. of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA

Justin Preece, Austin Meier, Justin Elser, Pankaj Jaiswal
Dept. of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA

Kris Kvilekval, Dmitry Fedorov, B.S. Manjunath
Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, USA

Keywords—image analysis; segmentation; ontology; annotation; machine learning; deep learning; convolutional neural networks

I. INTRODUCTION

The field of computer vision has recently experienced tremendous progress due to advances in deep learning [1]. This development holds particular promise for plant research, due to a significant increase in the scale of image data harvesting and a strong field-driven interest in the automated processing of observable phenotypes and visible traits within agronomically important species.

Parallel developments have occurred in semantic computing; for example, new ontologies have been initiated to capture plant traits and disease indicators [2]. When these ontologies are combined with existing segmentation capabilities [3], it is possible to conceptualize software applications that give researchers the ability to analyze large quantities of plant phenotype image data, and to auto-annotate that data with meaningful, computable semantic terminology.

We have previously reported on a software application that integrates segmentation and ontologies, but it lacked the ability to manage very high-resolution images, as well as a database platform to allow for high-volume storage requirements. We have also previously reported our migration of the AISO [4] user-guided segmentation feature to a BisQue (Bio-Image Semantic Query User Environment) [5] module to take advantage of its increased power, ability to scale, secure data management environment, and collaborative software ecosystem.

Neither AISO nor our initial BisQue implementation possessed a machine-learning component for interpreting (parts of) images. Plant researchers could benefit greatly from a trained classification model that predicts image annotations with a high degree of accuracy. We have therefore implemented two deep-learning prototypes: a coarse classification module for plant object identification (i.e. flower, fruit) and a fine-grained classification module that focuses on plant traits (e.g. reticulate vs. parallel venation, tip shape). Both classification models return results mapped to ontology terms as a form of annotation enrichment. The current version of the Planteome Deep Segmenter module [6] combines image classification with optional guided segmentation and ontology annotation. We have most recently run the module on local Planteome BisQue client services, and are currently working with CyVerse [7] to install a hosted version on their BisQue client service.

II. RELEVANT WORK

State-of-the-art models for classifying images typically use a hierarchy of filter convolutions to extract discriminative image features from raw pixels, and then map these deep image features to class predictions using a hierarchy of linear regressions. The former computing modules are called convolutional layers, and the latter, fully connected layers. A number of convolutional layers followed by fully-connected layers constitutes a typical deep convolutional neural network (CNN). The literature presents a host of CNNs that are successful in image classification on big datasets, including AlexNet [8], ResNet [9], GoogLeNet [10], and VGGNet [11]. These CNNs differ in the number of convolutional and fully-connected layers used, and in their connectivity. Most of them are too complex and require a huge amount of training data, which is not suitable for our domain, where expert annotations of images are usually scarce. In our work, we train ResNet [9] models for the task of image classification, for both the coarse classification of plant images and fine-grained leaf trait classification. We use segmentation with the dynamic graph-cuts algorithm [12] as an optional pre-processing step.

While CNNs aimed directly at semantic segmentation of images have also been studied in the deep learning literature [13][14], they typically require a huge amount of pixel-wise ground-truth annotations, which are not available in our domain. Therefore, we have decided to use the pre-processing segmentation step for delineating important image parts, and then apply our ResNet for coarse and fine classification to the selected image segments.

Previous work on the classification of leaves, like LeafSnap [15], attempted to identify specific species of leaves. In contrast with that work, our goal is to identify specific leaf traits that can be mapped to pre-existing ontology terms for any leaf species, even when the input image contains a leaf that has not been seen in the training dataset. We are not aware of any other work that attempts to classify leaf traits with ontological labelling in a given image.

Figure 1b. Classification results over the segmented part of the image, accompanied by the ontology PO-term for the identified plant. The user can follow the provided link to the corresponding ontology for more information about the identified plant term.
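The CNN anatomy described above (convolutional layers extracting features from raw pixels, followed by fully connected layers mapping those features to class predictions) can be sketched in PyTorch, the framework this work itself uses [16]. The layer sizes below are illustrative only, far smaller than the ResNet models trained in this work:

```python
import torch
import torch.nn as nn

# A minimal CNN matching the anatomy described above: a stack of
# convolutional layers extracts features, and fully connected
# layers map those features to class scores. Sizes are illustrative.
class TinyCNN(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(        # convolutional layers
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Sequential(      # fully connected layers
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# One class score per image for each of five plant-part categories.
model = TinyCNN(num_classes=5)
scores = model(torch.randn(2, 3, 224, 224))  # a batch of 2 RGB images
print(scores.shape)                          # torch.Size([2, 5])
```

In practice one rarely trains such a network from scratch on scarce expert-labelled data; instead, as in this work, a pretrained ResNet backbone is reused and only adapted to the new classes.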
III. IMPLEMENTATION

The latest iteration of our module on the BisQue platform preserves and extends the functionality of both AISO and our original BisQue segmentation efforts. The input consists of optional free-line user-guided markup on a selected image, along with an interface for specifying module behavior. The markup defines the foreground and background of the image. The user then has the ability to perform exact object segmentation on any visual element of interest utilizing a dynamic graph cuts algorithm. Our interface allows varying the segmentation quality, so that users can control the computational overhead of the segmentation.

The BisQue platform provides access to an online image library, along with annotations associated with graphical objects on the images. The platform provides the ability to analyze segmentation results from previous module outputs in the form of overlaid graphical annotations on an image.

We augment the segmentation capabilities of the module with machine learning algorithms that can automatically extract semantic information from either the entire input image or a segmented part of it. To this end, we introduce two tasks: coarse plant classification and fine-grained leaf classification. For the first task, we want to identify the class of a plant image, categorizing it into one of five categories, namely 'leaf', 'fruit', 'flower', 'stem' and 'entire plant'. The segmentation and classification functionality of our module is shown in Figure 1(a-b). For the second task, given an image of a leaf, we want to simultaneously classify multiple leaf characteristics, like leaf tip shape and leaf venation, as shown in Fig. 3.

Figure 2. The same image can be used for segmentation of different plant parts. The segmented part, identified as a leaf, can be further classified using the fine-grained classification model.

Figure 1a. User-guided markup on an image uploaded to the database. Markup includes foreground and background annotations for identifying the plant part of interest.
Figure 3. Fine-grained leaf classification. Each leaf characteristic is classified, and a link is provided for the corresponding ontology term.

The above functionality can be combined with the segmentation capabilities of our module to produce classification of specific parts of the image. The user interface allows the segmentation and classification to be enabled separately or in combination. To enable better module portability, we isolate our deep learning framework within a virtual environment that uses PyTorch [16] and OpenCV [17] for all image operations.

For our task of supervised classification of images, we make use of the ImageClef 2013 plant identification dataset [18]. The dataset contains labels for the five classes in the aforementioned coarse plant classification task. We augment the dataset with a selection of images collected by the Jaiswal lab. For fine-grained leaf classification, we needed additional labels for each leaf characteristic to be classified. To that end, we developed a desktop annotator tool (Fig. 4) [19], which enabled fast annotation of leaf characteristics based on reference images tagged with the appropriate terms. The annotation process using this tool resulted in 549 fully annotated leaf images, selected from the subset of the aforementioned dataset where the leaves were presented clearly on a sheet (leaf-on-sheet pre-annotated category). Generalizing to natural images of leaves requires the use of the segmentation capabilities of our module to isolate a leaf in an image.

Figure 4. Desktop annotation tool, used for labelling fine-grained leaf characteristics. On the right, each classification field can be expanded so that we can find the corresponding field's classification label. The toolbar helps with image manipulation in order to identify these classification labels.

The deep learning models we employ are based on the ResNet architecture, a type of convolutional neural network. For the coarse classification task, we train ResNet-50 on the augmented ImageClef 2013 dataset, using the category labels provided in the dataset. For fine-grained classification, we find that ResNet-50 overfits the training set (i.e., it cannot generalize well to new images that are not in the training set), so we reduce the number of model parameters by using ResNet-18. The model acts as a feature extractor, on top of which we append parallel fully connected layers, one for each leaf trait. In this way we obtain multiple predictions for each image. The selection of the appropriate model is done through the interface, where a multiple-choice menu allows the user to select either the coarse or the fine-grained pre-trained classification model to use in the module.

For each classification result provided by our models, a corresponding PO (Plant Ontology) or PATO (Phenotype And Trait Ontology) term is supplied that connects the class of an object with the correct ontology term. These terms can be accessed online through a supplied hyperlink, which maps all classes to their ontology through the Planteome database of ontologies.

IV. RESULTS

We report our classification results in Tables 1 and 2. In Table 1 we show the confusion matrix for the five categories. In a confusion matrix, the labels in the top row represent the predictions of the classifier, while the ground-truth labels are given in the left column. For example, the entry '21' in the second row represents the fact that 21 testing samples were predicted as flower, but were actually of the fruit category. The diagonal entries represent predictions that were correct (the predicted and ground-truth labels match). We obtain an accuracy of 91%.

Our results for fine-grained classification of leaf traits are shown in Table 2. Each column represents a category, where each category contains a varying number of classes, so that multiple classification results have been grouped together in different classification tasks. In Table 2, we also compare with a baseline approach which represents random guessing over the classes of each category. The results reported were produced by k-fold cross validation, where 20% of the entire dataset of 549 leaves was used for the classifier's testing accuracy, while the rest of the dataset was used for training the classifier. We repeat this process seven times on different, non-overlapping subsets of the dataset and average the testing set accuracy. The 'Testing accuracy' row contains these averaged testing accuracies after the aforementioned process has been completed.

         Fruit   Flower   Leaf   Stem   Entire
Fruit    1896    21       13     2      7
Flower   24      405      58     2      4
Leaf     46      69       1062   1      33
Stem     9       7        1      596    2
Entire   64      18       99     4      648

Table 1. Coarse classification task: confusion matrix on the testing set. Each cell counts testing samples; the row label (leftmost column) gives the actual class, and the column label (topmost row) gives the predicted class. Classification accuracy on the testing set is 91%.
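As a consistency check, the overall accuracy can be recomputed from the confusion matrix: correct predictions lie on the diagonal, so accuracy is the diagonal sum divided by the total sample count, which comes out close to the reported 91%. The values below are transcribed from Table 1:

```python
# Recompute overall accuracy from the Table 1 confusion matrix:
# rows are actual classes, columns are predicted classes, so the
# diagonal holds the correctly classified samples.
confusion = [
    #       Fruit  Flower  Leaf  Stem  Entire   (predicted)
    [1896,   21,     13,    2,     7],   # actual Fruit
    [  24,  405,     58,    2,     4],   # actual Flower
    [  46,   69,   1062,    1,    33],   # actual Leaf
    [   9,    7,      1,  596,     2],   # actual Stem
    [  64,   18,     99,    4,   648],   # actual Entire
]

correct = sum(confusion[i][i] for i in range(5))
total = sum(sum(row) for row in confusion)
print(f"{correct}/{total} = {correct / total:.1%}")  # 4607/5091 = 90.5%
```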
                  Leaf type   Leaf shape   Leaf base shape   Leaf tip shape   Leaf margin   Leaf venation
Testing accuracy  98.2%       41.7%        37.1%             29.9%            51.2%         96.0%
Random guessing   50%         3.1%         10%               5.5%             2.6%          50%

Table 2. Fine-grained leaf trait classification task: classification accuracy on the testing set for each leaf trait. Unlike Table 1, we show only the accuracy percentages on the test set for the six classification fields.

V. DISCUSSION

User enhancements to our module interface have included differential markup line coloring (as shown in Fig. 2), free-line markup input, classification model selection, and segmentation options. Our module not only provides segmentation of image parts that can later be annotated with ontology terms, but also the mapping of such segments to a predefined set of classes that already carry the corresponding ontology terms, providing users with the information related to the ontology of the selected segment. The ontology terms link directly to the Planteome ontology database to provide all relevant information that the ontology term is associated with. Our module also allows the seamless integration of new deep models through the same interface, for other tasks related to plant classification.

VI. CONCLUSIONS

We have developed an effective suite of guided segmentation and auto-annotation tools that can be used for online image analysis. The integration of deep learning models with segmentation and ontology annotation functionality has provided a strong foundation for plant image analysis, and paves the way for future developments and more sophisticated learning models. Processing of high-resolution digital images and management of high-throughput databases fulfill user needs in an era where more data becomes available as computational costs decrease dramatically. Future work will focus on online retraining of deep learning models, assembly and import of new models for new tasks, synchronous segmentation, more robust ontology metadata services, deeper semantic search integration with the BisQue platform, and public deployment and release of the software and source code (upon peer-reviewed publication; manuscript and hosted site are currently in preparation). We anticipate that further performance improvements could be made by expanding the initial training dataset of images with unannotated examples from the ImageClef dataset and then conducting semi-supervised training of our ResNet model for image classification.

ACKNOWLEDGMENTS

We acknowledge and thank the following curators for testing and using the desktop leaf annotation tool to generate annotated image data for model training purposes: Laurel Cooper, Noor Al-Bader, Austin Meier, and Parul Gupta. We would like to thank Philip Daly and Blake L. Joyce for their assistance and dedication in hosting the Planteome BisQue module on their infrastructure, via the CyVerse Extended Collaborative Support program. We thank the National Science Foundation for project support under the following awards: DBI-0735191, DBI-1265383 (CyVerse), 1340112 (Planteome, OSU), ABI-1356750 (BisQue, UCSB), and SI2-SSI 1650972 (LIMPID, UCSB).

REFERENCES

[1] X. W. Chen and X. Lin, "Big Data Deep Learning: Challenges and Perspectives," IEEE Access, vol. 2, pp. 514-525, 2014. doi: 10.1109/ACCESS.2014.2325029
[2] Planteome, http://planteome.org, licensed under a Creative Commons Attribution 4.0 International License. Based on a work at planteome.org.
[3] V. Jagadeesh and U. Gaur, "Graph Cuts-based Image Matting / Segmentation," unpublished. Online BisQue module: http://bisque.iplantcollaborative.org/module_service/ImageMatting/
[4] N. Lingutla*, J. Preece*, S. Todorovic, L. Cooper, L. Moore, and P. Jaiswal, "AISO: Annotation of Image Segments with Ontologies," Journal of Biomedical Semantics, 5:50, 2014. (*: co-authors)
[5] K. Kvilekval, D. Fedorov, B. Obara, A. Singh, and B. S. Manjunath, "Bisque: A Platform for Bioimage Analysis and Management," Bioinformatics, vol. 26, no. 4, pp. 544-552, Feb. 2010.
[6] Planteome Deep Segmenter software repository: https://github.com/Planteome/planteome-deep-segmenter
[7] N. Merchant et al., "The iPlant Collaborative: Cyberinfrastructure for Enabling Data to Discovery for the Life Sciences," PLOS Biology, 2016. doi: 10.1371/journal.pbio.1002342
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[9] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," Computer Vision and Pattern Recognition (CVPR), 2016.
[10] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper with Convolutions," CVPR, 2015.
[11] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," ICLR, 2015.
[12] Y. Boykov and V. Kolmogorov, "An Experimental Comparison of Min-cut/Max-flow Algorithms for Energy Minimization in Vision," in M. Figueiredo, J. Zerubia, and A. K. Jain (eds), Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR 2001), Lecture Notes in Computer Science, vol. 2134, Springer, Berlin, Heidelberg, 2001.
[13] J. Long*, E. Shelhamer*, and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," CVPR, 2015. arXiv:1411.4038 (*: equal contribution)
[14] L.-C. Chen*, G. Papandreou*, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," arXiv preprint, 2016. (*: equal contribution)
[15] N. Kumar et al., "Leafsnap: A Computer Vision System for Automatic Plant Species Identification," in A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid (eds), Computer Vision – ECCV 2012, Lecture Notes in Computer Science, vol. 7573, Springer, Berlin, Heidelberg, 2012.
[16] PyTorch framework for deep learning: https://pytorch.org/
[17] OpenCV library for computer vision: https://opencv.org/
[18] ImageClef 2013 plant identification challenge: http://www.imageclef.org/2013/plant
[19] Plant Image Desktop Classifier software repository: https://github.com/Planteome/plant-image-desktop-classifier