Regimvid at ImageCLEF 2015 Scalable Concept Image Annotation Task: Ontology-based Hierarchical Image Annotation

Mohamed Zarka, Anis Ben Ammar and Adel M. Alimi
mohamed.ezzarka@ieee.org, ben.ammar.regim@gmail.com, adel.alimi@ieee.org
REGIM: Research Groups on Intelligent Machines, University of Sfax, ENIS, BP 1173, Sfax, 3038, Tunisia
http://www.regim.org/

Abstract. In this paper, we describe our participation in the ImageCLEF 2015 Scalable Concept Image Annotation task. We present our approach to automatic image annotation, which relies on an ontology-based semantic hierarchy handled at both the learning and annotation steps. While recent works have focused on using semantic hierarchies to improve concept detector accuracy, we investigate the use of such hierarchies to reduce detector complexity and thus handle large-scale image datasets efficiently. Our framework is based on two steps: (1) constructing a fuzzy ontology by analyzing the learning dataset, and (2) guiding the annotation process through a reasoning engine. The obtained results confirm that this approach is promising for scalable image annotation.

Keywords: Image Annotation, Classification, Concept Detection, Fuzzy Ontology, Fuzzy Reasoning

1 Introduction

For the ImageCLEF 2015 Scalable Concept Image Annotation task, our aim is to build an automated image annotation framework that addresses scalability by reducing the cost and complexity of semantic concept detection.

Automatic photo annotation is usually cast as a classification problem that consists in assigning a set of semantic concepts to the content of a given image [33, 39]. Image collections are growing at a staggering rate, so retrieval from large-scale image datasets is a challenging task [36, 35, 16, 34]. Access to such enormous content has pushed the image retrieval community to look for advanced approaches and techniques that provide automated and efficient semantic annotation [4, 30, 29].

In the previous ImageCLEF Scalable Concept Image Annotation task, [29] focused on a knowledge-based approach: an ontology was generated and used both (1) in the training phase, to select the images used to optimize the classifiers, and (2) in the testing phase, to deduce new annotations from concept inter-relationships. This approach won the ImageCLEF 2014 Scalable Concept Image Annotation task.

Aspiring to reconcile and exploit the semantic assets provided by knowledge-based approaches, a number of initiatives have investigated knowledge engineering for image retrieval ([26, 2, 10] to cite a few). Indeed, ontologies (as knowledge bases) are powerful tools for modeling concepts and their interrelationships. In general, ontology-based approaches consist in defining a knowledge conceptualization and a reasoning process in order to handle and enhance a semantic interpretation. An immediate effect of such efforts is the alleviation of semantic barriers, and many promising works have emerged [18].

Aiming to contribute in this direction, we presented in [38] a fuzzy-ontology-based framework for enhancing multimedia content indexing accuracy.
That work addressed three main issues raised by existing ontologies: a generic ontology structure, an automated knowledge extraction process for populating the ontology content, and machine-driven context detection for multimedia content. Its outcome is a novel ontology management method intended for machine-driven knowledge base construction. The experiments we conducted on the ImageCLEF 2012 dataset showed semantic improvements over a classical image annotation framework applied to large-scale multimedia content.

Our submitted runs for the ImageCLEF 2015 Scalable Concept Image Annotation task rely on a visual analysis of the provided testing dataset. For visual features, we used the k-means algorithm [32] to cluster local features extracted from the training images with the Surf algorithm [3]. Regarding scalability, our runs aim to show that we can go further by reducing the computing cost: we propose an ontology-based approach that alleviates the cost of labeling a given test image with candidate semantic concepts. From the papers published within the ImageCLEF labs [25, 36, 35, 7, 34], it can clearly be seen that there is a serious focus on scalability through reducing the list of candidate concepts to be analyzed within an image. Mainly, these works divide the candidate concepts into: (1) initial concepts, which can be detected directly by analyzing the image content, and (2) extended concepts, which can be detected by reasoning over the initial ones. In [38], we presented a fuzzy framework for enhancing a semantic interpretation by reasoning over a given initial concept set. In our submitted runs, we focused on detecting initial concepts. Thus, our contribution consists in developing a fuzzy ontology that guides the annotation process by reducing the number of concepts to be detected.

The present working note is organized as follows: Section 2 describes the proposed framework. Section 3 describes our submitted runs for the ImageCLEF 2015 Scalable Concept Image Annotation task, as well as a comparison with the other participants' runs. Finally, a conclusion and some future directions of our work are given in Section 4.

2 The RegimVid Image Annotation System

2.1 Framework overview

RegimVid [11, 12, 19] is a semantic video indexing, retrieval and visualization system developed by our team. In this paper, we propose a scalable image annotation framework based on hierarchical annotators, building on research works on semantic hierarchies for hierarchical image annotation. Our framework relies on constructing and managing a fuzzy ontology that handles a semantic hierarchy. This hierarchy is then used to train more accurate image annotators (see figure 1).

Fig. 1. Ontology-based semantic annotator hierarchy for image annotation

Image annotation is considered as a multi-class classification problem. Many approaches handle the scalability aspect of annotation (a large number of concepts to annotate with) by combining semantic hierarchical structures with classification techniques (such as Svm, Support Vector Machines) [8, 22, 1, 24]. Mainly, two different approaches were proposed for constructing the semantic hierarchy. The first one, qualified as top-down, builds the semantic hierarchy through recursive clustering of the class set [8].
The second one, qualified as bottom-up, defines the hierarchy by agglomerative partitioning of the classes [22]. Furthermore, two different approaches were also proposed for hierarchical image classification: the first is the Binary Hierarchical Decision Trees (Bhdts) [8], and the second is the Decision Directed Acyclic Graphs (Ddags) [15].

Let C = {c1, c2, ..., cN} be a set of N semantic concepts. The Ddags approach trains N(N − 1)/2 binary classifiers and uses a DAG to decide whether an image belongs to a semantic concept class ci ∈ C. At each node at distance d from the tree root, d semantic concept classes have been eliminated and N − d decision nodes remain to be evaluated. The Bhdts approach handles the semantic hierarchy as a binary tree: concept classes are hierarchically clustered into two subsets, and this clustering step is iterated until a set containing a single concept class is reached. For every clustering step, an Svm classifier is trained to decide whether an image should be annotated by the first or the second semantic concept class set. As a result, only about log2(N) Svm classifiers need to be evaluated to analyze a test image. For instance, with N = 100 concepts, Ddags trains 4950 binary classifiers, whereas a balanced binary hierarchy evaluates only about 7 classifiers per test image. Although these two approaches yield accurate classifiers, they handle the semantic hierarchy as a purely binary structure, which becomes considerably large as the number of concept classes grows.

In our proposed framework, we define a new method for constructing hierarchical classifiers for scalable image annotation. First, an annotated image dataset is analyzed to construct the hierarchy tree of concept classes. Then, for every level of the defined tree structure, an Svm is trained to predict whether a test image belongs to the first or the second concept class set. Starting from the first level (root node), the hierarchy is walked until reaching the leaf nodes by computing classifier votes (see figure 2).

Fig. 2. Ontology-based hierarchical image classification

Our method is inspired by fuzzy-decision-tree-based methods [37, 6] for extracting uncertain knowledge in a classification problem. Fuzzy set theory is used to model the tree structure; thus, our approach is based on a fuzzy ontology that handles such a decision tree. In what follows, we discuss the structure of our fuzzy ontology, we show how we populate its content, and we explain how the available knowledge is inferred in order to use the hierarchical classifiers to annotate a test image accurately.

Ontology Structure The ontology structure is based on three conceptual classes: the semantic concept Concept, the hierarchical node Node, and the test image Image. We also define a set of relationships between these conceptual classes (see table 1).

Table 1. Semantic relationships between conceptual classes

Relationship   Definition                              Meaning
isIndexedBy    (⟨Image, Concept⟩ : isIndexedBy) ≥ p1   The image Image is annotated by the concept Concept with a fuzzy weight p1
votesFor       (⟨Node, Image⟩ : votesFor) ≥ p2         The Svm of the node Node votes for the image Image with a fuzzy weight p2
existsIn       (⟨Concept, Node⟩ : existsIn) ≥ p3       The concept Concept exists in the node Node with a fuzzy weight p3
isChildOf      (⟨Node, Node⟩ : isChildOf)              The first node holds a subset of the semantic concepts of the second node

The relationship isChildOf states that a node node1 ∈ Node is a child of another node node2 ∈ Node; it is used to model the semantic hierarchy of concept classes. The relationship existsIn enumerates, for each node node ∈ Node, the contained set of concept classes; a concept concept ∈ Concept can exist in several nodes, but only at separate levels. The relationship votesFor is used while an image is being annotated and the hierarchy is walked from the root node to the leaves: a node node ∈ Node votes for an image image ∈ Image with a fuzzy weight p2 when an Svm classification on that image predicts that it could be annotated by the set of semantic concepts existing in the node. Finally, the relationship isIndexedBy states that an image image ∈ Image is annotated by the concept concept ∈ Concept with a fuzzy weight p1.

The proposed ontology structure is used to handle the hierarchical classifiers, to trace the hierarchy walk when classifying a given test image, and then to model the set of semantic concepts that annotate that image (see figure 3). In what follows, we present the population process of our ontology, then we discuss the reasoning process used to guide and assist the hierarchical annotation.

Fig. 3. Ontology-based hierarchical image classification
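To fix ideas, the following is a purely illustrative, in-memory view of the conceptual classes and fuzzy relationships of table 1. Our framework manages these entities as a fuzzy ontology, not as plain objects, so all class and attribute names below are hypothetical.

```python
# Illustrative sketch of the conceptual classes of Table 1; names are
# hypothetical, the actual entities live in a fuzzy ontology.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Concept:
    name: str


@dataclass
class Node:
    concepts: List[Concept]                # existsIn: concepts held by this node
    children: List["Node"] = field(default_factory=list)  # isChildOf (inverse view)
    classifier: Optional[object] = None    # Svm trained for this node, if any


@dataclass
class Image:
    identifier: str
    votes: List[Tuple[Node, float]] = field(default_factory=list)  # votesFor, weight p2
    annotations: Dict[str, float] = field(default_factory=dict)    # isIndexedBy, weight p1
```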
Ontology population Given a defined set of semantic concepts, we start by clustering it through an analysis of the annotated image dataset provided by the ImageCLEF 2015 Scalable Concept Image Annotation task. First, we apply a binary clustering to the whole concept set and define two new nodes node1 and node2 in the ontology, using the k-means clustering algorithm with k = 2. Then, each concept is instantiated within the ontology, and for every concept concept that belongs to a node node, a new relationship existsIn is instantiated between concept and node. This process is called recursively on node1 and node2 until a sub-node contains only one semantic concept class, or the clustering process is unable to split the given semantic concept classes any further. At each iteration, the newly defined nodes are recorded in the ontology by instantiating the isChildOf relationships.

Hierarchical classifier construction Once the hierarchical structure is defined through the recursive binary clustering described above, an Svm-based classifier is trained for all the nodes that belong to the same level. As training images, we select some development images for every concept that belongs to a node (section 2.2 details the development image dataset used for this training task). At a given level, two kinds of nodes can occur (see figure 4). By exploring the existsIn relationships, we construct a training image dataset. For an internal node (see node Root in figure 4), and for each concept that belongs to that node, a subset of the images annotated by this concept is selected as training images for the corresponding node. For a leaf node containing several concepts (see node Node1 in figure 4), we proceed as follows: let Cm = {c1, c2, ..., ck} be the set of k concepts that belong to the node nodem. We then construct k classifiers, each related to a given concept and trained against the other concepts. For the classifier of a concept cf ∈ Cm, we train an Svm on two image sets: the first contains images annotated by the concept cf, and the second contains images annotated by the other concepts (Cm \ cf). For a leaf node that contains only one concept class (see node Node2 in figure 4), no Svm classifier is constructed: the annotation for this concept is computed from the classification vote that reaches the leaf node.

Fig. 4. Hierarchical SVM classifier construction
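As an illustration of the population step, here is a minimal sketch of the recursive binary clustering (k-means with k = 2), reusing the illustrative Concept and Node classes from the previous sketch. The concept feature vectors are an assumption of this sketch (for example, the mean bag-of-words histogram of each concept's training images); our framework does not depend on this particular choice.

```python
# A minimal sketch of ontology population by recursive binary clustering.
import numpy as np
from sklearn.cluster import KMeans


def build_hierarchy(concepts, features):
    """concepts: list of Concept; features: (len(concepts), dim) array."""
    node = Node(concepts=list(concepts))  # instantiates existsIn relationships
    if len(concepts) <= 1:
        return node  # leaf with a single concept class: no Svm is trained
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)
    if len(set(labels)) < 2:
        return node  # clustering cannot split this concept set any further
    for side in (0, 1):
        mask = labels == side
        subset = [c for c, m in zip(concepts, mask) if m]
        node.children.append(build_hierarchy(subset, features[mask]))
        # Each appended child instantiates an isChildOf relationship.
    # The Svm deciding between the two children would be trained here,
    # on development images selected via the existsIn relationships.
    return node
```

Calling build_hierarchy on the full concept set yields the root node; the recursion stops either at singleton leaves or when k-means can no longer separate the remaining concepts.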
Reasoning We start reasoning from the root node (top node) of the constructed fuzzy tree (see figure 3). For a given node, we compute the membership function values (µ) of the child nodes by firing the corresponding Svm classifiers. The classification results (the votes) are populated into the fuzzy ontology by instantiating the votesFor relationship.

In order to improve the reasoning accuracy and to shorten the decision tree walk (which also reduces the number of Svm classifiers to be fired), we define a fuzziness control threshold θr = 0.1. Given two sub-nodes node1 and node2, firing the Svm classifier at this level provides two membership function values, µ1 for node1 and µ2 for node2, and we compute the difference ∆µ = |µ1 − µ2|. If ∆µ ≤ θr, we cannot be sure that the Svm classifier is discriminative enough to judge whether the content of the test image belongs to the first or to the second node, so we walk both sub-nodes (node1 and node2). In the opposite case (∆µ > θr), the reasoner walks only the node with the greater membership function value (µ). In the example of figure 3, the Svm classifier of the root node computed µ1 = 0.2 for the node Node1 and µ2 = 0.8 for the node Node2; since ∆µ = 0.8 − 0.2 = 0.6 > 0.1, the reasoning algorithm stops walking the node Node1 and proceeds to walk only Node2.

A leaf node can contain either a set of concept classes or a single concept class. In the first case, for every contained concept class, an Svm classifier is fired for that concept against the other contained concept classes. The classification result is populated into the ontology by instantiating the relationship isIndexedBy between the concept class and the test image; the fuzzy weight of the new relationship is computed as the average of the µ values collected from the root node down to the leaf. In the case of a single concept class, a new isIndexedBy relationship is instantiated within the ontology between that concept and the test image, with the fuzzy weight computed as in the first case.

Our proposed fuzzy decision tree reasoner thus assists the annotation of a given test image by firing recursively trained Svm classifiers, in order to reduce the number of concepts to be detected. This optimization should also reduce the computing cost of annotating a given test image. In the next section, we explain how we construct an Svm classifier for each node of the fuzzy hierarchical semantic structure of concept classes.
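The reasoning walk described above can be summarized by the following minimal sketch, reusing the illustrative Node and Image classes from the earlier sketches. membership(node, image) is a hypothetical helper that fires the Svm separating a node's children and returns a value in [0, 1]; for brevity, the one-vs.-rest classifiers fired inside multi-concept leaves are omitted.

```python
# A minimal sketch of the fuzzy decision tree walk.
THETA_R = 0.1  # fuzziness control threshold


def annotate(node, image, membership, mus=()):
    """Walk the fuzzy tree and record isIndexedBy annotations on image."""
    if not node.children:
        # Leaf: the fuzzy weight is the average of the membership values
        # collected along the path from the root node.
        weight = sum(mus) / len(mus) if mus else 0.0
        for concept in node.concepts:
            image.annotations[concept.name] = weight  # isIndexedBy, weight p1
        return
    left, right = node.children
    mu1, mu2 = membership(left, image), membership(right, image)
    image.votes.append((left, mu1))   # votesFor relationships
    image.votes.append((right, mu2))
    if abs(mu1 - mu2) <= THETA_R:
        # The vote is not discriminative enough: walk both sub-nodes.
        annotate(left, image, membership, mus + (mu1,))
        annotate(right, image, membership, mus + (mu2,))
    elif mu1 > mu2:
        annotate(left, image, membership, mus + (mu1,))
    else:
        annotate(right, image, membership, mus + (mu2,))
```

With θr = 0.1, ambiguous votes keep both branches alive, which is why the number of fired classifiers varies from one test image to another (between 6 and 175 in our runs, see section 3.3).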
2.2 Svm Classifier Construction

In our participation in the ImageCLEF 2015 Scalable Concept Image Annotation task, we aimed primarily to evaluate the scalability of our preliminary automatic annotation framework. For the semantic concept detector/annotator, we did not define an original approach, but implemented state-of-the-art bags of quantized local features and linear classifiers learned by support vector machines. Indeed, as pointed out in [28], the bag-of-features and codebook approach has received great attention from the image classification and annotation community, as it has shown notable semantic accuracy [26, 17, 9]. In what follows, we explain how we construct the Svm classifiers for semantic concept detection and annotation.

Construct a learning dataset Image annotation has always been heavily dependent on good development datasets. At first, datasets were mainly hand-collected; recently, however, several research works have attempted to automate this laborious task. Re-ranking images gathered from popular image search engines (Google, Yahoo!, Bing, etc.) makes it possible to automatically construct an image learning dataset [14, 13, 31]. As a development dataset, we did not use the one provided by the ImageCLEF 2015 Scalable Concept Image Annotation task, because not all the concepts were annotated in it. We relied instead on the Flickr image search engine to obtain image sets and construct a learning dataset: we used the information provided with the concept list to query the search engine, and we gathered the first 100 result images for each given concept. At first sight, it may seem curious to use an external data source as a development dataset; our aim is to explore available on-line data sources (such as search engines) to train detectors for concepts that lack annotations.

Local Feature Extraction Our framework extracts features from an input image with a robust local feature extractor, following a basic, state-of-the-art pipeline for this purpose (as described in [28]). Local feature descriptors characterize a pixel within an image by analyzing its neighboring pixels, and many different descriptors and interest-point detectors have been proposed and discussed in the literature. Leading extractors include the Scale Invariant Feature Transform (Sift) and Speeded Up Robust Features (Surf): while the Sift descriptor [23] is considered the most widely used, Surf [3] is known as a local feature extractor robust to various image perturbations. Our framework extracts local features and descriptors using Surf; this choice is motivated by Surf's concise descriptor length (64 floating-point values). The Surf implementation that we used is provided by OpenCv [5]. For query image analysis, local features are extracted and mapped to the nearest computed cluster centroids; the query image is then represented by a vector over the defined visual bag-of-words.

Classification of local features and construction of the bag-of-words model After extracting local features, a bag-of-words model is used to represent these descriptors. The descriptors extracted from the training images are grouped into N clusters of visual words using k-means, and each descriptor is then assigned to its nearest cluster centroid according to the Euclidean distance. For our runs, we chose N = 100; this value balances high bias (under-fitting) against high variance (over-fitting). In order to alleviate the computing cost of k-means clustering, we used Mini Batch k-means [32] as an alternative to the k-means algorithm for clustering massive datasets. Mini Batch k-means reduces the computational cost by handling a fixed-size subsample instead of all the data in the database, which reduces the number of distance computations required at each clustering iteration.

Learning Algorithm The learning algorithm consists in training one-vs.-one linear Svms operating in the bag-of-Surf feature space. Training images are represented by the histogram vectors constructed from the k-means clustering. We used a linear kernel for our Svm-based learning algorithm in view of its simplicity and its computational efficiency in training and classification: K(x, y) = x^T y + c. Svms are fundamentally binary classifiers: for a given detector, an image is assigned to one of two distinct groups. A one-vs.-one scheme is therefore used, in which an Svm is trained for each pair of individual classes. The Svm implementation used in our runs is provided by the Scikit-Learn library [27].

Decision As a decision function, a class membership probability estimate is fitted to the Svm decision values. The Scikit-Learn library uses Platt scaling to calibrate the Svm classifier so that it produces class probabilities in addition to class predictions. Once the Svm is trained, an optimization process fits the parameters A and B such that P(y|X) = 1/(1 + exp(A · f(X) + B)), where f(X) is the signed distance of a sample from the hyperplane.
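To make the pipeline concrete, here is a minimal sketch of the bag-of-Surf representation and the calibrated linear Svm. Surf is patent-encumbered and only shipped with opencv-contrib builds (cv2.xfeatures2d); all helper names below are illustrative, and only the codebook size N = 100 comes from our actual runs.

```python
# A minimal sketch of the bag-of-Surf pipeline, under the assumptions above.
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC

N_WORDS = 100  # codebook size used in our runs
surf = cv2.xfeatures2d.SURF_create()  # 64-dimensional Surf descriptors


def surf_descriptors(path):
    """Extract Surf descriptors from a single image file."""
    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, descriptors = surf.detectAndCompute(image, None)
    return descriptors if descriptors is not None else np.empty((0, 64))


def bow_histogram(descriptors, codebook):
    """Map descriptors to their nearest centroids (Euclidean distance)
    and return a normalized visual-word histogram."""
    hist = np.zeros(N_WORDS)
    if len(descriptors):
        words = codebook.predict(descriptors)
        hist = np.bincount(words, minlength=N_WORDS).astype(float)
    return hist / max(hist.sum(), 1.0)


def build_codebook(train_paths):
    """Cluster all training descriptors into N_WORDS visual words with
    Mini Batch k-means, which works on fixed-size subsamples."""
    stacked = np.vstack([surf_descriptors(p) for p in train_paths])
    return MiniBatchKMeans(n_clusters=N_WORDS).fit(stacked)


def train_classifier(train_paths, train_labels, codebook):
    """One-vs.-one linear Svm on bag-of-words histograms; probability=True
    triggers the Platt-scaled probability estimates used as fuzzy weights."""
    X = np.vstack([bow_histogram(surf_descriptors(p), codebook)
                   for p in train_paths])
    return SVC(kernel="linear", probability=True,
               decision_function_shape="ovo").fit(X, train_labels)
```

In scikit-learn, probability=True performs an internal Platt-scaling calibration, which matches the decision step described above.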
2.3 Object Localization

Our framework does not yet handle concept localization. As future work, we plan to rely on state-of-the-art techniques (such as [21, 20]). In our submitted runs, we considered the whole image content as the localization of all annotated concepts.

3 Experiments and Results

3.1 Submitted Runs

We submitted two runs, the only difference being the threshold applied to the computed annotation weights:

– Run 1 (regimvid at imageclef2015 task1): in this run, we kept all the annotation weights produced by our annotation framework.
– Run 2 (regimvid at imageclef2015 task1 0.7): in this run, we kept only the annotation weights greater than or equal to 0.7.

3.2 Results

We would like to note that we annotated only 300 000 of the 500 000 images provided in the test dataset. Moreover, since many test images were not accessible on-line, we extracted features from the provided low-resolution image thumbnails (compressed in the webupv15 data visual images.zip file). Due to these facts, our runs did not reach an advanced position compared to the other runs (see tables 2 and 3). Furthermore, our system used a state-of-the-art Svm classifier without task-specific tuning. We believe that a complete image annotation on the real (full-size) images with better-tuned Svm classifiers should give better results. The fuzzy-ontology-based semantic enhancement described in [38] should also improve our framework's annotation accuracy.

Table 2. MAP 0-Overlap runs evaluation

                MAP          Run
Best run        0.795403     /SMIVA/21.run
Worst run       0.0305398    /REGIM/regimvid at imageclef2015 task1 0.7.txt
Average         0.31046
Our best run    0.0366072    (position 85/89)

Table 3. MAP 0.5-Overlap runs evaluation

                MAP          Run
Best run        0.659507     /SMIVA/21.run
Worst run       0.000231898  /MLVISP6/run blur1.txt
Average         0.18673
Our best run    0.0161687    (position 75/89)

3.3 Runtime

Training process: training the Svm classifiers took about 4 days (with 100 learning images per concept). This task was executed on a modern machine (Intel i5 processor, 16 GB of RAM).

Annotation process: the annotation task was run on 10 Vps machines (each with a single-core CPU and 1 GB of RAM). Annotating the 300 000 images took about 1 633 hours in total (not accounting for the parallelism across the Vps machines). Our framework annotates a test image in 19.615 seconds on average (the maximum recorded was 597.250 seconds and the minimum 0.066 seconds), and fires an average of 52 Svm classifiers per test image (maximum 175, minimum 6). Our framework has thus reduced the number of Svm classifiers to be fired in order to annotate a given test image.

4 Conclusion

In this working note, we described our annotation framework for the ImageCLEF 2015 Scalable Concept Image Annotation task. We discussed our ontology-based framework for reducing the number of concepts to be detected for a given image.
We developed a state-of-the-art bag-of-words concept detector (based on the Surf feature extractor and k-means clustering). Concept detectors are then selected by reasoning over the fuzzy ontology content, so that not all the concept detectors are used for a given image. In our experiments, we showed how such a method can reduce the number of concept detectors needed to efficiently annotate a large-scale image dataset. While the obtained results were not really impressive, we still believe that our framework can achieve better results by tuning the local feature extraction and training, and by exploiting semantic enhancement through fuzzy reasoning. We are therefore considering potential future directions to further improve our proposed framework.

Acknowledgment

The authors would like to acknowledge the financial support of this work by grants from the General Direction of Scientific Research (DGRST), Tunisia, under the ARUB program. The authors would also like to acknowledge the ImageCLEF 2015 Organising Committee.

References

1. Bannour, H., Hudelot, C.: Hierarchical image annotation using semantic hierarchies. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management. pp. 2431–2434. CIKM '12, ACM, New York, NY, USA (2012)
2. Bannour, H., Hudelot, C.: Building and using fuzzy multimedia ontologies for semantic image annotation. Multimedia Tools and Applications pp. 1–35 (2013)
3. Bay, H., Ess, A., Tuytelaars, T., Gool, L.V.: Speeded-up robust features (SURF). Computer Vision and Image Understanding 110(3), 346–359 (2008)
4. Benavent, X., Castellanos, A., de Ves, E., Hernández-Aranda, D., Granados, R., García-Serrano, A.: A multimedia IR-based system for the photo annotation task at ImageCLEF 2013. In: Working Notes for CLEF 2013 Conference, Valencia, Spain, September 23-26, 2013 (2013)
5. Bradski, G.R., Kaehler, A.: Learning OpenCV. O'Reilly Media, Inc., first edn. (2008)
6. Bujnowski, P., Szmidt, E., Kacprzyk, J.: Intuitionistic fuzzy decision tree: A new classifier. In: Angelov, P., Atanassov, K., Doukovska, L., Hadjiski, M., Jotsov, V., Kacprzyk, J., Kasabov, N., Sotirov, S., Szmidt, E., Zadrożny, S. (eds.) Intelligent Systems'2014, Advances in Intelligent Systems and Computing, vol. 322, pp. 779–790. Springer International Publishing (2015)
7. Cappellato, L., Ferro, N., Jones, G., San Juan, E. (eds.): CLEF 2015 Labs and Workshops, Notebook Papers. No. 994 in CEUR Workshop Proceedings (CEUR-WS.org) (2015), http://ceur-ws.org/Vol-1391
8. Cevikalp, H.: New clustering algorithms for the support vector machine based hierarchical classification. Pattern Recognition Letters 31(11), 1285–1291 (2010)
9. Elleuch, N., Ammar, A.B., Alimi, A.M.: A generic framework for semantic video indexing based on visual concepts/contexts detection. Multimedia Tools Appl. 74(4), 1397–1421 (2015)
10. Elleuch, N., Zarka, M., Ben Ammar, A., Alimi, M.A.: A fuzzy ontology-based framework for reasoning in visual video content analysis and indexing. In: Proceedings of the Eleventh International Workshop on Multimedia Data Mining. MDMKDD '11, ACM, New York, NY, USA (2011)
11. Elleuch, N., Zarka, M., Feki, I., Ammar, A.B., Alimi, A.M.: REGIMVID at TRECVID2010: semantic indexing. In: TRECVID 2010 workshop participants notebook papers, Gaithersburg, MD, USA, November 2010 (2010)
12. Feki, G., Ksibi, A., Ammar, A.B., Amar, C.B.: Regimvid at ImageCLEF 2012: Improving diversity in personal photo ranking using fuzzy logic. In: CLEF 2012 Evaluation Labs and Workshop, Online Working Notes, Rome, Italy, September 17-20, 2012 (2012)
13. Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google's image search. In: Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on. vol. 2, pp. 1816–1823 (Oct 2005)
14. Fergus, R., Perona, P., Zisserman, A.: A visual category filter for Google images. In: Pajdla, T., Matas, J. (eds.) Computer Vision - ECCV 2004, Lecture Notes in Computer Science, vol. 3021, pp. 242–256. Springer Berlin Heidelberg (2004)
15. Gao, T., Koller, D.: Discriminative learning of relaxed hierarchy for large-scale visual recognition. In: Proceedings of the 2011 International Conference on Computer Vision. pp. 2072–2079. ICCV '11, IEEE Computer Society, Washington, DC, USA (2011)
16. Gilbert, A., Piras, L., Wang, J., Yan, F., Dellandrea, E., Gaizauskas, R., Villegas, M., Mikolajczyk, K.: Overview of the ImageCLEF 2015 Scalable Image Annotation, Localization and Sentence Generation task. In: CLEF 2015 Working Notes. CEUR Workshop Proceedings, CEUR-WS.org, Toulouse, France (September 8-11 2015)
17. Kanehira, A., Hidaka, M., Makuta, Y., Tsuchiya, Y., Mano, T., Harada, T.: MIL at ImageCLEF 2014: Scalable system for image annotation. In: Working Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014. pp. 372–379 (2014)
18. Kannan, P., Bala, P., Aghila, G.: A comparative study of multimedia retrieval using ontology for semantic web. In: Advances in Engineering, Science and Management (ICAESM), 2012 International Conference on. pp. 400–405 (March 2012)
19. Ksibi, A., Ammar, B., Ammar, A.B., Amar, C.B., Alimi, A.M.: Regimrobvid: Objects and scenes detection for robot vision 2013. In: Working Notes for CLEF 2013 Conference, Valencia, Spain, September 23-26, 2013 (2013)
20. Lampert, C.H., Blaschko, M., Hofmann, T.: Beyond sliding windows: Object localization by efficient subwindow search. In: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on. pp. 1–8 (June 2008)
21. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. vol. 2, pp. 2169–2178 (2006)
22. Li, L.J., Wang, C., Lim, Y., Blei, D., Fei-Fei, L.: Building and using a semantivisual image hierarchy. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. pp. 3336–3343 (June 2010)
23. Lowe, D.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)
24. McNamara, D.S., Crossley, S.A., Roscoe, R.D., Allen, L.K., Dai, J.: A hierarchical classification approach to automated essay scoring. Assessing Writing 23, 35–59 (2015)
25. Müller, H., Clough, P., Deselaers, T., Caputo, B.: ImageCLEF: Experimental Evaluation in Visual Information Retrieval. Springer Publishing Company, Incorporated, 1st edn. (2010)
26. Mylonas, P., Spyrou, E., Avrithis, Y., Kollias, S.: Using visual context and region semantics for high-level concept detection. Multimedia, IEEE Transactions on 11(2), 229–243 (2009)
27. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011)
28. Piras, L., Giacinto, G.: Open issues on codebook generation in image classification tasks. In: Machine Learning and Data Mining in Pattern Recognition - 10th International Conference, MLDM 2014, St. Petersburg, Russia, July 21-24, 2014. Proceedings. pp. 328–342 (2014)
29. Reshma, I.A., Ullah, M.Z., Aono, M.: KDEVIR at ImageCLEF 2014 scalable concept image annotation task: Ontology based automatic image annotation. In: Working Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014. pp. 386–397 (2014)
30. Sahbi, H.: CNRS - TELECOM ParisTech at ImageCLEF 2013 scalable concept image annotation task: Winning annotations with context dependent SVMs. In: Working Notes for CLEF 2013 Conference, Valencia, Spain, September 23-26, 2013 (2013)
31. Schroff, F., Criminisi, A., Zisserman, A.: Harvesting image databases from the web. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(4), 754–766 (2011)
32. Sculley, D.: Web-scale k-means clustering. In: Proceedings of the 19th International Conference on World Wide Web. pp. 1177–1178. WWW '10, ACM, New York, NY, USA (2010)
33. Snoek, C.G.M., Worring, M.: Concept-based video retrieval. Foundations and Trends in Information Retrieval 2(4), 215–322 (2009)
34. Villegas, M., Müller, H., Gilbert, A., Piras, L., Wang, J., Mikolajczyk, K., de Herrera, A.G.S., Bromuri, S., Amin, M.A., Mohammed, M.K., Acar, B., Uskudarli, S., Marvasti, N.B., Aldana, J.F., del Mar Roldán García, M.: General Overview of ImageCLEF at the CLEF 2015 Labs. Lecture Notes in Computer Science, Springer International Publishing (2015)
35. Villegas, M., Paredes, R.: Overview of the ImageCLEF 2014 scalable concept image annotation task. In: CLEF 2014 Evaluation Labs and Workshop, Online Working Notes (2014)
36. Villegas, M., Paredes, R., Thomee, B.: Overview of the ImageCLEF 2013 scalable concept image annotation subtask. In: CLEF 2013 Evaluation Labs and Workshop, Online Working Notes, Valencia, Spain (2013)
37. Wang, X., Liu, X., Pedrycz, W., Zhang, L.: Fuzzy rule based decision trees. Pattern Recognition 48(1), 50–59 (2015)
38. Zarka, M., Ben Ammar, A., Alimi, A.: Fuzzy reasoning framework to improve semantic video interpretation. Multimedia Tools and Applications pp. 1–32 (2015)
39. Zhang, D., Islam, M.M., Lu, G.: A review on automatic image annotation techniques. Pattern Recogn. 45(1), 346–362 (Jan 2012)