Imaging Words – Wording Images
Adrian Popescu, Gregory Grefenstette, Cristophe Millet, Pierre-Alain Moëllic, Patrick Hède
All authors are with Commissariat à l’Energie Atomique – LIST, France.

Abstract: The rapid growth of Internet information sources has led to organizing proposals such as the Semantic Web initiative, whose ontological level provides a formal structure for this disparate data. But given the amount of information to be treated even in a restricted domain, manual organization rapidly becomes unmanageable, and automatic methodologies for ontology building are required. Here we describe techniques for the automatic construction of an ontology based on multimedia data (text and images) for a specific class of objects, manmade tools. Our approach combines modification of existing lexical resources with search engine querying in order to obtain raw images. These images are then clustered into representative concepts for the ontology. Our automated approach can be applied to any subset of physical objects.

Index Terms: Image, Ontology, OWL, Semantic Web, WordNet

I. INTRODUCTION

As proven by initiatives like CYC [2], the manual construction of large scale ontologies is a costly effort, and it is unrealistic to think that this approach can meet current needs for knowledge organization. This is especially true for highly dynamic resources like the WWW, where the increase in knowledge resources follows an exponential curve. The Semantic Web, with its description of content in ontologies, has been presented as a potential solution to the information structuring problem. But, as underlined in [1], a vicious circle is created: the Semantic Web depends on the existence of metadata, and metadata in turn relies on the existence of a well populated Semantic Web. One way to cope with this problem is to develop automatic or semiautomatic methodologies for ontology construction. Interesting results for automatic lexical ontology building are reported in [1].

In this paper, we describe our technique for automatically filling multimedia ontologies, grounding each concept in text and images. After a transposition of parts of WordNet [5] into OWL (Web Ontology Language) in order to create a taxonomical base, we have lexical information associated with the concepts. For the image part of the grounding, we query the Web to gather pictures corresponding to objects in the taxonomy; these pictures are then clustered and filtered.

The rest of this paper is structured as follows: we discuss a translation of WordNet to OWL, we describe our image gathering and clustering tool and, before concluding, we present some preliminary results of our method for image ontology construction.

II. WORDNET TO OWL

A. Automatic Ontology Construction

Our current work deals with the automatic construction of an image-grounded ontology. In order to build such a formal structure automatically, we need an associated taxonomy. Two main possibilities are open to us: learning a taxonomy from concepts found on the Web [1] or using one from an existing resource. We have chosen the second variant and used WordNet [5] as the source for our taxonomy. Thus, we preserve the automatic character of our methodology and are able to exploit the richness of a resource that was manually constructed by lexicographers. We are aware of the criticisms raised by the transformation of WordNet into a formal ontology [4], but with the implementation choices we have made, we try to minimize their effects. Notably, our method only addresses picturable objects, which are ontologically less controversial than high level concepts.

The approach we propose is domain independent. It depends only on the knowledge contained in the resource we parsed. For illustration purposes only, the examples given here are subconcepts of tool in WordNet.

The envisioned application, the construction of a structured image catalogue, led us to convert only part of the information contained in WordNet to OWL. We transformed the sets of synonyms (synsets) into OWL classes, preserving the sense separation. Thus, knife from the lexical hierarchy becomes knife__1 in the ontology, while the synset garden tool, lawn tool is transformed to garden_tool__1. Lawn tool is saved in an RDFS comment as another member of the garden_tool__1 class. We also carried the term definitions over into the ontology.

Image clusters are associated exclusively with leaf concepts in the OWL ontology. The rationale for this decision is that, using the hyponymy relation, we can propose image sets for all concepts in the ontology. Moreover, the leaf terms are generally specialized concepts that point towards precise entities [6], which are less ambiguous both in language and in the associated picture representation.

III. IMAGE CLUSTERING MODULE

We propose a second structuring axis in our image catalogue. The use of an ontology allows inter-class organization, while an image clustering tool provides means for intra-class structure. A clustering process was run for each leaf concept in the ontology. This process consists of two steps: image indexing and clustering following visual similarity.

Fig. 1. Selection of images for knife using Google Image.

A. Image indexing

We deal with pictures from broad domains and we need a general image indexing technique. Using an approach based on border/interior pixel classification [7], we construct two histograms for each image, one for border pixels and one for pixels in interior regions. This indexing algorithm is fast and simple, and it provides information about the colors in the image and, equally important, about the sizes of image regions having a constant color (possibly objects). It leads to the construction of a vector containing 128 elements for each picture.
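A minimal sketch of such a border/interior descriptor is given below for illustration. It is not the implementation used in this work: the function name bic_descriptor, the quantization to 64 colors (2 bits per RGB channel, which gives 2 x 64 = 128 elements), the 4-neighbor interior test, and the normalization by pixel count are all assumptions chosen to match the stated vector length.

```python
import numpy as np

def bic_descriptor(image, bits_per_channel=2):
    """Border/interior color histograms, concatenated into one vector.

    image: uint8 array of shape (H, W, 3).
    With 2 bits per channel there are 4**3 = 64 quantized colors, so the
    result has 2 * 64 = 128 elements (an assumed split; the text only
    states the total length).
    """
    image = np.asarray(image, dtype=np.uint8)
    levels = 1 << bits_per_channel
    q = (image >> (8 - bits_per_channel)).astype(np.int64)
    # One integer color index per pixel, in [0, levels**3).
    colors = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
    n_colors = levels ** 3

    # A pixel is "interior" when its four neighbors share its quantized
    # color; pixels on the outer edge of the image are counted as border.
    interior = np.zeros(colors.shape, dtype=bool)
    center = colors[1:-1, 1:-1]
    interior[1:-1, 1:-1] = (
        (center == colors[:-2, 1:-1]) & (center == colors[2:, 1:-1])
        & (center == colors[1:-1, :-2]) & (center == colors[1:-1, 2:])
    )

    border_hist = np.bincount(colors[~interior], minlength=n_colors)
    interior_hist = np.bincount(colors[interior], minlength=n_colors)
    # Normalize by the pixel count so images of different sizes compare.
    return np.concatenate([border_hist, interior_hist]) / colors.size

# Usage: a flat red 4x4 image has 12 border pixels and 4 interior pixels.
flat_red = np.zeros((4, 4, 3), dtype=np.uint8)
flat_red[..., 0] = 255
vector = bic_descriptor(flat_red)
print(vector.shape)  # (128,)
```

Keeping the two histograms separate is what lets the descriptor distinguish large uniform regions (mass in the interior half) from textured or fragmented ones (mass in the border half).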
We use the Riemann distance as the similarity measure between two images. Distances are calculated between all pairs of images.

B. Image clustering

The indexed images are clustered using a k-SNN (Shared Nearest Neighbors) algorithm [3]. For each image, the algorithm considers a neighborhood of k images. The similarity of two images is assessed by the degree of overlap of their neighborhoods. Next, the pictures that are most similar to their neighbors are taken as topic images, and clusters are structured around them. A useful feature of the algorithm is that it does not impose the classification of all indexed images: pictures considered weakly related to topics remain unclustered. This feature is important in our application, as we work in a noisy environment (many images on the Web are not annotated in direct relation to their visual content). We thus hope to isolate images that are irrelevant for the desired object and to build highly coherent clusters of images containing it. Given that the classification is entirely automatic, some noise remains in the clusters, but the results obtained seem more coherent than the set of images initially retrieved, though we have not yet performed an extensive evaluation.

IV. PRELIMINARY RESULTS

As already stated, our purpose here is to build a structured image catalogue using images from the Web. Instead of querying for images for all concepts in the ontology, we perform this operation for leaves only and, via hyponymy, propose picture sets for all other concepts in the hierarchy. This results in a structured presentation of results, while taking advantage of the fact that the image sets associated with leaves are less noisy (they correspond to well defined entities in the world [6]). An example of the results obtained is presented for knife in two situations. We use Google Image for the pictures in fig. 1 and our method (ontology for inter-class structure and clustering for intra-class organization) for those in fig. 2.

Fig. 2. Selection of images for knife using ontologies and image clustering.

We observe that the images in fig. 2 better illustrate the notion of knife and are ontologically and visually organized, which is not the case for fig. 1. Extensive evaluations are needed in order to assess whether the proposed method performs better than existing ones in image retrieval tasks.

REFERENCES

[1] P. Cimiano, A. Hotho, and S. Staab, “Comparing Conceptual, Divisive and Agglomerative Clustering for Learning Taxonomies from Text”, in Proc. of ECAI 2004, Valencia, Spain, 2004, pp. 435–439.
[2] CYC, www.cyc.com
[3] L. Ertoz, M. Steinbach, and V. Kumar, “Finding Topics in Collections of Documents: A Shared Nearest Neighbor Approach”, in: Wu, W., Xiong, H., Shekhar, S. (eds.): Clustering and Information Retrieval, Kluwer, 2003.
[4] A. Gangemi, R. Navigli, and P. Velardi, “The OntoWordNet Project: Extension and Axiomatisation of Conceptual Relations in WordNet”, in Proc. of CoopIS/DOA/ODBASE 2003, Catania, Sicily, Italy, 2003, pp. 689–706.
[5] G. A. Miller, “Nouns in WordNet: a Lexical Inheritance System”, International Journal of Lexicography, 3(4), 1990, pp. 245–264.
[6] E. Rosch, C. Mervis, C. B. Gray, D. M. Johnson, P. Boyes-Braem, “Basic Objects in Natural Categories”, Cognitive Psychology, 8, 1976, pp. 382–439.
[7] R. O. Stehling, M. A. Nascimento, and A. X. Falcao, “A Compact and Efficient Image Retrieval Approach Based on Border/Interior Pixel Classification”, in Proc. of the Eleventh International Conference on Information and Knowledge Management, McLean, Virginia, USA, ACM Press, 2002, pp. 102–109.
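
As a supplementary illustration of the shared-nearest-neighbor clustering described in Section III.B, the sketch below shows one simple way the idea can be realized from a precomputed distance matrix. It is a simplified variant written for this note, not the algorithm of [3] nor the implementation used in this work: the function name snn_cluster and the parameters min_shared and min_links are hypothetical, and the thresholds are arbitrary.

```python
import numpy as np

def snn_cluster(dist, k=3, min_shared=2, min_links=2):
    """Simplified shared-nearest-neighbor clustering (illustrative only).

    dist: square matrix of pairwise distances.
    Returns one label per point; -1 marks points left unclustered.
    min_shared and min_links are assumed thresholds, not from the paper.
    """
    dist = np.array(dist, dtype=float)       # copy, so the input is untouched
    n = dist.shape[0]
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbor
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    neighbors = [set(row.tolist()) for row in knn]

    # Similarity = number of shared neighbors, kept only for mutual neighbors.
    snn = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in neighbors[i]:
            if i in neighbors[j]:
                snn[i, j] = len(neighbors[i] & neighbors[j])

    strong = snn >= min_shared
    # "Topic" points are those with enough strong links to their neighbors.
    is_topic = strong.sum(axis=1) >= min_links

    labels = np.full(n, -1)
    next_label = 0
    # Topic points connected through strong links form the cluster cores.
    for seed in range(n):
        if not is_topic[seed] or labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(strong[i]):
                if is_topic[j] and labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1

    # Other points join the topic they are most strongly linked to, if any;
    # points without a strong link to a topic stay at -1 (unclustered).
    for i in range(n):
        if labels[i] == -1:
            links = np.where(is_topic & strong[i], snn[i], 0)
            if links.max() > 0:
                labels[i] = labels[int(links.argmax())]
    return labels

# Usage: two tight groups on a line plus one far-away outlier.
positions = np.array([0, 1, 2, 3, 20, 21, 22, 23, 100], dtype=float)
dist = np.abs(positions[:, None] - positions[None, :])
labels = snn_cluster(dist, k=3)
print(labels.tolist())  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The outlier stays unlabeled because no other point lists it among its own k nearest neighbors, which mirrors the property noted in Section III.B that weakly related pictures remain unclustered.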