Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments

Iván Cantador, Miriam Fernández, Pablo Castells
Escuela Politécnica Superior
Universidad Autónoma de Madrid
Campus de Cantoblanco, 28049, Madrid, Spain
{ivan.cantador, miriam.fernandez, pablo.castells}@uam.es

ABSTRACT
In this work, we present an extension of CORE [2], a tool for Collaborative Ontology Reuse and Evaluation. The system receives an informal description of a specific semantic domain and determines which ontologies from a repository are the most appropriate to describe the given domain. For this task, the environment is divided into three modules. The first component receives the problem description as a set of terms, and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository, and determines which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. Finally, the third component uses manual user evaluations in order to incorporate a human, collaborative assessment of the ontologies.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – information filtering, retrieval models, selection process.

General Terms
Algorithms, Measurement, Human Factors.

Keywords
Ontology evaluation, ontology reuse, collaborative filtering.

1. INTRODUCTION
The Web can be considered as a live entity that grows and evolves fast over time. The amount of content stored and shared on the web is increasing quickly and continuously. The global body of multimedia resources on the Internet is undergoing a significant growth, reaching a presence comparable to that of traditional text contents. This enlargement results in well-known difficulties and problems, such as finding and properly managing the large amount of sparse information.

To overcome these limitations, the so-called "Semantic Web" trend has emerged with the aim of helping machines to process information, enabling browsers or other software agents to automatically find, share and combine information in consistent ways. At the core of these new technologies, ontologies are envisioned as key elements to represent knowledge that can be understood, used and shared among distributed applications and machines. However, ontological knowledge mining and development are difficult and costly tasks that require major engineering efforts. In this context, ontology reuse becomes an essential need in order to exploit past and current efforts and achievements.

Novel tools that have recently been developed, such as ontology search engines [6], represent an important first step towards automatically assessing and retrieving ontologies which satisfy user queries and requests. However, ontology reuse demands additional efforts to address special needs and requirements from ontology engineers and practitioners. It is necessary to evaluate and measure specific ontology features, such as lexical vocabulary, relations [3], restrictions, consistency, correctness, etc., before making an adequate selection. Some of these features can be measured automatically, but others require human judgment to be assessed.

The Web 2.0 is arising as a new trend where people collaborate and share their knowledge to successfully achieve their goals. Following this aspiration, the aim of this research is to enhance ontology retrieval and recommendation, combining automatic evaluation techniques with explicit users' opinions and experiences. This work follows a previous approach for Collaborative Ontology Reuse and Evaluation over controlled repositories, named CORE [2]. The tool has been enhanced and adapted to the Web. Novel technologies, such as AJAX¹, have been incorporated into the system for the design and implementation of the user interface. It has also been improved to overcome previous limitations, such as handling large numbers of ontologies. The collaborative capabilities have also been extended.

2. SYSTEM ARCHITECTURE
WebCORE is a web application for Collaborative Ontology Reuse and Evaluation. A user logs into the system via a web browser and, thanks to AJAX technology and the Google Web Toolkit², dynamically describes a problem domain, searches for ontologies related to this domain, obtains relevant ontologies ranked by several lexical, taxonomic and collaborative criteria, and evaluates those ontologies that the user likes or dislikes most.

In this section, we describe the server-side architecture of WebCORE. Figure 1 shows an overview of the system. We distinguish three different modules. The first one, the left module, receives the problem description (Golden Standard) as a full text or as a set of initial terms, which can be extended by the user using WordNet [4]. The second one, represented in the centre of the figure, allows the user to select a set of ontology evaluation techniques to recover the ontologies closest to the given Golden Standard. Finally, the third one, on the right of the figure, is a collaborative module that re-ranks the list of recovered ontologies, taking into consideration previous evaluations of the users.

¹ Garrett, J. J. (2005). AJAX: A New Approach to Web Applications. In http://www.adaptivepath.com/
² Google Web Toolkit, http://code.google.com/webtoolkit/

Figure 1. WebCORE architecture

2.1 Golden Standard Definition
The first phase of our ontology recommender system is the Golden Standard definition. The user describes a domain of interest by specifying a set of relevant terms that will be searched for among the concepts (classes or instances) of the ontologies stored in the system. These terms can be obtained automatically by the internal Natural Language Processing (NLP) module, which uses a repository of documents related to the specific domain in which the user is interested. This NLP module accesses the repository of documents, and returns a list of pairs (lexical entry, part of speech) that roughly represents the domain of the problem. Alternatively, the list of initial (root) terms can be manually specified. The module also allows the user to expand the root terms using WordNet [4] and some of the relations it provides: hypernym, hyponym and synonym. The new terms added to the Golden Standard using these relations can be extended again, so new terms can iteratively be added to the problem definition.

The final representation of the Golden Standard is defined as a set of terms T (LG, POS, LGP, R, Z) where:
• LG is the set of lexical entries defined for the Golden Standard.
• POS corresponds to the different Parts Of Speech considered by WordNet: noun, adjective, verb and adverb.
• LGP is the set of lexical entries of the Golden Standard that have been extended.
• R is the set of relations between terms of the Golden Standard: synonym, hypernym, hyponym and root (if a term has not been obtained by expansion, but is one of the initial terms).
• Z is an integer number that represents the depth or distance of a term to the root term from which it has been derived.

Example: T1 = ("genetics", NOUN, "", ROOT, 0). T1 is one of the root terms of the Golden Standard. The lexical entry that it represents is "genetics", its part of speech is "noun", it has not been expanded from any other term so its lexical parent is the empty string, its relation is "root", and its depth is 0.

Figure 2 shows the interface of the Golden Standard Definition phase. On the left side of the screen, the current list of root terms is shown. The user can manually insert new root terms into this list by giving their lexical entries and selecting their parts of speech. When new terms are added, the final Golden Standard definition is immediately updated: the final list of (root and expanded) terms that represent the domain of the problem is shown at the bottom of the figure.

The user can also perform term expansion using WordNet. The user selects one of the terms from the Golden Standard definition and the system shows all its meanings contained in WordNet (top of the figure). After one of them has been chosen, the system presents three different lists with the synonyms, hyponyms and hypernyms of the term. The user can then select one or more elements of these lists and add them to the expanded term list. For each expansion, the depth of the new term is increased by one unit.

In the problem definition phase a collaborative component has been added to the system (right side of Figure 2). This component reads the term currently selected by the user, and searches for all the stored problem definitions that contain it. For each of these problem definitions, the rest of their terms and the number of problems in which they appear are retrieved and shown in the web browser. With this simple strategy, the most popular terms are suggested to the user, which could help to better describe the domain of interest.

Figure 2. WebCORE problem definition phase

2.2 Automatic Ontology Recommendation
Once the user has selected the most appropriate set of terms to describe the problem domain, the tool performs the processes of ontology retrieval and ranking. Our approach to ontology retrieval can be seen as an evolution of classic keyword-based retrieval techniques [5], where textual documents are replaced by ontologies.

The queries supported by our model are expressed using the terms selected during the Golden Standard definition phase. In classic keyword-based vector-space models for information retrieval, each query keyword is assigned a weight that represents the importance of the concept in the information need expressed by the query. Analogously, in our system, the terms included in the Golden Standard are weighted, using the depth measure to indicate the relative interest of the user for each of the terms to be explicitly mentioned in the ontologies.

To carry out the retrieval process, we focus on the lexical level, recovering those ontologies that contain a subset of the terms expressed by the user during the Golden Standard definition. To compute the term matching, two different options are available within the tool: search for exact matches or search for matches based on the Levenshtein distance between two terms. Furthermore, the tool also offers two different search spaces, the ontologies and the corresponding knowledge bases.

Figure 3 shows the system recommendation interface. On the left side the user can select the matching methodology (fuzzy or exact), the search spaces (ontology entities and knowledge base entities), and the weight or importance given to each of the previously selected search spaces. In the right part the user can visualize the ontology and navigate across it. Finally, the middle of the interface presents the list of ontologies selected for the user to be evaluated during the collaborative evaluation phase.

Figure 3. WebCORE system recommendation phase

Let $T$ be the set of all terms defined in the Golden Standard definition phase. Let $d_i$ be the depth measure associated with each term $t_i \in T$. Let $q$ be the query vector extracted from the Golden Standard definition, and let $w_i$ be the weight associated with each of these terms, where for each $t_i \in T$, $w_i \in [0, 1]$. Then, the weight $w_i$ is calculated as:

$$w_i = \frac{1}{d_i + 1}$$

This measure gives more relevance to the terms explicitly expressed by the user, and less importance to those extended or derived from previously selected terms. An interesting line of future work could be to enhance and refine the query, e.g. based on term popularity, or on other more complex strategies such as term frequency analysis.

The search engine computes a semantic similarity value between the query and each ontology as follows. We represent each ontology with a vector $o_j \in O$, where $o_{ji}$ is the mean of the term $t_i$ similarities with all the matched entities in the ontology if any matching exists, and zero otherwise. The components $o_{ji}$ are calculated as:

$$o_{ji} = \frac{\sum_{M_{ji}} w(m_{ji})}{\sum_{M_i} w(m_i)}$$

where $M_{ji}$ is the set of matches of the term $t_i$ in the ontology $o_j$, $w(m_{ji})$ represents the similarities between the term $t_i$ and the entities of the ontology $o_j$ that match it, $M_i$ is the set of matches of the term $t_i$ within all the ontologies, and $w(m_i)$ represents the weights of each of these matches.

For example, if we define in the Golden Standard a term "acid", this term may return several matches in the same ontology with different entities such as "acid", "amino acid", etc. In order to establish the appropriate weight in the ontology vector, $o_{ji}$, the goal is to compute the number of matches of one term in the whole repository of ontologies and give more relevance to those ontologies that have matched that specific term more times.

Each component $o_{ji}$ contains specific information about the similarity between the ontology and the corresponding term $t_i$. To compute the final similarity between the query vector $q$ and the ontology vector $o_j$, the vector-space model calculates the cosine measure between both vectors. However, if we follow the traditional model, we will only be considering the difference between the query and the ontology vectors according to the angle they form, without taking into account their magnitudes. To overcome this limitation, the cosine measure has been replaced by the simple dot product. Hence, the similarity measure between an ontology $o_j$ and the query $q$ is simply computed as follows:

$$\mathrm{sim}(q, o_j) = q \cdot o_j$$

If the knowledge in the ontology is incomplete, the ontology ranking algorithm performs very poorly. Queries will return fewer results than expected, and relevant ontologies will not be retrieved, or will get a much lower similarity value than they should. For instance, if there are ontologies about "restaurants", and "dishes" are expressed as instances in the corresponding Knowledge Base (KB), a user searching for ontologies in this domain may also be interested in the instances and literals contained in the KB. To cope with this issue, our ranking model combines the similarity obtained from the terms that belong to the ontology with the similarity obtained from the terms that belong to the KB, using the adaptation of the vector space model explained before. The user can select a value $v_i \in [1, 5]$ for each kind of search, and this value is then mapped to a corresponding value $s_i = v_i / 5$. Following this idea, the final score is computed as:

$$s_O \times \mathrm{sim}(q, o) + s_{kb} \times \mathrm{sim}(q, kb)$$

2.3 Collaborative Ontology Evaluation
The third and last phase of the system consists of a novel ontology recommendation algorithm that exploits the advantages of Collaborative Filtering [1], exploring the manual evaluations stored in the system to rank the set of ontologies that best fulfils the user's interests.

In WebCORE, user evaluations are represented as a set of five different criteria and their respective values, manually determined by the users who made the evaluations: correctness, readability, flexibility, level of formality and type of model.

The above criteria can have discrete numeric or non-numeric values. The user's interests are expressed as a subset of these criteria and their respective values, which act as thresholds or restrictions to be satisfied by user evaluations. Thus, a numeric criterion will be satisfied if an evaluation value is equal to or greater than that expressed by its interest threshold, while a non-numeric criterion will be satisfied only when the evaluation is exactly the given threshold (i.e. in a Boolean or yes/no manner).

According to both types of user evaluation and interest criteria, numeric and Boolean, the recommendation algorithm will measure the degree to which each user restriction is satisfied by the evaluations, and will recommend a ranked ontology list according to similarity measures between the thresholds and the collaborative evaluations.

Figure 4 shows all the previous definitions and ideas, locating them in the graphical interface of the system. On the left side of the screen, the user introduces the thresholds for the recommendations and obtains the final collaborative ontology ranking. On the right side, the user adds new evaluations for the ontologies and checks evaluations given by the rest of the users.

Figure 4. WebCORE user evaluation phase

3. EXPERIMENTS
In this section, we present some early experiments that attempt to measure: a) the gain in efficiency and effectiveness, and b) the increase in users' satisfaction obtained with the use of our system when searching for ontologies within a specific domain.

The scenario of the experiments was the following. A repository of thirty ontologies was considered and eighteen subjects participated in the evaluations. They were Computer Science Ph.D. students of our department, all of them with some expertise in modeling and exploitation of ontologies. They were asked to search and evaluate ontologies with WebCORE in three different tasks. For each task and each student, one of the following problem domains was selected: family, genetics and restaurant.

In the repository, there were six different ontologies related to each of the above domains, and twelve ontologies describing other, unrelated knowledge areas. No information about the domains and the existing ontologies was given to the students.

Tasks 1 and 2 were performed first without the help of the collaborative modules of the system, i.e., the term recommender of the problem definition phase and the collaborative ranking of the user evaluation phase. After all users finished the previous ontology searches and evaluations, task 3 was done with the collaborative components activated. For each task and each student, we measured the time spent, and the number of ontologies retrieved and selected ('reused'). We also asked the users about their satisfaction (on a 1-5 rating scale) with each of the selected ontologies and the collaborative modules.

Tables 1 and 2 contain a summary of the obtained results. Note that measures of task 1 are not shown. We have decided not to consider them for evaluation purposes because we regard the first task as a learning process in the use of the tool, and its execution times and numbers of selected ontologies as skewed, non-objective measures.

To evaluate the enhancements in terms of efficiency and effectiveness, we present in Table 1 the average number of reused ontologies and the average execution times for tasks 2 and 3. The results show a notable improvement when the collaborative modules of the system were activated: about 26% more reused ontologies and roughly 24% shorter execution times. In all cases, the students made use of the terms and evaluations suggested by others, accelerating the processes of problem definition and relevant ontology retrieval.

Table 1. Average number of reused ontologies and execution times (in minutes) for tasks 2 and 3

                      Task 2 (without           Task 3 (with              % improvement
                      collaborative modules)    collaborative modules)
# reused ontologies   3.45                      4.35                      26.08
execution time        9.3                       7.1                       23.8

On the other hand, Table 2 shows the average degrees of satisfaction expressed by the users about the retrieved ontologies and the collaborative modules. Again, the results are favourable to our approach.

Table 2. Average satisfaction values (1-5 rating scale) for ontologies reused in tasks 2 and 3, collaborative recommendations and rankings

Task 2    Task 3    % improvement    Initial term recommendation    Final ontology ranking
3.34      3.56      6.58             4.7                            4.4

4. CONCLUSIONS AND FUTURE WORK
In this paper, a web application for ontology evaluation and reuse has been presented. The novel aspects of our proposal include the use of WordNet to help users to define the Golden Standard; a new ontology retrieval technique based on traditional Information Retrieval models; rank fusion techniques to combine different ontology evaluation measures; and two collaborative modules: one that suggests the most popular terms for a given domain, and one that recommends lists of ontologies with a multi-criteria strategy that takes into account user opinions about ontology features that can only be assessed by humans.

5. ACKNOWLEDGMENTS
This research was supported by the Spanish Ministry of Science and Education (TIN2005-06885 and FPU program).

6. REFERENCES
[1] Adomavicius, G., and Tuzhilin, A.: Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17(6): 734-749, 2005.
[2] Fernández, M., Cantador, I., and Castells, P.: CORE: A Tool for Collaborative Ontology Reuse and Evaluation. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[3] Maedche, A., and Staab, S.: Measuring similarity between ontologies. Proceedings of the 13th European Conference on Knowledge Acquisition and Management (EKAW 2002). Madrid, Spain, 2002.
[4] Miller, G. A.: WordNet: A lexical database for English. New horizons in commercial and industrial Artificial Intelligence. Communications of the Association for Computing Machinery, 38(11): 39-41, 1995.
[5] Salton, G., and McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983.
[6] Swoogle - Semantic Web Search Engine. http://swoogle.umbc.edu
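
As an illustration of the ranking model described in Section 2.2, the following is a minimal Python sketch of the depth-based term weights, the ontology vector components, the dot-product similarity and the combined ontology/KB score. It is not the WebCORE implementation: the function names, the `Matches` input shape, the ontology names `onto_a` and `onto_b`, and the expanded term "heredity" are all assumptions made for the example, and the vector component is computed as the ratio of the matches found in one ontology to the matches found in the whole repository.

```python
from typing import Dict, List

# Hypothetical input shape: for each ontology name, the similarity values
# of every entity matched by each Golden Standard term.
Matches = Dict[str, Dict[str, List[float]]]


def term_weight(depth: int) -> float:
    """Depth-based weight 1 / (depth + 1): root terms (depth 0) weigh 1."""
    return 1.0 / (depth + 1)


def ontology_vector(name: str, terms: List[str], matches: Matches) -> List[float]:
    """Per-term share of match weight held by one ontology (0 if no match)."""
    vector = []
    for term in terms:
        local = sum(matches[name].get(term, []))
        total = sum(sum(m.get(term, [])) for m in matches.values())
        vector.append(local / total if total > 0 else 0.0)
    return vector


def similarity(depths: Dict[str, int], name: str, matches: Matches) -> float:
    """Dot product between the query weights and the ontology vector."""
    terms = list(depths)
    query = [term_weight(depths[t]) for t in terms]
    onto = ontology_vector(name, terms, matches)
    return sum(q * o for q, o in zip(query, onto))


def final_score(sim_onto: float, sim_kb: float, v_onto: int, v_kb: int) -> float:
    """Combine ontology and KB similarities, mapping each v in [1, 5] to v / 5."""
    return (v_onto / 5) * sim_onto + (v_kb / 5) * sim_kb


# Illustrative data: one root term and one (assumed) expanded term.
depths = {"genetics": 0, "heredity": 1}
onto_matches: Matches = {
    "onto_a": {"genetics": [1.0], "heredity": [1.0]},
    "onto_b": {"genetics": [1.0]},
}
sim_a = similarity(depths, "onto_a", onto_matches)
sim_b = similarity(depths, "onto_b", onto_matches)
ranking = sorted(onto_matches, key=lambda n: similarity(depths, n, onto_matches),
                 reverse=True)
```

In this toy repository, `onto_a` matches both terms and is therefore ranked above `onto_b`, which matches only the root term. Because the plain dot product is used instead of the cosine, an ontology that matches more of the weighted terms is not normalised back down by its vector length.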