=Paper= {{Paper |id=Vol-273/paper-6 |storemode=property |title=Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments |pdfUrl=https://ceur-ws.org/Vol-273/paper_8.pdf |volume=Vol-273 |dblpUrl=https://dblp.org/rec/conf/www/CantadorFC07 }} ==Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments== https://ceur-ws.org/Vol-273/paper_8.pdf
        Improving Ontology Recommendation and Reuse in
            WebCORE by Collaborative Assessments
                                    Iván Cantador, Miriam Fernández, Pablo Castells
                                                   Escuela Politécnica Superior
                                                 Universidad Autónoma de Madrid
                                            Campus de Cantoblanco, 28049, Madrid, Spain
                            {ivan.cantador, miriam.fernandez, pablo.castells}@uam.es

ABSTRACT
In this work, we present an extension of CORE [8], a tool for Collaborative Ontology Reuse and Evaluation. The system receives an informal description of a specific semantic domain and determines which ontologies from a repository are the most appropriate to describe the given domain. For this task, the environment is divided into three modules. The first component receives the problem description as a set of terms, and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository, and determines which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. Finally, the third component uses manual user evaluations in order to incorporate a human, collaborative assessment of the ontologies. The new version of the system incorporates several novelties, such as its implementation as a web application; the incorporation of an NLP module to manage the problem definitions; modifications to the automatic ontology retrieval strategies; and a collaborative framework to find potentially relevant terms according to previous user queries. Finally, we present some early experiments on ontology retrieval and evaluation, showing the benefits of our system.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – information filtering, retrieval models, selection process.

General Terms
Algorithms, Measurement, Human Factors.

Keywords
Ontology evaluation, ontology reuse, rank fusion, collaborative filtering, WordNet.

1. INTRODUCTION
The Web can be considered a live entity that grows and evolves quickly over time. The amount of content stored and shared on the Web is increasing continuously. The global body of multimedia resources on the Internet is undergoing significant growth, reaching a presence comparable to that of traditional text content. This expansion results in well-known difficulties, such as finding and properly managing the existing mass of sparse information.

To overcome these limitations, the so-called "Semantic Web" trend has emerged with the aim of helping machines process information, enabling browsers and other software agents to automatically find, share and combine information in consistent ways. As put by Tim Berners-Lee in 1999, "I have a dream for the Web in which computers become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A 'Semantic Web', which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The 'intelligent agents' people have touted for ages will finally materialize".

At the core of these new technologies, ontologies are envisioned as key elements to represent knowledge that can be understood, used and shared among distributed applications and machines. However, ontological knowledge mining and development are difficult and costly tasks that require major engineering efforts. Developing an ontology from scratch requires the expertise of at least two different individuals: an ontology engineer, who ensures correctness during ontology design and development, and a domain expert, responsible for capturing the semantics of a specific field in the ontology. In this context, ontology reuse becomes essential in order to exploit past and current efforts and achievements.

In this scenario, it is also important to emphasize that ontologies, like content, do not stop evolving and growing within the Web. They are part of its wave of growth and evolution, and they need to be managed and kept up to date in distributed environments. From this perspective, the initial efforts to collect ontologies in libraries [17] are not sufficient, and novel technologies are necessary to successfully retrieve this special kind of content.

Novel tools have recently been developed; ontology search engines [24], for instance, represent an important first step towards automatically assessing and retrieving ontologies that satisfy user queries and requests. However, ontology reuse demands additional efforts to address the special needs and requirements of ontology engineers and practitioners. It is necessary to evaluate and measure specific ontology features, such as lexical vocabulary, relations [11], restrictions, consistency, correctness, etc., before making an adequate selection. Some of these features can be measured automatically, but others, like correctness or the level of formality, require a human judgment to be assessed.

In this context, Web 2.0 is arising as a new trend where people collaborate and share their knowledge to successfully achieve their goals. New search engines like Technorati1 exploit blogs with the aim of finding not only the information the user is looking for, but also the experts who might best answer the user's requirements. As put by David Sifry, one of the founders of Technorati, in an interview for a Spanish newspaper, "Internet has been transformed from the great library to the great conversation".

1 Technorati, blog search engine, http://technorati.com/

Following this aspiration, the work presented here aims to enhance ontology retrieval and recommendation by combining automatic evaluation techniques with explicit users' opinions and experiences. This work follows a previous approach to Collaborative Ontology Reuse and Evaluation over controlled repositories, named CORE [8]. For the work reported in this paper, the tool has been enhanced and adapted to the Web. Novel technologies, such as AJAX2, have been incorporated into the system for the design and implementation of the user interface. The tool has also been modified and improved to overcome previous limitations, such as handling large numbers of ontologies. The collaborative capabilities have also been extended within two different frameworks. Firstly, during the problem definition phase, the system helps users express their needs and requirements by showing problem descriptions previously given by other users. Secondly, during the ontology retrieval phase, the system helps users enhance the automatic recommendations by using other users' evaluations and comments.

2 Garrett, J. J. (2005). AJAX: A New Approach to Web Applications. http://www.adaptivepath.com/

Following Leonardo Da Vinci's words, "Wisdom is the daughter of experience", our tool aims to take a step forward in helping users be wise in exploiting other people's experience and expertise.

The rest of the paper is organized as follows. Section 2 summarizes relevant work related to our system. Its architecture is described in Section 3. Section 4 contains empirical results obtained from early experiments with a prototype of the system. Finally, conclusions and future research lines are given in Section 5.

2. RELATED WORK
2.1 Ontology Evaluation
Two well-known scenarios for ontology reuse have been identified in the Semantic Web area. The first one addresses the common problem of finding the most adequate ontologies for a specific domain. The second envisions the less common but real situation in which Semantic Web applications need to automatically and dynamically find an ontology. In this work, we focus our attention on the first scenario, where users are the ones who express their information needs. In this scenario, ontology reuse involves several areas such as ontology evaluation, selection, search and ranking.

Several ontology libraries and search engines have been developed in the last few years to address the problem of ontology search and retrieval. [6] presents a complete study of ontology libraries (WebOnto, Ontolingua, SHOE, etc.), where their functionalities are evaluated according to different criteria such as ontology management, ontology adaptation and ontology standardization. Although ontology libraries are a good temporary solution for ontology retrieval, they suffer from the limitation of not being open to the Web. In that sense, Swoogle [24] constitutes one of the biggest efforts carried out to crawl, index and search for ontologies distributed across the Web.

To obtain the most appropriate ontology and fulfil ontology engineers' requirements, search engines and libraries should be complemented with evaluation methodologies.

Ontology evaluation can be defined as assessing the quality and adequacy of an ontology for use in a specific context, for a specific goal. From our perspective, ontology evaluation constitutes the cornerstone of ontology reuse, because it faces the complex task of evaluating, and consequently selecting, the most appropriate ontology in each situation.

An overview of ontology evaluation approaches is presented in [4], where four different categories are identified: those that evaluate an ontology by comparing it to a Golden Standard [11]; those that evaluate ontologies by plugging them into an application and measuring the quality of the results the application returns [16]; those that evaluate ontologies by comparing them to unstructured or informal data (e.g. text documents) [5]; and those based on human interaction to measure ontology features not recognizable by machines [10]. In each of the above approaches several evaluation levels are identified: lexical, taxonomical, syntactic, semantic, contextual, and structural, among others. Table 1 summarizes these ideas.

Table 1. An overview of approaches to ontology evaluation

Level                                      | Golden Standard | Application based | Data driven | Assessment by humans
Lexical entries, vocabulary, concept, data |        X        |         X         |      X      |          X
Hierarchy, taxonomy                        |        X        |         X         |      X      |          X
Other semantic relations                   |        X        |         X         |      X      |          X
Context, application                       |                 |         X         |             |          X
Syntactic                                  |        X        |                   |             |          X
Structure, architecture, design            |                 |                   |             |          X

Once the ontologies have been searched, retrieved and evaluated, the next step is to select the most appropriate one, that which fulfils the user or application goals. Some approaches to ontology selection have been addressed in [20] and complemented in [19], where a complete study is presented to determine the connections between ontology selection and evaluation.

When the user, and not the application, is the one who demands an ontology, the selection task should be less categorical, returning not just one result but the set of the most suitable ones. To sort these results according to the evaluation criteria, several ontology ranking measures have been proposed in the literature. Some of them are presented in [2] and [3]. Both works aim to go beyond the approaches based on the PageRank algorithm [24], where ontologies are ranked considering the number of links between them, because that ranking methodology does not work for ontologies with poor connectivity and few referrals from other ontologies.
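As the abstract notes, WebCORE returns one ranked list of ontologies per evaluation criterion and combines the lists by rank fusion. As an illustration only, a minimal Borda-count fusion over hypothetical ontology identifiers could look like the sketch below; the paper does not state in this section which fusion technique the system actually employs, so both the method choice and the data are assumptions.

```python
def borda_fuse(rankings):
    """Fuse several ranked lists (best item first) into one ranking.

    Each list awards len(list) - position points to every item it
    contains; points are summed across lists, and ties are broken
    alphabetically so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - position)
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical per-criterion rankings of ontology identifiers
lexical       = ["onto_A", "onto_B", "onto_C"]
taxonomic     = ["onto_B", "onto_A", "onto_C"]
collaborative = ["onto_B", "onto_C", "onto_A"]

fused = borda_fuse([lexical, taxonomic, collaborative])
# fused == ["onto_B", "onto_A", "onto_C"]
```

Borda count is attractive here because it needs only the rank positions, not comparable scores across criteria; score-based alternatives (e.g. CombSUM over normalized scores) would require the per-criterion measures to share a scale.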
As shown above, current ontology reuse approaches take advantage of ontology evaluation, search, retrieval, selection and ranking methodologies. All these areas contribute to the process of ontology evaluation and reuse, but they do not exploit others related to the well-known Recommender Systems [1]: is it helpful to know other users' opinions in order to evaluate and select the most suitable ontology?

The collaboration between users has been addressed in the area of ontology design and construction [23]. In [14], the necessity of mechanisms for ontology maintenance is presented under scenarios like "ontology development in collaborative environments". Moreover, works such as [7] present tools and services to support the process of achieving consensus on common shared ontologies by geographically distributed groups.

However, despite all these common scenarios where user collaboration is required for ontology design and construction, the use of collaborative tools for ontology evaluation is still a novel and incipient approach in the literature [8].

2.2 Recommender Systems
Collaborative filtering strategies make automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). This approach usually consists of two steps: a) look for users that have a rating pattern similar to that of the active user (the user for whom the prediction is made), and b) use the ratings of the users found in the previous step to compute the predictions for the active user. These predictions are specific to the user, unlike those given by simpler approaches that provide average scores for each item of interest, for example based on its number of votes.

Collaborative filtering is a widely explored field. Three main aspects typically distinguish the different techniques reported in the literature [13]: user profile representation and management, filtering method, and matching method.

User profile representation and management can be divided into five different tasks:

   • Profile representation. Accurate profiles are vital for the content-based component (to ensure recommendations are appropriate) and the collaborative component (to ensure that users with similar profiles are in fact similar). The type of profile chosen in this work is the user-item ratings matrix (ontology evaluations based on specific criteria).

   • Initial profile generation. The user is not usually willing to spend too much time defining her/his interests to create a personal profile. Moreover, user interests may change dynamically over time. The type of initial profile generation chosen in this work is a manual selection of values for only five specific evaluation criteria.

   • Profile learning. User profiles can be learned or updated using different sources of information that are potentially representative of user interests. In our work, profile learning techniques are not used.

   • The source of user input and feedback used to infer user interests and update user profiles. It can be obtained in two different ways: using information explicitly provided by the user, and using information implicitly observed in the user's interaction. Our system uses no feedback to update the user profiles.

   • Profile adaptation. Techniques are needed to adapt user profiles to new interests and to forget old ones, as user interests evolve with time. Again, in our approach profile adaptation is done manually (manual update of ontology evaluations).

Filtering method. Items or actions are recommended to a user taking into account the available information (item content descriptions and user profiles). There are three main information filtering approaches for making recommendations:

   • Demographic filtering: descriptions of people (e.g. age, gender, etc.) are used to learn the relationship between a single item and the type of people who like it.

   • Content-based filtering: the user is recommended items based on the descriptions of items previously evaluated by other users. Content-based filtering is the approach chosen in our work (the system recommends ontologies using previous evaluations of those ontologies).

   • Collaborative filtering: people with similar interests are matched, and then recommendations are made.

Matching method. It defines how user interests and item characteristics are compared. Two main approaches can be identified:

   • User profile matching: people with similar interests are matched before making recommendations.

   • User profile-item matching: a direct comparison is made between the user profile and the items. The degree of appropriateness of the ontologies is computed by taking into account previous evaluations of those ontologies.

In WebCORE, a new ontology evaluation measure based on collaborative filtering is proposed, considering users' interests and previous assessments of the ontologies.

3. SYSTEM ARCHITECTURE
As mentioned before, WebCORE is a web application for Collaborative Ontology Reuse and Evaluation. A user logs into the system via a web browser and, thanks to AJAX technology and the Google Web Toolkit3, dynamically describes a problem domain, searches for ontologies related to this domain, obtains relevant ontologies ranked by several lexical, taxonomic and collaborative criteria, and optionally evaluates those ontologies he likes or dislikes most.

In this section, we describe the server-side architecture of WebCORE. Figure 1 shows an overview of the system. We distinguish three different modules. The first one, the left module, receives the problem description (Golden Standard) as full text or as a set of initial terms. In the first case, the system uses an NLP module to obtain the most relevant terms of the given text. The initial set of terms can also be modified and extended by the user using WordNet [12]. The second one, represented in the centre of the figure, allows the user to select a set of ontology evaluation techniques provided by the system to recover the ontologies closest to the given Golden Standard. Finally, the third one, on the right of the figure, is a collaborative module that re-ranks the list of recovered ontologies, taking into consideration previous feedback and evaluations of the users.

3 Google Web Toolkit, http://code.google.com/webtoolkit/
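The two-step collaborative filtering scheme outlined in Section 2.2 (find users with similar rating patterns, then predict from their ratings) underlies the collaborative re-ranking just described. A minimal sketch follows; the cosine similarity choice, the function names and the ratings data are illustrative assumptions, not WebCORE's actual implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity restricted to the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = math.sqrt(sum(u[i] ** 2 for i in common)) \
        * math.sqrt(sum(v[i] ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, active, item):
    """Step a): find users with rating patterns similar to `active`;
    step b): similarity-weighted average of their ratings of `item`."""
    num = den = 0.0
    for user, profile in ratings.items():
        if user == active or item not in profile:
            continue
        sim = cosine(ratings[active], profile)
        num += sim * profile[item]
        den += abs(sim)
    return num / den if den else None  # None: no neighbour rated `item`

# Hypothetical user-ontology ratings matrix (profiles as in Section 2.2)
ratings = {
    "alice": {"onto_A": 5, "onto_B": 3},
    "bob":   {"onto_A": 4, "onto_B": 2, "onto_C": 5},
    "carol": {"onto_A": 5, "onto_B": 3, "onto_C": 4},
}

# Predicted rating of onto_C for alice, who has not rated it herself
score = predict(ratings, "alice", "onto_C")
```

Unlike a plain average over all votes (the simpler approach dismissed in Section 2.2), the prediction weights each neighbour's rating by how closely that neighbour's past evaluations agree with the active user's.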
Figure 1. WebCORE architecture

3.1 Golden Standard Definition
The first phase of our ontology recommender system is the Golden Standard definition. As in the first version of CORE [8], the user describes a domain of interest by specifying a set of relevant terms that will be searched through the concepts (classes or instances) of the ontologies stored in the system.

As an improvement, WebCORE includes an internal NLP component that automatically retrieves the most informative terms from a given text. Moreover, we have added a new collaborative component that continuously offers the user a ranked list of the terms that have been used in those previous problem descriptions in which a given term appears.

3.1.1 Term-based Problem Description
In our system, the Golden Standard is described by an initial set of terms. These terms can be obtained automatically by the internal Natural Language Processing (NLP) module, which uses a repository of documents related to the specific domain the user is interested in. This NLP module accesses the repository of documents and returns a list of pairs (lexical entry, part of speech) that roughly represents the domain of the problem. Alternatively, the list of initial (root) terms can be specified manually.

The module also allows the user to expand the root terms using WordNet [12] and some of the relations it provides: hypernym, hyponym and synonym. The new terms added to the Golden Standard using these relations may themselves be extended again, and new terms can be added iteratively to the problem definition.

The final representation of the Golden Standard is defined as a set of terms T (LG, POS, LGP, R, Z) where:

   • LG is the set of lexical entries defined for the Golden Standard.

   • POS corresponds to the different Parts Of Speech considered by WordNet: noun, adjective, verb and adverb.

   • LGP is the set of lexical entries of the Golden Standard that have been extended.

   • R is the set of relations between terms of the Golden Standard: synonym, hypernym, hyponym and root (if a term has not been obtained by expansion, but is one of the initial terms).

   • Z is an integer number that represents the depth or distance of a term to the root term from which it has been derived.

Examples:

T1 = ("genetics", NOUN, "", ROOT, 0). T1 is one of the root terms of the Golden Standard. The lexical entry it represents is "genetics", its part of speech is "noun", it has not been expanded from any other term so its lexical parent is the empty string, its relation is "root", and its depth is 0.

T2 = ("biology", NOUN, "genetics", HYPERNYM, 1). T2 is a term expanded from "genetics" (T1). The lexical entry it represents is "biology", its part of speech is "noun", the lexical entry of its parent is "genetics", it has been expanded via the "hypernym" relation, and the number of relations separating it from the root term T1 is 1.

Figure 2 shows the interface of the Golden Standard Definition phase. On the left side of the screen, the current list of root terms is shown. The user can manually insert new root terms into this list by giving their lexical entries and selecting their parts of speech. The correctness of these new insertions is controlled by verifying that all the considered lexical entries belong to the WordNet repository.

As new terms are added, the final Golden Standard definition is immediately updated: the final list of (root and expanded) terms that represent the domain of the problem is shown at the bottom of the figure. The user can also perform term expansion using WordNet. He selects one of the terms from the Golden Standard definition, and the system shows him all its meanings contained in WordNet (top of the figure). After he has chosen one of them, the system presents him three different lists with the synonyms, hyponyms and hypernyms of the term. The user can then select one or more elements of these lists and add them to the expanded term list. For each expansion, the depth of the new term is increased by one unit. This will be used later to measure the importance of the term within the Golden Standard: the greater the depth of the derived term with respect to its root term, the lower its relevance will be.

3.1.2 Collaborative Problem Description
In the problem definition phase, a collaborative component has been added to the system (right side of Figure 2). This component reads the term currently selected by the user and searches for all the stored problem definitions that contain it. For each of these problem definitions, the rest of their terms and the number of problems in which they appear are retrieved and shown in the web browser.

With this simple strategy the user is suggested the most popular terms, a fact that could help him better describe the domain he is interested in. It is very often the case that a person has very specific goals or interests, but does not know how to correctly explain/describe them, or how to effectively find solutions for them. With the retrieved terms, the user might discover new ways to describe the problem domain and obtain better solutions in the ontology recommendation phase.

This somehow follows the ideas of the well-known folksonomies4. The term "folksonomy" is a combination of "folk" and "taxonomy", and was first used by Thomas Vander Wal [22] in a discussion on a mailing list about the system of organization developed in Delicious5 and Flickr6. It is associated with those information retrieval methodologies consisting of collaboratively generated, open-ended labels that categorize content. Although they suffer from problems of imprecision and ambiguity, techniques employing free-form tagging encourage users to organize information in their own ways and to actively interact with the system.

4 Mathes, A. (2004). Folksonomies: Cooperative Classification and Communication through Shared Metadata. http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html

Figure 2. WebCORE problem definition phase

3.2 Automatic Ontology Recommendation
Once the user has selected the most appropriate set of terms to describe the problem domain, the tool performs the processes of ontology retrieval and ranking. These processes play a key role within the system, since they provide the first level of information to the user. To enhance the previous approaches of CORE, an adaptation of traditional Information Retrieval techniques has been integrated into the system. Our novel strategy for ontology retrieval can be seen as an evolution of classic keyword-based retrieval techniques [21], where textual documents are replaced by ontologies.

3.2.1 Query encoding and ontology retrieval
The queries supported by our model are expressed using the terms selected during the Golden Standard definition phase. [...] user for each of the terms to be explicitly mentioned in the ontologies. In our system, these weights are automatically assigned considering the depth measure of each of the terms included in the Golden Standard.

Let T be the set of all terms defined in the Golden Standard definition phase. Let di be the depth measure associated with each term ti ∈ T. Let q be the query vector extracted from the Golden Standard definition, and let wi be the weight associated with each of these terms, where for each ti ∈ T, wi ∈ [0,1]. Then, the weight wi is calculated as:

    wi = 1 / (di + 1)

This measure gives more relevance to the terms explicitly expressed by the user, and less importance to those extended or derived from previously selected terms. An interesting line of future work could be to enhance and refine the query, e.g. based on term popularity, or on other more complex strategies such as term frequency analysis.

To carry out the process of ontology retrieval, the approach focuses on the lexical level, retrieving those ontologies that contain a subset of the terms expressed by the user during the Golden Standard definition. To compute the matching, two
                                                                           different options are available within the tool: search for exact
In classic keyword-based vector-space models for information
                                                                           matches and search for matches based on the Levenshtein distance
retrieval [21], each of the query keywords is assigned a weight
                                                                           between two terms.
that represents the importance of the keyword in the information
need expressed by the query, or its discriminating power for               In both cases, the query execution returns a set of ontologies that
discerning relevant from irrelevant documents.                             satisfy user requirements. Considering that not all the retrieved
                                                                           ontologies fulfil the same level of satisfaction, it is the system task
Analogously, in our model, the terms included in the Golden
                                                                           to sort them and present the ranked list to the user.
Standard can be weighted to indicate the relative interest of the




5
    del.icio.us - social bookmarking, http://del.icio.us/
6
    Flickr - photo sharing, http://www.flickr.com/
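As an illustration, the depth-based term weighting and the two matching options (exact and Levenshtein-based) can be sketched as follows. This is a minimal sketch, not the actual WebCORE implementation; the distance threshold and the example terms are invented:

```python
def term_weight(depth):
    """w_i = 1 / (d_i + 1): weight 1 for terms given explicitly by the
    user (depth 0), smaller weights for expanded or derived terms."""
    return 1.0 / (depth + 1)

def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions),
    used by the fuzzy matching option."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def matches(term, ontology_terms, fuzzy=False, max_dist=2):
    """Return the lexical entries of an ontology matching a query term,
    either exactly or within a given Levenshtein distance."""
    if fuzzy:
        return [t for t in ontology_terms if levenshtein(term, t) <= max_dist]
    return [t for t in ontology_terms if t == term]
```

An ontology is then retrieved whenever a subset of the Golden Standard terms produces non-empty match lists.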
                                                  Figure 3. WebCORE system recommendation phase

3.2.2 Ontology ranking
Once the list of ontologies is formed, the ontology-search engine computes a semantic similarity value between the query and each ontology as follows. We represent each ontology in the search space as an ontology vector oj ∈ O, where oji is the mean of the similarities between the term ti and all the matched entities in the ontology if any match exists, and zero otherwise. The components oji are calculated as:

    oji = ( Σm∈Mji w(m) / |Mji| ) / Σm∈Mi w(m)

where Mji is the set of matches of the term ti in the ontology oj, w(mji) represents the similarity between the term ti and the entities of the ontology oj that match it, Mi is the set of matches of the term ti within all the ontologies, and w(mi) represents the weight of each of these matches.

For example, if we define in the Golden Standard a term "acid", this term may return several matches in the same ontology with different entities, such as "acid", "amino acid", etc. In order to establish the appropriate weight in the ontology vector, oji, the goal is to count the matches of one term in the whole repository of ontologies and give more relevance to those ontologies that have matched that specific term more times.

Due to the way in which the vector oj is constructed, each component oji contains specific information about the similarity between the ontology and the corresponding term ti. To compute the final similarity between the query vector q and the ontology vector oj, the vectorial model calculates the cosine measure between both vectors. However, under the traditional vectorial model we would only be considering the difference between the query and the ontology vectors according to the angle they form, without taking their magnitudes into account. To overcome this limitation, the cosine measure used in the vectorial model has been replaced by the simple dot product. Hence, the similarity measure between an ontology oj and the query q is simply computed as:

    sim(q, oj) = q · oj

3.2.3 Combination with Knowledge Base Retrieval
If the knowledge in the ontology is incomplete, the ontology ranking algorithm performs very poorly: queries return fewer results than expected, and relevant ontologies are not retrieved, or get a much lower similarity value than they should. For instance, if there are ontologies about "restaurants", and "dishes" are expressed as instances in the corresponding Knowledge Base (KB), a user searching for ontologies in this domain may also be interested in the instances and literals contained in the KB. To cope with this issue, our ranking model combines the similarity obtained from the terms that belong to the ontology with the similarity obtained from the terms that belong to the KB, using the adaptation of the vector space model explained before.

The combination of the outputs of several search engines has been a widely addressed research topic in the Information Retrieval field [9]. After testing several approaches, we have selected the so-called CombMNZ strategy. This technique has been shown in prior work to be one of the simplest and most effective rank aggregation techniques, and consists of computing a combined ranking score as a linear combination of the input scores with additional factors that measure the relevance of each score in the final ranking. In our case, the relevancies of the scores, i.e., the relevancies of the similarity computations within the ontology and within the knowledge base, are given by the user, who can select a value vi ∈ [1, 5] for each kind of search; this value is then mapped to a corresponding value si using the following normalization:

    si = vi / 5

Following this idea, the final score is computed as:

    sO × sim(q, o) + skb × sim(q, kb)
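Following one plausible reading of the component formula of Section 3.2.2 (the printed layout admits other normalizations), the vector construction and score combination could be sketched as below. The match weights are hypothetical and this is not the system's actual code:

```python
def component(weights_in_ontology, total_weight_in_repository):
    """o_ji: mean similarity of the matches of term t_i in ontology o_j,
    normalized by the total weight of that term's matches repository-wide;
    zero when the term does not match at all."""
    if not weights_in_ontology or total_weight_in_repository == 0:
        return 0.0
    mean = sum(weights_in_ontology) / len(weights_in_ontology)
    return mean / total_weight_in_repository

def sim(q, o):
    """Dot product q · o, replacing the cosine so that vector
    magnitudes (and not only angles) influence the score."""
    return sum(qi * oi for qi, oi in zip(q, o))

def combined_score(v_onto, v_kb, sim_onto, sim_kb):
    """CombMNZ-style linear combination: user relevancies v in [1, 5]
    are normalized to s = v / 5 and weight the two similarities."""
    return (v_onto / 5) * sim_onto + (v_kb / 5) * sim_kb
```

For instance, two matches of a term with similarities 0.5 and 1.0, against a repository-wide match weight of 3.0, yield a component of 0.25.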
                                                 Figure 4. WebCORE user evaluation phase

For future work, we are considering setting si using statistical information about the knowledge contained in the ontologies, the knowledge contained in the KBs, and the information requested by the user during the Golden Standard definition phase.

Figure 3 shows the system recommendation interface. On the left side, the user can select the matching methodology (fuzzy or exact), the search spaces (ontology entities and knowledge base entities), and the weight or importance given to each of the previously selected search spaces. On the right side, the user can visualize the ontology and navigate across it. Finally, the middle of the interface presents the list of ontologies selected for the user to be evaluated during the collaborative evaluation phase.

3.3 Collaborative Ontology Evaluation
The third and last phase of the system comprises a novel ontology recommendation algorithm that exploits the advantages of Collaborative Filtering [1], exploring the manual evaluations stored in the system to rank the set of ontologies that best fulfil the user's interests.

In WebCORE, user evaluations are represented as a set of five different criteria [15] and their respective values, manually determined by the users who made the evaluations.
   • Correctness: specifies whether the information stored in the ontology is true, independently of the domain of interest.
   • Readability: indicates the non-ambiguous interpretation of the meaning of the concept names.
   • Flexibility: points out the adaptability or capability of the ontology to change.
   • Level of formality: highly informal, semi-informal, semi-formal, rigorously formal.
   • Type of model: upper-level (for ontologies describing general, domain-independent concepts), core-ontologies (for ontologies that contain the most important concepts of a specific domain), domain-ontologies (for ontologies that broadly describe a domain), task-ontologies (for ontologies focused on generic types of tasks or activities) and application-ontologies (for ontologies describing a domain in an application-dependent manner).

The above criteria can have discrete numeric or non-numeric values. The user's interests are expressed as a subset of these criteria and their respective values, representing thresholds or restrictions to be satisfied by user evaluations. Thus, a numeric criterion will be satisfied if an evaluation value is equal to or greater than the expressed interest threshold, while a non-numeric criterion will be satisfied only when the evaluation is exactly the given threshold (i.e. in a Boolean, yes/no manner).

According to both types of user evaluation and interest criteria, numeric and Boolean, the recommendation algorithm will measure the degree to which each user restriction is satisfied by the evaluations, and will recommend a ranked ontology list according to similarity measures between the thresholds and the collaborative evaluations. To create the final ranked ontology list, the recommender module follows two phases. In the first one, it calculates the similarity degrees between all the user evaluations and the specified user interest criteria thresholds. In the second one, it combines the similarity measures of the evaluations, generating the overall rankings of the ontologies.

Figure 4 shows all the previous definitions and ideas, locating them in the graphical interface of the system. On the left side of the screen, the user introduces the thresholds for the recommendations and obtains the final collaborative ontology ranking. On the right side, the user adds new evaluations for the ontologies and checks the evaluations given by the rest of the users.

3.3.1 Collaborative Evaluation Measures
As mentioned before, a user evaluates an ontology considering five different criteria that can be divided into two groups: a) numeric criteria ('correctness', 'readability' and 'flexibility'), which take discrete numeric values [1, 2, 3, 4, 5], where 1 means the ontology does not fulfil the criterion and 5 means the ontology completely satisfies the criterion; and b) Boolean criteria ('level of formality' and 'type of model'), which are
represented by specific non-numeric values that can be satisfied or not by the ontology.

Taking into account the previous definitions, user interests will be a subset of the above criteria and their respective values, representing the set of thresholds that should be reached by the ontologies. Given a set of user interests, the system will examine all the stored evaluations and calculate their similarity measures. To explain these similarities, we shall use a simple example with six different evaluations (E1, E2, E3, E4, E5 and E6) of a given ontology, distinguishing between the numeric and the Boolean criteria. We start with the Boolean ones, assuming two different criteria, C1 and C2, with three possible values: "A", "B" and "C". In Table 2 we show the threshold values established by a user for these two criteria, "A" for C1 and "B" for C2, and the six evaluations stored in the system.

Table 2. Thresholds and evaluations for Boolean criteria C1 and C2

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C1        "A"          "A"   "B"   "A"   "C"   "A"   "B"
  C2        "B"          "A"   "A"   "B"   "C"   "A"   "A"

In this case, because the threshold of a criterion n is either satisfied or not by a certain evaluation m, the corresponding similarity measure is simply 2 if they have the same value, and 0 otherwise.

    similarity_bool(criterion_mn) = 0   if evaluation_mn ≠ threshold_mn
                                    2   if evaluation_mn = threshold_mn

The similarity results for the Boolean criteria of the example are shown in Table 3.

Table 3. Similarity values for Boolean criteria C1 and C2

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C1        "A"          2     0     2     0     2     0
  C2        "B"          0     0     2     0     0     0

For the numeric criteria, the evaluations can overcome the thresholds to different degrees. Table 4 shows the thresholds established for criteria C3, C4 and C5, and their six available evaluations. Note that E1, E2, E3 and E4 satisfy all the criteria, while E5 and E6 do not reach some of the corresponding thresholds.

Table 4. Thresholds and evaluations for numeric criteria C3, C4 and C5

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C3        ≥3           3     4     5     5     2     0
  C4        ≥0           0     1     4     5     0     0
  C5        ≥5           5     5     5     5     4     0

In this case, the similarity measure has to take into account two different issues: the degree of satisfaction of the threshold, and the difficulty of achieving its value. Thus, the similarity between the value of criterion n in the evaluation m and the threshold of interest is divided into two factors: 1) a similarity factor that considers whether the threshold is surpassed or not, and 2) a penalty factor which penalizes those thresholds that are easier to satisfy.

    similarity_num(criterion_mn) = 1 + similarity*_num(criterion_mn) · penalty_num(threshold_mn) ∈ [0, 2]

This measure returns values between 0 and 2. The idea of returning a similarity value between 0 and 2 is inspired by other collaborative matching measures [18], so as not to manage negative numbers and to facilitate, as we shall show in the next subsection, a coherent calculation of the final ontology rankings.

The similarity assessment is based on the distance between the value of the criterion n in the evaluation m and the threshold indicated in the user's interests for that criterion. The more the value of the criterion n in evaluation m overcomes the threshold, the greater the similarity value shall be.

Specifically, following the expression below, if the difference dif = (evaluation – threshold) is equal to or greater than 0, we assign a positive similarity in (0,1] that depends on the maximum difference maxDif = (maxValue – threshold) we can achieve with the given threshold; otherwise, if the difference dif is lower than 0, we give a negative similarity in [-1,0), punishing the distance of the value from the threshold.

    similarity*_num(criterion_mn) = (1 + dif) / (1 + maxDif) ∈ (0,1]   if dif ≥ 0
                                    dif / threshold ∈ [-1,0)           if dif < 0

Table 5 summarizes the similarity* values for the three numeric criteria and the six evaluations of the example.

Table 5. Similarity* values for numeric criteria C3, C4 and C5

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C3        ≥3           1/4   2/4   3/4   3/4   -1/3  -1
  C4        ≥0           1/6   2/6   5/6   1     1/6   1/6
  C5        ≥5           1     1     1     1     -1/5  -1

Comparing the evaluation values of Table 4 with the similarity values of Table 5, the reader may notice several important facts:
1. Evaluation E4 satisfies criteria C4 and C5 with evaluations of 5. Applying the above expression, these criteria receive the same similarity of 1. However, criterion C4 has a threshold of 0, while C5 has a threshold equal to 5. As it is more difficult to satisfy the restriction imposed on C5, this one should have a greater influence in the final ranking.
2. Evaluation E6 gives an evaluation of 0 to criteria C3 and C5, not satisfying either of them and generating the same similarity value of -1. Again, because of their different thresholds, we should distinguish their corresponding relevance degrees in the rankings.

For these reasons, a threshold penalty is applied, reflecting how difficult it is to overcome the given thresholds. The easier it is to surpass a threshold, the lower its penalty factor shall be.

    penalty_num(threshold) = (1 + threshold) / (1 + maxValue) ∈ (0,1]

Table 6 shows the threshold penalty values for the three numeric criteria and the six evaluations of the example.
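The Boolean and numeric similarity measures above can be combined into a short sketch, assuming maxValue = 5 as in the example; it reproduces several entries of the example tables:

```python
MAX_VALUE = 5  # top of the discrete rating scale used in the example

def similarity_bool(evaluation, threshold):
    """2 if the evaluation matches the threshold exactly, 0 otherwise."""
    return 2 if evaluation == threshold else 0

def similarity_star(evaluation, threshold):
    """similarity*: degree of (un)satisfaction of a numeric threshold.
    Positive part lies in (0, 1]; the negative part, in [-1, 0),
    assumes a positive threshold (only those can fail to be reached)."""
    dif = evaluation - threshold
    if dif >= 0:
        max_dif = MAX_VALUE - threshold
        return (1 + dif) / (1 + max_dif)
    return dif / threshold

def penalty(threshold):
    """Threshold penalty in (0, 1]: easy (low) thresholds are damped."""
    return (1 + threshold) / (1 + MAX_VALUE)

def similarity_num(evaluation, threshold):
    """Final numeric-criterion similarity, in [0, 2]."""
    return 1 + similarity_star(evaluation, threshold) * penalty(threshold)
```

For instance, the C4 column of the example (threshold ≥ 0, evaluation 4) gives similarity* = 5/6, penalty = 1/6 and a final similarity of about 1.14, as in Table 7.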
 Table 6. Threshold penalty values for numeric criteria C3, C4 and C5                      4. EXPERIMENTS
                                                           Evaluations                     In this section, we present some early experiments that attempt to
                                                                                           measure: a) the gain of efficiency and effectiveness, and the b)
   Criteria    Thresholds        E1            E2          E3      E4        E5     E6
                                                                                           increment of users’ satisfaction obtained with the use of our
     C3             ≥3           4/6           4/6         4/6    4/6        4/6    4/6    system when searching ontologies within a specific domain.
     C4             ≥0           1/6           1/6         1/6    1/6        1/6    1/6    The scenario of the experiments was the following. A repository
     C5             ≥5             1               1        1      1          1      1     of thirty ontologies was considered and eighteen subjects
                                                                                           participated in the evaluations. They were Computer Science
                                                                                           Ph.D. students of our department, all of them with some expertise
                                                                                           in modeling and exploitation of ontologies. They were asked to
Finally, the similarity results for the numeric criteria of the
                                                                                           search and evaluate ontologies with WebCORE in three different
example are shown in Table 7.
                                                                                           tasks. For each task and each student, one of the following
     Table 7. Similarity values for numeric criteria C3, C4 and C5                         problem domains was selected:
                                                           Evaluations                        • Family. Search for ontologies including family members:
                                                                                                mother, father, daughter, son, etc.
   Criteria    Thresholds        E1            E2          E3      E4        E5     E6
     C3             ≥3        1.17             1.33        1.5    1.5        0.78   0.33      • Genetics. Search for ontologies containing specific
                                                                                                vocabulary of Genetics: genes, proteins, amino acids, etc.
     C4             ≥0        1.03             1.05        1.14   1.17       1.03   1.03
     C5             ≥5             2               2        2      2         0.5     0        • Restaurant. Search for ontologies with vocabulary related
                                                                                                to restaurants: food, drinks, waiters, etc.
As a preliminary approach, we calculate the similarity between an
ontology evaluation and the user’s requirements as the average of                          In the repository, there were six different ontologies related to
its N criteria similarities.                                                               each of the above domains, and twelve ontologies describing other
                                                                                           no related knowledge areas. No information about the domains
                                                       N
                                               1                                           and the existent ontologies was given to the students.
          similarity ( evaluationm ) =
                                             N
                                               ∑ similarity (criterion )
                                                   n =1
                                                                              mn
                                                                                           Tasks 1 and 2 were performed first without the help of the
                                                                                           collaborative modules of the system, i.e., the term recommender
A weighted average could be even more appropriate, and might
                                                                                           of the problem definition phase and the collaborative ranking of
make the collaborative recommender module more sophisticated
                                                                                           the user evaluation phase. After all users finished the previous
and adjustable to user needs. This will be considered for a
                                                                                           ontology searches and evaluations, task 3 was done with the
possible enhancement of the system in the continuation of our
                                                                                           collaborative components activated. For each task and each
research.
                                                                                           student, we measured the time expended, and the number of
3.3.2 Collaborative Ontology Ranking                                                       ontologies retrieved and selected (‘reused’). We also asked the
Once the similarities are calculated taking into account the user’s                        users about their satisfaction (in a 1-5 rating scale) about each of
interests and the evaluations stored in the system, a ranking is                           the selected ontologies and the collaborative modules.
assigned to the ontologies.                                                                Tables 8 and 9 contain a summary of the obtained results. Note
The ranking of a specific ontology is measured as the average of                           that measures of task 1 are not shown. We have decided not to
its M evaluation similarities. Again, we do not consider different                         consider them for evaluation purposes because we discern the first
priorities in the evaluations of several users. We have planned to                         task as a learning process of the use of the tool, and its time
include in the system personalized user appreciations about the                            executions and number of selected ontologies as skewed no
opinions of the rest of the users. Thus, for a certain user some                           objective measures.
evaluations will have more relevance than others, according to the
                                                                                           To evaluate the enhancements in terms of efficiency and
users that made it.
\[
\mathit{ranking}(\mathit{ontology}) \;=\; \frac{1}{M}\sum_{m=1}^{M}\mathit{similarity}(\mathit{evaluation}_m)
\;=\; \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\mathit{similarity}(\mathit{criterion}_{mn})
\]

Finally, in case of ties, the collaborative ranking mechanism sorts the ontologies taking into account not only the average similarity between the ontologies and the evaluations stored in the system, but also the total number of evaluations of each ontology, thus giving more relevance to those ontologies that have been rated more times:

\[
\frac{M}{M_{\mathit{total}}}\cdot\mathit{ranking}(\mathit{ontology})
\]

effectiveness, we present in Table 8 the average number of reused ontologies and the average execution times for tasks 2 and 3. The results show a significant improvement when the collaborative modules of the system were activated. In all cases, the students made use of the terms and evaluations suggested by others, accelerating the processes of problem definition and relevant ontology retrieval.

Table 8. Average number of reused ontologies and execution times (in minutes) for tasks 2 and 3

                       Task 2 (without          Task 3 (with             %
                       collaborative modules)   collaborative modules)   improvement
  # reused ontologies  3.45                     4.35                     26.08
  execution time       9.3                      7.1                      23.8
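The averaging and tie-breaking computations above can be sketched in a few lines of Python. This is only an illustration of the formulas: the function names and the nested-list layout of the stored evaluations are our own, not part of WebCORE's implementation, and the exact definition of M_total is as given in the text.

```python
from statistics import mean

def ranking(evaluations):
    # Each evaluation is the list of its N per-criterion similarities;
    # an evaluation's similarity is the mean over its criteria, and the
    # ontology's ranking is the mean over its M stored evaluations.
    return mean(mean(criteria) for criteria in evaluations)

def tie_break_ranking(evaluations, m_total):
    # Weight the ranking by M / M_total so that, among tied ontologies,
    # those that have been rated more times gain relevance.
    return (len(evaluations) / m_total) * ranking(evaluations)
```

For instance, an ontology with two stored evaluations `[[0.8, 0.6], [0.4, 0.6]]` obtains a ranking of 0.6; if only 2 out of `m_total = 4` evaluations concern it, its tie-break score is halved to 0.3.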
On the other hand, Table 9 shows the average degrees of satisfaction expressed by the users about the retrieved ontologies and the collaborative modules. Again, the results evidence the positive impact of our approach.

Table 9. Average satisfaction values (1-5 rating scale) for ontologies reused in tasks 2 and 3, collaborative recommendations and rankings

  Task 2   Task 3   % improvement   Initial term recommendation   Final ontology ranking
  3.34     3.56     6.58            4.7                           4.4

5. CONCLUSIONS AND FUTURE WORK
In this paper, a web application for ontology evaluation and reuse has been presented. The novel aspects of our proposal include the use of WordNet to help users define the Golden Standard; a new ontology retrieval technique based on traditional Information Retrieval models; rank fusion techniques to combine different ontology evaluation measures; and two collaborative modules: one that suggests the most popular terms for a given domain, and one that recommends lists of ontologies with a multi-criteria strategy that takes into account user opinions about ontology features that can only be assessed by humans.

6. ACKNOWLEDGMENTS
This research was supported by the Spanish Ministry of Science and Education (TIN2005-06885 and FPU program).

7. REFERENCES
[1] Adomavicius, G., and Tuzhilin, A.: Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17(6): 734-749, 2005.
[2] Alani, H., and Brewster, C.: Metrics for Ranking Ontologies. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[3] Alani, H., Brewster, C., and Shadbolt, N.: Ranking Ontologies with AKTiveRank. Proceedings of the 5th Int. Semantic Web Conference (ISWC'06). Athens, Georgia, USA, 2006.
[4] Brank, J., Grobelnik, M., and Mladenic, D.: A Survey of Ontology Evaluation Techniques. Proceedings of the 4th Conference on Data Mining and Data Warehouses (SiKDD'05), at the 7th Int. Multi-conference on Information Society (IS'05). Ljubljana, Slovenia, 2005.
[5] Brewster, C., Alani, H., Dasmahapatra, S., and Wilks, Y.: Data driven ontology evaluation. Proceedings of the 4th Int. Conference on Language Resources and Evaluation (LREC'04). Lisbon, Portugal, 2004.
[6] Ding, Y., and Fensel, D.: Ontology Library Systems: The key to successful Ontology Reuse. Proceedings of the 1st Semantic Web Working Symposium (SWWS'01). Stanford, CA, USA, 2001.
[7] Farquhar, A., Fikes, R., and Rice, J.: The Ontolingua server: A tool for collaborative ontology construction. Technical report, Stanford KSL 96-26, 1996.
[8] Fernández, M., Cantador, I., and Castells, P.: CORE: A Tool for Collaborative Ontology Reuse and Evaluation. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[9] Lee, J. H.: Analysis of multiple evidence combination. Proceedings of the 20th ACM Int. Conference on Research and Development in IR (SIGIR'97). New York, 1997.
[10] Lozano-Tello, A., and Gómez-Pérez, A.: Ontometric: A method to choose the appropriate ontology. Journal of Database Management, 15(2): 1-18, 2004.
[11] Maedche, A., and Staab, S.: Measuring similarity between ontologies. Proceedings of the 13th European Conference on Knowledge Acquisition and Management (EKAW 2002). Madrid, Spain, 2002.
[12] Miller, G. A.: WordNet: A lexical database for English. New horizons in commercial and industrial Artificial Intelligence. Communications of the Association for Computing Machinery, 38(11): 39-41, 1995.
[13] Montaner, M., López, B., and De la Rosa, J. L.: A Taxonomy of Recommender Agents on the Internet. Artificial Intelligence Review 19: 285-330, 2003.
[14] Noy, N. F., Chugh, A., Liu, W., and Musen, M. A.: A Framework for Ontology Evolution in Collaborative Environments. Proceedings of the 5th Int. Semantic Web Conference (ISWC'06). Athens, Georgia, USA, 2006.
[15] Paslaru, E.: Using Context Information to Improve Ontology Reuse. Doctoral Workshop at the 17th Conference on Advanced Information Systems Engineering (CAiSE'05). Porto, Portugal, 2005.
[16] Porzel, R., and Malaka, R.: A task-based approach for ontology evaluation. Proceedings of the 16th European Conference on Artificial Intelligence (ECAI'04). Valencia, Spain, 2004.
[17] Protégé OWL Ontology Repository. http://protege.stanford.edu/download/ontologies.html
[18] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J.: GroupLens: An Open Architecture for Collaborative Filtering of Netnews. Internal Research Report, MIT Center for Coordination Science, 1994.
[19] Sabou, M., López, V., Motta, E., and Uren, V.: Ontology Evaluation on the Real Semantic Web. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[20] Sabou, M., López, V., Motta, E., and Uren, V.: Ontology Selection for the Real Semantic Web: How to Cover the Queen's Birthday Dinner? Proceedings of the 15th International Conference on Knowledge Engineering and Knowledge Management (EKAW'06). Podebrady, Czech Republic, 2006.
[21] Salton, G., and McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983.
[22] Smith, G.: Atomiq: Folksonomy: Social Classification. 2004. http://atomiq.org/archives/2004/08/folksonomy_social_classification.html
[23] Sure, Y., Erdmann, M., Angele, J., Staab, S., Studer, R., and Wenke, D.: OntoEdit: Collaborative Ontology Development for the Semantic Web. Proceedings of the 1st International Semantic Web Conference (ISWC'02). Sardinia, Italy, 2002.
[24] Swoogle - Semantic Web Search Engine. http://swoogle.umbc.edu