=Paper= {{Paper |id=Vol-273/paper-6 |storemode=property |title=Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments |pdfUrl=https://ceur-ws.org/Vol-273/paper_8.pdf |volume=Vol-273 |dblpUrl=https://dblp.org/rec/conf/www/CantadorFC07 }} ==Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments== https://ceur-ws.org/Vol-273/paper_8.pdf
        Improving Ontology Recommendation and Reuse in
            WebCORE by Collaborative Assessments
                                    Iván Cantador, Miriam Fernández, Pablo Castells
                                                   Escuela Politécnica Superior
                                                 Universidad Autónoma de Madrid
                                            Campus de Cantoblanco, 28049, Madrid, Spain
                            {ivan.cantador, miriam.fernandez, pablo.castells}@uam.es

ABSTRACT
In this work, we present an extension of CORE [8], a tool for Collaborative Ontology Reuse and Evaluation. The system receives an informal description of a specific semantic domain and determines which ontologies from a repository are the most appropriate to describe the given domain. For this task, the environment is divided into three modules. The first component receives the problem description as a set of terms, and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository, and determines which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. Finally, the third component uses manual user evaluations in order to incorporate a human, collaborative assessment of the ontologies. The new version of the system incorporates several novelties, such as its implementation as a web application; the incorporation of an NLP module to manage the problem definitions; modifications to the automatic ontology retrieval strategies; and a collaborative framework to find potentially relevant terms according to previous user queries. Finally, we present some early experiments on ontology retrieval and evaluation, showing the benefits of our system.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – information filtering, retrieval models, selection process.

General Terms
Algorithms, Measurement, Human Factors.

Keywords
Ontology evaluation, ontology reuse, rank fusion, collaborative filtering, WordNet.

1. INTRODUCTION
The Web can be considered a live entity that grows and evolves quickly over time. The amount of content stored and shared on the Web is increasing continuously. The global body of multimedia resources on the Internet is undergoing significant growth, reaching a presence comparable to that of traditional text content. This expansion results in well-known difficulties, such as finding and properly managing the existing mass of sparse information.

To overcome these limitations, the so-called "Semantic Web" trend has emerged with the aim of helping machines process information, enabling browsers and other software agents to automatically find, share and combine information in consistent ways. As put by Tim Berners-Lee in 1999, "I have a dream for the Web in which computers become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A 'Semantic Web', which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The 'intelligent agents' people have touted for ages will finally materialize".

At the core of these new technologies, ontologies are envisioned as key elements to represent knowledge that can be understood, used and shared among distributed applications and machines. However, ontological knowledge mining and development are difficult and costly tasks that require major engineering efforts. Developing an ontology from scratch requires the expertise of at least two different individuals: an ontology engineer, who ensures correctness during ontology design and development, and a domain expert, responsible for capturing the semantics of a specific field in the ontology. In this context, ontology reuse becomes essential in order to exploit past and current efforts and achievements.

In this scenario, it is also important to emphasize that ontologies, like content, do not stop evolving and growing within the Web. They are part of its wave of growth and evolution, and they need to be managed and kept up to date in distributed environments. From this perspective, the initial efforts to collect ontologies in libraries [17] are not sufficient, and novel technologies are necessary to successfully retrieve this special kind of content.

Novel tools have recently been developed; ontology search engines [24], for instance, represent an important first step towards automatically assessing and retrieving ontologies that satisfy user queries and requests. However, ontology reuse demands additional efforts to address the special needs and requirements of ontology engineers and practitioners. It is necessary to evaluate and measure specific ontology features, such as lexical vocabulary, relations [11], restrictions, consistency, correctness, etc., before making an adequate selection. Some of these features can be measured automatically, but others, like correctness or the level of formality, require a human judgment to be assessed.

In this context, Web 2.0 is arising as a new trend where people collaborate and share their knowledge to successfully achieve their goals. New search engines like Technorati1 exploit blogs with the aim of finding not only the information the user is looking for, but also the experts who might best answer the user's requirements. As put by David Sifry, one of the founders of Technorati, in an interview for a Spanish newspaper, "Internet has been transformed from the great library to the great conversation".

1 Technorati, blog search engine, http://technorati.com/

Following this aspiration, the work presented here aims to enhance ontology retrieval and recommendation by combining automatic evaluation techniques with explicit users' opinions and experiences. This work follows a previous approach to Collaborative Ontology Reuse and Evaluation over controlled repositories, named CORE [8]. For the work reported in this paper, the tool has been enhanced and adapted to the Web. Novel technologies, such as AJAX2, have been incorporated into the system for the design and implementation of the user interface. The tool has also been modified and improved to overcome previous limitations, such as handling large numbers of ontologies. The collaborative capabilities have also been extended within two different frameworks. Firstly, during the problem definition phase, the system helps users express their needs and requirements by showing problem descriptions previously given by other users. Secondly, during the ontology retrieval phase, the system helps users enhance the automatic recommendations by using other users' evaluations and comments.

2 Garrett, J. J. (2005). AJAX: A New Approach to Web Applications. http://www.adaptivepath.com/

Following Leonardo Da Vinci's words, "Wisdom is the daughter of experience", our tool aims to take a step forward in helping users be wise in exploiting other people's experience and expertise.

The rest of the paper is organized as follows. Section 2 summarizes relevant work related to our system. Its architecture is described in Section 3. Section 4 contains empirical results obtained from early experiments with a prototype of the system. Finally, conclusions and future research lines are given in Section 5.

2. RELATED WORK
2.1 Ontology Evaluation
Two well-known scenarios for ontology reuse have been identified in the Semantic Web area. The first one addresses the common problem of finding the most adequate ontologies for a specific domain. The second envisions the less common but real situation in which Semantic Web applications need to automatically and dynamically find an ontology. In this work, we focus our attention on the first scenario, where users are the ones who express their information needs. In this scenario, ontology reuse involves several areas such as ontology evaluation, selection, search and ranking.

Several ontology libraries and search engines have been developed in the last few years to address the problem of ontology search and retrieval. [6] presents a complete study of ontology libraries (WebOnto, Ontolingua, SHOE, etc.), where their functionalities are evaluated according to different criteria such as ontology management, ontology adaptation and ontology standardization. Although ontology libraries are a good temporary solution for ontology retrieval, they suffer from the limitation of not being open to the Web. In that sense, Swoogle [24] constitutes one of the biggest efforts carried out to crawl, index and search for ontologies distributed across the Web.

To obtain the most appropriate ontology and fulfil ontology engineers' requirements, search engines and libraries should be complemented with evaluation methodologies.

Ontology evaluation can be defined as assessing the quality and adequacy of an ontology for use in a specific context, for a specific goal. From our perspective, ontology evaluation constitutes the cornerstone of ontology reuse, because it faces the complex task of evaluating, and consequently selecting, the most appropriate ontology in each situation.

An overview of ontology evaluation approaches is presented in [4], where four different categories are identified: those that evaluate an ontology by comparing it to a Golden Standard [11]; those that evaluate ontologies by plugging them into an application and measuring the quality of the results the application returns [16]; those that evaluate ontologies by comparing them to unstructured or informal data (e.g. text documents) [5]; and those based on human interaction to measure ontology features not recognizable by machines [10]. In each of the above approaches several evaluation levels are identified: lexical, taxonomical, syntactic, semantic, contextual, and structural, among others. Table 1 summarizes these ideas.

Table 1. An overview of approaches to ontology evaluation

Level                                      | Golden Standard | Application based | Data driven | Assessment by humans
Lexical entries, vocabulary, concept, data |        X        |         X         |      X      |          X
Hierarchy, taxonomy                        |        X        |         X         |      X      |          X
Other semantic relations                   |        X        |         X         |      X      |          X
Context, application                       |                 |         X         |             |          X
Syntactic                                  |        X        |                   |             |          X
Structure, architecture, design            |                 |                   |             |          X

Once the ontologies have been searched, retrieved and evaluated, the next step is to select the most appropriate one, that which fulfils the user or application goals. Some approaches to ontology selection have been addressed in [20] and complemented in [19], where a complete study is presented to determine the connections between ontology selection and evaluation.

When the user, and not the application, is the one who demands an ontology, the selection task should be less categorical, returning not just one result but the set of the most suitable ones. To sort these results according to the evaluation criteria, several ontology ranking measures have been proposed in the literature. Some of them are presented in [2] and [3]. Both works aim to go beyond the approaches based on the PageRank algorithm [24], where ontologies are ranked considering the number of links between them, because that ranking methodology does not work for ontologies with poor connectivity and few referrals from other ontologies.
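As the abstract notes, WebCORE returns one ranked list of ontologies per evaluation criterion and combines the lists by rank fusion. As an illustration only, a minimal Borda-count fusion over hypothetical ontology identifiers could look like the sketch below; the paper does not state in this section which fusion technique the system actually employs, so both the method choice and the data are assumptions.

```python
def borda_fuse(rankings):
    """Fuse several ranked lists (best item first) into one ranking.

    Each list awards len(list) - position points to every item it
    contains; points are summed across lists, and ties are broken
    alphabetically so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - position)
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical per-criterion rankings of ontology identifiers
lexical       = ["onto_A", "onto_B", "onto_C"]
taxonomic     = ["onto_B", "onto_A", "onto_C"]
collaborative = ["onto_B", "onto_C", "onto_A"]

fused = borda_fuse([lexical, taxonomic, collaborative])
# fused == ["onto_B", "onto_A", "onto_C"]
```

Borda count is attractive here because it needs only the rank positions, not comparable scores across criteria; score-based alternatives (e.g. CombSUM over normalized scores) would require the per-criterion measures to share a scale.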
As shown above, current ontology reuse approaches take advantage of ontology evaluation, search, retrieval, selection and ranking methodologies. All these areas contribute to the process of ontology evaluation and reuse, but they do not exploit others related to the well-known Recommender Systems [1]: is it helpful to know other users' opinions in order to evaluate and select the most suitable ontology?

The collaboration between users has been addressed in the area of ontology design and construction [23]. In [14], the necessity of mechanisms for ontology maintenance is presented under scenarios like "ontology development in collaborative environments". Moreover, works such as [7] present tools and services to support the process of achieving consensus on common shared ontologies by geographically distributed groups.

However, despite all these common scenarios where user collaboration is required for ontology design and construction, the use of collaborative tools for ontology evaluation is still a novel and incipient approach in the literature [8].

2.2 Recommender Systems
Collaborative filtering strategies make automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). This approach usually consists of two steps: a) look for users that have a rating pattern similar to that of the active user (the user for whom the prediction is made), and b) use the ratings of the users found in the previous step to compute the predictions for the active user. These predictions are specific to the user, unlike those given by simpler approaches that provide average scores for each item of interest, for example based on its number of votes.

Collaborative filtering is a widely explored field. Three main aspects typically distinguish the different techniques reported in the literature [13]: user profile representation and management, filtering method, and matching method.

User profile representation and management can be divided into five different tasks:

   • Profile representation. Accurate profiles are vital for the content-based component (to ensure recommendations are appropriate) and the collaborative component (to ensure that users with similar profiles are in fact similar). The type of profile chosen in this work is the user-item ratings matrix (ontology evaluations based on specific criteria).

   • Initial profile generation. The user is not usually willing to spend too much time defining her/his interests to create a personal profile. Moreover, user interests may change dynamically over time. The type of initial profile generation chosen in this work is a manual selection of values for only five specific evaluation criteria.

   • Profile learning. User profiles can be learned or updated using different sources of information that are potentially representative of user interests. In our work, profile learning techniques are not used.

   • The source of user input and feedback used to infer user interests and update user profiles. It can be obtained in two different ways: using information explicitly provided by the user, and using information implicitly observed in the user's interaction. Our system uses no feedback to update the user profiles.

   • Profile adaptation. Techniques are needed to adapt user profiles to new interests and to forget old ones, as user interests evolve with time. Again, in our approach profile adaptation is done manually (manual update of ontology evaluations).

Filtering method. Items or actions are recommended to a user taking into account the available information (item content descriptions and user profiles). There are three main information filtering approaches for making recommendations:

   • Demographic filtering: descriptions of people (e.g. age, gender, etc.) are used to learn the relationship between a single item and the type of people who like it.

   • Content-based filtering: the user is recommended items based on the descriptions of items previously evaluated by other users. Content-based filtering is the approach chosen in our work (the system recommends ontologies using previous evaluations of those ontologies).

   • Collaborative filtering: people with similar interests are matched, and then recommendations are made.

Matching method. It defines how user interests and item characteristics are compared. Two main approaches can be identified:

   • User profile matching: people with similar interests are matched before making recommendations.

   • User profile-item matching: a direct comparison is made between the user profile and the items. The degree of appropriateness of the ontologies is computed by taking into account previous evaluations of those ontologies.

In WebCORE, a new ontology evaluation measure based on collaborative filtering is proposed, considering users' interests and previous assessments of the ontologies.

3. SYSTEM ARCHITECTURE
As mentioned before, WebCORE is a web application for Collaborative Ontology Reuse and Evaluation. A user logs into the system via a web browser and, thanks to AJAX technology and the Google Web Toolkit3, dynamically describes a problem domain, searches for ontologies related to this domain, obtains relevant ontologies ranked by several lexical, taxonomic and collaborative criteria, and optionally evaluates those ontologies he likes or dislikes most.

In this section, we describe the server-side architecture of WebCORE. Figure 1 shows an overview of the system. We distinguish three different modules. The first one, the left module, receives the problem description (Golden Standard) as full text or as a set of initial terms. In the first case, the system uses an NLP module to obtain the most relevant terms of the given text. The initial set of terms can also be modified and extended by the user using WordNet [12]. The second one, represented in the centre of the figure, allows the user to select a set of ontology evaluation techniques provided by the system to recover the ontologies closest to the given Golden Standard. Finally, the third one, on the right of the figure, is a collaborative module that re-ranks the list of recovered ontologies, taking into consideration previous feedback and evaluations of the users.

3 Google Web Toolkit, http://code.google.com/webtoolkit/
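The two-step collaborative filtering scheme outlined in Section 2.2 (find users with similar rating patterns, then predict from their ratings) underlies the collaborative re-ranking just described. A minimal sketch follows; the cosine similarity choice, the function names and the ratings data are illustrative assumptions, not WebCORE's actual implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity restricted to the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = math.sqrt(sum(u[i] ** 2 for i in common)) \
        * math.sqrt(sum(v[i] ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, active, item):
    """Step a): find users with rating patterns similar to `active`;
    step b): similarity-weighted average of their ratings of `item`."""
    num = den = 0.0
    for user, profile in ratings.items():
        if user == active or item not in profile:
            continue
        sim = cosine(ratings[active], profile)
        num += sim * profile[item]
        den += abs(sim)
    return num / den if den else None  # None: no neighbour rated `item`

# Hypothetical user-ontology ratings matrix (profiles as in Section 2.2)
ratings = {
    "alice": {"onto_A": 5, "onto_B": 3},
    "bob":   {"onto_A": 4, "onto_B": 2, "onto_C": 5},
    "carol": {"onto_A": 5, "onto_B": 3, "onto_C": 4},
}

# Predicted rating of onto_C for alice, who has not rated it herself
score = predict(ratings, "alice", "onto_C")
```

Unlike a plain average over all votes (the simpler approach dismissed in Section 2.2), the prediction weights each neighbour's rating by how closely that neighbour's past evaluations agree with the active user's.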
Figure 1. WebCORE architecture

3.1 Golden Standard Definition
The first phase of our ontology recommender system is the Golden Standard definition. As in the first version of CORE [8], the user describes a domain of interest by specifying a set of relevant terms that will be searched through the concepts (classes or instances) of the ontologies stored in the system.

As an improvement, WebCORE includes an internal NLP component that automatically retrieves the most informative terms from a given text. Moreover, we have added a new collaborative component that continuously offers the user a ranked list of the terms that have been used in those previous problem descriptions in which a given term appears.

3.1.1 Term-based Problem Description
In our system, the Golden Standard is described by an initial set of terms. These terms can be obtained automatically by the internal Natural Language Processing (NLP) module, which uses a repository of documents related to the specific domain the user is interested in. This NLP module accesses the repository of documents and returns a list of pairs (lexical entry, part of speech) that roughly represents the domain of the problem. Alternatively, the list of initial (root) terms can be specified manually.

The module also allows the user to expand the root terms using WordNet [12] and some of the relations it provides: hypernym, hyponym and synonym. The new terms added to the Golden Standard using these relations may themselves be extended again, and new terms can be added iteratively to the problem definition.

The final representation of the Golden Standard is defined as a set of terms T (LG, POS, LGP, R, Z) where:

   • LG is the set of lexical entries defined for the Golden Standard.

   • POS corresponds to the different Parts Of Speech considered by WordNet: noun, adjective, verb and adverb.

   • LGP is the set of lexical entries of the Golden Standard that have been extended.

   • R is the set of relations between terms of the Golden Standard: synonym, hypernym, hyponym and root (if a term has not been obtained by expansion, but is one of the initial terms).

   • Z is an integer number that represents the depth or distance of a term to the root term from which it has been derived.

Examples:

T1 = ("genetics", NOUN, "", ROOT, 0). T1 is one of the root terms of the Golden Standard. The lexical entry it represents is "genetics", its part of speech is "noun", it has not been expanded from any other term so its lexical parent is the empty string, its relation is "root", and its depth is 0.

T2 = ("biology", NOUN, "genetics", HYPERNYM, 1). T2 is a term expanded from "genetics" (T1). The lexical entry it represents is "biology", its part of speech is "noun", the lexical entry of its parent is "genetics", it has been expanded via the "hypernym" relation, and the number of relations separating it from the root term T1 is 1.

Figure 2 shows the interface of the Golden Standard Definition phase. On the left side of the screen, the current list of root terms is shown. The user can manually insert new root terms into this list by giving their lexical entries and selecting their parts of speech. The correctness of these new insertions is controlled by verifying that all the considered lexical entries belong to the WordNet repository.

As new terms are added, the final Golden Standard definition is immediately updated: the final list of (root and expanded) terms that represent the domain of the problem is shown at the bottom of the figure. The user can also perform term expansion using WordNet. He selects one of the terms from the Golden Standard definition, and the system shows him all its meanings contained in WordNet (top of the figure). After he has chosen one of them, the system presents him three different lists with the synonyms, hyponyms and hypernyms of the term. The user can then select one or more elements of these lists and add them to the expanded term list. For each expansion, the depth of the new term is increased by one unit. This will be used later to measure the importance of the term within the Golden Standard: the greater the depth of the derived term with respect to its root term, the lower its relevance will be.

3.1.2 Collaborative Problem Description
In the problem definition phase, a collaborative component has been added to the system (right side of Figure 2). This component reads the term currently selected by the user and searches for all the stored problem definitions that contain it. For each of these problem definitions, the rest of their terms and the number of problems in which they appear are retrieved and shown in the web browser.

With this simple strategy the user is suggested the most popular terms, a fact that could help him better describe the domain he is interested in. It is very often the case that a person has very specific goals or interests, but does not know how to correctly explain/describe them, or how to effectively find solutions for them. With the retrieved terms, the user might discover new ways to describe the problem domain and obtain better solutions in the ontology recommendation phase.

This somehow follows the ideas of the well-known folksonomies4. The term "folksonomy" is a combination of "folk" and "taxonomy", and was first used by Thomas Vander Wal [22] in a discussion on a mailing list about the system of organization developed in Delicious5 and Flickr6. It is associated with those information retrieval methodologies consisting of collaboratively generated, open-ended labels that categorize content. Although they suffer from problems of imprecision and ambiguity, techniques employing free-form tagging encourage users to organize information in their own ways and to actively interact with the system.

4 Mathes, A. (2004). Folksonomies: Cooperative Classification and Communication through Shared Metadata. http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html

Figure 2. WebCORE problem definition phase

3.2 Automatic Ontology Recommendation
Once the user has selected the most appropriate set of terms to describe the problem domain, the tool performs the processes of ontology retrieval and ranking. These processes play a key role within the system, since they provide the first level of information to the user. To enhance the previous approaches of CORE, an adaptation of traditional Information Retrieval techniques has been integrated into the system. Our novel strategy for ontology retrieval can be seen as an evolution of classic keyword-based retrieval techniques [21], where textual documents are replaced by ontologies.

3.2.1 Query encoding and ontology retrieval
The queries supported by our model are expressed using the terms selected during the Golden Standard definition phase. [...] user for each of the terms to be explicitly mentioned in the ontologies. In our system, these weights are automatically assigned considering the depth measure of each of the terms included in the Golden Standard.

Let T be the set of all terms defined in the Golden Standard definition phase. Let di be the depth measure associated with each term ti ∈ T. Let q be the query vector extracted from the Golden Standard definition, and let wi be the weight associated with each of these terms, where for each ti ∈ T, wi ∈ [0,1]. Then, the weight wi is calculated as:

    wi = 1 / (di + 1)

This measure gives more relevance to the terms explicitly expressed by the user, and less importance to those extended or derived from previously selected terms. An interesting line of future work could be to enhance and refine the query, e.g. based on term popularity, or on other more complex strategies such as term frequency analysis.

To carry out the process of ontology retrieval, the approach focuses on the lexical level, retrieving those ontologies that contain a subset of the terms expressed by the user during the Golden Standard definition. To compute the matching, two
                                                                           different options are available within the tool: search for exact
In classic keyword-based vector-space models for information
                                                                           matches and search for matches based on the Levenshtein distance
retrieval [21], each of the query keywords is assigned a weight
                                                                           between two terms.
that represents the importance of the keyword in the information
need expressed by the query, or its discriminating power for               In both cases, the query execution returns a set of ontologies that
discerning relevant from irrelevant documents.                             satisfy user requirements. Considering that not all the retrieved
                                                                           ontologies fulfil the same level of satisfaction, it is the system task
Analogously, in our model, the terms included in the Golden
                                                                           to sort them and present the ranked list to the user.
Standard can be weighted to indicate the relative interest of the




5
    del.icio.us - social bookmarking, http://del.icio.us/
6
    Flickr - photo sharing, http://www.flickr.com/
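As an illustration, the depth-based term weighting and the two matching options (exact and Levenshtein-based) can be sketched as follows. This is a minimal sketch, not the actual WebCORE implementation; the distance threshold and the example terms are invented:

```python
def term_weight(depth):
    """w_i = 1 / (d_i + 1): weight 1 for terms given explicitly by the
    user (depth 0), smaller weights for expanded or derived terms."""
    return 1.0 / (depth + 1)

def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions),
    used by the fuzzy matching option."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def matches(term, ontology_terms, fuzzy=False, max_dist=2):
    """Return the lexical entries of an ontology matching a query term,
    either exactly or within a given Levenshtein distance."""
    if fuzzy:
        return [t for t in ontology_terms if levenshtein(term, t) <= max_dist]
    return [t for t in ontology_terms if t == term]
```

An ontology is then retrieved whenever a subset of the Golden Standard terms produces non-empty match lists.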
                                                  Figure 3. WebCORE system recommendation phase

3.2.2 Ontology ranking
Once the list of ontologies is formed, the ontology-search engine computes a semantic similarity value between the query and each ontology as follows. We represent each ontology in the search space as an ontology vector oj ∈ O, where oji is the mean of the similarities between the term ti and all the matched entities in the ontology if any match exists, and zero otherwise. The components oji are calculated as:

    oji = ( Σm∈Mji w(m) / |Mji| ) / Σm∈Mi w(m)

where Mji is the set of matches of the term ti in the ontology oj, w(mji) represents the similarity between the term ti and the entities of the ontology oj that match it, Mi is the set of matches of the term ti within all the ontologies, and w(mi) represents the weight of each of these matches.

For example, if we define in the Golden Standard a term "acid", this term may return several matches in the same ontology with different entities, such as "acid", "amino acid", etc. In order to establish the appropriate weight in the ontology vector, oji, the goal is to count the matches of one term in the whole repository of ontologies and give more relevance to those ontologies that have matched that specific term more times.

Due to the way in which the vector oj is constructed, each component oji contains specific information about the similarity between the ontology and the corresponding term ti. To compute the final similarity between the query vector q and the ontology vector oj, the vectorial model calculates the cosine measure between both vectors. However, under the traditional vectorial model we would only be considering the difference between the query and the ontology vectors according to the angle they form, without taking their magnitudes into account. To overcome this limitation, the cosine measure used in the vectorial model has been replaced by the simple dot product. Hence, the similarity measure between an ontology oj and the query q is simply computed as:

    sim(q, oj) = q · oj

3.2.3 Combination with Knowledge Base Retrieval
If the knowledge in the ontology is incomplete, the ontology ranking algorithm performs very poorly: queries return fewer results than expected, and relevant ontologies are not retrieved, or get a much lower similarity value than they should. For instance, if there are ontologies about "restaurants", and "dishes" are expressed as instances in the corresponding Knowledge Base (KB), a user searching for ontologies in this domain may also be interested in the instances and literals contained in the KB. To cope with this issue, our ranking model combines the similarity obtained from the terms that belong to the ontology with the similarity obtained from the terms that belong to the KB, using the adaptation of the vector space model explained before.

The combination of the outputs of several search engines has been a widely addressed research topic in the Information Retrieval field [9]. After testing several approaches, we have selected the so-called CombMNZ strategy. This technique has been shown in prior work to be one of the simplest and most effective rank aggregation techniques, and consists of computing a combined ranking score as a linear combination of the input scores with additional factors that measure the relevance of each score in the final ranking. In our case, the relevancies of the scores, i.e., the relevancies of the similarity computations within the ontology and within the knowledge base, are given by the user, who can select a value vi ∈ [1, 5] for each kind of search; this value is then mapped to a corresponding value si using the following normalization:

    si = vi / 5

Following this idea, the final score is computed as:

    sO × sim(q, o) + skb × sim(q, kb)
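Following one plausible reading of the component formula of Section 3.2.2 (the printed layout admits other normalizations), the vector construction and score combination could be sketched as below. The match weights are hypothetical and this is not the system's actual code:

```python
def component(weights_in_ontology, total_weight_in_repository):
    """o_ji: mean similarity of the matches of term t_i in ontology o_j,
    normalized by the total weight of that term's matches repository-wide;
    zero when the term does not match at all."""
    if not weights_in_ontology or total_weight_in_repository == 0:
        return 0.0
    mean = sum(weights_in_ontology) / len(weights_in_ontology)
    return mean / total_weight_in_repository

def sim(q, o):
    """Dot product q · o, replacing the cosine so that vector
    magnitudes (and not only angles) influence the score."""
    return sum(qi * oi for qi, oi in zip(q, o))

def combined_score(v_onto, v_kb, sim_onto, sim_kb):
    """CombMNZ-style linear combination: user relevancies v in [1, 5]
    are normalized to s = v / 5 and weight the two similarities."""
    return (v_onto / 5) * sim_onto + (v_kb / 5) * sim_kb
```

For instance, two matches of a term with similarities 0.5 and 1.0, against a repository-wide match weight of 3.0, yield a component of 0.25.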
                                                 Figure 4. WebCORE user evaluation phase

For future work, we are considering setting si using statistical information about the knowledge contained in the ontologies, the knowledge contained in the KBs, and the information requested by the user during the Golden Standard definition phase.

Figure 3 shows the system recommendation interface. On the left side, the user can select the matching methodology (fuzzy or exact), the search spaces (ontology entities and knowledge base entities), and the weight or importance given to each of the previously selected search spaces. On the right side, the user can visualize the ontology and navigate across it. Finally, the middle of the interface presents the list of ontologies selected for the user to be evaluated during the collaborative evaluation phase.

3.3 Collaborative Ontology Evaluation
The third and last phase of the system comprises a novel ontology recommendation algorithm that exploits the advantages of Collaborative Filtering [1], exploring the manual evaluations stored in the system to rank the set of ontologies that best fulfil the user's interests.

In WebCORE, user evaluations are represented as a set of five different criteria [15] and their respective values, manually determined by the users who made the evaluations.
   • Correctness: specifies whether the information stored in the ontology is true, independently of the domain of interest.
   • Readability: indicates the non-ambiguous interpretation of the meaning of the concept names.
   • Flexibility: points out the adaptability or capability of the ontology to change.
   • Level of formality: highly informal, semi-informal, semi-formal, rigorously formal.
   • Type of model: upper-level (for ontologies describing general, domain-independent concepts), core-ontologies (for ontologies that contain the most important concepts of a specific domain), domain-ontologies (for ontologies that broadly describe a domain), task-ontologies (for ontologies focused on generic types of tasks or activities) and application-ontologies (for ontologies describing a domain in an application-dependent manner).

The above criteria can have discrete numeric or non-numeric values. The user's interests are expressed as a subset of these criteria and their respective values, representing thresholds or restrictions to be satisfied by user evaluations. Thus, a numeric criterion will be satisfied if an evaluation value is equal to or greater than the expressed interest threshold, while a non-numeric criterion will be satisfied only when the evaluation is exactly the given threshold (i.e. in a Boolean, yes/no manner).

According to both types of user evaluation and interest criteria, numeric and Boolean, the recommendation algorithm will measure the degree to which each user restriction is satisfied by the evaluations, and will recommend a ranked ontology list according to similarity measures between the thresholds and the collaborative evaluations. To create the final ranked ontology list, the recommender module follows two phases. In the first one, it calculates the similarity degrees between all the user evaluations and the specified user interest criteria thresholds. In the second one, it combines the similarity measures of the evaluations, generating the overall rankings of the ontologies.

Figure 4 shows all the previous definitions and ideas, locating them in the graphical interface of the system. On the left side of the screen, the user introduces the thresholds for the recommendations and obtains the final collaborative ontology ranking. On the right side, the user adds new evaluations for the ontologies and checks the evaluations given by the rest of the users.

3.3.1 Collaborative Evaluation Measures
As mentioned before, a user evaluates an ontology considering five different criteria that can be divided into two groups: a) numeric criteria ('correctness', 'readability' and 'flexibility'), which take discrete numeric values [1, 2, 3, 4, 5], where 1 means the ontology does not fulfil the criterion and 5 means the ontology completely satisfies the criterion; and b) Boolean criteria ('level of formality' and 'type of model'), which are
represented by specific non-numeric values that can be satisfied or not by the ontology.

Taking into account the previous definitions, user interests will be a subset of the above criteria and their respective values, representing the set of thresholds that should be reached by the ontologies. Given a set of user interests, the system will examine all the stored evaluations and calculate their similarity measures. To explain these similarities, we shall use a simple example with six different evaluations (E1, E2, E3, E4, E5 and E6) of a given ontology, distinguishing between the numeric and the Boolean criteria. We start with the Boolean ones, assuming two different criteria, C1 and C2, with three possible values: "A", "B" and "C". In Table 2 we show the threshold values established by a user for these two criteria, "A" for C1 and "B" for C2, and the six evaluations stored in the system.

Table 2. Thresholds and evaluations for Boolean criteria C1 and C2

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C1        "A"          "A"   "B"   "A"   "C"   "A"   "B"
  C2        "B"          "A"   "A"   "B"   "C"   "A"   "A"

In this case, because the threshold of a criterion n is either satisfied or not by a certain evaluation m, the corresponding similarity measure is simply 2 if they have the same value, and 0 otherwise.

    similarity_bool(criterion_mn) = 0   if evaluation_mn ≠ threshold_mn
                                    2   if evaluation_mn = threshold_mn

The similarity results for the Boolean criteria of the example are shown in Table 3.

Table 3. Similarity values for Boolean criteria C1 and C2

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C1        "A"          2     0     2     0     2     0
  C2        "B"          0     0     2     0     0     0

For the numeric criteria, the evaluations can overcome the thresholds to different degrees. Table 4 shows the thresholds established for criteria C3, C4 and C5, and their six available evaluations. Note that E1, E2, E3 and E4 satisfy all the criteria, while E5 and E6 do not reach some of the corresponding thresholds.

Table 4. Thresholds and evaluations for numeric criteria C3, C4 and C5

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C3        ≥3           3     4     5     5     2     0
  C4        ≥0           0     1     4     5     0     0
  C5        ≥5           5     5     5     5     4     0

In this case, the similarity measure has to take into account two different issues: the degree of satisfaction of the threshold, and the difficulty of achieving its value. Thus, the similarity between the value of criterion n in the evaluation m and the threshold of interest is divided into two factors: 1) a similarity factor that considers whether the threshold is surpassed or not, and 2) a penalty factor which penalizes those thresholds that are easier to satisfy.

    similarity_num(criterion_mn) = 1 + similarity*_num(criterion_mn) · penalty_num(threshold_mn) ∈ [0, 2]

This measure returns values between 0 and 2. The idea of returning a similarity value between 0 and 2 is inspired by other collaborative matching measures [18], so as not to manage negative numbers and to facilitate, as we shall show in the next subsection, a coherent calculation of the final ontology rankings.

The similarity assessment is based on the distance between the value of the criterion n in the evaluation m and the threshold indicated in the user's interests for that criterion. The more the value of the criterion n in evaluation m overcomes the threshold, the greater the similarity value shall be.

Specifically, following the expression below, if the difference dif = (evaluation – threshold) is equal to or greater than 0, we assign a positive similarity in (0,1] that depends on the maximum difference maxDif = (maxValue – threshold) we can achieve with the given threshold; otherwise, if the difference dif is lower than 0, we give a negative similarity in [-1,0), punishing the distance of the value from the threshold.

    similarity*_num(criterion_mn) = (1 + dif) / (1 + maxDif) ∈ (0,1]   if dif ≥ 0
                                    dif / threshold ∈ [-1,0)           if dif < 0

Table 5 summarizes the similarity* values for the three numeric criteria and the six evaluations of the example.

Table 5. Similarity* values for numeric criteria C3, C4 and C5

                              Evaluations
  Criteria  Thresholds   E1    E2    E3    E4    E5    E6
  C3        ≥3           1/4   2/4   3/4   3/4   -1/3  -1
  C4        ≥0           1/6   2/6   5/6   1     1/6   1/6
  C5        ≥5           1     1     1     1     -1/5  -1

Comparing the evaluation values of Table 4 with the similarity values of Table 5, the reader may notice several important facts:
1. Evaluation E4 satisfies criteria C4 and C5 with evaluations of 5. Applying the above expression, these criteria receive the same similarity of 1. However, criterion C4 has a threshold of 0, while C5 has a threshold equal to 5. As it is more difficult to satisfy the restriction imposed on C5, this one should have a greater influence in the final ranking.
2. Evaluation E6 gives an evaluation of 0 to criteria C3 and C5, not satisfying either of them and generating the same similarity value of -1. Again, because of their different thresholds, we should distinguish their corresponding relevance degrees in the rankings.

For these reasons, a threshold penalty is applied, reflecting how difficult it is to overcome the given thresholds. The easier it is to surpass a threshold, the lower its penalty factor shall be.

    penalty_num(threshold) = (1 + threshold) / (1 + maxValue) ∈ (0,1]

Table 6 shows the threshold penalty values for the three numeric criteria and the six evaluations of the example.
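The Boolean and numeric similarity measures above can be combined into a short sketch, assuming maxValue = 5 as in the example; it reproduces several entries of the example tables:

```python
MAX_VALUE = 5  # top of the discrete rating scale used in the example

def similarity_bool(evaluation, threshold):
    """2 if the evaluation matches the threshold exactly, 0 otherwise."""
    return 2 if evaluation == threshold else 0

def similarity_star(evaluation, threshold):
    """similarity*: degree of (un)satisfaction of a numeric threshold.
    Positive part lies in (0, 1]; the negative part, in [-1, 0),
    assumes a positive threshold (only those can fail to be reached)."""
    dif = evaluation - threshold
    if dif >= 0:
        max_dif = MAX_VALUE - threshold
        return (1 + dif) / (1 + max_dif)
    return dif / threshold

def penalty(threshold):
    """Threshold penalty in (0, 1]: easy (low) thresholds are damped."""
    return (1 + threshold) / (1 + MAX_VALUE)

def similarity_num(evaluation, threshold):
    """Final numeric-criterion similarity, in [0, 2]."""
    return 1 + similarity_star(evaluation, threshold) * penalty(threshold)
```

For instance, the C4 column of the example (threshold ≥ 0, evaluation 4) gives similarity* = 5/6, penalty = 1/6 and a final similarity of about 1.14, as in Table 7.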
 Table 6. Threshold penalty values for numeric criteria C3, C4 and C5                      4. EXPERIMENTS
                                                           Evaluations                     In this section, we present some early experiments that attempt to
                                                                                           measure: a) the gain of efficiency and effectiveness, and the b)
   Criteria    Thresholds        E1            E2          E3      E4        E5     E6
                                                                                           increment of users’ satisfaction obtained with the use of our
     C3             ≥3           4/6           4/6         4/6    4/6        4/6    4/6    system when searching ontologies within a specific domain.
     C4             ≥0           1/6           1/6         1/6    1/6        1/6    1/6    The scenario of the experiments was the following. A repository
     C5             ≥5             1               1        1      1          1      1     of thirty ontologies was considered and eighteen subjects
                                                                                           participated in the evaluations. They were Computer Science
                                                                                           Ph.D. students of our department, all of them with some expertise
                                                                                           in modeling and exploitation of ontologies. They were asked to
Finally, the similarity results for the numeric criteria of the
                                                                                           search and evaluate ontologies with WebCORE in three different
example are shown in Table 7.
                                                                                           tasks. For each task and each student, one of the following
     Table 7. Similarity values for numeric criteria C3, C4 and C5                         problem domains was selected:
                                                           Evaluations                        • Family. Search for ontologies including family members:
                                                                                                mother, father, daughter, son, etc.
   Criteria    Thresholds        E1            E2          E3      E4        E5     E6
     C3             ≥3        1.17             1.33        1.5    1.5        0.78   0.33      • Genetics. Search for ontologies containing specific
                                                                                                vocabulary of Genetics: genes, proteins, amino acids, etc.
     C4             ≥0        1.03             1.05        1.14   1.17       1.03   1.03
     C5             ≥5             2               2        2      2         0.5     0        • Restaurant. Search for ontologies with vocabulary related
                                                                                                to restaurants: food, drinks, waiters, etc.
As a preliminary approach, we calculate the similarity between an
ontology evaluation and the user’s requirements as the average of                          In the repository, there were six different ontologies related to
its N criteria similarities.                                                               each of the above domains, and twelve ontologies describing other
                                                                                           no related knowledge areas. No information about the domains
                                                       N
                                               1                                           and the existent ontologies was given to the students.
          similarity ( evaluationm ) =
                                             N
                                               ∑ similarity (criterion )
                                                   n =1
                                                                              mn
                                                                                           Tasks 1 and 2 were performed first without the help of the
                                                                                           collaborative modules of the system, i.e., the term recommender
A weighted average could be even more appropriate, and might
                                                                                           of the problem definition phase and the collaborative ranking of
make the collaborative recommender module more sophisticated
                                                                                           the user evaluation phase. After all users finished the previous
and adjustable to user needs. This will be considered for a
                                                                                           ontology searches and evaluations, task 3 was done with the
possible enhancement of the system in the continuation of our
                                                                                           collaborative components activated. For each task and each
research.
                                                                                           student, we measured the time expended, and the number of
3.3.2 Collaborative Ontology Ranking                                                       ontologies retrieved and selected (‘reused’). We also asked the
Once the similarities are calculated taking into account the user’s                        users about their satisfaction (in a 1-5 rating scale) about each of
interests and the evaluations stored in the system, a ranking is                           the selected ontologies and the collaborative modules.
assigned to the ontologies.                                                                Tables 8 and 9 contain a summary of the obtained results. Note
The ranking of a specific ontology is measured as the average of                           that measures of task 1 are not shown. We have decided not to
its M evaluation similarities. Again, we do not consider different                         consider them for evaluation purposes because we discern the first
priorities in the evaluations of several users. We have planned to                         task as a learning process of the use of the tool, and its time
include in the system personalized user appreciations about the                            executions and number of selected ontologies as skewed no
opinions of the rest of the users. Thus, for a certain user some                           objective measures.
evaluations will have more relevance than others, according to the
                                                                                           To evaluate the enhancements in terms of efficiency and
users that made it.
\[
\mathit{ranking}(\mathit{ontology}) \;=\; \frac{1}{M}\sum_{m=1}^{M}\mathit{similarity}(\mathit{evaluation}_m)
\;=\; \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\mathit{similarity}(\mathit{criterion}_{mn})
\]

Finally, in case of ties, the collaborative ranking mechanism sorts the ontologies taking into account not only the average similarity between the ontologies and the evaluations stored in the system, but also the total number of evaluations of each ontology, thus giving more relevance to those ontologies that have been rated more times:

\[
\frac{M}{M_{\mathit{total}}}\cdot\mathit{ranking}(\mathit{ontology})
\]

effectiveness, we present in Table 8 the average number of reused ontologies and the average execution times for tasks 2 and 3. The results show a significant improvement when the collaborative modules of the system were activated. In all cases, the students made use of the terms and evaluations suggested by others, accelerating the processes of problem definition and relevant ontology retrieval.

Table 8. Average number of reused ontologies and execution times (in minutes) for tasks 2 and 3

                       Task 2 (without          Task 3 (with             %
                       collaborative modules)   collaborative modules)   improvement
  # reused ontologies  3.45                     4.35                     26.08
  execution time       9.3                      7.1                      23.8
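The averaging and tie-breaking computations above can be sketched in a few lines of Python. This is only an illustration of the formulas: the function names and the nested-list layout of the stored evaluations are our own, not part of WebCORE's implementation, and the exact definition of M_total is as given in the text.

```python
from statistics import mean

def ranking(evaluations):
    # Each evaluation is the list of its N per-criterion similarities;
    # an evaluation's similarity is the mean over its criteria, and the
    # ontology's ranking is the mean over its M stored evaluations.
    return mean(mean(criteria) for criteria in evaluations)

def tie_break_ranking(evaluations, m_total):
    # Weight the ranking by M / M_total so that, among tied ontologies,
    # those that have been rated more times gain relevance.
    return (len(evaluations) / m_total) * ranking(evaluations)
```

For instance, an ontology with two stored evaluations `[[0.8, 0.6], [0.4, 0.6]]` obtains a ranking of 0.6; if only 2 out of `m_total = 4` evaluations concern it, its tie-break score is halved to 0.3.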
On the other hand, Table 9 shows the average degrees of satisfaction expressed by the users about the retrieved ontologies and the collaborative modules. Again, the results evidence the positive impact of our approach.

Table 9. Average satisfaction values (1-5 rating scale) for ontologies reused in tasks 2 and 3, collaborative recommendations and rankings

  Task 2   Task 3   % improvement   Initial term recommendation   Final ontology ranking
  3.34     3.56     6.58            4.7                           4.4

5. CONCLUSIONS AND FUTURE WORK
In this paper, a web application for ontology evaluation and reuse has been presented. The novel aspects of our proposal include the use of WordNet to help users define the Golden Standard; a new ontology retrieval technique based on traditional Information Retrieval models; rank fusion techniques to combine different ontology evaluation measures; and two collaborative modules: one that suggests the most popular terms for a given domain, and one that recommends lists of ontologies with a multi-criteria strategy that takes into account user opinions about ontology features that can only be assessed by humans.

6. ACKNOWLEDGMENTS
This research was supported by the Spanish Ministry of Science and Education (TIN2005-06885 and FPU program).

7. REFERENCES
[1] Adomavicius, G., and Tuzhilin, A.: Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17(6): 734-749, 2005.
[2] Alani, H., and Brewster, C.: Metrics for Ranking Ontologies. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[3] Alani, H., Brewster, C., and Shadbolt, N.: Ranking Ontologies with AKTiveRank. Proceedings of the 5th Int. Semantic Web Conference (ISWC'06). Athens, Georgia, USA, 2006.
[4] Brank, J., Grobelnik, M., and Mladenic, D.: A Survey of Ontology Evaluation Techniques. Proceedings of the 4th Conference on Data Mining and Data Warehouses (SiKDD'05), at the 7th Int. Multi-conference on Information Society (IS'05). Ljubljana, Slovenia, 2005.
[5] Brewster, C., Alani, H., Dasmahapatra, S., and Wilks, Y.: Data driven ontology evaluation. Proceedings of the 4th Int. Conference on Language Resources and Evaluation (LREC'04). Lisbon, Portugal, 2004.
[6] Ding, Y., and Fensel, D.: Ontology Library Systems: The key to successful Ontology Reuse. Proceedings of the 1st Semantic Web Working Symposium (SWWS'01). Stanford, CA, USA, 2001.
[7] Farquhar, A., Fikes, R., and Rice, J.: The Ontolingua server: A tool for collaborative ontology construction. Technical report, Stanford KSL 96-26, 1996.
[8] Fernández, M., Cantador, I., and Castells, P.: CORE: A Tool for Collaborative Ontology Reuse and Evaluation. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[9] Lee, J. H.: Analysis of multiple evidence combination. Proceedings of the 20th ACM Int. Conference on Research and Development in IR (SIGIR'97). New York, 1997.
[10] Lozano-Tello, A., and Gómez-Pérez, A.: Ontometric: A method to choose the appropriate ontology. Journal of Database Management, 15(2): 1-18, 2004.
[11] Maedche, A., and Staab, S.: Measuring similarity between ontologies. Proceedings of the 13th European Conference on Knowledge Acquisition and Management (EKAW 2002). Madrid, Spain, 2002.
[12] Miller, G. A.: WordNet: A lexical database for English. New horizons in commercial and industrial Artificial Intelligence. Communications of the Association for Computing Machinery, 38(11): 39-41, 1995.
[13] Montaner, M., López, B., and De la Rosa, J. L.: A Taxonomy of Recommender Agents on the Internet. Artificial Intelligence Review 19: 285-330, 2003.
[14] Noy, N. F., Chugh, A., Liu, W., and Musen, M. A.: A Framework for Ontology Evolution in Collaborative Environments. Proceedings of the 5th Int. Semantic Web Conference (ISWC'06). Athens, Georgia, USA, 2006.
[15] Paslaru, E.: Using Context Information to Improve Ontology Reuse. Doctoral Workshop at the 17th Conference on Advanced Information Systems Engineering (CAiSE'05). Porto, Portugal, 2005.
[16] Porzel, R., and Malaka, R.: A task-based approach for ontology evaluation. Proceedings of the 16th European Conference on Artificial Intelligence (ECAI'04). Valencia, Spain, 2004.
[17] Protégé OWL Ontology Repository. http://protege.stanford.edu/download/ontologies.html
[18] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J.: GroupLens: An Open Architecture for Collaborative Filtering of Netnews. Internal Research Report, MIT Center for Coordination Science, 1994.
[19] Sabou, M., López, V., Motta, E., and Uren, V.: Ontology Evaluation on the Real Semantic Web. Proceedings of the 4th Int. Workshop on Evaluation of Ontologies for the Web (EON'06), at the 15th Int. World Wide Web Conference (WWW'06). Edinburgh, UK, 2006.
[20] Sabou, M., López, V., Motta, E., and Uren, V.: Ontology Selection for the Real Semantic Web: How to Cover the Queen's Birthday Dinner? Proceedings of the 15th International Conference on Knowledge Engineering and Knowledge Management (EKAW'06). Podebrady, Czech Republic, 2006.
[21] Salton, G., and McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983.
[22] Smith, G.: Atomiq: Folksonomy: Social Classification. 2004. http://atomiq.org/archives/2004/08/folksonomy_social_classification.html
[23] Sure, Y., Erdmann, M., Angele, J., Staab, S., Studer, R., and Wenke, D.: OntoEdit: Collaborative Ontology Development for the Semantic Web. Proceedings of the 1st International Semantic Web Conference (ISWC'02). Sardinia, Italy, 2002.
[24] Swoogle - Semantic Web Search Engine. http://swoogle.umbc.edu