=Paper=
{{Paper
|id=Vol-273/paper-3
|storemode=property
|title=Promotion of Ontological Comprehension: Exposing Terms and Metadata with Web 2.0
|pdfUrl=https://ceur-ws.org/Vol-273/paper_40.pdf
|volume=Vol-273
|dblpUrl=https://dblp.org/rec/conf/www/GibsonWS07
}}
==Promotion of Ontological Comprehension: Exposing Terms and Metadata with Web 2.0==
Andrew Gibson Katy Wolstencroft Robert Stevens
University of Manchester
School of Computer Science, Kilburn Building,
Oxford Road, Manchester, UK
+44 161 275 0649
andrew.p.gibson@manchester.ac.uk
Copyright is held by the author/owner(s). WWW 2007, May 8-12, 2007, Banff, Canada.
ABSTRACT
Knowledge artifacts that have been labeled as ontologies have many different qualities and intended outcomes. This is particularly true of bio-ontologies, where high demand has led to a rapid growth in the number of these artifacts. Good communication between the human agents involved in the life cycle of ontologies is essential for the ontologist to encode the right knowledge in the ontology. Not only this, but it should be encoded such that subsequent retrieval of the knowledge from the ontology by any agent can be clear and precise. The ontologist can encode ontological statements, for interpretation by a computer agent, or meta-ontological statements, for interpretation by human agents. We consider how the current communication between agents and ontologies produces drawbacks that add to the considerable overheads associated with ontology development. We describe the processes of communication between human agents and ontologies as Ontology Comprehension. We then suggest how these processes could be augmented, particularly with the use of Web 2.0 ideas. By exposing and enhancing the social interactions involved in ontology comprehension, development overheads are potentially reduced and the prospect of ontology sharing and reuse is improved.
Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods – representations, representation languages.
General Terms
Design, Human Factors, Standardization, Languages.
Keywords
Ontologies, Semantic Web, Web 2.0, OWL, Ontology Comprehension.
1. INTRODUCTION
The technologies of the Semantic Web [6] have been centrally conceived, specified and designed with recommendations by the W3C (http://www.w3.org/2001/sw/). This next generation Web promises to transform the information Web into a machine computable utopia for semantically described data and information. Despite the development of the technologies, there is, however, little evidence of the materialization of the Semantic Web (or Webs). Simple RDFS vocabularies such as Friend of a Friend have provided small views on the potential of the Semantic Web [9]. Rich ontological views supported by reasoning have appeared in applications [27, 30, 31], but less so in the Web itself, and when they do, they often represent unconnected niche pockets of interest.
In contrast, Web 2.0 is in the here and now, in use by large interconnected user communities, and is ever growing as more people adopt and contribute to various community efforts. To try and specify Web 2.0 would almost be a contradiction in terms, and restricting its users with strong recommendations would be seen as an attempt to unnecessarily limit the creativity of those who have something new to try. Taxonomies give way to folksonomies, letting the user mark-up things lightly on the Web rather than specify a typed URI. The technologies of Web 2.0 were not specified; they evolved out of clear and present needs of users to connect with one another. The principles of Web 2.0 grow out of a mixture of hindsight and insight into current practice, and revolve around online community building, quick and easy linking, and unlimited customization in the hands of the masses. In this article we use ‘Web 2.0’ to refer to these principles rather than any specific technology.
It has not gone unnoticed, however, that the artifacts, such as vocabularies and ontologies, that will support the Semantic Web need populating [25, 26], and for this to happen, both the technology and the nature of ontology building need to be accessible to the masses. Similarly, in the computer science view, knowledge artifacts such as ontologies inherently have this community aspect: they are shared conceptualizations that aim to enable both human and computational interoperation of diverse resources at a semantic level.
The simplicity and robustness of HTML fuelled the growth of the current Web, but the highly-specified nature of the technologies in the Semantic Web recommendations suggests that the semantic side of the development, delivered through ontologies, will be driven mostly by experts. In this way, it is key that somehow this barrier of complexity is lowered through creating an easier user
experience, and that the motivators that are driving Web 2.0 are harnessed to promote uptake of Semantic Web ideas.
In this paper we consider the social and communication dependent aspects of the ontology development life cycle, and identify problems encountered by people with specific roles of interaction. From this, we suggest that a clear, layered separation be made between statements in ontologies that are logical and those that are linguistic, supporting annotations on the ontology. In doing so, the annotations can be exposed to the collaborative aspects of Web 2.0, promoting light discussion at the level of natural language about the meanings of terms, whilst leaving the heavier encoding of knowledge into OWL as a task for ontologists.
2. ONTOLOGIES AND DEVELOPMENT
The central premise of the Semantic Web is enabling computational processing of Web resources through knowledge artifacts. The W3C have provided the Resource Description Framework (RDF) and the Web Ontology Language (OWL) recommendations. The latter, particularly in its OWL-DL variant, is offered as a means of building robust property based descriptions with a logical underpinning that can be used to provide vocabulary for describing Web content, but also support reasoning across Web content [20]. Such ontologies are to be the semantic backbone for linking resources in the Semantic Web. Additionally, these ontologies are to represent knowledge of domains, and have the virtues of being sharable and reusable. As yet, it is difficult to find an ontology that could be said to have been designed to fit the criteria for enabling a Semantic Web by being domain general and rich in content. One prominent example of an ontology approaching these criteria is the Foundational Model of Anatomy (FMA) [12, 23]. The FMA could be said to be more of a true domain ontology (or reference ontology) than any other in bio-medicine. However, even the FMA has barriers to the Semantic Web goals of sharing and reuse because of its large size, and perhaps because it was developed in Frames and later converted to OWL.
In computer science, what are called ontologies cover a broad range of knowledge artifacts. Glossaries, vocabularies, thesauri, informal and formal ontologies (both in language and ontological discrimination) are all used at various points in the Semantic Web. Different levels of expressiveness (sometimes called formality) come from the purpose and demands of the ontology being developed [28]. These demands can be considered with increasing levels of expressiveness from very “light-weight” term lists, thesauri, dictionaries or hierarchies up to “heavy-weight” ontologies with very expressive constraints [10, 25]. OWL-DL offers a formal language and can be used to build rich, logical representations of descriptions of what exists; it can also be used, in various forms, to develop other forms of knowledge artifact while still retaining strict language semantics in the representation, but weakening the ontological distinctions made in
the knowledge artifact.
Building OWL-DL logic based ontologies is a difficult process [21] and reaching a community consensus is hard, especially in complex domains such as biology, where knowledge for making ontological distinctions can be incomplete. These issues need to be addressed if ontologies are to play their role in the Semantic Web. Here, we are mostly interested in the aspect of reaching a community consensus. Focus is often placed on the aspect of collaborative ontology building, that is, a group of people working directly with one ontology. We do not aim to discuss this type of system, as we see such systems as expert systems for logic-savvy ontologists rather than currently being suitable for “the masses”. Much more work needs to be done on enabling true collaboration in logic based ontologies. Instead, we currently envisage a core of expertise for logic encoding supported by people conceptualizing and gathering linguistic material. We acknowledge that there is a wealth of methodologies that address certain aspects of the ontology development lifecycle [10, 29] and evaluation [8, 24]; good reviews of these fields can be found in the references. For the purposes of this paper, we wish to focus on the social interactions during these processes rather than the processes themselves.
3. ONTOLOGY COMPREHENSION
We learn from the field of software engineering that effective reuse of elements of object oriented frameworks is reliant on many levels of understanding from the point of view of the programmer [4, 15]. In software engineering, improving these levels of understanding is known as “software comprehension”, and we extend the principles to ontology development. We outline ontology comprehension as the interaction between human agents and the knowledge expressed in an ontology.
Figure 1 outlines the interactions between various agents and an ontology that are considered in this section. There are two main modes in which ontology comprehension is important:
1. Development mode. Ontology development requires that there is efficient interaction between experts that represent the knowledge of the domain in the scope of the ontology (domain experts) and the ontologist that is responsible for the construction and continued maintenance of the ontology. Here we assume a model where, for a specific ontology development exercise, there is a limited cohort of domain experts that are involved with an ontologist.
2. Inspection mode. Ontology inspection is a light evaluative process in which an agent goes through the ontology to quickly assess whether or not that ontology is of good quality and whether what it contains is suitable for some specific needs of the inspector.
Figure 1: Ontology Comprehension: Current model of interactions between various agents and an ontology, as described in Section 3. The human agents are not necessarily different individuals, but rather are separated here by the roles fulfilled in the development and inspection processes.
What follows is an outline of task models that highlight how, currently, the interactions of agents involved with ontologies lead to discrepancies in ontology comprehension.
3.1 Task Model 1: Ontology Development
We consider early ontology development as a process that begins with the lightest possible knowledge structure, essentially a term list, and subsequently moves up through levels of complexity and expressiveness of the types discussed in [10]. This happens socially as well as in the ontology as all those involved in the development become more familiar with scope. At the beginning of the ontology development life cycle, the ontologist (assuming they have no domain knowledge) will usually rely on the domain expert to provide a core set of terms from the domain of interest as a starting point. The initial scope of the ontology, rather than being rigidly defined, is often roughly determined from the initial term list and this will get refined as things move on. At this early stage it is necessary for the domain expert to be able to quickly assess if the terms are appropriate. As things are, the easiest way to do this is for the domain expert to be able to access the ontology for themselves and browse the hierarchy of terms, whilst checking and adding in textual annotations for the terms, as well as any comments about the specific or contextual use of any of the terms.
The ontologist will be using one of the commonly available ontology development tools such as Protégé-OWL (http://protege.stanford.edu/), Swoop (http://code.google.com/p/swoop/) and OBO-Edit (http://www.oboedit.org/). All of these tools are centered on the user interacting with a class hierarchy view, which the ontologist will be building from the terms given to them by the domain expert. At this stage, the domain expert will primarily be concerned with having the correct term-definition pairs represented in the proto-ontology. Decisions regarding the class hierarchy signal the beginning of a slightly more complex level of expressivity, as the ontologist will be making assertions between classes about subsumption relationships [14]. This is especially true of OWL ontologies, and such decisions do not necessarily need to be considered for simpler controlled structured vocabularies in which hierarchical relationships “broader than” and “narrower than” are possible. The ontologist may also start to guide the domain expert in how to transfer knowledge regarding some of the more fundamental object properties such as part-hood.
At some point, the domain experts need to let the ontologists start to make even more expressive assertions in the ontology, whose implications the domain experts may not necessarily understand for themselves. This signals the next stage of ontology development, in which the balance shifts so that the ontologist starts to refine the assertions in the ontology. Instead of being instructed and guided by the domain expert, the ontologist now needs to ask careful questions of the domain expert. The aim of these questions should be to extract the intrinsic meaning of the terms that the domain expert has provided so that the ontologist can encode these meanings into the ontology using more and more expressive restrictions and axioms. Significantly, unless the domain expert has had training in understanding the meanings of logical assertions of ontologies, they will still primarily rely on the lexical annotations and definitions when evaluating the ontology.
Once the content of the ontology has begun to stabilize (i.e. there are fewer major revisions in the content of the ontology being made) it will be made available to a wider audience. This can signal a whole new critical process of revision for the ontology. In the next section we will consider what sort of interactions may occur between different agents and ontologies when they are first encountered.
Eventually, the increase in the content of the ontology, both lexical and logical, should start to level off as the content and the intended scope converge, at which time further structural modifications may be made, such as modularization, which could happen once
the micro-organization of the knowledge in a domain has become clear. A publicly available and relatively stable ontology has a new set of requirements, which the topics of ontology evolution and change management address [19]. Change management of ontologies has been considered in a technological sense for some time, and it should be clear that changes to a publicly available ontology need to be transparent. However, there is a growing trend for including extra hierarchical structures into the ontology that represent deprecated classes (e.g. [30]). The need to do this is obvious; it is less so how to do it neatly and ontologically. Versioning etc. are all parts of the ontology life-cycle that have no real, consistent support.
The following discrepancies in ontology comprehension should be clear from this section.
1. Discrepancies in Early Development
   a. The most convenient means of constructing, looking at and sharing the early term list is, unusually, from within an ontology file, which implies some hierarchical structure.
   b. Early revisions of the ontology are experimental for the ontologist, yet are still subject to inspection and lexical evaluation from the domain expert.
   c. Domain experts, having looked directly at revisions of the ontology file, may be resistant to subsequent major changes in structure and terminology by the ontologist as knowledge is disambiguated.
2. Emerging Discrepancies
   a. Inclusion of information regarding deprecated classes into the class hierarchy of the ontology.
3. Communicative Discrepancies
   a. Discussions between the domain experts about terminology that are potentially crucial for ontology comprehension are lost or are completely separated from the ontology itself.
   b. Discussions between the domain experts and the ontologists about disambiguation of terms are lost or are completely separated from the ontology itself.
   c. Potential for misinterpretation of logical aspects of the ontology by the domain experts through exposure to the logical component.
3.2 Task Model 2: Ontology Inspection
Ontologies are complex entities. If any ontology is going to get used by someone other than the person or group that implemented it, there has to be a way in which it can be decided whether or not it is an appropriate ontology for the task in hand [18]. Currently, this inspection process is difficult because of the paucity of ontologies available, and the fact that many have been designed for a specific purpose. Also, the discrepancies listed in 3.1 result in a general lack of information that can aid effective inspection and overall ontology comprehension.
It is hard not to liken an ontology inspection process to some sort of evaluation. What we describe here is fairly close to ontology selection [24], except that ontology inspection is more of a browsing process, driven by what access there is to comparative information between several ontologies. Selection has much better defined initial parameters for the desired outcome, and can give a more targeted outcome. We do not wish to label this inspection as an evaluation, however, as we do not make the assumption that the inspector will be following any pre-determined criteria, nor, if they are, that those criteria are rational.
The ontology inspection process is short lived, and for many people’s goals, the choice of beginning a new ontology that they know will satisfy their criteria is more favorable than editing an existing one. However, such inspections can quickly be deemed fruitless when the term searched for turns out not to be defined by logical statements in an ontology. This is a common occurrence, as such ‘classes’ can be placeholders for future development or intrinsically defined terms where no logical definition was thought necessary. Ontologies can be intensely developed in one particular area where immediate goals are important, yet there is no way to effectively discover this other than through thorough browsing. For the goals of the Semantic Web, it is imperative that the information required to carry out this inspection process be made as clear as possible for the inspector, such that we do not see immense reproduction of individual effort and no clear “shared conceptualizations”.
The domain knowledge these ontologies describe can require a considerable amount of understanding for anyone trying to inspect them. There are several ways in which this can be the case.
1. The domain knowledge encoded may be outside the experience of the inspector, or in a different context to what was expected. The inspector may not be able to tell if the knowledge represented is valid because it is not within their expertise, and will need to seek help and advice from a domain expert.
2. The knowledge may be appropriate, but encoded with axioms and restrictions that the inspector may not be able to accurately interpret as real world meaning, such that they have to find the advice of an ontologist.
3. The ontology may have been written for a specific purpose. The inspector may not be able to tell whether this is the case, and could therefore assume that the first or second scenario above is true, unless it is possible to seek advice from the original authors or find a resource containing this information.
The three scenarios above are serious issues for the future of ontologies in the Semantic Web. Most ontologies are developed as part of projects, and projects are usually pragmatic in terms of their goals. Hence, people build these ontologies as application ontologies that serve the immediate needs of the project. There is no perceivable immediate benefit for a project to develop a more general domain ontology in tandem with an application ontology, and so it does not happen. Consequently the Semantic Web goals of sharing and reuse become much harder, as people will tend to assess these application specific ontologies as too specific for a new purpose, as they see that they will need to invest effort in their re-engineering. Another danger here is that, with so many application ontologies being developed, inspectors start to assume that unusual features of ontologies are always the result of the needs of an application, and dismiss the ontology as
potentially unusable. What is really needed is for the inspector to be sure what sort of artifact they are looking at by having easy access to certain parameters.
In the Semantic Web vision, the first course of action for an ontologist would be to verify the existence or non-existence of a domain ontology with close or overlapping scope to the ontology they are to develop. This process will be laborious if it relies on the current practice of downloading ontologies and browsing them to see if they are at all reusable. In response to this, technologies such as Swoogle [16] and AKTiveRank [2] are starting to provide access to online ontologies through page ranking and other analytical methods to establish potential target ontologies. However, these technologies have been criticized for ignoring the meaning of concepts and also relations [24]. Furthermore, we note that the results returned from these searches are whole OWL files, free and independent of contextual information. For example, a Swoogle search for “Protein” has in its top hits an ontology used in an educational tutorial (which in this case is evident from its URL), which is by no means intended to be a shared or reusable resource, but is nonetheless discovered and accessible.
Those inspecting ontologies can find themselves in an isolated situation where Web searches and personal inspection of an ontology or its documentation are the only means to ontology comprehension. It has already been recognized that the Web has enormous potential for social organization and engagement. In ontology comprehension, for example, it offers the means of asking those who know. It also, as Wikipedia has shown, offers the means by which elements that aid ontology comprehension can be developed. Having concluded the need for ontology comprehension, we now explore what is necessary for such a facility.
The following discrepancies in ontology comprehension should be clear from this section:
1. Discovery Level Discrepancies
   a. Targeted discovery based on search for terms rather than meanings
   b. Ontologies are discoverable independently of statements of purpose, scope etc.
   c. Searches may discover anything from tutorial OWL files, programmatic OWL fragments, application ontologies, outdated or unmaintained ontologies etc.
2. Ontology Level Discrepancies
   a. Statements of scope, purpose, expressivity etc. are often missing altogether, or require extra searches to discover them.
   b. Discussions that have affected overall ontology development are not recorded
   c. Minimal opportunities to interact with the development team
3. Term Level Discrepancies
   a. Feeding in from Section 3.1, ontologies need exploring in the development environment to assess appropriateness of terms.
   b. No indication without exploration of the level of effort put into different areas of an ontology.
4. DESIDERATA FOR SEMANTIC ONTOLOGY COMPREHENSION
Section 3 highlights the social and communicative discrepancies that prevent the effective ontology comprehension required for the uptake of the Semantic Web goals of ontology sharing and reuse. This section cross-analyses these discrepancies to produce some desiderata that can be considered for future systems. Whilst all types of data in and about an ontology may be considered ‘ontological’, we specify ‘Meta-Ontological Data’, ‘Ontology Metadata’ and ‘Logical Statements’ as clearly identifiable parts. For information contained within ontology files that is only for human interpretation of the encoded semantic content, we use the idea of Meta-Ontological Data. For data specific to an individual ontology that is necessary for interpretation and inspection across the whole structure and history of development, we use the idea of Ontology Metadata. The ‘logical statements’ in an ontology constitute the remainder of the content.
4.1 Separating the Ontological from the Meta-Ontological
Ontologies come with a considerable amount of meta-ontological information (or should do so) which is used by the human to assess and see the intended use of that ontology. Much of this meta-ontological information is linguistically orientated. These meta-ontological extensions to the ontology itself are meaningless strings to the computer, and in this respect are unnecessary in so far as the computational goals of the Semantic Web are concerned. We know that this meta-ontological information is necessary, but we also see that it is not convenient to access; it lacks the human resources that often make the most of such material, as in Section 3.2 where a lack of a single access point means that secondary information needs to be sought out manually.
In reality, we have a chance to design and build support for the meta-ontological in the light of current experience. OWL has virtually no support, apart from some ad hoc solutions, for carrying meta-ontological knowledge. We would advocate a separation of the ontological from the meta-ontological, and this is where a Web 2.0 approach could help.
Our current scenario places too much reliance on assessment through simple linguistic inspection of, for instance, terms. These are labels for concepts, and a simple assumption that lexical matching implies conceptual matching is dangerous. For example, in biology, it might seem safe to assume that hepatocyte and liver cells are the same thing. In fact, cells in the liver include hepatocyte cells, but also include adipocyte cells. Hepatocytes make the liver the liver, but there are other cells too.
Ontologies are only intuitively discoverable through the identification and inspection of the appropriate individual terms. Even the construction of linguistic definitions can leave ambiguous meanings for those inspecting an ontology, with no real way to find out how those definitions were converged upon. Even with logical definitions, we still rely upon natural language labels. The aim of languages such as OWL-DL is, however, to minimize potential ambiguities through logical descriptions.
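The hepatocyte example above can be made concrete with a small sketch. It is a toy illustration only: the function names and the set-based model are hypothetical and are not part of OWL or of any ontology tool, and the cell types listed are just the two named in the text. It shows that treating a label as a synonym (a lexical decision) and treating two classes as equivalent (a logical decision) can give different answers.

```python
# Toy model, assumed for illustration: the only cell types we know to occur
# in the liver are the two named in the text above.
CELL_TYPES_IN_LIVER = {"hepatocyte", "adipocyte"}


def naive_same_concept(term_a, term_b, assumed_synonyms):
    """Lexical shortcut: terms denote one concept if equal or listed as synonyms."""
    return term_a == term_b or frozenset((term_a, term_b)) in assumed_synonyms


def is_liver_cell(cell_type):
    """Logical view: a cell type counts as a liver cell if it occurs in the liver."""
    return cell_type in CELL_TYPES_IN_LIVER


def equivalent_to_liver_cell(cell_type):
    """A cell type is equivalent to 'liver cell' only if no other type occurs there."""
    return CELL_TYPES_IN_LIVER == {cell_type}


if __name__ == "__main__":
    # A hypothetical synonym list of the kind a lexical match might rely on.
    synonyms = {frozenset(("hepatocyte", "liver cell"))}
    print(naive_same_concept("hepatocyte", "liver cell", synonyms))
    print(is_liver_cell("hepatocyte"), equivalent_to_liver_cell("hepatocyte"))
```

Under these assumptions the lexical check accepts "hepatocyte" and "liver cell" as the same concept, whereas the logical check only supports the weaker statement that every hepatocyte is a liver cell.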
Overall, there should be a synergy between logical and linguistic definitions.
Non-ontologist domain experts will attach intrinsic meaning to terms by drawing on their internal knowledge and the context in which a term is used. It is possible to restrict the intrinsic meaning of a term using the consensus of a domain, so long as it is stated in the context of the purpose of the controlled vocabulary. Interpretation of meaning in these controlled vocabularies still requires a human agent and the knowledge is logically inaccessible to a computer agent.
Thus, we define the inline linguistic portions of ontology files as meta-ontological data. These include anything that a human agent would use for the translation of specific complex logical statements into meaning (including links to other meanings) but that is intrinsically meaningless to the computer. Primarily, these are:
• Terms
   o The specific string by which the logical meaning is labeled, usually considered as the real meaning.
   o E.g. (from celltype ontology) ‘subsidiary cell’
• Synonyms
   o Any number of labels that refer to the same meaning.
   o E.g. (cont’d) ‘accessory cell’
• Definitions
   o Short, concise description of the meaning, including links to other terms.
   o E.g. (cont’d) ‘An epidermal cell associated with a stoma and at least morphologically distinguishable from the epidermal cells composing the groundmass of the tissue’
• Annotations (examples of)
   o Longer, more verbose descriptions.
   o Examples of how the term is used.
   o Explanations of contextual use for the term.
   o Links to term provenance.
   o E.g. (cont’d) DBXREF - TAIR:0000296
Achieving the separation of this meta-ontological layer allows for the consideration of how to manage this mostly linguistic information. This separation is our major desideratum and from this flows the means by which Web 2.0 can provide a platform to expose meta-ontological information and harness and extend the range of group activities.
4.2 Promoting Social Interaction – A Meta-Ontological Workspace
Explicit logic based ontologies for the Semantic Web are going to need to capture implicit knowledge with axioms and restrictions. Yet, unless the experts with the knowledge all manage to learn how to interpret complex logical statements, there needs to be a workspace in which implicit knowledge can be discussed and defined lexically within expert groups. In other words, terms and term linked information can essentially exist independently of the formal environment of ontologies. This implicit knowledge can be used by ontologists as a resource. With such a resource, development of early stage ontologies will not require the construction of formal hierarchies until a critical amount of implicit knowledge has been collected in these more lightweight resources. Also, multiple hierarchies for different purposes could be constructed from the same resource, reusing the collected knowledge in a way not possible in file-oriented development. The ontologists have a way to interact with the domain experts as a community to perform tasks such as the disambiguation of terms before they have been encoded in an ontology, reducing the chance that major revisions of ontological structure will be required. As this resource is shared and linkable, project and domain contexts for terms can be established. These contexts can be used by both the ontologists and the domain experts to traverse the gap into discussions that involve other groups, and discover overlapping scopes more intuitively. Additionally, these resources would provide ideal testing grounds for lexical research (e.g. [7]) that should lead to future improvement on methodologies for these workspaces.
Figure 2: An augmented form of ontology comprehension. Ontology Metadata and Meta-Ontological statements have been separated from the Logical Statements and have been exposed to a community using Web 2.0 principles.
Generating discussion of implicit meaning may sound a little like cutting the domain expert out of the ontology construction process. It should in fact considerably reduce the overhead of ontology development by shifting the discussions based around
intrinsic meanings of terms and which terms are the most
appropriate to use away from the attention of the ontologists. It is
important not to make the division too wide, as there is a risk that
bias could creep in from the ontologists as the domain experts
would be unable to assess the implications of certain restrictions
and axioms. In terms of feedback to the domain expert from the
ontologist, we recognize that there is a need for some sort of
consistent translation methodology that can generate accurate
textual definitions from logical statements, but we consider this
outside of the scope of this article.
It should be clear that such discussion workspaces would be well
suited for Web 2.0 style systems. These workspaces should
promote the creation of lexicons in which a group of experts can
start to add in and inspect lexical information. In this implicit
view, it is the terms that are the focus of discussion, not the
ontological interpretation, which are two different goals that
sometimes get confused during ontology development. Within the
workspace, the terms can be discussed, and annotated with textual
definitions, comments about usage, links to synonymous terms,
requests for clarification etc. Helium was, for instance, discovered
in 1868. Of course it was the category of Helium that was
discovered, not the instances of the helium atoms (which
presumably existed long before 1868). This is an example
of meta-class statements that are part of the ontology. They are
class level statements, but ones that are well suited to this
linguistic, community style of interaction.
The purpose of targeting Web 2.0 as a base for this
meta-ontological data is not to completely remove this type of
information from the view of the ontology; we merely seek to
relocate it so that the incredibly social nature of the definition of
knowledge can be coupled with an environment that is equally
socially driven (see Figure 2). Modularization of ontologies is
seen to be one of the keys to making ontologies viable for the
Semantic Web vision, and as such, import mechanisms exist that
support the combination of different sources. Lexicons developed
by groups could be given URIs, as could all of the terms
described in them. Knowledge held in WordNet [17] style lexical
resources could be linked using online URIs in a similar way to
imported online ontologies, taking advantage of well established
methods for dealing with words and their meanings at the lexical
level.
4.3 Promoting Ontology Sharing and Reuse: An Ontological Metadata Workspace
The production of ontologies that can be effectively shared and
reused is a major step towards achieving the goals of the Semantic
Web. There are significant barriers to these goals in our current
model of ontological comprehension. We have highlighted how
ontologists and domain experts alike need to inspect ontologies to
assess whether they are appropriate for their needs. Currently, the
information that would be necessary to effectively conduct this
investigation is hard to find, and does not always come in the
same format.
An ontological metadata workspace would provide access to the
whole-ontology level information necessary to carry out light
evaluative processes. A collaborative Web 2.0 approach to
ontologies would see ‘ontology profiles’ that include clear
statements about the purpose and scope of the ontology and
information regarding its status. Ontologies would clearly be
labeled as domain and application ontologies to help evaluation,
and subsequently, when application ontologies are derived from
domain ontologies, this can be marked up and become visible
such that ontology level provenance, a history of where
everything in an ontology originated from and how it changed
over time, can start to be built up.
Introducing a strong community aspect would encourage those
developing ontologies to start using tagging, thereby linking up
their ontologies to particular domains and projects. Domain
ontology construction could be promoted by using ranking
systems where inspectors can rate how useful the ontology was in
terms of what was expected, assuming that more general
ontological models will fit the requirements of more people.
OWL has been made popular for use as an ontology language
because of the publicity of the Semantic Web, accessibility of
tools for creating OWL ontologies and the fact that it is useful
beyond the scope of the Semantic Web. OWL has been used for a
lot of purposes, and searching for ontologies based on the content
of their files seems like it may be unsustainable as the number of
files grows faster than the number of useful ontologies.
Efficient inspection of ontologies can be limited by their large
size. The current tendency is to build larger ontologies, as
the tool support and methodologies for modularization have been
slow off the mark until recently. As we learn more about the
implications and methods of modularization [22], ontologies can
become more manageable, reducing the evaluation cost
per ontology. This of course will require better indexing, along
with information about how each ontology has been
used/imported, perhaps leading to a ‘shopping cart’ model for
highly modular ontology construction.
Perhaps one of the most motivating factors for achieving this
desire for more effective inspection is the aspect of learning. Once
it becomes easy to empirically see what constitutes a good and
useful ontology, then these features get propagated and discussed.
As has been noted in [1], the viral spread of understanding how to
write HTML was in part because existing HTML could be
inspected and copied. Also, the effect of newly written HTML
was instantly verifiable in a Web browser. It is harder to have this
sort of verification with ontologies, and there are a lot of
conflicting styles of ontology development with no consensus of
what is ‘right’. If the Ontological Metadata Workspace were to be
realized, then a hub of comparable, commented and marked-up
ontologies could develop in a much quicker and more consistent
fashion than the solitary efforts that are currently the norm.
5. BIO-ONTOLOGIES: EXPERIENCES AND PERSPECTIVES
While our discussion in this article is most pertinent to the notion
of the Semantic Web as a whole, it originates from the discipline
of bioinformatics. Biologists were early adopters of the Web as a
means of disseminating data and the tools for their analysis. These
data and tools are developed in a highly autonomous manner and
consequently they are beset by both syntactic and semantic
heterogeneities. Bioinformaticians have seen ontologies as a
means to create common understandings for humans and
computers about the meaning of data in their distributed resources
in a life science Semantic Web [13]. The DNA sequences of
different organisms, for example, have a common representation,
but this is not so for the functional knowledge associated with
those sequences. So, the sequences can be interpreted by humans
and computers, but not what is known about those sequences.
Consequently, biologists have created ontologies to describe, for
instance, the functional attributes of DNA and proteins [3, 11].
Bioinformatics has, therefore, much Web accessible data
described by ontologies. The W3C have recognized a nascent
Semantic Web in this domain in the development of the Health
Care and Life Sciences SIG 5. It is a significant feature of the
move towards ontologies in this sector that it is biologists who
build these tools, with some guidance from ontologists. Whilst
this community has made little use of OWL, preferring its own
representation, OBO 6, it still provides a good representation of
Semantic Web activities.
The OBO ontologies have significant standing in biological
communities, and it is perhaps the community building aspect that
fuels this standing, as it includes:
• A large number of centrally available OBO ontologies 7.
• The OBO-Edit OBO ontology development tool that is
specifically designed by a working group of users.
• A committee, the OBO-Foundry 8, that has been set up
and has produced a set of principles for new OBO
ontologies to aspire to, including the promise of textual
definitions for all terms and good documentation for all
ontologies.
• The OBO file format, for which the primary goals
include human readability and ease of parsing together
with a syntax that makes them exportable as OWL.
• Pages on the SourceForge 9 open source software
development site, which includes the potential for
project information, forums, downloads and issue
tracking by which suggestions for new terms and
modifications can be submitted.
Contributors to OBO are starting to pull together as a virtual
community by pooling their resources on the Web. The Gene
Ontology [3] saw a phenomenal, well documented [5] growth in
the number of terms it contained through user interaction alone,
such was the demand for the resource to represent so many
researchers. Since then, the trend has continued as more and more
biological domains aim to be represented by OBO.
The caveat for the relative success of OBO has probably been
similar to that of Web 2.0 over Semantic Web (so far). Formality
and methodology have temporarily made way for ease of use and
ease of interaction. Interestingly, the majority of the OBO
ontologies clearly state that they are “structured controlled
vocabularies”, which require nothing like the expressive power of
OWL, and little in the way of knowledge engineering because the
statements linking things do not require it. This is for no other
reason than that nothing more complex is required: OBO
ontologies are used for marking biological data so that they
can be linked if they are annotated in the same way. Primarily,
these ontologies contain a hierarchy of terms denoting ‘is_a’
relationships. Less often but still common are ‘part_of’
relationships, and occasionally other properties key to biology
such as ‘develops_from’. Despite having the full expressivity of
OWL available in the OBO 1.2 syntax, there is little evidence to
suggest that the developers in this community either see the need
or have the will to take on this level of expressivity in their
knowledge.
Perhaps then, this community can be a model for the future of
ontology development on the Web. Quick and easy development
of terms by engaging the user, and employing Web 2.0 design
principles, could forge more coordinated communities for
development of Semantic Web technologies. Web 2.0 has the
capability to expose all of the ‘light’ lexical issues and some basic
assertions of linking meaning to terms. ‘Heavier’ more expressive
assertions in OWL are in the domain of the ontologist, who can be
informed by the interactions they can have with domain experts
and other ontologists through Web 2.0 communities.
6. DISCUSSION
We propose the construction of ontology specific resources, using
the Web as a platform, which specifically deal with the
management of lexical meta-ontological aspects of ontology
development together with the management of ontology metadata.
The applications of Web 2.0 are geared towards harnessing these
types of community interaction, which is precisely the sort of
interaction that is not supported in the current model of ontology
development. Dealing with meta-ontological data in
downloadable ontology files and disparate descriptions of
ontology metadata on development sites is prohibitive to a more
universal appreciation of ontology design and implementation.
A centralized resource for sharing OWL resources would act as a
hub for community learning, sharing and reusing of ontology
resources, bringing together ontology users and builders in a way
that is currently not possible. Designing ontologies by consensus
in such workspaces would encourage best practice and speed up
the uptake of the more complicated Semantic Web technologies,
starting with OWL and the knowledge that is to be contained
within. At the same time the system would provide a measure of
control, ensuring that the dangers of misinterpreting the powerful
semantics of OWL by untrained eyes are avoided. Having the
community built lexical resources is the beginning of an
opportunity to link up ontologists with a more specific system that
can refer to the online lexical corpus.
The widespread realization of the Semantic Web will depend on
the production of ontologies that can be effectively shared and
reused, but in order to achieve this, the overheads of ontology
development and ontology comprehension have to be
considerably reduced. The OBO community/consortium has
effectively demonstrated the advantages of lowering these
overheads by engaging a community of domain experts in
ontology development. OBO ontologies, however, are for human
interpretation, so the true Semantic Web vision of human and
computational understanding is not addressed. At the same time,
highly expressive OWL-DL ontologies, for both computer and
human interpretation, are being produced, but largely in isolation.
We propose a solution here which would bridge the gap between
these approaches and effectively enable the same type of domain
expert community engagement for formal ontologies.
7. ACKNOWLEDGMENTS
Funding for this work was through BBSRC grant BBS/B/17156.
5 http://www.w3.org/2001/sw/hcls/
6 http://www.geneontology.org/GO.format.obo-1_2.shtml
7 http://obo.sourceforge.net/
8 http://obofoundry.org/
9 http://sourceforge.net/
8. REFERENCES
[1] Alani, H., Position paper: ontology construction from online ontologies. in Proceedings of the 15th International Conference on World Wide Web (Edinburgh, Scotland, 2006), ACM Press, New York, NY, 491-495.
[2] Alani, H., Brewster, C. and Shadbolt, N., Ranking Ontologies with AKTiveRank. in International Semantic Web Conference, (Athens, GA, USA, 2006).
[3] Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M. and Sherlock, G. Gene Ontology: tool for the unification of biology. Nat Genet, 25 (1). 25-29.
[4] Austin, M.A., III and Samadzadeh, M.H., Software comprehension/maintenance: an introductory course. in, (2005), 414-419.
[5] Bada, M., Stevens, R., Goble, C., Gil, Y., Ashburner, M., Blake, J.A., Cherry, J.M., Harris, M. and Lewis, S. A short study on the success of the Gene Ontology. Web Semantics: Science, Services and Agents on the World Wide Web, 1 (2). 235-240.
[6] Berners-Lee, T., Hendler, J. and Lassila, O. The Semantic Web. Scientific American, 284. 34-43.
[7] Bodenreider, O., Burgun, A. and Rindflesch, T.C. Assessing the consistency of a biomedical terminology through lexical knowledge. International Journal of Medical Informatics, 67 (1-3). 85-95.
[8] Brank, J., Grobelnik, M. and Mladenic, D., A survey of ontological evaluation techniques. in Conference on Data Mining and Data Warehouses, (Ljubljana, Slovenia, 2005).
[9] Brickley, D. and Miller, L. FOAF vocabulary specification, 2005.
[10] Corcho, O., Fernandez-Lopez, M. and Gomez-Perez, A. Methodologies, tools and languages for building ontologies. Where is their meeting point? Data & Knowledge Engineering, 46 (1). 41-64.
[11] Eilbeck, K., Lewis, S., Mungall, C., Yandell, M., Stein, L., Durbin, R. and Ashburner, M. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biology, 6 (5). R44.
[12] Golbreich, C., Zhang, S. and Bodenreider, O. The foundational model of anatomy in OWL: Experience and perspectives. Web Semantics: Science, Services and Agents on the World Wide Web, 4 (3). 181-195.
[13] Good, B.M. and Wilkinson, M.D. The Life Sciences Semantic Web is Full of Creeps! Brief Bioinform, 7 (3). 275-286.
[14] Guarino, N. and Christopher, W. Evaluating ontological decisions with OntoClean. Commun. ACM, 45 (2). 61-65.
[15] Kirk, D., Roper, M. and Wood, M., Identifying and addressing problems in framework reuse. in, (2005), 77-86.
[16] Li, D., Tim, F., Anupam, J., Rong, P., Cost, R.S., Yun, P., Pavan, R., Vishal, D. and Joel, S. Swoogle: a search and metadata engine for the semantic web. Proceedings of the thirteenth ACM international conference on Information and knowledge management, ACM Press, Washington, D.C., USA, 2004.
[17] Miller, G.A. WordNet: a lexical database for English. Communications of the ACM, 38 (11). 39-41.
[18] Noy, N.F., Guha, R. and Musen, M.A., User Rating of ontologies: Who will rate the raters? in AAAI 2005 Spring Symposium on Knowledge Collection from Volunteer Contributors, (Stanford, CA, USA, 2005).
[19] Noy, N.F. and Klein, M. Ontology Evolution: Not the Same as Schema Evolution. Knowledge and Information Systems, V6 (4). 428-440.
[20] Pulido, J.R.G., Ruiz, M.A.G., Herrera, R., Cabello, E., Legrand, S. and Elliman, D. Ontology languages for the semantic web: A never completely updated review. Knowledge-Based Systems, 19 (7). 489-497.
[21] Rector, A., Drummond, N., Horridge, M., Rogers, J., Knublauch, H., Stevens, R., Wang, H. and Wroe, C. OWL Pizzas: Practical Experience of Teaching OWL-DL: Common Errors & Common Patterns, 2004.
[22] Rector, A.L. Modularisation of domain ontologies implemented in description logics and related formalisms including OWL. Proceedings of the 2nd international conference on Knowledge capture, ACM Press, Sanibel Island, FL, USA, 2003.
[23] Rosse, C. and Mejino, J.L.V. A reference ontology for biomedical informatics: the Foundational Model of Anatomy. Journal of Biomedical Informatics, 36 (6). 478-500.
[24] Sabou, M., Lopez, V., Motta, E. and Uren, V., Ontology Selection: Ontology Evaluation on the Real Semantic Web. in WWW2006, (Edinburgh, UK, 2006).
[25] Schaffert, S., Gruber, A. and Westenhaler, R., A Semantic Wiki for collaborative knowledge formation. in Semantics, (Vienna, Austria, 2005).
[26] Shadbolt, N., Hall, W. and Berners-Lee, T. The Semantic Web revisited. IEEE intelligent systems, 21 (3). 96-101.
[27] Stevens, R., Baker, P., Bechhofer, S., Ng, G., Jacoby, A., Paton, N.W., Goble, C.A. and Brass, A. TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources. Bioinformatics, 16 (2). 184-186.
[28] Uschold, M. Knowledge level modelling: concepts and terminology. The Knowledge Engineering Review, 13 (1). 5-29.
[29] Vrandecic, D., Pinto, S., Tempich, C. and Sure, Y. The DILIGENT knowledge processes. Journal of Knowledge Management, 9 (5). 85-96.
[30] Whetzel, P.L., Parkinson, H., Causton, H.C., Fan, L., Fostel, J., Fragoso, G., Game, L., Heiskanen, M., Morrison, N., Rocca-Serra, P., Sansone, S.-A., Taylor, C., White, J. and Stoeckert, C.J., Jr. The MGED Ontology: a resource for semantics-based description of microarray experiments. Bioinformatics, 22 (7). 866-873.
[31] Wolstencroft, K., Lord, P., Tabernero, L., Brass, A. and Stevens, R. Protein classification using ontology classification. Bioinformatics, 22 (14). e530-538.